State of Redland RDF Libraries 2008-02
Redland was born 2000-08.  Happy 7.5th birthday!
This is a review of (roughly) 2007, following my
State of Redland 2007-02
report of 2007-02-18.  It covers Redland users, the state of the
packages, development, challenges, tasks, future ideas and changes.
(This is on the web at
http://librdf.org/2008/02/18-state/)
1. Redland Users
Redland is made available by several Linux, Unix and other
open source projects such as:
- Debian (sarge onwards)
- Fedora (FC4 onwards).  In 2007 FC6 onwards added librdf.
- FreeBSD Ports
- Gentoo
- MacPorts.  Added all packages by Oct 2007.
- Mandriva (9.1 onwards).  In 2007 Mandriva Cooker gained all packages.
- SUSE (9.2 onwards)
- Ubuntu (breezy onwards)
The libraries are also used inside other applications and services.
2. State of the Packages
My summary of the high-level state of the packages is:
- Raptor syntax parsing and serializing: libraptor
    - Mature. The API is growing a little as some internal parts
      (the SAX2 API) are pushed into the public API.  There are also
      some new features and syntaxes being added, portability fixes and,
      more rarely, actual bug fixes.
- Rasqal query parsing, executing: librasqal
    - Under development. The current API is unstable and will be
      deliberately broken in the next release.  The query engine is not
      yet complete enough to execute all of SPARQL, and completing it is
      still the priority for 1.0.
- Redland RDF API and triple stores: librdf
    - Mature. Some API change is happening and the storages are getting
      improvements.  Mostly updates from Raptor and Rasqal plus bug fixes.
- Language Bindings to Perl, PHP, Python and Ruby
    - Mature.  I removed the C#, Java and Tcl bindings in 2007, as
      promised, since I was not going to maintain them.
3. Development
In 2007, each of the packages has seen the following
releases and major changes:
- Raptor 1.4.15 - 1.4.16 (2 releases)
    - GRDDL support was completed and passed the test cases.
    - Improved XML and URI error handling
    - Updated Turtle parser for Turtle 2007-09-11
    - Added a TRiG parser
    - Many low-memory situation improvements
- Rasqal 0.9.14 - 0.9.15 (2 releases)
    - Updated the SPARQL syntax support to match the W3C Recommendation.
    - Query engine supports all SPARQL datatypes and evaluation rules.
    - Added LAQRS syntax extensions
    - Many low-memory situation improvements
- Redland 1.0.6 - 1.0.7 (2 releases)
    - A new transactions API was added, implemented for the MySQL and SQLite storages
    - Added an optional modular storage configuration to load storage modules on demand
    - A new query results formatter class was added
    - Many low-memory and resource allocation failure improvements
    - Many bug fixes
- Language Bindings 1.0.6.1 - 1.0.7.1 (2 releases)
    - Removed Tcl, Java and C# bindings as promised
    - Many updates to the Python and Ruby Bindings
    - Many bug fixes
In 2007 Lauri Aalto became a new committer and made a lot of changes
to the libraries: the low-memory and resource allocation failure
handling mentioned above, portability fixes for non-gcc compilers
and Win32, and other bug fixes and improvements.
The Redland mailing lists are now (early 2008) archived by
gmane.org and
The Mail Archive,
and can be read on their web sites.
4. Challenges
The main challenge continues to be to make the project more
scalable.  Although I package the source code, I only really deal
with Debian binary packages since, as can be seen above, others are
working on distribution-specific packages, which is good.
The loss of SourceForge's compile farm was tragic since it means
there is no automated way to test cross-platform compatibility.
I noticed that 2007 had 9 releases whereas the previous year had
15.  This is mainly a consequence of me being busier and not
developing code as part of my day job.  Fewer releases is not
necessarily bad as the packages mature, but it can mean a long wait
for bug fixes to get out.
My main goal to deal with these could be summarised as:
  - Try harder to encourage more shared development
5. Tasks
5.1 General tasks
More of a wishlist than an ordered list
- Think about a License change to Apache2 only.
- Make Redland turn SPARQL into underlying SQL queries when possible.
- Start the Redland (librdf) API tutorial.
- Create some documentation to explain the libraries' structure and relationships.
- Consider not shipping Raptor and Rasqal inside the Redland tarball
- Create documentation on the data flow inside the libraries
- Figure out whether to keep writing manual pages as well as gtkdoc. (DRY)
- The demos need to be updated and the changes
put back into Subversion.
- A SPARQL protocol endpoint demo would be good to have
DRY = Don't Repeat Yourself
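To make the SPARQL-to-SQL item in the list above concrete, here is a minimal sketch of the idea in Python rather than Redland's C.  It assumes a hypothetical single table triples(subject, predicate, object), which is not Redland's actual storage schema, and it handles only a basic graph pattern (a list of triple patterns); the function name bgp_to_sql is made up for illustration.  Each triple pattern becomes one alias of the table, constants become bound parameters and repeated variables become join conditions.

```python
import sqlite3

def bgp_to_sql(patterns):
    """Translate triple patterns into one SQL self-join over a
    hypothetical triples(subject, predicate, object) table.
    Terms starting with '?' are variables; all others are constants."""
    if not patterns:
        raise ValueError("need at least one triple pattern")
    columns = ("subject", "predicate", "object")
    first_seen = {}            # variable -> "tN.column" of first occurrence
    where, params = [], []
    for i, pattern in enumerate(patterns):
        for column, term in zip(columns, pattern):
            ref = f"t{i}.{column}"
            if term.startswith("?"):
                if term in first_seen:
                    # Repeated variable: becomes a join condition.
                    where.append(f"{ref} = {first_seen[term]}")
                else:
                    first_seen[term] = ref
            else:
                # Constant term: becomes a bound SQL parameter.
                where.append(f"{ref} = ?")
                params.append(term)
    select = ", ".join(f'{ref} AS "{var[1:]}"'
                       for var, ref in first_seen.items()) or "1"
    tables = ", ".join(f"triples AS t{i}" for i in range(len(patterns)))
    sql = f"SELECT {select} FROM {tables}"
    if where:
        sql += " WHERE " + " AND ".join(where)
    return sql, params

# Usage: a two-pattern query run entirely inside SQLite.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)")
conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", [
    ("ex:dave", "foaf:name", "Dave"),
    ("ex:dave", "foaf:knows", "ex:lauri"),
    ("ex:lauri", "foaf:name", "Lauri"),
])
sql, params = bgp_to_sql([
    ("?a", "foaf:knows", "?b"),
    ("?b", "foaf:name", "?name"),
])
rows = conn.execute(sql, params).fetchall()
```

The point of doing this "when possible" is that the database performs the join instead of the query engine fetching one triple pattern at a time.  OPTIONAL, UNION and filters would need more than this sketch covers.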
5.2 Pending stuff
There are several tasks already in progress either sitting in a
patch, in Subversion or underway separately.
- A new schema for the SQLite store: me (patch emailed to redland-dev)
- Object-based PHP5 bindings: Yahoo! (pending)
- Two JSON serializers for Raptor (svn)
- AVL tree improvements added to Raptor that should make the
RDF/XML and Turtle serializers faster (svn)
- Rasqal API and ABI change announced for future 0.9.16 (partially in svn)
- Rasqal can read result sets from the SPARQL query results XML (svn)
5.3 Raptor tasks
- Plan for Raptor 2 API/ABI change
- Focus should be bug fixes
5.4 Rasqal tasks
- Rasqal 1.0: when Rasqal can execute complete SPARQL
    - Make SPARQL OPTIONALs work
    - Make SPARQL GROUP work
    - Make SPARQL UNION work
- Write a query optimiser
- Add a way to declare extension functions
- Look into language extensions
- Address query engine denial of service:
    - limit query wall clock time
    - limit triple pattern matches
    - callback to allow application to abort queries?
    - limit memory use?
    - limit sorting of results?
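As a rough illustration of the first three denial-of-service limits above, the sketch below shows one way a wall clock limit, a triple pattern match limit and an application abort callback could share a single check.  It is written in Python rather than Rasqal's C, and every name in it (QueryGuard, QueryAborted, guarded) is hypothetical, not part of any Rasqal API.  Memory and sorting limits are not covered.

```python
import time

class QueryAborted(Exception):
    """Raised when a running query exceeds one of its limits."""

class QueryGuard:
    """Hypothetical per-query limits checked once per match."""

    def __init__(self, max_seconds=None, max_matches=None,
                 should_abort=None, clock=time.monotonic):
        self.max_seconds = max_seconds      # wall clock limit, or None
        self.max_matches = max_matches      # match count limit, or None
        self.should_abort = should_abort    # application callback, or None
        self.clock = clock
        self.started = clock()
        self.matches = 0

    def check(self):
        """Count one triple pattern match and enforce every limit."""
        self.matches += 1
        if self.max_matches is not None and self.matches > self.max_matches:
            raise QueryAborted("too many triple pattern matches")
        if (self.max_seconds is not None
                and self.clock() - self.started > self.max_seconds):
            raise QueryAborted("wall clock limit exceeded")
        if self.should_abort is not None and self.should_abort():
            raise QueryAborted("aborted by application")

def guarded(matches, guard):
    """Wrap an iterator of matches so each one passes through the guard."""
    for match in matches:
        guard.check()
        yield match

# Usage: stop a stand-in match stream after three matches.
guard = QueryGuard(max_matches=3)
seen = []
reason = None
try:
    for match in guarded(iter(range(10)), guard):
        seen.append(match)
except QueryAborted as exc:
    reason = str(exc)
```

Checking at the point where matches are produced keeps the limits in one place, at the cost of one extra call per match.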
5.5 Redland librdf tasks
  - Improve the storages performance
5.6 Bindings tasks
  - Split the single language bindings package to be one per-binding.
    That would be: Perl, PHP5, Python and Ruby
  - Make the Perl binding into a CPAN-installable tarball
    (partially done but not entirely working)
6. Future Ideas
6.1 New Version Control System
This is the same as last year's
New Version Control System
idea.  Although it is not urgent, I'm favouring Git right now, the
main issue being that it has a steep learning curve compared
to anything else.
6.2 Raptor Version 2
I also discussed this break-the-binary-API change last year in
Raptor Version 2.
I can see it being started once the focus on Rasqal 1.0 is over, which
should happen in 2008.  There are several cleanups that need to be done.
7. Changes
In order to encourage more help with Redland, I'm proposing this:
Five good patches get you commit access.
(after Brian Aker,
but I'm slightly more cautious)
Plus I have started a
Redland development blog
...
Thanks for reading.
Dave Beckett,
California, USA, 2008-02-18