Redland

Dave Beckett

 
 

Redland librdf RDF API Library - Query API

THIS DOCUMENT IS OUT OF DATE

Rasqal is the resulting query language support for Redland and is complete and working. The librasqal manual page explains the API with examples. Query APIs in Redland that match this have been created and higher-level language APIs developed.

Introduction

A rough design/braindump for how real query languages (strings returning result sets) could be added to Redland.

Terminology here: QL=query language, such as RDQL, Algae, ...

The idea is that the main code hunts for a QL implementation. The particular storage (aka database, triple store) does not do any of that delegation itself, unless it wants to; it just replies with an answer, or with No (NULL), when asked whether it handles a particular QL.

Registering a query language

Each QL implementation, such as Algae or RDQL, registers itself with the query factory so that the system can recognise the language name and identifier (URI) when the user calls.

For example, something like this for a language Algae:

librdf_query_algae.c:

query_language_factory QLdescription = {
  "Algae",
  "http://...uri-of-algae/",
  librdf_query_algae_start,
  /* ... more factory methods */
};


void
librdf_query_algae_init (...) {
  ...
  librdf_query_language_add_factory(QLdescription);
  ...
}

(This is just like storage, parsers, hashes, other factories)
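The registration step can be sketched as a small table of factory records that is searched by language URI. This is only an illustration under stated assumptions: the type ql_factory, the functions ql_add_factory, ql_find_factory and ql_algae_init, the fixed table size, and the example.org URI are all hypothetical names invented here, not part of Redland.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical factory record; not Redland's actual structure. */
typedef struct {
  const char *name;                          /* human-readable name */
  const char *uri;                           /* language identifier */
  const char *(*start)(const char *query);   /* placeholder factory method */
} ql_factory;

#define MAX_QL_FACTORIES 8

static const ql_factory *ql_factories[MAX_QL_FACTORIES];
static size_t ql_factory_count = 0;

/* Add a factory to the table; returns 0 on success, -1 if invalid or full. */
int ql_add_factory(const ql_factory *factory) {
  if (factory == NULL || factory->uri == NULL ||
      ql_factory_count >= MAX_QL_FACTORIES)
    return -1;
  ql_factories[ql_factory_count++] = factory;
  return 0;
}

/* Look a factory up by language URI; returns NULL when none is registered. */
const ql_factory *ql_find_factory(const char *uri) {
  size_t i;
  if (uri == NULL)
    return NULL;
  for (i = 0; i < ql_factory_count; i++) {
    if (strcmp(ql_factories[i]->uri, uri) == 0)
      return ql_factories[i];
  }
  return NULL;
}

/* A stand-in language implementation that registers itself. */
static const char *algae_start(const char *query) {
  (void)query;   /* a real implementation would parse the query here */
  return "algae";
}

static const ql_factory algae_factory = {
  "Algae",
  "http://example.org/ql/algae",   /* made-up URI for the sketch */
  algae_start
};

void ql_algae_init(void) {
  ql_add_factory(&algae_factory);
}
```

After ql_algae_init() has run, ql_find_factory("http://example.org/ql/algae") returns the Algae record, and a URI that nobody registered yields NULL.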

Performing a query

The user does a query by invoking a method on the model class giving a particular query string and an identifier for a query language.

queryLanguageURI=new URI ("http://...uri-of-algae");
resultSet=librdf_model_query_string(model, "query",
                                    queryLanguageURI);

(Aside: have a prepare/execute stage?)

The model (interface) class passes that down to the factory implementing the particular model, usually librdf_model_storage:

rdf_model.c:

librdf_model_query_string(model, ql_string, ql_uri) {

  /* Calls the model implementation of this method;
   * in this case, librdf_model_storage_query_string */
  return model->factory->query_string(...);
}

This invokes the model implementation class:

rdf_model_storage.c:

librdf_model_storage_query_string(model, ql_string, ql_uri) {

  /* calls the storage class, using the storage that
   * belongs to this model, to do the work */
  return librdf_storage_query_string(storage, ql_string, ql_uri);
}

The storage interface class (librdf_storage) checks whether the QL is supported by the storage:

rdf_storage.c:

librdf_storage_query_string(storage, ql_string, ql_uri) {
 
  if(librdf_storage_supports_ql(ql_uri)) { 

    /* the storage implementation supports this
     * query language directly so call it  */
    return storage->factory->query_string(storage,
       ql_string, ql_uri);
  } else { 

    /* it doesn't so we need a query adaptor class */
    ql_adaptor=librdf_query_get_adaptor(ql_uri);
    return librdf_query_adaptor_query_string(ql_adaptor,
       ql_string, storage);
  }
}

(This factory stuff is similar to how storage, parsers, hashes work).
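The branch above can be tried out in isolation with a small stand-alone sketch. Everything in it is an assumption made for illustration: the storage and ql_adaptor structs, the function names, the example.org URIs, and the use of plain strings as "results" that simply name the path taken. It also returns NULL when no adaptor exists, a case the outline above leaves open.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical types for the sketch; not Redland's real structures. */
typedef struct storage storage;
typedef const char *(*query_fn)(storage *s, const char *ql_string);

struct storage {
  const char *native_ql_uri;   /* QL the storage handles itself, or NULL */
  query_fn query_string;       /* storage's own query method */
};

typedef struct {
  const char *ql_uri;          /* QL this adaptor implements */
  query_fn query_string;       /* runs the query on top of any storage */
} ql_adaptor;

/* Stand-in implementations: the returned string names the path taken. */
const char *native_query(storage *s, const char *ql_string) {
  (void)s; (void)ql_string;
  return "native";
}

const char *adaptor_query(storage *s, const char *ql_string) {
  (void)s; (void)ql_string;
  return "adaptor";
}

static const ql_adaptor adaptors[] = {
  { "http://example.org/ql/algae", adaptor_query }
};

/* Does the storage implementation handle this QL directly? */
int storage_supports_ql(const storage *s, const char *ql_uri) {
  return s->native_ql_uri != NULL && strcmp(s->native_ql_uri, ql_uri) == 0;
}

/* Find a generic adaptor for the QL, or NULL if there is none. */
const ql_adaptor *query_get_adaptor(const char *ql_uri) {
  size_t i;
  for (i = 0; i < sizeof(adaptors) / sizeof(adaptors[0]); i++) {
    if (strcmp(adaptors[i].ql_uri, ql_uri) == 0)
      return &adaptors[i];
  }
  return NULL;
}

/* Dispatch: prefer the storage's own support, else fall back to an adaptor. */
const char *storage_query_string(storage *s, const char *ql_string,
                                 const char *ql_uri) {
  const ql_adaptor *adaptor;

  if (storage_supports_ql(s, ql_uri))
    return s->query_string(s, ql_string);

  adaptor = query_get_adaptor(ql_uri);
  if (adaptor == NULL)
    return NULL;   /* nobody understands this QL */
  return adaptor->query_string(s, ql_string);
}
```

A storage whose native_ql_uri matches the requested language answers the query itself; any other language falls through to the adaptor table.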

At this point there are two possibilities, depending on the condition above; they are described in the next two sections.

1. The storage implementation understands the QL

The storage factory method query_string(storage, ql_string, ql_uri) has been invoked to perform the query directly. It can do whatever it wants using the storage internals. At this point a SQL store could, for instance, rewrite RDQL into SQL and execute it. Any other smart query work goes here. For example:

rdf_storage_abc.c:

int
librdf_storage_abc_supports_ql(ql_uri) {
  if(/* I understand this ql */)
    return 1;
  else
    return 0;
}

librdf_storage_abc_query_string(storage, ql_string, ql_uri) {
  /* I already know I understand this QL,
   * but should check ql_uri anyway */
  if(!librdf_storage_abc_supports_ql(ql_uri))
    return NULL;

  /* do the query - I have the ql_string with the query text
   * and I support this language */

  return query_results;
}

2. The query language adaptor class is processing the QL

The librdf_query_adaptor class factory method query_string has been invoked to perform the query using the storage:

rdf_query_algae.c:

librdf_query_algae_query_string(ql_adaptor, ql_string, 
                                ql_uri, storage) {

  /* do the query - I have the ql_string with the query text
   * and I support this language 
   */

  /* storage is the underlying triple store so I can use
   * it to do find_statements aka "triplesmatching"
   */

  return query_results;
}
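What "using the storage to do triples matching" could mean for an adaptor can be sketched with a tiny in-memory store. The triple and triple_store types, the functions store_find_next and adaptor_count_matches, and the convention that a NULL pattern field acts as a wildcard are all assumptions for this sketch; they stand in for find_statements and are not Redland's API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical triple and store types, for illustration only. */
typedef struct {
  const char *subject;
  const char *predicate;
  const char *object;
} triple;

typedef struct {
  const triple *triples;
  size_t count;
} triple_store;

/* A NULL pattern field matches any value. */
static int field_matches(const char *pattern, const char *value) {
  return pattern == NULL || strcmp(pattern, value) == 0;
}

/* Return the next triple at or after *pos that matches the pattern and
 * advance *pos past it; return NULL when no more triples match. */
const triple *store_find_next(const triple_store *store,
                              const triple *pattern, size_t *pos) {
  while (*pos < store->count) {
    const triple *t = &store->triples[(*pos)++];
    if (field_matches(pattern->subject, t->subject) &&
        field_matches(pattern->predicate, t->predicate) &&
        field_matches(pattern->object, t->object))
      return t;
  }
  return NULL;
}

/* Adaptor-side helper: answer a one-pattern query by counting matches,
 * using nothing but the store's matching primitive. */
size_t adaptor_count_matches(const triple_store *store,
                             const triple *pattern) {
  size_t pos = 0;
  size_t n = 0;
  while (store_find_next(store, pattern, &pos) != NULL)
    n++;
  return n;
}
```

The point of the split is that the adaptor needs only the matching primitive, so the same QL code could run over any storage that offers it.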

Results

Finally, the result object(s) are passed up to the user; what type these should be is still undecided.

Something like this: the result object is a result set, similar to a JDBC one, with operations such as getRow and a cursor-style interface that can move forward. This would be implemented like the Redland stream or iterator internals, with callbacks to the class that handles it.

For example the user invokes binding=librdf_resultset_get_row(resultset) which causes the internal calls:

librdf_resultset.c:

row
librdf_resultset_get_row(resultset) {
  return resultset->get_row(resultset->context);
}

This invokes the get_row factory method stored in the result set. That method belongs either to the storage implementation class, when the storage handles the QL itself, or to the adaptor class, when it does not. In the first case it is librdf_storage_abc_resultset_get_row, for a result set created by librdf_storage_abc_query_string (which implicitly knows the language is Algae).

In the second case it is librdf_query_algae_resultset_get_row, for the result set returned by librdf_query_algae_query_string, which is shown here:

librdf_query_algae.c:

row
librdf_query_algae_resultset_get_row(context) {
  algae_internal=(cast)context;

  if(algae_end_of_results(algae_internal))
    return NULL;

  /* row is assumed to be a pointer type, since NULL marks the end */
  row new_row=malloc(sizeof(*new_row));
  if(!new_row)
    return NULL;
  algae_get_row(algae_internal, new_row);

  return new_row;
}
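The callback-plus-context arrangement described in this section can be sketched in a few self-contained lines. The resultset struct, the array_cursor backend, and the use of plain strings as rows are assumptions made for the sketch; the real row type was still undecided in this design.

```c
#include <stddef.h>

/* Hypothetical result set: a callback plus the implementation's context. */
typedef struct {
  void *context;                           /* private state of the backend */
  const char *(*get_row)(void *context);   /* next row, or NULL at the end */
} resultset;

/* Generic entry point: forwards to whichever class produced the results. */
const char *resultset_get_row(resultset *rs) {
  return rs->get_row(rs->context);
}

/* Example backend: a forward-only cursor over a fixed array of rows.
 * It stands in for either the storage or the adaptor implementation. */
typedef struct {
  const char **rows;
  size_t count;
  size_t next;    /* index of the row to hand out next */
} array_cursor;

const char *array_cursor_get_row(void *context) {
  array_cursor *cursor = (array_cursor *)context;
  if (cursor->next >= cursor->count)
    return NULL;                      /* end of results */
  return cursor->rows[cursor->next++];
}

/* Drain the result set and report how many rows were left. */
size_t resultset_count_remaining(resultset *rs) {
  size_t n = 0;
  while (resultset_get_row(rs) != NULL)
    n++;
  return n;
}
```

The caller only ever sees resultset_get_row; whether the rows come from a storage that understands the QL or from a generic adaptor is hidden behind the function pointer.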

Last Modified: $Date: 2004/07/16 11:02:47 $