I am thinking about writing an RSS aggregator. Not that it is original or has never been done, but just as an exercise to play with an RDF database, and so far I am still not convinced of the way to go!
As a query language, I really like Versa, but RDQL seems to have much more traction, so I have given it a closer look.
What I don’t like about it is that it seems to bring the RDF data model to its end: after you’ve used it, you don’t see triples any longer but tables of resources.
To take an example from the RDQL tutorial:
SELECT ?resource, ?givenName
WHERE (?resource, <http://www.w3.org/2001/vcard-rdf/3.0#N>, ?z) ,
(?z, <http://www.w3.org/2001/vcard-rdf/3.0#Given>, ?givenName)
Returns:
resource | givenName
============================================
<http://somewhere/JohnSmith/> | "John"
<http://somewhere/RebeccaSmith/> | "Rebecca"
<http://somewhere/SarahJones/> | "Sarah"
<http://somewhere/MattJones/> | "Matthew"
Where have the triples gone?
What makes the strength of query languages in other domains, such as SQL or XPath, is that their data model is the same on input as on output: the result of a SQL select statement is basically a table, and I can run subqueries on this result or insert it into a table; the result of an XPath query is (or can be) a nodeset which I can output as XML and on which I can perform a new XPath query.
By contrast, the result of an RDQL query seems to be a bunch of resources that I can’t really use as triples.
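To make the point concrete, here is a minimal sketch in Python of what a pattern-matching query such as the one above does. It is not how any RDQL engine is implemented: the `match` function, the blank node labels `_:n1` and `_:n2`, and the two-person data set are assumptions made up for illustration. What it shows is that the natural output of matching triple patterns is a list of variable bindings, that is, rows of a table.

```python
# Triples are plain (subject, predicate, object) tuples. The data is made up
# to mirror the vCard example; the blank node labels are hypothetical.
VCARD = "http://www.w3.org/2001/vcard-rdf/3.0#"

graph = {
    ("http://somewhere/JohnSmith/", VCARD + "N", "_:n1"),
    ("_:n1", VCARD + "Given", "John"),
    ("http://somewhere/SarahJones/", VCARD + "N", "_:n2"),
    ("_:n2", VCARD + "Given", "Sarah"),
}


def match(graph, patterns):
    """Return one dict of variable bindings per way the patterns match.

    Terms starting with '?' are variables; any other term must match exactly.
    """
    solutions = [{}]
    for pattern in patterns:
        extended = []
        for binding in solutions:
            for triple in graph:
                candidate = dict(binding)
                for term, value in zip(pattern, triple):
                    if term.startswith("?"):
                        # Bind the variable, or check it against its earlier value.
                        if candidate.setdefault(term, value) != value:
                            break
                    elif term != value:
                        break
                else:
                    # No term failed, so this triple extends the binding.
                    extended.append(candidate)
        solutions = extended
    return solutions


rows = match(graph, [
    ("?resource", VCARD + "N", "?z"),
    ("?z", VCARD + "Given", "?givenName"),
])

# Projecting onto the selected variables leaves a table, not a graph.
table = sorted((row["?resource"], row["?givenName"]) for row in rows)
for resource, given_name in table:
    print(resource, "|", given_name)
```

Each row knows which resource and which name matched, but the triples that connected them are no longer part of the result.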
Why is that a problem?
Let’s say I want to create an RSS channel with the RSS items meeting a condition. With XPath I can just write “//rss:item[my condition]” and I have a nodeset with the complete definition of these items. With RDQL, I can write a query that will give me these items as resources, but I haven’t seen how I could get these items with their descriptions as triples, ready to be serialized back as RSS.
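As a rough sketch of the behaviour I am after, here is what a "graph in, graph out" selection could look like in Python. This is not an existing query language or library: `select_subgraph`, `title_mentions_rdf`, and the `example.org` items are hypothetical names and data, and an item's description is simplified to the triples that have the item as their subject.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RSS = "http://purl.org/rss/1.0/"

# Made-up RSS 1.0 data as (subject, predicate, object) tuples.
graph = {
    ("http://example.org/item1", RDF_TYPE, RSS + "item"),
    ("http://example.org/item1", RSS + "title", "RDF query languages"),
    ("http://example.org/item2", RDF_TYPE, RSS + "item"),
    ("http://example.org/item2", RSS + "title", "Gardening tips"),
}


def select_subgraph(graph, condition):
    """Return all triples describing the rss:items that satisfy `condition`.

    `condition` receives the set of triples whose subject is one item.
    """
    items = {s for s, p, o in graph if p == RDF_TYPE and o == RSS + "item"}
    result = set()
    for item in items:
        description = {triple for triple in graph if triple[0] == item}
        if condition(description):
            result |= description
    return result


def title_mentions_rdf(description):
    """Hypothetical condition: the item has a title containing 'RDF'."""
    return any(p == RSS + "title" and "RDF" in o for s, p, o in description)


rdf_items = select_subgraph(graph, title_mentions_rdf)

# The result is a set of triples again, so it can be queried a second time.
again = select_subgraph(rdf_items, lambda description: True)
```

Because the output has the same shape as the input, it could be handed directly to a serializer or to another query, which is what the XPath nodeset gives me for XML.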
What I’d really like is a query language which would let me do with RDF graphs what I am doing with XPath!