SPARQL Versus Versa

A new working draft of SPARQL has been released.

While there is no doubt that the language is getting better and more polished with each new release of this specification, I am surprised to see that the limitations I found in rdfDB back in early 2001, when I tried to use it for XMLfr, are still there.

This is an old story that I presented in Austin at KT 2001 and published as an XML.com article: computing the distance between resources can be very interesting, and to do so you need the equivalent of a SQL “group by” clause and the related aggregate functions.

In the case of XMLfr, I rely on this feature to compute the distance between two topics by counting the number of articles in which they appear together. To do so, I use the SQL group by clause with the “count” aggregate function.
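As a rough illustration of this kind of query, here is a minimal sketch using Python's built-in sqlite3 module. The table name article_topic, its two columns, and the sample rows are assumptions made up for the example; they are not the actual XMLfr schema or data.

```python
import sqlite3


def topic_cooccurrence(pairs):
    """Count, for each pair of topics, the articles in which both appear.

    `pairs` is an iterable of (article, topic) tuples. The table layout
    is hypothetical and only serves to illustrate GROUP BY with COUNT.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE article_topic (article TEXT, topic TEXT)")
    conn.executemany("INSERT INTO article_topic VALUES (?, ?)", pairs)
    rows = conn.execute(
        """
        SELECT a.topic, b.topic, COUNT(DISTINCT a.article) AS shared
        FROM article_topic AS a
        JOIN article_topic AS b
          ON a.article = b.article AND a.topic < b.topic
        GROUP BY a.topic, b.topic
        ORDER BY shared DESC, a.topic, b.topic
        """
    ).fetchall()
    conn.close()
    return rows


# Made-up sample data: three articles tagged with topics.
SAMPLE = [
    ("art1", "RDF"), ("art1", "XML"),
    ("art2", "RDF"), ("art2", "XML"), ("art2", "SQL"),
    ("art3", "SQL"), ("art3", "XML"),
]

for topic_a, topic_b, shared in topic_cooccurrence(SAMPLE):
    print(topic_a, topic_b, shared)
```

The self-join pairs every two topics attached to the same article, and the group by with count turns those pairs into a co-occurrence score: the higher the count, the closer the two topics.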

The lack of these features in rdfDB is the reason why I had to drop rdfDB and RDF altogether and store my triples in a relational database that I query with SQL.

As far as I know, there is only one RDF query language that supports these features: 4Suite’s Versa query language.

Versa is so different from SPARQL that these two languages are as difficult to compare as, let’s say, W3C XML Schema’s XML syntax and RELAX NG’s compact syntax.

Instead of trying to bend the well-known SQL syntax to make it work on triples, Versa defines a totally new language for the purpose of traversing triple data stores.

The result is surprising. You won’t find anything that will remind you of SQL and, to take an example from “Versa by example”, to get a list of people’s first names sorted by their age, you’d write: sortq(all(), ".-o:age->*", vsort:number) - o:fname -> *
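For readers who find this expression hard to parse, here is a rough Python reading of what the query asks for, over an in-memory list of triples. This is not how Versa evaluates the expression. The sample triples and the helper name first_names_by_age are invented for the illustration, and the sketch assumes that only resources carrying an o:age value are of interest.

```python
# Hypothetical triples (subject, predicate, object); not real data.
TRIPLES = [
    ("p1", "o:fname", "Alice"), ("p1", "o:age", "34"),
    ("p2", "o:fname", "Bob"), ("p2", "o:age", "27"),
    ("p3", "o:fname", "Carol"), ("p3", "o:age", "41"),
]


def first_names_by_age(triples):
    """Return first names ordered by the numeric age of their resource."""
    ages = {}
    names = {}
    for subject, predicate, obj in triples:
        if predicate == "o:age":
            # Ages are compared as numbers, like vsort:number in the query.
            ages[subject] = float(obj)
        elif predicate == "o:fname":
            names.setdefault(subject, []).append(obj)
    # Sort the resources by age, then follow o:fname from each of them.
    ordered = sorted(ages, key=ages.get)
    return [name for subject in ordered for name in names.get(subject, [])]


print(first_names_by_age(TRIPLES))
```

The two steps in the function mirror the two halves of the expression: first sort the resources by the value reached through o:age, then traverse o:fname from each sorted resource.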

If you insist and don’t let the first surprise stop you, the second surprise is that this language works incredibly well. During the (unfortunately too few) opportunities I have had to work with Versa, I have never been blocked by a limit of the language as I was with rdfDB or would be with SPARQL.

The bad news is that there is only one implementation of Versa (4Suite). This means that you won’t be able to use Versa over Redland or Jena. I wish people implementing RDF databases would seriously consider implementing Versa over them!

I also wish the W3C had taken Versa as the main input for its RDF query language, but this wish doesn’t seem too likely to come true :-( …


One Response to SPARQL Versus Versa

  1. I agree – the lack of GROUP BY and aggregate functions such as SUM() and COUNT() is a real deal-breaker. If SPARQL can do these, it’s certainly not obvious.
