TreeBind goes RDF

TreeBind can be seen as yet another open source XML <-> Java object data binding framework.

The two major design decisions that differentiate TreeBind from other similar frameworks are that:

  1. TreeBind has been designed to work with existing classes through Java introspection and doesn’t rely on XML schemas.
  2. Its architecture is not specific to XML and TreeBind can be used to bind any source to any sink, assuming they can be browsed and built following the paradigm of trees (this is the reason why we have chosen this name).

Another difference with other frameworks is that its TreeBind has been sponsored by one of my customers (INSEE) and that I am its author…

The reason why we’ve started this project is that we’ve not found any framework that was meeting the two requirements I have mentioned and I am now bringing TreeBind a step forward by designing a RDF binding.

I have sent an email with the design decisions I am considering for this RDF binding to the TreeBind mailing list and I include a copy bellow for your convenience:


Hi,

I am currently using TreeBind on a RDF/XML vocabulary.

Of course, a RDF/XML document is a well formed XML document and I could use the current XML bindings to read and write RDF/XML document.

However, these bindings focus on the actual XML syntax used to serialize the document. They don’t see the RDF graph behind that syntax and are sensitive to the « style » used in the XML document.

For instance, these two documents produce very similar triples:

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    xmlns="http://ns.treebind.org/example/";>
    <book>
        <title>RELAX NG</title>
        <written-by>
            <author>
                <fname>Eric</fname>
                <lname>van der Vlist</lname>
            </author>
        </written-by>
    </book>
</rdf:RDF>

and

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    xmlns="http://ns.treebind.org/example/";>
    <book>
        <title>RELAX NG</title>
        <written-by rdf:resource="#vdv"/>
    </book>
    <author rdf:ID="vdv">
        <fname>Eric</fname>
        <lname>van der Vlist</lname>
    </author>
</rdf:RDF>

but the XML bindings will generate a quite different set of objects.

The solution to this problem is to create RDF bindings that will sit on top of a RDF parser to pour the content of the RDF model into a set of objects.

The overall architecture of TreeBind has been designed with this kind of extension in mind and that should be easy enough.

That being said, design decisions need to be made to define these RDF bindings and I’d like to discuss them in this forum.

RDF/XML isn’t so much an XML vocabulary in the common meaning of this term but rather a set of binding rules to bind an XML tree into a graph.

These binding rules introduce some conventions that are sometimes different from what we use to do in « raw » XML documents.

In raw XML, we would probably have written the previous example as:

<?xml version="1.0" encoding="UTF-8"?>
<book xmlns="http://ns.treebind.org/example/";>
    <title>RELAX NG</title>
    <author>
        <fname>Eric</fname>
        <lname>van der Vlist</lname>
    </author>
</book>

The XML bindings would pour that content into a set of objects using the following algorithm:

  • find a class that matches the XML expanded name {http://ns.treebind.org/example/}book and create an object from that class.
  • try to find a method such as addTitle or setTitle with a string parameter on this book object and call that method with the string « RELAX NG ».
  • find a class that matches the XML expanded name {http://ns.treebind.org/example/}author and create an object from that class.
  • try to find a method such as addFname or setFname with a string parameter on this author object and call that method with the string « Eric ».
  • try to find a method such as addLname or setLname with a string parameter on this author object and call that method with the string « van der Vlist ».
  • try to find a method such as addAuthor or setAuthor with a string parameter on the book object and call that method with the author object.

We see that there is a difference between the way simple type and complex type elements are treated.

For a simple type element (such as « title », « fname » and « lname »), the name of the element is used to determine the method to call and the parameter type is always string.

For a complex type element (such as author), the name of the element is used both to determine the method to call and the class of the object that needs to be created. The parameter type is this class.

This is because when we write in XML there is an implicit expectation that « author » is used both as a complex object and as a verb.

Unless instructed otherwise, RDF doesn’t allow these implicit shortcuts and an XML element is either a predicate or an object. That’s why we have added an « written-by » element in our RDF example:

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    xmlns="http://ns.treebind.org/example/";>
    <book>
        <title>RELAX NG</title>
        <written-by>
            <author>
                <fname>Eric</fname>
                <lname>van der Vlist</lname>
            </author>
        </written-by>
    </book>
</rdf:RDF>

The first design decision we have to make is to decide how we will treat that « written-by » element.

To have everything in hand to take a decision, let’s also see what are the triples for that example:

rapper: Parsing file book1.rdf
_:genid1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://ns.treebind.org/example/book> .
_:genid1 <http://ns.treebind.org/example/title> "RELAX NG" .
_:genid2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://ns.treebind.org/example/author> .
_:genid2 <http://ns.treebind.org/example/fname> "Eric" .
_:genid2 <http://ns.treebind.org/example/lname> "van der Vlist" .
_:genid1 <http://ns.treebind.org/example/written-by> _:genid2 .
rapper: Parsing returned 6 statements

In these triples, two of them are defining element types:

_:genid1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://ns.treebind.org/example/book> .

and

_:genid2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://ns.treebind.org/example/author> .

I propose to use these statements to determine which classes must be used to create the objects. So far, that’s pretty similar to what we’re doing in XML.

Then, we have triples that assign literals to our objects:

_:genid1 <http://ns.treebind.org/example/title> "RELAX NG" .
_:genid2 <http://ns.treebind.org/example/fname> "Eric" .
_:genid2 <http://ns.treebind.org/example/lname> "van der Vlist" .

We can use the predicates of these triples (<http://ns.treebind.org/example/title>, <http://ns.treebind.org/example/fname>, <http://ns.treebind.org/example/lname>) to determine the names of the setter methods to use to add the corresponding information to the object. Again, that’s exactly similar to what we’re doing in XML.

Finally, we have a statement that links two objects together:

_:genid1 <http://ns.treebind.org/example/written-by> _:genid2 .

I think that it is quite natural to use the predicate (<http://ns.treebind.org/example/written-by>) to determine the setter method that needs to be called on the book object to set the author object.

This is different from what we would have been doing in XML: in XML, since there is a written-by element, we would have created a « written-by » object, added the author object to the written-by object and added the written-by object to the book object.

Does that difference make sense?

I think it does, but the downside is that the same simple document like this one:

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    xmlns="http://ns.treebind.org/example/";>
    <book>
        <title>RELAX NG</title>
        <written-by>
            <author>
                <fname>Eric</fname>
                <lname>van der Vlist</lname>
            </author>
        </written-by>
    </book>
</rdf:RDF>

will give a quite different set of objects depending which binding (XML or RDF) will be used.

That seems to be the price to pay to try to get as close as possible to the RDF model.

What do you think?

Earlier, I have mentioned that RDF can be told to accept documents with « shortcuts ». What I had in mind is:

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    xmlns="http://ns.treebind.org/example/";>
    <book>
        <title>RELAX NG</title>
        <author rdf:parseType="Resource">
            <fname>Eric</fname>
            <lname>van der Vlist</lname>
        </author>
    </book>
</rdf:RDF>

Here, we have used an attribute rdf:parseType= »Resource » to specify that the author element is a resource.

The triples generated from this document are:

rapper: Parsing file book3.rdf
_:genid1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://ns.treebind.org/example/book> .
_:genid1 <http://ns.treebind.org/example/title> "RELAX NG" .
_:genid2 <http://ns.treebind.org/example/fname> "Eric" .
_:genid2 <http://ns.treebind.org/example/lname> "van der Vlist" .
_:genid1 <http://ns.treebind.org/example/author> _:genid2 .
rapper: Parsing returned 5 statements

The model is pretty similar except that there is a triple missing (we have now 5 triples instead of 6).

The triple that is missing is the one that gave the type of the author element:

_:genid2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://ns.treebind.org/example/author> .

The other difference is that <http://ns.treebind.org/example/author> is now a predicate.

In this situation when we don’t have a type for a predicate linking to a resource, I propose that we follow the rule we use in XML and use the predicate to determine both the setter method and the class of the object to create to pour the resource.

What do you think? Does that make sense?

Thanks,

Eric


Thanks for your comments, either on this blog or (preferred) on the TreeBind mailing list.

One thought on “TreeBind goes RDF”

Laisser un commentaire

Votre adresse e-mail ne sera pas publiée. Les champs obligatoires sont indiqués avec *