First steps with MarkLogic

[Edited to take into account Dave Cassel’s comments]

To get started with MarkLogic I have chosen to develop a persistence layer for Orbeon Form Runner.

This is the kind of project I like: small enough to be done in a few days, yet technical enough to touch on advanced topics and potentially useful to other people.

The project is available on my community site and I’d like to share in this post my feelings during this first contact with MarkLogic.

The first contact with a new product is the installation, and I have been really surprised by the simplicity of the MarkLogic installation process. My laptop is running Ubuntu, which is not a supported platform, but the install went very smoothly after converting the RPM package, as documented in many places on the web, and it didn’t take me more than a few minutes to get MarkLogic up and running.

The second contact, with the admin interface, has been less straightforward: MarkLogic comes with a series of different generations of web UIs (admin, configuration manager, monitoring, information studio, application builder and query console) and it’s not always easy to find your way between these tools.

I must also say that I am an old-school administrator who prefers configuration files to point-and-click administration windows!

Fortunately this is well documented and I have rapidly been able to create a new database and servers for my project. The interface with my favorite XML tool, oXygen XML Editor, has also been very easy to set up.

The feeling that hasn’t left me throughout this project is one of stability and robustness: I have never needed to restart the server, all the configuration changes have been made while the server was up and running, and I have never seen a crash or an incomprehensible error message.

In other words, MarkLogic is the kind of software which makes you feel secure and comfortable!

A Form Runner persistence layer is a REST API, and such APIs are reasonably easy to implement in MarkLogic thanks to its REST library. I think I have found a bug (I am pretty good at that, all the products I have worked with will tell you so) but it was in a minor function and nothing really blocking.

Something to note if you want to try it yourself is that paths to documents in a database do not always start with a « / », and « foo/bar » is a different directory than « /foo/bar ». To search all the documents under « foo/bar/ » you’ll write something such as:

cts:search(/, cts:directory-query('foo/bar/', "infinity"))

If you forget the trailing slash (« foo/bar ») MarkLogic will raise an error with a self-explanatory message, but if you add a leading slash (« /foo/bar/ »), as you’d do for any decent file system, you will search in a different directory and your search may silently return an empty sequence!

In fact, as pointed out by Dave Cassel, MarkLogic considers that « foo/ » is a root directory like « / » and that « /foo/ » is a subdirectory of the root directory « / ». A database can thus have as many root directories as you want, but you need to be careful: if you insert a document as « foo/bar/bat.xml » you won’t be able to find it as « /foo/bar/bat.xml »!
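To make the difference concrete, here is a minimal sketch that can be pasted into Query Console against a scratch database. The URIs and document content are hypothetical, chosen only for illustration; the functions used (xdmp:document-insert, cts:search, cts:directory-query, xdmp:node-uri) are standard MarkLogic built-ins. I am assuming the usual semicolon syntax to separate the two transactions, so that the inserted documents are committed before the search runs.

```xquery
xquery version "1.0-ml";

(: Hypothetical documents: same path, with and without a leading slash. :)
xdmp:document-insert("foo/bar/bat.xml", <doc>no leading slash</doc>),
xdmp:document-insert("/foo/bar/bat.xml", <doc>leading slash</doc>);

xquery version "1.0-ml";

(: Second transaction: search each directory and list the URIs found. :)
for $dir in ("foo/bar/", "/foo/bar/")
return
  <directory name="{$dir}">{
    for $doc in cts:search(fn:doc(), cts:directory-query($dir, "infinity"))
    return <uri>{xdmp:node-uri($doc)}</uri>
  }</directory>
```

If the behaviour is as described above, each directory should report only the document inserted under its own URI, which is why mixing the two styles in one application can make documents appear to be missing.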

And as you’ve noticed with this simple snippet, you’ll have to use many proprietary functions to develop XQuery applications in MarkLogic. This is not really a problem specific to MarkLogic: XQuery has been defined to be generic, and we use it for things which are well beyond its original scope.

The good news is that MarkLogic comes with a very extensive library, so you won’t get stuck in your developments. The bad news is of course that what you develop in MarkLogic won’t be easily portable to other XML databases.

The last thing I want to report is the quality of the online documentation, on MarkLogic Community but also on the web at large and on Stack Overflow in particular: during my development I have always been able to find answers to the many questions I had in a very reasonable amount of time.

To summarize, I haven’t had the opportunity to test the support of big data yet, but this first contact leaves me with a very positive feeling of a product which is mature, stable, rich in features, well documented and well supported by its community.

7 thoughts on “First steps with MarkLogic”

  1. Eric, regarding the leading slash for directory-queries, I wonder whether it’s due to URIs of your data set. When you inserted them, did you include the leading slash? In Query Console, select the database you’re working with and click Explore; you’ll be able to browse the documents in the database.

    1. Dave,

      You’re right, the leading slash is not included when the documents are inserted. « foo/bar » and « /foo/bar » seem to be two different paths which I don’t find very intuitive ;) …

      Thanks,

      Eric

      1. The key thing to know is that something can be in a directory, but doesn’t have to be. In the case of « foo/bar.xml », « foo/ » is the top level directory. With « /foo/bar.xml », « foo/ » is a subdirectory under the root « / ». Not requiring directories allows for URIs like « foo.xml » or « http://data.source.com/content/foo.xml ». My preference is to always use the root directory, because I agree that’s more intuitive.

        (disclaimer: MarkLogic employee)
