Sun to buy the M from LAMP

Sun has announced its intention to buy MySQL, the number one database for web applications, used by both Google and Amazon but also powering most personal blogs.

Sun has considered that being the M in “LAMP” (Linux, Apache, MySQL, PHP) would be a good step toward being the “.” in “.com”, as they used to say in one of their taglines.

This announcement has been, and will be, widely commented on… Personally, I do hope that it will speed up better XML support in MySQL.

I had the opportunity to take a look at XML support in MySQL 5.1 for the chapters about databases in the book “Beginning XML”, which I co-wrote with Joe Fawcett (he covered SQL Server, and I wrote two sections about eXist and MySQL). My conclusion is that these features are a good start, but there is still a lot of work before they reach something that can match modern databases!

Knowing Sun’s long-term commitment to XML, I do hope that this will boost the development of new XML features.

While we’re speaking of modern databases, one of the leaders in terms of XML support is Oracle.

And today is also the date they’ve chosen to announce that they’re buying BEA. What’s the link between these two announcements? It’s a factor of 8.5! It will cost Oracle $8.5 billion to buy BEA, while Sun will pay only $1 billion for MySQL.

I don’t want to underestimate BEA’s business value, but it seems to me that in terms of overall visibility and contribution to the net economy, the factor should be the other way round!

That’s probably a good illustration that it remains more difficult to monetize open source development than commercial development.


To XForms or not to XForms?

Yahoo! has released its Yahoo! Mobile Developer Platform under the name “Blueprint”, and the news has been widely commented on by XForms fans: Micah Dubinko states that “Yahoo! introduces mobile XForms” and Erik Bruchez that “Yahoo! goes XForms”.

The roadmap published by Yahoo! appears to be much more cautious and just says that “Much of Blueprint’s philosophy and syntax comes from XForms”.

The developers’ guide clearly shows that while Yahoo! did borrow elements from the XForms recommendation, these elements do not belong to the XForms namespace: they are declared under a single namespace, where they cohabit with elements similarly borrowed from XHTML and with elements that are specific to Yahoo!.

The result seems as different from XForms as the WAP Forum’s WML was from XHTML.

While the defenders of a declarative approach can celebrate the fact that Yahoo! preferred it to a more procedural approach based on JavaScript, I think it is an overstatement to call this a success for XForms.

XForms was designed to be user-agent agnostic, and the development of a Basic version for low-end terminals has even been started.

Mobiles were obviously a target for XForms from the beginning, and Yahoo!’s adoption of a not-really-compatible clone can, on the contrary, be seen as a new failure.

This is especially regrettable for a technology that has a huge technical potential.

 


IE 6 claims to accept application/xhtml+xml!

One of the factors that has limited the adoption of XHTML is undoubtedly the fact that Internet Explorer does not accept the “application/xhtml+xml” media type: even though this is the official media type for XHTML documents, and even though Internet Explorer knows how to display XHTML documents, if you serve it a document with this media type it will not display it but will instead offer the user the choice of saving it or looking for an application able to read it.

In other words, IE only accepts XHTML documents if they masquerade as HTML documents!

Conversely, a browser such as Firefox, which accepts the “application/xhtml+xml” media type, will also display your XHTML documents if you pass them off as HTML documents, but it will quite naturally treat them as HTML documents, which can cause problems in some cases.

The ideal solution is therefore to detect whether the browser accepts XHTML documents and to serve XHTML or HTML documents depending on the result of this detection.

This feature is provided by Orbeon Forms, although it is not enabled in default installations. To enable it, you simply remove the comments surrounding the detection and the differentiated HTML/XHTML processing in the epilogue-servlet.xpl pipeline.

This pipeline is executed to perform the final formatting and send documents to browsers when the application runs in a servlet.

The corresponding test is the following:

<p:choose href="#request">
  <p:when test="contains(/request/headers/header[name = 'accept'], 'application/xhtml+xml')">
...

Without being an Orbeon Forms specialist, you will have understood that this tests, in an XML document representing the HTTP request, whether there is a header named “accept” that contains the string “application/xhtml+xml”.

The elegance of this approach is that instead of testing whether or not the browser is Internet Explorer, you test whether the browser accepts documents of type “application/xhtml+xml”. We can therefore hope that this test works whatever the browser, and that if IE one day accepts this document type, we will not have to change the test for XHTML documents to be served to it.

And it works: Firefox does receive XHTML documents, and IE receives HTML documents.

Except that… this shouldn’t work!

If you look at an HTTP request sent by IE, you will see something like:

Accept: */*
Accept-Language: fr
Accept-Encoding: gzip, deflate
If-Modified-Since: Mon, 24 Dec 2007 08:52:18 GMT
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)
Host: localhost:8080
Connection: Keep-Alive

Instead of giving a list of the document types it accepts, IE uses a wildcard and claims to accept all document types.

The test performed by Orbeon Forms is not compliant with RFC 2616, since it ignores the wildcards that the RFC nevertheless describes. Moreover, the RFC also specifies that in the absence of an “accept” header, the browser must be considered to accept all document types.

To make it more RFC compliant, you need to write:

<p:choose href="#request">
  <!-- See http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html -->
  <p:when
      test="contains(/request/headers/header[name = 'accept'], 'application/xhtml+xml')
            or contains(/request/headers/header[name = 'accept'], 'application/*')
            or contains(/request/headers/header[name = 'accept'], '*/*')
            or not(/request/headers/header[name = 'accept'])">
...

This is still a somewhat simplistic test, since it would accept bogus types such as “mon-application/*”, but it reacts correctly to headers that comply with the specification.
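To make the rule concrete, the same matching can be sketched outside of XPath, for instance in JavaScript (an illustrative sketch, not Orbeon code; the function name is made up):

```javascript
// Decide whether an Accept header value allows serving
// application/xhtml+xml, following the RFC 2616 rules discussed above:
// a missing header means "accept everything", and the wildcards
// "*/*" and "application/*" must be honoured.
function acceptsXhtml(acceptHeader) {
  // Per RFC 2616, no Accept header means the client accepts anything.
  if (acceptHeader === null || acceptHeader === undefined) {
    return true;
  }
  // Split the header into media ranges, dropping parameters such as
  // q values (a fuller implementation would treat q=0 as a refusal).
  return acceptHeader.split(",").some(function (range) {
    var mediaRange = range.split(";")[0].trim();
    var parts = mediaRange.split("/");
    if (parts.length !== 2) {
      return false; // malformed media range, e.g. "mon-application"
    }
    var type = parts[0], subtype = parts[1];
    return (type === "*" && subtype === "*") ||
           (type === "application" &&
            (subtype === "*" || subtype === "xhtml+xml"));
  });
}
```

Unlike a pure substring test, this version parses each media range, so a bogus type such as “mon-application/*” is rejected while “application/*”, “*/*” and a missing header are all accepted.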

That said, now that the test is correct, IE receives the XHTML version of the documents, since it claims to accept this document type, and a second test with specific processing for this browser has to be added…

This is what is done in the version of the epilogue-servlet.xpl pipeline used by the current version of the XMLfr planet.

Beyond the fact that Internet Explorer’s answer, although standards-compliant, is absolutely useless to a web server, what I find interesting in this story is the way a simple but incorrect test works in most cases, for the wrong reasons!

Adios Syncato

It’s been fun to use Syncato, but the lack of any kind of efficient anti-spam protection became really overwhelming, and I had to switch to something else to reopen the comments that I had been forced to close under Syncato.

I am giving WordPress a try, which is in a way the complete opposite of Syncato: I don’t much like its technical foundation (I had a look at its implementation of XML features, and I’ll come back to that if time permits), but it is so much more user-friendly that it’s difficult to resist… After all, I may be an XML geek, but I am also a user!

XSLT has been my friend again for this migration (the XML import used to initialize the WordPress database and the rewrite rules were both generated with XSLT). As a result, all the posts, comments and feeds are available through the same URIs, and the side effects should be minimal for the readers of this blog.

Farewell Syncato, I’ll miss your XML abilities!

A couple of things we got wrong ten years ago

I started both designing web pages and learning Java roughly ten years ago, back in 1996.

The first Web server I ever used was a Netscape server. It came with built-in server-side JavaScript, and we were convinced that JavaScript would be a language of choice for developing server-side Web applications.

Around the same period, I attended my first Java training. The instructor explained to us that the really cool thing about Java was its virtual machine, which could run everywhere, and that, for this reason, Java would become the obvious choice for client-side Web development.

Ten years later, we must admit that we got that completely wrong: Java is mostly used server side and JavaScript mostly client side!

Will that remain true in the future?

I would be surprised if Java grew client side, but I wouldn’t be surprised if JavaScript made a comeback server side.

Technically speaking, JavaScript is a good language, very comparable to scripting languages such as Python, Perl or Ruby, and the fact that it is used client side for increasingly complex functions should justify using it server side too.

There are good reasons to use the same language client and server side:

  • Developers don’t have to learn different languages to work client and server side.
  • It is easier to move functions from the server to the client or vice versa.
  • Functions can be duplicated client and server side.

Ruby on Rails and the Google Web Toolkit translate their source languages into JavaScript to solve similar issues; wouldn’t it be much easier if we could use the same language client and server side?

The duplication of functions is a point that I find really important.

Web 2.0 applications need to remain good Web citizens and serve full pages to clients rather than HTML placeholders for Ajax applications.

If you want to do that while keeping the Ajax fluidity, you end up implementing the same functions twice: server side to build the initial page and client side to update page fragments.

In the first chapter of Professional Web 2.0 Programming, I show how you can use the same XSLT transformation client and server side to achieve this goal. However, there is strong cultural pushback against XSLT from many developers, and server-side JavaScript should be a better alternative for them.
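The shared-function idea can be illustrated with a minimal JavaScript sketch (the function name and data shapes are made up for the example): a pure rendering function that the server calls to build the initial page and the client calls again to refresh a fragment.

```javascript
// A pure templating function: it touches neither the DOM nor any
// server API, so the exact same code can run in the browser and in a
// server-side JavaScript interpreter.
function renderItems(items) {
  return "<ul>" +
    items.map(function (item) {
      return "<li>" + item.title + " (" + item.comments + " comments)</li>";
    }).join("") +
    "</ul>";
}

// Server side (sketch): embed renderItems(initialItems) in the full
// HTML page served to the client, so the page is complete and crawlable.
// Client side (sketch): after an Ajax update, refresh the fragment with
// element.innerHTML = renderItems(updatedItems), without duplicating
// the templating logic.
```

The point of the sketch is that the templating logic is written once and shared, instead of being maintained in two languages on the two sides.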

What should the ideal JavaScript framework look like?

There are already several JavaScript frameworks around; unfortunately, all those that I have found follow the same templating principles as PHP or ASP.

For me, the killer JavaScript framework would be modeled after Ruby on Rails or Pylons.

Tell me if you find one!

Google API shift

Google kills their Search API. So what?

I learned the news through David Megginson’s Quoderat, under the title “Beginning of the end for open web data APIs?”, but I don’t agree with his analysis, even though it is shared by all the other posts I have read on the subject.

David writes: “The replacement, Google AJAX API, forces you to hand over part of your web page to Google so that Google can display the search box and show the results the way they want (with a few token user configuration options), just as people do with Google AdSense ads or YouTube videos”, which leads him to conclude that the whole of open web data and mash-ups “all end up [could be] on the losing side”.

This is not what I understand when I read the Google AJAX Search API Documentation.

The “Hello, World” of the Google AJAX Search API does use a method in which you hand over to Google a node in your page where they include the markup for their search results, but there is more to their API than that.

If you are not happy with this basic method, you can use Search Control Callbacks to get search results delivered to your own JavaScript methods and do whatever you want with them.

What’s the difference with the SOAP search API, then?

The difference is twofold:

  • You trade an API that needs to be used server side for an API that needs to be used client side. Because of the same-origin policy, the SOAP API had to be implemented on your own server, acting as a proxy. By contrast, the new Ajax API is designed to be used directly in the browser. It would be interesting to test whether you can use this API in a server-side JavaScript interpreter, but this is obviously not Google’s main target!
  • You trade a SOAP API, which is platform- and language-independent, for a JavaScript API. From a developer’s perspective, if you accept the fact that the API is used client side, that doesn’t make a lot of difference. On the contrary, most developers will probably be happy to use an API that is simpler than a SOAP client API.

When you think about it, this isn’t so much the end of mashups as a shift from server-side mashups to client-side mashups.

This Ajax Search API appears to use the same concepts and principles as the Google Maps API, and it’s weird to see people who consider the Google Maps API the best invention since sliced bread also consider the Google AJAX Search API evil.

Client-side mashups are generally easier to implement, since they do not rely on any piece of software installed on your server; however, the benefit of server-side mashups is that they can include content in the HTML pages that they serve, making them good web citizens that are accessible and crawlable.

I don’t regret the SOAP API (SOAP is almost as evil as any API), but what I do regret is that Google doesn’t publish both an Ajax API to make client-side mashups easy and a REST API that would be used by their Ajax API and could also be used server side.

Why XML Experts Should Care About Web 2.0

Here is the talk I had prepared for the Web 2.0 panel at the XML 2006 conference. It was a very interactive panel, and even though I didn’t pronounce exactly the same sentences, the message is the same.

I had proposed a whole session titled “Why XML Experts Should Care About Web 2.0”. I tried to shrink this 45-minute presentation to fit within a 5-minute slot, but that didn’t really work. Instead of presenting the result of this hopeless exercise, I will use a well-known metaphor. Of course, metaphors do not prove anything, but they are great for quickly illustrating a point, and that’s what I need.

Bamboo stems can reach 40 meters in height with diameters of up to 30 cm, and some species can grow over one meter per day. Despite that, they are so strong that in Asia they are used to build scaffolding for skyscrapers. This performance is due to the tube-like structure of the stems, reinforced by their nodes.

It recently occurred to me that IT (and probably science in general) progresses like bamboo, alternating periods of fast innovation with periods of consolidation. It is interesting to note that the prominent actors in these phases are often different. Consolidation builds on prior experience and is a good job for established experts. On the other hand, expertise often tends to censor new ideas, and it can seriously limit the ability to innovate.

This theory is well illustrated by the history of the World Wide Web. 

In the eighties and early nineties, hypertext experts were stuck with the complexity of their models, and a new phase of innovation began with the invention of HTTP and HTML.

The consolidation phase was launched ten years ago by Jon Bosak when he said “You have to put SGML on the web. HTML just won’t work for the kinds of things we’ve been doing in industry.”

Within five years, this consolidation phase grew to a stage where the XML stack is so heavy that it looks like legacy. Its development has almost stalled, and a new innovation phase was badly needed.

Those of you who know me know me as an XML expert, and like many XML experts, I was kept away from Web 2.0 for a long time by the crazy hype obscuring it.

I started to look at what’s behind the hype a year ago. Having done so, I am happy to report that Web 2.0 could be the next innovation phase.

A good indication is that XML experts predict that Web 2.0 will fail for the same reasons hypertext experts predicted that HTML would fail: Web 2.0 is messy, over simplistic, not well enough designed, … 

If Web 2.0 is the next innovation phase, what should we do? 

We can contribute, actively follow the growth of the phenomenon and provide guidance, but we should avoid being too directive for the moment.

My first personal contribution is my book “Professional Web 2.0 Programming”. This book is for anyone wanting to catch the Web 2.0 wagon. It’s also a set of reminders and guidance, but we’ve tried to be as open as possible; for instance, we have covered not only XML but also its alternatives (including controversial technologies such as JSON).

If we stay ready, our turn will come again when the next consolidation phase starts.

This consolidation phase will eventually put XML on the Web like XML has (at least partially) put SGML on the Web. 

Will XML on the Web still be XML? Maybe not: SGML on the Web is no longer SGML, so why should XML necessarily survive the next iteration? Anyway, does that really matter?

Our Web 2.0 book appears to be tough to classify

I arrived in Boston yesterday evening to participate in the XML 2006 conference.

Today, I spent most of my time walking around town, and I couldn’t resist entering the first bookshop I found to check whether they had our new Web 2.0 book.

This bookshop happened to be Borders, 10 School Street, and it took me a while to find the book because it was neither with the other books about the Web nor with other suspects such as books about Ajax, but rather next to my XML Schema book and “HTML 4 for Dummies” (I haven’t understood why that other book was there either), among a bunch of books about XSLT.

Our book is probably difficult to classify because it covers a lot of subjects, but, even though I have been involved in it, it is certainly not a book about XML and should rather be classified as a book about the Web!