Why is exposing a document model so important? Why would that be better than providing import/export capabilities or API access to the document model?
The « view source effect » is often considered one of the reasons why XML is so important: people can learn by opening existing documents and copy/paste the parts they like into their own documents.
Following this analysis, the view source effect would be one of the main reasons for the success of the web: you can learn by looking at the source of the pages you consider good examples.
The view source effect is important indeed, but to reach its full potential, copy/paste needs to be automated so that the view source effect becomes the « transform source effect ».
The ability to transform sources means that you don’t need to fully understand what’s going on to take advantage of markup language formats: you can just do text substitution on a template.
The web is full of examples of the power of the transform source effect: the various templating languages such as PHP, ASP, JSP and many more are nothing more than implementations of the transform source effect.
The « style free stylesheets » that power XMLfr, which I have described in an XML.com article, are another example of the transform source effect.
How does that relate to desktop publishing formats? Let’s take a simple example to illustrate that point.
Let’s say I am a programmer and I need to deliver an application that takes letter templates and prints them after changing the names and addresses.
Let’s also imagine that I do not know anything about the specifics of the different word processors and that I want my application to be portable across Microsoft Word, OpenOffice, WordPerfect, AbiWord and Scribus.
Finally, let’s say, for fun, that I do not know anything about XML but that I am an XXX programmer (replace XXX with whatever programming language you like).
Because all these word processors can read and write their documents as XML, I’ll just write a simple program that substitutes predefined string values included in the documents (let’s call them $name, $address, …) with the contents of variables that I could retrieve, for instance, from a database.
I am sure that you know how to do that with your favorite programming language! In Perl for instance, that would be something like:
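#!/usr/bin/perl
# Values to insert; in a real application they could come from a database.
$name = 'Mr. Eric van der Vlist';
$address = '22, rue Edgar Faure';
$postcode = 'F75015';
$city = 'Paris';
$country = 'France';
# Read the document line by line (files named on the command line, or standard input).
while (<>) {
    # Replace each placeholder with the package variable of the same name.
    # ${$1} is a symbolic reference, so this relies on "use strict" being off.
    s/\$(name|address|postcode|city|country)/${$1}/g;
    print;
}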
There is no magic here: in a plain text document, I am just replacing occurrences of the string « $name » with the value of the variable $name, which contains « Mr. Eric van der Vlist », occurrences of « $address » with « 22, rue Edgar Faure », and so on.
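For readers whose XXX is Python rather than Perl, the same substitution could be sketched as follows. This is only an illustration: the names `VALUES`, `PLACEHOLDER` and `fill_template` are made up for this example, and, like the Perl script above, it treats the document as plain text and knows nothing about XML.

```python
import re

# Hypothetical values; a real application might read them from a database.
VALUES = {
    "name": "Mr. Eric van der Vlist",
    "address": "22, rue Edgar Faure",
    "postcode": "F75015",
    "city": "Paris",
    "country": "France",
}

# Matches $name, $address, ... for every key in VALUES.
PLACEHOLDER = re.compile(r"\$(" + "|".join(map(re.escape, VALUES)) + r")")


def fill_template(text):
    """Replace each known $placeholder in text with its value."""
    return PLACEHOLDER.sub(lambda m: VALUES[m.group(1)], text)


# The template can be a fragment of any format that is exposed as text.
template = "<p>Dear $name, $address, $postcode $city, $country</p>"
print(fill_template(template))
```

Placeholders that are not listed in `VALUES` are left untouched, which mirrors the explicit list of names in the Perl regular expression.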
I am leveraging the « transform source effect » to write a simple application that is compatible with any application that enables this effect by exposing its model as plain text.
This application will work with Microsoft Word (using WordML and probably even RTF), OpenOffice, WordPerfect, AbiWord, Scribus and many more.
It will also work with HTML, XHTML, XSL-FO, SVG, DocBook, TEI, plain text, TeX, …
It will work with Quark, but only if we use QXML as an XML format and not as an API.
And it won’t work with InDesign unless there is a way to import/export full InDesign documents in XML…
See also: