HTML 5 is not just HTML 4 + 1
This announcement has already been widely discussed, and I won’t go back over the detailed differences between HTML 4.01 and HTML 5, which are covered in one of the documents published with the Working Draft. What I find unfortunate is that this document, like much of the commentary on HTML 5, focuses on detailed syntactic differences between the two versions rather than on the more fundamental ones.
These fundamental differences are clearly visible as soon as you read the introduction:
The World Wide Web’s markup language has always been HTML. HTML was primarily designed as a language for semantically describing scientific documents, although its general design and adaptations over the years has enabled it to be used to describe a number of other types of documents.
The main area that has not been adequately addressed by HTML is a vague subject referred to as Web Applications. This specification attempts to rectify this, while at the same time updating the HTML specifications to address issues raised in the past few years.
This introduction does a good job of setting the context and expectations: the goal of HTML 5 is to move from documents to applications. This is confirmed in many other places, for instance in the section titled “Relationship to XUL, Flash, Silverlight, and other proprietary UI languages”:
This specification is independent of the various proprietary UI languages that various vendors provide. As an open, vendor-neutral language, HTML provides for a solution to the same problems without the risk of vendor lock-in.
To understand this bold move, we need to put it back in context.
Nobody denies that HTML was created to represent documents, but its success comes from its neutrality: even if it is fair to say that Web 2.0 is the web as it was meant to be, the makers of HTML couldn’t have imagined everything that can be done in modern web applications. If these applications are possible in HTML, it is because HTML was designed to be neutral enough to describe yesterday’s, today’s and probably tomorrow’s applications.
If, on the contrary, HTML 4.01 had attempted to describe in 1999 what a web application was, that description would at best have had to be worked around, and it might even have slowed down the development of Web 2.0.
This is why I would level at HTML 5 the same kind of criticism I made of W3C XML Schema: over-specifying how a document should be used risks stifling creativity and increasing the coupling between applications.
While many people agree that web applications should be designed as documents, HTML 5 appears to propose to move from documents to applications. This seems to me to be a major step… backward!
Flashback on HTML’s history
Another point that needs to be highlighted is the relationship between HTML 5 and XML in general, and XHTML in particular.
HTML 5 presents itself as the sibling of both HTML 4.01 and XHTML 1.1 and as a competitor of XHTML 2.0.
To understand why the W3C is developing two competing standards, we need a brief reminder of the history of HTML.
HTML was originally designed as an SGML vocabulary and uses some SGML features to reduce the verbosity of its documents. This is the case, for instance, of tags such as <img> or <link>, which do not need to be closed in HTML.
XML was designed as a simplification of SGML, and this simplification does not permit the features HTML uses to reduce its verbosity.
When XML was published, the W3C found itself with an SGML application in one hand (HTML) and a simplification of SGML in the other (XML), and the two recommendations were incompatible.
To make them compatible, the W3C decided to recast HTML so that it conformed to the XML recommendation while keeping exactly the same features. This led to XHTML 1.0 and then to XHTML 1.1, which is roughly the same thing cut into modules that can be used independently.
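The incompatibility is easy to reproduce. The following is a minimal sketch, assuming Python's standard library parsers as stand-ins for an HTML parser and an XML parser; the fragment and the helper names (`html_start_tags`, `is_well_formed_xml`) are made up for the illustration.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical fragment: legal HTML, but <img> is never closed.
FRAGMENT = '<p>Logo: <img src="logo.png" alt="logo"></p>'


class TagCollector(HTMLParser):
    """Record the start tags reported by the lenient HTML parser."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def html_start_tags(markup):
    """Parse markup as HTML and return the start tags that were seen."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


def is_well_formed_xml(markup):
    """Return True if an XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


print(html_start_tags(FRAGMENT))     # the HTML parser reports both tags
print(is_well_formed_xml(FRAGMENT))  # the XML parser rejects the fragment
```

Writing the image as `<img src="logo.png" alt="logo"/>` makes the same fragment acceptable to the XML parser, which is essentially the change XHTML asks of authors.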
Since forms were one of the weaknesses of HTML, the W3C also worked on XForms, a new generation of web forms, and started work on a new version of XHTML with new features: XHTML 2.0, which is still a work in progress.
The approach looked so obvious that the W3C probably neglected to check that the community was still following its work. In the euphoria that followed the publication of XML 1.0, many people were convinced that the browser war was over; interest in HTML, which had been partly fueled by this war, started to decline, and the W3C’s work in this area didn’t seem to attract much interest compared to, say, XML schema languages or Web Services.
It is also fair to say that the practical benefit of moving from HTML to XHTML wasn’t (and still isn’t) obvious to web site developers, since the features are the same. Migrating a site from HTML to XHTML involves extra work that is rewarded only by the joy of displaying a “W3C XHTML 1.x compliant” logo!
This was also the moment when Microsoft stopped all development on Internet Explorer and Netscape handed its development over to Mozilla.
The old actors of the browser war, well represented at the W3C, which had been one of their battlefields, gave way to new actors (Mozilla, Opera and Apple/Safari) that were younger and less willing to accept the heaviness of W3C procedures.
At the same time, the first Web 2.0 applications sparked a new wave of creativity among web developers, and all of this happened outside the W3C. This is not necessarily a bad thing, since the mission of standards bodies such as the W3C is to standardize rather than to innovate, but the W3C doesn’t appear to have correctly estimated the importance of these changes and seems to have lost contact with its users.
And when these users, led by Opera, Mozilla and Safari, decided that it was time to move HTML forward, they chose not to jump on the XHTML 2.0 bandwagon but to create their own working group, WHATWG, outside the W3C. This is where the first versions of HTML 5 were drafted, together with Web Forms 2.0, a sister document designed as an enhancement of HTML forms that is simpler than XForms.
Microsoft was still silent on the subject, and the W3C found itself editing a promising new specification, XHTML 2.0, that didn’t seem to attract much attention while, outside, a new specification claiming to be the true successor of HTML was being developed by the most promising outsiders in the browser market.
At XTech 2007, I had a chance to measure the depth of the gulf that separates the two communities by attending a debate between both working groups.
Tim Berners-Lee must have found this gulf too deep when he decided to invite the WHATWG to continue its work within the W3C, in a Working Group created for this purpose and distinct from the XHTML 2.0 Working Group, which continues its work as if nothing had changed.
HTML 5 or XHTML 2.0?
So the W3C now has two distinct and competing Working Groups.
Missions are very close
The XHTML 2.0 Working Group develops an extensible vocabulary based on XML:
The mission of the XHTML2 Working Group is to fulfill the promise of XML for applying XHTML to a wide variety of platforms with proper attention paid to internationalization, accessibility, device-independence, usability and document structuring. The group will provide an essential piece for supporting rich Web content that combines XHTML with other W3C work on areas such as math, scalable vector graphics, synchronized multimedia, and forms, in cooperation with other Working Groups.
The HTML Working Group focuses on the continuity with previous HTML versions:
The mission of the HTML Working Group, part of the HTML Activity, is to continue the evolution of HTML (including classic HTML and XML syntaxes).
The conciseness of this sentence doesn’t mean that the HTML Working Group is unconcerned with extensibility and cross-platform support: the list of deliverables says “there is a single specification deliverable for the HTML Working Group, the HTML specification, a platform-neutral and device-independent design” and later on “the HTML WG is encouraged to provide a mechanism to permit independently developed vocabularies such as Internationalization Tag Set (ITS), Ruby, and RDFa to be mixed into HTML documents”.
The policy is thus clearly to develop two specifications and let users choose, at the risk of seeing a standards war develop.
XHTML 5 is a weak alibi
The same policy shows up within the HTML 5 specification itself, which offers a choice between two syntaxes:
This specification defines an abstract language for describing documents and applications, and some APIs for interacting with in-memory representations of resources that use this language.
The in-memory representation is known as “DOM5 HTML”, or “the DOM” for short.
There are various concrete syntaxes that can be used to transmit resources that use this abstract language, two of which are defined in this specification.
The first such concrete syntax is “HTML5”. This is the format recommended for most authors. It is compatible with all legacy Web browsers. If a document is transmitted with the MIME type text/html, then it will be processed as an “HTML5” document by Web browsers.
The second concrete syntax uses XML, and is known as “XHTML5”. When a document is transmitted with an XML MIME type, such as application/xhtml+xml, then it is processed by an XML processor by Web browsers, and treated as an “XHTML5” document. Authors are reminded that the processing for XML and HTML differs; in particular, even minor syntax errors will prevent an XML document from being rendered fully, whereas they would be ignored in the “HTML5” syntax.
This section, which fortunately is non-normative, appears to rule out a browser accepting any HTML document other than HTML5 or any XHTML other than XHTML5!
Furthermore, with such a notice, I wonder who would want to choose XHTML 5 over HTML5…
This notice relies on a frequent misunderstanding of the XML recommendation. It is often said that XML parsing must stop after the first error, but the recommendation is much more flexible than that and distinguishes two types of errors:
- An error is “a violation of the rules of this specification; results are undefined. Unless otherwise specified, failure to observe a prescription of this specification indicated by one of the keywords MUST, REQUIRED, MUST NOT, SHALL and SHALL NOT is an error. Conforming software MAY detect and report an error and MAY recover from it.”
- A fatal error is “an error which a conforming XML processor MUST detect and report to the application. After encountering a fatal error, the processor MAY continue processing the data to search for further errors and MAY report such errors to the application. In order to support correction of errors, the processor MAY make unprocessed data from the document (with intermingled character data and markup) available to the application. Once a fatal error is detected, however, the processor MUST NOT continue normal processing (i.e., it MUST NOT continue to pass character data and information about the document’s logical structure to the application in the normal way).”
We see that, on the contrary, the XML recommendation allows an XML processor to recover from simple errors.
One may argue that what XML treats as a fatal error may look to users like a simple error; this would be the case, for instance, of an <img> tag that isn’t closed. But even for fatal errors, the recommendation doesn’t stipulate that the browser must not display the document. It does require the parser to report the error to the browser, but it says nothing about how the browser should react. Similarly, the recommendation requires normal processing to stop, because the parser would be unable to reliably report the structure of the document, but it doesn’t say that the browser can’t switch to a recovery mode in which it tries to correct the error.
In fact, if browsers are so strict when they display XML documents, it is not in order to conform to the XML recommendation but because, at the time they implemented their XML support, there was a consensus that they should be strict.
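To make the distinction between the parser and the application concrete, here is a minimal sketch in Python. It is only an illustration of one possible application policy, not a description of what any browser does: the XML parser reports the fatal error, and the application (the hypothetical `render_text` function) decides to fall back to a lenient HTML parser and keep the error message for the user.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Lenient fallback: collect character data, ignoring tag structure."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def render_text(markup):
    """Return (text, error). error is None when markup is well-formed XML."""
    try:
        root = ET.fromstring(markup)
        return "".join(root.itertext()), None
    except ET.ParseError as exc:
        # The parser has done its job: the fatal error is reported.
        # What happens next is the application's choice (an assumption of
        # this sketch): switch to a recovery mode instead of giving up.
        fallback = TextExtractor()
        fallback.feed(markup)
        fallback.close()
        return "".join(fallback.chunks), str(exc)


text, error = render_text('<p>Hello <img src="x.png"> world</p>')
print(text)   # the text is still shown to the user
print(error)  # and the well-formedness error is still reported
```

The design choice here is that recovery lives entirely in the application: the XML parser never passes a repaired structure along as if nothing had happened.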
At that time, everyone had in mind the consequences of the browser war, which was one of the reasons why browsers accepted pretty much anything that pretended to be HTML. While this can be considered a good thing in some cases, it also means implementing a lot of undocumented algorithms, which leads to major interoperability issues.
The decision to be strict when displaying XML documents came as a good resolution for a new era, and nobody seemed to dissent at the time.
If this position needs to be revisited, it would be ridiculous to throw away XML, since we have seen that this strictness isn’t imposed by the recommendation.
The whole way in which the two HTML5 syntaxes are presented clearly indicates that the XML syntax, which was not mentioned in the first HTML5 drafts, has been added as a compromise so that the W3C doesn’t look as if it had rejected XML, but that the real intent is to maintain and promote a non-XML syntax.
HTML 5 gets rid of its SGML roots
Not only does HTML 5 reject XML, it also abandons any kind of compatibility with SGML, stating clearly that “while the HTML form of HTML5 bears a close resemblance to SGML and XML, it is a separate language with its own parsing rules”.
This sentence is symptomatic of the overall attitude of the specification, which seems to claim to build on the experience of the web while ignoring the experience of markup languages, at the risk, once again, of freezing the web in its current state.
The attitude of the XHTML Working Group is better balanced. Of course, XHTML 2.0 is about building on the most recent web developments, but it does so while retaining the experience acquired in developing XML and SGML vocabularies.
Radically different technical approaches
Without entering into a detailed comparison, two points are worth mentioning.
XHTML 2.0 is more extensible
Both specifications acknowledge the need to address requirements that have appeared since HTML was created and that it does not support well, but their methods for doing so are totally different.
HTML 5 has adopted a method that looks simple: if a new need is considered important enough, a new element is added. Since many pages contain articles, a new <article> element is added. And since most pages have navigation bars, a new <nav> element is added…
We have seen the limits of this approach with the big vocabularies used in document applications: the number of elements explodes and simplicity turns into complexity. It becomes difficult to choose the right element, and since these elements are specialized, they never exactly meet your needs.
In the long term, applying this approach to HTML more or less turns it into a kind of DocBook clone for the web.
XHTML 2.0 has taken the opposite approach. The idea is to start with a clean-up and remove from XHTML any element that isn’t absolutely necessary.
The downside is that authors then have to fall back on the class attribute to say what an element means: its values aren’t standardised, and it ends up conveying information about the meaning of an element rather than defining the way it should be displayed. This kind of hijack is pretty common, since it is also the foundation of microformats.
To avoid this hijack while keeping the flexibility of this approach, XHTML 2.0 proposes a role attribute that defines the role of XHTML elements. This attribute can take a set of predefined values, together with ad hoc values distinguished by their namespaces.
This method is a way to introduce the same kind of features that will be added to HTML 5 without adding new elements. This is more flexible since anyone can create new values in new namespaces. This also gives microformats a way to build upon something more solid than the class attribute that can be used again to define how elements should be presented.
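As a rough illustration of how such role values could be consumed, here is a minimal Python sketch. The markup is a simplified, hypothetical snippet (not a complete XHTML 2.0 document), and the `ex` prefix, its URI and the helper names are assumptions made up for the example; only the idea of predefined values plus prefixed ad hoc values comes from the text above.

```python
import xml.etree.ElementTree as ET

# Hypothetical prefix mapping for ad hoc role values (assumption of this sketch).
PREFIXES = {"ex": "http://example.org/roles#"}

# Simplified snippet: one predefined role, one ad hoc role, one plain class.
SNIPPET = """
<body>
  <div role="navigation"><a href="/">Home</a></div>
  <div role="ex:promo">Special offer</div>
  <div class="navigation">Styled like a nav bar, no declared role</div>
</body>
"""


def expand_role(value):
    """Expand 'prefix:name' using PREFIXES; leave unprefixed values as-is."""
    prefix, sep, name = value.partition(":")
    if sep and prefix in PREFIXES:
        return PREFIXES[prefix] + name
    return value


def elements_with_role(root, role):
    """Return the elements whose role attribute contains the given role."""
    wanted = expand_role(role)
    return [
        el
        for el in root.iter()
        if wanted in {expand_role(v) for v in el.get("role", "").split()}
    ]


root = ET.fromstring(SNIPPET)
print(len(elements_with_role(root, "navigation")))  # class="navigation" is not counted
print(elements_with_role(root, "ex:promo")[0].text)
```

The point of the sketch is that the class attribute is left free for presentation, while anyone can introduce a new role in their own namespace without touching the element vocabulary.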
Documents versus applications
Another important point that differentiates these two specifications is their balance between data and applications, or processing.
XHTML 2.0 is built upon the XML stack:
- The lower level is syntactical and consists of the XML and namespaces recommendations.
- On top of this layer, the XML infoset defines a data model independent of any kind of processing.
- APIs, specific languages (XPath, XQuery, …) and schema languages are built on top of this data model.
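The three layers listed above can be sketched in a few lines of Python. This is only an approximation under stated assumptions: ElementTree's in-memory tree stands in for the infoset, and its `findall` supports only a subset of XPath; the sample document and the `link_targets` helper are made up for the illustration.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"  # the XHTML 1.x namespace

# Hypothetical sample document.
DOC = f"""<html xmlns="{XHTML_NS}">
  <body>
    <h1>Title</h1>
    <p>First <a href="a.html">link</a></p>
    <p>Second <a href="b.html">link</a></p>
  </body>
</html>"""


def link_targets(markup):
    """Return the href of every link found inside a paragraph."""
    # Layer 1: syntax. The parser checks well-formedness and resolves namespaces.
    root = ET.fromstring(markup)
    # Layer 2: data model. 'root' is a tree of elements, attributes and text,
    # independent of what any application will do with it.
    # Layer 3: a query language on top of the data model (XPath subset).
    ns = {"h": XHTML_NS}
    return [a.get("href") for a in root.findall(".//h:p/h:a", ns)]


print(link_targets(DOC))
```

Nothing in the query depends on how the document was serialized or on which application produced it, which is the loose coupling this architecture is meant to provide.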
It took a few years to build this architecture, and things haven’t always been this clear and simple, but its big benefit is that it separates data from processing and is exactly what is needed for loose coupling between applications.
We’ve seen that HTML 5 has cut all its links to XML and SGML, which means that it doesn’t rely on this architecture. On the contrary, it mixes everything (syntax, data model and API, i.e. the DOM) in a single specification.
This is because, as we’ve already seen, HTML 5 is a vocabulary to develop web applications rather than a vocabulary to write documents.
The architecture on which XHTML 2.0 is built doesn’t prevent people from developing applications, but it dissociates more clearly these applications from the content.
Will the best specification win?
For all these reasons, HTML 5 looks to me like a big step backward, and XHTML 2.0 seems to be a much better alternative.
Does that mean that XHTML 2.0 will be the winner or on the contrary, does the fact that HTML 5 is written by those who develop web browsers mean that XHTML 2.0 is doomed?
XHTML 2.0 has a strong handicap, but the battle isn’t lost yet. The HTML Working Group doesn’t expect HTML 5 to become a recommendation before Q3 2010, and anything can happen before that date.
It is up to us, the users, to vote with our feet and pen and start by boycotting the HTML 5 features that are already implemented in some browsers.
And in the short term, certifying that a page is valid XHTML 1.x is a good way to certify that it doesn’t contain HTML 5 features!