The issue with using XML is that, somewhere along the line, an XML parser has to be included, the obvious choice being libxml. Now, libxml is quite useful, and provides all the methods you can reasonably expect a parsing library to provide - but the downside is that it adds a megabyte or more to the code.
We can't expect Windows users to have libxml installed, so we will end up either bundling it with EE or statically linking it into EE itself. Either way, it will bloat up the source, and to be honest, I'm not sure XML is the right base format for syntax files. (CE didn't use XML either, but to be fair, XML was invented after CE.)
If we take the idea of parsing the XML and creating a serialised/binary class, we're already halfway to an INI-style format, which predates XML by a number of years. (Windows 3.0 was using it at the start of the 1990s and it may have been applicable to even earlier versions, let alone what Unix was doing with .rc files.)
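To give a feel for how little code an INI-style reader needs, here is a minimal sketch in Python. It is only an illustration, not EE's actual code: the section names, the `;` comment marker and the one-entry-per-line layout are all assumptions of mine, not a proposed file format.

```python
def parse_syntax_ini(text):
    """Parse a tiny INI-style syntax definition into {section: [entries]}."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(";"):
            continue  # skip blank lines and ';' comments (assumed comment marker)
        if line.startswith("[") and line.endswith("]"):
            # A new section header such as [keywords]
            current = line[1:-1].strip()
            sections.setdefault(current, [])
        elif current is not None:
            # Any other line is one entry belonging to the current section
            sections[current].append(line)
    return sections


# Hypothetical sample data, purely for illustration
SAMPLE = """\
; hypothetical syntax file
[keywords]
if
else
while

[operators]
+
-
"""

parsed = parse_syntax_ini(SAMPLE)
print(parsed)
```

A real format would need some way to escape entries that begin with `[` or `;`, which this sketch deliberately ignores.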
I do take the point that XML is a known format, especially in recent times, but I think that the format itself isn't quite what we're after - if you have a number of items that vary in depth (e.g. nesting levels), or a number of different elements that can share the same level, then it's fine. But if you have half a dozen elements total, with potentially hundreds of entries per element, you'll end up with something like: (rewriting my original example)
14 May 2006
For the purposes of brevity, I have ignored the XML header or global-container style tags.
While more readable, it is significantly larger, and the difference becomes more obvious with larger syntaxes - e.g. PHP's syntax has at least a couple of thousand special words, and each of those would need its own pair of matching tags plus any whitespace.
As a comparison, I have taken the two syntaxes above and counted the characters in each - ignoring any whitespace such as non-essential spaces (the space in E. Editor, for example, remains), end-of-line characters and blank lines.
The original: 195 characters
The XML equivalent: 414 characters
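The same kind of count can be reproduced with a few lines of Python. This is a sketch under my own assumptions: the `<item>` tag name, the section name and the keyword list are made up for illustration and are not the examples measured above, and for simplicity it drops all whitespace rather than keeping the essential spaces.

```python
def ini_form(sections):
    """Render {section: [entries]} as an INI-style listing."""
    lines = []
    for name, entries in sections.items():
        lines.append(f"[{name}]")
        lines.extend(entries)
    return "\n".join(lines)


def xml_form(sections):
    """Render the same data as XML, one hypothetical <item> tag per entry."""
    lines = []
    for name, entries in sections.items():
        lines.append(f"<{name}>")
        lines.extend(f"<item>{entry}</item>" for entry in entries)
        lines.append(f"</{name}>")
    return "\n".join(lines)


def significant_chars(text):
    """Count characters, ignoring every whitespace character."""
    return sum(1 for ch in text if not ch.isspace())


# Hypothetical data, not the syntaxes compared in the post
sections = {"keywords": ["if", "else", "while", "for", "return"]}

ini_size = significant_chars(ini_form(sections))
xml_size = significant_chars(xml_form(sections))
print(ini_size, xml_size)
```

With this particular layout every entry costs an extra 13 characters of tags (`<item>` plus `</item>`), so the gap grows linearly with the number of entries.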
You could probably shorten it down a little by being careful, and by extending some of them to be character groups (à la regular expressions), so you could in theory generate:
instead of the above syntax:
But that's just a matter of semantics and coding.
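As a rough illustration of the character-group idea, here is a small Python sketch that expands a bracketed group into the individual characters it stands for. The bracket notation and the `expand_group` name are hypothetical, borrowed loosely from regular-expression character classes, and not part of any existing EE format.

```python
def expand_group(spec):
    """Expand a bracketed group such as "[a-c/#]" into its characters."""
    if not (spec.startswith("[") and spec.endswith("]")):
        return [spec]  # not a group: treat as a single literal entry
    body = spec[1:-1]
    chars = []
    i = 0
    while i < len(body):
        # "x-y" with a character on both sides of the dash is a range
        if i + 2 < len(body) and body[i + 1] == "-":
            start, end = ord(body[i]), ord(body[i + 2])
            chars.extend(chr(code) for code in range(start, end + 1))
            i += 3
        else:
            chars.append(body[i])
            i += 1
    return chars


print(expand_group("[a-c/#]"))
```

So one short entry in the file can stand in for several single-character entries, at the cost of a slightly smarter loader.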
To be honest, I think XML is one of those things that is great, but just not for the requirements we have.