XML is a useful technology, sometimes: that's about as positive as I can be about it these days. While there was a
period when I got quite excited about the idea of a standard syntax for, well, everything, time tempers such enthusiasm.
See, 90% of the time, XML is just too damn cumbersome. When what you want to say is
Param1 = 3
or something of that complexity, then having to write it out as an element, complete with opening tag, attributes and
closing tag, is just a bit hefty.
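To put a rough shape on the complaint, here is a minimal Python sketch that reads the same fact from both spellings. The `<parameter>` element and its `name` and `type` attributes are invented purely for illustration; they are an assumption, not any real schema.

```python
import xml.etree.ElementTree as ET

plain = "Param1 = 3"
# Hypothetical XML spelling of the same fact; the element and attribute
# names are made up for this illustration.
marked_up = '<parameter name="Param1" type="int">3</parameter>'


def read_plain(line):
    """Split a 'key = value' line into a (name, int) pair."""
    name, value = line.split("=")
    return name.strip(), int(value)


def read_xml(text):
    """Pull the same (name, int) pair out of the XML version."""
    elem = ET.fromstring(text)
    return elem.get("name"), int(elem.text)


# Both routes recover the same information...
assert read_plain(plain) == read_xml(marked_up) == ("Param1", 3)
# ...but one of them costs several times as many characters to write.
print(len(plain), len(marked_up))
```

Nothing in the XML version says anything the plain line didn't; it just takes longer to type and longer to read.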
Anyone who's ever tried using Ant or Maven will surely sympathise with the idea that XML is a pain-in-the-arse
way to write makefiles. And XSL transforms? Phew, glad I'm not going back into that arse-end of software
engineering gone mad. It's also a pain in the arse to read XMLified data in C++ or Fortran. Hell, even Python and Java
make you jump through SAX or DOM-shaped hoops in their lowest common denominator implementations! Frankly, XML is a
serious candidate for "no silver bullet"-type debunking: merely wrapping everything in angle brackets actually *solves
nothing*, even if it looks totally SOAP-AJAX-Web 2.0, dude. Much of the time, something simpler is quite enough.
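For a flavour of the SAX-shaped hoops, here is a sketch using Python's standard `xml.sax` module to read a couple of integer parameters, next to the plain-text equivalent. The `<config>`/`<parameter>` layout is a hypothetical format of my own, assumed only for the sake of the example.

```python
import xml.sax


class ParamHandler(xml.sax.ContentHandler):
    """Collect <parameter name="...">value</parameter> pairs (hypothetical format)."""

    def __init__(self):
        super().__init__()
        self.params = {}
        self._name = None    # name of the parameter currently being read
        self._chunks = []    # text pieces seen inside that element

    def startElement(self, name, attrs):
        if name == "parameter":
            self._name = attrs.get("name")
            self._chunks = []

    def characters(self, content):
        # SAX may deliver element text in several pieces, so buffer it.
        if self._name is not None:
            self._chunks.append(content)

    def endElement(self, name):
        if name == "parameter" and self._name is not None:
            self.params[self._name] = int("".join(self._chunks))
            self._name = None


def read_params_sax(text):
    """Parse the XML spelling into a {name: int} dict."""
    handler = ParamHandler()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.params


def read_params_plain(text):
    """Parse 'key = value' lines into the same {name: int} dict."""
    pairs = (line.split("=") for line in text.splitlines() if line.strip())
    return {key.strip(): int(value) for key, value in pairs}
```

Given `<config><parameter name="Param1">3</parameter></config>` and `Param1 = 3` respectively, both functions hand back the same dictionary. One of them needed a state machine to do it.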
Unfortunately, the message hasn't quite drilled through to everybody's cortexes (cortices?) yet, and I saw several HEP
presentations recently where some data was marked up in XML as if that was an achievement in itself. What do you want...
a biscuit? Sadder still, I was told by a starry-eyed student that this was a great way to provide a unified data format.
To take this out of the HEP context, let's say I invented a new XML-oriented data format, EML. It's short for "Everything
Markup Language", dontcha know?
EML is super-whizzy-clever: it's in XML for a start, which means immediately that it's 21st century (or beyond) and
future-proof, unlike all those column-delimited plain text data files that insufficiently technical people keep bandying
about. You can parse it with all sorts of tools (but not without pain if you use an old-style language which actually
compiles to *machine code*, duh), and there is future potential for some sort of Web services asynchronous HttpRequest
coolness. Frankly, it's genius. And so simple! See, all you have to do to make a file in EML is to take its normal
binary representation, convert any accidental non-ASCII bits to XML entities, and slap `...` around it. Not
forgetting an XML namespace declaration, of course: those are *really* useful, and *really* provide a good mechanism for
schema evolution. Tada! Instant interoperability.
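For the avoidance of doubt, a converter along those lines fits in a dozen lines of Python. This is only a sketch: the `<eml>` element name and the `urn:example:eml` namespace are hypothetical stand-ins of mine, since EML (mercifully) doesn't exist.

```python
from xml.sax.saxutils import escape


def to_eml(payload: bytes) -> str:
    """Wrap arbitrary bytes in a hypothetical <eml> element."""
    # Decode as Latin-1 so that every byte maps to exactly one code point.
    text = payload.decode("latin-1")
    body = "".join(
        # Printable ASCII is kept (with &, < and > escaped); everything
        # else becomes a numeric character reference.
        escape(ch) if ch.isascii() and ch.isprintable() else f"&#{ord(ch)};"
        for ch in text
    )
    # Caveat: XML 1.0 forbids references to most control characters
    # (e.g. &#0;), so the output is not even guaranteed to be well-formed.
    return f'<eml xmlns="urn:example:eml">{body}</eml>'
```

Calling `to_eml(b"\x89PNG<")` yields `<eml xmlns="urn:example:eml">&#137;PNG&lt;</eml>`. Whoever receives that still has to know it was a PNG and how to decode one, which is exactly the knowledge they needed before the angle brackets arrived.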
If this doesn't seem immediately stupid, please give yourself a good slap and get a job somewhere where you don't have
the option to impose your half-wittedness on anyone impressionable enough to believe in this garbage --- the flow must
be stemmed!
In short, if you have very hierarchically structured data, which is only likely to be processed in languages which
provide easy XML parsing, and no-one is likely to have to write (or read, 90% of the time) the format by hand, then XML
may be for you. Otherwise, you would do well to restrain that knee-jerk XML reflex and *really* think about how you
would have best described your data in a world where XML never existed. Believe me, if this is news to you, you'll thank
me for it.