2006/08/28

XML document formats (and TeX)

This post began as just a teeny tiny mention of how, with Apple supporting OpenDocument in its next release, TeX supporting Unicode and OpenType fonts, and Microsoft Office using Open XML, things are going to get interesting in the document applications arena over the next few years.

But the comments thread in that last link got me going. There's apparently been beef between the OpenDocument people and Microsoft: the former believe that Office should have just used their solution, while Microsoft didn't want to squeeze its feature set into what ODF provided. Similarly, MathML wouldn't have worked as a document format without being extended to support other Office features (track changes, e.g.), and that's one of the main general criticisms of how Microsoft "supports" standards (embrace, extend, …). David Carlisle (of LaTeX3 project fame, from my point of view) put things more eloquently in the thread under discussion.

The odd ones out in all of this are the TeX-based solutions. While xmlTeX can parse, uh, XML, and ConTeXt can handle MathML, it's hard for me to see how TeX-based markup documents can co-exist with MathML apps (which will be able to copy/paste editable equations like plain text). You're not going to want to insert some hideous MathML into your otherwise nicely marked-up TeX doc:

<math>
  <mo> &sum; </mo>
  <msubsup>
     <mi> x </mi>
     <mi> i </mi>
     <mn> 3 </mn>
  </msubsup>
</math>

But it seems that isn't so much of a problem after all: a quick search turned up a highly relevant paper on [MathML to TeX conversion]. The problem, in the end, is editor support. With tagged PDF allowing MathML to be copied out of PDF documents (correct me if I'm wrong, but I believe this is possible), and editors converting pasted MathML to TeX in the source, TeX would work just as well as, say, the Office 2007 and Mathematica pairing (to give but one example of MathML applications) as a first-class copy/paste citizen, which is not the case today.
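
To give a feel for what the editor side of that might involve, here is a minimal Python sketch of a MathML to TeX conversion. It is not the method from the paper above, just an illustration under some loud assumptions: the names `to_tex`, `mathml_to_tex`, `ENTITIES` and `SYMBOLS` are made up for this example, only the `mi`, `mn`, `mo` and `msubsup` elements are handled, and the `&sum;` entity is swapped for the literal character by hand because a plain XML parser doesn't know MathML's entity names. The input uses the standard `msubsup` element, whose children are base, subscript, superscript, in that order.

```python
import xml.etree.ElementTree as ET

# Hypothetical lookup tables that cover only this one example.
ENTITIES = {"&sum;": "\u2211"}   # MathML entity name -> literal character
SYMBOLS = {"\u2211": r"\sum"}    # literal character -> TeX macro


def to_tex(node):
    """Convert a tiny subset of presentation MathML to a TeX string."""
    tag = node.tag.split("}")[-1]  # drop an XML namespace, if present
    if tag in ("mi", "mn", "mo"):
        text = (node.text or "").strip()
        return SYMBOLS.get(text, text)
    parts = [to_tex(child) for child in node]
    if tag == "msubsup":
        # msubsup children are: base, subscript, superscript
        base, sub, sup = parts
        return f"{base}_{{{sub}}}^{{{sup}}}"
    # math, mrow and anything unrecognised: just join the children
    return " ".join(parts)


def mathml_to_tex(source):
    """Replace the known entities, parse the XML, and convert it."""
    for name, char in ENTITIES.items():
        source = source.replace(name, char)
    return to_tex(ET.fromstring(source.strip()))


EXAMPLE = """
<math>
  <mo> &sum; </mo>
  <msubsup>
    <mi> x </mi>
    <mi> i </mi>
    <mn> 3 </mn>
  </msubsup>
</math>
"""

print(mathml_to_tex(EXAMPLE))  # prints: \sum x_{i}^{3}
```

A real converter would need the full MathML entity table and many more elements (fractions, roots, fences, tables), but the shape of the job stays the same: walk the tree and emit TeX for each node.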