So many XML posts, so little time
Over the past few days there have been so many posts regarding
Microsoft's announcement of XML formatted files for Office 12. One of
the key reasons that I am working
so aggressively towards Eclipse immersion (although not necessarily my
next big chunk of work - priorities
...) is the view that XML will continue to become a pervasive standard
for complex, "unstructured" data. Many thoughts ...
- All of this "noise" had me dredge up a few old bookmarks I'd
been meaning to act upon, concerning "writing with XML" - that is,
composing all of your content directly in XML. Martin
Fowler has what I think is the best overall treatment, a few years
ahead of his time, it seems. Jon Udell has done some interesting
articles over the years, but how about this one,
where he gets you thinking about SQL/XML queries against libraries of
documents (interesting applications to follow ...).
- Note that Office has been able to save as XML for a while; now
they're making it the default format. Also, note that these new formats
are not exactly the same as the existing ones - they have been improved
for security and incorporate ZIP compression to keep file sizes down.
- I love hacking at Office, especially in Excel - my source
code page has a
smidgen of the code I've written for that family of products. I
welcome the day when I'll be able to write text-processors that make
fast, command-line batch file driven work of my documents - things
like standard page formatting, watermarking and embedded information,
cleaning out edits, etc.
- Evan Erwin has a nice
long post full of ideas, including the concept of bulk conversion and
cleanup utilities ... yes, we think alike ... I want to start hacking
away ... You know, the basic command-line filter
type of programming has always been my favorite; purist programming of
a sort. My first big idea that I never wrote was a universal
translator, a precursor to the big ETL tools of today.
- Ah, but don't stop at the command line - extend the metaphor
to web services, and let your imagination wander ...
- Evan also noticed that the announcement did not seem to
include Outlook (or Access, or OneNote, or Project, or Visio ...).
Pity, there would definitely be some interesting stuff in those files.
- Note that Microsoft has released the XML file
formats, so all of us hackers can get an early start. Here's a new blog
that will focus on them - also, links to white papers from MS for more
information (thanks to Brian Jones, PM in the Win Office team!)
- Note this isn't cross-application portability - this is Microsoft's
XML for Office documents, but there are two other important XML
specifications, from OpenOffice.org
and OASIS,
for office (lower case) documents. Well, at least it will be
easier to adapt all of my nifty utilities for multiple document specs.
- Why wait for Office 12? You can start learning and
understanding XML right now - there are already so many applications
for the general technology - even apart from SOA.
- RSS feeds, the lifeblood of the blogging / aggregator
crowd, scream for automation, for searching, manipulating content,
etc.
- XBRL hit my radar with a couple
of articles;
compliance for financial reporting is high on many priority lists, but
some of the preferred solutions can get quite expensive. Driving to a
common format for releasing financial reports may bring a lot of new,
flexible, and cheaper tools to market.
- Is this finally a compelling reason for upgrading your Office
version? So often in the past, the various upgrades have seemed to add
little meaningful value and lots of extra hassle. If all of my group's
documents can be accessible via XML, would it be enough to convince me
to take on the hassle of upgrading?
- What about backward compatibility? Per Joe Wilcox at Jupiter,
MS will offer compatibility patches back
to Office 2000.
- Hey, this hits the Mac
version of Office as well. There is a great
post by Rick Schaut that talks about the software engineering process -
how the Mac and Win Office teams are sharing specs and even code across
the platforms.
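
To make the command-line filter idea above a bit more concrete, here is a
minimal sketch in Python of a utility that opens a ZIP-packaged document,
finds the XML parts inside it, and prints their text. It assumes only what
the announcement says (XML files inside a ZIP container); the part name
content.xml used in the usage note and the names xml_parts and extract_text
are made up for illustration and are not taken from the published Office
formats.

```python
import io
import sys
import zipfile
import xml.etree.ElementTree as ET


def xml_parts(archive_bytes):
    """Yield (name, root element) for each well-formed XML part in a ZIP archive."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        for name in sorted(zf.namelist()):
            # Assumption: XML parts can be recognized by their .xml extension.
            if not name.lower().endswith(".xml"):
                continue
            try:
                yield name, ET.fromstring(zf.read(name))
            except ET.ParseError:
                continue  # skip parts that are not well-formed XML


def extract_text(archive_bytes):
    """Return one 'part name: text' line per XML part found in the archive."""
    lines = []
    for name, root in xml_parts(archive_bytes):
        # Collect all non-blank text nodes, ignoring the markup around them.
        text = " ".join(t.strip() for t in root.itertext() if t.strip())
        lines.append(f"{name}: {text}")
    return lines


def main(argv):
    # Filter-style usage: python doctext.py file1 file2 ...
    for path in argv[1:]:
        with open(path, "rb") as fh:
            for line in extract_text(fh.read()):
                print(f"{path}: {line}")


if __name__ == "__main__":
    main(sys.argv)
```

Run against a folder of documents from a batch file, something like this
would be the starting point for the bulk conversion and cleanup utilities
mentioned above. A real tool would need to know the actual part names and
element vocabulary from the published format specifications rather than
treating every XML part the same way.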
Technorati Tags: office, xml, microsoft