Performance improvement for xml loads (+comments)

Rob Browning rlb@cs.utexas.edu
07 Dec 2000 13:01:20 -0600


Derek Atkins <warlord@MIT.EDU> writes:

> > If you are worried about load times and memory usage, we should consider
> > using a SAX interface to read in the XML.  See this link for tradeoffs:
> > http://www.daa.com.au/~james/gnome/xml-sax/xml-sax.html
> 
> Unfortunately the problem isn't just at read-time.  It seems that the
> problem is also during file-writes.  And according to this web page,
> the SAX interface won't affect file writes, only file reads.

And since we already use SAX, I doubt adding it now would have much
effect :>

> What synergy?  I was never enthused about XML (mostly because I
> don't like ascii file formats for large data objects or network
> protocols).

I think the synergy here is that people think that if you use XML,
it's more likely that tools will be available to let you manipulate
your data outside the app.  This is in fact true.  Writing a
parser/transformer to do some arbitrary thing to an XML file (massage
it, extract things, etc.), or even to any text file, is far easier
than it would be for some home-brewed format.  Heck, you can use
emacs/perl/whatever...and I have.
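
As a rough illustration of that kind of outside-the-app tool, here is a
minimal sketch in Python using the standard library's streaming
`iterparse`.  The element names (`ledger`, `transaction`, `amount`) are
made up for the example and are not the actual file format; the point is
only that a throwaway extractor for an XML file is a handful of lines.

```python
import io
import xml.etree.ElementTree as ET

def sum_amounts(source, record_tag="transaction", field_tag="amount"):
    """Stream through an XML document and total one numeric field.

    The tag names are hypothetical placeholders, not a real schema.
    """
    total = 0.0
    count = 0
    # "end" events fire once an element and all of its children are parsed.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == record_tag:
            text = elem.findtext(field_tag)
            if text is not None:
                total += float(text)
                count += 1
            # Drop the finished record's children so memory use stays small.
            elem.clear()
    return count, total

# Tiny made-up document standing in for a real data file.
SAMPLE = b"""<ledger>
  <transaction><amount>12.50</amount></transaction>
  <transaction><amount>-2.25</amount></transaction>
</ledger>"""

count, total = sum_amounts(io.BytesIO(SAMPLE))
print(count, total)
```

Because records are handled as they complete, the same loop works on a
file path or any file-like object without holding every record in
memory at once.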

Also, if you do decide to try and whip something up, make sure you're
aware that we use kvp_frames now, in various places, so you will have
to be able to accommodate items with arbitrarily deep, recursive
key/value trees.
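
To make the recursion requirement concrete, here is a small sketch in
Python that models a frame as a nested dict and round-trips it through
XML.  This is not the kvp_frame API or its on-disk layout; the `frame`,
`slot`, `key`, and `value` element names are assumptions chosen for the
example.  Whatever format you pick needs an equivalent of the recursive
branch below.

```python
import xml.etree.ElementTree as ET

def frame_to_xml(frame, tag="frame"):
    """Serialize a nested dict; a dict value becomes a nested <frame>."""
    node = ET.Element(tag)
    for key, value in frame.items():
        slot = ET.SubElement(node, "slot")
        ET.SubElement(slot, "key").text = str(key)
        if isinstance(value, dict):
            # Recursive case: the value is itself a key/value tree.
            slot.append(frame_to_xml(value))
        else:
            ET.SubElement(slot, "value").text = str(value)
    return node

def xml_to_frame(node):
    """Inverse of frame_to_xml; leaf values come back as strings."""
    frame = {}
    for slot in node.findall("slot"):
        key = slot.findtext("key")
        nested = slot.find("frame")
        if nested is not None:
            frame[key] = xml_to_frame(nested)
        else:
            frame[key] = slot.findtext("value")
    return frame

# Hypothetical data, three levels deep.
example = {"notes": "rent", "split": {"memo": "jan", "tags": {"kind": "housing"}}}
encoded = ET.tostring(frame_to_xml(example))
decoded = xml_to_frame(ET.fromstring(encoded))
print(decoded == example)
```

The sketch keeps every leaf as a string; a real frame would also need
to carry a type for each value.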

> However, I was willing to let others take a gander at it (mostly
> because I _DO_ think that XML input/output is necessary, especially
> once we want OFX support).  The fact that storing 10000 transactions
> requires 50M of ram in order to build the XML tree is, IMHO,
> unconscionable.

IMO this is a bug in the library, not an inherent problem in XML
itself, and as I've described several times before, this may be
something that can be dramatically improved with very little effort...

> I think I'll actually try to write an XDR-based data storage system
> and we'll see.  I just don't believe anymore that XML is a reasonable
> way to store large data sets.  XML is a cool technology, but just
> because a technology is cool doesn't mean that it's the right tool for
> the job.

Of course you're welcome to, but why would you spend time on this
rather than going forward with integrating an embedded MySQL or
PostgreSQL?

-- 
Rob Browning <rlb@cs.utexas.edu> PGP=E80E0D04F521A094 532B97F5D64E3930