Performance improvement for xml loads (+comments)

Dave Peticolas dave@krondo.com
Tue, 05 Dec 2000 15:10:32 -0800


Al Snell writes:
> On 5 Dec 2000, Rob Browning wrote:
> 
> > > The only problem I have now is that it is still MUCH slower than the
> > > binary.  The file size is about 6x the size before (9 Meg vs. 1.5
> > > Meg) and to actually do the write it seems to use about 50 Meg of
> > > ram, because xml builds the whole tree in memory before writing
> > > anything.
> > 
> > One thing that's on my to-do list is to integrate zlib.  For writing,
> > all you have to do is set a flag and libxml will compress the output,
> > but for reading, since we parse incrementally (to save RAM), we need
> > to use zlib directly, and I've been too busy with g-wrap to fix it
> > yet.  This won't help RAM usage, but it will help reduce storage space
> > tremendously (I believe it was actually smaller than the binary format
> > last time I tested), and it didn't noticeably affect performance (maybe
> > 5%, as long as you don't use too high a zlib level).
> 
> Why ARE you guys using XML? Isn't it really pointless? Surely you don't
> *want* tools other than the gnucash engine itself touching the persistent
> data structures, since there are consistency constraints to be maintained.
> 
> What are we getting in exchange for massively increased resource
> utilisation?

The binary format was really a dead-end. It was very brittle, with
subtle endian and architecture issues, and continuing to extend it
was going to be an extreme headache.

That said, XML as the primary file store will probably be a transitional
solution on the way to embedded SQL. Once we have that, XML will remain
as an export option, but a very well-tested one :)
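The scheme Rob describes above (compress on write, decompress and parse
incrementally on read so the whole tree never sits in RAM) can be sketched
with a small example. This is illustrative Python, not gnucash or libxml
code, and the file layout and `<txn>` element are invented for the sketch:

```python
import gzip
import os
import tempfile
import xml.etree.ElementTree as ET

path = os.path.join(tempfile.mkdtemp(), "ledger.xml.gz")

# Write: stream records through gzip one at a time, so the output is
# compressed on disk without ever holding the full document in memory.
with gzip.open(path, "wt", encoding="utf-8", compresslevel=6) as f:
    f.write("<ledger>\n")
    for i in range(3):
        f.write('  <txn id="%d"/>\n' % i)
    f.write("</ledger>\n")

# Read: iterparse yields each element as it completes; clearing each
# processed element keeps memory usage flat regardless of file size.
count = 0
with gzip.open(path, "rb") as f:
    for event, elem in ET.iterparse(f, events=("end",)):
        if elem.tag == "txn":
            count += 1
            elem.clear()

print(count)  # number of transactions streamed back out
```

The same idea applies with libxml's push parser fed from zlib: the
compression happens at the byte-stream layer, so the incremental XML
parser above it never needs to know the file was compressed.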

dave