Performance improvement for xml loads (+comments)

Dave Peticolas dave@krondo.com
Tue, 05 Dec 2000 00:36:00 -0800


Bill Carlson writes:
> Hi,
> 
>  	I've been trying to make the xml stuff go a bit
> faster.  The following patch will cut my large file load
> (~10000 transactions) from about 2 minutes to about 30 seconds.
> This is obviously good (and along the lines of a change
> I made a while ago for the binary file load).  
> 
> 	The only problem I have now is that it is still
> MUCH slower than the binary.  The file size is about 6x
> the size before (9 Meg vs. 1.5 Meg) and to actually do
> the write it seems to use about 50 Meg of ram, because
> xml builds the whole tree in memory before writing anything.
> This all concerns me a great deal as xml seems to be
> the future for gnucash and this experience tells me
> that we are in trouble as far as even moderately large
> databases of transactions.  Any comments?  I'd be more than
> happy to help make things better.  One thought I had was
> to try a "io-gncsql-r.c" and "io-gncsql-w.c" which would
> talk to an sql database (this would be different than using
> the database as the engine, it would just read and write
> like a file).  Will the architecture of gnucash allow
> for user preferences in the style of saving?
> 
> 	In any case, what follows is my brief patch which
> I'd appreciate if it were put into the CVS.  Thanks!

Thanks for the patch, it is in CVS.
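
On the 50 Meg of ram used during the write: one way around building
the whole tree first is to emit each transaction as it is visited.
Below is a minimal sketch of that idea in Python, not gnucash code.
The element names, the tuple layout and the sample_transactions()
helper are all made up for illustration. It uses the standard
library's xml.sax.saxutils.XMLGenerator.

```python
import io
from xml.sax.saxutils import XMLGenerator


def write_transactions(out, transactions):
    """Stream transactions to `out` one at a time; no tree is kept in memory.

    `transactions` is any iterable of (date, description, amount) tuples.
    The element names are hypothetical, not the gnucash file format.
    """
    gen = XMLGenerator(out, encoding="utf-8")
    gen.startDocument()
    gen.startElement("ledger", {})
    for date, description, amount in transactions:
        gen.startElement("transaction", {"date": date})
        gen.startElement("description", {})
        gen.characters(description)  # special characters are escaped here
        gen.endElement("description")
        gen.startElement("amount", {})
        gen.characters(amount)
        gen.endElement("amount")
        gen.endElement("transaction")
    gen.endElement("ledger")
    gen.endDocument()


def sample_transactions(count):
    """Yield made-up transactions lazily, standing in for the engine."""
    for i in range(count):
        yield ("2000-12-05", "Payment #%d & fee" % i, "%d.00" % (i + 1))


buf = io.StringIO()
write_transactions(buf, sample_transactions(3))
xml_text = buf.getvalue()
```

Because the writer only ever holds one transaction, memory use should
stay roughly flat as the file grows. It does nothing about the size of
the file on disk.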

Having an io-gncsql* file interface is an interesting idea.
What kind of speedups do you think it could achieve?

Personally, I think it would be a good idea. With a default
XML interface, new users would not need to worry about being
a DBA, but there would still be an option for storing in sql.
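
To make the "read and write like a file" idea concrete, here is a
rough sketch in Python. It is an assumption-laden illustration, not a
design for io-gncsql-r.c / io-gncsql-w.c: sqlite3 stands in for
whatever SQL database would really be used, and the table layout,
function names and BACKENDS table are hypothetical.

```python
import os
import sqlite3
import tempfile


def save_sql(path, transactions):
    """Replace everything stored at `path` with `transactions`, in one commit."""
    conn = sqlite3.connect(path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS txn ("
                "date TEXT, description TEXT, amount TEXT)"
            )
            conn.execute("DELETE FROM txn")
            conn.executemany("INSERT INTO txn VALUES (?, ?, ?)", transactions)
    finally:
        conn.close()


def load_sql(path):
    """Read every stored transaction back, in insertion order."""
    conn = sqlite3.connect(path)
    try:
        rows = conn.execute(
            "SELECT date, description, amount FROM txn ORDER BY rowid"
        )
        return [tuple(row) for row in rows]
    finally:
        conn.close()


# A user preference would pick the storage style; XML would stay the default.
BACKENDS = {"sql": (load_sql, save_sql)}

path = os.path.join(tempfile.mkdtemp(), "books.db")
books = [
    ("2000-12-04", "Groceries", "42.10"),
    ("2000-12-05", "Rent", "600.00"),
]
load, save = BACKENDS["sql"]
save(path, books)
loaded = load(path)
```

The engine never sees the database here. It hands over the full list
on save and gets the full list back on load, the same as with the
binary or XML files.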

Do you think it would be possible/good to use something like libgda
(the gnome generic db interface library)?

What does everyone else think?

thanks,
dave

ps If you decide to work on this, you should keep in mind
   that there will be more data that needs to be stored in
   addition to what is currently in devel CVS data files.
   This will include budget data as well as price information.