Gnucash very slow after importing 13 yrs data from Quickbooks

Donald Allen donaldcallen at gmail.com
Mon Mar 2 22:27:48 EST 2009


On Mon, Mar 2, 2009 at 6:36 PM, <matthew-gnucash at newtoncomputing.co.uk> wrote:

> Hi,
>
> On Sat, Feb 28, 2009 at 01:55:34PM +0000,
> matthew-gnucash at newtoncomputing.co.uk wrote:
> > On Fri, Feb 27, 2009 at 09:57:21PM -0500, Donald Allen wrote:
> > > The first thing I'd be suspicious of is your home-brew Gnucash file.
> > > I'd suggest double-checking somehow that its form and content are
> > > correct. I'd also suggest running top and vmstat while Gnucash is
>
> I think I've got it, after some file comparison and a bit of
> thinking.
>
> The only differences I could see between 'my' file and a
> gnucash-generated one was that the UUIDs were different. Gnucash
> UUIDs look fairly spread out over the number space, whereas I'd
> created mine as sequences 1, 2, etc just for testing (padded with
> 0s).
>
> The slowness in loading was pretty much all CPU, not disk.
>
> Having realised that loading gets slower the further through the
> file, I suddenly wondered if Gnucash uses the UUID in memory as
> keys in a hash. Whether this is true or not, changing my UUIDs to
> "echo $ordinal | md5sum" and everything runs fast again!


Yep. Not hard to turn a hash lookup into a linear search.
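
To make that concrete, here is a rough Python sketch of the effect. It is
not Gnucash's code: toy_bucket() is a made-up hash that looks only at the
first 8 hex digits of the key, and I am assuming the test IDs were padded
with leading zeros. md5_id() mirrors "echo $ordinal | md5sum", including
the newline that echo appends.

```python
import hashlib
from collections import Counter


def padded_id(ordinal: int) -> str:
    """Sequential test ID, left-padded with zeros to 32 characters (assumed layout)."""
    return f"{ordinal:032d}"


def md5_id(ordinal: int) -> str:
    """Equivalent of `echo $ordinal | md5sum`; echo appends a newline."""
    return hashlib.md5(f"{ordinal}\n".encode()).hexdigest()


def toy_bucket(key: str, n_buckets: int = 1024) -> int:
    """Hypothetical hash (NOT Gnucash's): uses only the first 8 hex digits."""
    return int(key[:8], 16) % n_buckets


def longest_chain(ids) -> int:
    """Size of the fullest bucket, i.e. the worst-case scan for one lookup."""
    counts = Counter(toy_bucket(i) for i in ids)
    return max(counts.values())


ordinals = range(1, 5001)
worst_padded = longest_chain(padded_id(n) for n in ordinals)
worst_md5 = longest_chain(md5_id(n) for n in ordinals)
print("longest chain, padded IDs:", worst_padded)
print("longest chain, md5 IDs:   ", worst_md5)
```

Under those assumptions every padded ID starts with the same 8 zeros, so
all 5000 keys pile into one bucket and each lookup walks a list, while the
md5-derived IDs spread across the table. Whether Gnucash's real hash
behaves this way is something only measurement or reading its source would
confirm.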

A little war story: the first version of tcp/ip (originally called the
Kahn-Cerf protocol) for a real Arpanet host (a DEC PDP-10 running Tenex) was
written by a guy on loan to the OS development group I ran at BBN (the
absolute first tcp/ip implementation was done by colleague Ray Tomlinson at
BBN on a PDP-11; Ray was the person who also sent the first cross-network
email -- I still remember him telling me about his weekend hack one Monday
morning about 35 years ago -- and who invented the '@' notation). It was
impossibly slow. He said "I know what's wrong" and went off and fiddled with
the code. No dice. Repeated. Still no faster. I finally insisted that he
make measurements to find out where the time was going, instead of assuming
he knew how his own code behaved. Sure enough, it was something he
wouldn't have guessed, nor would I: the connection hash table was too small,
so most of the lookups turned into linear searches. Made the hash table
bigger and the problem went away.
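
The same effect is easy to reproduce and, more to the point, to measure.
The following is only an illustrative Python sketch, not the Tenex code:
ChainedTable is a made-up separate-chaining table that counts key
comparisons, and small integers stand in for connection identifiers.

```python
class ChainedTable:
    """Minimal separate-chaining hash table that counts key comparisons."""

    def __init__(self, n_buckets: int) -> None:
        self.buckets = [[] for _ in range(n_buckets)]
        self.comparisons = 0  # incremented once per key compared in get()

    def _chain(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value) -> None:
        chain = self._chain(key)
        for i, (k, _) in enumerate(chain):
            if k == key:
                chain[i] = (key, value)  # overwrite an existing entry
                return
        chain.append((key, value))

    def get(self, key):
        for k, v in self._chain(key):
            self.comparisons += 1
            if k == key:
                return v
        raise KeyError(key)


def comparisons_per_lookup(n_buckets: int, n_keys: int = 4096) -> float:
    """Insert n_keys entries, look each one up once, report the average cost."""
    table = ChainedTable(n_buckets)
    for conn in range(n_keys):  # integer keys stand in for connection IDs
        table.put(conn, conn)
    for conn in range(n_keys):
        table.get(conn)
    return table.comparisons / n_keys


small = comparisons_per_lookup(8)     # far too few buckets for 4096 keys
large = comparisons_per_lookup(8192)  # more buckets than keys
print("comparisons per lookup,    8 buckets:", small)
print("comparisons per lookup, 8192 buckets:", large)
```

With 8 buckets each chain holds hundreds of entries and a lookup is mostly
a list walk; with more buckets than keys it drops to about one comparison.
The counter is the point: it shows where the time goes instead of leaving
it to guesswork.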

/Don


>
> Now all I need is a Perl version of the Gnucash UUID generator...
> (if anyone has one, please shout!)
>
> Thanks for suggestions that got my brain thinking in this way!
> I just need to finish off the converter for bits I missed before,
> now (such as transferring customers/vendors/employees etc -
> non-accounting data).
>
> Cheers!
>
> --
> Matthew

