[GNC-dev] Performance regression loading account

Wed Mar 15 07:02:27 EDT 2023

I wanted to get some numbers on this as my test file seemed OK.
I used Calc to create a CSV transaction import file with 8402 rows and some
description columns with 16, 64 and 128 character random strings.
Used this to import several times to a new empty gnucash xml file and added
some timing for the account open command in 4.903 and 4.13 with results
below...

With 4.903
Description 16 Characters and 16804 unique transactions / descriptions,
0.93, second time 0.73
Description 64 Characters and 16804 unique transactions / descriptions,
2.39, second time 1.67
Description 128 Characters and 16804 unique transactions / descriptions,
4.08, second time 2.90

With 4.13
Description 16 Characters and 16804 unique transactions / descriptions,
0.49, second time 0.35
Description 64 Characters and 16804 unique transactions / descriptions,
1.22, second time 0.61
Description 128 Characters and 16804 unique transactions / descriptions,
1.91, second time 0.93
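For anyone wanting to reproduce a similar import file without Calc, here is a minimal sketch. The column names (Date, Amount, Desc16, Desc64, Desc128) and the placeholder date and amount are assumptions for illustration only, not the layout of the file used for the numbers above.

```python
import csv
import io
import random
import string


def random_text(rng, length):
    """Return a random alphanumeric string of the given length."""
    alphabet = string.ascii_letters + string.digits
    return "".join(rng.choice(alphabet) for _ in range(length))


def build_rows(n_rows=8402, lengths=(16, 64, 128), seed=0):
    """Build rows holding one random description column per requested length."""
    rng = random.Random(seed)  # fixed seed so the file is reproducible
    rows = []
    for _ in range(n_rows):
        # Hypothetical placeholder values; adjust to what the importer expects.
        row = {"Date": "2023-03-15", "Amount": "1.00"}
        for n in lengths:
            row[f"Desc{n}"] = random_text(rng, n)
        rows.append(row)
    return rows


def write_csv(rows, stream):
    """Write the rows, preceded by a header line, to a text stream."""
    writer = csv.DictWriter(stream, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)


rows = build_rows()
buffer = io.StringIO()  # swap for open("import.csv", "w", newline="") to get a file
write_csv(rows, buffer)
```

Each description column can then be mapped to the Description field in turn when importing, to compare the effect of string length.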

Regards,
Bob

On Tue, 14 Mar 2023 at 18:54, Maarten Bosmans <mkbosmans at gmail.com> wrote:

> On Mon 13 Mar 2023 at 04:44, john <jralls at ceridwen.us> wrote:
> > My first guess is that it's from creating a cache of quickfill entries
> > to populate a drop-down list of possible entries similar to the way the
> > transfer account field has worked for a couple of years.
>
> Yes, I've isolated it to the commit "Change the Register description
> layout cell type", Bob in CC.
> That branch adds the combobox and quickfill to the description field
> of the register. In my case those are fairly long (~100 chars) and all
> unique strings, as they come from downloaded bank statements and
> include a timestamp, account holder, actual description, etc. So for
> my use case having a combo box to ease filling out new items is not
> that useful anyway. Maybe we can think of a way to adapt the
> behaviour to be useful in Bob's case (I suppose manual entry of a
> short and often reused description text), but not slow down my case?
>
> > An obvious optimization is to get a collation key with
> > g_utf8_collate_key for each string and use that for doing the actual
> > sorting/ordered inserting. It's still a char-by-char comparison but it
> > saves having to validate and normalize the strings on every compare.
>
> I will have a look into storing the collated string in the QuickFill.
> That probably doubles the memory usage, but should not be too bad.
>
> Maarten
>
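The collation-key idea quoted above can be illustrated with a small sketch. This is not the GnuCash QuickFill code: it uses Python's locale.strxfrm as a stand-in for g_utf8_collate_key and locale.strcoll as a stand-in for a locale-aware compare, and the function names are made up for the example. The point is only that the key is computed once per string and the sort then compares the cached keys.

```python
import functools
import locale


def sort_comparing_each_time(strings):
    """Baseline: every comparison goes through the locale-aware strcoll."""
    return sorted(strings, key=functools.cmp_to_key(locale.strcoll))


def sort_with_cached_keys(strings):
    """Compute one collation key per string, then sort on the cached keys."""
    # Assumption: strxfrm plays the role g_utf8_collate_key would play in C.
    keyed = [(locale.strxfrm(s), s) for s in strings]
    keyed.sort(key=lambda pair: pair[0])  # plain comparison of the keys
    return [s for _, s in keyed]


sample = [
    "2023-03-14 18:54 Bank transfer J. Doe groceries",
    "2023-03-13 04:44 Card payment fuel station",
    "2023-03-15 07:02 Direct debit insurance",
    "2023-03-13 04:44 Card payment bakery",
]

by_keys = sort_with_cached_keys(sample)
by_compare = sort_comparing_each_time(sample)
```

Both functions should give the same order. The second one keeps a key next to every string, which matches the expectation above that memory use roughly doubles.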