[GNC-dev] Bug 797463 - CSV Import of transactions into a new file hangs

John Ralls jralls at ceridwen.us
Thu Nov 7 22:39:41 EST 2019


Christian,

It's not that the file isn't prepared for Bayesian matching; it's that older versions of GnuCash stored the Bayesian match tokens hierarchically. Aaron Laws (lmat) changed that to a flatter structure with somewhat better memory locality for faster access. imap_convert_bayes_to_flat should run once to convert the data and set the feature, after which check_import_map_data will see the flag and return immediately. A file created with 3.x that has Bayesian maps would already have the feature set.
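
To make the hierarchical-versus-flat distinction concrete, here is a toy sketch of that kind of conversion. It does not use GnuCash's KVP types: NestedMap, FlatMap, flatten and the '/' key separator are all assumptions made up for illustration.

```cpp
#include <map>
#include <string>

// Toy stand-ins, not GnuCash's KVP types.
// Hierarchical layout: token -> (account id -> match count).
using NestedMap = std::map<std::string, std::map<std::string, long>>;
// Flat layout: one "token/account-id" key -> match count.
using FlatMap = std::map<std::string, long>;

// Collapse the two-level map into a single-level one.
// The '/' separator is an assumption made for this sketch.
FlatMap flatten(const NestedMap& nested) {
    FlatMap flat;
    for (const auto& token_entry : nested) {
        for (const auto& account_entry : token_entry.second) {
            const std::string key = token_entry.first + "/" + account_entry.first;
            flat[key] += account_entry.second;
        }
    }
    return flat;
}
```

After flattening, a lookup is a single map access on a combined key rather than a walk through two levels.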


With that background, to your questions:

Why does it take so long? Because it traverses the entire tree of accounts, every time. The test book has 1127 accounts. Add to that some work inside the loop that shouldn't be there, and the fact that convert_imap_account_bayes_to_flat skips some obvious short circuits, and you get a long run time.
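
The general shape of that cost, and of the two obvious remedies (hoisting loop-invariant work and skipping accounts with nothing to convert), can be sketched as below. This is not the GnuCash code: Account, Stats, convert_slow and convert_fast are hypothetical, and the "setup" counter merely stands in for whatever work is repeated inside the real loop.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for an account; not the GnuCash Account type.
struct Account {
    std::vector<std::string> bayes_slots;  // empty when nothing to convert
};

// Counters used only to compare the two loop shapes.
struct Stats {
    int setups = 0;       // times the loop-invariant work was done
    int conversions = 0;  // slots actually converted
};

// Invariant work repeated per account, no early exit.
Stats convert_slow(const std::vector<Account>& accounts) {
    Stats s;
    for (const auto& acc : accounts) {
        ++s.setups;  // same work redone for every account
        s.conversions += static_cast<int>(acc.bayes_slots.size());
    }
    return s;
}

// Invariant work hoisted out of the loop, empty accounts skipped.
Stats convert_fast(const std::vector<Account>& accounts) {
    Stats s;
    ++s.setups;  // done once, before the loop
    for (const auto& acc : accounts) {
        if (acc.bayes_slots.empty()) continue;  // short circuit
        s.conversions += static_cast<int>(acc.bayes_slots.size());
    }
    return s;
}
```

With 1127 accounts the first version does the invariant work 1127 times and the second does it once, while both convert the same slots.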

Why does it run twice? Because no account has import-map-bayes slots, so it performs no conversions and therefore never sets the feature. With the feature still unset, the next lookup repeats the whole check.
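
A minimal model of that guard, assuming the flag is set only when at least one account was actually converted. Book, Account, convert_all and check_import_map are hypothetical names, and full_traversals exists only to count the work in this sketch.

```cpp
#include <vector>

// Hypothetical stand-ins; not the GnuCash types.
struct Account {
    bool has_bayes_slots = false;
};

struct Book {
    std::vector<Account> accounts;
    bool flat_bayes_feature = false;  // stands in for the feature flag
    int full_traversals = 0;          // instrumentation for this sketch
};

// Walk every account; return true if at least one was converted.
bool convert_all(Book& book) {
    ++book.full_traversals;
    bool converted = false;
    for (auto& acc : book.accounts) {
        if (acc.has_bayes_slots) {
            acc.has_bayes_slots = false;  // pretend the slots were rewritten
            converted = true;
        }
    }
    return converted;
}

// Guard as described above: the flag is set only after a real conversion.
void check_import_map(Book& book) {
    if (book.flat_bayes_feature) return;  // already converted, nothing to do
    if (convert_all(book)) book.flat_bayes_feature = true;
}
```

In this model, a book with no convertible accounts pays for a full traversal on every call, whereas a book with even one convertible account pays only once.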

Why is the feature not set after the import? It should be, if the import is actually setting any matches. That's done at the end of change_imap_entry. If you quit the matcher without associating the transactions with accounts, change_imap_entry is never called, so the feature is never set.

Regards,
John Ralls


> On Nov 7, 2019, at 2:41 PM, Christian Gruber <christian.gruber at posteo.de> wrote:
> 
> Can anybody provide help?
> 
> The last change on the relevant functions was done in commit fbf4843f31 by "lmat" in Dec 2017 between GnuCash versions 2.6 and 2.7. And the commit message seems to fit.
> 
> Christian
> 
> On 04.11.19 at 20:28, Christian Gruber wrote:
>> I have some questions related to Bug 797463 <https://bugs.gnucash.org/show_bug.cgi?id=797463>, which I have analyzed.
>> 
>> The author wrote that "Gnu Cash hangs" when importing (only two) transactions into a GnuCash file with the standard SKR04 account list.
>> 
>> My analysis showed that GnuCash does not actually hang, but needs a really long time for the import (several minutes). The problem is that the author has Bayesian matching enabled in his preferences, but the GnuCash file is not prepared for Bayesian matching (feature GNC_FEATURE_GUID_FLAT_BAYESIAN is not set in the GnuCash file). Moreover, the GnuCash file contains a lot of accounts (approx. 1000).
>> 
>> Most of the time is spent in the function check_import_map_data(), which is called from gnc_account_imap_find_account_bayes() (Account.cpp). This function calls imap_convert_bayes_to_flat(), which AFAICS prepares all accounts for Bayesian matching.
>> 
>> I'm not familiar with that conversion step and therefore have several questions:
>> 
>> Why does the conversion need so much CPU time?
>> 
>> If the conversion needs so much CPU time, why is it done only temporarily? The conversion is repeated for each of the two transactions.
>> 
>> Why is the conversion not even persistent after the import is done? The feature GNC_FEATURE_GUID_FLAT_BAYESIAN is not set in the GnuCash file, not even after the import has finished.
>> 
>> Regards,
>> Christian
>> 