[GNC-dev] Convert Imap to Flat

John Ralls jralls at ceridwen.fremont.ca.us
Tue Nov 6 19:10:03 EST 2018



> On Nov 7, 2018, at 5:52 AM, Geert Janssens <geert.gnucash at kobaltwit.be> wrote:
> 
> Op maandag 5 november 2018 23:27:31 CET schreef John Ralls:
>>> On Nov 6, 2018, at 2:50 AM, Geert Janssens <geert.gnucash at kobaltwit.be>
>>> wrote:> 
>>> Op maandag 5 november 2018 17:07:25 CET schreef Robert Fewell:
>>>> Hi,
>>>> I was poking around with the CSV importer and I noticed the following,
>>>> this
>>>> may also be an issue with other importers on first use after creating a
>>>> new
>>>> file...
>>>> With a new empty xml file, I used a one line transaction csv file with
>>>> appropriate settings and the 'Account' set to 'Assets:Current
>>>> Assets:Checking Account' and observed the following when I got to the
>>>> match
>>>> page...
>>>> 
>>>> With the 'Checking Account' register open it would partly show the
>>>> imported
>>>> transactions, (this can be fixed by suspending GUI changes which I was
>>>> going to propose in a PR) and the register would reload seven times.
>>>> This reloading is caused by the triggering of the
>>>> 'imap_convert_bayes_to_flat' function and as it is a new file you would
>>>> not
>>>> expect it to do any thing but it does. Adding a few print statements I
>>>> get
>>>> the following...
>>>> 
>>>> matchmap_find_destination
>>>> imap_convert_bayes_to_flat
>>>> convert_imap_account_bayes_to_flat 'Assets'
>>>> 
>>>> gnc_split_register_load called for account 'Assets:Current
>>>> 
>>>> Assets:Checking Account' with list of 1
>>>> convert_imap_account_bayes_to_flat 'Current Assets'
>>>> 
>>>> gnc_split_register_load called for account 'Assets:Current
>>>> 
>>>> Assets:Checking Account' with list of 1
>>>> convert_imap_account_bayes_to_flat 'Checking Account'
>>>> 
>>>> gnc_split_register_load called for account 'Assets:Current
>>>> 
>>>> Assets:Checking Account' with list of 1
>>>> convert_imap_account_bayes_to_flat 'Liabilities'
>>>> 
>>>> gnc_split_register_load called for account 'Assets:Current
>>>> 
>>>> Assets:Checking Account' with list of 1
>>>> convert_imap_account_bayes_to_flat 'Income'
>>>> 
>>>> gnc_split_register_load called for account 'Assets:Current
>>>> 
>>>> Assets:Checking Account' with list of 1
>>>> convert_imap_account_bayes_to_flat 'Expenses'
>>>> 
>>>> gnc_split_register_load called for account 'Assets:Current
>>>> 
>>>> Assets:Checking Account' with list of 1
>>>> convert_imap_account_bayes_to_flat 'Equity'
>>>> 
>>>> gnc_split_register_load called for account 'Assets:Current
>>>> 
>>>> Assets:Checking Account' with list of 1
>>>> 
>>>> As you can see, seven accounts get updated forcing the register reload
>>>> seven times, (not sure why those other accounts force a reload either),
>>>> and
>>>> this gets even worse if this first import is 100 transactions which would
>>>> equate to 700 reloads. I have not worked out why all these accounts are
>>>> updated or why after the first pass the converted flag is not set/noticed
>>>> there by eliminating the convert for the rest of the transactions, it
>>>> only
>>>> seems to be noticed on subsequent imports.
>>>> 
>>>> You also get this behaviour if you start the 'Import Map Dialogue' which
>>>> may be the source of a report about that dialogue freezing but that needs
>>>> more investigating.
>>> 
>>> This calls gnc_account_imap_get_info_bayes, which also calls
>>> imap_convert_bayes_to_flat so the it will trigger the same account
>>> refreshes.> 
>>>> Any idea why these accounts are updated and why it runs on every import
>>>> transaction row ?
>>> 
>>> Why the accounts are updated: while only a run in the debugger will verify
>>> it, this is what I have gathered from reading the code:
>>> 
>>> imap_convert_bayes_to_flat's sub functions will call xaccAccountBeginEdit
>>> and xaccAccountCommitEdit at some point. This happens because it changes
>>> the account's kvp frames that store the import maps.
>>> 
>>> On the other side, the register code has set a watch on the register's
>>> account(s) via the component manager. So each time the account signals a
>>> change (or more precisely a successful run of xaccAccountCommitEdit) the
>>> component manager will tell the register to refresh itself.
>>> 
>>> As you suggest you can probably disable this by a call to
>>> gnc_suspend_gui_refresh.
>>> 
>>> Why it runs on every import transaction row ? I suspect this is because
>>> there are no imap records stored yet and hence the feature flag that
>>> blocks the conversion is not set yet. So for each transaction it will try
>>> to do the conversion, find there's no converted imap record to store and
>>> skip setting the feature flag. This will probably continue forever if the
>>> user doesn't use bayesian matching at all.
>>> This is a difficult issue to solve. We don't want to set the flat_bayes
>>> conversion flag if there are no bayes maps because that would needlessly
>>> break backwards compatibility. We could make the conversion code more
>>> careful and have it only commit to accounts if there really are changes
>>> to commit. And add a run time flag that signals the conversion has run
>>> already once. With that conversion should only start if this flag is
>>> false AND the feature flag is false.
>>> However it may all be unnecessary and perhaps just blocking register
>>> updates while the transaction matching (or the whole import code) is
>>> running may eliminate the performance bottleneck already.
>>> So I would start with that: block gui updates.
>>> Then secondly, make the imap code more careful not to do unnecessary
>>> account updates and lastly consider a run time flag.
>>> 
>>> Like you though I suspect this may be at least one cause of the import
>>> issues we see. Thanks for poking at it!
>> 
>> I think that we do want to set the feature flag in this case: The import map
>> uses the flat layout and a pre-flat-bayes GnuCash won’t be able to read it,
>> causing user frustration.
> 
> The timing matters. We want to set the flag from the first flat_bayes_imap 
> entry that's actually written to file, not at the start of a conversion run. 
> That's how the code is constructed now.
> 
> I don't want the flag to be set if at the end of a conversion attempt there 
> still aren't any flat_bayes_imap entries. That would needlessly break 
> backwards compatibility.

OK, that’s reasonable. 

>> 
>> That’s not to say that we shouldn’t also do everything we can to speed up
>> the code--it’s really slow--and certainly suspending UI events while
>> computing the import matches is a good idea.
>> 
>> There’s something else going wrong, though: convert_imap_bayes_to_flat calls
>> xaccAccountBeginEdit at the start and xaccAccountCommitEdit at the end.
>> That should raise the edit level and prevent any interior calls to
>> xaccAccountCommitEdit from doing anything.
> 
> This doesn't look wrong per se to me.
> convert_imap_account_bayes_to_flat is run once for each account. So there will 
> still be the same number of gui refreshes as there are accounts. I don't think 
> we can reduce this further on the account begin/commit level. A complete gui 
> refresh suspend may do better.

Each account has its own import map and any particular import affects only one account’s map, so there should be exactly one effective, i.e. with edit_level == 1, commit and exactly one UI refresh. If there’s more than one of either inside a call to convert_imap_account_bayes() then something’s broken at the QofInstance level. If we’re calling convert_imap_account_bayes() on a particular account more than once in a session then there’s something wrong with the decision logic that calls it. Bob’s printf-profile suggests at least the latter and  your "imap_convert_bayes_to_flat's sub functions will call xaccAccountBeginEdit and xaccAccountCommitEdit at some point” suggests the former.

I suppose Aaron thought that running it on an empty or non-existent map would take negligible time; if that’s not the case then we can simply check for that and quit, but it should be checked in the profiler before we add any code.

Regards,
John Ralls




More information about the gnucash-devel mailing list