[GNC-dev] Convert Imap to Flat

John Ralls jralls at ceridwen.fremont.ca.us
Wed Nov 7 18:29:02 EST 2018



> On Nov 7, 2018, at 5:14 PM, Geert Janssens <geert.gnucash at kobaltwit.be> wrote:
> 
> Op woensdag 7 november 2018 01:10:03 CET schreef John Ralls:
>>> On Nov 7, 2018, at 5:52 AM, Geert Janssens <geert.gnucash at kobaltwit.be>
>>> wrote:> 
>>> Op maandag 5 november 2018 23:27:31 CET schreef John Ralls:
>>>> That’s not to say that we shouldn’t also do everything we can to speed up
>>>> the code--it’s really slow--and certainly suspending UI events while
>>>> computing the import matches is a good idea.
>>>> 
>>>> There’s something else going wrong, though: convert_imap_bayes_to_flat
>>>> calls xaccAccountBeginEdit at the start and xaccAccountCommitEdit at the
>>>> end. That should raise the edit level and prevent any interior calls to
>>>> xaccAccountCommitEdit from doing anything.
>>> 
>>> This doesn't look wrong per se to me.
>>> convert_imap_account_bayes_to_flat is run once for each account. So there
>>> will still be the same number of gui refreshes as there are accounts. I
>>> don't think we can reduce this further on the account begin/commit level.
>>> A complete gui refresh suspend may do better.
>> 
>> Each account has its own import map and any particular import affects only
>> one account’s map, so there should be exactly one effective, i.e. with
>> edit_level == 1, commit and exactly one UI refresh.
> 
> The new CSV importer can handle imports into multiple accounts (on both 
> sides), but that's tangential to this issue.
> 
> The conversion as you describe it would be a lazy conversion: only convert 
> maps on an as needed basis. An interesting idea which hadn't occurred to me 
> and also different from how it's implemented.
> Right now the conversion of *all* import maps of all accounts is initiated the 
> first time any import map (bayes of course) is needed. So even if the import 
> requires only one account's maps, all maps will be converted in one go. Hence 
> the iteration over all accounts.
> 
> I don't know how 2.6.21 would handle partly converted bayes import maps. I do 
> know as it is now the conversion is designed to be run as long as the feature 
> flag is not set. That would also have to change if we would like to go for the 
> lazy conversion.
> 
> The advantage of such lazy conversion is the time required to convert is 
> spread over several import runs, so each conversion is likely to take less 
> time. The disadvantage is we risk ending up with books that carry hierarchical 
> bayes data for an eternity and hence gnucash has to keep code around to handle 
> with it. In a full conversion in one go scenario we can at some point 2 major 
> releases from now declare this unsupported as we only guarantee backwards 
> compatibility for 1 major release series.
> 
> What *is* a problem with the one-shot-convert-all scenario is that it takes 
> noticable time (in some cases even horribly long) and we don't inform the user 
> of what's happening. The conversion should really have been initiated from the 
> gui with a progress bar showing something is going on and indicating how far 
> the process has run.

Sorry, I’d forgotten that it’s an all-at-once. Given that it’s controlled by a feature-flag that’s a reasonable design decision. It points to another possible slow-down: Even if convert_imap_account_bayes() takes negligible time on an empty map, walking a large account tree looking for maps won’t. We can fix that and speed up the eventual conversion a bit by constructing a list or vector of accounts while loading the book: The SQL backends can use a single query, SELECT guid FROM accounts WHERE guid = (SELECT DISTINCT a.guid FROM accounts = a, slots = k WHERE a.guid = s.obj_guid AND s.name = 'import-map-bayes’);[not tested, might need a tweak or two]. The XML backend would just add the account guid to the list. 

While working with imaps we should see if we can combine the flatting and the transfer account name->guid conversion so that they need only one pass through the map to accomplish both in cases where both are needed.

I agree that a dialog box informing the user of what’s going on would be good. Given the performance problems with progress bars on Windows with HiDPI displays I’m not so sure about that part, especially since generating a useful progress measure is a problem for most of our progress bars.

Regards,
John Ralls


More information about the gnucash-devel mailing list