[GNC-dev] avoid the brain dead import

David Cousens davidcousens at bigpond.com
Sat Aug 25 02:22:04 EDT 2018


I think the answer to your question lies in the fact that files users wish
to import don't come from a single source and don't always conform to any
well defined standard with regard to both the data format and the
information supplied.  

Importing OFX data is considerably more straightforward than importing CSV
data for this reason, as it does conform to a reasonably well-defined
standard, but even then some institutions manage to stuff it up. In most
cases users don't necessarily have any control over what another institution
includes in the files it supplies. Most include the transactions between the
To and From dates, inclusive, that you enter when requesting a data
download, but this is not guaranteed. Stupid? Yes, but the importer has to
cope with stupid as well as with nicely formatted and well thought out files.
Not all data for a bank account includes the detail of which account you may
want the second split of a transaction to go to, and even if it does, it may
not match the choices you made in setting up your chart of accounts. If it
does, then GnuCash deals with that.

I am an accountant (retired) and I have imported the same file (or at least
overlapping data in different files) on more than one occasion since I have
been using GnuCash. The point of the matcher is to pick this up before you
have entered the data into your accounts, so that you do not have to deal
with the far more laborious task of working out which transactions were
duplicated in an import and deleting them from your records one by one once
they have been imported. If you get the date format wrong relative to your locale format on
an import, it can be particularly difficult. Swapping days and years
produces some interesting results.
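
To illustrate why a wrong date format makes duplicates hard to spot, here is
a minimal Python sketch. It is not how the GnuCash matcher works internally;
the sample data, the `parse_row` and `find_duplicates` helpers, and the idea
of keying on (date, amount, description) are all assumptions made purely for
illustration. It shows a day/month swap, but the same thing happens with any
other misread field order.

```python
from datetime import date, datetime
from decimal import Decimal

# Transactions already in the books (hypothetical sample data).
existing = {
    (date(2018, 4, 3), Decimal("-42.50"), "acme hardware"),
}

def parse_row(date_text, amount_text, description, date_format):
    """Turn one imported row into a (date, amount, description) key."""
    parsed_date = datetime.strptime(date_text, date_format).date()
    return (parsed_date, Decimal(amount_text), description.strip().lower())

def find_duplicates(rows, date_format):
    """Return the imported rows whose key is already in the books."""
    return [row for row in rows if parse_row(*row, date_format) in existing]

# Rows from a file that writes dates day first (3 April and 5 April 2018).
imported = [
    ("03/04/2018", "-42.50", "ACME Hardware"),
    ("05/04/2018", "-10.00", "Coffee Shop"),
]

# Correct format: the first row is recognised as already entered.
print(find_duplicates(imported, "%d/%m/%Y"))

# Wrong format: the same row is read as 4 March, so nothing matches and
# the duplicate would slip through into the accounts.
print(find_duplicates(imported, "%m/%d/%Y"))
```

With the right format the first row is flagged; with the wrong one the list
comes back empty, which is exactly the case that is painful to clean up
afterwards.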

The matcher also has a Bayesian learning system which can allocate the
transfer account for the second split on the basis of matching information
in the description and other fields. My experience has been that after I have
imported one or two months' data, it will generally assign the transfer
account for about 60% of data in the succeeding months and handles regular
payments and deposits pretty well and it gets better still after a few
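
For anyone unfamiliar with the idea, the following is a rough Python sketch
of Bayesian matching on description words. It is a generic naive Bayes
classifier with add-one smoothing, not the actual GnuCash code; the class
name `TransferAccountGuesser`, its methods and the account names are all
hypothetical.

```python
import math
from collections import Counter, defaultdict

class TransferAccountGuesser:
    """Toy naive Bayes guesser: description words -> transfer account."""

    def __init__(self):
        self.account_counts = Counter()           # transactions seen per account
        self.token_counts = defaultdict(Counter)  # word counts per account
        self.vocabulary = set()                   # every word seen so far

    @staticmethod
    def tokens(description):
        return description.lower().split()

    def train(self, description, account):
        """Record that a transaction with this description went to account."""
        self.account_counts[account] += 1
        for token in self.tokens(description):
            self.token_counts[account][token] += 1
            self.vocabulary.add(token)

    def guess(self, description):
        """Return the most probable account, or None if nothing was learned."""
        if not self.account_counts:
            return None
        total = sum(self.account_counts.values())
        vocab_size = max(len(self.vocabulary), 1)
        best_account, best_score = None, -math.inf
        for account, count in self.account_counts.items():
            # Log prior plus smoothed log likelihood of each word.
            score = math.log(count / total)
            token_total = sum(self.token_counts[account].values())
            for token in self.tokens(description):
                seen = self.token_counts[account][token]
                score += math.log((seen + 1) / (token_total + vocab_size))
            if score > best_score:
                best_account, best_score = account, score
        return best_account

guesser = TransferAccountGuesser()
guesser.train("ACME Hardware store", "Expenses:Repairs")
guesser.train("ACME Hardware paint", "Expenses:Repairs")
guesser.train("City Power electricity bill", "Expenses:Utilities")

print(guesser.guess("ACME Hardware nails"))
print(guesser.guess("City Power bill"))
```

Each confirmed import adds more word counts per account, which is why the
guesses improve as more data goes through, and also why an unusual
transaction from a familiar vendor can still be assigned to the wrong
account.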

I import a few hundred transactions a month, generally in 5-10 minutes, from
OFX files with no problem. CSV importing (e.g. PayPal) can be far more
problematic, but the ability of the importer in v3.2 to save import
settings is a great help.

There is a recent patch (Bug 796778) which might help you shorten the
initial input before the matcher works efficiently but it is not yet
incorporated in the master branch. It implements multiple selection of rows
in the matcher (e.g. rows from the same vendor) using Ctrl-click, Shift-click
and the rubber-banding techniques implemented in GTK, and the assignment of
those rows to a single transfer account. It speeds up the initial import of
data quite a bit but is less effective once the Bayesian matching is trained
(which is possibly why it has not been implemented before now) as that tends
to pick up repeated transactions fairly well. 

The downside, of course, is that there is always a transaction or two from
the same vendor or customer which may have to go to a different transfer
account, i.e. you still have to check that it has been correctly assigned by
the matcher.

Keep trying. The brain dead importer does get less brain dead with repeated

David Cousens
Sent from: http://gnucash.1415818.n4.nabble.com/GnuCash-Dev-f1435356.html
