[GNC] importing multiple CSVs to Gnucash

Gio Bacareza gbacareza at gmail.com
Fri Jun 5 06:57:44 EDT 2020


Thank you Adrien, David and @flywire for your very helpful responses.

I actually created a script that transforms different file formats and
data structures and, through a machine learning classifier, autopopulates
the "Transfer Account" field depending on the description.

My script goes through the different formats, since banks generally are
not standardized, and transforms these files into a simple standard CSV
with the columns DATE, DESCRIPTION, ACCOUNT, DEPOSIT, WITHDRAWAL.
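
To illustrate the transformation step, here is a minimal sketch, not my
actual script. The bank names, the source column names in BANK_LAYOUTS and
the helper names are all made up for illustration; only the five target
columns come from the description above.

```python
import csv
import io
from decimal import Decimal

# Target columns of the standard CSV.
STANDARD_FIELDS = ["DATE", "DESCRIPTION", "ACCOUNT", "DEPOSIT", "WITHDRAWAL"]

# Hypothetical per-bank layouts; real statements will use other column names.
BANK_LAYOUTS = {
    # One signed amount column: negative means money leaving the account.
    "bank1": {"date": "Posting Date", "description": "Details", "amount": "Amount"},
    # Separate credit and debit columns.
    "bank2": {"date": "Date", "description": "Narrative",
              "credit": "Credit", "debit": "Debit"},
}


def normalize_row(row, layout, account):
    """Map one bank-specific row onto the standard columns."""
    if "amount" in layout:
        amount = Decimal(row[layout["amount"]])
        deposit = amount if amount > 0 else Decimal("0")
        withdrawal = -amount if amount < 0 else Decimal("0")
    else:
        deposit = Decimal(row[layout["credit"]] or "0")
        withdrawal = Decimal(row[layout["debit"]] or "0")
    return {
        "DATE": row[layout["date"]],
        "DESCRIPTION": row[layout["description"]].strip(),
        "ACCOUNT": account,
        # Leave the unused side empty rather than writing 0.00.
        "DEPOSIT": f"{deposit:.2f}" if deposit else "",
        "WITHDRAWAL": f"{withdrawal:.2f}" if withdrawal else "",
    }


def normalize_statement(text, bank, account):
    """Parse one statement (as CSV text) into a list of standard rows."""
    layout = BANK_LAYOUTS[bank]
    reader = csv.DictReader(io.StringIO(text))
    return [normalize_row(row, layout, account) for row in reader]


def to_csv(rows):
    """Serialize standard rows as CSV text with a header line."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=STANDARD_FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

Each statement is normalized separately with its own layout, and the
resulting rows can then be concatenated and written out with to_csv().
Dates are passed through unchanged here; a real script would also have to
bring them to one date format.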

After transformation, I run the file through my classifier, which adds the
TRANSFER ACCOUNT field and autopopulates it with the predicted transfer
account. The predicted name is an exact string match: e.g. if the transfer
account is "XXXYYYZZZ" in the CSV, there is an account in GnuCash whose
name is exactly "XXXYYYZZZ". I figured that if I explicitly include the
specific transfer account per line in the CSV, it should help GnuCash match.
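
Since the whole idea depends on exact string matches, a check like the one
below can catch mismatches before importing. This is only a sketch: the
function name is hypothetical, and it assumes the list of GnuCash account
names is available separately (for example, maintained by hand in a text
file).

```python
def unmatched_transfer_accounts(rows, known_accounts):
    """Return the TRANSFER ACCOUNT values that have no exact match.

    rows           -- list of dicts, each with a "TRANSFER ACCOUNT" key
    known_accounts -- iterable of account names as they appear in GnuCash

    The comparison is case-sensitive and whitespace-sensitive on purpose,
    because an exact string match is what is being relied on.
    """
    known = set(known_accounts)
    predicted = {row["TRANSFER ACCOUNT"] for row in rows}
    return sorted(predicted - known)
```

An empty result means every predicted name appears verbatim in the account
list; anything returned should be corrected in the CSV before the import.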

This could save me a lot of time since I don't have to transform each bank
account statement or logs individually. I have lots of accounts. I can
imagine I'm not alone in this case.

I understand from your comments that GnuCash uses a naive Bayes algorithm
to predict the TRANSFER ACCOUNT given the description, and that you have to
feed it transactions little by little for it to learn. But as I have said,
I have already pre-processed the data so GnuCash does not have to do that.
I explicitly feed it the exact transfer account.
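
For anyone unfamiliar with the idea, here is a small sketch of a naive
Bayes classifier over description tokens. It is a generic textbook version
with add-one smoothing, written only to show the principle; it is neither
GnuCash's actual code nor my own classifier, and the class and method names
are made up.

```python
import math
from collections import Counter, defaultdict


class DescriptionClassifier:
    """Multinomial naive Bayes over whitespace-separated description tokens."""

    def __init__(self):
        self.token_counts = defaultdict(Counter)  # account -> token -> count
        self.account_counts = Counter()           # account -> transactions seen
        self.vocabulary = set()                   # every token seen in training

    @staticmethod
    def tokenize(description):
        return description.lower().split()

    def train(self, description, account):
        """Record one already-categorized transaction."""
        tokens = self.tokenize(description)
        self.account_counts[account] += 1
        self.token_counts[account].update(tokens)
        self.vocabulary.update(tokens)

    def predict(self, description):
        """Return the most probable account, or None if nothing was trained."""
        tokens = self.tokenize(description)
        total = sum(self.account_counts.values())
        best_account, best_score = None, -math.inf
        for account, seen in self.account_counts.items():
            counts = self.token_counts[account]
            # Add-one smoothing so unseen tokens do not zero out the score.
            denominator = sum(counts.values()) + len(self.vocabulary)
            score = math.log(seen / total)  # log prior of the account
            for token in tokens:
                score += math.log((counts[token] + 1) / denominator)
            if score > best_score:
                best_account, best_score = account, score
        return best_account
```

The more categorized transactions are passed to train(), the better
predict() tends to get, which is why a learner like this has to be fed
gradually.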

I really want to avoid doing the import account by account if I could.

On Wed, Jun 3, 2020 at 5:21 AM David Cousens <davidcousens at bigpond.com>
wrote:

> Gio,
>
> The GnuCash import matching procedure searches for duplicate transactions
> in a time window around the date of the imported record, set at +-42 days.
> A match score is then calculated based on the differences in dates and
> amounts and the matching of tokenized data. It is generally very good at
> picking up duplicates but will sometimes match regular repeated payments.
> An imported transaction which matches an existing one will have the "C"
> ("R" in older GnuCash versions) checkbox checked if the match is very good,
> but may have the "U+C" ("U+R" in older GnuCash versions) checkbox checked
> if it is not a very close match (usually if the description and memo fields
> are different). It will pay to check that all matches are actually valid.
> You can swap between viewing the register(s) and the import matcher window.
> The Help manual section on importing will help you with understanding the
> significance of the background colours of the rows and the meaning of the
> checkboxes in the importer.
>
> https://www.gnucash.org/docs/v3/C/gnucash-help/trans-import.html#:~:text=Navigate%20to%20the%20MT940%2C%20MT942,the%20transactions%20in%20the%20file
> .
>
> Even though you are specifying the transfer accounts, the importer will ask
> you to map the account name in the transfer field to an internal GnuCash
> account each time it encounters an account name for the first time in the
> import data.
>
> David Cousens
>
>
>
> On Tue, 2020-06-02 at 17:57 +0800, Gio Bacareza wrote:
> > Hi GNUCash experts,
> >
> > I'm planning to consolidate all my statements into 1 big CSV. This CSV
> > would naturally have an account field and a transfer field.
> >
> > Since all transactions are there, there will be instances where there will
> > be duplicates. For example, a statement from bank1 could have a transfer
> > from bank1 to bank2. So account = bank1, transfer = bank2, amount = -100
> > for example. But, since I put it all together in 1 file, there will be
> > another line where account = bank2, transfer = bank1, amount = 100.
> >
> > When I import this big file containing all transactions from different
> > statements, how will gnucash handle this? Can it automatically detect that
> > this is one and the same transaction?
> >
> > The alternative is to import account by account but that is too time
> > consuming for me.
> >
> > Is there a better way?
> >
> --
> Dr David R Cousens
> B.Sc, M.Prof. Acc., Ph.D., G.C.Ed
>
>

-- 
cheers,

Gio

