[GNC] How does GnuCash autoassign transactions?

David Cousens davidcousens at bigpond.com
Wed Nov 28 17:36:57 EST 2018


This is my understanding after having looked at some of the matcher code
recently. It may not be exactly correct as I wasn't looking specifically at
the Bayes algorithm but I did skim through the code. 

It tokenizes (parses) the import data for key information. It seems to
maintain a table of the tokens present in a transaction whenever you manually
assign a transaction to a particular transfer account or accept an
automatic assignment to that account. If a transfer account is specified in
the import data then it will use that. Otherwise, when you import a
transaction, the code appears to examine the tokens in the incoming
transaction, compare these with the table it has accumulated and calculate
a probability that this transaction should be assigned to a given account.
It then searches for existing transactions and calculates a probability that
one of them is an exact or close match for the incoming transaction.
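
As an illustration only (this is not the GnuCash source, and the names
tokenize, TokenTable, train and probabilities are made up for the example),
a token table with a simple naive-Bayes-style probability could be sketched
in Python like this. The whitespace tokenizer and the add-one smoothing are
assumptions, not something I have confirmed in the code:

```python
from collections import defaultdict


def tokenize(description):
    """Split a description into lowercase word tokens (assumed rule)."""
    return description.lower().split()


class TokenTable:
    """Hypothetical table of how often each token went to each account."""

    def __init__(self):
        # counts[token][account] = times the token was seen with that account
        self.counts = defaultdict(lambda: defaultdict(int))

    def train(self, description, account):
        """Record the tokens of an accepted transaction against its account."""
        for token in tokenize(description):
            self.counts[token][account] += 1

    def probabilities(self, description):
        """Return {account: probability} for an incoming description."""
        accounts = {a for per_tok in self.counts.values() for a in per_tok}
        scores = {a: 1.0 for a in accounts}
        used_any_token = False
        for token in tokenize(description):
            per_tok = self.counts.get(token)
            if not per_tok:
                continue  # a token never seen before carries no information
            used_any_token = True
            total = sum(per_tok.values())
            for a in accounts:
                # add-one smoothing (an assumption) avoids zero probabilities
                scores[a] *= (per_tok.get(a, 0) + 1) / (total + len(accounts))
        if not used_any_token:
            return {}
        norm = sum(scores.values())
        return {a: s / norm for a, s in scores.items()}


table = TokenTable()
table.train("CITY PHARMACY 123", "Expenses:Medical")
table.train("SUPERMARKET FOODS", "Expenses:Groceries")
probs = table.probabilities("NEW PHARMACY 999")
best = max(probs, key=probs.get)
```

Here only the token "pharmacy" is known, so the medical expense account
comes out as the most probable one; a description with no known tokens
gives no suggestion at all.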

Based on these probabilities it marks the transaction in the matcher window
to be Added, Reconciled or Updated, sets the appropriate checkbox and
displays the relevant probabilities in the colour bars. If the probability
for a transfer account is considered high enough, it actually assigns that
transfer account to the transaction. If the matcher is completely successful
in assigning a transfer account and there is no match to an existing
transaction, the transaction row is marked in light green. The row is marked
in red/pink if the matcher decides it should not be imported, e.g. because it
matches an existing transaction sufficiently closely.
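
A rough sketch of that decision step, again in Python and again not taken
from the GnuCash code: the function decide, the Decision record and both
threshold values are hypothetical, and the sketch leaves out the Updated
case because I am not sure of the rule for it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Decision:
    action: str              # "add" or "reconcile"
    account: Optional[str]   # None means no transfer account was assigned


def decide(match_prob, best_account, account_prob,
           match_threshold=0.9, account_threshold=0.7):
    """Pick an action from the two probabilities (thresholds are made up)."""
    if match_prob >= match_threshold:
        # Close enough to an existing transaction: do not import it again.
        return Decision("reconcile", None)
    if best_account is not None and account_prob >= account_threshold:
        # New transaction and a confident account guess: assign it.
        return Decision("add", best_account)
    # New transaction but no confident guess: leave it for the user.
    return Decision("add", None)


matched = decide(0.95, "Expenses:Medical", 0.8)
assigned = decide(0.10, "Expenses:Medical", 0.8)
unassigned = decide(0.10, "Expenses:Medical", 0.2)
```

In terms of the row colours, the second result corresponds to the light
green case and the first to the red/pink case.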

Any of these automatic choices can be overridden by the user by unchecking
the checked box, checking another, or clicking in the transfer account
column to start the account selection dialog, just as you do for an
unassigned transaction (whose transfer account cell is marked yellow and
which is assigned to the Imbalance account).

I think the change David Carlson referred to was to limit the date range
over which an exact match is searched for, but I am not exactly sure of the
details.

The matcher regularly rejects some regular transactions of mine that are
direct deposits to a payee because my bank includes a unique request number
from the payee in each transaction description. If there is a number or word
present in the description that is indicative of a payee that always goes to
a given account then the matcher will generally get it. It also often
identifies my fortnightly pension payments as updates to existing records
because all the tokens match, the amounts are normally exactly the same
and the dates fall within the date window designed to catch near misses
for updating. On the occasions when my pension amount changes, the importer
recognises it as a new transaction to be imported rather than rejecting it as
a match to an already existing transaction.
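
To make the pension example concrete, here is a minimal Python sketch of a
date-window test. The function name is_candidate_match, the amounts in cents
and the 14-day default window are assumptions for the illustration; I have
not checked what window GnuCash actually uses.

```python
from datetime import date


def is_candidate_match(new_date, new_cents, old_date, old_cents,
                       window_days=14):
    """True if the amounts are equal and the dates are within the window."""
    days_apart = abs((new_date - old_date).days)
    return new_cents == old_cents and days_apart <= window_days


previous = date(2018, 11, 14)   # last fortnight's payment
incoming = date(2018, 11, 28)   # this fortnight's payment

same_amount = is_candidate_match(incoming, 50000, previous, 50000)
changed_amount = is_candidate_match(incoming, 51000, previous, 50000)
narrow_window = is_candidate_match(incoming, 50000, previous, 50000,
                                   window_days=7)
```

With identical amounts, a payment 14 days after the previous one is caught
by a 14-day window, which is the near-miss behaviour described above. A
changed amount, or a narrower window, makes it a new transaction instead.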

The matching tables are updated when you actually import the transactions
into GnuCash, not when you assign a transfer account, so both automatic
matches that you override and successful automatic matches that you accept
update the tables. After about 3 months of retraining following the
update to v3, most of the regular payments from my bank are picked up by
the matcher, mainly on the payee name if it is included in the
description. New payments from a new payee aren't going to be matched,
although when I use a different pharmacy the word pharmacy in the
description is sufficient to get a correct match to my medical expense
account.
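
The timing of the table update can be sketched as follows. This is only a
Python illustration of the behaviour I described, with a hypothetical
ImportSession class and a plain Counter standing in for the matching
tables; it is not how the GnuCash code is organised.

```python
from collections import Counter


class ImportSession:
    """Holds choices made in the matcher until the import is confirmed."""

    def __init__(self, table):
        self.table = table   # Counter keyed by (token, account)
        self.pending = {}    # transaction id -> (description, account)

    def assign(self, txn_id, description, account):
        """Record or override a transfer account; the table is untouched."""
        self.pending[txn_id] = (description, account)

    def commit(self):
        """On import, train the table with the final choice for each row."""
        for description, account in self.pending.values():
            for token in description.lower().split():
                self.table[(token, account)] += 1
        self.pending.clear()


table = Counter()
session = ImportSession(table)
session.assign(1, "CITY PHARMACY", "Expenses:Groceries")  # automatic guess
session.assign(1, "CITY PHARMACY", "Expenses:Medical")    # user override
trained_before_import = len(table)
session.commit()
```

Only the final choice for each transaction reaches the table, and nothing
is learned if the import is never confirmed.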

I recently updated the OFX import instructions in the documentation as a
result of the new feature that allows multiple transactions to be selected
in the matcher and a transfer account assigned to that selection, which is
planned to be incorporated in V4. I have also started on the CSV import
documentation, but I have to explore recent changes in that more fully
before completing an update. I had also planned to document the matcher
process above more fully after exploring the code a bit more carefully.
These changes are also tied up with David T's reorganization of the Guide
and Help manual layout, so will not appear in the current V3 documentation.

David Cousens




