[GNC-dev] Understanding the bayesian import matching algorithm
christian.gruber at posteo.de
Thu Jul 2 15:10:53 EDT 2020
While further studying the Bayesian import matching algorithm, I have
reached the point where I want to understand how Bayes' formula is
applied to the problem of matching transactions to accounts using
tokens. But I need further information, since it is not clear to me
what is actually being calculated there.
The implementation can be found in the following functions in Account.cpp:
Actually, the latter could be omitted as it only selects the account
with the highest matching probability.
Judging from the code and the sparse comments on the implementation, it
seems to be a variant of the naive Bayes classifier,
with the tokens used as (independent) "features" and the accounts used
as "classes". But comparing this algorithm to the code leaves several
questions open.
Does anybody know of a more precise description of the algorithm on
which the implementation in GnuCash is based?
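
For comparison, here is a minimal sketch of the textbook naive Bayes
classifier I have in mind, with tokens as features and accounts as
classes. It is not the GnuCash code and may well differ from it: the
function names, the add-one (Laplace) smoothing, and the example account
names are all my own assumptions for illustration.

```python
from collections import defaultdict
from math import prod


def train(samples):
    """Count how often each token was seen together with each account.

    samples: iterable of (tokens, account) pairs (hypothetical input format).
    """
    token_counts = defaultdict(lambda: defaultdict(int))
    account_counts = defaultdict(int)
    for tokens, account in samples:
        account_counts[account] += 1
        for token in tokens:
            token_counts[account][token] += 1
    return token_counts, account_counts


def posteriors(tokens, token_counts, account_counts):
    """Return P(account | tokens) for every known account.

    Textbook naive Bayes: P(A | t1..tn) is proportional to
    P(A) * P(t1 | A) * ... * P(tn | A), treating tokens as independent.
    Add-one smoothing is an assumption made here so that unseen tokens
    do not force a probability of zero.
    """
    total = sum(account_counts.values())
    # Vocabulary of all known tokens, plus the query tokens themselves.
    vocabulary = {t for counts in token_counts.values() for t in counts}
    vocabulary |= set(tokens)

    scores = {}
    for account, n in account_counts.items():
        prior = n / total
        seen = token_counts.get(account, {})
        denom = sum(seen.values()) + len(vocabulary)
        likelihood = prod((seen.get(t, 0) + 1) / denom for t in tokens)
        scores[account] = prior * likelihood

    # Normalise by the evidence so the results sum to 1.
    evidence = sum(scores.values())
    return {account: score / evidence for account, score in scores.items()}


if __name__ == "__main__":
    # Hypothetical training data: tokens from earlier imports and the
    # account each transaction was assigned to.
    history = [
        (["aldi", "market"], "Expenses:Groceries"),
        (["aldi"], "Expenses:Groceries"),
        (["shell", "fuel"], "Expenses:Auto"),
    ]
    token_counts, account_counts = train(history)
    result = posteriors(["aldi"], token_counts, account_counts)
    print(max(result, key=result.get), result)
```

Selecting the account is then just a matter of taking the maximum over
the returned probabilities, which is why that last step adds nothing to
the understanding of the formula itself.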