Bayesian matching- Imbalance

Lincoln A Baxter lab at lincolnbaxter.com
Wed Jul 15 23:09:00 EDT 2015


On Tue, 2015-07-14 at 09:10 -0700, John Ralls wrote:
> > On Jul 14, 2015, at 8:32 AM, C <Peace at AleksandrSolzhenitsyn.net> wrote:
> > 
> > I'm running GnuCash r21973 on 2013-01-03 on Linux.
> > 
> > When importing a QFX file into my credit card account many of the
> > imported transactions are "matched" incorrectly. For example, "Sunoco
> > Car fuel" ends up in the transfer column with the name Expense: Food
> > which is incorrect.
> > 
> > The Bayesian matching is set but I don't know what the proper levels
> > are to be set to...or whether that'll solve any problems.
> > 
> > Many of the imported QFX transactions end up in the "Imbalanced USD"
> > account.
> > 
> > Fixing this stuff takes way too much time.
> > 
> > Do you have any idea what should be done and how to fix matching so
> > the purchases are correctly labeled in the "transfer" column?
> > 
> 
> You have to train, or perhaps retrain, the Bayesian matcher. That 
> means reviewing and correcting the transfer accounts every time you 
> do an import before accepting the matches. Depending on the 
> variability of the descriptions and how long you’ve allowed the bad 
> matches to persist it may take many imports worth of corrections to 
> overcome the bad scores. There’s no “clear the history” button 
> implemented to let you start over from scratch.
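
To illustrate why it can take many imports to overcome bad scores, here
is a rough sketch. It is a toy count-based matcher, not GnuCash's actual
Bayesian algorithm or storage format; the class name, the helper, and the
account names "Expenses:Food" and "Expenses:Auto:Fuel" are all made up
for the example.

```python
from collections import defaultdict


class ToyMatcher:
    """Toy token-count matcher. NOT GnuCash's real algorithm (assumption)."""

    def __init__(self):
        # counts[token][account] = number of accepted imports pairing them
        self.counts = defaultdict(lambda: defaultdict(int))

    def train(self, description, account):
        """Record one accepted import of `description` into `account`."""
        for token in description.lower().split():
            self.counts[token][account] += 1

    def best_account(self, description):
        """Return the account with the highest summed token count, or None."""
        totals = defaultdict(int)
        for token in description.lower().split():
            for account, n in self.counts.get(token, {}).items():
                totals[account] += n
        if not totals:
            return None
        # On a tie the account that was seen first (the older history) wins.
        return max(totals, key=totals.get)


def corrections_needed(matcher, description, right_account, limit=100):
    """Count how many corrected imports it takes before the match flips."""
    n = 0
    while matcher.best_account(description) != right_account and n < limit:
        matcher.train(description, right_account)
        n += 1
    return n


if __name__ == "__main__":
    m = ToyMatcher()
    for _ in range(5):  # five imports accepted with the wrong account
        m.train("Sunoco Car fuel", "Expenses:Food")
    print(corrections_needed(m, "Sunoco Car fuel", "Expenses:Auto:Fuel"))
```

In this toy model, five accepted bad matches need six corrected imports
before the suggestion flips, which is the same "overcome the bad scores"
effect described above, just with much simpler arithmetic.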

There may be no "clear the history" button, but it would be very
helpful to many (including me) if we could just wipe it out and start
over. This is far from the first complaint I have read.

Why could we not just remove the files or replace them with fresh
(empty/untainted) copies?  This and the additional complication you
describe below make the Bayesian matching far less useful than it
could otherwise be, even if we could just wipe it out...

This is annoying enough to me that it might be worth my pulling down
the sources and at least fixing the account name problem, and submitting
a patch.

For now, where are the file(s)?

> 
> Another complication is that the original implementor made a bad 
> design decision: The Bayesian matcher stores the account name rather 
> than its GUID, so account name changes will produce invalid matches 
> until the scores for the old names are overcome by higher scores on 
> the new name.
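
A small sketch of the name-versus-GUID problem as I understand it from
the description above. This is only an illustration, not GnuCash's data
model: the Account class, the score value, and the account names are
assumptions made up for the example.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Account:
    """Hypothetical account: the name can change, the GUID never does."""
    name: str
    guid: str = field(default_factory=lambda: uuid.uuid4().hex)


def rename_demo():
    """Return (score found via name key, score found via GUID key)."""
    fuel = Account("Expenses:Fuel")

    # Same learned score, stored under two different kinds of key.
    scores_by_name = {fuel.name: 12}
    scores_by_guid = {fuel.guid: 12}

    # The user renames the account after the scores were learned.
    fuel.name = "Expenses:Auto:Fuel"

    # The name-keyed entry is now orphaned; the GUID-keyed one still resolves.
    return scores_by_name.get(fuel.name), scores_by_guid.get(fuel.guid)


if __name__ == "__main__":
    print(rename_demo())
```

Under these assumptions the name-keyed lookup returns nothing after the
rename, while the GUID-keyed lookup still finds the learned score.
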
> 
> The thresholds are explained in the help manual:
> http://www.gnucash.org/docs/v2.6/C/gnucash-help/set-prefs.html#prefs-online
> I don’t think that changing them will help when the match database 
> has high scores for incorrect matches.
> 
> Regards,
> John Ralls

