Fixing confused bayesian matching data?

John Ralls jralls at ceridwen.us
Sun Jul 17 23:42:46 EDT 2016


> On Jul 17, 2016, at 6:24 PM, Philip Matthews <philip_matthews at magma.ca> wrote:
> 
> Just wondering if anyone has any advice on what to do with some very confused bayesian matching data?
> 
> Right now, when I import new transactions (either CSV or QFX), most of them no longer find a match. Only around 20-30% match. This is probably because I like to rejig my accounts from time to time as I continue to figure out what works best for me. Looking through the ".gnucash" file, I see lots of slot entries with account names that don't exist any more.
> 
> For a while, I was just putting up with it and assigning transactions to accounts by hand, but now I am starting to get tired of this.
> 
> A couple of options have occurred to me:
> 
> 1. Just delete everything between <act:slots> and </act:slots> for each account.  This is simple, if rather drastic. But do I do this again in a month when I make another small change to my account structure?
> 
> 2. Write a Python program that goes through the .gnucash file and deletes slot entries that point at accounts that don't exist any more. 
> 
> Comments?  Other thoughts?

The next major version of GnuCash (due around the end of next year) has a new dialog, contributed by Robert Fewell, for deleting old match data. It also changes the Bayesian matcher to use account GUIDs instead of names (as the plain-string matcher already does), which should make it a bit more resistant to reorganization.

That doesn't do anything for you now, of course. New training will eventually override old training, but if a lot of data has already been matched it could take a long time.

You can carefully edit out the match data from your file if you insist. Make a backup or two first and test carefully after your edit!
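
For anyone attempting the scripted variant of that edit (option 2 in the question), the pruning step might look roughly like the sketch below. It is only an illustration under several assumptions that are not confirmed in this thread: that the match data lives in an account slot keyed "import-map-bayes", that this slot holds one nested frame per token, and that each token frame holds one slot per full account name. The namespace URIs, the function names and the sample fragment are likewise illustrative, so check them against your own uncompressed file. The sketch only changes an in-memory tree; writing the result back out is deliberately left out, because re-serialising the whole file may alter namespace declarations.

```python
import xml.etree.ElementTree as ET

# Namespace URIs assumed for GnuCash XML; compare with your file's root element.
NS = {
    "gnc": "http://www.gnucash.org/XML/gnc",
    "act": "http://www.gnucash.org/XML/act",
    "slot": "http://www.gnucash.org/XML/slot",
}
BAYES_KEY = "import-map-bayes"  # assumed key of the Bayesian match-data slot


def slot_key(slot):
    """Return the text of a slot's <slot:key> child ('' if missing)."""
    return slot.findtext("slot:key", default="", namespaces=NS)


def child_slots(slot):
    """Return (frame, nested slots) for a slot whose value is a frame."""
    frame = slot.find("slot:value", NS)
    if frame is None:
        return None, []
    return frame, frame.findall("slot", NS)


def prune_stale_matches(root, valid_names):
    """Remove match entries whose account name is not in valid_names.

    Token frames left empty are removed as well.
    Returns the number of account-name entries removed.
    """
    removed = 0
    for account in root.iter("{%s}account" % NS["gnc"]):
        slots = account.find("act:slots", NS)
        if slots is None:
            continue
        for top in slots.findall("slot", NS):
            if slot_key(top) != BAYES_KEY:
                continue  # leave every other kind of slot untouched
            token_frame, tokens = child_slots(top)
            for token in tokens:
                name_frame, entries = child_slots(token)
                if name_frame is None:
                    continue
                for entry in entries:
                    if slot_key(entry) not in valid_names:
                        name_frame.remove(entry)
                        removed += 1
                if not name_frame.findall("slot", NS):
                    token_frame.remove(token)
    return removed


# Hypothetical fragment used only to exercise the function.
SAMPLE = """<gnc:account xmlns:gnc="http://www.gnucash.org/XML/gnc"
 xmlns:act="http://www.gnucash.org/XML/act"
 xmlns:slot="http://www.gnucash.org/XML/slot">
  <act:name>Checking</act:name>
  <act:slots>
    <slot>
      <slot:key>import-map-bayes</slot:key>
      <slot:value type="frame">
        <slot>
          <slot:key>GROCER</slot:key>
          <slot:value type="frame">
            <slot><slot:key>Expenses:Food</slot:key><slot:value type="integer">3</slot:value></slot>
            <slot><slot:key>Expenses:Old Food</slot:key><slot:value type="integer">5</slot:value></slot>
          </slot:value>
        </slot>
      </slot:value>
    </slot>
  </act:slots>
</gnc:account>"""

if __name__ == "__main__":
    tree_root = ET.fromstring(SAMPLE)
    count = prune_stale_matches(tree_root, {"Expenses:Food"})
    print("stale entries removed:", count)
```

The set of valid names has to be built separately from the accounts that still exist. As above, work on a copy of the file and check the result in GnuCash before relying on it.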

Regards,
John Ralls



