Fixing confused bayesian matching data?

Mon Jul 18 16:29:07 EDT 2016

> Message: 2
> Date: Sun, 17 Jul 2016 20:42:46 -0700
> From: John Ralls <jralls at ceridwen.us>
> To: Philip Matthews <philip_matthews at magma.ca>
> Cc: Gnucash User <gnucash-user at gnucash.org>
> Subject: Re: Fixing confused bayesian matching data?
> Message-ID: <D50E28C3-E557-4A73-AAB6-F79DBFB2A7B6 at ceridwen.us>
> Content-Type: text/plain; charset=us-ascii
> 
> 
> > On Jul 17, 2016, at 6:24 PM, Philip Matthews <philip_matthews at magma.ca> wrote:
> >
> > Just wondering if anyone has any advice on what to do with some very
> > confused Bayesian matching data?
> >
> > Right now, when I import new transactions (either CSV or QFX), they
> > mostly don't find a match anymore. Only around 20-30% match. This is
> > probably because I like to rejig my accounts from time to time as I
> > continue to figure out what works best for me. Looking through the
> > ".gnucash" file, I see lots of slot entries with account names that
> > don't exist any more.
> >
> > For a while, I was just putting up with it and assigning transactions to
> > accounts by hand, but now I am starting to get tired of this.
> >
> > A couple of options have occurred to me:
> >
> > 1. Just delete everything between <act:slots> and </act:slots> for each
> > account. This is simple, if rather drastic. But do I do this again in a
> > month when I make another small change to my account structure?
> >
> > 2. Write a Python program that goes through the .gnucash file and
> > deletes slot entries that point at accounts that don't exist any more.
> >
> > Comments?  Other thoughts?
> 
> The next major version of GnuCash (due around the end of next year) has a
> new dialog for deleting old match data, contributed by Robert Fewell. It
> also changes the Bayesian matcher to use account GUIDs instead of names
> (as the plain-string matcher already does) to make it a bit more
> resistant to reorganization.
> 
> That doesn't do anything for you now, of course. New training will
> eventually override old training but if there's a lot of data already
> matched it could take a long time.
> 
> You can carefully edit out the match data from your file if you insist.
> Make a backup or two first and test carefully after your edit!
> 
> Regards,
> John Ralls
> 

Hi Philip,

There is already a Perl script to do what you want, although I haven't used
it. See 'Bayes' in http://wiki.gnucash.org/wiki/Published_tools.
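
If you would rather go with your option 2, the pruning step could look
roughly like the sketch below. It is untested against a real file and rests
on several assumptions: that the Bayesian data lives in each account's slots
under a frame keyed "import-map-bayes", that each token under it holds slots
keyed by full account name, that the account separator is ":", and that the
namespace URIs match those declared at the top of your file. The function
names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace URIs assumed from a GnuCash XML file; compare with your file's root element.
NS = {
    "gnc": "http://www.gnucash.org/XML/gnc",
    "act": "http://www.gnucash.org/XML/act",
    "slot": "http://www.gnucash.org/XML/slot",
}
BAYES_KEY = "import-map-bayes"  # assumed key of the Bayesian match frame
SEPARATOR = ":"                 # assumed account separator (the default)


def full_account_names(root):
    """Return the set of full names (e.g. 'Expenses:Food') of all accounts."""
    accounts = {}
    for acct in root.iter("{%s}account" % NS["gnc"]):
        guid = acct.findtext("act:id", namespaces=NS)
        name = acct.findtext("act:name", namespaces=NS)
        parent = acct.findtext("act:parent", namespaces=NS)
        acct_type = acct.findtext("act:type", namespaces=NS)
        accounts[guid] = (name, parent, acct_type)

    def full_name(guid):
        # Walk up the parent chain, leaving the root account out of the name.
        parts = []
        while guid in accounts:
            name, parent, acct_type = accounts[guid]
            if acct_type != "ROOT":
                parts.append(name)
            guid = parent
        return SEPARATOR.join(reversed(parts))

    return {full_name(g) for g, (_, _, t) in accounts.items() if t != "ROOT"}


def prune_stale_bayes_slots(root, valid_names):
    """Remove Bayes entries whose key is not an existing account name.

    Returns the number of entries removed. Token frames left empty are kept.
    """
    removed = 0
    for acct in root.iter("{%s}account" % NS["gnc"]):
        for top in acct.findall("act:slots/slot", NS):
            if top.findtext("slot:key", namespaces=NS) != BAYES_KEY:
                continue
            for token in top.findall("slot:value/slot", NS):
                frame = token.find("slot:value", NS)
                if frame is None:
                    continue
                # findall returns a list, so removing while looping is safe.
                for entry in frame.findall("slot", NS):
                    key = entry.findtext("slot:key", namespaces=NS)
                    if key not in valid_names:
                        frame.remove(entry)
                        removed += 1
    return removed
```

You would parse an uncompressed backup copy with ET.parse(), call
full_account_names() and then prune_stale_bayes_slots() on the root element.
Writing the result back is not covered here: ElementTree may rewrite the
namespace prefixes on output, so check that GnuCash still opens the file
before trusting it.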

Regards,
Chris Good
