OFX Bayesian import not working for me

John Ralls jralls at ceridwen.us
Sat Nov 28 15:38:24 EST 2015


> On Nov 28, 2015, at 11:03 AM, Eliot Rosenbloom <eliot at ejr.me> wrote:
> 
> Thanks John!
> 
> 1.  I hate to be dumb, but where do I find the file with <slot> tags?  I see only 3 types of files:  .log (with minimal info in them);  and .gnucash with and without a date (both are "garbage" when opened with TextEditor).
> 
> 2.  "look at slots for the other tokens to see that they all make sense." -- I'm a bit fuzzy on the relation (below) between the two entries for "Medicare" [I would delete the one with":" , I assume]  AND the one for "Health Insurance," and why you summed across all 3?  And can you say a bit more about what "making sense" means?  What should I be looking to find or avoid?
> 
> 3.  To post this (obviously w/o my personal files I sent you), do I remember that there is an email address I can forward it to ... probably preceded by a brief statement of the problem?
> 
> Thanks!

Eliot,

Please remember to copy the list on all replies. “Reply all” works well.

The file with the <slot> tags is the .gnucash file without a date. It’s compressed with gzip, which is why it looks like garbage in a text editor. You can uncompress it on the command line with gunzip, or you can unselect “Compress Files” in Preferences>General and the next save will be uncompressed.
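If the command line is unfamiliar, the same thing can be done with a few lines of Python. This is only a minimal sketch: the function names and the file names in the usage note are made up for illustration, and it writes an uncompressed copy rather than touching the original file.

```python
import gzip
import shutil


def is_gzip_compressed(path):
    """Return True if the file starts with the two-byte gzip magic number."""
    with open(path, "rb") as f:
        return f.read(2) == b"\x1f\x8b"


def decompress_gnucash(src_path, dst_path):
    """Write an uncompressed copy of a gzip-compressed file to dst_path.

    The source file is only read, never modified.
    """
    with gzip.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
```

For example, decompress_gnucash("mybooks.gnucash", "mybooks-plain.xml") would leave a readable XML copy in mybooks-plain.xml (both names hypothetical) that you can open in a text editor and search for the <slot> tags.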

I summed across all three in the first MEDICARE example because I had deleted two of them, on the unstated assumption that only one was correct. As I said, that was just an example, and a real case would be more complicated. I thought I’d clarified that later by explaining how Bayesian matching works: it tokenizes descriptions, scores each token-account pair, and then sums the scores across the tokens to select the matching account.

So to “make sense” of a set of token scores you need to run that process yourself for the tokens you intend to change: from a set of import files, find the descriptions containing each token you’re contemplating changing, find the other tokens in those descriptions, look at the token-account scores for each, and work out which account the matcher will select in each case. If it appears that the matcher will do the right thing, remove only the token-account scores for the “:”-delimited accounts. You probably don’t need to change the remaining scores, because the scores for the “:”-delimited accounts were all created together and you’ve decided that the other scores provide the right answer. Deleting the “:” scores is still helpful because the matcher won’t have to look at them any more, and that will speed it up. If the matcher is guessing wrong, then working out the match process by hand will show you why, and you can remove or adjust token-account scores as necessary. If all of that seems like too much work, you can just delete the whole import-match-bayes slot and start over generating new matches.
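To make the by-hand check concrete, here is a rough Python sketch of the process as I described it: split the description into tokens, look up each token's score per account, sum per account, and take the highest total. It is a simplification for working things out on paper, not GnuCash's actual code, and the tokens, account names, and scores are invented for illustration.

```python
from collections import defaultdict


def tokenize(description):
    """Split a transaction description into upper-case, whitespace-separated tokens."""
    return description.upper().split()


def pick_account(description, scores):
    """Sum token-account scores over the description's tokens.

    scores maps token -> {account name: score}. Returns a tuple of
    (best account or None, dict of per-account totals).
    """
    totals = defaultdict(float)
    for token in tokenize(description):
        for account, score in scores.get(token, {}).items():
            totals[account] += score
    if not totals:
        return None, {}
    best = max(totals, key=totals.get)
    return best, dict(totals)


# Hypothetical scores, as you might copy them out of the slots by hand.
example_scores = {
    "MEDICARE": {"Expenses:Medicare": 4, "Expenses:Health Insurance": 1},
    "PREMIUM": {"Expenses:Health Insurance": 2},
}

best, totals = pick_account("Medicare premium", example_scores)
print(best, totals)
```

With these made-up numbers the totals are 4 for Expenses:Medicare and 3 for Expenses:Health Insurance, so Expenses:Medicare is selected. Deleting or changing one token's scores in example_scores and running it again shows how the selection would shift.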

If I understand your question about posting, just “reply all”. The list is in the CC field of this message and “reply all” will ensure that it’s in your reply as well.

Regards,
John Ralls



