Transaction balancing

Greg Stark gsstark at mit.edu
Fri Jan 24 09:58:07 CST 2003


Derek Atkins <warlord at MIT.EDU> writes:

> I think it's a bit more subtle than that.  Each import mechanism has
> different 'strings' to match on.  QIF, for example, has a PAYEE, MEMO,
> DESCRIPTION, Category, and I think "Account".  OFX has different
> strings to search on.  HBCI is different from both the above.
> 
> I do not think there is a one-size-fits-all system.

You left out the amount field. Lots of my transactions come through with the
same textual description. The only way I have of determining what they were is
that they're always for the same amount.

I think it would be possible to throw a general purpose pattern matching
system at this. Even a simple system like the Bayesian filters people are
using to pre-sort their mail should work great.
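
To make the idea concrete, here is a minimal sketch of a mail-style naive
Bayes filter applied to transactions. It is only an illustration under stated
assumptions: the class name, the tokenization, the account names and the trick
of treating the amount as one more token are all hypothetical, and none of it
is existing GnuCash code.

```python
import math
from collections import Counter, defaultdict


def tokens(description, amount):
    """Lowercase words of the description, plus the amount as one extra token."""
    return description.lower().split() + [f"amount={amount:.2f}"]


class NaiveBayesMatcher:
    """Multinomial naive Bayes with add-one smoothing (hypothetical helper)."""

    def __init__(self):
        self.account_counts = Counter()           # transactions seen per account
        self.token_counts = defaultdict(Counter)  # token frequencies per account
        self.vocabulary = set()                   # every token seen in training

    def train(self, description, amount, account):
        """Record one transaction that the user has already assigned."""
        self.account_counts[account] += 1
        for tok in tokens(description, amount):
            self.token_counts[account][tok] += 1
            self.vocabulary.add(tok)

    def classify(self, description, amount):
        """Return the most probable account, or None if nothing was trained."""
        total = sum(self.account_counts.values())
        best_account, best_score = None, -math.inf
        for account, seen in self.account_counts.items():
            score = math.log(seen / total)  # log prior for this account
            denom = sum(self.token_counts[account].values()) + len(self.vocabulary)
            for tok in tokens(description, amount):
                # Add-one smoothing keeps unseen tokens from zeroing the score.
                score += math.log((self.token_counts[account][tok] + 1) / denom)
            if score > best_score:
                best_account, best_score = account, score
        return best_account


matcher = NaiveBayesMatcher()
matcher.train("POS DEBIT 1234", 9.99, "Expenses:Subscriptions")
matcher.train("POS DEBIT 1234", 45.00, "Expenses:Fuel")
guess = matcher.classify("POS DEBIT 1234", 45.00)
```

Because the amount is just another token, two transactions with identical
descriptions can still land in different accounts, which is the case
described above.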

Also, instead of a hard-coded heuristic like the 50% prefix of the
description field, there are a number of "approximate matching" algorithms for
text. Something like the agrep algorithm, which counts the number of character
deletions/insertions/substitutions necessary to transform the input into the
pattern, would provide a much finer-grained distinction.
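
As a rough sketch of that kind of measure, the following computes the plain
Levenshtein edit distance with dynamic programming. Note the assumptions: this
is not agrep's own implementation (agrep uses a faster bit-parallel method and
matches substrings), it compares two whole strings, and the function names
are made up for the example.

```python
def edit_distance(a, b):
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, free if equal
            ))
        previous = current
    return previous[-1]


def similarity(a, b):
    """Scale the distance to a score from 0.0 (maximally different) to 1.0 (identical)."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest
```

A matcher could then accept a candidate whenever similarity() exceeds some
threshold, instead of requiring that an exact prefix match.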

I do think allowing Scheme to override the automatic detection would be
useful, but for a different reason than you do. I think the automatic
filtering could be made >99% accurate, but all it can do is determine the
target account. A human could mark interesting portions of the text fields and
instruct GnuCash on how to make use of them. For example, my online bank
statement includes my check numbers in the text fields. I could write a
regexp that picks the number out of the text field and instruct GnuCash to
place that subexpression in the check# field.
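
For illustration only, such a rule might look like the sketch below. The
statement wording ("CHECK # 1042") is an assumed format, not one taken from
any real bank, and the function name is hypothetical. The example is in
Python rather than Scheme purely for brevity.

```python
import re

# Assumed format: the bank writes something like "CHECK # 1042" or "Check 1042".
CHECK_NUMBER = re.compile(r"\bcheck\s*#?\s*(\d+)", re.IGNORECASE)


def extract_check_number(text):
    """Return the check number found in a statement text field, or None."""
    match = CHECK_NUMBER.search(text)
    return match.group(1) if match else None
```

The captured group is the subexpression that would be copied into the check#
field; text that does not match leaves that field untouched.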

But I think implementing a good AI for automatic filtering is worth it,
because 99% of the users will never write a line of Scheme. Most
programmers probably aren't familiar with Scheme. Even I would rather not have
to write Scheme to do my filtering; it's just so nice when things work
automatically.

--
greg


