need help understanding import options

Derek Atkins warlord at MIT.EDU
Fri Jun 5 08:45:32 EDT 2009


Hi,

rs <rs123 at rochester.rr.com> writes:

> 1.  There are two potential match issues:  a) matching new
> transactions in the downloaded file to already existing transactions,
> and b) deciding what income/expense category a new transaction belongs
> to.   Which of these matching problems (or both?) is the QIF or
> generic (bayesian (or not)) matcher concerned with?  Which, or both,
> are the bayesian config options concerned with?

The Bayesian matching is concerned only with account mapping, i.e. your
case (b).

The duplicate matching, your case (a), is done later in the process and
uses the account, amount, and date (and the FITID in OFX).
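
To make the duplicate check a bit more concrete, here is a rough Python
sketch of comparing an incoming transaction against an existing one on
those fields.  This is NOT the importer's actual logic: the Txn class,
the looks_like_duplicate() name, and the max_days_apart window are all
made up for illustration.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import Optional


@dataclass
class Txn:
    """Hypothetical, minimal view of a transaction for matching purposes."""
    account: str
    amount: Decimal
    posted: date
    fitid: Optional[str] = None  # only OFX downloads carry a FITID


def looks_like_duplicate(existing: Txn, incoming: Txn,
                         max_days_apart: int = 3) -> bool:
    """Guess whether `incoming` is already in the books as `existing`.

    Assumption: the date tolerance is an illustrative choice, not a
    value taken from GnuCash.
    """
    if existing.account != incoming.account:
        return False
    # When both sides have a FITID, let it decide.
    if existing.fitid and incoming.fitid:
        return existing.fitid == incoming.fitid
    if existing.amount != incoming.amount:
        return False
    return abs((existing.posted - incoming.posted).days) <= max_days_apart
```

The point is only that the payee text plays no part in this step; the
payee/memo matters for the account mapping described next.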

In QIF there is a mapping from Payee/Memo to GnuCash account for
transactions that don't have a Category or QIF Account attached to
them.  The importer remembers your mappings and re-applies them
on future imports; however, the matching against previous imports is
done on the FULL TEXT of the Payee or Memo.
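
As a rough illustration of what full-text matching implies, here is a
small Python sketch.  The saved_mappings dict and the remember()/lookup()
helpers are hypothetical names, not anything in the GnuCash source; the
only point is that the remembered key is the whole string.

```python
# Hypothetical store of remembered mappings: full Payee/Memo text -> account.
saved_mappings = {}


def remember(payee_or_memo, account):
    """Record the account the user chose for this exact text."""
    saved_mappings[payee_or_memo] = account


def lookup(payee_or_memo):
    """Return the remembered account, or None if the text is not identical."""
    return saved_mappings.get(payee_or_memo)


remember("WHOLEFOODS #1523 20090523", "Expenses:Groceries")

# The identical string is found again on a later import...
same = lookup("WHOLEFOODS #1523 20090523")
# ...but a payee that differs only in its trailing tag is not.
different = lookup("WHOLEFOODS #1523 20090530")
```

Here `same` is the groceries account and `different` is None, so a
payee that changes slightly on every statement has to be mapped by hand
each time.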

OFX without Bayesian matching does effectively the same thing.  However,
if you turn ON Bayesian matching, then instead of using the full string
the importer breaks it up into separate tokens and matches based on
those tokens.  When you manually map a transaction to an account,
GnuCash increases the value of the mapping from each token to that
account.  On future imports it computes the likeliest mapping from the
values stored for each token and, if the match % is high enough,
suggests the same target account.  This works much better in cases
where you have consistent partial payee info with a variable tag, e.g.:

  WHOLEFOODS #1523 20090523
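
To show the idea of per-token counts, here is a deliberately simplified
Python sketch.  It is not the real Bayesian computation and not the
GnuCash code: it just lets each known token vote for accounts in
proportion to its stored counts.  The TokenMatcher class, its method
names, whitespace tokenizing, and the 0.7 threshold are all assumptions
made for the example.

```python
from collections import defaultdict


class TokenMatcher:
    """Toy token-based account suggester (illustrative only)."""

    def __init__(self, threshold=0.7):
        # token -> account -> number of times the user confirmed that pairing
        self.counts = defaultdict(lambda: defaultdict(int))
        self.threshold = threshold  # assumed cutoff, not a GnuCash value

    def learn(self, description, account):
        """Called when the user manually maps a transaction to an account."""
        for token in description.split():
            self.counts[token][account] += 1

    def suggest(self, description):
        """Return the likeliest account, or None if no account scores high enough."""
        scores = defaultdict(float)
        for token in description.split():
            per_account = self.counts.get(token)
            if not per_account:
                continue  # unseen token (e.g. a new date tag) carries no weight
            total = sum(per_account.values())
            for account, n in per_account.items():
                scores[account] += n / total
        if not scores:
            return None
        best = max(scores, key=scores.get)
        share = scores[best] / sum(scores.values())
        return best if share >= self.threshold else None


matcher = TokenMatcher()
matcher.learn("WHOLEFOODS #1523 20090523", "Expenses:Groceries")
matcher.learn("SHELL OIL 20090524", "Expenses:Auto:Gas")
suggestion = matcher.suggest("WHOLEFOODS #1523 20090530")
```

In this run `suggestion` comes out as the groceries account even though
the trailing date tag has never been seen before, which is exactly the
case where full-string matching gives up.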

> 2.  Is there any consensus as to which matcher is best:  QIF, QFX
> without bayesian, or QFX with bayesian?

It depends on what you're trying to do.  I'd ignore OFX without Bayesian
matching.  In fact I think in 2.4.x we should turn on Bayesian matching
by default.

> 3.  What are these matchers keying on?

Depends what you're talking about, but generally the payee and/or memo.

>    a) Do they ignore the transaction ID numbers for assigning income/
> expense categories (as they should )?  For instance, when i look at
> the QIF files, for example, i see (useful) text as well as long
> transaction ID numbers that are different for every transaction and
> thus useless.

Yes.

>    b) When I download a credit card QIF, it has categories (e.g.,
> "restaurant") imbedded in the file.  Are these categories part of the
> match?  the QFX version for my credit card does not include this
> category field, unfortunately.

The QIF Importer lets you map the QIF Category to a GnuCash account.
Then later it uses that mapping as part of the duplicate checking.

> 5.  Has there been any serious consideration to allowing the user to
> specify the rules of income/expense category assignment [e.g. anything
> called "restaurant" should match to the "dining out" category]?  I
> would certainly prefer that, in most cases, to hoping the software can
> figure it out -- and confine the automatic matching (of whatever type)
> to transactions without rules and to the problem of identifying
> existing transactions.

Sure, send in a patch.

> Any help would be appreciated.  If someone makes the effort to answer
> some of these questions well, it would really help if the answers
> found their way to the documentation.  NOTE:   if the best
> documentation is the code -->  that's a problem.  But if the code
> documentation is readable (and doesn't require deep knowledge of C to
> read it), that might be an adequate answer for now.   where would I
> find the appropriate code for the matchers?

src/import-export

> thanks much.
>
> - Rick

-derek

-- 
       Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
       Member, MIT Student Information Processing Board  (SIPB)
       URL: http://web.mit.edu/warlord/    PP-ASEL-IA     N1NWH
       warlord at MIT.EDU                        PGP key available

