need help understanding import options

rs rs123 at rochester.rr.com
Thu Jun 4 20:49:33 EDT 2009


Hello,

I'm having trouble getting acceptable matching when using the QFX
importer. I've assumed it's better than the QIF importer, but maybe
not. I would appreciate suggestions. I have searched the list, all
the docs I can find, and Google for answers to the questions below,
but without much luck.

Given the hassle of importing transactions every month, I'd assumed
there would be elaborate discussions of the pros and cons of QIF vs.
QFX, with or without Bayesian matching, as well as detailed
discussions of how to optimize matching. I've occasionally seen RTFM
suggestions to other struggling folks, but I have yet to find the
needed detail in any official documentation, or for that matter,
anywhere at all.


I have enabled Bayesian matching on general principles, since I'm
assuming it is a more intelligent and configurable matcher, though I
do not know precisely what the Bayesian matcher or its alternative is
really doing. I have also read the minimal description of the options
in the documentation, i.e.:

"Use Bayesian matching - Use Bayesian algorithms to match new  
transactions with existing accounts.
Match display threshold - The minimal score a potential match must  
have to be displayed in the match list.
Auto-add threshold - A transaction whose best match's score is in the  
red zone (above display threshold, but below or equal to Auto-add  
threshold) will be added by default.
Auto-clear threshold - A transaction whose best match's score is in  
the green zone (above or equal to Auto-clear threshold) will be  
cleared by default."

I have been unable to find any more detailed description than this of  
what these options really do.   So...
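
To show how I am currently reading the quoted text, here is a small
Python sketch of the score zones. It is only my interpretation of the
wording above, not GnuCash code: the function name, the zone labels,
and the threshold values (1, 3, 6) are all made up for illustration,
and I do not know what happens to scores the quoted text does not
mention.

```python
def zone(score, display, auto_add, auto_clear):
    """Classify a best-match score using only the quoted descriptions.

    Assumes display < auto_add < auto_clear (my assumption).
    """
    if score >= auto_clear:
        # "green zone (above or equal to Auto-clear threshold)"
        return "green: cleared by default"
    if display < score <= auto_add:
        # "red zone (above display threshold, but below or equal to
        # Auto-add threshold)"
        return "red: added by default"
    # Everything else is not described in the text I quoted.
    return "not covered by the quoted text"


# Hypothetical threshold settings, chosen only for the example.
for s in (1, 2, 4, 6):
    print(s, zone(s, display=1, auto_add=3, auto_clear=6))
```

If this reading is right, these thresholds only concern matching
against existing transactions, which is part of what I am asking
below.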

Now, the questions.   If you're too busy to answer them all, please  
pick your favorite.  Any help would be appreciated.

1.  There are two potential matching problems: a) matching new
transactions in the downloaded file to already existing transactions,
and b) deciding which income/expense category a new transaction
belongs to. Which of these problems (or both?) is the QIF matcher
concerned with, and which is the generic matcher concerned with, with
or without Bayesian matching enabled? Which of them, or both, do the
Bayesian config options affect?

2.  Is there any consensus as to which matcher is best: QIF, QFX
without Bayesian matching, or QFX with Bayesian matching?

3.  What are these matchers keying on?
    a) Do they ignore the transaction ID numbers when assigning
income/expense categories (as they should)? For instance, when I look
at the QIF files, I see useful text as well as long transaction ID
numbers that are different for every transaction and thus useless
for matching.
    b) When I download a credit card QIF, it has categories (e.g.,
"restaurant") embedded in the file. Are these categories part of the
match? The QFX version for my credit card does not include this
category field, unfortunately.

4.  Could someone please explain the Bayesian options? The official
description [above] is not enough. For instance (again), is "Match
Display Threshold" for matching existing transactions, or for
matching a new transaction to an income/expense category? I've tried
cranking the "Match Display Threshold" all the way from 1 to 6 and
see no difference in the matching to expense categories. So what are
these controls doing?

5.  Has any serious consideration been given to allowing the user to
specify the rules of income/expense category assignment [e.g. anything
called "restaurant" should match to the "dining out" category]? In
most cases I would certainly prefer that to hoping the software can
figure it out. The automatic matching (of whatever type) could then be
confined to transactions without rules and to the problem of
identifying existing transactions.
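
To make question 5 (and 3a) concrete, here is a rough Python sketch
of the behaviour I have in mind: user rules are tried first, and only
transactions without a matching rule fall back to a learned guess.
None of this is GnuCash code. The names (RULES, NaiveBayes,
assign_account), the account names, and the sample descriptions are
hypothetical, and the naive Bayes fallback is just my guess at what
"Bayesian" means here.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical user rules: word in the description -> category.
RULES = {"restaurant": "Expenses:Dining Out"}


def tokens(description):
    """Lowercase words only; digit runs such as transaction IDs are skipped."""
    return re.findall(r"[a-z]+", description.lower())


class NaiveBayes:
    """Tiny multinomial naive Bayes with add-one smoothing."""

    def __init__(self):
        self.token_counts = defaultdict(Counter)  # category -> word counts
        self.category_counts = Counter()          # category -> examples seen
        self.vocabulary = set()

    def train(self, description, category):
        words = tokens(description)
        self.token_counts[category].update(words)
        self.category_counts[category] += 1
        self.vocabulary.update(words)

    def guess(self, description):
        if not self.category_counts:
            return None  # nothing learned yet
        total = sum(self.category_counts.values())
        best_category, best_score = None, -math.inf
        for category, seen in self.category_counts.items():
            counts = self.token_counts[category]
            # The extra +1 reserves some probability for unseen words.
            denom = sum(counts.values()) + len(self.vocabulary) + 1
            score = math.log(seen / total)
            for word in tokens(description):
                score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_category, best_score = category, score
        return best_category


def assign_account(description, classifier):
    """Apply user rules first; fall back to the learned guess."""
    words = tokens(description)
    for keyword, category in RULES.items():
        if keyword in words:
            return category
    return classifier.guess(description)


nb = NaiveBayes()
nb.train("SHELL OIL 5744201", "Expenses:Auto:Gas")
nb.train("SAFEWAY STORE 88123", "Expenses:Groceries")

print(assign_account("JOES RESTAURANT 0099123", nb))  # decided by the rule
print(assign_account("SHELL OIL 9911223", nb))        # decided by the fallback
```

Note that the ID numbers never reach either step, because tokens()
keeps only alphabetic words. Whether the real importers do anything
similar is exactly what I am asking.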

Any help would be appreciated. If someone makes the effort to answer
some of these questions well, it would really help if the answers
found their way into the documentation. NOTE: if the best
documentation is the code, that's a problem. But if the code
documentation is readable (and doesn't require deep knowledge of C to
read it), that might be an adequate answer for now. Where would I
find the appropriate code for the matchers?


thanks much.

- Rick




