need help understanding import options
rs
rs123 at rochester.rr.com
Sun Jun 7 19:03:42 EDT 2009
Thanks, Derek. Definitely helpful. I see there are occasional .txt
files in the src directory which may help a bit as well.
As for sending in a patch -- you wouldn't find that useful, as I
haven't written C in 22 years. If, on the other hand, there's a
place to discuss _possible_ requirements, rather than coding details,
I could possibly take a high-level shot at it.
On Jun 5, 2009, at 8:45 AM, Derek Atkins wrote:
> Hi,
>
> rs <rs123 at rochester.rr.com> writes:
>
>> 1. There are two potential match issues: a) matching new
>> transactions in the downloaded file to already existing transactions,
>> and b) deciding what income/expense category a new transaction
>> belongs to. Which of these matching problems (or both?) is the QIF or
>> generic (bayesian (or not)) matcher concerned with? Which, or both,
>> are the bayesian config options concerned with?
>
> The Bayesian matching is only concerned with account mapping.
>
> The duplicate matching is done later in the process and uses
> the account, amount, and date (and the FITID in OFX).
>
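A rough Python sketch of the duplicate check described above, keyed on
account, amount, date, and (for OFX) the FITID. This is not GnuCash code:
the Txn type, the is_probable_duplicate name, and the 3-day date window
are assumptions made up for illustration only.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class Txn:
    """Hypothetical, simplified view of an imported or existing transaction."""
    account: str
    amount: Decimal
    posted: date
    fitid: Optional[str] = None  # only OFX downloads carry a FITID


def is_probable_duplicate(new: Txn, existing: Txn, date_window_days: int = 3) -> bool:
    """Return True if `new` looks like the same transaction as `existing`."""
    # Different accounts can never be duplicates of each other.
    if new.account != existing.account:
        return False
    # When both sides carry a FITID, let it decide on its own.
    if new.fitid and existing.fitid:
        return new.fitid == existing.fitid
    # Otherwise fall back to an exact amount and a nearby date.
    if new.amount != existing.amount:
        return False
    return abs(new.posted - existing.posted) <= timedelta(days=date_window_days)
```

The date window is there because the posting date in a download often
differs by a day or two from the date entered by hand.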
> In QIF there is a mapping from Payee/Memo to GnuCash account for
> transactions that don't have a Category or QIF Account attached to
> them. The importer remembers your mappings and re-applies them
> on future imports, however the matching to previous imports is done
> on the FULL TEXT of the Payee or Memo.
>
> OFX w/o Bayesian matching does effectively the same thing. However,
> if you turn ON Bayesian matching then instead of using the full string
> the importer breaks it up into tokens and performs matching
> based on the filtering of the tokens. When you manually map a
> transaction to an account, GnuCash increases the values of the mappings
> for each token to that account. On future imports it runs an
> algorithm that computes the likeliest mapping based on the various
> values for each token and, if the match % is high enough, suggests the
> same target account. This works much better in cases where you have
> consistent partial payee info with a variable tag, e.g.:
>
> WHOLEFOODS #1523 20090523
>
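To illustrate the token-based idea above, here is a minimal Python sketch
of a naive Bayes classifier over whitespace-separated tokens, with add-one
smoothing. It is only a sketch under assumptions, not the algorithm GnuCash
actually ships: the TokenMatcher class, its method names, the tokenizer,
and the 0.7 threshold are all hypothetical.

```python
import math
from collections import defaultdict


class TokenMatcher:
    """Hypothetical naive Bayes mapper from payee/memo text to an account."""

    def __init__(self, threshold=0.7):
        self.token_counts = defaultdict(lambda: defaultdict(int))  # account -> token -> count
        self.account_counts = defaultdict(int)  # account -> transactions learned
        self.vocabulary = set()
        self.threshold = threshold  # minimum posterior needed to suggest an account

    @staticmethod
    def tokenize(description):
        # Assumed tokenizer: upper-case, split on whitespace.
        return description.upper().split()

    def learn(self, description, account):
        """Record a manual mapping: bump every token's count for `account`."""
        self.account_counts[account] += 1
        for token in self.tokenize(description):
            self.token_counts[account][token] += 1
            self.vocabulary.add(token)

    def suggest(self, description):
        """Return the likeliest account, or None if the match is too weak."""
        if not self.account_counts or not self.vocabulary:
            return None
        tokens = self.tokenize(description)
        total_txns = sum(self.account_counts.values())
        vocab = len(self.vocabulary)
        log_scores = {}
        for account, n in self.account_counts.items():
            counts = self.token_counts[account]
            total_tokens = sum(counts.values())
            score = math.log(n / total_txns)  # prior
            for token in tokens:
                # Add-one smoothing so unseen tokens do not zero out the score.
                score += math.log((counts.get(token, 0) + 1) / (total_tokens + vocab))
            log_scores[account] = score
        best = max(log_scores, key=log_scores.get)
        top = log_scores[best]
        # Normalise in log space to get the posterior of the best account.
        posterior = 1.0 / sum(math.exp(s - top) for s in log_scores.values())
        return best if posterior >= self.threshold else None


matcher = TokenMatcher()
matcher.learn("WHOLEFOODS #1523 20090523", "Expenses:Groceries")
matcher.learn("WHOLEFOODS #1523 20090530", "Expenses:Groceries")
matcher.learn("SHELL OIL 5512 20090524", "Expenses:Auto:Gas")
matcher.learn("SHELL OIL 5512 20090601", "Expenses:Auto:Gas")
suggestion = matcher.suggest("WHOLEFOODS #1523 20090606")
```

A full-text lookup would miss "WHOLEFOODS #1523 20090606" because the date
tag has never been seen before, whereas the two stable tokens are enough
for the token-based score to point at the groceries account.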
>> 2. Is there any consensus as to which matcher is best: QIF, QFX
>> without bayesian, or QFX with bayesian?
>
> Depends what you're trying to do. I'd ignore OFX w/o Bayesian
> matching. In fact I think in 2.4.x we should turn on Bayesian
> matching by default.
>
>> 3. What are these matchers keying on?
>
> Depends what you're talking about, but generally the payee and/or
> memo.
>
>> a) Do they ignore the transaction ID numbers for assigning income/
>> expense categories (as they should)? For instance, when I look at
>> the QIF files, I see (useful) text as well as long
>> transaction ID numbers that are different for every transaction and
>> thus useless.
>
> Yes.
>
>> b) When I download a credit card QIF, it has categories (e.g.,
>> "restaurant") embedded in the file. Are these categories part of the
>> match? The QFX version for my credit card does not include this
>> category field, unfortunately.
>
> The QIF Importer lets you map the QIF Category to a GnuCash account.
> Then later it uses that mapping as part of the duplicate checking.
>
>> 5. Has there been any serious consideration to allowing the user to
>> specify the rules of income/expense category assignment [e.g.
>> anything called "restaurant" should match to the "dining out"
>> category]? I would certainly prefer that, in most cases, to hoping
>> the software can figure it out -- and confine the automatic matching
>> (of whatever type) to transactions without rules and to the problem
>> of identifying existing transactions.
>
> Sure, send in a patch.
>
>> Any help would be appreciated. If someone makes the effort to answer
>> some of these questions well, it would really help if the answers
>> found their way to the documentation. NOTE: if the best
>> documentation is the code --> that's a problem. But if the code
>> documentation is readable (and doesn't require deep knowledge of C to
>> read it), that might be an adequate answer for now. Where would I
>> find the appropriate code for the matchers?
>
> src/import-export
>
>> thanks much.
>>
>> - Rick
>
> -derek
>
> --
> Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
> Member, MIT Student Information Processing Board (SIPB)
> URL: http://web.mit.edu/warlord/ PP-ASEL-IA N1NWH
> warlord at MIT.EDU PGP key available