discussion generic import design

Benoit Grégoire bock@step.polymtl.ca
Mon, 21 Oct 2002 17:34:18 -0400


> Three issues with the current state of implementation:
>
> The second step is already implemented in the generic importer almost
> exactly in the way I hoped for :-), except that you prefer to offer four
> different choices of what to do with each transaction, where I would
> definitely prefer to have only two choices (only a binary decision for the
> user). (If we can't get to an agreement here, we can still implement both
> possibilities within the same generic framework, so that's not too big of a
> deal)

I agree, and this is mostly a technical problem.  (The problem, for those
interested, being that in both gtk1.2 and gtk2.0, the support for embedding a
widget in a cell of a GtkCList is unfinished).  We can however generate an
event when we click on the cell, and we can show a pixmap in the cell.  IMHO,
to make everyone happy, here is what has to be done:

1-(Not directly related to this particular issue, but a prerequisite):
Generate a preference page, in which we can check which options we want
enabled (Currently the options are ADD, REPLACE, IGNORE, RECONCILE).  That
page should also contain:
-ADD and RECONCILE match threshold (two integers)
-Whether by default transactions should be ADDed or RECONCILEd with c (cleared)
or y (reconciled).  Someone who reconciles with a paper statement at the end
of the month probably wants c, while someone who uses the matcher to add all
his transactions frequently probably wants y.  We could also make ADD and
RECONCILE independent.
2-Create a callback for the cell containing the action.  When clicked, the 
cell changes to the next available action.  If only two actions are 
available, the choice will obviously become binary.
3-The cell containing the action is text, so it isn't obvious at all that you 
can take an action there.  A pixmap representing some sort of button must be 
created for each action.  The design should ideally make it clear that this 
button iterates over a list of choices (which we unfortunately can't 
display).  Basically this is a spin button that spins only one way.
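
To illustrate point 2, here is a minimal sketch of the "spin one way"
logic behind the click callback.  The enum, the bit mask and the function
name import_action_next are hypothetical names made up for this example
(nothing like them exists in the importer yet), and the GTK event hookup
is left out entirely:

```c
#include <assert.h>

/* Hypothetical action codes, named after the four options listed above. */
typedef enum {
    IMPORT_ACTION_ADD = 0,
    IMPORT_ACTION_REPLACE,
    IMPORT_ACTION_IGNORE,
    IMPORT_ACTION_RECONCILE,
    IMPORT_ACTION_COUNT
} ImportAction;

/* One bit per action; a set bit means the action is enabled in the
   (assumed) preference page. */
#define ACTION_BIT(a) (1u << (a))

/* Return the next enabled action after `current`, wrapping around.
   If `current` is the only enabled action, it is returned unchanged. */
ImportAction
import_action_next(ImportAction current, unsigned enabled_mask)
{
    int i;

    /* At least one action must be enabled. */
    assert(enabled_mask & (ACTION_BIT(IMPORT_ACTION_COUNT) - 1u));

    for (i = 1; i <= IMPORT_ACTION_COUNT; i++) {
        ImportAction candidate =
            (ImportAction)((current + i) % IMPORT_ACTION_COUNT);
        if (enabled_mask & ACTION_BIT(candidate))
            return candidate;
    }
    return current; /* not reached while the mask is non-empty */
}
```

With only ADD and RECONCILE enabled in the mask, repeated clicks simply
toggle between the two, which gives the binary choice.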

> The first step is not yet implemented in the generic importer. You said you
> prefer to implement that choice list to be in the same window, but I would
> definitely prefer to have a separate druid page for that. Either way, we
> both agree that the user should be presented with some choice list from
> where the "other" account can be chosen, and that this choice should be
> remembered for the next import. The automatic matching will be done based
> on some strings of the transaction, where either we agree on a convention
> on which strings of the transaction should be used for the matching, or the
> protocol-code passes another string to the generic importer which is solely
> used for the automated transaction matching (and not displayed anywhere).

First the technical part.  I think passing a separate string would be a good 
idea, since splits only have one usable text field (the memo).  Functions 
like 
Account * gnc_import_find_dest_account(char * match_text) and
void gnc_import_set_dest_account(char * match_text, Account * dest_account) 
should be created.  They could be made accessible at a global level. 

I don't know exactly how to implement the lookup table, but I think it only
needs two values, the string, and the account.  If gnc_import_set_dest_account
is called with a string that is already in the table, the account should be 
overwritten so the match is always the most recent one (unlike Quicken which 
is really stupid about this).  Such a table would be stored in the book's 
KVP.

A probably better alternative would be to store it in the source account's
KVP.  The functions would become
Account * gnc_import_find_dest_account(Account * source_account,
                                       char * match_text) and
void gnc_import_set_dest_account(Account * source_account, char * match_text,
                                 Account * dest_account)
The advantage is that there can be a different match for the same string for
each account, and the match is more focussed (It is likely you don't want
your credit card's matches for your savings account).  On the other hand, gas
bought from your credit card or your checking account should probably go to the
same destination, and you would have to set that twice.
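
As a rough sketch of the semantics I have in mind for the per-account
variant (most recent match wins, matches are kept separate per source
account), something like the following would do.  Assumptions: Account
is only a stand-in struct here, the table is a plain in-memory linked
list instead of KVP storage, MatchEntry and find_entry are made-up
helpers, and I used const char * for the text arguments:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the real GnuCash Account type, for this sketch only. */
typedef struct Account { const char *name; } Account;

/* One remembered match: (source account, match text) -> destination. */
typedef struct MatchEntry {
    Account *source_account;
    char *match_text;
    Account *dest_account;
    struct MatchEntry *next;
} MatchEntry;

/* In-memory table; the real thing would live in the account's KVP. */
static MatchEntry *match_table = NULL;

static MatchEntry *
find_entry(Account *source_account, const char *match_text)
{
    MatchEntry *e;
    for (e = match_table; e != NULL; e = e->next)
        if (e->source_account == source_account &&
            strcmp(e->match_text, match_text) == 0)
            return e;
    return NULL;
}

/* Returns the remembered destination account, or NULL if none. */
Account *
gnc_import_find_dest_account(Account *source_account, const char *match_text)
{
    MatchEntry *e;
    assert(match_text != NULL);
    e = find_entry(source_account, match_text);
    return e ? e->dest_account : NULL;
}

/* Remembers a match; an existing entry is overwritten so that the
   most recent choice always wins. */
void
gnc_import_set_dest_account(Account *source_account, const char *match_text,
                            Account *dest_account)
{
    MatchEntry *e;
    size_t len;

    assert(match_text != NULL);
    e = find_entry(source_account, match_text);
    if (e != NULL) {
        e->dest_account = dest_account;
        return;
    }
    e = malloc(sizeof *e);
    if (e == NULL)
        return;
    len = strlen(match_text) + 1;
    e->match_text = malloc(len);
    if (e->match_text == NULL) {
        free(e);
        return;
    }
    memcpy(e->match_text, match_text, len);
    e->source_account = source_account;
    e->dest_account = dest_account;
    e->next = match_table;
    match_table = e;
}
```

Setting the same string twice for the credit card account replaces the
earlier destination, while a lookup of that string from the savings
account still returns NULL until it is set there as well.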

Now about the GUI issue, remember there are two possibilities when importing 
transactions:

The transaction is not balanced (only one split):  most online transactions, 
ofx, qif and hbci.
The transaction is balanced (two or more splits):  Ofx investment
transactions, money transfer between accounts, importing a qif file exported
by Quicken, others?

There are actually three types of matches for the destination account:
1) unambiguous match, no user interaction needed (after initial setup):  ex:
ofx account transfers, investment transactions contain a unique id
associated with the account, in the same format as the source account.  This
might also be the case of qif imports from other software (if account name
provided is unique)
2) unambiguous match, with user's help:  ex: qif categories can be mapped to a
gnucash account, and the match (normally) doesn't change over time.
3) ambiguous match:  Most transactions imported online.  The string to match
can have variable sources, probably with some order of priority, but always
represents the payee in one way or another.  Furthermore, these matches
change over time, or even daily (for example, if I buy a TV at Wal-Mart one
day, and a box of cereal the next, these two transactions obviously do not
go into the same category).

I am convinced 1 and 2 belong in the protocol specific module (possibly using 
the generic import facility for source or destination account matching).

3) should be changeable on the fly, since the matches can (and most likely
eventually WILL) have to be different within the same import group.  This is
why I think it should be in the same window as the transactions you are
importing.

The rest can and in some cases should be handled in a druid-like fashion by
the protocol specific module.  To do that we must make the Transaction matcher
parentable to another window.  To do that, I will make init and destroy
functions for it, which will return a handle and take a parent window as
parameter.
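
Roughly, the interface would have the following shape.  This is only a
sketch: TransMatcher, trans_matcher_init, trans_matcher_destroy and
trans_matcher_parent are placeholder names, and Window stands in for
whatever widget type the real code would take as parent:

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for a toolkit window; placeholder for this sketch only. */
typedef struct Window { const char *title; } Window;

/* The handle returned to the protocol specific module. */
typedef struct TransMatcher {
    Window *parent; /* NULL means the matcher is a top-level window */
} TransMatcher;

/* Create a matcher parented to `parent` (may be NULL).
   Returns NULL if memory allocation fails. */
TransMatcher *
trans_matcher_init(Window *parent)
{
    TransMatcher *matcher = calloc(1, sizeof *matcher);
    if (matcher != NULL)
        matcher->parent = parent;
    return matcher;
}

/* The window this matcher was parented to, or NULL. */
Window *
trans_matcher_parent(const TransMatcher *matcher)
{
    assert(matcher != NULL);
    return matcher->parent;
}

/* Release the handle; the parent window is not touched. */
void
trans_matcher_destroy(TransMatcher *matcher)
{
    free(matcher);
}
```

A druid page in a protocol module would call the init function with its
own window and keep the handle until the page is torn down.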

> There's one more issue with the generic importer, namely, the imported
> transactions are already created in the final accounts currently and they
> are already visible in that account's registers. I would rather like to

I've been discussing that with Derek on IRC.  I still need information before
I agree to that.  I agree seeing them in the register below is a nuisance.
It is a minor problem for OFX, but I think in your case you have the register
open for the account being imported, so it must be a real nuisance.
-There might be a workaround for the register display problem.
-I was under the impression that by definition an engine with an
open/edit/rollback_or_commit could (and should) be used for that purpose.
-I need a workable alternative, I still don't have one.

> avoid that. Instead, I would suggest that the generic importer should
> define its own "transaction" data type, which might be something like
>
> struct ImportTrans {
>   /* gnucash transaction with everything filled and with one split, but
>      which was not yet inserted into the account */
>   Transaction *trans;

You can't: according to Transaction.h, a split MUST be parented by an account
before you use any of the xaccSplitSetAmount() and similar functions on it.
And since an imported transaction could have any number of splits, the splits
must be set.

>   /* gnucash account where the transaction's split is to be inserted
>      once this transaction should be imported */
>   Account *src_account;
>   /* A string for which the matching should be remembered, so that it
>      can be used for automated matching of the other account */
>   char *payee_memo;
> };
>
> so that the transaction is *not* added to the actual account unless the
> generic importer eventually has decided that this should indeed be done.

I am willing, but I need a solution to the splits list problem.  And remember 
that the commodity list must be available to the importer when importing 
stock accounts (this may or may not be a problem, depending on the solution 
found).  

> That's about what I want to achieve. Some of this might not be possible
> without a major design overhaul. Some of this is already present in the
> current qif-import module. In fact, all of this is already implemented in
> the qif-importer as part of its file import process, but one would need to
> extract the above parts and write a generic interface. Either way, to get a
> final generic importer with the properties described above, I started
> wondering whether eventually it might be less work to start off anew from
> the qif-importer, or to modify the existing generic importer in those ways.

I very much doubt it would be faster to extract anything from the old qif
importer.  Others have warned me off.  If it does come to the point we
can't resolve this without branching, and most of the current matcher suits
you, you could start with it.  I can't honestly say I would be happy, but
that is what the GPL is for ;)  And it would be much faster for you, embedding
it in a druid is just a matter of changing the name of a few callbacks and a
few lines of code in them.

> Since I recently finished all important features of the HBCI module, and
> polishing the transaction import dialog, I just thought about that a bit. I
> will probably have some time for coding this weekend, so I definitely hope
> we can find a consensus on how to proceed with the generic importer.
> Programming with the same concept in mind is just so much more fun :-))

Indeed, I am waiting for your comments.  Hopefully we are going somewhere.

> PS: just to get the measures right (about the amount of code that I have
> written and that I would be ready to write for the importer) why not do a
> quick line count in the hbci directory, not to mention the openhbci
> library...

I don't know exactly what you meant by that, and perhaps it is better that I
don't.  You outnumber me two to one, in lines of code.  So what?  We both
have egos, but please, let's not turn this into a my penis is bigger than
yours kind of thing.  We obviously both already spent more time on OFX, HBCI
and import than we can ever hope to save in our lifetime by using the
software ;)

-- 
Benoit Grégoire
http://step.polymtl.ca/~bock/