[GNC-dev] avoid the brain dead import

Thu Aug 30 10:40:16 EDT 2018

Dear Wm,

On Thu, August 30, 2018 10:10 am, Wm via gnucash-devel wrote:
> On 29/08/2018 23:52, David Cousens wrote:
>
>> I think the decision about whether to import a small number of
>> transactions by hand is really one for the user and not the importer to
>> make. I would import small batches, maybe 20-30  to test the importer
>> function and ensure it was working as expected before attempting to
>> import 10k.
>
> You are missing the point entirely.
>
> The importer compares the tx being imported against *every* extant tx.

The importer compares against existing transactions to detect duplicates. 
This is done because there is absolutely no guarantee that the user won't
import the same transaction multiple times.  This can happen by accident
(importing the same file multiple times), or it could happen because the
data source provides the same data multiple times (e.g., some banks will
provide overlapping downloads).

It has to search every existing transaction because the underlying code
offers no way to avoid that.  Theoretically you should only need to
search through transactions within a relatively short time frame (say, +/-
2-3 weeks).  However, there is currently no way to do this.  Even when you
create a QofSearch with a limited date range, it will *still* iterate
through every existing transaction.  Of course, if the date is not in
range the transaction will get thrown out.  However, by that point the
damage has been done.
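
To make the cost concrete, here is a minimal Python sketch (not GnuCash
code).  The transaction list and the names scan_with_filter and
sorted_window are hypothetical, invented for illustration.  It contrasts a
date filter applied during a full scan, which still visits every
transaction, with a binary search over a date-sorted list, which visits
only the window.

```python
import bisect
from datetime import date, timedelta

# Hypothetical stand-in for the book: (posting date, description) tuples.
# This is not GnuCash's data model, only an illustration.
start = date(2000, 1, 1)
transactions = [(start + timedelta(days=i), f"tx {i}") for i in range(5000)]


def scan_with_filter(txs, lo, hi):
    """Visit every transaction and discard the ones outside [lo, hi]."""
    visited = 0
    matches = []
    for posted, desc in txs:
        visited += 1  # the work is done even if the date is out of range
        if lo <= posted <= hi:
            matches.append((posted, desc))
    return matches, visited


def sorted_window(txs, dates, lo, hi):
    """Binary-search a date-sorted list so only the window is visited."""
    first = bisect.bisect_left(dates, lo)
    last = bisect.bisect_right(dates, hi)
    return txs[first:last], last - first


target = date(2005, 6, 15)
window = timedelta(days=21)  # roughly +/- 3 weeks
dates = [posted for posted, _ in transactions]  # sorted by construction

slow, slow_visited = scan_with_filter(
    transactions, target - window, target + window
)
fast, fast_visited = sorted_window(
    transactions, dates, target - window, target + window
)

print(len(slow), slow_visited)
print(len(fast), fast_visited)
```

Both functions return the same candidates; the difference is how many
transactions each one had to touch to find them.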

This issue will only get fixed when we can move GnuCash to be a true DB
app.  Then the SQL code can truly limit the search space properly.
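
As a rough sketch of what "limiting the search space" could look like,
here is a Python example using the standard sqlite3 module.  The table
name tx, its columns, and the index idx_tx_posted are assumptions made up
for this example; they are not GnuCash's actual schema.  The point is only
that, with an index on the posting date, the database can restrict the
duplicate check to a date window instead of examining every row.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")

# Hypothetical schema for illustration; not GnuCash's real tables.
conn.execute(
    "CREATE TABLE tx (id INTEGER PRIMARY KEY, posted TEXT, description TEXT)"
)
conn.execute("CREATE INDEX idx_tx_posted ON tx (posted)")

# One transaction per day; ISO dates sort correctly as plain text.
start = date(2000, 1, 1)
conn.executemany(
    "INSERT INTO tx (posted, description) VALUES (?, ?)",
    [((start + timedelta(days=i)).isoformat(), f"tx {i}") for i in range(5000)],
)

# Candidate duplicates for an imported transaction: +/- 3 weeks of its date.
target = date(2005, 6, 15)
lo = (target - timedelta(days=21)).isoformat()
hi = (target + timedelta(days=21)).isoformat()

candidates = conn.execute(
    "SELECT id, posted, description FROM tx "
    "WHERE posted BETWEEN ? AND ? ORDER BY posted",
    (lo, hi),
).fetchall()

# Ask SQLite how it intends to run the query; the exact wording of the
# plan varies between SQLite versions.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT id FROM tx WHERE posted BETWEEN ? AND ?",
    (lo, hi),
).fetchall()

print(len(candidates))
for row in plan:
    print(row)
```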

Hopefully this explains what's going on.

-derek

-- 
       Derek Atkins                 617-623-3745
       derek at ihtfp.com             www.ihtfp.com
       Computer and Internet Security Consultant