Tue Aug 17 16:20:37 EDT 2004

On Tuesday 17 August 2004 2:44, Derek Atkins wrote:
> linas at linas.org (Linas Vepstas) writes:
> >> A book-merge: function is an excellent idea. I'll get that done.
> >
> > OK, let us know how this goes. Although we could add support for lists 
> > and trees into qof (to handle the lists of splits and the account tree)
> > I'm not sure that I want to burn the brain cells right now to figure out
> > the 'right' way of doing this, not just yet.   By contrast, the
> > book-merge-per-object call seemed like a simple catch-all.

Ah - first gotcha. I think a name change would be helpful here. I can see that 
book-merge: would be misleading as we have already diverged. 

Any per-object merge runs into user problems. I am all too familiar with the 
problems of per-collision data handling, as I mentioned in one of the very 
early posts on this code. I got stuck in a loop with a Palm sync that just 
went on and on and on and on and ON presenting new conflicts for resolution 
without ever giving a hint of how many were still to come. I gave up at 200. 
Most other users would never get beyond 50. The fatal flaw in this approach 
is then obvious - those first 200 conflicts were committed to the final 
database as you went along because each object was dealt with in turn. If the 
user aborts, the final book is in complete chaos. YUK!

The merge runs on a per-book basis for two simple reasons. 

1. When the Init() is complete, the user intervention loop has a clear and 
accurate value for the total number of conflicts to be resolved by the user. 
To do this, all import objects must be compared before any user intervention 
is sought. The user then gets a simple choice based on accurate information - 
if the number of conflicts is too high, the user can abort immediately. What 
counts as 'too high' cannot be determined in advance.

2. Any abort prior to the commit stage is absolutely safe: nothing is changed 
in the target book until every conflict is resolved. If the user aborts at 
any point before the final commit, everything is left exactly as it was 
found.

I like to think of it as a fail-safe.
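
A minimal sketch of the per-book sequence described above: compare everything first, report the full conflict count, and only touch the target at commit. All names here (Book, MergeData, merge_init, merge_resolve, merge_commit) are hypothetical stand-ins and not the qof_book_merge API; an int id stands in for a GUID and a single int value for an object's parameters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration only; not the qof_book_merge API. */
typedef enum { MERGE_NEW, MERGE_SAME, MERGE_CONFLICT } MergeResult;
typedef enum { UNDECIDED, KEEP_TARGET, TAKE_IMPORT } Resolution;

typedef struct {
    int id;     /* stands in for a GUID */
    int value;  /* stands in for the object's parameters */
} Entity;

#define MAX_ENTITIES 16

typedef struct {
    Entity items[MAX_ENTITIES];
    size_t count;
} Book;

typedef struct {
    const Entity *import;  /* entity from the import book */
    int target_index;      /* index in the target book, -1 if new */
    MergeResult result;
    Resolution resolution;
} MergeRule;

typedef struct {
    MergeRule rules[MAX_ENTITIES];
    size_t count;
    size_t conflicts;      /* complete once merge_init() returns */
} MergeData;

static int find_by_id(const Book *book, int id)
{
    for (size_t i = 0; i < book->count; i++)
        if (book->items[i].id == id)
            return (int)i;
    return -1;
}

/* Phase 1: compare every import entity. The target is const here,
 * so nothing can be changed before the user has seen the total. */
void merge_init(MergeData *md, const Book *import, const Book *target)
{
    assert(import->count <= MAX_ENTITIES);
    md->count = 0;
    md->conflicts = 0;
    for (size_t i = 0; i < import->count; i++) {
        MergeRule *r = &md->rules[md->count++];
        r->import = &import->items[i];
        r->target_index = find_by_id(target, r->import->id);
        r->resolution = UNDECIDED;
        if (r->target_index < 0) {
            r->result = MERGE_NEW;
        } else if (target->items[r->target_index].value == r->import->value) {
            r->result = MERGE_SAME;
        } else {
            r->result = MERGE_CONFLICT;
            md->conflicts++;
        }
    }
}

/* Phase 2: record the user's choice; the target is still untouched. */
void merge_resolve(MergeData *md, size_t rule, Resolution choice)
{
    if (rule < md->count && md->rules[rule].result == MERGE_CONFLICT)
        md->rules[rule].resolution = choice;
}

/* Phase 3: commit. Returns -1 and changes nothing while any conflict
 * is unresolved or the target lacks room for the new entities. */
int merge_commit(const MergeData *md, Book *target)
{
    size_t new_count = 0;
    for (size_t i = 0; i < md->count; i++) {
        const MergeRule *r = &md->rules[i];
        if (r->result == MERGE_CONFLICT && r->resolution == UNDECIDED)
            return -1;
        if (r->result == MERGE_NEW)
            new_count++;
    }
    if (target->count + new_count > MAX_ENTITIES)
        return -1;

    for (size_t i = 0; i < md->count; i++) {
        const MergeRule *r = &md->rules[i];
        if (r->result == MERGE_NEW)
            target->items[target->count++] = *r->import;
        else if (r->result == MERGE_CONFLICT && r->resolution == TAKE_IMPORT)
            target->items[r->target_index].value = r->import->value;
    }
    return 0;
}
```

In this sketch an abort is simply "never call merge_commit()": the MergeData can be discarded at any point and the target book is exactly as it was found.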

Also:

> > own merge routine to 'do the right thing' for that object type.  For
> > us 'lazy' object developers, provide a generic merge routine that
> > we can use if an object is fully qof compliant.

Fully qof-compliant objects wouldn't need an object-specific merge routine 
defined.

Having a value 'book-merge: NULL,' would be misleading to other developers. It 
could easily lead people to think that the object will NOT be merged. Yet if 
an object is fully QOF compliant, it's all handled by the book merge routine, 
so no special object behaviour is required.

Instead, I need the object to help me with any non-compatible parameters, 
difficult objects, awkward values and post-commit code that simply doesn't 
fit into a generic book merge module.

For example, an object might need to recalculate balances or lists of Splits 
after a commit but before control is returned to the main GnuCash process for 
editing.

I was thinking of something better termed merge-helper: rather than the 
possibly misleading book-merge:
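
One way the merge-helper: idea could look in an object definition, as a rough sketch only. ObjectDef, MergeHelperFunc, DemoAccount and merge_post_commit are hypothetical names invented for this illustration, not existing QOF declarations. The point it shows is that a NULL helper means "nothing special needed", not "do not merge".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical object used only for this illustration. */
typedef struct {
    int balance;
    int split_amounts[4];
    size_t n_splits;
} DemoAccount;

/* Hypothetical hook type: optional post-commit work for one object. */
typedef void (*MergeHelperFunc)(void *object);

/* Hypothetical object description. A NULL merge_helper means the
 * object is fully generic; it is still merged by the book merge. */
typedef struct {
    const char *type_name;
    MergeHelperFunc merge_helper;
} ObjectDef;

/* Example helper: rebuild the cached balance from the splits. */
void demo_account_recalc(void *object)
{
    DemoAccount *acc = object;
    assert(acc->n_splits <= 4);
    acc->balance = 0;
    for (size_t i = 0; i < acc->n_splits; i++)
        acc->balance += acc->split_amounts[i];
}

/* Called by the generic merge for each object after the commit and
 * before control returns to the application. */
void merge_post_commit(const ObjectDef *def, void *object)
{
    if (def->merge_helper != NULL)
        def->merge_helper(object);
}
```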

> I agree.  I think you can safely ignore the lists in most objects
> (except, perhaps, transactions) as those objects will get merged into
> the book later.  For example, you don't need to worry about the
> Account SplitList, because that's really "controlled" by the Splits
> themselves.  When you merge in the splits you get their "parent"
> account just fine.

Will this work reliably if the parent is a new account created by the merge? 
(Possibly in a new AccountGroup too?) At present, I don't have any rules to 
stipulate that certain objects should be compared or committed before or 
after any others. That could be something for merge-helper: to define, 
although I'd rather not (for speed reasons). 

At present, a new Split could be created and filled with real data BEFORE the 
parent account is actually created. The Split will receive a genuine 
reference to the parent - a reference that will be safe once the commit is 
complete - but the Split (or any other process/object) must not access that 
reference until the commit is complete. 

The correct reference is passed because the import object is a QofBook and 
therefore a GUID will have been assigned prior to the comparison. If the 
parent is a new entity in the target book, it will take the GUID that was 
assigned to it in the import book. This is a useful trick to save time on 
future imports - by setting the GUID (and exporting it next time), the next 
import of the same entity already knows the right entity to update and 
therefore the next merge runs more quickly.
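
The GUID reuse described above can be sketched roughly as follows. DemoBook, DemoAccount, book_lookup and merge_account are hypothetical names, the GUID is held as a plain 32-character hex string, and "update" is reduced to overwriting one field; the real comparison and conflict handling are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define GUID_LEN 33      /* 32 hex characters plus the terminating NUL */
#define MAX_ACCOUNTS 8

/* Hypothetical types for illustration only. */
typedef struct {
    char guid[GUID_LEN];
    int balance;
} DemoAccount;

typedef struct {
    DemoAccount accounts[MAX_ACCOUNTS];
    size_t count;
} DemoBook;

/* Find an entity in the book by GUID, or return NULL. */
DemoAccount *book_lookup(DemoBook *book, const char *guid)
{
    for (size_t i = 0; i < book->count; i++)
        if (strcmp(book->accounts[i].guid, guid) == 0)
            return &book->accounts[i];
    return NULL;
}

/* Returns 1 if a new entity was created, 0 if an existing one was
 * updated, -1 if the book is full. A new entity keeps the GUID it
 * was given in the import book, so the next import of the same
 * entity is matched directly by the lookup. */
int merge_account(DemoBook *target, const DemoAccount *incoming)
{
    assert(strlen(incoming->guid) < GUID_LEN);
    DemoAccount *existing = book_lookup(target, incoming->guid);
    if (existing != NULL) {
        existing->balance = incoming->balance;
        return 0;
    }
    if (target->count == MAX_ACCOUNTS)
        return -1;
    target->accounts[target->count++] = *incoming;  /* GUID carried over */
    return 1;
}
```

Because a Split would refer to its parent by that same GUID, the reference it holds before the commit stays valid afterwards, provided nothing dereferences it until the parent has actually been created.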

All that code is already in place and running in qof_book_merge, including the 
compare, user intervention and commit.

The target book would therefore need to be locked in some way until the commit 
is finished, even in a threaded environment, or an object could easily find 
itself calling uninitialised memory.
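
As a rough illustration of that kind of lock, the sketch below uses a simple flag that makes ordinary lookups fail while a commit is running. The names (DemoBook, commit_begin, commit_end, commit_add_account, book_get_account) are hypothetical, and a plain int flag is an assumption made for brevity: it is not thread-safe, so a threaded build would need a real mutex or similar.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ACCOUNTS 8

/* Hypothetical types for illustration only. */
typedef struct {
    int id;       /* stands in for a GUID */
    int balance;
} DemoAccount;

typedef struct {
    DemoAccount accounts[MAX_ACCOUNTS];
    size_t count;
    int commit_in_progress;  /* non-zero while the merge is committing */
} DemoBook;

void commit_begin(DemoBook *book)
{
    assert(!book->commit_in_progress);
    book->commit_in_progress = 1;
}

void commit_end(DemoBook *book)
{
    assert(book->commit_in_progress);
    book->commit_in_progress = 0;
}

/* Only the commit code may add entities; returns -1 otherwise. */
int commit_add_account(DemoBook *book, int id, int balance)
{
    if (!book->commit_in_progress || book->count == MAX_ACCOUNTS)
        return -1;
    book->accounts[book->count].id = id;
    book->accounts[book->count].balance = balance;
    book->count++;
    return 0;
}

/* Ordinary readers get NULL until the commit has finished, so they
 * never see a half-built entity. */
const DemoAccount *book_get_account(const DemoBook *book, int id)
{
    if (book->commit_in_progress)
        return NULL;
    for (size_t i = 0; i < book->count; i++)
        if (book->accounts[i].id == id)
            return &book->accounts[i];
    return NULL;
}
```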

Provided that no editing is allowed until the commit is finished, is that a 
problem?

This must be similar to how a book is opened from a file - no process should 
try to read or modify the data until all relevant data has been obtained from 
the file source.

> The only potential issue I can imagine is merging transactions -- you
> somehow need to detect that what you're merging has fewer splits
> (e.g. if you change a transaction and remove splits).  I'm not sure
> how this can be done easily....

Isn't a merge always going to add data? When would an import result in fewer 
Splits? I've got no code (so far) that removes data from the target book or 
any objects in the book.

-- 

Neil Williams
=============
http://www.codehelp.co.uk/
http://www.dclug.org.uk/
http://www.isbn.org.uk/
http://sourceforge.net/projects/isbnsearch/

http://www.biglumber.com/x/web?qs=0x8801094A28BCB3E3