Dirty entity identification.

Thu Jul 21 15:13:22 EDT 2005

On Thursday 21 July 2005 12:57 am, Chris Shoemaker wrote:
> Ok, this was my point.  I completely understand that you can get a
> very quick boolean answer to the question "has anything in the book
> changed?" by checking each collection's dirty flag.  But think about
> *how* you'd have to create a list of all dirty entities for the case
> where the task is to commit just the 1 split that changed in my
> example.

I had been, and it could be solved, but I'd have to formalise the idea first. 
I'm not sure what the real-world use for such an API is. I can see it as a 
fallback for a failed write, but that isn't particularly common. I can see it 
for incremental storage systems, but we don't use those yet (SQL aside, as 
that can do this via a separate mechanism).

> ISTM (It seems to me) there are 3 options: 1) You can't do 
> that; you must commit all 100000 Splits.  2) You can do that just
> fine, but you must do a linear search through 100000 Splits to find
> the 1 that changed.  or 3) You start at the dirty book, and perform
> the tree search I described before. 

Derek's point stands: the book knows nothing about the tree. There is no tree 
within the book; it only exists in *our* conceptualisation of the 
relationships between objects. All the book knows about are collections, and 
collections are not linked to each other - only objects link to other 
objects.
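
As a rough sketch of that layout (plain C, with hypothetical names that are 
not the real QOF types or functions): the book holds a flat set of 
collections, one per object type, and each collection carries a single dirty 
flag. Asking "has anything changed?" is cheap, but nothing records which 
entity was responsible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the real QOF structures. */
typedef struct {
    const char *type_name;  /* e.g. "Account", "Trans", "Split" */
    size_t      count;      /* number of entities of this type */
    bool        dirty;      /* one flag covering the whole collection */
} Collection;

typedef struct {
    Collection *collections;    /* flat array: collections are not linked */
    size_t      n_collections;
} Book;

/* Called when any entity of this type changes. Only the flag is set;
 * the identity of the entity is not recorded anywhere. */
static void collection_mark_dirty(Collection *c)
{
    assert(c != NULL);
    c->dirty = true;
}

/* The quick boolean answer: one check per collection, regardless of how
 * many entities each collection holds. */
static bool book_is_dirty(const Book *b)
{
    for (size_t i = 0; i < b->n_collections; i++) {
        if (b->collections[i].dirty)
            return true;
    }
    return false;
}
```

With three collections the check costs three comparisons, whether the Split 
collection holds 10 entities or 100000.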

Now it *could* be possible for each collection to keep a GList of its own 
changed entities. The question is, is it worth doing?

Keep in mind that all existing mechanisms are retrospective - not much is done 
until the question is asked. Storing a GList of modified entities would have 
to be predictive: it would be maintained whether you need it or not. This 
isn't just storing a single boolean value that covers tens of thousands of 
entities; the GList would store each modified entity and could get incredibly 
long in some cases. It may only be storing a pointer to the entity, or maybe 
its GUID (as the type is already determined by the collection), but that 
will mount up. With the SQL query dialog that I've got planned for after G2, 
it is conceivable that the user could update every single instance of one 
object type in one operation.
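
To make the trade-off concrete, here is a minimal sketch in plain C. All 
names are hypothetical, a growable pointer array stands in for the GList, and 
it assumes each entity carries its own dirty flag (which is what the linear 
search in option 2 would rely on). The predictive list is paid for on every 
modification; the retrospective scan is paid for only when someone asks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical entity and collection; not the real QOF structures. */
typedef struct {
    int  id;
    bool dirty;
} Entity;

typedef struct {
    Entity  *entities;      /* all entities of one type */
    size_t   count;
    bool     dirty;         /* the existing single flag */
    Entity **changed;       /* predictive list of modified entities */
    size_t   n_changed;
    size_t   cap_changed;
} TrackedCollection;

/* Predictive: every modification pays to record the entity, whether or
 * not anyone ever asks for the list. Returns false on allocation failure. */
static bool tracked_modify(TrackedCollection *c, size_t index)
{
    assert(index < c->count);
    Entity *e = &c->entities[index];
    c->dirty = true;
    if (e->dirty)
        return true;        /* already on the list */
    if (c->n_changed == c->cap_changed) {
        size_t cap = c->cap_changed ? c->cap_changed * 2 : 16;
        Entity **grown = realloc(c->changed, cap * sizeof *grown);
        if (grown == NULL)
            return false;
        c->changed = grown;
        c->cap_changed = cap;
    }
    e->dirty = true;
    c->changed[c->n_changed++] = e;
    return true;
}

/* Retrospective: nothing extra is stored; the whole collection is
 * scanned only when the question is asked. */
static size_t scan_dirty(const TrackedCollection *c)
{
    size_t n = 0;
    for (size_t i = 0; i < c->count; i++) {
        if (c->entities[i].dirty)
            n++;
    }
    return n;
}
```

After a single edit the list holds one pointer, but a bulk update that 
touches every entity grows it to the size of the collection itself.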

> The time cost difference between 
> 2) and 3) can be arbitrarily large.

I think the cost of tracking would be too large to inflict on all users at 
all times for the odd occasion that it might be useful.

> I can see that QSF only needs to handle lists of uniformly typed
> entities.  However, if there's no way to ask "are there dirty
> Transactions in this Account"

The Account is marked dirty but the entities responsible are not currently 
identifiable.

> , then *every* selection of a subset of 
> Splits for committing will require a linear search through *all*
> Splits.  Does that seem like a problem to you?

It would if I could see a need to identify only these entities. 

Currently, I can only see this as a solution in search of a problem.

-- 

Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
