Dirty entity identification.

Fri Jul 22 12:14:20 EDT 2005

On Fri, 2005-07-22 at 09:10 +0100, Neil Williams wrote:
> On Friday 22 July 2005 1:01 am, you wrote:
> > You clearly haven't understood my statements.  I don't want to have to
> > run the list of collections each time I ask if the book is dirty.
> > That's too much work.
> 
> It's not. At no time do you iterate through the entire collections to do this. 
> The code does iterate over the tiny number of primary collections but that 
> takes no time at all. It takes longer to call the function itself than it 
> does to check 12 or so boolean pointers, especially once that function is in 
> a shared library.

Really?  My way.  Load the function argument (1) and function address
(2), call the is_dirty subroutine (1), load the structure offset (1),
read the value (1), return (1).  Depeneding on whether or not a stack
frame is needed add another 3 instructions.  7 to 10 total instructions.

Your way.  Load the function argument (1) and function address (2), call
the is_dirty subroutine (1), load the structure offset (1), read the
value (1), test for non-zero (1), load the address of the
object_is_dirty function (2) and call it(1).  The count is already at
ten instructions and you haven't even started to check the objects.  For
each object you need to load a pointer to the object (1), load an offset
(1), read the value (1) and test it (1).  That's at least four
instructions per object, times your twelve objects, for 48 instructions.
Another two to unwind the stack and that's 50 more, for a total count of
at least 60 instructions.  Depeneding on whether or not a stack frames
are needed, add another 6 to ten instructions.  I come up with a number
of arount 70 instructions total for your method.  Its also O(n) worst
case instead of O(1).  I will concede that the value of n is not
excessively large.

> The only problem with that is that the dirty flag in the book is not exposed 
> directly (or won't be once I've cleaned up the private header file issue). 

And I've never asked for it to be.

> You need to call a function via the API and that function will check the 
> collections. It really is no overhead.

Its anywhere from the equivalent of what I'm proposing, to seven times
more work.

> > void
> > qof_book_set_dirty (QofBook *book)
> > {
> > 	book->inst.dirty = TRUE;
> > 	if (book->dirty_time == 0)
> > 		book->dirty_time = time();
> > }
> 
> But it's pointless even checking this EVERY single time an entity changes. 
> Nobody wants that information, all we want is to detect the FIRST change.

This *is* detecting the first change.  Would you rather it test for
book->inst.dirty == FALSE?   There has to be a test somewhere to
determine first vs. subsequent.  In the proposed code it is a test of
zero/non-zero.  It just so happens that the non-zero case is also used
as a timestamp.

> Instead of calling the collections ONCE each, you want the instance to call 
> the book EVERY single time it changes! Instead of 12 or so calls, you are 
> recommending tens of thousands when only ONE is necessary.

Nice large number.  Got data to back up that figure?

> Instances get dirty, it's their job, it's what they do. Books don't need to 
> know every single time. Once is perfectly adequate.

How do you know if this is the time that the book needs to be set dirty
if you don't test its current value?  If you are testing the value,
you've just moved my above test for first/subsequent outside of the
qof_book_set_dirty() function.  You're still doing the test though.

> Better, IMHO, to check the collections and cache the value - after all it is 
> the UI that controls WHEN the data is saved and how.

Today, yes.  Tomorrow, who knows?

> There's also no point in asking the book if it is dirty once you've got the 
> answer YES. It will stay dirty until the UI issues the qof_session_save(). 
> The book itself has no control over the save operations - it cannot clear 
> it's own flag and it cannot force a save. This function shouldn't need to be 
> called every few seconds or at every window paint. Cache it and wait for a 
> Save.

That presupposes that the frontend know how all possible bakckends work.
I do not want to write code that has to play state coherency games to
make sure that it is correctly tracking the state of the backend.  The
backend is the authority on whether or not the data is dirty, so the
frontend should ask when it wants to know.  Current backends may not
save automatically, but that doesn't mean future backends will not.
Users have long asked for an autosave feature.  That could be
implemented in either the frontend or the backend.  The backend makes
more sense to me as only certain backends need autosave.

> At no time can the set_clean functions be public, these are reserved for the 
> backends. There is no situation where the UI can be allowed to set the book 
> to clean without going through a Save.

Absolutely 100% wrong.  Its called "Exit Without Saving."  I need a
method to tell the backend to drop everything on the floor.  Either that
or I need a global variable to pass data from point A in file 1, back
through the gtk idle loop to point B in file 2 to tell point B not to
bother checking to see if the data is dirty and should be saved before
quitting the program.

David