GUID Management

Christopher Browne cbbrowne@mail.hex.net
Thu, 04 Jan 2001 00:52:06 -0600


On Tue, 02 Jan 2001 15:54:53 EST, the world broke into rejoicing as
David Merrill <dmerrill@lupercalia.net>  said:
> On Tue, Jan 02, 2001 at 02:38:45PM -0600, Christopher Browne wrote:
> > On Tue, 02 Jan 2001 14:23:22 CST, the world broke into rejoicing as
> > linas@linas.org  said:
> > > It's been rumoured that David Merrill said:
> > > > 
> > > > I read this again and changed my mind, and I'll tell you why...
> > > > 
> > > > When a record is changed, it is moved to the audit table, and a new
> > > > record is generated. That means a new GUID as well. So if the record
> > > > exists in the transaction table, it is necessarily the same "original"
> > > > data.
> > > 
> > > Maybe we are using guid's for too many conflicting purposes? 
> > > 
> > > It seems that we need another identifier that says 
> > > 'this is the same record, even though we've been editing it.'
> > > 
> > > For example, don't splits store the guid's of their parent accounts?
> > > If you edit an account, and issue it a new guid, don't you have to
> > > walk a zillion splits to update their guid's as well ??
> > 
> > Yes, indeed, this seems an overly conflicted use of GUIDs.
> 
> Yeah, I see your point. The idea was half-baked. Try this:
> 
> The guid never changes on the original record. If it is deleted, no
> problemo - just move it to the audit table. If it is edited, copy it
> to the audit table before changing the original. We could then have
> lots of versions in the audit table, but they are all timestamped and
> so we can recreate the audit trail easily. There's really no need to
> even change the guid on them.

That sounds quite reasonable.

In effect, the audit table contains entries that are "write-only,"
with a _nice_ monotonically increasing key.  
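
To make the copy-before-edit scheme concrete, here is a minimal sketch
using Python's sqlite3 module.  The table names (txn, txn_audit), the
columns, and the helper functions are all assumptions made up for
illustration; nothing here is taken from the actual engine.

```python
import sqlite3
from datetime import datetime, timezone

def open_db():
    """In-memory database with a hypothetical live table and audit table."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE txn (guid TEXT PRIMARY KEY, description TEXT, amount REAL);
        CREATE TABLE txn_audit (
            audit_id INTEGER PRIMARY KEY AUTOINCREMENT,  -- monotonically increasing key
            guid TEXT, description TEXT, amount REAL, archived_at TEXT);
    """)
    return db

def archive(db, guid):
    """Copy the current row into the audit table, timestamped, GUID unchanged."""
    db.execute(
        "INSERT INTO txn_audit (guid, description, amount, archived_at) "
        "SELECT guid, description, amount, ? FROM txn WHERE guid = ?",
        (datetime.now(timezone.utc).isoformat(), guid))

def edit(db, guid, description, amount):
    """Archive the old version, then change the original in place."""
    with db:  # archive and update commit (or roll back) together
        archive(db, guid)
        db.execute("UPDATE txn SET description = ?, amount = ? WHERE guid = ?",
                   (description, amount, guid))

def delete(db, guid):
    """Move the row to the audit table: archive it, then drop the original."""
    with db:
        archive(db, guid)
        db.execute("DELETE FROM txn WHERE guid = ?", (guid,))
```

Rows in txn_audit are only ever inserted, never updated, and the
trail for one record is just "WHERE guid = ? ORDER BY audit_id".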

A _PERFECT_ candidate application for using CDB to store the data.
   <ftp://koobera.math.uic.edu/www/cdb.html>

CDB is interesting in that it guarantees O(1) access time, whilst
outright forbidding updating values, which is _fine_ for us here.  It
also provides extremely compact storage, _very_ slightly more than
would result from storing entries as lines of text.  If there are
quite a lot of updates, and we seldom walk back through them, compact
storage is probably beneficial...  But I digress...

> This brings back the problem I thought I had solved, which was
> determining whether or not the record has changed already. Linas
> suggested handling this by sending the original data along with the
> updates when requesting a save. This would work, but seems kludgy. The
> db then has to check, field by field, to see if it has changed.
> The alternative is to assign a version number that increments each
> time the record is updated. Then we need only check one field to know
> if changes have been made. See any problems with that approach?

I think it needs more than just a version number.

Suppose two independent updates try to take place "concurrently."

One goes in, as "version 4," and another goes in, also proposed as
"version 4."  If all that is tracked is the version number, then
the second update gets silently ignored.  Bad Thing.
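
A toy illustration of the problem, assuming the engine accepts a save
only when the proposed version is exactly one more than the stored
one.  The class and its fields are hypothetical; the point is only
that the losing update learns nothing beyond "no".

```python
class VersionOnlyStore:
    """Hypothetical record store that tracks nothing but a version number."""

    def __init__(self):
        self.version = 3
        self.fields = {"Description": "Minto Realty"}

    def save(self, proposed_version, changes):
        # Accept only the next version in sequence.
        if proposed_version != self.version + 1:
            # Stale proposal: dropped, with no record of who got there
            # first or when.
            return False
        self.fields.update(changes)
        self.version = proposed_version
        return True

store = VersionOnlyStore()
first = store.save(4, {"Account": "Rent Expense"})   # accepted as version 4
second = store.save(4, {"Amount": 452.75})           # also "version 4": dropped
```

After both calls the record has the Account change but not the
Amount, and nothing in the stored data says which session won.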

How about replacing the "version number" with the tuple of
[Update Time, Update Session].

Thus, the engine is given a transaction update, at 13:32:31,

   [Description = "Minto Realty", Account = "Rent Expense",
    Update Time = 2001/01/01 13:32:22.001, Session = "cbbrowne:456"]

and then is given, 2 seconds later, at 13:32:33,

   [Description = "Minto Realty", Amount = 452.75,
    Update Time = 2001/01/01 13:32:24.021, Session = "dcbrowne:567"]

Given time/session stamps, we can make up meaningful policies of "how
to cope," probably reporting back to "session dcbrowne:567" something
to the effect that "someone else was updating that record; try
again..."
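
One possible policy, sketched in Python.  Everything here is an
assumption for illustration: the Stamp and Record types, the save()
function, and the rule that a session must present the stamp it last
read (the made-up "import:1" stamp below) for its update to apply.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass(frozen=True)
class Stamp:
    """The [Update Time, Update Session] tuple."""
    update_time: datetime
    session: str

@dataclass
class Record:
    fields: dict
    stamp: Stamp

def save(record: Record, seen: Stamp, changes: dict, new: Stamp) -> Optional[str]:
    """Apply changes if the record still carries the stamp the caller read.

    Returns None on success, otherwise a message for the losing session.
    """
    if record.stamp != seen:
        return (f"{new.session}: someone else ({record.stamp.session}) updated "
                f"that record at {record.stamp.update_time}; try again")
    record.fields.update(changes)
    record.stamp = new
    return None

# Both sessions read the record while it carried this (hypothetical) stamp.
original = Stamp(datetime(2001, 1, 1, 9, 0, 0), "import:1")
rec = Record({"Description": "Minto Realty"}, original)

r1 = save(rec, original, {"Account": "Rent Expense"},
          Stamp(datetime(2001, 1, 1, 13, 32, 22, 1000), "cbbrowne:456"))
r2 = save(rec, original, {"Amount": 452.75},
          Stamp(datetime(2001, 1, 1, 13, 32, 24, 21000), "dcbrowne:567"))
# r1 is None (applied); r2 is a message that names cbbrowne:456.
```

Unlike a bare version number, the rejected session is told who
changed the record and when, which is enough to build a "reload and
retry" or "merge non-overlapping fields" policy on top.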

I could go along with merely having an "update date" (no hh:mm:ss) and
merely indicate a user ID, not trying to identify a particular
session.  But I'd think a time stamp more useful than a mere "version
number."
--
(concatenate 'string "cbbrowne" "@ntlug.org")
<http://www.hex.net/~cbbrowne/>
Outside of a dog,  a book is man's best friend. Inside  of a dog, it's
too dark to read. -Groucho Marx