Trial Balloon: A new DataStore Architecture?

Derek Atkins warlord@MIT.EDU
31 Oct 2000 08:34:25 -0500

I've been thinking about the Disk-File vs. Database arguments for a
while, and I think there are some broader architectural changes to the
current datastore model that would need to be made before any kind of
multi-access could be implemented.  My thesis is that if we can move
to a more "object-oriented" data storage model, it would make it
easier to add new distributed functionality such as database storage
or even a client/server network-protocol system.

Currently, GnuCash calls a function that reads in an AccountGroup, the
user can then make changes to that AccountGroup, and then another
function is called to write out the AccountGroup.  Coherency between
multiple clients is maintained by a database lock on the complete Data
Store file, which is held from the "open" to the "close."  Obviously
this wouldn't work in a multi-client environment, since every other
client would be locked out for the whole session.  Remember that a
multi-client environment does not necessarily imply multiple users.
It could be a cron-job, or an emacs client, or a simple command-line
interface, any of which may be run while a GnuCash window is still
open.

The architecture that I had in mind is an "intelligent" Data Store
with CallBack functionality.  Whenever GnuCash opens a data store and
reads information out of it, it sets a callback on
the data it reads (note: I don't know what granularity we want on
these callbacks; at first approximation I would say "per open
account").  Whenever data is changed in the data store (this can be an
add, remove, or modification of existing data), the callbacks
associated with that data are notified of the changes.
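
As a rough illustration of the callback idea, here is a minimal
sketch in Python (chosen only for brevity).  Everything in it is
hypothetical: the DataStore class, the open_account/add/modify/remove
names, and the callback signature are assumptions made for the
example, not an existing GnuCash API.  It assumes the "per open
account" granularity suggested above.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical callback signature: (account, kind of change, transaction id).
Callback = Callable[[str, str, str], None]


class DataStore:
    """Toy in-memory store that notifies watchers of an account on changes."""

    def __init__(self) -> None:
        # account name -> {transaction id -> amount in cents}
        self._txns: Dict[str, Dict[str, int]] = defaultdict(dict)
        # account name -> callbacks registered by clients that opened it
        self._watchers: Dict[str, List[Callback]] = defaultdict(list)

    def open_account(self, account: str, callback: Callback) -> Dict[str, int]:
        """Read an account and set a callback on it in the same step."""
        self._watchers[account].append(callback)
        return dict(self._txns[account])  # hand back a copy, not the live data

    def close_account(self, account: str, callback: Callback) -> None:
        self._watchers[account].remove(callback)

    def add(self, account: str, txn_id: str, amount: int) -> None:
        self._txns[account][txn_id] = amount
        self._notify(account, "add", txn_id)

    def modify(self, account: str, txn_id: str, amount: int) -> None:
        if txn_id not in self._txns[account]:
            raise KeyError(txn_id)
        self._txns[account][txn_id] = amount
        self._notify(account, "modify", txn_id)

    def remove(self, account: str, txn_id: str) -> None:
        del self._txns[account][txn_id]
        self._notify(account, "remove", txn_id)

    def _notify(self, account: str, kind: str, txn_id: str) -> None:
        # Iterate over a copy so a callback may unregister itself safely.
        for callback in list(self._watchers[account]):
            callback(account, kind, txn_id)


store = DataStore()
seen = []
store.open_account("Checking", lambda acct, kind, txn: seen.append((acct, kind, txn)))
store.add("Checking", "t1", 2500)
store.modify("Checking", "t1", 3000)
store.add("Savings", "t2", 10000)  # nobody has Savings open, so no callback fires
store.remove("Checking", "t1")
```

Note that this sketch also notifies the client that made the change;
whether a writer should hear about its own writes is one of the
design questions the real API would have to settle.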

Obviously, an intelligent filestore would imply a filestore process
that runs in the background behind all GnuCash engine processes.  This
may be a problem in the short-term, so my proposal would be to design
the proper architecture and APIs using an integrated system before
splitting out the data storage subsystem.
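
One way to picture the "integrated system first" step is sketched
below in Python.  The names (DataStoreAPI, IntegratedStore, pay_rent,
read_balance, post) are all made up for illustration and amounts are
plain integers in cents.  The point is only that engine code written
against an abstract interface would not need to change if an
in-process store were later swapped for a separate storage process.

```python
from abc import ABC, abstractmethod
from typing import Dict


class DataStoreAPI(ABC):
    """Hypothetical storage interface that engine code would be written against."""

    @abstractmethod
    def read_balance(self, account: str) -> int:
        """Return the account balance in cents."""

    @abstractmethod
    def post(self, account: str, amount: int) -> None:
        """Apply a signed amount (in cents) to the account."""


class IntegratedStore(DataStoreAPI):
    """In-process implementation: the 'integrated system' stage."""

    def __init__(self) -> None:
        self._balances: Dict[str, int] = {}

    def read_balance(self, account: str) -> int:
        return self._balances.get(account, 0)

    def post(self, account: str, amount: int) -> None:
        self._balances[account] = self.read_balance(account) + amount


def pay_rent(store: DataStoreAPI, amount: int) -> None:
    """Engine-side code: it sees only the interface, not where the data lives."""
    store.post("Checking", -amount)
    store.post("Expenses:Rent", amount)


store = IntegratedStore()
store.post("Checking", 100000)
pay_rent(store, 40000)
```

A later out-of-process or database-backed store would then be another
subclass of the same interface, assuming the API turns out to be
expressible that way.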

If we can change the data storage model such that we can split out the
filestore, then integrating any network-based client-server or
database-storage technology would be relatively simple.  The hard part
is designing the right filestore API.  I haven't gotten that far yet,
so I don't actually have a proposed API set.  I wanted to get
feedback from this list, first, for architectural and requirements

So, any comments?


PS: What I'm really talking about here is building our own user-space
network file system with client-side cache.  Perhaps we could use
ideas from existing FS designs, such as AFS or Coda?  Thinking about
the problem in this way allows us to re-use existing solutions, or at
least base our solutions on their solutions to the same problems.
       Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
       Member, MIT Student Information Processing Board  (SIPB)
       URL:      PP-ASEL      N1NWH
       warlord@MIT.EDU                        PGP key available