GncBusiness v. GNCSession

Linas Vepstas linas@linas.org
Wed, 21 Nov 2001 15:08:30 -0600


Hi Derek,


I see that we really are talking past each other; but I see that we also 
rejoin by the end :-) So let me begin by talking in the wrong direction
some more...


On Wed, Nov 21, 2001 at 11:36:29AM -0500, Derek Atkins was heard to remark:
> Having a standard way to hook new object types into the engine and
> backend would allow third parties to extend the Gnucash engine.

Yes, I understand ... that is not the issue.

> Instead of requiring changes to the engine for each of these new
> first-class objects, it would be better to have an extension mechanism
> so that these objects can be provided by a gnc-module.  

Yes, in one sense it would be better.  In another sense it would be
worse.  The tradeoff is between extensibility vs. complexity.
Programmers have a natural habit of creating complexity when the problem
can frequently be solved more simply.  And that is why I was initially
pushing back.

But I thought I concluded the last email with a statement to the 
effect that a simple extension mechanism, as long as it's kept simple,
would be OK.

> Moreover, the
> engine SHOULD NOT NEED TO BE CHANGED to support load/save operations
> of these new objects.  In other words, I'd like to see the load/save
> operations be part of the gnc-module that provides these objects.

Load/save is a really bad metaphor.  It's unix/windows broken-ness.
This is why the backend was created: to get away from load/save.
Load/save is a true metaphor only for files; it's not true for sql
or http or rpc (or smalltalk...).

The storage operations are provided not by the module itself, but by a
backend.

> > > Yes it does.  Take a look at gnc-module.c
> > 
> > ? I don't understand. gnc-module doesn't seem to hold anything global.
> > Disclaimer:  I have not looked at, studied, or have overheard a
> > description of the module system.  So maybe I'm missing something.
> > But skimming the code, I don't see any place where I can dangle
> > one or more global structs.  There's no hash table, no insert/delete
> > methods, no lookup/find methods. 
> 
> This would be easier if you actually looked at the code I was referencing
> before you responded to me.  From the top of the gnc-module.c file:

Yes, I looked at the code.

> static GHashTable * loaded_modules = NULL;
> static GList      * module_info = NULL;
> 
> typedef struct 
> {
>   lt_dlhandle   handle;
>   gchar         * filename;
>   int           load_count;
>   GNCModuleInfo * info;  
>   int           (* init_func)(int refcount);
> } GNCLoadedModule;
> 
> This sure looks like static, global data to me.  There most certainly
> is a hash table.  There most certainly is insert/delete methods....
> I'll concede that there isn't a lookup method, but I haven't actually
> looked for one.

OK, then try this on for size: I have just invented a new business
object called 'phonebook'.  It stores customer names, addresses and 
phone numbers.  To access that phone book, I need to have a single
pointer to the 'phonebook' (i.e. a pointer to an instance of some struct
I created.)  My top-level pointer to 'phonebook' is in a certain sense
a 'global'.

Now, one way to gain access to it is to edit gnc-book-p.h, and add
GNCPhoneBook *fones to struct gnc_book_struct.  So voila, I have 
found a home for my global.  This is one way.

But I don't think anyone would be happy if I added GNCPhoneBook *fones 
to struct GNCLoadedModule.    Nor is there a routine where I can 
pass in e.g. (void *) fones, and later on ask "give me back that
pointer to fones I gave you before".   

The gnucash module system seems to be a fancy wrapper around the
dlopen() and dlsym() system routines.  I do not entirely understand 
why we needed a fancy wrapper around dlopen()/dlsym(); presumably 
this has something to do with guile.

The gnucash module system does not seem to be a mechanism for managing
a collection of pointers.   To manage a collection of pointers, you would 
have an interface something like this:

void gnc_insert_global (const char * name, void * data);
void * gnc_fetch_global (const char * name);

which you would use like so:

typedef struct { ...} AccountGroup;
typedef struct { ...} GNCPriceDB;
typedef struct { ...} GNCPhoneBook;

AccountGroup *topgrp = xaccAccountGroupNew();
GNCPhoneBook *fones = gncPhoneBookNew();

gnc_insert_global ("accounts", (void *) topgrp);
gnc_insert_global ("phones", (void *) fones);

topgrp = gnc_fetch_global ("accounts");
xaccAccountGroupForEach (topgrp, xxxx);

fones = gnc_fetch_global ("phones");
char * fonenumber = gncPhoneQuery (fones, "Jimmy Joe Bob");

where each of these lines of code could occur in distant parts of the
code.  Thus, you don't need to hard-code a global anywhere, you 
can dynamically look it up, if you know its name.  Kind of like
guids/entities, but with well-known human-readable names, instead 
of assigned numbers.
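
To make that concrete, here is a minimal sketch of how such a pair of
routines might be implemented.  It assumes a small fixed-size table and
a linear search; the table, its size limit and the replace-on-duplicate
behaviour are my assumptions for illustration, not anything that exists
in the engine today.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define GNC_MAX_GLOBALS 32   /* hypothetical limit, for the sketch only */

/* One slot per named pointer. */
static struct { const char *name; void *data; } gnc_globals[GNC_MAX_GLOBALS];
static size_t gnc_num_globals = 0;

/* Store 'data' under 'name', replacing any earlier entry with that name.
 * The name string is not copied; the caller must keep it alive. */
void gnc_insert_global (const char *name, void *data)
{
  size_t i;
  assert (name != NULL);
  for (i = 0; i < gnc_num_globals; i++)
  {
    if (strcmp (gnc_globals[i].name, name) == 0)
    {
      gnc_globals[i].data = data;
      return;
    }
  }
  assert (gnc_num_globals < GNC_MAX_GLOBALS);
  gnc_globals[gnc_num_globals].name = name;
  gnc_globals[gnc_num_globals].data = data;
  gnc_num_globals++;
}

/* Return the pointer stored under 'name', or NULL if there is none. */
void * gnc_fetch_global (const char *name)
{
  size_t i;
  assert (name != NULL);
  for (i = 0; i < gnc_num_globals; i++)
  {
    if (strcmp (gnc_globals[i].name, name) == 0)
      return gnc_globals[i].data;
  }
  return NULL;
}
```

A real version would more likely sit on top of a hash table, but the
interface is the point here, not the container.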

> > > To give a better idea of what I mean, what I want to do is provide a
> > > registration where the core object C-code can register itself with the
> > > engine, so the engine knows that it exists.  There is a structure:
> > > 
> > > 	GncBusinessObject {
> > > 		int version;
> > > 		char * module_name;
> > > 		char * description;
> > > 		void (*destroy) ();
> > > 		GList * (*get_list) ();
> > > 		...
> > > 	}
> > 
> > Well, at this level of description, how is a 'business object' different
> > than just another module?  Why not just use a module to accomplish this?
> 
> I am using a module to accomplish this.  That is the whole point.  The
> problem is that there is no generic hooks into the GNCSession to store
> the GNCEntityTable for each object-type; there is no hook in the
> GNCBook to store the object tables (list of existing Customers,
> Vendors, Invoices, etc); there is no hook in the Backend structure to
> load or save these objects; there is no hook in the Query structure to
> search these objects.

Right.  We'll have to take these one at a time.  Let's start with

>  there is no hook in the
> GNCBook to store the object tables 

You have three choices.

A) edit gnc-book-p.h and hard-code GNCPhoneBook into the struct 
   gnc_book_struct.  Also hard-code GNCInvoice, and GNCCalendar, and
   GNCJob, GNCOrder, GNCWhatever in there as well.   It's quick, it's
   dirty.  It's a very very very simple solution to the problem, 
   but it *does work*.  KISS.  

B) Create the routines

   void gnc_insert_global (const char * name, void * data);
   void * gnc_fetch_global (const char * name);

   To be more technically correct, they should look more like

   void gnc_book_insert_global (GNCBook *, const char * name, void * data);
   void * gnc_book_fetch_global (GNCBook *, const char * name);

   This also works, although you have to have some hack-jobbery:
   you need to somehow let something know that a session is ending,
   that a book is closing.

C) Create the struct

   GncDataObject {
      char * name;
      void * data;
      void (*session_is_ending) ();
      void (*destroy) ();
   }

   and the routines 

   void gnc_book_insert_global (GNCBook *, GncDataObject *);
   GncDataObject * gnc_book_fetch_global (GNCBook *, const char * name);

   (this is an example, the actual details may vary.  This is
   intentionally similar to your struct GncBusinessObject.  That is,
   I think we are beginning to speak on the same wavelength.  But it's
   different because I have no clue why you need a version, a
   description, or a GList * get_list().)
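
Proposal C could be sketched roughly as follows.  Everything beyond the
two routine names and the struct fields above is an assumption made for
the sake of the example: the GNCBook here is a stand-in (the real one
lives in gnc-book-p.h), the callbacks are given a (void *data) argument,
and gnc_book_session_ending() and example_session_is_ending() are
hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_OBJECTS 16        /* hypothetical limit, for the sketch only */

typedef struct {
  const char *name;
  void *data;
  void (*session_is_ending) (void *data);   /* assumed signature */
  void (*destroy) (void *data);             /* assumed signature */
} GncDataObject;

/* Stand-in for the real GNCBook; holds only the registered objects. */
typedef struct {
  GncDataObject *objects[MAX_OBJECTS];
  size_t num_objects;
} GNCBook;

/* Hang a named object off the book. */
void gnc_book_insert_global (GNCBook *book, GncDataObject *obj)
{
  assert (book && obj && obj->name);
  assert (book->num_objects < MAX_OBJECTS);
  book->objects[book->num_objects++] = obj;
}

/* Find a registered object by name, or return NULL. */
GncDataObject * gnc_book_fetch_global (GNCBook *book, const char *name)
{
  size_t i;
  assert (book && name);
  for (i = 0; i < book->num_objects; i++)
    if (strcmp (book->objects[i]->name, name) == 0)
      return book->objects[i];
  return NULL;
}

/* Hypothetical: what the session would call on its way out.  The
 * session walks the vector; it never needs to know the object types. */
void gnc_book_session_ending (GNCBook *book)
{
  size_t i;
  assert (book != NULL);
  for (i = 0; i < book->num_objects; i++)
    if (book->objects[i]->session_is_ending)
      book->objects[i]->session_is_ending (book->objects[i]->data);
}

/* Example callback: counts notifications in the int that 'data' points at. */
void example_session_is_ending (void *data)
{
  ++*(int *) data;
}
```

The callbacks in the struct are what remove the hack-jobbery of
proposal B: the book can tell every registered object that the session
is ending without knowing what any of them are.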

> > again, given what you've written, I still do not grasp the abstraction
> > you are trying to make.  But see below.
> 
> I'm trying to provide an abstraction that allows a gnc-module to
> provide a new first-class object and dynamically hook into the engine.
> What I'm trying to do is make it so that the engine-proper does not
> have to know about Customers, Invoices, Jobs, Orders, etc.  I'm trying
> to make it so that when the next person comes along and wants to
> create some FooBar object-type, they don't have to change any existing
> code to supply this new object type.

We are talking past each other.  I *do* understand why, and repeating
*why* doesn't make it clearer.  I just don't understand *what* the 
actual abstraction is that you are proposing.

> > > Through this mechanisms, new object types can be entered into the
> > > system without changing any other code.  My "search" widgets, for
> > > example, can choose ANY registered object.  If I create a new object
> > > and register it, viola, I can now search it without any additional
> > > code changes!
> > 
> > ? I can't begin to understand this claim.  Say I want to search unpaid 
> > invoices by date, or by amount, or by number of days left until its
> > due, and sort the results by the due date.  
> 
> This is indeed the path I'd like to take.  Right now I only provide
> a single search query: hidden or not-hidden.  This is the "get_list"
> method.  The actual method is:
> 	GList * (*get_list) (Session, show_all);

OK, now we are getting somewhere.  Except that the above isn't valid C
code, so I am a bit confused still.

1) session is the *wrong* place to hang data.  GNCBook is the right
   place.

2) 'show all' is dangerous: it doesn't scale.  If I have a million 
   customer names, I certainly would almost never actually do a 
   'show all'.

3) What does the GList data point at?  The following code doesn't
   work for me:

   GList * mylist = bizniz->get_list (...);
   for (node=mylist; node; node=node->next)
   {
      GenericSearchWidgetDisplay (node->data);   // huh ????
   }

What's data? It's a void *. What do I cast it to?  How can I display 
it in your "generic search widget" ?? How does your generic widget
discover that I returned a list of phone numbers ???
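
To make point 3 concrete: for a generic display loop to work at all,
the list would have to carry some description of what its elements are.
One way that might look, purely as a sketch: the list comes with a
per-type formatting callback.  TypedList, Node, PhoneEntry and both
functions are hypothetical names, and Node is only a stand-in for GList
so that the example does not depend on GLib.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in for GList, to keep the sketch free of GLib. */
typedef struct Node { void *data; struct Node *next; } Node;

/* Hypothetical: a list result that carries its own element formatter. */
typedef struct {
  const char *type_name;
  Node *items;
  /* Writes a one-line description of 'item' into buf; returns the
   * number of characters the full description needs (as snprintf). */
  int (*print_item) (const void *item, char *buf, size_t buflen);
} TypedList;

typedef struct { const char *name; const char *number; } PhoneEntry;

/* Formatter for phone book entries. */
int phone_print_item (const void *item, char *buf, size_t buflen)
{
  const PhoneEntry *e = item;
  return snprintf (buf, buflen, "%s: %s", e->name, e->number);
}

/* Generic display loop: no cast needed here, because the list brings
 * its own formatter.  Returns the number of rows written to 'out'. */
size_t generic_display (const TypedList *list, FILE *out)
{
  char line[128];
  size_t rows = 0;
  const Node *node;
  assert (list && list->print_item && out);
  for (node = list->items; node; node = node->next)
  {
    list->print_item (node->data, line, sizeof line);
    fprintf (out, "[%s] %s\n", list->type_name, line);
    rows++;
  }
  return rows;
}
```

Without something along these lines, the widget has nothing but a
void * to go on.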

> Ideally this should be more like:
> 	GList * (*query) (Session, Query);


Maybe more like 
    GList * (*query) (GNCBook *, GNCQuery *);

> > Maybe you could acheive this by extending src/engine/Query.c ...
> 
> The point is that I don't want to change any code in src/engine/*

Well, somehow, you have to have the ability to say "give me phone
numbers sorted by area code" with GNCQuery.  If you can't express
or represent the kinds of queries a human would be interested in, 
then I don't know what the point is.

> Rather, I want to change the code in src/engine once and then be able
> to extend it through plug-ins.  

Yes,  I understand what you want to do, I just don't understand how you
think you'll do it.

> That way, as I implement new data
> types (right now I'm working on Invoices),

This is very off-topic, but, for me, an 'invoice' is a kind of report
that you would have in the gnc report infrastructure.   A kind of
variation on the 'show transactions' report.  I don't understand how an
invoice is a 'core object'.  But maybe we should save this conversation
for later ?? 


> [ mib idea snipped ]
> 
> I don't quite want to go _this_ far, but I'd certainly like to go
> somewhere in this direction.  What I mean is that I think you can
> abstract out many of the Query topics.  For example, you can search by:
> 
> 	start date
> 	end date	(which implies a finished flag)
> 	name, id
> 
> Some objects have specific searches.  Orders, for example, have
> searches like:
> 
> 	type
> 	job/vendor
> 	invoiced
> 
> Invoices can be searched for:

I think you mean 'accounts payable' not 'invoices'.

> 	type
> 	customer/vendor
> 	due/duedate
> 	paid	

Right. Very good.  We agree. These would be a good set of Query topics.  
GNCQuery already implements a decent subset of them.  Why not add the
rest?  Or rather, why not add the rest by hard-coding them in?  
Do we need to invent yet another extension mechanism so that we can 
add new query types ?  Inventing extension mechanisms is hard.
Understanding someone else's extension mechanism is even harder.


> > Ugh.  After a long discussion with dave, he put the guid tables in the
> > session, rather than making them global, because this allowed a copy
> > operation.   I don't quite remember why he picked session instead of
> > book.  Again, something to do with the copy operation.
> > 
> > Dave?  Can you remember why?  Can I get you to add something to the docs
> > that says 'the reason that entity tables are in session not book is
> > because xyz ...'
> 
> Part of the reason why was for the RPC Server -- each rpc client is in
> a different session, so you wanted the entity table to exist within
> the session, which is tied to the rpc client thread.

I don't understand this statement either.  If I have three clients
and they are manipulating the same book inside the server, then all
three of them should see the same identical set of guids.  If I have 
one book inside the server, and three distinct sets of guids referring
to the contents of the book ... ouch, makes my head hurt.  Surely
this cannot be what you mean.  There must be some other reason, but
I am growing dubious ....

> > You are assuming that you now know enough about what core objects are
> > like to be able to make the correct abstraction.  I am not so sure.
> > So far, the engine has two core objects: the account tree, and the
> > price table.  They are quite different.  Looking at them, I cannot
> > really deduce a pattern.  
> 
> There is most certainly a pattern, at least at a basic level:
> 
> 	initState()
> 	create()
> 	destroy()
> 	query()
> 	lookupByGUID()
> 	printName()
> 	printType()
> 	loadFromURL()
> 	beginEdit()
> 	commitEdit()
> 	finalize()

Hmm. Well, ahhh, uhhh... I'm not sure I want to ... begrudgingly ... 

OK, here's one: how do I do a 'table join' query: i.e. 
select all phone numbers where transaction amount > y dollars.

If I am given only one beginEdit()/commitEdit() pair, then 
I am forced to separate out Accounts from Transactions.  And there
are very real-world table-join queries involved:

select all accounts where split-memo='some string'.


> > a business object would call a vector.  When a business-object backend
> > is initialized, it fills in the vector table.
> 
> Who initializes this business-object backend, and how?

This is based on the URL. If the URL is postgres://asdf.com then
the postgres backend is loaded.
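
In outline, the selection is nothing more than a match on the URL
scheme.  A rough sketch, not the actual session code: the table, the
"file://" entry and the function name backend_for_url are assumptions
for illustration, and a real version would load and initialize the
backend rather than return its name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
  const char *scheme;        /* URL prefix, e.g. "postgres://" */
  const char *backend_name;  /* hypothetical label for the backend */
} BackendEntry;

static const BackendEntry backend_table[] = {
  { "postgres://", "postgres" },
  { "file://",     "file"     },
};

/* Return the name of the backend that handles 'url', or NULL if no
 * scheme in the table matches. */
const char * backend_for_url (const char *url)
{
  size_t i;
  assert (url != NULL);
  for (i = 0; i < sizeof backend_table / sizeof backend_table[0]; i++)
  {
    size_t len = strlen (backend_table[i].scheme);
    if (strncmp (url, backend_table[i].scheme, len) == 0)
      return backend_table[i].backend_name;
  }
  return NULL;
}
```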

> > > Who calls "save"?
> > 
> > gnc-session does.
> 
> Right, and how does gnc-session:
> 	a) know that business objects exist, and
> 	b) know what business-object save function(s) to call?

Well, with proposal A), it's hard coded. With proposal B) it's some hack.
With proposal C) it's a vector in the struct.

> Right.  I propose that each business-object-type register this
> "save_function" 

Please stop calling it 'save'.  Save is a meaningless concept for 
the sql backend.  Save is peculiar to the file backend.

> > So maybe I answered my own question?  A generic core-object is a
...

> I think a generic core-object is slightly more than this (see my
> method-list above).  But yes, this is what I'm getting at..

OK, like I said, violent agreement.  But now, the next level:
how much more?  You might talk me into init, destroy, finalize,
and a few more.   But maybe not all of them.  This is where the 
rubber hits the road.  What really is the valid set of generic 
methods?  I think we now need to start getting specific about 
details.

> > People already complain that gnucash is too complex ...
> 
> Really?  Who?  

OK, me.  Gnucash 1.6.0 had 200 KLOC spread over 455 source code
files.  Printed out on paper, this is thicker than Tolstoy's War &
Peace.  

There are days when I wonder: do we really need that many lines of code
to do this?  Isn't there some way of doing more with less?  Isn't the
newbie contributor to GnuCash going to be turned off by this?

> > Next, the business backend will presumably have ...
> Yea..  I know.  This is another reason I'd like to vectorize the

I'm tempted to defer discussion of the generic backend until we have
a better idea of what the generic object is.  In particular, I really
want to know whether the tangle of account/transaction/split maps
to one object or two.  I sense vague arguments in either direction.

> > Gack. My view of the world is not split like this.  I view
> > accounts/transacations/splits to all be tangled up in one and the same
> > core object, the 'account heirarchy'.   You can't really untangle them.
> 
> But you can, and you have to.  There is a "list of accounts" and there
> is a "list of transactions" and there is a "list of splits".  These
> are three object types that have pointers back and forth between them.
> However, they are still three different objects.  They each have their
> own XML structure.  They each have their own SQL tables.

You seem to be saying that the mapping between core-object and sql-table
is 1-1. This has several problems:

-- designing a query interface that allows queries that join tables.
-- although accounts and transactions have begin/end commit pairs,
   splits do not.  One cannot commit a split all by itself, because
   that would break the constraint that the sum of splits totals zero.
   So having a commit() method in the core object would be
   confusing/misleading for splits. And, on the flip side, the
   transaction commit() needs to lock both the transaction *and*
   the split table; locking one is not enough.
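
The split constraint can be shown in a few lines.  This is only a
sketch under simplifying assumptions: Split and Transaction here are
cut-down stand-ins for the engine structs, amounts are plain integers
in the smallest currency unit, and transaction_commit is a hypothetical
name.  The point is that the check can only be made at the level of the
transaction, never on a single split.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-ins for the engine structs. */
typedef struct { long amount; } Split;   /* smallest currency unit (assumption) */

typedef struct {
  Split *splits;
  size_t num_splits;
  int committed;
} Transaction;

/* Commit only if the splits balance.  Returns 1 and marks the
 * transaction committed when the amounts sum to zero; otherwise
 * returns 0 and leaves the transaction untouched. */
int transaction_commit (Transaction *trans)
{
  long total = 0;
  size_t i;
  assert (trans != NULL);
  for (i = 0; i < trans->num_splits; i++)
    total += trans->splits[i].amount;
  if (total != 0)
    return 0;
  trans->committed = 1;
  return 1;
}
```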

I am somewhat concerned that there will be other subtle couplings of this
sort between core objects.  For example, accounts payable groups
together splits with different dates. (viz. "this check drawn on
this bank today pays for that bill received 30 days ago.")  After
this coupling has been made, you can't just change one of the member
transactions without throwing things out of kilter.  You'd need
to add a balancing transaction.   So you're not just updating one
sql table, you are updating several, and you have to update them
together, atomically, i.e. use locks.  

So, if the mapping between sql tables and core objects is 1-1,
then it doesn't make sense (to me) to have commit() and query() be a
part of the core object.   But if the mapping is not 1-1, then 
having only one set of commit() might also not make sense.

--linas


-- 
pub  1024D/01045933 2001-02-01 Linas Vepstas (Labas!) <linas@linas.org>
PGP Key fingerprint = 8305 2521 6000 0B5E 8984  3F54 64A9 9A82 0104 5933