GncBusiness v. GNCSession

Derek Atkins warlord@MIT.EDU
21 Nov 2001 11:36:29 -0500


linas@linas.org (Linas Vepstas) writes:

> > Right now, all the engine is stored as one big glop of code.  I'd
> > like to modularize it a bit more, 
> 
> My first reaction is that I don't understand why you feel you need 
> to do this.  But see below.

Basically I'd like to be able to extend the engine, not just
in terms of new algorithms, but new (saved) data types.  As
new data types are created, I'd like to be able to "graft" them
onto the engine.  This lets third parties insert new data types
into Gnucash without replacing the engine library.

Having a standard way to hook new object types into the engine and
backend would allow third parties to extend the Gnucash engine.

> > in order to add more core objects
> > to the engine in a dynamic manner.  By "core object" I mean a basic
> > accounting object, like an Account, a Customer, an Invoice...
> 
> That's too vague for me. I cannot grasp the generalization you are 
> proposing by reading this sentence.

Currently the engine does not support Invoices.  Right now, what
do I have to do to get the engine to support Invoices?  (This is
a rhetorical question).  Now answer that same question for:
	Customers
	Vendors
	Jobs
	Employees
	Orders
	Inventory Management
	...

Instead of requiring changes to the engine for each of these new
first-class objects, it would be better to have an extension mechanism
so that these objects can be provided by a gnc-module.  Moreover, the
engine SHOULD NOT NEED TO BE CHANGED to support load/save operations
of these new objects.  In other words, I'd like to see the load/save
operations be part of the gnc-module that provides these objects.

> > Yes it does.  Take a look at gnc-module.c
> 
> ? I don't understand. gnc-module doesn't seem to hold anything global.
> Disclaimer:  I have not looked at, studied, or have overheard a
> description of the module system.  So maybe I'm missing something.
> But skimming the code, I don't see any place where I can dangle
> one or more global structs.  There's no hash table, no insert/delete
> methods, no lookup/find methods. 

This would be easier if you actually looked at the code I was referencing
before you responded to me.  From the top of the gnc-module.c file:

static GHashTable * loaded_modules = NULL;
static GList      * module_info = NULL;

typedef struct 
{
  lt_dlhandle   handle;
  gchar         * filename;
  int           load_count;
  GNCModuleInfo * info;  
  int           (* init_func)(int refcount);
} GNCLoadedModule;

This sure looks like static, global data to me.  There most certainly
is a hash table.  There most certainly are insert/delete methods....
I'll concede that there isn't a lookup method, but I haven't actually
looked for one.

> > To give a better idea of what I mean, what I want to do is provide a
> > registration where the core object C-code can register itself with the
> > engine, so the engine knows that it exists.  There is a structure:
> > 
> > 	GncBusinessObject {
> > 		int version;
> > 		char * module_name;
> > 		char * description;
> > 		void (*destroy) ();
> > 		GList * (*get_list) ();
> > 		...
> > 	}
> 
> Well, at this level of description, how is a 'business object' different
> than just another module?  Why not just use a module to accomplish this?

I am using a module to accomplish this.  That is the whole point.  The
problem is that there are no generic hooks into the GNCSession to store
the GNCEntityTable for each object-type; there is no hook in the
GNCBook to store the object tables (list of existing Customers,
Vendors, Invoices, etc); there is no hook in the Backend structure to
load or save these objects; there is no hook in the Query structure to
search these objects.

> again, given what you've written, I still do not grasp the abstraction
> you are trying to make.  But see below.

I'm trying to provide an abstraction that allows a gnc-module to
provide a new first-class object and dynamically hook into the engine.
What I'm trying to do is make it so that the engine-proper does not
have to know about Customers, Invoices, Jobs, Orders, etc.  I'm trying
to make it so that when the next person comes along and wants to
create some FooBar object-type, they don't have to change any existing
code to supply this new object type.

> > Through this mechanism, new object types can be entered into the
> > system without changing any other code.  My "search" widgets, for
> > example, can choose ANY registered object.  If I create a new object
> > and register it, voila, I can now search it without any additional
> > code changes!
> 
> ? I can't begin to understand this claim.  Say I want to search unpaid 
> invoices by date, or by amount, or by number of days left until it's
> due, and sort the results by the due date.  

This is indeed the path I'd like to take.  Right now I only provide
a single search query: hidden or not-hidden.  This is the "get_list"
method.  The actual method is:
	GList * (*get_list) (Session, show_all);

Ideally this should be more like:
	GList * (*query) (Session, Query);
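
As a rough illustration of that step, here is a minimal sketch in plain C
of how a hidden/not-hidden get_list could become one special case of a
general query.  Everything in it is an assumption made for the example:
the Session, Query and Customer structs are simplified stand-ins rather
than the real engine types, and a caller-supplied array replaces GList so
that the sketch has no dependencies.

```c
#include <stddef.h>

/* Hypothetical stand-ins; the real Session and Query types differ. */
typedef struct {
    const char *name;
    int         hidden;   /* nonzero if hidden from default listings */
} Customer;

typedef struct {
    Customer *customers;
    size_t    count;
} Session;

/* In this sketch a Query is just a predicate plus its argument. */
typedef struct {
    int       (*match)(const Customer *c, const void *arg);
    const void *arg;
} Query;

/* Copy pointers to matching customers into out[]; return the match count. */
size_t customer_query(const Session *s, const Query *q,
                      const Customer **out, size_t max_out)
{
    size_t n = 0;
    for (size_t i = 0; i < s->count && n < max_out; i++) {
        if (q->match(&s->customers[i], q->arg))
            out[n++] = &s->customers[i];
    }
    return n;
}

/* The existing hidden/not-hidden search, expressed as one predicate. */
int match_show_all(const Customer *c, const void *arg)
{
    int show_all = *(const int *)arg;
    return show_all || !c->hidden;
}

/* get_list(Session, show_all) becomes a thin wrapper over query(). */
size_t customer_get_list(const Session *s, int show_all,
                         const Customer **out, size_t max_out)
{
    Query q = { match_show_all, &show_all };
    return customer_query(s, &q, out, max_out);
}
```

Searches such as "unpaid invoices by due date" would then be further
predicates handed to the same query entry point, with no new method added
per search.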

> Maybe you could achieve this by extending src/engine/Query.c ...

The point is that I don't want to change any code in src/engine/*

Rather, I want to change the code in src/engine once and then be able
to extend it through plug-ins.  That way, as I implement new data
types (right now I'm working on Invoices), I don't have to go back and
change code in src/engine/* yet again to add Invoices.  I should just
be able to "plug in" invoices into the engine and have it work.

[ mib idea snipped ]

I don't quite want to go _this_ far, but I'd certainly like to go
somewhere in this direction.  What I mean is that I think you can
abstract out many of the Query topics.  For example, you can search by:

	start date
	end date	(which implies a finished flag)
	name, id

Some objects have specific searches.  Orders, for example, have
searches like:

	type
	job/vendor
	invoiced

Invoices can be searched for:

	type
	customer/vendor
	due/duedate
	paid	

> > An entity is an instance of an object.  
> 
> I think using language like this obscures the true meaning.  An entity
> is a 128-bit universally unique identifier.  A unique name for the
> an instance. 

Well, no.  The GUID is the Globally Unique ID of the entity.  The GUID
has no other meaning by itself; it is a reference to the entity.  The
entity is the actual Account* or Customer* or Order* data structure;
the GUID is just a globally-unique name/reference to the data.

[ pointers elided ]

Of course you use pointers locally.  I'm trying to work on local
issues as well as data-store issues.  If we didn't have data store
issues there would be no need for GUIDs in the first place ;)

> Ugh.  After a long discussion with dave, he put the guid tables in the
> session, rather than making them global, because this allowed a copy
> operation.   I don't quite remember why he picked session instead of
> book.  Again, something to do with the copy operation.
> 
> Dave?  Can you remember why?  Can I get you to add something to the docs
> that says 'the reason that entity tables are in session not book is
> because xyz ...'

Part of the reason why was for the RPC Server -- each rpc client is in
a different session, so you wanted the entity table to exist within
the session, which is tied to the rpc client thread.

Global data is bad, when it refers to actual data.  Global data is
good, when it refers to executable code. :)

> You are assuming that you now know enough about what core objects are
> like to be able to make the correct abstraction.  I am not so sure.
> So far, the engine has two core objects: the account tree, and the
> price table.  They are quite different.  Looking at them, I cannot
> really deduce a pattern.  

There is most certainly a pattern, at least at a basic level:

	initState()
	create()
	destroy()
	query()
	lookupByGUID()
	printName()
	printType()
	loadFromURL()
	beginEdit()
	commitEdit()
	finalize()

Is this everything that you need to know?  No, of course not.  But
it is certainly a basic infrastructure that allows extensibility.
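
One way to picture that pattern is as a table of function pointers that
each object type fills in.  The sketch below is only an illustration
under assumed names: ObjectMethods, FooBar and engine_touch are
hypothetical, and only a subset of the methods listed above is shown.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-type method table; fields follow the list above. */
typedef struct {
    const char  *type_name;
    void       *(*create)(void);
    void        (*destroy)(void *obj);
    const char *(*print_name)(const void *obj);
    void        (*begin_edit)(void *obj);
    void        (*commit_edit)(void *obj);
} ObjectMethods;

/* An illustrative object type supplied by a module, not by the engine. */
typedef struct {
    char name[32];
    int  editing;   /* nonzero between begin_edit and commit_edit */
    int  commits;   /* number of completed edits */
} FooBar;

void *foobar_create(void)
{
    FooBar *f = calloc(1, sizeof *f);
    if (f)
        strcpy(f->name, "unnamed");
    return f;
}

void foobar_destroy(void *obj) { free(obj); }

const char *foobar_print_name(const void *obj)
{
    return ((const FooBar *)obj)->name;
}

void foobar_begin_edit(void *obj) { ((FooBar *)obj)->editing = 1; }

void foobar_commit_edit(void *obj)
{
    FooBar *f = obj;
    if (f->editing) {
        f->editing = 0;
        f->commits++;
    }
}

const ObjectMethods foobar_methods = {
    "FooBar", foobar_create, foobar_destroy, foobar_print_name,
    foobar_begin_edit, foobar_commit_edit
};

/* Engine-side code sees only the table, never the FooBar layout. */
const char *engine_touch(const ObjectMethods *m, void *obj)
{
    m->begin_edit(obj);
    m->commit_edit(obj);
    return m->print_name(obj);
}
```

The point of the sketch is that engine_touch would work unchanged for a
Customer or an Invoice, provided each supplies its own table.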

> Don't over-engineer the thing.  Use the smallest amount of code to 
> get the job done, but no less.  Lots of code just means lots of things
> that are hard to grasp, more apis that need to be learned, to be
> understood.  Is the amount of function that they provide really 
> worth trying to learn and understand them?  

The KISS principle is extremely important.  However so is
extensibility.  One of the points of the gnc-module work is
to make Gnucash extensible.  I'm trying to do the same thing,
but at a deeper level -- I want the supported data types to
be extensible, too.

So far I've been able to add business objects with pretty much zero
changes to the core engine or core gnome UI.  However I can't load,
query, or save my data.  I'd like to fix that.  But I'd also like the
next person to be able to hook in as well so they don't have to go
through the same grief I'm going through ;)

> To put it more parochially: why should I try to learn and understand
> what a 'core object' is, when all that you seem to really need is an 
> opaque pointer in GNCBook?  One line of code, one minute of coding, 
> versus days to design a generic mechanism.  You don't have a half-dozen
> core objects (yet).  When you hit the half-dozen mark, then do the
> generic extension.

Because I need N opaque pointers in a GNCBook.  And because the next
person that comes along is going to need M more pointers.  And we're
both going to need the Backend* hooks, and we're both going to need
the Query* hooks, and...   You get the point (I hope).

> a business object would call a vector.  When a business-object backend
> is initialized, it fills in the vector table.

Who initializes this business-object backend, and how?

> > Who calls "save"?
> 
> gnc-session does.

Right, and how does gnc-session:
	a) know that business objects exist, and
	b) know what business-object save function(s) to call?

> > And how does the caller know all the various "save" functions to call
> > out to?
> 
> most of the 'save' functions are known only to the business object GUI.
> So, for example, user completes editing a customer name, clicks on the
> OK button.  The GUI then calls the gncCustomerEditCommit() method.
> The gncCustomerEditCommit() method looks at the business backend,
> to see if the customer_commit() vector is not null, and if so, calls it.
> The customer_commit() vector then scribbles to an sql database, or 
> invokes xml-rpc to some server. 

Right.  I propose that each business-object-type register this
"save_function" so that the gnc-session doesn't need to know a-priori
which business gnc-module(s) is(are) loaded.  This allows a user to
load (or not load) the various business objects and have gnucash _just
work_.
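
A minimal sketch of such a registration, assuming hypothetical names
throughout (SaveFunc, session_register_save, session_save_all and this
simplified Session are not existing engine API), might look like this.
The fixed-size array is only there to keep the example dependency-free.

```c
#include <stddef.h>

#define MAX_SAVERS 16   /* arbitrary limit for the sketch */

typedef int (*SaveFunc)(void *user_data);   /* returns 0 on success */

typedef struct {
    const char *type_name;
    SaveFunc    save;
    void       *user_data;
} SaveEntry;

/* Simplified stand-in for the session: just the list of savers. */
typedef struct {
    SaveEntry savers[MAX_SAVERS];
    size_t    count;
} Session;

/* Called by each business module when it is loaded; 0 on success. */
int session_register_save(Session *s, const char *type_name,
                          SaveFunc save, void *user_data)
{
    if (s->count >= MAX_SAVERS || save == NULL)
        return -1;
    s->savers[s->count].type_name = type_name;
    s->savers[s->count].save      = save;
    s->savers[s->count].user_data = user_data;
    s->count++;
    return 0;
}

/* The session saves whatever happens to be registered and returns
 * the number of save functions that reported failure. */
size_t session_save_all(Session *s)
{
    size_t failures = 0;
    for (size_t i = 0; i < s->count; i++) {
        if (s->savers[i].save(s->savers[i].user_data) != 0)
            failures++;
    }
    return failures;
}

/* Example module-side callback: counts how often it was invoked. */
int count_save(void *user_data)
{
    (*(int *)user_data)++;
    return 0;
}
```

With this shape the session never names Customer or Invoice; a module
that is not loaded simply never registers, and nothing else changes.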

> The business backend does have to have one public vector called 
> gnc_session_is_ending_now_so_finish_up(), so that gnc-session.c can call
> it when the user goes to exit gnucash.

Well, I would say that each object needs to have this, but yes.

> So maybe I answered my own question?  A generic core-object is a
> void * pointer to data, and a vector called
> gnc_session_is_ending_now_so_finish_up().  If you want to pretty it
> up with version number and an ascii string name, I guess that's
> harmless.

I think a generic core-object is slightly more than this (see my
method-list above).  But yes, this is what I'm getting at.

> So I can now retract the statements I make above.  If you build a real 
> simple core-object thing, that's OK.  But we really need to document 
> the reason for its existence, so that future coders understand ....
> People already complain that gnucash is too complex ...

Really?  Who?  Most of the complaints I've heard are from users
complaining about all the dependencies -- complaints about how hard it
is to get Gnucash up and running.  I've not heard complaints about the
Gnucash architecture or implementation.

> > With a registration method, you can centralize the 'save()' api but
> > have it call out to any number of "save()" methods.  At least in
> > theory.
> 
> Right. Yes.  Although save() is Another Evil Unix Thing (TM).  Another
> symptom of a file-centric view of the world.  If unix was 
> transactional, you wouldn't need a save.  Transactional commits are
> good.  Save is bad. 

Agreed.  This is why SQL==Good :)

> Well, let's look at what the business backend needs to do.  Let's assume
> the SQL backend.   First, it needs to talk to the same SQL server
> as the engine, otherwise this discussion is too weird.   So the 
> business object backend needs to ask/be told/peek at whatever the
> engine backend is using for its sql connection. 

Ok.

> Next, the business backend will presumably have weird routines in it
> that I can't guess: not just a 'save()', but also some transactional
> routines, like 'commit_changes_made_to_customer()' or maybe
> 'contact_ldap_server()', or Lord knows what.  I can't guess.  

Well, I think we can come up with fairly generic functions:
	begin_edit()
	rollback_edit()
	commit_edit()
	query()

Again, my list of generic functions is listed above...

> What do you want to do?  hard code each of these new vectors into
> the existing BackendP.h?  I suppose you can, but it makes more sense to
> me to allow core-object developers to invent their own backend as they
> wish.  If they want to recycle the sql connection, they need to steal
> that somehow; I don't care how, as long as it's not architected.

No, not at all.  I want to vectorize it all.  A vector of vectors, so
to speak.  If we can define a generic vector for a first-class object,
then we just perform a:

	object_vector = object_vector_lookup ("ObjectType-Name");

Now we have the object vector for that type of object.  Obviously the
engine would need to do this, but specific GUI code tied to the object
would not necessarily need to perform the lookup.
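
To make the lookup concrete, here is a small sketch of a registry keyed
by type name.  Only the name object_vector_lookup comes from the text
above; ObjectVector, object_vector_register and the example Invoice
vector are assumptions for illustration, and a linear array stands in
for the hash table a real implementation would more likely use.

```c
#include <stddef.h>
#include <string.h>

#define MAX_OBJECT_TYPES 32   /* arbitrary limit for the sketch */

/* Hypothetical generic vector for a first-class object type. */
typedef struct {
    int         version;
    const char *type_name;
    const char *description;
    void      (*commit_edit)(void *obj);
} ObjectVector;

static const ObjectVector *registry[MAX_OBJECT_TYPES];
static size_t registry_count;

/* Return the vector registered under type_name, or NULL if none. */
const ObjectVector *object_vector_lookup(const char *type_name)
{
    for (size_t i = 0; i < registry_count; i++) {
        if (strcmp(registry[i]->type_name, type_name) == 0)
            return registry[i];
    }
    return NULL;
}

/* Register a vector; returns 0 on success, -1 if the vector is
 * invalid, the table is full, or the name is already taken. */
int object_vector_register(const ObjectVector *v)
{
    if (v == NULL || v->type_name == NULL)
        return -1;
    if (registry_count >= MAX_OBJECT_TYPES)
        return -1;
    if (object_vector_lookup(v->type_name) != NULL)
        return -1;
    registry[registry_count++] = v;
    return 0;
}

/* Example vector as a module might supply it. */
void invoice_commit_edit(void *obj) { (*(int *)obj)++; }

const ObjectVector invoice_vector = {
    1, "Invoice", "Customer invoices", invoice_commit_edit
};
```

A module would call object_vector_register once at load time; after that
the engine can reach the type's functions by name alone.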

> > If the backends are tightly coupled via implementation, why not
> > couple them via API?
> 
> Sure, I'm kind of neutral.  Just realize, whenever you modify
> BackendP.h, you have to go and modify the corresponding code
> in PostgresBackend, RpcBackend, FileBackend, HttpBackend, etc.

Yea..  I know.  This is another reason I'd like to vectorize the
vectors.  I'd like to not have to change the Backend code directly for
every function.

Perhaps each backend can come up with a vector of functions that a
module needs to provide for the type?  I don't know exactly how it
should work.

> I couldn't help but notice that certain anonymous individuals
> were more than happy to break the http backend because they couldn't be
> bothered to propagate the changes to all the different places.
> That's the problem with changing API's -- they affect everybody. 

Yet another reason that I want to make the API extensible, so you
don't have to change code everywhere whenever you add something new.
I'd like to just be able to plug-in my new object and have everything
else _just work_.

> > Similarly, assume a File backend -- how do you store the "two backend"
> > data into one file?  Or are you presuming that the current
> > engine would store data in file 1, and the business backend would
> > store into file2?
> 
> No, I'm assuming that the engine backend and the business backend would
> be implemented in the same hunk of code, in the same 
> lib-whatever-backend.so and so there would be no confusion over these
> kinds of issues.

Uggh; that implies that third-parties can't supply new modules
without replacing the existing core modules.  It means you can't
have multiple parties supplying modules, because they will conflict.
Eww.

> > I guess my confusion is that if src/backend/* needs to be changed, why
> > not just change Backend * at the same time and provide a coherent
> > Backend interface instead of multiple interfaces to essentially the
> > same code.  
> 
> Yeah, that's OK too.  I don't think I feel strongly on this issue.
> 
> > each core object (Account's, Transactions, Splits, Customers,
> > Invoices, etc) 
> 
> Gack. My view of the world is not split like this.  I view
> accounts/transactions/splits to all be tangled up in one and the same
> core object, the 'account hierarchy'.   You can't really untangle them.

But you can, and you have to.  There is a "list of accounts" and there
is a "list of transactions" and there is a "list of splits".  These
are three object types that have pointers back and forth between them.
However, they are still three different objects.  They each have their
own XML structure.  They each have their own SQL tables.

Yes, you cannot exist without Account, Transaction, and Split objects
tied together.  But as an abstraction technique, they are separable.
Similarly, we humans cannot live without water, but you can break
water up into Hydrogen and Oxygen.

-derek

-- 
       Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
       Member, MIT Student Information Processing Board  (SIPB)
       URL: http://web.mit.edu/warlord/    PP-ASEL-IA     N1NWH
       warlord@MIT.EDU                        PGP key available