sqlite file format, anyone?

Mon Jun 23 18:46:31 CDT 2003

linas at linas.org (Linas Vepstas) writes:

> > > > "Setters" as you keep calling them are completely irrelevant.  What
> > > 
> > > OK. I thought it would make it easier to add the bus. obj.
> > 
> > Nope.
> 
> Ugh, this is certainly an exhausting conversation.  Why would you say
> that?

Well, setters solve only part of the problem.  For example, setters
won't help with references to other objects.  Also, sometimes you
need ancillary code to correct problems (for example I had to cope
with the fact that I had a bug in a copy() operation).

A raw setter can't do that.

> > > > re-write the Query* -> SQL conversion.  I already added comments
> > > OK.
> > 
> > I think this is a MAJOR piece of work!
> OK.
> 
> > Again, this is a concept, not an actual object name.  Take a look at
> > src/backend/file/io-gncxml-v2.h for an example of a plug-in API.  Take
> > a look at src/business/business-core/file/*.c for examples of how this
> > API is used.
> 
> wc *.c tells me that there is 5KLOC of code here.   That's a lot of 
> code.  That's more than half the size of the current pg backend. 

Perhaps, but the current pg backend doesn't deal with any of the
business features, so it's hard to compare.  Also, it's hard to
compare xml to sql code.

> Maybe I'm missing something, but why couldn't one implement this 
> with a 'setter' concept?  Naively, it would then cost 20-50 LOC
> per object, not 500 LOC.  Right?  
> 
> I'm looking at gnc-job-xml-v2.c.  Seems like job_dom_tree_create()
> could have been table-driven. set_string() could have been a generic
> macro.  I don't understand why job_name_handler() and job_id_handler()
> are even needed. gnc_job_end_handler() seems to be 100% generic.
> 
> I'm not trying to say that you should re-write the code.  
> I'm trying to say that with a 'setter' concept, you would provide
> a table specifying the object name, the object fields, the field types,
> and the matching XML name for each.  This table would be 5-10 lines of
> code.  Then I claim that one could write code that is 100% generic
> that would do the xml io based off this table.
> 
> I claim you can do the 'setter' concept with sql as well.  In fact,
> it's already done with the m4 code for some things (not everything). 
> I wish it wasn't done with m4.

Well, if we were using C++ then this would be a lot easier.  The
problem is that:

        void xaccFooSetName(Foo*, const char*)
and     void xaccBarSetName(Bar*, const char*)

have different interfaces, so you can't easily create a single
callback that works right unless you're willing to cast everything to
hell and gone.  As soon as you're casting you lose all semblance of
compile-time type checking.

Now, we could go and re-write everything in terms of gobject, but that
would also be a lot of work.

Sure, we could have created a function:

    gnc_xmlv2_set_string(XmlNode*, void(*setter)(gpointer, const char*), gpointer)

and iterate over each of the types...  But a) that wasn't how the 
xml backend was written, and b) it still isn't completely table-driven.
There are still way too many exceptions.
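
To make the trade-off concrete, here is a minimal sketch of that kind
of generic string setter.  Everything in it (Foo, Bar, FieldRow,
apply_field, the thunk functions, the tag names) is hypothetical and
not taken from the GnuCash tree, and it uses void* where glib would
use gpointer so that it compiles standalone.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Two unrelated, hypothetical object types, each with its own typed setter. */
typedef struct { char name[32]; } Foo;
typedef struct { char name[32]; } Bar;

static void foo_set_name(Foo *f, const char *s) { snprintf(f->name, sizeof f->name, "%s", s); }
static void bar_set_name(Bar *b, const char *s) { snprintf(b->name, sizeof b->name, "%s", s); }

/* The single signature that generic code is able to call. */
typedef void (*StringSetter)(void *obj, const char *value);

/* Thunks: the cast happens here, so the compiler can no longer
 * verify that the caller handed in an object of the right type. */
static void foo_name_thunk(void *obj, const char *v) { foo_set_name((Foo *)obj, v); }
static void bar_name_thunk(void *obj, const char *v) { bar_set_name((Bar *)obj, v); }

/* One table row per string field: XML tag plus the setter to call. */
typedef struct { const char *xml_tag; StringSetter set; } FieldRow;

static const FieldRow foo_fields[] = { { "foo:name", foo_name_thunk } };
static const FieldRow bar_fields[] = { { "bar:name", bar_name_thunk } };

/* Returns 1 if tag matched a row and its setter was called, else 0. */
static int apply_field(const FieldRow *rows, size_t n, void *obj,
                       const char *tag, const char *value)
{
    assert(rows != NULL && obj != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(rows[i].xml_tag, tag) == 0) {
            rows[i].set(obj, value);
            return 1;
        }
    }
    return 0;
}
```

Note that apply_field(foo_fields, 1, &some_bar, ...) compiles without
a warning, which is exactly the loss of type checking I mean.  And
this only covers plain string fields; references to other objects
still need hand-written code.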

Also the getters are not a 1:1 mapping to table rows.  For example,
Transaction->Splits don't map to a column.  Similarly, something like
Invoice->Is-Paid? doesn't map to a column either (mostly because it's
syntactic sugar for Invoice->Posted-Lot->Is-Closed?).

So, just iterating over the getters and setters doesn't necessarily
work -- just because you can use a relationship in a search does not
imply that relationship is stored in the datafile.

> > Each Backend needs to create its own plugin API.  I don't know what it
> 
> Ugh.  That's exactly what the m4 macros started out being.  They *were*
> the plugin API.   They didn't go far enough, and they shouldn't have
> been in m4.   As I said before, what made the postgres backend large
> was the multi-user serialization issues.  If you go back and remove
> the multi-user code, it'll be smaller than the bus-obj-xml backend.
> If you remove the 'generic' code to handle, e.g. kvp, it's smaller
> still.
> 
> I'm thinking I don't understand what you want.  I don't understand
> what this plug-in API is, or how it can possibly make it easier to 
> add new objects.  (where, by 'easier' I mean 'less per-object LOC's').

In my experience, "cp" and then "M-x replace-string" goes a long way
;) While I agree with you that in general fewer loc is better, I'm not
convinced that fewer loc is necessarily EASIER.

Basically, the plug-in API allows the Backend Core (e.g. the XML
Backend Core or the PG Backend Core) to define what needs to be
implemented to plug a new type into the core.  The needs of the XML
backend are different than the needs of the PG Backend, so they need
different APIs.
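
As a rough illustration of the concept only (this is NOT the API in
io-gncxml-v2.h; XmlTypePlugin, xml_core_register, xml_core_lookup and
the "gncJob" callback are all made-up names), a backend core could
publish a struct of callbacks and let each object type register one:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PLUGINS 16

/* What this hypothetical XML core demands from every pluggable type.
 * A PG core would define a different struct with different callbacks. */
typedef struct {
    const char *type_name;                   /* e.g. "gncJob" */
    int (*count_objects)(const void *book);  /* how many objects to write */
} XmlTypePlugin;

static const XmlTypePlugin *registry[MAX_PLUGINS];
static size_t registry_len = 0;

/* Returns 1 on success, 0 if the registry is full. */
static int xml_core_register(const XmlTypePlugin *plugin)
{
    assert(plugin != NULL && plugin->type_name != NULL);
    if (registry_len >= MAX_PLUGINS)
        return 0;
    registry[registry_len++] = plugin;
    return 1;
}

/* Returns the plugin registered under type_name, or NULL if none. */
static const XmlTypePlugin *xml_core_lookup(const char *type_name)
{
    for (size_t i = 0; i < registry_len; i++) {
        if (strcmp(registry[i]->type_name, type_name) == 0)
            return registry[i];
    }
    return NULL;
}

/* A made-up plugin for a job type; the book argument is ignored here. */
static int job_count(const void *book) { (void)book; return 0; }
static const XmlTypePlugin job_plugin = { "gncJob", job_count };
```

The core then walks the registry instead of knowing about each type,
and the per-type file supplies only what the struct asks for.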

> > I don't think it's that little work.  Maybe it is, but I really don't
> > think so.
> 
> Bull, don't call my bluff. I could get it working on libdbi tonight,
> starting after dinner finishing before midnight.  It would take one 
> more day to debug on mysql.  There are *four* different backends in
> there. The pg 'single-user' backend doesn't use events, so events don't 
> even need to be stubbed out.  Checkpoints need to be stubbed out, this
> can be done effectively by setting a very very high checkpoint value
> (e.g. 100000).  I anticipate the mysql debugging would mostly involve
> date and time handling, and this could be ugly.
> 
> I'm starting to get irritated just enough to go ahead and do this.
> What shall it be?  libdbi and MySQL?

IMHO, libdbi.  However I'd rather see you get lots and periods done
rather than this ;)

> > I would also say that the column names should map directly to the
> > parameter names.  However there are always special cases.
> 
> yeah, and the special cases are what's interesting ... 

Yea.  It's the special cases that worry me.

I suppose we could define two structures, a "storage descriptor" which
just contains the direct storage parameters, and then also the "search
descriptor" which is a bunch of ancillary (non-storage) searches that
can be computed from the object (but don't map to a column).

For example, the total tax value for an invoice line-item is computed,
not stored.  So it's in the search list but not the storage list.
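
Something like the following sketch is what I have in mind.  The
names (Entry, StorageParam, SearchParam, is_stored, get_computed) and
the fields are invented for illustration and are not the real
business-object definitions:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical invoice line-item. */
typedef struct {
    const char *description;
    double amount;
    double tax_rate;
} Entry;

typedef enum { PARAM_STRING, PARAM_DOUBLE } ParamType;

/* Storage descriptor: one row per value that really lives in a column. */
typedef struct {
    const char *column;
    ParamType type;
    size_t offset;          /* where the value sits inside Entry */
} StorageParam;

/* Search descriptor: values computed from the object, never stored. */
typedef struct {
    const char *name;
    double (*get)(const Entry *entry);
} SearchParam;

static double entry_tax_value(const Entry *e) { return e->amount * e->tax_rate; }

static const StorageParam entry_storage[] = {
    { "description", PARAM_STRING, offsetof(Entry, description) },
    { "amount",      PARAM_DOUBLE, offsetof(Entry, amount) },
    { "tax_rate",    PARAM_DOUBLE, offsetof(Entry, tax_rate) },
};

static const SearchParam entry_search[] = {
    { "tax_value", entry_tax_value },
};

/* Returns 1 if name is a stored column, else 0. */
static int is_stored(const char *name)
{
    for (size_t i = 0; i < sizeof entry_storage / sizeof entry_storage[0]; i++)
        if (strcmp(entry_storage[i].column, name) == 0)
            return 1;
    return 0;
}

/* Looks up a computed (search-only) parameter; returns 1 and fills *out on a match. */
static int get_computed(const Entry *e, const char *name, double *out)
{
    assert(e != NULL && out != NULL);
    for (size_t i = 0; i < sizeof entry_search / sizeof entry_search[0]; i++) {
        if (strcmp(entry_search[i].name, name) == 0) {
            *out = entry_search[i].get(e);
            return 1;
        }
    }
    return 0;
}
```

A backend would iterate only entry_storage when reading or writing
rows, while the query code could consult both tables.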

> --linas

-derek

-- 
       Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
       Member, MIT Student Information Processing Board  (SIPB)
       URL: http://web.mit.edu/warlord/    PP-ASEL-IA     N1NWH
       warlord at MIT.EDU                        PGP key available