GncBusiness v. GNCSession

Sun, 25 Nov 2001 22:51:56 -0600

On Sun, Nov 25, 2001 at 07:12:26PM -0500, Derek Atkins was heard to remark:
> linas@linas.org (Linas Vepstas) writes:
> 
> > > If we could always assume a SQL backend life would be much easier :)
> > 
> > No it wouldn't, for the six reasons I described in the previous note.
> >
> I was just commenting that if I could assume that all
> backends can execute a SQL query then I can, as I've shown you how in
> previous mail, build up the SQL Query String piecemeal in an extremely
> extensible way.

I think we are talking past each other again, since I thought we'd
already gone over this territory.   Lets assume we are talking about
a hypothetical GNCNewQuery, which may or may not be built on or dervied
from the current Query.

1) Having an "SQL-like" interface to GNCNewQuery would be an interesting
   idea, not without utility.  Emphasis on "LIKE" in SQL-like.  Similar, 
   not the same.

2) The "SQL-like" interface would be a string interface, possibly
   with queries resembling the following:

  "SELECT Transaction WHERE (Transaction.memo='asdf' AND
                    Transaction.date<"2001-07-04 12:00:00") OR 
                    Tranasaction.cleared='N';"

   I am open to scheme-form-style queries too:

   (select transaction (or (and (= Transaction.memo 'asdf') 
                                (< Transaction.date "2001-07-04 12:00:00"))
                           (= Tranasaction.cleared 'N')))

   Either style works for me.

   (Back in the gnucash-1.5.x days, I had an XML interface to query
   but that was drowned in the churn, much to my dismay.)

3) It can't be exactly SQL, for reasons listed in the earlier email.
   These reasons include:
   3a) Some things that can be queried in the engine, such as the 
       bank balance, are difficult or imposible to formulate as
       SQL queries.  Therefore, one must either limit the application
       and not allow balance queries (that's bad), or one must be
       able to intercept the query and reshape it or worse, instead
       of passing it straight through to the database.

   3b) SQL has not been standardized.  There are many dialects.
       Its easy to create non-portable SQL.

   3c) Straight-through SQL prevents the development of interesting
       future backends, such as OO db's, or frame-slot db's or other
       interesting techologies.

4) GNCNewQuery must provide a C-language 'parser' or other utility
   that allows the query to be converted to some C representation.
   This is so that I could use C code to take apart queries that 
   involve bank balances and reformulate them as needed. 

   4a) I think this means that an interface somewhat like the current
       Query is unavoidable.

> My frustration is that how a query is executed for the XML/File
> backend is completely different than how it is executed for the
> postgres backend.  

This statement is erroneous.  Both backends use *exactly* the
same execution process.  There is absolutely no difference.

> In the Postgres backend you convert the GncQuery
> into a char* and then pass the char* into the Postgres database.
>
> In other words, part 'b', for the Postgres backend, can be broken
> up into two parts:
> 
> 	b1) parse the GNCQuery into a char*
> 	b2) pass the char* into the Postgres engine

But that is not all.  

b3) push returned data into engine.
b4) look for splits that have dangling balance info.
b5) create a new char * that fills out the missing info.
b6) go to b2), repeat until the all checkpoints are filled out.
b7) Let the engine query code perform the actual scan over 
    the engine data.  

This last part (b7) is *identical* to how it would work with 
an XML file, and the returned results are *identical* to what 
the XML backend would have yeilded.

Steps b1)through b6) happen in the postgres backend.  They are
particular to the design of this backend. They would be different
if we were using MySQL, because MySQL does not support some of the
constructs we use in step b5).  Alternately, someone might invent a
better way of handling account balances.

Only b7) is done in the engine proper.

It may seem 'wasteful' to run the query twice: once in the database,
and a second time locally in the engine.  But its unavoidable for 
a variety of reasons, including the need to keep transactions 
balanced (so that the ledger works right), and the need to have
enough data present to correctly compute account balances.

Note also: due to the caching structure, there are situations
where *no* queries are sent to postgres.

> Part b1 can be done in a distributed manner.  You don't necessarily
> need to the GNCQuery parsing routine to exist in one body of code; you
> can register new GNCQuery parsing routines at run-time.  

I don't understand what you are trying to say.  Technically, 
step b1) is not a parse, its a traverse.  One parses strings. 
One traverses trees.  The current GNCQuery is a tree, not a string.

Note also that b1) is not the hard part.  Steps b3) through b6) 
are, because trnasactions have to be complete balanced, and 
be 'up to date' (in case remote users have edited them).  b5)
is tricky because it traverses a graph with lots of loops in it.
b6) is tricky because its a kind of 'closure': one must keep 
repeating until no more data is returned from postgres.

--linas

-- 
pub  1024D/01045933 2001-02-01 Linas Vepstas (Labas!) <linas@linas.org>
PGP Key fingerprint = 8305 2521 6000 0B5E 8984  3F54 64A9 9A82 0104 5933