python GnuCash interface to SQL backend

Fri Nov 14 18:18:51 EST 2014

On Friday, November 14, 2014, John Ralls <jralls at ceridwen.us> wrote:

>
> On Nov 14, 2014, at 4:28 AM, Sébastien de Menten <sdementen at gmail.com
> <javascript:_e(%7B%7D,'cvml','sdementen at gmail.com');>> wrote:
>
>
> In terms of the implementation itself of the object model, the main things
> I see not that clean are:
> - KVP vs field representation
> - XXX_denom / XXX_num split (instead of using a Decimal type)
> But these two points should be hidden for the user/developper in python
> bindings (official or piecash).
>
>
> I experimented with the available decimal libraries over the summer.
> They’re mostly too slow and they don’t afford enough control over rounding
> to be sufficiently accurate for accounting use.
>
> The object model as it stands now has too much interdependence between
> classes, especially the transaction, split, account, and commodity classes.
> The implementation has a lot of
>
>
When looking at the SQL/XML document, what would
be the unneeded interdependencies ? The links between
account->split->transaction looks meaningful to me. The links
transaction->currency and account->commodity also.
Is it the issues with the scu/denom/num that is cumbersome to handle (if
there is a change in the scu of a currency, all splits related to it should
be updated) ?

> Could you point me to 2 or 3 real case example we would nevertheless end
> up corrupting the database ? It would really help me to materialise this
> "risk".
> With my practical experience with piecash (creating a full account tree
> structure and importing thousands of transactions from csv file), I haven't
> found any case of corruption (except when developing the API and having
> "obvious" bugs).
>
>
> I can’t provide concrete examples without doing an extensive code review
> of piecash, for which I have neither the time nor the inclination. Some
> obvious trouble spots include cross-commodity transactions, especially
> involving lots or trading accounts.
>

> Have you tested with bad data to see if piecash rejects it? Did you
> thoroughly analyze the ways that bad data could be created and ensure that
> you have test cases proving that piecash rejects all of them?
>
> Well, at first, I wanted to have a simple way to extract data out of
GnuCash (a read-only mode). This was easy to do with SQL alchemy and is
safe.
Afterwards, I wanted to be able to modify a GnuCash book knowing what I was
doing (ie being cautious to not create inconsistent objects. This was also
ok (as said, I have used piecash to create automatically a complex account
structure from an excel file and to import thousands of records without
errors).
Next is the ability to ensure consistency checks/error detection when the
user does changes. This is the less mature part and I hear well your
warnings about the complexity of this part.

> For piecash, there is no need for a 3N db schema and there is no need to
> write any logic in SQL !
> It is through SQLAlchemy that we can handle/encapsulate the extra bit of
> logic (which is, as far as I am in the development of piecash and given
> piecash scope, rather limited). Nothing is done in SQL itself.
>
>
> I have trouble believing that an ORM will generate a correct
> implementation with a non-normal schema.
>
> In piecash, the ORM does not generate the SQL schema as it already exists
(it is the one defined by GnuCash). It just maps the existing schema (that
needs to be redescribed in python) to python objects transparently handling
type conversion, transactions, guids, relationships, etc

>
> The table schema is exactly the only piece of code that needs to be
> written (piecash is in fact just that, table schemas with some metadata).
> I do not hope/expect/think to generate this code automatically from C
> headers (as SWIG does).
>
>
> You’re not being consistent. Are you using SQLAlchemy as an ORM or simply
> as a SQL abstraction layer? You said above that nothing is done in SQL, and
> earlier that you are doing consistency checks, but here you’re saying that
> all you’ve written is a table-schema, which would imply that you’re relying
> on the database to do all of the work.
>

>
> I am using sqlalchemy as a mapper and as a session manager (unit of work
pattern). Nothing is done in SQL (meaning I do not write SQL queries nor
SQL stored procedures).
A cascade delete constrain (if I remove a transaction, all related splits
are automatically deleted) is also managed by SA.
So I am "essentially" writing a mapping and then adding some methods to
either create objects in a consistent way (like having a function
"create_transaction(from_acc, to_acc, amount, date, description)" that
create always correct objects) or to check objects are consistent before
commit (this is not yet done)

>
> You’ll get closer working with the XML schema. There’s a reasonably
>> up-to-date version in src/doc/xml/gnucash-v2.rnc.
>>
>>
> As written here above, XML is not easy to work with (at least for me) and
> does not save me of anything you said before (corruption, etc).
>
>
> Sounds like a learning opportunity.
>

But will this save me from all the data corruption issues you mention ? If
not, I prefer working with python objects than XML. If so, how is this done
?

The object model is the class hierarchy, with its member variables and
> functions, expressed in C and Scheme. In a proper database-based
> application there would be a close mapping between the two, but GnuCash
> isn’t yet one of those. In a proper OO design, everything that directly
> affects an object’s state would be encapsulated in the cmember functions of
> the object’s class. GnuCash isn’t one of those either: State-changing code
> is spread throughout the program.
>
> If we take a document centric view instead of an application centric view,
the only important element is the data (and not the implementation of the
engine/class hierarchy/...).
As long as another program changes the data while keeping the
GnuCash schema and related invariants, we do not really care about the
program being OO or not, with encapsulation or not,... In fact, we do not
need the full power of the real GnuCash engine to manipulate (for simple
cases) the data.

I understand from your comment that the status of the GnuCash code is not
optimal and you're working on a clean rewrite in C++. This can be an extra
argument to try to have python bindings not linked to the GnuCash C code
but only relying on the GnuCash data.

Kind regards
Sebastien