python GnuCash interface to SQL backend

John Ralls jralls at ceridwen.us
Fri Nov 14 10:26:29 EST 2014


> On Nov 14, 2014, at 4:28 AM, Sébastien de Menten <sdementen at gmail.com> wrote:
> 
> First of all, thank you John for taking the time to answer to this thread !
> 
> 
> > If you see GnuCash from the (limited) perspective of an editor for a Book document (as LibreOffice Writer is an editor for a ODT document), object persistence is central. And luckily we have both a very clean and well designed object model in GnuCash as well as standard persistence formats (xml, SQL  and not an obscure binary format).
> 
> I don’t share either opinion, especially about the object model in GnuCash.
> 
> When talking about the object model, I am referring to the entities (Account, Book, Split, ...), their relation and attributes. And for most of the "core" objects (excluding company, invoice, ...) it is rather clean and efficient. Personnally, I think it could be a great basis for an open document format for this domain of application (better than QIF & co).
> 
> In terms of the implementation itself of the object model, the main things I see not that clean are:
> - KVP vs field representation 
> - XXX_denom / XXX_num split (instead of using a Decimal type)
> But these two points should be hidden for the user/developper in python bindings (official or piecash).

I experimented with the available decimal libraries over the summer. They’re mostly too slow and they don’t afford enough control over rounding to be sufficiently accurate for accounting use.

The object model as it stands now has too much interdependence between classes, especially the transaction, split, account, and commodity classes. The implementation has a lot of 

> 
> >
> > The GnuCash documentation states that the XML document can be used to get the GnuCash data in whatever other format (see "Converting XML GnuCash File" http://www.gnucash.org/docs/v2.6/C/gnucash-guide/appendixa_xmlconvert1.html <http://www.gnucash.org/docs/v2.6/C/gnucash-guide/appendixa_xmlconvert1.html>) through XLST transformation and vice-versa (so generating a new/transformed gnucash XML file from a LibreOffice spreadsheet).
> 
> No matter what the GnuCash documentation might say, only someone with a deep understanding of GnuCash could successfully create a correct GnuCash data file from a spreadsheet document using XSLT. I’m not familiar with LibreOffice’s ODS format, but I expect that it would take deep knowledge of that format to successfully extract useful data from it with XSLT as well.
> 
> I agree that creating GnuCash file through XSLT is probably clause to be a nightmare ... that is why I quickly dropped the XML route (on ODS, I just took what was written in the GnuCash doc, and indeed interacting with such document require some library).
> 
> 
> >
> > PieCash (I like your name !) aims to fulfill exactly this purpose, no more, no less. As with the XLST transforms, it allows to do CRUD operations on the objects within a GnuCash Book.
> 
> CRUD operations on a GnuCash database will corrupt it. The GnuCash schema is not normalized, and not all of the necessary data is stored in the table associated with the objects.
> 
> 
> "CRUD operations" is not to be taken as "pure CRUD operations", there should always be in piecash a check that the set of objects is self-consistent (but this is really simple, obvious, natural constrains if one know a bit the GnuCash object model)
> 
> Could you point me to 2 or 3 real case example we would nevertheless end up corrupting the database ? It would really help me to materialise this "risk".
> With my practical experience with piecash (creating a full account tree structure and importing thousands of transactions from csv file), I haven't found any case of corruption (except when developing the API and having "obvious" bugs).

I can’t provide concrete examples without doing an extensive code review of piecash, for which I have neither the time nor the inclination. Some obvious trouble spots include cross-commodity transactions, especially involving lots or trading accounts.

Have you tested with bad data to see if piecash rejects it? Did you thoroughly analyze the ways that bad data could be created and ensure that you have test cases proving that piecash rejects all of them?

> >
> > Has this stance on the manipulation outside GnuCash of a GnuCash document evolved since it was written ? Would this still be supported after the C++ rewrite ?
> 
> It isn’t supported now. It never has, and it is unlikely that it ever will be, even if we are able, after several development cycles, to actually migrate to a 3N database schema. There is too much logic that is encoded in the program and which cannot be portably encoded in a SQL database, to make that feasible.
> 
> 
> For piecash, there is no need for a 3N db schema and there is no need to write any logic in SQL !
> It is through SQLAlchemy that we can handle/encapsulate the extra bit of logic (which is, as far as I am in the development of piecash and given piecash scope, rather limited). Nothing is done in SQL itself.

I have trouble believing that an ORM will generate a correct implementation with a non-normal schema. 

> 
>  >
> > All the rest is handled automatically thanks to the SQLAlchemy layer (link between objects, cascade delete, locking and transactions, generation of GUID key, …).
> 
> Won’t work. SQLAlchemy can’t automatically generate a correct class from the schema nor can it derive a correct table description from the C headers.
> 
> 
> The table schema is exactly the only piece of code that needs to be written (piecash is in fact just that, table schemas with some metadata). 
> I do not hope/expect/think to generate this code automatically from C headers (as SWIG does).

You’re not being consistent. Are you using SQLAlchemy as an ORM or simply as a SQL abstraction layer? You said above that nothing is done in SQL, and earlier that you are doing consistency checks, but here you’re saying that all you’ve written is a table-schema, which would imply that you’re relying on the database to do all of the work.

>  
> You’ll get closer working with the XML schema. There’s a reasonably up-to-date version in src/doc/xml/gnucash-v2.rnc.
> 
> 
> As written here above, XML is not easy to work with (at least for me) and does not save me of anything you said before (corruption, etc).

Sounds like a learning opportunity.

>  
> >
> > So, I admit, the main effort in this project consisted in writing for each entity (book, account, etc) the equivalent here above code as well as the relationships between entities. The later is done with a syntax like
> >
> >    class Split(DeclarativeBaseGuid):
> >        [...]
> >        account = relation('Account', backref=backref('splits', cascade='all, delete-orphan'))
> >
> >
> >> It’s also very dangerous to attempt to select what parts of GnuCash’s internal functions to reimplement. To take your example of balances, it’s an integral part of GnuCash’s internal checks to compute and compare balances. Leaving that out greatly increases the risk of introducing corrupt data.
> > I may have not be totally clear in my previous email but I do not indeed see any common sense in implementing the Balance calculations in PieCash. These results are part of the GnuCash application logic and are not saved in the GnuCash XML/SQL files. They are neither part of the object model itself but transient calculations.
> 
> You’re confusing the object model with the persistence model. They’re rather different.
> 
> 
> I understand "object model" as the entities+relations+fields model and its concrete implementation, the "persistence model" as the table schema. Is it also what you are referring to ? If so, could you elaborate shortly on your comment ? I can't succeed in making the link between my comment and yours.
> 
> and thank you again for your time !
> 

The object model is the class hierarchy, with its member variables and functions, expressed in C and Scheme. In a proper database-based application there would be a close mapping between the two, but GnuCash isn’t yet one of those. In a proper OO design, everything that directly affects an object’s state would be encapsulated in the cmember functions of the object’s class. GnuCash isn’t one of those either: State-changing code is spread throughout the program.



More information about the gnucash-devel mailing list