python GnuCash interface to SQL backend

Thu Nov 13 19:39:40 EST 2014

On Nov 13, 2014, at 12:44 PM, Sébastien de Menten <sdementen at gmail.com> wrote:

> 
> On 2014-11-13 19:25, John Ralls wrote:
>> On Nov 13, 2014, at 9:31 AM, Sébastien de Menten <sdementen at gmail.com> wrote:
>> 
>>> Indeed, it may be worth to explain what are the goals (and the limits).
>>> 
>>> I have tried to use the official python bindings and had the following issues:
>>> - need swig + compilations to make them work => pyscash is pure python and has only sqlalchemy as main dependency (which is rather supported/standard package)
>>> - python binding is a mapping if C API with a thin layer python friendly layer => I do not find the resulting python scripts very pythonic
>>> 
>>> [...]
>>> So, to sum up, goals of pyscash :
>>> - easy installation as pure python module (no compilation, no swig, ...)
>>> - easy to contribute as pure python and based on de facto standard ORM for SQL in python (sql alchemy)
>>> - pythonic implementation by leveraging SQL Alchemy features (transparent handling of guid, free commit/rollback/transaction support, automatic parent-children relation handling, etc)
>>> - pythonic interface for CRUD operations on a GnuCash Book
>>> 
>>> The last point is important as the goal is not at all to reimplement the whole GnuCash engine but only to be able to manipulate (CRUD operations) the different GnuCash entities (book, account, currency, etc) in an easy pythonic way.
>>> For instance, I do not plan to reimplement functions like GetBalance, GetReconciledBalance, GetPresentBalanceInCurrency, GetAccountBalance, etc
>>> 
>>> Is the goal of the project clearer ?
>>> Do you see complexities in reimplementing the pure CRUD operations (except the basic complexities of leaving a GnuCash book in a consistent status) ?
>>> 
>>> On the name issue, you are definitely right ! I had in mind a pie-ess-cash pronounciation (the S for SQL) as pygnucash was already taken. Any suggestion ? pysacash (python sqlalchemy gnucash interface) ? or is it even worse :-) ?
>>> 
>>> kind regards,
>>> 
>>> Sebastien
>>> 
>>> ps: and thank you for clarifying the announce on the user mailing list ! I should have thought about that myself to avoid confusion…
>> Sébastien,
>> 
>> Please remember to copy the list on all replies.
>> 
>> The goal of the project is clearer but is in my mind severely misguided. Object persistence is always a side effect of a non-trivial program, and GnuCash is a non-trivial program.
> If you see GnuCash from the (limited) perspective of an editor for a Book document (as LibreOffice Writer is an editor for a ODT document), object persistence is central. And luckily we have both a very clean and well designed object model in GnuCash as well as standard persistence formats (xml, SQL  and not an obscure binary format).

I don’t share either opinion, especially about the object model in GnuCash. 

> 
> The GnuCash documentation states that the XML document can be used to get the GnuCash data in whatever other format (see "Converting XML GnuCash File" http://www.gnucash.org/docs/v2.6/C/gnucash-guide/appendixa_xmlconvert1.html) through XLST transformation and vice-versa (so generating a new/transformed gnucash XML file from a LibreOffice spreadsheet).

No matter what the GnuCash documentation might say, only someone with a deep understanding of GnuCash could successfully create a correct GnuCash data file from a spreadsheet document using XSLT. I’m not familiar with LibreOffice’s ODS format, but I expect that it would take deep knowledge of that format to successfully extract useful data from it with XSLT as well.

> 
> PieCash (I like your name !) aims to fulfill exactly this purpose, no more, no less. As with the XLST transforms, it allows to do CRUD operations on the objects within a GnuCash Book.

CRUD operations on a GnuCash database will corrupt it. The GnuCash schema is not normalized, and not all of the necessary data is stored in the table associated with the objects.

> 
> Has this stance on the manipulation outside GnuCash of a GnuCash document evolved since it was written ? Would this still be supported after the C++ rewrite ?

It isn’t supported now. It never has, and it is unlikely that it ever will be, even if we are able, after several development cycles, to actually migrate to a 3N database schema. There is too much logic that is encoded in the program and which cannot be portably encoded in a SQL database, to make that feasible.

>>  your complaint about the python bindings is that they’re not pythonic I think that your efforts would be better spent improving that. It should become easier as we progress through the C++ rewrite which will produce a much more object-oriented architecture where that’s appropriate. That said, it’s also worth noting that C++ is a far more versatile language than either C or Python in that it supports generic and functional programming as well as the structured and OO paradigms that Python supports. In moving away from GObject we’ll also be enforcing type safety much more rigorously. Since that’s a notorious weakness (though casual hackers think it’s a feature) of Python, having thin Python wrappers around C++ objects will provide for much safer Python add ons than does the current code base.
> The python binding are not pure python bindings (and will probably never be as they should interface C or C++ code). This makes them not easily accessible on Windows platforms (complexity in compilation) and more complex to hack (for non C/C++ programmers). However, they offer the ability to call any function of the engine and to be independent of the backend (XML, SQL,...) as long as it is interfaced so may be the only solution in some cases.
> 
> On the other side, PieCash is pure python. The code required to interface an entity is almost trivial. For instance, the full interface to the Commodity entity is
> 
>    class Commodity(DeclarativeBaseGuid):
>        __tablename__ = 'commodities'
>        __table_args__ = {}
> 
>        # column definitions
>        cusip = Column('cusip', TEXT(length=2048))
>        fraction = Column('fraction', INTEGER(), nullable=False)
>        fullname = Column('fullname', TEXT(length=2048))
>        mnemonic = Column('mnemonic', TEXT(length=2048), nullable=False)
>        namespace = Column('namespace', TEXT(length=2048), nullable=False)
>        quote_flag = Column('quote_flag', INTEGER(), nullable=False)
>        quote_source = Column('quote_source', TEXT(length=2048))
>        quote_tz = Column('quote_tz', TEXT(length=2048))
> 
> All the rest is handled automatically thanks to the SQLAlchemy layer (link between objects, cascade delete, locking and transactions, generation of GUID key, …).

Won’t work. SQLAlchemy can’t automatically generate a correct class from the schema nor can it derive a correct table description from the C headers.

You’ll get closer working with the XML schema. There’s a reasonably up-to-date version in src/doc/xml/gnucash-v2.rnc.

> 
> So, I admit, the main effort in this project consisted in writing for each entity (book, account, etc) the equivalent here above code as well as the relationships between entities. The later is done with a syntax like
> 
>    class Split(DeclarativeBaseGuid):
>        [...]
>        account = relation('Account', backref=backref('splits', cascade='all, delete-orphan'))
> 
> 
>> It’s also very dangerous to attempt to select what parts of GnuCash’s internal functions to reimplement. To take your example of balances, it’s an integral part of GnuCash’s internal checks to compute and compare balances. Leaving that out greatly increases the risk of introducing corrupt data.
> I may have not be totally clear in my previous email but I do not indeed see any common sense in implementing the Balance calculations in PieCash. These results are part of the GnuCash application logic and are not saved in the GnuCash XML/SQL files. They are neither part of the object model itself but transient calculations.

You’re confusing the object model with the persistence model. They’re rather different.

Regards,
John Ralls