Object System

John Ralls jralls at ceridwen.us
Tue Sep 20 00:01:08 EDT 2011


I'm about halfway through writing tests for Account.c, and I'm starting to think about how to fix things... and what really should be tested first rather goes along with that.

I've found that there are three object systems in play in Account and its oldest siblings, Transaction and Split. The first is the (probably original) xacc architecture, which is only barely an object system. It does afford a sort of encapsulation, though, and something resembling object methods. It's clearly the work of a very self-disciplined and experienced structured-C programmer who sort of got the basic ideas of OO and tried to incorporate them. Then there's QofObject, a half-assed clone of an early version of GObject. Finally, someone (Derek, I think) made an effort at some point to convert to GObject, but couldn't bring himself to do the necessary complete rewrite. Over time, a thick layer of old-fashioned structured C "convenience" functions has settled over the top of the more or less OO core.

The result is a very confusing interface and serious memory-management issues. This very much needs to be fixed.

What's more, the design missed out on some of the finer points of OO. Many of the classes are very tightly co-dependent, which makes writing good tests much harder than it should be and means that it is impossible to substantially change any one class without spending a lot of effort dealing with the effects on other classes. There are many instances of class methods in another class's implementation module. (I use "class method" rather loosely here: Obviously it's not a function declared in the class header; but it is a function of the form xaccFooDoThis(foo, stuff) or gnc_foo_do_that (foo, junk) which affects class foo's state, but which appears in Bar.c.)
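
To make the misplaced-method pattern concrete, here is a minimal sketch. Every name in it (Foo, Bar, FooP.h, the balance and pending fields) is made up for illustration and none of it is copied from the engine; the "files" are just marked with comments so the whole thing fits in one listing.

```cpp
#include <cassert>

// ---- Foo.h: the public face of the hypothetical class Foo ----
struct Foo;                                // opaque to ordinary callers
Foo* foo_new();
void foo_free(Foo* foo);
int  foo_get_balance(const Foo* foo);

// ---- FooP.h: "private" header, intended for Foo.c only ----
struct Foo { int balance; };

// ---- Foo.c ----
Foo* foo_new()                        { return new Foo{0}; }
void foo_free(Foo* foo)               { delete foo; }
int  foo_get_balance(const Foo* foo)  { return foo->balance; }

// ---- Bar.c: includes FooP.h and reaches into Foo's state ----
struct Bar { int pending; };

// Named like a Foo method, but it lives in Bar's module and writes
// Foo's private field directly, so Foo cannot change its layout or
// be tested without dragging Bar along.
void xaccFooDoThis(Foo* foo, Bar* bar)
{
    foo->balance += bar->pending;
    bar->pending = 0;
}

// Small usage check: returns Foo's balance after Bar has modified it.
int demo_misplaced_method()
{
    Foo* foo = foo_new();
    Bar bar{5};
    xaccFooDoThis(foo, &bar);
    assert(bar.pending == 0);
    int balance = foo_get_balance(foo);
    foo_free(foo);
    return balance;
}
```

The point of the sketch is only the dependency direction: any test of Foo that wants to exercise xaccFooDoThis has to link Bar as well.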

The least-effort fix would be to finish the GObject conversion. Most of QOF and all of xacc would go away. C++ access would be available through Gtkmm or gobject-introspection. This is the least effort only because of the earlier attempt: The classes are defined, the property lists written. It's just a matter of getting all of the construction and destruction into init and dispose, changing all of the code to use gnc_foo_new() and g_object_unref(), and converting QOF signals to GObject signals. A lot of grunt work, but nothing really radical.
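
As a rough picture of the lifecycle that conversion aims for, here is a plain stand-in that deliberately does NOT use GLib. GncFoo, gnc_foo_ref() and gnc_foo_unref() are hypothetical; in the real thing the reference counting would come from GObject (g_object_ref() and g_object_unref()) and the init and dispose functions would be hooked up through the class machinery rather than called by hand. The sketch only shows the shape: all construction in init, all teardown in dispose, and callers touching nothing but new and unref.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical object; not a real GnuCash or GLib type.
struct GncFoo {
    int   ref_count;
    char* name;                 // resource owned by the object
};

static int live_objects = 0;    // instrumentation for this sketch only

// "init": every bit of construction lives here.
static void gnc_foo_init(GncFoo* self)
{
    self->ref_count = 1;
    self->name = static_cast<char*>(std::malloc(8));
    std::strcpy(self->name, "unnamed");   // 7 characters plus NUL
    ++live_objects;
}

// "dispose": every bit of teardown lives here.
static void gnc_foo_dispose(GncFoo* self)
{
    std::free(self->name);
    self->name = nullptr;
    --live_objects;
}

GncFoo* gnc_foo_new()
{
    GncFoo* self = static_cast<GncFoo*>(std::malloc(sizeof(GncFoo)));
    if (self) gnc_foo_init(self);
    return self;
}

GncFoo* gnc_foo_ref(GncFoo* self)
{
    ++self->ref_count;
    return self;
}

// Dropping the last reference disposes of and frees the object.
void gnc_foo_unref(GncFoo* self)
{
    if (--self->ref_count == 0) {
        gnc_foo_dispose(self);
        std::free(self);
    }
}

// Usage check: returns the number of live objects after a full cycle.
int demo_lifecycle()
{
    GncFoo* foo = gnc_foo_new();
    gnc_foo_ref(foo);           // a second owner appears
    gnc_foo_unref(foo);         // ...and goes away; object survives
    assert(live_objects == 1);
    gnc_foo_unref(foo);         // last owner gone; dispose runs
    return live_objects;
}
```

The grunt work is in the call sites: every place that currently allocates or frees one of these objects by some other route has to be changed to go through the new/unref pair.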

But for the long term, GObject is a PITA to use. My view is that it grew out of some C programmers' disdain for C++ and their belief that they could do the same thing in "real" C. That might have been true in 1989, but it wasn't by 1999. It's seriously not the case today. But rewriting the engine in C++ is a far more ambitious undertaking than finishing up the GObject conversion. It will take a long time even if all 5 of the recently active developers dive into it. The good news is that because C++ is very nearly a superset of C, it can be done incrementally, for example by replacing priv-foo accesses with C++ accessors. We could maintain a reasonable release schedule while we work through it, and when we're done we'll have a much more flexible and maintainable code base.
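
Here is a minimal sketch of what one such incremental step might look like. The class Foo and the functions xaccFooGetName() and xaccFooSetName() are hypothetical, named after the xaccFoo pattern rather than taken from the engine. The idea is that the data becomes private behind C++ accessors while the old C-style entry points survive as thin wrappers, so existing call sites keep compiling until someone gets around to them.

```cpp
#include <cassert>
#include <string>
#include <utility>

// After the step: the fields that callers used to poke at directly
// are private, and all access goes through member functions.
class Foo {
public:
    explicit Foo(std::string name) : name_(std::move(name)) {}

    const std::string& name() const { return name_; }
    void set_name(std::string name) { name_ = std::move(name); }

private:
    std::string name_;          // formerly a raw char* in a private struct
};

// Legacy-style free functions kept as wrappers over the accessors.
// Old code calls these; new code can call the members directly.
const char* xaccFooGetName(const Foo* foo)
{
    return foo->name().c_str();
}

void xaccFooSetName(Foo* foo, const char* name)
{
    foo->set_name(name);
}

// Usage check mixing old-style and new-style calls on one object.
std::string demo_incremental()
{
    Foo foo("Cash");
    assert(std::string(xaccFooGetName(&foo)) == "Cash");
    xaccFooSetName(&foo, "Bank");   // old-style call site
    return foo.name();              // new-style call site
}
```

Once no callers of a wrapper remain, it can simply be deleted, which is what lets the work proceed one module at a time between releases.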

The most radical approach is to use an interpreted language like Python or Ruby. One advantage is that they're extremely expressive and have a lot of stuff "built in", so the required amount of code is on the order of 10% of an equivalent C++ program and 5% of a GObject program. The downside is that interfacing back to C is more difficult and if not done carefully can slow the program down; that part might negate the gain in development speed afforded by the more expressive language features. The bigger payoff is again long-term: We wind up with an engine written in a far more approachable language, making it easier to recruit new contributors while allowing those contributors to be less experienced programmers. (Memory management is a major shortcoming of both GObject and C++. It takes a fair amount of experience before one can write code that gets it right. Most of the interpreted languages are garbage-collected; one doesn't have to worry about it.)

There are two interpreted languages that deserve special, negative, mention: Scheme and Java. Yes, we already have a bunch of Scheme. I think that it is even more of a barrier to getting new developers than the confusing engine API. CS grads from MIT or CMU might be totally comfortable with it, but the rest of the world freaks out at the prefix notation, zillions of parentheses, and car and cdr. Java, on the other hand, has many of the advantages of Python and Ruby (garbage collected, rich environment of support modules, nice unit test infrastructure) but two major drawbacks: It is extremely difficult to interface with C and it has its own, butt-ugly, GUI. (It's not so bad, you say; a lot like Motif. That rather proves my point.)

There's a fourth option, a variant of the first, that I haven't looked into much: Vala, which is supposed to be a C++-like way to write GObject code. I don't think it's appropriate for rewriting the engine, but it might be worthwhile to consider for new classes later on.

My feeling is that the first option, finishing the GObject conversion, has the lowest risk. I can start in on it, and if I only get partway and burn out or get hit by a bus, then we're still ahead, and if someone else comes along in a couple of years, she (hah!) can pick up where I left off and make some more progress. It fits well with Geert's work on converting the old Glade GUI implementation to GtkBuilder. It even helps Cutecash some by providing a cleaner API and workable memory management. I'll start on it if there isn't overwhelming sentiment for one of the other options.

I'd really like to see us move to C++ or Python, because it would make the program a lot more flexible and allow us to use programming techniques that aren't available in C. But it's high risk, and it's way too ambitious for me to take on by myself.

Regards,
John Ralls


More information about the gnucash-devel mailing list