Salutations

David Merrill dmerrill@lupercalia.net
Fri, 8 Dec 2000 19:55:39 -0500


On Fri, Dec 08, 2000 at 05:37:12PM -0600, Rob Browning wrote:
> David Merrill <dmerrill@lupercalia.net> writes:
> 
> > I have been using gnucash for awhile now, and I'm interested in
> > helping with the development. I've been following your conversation
> > on rdbms, and I think I can help with that module.
> > 
> > I do Oracle programming and I'm pretty good with SQL and database
> > design in general. I can read C and I can do some simple C programming,
> > but my C is quite rusty from years of disuse. That's a situation I
> > would be very happy to remedy, though.
> > 
> > Any pointers, suggestions, tip, etc. as to how and where I might get
> > started would be greatly appreciated. Who, for instance, is already
> > working in this area?
> 
> First of all, welcome.

Thanks.

I'll just jump right in and we'll see what happens. :-)

> ATM no one is actively working on this, but we consider it one of our
> "critical path" items going forward over the next year.  I've thought
> a little about it, mostly from the perspective of trying to decide if
> it was feasible to just bite the bullet and move to a DB instead of
> another native file format, but that's about it.  So I suspect that
> the first order of business would be a substantial amount of
> discussion just to try and figure out what we need and want.
> 
> Just off the top of my head, here are some issues:
> 
>   * Should we prefer PostgreSQL or MySQL?

We should prefer a generic interface, if possible. I know there are
several ODBC (or ODBC-style) projects going on - are they mature
enough to use? Or will they be mature enough by the time gnucash sql
support is mature enough, anyway?

>   * Can we avoid worrying about that with careful design and by using
>     Gnome DBA?

Oh, is that similar to ODBC in that it abstracts the database behind
an API through which you pass SQL? Maybe that answers my questions
above?

>   * Can we embed one or the other of these databases (i.e. set it up
>     to run an a playground/file/directory in the user's home
>     directory) so that we can replace our current file format, even
>     for single-users, with the DB, without them having to know
>     anything about database administration?  I've spoken to both the
>     PostgreSQL and MySQL camps, and they both claim embedding is now
>     possible, but someone needs to try it and see what that means.

What would be the benefit? With relatively small datasets, there
wouldn't be much, if any, speed gains, but only lots of memory use. It
seems like overkill for many people, and since gnucash already has a
working local storage mechanism...

Plus, it adds another point of failure, along with more complexity, to
the system. When the power is needed, fine, you have to accept the
complexity, but when it's not needed it should be avoided.

>   * What should the DB look like (structure-wise)?  What types do we
>     need, can we do everything with "standard" types, etc.?

We can absolutely do everything with standard types. I've done lots of
development where there were multiple back ends, including 3 years
developing SQL against Oracle, SQL Server, or DB2. So I know how to
write cross-database (is that a word?) SQL. It's not *too* hard as
long as they all follow SQL92, and they all do, more or less.

Now, I don't know MySQL or PostgreSQL (yet), so there could be gotchas
I don't know about, but I doubt it. I just read an article online
today where a guy converted his MySQL database to PostgreSQL in a day
with minimal coding required.

>   * Closely related to the above, where should the engine code/db
>     boundaries be right-now?  in the long run?  This discussion will,
>     at least on the long run side, have to include a discussion of how
>     to handle multi-user and local vs remote performance issues.

I'll leave this decision up to you who have experience with the
gnucash code. With little db design work done, I can more confidently
step in, I think.

>   * How much can we take this in stages?  i.e. perhaps we can first
>     design the database, and set up the SQL server to just be
>     "offline" storage, i.e. we read in the data when we start up, and
>     write it out when we quit, but it's not "live".  Then later we can
>     make the integration tighter.  Would something like this even
>     help?

Definitely designing the database schema is a good place to start.
That is database-independent work and isn't affected by other design
decisions much if at all. I'll take a look at the XML and see how it
might be mapped into a database.

I have a feeling that the multi-user aspects are going to be the real
problem areas. Designing the database won't be hard (I've done many
financial apps in my time!), and writing SQL will be pretty
straightforward (the engine does all the work, really).

> Again thanks for the interest, and I hope this is the kind of feedback
> you were looking for.

Yes, it is.

> I figure you might also want a little more info about how/where the DB
> would probably tie in to the code, but I thought I'd let you digest
> this first :>

That will take some time to understand. I'd like to start at the
database end while you(?) start from the gnucash end. That's because
I'd like to use what I'm already good at. But I will be learning more
about gnucash as we go along so hopefully by the time I have some
working database code I will be able to work along with you from a
more educated perspective.

That's the idea, anyway. :-D

-- 
Dr. David C. Merrill                     http://www.lupercalia.net
Linux Documentation Project                dmerrill@lupercalia.net
Collection Editor & Coordinator            http://www.linuxdoc.org
                                       Finger me for my public key

QOTD:
	"I ain't broke, but I'm badly bent."