XML size (was: no subject)

Rob Browning rlb@defaultvalue.org
Wed, 03 Apr 2002 12:43:45 -0600


Paul Lussier <plussier@mindspring.com> writes:

> Are you planning on writing an ASCII Import feature as well?

Absolutely -- if we went this route...

> Again, I doubt this will be a problem for the home user.  For the 
> small business user, maybe, and in that case, maybe an SQL backend is 
> justified.  But I don't believe it is for the home user.

One thing you may be underestimating is how much easier certain things
would be code-wise if we switched to just using SQL (while still
providing a text import/export process).

There are all kinds of things you can say trivially in SQL (say with
one line of text) that would require a lot of hand-written C or Scheme
to do if we weren't using SQL.
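
As a rough illustration of that point, here is a minimal sketch using
Python's sqlite3 module. The "splits" table and its columns are made up
for the example and are not gnucash's actual schema. It computes
per-account totals once with a single SQL statement and once with the
kind of loop you would otherwise write by hand:

```python
import sqlite3

# Hypothetical schema for illustration only; not gnucash's real tables.
conn = sqlite3.connect(":memory:")
conn.execute("create table splits (account text, amount integer)")
conn.executemany(
    "insert into splits values (?, ?)",
    [("food", 1250), ("rent", 80000), ("food", 730), ("rent", 80000)],
)

# The one-line SQL version: total amount per account.
sql_totals = dict(
    conn.execute("select account, sum(amount) from splits group by account")
)

# The hand-written equivalent that the query above replaces.
manual_totals = {}
for account, amount in conn.execute("select account, amount from splits"):
    manual_totals[account] = manual_totals.get(account, 0) + amount

print(sql_totals)
print(manual_totals)
```

The loop is short in Python, but the same bookkeeping in C means
managing a hash table and memory by hand, which is where the savings
would likely show up.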

So if we could make using SQL across the board not too bad for the
average user (i.e. we make it mostly transparent), it seems likely to
me that it would have non-trivial benefits for gnucash's internal
complexity and future features.

> Okay, so I used a bad example.  But I have used this exact technique
> (global search/replace) to change things like payee fields and memo
> fields extensively with no damage or harm whatsoever.  As for using
> the application, not when I have to change 35 occurrences of a
> misspelling which got propagated over time.  That simply takes way
> too long.  That's *exactly* why I like the ascii text file.

I completely agree that having a text export/import format is
important (and again, I'd propose that we always keep one, and even
use it for regression tests so we know it still works), and you could
probably do your editing there.  But bear in mind that the SQL you'd
have to know to do some of the things you're talking about is
*really* easy (off the top of my head, and probably a bit wrong):

  update gnc_transactions
    set description='foo' where description='bar';

or similar (of course it wouldn't be quite this easy, but hopefully
you get the idea).
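
To make the idea concrete, here is a minimal sketch that runs the same
kind of statement against an in-memory SQLite database from Python.
The table name is taken from the snippet above; the column layout and
the sample rows (35 misspelled entries, as in Paul's example) are
assumptions for illustration, not the real gnucash schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed layout: just an id and a description column.
conn.execute(
    "create table gnc_transactions (id integer primary key, description text)"
)

# 35 rows carrying the misspelling, plus a few unrelated rows.
rows = [("bar",)] * 35 + [("groceries",)] * 5
conn.executemany("insert into gnc_transactions (description) values (?)", rows)

# The global search/replace, expressed as one parameterized statement.
cur = conn.execute(
    "update gnc_transactions set description = ? where description = ?",
    ("foo", "bar"),
)
conn.commit()

changed = cur.rowcount
remaining = conn.execute(
    "select count(*) from gnc_transactions where description = 'bar'"
).fetchone()[0]
print(changed, remaining)
```

Passing the old and new values as parameters rather than pasting them
into the SQL string avoids quoting problems if a description contains
an apostrophe.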

> Additionally, if I need to move the file somewhere, a text file 
> affords much better compression than a binary file.  My current file 
> is 2.1M and compresses to 196k.  That easily fits on a floppy, or can 
> even be e-mailed to someone.

If you're storing the same fundamental data (and not using doubles in
one and integers in the other, for example), and you're compressing
with bzip (or even gzip), I'd suspect the final compressed size would
be pretty close for text or binary.
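
One way to check that suspicion rather than argue about it is to
measure.  The sketch below is only an experiment harness: the record
layout, payee names, and record count are invented, and it makes no
claim about what the numbers will be for a real gnucash file.  It
writes the same integer data once as tab-separated text and once as
fixed-width binary, then compresses both with bz2:

```python
import bz2
import random
import struct

random.seed(0)  # fixed seed so repeated runs compare the same data

# Invented sample data: (id, amount in cents, payee).
payees = ["Grocery Store", "Electric Co", "Rent", "Paycheck"]
records = [
    (i, random.randint(1, 500000), random.choice(payees)) for i in range(5000)
]

# Same data, two encodings: tab-separated text and fixed-width binary.
text = "".join(f"{i}\t{amt}\t{p}\n" for i, amt, p in records).encode("ascii")
binary = b"".join(
    struct.pack("<iq16s", i, amt, p.encode("ascii")) for i, amt, p in records
)

sizes = {
    "text_raw": len(text),
    "text_bz2": len(bz2.compress(text)),
    "binary_raw": len(binary),
    "binary_bz2": len(bz2.compress(binary)),
}
for name, size in sizes.items():
    print(f"{name:>10}: {size}")
```

Swapping bz2 for gzip (or zlib) is a one-line change if you want to
compare compressors as well.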

> Additionally, I currently have my gnucash file under RCS.  Try 
> putting an SQL database under RCS.

Hmm, FWIW I can't recall if we're actually sorting the output.  If
not, RCS might not be buying you that much (at least storage size
wise) over just keeping compressed complete copies of the file.
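
To show why the ordering matters for a line-based tool like RCS, here
is a small sketch.  The transaction ids and the dump format are
hypothetical.  It adds one transaction and then counts how many lines a
line diff reports as changed, first with the records written in
whatever order they happen to be held, then with the output sorted:

```python
import difflib

def dump(transactions, sort):
    """Serialize transactions one per line, optionally in sorted order."""
    lines = [f"{guid}\t{desc}\n" for guid, desc in transactions.items()]
    return sorted(lines) if sort else lines

def changed_lines(a, b):
    """Count the lines a line-based diff marks as added or removed."""
    return sum(1 for line in difflib.ndiff(a, b) if line[:2] in ("+ ", "- "))

# The same data before and after adding one transaction; the in-memory
# order differs between the two saves.
old = {"t3": "rent", "t1": "groceries", "t2": "paycheck"}
new = {"t2": "paycheck", "t3": "rent", "t1": "groceries", "t4": "electric"}

unsorted_delta = changed_lines(dump(old, False), dump(new, False))
sorted_delta = changed_lines(dump(old, True), dump(new, True))
print(unsorted_delta, sorted_delta)
```

With sorted output the diff only has to record the one new line;
without it, lines that merely moved show up as changes too, and the
stored deltas grow accordingly.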

-- 
Rob Browning
rlb @defaultvalue.org, @linuxdevel.com, and @debian.org
Previously @cs.utexas.edu
GPG=1C58 8B2C FB5E 3F64 EA5C  64AE 78FE E5FE F0CB A0AD