XML size
Glen Ditchfield
gjditchfield@acm.org
Wed, 03 Apr 2002 11:08:15 -0600
On April 2, 2002 10:32 am, Jesse Becker wrote:
> ... If you are worried about space,
> compress the files (like XML lets you do); gzip averages something like
> 50-60% compression on text.
It does rather better on GnuCash files. I tried compressing my file,
"Accounts", with gzip and bzip2; gzip shrank it by about 91% and bzip2 by
about 93%:
-rw-rw-r-- 1 gjditchf gjditchf 91604 Mar 31 08:13 Accounts.bz2
-rw-rw-r-- 1 gjditchf gjditchf 121615 Mar 31 08:13 Accounts.gz
-rw-rw-r-- 1 gjditchf gjditchf 1350319 Mar 31 08:13 Accounts
gzip took a snappy 0.765 seconds (on a 300 MHz K6-2); bzip2 took 9.5 seconds.
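
For anyone who wants to repeat the comparison on their own data file, here is a
rough Python sketch. It works out the space saved from the byte counts in the
listing above and times both compressors in memory. The function names
(percent_saved, measure) are made up for this example, and Python's gzip and
bz2 modules will not necessarily match the command-line tools' sizes or timings
exactly.

```python
import bz2
import gzip
import time

# Sizes in bytes, copied from the directory listing in the post.
ORIGINAL = 1350319
GZIPPED = 121615
BZIPPED = 91604


def percent_saved(original_size, compressed_size):
    """Return the percentage of space saved by compression."""
    return 100.0 * (1 - compressed_size / original_size)


def measure(data):
    """Compress data in memory; return {name: (compressed size, seconds)}."""
    results = {}
    for name, compress in (("gzip", gzip.compress), ("bzip2", bz2.compress)):
        start = time.perf_counter()
        out = compress(data)
        results[name] = (len(out), time.perf_counter() - start)
    return results


if __name__ == "__main__":
    print(f"gzip saved  {percent_saved(ORIGINAL, GZIPPED):.1f}%")
    print(f"bzip2 saved {percent_saved(ORIGINAL, BZIPPED):.1f}%")
```

To try it on a real file, read the file's bytes with open(path, "rb") and pass
them to measure().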
Every now and then someone on gnucash-devel suggests modifying GnuCash to let
it read and write gzipped files. Sounds good to me ...
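
As an illustration of how little that might involve, here is a minimal Python
sketch of reading a file that may or may not be gzipped, by checking for the
two-byte gzip magic number. This is not GnuCash code (GnuCash is written in C),
and the function names are hypothetical; it only shows the idea.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream (RFC 1952)


def read_maybe_gzipped(path):
    """Return the file's contents as bytes, decompressing if it is gzipped."""
    with open(path, "rb") as f:
        magic = f.read(2)
    # Pick the opener based on the magic number, not the file name.
    opener = gzip.open if magic == GZIP_MAGIC else open
    with opener(path, "rb") as f:
        return f.read()


def write_gzipped(path, data):
    """Write bytes to path as a gzip stream."""
    with gzip.open(path, "wb") as f:
        f.write(data)
```

Sniffing the magic number rather than trusting a ".gz" suffix means existing
uncompressed files would keep loading unchanged.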
I went looking for XML-aware compression schemes and found surprisingly
little. There's some alpha-level code at
http://www.cs.cornell.edu/People/jcheney/xmlppm/xmlppm.html
and a paper there suggests that it wouldn't be hard to generate compressed
files that are half the size of the gzipped file.