XML size (was: no subject)

Paul Lussier plussier@mindspring.com
Wed, 03 Apr 2002 23:43:34 -0500


In a message dated: 03 Apr 2002 20:52:15 EST
Derek Atkins said:

>SQL is far from inflexible....

Maybe I'm confusing some terms here.  To me, SQL is the language used 
for querying the database, not the format the data is actually stored 
in.  Am I wrong?  I don't consider SQL, the language, inflexible.  I 
consider locking the data in a binary format inflexible, or actually, 
more like inaccessible and less portable.

>> >        c) load faster
>> >        d) save faster
>> 
>> I don't see these as major advantages.  As processor speed and overall 
>> system speed in general increase, these will decrease.  Any major 
>> gain in these areas now will be lost on newer hardware.
>
>That's fine, you're allowed your opinion.  Many other people (myself
>included) consider speed a major issue.

But what are you speeding up?  Load time and save time?  How often do 
those two actions occur?  I would be all for it if you were talking 
about speeding up re-display after a transaction entry event, that 
happens quite frequently within any given session.  But load time 
occurs once per session.  Saving?  Okay, that happens more 
frequently, but still not all that much.  And I really don't think 
it's the type of thing users are going to get all excited about.
You could accomplish similar speed ups by splitting the data up into 
separate files, one for each account, and a master file which caches 
the totals for use in the main display widget.  That way you only 
load one small file with the totals, which is very quick.  Then, you 
only need to load the files for those accounts which get updated.  
Spread the total load time across the data entry session, and you've 
accomplished the apparent performance enhancement.
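
To make that concrete, here's a rough sketch of the split-file idea in 
Python.  Everything in it is made up for illustration: the LazyLedger 
name, the totals.json master file, the one-JSON-file-per-account layout, 
and amounts stored as integer cents.  None of it is how GnuCash 
actually stores its data.

```python
import json
from pathlib import Path


class LazyLedger:
    """Hypothetical layout: totals.json plus one <account>.json per account."""

    def __init__(self, directory):
        self.directory = Path(directory)
        # Only the small master file of cached totals is read at startup.
        self.totals = json.loads((self.directory / "totals.json").read_text())
        self._loaded = {}    # account name -> list of transactions
        self._dirty = set()  # accounts changed since the last save

    def transactions(self, account):
        # Read an account file the first time it is needed, then reuse it.
        if account not in self._loaded:
            path = self.directory / f"{account}.json"
            self._loaded[account] = (
                json.loads(path.read_text()) if path.exists() else []
            )
        return self._loaded[account]

    def add(self, account, amount, memo=""):
        # Amounts are integer cents (an assumption, to avoid float rounding).
        self.transactions(account).append({"amount": amount, "memo": memo})
        self.totals[account] = self.totals.get(account, 0) + amount
        self._dirty.add(account)

    def save(self):
        # Rewrite only the accounts that changed, plus the master file.
        for account in self._dirty:
            path = self.directory / f"{account}.json"
            path.write_text(json.dumps(self._loaded[account]))
        (self.directory / "totals.json").write_text(json.dumps(self.totals))
        self._dirty.clear()
```

Startup touches only the totals file, and each account file is read 
the first time a transaction is entered against it, so the load cost 
gets spread across the session instead of being paid up front.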

>FWIW, this sounds like something that would come out of Redmond:
>"Don't worry about the speed of our software -- new machines
>will come out soon and it will work just fine"

While I agree it's something you'd expect out of Redmond, it's 
actually an almost verbatim quote from "The UNIX Philosophy".
(Or it could be "The Mythical Man-Month"?  Both have some real gems 
 in them :)

And regardless, it's not like you're going to need a 1.5GHz CPU and
half a gig of RAM to run GnuCash.  I'm on a PII 300 with 96MB of RAM.
The "average" user has a much beefier machine than that.  I run 
GnuCash on my PII 200 laptop with the same data set and other than 
maybe a 10 sec load time (at most, I've never actually measured it)
I don't find it slow or even sluggish.

>One of the advantages of Linux has ALWAYS been that it runs much
>faster than those other OSes on equivalent hardware.  Simplicity.

Okay, I guess we're not going to mention things like GNOME and KDE then :)

I agree with your sentiment.  However, we're not debating the most 
minimal set of hardware required to run Linux, or GnuCash for that 
matter.  My point was that what may be perceived as a small 
performance enhancement today will be completely lost once the next 
generation of hardware is out.  I don't see the benefit of spending a 
lot of cycles to gain little perceived value.

The fact is, the average person has a pretty decent machine.  Sure, 
there are some die hards out there on [34]86s or P-Is with very 
little memory.  But as someone else pointed out, if that's the case, 
then GnuCash is the least of their worries, since *everything* will 
be slow on those machines.  Moving the GnuCash data format to an SQL 
database isn't going to help them.  If anything, it's going to hurt, 
since they'll now have to run a database system (no one yet has 
mentioned an embeddable database that I recall, other than 
Berkeley-DB, and that's not SQL-based.  Please correct me if I'm 
wrong on anything I've said here.)

>> So you're reducing the size of the application by adding code?
>
>Yes, because you don't need all the XML parsing and unparsing,
>so you can "remove" that, and SQL is extremely small and concise.
>
>So, yes, you are reducing size AND complexity by moving to SQL.

Okay, that makes sense, but, aren't you then increasing 
complexity of the GnuCash system as a whole by now requiring an SQL 
database server?

>> As I said previously, the average home user isn't going to have so 
>> much data in their file that the size is going to grow to such a state 
>> as to impact them or the performance of their system.
>
>That's not what I've been hearing from other users that I've been
>talking to.

That may be.  I'm going by what I've been seeing on the GnuCash 
mail lists. Maybe I've missed some posts.  It would be interesting to 
see what kind of system specs and file sizes we're talking about 
though.  Are people realizing 20 or 30 megabyte files?  Are these 
multi-year files?  Would this be solved by the introduction of 
accounting periods so that for instance, each year was in a separate 
file?  Just curious.

>> I'll buy that argument for a business environment, but not for the 
>> home user.
>
>Where is this line drawn?  Besides, if we're going to do it for the
>home-business environment, you've just done all the work.

I would make it modular and optional.  There is already the option to 
build GnuCash with SQL/PostgreSQL support, but it's not required.
Businesses, or those individuals who wish to use this support, may do 
so.  Those who don't can build without it.  Why does this need to 
change?

>> Ahm, everything is data, some is just accessed and used differently or 
>> more often than others.
>
>I disagree.  HOW the "data" is used is extremely important.  But
>perhaps we must agree to disagree.

I didn't say that how it was accessed wasn't important.  I said 
that everything was data.  You (or someone) made the statement that

	configuration != data

when in fact, configuration data is still data, it's just accessed
and used *differently*.  I totally agree that understanding HOW it's 
used or accessed is extremely important.
-- 

Seeya,
Paul