XML size (was: no subject)

Derek Atkins warlord@MIT.EDU
04 Apr 2002 09:58:22 -0500

Paul Lussier <plussier@mindspring.com> writes:

> I'll agree, that's a bad thing.  I'm still using 1.6.4, and it's 
> worked flawlessly for me since I moved to it.  I haven't seen the 
> need to upgrade to anything else yet (sounds like maybe I shouldn't?)

There isn't a problem in the 1.6 tree.  However, if you take your 1.6
data, load it into 1.7/CVS, and save it, you will not be able to
read that data in 1.6 again, even if you made no changes to it.
Basically it's a one-way migration.

> >The XML datafiles are an order of magnitude larger than they need to
> >be, and are certainly an order of magnitude larger than the old binary
> >format.  XML is overly verbose.
> Well, yeah, but that's the nature of text vs. binary in general, is 
> it not?  I'll agree that XML is ugly, and overly verbose, and maybe 
> the files are larger than they should be, but you do expect some 
> bloat with the move from binary to ascii, don't you?

Not necessarily.  The problem specific to XML is the amount of
redundant information: every element repeats its name in both the
opening and the closing tag.
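
To put a rough number on that redundancy, here is a minimal sketch
that compares the payload characters in a record with the total
characters once tags are included.  The record and its element names
are made up for illustration; they are not the real GnuCash file
format.

```python
import xml.etree.ElementTree as ET

# Hypothetical transaction record; element names are illustrative only,
# not the actual GnuCash schema.
SAMPLE = (
    "<transaction>"
    "<date-posted>2002-04-04</date-posted>"
    "<description>Groceries</description>"
    "<amount>42.17</amount>"
    "</transaction>"
)


def markup_overhead(xml_text):
    """Return (payload_chars, total_chars) for an XML string."""
    root = ET.fromstring(xml_text)
    # itertext() yields only the text content, skipping all tags.
    payload = sum(len(text) for text in root.itertext())
    return payload, len(xml_text)


payload, total = markup_overhead(SAMPLE)
print(f"{payload} payload characters out of {total} total")
```

Most of the bytes in a record like this are tag names rather than
data, which is the overhead being described.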

> As for features, there really aren't a lot features I need. 
> Here are some of the benefits I've enjoyed under the ascii-text format
> though, which I've found extremely useful:
> 	- the ability to check in/out of rcs
> 		this gives me the ability to associate comments
> 		with a particular data entry session as well as
> 		the ability to check out a "view" of my data as it
> 		was at a particular time in the past.

Hmm...  Perhaps we should allow 'session logs'?  I'm not sure how to do
this in general, particularly with a database.  Maybe the way to do
this is to tie each data entry to a particular "session" and then
allow the session to have a log message?
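
One way the session idea could look, purely as a sketch: the schema
below is hypothetical (table and column names are invented here), and
it uses Python's built-in sqlite3 module only as a stand-in for
whatever SQL database would actually be used.

```python
import sqlite3

# In-memory database as a stand-in; all names below are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sessions (
        id          INTEGER PRIMARY KEY,
        started     TEXT NOT NULL,
        log_message TEXT
    );
    CREATE TABLE entries (
        id           INTEGER PRIMARY KEY,
        session_id   INTEGER NOT NULL REFERENCES sessions(id),
        description  TEXT,
        amount_cents INTEGER
    );
""")


def record_session(conn, started, log_message, entries):
    """Store one data-entry session and the entries made during it."""
    cur = conn.execute(
        "INSERT INTO sessions (started, log_message) VALUES (?, ?)",
        (started, log_message),
    )
    session_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO entries (session_id, description, amount_cents) "
        "VALUES (?, ?, ?)",
        [(session_id, desc, cents) for desc, cents in entries],
    )
    conn.commit()
    return session_id


sid = record_session(
    conn,
    "2002-04-04",
    "Entered March statement",
    [("Groceries", 4217), ("Rent", 80000)],
)

# Every entry can be traced back to the log message of its session.
rows = conn.execute(
    "SELECT e.description, s.log_message "
    "FROM entries e JOIN sessions s ON s.id = e.session_id "
    "WHERE s.id = ? ORDER BY e.id",
    (sid,),
).fetchall()
print(rows)
```

This covers the "comment per data entry session" half of the rcs use
case; checking out a view of the data as of a past date would need
more than this sketch provides.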

> 	- the ability to 'grep' things like dates, check numbers,
> 	  descriptions/memos, and the like from the file without
> 	  having to fire up the actual application; especially when 
> 	  connected to my system over a slow dial up link.

This would be a relatively simple SQL command that you could run on
the command line, or again you could use the SQL dump...
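
As an illustration of what such a query might look like, here is a
small sketch.  The "transactions" table and its columns are
hypothetical, and sqlite3 is used only because it ships with Python;
the real backend and schema could differ.

```python
import sqlite3

# Hypothetical table standing in for the real data store.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions "
    "(date_posted TEXT, check_num TEXT, description TEXT)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [
        ("2002-03-01", "1041", "Electric bill"),
        ("2002-03-15", "1042", "Groceries"),
        ("2002-04-01", "1043", "Electric bill"),
    ],
)


def grep_transactions(conn, pattern):
    """Return rows whose date, check number, or description contain pattern.

    Note: '%' and '_' inside pattern act as LIKE wildcards.
    """
    like = f"%{pattern}%"
    return conn.execute(
        "SELECT date_posted, check_num, description FROM transactions "
        "WHERE description LIKE ? OR check_num LIKE ? OR date_posted LIKE ? "
        "ORDER BY date_posted",
        (like, like, like),
    ).fetchall()


matches = grep_transactions(conn, "electric")
for row in matches:
    print(*row, sep="\t")
```

SQLite's LIKE is case-insensitive for ASCII by default, so this behaves
roughly like 'grep -i' over the chosen columns.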

> 	- the ability to easily change various fields with a global
> 	  search/replace like sed when I realize I need to change 
> 	  something which has mistakenly been propagated throughout
> 	  the file.

Again, this would be relatively simple with a SQL command, or
with a dump/restore.
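
A sketch of the sed-style fix as a single UPDATE, under the same
caveats: the table layout is invented for the example, and sqlite3
(with its replace() and instr() SQL functions) is just a convenient
stand-in for the actual database.

```python
import sqlite3

# Hypothetical table with a typo propagated through several rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (id INTEGER PRIMARY KEY, description TEXT)"
)
conn.executemany(
    "INSERT INTO transactions (description) VALUES (?)",
    [("Safewya groceries",), ("Rent",), ("Safewya pharmacy",)],
)


def replace_in_descriptions(conn, old, new):
    """Replace every occurrence of old with new; return rows changed."""
    cur = conn.execute(
        "UPDATE transactions "
        "SET description = replace(description, ?, ?) "
        "WHERE instr(description, ?) > 0",
        (old, new, old),
    )
    conn.commit()
    return cur.rowcount


changed = replace_in_descriptions(conn, "Safewya", "Safeway")
descriptions = [
    row[0]
    for row in conn.execute("SELECT description FROM transactions ORDER BY id")
]
print(changed, descriptions)
```

Unlike sed on a flat file, the replacement here is confined to one
column, so it cannot accidentally rewrite a matching string elsewhere
in the data.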

> There are probably more, but these are the ones I can think of off 
> the top of my head.  Though I would like to point out, that by 
> providing an import/export feature would still allow me to do any of 
> the above with out a lot of effort (provided I did not need to fire 
> up the gui to do so).

Nah, you can do the dump/restore on your own without firing up

> Seeya,
> Paul


       Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
       Member, MIT Student Information Processing Board  (SIPB)
       URL: http://web.mit.edu/warlord/    PP-ASEL-IA     N1NWH
       warlord@MIT.EDU                        PGP key available