File store (was Re: Salutations)

Phillip Shelton shelton@usq.edu.au
Mon, 11 Dec 2000 15:03:55 +1000


> -----Original Message-----
> On Mon, 11 Dec 2000 11:32:49 +1000, the world broke into rejoicing as
> "Phillip Shelton" <shelton@usq.edu.au>  said:
> > How does all of this affect the `closing the books'.  If the books are
> > `close-able' then maybe we do not have to read the last 10 years worth of
> > data in at once?
> >
> > Will the closing of the books be easier or harder with a DB?
>
> It becomes a decreasingly relevant issue with a "proper database."
>
> A prime reason why you _want_ to "close the books" is the fact that as
> the amount of data grows, the "document" that is "the books" gets large
> and unmanageable.

OK (I am not an accountant). I just thought that there might be some legal
stuff that has to be done at year end. But I suppose that is just a matter
of a well-written report.

> With data being stored in a more sophisticated DB, you don't forcibly
> _need_ to close the books; queries hit the portions that are relevant,
> so that having 10 years worth in the DB doesn't make it desperately
> slow.

It still might be nice to be able to archive, but I suppose that we could
just use the archiving stuff in the DB for that?

>  After all, if:
> a) Account balances keep having to be repetitively calculated from the
>    beginning of time until now, That'll Be Slow.

Let's hope we can tell the DB to start calculating from the changed point
only.
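
A rough sketch of what "start from the changed point" could look like,
using Python's sqlite3 module purely for illustration. The txns and
checkpoints tables, their columns, and the helper name are all assumptions
made up for this example, not anything GnuCash defines. Amounts are stored
as integer cents.

```python
import sqlite3

# Hypothetical schema: none of these table or column names come from GnuCash.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE txns (id INTEGER PRIMARY KEY, acct TEXT, date TEXT, amt INTEGER)"
)
conn.execute("CREATE TABLE checkpoints (acct TEXT, date TEXT, balance INTEGER)")

conn.executemany(
    "INSERT INTO txns (acct, date, amt) VALUES (?, ?, ?)",
    [
        ("Checking", "19991215", 10000),
        ("Checking", "20000110", -2500),
        ("Checking", "20000305", 4000),
        ("Savings", "20000201", 7000),
    ],
)
# A stored balance for Checking as of the end of 1999.
conn.execute("INSERT INTO checkpoints VALUES ('Checking', '19991231', 10000)")


def balance_since_checkpoint(conn, acct):
    """Return the latest stored balance plus only the transactions after it."""
    cp_date, cp_balance = conn.execute(
        "SELECT date, balance FROM checkpoints "
        "WHERE acct = ? ORDER BY date DESC LIMIT 1",
        (acct,),
    ).fetchone()
    # The DB sums only the rows newer than the checkpoint.
    (delta,) = conn.execute(
        "SELECT COALESCE(SUM(amt), 0) FROM txns WHERE acct = ? AND date > ?",
        (acct, cp_date),
    ).fetchone()
    return cp_balance + delta


current_balance = balance_since_checkpoint(conn, "Checking")

# For comparison: the "beginning of time" calculation gives the same answer.
(full_balance,) = conn.execute(
    "SELECT SUM(amt) FROM txns WHERE acct = 'Checking'"
).fetchone()
```

Whether a checkpoint row per account per period is the right design is
an open question; the point is only that the DB can do the summing over
a bounded date range instead of the whole history.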

> b) The code pulls individual records from the DB to satisfy calculations,
>    so that GnuCash has to do a lot of iterating where the loops contain
>    DB queries, That'll Be Slow.

Ugh.

> The current set of data structures essentially present the database
> as a "network" or "hierarchical" database which you walk through in
> order to calculate/display stuff.
>
> An SQL system does _not_ work efficiently for that approach; it
> expects a somewhat different abstraction where you _describe_ the
> data that you want, as with:
>    select date, amt, descr from txns where
>      date between '20000101' and '20000909' and
>      acct = 'Checking';
> which returns a set.
>
> Submitting one query that returns 500 records is _vastly_ more efficient
> than submitting 500 queries that each return 1 record, so that quite a
> lot of things need to change to reflect the new sort of "data paths."
> --
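
To make the two "data paths" concrete, here is a small sketch that fetches
the same 500 rows both ways. It uses Python's sqlite3 module and an
in-memory table shaped like the txns example quoted above; the id column
and the generated sample rows are assumptions added for illustration. It
only counts queries and makes no timing claims.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE txns ("
    "id INTEGER PRIMARY KEY, date TEXT, amt INTEGER, descr TEXT, acct TEXT)"
)
# 500 made-up Checking transactions dated between 20000101 and 20000901.
sample = [
    (f"2000{(i % 9) + 1:02d}01", i, f"txn {i}", "Checking")
    for i in range(1, 501)
]
conn.executemany(
    "INSERT INTO txns (date, amt, descr, acct) VALUES (?, ?, ?, ?)", sample
)

# Path 1: describe the set you want and let the DB return it in one query.
set_result = conn.execute(
    "SELECT date, amt, descr FROM txns "
    "WHERE date BETWEEN '20000101' AND '20000909' AND acct = 'Checking' "
    "ORDER BY id"
).fetchall()
set_queries = 1

# Path 2: walk the records one at a time, with a query inside the loop.
ids = [row[0] for row in conn.execute("SELECT id FROM txns ORDER BY id")]
loop_result = []
loop_queries = 0  # counts only the per-record lookups, not the id listing
for txn_id in ids:
    row = conn.execute(
        "SELECT date, amt, descr FROM txns WHERE id = ?", (txn_id,)
    ).fetchone()
    loop_queries += 1
    loop_result.append(row)

print(f"set path:  {set_queries} query, {len(set_result)} rows")
print(f"loop path: {loop_queries} queries, {len(loop_result)} rows")
```

Both paths end up with the same rows; the difference is one round trip
to the DB versus one per record, which matters far more against a
networked server than against an in-memory table like this one.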

Here's hoping that it does not prove impossible.

Phill