Balance Checkpoints
Christopher Browne
cbbrowne@hex.net
Tue, 12 Dec 2000 21:22:44 -0600
On Tue, 12 Dec 2000 15:22:15 EST, the world broke into rejoicing as
David Merrill <dmerrill@lupercalia.net> said:
> On Wed, Dec 13, 2000 at 06:08:46AM +1000, Phillip J Shelton wrote:
> > Derek Atkins wrote:
> > > David Merrill <dmerrill@lupercalia.net> writes:
> > > > aisb, I consider the issue of closing the books and making all
> > > > transactions prior to that immutable to be completely orthogonal to how
> > > > the running total is determined. I am proposing a solution to the
> > > > maintenance of running balances that scales well.
> >
> > I agree.
> >
> > > Let me give an example... You have 1000 transactions a day, so you do
> > > daily checkpoints.
> >
> > And going back to the table, if you do daily checkpoints then you get one
> > record per day. Is date good enough to allow you to do a checkpoint every
> > 500 transactions? In this case you would have two records per day. i.e.
> > does date include the time?
>
> No, date does not include the time. To enable the admin to tune the
> system, we could give them the ability to specify either daily
> checkpointing, or after-n-transactions checkpointing. Then, we would
> have to specify the exact transaction which was checkpointed, though.
This sounds pretty questionable; IF there is to be such checkpointing,
it would make sense for it to be configurable to be associated with some
sort of "date basis," which provides an unambiguous way of evaluating
which transactions should be affected by a given "balance checkpoint."
The "after-n-transactions" approach would make tracking which bits
are associated with which date problematic.
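To make the "date basis" idea concrete, here is a minimal sketch of a balance lookup, not anything GnuCash actually implements. The name balance_as_of, the tuple layouts, and the convention that a checkpoint for date D covers every transaction dated strictly before D are all assumptions made for illustration. Amounts are integers (say, cents).

```python
from bisect import bisect_right
from datetime import date


def balance_as_of(checkpoints, transactions, query):
    """Return the balance at the end of day `query`.

    checkpoints:  list of (date, balance) sorted by date. ASSUMPTION: each
                  balance covers every transaction dated strictly before
                  that checkpoint's date.
    transactions: list of (date, amount) in any order.
    """
    dates = [d for d, _ in checkpoints]
    # Index of the first checkpoint dated after `query`; the one before it
    # (if any) is the latest checkpoint we are allowed to start from.
    i = bisect_right(dates, query)
    if i:
        start, total = checkpoints[i - 1]
    else:
        start, total = date.min, 0
    # Only transactions from the checkpoint date through `query` are summed.
    # A real store would use a date index here instead of a full scan.
    return total + sum(a for d, a in transactions if start <= d <= query)


transactions = [
    (date(2000, 1, 5), 100),
    (date(2000, 1, 20), -30),
    (date(2000, 2, 3), 50),
    (date(2000, 2, 10), -20),
]
# Checkpoint for 1 Feb: everything dated before it sums to 100 - 30 = 70.
checkpoints = [(date(2000, 2, 1), 70)]
```

Because the checkpoint is keyed on a date, deciding which transactions it already covers is a plain date comparison; an "after-n-transactions" checkpoint would instead have to record exactly which transaction it stopped at.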
Furthermore, the case where this would become an issue is one where there
are so vastly many transactions each day that recomputing balances
within the day is too slow; that's going to be the situation of an enterprise
with thousands of transactions _per day_ depending on GnuCash.
That's not where GnuCash is, and I don't see it being where GnuCash
is headed in the short term, or even longer term. After all,
we're not _formally_ seeing plans yet to support multiuser
access. And if you think you could get thousands of transactions
per day _without_ multiple users, it must be pretty good crack
being smoked by all... :-)
If there is so much data on a single day that it cannot be coped
with, _there's a big problem_.
The "setting up balances by date" approach provides, by the way, a quite
natural way of _dynamically_ rebalancing the system based on the amount
of data out there.
A nice rule of thumb could be to generate a checkpoint for every
month, then add a checkpoint for the present date each time the count
reaches 200 transactions, resetting to zero each time. Of course, both
of those may be parameterized and tuned; change the count to 500 if
need be, or to quarterly, or weekly, or ...
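The rule of thumb above can be sketched roughly as follows. This is one possible reading, not a definitive implementation: the function name plan_checkpoints is hypothetical, the count is assumed to reset at the start of each month as well as each time it hits the threshold, and only months that actually contain transactions get a monthly checkpoint.

```python
from datetime import date


def plan_checkpoints(tx_dates, threshold=200):
    """Return the sorted checkpoint dates for a collection of transaction dates.

    ASSUMED rule: one checkpoint on the first day of every month that has
    activity, plus one on the date of the transaction that brings the
    running count up to `threshold`. The count resets to zero at each
    checkpoint of either kind.
    """
    checkpoints = set()
    current_month = None
    count = 0
    for d in sorted(tx_dates):
        month = (d.year, d.month)
        if month != current_month:
            checkpoints.add(d.replace(day=1))  # monthly checkpoint
            current_month = month
            count = 0
        count += 1
        if count >= threshold:
            checkpoints.add(d)  # checkpoint "for the present date"
            count = 0
    return sorted(checkpoints)


# Tiny illustration with threshold=3 so the behaviour is easy to follow.
sample = [date(2000, 1, 1), date(2000, 1, 2), date(2000, 1, 3),
          date(2000, 1, 4), date(2000, 2, 1)]
plan = plan_checkpoints(sample, threshold=3)
```

With threshold=3, the sample yields checkpoints on 1 Jan (monthly), 3 Jan (third transaction), and 1 Feb (monthly). Switching to quarterly or weekly would only change how the "month" key is computed.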
Consider three cases:
a) Someone who generates an average of ten transactions per month.
At the start of each month, a "balance checkpoint" gets generated,
and, since they never get anywhere near 200 transactions in a month,
no count-based checkpoints are ever added.
Behaviour here will certainly be pretty OK.
Over a year: 120 transactions; 12 checkpoints; no performance issue
_regardless_ of how we'd implement balance management.
b) Consider someone generating 1000 transactions per month.
Start of each month --> new checkpoint.
Then there are roughly four additional checkpoints generated each month.
Total for the year --> ~ 60 checkpoints across 12000 transactions.
c) Consider an enterprise with highly skewed activity.
10 transactions per month from January through September, then
3000/month in October-December.
--> First 9 months have 9 checkpoints.
--> Last 3 months each get split into 15 2-day chunks
Total for the year: 54 checkpoints, which supports
(90 + 3 x 3000) or 9090 transactions.
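The arithmetic in the three cases can be checked with a small sketch. It assumes, as the figures above seem to, that each month is split into max(1, ceil(n / 200)) chunks with one checkpoint per chunk; the helper name checkpoints_per_year is made up for this illustration.

```python
from math import ceil


def checkpoints_per_year(monthly_counts, threshold=200):
    """Estimate checkpoints given a list of per-month transaction counts.

    ASSUMPTION: a month with n transactions is split into
    max(1, ceil(n / threshold)) chunks, one checkpoint per chunk.
    """
    return sum(max(1, ceil(n / threshold)) for n in monthly_counts)


cases = {
    "a": [10] * 12,              # ten transactions every month
    "b": [1000] * 12,            # 1000 transactions every month
    "c": [10] * 9 + [3000] * 3,  # quiet Jan-Sep, busy Oct-Dec
}

for name, months in cases.items():
    print(name, sum(months), checkpoints_per_year(months))
```

Under that assumption this gives 120 transactions and 12 checkpoints for (a), 12000 and 60 for (b), and 9090 and 54 for (c), matching the totals quoted above.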
The point here is that by keeping the "balance checkpoints"
a bit dynamic, they can be used strategically to improve
performance whilst minimizing their quantity.
And as for the company for which 365 "checkpoints" in a year isn't
enough, I'm not sure there's any reasonable option to offer at
this time...
--
(concatenate 'string "cbbrowne" "@ntlug.org") <http://www.hex.net/~cbbrowne/>
Long computations which yield zero are probably all for naught.