Reading whole db at startup (was Re: String lengths in the SQL backend)
Phil Longstaff
plongstaff at rogers.com
Fri Nov 21 11:39:33 EST 2008
On November 20, 2008 06:38:14 pm Phil Longstaff wrote:
> On November 16, 2008 02:01:05 pm Derek Atkins wrote:
> > Phil Longstaff <plongstaff at rogers.com> writes:
> > >> Wait, the ENTIRE contents are read in? Historically only "necessary
> > >> data" was read in. That would be the Accounts and Commodities from
> > >> the main CoA. The transactions were all loaded on demand.
> > >
> > > Yes. I wanted to only read "necessary data". However, my (admittedly
> > > incomplete) knowledge of the engine led me to the conclusion that parts
> > > of th engine assume that all data is present. I couldn't get the
> > > account tree to show correct values, for example, unless all splits for
> > > an account were present. I ended up just loading the whole database
> > > into memory.
> >
> > There should not be any dependence on reading in all the data from the
> > DB. The old PG Backend certainly did not, and it worked fine. I think it
> > got around it by having checkpoints for things like running account
> > balances, so you only need to load "current" transactions, not all of
> > them.
> >
> > If you're loading all data all the time then the only benefit to the DB
> > backend over the XML backend is save-on-commit.
>
> I was just looking at the pg backend code again. In single user mode, it
> *did* read all of the transaction data at startup time.
OK. I have modified code (not committed yet) which doesn't read tx/splits at
startup.
At startup, it executes:
SELECT account_guid, reconcile_state, sum(quantity_num) AS quantity_num,
quantity_denom FROM splits GROUP BY account_guid, reconcile_state,
quantity_denom
This gives me 1 row per guid/reconcile_state/quantity_denom triplet. From
these, I set the start balance, start cleared balance and start reconciled
balance for each account.
When the backend receives a query to load splits for an account, it uses this
algorithm:
1: save start and end balance, cleared balance and reconciled balance for each
account
2: stop qof events
3: load all transactions which contains splits for the target account. This
loads all splits for those transactions, even those in other accounts.
4: for each account, get new end balances. If end balance doesn't match end
balance from step 1, calculate old_start_balance-(new_end_balance-
old_end_balance) and set this as the new start balance.
5: resume qof events
IOW, if the end balance for an account increases by X because of newly loaded
splits, step 4 decreases the start balance by X to make the end balance match
what is should be. Eventually, once all tx/splits are loaded, all start
balances are 0.
Each account that is the target of a query for all splits is marked so that
it's splits are never reloaded. Done as a performance improvement but would
need to be revisited for a multi-user back end.
I need a bit more testing, but once I'm happy with this, I'll commit it. It
does lead to a delay when opening an account with a lot of splits. I may look
at adding a progress bar or other indication that the app is busy.
Phil
More information about the gnucash-devel
mailing list