Reading whole db at startup (was Re: String lengths in the SQL backend)

Phil Longstaff plongstaff at rogers.com
Fri Nov 21 11:39:33 EST 2008


On November 20, 2008 06:38:14 pm Phil Longstaff wrote:
> On November 16, 2008 02:01:05 pm Derek Atkins wrote:
> > Phil Longstaff <plongstaff at rogers.com> writes:
> > >> Wait, the ENTIRE contents are read in?  Historically only "necessary
> > >> data" was read in.  That would be the Accounts and Commodities from
> > >> the main CoA.  The transactions were all loaded on demand.
> > >
> > > Yes.  I wanted to only read "necessary data".  However, my (admittedly
> > > incomplete) knowledge of the engine led me to the conclusion that parts
> > > of th engine assume that all data is present.  I couldn't get the
> > > account tree to show correct values, for example, unless all splits for
> > > an account were present.  I ended up just loading the whole database
> > > into memory.
> >
> > There should not be any dependence on reading in all the data from the
> > DB. The old PG Backend certainly did not, and it worked fine.  I think it
> > got around it by having checkpoints for things like running account
> > balances, so you only need to load "current" transactions, not all of
> > them.
> >
> > If you're loading all data all the time then the only benefit to the DB
> > backend over the XML backend is save-on-commit.
>
> I was just looking at the pg backend code again.  In single user mode, it
> *did* read all of the transaction data at startup time.

OK.  I have modified code (not committed yet) which doesn't read tx/splits at 
startup.

At startup, it executes:

SELECT account_guid, reconcile_state, sum(quantity_num) AS quantity_num, 
quantity_denom FROM splits GROUP BY account_guid, reconcile_state, 
quantity_denom

This gives me 1 row per guid/reconcile_state/quantity_denom triplet.  From 
these, I set the start balance, start cleared balance and start reconciled 
balance for each account.

When the backend receives a query to load splits for an account, it uses this 
algorithm:

1: save start and end balance, cleared balance and reconciled balance for each 
account
2: stop qof events
3: load all transactions which contains splits for the target account.  This 
loads all splits for those transactions, even those in other accounts.
4: for each account, get new end balances.  If end balance doesn't match end 
balance from step 1, calculate old_start_balance-(new_end_balance-
old_end_balance) and set this as the new start balance.
5: resume qof events

IOW, if the end balance for an account increases by X because of newly loaded 
splits, step 4 decreases the start balance by X to make the end balance match 
what is should be.  Eventually, once all tx/splits are loaded, all start 
balances are 0.

Each account that is the target of a query for all splits is marked so that 
it's splits are never reloaded.  Done as a performance improvement but would 
need to be revisited for a multi-user back end.

I need a bit more testing, but once I'm happy with this, I'll commit it.  It 
does lead to a delay when opening an account with a lot of splits.  I may look 
at adding a progress bar or other indication that the app is busy.

Phil


More information about the gnucash-devel mailing list