Startup speed (was: Archiving old transactions)
Jonathan Kamens
jik at kamens.us
Fri Jun 21 11:44:03 EDT 2013
Hi David,
That's not a dumb question at all; in fact, it's a very smart question.
To give my own answer (which I'm sure is only one of many possible
answers), I actually need to "go up a level," hence the change in the
Subject line.
For me, the issue isn't really archiving transactions; archiving is just
a workaround for the /real/ issue, which is that GnuCash's startup time
grows linearly with the number of transactions.
People who are saying, "Well, it only takes 30 seconds to start up for
me, and that seems fine," or, "I don't have a problem waiting patiently
for it to launch," are in my opinion missing the point, or more
accurately missing two points.
First of all, let me say it again, the startup time /grows linearly with
the number of transactions./ I've been using GnuCash to track my
finances since 2004. I hope to live for another fifty years or so, and I
hope to continue using GnuCash to track my finances for much of that
time. I also hope that I will be entering more and more transactions per
year into GnuCash, not fewer, because I hope to have more money to spend
as I get older ;-). As time goes on, it's going to take longer and
longer for me to launch the program, unless the linear growth problem is
fixed.
Start-up time isn't the only issue. Searching over 60 years of
transactions is going to be much slower than searching over 10 years of
transactions, and the search results are going to be cluttered with
ancient stuff I don't care about. Generating a report over 60 years of
transactions is going to be much slower than generating a report over 10
years of transactions, and many of the reports default to the entire
time span of transactions and don't let you change the default until the
first version of the report has already been fully generated.
In short, requiring a monolithic GnuCash file and reading the entire
contents of that file into memory before allowing the user to do
anything with it simply does not scale. If GnuCash is in it for the long
haul, and I certainly hope that it is, this problem cannot be ignored
forever.
Second, anybody with a clue about UX knows that responsiveness is far
and away the single issue that users care most about. Slowness -- during
or after start-up -- is guaranteed to irritate people and make them
switch apps. If you're looking for low-hanging fruit in terms of
improving the user experience, this is it. (Yes, yes, I know, "GnuCash
is free, you don't need to use it if you don't want to, if it's too slow
for you you're welcome to go use something else." Please, spare me the
free software lecture. I am assuming here that the folks who write and
maintain GnuCash actually want it to be attractive to users and more
pleasant for themselves to use.)
Colin claimed that this isn't an issue because computers keep getting
faster so startup time remains relatively constant. There are a number
of reasons why I don't buy that, including:
* I find abhorrent the attitude, which Microsoft let loose on the world,
that it's OK for software to get slower and more bloated because
faster hardware will compensate. The software should
be just as big and slow as it needs to be to do the work it is
expected to do. Since there are financial applications that manage
just as much data as GnuCash without this start-up problem, GnuCash
clearly doesn't /need/ to suffer from this. The fact that it does is
indicative of a problem with its design, as John Ralls has already
acknowledged. We should be acknowledging that and aspiring to fix
it, not rationalizing it.
* I should not need to upgrade my computer every couple of years just
to be able to balance my checkbook and credit-card statements. Not
everybody can afford to do that.
* In a related vein, the kind of work that GnuCash does really is the
kind of work that it should be possible to do on a low-end computer.
We're not talking about Pixar animation, folks, we're talking about
double-entry accounting.
So, I've now made the case that GnuCash's current architecture is
problematic because it requires a monolithic GnuCash file / database and
reads it all on startup. People are, of course, free to disagree with me
about that. For those who do, the rest of what I'm about to write is
irrelevant so you can just stop here. ;-) But if there is a problem,
then what are the possible solutions?
Well, one of them is the one that my Perl script implements, i.e.,
archiving old transactions into a separate file. That solves the startup
problem, and it also solves the problem of all your searches showing you
ancient search results you don't care about. However, as others have
pointed out, it introduces a new problem -- your data is no longer
searchable all in one place, so if you need to find transactions from
years ago, you have to go digging through old files. Furthermore, at
least in my implementation, the files with archived transactions make no
effort to maintain historical balances. It's not ideal.
Another possible solution is something like this...
* Enhance the GnuCash file format by adding a section at
the top of each file summarizing the balances of all the accounts
referenced by transactions in that file.
* Allow GnuCash to read transactions from and write transactions to
multiple files rather than just a single file.
* By default, the detailed transactions are only read from some of the
files (how many is configurable by the user); only balances are read
from the older files and summed to produce initial balances of every
account.
* If the user needs to access transactions archived in one of the
older files, s/he tells GnuCash to read the transactions in that
file, and the balances previously read from that file are replaced
by the actual transactions.
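
To make the multi-file idea above concrete, here is a minimal sketch in Python. It is not GnuCash code: the names ArchiveFile, write_archive, Book, and load_details are hypothetical, the "files" are in-memory objects, and each transaction is reduced to an (account, amount) pair. The point is only that an account's balance comes out the same whether an old file contributes its summary or its actual transactions.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class ArchiveFile:
    """Hypothetical per-period file: a balance summary plus detailed transactions."""
    name: str
    summary: dict        # account -> net change within this file (the "top section")
    transactions: list   # (account, amount) pairs; stands in for the expensive part


def write_archive(name, transactions):
    """Build a file, computing the per-account summary once, at write time."""
    summary = {}
    for account, amount in transactions:
        summary[account] = summary.get(account, Decimal(0)) + amount
    return ArchiveFile(name, summary, list(transactions))


class Book:
    def __init__(self, files, detailed_count=1):
        """files are ordered oldest to newest; only the newest
        detailed_count files have their transactions read at startup."""
        self.files = {f.name: f for f in files}
        self.detailed = {}    # name -> transactions actually read
        self.summarized = {}  # name -> summary only
        first_detailed = max(len(files) - detailed_count, 0)
        for i, f in enumerate(files):
            if i >= first_detailed:
                self.detailed[f.name] = f.transactions
            else:
                self.summarized[f.name] = f.summary

    def load_details(self, name):
        """Replace a file's summary with its actual transactions."""
        if name in self.summarized:
            del self.summarized[name]
            self.detailed[name] = self.files[name].transactions

    def balance(self, account):
        """Sum summaries from old files and transactions from loaded ones."""
        total = Decimal(0)
        for summary in self.summarized.values():
            total += summary.get(account, Decimal(0))
        for transactions in self.detailed.values():
            for acct, amount in transactions:
                if acct == account:
                    total += amount
        return total
```

In a real implementation the summary would be parsed from the top of each file without touching the transactions below it, which is where the startup savings would come from.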
With the database backend, something like this is even easier to
implement... Instead of using separate files, transactions can simply be
queried from the DB by date, and initial balances can be produced by
doing fast aggregate queries for transactions earlier than the earliest
date the user currently wants displayed.
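
As a rough illustration of the database variant, the sketch below uses Python's sqlite3 module with a made-up table. The splits table, its columns, and the sample rows are assumptions for the example and are not GnuCash's actual SQL schema. Amounts are stored as integer cents and dates as ISO strings, so that string comparison orders them correctly.

```python
import sqlite3


def open_demo_db():
    """Create an in-memory database with a hypothetical 'splits' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE splits (account TEXT, post_date TEXT, amount_cents INTEGER)"
    )
    conn.executemany(
        "INSERT INTO splits VALUES (?, ?, ?)",
        [
            ("Checking", "2005-03-01", 100000),
            ("Checking", "2009-07-15", -25000),
            ("Groceries", "2009-07-15", 25000),
            ("Checking", "2013-06-01", -4000),
            ("Groceries", "2013-06-01", 4000),
        ],
    )
    return conn


def initial_balances(conn, cutoff):
    """One aggregate query: per-account totals for everything before cutoff."""
    rows = conn.execute(
        "SELECT account, SUM(amount_cents) FROM splits "
        "WHERE post_date < ? GROUP BY account",
        (cutoff,),
    )
    return dict(rows.fetchall())


def recent_splits(conn, cutoff):
    """Detailed rows only for the period the user wants displayed."""
    return conn.execute(
        "SELECT account, post_date, amount_cents FROM splits "
        "WHERE post_date >= ? ORDER BY post_date",
        (cutoff,),
    ).fetchall()
```

With a cutoff of "2013-01-01", initial_balances returns the opening balances for the displayed period and recent_splits returns only the 2013 rows. An index on post_date would presumably be needed for the aggregate query to stay fast on a large table.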
An enhancement to the above idea is to read /only/ the aggregate
balances, from /all/ files or rows in the database, on startup, and then
either (a) read detailed transactions in the background, (b) read
detailed transactions as they are needed for display, or (c) some smart
combination of the two.
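
Option (c) might look roughly like the following sketch, again in Python and again hypothetical: LazyBook, load_period, and the notion of a "period" are names invented for the example. Balances are available as soon as the object exists; detailed transactions are loaded either on demand or by a background thread that works from the newest period to the oldest.

```python
import threading


class LazyBook:
    def __init__(self, balances, load_period, periods):
        self.balances = dict(balances)   # aggregate balances, usable immediately
        self._load_period = load_period  # callable: period -> transactions (slow)
        self._periods = list(periods)    # ordered oldest to newest
        self._cache = {}                 # period -> transactions already loaded
        self._lock = threading.Lock()

    def transactions_for(self, period):
        """Load a period on demand; later calls are served from the cache."""
        # Holding the lock during the load keeps the sketch simple, at the
        # cost of making other requests wait while a load is in progress.
        with self._lock:
            if period not in self._cache:
                self._cache[period] = self._load_period(period)
            return self._cache[period]

    def start_background_load(self):
        """Fill the cache in the background, newest period first."""
        def worker():
            for period in reversed(self._periods):
                self.transactions_for(period)

        thread = threading.Thread(target=worker, daemon=True)
        thread.start()
        return thread
```

A user interface built on something like this could show account balances right away and open a register as soon as the period it needs has been loaded, whichever path loads it first.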
I'm sure there are other possible solutions.
Ultimately, I think the issue, as I mentioned previously (and so did
John Ralls), is that there is a fundamental design flaw in requiring all
transactions to be read at startup, so the /right/ solution is to fix
the design flaw and allow transactions to be read in the background or
as needed or both.
Now I'm sort of regretting offering to contribute to a "bounty" to
convince somebody to implement archiving, because I think I'd rather see
the developers spending time working on fixing the fundamental design
flaw than on a workaround for it. Maybe John Ralls is right that doing
something real about this "requires a complete re-write of Gnucash's
core functionality," or just maybe it might be possible, with some
out-of-the-box thinking, to graft something usable onto what's there now
without rewriting everything?
jik
On 06/21/2013 10:08 AM, David Carlson wrote:
> Excuse me for being dumb, but what is the definition of archiving, anyway.
> Is it the same for GnuCash as for e-mail?
> Physically where is the stuff moved to and how hard is it to get it back
> if needed?
> Would I use GnuCash to look at that archived stuff?
>
> David C