generating reports takes minutes
John Clements
gnucash at brinckerhoff.org
Wed Nov 1 14:33:26 EST 2006
On Nov 1, 2006, at 10:43 AM, Derek Atkins wrote:
> Quoting John Clements <gnucash at brinckerhoff.org>:
>
>>>
>>> I'm using gnucash with data that starts in 2002 - that is 4 years
>>> of data...
>>> Generating a report seems to take minutes - and I generate
>>> several at
>>> startup - :(
>>>
>>> Is there a way to improve this "speed" ?
>>> Report generation is becomming a pain, and I feel it's horribly slow
>>> compared to M$ Money which only takes seconds ...
>>>
>>
>> I've experienced this for years, and I recently noticed that it seems
>> to be much slower for finer-grained time intervals. So let me ask a
>> (possibly insulting) question. When I'm generating, say, a cash-flow
>> report for N time slices (e.g. quarterly for 4 years = 16 slices),
>> with a database of M transactions, is there an operation in there
>> that's of time NM?
>
> Why would this be insulting. I suspect the answer is "yes". In
> fact, it's quite possible that it's WORSE than M*N!
>
>> For instance, I can just imagine this piece of code:
>>
>> (let ([divided (map (lambda (range) (filter (falls-in-range range)
>> all-transactions))
>> given-ranges)])
>> ...)
>>
>> ... which would result in scanning the list of transactions N times.
>> You can tell me the code isn't written this way, and I'll just keep
>> my mouth shut and deal with it.
>
> Oh, it might be.
Hmm... well, I think I can see where this open-source software
conversation is heading, and I don't like it at all. Is there someone
who actually has the time & inclination & knowledge to answer this
question definitively and replace the aforementioned code with
something like (warning! totally untested code!):
(let ([range-vec (make-vector (+ (vector-length split-dates) 1) null)])
(for-each (lambda (transaction)
(let ([i (search split-dates transaction 0 (- (vector-
length split-dates) 1))])
(vector-set! range-vec i (cons transaction (vector-
ref range-vec i)))))))
(define (search split-dates this-date left right)
(if (= left right)
(+ right 1)
(if (< this-date (vector-ref split-dates left))
left
(search split-dates this-date (+ left 1) right))))
?
... and before you point it out, this one is ALSO N*M. The interface
of search is such that you could drop in a binary search instead, if
you cared. My guess is that the improved cache locality of
considering each transaction once is going to be a big win over the
code that scans the transaction set N times... if the code is
actually written this way.
Ah well.
All the best and thanks again,
John Clements
Cal Poly State University
More information about the gnucash-user
mailing list