GDA interim benchmarking synopsis

Fri Feb 15 15:37:34 EST 2008

Derek Atkins wrote:
> Mark Johnson <mrj001 at shaw.ca> writes:
>
>   
>>> I'm really beginning to wonder if we need to write special SQL
>>> based on the database we're using in order to tune the performance
>>> because of issues like this?  Yes, I know that that sort of defeats
>>> the purpose of using a library like GDA, but if GDA can't form
>>> an optimal query then it might behoove us to do the right thing
>>> rather than hope they do.
>>>   
>>>       
>> I hope not, but we are talking not about minor differences in
>> performance, but orders of magnitude.  I was absolutely floored at
>> MySql's performance with sub-queries.  I do have a couple of things to
>> try with it, though:
>> 1. creation of a secondary index on slots.obj_guid
>> 2. There is a section of the MySql manual on optimizing this stuff.
>> Maybe it has something helpful.
>>     
>
> Exactly.  If it were just a minor difference then I just wouldn't
> care, but the difference between seconds and minutes is sufficiently
> large that we may need to work around the issue ourselves.
>
> -derek
>   
I have now created the following indices:
create index idx_splits_account_guid on splits (account_guid);
create index idx_slots_obj_guid on slots (obj_guid);
Phil, you really should create this in gnucash-gda.  They will help.

These have improved the performance of the MySql queries by a factor of 
nearly 10.  However, they are still far too slow.  Waiting a few minutes 
instead of tens of minutes for a register to open simply won't be 
acceptable.

I've also gone through the section of the MySql manual on optimizing "IN 
subquery" types of queries.  I tried their "pushdown" method.  Still 
unacceptable performance.  (I didn't time it, but I think it was 
slower.)  I really don't understand why a sub-query which produces a 
CONSTANT set of rows needs to be re-run by the outer query, which 
appears to be what the MySql manual says and also what is happening.

I could not run the "slots" queries on PostgreSQL.  However, it 
performed quite well on both forms of the remaining queries (sub-query 
and join).  It did so even before I created the 2 indices.

I am surprised at this.  Comparisons I've read claim that MySql is 
faster than PostgreSQL.  However, I'm not seeing it.  Perhaps, these 
were based upon older versions.

MySql version 5.0.51
SQLite3 version: 3.3.7
PostgreSQL version 8.2.6

Mark