GnuCash Daily Source Diff
Dave Peticolas
dave@krondo.com
09 Nov 2001 13:26:06 -0800
--=-PDhsUfNXlDSIjJnGAh0E
Content-Type: text/plain
Content-Transfer-Encoding: quoted-printable
On Fri, 2001-11-09 at 07:20, Linas Vepstas wrote:
> Hi Dave,
>=20
> May I hassle you some more?
Sure :)
> On Fri, Nov 09, 2001 at 08:04:24AM -0600, Dave Peticolas was heard to rem=
ark:
> > Index: src/backend/postgres/gncquery.c
> > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
> > RCS file: /home/cvs/cvsroot/gnucash/src/backend/postgres/gncquery.c,v
> > retrieving revision 1.6
> > retrieving revision 1.7
> > diff -u -r1.6 -r1.7
> > --- src/backend/postgres/gncquery.c 2001/11/06 23:02:26 1.6
> > +++ src/backend/postgres/gncquery.c 2001/11/09 11:50:23 1.7
>=20
> > - /* determine whther the query will need to reference the account
> > - * or commodity tables. If it doesn't need them, then we can gain
> > - * a significant performance improvement by not specifying them.
> > - * The exact reason why this affects performance is a bit of a=20
> > - * mystery to me ... */
> > + /* determine whether the query will need to reference certain
> > + * tables. */
>=20
> I was somewhat disheartened to see this change; and would really like
> to get you to put it back. This routine would not exist at all, if I
> hadn't spent a week performance tuning the sql backend. Originally,
> I specified *all* of the tables that might be needed for a query; the
> code for this was very simple. But then on a hunch, I noticed that=20
> including a table that wasn't actually referenced could slow performance
> by a factor of 10 (!!!!).
Yes, I know, I was the one who originally pointed out to you
that removing unneeded tables would improve performance (you remember
the 10-minute startup times I saw?). However, upon reading a Postgres
book I found that in Postgres you don't actually need to use the FROM
clause at all, i.e., Postgres can figure out the tables it needs from=20
the rest of the query. So you only need to use FROM if you want to do
a self-join and need to specify a new name for a table. So the new code
only puts in FROM clauses for the self-joined commodity tables if
needed, the others are left out.
> > + /* add needed explicit tables. postgres can figure out the rest. */
>=20
> Bzzzt. I think this note says it all. Postgres can figure out the=20
> rest, but it will crater performance as it does so. You really, really
> have got to restore the missing tables.
Have you actually tested it to see if it craters performance? The=20
problem with the original queries was putting *unreferenced* tables
in the FROM clause, i.e., explicity telling Postgres to search a
table that didn't need to be. Postgres isn't going to search tables
that aren't referenced in the query at all.
> > - switch (xaccGUIDType (&pd->guid.guid, session))
> > +
> > + sq->pq =3D stpcpy (sq->pq, " (");
> > +
> > + if (guid_equal (&pd->guid.guid, xaccGUIDNULL ()))
> > {
> > - case GNC_ID_NONE:
> > - case GNC_ID_NULL:
> > - default:
> > - sq->pq =3D stpcpy(sq->pq, "FALSE ");
> > - break;
> > - =20
> > - case GNC_ID_ACCOUNT:
> > - sq->pq =3D stpcpy(sq->pq, "gncAccount.accountGuid=
=3D '");
> > - sq->pq =3D guid_to_string_buff (&pd->guid.guid, s=
q->pq);
> > - sq->pq =3D stpcpy(sq->pq, "' ");
> > - break;
> > - =20
> > - case GNC_ID_TRANS:
> > - sq->pq =3D stpcpy(sq->pq, "gncTransaction.transGu=
id =3D '");
> > - sq->pq =3D guid_to_string_buff (&pd->guid.guid, s=
q->pq);
> > - sq->pq =3D stpcpy(sq->pq, "' ");
> > - break;
> > - =20
> > - case GNC_ID_SPLIT:
> > - sq->pq =3D stpcpy(sq->pq, "gncEntry.entryGuid =3D=
'");
> > - sq->pq =3D guid_to_string_buff (&pd->guid.guid, s=
q->pq);
> > - sq->pq =3D stpcpy(sq->pq, "' ");
> > - break;
> > + sq->pq =3D stpcpy(sq->pq, "FALSE ");
> > }
> > -
> > - if (0 =3D=3D pd->guid.sense)
> > + else
> > {
> > - sq->pq =3D stpcpy (sq->pq, ") ");
> > + sq->pq =3D stpcpy(sq->pq, "gncAccount.accountGuid =3D=
'");
> > + sq->pq =3D guid_to_string_buff (&pd->guid.guid, sq->p=
q);
> > + sq->pq =3D stpcpy(sq->pq, "' ");
> > + sq->pq =3D stpcpy(sq->pq, " OR ");
> > +
> > + sq->pq =3D stpcpy(sq->pq, "gncTransaction.transGuid =
=3D '");
> > + sq->pq =3D guid_to_string_buff (&pd->guid.guid, sq->p=
q);
> > + sq->pq =3D stpcpy(sq->pq, "' ");
> > + sq->pq =3D stpcpy(sq->pq, " OR ");
> > +
> > + sq->pq =3D stpcpy(sq->pq, "gncEntry.entryGuid =3D '")=
;
> > + sq->pq =3D guid_to_string_buff (&pd->guid.guid, sq->p=
q);
> > + sq->pq =3D stpcpy(sq->pq, "' ");
>=20
> The above looks like another performance-cratering change. Why search
> all tables, when you can search the one that you need? I presume you
> made this change because you don't know which table to search ... but
> how can you possibly not know which table to search? Who would be
> stupid enough to ask 'get me this GUId, but I don't know if its a
> transaction or an account guid' ????
I am simply implementing the semantics of the Query predicate, which is
to match a split whose guid, transaction guid, or account guid matches
the given guid. That is what the C code implements. Don't be fooled by
the fact that the engine Query code uses xaccGUIDType -- that function=20
only works if the entity has already been loaded into the engine. The=20
sql backend can't assume that.
> > case PR_SHRS: {
> > PINFO("term is PR_SHRS");
> > sq->pq =3D stpcpy(sq->pq,=20
> > - "gncEntry.accountGuid =3D gncAccount.accountGuid =
AND "
> > "gncAccount.commodity =3D account_com.commodity A=
ND ");
> > AMOUNT_TERM ("gncEntry.amount","account_com");
>=20
> ???? Don't you need to know which account is to be matched ??
Yes, but that is now added further up if need_account is true.
You also need that line if you want to sort on account fields,
so I figured to do it the same as need_entry.
dave
--=-PDhsUfNXlDSIjJnGAh0E
Content-Type: application/pgp-signature
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org
iD8DBQA77Enu5effKKCmfpIRAkuLAKCge7fXhY2bDHW/m/v9kbPIEfBm+wCff6iE
D460Jd7pQhW6pTo0WPdCYFM=
=JwG0
-----END PGP SIGNATURE-----
--=-PDhsUfNXlDSIjJnGAh0E--