[GNC-dev] Slow user interface in 3.x gnucash (for large files), and potential optimization
David Carlson
david.carlson.417 at gmail.com
Wed Jul 4 20:33:33 EDT 2018
Christian Stimming and all developers,
First, I want to thank all of you for your great effort to roll out the 3.0
release and for the quick follow-ups to fix bugs and get releases 3.1 and 3.2
out in quick succession.
Christian has made substantial progress in isolating and solving several
major sources of lagging response in GnuCash. I would like to ask you to
see if there is room for improvement, or at least some low-hanging fruit, in
a couple of areas that bother me. Actually, I would like all developers to
take an interest in this issue and at least keep in mind that whatever
code they implement should be designed to run as efficiently as possible.
I am still using releases 2.6.15 through 2.6.17 on various computers, and I
have recently observed a couple of things that are definitely not new but
that may still affect release 3.2 and the future 3.3. These are more
pronounced when using SSH to run GnuCash on a faster computer from a slower
(Windows) computer. Because the SSH case seems especially sensitive, I
suspect that there may be repetitive screen refreshes happening as the code
is executed.
For me, the most noticeable lag happens every time an automatic file save
occurs. GnuCash does not accept user input during the save. This is
stretched dramatically if the file resides on a network resource instead of
the local hard drive or SSD, but it is also substantially slower when using
SSH to run GnuCash remotely. For me this lag is 1 minute 10 seconds, with a
local-network NAS serving the foo.gnucash standard-format compressed file
via Ethernet to a virtual Debian Stretch Linux machine.
When opening Edit > Preferences there is a long lag. In my situation, the
lag is about 24 seconds before the Preferences window appears. This is not
something users do frequently, but it is quite noticeable.
I have also noticed that after reopening a file that was previously closed
with open reports, the first time the user touches a report tab there is
an obvious flicker as the report in the window is reconstructed before the
user's very own eyes. This is most apparent for linechart reports and
probably barchart reports as well.
Thank You All,
David C
On Fri, Jun 15, 2018 at 3:47 PM, John Ralls <jralls at ceridwen.us> wrote:
>
>
> > On Jun 15, 2018, at 12:55 PM, Christian Stimming <christian at cstimming.de> wrote:
> >
> > On Friday, June 15, 2018, at 10:05:09, John Ralls wrote:
> >>>> A very good catch indeed. But pre-constructing the string in
> >>>> qofbook.cpp only saves two string constructions per invocation as the
> >>>> vector still has to make its own copies. I guess that it's much worse
> >>>> for you because the ancient gcc on Ubuntu 14.04 (gcc 4.8) doesn't do
> >>>> small-string optimization.
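For illustration, a minimal self-contained sketch (not GnuCash code) of what
small-string optimization means for short keys like these; the layout check
is a heuristic that holds for common libstdc++/libc++ builds, not something
the standard guarantees:

    #include <cstdint>
    #include <cstdio>
    #include <string>

    // Heuristic check: with small-string optimization (SSO) the character
    // buffer of a short std::string lives inside the string object itself,
    // so constructing a short key performs no heap allocation.
    static bool buffer_is_inside_object(const std::string& s)
    {
        auto buf = reinterpret_cast<std::uintptr_t>(s.data());
        auto obj = reinterpret_cast<std::uintptr_t>(&s);
        return buf >= obj && buf < obj + sizeof(std::string);
    }

    int main()
    {
        std::string short_key("example-key");  // hypothetical key, short enough for SSO
        std::string long_key(128, 'x');        // too long for SSO, heap-allocated
        std::printf("short key stored inline: %d\n", buffer_is_inside_object(short_key));
        std::printf("long key stored inline:  %d\n", buffer_is_inside_object(long_key));
        return 0;
    }

Under gcc 4.8's old copy-on-write std::string the short key would also live on
the heap, which is why every construction of such a key there costs an
allocation.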
> >>>
> >>> Is there any reason we can't use std::string& in the vector? Or do we
> >>> think that we might lose references there?
> >>
> >> In this instance the string is static so it would be safe to use const
> >> std::string&, but while there's an is_volatile type trait, there's no
> >> is_static, so we can't insert a static assert to catch cases where the
> >> actual std::string is on the stack. That could lead to some ugly bugs.
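A minimal sketch of the kind of bug being warned about (illustrative only,
not GnuCash code): std::vector cannot hold std::string& directly, and a
std::reference_wrapper to a string that lives on the stack dangles as soon
as that string goes out of scope:

    #include <functional>
    #include <iostream>
    #include <string>
    #include <vector>

    // A vector of references instead of values.  This is only safe if every
    // referenced string outlives the vector, e.g. a static constant.
    using StringRefVec = std::vector<std::reference_wrapper<const std::string>>;

    static const std::string static_key("static-key");  // outlives everything: safe

    static StringRefVec make_path()
    {
        std::string stack_key("stack-key");      // destroyed when this function returns
        StringRefVec path;
        path.push_back(std::cref(static_key));   // fine
        path.push_back(std::cref(stack_key));    // dangles once make_path() returns
        return path;
    }

    int main()
    {
        StringRefVec path = make_path();
        std::cout << path[0].get() << '\n';      // OK: prints "static-key"
        // std::cout << path[1].get() << '\n';   // would be undefined behaviour
        return 0;
    }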
> >
> > The std containers are specified in a way that they only make sense for
> > containing values by-value (ok, this might be formulated a bit too short,
> > but not too long ago this was a good rule of thumb). It is up to the
> > implementation to make optimizations (copy-on-write and such), but the
> > interface requires the programmer to think of the stored type as
> > "by-value".
> >
> > Nevertheless I think my change made good use of the std::string's
> > already existing optimizations. In particular, the saving was so
> > surprisingly large here that I think gcc's std::string itself implements
> > some sort of copy-on-write optimization, and this means its content
> > isn't deep-copied in these calls.
> >
> > Some numbers from the valgrind/callgrind evaluation. I got the following
> > number of calls, starting from qof_query_run and skipping some
> > uninteresting calls in between:
> >
> > - 6 qof_query_run
> > - 6 g_list_sort_with_data
> > - 360,000 (approx.) sort_func - due to the number of splits and txns in
> >   my file
> > - 360,000 xaccSplitOrder
> > The most expensive part in this function is this next callee:
> > - 360,000 qof_book_use_split_action_for_num_field
> > - 360,000 (approx.) qof_book_get_property
> >
> > This function in turn has approx. 1,100,000 calls to std::basic_string
> > constructor and destructor. The change in my commit is this:
> > Before my commit, these calls also caused 1,100,000 calls to
> > std::string::_S_create and std::string::_M_destroy, i.e. the by-value
> > constructor and destructor. After my commit, I see a mere 22,000 calls
> > to std::string::_S_create and std::string::_M_destroy.
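As far as the description above goes, the shape of the change is roughly the
following sketch (the table, function names and key are made up for
illustration; this is not the actual qofbook.cpp commit): construct the key
string once and reuse it, instead of building a temporary std::string on
every lookup.

    #include <cstdio>
    #include <string>
    #include <unordered_map>

    // Hypothetical property table keyed by std::string (a stand-in for the
    // real KVP/property lookup).
    static std::unordered_map<std::string, int> property_table;

    // Before: every call builds (and destroys) a temporary std::string from
    // the char* literal just to perform the lookup -- roughly the per-call
    // _S_create/_M_destroy traffic counted above.
    static int get_property_slow(const char* key)
    {
        return property_table[std::string(key)];
    }

    // After: the key object is constructed exactly once; repeated calls
    // reuse it, so the per-call constructor/destructor traffic disappears.
    static int get_property_fast()
    {
        static const std::string key("use-split-action-for-num");  // hypothetical key name
        return property_table[key];
    }

    int main()
    {
        property_table["use-split-action-for-num"] = 1;
        std::printf("%d %d\n", get_property_slow("use-split-action-for-num"),
                    get_property_fast());
        return 0;
    }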
> >
> >> Besides, using strings is still leaving performance on the table: A
> >> string comparison is n times longer than an int comparison, where n is
> >> the number of consecutive leading characters that are the same in the
> >> two strings. I got a huge speedup on loading a few months ago because
> >> GncGUID's operator==() was doing a character-wise comparison of the
> >> ASCII representation instead of comparing the two int64_ts of the
> >> actual GUID.
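A sketch of the two comparison strategies being contrasted (the struct
layout and names are illustrative, not the real GncGUID definition):

    #include <cstdint>
    #include <cstring>

    // Illustrative 128-bit GUID split into two 64-bit halves.
    struct ExampleGuid
    {
        std::uint64_t hi;
        std::uint64_t lo;
    };

    // Fast: two integer comparisons, independent of how many leading hex
    // characters the printed forms happen to share.
    inline bool operator==(const ExampleGuid& a, const ExampleGuid& b)
    {
        return a.hi == b.hi && a.lo == b.lo;
    }

    // Slow: comparing the 32-character ASCII/hex representation walks both
    // strings until the first differing character.
    inline bool guid_equal_by_string(const char* a_hex, const char* b_hex)
    {
        return std::strncmp(a_hex, b_hex, 32) == 0;
    }

    int main()
    {
        ExampleGuid a{0x0123456789abcdefULL, 0xfedcba9876543210ULL};
        ExampleGuid b = a;
        const char* a_hex = "0123456789abcdeffedcba9876543210";
        const char* b_hex = "0123456789abcdeffedcba9876543210";
        return (a == b && guid_equal_by_string(a_hex, b_hex)) ? 0 : 1;
    }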
> >
> > Sounds good. The above numbers seem to indicate that std::string already
> > optimizes away a long comparison as long as the string memory itself is
> > identical. In my opinion this is a good enough optimization.
> >
> >> KVP paths aren't really a good use for std::string anyway: It's a lot of
> >> weight for something that's used as a human-readable index. Even static
> >> char[] is heavy for this purpose. We could get a huge boost if we use
> >> something like GLib's quarks [1].
> >> Want to have a go at that?
> >
> > Err... this would mean a major refactoring of any access to the KVP
> > system. No, I don't feel motivated to work in that direction. Also, if I
> > understand the above valgrind evaluation correctly, we can already
> > achieve almost the optimum (i.e. pointer comparison) by ensuring that a
> > set of std::string constants is used. I think this is the most efficient
> > way to proceed here, and I would just push this commit after some more
> > cleanup.
>
> Christian,
>
> Hardly major, particularly considering that Aaron has already completely
> rewritten KVP into C++ and separately changed the access from '/'-delimited
> string paths to std::vector<std::string> of the tokens.
> All that's required is lifting GLib's GQuark and converting it to C++,
> changing KVP frame's std::vector<std::string> to std::vector<GncQuark>, and
> using g_quark_from_static_string instead of static std::string.
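To show what interning buys, a minimal sketch using GLib's plain C GQuark
API (the proposed GncQuark C++ wrapper would be new code, so only stock
GLib calls and a hypothetical key appear here): each distinct key string is
registered once and mapped to a small integer, so key equality becomes an
integer compare no matter where the string literal lives.

    // Build with: g++ quark_demo.cpp $(pkg-config --cflags --libs glib-2.0)
    #include <glib.h>
    #include <cstdio>

    int main()
    {
        // The same key string interned from two different call sites (in
        // real code, from two different source files) maps to the same
        // small integer.
        GQuark a = g_quark_from_static_string("example/kvp-key");  // hypothetical key
        GQuark b = g_quark_from_string("example/kvp-key");

        // Key equality is now a plain integer comparison, regardless of
        // where the string literals live.
        std::printf("equal: %d (quark value %u)\n", a == b, a);

        // The canonical string is still available for display or serialization.
        std::printf("string: %s\n", g_quark_to_string(a));
        return 0;
    }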
>
> The situation you've tested is a bit of a special case because all of the
> uses of the KVP_OPTIONS keys are in one file. Any keys used in more than
> one file will have different static std::strings and pointer comparison
> won't work. Examples include anything accessed with qof_instance_set/get.
> That said, the mallocs/deallocs are 99% or more of the performance hit. If
> you're willing and have time to do it your way and get it done quickly,
> then by all means go ahead.
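To make the caveat concrete, a small self-contained sketch (the two
constants sit in one file here only so it compiles as shown; in the scenario
described they would live in different source files): two static std::string
constants with identical contents compare equal by value but are distinct
objects, so a pointer/identity comparison will not recognize them as the
same key.

    #include <iostream>
    #include <string>

    // In the scenario described above these two constants would live in two
    // different source files that both use the same (hypothetical) KVP key.
    static const std::string key_in_file_a("use-split-action-for-num");
    static const std::string key_in_file_b("use-split-action-for-num");

    int main()
    {
        std::cout << "equal by value:   " << (key_in_file_a == key_in_file_b) << '\n';   // 1
        std::cout << "same object:      " << (&key_in_file_a == &key_in_file_b) << '\n'; // 0
        std::cout << "same char buffer: "
                  << (key_in_file_a.data() == key_in_file_b.data()) << '\n';             // 0
        return 0;
    }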
>
> Regards,
> John Ralls
> _______________________________________________
> gnucash-devel mailing list
> gnucash-devel at gnucash.org
> https://lists.gnucash.org/mailman/listinfo/gnucash-devel
>