[GNC-dev] More need for speed improvement around XML

John Ralls jralls at ceridwen.us
Sun Dec 30 19:52:58 EST 2018



> On Dec 30, 2018, at 1:25 PM, Christian Stimming <christian at cstimming.de> wrote:
> 
> Am Sonntag, 30. Dezember 2018, 00:44:40 CET schrieb John Ralls:
>>> On Dec 29, 2018, at 2:16 PM, Christian Stimming <christian at cstimming.de>
>>> wrote:> 
>>> Am Samstag, 29. Dezember 2018, 01:45:32 CET schrieb John Ralls:
>>>>> When saving to XML file, for each transaction the call stack with the
>>>>> expensive code walks down like this:
>>>>> 
>>>>> gnc_transaction_dom_tree_create
>>>>> add_time64
>>>>> time64_to_dom_tree
>>>>> gnc_print_time64
>>>>> GncDateTime::format
>>>>> GncDateTimeImpl::format
>>>> 
>>>> Christian,
>>>> 
>>>> I don't have time to fully test it out right now, but give
>>>> https://github.com/jralls/gnucash/tree/maint a go.
>>>> 
>>>> As noted in the HEAD commit it has a side effect of recording times in
>>>> XML
>>>> files in UTC, no timezones. Users who want to keep their files in version
>>>> control will like that, but it needs to be checked for backward
>>>> compatibility with 2.6 before it can be merged into maint.
>>> 
>>> Thanks for working at this. Indeed the main issue is that we don't need a
>>> user-visible format conversion (which must respect the locale) but instead
>>> a serialization to a fixed iso8601 format. Fixing this issue also means
>>> that the XML files will no longer depend on the local timezone, which is
>>> an issue that has been asked for a long time.
>>> 
>>> Thanks for your patch - this fixes performance issues with
>>> gnc_time64_to_iso8601_buff() and all of its callers. However, the
>>> Save-to-XML function calls gnc_print_time64() instead (from
>>> time64_to_dom_tree), which is a different code path. Consequently, your
>>> patch alone does not modify the saved XML (just verified). Your call to
>>> GncDateTime::format_iso8601 needs to be used in gnc_print_time64, too,
>>> like so:
>>> 
>>> xmlNodePtr
>>> time64_to_dom_tree (const char* tag, const time64 time)
>>> {
>>> 
>>>    xmlNodePtr ret;
>>>    g_return_val_if_fail (time != INT64_MAX, NULL);
>>> 
>>> -    auto date_str = gnc_print_time64 (time, "%Y-%m-%d %H:%M:%S %q");
>>> -    if (!date_str)
>>> -        return NULL;
>>> +    GncDateTime gncdt(time);
>>> +    auto sstr = gncdt.format_iso8601();
>>> +    gchar* date_str = g_strdup(sstr.c_str());
>>> 
>>> 
>>> However, with this patch, all time64 timestamps in the XML file will then
>>> change from time+timezone to UTC time only. To my surprise, although we
>>> knew about this issue for such a long time and changed most timestamps
>>> appropriately, there are still timestamps that will change date due to
>>> this
>>> change. With a quick glance, the "reconciled-date" timestamp contains
>>> beginning-of-day in a local timezone. Also maybe user-entered commodity
>>> prices. Others might still show up, oh well.
>>> 
>>> I haven't checked for backward compatibility with 2.6, but will do during
>>> the next days.
>> 
>> Rats, I got the two functions scrambled.
>> 
>> Geert and I discussed a month or two ago expanding the use of time-neutral
>> to everywhere that we currently use day-begin and day-end except searches,
>> but neither of us has gotten to it yet.
> 
> For completeness: I checked loading such a file with gnucash-2.6, and as far 
> as I can tell, this runs just fine. All transactions and their dates show up 
> as usual even though the timestamps are now read as UTC instead of localtime.

The timestamps have always been UTC, they've just been written oddly. It's explained in comments somewhere, I'll summarize: We've historically written timestamps as an ISO8607-like string, i.e. YYYY-MM-DD HH:MM:SS with an offset representing a timezone, ±HHMM. For example my TZ offset is -0800, so the date-time right now is 2018-12-30 14:26:24 -0800. The parser adds back the 8 hours and stores whatever 2018-12-30 22:26:24 is in seconds since the epoch. Losing the offset doesn't change anything except to make less work for
the parser and reducing the diff of two files written with different offsets.

It's good news that the 2.6 parser reads them correctly.

Since SQL doesn't understand offsets we've always used plain UTC date times in the SQL backend.

Regards,
John Ralls



More information about the gnucash-devel mailing list