KVP and data that contains a forward slash

John Ralls jralls at ceridwen.us
Wed Feb 10 20:29:06 EST 2016


> On Feb 10, 2016, at 7:41 AM, Derek Atkins <warlord at mit.edu> wrote:
> 
> John Ralls <jralls at ceridwen.us> writes:
> 
>>>> Thanks for catching that. We absolutely don't want to create deep
>>>> keys, performance will suffer in SQL. Please file a bug so that I
>>>> don't forget.
>>> 
>>> Don't we still have a single function that parses a full key-path from a
>>> single string?  How would it know how to part something like:
>>> 
>>> /foo/bar/Assets/Bank  into  [foo] [bar] [Assets/Bank] ??
>> 
>> Yes, that's the problem. It's not used for matcher tokens in maint but
>> is in master, and it shouldn't be.
> 
> Ah, well that would be the problem :)
> 
>>      I can't off-hand think of a case
>> where using a file path for a KVP key would make sense, but if there
>> is then it shouldn't have that set of functions used on it either.
> 
> Originally the KVP path was supposed to be a file-system.  So if there
> is a desire to move away from that, then we should be explicit about it.

I hope that's not literally true in the sense of writing it out to disk; that would be incredibly stupid. Surely you mean that it was meant to have paths *like* a file system, which it does. It's still an inefficient design: there's no locality, so every step is a cache miss (or more likely several, because of the hash tables).
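
To make the path-parsing problem quoted above concrete, here is a minimal sketch in Python (for brevity; it is not GnuCash's actual parser, and `split_path` is a made-up helper). It assumes a parser that simply splits the full key-path on "/", which cannot recover a key that itself contains a slash:

```python
# Hypothetical helper, not GnuCash code: split a full key-path on "/".
def split_path(path):
    return [part for part in path.split("/") if part]

# What the caller meant: three keys, the last one an account name with a slash.
intended = ["foo", "bar", "Assets/Bank"]

# Joining into a single string loses the distinction between the
# separator and the slash inside the key.
joined = "/" + "/".join(intended)
parsed = split_path(joined)

# The parser sees four levels, so the data lands one frame deeper than intended.
depth_intended = len(intended)
depth_parsed = len(parsed)
```

Passing the components as a list, rather than as one string to be re-parsed, avoids the ambiguity entirely.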

>> Most of our existing use of nested KVP isn't necessary anyway, but
>> changing it will require a new file/db version and conversion
>> routines. No point in introducing new nesting though, and besides that
>> would also create a file incompatibility between 2.6 and 2.8.
> 
> I would disagree; it's nice to have some nesting (certainly within the
> XML framework) to make it easy to remove whole KVP subtrees.  When you
> view KVP as a file system then it makes total sense to have nesting, and
> all the benefits that come with it.

That would carry more weight if we actually used the XML DOM tree inside of GnuCash, but we don't. Besides, that's not how we use KVP. With a couple of exceptions (import matching and book properties) we use it to add members to classes without changing the XML or SQL schema. Those paths are two or three deep and for the most part have only one or two elements at the bottom. Getting rid of KvpFrames and converting the "path" to a string name, so that it takes a single hash lookup instead of two or three, will be more performant with no effect on our actual usage. The payoff is even higher when the KVP data isn't all in memory: having to do three SQL queries to retrieve an int64_t or a string is ridiculous. For most of those uses having to do a separate query at all is ridiculous; the members should be part of the object record and retrieved when the object is instantiated.
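
A rough sketch of the difference, again in Python rather than GnuCash's own code, with made-up key names and plain dicts standing in for KvpFrames. It only counts lookups; it is an illustration under those assumptions, not a benchmark:

```python
# Nested frames: every path component costs its own hash lookup.
nested = {"foo": {"bar": {"baz": 42}}}

def get_nested(frame, components):
    """Walk one frame per component and count the lookups performed."""
    lookups = 0
    value = frame
    for key in components:
        value = value[key]
        lookups += 1
    return value, lookups

# Flat slots: the whole "path" is just a string name, so one lookup suffices.
flat = {"foo/bar/baz": 42}

def get_flat(slots, name):
    """Fetch the value with a single lookup on the full name."""
    return slots[name], 1

nested_value, nested_lookups = get_nested(nested, ["foo", "bar", "baz"])
flat_value, flat_lookups = get_flat(flat, "foo/bar/baz")
```

With a flat name the slash is just another character in the key, so nothing needs to parse it. If each frame lives in its own database row, the lookup count above is also the query count.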

Regards,
John Ralls



