KVP and data that contains a forward slash
John Ralls
jralls at ceridwen.us
Fri Feb 12 13:35:29 EST 2016
> On Feb 12, 2016, at 9:26 AM, Derek Atkins <warlord at MIT.EDU> wrote:
>
> John Ralls <jralls at ceridwen.us> writes:
>
>> Again, the path argument would be more convincing if we actually used
>> XPath. I don't think it really matters what grand plan the designer
>> (you?) had in mind, what matters is how we're using it now.
>
> No, it wasn't me; KVP was in there (long?) before I got involved.
>
>> No, SQL just makes the penalty more severe. Cache misses are
>> expensive; so much so that the current C++ doctrine (based on many
>> simulations) is that it's much faster to sequentially search a
>> std::vector than to use any container that relies on independently
>> allocated nodes, even when the overhead of complete reallocation for
>> an insert operation is accounted for. Every indirection in KVP,
>> i.e. every step involving a KvpFrame means dereferencing the
>> KvpFrame*, dereferencing its GHashTable*, dereferencing the hash
>> table's hashes and getting the next KvpItem*, dereferencing it,
>> getting its content type and ptr, and if it's a KvpFrame, repeating
>> the process. Because each of them is a different type GSlice will have
>> a different magazine for each of them. If not much else is going on on
>> the system all of the magazines might be in cache after the first
>> round and subsequent descents will be faster. Or not.
>>
>> That overhead could be reduced immensely by not having KvpFrames. With
>> the whole path as a key there'd be a single GHashTable lookup and only
>> three dereferences (the QofInstance's Kvp GHashTable, the KvpItem* it
>> returns, and the KvpItem's contents).
>
> Is it a structural problem or an implementation/storage problem?
Why, implementation, of course, by the first theorem of CS. ;-)
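As a rough illustration of the difference described above (a sketch only, not GnuCash code): Frame, FlatKvp, lookup_nested and lookup_flat are hypothetical names, std::unordered_map stands in for GHashTable, and values are plain strings rather than typed KvpItems. The counter only tallies hash-table lookups; it does not measure cache behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <optional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a KvpFrame: leaf values plus nested sub-frames.
struct Frame {
    std::unordered_map<std::string, std::string> values;
    std::unordered_map<std::string, std::unique_ptr<Frame>> children;
};

// Nested layout: one hash lookup (and one pointer chase) per path component.
std::optional<std::string> lookup_nested(const Frame& root,
                                         const std::vector<std::string>& path,
                                         int& lookups)
{
    if (path.empty()) return std::nullopt;
    const Frame* frame = &root;
    for (std::size_t i = 0; i + 1 < path.size(); ++i) {
        ++lookups;
        auto child = frame->children.find(path[i]);
        if (child == frame->children.end()) return std::nullopt;
        assert(child->second != nullptr);  // this sketch never stores null sub-frames
        frame = child->second.get();
    }
    ++lookups;
    auto leaf = frame->values.find(path.back());
    if (leaf == frame->values.end()) return std::nullopt;
    return leaf->second;
}

// Flat layout: the whole path is the key, so there is exactly one lookup.
using FlatKvp = std::unordered_map<std::string, std::string>;

std::optional<std::string> lookup_flat(const FlatKvp& kvp,
                                       const std::string& full_path,
                                       int& lookups)
{
    ++lookups;
    auto it = kvp.find(full_path);
    if (it == kvp.end()) return std::nullopt;
    return it->second;
}
```

With a value stored three levels deep, lookup_nested does three hash lookups where lookup_flat does one, whatever the depth of the path.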
>
>> All of that said, I agree that not implementing KVP in the SQL backend
>> that way was a mistake and I regret it.
>
> Why do YOU regret it? I don't think you implemented it that way.
Actually, I did. It was the first real development I did for GnuCash; Phil Longstaff hadn't quite finished the DBI implementation but had apparently run out of time and it was holding up the 2.4.0 release, so I dove in. Handling KvpFrames was the biggest missing piece. I made it work like the in-memory implementation without thinking through the query implications.
>
>> Moving almost all of the Kvp access to being through GObject
>> properties does make all of the access via the object's API. The next
>> step is to add members to the objects and load them from KVP in the
>> backend. That gets rid of the Kvp performance penalty entirely for the
>> XML backend and makes it one-time for the SQL backend.
>
> Right. That works; you just need to change the way the 'write' works.
Yeah, we've been over this ground a couple of times before, and I'm not worried about that part. The two Kvp uses that aren't object extensions, File Properties and Import Matching, will require a bit more effort. File Properties can just be made into an object that's a member of QofBook and loaded with the book. Easy. Import Matching makes a lot of records and I'd really prefer not to load it into GnuCash's memory at all and just query for results, but that would require a schema change with a separate table.
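For the object-extension case, the load-once idea can be sketched as follows. This is an assumption-laden illustration, not the GnuCash API: Account, FlatKvp, load_from_kvp, save_to_kvp and the "notes" key are all made up for the example.

```cpp
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical flat key/value store standing in for an object's KVP data.
using FlatKvp = std::unordered_map<std::string, std::string>;

// Hypothetical object with a real member that replaces a KVP slot.
class Account {
public:
    // Called once by the backend at load time; accessors never touch KVP afterwards.
    void load_from_kvp(const FlatKvp& kvp) {
        auto it = kvp.find("notes");
        if (it != kvp.end()) notes_ = it->second;
    }

    // Called by the backend on write, so the stored format stays the same.
    void save_to_kvp(FlatKvp& kvp) const {
        if (notes_.empty())
            kvp.erase("notes");
        else
            kvp["notes"] = notes_;
    }

    const std::string& notes() const { return notes_; }
    void set_notes(std::string notes) { notes_ = std::move(notes); }

private:
    std::string notes_;  // plain member: reading it needs no hash lookup
};
```

The only code that still knows about KVP is the load/save pair; everything else goes through the ordinary accessors.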
Regards,
John Ralls