Exponential growth of the slots table

John Ralls jralls at ceridwen.us
Sat Dec 4 21:29:36 EST 2010


On Dec 4, 2010, at 9:22 AM, John Ralls wrote:

> 
> On Dec 4, 2010, at 9:06 AM, Phil Longstaff wrote:
> 
>> On Sat, 2010-12-04 at 07:41 -0800, John Ralls wrote:
>>> On Dec 3, 2010, at 10:42 PM, Phil Longstaff wrote:
>>> 
>>>> The slots table contains extra information for the other objects.  This extra 
>>>> information allows extra values to be tied onto objects.  It will be loaded and 
>>>> saved with the objects.  When loaded, the slots are like a directory structure.  
>>>> At each level, there are named keys with values of different types (int, string, 
>>>> date, ...).  There can also be sub-levels.
>>>> 
>>>> The main key is the guid field which contains the guid of the object that this 
>>>> slot belongs to.  This object can be of any other type (account, split, 
>>>> transaction, book, ...).  The other fields in the slot table are the full 
>>>> directory path of the key, the value type, and the value.
>>>> 
>>>> I believe that guid/path should be unique.
>>> 
>>> That's only part of the story. The paths aren't really directory paths; they're more like XML XPaths, except that the components are the string values of the slots:key elements rather than element names. That is the source of Bug 635859 [1], because the delimiter character '/' is allowed in string data (though not in element names). Using full paths also didn't work for hierarchies of slots, which are used in several places (Bug 627831 [2]): child elements were silently not created during load. r19729 added recursion to the slot storage so that each level is stored as a row, and slots with children (either of type frame or type list) get a guid_val created on the fly, which is used as the obj_guid to link children to their parents. That guid_val is discarded on load to provide for exact round-tripping to XML.
>>> 
>>> But that isn't the problem: The problem is that save_slots doesn't query to see if a slot record exists before inserting it -- and the slot table's primary key is an autoincrement, so there isn't a key conflict, either.
>> 
>> No, it doesn't query if it exists, but it is supposed to delete all
>> slots for the object before it starts to save.
> 
> OK, I was thinking along those lines as a fix. Perhaps all that needs to be done is to make that recursive as well.
> 

Done, I think. I've committed r19908 which recurses down the FRAMES and GLISTS to clear out child slots. 
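
As a rough illustration of the idea (this is not the actual r19908 code, which lives in the C SQL backend), here is a minimal Python/sqlite3 sketch of a recursive slot delete. The table layout is a simplified assumption with only the columns discussed in this thread, the slot_type strings 'frame' and 'list' are placeholders, and the names make_db and delete_slots are hypothetical.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a simplified, assumed slots table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE slots ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"  # autoincrement key, so no key conflicts
        " obj_guid TEXT NOT NULL,"                # owner: an object or a parent slot
        " name TEXT NOT NULL,"
        " slot_type TEXT NOT NULL,"               # placeholder type names, not the real codes
        " guid_val TEXT)"                         # set for frame/list slots, links to children
    )
    return conn


def delete_slots(conn, obj_guid):
    """Delete all slots owned by obj_guid, descending into frames and lists first."""
    children = conn.execute(
        "SELECT guid_val FROM slots"
        " WHERE obj_guid = ? AND slot_type IN ('frame', 'list')"
        " AND guid_val IS NOT NULL",
        (obj_guid,),
    ).fetchall()
    for (child_guid,) in children:
        # Child rows use the parent's guid_val as their obj_guid.
        delete_slots(conn, child_guid)
    conn.execute("DELETE FROM slots WHERE obj_guid = ?", (obj_guid,))
```

Calling delete_slots(conn, guid) before re-inserting an object's slots removes the child rows as well as the top-level ones, so a save no longer leaves orphaned copies behind.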

Elwood, if you're able, build it (or, if you're a Windows user, you can get tomorrow's nightly at http://code.gnucash.org/builds/win32/trunk/). Running your budget function should clear out the duplicate slots. It will probably take a while, so be patient.

Do note that many of the slot names are repeated across bank/brokerage accounts, so the same name will show up under more than one obj_guid. Within a single object, though, each name should appear only once, so this query

 select obj_guid, name, count(*) from slots group by obj_guid, name  having count(*) > 1;

should produce no results.
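
If you'd rather script the check, here is a small sketch that runs the same query through Python's sqlite3 module. It assumes a SQLite data file and uses a cut-down, assumed slots table for the demonstration; the helper names make_slots_table and find_duplicate_slots are hypothetical.

```python
import sqlite3

DUPLICATE_QUERY = (
    "SELECT obj_guid, name, COUNT(*) FROM slots"
    " GROUP BY obj_guid, name HAVING COUNT(*) > 1"
)


def make_slots_table(conn):
    """Create a cut-down slots table (assumed layout, for demonstration only)."""
    conn.execute(
        "CREATE TABLE slots ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " obj_guid TEXT NOT NULL,"
        " name TEXT NOT NULL)"
    )


def find_duplicate_slots(conn):
    """Return (obj_guid, name, count) for every slot stored more than once."""
    return conn.execute(DUPLICATE_QUERY).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    make_slots_table(conn)
    conn.executemany(
        "INSERT INTO slots (obj_guid, name) VALUES (?, ?)",
        [
            ("acct-1", "notes"),
            ("acct-1", "notes"),  # duplicate within the same object
            ("acct-2", "notes"),  # same name on another object is fine
        ],
    )
    for row in find_duplicate_slots(conn):
        print(row)
```

An empty result means no object has the same slot name stored twice; the same name under different obj_guid values is not reported.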

Regards,
John Ralls 


