[Gnucash-changes] r13152 - gnucash/trunk/lib/libqof/qof - Don't allow the event handler list to shrink while we're traversing it.

Derek Atkins warlord at MIT.EDU
Fri Feb 10 17:06:48 EST 2006


Neil Williams <linux at codehelp.co.uk> writes:

> On Wednesday 08 February 2006 6:04 am
>> Date: 2006-02-08 01:04:51 -0500 (Wed, 08 Feb 2006)
>> New Revision: 13152
>> Trac: http://svn.gnucash.org/trac/changeset/13152
>
> I've been wondering about this ever since it went in, I'm not sure
> if it's the best solution. It seems a little too simplistic.

It's not the best solution.  It /is/ simplistic.  But the problem is
that the code is re-entrant but it's not quite handling that properly.

> Is there a test case I can use? Was the crash described below part of a bug 
> report? How can I reproduce it?

Hopefully Chris will chime in here.  I honestly don't know offhand
what crash he's seeing, but it's certainly a theoretical problem due
to the re-entrancy.

>> Modified:
>>    gnucash/trunk/lib/libqof/qof/gnc-event.c
>> Log:
>>    Don't allow the event handler list to shrink while we're traversing it.
>
> What about making this stage conditional? (It would at least limit the 
> inexorable growth of the list.)

I'm not sure how you make it conditional.  The real fix is to do
delayed destruction of the event handler, but then you also need a
nest calculator to make sure you clean up only at nestlevel 0.  This
requires at least changing the ABI if not the API (I don't recall if
the event-handler list is a public QOF object or not).

>> +    /* Normally, we could actually remove the handler's node from the
>> +       list, but we may be unregistering the event handler as a result
>> +       of a generated event, such as GNC_EVENT_DESTROY.  In that case,
>> +       we're in the middle of walking the GList and it is wrong to
>> +       modify the list. So, instead, we just NULL the handler. */
>
> If GNCEngineEventType != GNC_EVENT_DESTROY, remove the handler node?

Not sufficient.  You could be processing a modify event and during one
of the callbacks it winds up calling a destroy event which uses the
same callback list and then removes a callback handler...  And then
when you pop back you're in the same boat.

>>    This change isn't ideal in the sense that the handler list is now a
>>    monotonically increasing resource,
>
> This worries me - consider if the events are used in a situation where 
> resources are far more limited than in a desktop computer. How big does this 
> list now become if the process is in use continually?

Chris, Josh, David, and I had a dicussion about this on IRC before
Chris made this change (see what you miss by not hanging out on IRC?)
Yes, this is a short-term fix to keep the accessing freed memory.
It's not a long-term fix, which requires a little bit of
re-architecting the QOF Event code.

>>    but it's better than crashing when 
>> the handler in node N+1 happens to be deleted while servicing the handler
>> in node N.
>
> Isn't there some way we can test for N+1 != NULL?
> (we probably should be doing that anyway.)

That's not necessarily sufficient, unfortunately.

> This isn't an API issue, it's a technical issue about how best to
> prevent this crash without causing another elsewhere by consuming
> resources on a much more limited platform.

It /may/ be an API issue depending on whether the eventlist objects
are opaque or not.  I don't recall and I haven't looked right now.
I'll go take a look this evening or this weekend and see if I can come
up with something.

-derek
-- 
       Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
       Member, MIT Student Information Processing Board  (SIPB)
       URL: http://web.mit.edu/warlord/    PP-ASEL-IA     N1NWH
       warlord at MIT.EDU                        PGP key available


More information about the gnucash-devel mailing list