GLib RFC: Improve checking provided with --enable-mem-check

Wed, 06 Jun 2001 16:03:10 +1000

Hi,

I recently tried to use a glib compiled with --enable-mem-check to debug 
gnucash, but instead ran into a 'block freed x times' message (from 
gtkcalendar.c). The number x was incredibly large, suggesting that either

   1. The pointer being freed had never been allocated or
   2. The memory had been overwritten by something else after previously
      being freed.

The current implementation cannot distinguish between these two 
possibilities, and is incapable of detecting a few other kinds of problem.

I am proposing to make some additions to g_malloc, g_free and related 
functions (g_malloc0, g_realloc) to make the debugging memory manager 
more robust to misuse. This will improve it's utilitiy in bug-hunting.

I have previously written a debugging memory manager for C++, which 
works by overloading operator new and operator delete. It acquires its 
memory blocks from the malloc/free interface, much the same way as 
g_malloc/g_free do today.

My implementation adds 2 pointers (8 bytes on a 32-bit architecture) to 
each block, which are used to form a doubly linked list. When a block is 
allocated, it is entered into a hash table (collisions are chained on 
doubly linked lists). When a pointer is given to g_free, I propose to 
verify it by looking it up in the hash table. It is thus possible to 
distinguish frees of invalid pointers from multiple frees.
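
A minimal sketch of this lookup, assuming freed blocks are kept on a 
separate list rather than returned to malloc. All names (dbg_malloc, 
dbg_free, DbgHeader, DBG_BUCKETS) are hypothetical and not part of GLib; 
the hash function and table size are arbitrary choices.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define DBG_BUCKETS 257u   /* hypothetical table size */

/* The two extra pointers prepended to every block. */
typedef struct DbgHeader {
    struct DbgHeader *prev, *next;
} DbgHeader;

typedef enum { DBG_FREE_OK, DBG_FREE_DOUBLE, DBG_FREE_INVALID } DbgFreeResult;

static DbgHeader *dbg_live[DBG_BUCKETS];  /* hash table of allocated blocks */
static DbgHeader *dbg_freed;              /* freed blocks that are retained */

static DbgHeader **dbg_bucket(uintptr_t key) {
    return &dbg_live[(key >> 4) % DBG_BUCKETS];
}

static void dbg_push(DbgHeader **head, DbgHeader *h) {
    h->prev = NULL;
    h->next = *head;
    if (*head) (*head)->prev = h;
    *head = h;
}

static void dbg_unlink(DbgHeader **head, DbgHeader *h) {
    assert(*head != NULL);
    if (h->prev) h->prev->next = h->next; else *head = h->next;
    if (h->next) h->next->prev = h->prev;
}

/* Compare addresses only, so an arbitrary pointer is never dereferenced. */
static DbgHeader *dbg_find(DbgHeader *head, uintptr_t key) {
    while (head && (uintptr_t)head != key) head = head->next;
    return head;
}

void *dbg_malloc(size_t size) {
    DbgHeader *h = malloc(sizeof *h + size);
    if (!h) return NULL;
    dbg_push(dbg_bucket((uintptr_t)h), h);
    return h + 1;                         /* user data follows the header */
}

DbgFreeResult dbg_free(void *user) {
    if (!user) return DBG_FREE_OK;        /* freeing NULL is a no-op */
    uintptr_t key = (uintptr_t)user - sizeof(DbgHeader);
    DbgHeader **bucket = dbg_bucket(key);
    DbgHeader *h = dbg_find(*bucket, key);
    if (h) {                              /* live block: move to freed list */
        dbg_unlink(bucket, h);
        dbg_push(&dbg_freed, h);
        return DBG_FREE_OK;
    }
    /* Not live: either freed before, or never allocated by us. */
    return dbg_find(dbg_freed, key) ? DBG_FREE_DOUBLE : DBG_FREE_INVALID;
}
```

In this sketch a second dbg_free of the same pointer reports 
DBG_FREE_DOUBLE, while a pointer to a stack variable reports 
DBG_FREE_INVALID. The freed list is searched linearly here; a second 
hash table would make that lookup cheaper.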

I also propose to fill freed blocks with a non-0 number, such as 
0xDEADBEEF, because 0-filling will hide errors. Filling with some other 
number will lead to diagnosis by invalid pointer errors (bus errors, 
frees of invalid pointers) if the block is re-used. 0-filling will lead 
to a segmentation fault if the pointer is dereferenced, but will not 
generate an error if it is passed to g_free.
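
The fill itself could look roughly like the following. This is only a 
sketch: dbg_poison and dbg_is_poisoned are made-up names, and it assumes 
the block size is recorded somewhere (for example in the block header) so 
that it is known at free time.

```c
#include <assert.h>
#include <stddef.h>

/* 0xDEADBEEF, written byte by byte in this order. */
static const unsigned char dbg_fill[4] = { 0xDE, 0xAD, 0xBE, 0xEF };

/* Overwrite a freed block with the repeating fill pattern. */
void dbg_poison(void *block, size_t size) {
    unsigned char *p = block;
    assert(block != NULL || size == 0);
    for (size_t i = 0; i < size; i++)
        p[i] = dbg_fill[i % 4];
}

/* Return 1 if the block still holds the pattern, 0 if it was written to. */
int dbg_is_poisoned(const void *block, size_t size) {
    const unsigned char *p = block;
    for (size_t i = 0; i < size; i++)
        if (p[i] != dbg_fill[i % 4])
            return 0;
    return 1;
}
```

Because the pattern is stored bytewise, a 32-bit load reads 0xDEADBEEF on 
a big-endian machine and 0xEFBEADDE on a little-endian one. Either value 
is non-zero and very unlikely to be a valid pointer. The second function 
also allows a later check that a freed block has not been written to.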

It is also possible to detect block write overruns by adding a magic 
number at the beginning and end of the block, at the cost of 8 
more bytes per allocation. This method does not detect reads over the 
end of an allocated block. The check would be carried out when the block 
is freed.
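
A sketch of such guard words, under some assumptions that are not part 
of the proposal text: the names (guard_malloc, guard_check, guard_free) 
and the magic value are invented, and the requested size is stored in 
front of the block so that the trailing guard can be found again. The 
header is padded to 16 bytes to preserve malloc's alignment on common 
platforms, so this sketch uses more than the 8 bytes mentioned above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_MAGIC  0xFDFDFDFDu  /* hypothetical guard value */
#define GUARD_HEADER 16u          /* size field + padding + front guard */

enum { GUARD_OK = 0, GUARD_UNDERRUN = 1, GUARD_OVERRUN = 2 };

/* Layout: [size][pad][front guard][user data ...][back guard] */
void *guard_malloc(size_t size) {
    const uint32_t magic = GUARD_MAGIC;
    unsigned char *raw = malloc(GUARD_HEADER + size + sizeof magic);
    if (!raw) return NULL;
    memcpy(raw, &size, sizeof size);
    memcpy(raw + GUARD_HEADER - sizeof magic, &magic, sizeof magic);
    memcpy(raw + GUARD_HEADER + size, &magic, sizeof magic);
    return raw + GUARD_HEADER;
}

/* Returns a bitmask of GUARD_UNDERRUN and GUARD_OVERRUN, or GUARD_OK. */
int guard_check(const void *user) {
    const unsigned char *raw;
    size_t size;
    uint32_t front, back;
    int result = GUARD_OK;

    assert(user != NULL);
    raw = (const unsigned char *)user - GUARD_HEADER;
    memcpy(&size, raw, sizeof size);
    memcpy(&front, raw + GUARD_HEADER - sizeof front, sizeof front);
    memcpy(&back, raw + GUARD_HEADER + size, sizeof back);
    if (front != GUARD_MAGIC) result |= GUARD_UNDERRUN;
    if (back != GUARD_MAGIC) result |= GUARD_OVERRUN;
    return result;
}

/* Check the guards at free time, as proposed, then release the block. */
int guard_free(void *user) {
    int result = guard_check(user);
    free((unsigned char *)user - GUARD_HEADER);
    return result;
}
```

memcpy is used for the guard words because the trailing one is not 
necessarily aligned. An overrun large enough to also clobber the stored 
size would make the back guard check read from the wrong place, so a 
real implementation would want to treat a damaged front guard as fatal.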

The current implementation never releases allocated blocks, so as to 
detect if they are ever freed again, and also to help detect usage of 
the freed block. It is possible to retain this behaviour by removing the 
newly freed block from the hash table of allocated blocks and entering 
it into a doubly linked list of freed blocks. If malloc runs out of 
memory, it is possible to de-allocate some of the blocks on the 
freed-blocks list until the request is satisfied or there are none left. 
I consider that it's more important to continue running the program than 
to hold on to all the blocks ever freed in the hope of detecting an error.
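
The retain-then-release policy might be sketched as below. Everything 
here is illustrative: the q_ names are invented, and q_limit is only a 
test knob that simulates memory exhaustion so the fallback path can be 
exercised without actually exhausting the machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct QBlock {
    struct QBlock *prev, *next;  /* links in the freed-blocks list */
    size_t size;
    size_t pad;                  /* keeps the header a multiple of 16 bytes
                                    on common 32- and 64-bit ABIs */
} QBlock;

static QBlock *q_head, *q_tail;  /* freed blocks, oldest at q_head */
static size_t q_in_use;          /* bytes held by live and retained blocks */
size_t q_limit = (size_t)-1;     /* simulated memory limit (test knob) */

static QBlock *q_raw_alloc(size_t size) {
    if (q_in_use > q_limit || size > q_limit - q_in_use)
        return NULL;             /* simulated out-of-memory */
    QBlock *b = malloc(sizeof *b + size);
    if (b) { b->size = size; q_in_use += size; }
    return b;
}

/* Really release the oldest retained block; returns 0 if none is left. */
static int q_release_oldest(void) {
    QBlock *b = q_head;
    if (!b) return 0;
    q_head = b->next;
    if (q_head) q_head->prev = NULL; else q_tail = NULL;
    assert(q_in_use >= b->size);
    q_in_use -= b->size;
    free(b);
    return 1;
}

void *q_malloc(size_t size) {
    QBlock *b;
    /* On failure, give back retained blocks until the request fits. */
    while (!(b = q_raw_alloc(size)))
        if (!q_release_oldest()) return NULL;
    b->prev = b->next = NULL;
    return b + 1;
}

/* Freed blocks are appended to the list instead of being released. */
void q_free(void *user) {
    if (!user) return;
    QBlock *b = (QBlock *)user - 1;
    b->next = NULL;
    b->prev = q_tail;
    if (q_tail) q_tail->next = b; else q_head = b;
    q_tail = b;
}

size_t q_quarantined(void) {
    size_t n = 0;
    for (QBlock *b = q_head; b; b = b->next) n++;
    return n;
}
```

Releasing the oldest block first is a design choice of this sketch, on 
the guess that a recently freed block is the more likely target of a 
stale pointer. Once a block has really been released, a later free of it 
can no longer be told apart from a free of an invalid pointer.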

The addition of this information allows a more meaningful 'check_heap' 
operation, which can actually walk the heap and check that the whole 
doubly linked list structure is consistent and that the magic numbers 
guarding the front and back of each block are intact. It would also 
check that the blocks of memory in the freed block list have not been 
overwritten. I would like to be able to invoke this operation from the 
debugger, but I am not yet aware of how to do so.
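
A much-reduced sketch of such a walk, checking only the back links and a 
single front guard per block. HeapNode, HEAP_MAGIC, dbg_check_heap and 
dbg_assert_heap are hypothetical names, and the sketch assumes the list 
head is reachable from the checking function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HEAP_MAGIC 0xFDFDFDFDu   /* hypothetical guard value */

typedef struct HeapNode {
    struct HeapNode *prev, *next;
    uint32_t front_guard;
} HeapNode;

/* Walk the list and count inconsistencies. The walk stops at the first
 * broken back link, because the rest of the chain cannot be trusted. */
size_t dbg_check_heap(const HeapNode *head) {
    size_t errors = 0;
    const HeapNode *prev = NULL;
    for (const HeapNode *n = head; n; prev = n, n = n->next) {
        if (n->prev != prev)
            return errors + 1;           /* list structure is damaged */
        if (n->front_guard != HEAP_MAGIC)
            errors++;                    /* guard word was overwritten */
    }
    return errors;
}

/* Convenience wrapper: abort as soon as the heap is found corrupt. */
void dbg_assert_heap(const HeapNode *head) {
    assert(dbg_check_heap(head) == 0);
}
```

A function like this can be run from gdb on a stopped process with the 
call command, for example "call dbg_check_heap(list_head)", where 
list_head stands for whatever variable holds the head of the list. The 
sketch does not detect a corrupted next pointer that forms a cycle or 
points outside the heap.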

The proposed modification will consume more resources than the current 
one, but will also diagnose more errors. It will not consume the 
enormous amount of resources that Electric Fence does 
by allocating at least two VM pages per allocation. (This makes Electric 
Fence unsuitable for use with large programs such as gnucash - the 
machine runs out of resources before the program has finished initialising.)

Remember that the overhead is only incurred if you compile with 
--enable-mem-check. This option could be extended to accept 
no/minimal/yes, similarly to the current --enable-debug option. 
"minimal" would retain the current implementation; "yes" would select 
my new implementation.

The techniques that I developed for writing my memory manager were used 
to test student assignments for errors, and were verified to uncover a 
wide variety of memory errors.

I propose to leave the --enable-mem-profile code alone and to make my 
enhancements cooperate with it where they are used together.

Comments are solicited, bearing in mind that this is a debugging memory 
manager proposal, not the normal one, and that I am aiming for 
robustness in the face of programmer error over efficiency. That being 
said, if you know of ways to achieve the same goal using fewer bytes and 
achieving higher performance, please describe them.

I am volunteering to implement this proposal.

I have currently only looked at gmem.c in glib-1.2.9. Is there any more 
recent version that I should know about?

Ben.