Friday, 23 December 2011

Python garbage collection crash

I ran the unit tests for the Windows debug build of Stackless Python 2.7.2. The tests pass, but the interpreter crashes on exit.  The problem is that the garbage collector finds a one-element list whose element has either already been garbage collected or was never initialised.  Logically, the following occurs.

How do you track down the cause of this crash?


  1. Ah, is it in the visit_decref call?
    This can be caused by a reference counting bug, where a reference was lost, so the list is pointing to an object that has already been destroyed.
    We have been seeing similar things at CCP recently, so perhaps it is an older issue?
    If it is easily reproducible, it would make sense to focus on a particular piece of code that triggers it. Then insert instrumentation code into listobject.c (assuming that we keep seeing this for lists of length 1) and perhaps store a repr of the single member of every length-1 list alongside the list object, as additional data to help us. Then, during the crash, you can look at some representation of this element in the debugger, even though the object itself is gone.

  2. Hmm, I thought Dave Malcolm had some patches on the tracker that added additional tracing information to the cyclic GC to help pick up when dodgy objects were added (rather than when attempting to clean them up failed), but I can't find them now :(
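
The instrumentation suggested in the first comment lives in listobject.c. As a rough first pass that needs no rebuild, the same idea can be approximated from Python with the standard gc module. This is only a sketch, not the C-level patch described above: the function name snapshot_single_element_lists is made up for illustration, and it only sees lists that are alive and intact at the moment it is called.

```python
import gc


def snapshot_single_element_lists():
    """Map id(list) -> repr(element) for every GC-tracked list of length 1.

    Hypothetical helper: it records a textual description of each lone
    element so there is something to compare against if the list later
    turns out to point at a dead object.
    """
    snapshot = {}
    for obj in gc.get_objects():
        # Exact type check: skip list subclasses to keep the output small.
        if type(obj) is list and len(obj) == 1:
            try:
                snapshot[id(obj)] = repr(obj[0])
            except Exception as exc:
                # repr() can run arbitrary code and may itself fail.
                snapshot[id(obj)] = "<repr failed: %r>" % (exc,)
    return snapshot
```

Calling this after each test and writing the result to a log file would leave a record that survives the crash. The id of a list is its address in CPython, so it can be matched against the pointer the debugger shows in the collector. Note that calling repr on arbitrary objects can have side effects, which may disturb the very bug being hunted.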