Thursday 21 January 2010

Strange class leakage

One of the ways in which code reloading can be implemented in Python, with respect to classes defined within managed scripts, is by updating the existing class in place.

The process resembles the following:

  1. The code reloading framework is started up, loading a main copy of every script file registered with it.
  2. The user modifies a script file.
  3. The code reloading framework detects and reacts to the file modification.
  4. The new version of the script file is executed into a dictionary.
  5. For every class in the dictionary that already existed:
    1. The original version of the class is retrieved from the main copy.
    2. Each valid dictionary entry in the new class is moved into the original class, overwriting whatever it had. If the entry is a function, it is rebuilt so that its internal func_globals dictionary points to that of the main copy, rather than the new copy.
  6. The new version of the script file is discarded and garbage collected, now that the main version has been updated to resemble it.

This is a lot less legwork than the alternative approach, which is to walk the referrers reported by the garbage collector and replace every reference to the original class.
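
The in-place update steps can be sketched roughly as follows. This is a minimal illustration, not the framework's actual code: the name update_class_in_place is hypothetical, the new source is passed in as a string rather than detected from a file change, only functions and non-dunder entries are treated as "valid", and it uses Python 3 spellings (exec and __globals__, where Python 2 has execfile and func_globals).

```python
import types


def update_class_in_place(main_globals, new_source):
    """Apply a new version of a script to the classes held in main_globals."""
    # Step 4: execute the new version of the script into its own dictionary.
    new_globals = {"__name__": main_globals.get("__name__", "__main__")}
    exec(compile(new_source, "<script>", "exec"), new_globals)

    # Step 5: update every class that already exists in the main copy.
    for name, new_class in new_globals.items():
        old_class = main_globals.get(name)
        if not (isinstance(new_class, type) and isinstance(old_class, type)):
            continue
        for attr, value in list(vars(new_class).items()):
            if isinstance(value, types.FunctionType):
                # Rebuild the function so that its globals are those of the
                # main copy rather than those of the new copy.
                rebuilt = types.FunctionType(
                    value.__code__, main_globals, value.__name__,
                    value.__defaults__, value.__closure__)
                rebuilt.__kwdefaults__ = value.__kwdefaults__
                rebuilt.__dict__.update(value.__dict__)
                value = rebuilt
            elif attr.startswith("__") and attr.endswith("__"):
                # Assumption: non-function dunder entries are not "valid".
                continue
            setattr(old_class, attr, value)
    # Step 6: new_globals goes out of scope here and can be collected.


# Usage: existing instances pick up the new behaviour.
main = {"__name__": "demo"}
exec("class A:\n    def f(self):\n        return 1\n", main)
original_class = main["A"]
instance = original_class()
update_class_in_place(
    main,
    "class A:\n"
    "    def f(self):\n        return 2\n"
    "    def g(self):\n        return 'new'\n")
```

Because the original class object is kept, nothing that already refers to it needs to be touched.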

Leakage

I use the code reloading framework in a MUD framework. One of the tasks I am working on at the moment is locating the commands that a logged-in player can execute.

The original approach was to simply iterate over a given namespace and check if one of the contents was an instance of the Command class. However, as there are now multiple namespace locations where commands may be located, I decided to try an alternative approach. Instead, I would use the garbage collector to find all the subclasses of the Command class, and obtain the commands that way.
for class_ in pysupport.FindSubclasses(Command, inclusive=True):
    for verb in class_.__verbs__:
        if verb in self.verbs:
            logger.warning("Duplicate '%s': %s", verb, class_)
            continue
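
For illustration, a subclass search over the garbage collector's objects might look something like the sketch below. The function find_subclasses is a hypothetical stand-in and not the real pysupport.FindSubclasses, whose implementation is not shown here. It is written for Python 3, where every class is an instance of type; supporting Python 2 old-style classes would need an additional check.

```python
import gc


def find_subclasses(base, inclusive=False):
    """Return the classes known to the garbage collector that derive from base."""
    found = []
    for obj in gc.get_objects():
        # Only class objects are of interest; skip everything else.
        if isinstance(obj, type) and issubclass(obj, base):
            if inclusive or obj is not base:
                found.append(obj)
    return found


# Usage with some hypothetical command classes.
class Command:
    __verbs__ = []


class Look(Command):
    __verbs__ = ["look"]


class Rehash(Look):
    __verbs__ = ["rehash"]


found = find_subclasses(Command, inclusive=True)
```

A search like this finds every live class object, which is exactly why a stale class from an earlier reload shows up as a duplicate.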
Each time I modified the rehash command, a ghost class appeared in the output of this logging action. So for the umpteenth time, I wrote some code to dig through the referrers in the garbage collector, in order to work out what was holding the references.

Tracking references

By passing the ghost classes to PrintReferrers, I see the following:
5 REFERRERS FOR <type 'classobj'> __builtin__.Rehash 0x20a84b0
SKIPPED/is-local-frame 0x20ebec8 <frame object at 0x020EBEC8>
SKIPPED/is-seen-list 0x20a9e90 <type 'list'>
SKIPPED/is-local-frame 0x20f7ae0 <frame object at 0x020F7AE0>
6 REFERRERS FOR <type 'list'> [ ... ] 0x20ae1e8
5 REFERRERS FOR <type 'listiterator'> 0x1c33110
SKIPPED/is-seen-list 0x20a9e90 <type 'list'>
SKIPPED/is-local-frame 0x1b1fc88 <frame object at 0x01B1FC88>
SKIPPED/is-referrer-list 0x20a9c10 <type 'list'>
SKIPPED/is-local-frame 0x1b1fe30 <frame object at 0x01B1FE30>
SKIPPED/is-local-frame 0x20f7ae0 <frame object at 0x020F7AE0>
SKIPPED/is-local-frame 0x20ebec8 <frame object at 0x020EBEC8>
SKIPPED/is-seen-list 0x20a9e90 <type 'list'>
SKIPPED/is-referrer-list 0x20a96c0 <type 'list'>
SKIPPED/is-local-frame 0x1b1fc88 <frame object at 0x01B1FC88>
SKIPPED/is-local-frame 0x20f7ae0 <frame object at 0x020F7AE0>
5 REFERRERS FOR <type 'dict'> { ... } 0x20aba50
SKIPPED/is-local-frame 0x20ebec8 <frame object at 0x020EBEC8>
SKIPPED/is-seen-list 0x20a9e90 <type 'list'>
SKIPPED/is-referrer-list 0x20a96c0 <type 'list'>
SKIPPED/is-local-frame 0x20cc188 <frame object at 0x020CC188>
5 REFERRERS FOR <type 'function'> 0x20a5eb0
SKIPPED/is-seen-list 0x20a9e90 <type 'list'>
SKIPPED/is-local-frame 0x20cc188 <frame object at 0x020CC188>
SKIPPED/is-referrer-list 0x20a9490 <type 'list'>
SKIPPED/is-local-frame 0x20cc330 <frame object at 0x020CC330>
5 REFERRERS FOR <type 'dict'> { ... } 0x20ab9c0
SKIPPED/is-seen-list 0x20a9e90 <type 'list'>
SKIPPED/is-local-frame 0x20cc330 <frame object at 0x020CC330>
SKIPPED/is-referrer-list 0x20a9ad0 <type 'list'>
SKIPPED/is-local-frame 0x20cf078 <frame object at 0x020CF078>
SKIPPED/seen 0x20a84b0 __builtin__.Rehash
I've trimmed some redundant information out by hand. What it shows is that the class has a function, which has a dictionary, which has a reference back to the class. Judging by the contents of the trimmed containers, the normal func_globals reference is creating a circular reference and keeping the class alive.
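
The cycle described here can be reproduced in isolation. The following is a small sketch in Python 3, where func_globals is spelled __globals__ and exec takes the place of execfile. The script source and variable names are made up for the example.

```python
import gc

source = "class AClass:\n    def AFunction(self):\n        pass\n"

namespace = {}
exec(source, namespace)

cls = namespace["AClass"]
func = cls.__dict__["AFunction"]

# class -> function -> globals dictionary -> class
function_points_at_namespace = func.__globals__ is namespace
namespace_points_at_class = namespace["AClass"] is cls

# The garbage collector reports the same dictionary as a referrer of the class.
namespace_is_referrer = any(r is namespace for r in gc.get_referrers(cls))
```

Reference counting alone can never free the objects in such a loop; only the cyclic garbage collector can.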

All classes leak?

The first thing I did was rule out that this was simply the way that Python worked.

I created a script containing:
class AClass:
    def AFunction(self):
        pass
And then repeatedly reloaded the class in the interpreter:
>>> d = {}
>>> execfile("test.py", d, d)
>>> d = {}
>>> execfile("test.py", d, d)
>>> d = {}
>>> execfile("test.py", d, d)
>>> d = {}
>>> execfile("test.py", d, d)
>>> import gc, types
>>> for v in gc.get_objects():
...     if type(v) is types.ClassType and v.__name__ == "AClass":
...         print v
...
__builtin__.AClass
__builtin__.AClass
__builtin__.AClass
__builtin__.AClass
>>> gc.collect()
12
>>> for v in gc.get_objects():
...     if type(v) is types.ClassType and v.__name__ == "AClass":
...         print v
...
__builtin__.AClass
So outside of my framework, Python does the correct thing.
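
For anyone repeating this check on Python 3, where execfile and types.ClassType no longer exist, a rough equivalent is sketched below. It is an adaptation, not the session above: it uses weak references instead of scanning gc.get_objects(), and the helper name load_version is made up. Unlike the session, no dictionary is kept alive at the end, so none of the classes should survive the collection.

```python
import gc
import weakref

source = "class AClass:\n    def AFunction(self):\n        pass\n"


def load_version():
    """Execute the script into a fresh dictionary and watch the class it defines."""
    namespace = {}
    exec(source, namespace)
    # A weak reference lets us observe the class without keeping it alive.
    return weakref.ref(namespace["AClass"])


refs = [load_version() for _ in range(4)]

# Each class sits in a reference cycle, so it needs the cyclic collector.
gc.collect()
survivors = [ref for ref in refs if ref() is not None]
```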

Conclusion

Some possible reasons for what is going on:
  • My reference printing function is skipping something it shouldn't (not seeing it).
  • How I am calling execfile is creating this circular reference in a way that is somehow different (not seeing it).
For now, I will enter a defect and have the new script file clear its dictionary when it is garbage collected, which fixes the problem.
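
One possible shape for that workaround is sketched below. The ScriptFile class is hypothetical and much simpler than anything in the framework, and the code is Python 3. Note that in Python 3 a class object also contains reference cycles of its own, so the collector is still needed to free it. The point of the sketch is only that the discarded dictionary no longer refers to the class.

```python
import gc
import weakref

SOURCE = "class AClass:\n    def AFunction(self):\n        pass\n"


class ScriptFile:
    """Hypothetical holder for one executed version of a script."""

    def __init__(self, source):
        self.namespace = {}
        exec(source, self.namespace)

    def __del__(self):
        # Emptying the dictionary drops its reference to every class the
        # script defined, breaking the cycle through the function globals.
        self.namespace.clear()


script = ScriptFile(SOURCE)
func = script.namespace["AClass"].__dict__["AFunction"]
class_ref = weakref.ref(script.namespace["AClass"])

del script  # last reference gone, so CPython runs __del__ here
globals_emptied = len(func.__globals__) == 0

gc.collect()
class_collected = class_ref() is None
```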

Writing this reference tracking code for the umpteenth time makes me wonder if something similar, but both tested and proven, is out there.

7 comments:

  1. Your finding is correct. Normal functions in modules are part of a circular reference involving the module dictionary. This was explained to me here: http://mail.python.org/pipermail/python-dev/2009-December/094446.html
    I think that 'reload' clears the old module dictionary, just as you are doing.

  2. Nice to see someone else experimenting with reliable code reload. I have my own version of Guido's original work at http://svn.plone.org/svn/plone/plone.reload/trunk/plone/reload/xreload.py

    This stuff is used by quite a number of people in real life, so I'm a bit confident that it actually works. I ran into a number of problems while doing this, like people adding an import statement at the top and having to update the func_globals of each function with the new name.

    Removal of attributes, functions and the like gives problems too. I've often had situations where people in some way patched additional things into modules. So the naive assumption that the runtime version of a module can be reconstructed from the code in its file often failed. I adopted a policy of generally not removing things. While that can lead to "zombie code" lying around, it produced better results in real life.

    Other fun things are type changes, like switching a function to a property of the same name or adding decorators to things.

    I never got around to dealing with modules in any good way. Good to see you tackled this.

  3. Ah but Kristjan, I am not referring to a normal function. In this case, it is the function on a class, that seems to be causing the problem (there are no global functions). And outside of my framework, this case is garbage collected fine.

  4. Hanno, my framework bypasses the Python module and package system, loading all the scripts it manages. Then it detects modification of those scripts and puts changes in place.

    I track leaked contributions from earlier versions of scripts, and also leave them in place as you describe. Without introspection of added and removed code, leaving these in place is the best one can do.

    Yes, I have encountered type changes in the past too. I do not deal with them. They sit in that area over to the side with the other things the user just has to be aware of. And as I do not use decorators, I have not implemented any support for them.

  5. Have you tried running gc.collect() before you loop through GC objects? Python doesn't collect cyclic garbage instantly, but if the objects participating in a cycle don't have __del__ methods, they will get collected by gc.collect(), which is automatically run every few hundred bytecodes, IIRC.

    Also, if Command is a new-style class, you don't need to trawl the depths of GC; instead recursively traverse Command.__subclasses__() to get all the subclasses.

  6. Marius, yes I ran gc.collect(). I believe that's the first step in my leaked reference finder.

    Nice tip on __subclasses__. I cannot guarantee that classes will be newstyle or oldstyle, so while I can use this as an optimisation in the former case, I need to also support the latter case.

  7. Richard, you'll find that any function, be it in a class or directly in the module dict, has a func_globals member which refers to the module dict. So, all functions in python are normally part of a reference cycle.

    The reason reload(module) doesn't leave those cycles behind is that the module code is executed in the context of the module dict of the old module, thus breaking all the cycles. See PyImport_ExecCodeModuleEx().
