Friday 15 January 2010

Locating the instances of Python classes

There is no straightforward way to obtain the instances of a Python class, but it is functionality that is occasionally useful. As I often have to write code to track down instances, and I have seen broken solutions elsewhere, I'm going to post my code here in case it is of use to others.

Python itself does not track the instances of a class directly, but it does track them indirectly within its garbage collection and reference counting system. So, provided you know what to look for, you can use the gc (garbage collection) module to track down class instances.

Tracking down direct instances of a class is relatively straightforward.

>>> import gc, types
>>> class SomeClass: pass
...
>>> instance = SomeClass()
>>> referrers = [
...     v
...     for v
...     in gc.get_referrers(SomeClass)
...     if type(v) is types.InstanceType and
...     isinstance(v, SomeClass)
... ]
>>> referrers[0] is instance
True

Of course, the class you are interested in can be inherited by other classes, and instances of those inheriting classes do not have direct references to the class of interest. But classes which inherit from other classes have an indirect reference to those other classes through their __bases__ attribute. So if the class of interest is referred to by a tuple, which is in turn referred to by a class, then that class inherits from the class of interest.
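
The chain of references described above can be observed directly. The following is only a small sketch, written for Python 3 (an assumption, since the rest of this post uses Python 2 old-style classes), and the class names Base, Other and Derived are made up for the illustration.

```python
import gc

class Base:
    pass

class Other:
    pass

class Derived(Other, Base):
    pass

# The __bases__ tuple of Derived is one of the objects referring to Base.
tuple_referrers = [r for r in gc.get_referrers(Base) if type(r) is tuple]
bases_found = any(r is Derived.__bases__ for r in tuple_referrers)

# One step further, from the tuple to whatever holds it, reaches Derived.
owners = [o for o in gc.get_referrers(Derived.__bases__)
          if isinstance(o, type)]
derived_found = any(o is Derived for o in owners)
```

Both bases_found and derived_found should be true here, which is the two-step link (class, then tuple, then inheriting class) that the code below relies on.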

Tracking down instances related through inheritance is a little less straightforward.

import gc
import types

def ReferringClass(bases):
    # Return the class that holds the given tuple, provided exactly one
    # old-style or new-style class refers to it.
    matches = [
        r
        for r
        in gc.get_referrers(bases)
        if type(r) is types.ClassType or
        type(r) is types.TypeType
    ]
    if len(matches) == 1:
        return matches[0]

def FindInstances(class_):
    # Map class_, and each class inheriting from it, to its instances.
    instances = {}
    for v in gc.get_referrers(class_):
        if (type(v) is types.InstanceType and
                isinstance(v, class_)):
            if class_ not in instances:
                instances[class_] = []
            instances[class_].append(v)
        elif type(v) is tuple and class_ in v:
            # v may be the __bases__ tuple of an inheriting class.
            rclass_ = ReferringClass(v)
            if rclass_ is not None:
                instances.update(FindInstances(rclass_))
    return instances

A call to FindInstances will return a dictionary of the located instances. The class of interest, and each class that inherits from it at some level, will have an entry mapping it to a list of its instances, provided it has at least one instance.

The following demonstrates FindInstances.

>>> class ClassOfInterest:
...     pass
...
>>> class SomeOtherClass:
...     pass
...
>>> class SingleInheritingClass(ClassOfInterest):
...     pass
...
>>> class MultiInheritingClass(SomeOtherClass,
...                            ClassOfInterest):
...     pass
...
>>> class IndirectClass(MultiInheritingClass):
...     pass
...
>>>
>>> coi = ClassOfInterest()
>>> soc = SomeOtherClass()
>>> mic = MultiInheritingClass()
>>> ic = IndirectClass()
>>>
>>> d = FindInstances(ClassOfInterest)
>>> for class_, instances in d.iteritems():
...     print class_.__name__, len(instances)
...
IndirectClass 1
MultiInheritingClass 1
ClassOfInterest 1
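
The code above depends on types.InstanceType and types.ClassType, which exist only in Python 2. For anyone on Python 3, the following is a rough sketch of the same referrer-walking idea, not a drop-in port. The name find_instances_py3 and the _seen parameter are my own inventions for the illustration. It also has to cope with each class's __mro__ tuple, which refers to its ancestors as well, so it remembers which classes it has already visited.

```python
import gc

def find_instances_py3(class_, _seen=None):
    """Map class_ and its inheriting classes to lists of live instances."""
    seen = set() if _seen is None else _seen
    if class_ in seen:
        return {}
    seen.add(class_)
    instances = {}
    for ref in gc.get_referrers(class_):
        if type(ref) is class_:
            # A direct instance refers to its own class.
            instances.setdefault(class_, []).append(ref)
        elif type(ref) is tuple and any(item is class_ for item in ref):
            # Possibly the __bases__ or __mro__ tuple of a subclass.
            for owner in gc.get_referrers(ref):
                if (isinstance(owner, type) and owner is not class_
                        and issubclass(owner, class_)):
                    instances.update(find_instances_py3(owner, seen))
    return instances
```

As with FindInstances, a class only gets an entry if at least one instance of it was found.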

2 comments:

  1. It is worth pointing out that your code works only with old-style classes (i.e. those that do not inherit from object). If you omit the InstanceType check, it'll work with all kinds of classes.

    For the second task I'd find it simpler to iterate over gc.get_objects() and filter by isinstance(), which already pays attention to inheritance. When I needed this kind of thing, looping through all gc objects was fast enough for about a million objects.

  2. Another good tip, thanks Marius. I went with the approach I did, rather than iterating over all garbage collected objects, because it is also a good insight into what exactly holds onto references. It wasn't hard to write, either.

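
For reference, the gc.get_objects() approach suggested in the first comment might look something like the sketch below. It is written for Python 3 (an assumption), and the function name find_instances_by_scan is hypothetical. It only sees objects the garbage collector tracks, which covers instances of ordinary user-defined classes.

```python
import gc

def find_instances_by_scan(class_):
    """Group every gc-tracked instance of class_ by its exact type."""
    found = {}
    for obj in gc.get_objects():
        try:
            matched = isinstance(obj, class_)
        except Exception:
            # Some proxy-like objects can raise when inspected; skip them.
            continue
        if matched:
            found.setdefault(type(obj), []).append(obj)
    return found
```

Because isinstance() already honours inheritance, there is no need to walk __bases__ tuples, at the cost of visiting every tracked object and learning nothing about what holds the references.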