Implementing code reloading, part 1
This post continues on from A custom namespacing system for Python, part 2.
Code reloading
There are many situations where it is useful to be able to update the code an application is using while it is running. Some of the most common are when:
- A service is being provided by the application and interruption to that service for its users is undesirable.
- Starting up the application takes an excessive amount of time and modifying its behaviour without restarting it can allow for more productive development.
- The state which the application maintains is built up after it was started, and the ability to modify behaviour of the application without restarting it allows interaction with the accumulated state.
Code reloading through overwriting
With the custom namespacing system previously created, a script file can be loaded and its contents placed into a given namespace.
First a script needs to be loaded.

    scriptFile = LoadScript("scripts\example.py", "game")

Next the script is run, placing its contents into the given namespace.

    RunScript(scriptFile)

This can easily be hooked up to file change events.

    def OnFileChanged(filePath):
        scriptFile = LoadScript("scripts\example.py", "game")
        RunScript(scriptFile)

A simple overwriting code reloading system can be implemented by repeating this process every time a file changes, which is exactly what would happen if there were actually a file change event hooked up in the manner shown. In response to a file change, the previously created objects from before the change are overwritten by the newly created versions from the freshly changed file.
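The overwriting behaviour can be sketched without the namespacing system from the earlier posts. The following is a minimal illustration written for Python 3, not the implementation behind LoadScript and RunScript: the `run_source` helper and the use of a plain module object as the "game" namespace are assumptions made purely for the example.

```python
import types

# Hypothetical stand-in for the post's namespace: a plain module object.
game = types.ModuleType("game")


def run_source(source, namespace):
    """Execute script source and copy the names it defines into namespace."""
    scope = {}
    exec(compile(source, "<script>", "exec"), scope)
    for name, value in scope.items():
        if not name.startswith("__"):
            # Overwrite whatever the namespace previously held under this name.
            setattr(namespace, name, value)


# Initial load of the script.
run_source("def greet():\n    return 'original'\n", game)
first_result = game.greet()

# Simulate a file change by running an edited version of the same script.
run_source("def greet():\n    return 'reloaded'\n", game)
second_result = game.greet()
```

After the second run, `game.greet` refers to the function from the edited source, and the earlier function is only reachable through references taken before the reload.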
Overwriting functions
If each time a function is called it is accessed via the namespace, the newest version of the function will be the one which is called. This makes sense, as this is the whole point of the overwriting process which happens on a reload. If a programmer wants to keep calling the same version of the function and not start calling new versions as they get put in place, they need to keep their own reference to the function to call, avoiding use of the namespace.
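As a small sketch of the difference between the two ways of calling a function (Python 3, with a plain module object standing in for the namespace and two hand-written functions standing in for the versions before and after a reload; all of these names are hypothetical):

```python
import types

game = types.ModuleType("game")  # hypothetical stand-in for the namespace


def greet_v1():
    return "v1"


def greet_v2():
    return "v2"


game.greet = greet_v1
pinned_greet = game.greet      # direct reference, taken before the reload

game.greet = greet_v2          # simulated reload overwrites the entry

via_namespace = game.greet()   # looked up at call time: newest version
via_pinned = pinned_greet()    # still the version captured earlier
```

Here `via_namespace` comes from the newer function, while `via_pinned` still comes from the version that was in place when the reference was taken.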
Overwriting classes
Let's try a few things. To start with, assume that "example.py" provides a base class called BaseObject.

    class BaseObject:
        def PrintClassName(self):
            print self.__class__.__name__ +" (original)"

If a class is defined, it is likely that at some point that class will be instantiated.

    import game
    baseObjectBeforeReload = game.BaseObject()

At some point "example.py" is changed, specifically the BaseObject class, triggering a reload. This is the new version of the class:

    class BaseObject:
        def PrintClassName(self):
            print self.__class__.__name__ +" (reloaded)"

Now the namespace entry game.BaseObject will be a new class with the old version overwritten. And since the previously created instance baseObjectBeforeReload was created from the old version of the class, it will still be using that old version of the class.

Executing this code:

    baseObjectAfterReload = game.BaseObject()
    print baseObjectAfterReload.__class__ is not baseObjectBeforeReload.__class__
    baseObjectBeforeReload.PrintClassName()
    baseObjectAfterReload.PrintClassName()

Gives this output:

    True
    BaseObject (original)
    BaseObject (reloaded)

For certain applications, this might be a desirable result. It won't update the behaviour of existing instances which are in use within the application, but it will give a different behaviour for new instances created from the newer classes. If a programmer still wants to create instances from older versions of the class, they can do so by taking a reference to the class from the namespace before a reload has happened and the processing of a changed file has overwritten it.
Limitations to naive class overwriting
However, things get more complicated when inheritance is involved. For the purposes of illustration, here are two classes, BaseObject and its subclass Object.

In the file "example1.py", BaseObject is defined:

    class BaseObject:
        def TokenFunction(self):
            print "TokenFunction called"

In the file "example2.py", Object inherits from BaseObject:

    import game

    class Object(game.BaseObject):
        def TokenFunction(self):
            game.BaseObject.TokenFunction(self)

There are two things to note about the Object class:
- Object.__bases__[0] is a direct reference to whatever game.BaseObject was when the class was loaded.
- game.BaseObject in TokenFunction is looked up each time the function is executed, which means it is not guaranteed to be the same class as Object.__bases__[0].
Once "example1.py" has been reloaded, calling TokenFunction on an Object instance will cause the following error:

    TypeError: unbound method TokenFunction() must be called with BaseObject instance as first argument (got Object instance instead)

This is the primary limitation in the overwriting approach. If this approach is used, then a subclass can never refer to its base class through a namespace reference. It needs to do so through a reference to a version of the base class which won't change. self.__class__.__bases__[0] fits the bill, but using it each time is too unwieldy, and it is easier to store a reference to the version of the class used.

    import game

    # Take a direct reference to the version of the class in place at load time.
    BaseObject = game.BaseObject

    class Object(BaseObject):
        def TokenFunction(self):
            BaseObject.TokenFunction(self)

If a programmer can live with this limitation and does not find the need to keep track of the older references too cumbersome, this form of naive overwriting might be more than sufficient. Realistically, though, this requirement is too impractical to work with.
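The mismatch behind this error can be reproduced in a self-contained form. The sketch below is written for Python 3 and makes some assumptions for illustration: a plain module object stands in for the "game" namespace, and a hypothetical `define_base_object` helper stands in for re-running "example1.py". Note that Python 3 no longer has unbound methods, so the TypeError shown above is specific to Python 2; the underlying problem, that the namespace entry and the subclass's actual base have diverged, is still visible.

```python
import types

game = types.ModuleType("game")  # hypothetical stand-in for the namespace


def define_base_object():
    # Each call creates a distinct class object, as re-running the script would.
    class BaseObject:
        def TokenFunction(self):
            return "TokenFunction called"
    return BaseObject


game.BaseObject = define_base_object()


class Object(game.BaseObject):
    def TokenFunction(self):
        # Looked up in the namespace each time the method runs.
        return game.BaseObject.TokenFunction(self)


instance = Object()
matches_before = isinstance(instance, game.BaseObject)

game.BaseObject = define_base_object()   # simulated reload of example1.py

matches_after = isinstance(instance, game.BaseObject)
base_is_current = Object.__bases__[0] is game.BaseObject
```

After the simulated reload, the instance is no longer an instance of whatever `game.BaseObject` now refers to, which is the condition the Python 2 unbound method check rejects.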
Overwriting classes "intelligently"
By extending the reloading process it may be possible to remove the need to keep track of those older references. When a class is updated, all the subclasses which inherit the old version need to be updated to be compatible with the new version. The base classes of a subclass can be found in the __bases__ attribute of the subclass. If the reference to the old version of a reloaded class in this attribute is replaced with a reference to the new version, then the above error does not occur.

This can be done in the following manner:

    import gc, types

    for ob1 in gc.get_referrers(oldBaseObjectClass):
        # Class '__bases__' references are stored in a tuple.
        if type(ob1) is tuple:
            for ob2 in gc.get_referrers(ob1):
                # Only classes (old-style or new-style) are of interest.
                if type(ob2) in (types.ClassType, types.TypeType):
                    if ob2.__bases__ is ob1:
                        # Swap the old base class for the reloaded one.
                        bases = list(ob2.__bases__)
                        idx = bases.index(oldBaseObjectClass)
                        bases[idx] = game.BaseObject
                        ob2.__bases__ = tuple(bases)

But the opposite case is now a problem. If existing references to the old version of the class are stored directly before the reload and used after it, the error can still occur.
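For readers on Python 3, the same idea can be sketched in a runnable form. This is not the garbage collector walk used above: it swaps in the built-in `__subclasses__()` method to find the direct subclasses, which is simpler to demonstrate. The `rebase_subclasses` and `define_base_object` helpers and the module object standing in for the namespace are hypothetical names introduced for the example.

```python
import types

game = types.ModuleType("game")  # hypothetical stand-in for the namespace


def define_base_object(label):
    # Each call creates a distinct class object, as re-running the script would.
    class BaseObject:
        def Describe(self):
            return label
    return BaseObject


def rebase_subclasses(old_class, new_class):
    """Point every direct subclass of old_class at new_class instead."""
    for subclass in old_class.__subclasses__():
        bases = list(subclass.__bases__)
        bases[bases.index(old_class)] = new_class
        subclass.__bases__ = tuple(bases)


game.BaseObject = define_base_object("original")


class Object(game.BaseObject):
    pass


instance = Object()
old_class = game.BaseObject                        # reference kept across the reload
game.BaseObject = define_base_object("reloaded")   # simulated reload
rebase_subclasses(old_class, game.BaseObject)

description = instance.Describe()                  # resolved through the new base
is_current = isinstance(instance, game.BaseObject)
matches_old = isinstance(instance, old_class)
```

The existing instance now picks up behaviour from the reloaded base class, but it no longer counts as an instance of the class held in `old_class`, which is the opposite case described above.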
This approach can be further extended to cover additional cases, like replacing references to the old class in global dictionaries, but there are references which cannot be replaced. The burden therefore remains on the programmer to know which cases they need to avoid in order for this code reloading solution to be workable.
Unit tests illustrating the overwriting approach and general limitations in code reloading approaches can be found here.
What next?
It is useful to know the limitations of code reloading, both to determine how they apply to a given approach and to understand how they affect the behaviour of reloaded code whatever approach is taken.
This post is followed by Implementing code reloading, part 2.
Edited: 2009/02/25, 6:00PM. Rewrote the last couple of paragraphs and linked to the next post.