Stackless Python: Blocking and green threads
Stackless Python is not a multithreading solution, but it is a green threading solution. The microthreads which it provides are scheduled in turn within the same operating thread. This means that any blocking operations made from within one of these microthreads, like a socket call, will block the scheduler and any other microthreads within it.
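The effect can be sketched without Stackless at all. The following is a minimal illustration written for Python 3 using only the standard library. Plain generators stand in for microthreads, and `worker` and `run_round_robin` are hypothetical names invented for this sketch, not part of the Stackless API. One worker makes a blocking call, and the other cannot run until that call returns.

```python
import time

def worker(name, steps, log, blocking_step=None):
    """A generator standing in for a microthread; each yield hands control back."""
    for step in range(steps):
        if step == blocking_step:
            # A blocking call: the whole scheduler waits here, so no other
            # worker can make progress until it returns.
            time.sleep(0.05)
        log.append((name, step))
        yield

def run_round_robin(tasks):
    """Run each task in turn on the single OS thread until all have finished."""
    queue = list(tasks)
    while queue:
        task = queue.pop(0)
        try:
            next(task)
        except StopIteration:
            continue  # this task is done; do not reschedule it
        queue.append(task)

log = []
start = time.monotonic()
run_round_robin([
    worker("a", 2, log, blocking_step=0),
    worker("b", 2, log),
])
elapsed = time.monotonic() - start
```

Worker "b" has nothing to wait for, yet its first step is only recorded after worker "a" has finished sleeping, and the total run time includes the full sleep.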
When someone wants to use the functionality normal blocking operations provide in their Stackless Python application, they need to use an asynchronous non-blocking version of this functionality instead. This leaves the developer unable to write code in a manner that can provide normal control flow, forcing additional boilerplate and complexity into their application.
It is possible for custom solutions to be developed to abstract away this boilerplate and complexity, but these solutions can merely be a different form of boilerplate and complexity. Often they might be a custom framework or API that has to be learned in order for new developers to understand how the application works. An alternative approach is to give the developers the same interface they would have if they were developing a Python application using normal threading and without Stackless Python.
Case study: The Stackless socket module
The design decision behind the Stackless socket module was to provide a replacement socket module that blocked the calling tasklet instead of the calling thread. It takes the asynchronous networking support from the asyncore module in the standard library and wraps it using the channels which Stackless provides to create a replacement socket module with the same interface as the original one.
Using it, developers can write clean and simple synchronous code, just as they would have before Stackless Python became part of their development environment. However, it does not come with Stackless Python, so developers have to add it to their project and ensure it is monkeypatched in place of the standard socket module.
Example: The standard socket module
The following code shows how to connect to a remote host and read some data over a socket using the standard socket module.
import socket

clientSocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
clientSocket.connect((HOST, PORT))
s = clientSocket.recv(1024)
print "READ DATA", s
clientSocket.close()

This is a standard socket API, and knowledge of how to use sockets in C translates almost directly to Python.
Example: The asyncore module
The following code shows how to connect to a remote host and read some data over a socket using the asyncore module.
import socket
import asyncore

class SocketOperation(asyncore.dispatcher):
    def __init__(self):
        asyncore.dispatcher.__init__(self)
        self.create_socket(socket.AF_INET, socket.SOCK_STREAM)
        self.connect((HOST, PORT))

    def handle_connect(self):
        pass

    def handle_read(self):
        s = self.recv(1024)
        print "READ DATA", s
        self.close()

    def handle_close(self):
        self.close()

    def handle_expt(self):
        self.close()

    def writable(self):
        return False

localReference = SocketOperation()
asyncore.loop()

As can be seen in this example, the asynchronous nature of asyncore implicitly gives it a model where control flow is broken up across callbacks. This gives a custom and cumbersome base on which a developer has to build.
Example: The Stackless socket module
The following code shows how to connect to a remote host and read some data over a socket using the Stackless socket module.
import stackless
import stacklesssocket

# Install the monkeypatched socket module.
stacklesssocket.install()

# This now obtains the monkeypatched socket module.
import socket

def SocketOperation():
    clientSocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    clientSocket.connect((HOST, PORT))
    s = clientSocket.recv(1024)
    print "READ DATA", s
    clientSocket.close()

# The main tasklet is best used for running the scheduler. Other
# tasklets take care of the application logic.
stackless.tasklet(SocketOperation)()
while stackless.runcount > 1:
    stackless.run()

As intended, the actual socket logic in this example is identical to the logic in the standard socket example. This example is more extensive, but not unnecessarily so. The extra logic needs to be provided by anything which uses Stackless, in order to launch functions as tasklets and have them scheduled.
Conclusion
The intention was to show that it was possible to hide away boilerplate and complexity in such a way that an absolute minimum amount of burden is placed on the developer who uses a solution. There is no new and arbitrary abstraction to learn with its own boilerplate and complexity. Development can be done using the same approach that would be used if a green threading solution was not involved.
In a way this highlights the way in which generator coroutines are lacking. Each generator only has the ability to block the function it is located within, so in order to block a chain of function calls, each function has to cooperate by participating in a compatible way. This forces a framework with associated boilerplate and complexity on all of those functions. Given the already disjunctive nature of the yield keyword, the result is a solution that is a pale shade of something that can be made using real coroutines.
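The cooperation requirement can be made concrete with a small sketch. It is written for Python 3 and uses `yield from`, which postdates the Python 2 code in the examples above. The function names and the `drive` helper are hypothetical and exist only for illustration. Only the innermost function needs to wait, yet every function above it in the call chain has to become a generator and delegate explicitly.

```python
def read_bytes(count):
    """Innermost function: the only place that actually needs to wait."""
    data = yield ("need", count)  # suspend until the driver sends data in
    return data

def read_header():
    # This intermediate function does no waiting of its own, but it still
    # has to be a generator and delegate with "yield from".
    raw = yield from read_bytes(4)
    return raw.upper()

def handle_request():
    # The same applies one level further up the chain.
    header = yield from read_header()
    return "header=" + header

def drive(gen, incoming):
    """Hypothetical driver that answers each request from a canned list."""
    request = next(gen)
    try:
        while True:
            assert request[0] == "need"
            request = gen.send(incoming.pop(0))
    except StopIteration as stop:
        return stop.value

result = drive(handle_request(), ["abcd"])

# A plain call that skips the delegation does not block or return data;
# it just produces an unstarted generator object.
forgot = read_bytes(4)
```

If `read_header` were an ordinary function that called `read_bytes(4)` directly, it would receive a generator object rather than the data, which is the framework burden described above.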
When using green threading, proper blocking is essential. An arbitrary function down a chain of calling functions should be able to block anything up that chain, without their cooperation. With this ability, building blocks like the Stackless socket module can be implemented, and used transparently by the logic which unknowingly encounters it.
An example of this is the monkeypatching of the Stackless socket module in place of the standard one. Because it provides the same interface, any other module that already uses the socket module now blocks the current tasklet rather than the current thread, without knowing that the replacement is involved. Andrew Dalke's urllib sample demonstrates this.
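The general mechanism behind this kind of substitution can be sketched with the standard library alone. The following Python 3 illustration does not show how `stacklesssocket.install()` is actually implemented. The module name `demo_net`, its `fetch` function and this `install` helper are all hypothetical. It assumes only that replacing an entry in `sys.modules` changes what a later `import` statement returns.

```python
import sys
import types

# Hypothetical stand-in for the "standard" module.
original = types.ModuleType("demo_net")
original.fetch = lambda: "blocked the thread"
sys.modules["demo_net"] = original

def library_code():
    """Stands in for third-party code that knows nothing about the swap."""
    import demo_net  # resolved through sys.modules at call time
    return demo_net.fetch()

before = library_code()

# Hypothetical replacement exposing the same interface.
replacement = types.ModuleType("demo_net")
replacement.fetch = lambda: "blocked the tasklet"

def install():
    """Swap the replacement in under the original module's name."""
    sys.modules["demo_net"] = replacement

install()
after = library_code()
```

Code that imported the original module before the swap keeps its reference to it, which is presumably why the Stackless example calls `stacklesssocket.install()` before its own `import socket`.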
Final notes
Example applications that demonstrate use of the Stackless socket module are available from the Stackless examples project.