Wednesday, 11 November 2009

Comparing Go and Stackless Python

Google has just released a new programming language called Go. Written by Russ Cox, amongst others, it wraps a custom programming language around low-level functionality very similar to that present in his libtask. With the ability to launch functions as microthreads, and the ability to switch between them using channels, it provides functionality similar to that of Stackless Python.

This post is intended to serve as a comparison of how microthreads and channels are used in two languages that feature them. It is not intended to advocate the choice of one over the other, nor is it guaranteed to be full and complete.

Starting a worker function as a microthread

The availability of lightweight threads that can be used without regard for the resource usage they might incur means that, among other things, work can be farmed out to other microthreads while the current microthread does its own thing.

Stackless Python

import stackless

channel = stackless.channel()

def wrapper(argument, channel):
    result = longCalculation(argument)
    channel.send(result)

stackless.tasklet(wrapper)(17, channel)
# Do other work in the current tasklet until the channel has a result.
result = channel.receive()
Go
c := make(chan int)

func wrapper(a int, c chan int) {
    result := longCalculation(a)
    c <- result
}

go wrapper(17, c)
// Do other work in the current goroutine until the channel has a result.
x := <-c
There are several things to note here, including how each language creates microthreads and channels, and the syntax each uses to do so.

Creating a microthread

When a given function (in this case wrapper) is to be started as a microthread, the arguments to be passed into it (17 and the channel reference) need to be provided as well. These are set aside for use when the microthread is first scheduled, and the given function starts execution within it.
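As a small sketch of the "set aside" behaviour on the Go side (the function name `capturedValue` is made up for illustration), the arguments of a `go` statement are evaluated when the statement executes, not when the goroutine first gets to run:

```go
package main

import "fmt"

// capturedValue is a hypothetical helper showing that the argument
// handed to a goroutine is fixed at the point of the go statement.
func capturedValue() int {
	x := 17
	c := make(chan int)

	// x is evaluated here, and its value is set aside for the goroutine.
	go func(v int) { c <- v }(x)

	// Changing x afterwards does not affect what the goroutine received.
	x = 99

	return <-c
}

func main() {
	fmt.Println(capturedValue()) // prints 17
}
```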

Stackless Python

A Stackless Python microthread is called a tasklet.
stackless.tasklet(wrapper)(17, channel)
Advantages:
  • Creation of microthreads happens in a function call, returning a reference to the created instance. The instance can be manipulated, allowing amongst other things explicit interruption and killing of the microthread.
    def engage_worker():
        c = stackless.channel()

        def worker():
            # Acquire some result..
            c.send(result)

        worker_tasklet = stackless.tasklet(worker)()
        # Do some work before requesting the result..
        if c.balance != 0:
            # Return the acquired result that is waiting.
            return c.receive()

        # The worker tasklet is still busy and we do not want
        # to wait for it, so abort it and return nothing.
        worker_tasklet.kill()
Go

A Go microthread is called a goroutine.
go wrapper(17, c)
Disadvantages:
  • There does not appear to be a way to store and operate on created microthreads. So the act of creating a microthread as a worker, but manually killing it before its work is complete, appears to be impossible.
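One way to approximate stopping a worker early in Go is cooperative rather than forced: the goroutine is handed a channel that it checks between steps, and it returns of its own accord once asked to. The sketch below assumes this approach. The names `worker`, `quit`, `stopped` and `runAndCancel` are invented for the example, and it does not kill anything from the outside.

```go
package main

import "fmt"

// worker is a hypothetical long-running job. Between units of work it
// checks whether quit has been closed, and if so reports how many
// steps it completed and returns.
func worker(quit <-chan struct{}, stopped chan<- int) {
	steps := 0
	for {
		select {
		case <-quit:
			stopped <- steps
			return
		default:
			steps++ // one unit of pretend work
		}
	}
}

// runAndCancel starts the worker, asks it to stop, and waits for it
// to acknowledge. It returns the number of steps the worker reported.
func runAndCancel() int {
	quit := make(chan struct{})
	stopped := make(chan int)

	go worker(quit, stopped)

	// Closing the channel makes every receive on it succeed immediately.
	close(quit)

	return <-stopped
}

func main() {
	fmt.Println("worker stopped after", runAndCancel(), "steps")
}
```

Unlike `worker_tasklet.kill()`, this only works if the worker function is written to check for the request.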
One key difference between Go and Stackless Python is how the microthread is inserted into the scheduler. In Go, the go keyword explicitly indicates that the microthread is being scheduled. In Stackless Python, passing in the arguments provides the tasklet with the last information it needs to run, and in doing so the tasklet is implicitly inserted into the scheduler.

Creating a channel

In both languages, it is possible to create channels to be used for communication between the microthreads.

Stackless Python
channel = stackless.channel()
Go
c := make(chan int)
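The Go call above creates an unbuffered channel of int values. As a brief sketch of a detail not covered in this post, `make` also accepts an optional capacity, which gives a channel that can hold that many values before a send blocks. The function name `bufferedInfo` is made up for the example.

```go
package main

import "fmt"

// bufferedInfo is a hypothetical helper that creates a buffered
// channel, sends two values without any receiver, and reports how
// many values are queued and how many the channel can hold.
func bufferedInfo() (queued int, capacity int) {
	c := make(chan int, 3) // room for three values

	// Neither send blocks, because the buffer is not yet full.
	c <- 1
	c <- 2

	return len(c), cap(c)
}

func main() {
	queued, capacity := bufferedInfo()
	fmt.Println(queued, capacity) // prints 2 3
}
```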
Channel operations

Superficially at least, both kinds of channels are similar, allowing the sending and receiving of values through them in much the same way.

Stackless Python

Sending:
channel.send(value)
Receiving:
result = channel.receive()
Go

Sending:
c <- value
Receiving:
result := <-c
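To show the Go operations working together, here is a small self-contained sketch in which one goroutine sends a series of values and another receives and sums them. The names `produce` and `sumFromChannel` are invented for the example, and closing the channel to signal the end of the series is a choice made here, not something the comparison above depends on.

```go
package main

import "fmt"

// produce sends the integers 1..n over the channel and then closes it
// to signal that no more values are coming.
func produce(n int, c chan<- int) {
	for i := 1; i <= n; i++ {
		c <- i // blocks until the receiver takes the value
	}
	close(c)
}

// sumFromChannel starts a producer goroutine and adds up everything
// it sends.
func sumFromChannel(n int) int {
	c := make(chan int)
	go produce(n, c)

	total := 0
	for v := range c { // receives until the channel is closed
		total += v
	}
	return total
}

func main() {
	fmt.Println(sumFromChannel(4)) // prints 10
}
```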
Microthread memory usage

One of the advantages of using these types of microthreads is that they do not have the memory requirements that proper operating system threads do. Instead of having one or more megabytes set aside for possible use as a stack, they have at most several kilobytes set aside for them.
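As a rough illustration of what this low cost allows on the Go side, the sketch below starts ten thousand goroutines and waits for all of them. It makes no claim about exact memory figures. The function name `spawnMany` is made up for the example, and `sync.WaitGroup` is just one way of waiting for the goroutines to finish.

```go
package main

import (
	"fmt"
	"sync"
)

// spawnMany is a hypothetical helper that starts n goroutines, has
// each one report in over a channel, and returns how many reported.
func spawnMany(n int) int {
	var wg sync.WaitGroup
	results := make(chan int, n) // buffered so no goroutine has to wait

	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			results <- 1
		}()
	}

	wg.Wait()
	close(results)

	count := 0
	for v := range results {
		count += v
	}
	return count
}

func main() {
	fmt.Println(spawnMany(10000)) // prints 10000
}
```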

Stackless Python

Stackless tasklets use as their stack the actual stack of the operating system thread they were created in. This means that when a tasklet blocks and it is set aside to let others run, a chunk of memory is allocated from the heap, and the portion of the stack that has been used by it is copied into that chunk. Then the allocated chunk belonging to the next tasklet to be run is copied back onto the stack, and the chunk freed.

Advantages:
  • Blocked microthreads only use as much memory as they actually used.
  • C function calls can be intermixed with the Python function calls in the call stack of a blocked microthread. This could for instance involve a Python function invoking a C function, which then calls back into Python, resulting in the blocking occurring before the stack is unwound.
Disadvantages:
  • Microthreads are linked to the thread they were created in and cannot continue running in any other thread.
It is possible to migrate microthreads from one operating system thread to another in Stackless Python, using its ability to pickle blocked microthreads. However, every function that results in a system call that blocks the interpreter until it completes would need to be monkey-patched to invoke the migration process. A better solution might be to monkey-patch the relevant functions to do the system calls asynchronously, for instance in the same way as the Stackless socket library does.

Go

Advantages:
  • Microthreads are not linked to the thread they were created in. If that thread is blocked for a system call on behalf of a given microthread, the other microthreads can be migrated to another thread and continue running there.
Disadvantages:
  • It is not possible to call into C code and have it call back into Go code.
  • For C code to be usable with Go, it needs to be compiled with custom C compilers.
As noted in the Go source code, currently the language defaults to running in single-threaded mode due to multi-threaded operation being unstable. This means that the blocking of a thread for a system call on behalf of a given microthread would in fact block the execution of all the other microthreads until the call is completed.
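The number of operating system threads allowed to run Go code at the same time is governed by the runtime's GOMAXPROCS setting. The sketch below shows one way to inspect it. The function name `threadSettings` is made up for the example, and the values reported depend on the Go release and the machine it runs on.

```go
package main

import (
	"fmt"
	"runtime"
)

// threadSettings is a hypothetical helper that reports the current
// GOMAXPROCS value alongside the number of CPUs the runtime sees.
func threadSettings() (maxProcs int, cpus int) {
	// Passing 0 queries the current setting without changing it.
	return runtime.GOMAXPROCS(0), runtime.NumCPU()
}

func main() {
	maxProcs, cpus := threadSettings()
	fmt.Println("GOMAXPROCS:", maxProcs, "CPUs:", cpus)
}
```

A value of 1 corresponds to the single-threaded mode described above.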

8 comments:

  1. Also, there is no 'select' in Stackless Python

  2. I pretty much always use Stackless with the Twisted library. At first glance it looks like the "select" in Go is equivalent to a twisted.internet.defer.DeferredList evaluation.

    I really cannot see why Google, which is a big user of Python, is wasting resources on Go when it could be supporting Stackless.

  3. It seems like Google might be on to something by disallowing manual killing of goroutines. Forcing a thread or process to die is often dangerous. Google's implementation forces you to maintain a list of "cancel" variables that the goroutine can check periodically to ensure that not only does it stop when told to, but also that it doesn't die at an unexpected spot.

  4. Craig: Killing it is a simple example of what you can do with the ability to hold a reference to a tasklet you start.

    The difference between Go and Stackless is that Go exposes a fixed implementation of microthreads and channels in a partial and limited way. Stackless exposes them in a way where programmers can take them, and build different and interesting functionality around them.

    You can signal microthreads with variables in Stackless as well. But if you know your tasklet is safe to be killed as needed (not that this case arises), why add the boilerplate?

  5. Masha Rabinovich: Go's microthreads and channels are only partially exposed to the programmer, and in a very limited way. In order to make them usable you need select. In Stackless, you do not need select and in fact, there has been no demand for anything like it.

    I thought about writing my own, but the fact is that in reality it is boilerplate to get around the limitations of the language that features it. In a more flexible language like Stackless, the need goes away.

  6. Not: Just because something looks like a duck and quacks like a duck doesn't mean it is a duck.

    Google has made a good argument, in my opinion, for why this language suits their needs. I rather like it, though my previous comments indicate what I think of their microthreads and channels as a programmer who might use the language.

  7. """ Microthreads are linked to the thread they were created in and cannot continue running in any other thread."""
    This is generally not true for most tasklets.
    If they are in soft-switching mode, they can run
    in any thread. The restriction is only an implementation limit in certain scenarios.

    cheers - chris
