Saturday, October 18, 2008

Example of Microsimulation with Stackless Python


Stackless Python is roughly Python with support for microthreads (called tasklets). Tasklets communicate with each other by sending messages through channels , i.e. similar to Erlang, but of course with Python syntax (guess which I prefer :-).  Stackless has a sibling called Greenlet Python , which is a library that can be used with traditional Python to get decent mictrothread support.

Scale
Tasklets have low overhead (compared to traditional threads) so you can have hundreds of thousands of them simultaneously, and when it gets overly crowded and busy on one machine, you can easily scale up by pickling some tasklets and send them to another machine (better known as mobile agents in computer science). Since tasklets also live Python's GIL regime, it might be idea to use the processing API to spawn several processes to better utilize multicore machines (and have many tasklets within each process), or use some of these libraries .

API Alignment Challenge
In Python 2.6 the process and thread APIs are aligned, but a missing piece would be to align those apis with stackless/tasklet apis. From my perspective a natural order would be process > thread > tasklet, and perhaps adding your favorite piece of the cloud would make sense for the API as well?

Example - simulation with 100k robots
The robots are playing a variant of musical chairs on a rectangular arena, it differs from regular musical chairs since if a chair has been sat on once, it can't be sat on again. Finding out who won is left as an exercise (hint: it can be found more than 1 place)

from random import randint
import stackless
import sys
import time

class Arena:
    def __init__(self, xdim, ydim):
        self.arena = [[None]*ydim for x in range(xdim)]
        (self.xdim, self.ydim) = (xdim, ydim)

    def find_unused_place_for_robot(self, robotname):
        (x, y) = randint(0,self.xdim-1), randint(0,self.ydim-1)
        if not self.arena[x][y]:
            self.arena[x][y] = robotname
        return self.arena[x][y] == robotname

class Robot:
    def __init__(self, name=0, arena=None, maxrounds=0):
        self.name = name
        self.arena = arena
        self.points = 0
        self.maxrounds = maxrounds

        # bind Robot's live method to it's tasklet
        self.tasklet = stackless.tasklet(self.live)()

    def live(self):
        rounds = 0
        while rounds < self.maxrounds:
            self.play()
            rounds += 1
        self.tasklet.kill()

    def play(self):
        # entire play method is atomically executed
        atomic = self.tasklet.set_atomic(True)
        if self.arena.find_unused_place_for_robot(self.name):
            self.points += 1
        self.tasklet.set_atomic(atomic)

class RobotArenaSimulator:
    def __init__(self, num_robots, xdim, ydim, maxrounds):
        self.maxrounds = maxrounds
        self.arena = Arena(xdim, ydim)
        self.robots = [Robot(id, self.arena, maxrounds)
                       for id in range(1,num_robots+1)]
    def run_preemptive(self, nopreempt_steps=1):
        tstart = time.clock()
        while stackless.getruncount() != 1:
            # run() takes out the tasklet after nopreempt_steps
            t = stackless.run(nopreempt_steps)
            # so need to add the tasklet to scheduler again
            if t: t.insert()
        return (time.clock()-tstart)

if __name__ == "__main__":
    tstart = time.clock()
    simulator = RobotArenaSimulator(num_robots = 100000,
                                    xdim = 10000, ydim = 10000,
                                    maxrounds = 10)
    simulationtime = simulator.run_preemptive()
    print "simulation time was %.2f seconds" % (simulationtime)
    print "total running time was %.2f seconds" % (time.clock()-tstart)