
Concurrency Oriented Programming in Termite Scheme

Guillaume Germain, Marc Feeley, Stefan Monnier


Université de Montréal
{germaing, feeley, monnier}@iro.umontreal.ca

Abstract

Termite Scheme is a variant of Scheme intended for distributed computing. It offers a simple and powerful concurrency model, inspired by the Erlang programming language, which is based on a message-passing model of concurrency.

Our system is well suited for building custom protocols and abstractions for distributed computation. Its open network model allows for the building of non-centralized distributed applications. The possibility of failure is reflected in the model, and ways to handle failure are available in the language. We exploit the existence of first class continuations in order to allow the expression of high-level concepts such as process migration.

We describe the Termite model and its implications, how it compares to Erlang, and describe sample applications built with Termite. We conclude with a discussion of the current implementation and its performance.

General Terms: Distributed computing in Scheme

Keywords: Distributed computing, Scheme, Lisp, Erlang, Continuations

1. Introduction

There is a great need for the development of widely distributed applications. These applications are found under various forms: stock exchange, databases, email, web pages, newsgroups, chat rooms, games, telephony, file swapping, etc. All distributed applications share the property of consisting of a set of processes executing concurrently on different computers and communicating in order to exchange data and coordinate their activities. The possibility of failure is an unavoidable reality in this setting due to the unreliability of networks and computer hardware.

Building a distributed application is a daunting task. It requires delicate low-level programming to connect to remote hosts, send them messages and receive messages from them, while properly catching the various possible failures. Then it requires tedious encoding and decoding of data to send them on the wire. And finally it requires designing and implementing on top of it its own application-level protocol, complete with the interactions between the high-level protocol and the low-level failures. Lots and lots of bug opportunities and security holes in perspective.

Termite aims to make this much easier by doing all the low-level work for you and by leveraging Scheme's powerful abstraction tools to make it possible to concentrate just on the part of the design of the high-level protocol which is specific to your application. More specifically, instead of having to repeat all this work every time, Termite offers a simple yet high-level concurrency model on which reliable distributed applications can be built. As such it provides functionality which is often called middleware. As macros abstract over syntax, closures abstract over data, and continuations abstract over control, the concurrency model of Termite aims to provide the capability of abstracting over distributed computations.

The Termite language itself, like Scheme, was kept as powerful and simple as possible (but no simpler), to provide simple orthogonal building blocks that we can then combine in powerful ways. Compared to Erlang, the main additions are two building blocks: macros and continuations, which can of course be sent in messages like any other first class object, enabling such operations as task migration and dynamic code update.

An important objective was that it should be flexible enough to allow the programmer to easily build and experiment with libraries providing higher-level distribution primitives and frameworks, so that we can share and reuse more of the design and implementation between applications. Another important objective was that the basic concurrency model should have sufficiently clean semantic properties to make it possible to write simple yet robust code on top of it. Only by attaining those two objectives can we hope to build higher layers of abstractions that are themselves clean, maintainable, and reliable.

Sections 2 and 3 present the core concepts of the Termite model, and the various aspects that are a consequence of that model. Section 4 describes the language, followed by extended examples in Sec. 5. Finally, Section 6 presents the current implementation with some performance measurements.

2. Termite's Programming Model

The foremost design philosophy of the Scheme [14] language is the definition of a small, coherent core which is as general and powerful as possible. This justifies the presence of first class closures and continuations in the language: these features are able to abstract data and control, respectively. In designing Termite, we extended this philosophy to concurrency and distribution features. The model must be simple and extensible, allowing the programmer to build his own concurrency abstractions.

Distributed computations consist of multiple concurrent programs running in usually physically separate spaces and involving data transfer through a potentially unreliable network. In order to model this reality, the concurrency model used in Termite views the computation as a set of isolated sequential processes which are uniquely identifiable across the distributed system. They communicate with each other by exchanging messages.
Failure is reflected in Termite by the uncertainty associated with the transmission of a message: there is no guarantee that a message sent will ever be delivered.

The core features of Termite's model are: isolated sequential processes, message passing, and failure.

2.1 Isolated sequential processes

Termite processes are lightweight. There could be hundreds of thousands of them in a running system. Since they are an important abstraction in the language, the programmer should not consider their creation as costly. She should use them freely to model the problems at hand.

A Termite process executes in the context of a node. Nodes are identified with a node identifier that contains information to locate a node physically and connect to it (see Sec. 3.4 for details). The procedure spawn creates and starts a new process on the node of the parent process.

Termite processes are identified with process identifiers or pids. Pids are universally unique. We make the distinction here between globally unique, which means unique at the node level, and universally unique, which means unique at the whole distributed network level. A pid is therefore a reference to a process and contains enough information to determine the node on which the process is located. It is important to note that there is no guarantee that a pid refers to a process that is reachable or still alive.

Termite enforces strong isolation between each of the processes: it is impossible for a process to directly access the memory space of another process. This is meant to model the reality of a physically distributed system, and has the advantage of avoiding the problems related to sharing memory space between processes. This also avoids having to care about mutual exclusion at the language level. There is no need for mutexes and condition variables. Another consequence of that model is that there is no need for a distributed garbage collector, since there cannot be any foreign reference between two nodes' memory spaces. On the other hand, a live process might become unreachable, causing a resource leak: this part of the resource management needs to be done manually.

2.2 Sending and receiving messages

Processes interact by exchanging messages. Each process has a single mailbox in which Termite stores messages in the order in which it receives them. This helps keep the model simple since it saves us from introducing concepts like mailboxes or ports.

In Termite, a message can be any serializable first class value. It can be an atomic value such as a number or a symbol, or a compound value such as a list, record, or continuation, as long as it contains only serializable values.

The message sending operation is asynchronous. When a process sends a message, this is done without the process blocking. The message retrieval operation is synchronous. A process attempting to retrieve a message from its mailbox will block if no message is available.

Here is an example showing the basic operations used in Termite: a process A spawns a new process B; the process B sends a message to A; the process A waits until it receives it.

(let ((me (self)))
  (spawn
    (lambda ()
      (! me "Hello, world!"))))

(?) =⇒ "Hello, world!"

The procedure self returns the pid of the current process. The procedure ! is the send message operation, and the procedure ? is the operation that retrieves the next message from the mailbox.

2.3 Failure

The unreliability of the physical, "real world" aspects of a distributed computation makes it necessary for that computation to pay close attention to the possibility of failure. A computation run on a single computer with no exterior communication generally does not have to care whether the computer crashes. This is not the case in a distributed setting, where some parts of the computation might go on even in the presence of hardware failure or if the network connection goes down. In order to model failure, sending a message in Termite is an unreliable operation. More specifically, the semantics of the language do not specify how much time a message will take to reach its destination and it may even never reach it, e.g. because of some hardware failure or excessive load somewhere along the way. Joe Armstrong has called this send and pray semantics [2].

Since the transmission of a message is unreliable, it is generally necessary for the application to use a protocol with acknowledgments to check that the destination has received the message. The burden of implementing such a protocol is left to the application because there are several ways to do it, each with an impact on the way the application is organized. If no acknowledgment is received within a certain time frame, then the application will take some action to recover from the failure. In Termite the mechanism for handling the waiting period is to have an optional timeout for the amount of time to wait for messages. This is a basic mechanism on which we can build higher level failure handling.
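As an illustration, such an acknowledgment protocol could be sketched as follows, using the recv form and the make-tag procedure introduced in Section 4 (send/ack is a hypothetical helper for this example, not part of Termite):

(define (send/ack pid msg timeout)
  (let ((tag (make-tag)))
    (! pid (list (self) tag msg))   ; the recipient is expected to
    (recv                           ; send back (list tag 'ack)
      ((,tag 'ack) #t)              ; acknowledged in time
      (after timeout #f))))         ; give up, let the caller recover

A #f result tells the caller that no acknowledgment arrived in time, so it can retry the send or take some other recovery action.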
3. Peripheral Aspects

Some other Termite features are also notable. While they are not core features, they come naturally when considering the basic model. The most interesting of those derived features are serialization, how to deal with mutation, exception handling and the naming of computers and establishing network connections to them.

3.1 Serialization

There should be no restrictions on the type of data that can constitute a message. Therefore, it is important that the runtime system of the language supports serialization of every first class value in the language, including closures and continuations.

But this is not always possible. Some first class values in Scheme are hard to serialize meaningfully, like ports and references to physical devices. It will not be possible to serialize a closure or a continuation if it has a direct reference to one of these objects in its environment.

To avoid having references to non-serializable objects in the environment, we build proxies to those objects by using processes, so that the serialization of such an object will be just a pid. Therefore, Termite uses processes to represent ports (like open files) or references to physical devices (like the mouse and keyboard).

Abstracting non-serializable objects as processes has two other benefits. First, it enables the creation of interesting abstractions. For example, a click of the mouse will send a message to some "mouse listener", sending a message to the process proxying the standard output will print it, etc. Secondly, this allows us to access non-movable resources transparently through the network.
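For instance, a proxy for an output port might be sketched like this (a minimal illustration, not the actual Termite implementation; it assumes a one-way 'write message protocol and uses the recv form of Section 4.3):

(define (spawn-output-port-proxy port)
  (spawn
    (lambda ()
      (let loop ()
        (recv
          (('write x)
           (write x port)))   ; the port itself never leaves this node
        (loop)))))

The pid returned by spawn-output-port-proxy can be serialized and sent anywhere on the network, while the port stays on its node of origin.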
3.2 Explicit mutation

To keep the semantics clean and simplify the implementation, mutation of variables and data structures is not available. This allows the implementation of message-passing within a given computer without having to copy the content of the message.

For this reason, Termite forbids explicit mutation in the system (as with the special form set! and procedures set-car!, vector-set!, etc.). This is not as big a limitation as it seems at first.
It is still possible to replace or simulate mutation using processes. We just need to abstract state using messages and suspended processes. This is a reasonable approach because processes are lightweight. An example of a mutable data structure implemented using a process is given in Section 4.6.

3.3 Exception handling

A Termite exception can be any first class value. It can be raised by an explicit operation, or it can be the result of a software error (like division by zero or a type error).

Exceptions are dealt with by installing dynamically scoped handlers. Any exception raised during execution will invoke the handler with the exception as a parameter. The handler can either choose to manage that exceptional condition and resume execution or to raise it again. If it raises the exception again, it will invoke the nearest encapsulating handler. Otherwise, the point at which execution resumes depends on the handler: an exception-handler will resume execution at the point the exception was raised, whereas an exception-catcher will resume execution at the point that the handler was installed.

If an exception propagates to the outer scope of the process (i.e. an uncaught exception), the process dies. In order to know who to notify of such a circumstance, each process has what we call links to other processes. When a process dies and it is linked to other processes, Termite propagates the exception to those processes. Links between processes are directed. A process which has an outbound link to another process will send any uncaught exception to the other process. Note that exception propagation, like all communication, is unreliable. The implementation will make an extra effort when delivering an exception since that kind of message may be more important for the correct execution of the application.

Receiving an exception causes it to be raised in the receiving process at the moment of the next message retrieve operation by that process.

Links can be established in both directions between two processes. In that situation the link is said to be bidirectional. The direction of the link should reflect the relation between the two processes. In a supervisor-worker relation, we will use a bidirectional link since both the supervisor and the worker need to learn about the death of the other (the supervisor so it may restart the worker, the worker so it can stop executing). In a monitor-worker relation where the monitor is an exterior observer to the worker, we will use an outbound link from the worker since the death of the monitor should not affect the worker.

3.4 Connecting nodes

Termite processes execute on nodes. Nodes connect to each other when needed in order to exchange messages. The current practice in Termite is to uniquely identify nodes by binding them to an IP address and a TCP port number. Node references contain exactly that information and therefore it is possible to reach a node from the information contained in the reference. Those references are built using the make-node procedure.

Termite's distributed system model is said to be open: nodes can be added or removed from a distributed computation at any time. Just like it is possible to spawn a process on the current node, it is possible to spawn a process on a remote node by using the remote-spawn procedure. This is one of the key features that enable distribution.

The concept of global environment as it exists in Scheme is tied to a node. A variable referring to the global environment will resolve to the value tied to that variable on the node on which the process is currently executing.

3.5 Tags

A process may make multiple concurrent requests to another process. Also, replies to requests may come out of order (and even from a completely different process, e.g. if the request was forwarded). In those cases, it can be difficult to sort out which reply corresponds to which request. For this purpose, Termite has a universally unique reference data type called tag. When needed, the programmer can then uniquely mark each new request with a new tag, and copy the tag into the replies, to unequivocally indicate which reply corresponds to which request. Note that this can be necessary even when there is apparently only one request pending, since the process may receive a spurious delayed reply to some earlier request which had timed out.

4. The Termite Language

This section introduces the Termite language through examples. For the sake of simplicity those examples assume that messages will always be delivered (no failure) and always in the same order that they were sent.

The fundamental operations of Termite are:

(spawn fun): create a process running fun and return its pid.

(! pid msg): send message msg to process pid.

(? [timeout [default]]): fetch a message from the mailbox.

4.1 Making a "server" process

In the following code, we create a process called pong-server. This process will reply with the symbol pong to any message that is a list of the form (pid ping) where pid refers to the originating process. The Termite procedure self returns the pid of the current process.

(define pong-server
  (spawn
    (lambda ()
      (let loop ()
        (let ((msg (?)))
          (if (and (list? msg)
                   (= (length msg) 2)
                   (pid? (car msg))
                   (eq? (cadr msg) 'ping))
              (let ((from (car msg)))
                (! from 'pong)
                (loop))
              (loop)))))))

(! pong-server (list (self) 'ping))

(?) =⇒ pong

4.2 Selective message retrieval

While the ? procedure retrieves the next available message in the process' mailbox, sometimes it can be useful to be able to choose the message to retrieve based on a certain criterion. The selective message retrieval procedure is (?? pred [timeout [default]]). It retrieves the first message in the mailbox which satisfies the predicate pred. If none of the messages in the mailbox satisfy pred, then it waits until one arrives that does or until the timeout is hit.

Here is an example of the ?? procedure in use:

(! (self) 1)
(! (self) 2)
(! (self) 3)

(?)       =⇒ 1
(?? odd?) =⇒ 3
(?)       =⇒ 2

4.3 Pattern matching

The previous pong-server example showed that ensuring that a message is well-formed and extracting relevant information from it can be quite tedious. Since those are frequent operations, Termite offers an ML-style pattern matching facility.

Pattern matching is implemented as a special form called recv, conceptually built on top of the ?? procedure. It has two simultaneous roles: selective message retrieval and data destructuring. The following code implements the same functionality as the previous pong server but using recv:

(define better-pong-server
  (spawn
    (lambda ()
      (let loop ()
        (recv
          ((from 'ping)           ; pattern to match
           (where (pid? from))    ; constraint
           (! from 'pong)))       ; action
        (loop)))))

The use of recv here only has one clause, with the pattern (from 'ping) and an additional side condition (also called where clause) (pid? from). The pattern constrains the message to be a list of two elements where the first can be anything (ignoring for now the subsequent side condition) and will be bound to the variable from, while the second has to be the symbol ping. There can of course be several clauses, in which case the first message that matches one of the clauses will be processed.
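For instance, a server could handle two kinds of messages with a two-clause recv along these lines (a sketch, not from the paper; it reuses the print procedure that appears in Section 5.2):

(recv
  ((from 'ping)
   (where (pid? from))
   (! from 'pong))     ; answer ping requests
  (('log text)
   (print text)))      ; handle one-way log messages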
4.4 Using timeouts

Timeouts are the fundamental way to deal with unreliable message delivery. The operations for receiving messages (i.e. ?, ??) can optionally specify the maximum amount of time to wait for the reception of a message as well as a default value to return if this timeout is reached. If no timeout is specified, the operation will wait forever. If no default value is specified, the timeout symbol will be raised as an exception. The recv special form can also specify such a timeout, with an after clause which will be selected after no message matched any of the other clauses for the given amount of time.

(! some-server (list (self) 'request argument))

(? 10) ; waits for a maximum of 10 seconds

;; or, equivalently:
(recv
  (x x)
  (after 10 (raise 'timeout)))

4.5 Remote procedure call

The procedure spawn takes a thunk as parameter, creates a process which evaluates this thunk, and returns the pid of this newly created process. Here is an example of an RPC server to which uniquely identified requests are sent. In this case a synchronous call to the server is used:

(define rpc-server
  (spawn
    (lambda ()
      (let loop ()
        (recv
          ((from tag ('add a b))
           (! from (list tag (+ a b)))))
        (loop)))))

(let ((tag (make-tag)))
  (! rpc-server (list (self)
                      tag
                      (list 'add 21 21)))
  (recv
    ;; note the reference to tag in
    ;; the current lexical scope
    ((,tag reply) reply))) =⇒ 42

The pattern of implementing a synchronous call by creating a tag and then waiting for the corresponding reply by testing for tag equality is frequent. This pattern is abstracted by the procedure !?. The following call is equivalent to the last let expression in the previous code:

(!? rpc-server (list 'add 21 21))

Note that the procedure !? can take optional timeout and default arguments like the message retrieving procedures.
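Ignoring those optional arguments, !? could be implemented along the following lines (a sketch for illustration; the actual definition lives in the Termite runtime):

(define (!? pid msg)
  (let ((tag (make-tag)))
    (! pid (list (self) tag msg))   ; request carries our pid and the tag
    (recv
      ((,tag reply) reply))))       ; only the matching reply is accepted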
4.6 Mutable data structure

While Termite's native data structures are immutable, it is still possible to implement mutable data structures using processes to represent state. Here is an example of the implementation of a mutable cell:

(define (make-cell content)
  (spawn
    (lambda ()
      (let loop ((content content))
        (recv
          ((from tag 'ref)
           (! from (list tag content))
           (loop content))

          (('set! content)
           (loop content)))))))

(define (cell-ref cell)
  (!? cell 'ref))

(define (cell-set! cell value)
  (! cell (list 'set! value)))
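For example, given these definitions a cell behaves as follows:

(define cell (make-cell 0))
(cell-set! cell 42)
(cell-ref cell) =⇒ 42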
4.7 Dealing with exceptional conditions

Explicitly signaling an exceptional condition (such as an error) is done using the raise procedure. Exception handling is done using one of the two procedures with-exception-catcher and with-exception-handler, which install a dynamically scoped exception handler (the first argument) for the duration of the evaluation of the body (the other arguments).

After invoking the handler on an exception, the procedure with-exception-catcher will resume execution at the point where the handler was installed. with-exception-handler, the alternative procedure, will resume execution at the point where the exception was raised. The following example illustrates this difference:
(list
 (with-exception-catcher
  (lambda (exception) exception)
  (lambda ()
    (raise 42) ; this will not return
    123))) =⇒ (42)

(list
 (with-exception-handler
  (lambda (exception) exception)
  (lambda ()
    (raise 42) ; control will resume here
    123))) =⇒ (123)

The procedure spawn-link creates a new process, just like spawn, but this new process is bidirectionally linked with the current process. The following example shows how an exception can propagate through a link between two processes:

(catch
 (lambda (exception) #t)
 (spawn (lambda () (raise 'error)))
 (? 1 'ok)
 #f) =⇒ #f

(catch
 (lambda (exception) #t)
 (spawn-link (lambda () (raise 'error)))
 (? 1 'ok)
 #f) =⇒ #t

4.8 Remotely spawning a process

The function to create a process on another node is remote-spawn. Here is an example of its use:

(define node (make-node "example.com" 3000))

(let ((me (self)))
  (remote-spawn node
    (lambda ()
      (! me 'boo)))) =⇒ a-pid

(?) =⇒ boo

Note that it is also possible to establish links to remote processes. The remote-spawn-link procedure atomically spawns and links the remote process:

(define node (make-node "example.com" 3000))

(catch
 (lambda (exception) exception)
 (let ((me (self)))
   (remote-spawn-link node
     (lambda ()
       (raise 'error))))
 (? 2 'ok)) =⇒ error

4.9 Abstractions built using continuations

Interesting abstractions can be defined using call/cc. In this section we give as an example process migration, process cloning, and dynamic code update.

Process migration is the act of moving a computation from one node to another. The presence of serializable continuations in Termite makes it easy. Of the various possible forms of process migration, two are shown here. The simplest form of migration, called here migrate-task, is to move a process to another node, abandoning messages in its mailbox and current links behind. For that we capture the continuation of the current process, start a new process on a remote node which invokes this continuation, and then terminate the current process:

(define (migrate-task node)
  (call/cc
   (lambda (k)
     (remote-spawn node (lambda () (k #t)))
     (halt!))))

A different kind of migration (migrate/proxy), which might be more appropriate in some situations, will take care to leave a process behind (a proxy) which will forward messages sent to it to the new location. In this case, instead of stopping the original process we make it execute an endless loop which forwards to the new process every message received:

(define (migrate/proxy node)
  (define (proxy pid)
    (let loop ()
      (! pid (?))
      (loop)))
  (call/cc
   (lambda (k)
     (proxy
      (remote-spawn-link
       node
       (lambda () (k #t)))))))

Process cloning is simply creating a new process from an existing process with the same state and the same behavior. Here is an example of a process which will reply to a clone message with a thunk that makes any process become a "clone" of that process:

(define original
  (spawn
    (lambda ()
      (let loop ()
        (recv
          ((from tag 'clone)
           (call/cc
            (lambda (clone)
              (! from (list tag (lambda ()
                                  (clone #t))))))))
        (loop)))))

(define clone (spawn (!? original 'clone)))

Updating code dynamically in a running system can be very desirable, especially with long-running computations or in high-availability environments. Here is an example of such a dynamic code update:

(define server
  (spawn
    (lambda ()
      (let loop ()
        (recv
          (('update k)
           (k #t))

          ((from tag 'ping)
           (! from (list tag 'gnop)))) ; bug
        (loop)))))
(define new-server
  (spawn
    (lambda ()
      (let loop ()
        (recv
          (('update k)
           (k #t))

          ((from tag 'clone)
           (call/cc
            (lambda (k)
              (! from (list tag k)))))

          ((from tag 'ping)
           (! from (list tag 'pong)))) ; fixed
        (loop)))))

(!? server 'ping) =⇒ gnop

(! server (list 'update (!? new-server 'clone)))

(!? server 'ping) =⇒ pong

Note that this allows us to build a new version of a running process, test and debug it separately and when it is ready replace the running process with the new one. Of course this necessitates cooperation from the process whose code we want to replace (it must understand the update message).

5. Examples

One of the goals of Termite is to be a good framework to experiment with abstractions of patterns of concurrency and distributed protocols. In this section we present three examples: first a simple load-balancing facility, then a technique to abstract concurrency in the design of a server and finally a way to transparently "robustify" a process.

5.1 Load Balancing

This first example is a simple implementation of a load-balancing facility. It is built from two components: the first is a meter supervisor. It is a process which supervises workers (called meters in this case) on each node of a cluster in order to collect load information. The second component is the work dispatcher: it receives a closure to evaluate, then dispatches that closure for evaluation to the node with the lowest current load.

Meters are very simple processes. They do nothing but send the load of the current node to their supervisor every second:

(define (start-meter supervisor)
  (let loop ()
    (! supervisor
       (list 'load-report
             (self)
             (local-loadavg)))
    (recv (after 1 'ok)) ; pause for a second
    (loop)))

The supervisor creates a dictionary to store current load information for each meter it knows about. It listens for the update messages and replies to requests for the node in the cluster with the lowest current load and to requests for the average load of all the nodes. Here is a simplified version of the supervisor:

(define (meter-supervisor meter-list)
  (let loop ((meters (make-dict)))
    (recv
      (('load-report from load)
       (loop (dict-set meters from load)))
      ((from tag 'minimum-load)
       (let ((min (find-min (dict->list meters))))
         (! from (list tag (pid-node (car min)))))
       (loop meters))
      ((from tag 'average-load)
       (! from (list tag
                     (list-average
                      (map cdr
                           (dict->list meters)))))
       (loop meters)))))

(define (minimum-load supervisor)
  (!? supervisor 'minimum-load))

(define (average-load supervisor)
  (!? supervisor 'average-load))

And here is how we may start such a supervisor:

(define (start-meter-supervisor)
  (spawn
    (lambda ()
      (let ((supervisor (self)))
        (meter-supervisor
         (map
          (lambda (node)
            (spawn
             (lambda ()
               (migrate node)
               (start-meter supervisor))))
          *node-list*))))))

Now that we can establish what is the current load on nodes in a cluster, we can implement load balancing. The work dispatching server receives a thunk, and migrates its execution to the currently least loaded node of the cluster. Here is such a server:

(define (start-work-dispatcher load-server)
  (spawn
    (lambda ()
      (let loop ()
        (recv
          ((from tag ('dispatch thunk))
           (let ((min-loaded-node
                  (minimum-load load-server)))
             (spawn
              (lambda ()
                (migrate min-loaded-node)
                (! from (list tag (thunk))))))))
        (loop)))))

(define (dispatch dispatcher thunk)
  (!? dispatcher (list 'dispatch thunk)))

It is then possible to use the procedure dispatch to request execution of a thunk on the most lightly loaded node in a cluster.
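For instance, a session using these components might look like this (a sketch; it assumes *node-list* names the nodes of the cluster, as in start-meter-supervisor above):

(define dispatcher
  (start-work-dispatcher (start-meter-supervisor)))

(dispatch dispatcher (lambda () (* 6 7))) =⇒ 42

The thunk is evaluated on the least loaded node and the result comes back to the caller as the reply of the synchronous call.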
5.2 Abstracting Concurrency

Since building distributed applications is a complex task, it is particularly beneficial to abstract common patterns of concurrency. An example of such a pattern is a server process in a client-server organization. We use Erlang's concept of behaviors to do that: behaviors are implementations of particular patterns of concurrent interaction.
The behavior given as example in this section is derived from the generic server behavior. A generic server is a process that can be started, stopped and restarted, and answers RPC-like requests. The behavior contains all the code that is necessary to handle the message sending and retrieving necessary in the implementation of a server. The behavior is only the generic framework. To create a server we need to parameterize the behavior using a plugin that describes the server we want to create. A plugin contains closures (often called callbacks) that the generic code calls when certain events occur in the server.

A plugin only contains sequential code. All the code having to deal with concurrency and passing messages is in the generic server's code. When invoking a callback, the current server state is given as an argument. The reply of the callback contains the potentially modified server state.

A generic server plugin contains four closures. The first is for server initialization, called when creating the server. The second is for procedure calls to the server: the closure dispatches on the term received in order to execute the function call. Procedure calls to the server are synchronous. The third closure is for casts, which are asynchronous messages sent to the server in order to do management tasks (like restarting or stopping the server). The fourth and last closure is called when terminating the server.

Here is an example of a generic server plugin implementing a key/value server:

(define key/value-generic-server-plugin
  (make-generic-server-plugin
   (lambda () ; INIT
     (print "Key-Value server starting")
     (make-dict))

   (lambda (term from state) ; CALL
     (match term
       (('store key val)
        (dict-set! state key val)
        (list 'reply 'ok state))

       (('lookup key)
        (list 'reply (dict-ref state key) state))))

   (lambda (term state) ; CAST
     (match term
       ('stop (list 'stop 'normal state))))

   (lambda (reason state) ; TERMINATE
     (print "Key-Value server terminating"))))

It is then possible to access the functionality of the server by using the generic server interface:

(define (kv:start)
  (generic-server-start-link
   key/value-generic-server-plugin))

(define (kv:stop server)
  (generic-server-cast server 'stop))

(define (kv:store server key val)
  (generic-server-call server (list 'store key val)))

(define (kv:lookup server key)
  (generic-server-call server (list 'lookup key)))
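Given these definitions, a session with the key/value server might look as follows (a sketch):

(define server (kv:start))
(kv:store server 'answer 42)  =⇒ ok
(kv:lookup server 'answer)    =⇒ 42
(kv:stop server)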
Using such concurrency abstractions helps in building reliable software, because the software development process is less error-prone. We reduce complexity at the cost of flexibility.

5.3 Fault Tolerance

Promoting the writing of simple code is only a first step in order to allow the development of robust applications. We also need to be able to handle system failures and software errors. Supervisors are another kind of behavior in the Erlang language, but we use a slightly different implementation from Erlang's. A supervisor process is responsible for supervising the correct execution of a worker process. If there is a failure in the worker, the supervisor restarts it if necessary.

Here is an example of use of such a supervisor:

(define (start-pong-server)
  (let loop ()
    (recv
      ((from tag 'crash)
       (! from (list tag (/ 1 0))))
      ((from tag 'ping)
       (! from (list tag 'pong))))
    (loop)))

(define robust-pong-server
  (spawn-thunk-supervised start-pong-server))

(define (ping server)
  (!? server 'ping 1 'timeout))

(define (crash server)
  (!? server 'crash 1 'crashed))

(define (kill server)
  (! server 'shutdown))

(print (ping robust-pong-server))
(print (crash robust-pong-server))
(print (ping robust-pong-server))
(kill robust-pong-server)

This generates the following trace (note that the messages prefixed with info: are debugging messages from the supervisor):

(info: starting up supervised process)
pong
(info: process failed)
(info: restarting...)
(info: starting up supervised process)
crashed
pong
(info: had to terminate the process)
(info: halting supervisor)

The call to spawn-thunk-supervised returns the pid of the supervisor, but any message sent to the supervisor is sent to the worker. The supervisor is then mostly transparent: interacting processes do not necessarily know that it is there.

There is one special message which the supervisor intercepts, and that consists of the single symbol shutdown. Sending that message to the supervisor makes it invoke a shutdown procedure that requests the process to end its execution, or terminate it if it does not collaborate. In the previous trace, the "had to terminate the process" message indicates that the process did not acknowledge the request to end its execution and was forcefully terminated.

A supervisor can be parameterized to set the acceptable restart frequency tolerable for a process. A process failing more often than a certain limit is shut down. It is also possible to specify the delay that the supervisor will wait for when sending a shutdown request to the worker.
The abstraction shown in this section is useful to construct a fault-tolerant server. A more general abstraction would be able to supervise multiple processes at the same time, with a policy determining the relation between those supervised processes (should the supervisor restart them all when a single process fails or just the failed process, etc.).

5.4 Other Applications

As part of Termite's development, we implemented two non-trivial distributed applications with Termite. Dynamite is a framework for developing dynamic AJAX-like web user-interfaces. We used Termite processes to implement the web-server side logic, and we can manipulate user-interface components directly from the server-side (for example through the repl). Schack is an interactive multiplayer game using Dynamite for its GUI. Players and monsters move around in a virtual world, they can pick up objects, use them, etc. The rooms of the world, the players and monsters are all implemented using Termite processes which interact.

6. The Termite Implementation

The Termite system was implemented on top of the Gambit-C Scheme system [6]. Two features of Gambit-C were particularly helpful for implementing the system: lightweight threads and object serialization.

Gambit-C supports lightweight prioritized threads as specified by SRFI 18 [7] and SRFI 21 [8]. Each thread descriptor contains the thread's continuation in the same linked frame representation used by first class continuations produced by call/cc. Threads are suspended by capturing the current continuation and storing it in the thread descriptor. The space usage for a thread is thus dependent on the depth of its continuation and the objects it references at that particular point in the computation. The space efficiency compares well with the traditional implementation of threads which preallocates a block of memory to store the stack, especially in the context of a large number of small threads. On a 32 bit machine the total heap space occupied by a trivial suspended thread is roughly 650 bytes. A single shared heap is used by all the threads for all allocations including continuations (see [9] for details). Because the thread scheduler uses scalable data structures (red-black trees) to represent priority queues of runnable, suspended and sleeping threads, and threads take little space, it is possible to manage millions of threads on ordinary hardware. This contributes to make the Termite model practically insensitive to the number of threads involved.

Gambit-C supports serialization for an interesting subset of objects including closures and continuations but not ports, threads and foreign data. The serialization format preserves sharing, so even data with cycles can be serialized. We can freely mix interpreted code and compiled code in a given program. The Scheme interpreter, which is written in Scheme, is in fact compiled code in the Gambit-C runtime system. Interpreted code is represented with common Scheme objects (vectors, closures created by compiled code, symbols, etc.). Closures use a flat representation, i.e. a closure is a specially tagged vector containing the free variables and a pointer to the entry point in the compiled code. Continuation frames use a similar representation, i.e. a specially tagged vector containing the continuation's free variables, which include a reference to the parent continuation frame, and a pointer to the return point in the compiled code. When Scheme code is compiled with Gambit-C's block option, which signals that procedures defined at top-level are never redefined, entry points and return points are identified using the name of the procedure that contains them and the integer index of the control point within that procedure. Serialization of closures and continuations created by compiled code is thus possible as long as they do not refer to non-serializable objects and the block option is used. However, the Scheme program performing the deserialization must have the same compiled code, either statically linked or dynamically loaded. Because the Scheme interpreter in the Gambit-C runtime is compiled with the block option, we can always serialize closures and continuations created by interpreted code and we can deserialize them in programs using the same version of Gambit-C. The serialization format is machine independent (endianness, machine word size, instruction set, memory layout, etc.) and can thus be deserialized on any machine. Continuation serialization allows the implementation of process migration with call/cc.

For the first prototype of Termite we used the Gambit-C system as-is. During the development process various performance problems were identified. This prompted some changes to Gambit-C which are now integrated in the official release:

• Mailboxes: Each Gambit-C thread has a mailbox. Predefined procedures are available to probe the messages in the mailbox and extract messages from the mailbox. The operation to advance the probe to the next message optionally takes a timeout. This is useful for implementing Termite's time limited receive operations.

• Thread subtyping: There is a define-type-of-thread special form to define subtypes of the builtin thread type. This is useful to attach thread local information to the thread, in particular the process links.

• Serialization: Originally serialization used a textual format compatible with the standard datum syntax but extended to all types and with the SRFI 38 [5] notation for representing cycles. We added hash tables to greatly improve the speed of the algorithm for detecting shared data. We improved the compactness of the serialized objects by using a binary format. Finally, we parameterized the serialization and deserialization procedures (object->u8vector and u8vector->object) with an optional conversion procedure applied to each subobject visited during serialization or constructed during deserialization. This allows the program to define serialization and deserialization methods for objects such as ports and threads which would otherwise not be serializable.

• Integration into the Gambit-C runtime: To correctly implement tail-calls in C, Gambit-C uses computed gotos for intra-module calls but trampolines to jump from one compilation unit to another. Because the Gambit-C runtime and the user program are distributed over several modules, there is a relatively high cost for calling procedures in the runtime system from the user program. When the Termite runtime system is in a module of its own, calls to some Termite procedures must cross two module boundaries (user program to Termite runtime, and Termite runtime to Gambit-C runtime). For this reason, integrating the Termite runtime in the thread module of the Gambit-C runtime enhances execution speed (this is done simply by adding (include "termite.scm") at the end of the thread module).
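As a small illustration of this serialization API, a closure can make a round-trip through a byte vector (a sketch using the object->u8vector and u8vector->object procedures described above, with the optional conversion procedure omitted):

(define (round-trip obj)
  (u8vector->object (object->u8vector obj)))

;; the closure, together with its captured environment (n = 20),
;; is reconstructed from the byte vector
((round-trip (let ((n 20)) (lambda (x) (+ x n)))) 22) =⇒ 42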
7. Experimental Results

In order to evaluate the performance of Termite, we ran some benchmark programs using Termite version 0.9. When possible, we compared the two systems by executing the equivalent Erlang program using Erlang/OTP version R11B-0, compiled with SMP support disabled. Moreover, we also rewrote some of the benchmarks directly in Gambit-C Scheme and executed them with version 4.0 beta 18 to evaluate the overhead introduced by Termite. In all cases we compiled the code, and no optimization flags were given to the compilers. We used the compiler GCC version 4.0.2 to compile Gambit-C, and we specified the configuration option --enable-single-host for the compilation. We ran all the benchmarks on a GNU/Linux machine with a 1 GHz AMD Athlon 64, 2GB RAM and a 100Mb/s Ethernet, running kernel version 2.6.10.

7.1 Basic benchmarks

Simple benchmarks were run to compare the general performance of the systems on code which does not require concurrency and distribution. The benchmarks evaluate basic features like the cost of function calls and memory allocation.

The following benchmarks were used:

• The recursive Fibonacci and Takeuchi functions, to estimate the cost of function calls and integer arithmetic,
• Naive list reversal, to strain memory allocation and garbage collection,
• Sorting a list of random integers using the quicksort algorithm,
• String matching using the Smith Waterman algorithm.

The results of those benchmarks are given in Figure 1. They show that Termite is generally 2 to 3.5 times faster than Erlang/OTP. The only exception is for nrev which is half the speed of Erlang/OTP due to the overhead of Gambit-C's interrupt polling approach.

Test              Erlang (s)   Termite (s)
fib (34)             1.83         0.50
tak (27, 18, 9)      1.00         0.46
nrev (5000)          0.29         0.53
qsort (250000)       1.40         0.56
smith (600)          0.46         0.16

Figure 1. Basic benchmarks.

7.2 Benchmarks for concurrency primitives

We wrote some benchmarks to evaluate the relative performance of Gambit-C, Termite, and Erlang for primitive concurrency operations, that is process creation and exchange of messages.

The first two benchmarks stress a single feature. The first (spawn) creates a large number of processes. The first process creates the second and terminates, the second creates the third and terminates, and so on. The last process created terminates the program. The time for creating a single process is reported. In the second benchmark (send), a process repeatedly sends a message to itself and retrieves it. The time needed for a single message send and retrieval is reported. The results are given in Figure 2. Note that neither program causes any process to block. We see that Gambit-C and Termite are roughly twice the speed of Erlang/OTP for process creation, and roughly 3 times slower than Erlang/OTP for message passing. Termite is somewhat slower than Gambit-C because of the overhead of calling the Gambit-C concurrency primitives from the Termite concurrency primitives, and because Termite processes contain extra information (list of linked processes).

Test    Erlang (µs)   Gambit (µs)   Termite (µs)
spawn      1.57          0.63          0.91
send       0.08          0.22          0.27

Figure 2. Benchmarks for concurrency primitives.

The third benchmark (ring) creates a ring of 250 thousand processes on a single node. Each process receives an integer and then sends this integer minus one to the next process in the ring. When the number received is 0, the process terminates its execution after sending 0 to the next process. This program is run twice with a different initial number (K). Each process will block a total of ⌈K/250000⌉ + 1 times (once for K = 0 and 5 times for K = 1000000).
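This ring can be expressed with the primitives of Section 4 roughly as follows (our sketch for illustration, not the authors' benchmark code):

(define (ring-member next)
  (let loop ()
    (let ((m (?)))                 ; block until the token arrives
      (cond ((> m 0)
             (! next (- m 1))      ; pass the decremented token on
             (loop))
            (else
             (! next 0))))))       ; forward the 0, then terminate

(define (run-ring n k)
  (let ((me (self)))
    ;; Spawn n-1 members; the first one spawned sends to us, so by
    ;; acting as a member ourselves we close the ring of n processes.
    (let build ((i (- n 1)) (next me))
      (if (> i 0)
          (build (- i 1)
                 (spawn (lambda () (ring-member next))))
          (begin
            (! next k)             ; inject the initial token
            (ring-member next))))))

(run-ring 250000 1000000)

When the token reaches 0 it sweeps around the ring once, terminating every member; the final send to the already-terminated originator is simply lost, which is acceptable under Termite's unreliable send semantics.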
With K = 0 it is mainly the ring creation and destruction time which is measured. With K = 1000000, message passing and process suspension take on increased importance. The results of this benchmark are given in Figure 3. Performance is given in microseconds per process. A lower number means better performance.

K          Erlang (µs)   Gambit (µs)   Termite (µs)
0             6.64          4.56          7.84
1000000       7.32         14.36         15.48

Figure 3. Performance for ring of 250000 processes.

We can see that all three systems have similar performance for process creation; Gambit-C is slightly faster than Erlang and Termite is slightly slower. The performance penalty for Termite relative to Gambit-C is due in part to the extra information Termite processes must maintain (like a list of links) and the extra test on message sends to determine whether they are intended for a local or a remote process. Erlang shows the best performance when there is more communication between processes and process suspension.

7.3 Benchmarks for distributed applications

7.3.1 "Ping-Pong" exchanges

This benchmark measures the time necessary to send a message between two processes exchanging ping-pong messages. The program is run in three different situations: when the two processes are running on the same node, when the processes are running on different nodes located on the same computer and when the processes are running on different nodes located on two computers communicating across a local area network. In each situation, we vary the volume of the messages sent between the processes by using lists of small integers of various lengths. The measure of performance is the time necessary to send and receive a single message. The lower the value, the better the performance.

List length   Erlang (µs)   Gambit (µs)   Termite (µs)
0                0.20          0.67          0.75
10               0.31          0.67          0.75
20               0.42          0.67          0.74
50               0.73          0.68          0.75
100              1.15          0.66          0.74
200              1.91          0.67          0.75
500              4.40          0.67          0.75
1000             8.73          0.67          0.75

Figure 4. Local ping-pong: Measure of time necessary to send and receive a message of variable length between two processes running on the same node.

The local ping-pong benchmark results in Figure 4 illustrate an interesting point: when the volume of messages grows, the performance of the Erlang system diminishes, while the performance of Termite stays practically the same. This is due to the fact that the Erlang runtime uses a separate heap per process, while the Gambit-C runtime uses a shared heap approach.
List length   Erlang (µs)   Termite (µs)
0                 53            145
10                52            153
20                52            167
50                54            203
100               55            286
200               62            403
500              104            993
1000             177           2364

Figure 5. Inter-node ping-pong: Measure of time necessary to send and receive a message of variable length between two processes running on two different nodes on the same computer.

The inter-node ping-pong benchmark exercises particularly the serialization code, and the results in Figure 5 show clearly that Erlang's serialization is significantly more efficient than Termite's. This is expected since serialization is a relatively new feature in Gambit-C that has not yet been optimized. Future work should improve this aspect.

List length   Erlang (µs)   Termite (µs)
0                501            317
10               602            337
20               123            364
50               102            437
100              126            612
200              176            939
500              471           1992
1000             698           3623

Figure 6. Remote ping-pong: Measure of time necessary to send and receive a message of variable length between two processes running on two different computers communicating through the network.

Finally, the remote ping-pong benchmark additionally exercises the performance of the network communication code. The results are given in Figure 6. The difference with the previous program shows that Erlang's networking code is also more efficient than Termite's by a factor of about 2.5 for large messages. This appears to be due to more optimized networking code as well as a more efficient representation on the wire, which comes back to the relative youth of the serialization code. The measurements with Erlang show an anomalous slowdown for small messages which we have not been able to explain. Our best guess is that Nagle's algorithm gets in the way, whereas Termite does not suffer from it because it explicitly disables it.

7.3.2 Process migration

We only executed this benchmark with Termite, since Erlang does not support the required functionality. This program was run in three different configurations: when the process migrates on the same node, when the process migrates between two nodes running on the same computer, and when the process migrates between two nodes running on two different computers communicating through a network. The results are given in Figure 7. Performance is given in number of microseconds necessary for the migration. A lower value means better performance.

Migration                  Termite (µs)
Within a node                      4
Between two local nodes          560
Between two remote nodes        1000

Figure 7. Time required to migrate a process.

The results show that the main cost of a migration is in the serialization and transmission of the continuation. Comparatively, capturing a continuation and spawning a new process to invoke it is almost free.

8. Related Work

The Actors model is a general model of concurrency that has been developed by Hewitt, Baker and Agha [13, 12, 1]. It specifies a concurrency model where independent actors concurrently execute code and exchange messages. Message delivery is guaranteed in the model. Termite might be considered as an "impure" actor language, because it does not adhere to the strict "everything is an actor" model since only processes are actors. It also diverges from that model by the unreliability of the message transmission operation.

Erlang [3, 2] is a distributed programming system that has had a significant influence on this work. Erlang was developed in the context of building telephony applications, which are inherently concurrent. The idea of multiple lightweight isolated processes with unreliable asynchronous message transmission and controlled error propagation has been demonstrated in the context of Erlang to be useful and efficient. Erlang is a dynamically-typed semi-functional language similar to Scheme in many regards. Those characteristics have motivated the idea of integrating Erlang's concurrency ideas to a Lisp-like language. Termite notably adds to Erlang first class continuations and macros. It also features directed links between processes, while Erlang's links are always bidirectional.

Kali [4] is a distributed implementation of Scheme. It allows the migration of higher-order objects between computers in a distributed setting. It uses a shared-memory model and requires a distributed garbage collector. It works using a centralized model where a node is supervising the others, while Termite has a peer-to-peer model. Kali does not feature a way to deal with network failure, while that is a fundamental aspect of Termite. It implements efficient communication by keeping a cache of objects and lazily transmitting closure code, which are techniques a Termite implementation might benefit from.

The Tube [11] demonstrates a technique to build a distributed programming system on top of an existing Scheme implementation. The goal is to have a way to build a distributed programming environment without changing the underlying system. It relies on the "code as data" property of Scheme and on a custom interpreter able to save state to code represented as S-expressions in order to implement code migration. It is intended to be a minimal addition to Scheme that enables distributed programming. Unlike Termite, it neither features lightweight isolated processes nor considers the problems associated with failures.

Dreme [10] is a distributed programming system intended for open distributed systems. Objects are mobile in the network. It uses a shared memory model and implements a fault-tolerant distributed garbage collector. It differs from Termite in that it sends objects to remote processes by reference unless they are explicitly migrated. Those references are resolved transparently across the network, but the cost of operations can be hidden, while in Termite costly operations are explicit. The system also features a User Interface toolkit that helps the programmer to visualize distributed computation.
9. Conclusion
Termite has proven to be an appropriate and interesting language and system for implementing distributed applications. Its core model
is simple yet allows for the abstraction of patterns of distributed
computation.
We built the current implementation on top of the Gambit-C
Scheme system. While this has the benefit of giving a lot of free-
dom and flexibility during the exploration phase, it would be inter-
esting to build from scratch a system with the features described in
this paper. Such a system would have to take into consideration the
frequent need for serialization, try to have processes as lightweight
and efficient as possible, look into optimizations at the level of what
needs to be transferred between nodes, etc. Apart from the opti-
mizations it would also benefit from an environment where a more
direct user interaction with the system would be possible. We in-
tend to take on those problems in future research while pursuing
the ideas laid out in this paper.

10. Acknowledgments
This work was supported in part by the Natural Sciences and
Engineering Research Council of Canada.

References
[1] Gul Agha. Actors: a model of concurrent computation in distributed
systems. MIT Press, Cambridge, MA, USA, 1986.
[2] Joe Armstrong. Making reliable distributed systems in the presence
of software errors. PhD thesis, The Royal Institute of Technology,
Department of Microelectronics and Information Technology,
Stockholm, Sweden, December 2003.
[3] Joe Armstrong, Robert Virding, Claes Wikström, and Mike Williams.
Concurrent Programming in Erlang. Prentice-Hall, second edition,
1996.
[4] H. Cejtin, S. Jagannathan, and R. Kelsey. Higher-Order Distributed
Objects. ACM Transactions on Programming Languages and
Systems, 17(5):704–739, 1995.
[5] Ray Dillinger. SRFI 38: External representation for data with shared structure. http://srfi.schemers.org/srfi-38/srfi-38.html.
[6] Marc Feeley. Gambit-C version 4. http://www.iro.umontreal.ca/~gambit.
[7] Marc Feeley. SRFI 18: Multithreading support. http://srfi.schemers.org/srfi-18/srfi-18.html.
[8] Marc Feeley. SRFI 21: Real-time multithreading support. http://srfi.schemers.org/srfi-21/srfi-21.html.
[9] Marc Feeley. A case for the unified heap approach to Erlang memory management. Proceedings of the PLI'01 Erlang Workshop, September 2001.
[10] Matthew Fuchs. Dreme: for Life in the Net. PhD thesis, New York
University, Computer Science Department, New York, NY, United
States, July 2000.
[11] David A. Halls. Applying mobile code to distributed systems. PhD
thesis, University of Cambridge, Computer Laboratory, Cambridge,
United Kingdom, December 1997.
[12] C. E. Hewitt and H. G. Baker. Actors and continuous functionals. In
E. J. Neuhold, editor, Formal Descriptions of Programming Concepts.
North Holland, Amsterdam, NL, 1978.
[13] Carl E. Hewitt. Viewing control structures as patterns of passing
messages. Journal of Artificial Intelligence, 8(3):323–364, 1977.
[14] Richard Kelsey, William Clinger, and Jonathan Rees (Editors).
Revised⁵ report on the algorithmic language Scheme. ACM SIGPLAN
Notices, 33(9):26–76, 1998.
