A Type-Theoretic Reconstruction of The Visitor Pattern: Peter Buchlovsky

This document discusses reconstructing the visitor pattern from a type-theoretic perspective using the polymorphic lambda calculus. It aims to clarify how variants of the visitor pattern relate to the more idealized type-theoretic encoding of algebraic data types. The key points are: 1) The visitor pattern can be seen as an implementation of isomorphisms between sum types and universally quantified types in the polymorphic lambda calculus. 2) Variants of the visitor pattern, such as internal vs. external visitors and functional vs. imperative visitors, are discussed. 3) The paper aims to provide a precise type-theoretic formulation of the visitor pattern that could inform reasoning about visitors and transfer technology from theory to design patterns.


MFPS 2005

A Type-theoretic Reconstruction of the Visitor Pattern
Peter Buchlovsky 1
Computer Laboratory
University of Cambridge
Cambridge CB3 0FD, United Kingdom

Hayo Thielecke 2
School of Computer Science
University of Birmingham
Birmingham B15 2TT, United Kingdom

Abstract
In object-oriented languages, the Visitor pattern can be used to traverse tree-like
data structures: a visitor object contains some operations, and the data structure
objects allow themselves to be traversed by accepting visitors. In the polymorphic
lambda calculus (System F), tree-like data structures can be encoded as polymorphic
higher-order functions. In this paper, we reconstruct the Visitor pattern from the
polymorphic encoding by way of generics in Java. We sketch how the quantified
types in the polymorphic encoding can guide reasoning about visitors in general.
Key words: Visitor pattern, polymorphic types, object-oriented
programming, Generic Java

1 Introduction

Tree-like data structures, such as abstract syntax trees or binary trees, and
their traversal (often called tree walking) are ubiquitous in programming.
Modern functional languages, such as ML and Haskell, provide constructs
in the form of datatype definitions and pattern matching to deal with trees
and more general recursive datatypes. However, in object-oriented languages
such as Java, the situation is more complicated. Testing class membership
and branching on it is widely considered a violation of object-oriented style.
1 Email: [email protected]
2 Email: [email protected]
This paper is electronically published in
Electronic Notes in Theoretical Computer Science
URL: www.elsevier.nl/locate/entcs


Instead, the Visitor pattern [6] has been proposed to define operations on
tree-like inductive datatypes.
The simplest example using visitors is that of a sum type of the form A+B.
Since Java lacks a sum type, the Visitor pattern implements a class doing the
job of a sum type by a sort of double-negation transform. Concretely, the class
has a method for accepting a visitor; the visitor itself has two methods, one
accepting arguments of type A, the other of type B. If we take a very idealized
view by taking methods as functions and objects as tuples of such functions,
we can regard the above as an instance of a well-known isomorphism in the
polymorphic $\lambda$-calculus [8]:
$$A + B \cong \forall\alpha.\,((A \to \alpha) \times (B \to \alpha)) \to \alpha$$
These isomorphisms are firmly grounded in programming language theory.
Via the Curry-Howard correspondence, they also form part of a bigger picture
in terms of the definability of logical connectives such as disjunction in higher-order logics.
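The reading of a sum type as "something that accepts a two-method visitor" can be sketched in present-day Java. This is a minimal illustration rather than code from the paper: the names Sum, SumVisitor, Left, Right and SumDemo are hypothetical, and the type parameter R stands for the quantified result type.

```java
// Visitor over the sum A + B: one method per summand, each yielding an R.
// This plays the role of the pair (A -> R) x (B -> R).
interface SumVisitor<A, B, R> {
    R visitLeft(A a);
    R visitRight(B b);
}

// A value of A + B can accept a visitor for every result type R,
// mirroring the quantifier in "forall R. ((A -> R) x (B -> R)) -> R".
interface Sum<A, B> {
    <R> R accept(SumVisitor<A, B, R> v);
}

// Left injection: holds an A and calls the first visit method.
final class Left<A, B> implements Sum<A, B> {
    private final A value;
    Left(A value) { this.value = value; }
    public <R> R accept(SumVisitor<A, B, R> v) { return v.visitLeft(value); }
}

// Right injection: holds a B and calls the second visit method.
final class Right<A, B> implements Sum<A, B> {
    private final B value;
    Right(B value) { this.value = value; }
    public <R> R accept(SumVisitor<A, B, R> v) { return v.visitRight(value); }
}

final class SumDemo {
    // Case analysis without instanceof: the visitor supplies one branch per summand.
    static String describe(Sum<Integer, String> s) {
        return s.accept(new SumVisitor<Integer, String, String>() {
            public String visitLeft(Integer a) { return "int " + a; }
            public String visitRight(String b) { return "string " + b; }
        });
    }
}
```

Calling SumDemo.describe on a Left or a Right selects the matching branch by dynamic dispatch, which is the double dispatch the pattern relies on.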
The present paper aims to flesh out this type-theoretic view of visitors.
Specifically, a Design Pattern (such as the aforementioned Visitor pattern),
by its very nature, is not so much a single unambiguous definition, but a
variety of related instances. Hence we aim to clarify how variants of the Visitor
pattern relate to the more idealized type-theoretic picture. We classify visitors
as internal or external, depending on whether the visitor itself or the tree
specifies the traversal. Moreover these can be either functional (by returning
a value) or imperative (having a void return type and side-effects instead).
The resemblance of variants of the Visitor pattern to the encoding of algebraic types into System F may be part of the type-theoretic folklore. However,
formulating this precisely seems to be very useful, given the possibility of
some technology transfer from the highly developed theory of polymorphic
lambda calculi to less rigorous, but widely used design patterns. Our point is
not to translate object-oriented languages into functional ones or conversely.
Rather, we can see that the same notion has a manifestation in both of these
very different scenarios.
The contributions of this paper include the following:
• We present a stylized polymorphic $\lambda$-calculus to make the relation of polymorphic encodings to visitors perspicuous.
• We show the type soundness of the resulting visitors in Featherweight Generic Java [9].
• We sketch how this abstract view of visitors could be useful for reasoning about visitors.

For completeness, an appendix (Section A) gives more details on Featherweight Generic Java. These details are not essential for understanding the
paper.

2 Background

We briefly recall the two areas of relevant background between which we aim
to bridge: visitors as presented in the Design Patterns literature [6], and
polymorphic lambda calculi [7].
We find it useful to give some object-oriented terminology:
Interface: A fully abstract class. It defines a set of methods but does not give
their bodies. This corresponds to an ML signature.
Class: A class may implement an interface by providing a body for each
method header in the interface. This corresponds to an ML structure.
We will consider Generic Java [3] (Java extended with type parameters on
classes, interfaces and methods) throughout. Briefly, the syntax is as follows.
An interface or class definition of the form class C<α>{...} defines a class
C parameterized by the type variable $\alpha$. The type C<Int> instantiates the
type parameter of class C to Int. A method definition of the form <α> T
m(...){...} defines a method m in which the type variable $\alpha$ is universally
quantified. A call to method m of the form o.m<Int>(...) instantiates $\alpha$ to
Int.
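For comparison, the same three ingredients can be sketched in the generics of present-day Java. This sketch makes two assumptions that differ from the notation above: Java has no type Int, so Integer and String are used, and Java writes an explicit method type argument before the method name, as in o.<Integer>m(...). The class GenericsDemo and the field item are hypothetical names.

```java
import java.util.function.Function;

// A class parameterized by the type variable A, as in class C<α>{...}.
class C<A> {
    private final A item;
    C(A item) { this.item = item; }

    // A method whose own type variable B is universally quantified.
    <B> B m(Function<A, B> f) { return f.apply(item); }
}

final class GenericsDemo {
    static int demo() {
        // The type C<String> instantiates the class parameter A to String.
        C<String> c = new C<String>("visitor");
        // Explicit instantiation of the method parameter B to Integer.
        return c.<Integer>m(s -> s.length());
    }
}
```

In practice the explicit type argument is usually left to type inference.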
The purpose of the Visitor pattern is to organize a program around the operations on a datatype as opposed to the constructors. The canonical example
of the Visitor pattern consists of abstract syntax trees and their traversal by
various phases of a compiler; this approach is used in SableCC [5]. We will
consider a simpler example based on binary trees of integer leaves and the
operation of summing up the leaves.
The standard object-oriented implementation of binary trees is based around
the Composite pattern [6]. The datatype signature (sometimes called element) is represented as an interface and the constructor for each variant of
the datatype becomes a class (sometimes called concrete element) which
implements the interface. The interface includes method headers that specify
the signatures of all operations on the data. This forces each constructor class
to provide a method to handle the appropriate case of the operation.
This approach permits new datatype variants to be added without modifying existing code, something that is not possible in ML. The disadvantage
is that adding new operations is difficult as every existing class has to be
amended. The Visitor pattern turns the situation around. It becomes easy to
add new operations but difficult to add new variants.
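The Composite-style organization described above can be sketched as follows for the binary-tree example. This is an illustrative sketch only; the method name sum and the field names are assumptions.

```java
// Composite style: every operation is a method of the datatype interface.
interface BinTree {
    int sum();
    // A further operation (say, int depth()) would have to be declared here
    // and then implemented in every class below.
}

final class Leaf implements BinTree {
    private final int n;
    Leaf(int n) { this.n = n; }
    public int sum() { return n; }
}

final class Node implements BinTree {
    private final BinTree left, right;
    Node(BinTree left, BinTree right) { this.left = left; this.right = right; }
    public int sum() { return left.sum() + right.sum(); }
}
```

A new variant only needs one new class implementing BinTree, whereas a new operation touches the interface and all existing classes, which is the trade-off the Visitor pattern reverses.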
The Visitor pattern is used as follows. Every operation on the datatype
is packaged into a concrete visitor class. The case for handling each variant
is contained in a method typically named visitCons where Cons is the name
of the constructor class for that variant. Every concrete visitor implements a
visitor interface. This specifies the types of visit methods that must be present
in a concrete visitor. It can be seen as a signature for concrete visitors.
The constructor classes of the datatype are modified to include an accept
method. Its role is to accept a visitor and call the visit method for the variant
which the class implements. This is essentially a form of double dispatch on
the datatype variant and the visit method. Any components of the variant
stored in fields inside the class must be passed to the visit method. It is also
necessary to parameterize the accept methods and visitor interface since we
cannot know in advance the type yielded by any concrete visitor class.
To summarize, the visitor pattern consists of the following classes:
Visitor: This is an interface for visitors. It declares visit methods named
visitCons for each Cons class.
ConcreteVisitor: One class for each operation on the data. It implements
the Visitor interface and has to provide implementations for each of the
visit methods declared there.
Data/Element: This is an interface naming a datatype (e.g. BinTree). It
declares an accept method which takes a Visitor object as an argument.
ConcreteData/ConcreteElement: A Cons class for each variant of the
datatype (e.g. Leaf). This corresponds to an ML datatype constructor
named Cons. It implements the accept method that calls visitCons in the
Visitor object.
The Visitor pattern does not prescribe where a visitor should store intermediate results or how it should return the result to the caller. We will distinguish
between functional visitors which return intermediate results through the result of the call to accept and imperative visitors which accumulate results in
some field inside the visitor.
Another aspect of the Visitor pattern is the choice of traversal strategies
for composite objects. We could put the traversal code in the datatype. To do
this we ensure that the accept method is called recursively on any component
objects and passes the results to the visitor in the call to visit. Alternatively,
we could put the traversal code in the visitor itself. We will refer to these as
internal and external visitors respectively, by analogy with internal and
external iterators [6].
We will use the polymorphic lambda calculus (System F) to encode data
types. In fact, we need a more powerful extension of System F with polymorphic type constructors, called $F_\omega$, since it allows us to approximate generics and interfaces better than we could with System F alone. Most features
of this system will not be new to anyone familiar with advanced languages
like Haskell or ML: intuitively, $F_\omega$ contains polymorphic functions and also
polymorphic type constructors.
See Figure 1 for a fairly standard presentation of $F_\omega$ extended with finite
products. We write $\alpha \notin \Gamma$ to mean that $\alpha$ is not among the free variables
in $\Gamma$. We abbreviate $\forall\alpha{::}{*}.T$ as $\forall\alpha.T$ and similarly for $\lambda\alpha{::}{*}.T$. Empty
products are written as $1$. We will also write $t_i$ for the projection $\pi_i\,t$ and
$\lambda\langle a, b\rangle{:}A \times B.\,t$ for $\lambda x{:}A \times B.\,[a \mapsto \pi_1 x, b \mapsto \pi_2 x]\,t$.

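Terms
$$t, s ::= x \mid \lambda x{:}T.t \mid t\,t \mid \Lambda\alpha{::}K.t \mid t\,[T] \mid \langle t_i\rangle_{i\in 1..n} \mid \pi_i\,t$$
Kinds
$$K ::= * \mid K \to K$$
Types
$$T ::= \alpha \mid T \to T \mid \forall\alpha{::}K.T \mid \lambda\alpha{::}K.T \mid T\,[T] \mid \prod_{i\in 1..n} T_i$$
Contexts
$$\Gamma ::= \emptyset \mid \Gamma, x : T \mid \Gamma, \alpha :: K$$
Typing
$$\text{(Var)}\quad \frac{x : T \in \Gamma}{\Gamma \vdash x : T}$$
$$\text{(Abs)}\quad \frac{\Gamma \vdash T_1 :: * \qquad \Gamma, x : T_1 \vdash t : T_2}{\Gamma \vdash \lambda x{:}T_1.t : T_1 \to T_2}
\qquad
\text{(App)}\quad \frac{\Gamma \vdash t : T_1 \to T_2 \qquad \Gamma \vdash s : T_1}{\Gamma \vdash t\,s : T_2}$$
$$\text{(TAbs)}\quad \frac{\Gamma, \alpha :: K \vdash t : T \qquad \alpha \notin \Gamma}{\Gamma \vdash \Lambda\alpha{::}K.t : \forall\alpha{::}K.T}
\qquad
\text{(TApp)}\quad \frac{\Gamma \vdash t : \forall\alpha{::}K.T_1 \qquad \Gamma \vdash T_2 :: K}{\Gamma \vdash t\,[T_2] : [\alpha \mapsto T_2]\,T_1}$$
$$\text{(Tuple)}\quad \frac{\forall i \in 1..n \quad \Gamma \vdash t_i : T_i}{\Gamma \vdash \langle t_i\rangle_{i\in 1..n} : \prod_{i\in 1..n} T_i}
\qquad
\text{(Proj)}\quad \frac{\Gamma \vdash t : \prod_{i\in 1..n} T_i}{\Gamma \vdash \pi_j\,t : T_j}$$
Kinding
$$\text{(TVar)}\quad \frac{\alpha :: K \in \Gamma}{\Gamma \vdash \alpha :: K}$$
$$\text{(KAbs)}\quad \frac{\Gamma, \alpha :: K_1 \vdash T :: K_2 \qquad \alpha \notin \Gamma}{\Gamma \vdash \lambda\alpha{::}K_1.T :: K_1 \to K_2}
\qquad
\text{(KApp)}\quad \frac{\Gamma \vdash T_1 :: K_1 \to K_2 \qquad \Gamma \vdash T_2 :: K_1}{\Gamma \vdash T_1\,[T_2] :: K_2}$$
$$\text{(KArrow)}\quad \frac{\Gamma \vdash T_1 :: * \qquad \Gamma \vdash T_2 :: *}{\Gamma \vdash T_1 \to T_2 :: *}
\qquad
\text{(KAll)}\quad \frac{\Gamma, \alpha :: K \vdash T :: * \qquad \alpha \notin \Gamma}{\Gamma \vdash \forall\alpha{::}K.T :: *}$$
$$\text{(KTuple)}\quad \frac{\forall i \in 1..n \quad \Gamma \vdash T_i :: *}{\Gamma \vdash \prod_{i\in 1..n} T_i :: *}$$
Reductions
$$(\lambda x{:}T.t)\,s \leadsto [x \mapsto s]\,t$$
$$(\Lambda\alpha{::}K.t)\,[T] \leadsto [\alpha \mapsto T]\,t$$
$$\pi_j\,\langle t_i\rangle_{i\in 1..n} \leadsto t_j$$
$$(\lambda\alpha{::}K.T_1)\,[T_2] \leadsto [\alpha \mapsto T_2]\,T_1$$

Fig. 1. System $F_\omega$ with products.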


3 Visitors and algebraic type encodings

In this section, we recall the encoding of algebraic types in System F [14]
(sometimes called the Böhm-Berarducci encoding [2]). The standard description is in terms of $F$-algebras. However, we find it useful to present it in a
slightly different style: while isomorphic to the usual account, it makes the
connection to the Visitor pattern more explicit.
3.1

Internal visitors

We consider algebraic types of the form
$$\mu\alpha.\sum_{i\in 1..n} F_i[\alpha]$$
and we further restrict attention to $F_i$ that are products of the recursive type
variable $\alpha$ and type constants. Thus the types defined this way are various
forms of trees, and we will see how visitors are tree walkers.
Definition 3.1 We define internal visitors to be pairs of the form $\langle A, \langle a_i\rangle_{i\in 1..n}\rangle$,
where $A$ is a type (called the result type), and $\langle a_i\rangle_{i\in 1..n}$ is a tuple of functions
$a_i : F_i[A] \to A$ (called the visit methods).
In the presence of sums, we could define $F[X] = \sum_{i\in 1..n} F_i[X]$. Visitors
are essentially $F$-algebras, and their visit methods give the structure map.
Next, we consider the encoding of algebraic types (that is to say initial
$F$-algebras), but rephrased in terms of weakly initial visitors.
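We define an object $T$ as follows:
$$T = \forall\alpha.\Big(\prod_{i\in 1..n}(F_i[\alpha] \to \alpha)\Big) \to \alpha$$
$T$ comes equipped with visit methods:
$$\mathit{cons}_i : F_i[T] \to T$$
$$\mathit{cons}_i = \lambda x{:}F_i[T].\,\Lambda\alpha.\,\lambda v{:}\prod_{j\in 1..n}(F_j[\alpha] \to \alpha).\; v_i\,(F_i[\lambda t{:}T.\,t[\alpha]\,v]\,x)$$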

Intuitively, we think of the $\mathit{cons}_i$ as the constructors of the datatype $T$.
Then $T$ with $\langle \mathit{cons}_i\rangle_{i\in 1..n}$ is weakly initial in the following sense. Let $A$
together with $\langle a_i : F_i[A] \to A\rangle_{i\in 1..n}$ be any visitor. Then there is a function
from $T$ to $A$ defined by:
$$\lambda t{:}T.\,t[A]\,\langle a_i\rangle_{i\in 1..n}$$
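This explains the definition of $T$: any element $t$ of $T$ has the ability to
accept any visitor with result $A$ and visit methods $\mathit{visit}_i$, and to yield an
element of $A$.
$$T = \forall\alpha.\underbrace{\Big(\prod_{i\in 1..n}\underbrace{(F_i[\alpha] \to \alpha)}_{\mathit{visit}_i[\alpha]}\Big)}_{\mathit{Visitor}[\alpha]} \to \alpha$$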


More intuitively, the elements of T can be thought of as trees; and they use
the visit methods to collapse themselves recursively into a single element of
A. This is achieved by letting the visitor visit any subtrees, thereby collapsing
them into elements of the result type, and then calling the appropriate visit
methods for the topmost node.
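$$\mathit{cons}_i = \lambda \underbrace{x{:}F_i[T]}_{\text{constructor arguments}}.\;\overbrace{\Lambda\alpha.\,\lambda v{:}\prod_{j\in 1..n}(F_j[\alpha] \to \alpha)}^{\text{accept visitor}}.\;\underbrace{v_i}_{\text{call visit method}}\,\overbrace{(F_i[\lambda t{:}T.\,t[\alpha]\,v]\,x)}^{\text{walk over subtrees}}$$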
Visitor morphisms that witness the initiality of $T$ can be seen as calling
accept on a $T$ object and passing it a concrete visitor:
$$\widehat{a} = \lambda t{:}T.\;\underbrace{t[A]}_{\text{call accept}}\;\overbrace{\langle a_i\rangle_{i\in 1..n}}^{\text{concrete visitor}}$$
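The map from the weakly initial visitor to an arbitrary visitor can be sketched in Java for binary trees with integer leaves. This is an illustrative sketch under stated assumptions, not the paper's translation: the helper class Fold and its method fold are hypothetical, Java's int and Integer replace Int, and the pair of functions a1, a2 plays the role of the tuple of visit methods for a result type A.

```java
import java.util.function.BinaryOperator;
import java.util.function.Function;

// Internal visitor: visitNode receives the results for the two subtrees.
interface Visitor<A> {
    A visitLeaf(int n);
    A visitNode(A left, A right);
}

// An element of the datatype can accept a visitor with any result type A.
interface BinTree {
    <A> A accept(Visitor<A> v);
}

final class Leaf implements BinTree {
    private final int n;
    Leaf(int n) { this.n = n; }
    public <A> A accept(Visitor<A> v) { return v.visitLeaf(n); }
}

final class Node implements BinTree {
    private final BinTree left, right;
    Node(BinTree left, BinTree right) { this.left = left; this.right = right; }
    // Walk over the subtrees first, then call the visit method for this node.
    public <A> A accept(Visitor<A> v) {
        return v.visitNode(left.accept(v), right.accept(v));
    }
}

final class Fold {
    // Packages a tuple of functions as a concrete visitor and calls accept.
    static <A> A fold(BinTree t, Function<Integer, A> a1, BinaryOperator<A> a2) {
        return t.accept(new Visitor<A>() {
            public A visitLeaf(int n) { return a1.apply(n); }
            public A visitNode(A left, A right) { return a2.apply(left, right); }
        });
    }
}
```

The same tree can then be collapsed at different result types, for example with Fold.<Integer>fold(t, n -> n, (l, r) -> l + r) for the sum of the leaves or Fold.<Integer>fold(t, n -> 1, (l, r) -> 1 + Math.max(l, r)) for the depth.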
3.2

External visitors

An external visitor consists of a pair $\langle A, \langle v_i\rangle_{i\in 1..n}\rangle$, where $A$ is a type and
$\langle v_i\rangle_{i\in 1..n}$ is a tuple of functions $v_i : F_i[T] \to A$. Note the $T$ in the position
where an internal visitor would have another occurrence of $A$. Intuitively, an
external visitor has visit methods just like an internal one; the difference is
that these methods may accept trees of type $T$ as arguments, rather than
automatically collapsing the trees into elements of result type $A$.
We define a structure S that accepts external visitors in the same way that
T accepts internal ones:
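$$S = \forall\alpha.\Big(\prod_{i\in 1..n}(F_i[T] \to \alpha)\Big) \to \alpha$$
This object has the structure of visit methods:
$$p_i : F_i[S] \to S$$
$$p_i = \lambda x{:}F_i[S].\,\Lambda\alpha.\,\lambda m{:}\prod_{j\in 1..n}(F_j[T] \to \alpha).\; m_i\,(F_i[\lambda s{:}S.\,s[T]\,\langle \mathit{cons}_i\rangle_{i\in 1..n}]\,x)$$
The visitor structure on $S$ induces a function from the weakly initial visitor $T$:
$$\widehat{p} : T \to S$$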
Intuitively, this map takes a tree and pattern matches it, so that it can be
visited by an external visitor.
External visitors can themselves use this map for further pattern matching
of subtrees and thus traversal. However, without adding recursion, an external
visitor cannot traverse trees of arbitrary depth. Since the traversal of the
whole tree is no longer built-in the way it was in internal visitors, an external
visitor would have to recur under its own steam, as it were, which requires a
fixpoint combinator in the visitor.
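In Java the required recursion comes for free, since a visit method may call accept with this. The following sketch of an external visitor for binary trees with integer leaves is illustrative only: the names ExtVisitor, ExtSum and LeftmostLeaf are hypothetical and the code is not derived from the paper's encoding.

```java
// External visitor: visitNode receives the subtrees themselves, not their results.
interface ExtVisitor<A> {
    A visitLeaf(int n);
    A visitNode(BinTree left, BinTree right);
}

interface BinTree {
    <A> A accept(ExtVisitor<A> v);
}

final class Leaf implements BinTree {
    private final int n;
    Leaf(int n) { this.n = n; }
    public <A> A accept(ExtVisitor<A> v) { return v.visitLeaf(n); }
}

final class Node implements BinTree {
    private final BinTree left, right;
    Node(BinTree left, BinTree right) { this.left = left; this.right = right; }
    // No recursive accept calls here: the node only exposes its components.
    public <A> A accept(ExtVisitor<A> v) { return v.visitNode(left, right); }
}

// Traverses the whole tree by calling accept on the subtrees itself.
final class ExtSum implements ExtVisitor<Integer> {
    public Integer visitLeaf(int n) { return n; }
    public Integer visitNode(BinTree left, BinTree right) {
        return left.accept(this) + right.accept(this);
    }
}

// Chooses its own traversal: follows left children only and ignores the rest.
final class LeftmostLeaf implements ExtVisitor<Integer> {
    public Integer visitLeaf(int n) { return n; }
    public Integer visitNode(BinTree left, BinTree right) { return left.accept(this); }
}
```

LeftmostLeaf shows what the external variant buys: the visitor decides which subtrees are traversed at all, something an internal visitor cannot express because the tree has already collapsed both subtrees before the visit method runs.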

4 Factorizing the encoding

The overall view on visitors in this section is as follows. Our aim is to bring
out the connection between the Visitor pattern and type encodings in System
F by factoring various translations. We present an encoding $\mathcal{O}[\![-]\!]$ of algebraic
types into a polymorphic $\lambda$-calculus that is stylistically close to object-oriented
languages. The calculus is a restricted subset of $F_\omega$ (with products), and
the encoding corresponds to the classical encoding $\mathcal{F}[\![-]\!]$ of algebraic types
in System F (up to some reductions and type isomorphisms). On the other
hand, because our calculus resembles object-oriented languages, there is a
straightforward embedding $[\![-]\!]$ into Featherweight Generic Java (which is itself
a subset of Generic Java); and this lets us recover the internal variant of the
Visitor pattern (by composition).
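[Diagram: algebraic types are mapped by $\mathcal{O}[\![-]\!]$ to $F_\omega^{oo}$ and by $\mathcal{F}[\![-]\!]$ to System F; both $F_\omega^{oo}$ and System F are included in $F_\omega$; $[\![-]\!]$ maps $F_\omega^{oo}$ to FGJ, which is included in GJ; the composite from algebraic types to FGJ is labelled "Visitor".]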

We need a calculus for expressing visitors in such a way that we can then
transform them into both System F and FGJ. To do so, we will approximate objects as tuples of methods (of a restricted function type). Another
ingredient is a form of parameterized let on types that we will use for approximating interfaces with generics. We define:
$$\mathsf{let}\ \nu{<}\beta{>}\ \mathsf{be}\ T_1\ \mathsf{in}\ T_2 \;\equiv\; (\lambda\nu{::}{*} \to {*}.\,T_2)\,[\lambda\beta.T_1]$$
Where there are no type parameters we omit $<\beta>$ and define:
$$\mathsf{let}\ \nu\ \mathsf{be}\ T_1\ \mathsf{in}\ T_2 \;\equiv\; (\lambda\nu{::}{*}.\,T_2)\,[T_1]$$
This construct could be extended to include an arbitrary number of parameters
by adding a tuple kind to the definition of $F_\omega$.
We define some conventions: we write $^{+}$ to mean one or more occurrences; $\vec{\alpha}$ for $\alpha_1, \ldots, \alpha_p$; $[\vec{S}]$ for $[S_1] \ldots [S_p]$; and $[\![\vec{S}]\!]$ for $[\![S_1]\!], \ldots, [\![S_p]\!]$. Our
calculus for defining visitor types is then as follows.
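Definition 4.1 Types of $F_\omega^{oo}$ are well-kinded types of $F_\omega$ (with let) given by $J$ in
$$\begin{array}{rcll}
J & ::= & (\mathsf{let}\ \nu{<}\beta{>}\ \mathsf{be}\ O\ \mathsf{in})^{+}\ \nu[\vec{S}] & \text{interface definitions}\\
O & ::= & \prod_{i\in 1..n} M_i & \text{object type}\\
M & ::= & \forall\vec{\alpha}.\big(\prod_{j\in 1..m} S_j\big) \to S & \text{method type}\\
S & ::= & \nu[\vec{S}] & \text{interface instantiation}\\
  & \mid & \mathsf{int} & \text{integer type}
\end{array}$$
where $\alpha, \beta, \nu \in \mathit{TyVar}$ and $\mathsf{int}$ is a constant type. Note that this grammar
restricts all type variables to the kinds $*$ or $* \to *$. We also insist that all type
variables are bound and that all bound variables are distinct.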

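Using Definition 4.1, we can reformulate visitors as let types.
Definition 4.2 If $T$ is a data type with constructors of type $F_i[T] \to T$, its
visitor encoding is as follows:
$$T = \mathsf{let}\ \nu{<}\beta{>}\ \mathsf{be}\ \overbrace{\prod_{i\in 1..n}\underbrace{(F_i[\beta] \to \beta)}_{\mathit{visit}_i[\beta]}}^{\mathit{Visitor}[\beta]}\ \mathsf{in}\ \mathsf{let}\ \delta\ \mathsf{be}\ \overbrace{\forall\alpha.\,\nu[\alpha] \to \alpha}^{\mathit{accept}}\ \mathsf{in}\ \delta$$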

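It is easy to see that this is equivalent to the System F encoding by a series
of reductions in $F_\omega$:
$$\begin{array}{rcl}
T & = & \mathsf{let}\ \nu{<}\beta{>}\ \mathsf{be}\ \prod_{i\in 1..n}(F_i[\beta] \to \beta)\ \mathsf{in}\ \mathsf{let}\ \delta\ \mathsf{be}\ \forall\alpha.\,\nu[\alpha] \to \alpha\ \mathsf{in}\ \delta\\
  & = & \big(\lambda\nu{::}{*} \to {*}.\,(\lambda\delta.\delta)\,[\forall\alpha.\,\nu[\alpha] \to \alpha]\big)\,\big[\lambda\beta.\prod_{i\in 1..n}(F_i[\beta] \to \beta)\big]\\
  & \leadsto & \big(\lambda\nu{::}{*} \to {*}.\,\forall\alpha.\,\nu[\alpha] \to \alpha\big)\,\big[\lambda\beta.\prod_{i\in 1..n}(F_i[\beta] \to \beta)\big]\\
  & \leadsto & \forall\alpha.\,\big(\lambda\beta.\prod_{i\in 1..n}(F_i[\beta] \to \beta)\big)[\alpha] \to \alpha\\
  & \leadsto & \forall\alpha.\,\big(\prod_{i\in 1..n}(F_i[\alpha] \to \alpha)\big) \to \alpha
\end{array}$$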

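But we can also recover visitors in Generic Java. To do so, we define a translation.
Definition 4.3 The translation $[\![-]\!]$ from types of $F_\omega^{oo}$ to a sequence of FGJ
interface definitions is as follows. (To render this more compactly we slightly
abuse EBNF notation. Repeated occurrences on the LHS correspond to those
on the RHS.)
$$\begin{array}{rcl}
[\![(\mathsf{let}\ \nu{<}\beta{>}\ \mathsf{be}\ O\ \mathsf{in})^{+}\ \nu[\vec{S}]]\!] & = & (\texttt{interface}\ \nu\texttt{<}\beta\texttt{>}\ \texttt{\{}[\![O]\!]\texttt{\}})^{+}\\
[\![\prod_{i\in 1..n} M_i]\!] & = & ([\![M_i]\!])_{i\in 1..n}\\
[\![\forall\vec{\alpha}.(\prod_{j\in 1..m} S_j) \to S]\!] & = & \texttt{<}\vec{\alpha}\texttt{>}\ [\![S]\!]\ m\texttt{(}([\![S_j]\!]\ x_j)_{j\in 1..m}\texttt{);}\\
[\![\nu[\vec{S}]]\!] & = & \nu\texttt{<}[\![\vec{S}]\!]\texttt{>}\\
[\![\mathsf{int}]\!] & = & \texttt{Int}
\end{array}$$
where $m$ and $x_j$ are fresh.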


When considering the transform to FGJ it will be necessary to consider the
factors of $F_i[\alpha]$ separately. We will take $F_i[\alpha]$ to be $\prod_{j\in 1..r-1} \mathsf{int} \times \prod_{k\in r..n} \alpha$.

We define the following abbreviations:
Int $\vec{f}$ = Int $f_1$, ..., Int $f_{r-1}$   (similarly for Int $\vec{x}$)
D $\vec{g}$ = D $g_r$, ..., D $g_n$   (similarly for $\alpha\ \vec{g}$ and S $\vec{y}$)
Int $\vec{f}$; = Int $f_1$; ...; Int $f_{r-1}$;
D $\vec{g}$; = D $g_r$; ...; D $g_n$;   (similarly for $\alpha\ \vec{g}$;)
this.$\vec{f}$=$\vec{f}$; = this.$f_1$=$f_1$; ...; this.$f_{r-1}$=$f_{r-1}$;
this.$\vec{g}$=$\vec{g}$; = this.$g_r$=$g_r$; ...; this.$g_n$=$g_n$;
this.$\vec{f}$ = this.$f_1$, ..., this.$f_{r-1}$
this.$\vec{g}$.accept<α>(v) = this.$g_r$.accept<α>(v), ..., this.$g_n$.accept<α>(v)
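Applying $[\![-]\!]$ to the let encoding of visitors results in
$$[\![\mathsf{let}\ \nu{<}\beta{>}\ \mathsf{be}\ \prod_{i\in 1..n}(F_i[\beta] \to \beta)\ \mathsf{in}\ \mathsf{let}\ \delta\ \mathsf{be}\ \forall\alpha.\,\nu[\alpha] \to \alpha\ \mathsf{in}\ \delta]\!] =$$
interface ν<β> {
    (β visit$_i$(Int $\vec{x}$, β $\vec{y}$);)$_{i\in 1..n}$
}
interface δ {
    <α> α accept(ν<α> v);
}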

Proposition 4.4 The translation from $F_\omega^{oo}$ to FGJ is type-preserving.
Proof (Sketch) The proof is by induction on the structure of types, keeping
track of the interface table generated. □
We have shown that a sequence of Java interfaces can be seen as types in
System $F_\omega$. If interfaces are like types then classes implementing interfaces
should correspond to terms. This is indeed the case: for each constructor $\mathit{cons}_i$
there is a class Cons$_i$. Similarly, tuples of visit methods in $F_\omega$ correspond
to classes implementing the visitor interface. See Figure 2 for an overview of
this correspondence.
We would also like to be sure that both the interface and class definitions
are well-typed.
Proposition 4.5 (Internal visitors are well-formed)
Assume a class table CT and an interface table IT with the class and interface definitions shown in Figure 2. If, for any $\Delta$ and for each $i \in 1..n$,
(i) $\Delta \vdash U$ ok
(ii) $\Delta; \vec{x} : \texttt{Int}, \vec{y} : U, \texttt{this} : \texttt{ConcVis} \vdash e_i \in V$ and $\Delta \vdash V <: U$
then the class and interface definitions in Figure 2 are well-formed.
Proof (Sketch) Straightforward type-checking using the type system of FGJ. □

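Internal visitor encoding

$$T = \mathsf{let}\ \nu{<}\beta{>}\ \mathsf{be}\ \prod_{i\in 1..n}(F_i[\beta] \to \beta)\ \mathsf{in}\ \mathsf{let}\ \delta\ \mathsf{be}\ \forall\alpha.\,\nu[\alpha] \to \alpha\ \mathsf{in}\ \delta$$
$$\mathit{cons}_i : F_i[T] \to T$$
$$\mathit{cons}_i = \lambda x{:}F_i[T].\,\Lambda\alpha.\,\lambda v{:}\prod_{j\in 1..n}(F_j[\alpha] \to \alpha).\; v_i\,(F_i[\lambda t{:}T.\,t[\alpha]\,v]\,x)$$
$$s : \prod_{i\in 1..n}(F_i[U] \to U)$$
$$s = \langle \lambda x{:}F_i[U].\,e_i\rangle_{i\in 1..n}$$

Internal visitor in FGJ

interface Visitor<β> {
    (β visit$_i$(Int $\vec{x}$, β $\vec{y}$);)$_{i\in 1..n}$
}
interface D {
    <α> α accept(Visitor<α> v);
}
class Cons$_i$ implements D {
    Int $\vec{f}$; D $\vec{g}$;
    Cons$_i$(Int $\vec{f}$, D $\vec{g}$) {
        this.$\vec{f}$ = $\vec{f}$; this.$\vec{g}$ = $\vec{g}$;
    }
    <α> α accept(Visitor<α> v) {
        return v.visit$_i$(this.$\vec{f}$,
                           this.$\vec{g}$.accept<α>(v));
    }
}
class ConcVis implements Visitor<U> {
    (U visit$_i$(Int $\vec{x}$, U $\vec{y}$) {return $e_i$;})$_{i\in 1..n}$
}

Fig. 2. Correspondence between type encodings and internal visitors in FGJ.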

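As a worked example, we consider the type $B$ of binary trees with integers
at the leaves, as given by the least fixed point of $F[X] = Z + X \times X$. This is
encoded as
$$B = \mathsf{let}\ \nu{<}\beta{>}\ \mathsf{be}\ (Z \to \beta) \times ((\beta \times \beta) \to \beta)\ \mathsf{in}\ \mathsf{let}\ \delta\ \mathsf{be}\ \forall\alpha.\,\nu[\alpha] \to \alpha\ \mathsf{in}\ \delta$$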
which is transformed into
interface Visitor<β> {
    β visitLeaf(Int x);
    β visitNode(β x, β y);
}
interface BinTree {
    <α> α accept(Visitor<α> v);
}
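As a check on the encoding of binary trees, the let type can be unfolded step by step with the reductions of Figure 1. The derivation below is a worked instance of the general reduction sequence, not taken from the paper; the bound names $\nu$, $\beta$ and $\delta$ are chosen here for illustration only.

```latex
\begin{array}{rcl}
B & = & \mathsf{let}\ \nu{<}\beta{>}\ \mathsf{be}\ (Z \to \beta) \times ((\beta \times \beta) \to \beta)\
        \mathsf{in}\ \mathsf{let}\ \delta\ \mathsf{be}\ \forall\alpha.\,\nu[\alpha] \to \alpha\ \mathsf{in}\ \delta \\
  & = & \big(\lambda\nu{::}{*} \to {*}.\,(\lambda\delta.\delta)\,[\forall\alpha.\,\nu[\alpha] \to \alpha]\big)\,
        \big[\lambda\beta.\,(Z \to \beta) \times ((\beta \times \beta) \to \beta)\big] \\
  & \leadsto & \big(\lambda\nu{::}{*} \to {*}.\,\forall\alpha.\,\nu[\alpha] \to \alpha\big)\,
        \big[\lambda\beta.\,(Z \to \beta) \times ((\beta \times \beta) \to \beta)\big] \\
  & \leadsto & \forall\alpha.\,\big(\lambda\beta.\,(Z \to \beta) \times ((\beta \times \beta) \to \beta)\big)[\alpha] \to \alpha \\
  & \leadsto & \forall\alpha.\,\big((Z \to \alpha) \times ((\alpha \times \alpha) \to \alpha)\big) \to \alpha
\end{array}
```

The final line is the familiar System F type of binary trees: a tree is a function that, for any result type $\alpha$, takes a leaf case and a node case and returns an $\alpha$.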
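The datatype constructors are
$$\mathit{leaf} : Z \to B$$
$$\mathit{leaf}(n) = \Lambda\alpha.\,\lambda\langle p, q\rangle{:}(Z \to \alpha) \times ((\alpha \times \alpha) \to \alpha).\; p\,n$$
$$\mathit{node} : (B \times B) \to B$$
$$\mathit{node}(l, r) = \Lambda\alpha.\,\lambda\langle p, q\rangle{:}(Z \to \alpha) \times ((\alpha \times \alpha) \to \alpha).\; q\,\langle l[\alpha]\langle p, q\rangle,\ r[\alpha]\langle p, q\rangle\rangle$$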


The corresponding FGJ classes Leaf and Node are

class Leaf implements BinTree {
    Int n;
    Leaf(Int n) { this.n = n; }
    <α> α accept(Visitor<α> v) {
        return v.visitLeaf(this.n);
    }
}
class Node implements BinTree {
    BinTree l; BinTree r;
    Node(BinTree l, BinTree r) { this.l = l; this.r = r; }
    <α> α accept(Visitor<α> v) {
        return v.visitNode(this.l.accept<α>(v), this.r.accept<α>(v));
    }
}
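The type of a concrete visitor for binary trees is
$$\mathsf{let}\ \nu\ \mathsf{be}\ (Z \to Z) \times ((Z \times Z) \to Z)\ \mathsf{in}\ \nu$$
An operation for summing up the leaves of a tree can be defined as follows:
$$\mathit{sum} : B \to Z$$
$$\mathit{sum} = \lambda t{:}B.\; t[Z]\,\langle \lambda x{:}Z.\,x,\ \lambda\langle x, y\rangle{:}Z \times Z.\,x + y\rangle$$
which is equivalent to a call to accept on a BinTree with an instance of the
following concrete visitor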
class SumVisitor implements Visitor<Int> {
Int visitLeaf(Int x) {
return x;
}
Int visitNode(Int x, Int y) {
return x + y;
}
}
The generated code is valid, well-typed code in FGJ with interfaces. This
means it is also almost correct Generic Java. Indeed, after minor modifications (adding the keyword public to method definitions, discarding type
parameter instantiation on calls to accept and adding a main method), it
compiles correctly using the Sun Java 1.5 compiler.
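One way the modified program might look is sketched below. It applies the three modifications just listed; in addition it assumes Java's int and Integer in place of FGJ's Int, uses ordinary identifiers for the type variables, and introduces a hypothetical class SumExample to hold the main method.

```java
interface Visitor<A> {
    A visitLeaf(int x);
    A visitNode(A x, A y);
}

interface BinTree {
    <A> A accept(Visitor<A> v);
}

class Leaf implements BinTree {
    int n;
    Leaf(int n) { this.n = n; }
    public <A> A accept(Visitor<A> v) {
        return v.visitLeaf(n);
    }
}

class Node implements BinTree {
    BinTree l; BinTree r;
    Node(BinTree l, BinTree r) { this.l = l; this.r = r; }
    public <A> A accept(Visitor<A> v) {
        // The type argument of accept is inferred, so it is not written out.
        return v.visitNode(l.accept(v), r.accept(v));
    }
}

class SumVisitor implements Visitor<Integer> {
    public Integer visitLeaf(int x) { return x; }
    public Integer visitNode(Integer x, Integer y) { return x + y; }
}

class SumExample {
    public static void main(String[] args) {
        BinTree t = new Node(new Leaf(1), new Node(new Leaf(2), new Leaf(3)));
        System.out.println(t.accept(new SumVisitor()));   // prints 6
    }
}
```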

5 From functional to imperative internal visitors

The visitors in Figure 2 were purely functional and relied on generics. A more
typical rendition of an internal visitor using internal state instead is given in
Figure 3 (it is debatable whether visitNode() should be omitted, since there
is no information at internal nodes).

interface Visitor {
    void visitLeaf(int n);
    void visitNode();
}
interface BinTree {
    void accept(Visitor v);
}
class Leaf implements BinTree {
    int n;
    Leaf(int n) { this.n = n; }
    public void accept(Visitor v) {
        v.visitLeaf(n);
    }
}
class Node implements BinTree {
    BinTree left; BinTree right;
    Node(BinTree left, BinTree right) {
        this.left = left;
        this.right = right;
    }
    public void accept(Visitor v) {
        // Traverse both subtrees in order, then notify the visitor of this node.
        left.accept(v);
        right.accept(v);
        v.visitNode();
    }
}
class SumVisitor implements Visitor {
    int s;  // running sum, updated as a side effect of visiting leaves
    SumVisitor(int s) { this.s = s; }
    public void visitLeaf(int n) { s = s + n; }
    public void visitNode() { }  // internal nodes carry no information
}

Fig. 3. An imperative internal visitor in Java (without generics)


In this section, we address the equivalence of functional and imperative
visitors, albeit in a very idealized setting. Consider the example of summing
all the leaves of a binary tree. Functional visitors would do this by performing
additions at all the internal nodes. On the other hand, a more typical imperative use of the Visitor pattern would use a field in the visitor to accumulate
the sum. The field is initialized to 0, and at each leaf, its value is added to the
field; at an internal node, the left and right subtrees are simply traversed one
after the other. After the traversal, the field holds the result. Since such an
imperative visitor uses state rather than a result value, its return type is void.

We will model a visitor that updates a piece of state $S$ by a function $S \to S$,
as one does in the semantics of imperative languages, or in the monadic view of
computation. Given (the functional internal visitor encoding of) a
binary tree $t$, we define its imperative counterpart $\widehat{t}$ as follows:
$$\widehat{t} = \Lambda\alpha.\,\lambda c{:}Z \to \alpha \to \alpha.\; t[\alpha \to \alpha]\,\langle c, \circ_\alpha\rangle$$
where $\circ_\alpha$ abbreviates function composition,
$$\circ_\alpha = \lambda\langle f, g\rangle{:}(\alpha \to \alpha) \times (\alpha \to \alpha).\,\lambda x{:}\alpha.\,g(f\,x)$$
Compared to $t$, $\widehat{t}$ is more specialized: it only allows one to specify an operation
at the leaves, which needs to transform a state of type $\alpha$, while the only
operation at internal nodes is to compose the state transformers of the left
and right subtrees; this composition of state transformations corresponds to
calling two void-returning methods in succession.
To relate the state-transforming and the functional visitors, we want to
show that the state-transforming visitor (with initial state 0) yields the same
result as the functional one:
$$\widehat{t}\,[Z]\,(\lambda n{:}Z.\,\lambda s{:}Z.\,s + n)\;0 = t[Z]\,\langle \mathit{id}_Z, +\rangle$$
We sketch a proof using relational parametricity [13], specifically the reasoning developed by Wadler as "theorems for free" [16]; see the latter paper
for an introduction and the relevant definitions.
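Let $t$ be a binary tree for internal visitors, that is
$$t : \forall\alpha.\,((Z \to \alpha) \times ((\alpha \times \alpha) \to \alpha)) \to \alpha$$
Now $\widehat{t}\,[Z]\,(\lambda n{:}Z.\,\lambda s{:}Z.\,s + n) = t[Z \to Z]\,\langle(\lambda n{:}Z.\,\lambda s{:}Z.\,s + n), \circ_Z\rangle$. Hence we
want to show
$$t[Z]\,\langle \mathit{id}_Z, +\rangle = t[Z \to Z]\,\langle \mathit{add}_Z, \circ_Z\rangle\;0$$
where
$$\mathit{id}_Z = \lambda n{:}Z.\,n \;:\; Z \to Z$$
$$\mathit{add}_Z = \lambda n{:}Z.\,\lambda s{:}Z.\,s + n \;:\; Z \to Z \to Z$$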
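Assuming parametricity [16], we have that for all relations $R$,
$$\langle t, t\rangle \in ((Z \to R) \times ((R \times R) \to R)) \to R$$
We define a relation $R : Z \leftrightarrow (Z \to Z)$ by $\langle n, f\rangle \in R$ iff $f(x) = n + x$ for all $x$.
Then we have:
$$\langle \mathit{id}_Z, \mathit{add}_Z\rangle \in Z \to R$$
$$\langle +, \circ_Z\rangle \in (R \times R) \to R$$
The latter of these holds because if $\langle\langle x, y\rangle, \langle f, g\rangle\rangle \in R \times R$, then $\langle x + y, f \circ g\rangle \in R$.
Since $t$ maps related arguments to related results, we have
$$\langle t[Z]\,\langle \mathit{id}_Z, +\rangle,\ t[Z \to Z]\,\langle \mathit{add}_Z, \circ_Z\rangle\rangle \in R$$
Hence, by the definition of $R$, $t[Z \to Z]\,\langle \mathit{add}_Z, \circ_Z\rangle\;0 = t[Z]\,\langle \mathit{id}_Z, +\rangle$, as required.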
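The two sides of this equation can be run side by side in Java. The sketch below is only a sanity check on sample trees, not a substitute for the parametricity argument; the class Equivalence and its method names are hypothetical, and java.util.function.Function stands in for the state transformers, with f.andThen(g) computing x -> g(f(x)) as in the definition of composition above.

```java
import java.util.function.Function;

interface Visitor<A> {
    A visitLeaf(int n);
    A visitNode(A left, A right);
}

interface BinTree {
    <A> A accept(Visitor<A> v);
}

final class Leaf implements BinTree {
    private final int n;
    Leaf(int n) { this.n = n; }
    public <A> A accept(Visitor<A> v) { return v.visitLeaf(n); }
}

final class Node implements BinTree {
    private final BinTree left, right;
    Node(BinTree left, BinTree right) { this.left = left; this.right = right; }
    public <A> A accept(Visitor<A> v) {
        return v.visitNode(left.accept(v), right.accept(v));
    }
}

final class Equivalence {
    // Functional side: identity at the leaves, addition at the nodes.
    static int functionalSum(BinTree t) {
        return t.accept(new Visitor<Integer>() {
            public Integer visitLeaf(int n) { return n; }
            public Integer visitNode(Integer l, Integer r) { return l + r; }
        });
    }

    // Imperative counterpart: the result type is a state transformer S -> S,
    // leaves apply c, and nodes compose the transformers of their subtrees.
    static <S> Function<S, S> hat(BinTree t, Function<Integer, Function<S, S>> c) {
        return t.accept(new Visitor<Function<S, S>>() {
            public Function<S, S> visitLeaf(int n) { return c.apply(n); }
            public Function<S, S> visitNode(Function<S, S> f, Function<S, S> g) {
                return f.andThen(g);   // x -> g(f(x)): left subtree first
            }
        });
    }

    // State-transforming side, started in the initial state 0.
    static int statefulSum(BinTree t) {
        Function<Integer, Function<Integer, Integer>> add = n -> s -> s + n;
        return Equivalence.<Integer>hat(t, add).apply(0);
    }
}
```

For any tree built from Leaf and Node, functionalSum and statefulSum are expected to agree, which is what the relation $R$ establishes in general.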

Note that the proof made use of 0 being the neutral element of addition,
and of the associativity and commutativity of addition in establishing the relation $R$ between $f \circ g$
and $x + y$: with $f(z) = x + z$ and $g(z) = y + z$,
$$(f \circ g)(z) = g(f(z)) = g(x + z) = y + (x + z) = (x + y) + z$$
The above argument of relating a functional and an imperative visitor
by associativity is applicable to more substantial cases as well. Consider the
standard example of abstract syntax trees, and suppose we need to traverse the
tree to add information into a symbol table. The most evident specification in
terms of a synthesized attribute would be essentially functional, merging the
symbol table of the subtrees at the inner nodes. An imperative visitor could
instead start off with an empty symbol table and add entries by updating
a mutable symbol table during traversal. Showing the equivalence of the
functional and the imperative version should be analogous to the parametricity
argument above, with the empty symbol table being the neutral element, and
merging of symbol tables as the associative operation.

6 Conclusions

We have reconstructed the internal Visitor pattern and a restricted form of
the external variant within an idealized type-theoretic setting. To do so, we
had to make simplifying assumptions, and the fit is not perfect; this may be
inevitable, since Java was not designed on top of a lambda calculus, unlike
functional languages. But if one grants these idealizations, it is possible to
glean essential features of visitors more easily than from lengthy Java code or
class diagrams. We summarize our abstract reconstruction of visitors with the
following mapping from the terminology used in the patterns literature [6].
Patterns literature      Idealized view
Visitor interface        Type (determines a functor $F$)
Concrete visitor         Object of visitor type ($\cong$ $F$-algebra)
Visit method             Component of the structure map of a visitor
Data (or Element)        Weakly initial visitor
Accept method            Witnesses initiality
Concrete data            Given by $\mathit{cons}_i$

There is a large literature on the Visitor pattern, most of it concerned
with overcoming some of its inflexibility. This line of work goes back to Reynolds [12]. In the context of the present paper, the most relevant previous work
is Felleisen and Friedman's textbook [4], in which variations on the Visitor
pattern are developed, implicitly based on program transformations known
from functional programming. Setzer [15] also observes a connection between
visitors and functional programming.


As sketched in Section 5, for functional (particularly internal) visitors, the
polymorphic typing immediately gives one reasoning principles, or theorems
for free. It should be possible to transfer the structure of such arguments to
more realistic imperative visitors, where logical relations on parts of the heap
would be used in place of the return type parametricity of the functional visitors. Instead of the result types of visitors, the relations could be built on an
effect system [10] (adapted from functional to object-oriented languages [1]).
In particular, the effect annotations tell us that the data structure to be traversed unleashes the effects of the visitor, but causes none itself. Apart from
effect systems, Hoare logic may also be applicable to visitors. Specifically, it
would be interesting to see whether abstract predicates [11] for visitors can be
handled analogously to the quantified result type.
The visitor-style encoding in System F does not extend to datatypes that
use subclassing. It may be worthwhile to consider similar encodings in $F_{<:}$
to see whether they fit such Visitor pattern variants more closely.

Acknowledgement
We thank Alan Mycroft and the anonymous referees for their comments.

References
[1] Gavin Bierman, Matthew Parkinson, and Andrew Pitts. MJ: an imperative
core calculus for Java and Java with effects. Technical Report 563, University
of Cambridge Computer Laboratory, 2003.
[2] C. Böhm and A. Berarducci. Automatic synthesis of typed lambda-programs
on term algebras. Theoretical Computer Science, 39(2/3):135–154, 1985.
[3] Gilad Bracha, Martin Odersky, David Stoutamire, and Philip Wadler.
Making the future safe for the past: Adding genericity to the Java
programming language. In Proceedings of the 13th ACM Conference on
Object-Oriented Programming, Systems, Languages, and Applications
(OOPSLA '98), Vancouver, British Columbia, 18–22 October 1998,
pages 183–200. ACM Press, New York, 1998.
[4] Matthias Felleisen and Daniel P. Friedman. A Little Java, A Few Patterns.
MIT Press, Cambridge, Massachusetts, 1998.
[5] Étienne M. Gagnon and Laurie J. Hendren. SableCC, an object-oriented
compiler framework. In Proceedings of the Conference on Technology of
Object-Oriented Languages and Systems, Santa Barbara, California,
3–7 August 1998, pages 140–154. IEEE Computer Society, Washington DC, 1998.
[6] Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides. Design
Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley,
Boston, Massachusetts, 1995.

[7] Jean-Yves Girard. Interprétation fonctionnelle et élimination des coupures
de l'arithmétique d'ordre supérieur. PhD thesis, Université Paris 7, 1972.
[8] Jean-Yves Girard, Paul Taylor, and Yves Lafont. Proofs and Types. Cambridge
University Press, Cambridge, 1989.
[9] Atsushi Igarashi, Benjamin Pierce, and Philip Wadler. Featherweight Java:
A minimal core calculus for Java and GJ. In Proceedings of the 14th
ACM Conference on Object-Oriented Programming, Systems, Languages, and
Applications (OOPSLA '99), Denver, Colorado, 1–5 November 1999,
pages 132–146. ACM Press, New York, 1999.
[10] John M. Lucassen and David K. Gifford. Polymorphic effect systems. In
Proceedings of the 15th ACM Symposium on Principles of Programming
Languages (POPL '88), San Diego, California, 13–15 January 1988,
pages 47–57. ACM Press, New York, 1988.
[11] Matthew Parkinson and Gavin Bierman. Separation logic and abstraction.
In Proceedings of the 32nd ACM Symposium on Principles of Programming
Languages (POPL '05), Long Beach, California, 12–14 January 2005,
pages 247–258. ACM Press, New York, 2005.
[12] John C. Reynolds. User-defined types and procedural data structures as
complementary approaches to data abstraction. In S. A. Schuman, editor, New
Directions in Algorithmic Languages 1975, pages 157–168. IRIA, Rocquencourt,
France, 1976.
[13] John C. Reynolds. Types, abstraction and parametric polymorphism. In
R. E. A. Mason, editor, Information Processing 83, pages 513–523. Elsevier
Science Publishers B. V. (North-Holland), Amsterdam, 1983.
[14] John C. Reynolds and Gordon D. Plotkin. On functors expressible in the
polymorphic typed lambda calculus. Information and Computation, 105:1–29,
1993.
[15] Anton Setzer. Java as a functional programming language. In H. Geuvers and
F. Wiedijk, editors, Types for Proofs and Programs: International Workshop,
TYPES 2002, Berg en Dal, The Netherlands, 24–28 April 2002, number 2646
in Lecture Notes in Computer Science, pages 279–298. Springer, Berlin, 2003.
[16] Philip Wadler. Theorems for free! In Proceedings of the 4th International
Conference on Functional Programming and Computer Architecture (FPCA '89),
London, 11–13 September 1989, pages 347–359. ACM Press, New York, 1989.

Featherweight Generic Java

Featherweight Generic Java [9] is a minimal Java-like calculus. It includes
type-parameterized classes and methods, and its syntax is (almost) a subset
of Java, so all FGJ programs are also valid Java programs. (To convert an
FGJ program into a Generic Java program it is necessary to add the keyword
public before method implementations and to elide type instantiations on
method calls.)
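
As an illustration of the two conversion steps, the following Java sketch shows
a small FGJ-style program in comments and its Java counterpart underneath. The
classes A, B and Pair follow the familiar pair example from the FGJ literature;
the Swappable interface and the demo class are hypothetical additions made only
so that an interface method is present.

```java
// FGJ-style source (kept in comments because javac rejects it as written):
//   interface Swappable<X, Y> { Pair<Y, X> swap(); }
//   class Pair<X, Y> extends Object implements Swappable<X, Y> {
//     X fst; Y snd;
//     Pair(X fst, Y snd) { super(); this.fst = fst; this.snd = snd; }
//     Pair<Y, X> swap() { return new Pair<Y, X>(this.snd, this.fst); }
//     <Z> Pair<Z, Y> setfst(Z newfst) { return new Pair<Z, Y>(newfst, this.snd); }
//   }
//   new Pair<A, B>(new A(), new B()).setfst<B>(new B())

interface Swappable<X, Y> { Pair<Y, X> swap(); }

class A { A() { super(); } }
class B { B() { super(); } }

class Pair<X, Y> extends Object implements Swappable<X, Y> {
    X fst;
    Y snd;
    Pair(X fst, Y snd) { super(); this.fst = fst; this.snd = snd; }

    // Step 1: `public` is added before each method implementation.
    public Pair<Y, X> swap() { return new Pair<Y, X>(this.snd, this.fst); }
    public <Z> Pair<Z, Y> setfst(Z newfst) { return new Pair<Z, Y>(newfst, this.snd); }
}

final class FgjToJavaDemo {
    static Pair<B, B> example() {
        // Step 2: the instantiation <B> on the call to setfst is elided and inferred by Java.
        return new Pair<A, B>(new A(), new B()).setfst(new B());
    }

    public static void main(String[] args) {
        Pair<B, B> p = example();
        System.out.println(p.fst != null && p.snd != null);  // prints true
    }
}
```

Java also accepts an explicit instantiation, but in the form e.<B>setfst(...)
rather than the FGJ form e.setfst<B>(...), which is why the instantiation has
to be elided or moved.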
Our description of FGJ includes interfaces, which were not present in the
original definition. The extension is limited since it only permits single
inheritance in interface hierarchies and a class is permitted to implement at
most one interface. There are also some restrictions on type parameter bounds.
Since we are not interested in casting we will omit that from our account.
The syntax and typing of FGJ with interfaces are shown in Figures A.1
and A.2. The computation rules are shown in Figure A.3. We omit some
auxiliary rules due to lack of space. We abbreviate the keywords extends as
◁, implements as ◀ and return as ↑. The metavariables α, β, γ range over
type variables; T, U, V range over types; N and O range over class types; and Q
ranges over interface types.
Some abbreviations are also necessary for various sequences:

    ~f = f0, ..., fn              (similarly for ~C, ~x, ~e, ~α, ~N, ~Q etc.)
    ~M = M0 ... Mn                (similarly for ~H)
    ~C ~f = C1 f1, ..., Cn fn
    ~C ~f; = C1 f1; ...; Cn fn;
    this.~f=~f; = this.f1=f1; ...; this.fn=fn;

The empty sequence is written as • and concatenation is denoted with a
comma.
Unparameterized classes C<> and methods m<> can be abbreviated to C and
m. As in Java, unbounded parameters are assumed to have a bound of Object.
We also abbreviate C ◁ Object to C. A class does not have to implement an
interface, in which case we can omit ◀ Q from its definition. Unlike the class
hierarchy, the interface hierarchy has no root, so we also allow ◁ Q to be
omitted from interface definitions. We will omit empty calls to super() and
empty constructors.
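
These abbreviations have direct Java analogues. The sketch below, using
hypothetical classes BoxExplicit, Box and Unit, writes the same declaration
once with every bound, superclass and super() call spelled out and once in
abbreviated form; it is meant only to show that the two forms denote the same
class structure.

```java
// Fully explicit form: bounds, superclass and the empty super() call are all written out.
class BoxExplicit<X extends Object> extends Object {
    X item;
    BoxExplicit(X item) { super(); this.item = item; }
    <Y extends Object> BoxExplicit<Y> with(Y other) { return new BoxExplicit<Y>(other); }
}

// Abbreviated form: unbounded parameters default to Object, `extends Object`
// is dropped, and the empty call to super() is omitted.
class Box<X> {
    X item;
    Box(X item) { this.item = item; }
    <Y> Box<Y> with(Y other) { return new Box<Y>(other); }
}

// A class with no fields: the empty constructor is omitted altogether.
class Unit { }

final class AbbreviationDemo {
    public static void main(String[] args) {
        System.out.println(new BoxExplicit<String>("a").with(1).item);  // prints 1
        System.out.println(new Box<String>("a").with(1).item);          // prints 1
        System.out.println(new Unit() != null);                          // prints true
    }
}
```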
The class table CT is a mapping from class names to class declarations.
The extended calculus also has an interface table IT which plays a similar role
with respect to interfaces. The authors of FGJ define some sanity conditions
on class tables and we assume that both CT and IT obey them. We assume
the existence of the special variable this, which may not be used as the name
of a field or method parameter. A program in FGJ with interfaces is a triple
(CT, IT, e) of a class table, an interface table and an expression.

Syntax
CL ::= class C<~α ◁ ~N> ◁ N ◀ Q {~T ~f; K ~M}

e ::= x | e.f | e.m<~T>(~e) | new N(~e)

IN ::= interface I<~α ◁ ~N> ◁ Q {~H}

T ::= | | N | Q

K ::= C(~T ~f) {super(~f); this.~f = ~f;}

N ::= C<~T>

H ::= <~α ◁ ~N> T m (~T ~x);

Q ::= I<~T>

M ::= <~α ◁ ~N> T m (~T ~x) {↑ e;}
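
Subtyping

  (S-Refl)     Δ ⊢ T <: T

  (S-Trans)    Δ ⊢ S <: T    Δ ⊢ T <: U
               --------------------------
               Δ ⊢ S <: U

  (S-Poly)     Δ ⊢ α <: Δ(α)

  (S-Sub)      IT(I) = interface I<~α ◁ ~N> ◁ Q {...}
               ---------------------------------------
               Δ ⊢ I<~T> <: [~α ↦ ~T] Q

  (S-Extend)   CT(C) = class C<~α ◁ ~N> ◁ N ◀ Q {...}
               ---------------------------------------
               Δ ⊢ C<~T> <: [~α ↦ ~T] N

  (S-Impl)     CT(C) = class C<~α ◁ ~N> ◁ N ◀ Q {...}
               ---------------------------------------
               Δ ⊢ C<~T> <: [~α ↦ ~T] Q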

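Well-formed types

  (WF-Obj)        Δ ⊢ Object ok

  (WF-Int)        Δ ⊢ Int ok

  (WF-Empty)      Δ ⊢ • ok

  (WF-Poly)       α ∈ dom(Δ)
                  -----------
                  Δ ⊢ α ok

  (WF-Interface)  IT(I) = interface I<~α ◁ ~N> ◁ Q {...}
                  Δ ⊢ ~T ok    Δ ⊢ ~T <: [~α ↦ ~T] ~N
                  ---------------------------------------
                  Δ ⊢ I<~T> ok

  (WF-Class)      CT(C) = class C<~α ◁ ~N> ◁ N ◀ Q {...}
                  Δ ⊢ ~T ok    Δ ⊢ ~T <: [~α ↦ ~T] ~N
                  ---------------------------------------
                  Δ ⊢ C<~T> ok
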
Expression typing

  (T-Var)     Δ; Γ ⊢ x ∈ Γ(x)

  (T-Field)   Δ; Γ ⊢ e0 ∈ T0    fields(bound_Δ(T0)) = ~T ~f
              ----------------------------------------------
              Δ; Γ ⊢ e0.fi ∈ Ti

  (T-Invk)    Δ; Γ ⊢ e0 ∈ T0
              mtype(m, bound_Δ(T0)) = <~β ◁ ~O>~U → U
              Δ ⊢ ~V ok         Δ ⊢ ~V <: [~β ↦ ~V] ~O
              Δ; Γ ⊢ ~e ∈ ~S    Δ ⊢ ~S <: [~β ↦ ~V] ~U
              ------------------------------------------
              Δ; Γ ⊢ e0.m<~V>(~e) ∈ [~β ↦ ~V] U

  (T-New)     Δ ⊢ N ok          fields(N) = ~T ~f
              Δ; Γ ⊢ ~e ∈ ~S    Δ ⊢ ~S <: ~T
              ------------------------------------
              Δ; Γ ⊢ new N(~e) ∈ N

Fig. A.1. FGJ sans casting extended with interfaces: Main definitions


Method typing

  (T-Method)  Δ = ~α <: ~N, ~β <: ~O
              Δ ⊢ ~T ok    Δ ⊢ T ok    Δ ⊢ ~O ok
              Δ; ~x : ~T, this : C<~α> ⊢ e0 ∈ S    Δ ⊢ S <: T
              CT(C) = class C<~α ◁ ~N> ◁ N ◀ Q {...}
              override(m, N, <~β ◁ ~P>~U → U)
              ------------------------------------------------------
              <~β ◁ ~O> T m (~T ~x) {↑ e0;} OK IN C<~α ◁ ~N>

Method header typing

  (T-Mhead)   Δ = ~α <: ~N, ~β <: ~O
              Δ ⊢ ~T ok    Δ ⊢ T ok    Δ ⊢ ~O ok
              IT(I) = interface I<~α ◁ ~N> ◁ Q {...}
              override(m, Q, <~β ◁ ~P>~U → U)
              ------------------------------------------------
              <~β ◁ ~O> T m (~T ~x); OK IN I<~α ◁ ~N>

Class typing

  (T-Class)   ~α <: ~N ⊢ ~N ok    ~α <: ~N ⊢ N ok
              ~α <: ~N ⊢ ~T ok    ~α <: ~N ⊢ Q ok
              methods(N) = ~M1    fields(N) = ~U ~g
              mheaders(Q) = ~H    implement((~M0, ~M1), ~H)
              ~M0, ~M1 OK IN C<~α ◁ ~N>
              K = C(~U ~g, ~T ~f) {super(~g); this.~f = ~f;}
              ------------------------------------------------
              class C<~α ◁ ~N> ◁ N ◀ Q {~T ~f; K ~M0} OK

Interface typing

  (T-Interface)  ~α <: ~N ⊢ ~N ok    ~α <: ~N ⊢ Q ok
                 ~H OK IN I<~α ◁ ~N>
                 --------------------------------------
                 interface I<~α ◁ ~N> ◁ Q {~H} OK

Fig. A.2. FGJ sans casting extended with interfaces: Main definitions continued

Computation

  (Comp-Field)  fields(N) = ~T ~f
                ------------------------
                (new N(~e)).fi → ei

  (Comp-Invk)   mbody(m<~V>, N) = (~x, e0)
                ----------------------------------------------------------
                (new N(~e)).m<~V>(~d) → [~x ↦ ~d, this ↦ new N(~e)] e0

Fig. A.3. FGJ sans casting extended with interfaces: Computation rules
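
The two computation rules can be read as a one-step reducer. The Java sketch
below is an illustration under simplifying assumptions, not the calculus
itself: type arguments are dropped, the class table is flat (no inherited
fields or methods, so fields and mbody are plain lookups), and only a redex at
the top of the expression is reduced. All class names (Exp, Var, FieldGet,
Invoke, New, Method, ClassDecl, Reducer) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Expressions e ::= x | e.f | e.m(~e) | new C(~e); type arguments are dropped in this sketch.
abstract class Exp {}
final class Var extends Exp {
    final String name;
    Var(String name) { this.name = name; }
}
final class FieldGet extends Exp {
    final Exp target; final String field;
    FieldGet(Exp target, String field) { this.target = target; this.field = field; }
}
final class Invoke extends Exp {
    final Exp target; final String method; final List<Exp> args;
    Invoke(Exp target, String method, List<Exp> args) {
        this.target = target; this.method = method; this.args = args;
    }
}
final class New extends Exp {
    final String cls; final List<Exp> args;
    New(String cls, List<Exp> args) { this.cls = cls; this.args = args; }
}
final class Method {
    final List<String> params; final Exp body;
    Method(List<String> params, Exp body) { this.params = params; this.body = body; }
}
final class ClassDecl {
    final List<String> fields;
    final Map<String, Method> methods = new HashMap<>();
    ClassDecl(List<String> fields) { this.fields = fields; }
}

final class Reducer {
    final Map<String, ClassDecl> classTable = new HashMap<>();

    // Substitution [~x ↦ ~d, this ↦ new N(~e)] e0, applied structurally.
    static Exp subst(Exp e, Map<String, Exp> env) {
        if (e instanceof Var) {
            Exp r = env.get(((Var) e).name);
            return r != null ? r : e;
        }
        if (e instanceof FieldGet) {
            FieldGet f = (FieldGet) e;
            return new FieldGet(subst(f.target, env), f.field);
        }
        if (e instanceof Invoke) {
            Invoke i = (Invoke) e;
            return new Invoke(subst(i.target, env), i.method, substAll(i.args, env));
        }
        New n = (New) e;
        return new New(n.cls, substAll(n.args, env));
    }
    static List<Exp> substAll(List<Exp> es, Map<String, Exp> env) {
        List<Exp> out = new ArrayList<>();
        for (Exp x : es) out.add(subst(x, env));
        return out;
    }

    // One reduction step at the top of the expression.
    Exp step(Exp e) {
        if (e instanceof FieldGet && ((FieldGet) e).target instanceof New) {  // Comp-Field
            FieldGet f = (FieldGet) e;
            New n = (New) f.target;
            return n.args.get(classTable.get(n.cls).fields.indexOf(f.field));
        }
        if (e instanceof Invoke && ((Invoke) e).target instanceof New) {      // Comp-Invk
            Invoke c = (Invoke) e;
            New n = (New) c.target;
            Method m = classTable.get(n.cls).methods.get(c.method);
            Map<String, Exp> env = new HashMap<>();
            for (int i = 0; i < m.params.size(); i++) env.put(m.params.get(i), c.args.get(i));
            env.put("this", n);
            return subst(m.body, env);
        }
        throw new IllegalStateException("no computation rule applies at the top level");
    }
}

final class ReducerDemo {
    static final List<Exp> NONE = Collections.<Exp>emptyList();

    static Reducer table() {
        Reducer r = new Reducer();
        r.classTable.put("A", new ClassDecl(Collections.<String>emptyList()));
        r.classTable.put("B", new ClassDecl(Collections.<String>emptyList()));
        ClassDecl pair = new ClassDecl(Arrays.asList("fst", "snd"));
        // Models: Pair swap() { return new Pair(this.snd, this.fst); }
        pair.methods.put("swap", new Method(Collections.<String>emptyList(),
            new New("Pair", Arrays.<Exp>asList(new FieldGet(new Var("this"), "snd"),
                                               new FieldGet(new Var("this"), "fst")))));
        r.classTable.put("Pair", pair);
        return r;
    }

    public static void main(String[] args) {
        Reducer r = table();
        Exp pair = new New("Pair", Arrays.<Exp>asList(new New("A", NONE), new New("B", NONE)));
        System.out.println(((New) r.step(new FieldGet(pair, "snd"))).cls);      // prints B
        System.out.println(((New) r.step(new Invoke(pair, "swap", NONE))).cls); // prints Pair
    }
}
```

After the Comp-Invk step the constructor arguments are themselves field
projections on the original pair, so a full evaluator would also need the
congruence rules, which this sketch leaves out.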
