Polymorphic Subtypes for Effect Analysis: the Static Semantics (1997)
1 Introduction
Motivation. The last decade has seen a number of papers addressing the diffi-
cult task of developing type systems for languages that admit polymorphism in
the style of the ML let-construct, that admit subtyping, and that admit effects
as may arise from assignment or communication.
This is a problem of practical importance. The programming language Stan-
dard ML has been joined by a number of other high-level languages demonstrat-
ing the power of polymorphism for large scale software development. Already
Standard ML contains imperative effects in the form of ref-types that can be
used for assignment; closely related languages like Concurrent ML or Facile fur-
ther admit primitives for synchronous communication. Finally, the trend towards
integrating aspects of object orientation into these languages necessitates a study
of subtyping.
Apart from the need to type such languages we see a need for type systems
integrating polymorphism, subtyping, and effects in order to be able to continue
the present development of annotated type and effect systems for a number of
static program analyses; example analyses include control flow analysis, binding
time analysis and communication analysis. This will facilitate modular proofs of
correctness while at the same time allowing the inference algorithms to generate
syntax-free constraints that can be solved efficiently.
State of the art. One of the pioneering papers in the area is [10] that developed
the first polymorphic type inference algorithm for the applicative fragment
of ML; a shorter presentation for the typed λ-calculus with let is given in [3].
Since then many papers have studied how to integrate subtyping. A number
of early papers did so by mainly focusing on the typed λ-calculus and only
briefly dealing with let [11, 5]. Later papers have treated polymorphism in full
generality [18, 8]. A key ingredient in these approaches is the simplification of
the enormous set of constraints into something manageable [4, 18].
Already ML necessitates an incorporation of imperative effects due to the
presence of ref-types. A pioneering paper in the area is [21] that develops a
distinction between imperative and applicative type variables: for creation of a
reference cell we demand that its type contain imperative variables only; and
one is not allowed to generalise over imperative variables unless the expression
in question is non-expansive (i.e. does not expand the store) which will be the
case if it is an identifier or a function abstraction.
The problem of typing ML with references (but without subtyping) has led
to a number of attempts to improve upon [21]; this includes the following:
– [23] is similar in spirit to [21] in that one is not allowed to generalise over a
type variable if a reference cell has been created with a type containing this
variable; to trace such variables the type system is augmented with effects.
Effects may be approximated by larger effects, that is the system employs
subeffecting.
– [19] can be considered a refinement of [23] in that effects also record the region
in which a reference cell is created (or a read/write operation performed);
this information enables one to “mask” effects which have taken place in
“inaccessible” regions.
– [9] presents a somewhat alternative view: here the focus is not on detecting
the creation of reference cells but rather on detecting their use; this means that if
an identifier occurs free in a function closure then all variables in its type
have to be “examined”. This method is quite powerful but unfortunately
it fails to be a conservative extension of ML (cf. Sect. 2.6): some purely
applicative programs which are typeable in ML may be untypeable in this
system.
The surveys in [19, section 11] and in [23, section 5] show that many of these
systems are incomparable, in the sense that for any two approaches it will often
be the case that there are programs which are accepted by one of them but
not by the other, and vice versa. Our approach (which will be illustrated by
a fragment of Concurrent ML but is equally applicable to Standard ML with
references) involves subtyping which is strictly more powerful than subeffecting
(as shown in Example 4); apart from this we do not attempt to measure its
strength relative to other approaches.
In the area of static program analysis, annotated type and effect systems have
been used as the basis for control flow analysis [20] and binding time analysis
[14, 7]. These papers typically make use of a polymorphic type system with
subtyping and no effects, or a non-polymorphic type system with effects and
subtyping. A more ambitious analysis is the approach of [15] to let annotated
type and effect systems extract terms of a process algebra from programs with
communication; this involves polymorphism and subeffecting but the algorithmic
issues are non-trivial [12] (presumably because the inference system is expressed
without using constraints); [1] presents an algorithm that is sound as well as
complete, but which generates constraints that are not guaranteed to have best
solutions. Finally we should mention [22] where effects are incorporated into ML
types in order to deal with region inference.
2 Inference System
The fragment of Concurrent ML [17, 16] we have chosen for illustrating our
approach has expressions (e ∈ Exp) and constants (c ∈ Con) given by the
following syntax:
e ::= c | x | fn x ⇒ e | e1 e2 | let x = e1 in e2
| rec f x ⇒ e | if e then e1 else e2
c ::= () | true | false | n | + | * | = | · · ·
| pair | fst | snd | nil | cons | hd | tl | isnil
| send | receive | sync | channel | fork
For expressions this includes constants, identifiers, function abstraction,
application, polymorphic let-expressions, recursive functions, and conditionals;
a program is an expression without any free identifiers.
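To illustrate, the grammar above and the closedness condition on programs can be transcribed as a small sketch; the tuple-tagged AST encoding and all function names are our own, not the paper's:

```python
# Expressions mirroring the grammar: ("con", c) | ("id", x) |
# ("fn", x, e) | ("app", e1, e2) | ("let", x, e1, e2) |
# ("rec", f, x, e) | ("if", e, e1, e2).

def free_ids(e):
    """Set of identifiers occurring free in expression e."""
    tag = e[0]
    if tag == "con":                      # constant c
        return set()
    if tag == "id":                       # identifier x
        return {e[1]}
    if tag == "fn":                       # fn x => e
        _, x, body = e
        return free_ids(body) - {x}
    if tag == "app":                      # e1 e2
        return free_ids(e[1]) | free_ids(e[2])
    if tag == "let":                      # let x = e1 in e2
        _, x, e1, e2 = e
        return free_ids(e1) | (free_ids(e2) - {x})
    if tag == "rec":                      # rec f x => e
        _, f, x, body = e
        return free_ids(body) - {f, x}
    if tag == "if":                       # if e then e1 else e2
        return free_ids(e[1]) | free_ids(e[2]) | free_ids(e[3])
    raise ValueError(tag)

def is_program(e):
    """A program is an expression without any free identifiers."""
    return not free_ids(e)
```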
Constants can be divided into four classes, according to whether they are
sequential or non-sequential and according to whether they are constructors or
base functions.
The sequential constructors include the unique element () of the unit type,
the two booleans, numbers (n ∈ Num), pair for constructing pairs, and nil and
cons for constructing lists.
The sequential base functions include a selection of arithmetic operations,
fst and snd for decomposing a pair, and hd, tl and isnil for decomposing and
inspecting a list.
We shall write (e1 ,e2 ) for pair e1 e2 , write [] for nil and
[e1 · · · en ] for cons (e1 ,cons(· · ·,nil) · · ·). Additionally we shall write e1 ;e2
for snd (e1 ,e2 ); to motivate this notice that since the language is call-by-value,
evaluation of the latter expression will give rise to evaluation of e1 followed by
evaluation of e2 , the value of which will be the final result.
The unique flavour of Concurrent ML is due to the non-sequential constants
which are the primitives for communication; we include five of these but more
(in particular choose and wrap) can be added. The non-sequential constructors
are send and receive: rather than actually enabling a communication they
create delayed communications which are first-class entities that can be passed
around freely. This leads to a very powerful programming discipline (in particular
in the presence of choose and wrap) as is discussed in [17]. The non-sequential
base functions are channel for allocating new communication channels, fork for
spawning new processes, and sync for synchronising delayed communications;
examples of their use are given in the Introduction.
that takes a function f as argument, defines an identity function id, and then
applies id to itself. The identity function contains a conditional whose sole pur-
pose is to force f and a locally defined function to have the same type. The
locally defined function is yet another identity function except that it attempts
to send the argument to id over a newly created channel. (To be able to execute
one would need to fork a process that could read over the same channel.)
This program is of interest because it will be rejected by a system using
subeffecting only, whereas it will be accepted in the systems of [19] and [21]. We
shall see that we will be able to type this program in our system as well! □
To prepare for the type inference system we must clarify the syntax of types,
effects, type schemes, and constraints. The syntax of types (t ∈ Typ) and
behaviours (b ∈ Beh) is given by:

t ::= unit | bool | int | α | t1 →b t2 | t1 × t2 | t list | t chan | t com b
b ::= {t chan} | β | ∅ | b1 ∪ b2
Apart from the presence of behaviour variables (denoted β) a behaviour can thus
be viewed as a set of “atomic” behaviours each of form {t chan}, recording the
allocation of a channel of type t chan. The definition of types and behaviours is
of course mutually recursive.
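As an illustration, this mutually recursive syntax can be transcribed into, say, Python dataclasses; all class names below are our own invention, not the paper's notation:

```python
from dataclasses import dataclass

class Typ: pass      # types t
class Behav: pass    # behaviours b

@dataclass
class TVar(Typ):          # type variable α
    name: str

@dataclass
class Arrow(Typ):         # t1 →b t2 (the latent behaviour sits on the arrow)
    arg: Typ
    beh: Behav
    res: Typ

@dataclass
class Chan(Typ):          # t chan
    elem: Typ

@dataclass
class Com(Typ):           # t com b
    elem: Typ
    beh: Behav

@dataclass
class BVar(Behav):        # behaviour variable β
    name: str

@dataclass
class ChanAlloc(Behav):   # the atomic behaviour {t chan}
    typ: Typ

@dataclass
class Empty(Behav):       # ∅
    pass

@dataclass
class Union(Behav):       # b1 ∪ b2
    left: Behav
    right: Behav
```

Note how `Arrow` and `Com` carry a `Behav` while `ChanAlloc` carries a `Typ`: this is exactly the mutual recursion between types and behaviours.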
A constraint set C is a finite set of type inclusions (t1 ⊆ t2 ) and behaviour
inclusions (b1 ⊆ b2 ). A type scheme (ts ∈ TSch) is given by

ts ::= ∀(α⃗β⃗ : C). t
The typing judgements have the form

C, A ⊢ e : σ & b

where σ is a type or type scheme and b is an effect. This means that e has type
or type scheme σ, and that its execution will result in a behaviour described by
b, assuming that free identifiers have types as specified by A and that all type
and behaviour variables are related as described by C.
The overall structure of the type inference system of Figure 2 is very close to
those of [18, 8] with a few components from [19, 15] thrown in; the novel ideas of
our approach only show up as carefully constructed side conditions for some of
the rules. Concentrating on the “overall picture” we thus have rather straight-
forward axioms for constants and identifiers; here A(x) denotes the rightmost
entry for x in A. The rules for abstraction and application are as usual in effect
systems: the latent behaviour of the body of a function abstraction is placed
on the arrow of the function type, and once the function is applied the latent
behaviour is added to the effect of evaluating the function and its argument
(reflecting that the language is call-by-value). The rule for let is straightforward
given that both the let-bound expression and the body need to be evaluated.
The rule for recursion makes use of function abstraction to concisely represent
the “fixed point requirement” of typing recursive functions; note that we do not
admit polymorphic recursion³. The rule for conditionals is unable to keep track
of which branch is chosen; therefore an upper approximation over the two
branches is taken. We then have separate rules for subtyping, instantiation and
generalisation, and we shall explain their side conditions shortly.
³ Even though this is undecidable in general [6], one might allow polymorphic
recursion in the annotations, as in [7] or [22].
(con)   C, A ⊢ c : TypeOf(c) & ∅

(abs)
  C, A[x : t1] ⊢ e : t2 & b
  ─────────────────────────────────
  C, A ⊢ fn x ⇒ e : (t1 →b t2) & ∅

(rec)
  C, A[f : t] ⊢ fn x ⇒ e : t & b
  ──────────────────────────────
  C, A ⊢ rec f x ⇒ e : t & b

(sub)
  C, A ⊢ e : t & b
  ──────────────────   if C ⊢ t ⊆ t′ and C ⊢ b ⊆ b′
  C, A ⊢ e : t′ & b′

(ins)
  C, A ⊢ e : ∀(α⃗β⃗ : C0). t0 & b
  ──────────────────────────────   if ∀(α⃗β⃗ : C0). t0 is solvable from C by S0
  C, A ⊢ e : S0 t0 & b

(gen)
  C ∪ C0, A ⊢ e : t0 & b
  ──────────────────────────────   if ∀(α⃗β⃗ : C0). t0 is both well-formed,
  C, A ⊢ e : ∀(α⃗β⃗ : C0). t0 & b   solvable from C, and satisfies
                                   {α⃗β⃗} ∩ FV(C, A, b) = ∅
2.2 Subtyping
Rule (sub) generalises the subeffecting rule of [19] by incorporating subtyping
and extends the subtyping rule of [18] to deal with effects. To do this we associate
two kinds of judgements with a constraint set: the relations C ⊢ b1 ⊆ b2 and
C ⊢ t1 ⊆ t2 are defined by the rules and axioms of Figure 3. In all cases we write
≡ for the equivalence induced by the orderings. We shall also write C ⊢ C ′ to
mean that C ⊢ b1 ⊆ b2 for all (b1 ⊆ b2 ) in C ′ and that C ⊢ t1 ⊆ t2 for all (t1 ⊆ t2 )
in C ′ .
The relation C ⊢ t1 ⊆ t2 expresses the usual notion of subtyping; in particular
it is contravariant in the argument position of a function type.
Ordering on behaviours

(axiom)  C ⊢ b1 ⊆ b2   if (b1 ⊆ b2) ∈ C

(refl)   C ⊢ b ⊆ b

(trans)
  C ⊢ b1 ⊆ b2   C ⊢ b2 ⊆ b3
  ─────────────────────────
  C ⊢ b1 ⊆ b3

(chan)
  C ⊢ t ≡ t′
  ────────────────────────
  C ⊢ {t chan} ⊆ {t′ chan}

(∅)      C ⊢ ∅ ⊆ b

(∪)      C ⊢ bi ⊆ (b1 ∪ b2)   for i = 1, 2

(lub)
  C ⊢ b1 ⊆ b   C ⊢ b2 ⊆ b
  ───────────────────────
  C ⊢ (b1 ∪ b2) ⊆ b

Ordering on types

(axiom)  C ⊢ t1 ⊆ t2   if (t1 ⊆ t2) ∈ C

(refl)   C ⊢ t ⊆ t

(trans)
  C ⊢ t1 ⊆ t2   C ⊢ t2 ⊆ t3
  ─────────────────────────
  C ⊢ t1 ⊆ t3

(→)
  C ⊢ t′1 ⊆ t1   C ⊢ t2 ⊆ t′2   C ⊢ b ⊆ b′
  ─────────────────────────────────────────
  C ⊢ (t1 →b t2) ⊆ (t′1 →b′ t′2)

(×)
  C ⊢ t1 ⊆ t′1   C ⊢ t2 ⊆ t′2
  ───────────────────────────
  C ⊢ (t1 × t2) ⊆ (t′1 × t′2)

(list)
  C ⊢ t ⊆ t′
  ────────────────────────
  C ⊢ (t list) ⊆ (t′ list)

(chan)
  C ⊢ t ≡ t′
  ──────────────────────────
  C ⊢ (t chan) ⊆ (t′ chan)

(com)
  C ⊢ t ⊆ t′   C ⊢ b ⊆ b′
  ────────────────────────────
  C ⊢ (t com b) ⊆ (t′ com b′)
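The behaviour half of this ordering can be turned into a decision sketch under simplifying assumptions of our own: types inside {t chan} atoms are compared by syntactic equality (standing in for ≡), and every constraint in C has a variable on its right-hand side (as the paper's well-formedness condition ensures). The encoding and function names are ours:

```python
# Behaviours: ("var", β) | ("chan", t) | ("empty",) | ("union", b1, b2)

def atoms(b):
    """Flatten ∪ and ∅ into the set of atomic behaviours of b."""
    if b[0] == "union":
        return atoms(b[1]) | atoms(b[2])
    if b[0] == "empty":
        return set()
    return {b}                      # ("var", β) or ("chan", t)

def entails(C, b1, b2):
    """Decide C ⊢ b1 ⊆ b2 for constraints C = [(b, β), ...].

    Fixpoint: collect everything known to lie below b2.  Whenever a
    constraint (b ⊆ β) of C has its variable β below b2, transitivity
    lets the atoms of b slip below b2 as well.
    """
    below = atoms(b2)
    changed = True
    while changed:
        changed = False
        for (b, beta) in C:
            if ("var", beta) in below and not atoms(b) <= below:
                below |= atoms(b)
                changed = True
    return atoms(b1) <= below
```

The (refl), (∅), (∪), (lub) and (trans) rules are all absorbed by the set-of-atoms view; only (axiom) steps need the fixpoint.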
2.3 Generalisation
We now explain some of the side conditions for the rules (ins) and (gen). This
involves the notion of substitution: a mapping from type variables to types and
from behaviour variables to behaviours⁴ such that the domain is finite. Here the
domain of a substitution S is Dom(S) = {γ | S γ ≠ γ} and the range is
Ran(S) = ⋃{FV(S γ) | γ ∈ Dom(S)}, where the concept of free variables,
denoted FV(· · ·), is standard. The identity substitution is denoted Id and we
sometimes write Inv(S) = Dom(S) ∪ Ran(S) for the set of variables that are
involved in the substitution S.
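As a sketch, these operations on substitutions can be written down directly; the dict encoding (variables as strings, constructed types as tagged tuples) and all names are our own:

```python
def fv(t):
    """Free variables of a term: a variable (str) or a tagged tuple."""
    if isinstance(t, str):
        return {t}
    return set().union(*(fv(a) for a in t[1:])) if len(t) > 1 else set()

def dom(S):
    # Dom(S) = {γ | S γ ≠ γ}
    return {g for g, t in S.items() if t != g}

def ran(S):
    # Ran(S) = ⋃ {FV(S γ) | γ ∈ Dom(S)}
    return set().union(set(), *(fv(S[g]) for g in dom(S)))

def inv(S):
    # Inv(S) = Dom(S) ∪ Ran(S)
    return dom(S) | ran(S)
```

A binding γ ↦ γ is deliberately not part of the domain, matching the definition above.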
Rule (ins) is much as in [18] and merely says that to take an instance of a
type scheme we must ensure that the constraints are satisfied; this is expressed
using the notion of solvability:
Definition 1. The type scheme ∀(α⃗β⃗ : C0). t0 is solvable from C by the substi-
tution S0 if Dom(S0) ⊆ {α⃗β⃗} and if C ⊢ S0 C0.
A type t is trivially solvable from C, and an environment A is solvable from
C if for all x in Dom(A) it holds that A(x) is solvable from C.
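Definition 1 can be rendered executable under a deliberately crude approximation of entailment: here C ⊢ C′ holds only when every constraint of C′ is reflexive or literally a member of C. Everything below, including the term encoding, is our own sketch, not the paper's algorithm:

```python
def subst(S0, t):
    """Apply S0 (dict var -> term) to a term (str or tagged tuple)."""
    if isinstance(t, str):
        return S0.get(t, t)
    return (t[0],) + tuple(subst(S0, a) for a in t[1:])

def naive_entails(C, constraint):
    # crude stand-in for C ⊢ lhs ⊆ rhs: reflexivity or membership in C
    lhs, rhs = constraint
    return lhs == rhs or constraint in C

def solvable(C, bound_vars, C0, S0):
    # ∀(bound_vars : C0). t0 is solvable from C by S0 iff
    # Dom(S0) ⊆ bound_vars and C ⊢ S0 C0
    return set(S0) <= set(bound_vars) and all(
        naive_entails(C, (subst(S0, l), subst(S0, r))) for (l, r) in C0)
```

A real implementation would use the full entailment relation of Figure 3 in place of `naive_entails`.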
Except for the well-formedness requirement (explained later), rule (gen) seems
close to the corresponding rule in [18]: clearly we cannot generalise over vari-
ables free in the global type assumptions or global constraint set, and as in effect
systems (e.g. [19]) we cannot generalise over variables visible in the effect. Fur-
thermore, as in [18] solvability is imposed to ensure that we do not create type
schemes that have no instances; this condition ensures that the expressions let
x = e1 in e2 and let x = e1 in (x;e2 ) are going to be equivalent in the type
system.
Consider the expression e given by

let ch = channel ()
in · · ·
(sync(send(ch,7)))
(sync(send(ch,true)))
⁴ We use γ to range over α's and β's as appropriate, and use g to range over
t's and b's as appropriate.
and note that it is semantically unsound (at least if “· · ·” forked some process
receiving twice over ch and adding the results). Writing C = {({α chan} ⊆ β),
({int chan} ⊆ β), ({bool chan} ⊆ β)} and C′ = {({α′ chan} ⊆ β)} then gives

C ∪ C′, [ ] ⊢ channel () : α′ chan & β

and, without taking well-formedness into account, rule (gen) would give

C, [ ] ⊢ channel () : (∀(α′ : C′). α′ chan) & β

because α′ ∉ FV(C, β) and ∀(α′ : C′). α′ chan is solvable from C by either of
the substitutions [α′ ↦ α], [α′ ↦ int] and [α′ ↦ bool]. This then would give

C, [ch : ∀(α′ : C′). α′ chan] ⊢ ch : int chan & ∅
C, [ch : ∀(α′ : C′). α′ chan] ⊢ ch : bool chan & ∅
so that
C, [ ] ⊢ e : t & b
for suitable t and b. As it is easy to find S such that ∅ ⊢ S C, we shall see (by
Lemma 10 and Lemma 11) that we even have
∅, [ ] ⊢ e : t′ & b′
for suitable t′ and b′ . This shows that some notion of well-formedness is essential
for semantic soundness; actually the example suggests that if there is a constraint
({α′ chan} ⊆ β) then one should not generalise over α′ if it is impossible to
generalise over β. □
Well-formedness
We can now define the notion of well-formedness for constraints and for type
schemes; for the latter we make use of the arrow relations defined above.
We now list a few basic properties of the inference system that we shall use later.
Fact 8. For all constants c of Figure 1, the type scheme TypeOf(c) is closed,
well-formed and solvable from ∅.
Proof. A straightforward case analysis on the last rule applied; for constants we
make use of Fact 8.
(a) If C ⊢ C′ then S C ⊢ S C′.
(b) If C, A ⊢ e : σ & b then S C, S A ⊢ e : S σ & S b (and has the same shape).
(a) If C ⊢ C0 then C′ ⊢ C0 ;
(b) If C, A ⊢ e : σ & b then C′, A ⊢ e : σ & b (and has the same shape).
Fact 13. Given an inference tree for C, A ⊢ e : σ & b there exists a constraint-
saturated inference tree C, A ⊢c e : σ & b (that has the same shape).
C, A ⊢n e : S t0 & b.
(abs)
  A[x : u1] ⊢ML e : u2
  ──────────────────────────
  A ⊢ML fn x ⇒ e : u1 → u2

(ins)
  A ⊢ML e : ∀α⃗. u
  ─────────────────   if Dom(R) ⊆ {α⃗}
  A ⊢ML e : R u
We are now ready to state that our system conservatively extends ML.
Theorem 21. Let e be a sequential expression:
– if [ ] ⊢ML e : u then ∅, [ ] ⊢ e : ι(u) & ∅;
– if C, [ ] ⊢ e : t & b where C contains no type constraints then [ ] ⊢ML e : ǫ(t).
Proof: See Appendix A.
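Although ǫ is only used abstractly here, its action on arrow types (dropping the behaviour annotation, cf. the equation R ǫ(t1 →b t2) = R ǫ(t1) → R ǫ(t2) used later in the proofs) can be sketched as follows; the tuple encoding of types is our own assumption:

```python
def erase(t):
    """ǫ: map an annotated (sequential) type to an ML type by dropping
    the behaviour on every arrow and recursing homomorphically."""
    if isinstance(t, str):                 # base type or type variable
        return t
    if t[0] == "arrow":                    # ("arrow", t1, b, t2) -> t1 → t2
        _, t1, _b, t2 = t
        return ("arrow", erase(t1), erase(t2))
    # pairs, lists, ...: erase componentwise
    return (t[0],) + tuple(erase(a) for a in t[1:])
```

On sequential types no chan/com constructors occur, so dropping arrow annotations is the only non-trivial case.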
3 Conclusion
References
Proof. Induction in the proof tree, performing case analysis on the last rule
applied:
(axiom): then (α ⊆ α′ ) ∈ C so the claim is trivial.
(refl): the claim is trivial.
(trans): assume that C ⊢ α ⊆ α′ because C ⊢ α ⊆ t′′ and C ⊢ t′′ ⊆ α′. By using
Lemma 4 on the inference C ⊢ α ⊆ t′′ we infer that t′′ is a variable α′′. By applying
the induction hypothesis we infer that α ∈ {α′′}^{C↓} and that α′′ ∈ {α′}^{C↓},
from which we conclude that α ∈ {α′}^{C↓}. □
Proof. Induction in the size of the inference tree, where we define the size of
the inference tree for C ⊢ t ≡ t′ as the sum of the size of the inference tree for
C ⊢ t ⊆ t′ and the size of the inference tree for C ⊢ t′ ⊆ t.
First we consider the part concerning behaviours, performing case analysis
on the last inference rule applied:
(axiom): then (b ⊆ b′ ) ∈ C so since C is well-formed b′ is a variable; hence
the claim.
(refl): the claim is trivial.
⁶ This case is the reason for not defining the size of a tree as the number of
inferences.
(trans): assume that C ⊢ b ⊆ b′ because C ⊢ b ⊆ b′′ and C ⊢ b′′ ⊆ b′. The in-
duction hypothesis tells us that FV(b)^{C↓} ⊆ FV(b′′)^{C↓} and that
FV(b′′)^{C↓} ⊆ FV(b′)^{C↓}; hence the claim.
(chan): assume that C ⊢ {t chan} ⊆ {t′ chan} because C ⊢ t ≡ t′. The in-
duction hypothesis tells us that FV(t)^{C↓} = FV(t′)^{C↓}; hence the claim.
(∅): the claim is trivial.
(∪): the claim is trivial.
(lub): assume that C ⊢ b1 ∪ b2 ⊆ b′ because C ⊢ b1 ⊆ b′ and C ⊢ b2 ⊆ b′. The
induction hypothesis tells us that FV(b1)^{C↓} ⊆ FV(b′)^{C↓} and that
FV(b2)^{C↓} ⊆ FV(b′)^{C↓}, from which we infer that FV(b1 ∪ b2)^{C↓} =
FV(b1)^{C↓} ∪ FV(b2)^{C↓} ⊆ FV(b′)^{C↓}.
Next we consider the part concerning types, where we perform case analysis
on the form of t′ :
t′ = t′1 →b′ t′2 : Let n1 be the size of the inference tree for C ⊢ t ⊆ t′ and let
n2 be the size of the inference tree for C ⊢ t′ ⊆ t. Lemma 4 (applied to the former
inference) tells us that there exist t1 , b and t2 such that t = t1 →b t2 and such
that C ⊢ t′1 ⊆ t1 , C ⊢ b ⊆ b′ and C ⊢ t2 ⊆ t′2 , where each inference tree is of size
< n1 (due to the remark prior to the proof). Lemma 4 (applied to the latter
inference, i.e. C ⊢ t′ ⊆ t) tells us that C ⊢ t1 ⊆ t′1 , C ⊢ b′ ⊆ b and C ⊢ t′2 ⊆ t2 , where
each inference tree is of size < n2 .
Thus C ⊢ t1 ≡ t′1 and C ⊢ t2 ≡ t′2 , where each inference tree has size < n1 + n2 .
We can thus apply the induction hypothesis to infer that FV(t1)^{C↓} = FV(t′1)^{C↓}
and that FV(t2)^{C↓} = FV(t′2)^{C↓}; and similarly we can infer that
FV(b)^{C↓} ⊆ FV(b′)^{C↓} and that FV(b′)^{C↓} ⊆ FV(b)^{C↓}. This enables us
to conclude that FV(t)^{C↓} = FV(t′)^{C↓}.
t′ has a topmost type constructor other than →: we can proceed as above.
t′ is a variable: Since C ⊢ t′ ⊆ t we can use Lemma 4 to infer that t is a vari-
able; then we use Lemma 22 to infer that FV(t′) ⊆ FV(t)^{C↓}. Similarly we can
infer FV(t) ⊆ FV(t′)^{C↓}. This implies the desired relation FV(t)^{C↓} = FV(t′)^{C↓}.
□
S C, S A ⊢ e : ∀(α⃗β⃗ : S C0). S t0 & S b.   (1)

∀(α⃗β⃗ : C0). t0 is well-formed   (2)
there exists S0 with Dom(S0) ⊆ {α⃗β⃗} such that C ⊢ S0 C0   (3)
{α⃗β⃗} ∩ FV(C, A, b) = ∅   (4)

Define R = [α⃗β⃗ ↦ α⃗′β⃗′] with {α⃗′β⃗′} fresh. We then apply the induction hypoth-
esis (with S R) and due to (4) this gives us S C ∪ S R C0, S A ⊢ e : S R t0 & S b.
Below we prove
(a) If C ⊢ C0 then C′ ⊢ C0.
(b) If C, A ⊢ e : σ & b then C′, A ⊢ e : σ & b (and has the same shape).
Proof normalisation
Lemma 16 If A is well-formed and solvable from C then an inference tree
C, A ⊢ e : σ & b can be transformed into one C, A ⊢n e : σ & b that is normalised.
Proof. Using Fact 13, we can, without loss of generality, assume that we have a
constraint-saturated inference tree for C, A ⊢ e : σ & b. We proceed by induction
on the inference.
The case (con). We assume C, A ⊢c c : TypeOf(c) & ∅. If TypeOf(c) is a
type then we already have a T-normalised inference. So assume TypeOf(c) is
a type scheme ∀(α⃗β⃗ : C0). t0 and let R be a renaming of α⃗β⃗ to fresh variables
α⃗′β⃗′. We can then construct the following TS-normalised inference tree:

(con)  C ∪ R C0, A ⊢ c : ∀(α⃗β⃗ : C0). t0 & ∅
(ins)  C ∪ R C0, A ⊢ c : R t0 & ∅
(gen)  C, A ⊢ c : ∀(α⃗′β⃗′ : R C0). R t0 & ∅

The rule (ins) is applicable since Dom(R) ⊆ {α⃗β⃗} and C ∪ R C0 ⊢ R C0. The rule
(gen) is applicable because ∀(α⃗β⃗ : C0). t0 = ∀(α⃗′β⃗′ : R C0). R t0 (up to alpha-
renaming) is well-formed and solvable from C (Fact 8), and furthermore {α⃗′β⃗′} ∩
FV(C, A, ∅) = ∅ holds by choice of α⃗′β⃗′.
The case (id). We assume C, A ⊢c x : A(x) & ∅. If A(x) is a type then we
already have a T-normalised inference. So assume A(x) = ∀(α⃗β⃗ : C0). t0 and
let R be a renaming of α⃗β⃗ to fresh variables α⃗′β⃗′. We can then construct the
following TS-normalised inference tree:

(id)   C ∪ R C0, A ⊢ x : ∀(α⃗β⃗ : C0). t0 & ∅
(ins)  C ∪ R C0, A ⊢ x : R t0 & ∅
(gen)  C, A ⊢ x : ∀(α⃗′β⃗′ : R C0). R t0 & ∅
Conservative extension
Fact 23. For all S and R and all t and u, we have ǫ(S t) = ǫ(S) ǫ(t) and ι(R u) =
ι(R) ι(u).
Proof of the first part of Theorem 21 The first part of the theorem follows
from the following proposition which admits a proof by induction:
Proof. Induction in the proof tree for A ⊢ML e : us, where we perform case
analysis on the definition in Fig. 4 (the clauses for conditionals and for recursion
are omitted, as they present no further complications).
∅, ι(A) ⊢ e : ∀(α⃗ : ∅). ι(u) & ∅
Proof. From Fact 25 we know that Ct has a most general unifier R1 , and hence
there exists R2 such that R0 = R2 R1 . Let G1 = Dom(R1 )\Dom(R0); for α ∈ G1
we have R2 R1 α = R0 α = α and hence R1 maps the variables in G1 into distinct
variables G2 (which by R2 are mapped back again). Since R1 is idempotent we
have G2 ∩ Dom(R1 ) = ∅, so R0 equals R2 on G2 showing that G2 ⊆ Dom(R0).
Moreover, G1 ∩ G2 = ∅.
Let φ map α ∈ G1 into R1 α and map α ∈ G2 into R2 α and behave as
the identity otherwise. Then φ is its own inverse so that φ φ = Id. Now define
R = φ R1 ; clearly R unifies Ct and if R′ also unifies Ct then (since R1 is a most
general unifier) there exists R′′ such that R′ = R′′ R1 = R′′ φ φ R1 = (R′′ φ) R.
We are left with showing (i) that R is idempotent and (ii) that Dom(R) ⊆
G. For (i), first observe that R1 φ equals Id except on Dom(R1). Since R1 is
idempotent we have FV(R1 α) ∩ Dom(R1) = ∅ (for all α) and hence
R R = φ R1 φ R1 = φ Id R1 = R
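The construction above composes substitutions and checks idempotence; a small sketch may make the φ trick concrete. Substitutions are Python dicts over variable names, and the particular R1 and φ are hypothetical illustrations of ours, not taken from the proof:

```python
def subst(S, t):
    """Apply substitution S to a term (a variable str or tagged tuple)."""
    if isinstance(t, str):
        return S.get(t, t)
    return (t[0],) + tuple(subst(S, a) for a in t[1:])

def compose(S2, S1):
    """The substitution S with S t = S2 (S1 t), stored without identity pairs."""
    R = {v: subst(S2, t) for v, t in S1.items()}
    for v, t in S2.items():
        R.setdefault(v, t)
    return {v: t for v, t in R.items() if t != v}

# Hypothetical instance of the construction: R1 renames a to b (playing
# the most general unifier), phi is the involution swapping G1 = {a}
# and G2 = {b}; then R = phi ∘ R1 is an idempotent unifier.
R1 = {"a": "b"}
phi = {"a": "b", "b": "a"}
R = compose(phi, R1)
```

The two assertions worth checking are exactly the two equations of the proof: φ φ = Id and R R = R.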
Proof. Induction in the proof tree. If (t1 ⊆ t2 ) ∈ C, the claim follows from the
assumptions. The cases for reflexivity and transitivity are straightforward. For
the structural rules, assume e.g. that C ⊢ t1 →b t2 ⊆ t′1 →b′ t′2 because (among
other things) C ⊢ t′1 ⊆ t1 and C ⊢ t2 ⊆ t′2. By using the induction hypothesis we
get the desired equality

R ǫ(t1 →b t2) = R ǫ(t1) → R ǫ(t2) = R ǫ(t′1) → R ǫ(t′2) = R ǫ(t′1 →b′ t′2).
Relating type schemes. For a type scheme ts = ∀(α⃗β⃗ : C). t we shall not
in general (when C ≠ ∅) define any entity ǫ(ts); this is because one natural
attempt, namely ∀(α⃗ : ǫ(C)). ǫ(t), is not an ML type scheme, and another natural
attempt, ∀α⃗. ǫ(t), causes loss of the information in ǫ(C). Rather we shall define
some relations between ML types, types, ML type schemes and type schemes:

Definition 28. We write u ≺_ǫ^R ts, where ts = ∀(α⃗β⃗ : C0). t0 and where R is an
ML substitution, iff there exists R0 which equals R on all variables except α⃗,
such that R0 satisfies ǫ(C0) and such that u = R0 ǫ(t0).

Notice that instead of demanding R0 to equal R on all variables but α⃗, it is
sufficient to demand that R0 equals R on FV(ts). (We have the expected property
that if u ≺_ǫ^R ts and ts′ is alpha-equivalent to ts then also u ≺_ǫ^R ts′.)

Proof. We have us = ∀α⃗. ǫ(t), so for any u it holds that u ≺ us ⇔ ∃R with
Dom(R) ⊆ {α⃗} such that u = R ǫ(t) ⇔ u ≺_ǫ^Id ts. □
Fact 34. Let R and S be such that ǫ(S) = R (this will for instance be the case if
S = ι(R), cf. Fact 19). Then the relation u ≺_ǫ^R ts holds iff the relation
u ≺_ǫ^Id S ts holds.
Consequently, us ≅_ǫ^R ts holds iff us ≅_ǫ^Id S ts holds.
Proof. Let ts = ∀(α⃗β⃗ : C). t. Due to the remark after Definition 28 we can
assume that α⃗β⃗ is disjoint from Dom(S) ∪ Ran(S), so S ts = ∀(α⃗β⃗ : S C). S t.
First we prove “if”. For this suppose that R′ equals Id except on α⃗ and that
R′ satisfies ǫ(S C) and that u = R′ ǫ(S t), which by straightforward extensions
of Fact 23 amounts to saying that R′ satisfies R ǫ(C) and that u = R′ R ǫ(t).
Since {α⃗} ∩ Ran(R) = ∅ we conclude that R′ R equals R except on α⃗, so we can
use R′ R to show that u ≺_ǫ^R ts.
Next we prove “only if”. For this suppose that R′ equals R except on α⃗
and that R′ satisfies ǫ(C) and that u = R′ ǫ(t). Let R′′ behave as R′ on α⃗ and
behave as the identity otherwise. Our task is to show that R′′ satisfies ǫ(S C) and
that u = R′′ ǫ(S t), which as we saw above amounts to showing that R′′ satisfies
R ǫ(C) and that u = R′′ R ǫ(t). This will follow if we can show that R′ = R′′ R.
But if α ∈ α⃗ we have R′′ R α = R′′ α = R′ α since Dom(R) ∩ {α⃗} = ∅, and if
α ∉ α⃗ we have R′′ R α = R α = R′ α where the first equality sign follows from
Ran(R) ∩ {α⃗} = ∅ and Dom(R′′) ⊆ α⃗. □
Fact 35. If us ≅_ǫ^Id ts then FV(us) ⊆ FV(ts).
Proof of the second part of Theorem 21 The second part of the theorem
follows (by letting R = Id and A = A′ = [ ]) from the following proposition,
which admits a proof by induction.
The case (abs): Suppose R satisfies ǫ(C) and that A′ ≅_ǫ^R A. Then also
A′[x : R ǫ(t1)] ≅_ǫ^R A[x : t1], so the induction hypothesis can be applied to find u2
such that u2 = R ǫ(t2) and such that A′[x : R ǫ(t1)] ⊢ML e : u2. By using (abs)
we get the judgement

A′ ⊢ML fn x ⇒ e : R ǫ(t1) → u2

The case (let): Suppose R satisfies ǫ(C1 ∪ C2) and that A′ ≅_ǫ^R A. Since R satis-
fies ǫ(C1) we can apply the induction hypothesis to find us1 such that us1 ≅_ǫ^R ts1
and such that A′ ⊢ML e1 : us1.
Since R satisfies ǫ(C2) and since A′[x : us1] ≅_ǫ^R A[x : ts1] we can apply the
induction hypothesis to infer that A′[x : us1] ⊢ML e2 : R ǫ(t2). Now use (let) to
arrive at the desired judgement A′ ⊢ML let x = e1 in e2 : R ǫ(t2).
u ≺_ǫ^R ∀(α⃗β⃗ : C0). t0
⇔ u ≺_ǫ^Id ∀(α⃗β⃗ : S C0). S t0
⇔ ∃R1 with Dom(R1) ⊆ {α⃗} such that R1 satisfies R ǫ(C0) and u = R1 R ǫ(t0)
⇔ ∃R1 with Dom(R1) ⊆ {α⃗} such that ∃R2 : R1 = R2 R0 and u = R1 R ǫ(t0)
⇔ ∃R2 with Dom(R2) ⊆ {α⃗} such that u = R2 R0 R ǫ(t0)
⇔ u ≺ ∀α⃗. R′ ǫ(t0).

The first ⇔ follows from Fact 34, where we have exploited that {α⃗β⃗} is disjoint
from Dom(S) ∪ Ran(S); the second ⇔ follows from the definition of ≺_ǫ^Id together
with Fact 23; the third ⇔ is a consequence of R0 being the most general unifier
of R ǫ(C0); and the fourth ⇔ is a consequence of Dom(R0) ⊆ {α⃗}, since then
from R1 = R2 R0 we conclude that if α′ ∉ {α⃗} then R1 α′ = R2 α′ and hence
Dom(R1) ⊆ {α⃗} iff Dom(R2) ⊆ {α⃗}.