Foundational Proof-Carrying Code
Andrew W. Appel
Princeton University
Abstract
Proof-carrying code is a framework for the mechanical verification of safety properties of machine language programs, but the problem arises of quis custodiet ipsos custodes: who will verify the verifier itself? Foundational proof-carrying code is verification from the smallest possible set of axioms, using the simplest possible verifier and the smallest possible runtime system. I will describe many of the mathematical and engineering problems to be solved in the construction of a foundational proof-carrying code system.
1 Introduction
When you obtain a piece of software (a shrinkwrapped application, a browser plugin, an applet, an OS kernel extension) you might like to ascertain that it's safe to execute: it accesses only its own memory and respects the private variables of the API to which it's linked. In a Java system, for example, the byte-code verifier can make such a guarantee, but only if there's no bug in the verifier itself, or in the just-in-time compiler, or the garbage collector, or other parts of the Java virtual machine (JVM).
If a compiler can produce Typed Assembly Language (TAL) [14], then just by type-checking the low-level representation of the program we can guarantee safety, but only if there's no bug in the typing rules, or in the type-checker, or in the assembler that translates TAL to machine language. Fortunately, these components are significantly smaller and simpler than a Java JIT and JVM.
Proof-carrying code (PCC) [15] constructs and verifies a mathematical proof about the machine-language program itself, and this guarantees safety, but only if there's no bug in the verification-condition generator, or in the logical axioms, or the typing rules, or the proof checker.
This research was supported in part by DARPA award F30602-99-1-0519 and by National Science Foundation grant CCR-9974553.
To appear in LICS '01, 16th Annual IEEE Symposium on Logic in Computer Science, June 16, 2001.
The verification-condition generator (VCgen) examines the machine instructions of the program, expands the substitutions of its machine-code Hoare logic, examines the formal parameter declarations to derive function preconditions, and examines result declarations to derive postconditions. A bug in the VCgen will lead to the wrong formula being proved and checked.
The soundness of a PCC system's typing rules and VCgen can, in principle, be proved as a metatheorem. Human-checked proofs of type systems are almost tractable; the appendices of Necula's thesis [16] and Morrisett et al.'s paper [14] contain such proofs, if not of the actual type systems used in PCC systems, then of their simplified abstractions. But constructing a mechanically checkable correctness proof of a full VCgen would be a daunting task.
tp : type.
tm : tp -> type.

o     : tp.
num   : tp.
arrow : tp -> tp -> tp.  %infix right 14 arrow.
pair  : tp -> tp -> tp.

pf : tm o -> type.
The trick of using lam and @ to coerce between metalogical functions tm T1 -> tm T2 and object-logic
functions tm (T1 arrow T2) is described by Harper,
Honsell, and Plotkin [10]. We need object-logic functions
so that we can quantify over them using forall; that is,
the type of F in forall [F] predicate(F) must
be tm T for some T such as num arrow num, but cannot be tm T1 -> tm T2.
We have introduction and elimination rules for these
constructors (rules for pairing omitted here):
beta_e: {P: tm T -> tm o}
        pf (P (lam F @ X)) -> pf (P (F X)).
beta_i: {P: tm T -> tm o}
        pf (P (F X)) -> pf (P (lam F @ X)).
imp_i:  (pf A -> pf B) -> pf (A imp B).
imp_e:  pf (A imp B) -> pf A -> pf B.
forall_i: ({X: tm T} pf (A X)) -> pf (forall A).
forall_e: pf (forall A) -> {X: tm T} pf (A X).
not_not_e: pf ((B imp forall [A] A) imp forall [A] A) -> pf B.
We start by modeling a specific von Neumann machine, such as the Sparc or the Pentium. A machine state comprises a register bank and a memory, each of which is a function from integers (addresses) to integers (contents). Every register of the instruction-set architecture (ISA) must be assigned a number in the register bank: the general registers, the floating-point registers, the condition codes, and the program counter. Where the ISA does not specify a number (such as for the PC) we use an arbitrary index:

[Figure: the register bank r maps indices 0-31 to r0-r31, indices 32-63 to fp0-fp31, index 64 to cc, and index 65 to PC (remaining indices unused); the memory m maps addresses 0, 1, 2, ... to their contents.]

A single step of the machine is the execution of one instruction. We can specify instruction execution by giving a step relation (r, m) ↦ (r′, m′) that describes the relation between the prior state (r, m) and the state (r′, m′) of the machine after execution.

For example, to describe the instruction r1 ← r2 + r3 we might start by writing,

    (r, m) ↦ (r′, m′)  whenever
    r′(1) = r(2) + r(3)  ∧  (∀x ≠ 1. r′(x) = r(x))  ∧  m′ = m

Instruction words are decoded by a relation decode(w, instr) relating a word w to the instruction it encodes, following the bit-field layout of the ISA (fields beginning at bits 26, 21, and 16):

    decode(w, instr) ≡
        (∃i, j, k.
            0 ≤ i < 2^5  ∧  0 ≤ j < 2^5  ∧  0 ≤ k < 2^5
          ∧ w = 3·2^26 + i·2^21 + j·2^16 + k·2^0
          ∧ instr = add(i, j, k))
      ∨ (∃i, j, c.
            0 ≤ i < 2^5  ∧  0 ≤ j < 2^5  ∧  0 ≤ c < 2^16
          ∧ w = 12·2^26 + i·2^21 + j·2^16 + c·2^0
          ∧ instr = load(i, j, sign-extend(c)))
      ∨ ...
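As an informal aside (not part of the paper's formal development), the two clauses of the decode relation above can be rendered as an executable sketch. The encoding is the paper's illustrative one (opcode 3 for add, 12 for load), not a real Sparc or Pentium encoding, and the function names below are invented:

```python
# Toy decoder for the illustrative instruction encoding above.
# Fields: opcode in bits 26+, i in bits 21-25, j in bits 16-20,
# then either a register k (bits 0-4) or a 16-bit constant c.

def sign_extend(c, bits=16):
    """Interpret a `bits`-wide field as a signed two's-complement value."""
    return c - (1 << bits) if c >= (1 << (bits - 1)) else c

def decode(w):
    """Return the instruction denoted by word w, or None if w is illegal."""
    op = w >> 26
    i = (w >> 21) & 0x1F
    j = (w >> 16) & 0x1F
    if op == 3:
        k = w & 0x1F
        # The relational spec requires the unused bits 5-15 to be zero;
        # reconstructing w checks exactly that.
        if w == 3 * 2**26 + i * 2**21 + j * 2**16 + k:
            return ('add', i, j, k)
    elif op == 12:
        c = w & 0xFFFF
        return ('load', i, j, sign_extend(c))
    return None   # no instruction is related to this word

w = 3 * 2**26 + 1 * 2**21 + 2 * 2**16 + 3
print(decode(w))   # ('add', 1, 2, 3)
```

Because decode is a relation, words that match no clause simply decode to nothing; the sketch models that by returning None.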
4 Specifying safety

Our step relation (r, m) ↦ (r′, m′) is deliberately partial; some states have no successor state. In these states the machine is stuck; we arrange the step relation so that exactly the unsafe operations are stuck. A simple safety policy is that loads must be from readable addresses (see Appel and Felten [2] for descriptions of security policies that are more interesting than this one).

We can add a new conjunct to the semantics of the load instruction,

    load(i, j, c) = λr, m, r′, m′.
        r′(i) = m(r(j) + c)
      ∧ readable(r(j) + c)
      ∧ (∀x ≠ i. r′(x) = r(x))  ∧  m′ = m.

Now, in a machine state where the program counter points to a load instruction that violates the safety policy, our step relation ↦ does not relate this state to any successor state (even though the real machine knows how to execute it).

Using this partial step relation, we can define safety; a given state is safe if, for any state reachable in the Kleene closure of the step relation, there is a successor state:

    safe-state(r, m) ≡
      ∀r′, m′. (r, m ↦* r′, m′) ⇒ ∃r″, m″. (r′, m′ ↦ r″, m″)

A program is just a sequence of integers (representing machine instructions); we say that a program p is loaded at a location start in memory m if

    loaded(p, m, start) ≡ ∀i ∈ dom(p). m(i + start) = p(i)

Finally (assuming that programs are written in position-independent code), a program is safe if, no matter where we load it in memory, we get a safe state:

    safe(p) ≡
      ∀r, m, start. loaded(p, m, start) ∧ r(PC) = start ⇒ safe-state(r, m)

The important thing to notice about this formulation is that there is no verification-condition generator. The syntax and semantics of machine instructions, implicit in a VCgen, have been made explicit and much more concise in the step relation. But the Hoare logic of machine instructions and typing rules for function parameters, also implicit in a VCgen, must now be proved as lemmas, about which more later.
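To make the role of partiality concrete, here is an executable toy (not the paper's formalism) of the step relation: the readable conjunct simply removes successor states, so an illegal load leaves the machine stuck. The readable() policy and all names are stand-ins invented for illustration:

```python
# Toy partial step relation: returns the successor state, or None when the
# state is stuck (the relational spec relates it to no successor).

def readable(addr):
    return 0 <= addr < 100   # hypothetical safety policy

def step(r, m, instr):
    """One machine step on register bank r and memory m; None if stuck."""
    kind = instr[0]
    if kind == 'add':
        _, i, j, k = instr
        r2 = dict(r); r2[i] = r[j] + r[k]
        return r2, m
    if kind == 'load':
        _, i, j, c = instr
        addr = r[j] + c
        if not readable(addr):   # the added conjunct: no successor state
            return None
        r2 = dict(r); r2[i] = m.get(addr, 0)
        return r2, m
    return None                  # undecodable instruction: also stuck

r = {1: 0, 2: 5, 3: 7}
print(step(r, {}, ('add', 1, 2, 3)))     # ({1: 12, 2: 5, 3: 7}, {})
print(step(r, {}, ('load', 1, 2, 995)))  # None: load from unreadable 1000
```

A state is then safe exactly when no reachable state maps to None, which is the content of safe-state above.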
5 Proving safety
In a sufficiently expressive logic, as we all know, proving theorems can be a great deal more difficult than merely stating them, and higher-order logic is certainly expressive. For guidance in proving safety of machine-language programs we should not particularly look to previous work in formal verification of program correctness.
Instead, we should think more of type checking: automatic proofs of decidable safety properties of programs.
The key advances that make it possible to generate proofs automatically are typed intermediate languages
[11] and typed assembly language [14]. Whereas conventional compilers type-check the source program, then
throw away the types (using the lambda-calculus principle
of erasure) and then transform the program through progressively lower-level intermediate representations until
they reach assembly language and then machine language, a type-preserving compiler uses typed intermediate languages at each level. If the program type-checks
at a low level, then it is safe, regardless of whether the
previous (higher-level) compiler phases might be buggy
on some inputs. As the program is analyzed into smaller
pieces at the lower levels, the type systems become progressively more complex, but the type theory of the
1990s is up to the job of engineering the type systems.
Typing rules for machine language. An important insight in the development of PCC is that one can write type-inference rules for machine language and machine states. For example, Necula [15] used rules such as
[Figure: two compiler pipelines. A conventional compiler type-checks the source program's IR (or byte codes) once, then passes through an optimizer (lower-level IR), code generator (assembly-level IR), and register allocator to native machine code. A type-preserving compiler type-checks at every stage: front-end IR (or byte codes), type-preserving optimizer (typed lower-level IR), type-preserving code generator (typed assembly language), and type-preserving register allocator, producing proof-carrying native machine code whose proof is checked.]
    m ⊢ x : τ1 ⊗ τ2
    ─────────────────────────────────────
    m ⊢ m(x) : τ1      m ⊢ m(x + 1) : τ2

meaning that if x has type τ1 ⊗ τ2 in memory m (meaning that it is a pointer to a boxed pair) then the contents of location x will have type τ1 and the contents of location x + 1 will have type τ2.
Proofs of safety in PCC use the local induction hypotheses at each point in the program to prove that the
program is typable. This implies, by a type-soundness argument, that the program is therefore safe.
If the type system is given by syntactic inference rules, the proof of type soundness is typically done by syntactic subject reduction: one proves that each step of computation preserves typability and that typable states are safe. The proof involves structural induction over typing
derivations. In conventional PCC, this proof is done in the
metatheory, by humans.
In foundational PCC we wish to include the type-soundness proof inside the proof that is transmitted to the code consumer, because (1) it's more secure to avoid reliance on human-checked proofs and (2) that way we avoid restricting the protocol to a single type system. But in order to do a foundational subject-reduction theorem, we would need to build up the mathematical machinery to manipulate typing derivations as syntactic objects, all represented inside our logic using foundational mathematical concepts: sets, pairs, and functions. We would need to do case analyses over the different ways that a given type judgement might be derived. While this can all be done, we take a different approach to proving that typability implies safety.
We take a semantic approach. In a semantic proof one
assigns a meaning (a semantic truth value) to type judgements. One then proves that if a type judgement is true
then the typed machine state is safe. One further proves
that the type inference rules are sound, i.e., if the premises
are true then the conclusion is true. This ensures that
derivable type judgements are true and hence typable machine states are safe.
The semantic approach avoids formalizing syntactic type expressions. Instead, one formalizes a type as a set of semantic values. One defines the ⊗ operator as a function taking two sets as arguments and returning a set.
TAL was originally designed to be used in a certifying compiler, but one that certifies the assembly code and uses a trusted assembler to translate to machine code. But we can use TAL to help generate proofs in a PCC system that directly verifies the machine code. In such a system, the proofs are typically by induction, with induction hypotheses such as, "whenever the program counter reaches location l, register 3 will be a pointer to a pair of integers." These local invariants can be generated from the TAL formulation of the program, but in a PCC system they can be checked in machine code without needing to trust the assembler.
To represent a function value, we let x be the entry address of the function; here is the function f(x) = x + 1, assuming that arguments and return results are passed in register 1:

[Figure: a memory m containing the instruction r1 := r1 + 1 at address 200 and jump(r7) at address 201; the function value is (m, 200).]

    ⊨ x :m τ1 ⊗ τ2
    ─────────────────────────────────────
    ⊨ m(x) :m τ1      ⊨ m(x + 1) :m τ2
Although the two forms of the type-inference rule look very similar, they are actually significantly different. In the second rule τ1 and τ2 range over semantic sets rather than type expressions. The relation ⊨ in the second version is defined directly in terms of a semantics for assertions of the form x :m τ. The second rule is actually a lemma to be proved, while the first rule is simply a part of the definition of the syntactic relation ⊢. For the purposes of foundational PCC, we view the semantic proofs as preferable to syntactic subject-reduction proofs because they lead to shorter and more manageable foundational proofs. The semantic approach avoids the need for any formalization of type expressions and avoids the formalization of proofs or derivations of type judgements involving type expressions.
5.1

In that first model [3] we could not handle recursions where the type being defined occurs in a negative (contravariant) position, as in

datatype exp = APP of exp * exp
             | LAM of exp -> exp

where the argument occurrence of exp in LAM is a negative occurrence. Contravariant recursion is occasionally useful in ML, but it is the very essence of object-oriented programming, so these limitations (no mutable fields, no contravariant recursion) are quite restrictive.
5.2
Building semantic models for type systems is interesting and nontrivial. In a first attempt, Amy Felty and
I [3] were able to model a pure-functional (immutable
datatypes) call-by-value language with records, address
arithmetic, polymorphism and abstract types, union and
intersection types, continuations and function pointers,
and covariant recursive types.
Our simplest semantics is set-theoretic: a type is a set
of values. But what is a value? It is not a syntactic construct, as in lambda-calculus; on a von Neumann machine
we wish to use a more natural representation of values that
corresponds to the way procedures and data structures are
represented in practice. This way, our type theory can
match reality without a layer of simulation in between.
We can represent a value as a pair (m, x), where m is a memory and x is an integer (typically representing an address).
To represent a pointer data structure that occupies a certain portion of the machine's memory, we let x be the root address of that structure. For example, the boxed pair of integers ⟨5, 7⟩ represented at address 108 would be represented as the value ({108 ↦ 5, 109 ↦ 7}, 108).
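As an executable illustration (a toy, not the paper's model), one can read "a type is a set of values (m, x)" as a predicate on memory/address pairs; the boxed-pair type then checks its components through memory, exactly as the lemma-style rule promises. All names below are invented:

```python
# Types as predicates on values (m, x): m a memory (dict from addresses to
# contents), x an integer. Membership in a boxed-pair type guarantees
# membership of the components fetched from memory.

def t_int(m, x):
    return isinstance(x, int)

def boxed_pair(t1, t2):
    """The type of pointers x to a boxed pair: m(x) : t1 and m(x+1) : t2."""
    def t(m, x):
        return (x in m and (x + 1) in m
                and t1(m, m[x]) and t2(m, m[x + 1]))
    return t

# The boxed pair <5, 7> at address 108: the value ({108: 5, 109: 7}, 108).
m = {108: 5, 109: 7}
t = boxed_pair(t_int, t_int)
assert t(m, 108)                              # x :m int (x) int
assert t_int(m, m[108]) and t_int(m, m[109])  # hence both components
```

Here the semantic typing rule is not an axiom but a consequence of the definition of boxed_pair, mirroring the paper's point that the semantic rule is a lemma.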
5.3

An expression e belongs to a type τ to approximation k, written e :k τ, when

    e :k τ ≡ ∀j < k. ∀e′. (e ↦^j e′ ∧ nf(e′)) ⇒ ⟨k − j, e′⟩ ∈ τ

where nf(e′) means that e′ is a normal form: it has no successor in the call-by-value small-step evaluation relation.
We start with definitions for the sets that represent the types:
    ⊥        =  { }
    ⊤        =  {⟨k, v⟩ | k ≥ 0}
    int      =  {⟨k, 0⟩, ⟨k, 1⟩, . . . | k ≥ 0}
    τ1 × τ2  =  {⟨k, (v1, v2)⟩ | ∀j < k. ⟨j, v1⟩ ∈ τ1 ∧ ⟨j, v2⟩ ∈ τ2}
    τ1 → τ2  =  {⟨k, λx.e⟩ | ∀j < k. ∀v. ⟨j, v⟩ ∈ τ1 ⇒ e[v/x] :j τ2}
    μF       =  {⟨k, v⟩ | ⟨k, v⟩ ∈ F^{k+1}(⊥)}
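The indexed sets above lend themselves to a small executable toy (an illustration under invented names, not the paper's formal model): a type is a predicate on pairs ⟨k, v⟩, and μF really is computed as F applied k+1 times to the empty type. A finite value inhabits an infinite recursive type only up to a small index, which is exactly the approximation at work:

```python
# Toy step-indexed type model: member(t, k, v) is written t(k, v).

def bot(k, v):                 # the empty type
    return False

def t_int(k, v):               # integers at every index
    return isinstance(v, int)

def t_pair(t1, t2):            # pairs, components checked at indices j < k
    def t(k, v):
        if not (isinstance(v, tuple) and len(v) == 2):
            return False
        v1, v2 = v
        return all(t1(j, v1) and t2(j, v2) for j in range(k))
    return t

def t_mu(F):                   # mu F at index k = F^(k+1) applied to bot
    def t(k, v):
        approx = bot
        for _ in range(k + 1):
            approx = F(approx)
        return approx(k, v)
    return t

# mu a. int x a: "infinite lists" of ints. A finite nesting like (3, (4, 5))
# is accepted only at small indices, illustrating approximation:
inflist = t_mu(lambda a: t_pair(t_int, a))
print(inflist(1, (3, (4, 5))))   # True  (index too small to see the tail)
print(inflist(2, (3, (4, 5))))   # False (the tail 5 is not a pair)
```

Note how each constructor consumes at least one index, which is what the well-founded-constructor property below formalizes.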
A type environment Γ is a finite map from variables to types; a substitution σ satisfies Γ to approximation k, written σ :k Γ, when σ(x) :k Γ(x) for each x in the domain of Γ. Then

    Γ ⊨k e : τ ≡ ∀σ. σ :k Γ ⇒ σ(e) :k τ

where σ(e) is the result of replacing the free variables in e with their values under substitution σ. To drop the index k, we define

    Γ ⊨ e : τ ≡ ∀k. Γ ⊨k e : τ
Soundness theorem: It is trivial to prove from these definitions that if ⊨ e : τ and e ↦* e′, then e′ is not stuck; that is, either e′ is a normal form or e′ ↦ e″ for some e″.
Well-founded type constructors. We define the notion of a well-founded type constructor. Here I will not give the formal definition, but state the informal property: if F is well founded and x : F(τ), then to extract from x a value of type τ, or to apply x to a value of type τ, must take at least one execution step. The constructors × and → are well founded.
From these definitions one can prove, as lemmas, typing rules such as

    ⊨ e : τ1 × τ2
    ─────────────
    ⊨ π1(e) : τ1

    ⊨ e1 : τ1 → τ2      ⊨ e2 : τ1
    ──────────────────────────────
    ⊨ e1 e2 : τ2

5.4 Mutable fields
dependent types. This will make type-checking of TML difficult; we will need to assume that each compiler will have
a source language with a decidable type system, and that
translation of terms (and types) will yield a witness to the
type-checking of the resultant TML representation.
Abstract machine instructions. One can view machine instructions at many levels of abstraction:
1. At the lowest level, an instruction is just an integer, an opcode encoding.
2. At the next level, it implements a relation on raw machine states (r, m) ↦ (r′, m′).
3. At a higher level, we can say that the Sparc add instruction implements a machine-independent notion of add, and similarly for other instructions.
4. Then we can view add as manipulating not just registers, but local variables (which may be implemented in registers or in the activation record).
5. We can view this instruction as one of various typed instructions on typed values; in the usual view, add has type int × int → int, but the address-arithmetic add has type

    (σ0 ⊗ σ1 ⊗ ... ⊗ σn) → const(i) → (σi ⊗ σi+1 ⊗ ... ⊗ σn)

for any i.
[Figure: a pointer x (here 108) to a record of typed fields y0 : t0, y1 : t1, y2 : t2, y3 : t3 in memory m; x + 2 (here 110) points to the suffix of the record beginning at y2.]
6. Finally, we can specialize this typed add to the particular context where some instance of it appears, for example by instantiating the σi, n, and i in the previous example.
Abstraction level 1 is used in the statement of the theorem (safety of a machine-language program p). Abstraction level 5 is implicitly used in conventional proof-carrying code [15]. Our ongoing research involves finding semantic models for each of these levels, and then proving lemmas that can convert between assertions at the different levels.
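The address-arithmetic typing of add at level 5 can also be sketched executably (a toy under invented names, in the same style as the earlier value-model sketch): adding a constant i to a pointer to a record of fields σ0..σn yields a pointer to the record σi..σn.

```python
# Toy record types as predicates on (m, x): a pointer x has type
# record(t0, ..., tn) when field i, at address x + i, has type ti.

def record(*field_types):
    def t(m, x):
        return all(ft(m, m.get(x + off))
                   for off, ft in enumerate(field_types))
    return t

def t_int(m, v):
    return isinstance(v, int)

# Four int fields y0..y3 at address 108; x + 2 points at the suffix y2, y3.
m = {108: 1, 109: 2, 110: 3, 111: 4}
assert record(t_int, t_int, t_int, t_int)(m, 108)   # x : s0 (x) ... (x) s3
assert record(t_int, t_int)(m, 108 + 2)             # x + 2 : s2 (x) s3
```

The second assertion is the semantic content of the address-arithmetic add type: it is a lemma about the sets, not an extra axiom.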
Type systems can also be applied to the problem of garbage collection, as noticed in important recent work by Walker, Crary, and Morrisett [21]; to traverse objects of unknown type, the intensional type calculi originally developed by Harper and Morrisett [11] can be applied. Wang's work on type-preserving garbage collectors [22] covers the region operators and management of pointer sharing; related work by Monnier, Saha, and Shao [13] covers the intensional type system.
Other potentially unsafe parts of the runtime system are ad hoc implementations of polytypic functions, those that work by induction over the structure of data types, such as polymorphic equality testers, debuggers, and marshallers (a.k.a. serializers or picklers). Juan Chen and I have developed an implementation of polytypic primitives as a transformation on the typed intermediate representation in the SML/NJ compiler [6]. Like the R transformation of Crary and Weirich [8] it allows these polytypic functions to be typechecked, but unlike their calculus, ours does not require dependent types in the typed intermediate language and is thus simpler to implement.
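To illustrate what "by induction over the structure of data types" means for such a polytypic function, here is a toy structural-equality tester driven by a runtime type representation. The representation scheme is invented for this sketch and is unrelated to the SML/NJ implementation discussed above:

```python
# A polytypic equality tester: recursion follows the structure of the
# (hypothetical) type representation, not the structure of any one datatype.

def poly_eq(ty, a, b):
    if ty == 'int':                       # base case: integer equality
        return a == b
    if ty[0] == 'pair':                   # inductive case: componentwise
        _, t1, t2 = ty
        return poly_eq(t1, a[0], b[0]) and poly_eq(t2, a[1], b[1])
    raise ValueError('unknown type representation')

print(poly_eq(('pair', 'int', 'int'), (3, 4), (3, 4)))   # True
```

An ad hoc version of this function written against raw memory would be unsafe; the point of the typed-intermediate-representation approach is that such traversals can themselves be typechecked.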
7 Conclusion
Our goal is to reduce the size of the trusted computing base of systems that run machine code from untrusted
sources. This is an engineering challenge that requires
work on many fronts. We are fortunate that during the
last two decades, many talented scientists have built the
mathematical infrastructure we need the theory and implementation of logical frameworks and automated theorem provers, type theory and type systems, compilation
and memory management, and programming language
design. The time is ripe to apply all of these advances
as engineering tools in the construction of safe systems.
References
[1] Andrew W. Appel. Hints on proving theorems in Twelf. www.cs.princeton.edu/appel/twelf-tutorial, February 2000.
[2] Andrew W. Appel and Edward W. Felten. Models for security policies in proof-carrying code. Technical Report TR-636-01, Princeton University, March 2001.
[14] Greg Morrisett, David Walker, Karl Crary, and Neal Glew. From System F to typed assembly language. ACM Trans. on Programming Languages and Systems, 21(3):527-568, May 1999.
[16] George Ciprian Necula. Compiling with Proofs. PhD thesis, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, September 1998.
[17] Frank Pfenning. Elf: A meta-language for deductive systems. In A. Bundy, editor, Proceedings of the 12th International Conference on Automated Deduction, pages 811-815, Nancy, France, June 1994. Springer-Verlag LNAI 814.
[18] Frank Pfenning and Carsten Schürmann. System description: Twelf, a meta-logical framework for deductive systems. In The 16th International Conference on Automated Deduction. Springer-Verlag, July 1999.
[19] Norman Ramsey and Mary F. Fernández. Specifying representations of machine instructions. ACM Trans. on Programming Languages and Systems, 19(3):492-524, May 1997.
[20] Mads Tofte and Jean-Pierre Talpin. Implementation of the typed call-by-value λ-calculus using a stack of regions. In Twenty-first ACM Symposium on Principles of Programming Languages, pages 188-201. ACM Press, January 1994.
[21] David Walker, Karl Crary, and Greg Morrisett. Typed memory management via static capabilities. ACM Trans. on Programming Languages and Systems, 22(4):701-771, July 2000.
[22] Daniel C. Wang and Andrew W. Appel. Type-preserving garbage collectors. In POPL 2001: The 28th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 166-178. ACM Press, January 2001.