Tobias Nipkow · Lawrence C. Paulson · Markus Wenzel

Isabelle/HOL
A Proof Assistant for Higher-Order Logic

Springer-Verlag
Berlin Heidelberg New York London Paris Tokyo Hong Kong Barcelona Budapest
Preface
1. The Basics
1.1 Introduction
1.2 Theories
1.3 Types, Terms and Formulae
1.4 Variables
1.5 Interaction and Interfaces
1.6 Getting Started
4. Presenting Theories
4.1 Concrete Syntax
4.1.1 Infix Annotations
4.1.2 Mathematical Symbols
4.1.3 Prefix Annotations
4.1.4 Abbreviations
4.2 Document Preparation
4.2.1 Isabelle Sessions
4.2.2 Structure Markup
4.2.3 Formal Comments and Antiquotations
4.2.4 Interpretation of Symbols
4.2.5 Suppressing Output
A. Appendix
Part I
Elementary Techniques
1. The Basics
1.1 Introduction
We do not assume that you are familiar with mathematical logic. However, we
do assume that you are used to logical and set theoretic notation, as covered
in a good discrete mathematics course [34], and that you are familiar with
the basic concepts of functional programming [5, 14, 29, 35]. Although this
tutorial initially concentrates on functional programming, do not be misled:
HOL can express most mathematical concepts, and functional programming
is just one particularly simple and ubiquitous instance.
Isabelle [28] is implemented in ML [19]. This has influenced some of Isa-
belle/HOL’s concrete syntax but is otherwise irrelevant for us: this tutorial
is based on Isabelle/Isar [37], an extension of Isabelle which hides the im-
plementation language almost completely. Thus the full name of the system
should be Isabelle/Isar/HOL, but that is a bit of a mouthful.
There are other implementations of HOL, in particular the one by Mike
Gordon et al., which is usually referred to as “the HOL system” [10]. For
us, HOL refers to the logical system, and sometimes its incarnation Isa-
belle/HOL.
A tutorial is by definition incomplete. Currently the tutorial only intro-
duces the rudiments of Isar’s proof language. To fully exploit the power of
Isar, in particular the ability to write readable and structured proofs, you
should start with Nipkow’s overview [24] and consult the Isabelle/Isar Refer-
ence Manual [37] and Wenzel’s PhD thesis [38] (which discusses many proof
patterns) for further details. If you want to use Isabelle’s ML level directly
(for example for writing your own proof procedures) see the Isabelle Reference
Manual [26]; for details relating to HOL see the Isabelle/HOL manual [25].
All manuals have a comprehensive index.
1.2 Theories
!! HOL contains a theory Main, the union of all the basic predefined theories like
arithmetic, lists, sets, etc. Unless you know what you are doing, always include
Main as a direct or indirect parent of all your theories.
1.3 Types, Terms and Formulae

Embedded in a theory are the types, terms and formulae of HOL. HOL is
a typed logic whose type system resembles that of functional programming
languages like ML or Haskell. Thus there are
base types, in particular bool, the type of truth values, and nat, the type of
natural numbers.
type constructors, in particular list, the type of lists, and set, the type of
sets. Type constructors are written postfix, e.g. (nat)list is the type
of lists whose elements are natural numbers. Parentheses around single
arguments can be dropped (as in nat list); multiple arguments are separated by commas (as in (bool,nat)ty).
function types, denoted by ⇒. In HOL ⇒ represents total functions only. As
is customary, τ1 ⇒ τ2 ⇒ τ3 means τ1 ⇒ (τ2 ⇒ τ3 ). Isabelle also sup-
ports the notation [τ1 , . . . , τn ] ⇒ τ which abbreviates τ1 ⇒ · · · ⇒ τn
⇒ τ.
type variables, denoted by 'a, 'b etc., just like in ML. They give rise to
polymorphic types like 'a ⇒ 'a, the type of the identity function.
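For instance, these types can be entered and checked with the typ command (an auxiliary command described in Sect. 2.4); a minimal illustration:

typ "nat list"
typ "'a ⇒ 'a"
typ "[nat, bool] ⇒ bool"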
!! Types are extremely important because they prevent us from writing nonsense.
Isabelle insists that all terms and formulae must be well-typed and will print an
error message if a type mismatch is encountered. To reduce the amount of explicit
type information that needs to be provided by the user, Isabelle infers the type of
all variables automatically (this is called type inference) and keeps quiet about
it. Occasionally this may lead to misunderstandings between you and the system. If
anything strange happens, we recommend that you ask Isabelle to display all type
information via the Proof General menu item Isabelle > Settings > Show Types (see
Sect. 1.5 for details).
Terms are formed as in functional programming by applying functions
to arguments. If f is a function of type τ1 ⇒ τ2 and t is a term of type
τ1 then f t is a term of type τ2 . HOL also supports infix functions like +
and some basic constructs from functional programming, such as conditional
expressions:
if b then t1 else t2 Here b is of type bool and t1 and t2 are of the same
type.
let x = t in u is equivalent to u where all free occurrences of x have been re-
placed by t. For example, let x = 0 in x+x is equivalent to 0+0. Multiple
bindings are separated by semicolons: let x1 = t1 ;…; xn = tn in u .
case e of c1 ⇒ e1 | … | cn ⇒ en evaluates to ei if e is of the form ci .
Terms may also contain λ-abstractions. For example, λx. x+1 is the func-
tion that takes an argument x and returns x+1. Instead of λx.λy.λz. t we
can write λx y z. t .
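Such terms can be parsed and type-checked interactively with the term command (see Sect. 2.4); a small illustration, with all types inferred:

term "λx y z. x + y + (z::nat)"
term "let x = (0::nat) in x + x"
term "if b then (1::nat) else 0"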
Formulae are terms of type bool. There are the basic constants True
and False and the usual logical connectives (in decreasing order of priority):
¬, ∧, ∨, and −→, all of which (except the unary ¬) associate to the right.
In particular A −→ B −→ C means A −→ (B −→ C) and is thus logically
equivalent to A ∧ B −→ C (which is (A ∧ B) −→ C ).
Equality is available in the form of the infix function = of type 'a ⇒ 'a
⇒ bool. Thus t1 = t2 is a formula provided t1 and t2 are terms of the same
type. If t1 and t2 are of type bool then = acts as if-and-only-if. The formula
t1 ≠ t2 is merely an abbreviation for ¬(t1 = t2).
Quantifiers are written as ∀ x. P and ∃ x. P . There is even ∃! x. P , which
means that there exists exactly one x that satisfies P . Nested quantifications
can be abbreviated: ∀ x y z. P means ∀ x.∀ y.∀ z. P .
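Again, such formulae can be checked with term; for example (an illustration, with all types inferred):

term "∀ x y z. x = y ∧ y = z −→ x = (z::nat)"
term "∃! x. x = (1::nat)"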
Despite type inference, it is sometimes necessary to attach explicit type
constraints to a term. The syntax is t::τ as in x < (y::nat). Note that
:: binds weakly and should therefore be enclosed in parentheses. For in-
stance, x < y::nat is ill-typed because it is interpreted as (x < y)::nat. Type
constraints may be needed to disambiguate expressions involving overloaded
functions such as +, * and <. Section 8.3.1 discusses overloading, while Ta-
ble A.2 presents the most important overloaded function symbols.
In general, HOL’s concrete syntax tries to follow the conventions of func-
tional programming and mathematics. Here are the main rules that you
should be familiar with to avoid certain syntactic traps:
– Remember that f t u means (f t) u and not f(t u)!
– Isabelle allows infix functions like +. The prefix form of function application
binds more strongly than anything else and hence f x + y means (f x) + y
and not f(x+y).
– Remember that in HOL if-and-only-if is expressed using equality. But
equality has a high priority, as befitting a relation, while if-and-only-if
typically has the lowest priority. Thus, ¬ ¬ P = P means ¬¬(P = P) and
not (¬¬P) = P. When using = to mean logical equivalence, enclose both
operands in parentheses, as in (A ∧ B) = (B ∧ A).
– Constructs with an opening but without a closing delimiter bind very
weakly and should therefore be enclosed in parentheses if they appear in
subterms, as in (λx. x) = f. This includes if, let, case, λ, and quantifiers.
– Never write λx.x or ∀ x.x=x because x.x is always taken as a single qualified
identifier. Write λx. x and ∀ x. x=x instead.
– Identifiers may contain the characters _ and ', except at the beginning.
For the sake of readability, we use the usual mathematical symbols
throughout the tutorial. Their ASCII equivalents are shown in Table A.1 in
the appendix.
!! A particular problem for novices can be the priority of operators. If you are
unsure, use additional parentheses. In those cases where Isabelle echoes your
input, you can see which parentheses are dropped — they were superfluous. If you
are unsure how to interpret Isabelle’s output because you don’t know where the
(dropped) parentheses go, set the Proof General flag Isabelle > Settings > Show
Brackets (see Sect. 1.5).
1.4 Variables
Isabelle distinguishes free and bound variables, as is customary. Bound vari-
ables are automatically renamed to avoid clashes with free variables. In ad-
dition, Isabelle has a third kind of variable, called a schematic variable or
unknown, which must have a ? as its first character. Logically, an unknown
is a free variable. But it may be instantiated by another term during the proof
process. For example, the mathematical theorem x = x is represented in Isa-
belle as ?x = ?x, which means that Isabelle can instantiate it arbitrarily. This
is in contrast to ordinary variables, which remain fixed. The programming
language Prolog calls unknowns logical variables.
Most of the time you can and should ignore unknowns and work with
ordinary variables. Just don’t be surprised that after you have finished the
proof of a theorem, Isabelle will turn your free variables into unknowns. It
indicates that Isabelle will automatically instantiate those unknowns suitably
when the theorem is used in some other proof. Note that for readability we
often drop the ?s when displaying a theorem.
1.5 Interaction and Interfaces

Proof General offers the Isabelle menu for displaying information and setting
flags. A particularly useful flag is Isabelle > Settings > Show Types which causes
Isabelle to output the type information that is usually suppressed. This is
indispensable in case of errors of all kinds because often the types reveal the
source of the problem. Once you have diagnosed the problem you may no
longer want to see the types because they clutter all output. Simply reset the flag.
1.6 Getting Started

Assuming you have installed Isabelle and Proof General, you start it by typing
Isabelle in a shell window. This launches a Proof General window. By
default, you are in HOL (this is controlled by the ISABELLE_LOGIC setting;
see The Isabelle System Manual for more details). You can choose a different
logic via the Isabelle > Logics menu.
2. Functional Programming in HOL
This chapter describes how to write functional programs in HOL and how
to verify them. However, most of the constructs and proof procedures in-
troduced are general and recur in any specification or verification task. We
really should speak of functional modelling rather than functional program-
ming: our primary aim is not to write programs but to design abstract models
of systems. HOL is a specification language that goes well beyond what can
be expressed as a program. However, for the time being we concentrate on
the computable.
If you are a purist functional programmer, please note that all functions in
HOL must be total: they must terminate for all inputs. Lazy data structures
are not directly available.
2.1 An Introductory Theory

HOL already has a predefined theory of lists called List — ToyList is merely a
small fragment of it chosen as an example. To avoid some ambiguities caused
by defining lists twice, we start theory ToyList by manipulating the concrete
syntax and name space of theory Main as follows:

theory ToyList
imports Main
begin

no_notation Nil ("[]") and Cons (infixr "#" 65) and append (infixr "@" 65)
hide_type list
hide_const rev
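The declaration of the datatype itself (reconstructed here from the standard ToyList theory) is:

datatype 'a list = Nil ("[]")
| Cons 'a "'a list" (infixr "#" 65)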
The datatype list introduces two constructors Nil and Cons, the empty list
and the operator that adds an element to the front of a list. For example,
the term Cons True (Cons False Nil) is a value of type bool list, namely
the list with the elements True and False. Because this notation quickly
becomes unwieldy, the datatype declaration is annotated with an alternative
syntax: instead of Nil and Cons x xs we can write [] and x # xs. In fact,
this alternative syntax is the familiar one. Thus the list Cons True (Cons
False Nil) becomes True # False # []. The annotation infixr means that #
associates to the right: the term x # y # z is read as x # (y # z) and not as
(x # y) # z. The 65 is the priority of the infix #.
!! Syntax annotations can be powerful, but they are difficult to master and are
never necessary. You could drop them from theory ToyList and go back to the
identifiers Nil and Cons. Novices should avoid using syntax annotations in their
own theories.
Next, two functions app and rev are defined recursively, in this order,
because Isabelle insists on definition before use:
primrec app :: "'a list ⇒ 'a list ⇒ 'a list" (infixr "@" 65) where
"[] @ ys = ys" |
"(x # xs) @ ys = x # (xs @ ys)"
2.2 Evaluation

Assuming you have defined a function, how can you be sure it does what you
expect? You can test it on concrete inputs:

value "rev (True # False # [])"

yields the correct result False # True # []. But we can go beyond mere functional
programming and evaluate terms with variables in them, executing
functions symbolically:
value "rev (a # b # c # [])"
yields c # b # a # [].
2.3 An Introductory Proof
Having convinced ourselves (as well as one can by testing) that our definitions
capture our intentions, we are ready to prove a few simple theorems. This
will illustrate not just the basic proof commands but also the typical proof
process.
Main Goal. Our goal is to show that reversing a list twice produces the
original list.
theorem rev_rev [simp]: "rev(rev xs) = xs"
For compactness reasons we omit the header in this tutorial. Until we have
finished a proof, the proof state proper always looks like this:
1. G1
⋮
n. Gn
The numbered lines contain the subgoals G1 , …, Gn that we need to prove to
establish the main goal. Initially there is only one subgoal, which is identical
with the main goal. (If you always want to see the main goal as well, set the
flag Proof.show_main_goal — this flag used to be set by default.)
Let us now get back to rev (rev xs) = xs. Properties of recursively de-
fined functions are best established by induction. In this case there is nothing
obvious except induction on xs:
apply(induct_tac xs)
This tells Isabelle to perform induction on variable xs. The suffix tac stands
for tactic, a synonym for “theorem proving function”. By default, induction
acts on the first subgoal. The new proof state contains two subgoals, namely
the base case (Nil) and the induction step (Cons):
1. rev (rev []) = []
2. ⋀x1 x2. rev (rev x2) = x2 =⇒ rev (rev (x1 # x2)) = x1 # x2
Let us try to solve both goals automatically:

apply(auto)

This command tells Isabelle to apply a proof strategy called auto to all subgoals.
Essentially, auto tries to simplify the subgoals. In our case, subgoal 1 is
solved completely (thanks to the equation rev [] = []) and disappears; the
simplified version of subgoal 2 becomes the new subgoal 1:

1. ⋀x1 x2. rev (rev x2) = x2 =⇒ rev (rev x2 @ x1 # []) = x1 # x2
First Lemma. After abandoning the above proof attempt (at the shell level
type oops) we start a new proof:
lemma rev_app [simp]: "rev(xs @ ys) = (rev ys) @ (rev xs)"
The keywords theorem and lemma are interchangeable and merely indi-
cate the importance we attach to a proposition. Therefore we use the words
theorem and lemma pretty much interchangeably, too.
There are two variables that we could induct on: xs and ys. Because @ is
defined by recursion on the first argument, xs is the correct one:
apply(induct_tac xs)
1. rev ys = rev ys @ []
A total of 2 subgoals...
Again, we need to abandon this proof attempt and prove another simple
lemma first. In the future the step of abandoning an incomplete proof before
embarking on the proof of a lemma usually remains implicit.
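Second Lemma. The missing fact is that appending the empty list leaves a list unchanged. Its statement and proof (reconstructed here, following the pattern just described) are:

lemma app_Nil2 [simp]: "xs @ [] = xs"
apply(induct_tac xs)
apply(auto)
done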
As a result of that final done, Isabelle associates the lemma just proved with
its name. In this tutorial, we sometimes omit to show that final done if it is
obvious from the context that the proof is finished.
Notice that in lemma app_Nil2, as printed out after the final done, the
free variable xs has been replaced by the unknown ?xs, just as explained in
Sect. 1.4.
Going back to the proof of the first lemma
lemma rev_app [simp]: "rev(xs @ ys) = (rev ys) @ (rev xs)"
apply(induct_tac xs)
apply(auto)
we find that this time auto solves the base case, but the induction step merely
simplifies to
1. ⋀x1 x2. rev (x2 @ ys) = rev ys @ rev x2 =⇒
(rev ys @ rev x2) @ x1 # [] = rev ys @ rev x2 @ x1 # []
Now we need to remember that @ associates to the right, and that # and @
have the same priority (namely the 65 in their infixr annotation). Thus the
conclusion really is
(rev ys @ rev x2) @ (x1 # []) = rev ys @ (rev x2 @ (x1 # []))
Third Lemma. Abandoning the previous attempt, the canonical proof pro-
cedure succeeds without further ado.
lemma app_assoc [simp]: "(xs @ ys) @ zs = xs @ (ys @ zs)"
apply(induct_tac xs)
apply(auto)
done
The final end tells Isabelle to close the current theory because we are finished
with its development:
end
The complete proof script is shown in Fig. 2.2. The concatenation of
Figs. 2.1 and 2.2 constitutes the complete theory ToyList and should reside
in file ToyList.thy.
Review This is the end of our toy proof. It should have familiarized you
with
– the standard theorem proving procedure: state a goal (lemma or theorem);
proceed with proof until a separate lemma is required; prove that lemma;
come back to the original goal.
– a specific procedure that works well for functional programs: induction
followed by all-out simplification via auto.
– a basic repertoire of proof commands.
!! It is tempting to think that all lemmas should have the simp attribute just
because this was the case in the example above. However, in that example all
lemmas were equations, and the right-hand side was simpler than the left-hand
side — an ideal situation for simplification purposes. Unless this is clearly the case,
novices should refrain from awarding a lemma the simp attribute, which has a
global effect. Instead, lemmas can be applied locally where they are needed, which
is discussed in the following chapter.
2.4 Some Helpful Commands

This section discusses a few basic commands for manipulating the proof state
and can be skipped by casual readers.
There are two kinds of commands used during a proof: the actual proof
commands and auxiliary commands for examining the proof state and con-
trolling the display. Simple proof commands are of the form apply(method),
where method is typically induct_tac or auto. All such theorem proving oper-
ations are referred to as methods, and further ones are introduced through-
out the tutorial. Unless stated otherwise, you may assume that a method
attacks merely the first subgoal. An exception is auto, which tries to solve all
subgoals.
The most useful auxiliary commands are as follows:
Modifying the order of subgoals: defer moves the first subgoal to the end
and prefer n moves subgoal n to the front.
Printing theorems: thm name1 … namen prints the named theorems.
Reading terms and types: term string reads, type-checks and prints the
given string as a term in the current context; the inferred type is output
as well. typ string reads and prints the given string as a type in the
current context.
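For example, one might type:

term "rev [a]"
typ "'a list ⇒ nat"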
Further commands are found in the Isabelle/Isar Reference Manual [37].
Clicking on the State button redisplays the current proof state. This is helpful
in case commands like thm have overwritten it.
2.5 Datatypes
2.5.1 Lists
Lists are one of the essential datatypes in computing. We expect that you are
already familiar with their basic operations. Theory ToyList is only a small
fragment of HOL's predefined theory List. The latter contains many further
operations. For example, the functions hd ("head") and tl ("tail") return
the first element and the remainder of a list. (However, pattern matching is
usually preferable to hd and tl.) Also available are higher-order functions like
map and filter. Theory List also contains more syntactic sugar: [x1,…,xn]
abbreviates x1 # … # xn # []. In the rest of the tutorial we always use HOL's
predefined lists by building on theory Main.

!! Looking ahead to sets and quantifiers in Part II: The best way to express that
some element x is in a list xs is x ∈ set xs, where set is a function that turns
a list into the set of its elements. By the same device you can also write bounded
quantifiers like ∀ x ∈ set xs or embed lists in other set expressions.
2.5.2 The General Format

The general format of a datatype definition is

datatype (α1 , . . . , αn ) t = C1 τ11 . . . τ1k1 | · · · | Cm τm1 . . . τmkm

where the αi are distinct type variables (the parameters), the Ci are distinct
constructor names and the τij are types; it is customary to capitalize the first letter
in constructor names. There are a number of restrictions (such as that the
type should not be empty) detailed elsewhere [25]. Isabelle notifies you if you
violate them.
Laws about datatypes, such as [] ≠ x#xs and (x#xs = y#ys) = (x=y
∧ xs=ys), are used automatically during proofs by simplification. The same
is true for the equations in primitive recursive function definitions.
Every datatype t comes equipped with a size function from t into the
natural numbers (see Sect. 2.6.1 below). For lists, size is just the length, i.e.
size [] = 0 and size (x # xs) = size xs + 1. In general, size returns 0 for
all constructors that do not have an argument of type t, and for all other
constructors one plus the sum of the sizes of all arguments of type t.

Functions on datatypes are defined by primitive recursion: the defining
equations must be of the form

f x1 . . . (C y1 . . . yk ) . . . xn = r

where C is a constructor of t and all recursive calls of f in r are of the form
f . . . yi . . . for some i.
!! Internally Isabelle only knows about exhaustive case expressions with non-
nested patterns: patterni must be of the form Ci xi1 . . . xiki and C1 , . . . , Cm
must be exactly the constructors of the type of e. More complex case expressions are
automatically translated into the simpler form upon parsing but are not translated
back for printing. This may lead to surprising output.
!! Induction is only allowed on free (or ⋀-bound) variables that should not occur
among the assumptions of the subgoal; see Sect. 9.2.1 for details. Case distinction
(case_tac) works for arbitrary terms, which need to be quoted if they are
non-atomic. However, apart from ⋀-bound variables, the terms must not contain
variables that are bound outside. For example, given the goal ∀ xs. xs = [] ∨
(∃ y ys. xs = y # ys), case_tac xs will not work as expected because Isabelle
interprets the xs as a new free variable distinct from the bound xs in the goal.
2.5.6 Case Study: Boolean Expressions

The aim of this case study is twofold: it shows how to model boolean expres-
sions and some algorithms for manipulating them, and it demonstrates the
constructs introduced above.
apply(auto)
done
In fact, all proofs in this case study look exactly like this. Hence we do not
show them below.
More interesting is the transformation of If-expressions into a normal
form where the first argument of IF cannot be another IF but must be a
constant or variable. Such a normal form can be computed by repeatedly
replacing a subterm of the form IF (IF b x y) z u by IF b (IF x z u) (IF
y z u), which has the same value. The following primitive recursive functions
perform this task:
primrec normif :: "ifex ⇒ ifex ⇒ ifex ⇒ ifex" where
"normif (CIF b) t e = IF (CIF b) t e" |
"normif (VIF x) t e = IF (VIF x) t e" |
"normif (IF b t e) u f = normif b (normif t u f) (normif e u f)"
How do we come up with the required lemmas? Try to prove the main
theorems without them and study carefully what auto leaves unproved. This
can provide the clue. The necessity of universal quantification (∀ t e) in the
two lemmas is explained in Sect. 3.2.
2.6 Some Basic Types

2.6.1 Natural Numbers

!! The constants 0 and 1 and the operations +, -, *, min, max, ≤ and < are
overloaded: they are available not just for natural numbers but for other types
as well. For example, given the goal x + 0 = x, there is nothing to indicate that you
are talking about natural numbers. Hence Isabelle can only infer that x is of some
arbitrary type where 0 and + are declared. As a consequence, you will be unable
to prove the goal. To alert you to such pitfalls, Isabelle flags numerals without a
fixed type in its output: x + (0::'a) = x. (In the absence of a numeral, it may
take you some time to realize what has happened if Show Types is not set). In this
particular example, you need to include an explicit type constraint, for example x+0
= (x::nat). If there is enough contextual information this may not be necessary:
Suc x = x automatically implies x::nat because Suc is not overloaded.
For details on overloading see Sect. 8.3.1. Table A.2 in the appendix shows the
most important overloaded operations.
!! The symbols > and ≥ are merely syntax: x > y stands for y < x and similarly
for ≥ and ≤.
!! Constant 1::nat is defined to equal Suc 0. This definition (see Sect. 2.7.2) is
unfolded automatically by some tactics (like auto, simp and arith) but not by
others (especially the single step tactics in Chapter 5). If you need the full set of
numerals, see Sect. 8.4.1. Novices are advised to stick to 0 and Suc.
Both auto and simp (a method introduced below, Sect. 3.1) prove simple
arithmetic goals automatically:
lemma "[[ ¬ m < n; m < n + (1::nat) ]] =⇒ m = n"
For efficiency’s sake, this built-in prover ignores quantified formulae, many
logical connectives, and all arithmetic operations apart from addition. In
consequence, auto and simp cannot prove this slightly more complex goal:
lemma "m 6= (n::nat) =⇒ m < n ∨ n < m"
The method arith is more general. It attempts to prove the first subgoal
provided it is a linear arithmetic formula. Such formulas may involve the
usual logical connectives (¬, ∧, ∨, −→, =, ∀ , ∃ ), the relations =, ≤ and <, and
the operations +, -, min and max. For example,
lemma "min i (max j (k*k)) = max (min (k*k) i) (min i (j::nat))"
apply(arith)
succeeds because k * k can be treated as atomic. In contrast,
lemma "n*n = n+1 =⇒ n=0"
2.6.2 Pairs
HOL also has ordered pairs: (a1 ,a2 ) is of type τ1 × τ2 provided each ai
is of type τi . The functions fst and snd extract the components of a pair:
fst(x,y) = x and snd(x,y) = y . Tuples are simulated by pairs nested to the
right: (a1 ,a2 ,a3 ) stands for (a1 ,(a2 ,a3 )) and τ1 × τ2 × τ3 for τ1 × (τ2 × τ3 ).
Therefore we have fst(snd(a1 ,a2 ,a3 )) = a2 .
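A quick check (an illustrative evaluation):

value "fst (snd (0::nat, True, 1::nat))"

yields True, in accordance with fst(snd(a1,a2,a3)) = a2.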
Remarks:
– There is also the type unit, which contains exactly one element denoted
by (). This type can be viewed as a degenerate product with 0 components.
– Products, like type nat, are datatypes, which means in particular that
induct_tac and case_tac are applicable to terms of product type. Both split
the term into a number of variables corresponding to the tuple structure
(up to 7 components).
– Tuples with more than two or three components become unwieldy; records
are preferable.
For more information on pairs and records see Chapter 8.
2.7 Definitions
Type synonyms are similar to those found in ML. They are created by the
type_synonym command:

type_synonym number = nat
type_synonym gate = "bool ⇒ bool ⇒ bool"
type_synonym ('a, 'b) alist = "('a × 'b) list"
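A synonym can be used wherever a type is expected, for instance in a constant definition (this example follows the tutorial's constant definitions):

definition nand :: gate where "nand A B ≡ ¬(A ∧ B)"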
3. More Functional Programming

3.1 Simplification
So far we have proved our theorems by auto, which simplifies all subgoals.
In fact, auto can do much more than that. To go beyond toy examples, you
need to understand the ingredients of auto. This section covers the method
that auto always applies first, simplification.
Simplification is one of the central theorem proving tools in Isabelle and
many other systems. The tool itself is called the simplifier. This section
introduces the many features of the simplifier and is required reading if you
intend to perform proofs. Later on, Sect. 9.1 explains some more advanced
features and a little bit of how the simplifier works. The serious student should
read that section as well, in particular to understand why the simplifier did
something unexpected.
!! Simplification can run forever, for example if both f (x) = g(x) and g(x) = f (x)
are simplification rules. It is the user’s responsibility not to include simplification
rules that can lead to nontermination, either on their own or in combination with
other simplification rules.
The general format of invoking simplification is

apply(simp list-of-modifiers)

where the list of modifiers fine-tunes the behaviour and may be empty. Specific
modifiers are discussed below. Most if not all of the proofs seen so far could
have been performed with simp instead of auto, except that simp attacks only
the first subgoal and may thus need to be repeated — use simp_all to simplify
all subgoals. If nothing changes, simp fails.
3.1.5 Assumptions
By default, assumptions are part of the simplification process: they are used
as simplification rules and are simplified themselves. For example:
lemma "[[ xs @ zs = ys @ xs; [] @ xs = [] @ [] ]] =⇒ ys = zs"
apply simp
done
The second assumption simplifies to xs = [], which in turn simplifies the first
assumption to zs = ys, thus reducing the conclusion to ys = ys and hence to
True.
In some cases, using the assumptions can lead to nontermination:
lemma "∀ x. f x = g (f (g x)) =⇒ f [] = f [] @ []"
An unmodified application of simp loops. The culprit is the simplification rule
f x = g (f (g x)), which is extracted from the assumption. (Isabelle notices
certain simple forms of nontermination but not this one.) The problem can
be circumvented by telling the simplifier to ignore the assumptions:
apply(simp (no_asm))
done
Three modifiers influence the treatment of assumptions:
(no_asm) means that assumptions are completely ignored.
(no_asm_simp) means that the assumptions are not simplified but are used
in the simplification of the conclusion.
(no_asm_use) means that the assumptions are simplified but are not used in
the simplification of each other or the conclusion.
Only one of the modifiers is allowed, and it must precede all other modifiers.
3.1.6 Rewriting with Definitions

!! If you have defined f x y ≡ t then you can only unfold occurrences of f with at
least two arguments. This may be helpful for unfolding f selectively, but it may
also get in the way. Defining f ≡ λx y. t allows unfolding all occurrences of f.

There is also the special method unfold, which merely unfolds one or
several definitions, as in apply(unfold xor_def). This can be useful in
situations where simp does too much. Warning: unfold acts on all subgoals!
3.1.8 Conditional Simplification Rules

So far all examples of rewrite rules were equations. The simplifier also accepts
conditional equations, for example

lemma hd_Cons_tl[simp]: "xs ≠ [] =⇒ hd xs # tl xs = xs"
apply(case_tac xs, simp, simp)
done
Note the use of “,” to string together a sequence of methods. Assuming
that the simplification rule (rev xs = []) = (xs = []) is present as well, the
lemma below is proved by plain simplification:
lemma "xs 6= [] =⇒ hd(rev xs) # tl(rev xs) = rev xs"
The conditional equation hd_Cons_tl above can simplify hd (rev xs) # tl
(rev xs) to rev xs because the corresponding precondition rev xs 6= [] sim-
plifies to xs 6= [], which is exactly the local assumption of the subgoal.
3.1.9 Automatic Case Splits

Goals containing if- and case-expressions are usually proved by case distinction.
For example, the case-expression in the goal (case xs of [] ⇒ zs |
y#ys ⇒ y # (ys @ zs)) = xs @ zs can be split over the two list constructors
with the split method (apply(split list.split)), which yields

1. (xs = [] −→ zs = xs @ zs) ∧
(∀ x21 x22. xs = x21 # x22 −→ x21 # x22 @ zs = xs @ zs)
!! The simplifier merely simplifies the condition of an if but not the then or
else parts. The latter are simplified only after the condition reduces to True
or False, or after splitting. The same is true for case-expressions: only the selector
is simplified at first, until either the expression reduces to one of the cases or it is
split.
3.1.10 Tracing
Using the simplifier effectively may take a bit of experimentation. Set the
Proof General flag Isabelle > Settings > Trace Simplifier to get a better idea
of what is going on. For example, tracing the simplification of the (unprovable)
goal rev [a] = [] produces:
[1]Rewriting:
rev [a] ≡ rev [] @ [a]
[1]Rewriting:
rev [] ≡ []
[1]Rewriting:
[] @ [a] ≡ [a]
[1]Rewriting:
[a] = [] ≡ False
The trace lists each rule being applied, both in its general form and the
instance being used. The [i] in front (where above i is always 1) indicates
that we are inside the ith invocation of the simplifier. Each attempt to apply
a conditional rule shows the rule followed by the trace of the (recursive!)
simplification of the conditions, the latter prefixed by [i + 1] instead of
[i]. Another source of recursive invocations of the simplifier are proofs of
arithmetic formulae. By default, recursive invocations are not shown; you
must increase the trace depth via Isabelle > Settings > Trace Simplifier Depth.
Many other hints about the simplifier's actions may appear.
In more complicated cases, the trace can be very lengthy. Thus it is ad-
visable to reset the Trace Simplifier flag after having obtained the desired
trace. Since this is easily forgotten (and may have the unpleasant effect of
swamping the interface with trace information), here is how you can switch
the trace on locally in a proof:
using [[simp_trace=true]]
apply simp
Within the current proof, all simplifications in subsequent proof steps will be
traced, but the text reminds you to remove the using clause after it has done
its job.
3.1.11 Finding Theorems

The search engine is started by clicking on Proof General's Find icon. You specify
your search textually in the input buffer at the bottom of the window.
The simplest form of search finds theorems containing specified patterns.
A pattern can be any term (even a single identifier). It may contain "_", a
wildcard standing for any term. Here are some examples:
length
"_ # _ = _ # _"
"_ + _"
"_ * (_ - (_::nat))"
Specifying types, as shown in the last example, constrains searches involving
overloaded operators.
!! Always use "_" rather than variable names: searching for "x + y" will usually
not find any matching theorems because they would need to contain x and y
literally. When searching for infix operators, do not just type in the symbol, such
as +, but a proper term such as "_ + _". This remark applies to more complicated
syntaxes, too.
If you are looking for rewrite rules (possibly conditional) that could sim-
plify some term, prefix the pattern with simp:.
simp: "_ * (_ + _)"
This finds all equations—not just those with a simp attribute—whose con-
clusion has the form
_ * (_ + _) = . . .
It only finds equations that can simplify the given pattern at the root, not
somewhere inside: for example, equations of the form _ + _ = . . . do not
match.
You may also search for theorems by name—you merely need to specify
a substring. For example, you could search for all commutativity theorems
like this:
name: comm
This retrieves all theorems whose name contains comm.
Search criteria can also be negated by prefixing them with “-”. For exam-
ple,
-name: List
finds theorems whose name does not contain List. You can use this to exclude
particular theories from the search: the long name of a theorem contains the
name of the theory it comes from.
Proof General keeps a history of all your search expressions. If you click on Find,
you can use the arrow keys to scroll through previous searches and just modify
them. This saves you having to type in lengthy expressions again and again.
3.2 Induction Heuristics

The purpose of this section is to illustrate some simple heuristics for inductive
proofs. The first one we have already mentioned in our initial example:
Theorems about recursive functions are proved by induction.
In case the function has more than one argument
Do induction on argument number i if the function is defined by
recursion in argument number i.
When we look at the proof of (xs@ys) @ zs = xs @ (ys@zs) in Sect. 2.3 we
find
– @ is recursive in the first argument
– xs occurs only as the first argument of @
– both ys and zs occur at least once as the second argument of @
Hence it is natural to perform induction on xs.
The key heuristic, and the main point of this section, is to generalize the
goal before induction. The reason is simple: if the goal is too specific, the
induction hypothesis is too weak to allow the induction step to go through.
Let us illustrate the idea with an example.
Function rev has quadratic worst-case running time because it calls function
@ for each element of the list and @ is linear in its first argument. A linear
time version of rev requires an extra argument where the result is accumulated
gradually, using only #:
primrec itrev :: "'a list ⇒ 'a list ⇒ 'a list" where
"itrev [] ys = ys" |
"itrev (x#xs) ys = itrev xs (x#ys)"
The behaviour of itrev is simple: it reverses its first argument by stacking
its elements onto the second argument, and returning that second argument
when the first one becomes empty. Note that itrev is tail-recursive: it can
be compiled into a loop.
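Evaluating symbolically, as in Sect. 2.2, confirms the intended behaviour:

value "itrev (a # b # c # []) []"

yields c # b # a # [].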
Naturally, we would like to show that itrev does indeed reverse its first
argument provided the second one is empty:
lemma "itrev xs [] = rev xs"
There is no choice as to the induction variable, and we immediately simplify:
apply(induct_tac xs, simp_all)
Unfortunately, this attempt does not prove the induction step:

1. ⋀a list. itrev list [] = rev list =⇒ itrev list [a] = rev list @ [a]
The induction hypothesis is too weak. The fixed argument, [], prevents it
from rewriting the conclusion. This example suggests a heuristic:
Generalize goals for induction by replacing constants by variables.
Of course one cannot do this naïvely: itrev xs ys = rev xs is just not true.
The correct generalization is
lemma "itrev xs ys = rev xs @ ys"
The induction hypothesis is still too weak, but this time it takes no intuition
to generalize: the problem is that ys is fixed throughout the subgoal, but the
induction hypothesis needs to be applied with a # ys instead of ys. Hence we
prove the theorem for all ys instead of a fixed one:
lemma "∀ ys. itrev xs ys = rev xs @ ys"
This time induction on xs followed by simplification succeeds. This leads to
another heuristic for generalization:
Generalize goals for induction by universally quantifying all free vari-
ables (except the induction variable itself!).
This prevents trivial failures like the one above and does not affect the validity
of the goal. However, this heuristic should not be applied blindly. It is not
always required, and the additional quantifiers can complicate matters in
some cases. The variables that should be quantified are typically those that
change in recursive calls.
3.3 Case Study: Compiling Expressions

The task is to develop a compiler from a generic type of expressions, built
from variables, constants and binary operations, to a stack machine. Expressions
are parameterized by the types of variables and values; binary operations
are simply functions of type "'v ⇒ 'v ⇒ 'v", introduced via a type synonym:

type_synonym 'v binop = "'v ⇒ 'v ⇒ 'v"

datatype ('a,'v)expr = Cex 'v
| Vex 'a
| Bex "'v binop" "('a,'v)expr" "('a,'v)expr"
The three constructors represent constants, variables and the application of
a binary operation to two subexpressions.
The value of an expression with respect to an environment that maps
variables to values is easily defined:
primrec "value" :: "('a,'v)expr ⇒ ('a ⇒ 'v) ⇒ 'v" where
"value (Cex v) env = v" |
"value (Vex a) env = env a" |
"value (Bex f e1 e2) env = f (value e1 env) (value e2 env)"
The stack machine has three instructions: load a constant value onto the
stack, load the contents of an address onto the stack, and apply a binary
operation to the two topmost elements of the stack, replacing them by the
result. As for expr, addresses and values are type parameters:
datatype (dead 'a, 'v) instr = Const 'v
| Load 'a
| Apply "'v binop"
The execution of the stack machine is modelled by a function exec that
takes a list of instructions, a store (modelled as a function from addresses
to values, just like the environment for evaluating expressions), and a stack
(modelled as a list) of values, and returns the stack at the end of the execution
— the store remains unchanged:
primrec exec :: "('a,'v)instr list ⇒ ('a⇒'v) ⇒ 'v list ⇒ 'v list"
where
"exec [] s vs = vs" |
"exec (i#is) s vs = (case i of
Const v ⇒ exec is s (v#vs)
| Load a ⇒ exec is s ((s a)#vs)
| Apply f ⇒ exec is s ((f (hd vs) (hd(tl vs)))#(tl(tl vs))))"
Recall that hd and tl return the first element and the remainder of a list.
Because all functions are total, hd is defined even for the empty list, although
we do not know what the result is. Thus our model of the machine always
terminates properly, although the definition above does not tell us much
about the result in situations where Apply was executed with fewer than two
elements on the stack.
The compiler is a function from expressions to a list of instructions. Its
definition is obvious:
primrec compile :: "('a,'v)expr ⇒ ('a,'v)instr list" where
"compile (Cex v) = [Const v]" |
"compile (Vex a) = [Load a]" |
"compile (Bex f e1 e2) = (compile e2) @ (compile e1) @ [Apply f]"
Now we have to prove the correctness of the compiler, i.e. that the exe-
cution of a compiled expression results in the value of the expression:
theorem "exec (compile e) s [] = [value e s]"
3.4 Advanced Datatypes
This section presents advanced forms of datatypes: mutual and nested re-
cursion. A series of examples will culminate in a treatment of the trie data
structure.
3.4.1 Mutual Recursion

Sometimes two datatypes depend on each other. The original example of this
subsection defines arithmetic expressions 'a aexp and boolean expressions
'a bexp as mutually recursive datatypes, together with mutually recursive
evaluation functions evala and evalb and substitution functions substa and
substb. The key property is that applying a substitution s to an expression a
and evaluating the result in an environment env yields the same result as
evaluating a in the environment that maps every variable x to the value of
s(x) under env. If you try to prove this separately for arithmetic or boolean
expressions (by induction), you find that you always need the other theorem
in the induction step. Therefore you need to state and prove both theorems
simultaneously:
lemma "evala (substa s a) env = evala a (λx. evala (s x) env) ∧
evalb (substb s b) env = evalb b (λx. evala (s x) env)"
apply(induct_tac a and b)
The resulting 8 goals (one for each constructor) are proved in one fell swoop:
apply simp_all
In general, given n mutually recursive datatypes τ1 , …, τn , an inductive
proof expects a goal of the form
P1 (x1 ) ∧ · · · ∧ Pn (xn )
Exercise 3.4.1 Define a function norma of type 'a aexp ⇒ 'a aexp that replaces
IFs with complex boolean conditions by nested IFs; it should eliminate
the constructors And and Neg, leaving only Less. Prove that norma preserves
the value of an expression and that the result of norma is really normal, i.e.
no more Ands and Negs occur in it. (Hint: proceed as in Sect. 2.5.6 and read
the discussion of type annotations following lemma subst_id below.)
3.4.2 Nested Recursion

So far, all datatypes had the property that on the right-hand side of their
definition they occurred only at the top-level: directly below a constructor.
Now we consider nested recursion, where the recursive datatype occurs nested
in some other datatype (but not inside itself!). Consider the following model
of terms where function symbols can be applied to a list of arguments:
datatype ('v,'f)"term" = Var 'v | App 'f "('v,'f)term list"
Note that we need to quote term on the left to avoid confusion with the
Isabelle command term. Parameter 'v is the type of variables and 'f the
type of function symbols. A mathematical term like f (x, g(y)) becomes App
f [Var x, App g [Var y]], where f, g, x, y are suitable values, e.g. numbers
or strings.
What complicates the definition of term is the nested occurrence of term
inside list on the right-hand side. In principle, nested recursion can be elim-
inated in favour of mutual recursion by unfolding the offending datatypes,
here list. The result for term would be something like
datatype ('v,'f)"term" = Var 'v | App 'f "('v,'f)term_list"
and ('v,'f)term_list = Nil | Cons "('v,'f)term" "('v,'f)term_list"

Although we do not recommend this unfolding to the user, it shows how to
simulate nested recursion by mutual recursion. Instead, we stay with the
nested definition and define a substitution function on terms. Because terms
involve term lists, two functions need to be defined simultaneously:

primrec
subst :: "('v ⇒ ('v,'f)term) ⇒ ('v,'f)term ⇒ ('v,'f)term" and
substs :: "('v ⇒ ('v,'f)term) ⇒ ('v,'f)term list ⇒ ('v,'f)term list"
where
"subst s (Var x) = s x" |
subst_App: "subst s (App f ts) = App f (substs s ts)" |
"substs s [] = []" |
"substs s (t # ts) = subst s t # substs s ts"
Note that Var is the identity substitution because by definition it leaves variables
unchanged: subst Var (Var x) = Var x. This holds for all terms:

lemma subst_id: "subst Var t = (t ::('v,'f)term) ∧
substs Var ts = (ts::('v,'f)term list)"

Note also that the type annotations are necessary because otherwise there is
nothing in the goal to enforce that both halves of the goal talk about the same
type parameters ('v,'f). As a result, induction would fail because the two
halves of the goal would be unrelated.
Exercise 3.4.2 The fact that substitution distributes over composition can
be expressed roughly as follows:
subst (f ◦ g) t = subst f (subst g t)
Correct this statement (you will find that it does not type-check), strengthen
it, and prove it. (Note: ◦ is function composition; its definition is found in
theorem o_def ).
Exercise 3.4.3 Define a function trev of type ('v, 'f) Nested.term ⇒
('v, 'f) Nested.term that recursively reverses the order of arguments of all
function symbols in a term. Prove that trev (trev t) = t.
The defining equation for App can be superseded by an equivalent one that
uses map:

lemma [simp]: "subst s (App f ts) = App f (map (subst s) ts)"

What is more, we can now disable the old defining equation as a simplification
rule:

declare subst_App [simp del]

The advantage is that, now that we have replaced substs by map, we can profit
from the large number of pre-proved lemmas about map. Unfortunately, inductive
proofs about type term are still awkward because they expect a conjunc-
tion. One could derive a new induction principle as well (see Sect. 9.2.3), but
simpler is to stop using primrec and to define functions with fun instead.
Simple uses of fun are described in Sect. 3.5 below. Advanced applications,
including functions over nested datatypes like term, are discussed in a sepa-
rate tutorial [17].
Of course, you may also combine mutual and nested recursion of data-
types. For example, constructor Sum in Sect. 3.4.1 could take a list of expres-
sions as its argument: Sum "'a aexp list".
3.4.3 The Limits of Nested Recursion

How far can we push nested recursion? By the unfolding argument above,
we can reduce nested to mutual recursion provided the nested recursion only
involves previously defined datatypes. This does not include functions:
datatype t = C "t ⇒ bool"
This declaration is a real can of worms. In HOL it must be ruled out because
it requires a type t such that t and its power set t ⇒ bool have the same
cardinality — an impossibility. For the same reason it is not possible to allow
recursion involving the type t set, which is isomorphic to t ⇒ bool.
Fortunately, a limited form of recursion involving function spaces is per-
mitted: the recursive type may occur on the right of a function arrow, but
never on the left. Hence the above can of worms is ruled out but the following
example of a potentially infinitely branching tree is accepted:
datatype (dead 'a,'i) bigtree = Tip | Br 'a "'i ⇒ ('a,'i)bigtree"
Parameter 'a is the type of values stored in the branches (constructor Br) of
the tree, whereas 'i is the index type over which the tree branches. If 'i is
instantiated to bool,
the result is a binary tree; if it is instantiated to nat, we have an infinitely
branching tree because each node has as many subtrees as there are natural
numbers. How can we possibly write down such a tree? Using functional
notation! For example, the term
Br 0 (λi. Br i (λn. Tip))
of type (nat, nat) bigtree is the tree whose root is labeled with 0 and whose
ith subtree is labeled with i and has merely Tips as further subtrees.
Function map_bt applies a function to all labels in a bigtree:
primrec map_bt :: "('a ⇒ 'b) ⇒ ('a,'i)bigtree ⇒ ('b,'i)bigtree"
where
"map_bt f Tip = Tip" |
"map_bt f (Br a F) = Br (f a) (λi. map_bt f (F i))"
This is a valid primrec definition because the recursive calls of map_bt involve
only subtrees of F, which is itself a subterm of the left-hand side. Thus termi-
nation is assured. The seasoned functional programmer might try expressing
λi. map_bt f (F i) as map_bt f ◦ F, which Isabelle however will reject. Ap-
plying map_bt to only one of its arguments makes the termination proof less
obvious.
The following lemma has a simple proof by induction:
lemma "map_bt (g o f) T = map_bt g (map_bt f T)"
apply(induct_tac T, simp_all)
done
Because of the function type, the proof state after induction looks unusual.
Notice the quantified induction hypothesis:
1. map_bt (g ◦ f) Tip = map_bt g (map_bt f Tip)
2. ⋀x1 F. (⋀x2a. x2a ∈ range F =⇒
map_bt (g ◦ f) x2a = map_bt g (map_bt f x2a)) =⇒
map_bt (g ◦ f) (Br x1 F) = map_bt g (map_bt f (Br x1 F))
If you need nested recursion on the left of a function arrow, there are
alternatives to pure HOL. In the Logic for Computable Functions (LCF),
types like
datatype lam = C "lam → lam"
do indeed make sense [27]. Note the different arrow, → instead of ⇒, express-
ing the type of continuous functions. There is even a version of LCF on top
of HOL, called HOLCF [20].
3.4.4 Case Study: Tries

Tries are a classic search tree data structure for fast indexing with strings.
Consider a trie containing the words "all", "an", "ape", "can", "car" and
"cat". When searching a string in a trie, the letters of the string are examined
sequentially. Each letter determines which subtrie to search next. In this case
study we model tries as a datatype, define a lookup and an update function,
and prove that they behave as expected.

[The original figure at this point shows such a trie: the root branches on a
and c, and the paths through the tree spell out the six words above.]
Proper tries associate some value with each string. Since the information
is stored only in the final node associated with the string, many nodes do not
carry any value. This distinction is modeled with the help of the predefined
datatype option (see Sect. 2.6.3).
To minimize running time, each node of a trie should contain an array that
maps letters to subtries. We have chosen a representation where the subtries
are held in an association list, i.e. a list of (letter,trie) pairs. Abstracting over
the alphabet 'a and the values 'v we define a trie as follows:
datatype ('a,'v)trie = Trie "'v option" "('a * ('a,'v)trie)list"
The first component is the optional value, the second component the associa-
tion list of subtries. This is an example of nested recursion involving products,
which is fine because products are datatypes as well. We define two selector
functions:
primrec "value" :: "('a,'v)trie ⇒ 'v option" where
"value(Trie ov al) = ov"
primrec alist :: "('a,'v)trie ⇒ ('a * ('a,'v)trie)list" where
"alist(Trie ov al) = al"
Association lists come with a generic lookup function. Its result involves type
option because a lookup can fail:
primrec assoc :: "('key * 'val)list ⇒ 'key ⇒ 'val option" where
"assoc [] x = None" |
"assoc (p#ps) x =
(let (a,b) = p in if a=x then Some b else assoc ps x)"
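For example (an illustrative evaluation):

value "assoc [(1::nat, True), (2, False)] 2"

yields Some False.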
Now we can define the lookup function for tries. It descends into the trie
examining the letters of the search string one by one. As recursion on lists is
simpler than on tries, let us express this as primitive recursion on the search
string argument:
primrec lookup :: "('a,'v)trie ⇒ 'a list ⇒ 'v option" where
"lookup t [] = value t" |
"lookup t (a#as) = (case assoc (alist t) a of
None ⇒ None
| Some at ⇒ lookup at as)"

After defining an update function in the same style, the main correctness
statement relates the two:

theorem "∀ t v bs. lookup (update t as v) bs =
(if as = bs then Some v else lookup t bs)"
Our plan is to induct on as; hence the remaining variables are quantified.
From the definitions it is clear that induction on either as or bs is required.
The choice of as is guided by the intuition that simplification of lookup might
be easier if update has already been simplified, which can only happen if as
is instantiated. The start of the proof is conventional:
apply(induct_tac as, auto)
Unfortunately, this time we are left with three intimidating looking subgoals:
1. … =⇒ lookup … bs = lookup t bs
2. … =⇒ lookup … bs = lookup t bs
3. … =⇒ lookup … bs = lookup t bs
Clearly, if we want to make headway we have to instantiate bs as well. It
turns out that instead of induction, case distinction suffices:

apply(case_tac[!] bs, auto)

All methods ending in tac take an optional first argument that specifies the
range of subgoals they are applied to, where [!] means all subgoals, i.e. [1-3]
in our case. Individual subgoal numbers, e.g. [2] are also allowed.
This proof may look surprisingly straightforward. However, note that this
comes at a cost: the proof script is unreadable because the intermediate
proof states are invisible, and we rely on the (possibly brittle) magic of auto
(simp_all will not do — try it) to split the subgoals of the induction up in
such a way that case distinction on bs makes sense and solves the proof.
Exercise 3.4.4 Modify update (and its type) such that it allows both in-
sertion and deletion of entries with a single function. Prove the correspond-
ing version of the main theorem above. Optimize your function such that it
shrinks tries after deletion if possible.
Exercise 3.4.5 Write an improved version of update that does not suffer
from the space leak (pointed out above) caused by not deleting overwritten
entries from the association list. Prove the main theorem for your improved
update.
3.5 Total Recursive Functions: fun

Although many total functions have a natural primitive recursive definition,
this is not always the case. Arbitrary total recursive functions can be defined
by means of fun: pattern matching is not restricted to datatype constructors,
and termination is proved by showing that the arguments of all recursive calls
decrease in a suitable sense.

3.5.1 Definition
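The running example of this section is sep, which inserts a separator between any two consecutive elements of a list. Its definition (reconstructed here; it matches the three cases of sep.induct shown at the end of this section) is:

fun sep :: "'a ⇒ 'a list ⇒ 'a list" where
"sep a [] = []" |
"sep a [x] = [x]" |
"sep a (x#y#zs) = x # a # sep a (y#zs)"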
After a function f has been defined via fun, its defining equations (or vari-
ants derived from them) are available under the name f .simps as theorems.
For example, look (via thm) at sep.simps and sep1.simps to see that they
define the same function. What is more, those equations are automatically
declared as simplification rules.
3.5.2 Termination
Isabelle’s automatic termination prover for fun has a fixed notion of the size
(of type nat) of an argument. The size of a natural number is the number
itself. The size of a list is its length. For the general case see Sect. 2.5.2.
A recursive function is accepted if fun can show that the size of one fixed
argument becomes smaller with each recursive call.
More generally, fun allows any lexicographic combination of size measures
in case there are multiple arguments. For example, the following version of
Ackermann’s function is accepted:
fun ack2 :: "nat ⇒ nat ⇒ nat" where
"ack2 n 0 = Suc n" |
"ack2 0 (Suc m) = ack2 (Suc 0) m" |
"ack2 (Suc n) (Suc m) = ack2 (ack2 n (Suc m)) m"
The order of arguments has no influence on whether fun can prove ter-
mination of a function. For more details see elsewhere [6].
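For example (a small test of the definition above):

value "ack2 2 2"

yields 7.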
3.5.3 Simplification

Upon a successful termination proof, the recursion equations become simplification
rules, just as with primrec. In most cases this works fine, but there is
a subtle problem: simplification may not terminate because of automatic
splitting of if. Consider the classic example of computing greatest common
divisors:

fun gcd :: "nat ⇒ nat ⇒ nat" where
"gcd m n = (if n = 0 then m else gcd n (m mod n))"

The second argument decreases with each recursive call. The termination
condition

n ≠ 0 =⇒ m mod n < n

is proved automatically because it is already present as a lemma in HOL.
Thus the recursion equation becomes a simplification rule. However, the
simplifier splits every if-expression whose condition does not reduce to True
or False. For example, it rewrites the goal gcd m n = k

in one step to

(if n = 0 then m else gcd n (m mod n)) = k

and, after splitting, the recursive call gcd n (m mod n) is no longer protected
by the if and can be unfolded again and again, ad infinitum.
3.5.4 Induction
Having defined a function we might like to prove something about it. Since the
function is recursive, the natural proof principle is again induction. But this
time the structural form of induction that comes with datatypes is unlikely
to work well — otherwise we could have defined the function by primrec.
Therefore fun automatically proves a suitable induction rule f .induct that
follows the recursion pattern of the particular function f . We call this re-
cursion induction. Roughly speaking, it requires you to prove for each fun
equation that the property you are trying to establish holds for the left-hand
side provided it holds for all recursive calls on the right-hand side. Here is a
simple example involving the predefined map functional on lists:
lemma "map f (sep x xs) = sep (f x) (map f xs)"
Note that map f xs is the result of applying f to all elements of xs. We prove
this lemma by recursion induction over sep :
apply(induct_tac x xs rule: sep.induct)
The resulting proof state has three subgoals corresponding to the three
clauses for sep :
1. ⋀a. map f (sep a []) = sep (f a) (map f [])
2. ⋀a x. map f (sep a [x]) = sep (f a) (map f [x])
3. ⋀a x y zs.
map f (sep a (y # zs)) = sep (f a) (map f (y # zs)) =⇒
map f (sep a (x # y # zs)) = sep (f a) (map f (x # y # zs))
The proof goes smoothly because the induction rule follows the recursion of
sep. Try proving the above lemma by structural induction, and you find that
you need an additional case distinction.
In general, the format of invoking recursion induction is
apply(induct_tac x1 . . . xn rule: f .induct)
where x1 . . . xn is a list of free variables in the subgoal and f the name of a
function that takes n arguments. Usually the subgoal will contain the term
f x1 . . . xn but this need not be the case. The induction rules do not mention
f at all. Here is sep.induct:
f at all. Here is sep.induct:
[[ ⋀a. P a [];
⋀a x. P a [x];
⋀a x y zs. P a (y # zs) =⇒ P a (x # y # zs) ]]
=⇒ P u v
It merely says that in order to prove a property P of u and v you need to prove
it for the three cases where v is the empty list, the singleton list, and the list
with at least two elements. The final case has an induction hypothesis: you
may assume that P holds for the tail of that list.
4. Presenting Theories
By now the reader should have become sufficiently acquainted with elemen-
tary theory development in Isabelle/HOL. The following interlude describes
how to present theories in a typographically pleasing manner. Isabelle pro-
vides a rich infrastructure for concrete syntax of the underlying λ-calculus
language (see Sect. 4.1), as well as document preparation of theory texts based
on existing PDF-LaTeX technology (see Sect. 4.2).
As pointed out by Leibniz more than 300 years ago, notions are in principle more important than notations, but suggestive textual representation of
ideas is vital to reduce the mental effort to comprehend and apply them.
4.1 Concrete Syntax

The core concept of Isabelle’s framework for concrete syntax is that of mixfix
annotations. Associated with any kind of constant declaration, mixfixes
affect both the grammar productions for the parser and output templates for
the pretty printer.
In full generality, parser and pretty printer configuration is a subtle affair [37]. Your syntax specifications need to interact properly with the existing
setup of Isabelle/Pure and Isabelle/HOL. To avoid creating ambiguities with
existing elements, it is particularly important to give new syntactic constructs
the right precedence.
Below we introduce a few simple syntax declaration forms that already
cover many common situations fairly well.
4.1.2 Mathematical Symbols

Isabelle supports a large number of mathematical symbols, written in the input as \<name>; their interpretation in the printed output is explained in Sect. 4.2.4. There are also a few predefined control symbols, such as \<^sub> and \<^sup> for sub- and superscript of the subsequent printable symbol, respectively. For example, A\<^sup>\<star> is output as A⋆.
A number of symbols are considered letters by the Isabelle lexer and can
be used as part of identifiers. These are the greek letters α (\<alpha>), β
(\<beta>), etc. (excluding λ), and special letters like 𝒜 (\<A>) and 𝔄 (\<AA>).
Moreover the control symbol \<^sub> may be used to subscript a single letter
or digit in the trailing part of an identifier. This means that the input
\<forall>\<alpha>\<^sub>1. \<alpha>\<^sub>1 = \<Pi>\<^sub>\<A>

is recognized as the term ∀ α₁. α₁ = Π𝒜 by Isabelle.
4.1.3 Prefix Annotations

Prefix syntax annotations are another form of mixfixes [37], without any
template arguments or priorities — just some literal syntax. The following
example associates common symbols with the constructors of a datatype.
datatype currency =
    Euro nat ("€")
  | Pounds nat ("£")
  | Yen nat ("¥")
  | Dollar nat ("$")
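The new notation can be used immediately in terms. For instance (a hypothetical declaration, not from the original text):

definition price :: currency where
"price ≡ € 99"

Here € 99 is merely alternative concrete syntax for Euro 99.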
4.1.4 Abbreviations
4.2 Document Preparation

Isabelle/Isar is centered around the concept of formal proof documents: the ultimate result of a formal development is meant to be a human-readable record, presented as a browsable PDF file or printed on paper. Given, say, a datatype 'a bintree of binary trees, we may comment on it in the theory text:

text {*
\noindent The datatype induction rule generated here is
of the form @{thm [display] bintree.induct [no_vars]}
*}
Here we have augmented the theory by formal comments (using text blocks),
the informal parts may again refer to formal entities by means of “antiquotations” (such as @{text "'a bintree"} or @{typ 'a}), see also Sect. 4.2.3.
One may now start to populate the directory MySession and its ROOT file
accordingly. The file MySession/document/root.tex should also be adapted
at some point; the default version is mostly self-explanatory. Note that
\isabellestyle enables fine-tuning of the general appearance of characters
and mathematical symbols (see also Sect. 4.2.4).
Especially observe the included LaTeX packages isabelle (mandatory), isabellesym (required for mathematical symbols), and the final pdfsetup (provides sane defaults for hyperref, including URL markup). All three are distributed with Isabelle; further packages may be required in particular applications.

4.2.2 Structure Markup

The large-scale structure of an Isabelle document follows the theory sources; for example, consider this theory skeleton:
theory Foo_Bar
imports Main
begin
definition foo :: …
definition bar :: …
lemma fooI: …
lemma fooE: …
theorem main: …
end
4.2.3 Formal Comments and Antiquotations

Isabelle source comments, which are of the form (* . . . *), essentially act
like white space and do not really contribute to the content. They mainly
serve technical purposes to mark certain oddities in the raw input text. In
contrast, formal comments are portions of text that are associated with
formal Isabelle/Isar commands (marginal comments), or as standalone
paragraphs within a theory or proof context (text blocks).
Marginal comments are part of each command’s concrete syntax [37]; the
common form is “-- text” where text is delimited by ". . . " or {* . . . *} as
before. Multiple marginal comments may be given at the same time. Here is
a simple example:
lemma "A --> A"
— a triviality of propositional logic
— (should not really bother)
by (rule impI) — implicit assumption step involved here
From the LaTeX viewpoint, “--” acts like a markup command, associated
with the macro \isamarkupcmt (taking a single argument).
Text blocks are introduced by the commands text and txt. Each takes
again a single text argument, which is interpreted as a free-form paragraph
in LaTeX (surrounded by some additional vertical space). The typesetting may be changed by redefining the LaTeX environments isamarkuptext and isamarkuptxt, respectively (via \renewenvironment).
The text part of Isabelle markup commands essentially inserts quoted ma-
terial into a formal text, mainly for instruction of the reader. An antiquo-
tation is again a formal object embedded into such an informal portion. The
interpretation of antiquotations is limited to some well-formedness checks,
with the result being pretty printed to the resulting document. Quoted text
blocks together with antiquotations provide an attractive means of referring
to formal entities, with good confidence in getting the technical details right
(especially syntax and types).
The general syntax of antiquotations is as follows: @{name arguments},
or @{name [options] arguments} for a comma-separated list of options consisting of a name or name=value each. The syntax of arguments depends on the kind of antiquotation; it generally follows the same conventions for types,
terms, or theorems as in the formal part of a theory.
This sentence demonstrates quotations and antiquotations: λx y. x is a
well-typed term.
The output above was produced as follows:
text {*
This sentence demonstrates quotations and antiquotations:
@{term "%x y. x"} is a well-typed term.
*}
The notational change from the ASCII character % to the symbol λ reveals
that Isabelle printed this term, after parsing and type-checking. Document
preparation enables symbolic output by default.
The next example includes an option to show the type of all variables.
The antiquotation @{term [show_types] "%x y. x"} produces the output
λ(x::'a) y::'b. x. Type inference has figured out the most general typings
in the present theory context. Terms may acquire different typings due to con-
straints imposed by their environment; within a proof, for example, variables
are given the same types as they have in the main goal statement.
Several further kinds of antiquotations and options are available [37]. Here
are a few commonly used combinations:
@{typ τ}   print type τ
@{const c}   check existence of c and print it
@{term t}   print term t
@{prop φ}   print proposition φ
@{prop [display] φ}   print large proposition φ (with linebreaks)
@{prop [source] φ}   check proposition φ, print its input
@{thm a}   print fact a
@{thm a [no_vars]}   print fact a, fixing schematic variables
@{thm [source] a}   check availability of fact a, print its name
@{text s}   print uninterpreted text s

4.2.4 Interpretation of Symbols

Mathematical symbols such as \<forall> are interpreted in the final document by mapping each symbol to a LaTeX macro: \isasymXYZ is used for \<XYZ> (see isabelle.sty for working examples). Control symbols are slightly more difficult to get right, though.
The \isabellestyle macro provides a high-level interface to tune the
general appearance of individual symbols. For example, \isabellestyle{it} uses the italics text style to mimic the general appearance of the LaTeX math mode; double quotes are not printed at all. The resulting quality of typesetting is quite good, so this should be the default style for work that gets distributed to a broader audience.
4.2.5 Suppressing Output

By default, Isabelle’s document system generates a LaTeX file for each theory
that gets loaded while running the session. The generated session.tex will
include all of these in order of appearance, which in turn gets included by
the standard root.tex. Certainly one may change the order or suppress unwanted theories by ignoring session.tex and loading individual files directly in root.tex. On the other hand, such an arrangement requires additional
maintenance whenever the collection of theories changes.
Alternatively, one may tune the theory loading process in ROOT itself:
some sequential order of theories sections may enforce a certain traversal
of the dependency graph, although this could degrade parallel processing.
The nodes of each sub-graph that is specified here are presented in some
topological order of their formal dependencies.
Moreover, the system build option document=false allows document generation to be disabled for particular theories. Its usage in the session ROOT is like this:
theories [document = false] T
Commands may also carry typesetting tags, such as %invisible, which the document preparation system can be instructed to suppress. For example, the typeset document may show only “lemma "x = x"” although the original source has been “lemma "x = x" by %invisible (simp)”.
Tags observe the structure of proofs; adjacent commands with the same tag
are joined into a single region. The Isabelle document preparation system
allows the user to specify how to interpret a tagged region, in order to keep,
drop, or fold the corresponding parts of the document. See the Isabelle System
Manual [36] for further details, especially on isabelle build and isabelle
document.
Ignored material is specified by delimiting the original formal source with
special source comments (*<*) and (*>*). These parts are stripped before
the type-setting phase, without affecting the formal checking of the theory, of
course. For example, we may hide parts of a proof that seem unfit for general
public inspection. The following “fully automatic” proof is actually a fake:
lemma "x 6= (0::int) =⇒ 0 < x * x"
by (auto)
Suppressing portions of printed text demands care. You should not mis-
represent the underlying theory development. It is easy to invalidate the
visible text by hiding references to questionable axioms, for example.
Part II

Logic and Sets

5. The Rules of the Game
This chapter outlines the concepts and techniques that underlie reasoning
in Isabelle. Until now, we have proved everything using only induction and
simplification, but any serious verification project requires more elaborate
forms of inference. The chapter also introduces the fundamentals of predicate
logic. The first examples in this chapter will consist of detailed, low-level proof
steps. Later, we shall see how to automate such reasoning using the methods
blast, auto and others. Backward or goal-directed proof is our usual style,
but the chapter also introduces forward reasoning, where one theorem is
transformed to yield another.
In Isabelle, proofs are constructed using inference rules. The most familiar
inference rule is probably modus ponens:
P → Q    P
──────────
    Q
This rule says that from P → Q and P we may infer Q. In Isabelle notation, the rule for conjunction introduction looks like this:

[[?P; ?Q]] =⇒ ?P ∧ ?Q (conjI)

Carefully examine the syntax. The premises appear to the left of the arrow
and the conclusion to the right. The premises (if more than one) are grouped
using the fat brackets. The question marks indicate schematic variables
(also called unknowns): they may be replaced by arbitrary formulas. If we
use the rule backwards, Isabelle tries to unify the current subgoal with the
conclusion of the rule, which has the form ?P ∧ ?Q. (Unification is discussed
below, Sect. 5.8.) If successful, it yields new subgoals given by the formulas
assigned to ?P and ?Q.
The following trivial proof illustrates how rules work. It also introduces
a style of indentation. If a command adds a new subgoal, then the next
command’s indentation is increased by one space; if it proves a subgoal, then
the indentation is reduced. This provides the reader with hints about the
subgoal structure.
lemma conj_rule: "[[P; Q]] =⇒ P ∧ (Q ∧ P)"
apply (rule conjI)
 apply assumption
apply (rule conjI)
 apply assumption
apply assumption
At the start, Isabelle presents us with the assumptions (P and Q ) and with
the goal to be proved, P ∧ (Q ∧ P). We are working backwards, so when we apply conjunction introduction (rule conjI), the rule removes the outermost occurrence of the ∧ symbol. Isabelle leaves two new subgoals: the two halves of the original conjunction.
The first is simply P, which is trivial, since P is among the assumptions.
We can apply the assumption method, which proves a subgoal by finding a
matching assumption.
1. [[P; Q]] =⇒ Q ∧ P
We are left with the subgoal of proving Q ∧ P from the assumptions P and Q.
We apply rule conjI again.
1. [[P; Q]] =⇒ Q
2. [[P; Q]] =⇒ P
We are left with two new subgoals, Q and P, each of which can be proved
using the assumption method.
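A minimal variation on the same theme (a sketch, not from the original text) exercises conjI and assumption once more:

lemma "P =⇒ P ∧ P"
apply (rule conjI)
apply assumption
apply assumption
done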
5.3 Elimination Rules

Elimination rules work in the opposite direction from introduction rules. In the case of conjunction, there are two such rules: from P ∧ Q we infer P, and from P ∧ Q we infer Q:

P ∧ Q        P ∧ Q
─────        ─────
  P            Q
Now consider disjunction. There are two introduction rules, which resemble inverted forms of the conjunction elimination rules:

  P            Q
─────        ─────
P ∨ Q        P ∨ Q
In a logic text, the disjunction elimination rule might be shown like this:
            [P]     [Q]
             ⋮       ⋮
P ∨ Q        R       R
──────────────────────
            R
The assumptions [P] and [Q] are bracketed to emphasize that they are local
to their subproofs. In Isabelle notation, the already-familiar =⇒ syntax serves
the same purpose:
[[?P ∨ ?Q; ?P =⇒ ?R; ?Q =⇒ ?R]] =⇒ ?R (disjE)
When we use this sort of elimination rule backwards, it produces a case
split. (We have seen this before, in proofs by induction.) The following proof
illustrates the use of disjunction elimination.
lemma disj_swap: "P ∨ Q =⇒ Q ∨ P"
apply (erule disjE)
 apply (rule disjI2)
 apply assumption
apply (rule disjI1)
apply assumption
We assume P ∨ Q and must prove Q ∨ P . Our first step uses the disjunction
elimination rule, disjE . We invoke it using erule, a method designed to work
with elimination rules. It looks for an assumption that matches the rule’s
first premise. It deletes the matching assumption, regards the first premise as
proved and returns subgoals corresponding to the remaining premises. When
we apply erule to disjE, only two subgoals result. This is better than applying
it using rule to get three subgoals, then proving the first by assumption:
the other subgoals would have the redundant assumption P ∨ Q . Most of
the time, erule is the best way to use elimination rules, since it replaces
an assumption by its subformulas; only rarely does the original assumption
remain useful.
1. P =⇒ Q ∨ P
2. Q =⇒ Q ∨ P
These are the two subgoals returned by erule. The first assumes P and the
second assumes Q. Tackling the first subgoal, we need to show Q ∨ P . The
second introduction rule (disjI2 ) can reduce this to P, which matches the
assumption. So, we apply the rule method with disjI2 …
1. P =⇒ P
2. Q =⇒ Q ∨ P
…and finish off with the assumption method. We are left with the other subgoal, which assumes Q.
1. Q =⇒ Q ∨ P
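In the same style (a sketch, not from the original text), erule disjE also disposes of a degenerate disjunction:

lemma "P ∨ P =⇒ P"
apply (erule disjE)
apply assumption
apply assumption
done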
5.4 Destruction Rules: Some Examples

The conjunction elimination rules conjunct1 and conjunct2 (?P ∧ ?Q =⇒ ?P and ?P ∧ ?Q =⇒ ?Q) are destruction rules: applying one of them with drule replaces a conjunction by just one of its halves and thereby destroys the original formula, when usually we want to take both parts of the conjunction as new assumptions. The easiest way to do so is by using an alternative conjunction elimination rule that resembles disjE. It is seldom, if ever, seen in logic books. In Isabelle syntax it looks like this:
[[?P ∧ ?Q; [[?P; ?Q]] =⇒ ?R]] =⇒ ?R (conjE)
Exercise 5.4.1 Use the rule conjE to shorten the proof above.
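One possible answer (a sketch, not the book’s own solution; the name conj_swap2 is made up):

lemma conj_swap2: "P ∧ Q =⇒ Q ∧ P"
apply (erule conjE)
apply (rule conjI)
apply assumption
apply assumption
done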
5.5 Implication
At the start of this chapter, we saw the rule modus ponens. It is, in fact, a
destruction rule. The matching introduction rule looks like this in Isabelle:
(?P =⇒ ?Q) =⇒ ?P −→ ?Q (impI)
And this is modus ponens:
[[?P −→ ?Q; ?P]] =⇒ ?Q (mp)
Here is a proof using the implication rules. This lemma performs a sort
of uncurrying, replacing the two antecedents of a nested implication by a
conjunction. The proof illustrates how assumptions work. At each proof step,
the subgoals inherit the previous assumptions, perhaps with additions or
deletions. Rules such as impI and disjE add assumptions, while applying
erule or drule deletes the matching assumption.
lemma imp_uncurry: "P −→ (Q −→ R) =⇒ P ∧ Q −→ R"
apply (rule impI)
apply (erule conjE)
apply (drule mp)
 apply assumption
apply (drule mp)
 apply assumption
apply assumption
First, we state the lemma and apply implication introduction (rule impI ),
which moves the conjunction to the assumptions.
1. [[P −→ Q −→ R; P ∧ Q]] =⇒ R
Next, we apply conjunction elimination (erule conjE), which splits this conjunction into two parts.
1. [[P −→ Q −→ R; P; Q]] =⇒ R
Next, we apply modus ponens (drule mp) to the assumption P −→ Q −→ R. This produces two subgoals: proving the antecedent P and using the consequent Q −→ R.

1. [[P; Q]] =⇒ P
2. [[P; Q; Q −→ R]] =⇒ R

The first subgoal holds by assumption, leaving

1. [[P; Q; Q −→ R]] =⇒ R

Repeating these steps for Q −→ R yields the conclusion we seek, namely R.
The symbols =⇒ and −→ both stand for implication, but they differ in
many respects. Isabelle uses =⇒ to express inference rules; the symbol is
built-in and Isabelle’s inference mechanisms treat it specially. On the other
hand, −→ is just one of the many connectives available in higher-order logic.
We reason about it using inference rules such as impI and mp, just as we
reason about the other connectives. You will have to use −→ in any context
that requires a formula of higher-order logic. Use =⇒ to separate a theorem’s
preconditions from its conclusion.
The by command is useful for proofs like these that use assumption heavily. It executes an apply command, then tries to prove all remaining subgoals using assumption. Since (if successful) it ends the proof, it also replaces the final done. For example, the proof above can be shortened:
lemma imp_uncurry: "P −→ (Q −→ R) =⇒ P ∧ Q −→ R"
apply (rule impI)
apply (erule conjE)
apply (drule mp)
 apply assumption
by (drule mp)
We could use by to replace the final apply and done in any proof, but
typically we use it to eliminate calls to assumption. It is also a nice way of
expressing a one-line proof.
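For instance (a sketch, not from the original text), the following one-line proof works because by finishes both subgoals of conjI by assumption:

lemma "[[P; Q]] =⇒ Q ∧ P"
by (rule conjI)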
5.6 Negation

Negation causes surprising complexity in proofs. Its natural deduction rules are straightforward: negation introduction deduces ¬P if assuming P leads to a contradiction, while negation elimination deduces any formula from ¬P together with P:

(?P =⇒ False) =⇒ ¬ ?P (notI)
[[¬ ?P; ?P]] =⇒ ?R (notE)

Additional rules help to handle negated assumptions gracefully. The rule contrapos_np lets us exchange a negated assumption with the conclusion. Consider this goal:

lemma "[[¬(P−→Q); ¬(R−→Q)]] =⇒ R"
apply (erule_tac Q = "R−→Q" in contrapos_np)

The former conclusion, namely R, now appears negated among the assumptions, while the negated formula R −→ Q becomes the new conclusion.

We can now apply introduction rules. We use the intro method, which repeatedly applies the given introduction rules; here its effect is equivalent to rule impI.

apply (intro impI)

1. [[¬ (P −→ Q); ¬ R; R]] =⇒ Q
(¬ ?Q =⇒ ?P) =⇒ ?P ∨ ?Q (disjCI)

This rule combines the effects of disjI1 and disjI2. Its great advantage is that
we can remove the disjunction symbol without deciding which disjunction
to prove. This treatment of disjunction is standard in sequent and tableau
calculi.
lemma "(P ∨ Q) ∧ R =⇒ P ∨ (Q ∧ R)"
apply (rule disjCI)
apply (elim conjE disjE)
apply assumption
by (erule contrapos_np, rule conjI)
The first proof step applies the introduction rule disjCI. The resulting
subgoal has the negative assumption ¬(Q ∧ R).
1. [[(P ∨ Q) ∧ R; ¬ (Q ∧ R)]] =⇒ P
Next we apply the elim method, which repeatedly applies elimination rules;
here, the elimination rules given in the command. One of the subgoals is
trivial (apply assumption), leaving us with one other:
1. [[¬ (Q ∧ R); R; Q]] =⇒ P
Now we must move the formula Q ∧ R to be the conclusion. The combination
(erule contrapos_np, rule conjI)
is robust: the conjI forces the erule to select a conjunction. The two subgoals
are the ones we would expect from applying conjunction introduction to
Q ∧ R:
1. [[R; Q; ¬ P]] =⇒ Q
2. [[R; Q; ¬ P]] =⇒ R
They are proved by assumption, which is implicit in the by command.
5.8 Unification and Substitution

As we have seen, Isabelle rules involve schematic variables, which begin with a
question mark and act as placeholders for terms. Unification — well known
to Prolog programmers — is the act of making two terms identical, possibly
replacing their schematic variables by terms. The simplest case is when the
two terms are already the same. Next simplest is pattern-matching, which
replaces variables in only one of the terms. The rule method typically matches
the rule’s conclusion against the current subgoal. The assumption method
matches the current subgoal’s conclusion against each of its assumptions.
Unification can instantiate variables in both terms; the rule method can do
this if the goal itself contains schematic variables. Other occurrences of the
variables in the rule or proof state are updated at the same time.
Schematic variables in goals represent unknown terms. Given a goal such
as ∃x. P, they let us proceed with a proof. They can be filled in later, sometimes in stages and often automatically.
If unification fails when you think it should succeed, try setting the Proof General flag Isabelle > Settings > Trace Unification, which makes Isabelle show the cause of unification failures (in Proof General’s Trace buffer).
The assumption method having failed, we try again with the flag set:
apply assumption
In this trivial case, the output clearly shows that e clashes with c:
Clash: e =/= c
Isabelle expresses substitution through the rule ssubst:

[[?t = ?s; ?P ?s]] =⇒ ?P ?t (ssubst)

It justifies, for example, this derivation of the symmetry of equality:

s = t    s = s
──────────────
     t = s
The attribute THEN, which combines two rules, is described in Sect. 5.15.1 below. The subst method is more powerful than applying the substitution rule. It can perform substitutions in a subgoal’s assumptions. Moreover, if the subgoal contains more than one occurrence of the left-hand side of the equality, the subst method lets us specify which occurrence should be replaced.
By default, Isabelle tries to substitute for all the occurrences. Applying erule
ssubst yields this subgoal:
1. triple (f x) (f x) x =⇒ triple (f x) (f x) (f x)
The substitution should have been done in the first two occurrences of x
only. Isabelle has gone too far. The back command allows us to reject this
possibility and demand a new one:
1. triple (f x) (f x) x =⇒ triple x (f x) (f x)
Now Isabelle has left the first occurrence of x alone. That is promising but it
is not the desired combination. So we use back again:
1. triple (f x) (f x) x =⇒ triple (f x) x (f x)
This also is wrong, so we use back again:
1. triple (f x) (f x) x =⇒ triple x x (f x)
And this one is wrong too. Looking carefully at the series of alternatives, we
see a binary countdown with reversed bits: 111, 011, 101, 001. Invoke back
again:
1. triple (f x) (f x) x =⇒ triple (f x) (f x) x
5.9 Quantifiers
Quantifiers require formalizing syntactic substitution and the notion of an arbitrary value. The universal introduction rule looks like this in Isabelle:

(⋀x. ?P x) =⇒ ∀ x. ?P x (allI)

To see how it works, consider this trivial lemma:

lemma "∀ x. P x −→ P x"
apply (rule allI)

1. ⋀x. P x −→ P x

Note that the resulting proof state has a bound variable, namely x. The rule has replaced the universal quantifier of higher-order logic by Isabelle’s meta-level quantifier. Our goal is to prove P x −→ P x for arbitrary x; it is an implication, so we apply the corresponding introduction rule (impI).

apply (rule impI)

1. ⋀x. P x =⇒ P x
Now consider universal elimination. In a logic text, the rule looks like this:

∀x. P
─────
P[t/x]

The conclusion is P with t substituted for the variable x. Isabelle expresses substitution using a function variable:

∀ x. ?P x =⇒ ?P ?x (spec)

To see how this works, let us derive a law about reducing the scope of a universal quantifier. In mathematical notation we write

∀x. P → Q
─────────
P → ∀x. Q

with the proviso “x not free in P.” Isabelle’s treatment of substitution makes the proviso unnecessary. The conclusion is expressed as P −→ (∀ x. Q x). No substitution for the variable P can introduce a dependence upon x: that would be a bound variable capture. Let us walk through the proof.
lemma "(∀ x. P −→ Q x) =⇒ P −→ (∀ x. Q x)"
First we apply implies introduction (impI), which moves the P from the conclusion to the assumptions. Then we apply universal introduction (allI).
apply (rule impI, rule allI)

1. ⋀x. [[∀ x. P −→ Q x; P]] =⇒ Q x
As before, it replaces the HOL quantifier by a meta-level quantifier, producing
a subgoal that binds the variable x. The leading bound variables (here x )
and the assumptions (here ∀ x. P −→ Q x and P ) form the context for the
conclusion, here Q x. Subgoals inherit the context, although assumptions can
be added or deleted (as we saw earlier), while rules such as allI add bound
variables.
Now, to reason from the universally quantified assumption, we apply the
elimination rule using the drule method. This rule is called spec because it
specializes a universal formula to a particular term.
apply (drule spec)

1. ⋀x. [[P; P −→ Q (?x2 x)]] =⇒ Q x
Observe how the context has changed. The quantified formula is gone, replaced by a new assumption derived from its body. We have removed the
quantifier and replaced the bound variable by the curious term ?x2 x. This
term is a placeholder: it may become any term that can be built from x.
(Formally, ?x2 is an unknown of function type, applied to the argument x.)
This new assumption is an implication, so we can use modus ponens on it,
which concludes the proof.
by (drule mp)
Let us take a closer look at this last step. Modus ponens yields two subgoals:
one where we prove the antecedent (in this case P ) and one where we may
assume the consequent. Both of these subgoals are proved by the assumption
method, which is implicit in the by command. Replacing the by command
by apply (drule mp, assumption) would have left one last subgoal:
1. ⋀x. [[P; Q (?x2 x)]] =⇒ Q x
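A tiny example in the same spirit (a sketch, not from the original text): the placeholder introduced by spec is filled in by the implicit assumption step of by.

lemma "∀ x. P x =⇒ P a"
by (drule spec)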
The concepts just presented also apply to the existential quantifier, whose
introduction rule looks like this in Isabelle:
?P ?x =⇒ ∃ x. ?P x (exI)
If we can exhibit some x such that P(x) is true, then ∃x.P(x) is also true.
It is a dual of the universal elimination rule, and logic texts present it using
the same notation for substitution.
The existential elimination rule looks like this in a logic text:
[P]
..
..
∃x. P Q
Q
It looks like this in Isabelle:
[[∃ x. ?P x; ⋀x. ?P x =⇒ ?Q]] =⇒ ?Q (exE)
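For example (a sketch, not from the original text), exI proves an existential goal from a witness found by unification:

lemma "P a =⇒ ∃ x. P x"
apply (rule exI)
apply assumption
done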
Note that drule spec removes the universal quantifier and — as usual with
elimination rules — discards the original formula. Sometimes, a universal
formula has to be kept so that it can be used again. Then we use a new
method: frule. It acts like drule but copies rather than replaces the selected
assumption. The f is for forward.
In this example, going from P a to P(h(h a)) requires two uses of the
quantified assumption, one for each h in h(h a).
lemma "[[∀ x. P x −→ P (h x); P a]] =⇒ P(h (h a))"
Examine the subgoal left by frule:
apply (frule spec)
1. [[∀ x. P x −→ P (h x); P a; P ?x −→ P (h ?x)]] =⇒ P (h (h a))
It is what drule would have left except that the quantified assumption is still
present. Next we apply mp to the implication and the assumption P a:
apply (drule mp, assumption)
1. [[∀ x. P x −→ P (h x); P a; P (h a)]] =⇒ P (h (h a))
We have created the assumption P(h a), which is progress. To continue the
proof, we apply spec again. We shall not need it again, so we can use drule.
apply (drule spec)
1. [[P a; P (h a); P ?x2 −→ P (h ?x2)]] =⇒ P (h (h a))
The new assumption bridges the gap between P(h a) and P(h(h a)).
by (drule mp)
We can also instantiate the quantifier explicitly rather than leaving the choice to unification. Here the proof requires instantiating the quantified assumption with the term h a.
apply (drule_tac x = "h a" in spec)
1. [[P a; P (h a); P (h a) −→ P (h (h a))]] =⇒ P (h (h a))
We have forced the desired instantiation.
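In the same manner (a sketch, not from the original text), drule_tac can pick the instance of any universal assumption directly:

lemma "∀ x. P x =⇒ P a"
apply (drule_tac x = "a" in spec)
apply assumption
done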
Existential formulas can be instantiated too. The next example uses the
divides relation of number theory:
?m dvd ?n ≡ ∃ k. ?n = ?m * k (dvd_def)
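For instance (a sketch, not from the original text), unfolding dvd_def turns a divisibility goal into an existential one, which can then be instantiated explicitly:

lemma "m dvd (m * n :: nat)"
apply (unfold dvd_def)
apply (rule_tac x = "n" in exI)
apply (rule refl)
done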
5.10 Description Operators

!! Description operators can be hard to reason about. Novices should try to avoid them. Fortunately, descriptions are seldom required.
HOL’s definite description operator, written THE x. P x, denotes the x such that P x is true, provided exactly one such x exists. For example, we can prove that it yields the least number satisfying P:

lemma "[[P (k::nat); ∀ x. P x −→ k ≤ x]] =⇒ (LEAST x. P x) = k"
apply (simp add: Least_def)

1. [[P k; ∀ x. P x −→ k ≤ x]]
   =⇒ (THE x. P x ∧ (∀ y. P y −→ x ≤ y)) = k

The first step has merely unfolded the definition.
apply (rule the_equality)
1. [[P k; ∀ x. P x −→ k ≤ x]] =⇒ P k ∧ (∀ y. P y −→ k ≤ y)
2. ⋀x. [[P k; ∀ x. P x −→ k ≤ x; P x ∧ (∀ y. P y −→ x ≤ y)]]
      =⇒ x = k
As always with the_equality, we must show existence and uniqueness of the
claimed solution, k. Existence, the first subgoal, is trivial. Uniqueness, the
second subgoal, follows by antisymmetry:
[[x ≤ y; y ≤ x]] =⇒ x = y (order_antisym)
The assumptions imply both k ≤ x and x ≤ k. One call to auto does it all:
by (auto intro: order_antisym)
The indefinite description operator SOME, also known as Hilbert’s ε-operator, picks some value satisfying a predicate. A typical use is the definition of the inverse of a function:

inv f ≡ λy. SOME x. f x = y (inv_def)

The inverse of f, when applied to y, returns some x such that f x = y. Using SOME rather than THE makes inv f behave well even if f is not injective. As it happens, most useful theorems about inv do assume the function to be injective. For example, we can prove that inv Suc really is the inverse of the Suc function:
lemma "inv Suc (Suc n) = n"
by (simp add: inv_def)
The proof is a one-liner: the subgoal simplifies to a degenerate application of
SOME, which is then erased. In detail, the left-hand side simplifies to SOME x.
Suc x = Suc n, then to SOME x. x = n and finally to n.
We know nothing about what inv Suc returns when applied to zero. The proof above still treats SOME as a definite description, since it only reasons about situations in which the value is described uniquely. Indeed, SOME satisfies this rule:
[[P a; ⋀x. P x =⇒ x = a]] =⇒ (SOME x. P x) = a (some_equality)
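A tiny consequence (a sketch, not from the original text): the description of the values equal to a picks out a itself.

lemma "(SOME x. x = a) = a"
apply (rule some_equality)
apply (rule refl)
apply assumption
done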
For example, the Axiom of Choice is an easy theorem in this setting:

lemma "∀ x. ∃ y. P x y =⇒ ∃ f. ∀ x. P x (f x)"
apply (rule exI, rule allI)

1. ⋀x. ∀ x. ∃ y. P x y =⇒ P x (?f x)
We have applied the introduction rules; now it is time to apply the elimination
rules.
³ In fact, inv is defined via a second constant inv_into, which we ignore here.
apply (drule spec, erule exE)

1. ⋀x y. P (?x2 x) y =⇒ P x (?f x)

The proof is completed by the rule someI, which instantiates ?f to the choice function λx. SOME y. P x y:

by (rule someI)
An existential formula can likewise be replaced by a description:

∃ x. ?P x =⇒ ?P (SOME x. ?P x)

This rule is seldom used for that purpose — it can cause exponential blow-up — but it is occasionally used as an introduction rule for the ε-operator. Its name in HOL is someI_ex.
5.11 Some Proofs That Fail

Most of the examples in this tutorial involve proving theorems. But not every
conjecture is true, and it can be instructive to see how proofs fail. Here we
attempt to prove a distributive law involving the existential quantifier and
conjunction.
lemma "(∃ x. P x) ∧ (∃ x. Q x) =⇒ ∃ x. P x ∧ Q x"
The first steps are routine. We apply conjunction elimination to break the assumption into two existentially quantified assumptions. Applying existential elimination removes one of the quantifiers.
apply (erule conjE)
apply (erule exE)

1. ⋀x. [[∃ x. Q x; P x]] =⇒ ∃ x. P x ∧ Q x

When we remove the other quantifier, we get a different bound variable in the subgoal; the name xa is generated automatically.

apply (erule exE)

1. ⋀x xa. [[P x; Q xa]] =⇒ ∃ x. P x ∧ Q x
The proviso of the existential elimination rule has forced the variables to
differ: we can hardly expect two arbitrary values to be equal! There is no
way to prove this subgoal. Removing the conclusion’s existential quantifier
yields two identical placeholders, which can become any term involving the
variables x and xa. We need one to become x and the other to become xa,
but Isabelle requires all instances of a placeholder to be identical.
We can prove either subgoal using the assumption method. If we prove the
first one, the placeholder changes into x.
apply assumption

1. ⋀x xa. [[P x; Q xa]] =⇒ Q x
We are left with a subgoal that cannot be proved. Applying the assumption
method results in an error message:
*** empty result sequence -- proof command failed
When interacting with Isabelle via the shell interface, you can abandon a
proof using the oops command.
Here is another abortive proof, illustrating the interaction between bound
variables and unknowns. If R is a reflexive relation, is there an x such that
R x y holds for all y? Let us see what happens when we attempt to prove it.
lemma "∀ y. R y y =⇒ ∃ x. ∀ y. R x y"
First, we remove the existential quantifier. The new proof state has an unknown, namely ?x.
apply (rule exI)
1. ∀ y. R y y =⇒ ∀ y. R ?x y
It looks like we can just apply assumption, but it fails. Isabelle refuses to
substitute y, a bound variable, for ?x ; that would be a bound variable capture.
We can still try to finish the proof in some other way. We remove the universal
quantifier from the conclusion, moving the bound variable y into the subgoal.
But note that it is still bound!
apply (rule allI)

1. ⋀y. ∀ y. R y y =⇒ R ?x y

Finally, we try to apply our reflexivity assumption. We obtain a new assumption whose identical placeholders may be replaced by any term involving y.

apply (drule spec)

1. ⋀y. R (?z2 y) (?z2 y) =⇒ R ?x y
This subgoal can only be proved by putting y for all the placeholders, making
the assumption and conclusion become R y y. Isabelle can replace ?z2 y by
y ; this involves instantiating ?z2 to the identity function. But, just as two
steps earlier, Isabelle refuses to substitute y for ?x. This example is typical
of how Isabelle enforces sound quantifier reasoning.
The next example is a logic problem composed by Lewis Carroll. The blast
method finds it trivial. Moreover, it turns out that not all of the assumptions
are necessary. We can experiment with variations of this formula and see
which ones can be proved.
lemma "(∀ x. honest(x) ∧ industrious(x) −→ healthy(x)) ∧
¬ (∃ x. grocer(x) ∧ healthy(x)) ∧
(∀ x. industrious(x) ∧ grocer(x) −→ honest(x)) ∧
(∀ x. cyclist(x) −→ industrious(x)) ∧
(∀ x. ¬healthy(x) ∧ cyclist(x) −→ ¬honest(x))
−→ (∀ x. grocer(x) −→ ¬cyclist(x))"
by blast
The blast method is also effective for set theory, which is described in the
next chapter. The formula below may look horrible, but the blast method
proves it in milliseconds.
lemma "(S i∈I. S
A(i)) ∩ ( j∈J. B(j)) =
S S
( i∈I. j∈J. A(i) ∩ B(j))"
by blast
Few subgoals are couched purely in predicate logic and set theory. We can
extend the scope of the classical reasoner by giving it new rules. Extending
it effectively requires understanding the notions of introduction, elimination
and destruction rules. Moreover, there is a distinction between safe and unsafe rules. A safe rule is one that can be applied backwards without losing information; an unsafe rule may transform a provable goal into an unprovable one.

5.13 Other Classical Reasoning Methods

The clarify method performs all the obvious, safe reasoning steps but does not split the goal and, in particular, does not apply unsafe rules that could render the goal unprovable. By performing the obvious steps, clarify lays bare the difficult parts of the problem, where human intervention is necessary.
For example, the following conjecture is false:
lemma "(∀ x. P x) ∧ (∃ x. Q x) −→ (∀ x. P x ∧ Q x)"
apply clarify
The blast method would simply fail, but clarify presents a subgoal that
helps us see why we cannot continue the proof.
1. ⋀x xa. [[∀ x. P x; Q xa]] =⇒ P x ∧ Q x
The proof must fail because the assumption Q xa and conclusion Q x refer to
distinct bound variables. To reach this state, clarify applied the introduction
rules for −→ and ∀ and the elimination rule for ∧. It did not apply the
introduction rule for ∧ because of its policy never to split goals.
Also available is clarsimp, a method that interleaves clarify and simp; and there is safe, which like clarify performs obvious steps but also applies those that split goals.
The force method applies the classical reasoner and simplifier to one goal.
Unless it can prove the goal, it fails. Contrast that with the auto method,
which also combines classical reasoning with simplification. The latter’s purpose is to prove all the easy subgoals and parts of subgoals. Unfortunately,
it can produce large numbers of new subgoals; also, since it proves some
subgoals and splits others, it obscures the structure of the proof tree. The
force method does not have these drawbacks. Another difference: force tries
harder than auto to prove its goal, so it can take much longer to terminate.
Older components of the classical reasoner have largely been superseded
by blast, but they still have niche applications. Most important among these
are fast and best. While blast searches for proofs using a built-in first-order reasoner, these earlier methods search for proofs using standard Isabelle
inference. That makes them slower but enables them to work in the presence
of the more unusual features of Isabelle rules, such as type classes and function
unknowns. For example, recall the introduction rule for Hilbert’s ε-operator:
?P ?x =⇒ ?P (SOME x. ?P x) (someI)
The repeated occurrence of the variable ?P makes this rule tricky to apply.
Consider this contrived example:
lemma "[[Q a; P a]]
=⇒ P (SOME x. P x ∧ Q x) ∧ Q (SOME x. P x ∧ Q x)"
apply (rule someI)
We can apply rule someI explicitly. It yields the following subgoal:
1. [[Q a; P a]] =⇒ P ?x ∧ Q ?x
The proof from this point is trivial. Could we have proved the theorem with a single command? Not using blast: it cannot perform the higher-order unification needed here. The fast method succeeds:

apply (fast intro!: someI)
5.14 Finding More Theorems
In Sect. 3.1.11, we introduced Proof General’s Find button for finding theorems in the database via pattern matching. If we are inside a proof, we can be more specific; we can search for introduction, elimination and destruction rules with respect to the current goal. For this purpose, Find provides three additional search criteria: intro, elim and dest.
For example, given the goal
1. A ∧ B
you can click on Find and type in the search expression intro. You will
be shown a few rules ending in =⇒ ?P ∧ ?Q, among them conjI . You may
even discover that the very theorem you are trying to prove is already in the
database. Given the goal
1. A −→ A
the search for intro finds not just impI but also imp_refl: ?P −→ ?P.
As before, search criteria can be combined freely: for example,
"_ @ _" intro
searches for all introduction rules that match the current goal and mention
the @ function.
Searching for elimination and destruction rules via elim and dest is analogous to intro but takes the assumptions into account, too.
5.15 Forward Proof: Transforming Theorems
Forward proof means deriving new facts from old ones. It is the most fundamental type of proof. Backward proof, by working from goals to subgoals, can
help us find a difficult proof. But it is not always the best way of presenting
the proof thus found. Forward proof is particularly good for reasoning from
the general to the specific. For example, consider this distributive law for the
greatest common divisor:
k × gcd(m, n) = gcd(k × m, k × n)

Putting m = 1 we get (since gcd(1, n) = 1 and k × 1 = k)

k = gcd(k, k × n)

and substituting 1 for n then yields

gcd(k, k) = k
Let us reproduce our examples in Isabelle. Recall that in Sect. 3.5.3 we declared the recursive function gcd:
fun gcd :: "nat ⇒ nat ⇒ nat" where
"gcd m n = (if n=0 then m else gcd n (m mod n))"
From this definition, it is possible to prove the distributive law. That takes
us to the starting point for our example.
?k * gcd ?m ?n = gcd (?k * ?m) (?k * ?n) (gcd_mult_distrib2)
The keyword lemmas declares a new theorem, which can be derived from an existing one using attributes such as [of k 1]:

lemmas gcd_mult_0 = gcd_mult_distrib2 [of k 1]

The command thm gcd_mult_0 displays the result:

k * gcd 1 ?n = gcd (k * 1) (k * ?n)
Simplification and reorientation can be performed by further attributes:

lemmas gcd_mult = gcd_mult_distrib2 [of k 1, simplified, THEN sym]

gcd k (k * ?n) = k

The directives, or attributes, are processed from left to right. This declaration of gcd_mult is equivalent to the previous one.
Such declarations can make the proof script hard to read. Better is to
state the new lemma explicitly and to prove it using a single rule method
whose operand is expressed using forward reasoning:
lemma gcd_mult [simp]: "gcd k (k*n) = k"
by (rule gcd_mult_distrib2 [of k 1, simplified, THEN sym])
Compared with the previous proof of gcd_mult, this version shows the reader
what has been proved. Also, the result will be processed in the normal way.
In particular, Isabelle generalizes over all variables: the resulting theorem will
have ?k instead of k .
At the start of this section, we also saw a proof of gcd(k, k) = k. Here is
the Isabelle version:
lemma gcd_self [simp]: "gcd k k = k"
by (rule gcd_mult [of k 1, simplified])
5.16 Forward Reasoning in a Backward Proof

We have seen that the forward proof directives work well within a backward
proof. There are many ways to achieve a forward style using our existing
proof methods. We shall also meet some new methods that perform forward
reasoning.
The methods drule, frule, drule_tac, etc., reason forward from a subgoal.
We have seen them already, using rules such as mp and spec to operate on
formulae. They can also operate on terms, using rules such as these:
x = y =⇒ f x = f y (arg_cong)
i ≤ j =⇒ i * k ≤ j * k (mult_le_mono1)
For example, let us prove a fact about divisibility in the natural numbers:
lemma "2 ≤ u =⇒ u*m ≠ Suc(u*n)"
apply (intro notI)

1. [[2 ≤ u; u * m = Suc (u * n)]] =⇒ False
The key step is to apply the function …mod u to both sides of the equation
u*m = Suc(u*n):
apply (drule_tac f="λx. x mod u" in arg_cong)
1. [[2 ≤ u; u * m mod u = Suc (u * n) mod u]] =⇒ False
Simplification reduces the left side to 0 and the right side to 1, yielding the
required contradiction.
apply (simp add: mod_Suc)
done
The insert method inserts a given theorem as a new assumption of all subgoals. This already is a forward step; moreover, we may (as always when
using a theorem) apply of, THEN and other directives. The new assumption
can then be used to help prove the subgoals.
For example, consider this theorem about the divides relation. The first
proof step inserts the distributive law for gcd. We specify its variables as
shown.
lemma relprime_dvd_mult:
"[[ gcd k n = 1; k dvd m*n ]] =⇒ k dvd m"
apply (insert gcd_mult_distrib2 [of m k n])
In the resulting subgoal, note how the equation has been inserted:
1. [[gcd k n = 1; k dvd m * n; m * gcd k n = gcd (m * k) (m * n)]]
=⇒ k dvd m
The next proof step utilizes the assumption gcd k n = 1 (note that Suc 0 is
another expression for 1):
apply(simp)
1. [[gcd k n = Suc 0; k dvd m * n; m = gcd (m * k) (m * n)]]
=⇒ k dvd m
Simplification has yielded an equation for m. The rest of the proof is omitted.
Here is another demonstration of insert. Division and remainder obey a
well-known law:
(?m div ?n) * ?n + ?m mod ?n = ?m (div_mult_mod_eq)
lemma div_mult_self_is_m:
"0<n =⇒ (m*n) div n = (m::nat)"
apply (insert div_mult_mod_eq [of "m*n" n])
apply (simp)
done
The first step inserts the law, specifying m*n and n for its variables. Notice
that non-trivial expressions must be enclosed in quotation marks. Here is the
resulting subgoal, with its new assumption:
1. [[0 < n; (m * n) div n * n + (m * n) mod n = m * n]]
=⇒ (m * n) div n = m
Simplification reduces (m * n) mod n to zero. Then it cancels the factor n on
both sides of the equation (m * n) div n * n = m * n, proving the theorem.
The method subgoal_tac adds a formula as a new assumption, leaving the subgoal of proving it. Consider this example:

lemma "[[(z::int) < 37; 66 < 2*z; z*z ≠ 1225; Q(34); Q(36)]] =⇒ Q(z)"
apply (subgoal_tac "z = 34 ∨ z = 36")
apply blast
apply (subgoal_tac "z ≠ 35")
apply arith
apply force
done

The first subgoal is trivial (blast), but for the second Isabelle needs help to eliminate the case z = 35. The second invocation of subgoal_tac leaves two subgoals:

1. [[z < 37; 66 < 2 * z; z * z ≠ 1225; Q 34; Q 36; z ≠ 35]]
   =⇒ z = 34 ∨ z = 36
2. [[z < 37; 66 < 2 * z; z * z ≠ 1225; Q 34; Q 36]]
   =⇒ z ≠ 35

Assuming that z is not 35, the first subgoal follows by linear arithmetic (arith). For the second subgoal we apply the method force, which proceeds by assuming that z = 35 and arriving at a contradiction.
Summary of these methods:
– insert adds a theorem as a new assumption
– subgoal_tac adds a formula as a new assumption and leaves the subgoal of
proving that formula
5.17 Managing Large Proofs

Naturally you should try to divide proofs into manageable parts. Look for
lemmas that can be proved separately. Sometimes you will observe that they
are instances of much simpler facts. On other occasions, no lemmas suggest
themselves and you are forced to cope with a long proof involving many
subgoals.
If the proof is long, perhaps it at least has some regularity. Then you can
express it more concisely using tacticals, which provide control structures.
Here is a proof (it would be a one-liner using blast, but forget that) that
contains a series of repeated commands:
lemma "[[P−→Q; Q−→R; R−→S; P]] =⇒ S"
apply (drule mp, assumption)
apply (drule mp, assumption)
apply (drule mp, assumption)
apply (assumption)
done
Each of the three identical commands finds an implication and proves its
antecedent by assumption. The first one finds P−→Q and P, concluding Q ; the
second one concludes R and the third one concludes S. The final step matches
the assumption S with the goal to be proved.
Suffixing a method with a plus sign (+) expresses one or more repetitions:

lemma "[[P−→Q; Q−→R; R−→S; P]] =⇒ S"
by (drule mp, assumption)+

Using by takes care of the final use of assumption.
All methods apply to the first subgoal. Sometimes, not only in a large
proof, you may want to focus on some other subgoal. Then you should try
the commands defer or prefer.
In the following example, the first subgoal looks hard, while the others
look as if blast alone could prove them:
1. hard
2. ¬ ¬ P =⇒ P
3. Q =⇒ Q
The defer command moves the first subgoal into the last position.
defer 1
1. ¬ ¬ P =⇒ P
2. Q =⇒ Q
3. hard
Using defer, we have cleared away the trivial parts of the proof so that we
can devote attention to the difficult part.
The prefer command moves the specified subgoal into the first position.
For example, if you suspect that one of your subgoals is invalid (not a theorem), then you should investigate that subgoal first. If it cannot be proved,
then there is no point in proving the other subgoals.
1. ok1
2. ok2
3. doubtful
5.18 Proving the Correctness of Euclid’s Algorithm

The chapter closes with a proof that gcd computes the greatest common divisor of its arguments. The key lemma states k dvd m −→ k dvd n −→ k dvd gcd m n; it is proved by induction following the recursion of gcd, with a case distinction on whether n = 0, which yields two subgoals. In the first, where n=0, the implication becomes trivial: k dvd gcd m n goes to k dvd m. The second subgoal is proved by an unfolding of gcd, using this rule about divides:
[[?f dvd ?m; ?f dvd ?n]] =⇒ ?f dvd ?m mod ?n (dvd_mod)
This theorem concisely expresses the correctness of the gcd function. We state it with the iff attribute so that Isabelle can use it to remove some occurrences of gcd. The theorem has a one-line proof using blast supplied with two additional introduction rules. The exclamation mark (intro!) signifies safe rules, which are applied aggressively. Rules given without the exclamation mark are applied reluctantly and their uses can be undone if the search backtracks. Here the unsafe rule expresses transitivity of the divides relation:

[[?m dvd ?n; ?n dvd ?p]] =⇒ ?m dvd ?p (dvd_trans)

theorem gcd_greatest_iff [iff]:
"(k dvd gcd m n) = (k dvd m ∧ k dvd n)"
by (blast intro!: gcd_greatest intro: dvd_trans)
6. Sets, Functions and Relations

This chapter describes the formalization of typed set theory, which is the
basis of much else in HOL. For example, an inductive definition yields a set,
and the abstract theories of relations regard a relation as a set of pairs. The
chapter introduces the well-known constants such as union and intersection,
as well as the main operations on relations, such as converse, composition
and transitive closure. Functions are also covered. They are not sets in HOL,
but many of their properties concern sets: the range of a function is a set,
and the inverse image of a function maps sets to sets.
This chapter will be useful to anybody who plans to develop a substantial
proof. Sets are convenient for formalizing computer science concepts such
as grammars, logical calculi and state transition systems. Isabelle can prove
many statements involving sets automatically.
This chapter ends with a case study concerning model checking for the
temporal logic CTL. Most of the other examples are simple. The chapter
presents a small selection of built-in theorems in order to point out some key
properties of the various constants and to introduce you to the notation.
Natural deduction rules are provided for the set theory constants, but
they are seldom used directly, so only a few are presented here.
6.1 Sets
HOL’s set theory should not be confused with traditional, untyped set theory,
in which everything is a set. Our sets are typed. In a given set, all elements
have the same type, say τ , and the set itself has type τ set.
We begin with intersection, union and complement. In addition to
the membership relation, there is a symbol for its negation. These points
can be seen below.
Here are the natural deduction rules for intersection. Note the resemblance
to those for conjunction.
[[c ∈ A; c ∈ B]] =⇒ c ∈ A ∩ B (IntI)
c ∈ A ∩ B =⇒ c ∈ A (IntD1)
c ∈ A ∩ B =⇒ c ∈ B (IntD2)
Here are two of the many installed theorems concerning set complement.
Note that it is denoted by a minus sign.
(c ∈ - A) = (c ∉ A) (Compl_iff)
- (A ∪ B) = - A ∩ - B (Compl_Un)
Set difference is the intersection of a set with the complement of another
set. Here we also see the syntax for the empty set and for the universal set.
A ∩ (B - A) = {} (Diff_disjoint)
A ∪ - A = UNIV (Compl_partition)
The subset relation holds between two sets just if every element of one
is also an element of the other. This relation is reflexive. These are its natural
deduction rules:
(⋀x. x ∈ A =⇒ x ∈ B) =⇒ A ⊆ B (subsetI)
[[A ⊆ B; c ∈ A]] =⇒ c ∈ B (subsetD)
In harder proofs, you may need to apply subsetD giving a specific term for c.
However, blast can instantly prove facts such as this one:
(A ∪ B ⊆ C) = (A ⊆ C ∧ B ⊆ C) (Un_subset_iff)
Here is a failed attempt to prove a statement about set complement, given without type constraints:

lemma "(A <= -B) = (B <= -A)"
apply blast

The proof fails. It is not a statement about sets, due to overloading; the relation symbol <= can be any relation, not just subset. In this general form, the statement is not valid. Putting in a type constraint forces the variables to denote sets, allowing the proof to succeed:

lemma "((A:: 'a set) <= -B) = (B <= -A)"
by blast
Finite sets are expressed using the constant insert, which is a form of union:
insert a A = {a} ∪ A (insert_is_Un)
The finite set expression {a,b} abbreviates insert a (insert b {}). Many
facts about finite sets can be proved automatically:
lemma "{a,b} ∪ {c,d} = {a,b,c,d}"
by blast
Not everything that we would like to prove is valid. Consider this attempt:
lemma "{a,b} ∩ {b,c} = {b}"
apply auto
The proof fails, leaving the subgoal b=c. To see why it fails, consider a correct
version:
lemma "{a,b} ∩ {b,c} = (if a=c then {a,b} else {b})"
apply simp
by blast
Our mistake was to suppose that the various items were distinct. Another
remark: this proof uses two methods, namely simp and blast. Calling simp
eliminates the if -then-else expression, which blast cannot break down. The
combined methods (namely force and auto) can prove this fact in one step.
The set comprehension {x. P} expresses the set of all elements that satisfy the
predicate P. Two laws describe the relationship between set comprehension
and the membership relation:
(a ∈ {x. P x}) = P a (mem_Collect_eq)
{x. x ∈ A} = A (Collect_mem_eq)
Facts such as the following have trivial proofs:

lemma "{x. P x ∨ x ∈ A} = {x. P x} ∪ A"
by blast

Isabelle also provides a general comprehension syntax {e | x y. P}, which abbreviates {z. ∃ x y. z = e ∧ P}. It offers a neat way of expressing any set given by an expression built up from variables under specific constraints. The drawback is that it hides the true form of the expression, with its existential quantifiers.
Remark. We do not need sets at all. They are essentially equivalent to
predicate variables, which are allowed in higher-order logic. The main benefit
of sets is their notation; we can write x∈A and {z. P} where predicates would
require writing A(x) and λz. P.
Universal and existential quantifications may range over sets, with the obvious meaning. Here are the natural deduction rules for the bounded universal quantifier. Occasionally you will need to apply bspec with an explicit instantiation of the variable x:

(⋀x. x ∈ A =⇒ P x) =⇒ ∀ x∈A. P x (ballI)
[[∀ x∈A. P x; x ∈ A]] =⇒ P x (bspec)
Dually, here are the natural deduction rules for the bounded existential quantifier. You may need to apply bexI with an explicit instantiation:

[[P x; x ∈ A]] =⇒ ∃ x∈A. P x (bexI)
[[∃ x∈A. P x; ⋀x. [[x ∈ A; P x]] =⇒ Q]] =⇒ Q (bexE)
Unions can be formed over the values of a given set. The syntax is ⋃ x∈A. B or UN x:A. B in ASCII. Indexed union satisfies this basic law:

(b ∈ (⋃ x∈A. B x)) = (∃ x∈A. b ∈ B x) (UN_iff)

It has two natural deduction rules similar to those for the existential quantifier. Sometimes UN_I must be applied explicitly:

[[a ∈ A; b ∈ B a]] =⇒ b ∈ (⋃ x∈A. B x) (UN_I)
[[b ∈ (⋃ x∈A. B x); ⋀x. [[x ∈ A; b ∈ B x]] =⇒ R]] =⇒ R (UN_E)

The following built-in abbreviation (see Sect. 4.1.4) lets us express the union over a type:

(⋃ x. B x) ≡ (⋃ x∈UNIV. B x)

We may also express the union of a set of sets, written ⋃C, or Union C in ASCII:

(A ∈ ⋃C) = (∃ X∈C. A ∈ X) (Union_iff)
Intersections are treated dually, although they seem to be used less often than unions. The syntax below would be INT x:A. B and Inter C in ASCII. Among others, these theorems are available:

(b ∈ (⋂ x∈A. B x)) = (∀ x∈A. b ∈ B x) (INT_iff)
(A ∈ ⋂C) = (∀ X∈C. A ∈ X) (Inter_iff)
Isabelle uses logical equivalences such as those above in automatic proof.
Unions, intersections and so forth are not simply replaced by their definitions.
Instead, membership tests are simplified. For example, x ∈ A ∪ B is replaced
by x ∈ A ∨ x ∈ B.
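For example (a small sketch, not from the original text), this membership rewriting lets blast settle mixed statements at once:

lemma "x ∈ A ∪ (B ∩ C) =⇒ x ∈ A ∪ B"
by blast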
The predicate finite holds of all finite sets. Isabelle/HOL includes many
familiar theorems about finiteness and cardinality (card). For example, we
have theorems concerning the cardinalities of unions, intersections and the
powerset:
[[finite A; finite B]]
=⇒ card A + card B = card (A ∪ B) + card (A ∩ B) (card_Un_Int)
finite A =⇒
card {B. B ⊆ A ∧ card B = k} = card A choose k (n_subsets)
6.2 Functions
This section describes a few concepts that involve functions. Some of the more
important theorems are given along with the names. A few sample proofs
appear. Unlike with set theory, however, we cannot simply state lemmas and
expect them to be proved using blast.
Two functions are equal if they yield equal results given equal arguments.
This is the principle of extensionality for functions:
(⋀x. f x = g x) =⇒ f = g (ext)
Function update is useful for modelling machine states. It has the obvious definition and many useful facts are proved about it. In particular, the following equation is installed as a simplification rule:
(f(x:=y)) z = (if z = x then y else f z) (fun_upd_apply)
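For example (a sketch, not from the original text), the updated point returns the new value while every other point is unchanged:

lemma "(f(x := y)) x = y ∧ (z ≠ x −→ (f(x := y)) z = f z)"
by simp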
A function f is injective if it maps distinct arguments to distinct results. For example, an injective f can be cancelled from both sides of an equation between compositions:

lemma "inj f =⇒ (f o g = f o h) = (g = h)"
apply (simp add: inj_on_def expand_fun_eq)

1. ∀ x y. f x = f y −→ x = y =⇒
   (∀ x. f (g x) = f (h x)) = (∀ x. g x = h x)

This can be proved using the auto method.
The image of a set under a function is a most useful notion. It has the
obvious definition:
f ` A ≡ {y. ∃ x∈A. y = f x} (image_def)
Laws involving image can often be proved automatically. Here are two
examples, illustrating connections with indexed union and with the general
syntax for comprehension:
lemma "f`A ∪ g`A = ( x∈A. {f x, g x})"
S
A function’s range is the set of values that the function can take on. It
is, in fact, the image of the universal set under that function. There is no
constant range. Instead, range abbreviates an application of image to UNIV :
range f ≡ f`UNIV
Few theorems are proved specifically for range; in most cases, you should look
for a more general theorem concerning images.
Inverse image is also useful. It is defined as follows:
f -` B ≡ {x. f x ∈ B} (vimage_def)
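For instance (a sketch, not from the original text), every set is contained in the inverse image of its image:

lemma "A ⊆ f -` (f ` A)"
by auto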
6.3 Relations
A relation is a set of pairs. As such, the set operations apply to them. For
instance, we may form the union of two relations. Other primitives are defined
specifically for relations.
The identity relation, also known as equality, has the obvious definition:
Id ≡ {p. ∃ x. p = (x,x)} (Id_def)
Composition of relations (the infix O ) is also available:
r O s = {(x,z). ∃ y. (x,y) ∈ s ∧ (y,z) ∈ r} (relcomp_unfold)
This is one of the many lemmas proved about these concepts:
R O Id = R (R_O_Id)
Composition is monotonic, as are most of the primitives appearing in this
chapter. We have many theorems similar to the following one:
[[r' ⊆ r; s' ⊆ s]] =⇒ r' O s' ⊆ r O s (relcomp_mono)
The converse or inverse of a relation exchanges the roles of the two
operands. We use the postfix notation r⁻¹, written r^-1 in ASCII.

((a,b) ∈ r⁻¹) = ((b,a) ∈ r) (converse_iff)

Here is a typical law proved about converse and composition:

(r O s)⁻¹ = s⁻¹ O r⁻¹ (converse_relcomp)
The image of a set under a relation is defined analogously to image under
a function:
(b ∈ r `` A) = (∃ x∈A. (x,b) ∈ r) (Image_iff)
It satisfies many similar laws.
The domain and range of a relation are defined in the standard way:
(a ∈ Domain r) = (∃ y. (a,y) ∈ r) (Domain_iff)
(a ∈ Range r) = (∃ y. (y,a) ∈ r) (Range_iff)
Iterated composition of a relation is available. The notation overloads that
of exponentiation. Two simplification rules are installed:
R ^ 0 = Id
R ^ Suc n = R O R^n
The reflexive transitive closure of a relation r is written r∗, or r^* in ASCII. Idempotence is one of the laws proved about it:

(r∗)∗ = r∗ (rtrancl_idemp)
The transitive closure is similar. The ASCII syntax is r^+. It has two introduction rules:

p ∈ r =⇒ p ∈ r⁺ (r_into_trancl)
[[(a, b) ∈ r⁺; (b, c) ∈ r⁺]] =⇒ (a, c) ∈ r⁺ (trancl_trans)
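As a one-step check (a sketch, not from the original text), the first introduction rule applies directly:

lemma "(a, b) ∈ r =⇒ (a, b) ∈ r⁺"
by (rule r_into_trancl)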
The induction rule resembles the one shown above. A typical lemma states
that transitive closure commutes with the converse operator:
(r⁻¹)⁺ = (r⁺)⁻¹ (trancl_converse)
The reflexive transitive closure also commutes with the converse operator. Let
us examine the proof. Each direction of the equivalence is proved separately.
The two proofs are almost identical. Here is the first one:
lemma rtrancl_converseD: "(x,y) ∈ (r⁻¹)∗ =⇒ (y,x) ∈ r∗"
apply (erule rtrancl_induct)
apply (rule rtrancl_refl)
apply (blast intro: rtrancl_trans)
done
The first step of the proof applies induction, leaving these subgoals:
1. (x, x) ∈ r∗
2. ⋀y z. [[(x,y) ∈ (r⁻¹)∗; (y,z) ∈ r⁻¹; (y,x) ∈ r∗]]
      =⇒ (z,x) ∈ r∗
The first subgoal is trivial by reflexivity. The second follows by first eliminating the converse operator, yielding the assumption (z,y) ∈ r, and then applying the introduction rules shown above. The same proof script handles the other direction:
lemma rtrancl_converseI: "(y,x) ∈ r∗ =⇒ (x,y) ∈ (r⁻¹)∗"
apply (erule rtrancl_induct)
apply (rule rtrancl_refl)
apply (blast intro: rtrancl_trans)
done
!! This trivial proof (of the combined equation (r⁻¹)∗ = (r∗)⁻¹) requires auto rather than blast because of a subtle issue involving ordered pairs. Here is a subgoal that arises internally after the rules equalityI and subsetI have been applied:

1. ⋀x. x ∈ (r⁻¹)∗ =⇒ x ∈ (r∗)⁻¹

We cannot proceed because the bound variable x is not a pair; unlike blast, the auto method splits such a variable into a pair of components. Now that x has been replaced by the pair (a,b), we can proceed. Other methods that split variables in this way are force, auto, fast and best. Section 8.1 will discuss proof techniques for ordered pairs in more detail.
!! You may want to skip the rest of this section until you need to perform a
complex recursive function definition or induction. The induction rule returned
by fun is good enough for most purposes. We use an explicit well-founded induction
only in Sect. 9.2.4.
6.5 Fixed Point Operators

Fixed point operators define sets recursively. They are invoked implicitly when making an inductive definition, but they can be used directly, too. The least or strongest fixed point yields an inductive definition; the greatest or weakest fixed point yields a coinductive definition. Mathematicians may wish to note that the existence of these fixed points is guaranteed by the Knaster-Tarski theorem.
!! Casual readers should skip the rest of this section. We use fixed point operators
only in Sect. ??.
For fixed point operators, the ordering will be the subset relation: if A ⊆ B
then we expect f (A) ⊆ f (B). In addition to its definition, monotonicity has
the obvious introduction and destruction rules:
(⋀A B. A ≤ B =⇒ f A ≤ f B) =⇒ mono f (monoI)
[[mono f; A ≤ B]] =⇒ f A ≤ f B (monoD)
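For instance (a sketch, not from the original text, although it mirrors the proof of mono_ef below), the monotonicity of a typical set transformer follows from monoI and blast:

lemma "mono (λT. A ∪ T)"
apply (rule monoI)
apply blast
done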
The most important properties of the least fixed point are that it is a
fixed point and that it enjoys an induction rule:
mono f =⇒ lfp f = f (lfp f) (lfp_unfold)
[[a ∈ lfp f; mono f;
  ⋀x. x ∈ f (lfp f ∩ {x. P x}) =⇒ P x]] =⇒ P a (lfp_induct)
The induction rule shown above is more convenient than the basic one derived
from the minimality of lfp . Observe that both theorems demand mono f as
a premise.
The greatest fixed point is similar, but it has a coinduction rule:
mono f =⇒ gfp f = f (gfp f) (gfp_unfold)
[[mono f; a ∈ X; X ⊆ f (X ∪ gfp f)]] =⇒ a ∈ gfp f (coinduct)
The formulae of PDL are built up from atomic propositions via negation and
conjunction and the two temporal connectives AX and EF . Since formulae are
essentially syntax trees, they are naturally modelled as a datatype:¹

datatype formula = Atom "atom"
| Neg formula
| And formula formula
| AX formula
| EF formula

¹ The customary definition of PDL [12] looks quite different from ours, but the two are easily shown to be equivalent.
This resembles the boolean expression case study in Sect. 2.5.6. A validity relation between states and formulae specifies the semantics. The syntax annotation allows us to write s |= f instead of valid s f. The definition is by recursion over the syntax:
primrec valid :: "state ⇒ formula ⇒ bool" ("(_ |= _)" [80,80] 80)
where
"s |= Atom a = (a ∈ L s)" |
"s |= Neg f = (¬(s |= f))" |
"s |= And f g = (s |= f ∧ s |= g)" |
"s |= AX f = (∀ t. (s,t) ∈ M −→ t |= f)" |
"s |= EF f = (∃ t. (s,t) ∈ M ∗ ∧ t |= f)"
Model checking works by computing the set of states satisfying a formula. This set, mc f, is again defined by recursion over the syntax:

primrec mc :: "formula ⇒ state set" where
"mc(Atom a) = {s. a ∈ L s}" |
"mc(Neg f) = -mc f" |
"mc(And f g) = mc f ∩ mc g" |
"mc(AX f) = {s. ∀ t. (s,t) ∈ M −→ t ∈ mc f}" |
"mc(EF f) = lfp(λT. mc f ∪ (M⁻¹ `` T))"

Only the equation for EF deserves some comments. Remember that the postfix ⁻¹ and the infix `` are predefined and denote the converse of a relation and the image of a set under a relation. Thus M⁻¹ `` T is the set of all predecessors of T, and the least fixed point (lfp) of λT. mc f ∪ M⁻¹ `` T is the least set T containing mc f and all predecessors of T. If you find it hard to see that mc (EF f) contains exactly those states from which there is a path to a state where f is true, do not worry — this will be proved in a moment.
First we prove monotonicity of the function inside lfp in order to make
sure it really has a least fixed point.
lemma mono_ef: "mono(λT. A ∪ (M −1 `` T))"
apply(rule monoI)
apply blast
done
Now we can relate model checking and semantics. For the EF case we need a
separate lemma:
lemma EF_lemma:
"lfp(λT. A ∪ (M −1 `` T)) = {s. ∃ t. (s,t) ∈ M ∗ ∧ t ∈ A}"
The equality is proved in the canonical fashion by proving that each set
includes the other; the inclusion is shown pointwise:
apply(rule equalityI)
apply(rule subsetI)
apply(simp)
Simplification leaves us with the following first subgoal
1. ⋀s. s ∈ lfp (λT. A ∪ M −1 `` T) =⇒ ∃ t. (s, t) ∈ M ∗ ∧ t ∈ A
A total of 2 subgoals...
We now return to the second set inclusion subgoal, which is again proved
pointwise:
apply(rule subsetI)
apply(simp, clarify)
After simplification and clarification we are left with
1. ⋀x t. [[(x, t) ∈ M ∗ ; t ∈ A]] =⇒ x ∈ lfp (λT. A ∪ M −1 `` T)
1. ⋀x t. t ∈ A =⇒ t ∈ A ∪ M −1 `` lfp (λT. A ∪ M −1 `` T)
A total of 2 subgoals...
Exercise 6.5.1 AX has a dual operator EN (“there exists a next state such
that”)² with the intended semantics
s |= EN f = (∃ t. (s, t) ∈ M ∧ t |= f)
Fortunately, EN f can already be expressed as a PDL formula. How?
Show that the semantics for EF satisfies the following recursion equation:
s |= EF f = (s |= f ∨ s |= EN (EF f))
The semantics of PDL only needs reflexive transitive closure. Let us be ad-
venturous and introduce a more expressive temporal operator. We extend the
datatype formula by a new constructor³
| AF formula
which stands for “Always in the Future”: on all infinite paths, at some point
the formula holds. Formalizing the notion of an infinite path is easy in HOL:
it is simply a function from nat to state.
definition Paths :: "state ⇒ (nat ⇒ state)set" where
"Paths s ≡ {p. s = p 0 ∧ (∀ i. (p i, p(i+1)) ∈ M)}"
This definition allows a succinct statement of the semantics of AF :
"s |= AF f = (∀ p ∈ Paths s. ∃ i. p i |= f)"
² We cannot use the customary EX: it is reserved as the ASCII-equivalent of ∃.
³ Do not be misled: neither datatypes nor recursive functions can be extended
by new constructors or equations. This is just a trick of the presentation (see
Sect. 4.2.5). In reality one has to define a new datatype and a new function.
Now we define mc (AF f) as the least set T that includes mc f and all states
all of whose direct successors are in T. This is expressed with the help of an
auxiliary function af :
definition af :: "state set ⇒ state set ⇒ state set" where
"af A T ≡ A ∪ {s. ∀ t. (s,t) ∈ M −→ t ∈ T}"
"mc(AF f) = lfp(af(mc f))"
Because af is monotone in its second argument (and also its first, but that
is irrelevant), af A has a least fixed point:
lemma mono_af: "mono(af A)"
apply(simp add: mono_def af_def)
apply blast
done
In contrast to the analogous proof for EF, and just for a change, we do not
use fixed point induction. Park-induction, named after David Park, is weaker
but sufficient for this proof:
f S ≤ S =⇒ lfp f ≤ S (lfp_lowerbound)
The instance of the premise f S ⊆ S is proved pointwise, a decision that auto
takes for us:
apply(rule lfp_lowerbound)
apply(auto simp add: af_def Paths_def)
1. ⋀p. [[∀ t. (p 0, t) ∈ M −→
(∀ p. t = p 0 ∧ (∀ i. (p i, p (Suc i)) ∈ M) −→
(∃ i. p i ∈ A));
∀ i. (p i, p (Suc i)) ∈ M]]
=⇒ ∃ i. p i ∈ A
By unfolding lfp once we see that if s is not in lfp (af A), then s is not in
A and some direct successor of s is again not in lfp (af A). Iterating this
argument yields the promised infinite A-avoiding path. Let us formalize this
sketch.
The one-step argument in the sketch above is proved by a variant of
contraposition:
lemma not_in_lfp_afD:
"s ∉ lfp(af A) =⇒ s ∉ A ∧ (∃ t. (s,t) ∈ M ∧ t ∉ lfp(af A))"
apply(erule contrapos_np)
apply(subst lfp_unfold[OF mono_af])
apply(simp add: af_def)
done
We assume the negation of the conclusion and prove s ∈ lfp (af A). Unfold-
ing lfp once and simplifying with the definition of af finishes the proof.
Now we iterate this process. The following construction of the desired
path is parameterized by a predicate Q that should hold along the path:
primrec path :: "state ⇒ (state ⇒ bool) ⇒ (nat ⇒ state)" where
"path s Q 0 = s" |
"path s Q (Suc n) = (SOME t. (path s Q n,t) ∈ M ∧ Q t)"
Element n + 1 on this path is some arbitrary successor t of element n such
that Q t holds. Remember that SOME t. R t is some arbitrary but fixed t
such that R t holds (see Sect. 5.10). Of course, such a t need not exist, but
that is of no concern to us since we will only use path when a suitable t does
exist.
Let us show that if each state s that satisfies Q has a successor that again
satisfies Q, then there exists an infinite Q -path:
lemma infinity_lemma:
"[[ Q s; ∀ s. Q s −→ (∃ t. (s,t) ∈ M ∧ Q t) ]] =⇒
∃ p∈Paths s. ∀ i. Q(p i)"
After simplification and clarification, the subgoal has the following form:
1. ⋀i. [[Q s; ∀ s. Q s −→ (∃ t. (s, t) ∈ M ∧ Q t)]]
=⇒ (path s Q i, SOME t. (path s Q i, t) ∈ M ∧ Q t) ∈ M ∧
Q (path s Q i)
It invites a proof by induction on i:
apply(induct_tac i)
apply(simp)
After simplification, the base case boils down to
1. [[Q s; ∀ s. Q s −→ (∃ t. (s, t) ∈ M ∧ Q t)]]
=⇒ (s, SOME t. (s, t) ∈ M ∧ Q t) ∈ M
A total of 2 subgoals...
The conclusion looks exceedingly trivial: after all, t is chosen such that (s,
t) ∈ M holds. However, we first have to show that such a t actually exists!
This reasoning is embodied in the theorem someI2_ex :
[[∃ a. ?P a; ⋀x. ?P x =⇒ ?Q x]] =⇒ ?Q (SOME x. ?P x) (someI2_ex)
1. ⋀x. x ∉ lfp (af A) =⇒ ∃ p∈Paths x. ∀ i. p i ∉ A
1. ⋀x. ∀ s. s ∉ lfp (af A) −→ (∃ t. (s, t) ∈ M ∧ t ∉ lfp (af A))
2. ⋀x. ∃ p∈Paths x. ∀ i. p i ∉ lfp (af A) =⇒
∃ p∈Paths x. ∀ i. p i ∉ A
If you find these proofs too complicated, we recommend that you read
Sect. 9.2.4, where we show how inductive definitions lead to simpler argu-
ments.
The main theorem is proved as for PDL, except that we also derive the
necessary equality lfp(af A) = ... by combining AF_lemma1 and AF_lemma2
on the spot:
theorem "mc f = {s. s |= f}"
apply(induct_tac f)
apply(auto simp add: EF_lemma equalityI[OF AF_lemma1 AF_lemma2])
done
The language defined above is not quite CTL. The latter also includes
an until-operator EU f g with semantics “there Exists a path where f is true
Until g becomes true”. We need an auxiliary function:
primrec
until:: "state set ⇒ state set ⇒ state ⇒ state list ⇒ bool" where
"until A B s [] = (s ∈ B)" |
"until A B s (t#p) = (s ∈ A ∧ (s,t) ∈ M ∧ until A B t p)"
Exercise 6.5.2 Extend the datatype of formulae by the above until operator
and prove the equivalence between semantics and model checking, i.e. that
mc (EU f g) = {s. s |= EU f g}
For more CTL exercises see, for example, Huth and Ryan [15].
Let us close this section with a few words about the executability of our
model checkers. It is clear that if all sets are finite, they can be represented as
lists and the usual set operations are easily implemented. Only lfp requires
a little thought. Fortunately, theory While_Combinator in the Library [4] pro-
vides a theorem stating that in the case of finite sets and a monotone func-
tion F, the value of lfp F can be computed by iterated application of F to {}
until a fixed point is reached. It is actually possible to generate executable
functional programs from HOL definitions, but that is beyond the scope of
the tutorial.
7. Inductively Defined Sets
Our first lemma states that numbers of the form 2 × k are even. Introduction
rules are used to show that specific values belong to the inductive set. Such
proofs typically involve induction, perhaps over some other inductive set.
lemma two_times_even[intro!]: "2*k ∈ even"
apply (induct_tac k)
apply auto
done
The first step is induction on the natural number k, which leaves two subgoals:
1. 2 * 0 ∈ Even.even
2. ⋀n. 2 * n ∈ Even.even =⇒ 2 * Suc n ∈ Even.even
Here auto simplifies both subgoals so that they match the introduction rules,
which are then applied automatically.
Our ultimate goal is to prove the equivalence between the traditional
definition of even (using the divides relation) and our inductive definition.
One direction of this equivalence is immediate by the lemma just proved,
whose intro! attribute ensures it is applied automatically.
lemma dvd_imp_even: "2 dvd n =⇒ n ∈ even"
by (auto simp add: dvd_def)
From the definition of the set Even.even, Isabelle has generated an induction
rule:
[[x ∈ Even.even; P 0;
⋀n. [[n ∈ Even.even; P n]]
=⇒ P (Suc (Suc n))]]
=⇒ P x (even.induct)
A property P holds for every even number provided it holds for 0 and is closed
under the operation Suc(Suc ·). Then P is closed under the introduction rules
for Even.even, which is the least set closed under those rules. This type of
inductive argument is called rule induction.
Apart from the double application of Suc, the induction rule above resem-
bles the familiar mathematical induction, which indeed is an instance of rule
induction; the natural numbers can be defined inductively to be the least set
containing 0 and closed under Suc.
Induction is the usual way of proving a property of the elements of an
inductively defined set. Let us prove that all members of the set Even.even
are multiples of two.
lemma even_imp_dvd: "n ∈ even =⇒ 2 dvd n"
We begin by applying induction. Note that even.induct has the form of
an elimination rule, so we use the method erule. We get two subgoals:
apply (erule even.induct)
1. semiring_parity_class.even 0
2. ⋀n. [[n ∈ Even.even; semiring_parity_class.even n]]
=⇒ semiring_parity_class.even (Suc (Suc n))
We unfold the definition of dvd in both subgoals, proving the first one and
simplifying the second:
apply (simp_all add: dvd_def)
The next command eliminates the existential quantifier from the assumption
and replaces n by 2 * k.
apply clarify
To conclude, we tell Isabelle that the desired value is Suc k. With this hint,
the subgoal falls to simp.
apply (rule_tac x = "Suc k" in exI, simp)
Combining the previous two results yields our objective, the equivalence
relating Even.even and dvd.
theorem even_iff_dvd: "(n ∈ even) = (2 dvd n)"
by (blast intro: dvd_imp_even even_imp_dvd)
From the definition, Isabelle also generates a general case analysis rule:
[[a ∈ Even.even; a = 0 =⇒ P;
⋀n. [[a = Suc (Suc n); n ∈ Even.even]]
=⇒ P]]
=⇒ P (even.cases)
This general rule is less useful than instances of it for specific patterns. For
example, if a has the form Suc (Suc n) then the first case becomes irrele-
vant, while the second case tells us that n belongs to Even.even. Isabelle will
generate this instance for us:
inductive cases Suc_Suc_cases [elim!]: "Suc(Suc n) ∈ even"
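The generated instance has roughly this form:
[[Suc (Suc n) ∈ Even.even; n ∈ Even.even =⇒ P]] =⇒ P (Suc_Suc_cases)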
Just above we devoted some effort to reaching precisely this result. Yet we
could have obtained it by a one-line declaration, dispensing with the lemma
even_imp_even_minus_2. This example also justifies the terminology rule in-
version: the new rule inverts the introduction rule even.step. In general, a
rule can be inverted when the set of elements it introduces is disjoint from
those of the other introduction rules.
For one-off applications of rule inversion, use the ind_cases method. Here
is an example:
apply (ind_cases "Suc(Suc n) ∈ even")
Just as there are datatypes defined by mutual recursion, there are sets defined
by mutual induction. As a trivial example we consider the even and odd
natural numbers:
inductive set
Even :: "nat set" and
Odd :: "nat set"
where
zero: "0 ∈ Even"
| EvenI: "n ∈ Odd =⇒ Suc n ∈ Even"
| OddI: "n ∈ Even =⇒ Suc n ∈ Odd"
If we want to prove that all even numbers are divisible by two, we have
to generalize the statement as follows:
lemma "(m ∈ Even −→ 2 dvd m) ∧ (n ∈ Odd −→ 2 dvd (Suc n))"
The proof is by rule induction. Because of the form of the induction theorem,
it is applied by rule rather than erule as for ordinary inductive definitions:
apply(rule Even_Odd.induct)
1. even 0
2. ⋀n. [[n ∈ Odd; even (Suc n)]] =⇒ even (Suc n)
3. ⋀n. [[n ∈ Even; even n]] =⇒ even (Suc (Suc n))
The first two subgoals are proved by simplification and the final one can
be proved in the same manner as in Sect. 7.1.3 where the same subgoal was
encountered before. We do not show the proof script.
Instead of a set of even numbers one can also define a predicate on nat:
inductive evn :: "nat ⇒ bool" where
zero: "evn 0" |
step: "evn n =⇒ evn(Suc(Suc n))"
Everything works as before, except that you write inductive instead of in-
ductive set and evn n instead of n ∈ Even. When defining an n-ary rela-
tion as a predicate, it is recommended to curry the predicate: its type should
be τ1 ⇒ · · · ⇒ τn ⇒ bool rather than τ1 × · · · × τn ⇒ bool.
more sensitive, and even blast can be led astray in the presence of large
numbers of rules.
To prove transitivity, we need rule induction, i.e. theorem rtc.induct:
[[(?x1.0, ?x2.0) ∈ ?r*; ⋀x. ?P x x;
⋀x y z. [[(x, y) ∈ ?r; (y, z) ∈ ?r*; ?P y z]] =⇒ ?P x z]]
=⇒ ?P ?x1.0 ?x2.0
After suitable generalization over z, the transitivity lemma and its proof read
lemma rtc_trans[rule_format]:
"(x,y) ∈ r* =⇒ ∀ z. (y,z) ∈ r* −→ (x,z) ∈ r*"
apply(erule rtc.induct)
apply(blast)
apply(blast intro: rtc_step)
done
Let us now prove that r* is really the reflexive transitive closure of r, i.e.
the least reflexive and transitive relation containing r. The latter is easily
formalized:
inductive set
rtc2 :: "('a × 'a)set ⇒ ('a × 'a)set"
for r :: "('a × 'a)set"
where
"(x,y) ∈ r =⇒ (x,y) ∈ rtc2 r"
| "(x,x) ∈ rtc2 r"
| "[[ (x,y) ∈ rtc2 r; (y,z) ∈ rtc2 r ]] =⇒ (x,z) ∈ rtc2 r"
and the equivalence of the two definitions is easily shown by the obvious rule
inductions:
lemma "(x,y) ∈ rtc2 r =⇒ (x,y) ∈ r*"
apply(erule rtc2.induct)
apply(blast)
apply(blast)
apply(blast intro: rtc_trans)
done
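The reverse inclusion goes through by the analogous induction; a sketch, citing
the three rules of rtc2 collectively as rtc2.intros:
lemma "(x,y) ∈ r* =⇒ (x,y) ∈ rtc2 r"
apply(erule rtc.induct)
apply(blast intro: rtc2.intros)
apply(blast intro: rtc2.intros)
done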
Exercise 7.2.2 Repeat the development of this section, but starting with
a definition of rtc where rtc_step is replaced by its converse as shown in
exercise 7.2.1.
Now the type integer_op gterm denotes the ground terms built over those
symbols.
The type constructor gterm can be generalized to a function over sets. It
returns the set of ground terms that can be formed over a set F of function
symbols. For example, we could consider the set of ground terms formed from
the finite set {Number 2, UnaryMinus, Plus}.
This concept is inductive. If we have a list args of ground terms over F
and a function symbol f in F, then we can apply f to args to obtain another
ground term. The only difficulty is that the argument list may be of any
length. Hitherto, each rule in an inductive definition referred to the induc-
tively defined set a fixed number of times, typically once or twice. A universal
quantifier in the premise of the introduction rule expresses that every element
of args belongs to our inductively defined set: is a ground term over F. The
function set denotes the set of elements in a given list.
inductive set
gterms :: "'f set ⇒ 'f gterm set"
for F :: "'f set"
where
step[intro!]: "[[∀ t ∈ set args. t ∈ gterms F; f ∈ F]]
=⇒ (Apply f args) ∈ gterms F"
To demonstrate a proof from this definition, let us show that the function
gterms is monotone. We shall need this concept shortly.
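A plausible proof first reduces the inclusion to membership and then applies
rule induction:
lemma gterms_mono: "F ⊆ G =⇒ gterms F ⊆ gterms G"
apply clarify
apply (erule gterms.induct)
apply blast
done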
!! Why do we call this function gterms instead of gterm? A constant may have
the same name as a type. However, name clashes could arise in the theorems
that Isabelle generates. Our choice of names keeps gterms.induct separate from
gterm.induct.
Call a term well-formed if each symbol occurring in it is applied to the
correct number of arguments. (This number is called the symbol’s arity.)
We can express well-formedness by generalizing the inductive definition of
gterms. Suppose we are given a function called arity, specifying the arities
of all symbols. In the inductive step, we have a list args of such terms and a
function symbol f. If the length of the list matches the function’s arity then
applying f to args yields a well-formed term.
inductive set
well_formed_gterm :: "('f ⇒ nat) ⇒ 'f gterm set"
for arity :: "'f ⇒ nat"
where
step[intro!]: "[[∀ t ∈ set args. t ∈ well_formed_gterm arity;
length args = arity f]]
=⇒ (Apply f args) ∈ well_formed_gterm arity"
The inductive definition neatly captures the reasoning above. The univer-
sal quantification over the set of arguments expresses that all of them are
well-formed.
An inductive definition may refer to the defined set through any monotone
function; here we replace the universal quantifier in the definition above by
the function lists. This function, from the Isabelle theory of lists, is analogous
to the function gterms declared above: if A is a set then lists A is the set of
lists whose elements belong to A.
In the inductive definition of well-formed terms, examine the one intro-
duction rule. The first premise states that args belongs to the lists of well-
formed terms. This formulation is more direct, if more obscure, than using a
universal quantifier.
inductive set
well_formed_gterm' :: "('f ⇒ nat) ⇒ 'f gterm set"
for arity :: "'f ⇒ nat"
where
step[intro!]: "[[args ∈ lists (well_formed_gterm' arity);
length args = arity f]]
=⇒ (Apply f args) ∈ well_formed_gterm' arity"
monos lists_mono
We cite the theorem lists_mono to justify using the function lists.¹
A ⊆ B =⇒ lists A ⊆ lists B (lists_mono)
Why must the function be monotone? An inductive definition describes an
iterative construction: each element of the set is constructed by a finite num-
ber of introduction rule applications. For example, the elements of even are
constructed by finitely many applications of the rules
0 ∈ Even.even
n ∈ Even.even =⇒ Suc (Suc n) ∈ Even.even
All references to the set in its own introduction rules must be positive:
applications of a rule cannot invalidate previous applications, which allows
the construction process to converge. The following pair of rules, by contrast,
does not constitute an inductive definition:
0 ∈ Even.even
n ∉ Even.even =⇒ Suc n ∈ Even.even
Showing that 4 is even using these rules requires showing that 3 is not even.
It is far from trivial to show that this set of rules characterizes the even
numbers.
Even with its use of the function lists, the premise of our introduction
rule is positive:
args ∈ lists (well_formed_gterm' arity)
To apply the rule we construct a list args of previously constructed well-
formed terms. We obtain a new term, Apply f args. Because lists is mono-
tone, applications of the rule remain valid as new terms are constructed.
Further lists of well-formed terms become available and none are taken away.
¹ This particular theorem is installed by default already, but we include the monos
declaration in order to illustrate its syntax.
This proof resembles the one given in Sect. 7.3.1 above, especially in the form
of the induction hypothesis. Next, we consider the opposite inclusion:
lemma "well_formed_gterm' arity ⊆ well_formed_gterm arity"
apply clarify
apply (erule well_formed_gterm'.induct)
apply auto
done
The proof script is virtually identical, but the subgoal after applying in-
duction may be surprising:
1. ⋀x args f.
[[args ∈ lists (well_formed_gterm' arity ∩
{a. a ∈ well_formed_gterm arity});
length args = arity f]]
=⇒ Apply f args ∈ well_formed_gterm arity
Does gterms distribute over intersection? We have proved that this function
is monotone, so mono_Int gives one of the inclusions. The opposite inclusion
asserts that if t is a ground term over both of the sets F and G then it is also
a ground term over their intersection, F ∩ G.
lemma gterms_IntI:
"t ∈ gterms F =⇒ t ∈ gterms G −→ t ∈ gterms (F∩G)"
S → ε | bA | aB
A → aS | bAA
B → bS | aBB
At the end we say a few words about the relationship between the original
proof [13, p. 81] and our formal version.
We start by fixing the alphabet, which consists only of a’s and b ’s:
datatype alfa = a | b
Words over this alphabet are of type alfa list, and the three nonterminals
are declared as sets of such words. The productions above are recast as a
mutual inductive definition of S, A and B :
inductive set
S :: "alfa list set" and
A :: "alfa list set" and
B :: "alfa list set"
where
"[] ∈ S"
| "w ∈ A =⇒ b#w ∈ S"
| "w ∈ B =⇒ a#w ∈ S"
First we show that all words in S contain the same number of a’s and b ’s.
Since the definition of S is by mutual induction, so is the proof: we show at
the same time that all words in A contain one more a than b and all words
in B contain one more b than a.
lemma correctness:
"(w ∈ S −→ size[x←w. x=a] = size[x←w. x=b]) ∧
(w ∈ A −→ size[x←w. x=a] = size[x←w. x=b] + 1) ∧
(w ∈ B −→ size[x←w. x=b] = size[x←w. x=a] + 1)"
These propositions are expressed with the help of the predefined filter func-
tion on lists, which has the convenient syntax [x←xs. P x], the list of all el-
ements x in xs such that P x holds. Remember that on lists size and length
are synonymous.
The proof itself is by rule induction and afterwards automatic:
by (rule S_A_B.induct, auto)
This may seem surprising at first, and is indeed an indication of the power
of inductive definitions. But it is also quite straightforward. For example,
consider the production A → bAA: if v, w ∈ A and the elements of A contain
one more a than b, then bvw must again contain one more a than b.
As usual, the correctness of syntactic descriptions is easy, but complete-
ness is hard: does S contain all words with an equal number of a’s and b ’s? It
turns out that this proof requires the following lemma: every string with two
more a’s than b ’s can be cut somewhere such that each half has one more a
than b. This is best seen by imagining counting the difference between the
number of a’s and b ’s starting at the left end of the word. We start with 0
and end (at the right end) with 2. Since each move to the right increases
or decreases the difference by 1, we must have passed through 1 on our way
from 0 to 2. Formally, we appeal to the following discrete intermediate value
theorem nat0_intermed_int_val
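whose statement, with f of type nat ⇒ int, is roughly:
[[∀ i<n. |f(i+1) - f i| ≤ 1; f 0 ≤ k; k ≤ f n]] =⇒ ∃ i≤n. f i = k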
This could have been done earlier but was not necessary so far.
The completeness theorem tells us that if a word has the same number of
a’s and b ’s, then it is in S, and similarly for A and B :
theorem completeness:
"(size[x←w. x=a] = size[x←w. x=b] −→ w ∈ S) ∧
(size[x←w. x=a] = size[x←w. x=b] + 1 −→ w ∈ A) ∧
(size[x←w. x=b] = size[x←w. x=a] + 1 −→ w ∈ B)"
The proof begins with
apply(induct_tac w rule: length_induct)
apply(rename_tac w)
The rule parameter tells induct_tac explicitly which induction rule to use.
For details see Sect. 9.2.2 below. In this case the result is that we may assume
the lemma already holds for all words shorter than w. Because the induction
step renames the induction variable we rename it back to w.
The proof continues with a case distinction on w, on whether w is empty
or not.
apply(case_tac w)
apply(simp_all)
Simplification disposes of the base case and leaves only a conjunction of two
step cases to be proved: if w = a # v and
length [x←v. x = a] = length [x←v. x = b] + 2
then b # v ∈ A, and similarly for w = b # v. We only consider the first case
in detail.
After breaking the conjunction up into two cases, we can apply part1 to
the assumption that w contains two more a’s than b ’s.
apply(rule conjI)
apply(clarify)
apply(frule part1[of "λx. x=a", simplified])
apply(clarify)
The original proof [13] is rather cavalier about this point and may even have
overlooked the slight difficulty
lurking in the omitted cases. Such errors are found in many pen-and-paper
proofs when they are scrutinized formally.
Part III
Advanced Material
8. More about Types
So far we have learned about a few basic types (for example bool and nat),
type abbreviations (types) and recursive datatypes (datatype). This chap-
ter will introduce more advanced material:
– Pairs (Sect. 8.1) and records (Sect. 8.2), and how to reason about them.
– Type classes: how to specify and reason about axiomatic collections of
types (Sect. 8.3). This section leads on to a discussion of Isabelle’s numeric
types (Sect. 8.4).
– Introducing your own types: how to define types that cannot be constructed
with any of the basic methods (Sect. 8.5).
The material in this section goes beyond the needs of most novices. Serious
users should at least skim the sections as far as type classes. That material
is fairly advanced; read the beginning to understand what it is about, but
consult the rest only when necessary.
This works well if rewriting with split_def finishes the proof, as it does
above. But if it does not, you end up with exactly what we are trying to
avoid: nests of fst and snd. Thus this approach is neither elegant nor very
practical in large examples, although it can be effective in small ones.
If we consider why this lemma presents a problem, we realize that we need
to replace variable p by some pair (a, b). Then both sides of the equation
would simplify to a by the simplification rules (case (a, b) of (c, d) ⇒ f
c d) = f a b and fst (x1, x2) = x1. To reason about tuple patterns requires
some way of converting a variable of product type into a pair. In case of a
subterm of the form case p of (x, xa) ⇒ f x xa this is easy: the split rule
prod.split replaces p by a pair:
lemma "(λ(x,y).y) p = snd p"
apply(split prod.split)
1. ⋀a b aa ba. swap (swap (a, b)) = (aa, ba) −→ (a, b) = (aa, ba)
apply simp
done
Note that we have intentionally included only split_paired_all in the first
simplification step, and then we simplify again. This time the reason was not
merely pedagogical: split_paired_all may interfere with certain congruence
rules of the simplifier, so a single simp add: split_paired_all may fail where
the two-stage approach succeeds.
8.2 Records
Record types are not primitive in Isabelle and have a delicate internal rep-
resentation [21], based on nested copies of the primitive product type. A
record declaration introduces a new record type scheme by specifying its
fields, which are packaged internally to hold up the perception of the record
as a distinguished entity. Here is a simple example:
record point =
Xcoord :: int
Ycoord :: int
Records of type point have two fields named Xcoord and Ycoord, both of
type int. We now define a constant of type point:
definition pt1 :: point where
"pt1 ≡ (| Xcoord = 999, Ycoord = 23 |)"
We see above the ASCII notation for record brackets. You can also use the
symbolic brackets (| and |). Record type expressions can be also written di-
rectly with individual fields. The type name above is merely an abbreviation.
definition pt2 :: "(|Xcoord :: int, Ycoord :: int|)" where
"pt2 ≡ (|Xcoord = -45, Ycoord = 97|)"
For each field, there is a selector function of the same name. For exam-
ple, if p has type point then Xcoord p denotes the value of the Xcoord field
of p. Expressions involving field selection of explicit records are simplified
automatically:
lemma "Xcoord (|Xcoord = a, Ycoord = b|) = a"
by simp
!! Field names are declared as constants and can no longer be used as variables.
It would be unwise, for example, to call the fields of type point simply x and y.
Now, let us define coloured points (type cpoint) to be points extended with
a field col of type colour :
datatype colour = Red | Green | Blue
record cpoint = point +
col :: colour
The fields of this new type are Xcoord, Ycoord and col, in that order.
definition cpt1 :: cpoint where
"cpt1 ≡ (|Xcoord = 999, Ycoord = 23, col = Green|)"
!! If you use the symbolic record brackets (| and |), then you must also use the
symbolic ellipsis, “. . . ”, rather than three consecutive periods, “...”. Mixing
the ASCII and symbolic versions causes a syntax error. (The two versions are more
distinct on screen than they are on paper.)
Two records are equal if all pairs of corresponding fields are equal. Concrete
record equalities are simplified automatically:
lemma "((|Xcoord = a, Ycoord = b|) = (|Xcoord = a', Ycoord = b'|)) =
(a = a' ∧ b = b')"
by simp
The following equality is similar, but generic, in that r can be any instance
of 'a point_scheme:
lemma "r(|Xcoord := a, Ycoord := b|) = r(|Ycoord := b, Xcoord := a|)"
by simp
We see above the syntax for iterated updates. We could equivalently have
written the left-hand side as r(|Xcoord := a|)(|Ycoord := b|).
Record equality is extensional: a record is determined entirely by the
values of its fields.
lemma "r = (|Xcoord = Xcoord r, Ycoord = Ycoord r|)"
by simp
The generic version of this equality includes the pseudo-field more:
lemma "r = (|Xcoord = Xcoord r, Ycoord = Ycoord r, . . . = point.more r|)"
by simp
The simplifier can prove many record equalities automatically, but general
equality reasoning can be tricky. Consider proving this obvious fact:
lemma "r(|Xcoord := a|) = r(|Xcoord := a'|) =⇒ a = a'"
apply simp?
oops
Here the simplifier can do nothing, since general record equality is not elimi-
nated automatically. One way to proceed is by an explicit forward step that
applies the selector Xcoord to both sides of the assumed record equality:
lemma "r(|Xcoord := a|) = r(|Xcoord := a'|) =⇒ a = a'"
apply (drule_tac f = Xcoord in arg_cong)
Simplification then reduces the assumption to the desired conclusion:
apply simp
done
Contrast those with the corresponding functions for record cpoint. Observe
cpoint.fields in particular.
cpoint.make Xcoord Ycoord col ≡
(|Xcoord = Xcoord, Ycoord = Ycoord, col = col|)
cpoint.fields col ≡ (|col = col|)
cpoint.extend r more ≡
(|Xcoord = Xcoord r, Ycoord = Ycoord r, col = col r, . . . = more|)
cpoint.truncate r ≡
(|Xcoord = Xcoord r, Ycoord = Ycoord r, col = col r|)
To demonstrate these functions, we declare a new coloured point by
extending an ordinary point. Function point.extend augments pt1 with a
colour value, which is converted into an appropriate record fragment by
cpoint.fields.
definition cpt2 :: cpoint where
"cpt2 ≡ point.extend pt1 (cpoint.fields Green)"
The coloured points cpt1 and cpt2 are equal. The proof is trivial, by
unfolding all the definitions. We deliberately omit the definition of pt1 in
order to reveal the underlying comparison on type point.
lemma "cpt1 = cpt2"
apply (simp add: cpt1_def cpt2_def point.defs cpoint.defs)
Exercise 8.2.2 (For Java programmers.) Model a small class hierarchy using
records.
8.3.1 Overloading
Type classes allow overloading; thus a constant may have multiple definitions
at non-overlapping types. An overloaded constant is introduced by declaring a
class together with its characteristic operation:
class plus =
fixes plus :: "'a ⇒ 'a ⇒ 'a" (infixl "⊕" 70)
This introduces a new class plus, along with a constant plus with nice infix
syntax. plus is also called a class operation. The type of plus carries a class
constraint "'a :: plus" on its type variable, meaning that only types of class
plus can be instantiated for "'a". To breathe life into plus we need to declare
a type to be an instance of plus:
instantiation nat :: plus
begin
primrec plus_nat :: "nat ⇒ nat ⇒ nat" where
"(0::nat) ⊕ n = n"
| "Suc m ⊕ n = Suc (m ⊕ n)"
Note that the name plus carries a suffix _nat; by default, the local name of a
class operation f to be instantiated on type constructor κ is mangled as f_κ.
In case of uncertainty, these names may be inspected using the print_context
command.
Although class plus has no axioms, the instantiation must be formally
concluded by a (trivial) instantiation proof “..”:
instance ..
end
Here we instantiate the product type prod to class plus, given that its type
arguments are of class plus:
instantiation prod :: (plus, plus) plus
begin
fun plus_prod :: "'a × 'b ⇒ 'a × 'b ⇒ 'a × 'b" where
"(x, y) ⊕ (w, z) = (x ⊕ w, y ⊕ z)"
Obviously, overloaded specifications may include recursion over the syntactic
structure of types.
instance ..
end
This way we have encoded the canonical lifting of binary operations to prod-
ucts by means of type classes.
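As a quick sanity check (using the Suc-based definition of plus_nat above), an
equation such as the following should be provable by simplification alone:
lemma "(Suc 0, 0::nat) ⊕ (0, Suc 0) = (Suc 0, Suc 0)"
by simp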
8.3.2 Axioms
Attaching axioms to our classes lets us reason on the level of classes. The
results will be applicable to all types in a class, just as in axiomatic mathe-
matics.
!! Proofs in this section use structured Isar proofs, which are not covered in this
tutorial; but see [24].
class semigroup = plus +
assumes assoc: "(x ⊕ y) ⊕ z = x ⊕ (y ⊕ z)"
This class specification requires that all instances of semigroup obey
assoc: "⋀x y z::'a::semigroup. (x ⊕ y) ⊕ z = x ⊕ (y ⊕ z)".
We can use this class axiom to derive further abstract theorems relative
to class semigroup :
lemma assoc_left:
fixes x y z :: "'a::semigroup"
shows "x ⊕ (y ⊕ z) = (x ⊕ y) ⊕ z"
using assoc by (rule sym)
The semigroup constraint on type 'a restricts instantiations of 'a to types of
class semigroup and during the proof enables us to use the fact assoc whose
type parameter is itself constrained to class semigroup. The main advantage
of classes is that theorems can be proved in the abstract and freely reused
for each instance.
On instantiation, we have to give a proof that the given operations obey
the class axioms:
instantiation nat :: semigroup
begin
instance proof
The proof opens with a default proof step, which for instance judgements
invokes method intro_classes.
fix m n q :: nat
show "(m ⊕ n) ⊕ q = m ⊕ (n ⊕ q)"
by (induct m) simp_all
qed
end
Again, the interesting things enter the stage with parametric types:
instantiation prod :: (semigroup, semigroup) semigroup
begin
instance proof
fix p1 p2 p3 :: "'a::semigroup × 'b::semigroup"
show "p1 ⊕ p2 ⊕ p3 = p1 ⊕ (p2 ⊕ p3)"
by (cases p1, cases p2, cases p3) (simp add: assoc)
Associativity of product semigroups is established using the hypothetical as-
sociativity assoc of the type components, which holds due to the semigroup
constraints imposed on the type components by the instance proposition.
Indeed, this pattern often occurs with parametric types and type classes.
qed
end
Monoids. We define a subclass monoidl (a semigroup with a left-hand neu-
tral) by extending semigroup with one additional parameter neutral together
with its property:
class monoidl = semigroup +
fixes neutral :: "'a" ("0")
assumes neutl: "0 ⊕ x = x"
Again, we prove some instances, by providing suitable parameter definitions
and proofs for the additional specifications:
instantiation nat :: monoidl
begin
definition
neutral_nat_def: "0 = (0::nat)"
instance proof
fix n :: nat
show "0 ⊕ n = n"
unfolding neutral_nat_def by simp
qed
end
In contrast to the examples above, we here have both specification of class
operations and a non-trivial instance proof.
This covers products as well:
instantiation prod :: (monoidl, monoidl) monoidl
begin
definition
neutral_prod_def: "0 = (0, 0)"
instance proof
fix p :: "'a::monoidl × 'b::monoidl"
show "0 ⊕ p = p"
by (cases p) (simp add: neutral_prod_def neutl)
qed
end
Fully-fledged monoids are modelled by another subclass which does not add
new parameters but tightens the specification:
class monoid = monoidl +
assumes neutr: "x ⊕ 0 = x"
Corresponding instances for nat and products are left as an exercise to the
reader.
Groups. We define a subclass group of monoidl by adding an inverse operation:
class group = monoidl +
fixes inverse :: "'a ⇒ 'a" ("÷ _" [81] 80)
assumes invl: "÷ x ⊕ x = 0"
With this class axiom we can prove a cancellation law:
lemma left_cancel:
fixes x y z :: "'a::group"
shows "x ⊕ y = x ⊕ z ←→ y = z"
proof
assume "x ⊕ y = x ⊕ z"
then have "÷ x ⊕ (x ⊕ y) = ÷ x ⊕ (x ⊕ z)" by simp
then have "(÷ x ⊕ x) ⊕ y = (÷ x ⊕ x) ⊕ z" by (simp add: assoc)
then show "y = z" by (simp add: invl neutl)
next
assume "y = z"
then show "x ⊕ y = x ⊕ z" by simp
qed
Any group is also a monoid; this can be made explicit by claiming an additional
subclass relation, together with a proof of the logical difference:
instance group ⊆ monoid
proof
fix x
from invl have "÷ x ⊕ x = 0" .
then have "÷ x ⊕ (x ⊕ 0) = ÷ x ⊕ x"
by (simp add: neutl invl assoc [symmetric])
then show "x ⊕ 0 = x" by (simp add: left_cancel)
qed
The proof result is propagated to the type system, making group an instance
of monoid by adding an additional edge to the graph of subclass relation; see
also Figure 8.1.
[Fig. 8.1. Subclass relationship of monoids and groups: before and after
establishing the relationship group ⊆ monoid; transitive edges are left out.
The hierarchy runs semigroup, monoidl, monoid, group from top to bottom.]
Theorems proved in a class context carry a constraint on their type variables
(stemming, for example, from a class axiom such as refl above). These
constraints are always carried around and Isabelle
takes care that they are never lost, unless the type variable is instantiated
with a type that has been shown to belong to that class. Thus you may be
able to prove False from your axioms, but Isabelle will remind you that this
theorem has the hidden hypothesis that the class is non-empty.
Even if each individual class is consistent, intersections of (unrelated)
classes readily become inconsistent in practice. Now we know this need not
worry us.
8.4 Numbers
Until now, our numerical examples have used the type of natural num-
bers, nat. This is a recursive datatype generated by the constructors zero
and successor, so it works well with inductive proofs and primitive recursive
function definitions. HOL also provides the type int of integers, which lack
induction but support true subtraction. With subtraction, arithmetic reason-
ing is easier, which makes the integers preferable to the natural numbers for
complicated arithmetic expressions, even if they are non-negative. There are
also the types rat, real and complex : the rational, real and complex num-
bers. Isabelle has no subtyping, so the numeric types are distinct and there
are functions to convert between them. Most numeric operations are over-
loaded: the same symbol can be used at all numeric types. Table A.2 in the
appendix shows the most important operations, together with the priorities
of the infix symbols. Algebraic properties are organized using type classes
around algebraic concepts such as rings and fields; a property such as the
commutativity of addition is then a single theorem (add.commute) that applies
to all numeric types.
The constants 0 and 1 are overloaded. They denote zero and one, respectively,
for all numeric types. Other values are expressed by numeric literals, which
consist of one or more decimal digits optionally preceded by a minus sign
(-). Examples are 2, -3 and 441223334678. Literals are available for the types
of natural numbers, integers, rationals, reals, etc.; they denote integer values
of arbitrary size.
Literals look like constants, but they abbreviate terms representing the
number in a two’s complement binary notation. Isabelle performs arithmetic
on literals by rewriting rather than using the hardware arithmetic. In most
cases arithmetic is fast enough, even for numbers in the millions. The arith-
metic operations provided for literals include addition, subtraction, multipli-
cation, integer division and remainder. Fractions of literals (expressed using
division) are reduced to lowest terms.
!! The arithmetic operators are overloaded, so you must be careful to ensure that
each numeric expression refers to a specific type, if necessary by inserting type
constraints. Here is an example of what can go wrong:
lemma "2 * m = m + m"
Isabelle displays the subgoal as (2::'a) * m = m + m. The type 'a attached to
the literal 2 warns us that the expression is polymorphic; at an arbitrary type
the equation need not hold.
!! Numeric literals are not constructors and therefore must not be used in pat-
terns. For example, this declaration is rejected:
function h where
"h 3 = 2"
|"h i = i"
You should use a conditional expression instead:
"h i = (if i = 3 then 2 else i)"
Surprisingly few of these results depend upon the divisors’ being nonzero.
That is because division by zero yields zero:
a div 0 = 0 (DIVISION_BY_ZERO_DIV)
a mod 0 = a (DIVISION_BY_ZERO_MOD)
In div_mult_mult1 above, one of the two divisors (namely c) must still be
nonzero.
The divides relation has the standard definition, which is overloaded over
all numeric types:
m dvd n ≡ ∃ k. n = m * k (dvd_def)
Section 5.18 discusses proofs involving this relation. Here are some of the
facts proved about it:
[[m dvd n; n dvd m]] =⇒ m = n (dvd_antisym)
[[k dvd m; k dvd n]] =⇒ k dvd (m + n) (dvd_add)
Subtraction. There are no negative natural numbers, so m - n equals zero
unless m exceeds n. The following is one of the few facts about m - n that is
not subject to the condition n ≤ m.
(m - n) * k = m * k - n * k (diff_mult_distrib)
Natural number subtraction has few nice properties; often you should remove
it by simplifying with this split rule.
P(a-b) = ((a<b −→ P 0) ∧ (∀ d. a = b+d −→ P d)) (nat_diff_split)
For example, splitting helps to prove the following fact.
lemma "(n - 2) * (n + 2) = n * n - (4::nat)"
apply (simp split: nat_diff_split, clarify)
1. ⋀d. [[n < 2; n * n = 4 + d]] =⇒ d = 0
The result lies outside the scope of linear arithmetic, but it is easily found if
we explicitly split n<2 as n=0 or n=1:
apply (subgoal_tac "n=0 | n=1", force, arith)
done
Reasoning methods for the integers resemble those for the natural numbers,
but induction and the constant Suc are not available. HOL provides many
lemmas for proving inequalities involving integer multiplication and division,
similar to those shown above for type nat. The laws of addition, subtraction
and multiplication are available through the axiomatic type class for rings
(Sect. 8.4.5).
The absolute value function abs is overloaded, and is defined for all types
that involve negative numbers, including the integers. The arith method can
prove facts about abs automatically, though as it does so by case analysis,
the cost can be exponential.
The last two differ from their natural number analogues by requiring c to
be positive. Since division by zero yields zero, we could allow c to be zero.
However, c cannot be negative: a counterexample is a = 7, b = 2 and c = −3,
when the left-hand side of zdiv_zmult2_eq is −2 while the right-hand side
is −1. The prefix z in many theorem names recalls the use of Z to denote the
set of integers.
Induction is less important for integers than it is for the natural numbers,
but it can be valuable if the range of integers has a lower or upper bound.
There are four rules for integer induction, corresponding to the possible re-
lations of the bound (≥, >, ≤ and <):
[[k ≤ i; P k; ⋀i. [[k ≤ i; P i]] =⇒ P(i+1)]] =⇒ P i (int_ge_induct)
[[k < i; P(k+1); ⋀i. [[k < i; P i]] =⇒ P(i+1)]] =⇒ P i (int_gr_induct)
[[i ≤ k; P k; ⋀i. [[i ≤ k; P i]] =⇒ P(i-1)]] =⇒ P i (int_le_induct)
[[i < k; P(k-1); ⋀i. [[i < k; P i]] =⇒ P(i-1)]] =⇒ P i (int_less_induct)
These types provide true division, the overloaded operator /, which differs
from the operator div of the natural numbers and integers. The rationals
and reals are dense: between every two distinct numbers lies another. This
property follows from the division laws, since if x ≠ y then (x + y)/2 lies
between them:
a < b =⇒ ∃ r. a < r ∧ r < b (dense)
The real numbers are, moreover, complete: every set of reals that is
bounded above has a least upper bound. Completeness distinguishes the reals
from the rationals, for which the set {x | x² < 2} has no least upper bound.
(It could only be √2, which is irrational.)
!! Types rat, real and complex are provided by theory HOL-Complex, which is
Main extended with a definitional development of the rational, real and complex
numbers. Base your theory upon theory Complex_Main, not the usual Main.
Available in the logic HOL-NSA is the theory Hyperreal, which defines the
type hypreal of non-standard reals. These hyperreals include infinitesimals,
which represent infinitely small and infinitely large quantities; they facili-
tate proofs about limits, differentiation and integration [8]. The development
defines an infinitely large number, omega, and an infinitely small positive num-
ber, epsilon. The relation x ≈ y means “x is infinitely close to y.” Theory
Hyperreal also defines transcendental functions such as sine, cosine, expo-
nential and logarithm — even the versions for type real, because they are
defined using nonstandard limits.
Setting the flag Isabelle > Settings > Show Sorts will display the type classes of
all type variables.
Here is how the theorem mult_cancel_left appears with the flag set.
((c::'a::ring_no_zero_divisors) * (a::'a::ring_no_zero_divisors) =
c * (b::'a::ring_no_zero_divisors)) =
(c = (0::'a::ring_no_zero_divisors) ∨ a = b)
Simplifying with the AC-Laws. Suppose that two expressions are equal,
differing only in associativity and commutativity of addition. Simplifying
with the following equations sorts the terms and groups them to the right,
making the two expressions identical.
a + b + c = a + (b + c) (add.assoc)
a + b = b + a (add.commute)
a + (b + c) = b + (a + c) (add.left_commute)
The name ac_simps refers to the list of all three theorems, together with the
analogous laws for multiplication. They are all proved for semirings and
therefore hold for all numeric types.
Here is an example of the sorting effect. Start with this goal, which in-
volves type nat.
1. Suc (i + j * l * k + m * n) = f (n * m + i + k * j * l)
Simplify using ac_simps.
apply (simp add: ac_simps)
Here is the resulting subgoal.
1. Suc (i + (m * n + j * (k * l))) = f (i + (m * n + j * (k * l)))
Division Laws for Fields. Here is a selection of rules about the division
operator. The following are installed as default simplification rules in order
to express combinations of products and quotients as rational expressions:
a * (b / c) = a * b / c (times_divide_eq_right)
b / c * a = b * a / c (times_divide_eq_left)
a / (b / c) = a * c / b (divide_divide_eq_right)
a / b / c = a / (b * c) (divide_divide_eq_left)
Signs are extracted from quotients in the hope that complementary terms
can then be cancelled:
- (a / b) = - a / b (minus_divide_left)
- (a / b) = a / - b (minus_divide_right)
The following distributive law is available, but it is not installed as a
simplification rule.
(a + b) / c = a / c + b / c (add_divide_distrib)
Absolute Value. The absolute value function abs is available for all ordered
rings, including types int, rat and real. It satisfies many properties, such as
the following:
|x * y| = |x| * |y| (abs_mult)
(|a| ≤ b) = (a ≤ b ∧ - a ≤ b) (abs_le_iff)
|a + b| ≤ |a| + |b| (abs_triangle_ineq)
!! The absolute value bars shown above cannot be typed on a keyboard. They can
be entered using the X-symbol package. In ascii, type abs x to get |x|.
Raising to a Power. Another type class, ordered_idom, specifies rings that
also have exponentiation to a natural number power, defined using the obvi-
ous primitive recursion. Theory Power proves various theorems, such as the
following.
a ^ (m + n) = a ^ m * a ^ n (power_add)
a ^ (m * n) = (a ^ m) ^ n (power_mult)
|a ^ n| = |a| ^ n (power_abs)
typedecl my_new_type
This does not define my_new_type at all but merely introduces its name. Thus
we know nothing about this type, except that it is non-empty. Such declara-
tions without definitions are useful if that type can be viewed as a parameter
of the theory. A typical example is given in Sect. ??, where we define a tran-
sition relation over an arbitrary type of states.
In principle we can always get rid of such type declarations by making
those types parameters of every other type, thus keeping the theory generic.
In practice, however, the resulting clutter can make types hard to read.
If you are looking for a quick and dirty way of introducing a new type to-
gether with its properties: declare the type and state its properties as axioms.
Example:
axiomatization where
just_one: "∃ x::my_new_type. ∀ y. x = y"
Now we come to the most general means of safely introducing a new type,
the type definition. All other means, for example datatype, are based on
it. The principle is extremely simple: any non-empty subset of an existing
type can be turned into a new type. More precisely, the new type is specified
to be isomorphic to some non-empty subset of an existing type.
Let us work a simple example, the definition of a three-element type. It
is easily represented by the first three natural numbers:
typedef three = "{0::nat, 1, 2}"
In order to enforce that the representing set on the right-hand side is non-
empty, this definition actually starts a proof to that effect:
1. ∃ x. x ∈ {0, 1, 2}
Fortunately, this is easy enough to show, even auto could do it. In general,
one has to provide a witness, in our case 0:
apply(rule_tac x = 0 in exI)
by simp
This type definition introduces the new type three and asserts that it is a
copy of the set {0, 1, 2}. This assertion is expressed via a bijection between
the type three and the set {0, 1, 2}. To this end, the command declares the
following constants behind the scenes:
Rep_three :: three ⇒ nat
Abs_three :: nat ⇒ three
The situation is best summarized with the help of the following diagram,
where squares denote types and the irregular region denotes a set:
[Diagram: the type three is isomorphic, via Rep_three and Abs_three, to the
subset {0,1,2} of the type nat.]
So far, everything was easy. But it is clear that reasoning about three
will be hell if we have to go back to nat every time. Thus our aim must be to
raise our level of abstraction by deriving enough theorems about type three
to characterize it completely. And those theorems should be phrased in terms
of A, B and C, not Abs_three and Rep_three. Because of the simplicity of the
example, we merely need to prove that A, B and C are distinct and that they
exhaust the type, where the three elements are defined by
definition A :: three where "A ≡ Abs_three 0"
definition B :: three where "B ≡ Abs_three 1"
definition C :: three where "C ≡ Abs_three 2"
In processing our typedef declaration, Isabelle proves several helpful lem-
mas. The first two express injectivity of Rep_three and Abs_three:
(Rep_three x = Rep_three y) = (x = y) (Rep_three_inject)
[[x ∈ {0, 1, 2}; y ∈ {0, 1, 2}]]
=⇒ (Abs_three x = Abs_three y) = (x = y) (Abs_three_inject)
The following ones allow us to replace some x::three by Abs_three(y::nat), and
conversely y by Rep_three x :
y ∈ {0, 1, 2} =⇒ Rep_three (Abs_three y) = y (Abs_three_inverse)
Abs_three (Rep_three x) = x (Rep_three_inverse)
These theorems are proved for any type definition, with three replaced by
the name of the type in question.
Distinctness of A, B and C follows immediately if we expand their defini-
tions and rewrite with the injectivity of Abs_three:
lemma "A 6= B ∧ B 6= A ∧ A 6= C ∧ C 6= A ∧ B 6= C ∧ C 6= B"
by(simp add: Abs_three_inject A_def B_def C_def)
Of course we rely on the simplifier to solve goals like 0 6= 1.
The fact that A, B and C exhaust type three is best phrased as a case
distinction theorem: if you want to prove P x (where x is of type three) it
suffices to prove P A, P B and P C :
lemma three_cases: "[[ P A; P B; P C ]] =⇒ P x"
Again this follows easily using the induction principle stemming from the
type definition:
apply(induct_tac x)
Although three could be defined in one line, we have chosen this exam-
ple to demonstrate typedef because its simplicity makes the key concepts
particularly easy to grasp. If you would like to see a non-trivial example
that cannot be defined more directly, we recommend the definition of finite
multisets in the Library [4].
Let us conclude by summarizing the above procedure for defining a new
type. Given some abstract axiomatic description P of a type ty in terms of a
set of functions F , this involves three steps:
1. Find an appropriate type τ and subset A which has the desired properties
P, and make a type definition based on this representation.
2. Define the required functions F on ty by lifting analogous functions on
the representation via Abs ty and Rep ty.
3. Prove that P holds for ty by lifting P from the representation.
You can now forget about the representation and work solely in terms of the
abstract functions F and properties P.
9. Advanced Simplification and Induction
9.1 Simplification
This section describes features not covered until now. It also outlines the
simplification process itself, which can be helpful when the simplifier does
not do what you expect of it.
b = c =⇒ (if b then x else y) = (if c then x else y) (if_weak_cong)
Only the first argument is simplified; the others remain unchanged. This
makes simplification much faster and is faithful to the evaluation strategy in
programming languages, which is why this is the default congruence rule for
if. Analogous rules control the evaluation of case expressions.
You can declare your own congruence rules with the attribute cong , either
globally, in the usual manner,
declare theorem-name [cong]
or locally in a simp call by adding the modifier
cong: list of theorem names
Theorems are preprocessed into rewrite rules according to the following table,
where the left-hand form is turned into the rule(s) on the right:
¬P ↦ P = False
P −→ Q ↦ P =⇒ Q
P ∧ Q ↦ P, Q
∀x. P x ↦ P ?x
∀x ∈ A. P x ↦ ?x ∈ A =⇒ P ?x
if P then Q else R ↦ P =⇒ Q, ¬P =⇒ R
For example, ∀ x. x ≥ 0 −→ f x = x becomes the conditional rewrite rule
?x ≥ 0 =⇒ f ?x = ?x.
Now that we have learned about rules and logic, we take another look at
the finer points of induction. We consider two questions: what to do if the
proposition to be proved is not directly amenable to induction (Sect. 9.2.1),
and how to utilize (Sect. 9.2.2) and even derive (Sect. 9.2.3) new induction
schemas. We conclude with an extended example of induction (Sect. 9.2.4).
We cannot prove this equality because we do not know what hd and last
return when applied to [].
We should not have ignored the warning. Because the induction formula
is only the conclusion, induction does not affect the occurrence of xs in the
premises. Thus the case that should have been trivial becomes unprovable.
Fortunately, the solution is easy:
Pull all occurrences of the induction variable into the conclusion using −→.
Thus we should state the lemma as an ordinary implication (−→), letting
rule_format (Sect. 5.15) convert the result to the usual =⇒ form:
lemma hd_rev [rule_format]: "xs ≠ [] −→ hd(rev xs) = last xs"
This time, induction leaves us with a trivial base case:
1. [] ≠ [] −→ hd (rev []) = last []
A total of 2 subgoals...
And auto completes the proof.
If there are multiple premises A1 , …, An containing the induction variable,
you should turn the conclusion C into
A1 −→ · · · An −→ C .
Additionally, you may also have to universally quantify some other variables,
which can yield a fairly complex conclusion. However, rule_format can remove
any number of occurrences of ∀ and −→.
A second reason why your proposition may not be amenable to induction
is that you want to induct on a complex term, rather than a variable. In
general, induction on a term t requires rephrasing the conclusion C as
∀y1 . . . yn . x = t −→ C . (9.1)
where y1 . . . yn are the free variables in t and x is a new variable. Now you
can perform induction on x. An example appears in Sect. 9.2.2 below.
The very same problem may occur in connection with rule induction.
Remember that it requires a premise of the form (x1 , . . . , xk ) ∈ R, where R
is some inductively defined set and the xi are variables. If instead we have a
premise t ∈ R, where t is not just an n-tuple of variables, we replace it with
(x1 , . . . , xk ) ∈ R, and rephrase the conclusion C as
∀y1 . . . yn . (x1 , . . . , xk ) = t −→ C .
Readers who are puzzled by the form of statement (9.1) above should
remember that the transformation is only performed to permit induction.
Once induction has been applied, the statement can be transformed back
into something quite intuitive. For example, applying wellfounded induction
on x (w.r.t. ≺) to (9.1) and transforming the result a little leads to the goal
⋀y. ∀z. t z ≺ t y −→ C z =⇒ C y
where y stands for y1 . . . yn and the dependence of t and C on the free vari-
ables of t has been made explicit. Unfortunately, this induction schema cannot
be expressed as a single theorem because it depends on the number of free
variables in t — the notation y is merely an informal device.
1. ⋀n. ∀ m<n. ∀ i. m = f i −→ i ≤ f i =⇒ ∀ i. n = f i −→ i ≤ f i
After stripping the ∀ i, the proof continues with a case distinction on i. The
case i = 0 is trivial and we focus on the other case:
apply(rule allI)
apply(case_tac i)
apply(simp)
1. ⋀n i nat.
[[∀ m<n. ∀ i. m = f i −→ i ≤ f i; i = Suc nat]] =⇒ n = f i −→ i ≤ f i
Exercise 9.2.1 From the axiom and lemma for f, show that f is the identity
function.
Method induct_tac can be applied with any rule r whose conclusion is of
the form ?P ?x1 . . .?xn , in which case the format is
apply(induct_tac y1 . . . yn rule: r )
where y1 , . . . , yn are variables in the conclusion of the first subgoal.
A further useful induction rule is length_induct, induction on the length
of a list:
(⋀xs. ∀ ys. length ys < length xs −→ P ys =⇒ P xs) =⇒ P xs (length_induct)
Induction schemas are ordinary theorems and you can derive new ones
whenever you wish. This section shows you how, using the example of
nat_less_induct. Assume we only have structural induction available for nat
and want to derive complete induction. We must generalize the statement as
shown:
lemma induct_lem: "(⋀n::nat. ∀ m<n. P m =⇒ P n) =⇒ ∀ m<n. P m"
apply(induct_tac n)
The base case is vacuously true. For the induction step (m < Suc n) we dis-
tinguish two cases: case m < n is true by induction hypothesis and case m = n
follows from the assumption, again using the induction hypothesis:
apply(blast)
by(blast elim: less_SucE)
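The desired complete induction rule then follows by instantiating and
massaging induct_lem; a plausible one-liner:
theorem nat_less_induct: "(⋀n::nat. ∀ m<n. P m =⇒ P n) =⇒ P n"
by(insert induct_lem, blast)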
The proof of AF_lemma2 above is rather complicated, partly due to the SOME
operator involved. Below we give a simpler proof of AF_lemma2 based on an
auxiliary inductive definition.
Let us call a (finite or infinite) path A-avoiding if it does not touch any
node in the set A. Then AF_lemma2 says that if no infinite path from some state
s is A-avoiding, then s ∈ lfp (af A). We prove this by inductively defining
the set Avoid s A of states reachable from s by a finite A-avoiding path:
inductive set
Avoid :: "state ⇒ state set ⇒ state set"
for s :: state and A :: "state set"
where
"s ∈ Avoid s A"
| "[[ t ∈ Avoid s A; t ∈
/ A; (t,u) ∈ M ]] =⇒ u ∈ Avoid s A"
It is easy to see that for any infinite A-avoiding path f with f 0 ∈ Avoid
s A there is an infinite A-avoiding path starting with s because (by definition
of Avoid) there is a finite A-avoiding path from s to f 0. The proof is by
induction on f 0 ∈ Avoid s A. However, this requires the following reformu-
lation, as explained in Sect. 9.2.1 above; the rule_format directive undoes the
reformulation after the proof.
lemma ex_infinite_path[rule_format]:
"t ∈ Avoid s A =⇒
∀ f∈Paths t. (∀ i. f i ∉ A) −→ (∃ p∈Paths s. ∀ i. p i ∉ A)"
apply(erule Avoid.induct)
apply(blast)
apply(clarify)
apply(drule_tac x = "λi. case i of 0 ⇒ t | Suc i ⇒ f i" in bspec)
apply(simp_all add: Paths_def split: nat.split)
done
The base case (t = s) is trivial and proved by blast. In the induction step,
we have an infinite A-avoiding path f starting from u, a successor of t. Now
we simply instantiate the ∀ f∈Paths t in the induction hypothesis by the
path starting with t and continuing with f. That is what the above λ-term
expresses. Simplification shows that this is a path starting with t and that
the instantiated induction hypothesis implies the conclusion.
Now we come to the key lemma. Assuming that no infinite A-avoiding
path starts from s, we want to show s ∈ lfp (af A). For the inductive proof
this must be generalized to the statement that every point t “between” s and
A, in other words all of Avoid s A, is contained in lfp (af A):
lemma Avoid_in_lfp[rule_format(no_asm)]:
"∀ p∈Paths s. ∃ i. p i ∈ A =⇒ t ∈ Avoid s A −→ t ∈ lfp(af A)"
The proof is by induction on the “distance” between t and A. Remember that
lfp (af A) = A ∪ M⁻¹ `` lfp (af A). If t is already in A, then t ∈ lfp (af
A) is trivial. If t is not in A but all successors are in lfp (af A) (induction
hypothesis), then t ∈ lfp (af A) is again trivial.
The formal counterpart of this proof sketch is a well-founded induction
on M restricted to Avoid s A - A, roughly speaking:
 1. ⋀t. [[∀p∈Paths s. ∃i. p i ∈ A;
          ∀y. (t, y) ∈ M ∧ t ∉ A −→
              y ∈ Avoid s A −→ y ∈ lfp (af A);
          t ∈ Avoid s A]]
         =⇒ t ∈ lfp (af A)
 2. ∀p∈Paths s. ∃i. p i ∈ A =⇒
    wf {(y, x). (x, y) ∈ M ∧ x ∈ Avoid s A ∧ x ∉ A}
Having proved the main goal, we return to the proof obligation that the
relation used above is indeed well-founded. This is proved by contradiction: if
the relation is not well-founded then there exists an infinite A-avoiding path
all in Avoid s A, by theorem wf_iff_no_infinite_down_chain:
wf r = (∄f. ∀i. (f (Suc i), f i) ∈ r)
Thanks to the rule_format directive, Avoid_in_lfp is now
[[∀ p∈Paths s. ∃ i. p i ∈ A; t ∈ Avoid s A]] =⇒ t ∈ lfp (af A)
The main theorem is simply the corollary where t = s, when the assumption
t ∈ Avoid s A is trivially true by the first Avoid-rule. Isabelle confirms this:
theorem AF_lemma2: "{s. ∀ p ∈ Paths s. ∃ i. p i ∈ A} ⊆ lfp(af A)"
by(auto elim: Avoid_in_lfp intro: Avoid.intros)
10. Case Study: Verifying a Security Protocol
This protocol uses public-key cryptography. Each person has a private key,
known only to himself, and a public key, known to everybody. If Alice wants
to send Bob a secret message, she encrypts it using Bob’s public key (which
everybody knows), and sends it to Bob. Only Bob has the matching private
key, which is needed in order to decrypt Alice’s message.
The core of the Needham-Schroeder protocol consists of three messages:
1. A → B : {|Na, A|}Kb
2. B → A : {|Na, Nb|}Ka
3. A → B : {|Nb|}Kb
First, let’s understand the notation. In the first message, Alice sends Bob a
message consisting of a nonce generated by Alice (Na) paired with Alice’s
name (A) and encrypted using Bob’s public key (Kb). In the second message,
Bob sends Alice a message consisting of Na paired with a nonce generated
by Bob (Nb), encrypted using Alice’s public key (Ka). In the last message,
Alice returns Nb to Bob, encrypted using his public key.
When Alice receives Message 2, she knows that Bob has acted on her mes-
sage, since only he could have decrypted {|Na, A|}Kb and extracted Na. That
is precisely what nonces are for. Similarly, message 3 assures Bob that Alice
is active. But the protocol was widely believed [7] to satisfy a further prop-
erty: that Na and Nb were secrets shared by Alice and Bob. (Many protocols
generate such shared secrets, which can be used to lessen the reliance on
slow public-key operations.) Lowe found this claim to be false: if Alice runs
the protocol with someone untrustworthy (Charlie, say), then Charlie can start
a new run with another agent (Bob, say). Charlie uses Alice as an oracle,
masquerading as Alice to Bob [18].
In messages 1 and 3, Charlie removes the encryption using his private key
and re-encrypts Alice’s messages using Bob’s public key. Bob is left thinking
he has run the protocol with Alice, which was not Alice’s intention, and Bob
is unaware that the “secret” nonces are known to Charlie. This is a typical
man-in-the-middle attack launched by an insider.
Whether this counts as an attack has been disputed. In protocols of
this type, we normally assume that the other party is honest. To be hon-
est means to obey the protocol rules, so Alice’s running the protocol with
Charlie does not make her dishonest, just careless. After Lowe’s attack, Alice
has no grounds for complaint: this protocol does not have to guarantee any-
thing if you run it with a bad person. Bob does have grounds for complaint,
however: the protocol tells him that he is communicating with Alice (who is
honest) but it does not guarantee secrecy of the nonces.
Lowe also suggested a correction, namely to include Bob’s name in mes-
sage 2:
1. A → B : {|Na, A|}Kb
2. B → A : {|Na, Nb, B|}Ka
3. A → B : {|Nb|}Kb
If Charlie tries the same attack, Alice will receive the message {|Na, Nb, B|}Ka
when she was expecting to receive {|Na, Nb, C|}Ka. She will abandon the run,
and eventually so will Bob. Below, we shall look at parts of this protocol’s
correctness proof.
In ground-breaking work, Lowe [18] showed how such attacks could be
found automatically using a model checker. An alternative, which we shall
examine below, is to prove protocols correct. Proofs can be done under more
realistic assumptions because our model does not have to be finite. The strat-
egy is to formalize the operational semantics of the system and to prove
security properties using rule induction.
The spy is part of the system and must be built into the model. He is a mali-
cious user who does not have to follow the protocol. He watches the network
and uses any keys he knows to decrypt messages. Thus he accumulates ad-
ditional keys and nonces. These he can use to compose new messages, which
he may send to anybody.
Two functions enable us to formalize this behaviour: analz and synth.
Each function maps a set of messages to another set of messages. The set
analz H formalizes what the adversary can learn from the set of messages H .
The closure properties of this set are defined inductively.
inductive_set
  analz :: "msg set ⇒ msg set"
  for H :: "msg set"
where
    Inj [intro,simp]: "X ∈ H =⇒ X ∈ analz H"
  | Fst: "{|X,Y|} ∈ analz H =⇒ X ∈ analz H"
  | Snd: "{|X,Y|} ∈ analz H =⇒ Y ∈ analz H"
  | Decrypt [dest]:
      "[[Crypt K X ∈ analz H; Key(invKey K) ∈ analz H]]
       =⇒ X ∈ analz H"
Note the Decrypt rule: the spy can decrypt a message encrypted with
key K if he has the matching key, K⁻¹. Properties proved by rule induction
include the following:
G ⊆ H =⇒ analz G ⊆ analz H (analz_mono)
analz (analz H) = analz H (analz_idem)
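As an illustration of such a rule induction, here is one way the monotonicity
property might be proved (a sketch under the definition above, not necessarily
the script used in the Isabelle distribution, where the lemma is already
provided as analz_mono):
lemma "G ⊆ H =⇒ analz G ⊆ analz H"
apply(rule subsetI)                (* fix an arbitrary X ∈ analz G *)
apply(erule analz.induct)          (* rule induction on X ∈ analz G *)
apply(blast intro: analz.intros)+  (* each closure rule transfers to H *)
done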
The set of fake messages that an intruder could invent starting from H is
synth(analz H), where synth H formalizes what the adversary can build from
the set of messages H .
inductive_set
  synth :: "msg set ⇒ msg set"
  for H :: "msg set"
where
    Inj [intro]: "X ∈ H =⇒ X ∈ synth H"
  | Agent [intro]: "Agent agt ∈ synth H"
  | MPair [intro]:
      "[[X ∈ synth H; Y ∈ synth H]] =⇒ {|X,Y|} ∈ synth H"
  | Crypt [intro]:
      "[[X ∈ synth H; Key K ∈ H]] =⇒ Crypt K X ∈ synth H"
The set includes all agent names. Nonces and keys are assumed to be
unguessable, so none are included beyond those already in H . Two elements
of synth H can be combined, and an element can be encrypted using a key
present in H .
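For instance, an intruder holding a nonce and a key can pair the nonce with an
agent name and encrypt the result. Because the introduction rules above carry
the [intro] attribute, blast assembles the derivation automatically (an
illustrative lemma, not part of the original theory):
lemma "[[Nonce N ∈ H; Key K ∈ H]] =⇒ Crypt K {|Nonce N, Agent A|} ∈ synth H"
by blast  (* Inj injects Nonce N; Agent, MPair and Crypt build the message *)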
Like analz, this set operator is monotone and idempotent. It also satisfies
an interesting equation involving analz :
analz (synth H) = analz H ∪ synth H (analz_synth)
For a trace evs of events, knows Spy evs denotes the set of messages the spy
can observe in evs; thus synth (analz (knows Spy evs)) is everything that the
spy could generate.
The function pubK maps agents to their public keys. The function priK
maps agents to their private keys. It is merely an abbreviation (cf. Sect. 4.1.4)
defined in terms of invKey and pubK.
consts pubK :: "agent ⇒ key"
abbreviation priK :: "agent ⇒ key"
where "priK x ≡ invKey(pubK x)"
The set bad consists of those agents whose private keys are known to the spy.
Two axioms are asserted about the public-key cryptosystem. No two
agents have the same public key, and no private key equals any public key.
axiomatization where
  inj_pubK: "inj pubK" and
  priK_neq_pubK: "priK A ≠ pubK B"
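A typical consequence, useful to the simplifier, collapses equations between
public keys. The derivation below is a sketch (the rule name is our label;
injD is the standard destruction rule for inj):
lemma pubK_inject [simp]: "(pubK A = pubK B) = (A = B)"
by (auto dest: injD [OF inj_pubK])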
The protocol being modelled is the amended version:
1. A → B : {|Na, A|}Kb
2. B → A : {|Na, Nb, B|}Ka
3. A → B : {|Nb|}Kb
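The protocol itself is modelled as an inductive set, ns_public, of event
traces, with one rule per message. The text below discusses the rule for
message 2; here is a reconstruction of that rule in the style of the NS_Public
theory of the Isabelle distribution (the exact formulation there may differ
slightly):
| NS2: "[[evs2 ∈ ns_public; Nonce NB ∉ used evs2;
          Says A' B (Crypt (pubK B) {|Nonce NA, Agent A|}) ∈ set evs2]]
        =⇒ Says B A (Crypt (pubK A) {|Nonce NA, Nonce NB, Agent B|})
             # evs2 ∈ ns_public"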
where NB is a fresh nonce: Nonce NB ∉ used evs2. Writing the sender as A'
indicates that B does not know who sent the message. Calling the trace vari-
able evs2 rather than simply evs helps us know where we are in a proof after
many case-splits: every subgoal mentioning evs2 involves message 2 of the
protocol.
Benefits of this approach are simplicity and clarity. The semantic model
is set theory, proofs are by induction and the translation from the informal
notation to the inductive rules is straightforward.
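A first regularity lemma states that the spy sees an agent’s private key if
and only if that agent is compromised. Its statement and induction step are
reconstructed here from the NS_Public theory of the Isabelle distribution
(Sect. 10.6); the by blast below completes the proof:
lemma Spy_see_priK [simp]:
  "evs ∈ ns_public =⇒
   (Key (priK A) ∈ parts (knows Spy evs)) = (A ∈ bad)"
apply (erule ns_public.induct, simp_all)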
by blast
The Fake case is proved automatically. If priK A is in the extended trace
then either (1) it was already in the original trace or (2) it was generated by
the spy, who must have known this key already. Either way, the induction
hypothesis applies.
Unicity lemmas are regularity lemmas stating that specified items can
occur only once in a trace. The following lemma states that a nonce cannot
be used both as Na and as Nb unless it is known to the spy. Intuitively,
it holds because honest agents always choose fresh values as nonces; only
the spy might reuse a value, and he doesn’t know this particular value. The
proof script is short: induction, simplification, blast. The first line uses the
rule rev_mp to prepare the induction by moving two assumptions into the
induction formula.
lemma no_nonce_NS1_NS2:
"[[Crypt (pubK C) {|NA', Nonce NA, Agent D|} ∈ parts (knows Spy evs);
Crypt (pubK B) {|Nonce NA, Agent A|} ∈ parts (knows Spy evs);
evs ∈ ns_public]]
=⇒ Nonce NA ∈ analz (knows Spy evs)"
apply (erule rev_mp, erule rev_mp)
apply (erule ns_public.induct, simp_all)
apply (blast intro: analz_insertI)+
done
The following unicity lemma states that, if NA is secret, then its appearance
in any instance of message 1 determines the other components. The proof is
similar to the previous one.
lemma unique_NA:
  "[[Crypt(pubK B)  {|Nonce NA, Agent A |} ∈ parts(knows Spy evs);
     Crypt(pubK B') {|Nonce NA, Agent A'|} ∈ parts(knows Spy evs);
     Nonce NA ∉ analz (knows Spy evs); evs ∈ ns_public]]
   =⇒ A=A' ∧ B=B'"
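The script below is a sketch mirroring the previous proof, with one more
rev_mp because one more assumption must be moved into the induction formula
(the script in the distribution may differ in detail):
apply (erule rev_mp, erule rev_mp, erule rev_mp)
apply (erule ns_public.induct, simp_all)
apply (blast intro: analz_insertI)+
done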
The secrecy theorems for Bob (the second participant) are especially impor-
tant because they fail for the original protocol. The following theorem states
that if Bob sends message 2 to Alice, and both agents are uncompromised,
then Bob’s nonce will never reach the spy.
theorem Spy_not_see_NB [dest]:
  "[[Says B A (Crypt (pubK A) {|Nonce NA, Nonce NB, Agent B|}) ∈ set evs;
     A ∉ bad; B ∉ bad; evs ∈ ns_public]]
   =⇒ Nonce NB ∉ analz (knows Spy evs)"
To prove it, we must formulate the induction properly (one of the assump-
tions mentions evs), apply induction, and simplify:
apply (erule rev_mp, erule ns_public.induct, simp_all)
The proof states are too complicated to present in full. Let’s examine the
simplest subgoal, that for message 1. The following event has just occurred:
1. A′ → B′ : {|Na′, A′|}Kb′
The variables above have been primed because this step belongs to a different
run from that referred to in the theorem statement — the theorem refers to a
past instance of message 2, while this subgoal concerns message 1 being sent
just now. In the Isabelle subgoal, instead of primed variables like B′ and Na′
we have Ba and NAa:
 1. ⋀evs1 NAa Ba.
      [[A ∉ bad; B ∉ bad; evs1 ∈ ns_public;
        Says B A (Crypt (pubK A) {|Nonce NA, Nonce NB, Agent B|})
          ∈ set evs1 −→
        Nonce NB ∉ analz (knows Spy evs1);
        Nonce NAa ∉ used evs1]]
      =⇒ Ba ∈ bad −→
         Says B A (Crypt (pubK A) {|Nonce NA, Nonce NB, Agent B|})
           ∈ set evs1 −→
         NB ≠ NAa
The simplifier has used a default simplification rule that does a case analysis
for each encrypted message on whether or not the decryption key is compro-
mised.
analz (insert (Crypt K X) H) =
(if Key (invKey K) ∈ analz H
then insert (Crypt K X) (analz (insert X H))
else insert (Crypt K X) (analz H)) (analz_Crypt_if)
The simplifier has also used Spy_see_priK, proved in Sect. 10.6 above, to yield
Ba ∈ bad.
Recall that this subgoal concerns the case where the last message to be
sent was
1. A′ → B′ : {|Na′, A′|}Kb′
This message can compromise Nb only if Nb = Na′ and B′ is compromised,
allowing the spy to decrypt the message. The Isabelle subgoal says precisely
this, if we allow for its choice of variable names. Proving NB ≠ NAa is easy: NB
was sent earlier, while NAa is fresh; formally, we have the assumption Nonce
NAa ∉ used evs1.
Note that our reasoning concerned B ’s participation in another run.
Agents may engage in several runs concurrently, and some attacks work by
interleaving the messages of two runs. With model checking, this possibility
can cause a state-space explosion, and for us it certainly complicates proofs.
The biggest subgoal concerns message 2. It splits into several cases, such as
whether or not the message just sent is the very message mentioned in the
theorem statement. Some of the cases are proved by unicity, others by the
induction hypothesis. For all those complications, the proofs are automatic
by blast with the theorem no_nonce_NS1_NS2.
The remaining theorems about the protocol are not hard to prove. The
following one asserts a form of authenticity: if B has sent an instance of
message 2 to A and has received the expected reply, then that reply really
originated with A. The proof is a simple induction.
theorem B_trusts_NS3:
  "[[Says B A (Crypt (pubK A) {|Nonce NA, Nonce NB, Agent B|}) ∈ set evs;
     Says A' B (Crypt (pubK B) (Nonce NB)) ∈ set evs;
     A ∉ bad; B ∉ bad; evs ∈ ns_public]]
   =⇒ Says A B (Crypt (pubK B) (Nonce NB)) ∈ set evs"
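A sketch of that induction, following the pattern used throughout this chapter
(the exact script in the distribution may differ): the two Says assumptions
are moved into the goal, and after induction and simplification the remaining
cases fall to blast, which applies Spy_not_see_NB automatically because that
theorem was declared [dest]:
apply (erule rev_mp, erule rev_mp)
apply (erule ns_public.induct, simp_all)
apply (blast)+
done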
From similar assumptions, we can prove that A started the protocol run
by sending an instance of message 1 involving the nonce NA. For this theorem,
the conclusion is
Says A B (Crypt (pubK B) {|Nonce NA, Agent A|}) ∈ set evs
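Assembled into a full statement, the theorem might read as follows (a
reconstruction; the assumptions are as in B_trusts_NS3, and the name is the
one we believe the distribution’s NS_Public theory uses):
theorem B_trusts_protocol:
  "[[Says B A (Crypt (pubK A) {|Nonce NA, Nonce NB, Agent B|}) ∈ set evs;
     Says A' B (Crypt (pubK B) (Nonce NB)) ∈ set evs;
     A ∉ bad; B ∉ bad; evs ∈ ns_public]]
   =⇒ Says A B (Crypt (pubK B) {|Nonce NA, Agent A|}) ∈ set evs"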
Analogous theorems can be proved for A, stating that nonce NA remains se-
cret and that message 2 really originates with B. Even the flawed protocol
establishes these properties for A; the flaw only harms the second participant.
Detailed information on this protocol verification technique can be found
elsewhere [30], including proofs of an Internet protocol [31]. We must stress
that the protocol discussed in this chapter is trivial. There are only three
messages; no keys are exchanged; we merely have to prove that encrypted
data remains secret. Real-world protocols are much longer and distribute
many secrets to their participants. To be realistic, the model has to include
the possibility of keys being lost dynamically due to carelessness. If those
keys have been used to encrypt other sensitive information, there may be
cascading losses. We may still be able to establish a bound on the losses and
to prove that other protocol runs function correctly [32]. Proofs of real-world
protocols follow the strategy illustrated above, but the subgoals can be much
bigger and there are more of them.
A. Appendix

Mathematical symbols and their ASCII equivalents:

Symbol              ASCII         X-Symbol
⟦                   [|            \<lbrakk>
⟧                   |]            \<rbrakk>
=⇒                  ==>           \<Longrightarrow>
⋀                   !!            \<And>
≡                   ==            \<equiv>
⇌                   ==            \<rightleftharpoons>
⇀                   =>            \<rightharpoonup>
↽                   <=            \<leftharpoondown>
λ                   %             \<lambda>
⇒                   =>            \<Rightarrow>
∧                   &             \<and>
∨                   |             \<or>
−→                  -->           \<longrightarrow>
¬                   ~             \<not>
≠                   ~=            \<noteq>
∀                   ALL, !        \<forall>
∃                   EX, ?         \<exists>
∃!                  EX!, ?!       \<exists>!
ε                   SOME, @       \<epsilon>
◦                   o             \<circ>
|...|               abs           \<bar> \<bar>
≤                   <=            \<le>
×                   *             \<times>
∈                   :             \<in>
∉                   ~:            \<notin>
⊆                   <=            \<subseteq>
⊂                   <             \<subset>
∪                   Un            \<union>
∩                   Int           \<inter>
⋃                   UN, Union     \<Union>
⋂                   INT, Inter    \<Inter>
∗ (superscript)     ^*            \<^sup>*
⁻¹ (superscript)    ^-1           \<inverse>