ACSL: ANSI C Specification Language
Preliminary design (version 1.4, November 5, 2008)
This work has been supported by the ’CAT’ ANR project (ANR-05-RNTL-0030x) and by the ANR
CIFRE contract 2005/973.
1 Introduction
  1.1 Organization of this document
  1.2 Generalities about Annotations
    1.2.1 Kinds of annotations
    1.2.2 Parsing annotations in practice
    1.2.3 About preprocessing
    1.2.4 About keywords
  1.3 Notations for grammars
2 Specification language
  2.1 Lexical rules
  2.2 Logic expressions
    2.2.1 Operators precedence
    2.2.2 Semantics
    2.2.3 Typing
    2.2.4 Integer arithmetic and machine integers
    2.2.5 Real numbers and floating point numbers
  2.3 Function contracts
    2.3.1 Built-in constructs \old and \result
    2.3.2 Simple function contracts
    2.3.3 Contracts with named behaviors
    2.3.4 Memory locations
    2.3.5 Default contracts, multiple contracts
  2.4 Statement annotations
    2.4.1 Assertions
    2.4.2 Loop annotations
    2.4.3 Built-in construct \at
    2.4.4 Statement contracts
  2.5 Termination
    2.5.1 Integer measures
    2.5.2 General measures
    2.5.3 Recursive function calls
    2.5.4 Non-terminating functions
  2.6 Logic specifications
    2.6.1 Predicate and function definitions
    2.6.2 Lemmas
    2.6.3 Inductive predicates
    2.6.4 Axiomatic definitions
    2.6.5 Polymorphic logic types
3 Libraries
  3.1 Libraries of logic specifications
    3.1.1 Real numbers
    3.1.2 Finite lists
    3.1.3 Sets and Maps
  3.2 Jessie library: logical addressing of memory blocks
    3.2.1 Abstract level of pointer validity
    3.2.2 Strings
  3.3 Memory leaks
4 Conclusion
A Appendices
  A.1 Glossary
  A.2 Comparison with JML
    A.2.1 Low-level language vs. inheritance-based one
    A.2.2 Deductive verification vs. RAC
    A.2.3 Syntactic differences
  A.3 Typing rules
    A.3.1 Rules for terms
    A.3.2 Typing rules for sets
  A.4 Specification Templates
    A.4.1 Accessing a C variable that is masked
  A.5 Illustrative example
  A.6 Changes
    A.6.1 Version 1.4
    A.6.2 Version 1.3
    A.6.3 Version 1.2
Bibliography
List of Figures
Index
This is a preliminary design of the ACSL language, a deliverable of task 7.2 of the ANR RNTL project
CAT (http://www.rntl.org/projet/resume2005/cat.htm). In this project, a reference
implementation of ACSL is provided: the Frama-C platform (http://frama-c.cea.fr).
This is version 1.4 of the ACSL design. Several features may still evolve in the future. In particular,
some features in this document are considered experimental, meaning that their syntax and semantics are
not yet fixed. These features are marked with EXPERIMENTAL. They must also be considered
advanced features, which are not needed for basic use of the specification language.
Acknowledgements
We gratefully thank all the people who contributed to this document: Sylvie Boldo, Jean-Louis Colaço,
Pierre Crégut, Pascal Cuoq, David Delmas, Stéphane Duprat, Arnaud Gotlieb, Thierry Hubert, Dillon
Pariente, Pierre Rousseau, Julien Signoles, Jean Souyris.
Introduction
This document is a reference manual for the ACSL implementation provided by the Frama-C
framework [10]. ACSL is an acronym for “ANSI C Specification Language”. This is a Behavioral Interface
Specification Language (a.k.a. BISL) implemented in the Frama-C framework. As suggested by its
name, it aims at specifying behavioral properties of C source code. The main inspiration for this
language comes from the specification language of the Caduceus tool [8, 9] for deductive verification of
behavioral properties of C programs. The latter is itself inspired by the Java Modeling Language (JML [18]),
which aims at similar goals for Java source code: JML targets both runtime assertion checking
and static verification using the ESC/Java2 tool [14], whereas ACSL targets static verification and deductive
verification (see Appendix A.2 for a detailed comparison between ACSL and JML).
Going back further in history, the design of JML was guided by the general design-by-contract principle
proposed by Bertrand Meyer, who took his own inspiration from the concepts of preconditions and
postconditions on a routine, going back at least to Dijkstra, Floyd and Hoare in the late 60’s and early
70’s, and originally implemented in the Eiffel language.
In this document, we assume that the reader has a good knowledge of the ANSI C programming
language [13, 12].
– function contract : such an annotation is inserted just before the declaration or the definition
of a function. See section 2.3.
– global invariant : this is allowed at the level of global declarations. See section 2.11.
– type invariant : this allows one to declare structure or union invariants, as well as invariants on
type names introduced by typedef. See section 2.11.
– logic specifications : definitions of logic functions or predicates, lemmas, axiomatizations by
declaration of new logic types, logic functions, predicates with axioms they satisfy. Such an
annotation is placed at the level of global declarations.
• Statement annotations:
– assert clause : these are allowed everywhere a C label is allowed, or just before a block
closing brace.
– loop annotation (invariant, variant, assigns clauses): allowed immediately before a loop
statement: for , while , do . . . while . See Section 2.4.2.
– statement contract : very similar to a function contract, and placed before a statement or a
block. Semantical conditions must be checked (normal termination only, no goto going
inside, no goto going outside). See Section 2.4.4.
– ghost code : regular C code, only visible from the specifications, that is only allowed to
modify ghost variables. See section 2.12. This includes ghost braces for enclosing
blocks.
return x /*@ +1 */ ;
In our language, this is forbidden. Technically, the current implementation of Frama-C isolates the
comments in a first step of syntax analysis, and then parses a second time. Nevertheless, the grammar
and the corresponding parser must be carefully designed to avoid interaction of annotations with the
code. For example, in code such as
the statement c=1 must be understood as the branch of the if. This is ensured by the grammar below,
which says that assert annotations are not statements themselves, but are attached to the statement
that follows, like C labels.
Specification language
• Some UTF8 characters may be used in place of some constructs, as shown in the following table:
>=        ≥     0x2265
<=        ≤     0x2264
>         >     0x003E
<         <     0x003C
!=        ≢     0x2262
==        ≡     0x2261
==>       =⇒    0x21D2
<==>      ⇐⇒    0x21D4
&&        ∧     0x2227
||        ∨     0x2228
^^        ⊕     0x22BB
!         ¬     0x00AC
\forall   ∀     0x2200
\exists   ∃     0x2203
• Comments can be put inside ACSL annotations. They use the C++ format, i.e. begin with
// and extend to the end of current line.
bin-op ::= + | - | * | / | %
| == | != | <= | >= | > | <
| && | || | ^^ boolean operations
| & | | | --> | <--> | ^ bitwise operations
unary-op ::= + | - unary plus and minus
| ! boolean negation
| ~ bitwise complementation
| * pointer dereferencing
| & address-of operator
term ::= \true | \false
| integer integer constants
| real real constants
| id variables
| unary-op term
| term bin-op term
| term [ term ] array access
| { term for [ term ] = term } array functional modifier
| term . id structure field access
| { term for id = term } structure field functional modifier
| term -> id
| ( type-expr ) term cast
| id ( term (, term)∗ ) function application
| ( term ) parentheses
| term ? term : term
| \let id = term ; term local binding
| sizeof ( term )
| sizeof ( C-type-expr )
| id : term syntactic naming
rel-op ::= == | != | <= | >= | > | <
pred ::= \true | \false
| term (rel-op term)+ comparisons (see remark)
| id ( term (, term)∗ ) predicate application
| ( pred ) parentheses
| pred && pred conjunction
| pred || pred disjunction
| pred ==> pred implication
| pred <==> pred equivalence
| ! pred negation
| pred ^^ pred exclusive or
| term ? pred : pred
| pred ? pred : pred
| \let id = term ; pred local binding
| \forall binders ; pred universal quantification
| \exists binders ; pred existential quantification
| id : pred syntactic naming
terms in classical first-order logic. The grammar for binders and type expressions is given separately in
Figure 2.2.
With respect to C pure expressions, the additional constructs are as follows:
Additional connectives C operators && (UTF8: ∧), || (UTF8: ∨) and ! (UTF8: ¬) are used as logical
connectives. There are additional connectives ==> (UTF8: =⇒) for implication, <==> (UTF8:
⇐⇒) for equivalence and ^^ (UTF8: ⊕) for exclusive or. These logical connectives all have a
bitwise counterpart, either C ones like &, |, ~ and ^, or additional ones like bitwise implication
--> and bitwise equivalence <-->.
Quantification Universal quantification is denoted by \forall τ x1 , . . . , xn ; e and existential quan-
tification by \exists τ x1 , . . . , xn ; e.
Local binding \let x = e1 ; e2 introduces the name x for expression e1 which can be used in expres-
sion e2 .
Conditional c ? e1 : e2 . There is a subtlety here: the condition may be either a boolean term or a
predicate. In case of a predicate, the two branches must also be predicates, so that this construct acts
as a connective with the following semantics: c?e1 :e2 is equivalent to (c==>e1 )&&(!c==>e2 ).
Syntactic naming id:e is a term or a predicate equivalent to e. It is different from local naming with
\let : the name cannot be reused in other terms or predicates. It is only for readability purposes.
Functional modifier The composite element modifier is an additional operator related to the C
structure field and array accessors. The expression { s for id = v } denotes the same structure as s,
except for the field id, whose value is v. The equivalent expression for an array is { t for [ i ]
= v }, which denotes the same array as t, except for the ith element, whose value is v.
See Section 2.10 for an example of use of these operators.
Logic functions Applications in terms and in propositions are not applications of C functions, but of
logic functions or predicates; see Section 2.6 for detail.
Consecutive comparison operators The construct t1 relop1 t2 relop2 t3 · · · tk with several consecutive
comparison operators is a shortcut for (t1 relop1 t2 ) && (t2 relop2 t3 ) && · · ·. The relopi
operators must all be in the same “direction”, i.e. they must all belong either to {<, <=, ==}
or to {>, >=, ==}. Expressions such as x < y > z or x != y != z are not allowed.
To enforce the same interpretation as in C expressions, one may need to add extra parentheses:
a==b<c is equivalent to a==b&&b<c, whereas a==(b<c) is equivalent to \let x=b<c;a==x.
This situation raises some issues, see example below.
There is a subtlety with respect to comparison operators: they are predicates when used in predicate
position, and boolean functions when used in term position.
Example 2.1 Let’s consider the following example:
int f(int a, int b) { return a < b; }
The candidate postconditions compare as follows:
• the obvious postcondition \result == a < b is not the right one because it is actually a
shortcut for \result == a && a < b.
• adding parentheses results in a correct postcondition \result == (a < b). Note however
that there is an implicit conversion (see Section 2.2.3) from int (the type of \result)
to boolean (the type of (a<b)).
• an equivalent postcondition, which does not rely on implicit conversion, is
(\result != 0) == (a<b). Both pairs of parentheses are mandatory.
• \result == (integer)(a<b) is also acceptable because it compares two integers. The
cast towards integer enforces a<b to be understood as a boolean term. Notice that a cast
towards int would also be acceptable.
• \result != 0 <==> a < b is acceptable because it is an equivalence between two
predicates.
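The gap between the chained reading used in annotations and the C reading of a == b < c can be checked in plain C. The following is a minimal sketch: the two helper functions c_reading and acsl_reading are hypothetical names introduced only for this illustration, and each one spells out explicitly how the corresponding language groups the expression.

```c
/* C reading of "a == b < c": relational operators bind tighter than ==,
   so the expression is parsed as a == (b < c). */
int c_reading(int a, int b, int c) {
    return a == (b < c);
}

/* Reading of "a == b < c" inside an annotation: consecutive comparisons
   are a shortcut for (a == b) && (b < c). */
int acsl_reading(int a, int b, int c) {
    return (a == b) && (b < c);
}
```

With a = 1, b = 2, c = 3 the two readings disagree, which is why extra parentheses are needed whenever the C interpretation is the intended one.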
2.2.2 Semantics
The semantics of logic expressions in ACSL is based on mathematical first-order logic [24]. In particular,
it is a 2-valued logic with only total functions. Consequently, expressions are never “undefined”. This
is an important design choice and the specification writer should be aware of that. (For a discussion
about the issues raised by such design choices, in similar specification languages such as JML, see the
comprehensive list compiled by Patrice Chalin [2, 3].)
Having only total functions implies that one can write terms such as 1/0, or *p when p is null (or
more generally when it points to a memory cell that is not properly allocated). In particular, the predicates
1/0 == 1/0
*p == *p
are valid, since they are instances of the axiom ∀x, x = x of first-order logic. Of course, there is no
way to deduce anything useful about such terms. As usual, it is up to the specification designer to write
consistent assertions. For example, when introducing the following lemma (see Section 2.6):
2.2.3 Typing
The language of logic expressions is typed (as in multi-sorted first-order logic). Types are either C types
or logic types defined as follows:
• “mathematical” types: integer for unbounded, mathematical integers, real for real numbers,
boolean for booleans (with values written \true and \false);
• C integral types char, short, int and long, signed or unsigned, are all subtypes of type
integer;
Notes:
• There is a distinction between booleans and predicates. The expression x < y in term position is
a boolean, and the same expression is also allowed in predicate position.
• Unlike in C, there is a distinction between booleans and integers. There is an implicit promotion
from integers to booleans, thus one may write x && y instead of x != 0 && y != 0. If the
reverse conversion is needed, an explicit cast is required, e.g. (int)(x>0)+1, where \false
becomes 0 and \true becomes 1.
• Quantification can be made over any type: logic types and C types. Quantification over pointers
must be used carefully, since it depends on the memory state where dereferencing is done (see
Section 2.2.4 and Section 2.6.9).
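As an illustration of the explicit cast from boolean to int mentioned in the notes above, here is a minimal sketch. The function one_or_two is a hypothetical name chosen for this example; its postcondition reuses the expression (int)(x>0)+1 from the text, and the C body relies on the fact that a C comparison already yields 0 or 1.

```c
/*@ ensures \result == (int)(x > 0) + 1;
  @*/
int one_or_two(int x) {
    /* In C, (x > 0) evaluates to 0 or 1, matching the cast of
       \false to 0 and \true to 1 in the annotation. */
    return (x > 0) + 1;
}
```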
• q × d + r = n;
Example 2.2 The following examples illustrate the results of division and modulo depending on the sign
of their arguments:
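The sign behavior can be reproduced with the C operators themselves, assuming the C99 convention in which the quotient is truncated toward zero and the remainder takes the sign of the dividend. The sketch below is only an illustration: the struct qr and the function div_mod are hypothetical names, not part of the language.

```c
/* Quotient and remainder of n by d with C99 semantics (d must be nonzero):
   the quotient is truncated toward zero, so q * d + r == n always holds
   and r has the sign of n. */
struct qr {
    int q;
    int r;
};

struct qr div_mod(int n, int d) {
    struct qr res = { n / d, n % d };
    return res;
}
```

For the four sign combinations of 5 and 3, this gives the pairs (1, 2), (-1, -2), (-1, 2) and (1, -2) for 5/3, -5/3, 5/-3 and -5/-3 respectively.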
Hexadecimal and octal constants are always non-negative. Suffixes u and l for C constants are allowed
but meaningless.
is not allowed because x+1, which is a mathematical integer, must be cast to int. One should write
either
//@ logic integer f(int x) = x+1 ;
or
//@ logic int f(int x) = (int)(x+1) ;
Enum types
Enum types are also interpreted as mathematical integers. Casting an integer into an enum in the logic
must give the same result as if it was in the C code.
Bitwise operations
Like arithmetic operations, bitwise operations apply to any mathematical integer: any mathematical
integer has a unique infinite two's-complement binary representation with infinitely many zeros (for
non-negative numbers) or ones (for negative numbers) on the left. Bitwise operations then apply to this
representation.
Example 2.6
• 7 & 12 = · · · 00111 & · · · 001100 = · · · 00100 = 4
• ~5 = ~ · · · 00101 = · · · 111010 = -6
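For values that fit in an int, the same results can be observed with the C operators, assuming a two's-complement machine representation (an assumption about the target platform, not something stated by the text above). The helper names bit_and and bit_not are hypothetical.

```c
/* On a two's-complement machine, the C bitwise operators on int agree
   with the infinite two's-complement reading for values that fit in int. */
int bit_and(int a, int b) {
    return a & b;
}

int bit_not(int a) {
    return ~a;
}
```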
Transcendental functions
Classical mathematical operations like exponential, sine, cosine, etc. are supposed to be available in
some library (see Section 3.1.1).
Quantification
Quantification over a variable of type real is of course usual quantification over real numbers.
Quantification over float (resp. double) types is allowed too, and is supposed to range over all real
numbers representable as floats (resp. doubles). In particular, NaN, +infinity and -infinity are not
included in the considered range.
• The caller of the function must guarantee that it is called in a state where the property
P1 &&P2 && . . . holds.
• The called function returns a state where the property E1 &&E2 && . . . holds.
• All memory locations of the pre-state that do not belong to the set L1 ∪ L2 ∪ . . . remain allocated
and are left unchanged in the post-state. The set L1 ∪ L2 ∪ . . . itself is interpreted in the
pre-state.
Notice that the multiplicity of clauses is provided mainly to improve readability, since the contract
above is equivalent to the following simplified one:
/*@ requires P1 && P2 && . . .;
@ assigns L1 , L2 , . . .;
@ ensures E1 && E2 && . . .;
@*/
If no clause requires is given, it defaults to \true, and similarly for ensures clause. Giving no
assigns clause means that locations assigned by the function are not specified, so the caller has no
information at all on this function’s side effects. See Section 2.3.5 for more details on default status of
clauses.
Example 2.9 The following function is given a simple contract for computation of the integer square
root.
/*@ requires x ≥ 0;
@ ensures \result ≥ 0;
@ ensures \result * \result ≤ x;
@ ensures x < (\result + 1) * (\result + 1);
@*/
int isqrt(int x);
The contract means that the function must be called with a nonnegative argument, and returns a value
satisfying the conjunction of the three ensures clauses. Inside these ensures clauses, the use
of the construct \old(x) is not necessary, even if the function modifies the formal parameter
x, because a function call operates on a copy of the actual arguments, which
remain unaltered. In fact, x denotes the actual argument of the isqrt call, which has the same
value in the pre-state as in the post-state.
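One possible body satisfying this contract is sketched below. It is only an illustration (a simple linear search, not an implementation prescribed by this document); the local variable r is a name chosen here. Note that the products in the ensures clauses are computed over mathematical integers, so they cannot overflow, whereas the C code must avoid overflowing int itself.

```c
/*@ requires x >= 0;
  @ ensures \result >= 0;
  @ ensures \result * \result <= x;
  @ ensures x < (\result + 1) * (\result + 1);
  @*/
int isqrt(int x) {
    int r = 0;
    /* Continue while (r + 1) * (r + 1) <= x. The test is written with a
       division so that the C product cannot overflow. */
    while (r + 1 <= x / (r + 1)) {
        r++;
    }
    return r;
}
```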
Example 2.10 The following function is given a contract to specify that it increments the value pointed
to by the pointer given as argument.
The contract means that the function must be called with a pointer p that points to a safely allocated
memory location (see Section 2.7 for details on the \valid built-in predicate). It does not modify any
memory location but the one pointed to by p. Finally, the ensures clause specifies that the value *p is
incremented by one.
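A contract matching this description could look as follows. This is a sketch reconstructed from the prose above: the function name incrstar and the one-line body are assumptions made for the illustration.

```c
/*@ requires \valid(p);
  @ assigns *p;
  @ ensures *p == \old(*p) + 1;
  @*/
void incrstar(int *p) {
    /* Only the cell pointed to by p is written, as stated by assigns. */
    (*p)++;
}
```

Here \old(*p) is needed in the ensures clause, because the memory cell *p, unlike the parameter p itself, has a different value in the pre-state and in the post-state.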
/*@ requires P ;
@ behavior b1 :
@ assumes A1 ;
@ requires R1 ;
@ assigns L1 ;
@ ensures E1 ;
@ behavior b2 :
@ assumes A2 ;
@ requires R2 ;
@ assigns L2 ;
@ ensures E2 ;
@*/
The semantics of such a contract is as follows:
• The caller of the function must guarantee that the call is performed in a state where the property
P &&(A1 ==>R1 )&&(A2 ==>R2 ) holds.
• The called function returns a state where the properties \old(Ai )==>Ei hold for each i.
• For each i, if the function is called in a pre-state where Ai holds, then each memory location of
that pre-state that does not belong to the set Li remains allocated and is left unchanged in the
post-state.
Notice that the requires clauses in the behaviors are provided mainly to improve readability (to
avoid some duplication of formulas), since the contract above is equivalent to the following simplified
one:
/*@ requires P &&(A1 ==>R1 )&&(A2 ==>R2 );
@ behavior b1 :
@ assumes A1 ;
@ assigns L1 ;
@ ensures E1 ;
@ behavior b2 :
@ assumes A2 ;
@ assigns L2 ;
@ ensures E2 ;
@*/
Note that a simple contract such as
/*@ requires P ; assigns L; ensures E; */
is actually equivalent to a single named behavior as follows:
/*@ requires P ;
@ behavior <any name>:
@ assumes \true;
@ assigns L;
@ ensures E;
@*/
Similarly, global assigns and ensures clauses are equivalent to a single named behavior. More
precisely,
/*@ requires P ;
@ assigns L;
@ ensures E;
@ behavior b1 : ...
@ behavior b2 : ...
@ ...
@*/
is equivalent to
/*@ requires P ;
@ behavior <any name>:
@ assumes \true;
@ assigns L;
@ ensures E;
@ behavior b1 : ...
@ behavior b2 : ...
@ ...
@*/
Example 2.11 In the following, bsearch(t, n, v) searches for element v in array t between indices 0
and n − 1.
The precondition requires array t to be allocated at least from indices 0 to n − 1. The two named
behaviors correspond respectively to the successful behavior and the failing behavior.
Since the function is performing a binary search, it requires the array t to be sorted in increasing
order: this is the purpose of the predicate named t_is_sorted in the assumes clause of the behavior
named failure.
See Section 2.4.2 for a continuation of this example.
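A contract along the lines described above could be sketched as follows. Several points are assumptions made for this illustration: the function is named bin_search here to avoid clashing with the C library's bsearch, the validity of the index range is written \valid(t + (0 .. n-1)) although this version of the language may spell it differently, and the body with its local variables l, u and m is one ordinary way to write a binary search.

```c
/*@ requires n >= 0 && \valid(t + (0 .. n-1));
  @ assigns \nothing;
  @ ensures -1 <= \result <= n-1;
  @ behavior success:
  @   ensures \result >= 0 ==> t[\result] == v;
  @ behavior failure:
  @   assumes t_is_sorted:
  @     \forall integer k1, integer k2;
  @       0 <= k1 <= k2 <= n-1 ==> t[k1] <= t[k2];
  @   ensures \result == -1 ==>
  @     \forall integer k; 0 <= k < n ==> t[k] != v;
  @*/
int bin_search(double t[], int n, double v) {
    int l = 0, u = n - 1;
    while (l <= u) {
        int m = l + (u - l) / 2;   /* midpoint without overflow */
        if (t[m] < v) l = m + 1;
        else if (t[m] > v) u = m - 1;
        else return m;             /* found: index of v */
    }
    return -1;                     /* not found */
}
```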
Example 2.12 The following function illustrates the importance of different assigns clauses for each
behavior.
Its contract means that it assigns values pointed to by p or by q, conditionally on the sign of n.
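A function of this kind could be specified as sketched below. The function name assign_by_sign, the behavior names p_changed and q_changed, and the body are assumptions made for the illustration; the point is that each behavior carries its own assigns clause.

```c
/*@ behavior p_changed:
  @   assumes n > 0;
  @   requires \valid(p);
  @   assigns *p;
  @   ensures *p == n;
  @ behavior q_changed:
  @   assumes n <= 0;
  @   requires \valid(q);
  @   assigns *q;
  @   ensures *q == n;
  @*/
void assign_by_sign(int n, int *p, int *q) {
    if (n > 0) *p = n;   /* behavior p_changed: only *p is written */
    else *q = n;         /* behavior q_changed: only *q is written */
}
```

A single global assigns *p, *q; clause would be less precise: a caller passing a positive n could no longer conclude that *q is unchanged.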
Completeness of behaviors
Notice that in a contract with named behaviors, it is not required that the disjunction of the Ai is true, i.e.
it is not mandatory to provide a “complete” set of behaviors. If such a condition is sought, it is possible
to add the following clause to a contract:
/*@ ...
@ complete behaviors b1 , . . . , bn ;
@*/
It specifies that the set of behaviors b1 , . . . , bn is complete, i.e. that
\[ R \Rightarrow \bigvee_{1 \le i \le n} A_i \]
holds, where R is the precondition of the contract. The simplified version of that clause
/*@ ...
@ complete behaviors;
@*/
means that all behaviors given in the contract should be taken into account.
Similarly, it is not required that two distinct behaviors are disjoint. If desired, this can be specified
with the following clause:
/*@ ...
@ disjoint behaviors b1 , . . . , bn ;
@*/
It means that the given behaviors are pairwise disjoint, i.e. that, for all distinct i and j,
\[ R \Rightarrow \neg (A_i \wedge A_j) \]
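Both clauses can be combined in a single contract. The following sketch uses a hypothetical function sign_of with three behaviors whose assumptions cover every int value (completeness) and never overlap (disjointness); the function and behavior names are chosen only for this example.

```c
/*@ assigns \nothing;
  @ behavior positive:
  @   assumes x > 0;
  @   ensures \result == 1;
  @ behavior zero:
  @   assumes x == 0;
  @   ensures \result == 0;
  @ behavior negative:
  @   assumes x < 0;
  @   ensures \result == -1;
  @ complete behaviors positive, zero, negative;
  @ disjoint behaviors positive, zero, negative;
  @*/
int sign_of(int x) {
    /* Each comparison yields 0 or 1, so the difference is -1, 0 or 1. */
    return (x > 0) - (x < 0);
}
```

Dropping the behavior zero would make the complete clause fail, since x == 0 would satisfy none of the remaining assumptions.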
• t1 ..t2 denotes the set of integers between t1 and t2 , included. If t1 > t2 , this is the same as \empty
• {s | b; P } denotes set comprehension, that is the union of the sets denoted by s for each value b of
binders satisfying predicate P (binders b are bound in both s and P ).
Note that assigns \nothing is equivalent to assigns \empty; it is left for convenience.
It is annotated with three equivalent assigns clauses, each one specifying that only the set of cells
{t[0], . . . , t[n − 1]} is modified.
Example 2.14 The following function increments each value stored in a linked list.
struct list {
int hd;
struct list *next;
};
2.4.1 Assertions
The syntax of assertions is given in Figure 2.7, as an extension of the grammar of C statements.
• assert p means that p must hold in the current state (the sequence point where the assertion
occurs).
• The variant for id1 , . . . , idk : assert p associates the assertion to the named behaviors idi ,
each of them being a behavior identifier for the current function (or a behavior of an enclosing
block as defined later in Section 2.4.4). It means that this assertion must hold only for the consid-
ered behaviors.
Loop invariants
The semantics of loop invariants is defined as follows: a loop annotation of the form
• The predicate I holds before entering the loop (in the case of a for loop, this means right after
the initialization expression).
• The predicate I is an inductive invariant, that is if I is assumed true in some state where the
condition c is also true, and if execution of the loop body in that state ends normally at the end of
the body or with a continue statement, I is true in the resulting state. If the loop condition has
side effects, these are included in the loop body in a suitable way:
• At any loop iteration, any location that was allocated before entering the loop, and is not a member
of L (interpreted in the current state), must remain allocated and have the same value as before
entering the loop. In fact, the loop assigns clause specifies an inductive invariant for
the locations which are not members of L.
Remarks
• The \old construct is not allowed in loop annotations. The \at form should be used to refer to
another state (see Section 2.4.3).
• When a loop exits with break, return or goto, it is not required that the loop invariant
holds. In such cases, locations that are not members of L can be deallocated or assigned
between the end of the previous iteration and the exit statement.
Example 2.15 Here is a continuation of example 2.11. Note the use of a loop invariant associated to a
function behavior.
Loop variants
Optionally, a loop annotation may include a loop variant of the form
/*@ loop variant m;
@*/
...
where m is a term of type integer.
The semantics is as follows: for each loop iteration that terminates normally or with continue,
the value of m at the end of the iteration must be smaller than its value at the beginning of the iteration.
Moreover, its value at the beginning of the iteration must be nonnegative. Note that the value of m at
loop exit might be negative; this does not compromise termination of the loop. Here is an example:
Example 2.16
void f(int x) {
  //@ loop variant x;
  while (x >= 0) {
    x -= 2;
  }
}
It is also possible to specify termination orderings other than the usual order on integers, using the
additional for modifier. This is explained in Section 2.5.
[Figure: control-flow graph of the loop. Nodes: i ← 0, do, inv, m ← t[i], i ← i + 1. Edge labels: m < t[i], m ≥ t[i], i < n, i ≥ n.]
The invariant is inductively preserved by the two paths that go from node “inv” to itself.
while (y > 0) {
  x++;
  //@ invariant 0 < x < 11;
  y--;
  //@ invariant 0 ≤ y < 10;
}
since 0 <= y < 10 is not a consequence of hypothesis 0 < x < 11 after executing y--; and
0 < x < 11 cannot be deduced from 0 <= y < 10 after looping back through the condition y>0
and executing x++. Correct invariants could be:
while (y > 0) {
  x++;
  //@ invariant 0 < x < 11 ∧ x+y ≡ 11;
  y--;
  //@ invariant 0 ≤ y < 10 ∧ x+y ≡ 10;
}
• The label Here is visible in all statement annotations, where it refers to the state where the
annotation appears, and in all contracts, where it refers to the pre-state for the requires,
assumes, variant, terminates, ... clauses and to the post-state for other clauses. It is
also visible in data invariants, presented in Section 2.11.
• The label Old is visible in assigns and ensures clauses of all contracts (both for functions
and for statement contracts described below in Section 2.4.4), and refers to the pre-state of this
contract.
• The label Pre is visible in all statement annotations, and refers to the pre-state of the function it
occurs in.
Note that no logic label is visible in global logic declarations such as lemmas, axioms, or definitions of
predicates or logic functions. When such an annotation needs to refer to a given memory state, it has to
be given a label binder: this is described in Section 2.6.9.
Example 2.20 The code below implements the famous extended Euclid’s algorithm for computing the
greatest common divisor of two integers x and y, while computing at the same time two Bézout coeffi-
cients p and q such that p × x + q × y = gcd(x, y). The loop invariant for the Bézout property needs to
refer to the value of x and y in the pre-state of the function.
/*@ requires x ≥ 0 ∧ y ≥ 0;
@ behavior bezoutProperty:
@ ensures (*p)*x+(*q)*y ≡ \result;
@*/
int extended_Euclid(int x, int y, int *p, int *q) {
int a = 1, b = 0, c = 0, d = 1;
/*@ loop invariant x ≥ 0 ∧ y ≥ 0 ;
@ for bezoutProperty: loop invariant
@ a*\at(x,Pre)+b*\at(y,Pre) ≡ x ∧
@ c*\at(x,Pre)+d*\at(y,Pre) ≡ y ;
@ loop variant y;
@*/
while (y > 0) {
int r = x % y;
int q = x / y;
int ta = a, tb = b;
x = y; y = r;
a = c; b = d;
c = ta - c * q; d = tb - d * q;
}
*p = a; *q = b;
return x;
}
Example 2.21 Here is a toy example illustrating tricky issues with \at and labels:
int i;
int t[10];
Example 2.22 Here is an example which illustrates each of these special clauses for statement contracts.
int f(int x) {
while (x > 0) {
2.5 Termination
The property of termination concerns both loops and recursive function calls. Termination is guaranteed
by attaching a measure function to each loop and each recursive function. By default, a measure is an
integer expression, and measures are compared using the usual ordering over integers (Section 2.5.1).
It is also possible to define measures into other domains and/or using a different ordering relation (Sec-
tion 2.5.2).
In other words, the measure must be a decreasing sequence of integers which remain nonnegative, except
possibly for the last value of the sequence (See example 2.16).
Example 2.23 In Example 2.15, a loop variant u-l decreases at each iteration, and remains nonnega-
tive, except at the last iteration where it may become negative.
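As an illustration of such a measure, here is a minimal sketch of a binary search whose loop variant is u - l. The function name bsearch_int and its contract are assumptions made for this illustration, not the text of Example 2.15. Each iteration either increases l or decreases u, so u - l strictly decreases, and it can reach -1 only when the loop exits.

```c
/*@ requires n >= 0 && \valid(t + (0..n-1));
  @ requires \forall integer i, j; 0 <= i <= j < n ==> t[i] <= t[j];
  @ ensures -1 <= \result < n;
  @*/
int bsearch_int(const int *t, int n, int v) {
  int l = 0, u = n - 1;
  /*@ loop invariant 0 <= l && u <= n - 1;
    @ loop variant u - l;
    @*/
  while (l <= u) {
    int m = l + (u - l) / 2;      /* midpoint, l <= m <= u */
    if (t[m] < v) l = m + 1;      /* u - l decreases */
    else if (t[m] > v) u = m - 1; /* u - l decreases */
    else return m;                /* found */
  }
  return -1;                      /* here u - l may be -1 */
}
```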
Example 2.24 The following example illustrates a variant annotation using a pair of integers, ordered
lexicographically.
//@ ensures \result ≥ 0;
int dummy();
//@ requires x ≥ 0 ∧ y ≥ 0;
void f(int x,int y) {
  /*@ loop invariant x ≥ 0 ∧ y ≥ 0;
    @ loop variant (x,y) for lexico;
    @*/
  while (x > 0 ∧ y > 0) {
    if (dummy()) {
      x−−; y = dummy();
    }
    else y−−;
  }
}
Example 2.25 Here are the classical factorial and Fibonacci functions:
//@ decreases n;
int fib(int n) {
  if (n ≤ 1) return 1;
  return fib(n−1) + fib(n−2);
}
//@ decreases x;
int odd(int x) {
  if (x ≡ 0) return 0;
  return even(x−1);
}
Example 2.27 A concrete example of a function that may not always terminate is the incr_list
function of example 2.14. In fact, another acceptable contract for this function is the following one:
struct list { int hd; struct list* next; };
// We give an axiomatic definition of the reachability predicate
// this time, the specification accepts circular lists, but does not ensure
// that the function terminates on them (as a matter of fact, it does not).
/*@ assigns { q−>hd | struct list *q ; reachable(p,q) } ;
  @ terminates reachable(p,\null);
  @*/
void incr_list(struct list *p) {
while (p) { p−>hd++ ; p = p−>next; }
}
illustrates the definition of a new predicate is_positive with an integer parameter, and a new
logic function sign with a real parameter returning an integer.
2.6.2 Lemmas
Lemmas are user-given propositions, a facility that might help theorem provers to establish
validity of ACSL specifications.
Of course, a complete verification of an ACSL specification has to provide a proof for each lemma.
On the other hand, it is
Example 2.30 The following introduces a predicate is_gcd(x, y, d) meaning that d is the greatest
common divisor of x and y.
/*@ inductive is_gcd(integer a, integer b, integer d) {
  @ case gcd_zero:
  @   ∀ integer n; is_gcd(n,0,n);
  @ case gcd_succ:
  @   ∀ integer a,b,d; is_gcd(b, a % b, d) =⇒ is_gcd(a,b,d);
  @ }
  @*/
Example 2.31 The following axiomatization introduces a theory of finite lists of integers à la LISP.
As with inductive definitions, there is no syntactic condition that would guarantee that axiomatic
definitions are consistent. It is usually up to the user to ensure that the introduction of axioms does
not lead to a logical inconsistency.
One can then consider for instance lists of integers (list <integer>), lists of pointers (e.g. list
<char*>), lists of lists of reals (list <list <real> >¹), etc.
The grammar of Figure 2.11 contains rules for declaring polymorphic types and using polymorphic
type expressions.
defines a logic function which returns the maximal index i between 0 and n − 1 such that t[i] = 0.
Notice that there is no syntactic condition on such recursive definitions, such as a limitation to primitive
recursion as, e.g., in Coq [7]. In essence, a recursive definition of the form f(args) = e; where f occurs
in expression e is just a shortcut for an axiomatic declaration of f with an axiom
\forall args; f(args) = e. In other words, recursive definitions are not guaranteed to be
consistent, in the same way that axiomatics may introduce inconsistency. Of course, tools might
provide a way to check consistency.
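The correspondence between a recursive definition and its axiomatic reading can be sketched as follows. The names fact, fact_ax, FactAx and factorial are hypothetical and chosen for this illustration only; the bound on n is an assumption that keeps the C result within a 32-bit int.

```c
/* Recursive logic definition. */
/*@ logic integer fact(integer n) = (n <= 0) ? 1 : n * fact(n - 1); */

/* The same function read as an axiomatic declaration plus one axiom. */
/*@ axiomatic FactAx {
  @   logic integer fact_ax(integer n);
  @   axiom fact_ax_def:
  @     \forall integer n;
  @       fact_ax(n) == ((n <= 0) ? 1 : n * fact_ax(n - 1));
  @ }
  @*/

/*@ requires 0 <= n <= 12;
  @ ensures \result == fact(n);
  @*/
int factorial(int n) {
  int r = 1;
  /*@ loop invariant 1 <= i <= n + 1 && r == fact(i - 1);
    @ loop variant n - i;
    @*/
  for (int i = 1; i <= n; i++) r *= i;  /* r == fact(i) after each step */
  return r;
}
```

Nothing in either form prevents a user from writing an inconsistent definition such as f(n) = f(n) + 1, which is why consistency remains the responsibility of the user or of the tool.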
Abstraction The term \lambda τ1 x1 , . . . , τn xn ; t denotes the n-ary logic function which maps
x1 , . . . , xn to t. It has the same precedence as \forall and \exists.
¹ In this latter case (list <list <real> >), note that the two ’>’ must be separated by a space, to avoid confusion with the shift operator.
Extended quantifiers Terms \quant(t1 , t2 , t3 ) where quant is max, min, sum, product, or numof
are extended quantifications. t1 and t2 must have type integer, and t3 must be a unary function
with an integer argument and a numeric value (integer or real), except for \numof, for which it
should have a boolean value. Their meanings are given as follows:
If i > j then \sum and \numof above are 0, \product is 1, and \max and \min are unspeci-
fied (see Section 2.2.2).
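The conventions for empty ranges can be mirrored by ordinary C loops. The following is a minimal sketch; the function names sum_range, product_range and count_zeros are hypothetical, and arithmetic overflow is ignored. When lo > hi the loop bodies are never executed, so the functions return 0, 1 and 0 respectively, matching \sum, \product and \numof.

```c
/*@ requires lo > hi || \valid(t + (lo..hi));
  @ ensures \result == \sum(lo, hi, \lambda integer k; t[k]);
  @*/
long sum_range(const int *t, int lo, int hi) {
  long s = 0;                         /* value of \sum on an empty range */
  for (int k = lo; k <= hi; k++) s += t[k];
  return s;
}

/*@ requires lo > hi || \valid(t + (lo..hi));
  @ ensures \result == \product(lo, hi, \lambda integer k; t[k]);
  @*/
long product_range(const int *t, int lo, int hi) {
  long p = 1;                         /* value of \product on an empty range */
  for (int k = lo; k <= hi; k++) p *= t[k];
  return p;
}

/*@ requires lo > hi || \valid(t + (lo..hi));
  @ ensures \result == \numof(lo, hi, \lambda integer k; t[k] == 0);
  @*/
int count_zeros(const int *t, int lo, int hi) {
  int c = 0;                          /* value of \numof on an empty range */
  for (int k = lo; k <= hi; k++)
    if (t[k] == 0) c++;
  return c;
}
```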
construct to refer to a given label. However, to ease reading of such logic expressions, it is allowed to
omit a label whenever there is only one label in the context.
Example 2.36 The following annotations declare a function which returns the number of occurrences
of a given double in an array of doubles between the given indexes, together with the related axioms.
It should be noted that without the labels, this axiomatization would be inconsistent, since the
function would not depend on the values stored in t, hence the two last axioms would say both that
a = b + 1 and a = b for some a and b.
/*@ axiomatic NbOcc {
  @   // nb_occ(t,i,j,e) gives the number of occurrences of e in t[i..j]
  @   // (in a given memory state labelled L)
  @   logic integer nb_occ{L}(double t[], integer i, integer j,
  @                           double e);
  @   axiom nb_occ_empty{L} :
  @     ∀ double t[], e, integer i, j;
  @       i > j =⇒ nb_occ(t,i,j,e) ≡ 0;
  @   // without L, term nb_occ(t,i,j,e) would be rejected
  @   axiom nb_occ_true{L} :
  @     ∀ double t[], e, integer i, j;
  @       i ≤ j ∧ t[j] ≡ e =⇒
  @         nb_occ(t,i,j,e) ≡ nb_occ(t,i,j−1,e) + 1;
  @   // without L, term t[j] would be rejected, here it is \at(t[j],L)
  @   axiom nb_occ_false{L} :
  @     ∀ double t[], e, integer i, j;
  @       i ≤ j ∧ t[j] ≢ e =⇒
  @         nb_occ(t,i,j,e) ≡ nb_occ(t,i,j−1,e);
  @ }
  @*/
Example 2.37 This second example defines a predicate which indicates whether two arrays of the same
size are a permutation of each other. It illustrates the use of more than a single label. Thus, the \at
operator is mandatory here. Indeed the two arrays may come from two distinct memory states. Typically,
one of the postconditions of a sorting function would be permut{Pre,Post}(t,t).
@ }
@*/
Module components are then accessible using a qualified notation like List::length.
Predefined algebraic specifications can be provided as libraries (see section 3), and imported using a
construct like
where the file list.acsl contains logic definitions, like the List module above.
\base_addr : ‘a * → char*
\block_length : ‘a * → size_t
• \valid applies to a set of terms (see Section 2.3.4) of some pointer type. \valid(s) holds if
and only if dereferencing any p ∈ s is safe. In particular, \valid(\empty) holds.
• \null is an extra notation for the null pointer (i.e. a shortcut for (void*)0). Note that as in
C itself (see [12], par. 6.3.2.3), the constant 0 can have any pointer type.
\offset : ‘a * → size_t
\offset(p) = (char*)p − \base_addr(p)
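The defining equation of \offset is plain byte-pointer arithmetic, which can be sketched in C. The helper byte_offset and the demonstration function offset_demo are hypothetical names; unlike \base_addr, the C code has to be told the base address explicitly.

```c
#include <stddef.h>

/* Byte distance from the start of a block to a pointer inside it,
   i.e. (char*)p - base, mirroring \offset(p) = (char*)p - \base_addr(p). */
ptrdiff_t byte_offset(const void *base, const void *p) {
  return (const char *)p - (const char *)base;
}

/* Usage: the third element of an int array lies 2 * sizeof(int) bytes in. */
size_t offset_demo(void) {
  int block[4] = {0, 0, 0, 0};
  //@ assert \base_addr(&block[2]) == (char*)block;
  //@ assert \offset(&block[2]) == 2 * sizeof(int);
  return (size_t)byte_offset(block, &block[2]);
}
```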
the following property holds: for any set of pointers s, \valid(s) if and only if for all p ∈ s:
2.7.2 Separation
EXPERIMENTAL
\separated(loc_1 , .., loc_n ) means that for each i ≠ j, the intersection of loc_i and loc_j is empty.
Each loc_i is a set of terms as defined in Section 2.3.4.
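A typical use of \separated is to state that a source and a destination buffer do not overlap. The following is a minimal sketch under that assumption; copy_ints and copy_demo are hypothetical names.

```c
/*@ requires \valid(dst + (0..n-1)) && \valid(src + (0..n-1));
  @ requires \separated(dst + (0..n-1), src + (0..n-1));
  @ assigns dst[0..n-1];
  @ ensures \forall integer k; 0 <= k < n ==> dst[k] == \old(src[k]);
  @*/
void copy_ints(int *dst, const int *src, unsigned int n) {
  /* Because the two ranges are disjoint, writing dst[k] never changes src. */
  for (unsigned int k = 0; k < n; k++) dst[k] = src[k];
}

/* Usage: two distinct local arrays are separated. */
int copy_demo(void) {
  int a[3] = {1, 2, 3};
  int b[3] = {0, 0, 0};
  copy_ints(b, a, 3);
  return b[0] + b[1] + b[2];
}
```

Without the \separated precondition, a call such as copy_ints(a + 1, a, 2) would satisfy the validity requirements but not the postcondition.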
• \freed(p), indicates that p was allocated in the pre-state but that it is not the case in the
post-state.
Example 2.38 Here is an example where we defined the footprint of a structure, that is the set of
locations that can be accessed from an object of this type.
struct S {
char *x;
int *y;
};
Notice that in the first definition, since the union is made of a set<char*> and a set<int*>, the
result is a set<char*> (according to the typing of union). In other words, the two definitions above
are equivalent.
This logic function can be used as an argument of \separated or in an assigns clause.
The following built-in predicates deal with allocation and deallocation of memory blocks. They can be used in a postcondition.
• \fresh(p) indicates that p was not allocated in the pre-state.
• \freed(p) indicates that p was allocated in the pre-state but that it is not the case in the post-state.
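These predicates can be sketched on a pair of allocation and deallocation wrappers. The names new_counter, delete_counter and counter_roundtrip are hypothetical, and the contracts use the one-argument forms described above; other versions of the language or of the tools may expect different forms.

```c
#include <stdlib.h>

/*@ ensures \result == \null || \fresh(\result); */
int *new_counter(void) {
  int *p = malloc(sizeof *p);   /* block did not exist in the pre-state */
  if (p != NULL) *p = 0;
  return p;
}

/*@ requires \valid(p);
  @ ensures \freed(p);
  @*/
void delete_counter(int *p) {
  free(p);                      /* block no longer allocated in the post-state */
}

/* Usage: allocate, update, read, release. Returns -1 if allocation fails. */
int counter_roundtrip(void) {
  int *c = new_counter();
  if (c == NULL) return -1;
  *c += 1;
  int v = *c;
  delete_counter(c);
  return v;
}
```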
In an ensures-clause, \result is bound to the return code, e.g. the value returned by main or the
argument passed to exit.
Example 2.39 The following example is a variation over the array_sum function in example 2.34, in
which the values of the array are added to a global variable total.
return;
}
Example 2.40 The composite element modifier operators are useful additional constructs for such func-
tional expressions.
• global invariants and type invariants: the former only apply to specified global variables, whereas
the latter are associated to a static type, and apply to any variables of the corresponding type;
• strong invariants and weak invariants: strong invariants must be valid at any time during program
execution (more precisely at any sequence point as defined in the C standard), whereas weak
invariants must be valid at function boundaries (function entrance and exit) but can be violated in
between.
The syntax for declaring data invariants is given in Figure 2.18. The strength modifier defaults to
weak.
1. a weak global invariant a_is_positive which specifies that global variable a should remain
positive (weakly, so this property might be violated temporarily between functions calls);
int a;
//@ global invariant a_is_positive: a ≥ 0 ;
struct S {
int f;
};
//@ type invariant S_f_is_positive(struct S s) = s.f ≥ 0 ;
2.11.1 Semantics
The distinction between strong and weak invariants has to do with the sequence points where the property
is supposed to hold. The distinction between global and type invariants has to do with the set of values
on which they are supposed to hold.
• Weak global invariants are properties which apply to global data and hold at any function entrance
and function exit.
• Strong global invariants are properties which apply to global data and hold at any step during
execution (starting after initialization of these data).
• A weak type invariant on type τ must hold at any function entrance and exit, and applies to any
global variable or formal parameter which has static type τ . If the result of the function is of type
τ , the result must also satisfy its weak invariant at function exit. Notice that it says nothing of
fields, array elements, memory locations, etc. of type τ .
• A strong type invariant on type τ must hold at any step during execution, and applies to any global
variable, local variable, or formal parameter which has static type τ . If the result of the function is
of type τ , the result must also satisfy its strong invariant at function exit. Again, it says nothing of
fields, array elements, memory locations, etc. of type τ .
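The difference between the two strengths can be sketched on a global variable. The names balance, balance_nonneg and deposit_minus_fee are hypothetical. The invariant below is weak, so it may be broken inside the function body provided it is restored at function exit; declared as strong, the same code would violate it after the first assignment whenever fee exceeds the current balance.

```c
int balance = 0;
//@ weak global invariant balance_nonneg: balance >= 0;

/*@ requires 0 <= fee <= amount;
  @ assigns balance;
  @ ensures balance == \old(balance) + amount - fee;
  @*/
int deposit_minus_fee(int amount, int fee) {
  balance -= fee;     /* balance may be negative here: tolerated by a weak invariant */
  balance += amount;  /* amount >= fee, so balance >= 0 holds again at exit */
  return balance;
}
```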
Example 2.42 The following example illustrates the use of a data invariant on a local static variable.
void out_char(char c) {
col++;
if (col ≥ 80) col = 0;
Example 2.43 Here is a longer example, the famous Dijkstra’s Dutch flag algorithm.
@ permut{Pre,Here}(t,t,f.n−1);
@ loop assigns b,i,r,t[0 .. f.n−1];
@ loop variant r − i;
@*/
while (i < r) {
switch (t[i]) {
case BLUE:
swap(t, b++, i++);
break;
case WHITE:
i++;
break;
case RED:
swap(t, −−r, i);
break;
}
}
}
• Model variables can only appear in specifications. They are not lvalues, thus they cannot be
assigned directly (unlike ghost variables, see below).
Thus, in practice, the only way to prove that a function body satisfies a contract with model variables is
to provide an invariant relating model variables and concrete variables, as in the example below.
Example 2.44 Here is an example of a specification for a function which generates fresh integers. The
contract is given in terms of a model variable which is intended to represent the set of “forbidden” values,
e.g. the values that have already been generated.
/* public interface */
int gen() {
static int x = 0;
/*@ global invariant I: ∀ integer k;
@ Set::mem(k,forbidden) =⇒ x > k;
@*/
return x++;
}
Remarks Although the syntax of model variables is close to that of JML model variables, they differ in
that the type of a model variable is a logic type, not a C type. Also, the semantics above is closer
to the one of B machines [1]. It should be noted that program verification with model variables does
not have a well-established theoretical background [19, 17], so we deliberately do not provide a precise
semantics in this document.
• Comments must be introduced by // and extend until the end of the line (the ghost code itself is
placed inside a C comment. /* ... */ would thus lead to incorrect C code).
• It is however possible to write multi-line annotations for ghost code. These annotations are en-
closed between /*@ and */. As in normal annotations, @s at the beginning of a line and at the end
of the comment (before the final */) are considered as blank.
• A non-ghost function can take ghost parameters. If such a ghost clause is present in the declarator,
then the list of ghost parameters must be non-empty and fixed (no vararg ghost). The call to the
function must then provide the appropriate number of ghost parameters.
• Any non-ghost if-statement which does not have a non-ghost else clause can be augmented with
a ghost one. Similarly, a non-ghost switch can have a ghost default: clause if it does not
have a non-ghost one (there are however semantic restrictions for valid ghost labelled statements
in a switch, see next paragraph for details).
Semantics of Ghost Code The question of semantics is essential for ghost code. Informally, the se-
mantics requires that ghost statements do not change the regular program execution. This implies several
conditions, including e.g.:
• If p is a ghost pointer pointing to a non-ghost memory location, then it is forbidden to assign ∗p.
• The body of a ghost function is ghost code, hence it must not modify non-ghost variables or fields.
• If a non-ghost C function is called in ghost code, it must not modify non-ghost variables or fields.
• If a structure has ghost fields, the sizeof of the structure is the same as that of the structure without
ghost fields. Also, alignment of fields remains unchanged.
• The control-flow graph of a function must not be altered by ghost statements. In particular, no
ghost return can appear in the body of a non-ghost function. Similarly, ghost goto, break,
and continue cannot jump outside of the innermost non-ghost enclosing block.
Semantics is specified as follows. First, one has to think that program execution with ghost code
involves a ghost memory heap and a ghost stack, disjoint from the regular heap and stack. Ghost variables
lie in the ghost heap, as do the ghost fields of structures. Thus, every memory side-effect can be classified
as ghost or non-ghost. Then, the semantics is that memory side-effects of ghost code must always be in
the ghost heap or the ghost stack.
Notice that this semantics is not statically decidable. It is left to tools to provide approximations,
correct in the sense that any code statically detected as ghost must be semantically ghost.
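The rule that ghost code only writes ghost memory can be sketched as follows. The function sum_first and the ghost variable iterations are hypothetical names; the commented-out line shows a statement that a tool would be expected to reject.

```c
/*@ requires n >= 0 && \valid(t + (0..n-1)); */
int sum_first(const int *t, int n) {
  int s = 0;
  //@ ghost int iterations = 0;
  /*@ loop invariant 0 <= i <= n && iterations == i;
    @ loop variant n - i;
    @*/
  for (int i = 0; i < n; i++) {
    s += t[i];
    //@ ghost iterations++;   // accepted: the side-effect is on ghost memory only
    // A ghost statement such as "s = 0;" would be invalid here,
    // since it writes the non-ghost variable s.
  }
  //@ assert iterations == n;
  return s;
}
```

Since all ghost statements live in comments, a regular C compiler sees exactly the same program with or without them.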
Example 2.45 The following example shows some invalid assignments of ghost pointers:
Example 2.46 The following example shows some invalid ghost statements:
int f (int x, int y) {
//@ ghost int z = x + y;
switch (x) {
case 0: return y;
//@ ghost case 1: z=y;
// above statement is correct.
int g(int x) {
//@ ghost int z = x;
if (x > 0) { return x; }
//@ ghost else { z++; return x; }
// invalid, would bypass the non−ghost return
return x+1;
}
Differences between model variables and ghost variables A ghost variable is an additional specifi-
cation variable which is assigned in ghost code like any C variable. On the other hand, a model variable
cannot be assigned, but one can state it is modified and can express properties about the new value, in a
non-deterministic way, using logic assertions and invariants. In other words, one can say that specifica-
tions using ghost variable modifications are executable.
Example 2.47 The example 2.44 can also be specified with a ghost variable instead of a model variable:
Specifying properties of a volatile variable may be done via a specific construct to attach two ghost
functions to it. This construct, described by the grammar of Figure 2.21, has the following shape:
volatile τ x;
//@ volatile x reads f writes g;
τ f(volatile τ * p);
τ g(volatile τ * p, τ v);
This must be understood as a special construct to instrument the C code, where each access to the variable
x is replaced by a call to f(&x), and each assignment to x of a value v is replaced by g(&x,v).
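The effect of this instrumentation can be sketched by performing the replacement by hand. All names below (reads_x, writes_x, injected, collected, instrumented_demo) are hypothetical. The original, non-instrumented code is assumed to be int y = x; x = y + 1; return x;.

```c
volatile int x;
//@ volatile x reads reads_x writes writes_x;

static const int injected[3] = {10, 20, 30}; /* values delivered by successive reads */
static int injected_count = 0;
int collected[8];                            /* values written to x, in order */
int collected_count = 0;

/* Called in place of each read of x. */
int reads_x(volatile int *p) {
  (void)p;
  int v = injected[injected_count % 3];
  injected_count++;
  return v;
}

/* Called in place of each assignment x = v. */
int writes_x(volatile int *p, int v) {
  (void)p;
  if (collected_count < 8) collected[collected_count++] = v;
  return v;
}

/* Hand-instrumented version of: int y = x; x = y + 1; return x; */
int instrumented_demo(void) {
  int y = reads_x(&x);   /* first read yields 10 */
  writes_x(&x, y + 1);   /* records 11 */
  return reads_x(&x);    /* second read yields 20, not 11 */
}
```

The second read does not return the value just written, which is precisely the behavior of a volatile location that the two attached functions are meant to model.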
Example 2.48 The following code is instrumented in order to inject fixed values at each read of variable
x, and collect written values.
volatile int x;
}
return sum;
}
int f(int n) {
int x;
int* f() {
int a;
return &a;
}
int* g() {
int* p = f();
//@ assert \specified(p);
return p+1;
}
Libraries
Disclaimer: this chapter is unfinished, it is left here to give an idea of what it will look like in the final
document.
This chapter is devoted to libraries of specification, built upon the ACSL specification language.
Section 3.2 describes additional predicates introduced by the Jessie plugin of Frama-C, to propose a
slightly higher level of annotation.
• abs, exp, power, log, sin, cos, atan, etc. over reals
• isFinite predicate over floats and doubles (means not NaN nor infinity)
\offset_min(p + i) = \offset_min(p) − i
\offset_max(p + i) = \offset_max(p) − i
and the ACSL built-in predicate \valid(p) is now equivalent to \valid_range(p, 0, 0).
3.2.2 Strings
EXPERIMENTAL
The logic function
integer \strlen(char ∗ p)
denotes the length of a 0-terminated C string. It is a total function, whose value is non-negative if and only
if the pointer given as argument really points to a string.
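\strlen is a logic function and cannot be called from C code. Its totality can nevertheless be imitated by an executable analogue that searches a bounded buffer; this is only an analogy, and the function bounded_strlen and its parameter cap are hypothetical. The result is non-negative exactly when a terminating 0 is found within the first cap bytes.

```c
#include <stddef.h>

/*@ requires \valid(p + (0..cap-1));
  @ ensures -1 <= \result < cap;
  @*/
long bounded_strlen(const char *p, size_t cap) {
  for (size_t k = 0; k < cap; k++)
    if (p[k] == '\0') return (long)k;  /* a real string: length is k */
  return -1;                           /* no terminator within cap bytes */
}
```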
Conclusion
This document presents a Behavioral Interface Specification Language for ANSI C source code. It
provides a common basis that could be shared among several tools. The specification language described
here is intended to evolve in the future, and remain open to additional constructions. One interesting
possible extension regards “temporal” properties in a large sense, such as liveness properties, which can
sometimes be simulated by regular specifications with ghost variables [11], or properties on evolution of
data over the time, such as the history constraints of JML, or in the Lustre assertion language.
Appendices
A.1 Glossary
pure expressions In the ACSL setting, a pure expression is a C expression which contains no assignments,
no increment or decrement operator (++ or --), no function call, and no access to a volatile object. The set
of pure expressions is a subset of the set of C expressions without side effect (C standard [13, 12],
§5.1.2.3, paragraph 2).
left-values A left-value (lvalue for short) is an expression which denotes some place in the memory
during program execution, either on the stack, on the heap, or in the static data segment. It can
be either a variable identifier or an expression of the form ∗e, e[e], e.id or e->id, where e is any
expression and id a field name. See C standard, §6.3.2.1 for a more detailed description of lvalues.
A modifiable lvalue is an lvalue allowed in the left part of an assignment. In essence, all lvalues
are modifiable except variables declared as const or of some array type with explicit length.
pre-state and post-state For a given function call, the pre-state denotes the program state at the begin-
ning of the call, including the current values for the function parameters. The post-state denotes
the program state at the return of the call.
function behavior A function behavior (behavior for short) is a set of properties relating the pre-state
and the post-state for a possibly restricted set of pre-states (behavior assumptions).
function contract A function contract (contract for short) forms a specification of a function, consisting
of the combination of a precondition (a requirement on the pre-state for any caller to that function),
a collection of behaviors, and possibly a measure in case of a recursive function.
• ACSL is a BISL for C, a low-level structured language, while JML is a BISL for Java, an object-
oriented inheritance-based high-level language. Not only do the language features differ, but
the programming styles and idioms are also very different, which entails different ways of
specifying behaviors. In particular, C has neither inheritance nor exceptions, and no language support
for the simplest properties on memory (e.g., the size of an allocated memory block).
• JML relies on runtime assertion checking (RAC) when typing, static analysis and automatic de-
ductive verification fail. The example of CCured [21, 5], which adds strong typing to C by also relying on
RAC, shows that it is not possible to do so in a modular way. Indeed, it is necessary to modify
the layout of C data structures for RAC, which is not modular. The follow-up project Deputy [6]
thus reduces the checking power of annotations in order to preserve modularity. On the contrary,
we choose not to restrain the power of annotations (e.g., all first order logic formulas are allowed).
To that end, we rely on manual deductive verification using an interactive theorem prover (e.g.,
Coq) when every other technique has failed.
JML has a core notion of inheritance of specifications, that duplicates in specifications the inheritance
feature of Java. Inheritance combined with visibility and modularity account for a number of complex
features in JML (e.g, spec_public modifier, data groups, represents clauses, etc), that are necessary
to express the desired inheritance-related specifications while respecting visibility and modularity. Since
C has no inheritance, these intricacies are avoided in ACSL.
The usual way of signaling errors in Java is through exceptions. Therefore, JML specifications are
tailored to express exceptional postconditions, depending on the exception raised. Since C has no excep-
tions, ACSL does not use exceptional specifications. Instead, C programmers usually signal errors
by returning special values, as mandated in various ways by the C standard.
Example A.1 In §7.12.1 of the standard, it is said that functions in <math.h> signal errors as follows:
“On a domain error, [...] the integer expression errno acquires the value EDOM.”
Example A.2 In §7.19.5.1 of the standard, it is said that function fclose signals errors as follows: “The
fclose function returns [...] EOF if any errors were detected.”
Example A.3 In §7.19.6.1 of the standard, it is said that function fprintf signals errors as follows: “The
fprintf function returns [...] a negative value if an output or encoding error occurred.”
Example A.4 In §7.20.3 of the standard, it is said that memory management functions signal errors as
follows: “If the space cannot be allocated, a null pointer is returned.”
As shown by these few examples, there is no unique way to signal errors in the C standard library,
not to mention user-defined functions. But since errors are signaled by returning special values, it is
sufficient to write an appropriate postcondition:
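As a minimal sketch of such a postcondition, consider a user-defined function that signals an error by returning -1. The function parse_digit is a hypothetical example and is not taken from the C standard library.

```c
/*@ ensures \result == -1 || (0 <= \result <= 9);
  @ ensures ('0' <= c <= '9') ==> \result == c - '0';
  @ ensures !('0' <= c <= '9') ==> \result == -1;
  @*/
int parse_digit(char c) {
  if (c < '0' || c > '9') return -1;  /* error: special return value */
  return c - '0';                     /* normal case */
}
```

The error case is an ordinary disjunct of the postcondition; no dedicated exceptional clause is needed.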
In Java, the precondition of the following function that nullifies an array of characters is always true.
Even if there was a precondition on the length of array a, it could easily be expressed using the Java
expression a.length that gives the dynamic length of array a.
On the contrary, the precondition of the same function in C, whose definition follows, is more in-
volved. First, remark that the C programmer has to add an extra argument for the size of the array, or
rather a lower bound on this array size.
where predicate \valid is the one defined in Section 2.7.1 (note that \valid(a + 0..(−1)) is
the same as \valid(\empty) and thus is true regardless of the validity of a itself). When n is zero, a
does not need to be valid at all, and when n is strictly positive, a must point to an array of size at least
n. To make it more obvious, the C programmer adopted a defensive programming style, which returns
immediately when n is zero. We can duplicate this in the specification:
Usually, many memory requirements are only necessary for some paths through the function, which
correspond to some particular behaviors, selected according to some tests performed along the corre-
sponding paths. Since C has no memory primitives, these tests involve other variables that the C pro-
grammer added to track additional information, like n in our example.
To make it easier, it is possible in ACSL to distinguish between the assumes part of a behavior, that
specifies the tests that need to succeed for this behavior to apply, and the requires part that specifies
the additional preconditions that must be true when a behavior applies. The specification for our example
can then be translated into:
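A sketch of such a specification is given below, under the assumption that the function is named nullify and takes the array a and its size n as described above; the behavior names, the assigns and ensures clauses, and the helper nullify_demo are likewise hypothetical.

```c
/*@ behavior n_is_null:
  @   assumes n == 0;
  @   assigns \nothing;
  @ behavior n_is_positive:
  @   assumes n > 0;
  @   requires \valid(a + (0..n-1));
  @   assigns a[0..n-1];
  @   ensures \forall integer k; 0 <= k < n ==> a[k] == 0;
  @*/
void nullify(char *a, unsigned int n) {
  unsigned int i;
  if (n == 0) return;                /* defensive style: a is never touched */
  for (i = 0; i < n; i++) a[i] = 0;
}

/* Usage: returns 0 when every cell has been cleared. */
int nullify_demo(void) {
  char buf[3] = {1, 2, 3};
  nullify(buf, 3);
  nullify((char *)0, 0);             /* allowed: behavior n_is_null applies */
  return buf[0] | buf[1] | buf[2];
}
```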
This is equivalent to the previous requirement, except here behaviors can be completed with post-
conditions that belong to one behavior only. Contrary to JML, the set of behaviors for a function does not
necessarily cover all cases of use for this function, as mentioned in Section 2.3.3. This allows for partial
specifications, whereas JML behaviors cannot offer such flexibility. Here, our two behaviors are clearly
mutually exclusive and, since n is an unsigned int, they cover all the possible cases. We could
have specified that as well, by adding the following lines in the contract (see Section 2.3.3).
/*@ ...
@ disjoint behaviors;
@ complete behaviors;
@*/
/*@ requires P1 ;
@ requires P2 ;
@ ensures Q1 ;
@ ensures Q2 ;
@ behavior x1 :
@ requires A1 ;
@ requires R1 ;
@ ensures E1 ;
@ behavior x2 :
@ requires A2 ;
@ requires R2 ;
@ ensures E2 ;
@*/
/*@ requires P1 ;
@ requires P2 ;
@ ensures Q1 ;
@ ensures Q2 ;
@ behavior x1 :
@ assumes A1 ;
@ requires R1 ;
@ ensures E1 ;
@ behavior x2 :
@ assumes A2 ;
@ requires R2 ;
@ ensures E2 ;
@*/
Syntactically, the only difference with the JML specification is the addition of the assumes clauses. Its
translation to assume-guarantee is however quite different. It assumes from the pre-state the condition:
Thus, ACSL allows one to distinguish between the clauses that control which behavior is active
(the assumes clauses) and the clauses that are preconditions for a particular behavior (the internal
requires clauses). In addition, as mentioned above, there is by default no requirement in ACSL for
the specification to be complete (the last part of the JML condition on the pre-state). If desired, this has
to be stated explicitly with a complete behaviors clause as seen in Section 2.3.3.
Beware that in assigns clauses, the given locations refer to the post-state, whereas in JML, they refer to the pre-state. The JML semantics
can be obtained using construct.
JML                          ACSL
modifiable, assignable       assigns
measured_by                  decreases
loop_invariant               loop invariant
decreases                    loop variant
(\forall τ x; P ; Q)         (\forall τ x; P ==> Q)
(\exists τ x; P ; Q)         (\exists τ x; P && Q)
(\max τ x; a<=x<=b; f )      \max(a, b, \lambda τ x; f )
int x;
//@ assigns x;
int g();
int f(int x) {
// ...
return g();
}
In order to write the assigns clause for f, we must access the global variable x, since
f calls g, which can modify x. This is not possible with C scoping rules, as x refers to the
parameter of f in the scope of the function.
A solution is to use a ghost pointer to x, giving rise to the following code:
int x;
//@ ghost int* const ghost_ptr_x = &x;
//@ assigns x;
int g();
//@ assigns *ghost_ptr_x;
int f(int x) {
  // ...
  return g();
}
#include <stdlib.h>
// forward reference
struct _memory_slice;
/* A memory chunk list links memory chunks in the same memory block.
* Newly allocated chunks are put first, so that the offset of chunks
* decreases when following the next pointer. Allocated chunks should
* fill the memory block up to its own next index.
*/
typedef struct _memory_chunk_list {
memory_chunk* chunk;
// current list element
struct _memory_chunk_list* next;
// tail of the list
} memory_chunk_list;
/* A memory slice holds together a memory block block and a list of chunks
* chunks on this memory block.
*/
typedef struct _memory_slice {
//@ ghost boolean packed;
// ghost field packed is meant to be used as a guard that tells when
// the invariant of a structure of type memory_slice holds
memory_block* block;
memory_chunk_list* chunks;
} memory_slice;
mb−>next += s;
mb−>used += s;
//@ ghost mb−>packed = true; // pack the block
// add the new chunk to the list
mcl = (memory_chunk_list*)malloc(sizeof(memory_chunk_list));
if (mcl ≡ 0) return 0;
mcl−>chunk = mc;
mcl−>next = ms−>chunks;
ms−>chunks = mcl;
//@ ghost ms−>packed = true; // pack the slice
return mc;
}
// iterate through memory chunks
/*@
@ loop invariant valid_memory_chunk_list(mcl,mb);
@ loop variant mcl for chunk_lower_length;
@ */
while (mcl ≢ 0) {
mc = mcl−>chunk;
// is mc free and large enough?
if (mc−>free ∧ s ≤ mc−>size) {
mc−>free = false;
mb−>used += mc−>size;
return mc;
}
// try next chunk
mcl = mcl−>next;
}
msl = msl−>next;
}
// allocate a new block
mb_size = (DEFAULT_BLOCK_SIZE < s) ? s : DEFAULT_BLOCK_SIZE;
mb_data = (char*)malloc(mb_size);
if (mb_data ≡ 0) return 0;
mb = (memory_block*)malloc(sizeof(memory_block));
if (mb ≡ 0) return 0;
mb−>size = mb_size;
mb−>next = s;
mb−>used = s;
mb−>data = mb_data;
//@ ghost mb−>packed = true; // pack the block
// allocate a new chunk
mc = (memory_chunk*)malloc(sizeof(memory_chunk));
if (mc ≡ 0) return 0;
mc−>offset = 0;
mc−>size = s;
mc−>free = false;
mc−>block = mb;
//@ ghost mc−>packed = true; // pack the chunk
// allocate a new chunk list
mcl = (memory_chunk_list*)malloc(sizeof(memory_chunk_list));
if (mcl ≡ 0) return 0;
//@ ghost mcl−>offset = 0;
mcl−>chunk = mc;
mcl−>next = 0;
// allocate a new slice
ms = (memory_slice*)malloc(sizeof(memory_slice));
if (ms ≡ 0) return 0;
ms−>block = mb;
ms−>chunks = mcl;
//@ ghost ms−>packed = true; // pack the slice
// update the block accordingly
mb−>slice = ms;
// add the new slice to the list
msl = (memory_slice_list*)malloc(sizeof(memory_slice_list));
if (msl ≡ 0) return 0;
msl−>slice = ms;
msl−>next = *arena;
//@ ghost msl−>packed = true; // pack the slice list
*arena = msl;
return mc;
}
if (msl−>slice ≡ ms) {
*arena = msl−>next;
//@ ghost msl−>packed = false; // unpack the slice list
free(msl);
}
// case it is not the first slice
while (msl ≢ 0) {
if (msl−>next ≢ 0 ∧ msl−>next−>slice ≡ ms) {
memory_slice_list* msl_next = msl−>next;
msl−>next = msl−>next−>next;
// unpack the slice list
//@ ghost msl_next−>packed = false;
free(msl_next);
break;
}
msl = msl−>next;
}
//@ ghost ms−>packed = false; // unpack the slice
// deallocate all chunks in the block
mcl = ms−>chunks;
// iterate through memory chunks
/*@
@ loop invariant valid_memory_chunk_list(mcl,mb);
@ loop variant mcl for chunk_lower_length;
@ */
while (mcl ≢ 0) {
memory_chunk_list *mcl_next = mcl−>next;
mc = mcl−>chunk;
//@ ghost mc−>packed = false; // unpack the chunk
free(mc);
free(mcl);
mcl = mcl_next;
}
mb−>next = 0;
mb−>used = 0;
// deallocate the memory block and its data
//@ ghost mb−>packed = false; // unpack the block
free(mb−>data);
free(mb);
// deallocate the corresponding slice
free(ms);
return;
}
// mark the chunk as freed
chunk−>free = true;
// update the block accordingly
mb−>used −= chunk−>size;
return;
}
A.6 Changes
A.6.1 Version 1.4
• Introduction of axiomatic to gather predicates, logic functions, and their defining ax-
ioms.
[1] Jean-Raymond Abrial. The B-Book: Assigning Programs to Meanings. Cambridge University
Press, 1996.
[2] Patrice Chalin. Reassessing JML’s logical foundation. In Proceedings of the 7th Workshop on
Formal Techniques for Java-like Programs (FTfJP’05), Glasgow, Scotland, July 2005.
[3] Patrice Chalin. A sound assertion semantics for the dependable systems evolution verifying com-
piler. In Proceedings of the International Conference on Software Engineering (ICSE’07), pages
23–33, Los Alamitos, CA, USA, 2007. IEEE Computer Society.
[4] David R. Cok and Joseph R. Kiniry. ESC/Java2 implementation notes. Technical report, May
2007. http://secure.ucd.ie/products/opensource/ESCJava2/ESCTools/
docs/Escjava2-ImplementationNotes.pdf.
[5] Jeremy Condit, Matthew Harren, Scott McPeak, George C. Necula, and Westley Weimer. CCured in
the real world. In PLDI ’03: Proceedings of the ACM SIGPLAN 2003 conference on Programming
language design and implementation, pages 232–244, 2003.
[6] Jeremy Paul Condit, Matthew Thomas Harren, Zachary Ryan Anderson, David Gay, and George
Necula. Dependent types for low-level programming. In ESOP ’07: Proceedings of the 16th
European Symposium on Programming, Oct 2006.
[8] Jean-Christophe Filliâtre and Claude Marché. Multi-prover verification of C programs. In Jim
Davies, Wolfram Schulte, and Mike Barnett, editors, 6th International Conference on Formal En-
gineering Methods, volume 3308 of Lecture Notes in Computer Science, pages 15–29, Seattle, WA,
USA, November 2004. Springer.
[9] Jean-Christophe Filliâtre and Claude Marché. The Why/Krakatoa/Caduceus platform for deductive
program verification. In Werner Damm and Holger Hermanns, editors, 19th International Confer-
ence on Computer Aided Verification, Lecture Notes in Computer Science, Berlin, Germany, July
2007. Springer.
[11] A. Giorgetti and J. Groslambert. JAG: JML Annotation Generation for verifying temporal prop-
erties. In FASE’2006, Fundamental Approaches to Software Engineering, volume 3922 of LNCS,
pages 373–376, Vienna, Austria, March 2006. Springer.
[12] International Organization for Standardization (ISO). The ANSI C standard (C99). http://www.
open-std.org/JTC1/SC22/WG14/www/docs/n1124.pdf.
[13] Brian Kernighan and Dennis Ritchie. The C Programming Language (2nd Ed.). Prentice-Hall,
1988.
82 BIBLIOGRAPHY
[16] Gary T. Leavens, Albert L. Baker, and Clyde Ruby. Preliminary design of JML: A behavioral
interface specification language for Java. Technical Report 98-06i, Iowa State University, 2000.
[17] Gary T. Leavens, K. Rustan M. Leino, and Peter Müller. Specification and verification challenges
for sequential object-oriented programs. Form. Asp. Comput., 19(2):159–189, 2007.
[18] Gary T. Leavens, K. Rustan M. Leino, Erik Poll, Clyde Ruby, and Bart Jacobs. JML: notations and
tools supporting detailed design in Java. In OOPSLA 2000 Companion, Minneapolis, Minnesota,
pages 105–106, 2000.
[19] Claude Marché. Towards modular algebraic specifications for pointer programs: a case study.
In H. Comon-Lundh, C. Kirchner, and H. Kirchner, editors, Rewriting, Computation and Proof,
volume 4600 of Lecture Notes in Computer Science, pages 235–258. Springer-Verlag, 2007.
[20] Yannick Moy. Union and cast in deductive verification. Technical Report ICIS-R07015, Radboud
University Nijmegen, jul 2007. http://www.lri.fr/~moy/union_and_cast/union_
and_cast.pdf.
[21] George C. Necula, Scott McPeak, and Westley Weimer. CCured: Type-safe retrofitting of legacy
code. In Symposium on Principles of Programming Languages, pages 128–139, 2002.
[22] Arun D. Raghavan and Gary T. Leavens. Desugaring JML method specifications. Technical Report
00-03a, Iowa State University, 2000.
[23] David Stevenson et al. An american national standard: IEEE standard for binary floating point
arithmetic. ACM SIGPLAN Notices, 22(2):9–25, 1987.