Explicit-Value Analysis Based on CEGAR and Interpolation: Dirk Beyer and Stefan Löwe
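$\bigcup_{g \in G} \xrightarrow{g}$. We write $c \xrightarrow{g} c'$ if $(c, g, c') \in {\rightarrow}$, and $c \rightarrow c'$ if there exists a $g$ with $c \xrightarrow{g} c'$.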
An abstract data state represents a region of concrete data states, formally defined as an abstract variable assignment. An abstract variable assignment is a partial function $v : X \rightharpoonup \mathbb{Z} \cup \{\top, \bot\}$, which maps variables in the definition range of function $v$ to integer values or $\top$ or $\bot$. The special value $\top$ is used to represent an unknown value, e.g., resulting from an uninitialized variable or an external function call, and the special value $\bot$ is used to represent no value, i.e., a contradicting variable assignment. We denote the definition range for a partial function $f$ as $\mathrm{def}(f) = \{x \mid \exists y : (x, y) \in f\}$, and the restriction of a partial function $f$ to a new definition range $Y$ as $f_{|Y} = f \cap (Y \times (\mathbb{Z} \cup \{\top, \bot\}))$. An abstract variable assignment $v$ represents the region $[[v]]$ of all concrete data states $cd$ for which $v$ is valid, formally: $[[v]] = \{cd \mid \forall x \in \mathrm{def}(v) : cd(x) = v(x) \text{ or } v(x) = \top\}$. An abstract state of a program is a pair $(l, v)$, representing the following set of concrete states: $\{(l, cd) \mid cd \in [[v]]\}$.
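The definitions above can be illustrated with a minimal Python sketch. It assumes that an abstract variable assignment is stored as a dictionary from variable names to integers or to one of two marker strings; the names `TOP`, `BOT`, `definition_range`, `restrict`, and `represents` are illustrative choices and do not come from the paper or from CPACHECKER.

```python
# Sketch only: TOP/BOT and all function names are illustrative assumptions.
TOP, BOT = "TOP", "BOT"  # stand-ins for the special values "unknown" and "no value"


def definition_range(v):
    """def(v): the variables on which the partial function v is defined."""
    return set(v)


def restrict(v, ys):
    """Restriction of v to the new definition range ys."""
    return {x: val for x, val in v.items() if x in ys}


def represents(v, cd):
    """True if the concrete data state cd lies in the region [[v]]."""
    # For every x in def(v): cd(x) = v(x) or v(x) is TOP.
    # A BOT entry never equals an integer, so it excludes every cd.
    return all(v[x] == TOP or cd.get(x) == v[x] for x in v)


v = {"x": 2, "y": TOP}
print(represents(v, {"x": 2, "y": 7}))    # y is unknown, so any value fits
print(represents(v, {"x": 3, "y": 7}))    # x is required to be 2
print(represents({"x": BOT}, {"x": 0}))   # a BOT entry represents no state
print(restrict(v, {"x"}))
```

Storing only the defined variables keeps the partial-function view explicit: a missing key means that the variable is outside the definition range.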
B. Configurable Program Analysis with Dynamic Precision Adjustment
We use the framework of configurable program analysis (CPA) [8], extended by the concept of dynamic precision adjustment [9]. Such a CPA supports adjusting the precision of an analysis during the exploration of the program's abstract state space. A composite CPA can control the precision of its component analyses during the verification process, i.e., it can make a component analysis more abstract, and thus more efficient, or it can make a component analysis more precise, and thus more expensive. A CPA $\mathbb{D} = (D, \Pi, \rightsquigarrow, \mathsf{merge}, \mathsf{stop}, \mathsf{prec})$ consists of (1) an abstract domain $D$, (2) a set $\Pi$ of precisions, (3) a transfer relation $\rightsquigarrow$, (4) a merge operator $\mathsf{merge}$, (5) a termination check $\mathsf{stop}$, and (6) a precision adjustment function $\mathsf{prec}$. Based on these components and operators, we can formulate a flexible and customizable reachability algorithm, which is adapted from previous work [8], [12].
C. Explicit-Value Analysis as CPA
In the following, we define a component CPA that tracks explicit integer values for program variables. In order to obtain a complete analysis, we construct a composite CPA that consists of the component CPA for explicit values and another component CPA for tracking the program locations (CPA for location analysis, as previously described [9]). For the composite CPA, the general definitions of the abstract domain, the transfer relation, and the other operators are given in previous work [9]; the composition is done automatically by the framework implementation CPACHECKER.
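The CPA for explicit-value analysis, which tracks integer values for the variables of a program explicitly, is defined as $\mathbb{C} = (D_\mathbb{C}, \Pi_\mathbb{C}, \rightsquigarrow_\mathbb{C}, \mathsf{merge}_\mathbb{C}, \mathsf{stop}_\mathbb{C}, \mathsf{prec}_\mathbb{C})$ and consists of the following components [9]:
1. The abstract domain $D_\mathbb{C} = (C, \mathcal{V}, [[\cdot]])$ contains the set $C$ of concrete data states, and uses the semi-lattice $\mathcal{V} = (V, \top, \bot, \sqsubseteq, \sqcup)$, which consists of the set $V = (X \to \mathcal{Z})$ of abstract variable assignments, where $\mathcal{Z} = \mathbb{Z} \cup \{\top_\mathcal{Z}, \bot_\mathcal{Z}\}$ extends the integers by a top and a bottom value. The top element $\top \in V$ has $\top(x) = \top_\mathcal{Z}$ for all $x \in X$, and the bottom element $\bot \in V$ has $\bot(x) = \bot_\mathcal{Z}$ for all $x \in X$. The partial order $\sqsubseteq \subseteq V \times V$ is defined as $v \sqsubseteq v'$ if for all $x \in X$, we have $v(x) = v'(x)$ or $v(x) = \bot_\mathcal{Z}$ or $v'(x) = \top_\mathcal{Z}$. The join $\sqcup : V \times V \to V$ yields the least upper bound for two variable assignments. The concretization function $[[\cdot]] : V \to 2^C$ assigns to each abstract data state $v$ its meaning, i.e., the set of concrete data states that it represents.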
2. The set of precisions $\Pi_\mathbb{C} = 2^X$ is the set of subsets of program variables. A precision $\pi \in \Pi_\mathbb{C}$ specifies a set of variables to be tracked. For example, $\pi = \emptyset$ means that no variable is tracked, and $\pi = X$ means that every program variable is tracked.
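3. The transfer relation $\rightsquigarrow_\mathbb{C}$ has the transfer $v \stackrel{g}{\rightsquigarrow} (v', \pi)$ if
(1) $g = (\cdot, \mathsf{assume}(p), \cdot)$ and for all $x \in X$:
$$v'(x) = \begin{cases} \bot_\mathcal{Z} & \text{if } (y, \bot_\mathcal{Z}) \in v \text{ for some } y \in X \text{ or the formula } p_{/v} \text{ is unsatisfiable} \\ c & \text{if } c \text{ is the only satisfying assignment of the formula } p_{/v} \text{ for variable } x \\ \top_\mathcal{Z} & \text{otherwise} \end{cases}$$
where $p_{/v}$ denotes the interpretation of a predicate $p$ over variables from $X$ for an abstract variable assignment $v$, that is,
$$p_{/v} = p \wedge \bigwedge_{x \in \mathrm{def}(v),\, v(x) \in \mathbb{Z}} x = v(x) \wedge \neg \exists x \in \mathrm{def}(v) : v(x) = \bot_\mathcal{Z}$$
or
(2) $g = (\cdot, w := exp, \cdot)$ and for all $x \in X$:
$$v'(x) = \begin{cases} exp_{/v} & \text{if } x = w \\ v(x) & \text{if } x \in \mathrm{def}(v) \\ \top_\mathcal{Z} & \text{otherwise} \end{cases}$$
where $exp_{/v}$ denotes the interpretation of an expression $exp$ over variables from $X$ for an abstract value assignment $v$:
$$exp_{/v} = \begin{cases} \bot_\mathcal{Z} & \text{if } (y, \bot_\mathcal{Z}) \in v \text{ for some } y \in X \\ \top_\mathcal{Z} & \text{if } (y, \top_\mathcal{Z}) \in v \text{ or } y \notin \mathrm{def}(v) \text{ for some } y \in X \text{ that occurs in } exp \\ c & \text{otherwise, where expression } exp \text{ evaluates to } c \text{ after replacing each occurrence of variable } x \text{ with } x \in \mathrm{def}(v) \text{ by } v(x) \text{ in } exp \end{cases}$$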
4. The merge operator does not combine elements when control flow meets: $\mathsf{merge}_\mathbb{C}(v, v', \pi) = v'$.
5. The termination check considers abstract states individually: $\mathsf{stop}_\mathbb{C}(v, R, \pi) = (\exists v' \in R : v \sqsubseteq v')$.
6. The precision adjustment function computes a new abstract state with precision based on the abstract state $v$ and the precision $\pi$ by restricting the variable assignment $v$ to those variables that appear in $\pi$, formally: $\mathsf{prec}(v, \pi, R) = (v_{|\pi}, \pi)$. (In this analysis instance, $\mathsf{prec}$ only adjusts the abstract state according to the current precision $\pi$, and leaves the precision itself unchanged.)
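The operators of this component CPA can be sketched in a few lines of Python. The sketch covers only the assignment case of the transfer relation, and it assumes that an expression is given as a tuple of the variable names it reads plus a Python callable; this encoding and all function names are assumptions made for illustration, not the CPACHECKER implementation. A variable that is missing from the dictionary is treated as unknown.

```python
# Sketch only: the expression encoding and all names are illustrative assumptions.
TOP, BOT = "TOP", "BOT"


def evaluate(exp_vars, exp_fn, v):
    """Interpretation exp/v of an expression for the assignment v."""
    if BOT in v.values():
        return BOT                      # some variable already has no value
    if any(v.get(y, TOP) == TOP for y in exp_vars):
        return TOP                      # an operand is unknown
    return exp_fn(*(v[y] for y in exp_vars))


def transfer_assign(v, w, exp_vars, exp_fn):
    """Successor of v for the assignment edge w := exp."""
    succ = dict(v)
    succ[w] = evaluate(exp_vars, exp_fn, v)
    return succ


def less_or_equal(v, w, variables):
    """Partial order: v(x) = w(x) or v(x) = BOT or w(x) = TOP for all x."""
    return all(
        v.get(x, TOP) == w.get(x, TOP) or v.get(x, TOP) == BOT or w.get(x, TOP) == TOP
        for x in variables
    )


def merge(v, v_prime, precision):
    """merge returns the existing state unchanged (no combination)."""
    return v_prime


def stop(v, reached, variables):
    """v is covered if a single reached state is at least as abstract."""
    return any(less_or_equal(v, w, variables) for w in reached)


def prec(v, precision, reached):
    """Restrict v to the tracked variables; the precision stays the same."""
    return {x: val for x, val in v.items() if x in precision}, precision


v1 = transfer_assign({}, "a", (), lambda: 0)
v2 = transfer_assign(v1, "b", ("a",), lambda a: a + 1)
print(v2)
print(prec(v2, {"a"}, []))
print(stop(v2, [{"a": 0}], {"a", "b"}))
```

In the last call, the state that tracks only `a` covers `v2`, which is why a small precision increases the chance that a new abstract state is covered.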
The precision of the analysis controls which program vari-
ables are tracked in an abstract state. In other approaches,
this information is hard-wired in either the abstract-domain
elements or the algorithm itself. The concept of CPA supports
different precisions for different abstract states. A simple
analysis can start with an initial precision and propagate it
to new abstract states, such that the overall analysis uses a
globally uniform precision. It is also possible to specify a
precision individually per program location, instead of using
one global precision. Our refinement approach in the next section will be based on location-specific precisions.
D. Predicate Analysis as CPA
The abstract domain of predicates [18] was successfully used in several tools for software model checking (e.g., [4], [6], [10], [13], [16], [25]). In a predicate analysis, the precision is defined as a set of predicates, and the abstract states track the strongest set of predicates that are fulfilled (cartesian predicate abstraction) or the strongest boolean combination of predicates that are fulfilled (boolean predicate abstraction). This means that the abstraction level of the abstract model is determined by the predicates that are tracked in the analysis. Predicate analysis is also implemented as a CPA in the framework CPACHECKER, and a detailed description is available [11]. The precision is freely adjustable also in the predicate analysis, and we use this feature later in this article to compose a combined analysis. This analysis uses the predicate analysis to track variables that have many distinct values, a scenario in which the explicit-value analysis alone would be inefficient. The combined analysis adjusts the overall precision by removing variables with many distinct values from the precision of the explicit-value analysis and adding predicates about these variables to the precision of the predicate analysis [9], which allows the combined analysis to run efficiently.
E. Lazy Abstraction
The concept of lazy abstraction [22] consists of two ideas: First, the abstract reachability graph (ARG), i.e., the unfolding of the control-flow graph, representing our central data structure to store abstract states, is constructed on-the-fly, i.e., only when needed and only for parts of the state space that are reachable. We implement this using the standard reachability algorithm for CPAs as described in the next subsection. Second, the abstract states in the ARG are refined only where necessary along infeasible error paths in order to eliminate those paths. This is implemented by using CPAs with dynamic
Algorithm 1 CPA($\mathbb{D}$, $R_0$, $W_0$), adapted from [9]
Input: a CPA $\mathbb{D} = (D, \Pi, \rightsquigarrow, \mathsf{merge}, \mathsf{stop}, \mathsf{prec})$,
    a set $R_0 \subseteq (E \times \Pi)$ of abstract states with precision,
    a subset $W_0 \subseteq R_0$ of frontier abstract states with precision,
    where $E$ denotes the set of elements of the semi-lattice of $D$
Output: a set of reachable abstract states with precision,
    a subset of frontier abstract states with precision
Variables: two sets reached and waitlist of elements of $E \times \Pi$
reached := $R_0$; waitlist := $W_0$;
while waitlist $\neq \emptyset$ do
    choose $(e, \pi)$ from waitlist; remove $(e, \pi)$ from waitlist;
    for each $e'$ with $e \rightsquigarrow (e', \pi)$ do
        // Precision adjustment.
        $(\hat{e}, \hat{\pi})$ := prec($e'$, $\pi$, reached);
        if isTargetState($\hat{e}$) then
            return (reached $\cup \{(\hat{e}, \hat{\pi})\}$, waitlist $\cup \{(\hat{e}, \hat{\pi})\}$);
        for each $(e'', \pi'') \in$ reached do
            // Combine with existing abstract state.
            $e_{new}$ := merge($\hat{e}$, $e''$, $\hat{\pi}$);
            if $e_{new} \neq e''$ then
                waitlist := (waitlist $\cup \{(e_{new}, \hat{\pi})\}$) $\setminus \{(e'', \pi'')\}$;
                reached := (reached $\cup \{(e_{new}, \hat{\pi})\}$) $\setminus \{(e'', \pi'')\}$;
        // Add new abstract state?
        if $\neg$ stop($\hat{e}$, $\{e \mid (e, \cdot) \in$ reached$\}$, $\hat{\pi}$) then
            waitlist := waitlist $\cup \{(\hat{e}, \hat{\pi})\}$;
            reached := reached $\cup \{(\hat{e}, \hat{\pi})\}$
return (reached, $\emptyset$);
precision adjustment, where the refinement procedure operates on location-specific precisions and where the precision-adjustment operator always removes unnecessary information from abstract states, as outlined above.
F. Reachability Algorithm for CPA
Algorithm 1 keeps updating two sets of abstract states with
precision: the set reached to store all abstract states with pre-
cision that are found to be reachable, and a set waitlist to store
all abstract states with precision that are not yet processed, i.e.,
the frontier. The state exploration starts with choosing and
removing an abstract state with precision from the waitlist,
and the algorithm considers each abstract successor according
to the transfer relation. Next, for the successor, the algorithm
adjusts the precision of the successor using the precision
adjustment function prec. If the successor is a target state (i.e., a violation of the property is found), then the algorithm terminates, returning the current sets reached and waitlist, possibly as input for a subsequent precision refinement, as shown below (cf. Alg. 2). Otherwise, using the given
operator merge, the abstract successor state is combined with
each existing abstract state from reached. If the operator merge
has added information to the new abstract state, such that the
old abstract state is subsumed, then the old abstract state with
precision is replaced by the new abstract state with precision
in the sets reached and waitlist. If after the merge step the
resulting new abstract state with precision is covered by the
set reached, then further exploration of this abstract state is
stopped. Otherwise, the abstract state with its precision is
added to the set reached and to the set waitlist. Finally, once
the set waitlist is empty, the set reached is returned.
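The loop described above can be sketched in Python as follows. The sketch assumes that abstract states and precisions are hashable values and that the CPA operators are passed in as plain callables; the function name `cpa`, its parameter list, and the toy location graph used to exercise it are assumptions made for illustration.

```python
# Sketch only: signature and toy instance are illustrative assumptions.
def cpa(transfer, merge, stop, prec, is_target, reached0, waitlist0):
    """Reachability loop over pairs (abstract state, precision)."""
    reached, waitlist = set(reached0), set(waitlist0)
    while waitlist:
        e, pi = waitlist.pop()                     # choose and remove
        for e_succ in transfer(e, pi):
            e_hat, pi_hat = prec(e_succ, pi, reached)   # precision adjustment
            if is_target(e_hat):
                # Stop early and hand both sets back, e.g. for a refinement.
                return reached | {(e_hat, pi_hat)}, waitlist | {(e_hat, pi_hat)}
            for e_old, pi_old in list(reached):    # combine with existing states
                e_new = merge(e_hat, e_old, pi_hat)
                if e_new != e_old:
                    waitlist = (waitlist | {(e_new, pi_hat)}) - {(e_old, pi_old)}
                    reached = (reached | {(e_new, pi_hat)}) - {(e_old, pi_old)}
            if not stop(e_hat, {s for s, _ in reached}, pi_hat):
                waitlist.add((e_hat, pi_hat))      # new state: explore it later
                reached.add((e_hat, pi_hat))
    return reached, set()


# Toy instance: states are node numbers of a small diamond-shaped graph.
graph = {0: [1, 2], 1: [3], 2: [3], 3: []}
pi0 = frozenset()
operators = dict(
    transfer=lambda e, pi: graph[e],
    merge=lambda e_hat, e_old, pi: e_old,          # never combine
    stop=lambda e, states, pi: e in states,        # covered if already seen
    prec=lambda e, pi, reached: (e, pi),           # leave state and precision as is
)
safe_reached, safe_wait = cpa(is_target=lambda e: False,
                              reached0={(0, pi0)}, waitlist0={(0, pi0)}, **operators)
err_reached, err_wait = cpa(is_target=lambda e: e == 3,
                            reached0={(0, pi0)}, waitlist0={(0, pi0)}, **operators)
print(sorted(e for e, _ in safe_reached), safe_wait)
print((3, pi0) in err_wait)
```

An empty returned waitlist signals that the state space was explored exhaustively, whereas a non-empty one signals that a target state stopped the exploration.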
G. Counterexample-Guided Abstraction Refinement
Counterexample-guided abstraction refinement (CEGAR) [14] is a technique for automatic stepwise refinement of an abstract model. CEGAR is based on three concepts: (1) a precision, which determines the current level of abstraction, (2) a feasibility check, deciding if an abstract error path is feasible, i.e., if there exists a corresponding concrete error path, and (3) a refinement procedure, which takes as input an infeasible error path and extracts a precision that suffices to instruct the exploration algorithm to not explore the same path again later. Algorithm 2 shows an outline of a generic and simple CEGAR algorithm. The algorithm starts checking a program using a coarse initial precision $\pi_0$. It uses the reachability algorithm Alg. 1 for computing the reachable abstract state space, returning the sets reached and waitlist. If the analysis has exhaustively checked all program states and did not reach the error, indicated by an empty set waitlist, then the algorithm terminates and reports that the program is safe. If the algorithm finds an error in the abstract state space, i.e., a counterexample for the given specification, then the exploration algorithm stops and returns the unfinished, incomplete sets reached and waitlist. Now the corresponding abstract error path is extracted from the set reached using procedure extractErrorPath and analyzed for feasibility using the procedure isFeasible. If the abstract error path is feasible, meaning there exists a corresponding concrete error path, then this error path represents a violation of the specification and the algorithm terminates, reporting a bug. If the error path is infeasible, i.e., not corresponding to a concrete program path, then the precision was too coarse and needs to be refined. The algorithm extracts certain information from the error path in order to refine the precision based on that information using the procedure Refine, which returns a precision $\pi$ that makes the analysis strong enough to refute the infeasible error path in further state-space explorations. The current precision is extended using the precision returned by the refinement procedure and the analysis is restarted with this refined precision. Instead of restarting from the initial sets for reached and waitlist, we can also prune those parts of the ARG that need to be rediscovered with new precisions, replace the precision of the leaf nodes in the ARG with the refined precision, and then restart the exploration on the pruned sets. Our contribution in the next section is to introduce new implementations for the feasibility check as well as for the refinement procedure.
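The control flow of this loop can be sketched in Python. The sketch assumes that the reachability analysis, the error-path extraction, the feasibility check, and the refinement are supplied as callables, and that a precision is a frozenset that can be extended by set union; the toy stubs below exist only to exercise the loop and do not model a real analysis.

```python
# Sketch only: all callables and the toy stubs are illustrative assumptions.
def cegar(run_cpa, extract_error_path, is_feasible, refine, e0, pi0):
    """Generic CEGAR loop with restart after every refinement."""
    pi = pi0
    reached, waitlist = {(e0, pi)}, {(e0, pi)}
    while True:
        reached, waitlist = run_cpa(reached, waitlist)
        if not waitlist:
            return "safe"                  # exhaustive exploration, no error
        sigma = extract_error_path(reached)
        if is_feasible(sigma):
            return "unsafe"                # feasible error path: report bug
        pi = pi | refine(sigma)            # extend the current precision
        reached, waitlist = {(e0, pi)}, {(e0, pi)}   # restart from scratch


def toy_cpa(reached, waitlist):
    """Toy stub: finds the 'error' unless variable a is tracked."""
    _, pi = next(iter(waitlist))
    if "a" in pi:
        return reached, set()
    return reached | {("ERR", pi)}, {("ERR", pi)}


toy_path = ["a = 0", "[a == -1]"]
result_refined = cegar(toy_cpa, lambda reached: toy_path, lambda sigma: False,
                       lambda sigma: frozenset({"a"}), "init", frozenset())
result_bug = cegar(toy_cpa, lambda reached: toy_path, lambda sigma: True,
                   lambda sigma: frozenset({"a"}), "init", frozenset())
print(result_refined, result_bug)
```

With the infeasible toy path, one refinement adds `a` to the precision and the second exploration terminates with an empty waitlist.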
H. Interpolation
For a pair of formulas $\varphi^-$ and $\varphi^+$ such that $\varphi^- \wedge \varphi^+$ is unsatisfiable, a Craig interpolant $\psi$ is a formula that fulfills the following requirements [17]:
1) the implication $\varphi^- \Rightarrow \psi$ holds,
2) the conjunction $\psi \wedge \varphi^+$ is unsatisfiable, and
3) $\psi$ only contains symbols that occur in both $\varphi^-$ and $\varphi^+$.
Such a Craig interpolant is guaranteed to exist for many useful theories, for example, the theory of linear arithmetic with
Algorithm 2 CEGAR($\mathbb{D}$, $e_0$, $\pi_0$)
Input: a configurable program analysis with dynamic precision adjustment $\mathbb{D} = (D, \Pi, \rightsquigarrow, \mathsf{merge}, \mathsf{stop}, \mathsf{prec})$,
    an initial abstract state $e_0 \in E$ with precision $\pi_0 \in \Pi$,
    where $E$ denotes the set of elements of the semi-lattice of $D$
Output: verification result safe or unsafe
Variables: a set reached of elements of $E \times \Pi$,
    a set waitlist of elements of $E \times \Pi$,
    an error path $\sigma = \langle (op_1, l_1), ..., (op_n, l_n) \rangle$
reached := $\{(e_0, \pi_0)\}$; waitlist := $\{(e_0, \pi_0)\}$; $\pi$ := $\pi_0$;
while true do
    (reached, waitlist) := CPA($\mathbb{D}$, reached, waitlist);
    if waitlist = $\emptyset$ then
        return safe
    else
        $\sigma$ := extractErrorPath(reached);
        if isFeasible($\sigma$) then // error path is feasible: report bug
            return unsafe
        else // error path is not feasible: refine and restart
            $\pi$ := $\pi \cup$ Refine($\sigma$);
            reached := $\{(e_0, \pi)\}$; waitlist := $\{(e_0, \pi)\}$;
uninterpreted functions, as implemented in some SMT solvers (e.g., MATHSAT³, SMTINTERPOL⁴).
CEGAR based on Craig interpolation has been proven
successful in the predicate domain. Therefore, we investigate whether this technique is also beneficial for explicit-value model checking. Interpolants from the predicate domain, which consist of
path formulas, are not useful for the explicit domain. Hence,
we need to develop a procedure to compute interpolants for the
explicit domain, which we introduce in the following section.
III. Refinement-Based Explicit-Value Analysis
The level of abstraction in our explicit-value analysis is determined by the precisions for abstract variable assignments over program variables. The CEGAR-based iterative refinement needs an extraction method to obtain the necessary precision from infeasible error paths. We use our novel notion of interpolation for the explicit domain to achieve this goal.
A. Explicit-Value Abstraction
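We now introduce some necessary operations on abstract variable assignments, the semantics of operations and paths, and the precision for abstract variable assignments and programs, in order to be able to concisely discuss interpolation for abstract variable assignments and constraint sequences.
The operations implication and conjunction for abstract variable assignments are defined as follows: implication for $v$ and $v'$: $v \Rightarrow v'$ if $\mathrm{def}(v') \subseteq \mathrm{def}(v)$ and for each variable $x \in \mathrm{def}(v) \cap \mathrm{def}(v')$ we have $v(x) = v'(x)$ or $v(x) = \bot$ or $v'(x) = \top$; conjunction for $v$ and $v'$: for each variable $x \in \mathrm{def}(v) \cup \mathrm{def}(v')$ we have
$$(v \wedge v')(x) = \begin{cases} v(x) & \text{if } x \in \mathrm{def}(v) \text{ and } x \notin \mathrm{def}(v') \\ v'(x) & \text{if } x \notin \mathrm{def}(v) \text{ and } x \in \mathrm{def}(v') \\ v(x) & \text{if } v(x) = v'(x) \\ \bot & \text{if } \top \neq v(x) \neq v'(x) \neq \top \\ \top & \text{otherwise } (v(x) = \top \text{ or } v'(x) = \top) \end{cases}$$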
3 http://mathsat4.disi.unitn.it
4 http://ultimate.informatik.uni-freiburg.de/smtinterpol
Furthermore, we define contradiction for an abstract variable assignment $v$: $v$ is contradicting if there is a variable $x \in \mathrm{def}(v)$ such that $v(x) = \bot$ (which implies $[[v]] = \emptyset$); and renaming for $v$: the abstract variable assignment $v_{x \mapsto y}$, with $y \notin \mathrm{def}(v)$, results from $v$ by renaming variable $x$ to $y$: $v_{x \mapsto y} = (v \setminus \{(x, v(x))\}) \cup \{(y, v(x))\}$.
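The four operations can be written down directly for dictionary-based assignments. The following Python sketch assumes that only the variables in the definition range are stored as keys; the marker strings and the function names are illustrative assumptions.

```python
# Sketch only: TOP/BOT and all function names are illustrative assumptions.
TOP, BOT = "TOP", "BOT"


def implies(v, w):
    """v => w: def(w) is contained in def(v) and v is at least as specific."""
    return set(w) <= set(v) and all(
        v[x] == w[x] or v[x] == BOT or w[x] == TOP for x in w
    )


def conjoin(v, w):
    """Conjunction of v and w, following the case distinction above."""
    result = {}
    for x in set(v) | set(w):
        if x not in w:
            result[x] = v[x]
        elif x not in v:
            result[x] = w[x]
        elif v[x] == w[x]:
            result[x] = v[x]
        elif v[x] != TOP and w[x] != TOP:
            result[x] = BOT             # two different definite values clash
        else:
            result[x] = TOP             # one side is unknown
    return result


def is_contradicting(v):
    """v is contradicting if some variable is mapped to BOT."""
    return BOT in v.values()


def rename(v, x, y):
    """Rename variable x to y; y must not be in def(v)."""
    if y in v:
        raise ValueError("target variable is already defined")
    result = {k: val for k, val in v.items() if k != x}
    result[y] = v[x]
    return result


print(conjoin({"x": 1}, {"y": 2}))
print(is_contradicting(conjoin({"x": 1}, {"x": 2})))
print(implies({"x": 1, "y": 2}, {"x": 1}))
print(rename({"x": 1}, "x", "y"))
```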
The semantics of an operation $op \in Ops$ is defined by the strongest post-operator $SP_{op}(\cdot)$ for abstract variable assignments: given an abstract variable assignment $v$, $SP_{op}(v)$ represents the set of data states that are reachable from any of the states in the region represented by $v$ after the execution of $op$. Formally, given an abstract variable assignment $v$ and an assignment operation $s := exp$, we have $SP_{s := exp}(v) = v_{|X \setminus \{s\}} \wedge v_{s := exp}$ with $v_{s := exp} = \{(s, exp_{/v})\}$, where $exp_{/v}$ denotes the interpretation of expression $exp$ for the abstract variable assignment $v$ (cf. definition of $exp_{/v}$ in Subsection II-C). That is, the value of variable $s$ is the result of the arithmetic evaluation of expression $exp$, or $\top$ if not all values in the expression are known, or $\bot$ if no value is possible (an abstract data state in which a variable is assigned to $\bot$ does not represent any concrete data state). Given an abstract variable assignment $v$ and an assume operation $[p]$, we have $SP_{[p]}(v) = v'$, and for all $x \in X$ we have $v'(x) = \bot$ if $(y, \bot) \in v$ for some variable $y \in X$ or the formula $p_{/v}$ is unsatisfiable, or $v'(x) = c$ if $c$ is the only satisfying assignment of the formula $p_{/v}$ for variable $x$, or $v'(x) = \top$ in all other cases (cf. definition of $p_{/v}$ in Subsection II-C).
Every path $\sigma = \langle (op_1, l_1), ..., (op_n, l_n) \rangle$ defines a constraint sequence $\gamma_\sigma = \langle op_1, ..., op_n \rangle$. The semantics of a program path $\sigma = \langle (op_1, l_1), ..., (op_n, l_n) \rangle$ is defined as the successive application of the strongest post-operator to each operation of the corresponding constraint sequence $\gamma_\sigma$: $SP_{\gamma_\sigma}(v) = SP_{op_n}(... SP_{op_i}(.. SP_{op_1}(v) ..) ...)$. The set of concrete program states that result from running $\sigma$ is represented by the pair $(l_n, SP_{\gamma_\sigma}(v_0))$, where $v_0 = \{\}$ is the initial abstract variable assignment that does not map any variable to a value. A program path $\sigma$ is feasible if $SP_{\gamma_\sigma}(v_0)$ is not contradicting, i.e., $SP_{\gamma_\sigma}(v_0)(x) \neq \bot$ for all variables $x$ in $\mathrm{def}(SP_{\gamma_\sigma}(v_0))$. A concrete state $(l_n, cd_n)$ is reachable from a region $r$, denoted by $(l_n, cd_n) \in Reach(r)$, if there exists a feasible program path $\sigma = \langle (op_1, l_1), ..., (op_n, l_n) \rangle$ with $(l_0, v_0) \in r$ and $cd_n \in [[SP_{\gamma_\sigma}(v_0)]]$. A location $l$ is reachable if there exists a concrete state $c$ such that $(l, c)$ is reachable. A program is SAFE if $l_e$ is not reachable.
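The strongest post-operator and the resulting feasibility test can be sketched in Python. As a simplifying assumption that the paper does not make, assume operations are restricted here to the two forms `[x == c]` and `[x != c]` for a variable `x` and an integer constant `c`, so that satisfiability can be decided without a solver. The tuple encoding of operations and all function names are likewise illustrative.

```python
# Sketch only: operation encoding, restricted assumes, and names are assumptions.
TOP, BOT = "TOP", "BOT"


def sp_op(op, v):
    """Strongest post of one operation for the assignment v."""
    if op[0] == "assign":                       # ("assign", s, exp_vars, exp_fn)
        _, s, exp_vars, exp_fn = op
        if BOT in v.values():
            value = BOT
        elif any(v.get(y, TOP) == TOP for y in exp_vars):
            value = TOP
        else:
            value = exp_fn(*(v[y] for y in exp_vars))
        succ = {x: val for x, val in v.items() if x != s}
        succ[s] = value
        return succ
    _, x, rel, c = op                           # ("assume", x, "==" or "!=", c)
    known = v.get(x, TOP)
    violated = known != TOP and (known == c) != (rel == "==")
    if BOT in v.values() or violated:
        return {y: BOT for y in set(v) | {x}}   # no concrete state remains
    succ = dict(v)
    if rel == "==":
        succ[x] = c                             # c is the only satisfying value
    return succ


def sp(gamma, v):
    """Successive application of sp_op along a constraint sequence."""
    for op in gamma:
        v = sp_op(op, v)
    return v


def is_feasible(gamma):
    """A path is feasible if the result for the empty assignment has no BOT."""
    return BOT not in sp(gamma, {}).values()


prefix = [
    ("assign", "a", (), lambda: 0),
    ("assign", "b", ("a",), lambda a: a),
    ("assign", "c", ("a",), lambda a: a),
]
print(sp(prefix, {}))
print(is_feasible(prefix + [("assume", "a", "==", -1)]))
print(is_feasible(prefix + [("assume", "a", "==", 0)]))
```

Starting from the empty assignment corresponds to tracking every variable, which is the full-precision check used for error paths.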
The precision for an abstract variable assignment is a set $\pi$ of variables. The explicit-value abstraction for an abstract variable assignment is an abstract variable assignment that is defined only on variables that are in the precision $\pi$. For example, the explicit-value abstraction for the variable assignment $v = \{x \mapsto 2, y \mapsto 5\}$ and the precision $\pi = \{x\}$ is the abstract variable assignment $v^\pi = \{x \mapsto 2\}$.
The precision for a program is a function $\Pi : L \to 2^X$, which assigns to each program location a precision for an abstract variable assignment, i.e., a set of variables for which the analysis is instructed to track values. A lazy explicit-value abstraction of a program uses different precisions for different abstract states on different program paths in the abstract reachability graph (ARG). The explicit-value abstraction for a variable assignment at location $l$ is computed using the precision $\Pi(l)$.
B. CEGAR for Explicit-Value Model Checking
We now instantiate the three components of the CEGAR technique, i.e., precision, feasibility check, and refinement, for our explicit-value analysis. The precisions that our CEGAR instance uses are the above-introduced precisions for a program (which assign to each program location a set of variables), and we start the CEGAR iteration with the empty precision, i.e., $\Pi_{init}(l) = \emptyset$ for each $l \in L$, such that no variable will be tracked.
The feasibility check for a path $\sigma$ is performed by executing an explicit-value analysis of the path $\sigma$ using the full precision $\Pi(l) = X$ for all locations $l$, i.e., all variables will be tracked. This is equivalent to computing $SP_{\gamma_\sigma}(v_0)$ and checking whether the result is contradicting, i.e., whether there is a variable for which the resulting abstract variable assignment is $\bot$. This feasibility check is extremely efficient, because the path is finite and the strongest post-operations for abstract variable assignments are simple arithmetic evaluations. If the feasibility check reaches the error location $l_e$, then this error can be reported. If the check cannot reach the error location, because of a contradicting abstract variable assignment, then a refinement is necessary because at least one constraint depends on a variable that was not yet tracked.
We define the last component of the CEGAR technique, the refinement, after we have introduced the notion of interpolation for variable assignments and constraint sequences.
C. Interpolation for Variable Assignments
For each infeasible error path in the above-mentioned refinement operation, we need to determine a precision that assigns to each program location on that path the set of program variables that the explicit-value analysis needs to track in order to eliminate that infeasible error path in future explorations. Therefore, we define an interpolant for abstract variable assignments.
An interpolant for a pair of abstract variable assignments $v^-$ and $v^+$, such that $v^- \wedge v^+$ is contradicting, is an abstract variable assignment $\mathcal{V}$ that fulfills the following requirements:
1) the implication $v^- \Rightarrow \mathcal{V}$ holds,
2) the conjunction $\mathcal{V} \wedge v^+$ is contradicting, and
3) $\mathcal{V}$ only contains variables in its definition range which are in the definition ranges of both $v^-$ and $v^+$ ($\mathrm{def}(\mathcal{V}) \subseteq \mathrm{def}(v^-) \cap \mathrm{def}(v^+)$).
Lemma. For a given pair $(v^-, v^+)$ of abstract variable assignments, such that $v^- \wedge v^+$ is contradicting, an interpolant exists. Such an interpolant can be computed in time $O(m + n)$, where $m$ and $n$ are the sizes of $v^-$ and $v^+$, respectively.
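Algorithm 3 Interpolate($\gamma^-$, $\gamma^+$)
Input: two constraint sequences $\gamma^-$ and $\gamma^+$,
    with $\gamma^- \wedge \gamma^+$ is contradicting
Output: a constraint sequence $\Gamma$,
    which is an interpolant for $\gamma^-$ and $\gamma^+$
Variables: an abstract variable assignment $v$
$v$ := $SP_{\gamma^-}(\emptyset)$
for each $x \in \mathrm{def}(v)$ do
    if $SP_{\gamma^+}(v_{|\mathrm{def}(v) \setminus \{x\}})$ is contradicting then
        // x is not relevant and should not occur in the interpolant
        $v$ := $v_{|\mathrm{def}(v) \setminus \{x\}}$
// construct the interpolating constraint sequence
$\Gamma$ := $\langle \rangle$
for each $x \in \mathrm{def}(v)$ do
    // construct an assume constraint for x
    $\Gamma$ := $\Gamma \wedge \langle [x = v(x)] \rangle$
return $\Gamma$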
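Proof. The variable assignment $v^-_{|\mathrm{def}(v^+)}$ is an interpolant for the pair $(v^-, v^+)$.
Note. The above-mentioned interpolant that simply results from restricting $v^-$ to the definition range of $v^+$ [...]
The conjunction $\gamma \wedge \gamma'$ of two constraint sequences $\gamma = \langle op_1, ..., op_n \rangle$ and $\gamma' = \langle op'_1, ..., op'_m \rangle$ is defined as their concatenation, i.e., $\gamma \wedge \gamma' = \langle op_1, ..., op_n, op'_1, ..., op'_m \rangle$, the implication of $\gamma$ and $\gamma'$ (denoted by $\gamma \Rightarrow \gamma'$) as $SP_\gamma(v_0) \Rightarrow SP_{\gamma'}(v_0)$, and $\gamma$ is contradicting if $[[SP_\gamma(v_0)]] = \emptyset$, with $v_0 = \emptyset$.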
An interpolant for a pair of constraint sequences $\gamma^-$ and $\gamma^+$, such that $\gamma^- \wedge \gamma^+$ is contradicting, is a constraint sequence $\Gamma$ that fulfills the following requirements:
1) the implication $\gamma^- \Rightarrow \Gamma$ holds,
2) the conjunction $\Gamma \wedge \gamma^+$ is contradicting, and
3) $\Gamma$ contains in its constraints only variables that occur in the constraints of both $\gamma^-$ and $\gamma^+$.
Lemma. For a given pair $(\gamma^-, \gamma^+)$ of constraint sequences, such that $\gamma^- \wedge \gamma^+$ is contradicting, an interpolant exists. Such an interpolant is computable in time $O(m \cdot n)$, where $m$ and $n$ are the sizes of $\gamma^-$ and $\gamma^+$, respectively.
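Proof. Algorithm Interpolate (Alg. 3) returns an interpolant $\Gamma$ for two constraint sequences $\gamma^-$ and $\gamma^+$. The algorithm starts with computing the strongest post-condition for $\gamma^-$ and assigns the result to the abstract variable assignment $v$. Next, it tries to eliminate each variable from $v$ by testing whether the strongest post-condition for $\gamma^+$ of the restricted assignment is still contradicting (each such test takes $n$ SP steps). If it is contradicting, the variable can be removed. If not, the variable is necessary to prove the contradiction of the two constraint sequences, and thus, should occur in the interpolant. Note that this keeps only variables in $v$ that occur in $\gamma^+$ as well. The rest of the algorithm constructs a constraint sequence from the variable assignment, in order to return an interpolating constraint sequence, which fulfills the three requirements of an interpolant. A naive implementation can compute such an interpolant in $O((m + n)^3)$.
Algorithm 4 Refine($\sigma$)
[...]
    $\gamma^+$ := $\langle op_{i+1}, ..., op_n \rangle$
    // inductive interpolation
    $\Gamma$ := Interpolate($\Gamma \wedge \langle op_i \rangle$, $\gamma^+$)
    // extract variables from variable assignment that results from $\Gamma$
    $\Pi(l_i)$ := $\{x \mid (x, z) \in SP_\Gamma(\emptyset) \text{ and } \bot \neq z \neq \top\}$
return $\Pi$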
E. Refinement Based on Explicit-Interpolation
The goal of our interpolation-based refinement for explicit-value analysis is to determine a localized precision that is strong enough to eliminate an infeasible error path in future explorations. This criterion is fulfilled by the property of interpolants. A second goal is to have a precision that is as weak as possible, by creating interpolants that have a definition range as small as possible, in order to be parsimonious in tracking variables and creating abstract states.
We apply the idea of interpolation for constraint sequences to assemble a precision-extraction algorithm: Algorithm Refine (Alg. 4) takes as input an infeasible program path, and returns a precision for a program. A further requirement is that the procedure computes inductive interpolants [6], i.e., each interpolant along the path contains just enough information to prove the remaining path infeasible. This is needed in order to ensure that the interpolants at the different locations achieve the goal of providing a precision that eliminates the infeasible error path from further explorations. For every program location $l_i$ along an infeasible error path $\sigma$, starting at $l_0$, we split the constraint sequence of the path into a constraint prefix
[Figure 2: the example program declares int a, b, c, executes a = 0; b = a; c = a;, and branches on [a == -1] / [a != -1] and [a == 0] / [a != 0]; N9 is the error location.]
Fig. 2: Illustration of one refinement iteration; from left to right: a simple example CFA, an infeasible error path with the abstract states annotated in the nodes (precision was empty, nothing is tracked), the interpolated variable assignments annotated in the nodes, the precisions extracted from the interpolants annotated in the nodes, and finally the CFA with the abstract states annotated in the nodes according to the new precision (unreached nodes, including error, shown in gray)
[...] of the CEGAR algorithm (cf. Alg. 2). Note that the repetitive interpolations are not an efficiency bottleneck. The path is always finite, without any loops or branching, and thus, even a full-precision check can be decided efficiently. Figure 2 illustrates the interpolation process on a simple example.
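The interplay of Interpolate (Alg. 3) and the inductive refinement loop can be sketched in Python on the error path of Fig. 2. Several parts of this sketch are assumptions rather than the authors' implementation: assume operations are restricted to `[x == c]` and `[x != c]` with an integer constant `c`, the loop bounds and the initialization of `refine` are guesses, the location names `l1` to `l4` are hypothetical, and the constraint prefix passed to `interpolate` is assumed to be feasible on its own.

```python
# Sketch only: encoding, restricted assumes, loop bounds, and names are assumptions.
TOP, BOT = "TOP", "BOT"


def sp(gamma, v):
    """Strongest post of a constraint sequence for the assignment v."""
    for op in gamma:
        if op[0] == "assign":                    # ("assign", s, exp_vars, exp_fn)
            _, s, exp_vars, exp_fn = op
            if BOT in v.values():
                value = BOT
            elif any(v.get(y, TOP) == TOP for y in exp_vars):
                value = TOP
            else:
                value = exp_fn(*(v[y] for y in exp_vars))
            v = {**{x: val for x, val in v.items() if x != s}, s: value}
        else:                                    # ("assume", x, "==" or "!=", c)
            _, x, rel, c = op
            known = v.get(x, TOP)
            violated = known != TOP and (known == c) != (rel == "==")
            if BOT in v.values() or violated:
                v = {y: BOT for y in set(v) | {x}}
            elif rel == "==":
                v = {**v, x: c}
    return v


def interpolate(gamma_minus, gamma_plus):
    """Drop every variable that is not needed to keep the suffix contradicting."""
    v = sp(gamma_minus, {})
    for x in sorted(v):
        candidate = {y: val for y, val in v.items() if y != x}
        if BOT in sp(gamma_plus, candidate).values():
            v = candidate                        # x is not relevant
    return [("assume", x, "==", v[x]) for x in sorted(v)]


def refine(sigma):
    """Per-location precision from inductive interpolants along the path."""
    ops = [op for op, _ in sigma]
    precision, gamma = {}, []
    for i in range(len(sigma) - 1):
        gamma = interpolate(gamma + [ops[i]], ops[i + 1:])
        values = sp(gamma, {})
        precision[sigma[i][1]] = {x for x, z in values.items() if z not in (TOP, BOT)}
    return precision


sigma = [
    (("assign", "a", (), lambda: 0), "l1"),
    (("assign", "b", ("a",), lambda a: a), "l2"),
    (("assign", "c", ("a",), lambda a: a), "l3"),
    (("assume", "a", "==", -1), "l4"),
]
ops = [op for op, _ in sigma]
print(interpolate(ops[:3], ops[3:]))
print(refine(sigma))
```

On this path only variable `a` survives in every interpolant, so `b` and `c` never enter the precision, which matches the precision {a} shown in the figure.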
F. Optimizations
In our implementation, we added several optimizations to improve the performance of our approach.
ARG Pruning instead of Restart. Our refinement routine Refine (cf. Alg. 4) returns a set of variables (precision) that are important for deciding the reachability of the error location. One of the ideas of lazy abstraction refinement [22] is that the precision is only refined where necessary, i.e., only at the locations along the path that was considered in the refinement; the other parts of the state space are not refined. As mentioned in the discussion of the CEGAR algorithm (cf. Alg. 2), it is not necessary to restart the exploration of the state space from scratch after a refinement. Instead, we identify the descendant closest to the root of the abstract reachability graph (ARG) in which the precision was refined, and the re-exploration of the state space continues from there. In total, this significantly reduces the number of tracked variables per abstract state, which in turn leads to a more efficient analysis, because it drastically increases the chance that a new abstract state is covered by an existing abstract state.
Scoped Precision Refinement. The precision for a program assigns to each program location the set of variables that need to be tracked at that location, and the interpolation-based refinement adds new variables precisely at the locations for which they were discovered during refinement. In our experience, the number of refinements is reduced significantly if we add a variable to the precision not only at the particular location for which it was discovered, but at all locations in the local scope of the variable. This helps to avoid adding a variable twice that can occur on two different branches. By adding the variable to the precision in advance in the local scope, we save some refinement iterations. For example, consider Fig. 2 again. After the illustrated refinement, another refinement step would be necessary, in order to discover that variable a needs to be tracked at location N4 as well (to prevent the analysis from going through location N6). By adding variable a to the precision of all locations in the scope of variable a immediately after the first refinement, the program can be proved safe without further refinement. This effect was also observed, and used, in the software model checker BLAST [6].
Precise Counterexample Check. In order to further increase the precision of our analysis, we double-check all feasible error paths using bit-precise bounded model checking (BMC)⁵, by generating a path program [7] for the error path and letting the BMC confirm the bug. Since the generated path program does not contain any loop or branching, it can be verified efficiently. If both our analysis and the bit-precise BMC report unsafe, then we report a bug. If the BMC cannot confirm the bug, our analysis continues trying to find another error path. This additional feature is available as a command-line option in our implementation.
Auxiliary Predicate Analysis. As an additional option for further improvement of the analysis, we implemented the combination with a predicate analysis, as outlined in existing work [9]. In this combination, if the explicit-value analysis finds an error path, this path is first checked for satisfiability in the predicate domain. If the satisfiability check is positive, the result unsafe can be reported and the error path is returned; if negative, then the explicit-value domain is not expressive enough to analyze that program path (e.g., due to inequalities). In this case, we ask the predicate analysis to refine its abstraction along that path, which yields a refined predicate precision that eliminates the error path while considering the facts along that path in the (more precise, and more expensive) predicate domain. We need to use this feature parsimoniously because the post-operations of the predicate analysis are much more expensive than the post-operations of the explicit-value analysis. In general, after a refinement step, either the explicit-value precision is refined (preferred) or the predicate precision is refined (only if the explicit-value refinement does not succeed).
5 In our implementation, we use CBMC [15] as bounded model checker.
Using the concept of dynamic precision adjustment [9], we also switch off the tracking of variables in the explicit-value domain if the number of different values on a path exceeds a certain threshold. After this, the predicate analysis will get switched on (by the above-mentioned mechanism) and the facts on that path are further tracked using predicates. This is important if the explicit-value analysis tries to unwind loops; the symbolic, predicate-based analysis can often store a large number of values more efficiently.
Note that this refinement-based, parallel composition with precision adjustment of the explicit-value analysis and the predicate analysis is more powerful than a mere parallel product of the two analyses, because after each refinement, the explicit part of the analysis tracks exactly what it is capable of tracking, while the auxiliary predicate analysis takes care of only those facts that are beyond the capabilities of the explicit domain, resulting in a lightweight analysis on both ends. Such a combination is easy to achieve in our implementation, because we use the framework of configurable program analysis (CPA), which lets the user freely configure such combinations.
IV. Experiments
In order to demonstrate that our approach yields a significant
practical improvement of verification efficiency and effectiveness,
we implemented our algorithms and compared our
new techniques to existing tools for software verification. In
the following, we show that the application of abstraction,
CEGAR, and interpolation to the explicit-value domain considerably
improves the number of solved instances and the run
time. Combinations of the new explicit-value analysis with
a predicate-based analysis can further increase the number
of solved instances. All our experiments were performed on
hardware identical to that of the SV-COMP'12 [5], such that
our results are comparable to all the results obtained there.
Compared Verification Approaches. For presentation, we restrict
the comparison of our new approach to the SV-COMP'12
participants BLAST, SATABS, and the competition winner CPA-MEMO,
all of which are based on predicate abstraction and
CEGAR. Furthermore, to investigate performance differences
in the same tool environment, we also compare with different
configurations of CPACHECKER. The model checker BLAST is
based on predicate abstraction, and uses a CEGAR loop for
abstraction refinement. The predicates for the precision are
learned from counterexample paths using interpolation. The
central data structure of the algorithm is an ARG, which
is lazily constructed and refined. BLAST won the category
DeviceDrivers64 in the SV-COMP'12, and got bronze in
another category. The model checker SATABS is also based on
predicate abstraction and CEGAR, but in contrast to BLAST, it
constructs and checks in every iteration of the CEGAR loop a
new boolean program based on the current precision of the
predicate abstraction, and does not use lazy abstraction or
interpolation. SATABS got silver in the categories SystemC
and Concurrency, and bronze in another category. The
model checker CPA-MEMO is based on predicate abstraction,
CEGAR, and interpolation, but extends it with the concepts of
adjustable-block encoding [11] and block-abstraction memoization
[26]. CPA-MEMO won the category Overall, got silver
in two more categories, and bronze in another category.
We implemented our concepts as extensions of
CPACHECKER [10], a software-verification framework
based on configurable program analysis (CPA). We compare
with the existing explicit-value analysis (without abstraction,
CEGAR, and interpolation) and with the existing predicate
analysis that is based on boolean predicate abstraction,
CEGAR, interpolation, and adjustable-block encoding [11].
We used the trunk version of CPACHECKER^6 in revision 6615.
Verification Tasks. For the evaluation of our approach,
we use all SV-COMP'12^7 verification tasks that do
not involve concurrency properties (all categories except
category Concurrency). All obtained experimental
data as well as the tool implementation are available at
http://www.sosy-lab.org/dbeyer/cpa-explicit.
Quality Measures. We compare the verification results of
all verification approaches based on three measures for
verification quality: First, we take the run time, in seconds, of
the verification runs to measure the efficiency of an approach.
Obviously, the lower the run time, the better the tool. Second,
we use the number of correctly solved instances of verification
tasks to measure the effectiveness of an approach. The more
instances a tool can solve, the more powerful the analysis is.
Third, and most importantly, we use the scoring schema of the
SV-COMP'12 as indicator for the quality of an approach. The
scoring schema implements a community-agreed weighting
schema, namely, that it is more difficult to prove a program
correct than to find a bug and that a wrong answer
should be penalized with double the score that a correct
answer would have achieved. For a full discussion of the
official rules and benchmarks of the SV-COMP'12, we refer to
the competition report [5]. Besides the data tables, we use plots
of quantile functions [5] for visualizing the number of solved
instances and the verification time. The quantile function for
one approach contains all pairs (x, y) such that the maximum
run time of the x fastest results is y. We use a logarithmic
scale for the time range from 1 s to 1000 s and a linear scale
for the time range between 0 s and 1 s. In addition, we decorate
the graphs with symbols at every fifth data point in order to
make the graphs distinguishable on gray-scale prints.
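The third measure and the quantile function can be sketched in a few lines. The concrete point values in POINTS are an assumption chosen to be consistent with the description above (a correctness proof is worth more than a found bug, and a wrong answer costs double); the official values are given in the competition report [5]. The function names score and quantile_points are introduced here for illustration only.

```python
from typing import List, Optional, Tuple

# Assumed point values, chosen to match the description in the text:
# proving correctness earns more than finding a bug, and a wrong
# answer costs twice what the correct answer would have earned.
POINTS = {
    ("safe", True): 2,
    ("unsafe", True): 1,
    ("safe", False): -4,
    ("unsafe", False): -2,
}


def score(results: List[Tuple[Optional[str], bool]]) -> int:
    """Sum the points; each result is (answer, is_correct), and an
    answer of None (unknown or timeout) contributes nothing."""
    return sum(POINTS[(a, ok)] for a, ok in results if a is not None)


def quantile_points(run_times: List[float]) -> List[Tuple[int, float]]:
    """Pairs (x, y): the maximum run time of the x fastest results is y."""
    ordered = sorted(run_times)
    return list(enumerate(ordered, start=1))
```

Because the run times are sorted, the x-th smallest value is exactly the maximum over the x fastest results, so no further aggregation is needed.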
Improvements of Explicit-Value Analysis. In the first evaluation,
we compare two different configurations of the explicit-value
analysis: CPA-EXPL refers to the existing implementation
of a standard explicit-value analysis without abstraction and
^6 http://cpachecker.sosy-lab.org
^7 http://sv-comp.sosy-lab.org/2012
Category         CPA-EXPL                  CPA-EXPLitp
                 points  solved  time (s)  points  solved  time (s)
ControlFlowInt   124     81      8400      123     79      780
DeviceDrivers    53      37      63        53      37      69
DeviceDrivers64  5       5       660       33      19      200
HeapManipul      1       3       5.5       1       3       5.8
SystemC          34      26      1600      34      26      1500
Overall          217     152     11000     244     164     2500
TABLE I: Comparison with purely explicit, non-CEGAR approach
[Plot: x-axis: n-th fastest result (0 to 200); y-axis: time in s (1 to 1000, logarithmic); curves: CPA-EXPL, CPA-EXPLitp]
Fig. 3: Quantile plot: purely explicit analyses
refinement, and CPA-EXPLitp refers to the new approach, which
implements abstraction, CEGAR, and interpolation. Table I
and Fig. 3 show that the new approach uses less time, solves
more instances, and obtains more points in the SV-COMP'12
scoring schema.
Improvements of Combination with Predicate Analysis.
In the second evaluation, we compare the refinement-based
explicit analysis against a standard predicate analysis, as well
as against the predicate analysis combined with CPA-EXPL and
CPA-EXPLitp, respectively: CPA-PRED refers to a standard predicate
analysis that CPACHECKER offers (ABE-lf, [11]), CPA-EXPLitp
refers again to the explicit-value analysis, which implements
abstraction, CEGAR, and interpolation, CPA-EXPL-PRED refers
to the combination of predicate analysis and explicit-value
analysis without refinement, and CPA-EXPLitp-PRED refers to the
combination of predicate analysis and explicit-value analysis
with refinement.
Table II and Fig. 4 show that the new combination approach
outperforms the existing approaches CPA-PRED and CPA-EXPLitp
in terms of solved instances and score. The comparison with
column CPA-EXPL-PRED is interesting because it shows that the
combination of two analyses is an improvement even without
refinement in the explicit-value analysis, but switching on
the refinement in both domains makes the new combination
significantly more effective.
Comparison with State-of-the-Art Verifiers. In the third
evaluation, we compare our new combination approach with
three established tools: BLAST refers to the standard BLAST
configuration that participated in the SV-COMP'12, SATABS
also refers to the respective standard configuration, CPA-MEMO
refers to a special predicate abstraction that is based on
block-abstraction memoization, and CPA-EXPLitp-PRED refers to our
novel approach, which combines a predicate analysis (CPA-PRED)
with the new explicit-value analysis that is based on
[Plot: x-axis: n-th fastest result (0 to 200); y-axis: time in s (1 to 1000, logarithmic); curves: CPA-PRED, CPA-EXPLitp, CPA-EXPL-PRED, CPA-EXPLitp-PRED]
Fig. 4: Quantile plot: comparison with predicate-based configurations
[Plot: x-axis: n-th fastest result (0 to 200); y-axis: time in s (1 to 1000, logarithmic); curves: BLAST, SATABS, CPA-MEMO, CPA-EXPLitp-PRED]
Fig. 5: Quantile plot: comparison with three existing tools
abstraction, CEGAR, and interpolation (CPA-EXPLitp). Table III
and Fig. 5 show that the new approach outperforms BLAST
and SATABS by consuming considerably less verification time,
solving more instances, and obtaining a better score. Even compared
to the SV-COMP'12 winner, CPA-MEMO, our new approach
scores higher. It is interesting to observe that the difference in
scores is much higher than the difference in solved instances:
this means CPA-MEMO had many incorrect verification results,
which in turn shows that our new combination is significantly
more precise.
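The gap between score difference and solved-instance difference can be read off the Overall row of Table III. The following lines merely redo that arithmetic; the variable names are illustrative.

```python
# Overall row of Table III: (score, solved instances).
cpa_memo = (280, 209)
cpa_explitp_pred = (318, 211)

score_gap = cpa_explitp_pred[0] - cpa_memo[0]    # difference in score
solved_gap = cpa_explitp_pred[1] - cpa_memo[1]   # difference in solved instances
print(score_gap, solved_gap)  # prints "38 2"
```

A score lead of 38 with only 2 additional solved instances cannot come from the extra instances alone, which is why the difference is attributed to penalties for wrong answers.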
V. Conclusion
The surprising insight of this work is that it is possible
to achieve, without using sophisticated SMT-solvers during
the abstraction refinement, a performance and precision
that can compete with the world's leading symbolic model
checkers, which are based on SMT-based predicate abstraction.
We achieved this by incorporating the ideas of abstraction,
Category         CPA-PRED                 CPA-EXPLitp              CPA-EXPL-PRED            CPA-EXPLitp-PRED
                 score  solved  time (s)  score  solved  time (s)  score  solved  time (s)  score  solved  time (s)
ControlFlowInt   103    70      2500      123    79      780       131    85      2600      141    91      830
DeviceDrivers    71     46      80        53     37      69        71     46      82        71     46      87
DeviceDrivers64  33     24      2700      33     19      200       10     11      1100      37     24      980
HeapManipul      8      6       12        1      3       5.8       6      5       11        8      6       12
SystemC          22     17      1900      34     26      1500      62     45      1500      61     44      3700
Overall          237    163     7100      244    164     2500      280    192     5300      318    211     5600
TABLE II: Comparison with predicate-based configurations
Category         BLAST                    SATABS                   CPA-MEMO                 CPA-EXPLitp-PRED
                 score  solved  time (s)  score  solved  time (s)  score  solved  time (s)  score  solved  time (s)
ControlFlowInt   71     51      9900      75     47      5400      140    91      3200      141    91      830
DeviceDrivers    72     51      30        71     43      140       51     46      93        71     46      87
DeviceDrivers64  55     33      1400      32     17      3200      49     33      500       37     24      980
HeapManipul      -      -       -         -      -       -         4      9       16        8      6       12
SystemC          33     23      4000      57     40      5000      36     30      450       61     44      3700
Overall          231    158     15000     235    147     14000     280    209     4300      318    211     5600
TABLE III: Comparison with three existing tools
counterexample-guided abstraction refinement, lazy abstraction
refinement, and interpolation into a standard, simple
explicit-value analysis.
We further improved the performance and precision by
combining our refinement-based explicit-value analysis with
a predicate analysis, in order to benefit from the complementary
advantages of the methods. The combination analysis
dynamically adjusts the precision [9] for an optimal trade-off
between the precision of the explicit analysis and the
precision of the auxiliary predicate analysis. This combination
out-performs state-of-the-art model checkers, witnessed by a
thorough comparison on a standardized set of benchmarks.
Despite the overall success of our new approach, individual
instances of benchmarks show different performance with
different configurations, i.e., either with or without CEGAR.
Therefore, a general heuristic for finding a suitable strategy for
a single verification task would be beneficial. Also, we envision
better support for pointers and data structures, because
our interpolation approach can be efficiently applied even
with high precision. Moreover, we so far only combined our
interpolation approach with an auxiliary predicate analysis in
the ABE-lf configuration, and we have not yet tried to combine
this with the superior block-abstraction memoization (ABM)
[26] technique. Finally, we plan to extend our interpolation
approach to other abstract domains like intervals.
References
[1] A. V. Aho, R. Sethi, and J. D. Ullman. Compilers: Principles,
Techniques, and Tools. Addison-Wesley, 1986.
[2] A. Albarghouthi, A. Gurfinkel, and M. Chechik. Craig interpretation.
In Proc. SAS, pages 300-316, 2012.
[3] T. Ball, A. Podelski, and S. K. Rajamani. Boolean and cartesian
abstractions for model checking C programs. In Proc. TACAS, LNCS 2031,
pages 268-283. Springer, 2001.
[4] T. Ball and S. K. Rajamani. The SLAM project: Debugging system
software via static analysis. In Proc. POPL, pages 1-3. ACM, 2002.
[5] D. Beyer. Competition on Software Verification (SV-COMP). In Proc.
TACAS, LNCS 7214, pages 504-524. Springer, 2012.
[6] D. Beyer, T. A. Henzinger, R. Jhala, and R. Majumdar. The software
model checker BLAST. Int. J. Softw. Tools Technol. Transfer,
9(5-6):505-525, 2007.
[7] D. Beyer, T. A. Henzinger, R. Majumdar, and A. Rybalchenko. Path
programs. In Proc. PLDI, pages 300-309. ACM, 2007.
[8] D. Beyer, T. A. Henzinger, and G. Théoduloz. Configurable software
verification: Concretizing the convergence of model checking and
program analysis. In Proc. CAV, LNCS 4590, pages 504-518. Springer,
2007.
[9] D. Beyer, T. A. Henzinger, and G. Théoduloz. Program analysis with
dynamic precision adjustment. In Proc. ASE, pages 29-38. IEEE, 2008.
[10] D. Beyer and M. E. Keremoglu. CPACHECKER: A tool for configurable
software verification. In Proc. CAV, LNCS 6806, pages 184-190.
Springer, 2011.
[11] D. Beyer, M. E. Keremoglu, and P. Wendler. Predicate abstraction with
adjustable-block encoding. In Proc. FMCAD, pages 189-197. FMCAD,
2010.
[12] D. Beyer and P. Wendler. Algorithms for software model checking:
Predicate abstraction vs. IMPACT. In Proc. FMCAD, pages 106-113.
FMCAD, 2012.
[13] S. Chaki, E. M. Clarke, A. Groce, S. Jha, and H. Veith. Modular
verification of software components in C. IEEE Trans. Softw. Eng.,
30(6):388-402, 2004.
[14] E. M. Clarke, O. Grumberg, S. Jha, Y. Lu, and H. Veith.
Counterexample-guided abstraction refinement for symbolic model
checking. J. ACM, 50(5):752-794, 2003.
[15] E. M. Clarke, D. Kröning, and F. Lerda. A tool for checking
ANSI-C programs. In Proc. TACAS, LNCS 2988, pages 168-176. Springer,
2004.
[16] E. M. Clarke, D. Kröning, N. Sharygina, and K. Yorav. SATABS:
SAT-based predicate abstraction for ANSI-C. In Proc. TACAS, LNCS 3440,
pages 570-574. Springer, 2005.
[17] W. Craig. Linear reasoning. A new form of the Herbrand-Gentzen
theorem. J. Symb. Log., 22(3):250-268, 1957.
[18] S. Graf and H. Saïdi. Construction of abstract state graphs with PVS.
In Proc. CAV, LNCS 1254, pages 72-83. Springer, 1997.
[19] B. S. Gulavani, S. Chakraborty, A. V. Nori, and S. K. Rajamani.
Automatically refining abstract interpretations. In Proc. TACAS, LNCS 4963,
pages 443-458. Springer, 2008.
[20] K. Havelund and T. Pressburger. Model checking Java programs using
JAVA PATHFINDER. Int. J. Softw. Tools Technol. Transfer, 2(4):366-381,
2000.
[21] T. A. Henzinger, R. Jhala, R. Majumdar, and K. L. McMillan.
Abstractions from proofs. In Proc. POPL, pages 232-244. ACM, 2004.
[22] T. A. Henzinger, R. Jhala, R. Majumdar, and G. Sutre. Lazy abstraction.
In Proc. POPL, pages 58-70. ACM, 2002.
[23] G. J. Holzmann. The SPIN model checker. IEEE Trans. Softw. Eng.,
23(5):279-295, 1997.
[24] C. S. Pasareanu, M. B. Dwyer, and W. Visser. Finding feasible
counter-examples when model checking abstracted Java programs. In Proc.
TACAS, LNCS 2031, pages 284-298. Springer, 2001.
[25] A. Podelski and A. Rybalchenko. ARMC: The logical choice for
software model checking with abstraction refinement. In Proc. PADL,
LNCS 4354, pages 245-259. Springer, 2007.
[26] D. Wonisch. Block abstraction memoization for CPACHECKER. In Proc.
TACAS, LNCS 7214, pages 531-533. Springer, 2012.