Introduction: Software Testing and Quality Assurance
Software Testing, Quality Assurance, and Maintenance

Winter 2018

Prof. Arie Gurfinkel


Software is Everywhere


“Software easily rates as the most poorly constructed, unreliable, and least maintainable technological artifacts invented by man”
Paul Strassmann, former CIO of Xerox

Infamous Software Disasters

Between 1985 and 1987, Therac-25 gave patients massive overdoses of radiation, approximately 100 times the intended dose. Three patients died as a direct consequence.

On February 25, 1991, during the Gulf War, an American Patriot Missile battery in Dhahran, Saudi Arabia, failed to track and intercept an incoming Iraqi Scud missile. The Scud struck an American Army barracks, killing 28 soldiers and injuring around 100 other people.

On June 4, 1996, an unmanned Ariane 5 rocket launched by the European Space Agency exploded forty seconds after lift-off. The rocket was on its first voyage, after a decade of development costing $7 billion. The destroyed rocket and its cargo were valued at $500 million.

http://www5.in.tum.de/~huckle/bugse.html

http://envisage-project.eu/proving-android-java-and-python-sorting-algorithm-is-broken-and-how-to-fix-it/

Why so many bugs?

Software Engineering is very complex


• Complicated algorithms
• Many interconnected components
• Legacy systems
• Huge programming APIs
• …

Software Engineers need better tools to deal with this complexity!

What Software Engineers Need Are …

Tools that give better confidence than ad-hoc testing while remaining
easy to use

And that, at the same time,
• … are fully automatic
• … are (reasonably) easy to use
• … provide (measurable) guarantees
• … come with guidelines and methodologies to apply them effectively
• … apply to real software systems

Testing

Software validation the “old-fashioned” way:
• Create a test suite (set of test cases)
• Run the test suite
• Fix the software if the test suite fails
• Ship the software if the test suite passes
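
The workflow above can be sketched with Python's unittest module. This is a minimal illustration, not course material: the `factorial` function under test and the three test cases are assumptions chosen for the example.

```python
import unittest


def factorial(n):
    """Iterative factorial (hypothetical function under test)."""
    if n < 0:
        raise ValueError("n must be non-negative")
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result


class FactorialTests(unittest.TestCase):
    """A test suite is a set of test cases; each method below is one case."""

    def test_base_cases(self):
        self.assertEqual(factorial(0), 1)
        self.assertEqual(factorial(1), 1)

    def test_small_value(self):
        self.assertEqual(factorial(5), 120)

    def test_negative_input_is_rejected(self):
        with self.assertRaises(ValueError):
            factorial(-1)
```

Saved as a file, the suite can be run with `python -m unittest <file>`. A passing run only says that these three cases behave as expected; it says nothing about inputs the suite never tries.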

“Program testing can be a very effective way to show the presence of bugs, but is hopelessly inadequate for showing their absence.”
Edsger W. Dijkstra

Very hard to test the portion inside the “if” statement!

input x
if (hash(x) == 10) {
...
}
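
A rough Python sketch of why such a branch is hard to reach by random testing. The guard here is an assumption standing in for `hash(x) == 10`: it requires the SHA-256 digest of the input to start with three fixed bytes, so a random input passes it with probability 1 in 2^24.

```python
import hashlib
import random


def guarded_branch_taken(x: bytes) -> bool:
    # Stand-in for `hash(x) == 10` (assumption): the first three digest
    # bytes must match a fixed pattern.
    return hashlib.sha256(x).digest()[:3] == b"\x00\x00\x0a"


def random_test(trials: int, seed: int = 0) -> int:
    """Feed `trials` random 8-byte inputs and count how often the branch is hit."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x = rng.getrandbits(64).to_bytes(8, "big")
        if guarded_branch_taken(x):
            hits += 1
    return hits
```

With a few thousand trials the expected number of hits is far below one, so any bug inside the guarded block would almost certainly go unnoticed by this kind of testing.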

“Beware of bugs in the above code; I have only proved it correct, not tried it.”
Donald Knuth

You can only verify what you have specified.

Testing is still important, but can we make it less impromptu?

Verification / Quality Assurance

Verification: formally prove that a computing system satisfies its specifications
• Rigor: well established mathematical foundations
• Exhaustiveness: considers all possible behaviors of the system, i.e., finds all errors
• Automation: uses computers to build reliable computers

Formal Methods: general area of research related to program specification and verification

Ultimate Goal: Static Program Analysis

[Diagram: a Program and a Specification are fed to Automated Analysis, which answers Correct or Incorrect]

Reasoning statically about the behavior of a program without executing it
• compile-time analysis
• exhaustive, considers all possible executions under all possible environments and inputs

“The algorithmic discovery of properties of a program by inspection of the source text”
Manna and Pnueli

Also known as static analysis, program verification, formal methods, etc.


Turing, 1936: “undecidable”

Undecidability

A problem is undecidable if there does not exist a Turing machine that can solve it
• i.e., not solvable by a computer program

The halting problem
• does a program P terminate on input I?
• proved undecidable by Alan Turing in 1936
• https://en.wikipedia.org/wiki/Halting_problem

Rice’s Theorem
• for any non-trivial property of partial functions, no general and effective method can decide whether an algorithm computes a partial function with that property
• in practice, this means that there is no machine that can always decide whether the language of a given Turing machine has a particular nontrivial property
• https://en.wikipedia.org/wiki/Rice%27s_theorem
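
The idea behind the halting-problem proof can be sketched in Python. This is only an illustration of the diagonal argument: `make_contrarian` and `always_says_loops` are hypothetical names, and the candidate decider is a deliberately naive one.

```python
def make_contrarian(halts):
    """Given any claimed halting decider, build a program that does the
    opposite of whatever the decider predicts about that program."""
    def contrarian():
        if halts(contrarian):
            while True:      # decider said "halts", so loop forever
                pass
        return "halted"      # decider said "loops", so halt at once
    return contrarian


def always_says_loops(program):
    """A naive candidate decider (assumption) that always answers 'does not halt'."""
    return False


program = make_contrarian(always_says_loops)
outcome = program()   # terminates, contradicting the decider's answer
```

A decider that answered True for `contrarian` would be wrong in the other direction, since the program would then loop forever (which is why that case is not executed here). Whatever the decider answers, it is wrong on this one program.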
LEGO Turing Machine

BEGIN:
READ
CJUMP0 CASE_0
CASE_1:
WRITE 0
MOVE R
JUMP BEGIN
CASE_0:
WRITE 1
MOVE R
JUMP BEGIN

by Soonho Kong. See http://www.cs.cmu.edu/~soonhok for building instructions.
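
To see what the program above does, here is a small Python interpreter sketch for it. The instruction semantics are assumptions inferred from the names: `CJUMP0` is taken to jump when the last `READ` returned 0, and the run is assumed to stop when the head leaves a finite tape.

```python
PROGRAM = """
BEGIN:
READ
CJUMP0 CASE_0
CASE_1:
WRITE 0
MOVE R
JUMP BEGIN
CASE_0:
WRITE 1
MOVE R
JUMP BEGIN
"""


def run(program: str, tape: list) -> list:
    """Interpret the program on a copy of `tape`, starting at cell 0."""
    labels, code = {}, []
    for line in program.strip().splitlines():
        line = line.strip()
        if line.endswith(":"):
            labels[line[:-1]] = len(code)   # label points at next instruction
        elif line:
            code.append(line.split())
    tape = list(tape)
    head, pc, last_read = 0, 0, None
    # Assumed halting rule: stop once the head moves off the finite tape.
    while pc < len(code) and 0 <= head < len(tape):
        op, *args = code[pc]
        pc += 1
        if op == "READ":
            last_read = tape[head]
        elif op == "WRITE":
            tape[head] = int(args[0])
        elif op == "MOVE":
            head += 1 if args[0] == "R" else -1
        elif op == "JUMP":
            pc = labels[args[0]]
        elif op == "CJUMP0":
            if last_read == 0:
                pc = labels[args[0]]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return tape
```

Under these assumptions the program inverts every bit it passes over, so `run(PROGRAM, [1, 0, 1])` returns `[0, 1, 0]`.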

Living with Undecidability

“Algorithms” that occasionally diverge

Limit programs that can be analyzed
• finite-state, loop-free

Partial (unsound) verification (Testing, Symbolic Execution)
• analyze only some executions up to a fixed number of steps

Incomplete verification / Abstraction (Automated Verification)
• analyze a superset of program executions

Programmer Assistance (Deductive Verification)
• annotations, pre- and post-conditions, inductive invariants
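
The difference between analyzing only some executions and analyzing a superset of them can be sketched on a toy program. Everything here is an assumption made for illustration: the program is `x := 0; loop { x := x + 2 }`, the property is "x is never 7", and the abstraction tracks only the parity of x.

```python
def bounded_check(max_steps: int):
    """Under-approximation: run the concrete loop for at most `max_steps`
    iterations and report the step at which x == 7, if any."""
    x = 0
    for step in range(max_steps):
        if x == 7:
            return step
        x += 2
    return None   # nothing found within the bound; no claim beyond it


def parity(n: int) -> str:
    return "even" if n % 2 == 0 else "odd"


def abstract_add2(p: str) -> str:
    # Abstract effect of `x := x + 2`: adding 2 never changes parity.
    return p


def abstract_reachable() -> set:
    """Over-approximation: every parity the program could ever reach."""
    seen, frontier = set(), [parity(0)]
    while frontier:
        p = frontier.pop()
        if p not in seen:
            seen.add(p)
            frontier.append(abstract_add2(p))
    return seen
```

`bounded_check(100)` finds no violation but proves nothing about step 101. The abstract analysis reaches only `"even"`, and since 7 is odd it rules out the violation for every number of steps. The price of abstraction is that it can also raise false alarms: the same parity analysis could not show that x is never 8.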
(User) Effort vs (Verification) Assurance

[Chart: Assurance/Coverage (vertical axis) versus Effort (horizontal axis), showing, from lowest to highest assurance: Testing, Symbolic Execution, Automated Verification, Deductive Verification]
Formal Software Analysis
J. McCarthy, “A basis for a mathematical theory of computation”, 1963.

P. Naur, “Proof of algorithms by general snapshots”, 1966.

R. W. Floyd, “Assigning meanings to programs”, 1967.

C. A. R. Hoare, “An axiomatic basis for computer programming”, 1969.

E. W. Dijkstra, “Guarded commands, nondeterminacy and formal derivation of programs”, 1975.
But Turing was already thinking about
program verification in 1949!

Alan M. Turing. “Checking a large routine”, 1949

method factorial (n: int) returns (v:int)
{
  v := 1;
  if (n == 1) { return v; }
  var i := 2;
  while (i <= n)
  {
    v := i * v;
    i := i + 1;
  }
  return v;
}
The method annotated with a precondition, a postcondition, and loop invariants:
method factorial (n: int) returns (v:int)
  requires n >= 0;
  ensures v == fact(n);
{
  v := 1;
  if (n <= 1) { return v; }
  var i := 2;
  while (i <= n)
    invariant i <= n+1
    invariant v == fact(i-1)
  {
    v := i * v;
    i := i + 1;
  }
  return v;
}
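
For intuition, the annotated method can be mirrored in Python with the specification turned into runtime assertions. This is a sketch under one assumption: `math.factorial` plays the role of the specification function `fact`. Unlike a verifier, these assertions only check the executions that actually run.

```python
import math


def factorial(n: int) -> int:
    assert n >= 0, "requires n >= 0"
    v = 1
    if n <= 1:
        assert v == math.factorial(n), "ensures v == fact(n)"
        return v
    i = 2
    while i <= n:
        # Both invariants must hold at the start of every iteration.
        assert i <= n + 1, "invariant i <= n+1"
        assert v == math.factorial(i - 1), "invariant v == fact(i-1)"
        v = i * v
        i = i + 1
    # On exit the invariants still hold and the loop guard is false,
    # so i == n+1 and therefore v == fact(n).
    assert i <= n + 1 and v == math.factorial(i - 1)
    assert v == math.factorial(n), "ensures v == fact(n)"
    return v
```

The exit comment shows why the invariants were chosen: `i <= n+1` together with the negated guard `i > n` pins down `i == n+1`, which turns `v == fact(i-1)` into the postcondition.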
Proving inductive invariants

The main step is to show that the invariant is preserved by one execution of the loop

assume(i <= n+1);
assume(v == fact(i-1));
assume(i <= n);
v := i*v;
i := i+1;
assert(i<=n+1);
assert(v == fact(i-1));

Correctness of a loop-free program can (often) be decided by a Theorem Prover or a Satisfiability Modulo Theories (SMT) solver.

Proving inductive invariants

The main step is to show that the invariant is preserved by one execution of the loop

(i0 <= n0+1) &&          assume(i <= n+1);
(v0 == (i0-1)!) &&       assume(v == fact(i-1));
(i0 <= n0) &&            assume(i <= n);
(v1 == i0 * v0) &&       v := i*v;
(i1 == i0 + 1)           i := i+1;
⇒
((i1 <= n0+1) &&         assert(i<=n+1);
(v1 == (i1-1)!))         assert(v == fact(i-1));

Correctness of a loop-free program can (often) be decided by a Theorem Prover or a Satisfiability Modulo Theories (SMT) solver.
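
The preservation step can be spot-checked by brute force in Python. This is a bounded check over small integers, not a proof: an SMT solver would reason about all integer values at once. The function names and the bound are assumptions for the sketch, and `math.factorial` stands in for `fact`.

```python
import math


def preservation_holds(i0: int, n0: int) -> bool:
    """Check the loop-body step for one concrete pair (i0, n0).
    v0 is determined by the assumed invariant v == fact(i-1)."""
    v0 = math.factorial(i0 - 1)
    if not (i0 <= n0 + 1 and i0 <= n0):
        return True                 # assumptions do not hold: vacuously true
    v1 = i0 * v0                    # v := i*v
    i1 = i0 + 1                     # i := i+1
    return i1 <= n0 + 1 and v1 == math.factorial(i1 - 1)


def check_bounded(limit: int) -> bool:
    """Try every i0 in [1, limit) and n0 in [0, limit)."""
    return all(
        preservation_holds(i0, n0)
        for i0 in range(1, limit)
        for n0 in range(0, limit)
    )
```

The check passes for every pair tried because `i0 * fact(i0-1)` is `fact(i0)`, and `i0 <= n0` implies `i0 + 1 <= n0 + 1`. These are exactly the two facts a prover has to establish symbolically.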
Automated Verification

Deductive Verification
• A user provides a program and a verification certificate
– e.g., inductive invariant, pre- and post-conditions, function summaries, etc.
• A tool automatically checks validity of the certificate
– this is not easy! (might even be undecidable)
• Verification is manual but machine certified

Algorithmic Verification (My research area)


• A user provides a program and a desired specification
– e.g., program never writes outside of allocated memory
• A tool automatically checks validity of the specification
– and generates a verification certificate if the program is correct
– and generates a counterexample if the program is not correct
• Verification is completely automatic – “push-button”
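
The "push-button" interface of algorithmic verification can be illustrated with a tiny explicit-state checker in Python. This is a sketch for finite-state systems only, far simpler than the tools discussed in the course; the function names and the example system (a counter that steps by 2 modulo 6) are assumptions.

```python
from collections import deque


def check_safety(initial, successors, is_bad):
    """Breadth-first reachability over a finite-state system.
    Returns ("unsafe", trace) with a path to a bad state, or
    ("safe", reachable) where `reachable` serves as the certificate."""
    parent = {s: None for s in initial}
    queue = deque(initial)
    while queue:
        state = queue.popleft()
        if is_bad(state):
            trace = []
            while state is not None:     # walk back to an initial state
                trace.append(state)
                state = parent[state]
            return "unsafe", trace[::-1]
        for nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return "safe", set(parent)


def step(x):
    """Hypothetical system: a counter that advances by 2 modulo 6."""
    return [(x + 2) % 6]


verdict_ok, certificate = check_safety([0], step, lambda x: x == 3)
verdict_bad, counterexample = check_safety([0], step, lambda x: x == 4)
```

For the first property the checker answers "safe" and returns the reachable set {0, 2, 4}, which is an inductive invariant: it contains the initial state, is closed under `step`, and excludes the bad state. For the second it answers "unsafe" with the trace 0, 2, 4.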

Available Tools

Testing
• many tools actively used in industry. We will use Python unittest
Symbolic Execution
• mostly academic tools with emerging industrial applications
• KLEE, S2E, jDART, Pex (now Microsoft IntelliTest)
Automated Verification
• built into compilers, many lightweight static analyzers
– clang analyzer, Facebook Infer, Coverity, …
• academic tools pushing the coverage/automation boundary
– SeaHorn (my tool), JayHorn, CPAChecker, SMACK, T2, …
(Automated) Deductive Verification
• academic, still rather hard to use, as we’ll experience in class :)
• Dafny/Boogie (Microsoft), Viper, Why3, KeY, ...

Key Challenges

Testing
• Coverage

Symbolic Execution and Automated Verification


• Scalability

Deductive Verification
• Usability

Common Challenge
• Specification / Oracle

Topics Covered in the Course

Foundations
• syntax, semantics, abstract syntax trees, visitors, control flow graphs
Testing
• coverage: structural, dataflow, and logic
Symbolic Execution
• using SMT solvers, constraints, path conditions, exploration strategies
• building a (toy) symbolic execution engine
Deductive Verification
• Hoare Logic, weakest pre-condition calculus, verification condition generation
• verifying algorithms using Dafny, building a small verification engine
Automated Verification
• (basics of) software model checking

A little about me

2007, PhD University of Toronto

2006-2016, Principal Researcher at Software Engineering Institute, Carnegie Mellon University

Sep 2016, Associate Professor, University of Waterloo

Tools: SPACER, UFO, FrankenBit, Avy, SeaHorn

Interests and Tools

Interests
• Software Model Checking, Program Verification, Decision Procedures,
Abstract Interpretation, SMT, Horn Clauses, …

Active Tools
• SeaHorn – Algorithmic Logic-Based Verification framework for C
• AVY – Hardware Model Checker with Interpolating PDR
• SPACER – Horn Clause Solver based on Z3 GPDR
• for more, see http://arieg.bitbucket.org/tools.html

Current Work
• parametric symbolic reachability – verifying safety properties of parametric
systems
• automated verification of C
• …