SOFTWARE

TESTING
LICENSE, DISCLAIMER OF LIABILITY, AND LIMITED
WARRANTY
By purchasing or using this book (the “Work”), you agree that this license grants permission to use the
contents contained herein, but does not give you the right of ownership to any of the textual content in
the book or ownership to any of the information or products contained in it. This license does not
permit uploading of the Work onto the Internet or on a network (of any kind) without the written
consent of the Publisher. Duplication or dissemination of any text, code, simulations, images, etc.
contained herein is limited to and subject to licensing terms for the respective products, and permission
must be obtained from the Publisher or the owner of the content, etc., in order to reproduce or network
any portion of the textual material (in any media) that is contained in the Work.

MERCURY LEARNING AND INFORMATION (“MLI” or “the Publisher”) and anyone involved in
the creation, writing, or production of the companion disc, accompanying algorithms, code, or
computer programs (“the software”), and any accompanying Web site or software of the Work, cannot
and do not warrant the performance or results that might be obtained by using the contents of the Work.
The author, developers, and the Publisher have used their best efforts to ensure the accuracy and
functionality of the textual material and/or programs contained in this package; we, however, make no
warranty of any kind, express or implied, regarding the performance of these contents or programs. The
Work is sold “as is” without warranty (except for defective materials used in manufacturing the book or
due to faulty workmanship).

The author, developers, and the publisher of any accompanying content, and anyone involved in the
composition, production, and manufacturing of this work will not be liable for damages of any kind
arising out of the use of (or the inability to use) the algorithms, source code, computer programs, or
textual material contained in this publication. This includes, but is not limited to, loss of revenue or
profit, or other incidental, physical, or consequential damages arising out of the use of this Work.

The sole remedy in the event of a claim of any kind is expressly limited to replacement of the book, and
only at the discretion of the Publisher. The use of “implied warranty” and certain “exclusions” vary
from state to state, and might not apply to the purchaser of this product.
SOFTWARE
TESTING
A Self-Teaching Introduction

RAJIV CHOPRA, PhD

MERCURY LEARNING AND INFORMATION


Dulles, Virginia
Boston, Massachusetts
New Delhi
Copyright ©2018 by MERCURY LEARNING AND INFORMATION LLC. All rights reserved.

Original Title and Copyright: Software Testing, 4/e. © 2014 by S.K. Kataria & Sons.

This publication, portions of it, or any accompanying software may not be reproduced in any way,
stored in a retrieval system of any type, or transmitted by any means, media, electronic display or
mechanical display, including, but not limited to, photocopy, recording, Internet postings, or scanning,
without prior permission in writing from the publisher.

Publisher: David Pallai


MERCURY LEARNING AND INFORMATION
22841 Quicksilver Drive
Dulles, VA 20166
[email protected]
www.merclearning.com
(800) 232-0223

R. Chopra. Software Testing: A Self-Teaching Introduction.


ISBN: 978-1-683921-66-0

The publisher recognizes and respects all marks used by companies, manufacturers, and developers as a
means to distinguish their products. All brand names and product names mentioned in this book are
trademarks or service marks of their respective companies. Any omission or misuse (of any kind) of
service marks or trademarks, etc. is not an attempt to infringe on the property of others.

Library of Congress Control Number: 2017960714


18 19 20  3 2 1    This book is printed on acid-free paper in the United States of America

Our titles are available for adoption, license, or bulk purchase by institutions, corporations, etc.
For additional information, please contact the Customer Service Dept. at 800-232-0223(toll free).

All of our titles are available in digital format at authorcloudware.com and other digital vendors. The
sole obligation of MERCURY LEARNING AND INFORMATION to the purchaser is to replace the
book, based on defective materials or faulty workmanship, but not based on the operation or
functionality of the product.
CONTENTS

Chapter 1: Introduction to Software Testing


1.0. Introduction
1.1. The Testing Process
1.2. What is Software Testing?
1.3. Why Should We Test? What is the Purpose?
1.4. Who Should do Testing?
1.5. How Much Should We Test?
1.6. Selection of Good Test Cases
1.7. Measurement of Testing
1.8. Incremental Testing Approach
1.9. Basic Terminology Related to Software Testing
1.10. Testing Life Cycle
1.11. When to Stop Testing?
1.12. Principles of Testing
1.13. Limitations of Testing
1.14. Available Testing Tools, Techniques, and Metrics
Summary
Multiple Choice Questions
Answers
Conceptual Short Questions With Answers
Review Questions
Chapter 2: Software Verification and Validation
2.0. Introduction
2.1. Differences between Verification and Validation
2.2. Differences between QA and QC
2.3. Evolving Nature of Area
2.4. V&V Limitations
2.5. Categorizing V&V Techniques
2.6. Role of V&V in SDLC—Tabular Form [IEEE std. 1012]
2.7. Proof of Correctness (Formal Verification)
2.8. Simulation and Prototyping
2.9. Requirements Tracing
2.10. Software V&V Planning (SVVP)
2.11. Software Technical Reviews (STRs)
2.11.1. Rationale for STRs
2.11.2. Types of STRs
2.11.3. Review Methodologies
2.12. Independent V&V Contractor (IV&V)
2.13. Positive and Negative Effects of Software V&V on Projects
2.14. Standard for Software Test Documentation (IEEE829)
Summary
Multiple Choice Questions
Answers
Conceptual Short Questions With Answers
Review Questions
Chapter 3: Black-Box (or Functional) Testing Techniques
3.0. Introduction to Black-Box (or Functional Testing)
3.1. Boundary Value Analysis (BVA)
3.1.1. What is BVA?
3.1.2. Limitations of BVA
3.1.3. Robustness Testing
3.1.4. Worst-Case Testing
3.1.5. Examples with Their Problem Domain
3.1.6. Guidelines for BVA
3.2. Equivalence Class Testing
3.2.1. Weak Normal Equivalence Class Testing
3.2.2. Strong Normal Equivalence Class Testing
3.2.3. Weak Robust Equivalence Class Testing
3.2.4. Strong Robust Equivalence Class Testing
3.2.5. Solved Examples
3.2.6. Guidelines for Equivalence Class Testing
3.3. Decision Table Based Testing
3.3.1. What are Decision Tables?
3.3.2. Advantages, Disadvantage, and Applications of Decision
Tables
3.3.3. Examples
3.3.4. Guidelines for Decision Table Based Testing
3.4. Cause-Effect Graphing Technique
3.4.1. Causes and Effects
3.4.2. Test Cases for the Triangle Problem
3.4.3. Test Cases for Payroll Problem
3.4.4. Guidelines for the Cause-Effect Functional Testing
Technique
3.5. Comparison on Black-Box (or Functional) Testing Techniques
3.5.1. Testing Effort
3.5.2. Testing Efficiency
3.5.3. Testing Effectiveness
3.5.4. Guidelines for Functional Testing
3.6. Kiviat Charts
3.6.1. The Concept of Balance
Summary
Multiple Choice Questions
Answers
Conceptual Short Questions With Answers
Review Questions
Chapter 4: White-Box (or Structural) Testing Techniques
4.0. Introduction to White-Box Testing or Structural Testing or
Clear-Box or Glass-Box or Open-Box Testing
4.1. Static versus Dynamic White-Box Testing
4.2. Dynamic White-Box Testing Techniques
4.2.1. Unit/Code Functional Testing
4.2.2. Code Coverage Testing
4.2.3. Code Complexity Testing
4.3. Mutation Testing Versus Error Seeding—Differences in Tabular
Form
4.4. Comparison of Black-Box and White-Box Testing in Tabular
Form
4.5. Practical Challenges in White-Box Testing
4.6. Comparison on Various White-Box Testing Techniques
4.7. Advantages of White-Box Testing
Summary
Multiple Choice Questions
Answers
Conceptual Short Questions With Answers
Review Questions
Chapter 5: Gray-Box Testing
5.0. Introduction to Gray-Box Testing
5.1. What is Gray-Box Testing?
5.2. Various Other Definitions of Gray-Box Testing
5.3. Comparison of White-Box, Black-Box, and Gray-Box Testing
Approaches in Tabular Form
Summary
Multiple Choice Questions
Answers
Conceptual Short Questions With Answers
Review Questions
Chapter 6: Reducing the Number of Test Cases
6.0. Prioritization Guidelines
6.1. Priority Category Scheme
6.2. Risk Analysis
6.3. Regression Testing—Overview
6.3.1. Differences between Regression and Normal Testing
6.3.2. Types of Regression Testing
6.4. Prioritization of Test Cases for Regression Testing
6.5. Regression Testing Technique—A Case Study
6.6. Slice-Based Testing
Summary
Multiple Choice Questions
Answers
Conceptual Short Questions With Answers
Review Questions
Chapter 7: Levels of Testing
7.0. Introduction
7.1. Unit, Integration, System, and Acceptance Testing Relationship
7.2. Integration Testing
7.2.1. Classification of Integration Testing
7.2.2. Decomposition-Based Integration
7.2.3. Call Graph-Based Integration
7.2.4. Path-Based Integration with its Pros and Cons
7.2.5. System Testing
Summary
Multiple Choice Questions
Answers
Conceptual Short Questions With Answers
Review Questions
Chapter 8: Object-Oriented Testing
8.0. Basic Unit for Testing, Inheritance, and Testing
8.1. Basic Concepts of State Machines
8.2. Testing Object-Oriented Systems
8.2.1. Implementation-Based Class Testing/White-Box or
Structural Testing
8.2.2. Responsibility-Based Class Testing/Black-
Box/Functional Specification-Based Testing of Classes
8.3. Heuristics for Class Testing
8.4. Levels of Object-Oriented Testing
8.5. Unit Testing a Class
8.6. Integration Testing of Classes
8.7. System Testing (With Case Study )
8.8. Regression and Acceptance Testing
8.9. Managing the Test Process
8.10. Design for Testability (DFT)
8.11. GUI Testing
8.12. Comparison of Conventional and Object-Oriented Testing
8.13. Testing using Orthogonal Arrays
8.14. Test Execution Issues
8.15. Case Study—Currency Converter Application
Summary
Multiple Choice Questions
Answers
Conceptual Short Questions With Answers
Review Questions
Chapter 9: Automated Testing
9.0. Automated Testing
9.1. Consideration during Automated Testing
9.2. Types of Testing Tools: Static vs. Dynamic
9.3. Problems with Manual Testing
9.4. Benefits of Automated Testing
9.5. Disadvantages of Automated Testing
9.6. Skills Needed for Using Automated Tools
9.7. Test Automation: “No Silver Bullet”
9.8. Debugging
9.9. Criteria for Selection of Test Tools
9.10. Steps for Tool Selection
9.11. Characteristics of Modern Testing Tools
9.12. Case Study on Automated Tools, Namely, Rational Robot,
WinRunner, SilkTest, and LoadRunner
Summary
Multiple Choice Questions
Answers
Conceptual Short Questions With Answers
Review Questions
Chapter 10: Test Point Analysis (TPA)
10.0. Introduction
10.1. Methodology
10.1.1. TPA Philosophy
10.1.2. TPA Model
10.2. Case Study
10.3. TPA for Case Study
10.4. Phase Wise Breakup Over Testing Life Cycle
10.5. Path Analysis
10.6. Path Analysis Process
Summary
Multiple Choice Questions
Answers
Conceptual Short Questions With Answers
Review Questions
Chapter 11: Testing Your Websites—Functional and Non-Functional
Testing
11.0. Abstract
11.1. Introduction
11.2. Methodology
11.2.1. Non-Functional Testing (or White-Box Testing)
11.2.2. Functional Testing (or Black-Box Testing)
Summary
Multiple Choice Questions
Answers
Conceptual Short Questions With Answers
Review Questions
Chapter 12: Regression Testing of a Relational Database
12.0. Introduction
12.1. Why Test an RDBMS?
12.2. What Should We Test?
12.3. When Should We Test?
12.4. How Should We Test?
12.5. Who Should Test?
Summary
Multiple Choice Questions
Answers
Conceptual Short Questions With Answers
Review Questions
Chapter 13: A Case Study on Testing of E-Learning Management
Systems
1 Introduction
2 Software Requirement Specifications
2.1. Introduction
2.1.1. Purpose
2.1.2. Scope
2.1.3. Definitions, Acronyms, and Abbreviations
2.1.4. References Books
2.1.5. Overview
2.2. Overall Descriptions
2.2.1. Product Perspective
2.2.2. Product Functions
2.2.3. User Characteristics
2.2.4. Constraints
2.2.5. Assumptions and Dependencies
2.2.6. Apportioning of Requirements
2.3. Specific Requirements
2.3.1. User Interfaces and Validations
2.3.2. Functions
2.3.3. Modules
2.3.4. Performance Requirements
2.3.5. Logical Database Requirements
2.3.6. Design Constraints
2.3.7. Software System Attributes
2.4. Change Management Process
2.5. Document Approval
2.6. Supporting Information
3 System Design
4 Reports And Testing
4.1. Test Report
4.2. Testing
4.2.1. Types of Testing
4.2.2. Levels of Testing
5 Test Cases
5.1. Return Filed Report
5.2. Monthly/Quarterly Tax Paid Form
5.3. Monthly/Quarterly Tax Paid Form
5.4. Monthly/Quarterly Tax Paid Form
5.5. Service Wise Report (Admin Report)
5.6. STRPs Wise Report (Admin Report)
Conclusion
Chapter 14: The Game Testing Process
14.1. “Black-Box” Testing
14.2. “White-Box” Testing
14.3. The Life Cycle of a Build
14.4. On Writing Bugs Well
Exercises
Chapter 15: Basic Test Plan Template

Appendix A: Quality Assurance and Testing Tools


Appendix B: Suggested Projects
Appendix C: Glossary
Appendix D: Sample Project Description
Appendix E: Bibliography

Index
CHAPTER
1

INTRODUCTION TO SOFTWARE
TESTING
Inside this Chapter:
1.0. Introduction
1.1. The Testing Process
1.2. What is Software Testing?
1.3. Why Should We Test? What is the Purpose?
1.4. Who Should Do Testing?
1.5. What Should We Test?
1.6. Selection of Good Test Cases
1.7. Measurement of the Progress of Testing
1.8. Incremental Testing Approach
1.9. Basic Terminology Related to Software Testing
1.10. Testing Life Cycle
1.11. When to Stop Testing?
1.12. Principles of Testing
1.13. Limitations of Testing
1.14. Available Testing Tools, Techniques, and Metrics

1.0. INTRODUCTION
Testing is the process of executing the program with the intent of
finding faults. Who should do this testing and when should it start are
very important questions that are answered in this text. As we know
software testing is the fourth phase of the software development life
cycle (SDLC). About 70% of development time is spent on testing.
We explore this and many other interesting concepts in this chapter.
1.1. THE TESTING PROCESS
Testing is different from debugging. Removing errors from your
programs is known as debugging, whereas testing aims to locate as yet
undiscovered errors. We test our programs with both valid and invalid
inputs and then compare the expected outputs with the observed
outputs (after execution of the software). Please note that testing starts
as early as the requirements analysis phase and continues until the last
maintenance phase. During requirements analysis and design we do
static testing wherein the SRS is tested to check whether it is as per
user requirements or not. We use techniques of code reviews, code
inspections, walkthroughs, and software technical reviews (STRs) to
do static testing. Dynamic testing starts when the code is ready or even
a unit (or module) is ready. It is dynamic testing as now the code is
tested. We use various techniques for dynamic testing like black-box,
gray-box, and white-box testing. We will be studying these in the
subsequent chapters.

1.2. WHAT IS SOFTWARE TESTING?


The concept of software testing has evolved from simple program
“check-out” to a broad set of activities that cover the entire software
life-cycle.
There are five distinct levels of testing that are given below:
a. Debug: It is defined as the successful correction of a failure.
b. Demonstrate: The process of showing that major features
work with typical input.
c. Verify: The process of finding as many faults in the application
under test (AUT) as possible.
d. Validate: The process of finding as many faults in
requirements, design, and AUT.
e. Prevent: To avoid errors in development of requirements,
design, and implementation by self-checking techniques,
including “test before design.”
There are various definitions of testing that are given below:

“Testing is the process of exercising or evaluating a system or


system component by manual or automated means to verify that
it satisfies specified requirements.”
[IEEE 83a]

OR
“Software testing is the process of executing a program or
system with the intent of finding errors.”
[Myers]

OR
“It involves any activity aimed at evaluating an attribute or
capability of a program or system and determining that it meets
its required results.”
[Hetzel]
Testing is NOT:
a. The process of demonstrating that errors are not present.
b. The process of showing that a program performs its intended
functions correctly.
c. The process of establishing confidence that a program does
what it is supposed to do.
So, all these definitions are incorrect. Because, with these as
guidelines, one would tend to operate the system in a normal manner
to see if it works. One would unconsciously choose such
normal/correct test data as would prevent the system from failing.
Besides, it is not possible to certify that a system has no errors—
simply because it is almost impossible to detect all errors.
So, simply stated: “Testing is basically a task of locating errors.”
It may be:
a. Positive testing: Operate the application as it should be
operated. Does it behave normally? Use a proper variety of
legal test data, including data values at the boundaries to test if
it fails. Check actual test results with the expected. Are results
correct? Does the application function correctly?
b. Negative testing: Test for abnormal operations. Does the
system fail/crash? Test with illegal or abnormal data.
Intentionally attempt to make things go wrong and to
discover/detect—“Does the program do what it should not do?
Does it fail to do what it should?”
c. Positive view of negative testing: The job of testing is to
discover errors before the user does. A good tester is one who
is successful in making the system fail. Mentality of the tester
has to be destructive—opposite to that of the creator/author,
which should be constructive.
One very popular equation of software testing is:
Software Testing = Software Verification + Software
Validation

As per IEEE definition(s):

Software verification: “It is the process of evaluating a system


or component to determine whether the products of a given
development phase satisfy the conditions imposed at the start of
that phase.”
OR
“It is the process of evaluating, reviewing, inspecting and doing
desk checks of work products such as requirement
specifications, design specifications and code.”
OR
“It is a human testing activity as it involves looking at the
documents on paper.”
Whereas software validation: “It is defined as the process of
evaluating a system or component during or at the end of
development process to determine whether it satisfies the
specified requirements. It involves executing the actual
software. It is a computer based testing process.”
Both verification and validation (V&V) are complementary to
each other.
As mentioned earlier, good testing involves more than just running
a program. Consider a leap-year function written for MS SQL Server:

CREATE FUNCTION f_is_leap_year (@ai_year smallint)
RETURNS smallint
AS
BEGIN
    -- if year is illegal (NULL or negative), return -1
    IF (@ai_year IS NULL) OR (@ai_year <= 0)
        RETURN -1
    IF (((@ai_year % 4) = 0) AND ((@ai_year % 100) <> 0))
       OR ((@ai_year % 400) = 0)
        RETURN 1    -- leap year
    RETURN 0        -- not a leap year
END

We execute the above program with a number of inputs:

TABLE 1.1 Database Table: Test_leap_year


In this database table given above there are 15 test cases. But these
are not sufficient, as we have not tried all possible inputs. We
have not considered the trouble spots like:
i. Removing the condition ((@ai_year % 400) = 0) would result in
a Y2K problem.
ii. Entering the year in float format like 2010.11.
iii. Entering the year as a character or as a string.
iv. Entering the year as NULL or zero (0).
This list can also grow further. These are our trouble spots or
critical areas. We wish to locate these areas and fix these problems
before our customer does.
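To make this concrete, here is a minimal sketch in Python (not the book's T-SQL) of the same leap-year rule, together with a handful of positive and negative test inputs aimed at the trouble spots above. The function name and test values are illustrative only.

def is_leap_year(year):
    """Return 1 for a leap year, 0 for a non-leap year, -1 for illegal input."""
    if not isinstance(year, int) or year <= 0:
        return -1
    if (year % 4 == 0 and year % 100 != 0) or (year % 400 == 0):
        return 1
    return 0

# Test inputs chosen around the trouble spots: century years, boundaries,
# and illegal values (the analogue of NULL, zero, and negative years).
tests = [
    (2000, 1),    # divisible by 400: leap year (catches a missing %400 check)
    (1900, 0),    # divisible by 100 but not by 400: not a leap year
    (2012, 1),    # ordinary leap year
    (2011, 0),    # ordinary non-leap year
    (1, 0),       # smallest legal year
    (0, -1),      # zero is illegal
    (-4, -1),     # negative year is illegal
    (None, -1),   # missing value is illegal
]
for year, expected in tests:
    assert is_leap_year(year) == expected, (year, expected)
print("All leap-year test cases passed.")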

1.3. WHY SHOULD WE TEST? WHAT IS THE PURPOSE?


Testing is necessary. Why?
1. The Technical Case:
a. Competent developers are not infallible.
b. The implications of requirements are not always
foreseeable.
c. The behavior of a system is not necessarily predictable
from its components.
d. Languages, databases, user interfaces, and operating
systems have bugs that can cause application failures.
e. Reusable classes and objects must be trustworthy.
2. The Business Case:
a. If you don’t find bugs, your customers or users will.
b. Post-release debugging is the most expensive form of
development.
c. Buggy software hurts operations, sales, and reputation.
d. Buggy software can be hazardous to life and property.
3. The Professional Case:
a. Test case design is a challenging and rewarding task.
b. Good testing allows confidence in your work.
c. Systematic testing allows you to be most effective.
d. Your credibility is increased and you have pride in your
efforts.
4. The Economics Case: Practically speaking, defects get
introduced in every phase of SDLC. Pressman has described a
defect amplification model wherein he says that errors get
amplified by a certain factor if they are not removed in the phase in
which they were introduced. This may increase the cost of defect removal. This
principle of detecting errors as close to their point of
introduction as possible is known as phase containment of
errors.
FIGURE 1.1 Efforts During SDLC.

5. To Improve Quality: As computers and software are used in


critical applications, the outcome of a bug can be severe. Bugs
can cause huge losses. Bugs in critical systems have caused
airplane crashes, allowed space shuttle systems to go awry, and
halted trading on the stock market. Bugs can kill. Bugs can
cause disasters.
In a computerized embedded world, the quality and reliability
of software is a matter of life and death. This can be achieved
only if thorough testing is done.
6. For Verification and Validation (V&V): Testing can serve as
metrics. It is heavily used as a tool in the V&V process. We
can compare the quality among different products under the
same specification based on results from the same test.
Good testing can provide measures for all relevant quality
factors.
7. For Reliability Estimation: Software reliability has important
relationships with many aspects of software, including the
structure and the amount of testing done to the software. Based
on an operational profile (an estimate of the relative frequency
of use) of various inputs to the program, testing can serve as a
statistical sampling method to gain failure data for reliability
estimation.

Recent Software Failures


a. May 31st, 2012, HT reports the failure of air traffic
management software, Auto Trac-III, at Delhi Airport. The
system is unreliable. This ATC software was installed in 2010
as Auto Trac-II (its older version). Since then it has faced many
problems due to inadequate testing. Some of the snags were:
1. May 28, 2011, snag hits radar system of ATC.
2. Feb. 22, 2011, ATC goes blind for 10 minutes with no data
about arriving or departing flights.
3. July 28, 2010, radar screens at ATC go blank for 25 minutes
after system displaying flight data crashes.
4. Feb. 10, 2010, one of the radar scopes stops working at
ATC.
5. Jan. 27, 2010, a screen goes blank at ATC due to a technical
glitch.
6. Jan. 15, 2010, radar system collapses due to software glitch.
ATC officials manually handle the aircraft.
b. The case of a 2010 Toyota Prius that had a software bug that
caused braking problems on bumpy roads.
c. In the case of the Therac-25, six cancer patients were given
radiation overdoses.
d. A breach of the PlayStation Network caused a loss of $170 million
to Sony Corp.

Why did it happen?

As we know software testing constitutes about 40% of overall effort


and 25% of the overall software budget. Software defects are
introduced during SDLC due to poor quality requirements, design, and
code. Sometimes due to the lack of time and inadequate testing, some
of the defects are left behind, only to be found later by users. Software
is a ubiquitous product; 90% of people use software in their everyday
life. Software has high failure rates due to the poor quality of the
software.
Smaller companies that don’t have deep pockets can get wiped out
because they did not pay enough attention to software quality and
conduct the right amount of testing.

1.4. WHO SHOULD DO TESTING?


As mentioned earlier, testing starts right from the very beginning. This
implies that testing is everyone’s responsibility. By “everyone,” we
mean all project team members. So, we cannot rely on one person
only. Naturally, it is a team effort. We cannot make only the designated
tester responsible. Even the developers are responsible; they build the code
but tend to overlook errors in code they have written themselves.

1.5. HOW MUCH SHOULD WE TEST?


Consider a while loop whose body has three paths. If the loop body is
executed twice, we have (3 × 3) paths, and so on. So, the total number
of paths through such code will be:

= 1 + 3 + (3 × 3) + (3 × 3 × 3) + ...
= 1 + Σ 3^n (where n > 0)

This means an infinite number of test cases. Thus, testing can never be
100% exhaustive.
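As a quick illustration of this combinatorial explosion, the following minimal Python sketch (illustrative only, assuming three paths per loop-body execution as above) sums the series for a bounded number of iterations:

def path_count(paths_per_iteration, max_iterations):
    """Total paths for 0..max_iterations executions of the loop body,
    i.e., 1 + p + p^2 + ... + p^max_iterations."""
    return sum(paths_per_iteration ** n for n in range(max_iterations + 1))

for k in (1, 5, 10, 20):
    print(k, path_count(3, k))
# Even at 20 iterations there are over 5 billion paths, so exhaustive
# path testing is already infeasible for a single small loop.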

1.6. SELECTION OF GOOD TEST CASES


Designing a good test case is a complex art. It is complex because:
a. Different types of test cases are needed for different classes of
information.
b. Not all test cases within a test suite will be good. Test cases
may be good in a variety of ways.
c. People create test cases according to certain testing styles like
domain testing or risk-based testing. And good domain tests are
different from good risk-based tests.
Brian Marick coined a new term for a lightly documented test case
—the test idea. According to Brian, “A test idea is a brief statement of
something that should be tested.” For example, if we are testing a
square-root function, one test idea would be—“test a number less than
zero.” The idea here is again to check if the code handles an error
case.
Cem Kaner said—“The best test cases are the ones that find
bugs.” Our efforts should focus on the test cases that find issues. Do
broad or deep coverage testing on the trouble spots.
A test case is a question that you ask of the program. The point of
running the test is to gain information like whether the program will
pass or fail the test.
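For instance, the square-root test idea above ("test a number less than zero") can be turned into a concrete test case. The following minimal Python sketch uses a hypothetical wrapper function, so the name and its behavior are assumptions made only for illustration:

import math

def safe_sqrt(x):
    """Hypothetical function under test: reject negative inputs."""
    if x < 0:
        raise ValueError("square root of a negative number is undefined")
    return math.sqrt(x)

# Test idea: "test a number less than zero" turned into a concrete test case.
try:
    safe_sqrt(-1)
except ValueError:
    print("PASS: negative input rejected as expected")
else:
    print("FAIL: negative input was accepted")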

1.7. MEASUREMENT OF TESTING


There is no single scale that is available to measure the testing
progress. A good project manager (PM) wants worse conditions to
occur in the very beginning of the project instead of in the later
phases. If errors are large in number, we can say either testing was
not done thoroughly or it was done so thoroughly that all the errors were
uncovered. So there is no standard way to measure our testing process.
But metrics can be computed at the organizational, process, project,
and product levels. Each set of these measurements has its value in
monitoring, planning, and control.

NOTE

Metrics are assisted by four core components: schedule, quality,
resources, and size.

1.8. INCREMENTAL TESTING APPROACH


To be effective, a software tester should be knowledgeable in two key
areas:
1. Software testing techniques
2. The application under test (AUT)
For each new testing assignment, a tester must invest time in
learning about the application. A tester with no experience must also
learn testing techniques, including general testing concepts and how to
define test cases. Our goal is to define a suitable list of tests to perform
within a tight deadline. There are 8 stages for this approach:
Stage 1: Exploration
Purpose: To gain familiarity with the application
Stage 2: Baseline test
Purpose: To devise and execute a simple test case
Stage 3: Trends analysis
Purpose: To evaluate whether the application performs as
expected when actual output cannot be predetermined
Stage 4: Inventory
Purpose: To identify the different categories of data and create a
test for each category item
Stage 5: Inventory combinations
Purpose: To combine different input data
Stage 6: Push the boundaries
Purpose: To evaluate application behavior at data boundaries
Stage 7: Devious data
Purpose: To evaluate system response when specifying bad data
Stage 8: Stress the environment
Purpose: To attempt to break the system
The schedule is tight, so we may not be able to perform all of the
stages. The time permitted by the delivery schedule determines how
many stages one person can perform. After executing the baseline test,
later stages could be performed in parallel if more testers are available.

1.9. BASIC TERMINOLOGY RELATED TO SOFTWARE


TESTING
We must define the following terminologies one by one:
1. Error (or mistake or bugs): People make errors. When people make
mistakes while coding, we call these mistakes bugs. Errors tend to
propagate. A requirements error may be magnified during design
and still amplified during coding. So, an error is a mistake during
SDLC.
2. Fault (or defect): A missing or incorrect statement in a program
resulting from an error is a fault. So, a fault is the representation of
an error. Representation here means the mode of expression, such
as a narrative text, data flow diagrams, hierarchy charts, etc. Defect
is a good synonym for fault. Faults can be elusive. They require
fixes.
3. Failure: A failure occurs when a fault executes. The manifested
inability of a system or component to perform a required function
within specified limits is known as a failure. A failure is evidenced
by incorrect output, abnormal termination, or unmet time and space
constraints. It is a dynamic process.
So, Error (or mistake or bug) → Fault (or defect) → Failure.
For example,
Error (e.g., * replaced by /) → Defect (e.g., C = A/B) → Failure
(e.g., C = 2 instead of 8).
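A minimal Python sketch of this chain (the function and values are illustrative only): the programmer's error introduces a fault into the code, and the failure appears only when that faulty statement executes with inputs that expose it.

def scale(a, b):
    """Intended behavior: return a * b."""
    return a / b      # fault: the error "* replaced by /" is now in the code

result = scale(4, 2)  # the faulty statement executes
print(result)         # observed 2.0 instead of the expected 8: a failure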
4. Incident: When a failure occurs, it may or may not be readily
apparent to the user. An incident is the symptom associated with a
failure that alerts the user to the occurrence of a failure. It is an
unexpected occurrence that requires further investigation. It may
not need to be fixed.
5. Test: Testing is concerned with errors, faults, failures, and
incidents. A test is the act of exercising software with test cases. A
test has two distinct goals—to find failures or to demonstrate
correct execution.
6. Test case: A test case has an identity and is associated with
program behavior. A test case also has a set of inputs and a list of
expected outputs. The essence of software testing is to determine a
set of test cases for the item to be tested.
The test case template is shown below.

FIGURE 1.2 Test Case Template.

There are 2 types of inputs:


a. Preconditions: Circumstances that hold prior to test case
execution.
b. Actual inputs: the inputs that were identified by some testing method.
Expected outputs are also of two types:
a. Post conditions
b. Actual outputs
The act of testing entails establishing the necessary preconditions,
providing the test case inputs, observing the outputs, and then
comparing these with the expected outputs to determine whether the
test passed.
The remaining information in a test case primarily supports the testing
team. Test cases should have an identity and a reason for being. It is
also useful to record the execution history of a test case, including
when and by whom it was run, the pass/fail result of each execution,
and the version of the software on which it was run. This makes it
clear that test cases are valuable, at least as valuable as source code.
Test cases need to be developed, reviewed, used, managed, and saved.
So, we can say that test cases occupy a central position in testing.

Test cases for an ATM (precondition: the system is started):
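The ATM test case table itself is not reproduced here, but a test case can be written down as a simple structured record. The following minimal Python sketch uses illustrative field names (identity, preconditions, inputs, expected outputs, postconditions, and execution history), not the book's exact template:

from dataclasses import dataclass, field

@dataclass
class TestCase:
    test_case_id: str
    purpose: str
    preconditions: list
    inputs: dict
    expected_outputs: dict
    postconditions: list
    execution_history: list = field(default_factory=list)  # (date, run by, result, software version)

atm_withdrawal = TestCase(
    test_case_id="ATM-TC-01",
    purpose="Valid cash withdrawal within the account balance",
    preconditions=["System is started", "Valid card inserted", "Correct PIN entered"],
    inputs={"amount": 100},
    expected_outputs={"cash_dispensed": 100, "receipt_printed": True},
    postconditions=["Account balance reduced by 100"],
)
print(atm_withdrawal.test_case_id, atm_withdrawal.purpose)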
7. Test suite: A collection of test scripts or test cases that is used for
validating bug fixes (or finding new bugs) within a logical or
physical area of a product. For example, an acceptance test suite
contains all of the test cases that were used to verify that the
software has met certain predefined acceptance criteria.
8. Test script: The step-by-step instructions that describe how a test
case is to be executed. It may contain one or more test cases.
9. Testware: It includes all of the testing documentation created during
the testing process. For example, test specification, test scripts, test
cases, test data, the environment specification.
10. Test oracle: Any means used to predict the outcome of a test.
11. Test log: A chronological record of all relevant details about the
execution of a test.
12. Test report: A document describing the conduct and results of
testing carried out for a system.

1.10. TESTING LIFE CYCLE

FIGURE 1.3 A Testing Life Cycle.

In the development phase, three opportunities arise for errors to be


made resulting in faults that propagate through the remainder of the
development process. The first three phases are putting bugs IN, the
testing phase is finding bugs, and the last three phases are getting bugs
OUT. The fault resolution step is another opportunity for errors and
new faults. When a fix causes formerly correct software to misbehave,
the fix is deficient.

1.11. WHEN TO STOP TESTING?


Testing is potentially endless. We cannot test until all defects are
unearthed and removed. It is simply impossible. At some point, we
have to stop testing and ship the software. The question is when?
Realistically, testing is a trade-off between budget, time, and
quality. It is driven by profit models.
The pessimistic approach is to stop testing whenever some or any
of the allocated resources—time, budget, or test cases—are exhausted.
The optimistic stopping rule is to stop testing when either
reliability meets the requirement, or the benefit from continuing
testing cannot justify the testing cost.
[Yang]

1.12. PRINCIPLES OF TESTING


To make software testing effective and efficient we follow certain
principles. These principles are stated below.
1. Testing should be based on user requirements: This is in order
to uncover any defects that might cause the program or system
to fail to meet the client’s requirements.
2. Testing time and resources are limited: Avoid redundant tests.
3. Exhaustive testing is impossible: As stated by Myers, it is
impossible to test everything due to huge data space and the
large number of paths that a program flow might take.
4. Use effective resources to test: This represents the most
suitable tools, procedures, and individuals to conduct the tests.
The test team should use tools like:
a. DejaGnu: It is a testing framework for interactive or
batch-oriented applications. It is designed for regression
and embedded system testing. It runs on the UNIX platform
and is cross-platform.
5. Test planning should be done early: This is because test
planning can begin independently of coding and as soon as the
client requirements are set.
6. Testing should begin “in small” and progress toward testing
“in large”: The smallest programming units (or modules)
should be tested first and then expanded to other parts of the
system.
7. Testing should be conducted by an independent third party.
8. All tests should be traceable to customer requirements.
9. Assign the best people for testing. Avoid programmers.
10. Test should be planned to show software defects and not their
absence.
11. Prepare test reports including test cases and test results to
summarize the results of testing.
12. Advance test planning is a must and should be updated in a
timely manner.

1.13. LIMITATIONS OF TESTING


1. Testing can show the presence of errors—not their absence.
2. No matter how hard you try, you will never find the last bug in
an application.
3. The domain of possible inputs is too large to test.
4. There are too many possible paths through the program to test.
5. In short, maximum coverage through minimum test cases. That
is the challenge of testing.
6. Various testing techniques are complementary in nature and it
is only through their combined use that one can hope to detect
most errors.

NOTE

To see some of the most popular testing tools of 2017, visit the
following site: https://www.guru99.com/testing-tools.html

1.14. AVAILABLE TESTING TOOLS, TECHNIQUES, AND


METRICS
There are an abundance of software testing tools that exist. Some of
the early tools are listed below:
a. Mothora: It is an automated mutation testing tool-set
developed at Purdue University. Using Mothora, the tester
can create and execute test cases, measure test case
adequacy, determine input-output correctness, locate and
remove faults or bugs, and control and document the test.
b. NuMega’s Bounds Checker, Rational’s Purify: They
are run-time checking and debugging aids. They can both
check and protect against memory leaks and pointer
problems.
c. Ballista COTS Software Robustness Testing Harness
[Ballista]: It is a full-scale automated robustness testing
tool. It gives quantitative measures of robustness
comparisons across operating systems. The goal is to
automatically test and harden commercial off-the-shelf
(COTS) software against robustness failures.

SUMMARY
1. Software testing is an art. Most of the testing methods and
practices are not very different from 20 years ago. It is
nowhere near maturity, although there are many tools and
techniques available to use. Good testing also requires a
tester’s creativity, experience, and intuition, together with
proper techniques.
2. Testing is more than just debugging. It is not only used to
locate errors and correct them. It is also used in validation,
verification process, and reliability measurement.
3. Testing is expensive. Automation is a good way to cut down
cost and time. Testing efficiency and effectiveness are the
criteria for coverage-based testing techniques.
4. Complete testing is infeasible. Complexity is the root of the
problem.
5. Testing may not be the most effective method to improve
software quality.

MULTIPLE CHOICE QUESTIONS


1. Software testing is the process of
a. Demonstrating that errors are not present
b. Executing the program with the intent of finding errors
c. Executing the program to show that it executes as per SRS
d. All of the above.
2. Programmers make mistakes during coding. These mistakes
are known as
a. Failures
b. Defects
c. Bugs
d. Errors
3. Software testing is nothing else but
a. Verification only
b. Validation only
c. Both verification and validation
d. None of the above.
4. Test suite is a
a. Set of test cases
b. Set of inputs
c. Set of outputs
d. None of the above.
5. Which one is not the verification activity?
a. Reviews
b. Path testing
c. Walkthroughs
d. Acceptance testing
6. A break in the working of a system is called a(n)
a. Defect
b. Failure
c. Fault
d. Error
7. One fault may lead to
a. One failure
b. No failure
c. Many failures
d. All of the above.
8. Verification is
a. Checking product with respect to customer’s expectations
b. Checking product with respect to SRS
c. Checking product with respect to the constraints of the
project
d. All of the above.
9. Validation is
a. Checking the product with respect to customer’s
expectations
b. Checking the product with respect to specification
c. Checking the product with respect to constraints of the
project
d. All of the above.
10. Which one of the following is not a testing tool?
a. Deja Gnu
b. TestLink
c. TestRail
d. SOLARIS

ANSWERS
1. b.
2. c.
3. c.
4. a.
5. d.
6. b.
7. d.
8. b.
9. a.
10. d.

CONCEPTUAL SHORT QUESTIONS WITH ANSWERS


Q. 1. Are there some myths associated to software testing?
Ans. Some myths related to software testing are as follows:
1. Testing is a structured waterfall idea: Testing may be a purely
independent, incremental, and iterative activity. Its nature
depends upon the context and adapted strategy.
2. Testing is trivial: Adequate testing means a complete
understanding of application under test (AUT) and
appreciating testing techniques.
3. Testing is not necessary: One can minimize the programming
errors but cannot eliminate them. So, testing is necessary.
4. Testing is time consuming and expensive: We remember a
saying—“Pay me now or pay me much more later” and this is
true in the case of software testing as well. It is better to apply
testing strategies now or else defects may appear in the final
product which is more expensive.
5. Testing is a destructive process: Testing software is a
diagnostic and creative activity which promotes quality.
Q. 2. Give one example of
a. Interface specification bugs
b. Algorithmic bugs
c. Mechanical bugs
Ans. a. Examples of interface specification bugs are:
i. Mismatch between what the client needs and what the
server offers.
ii. Mismatch between requirements and implementation.
b. Examples of algorithmic bugs are:
i. Missing initialization.
ii. Branching errors.
c. Examples of mechanical bugs are:
i. Documentation not matching with the operating
procedures.
Q. 3. Why are developers not good testers?
Ans. A person checking his own work using his own documentation
has some disadvantages:
i. Misunderstandings will not be detected. This is because the
checker will assume that what the other individual heard from
him was correct.
ii. Whether the development process is being followed properly
or not cannot be detected.
iii. The individual may be “blinded” into accepting erroneous
system specifications and coding because he falls into the
same trap during testing that led to the introduction of the
defect in the first place.
iv. It may result in underestimation of the need for extensive
testing.
v. It discourages the need for allocation of time and effort for
testing.
Q. 4. Are there any constraints on testing?
Ans. The following are the constraints on testing:
i. Budget and schedule constraints
ii. Changes in technology
iii. Limited tester’s skills
iv. Software risks
Q. 5. What are test matrices?
Ans. A test matrix shows the inter-relationship between functional
events and tests. A complete test matrix defines the conditions
that must be tested during the test process.
The left side of the matrix shows the functional events and the
top identifies the tests that occur in those events. Within the
matrix, cells are the process that needs to be tested. We can even
cascade these test matrices.
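As a rough illustration (the events and test names below are made up, not taken from the book), a test matrix can be kept as a simple mapping from functional events to the tests that exercise them:

# Rows are functional events, columns are tests; each cell records
# whether that test exercises the event.
test_matrix = {
    "Customer opens account":  {"acceptance_test": True,  "data_validation_test": True},
    "Customer deposits funds": {"acceptance_test": True,  "data_validation_test": False},
    "Customer closes account": {"acceptance_test": False, "data_validation_test": True},
}

for event, tests in test_matrix.items():
    covered_by = [name for name, hit in tests.items() if hit]
    print(f"{event}: covered by {covered_by if covered_by else 'NO TESTS'}")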
Q. 6. What is a process?
Ans. It is defined as a set of activities that represent the way work is
performed. The outcome of a process is usually a product or
service. For example,

Examples of IT processes and their outcomes:
1. Analyze business needs → Needs statement
2. Run job → Executed job
3. Unit test → Defect-free unit

Q. 7. Explain PDCA view of a process?


Ans. A PDCA cycle is a conceptual view of a process. It is shown in
Figure 1.4.
It has four components:

FIGURE 1.4

i. P-Devise a plan: Define your objective and find the


conditions and methods required to achieve your objective.
Express a specific objective numerically.
ii. D-Execute (or Do) the plan: Create the conditions and
perform the necessary teaching and training to execute the
plan. Make sure everyone thoroughly understands the
objectives and the plan. Teach workers all of the procedures
and skills they need to fulfill the plan, along with a thorough
understanding of the job. Then they can perform the
work according to these procedures.
iii. C-Check the results: Check to determine whether work is
progressing according to the plan and that the expected
results are obtained. Also, compare the results of the work
with the objectives.
iv. A-Take necessary action: If a check reveals an
abnormality, i.e., if the actual value differs from the target
value, then search for its cause and try to mitigate the cause.
This will prevent the recurrence of the defect.
Q. 8. Explain the V-model of testing?
Ans. According to the waterfall model, testing is a post-development
activity. The spiral model took one step further by breaking the
product into increments each of which can be tested separately.
However, V-model brings in a new perspective that different
types of testing apply at different levels. The V-model splits
testing into two parts—design and execution. Please note that
the test design is done early while the test execution is done in
the end. This early design of tests reduces overall delay by
increasing parallelism between development and testing. It
enables better and more timely validation of individual phases.
The V-model is shown in Figure 1.5.

FIGURE 1.5

The levels of testing echo the levels of abstractions found in the


waterfall model of SDLC. Please note here, especially in terms
of functional testing, that the three levels of definition, i.e.,
specification, initial design, and detailed design; correspond
directly to three levels of testing—unit, integration, and system
testing.

A practical relationship exists between the levels of testing


versus black- and white-box testing. Most practitioners say that
structural testing is most appropriate at the unit level while
functional testing is most appropriate at the system level.
REVIEW QUESTIONS
1. What is software testing?
2. Distinguish between positive and negative testing?
3. Software testing is software verification plus software
validation. Discuss.
4. What is the need of testing?
5. Who should do testing? Discuss various people and their roles
during development and testing.
6. What should we test?
7. What criteria should we follow to select test cases?
8. Can we measure the progress of testing?
9. “Software testing is an incremental process.” Justify the
statement.
10. Define the following terms:
a. Error
b. Fault
c. Failure
d. Incident
e. Test
f. Test case
g. Test suite
h. Test script
i. Testware
11. Explain the testing life cycle?
12. When should we stop testing? Discuss pessimistic and
optimistic approaches.
13. Discuss the principles of testing.
14. What are the limitations of testing?
15. a. What is debugging?
b. Why exhaustive testing is not possible?
16. What are modern testing tools?
17. Write a template for a typical test case.
18. Differentiate between error and fault.
19. What is software testing? How is it different from debugging?
20. Differentiate between verification and validation?
21. Explain the concept of a test case and test plan.
22. a. Differentiate between positive testing and negative testing.
b. Why is 100% testing not possible through either black-box
or white-box testing techniques?
c. Name two testing tools used for functional testing.
d. What is static testing? Name two techniques to perform
static testing.
23. “Software testing is an incremental process.” Justify the
statement.
24. a. Why are developers not good testers?
b. Which approach should be followed to stop testing?
25. a. Discuss the role of software testing during the software life
cycle and why it is so difficult?
b. What should we test? Comment on this statement. Illustrate
the importance of testing.
c. Will exhaustive testing (even if possible for very small
programs) guarantee that the program is 100% correct?
d. Define the following terms:
i. Test suite
ii. Bug
iii. Mistake
iv. Software failure
CHAPTER
2

SOFTWARE VERIFICATION AND


VALIDATION
Inside this Chapter:
2.0. Introduction
2.1. Differences Between Verification and Validation
2.2. Differences Between QA and QC
2.3. Evolving Nature of Area
2.4. V&V Limitations
2.5. Categorizing V&V Techniques
2.6. Role of V&V in SDLC—Tabular Form
2.7. Proof of Correctness (Formal Verification)
2.8. Simulation and Prototyping
2.9. Requirements Tracing
2.10. Software V&V Planning (SVVP)
2.11. Software Technical Reviews (STRs)
2.12. Independent V&V (IV&V) Contractor
2.13. Positive and Negative Effects of Software V&V on Projects
2.14. Standard for Software Test Documentation (IEEE829)

2.0. INTRODUCTION
Software that satisfies its user expectations is a necessary goal of a
successful software development organization. To achieve this goal,
software engineering practices must be applied throughout the
evolution of the software product. Most of these practices attempt to
create and modify software in a manner that maximizes the probability
of satisfying its user expectations.
2.1. DIFFERENCES BETWEEN VERIFICATION AND
VALIDATION
Software verification and validation (V&V) is a technical discipline of
systems engineering. According to Stauffer and Fuji (1986), software
V&V is “a systems engineering process employing a rigorous
methodology for evaluating the correctness and quality of software
product through the software life cycle.”
According to Dr. Barry Boehm (1981), software V&V is
performed in parallel with the software development and not at the
conclusion of the software development. However, verification and
validation are different. The following table shows the differences between them.

Verification
1. It is a static process of verifying documents, design, and code.
2. It does not involve executing the code.
3. It is human-based checking of documents/files.
4. Its target is the requirements specification, application architecture, high-level and detailed design, and database design.
5. It uses methods like inspections, walkthroughs, desk-checking, etc.
6. It generally comes first, before validation.
7. It answers the question: Are we building the product right?
8. It can catch errors that validation cannot catch.

Validation
1. It is a dynamic process of validating/testing the actual product.
2. It involves executing the code.
3. It is the computer-based execution of the program.
4. Its target is the actual product: a unit, a module, a set of integrated modules, and the final product.
5. It uses methods like black-box, gray-box, and white-box testing.
6. It generally follows verification.
7. It answers the question: Are we building the right product?
8. It can catch errors that verification cannot catch.

Both of these are essential and complementary. Each provides its


own sets of error filters.
Each has its own way of finding the errors in the software.

2.2. DIFFERENCES BETWEEN QA AND QC


Quality assurance: The planned and systematic activities implemented
in a quality system so that quality requirements for a product or
service will be fulfilled is known as quality assurance.
Quality control: The observation techniques and activities used to
fulfill requirements for quality is known as quality control.
Both are, however, different from each other. The differences
between them are tabulated below.

Quality Assurance (QA)
1. It is process related.
2. It focuses on the process used to develop a product.
3. It involves the quality of the processes.
4. It is a preventive control.
5. Allegiance is to development.

Quality Control (QC)
1. It is product related.
2. It focuses on testing of a product developed or a product under development.
3. It involves the quality of the products.
4. It is a detective control.
5. Allegiance is not to development.

2.3. EVOLVING NATURE OF AREA


As the complexity and diversity of software products continue to
increase, the challenge to develop new and more effective V&V
strategies continues. The V&V approaches that were reasonably
effective on small batch-oriented products are not sufficient for
concurrent, distributed, or embedded products. Thus, this area will
continue to evolve as new research results emerge in response to new
V&V challenges.

2.4. V&V LIMITATIONS


The overall objective of software V&V approaches is to ensure that
the product is free from failures and meets its user’s expectations.
There are several theoretical and practical limitations that make this
objective impossible to obtain for many products. They are discussed
below:

1. Theoretical Foundations
Howden claims the most important theoretical result in program
testing and analysis is that no general purpose testing or analysis
procedure can be used to prove program correctness.

2. Impracticality of Testing All Data


For most programs, it is impractical to attempt to test the program with
all possible inputs due to a combinational explosion. For those inputs
selected, a testing oracle is needed to determine the correctness of the
output for a particular test input.

3. Impracticality of Testing All Paths


For most programs, it is impractical to attempt to test all execution
paths through the product due to a combinational explosion. It is also
not possible to develop an algorithm for generating test data for paths
in an arbitrary product due to the mobility to determine path
feasibility.

4. No Absolute Proof of Correctness


Howden claims that there is no such thing as an absolute proof of
correctness. Instead, he suggests that there are proofs of equivalency,
i.e., proofs that one description of a product is equivalent to another
description. Hence, unless a formal specification can be shown to be
correct and, indeed, reflects exactly the user’s expectations, no claims
of product correctness can be made.
2.5. CATEGORIZING V&V TECHNIQUES
Various V&V techniques are categorized below:

FIGURE 2.1 Types of V&V Techniques.

The static methods of V&V involve the review processes, whereas
dynamic methods like black-box testing can be applied at all
levels, even at the system level. While the principle of white-box
testing is that it checks for interface errors at the module level, black-
box testing can also be done at the module level by testing boundary
conditions and low-level functions like correctly displaying error
messages.

2.6. ROLE OF V&V IN SDLC—TABULAR FORM [IEEE


STD. 1012]
Traceability Analysis
It traces each software requirement back to the system requirements
established in the concept activity. This is to ensure that each
requirement correctly satisfies the system requirements and that no
extraneous software requirements are added. In this technique, we also
determine whether any derived requirements are consistent with the
original objectives, physical laws, and the technologies described in
system document.

Interface Analysis
It is the detailed examination of the interface requirements
specifications. The evaluation criteria are the same as those for the
requirements specification. The main focus is on the interfaces
between software, hardware, user, and external software.

Criticality Analysis
Criticality is assigned to each software requirement. When
requirements are combined into functions, the combined criticality of
requirements form the criticality for the aggregate function. Criticality
analysis is updated periodically as requirement changes are
introduced. This is because such changes can cause an increase or
decrease in a function's criticality, which depends on how the revised
requirement impacts system criticality.
Criticality analysis is a method used to locate and reduce high-risk
problems and is performed at the beginning of the project. It identifies
the functions and modules that are required to implement critical
program functions or quality requirements like safety, security, etc.

Criticality analysis involves the following steps:


Step 1: Construct a block diagram or control flow diagram
(CFD) of the system and its elements. Each block will
represent one software function (or module) only.
Step 2: Trace each critical function or quality requirement
through CFD.
Step 3: Classify all traced software functions as critical to:
a. Proper execution of critical software functions.
b. Proper execution of critical quality requirements.
Step 4: Focus additional analysis on these traced critical software
functions.
Step 5: Repeat criticality analysis for each life cycle process to
determine whether the implementation details shift the
emphasis of the criticality.
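As a rough illustration of Steps 1 through 3, the following Python sketch traces a critical function through a control flow diagram held as an adjacency dictionary; the module names and the diagram itself are hypothetical examples, not taken from the standard or from this text.

def trace_critical(cfd, entry_points):
    # Return every module reachable from the given critical entry points (Steps 2-3).
    critical, stack = set(), list(entry_points)
    while stack:
        module = stack.pop()
        if module in critical:
            continue
        critical.add(module)
        stack.extend(cfd.get(module, []))
    return critical

# Hypothetical CFD: each block represents one software function (module).
cfd = {
    "init": ["read_sensor", "log"],
    "read_sensor": ["check_limits"],
    "check_limits": ["shutdown", "log"],   # "shutdown" implements a safety requirement
    "shutdown": [],
    "log": [],
}

# Every module on the path of the safety-critical function is classified critical.
print(trace_critical(cfd, ["read_sensor"]))   # {'read_sensor', 'check_limits', 'shutdown', 'log'}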

Hazard and Risk Analysis


It is done during the requirements definition activity. Now hazards or
risks are identified by further refining the system requirements into
detailed software requirements. These risks are assessed for their
impact on the system. A summary of these activities is given below:

1. Requirements V&V
V&V Tasks: Traceability analysis; software requirements evaluation; interface analysis; criticality analysis; system V&V test plan generation; acceptance V&V test plan generation; configuration management assessment; hazard analysis; risk analysis.
Key Issues: Evaluates the correctness, completeness, accuracy, consistency, testability, and readability of software requirements. Evaluates the software interfaces. Identifies the criticality of each software function. Initiates the V&V test planning for the V&V system test. Initiates the V&V test planning for the V&V acceptance test. Ensures completeness and adequacy of the SCM process. Identifies potential hazards based on the product data during the specified development activity. Identifies potential risks based on the product data during the specified development activity.

2. Design V&V
V&V Tasks: Traceability analysis; software design evaluation; interface analysis; criticality analysis; component V&V test plan generation and verification; integration V&V test plan generation and verification; hazard analysis; risk analysis.
Key Issues: Evaluates software design modules for correctness, completeness, accuracy, consistency, testability, and readability. Initiates the V&V test planning for the V&V component test. Initiates the V&V test planning for the V&V integration test.

3. Implementation V&V
V&V Tasks: Traceability analysis; source code and source code documentation evaluation; interface analysis; criticality analysis; V&V test case generation and verification; V&V test procedure generation and verification; component V&V test execution and verification; hazard analysis; risk analysis.
Key Issues: Verifies the correctness, completeness, consistency, accuracy, testability, and readability of source code.

4. Test V&V
V&V Tasks: Traceability analysis; acceptance V&V test procedure generation and verification; integration V&V test execution and verification; system V&V test execution and verification; acceptance V&V test execution and verification.

5. Maintenance V&V
V&V Tasks: SVVP (software verification and validation plan) revision; proposed change assessment; anomaly evaluation; criticality analysis; migration assessment; retirement assessment; hazard analysis; risk analysis.
Key Issues: Modifies the SVVP. Evaluates the effect on software of the operation anomalies. Verifies the correctness of software when migrated to a different operational environment. Ensures that the existing system continues to function correctly when specific software elements are retired.

2.7. PROOF OF CORRECTNESS (FORMAL VERIFICATION)
A proof of correctness is a mathematical proof that a computer
program or a part thereof will, when executed, yield correct results,
i.e., results fulfilling specific requirements. Before proving a program
correct, the theorem to be proved must, of course, be formulated.
Hypothesis: The hypothesis of such a correctness theorem is typically
a condition that the relevant program variables must satisfy
immediately “before” the program is executed. This condition is called
the precondition.
Thesis: The thesis of the correctness theorem is typically a condition
that the relevant program variables must satisfy immediately “after”
execution of the program. This latter condition is called the
postcondition.
So, the correctness theorem is stated as follows:
“If the condition, V, is true before execution of the program, S,
then the condition, P, will be true after execution of S.”
where V is the precondition and P is the postcondition.
Notation: Such a correctness theorem is usually written as {V} S
{P}, where V, S, and P have been explained above.
By program variables we broadly mean input and output data,
e.g., data entered via a keyboard, displayed on a screen, or printed on
paper. Any externally observable aspect of the program’s execution
may be covered by the precondition and postcondition.
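The notation {V} S {P} can be illustrated informally with runtime assertions. The Python sketch below is a hypothetical example (not a formal proof): it checks the precondition before a small fragment S executes and the postcondition after it.

def increment_abs(x: int) -> int:
    assert x >= 0            # precondition V: x >= 0 holds before S executes
    y = x + 1                # program fragment S
    assert y > 0             # postcondition P: y > 0 holds after S executes
    return y

print(increment_abs(5))      # 6; both conditions hold for this input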

2.8. SIMULATION AND PROTOTYPING


Simulation and prototyping are techniques for analyzing the expected
behavior of a product. There are many approaches to constructing
simulations and prototypes that are well documented in literature.
For V&V purposes, simulations and prototypes are normally used
to analyze requirements and specifications to ensure that they reflect
the user’s needs. Because they are executable, they offer additional
insight into the completeness and correctness of these documents.
Simulations and prototypes can also be used to analyze predicted
product performance, especially for candidate product designs, to
ensure that they conform to the requirements.
It is important to note that the utilization of simulation and
prototyping as a V&V technique requires that the simulations and
prototypes themselves be correct. Thus, the utilization of these
techniques requires an additional level of V&V activity.

2.9. REQUIREMENTS TRACING


“It is a technique for ensuring that the product, as well as the testing
of the product, addresses each of its requirements.” The usual
approach to performing requirements tracing uses matrices.
a. One type of matrix maps requirements to software modules.
Construction and analysis of this matrix can help ensure that all
requirements are properly addressed by the product and that the
product does not have any superfluous capabilities. System
verification diagrams are another way of analyzing
requirements/modules traceability.
b. Another type of matrix maps requirements to test cases.
Construction and analysis of this matrix can help ensure that all
requirements are properly tested.
c. A third type of matrix maps requirements to their evaluation
approach. The evaluation approaches may consist of various levels
of testing, reviews, simulations, etc.

The requirements/evaluation matrix ensures that all requirements
will undergo some form of V&V. Requirements tracing can be applied
for all the products of the software evolution process.
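A minimal sketch of matrix type (b) above, mapping requirements to test cases, might look like the following; the requirement and test-case identifiers are hypothetical.

requirements = ["R1", "R2", "R3"]
trace = {
    "R1": ["TC-01", "TC-02"],
    "R2": ["TC-03"],
    "R3": [],                  # no test case yet, flagged below
}

# Analysis of the matrix: every requirement without a test case is reported.
untested = [r for r in requirements if not trace.get(r)]
print("Requirements with no test case:", untested)   # ['R3']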
2.10. SOFTWARE V&V PLANNING (SVVP)
The development of a comprehensive V&V plan is essential to the
success of a project. This plan must be developed early in the project.
Depending on the development approach followed, multiple levels of
test plans may be developed corresponding to various levels of V&V
activities. IEEE 83b has documented the guidelines for the contents of
system, software, build, and module test plans.
The steps for developing an SVVP are listed below:

Step 1: Identification of V&V Goals


V&V goals must be identified from the requirements and
specifications. These goals must address those attributes of the product
that correspond to its user expectations. These goals must be
achievable taking into account both theoretical and practical
limitations.

Step 2: Selection of V&V Techniques


Once Step 1 (above) is finished, we must select specific techniques for
each of the products that evolves during SDLC. These are given
below:
a. During the Requirements Phase: The applicable techniques for
accomplishing the V&V objectives for requirements are—technical
reviews, prototyping, and simulations. The review process is often
called a system requirement review (SRR).
b. During the Specifications Phase: The applicable techniques for
this phase are technical reviews, requirements tracing, prototyping,
and simulations. The requirements must be traced to the
specifications.
c. During the Design Phase: The techniques for accomplishing the
V&V objectives for designs are technical reviews, requirements
tracing, prototyping, simulation, and proof of correctness. We can
go for two types of design reviews:
i. High-level designs that correspond to an architectural view of
the product are often reviewed in a preliminary design review
(PDR).
ii. Detailed designs are addressed by a critical design review
(CDR).
d. During the Implementation Phase: The applicable techniques for
accomplishing V&V objectives for implementation are technical
reviews, requirements tracing, testing, and proof of correctness.
Various code review techniques such as walk throughs and
inspections exist.
At the source-code level, several static analysis techniques are
available for detecting implementation errors. The requirements
tracing activity is concerned with tracing requirements to source-
code modules. The bulk of the V&V activity for source code
consists of testing. Multiple levels of testing are usually performed.
At the module-level, proof-of-correctness techniques may be
applied, if applicable.
e. During the Maintenance Phase: Because changes describe
modifications to products, the same techniques used for V&V
during development may be applied during modification. Changes
to implementation require regression testing.

Step 3: Organizational Responsibilities

The organizational structure of a project is a key planning
consideration for project managers. An important aspect of this
structure is the delegation of V&V activities to various organizations.
This decision is based upon the size, complexity, and criticality of
the product. Four types of organizations are addressed. These
organizations reflect typical strategies for partitioning tasks to achieve
V&V goals for the product. These are:
a. Developmental Organization: This type of organization has the
following responsibilities:
1. To participate in technical reviews for all of the evolution
products.
2. To construct prototypes and simulations.
3. To prepare and execute test plans for unit and integration levels
of testing. This is called preliminary qualification testing (PQT).
4. To construct any applicable proofs of correctness at the module
level.
b. Independent Test Organization (ITO): This type of organization
has the following responsibilities:
1. It enables test activities to occur in parallel with those of
development.
2. It participates in all of the product’s technical reviews and
monitors PQT effort.
3. The primary responsibility of the ITO is the preparation and
execution of the product’s system test plan. This is sometimes
referred to as the formal qualification test (FQT).
The plan for this must contain the equivalent of a
requirements/evaluation matrix that defines the V&V approach
to be applied for each requirement.
4. If the product must be integrated with other products, this
integration activity is normally the responsibility of the ITO.
c. Software Quality Assurance (SQA) Organizations: The intent
here is to identify some activities for ensuring software quality.
Evaluations are the primary avenue for ensuring software quality.
Some evaluation types are given below:
i. Internal consistency of product
ii. Understandability of product
iii. Traceability to indicated documents
iv. Consistency with indicated documents
v. Appropriate allocation of sizing, timing, and resources
vi. Adequate test coverage of requirements
vii. Completeness of testing
viii. Completeness of regression testing
d. Independent V&V Contractor: An independent contractor may
be selected to do V&V. The scope of activities of this organization
differs from that of an ITO (discussed above) and an SQA
organization.

Step 4: Integrating V&V Approaches


Once a set of V&V objectives has been identified, an overall
integrated V&V approach must be determined. This approach involves
the integration of techniques applicable to various life cycle phases as
well as the delegation of these tasks among the project’s organizations.
The planning of this integrated V&V approach is very dependent upon
the nature of the product and the process used to develop it. Earlier, the
waterfall approach to testing was used; now an incremental approach
is used. Regardless of the approach selected, V&V progress must be
tracked. Requirements/evaluation matrices play a key role in this
tracking by providing a means of ensuring that each requirement of the
product is addressed.

Step 5: Problem Tracking


i. It involves documenting the problems encountered during the
V&V effort.
ii. Routing these problems to appropriate persons for correction.
iii. Ensuring that corrections have been done.
iv. Typical information to be collected includes:
a. When the problem occurred
b. Where the problem occurred
c. State of the system before occurrence
d. Evidence of the problem
e. Priority for solving problem

NOTE

This fifth step is very important when we go with OS testing.
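As an illustration only, the information items (a) through (e) listed above could be captured in a simple record such as the following sketch; the field names and sample values are hypothetical.

from dataclasses import dataclass

@dataclass
class ProblemReport:
    occurred_at: str        # (a) when the problem occurred
    location: str           # (b) where the problem occurred
    system_state: str       # (c) state of the system before occurrence
    evidence: str           # (d) evidence of the problem
    priority: str           # (e) priority for solving the problem

report = ProblemReport("2024-05-01 10:42", "login module",
                       "idle, no active sessions", "stack trace in app.log", "high")
print(report.priority)      # used when routing the problem for correction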


Step 6: Tracking Test Activities

SVVP must provide a mechanism for tracking testing effort, testing
cost, and testing quality. To do this the following data is collected:
a. Number of tests executed
b. Number of tests remaining
c. Time used
d. Resources used
e. Number of problems found

These data can be used to compare actual test progress against
scheduled progress.
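A minimal sketch, assuming the data items (a) through (e) are gathered per reporting period, shows how actual progress might be compared with scheduled progress; all figures are hypothetical.

collected = {
    "tests_executed": 120,
    "tests_remaining": 80,
    "time_used_hours": 300,
    "problems_found": 14,
}
scheduled_executed = 150     # hypothetical figure taken from the test schedule

# Fraction of planned tests executed so far, and the gap against the schedule.
progress = collected["tests_executed"] / (collected["tests_executed"] + collected["tests_remaining"])
print(f"Actual progress: {progress:.0%}, behind schedule by "
      f"{scheduled_executed - collected['tests_executed']} tests")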

Step 7: Assessment

It is important that the software V&V plan provide for the ability to
collect data that can be used to assess both the product and the
techniques used to develop it. This involves careful collection of error
and failure data, as well as analysis and classification of these data.

2.11. SOFTWARE TECHNICAL REVIEWS (STRs)


A review process can be defined as a critical evaluation of an object. It
includes techniques such as walk-throughs, inspections, and audits.
Most of these approaches involve a group meeting to assess a work
product.
Software technical reviews can be used to examine all of the
products of the software evolution process. In particular, they are
especially applicable and necessary for those products not yet in
machine-processable form, such as requirements or specifications
written in natural language.

2.11.1. RATIONALE FOR STRS


The main rationale behind STRs is as follows:
a. Error-Prone Software Development and Maintenance Process:
The complexity and error-prone nature of developing and
maintaining software should be demonstrated with statistics
depicting error frequencies for intermediate software products.
These statistics must also convey the message that errors occur
throughout the development process and that the later these errors
are detected, the higher the cost for their repair.

Summary:
i. Complexity of software development and maintenance
processes.
ii. Error frequencies for software work products.
iii. Error distribution throughout development phases.
iv. Increasing costs for error removal throughout the life cycle.

b. Inability to Test All Software: It is not possible to test all
software. Clearly, exhaustive testing of code is impractical. Current
technology also does not exist for testing a specification or high-
level design. The idea of testing a software test plan is also
bewildering. Testing also does not address quality issues or
adherence to standards, which are possible with review processes.

Summary:
i. Exhaustive testing is impossible.
ii. Intermediate software products are largely untestable.
c. Reviews are a Form of Testing: The degree of formalism,
scheduling, and generally positive attitude afforded to testing must
exist for software technical reviews if quality products are to be
produced.

Summary:
i. Objectives
ii. Human based versus machine based
iii. Attitudes and norms
d. Reviews are a Way of Tracking a Project: Through
identification of deliverables with well defined entry and exit
criteria and successful review of these deliverables, progress on a
project can be followed and managed more easily [Fagan]. In
essence, review processes provide milestones with teeth. This
tracking is very beneficial for both project management and
customers.

Summary:
i. Individual developer tracking
ii. Management tracking
iii. Customer tracking
e. Reviews Provide Feedback: Review processes provide valuable
feedback about the software and its development process.

Summary:
i. Product
ii. Process
f. Educational Aspects of Reviews: It includes benefits like a better
understanding of the software by the review participants that can be
obtained by reading the documentation as well as the opportunity
of acquiring additional technical skills by observing the work of
others.

Summary:
i. Project understanding
ii. Technical skills

2.11.2. TYPES OF STRS


A variety of STRs are possible on a project depending upon the
developmental model followed, the type of software product being
produced, and the standards which must be adhered to. These
developmental modes may be
i. Waterfall model
ii. Rapid prototyping
iii. Iterative enhancement
iv. Maintenance activity modeling
And the current standards may be
i. Military standards
ii. IEEE standards
iii. NBS standards
Reviews are classified as formal and informal reviews. The
differences between them are tabulated below.

Informal reviews:
i. It is a type of review that typically occurs spontaneously among peers.
ii. Reviewers have no responsibility.
iii. No review reports are generated.

Formal reviews:
i. It is a planned meeting.
ii. Reviewers are held accountable for their participation in the review.
iii. Review reports containing action items are generated and acted upon.

2.11.3. REVIEW METHODOLOGIES


There are three approaches to reviews
a. Walkthrough (or presentation reviews)
b. Inspection (or work product reviews)
c. Audits
a. Walkthroughs (or Presentation Reviews) Walkthroughs are
well defined by Yourdon. Walkthroughs can be viewed as
presentation reviews in which a review participant, usually the
developer of the software being reviewed, narrates a
description of the software and the remainder of the review
group provides their feedback throughout the presentation.
They are also known as presentation reviews because the bulk
of the feedback usually occurs for the material actually
presented at the level it is presented. Thus, advance preparation
on the part of reviewers is often not detectable during a
walkthrough. Walkthroughs suffer from these limitations as
well as the fact that they may lead to disorganized and
uncontrolled reviews.
Walkthroughs may also be stressful if the developer of the
software is conducting the walkthrough. This has led to many
variations such as having someone other than the developer
perform the walkthrough. It is also possible to combine
multiple reviews into a single review such as a combined
design and code walkthrough.
b. Inspections (or Work Product Reviews): It is a formal
approach. Inspections were first performed by Fagan at IBM.
They require a high degree of preparation for the review
participants but the benefits include a more systematic review
of the software and a more controlled and less stressed meeting.
There are many variations of inspections, but all include some
form of a checklist or agenda that guides the preparation of the
review participants and serves to organize the review meeting
itself. Inspections are also characterized by rigorous entry and
exit requirements for the work products being inspected.
An inspection process involves the collection of data that can
be used for feedback on the quality of the development and
review processes as well as to influence future validation
techniques on the software itself.
Inspections:
1. It is a five-step process that is well formalized.
2. It uses checklists for locating errors.
3. It is used to analyze the quality of the process.
4. This process takes a longer time.
5. It focuses on training of junior staff.

Walkthroughs:
1. It has fewer steps than inspections and is a less formal process.
2. It does not use a checklist.
3. It is used to improve the quality of the product.
4. It is a shorter process.
5. It focuses on finding defects.

c. Audits: Audits are an external type of review process. They serve
to ensure that the software is properly validated and that the
process is producing its intended results.

2.12. INDEPENDENT V&V CONTRACTOR (IV&V)


An independent V&V contractor may sometimes be used to ensure
independent objectivity and evaluation for the customer.
The use of a different organization, other than the software
development group, for software V&V is called independent
verification and validation (IV&V). Three types of independence are
usually required:
i. Technical Independence: It requires that members of the
IV&V team (organization or group) must not be personnel
involved in the development of the software. This team must
have some knowledge about the system design or some
engineering background enabling them to understand the
system. The IV&V team must not be influenced by the
development team when the IV&V team is learning about the
system requirements, proposed solutions for building the
system, and problems encountered. Technical independence is
crucial in the team’s ability to detect the subtle software
requirements, software design, and coding errors that escape
detection by development testing and SQA reviews.
The technical IV&V team may need to share tools from the
computer support environment (e.g., compilers, assemblers,
utilities) but should execute qualification tests on these tools to
ensure that the common tools themselves do not mask errors in
the software being analyzed and tested. The IV&V team uses or
develops its own set of test and analysis tools separate from the
developer’s tools whenever possible.
ii. Managerial Independence: It means that the responsibility for
IV&V belongs to an organization outside the contractor and
program organizations that develop the software. While
assurance objectives may be decided by regulations and project
requirements, the IV&V team independently decides the areas
of the software/system to analyze and test, techniques to
conduct the IV&V, schedule of tasks, and technical issues to act
on. The IV&V team provides its findings in a timely manner
simultaneously to both the development team and the systems
management.
iii. Financial Independence: It means that control of the IV&V
budget is retained in an organization outside the contractor and
program organization that develop the software. This
independence protects against diversion of funds or adverse
financial pressures or influences that may cause delay or
stopping of IV&V analysis and test tasks and timely reporting of
results.

2.13. POSITIVE AND NEGATIVE EFFECTS OF SOFTWARE V&V ON PROJECTS
Software V&V has some positive effects on a software project. These
are given below:
1. Better quality of software. This includes factors like completeness,
consistency, readability, and testability of the software.
2. More stable requirements.
3. More rigorous development planning, at least to interface with the
software V&V organization.
4. Better adherence by the development organization to programming
language and development standards and configuration
management practices.
5. Early error detection and reduced false starts.
6. Better schedule compliance and progress monitoring.
7. Greater project management visibility into interim technical quality
and progress.
8. Better criteria and results for decision making at formal reviews
and audits.

Some negative effects of software V&V on a software development
project include:
1. Additional project cost of software V&V (10–30% extra).
2. Additional interface involving the development team, user, and
software V&V organization. For example, attendance at software
V&V status meetings, anomaly resolution meetings.
3. Additional documentation requirement beyond the deliverable
products, if software V&V is receiving incremental program and
documentation releases.
4. Need to share computing facilities with and access to classified
data for the software V&V organization.
5. Lower development staff productivity if programmers and
engineers spend time explaining the system to software V&V
analysts, especially if explanations are not documented.
6. Increased paperwork to provide written responses to software V&V
error reports and other V&V data requirements. For example,
notices of formal review and audit meetings, updates to software
release schedule, and response to anomaly reports.
7. Productivity of development staff affected adversely in resolving
invalid anomaly reports.
Some steps can be taken to minimize the negative effects and to
maximize the positive effects of software V&V. To recover much of
the costs, software V&V is started early in the software requirements
phase. The interface activities for documentation, data, and software
deliveries between developer and software V&V groups should be
considered as an inherently necessary step required to evaluate
intermediate development products.
To offset unnecessary costs, software V&V must organize its
activities to focus on critical areas of the software so that it uncovers
critical errors for the development group and thereby results in
significant cost savings to the development process. To do this,
software V&V must use its criticality analysis to identify critical areas.
It must also scrutinize each discrepancy before release to ensure that no
false or inaccurate information is released, so that the development
group does not waste time on inaccurate or trivial reports.
To eliminate the need to have development personnel train the
software V&V staff, it is imperative that software V&V select
personnel who are experienced and knowledgeable about the software
and its engineering application. When software V&V engineers and
computer scientists reconstruct the specific details and idiosyncrasies
of the software as a method of reconfirming the correctness of
engineering and programming assumptions, they often find subtle
errors. They gain detailed insight into the development process and an
ability to spot critical errors early. The cost of the development
interface is minimal, and at times nonexistent, when the software
V&V assessment is independent.
Finally, the discrepancies detected in software and the
improvement in documentation quality resulting from error correction
suggests that software V&V costs are offset by having more reliable
and maintainable software. Many companies rely on their software
systems for their daily operations. Failure of the system, loss of data,
and release of or tampering with sensitive information may cause
serious work disruptions and serious financial impact. The costs of
software V&V are offset in many application areas by increased
reliability during operation and reduced costs of maintenance.
2.14. STANDARD FOR SOFTWARE TEST
DOCUMENTATION (IEEE829)
The IEEE829 standard for software test documentation describes
a set of basic software test documents. It defines the content and
form of each test document.
In this addendum, we give a summary of the structure of the most
important IEEE829 defined test documents.
This addendum is based on the course materials by Jukka Paakki
(and the IEEE829 standard).

Test Plan
1. Test-plan Identifier: Specifies the unique identifier assigned to the
test plan.
2. Introduction: Summarizes the software items and features to be
tested, provides references to the documents relevant for testing
(for example, overall project plan, quality assurance plan,
configuration management plan, applicable standards, etc.).
3. Test Items: Identifies the items to be tested including their
version/revision level, provides references to the relevant item
documentation (for example, requirements specification, design
specification, user’s guide, operations guide, installation guide,
etc.), and identifies items which are specifically excluded from
testing.
4. Features to be Tested: Identifies all software features and their
combinations to be tested, identifies the test-design specification
associated with each feature and each combination of features.
5. Features not to be Tested: Identifies all features and significant
combinations of features which will not be tested, and the reasons
for this.
6. Approach: Describes the overall approach to testing (the testing
activities and techniques applied, the testing of non-functional
requirements such as performance and security, the tools used in
testing); specifies completion criteria (for example, error
frequency or code coverage); identifies significant constraints
such as testing-resource availability and strict deadlines; serves
for estimating the testing efforts.
7. Item Pass/Fail Criteria: Specifies the criteria to be used to
determine whether each test item has passed or failed testing.
8. Suspension Criteria and Resumption: Specifies the criteria used to
suspend all or a portion of the testing activity on the test items
(for example, at the end of working day, due to hardware failure
or other external exception, etc.), specifies the testing activities
which must be repeated when testing is resumed.
9. Test Deliverables: Identifies the deliverable documents, typically
test-design specifications, test-case specifications, test-procedure
specifications, test-item transmittal reports, test logs, test-incident
reports, description of test-input data and test-output data, and
description of test tools.
10. Testing Tasks: Identifies the set of tasks necessary to prepare and
perform testing (for example, description of the main phases in
the testing process, design of verification mechanisms, plan for
maintenance of the testing environment, etc.).
11. Environmental Needs: Specifies both the necessary and desired
properties of the test environment (for example, hardware,
communications and systems software, software libraries, test
support tools, level of security for the test facilities, drivers and
stubs to be implemented, office or laboratory space, etc.).
12. Responsibilities: Identifies the groups of persons responsible for
managing, designing, preparing, executing, witnessing, checking,
and resolving the testing process; identifies the groups responsible
for providing the test items (Section 3) and the environmental
needs (Section 11).
13. Staffing and Training Needs: Specifies the number of testers by
skill level, and identifies training options for providing necessary
skills.
14. Schedule: Includes test milestones (those defined in the overall
project plan as well as those identified as internal ones in the
testing process), estimates the time required to do each testing
task, identifies the temporal dependencies between testing tasks,
specifies the schedule over calendar time for each task and
milestone.
15. Risks and Contingencies: Identifies the high-risk assumptions of
the test plan (lack of skilled personnel, possible technical
problems, etc.), specifies contingency plans for each risk (for
example, employment of additional testers, increase of night shift,
exclusion of some tests of minor importance, etc.).
16. Approvals: Specifies the persons who must approve this plan.

Test-Case Specification
1. Test-case Specification Identifier: Specifies the unique identifier
assigned to this test-case specification.
2. Test Items: Identifies and briefly describes the items and features
to be exercised by this test case, supplies references to the
relevant item documentation (for example, requirements
specification, design specification, user’s guide, operations guide,
installation guide, etc.).
3. Input Specifications: Specifies each input required to execute the
test case (by value with tolerance or by name); identifies all
appropriate databases, files, terminal messages, memory resident
areas, and external values passed by the operating system;
specifies all required relationships between inputs (for example,
timing).
4. Output Specifications: Specifies all of the outputs and features
(for example, response time) required of the test items, provides
the exact value (with tolerances where appropriate) for each
required output or feature.
5. Environmental Needs: Specifies the hardware and software
configuration needed to execute this test case, as well as other
requirements (such as specially trained operators or testers).
6. Special Procedural Requirements: Describes any special
constraints on the test procedures which execute this test case (for
example, special set-up, operator intervention, etc.).
7. Intercase Dependencies: Lists the identifiers of test cases which
must be executed prior to this test case, describes the nature of the
dependencies.

Test-Incident Report (Bug Report)


1. Bug-Report Identifier: Specifies the unique identifier assigned to
this report.
2. Summary: Summarizes the (bug) incident by identifying the test
items involved (with version/revision level) and by referencing
the relevant documents (for example, test-procedure specification,
test-case specification, test log).
3. Bug Description: Provides a description of the incident, so as to
correct the bug, repeat the incident, or analyze it offline.
Inputs
Expected results
Actual results
Date and time
Test-procedure step
Environment
Repeatability (whether repeated; whether occurring always,
occasionally, or just once).
Testers
Other observers
Additional information that may help to isolate and correct
the cause of the incident; for example, the sequence of
operational steps or history of user-interface commands that
lead to the (bug) incident.
4. Impact: Priority of solving the incident/correcting the bug (urgent,
high, medium, low).

Test-Summary Report
1. Test-Summary-Report Identifier: Specifies the unique identifier
assigned to this report.
2. Summary: Summarizes the evaluation of the test items, identifies
the items tested (including their version/revision level), indicates
the environment in which the testing activities took place,
supplies references to the documentation over the testing process
(for example, test plan, test-design specifications, test-procedure
specifications, test-item transmittal reports, test logs, test-incident
reports, etc.).
3. Variances: Reports any variances/deviations of the test items from
their design specifications, indicates any variances of the actual
testing process from the test plan or test procedures, specifies the
reason for each variance.
4. Comprehensiveness Assessment: Evaluates the
comprehensiveness of the actual testing process against the
criteria specified in the test plan, identifies features or feature
combinations which were not sufficiently tested and explains the
reasons for omission.
5. Summary of Results: Summarizes the success of testing (such as
coverage), identifies all resolved and unresolved incidents.
6. Evaluation: Provides an overall evaluation of each test item
including its limitations (based upon the test results and the item-
level pass/fail criteria).
7. Summary of Activities: Summarizes the major testing activities
and events, summarizes resource consumption (for example, total
staffing level, total person-hours, total machine time, and total
elapsed time used for each of the major testing activities).
8. Approvals: Specifies the persons who must approve this report
(and the whole testing phase).

Inspection Checklist for Test Plans


1. Have all materials required for a test plan inspection been
received?
2. Are all materials in the proper physical format?
3. Have all test plan standards been followed?
4. Has the testing environment been completely specified?
5. Have all resources been considered, both human and
hardware/software?
6. Have all testing dependencies been addressed (driver function,
hardware, etc.)?
7. Is the test plan complete, i.e., does it verify all of the
requirements? (For unit testing: does the plan test all functional
and structural variations from the high-level and detailed design?)
8. Is each script detailed and specific enough to provide the basis for
test case generation?
9. Are all test entrance and exit criteria sufficient and realistic?
10. Are invalid as well as valid input conditions tested?
11. Have all pass/fail criteria been defined?
12. Does the test plan outline the levels of acceptability for pass/fail
and exit criteria (e.g., defect tolerance)?
13. Have all suspension criteria and resumption requirements been
identified?
14. Are all items excluded from testing documented as such?
15. Have all test deliverables been defined?
16. Will software development changes invalidate the plan? (Relevant
for unit test plans only).
17. Is the intent of the test plan to show the presence of failures and
not merely the absence of failures?
18. Is the test plan complete, correct, and unambiguous?
19. Are there holes in the plan or is there overlap in the plan?
20. Does the test plan offer a measure of test completeness and test
reliability to be sought?
21. Are the test strategy and philosophy feasible?

Inspection Checklist for Test Cases


1. Have all materials required for a test case inspection been
received?
2. Are all materials in the proper physical format?
3. Have all test case standards been followed?
4. Are the functional variations exercised by each test case required
by the test plan? (Relevant for unit test case documents only.)
5. Are the functional variations exercised by each test case clearly
documented in the test case description? (Relevant for unit test
case documents only.)
6. Does each test case include a complete description of the expected
input and output or result?
7. Have all testing execution procedures been defined and
documented?
8. Have all testing dependencies been addressed (driver function,
hardware, etc.)?
9. Do the test cases accurately implement the test plan?
10. Are all data set definitions and setup requirements complete and
accurate?
11. Are operator instructions and status indicators complete, accurate,
and simple?
12. Have all intercase dependencies been identified and described?
13. Is each condition tested once and only once?
14. Have all test entrance and exit criteria been observed?
15. Are the test cases designed to show the presence of failure and not
merely the absence of failure?
16. Are the test cases designed to show omissions and extensions?
17. Are the test cases complete, correct, and unambiguous?
18. Are the test cases realistic?
19. Are the test cases documented so as to be 100% reproducible?
20. Has the entire testing environment been documented?
21. Has configuration management been set up, directories
established, and have case data and tools been loaded?

SUMMARY
1. Software engineering technology has matured sufficiently to be
addressed in approved and draft software engineering standards
and guidelines.
2. Business, industries, and government agencies spend billions
annually on computer software for many of their functions:
To manufacture their products.
To provide their services.
To administer their daily activities.
To perform their short- and long-term management functions.
3. As with other products, industries and businesses are discovering
that their increasing dependence on computer technology to
perform these functions, emphasizes the need for safe, secure,
reliable computer systems. They are recognizing that software
quality and reliability are vital to their ability to maintain their
competitiveness and high technology posture in the market place.
Software V&V is one of several methodologies that can be used
for building vital quality software.

MULTIPLE CHOICE QUESTIONS


1. Which one of the following is product related?
a. Quality control
b. Quality assurance
c. Both (a) and (b)
d. None of the above.
2. Howden claims that
a. No general purpose testing can be used to prove program
correctness.
b. There is no such thing as an absolute proof of correctness.
c. Both (a) and (b)
d. None of the above.
3. Static testing involves
a. Symbolic execution
b. Code walkthrough
c. Inspections
d. All of the above.
4. The role of V&V in SDLC is given in which of the following
standards
a. IEEE std. 1012
b. IEEE std. 9012
c. IEEE std. 83b
d. None of the above.
5. Detailed designs are addressed by a
a. Preliminary design review
b. Critical design review
c. Both (a) and (b)
d. None of the above.
6. A planned meeting is also known as a
a. Informal review
b. Formal review
c. Technical review
d. Dynamic review
7. Structured walkthrough is a
a. Dynamic testing technique
b. Formal static testing technique
c. Informal static testing
d. Acceptance testing technique
8. Which of the following is not a validation activity?
a. Unit testing
b. System testing
c. Acceptance testing
d. Walkthroughs
9. Which of the following is not a verification activity?
a. Acceptance testing
b. Inspections
c. Walkthroughs
d. Buddy check
10. During validation
a. Process is checked.
b. Product is checked.
c. Developer’s performance is evaluated.
d. The customer checks the product.

ANSWERS
1. a.
2. c.
3. d.
4. a.
5. b.
6. b.
7. c.
8. d.
9. a.
10. d.

CONCEPTUAL SHORT QUESTIONS WITH ANSWERS


Q. 1. What sort of errors are covered by regression testing?
Ans. Regression testing includes mainly four types of errors:
i. Data Corruption Errors: Due to sharing of data, these errors
result in side effects.
ii. Inappropriate Control Sequencing Errors: Due to the changes
in the execution sequences, these errors result in side effects.
For example, an attempt to remove an item from a queue
before it is placed into the queue.
iii. Resource Contention: Potential bottlenecks and deadlocks are
some examples of these types of errors.
iv. Performance Deficiencies: Timing errors and storage
utilization errors are some examples of these types of errors.
Q. 2. What is criticality analysis?
Ans. It is a method to locate and reduce high-risk problems. It is
performed at the beginning of the project. It identifies the
functions and modules that are required to implement critical
program functions or quality requirements like safety, security,
etc.
The steps of analysis are as follows:
Step 1: Develop a block diagram or control flow diagram of the
system and its software elements. Each block or control flow box
represents a system or software function (module).
Step 2: Trace each critical function or quality requirements
through the block or control flow diagram.
Step 3: Classify all traced software functions (modules) as
critical to either the proper execution of critical software
functions or the quality requirements.
Step 4: Focus additional analysis on these traced critical
software functions (modules).
Step 5: Repeat criticality analysis for each life-cycle
process/activity to determine whether the implementation details
shift the emphasis of the criticality.
Q. 3. What is traceability analysis?
Ans. Traceability analysis traces each software requirement back to
the system requirements. This is done to ensure that each
requirement correctly satisfies the system requirements. This
analysis will also determine whether any derived software
requirements are consistent with the original objectives, physical
laws, and technologies described in the system document.
Q. 4. What is interface analysis?
Ans. It is a detailed examination of the interface requirements
specifications. The evaluation here focuses on the interfaces
between the software and other hardware, the user, and external
software. Criticality analysis is continued and updated for the
software. Criticality is assigned to each software requirement.
When requirements are combined into functions, the combined
criticality of requirements form the criticality for the aggregate
function. The criticality analysis is updated periodically as
requirement changes are introduced, as such changes can cause a
function's criticality to increase or decrease depending on how
the revised requirements impact system criticality.
Q. 5. Explain the tool support for review processes.
Ans. As tools become available to perform some of the tasks
previously done by humans, the cost effectiveness of review
processes increases.
For example, a compiler can be used to detect syntax errors in
code, thus alleviating this task for the reviewers.
Another example is the design and specification consistency
checkers.
Q. 6. Suggest some techniques to find different types of errors.
Ans. Some of the techniques are discussed below:
i. Algorithm Analysis: It examines the logic and accuracy of
the software requirements by translating algorithms into some
language or structured format. The analysis involves
rederiving equations or evaluating the suitability of specific
numerical techniques. Algorithm analysis examines the
correctness of the equations and numerical techniques,
truncation and rounding effects, numerical precision of word
storage and variables, and data typing influences.
ii. Analytic Modeling: It provides performance evaluation and
capacity planning information on the software design. It
represents the program logic and processing in some kind of
model and analyzes it for efficiency.
iii. Control Flow Analysis: It is used to show the hierarchy of
main routines and their subfunctions. It checks that the
proposed control flow is free of problems like unreachable or
incorrect code.
iv. Database Analysis: It ensures that the database structure and
access methods are compatible with the logical design. It is
done on programs with significant data storage to ensure that
common data and variable regions are used consistently
between all calling routines, that data integrity is enforced
and no data or variable can be accidentally overwritten by
overflowing data tables, and that data typing and use are
consistent throughout the program.
Q. 7. What is desk checking?
Ans. It involves the examination of the software design or code by an
individual. It includes:
Looking over the code for defects.
Checking for correct procedure interfaces.
Reading the comments and comparing it to external
specifications and software design.
Q. 8. What are Petri-nets?
Ans. Petri nets model the system to ensure software design adequacy
against catastrophic failure. The system is modeled using conditions
and events represented by state transition diagrams (STDs). They can
be executed to see how
the software design will actually work under certain conditions.
Q. 9. What is program slicing?
Ans. Slicing is a program decomposition technique used to trace an
output variable back through the code to identify all code
statements relevant to a computation in the program.
Q. 10. What is test certification?
Ans. It ensures that the reported test results are the actual finding of
the tests. Test related tools, media, and documentation are
certified to ensure maintainability and repeatability of tests. This
technique is also used to show that the delivered software
product is identical to the software product that was subjected to
V&V. It is used in critical software systems to verify that the
required tests have been executed and that the delivered software
product is identical to the product subjected to software V&V.

REVIEW QUESTIONS
1. a. Discuss briefly the V&V activities during the design phase
of the software development process.
b. Discuss the different forms of IV&V.
2. a. What is the importance of technical reviews in software
development and maintenance life cycle?
b. Briefly discuss how walkthroughs help in technical
reviews.
3. a. Explain why validation is more difficult than verification.
b. Explain validation testing.
4. Explain the following with the help of an example:
‘Verification and Validation.’
5. Write short notes on V&V standards.
6. Write short notes on ‘Independent V&V contractor.’
7. What is an independent test organization? Why is it necessary?
8. Discuss the role of verification and validation in software
evolution. What are verification and validation objective
standards?
9. What is the difference between verification and validation?
10. Discuss V&V approaches, standards, and objectives.
11. What is formal testing?
12. a. What is a software audit?
b. Explain code walkthrough.
13. What are the different V&V activities performed at the coding
phase of the software development process?
14. Discuss in brief software verification and validation plan.
15. a. Why are technical reviews conducted?
b. Discuss a formal method for conducting technical reviews.
16. Discuss one testing technique used during the design phase of
SDLC.
17. Explain all the phases through which we can approach testing
using a sample software testing problem. Comment on
difficulty of testing.
18. Briefly explain code walkthrough and inspection.
19. During code review you detect errors whereas during code
testing you detect failures. Justify.
20. a. What are the testing activities performed during
requirement analysis phase of SDLC.
b. Briefly describe formal proof of correctness.
21. What is a requirements traceability matrix? Give its
importance.
CHAPTER 3

BLACK-BOX (OR FUNCTIONAL) TESTING TECHNIQUES
Inside this Chapter:
3.0. Introduction to Black-Box (or Functional Testing)
3.1. Boundary Value Analysis (BVA)
3.2. Equivalence Class Testing
3.3. Decision Table Based Testing
3.4. Cause-Effect Graphing Technique
3.5. Comparison on Black-Box (or Functional) Testing Techniques
3.6. Kiviat Charts

3.0. INTRODUCTION TO BLACK-BOX (OR FUNCTIONAL TESTING)
The term black-box refers to treating the software as a black box.
By treating it as a black box, we mean that the system internals or
source code are not checked at all. Testing is done from the customer's
viewpoint. The test engineer engaged in black-box testing only knows
the set of inputs and expected outputs and is unaware of how those
inputs are transformed into outputs by the software. We will now
discuss various techniques for performing black-box testing.

3.1. BOUNDARY VALUE ANALYSIS (BVA)


It is a black-box testing technique based on the observation that
defect density tends to be higher towards the boundaries of the input
domain. This happens for the following reasons:
i. Programmers usually are not able to decide whether they have
to use <= operator or < operator when trying to make
comparisons.
ii. Different terminating conditions of for-loops, while loops, and
repeat loops may cause defects to move around the boundary
conditions.
iii. The requirements themselves may not be clearly understood,
especially around the boundaries, thus causing even the
correctly coded program to not perform the correct way.
Strongly typed languages such as Ada and Pascal permit explicit
definition of variable ranges. Other languages such as COBOL,
FORTRAN, and C are not strongly typed, so boundary value testing is
more appropriate for programs coded in such languages.

3.1.1. WHAT IS BVA?


The basic idea of BVA is to use input variable values at their
minimum, just above the minimum, a nominal value, just below their
maximum, and at their maximum. That is, {min, min+, nom, max–,
max}. This is shown in Figure 3.1.
BVA is based upon a critical assumption that is known as single
fault assumption theory. According to this assumption, we derive the
test cases on the basis of the fact that failures are not due to a
simultaneous occurrence of two (or more) faults. So we derive test
cases by holding the values of all but one variable at their nominal
values and letting that variable assume its extreme values.
FIGURE 3.1 BVA Test Cases.

If we have a function of n variables, we hold all but one at their
nominal values and let the remaining variable assume the min, min+,
nom, max–, and max values, repeating this for each variable. Thus, for
a function of n variables, BVA yields (4n + 1) test cases.
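A minimal sketch of this single-fault scheme, assuming each variable is described only by its minimum and maximum, is given below; the helper name and the integer nominal value are illustrative choices, not part of the method's definition.

def bva_test_cases(ranges):
    # ranges: dict of variable name -> (min, max). Returns the (4n + 1) BVA cases.
    nominal = {v: (lo + hi) // 2 for v, (lo, hi) in ranges.items()}
    cases = [dict(nominal)]                       # the all-nominal case
    for v, (lo, hi) in ranges.items():
        for value in (lo, lo + 1, hi - 1, hi):    # min, min+, max-, max
            case = dict(nominal)
            case[v] = value
            cases.append(case)
    return cases

cases = bva_test_cases({"a": (1, 200), "b": (1, 200), "c": (1, 200)})
print(len(cases))        # 13 test cases for n = 3 variables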

3.1.2. LIMITATIONS OF BVA


1. Boolean and logical variables present a problem for boundary
value analysis.
2. BVA assumes the variables to be truly independent which is not
always possible.
3. BVA test cases have been found to be rudimentary because they
are obtained with very little insight and imagination.

3.1.3. ROBUSTNESS TESTING

Another variant of BVA is robustness testing. In BVA, we stay within
the legitimate boundary of our range. That is, we consider the
following values for testing:
{min, min+, nom, max–, max}
whereas in robustness testing, we try to cross these legitimate
boundaries as well. So, now we consider these values for testing:
{min–, min, min+, nom, max–, max, max+}
Again, with robustness testing, we can focus on exception
handling. With strongly typed languages, robustness testing may be
very awkward. For example, in PASCAL, if a variable is defined to be
within a certain range, values outside that range result in run-time
errors that abort normal execution.
For a program with n variables, robustness testing will yield (6n + 1)
test cases. So, we can draw a graph now (Figure 3.2).
FIGURE 3.2 Robustness Test Cases.

Each dot represents a test value at which the program is to be
tested. In robustness testing, we cross the legitimate boundaries of the
input domain. In the graph of Figure 3.2, we show this by dots that are
outside the range [a, b] of variable x1. Similarly, for variable x2, we
have crossed its legitimate boundary of [c, d].
This type of testing is quite common in electric and electronic
circuits.
Furthermore, this type of testing also works on single fault
assumption theory.
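A similar sketch for robustness testing, extending each variable's value set with the out-of-range values min– and max+, is given below; the variable names and ranges are hypothetical, as in the earlier BVA sketch.

def robustness_test_cases(ranges):
    # ranges: dict of variable name -> (min, max). Returns the (6n + 1) robustness cases.
    nominal = {v: (lo + hi) // 2 for v, (lo, hi) in ranges.items()}
    cases = [dict(nominal)]
    for v, (lo, hi) in ranges.items():
        for value in (lo - 1, lo, lo + 1, hi - 1, hi, hi + 1):   # min-, min, min+, max-, max, max+
            case = dict(nominal)
            case[v] = value
            cases.append(case)
    return cases

print(len(robustness_test_cases({"x1": (1, 10), "x2": (1, 10)})))   # 13 = 6*2 + 1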

3.1.4. WORST-CASE TESTING

If we reject our basic assumption of the single fault assumption theory
and focus on what happens when more than one variable has an
extreme value, we get the multiple fault assumption theory. In
electronic circuit analysis, this is called "worst-case analysis." We use
this idea here to generate worst-case test cases.
For each variable, we start with the five-element set that contains
the min, min+, nom, max–, and max values. We then take the
Cartesian product of these sets to generate test cases. This is shown in
Figure 3.3.
For a program with n-variables, 5n test cases are generated.
NOTE

Robust worst-case testing yields 7n test cases.

FIGURE 3.3 Worst-Case Test Cases.
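A minimal sketch of worst-case testing, taking the Cartesian product of each variable's five-element boundary set, is shown below; as before, the helper name and the integer nominal value are illustrative choices.

from itertools import product

def worst_case_test_cases(ranges):
    # ranges: dict of variable name -> (min, max). Returns the 5^n worst-case test cases.
    names = list(ranges)
    value_sets = [
        (lo, lo + 1, (lo + hi) // 2, hi - 1, hi)    # min, min+, nom, max-, max
        for lo, hi in ranges.values()
    ]
    return [dict(zip(names, combo)) for combo in product(*value_sets)]

cases = worst_case_test_cases({"a": (1, 200), "b": (1, 200), "c": (1, 200)})
print(len(cases))    # 125 = 5^3 test cases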

3.1.5. EXAMPLES WITH THEIR PROBLEM DOMAIN

3.1.5.1. TEST-CASES FOR THE TRIANGLE PROBLEM

Before we generate the test cases, first we need to give the problem
domain:
Problem Domain: “The triangle program accepts three integers, a, b,
and c, as input. These are taken to be the sides of a triangle. The
integers a, b, and c must satisfy the following conditions:

C1: 1 ≤ a ≤ 200    C4: a < b + c
C2: 1 ≤ b ≤ 200    C5: b < a + c
C3: 1 ≤ c ≤ 200    C6: c < a + b

The output of the program may be: Equilateral, Isosceles, Scalene,
or "NOT-A-TRIANGLE."

How to Generate BVA Test Cases?


We know that our range is [1, 200] where 1 is the lower bound and
200 is the upper bound. Also, we find that this program has three
inputs—a, b, and c. So, for our case
n=3
BVA yields (4n + 1) test cases, so we can say that the total
number of test cases will be (4 × 3 + 1) = 12 + 1 = 13.

Table 3.1 shows those 13 test cases.

TABLE 3.1 BVA test cases for triangle problem.

Please note that, as explained above, BVA yields (4n + 1) = 13 test
cases for this problem. The table, however, lists 15 test cases. Test
case ID numbers 8 and 13 are redundant, so we ignore them. However,
we do not ignore test case ID number 3, as we must consider at least
one test case out of these three. Obviously, it is mechanical work!
We can say that these 13 test cases are sufficient to test this
program using the BVA technique.
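For illustration, a sketch of the triangle program's decision logic, which can serve as a test oracle when the BVA test cases are executed, is given below; the function name and the sample inputs (varying a while b and c stay at their nominal value 100) are hypothetical choices consistent with conditions C1 through C6.

def triangle_type(a: int, b: int, c: int) -> str:
    # Range check (C1-C3) and triangle inequality (C4-C6).
    if not all(1 <= side <= 200 for side in (a, b, c)):
        return "NOT-A-TRIANGLE"
    if a >= b + c or b >= a + c or c >= a + b:
        return "NOT-A-TRIANGLE"
    if a == b == c:
        return "Equilateral"
    if a == b or b == c or a == c:
        return "Isosceles"
    return "Scalene"

# BVA-style values for a (min, min+, nom, max-, max) with b and c at nominal 100.
for a, b, c in [(1, 100, 100), (2, 100, 100), (100, 100, 100), (199, 100, 100), (200, 100, 100)]:
    print((a, b, c), "->", triangle_type(a, b, c))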

Question for Practice


1. Applying the robustness testing technique, how would you
generate the test cases for the triangle problem given above?

3.1.5.2. TEST CASES FOR NEXT DATE FUNCTION

Before we generate the test cases for the Next Date function, we must
know the problem domain of this program:
Problem Domain: Next Date is a function of three variables: month,
day, and year. It returns the date of the next day as output. It reads the
current date as the input date. The conditions are:
C1: 1 ≤ month ≤ 12
C2: 1 ≤ day ≤ 31
C3: 1900 ≤ year ≤ 2025
If any of conditions C1, C2, or C3 fails, then this function produces
an output “value of month not in the range 1...12.”
Because many invalid combinations of dates exist, this function
displays just one message: "Invalid Input Date."

Complexities in Next Date Function

A very common and popular complication occurs if the year is a leap
year. We have taken into consideration that there are 31 days in a
month. But what happens if a month has 30 days, or even 29 or 28
days? A year is called a leap year if it is divisible by 4, unless it is a
century year. Century years are leap years only if they are multiples of
400. So, 1992, 1996, and 2000 are leap years, while 1900 is not a leap
year.
Furthermore, in this Next Date problem, we find an example of
Zipf's law, which states that "80% of the activity occurs in 20% of
the space." Here also, much of the source code of the Next Date
function is devoted to leap year considerations.
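A minimal sketch of the leap-year rule just described (divisible by 4, except century years, which must be divisible by 400) is given below.

def is_leap_year(year: int) -> bool:
    # Century years are leap years only when divisible by 400.
    if year % 400 == 0:
        return True
    if year % 100 == 0:
        return False
    return year % 4 == 0

print([y for y in (1900, 1992, 1996, 2000) if is_leap_year(y)])   # [1992, 1996, 2000]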

How to Generate BVA Test Cases for This Problem?


The Next Date program takes date as input and checks it for validity.
If valid, it returns the next date as its output.
As we know, with the single fault assumption theory, (4n + 1) test
cases can be designed. Here also, n = 3. So, the total number of test
cases is (4 × 3 + 1) = 12 + 1 = 13.
The boundary value test cases are

So, we have applied BVA on our Next Date problem.

Question for Practice


1. Applying robustness testing, how would you generate the test cases
for the Next Date function given above?

3.1.5.3. TEST CASES FOR THE COMMISSION PROBLEM


Before we generate the test cases, we must formulate the problem
statement or domain for commission problem.

Problem Domain: A rifle salesperson sold rifle locks, stocks, and


barrels that were made by a gunsmith. Locks cost $45, stocks cost
$30, and barrels cost $25. This salesperson had to sell at least one
complete rifle per month, and the production limits were such that the
most the salesperson could sell in a month was 70 locks, 80 stocks,
and 90 barrels. The salesperson used to send the details of sold items
to the gunsmith. The gunsmith then computed the salesperson’s
commission as follows:
a. 10% on sales up to and including $1000
b. 15% of the next $800
c. 20% on any sales in excess of $1800
The commission program produced a monthly sales report that gave
the total number of locks, stocks, and barrels sold, the salesperson’s
total dollar sales and finally, the commission.
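
The commission schedule in the statement can be expressed as a small function. The sketch below is one reading of the slabs (10% up to $1000, 15% on the next $800, 20% beyond $1800) written in Python for illustration; it is not code from the book.

LOCK_PRICE, STOCK_PRICE, BARREL_PRICE = 45, 30, 25

def commission(locks, stocks, barrels):
    sales = locks * LOCK_PRICE + stocks * STOCK_PRICE + barrels * BARREL_PRICE
    if sales <= 1000:
        comm = 0.10 * sales
    elif sales <= 1800:
        comm = 0.10 * 1000 + 0.15 * (sales - 1000)
    else:
        comm = 0.10 * 1000 + 0.15 * 800 + 0.20 * (sales - 1800)
    return sales, comm

print(commission(1, 1, 1))    # minimal monthly sale: (100, 10.0)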

How to Generate BVA Test Cases for this Problem?


We have 3 inputs in this program. So, we need (4n + 1) = 4 * 3 + 1 =
12 + 1 = 13 test cases to test this program.
The boundary value test cases are listed in the table below. We can
also see that the monthly sales are limited as follows:
1 ≤ locks ≤ 70
1 ≤ stocks ≤ 80
1 ≤ barrels ≤ 90
Out of these 15 test cases, 2 are redundant. So, 13 test cases are
sufficient to test this program.

3.1.6. GUIDELINES FOR BVA


1. The choice between normal and robust values, combined with the
single-fault versus multiple-fault assumption, gives the four
boundary-based methods and results in better testing. These
methods can be applied to both the input and the output domain of
any program.
2. Robustness testing is a good choice for testing internal variables.
3. Keep in mind that you can create extreme boundary results from
non-extreme input values.
3.2. EQUIVALENCE CLASS TESTING
The use of equivalence classes as the basis for functional testing has
two motivations:
a. We want exhaustive testing
b. We want to avoid redundancy
Neither of these goals is met by the BVA technique, as we can see
massive redundancy in its tables of test cases.

FIGURE 3.4 Equivalence Class Partitioning.

In this technique, the input and the output domain is divided into a
finite number of equivalence classes. Then, we select one
representative of each class and test our program against it. It is
assumed by the tester that if one representative from a class detects an
error, every other member of that class would detect the same error;
conversely, if this single representative test case does not detect any
error, then no other test case of this class is expected to detect one. In
this method, we consider both valid and invalid input domains. The
system is still treated as a black box, meaning that we are not concerned
with its internal logic.
The idea of equivalence class testing is to identify test cases by
using one element from each equivalence class. If the equivalence
classes are chosen wisely, the potential redundancy among test cases
can be reduced.
For example, in our triangle problem, we would certainly have a
test case for an equilateral triangle and we might pick the triple (10,
10, 10) as inputs for a test case. If this is so then it is obvious that there
is no sense in testing for inputs like (8, 8, 8) and (100, 100, 100). Our
intuition tells us that these would be “treated the same” as the first test
case. Thus, they would be redundant. The key and the craftsmanship
lies in the choice of the equivalence relation that determines the
classes.

Four types of equivalence class testing are discussed below:


1. Weak normal equivalence class testing
2. Strong normal equivalence class testing
3. Weak robust equivalence class testing
4. Strong robust equivalence class testing

We will discuss these one by one.

3.2.1. WEAK NORMAL EQUIVALENCE CLASS TESTING


The word “weak” means single fault assumption. This type of testing
is accomplished by using one variable from each equivalence class in
a test case. We would, thus, end up with the weak equivalence class
test cases as shown in Figure 3.5.
Each dot in Figure 3.5 indicates a test case: one representative
element drawn from each class. In fact, we will always have the same
number of weak equivalence class test cases as there are classes in the
partition.

FIGURE 3.5 Weak Normal Equivalence Class Test Cases.

3.2.2. STRONG NORMAL EQUIVALENCE CLASS TESTING


This type of testing is based on the multiple fault assumption theory.
So, now we need test cases from each element of the Cartesian product
of the equivalence classes, as shown in Figure 3.6.

FIGURE 3.6 Strong Normal Equivalence Class Test Cases.

The pattern of test cases resembles the truth tables used in digital
logic. The Cartesian product guarantees that we have a notion of
“completeness” in two ways:
a. We cover all equivalence classes.
b. We have one of each possible combination of inputs.
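
The difference between the weak and strong normal forms is easy to see in code. A minimal Python sketch, in which each equivalence class is represented simply by a list of candidate values (the classes and values here are illustrative only):

from itertools import product

# Hypothetical equivalence classes for two variables, one list per class.
x_classes = [[5], [15], [25]]      # three classes for x
y_classes = [[2], [8]]             # two classes for y

# Weak normal: one representative per class, classes paired off until exhausted.
weak = [(xs[0], y_classes[i % len(y_classes)][0])
        for i, xs in enumerate(x_classes)]
print(weak)          # 3 test cases: as many as the largest partition has classes

# Strong normal: one test case per element of the Cartesian product of the classes.
strong = [(xs[0], ys[0]) for xs, ys in product(x_classes, y_classes)]
print(strong)        # 3 x 2 = 6 test cases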

3.2.3. WEAK ROBUST EQUIVALENCE CLASS TESTING


The name for this form of testing is counterintuitive and oxymoronic.
The word “weak” means single fault assumption theory and the word
“robust” refers to invalid values. The test cases resulting from this
strategy are shown in Figure 3.7.
Two problems occur with robust equivalence testing. They are
listed below:
1. Very often the specification does not define what the expected
output for an invalid test case should be. Thus, testers spend a lot
of time defining expected outputs for these cases.
2. Also, strongly typed languages like Pascal and Ada eliminate the
need to consider invalid inputs. Traditional equivalence testing is a
product of the time when languages such as FORTRAN, C, and
COBOL were dominant, where this type of error was common.

FIGURE 3.7 Weak Robust Equivalence Class Test Cases.

3.2.4. STRONG ROBUST EQUIVALENCE CLASS TESTING


This form of equivalence class testing is neither counter intuitive nor
oxymoronic but is redundant. As explained earlier “robust” means
consideration of invalid values and “strong” means multiple fault
assumption. We obtain the test cases from each element of the
Cartesian product of all the equivalence classes. This is shown in
Figure 3.8.
Here we have 8 robust (invalid) test cases and 12 strong (valid) test
cases, each represented by a dot. So, in total we have 20 test cases
(represented as 20 dots) using this technique.
FIGURE 3.8 Strong Robust Equivalence Class Test Cases.

3.2.5. SOLVED EXAMPLES

3.2.5.1. EQUIVALENCE CLASS TEST CASES FOR THE TRIANGLE PROBLEM

As stated in the problem definition earlier, we note that in our triangle


problem four possible outputs can occur:
a. NOT-A-TRIANGLE
b. Scalene
c. Isosceles
d. Equilateral
We can use these to identify output (range) equivalence classes as
follows:
O1 = {<a, b, c>: the triangle is equilateral}
O2 = {<a, b, c>: the triangle is isosceles}
O3 = {<a, b, c>: the triangle is scalene}
O4 = {<a, b, c>: sides a, b, and c do not form a triangle}
Now, we apply these four techniques of equivalence class
partitioning one by one to this problem.
a. The four weak normal equivalence class test cases are:
b. Because no valid subintervals of variables a, b, and c exist, so the
strong normal equivalence class test cases are identical to the weak
normal equivalence class test cases.
c. Considering the invalid values for a, b, and c yields the following
additional weak robust equivalence class test cases:

d. The strong robust equivalence class test cases are:

Please note that the expected outputs describe the invalid input
values thoroughly.
3.2.5.2. EQUIVALENCE CLASS TEST CASES FOR NEXT DATE FUNCTION

The actual craft of choosing the test cases lies in this example. Recall
that Next Date is a function of three variables— month (mm), day
(dd), and year (yyyy). Assume that their ranges are

1 ≤ month ≤ 12
1 ≤ day ≤ 31
1812 ≤ year ≤ 2012

So, based on valid values, the equivalence classes are:


M1 = {month: 1 ≤ month ≤ 12}
D1 = {day: 1 ≤ day ≤ 31}
Y1 = {year: 1812 ≤ year ≤ 2012}
And the invalid equivalence classes are:
M2 = {month: month < 1}
M3 = {month: month > 12}
D2 = {day: day < 1}
D3 = {day: day > 31}
Y2 = {year: year < 1812}
Y3 = {year: year > 2012}
a. & b. Because the number of valid classes equals the number of
independent variables, only one weak normal equivalence
class test case occurs and it is identical to the strong normal
equivalence class test case:

So, we get this test case on the basis of valid classes – M1, D1, and
Y1 above.
c. Weak robust test cases are given below:
So, we get 7 test cases based on the valid and invalid classes of the
input domain.
d. Strong robust equivalence class test cases are given below:

Modified Equivalence Class for this Problem


We need the modified classes as we know that at the end of a month,
the next day is 1 and the month is incremented. At the end of a year,
both the day and the month are reset to 1 and the year is also
incremented. Finally, the problem of leap year makes determining the
last day of a month interesting. With all this in mind, we postulate the
following equivalence classes:
M1 = {month: month has 30 days}
M2 = {month: month has 31 days}
M3 = {month: month is February}
D1 = {day: 1 ≤ day ≤ 28}
D2 = {day: day = 29}
D3 = {day: day = 30}
D4 = {day: day = 31}
Y1 = {year: year = 2000}
Y2 = {year: year is a leap year}
Y3 = {year: year is a common year}
So, now what will be the weak equivalence class test cases?
As done earlier, the inputs are mechanically selected from the
approximate middle of the corresponding class:

This mechanical selection of input values makes no consideration


of our domain knowledge and thus, we have two impossible dates.
This will always be a problem with “automatic” test case generation
because all of our domain knowledge is not captured in the choice of
equivalence classes.
The strong normal equivalence class test cases for the revised
classes are:
So, three month classes, four day classes, and three year classes
result in 3 × 4 × 3 = 36 strong normal equivalence class test cases.
Furthermore, adding two invalid classes for each variable will result in
150 strong robust equivalence class test cases.
It is difficult to show these 150 classes here.
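
They can, however, be counted mechanically. A small Python sketch, using one representative label per class (the class names stand in for concrete values and are only an illustration):

from itertools import product

month_classes = ["M1", "M2", "M3"]            # 30-day, 31-day, February
day_classes   = ["D1", "D2", "D3", "D4"]      # 1-28, 29, 30, 31
year_classes  = ["Y1", "Y2", "Y3"]            # 2000, leap year, common year

strong_normal = list(product(month_classes, day_classes, year_classes))
print(len(strong_normal))     # 3 * 4 * 3 = 36

# Adding two invalid classes per variable (below and above the valid range):
strong_robust = list(product(month_classes + ["M-lo", "M-hi"],
                             day_classes   + ["D-lo", "D-hi"],
                             year_classes  + ["Y-lo", "Y-hi"]))
print(len(strong_robust))     # 5 * 6 * 5 = 150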

3.2.5.3. EQUIVALENCE CLASS TEST CASES FOR THE COMMISSION PROBLEM


The valid classes of the input variables are as follows:
L1 = {locks: 1 ≤ locks ≤ 70}
L2 = {locks = –1}
S1 = {stocks: 1 ≤ stocks ≤ 80}
B1 = {barrels: 1 ≤ barrels ≤ 90}

The corresponding invalid classes of the input variables are as follows:


L3 = {locks: locks = 0 or locks < –1}
L4 = {locks: locks > 70}
S2 = {stocks: stocks < 1}
S3 = {stocks: stocks > 80}
B2 = {barrels: barrels < 1}
B3 = {barrels: barrels > 90}
a. & b. As the number of valid classes is equal to the number of
independent variables, so we have exactly one weak normal
equivalence class test case and again, it is identical to the strong
normal equivalence class test case. It is given in the following
table.

c. Also, we have seven weak robust test cases as given below:

d. And finally, the strong robust equivalence class test cases are as
follows:
3.2.6. GUIDELINES FOR EQUIVALENCE CLASS TESTING
The following are guidelines for equivalence class testing:
1. The weak forms of equivalence class testing (normal or robust) are
not as comprehensive as the corresponding strong forms.
2. If the implementation language is strongly typed and invalid values
cause run-time errors then it makes no sense to use the robust form.
3. If error conditions are a high priority, the robust forms are
appropriate.
4. Equivalence class testing is appropriate when input data is defined
in terms of intervals and sets of discrete values. This is certainly
the case when system malfunctions can occur for out-of-limit
variable values.
5. Equivalence class testing is strengthened by a hybrid approach with
boundary value testing (BVA).
6. Equivalence class testing is used when the program function is
complex. In such cases, the complexity of the function can help
identify useful equivalence classes, as in the next date problem.
7. Strong equivalence class testing makes a presumption that the
variables are independent and the corresponding multiplication of
test cases raises issues of redundancy. If any dependencies occur,
they will often generate “error” test cases, as shown in the next
date function.
8. Several tries may be needed before the “right” equivalence relation
is established.
9. The difference between the strong and weak forms of equivalence
class testing is helpful in the distinction between progression and
regression testing.

3.3. DECISION TABLE BASED TESTING


Of all the functional testing methods, those based on decision tables
are the most rigorous because decision tables enforce logical rigor.

3.3.1. WHAT ARE DECISION TABLES?


Decision tables are a precise and compact way to model complicated
logic. They are ideal for describing situations in which a number of
combinations of actions are taken under varying sets of conditions.

FIGURE 3.9 Structure of Decision Table.

It is another popular black-box testing technique. A decision table


has four portions:
a. Stub portion
b. Entry portion
c. Condition portion
d. Action portion
A column in the entry portion is a rule. Rules indicate which
actions are taken for the conditional circumstances indicated in the
condition portion of the rule. Decision tables in which all conditions
are binary are called limited entry decision tables. If conditions are
allowed to have several values, the resulting tables are called extended
entry decision tables.
To identify test cases with decision tables, we follow certain steps:
Step 1. For a module identify input conditions (causes) and action
(effect).
Step 2. Develop a cause-effect graph.
Step 3. Transform this cause-effect graph, so obtained in step 2 to
a decision table.
Step 4. Convert decision table rules to test cases. Each column of
the decision table represents a test case. That is,

Number of Test Cases = Number of Rules

For a limited entry decision table, if n conditions exist, there must
be 2^n rules.

3.3.2. ADVANTAGES, DISADVANTAGE, AND APPLICATIONS OF DECISION


TABLES
Advantages of Decision Tables
1. This type of testing also works iteratively. The table that is drawn
in the first iteration acts as a stepping stone to derive new decision
table(s) if the initial table is unsatisfactory.
2. These tables guarantee that we consider every possible
combination of condition values. This is known as its
“completeness property.” This property promises a form of
complete testing as compared to other techniques.
3. Decision tables are declarative. There is no particular order for
conditions and actions to occur.
Disadvantages of Decision Tables
Decision tables do not scale up well. We need to “factor” large tables
into smaller ones to remove redundancy.

Applications of Decision Tables


This technique is useful for applications characterized by any of the
following:
a. Prominent if-then-else logic.
b. Logical relationships among input variables.
c. Calculations involving subsets of the input variables.
d. Cause-and-effect relationships between inputs and outputs.
e. High cyclomatic complexity.

Technique: To identify test cases with decision tables, we interpret


conditions as inputs and actions as output. The rules are interpreted as
test cases. Because the decision table can mechanically be forced to be
complete, we know we have a comprehensive set of test cases.
Example of a Decision Table: The triangle problem

FIGURE 3.10 Example of Decision Table.


Each “-” (hyphen) in the decision table represents a “don’t care”
entry. Use of such entries has a subtle effect on the way in which
complete decision tables are recognized. For limited entry decision
tables, if n conditions exist, there must be 2^n rules. When don’t care
entries indicate that the condition is irrelevant, we can develop a rule
count as follows:
Rule 1. Rules in which no “don’t care” entries occur count as one
rule.
Note that each column of a decision table represents a rule and the
number of rules is equal to the number of test cases.
Rule 2. Each “don’t care” entry in a rule doubles the count of that
rule.
Note that in this decision table we have 6 conditions (C1—C6).
Therefore,
n = 6
Also, we can have 2^n entries, i.e., 2^6 = 64 entries. Now we
establish the rule and the rule count for the above decision table.

FIGURE 3.11 Decision Table With Rule Counts.


From the previous table we find that the total rule count is 64. And we
have already said that 2^n = 2^6 = 64. So, both are 64.
The question, however, is why the rule count is 32 for Rule-1 (or
column-1)?
We find that there are 5 don’t cares in Rule-1 (or column-1) and
hence its count is 2^5 = 32. Similarly, for Rule-2 it is 2^4 = 16, and for
Rule-3 it is 2^3 = 8. However, from Rule-4 through Rule-11, the number
of don’t care entries is 0 (zero), so the rule count is 2^0 = 1 for each of
these columns. Summing the rule counts of all columns (R1–R11), we
get a total of 64.
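
The rule-count bookkeeping can be automated. A minimal Python sketch, where each rule is written as a list of condition entries and "-" marks a don't care entry (the table below is a hypothetical limited entry table with the same counts as above):

def rule_count(rule):
    # Each don't care entry doubles the count of the rule.
    return 2 ** rule.count("-")

rules = [
    ["F", "-", "-", "-", "-", "-"],    # Rule 1: 5 don't cares -> 32
    ["T", "F", "-", "-", "-", "-"],    # Rule 2: 4 don't cares -> 16
    ["T", "T", "F", "-", "-", "-"],    # Rule 3: 3 don't cares ->  8
] + [["T", "T", "T", x, y, z]          # Rules 4-11: fully specified -> 1 each
     for x in "TF" for y in "TF" for z in "TF"]

counts = [rule_count(r) for r in rules]
print(counts, sum(counts))             # [32, 16, 8, 1, ..., 1], total 64 = 2**6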
Many times some problems arise with these decision tables. Let us
see how.
Consider the following example of a redundant decision table:

FIGURE 3.12 Redundant Decision Table.

Please note that the action entries in Rule-9 and Rules 1–4 are
NOT identical. It means that if the decision table were to process a
transaction in which C1 is true and both C2 and C3 are false, both rules
4 and 9 apply. We observe two things:
1. Rules 4 and 9 are inconsistent because the action sets are different.
2. The whole table is non-deterministic because there is no way to
decide whether to apply Rule-4 or Rule-9.
The bottom line for testers: take care when don’t care entries are
used in a decision table.
3.3.3. EXAMPLES

3.3.3.1. TEST CASES FOR THE TRIANGLE PROBLEM USING DECISION TABLE
BASED TESTING TECHNIQUE
We have already studied the problem domain for the famous triangle
problem in previous chapters. Next we apply the decision table based
technique on the triangle problem. The following are the test cases:

FIGURE 3.13 Test Cases For Triangle Problem.

So, we get a total of 11 functional test cases: three are impossible
cases, three fail to satisfy the triangle property, one satisfies the
equilateral triangle property, one satisfies the scalene triangle
property, and three give an isosceles triangle.

3.3.3.2. TEST CASES FOR NEXT DATE FUNCTION


In the previous technique of equivalence partitioning we found that
certain logical dependencies exist among variables in the input
domain. These dependencies are lost in a Cartesian product. The
decision table format allows us to use the notion of “impossible
action” to denote impossible combinations of conditions. Thus, such
dependencies are easily shown in decision tables.
From the problem domain for the Next Date function, we know that
there are some critical situations, such as the treatment of leap years,
special treatment of the year 2000 (Y2K), and special treatment of the
month of December and of a 28-day month. So, we go through three
tries before we derive its test cases.
First try: Special treatment to leap years.
Second try: Special treatment to year = 2000.
Third try: Special treatment to December month and days = 28.
First try: Special treatment to leap years.
The art of testing and the actual craftsmanship lies in identifying
appropriate conditions and actions. Considering the following
equivalence classes again:
M1 = {Month: Month has 30 days}
M2 = {Month: Month has 31 days}
M3 = {Month: Month is February}
D1 = {day: 1 ≤ day ≤ 28}
D2 = {day: day = 29}
D3 = {day: day = 30}
D4 = {day: day = 31}
Y1 = {year: year is a leap year}
Y2 = {year: year is not a leap year}
Based on these classes, we draw the decision table:
FIGURE 3.14 First Try Decision Table.

Herein, we have 3 × 4 × 2 = 24 elements and 2^n = 2^8 = 256 entries.


Many of the rules would be impossible (note that the classes Y1 and
Y2 collapse into one condition C8). Some of these rules are impossible
because
a. there are too many days in a month.
b. they cannot happen in a non-leap year.
c. they compute the next date.
Second try: Special treatment given to year 2000.
As we know, the year 2000 is a leap year. Hence, we must give special
treatment to this year by creating a new class-Y1 as follows:

M1 = {month: month has 30 days}


M2 = {month: month has 31 days}
M3 = {month: month is February}
D1 = {day: 1 ≤ day ≤ 28}
D2 = {day: day = 29}
D3 = {day: day = 30}
D4 = {day: day = 31}
Y1 = {year: year = 2000}
Y2 = {year: year is a leap year}
Y3 = {year: year is a common year}
This results in 3 × 4 × 3 = 36 rules that represents the Cartesian
product of 3 month classes (M1.... M3), 4 day classes (D1....D4), and 3
year classes.
So, after second try, our decision table is
Why is the rule count value 3 in column 1?
In column 1 or Rule 1 of this decision table, we have 3
possibilities with don’t care:

{M1, D1, Y1}


{M1, D1, Y2}
{M1, D1, Y3}
i.e., with “–” or don’t care we have either Y1 or Y2 or Y3.
Also note that in Rule 8, we have three impossible conditions,
shown as “?” (question mark). Also, we find that this table has certain
problems with December month. We will fix these next in the third
try.

Third try: Special treatment to December month and to number of


days = 28.
FIGURE 3.15 Second Try Decision Table With 36 Rule Count.

We know that we have serious problems with the last day of the
last month, i.e., December, where the month must change from 12 to 1.
So, we modify our classes as follows:
M1 = {month: month has 30 days}
M2 = {month: month has 31 days except December}
M3 = {month: month is December}
M4 = {month: month is February}
D1 = {day: 1 ≤ day ≤ 27}
D2 = {day: day = 28}
D3 = {day: day = 29}
D4 = {day: day = 30}
D5 = {day: day = 31}
Y1 = {year: year is a leap year}
Y2 = {year: year is a common year}

The Cartesian product of these classes contains 40 elements. Here, we have


a 22-rule decision table. This table gives a clearer picture of the Next
Date function than does the 36-rule decision table and is given below:
In this table, the first five rules deal with 30-day months. Notice
that the leap year considerations are irrelevant. Rules 6–10 and 11–15
deal with 31-day months: the first five deal with months other than
December and the second five deal with December. No impossible
rules are listed in this portion of the decision table.
Still there is some redundancy in this table. Eight of the ten rules
simply increment the day. Do we really require eight separate test
cases for this sub-function? No, but note the type of observation we
get from the decision table.
Finally, the last seven rules focus on February and leap year. This
decision table analysis could have been done during the detailed
design of the Next Date function.
Further simplification of this decision table can also be done. If the
action sets of two rules in a decision table are identical, there must be
at least one condition that allows two rules to be combined with a
don’t care entry. In a sense, we are identifying equivalence classes of
these rules. For example, rules 1, 2, and 3 involve day classes as D1,
D2, and D3 (30 day classes). These can be combined together as the
action taken by them is the same. Similarly, for other rules other
combinations can be done. The corresponding test cases are shown in
the table as in Figure 3.17.

FIGURE 3.16 Decision Table for the Next Date Function.

FIGURE 3.17 Test Cases for Next Date Problem Using Decision Table
Based Testing.

Because we have 22 rules, there are 22 test cases, as listed above.

3.3.3.3. TEST CASES FOR THE COMMISSION PROBLEM


The commission problem is not suitable to be solved using this
technique of decision table analysis because very little decisional logic
is used in the problem. The variables in the equivalence classes are
truly independent, therefore no impossible rules will occur in a
decision table in which conditions correspond to the equivalence
classes. Thus, we will have the same test cases as we did for
equivalence class testing.

3.3.4. GUIDELINES FOR DECISION TABLE BASED TESTING

The following guidelines have been found after studying the previous
examples:
1. This technique works well where a lot of decision making takes
place, such as in the triangle problem and the Next Date problem.
2. The decision table technique is indicated for applications
characterized by any of the following:
Prominent if-then-else logic.
Logical relationships among input variables.
Calculations involving subsets of the input variables.
Cause-and-effect relationships between inputs and outputs.
High cyclomatic complexity.
3. Decision tables do not scale up well. We need to “factor” large
tables into smaller ones to remove redundancy.
4. It works iteratively, meaning that the table drawn in the first
iteration acts as a stepping stone to design new decision tables
if the initial table is unsatisfactory.
3.4. CAUSE-EFFECT GRAPHING TECHNIQUE
Cause-effect graphing is basically a hardware testing technique
adapted to software testing. It is a black-box method. It considers only
the desired external behavior of a system. It aids in selecting test
cases by logically relating causes (inputs) to effects (outputs).

3.4.1. CAUSES AND EFFECTS

A “cause” represents a distinct input condition that brings about an


internal change in the system. An “effect” represents an output
condition, a system transformation, or a state resulting from a
combination of causes.
Myers suggests the following steps to derive test cases:
Step 1. For a module, identify the input conditions (causes) and
actions (effect).
Step 2. Develop a cause-effect graph.
Step 3. Transform the cause-effect graph into a decision table.
Step 4. Convert decision table rules to test cases. Each column of
the decision table represents a test case.
Basic cause-effect graph symbols used are given below:
Consider each node as having the value 0 or 1, where 0 represents
the “absent state” and 1 represents the “present state.” Then the
identity function states that if c1 is 1, e1 is 1; if c1 is 0, e1 is 0.
The NOT function states that if c1 is 1, e1 is 0, and vice versa.
Similarly, the OR function states that if c1 or c2 or c3 is 1, e1 is 1; else
e1 is 0. The AND function states that if both c1 and c2 are 1, e1 is 1;
else e1 is 0. The AND and OR functions are allowed to have any
number of inputs.

3.4.2. TEST CASES FOR THE TRIANGLE PROBLEM


We follow the steps listed in Section 3.4.1 to design the test cases for
our triangle problem:

Step 1. First, we must identify the causes and its effects. The causes
are:
C1: Side x is less than sum of y and z
C2: Side y is less than sum of x and z
C3: Side z is less than sum of x and y
C4: Side x is equal to side y
C5: Side x is equal to side z
C6: Side y is equal to side z

The effects are:


e1: Not a triangle
e2: Scalene triangle
e3: Isosceles triangle
e4: Equilateral triangle
e5: Impossible
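
Before drawing the graph, it helps to see how these causes combine into effects. The Boolean sketch below encodes the usual relationships behind the triangle cause-effect graph; it is an illustration written for this discussion, not the book's figure.

def effects(c1, c2, c3, c4, c5, c6):
    # Map the triangle causes C1-C6 onto the effects e1-e5.
    if not (c1 and c2 and c3):
        return "e1: Not a triangle"
    equal_pairs = sum([c4, c5, c6])
    if equal_pairs == 3:
        return "e4: Equilateral triangle"
    if equal_pairs == 2:
        return "e5: Impossible"        # two side equalities force the third
    if equal_pairs == 1:
        return "e3: Isosceles triangle"
    return "e2: Scalene triangle"

print(effects(True, True, True, True, True, True))      # Equilateral
print(effects(False, True, True, False, False, False))  # Not a triangle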

Step 2. Its cause-effect graph is shown in Figure 3.18.


Step 3. We transform it into a decision table:

FIGURE 3.18 Cause Effect Graph for Triangle Problem.

Step 4. Because there are 11 rules, we get 11 test cases and they are:
3.4.3. TEST CASES FOR PAYROLL PROBLEM
Problem 1. Consider the payroll system of a person.
a. If the salary of a person is less than $70,000 and expenses do not
exceed $30,000, then a 10% tax is charged by the IT department.
b. If the salary is greater than $60,000 and less than or equal to $3000
and expenses don’t exceed $40,000, then a 20% tax is charged by
the IT department.
c. For a salary greater than $3000, a 5% additional surcharge is also
charged.
d. If expenses are greater than $40,000, the surcharge is 9%.
Design test-cases using decision table based testing technique.
Solution. See the following steps:
Step 1. All causes and their effects are identified:

Causes:
C1: Salary <= 70,000
C2: Salary > 60,000 and Salary <= 3000
C3: Salary > 3000
C4: Expenses <= 30,000
C5: Expenses <= 40,000
C6: Expenses > 40,000

Effects:
E1: 10% tax is charged.
E2: 20% tax is charged.
E3: (20% tax) + (5% surcharge) is charged.
E4: (20% tax) + (9% surcharge) is charged.

Step 2. Its cause-effect graph is drawn.

FIGURE 3.19 Cause Effects Graph.

Step 3. We transform this cause-effect graph into a decision table.


Please note that these “causes” and “effects” are nothing else but
“conditions” and “actions” of our decision table. So, we get:
FIGURE 3.20 Decision Table.

That is, if C1 and C4 are 1 (or true), then the effect (or action) is E1.
Similarly, if C2 and C5 are 1 (or true), the action to be taken is E2, and
so on.
Step 4. Because there are 4 rules in our decision table above, we must
have at least 4 test cases to test this system using this technique.
These test cases can be:
1. Salary = 20,000, Expenses = 2000
2. Salary = 100,000, Expenses = 10,000
3. Salary = 300,000, Expenses = 20,000
4. Salary = 300,000, Expenses = 50,000

So we can say that a decision table is used to derive the test cases
which can also take into account the boundary values.
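
The decision table can itself be coded as a rule checker. In the Python sketch below, the first two rules follow the mapping stated above (C1 and C4 give E1, C2 and C5 give E2); the last two rules are assumptions consistent with conditions (c) and (d) of the problem, not something stated explicitly in the text.

def management_action(c1, c2, c3, c4, c5, c6):
    # Each column (rule) of the decision table becomes one branch.
    if c1 and c4:
        return "E1: 10% tax"
    if c2 and c5:
        return "E2: 20% tax"
    if c3 and c5:                      # assumed rule for condition (c)
        return "E3: 20% tax + 5% surcharge"
    if c6:                             # assumed rule for condition (d)
        return "E4: 20% tax + 9% surcharge"
    return "No rule applies"

# One test case per rule, as in Step 4 above (1 = true, 0 = false).
for causes in [(1, 0, 0, 1, 0, 0), (0, 1, 0, 0, 1, 0),
               (0, 0, 1, 0, 1, 0), (0, 0, 0, 0, 0, 1)]:
    print(management_action(*map(bool, causes)))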

3.4.4. GUIDELINES FOR THE CAUSE-EFFECT FUNCTIONAL TESTING


TECHNIQUE
1. If the variables refer to physical quantities, domain testing and
equivalence class testing are indicated.
2. If the variables are independent, domain testing and equivalence
class testing are indicated.
3. If the variables are dependent, decision table testing is indicated.
4. If the single-fault assumption is warranted, boundary value analysis
(BVA) and robustness testing are indicated.
5. If the multiple-fault assumption is warranted, worst-case testing,
robust worst-case testing, and decision table testing are indicated.
6. If the program contains significant exception handling, robustness
testing and decision table testing are indicated.
7. If the variables refer to logical quantities, equivalence class testing
and decision table testing are indicated.

3.5. COMPARISON ON BLACK-BOX (OR FUNCTIONAL)


TESTING TECHNIQUES

3.5.1. TESTING EFFORT


The functional methods that we have studied so far vary both in terms
of the number of test cases generated and the effort to develop these
test cases. To compare the three techniques, namely, boundary value
analysis (BVA), equivalence class partitioning, and decision table
based technique, we consider the following curve shown in Figure
3.21.
FIGURE 3.21 Test Cases as Per the Testing Method.

The domain-based techniques have no recognition of data or


logical dependencies. They are very mechanical in the way they
generate test cases. Because of this, they are also easy to automate.
The techniques like equivalence class testing focus on data
dependencies and thus we need to show our craft. The thinking goes
into the identification of the equivalence classes and after that, the
process is mechanical. Also note that from the graph, the decision
table based technique is the most sophisticated because it requires the
tester to consider both data and logical dependencies. As we have
seen in our example of the Next Date function, we had to make three
tries, but once we had a good and healthy set of conditions, the
resulting test cases were complete and minimal.
Now, consider another graph to show the effort required to identify
the test cases versus its sophistication.
FIGURE 3.22 Test Case Identification Effort as per Testing Method.

We can say that the effort required to identify test cases is the
lowest in BVA and the highest in decision tables. The end result is a
trade-off between the test case identification effort and the test case
execution effort. If we shift our effort toward more sophisticated
testing methods, we reduce our test execution time. This is very
important as tests are usually executed several times. Also note that,
judging testing quality in terms of the sheer number of test cases has
drawbacks similar to judging programming productivity in terms of
lines of code.
The examples that we have discussed so far show these trends.

3.5.2. TESTING EFFICIENCY


What we found in all of these functional testing strategies is that either
the functionality is untested or the test cases are redundant. So, gaps
do occur in functional test cases and these gaps are reduced by using
more sophisticated techniques.
We can develop various ratios of the total number of test cases
generated by method-A to those generated by method-B or even ratios
on a test case basis. This is more difficult but sometimes management
demands numbers even when they have little meaning. When we see
several test cases with the same purpose, we sense redundancy; detecting
the gaps, however, is quite difficult. If we use only functional testing, the best we
can do is compare the test cases that result from two methods. In
general, the more sophisticated method will help us recognize gaps but
nothing is guaranteed.

3.5.3. TESTING EFFECTIVENESS


How can we find out the effectiveness of the testing techniques?
a. By being dogmatic, we can select a method, use it to generate test
cases, and then run the test cases. We can improve on this by not
being dogmatic and allowing the tester to choose the most
appropriate method. We can gain another incremental improvement
by devising appropriate hybrid methods.
b. The second choice can be the structural testing techniques for the
test effectiveness. This will be discussed in subsequent chapters.
Note, however, that the best interpretation for testing effectiveness
is most difficult. We would like to know how effective a set of test
cases is for finding faults present in a program. This is problematic for
two reasons.
1. It presumes we know all the faults in a program.
2. Proving that a program is fault free is equivalent to the famous
halting problem of computer science, which is known to be
impossible.

What Is the Best Solution?

The best solution is to work backward from fault types. Given a


particular kind of fault, we can choose testing methods (functional and
structural) that are likely to reveal faults of that type. If we couple this
with knowledge of the most likely kinds of faults, we end up with a
pragmatic approach to testing effectiveness. This is improved if we
track the kinds of faults and their frequencies in the software we
develop.
3.5.4. GUIDELINES FOR FUNCTIONAL TESTING

1. If the variables refer to physical quantities then domain testing and


equivalence class testing are used.
2. If the variables are independent then domain testing and
equivalence class testing are used.
3. If the variables are dependent, decision table testing is indicated.
4. If the single-fault assumption is warranted then BVA and
robustness testing are used.
5. If the multiple-fault assumption is warranted then worst case
testing, robust worst case testing, and decision table based testing
are indicated.
6. If the program has significant exception handling, robustness
testing and decision table based testing are indicated.
7. If the variable refers to logical quantities, equivalence class testing
and decision table testing are indicated.

3.6. KIVIAT CHARTS


A Kiviat chart visually displays a set of metrics that provides easy
viewing of multiple metrics against minimum and maximum
thresholds. Each radial of a Kiviat chart is a metric. All metrics are
scaled so that all maximums are on a common circle and all
minimums are on a common circle.
In the charts below, the circle is the maximum threshold and the
band is the minimum threshold. The band between the two is the
acceptable range. The chart on the left shows that all metrics are well
within the acceptable range. The chart on the right shows an example
where all metrics are above maximum limits.
FIGURE 3.23 DevCodeMetricsWeb screen shot with Kiviat:
http://devcodemetrics.sourceforge.net/

Complex concepts and data are often best communicated graphically,
and Kiviat charts do this well. In addition, Kiviat charts provide a quick
focus on the areas that are most in need of attention and can be
customized to use different scales to represent various problems. This
section explains what a Kiviat chart is and how to use it properly.

NOTE
For an update on Kiviat charts using R! visit:
http://metrico.statanything.com/diagramas-kiviat-spider-charts-em-r/

3.6.1. THE CONCEPT OF BALANCE

Why Is It Needed?
The problem is one of multiple dimensions:
Computer: processor, printer, disks, CD-ROM, modem
Software
Personnel

All of these together create multiple dimensions, and hence the concept of
balance is needed.

Kiviat Charts: The Method


The following steps are usually followed to draw Kiviat charts:
Step 1. Choose factors to be measured.
Step 2. Define the factors so that for half of them the optimum is a
utilization of 1 and for the other half the optimum is 0.
Step 3. Mark the factors around the chart so that axes with an
optimum of 0 alternate with those with an optimum of 1.
Step 4. Mark the values on the factor axes.
Step 5. Join the marks to form a “star.”
Step 6. Evaluate the result.
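
A Kiviat (radar) chart of this kind can be drawn with a polar plot. A minimal matplotlib sketch with made-up factor values; the factor names and numbers are illustrative only, and the alternation of "busy"/"idle" axes follows Step 3.

import numpy as np
import matplotlib.pyplot as plt

factors = ["CPU busy", "CPU idle", "Disk busy",
           "Disk idle", "Channel busy", "Channel idle"]
values = [0.9, 0.1, 0.8, 0.2, 0.85, 0.15]        # made-up utilizations

angles = np.linspace(0, 2 * np.pi, len(factors), endpoint=False).tolist()
values_closed = values + values[:1]              # close the "star"
angles_closed = angles + angles[:1]

ax = plt.subplot(polar=True)                     # one radial axis per factor
ax.plot(angles_closed, values_closed)
ax.fill(angles_closed, values_closed, alpha=0.25)
ax.set_xticks(angles)
ax.set_xticklabels(factors)
ax.set_ylim(0, 1)
plt.show()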
We shall now consider four different cases to draw Kiviat charts.
Case 1: Kiviat charts—A perfectly balanced system.
Consider a sample data given below. Its Kiviat chart is also shown
below:
FIGURE 3.24

Case 2: Kiviat charts—A well-balanced system.


Consider another sample data set for a well-balanced system. Using the
steps listed earlier, we draw its Kiviat chart.
FIGURE 3.25

Case 3: Kiviat charts—A poorly balanced system.


Now consider a poorly balanced system. Its Kiviat chart is shown
below:

FIGURE 3.26

Case 4: Kiviat charts—A perfectly balanced system.


Consider another set of sample data. We draw its Kiviat chart.
FIGURE 3.27

Kiviat Charts—A Different Example (CASE STUDY)

Problem: Comparing Internet diffusion in countries.


Measures: pervasiveness
geographic dispersion
sectoral absorption
connectivity infrastructure
organizational infrastructure
sophistication of use
Countries compared:
Finland
Jordan
Israel

The Internet dimensions compared for these countries are shown in the
form of a Kiviat chart in Figure 3.28.

FIGURE 3.28

Problem with Kiviat Charts: The Cost/Utilization Method

The problem with Kiviat charts is how to develop a metric for
multidimensional data. The solution is to express everything in terms of
a common measure: cost. There are then two dimensions, utilization and
cost, which when multiplied yield a cost/utilization factor for each
system component.
Illustrative Cost/Utilization Histograms

FIGURE 3.29

FIGURE 3.30
FIGURE 3.31

FIGURE 3.32
FIGURE 3.33

FIGURE 3.34

Cost/Utilization—The Method

The cost/utilization method involves the following steps:


1. Choose factors to be measured.
2. Determine the cost of each factor as a percent of total system cost.
3. Determine the utilization of each factor.
4. Prepare a chart showing the cost and utilization of each factor.
5. Compute the measure of cost/utilization, F.
6. Compute the measure of balance, B.
7. Evaluate the resulting chart and measures.

Cost/Utilization—The Measures
Cost/Utilization:
where: u_i = percent utilization of factor i
p_i = cost contribution of factor i
Balance:

FIGURE 3.35 Cost/Utilization—The Measures.


FIGURE 3.36 Cost/Utilization—The Measures.
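
The exact formulas are given in Figures 3.35 and 3.36. As a rough illustration only, the cost/utilization factor F can be read as a cost-weighted average of the utilizations; the sketch below assumes that form and uses made-up data.

# Illustrative only: assumes F is the cost-weighted average of utilizations.
factors = {                     # factor: (utilization u_i, cost fraction p_i)
    "processor": (0.9, 0.5),
    "disks":     (0.6, 0.3),
    "printer":   (0.3, 0.2),
}

F = sum(u * p for u, p in factors.values())      # assumed cost/utilization measure
print(round(F, 2))                               # 0.69 for this made-up data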

FIGURE 3.37 Cost/Utilization—Interpretation.

FIGURE 3.38 Cost/Utilization—Interpretation.


FIGURE 3.39 Cost/Utilization—Interpretation.

FIGURE 3.40 Trace of Cost/Utilization Criteria.


FIGURE 3.41 Composite Cost/Utilization Histogram for Two Real
Linked Systems.

Conclusions
It is essential to maintain balance between system components in order
to:
reduce costs.
maintain smooth functioning with no bottlenecks.
attain effectiveness AND efficiency.

SUMMARY
We summarize the scenarios under which each of these techniques
will be useful:

When we want to test scenarios that have the following characteristics,
the most effective black-box testing technique to use is:

1. Output values dictated by certain conditions depending on values of
input variables: Decision tables
2. Input values in ranges, with each range showing a particular
functionality: Boundary value analysis (BVA)
3. Input values divided into classes: Equivalence partitioning
4. Checking for expected and unexpected input values: Positive and
negative testing
5. Workflows, process flows, or language processors: Graph based
testing
6. To ensure that requirements are tested and met properly:
Requirements based testing
7. To test the domain expertise rather than product specification:
Domain testing
8. To ensure that the documentation is consistent with the product:
Documentation testing

MULTIPLE CHOICE QUESTIONS


1. Which is not a functional testing technique?
a. BVA
b. Decision table
c. Regression testing
d. None of the above.
2. One weakness of BVA and equivalence partitioning is
a. They are not effective.
b. They do not explore combinations of input circumstances.
c. They explore combinations of input circumstances.
d. None of the above.
3. Decision tables are useful in a situation where
a. An action is taken under varying sets of conditions.
b. A number of combinations of actions are taken under
varying sets of conditions.
c. No action is taken under varying sets of conditions.
d. None of the above.
4. “Causes” and “Effects” are related to
a. Input and output
b. Output and input
c. Destination and source
d. None of the above.
5. Functionality of a software is tested by
a. White-box testing.
b. Black-box testing.
c. Regression testing.
d. None of the above.
6. If n represents the number of variables in a program then BVA
yields how many test cases?
a. 4n + 2
b. 4n + 1
c. n+2
d. n+1
7. For a function of n variables, the robustness testing will yield
a. 6n + 1 test cases.
b. 6n + 2 test cases.
c. 6n + 4 test cases.
d. None of the above.
8. Worst case testing yields
a. 5n test cases.
b. 5n + 1.
c. 5^n.
d. None of the above.
9. BVA is based upon
a. Single fault assumption theory.
b. Multiple fault assumption theory.
c. Both of the above.
d. None of the above.
10. In decision tables, which of the following is true?
a. Number of test cases is equal to number of rules (or
columns)
b. Number of test cases is not equal to number of rules (or
columns)
c. Both (a) and (b)
d. None of the above.

ANSWERS
1. c.
2. b.
3. b.
4. a.
5. b.
6. b.
7. a.
8. c.
9. a.
10. a.
CONCEPTUAL SHORT QUESTIONS WITH ANSWERS
Q. 1. Why do we need to perform both types of testing?
Ans. A functional (Black-box) test case might be taken from the
documentation description of how to perform a certain function.
For example, accepting the bar code input.
On the other hand, a structural test case might be taken from a
technical documentation manual.
Both methods together validate the entire system, as shown in the
table.

Q. 2. What is the source of knowledge for functional testing?


Ans. The following items are the knowledge source of functional (or
black-box) testing:
a. Requirements document
b. Specifications
c. Domain knowledge
d. Defect analysis data
Q. 3. What is special value testing?
Ans. It is a sort of functional testing technique. It is the most intuitive
and least uniform one. Special value testing occurs when a tester uses
his or her domain knowledge, experience with similar programs, and
information about soft spots (i.e., critical areas) to devise test
cases. It is also known as ad hoc testing. It is dependent on the
tester’s ability. Even though special value/ad hoc testing is highly
subjective, it often results in a set of test cases that is more
effective in revealing faults than the test cases generated by other
methods.
Q. 4. What is random testing?
Ans. A testing technique in which instead of always choosing the min,
min+, nom, max–, and max values of a bounded variable, we use
a random number generator to pick test case values. This will
reduce bias during testing.
Q. 5. Write down the differences between static and dynamic testing?
Ans. The differences between static and dynamic testing are shown
below:

Static testing versus dynamic testing:

1. Static testing talks about prevention; dynamic testing talks about cure.
2. Static testing is more cost effective; dynamic testing is less cost
effective.
3. Static testing may achieve 100% statement coverage; dynamic testing
may achieve less than 50% statement coverage because it finds errors
only in the parts of the code that are actually executed.
4. Static testing is not time consuming; dynamic testing is time consuming
as it may involve running several test cases.
5. Static testing can be done before compilation; dynamic testing can be
done only after the executables are ready.

Q. 6. Give some advantages and disadvantages of functional test


cases?
Ans. Functional test cases have two main advantages:
i. They are independent of how the software is implemented.
So, even if the implementation changes, the test cases are still
useful.
ii. Test case development can occur in parallel with the
implementation, thereby reducing the overall project
development interval.
Functional test cases have two main disadvantages:
i. Some unavoidable redundancies may exist among test cases.
ii. There exists a possibility of gaps of untested software.
Both of these problems can be solved if we combine the test
cases, so obtained from both functional and structural testing.
Q. 7. Differentiate between positive and negative testing?
Ans. Let us tabulate the differences between the two:

Positive testing versus negative testing:

1. Positive testing tries to prove that a given product does what it is
supposed to do. Negative testing is done to show that the product does
not fail when an unexpected input is given.
2. A positive test case is one that verifies the requirements of the
product with a set of expected outputs. A negative test case will have
input values that may not have been represented in the SRS; these are
unknown conditions for the product.
3. The purpose of positive testing is to prove that the product works as
per the user specifications. The purpose of negative testing is to try
and break the system.
4. Positive testing checks the product’s behavior. Negative testing covers
scenarios for which the product is not designed and coded.
5. Positive testing is done to verify the known test conditions. Negative
testing is done to break the product with unknown test conditions,
i.e., test conditions that lie outside the SRS.
6. If all documented requirements and test conditions are covered,
positive testing gives 100% coverage. There is no end to negative
testing, and 100% coverage is impractical here.
For example, positive testing checks that a product delivers an error
when it is expected to give an error; negative testing checks a product
not delivering an error when it should, or delivering an error when it
should not.

Q. 8. What is domain testing?


Ans. Domain testing is a sort of testing strategy wherein we do not
even look at the SRS of a software product but test purely on the
basis of domain knowledge and expertise in the domain of
application.
For example, in a banking software, knowing the account
opening, closing, etc. processes, enables a tester to test that
functionality better.
Q. 9. What is documentation testing?
Ans. A testing done to ensure that the documentation is consistent
with the product.
Q. 10. What are the characteristics of a good test?
Ans. i. A good test has a high probability of finding an error.
ii. A good test is not redundant.
iii. A good test is the “best of breed.”
iv. A good test should be neither too simple nor too complex.
Q. 11. Consider the above use case diagram for coffee maker. Find at
least ten acceptance test cases and black-box test cases and
document it.
Ans. Test cases for coffee maker.
Preconditions: Run coffee maker by switching on power supply.
REVIEW QUESTIONS
1. Perform the following:
a. Write a program to find the largest number.
b. Design test cases for the program in part (a) using a decision
table.
c. Design equivalence class test cases.
2. Explain any five symbols used in the cause-effect graphing
technique?
3. How do you measure:
a. Test effectiveness?
b. Test efficiency?
4. Write a short paragraph:
a. Equivalence testing.
5. Explain the significance of boundary value analysis. What is
the purpose of worst case testing?
6. Describe cause-effect graphing technique with the help of an
example.
7. a. Discuss different types of equivalence class test cases.
b. Consider a program to classify a triangle. Its input is a
triple of integers (say x, y, z), and the data types of the input
parameters ensure that they will be integers greater than
zero and less than or equal to 200. The program output may
be any of the following words: scalene, isosceles,
equilateral, right angle triangle, not a triangle. Design the
equivalence class test cases.
8. How can we measure and evaluate test effectiveness? Explain
with the help of 11 step S/W testing process.
9. What is the difference between:
Equivalence partitioning and boundary value analysis
methods?
10. Consider the previous date function and design test cases using
the following techniques:
a. Boundary value analysis.
b. Equivalence class partitioning.
The function takes current date as an input and returns the
previous date of the day as the output.
All variables have integer values subject to conditions as
follows:
C1: 1 ≤ month ≤ 12
C2: 1 ≤ day ≤ 31
C3: 1920 ≤ year ≤ 2000.
11. Compare the single/multiple fault assumption theory with
boundary value and equivalence class testing.
12. Explain the cause-effect graphing technique in detail. Why is it
different from other functional testing techniques?
13. Differentiate between positive and negative functional testing.
14. What is domain testing? Illustrate through a suitable example.
15. a. “Software testing is an incremental process.” Justify the
statement.
b. What do we mean by functional analysis? Discuss in brief
any method to perform functional analysis.
c. Consider a program to determine whether a number is
“odd” or “even” and print the message “Number is Even”
or “Number is Odd.” The number may be any valid integer.
Design boundary value and equivalence class test cases.
16. a. Consider a program that calculates the roots of a quadratic
equation with a, b, and c in range [1, 100]. Design test
cases using boundary value and robustness testing.
b. What is decision table testing?
17. Write a short paragraph on cause-effect graphing.
18. Discuss any two model-based black-box testing approaches?
19. A program computes the grade of each student based on the
average of marks obtained by them in physics, chemistry, and
math. Maximum marks in each subject is 60. Grades are
awarded as follows:

Marks Range    Grade

60             A+
59–50          A
49–40          B
39–30          C
29–20          D
19–0           F

Design robust test cases and identify equivalence class test


cases for output and input domains for this problem.
20. What is the difference between weak normal and strong
normal equivalence class testing?
21. Consider a program for the determination of previous date. Its
input is a triple of day, month, and year with the values in the
range:
1 ≤ month ≤ 12
1 ≤ day ≤ 31
1900 ≤ year ≤ 2025

The possible outputs are “Previous date” and “Invalid date.”


Design a decision table and equivalence classes for input
domain.
22. Consider a program given below for the selection of the largest
of numbers.
a. Design the set of test cases using BVA technique and
equivalence class testing technique.
b. Select a set of test cases that will provide 100% statement
coverage.
c. Develop a decision table for this program.
23. Consider the above program and show why it is practically
impossible to do exhaustive testing.
24. a. Consider the following point-based evaluation system for a
trainee salesman of an organization:

Points earned Management action


0–20 Thank you
21–40 Extend probation
41–60 Confirmation
61–80 Promotion
81–100 Promotion with a letter of
recommendation

Generate the test cases using equivalence class testing.


b. Explain functional testing.
25. Design test cases using the functional testing taking any
example program? Explain how and why complete testing is
not possible by highlighting some cases of the example given.
26. Let us consider an example of grading the students in an
academic institution. The grading is done according to the
following rules:

Marks obtained Grade


80–100 Distinction
60–79 First division
50–59 Second division
40–49 Third division
0–39 Fail

Generate test cases using the equivalence class testing


technique.
27. Consider the following point-based evaluation system for a
salesman of an organization.

Points earned Grade Management action


80–100 A+ Raise of $10,000
75–80 A– Raise of $5,000
70–75 A Raise of $3000
65–70 B+ Raise of $1000
60–65 B No increment
50–60 C Warning

Generate the test cases using equivalence class testing.


28. Consider a program for the determination of previous date. Its
input is a triple of day, year, month with values in range:
1 ≤ month ≤ 12
1 ≤ day ≤ 31
1912 ≤ year ≤ 2020
Design a decision table for the given scenario.
29. Consider the program for the determination of next date in a
calender. Its input is a triple of day, month, and year with the
following range:
1 ≤ month ≤ 12
1 ≤ day ≤ 31
1900 ≤ year ≤ 2025
The possible outputs would be next date or invalid date.
Design boundary value, robust, and worst test cases for this
program.
30. a. Explain cause-effect graphing technique in detail. Why is
it different from other functional techniques?
b. Design test cases using functional testing taking any
example program. Explain how and why complete testing
is not possible by highlighting some cases of the example
given.
31. Consider an example of grading a student in a university. The
grading is done as below:

Average marks Grade


90–100 A+
75–89 A
60–74 B
50–59 C
Below 50 Fail
The marks of any three subjects are considered for the
calculation of average marks. Scholarships of $1000 and $500
are given to students securing more than 90% and 85% marks,
respectively. Develop a decision table, cause effect graph, and
generate test cases for the above scenario.
CHAPTER
4

WHITE-BOX (OR STRUCTURAL)


TESTING TECHNIQUES
Inside this Chapter:
4.0. Introduction to White-Box Testing or Structural Testing or Clear-Box or Glass-Box or
Open-Box Testing
4.1. Static Versus Dynamic White-Box Testing
4.2. Dynamic White-Box Testing Techniques
4.3. Mutation Testing Versus Error Seeding—Differences in Tabular Form
4.4. Comparison of Black-Box and White-Box Testing in Tabular Form
4.5. Practical Challenges in White-Box Testing
4.6. Comparison on Various White-Box Testing Techniques
4.7. Advantages of White-Box Testing

4.0. INTRODUCTION TO WHITE-BOX TESTING


OR STRUCTURAL TESTING OR CLEAR-BOX OR
GLASS-BOX OR OPEN-BOX TESTING
White-box testing is a way of testing the external functionality of the
code by examining and testing the program code that realizes the
external functionality. It is a methodology to design the test cases that
uses the control structure of the application to design test cases.
White-box testing is used to test the program code, code structure, and
the internal design flow.

4.1. STATIC VERSUS DYNAMIC WHITE-BOX TESTING


A number of defects get amplified because of incorrect translation of
requirements and design into program code. We shall now study
different techniques of white-box testing. As shown in Figure 4.1,
white-box testing is classified as static and dynamic.

FIGURE 4.1 Classification of White-Box Testing.

As already discussed in Chapter 2, static testing is a type of testing


in which the program source code is tested without running it (not the
binaries or executable). We only need to examine and review the code.
It means that we need not execute the code. We need to find out
whether
The code works according to the functional requirements.
The code has been written in accordance with the design
developed earlier in the project life cycle.
Code for any required functionality has been missed.
The code handles errors properly.
Static testing can be done by humans or with the help of
specialized tools. So, static white-box testing is the process of
carefully and methodically reviewing the software design,
architecture, or code for bugs without executing it. It is sometimes
referred to as structural analysis [PATT01].
We next discuss some of the dynamic white-box testing techniques
one by one.

4.2. DYNAMIC WHITE-BOX TESTING TECHNIQUES


In dynamic testing, we test a running program. So, now binaries and
executables are desired. We try to test the internal logic of the program
now. It entails running the actual product against some pre-designed
test cases to exercise as much of the code as possible.

4.2.1. UNIT/CODE FUNCTIONAL TESTING


It is the process of testing in which the developer performs some quick
checks prior to subjecting the code to more extensive code coverage
testing or code complexity testing. It can be performed in many ways:
1. At the initial stages, the developer or tester can perform certain
tests based on the input variables and the corresponding expected
output variables. This can be a quick test. If we repeat these tests
for multiple values of input variables then the confidence level of
the developer to go to the next level increases.
2. For complex modules, the tester can insert some print statements in
between to check whether the program control passes through all
statements and loops. It is important to remove the intermediate
print statements after the defects are fixed.
3. Another method is to run the product under a debugger or an
integrated development environment (IDE). These tools involve
single stepping of instructions and setting break points at any
function or instruction.
All of these initial tests actually fall under “debugging” category
rather than under “testing” category of activities. We put them under
“white-box testing” head as all are related to the knowledge of code
structure.

4.2.2. CODE COVERAGE TESTING


Code coverage testing involves designing and executing test cases and
finding out the percentage of code that is covered by testing. The
percentage of code covered by a test is found by adopting a technique
called the instrumentation of code. These tools rebuild the code, do
product linking with a set of libraries provided by the tool, monitor the
portions of code covered, and report on the portions of the code that
are covered frequently, so that the critical or most used portions of
code can be identified. We will next discuss some basic testing
techniques based on code coverage.

4.2.2.1. STATEMENT COVERAGE


In most of the programming languages, the program construct may be
a sequential control flow, a two-way decision statement like if-then-
else, a multi-way decision statement like switch, or even loops like
while, do, repeat until and for.
Statement coverage refers to writing test cases that execute each of
the program statements. We assume that the more the code is covered,
the better the testing of the functionality.
For a set of sequential statements (i.e., with no conditional
branches), test cases can be designed to run through from top to
bottom. However, a single pass may not cover all statements in two cases:
1. If there are asynchronous exceptions in the code, like divide by
zero, then even if we start a test case at the beginning of a section,
the test case may not cover all the statements in that section. Thus,
even in the case of sequential statements, coverage for all
statements may not be achieved.
2. A section of code may be entered from multiple points.
In case of an if-then-else statement, if we want to cover all the
statements then we should also cover the “then” and “else” parts of the
if statement. This means that we should have, for each if-then-else, at
least one test case to test the “then” part and at least one test case to
test the “else” part.
The multi-way, switch statement can be reduced to multiple two-
way if statements. Thus, to cover all possible switch cases, there
would be multiple test cases.
A good percentage of the defects in programs come about because
of loops that do not function properly. More often, loops fail in what
are called “boundary conditions.” So, we must have test cases that
1. Skip the loop completely so that the situation of the termination
condition being true before starting the loop is tested.
2. Check the loop for n = 1.
3. Try covering the loop at the boundary values of n, i.e., just below n
and just above n.

The statement coverage for a program, which is an indication of the
percentage of statements actually executed in a set of tests, can be
calculated as follows:

Statement coverage = (Number of statements exercised / Total number of executable statements in the program) × 100

For example, if the total number of statements exercised = 08 and the
total number of executable statements in the program = 10, then the
statement coverage = (8/10) × 100 = 80%.
Please note from the above discussion that as the type of statement
progresses from a simple sequential statement to if-then-else and
through loops, the number of test cases required to achieve statement
coverage increases. As we have already seen, exhaustive testing
(100%) is not possible, so exhaustive coverage of all statements in a
program will be impossible for practical purposes.
Even if we get a high level of the statement coverage, it does not
mean that the program is defect free.
Consider a program that implements wrong requirements and, if
such a code is fully tested with say, 100% coverage, it is still a wrong
program. Hence 100% code coverage does not mean anything.
Consider another example.
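A minimal C sketch that behaves as described (the variable names, the zero divisor, and the exact coverage figure are illustrative assumptions) is:

#include <stdio.h>

int main(void)
{
    char code;
    int total = 100, count = 0, result;

    scanf(" %c", &code);
    if (code == 'y')
        result = total / 2;        /* path taken by the single test case      */
    else
        result = total / count;    /* divide by zero on the untested path     */
    printf("%d\n", result);
    return 0;
}

A single test with code = 'y' executes all but one of the statements, giving a high coverage figure comparable to the 80% mentioned next, yet the defect on the else path remains untested.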
In this program, when we test with code = “y,” we will get 80%
code coverage. But if the data distribution in the real world is such that
90% of the time the value of code is not = “y,” then the program will
fail 90% of the time because of the exception-divide by zero. Thus,
even with a code coverage of 80%, we are left with a defect that hits
the users 90% of the time. The path coverage technique, discussed
next, overcomes this problem.

4.2.2.2. PATH COVERAGE


In the path coverage technique, we split a program into a number of
distinct paths. A program or a part of a program can start from the
beginning and take any of the paths to its completion. The path
coverage of a program may be calculated based on the following
formula:

Path coverage = (Number of paths exercised / Total number of paths in the program) × 100

For example, if the total number of paths exercised = 07 and the
total number of paths in the program = 10, then the path coverage
= (7/10) × 100 = 70%.

Consider the following flow graph in Figure 4.2.


There are different paths in this particular flow graph.

FIGURE 4.2 Flow Graph Example.

Some of the paths are:

Path – 1: 1→2→6
Path – 2: 1→3→5→6
Path – 3: 1→3→4→5→6
Path – 4: 1→3→4→2→6

Regardless of the number of statements in each of these paths, if we


can execute these paths, then we would have covered most of the
typical scenarios.
Path coverage provides a stronger condition of coverage than
statement coverage as it relates to the various logical paths in the
program rather than just program statements.

4.2.2.3. CONDITION COVERAGE


In the above example, even if we have covered all the paths possible,
it does not mean that the program is fully tested. Path testing is not
sufficient as it does not exercise each part of the Boolean expressions,
relational expressions, and so on. The technique of condition (or predicate)
coverage monitors whether every operand in a complex logical
expression has taken on every TRUE/FALSE value. Obviously, this
will mean more test cases and the number of test cases will rise
exponentially with the number of conditions and Boolean expressions.
For example, an if-then-else whose decision combines two conditions has
2² or 4 possible true/false combinations. The condition coverage, which
indicates the percentage of conditions covered by a set of test cases, is
defined by the formula:

Condition coverage = (Number of conditions covered / Total number of conditions in the program) × 100
Please note that condition coverage is a much stronger criterion than
path coverage, which in turn is a much stronger criterion than
statement coverage.
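As an illustration, consider a decision built from two atomic conditions. The function below, together with the four calls that drive each operand through both truth values, is a minimal C sketch; the names check, age, and id_ok are assumptions made for illustration only.

#include <stdio.h>

/* check() returns 1 only when both atomic conditions are true. */
static int check(int age, int id_ok)
{
    if (age >= 18 && id_ok == 1)   /* two atomic conditions: 2^2 = 4 combinations */
        return 1;                  /* "then" part */
    else
        return 0;                  /* "else" part */
}

int main(void)
{
    /* Four test cases, one per true/false combination of the two operands. */
    printf("%d\n", check(20, 1));  /* T, T */
    printf("%d\n", check(20, 0));  /* T, F */
    printf("%d\n", check(15, 1));  /* F, T */
    printf("%d\n", check(15, 0));  /* F, F */
    return 0;
}

Statement or branch coverage is already satisfied by just two of these calls; all four are needed to cover every true/false combination of the individual conditions.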

4.2.2.4. FUNCTION COVERAGE


In this white-box testing technique, we try to identify how many
program functions are covered by test cases. So, while providing
function coverage, test cases can be written so as to exercise each of
the different functions in the code.
The following are the advantages of this technique:
1. Functions (like functions in C) are easier to identify in a program
and, hence, it is easier to write test cases to provide function
coverage.
2. Because functions are at a higher level of abstraction than code, it
is easier to achieve 100% function coverage.
3. It is easier to prioritize the functions for testing.
4. Function coverage provides a way of testing traceability, that is,
tracing requirements through design, coding, and testing phases.
5. Function coverage provides a natural transition to black-box
testing.
Function coverage can help in improving the performance as well
as the quality of the product. For example, if in a networking software,
we find that the function that assembles and disassembles the data
packets is being used most often, it is appropriate to spend extra effort
in improving the quality and performance of that function. Thus,
function coverage can help in improving the performance as well as
the quality of the product.
Better code coverage is the result of better code flow
understanding and writing effective test cases. Code coverage up to
40–50% is usually achievable. Code coverage of more than 80%
requires an enormous amount of effort and understanding of the code.
The multiple code coverage techniques discussed so far are not
mutually exclusive. They supplement and augment one another. While
statement coverage can provide a basic comfort factor, path, decision,
and function coverage provide more confidence by exercising various
logical paths and functions.

4.2.3. CODE COMPLEXITY TESTING


Two questions that come to mind are:
1. Which of the paths are independent? If two paths are not
independent, then we may be able to minimize the number of tests.
2. Is there any limit to the number of tests that must be run to ensure
that all the statements have been executed at least once?
The answer to the above questions is a metric that quantifies the
complexity of a program and is known as cyclomatic complexity. It is
also known as structural complexity because it gives the internal view
of the code. It is a number which provides us with an upper bound for
the number of tests that must be conducted to ensure that all
statements have been executed at least once.

4.2.3.1. CYCLOMATIC COMPLEXITY, ITS PROPERTIES AND ITS MEANING IN


TABULAR FORM

McCabe IQ covers about 146 different counts and measures. These


metrics are grouped into six main “collections,” each of
which provides a different level of granularity and information about
the code being analyzed. The collections are given below:
i. McCabe metrics based on cyclomatic complexity, V(G).
ii. Execution coverage metrics based on any of branch, path, or
Boolean coverage.
iii. Code grammar metrics based around line counts and code
structure counts such as nesting.
iv. OO metrics based on the work of Chidamber and Kemerer.
v. Derived metrics based on abstract concepts such as
understandability, maintainability, comprehension, and testability.
vi. Custom metrics imported from third-party software/systems,
e.g., defect count.
McCabe IQ provides for about 100 individual metrics at the
method, procedure, function, control, and section/paragraph level.
Also, there are 40 metrics at the class/file and program level.

Categories of Metrics
There are three categories of metrics:
1. McCabe metrics
2. OO metrics
3. Grammar metrics

Please remember that when collecting metrics, we rely upon


subordinates who need to “buy into” the metrics program. Hence, it is
important to only collect what you intend to use.
We should keep in mind the Hawthorne Effect, which states that
when you collect metrics on people, the people being measured will
change their behavior. Collecting data that is never used, or using
metrics to judge individuals, will destroy the effectiveness of any
metrics program.
The three metrics categories are explained below.

I. McCabe metrics
a. Cyclomatic complexity, V(G): It is the measure of the amount
of logic in a code module of 3rd and 4th generation languages.
If V(G) is excessively high then it leads to impenetrable code,
i.e., a code that is at a higher risk due to difficulty in testing.
The threshold value is 10. When V(G) > 10, then the likelihood
of code being unreliable is much higher. Please remember that
a high V(G) shows a decreased quality in the code resulting in
higher defects that become costly to fix.
b. Essential complexity: It is a measure of the degree to which a
code module contains unstructured constructs. If the essential
complexity is excessively high, it leads to impenetrable code,
i.e., a code that is at higher risk due to difficulty in testing.
Furthermore, the higher value will lead to increased cost due to
the need to refactor, or worse, reengineer the code. The
threshold value is 4. When the essential complexity is more
than 4 then the likelihood of the code being unmaintainable is
much higher. Please note that a high essential complexity
indicates increased maintenance costs with decreased code
quality. Some organizations have used the essential density
metric (EDM) defined as:

c. Integration complexity: It is a measure of the interaction


between the modules of code within a program. Say, S0 and S1
are two derivatives of this complexity.
where S0 provides an overall measure of the size and complexity of a
program’s design, without reflecting the internal calculations of each
module; it is the sum of all the integration complexities in the
program (∑ iv(g)),
and S1 = (S0 – number of methods + 1).
This is primarily used to determine the number of tests for
the “some test” that is designed to ensure that the application
would execute without issues in module interaction.
d. Cyclomatic density (CD): It is a measure of the amount of
logic in the code, expressed as the ratio of a module’s cyclomatic
complexity to its size in lines of code:

Cyclomatic density (CD) = V(G) / number of lines of code (LOC)

By eliminating the size factor, this metric reduces


complexity strictly to modules that have unusually dense
decision logic. Please note that the higher the CD value, the
denser the logic.
The CD metric should be in the range of 0.14 to 0.42 for the
code to be simple and comprehensible.
e. Pathological complexity: It represents an extremely
unstructured code which shows a poor design and hence a
suggestion for code’s reengineering. A value greater than one
indicates poor coding practices like branching into a loop or
into a decision. Please note that these conditions are not easy to
replicate with modern post 3GL languages.
f. Branch coverage: It is a measure of how many branches or
decisions in a module have been executed during testing.
If the branch coverage is less than 95% for new code or less than 75%
for code under maintenance, then the test scripts require review and
enhancement.
g. Basis path coverage: A measure of how many of the basis
(cyclomatic, V(G)) paths in a module have been executed. Path
coverage is the most fundamental of McCabe design. It
indicates how much logic in the code is covered or not covered.
This technique requires more thorough testing than branch
coverage.
If the path coverage is less than 90% for new code or less than 70% for
code under maintenance, then the test scripts require review and
enhancement.
h. Boolean coverage: A technique used to establish that each
condition within a decision is shown by execution to
independently and correctly affect the outcome of the decision.
The major application of this technique is in safety critical
systems and projects.
i. Combining McCabe metrics: Cyclomatic complexity is the
basic indicator for determining the complexity of logic in a unit
of code. It can be combined with other metrics.

1. Code review candidate


If V(G) > 10 and essential complexity/essential density exceeds 4,
then the unit needs a review.

2. Code refactoring
If V(G) > 10 and the condition
V(G) – EV(g) ≤ V(g) is true

Then, the code is a candidate for refactoring.

3. Inadequate comment content

If the graph between V(G) against comment % (in terms of LOC) does
not show a linear increase, then the comment content needs to be
reviewed.

4. Test coverage

If the graph between V(G) against path coverage does not show a
linear increase then the test scripts need to be reviewed.

II. OO Metrics
a. Average V(G) for a class: If average V(G) > 10 then this
metric indicates a high level of logic in the methods of the class
which in turn indicates a possible dilution of the original object
model. If the average is high, then the class should be reviewed
for possible refactoring.
b. Average essential complexity for a class: If the average is
greater than one then it may indicate a dilution of the original
object model. If the average is high, then the class should be
reviewed for possible refactoring.
c. Number of parents: If the number of parents for a class is
greater than one then it indicates a potentially overly complex
inheritance tree.
d. Response for class (RFC): RFC is the count of all methods
within a class plus the number of methods accessible to an
object of this class due to implementation. Please note that the
larger the number of methods that can be invoked in response
to a message, the greater the difficulty in comprehension and
testing of the class. Also, note that low values indicate greater
specialization. If the RFC is high then making changes to this
class will be increasingly difficult due to the extended impact to
other classes (or methods).
e. Weighted methods for class (WMC): WMC is the count of
methods implemented in a class. It is a strong recommendation
that WMC does not exceed the value of 14. This metric is used
to show the effort required to rewrite or modify the class. The
aim is to keep this metric low.
f. Coupling between objects (CBO): It indicates the number of
non-inherited classes this class depends on. It shows the degree
to which this class can be reused.
For dynamic link libraries (DLLs) this measure is high as
the software is deployed as a complete entity.
For executables (.exe), it is low as here reuse is to be
encouraged. Please remember this point:

Strong coupling increases the difficulty in


comprehending and testing a class. The objective is to
keep it less than 6.

g. Class hierarchy level: It shows the degree of inheritance used.


If it is greater than 6, the increased depth increases the testing
effort. If it is less than 2 then the value shows a poor
exploitation of OO. So, one should aim for 2 and 3 levels only
in our OO-code.
h. Number of methods (n): If number of methods (n) > 40 then it
shows that the class has too much functionality. So, it can be
split into several smaller classes. Please note that ideally one
should aim for no more than 20 methods in a class.
i. Lack of cohesion between methods (LOCM): It is a metric
used to measure the dissimilarity of methods on a class by an
instance variable or attribute.

What is to be done?

The percentages of methods in a class using an attribute are


averaged and subtracted from 100. This measure is expressed in
percentage. Two cases arise:
i. If % is low, it means simplicity and high reusability.
ii. If % is high, it means a class is a candidate for refactoring
and could be split into two or more subclasses with low
cohesion.
j. Combined OO metrics: V(G) can also be used to evaluate OO
systems. It is used with OO metrics to find out the suitable
candidates for refactoring.
By refactoring, we mean making a small change to the code
which improves its design without changing its semantics.

Rules for refactoring:


i. If avg. V(G) > 10 (high) and the number of methods (n) is
< 10 (low) then the class requires refactoring.
ii. If avg. V(G) is low and the lack of cohesion is high then the
class is a suitable candidate for refactoring into two or more
classes.
iii. If avg. V(G) is high and CBO is high then the class is a
candidate for refactoring.
iv. If CBO is high and lack of cohesion is high then the class is
a candidate for refactoring.

III. Grammar Metrics


a. Line count: It is a size indicator. It is used in—
i. Estimation techniques like COCOMO2.

b. Nesting levels: Nesting of IF statements, switch, and loop


constructs can indicate unnecessarily complex conditions which
makes future modifications quite difficult. So, refactoring may
be done. Typical industry standards are 4, 2, and 2 for IF,
switch, and loop constructs respectively.
c. Counts of decision types: It is used to show single outcome (IF
and loop) and multiple outcome decision statements. When
used in conjunction with V(G), then its value can determine if a
method/procedure/control/section is over complex and, hence, a
suitable candidate for refactoring.
d. Maximum number of predicates: This measure shows overly
complex decision statements which are candidates for
refactoring.
e. Comment lines: It indicates the level of comments in a unit of
code, expressed as the ratio of comment lines to total lines of code.
Historically, a ratio of 15–25% comments is adequate to
enable any user to understand the code.

4.2.3.2. CYCLOMATIC COMPLEXITY, ITS PROPERTIES AND ITS MEANING IN


TABULAR FORM
McCabe’s cyclomatic metric, V(G) of a graph G with n vertices and e
edges is given by the formula:
V(G) = e – n + 2
Given a program, we associate it with a directed graph that has
unique entry and exit nodes. Each node in the graph, G, corresponds to
a block of statements in the program where flow is sequential and the
arcs correspond to branches taken in the program. This graph is known
as a flow graph.
The cyclomatic complexity, V(G) provides us two things:
1. To find the number of independent paths through the program.
2. To provide an upper bound for the number of test cases that must
be executed in order to test the program thoroughly. The
complexity measure is defined in terms of independent paths. It is
defined as any path through the program that introduces at least
one new set of processing statements or a new condition. See the
following steps:
Step 1. Construction of flow graph from the source code or flow
charts.
Step 2. Identification of independent paths.
Step 3. Computation of cyclomatic complexity.
Step 4. Test cases are designed.
Using the flow graph, an independent path can be defined as a path
in the flow graph that has at least one edge that has not been traversed
before in other paths. A set of independent paths that cover all the
edges is a basis set. Once the basis set is formed, test cases should be
written to execute all the paths in the basis set.

Properties of Cyclomatic Complexity


The following are the properties of cyclomatic complexity represented
as V(G):
1. V(G) ≥ 1.
2. V(G) is the maximum number of independent paths in graph, G.
3. Inserting and deleting functional statements to G does not affect
V(G).
4. G has only one path if and only if V(G) = 1. For example, a purely
sequential graph has V(G) = 1 and only one path.
5. Inserting a new edge in G increases V(G) by unity.

Meaning of V(G) in Tabular Form


For small programs, cyclomatic complexity can be calculated
manually, but automated tools are essential because each program in a
project may contain several thousand lines of code. It will be very
difficult to manually create flow graphs for large programs. There are
several tools that are available in the market which can compute
cyclomatic complexity. Note that calculating the complexity of a
module after it has been built and tested may be too late. It may not be
possible to redesign a complex module after it has been tested. Thus,
some basic complexity checks must be performed on the modules
before embarking upon the testing (or even coding) phase. Based on
the complexity number that emerges from using the tool, one can
conclude what actions need to be taken for complexity measure using
the table given below:

TABLE 4.1 What V(G) Means?

Complexity    What it means?

1–10          Well-written code; testability is high; cost/effort to maintain is low.
10–20         Moderately complex; testability is medium; cost/effort to maintain is medium.
20–40         Very complex; testability is low; cost/effort to maintain is high.
> 40          Not testable; any amount of money/effort to maintain may not be enough.

4.2.3.3. BASIS PATH TESTING WITH SOLVED EXAMPLES


Basis path testing helps a tester to compute logical complexity
measure, V(G), of the code. This value of V(G), defines the maximum
number of test cases to be designed by identifying basis set of
execution paths to ensure that all statements are executed at least once.
See the following steps:
1. Construct the flow graph from the source code or flow charts.
2. Identify independent paths.
3. Calculate cyclomatic complexity, V(G).
4. Design the test cases.
A program is represented in the form of a flow graph. A flow
graph consists of nodes and edges. Using the flow graph, an
independent path can be defined as a path in the flow graph that has at
least one edge that has not been traversed before in other paths. A set
of independent paths that cover all edges is a basis set. Once that basis
set is formed, test cases should be written to execute all the paths in
the basis set.

Flow Graph Testing


The control flow of a program can be represented using a graphical
representation known as a “flow graph.” The flow graph is a directed
graph in which nodes are either entire statements or fragments of a
statement. Edges represent the flow of control. If u and v are nodes in
the program graph, there is an edge from node u to node v if the
statement (fragment) corresponding to node v can be executed
immediately after the statement (fragment) corresponding to node u,

i.e.,
We next show the basic notations that are used to draw a flow graph:

FIGURE 4.3 Notations for Flow Graph.

SOLVED EXAMPLES
EXAMPLE 4.1. Consider the following code:
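(A minimal C fragment with the control structure assumed in this example, an if-else followed by a second if, is sketched below; the statements themselves are illustrative assumptions chosen to match the flow graph worked out in the solution.)

int f(int a, int b)
{
    int x, y;
    if (a > b)          /* node 1 (predicate) */
        x = a;          /* node 2             */
    else
        x = b;          /* node 3             */
    y = x + 1;          /* node 4             */
    if (y > 10)         /* node 5 (predicate) */
        y = 10;         /* node 6             */
    return y;           /* node 7             */
}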
Draw its flow graph, find its cyclomatic complexity, V(G), and the
independent paths.
SOLUTION. First, we try to number the nodes, as follows:

So, its flow graph is shown in Figure 4.4. Next, we try to find
V(G) by three methods:
FIGURE 4.4 Flow Graph for Example 4.1.

a. V(G) = e – n + 2 (e = number of edges, n = number of nodes)
    = 8 – 7 + 2 = 3
b. V(G) = P + 1 (P = number of predicate nodes, i.e., nodes with outdegree 2)
    = 2 + 1 = 3 (Nodes 1 and 5 are predicate nodes)
c. V(G) = Number of enclosed regions + 1
    = 2 + 1 = 3
∴ V(G) = 3 and is the same by all three methods.
By V(G) = 3 we mean that it is a well written code, its testability is
high, and cost/effort to maintain is low.
Also, it means that there are 3 paths in this program which are
independent paths and they form a basis-set. These paths are given
below:
Path 1: 1 – 2 – 4 – 5 – 7
Path 2: 1 – 3 – 4 – 5 – 7
Path 3: 1 – 3 – 4 – 5 – 6 – 7
Another basis-set can also be formed:
Path 1: 1 – 2 – 4 – 5 – 7
Path 2: 1 – 3 – 4 – 5 – 7
Path 3: 1 – 2 – 4 – 5 – 6 – 7
This means that we must execute these paths at least once in order
to test the program thoroughly. So, test cases can be designed.

EXAMPLE 4.2. Consider the following program that inputs the


marks of five subjects of 40 students and outputs average marks and
the pass/fail message.

Draw its flow graph and compute its V(G). Also identify the
independent paths.
SOLUTION. The process of constructing the flow graph starts with
dividing the program into parts where flow of control has a single
entry and exit point. In this program, line numbers 2 to 4 are grouped
as one node (marked as “a”) only. This is because it consists of
declaration and initialization of variables. The second part comprises
of a while loop-outer one, from lines 5 to 19 and the third part is a
single printf statement at line number 20.
Note that the second part is again divided into four parts—
statements of lines 6 and 7, lines 8 to 12, line 13, and lines 14–17, i.e.,
if-then-else structure using the flow graph notation, we get this flow
graph in Figure 4.5.
Here, “*” indicates that the node is a predicate node, i.e., it has an
outdegree of 2.
The statements corresponding to various nodes are given below:

FIGURE 4.5 Flow Graph for Example 4.2.

Nodes Statement Numbers


a 2–4
b 5
e 6–7
f 8
z 9–12
g 13–14
h 15
i 17
j 18
c 19
d 20

Next, we compute its cyclomatic complexity, V(G):


a. V(G) = e – n + 2
    = 14 – 11 + 2 = 3 + 2 = 5
b. V(G) = Number of predicate nodes (P) + 1
    = 4 + 1 = 5
    [∵ Nodes b, f, g, and c are predicate nodes with two outgoing edges]
c. V(G) = Number of enclosed regions + 1
    = 4 + 1 = 5
    [∵ R1 – R4 are 4 enclosed regions and 1 corresponds to the outer region]
∴ V(G) = 5 by all three methods. Next, we identify a basis-set with
five paths:
Path 1: a – b – d – e
Path 2: a – b – d – f – n – b – d – e
Path 3: a – b – c – g – j – k – m – n – b – d – e
Path 4: a – b – c – g – j – l – m – n – b – d – e
Path 5: a – b – c – g – h – i – n – b – d – e
Each of these paths consist of at least one new edge. Please
remember that this basis set of paths is NOT unique. Finally, test cases
can be designed for the independent path execution that have been
identified above. This is to ensure that all statements are executed at
least once.

EXAMPLE 4.3. (Quadratic Equation Problem). This program


reads a, b, and c as the three coefficients of a quadratic equation ax² +
bx + c = 0. It determines the nature of the roots of this equation. Draw
its flow graph and calculate its cyclomatic complexity.
SOLUTION. First, we write its procedure:

FIGURE 4.6 Flow Graph of Quadratic Equation Problem.

Now, we draw its flow graph as shown in Figure 4.6.


∴ V(G) = P + 1
    = 3 + 1 = 4 [∵ Nodes 3, 5, and 7 are predicate nodes]
V(G) = Number of regions + 1
    = 3 + 1 = 4
Also, V(G) = e – n + 2 = 11 – 9 + 2 = 4

∴ We have 4 independent paths and they are:


Path 1: 1 – 2 – 3 – 4 – 9
Path 2: 1 – 2 – 3 – 5 – 6 – 9
Path 3: 1 – 2 – 3 – 5 – 7 – 8 – 9
Path 4: 1 – 2 – 3 – 5 – 7 – 1 – 2 ......
So, the test cases for each of the path are as follows:

Path 1: test case 1


a, b, c: valid input
expected results: D < 0, imaginary roots
Path 2: test case 2
a, b, c: valid input
expected results: D = 0, equal roots
Path 3: test case 3
a, b, c: valid input
expected results: D > 0, root 1 and root 2 are real
Path 4: test case 4
a, b, c: valid input
expected results: D is not > 0, read a, b, c again

EXAMPLE 4.4. (Triangle Problem): This program reads a, b, c as


the three sides of a triangle and determines whether they form an
isosceles, equilateral, or scalene triangle. Draw its flow graph and
calculate its V(G).
SOLUTION. We draw its flow chart first.
FIGURE 4.7 Flow Chart for Example 4.4.

FIGURE 4.8 Flow Graph for the Triangle Problem.


So, we draw its flow graph shown in Figure 4.8.
∴ Its cyclomatic complexity, V(G) by all three methods is
a. V(G) = e – n + 2 = 10 – 8 + 2 = 4
b. V(G) = P + 1 = 3 + 1 = 4 [Nodes 2, 4, 6 are predicate nodes]
c. V(G) = Number of enclosed regions + 1 = 3 + 1 = 4.

∴ 4-test cases need to be designed. They are given below:

EXAMPLE 4.5. Consider the following code snippet to search a


number using binary search:
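(An iterative binary search of the following general shape fits the flow graph and paths derived in the solution; the listing is an illustrative sketch, with the solution's node numbers given as comments.)

int binary_search(int a[], int n, int key)
{
    int low = 0, high = n - 1;          /* node 1              */
    int mid, pos = -1;                  /* node 2              */
    while (low <= high) {               /* node 3 (predicate)  */
        mid = (low + high) / 2;         /* node 4              */
        if (a[mid] == key) {            /* node 5 (predicate)  */
            pos = mid;                  /* node 6              */
            break;                      /* node 7              */
        }
        else if (a[mid] < key)          /* node 8 (predicate)  */
            low = mid + 1;              /* node 9              */
        else
            high = mid - 1;             /* node 10             */
    }
    return pos;                         /* node 11             */
}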
a. Draw its flow graph
b. Find V(G) by all 3 methods
c. Derive test cases

SOLUTION. Flow graph is shown in Figure 4.9.


FIGURE 4.9

∴ V(G) = [Total number of enclosed regions (R1, R2, R3)] + 1
    (for the outer region in which this graph is enclosed)
    = 3 + 1 = 4
or V(G) = e – n + 2 = 13 – 11 + 2 = 2 + 2 = 4
or V(G) = P + 1 = 3 + 1 = 4 [∵ Nodes 3, 5, and 8 are predicate
nodes with outdegree as 2]
Basis set of paths is
Path 1: 1, 2, 3, 4, 5, 6, 7, 11
Path 2: 1, 2, 3, 11
Path 3: 1, 2, 3, 4, 5, 8, 9, 3, …
Path 4: 1, 2, 3, 4, 5, 8, 10, 3, …
V(G) = 4
⇒ Paths are also 4 as given above.
⇒ These 4 paths must be executed at least once in order to test this
program thoroughly.
∴ Test cases are:

EXAMPLE 4.6. Consider the following recursive code to find GCD


of two numbers.
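(A subtraction-based recursive GCD of the following shape fits the flow graph and test cases derived below; the listing is an illustrative sketch, assuming x and y are positive, with node numbers shown as comments.)

int gcd(int x, int y)
{
    if (x == y)                     /* node 1 (predicate)             */
        return x;                   /* node 2                         */
    else if (x > y)                 /* node 3 (predicate)             */
        return gcd(x - y, y);       /* node 4: recurse with x = x - y */
    else
        return gcd(x, y - x);       /* node 5: recurse with y = y - x */
}                                   /* node 6: exit                   */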

FIGURE 4.10
Draw its flowgraph, find V(G), and basis path set. Derive test cases
also.
SOLUTION. The flowgraph for the GCD program is shown in Figure
4.10.
∴ V(G) = Enclosed regions + 1 = 2 + 1 = 3
V(G) = e – n + 2 = 7 – 6 + 2 = 1 + 2 = 3
= P + 1 = 2 + 1 = 3 (∵ Nodes 1 and 3 are predicate
nodes)

Three independent paths are


1–3–4–6
1–3–5–6
1–2–6

Three test cases are:


NOTE
1. In test case id–2, we call GCD function recursively with x = x – y
and y as it is.
2. In test case id–3, we call GCD function recursively with y = y – x
and x as it is.

EXAMPLE 4.7. What is cyclomatic complexity? Answer the following for the code given below:
if(c1){
f1();
}else{
f2();
}
if(c2){
f3();
}else{
f4();
}

i. What is a cyclomatic number?


ii. How many test cases are required to achieve complete branch
coverage?
iii. How many test cases are required for complete path coverage?

SOLUTION.
Cyclomatic complexity is a software metric that provides a
quantitative measure of the logical complexity of a program.
Cyclomatic complexity has a foundation in graph theory and is
computed in one of the three ways:
i. The number of regions corresponds to the cyclomatic
complexity.
ii. Cyclomatic complexity: E – N + 2 (E is the number of edges,
and N is number of nodes).
iii. Cyclomatic complexity: P + 1 (P is the number of predicate
nodes).

Referring to the flow graph, the cyclomatic number is:


1. The flow graph has three regions.
2. Complexity = 8 edges – 7 nodes + 2 = 3
3. Complexity = 2 predicate nodes + 1 = 3 (Predicate nodes = C1, C2)

FIGURE 4.11

Two test cases are required for complete branch coverage and four test
cases are required for complete path coverage.

Assumptions:
c1: if(i%2==0)
f1: EVEN()
f2: ODD()
c2: if(j > 0)
f3: POSITIVE()
f4: NEGATIVE()
i.e.
if(i%2==0){
EVEN();
}else{
ODD();
}
if(j > 0){
POSITIVE();
}else{
NEGATIVE();
}

Test cases that satisfy the branch coverage criteria, i.e., <c1, f1, c2,
f3> and <c1, f2, c2, f4>.

Test cases that satisfy the path coverage criteria, i.e.,


<c1, f1, c2, f3>
<c1, f2, c2, f4>
<c1, f1, c2, f4>
<c1, f2, c2, f3>

4.2.3.4. DD PATH TESTING WITH SOLVED EXAMPLES

From a given flow graph, we can easily draw another graph that is
known as a decision-to-decision (DD) path graph. Our main
concentration now is on the decision nodes only. The nodes of the
flow graph are combined into a single node if they are in sequence.

EXAMPLE 4.8. Consider the following pseudo code:

a. Design its program graph


b. Design its DD path graph
c. List all DD-paths

SOLUTION.
a. Its program graph is as follows:
FIGURE 4.12

b. Its DD-path graph is as follows:


FIGURE 4.13

c. The DD-paths are as follows:


Path–1: A–B–C–J
Path–2: A–B–D–E–G–I–J
Path–3: A–B–D–E–F–G–H–I–J
Path–4: A–B–D–E–G–H–I–J

EXAMPLE 4.9. Consider the following code snippet:


Draw its data-flow graph. Find out whether paths (1, 2, 5, 6) and (6, 2,
5, 6) are def-clear or not. Find all def-use pairs?
SOLUTION. We draw its data flow graph (DD-path graph) first.

FIGURE 4.14

Now, path (1, 2, 5, 6) is definition clear (def-clear) with respect to


definition of x at node-1, i.e., d1(x).
So, d1(x) is live at node 6, i.e., x is being used at node-6 again. But
path (1, 2, 5, 6) is not def-clear with respect to d1(z) because of d5(z).
Now, consider path (6, 2, 5, 6). It is def-clear with respect to d6
(count). Variables y and z are used at node 4. Please note that we can
easily check for definitions d1(y), d6(y), d1(z), and d5(z) that they are
live at node 4.

What about def-use pairs?


We have two types of def-use pairs:
1. Those that correspond to a definition and its c-use, denoted dcu
(“c” stands for computational use).
2. Those that correspond to a definition and its p-use, denoted dpu
(“p” stands for predicate use).
Let us then compute dcu (x, 1), i.e., the set of all c-uses associated
with the definition of x at node-1. Also note that there is a c-use of x at
node-5 and a def-clear path (1, 2, 5) from node-1 to node-5. So, node-
5 is included in dcu (x, 1). Similarly, we can include node-6 also in
this set. So, we get:
dcu (x, 1) = {5, 6}

What is dpu (x, 1)?


As shown in program, there is a p-use of x at node-2 with outgoing
edges (2, 3) and (2, 5). The existence of def-clear paths from node-1 to
each of these two edges is easily seen in DD-path graph. There is no
other p-use of x. So,
dpu (x, 1) = {(2, 3), (2, 5)}
We can now compute for all other variables as shown below:
The spots that are not def-clear are trouble spots.

4.2.3.5. GRAPH MATRICES TECHNIQUE WITH SOLVED EXAMPLES


In graph matrix based testing, we convert our flow graph into a square
matrix with one row and one column for every node in the graph. The
objective is to trace all links of the graph at least once. The aim is
again to find cyclomatic complexity, V(G) and, hence, the
independent paths. If the size of graph increases, it becomes difficult
to do path tracing manually.

For example, Consider the following flow graph:

FIGURE 4.15 Flow Graph Example.

Its graph-matrix is:


FIGURE 4.16 Graph Matrix.

Now, consider another flow graph:

FIGURE 4.17 Flow Graph Example.

∴ Its graph matrix is:

FIGURE 4.18 Graph Matrix.

Note that if there are several links between two nodes then “+” sign
denotes a parallel link.
This is, however, not very useful. So, we assign a weight to each
entry of the graph matrix. We use “1” to denote that the edge is
present and “0” to show its absence. Such a matrix is known as a
connection matrix. For the Figure above, we will have the following
connection matrix:
FIGURE 4.19 Connection Matrix.

FIGURE 4.20

Now, we want to compute its V(G). For this, we redraw the matrix
and sum each row. Then, we subtract 1 from each non-empty row, add
up the resulting column of values, and add 1 to the total. This gives us
V(G). For the above matrix, V(G) = 2.

EXAMPLE 4.10. Consider the following flow graph:

FIGURE 4.21

Draw its graph and connection matrix and find V(G).


SOLUTION.
First, we draw its graph matrix:

FIGURE 4.22

Now, its connection matrix is:

FIGURE 4.23

Similarly, we can find two-link or three-link path matrices, i.e., A²,
A³, ..., Aⁿ⁻¹. These operations are easy to program and can be used as
a testing tool.
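A minimal C sketch of such a tool is given below. It applies the row-sum rule just described to a small connection matrix; the 4-node matrix is an illustrative assumption, not one of the figures above.

#include <stdio.h>

#define N 4   /* number of nodes (illustrative size) */

/* Connection matrix: m[i][j] = 1 if there is an edge from node i+1 to node j+1. */
int m[N][N] = {
    {0, 1, 1, 0},   /* node 1 -> 2, node 1 -> 3 */
    {0, 0, 0, 1},   /* node 2 -> 4              */
    {0, 0, 0, 1},   /* node 3 -> 4              */
    {0, 0, 0, 0}    /* node 4 is the exit node  */
};

int main(void)
{
    int vg = 1;                       /* start with the final "+ 1"             */
    for (int i = 0; i < N; i++) {
        int row_sum = 0;
        for (int j = 0; j < N; j++)
            row_sum += m[i][j];
        if (row_sum > 0)              /* rows with no outgoing edge are skipped */
            vg += row_sum - 1;        /* subtract 1 from each non-empty row     */
    }
    printf("V(G) = %d\n", vg);        /* prints 2 for this matrix               */
    return 0;
}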

4.2.3.6. DATA FLOW TESTING WITH EXAMPLES


Data flow testing uses the sequence of variable access to select paths
from a control graph. The main focus of this structural or white-box
testing technique is on the usage of variables. Within a program, data
definition faults are nearly as frequent (22% of all faults) as control-flow
faults (24%). Some common data-related faults include:
Using an undefined or uninitialized variable.
Unreachable code segments.
Assigning a value to a variable more than once without an
intermediate reference.
Deallocating or re-initializing a variable before its last use.
It is another form of white-box testing. Here the focal point is on
the usage of variables. Because data variables and data structures are
an important part of any software, so we must check for variables—
which have not been defined but used or variables which are defined
but not used or a variable that is defined twice. These may be the main
culprits or sources of errors in our program. We need to identify them
and remove the problems associated with them.
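A tiny C fragment containing all three kinds of anomalies just mentioned (every identifier here is hypothetical) looks like this:

#include <stdio.h>

int main(void)
{
    int total;                  /* declared but never initialized                    */
    int unused = 5;             /* defined but never used                            */
    int count;

    total = total + 10;         /* anomaly 1: variable used before it is defined     */
    count = 1;                  /* anomaly 2: defined here ...                       */
    count = 2;                  /* ... and redefined with no intervening reference   */
    printf("%d\n", total + count);
    return 0;
}

A compiler will often accept such code, but tracing def/use pairs during data flow testing exposes each of these anomalies.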
Some terminology used during data flow testing is given below:
1. Defining node: A node, n, is a defining node of the variable, v, if the
value of the variable, v, is defined at this node, n.

2. Usage node: A node, n, is a usage node of the variable, v, if the value of the
variable, v, is used at this node, n.

3. Definition use path (du-path): The path from a node, m, to node,


n, where the variable, v, is initially defined to the point where it is
used in the program, is called a du-path.
4. Definition clear path (dc-path): The path from node, m, to node,
n, such that there is no other node in this path that redefines the
variable, v.
Please note that our objective is to find all du-paths and then
identify those du-paths which are not dc-paths. We generate test
cases for du-paths that are not dc-paths.
Steps followed for data flow testing are given below:
Step 1. Draw the program flow graph.
Step 2. Prepare a table for define/use status of all variables used in
your program.
Step 3. Find all du-paths.
Step 4. Identify du-paths that are not dc-paths.
This is data flow testing.

EXAMPLE 4.11. For the following C-like program, give the


1. flow graph.
2. def/use graph.

SOLUTION.
Let us rewrite the program again for easy drawing of its flow graph.
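(The def/use tables that follow fit a routine that computes x raised to the power y; the C sketch below is an illustrative reconstruction consistent with them, with flow-graph node numbers given as comments.)

#include <stdio.h>

int main(void)
{
    float x, z;
    int y, pow;

    scanf("%f %d", &x, &y);     /* node 1: x and y defined                    */
    if (y < 0)                  /* (still node 1: p-use of y)                 */
        pow = -y;               /* node 2: pow defined (c-use of y)           */
    else
        pow = y;                /* node 3: pow defined (c-use of y)           */
    z = 1.0;                    /* node 4: z defined                          */
    while (pow != 0) {          /* node 5: p-use of pow                       */
        z = z * x;              /* node 6: c-use of x and z; z redefined      */
        pow = pow - 1;          /*         pow redefined                      */
    }
    if (y < 0)                  /* node 7: p-use of y                         */
        z = 1.0 / z;            /* node 8: z redefined                        */
    printf("%f\n", z);          /* node 9: c-use of z                         */
    return 0;
}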

i. The flow graph of a given program is as follows:


FIGURE 4.24

ii. The def/use graph for this program is as follows:

FIGURE 4.25

Now, let us find its dcu and dpu. We draw a table again as follows:

(Node, var) dcu dpu


(1, x) {6} ϕ
(1, y) {2, 3} {(1, 2), (1, 3), (7, 8), (7, 9)}
(2, pow) {6} {(5, 6), (5, 7)}
(3, pow) {6} {(5, 6), (5, 7)}
(4, z) {6, 8, 9} ϕ
(6, z) {6, 8, 9} ϕ
(6, pow) {6} {(5, 6), (5, 7)}
(8, z) {9} ϕ

EXAMPLE 4.12. (The Commission Problem). Consider again the


commission problem. The problem domain has already been discussed
in Chapter 2. Write its program, draw its flow graph and DD-path
graph. Find its V(G) and, hence, independent paths. Also, perform its
data flow testing.

SOLUTION. We develop its program first:


Now, we draw its flow graph first:

FIGURE 4.26 Flow Graph of Commission Problem.


Next, we draw its DD path graph:
Note that more compression exists in this DD-path graph because
of the increased computation in the commission problem.
The table below details the statement fragments associated with
DD-paths.

FIGURE 4.27 DD Path Graph.

DD-path Nodes
A 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
B 14
C 15, 16, 17, 18, 19
D 20, 21, 22, 23, 24, 25, 26, 27, 28
E 29
F 30, 31, 32, 33
G 34
H 35, 36, 37
I 38
J 39
K 40
L 41, 42

∴ We now compute its V(G):

a. V(G) = Number of enclosed regions + 1 = 3 + 1 = 4


b. V(G) = e – n + 2 = 14 – 12 + 2 = 2 + 2 = 4
c. V(G) = P + 1 = 3 + 1 = 4 [ Nodes B, E, and G are predicate nodes]
∴ There must exist four independent paths. They are given below.
Path 1: A – B – C – B – ...
Path 2: A – B – D – E – F – K – L
Path 3: A – B – D – E – G – H – J – K – L
Path 4: A – B – D – E – G – I – J – K – L
Next, we prepare the define/use nodes for variables in the
commission problem.

Variables Defined at node Used at node


lockprice 7 24
stockprice 8 25
barrelprice 9 26
total locks 10, 16 16, 21, 24
total stocks 11, 17 17, 22, 25
total barrels 12, 18 18, 23, 26
locks 13, 19 14, 16
stocks 15 17
barrels 15 18
lock sales 24 27
stock sales 25 27
barrel sales 26 27
sales 27 28, 29, 33, 34, 37, 38
commission 31, 32, 33, 36, 37, 38 32, 33, 37, 41

Next, to find du-paths, we prepare the table:

Variable Path (beginning, end) Definition clear


nodes
lockprice 7, 24 yes
stockprice 8, 25 yes
barrelprice 9, 26 yes
total stocks 11, 17 yes
total stocks 11, 22 no
total stocks 17, 25 no
total stocks 17, 17 yes
total stocks 17, 22 no
total stocks 17, 25 no
locks 13, 14 yes
locks 19,14 yes
locks 13, 16 yes
locks 19, 16 yes
sales 27, 28 yes
sales 27, 29 yes
sales 27, 33 yes
sales 27, 34 yes
sales 27, 37 yes
sales 27, 38 yes
commission 31, 32 yes
commission 31, 33 no
commission 31, 37 not possible
commission 31, 41 no
commission 32, 32 yes
commission 32, 33 yes
commission 32, 37 not possible
commission 32, 41 no
commission 33, 32 not possible
commission 33, 33 yes
commission 33, 37 not possible
commission 33, 41 yes
commission 36, 32 not possible
commission 36, 33 not possible
commission 36, 37 yes
commission 36, 41 no
commission 37, 32 not possible
commission 37, 33 not possible
commission 37, 37 yes
commission 37, 41 yes
commission 38, 32 not possible
commission 38, 33 not possible
commission 38, 37 not possible
commission 38, 41 yes

The initial value definition for the variable “totalstocks” occurs at


node-11 and it is first used at node-17. Thus, the path (11, 17) which
consists of the node sequence <11, 12, 13, 14, 15, 16, 17>, is
definition clear.
The path (11, 22), which consists of the node sequence <11, 12,
13, 14, 15, 16, 17, 18, 19, 20> * & <21, 22> is not definition clear
because the values of total stocks are defined at node-11 and at node-
17. The asterisk, *, is used to denote zero or more repetitions.
Thus, out of 43 du-paths, 8 paths, namely, (11, 22), (17, 25), (17,
22), (17, 25), (31, 33), (31, 41), (32, 41), and (36, 41), are not
definition clear. These 8 paths are the main trouble spots and thus the
likely sources of error.

Problem for Practice


1. Consider the problem of finding out the roots of a given quadratic
equation, ax2 + bx + c = 0.
Write its program in C and perform the following:
a. Draw the flow graph.
b. Draw the DD path graph.
c. Find V(G) by the three methods.
d. Perform its data flow testing and find all du- and dc-paths.

How to Develop Data Flow Test Cases?


The steps that are usually followed to develop data flow test cases are
as follows:
1. Layout a control graph for the program.
2. For each variable, note the defined (d), used (u), and killed (k)
actions by node.
3. Trace all possible du- and dk-paths for each variable.
4. Merge the paths.
5. Test/check each path for testability.
6. Select test data for feasible paths.

We shall now apply these steps to the following code:


Step 1: Its flow graph is:

FIGURE 4.28 Flow Graph for Example.

Step 2: Next, we tabulate data actions. The D and U actions in xyz-


class can be read from the code.
Step 3: Trace data flow for X

There will be a set of paths for each d. Mark the d nodes for the
variable on the graph and trace the paths. The baseline trace method
can be adapted for this tracing. Tabulate the paths as you trace them.
So, define/use paths for “x” are shown in Figure 4.29.
Note that the variable-x has d actions at nodes-1 and -2.
∴ Paths from node-1:
1–5–6–8–9–10 DU–
1–5–6–8–10 DU–
1–2 DUD
FIGURE 4.29 Trace Data Flow Graph for “X.”

And paths from node-2:


2–3–6–8–9–10 DU–
2–4–6–8–9–10 DU–
2–3–6–8–10 DU–
Similarly, trace data flow for y is as follows: Define/Use paths for
y are shown in Figure 4.30. Variable “y” has d actions are nodes 1, 2,
3, 4, 5, 7.
Paths from node-1:
FIGURE 4.30 Trace Data Flow for “y.”

1–5 DUD
1–2 DUD
Paths from node-2:
2–3 DD
2–4 DD
Paths from node-3:
3–6–7 DUD
3–6–8–10 DU–
Paths from node-4:
4–6–7 DUD
4–6–8–10 DU–
Paths from node-5:
5–6–7 DUD
5–6–8–10 DU–
Paths from node-7:
7–6–7 DUD
7–6–8–10 DU–
Similarly, the trace data flow for “Z” is as follows: Define/Use Paths
for Z are shown in Figure 4.31. The variable Z has d actions at nodes-
1 and-7.
∴ Paths from node-1 are:
1–5–6–7 DUD
1–5–6–8–9 DU–
1–5–6–8 DU–
1–2–3–6–7 DUD
1–2–4–6–7 DUD
1–2–3–6–8–9 DU–
1–2–3–6–8 DU–
Paths from node-7:
7–6–8–9 DU–
7–6–7 DU–
7–6–8 DU–

FIGURE 4.31 Trace Data Flow for “Z.”

Step 4: Merge the paths


a. Drop sub-paths: Many of the tabulated paths are sub-paths. For
example,
{1 2 3 6 8} is a sub-path of {1 2 3 6 8 9}.
So, we drop the sub-path.
b. Connect linkable paths: Paths that end and start on the same node
can be linked. For example,
{1 5} {5 6 8 10}
becomes {1 5 6 8 10}
We cannot merge paths with mutually exclusive decisions. For
example,
{1 5 6 8 9} and {1 2 3 6 7}
cannot be merged because they represent the predicate branches
from node-1.
So, merging all the traced paths provides the following set of paths:

Path Nodes visited


1 1–5–6–8–10
2 1–5–6–8–9–10
3 1–5–6 (7 6)* 8 10
4 1–5–6 (7 6)* 8 9 10
5 1–2–3–6–8–10
6 1–2–3–6–8–9–10
7 1–2–3–6 (7 6)* 8–10
8 1–2–3–6 (7 6)* 8–9–10
9 1–2–4–6–8–10
10 1–2–4–6–8–9–10
11 1–2–4–6 (7 6)* 8–10
12 1–2–4–6 (7 6)* 8–9–10

The (7 6)* means that path 7–6 (the loop) can be iterated. We require
that a loop is iterated at least twice.

Step 5: Check testability

We need to check path feasibility.


Try path 1: {1–5–6–8–10}

Condition          Comment
1. x ≤ 10          Force branch to node-5
2. y′ = z + x      Calculation on node-5
3. x ≤ z + x       Skip node-7, proceed to node-8
4. x ≤ z           Skip node-9, proceed to node-10

The values that satisfy all these constraints can be found by trial and
error, graphing, or solving the resulting system of linear inequalities.
The set x = 0, y = 0, z = 0 works, so this path is feasible.
Try path 5: {1–2–3–6–8–10}

Condition          Comment
1. x > 10          Force branch to node-2
2. x′ = x + 2      Calculation on node-2
3. y′ = y – 4      Calculation on node-2
4. x′ > z          Force branch to node-3
5. x′ ≤ y′         Skip node-7, proceed to node-8
6. x′ ≤ z          Skip node-9, proceed to node-10

By inspection, conditions 4 and 6 cannot both be true, so path 5 is


dropped.

Step 6: Select test data


Path feasibility analysis also provides domain information for each
path. This can be used in conjunction with other domain constraints
(from requirements, for example) to develop test data for each path.
TEST DATA

4.3. MUTATION TESTING VERSUS ERROR SEEDING —


DIFFERENCES IN TABULAR FORM
In the error-seeding technique, a predefined number of artificially
generated errors is “sown” in the program code. After that, test runs
are used to detect errors and to examine the ratio between actual and
artificial errors based on the total number of detected errors. The
artificially generated errors are not known to the testers. Error seeding
and mutation testing are both error-oriented techniques and are
generally applicable to all levels of testing. We shall now tabulate the
differences between these two techniques of white-box testing; a small
worked example of the error-seeding estimate is given after the table.
The differences are as follows:

Error seeding Mutation testing


1. No mutants are present 1. Mutants are developed for
here. testing.
2. Here source code is tested 2. Here mutants are
within itself. combined, compared for
testing to find errors
introduced.
3. Errors are introduced 3. Special techniques are used
directly. to introduce errors.
4. Test cases which detect 4. Here, test cases which kill
errors are used for testing. mutants are used for
testing.
5. It is a less efficient error 5. It is more efficient than
testing technique. error seeding.
6. It requires less time. 6. It is more time consuming.
7. It is economical to 7. It is expensive to perform.
perform.
8. It is a better method for 8. It is a better method for
bigger problems. small size programs.
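A small worked example of the error-seeding estimate mentioned before the table (all numbers are illustrative assumptions): suppose 20 artificial errors are seeded into a program, and during testing 10 of the seeded errors and 6 genuine errors are detected. Since half of the seeded errors were found, we assume the same detection ratio holds for genuine errors:

Estimated total genuine errors ≈ (genuine errors found × seeded errors planted) / seeded errors found
                               = (6 × 20) / 10 = 12

So roughly 12 – 6 = 6 genuine errors are estimated to remain undetected in the code.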

This mutation testing scheme was proposed by DeMillo in 1978.


In this testing technique, we mutate (change) certain statements in the
source code and check if the test code is able to find the errors. It is a
technique that is used to assess the quality of the test cases, i.e.,
whether they can reveal certain types of faults.
For example, SDLC is viewed as a “root-finding” procedure. That
is, a designer submits an initial guess which is then iteratively tested
for validity until a correct solution is obtained. This is the principle of
meta-induction.
These mutants are run with an input data from a given test set. If a
test set can distinguish a mutant from the original program, i.e., it
produces a different execution result, the mutant is said to be killed.
Otherwise, the mutant is called a live mutant. Please note that a
mutant remains live either because it is equivalent to the original program,
i.e., it is functionally identical to it, or because the test data
is inadequate to kill the mutant. If the test data is inadequate, it can be
improved by adding test cases to kill the live mutant. A test set which
can kill all non-equivalent mutants is said to be adequate. The adequacy
of a test set is measured by its mutation score:

Mutation score = (Number of mutants killed / Total number of non-equivalent mutants) × 100%
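A tiny C illustration of killed and live (equivalent) mutants is sketched below; the function, its mutants, and the test values are all illustrative assumptions.

#include <stdio.h>

/* Original unit under test. */
int abs_val(int x)
{
    if (x < 0)
        return -x;
    return x;
}

/* Mutant M1: relational operator "<" replaced by ">". */
int abs_val_m1(int x)
{
    if (x > 0)
        return -x;
    return x;
}

/* Mutant M2: "<" replaced by "<=". This mutant is equivalent to the
   original (negating integer zero changes nothing), so no test case
   can ever kill it.                                                  */
int abs_val_m2(int x)
{
    if (x <= 0)
        return -x;
    return x;
}

int main(void)
{
    int tests[] = { -3, 0, 7 };                   /* candidate test set    */
    for (int i = 0; i < 3; i++) {
        int t = tests[i];
        if (abs_val(t) != abs_val_m1(t))
            printf("input %d kills M1\n", t);     /* printed for -3 and 7  */
        if (abs_val(t) != abs_val_m2(t))
            printf("input %d kills M2\n", t);     /* never printed         */
    }
    return 0;
}

Here the test set kills the only non-equivalent mutant (M1), so its mutation score is 1/1 = 100%, i.e., the set is adequate.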
Advantages of Mutation Testing
1. It can show the ambiguities in code.
2. It leads to a more reliable product.
3. A comprehensive testing can be done.

Disadvantages of Mutation Testing


1. It is difficult to identify and kill equivalent mutants.
2. Stubborn mutants are difficult to kill.
3. It is a time consuming technique, so automated tools are required.
4. Each mutation will have the same size as that of the original
program. So, a large number of mutant programs may need to be
tested against the candidate test suite.

4.4. COMPARISON OF BLACK-BOX AND WHITE-BOX


TESTING IN TABULAR FORM

Functional or black-box testing Structural or white-box testing


or glass-box testing
1. This method focuses on 1. This method focuses on
functional requirements of procedural details, i.e.,
the software, i.e., it enables internal logic of a program.
the software engineer to
derive sets of input conditions
that will fully exercise all
functional requirements for a
program.
2. It is NOT an alternative 2. It concentrates on internal
approach to white-box logic, mainly.
technique rather is a
complementary approach that
is likely to uncover a different
class of errors.
3. Black-box testing is applied 3. Whereas, white-box testing is
during later stages of testing. performed early in the testing
process.
4. It attempts to find errors in 4. Whereas, white-box testing
the following categories: attempts to find errors in the following
a. Incorrect or missing cases:
functions a. Internal logic of your
b. Interface errors program

c. Errors in data structures b. Status of program


or external database
access
d. Performance errors
e. Initialization and
termination errors
5. It disregards control structure 5. It uses control structure of
of procedural design (i.e., the procedural design to
what is the control structure derive test cases.
of our program, we do not
consider here).
6. Black-box testing broadens 6. White-box testing, as
our focus on the information described by Hetzel is
domain and might be called “testing in small,” i.e.,
as “testing in the large,” i.e., testing small program
testing bigger monolithic components (e.g., modules or
programs. small group of modules).
7. Using black-box testing 7. Using white-box testing, the
techniques, we derive a set of software engineer can derive
test cases that satisfy test cases that:
following criteria: a. guarantee that all
a. Test cases that reduce (by independent paths within
a count that is greater a module have been
than 1) the number of exercised at least once.
additional test cases that b. exercise all logical
must be designed to decisions on their true
achieve reasonable and false sides.
testing.
c. execute all loops at their
b. Test cases that tell us boundaries and within
something about the their operational bounds.
presence or absence of
classes of errors rather d. exercise internal data
than an error associated structures to ensure their
only with the specific validity.
tests at hand.
8. It includes the tests that are 8. A close examination of
conducted at the software procedural detail is done.
interface.
9. Are used to uncover errors. 9. Logical paths through the
software are tested by
providing test cases, that
exercise specific sets of
conditions or loops.
10. To demonstrate that software 10. A limited set of logical paths
functions are operational, i.e., can be examined.
input is properly accepted and
output is correctly produced.
Also, the integrity of external
information (e.g., a data base)
is maintained.

4.5. PRACTICAL CHALLENGES IN WHITE-BOX


TESTING
The major challenges that we face during white-box testing are as
follows:
1. Difficult for Software Developer to Pin-Point Defects From His
Own Creations: As discussed earlier, no one would like to point out
errors from their own creations. So, does a developer. So, usually
we select a different test team.
2. Even a Completely Tested Code May Not Satisfy Real Customer
Requirements: Developers do not have a full appreciation and
favor to external customer’s requirements or the domain
knowledge. This means that even after thorough verification and
validation, common user scenarios may get left out. So, we must
address such scenarios.

4.6. COMPARISON ON VARIOUS WHITE-BOX TESTING


TECHNIQUES
Functional testing techniques always result in a set of test cases and
structural metrics are always expressed in terms of something
countable like the number of program paths, the number of decision-
to-decision paths (DD-paths), and so on.

FIGURE 4.32 Shows Trends of Test Coverage Item(s).


FIGURE 4.33 Shows Trend of Test Method Effort.

Figures 4.32 and 4.33 show the trends for the number of test
coverage items and the effort to identify them as functions of
structural testing methods, respectively. These graphs illustrate the
importance of choosing an appropriate structural coverage metric.

4.7. ADVANTAGES OF WHITE-BOX TESTING


1. White-box testing helps us to identify memory leaks. When we
allocate memory using malloc( ) in C, we should explicitly release
that memory also. If this is not done, then over time no memory
will remain available to satisfy allocation requests. This can
be done using debuggers that can also tally allocated and freed
memory.
2. Performance analysis: Code coverage tests can identify the areas of
a code that are executed most frequently. Extra efforts can then be
made to check these sections of code. For further performance
improvement, techniques like caching, co-processing, or even
parallel processing can be considered.
3. Coverage tests with instrumented code are one of the best means of
identifying violations of concurrency constraints around
critical sections.
4. White-box testing is useful in identifying bottlenecks in resource
usage. For example, if a particular resource like RAM or ROM or
even network is perceived as a bottleneck then instrumented code
can help identify where the bottlenecks are and point towards
possible solutions.
5. White-box testing can help identify security holes in dynamically
generated code. For example, in case of Java, some intermediate
code may also be generated. Testing this intermediate code requires
code knowledge. This is done by white-box testing only.

SUMMARY
1. White-box testing can cover the following issues:
a. Memory leaks
b. Uninitialized memory
c. Garbage collection issues (in JAVA)
2. We must know about white-box testing tools also. They are
listed below:
a. Purify by Rational Software Corporation
b. Insure++ by ParaSoft Corporation
c. Quantify by Rational Software Corporation
d. Expeditor by OneRealm Inc.

MULTIPLE CHOICE QUESTIONS


1. A testing which checks the internal logic of the program is
a. Black-box testing.
b. White-box testing.
c. Both (a) and (b)
d. None of the above.
2. The cyclomatic complexity, V(G) was developed by:
a. Howard
b. McCabe
c. Boehm
d. None of the above.
3. A node with indegree = 0 and outdegree ≠ 0 is called
a. Source node.
b. Destination node.
c. Transfer node.
d. None of the above.
4. A node with indegree ≠ 0 and outdegree = 0 is called
a. Source node.
b. Destination node.
c. Predicate node.
d. None of the above.
5. V(G) is given by which formula?
a. V(G) = e – n + 2
b. V(G) = e – 2n + P
c. V(G) = e – 2n
d. None of the above.
6. A predicate node is one which has
a. Two outgoing edges.
b. No outgoing edges.
c. Three or more outgoing edges.
d. None of the above.
7. An independent path is one
a. That is a complete path from source to destination node.
b. That is a path which introduces at least one new set of
processing statements.
c. That is a path which introduces at most one new set of
processing statements.
d. None of the above.
8. The size of the graph matrix is the
a. Number of edges in the flow graph.
b. Number of nodes in the flow graph.
c. Number of paths in the flow graph.
d. Number of independent paths.
9. In data flow testing, the objective is to find
a. All dc-paths that are not du-paths.
b. All du-paths.
c. All du-paths that are not dc-paths.
d. All dc-paths.
10. Mutation testing is related to
a. Fault seeding.
b. Functional testing.
c. Fault checking.
d. None of the above.

ANSWERS
1. b.
2. b.
3. a.
4. b.
5. a.
6. a.
7. b.
8. b.
9. c.
10. a.

CONCEPTUAL SHORT QUESTIONS WITH ANSWERS


Q. 1. Write a short paragraph on test coverage analyzers.
Ans. Coverage analyzers are a class of test tools that offer automated support for coverage-based test management. With such a tool, the tester runs a set of test cases on a program that has been
instrumented by the coverage analyzer. The analyzer then uses
the information produced by the instrumentation code to
generate a coverage report. For example, in case of DD-path
coverage, it identifies and labels all DD-paths in the original
program. When the instrumented program is executed with test
cases, the analyzer tabulates the DD-paths traversed by each test
case. So, a tester can experiment with different sets of test cases
to determine the coverage of each set.
Q. 2. What is slice-based testing?
Ans. A program slice is a set of program statements that contribute to
or affect a value for a variable at some point in the program. This
is an informal definition.
Let us see more formal definitions of a program slice.
“Given a program, P and a set of V of variables in P, a slice on
the variable set V at statement n, written as S(V, n), is the set of
all statements in P that contribute to the values of variables in
V.”
Note: Listing elements in a slice S(V, n) will be cumbersome
because the elements are the program statement fragments.
We can further refine the definition of a program slice.
“Given a program P and a program graph G(P) in which
statements and statement fragments are numbered and a set V of
variables in P, the slice on the variable set V at statement
fragment n, written as S(V, n), is the set of node numbers of all
statement fragments in P prior to n that contribute to the values
of variables in V at statement fragment n.”
The basic idea is to separate a program into components (slices)
that have some useful/functional meaning.
Please note that we have used the words "prior to" (in the second definition) in the dynamic sense. So, a slice captures the execution-time behavior of a program with respect to the variables in the slice. The slices form a lattice, that is, a directed acyclic graph in which nodes are slices and edges correspond to the subset relationship.
We have also used a word, “contribute,” which means that data
declaration statements have an effect on the value of a variable.
We simply exclude all non-executable statements.
We have five forms of usages:
1. P-use — used in a predicate (decision)
2. C-use — used in computation
3. O-use — used for output
4. L-use — used for location (pointers, subscripts)
5. I-use — used for iterations (counters, loops)
We also identify two forms of definitions nodes:
1. I-def: defined by input
2. A-def: defined by assignment
Now assume that the slice S(v, n) is a slice on one variable, i.e.,
the set V consists of a single variable, v. If the statement
fragment “n” is a defining node for v, then n is included in the
slice. Also, if the statement fragment “n” is a usage node for v,
then n is not included in the slice. P-uses and C-uses of other
variables, i.e., not the v in the slice set V, are included to the
extent that their execution affects the value of variable, v.
Tip: If the value of v is the same whether a statement fragment is
included or excluded, then exclude the statement fragment.
Also, O-use, L-use, and I-use nodes are excluded from the slices
as L-use and I-use variables are typically invisible outside their
modules.
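As a minimal illustration of these definitions (the program below and its statement numbering are hypothetical, not from the text), consider slicing on the variable total:

/* A small C fragment to illustrate S(V, n). */
#include <stdio.h>

int main(void)
{
    int n, i, total, square;              /* 1: declarations (excluded)    */
    scanf("%d", &n);                      /* 2: I-def of n                 */
    total  = 0;                           /* 3: A-def of total             */
    square = 0;                           /* 4: A-def of square            */
    for (i = 1; i <= n; i++) {            /* 5: P-use of n and i           */
        total  = total + i;               /* 6: C-use and def of total     */
        square = i * i;                   /* 7: does not touch total       */
    }
    printf("%d\n", total);                /* 8: O-use of total             */
    return 0;
}
/* The slice S({total}, 8) is {2, 3, 5, 6}: statements 4 and 7 never      */
/* contribute to the value of total, and the O-use node 8 itself is       */
/* excluded, as described above.                                          */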
Q. 3. What is slice splicing?
Ans. If we decide to develop a program in terms of compatible slices, then we could code a slice and immediately test it. We can then code and test other slices and merge them into a fairly solid program. This is known as slice splicing.
Q. 4. What are the different sources of knowledge for white-box
testing?
Ans. The following are the knowledge sources for white-box or
structural testing:
a. High-level design
b. Detailed design
c. Control flow graphs
Q. 5. What is stress testing?
Ans. Stress testing is a testing technique used to determine if the
system can function when subjected to large volumes of data. It
includes areas like
a. Input transactions
b. Internal tables
c. Disk space
d. Output
e. Communications
f. Computer capacity
g. Interaction with users
If the application functions properly with stressed data then it is
assumed that it will function properly with normal volumes of
work.
Objectives of stress testing: To simulate a production
environment for determining that
a. Normal or above-normal volumes of transactions can be
processed through the transaction within the available time
frame.
b. The system is able to process large volumes of data.
c. Enough system capacity is available.
d. Users can perform their assigned tasks.
Q. 6. How to use stress testing?
Ans. Online systems should be stress tested by allowing people to enter transactions at a normal or above-normal pace.
Batch systems can be stress tested with large input batches.
Please note that the error conditions should be included in tested
transactions. Transactions that are used in stress testing can be
obtained from test data generators, test transactions created by
the test group, or the transactions previously processed in the
production environment.
In stress testing, the system should be run as it would in the
production environment. Operators should use standard
documentation and the people entering transactions or working
with the system should be the clerical personnel that will use the
system.
Examples of stress testing:
i. Enter transactions that determine that sufficient disk space
has been allocated to the application.
ii. Overloading the communication network with transactions.
iii. Testing system overflow conditions by entering more transactions than can be accommodated by tables, queues, internal storage facilities, etc.
Q. 7. What is execution testing? Give some examples.
Ans. Execution testing is used to determine whether the system can
meet the specific performance criteria. It includes the following:
a. Verifying the optimum use of hardware and software.
b. Determining the response time to online user requests.
c. Determining transaction processing turnaround time.
Execution testing can be done in any phase of SDLC. It can
evaluate a single aspect of the system like a critical routine in the
system. We can use hardware and software monitors or create
quick and dirty programs to evaluate the approximate
performance of a completed system. This testing may be
executed onsite or offsite for the performance of the test. Please
note that the earlier the technique is used, the higher is the
assurance that the completed application will meet the
performance criteria.
Q. 8. What is recovery testing? Give some examples.
Ans. Recovery testing is used to ensure that operations can be
continued even after a disaster.
It not only verifies the recovery process but also the
effectiveness of the component parts of that process. Its
objectives are:
a. To document recovery procedures
b. To preserve adequate backup data
c. To train recovery personnel
Examples of recovery testing:
i. Inducing failures into one of the application system programs
during processing.
ii. Recovery could be conducted from a known point of integrity
to ensure that the available backup data was adequate for the
recovery process.
Q. 9. What is operations testing? Give some examples.
Ans. Operations testing is designed to determine whether the system is
executable during normal system operations. It evaluates both
the process and the execution of the process. During different
phases of SDLC, it can be used as follows:
Phase 1: Requirements Phase
During this phase, operational requirements can be evaluated to
determine the reasonableness and completeness of these
requirements.
Phase 2: Design Phase
During this phase, the operating procedures should be designed
and thus can be evaluated.
Examples of operations testing:
i. Verifying that the file labeling and protection procedures
function properly.
ii. Determining that the OS supports features and performs the
predetermined tasks.
Q. 10. What is security testing?
Ans. Security testing is used to identify defects that are very difficult to detect. It involves determining that adequate attention is devoted to identifying security risks and that sufficient expertise exists to perform adequate security testing.
Examples of security testing:
i. Determining that the resources being protected are identified.
ii. Unauthorized access on online systems to ensure that the
system can identify such accesses.
Q. 11. Draw a flowchart to show overall mutation testing process.
Ans.
FIGURE 4.34

Q. 12. How is FTR different from the management review?


Ans. We tabulate the differences between the two.

FTR | Management review

1. It is done to examine the product. | It is done to evaluate a project plan.
2. Its purpose is to evaluate the software elements like requirements and design. | Its purpose is to ensure the adequacy and completeness of each planning document for meeting project requirements.
3. It is done to ensure conformity to specifications. | It is done to ensure that project activities are progressing per the planning documents.
4. The results of FTRs are given in technical review reports (TRR). | The results of the management reviews are summarized in a management review report (MRR).
5. It is done to ensure the integrity of changes to the software elements. | The purpose here is to ensure proper allocation of resources.

REVIEW QUESTIONS
1. White-box testing is complementary to black-box testing, not
alternative. Why? Give an example to prove this statement.
2. a. What is a flow graph and what is it used for?
b. Explain the type of testing done using a flow graph.
3. Perform the following:
a. Write a program to determine whether a number is even or
odd.
b. Draw the paths graph and flow graph for the above
problem.
c. Determine the independent path for the above.
4. Why is exhaustive testing not possible?
5. a. Draw the flow graph of a binary search routine and find its
independent paths.
b. How do you measure
1. Test effectiveness?
2. Test efficiency?
6. Write short paragraphs on:
a. Mutation testing.
b. Error-oriented testing.
7. Differentiate between structural and functional testing in detail.
8. Will exhaustive testing guarantee that the program is 100%
correct?
9. a. What is cyclomatic complexity? How can we relate this to
independent paths?
b. Explain the usefulness of error guessing-testing technique.
10. Differentiate between functional and structural testing.
11. Discuss the limitations of structural testing. Why do we say
that complete testing is impossible?
12. Explain define/use testing. Consider any example and show
du-paths. Also identify those du-paths that are not dc-paths.
13. Find the cyclomatic complexity of the graph given below:

FIGURE 4.35

Find all independent paths.


14. What is data flow testing? Explain du-paths. Identify du- and
dc-paths of any example of your choice. Show those du-paths
that are not dc-paths.
15. Write a short paragraph on data flow testing.
16. Explain the usefulness of error guessing testing technique.
17. Discuss the pros and cons of structural testing.
18. a. What are the problems faced during path testing? How can they be minimized?
b. Given the source code below:

List the test cases for statement coverage, branch coverage, and
condition coverage.
19. Why is error seeding performed? How is it different from mutation testing?
20. a. Describe all methods to calculate the cyclomatic
complexity.
b. What is the use of graph matrices?
21. Write a program to calculate the average of 10 numbers. Using data flow testing, design all du-paths and dc-paths in this program.
22. Write a short paragraph on mutation testing.
23. Write a C/C++ program to multiply two matrices. Try to take care of as many valid and invalid conditions as possible. Identify the test data. Justify.
24. Discuss the negative effects of the following constructs from
the white-box testing point of view:
a. GO TO statements
b. Global variables
25. Write a C/C++ program to count the number of characters,
blanks, and tabs in a line. Perform the following:
a. Draw its flow graph.
b. Draw its DD-paths graph.
c. Find its V(G).
d. Identify du-paths.
e. Identify dc-paths.
26. Write the independent paths in the following DD-path graph.
Also calculate mathematically. Also name the decision nodes
shown in Figure 4.36.
27. What are the properties of cyclomatic complexity?

FIGURE 4.36

28. Explain in detail the process to ensure the correctness of data flow in a given fragment of code.
29. Write a C program for finding the maximum and minimum out
of three numbers and compute its cyclomatic complexity using
all possible methods.
30. Consider the following program segment:

i. Draw the control flow graph for this program segment.


ii. Determine the cyclomatic complexity for this program
(give all intermediate steps).
iii. How is cyclomatic complexity metric useful?
31. Explain data flow testing. Consider an example and show all
“du” paths. Also identify those “du” paths that are not “dc”
paths.
32. Consider a program to find the roots of a quadratic equation.
Its input is a triplet of three positive integers (say a, b, c) from
the interval [1, 100]. The output may be one of the following
words—real roots, equal roots, imaginary roots. Find all du-
paths and those that are dc-paths. Develop data flow test cases.
33. If the pseudocode below were a programming language, list
the test cases (separately) required to achieve 100% statement
coverage and path coverage.

34. Consider a program to classify a triangle. Draw its flow graph and DD-path graph.
CHAPTER
5

GRAY-BOX TESTING
Inside this Chapter:
5.0. Introduction to Gray-Box Testing
5.1. What Is Gray-Box Testing?
5.2. Various Other Definitions of Gray-Box Testing
5.3. Comparison of White-Box, Black-Box, and Gray-Box Testing Approaches in Tabular
Form

5.0. INTRODUCTION TO GRAY-BOX TESTING


Code coverage testing involves “dynamic testing” methods of
executing the product with pre-written test cases and finding out how
much of the code has been covered. If a better coverage of the code is
desired, several iterations of testing may be required. For each
iteration, one has to write a new set of test cases for covering those
portions of code that were not covered by earlier test cases. To do such
types of testing, not only does one need to understand the code and
logic but one also needs to understand how to write effective test cases
that can cover good portions of the code. Understanding of code and
logic means white-box or structural testing whereas writing effective
test cases means black-box testing. So, we need a combination of
white-box and black-box techniques for test effectiveness. This type of
testing is known as “gray-box testing.” We must then understand that:
WHITE + BLACK = GRAY

5.1. WHAT IS GRAY-BOX TESTING?


Black-box testing focuses on software’s external attributes and
behavior. Such testing looks at an application’s expected behavior
from the user’s point of view. White-box testing/glass-box testing,
however, tests software with knowledge of internal data structures,
physical logic flow, and architecture at the source code level. White-box testing looks at testing from the developer's point of view. Both
black-box and white-box testing are critically important complements
of a complete testing effort. Individually, they do not allow for
balanced testing. Black-box testing can be less effective at uncovering
certain error types such as data-flow errors or boundary condition
errors at the source level. White-box testing does not readily highlight
macro-level quality risks in operating environment, compatibility,
time-related errors, and usability.

5.2. VARIOUS OTHER DEFINITIONS OF GRAY-BOX TESTING
Gray-box testing incorporates the elements of both black-box and
white-box testing. It considers the outcome on the user end, system-
specific technical knowledge, and the operating environment. It
evaluates the application design in the context of the inter-operability
of system components. The gray-box testing approach is integral to the
effective testing of web applications that comprise numerous
components, both software and hardware. These components must be
tested in the context of system design to evaluate their functionality
and compatibility. Listed below are some more definitions of gray-box
testing:

“Gray-box testing consists of methods and tools derived from the


knowledge of the application internals and the environment with
which it interacts, that can be applied in black-box testing to enhance
testing productivity, bug finding and bug analyzing efficiency.”
—Nguyen H.Q.
OR

“Gray-box testing is using inferred or incomplete structural or design


information to expand or focus black-box testing.”
—Dick Bender
OR

"Gray-box testing is designing test cases based on the knowledge of algorithms, internal states, architectures, or other high-level descriptions of program behavior."
—Doug Hoffman

OR

“Gray-box testing involves inputs and outputs, but test design is


educated by information about the code or the program operation of a
kind that would normally be out of scope of view of the tester.”
—Cem Kaner

Gray-box testing is well suited for web application testing because it factors in the high-level design, the environment, and the inter-operability conditions. It reveals problems that are not as easily considered by a black-box or white-box analysis, especially problems of end-to-end information flow and distributed hardware/software system configuration and compatibility. Context-specific errors that are germane to web systems are commonly uncovered in this process.

5.3. COMPARISON OF WHITE-BOX, BLACK-BOX, AND GRAY-BOX TESTING APPROACHES IN TABULAR FORM
Before tabulating the differences between black-box, gray-box, and
white-box testing techniques, we must first understand that when we
say test granularity, we mean the level of details. And when we say the
highest, it means that all internals are known.
FIGURE 5.1 Comparison of Black-Box, Gray-Box, and White-Box
Techniques.

Next, we tabulate the points of differences between them.

1. Black-box: low granularity. Gray-box: medium granularity. White-box: high granularity.
2. Black-box: internals NOT known. Gray-box: internals partially known. White-box: internals fully known.
3. Black-box: internals not required to be known. Gray-box: internals relevant to testing are known. White-box: internal code of the application and database known.
4. Black-box: also known as opaque-box, closed-box, input-output, data-driven, behavioral, or functional testing. Gray-box: also known as translucent-box testing. White-box: also known as glass-box, clear-box, design-based, logic-based, structural, or code-based testing.
5. Black-box: done by end-users (user acceptance testing); also done by testers and developers. Gray-box: done by end-users (user acceptance testing); also done by testers and developers. White-box: normally done by testers and developers.
6. Black-box: a testing method where the system is viewed as a black box, the internal behavior of the program is ignored, and testing is based upon external specifications. Gray-box: internals partly known (gray), but not fully known (white); test design is based on knowledge of the algorithms, internal states, architecture, or other high-level descriptions of the program behavior. White-box: internals fully known.
7. Black-box: likely to be the least exhaustive of the three. Gray-box: somewhere in between. White-box: potentially the most exhaustive of the three.
8. Black-box: requirements based; test cases are based on the functional specifications, as internals are not known. Gray-box: better variety/depth in test cases on account of high-level knowledge of internals. White-box: ability to exercise code with a relevant variety of data.
9. Black-box: being specification based, it would not suffer from the deficiency described for white-box testing. Gray-box: it would also not suffer from that deficiency. White-box: because test cases are written based on the code, specifications missed in coding would not be revealed.
10. Black-box: suited for functional/business domain testing. Gray-box: suited for functional/business domain testing, but in depth. White-box: suited for all.
11. Black-box: not suited to algorithm testing. Gray-box: not suited to algorithm testing. White-box: appropriate for algorithm testing.
12. Black-box: concerned with validating outputs for given inputs, the application being treated as a black box. Gray-box: offers a better variety of inputs and the ability to extract test results from a database for comparison with expected results. White-box: facilitates structural testing; enables coverage of logic, statements, decisions, conditions, paths, and control flow within the code.
13. Black-box: can test data domains, internal boundaries, and overflow only by trial and error. Gray-box: can test data domains, internal boundaries, and overflow, if known. White-box: can determine, and therefore test better, data domains, internal boundaries, and overflow.

SUMMARY
1. As testers, we get ideas for test cases from a wide range of
knowledge areas. This is partially because testing is much
more effective when we know what types of bugs we are
looking for. As testers of complex systems, we should strive to
attain a broad balance in our knowledge, learning enough
about many aspects of the software and systems being tested to
create a battery of tests that can challenge the software as
deeply as it will be challenged in the rough and tumble day-to-
day use.
2. Every tester in a test team need not be a gray-box tester. The greater the mix of different types of testers in a team, the better the chances of success.

MULTIPLE CHOICE QUESTIONS


1. When both black-box and white-box testing strategies are
required to test a software then we can apply
a. Gray-box testing.
b. Mutation testing.
c. Regression testing.
d. None of the above.
2. Which of the following is true?
a. There is no such testing named gray-box testing.
b. Gray-box testing is well suited for web applications.
c. Gray-box is same as white-box only.
d. None of the above.
3. Gray-box testing is also known as
a. Opaque-box testing.
b. Clear-box testing.
c. Translucent testing.
d. None of the above.
4. Gray-box testing involves
a. Low granularity.
b. Medium granularity.
c. High granularity.
d. None of the above.
5. Gray-box testing is done by
a. End users.
b. Testers and developers.
c. Both (a) and (b)
d. None of the above.

ANSWERS
1. a.
2. b.
3. c.
4. b.
5. c.

CONCEPTUAL SHORT QUESTIONS WITH ANSWERS


Q. 1. Gray-box testing is based on requirement-based test case
generation. Explain.
Ans. Gray-box testing uses assertion methods to preset all the
conditions required, prior to a program being tested. Formal
testing is one of the commonly used techniques for ensuring that
a core program is correct to a very large degree. If the
requirement specification language is being used to specify the
requirement, it would be easy to execute the requirement stated
and verify its correctness. Gray-box testing will use the predicate
and verifications defined in requirement specification language
as inputs to the requirements based test case generation phase.
Q. 2. Why is gray-box testing especially useful for web and Internet
applications?
Ans. It is useful because the Internet is built around loosely integrated components that are connected via relatively well-defined interfaces. It factors in high-level design, environment, and inter-operability
conditions. It reveals problems of end-to-end information flow
and distributed hardware/software system configuration and
compatibility. Please note that context specific errors are
germane to web systems and it is our gray-box testing only that
uncovers these errors.
Q. 3. Can we have only gray-box testers?
Ans. No, because a high level of success lies in the fact that we can
have a mix of different types of testers with different skill sets—
like database expert, security expert, test automation expert, etc.
Q. 4. What is the gray-box assumption for object-oriented software?
Ans. Typically, a reference state model for SUT is assumed. The
testing problem is to identify failure to conform to the reference
model. The tester must determine which state is actually
obtained by applying a distinguishing sequence and observing
the resulting output. This increases the number of tests by a large
amount. With object-oriented software, we assume that the
internal state can be determined by
a. A method of activation state where activation is equated with
state.
b. State reporting capability in the class-under-test (CUT).
c. Built-in test reporting in the CUT.
This is the Gray-Box assumption.

REVIEW QUESTIONS
1. Define “gray-box testing.”
2. Give various other definitions of gray-box testing.
3. Compare the white-box, black-box, and gray-box testing approaches.
4. How is gray-box testing related to white- and black-box
testing?
5. Gray-box testing is well suited for web applications testing.
Why?
6. Assume that you have to build the real-time large database
connectivity system software. Then what kinds of testing do
you think are suitable? Give a brief outline and justification for
any four kinds of testing.
CHAPTER
6

REDUCING THE NUMBER OF TEST CASES
Inside this Chapter:
6.0. Prioritization Guidelines
6.1. Priority Category Scheme
6.2. Risk Analysis
6.3. Regression Testing—Overview
6.4. Prioritization of Test Cases for Regression Testing
6.5. Regression Testing Technique—A Case Study
6.6. Slice Based Testing

6.0. PRIORITIZATION GUIDELINES


The goal of prioritization of test cases is to reduce the set of test cases
based on some rational, non-arbitrary, criteria, while aiming to select
the most appropriate tests. If we prioritize the test cases, we run the
risk that some of the application features will not be tested at all.
The prioritization schemes basically address these key concepts:
a. What features must be tested?
b. What are the consequences if some features are not tested?
The main guide for selecting test cases is to assess the risk first. At the
end of the test case prioritization exercise, the tester and the project
authority must feel confident with the tests that have been selected for
execution. If someone is distressed about the omitted tests, then re-
evaluate the list and apply another prioritization scheme to analyze the
application from another point of view. Ideally, the project authority, and possibly other project members, must buy in to and sign off on the prioritized set of tests.
There are four schemes that are used for prioritizing the existing set of test cases. These reduction schemes are as follows:
1. Priority category scheme
2. Risk analysis
3. Interviewing to find out problematic areas
4. Combination schemes
All of these reduction methods are independent. No one method is
better than the other. One method may be used in conjunction with
another one. It raises confidence when different prioritization schemes
yield similar conclusions.
We will discuss these techniques now.

6.1. PRIORITY CATEGORY SCHEME


The simplest scheme for categorizing tests is to assign a priority code
directly to each test description. One possible scheme is a three-level
priority categorization scheme. It is given below:
Priority 1: This test must be executed.
Priority 2: If time permits, execute this test.
Priority 3: If this test is not executed, the team won’t be upset.
So, assigning a priority code is as simple as writing a number
adjacent to each test description. Once priority codes have been
assigned, the tester estimates the amount of time required to execute
the tests selected in each category. If the time estimate falls within the
allotted schedule, then the partitioning exercise is completed and you
have identified the tests to use. Otherwise, further partitioning is
required.
The second scheme can be a new five-level scale to further classify
the tests. This scheme is given below:
Priority 1a: This test must pass or the delivery will be out of date.
Priority 2a: Before final delivery, this test must be executed.
Priority 3a: If time permits, execute this test.
Priority 4a: This test can wait after the delivery date.
Priority 5a: We will probably never execute this test.
We then try to subdivide the tests. For example, tests from priority 2 may now be divided among priorities 3a, 4a, and 5a; we can downgrade or upgrade any test in a similar fashion.
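A small C sketch (test names, priority codes, and time estimates are invented) of how this partitioning exercise might be mechanized: take tests in priority order and keep each one only while its estimated time still fits the allotted schedule. This is a simplification of the scheme described above, not a prescribed tool.

#include <stdio.h>

struct test {
    const char *name;
    int priority;     /* 1 = must run, 2 = if time permits, 3 = optional */
    int minutes;      /* estimated execution time */
};

int main(void)
{
    /* Illustrative test descriptions with priority codes already assigned. */
    struct test tests[] = {
        { "login form",        1, 30 },
        { "report generation", 2, 50 },
        { "help screens",      3, 20 },
        { "data import",       1, 60 },
    };
    int n = sizeof tests / sizeof tests[0];
    int budget = 150;   /* minutes available in the allotted schedule */
    int used = 0;

    /* Select by priority level: all priority-1 tests first, then 2, then 3, */
    /* skipping a test as soon as its estimate no longer fits the schedule.  */
    for (int p = 1; p <= 3; p++)
        for (int i = 0; i < n; i++)
            if (tests[i].priority == p && used + tests[i].minutes <= budget) {
                used += tests[i].minutes;
                printf("run (P%d, %3d min): %s\n", p, tests[i].minutes, tests[i].name);
            }
    printf("total scheduled: %d of %d minutes\n", used, budget);
    return 0;
}

If the selected set still exceeds the schedule, the same loop can be rerun against the finer five-level scale (1a to 5a).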

6.2. RISK ANALYSIS


All software projects benefit from risk analysis. Even for non-critical software, using risk analysis at the beginning of a project highlights the potential problem areas. This helps developers and managers to
mitigate the risks. The tester uses the results of risk analysis to select
the most crucial tests.

How Is Risk Analysis Done?


Risk analysis is a well-defined process that prioritizes modules for
testing. A risk contains three components:
1. The risk, ri, associated with a project (i = 1 to n).
2. The probability of occurrence of a risk, (li).
3. The impact of the risk, (xi).
Risk analysis consists of first listing the potential problems and
then assigning a probability and severity value for each identified
problem. By ranking the results, the tester can identify the potential
problems most in need of immediate attention and select test cases to
address those needs. During risk analysis we draw a table as shown
below:
FIGURE 6.1 Risk Analysis Table (RAT).

Here, in Figure 6.1,


Problem ID. It is a unique identifier associated with a risk.
Potential problem. It is a brief description of the problem.
Probability of occurrence, (li). It is a probability value on a scale of 1
(low) to 10 (high).
Severity of impact, (xi). It is a severity value on a scale of 1 (low) to
10 (high).
Risk exposure. It is defined as the product of li and xi.
In this table, the values of li and xi range from 1 to 10. Multiplying
the probability and severity values yields the risk exposure. “The
higher the risk exposure product, the more important it is to test for
that condition.” Applying this rule to Figure 6.1, we will get the
following rankings of the potential risk problems based on the product
of risk-exposure. The order of preference is:

C—D—G—A—E—F—B
Although problems D and G have the same risk exposure, they
differ by their probability and severity values.
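A small C sketch (problem identifiers, probabilities, and severities are made up; they are not the entries of Figure 6.1) of how the risk exposure products are computed and the potential problems ranked:

#include <stdio.h>

struct risk {
    char id;
    int  probability;   /* li: 1 (low) .. 10 (high) */
    int  severity;      /* xi: 1 (low) .. 10 (high) */
};

int main(void)
{
    /* Illustrative potential problems (not those of Figure 6.1). */
    struct risk risks[] = {
        { 'A', 3, 9 }, { 'B', 8, 2 }, { 'C', 9, 9 }, { 'D', 5, 8 }
    };
    int n = sizeof risks / sizeof risks[0];

    /* Risk exposure = probability of occurrence * severity of impact.     */
    /* A simple selection sort puts the highest exposure (test it first)   */
    /* at the top of the list.                                             */
    for (int i = 0; i < n - 1; i++)
        for (int j = i + 1; j < n; j++)
            if (risks[j].probability * risks[j].severity >
                risks[i].probability * risks[i].severity) {
                struct risk tmp = risks[i];
                risks[i] = risks[j];
                risks[j] = tmp;
            }

    for (int i = 0; i < n; i++)
        printf("%c: exposure = %2d\n", risks[i].id,
               risks[i].probability * risks[i].severity);
    return 0;
}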
For some organizations, this method may produce enough
information. Others may wish to do risk-matrix analysis also. We will
now discuss risk matrix analysis.

What Is a Risk Matrix?


A risk matrix allows the tester to evaluate and rank potential problems
by giving more weight to the probability or severity value as
necessary. The software tester uses the risk matrix to assign thresholds
that classify the potential problems into priority categories.
There are four main methods of forming risk matrices:
Method I: Typically, the risk matrix contains four quadrants, as
shown in Figure 6.2.
Each quadrant in Figure 6.2 represents a priority class defined as
follows:
Priority 1: High severity and high probability
Priority 2: High severity and low probability
Priority 3: Low severity and high probability
Priority 4: Low severity and low probability

FIGURE 6.2 Method I.


FIGURE 6.3 Method II.

We can see from the graph of Figure 6.2 that a risk with high
severity is deemed more important than a problem with high
probability. Thus, all risks mapped in the upper-left quadrant fall into
priority 2.
For example, the risk-e which has a high probability of occurrence
but a low severity of impact is put under priority 3.
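A minimal C sketch of Method I (the threshold of 5 on both 1-to-10 scales is an assumption; the text does not fix one): classify a risk into one of the four priority quadrants, giving severity more weight than probability:

#include <stdio.h>

/* Method I quadrants: severity is weighted above probability, so a        */
/* high-severity/low-probability risk lands in priority 2, while a         */
/* low-severity/high-probability risk lands in priority 3.                 */
static int priority_method1(int probability, int severity)
{
    int high_p = probability > 5;   /* assumed mid-scale threshold */
    int high_s = severity    > 5;

    if (high_s && high_p)  return 1;   /* high severity, high probability */
    if (high_s && !high_p) return 2;   /* high severity, low probability  */
    if (!high_s && high_p) return 3;   /* low severity, high probability  */
    return 4;                          /* low severity, low probability   */
}

int main(void)
{
    /* e.g., the risk "e" above: high probability, low severity */
    printf("priority = %d\n", priority_method1(9, 2));   /* prints 3 */
    return 0;
}

For the risk-e values above (probability 9, severity 2), the function returns priority 3, matching the quadrant assignment of Figure 6.2.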
Method II: For an entirely different application, we may swap the
definitions of priorities 2 and 3, as shown in Figure 6.3.
An organization favoring Figure 6.3 seeks to minimize the total
number of defects by focusing on problems with a high probability of
occurrence.
Although dividing a risk matrix into quadrants is most common, testers can determine the thresholds using different types of boundaries based on application-specific needs.
FIGURE 6.4 Method III.

Method III: Diagonal band prioritization scheme.


If severity and probability tend to carry equal weight, i.e., if li = xi, then the diagonal band prioritization scheme may be more appropriate.
This is shown in Figure 6.4.
This threshold pattern is a compromise for those who have
difficulty in selecting between priority-2 and priority-3 in the quadrant
scheme.

FIGURE 6.5 Method IV.

Method IV: Problems with high severity must be given top priority, irrespective of the probability value. This requirement is addressed by Method IV, shown in Figure 6.5. The remainder of the risk matrix is partitioned into several lower priorities, either as quadrants (Methods I and II) or as diagonal bands (Method III).

6.3. REGRESSION TESTING—OVERVIEW


Regression testing is done to ensure that enhancements or defect fixes made to the software work properly and do not affect the existing functionality. It is usually done during the maintenance phase.
As a software system ages, the cost of maintaining the software
dominates the overall cost of developing the software. Regression
testing is a testing process that is used to determine if a modified
program still meets its specifications or if new errors have been
introduced. Improvements in the regression testing process would help
reduce the cost of software.

6.3.1. DIFFERENCES BETWEEN REGRESSION AND NORMAL TESTING


Let us now compare normal and regression testing in the table below.

Normal testing | Regression testing

1. It is usually done during the fourth phase of the SDLC. | It is done during the maintenance phase.
2. It is basically software's verification and validation. | It is also called program revalidation.
3. New test suites are used to test our code. | Both old and new test cases can be used for testing.
4. It is done on the original software. | It is done on modified software.
5. It is cheaper. | It is a costlier test plan.

6.3.2. TYPES OF REGRESSION TESTING

Four types of regression testing techniques are discussed one-by-one. They are:
1. Corrective regression testing
2. Progressive regression testing
3. Retest-all regression testing
4. Selective regression testing

6.3.2.1. CORRECTIVE REGRESSION TESTING

Corrective regression testing applies when specifications are unmodified and test cases can be reused.
6.3.2.2. PROGRESSIVE REGRESSION TESTING

Progressive regression testing applies when specifications are modified and new test cases must be designed.

6.3.2.3. THE RETEST-ALL STRATEGY

The retest-all strategy reuses all tests, but this strategy may waste time
and resources due to the execution of unnecessary tests. When the
change to a system is minor, this strategy would be wasteful.

6.3.2.4. THE SELECTIVE STRATEGY


The selective strategy uses a subset of the existing test cases to reduce
the retesting cost. In this strategy, a test unit must be rerun if and only
if any of the program entities, e.g., functions, variables etc., it covers
have been changed. The challenge is to identify the dependencies
between a test case and the program entities it covers. Rothermel and
Harrold specified a typical selective regression testing process. In
general, the process includes the following steps:
Step 1. Identify the affected software components after program P has been modified to P′.
Step 2. Select a subset of test cases, T′, from an existing test suite, T, that covers the software components affected by the modification.
Step 3. Test the modified program P′ with T′ to establish the correctness of P′ with respect to T′.
Step 4. Examine the test results to identify failures.
Step 5. Identify and correct the fault(s) that caused a failure.
Step 6. Update the test suite and test history for P′.
From these steps, we can observe the following characteristics of
selective regression testing:
1. Identifying the program components that must be retested and
finding those existing tests that must be rerun are essential.
2. When selected test cases satisfy retest criterion, new test cases
are not needed.
3. Once regression testing is done, it is necessary to update and
store the test information for reuse at a later time.
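To make Step 2 and characteristic 1 concrete, here is a small hedged C sketch (test names and program entities are invented) of the selective criterion: a test is rerun if and only if it covers at least one entity changed between P and P′:

#include <stdio.h>
#include <string.h>

#define MAX_COVERED 4

struct test_case {
    const char *name;
    const char *covers[MAX_COVERED];   /* program entities this test exercises */
};

int main(void)
{
    /* Coverage information recorded from an earlier run of the full suite T. */
    struct test_case suite[] = {
        { "t1", { "login",  "audit"  } },
        { "t2", { "report"           } },
        { "t3", { "login",  "report" } },
    };
    /* Program entities modified when P was changed to P'. */
    const char *changed[] = { "report", NULL };

    /* T' = every test that covers at least one changed entity. */
    for (int i = 0; i < 3; i++) {
        int selected = 0;
        for (int c = 0; !selected && c < MAX_COVERED && suite[i].covers[c]; c++)
            for (int m = 0; !selected && changed[m] != NULL; m++)
                if (strcmp(suite[i].covers[c], changed[m]) == 0)
                    selected = 1;
        if (selected)
            printf("rerun %s\n", suite[i].name);    /* selects t2 and t3 */
    }
    return 0;
}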
Regression testing is one kind of testing that is applied at all three
levels of testing. White suggested applying specification based
(system) testing before structure based (unit) testing to get more test
case reuse because system testing cases could be reused in unit testing
but the converse is not true. McCarthy suggested applying regression
unit testing first to find faults early.

6.3.2.5. REGRESSION TESTING AT UNIT LEVEL


Unit testing is the process of testing each software module to ensure
that its performance meets its specifications. Yau and Kishimoto
developed a method based on the input partition strategy. McCarthy
provided a way to automate unit regression testing.
Gupta, Harrold, and Soffa proposed an approach to data flow
based regression testing.

6.3.2.6. REGRESSION TESTING AT INTEGRATION LEVEL

Integration testing is the testing applied when individual modules are combined to form larger working units until the entire program is
created. Integration testing detects failures that were not discovered
during unit testing. Integration testing is important because
approximately 40% of software errors can be traced to module
integration problems discovered during integration testing.
Leung and White introduced the firewall concept to regression
testing at the integration level. A firewall is used to separate the set of
modules affected by program changes from the rest of the code. The
modules enclosed in the firewall could be those that interact with the
modified modules or those that are direct ancestors or direct
descendants of the modified modules.
The firewall concept is simple and easy to use, especially when the
change to a program is small. By retesting only the modules and
interfaces inside the firewall, the cost of regression integration testing
can be reduced.
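A minimal C sketch (module names and the call relation are invented) of the firewall idea: the set of modules to retest is the changed modules together with their direct callers and direct callees:

#include <stdio.h>

#define N 5

int main(void)
{
    const char *modules[N] = { "ui", "auth", "db", "report", "mail" };
    /* calls[i][j] = 1 means module i calls module j.                         */
    int calls[N][N] = {
        /* ui   auth  db   report mail */
        {  0,    1,    0,    1,    0 },   /* ui     */
        {  0,    0,    1,    0,    0 },   /* auth   */
        {  0,    0,    0,    0,    0 },   /* db     */
        {  0,    0,    1,    0,    1 },   /* report */
        {  0,    0,    0,    0,    0 },   /* mail   */
    };
    int changed[N]  = { 0, 1, 0, 0, 0 };  /* "auth" was modified */
    int firewall[N] = { 0 };

    /* Enclose each changed module plus its direct callers and callees.       */
    for (int i = 0; i < N; i++)
        if (changed[i]) {
            firewall[i] = 1;
            for (int j = 0; j < N; j++) {
                if (calls[i][j]) firewall[j] = 1;   /* direct descendant */
                if (calls[j][i]) firewall[j] = 1;   /* direct ancestor   */
            }
        }

    for (int i = 0; i < N; i++)
        if (firewall[i])
            printf("retest: %s\n", modules[i]);     /* ui, auth, db */
    return 0;
}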

6.3.2.7. REGRESSION TESTING AT SYSTEM LEVEL

System testing is testing of the entire software system against the system specifications. It must verify that all system elements have
been properly integrated and perform allocated functions. It can be
performed without the knowledge of the software implementation at
all.
Test Tube is a system developed at AT&T Bell laboratories to
perform system level regression testing. Test Tube is an example of a
selective retesting technique. Test Tube partitions a software system
into basic code entities, then monitors the execution of a test case,
analyzes its relationship with the system under test, and determines
which subset of the code entities the test covers. There are plans to
extend Test Tube to non-deterministic systems such as real-time
telecommunications software.

6.3.2.8. REGRESSION TESTING OF GLOBAL VARIABLES

A global variable is an output parameter in a module where the variable is defined and an input parameter for a module that uses the variable. Global variables can be retested at the unit level by running the tests that exercise the changed code and the instructions referencing the global variable.
A global variable can be retested at the integration level. If any of
its defining modules have been changed then all of its using modules
must be retested. The regression testing of global variables is very
time consuming and, therefore, costly.
6.3.2.9. COMPARISON OF VARIOUS REGRESSION TESTING TECHNIQUES

There are four criteria that form a framework for evaluation of different selective regression testing techniques. They are discussed below:
1. Inclusiveness: It measures the extent to which a technique
chooses tests that will expose faults caused by program
changes.
2. Precision: It measures the ability of a technique to avoid
choosing tests that will not reveal the faults caused by the
changes.
3. Efficiency: It measures the computational cost of a technique.
4. Generality: It measures the ability of a technique to handle
different language constructs and testing applications.

6.3.2.10. REGRESSION TESTING IN OBJECT-ORIENTED SOFTWARE

Object-oriented concepts such as inheritance and polymorphism
present unique problems in maintenance and, thus, regression testing
of object-oriented programs. Several regression testing techniques
have been extended to retesting of object-oriented software.
Rothermel and Harrold extended the concept of selective
regression testing to object-oriented environments. Their algorithm
constructs program or class dependence graphs and uses them to
determine which test cases to select. To test a graph, driver programs
are developed to invoke the methods in a class in different sequences.
The class dependence graph links all these driver programs together
by selecting one driver as the root of the graph and adding edges from
it to the public methods in the class. Now the methods can be invoked
in different sequences.
Abdullah and White extended the firewall concept to retesting
object-oriented software. A firewall is an imaginary boundary that
encloses the entities that must be retested in a modified program.
Unlike the firewall in a procedure-based program, the firewall in an
object-oriented program is constructed in terms of classes and objects.
When a change is made to a class or an object, the other classes or
objects that interact with this changed class or object must be enclosed
in the firewall. Because the relations between classes and objects are
different, there should be different ways to construct the firewalls for
classes and objects.

6.4. PRIORITIZATION OF TEST CASES FOR REGRESSION TESTING
Prioritization of tests requires a suitable cost criterion. Please
understand that tests with lower costs are placed at the top while those
with higher costs at the bottom.

What Cost Criterion to Use?


We could use multiple criteria to prioritize tests. Also, note that the
tests being prioritized are the ones selected using some test selection
technique. Thus, each test is expected to traverse some modified portion of P′.
Prioritization of regression tests offers a tester an opportunity to
decide how many and which tests to run under time constraints. When
all tests cannot be run, one needs to find some criteria to decide when
to stop testing. This decision could depend on a variety of factors such
as:
a. Time constraints
b. Test criticality
c. Customer requirements

6.5. REGRESSION TESTING TECHNIQUE—A CASE STUDY
Regression testing is used to confirm that fixed bugs have been fixed
and that new bugs have not been introduced. How many cycles of
regression testing are required will depend upon the project size.
Cycles of regression testing may be performed once per milestone or
once per build. Regression tests can be automated.

The Regression–Test Process


The regression-test process is shown in Figure 6.6.

FIGURE 6.6

This process assumes that P′ (the modified program) is available for regression testing. There is a long series of tasks that lead from P to P′.
Test minimization ignores redundant tests. For example, if both t1 and t2 test function f in P, then one might decide to reject t2 in favor of t1. The purpose of minimization is to reduce the number of tests to
execute for regression testing.
Test prioritization means prioritizing tests based on some criteria.
A set of prioritized tests becomes useful when only a subset of tests
can be executed due to resource constraints. Test selection can be
achieved by selecting a few tests from a prioritized list. The possible
sequence to execute these tasks is:

Test setup means the process by which AUT (application under


test) is placed in its intended or simulated environment and is ready to
receive data and output the required information. Test setup becomes
more challenging when we test embedded software like in ATMs,
printers, mobiles, etc.
The sequence in which tests are input to an application is an important issue. Test sequencing is very important for applications that have an internal state and run continuously, for example, online-banking software.
We then execute the test cases. Each test needs verification. This
can also be done automatically with the help of CASE tools. These
tools compare the expected and observed outputs. Some of the tools
are:
a. Test Tube (by AT&T Bell Labs.) in 1994: This tool can do
selective retesting of functions. It supports C.
b. Echelon (by Microsoft) in 2002: No selective retesting but
does test prioritization. It uses basic blocks to test. It supports C
and binary languages.
c. ATAC/xSuds (by Telcordia Technologies) in 1992: It does selective retesting. It allows test prioritization and minimization. It does control/data flow coverage. It also supports C.

6.6. SLICE-BASED TESTING

Static slicing may lead to an unduly large program slice. So, Korel and
Laski proposed a method for obtaining dynamic slices from program
executions. They used a method to extract executable and smaller
slices and to allow more precise handling of arrays and other
structures. So, we discuss dynamic slicing.
Let “P” be the program under test and “t” be a test case against
which P has been executed. Let “l” be a location in P where variable v
is used. Now, the dynamic slice of P with respect to “t” and “v” is the
set of statements in P that lie in trace (t) and did effect the value of “v”
at “l.” So, the dynamic slice is empty if location “l” was not traversed
during this execution. Please note that the notion of a dynamic slice
grew out of that of a static slice based on program “P” and not on its
execution.
Let us solve an example now.

FIGURE 6.7

EXAMPLE 6.1. Consider the following program:


1. main ( ) {
2. int p, q, r, z;
3. z = 0;
4. read (p, q, r);
5. if (p < q)
6. z = 1; //modified z
7. if (r < 1)
8. z = 2;
9. output (z);
10. end
11. }
Test case (t1): <p = 1, q = 3, r = 2>. What will be the dynamic
slice of P with respect to variable “z” at line 9? What will be its static
slice? What can you infer? If t2: <p = 1, q = 0, r = 0> then what will be
dynamic and static slices?
SOLUTION. Let us draw its flow graph first shown in Figure 6.7.
∴ The dynamic slice of P with respect to variable z at line 9 is td = <4, 5, 6>.
The static slice is ts = <3, 4, 5, 6, 7, 8>.

NOTE

The dynamic slice for any variable is generally smaller than the corresponding static slice.

Now, t2: <p = 1, q = 0, r = 0>


∴ Its dynamic slice is the set of statements <4, 7, 8>, while the static slice does not change.

NOTE

Dynamic slice contains all statements in trace (t) that had an effect on
program output.

Inferences made:

1. A dynamic slice can be constructed based on any program variable that is used at some location in P, the program that is
being modified.
2. Some programs may have several locations and variables of interest at which to compute the dynamic slice; then we need to compute slices for all such variables at their corresponding locations and take the union of all slices to create a combined dynamic slice. This approach is useful for regression testing of relatively small components.
3. If a program is large then a tester needs to find out the critical
locations that contain one or more variables of interest. Then,
we can build dynamic slices on these variables.
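As a rough sketch of how a dynamic slice could be computed mechanically (this is a simplification, not the authors' algorithm: it tracks data dependences plus a recorded controlling predicate for each definition, and ignores arrays, pointers, and procedures), using the trace of Example 6.1 under t1:

#include <stdio.h>

struct event {          /* one executed statement in trace(t)                  */
    int  line;          /* source line number                                  */
    char def;           /* variable defined here, or ' ' if none               */
    const char *uses;   /* variables used here                                 */
    int  ctrl;          /* index (in this trace) of the controlling predicate, */
                        /* or -1 if the statement is not control dependent     */
};

int main(void)
{
    /* Execution trace of Example 6.1 under t1 = <p=1, q=3, r=2>.              */
    struct event trace[] = {
        { 3, 'z', "",   -1 },   /* z = 0             */
        { 4, 'p', "",   -1 },   /* read(p, q, r)     */
        { 4, 'q', "",   -1 },
        { 4, 'r', "",   -1 },
        { 5, ' ', "pq", -1 },   /* if (p < q), true  */
        { 6, 'z', "",    4 },   /* z = 1, controlled by the line-5 predicate   */
        { 7, ' ', "r",  -1 },   /* if (r < 1), false */
    };
    int n = sizeof trace / sizeof trace[0];

    int relevant[128] = { 0 };  /* variables whose definitions we still need   */
    relevant['z'] = 1;          /* slicing on z at line 9                      */
    int needed[16]   = { 0 };   /* predicates the slice depends on (by index)  */
    int in_slice[32] = { 0 };   /* result, indexed by line number              */

    /* Walk the trace backward from the slicing point.                         */
    for (int i = n - 1; i >= 0; i--) {
        int defines_relevant = (trace[i].def != ' ' && relevant[(int)trace[i].def]);
        if (!defines_relevant && !needed[i])
            continue;                                    /* does not contribute */
        in_slice[trace[i].line] = 1;
        if (defines_relevant)
            relevant[(int)trace[i].def] = 0;             /* definition found    */
        for (const char *u = trace[i].uses; *u; u++)
            relevant[(int)*u] = 1;                       /* now need their defs */
        if (trace[i].ctrl >= 0)
            needed[trace[i].ctrl] = 1;                   /* and the predicate   */
    }

    for (int l = 0; l < 32; l++)
        if (in_slice[l])
            printf("line %d is in the dynamic slice\n", l);   /* 4, 5, 6 */
    return 0;
}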

SUMMARY
Regression testing is used to confirm that fixed bugs have, in fact,
been fixed and that new bugs have not been introduced in the process
and that features that were proven correctly functional are intact.
Depending on the size of a project, cycles of regression testing may be
performed once per milestone or once per build. Some bug regression
testing may also be performed during each acceptance test cycle,
focusing on only the most important bugs. Regression tests can be
automated.

MULTIPLE CHOICE QUESTIONS


1. The main guide for selecting test cases is
a. To assess risks.
b. To assess quality.
c. Both (a) and (b)
d. None of the above.
2. Which of the following is not a risk reduction scheme?
a. Priority category scheme
b. Risk analysis
c. Interviewing
d. None of the above.
3. The potential problems are identified and their probabilities and
impacts are given weights in which technique.
a. Priority categorization scheme
b. Risk analysis
c. Interviews
d. None of the above.
4. Risk exposure is given by which formula.
a. It is the product of probability of occurrence of risk and its
impact
b. It is the sum of probability of its occurrence and its impact
c. It is the standard deviation of the sum of its probability and its
impact
d. None of the above.
5. A risk matrix
a. Allows the testers to evaluate and rank potential problems by giving weights to probability or severity values as necessary.
b. Allows testers to assign thresholds that classify the potential
problems into priority categories.
c. Both (a) and (b).
d. None of the above.
6. In diagonal band prioritization scheme
a. Severity equals probability.
b. Severity is never equal to probability.
c. Either (a) or (b).
d. Both (a) and (b).
7. Some managers found out that
a. Probability of risk to occur is very important.
b. Problems with high severity must be given top priority.
c. Severity of risk is not at all important.
d. None of the above.
8. Which one of the following is NOT a regression testing strategy?
a. Correction regression testing
b. Retest-all strategy
c. Combinational explosion
d. None of the above.
9. Which of the following testing strategy is applicable at all three
levels of testing?
a. White-box testing
b. Mutation testing
c. Regression testing
d. None of the above.
10. What is used to separate a set of modules affected by program
changes from the rest of the code?
a. Firewall
b. NIC
c. Emulator
d. None of the above.

ANSWERS
1. a.
2. d.
3. b.
4. a.
5. c.
6. a.
7. b.
8. c.
9. c.
10. a.
CONCEPTUAL SHORT QUESTIONS WITH ANSWERS
Q. 1. Explain the risk reduction method?
Ans. The formula for quantifying risk also explains how to control or minimize risk. In essence:
expected loss = (frequency of occurrence) * (average loss per occurrence)
For example, say ten users a day terminate their business with the website. The average customer places an $80 order; thus,
Loss due to unfriendly website = 10 * $80 = $800 per day.
Therefore, once the variables and the loss expectation formula
have been defined, controls can then be identified to minimize
that risk.
Q. 2. Can you list the most common risk factors for various project
types given below:
a. MIS
b. Commercial Software Sectors
Ans. a. For MIS Projects
Risk Factors % of Projects at Risk
1. Creeping user requirements 80%
2. Excessive schedule pressure 65%
3. Low quality 60%
4. Cost overruns 55%
5. Inadequate configuration control 50%
b. For Commercial Software Sectors
Risk Factors % of Projects at Risk
1. Inadequate user documentation 70%
2. Low-user satisfaction 55%
3. Excessive time to market 50%
4. Harmful competitive actions 45%
5. Litigation expense 30%
Q. 3. What is SEI-SRE service?
Ans. The Software Engineering Institute (SEI) defines a Software
Risk Evaluation (SRE) service.
SRE is a diagnostic and decision-making tool that enables the
identification, analysis, tracking, mitigation, and
communications of risks in software-intensive programs. It is
used to identify and categorize specific program risks arising
from product, process, management, resources, and constraints.
An SRE can be used for:
a. Preparing for a critical milestone.
b. Recovery from crisis.
An SRE is conducted by a trained SEI-program team. By
implementing SRE, the management improves its ability to
ensure success through the creation of a proactive risk
management methodology.
Q. 4. Why is smoke testing done?
Ans. Smoke testing is an integration testing approach that constructs
and tests software on a daily basis. It consists of:
a. Finding out the basic functionality that a product must satisfy.
b. Designing test cases to ensure that the basic functionality works and combining these test cases into a smoke test suite.
c. Ensuring that every time a product is built, this suite is run first.
d. If this suite fails, escalating to the developers to find out the changes or to roll back the changes to a state where the smoke test suite succeeds.
Q. 5. How can regression testing be done at the module level and at
the product level?
Ans. a. At a module level, it may involve retesting module-execution-
paths (MEPs) traversing the modification.
b. At a product level, this activity may involve retesting
functions that execute the modified area.
The effectiveness of these strategies is highly dependent on the
utilization of test matrices, which enable identification of
coverage provided by particular test cases.

REVIEW QUESTIONS
1. What do you mean by regression testing? Discuss the different
types of regression testing.
2. Define and discuss the following:
a. Regression testing and how regression test selection is done.
b. Selective retest and coverage techniques.
c. Minimization and safe techniques.
3. Write a short paragraph on regression testing.
4. What is regression testing? Discuss the regression test selection
problem.
5. What is the importance of regression testing?
6. Explain how risks are prioritized.
7. How can we reduce the number of test cases?
8. See the following priority scheme for test cases:
Priority 0: Test cases that check basic functionality are run for
accepting the build for further testing, and are run when the
product undergoes a major change. These test cases deliver a
very high project value to development teams and customers.
Priority 1: Uses the basic and normal setup and these test cases
deliver high project value to both development teams and
customers.
Priority 2: These test cases deliver moderate project value.
They are executed as part of the testing cycle and selected for
regression testing on a need basis.
Using this scheme of prioritization, prioritize the following test
cases as P0 or P1 or P2:
a. A test case for DBMS software that tests all options of a
select query.
b. A test case for a file system that checks for deallocation of
free space.
c. A test case that checks the functionality of a router/bridge.
d. A test case that tests the OS bootstrap process with normal parameters.
e. A test case that checks a login form of a website.
9. Explain in brief the reduction schemes for prioritizing the test cases to reduce the required testing effort. As a tester, how do you evaluate and rank potential problems? Suggest some guidelines of your own to reduce the number of test cases.
10. Explain how risk matrix can be used to prioritize the test cases.
Explain giving an example. Why do we need to prioritize the
test cases?
11. a. What are prioritization categories and guidelines for test case
reduction?
b. What is the need for regression testing? How is it done?
12. Suppose your company is about to roll out an e-commerce
application. It is not possible to test the application on all types
of browsers on all platforms and operating systems. What steps
would you take in the testing environment to reduce the testing
process and all possible risks?
13. List and explain prioritization guidelines.
14. What is regression testing? Explain the different types of regression testing. Compare various regression testing techniques.
15. a. What is the role of risk matrix for the reduction of test cases?
b. How is risk analysis used in testing? Explain the role of the
risk analysis table in testing.
CHAPTER
7

LEVELS OF TESTING
Inside this Chapter:
7.0. Introduction
7.1. Unit, Integration, System, and Acceptance Testing Relationship
7.2. Integration Testing

7.0. INTRODUCTION
When we talk of levels of testing, we are actually talking of three
levels of testing:
1. Unit testing
2. Integration testing
3. System testing
The three levels of testing are shown in Figure 7.1.

FIGURE 7.1 Levels of Testing.

Generally, system testing is functional rather than structural


testing. We will now study these testing techniques one-by-one.
7.1. UNIT, INTEGRATION, SYSTEM, AND
ACCEPTANCE TESTING RELATIONSHIP
We can categorize testing as follows:

FIGURE 7.2

We will explain unit or module testing next. Consider the


following diagram of unit testing as shown in Figure 7.3.

FIGURE 7.3 Unit Testing.

Unit (or module) testing “is the process of taking a module (an
atomic unit) and running it in isolation from the rest of the software
product by using prepared test cases and comparing the actual results
with the results predicted by the specification and design module.” It
is a white-box testing technique.
Importance of unit testing:
1. Because modules are being tested individually, testing
becomes easier.
2. It is more exhaustive.
3. Interface errors are eliminated.
TEST (a CASE tool from Parasoft) is one such tool for unit testing; it
automatically tests classes written for the MS .NET framework, so the
tester need not write a single test case or a stub. There are also tools
that help organize and execute test suites at the command-line, API, or
protocol level.
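As a small illustration of unit testing itself, the sketch below (in Python's unittest; the discount function is a hypothetical unit, not one from the text) runs a single module in isolation against prepared test cases and compares the actual results with the expected ones:

import unittest

def discount(amount, percent):
    """The unit (module) under test: a hypothetical discount calculation."""
    if not (0 <= percent <= 100):
        raise ValueError("percent must be between 0 and 100")
    return amount - amount * percent / 100.0

class DiscountUnitTest(unittest.TestCase):
    # Prepared test cases with expected results taken from the specification.
    def test_typical_value(self):
        self.assertAlmostEqual(discount(200.0, 10), 180.0)

    def test_zero_percent(self):
        self.assertAlmostEqual(discount(99.0, 0), 99.0)

    def test_invalid_percent_is_rejected(self):
        with self.assertRaises(ValueError):
            discount(100.0, 150)

if __name__ == "__main__":
    unittest.main()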

7.2. INTEGRATION TESTING


A system is composed of multiple components or modules that
comprise hardware and software. Integration is defined as the set of
interactions among components. Testing the interaction between the
modules and interaction with other systems externally is called
integration testing.
It is both a type of testing and a phase of testing. The architecture
and design can give the details of interactions within systems;
however, testing the interactions between one system and another
system requires a detailed understanding of how they work together.
This knowledge of integration spans many modules and systems, and
these diverse modules may behave differently when integrated with
other systems. This introduces complexity into the procedures and into
what needs to be done. Recognizing this complexity, a dedicated phase
of testing, with its own process, is devoted to testing these
interactions. This ensuing phase is called the integration testing
phase.

7.2.1. CLASSIFICATION OF INTEGRATION TESTING

As shown in Figure 7.4 integration testing is classified as follows:

FIGURE 7.4

We will discuss each of these techniques in the following sections.

7.2.2. DECOMPOSITION-BASED INTEGRATION

When we talk of decomposition-based integration testing techniques,


we usually talk of the functional decomposition of the system to be
tested which is represented as a tree or in textual form. It is further
classified as top-down, bottom-up, sandwich, and big bang integration
strategies. All these integration orders presume that the units have
been separately tested; thus, the goal of decomposition-based
integration is to test the interfaces among separately tested units.

7.2.2.1. TYPES OF DECOMPOSITION-BASED TECHNIQUES

TOP-DOWN INTEGRATION APPROACH
It begins with the main program, i.e., the root of the tree. Any lower-
level unit that is called by the main program appears as a “stub.” A
stub is a piece of throw-away code that emulates a called unit.
Generally, testers have to develop the stubs, and some imagination is
required. So, we draw the decomposition tree shown in Figure 7.5,
where "M" is the main program and "S" represents a stub. From
the figure, we find that:

Number of Stubs Required = (Number of Nodes – 1)

FIGURE 7.5 Stubs.

Once all of the stubs for the main program have been provided, we
test the main program as if it were a standalone unit. We could apply
any of the appropriate functional and structural techniques and look
for faults. When we are convinced that the main program logic is
correct, we gradually replace the stubs with the actual code. Top-down
integration follows a breadth-first traversal (BFS) of the functional
decomposition tree.
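As a rough sketch of the idea (the withdrawal logic and unit names below are hypothetical, not from the text), the main program can be exercised on its own while its callees are replaced by stubs that return canned answers:

def get_balance_stub(account_id):
    # Stub: emulates the real lower-level "get balance" unit with a canned answer.
    return 100.0

def log_transaction_stub(account_id, amount):
    # Stub: pretends the real logging unit succeeded.
    return True

def main_withdraw(account_id, amount, get_balance, log_transaction):
    """The main program under test; its callees are passed in so that stubs
    (and, later, the real units) can be plugged in."""
    if amount <= get_balance(account_id) and log_transaction(account_id, amount):
        return "APPROVED"
    return "DECLINED"

# Test the main program as if it were a standalone unit, using the stubs.
assert main_withdraw("A-1", 50.0, get_balance_stub, log_transaction_stub) == "APPROVED"
assert main_withdraw("A-1", 500.0, get_balance_stub, log_transaction_stub) == "DECLINED"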

7.2.2.2. BOTTOM-UP INTEGRATION APPROACH


It is a mirror image of the top-down order, with the difference that
stubs are replaced by driver modules that emulate units at the next
level up in the tree. In bottom-up integration, we start with the leaves
of the decomposition tree and test them with specially coded drivers.
Less throw-away code exists in drivers than there is in stubs. Recall
that there is one stub for each child node in the decomposition tree.
Most systems have a fairly high fan-out near the leaves, so in the
bottom-up integration order, we will not have as many drivers. This is
partially offset by the fact that the driver modules will be more
complicated. See Figure 7.6, where "D" represents a driver module
and "M" represents the other modules.
Number of drivers required = (No. of nodes – No. of leaf nodes)
For Figure 7.6, drivers required = 5 – 4 = 1,
i.e., 1 driver module (D) is required.

FIGURE 7.6 Drivers.
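In the same spirit, a driver is a small piece of test code that plays the role of the missing higher-level unit: it calls a leaf unit with test data and checks the results. A sketch (the interest calculation is a hypothetical leaf unit, not one from the text):

def monthly_interest(balance, annual_rate):
    """A hypothetical leaf unit of the decomposition tree."""
    return balance * annual_rate / 12.0

def driver():
    # Driver: emulates the unit one level up that would normally call
    # monthly_interest(), feeding it test data and checking the results.
    cases = [(1200.0, 0.12, 12.0), (0.0, 0.10, 0.0)]
    for balance, rate, expected in cases:
        actual = monthly_interest(balance, rate)
        assert abs(actual - expected) < 1e-9, (balance, rate, actual)
    print("leaf unit passed", len(cases), "driver cases")

if __name__ == "__main__":
    driver()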

7.2.2.3. SANDWICH INTEGRATION APPROACH


It is a combination of top-down and bottom-up integration. There will
be less stub and driver development effort, but the drawback is the
difficulty of fault isolation that is a consequence of big-bang
integration.
Because this technique is the combination of the top-down and
bottom-up integration approaches, it is also called bidirectional
integration. It is performed initially with the use of stubs and drivers.
Drivers are used to provide upstream connectivity while stubs provide
downstream connectivity. A driver is a function which redirects the
request to some other component and stubs simulate the behavior of a
missing component. After the functionality of these integrated
components is tested, the drivers and stubs are discarded. This
technique then focuses on those components which need attention and
are new. This approach is called sandwich integration.
In the product development phase when a transition happens from
two-tier architecture to three-tier architecture, the middle tier gets
created as a set of new components from the code taken from bottom-
level applications and top-level services.

7.2.2.4. BIG-BANG INTEGRATION


Instead of integrating component by component and testing, this
approach waits until all the components arrive and one round of
integration testing is done. This is known as big-bang integration. It
reduces testing effort and removes duplication in testing for the multi-
step component integrations. Big-bang integration is ideal for a
product where the interfaces are stable and have fewer defects.

7.2.2.5. PROS AND CONS OF DECOMPOSITION-BASED TECHNIQUES

The decomposition-based approaches are clear (except big-bang


integration). They are built with tested components. Whenever a
failure is observed, the most recently added unit is suspected.
Integration testing progress is easily tracked against the decomposition
tree. The top-down and bottom-up terms suggest breadth-first
traversals of the decomposition tree but this is not mandatory.
One common problem with functional decompositions is that they are
artificial and serve the needs of project management more than the
needs of software developers. This also holds true for decomposition-
based testing. The whole mechanism is that units are integrated with
respect to structure. This presumes that correct behavior follows from
individually correct units and correct interfaces.
The development effort for stubs and drivers is another drawback
to these approaches and this is compounded by the retesting effort. We
try to compute the number of integration test sessions for a given
decomposition tree. A test session is defined as one set of tests for a
specific configuration actual code and stubs. Mathematically,
Sessions = Nodes – Leaves + Edges
For example, if a system has 42 integration testing sessions then it
means 42 separate sets of integration test cases, which is too high.
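The three counting formulas of this section can be applied mechanically. The sketch below is only an illustration, using the figures quoted in the text for the small tree of Figure 7.6 (5 nodes, 4 leaves, and hence 4 edges):

def integration_counts(nodes, leaves, edges):
    """Decomposition-tree counts from this section."""
    stubs = nodes - 1                  # stubs needed for top-down integration
    drivers = nodes - leaves           # drivers needed for bottom-up integration
    sessions = nodes - leaves + edges  # integration test sessions
    return stubs, drivers, sessions

print(integration_counts(nodes=5, leaves=4, edges=4))   # -> (4, 1, 5)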
7.2.2.6. GUIDELINES TO CHOOSE INTEGRATION METHOD AND CONCLUSIONS

S. No.  Factors                                        Suggested method

1.      Clear requirements and design.                 Top-down approach.
2.      Dynamically changing requirements,             Bottom-up approach.
        design, and architecture.
3.      Changing architecture and stable design.       Sandwich (or bidirectional) approach.
4.      Limited changes to existing architecture       Big-bang method.
        with less impact.
5.      Combination of the above.                      Select any one after proper analysis.

7.2.3. CALL GRAPH-BASED INTEGRATION

One of the drawbacks of decomposition-based integration is that the


basis is the functional decomposition tree. But if we use the call graph-
based technique instead, we can remove this problem. Also, we will
move in the direction of structural testing. Because a call graph is a
directed graph, we can use it as a program graph also. This leads
us to two new approaches to integration testing which are discussed
next.

7.2.3.1. PAIRWISE INTEGRATION


The main idea behind pairwise integration is to eliminate the
stub/driver development effort. Instead of developing stubs and
drivers, why not use the actual code? At first, this sounds like big-
bang integration but we restrict a session to only a pair of units in the
call graph. The end result is that we have one integration test session
for each edge in the call graph. This is not much of a reduction in
sessions from either top-down or bottom-up but it is a drastic
reduction in stub/driver development. Four pairwise integration
sessions are shown in Figure 7.7.

FIGURE 7.7 Pairwise Integration.

7.2.3.2. NEIGHBORHOOD INTEGRATION


The neighborhood of a node in a graph is the set of nodes that are one
edge away from the given node. In a directed graph, this includes all
of the immediate predecessor nodes and all of the immediate successor
nodes. Please note that these correspond to the set of stubs and drivers
of the node.
For example, for node 16, the neighborhood consists of nodes 9, 10,
and 12 as successors and node 1 as the predecessor node.
We can always compute the number of neighborhoods for a given call
graph. Each interior node will have one neighborhood, plus one extra
one in case leaf nodes are connected directly to the root node.

NOTE

An interior node has a non-zero in-degree and a non-zero out-degree.


FIGURE 7.8 Neighborhood Integration.

So, we have the following formulas:


1. Interior nodes = Nodes – (Source Nodes + Sink Nodes)
2. Neighborhoods = Interior Nodes + Source Nodes
Substituting 1st equation in 2nd equation above, we get:
Neighborhoods = Nodes – Sink Nodes
Neighborhood integration yields a drastic reduction in the number
of integration test sessions. It avoids stub and driver development. The
end result is that the neighborhoods are essentially the sandwiches that
we slipped past in the previous section.
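These counts can be computed directly from a call graph. The sketch below uses a small made-up call graph (not the one in Figure 7.8) to illustrate the formulas:

def neighborhood_counts(edges):
    """edges: list of (caller, callee) pairs of a directed call graph."""
    nodes = {n for edge in edges for n in edge}
    sources = {n for n in nodes if all(callee != n for _, callee in edges)}  # in-degree 0
    sinks = {n for n in nodes if all(caller != n for caller, _ in edges)}    # out-degree 0
    interior = nodes - sources - sinks
    neighborhoods = len(interior) + len(sources)   # equals len(nodes) - len(sinks)
    return len(interior), neighborhoods

# A small call graph: A calls B and C; B and C each call D.
example = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]
print(neighborhood_counts(example))   # -> (2, 3)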

7.2.3.3. PROS AND CONS


The call graph-based integration techniques move away from a purely
structural basis toward a behavioral basis. These techniques also
eliminate the stub/driver development effort. This technique matches
well with the developments characterized by builds. For example,
sequences of the neighborhood can be used to define builds.
The biggest drawback to call graph-based integration testing is the
fault isolation problem, especially for large neighborhoods. Another
problem occurs when a fault is found in a node (unit) that appears in
several neighborhoods. Obviously, we resolve the fault but this means
changing the unit’s code in some way, which in turn means that all the
previously tested neighborhoods that contain the changed node need to
be retested.

7.2.4. PATH-BASED INTEGRATION WITH ITS PROS AND CONS


We already know that the combination of structural and functional
testing is highly desirable at the unit level and it would be nice to have
a similar capability for integration and system testing. Our goal for
integration testing is: “Instead of testing interfaces among separately
developed and tested units, we focus on interactions among these
units.” Here, cofunctioning might be a good term. Interfaces are
structural whereas interaction is behavioral.
We now discuss some basic terminologies that are used in this
technique.
1. Statement fragment. A statement fragment is a complete statement
or a portion of a statement. Statement fragments are the
nodes in the program graph.
2. Source node. A source node in a program is a statement
fragment at which program execution begins or resumes. The
first executable statement in a unit is clearly a source node.
Source nodes also occur immediately after nodes that transfer
control to other units.
3. Sink node. It is a statement fragment in a unit at which
program execution terminates. The final executable statement
in a program is clearly a sink node, so are the statements that
transfer control to other units.
4. Module execution path (MEP). It is a sequence of statements
that begins with a source node and ends with a sink node with
no intervening sink nodes.
5. Message. A message is a programming language mechanism
by which one unit transfers control to another unit. Depending
on the programming language, messages can be interpreted as
subroutine invocations, procedure calls, and function
references. We follow the convention that the unit that receives
a message always eventually returns control to the message
source. Messages can pass data to other units.
6. Module-to-module path (MM-path). An MM-path is an
interleaved sequence of module execution paths (MEPs) and
messages.
7. Module-to-module path graph (MM-path graph). Given a
set of units, their MM-path graph is the directed graph in
which nodes are module execution paths and edges correspond
to messages and returns from one unit to another. The effect of
these definitions is that program graphs now have multiple
source and sink nodes. This would increase the complexity of
unit testing but the integration testing presumes unit testing is
complete. Also, now our goal is to have an integration testing
analog of DD-paths, as done earlier.
The basic idea of an MM-path is that we can now describe
sequences of module execution paths that include transfers of control
among separate units. These transfers are by messages, therefore,
MM-paths always represent feasible execution paths and these paths
cross unit boundaries. We can find MM-paths in an extended program
graph in which nodes are module execution paths and edges are
messages.
Consider a hypothetical example as shown in Figure 7.9.
Herein, module-A calls module-B, which in turn calls module-C.
Note from Figure 7.9 that the MM-path begins and ends in the main
program only. This is true for traditional (or procedural) software.
FIGURE 7.9 MM-Path Across Three Units (A, B, and C).

In module-A, nodes 1 and 5 are source nodes and nodes 4 and 6


are sink nodes. Similarly, in module-B, nodes 1 and 3 are source
nodes and nodes 2 and 4 are sink nodes. Module-C has a single source
node 1 and a single sink node, 5. This can be shown as follows:

Module    Source nodes    Sink nodes

A         1, 5            4, 6
B         1, 3            2, 4
C         1               5

So, the seven module execution paths are as follows:

MEP (A, 1) = <1, 2, 3, 6>


MEP (A, 2) = <1, 2, 4>
MEP (A, 3) = <5, 6>
MEP (B, 1) = <1, 2>
MEP (B, 2) = <3, 4>
MEP (C, 1) = <1, 2, 4, 5>
MEP (C, 2) = <1, 3, 4, 5>
These are the module execution paths. We can now define an
integration testing analog of the DD-path graph that serves unit testing
so effectively.

FIGURE 7.10 MM-Path Graph Derived from Figure 7.9.

Now, its MM-path graph is shown in Figure 7.10.


Herein, the solid arrows indicate messages and the dotted arrows
represent returns.
Also, note that a program path is a sequence of DD-paths and an
MM-path is a sequence of module execution paths.
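The module execution paths above can be written down directly as data. The short sketch below holds them in a dictionary and walks one possible MM-path (the particular interleaving shown is illustrative and is not copied from Figure 7.10):

# Module execution paths of the example, keyed by (module, index).
MEP = {
    ("A", 1): [1, 2, 3, 6],
    ("A", 2): [1, 2, 4],
    ("A", 3): [5, 6],
    ("B", 1): [1, 2],
    ("B", 2): [3, 4],
    ("C", 1): [1, 2, 4, 5],
    ("C", 2): [1, 3, 4, 5],
}

# One possible MM-path: MEPs interleaved with the messages A->B and B->C
# and the corresponding returns back up to the main program A.
mm_path = [("A", 1), ("B", 1), ("C", 1), ("B", 2), ("A", 3)]

for module, index in mm_path:
    print(module, index, "->", MEP[(module, index)])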

What Are the Endpoints on MM-Paths?


Two criteria, both of which are behavioral, put endpoints on
MM-paths:
1. Message quiescence
2. Data quiescence
Message quiescence occurs when a unit that sends no messages is
reached. For example, module-C in Figure 7.9.
Data quiescence occurs when a sequence of processing
culminates in the creation of stored data that is not immediately used;
this shows up as a data store in a data flow diagram. Points of
quiescence are natural endpoints for an MM-path.

Pros and Cons


1. MM-paths are a hybrid of functional (black-box) and structural
(white-box) testing. They are functional because they represent
actions with inputs and outputs. As such, all the functional
testing techniques are potentially applicable. The structural
side comes from how they are identified, particularly the MM-
path graph. The net result is that the cross-check of the
functional and structural approaches is consolidated into the
constructs for path-based integration testing. We therefore
avoid the pitfall of structural testing and, at the same time,
integration testing gains a fairly seamless junction with system
testing. Path-based integration testing works equally well for
software developed in the traditional waterfall process or with
one of the composition-based alternative life-cycle models.
Later, we will also show that the concepts are equally
applicable to object-oriented software testing.
2. The most important advantage of path-based integration testing
is that it is closely coupled with the actual system behavior,
instead of the structural motivations of decomposition and call
graph-based integration.
3. The disadvantage of this technique is that more effort is needed
to identify the MM-paths. This effort is probably offset by the
elimination of stub and driver development.

7.2.5. SYSTEM TESTING

System testing focuses on a complete, integrated system to evaluate


compliance with specified requirements. Tests are made on
characteristics that are only present when the entire system is run.

7.2.5.1. WHAT IS SYSTEM TESTING?

The testing that is conducted on the complete integrated products and


solutions to evaluate system compliance with specified requirements
on functional and non functional aspects is called system testing. It is
done after unit, component, and integration testing phases.
As we already know, a system is defined as a set of hardware,
software, and other parts that together provide product features and
solutions. In order to test the entire system, it is necessary to
understand the product’s behavior as a whole. System testing brings
out issues that are fundamental to the design, architecture, and code of
the whole product.
System-level tests consist of batteries of tests that are designed to
fully exercise a program as a whole and check that all elements of the
integrated system function properly. System-level test suites also
validate the usefulness of a program and compare end results against
requirements.
System testing is the only testing phase that tests both functional
and non functional aspects of the product.
On the functional side, system testing focuses on real-life customer
usage of the product and solutions. It simulates customer deployments.
For a general-purpose product, system testing also means testing it for
different business verticals and applicable domains such as insurance,
banking, asset management, and so on.
On the non-functional side, it brings into consideration different
testing types which are also called quality factors.

7.2.5.2. WHY IS SYSTEM TESTING DONE?


System testing is done to:
1. Provide independent perspective in testing as the team
becomes more quality centric.
2. Bring in customer perspective in testing.
3. Provide a “fresh pair of eyes” to discover defects not found
earlier by testing.
4. Test product behavior in a holistic, complete, and realistic
environment.
5. Test both functional and non functional aspects of the product.
6. Build confidence in the product.
7. Analyze and reduce the risk of releasing the product.
8. Ensure all requirements are met and ready the product for
acceptance testing.
Explanation: An independent test team normally does system
testing. This independent test team is different from the team that does
the component and integration testing. The system test team generally
reports to a manager other than the product manager to avoid conflicts
and to provide freedom to individuals during system testing. Testing
the product with an independent perspective and combining that with
the perspective of the customer makes system testing unique, different,
and effective.
The behavior of the complete product is verified during system
testing. Tests that refer to multiple modules, programs, and
functionality are included in system testing. This task is critical as it is
wrong to believe that individually tested components will work
together when they are put together.
System testing is the last chance for the test team to find any
leftover defects before the product is handed over to the customer.
System testing strives to always achieve a balance between the
objective of finding defects and the objective of building confidence in
the product prior to release.
The analysis of defects and their classification into various
categories (called impact analysis) also gives an idea about the kind
of defects that will be found by the customer after release. If the risk
of the customers getting exposed to the defects is high, then the
defects are fixed before the release or else the product is released as
such. This information helps in planning some activities such as
providing workarounds, documentation on alternative approaches, and
so on. Hence, system testing helps in reducing the risk of releasing a
product.
System testing is highly complementary to other phases of testing.
The component and integration test phases are conducted taking inputs
from functional specification and design. The main focus during these
testing phases are technology and product implementation. On the
other hand, customer scenarios and usage patterns serve as the basis
for system testing.
7.2.5.3. FUNCTIONAL VERSUS NON FUNCTIONAL SYSTEM TESTING (IN TABULAR
FORM)

We are now in a position to state an equation:


System testing = Functional testing + Non functional
testing
We first tabulate the differences between functional and non
functional testing.

Functional testing
1. It involves the product's functionality.
2. Failures, here, occur due to code.
3. It is done during the unit, component, integration, and system
   testing phases.
4. To do this type of testing, only the domain of the product is
   required.
5. Configuration remains the same for a test suite.

Non functional testing
1. It involves the product's quality factors.
2. Failures occur due to either architecture, design, or code.
3. It is done in our system testing phase.
4. To do this type of testing, we need domain, design, architecture,
   and product knowledge.
5. Test configuration is different for each test suite.

Thus, functional testing helps in verifying what the system is


supposed to do. It aids in testing the product’s features or
functionality. It has only two results—requirements met or not met. It
should have very clear expected results documented in terms of the
behavior of the product. It has simple methods and steps to execute the
test cases. Functional testing results normally depend on the product
and not on the environment. It uses a predetermined set of resources. It
requires in-depth customer, product, and domain knowledge to
develop different test cases and find critical defects. It is performed in
all phases of testing, i.e., unit, component, integration, and system
testing.
Non functional testing is performed to verify the quality factors
such as reliability, scalability, etc. These quality factors are also called
non functional requirements. It requires the expected results to be
documented in qualitative and quantifiable terms. It requires a large
amount of resources and the results are different for different
configurations and resources. It is a very complex method as a large
amount of data needs to be collected and analyzed. The focus point is
to qualify the product. It is not a defect-finding exercise. Test cases for
non functional testing include clear pass/fail criteria.
However, test results are concluded both on pass/fail definitions
and on the experiences encountered in running the tests. Non
functional test results are also determined by the amount of effort
involved in executing them and any problems faced during execution.
For example, if a performance test met the pass/fail criteria after 10
iterations, then the experience is bad and the test result cannot be taken
as pass. Either the product or the non functional testing process needs
to be fixed here.
Non functional testing requires understanding the product
behavior, design, architecture, and also knowing what the competition
provides. It also requires analytical and statistical skills as the large
amount of data generated requires careful analysis. Failures in non
functional testing affect the design and architecture much more than
the product code. Because non functional testing is not repetitive in
nature and requires a stable product, it is performed in the system
testing phase.
The differences listed in the table above are just guidelines, not
dogmatic rules.
Because both functional and non functional aspects are tested in
the system testing phase, the question arises—what is the ratio of
test cases or effort required for the mix of these two types of
testing? Because functional testing is a focus area starting from
the unit testing phase, while non functional aspects get tested only in
the system testing phase, it is a good idea for a majority of the
system testing effort to be focused on the non functional aspects. A
70%–30% ratio between non functional and functional testing can be
considered good, and a 50%–50% ratio is a good starting point.
However, this is only a guideline; the right ratio depends more on the
context, the type of release, the requirements, and the product.

7.2.5.4. FUNCTIONAL SYSTEM TESTING TECHNIQUES


As functional testing is performed at various testing phases, there are
two problems that arise. They are:
1. Duplication: It refers to the same tests being performed
multiple times.
2. Gray area: It refers to certain tests being missed out in all the
phases.
A small percentage of duplication across phases cannot be avoided,
as different teams are involved. Performing cross-reviews (i.e.,
involving teams from earlier phases of testing) and looking at the test
cases of the previous phase before writing system test cases can help
in minimizing the duplication. However, a small percentage of
duplication is advisable, as different test teams will then test the
features from different perspectives, thus yielding new defects.
Gray areas in testing happen due to a lack of product knowledge, a
lack of knowledge of customer usage, and a lack of coordination across
test teams. These areas (missed tests) arise when a test team
assumes that a particular test may be performed in the next phase. So,
the guideline is—“A test case moved from a later phase to an earlier
phase is a better option than delaying a test case from an earlier
phase to a later phase, as the purpose of testing is to find defects as
early as possible.” This has to be done after completing all tests meant
for the current phase, without diluting the tests of the current phase.
We are now in a position to discuss various functional system
testing techniques in detail. They are discussed one by one.

7.2.5.4.1. Design/Architecture verification


We can compare functional system testing with integration testing, as
given in the table below:

S. No.  Integration testing                        System testing

1.      The test cases are created by looking      The test cases are created first
        at interfaces.                             and verified with design and
                                                   architecture.
2.      The integration test cases focus on        The functional system test
        interactions between modules or            focuses on the behavior of the
        components.                                complete product.

In this method of system testing, the test cases are developed and
checked against the design and architecture to see whether they are
actual product-level test cases. This technique helps in validating the
product features that are written based on customer scenarios and
verifying them using product implementation.
If there is a test case that is a customer scenario but failed
validation using this technique, then it is moved to the component or
integration testing phase. Because functional testing is performed at
various test phases, it is important to reject the test cases and move
them to an earlier phase to catch defects early and avoid any major
surprise at later phases.
We now list certain guidelines that are used to reject test cases for
system functional testing. They are:
1. Is this test case focusing on code logic, data structures, and unit
of the product?
If yes, then it belongs to unit testing.
2. Is this specified in the functional specification of any
component?
If yes, then it belongs to component testing.
3. Is this specified in design and architecture specification for
integration testing?
If yes, then it belongs to integration testing.
4. Is it focusing on product implementation but not visible to
customers?
If yes, then it is focusing on implementation to be covered in
unit/component/integration testing.
5. Is it the right mix of customer usage and product
implementation?
If yes, then it belongs to system testing.
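These five questions can be read as a small decision procedure. A sketch of that procedure (the question keys are purely illustrative) is given below:

def classify_test_case(answers):
    """answers: a dict of True/False replies to the five questions above.
    Returns the phase to which the test case belongs."""
    if answers.get("focuses_on_code_logic_data_structures_or_unit"):
        return "unit testing"
    if answers.get("specified_in_component_functional_spec"):
        return "component testing"
    if answers.get("specified_in_design_architecture_for_integration"):
        return "integration testing"
    if answers.get("implementation_detail_not_visible_to_customer"):
        return "unit/component/integration testing"
    if answers.get("right_mix_of_customer_usage_and_implementation"):
        return "system testing"
    return "re-examine the test case"

print(classify_test_case({"right_mix_of_customer_usage_and_implementation": True}))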

7.2.5.4.2. Business vertical testing (BVT)


Using and testing the product for different business verticals such as
banking, insurance, asset management, etc. and verifying the business
operations and usage is called business vertical testing. In this type
of testing, the procedure in the product is altered to suit the process in
the business. For example, in personal loan processing, the loan is
approved first by the senior officer and then sent to a clerk. User
objects such as a clerk and officer are created by the product and
associated with the operations. This is one way of customizing the
product to suit the business. An operation that can be done only by
certain user objects is called a role-based operation. BVT involves three
aspects. They are:
1. Customization
2. Terminology
3. Syndication
Customization: It is important that the product understands the
business processes and includes customization as a feature so that
different business verticals can use the product. With the help of this
feature, a general workflow of a system is altered to suit specific
business verticals.
Terminology: To explain this concept, we consider a common
example of e-mail. When an e-mail is sent in a loan processing system,
it is called a loan application. An e-mail sent in the insurance
context may be called a claim and so on. The users would be familiar
with this terminology rather than the generic terminology of “e-mail.”
So, the user interface should reflect these terminologies rather than use
generic terminology e-mails, which may dilute the purpose and may
not be understood clearly by the users. An e-mail sent to a blood bank
cannot take the same priority as an internal e-mail sent to an employee
by another employee. These differentiations need to be made. Some e-
mails need to be tracked. For example, an e-mail to a blood bank
service needs a prompt reply. Some mail can be given automated mail
replies also. Hence, the terminology feature of the product should call
the e-mail appropriately as a claim or a transaction and also associate
the profile and properties in a way a particular business vertical works.
Syndication: Not all the work needed for business verticals is done by
product development organizations alone. Solution integrators and
service providers pay a license fee to a product organization and sell
the products and solutions using their own name and image. In this case,
the product name, company name, technology names, and copyrights
may belong to the latter parties or associations and the former would
like to change the names in the product. A product should provide
features for those syndications in the product and they are tested as a
part of BVT.

How BVT Can Be Done?


BVT can be done in two ways:
1. Simulation
2. Replication
Simulation: In simulation of a vertical test, the customer or the tester
assumes requirements and the business flow is tested.
Replication: In replication, customer data and processes are obtained,
the product is completely customized and tested, and the customized
product, as tested, is released to the customer.

NOTES

Business verticals are tested through scenarios. Scenario testing is


only a method to evolve scenarios and ideas and is not meant to be
exhaustive. It is done from an interfaces point of view. Having some
business vertical scenarios created by integration testing ensures
quick progress in system testing. In the system testing phase, the
business verticals are completely tested in real-life customer
environment using the aspects such as customization, terminology,
and syndication as described above.

7.2.5.4.3. Deployment testing


System testing is the final phase before the product is delivered to the
customer. The success or failure of a particular product release is
judged on the basis of how well the customer requirements are
met. The type of deployment (simulated) testing that happens in a
product development company to ensure that customer deployment
requirements are met is called offsite deployment. Deployment
testing is also conducted after the release of the product by utilizing
the resources and setup available at the customer's locations. This is a
combined effort by the product development organization and the
organization trying to use the product. This is called onsite
deployment. Although onsite deployment is not conducted in the
system testing phase, it is the system testing team that completes this
test. Onsite deployment testing is considered to be a part of acceptance
testing.
We will now discuss onsite deployment testing in detail. It is done
in two stages:
I. The first stage (stage-1)
II. The second stage (stage-2)
In the first stage (stage-1), the actual data from the live system is
taken and similar machines and configurations are mirrored and the
operations from the user are rerun on the mirrored deployment
machine (see Figure 7.11).
This gives an idea whether the enhanced or similar product can
perform the existing functionality without affecting the user. This also
reduces the risk of a product not being able to satisfy existing
functionality, as deploying the product without adequate testing can
cause major business loss to an organization. Some deployments use
“intelligent recorders” to record the transactions that happen on a live
system and commit these operations on a mirrored system and then
compare the results against the live system. The objective of the
recorder is to help in keeping the mirrored and live system identical
with respect to business transactions.
In the second stage (stage-2) after a successful first stage, the
mirrored system is made a live system that runs the new product (see
Figure 7.12).
Regular backups are taken and alternative methods are used to
record the incremental transactions from the time the mirrored system
went live. The recorder that was used in the first stage can also be
used here. In this stage, the live system that was used earlier and the
recorded transactions from the time the mirrored system went live are
preserved to enable going back to the old system if any major failures
are observed at this stage. If no failures are observed in this (second)
stage of deployment for an extended period (say, 1 month) then the
onsite deployment is considered successful and the old live system is
replaced by the new system.

FIGURE 7.11 Stage-1 of Onsite Deployment.

FIGURE 7.12 Stage-2 of the Onsite Deployment.


Please note that in stage-1, the recorder sits between the user and the
live system to record all transactions. All the recorded transactions
from the live system are then played back on the product under test
under the supervision of the test engineer (as shown by dotted lines).
In stage-2, the test engineer records all transactions using a recorder
and other methods and plays them back on the old live system (as shown
again by dotted lines). So, the overall stages are as summarized in
Figures 7.11 and 7.12.

7.2.5.4.4. Beta testing


Any project involves a significant amount of effort and time.
Customer dissatisfaction and time slippages are very common.
If a customer rejects a project then it means a huge loss to the
organization. Several reasons have been found for customer
dissatisfaction. Some of them are as follows:
1. Implicit requirements like ease of use. If these are not met by the
software, it may mean rejection by the customer.
2. Customer’s requirements keep changing constantly.
Requirements given at the beginning of the project may
become obsolete. A failure to reflect changes in the product
makes it obsolete.
3. Finding ambiguous requirements and not resolving them
with the customer results in rejection of the product.
4. Even if the understanding of the requirements is correct, their
implementation could be wrong. So, the design and code need to
be reworked. If this is not done in time, it results in product
rejection.
5. Poor documentation and difficulty in using the product may
also result in rejection.
To reduce risks, which is the objective of system testing, periodic
feedback is obtained on the product. This type of testing in which the
product is sent to the customers to test it and receive the feedback is
known as beta testing.
During a beta program (various activities that are planned and
executed according to a specific schedule), some of the activities
involved are as follows:
1. Collecting the list of customers and their beta testing
requirements along with their expectations of the product.
2. Working out a beta program schedule and informing the
customers.
3. Sending some documents for reading in advance and training
the customer on product usage.
4. Testing the product to ensure that it meets “beta testing entry
criteria” which is prepared by customers and management
groups of the vendor.
5. Sending the beta product to the customers and enabling them to
carry out their own testing.
6. Collecting the feedback periodically from the customers and
prioritizing the defects for fixing.
7. Responding to customer’s feedback with product fixes or
documentation changes and closing the communication loop
with the customers in a timely fashion.
8. Analyzing and concluding whether the beta program met the
exit criteria.
9. Communicating the progress and action items to customers and
formally closing the beta program.
10. Incorporating the appropriate changes in the product.
Deciding the timing of the beta test poses several conflicts. Sending
the product too early, with inadequate internal testing, will make the
customers unhappy and may create a bad impression of the quality of
the product. Sending the product too late may mean too little time
for beta defect fixes, which defeats the purpose of beta testing.
So, the late integration testing phase and the early system testing phase
are the ideal times for starting a beta program.
We send the defect fixes to the customers as soon as problems are
reported, and all necessary care has to be taken to ensure the fixes
meet the requirements of the customer.
How many beta customers should be chosen?
If the number chosen is too few, then the product may not get a
sufficient diversity of test scenarios and test cases.
If too many beta customers are chosen, then the engineering
organization may not be able to cope with fixing the reported defects
in time. Thus, the number of beta customers should be a delicate
balance between providing a diversity of product usage scenarios and
the manageability of being able to handle their reported defects
effectively.
Finally, the success of a beta program depends heavily on the
willingness of the beta customers to exercise the product in various
ways.

7.2.5.4.5. Certification, standards, and testing for compliance


A product needs to be certified with the popular hardware, operating
system (OS), database, and other infrastructure pieces. This is called
certification testing. A product that doesn’t work with any of the
popular hardware or software may be unsuitable for current and future
use. The sale of the product depends on whether it was certified with
the popular systems or not. Not only should the product co-exist and
run with the current versions of these popular systems, but the product
organization should also document a commitment to continue to work
with the future versions of the popular systems. This is one type of
testing where there is equal interest from the product development
organization, the customer, and certification agencies to certify the
product. The certification agencies produce automated test suites to
help the product development organization.
There are many standards for each technology area and the product
may need to conform to those standards. Standards can be like IPv6 in
networking and 3G in mobile technology. Tools associated with those
open standards can be used cost free to verify the standard’s
implementation. Testing the product to ensure that these standards are
properly implemented is called testing for standards. Once the product
is tested for a set of standards, they are published in the release
documentation for the customer’s information so that they know what
standards are implemented in the product.
There are many contractual and legal requirements for a product.
Failing to meet these may result in business loss and bring legal action
against the organization and its senior management.
The terms certification, standards, and compliance testing are used
interchangeably. There is nothing wrong in the usage of terms as long
as the objective of testing is met. For example, a certifying agency
helping an organization meet standards can be called both certification
testing and standards testing.

7.2.5.5. NON FUNCTIONAL TESTING TECHNIQUES


Non functional testing differs from the functional testing in terms of
complexity, knowledge requirement, effort needed, and the number of
times the test cases are repeated. Because repeating non functional test
cases involves more time, effort, and resources, the process for non
functional testing has to be stronger than functional testing to
minimize the need for repetition. This is achieved by having more
stringent entry/exit criteria and better planning.

7.2.5.5.1. Setting up the configuration


Due to the varied types of customers, resource availability, the time
involved in getting the exact setup, and so on, setting up a scenario that
is exactly real life is difficult. Due to the several complexities involved,
a simulated setup is used for non functional testing where the actual
configuration is difficult to get.
In order to create a “near real-life” environment, the details
regarding customer’s hardware setup, deployment information, and
test data are collected in advance. Test data is built based on the
sample data given. If it is a new product then information regarding
similar or related products is collected. These inputs help in setting up
the test environment close to the customer’s so that the various quality
characteristics of the system can be verified more accurately.
7.2.5.5.2. Coming up with entry/exit criteria
Meeting the entry criteria is the responsibility of the previous test
phase, i.e., integration testing phase or it could be the objective of dry-
run tests performed by the system testing team, before accepting the
product for system testing. The table below gives some examples of
how entry/exit criteria can be developed for a set of parameters and for
various types of non functional tests.

7.2.5.5.3. Managing key resources


There are four key resources:
CPU
Disk
Memory
Network
FIGURE 7.13

We need to completely understand their relationship to implement


the non functional testing technique. These four resources need to be
judiciously balanced to enhance the quality factors of the product. All
these resources are interdependent. For example, if the memory
requirements in the system are addressed, the need for the CPU may
become more intensive. The demand for the resources tends to grow
when a new release of the product is produced as software becomes
more and more complex. Software is meant not only for computers but
also for equipment such as cell phones; hence upgrading the resources
is not easy anymore.
Usually, the customers are perplexed when they are told to
increase the number of CPUs, memory, and the network bandwidth for
better performance, scalability, and other non functional aspects.
When we ask customers to upgrade the resources one important
aspect, i.e., ROI (return-on-investment), needs to be justified clearly.
Before conducting non functional testing, some assumptions are
validated by the development team and customers. Some of them are
given below:
1. When a higher priority job comes in, the CPU must be freed.
2. Until a new job requires memory, the available memory can be
completely used.
3. If we can justify the benefits for each new resource that is
added then the resources can be added easily to get better
performance.
4. The product can generate many network packets as long as the
network bandwidth and latency are available. It is assumed that
most of the packets generated are for the LAN and not for the
WAN. In the case of a WAN or routes involving multiple
hops, the packets generated by the product need to be reduced.
5. More disk space or the complete I/O bandwidth can be used for
the product as long as they are available. While disk costs are
getting cheaper, I/O bandwidth is not.
6. The customer gets the maximum return on investment (ROI)
only if the resources such as CPU, disk, memory, and network
are optimally used. So, there is intelligence needed in the
software to understand the server configuration and its usage.
Without these assumptions being validated, there cannot be any
good conclusion that can be made out of non functional testing.

7.2.5.5.4. Scalability testing


This type of testing is done when stakeholders like business owners
and operations departments want to know whether a system not only
meets its efficiency requirements now but will continue to do so in
future.
Please understand that both scalability problems and new
scalability requirements typically arise once a system has become
operative. So, it is done on systems that are in production. Both
technical test analyst and test manager ensure that scalability
requirements are captured.
The primary objective is the evaluation of resources like the usage
of memory space, disk capacity, and network bandwidth. For
interconnected systems, it means testing the network’s capacity to
handle high data volumes.
For example, consider a test to find out how many client nodes can
simultaneously log into the server. Failures during scalability testing
include the system not responding or the system crashing. A product not
able to respond to 100 concurrent users while it is supposed to serve
200 users simultaneously is a failure. For a given configuration, the
following template may be used:
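A minimal sketch of such a template (the columns shown here are an assumption based on the resources and measures discussed in this chapter, not the author's exact layout):

No. of users   CPU (%)   Memory (%)   Disk (%)   Network (%)   Response time   Pass/fail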
This data is collected for several configurations in these templates
and analyzed.
In the above template, if disk utilization approaches 100% then
another server is set up or a new disk is added. If successful, then
repeat tests for a higher number of users but the aim is to find the
maximum limit for that configuration. Increasing resources is not a
silver bullet to achieve better scalability.
During scalability testing, the resource demand grows
exponentially when the scalability parameter is increased. The
scalability reaches a saturation point beyond which it cannot be
improved. This is called the maximum capability of a scalability
parameter.

FIGURE 7.14

Guidelines for Scalability Testing:


1. Scalability should increase by 50% when the number of CPUs
is doubled. This test is applicable for a CPU-intensive product.
2. Scalability should increase by 40% when memory is doubled.
This test is applicable for a memory-intensive product.
3. Scalability should increase by 30% when the number of NICs
(network interface cards) is doubled. This test is useful for
network-intensive products.
4. Scalability should increase by 50% when the I/O bandwidth is
doubled. This test is useful for I/O intensive products.
We get a very important deliverable from this scalability testing,
i.e., a sizing guide. It is a document containing tuning parameters such
as OS parameters and other product parameters like the number of
open files, the number of product threads, etc.
However, scalability testing is an expensive process.
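As a rough illustration of how the guidelines above could be checked automatically (a sketch assuming throughput is the scalability measure; the thresholds are the ones listed in the guidelines):

def scalability_ok(resource, throughput_before, throughput_after):
    """throughput_before/after: measured throughput before and after the
    given resource is doubled."""
    expected_gain = {"cpu": 0.50, "memory": 0.40, "nic": 0.30, "io_bandwidth": 0.50}
    gain = (throughput_after - throughput_before) / throughput_before
    return gain >= expected_gain[resource]

# Example: doubling the CPUs raised throughput from 400 to 640 requests/second.
print(scalability_ok("cpu", 400.0, 640.0))   # True: a 60% improvement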

7.2.5.5.5. Reliability testing


Reliability testing is done to find out if the software will work in the
expected environment for an acceptable amount of time without
degradation.
Executing reliability tests involves repeating the test cases for the
defined operational profiles. The tests are executed when specific
events occur like:
a. Major software releases
b. Completion of a time-box (time range)
c. At regular time intervals (say, weekly)
The test cases may already be fully specified or the required test
data may be generated dynamically prior to execution. Please
understand that the conditions for the test like test environment and
test data should remain constant during the test. This is done to enable
subsequent comparisons between execution cycles to be made. The
achieved levels of reliability are reported after each test cycle. Tests
relating to reliability growth can benefit from the following tools:
a. Tools for test data generation.
b. Test execution tools.
c. Code coverage tools for fault tolerance testing.

NOTE

These tools help identify the areas of code not yet exercised after
performing functional tests.

Reliability Testing for our Websites Involves:


i. Establish operational profiles for casual browsers and select
50 functional test cases each from an existing test case
database.
ii. Choose 30 test cases at random from this database.

Reliability Testing for Robustness Involves:


i. Perform software technical reviews (STRs) of code.
ii. Design negative tests with the help of test analysts.
iii. After delivery of each software release, do exploratory testing,
i.e., to find defects in handling incorrect or unexpected inputs
from the users.
iv. Memory leak and log-in and log-out operations are tested
here.

Reliability Testing for Backup and Recovery Involves:


i. Verifying that a full backup can be restored within one hour of
failure.
ii. Verifying that a partial backup can be restored within 30 minutes.
iii. Executing test cases and verifying that all data inconsistencies
can be fully determined.
It is done to evaluate the product’s ability to perform its required
functions under stated conditions for a specified period of time or for a
large number of iterations. For example, querying a database
continuously for 48 hours and performing logic operations 10,000
times.
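A minimal sketch of such a reliability run, assuming the operational profile is simply a set of automated test cases with usage weights (the website operations and weights below are illustrative):

import random

def reliability_run(test_cases, profile_weights, iterations=1000, seed=1):
    """Repeat test cases according to an operational profile and report the
    observed failure rate for this execution cycle."""
    rng = random.Random(seed)   # fixed seed: keeps the test conditions constant
    names = list(test_cases)
    weights = [profile_weights[name] for name in names]
    failures = 0
    for _ in range(iterations):
        name = rng.choices(names, weights=weights)[0]
        if not test_cases[name]():       # each test case returns True on pass
            failures += 1
    return failures / iterations

# Illustrative operations for a website (stand-ins for real automated tests).
cases = {"browse_catalog": lambda: True,
         "search": lambda: True,
         "checkout": lambda: True}
profile = {"browse_catalog": 0.6, "search": 0.3, "checkout": 0.1}
print("observed failure rate:", reliability_run(cases, profile))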

NOTE

The reliability of a product should not be confused with reliability


testing.

7.2.5.5.6. Stress testing


Stress testing is done to evaluate a system beyond the limits of its
specified requirements or resources to ensure that the system does not
break. It is done to find out how the product's behavior degrades under
extreme conditions and when it is denied the necessary resources.
The product is over-loaded deliberately to simulate the resource
crunch and to find out its behavior. It is expected to gracefully degrade
on increasing the load but the system is not expected to crash at any
point of time during stress testing.
It helps in understanding how the system can behave under
extreme and realistic situations like insufficient memory, inadequate
hardware, etc. System resources upon being exhausted may cause such
situations. This helps to know the conditions under which these tests
fail so that the maximum limits, in terms of simultaneous users, search
criteria, large number of transactions, and so on can be known.

TIPS

Stress testing is a combination of several types of tests like capacity or


volume testing, reliability and stability testing, error handling, and so
on.

Spike testing is a special sort of stress testing where an extreme


load is suddenly placed on the system. If spikes are repeated with
periods of low usage then it forms a bounce test. This is shown in
Figure 7.15.

NOTE
Both spike and bounce tests determine how well the system behaves
when sudden changes of loads occur.

Two spikes together form a bounce test scenario. Then, the load
increases into the stress area to find the system limits. These load
spikes occur suddenly on recovery from a system failure.
There are differences between reliability and stress testing.
Reliability testing is performed by keeping a constant load condition
until the test case is completed. The load is increased only in the next
iteration to the test case.
In stress testing, the load is generally increased through various
means such as increasing the number of clients, users, and transactions
until the resources are completely utilized, and beyond. When the load
keeps on increasing, the product reaches a stress point when some of
the transactions start failing due to resources not being available. The
failure rate may go up beyond this point. To continue the stress
testing, the load is slightly reduced below this stress point to see
whether the product recovers and whether the failure rate decreases
appropriately. This exercise of increasing/decreasing the load is
performed two or three times to check for consistency in behavior and
expectations (see Figure 7.15).

FIGURE 7.15

Sometimes, the product may not recover immediately when the


load is decreased. There are several reasons for this. Some of the
reasons are
1. Some transactions may be in the wait queue, delaying the
recovery.
2. Some rejected transactions may need to be purged, delaying
the recovery.
3. Due to failures, some clean-up operations may be needed by
the product, delaying the recovery.
4. Certain data structures may have gotten corrupted and may
permanently prevent recovery from the stress point.
Stress testing with a variable load is shown in Figure 7.15.
Another factor that differentiates stress testing from reliability
testing is mixed operations/tests. Numerous tests of various types run
on the system in stress testing. However, the tests that are run on the
system to create stress points need to be closer to real-life scenarios.

How to Select Test Cases for Stress Testing?


The following are the guidelines:
1. Execute repeated tests to ensure that at all times the code
works as expected.
2. The operations that are used by multiple users are selected and
performed concurrently for stress testing.
3. The operations that generate the amount of load needed are
planned and executed for stress testing.
4. Tests that stress the system with random inputs (like number of
users, size of data, etc.) at random instances and random
magnitude are selected and executed as part of stress testing.
Defects that emerge from stress testing are usually not found from
any other testing. Defects like memory leaks are easy to detect but
difficult to analyze due to varying load and the different types/mix of
tests executed. Hence, stress tests are normally performed after
reliability testing. To detect stress-related errors, tests need to be
repeated many times so that resource usage is maximized and
significant errors can be noticed. This testing helps in finding out
concurrency and synchronization issues like deadlocks, thread leaks,
and other synchronization problems.
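A minimal sketch of a load generator that steps the load up to and past the stress point and then back down to observe recovery, in the spirit of the description above (the operation exercised and the client counts are illustrative):

import concurrent.futures
import time

def one_transaction(i):
    """Stand-in for a real product operation (for example, a request or a query)."""
    time.sleep(0.001)
    return True

def run_load(clients, transactions_per_client=20):
    """Drive the operation concurrently and return the observed failure rate."""
    total = clients * transactions_per_client
    failures = 0
    with concurrent.futures.ThreadPoolExecutor(max_workers=clients) as pool:
        futures = [pool.submit(one_transaction, i) for i in range(total)]
        for future in futures:
            if not future.result():
                failures += 1
    return failures / total

# Increase the load towards the stress area, then reduce it to check recovery.
for clients in (10, 50, 100, 200, 100, 50):
    print(clients, "clients -> failure rate", run_load(clients))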

7.2.5.5.7. Interoperability testing


This type of testing is done to verify if the SUT (system under test)
will function correctly in
all the intended target environments. It includes:
a. Hardware
b. Software
c. Middleware
d. OS
e. Network configurations
A piece of software is said to have good interoperability if it can be
integrated easily with other systems without requiring major changes.

TIPS

The number and types of changes required to work in different
environments determine the degree of interoperability of any
software.

The degree of interoperability is determined by the use of industry standards like XML. Please understand that the higher the degree of manual effort required to run on a different supported configuration, the lower is the interoperability of the software. Also, note that the so-called plug-and-play devices are good examples of highly interoperable software. This type of testing is done at the integration level of testing. Effective interoperability testing requires effective planning of the test lab, equipment, and configurations. Once the configurations to be tested have been identified, some companies use a technique called shotgunning, a method of distributing their test cases across the different configurations.
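A minimal sketch of the shotgunning idea is given below; the configuration names and test case IDs are invented for illustration, and the only point is the random spreading of cases across configurations.

import random

configurations = ["WinXP+IE6", "Win2000+Firefox", "Linux+Firefox"]   # hypothetical lab configurations
test_cases = [f"TC-{n:03d}" for n in range(1, 13)]                   # hypothetical test cases

def shotgun(test_cases, configurations, seed=None):
    """Randomly spread test cases across the available configurations."""
    rng = random.Random(seed)
    assignment = {cfg: [] for cfg in configurations}
    for tc in test_cases:
        assignment[rng.choice(configurations)].append(tc)
    return assignment

for cfg, cases in shotgun(test_cases, configurations, seed=42).items():
    print(cfg, "->", cases)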
TIPS

Select those test cases that provide end-to-end functionality and run
them.

Can We Do Test Automation Here?


It is not easy in a multi-configuration environment but it can be done.
We need a good test management system that can easily track the
environment configuration used for test case execution. Let us solve
an example now.
Example: The following testings are to be performed. Name the
testing techniques that would be most appropriate for them. They are
as follows:
a. Testing interfaces between product modules.
b. Testing information exchange between front-end (say, VB 6.0)
and back-end (say, SQL server).
c. Testing product with other infrastructural pieces like OS,
database, network.
d. Testing whether the API interfaces work properly.
Answers: These testings belong to:
a. Integration testing
b. Interoperability testing
c. Compatibility testing
d. Integration testing
Some guidelines to improve interoperability testing are as follows:
1. Ensure consistent information flow across systems.
2. Give appropriate messages that the user must be aware of.
3. Treat interoperability as a collective responsibility.
4. Restrict it to qualifying the information exchange rather than finding defects and fixing them one by one.

7.2.5.6. ACCEPTANCE TESTING


It is a phase after system testing that is performed by the customers. The customer defines a set of test cases that will be executed to qualify and accept the product. These test cases, executed by the customers, are normally small in number. They are not written with the intention of finding defects. Detailed testing is already over in the component, integration, and system testing phases prior to product delivery to the customer. Acceptance test cases are developed by both the customers and the product organization. Acceptance test cases are black-box type test cases. They are written to execute near real-life scenarios. They are used to verify the functional as well as the non-functional aspects of the system. If a product fails the acceptance test, it may be rejected, which may mean financial loss or rework of the product involving additional effort and time.
A user acceptance test is:
• A chance to completely test the software.
• A chance to completely test business processes.
• A condensed version of a system test.
• A comparison of actual test results against expected results.
• A discussion forum to evaluate the process.
The main objectives are as follows:
• Validate system set-up for transactions and user access.
• Confirm the use of the system in performing business processes.
• Verify performance on business critical functions.
• Confirm integrity of converted and additional data.
The project team will be responsible for coordinating the
preparation of all test cases and the acceptance test group will be
responsible for the execution of all test cases.
Next, we discuss the acceptance test checklist. It is shown in the
table below.
Tester      Component test                                    Test complete (pass/fail)
Tester #1   List browser/platform to use
            End-to-end transaction
            Verify billing
            Verify log files on server
            Etc.
Tester #2   List browser/platform to use
            End-to-end transaction
            Help file
            Remove book from shopping cart
            Verify that high-priority problems found during
            prior testing have been properly fixed
Tester #n   List browser/platform to be used
            Etc.

7.2.5.6.1. Selecting test cases


The test cases for acceptance testing are selected from the existing set
of test cases from different phases of testing.

Guidelines to Select Test Cases for Acceptance Testing


1. Test cases that include the end-to-end functionality of the
product are taken up for acceptance testing. This ensures that
all the business transactions are completed successfully. Real-
life test scenarios are tested when the product is tested end-to-
end.
2. Because acceptance tests focus on business scenarios, the
product domain tests are included. Test cases that reflect
business domain knowledge are included.
3. Acceptance tests reflect the real-life user scenario verification.
As a result, test cases that portray them are included.
4. Tests that verify the basic existing behavior of the product are
included.
5. When the product undergoes modifications or changes, the
acceptance test cases focus on verifying the new features.
6. Some non-functional tests are included and executed as part of acceptance testing.
7. Tests that are written to check if the product complies with
certain legal obligations are included in the acceptance test
criteria.
8. Test cases that make use of customer real-life data are included
for acceptance testing.

7.2.5.6.2. Executing test cases


Acceptance testing is done by the customer or by the representative of
the customer to check whether the product is ready for use in the real-
life environment.
An acceptance test team usually comprises members who are
involved in the day-to-day activities of the product usage or are
familiar with such scenarios. The product management, support, and
consulting teams contribute to acceptance testing definition and
execution. A test team may be formed with 90% of them possessing
the required business process knowledge of the product and 10%
being representatives of the technical testing team.
Acceptance test team members are usually not trained in testing. Hence, before acceptance testing, appropriate training on the product and the process needs to be provided to the team.
may get the help of team members who developed/tested the software
to obtain the required product knowledge. Test team members help the
acceptance members to get the required test data, select and identify
test cases, and analyze the acceptance test results. Test teams help the
acceptance test team report defects. If major defects are identified
during acceptance testing, then there is a risk of missing the release
date.
The resolutions of those defects, as well as any unresolved defects, are discussed with the acceptance test team, and their approval is obtained before concluding acceptance testing.

7.2.5.6.3. Types of acceptance testing


There are two types of acceptance tests. They are as follows:

FIGURE 7.16

We shall discuss each of these tests one by one.

I. Development Acceptance Test


(a) Release acceptance test (RAT): It is also known as a build
acceptance or smoke test. It is run on each development release to
check that each build is stable enough for further testing. Typically,
this test suite consists of entrance and exit test cases plus test cases
that check mainstream functions of the program with mainstream data.
Copies of RAT can be distributed to developers so that they can run
the tests before submitting builds to the testing group. If a build does
not pass a RAT test, it is reasonable to do the following:
• Suspend testing on the new build and resume testing on the prior
build until another build is received.
• Report the failing criteria to the development team.
• Request a new build.
(b) Functional acceptance simple test (FAST): It is run on each
development release to check that key features of the program are
appropriately accessible and functioning properly on at least one test
configuration (preferably minimum or common configuration). This
test suite consists of simple test cases that check the lowest level of
functionality for each command—to ensure that task-oriented
functional tests (TOFTs) can be performed on the program. The
objective is to decompose the functionality of a program down to the
command level and then apply test cases to check that each command works as intended. No attention is paid to the combination of these
basic commands, the context of the feature that is formed by these
combined commands, or the end result of the overall feature. For
example, FAST for a File/SaveAs menu command checks that the
SaveAs dialog box displays. However, it does not validate that the
overall file-saving feature works nor does it validate the integrity of
saved files.
Typically, errors encountered during the execution of FAST are
reported through the standard issue-tracking process. Suspending
testing during FAST is not recommended. Note that it depends on the
organization for which you work. Each might have different rules in
terms of which test cases should belong to RAT versus FAST and
when to suspend testing or to reject a build.

II. Deployment Acceptance Test


The configurations on which the web system will be deployed will often be quite different from the development and test configurations. Testing efforts must take this into account when preparing and writing test cases for installation-time acceptance tests. This type of test usually includes
the full installation of the applications to the targeted environment or
configurations. Then, FASTs and TOFTs are executed to validate the
system functionality.

7.2.5.6.4. Acceptance testing for critical applications


In case of critical applications where a current system is going to be
replaced by a new system, it may be prudent to do acceptance testing
in three phases:
1. Acceptance testing: The usual testing for acceptance.
2. Parallel testing: Both the old and the new system are run
concurrently, i.e., the old one as the main, live system and new
one as the parallel offline system and the results of the two are
compared each day. Once these operations are found to be
satisfactory for a predetermined period of time, switch over is
done to the third phase, i.e., reverse parallel.
3. Reverse parallel: We switch the roles in parallel testing. Make
the new system the main, live system and the old system the
offline system. Feed data in both. Compare results each day.
When the results are found to be satisfactory for a
predetermined period of time, discontinue the old system.

7.2.5.7. PERFORMANCE TESTING


The primary goal of performance testing is to develop effective
enhancement strategies for maintaining acceptable system
performance. It is an information gathering and analyzing process in
which measurement data are collected to predict when load levels will
exhaust system resources.
Performance tests use actual or simulated workload to exhaust
system resources and other related problematic areas, including:
a. Memory (physical, virtual, storage, heap, and stack space)
b. CPU time
c. TCP/IP addresses
d. Network bandwidth
e. File handles
These tests can also identify system errors, such as:
a. Software failures caused by hardware interrupts
b. Memory runtime errors like leakage, overwrite, and pointer
errors
c. Database deadlocks
d. Multithreading problems
7.2.5.7.1. Introduction
In this internet era, when more and more of business is transacted
online, there is a big and understandable expectation that all
applications will run as fast as possible. When applications run fast, a
system can fulfill the business requirements quickly and put it in a
position to expand its business. A system or product that is not able to
service business transactions due to its slow performance is a big loss
for the product organization, its customers, and its customer’s
customer. For example, it is estimated that 40% of online marketing
for consumer goods in the US happens in November and December.
Slowness or lack of response during this period may result in losses of
several million dollars to organizations.
In another example, when examination results are published on the
Internet, several hundreds of thousands of people access the
educational websites within a very short period. If a given website
takes a long time to complete the request or takes more time to display
the pages, it may mean a lost business opportunity, as the people may
go to other websites to find the results. Hence, performance is a basic
requirement for any product and is fast becoming a subject of great
interest in the testing community.
Performance testing involves an extensive planning effort for the
definition and simulation of workload. It also involves the analysis of
collected data throughout the execution phase. Performance testing
considers such key concerns as:
• Will the system be able to handle increases in web traffic without compromising system response time, security, reliability, and accuracy?
• At what point will the performance degrade, and which components will be responsible for the degradation?
• What impact will performance degradation have on company sales and technical support costs?
Each of these preceding concerns requires that measurements be
applied to a model of the system under test. System attributes, such as
response time, can be evaluated as various workload scenarios are
applied to the model. Conclusions can be drawn based on the collected
data. For example, when the number of concurrent users reaches X,
the response time equals Y. Therefore, the system cannot support
more than X number of concurrent users. However, the complication
is that even when the X number of concurrent users does not change,
the Y value may vary due to differing user activities. For example,
1000 concurrent users requesting a 2K HTML page will result in a
limited range of response times whereas response times may vary
dramatically if the same 1000 concurrent users simultaneously submit
purchase transactions that require significant server-side processing.
Designing a valid workload model that accurately reflects such real-
world usage is no simple task.

7.2.5.7.2. Factors governing performance testing


The testing performed to evaluate the response time, throughput, and
utilization of the system, to execute its required functions in
comparison with different versions of the same product(s) or different
competitive product(s) is called performance testing. There are many
factors that govern performance testing:
1. Throughput
2. Response time
3. Latency
4. Tuning
5. Benchmarking
6. Capacity planning
We shall discuss these factors one by one.
1. Throughput. The capability of the system or the product in handling
multiple transactions is determined by a factor called throughput.
It represents the number of requests/ business transactions
processed by the product in a specified time duration. It is very
important to understand that the throughput, i.e., the number of
transactions serviced by the product per unit time varies according to
the load the product is put under. This is shown in Figure 7.17.
FIGURE 7.17 Throughput of a System at Various Loads.

From this graph, it is clear that the load to the product can be
increased by increasing the number of users or by increasing the
number of concurrent operations of the product. Please note that
initially the throughput keeps increasing as the user load increases.
This is the ideal situation for any product and indicates that the
product is capable of delivering more when there are more users trying
to use the product. Beyond certain user load conditions (after the
bend), the throughput comes down. This is the period when the users
of the system notice a lack of satisfactory response and the system
starts taking more time to complete business transactions. The saturation point represents the “optimum throughput,” i.e., the maximum throughput for the product.
2. Response time. It is defined as the delay between the point of request
and the first response from the product. In a typical client-server
environment, throughput represents the number of transactions that
can be handled by the server and response time represents the delay
between the request and response.
Also, note that it was mentioned earlier that customers might go to
a different website or application if a particular request takes more
time on this website or application. Hence, measuring response time
becomes an important activity of performance testing.
3. Latency. It is defined as the delay caused by the application, the operating system, and the environment, each calculated separately. In reality, not all the delay that happens between the
request and the response is caused by the product. In the networking
scenario, the network or other products which are sharing the network
resources can cause the delays. Hence, it is important to know what
delay the product causes and what delay the environment causes.

FIGURE 7.18 Latencies in 3-Tier Architecture.

To understand the latency, let us consider an example of a web application providing a service by talking to a web server and a database server connected in the network.
In this case, the latency can be calculated for the product that is
running on the client and for the network that represents the
infrastructure available for the product. Thus, from the figure above,
we can compute both the latency and the response time as follows:
Network latency = (N1 + N2 + N3 + N4)
Product latency = (A1 + A2 + A3)
Actual response time = (Network latency + Product latency)
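A tiny computation following these formulas is sketched below; the delay values are made-up figures, and in practice N1 to N4 and A1 to A3 would come from instrumented timestamps.

# Hypothetical measured delays, in milliseconds.
network_delays = {"N1": 12.0, "N2": 9.5, "N3": 11.0, "N4": 8.5}   # client <-> web <-> database hops
product_delays = {"A1": 30.0, "A2": 45.0, "A3": 25.0}             # client, web server, database server

network_latency = sum(network_delays.values())
product_latency = sum(product_delays.values())
response_time = network_latency + product_latency

print(f"Network latency : {network_latency:.1f} ms")
print(f"Product latency : {product_latency:.1f} ms")
print(f"Response time   : {response_time:.1f} ms")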
The discussion of latency is very important for performance, as any improvement made in the product can reduce the response time only through improvements in A1, A2, and A3. If the network latency is large relative to the product latency, and that is what affects the response time, then there is little point in improving only the product performance. In such a case, it will be worthwhile to improve
the network infrastructure. In those cases where network latency is too
large or cannot be improved, the product can use intelligent
approaches of caching and sending multiple requests in one packet and
receiving responses as a bunch.
4. Tuning. Tuning is the procedure by which the product performance is enhanced by setting different values for the parameters (variables) of the product, the operating system, and other components. Tuning
improves the product performance without having to touch the source
code of the product. Each product may have certain parameters or
variables that can be set at run time to gain optimum performance.
The default values that are assumed by such product parameters may
not always give optimum performance for a particular deployment.
This necessitates the need for changing the values of parameters or
variables to suit the deployment or a particular configuration. During
performance testing, tuning of the parameters is an important activity
that needs to be done before collecting numbers.
5. Benchmarking. It is defined as the process of comparing the
throughput and response time of the product to those of the
competitive products. No two products are the same in features, cost,
and functionality. Hence, it is not easy to decide which parameters
must be compared across two products. A careful analysis is needed
to chalk out the list of transactions to be compared across products.
This produces a meaningful analysis for improving the performance of the product with respect to the competition.
6. Capacity planning. The most important factor that affects
performance testing is the availability of resources. A right kind of
hardware and software configuration is needed to derive the best
results from performance testing and for deployments. This exercise
of finding out what resources and configurations are needed is called
capacity planning. The purpose of capacity planning is to help
customers plan for the set of hardware and software resources prior to
installation or upgrade of the product. This exercise also sets the
expectations on what performance the customer will get with the
available hardware and software resources.
Summary: Performance testing is done to ensure that a product:
1. processes the required number of transactions in any given
interval (throughput).
2. is available and running under different load conditions
(availability).
3. responds fast enough for different load conditions (response
time).
4. delivers worthwhile ROI for hardware and software resources, and also determines what kind of resources are needed for the product for different load conditions (i.e., capacity planning).
5. is comparable to and better than that of the competitors for
different parameters (competitive analysis and benchmarking).
We next consider a simple example to understand how increased
traffic load and consequently, increased response time can result in
lost company revenue.
EXAMPLE 7.1. Suppose your e-commerce site currently handles
300,000 transactions per day. How many transactions are being done
per second? How can performance testing be done?

After conducting a marketing survey, the findings show that:


a. The transaction response time is of an acceptable level as long
as it does not exceed 4 seconds.
b. If the transaction response time is greater than 4 but less than 8 seconds, 30% of users cancel their transactions.
c. If the transaction response time is greater than 8 but less than
10 seconds, 60% of users cancel their transactions.
d. If the transaction response time increases to over 10 seconds,
over 90% of users cancel their transactions.

FIGURE 7.19

Suppose in the next 6 months, the number of transactions is expected to rise between 25% and 75% from the current level and the
potential revenue for each transaction is $1. Management would like
to learn how the performance would impact company revenue as the
number of transactions per day increases.
Refer to Figure 7.19 for detailed analysis of traffic, percentage,
and dollar amounts.
A performance test is conducted and the findings show that the
system cannot handle such increases in traffic without increasing
response time, user transaction cancellations, and/or failure results. If
the number of transactions increases as expected, the company will
face a potential revenue loss of between $112,500 and $472,500 per
day.
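The arithmetic behind this example can be reproduced with the short script below. It assumes, purely for illustration, that the performance test maps a 25% traffic increase to the 30% cancellation band and a 75% increase to the 90% band; with $1 of potential revenue per transaction this reproduces the $112,500 and $472,500 figures quoted above.

CURRENT_TX_PER_DAY = 300_000
SECONDS_PER_DAY = 24 * 60 * 60

# Current throughput: roughly 3.5 transactions per second.
print(f"Current rate: {CURRENT_TX_PER_DAY / SECONDS_PER_DAY:.2f} transactions/second")

def daily_loss(traffic_increase, cancellation_rate, revenue_per_tx=1.0):
    """Revenue lost per day if the given fraction of users cancel their transactions."""
    transactions = CURRENT_TX_PER_DAY * (1 + traffic_increase)
    return transactions * cancellation_rate * revenue_per_tx

# Assumed mapping from the performance test findings (illustrative only).
print(f"25% more traffic, 30% cancel: ${daily_loss(0.25, 0.30):,.0f} lost/day")
print(f"75% more traffic, 90% cancel: ${daily_loss(0.75, 0.90):,.0f} lost/day")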
It takes time, effort, and commitment to plan for and execute
performance testing. Performance testing involves individuals from
many different departments. A well planned testing program requires
the coordinated efforts of members of the product team, upper
management, marketing, development, information technology (IT),
and testing. Management’s main objective in performance testing
should be to avoid financial losses due to lost sales, technical support
issues, and customer dissatisfaction.

7.2.5.7.3. Steps of performance testing


A methodology for performance testing involves the following steps:
Step 1. Collecting requirements.
Step 2. Writing test cases.
Step 3. Automating performance test cases.
Step 4. Executing performance test cases.
Step 5. Analyzing performance test results.
Step 6. Performance tuning.
Step 7. Performance benchmarking.
Step 8. Capacity planning.
We will explain each of these steps one by one.

(I) Collecting Requirements


Performance testing generally needs elaborate documentation and
environment setup and the expected results may not be well known in
advance.

Challenges of Performance Testing


a. A performance testing requirement should be testable. Not all features/functionality can be performance tested.
For example, a feature involving a manual intervention cannot
be performance tested as the results depend on how fast a user
responds with inputs to the product.
b. A performance testing requirement needs to clearly state what
factors need to be measured and improved.
c. A performance testing requirement needs to be associated with
the actual number or percentage of improvement that is desired.
There are two types of requirements that performance testing
focuses on:
1. Generic requirements.
2. Specific requirements.
1. Generic requirements are those that are common across all
products in the product domain area. All products in that area are
expected to meet those performance expectations.
Examples are time taken to load a page, initial response when a
mouse is clicked, and time taken to navigate between screens.
2. Specific requirements are those that depend on the implementation of a particular product and differ from one product to another in a given domain.
An example is the time taken to withdraw cash from an ATM.
During performance testing both generic and specific
requirements need to be tested.
See the accompanying table for examples of performance test requirements.

A performance test conducted for a product needs to validate such graceful degradation as one of the requirements.

(II) Writing Test Cases


A test case for performance testing should have the following details
defined:
1. List of operations or business transactions to be tested.
2. Steps for executing those operations/transactions.
3. List of product, OS parameters that impact the performance
testing and their values.
4. Loading pattern.
5. Resources and their configurations (network, hardware,
software configurations).
6. The expected results, i.e., expected response time, throughput,
latency.
7. The product versions/competitive products to be compared with, along with related information such as their corresponding fields (items 2-6 in the above list).
Performance test cases are repetitive in nature. These test cases are
normally executed repeatedly for different values of parameters,
different load conditions, different configurations, and so on. Hence,
the details of what tests are to be repeated for what values should be
part of the test case documentation.
While testing the product for different load patterns, it is important
to increase the load or scalability gradually to avoid any unnecessary
effort in case of failures.
For example, if an ATM withdrawal fails for ten concurrent
operations, there is no point in trying it for 10,000 operations. The
effort involved in testing for 10 concurrent operations may be several times less than that of testing for 10,000 operations. Hence, a methodical approach is to gradually increase the concurrent operations, say to 10, 100, 1,000, 10,000, and so on, rather than attempting 10,000 concurrent operations in the first iteration itself. The test case documentation should clearly reflect this approach.
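A driver for this gradual ramp-up might look like the sketch below; the withdrawal function and the thread-pool harness are hypothetical stand-ins for a real test harness, and the only point is to stop escalating once a load level fails.

from concurrent.futures import ThreadPoolExecutor

def withdraw_cash(user_id):
    # Placeholder for one ATM withdrawal transaction against the SUT.
    return True

def run_concurrent_withdrawals(users):
    """Run `users` withdrawals concurrently; return True only if all succeed."""
    with ThreadPoolExecutor(max_workers=min(users, 500)) as pool:
        return all(pool.map(withdraw_cash, range(users)))

def ramp_up(levels=(10, 100, 1_000, 10_000)):
    for users in levels:
        if run_concurrent_withdrawals(users):
            print(f"{users} concurrent operations: PASS")
        else:
            print(f"{users} concurrent operations: FAIL - stop escalating")
            break

if __name__ == "__main__":
    ramp_up()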
Performance testing is a tedious process involving time and effort.
All test cases of performance testing are assigned different priorities.
Higher priority test cases are to be executed first. Priority may be
absolute (given by customers) or may be relative (given by test team).
While executing the test cases, the absolute and relative priorities are
looked at and the test cases are sequenced accordingly.

(III) Automating Performance Test Cases


Automation is required for performance testing due to the following
characteristics:
1. Performance testing is repetitive in nature.
2. Performance test cases cannot be effective without automation.
3. The results of performance testing need to be accurate.
Accuracy will be less if response-time, throughput, etc. are
calculated manually.
4. Performance testing involves several factors and their
permutations and combinations. It is not easy to remember all
of these and use them manually during testing.
5. Performance testing collects various details like resource
utilization, log files, trace files, and so on at regular intervals. It
is not possible to do the bookkeeping of all this related information manually.
There should not be any hard-coded data in automated scripts for performance testing, as it affects the repeatability of the test cases.
The set up required for the test cases, setting different values to
parameters, creating different load conditions, setting up and
executing the steps for operations/transactions of competitive
products, and so on have to be included as the part of the automation
script.
While automating performance test cases, it is important to use
standard tools and practices. Some of the performance test cases
involve comparisons with the competitive product, so the results need
to be consistent, repeatable, and accurate.

(IV) Executing Performance Test Cases


Performance testing involves less effort for execution but more effort
for planning, data collection, and analysis. As explained earlier, 100%
end-to-end automation is desirable for performance testing.
During the execution of performance testing, the following data needs to be collected (a minimal collection sketch is shown after this list):
1. Start and end time of test case execution.
2. Log and trace (or audit) files of the product and OS.
3. Utilization of resources like CPU, memory, disk, and network
on a periodic basis.
4. Configuration of hardware and software environmental factors.
5. The response time, throughput, latency, etc., as specified in the
test case documentation at regular intervals.
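The minimal collection sketch mentioned above is shown here. It records the start and end times of a run and samples CPU and memory utilization at a fixed interval; it assumes the third-party psutil package is installed, and the file name and interval are arbitrary choices.

import csv
import time
import psutil   # third-party package: pip install psutil

def collect_utilization(duration_s=60, interval_s=5, outfile="utilization.csv"):
    """Sample CPU and memory usage periodically during a performance test run."""
    start = time.time()
    with open(outfile, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["elapsed_s", "cpu_percent", "mem_percent"])
        while time.time() - start < duration_s:
            writer.writerow([round(time.time() - start, 1),
                             psutil.cpu_percent(interval=None),
                             psutil.virtual_memory().percent])
            time.sleep(interval_s)
    print(f"run started {time.ctime(start)}, ended {time.ctime()}, data in {outfile}")

if __name__ == "__main__":
    collect_utilization(duration_s=30, interval_s=5)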
Scenario testing is also done here. A scenario means a set of transaction operations that are usually performed by the user. It is done to check whether a mix of operations/transactions executed concurrently by different users/machines meets the performance criteria.
Configuration performance testing is also done to ensure that the
performance of the product is compatible with different hardware. The
tests need to be repeated for different configurations. For a given
configuration, the product has to give the best possible performance
and if the configuration is better, it has to get even better. For
example, we show a sample configuration performance test.

TABLE 7.1 Sample Configuration Performance Test.

Transaction                     No. of users   Test environment
Querying ATM account balance    20             RAM 512 MB, P-IV dual processor, WIN-NT server (OS)
ATM cash withdrawal             20             RAM 128 MB, P-IV dual processor, WIN-98 (OS)
ATM user profile query          40             RAM 256 MB, P-IV quad processor, WIN-2000 (OS)

The performance test case is repeated for each row in this table
and factors such as the response time and throughput are recorded and
analyzed.
After the execution of performance test cases, various data points
are collected and the graphs are plotted. For example, the response
time graph is shown below:
FIGURE 7.20(A) Response Time.

Similarly, for throughput, we can draw:

FIGURE 7.20(B) Throughput.

Plotting the data helps in making an easy and quick analysis which
is difficult with only raw data.

(V) Analyzing the Performance Test Results


This is the multi-dimensional, thinking part of performance testing. Product knowledge, analytical thinking, and a statistical background are all absolutely essential.
Before data analysis, some calculations of data and its organization
are required. They are:
1. Calculating the mean of the performance test result data.
2. Calculating the standard deviation.
3. Removing noise and replotting and recalculating the mean and
standard deviation.
They are explained below:
The performance numbers must be reproducible for the customers. To ensure this, all performance tests are repeated multiple times and the average or mean of those values is taken. Thus, the probability that the performance data will be reproducible at a customer site, for the same configuration and load condition, also increases.
The standard deviation represents how much the data
varies from the mean. For example, if the average response
time of 100 people withdrawing money from an ATM is 100
seconds and the standard deviation is 2, then there is a greater chance that this performance data is repeatable than in a case
where the standard deviation is 30. A standard deviation close
to zero means the product performance is highly repeatable and
performance values are consistent. The higher the standard
deviation, the more the variability of the product performance.
When a set of values is plotted on a chart, one or two
values that are out of range may cause the graph to be cluttered
and prevent meaningful analysis. Such values can be ignored
to produce a smooth curve/graph. The process of removing
some unwanted values in a set is called noise removal. Because
some values are removed from the set, the mean and standard
deviation need to be re-calculated.
Also, note that the majority of client server, Internet, and
database applications store data in a local high-speed buffer
when a query is made. This is called caching. The performance
data need to be differentiated according to where the result is
coming from—the server or the cache. So, the data points can
be kept as two different sets—one for the cache and one from
the server. This helps in the extrapolation of performance data.
For example, let us assume that data in a cache can produce a
response time of 100 μs and a server access takes 2 μs and
80% of the time a request is satisfied by the cache. Then, the
average response time is given by
= (0.8) * 100 + 0.2 * 2
= 80 + 0.4 μs = 80.4 μs
Then, the mean response time is computed as a weighted average
rather than a simple mean.
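These calculations (mean, standard deviation, noise removal, and the cache-weighted mean) can be expressed in a few lines, as in the sketch below using Python's standard statistics module; the sample values and the 2-standard-deviation noise threshold are illustrative assumptions, and the cache figures are simply those from the text.

import statistics

# Hypothetical response times (seconds) from repeated runs of one test case.
samples = [98, 101, 99, 102, 100, 97, 160, 103]   # 160 is an outlier ("noise")

mean = statistics.mean(samples)
stdev = statistics.stdev(samples)

# Noise removal: drop values more than 2 standard deviations from the mean,
# then recompute the mean and standard deviation.
cleaned = [x for x in samples if abs(x - mean) <= 2 * stdev]
print("raw   :", round(mean, 1), round(stdev, 1))
print("clean :", round(statistics.mean(cleaned), 1), round(statistics.stdev(cleaned), 1))

# Cache-weighted mean, using the figures from the text:
# 80% of requests served from the cache (100 us), 20% from the server (2 us).
weighted_mean = 0.8 * 100 + 0.2 * 2
print("weighted mean response time:", weighted_mean, "us")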
Other factors such as garbage collection/defragmentation in the
memory of an OS or compiler also affect the performance data.

(VI) Performance Tuning


Once the parameters that affect performance are identified, we can repeat the performance test cases to further analyze their effect in getting better performance. This exercise is known as performance tuning. It requires skill in identifying such parameters and their contribution to performance; understanding the relationship among these parameters is equally important for performance tuning.
There are two steps involved here:
Step 1. Tuning the product parameters.
Step 2. Tuning the operating system parameters.
The first step involves a set of parameters associated with the
product where users of the product can set different values to obtain
optimum performance. Some of the factors are:
a. Providing a number of forked processes for parallel
transactions.
b. Caching.
c. Disabling low-priority operations.
d. Creating background activities.
e. Deferring routine checks to a later point of time.
f. Changing the sequence or clubbing a set of operations and so
on.
Setting different values to these parameters enhances the product
performance.
Some do’s are listed below (a small parameter-sweep sketch follows the list):
1. Repeat the performance tests for different values of each
parameter that impact performance.
2. Repeat the performance tests for a group of parameters and
their different values.
3. Repeat the performance tests for the default values of all
parameters. This is called factory settings tests.
4. Repeat the performance tests for low and high values of each
parameter and combinations.
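The parameter-sweep sketch referred to above is given here. It runs the same (placeholder) performance measurement for every combination of a few hypothetical product parameters and reports the best combination; the parameter names, their values, and measure_throughput are assumptions for illustration only.

import itertools

# Hypothetical tunable parameters with low, default, and high values.
parameters = {
    "worker_processes": [2, 4, 8],
    "cache_enabled":    [False, True],
    "batch_size":       [10, 50, 100],
}

def measure_throughput(settings):
    """Placeholder: run the performance test with `settings`, return transactions/s."""
    return 100.0 + 10 * settings["worker_processes"] + (50 if settings["cache_enabled"] else 0)

results = []
for combo in itertools.product(*parameters.values()):
    settings = dict(zip(parameters.keys(), combo))
    results.append((measure_throughput(settings), settings))

best_tps, best_settings = max(results, key=lambda r: r[0])
print(f"best throughput {best_tps:.1f} tps with {best_settings}")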
Please note that performance tuning provides better results only for
a particular configuration and for certain transactions. Therefore,
tuning may be counterproductive in other situations or scenarios. This side effect of tuning product parameters needs to be analyzed.
The second step involves the tuning of OS parameters. These
parameters can be:
1. File system-related parameters like the number of open files
allowed.
2. Disk management parameters like simultaneous disk
reads/writes.
3. Memory management parameters like virtual memory page
size and number of pages.
4. Processor management parameters like enabling/disabling
processors in a multiprocessor environment.
5. Network parameters like setting TCP/IP time out.
Please note that the OS parameters need to be tuned only when the
complete impact is known to all applications running in the machine.
They should be tuned only if they are absolutely necessary.
The results of the performance tuning are published in the form of
a guide called the performance tuning guide for customers so that they
can benefit from this exercise. It explains in detail the effect of each
product and OS parameter on performance. It also gives a set of
guideline values for combinations of parameters and indicates which parameter must be tuned in which situation.
(VII) Performance Benchmarking
Performance benchmarking is the process of comparing the
performance of product transactions to that of the competitors. No two
products can have the same architecture, design, functionality, and
code. Hence, it will be very difficult to compare two products on those
aspects. End-user transactions/scenarios could be one approach for
comparison.
In general, an independent test team or an independent
organization not related to the organizations of the products being
compared does performance benchmarking.
The steps involved in performance benchmarking are:
1. Identify the transactions/scenarios and the test configuration.
2. Compare the performance of different products.
3. Tune the parameters.
4. Publish the results of performance benchmarking.
Generally, the test cases for all products being compared are executed in the same test bed. The next step is to compare the results. A well-tuned product, A, may be compared with a product B that has had no parameter tuning to claim that product A performs better than B. It is therefore important in performance benchmarking that all products be tuned to the same degree.
There could be three outcomes from performance benchmarking.
1. Positive outcome. A set of transactions/scenarios out-perform
with respect to competition.
2. Neutral outcome. A set of transactions are comparable with
those of the competition.
3. Negative outcome. A set of transactions under-perform
compared to those of the competition.
Tuning can be repeated for all situations of positive, neutral, and
negative results to derive the best performance results.
The results of performance benchmarking are published. Two types
of publications are involved. They are:
a. An internal, confidential publication to product teams containing
all three outcomes given above and the recommended set of
actions.
b. The positive outcomes are normally published as marketing
collateral—which helps as a sales tool for the product.
Also benchmarks conducted by the independent organizations are
published as audited benchmarks.

(VIII) Capacity Planning


The planning of load generation capacity does not stop with the
specification of required hardware. We should also consider our
network’s capacity to transport a large number of transactions and a
huge amount of data. Both our hardware and network bandwidth must
be sufficient. This can cost a bit more. Capacity planning is the process in which the performance requirements and the performance test results are taken as inputs and the configuration needed to satisfy that set of requirements is derived. In that sense, capacity planning is the reverse of performance testing.
It necessitates a clear understanding of the resource requirement
for transactions/scenarios.
Some transactions of the product associated with certain load
conditions could be
a. Disk intensive
b. CPU intensive
c. Network intensive
d. Memory intensive
Some transactions may require a combination of these resources
for performing better. This understanding of what resources are
needed for each transaction is a prerequisite for capacity planning.
It is critical to consider some requirements during capacity
planning. The load can be
a. Short term, i.e., actual requirements of the customer for
immediate need.
b. Medium term, i.e., requirements for the next few months.
c. Long-term, i.e., requirements for the next few years.
Capacity planning corresponding to short-, medium-, and long-term requirements is called
1. Minimum required configuration
2. Typical configuration
3. Special configuration
A minimum-required configuration denotes that with anything less
than this configuration, the product may not even work. Thus,
configurations below the minimum-required configuration are usually
not supported.
A typical configuration denotes that under that configuration the
product will work fine for meeting the performance requirements of
the required load pattern and can also handle a slight increase in the
load pattern.
A special configuration denotes that capacity planning was done
considering all future requirements.
There are two techniques that play a vital role in capacity
planning. They are as follows:
a. Load balancing
b. High availability
Load balancing ensures that the multiple machines available are
used equally to service the transactions. This ensures that by adding
more machines more load can be handled by the product. Machine
clusters are used to ensure availability. In a cluster, there are multiple
machines with shared data so that in case one machine goes down, the
transactions can be handled by another machine in the cluster.
Capacity planning is based on the performance test data generated
in the test lab which is only a simulated environment. In real-life
deployment, there could be several other parameters that may impact
product performance.

7.2.5.7.4. Tools for performance testing


There are two types of tools.

FIGURE 7.21

Functional performance tools help in recording and playing back the transactions and obtaining performance numbers. This test generally involves very few machines.
Load testing tools simulate the load condition for performance
testing without having to keep that many users or machines. These
tools simplify the complexities involved in creating the load. This is
only a simulated load and real-life experience may vary from the
simulation. Some popular performance tools are:
I. Functional performance tools
• WinRunner (from Mercury)
• QA Partner (from Compuware)
• SilkTest (from Segue)
II. Load testing tools
• LoadRunner (from Mercury)
• QALoad (from Compuware)
• SilkPerformer (from Segue)
These tools can help in getting performance numbers only. The
utilization of resources is an important parameter that needs to be
collected as well. For example, the Windows Task Manager and the top command in Linux are such tools. Network performance monitoring tools are available with almost all OSs today to collect network data.

7.2.5.7.5. Performance testing challenges


1. The availability of required skills is one of the major problems
facing performance testing.
2. The comparison of functional performance numbers with load testing numbers becomes difficult because the builds used and the timelines are different, as the two are performed in different phases.
3. Performance testing requires a large number and amount of
resources such as hardware, software, effort, time, tools, and
people. This is another challenge.
4. It needs to reflect the real-life environment and expectations. Taking adequate care to create a test bed as close as possible to a customer deployment is another expectation for performance tests.
5. Selecting the right tool for the performance testing is another
challenge. These tools are expensive. They also expect the test
engineers to learn additional meta-languages and scripts.
6. Interfacing with different teams that include a set of customers
is another challenge.
7. Lack of seriousness on performance tests by the management
and development team is another challenge.

SUMMARY
We can say that we start with unit or module testing. Then we go in
for integration testing, which is then followed by system testing. Then
we go in for acceptance testing and regression testing. Acceptance
testing may involve alpha and beta testing while regression testing is
done during maintenance.
System testing can comprise “n” different tests. That is, it could mean:
1. End-to-end integration testing
2. User interface testing
3. Load testing in terms of
a. Volume/size
b. Number of simultaneous users
c. Transactions per minute/second (TPM/TPS)
4. Stress testing
5. Testing of availability (24 × 7)
Performance testing is a type of testing that is easy to understand
but difficult to perform due to the amount of information and effort
needed.

MULTIPLE CHOICE QUESTIONS


1. Alpha testing involves
a. Customers
b. Testers
c. Developers
d. All of the above.
2. Site for alpha testing is
a. Software development company
b. Installation site
c. Anywhere
d. None of the above.
3. Site of beta testing is
a. Software organization
b. Customer’s site
c. Anywhere
d. All of the above.
4. Acceptance testing is done by
a. Developers
b. Customers
c. Testers
d. All of the above.
5. Integration testing techniques are
a. Top down
b. Bottom up
c. Sandwich
d. All of the above.
6. Top-down approach is used for
a. Development
b. Identification of faults
c. Validation
d. Functional testing
7. Which of the following validation activities belong to low-level
testing:
a. Unit testing
b. Integration testing
c. System testing
d. Both (a) and (b).
8. Testing done to ensure that the software handles the required volume of data is called
a. Load testing
b. Volume testing
c. Performance testing
d. Stress testing
9. Number of sessions for integration testing is given by
a. Sessions = nodes + leaves – edges
b. Sessions = nodes – leaves – edges
c. Sessions = nodes – leaves + edges
d. None of the above.
10. An MM-path graph refers to
a. Module-to-module path graph
b. Manager-to-manager path graph
c. Many-module path graph
d. None of the above.

ANSWERS
1. a.
2. a.
3. b.
4. b.
5. d.
6. b.
7. d.
8. b.
9. c.
10. a.

CONCEPTUAL SHORT QUESTIONS WITH ANSWERS


Q. 1. Some scenarios have a low probability but a high potential impact. How do we tackle such scenarios?
Ans. This can be accomplished by adding a weight to each scenario as
follows:

Weight Meaning
+2 Must test, mission/safety critical
+1 Essential functionality, necessary for robust operation
+0 All other scenarios

Q. 2. How will the operational profile maximize system reliability for


a given testing budget?
Ans. Testing driven by an operational profile is very efficient because
it identifies failures on average, in order of how often they occur.
This approach rapidly increases reliability—reduces failure
intensity per unit of execution time because the failures that
occur most frequently are caused by the faulty operations used
most frequently. According to J.D. Musa, users will also detect
failures in order of their frequency, if they have not already been
found in the test.
Q. 3. What are three general software performance measurements?
Ans. The three general software performance measurements are as
follows:
a. Throughput: The number of tasks completed per unit time.
It indicates how much of work has been done in an interval.
It does not indicate what is happening to a single task.
Example: Transactions per second.
b. Response time: The time elapsed between input arrival and
output delivery. Here, average and worst-case values are of
interest.
Example: Click to display delay.
c. Utilization: The percentage of time a component is busy. It
can be applied to processor resources like CPU, channels,
storage, or software resources like file-server, transaction
dispatcher, etc.
Example: Server utilization.
Q. 4. What do you understand by system testing coverage?
Ans. System testing is requirements driven. The TC coverage metric
reflects the impact of requirements. [IEEE 89a]
FIGURE 7.22

Q. 5. What sort of verifications are required during system testing?


Ans. The following tests are suggested:
a. Conformity test: One of the major tasks during system
testing is to cover all requirements to ensure that all types of
system requirements are exercised to ascertain their
conformity.
b. Data coverage test: In this test, we select test data that cover
a broad range of usages. It employs important data values like
valid and invalid data and boundary values to test the
requirements in the first step.
Q. 6. Compare unit and integration testing in a tabular form.
Ans.
Unit testing                                    Integration testing
1. It starts from the module specification.     1. It starts from the interface specification.
2. It tests the visibility of code in detail.   2. It tests the visibility of the integration structure.
3. It requires complex scaffolding.             3. It requires some scaffolding.
4. It checks the behavior of single modules.    4. It checks the integration among modules.
Q. 7. Compare integration and system testing in a tabular form.
Ans.
Integration testing                                        System testing
1. It starts from the interface specification.             1. It starts from the requirements specification.
2. It tests the visibility of the integration structure.   2. It does not test the visibility of code.
3. It requires some scaffolding.                           3. It does not require any scaffolding.
4. It checks the integration among modules.                4. It checks the system functionality.

Q. 8. Consider a hypothetical “online railway reservation system.” Write a suitable scope statement for the system. Write all assumptions
and identify two test cases for each of the following:
i. Acceptance testing
ii. GUI testing
iii. Usability and accessibility testing
iv. Ad hoc testing
Document test cases.
Ans. Test cases for online railway reservation system:
Preconditions: Open web browser and enter URL in the address
bar. Homepage must be displayed.
REVIEW QUESTIONS
1. Differentiate between alpha and beta testing?
2. Explain the following: Unit and Integration testing?
3. a. What would be the test objective for unit testing? What are
the quality measurements to ensure that unit testing is
complete?
b. Put the following in order and explain in brief:
i. System testing
ii. Acceptance testing
iii. Unit testing
iv. Integration testing
4. Explain integration and system testing.
5. Write a short paragraph on levels of software testing.
6. a. Why is threaded integration considered an incremental
testing technique?
b. Given the module order in call graph:
Give the integration testing order for bottom-up testing.
7. Describe all methods of integration testing.
8. What do you mean by “big bang” integration strategy?
9. What do you mean by unit testing? Discuss how you would
perform unit testing of a module that implements bounded
stack of 100 integers that you have designed and coded.
Assume that the stack functions supported are push and pop.
10. What do you understand by the term integration testing? What
are its different methods? Compare the relative merits and
demerits of these different integration testing strategies.
11. What are drivers and stubs? Why are they required?
12. Consider a piece of embedded software that is part of a TV.
Which of the types of system testing discussed would you
choose to apply and at what times?
13. Each product has different tuning parameters. For each of the
following cases, identify the important tuning parameters:
a. OS (e.g., windows XP)
b. Database (e.g., oracle 11i)
c. A network card (e.g., a wireless LAN card)
14. What are the differences in the requirements gathering for
performance testing and for functional testing like black-box
testing? Discuss on the basis of sources used, methods used,
tools used, and skill sets required.
15. “Staffing people for performance testing is most difficult.”
Justify.
16. a. Explain how you test the integration of two fragment codes
with suitable examples.
b. What are the various kinds of tests we apply in system
testing? Explain.
17. Assume that you have to build a real-time multiperson computer game. What kinds of testing do you suggest or think are suitable? Give a brief outline and justification for any four
kinds of tests.
18. Discuss some methods of integration testing with examples.
19. a. What is the objective of unit and integration testing?
Discuss with an example code fragment.
b. You are a tester for testing a large system. The system data
model is very large with many attributes and there are
many interdependencies within the fields. What steps
would you use to test the system and what are the effects of
the steps you have taken on the test plan?
20. What is the importance of stubs? Give an example.
21. a. Explain BVT technique.
b. Define MM-path graph. Explain through an example.
c. Give the integration order of a given call graph for bottom-
up testing.
d. Who performs offline deployment testing? At which level
of testing it is done?
e. What is the importance of drivers? Explain through an
example.
22. Which node is known as the transfer node of a graph?
23. a. Describe all methods of integration testing.
b. Explain different types of acceptance testing.
24. Differentiate between integration testing and system testing.
25. a. What are the pros and cons of decomposition based
techniques?
b. Explain call graph and path-based integration testing. Write
advantages and disadvantages of them.
c. Define acceptance testing.
d. Write a short paragraph on system testing.
CHAPTER
8

OBJECT-ORIENTED TESTING
Inside this Chapter:
8.0. Basic Unit for Testing, Inheritance, and Testing
8.1. Basic Concepts of State Machines
8.2. Testing Object-Oriented Systems
8.3. Heuristics for Class Testing
8.4. Levels of Object-Oriented Testing
8.5. Unit Testing a Class
8.6. Integration Testing of Classes
8.7. System Testing (with Case Study)
8.8. Regression and Acceptance Testing
8.9. Managing the Test Process
8.10. Design for Testability (DFT)
8.11. GUI Testing
8.12. Comparison of Conventional and Object-Oriented Testing
8.13. Testing Using Orthogonal Arrays
8.14. Test Execution Issues
8.15. Case Study—Currency Converter Application

The techniques used for testing object-oriented systems are quite similar to those that have been discussed in previous chapters. The
goal is to provide some test design paradigms that help us to perform
Object-Oriented Testing (OOT).
Object-oriented software centers on a class (the basic unit) and the
inheritance and encapsulation that affect a class. However, procedural
programming controls the flow between software functions. Testing these object-oriented features calls for new strategies.
8.0. BASIC UNIT FOR TESTING, INHERITANCE, AND
TESTING
The class is the smallest unit for testing. This is due to the following
reasons
a. Methods are meaningless apart from their class.
b. We can design method tests only if we take a class into
consideration.
c. Class behavior should be verified before a class is reused or
promoted to a library.

The class clusters are the practical unit for testing. This is due to the
following reasons:
a. It is typically impossible to test a class in total isolation.
b. It is not cost effective to develop a complete set of drivers and
stubs for every class.
c. A smaller testing budget usually means larger clusters.
We have different types of classes, such as application-specific, general-purpose, and abstract or parameterized (template) classes. Therefore, we need different test requirements for each kind of class.

What About the Objects?


Testing a class instance (an object) can verify a class in isolation.
However, when untested objects of tested classes are used to create an
application system, the entire system must be tested before these
objects can be considered to be verified.
The major problems that we face during object-oriented testing are:
a. Dynamic (or late) binding requires a separate test.
b. Complex inheritance structures.
c. Interface errors because OO programs typically have many small
components and, therefore, more interfaces.
d. Objects preserve state but the state control, i.e., the acceptable
sequence of events, is typically distributed over an entire program.
This leads to state control errors.
e. Encapsulation is not a source of errors but may be an obstacle to
testing.
So we conclude that we now require developing testable state
models and state-based test suites.

What About Inheritance and Testing?

Weyuker gave some testing axioms that establish limits on the reusability of test cases. These axioms are discussed below.
Axiom-1: Antiextensionality. The same function may be computed
many ways, so test cases that cover method A do not necessarily cover
method B, even if A and B are functionally equivalent.
Axiom-2: Antidecomposition. Test cases that cover a client class do
not necessarily result in coverage of its server objects.
Axiom-3: Anticomposition. Tests that are individually adequate for
each method in a class are not collectively adequate for the class. The
individual test suites do not guarantee that all the class interactions
will be tested.
Axiom-4: General multiple change. Even when two methods have the
same structure (flow graph), test cases that cover method A do not
necessarily cover B.
Let us now consider what these axioms indicate for testing under
inheritance.

Case I: Extension
Suppose we change some methods in class A. Clearly,
We need to retest the changed methods.
We need to retest interaction among changed and unchanged
methods.
We need to retest unchanged methods if data flow exists between statements in the changed and unchanged methods.
But, what about unchanged subclass B, which inherits from A? By
anticomposition:

“The changed methods from A also need to be exercised in the unique context of the subclass.”
Now, suppose we add or modify subclass B, without any change to
superclass A. By antidecomposition and anticomposition:
“We need to retest inherited methods, even if they weren’t
changed.” For example, say we develop a class acct, test it and add it
to our class library.

FIGURE 8.1 Extension Case.

Here, we will not find the error unless we retest both number (DC)
and the inherited, previously tested, init balance( ) member functions.

Case II: Overriding


Suppose we add a subclass method B-1 which now overrides
superclass method A-1. The new method should be tested.
Would the superclass test cases for A-1 necessarily provide the
same coverage for the method B-1? The answer is:
a. No (by antiextensionality).
b. If the data domains and the control paths are only slightly different,
the superclass test cases will probably not provide the same
coverage.
This means that additional test cases are needed at each level of
the inheritance hierarchy.
For example, suppose we develop the classes account and cert of
deposit, test them, and then add both to our class library.

FIGURE 8.2

Later, we specialize cert of deposit to have its own rollover ( ). That is,
FIGURE 8.3 Overriding Case.

It is likely that both the specification and implementation of the
new method will be different, so reusing the test cases for
account::rollover( ) won’t give us the coverage we need.

SOLVED EXAMPLES
EXAMPLE 8.1. Consider the following code for the shape class
hierarchy.
What kind of testing is required for this class hierarchy?
SOLUTION. We can use method-specific retesting and test case
reuse for the shape class hierarchy.
Let D = denote, develop, and execute test suite.
R = Reuse and execute superclass test suite.
E = Extend and execute superclass test suite.
S = Skip, super’s test adequate.
N = Not testable.
Then, we get the following table that tells which type of testing is
to be performed.

Method-Specific Test Reuse


The method-specific retesting and test case reuse for the shape class
hierarchy is shown below.
D = Denote, develop, and execute test suite
R = Reuse and execute superclass test suite
E = Extend and execute superclass test suite
S = Skip, super’s test adequate
N = Not testable

Method-Specific Test Reuse


Each kind of derivation calls for a different method-specific test strategy.

NA = Not Applicable
1. By antiextensionality (different implementation of the same
function requires different tests).
2. By anticomposition (individual method test cases are not sufficient
to test method interactions).
3. New test cases are needed when inherited features use any locally
defined features.
4. Superclass test cases are reusable only to the extent that the super
and subclass have the same pre- and post-conditions.
5. A pure extension subclass is an exception.
The subclass consists of entirely new features.
There are no interactions between the new and inherited
features.

8.1. BASIC CONCEPTS OF STATE MACHINES


A state machine is an abstract, formal model of sequential
computation. It is defined as a system whose output is determined by
both current and past input. Although most classes retain some
representation of previous messages, only certain classes are sensitive
to the sequence of past messages. A state machine can model the
behavior of such classes. They are also called sequential machines.
State models have many uses in software engineering when
sequence is important. Classes (objects) are readily modeled by state
machines. It is the state of an object that may limit the messages it
accepts. It typically determines its response to messages. The state
determined message response patterns are called the behavior of an
object. We can easily produce powerful test suites for behavior
verification from a state model of a class.
A system is said to exhibit state-based behavior when identical
inputs are not always accepted and, when they are accepted, may
produce different outputs. State machines are an engineering
application of mathematical models known as finite automata.

State-Based Behavior
A state machine accepts only certain sequences of input and rejects all
others. Each accepted input/state pair results in a specific output.
State-based behavior means that the same input is not necessarily
always accepted and when accepted, does not necessarily produce the
same output.
This simple mechanism can perform very complex control tasks.
Examples of sequentially constrained behavior include:
a. GUI interface control in MS-Windows
b. Modem controller
c. Device drivers with retry, restart, or recovery
d. Command syntax parser
e. Long-lived database transactions
f. Anything designed by a state model
The central idea is that sequence matters. Object behavior can be
readily modeled by a state machine.
A state machine is a model that describes behavior of the system
under test in terms of events (inputs), states, and actions (outputs).

FIGURE 8.4

A state machine has four building blocks:


1. State: An abstraction which summarizes the information
concerning past inputs that is needed to determine the behavior of
the system on subsequent inputs.
2. Transition: Any allowable change from one state to another.
3. Event: An input, an interval of time, or other condition that causes
a transition.
4. Action: The result or output that follows a transition.

Its graphic representation is shown in Figure 8.5.


FIGURE 8.5 State Transition Diagrams.

A transition takes the system from one state to another state.


A transition has a pair of states, the accepting state and the
resultant state.
A machine may be in only one state at a time.
The current state refers to the active state.
A mechanism of a state machine has several steps. They are as
follows:
Step 1. The machine begins in the initial state.
Step 2. The machine waits for an event for an indefinite interval.
Step 3. An event presents itself to the machine.
Step 4. If the event is not accepted in the current state, it is ignored.
Step 5. If the event is accepted in the current state, the designated
transition is said to fire. The associated output action (if any)
is produced and the state designated as the resultant state
becomes the current state. The current and the resultant state
may be the same.
Step 6. The cycle is repeated from Step 2 unless the resultant state is
the final state.
The UML notation for a state-transition diagram (STD) is given next.
It is a graphic representation of a state machine.
Nodes represent states.
Arrows represent transitions.
The annotations on the edge represent events and actions.
FIGURE 8.6 STD of Water Tank System.

A Water-Tank Example—STD
We draw STD of a water tank system with a valve. This valve can be
in one of the two states—shut or open. So, we have the following:

State Tables/State Transition Table


A state-to-state table shows the current state (rows) by the next state
(columns). Cells with an event/action pair define transitions. State-
transition diagrams (STDs) are useful with a relatively small number
of states (up to 20 states). For more states, state transition tables are
compact and ease systematic examination.

Properties of Finite State Automata


Some properties of finite automata have practical implications for
testing. They are as follows:
1. Any two states are equivalent if all possible event sequences
applied to these states result in identical output action sequences.
That is, if we start in either state and apply all possible combination
of events up to and including those that result in a final state, we
cannot tell from which state we started by comparing output
actions.
A minimal state machine has no equivalent states. A model with
equivalent states is at best redundant and probably incorrect or
incomplete in some more serious way.
2. A state, Si, is reachable from another state, Sj, when a legal event
sequence takes the machine from Sj to Si.
When some state is said to be reached without specifying a starting
state, the initial state is assumed.
Reachability analysis is concerned with determining which states
are reachable.
Non reachable states arise for several reasons:
a. Dead state: No other state is reachable from it.
b. Dead loop: No state outside of the loop may be reached.
c. Magic state: It has no inbound transition but provides
transition to other states. It is an extra initial state.
3. A dead state/loop is appropriate in some rare cases but generally is
an error.
4. A magic state is always an error.

Mealy/Moore Machines
There are two main variants of state models (named for their
developers).

Moore Machine:
Transitions do not have output.
An output action is associated with each state. States are active.

Mealy Machine:
Transitions have output.
No output action is associated with state. States are passive.
In software engineering models, the output action often represents
the activation of a process, program, method, or module.
Although the Mealy and Moore models are mathematically
equivalent, the Mealy type is preferred for software engineering.
A passive state can be precisely defined by its variables. When the
same output action is associated with several transitions, the Mealy
machine provides a more compact representation.

Conditional/Guarded Transitions
Basic state models do not represent conditional transitions. This is
remedied by allowing Boolean conditions on event or state variables.
Consider a state model for stack class. It has three states: empty,
loaded, and full.
We first draw the state machine model for a STACK class, without
guarded transitions. The initial state is “Empty.”
What is a guard? It is a predicate expression associated with an
event. Now, we draw another state machine model for STACK class,
with guarded transitions.

FIGURE 8.7 State Model of a Stack Without Guards.

FIGURE 8.8 State Machine Model of Stack With Guards.

Conditions/guards are Boolean expressions on:


a. Variables of the class under test.
b. Parameters of the messages sent to the class under test.

There are two kinds of conditions:


a. Acceptance conditions specify the conditions under which a
message is accepted.
b. Resultant conditions specify the conditions under which one of the
several possible states may result from the same message.
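As a rough illustration of guards, the acceptance and resultant conditions
of the stack model above can be written as Boolean expressions over the
instance variables of the class under test. The capacity constant and the
method names below are assumptions for this sketch, not taken from Figure 8.8.

// States: Empty, Loaded, Full.  Guards are Boolean expressions on the
// instance variable 'count' and the class constant 'MAX'.
enum class StackState { Empty, Loaded, Full };

class Stack {
    static const int MAX = 10;        // assumed capacity
    int items[MAX];
    int count = 0;
public:
    StackState state() const {
        if (count == 0)   return StackState::Empty;
        if (count == MAX) return StackState::Full;
        return StackState::Loaded;
    }
    void push(int x) {
        // Resultant condition: push in Loaded goes to Full only when
        // [count == MAX - 1]; otherwise the machine stays Loaded.
        if (count < MAX) items[count++] = x;
    }
    void pop() {
        // Acceptance condition: pop is accepted only when [count > 0].
        if (count > 0) --count;
    }
};
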

EXAMPLE 8.2. Draw a state model of a two-player volley ball game.

SOLUTION.
The game starts.
The player who presses the start button first gets the first serve.
The button press is modeled as the player-1 start and player-2
start events.
The current player serves and a volley follows. One of three things
ends the volley.
If the server misses the ball, the server’s opponent becomes the
server.
If the server’s opponent misses the ball, the server’s score is
incremented and the server gets another chance.
If the server’s opponent misses the ball and the server’s score is at
the game point, the server is declared winner.
FIGURE 8.9

Here, 20 is the game point, so a score of 21 wins.

Design Guidelines–Basic Requirements


1. One state must be designated as the initial state.
2. At least one state must be designated as the final state. If not,
assumptions about termination should be made explicit.
3. There are no equivalent states. (Equivalent states produce exactly
the same response for all events visible to the machine in question.)
4. Every defined event and every defined action must appear in at
least one transition.
5. There should be an explicit definition of the mechanism to handle
events which are implicitly rejected.

Limitations of the Basic State Models


1. State diagrams can be easily drawn or read for up to about 20 states.
Tabular representation helps but is not suitable for a large number of
states.
2. Concurrency cannot be modeled. Only a single state may be active,
so the basic model cannot accommodate two or more simultaneous
transition processes. Only one state may be designated as the
current state. For example, tasks running on separate computers in
a client/server system.
3. With even a moderately large number of states, a state diagram
becomes a spaghetti ball.
4. Basic models do not offer any mechanisms for partitioning or
hierarchic abstraction.
Statecharts extend the basic model to remove these deficiencies.

What Are Statecharts?


Statecharts use a graphic shorthand to avoid enumeration of all states.
They overcome the basic machine’s scalability and concurrency
limitations.

Properties of Statecharts
1. They use two types of state: group and basic.
2. Hierarchy is based on a set-theoretic formalism (hypergraphs).
3. Easy to represent concurrent or parallel states and processes.
Therefore, we can say that:

A basic state model and its equivalent statechart are shown below:

FIGURE 8.10

We have already discussed the STD. We now discuss its equivalent,
the statechart. In Figure 8.10(b), we observe the following:
1. State D is a super state. It groups states A and C because they share
common transitions.
2. State A is the initial state.
3. Event f fires the transition AB or CB, depending on which state is
active.
4. Event g fires AC but only if C is true (a conditional event).
5. Event h fires BC because C is marked as the default state.
6. The unlabelled transition inside of state D indicates that C is the
default state of D.
Statecharts can represent hierarchies of single-thread or concurrent
state machines.
Any state may contain substates.
The substate diagram may be shown on a separate sheet.
Decomposition rules are similar to those used for data flow
diagrams (DFD). Orthogonal superstates can represent several
situations.
The interaction among states of separate classes (objects).
The non-interaction among states of separate processes which
proceed independently. For example, “concurrent,” “parallel,”
“multi-thread,” or “asynchronous” execution.
Statecharts have been adapted for use in many OOA/OOD
methodologies including:
Booch OOD
Object modelling technique (OMT)
Object behavior analysis (OBA)
Fusion
Real-time object-oriented modeling (ROOM)

EXAMPLE 8.3. Consider a traffic light control system. The traffic


light has five states—red, yellow, green, flashing red, and OFF. The
event, “power on,” starts the system but does not turn on a light. The
system does a self test and if no faults are recognized, the no fault
condition becomes true. When a reset event occurs and the no fault
condition holds, the red on event is generated. If a fault is raised in any
state, the system raises a “Fault” event and returns to the off state.
Draw its
a. State transition diagram.
b. State chart.

SOLUTION. (a) We first draw its state transition diagram (STD)


shown in Figure 8.11.
The unlabeled transition inside the cycling state shows that red is
the default state of cycling.
An aggregation of states is a superstate and the model within an
aggregation is a process. The entire model is also a process,
corresponding to the entire system under study.
The traffic light model has three processes: the traffic light system,
the on process, and the cycling process.
The states enclosed in the undivided box are mutually exclusive,
indicating that the system may be in exactly one of these states at a
time. This is called XOR decomposition.
The two superstates “on” and “cycling” show how transitions are
simplified.
FIGURE 8.11 STD for Traffic Light.

Superstate “on” means the system is either cycling or flashing red.


Superstate cycling groups the states red, yellow, and green
because they share common transitions.
Off is the initial state, reached by the power on event.
The event flashing red on fires the transition red-flashing red on,
yellow-flashing red on, or green-flashing red on depending on
which state is active.
The event reset fires off-on, but only if no fault is true. This is a
guarded transition.
The event reset fires off-red on because red on is marked as the
only default state within both superstates on and cycling.
In a state chart, a state may be an aggregate of other states (a
superstate) or an atomic state.
In the basic state model, one transition arrow must exist for each
transition having the same event, the same resultant state but
different accepting states. This may be represented in a statechart
with a single transition from a superstate.

Figure 8.12 shows the statechart for the traffic light system.
FIGURE 8.12 Statechart for the Traffic Light.

Concurrent states are represented by divided superstates. This is


called AND decomposition. That is, one state or superstate in each
partition is active.

EXAMPLE 8.4. Consider the behavior of two automotive control


systems: cruise control and antilock brake control. These systems
share a common event (brake applied) but are otherwise independent.
Draw its statechart.

SOLUTION. Please note the following points here.


To be in the moving state means that both substates cruise control
and antilock brake control are active.
The substates of cruise control (off, suspended, and active) are
mutually exclusive as are the substates of antilock brake control
(free wheeling, braking, and modulating).
Multiple transitions are used to model constructor behavior.
Subclass constructors typically percolate up the hierarchy so a
complete flattened model must show the chain of events that happens
upon construction of a subclass.

Concatenation
Concatenation involves the formation of a subclass that has no locally
defined features other than the minimum requirement of a class
definition.

State Space
A subclass state space results from the two factors:

FIGURE 8.13

a. The superclass state space is inherited to the extent that


superclass instance variables are visible.
b. The subclass adds state space dimensions to the extent that it
defines or overrides instance variables.

For a subclass to be well-formed and testable:


a. Orthogonal composition: Each subclass inherits all of the
non-private (public) superclass features without redefinition and
defines some new local features. Subclass extensions must be
orthogonal. The inherited state space must not be spindled,
folded, or mutilated by the extensions.
b. The inherited state space must be the same or smaller. The
subclass should not redefine inherited instance variables in any
other way.
c. Superclass private variables must be orthogonal in the state
space formed by inheritable variables. In other words, the effect
of superclass private variables on the superclass state space
must be additive.

TIPS

Class hierarchies that do not meet these conditions are very likely to
be buggy.

How Do State Machines Fail?


i. State machines fail due to control faults like:
a. A missing or incorrect transition (the resultant state is
incorrect).
b. A missing or incorrect event (a valid message is ignored).
c. A missing or incorrect action.
d. An extra, missing, or corrupt state (behavior becomes
unpredictable).
e. A sneak path (a message is accepted when it should not be).
f. An illegal message failure (an unexpected message causes a
failure).
g. A trap door (the implementation accepts undefined
messages).

Missing transition:

FIGURE 8.14

Player-2 loses the volley but continues as server.


Incorrect transition/resultant state: After player-2 missed, the
game resets.

FIGURE 8.15

Missing action:
FIGURE 8.16

No volley is generated. System will wait indefinitely.


ii. State machines fail due to incorrect actions like:
a. Sneak path
b. Corrupt state
c. Illegal message failure
d. Trap doors
Sneak path: The implementation accepts an event that is illegal or
unspecified for a state. For example, when player-2 is serving, player-
2 can win if his or her start button is pressed.
Corrupt state: The implementation computes a state that is not
valid.
If player-1 is serving at game point and player-2 misses, the game
crashes and cannot be restarted.

FIGURE 8.17

Illegal message failure: The implementation fails to handle an


illegal message correctly. Incorrect output is produced, the state is
corrupted, or both.
For example, if player-1 presses the player select button after
serving, the game crashes and can’t be restarted.
Trap door: The implementation accepts an event that is not
defined in the specification. For example, when player-1 is serving,
player-1 can win any time by pressing the scroll lock key. A trapdoor
can result from:
1. Obsolete features that were not removed when a class was revised.
2. Inherited features that are inconsistent with the requirements of a
subclass.
3. Undocumented features added by the developer for debugging
purposes.

EXAMPLE 8.5. Consider a class TwoPlayerGame. This class acts as


a base class. The class ThreePlayerGame is derived from this class.
Draw its class hierarchy and statecharts at class scope. Also, draw its
flattened transition diagram for ThreePlayerGame.

SOLUTION. We draw its class hierarchy first. This diagram is known


as a class diagram. Both of them are shown in Figure 8.18. Its
flattened diagram is shown in Figure 8.19.
FIGURE 8.18 ThreePlayerGame Class Hierarchy and Statecharts at
Class Scope.
FIGURE 8.19 Flattened Transition Diagram, Three Player Game.

EXAMPLE 8.6. Write an algorithm that produces a transition tree


from a state transition diagram (STD). Draw a transition tree for three
player game.
SOLUTION.
1. The initial state is the root node of the tree.
2. Examine the state that corresponds to each non-terminal leaf node
in the tree and each outbound transition on this state. At least one
new edge will be drawn for each transition. Each new edge and
node represents an event and resultant state reached by an
outbound transition.
a. If the transition is unguarded, draw one new branch.
b. If the transition guard is a simple predicate or a complex
predicate composed of only AND operators, draw one new
branch.
c. If the transition guard is a complex predicate using one or more
OR operators, draw a new branch for each truth value
combination that is sufficient to make the guard TRUE.
3. For each edge and node drawn in Step 2:
a. Note the corresponding transition event, guard, and action on
the new branch.
b. If the state that the new node represents is already represented
by another node (anywhere in the diagram) or is a final state,
mark this node as terminal because no more transitions are
drawn from this node.
4. Repeat Steps 2 and 3 until all leaf nodes are marked terminal.
FIGURE 8.20 Transition Tree for ThreePlayerGame.
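The algorithm above can be sketched in C++ as follows, assuming the STD is
held as a flat list of (from, event, guard, action, to) records; the
identifiers are illustrative, and the OR-guard expansion of Step 2(c) is
only noted in a comment.

#include <set>
#include <string>
#include <vector>

struct Transition { std::string from, event, guard, action, to; };

struct TreeNode {
    std::string state;
    Transition  via;       // event/guard/action noted on the branch (Step 3a)
    bool        terminal = false;
    std::vector<TreeNode> children;
};

// Steps 2-4: expand every non-terminal leaf; a node whose state is already
// represented in the tree, or is a final state, is marked terminal and is
// not expanded further.  (A guard built from OR operators would need one
// branch per truth-value combination that makes it true; not shown here.)
void expand(TreeNode& node, const std::vector<Transition>& stdModel,
            std::set<std::string>& seen) {
    for (const Transition& t : stdModel) {
        if (t.from != node.state) continue;      // one branch per outbound transition
        TreeNode child;
        child.state = t.to;
        child.via   = t;
        if (!seen.insert(t.to).second)
            child.terminal = true;               // state already represented
        else
            expand(child, stdModel, seen);
        node.children.push_back(child);
    }
}

// Step 1: the initial state becomes the root node of the tree.
TreeNode buildTransitionTree(const std::string& initial,
                             const std::vector<Transition>& stdModel) {
    TreeNode root;
    root.state = initial;
    std::set<std::string> seen{initial};
    expand(root, stdModel, seen);
    return root;
}
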

EXAMPLE 8.7. For a three player game, what conformance test suite
will you form? Derive the test cases.
SOLUTION.

TABLE 8.1 Conformance Test Suite for Three Player Game.


EXAMPLE 8.8. For a three player game, what sneak path test suite
will you form? Derive the test cases.
SOLUTION.

TABLE 8.2 Sneak Path Test Suite for Three Player Game.
8.2. TESTING OBJECT-ORIENTED SYSTEMS
Conventional test case designs are based on the process they are to test
and its inputs and outputs. Object-oriented test cases need to
concentrate on the state of a class. To examine the different states, the
cases have to follow the appropriate sequence of operations in the
class. The class becomes the main target of OO testing. Operations of a
class can be tested using the conventional white-box methods and
techniques (basis path, loop, data flow), but the notion is to apply
these at the class level instead.

8.2.1. IMPLEMENTATION-BASED CLASS TESTING/WHITE-BOX OR STRUCTURAL TESTING
Implementation-based tests are developed by analyzing how a class
meets its responsibilities. Implementation-based tests are developed by
analyzing source code. This is also called structural, white-box, clear-
box, or glass-box testing.
The main approaches to implementation-based testing are:
Control flow testing
Path coverage
Data flow testing

These techniques are further classified as follows:


FIGURE 8.21 Types of White-Box Testing Techniques.

We will discuss these techniques one by one.

(I) Control Flow Testing


What Is Coverage? Coverage is a key concept. It is a measure of test
completeness. It is denoted by Cx: the percentage of the test cases
identified by technique x that were actually run. So,
Cx = (test cases of type x actually run ÷ test cases identified by technique x) × 100

Following are some basic coverage metrics:


C1 or Statement Coverage
100 percent statement coverage means every statement has been
executed by a test at least once.
C1 coverage (i.e., 100% statement coverage) is the minimum
required by IEEE standard 1008, the standard for software unit
testing. It has been IBM’s corporate standard for nearly 30 years.
C1 coverage is the absolute minimum for responsible testing. It is
the weakest possible coverage.
C2 or Decision Coverage
100% decision coverage means that every path from a predicate
has been executed at least once by a test.
For structured programs, C1 and C2 are equivalent. That is, you
will cover all statements in a structured program if you cover all
branches.
Control Graphs: Control flow faults account for 24% of all
conventional software faults. For example,
Correct code:  a = myobject.get( );  if (a == b) .....
Fault:         a = myobject.get( );  if (a = b) .....
Structured programming and small methods decrease the
opportunity for these kinds of faults but do not eliminate them.
Implementation-based testing can reveal subtle faults in control flow,
and provides a basis for coverage analysis.

Consider the following C++ code snippet:

1  void abx (int a, int b, int x)
   {
       if ((a > 1) && (b == 0))
2          x = x / a;
3      if ((a == 2) || (x > 1))
4          x = x + 1;
5      cout << x;
   }

We draw its control graph as it is our main tool for test case
identification shown in Figure 8.22.
FIGURE 8.22 Flowgraph for C++ Code.

Statement coverage for this C++ code: It simply requires that a


test suite cause every statement to be executed at least once.
We can get 100% C1 coverage for the abx method with one test
case.

C1 is not sufficient to discover common errors in logic and iteration


control.
Suppose the condition in node-3 is incorrect and should have been
x > 0.
There are many C1 test sets that would miss this error.
Suppose path 1-3-5 has a fault (x is not touched on this path).
There are many C1 test sets that would miss this fault.
Statement coverage is a very weak criterion. However, on average,
adhoc testing covers only half of the statements.

Predicate Testing
A predicate is the condition in a control statement: if, case, do while,
do until, or for. The evaluation of a predicate selects a segment of code
to be executed.

There are four levels of predicate coverage [Myers]:


Decision coverage
Condition coverage
Decision/condition coverage
Multiple condition coverage
Each of these subsumes C1 and provides greater fault detecting
power. There are some situations where predicate coverage does not
subsume statement coverage:
Methods with no decisions.
Methods with built-in exception handlers, for example, C++
try/throw/catch.

Decision Coverage
We can improve on statement coverage by requiring that each decision
branch be taken at least once (at least one true and one false
evaluation). Either of the test suites below provides decision (C2)
coverage for the abx ( ) method.

TABLE 8.3 Test Suite for Decision Coverage of abx ( ) Method (function).

However, C2 is not sufficient to discover common errors in logic


and iteration control. For example, we can meet C2 without testing the
path where x is untouched (1-3-5) as test suite D2 shows. Suppose
condition-4 in statement number 3 is incorrectly typed as x < 1 then
neither of the C2 test suites would detect this fault.

Condition Coverage
Decision coverage does not require testing all possible outcomes of each
condition. Condition coverage improves on this by requiring that each
condition be evaluated as true or false at least once. There are four
conditions in the abx ( ) method.

Condition    True domain    False domain
a > 1        a > 1          a ≤ 1
b = 0        b = 0          b ≠ 0
a = 2        a = 2          a ≠ 2
x > 1        x > 1          x ≤ 1
Either of the following test suites will force at least one evaluation
of every condition. They are given below:

However, this is not sufficient to discover common errors in logic


and iteration control. For example, it is possible to skip a branch and
still cover the conditions. This is shown by test suite C2. Because
condition coverage can miss nodes, it can miss faults.

Decision/Condition Coverage

Condition coverage does not require testing all possible branches.


Decision/condition coverage could improve on this by requiring that
each condition be evaluated as true or false at least once and each
branch be taken at least once.
However, this may be infeasible due to short-circuit Boolean
evaluation.
Most programming languages evaluate compound predicates from
left to right, branching as soon as a sufficient Boolean result is
obtained. This allows statements like
if (a = 0) or (b/a > c) then .....

to handle a = 0 without causing a “divide-by-zero” exception. This can


prevent execution of compound conditions.
We are now in a position to have decision/condition coverage for
the abx ( ) method with two test cases.

But due to short-circuit evaluation, the false domains of


b ≠ 0 and x ≤ 1 are not checked.

Multiple Condition Coverage


The preceding problems occur because we have not exercised all
possible outcomes of each condition for each decision.
Multiple condition coverage requires test cases that will force all
combinations of simple conditions to be evaluated. If there are n
simple conditions, there will be, at most, 2^n combinations. Multiple
condition coverage subsumes all lower levels of predicate coverage.
We now derive the test cases based on multiple condition coverage
for the abx ( ) method. They are given below.

TABLE 8.4 Test Cases for abx ( ) Method/Function.


Please note the following observations:
a. We usually don’t need 2^n test cases for multiple condition
coverage.
b. Multiple condition coverage does not guarantee path coverage. For
example, path 1-2-3-5 is not exercised.
c. If a variable is changed and then used in a predicate, reverse data
flow tracing may be needed to establish test case values. This is
called path sensitization.
The following table shows how each condition is covered by the
M test suite (for the abx ( ) method).
So, there are 8 possible conditional variants. And we are able to
exercise all 8 with only 4 test cases.
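
Table 8.4 is not reproduced here, but a hedged sketch of a four-case driver
that exercises all eight condition outcomes of abx ( ) is shown below; the
inputs and expected outputs are hand-derived from the corrected code earlier,
not copied from the table. (Because of C++ short-circuit evaluation, some
individual conditions are not actually evaluated at run time, as discussed
above.)

#include <iostream>

void abx(int a, int b, int x);   // the method under test, shown earlier

int main() {
    // Expected values printed: 3, 2, 3, 1 (hand-traced; abx prints no separator).
    abx(2, 0, 4);   // a>1 T, b==0 T, a==2 T, x>1 T
    abx(2, 1, 1);   // a>1 T, b==0 F, a==2 T, x>1 F
    abx(1, 0, 2);   // a>1 F, b==0 T, a==2 F, x>1 T
    abx(1, 1, 1);   // a>1 F, b==0 F, a==2 F, x>1 F
}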

Impossible Conditions and Paths


Some predicate domains or programming structures may render a path
impossible. There are at least four causes of impossible conditions and
paths.
1. Boolean short circuits, as discussed earlier.
2. Contradictory or constraining domains.
Given:
if (x > 2) && (x < 10) then ....

Only three of the four condition combinations can be obtained. x
cannot be simultaneously less than 3 and greater than 9, so the
false-false case is impossible.
3. Redundant Predicates.
if (x) <A>
else <B>
endif
if not (x) <C>
else <D>
endif

For the same value of x, paths <A> <C> and <B> <D> are not
possible.
Because the predicates could be merged, you may have found a
fault or at least a questionable piece of code.
4. Unstructured Code.
Acceptable: Exception handling, break.
Not acceptable: Anything else.

Consideration for Loops


Loops also pose some unique test problems. Loops may be simple
loops, nested loops, serial loops, and spaghetti loops. We shall discuss
each one by one.
a. Simple loops: All loops should get at least two iterations. This
is the minimum needed to detect data initialization/use faults.
We focus on boundary conditions because they are a frequent
source of loop control faults.
A simple loop is either a do, while, or repeat-until form with a
single entry and a single exit. A complete test suite will
exercise the domain boundary of the loop control variable such
as (minimum-1), minimum (possibly zero), (minimum +1),
typical, (maximum –1), maximum, (maximum +1), and so on; a
sketch of such a suite follows this discussion of loops.
b. Nested loops: A nested loop is a simple loop contained in
another simple loop. Multiplicative testing is overkill. For
example, if we use the five basic loop test cases then:
i. We would have 25 test cases for a doubly nested loop.
ii. We would have 125 test cases for a triply nested loop.
Beizer suggests that we should start with the innermost loop
first and then go to the outer loops to test nested loops.
Beizer suggests the following procedure:
Step 1. Test the inner loop first, and the outer loop last.
Step 2. Run the five basic tests on the inner loop:
a. Min, min +1, typical, max –1, max
b. Set all outer loop controls to minimum values
c. Add excluded and out of range checks, if needed
Step 3. For the next loop out:
Run min, min +1, typical, max –1, max.
Set the inner and outer loop controls to typical values.
Add excluded and out of range checks, if needed.
Step 4. Repeat Step-3 for each inner loop.
Step 5. When all loops check out individually, set all values at
max and test all loops together. With two-level loops, this
requires 12 test cases, 16 with three-level, and 19 with four-
level.
c. Serial loops: Two or more loops on the same control path are
serial loops. To test them
If there are any define/use data relationships between the
loops, treat them as if they were nested.
Use data flow analysis to determine test cases.
If there is no define/use relationship between the loops, they
can be tested in isolation.
d. Spaghetti loops: It is a loop which has more than one entry or
exit point. Such loops, if nested, can result in very complex
control flow paths.
To test them, we can follow one of the following methods:
a. Reject such loops and return the code to the developer for
correction.
b. Design another equivalent single-entry/single-exit loop.
Develop your test cases from this design, then add test data
for every weird path you can think of.
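
A sketch of the basic boundary tests for a simple loop (item (a) above);
the summing function and its bound of 10 are assumptions made for the
illustration.

#include <cassert>

// Hypothetical method under test: sums the first n elements of a 10-element array.
int sumFirst(const int* a, int n) {
    int total = 0;
    for (int i = 0; i < n && i < 10; ++i)
        total += a[i];
    return total;
}

// Boundary tests on the loop control variable n:
// minimum (0 iterations), minimum+1, typical, maximum-1, maximum,
// plus an out-of-range check (maximum+1) to verify the loop does not overrun.
void testSumFirst() {
    int data[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    assert(sumFirst(data, 0)  == 0);    // minimum
    assert(sumFirst(data, 1)  == 1);    // minimum + 1
    assert(sumFirst(data, 5)  == 15);   // typical
    assert(sumFirst(data, 9)  == 45);   // maximum - 1
    assert(sumFirst(data, 10) == 55);   // maximum
    assert(sumFirst(data, 11) == 55);   // maximum + 1 (excluded value)
}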

(II) Path Coverage Testing

The number of unique independent paths through an undirected graph
is given by graph theory. The number of independent paths is called
the cyclomatic complexity of the graph, denoted by V(G). They are
called basis paths because they correspond to the basis vectors in a
matrix representation of an undirected graph.
This number is not defined for directed (e.g., control flow) graphs.
However, it has become a popular metric for the complexity and test
coverage.
There are three methods to compute V(G). They are as follows:
a. V(G) = e – n + 2 (e = edges, n = nodes)
b. V(G) = Number of enclosed regions +1
c. V(G) = Number of predicate nodes +1

These expressions hold for the complexity, C, of single-entry,
single-exit programs.

Basis Paths. Consider the following function foo ( ).

Its flowgraph is shown in Figure 8.23.


Here, Number of edges (e) = 8
Number of nodes (n) = 7
∴ V(G) = e – n + 2 = 8 – 7 + 2 = 3.
FIGURE 8.23

This indicates there are at least 3 entry/exit paths in this graph:


Path 1: 1-2-4-5-7
Path 2: 1-3-4-5-7
Path 3: 1-3-4-5-6-7
or they can be:
Path 1: 1-2-4-5-7
Path 2: 1-3-4-5-7
Path 3: 1-2-4-5-6-7
These are called basis paths.
This technique has already been discussed in previous chapters.

(III) Data Flow Testing


We have already discussed this technique in earlier chapters.

(IV) Intra Class Coverage


It further involves two techniques:
a. The FREE flow model
b. The C* metric (or V*(G) metric).
We shall discuss these techniques one by one.
a. The FREE flow graph: We have seen several ways to identify
paths for single-entry, single-exit code. However, a class typically
contains many single-entry, single-exit (SESE) code segments. In
C++, each member function is a SESE segment. The methods in a
class access the same instance variables and must cooperate for the
correct execution under all possible activation sequences. What is
the control and data flow for the entire class? To answer such
questions, we need:
a. A state model for the class under test.
b. A control flow graph for each method.
We will use the FREE state model.
State is the result of method activation.
A single state may be computed by one or several methods.
We use a concrete, aggregate model of state.
Events are either:
Method activation.
Interrupts that change the state of an instance variable, i.e., we
are testing an actor class.
Actions are either:
Message response (values returned to the client).
Messages sent to servers or other interfaces.
The FREE flow graph is based on the following observation:
Methods compute state.
A single transition edge may be replaced by the flow graph of
the method which causes the transition.
A FREE flow graph is constructed in two steps:
1. Method graphs are substituted for transitions.
2. Edges are added to link the state exit/method entry and
method exit/state entry nodes.
Test case derivation: We now have a graph model of the entire
class. We can identify all intra-class control paths. We can identify
all intra-class du paths for instance variables. We can apply all the
preceding test case techniques at the class level.
b. The C* metric: We know that V(G) or C of a graph, G, is given
by e – n + 2. Similarly, for the FREE flow graph the class
complexity is represented by C* or V*(G). It is the minimum
number of intra-class control paths.
∴ C* = E – N + 2
where E = e_m + 2m
      N = n_m + n_s
and   e_m = total edges in all inserted subgraphs
      m   = number of inserted subgraphs
      n_m = total nodes in all inserted subgraphs
      n_s = nodes in the state graph (states)

Thus, we have
      C* = e_m + 2m – n_m – n_s + 2
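
For example, suppose (purely for illustration) that a class state model has
n_s = 3 states and that m = 4 method flow graphs are inserted, contributing
e_m = 20 edges and n_m = 16 nodes in total. Then C* = 20 + 2(4) – 16 – 3 + 2
= 11, i.e., at least 11 intra-class control paths must be exercised.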

Limitations of Implementation-Based Testing/White-Box Testing of Classes
It provides many useful approaches to develop and evaluate unit test
cases. The limitations are as follows:
1. Exhaustive path testing is impossible.
2. Implementation-based testing requires code analysis skills and
knowledge of test case design techniques.
3. Without automated support, it is hard to scale up.
4. Available testing tools are useful but not ideal.
5. Coverage metrics are guidelines not absolutes.
6. Even a high coverage test suite cannot prove the absence of faults,
validate requirements, or test missing functions.
7. Each code-based coverage approach has “blind spots” and is
sensitive to the particular test values selected.
8. Code-based testing is tautological.

8.2.2. RESPONSIBILITY-BASED CLASS TESTING/BLACK-BOX/FUNCTIONAL SPECIFICATION-BASED TESTING OF CLASSES
Responsibility-based class testing is the process of devising tests to
see that a class correctly fulfills its responsibilities. These tests are
derived from specifications and requirements. They are not derived
from implementation details. This is also called black-box, functional,
or specification-based testing.
We categorize this type of testing as follows:

FIGURE 8.24

We need to consider three main facets of a class and its methods to


develop responsibility test cases. They are:
i. Functional Analysis
What kind of function is used to transform method inputs
into outputs?
Are inputs externally or internally determined?
ii. Domain Analysis
What are valid and invalid message inputs, states, and
message outputs?
What values should we select for test cases?
iii. Behavior Analysis
Does the sequence of method activation matter or not?
When sequence matters, how can we select efficient
testing sequence?
We will discuss each of these one by one.

Overview of the Approach


Responsibility-based testing has four main steps:

Step 1. Select a functional model for each method: numerical,


combinational, or compound.
a. Model numerical functions with an equation.
b. Model combinatorial functions with a decision table.
c. Develop heuristic tests for compound functions.
Step 2. Identify test case data for each method test case using domain
analysis.
Step 3. Select an activation model for the method and the class.
Develop a sequential activation plan.
Step 4. Interleave functional test cases with the activation plan.
Function types. A method computes a result. In this sense, it is
like a mathematical function. For testing purposes, it is useful to
consider three kinds of functions. They are as follows:
Numerical: A numerical function is specified or can be modeled
by a mathematical formula.
Combinational: A combinational function is specified or can be
modeled by a decision table, Boolean equations, or equivalent
conditions, rules or logic.
General: A general function has responsibilities that are not easily
modeled as numerical or combinational functions.

A Testable Function
Must be independently invocable and observable.
Is typically a single responsibility and the collaborations necessary
to carry it out.
Often corresponds to a specific user command or menu action.
Testable functions should be organized into a hierarchy
Follow the existing design.
Develop a test function hierarchy.
Testable functions should be small, “atomic” units of work.
A method is a typically testable function for a class-level test. A
use case is a typically testable function for a cluster-level or system
test.
For numerical functions, how many test cases do we need?
At least two.
If there are n variables, we need (n + 1) test cases.
What values should we use?
Each value in the test suite must be unique.
Do not use 1 (1.0) or 0 (0.0) for input values.
Select values that are expected to produce a non-zero result.
If a variable is returned by a message, the message must be forced
to provide a non-zero result.
This test will reveal all errors in which one or more coefficients is
incorrect.
Test cases are defined in two steps:
Step 1. Prepare a matrix with one more column than there are
variables. This extra column is filled with “1,” which is used
to check the determinant but not as part of the test case.

Step 2. Find values for the variable matrix such that the determinant
of the entire matrix is not zero. This requires that
No row or column consists entirely of zeros.
No row or column is identical.
No row or column is an exact multiple of any other.
So, for F = (10 * x) – (x + y) ^ 3, we have

To select test values for variables


Use prime numbers for integer variables.
For rational-valued variables, select values which meet the above
matrix constraints by inspection. If the matrix is too large to check
by inspection, then compute the determinant (see the sketch after this list).
If the variables are bounded (e.g., 100 < x < 1000) then develop
additional tests for boundary values.
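
A small sketch of the determinant check for a two-variable function such as
F above, assuming the three candidate test points are arranged as rows
(x, y, 1):

// Rows are candidate test cases (x, y, 1).  The set is usable only if the
// 3 x 3 determinant is non-zero (no zero, identical, or multiple rows/columns).
double det3(const double m[3][3]) {
    return m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
         - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
         + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]);
}

bool acceptableTestSet(const double m[3][3]) {
    return det3(m) != 0.0;
}

For instance, the prime-valued points (2, 3), (5, 7), and (11, 13) give a
determinant of –6, which is non-zero, so they form a usable test set.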
Alternatively, we can select output values and solve for the test points.
We must:
Test cases for inequalities in n variables should include at least
(n + 2) tests. The function should be exercised with at least one
on-point and one off-point.
Test cases for non-linear functions should have points falling on
either side of the points of inflexion.
Boolean functions (i.e., functions over variables with a {0/1} or
{true/false} domain) should be validated by combinational analysis.
The expected results must be established by an oracle or a
trustworthy source of correct values.
Inspections and acceptance testing are the only way to detect
design errors.

What About the Combinational Functions?


A combinational function is one where:
a. Output is determined by current input.
b. Previous inputs do not change the output.
c. This is like a simple spin lock which will open when the correct
number is dialed in regardless of the dialing order.
We model combinational functions with a decision table. A
decision table has conditions and actions. When all of the individual
conditions on one line are true, the corresponding action is taken.
There can be many conditions for each action. A condition must
resolve to a Boolean variable.
Cells irrelevant for a rule are labelled DC, for “don’t care.” Rules
can be given in either rows or columns.
For example, we have already drawn a decision table for some
problems in previous chapters. Now, let us see how to find
combinational faults.
Although combinational functions should ignore prior inputs and
give the same results regardless of input sequence, try varying the
order of the input variables. The evaluation of the conditions is
typically hard-coded in a case or switch construct which may have
assumed a particular order of input.
A decision table with n conditions can have 2^n actions, however ....
It is not unusual to see fewer than 2^n actions.
This often results from “don’t care” (DC) or “impossible”
conditions.
DCs should be implemented by default processing by which they
are treated as an error or ignored without causing a failure.
As a practical matter, DCs often result in failures.
So, we must not ignore DCs. We must expand the DC conditions
to the full 2^n set. We test each of these expanded conditions in the same
way we would test regular conditions.
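
A minimal sketch of a decision table held as data, with every "don't care"
cell expanded into an explicit rule; the rule contents are hypothetical.

#include <string>
#include <vector>

// One rule: a full assignment of truth values to the n conditions,
// plus the action taken when that combination holds.
struct Rule {
    std::vector<bool> conditions;
    std::string       action;
};

// With n conditions there are at most 2^n combinations.  Any combination
// left unspecified falls through to default processing instead of failing.
std::string evaluate(const std::vector<Rule>& table,
                     const std::vector<bool>& inputs) {
    for (const Rule& r : table)
        if (r.conditions == inputs)
            return r.action;
    return "default/error";
}

Because the action depends only on the current truth values, presenting the
same inputs in a different order must yield the same action; a discrepancy
points at an order-dependent case or switch construct.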

(II) Domain Analysis

Test case data is identified by domain analysis. For each testable


function, we must:
a. Identify all parameters, arguments, variables, database conditions,
set points, toggles, etc.
b. Identify equivalence classes or partitions for each variable of
interest.
c. Identify boundary values for the partition.
Domain analysis can reveal design faults. It is good at revealing
missing logic faults—holes in the specification or the implementation.

Domains–Input v/s Output


A domain is a set defined over a method’s arguments or instance
variables. A typical input domain is astronomically large. For a
method that accepts a 10 character (8-bit) input string, there are
2^8 × 2^8 × 2^8 × 2^8 × 2^8 × 2^8 × 2^8 × 2^8 × 2^8 × 2^8 = 2^80
possible input combinations.
Even if you were very fast and could run and check 1000 tests per
second and work 24 hours a day, non-stop, you would need far longer
than the current estimated age of the universe to complete the tests.
Remember that exhaustive testing is impossible.
Because we can’t test all values, which ones should we test? The
answer may be an equivalence class test. But before discussing these
techniques we must know the differences between the input and output
domain.
An input domain and domain values are relatively easy to define. So,
we have
a. Any input domain is the entire range of valid values for all external
and internal inputs to the method under test.
b. Private instance variables should be treated as input variables.
c. The domain must be modeled as a combinational function if there
are dependencies among several input variables.
An output domain may be more difficult to identify. So, we have
a. The output domain of an arithmetic function is the entire range of
values that can be computed by the function.
b. Output domains can be discontinuous–they may have “holes” or
“cliffs.”
c. The output domain of a combination function is the entire range of
values for each action stub.
Next, we discuss a technique under domain analysis which is a
type of responsibility-based/ black-box testing method. This technique
is popularly known as equivalence class partitioning.
An equivalence class is a group of input values which require the
same response. A test case is written for members of each group and
for non-members. The test case designates the expected response.

NOTE

When speaking of an equivalence class, class is used in the


mathematical sense and not as a source code construct.

Equivalence class testing is based on the assumption that if a


method fails for one member of an equivalence class, it is likely to fail
for all members of that class. This reduces the size of the input space
to manageable proportions.

How are Equivalence Classes Identified?


We have already discussed this issue in previous chapters. System
requirements are the source of domains, equivalence classes, and
boundary values. There is no algorithm for exactly determining
“good” and “bad” partitions. Any available information can be used.
As quoted by Hamlet—
“The goal is to make the resulting classes so narrow that each aspect
of the program, of the specification, of development, each programmer
concern etc. is separated into a unique class.”
The steps are:
1. Define valid and invalid partitions for each input variable.
2. Further subdivide the valid partition.
At least one separately executable test case is written for each
equivalence class. The second technique is known as boundary value
analysis (BVA). As with equivalence classes, there is no general
algorithm for exactly determining boundary values. The domain
boundary depends on what the variable represents and its constraints.
For a range of (n, m), the boundaries may lie
Inside upper and lower edges, n, n + 1, m – 1, m.
Outside upper and lower edges, n – 1, m + 1.
For counters, boundaries are often related to capacity limits. Say,
we can store from 1 to 255 objects.
We would have boundary values at 0, 1, 255, and 256 (a test list for
this counter is sketched after these examples).
For ordered collections, boundaries occur at the start, end, etc.,
like first, last, empty, full, top, bottom, etc.
For output variables, boundaries can suggest some tests.
Try tests that produce exactly the minimum and maximum output
values.
Try tests that exceed the minimum and maximum output values by
the smallest increment permitted by the data type.
It is the type of function that suggests boundary values.
For a numerical function with floating-point arguments
Try test cases that should cause overflow and underflow.
Try tests to check the finest resolution. For example, with floating-
point operations, [(1.0/10.0) * 10.0] may not be exactly equal to 1.
For combinational functions, we identify all combinations of on-
points. A test case may be prepared for each on-point set.
Another technique is state space analysis. The number of corners
in a state space depends upon the state variable constraints. A state
space boundary may be defined by:
Complex but well-formed mathematical relations (sphere, torus,
polyhedra, etc.).
Arbitrarily complex and irregular constraints.
Research indicates that a test set consisting of (n + 2) points,
with at least one off-point and one on-point, is needed to probe a
non-linear boundary.
To probe a state space, we adapt the nested loop approach to select
points. With two state variables this produces 15 test points, 22 for
three, and 29 with four.

(III) Behavior Analysis


Behavior is a specified or observed sequence of accepted messages
and responses. There are three behavior patterns of interest for testing.
a. Independent: The method output is totally independent of
previous activation. We can use domain, numerical, or
combinational techniques.
b. Sequential: The method output depends on specific prior
sequences of activations and can be modeled by a finite state
machine.
c. Recursive: Method output depends on previous activations but
there are no constraints on order of activation.
It is of three types:
a. Sequential behavior testing
b. Conditional and sneak path testing
c. Recursive behavior testing
Now, we shall discuss these techniques one by one.
a. Sequential behavior testing: To develop test cases for sequential
behavior:
Develop a state model.
Messages are modeled as transition events.
Outbound messages are modeled as transition actions.
Generate the state path tree.
Tabulate events and actions along each path to form test cases.
Develop test data for the path by domain analysis on events
and actions.
Consider a simple state model of a STACK.
What kinds of sequential behavior faults can occur?
The faults that can occur are:
a. Missing or incorrect transition.
b. Missing or incorrect event.
c. Missing or incorrect action.
d. Extra, missing, or corrupt state.
We have already discussed this type of testing in Section 8.1 of
this chapter. In the table below we list various state control faults for
the 2 player game already discussed.

FIGURE 8.25
b. Conditional transition tests: With conditional transitions, we
need to be sure that all conditional transition outcomes have been
exercised.
During conformance testing, we want a conditional transition
to fire when we send a message which meets its condition.
For each conditional transition, we need to develop a truth
table for the variables in the condition.
An additional test case is developed for each entry in the truth
table which is not exercised by the conformance tests.
This test is added to the transition tree.

Event Path Sensitization


Values (or states) necessary for conditional expression must
be determined by analysis.
We can use the same approach taken to identify path
conditions for source code.
But this testing is not complete. We also need to do sneak path
testing.

What Is a Sneak Path?


An illegal transition is present when the class-under-test
(CUT), in some valid state, accepts a message which is not
explicitly specified for that state.
An illegal message is an otherwise valid message which
should not be accepted given the current state of the class; if
accepted, an illegal transition results.
A sneak path is the bug in the CUT which allows an illegal
transition.
We test for sneak paths by sending illegal messages. For each
specified state, a sneak path is possible for each message not accepted
in that state.
The expected response to a sneak path is usually:
The message should be rejected in an appropriate manner.
The state of the object should be unchanged after rejecting the
illegal event.
Strict black-box state-based testing approaches assume that only
externally visible I/O sequences are available to the tester. The tester
must determine which state is actually obtained by applying a
distinguishing sequence and observing the resulting output. This
increases the number of tests by a large amount.
With object-oriented software, we assume an internal state can be
determined by
a. A method activation trace (activation is equated with state).
b. State reporting capability in the CUT.
c. Built-in reporting in the CUT.
This is a gray-box assumption.
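A sketch of one sneak path test under this gray-box assumption; the class,
its capacity, and the rejection convention are hypothetical.

#include <cassert>

// Minimal class under test: pop is specified only for the non-empty state;
// in the empty state it must be rejected without changing the object.
class SafeStack {
    int items[10];
    int count = 0;
public:
    bool empty() const { return count == 0; }
    void push(int x)   { if (count < 10) items[count++] = x; }
    bool pop()         { if (count == 0) return false;   // illegal message rejected
                         --count; return true; }
};

// Send the illegal message and verify (a) it is rejected in an appropriate
// manner and (b) the state of the object is unchanged afterwards.
void testPopInEmptyState() {
    SafeStack s;            // object starts in the Empty state
    assert(s.empty());
    assert(!s.pop());       // the illegal event must be rejected
    assert(s.empty());      // the state must be unchanged
}
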
And last but not the least, another testing technique is recursive
behavior testing. We will discuss it following.
c. Recursive behavior testing: Some classes are purposely designed
to respond to all methods in any state. This is often true of
“framework” or “foundation” classes. Classes that represent the
application domain typically have sequential constraints.
The recursive behavior test strategy uses paired sequences of
methods on instances of the same class [Doong]. In this type of testing

a. Each sequence is picked to yield the same results.
b. Each member of the pair is executed.
c. The results are compared.
d. If the results are not equivalent, the test has succeeded in
revealing a fault.
For example, in the table below, test cases 1 and 2 run a pair of
sequences on stack objects A and B, which should yield the same
result in all four sequences, that is, 5 on the top. So, we have

Test    Test sequence
1.      Stack A: Create, push(5), push(6), pop
        Stack B: Create, push(5)
2.      Stack A: Create, push(5), push(6), pop, top
        Stack B: Create, push(5), top
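
Translated into code, test case 2 above might read as follows; std::stack is
used here only as a stand-in for the class under test.

#include <cassert>
#include <stack>

// Recursive behavior test: the two sequences should be equivalent,
// i.e., both stacks must report the same top value (5).
void testPairedSequences() {
    std::stack<int> a, b;
    a.push(5); a.push(6); a.pop();   // Stack A: create, push(5), push(6), pop
    b.push(5);                       // Stack B: create, push(5)
    assert(a.top() == b.top());      // both tops should be 5
    assert(a.size() == b.size());    // and both should hold one element
}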

8.3. HEURISTICS FOR CLASS TESTING


An abstract class is typically not directly instantiated. Instead,
subclasses specialize or provide an implementation. A strategy for
testing C++ abstract classes is given below:
1. Develop specification-based tests from abstract class specification.
2. Test all member functions that don’t call pure virtual methods.
3. Test all member functions that do call pure virtual methods. This
may require stubs for the derived class implementation.
4. Do state/sequential tests of new and inherited methods.
Like abstract classes, the templates cannot be directly instantiated
or tested. Once instantiated, all the other basic testing techniques apply.
The strategy for testing template classes is given below:
1. Prepare test cases, and test at least one instantiation.
A single test suite can be reused to test each type with the
necessary changes per type and additional special tests per
type.
Does any possible type suggest special or different test cases?
Do any pre- or post-conditions change due to any possible
type?
2. Select a type that exercises all class features. If you can’t then this
is probably a poorly designed template.
3. Decide on how these additional types may be tested. Testing one
type may be sufficient if
Only =, ==, and != operators are used.
There are no parameters of T* or T& in the public interface of
the template class.
This means no polymorphic bindings can be made.
If either of the above conditions is not present, then each possible
type should be instantiated and tested.
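A sketch of Step 1 for a hypothetical template container: one instantiation
is tested, and the same suite is reused for a second type only when the
conditions above require it.

#include <cassert>

// Hypothetical template class under test.
template <typename T>
class Box {
    T value{};
public:
    void set(const T& v) { value = v; }
    bool operator==(const Box& other) const { return value == other.value; }
};

// Step 1: prepare test cases and test at least one instantiation (int here).
// Only the == style of operator is used and no T* or T& parameters appear,
// so a single instantiated type may be sufficient.
void testBoxInt() {
    Box<int> a, b;
    a.set(7); b.set(7);
    assert(a == b);
    b.set(8);
    assert(!(a == b));
}
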
A collection class houses a group of individual objects of the same
type, for example, a list, stack, queue, array, table, etc. The
responsibility of this class is to manage the collection according to
some scheme. The test strategy must be based on this scheme.
There are several patterns useful for testing collections. They are
discussed below.

Sequential Collection Test Pattern


A sequential collection contains a variable number of items. Items are
added to the “end,” e.g., strings, buffers, ASCII files, etc. You must
“walk” the entire collection in sequence to access a particular element.
For a sequential collection like a stack or queue, we have the test suite
given below.
Ordered Collection Test Pattern
In an ordered collection, elements are added, accessed, and removed
using some ordering scheme—stack, queue, tree, list, etc. All of the
sequential test patterns can be applied with the necessary changes in
expected results. The following test patterns verify position-dependent
behavior.
Pairwise Operand Test Pattern
Operations may be defined that use collections as operands. Suppose
there are two collections, A and B. Then,
A and B should be individually verified first.
A and B are populated before running the test.
The following test patterns verify size-dependent operation
processing and not the semantics of the operation—

where N > 1 and N ≠ M.

What About Arrays and Pointers?


Try the following as inputs to the pointer calculation
A negative, zero, or very large value
The lower-bound value –1
The lower-bound value
A normal value
The upper-bound
The upper-bound +1
Try formula verification tests with array elements initialized to
unusual data patterns.
All elements zero or null
All elements one
All elements same value
All elements maximum value, all bits on, etc.
All elements except one are zero, one, or max
We treat each sub-structure with a special role (header, pointer
vector, etc.) as a separate data structure.
The pair-wise operand test pattern may also be applied to operators
with array operands.
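A small C++ sketch of these index-boundary probes is given below. The bounds-checked access function getElement is hypothetical, introduced only to show where each boundary value is applied:

#include <cassert>
#include <cstddef>

// Hypothetical function under test: bounds-checked array access.
bool getElement(const int* a, std::size_t len, long idx, int* out) {
    if (idx < 0 || static_cast<std::size_t>(idx) >= len) return false;
    *out = a[idx];
    return true;
}

int main() {
    int a[5] = {0, 0, 0, 0, 0};   // all elements zero: one "unusual" data pattern
    int v = 0;

    assert(!getElement(a, 5, -1, &v));       // lower-bound value - 1
    assert( getElement(a, 5,  0, &v));       // lower-bound value
    assert( getElement(a, 5,  2, &v));       // a normal value
    assert( getElement(a, 5,  4, &v));       // upper-bound value
    assert(!getElement(a, 5,  5, &v));       // upper-bound value + 1
    assert(!getElement(a, 5,  100000, &v));  // a very large value
    return 0;
}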
Relationship test patterns: Collection classes may implement
relationships (mapping) between two or more classes. We have
various ways of showing relationships. For example

Entity-relationship model:

FIGURE 8.26

Coad-Yourdon diagram for the above case is

FIGURE 8.27

While its Booch class diagram is


FIGURE 8.28

We can now model valid instances of relations with combinational analysis.

Relationships can be tested by considering likely scenarios:


Person buys dog
Dog runs away
Person sells dog
Dog dies
We can use the relationship’s cardinality parameters to
systematically identify test cases that verify correct implementation of
relationships.

Keyed Collection Test Patterns


Some collections (e.g., Dictionary in Smalltalk) provide a unique
identifier for each member of a collection.
Application classes that provide file or database wrappers may also
rely on keyed access. The basic test patterns for keyed collections
follow.
Keyed collections must provide a wide range of paired operations.
Item-to-item processing is often faulty.
Collection classes are well-suited to sequential constraint or
recursive equivalence test patterns on operations.

Exception Testing
Exception-handling constructs (like Ada exceptions or C++ try/throw/catch)
add implicit paths, and it is harder to write test cases that activate these
paths. However, exception handling is often crucial for reliable operation
and should be tested. Test cases are needed to force exceptions such as:
File errors (empty, overflow, missing)
I/O errors (device not ready, parity check)
Arithmetic over/under flows
Memory allocation
Task communication/creation
How can this testing be done?
1. Use patches and breakpoints: Zap the code or data to fake an
error.
2. Use selective compilation: Insert exception-forcing code (e.g.,
divide-by-zero) under control of conditional assembly, macro
definition, etc.
3. Mistune: Cut down the available storage, disk space, etc. to 10%
of normal, or, for example, saturate the system with compute-bound
tasks. This can force resource-related exceptions.
4. Cripple: Remove, rename, disable, delete, or unplug necessary
resources.
5. Pollute: Selectively corrupt input data, files, or signals using a data
zap tool.
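As a minimal illustration of forcing an exception path in C++, the sketch below uses a hypothetical function that throws on an arithmetic error; the test case deliberately supplies the forcing input and checks that the exception is raised and handled:

#include <cassert>
#include <stdexcept>

// Hypothetical function under test: throws on a forced error condition.
double ratio(double num, double den) {
    if (den == 0.0) throw std::overflow_error("division by zero");
    return num / den;
}

int main() {
    bool caught = false;
    try {
        ratio(1.0, 0.0);                     // forcing input: arithmetic error
    } catch (const std::overflow_error&) {
        caught = true;                       // the implicit exception path was taken
    }
    assert(caught);
    return 0;
}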

Suspicion Testing
There are many situations that indicate additional testing may be
valuable [Hamlet]. Some of those situations are given below:
1. A module written by the least experienced programmer.
2. A module with a high failure rate in either the field or in
development.
3. A module that failed an inspection and needed big changes at the
last minute.
4. A module that was subject to a late or large change order after most
of the coding was done.
5. A module about which a designer or programmer feels uneasy.
These situations don’t point to specific faults. They may mean that
more extensive testing is warranted. For example, if n tests are needed
for branch coverage, use 5n tests instead.

Error Guessing
Experience, hunches, or educated guesses can suggest good test cases.
There is no systematic procedure for guessing errors. According to
Beizer, “logic errors and fuzzy thinking are inversely proportional to
the probability of a path’s execution.”
For example, for a program that sorts a list, we could try:
An empty input list.
An input list with only one item.
A list where all entries have the same value.
A list that is already sorted.
We can try for weird paths:
Try to find the most tortuous, longest, strangest path from entry to
exit.
Try “impossible” paths, and so on.
The idea is to find special cases that may have been overlooked by
more systematic techniques.

Historical Analysis
Metrics from past projects or previous releases may suggest possible
trouble spots. We have already noted that the cyclomatic complexity (C)
metric was a good predictor of faults. Coupling and cohesion are also good fault
predictors. Modules with high coupling and low cohesion are 7 times
more likely to have defects compared to modules with low coupling
and high cohesion.

Class Testing Summary


The main steps to test class methods are given below:

Step 1. Instantiate a test object of the class-under-test (CUT).


Step 2. Set the object-under-test (OUT) and supplier objects to the
state specified by the test case.
Step 3. Set the parameters of the method-under-test (MUT) to the
values specified by the test case.
Step 4. Send the test message to the MUT.
Step 5. Check for compliance with the expected results of the test
case:
a. The returned values of the MUT.
b. The resultant state of the OUT.
c. The state of the message parameters.
d. Exceptions (if any) thrown or raised.
e. Messages sent to other objects (or stubs) are correct in sequence and
content.
A minimal sketch of these steps is given below.
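The sketch assumes a hypothetical Account class (not from the text); deposit is the method-under-test (MUT):

#include <cassert>

// Hypothetical class-under-test (CUT).
class Account {
public:
    explicit Account(int opening) : balance_(opening) {}
    int deposit(int amount) { balance_ += amount; return balance_; }
    int balance() const     { return balance_; }
private:
    int balance_;
};

int main() {
    Account out(100);                    // Steps 1-2: instantiate the OUT in the required state
    int amount = 50;                     // Step 3: set the MUT parameters
    int returned = out.deposit(amount);  // Step 4: send the test message to the MUT
    assert(returned == 150);             // Step 5a: check the returned value
    assert(out.balance() == 150);        // Step 5b: check the resultant state of the OUT
    return 0;
}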

Six Principles of Effective Testing


Brian Marick offers six principles of effective testing. They are as
follows:
Principle 1: Most errors are not very creative. Because errors tend to
follow patterns, methodical checklist testing has a high
payoff.
Principle 2: Faults of omission activated by unanticipated special
cases are the most difficult to find and should therefore
get greater attention.
Principle 3: Specification faults, especially omissions, are more
dangerous and harder to find than code faults.
Principle 4: At every stage of testing, mistakes are inevitable. Later
stages should compensate for this.
Principle 5: Code coverage is a good approximation of test quality.
Because it is an approximation, coverage should be
viewed as a useful indicator, not a guarantee.
Principle 6: Good tests emphasize variety and complexity. So,

Use the same test case patterns in different ways and in different
contexts.
Try to combine test cases in random, unusual, strange, or weird
ways. This increases the chance of revealing a sneak path, a
surprise, or an omission.

8.4. LEVELS OF OBJECT-ORIENTED TESTING


Three or four levels of object-oriented testing are used depending on
the choice of what constitutes a unit. If individual operations or
methods are considered to be units, we have four levels:
operation/method, class, integration, and system testing.
1. With this choice, operation/method testing is identical to unit
testing of procedural software.
2. Class and integration testing can be renamed as intraclass and
interclass testing. The second level, then, consists of testing
interactions among previously tested operations/methods.
3. Integration testing is the major issue of object-oriented testing and
must be concerned with testing interactions among previously
tested classes.
4. System testing is conducted at the port event level and is identical
to system testing of traditional software. The only difference is
where system level test cases originate.

8.5. UNIT TESTING A CLASS


Classes are the building blocks for an entire object-oriented system.
Just as the building blocks of a procedure-oriented system have to be
unit tested individually before being put together, so the classes have
to be unit tested. This is due to several reasons. Some of them are
given below:
1. Reusability of the code is the focal point of the object-oriented
system. A class is also reused many times. A residual defect in a
class can, therefore, potentially affect every instance of reuse.
2. Many defects get introduced at the time a class gets defined. If
these defects are not caught in time then they may go into the
clients of these classes. Thus, the fix for the defect would have to
be reflected in multiple places giving rise to inconsistencies.
3. A class may have different features. Different clients of the class
may pick up different pieces of the class. No one single client may
use all the pieces of the class. Thus, unless the class is tested as a
unit first, there may be pieces of a class that may never get tested.
4. A class encapsulates data and functions. If the data and functions
do not work in synchronization at a unit test level, it may cause
defects that are potentially very difficult to narrow down later.
5. Because object-oriented systems support inheritance, unless the building
blocks are thoroughly tested stand-alone, defects arising out of this
context may surface magnified many times later in the cycle.
The question now is whether we can apply conventional methods
to test our classes. Yes, some of the methods for unit testing apply
directly to testing classes. For example:
1. Every class has certain variables. The techniques of BVA and
equivalence class partitioning discussed earlier in black-box testing
can be applied to make sure that the most effective test data is used
to find as many defects as possible.
2. Not all member-functions in C++ or methods in JAVA are
exercised by all the clients. The methods of function coverage that
were discussed in white-box testing can be used to ensure that
every method (function) is exercised.
3. Every class will have functions/methods that have some procedural
logic. So, the techniques of condition coverage, branch coverage,
code complexity, and so on that we discussed during white-box
testing can be used to make sure that branch coverage is complete
and to increase the software maintainability.
4. Because a class is meant to be instantiated multiple times by
different clients, the various techniques of stress testing and system
and acceptance testing can be performed for early detection of
stress-related problems such as memory leaks.
Also note that we have already discussed state-based testing in
Section 8.2.2 of this chapter. This technique is useful for black-box
testing of classes. Because a class is a combination of data and
methods that operate on the data, in some cases, it can be visualized as
an object going through different states. The messages that are passed
to the class act as inputs to trigger the state transition. It is useful to
capture this view of a class during the design phase so that testing can
be more natural.

How to Test Classes?


In order to test an instantiated object, messages have to be passed to
various methods. In what sequence does one pass the messages to the
objects? One of the methods that is effective for this purpose is the
alpha-omega method.

Principles of the Alpha-Omega Method


The alpha-omega method works on the following principles:
1. Test the object through its life cycle from birth to death, that is,
from instantiation to destruction. An instance gets instantiated by a
constructor method; then the variables are initialized. These values
may get modified also. Finally, destructors are used to destroy
objects.
2. Test simple methods first and then more complex methods. This is
because object-oriented programming languages support
inheritance and thus more complex methods will be built upon the
simpler methods.
3. Test the methods from private through public methods. The private
access specifier reduces the dependencies in testing and gets the
building blocks in a more robust state before they are used by
clients.
4. Send a message to every method at least once. This ensures that
every method is tested at least once.
The following steps are followed during the alpha-omega method:
Step 1. Test the constructor methods first. When multiple
constructors are used, all should be tested individually.
Step 2. Test the get or accessor methods (methods that
retrieve the values of variables in an object for use by the
calling programs). This ensures that the variables in the class
definition are accessible by the appropriate methods.
Step 3. Test the methods that modify the object variables. There are
methods that test the contents of variables, methods that
set/update the contents of variables, and methods that loop
through various variables.
Step 4. Finally, the object has to be destroyed and when the object is
destroyed, no further accidental access should be possible.
Also, all the resources used by the object instantiation should
be released. These tests conclude the lifetime of an
instantiated object.
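A minimal C++ sketch of this ordering, for a hypothetical Counter class (an assumed example, not from the text), is shown below; the constructor is exercised first, then the accessor, then the modifiers, and finally destruction at the end of the scope:

#include <cassert>

// Hypothetical class exercised in alpha-omega order.
class Counter {
public:
    Counter() : value_(0) {}                  // constructor
    ~Counter() {}                             // destructor
    int  get() const { return value_; }       // accessor (get) method
    void set(int v)  { value_ = v; }          // modifier
    void increment() { ++value_; }            // modifier
private:
    int value_;
};

int main() {
    {
        Counter c;                 // Step 1: constructor tested first
        assert(c.get() == 0);      // Step 2: accessor methods
        c.set(5);                  // Step 3: methods that modify the variables
        c.increment();
        assert(c.get() == 6);
    }                              // Step 4: object destroyed; resources released
    return 0;
}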

Which Unit to Select—Method or Class?


Let us consider this one by one as two separate cases.
Case 1: Methods as units: Superficially, this choice reduces
object-oriented unit testing to traditional (procedural) unit testing. A
method is nearly equivalent to a procedure so all the traditional
functional and structural testing techniques should apply. Unit testing
of procedural code requires stubs and a driver test program to supply
test cases and record the results. Similarly, if we consider methods as
O-O units, we must provide stub classes that can be instantiated and a
“main program” class that acts as a driver to provide and analyze test
cases. Also, it is found that nearly as much effort goes into creating
the proper stubs as into identifying test cases. Another important
consequence is that much of the burden is shifted to integration
testing. In fact, we can identify two levels of integration testing:
intraclass and interclass integration.
Case 2: Classes as units: Testing a class as a unit solves the
intraclass integration problem but it creates other problems. One has to
do with various views of a class. In the static view, a class exists as
source code. This is fine if all we do is code reading. The problem
with the static view is that inheritance is ignored but we can fix this by
using fully flattened classes. We might call the second view, the
compile-time view, because this is when the inheritance actually
“occurs.” The third view is the execution-time view, when objects of
classes are instantiated.
Testing really occurs with the third view but we still have some
problems. For example, we cannot test abstract classes because they
cannot be instantiated. Also, if we are using fully flattened classes, we
will need to “unflatten” them to their original form when our unit
testing is complete. If we do not use fully flattened classes, in order to
compile a class, we will need all of the other classes above it in the
inheritance tree. One can imagine the software configuration
management (SCM) implications of this requirement.
The class as a unit makes the most sense when little inheritance
occurs and classes have what we might call internal control
complexity. The class itself should have an “interesting” state-chart
and there should be a fair amount of internal messaging.

8.6. INTEGRATION TESTING OF CLASSES


Of the three main levels of software testing, integration testing is the
least understood. This is true for both traditional and object-oriented
software. As with traditional procedural software, object-oriented
integration testing presumes complete unit-level testing.
Both unit choices have implications for object-oriented integration
testing. If the operation/ method choice is taken, two levels of
integration are required:
1. One to integrate operations into a full class.
2. One to integrate the class with other classes.
This should not be dismissed. The whole reason for the operation-
as-unit choice is that the classes are very large and several designers
were involved.
Assuming that a class is the basic unit choice, once the unit testing
is complete two steps must occur:
1. If flattened classes were used, the original class hierarchy must be
restored.
2. If test methods were added, they must be removed.
Once we have our “integration test bed” we need to identify what
needs to be tested. As with our traditional software integration, static
and dynamic choices can be made. We can address the complexities
introduced by polymorphism in a purely static way: test messages with
respect to each polymorphic context. The dynamic view of object-
oriented integration testing is more interesting.
In addition to addressing encapsulation and polymorphism in
object-oriented testing, another question that arises is what order do
we put the classes together for testing? The various methods of
integration like top-down, bottom up, big bang, and so on can all be
applicable here. Please note the following points about object-oriented
systems.
1. Object-oriented systems are inherently meant to be built out of
small, reusable components. Hence, integration testing will be even
more critical for object-oriented systems.
2. There is typically more parallelism in the development of the
underlying components of object-oriented systems, thus the need
for frequent integration is higher.
3. Given the parallelism in development, the sequence of availability
of the classes will have to be taken into consideration while
performing integration testing. This would also require the design
of stubs and harnesses to simulate the function of yet-unavailable
classes.
There are four basic strategies for integration:
1. Client/Supplier: The structure of client/supplier classes can be used
to guide integration.
2. Thread based: The thread of messages activated to service a single
user input or external event is studied to decide the order of
integration.
3. Configuration based: The components allocated to each unique target
environment configuration (e.g., a particular node or processor) are
integrated and stabilized configuration by configuration.
4. Hybrid strategy: A mix of top-down, bottom-up, or big-bang
integration can be used.
Before we discuss these methods, we must define some
terminology used during integration testing. It is useful to distinguish
several kinds of objects—
a. Actor: An object that changes state or uses other objects without
receiving a message from another application object. An actor is
never a recipient of a message from another object.
b. Agent: An object that accepts and sends object messages. An agent
typically causes other objects to be created and used.
c. Server: An object that accepts messages from other objects but
does not send messages.
We will now discuss these techniques one by one.

1. Client/Supplier Integration
This integration is done by “users.” The following steps are followed:
Step 1. We first integrate all servers, that is, those objects that do
not send messages to other application objects.
Step 2. Next we integrate agents, i.e., those objects that send and
receive messages. This first integration build consists of
the immediate clients of the application servers.
Step 3. There may be several agent builds.
Step 4. Finally, we integrate all actors, i.e., application objects
that send messages but do not receive them.
This technique is called client/supplier integration and is shown in
Figure 8.29.
FIGURE 8.29 Client/Supplier Integration.

2. Thread Integration
Thread integration is integration by end-to-end paths. A use case
contains at least one, possibly several threads.
Threaded integration is an incremental technique. Each processing
function is called a thread. A collection of related threads is often
called a build. Builds may serve as a basis for test management. The
addition of new threads for the product undergoing integration
proceeds incrementally in a planned fashion. System verification
diagrams are used for “threading” the requirements.

3. Configuration Integration
In systems where there are many unique target environment
configurations, it may be useful to try to build each configuration. For
example, in a distributed system, this could be all or part of the
application allocated to a particular node or processor. In this
situation, servers and actors are likely to encapsulate the physical
interface to other nodes, processors, channels, etc. It may be useful to
build actor simulators to drive the physical subsystem in a
controllable and repeatable manner.
Configuration integration has three main steps:
1. Identify the component for a physical configuration.
2. Use message-path or thread-based integration for this
subsystem.
3. Integrate the stabilized subsystems using a thread-based
approach.

4. Hybrid Strategy
A general hybrid strategy is shown in the following steps:
a. Do complete class testing on actors and servers. Perform limited
bottom-up integration.
b. Do top-down development and integration of the high-level control
modules. This provides a harness for subsequent integration.
c. Big-bang the minimum software infrastructure: OS configuration,
database initialization, etc.
d. Achieve a high coverage for infrastructure by functional and
structural testing.
e. Big-bang the infrastructure and high-level control.
f. Use several message path builds or thread builds to integrate the
application agents.

Integration Test Checklist


The completion of the following checklist will reveal many interface
faults and establish the basis for effective system testing.
1. Set up infrastructure:
1.1 Configure environment.
1.2 Create/Initialize files.
1.3 Connect I/O devices, establish minimal handshaking.
1.4 Compile, link, build, make, or assemble the entire system from
a single library under SCM control.
2. Verify executability:
2.1 Run all batch job streams to completion with minimal input.
2.2 Bring up all tasks in an online system. Navigate all menu paths
and perform normal shut down.
2.3 Bring up all tasks in an embedded system. Accept normal
signal inputs, produce all output at nominal values. Perform
normal shut down.
3. Verify minimal cooperative functionality:
3.1 Run through primary end-to-end threads: All event-response
paths.
3.2 Run through all cycles: on/off, polling loop, hourly, daily,
weekly, monthly, etc.

Integration testing should not try for extensive coverage as this is
the goal of unit and system test. Do good unit testing or make unit
tests an explicit and separate part of your integration strategy. Don’t
stress the system. Do just enough thread and cycle testing to reveal
interface and integration faults. Focus on stabilizing the system so that
you can do efficient system testing.

8.7. SYSTEM TESTING (WITH CASE STUDY)


According to Myers, if you do not have written objectives for your
product, or if your objectives are unmeasurable, then you cannot
perform a system test.
Object-oriented systems are by design meant to be built using
smaller reusable components (i.e., the classes). Due to this heavy
reusability of components, system testing becomes even more important
for object-oriented systems. This is due to the following reasons:
1. A class may have different parts not all of which are used at the
same time. When different clients start using a class, they may be
using different parts of a class and this may introduce defects at a
later (system testing) phase.
2. Different classes may be combined together by a client, and this
combination may lead to new defects being uncovered.
3. An instantiated object may not free all of its allocated resources,
thus causing memory leaks and such related problems, which will
show up only in the system testing phase.
Please note that proper entry and exit criteria should be set for the
various test phases before system testing so as to maximize the
effectiveness of system testing.
We describe four types of system testing.
1. Functional testing: A requirement is a capability, a feature, or
function that the system must provide. Every explicitly stated
requirement must be tested. We must, however, take care of the
following:
Traceability from requirements to test cases is necessary.
Each test case, at all levels, should indicate which requirement
(if any) it implements.
The test is simple: can you meet the requirement?
Well-written, testable line-item requirements are a necessity. We must
take care of the following:
If you don’t have line-item requirements, use the primitives in
your requirements model like PDLs.
If you don’t have line-item requirements, extract a list of
requirements from any and all available system documentation
like user manuals, contracts, marketing literature, etc.
2. Scenario-based system testing: The notion of the “average” user
or usage scenario can be misleading or too vague to construct
useful tests. We need to define specific scenarios to explore
dynamic behavior. For example:
Customer/user oriented definition.
Revalidates the product.
Focuses development on customer requirements.
Early customer involvement can build relationships.
Real-world orientation.
A scenario answers who, what, when, where, how, and why. Scenarios
have three dimensions—the users (the system, for embedded applications),
the tasks they perform, and the environment for different tasks and
users.

Who, how, and why?


Define the significant types, classes, and categories of users.

What and when?


Define from the user’s point of view the tasks that use the
system.
A user task must have specific time characteristics (when):
interval, cycle, duration, frequency, etc.
Where and how?
Define all the variations for context of tasks and users.

Users
The following questions can help to identify user categories:

Who?
Who are the users?
Can you find any dichotomies?
— Big company versus small
— Novice versus experienced
— Infrequent versus heavy user
Experience: Education, culture, language, training, work with
similar systems, etc.

Why?
What are their goals in performing the task—what do they
want?
What do they produce with the system?

How?
What other things are necessary to perform the task?
— Information, other systems, time, money, materials,
energy, etc.
What methods or procedures do they use?

Environment
The user/task environment (as well as the OS or computer) may span a
wide range of conditions.
Consider any system embedded in a vehicle. Anywhere the vehicle
can be taken is a possible environment.
What external factors are relevant to the user? To the system’s
ability to perform? For example, buildings, weather, electromagnetic
interference, etc.
What internal factors are relevant to the user? To the system’s
ability to perform? For example, platform resources like speed,
memory, ports, etc., AC power system loading, multitasking.
With scenario categories in hand, we can focus on specific test
cases. This is called an operational profile.
An activity is a specific discrete interaction with the system.
Ideally, an activity closely corresponds to an event-response pair. It
could be a subjective definition but must have a start/stop cycle. We
can refine each activity into a test by specifying:
Probability of occurrence.
Data values derived by partitioning.
Equivalence classes are scenario-oriented.
Scenarios are a powerful technique but have limitations and
require a concentrated effort. So, we have the following suggestions:
User/customer cooperation will probably be needed to identify
realistic scenarios.
Scenarios should be validated with user/customer or a focus
group.
Test development and evaluation requires people with a high level
of product expertise who are typically in short supply.
Generate a large number of test cases.
Well-defined housekeeping procedures and automated support are
needed if the scenarios will be used over a long period of time by
many people.
Next we consider a case study of ACME Widget Co.
We will illustrate the operational profile (or specific test cases)
with the ACME Widget Co. order system.

Users
There are 1000 users of the ACME Widget order system.
Their usage patterns differ according to how often they use the
system. Of the total group, 300 are experienced, and about 500
will use the system on a monthly or quarterly basis. The balance
will use the system less than once every six months.

Environment
Several locations have significantly different usage patterns.
Plant, office, customer site, and hand-held access.
Some locations are only visited by certain users. For example,
only experienced users go to customer sites.

Usage
The main user-activities are order entry, order inquiry, order
update, printing a shipping ticket, and producing periodic reports.
After studying the usage patterns, we find proportions vary by user
type and location.
For example, the infrequent user will never print a shipping ticket
but is likely to request periodic reports.
Some scenarios are shown in Table 8.5.

TABLE 8.5 Showing Scenario Probability (p).


There are two main parts in an operational profile: usage scenarios
and scenario probabilities or:
Operational Profile = Scenarios + Probabilities
The factor proportions are multiplied to get scenario probability
(p). That is,
p = p1 × p2 × p3 (from the above table)
For example, looking at the first factor—user type experienced in
column-1 of Table 8.5, we observe that
a. About 30% are experienced users.
b. The second factor (location) shows that 80% of the experienced
group uses the system in the plant.
c. Nine times in 10 an experienced user in the plant will use the
system to print a shipping ticket.
The column on the far right gives the probability of this scenario:
p = 0.30 × 0.80 × 0.90 = 0.216, or about 22%. That is, out of all uses of
the system, about 22% will be by experienced users in the plant printing
a shipping ticket.

NOTE

The individual scenario probabilities must add to 1.0.


If we sort the table in descending order of the scenario probability, p,
we have a prioritized system test strategy. Thus, we get Table 8.6.

TABLE 8.6 Sorted Scenario Probability (p) for Prioritization.


The operational profile is a framework for a complete test plan.
For each scenario, we need to determine which functions of the system
under test will be used.
An activity often involves several system functions; these are
called “runs.” Each run is a thread. It has an identifiable input and
produces a distinct output.
For example, the experienced/plant/ticket scenario might be
composed of several runs.
Display pending shipments.
Display scheduled pickups.
Assign carrier to shipment.
Enter carrier landing information.
Print shipment labels.
Enter on-truck timestamp.
Some scenarios may be low probability but have high potential
impact. For example, suppose ACME Widget is promoting order entry
at the customer site as a key selling feature. So, even though this
accounts for only 3 in a thousand uses, it should be tested as if it were a
high-priority scenario.
This can be accomplished by adding a weight to each scenario:
Weights Scenario
+2 Must test, mission/safety critical
+1 Essential functionality, necessary for robust operation
+0 All other scenarios
The operational profile maximizes system reliability for a given
testing budget. According to J.D. Musa, “testing driven by an
operational profile is very efficient because it identifies failures on
average, in order of how often they occur. This approach rapidly
increases reliability— reduces failure intensity—per unit of execution
time because the failures that occur most frequently are caused by the
faulty operations used most frequently.
User will also detect failures in order of their frequency, if they
have not already been in test.”
3. Performance testing: Performance is the behavior of the system
with respect to goals for time, space, cost, and reliability.
Performance testing requires:
Clear, objective test criterion
Controlled, well-instrumented test-bed
Data reduction support
Automated test drivers
We will look at two aspects of performance test:
1. Performance objectives.
2. Considerations for running performance tests.
Software performance measurement parameters are throughput,
response time, and utilization. We will discuss their benefits,
definitions, and examples in the table below:

The following example shows the difference between throughput,
response time, and utilization. The same tasks are run on similar
processors but executed in a different order. Tasks arrive at the same
time.
Service-level objectives target performance features visible to
users under various operational scenarios:
Peak load performance.
Average load performance.
Worst-case performance.
Availability profile.
Average or worst-case tolerance depends on the application.
Worst-case performance is relevant for mission critical systems like
air traffic control. Average performance is often the focus of data
processing systems.
Performance objectives specify unambiguous, quantifiable targets
for system behavior. A system performance requirement can be stated
in terms of response time, throughput, or utilization.
This objective must be translated into a constraint on software
execution time or resource use. Response time is the dominant
performance issue for many applications. Response time constraints
are of the form:
Elapsed Execution Time < Response Time Objective
There are several variations of performance testing. They are given
below one by one.
a. Volume test: This is simple quantity saturation. Lots of input, with
no constraint on time under normal system loading.
b. Background test: This is an attempt to find faults related to
concurrent, parallel, or multitasking execution. We try to get several things
going concurrently to see if we can force:
Resource contention resolution faults.
Scheduling and deadlock faults.
Race conditions.
We try to sample loading levels without stressing the system. This
is good at detecting:
Basic timing problems.
Faults with re-entrant code.
Data integrity faults.
This sets the stage for stress testing.
c. Stress test: The idea of a stress test is to “break” the system. That is,
we want to see what happens when the system is pushed beyond
design limits.
4. Configuration testing: Some system products are intended for use
in a wide range of platform installations.
For example, consider a typical PC package. It must run on several
computer makes and models, OS versions, and support every printer
and monitor sold.
Configuration testing should be done when everything else is
stable.
Test cases are a matter of selecting a subset of configuration
permutations:
a. Identify all possible configuration variables and their domains from
the requirements and marketing.
b. Select a subset using domain analysis.
c. Pay attention to “special” or “custom” configurations promised to
specific customers or users.
At a minimum, the test suite should cover a normal duty cycle for
the devices in configuration:
Start-up/run/shut down
Start-up/device failure/recovery/run/shut down
The system testing approaches that we have discussed can be used
to investigate other relevant categories also. Some of them are listed
below:
Security
Restart and recovery
Parallel
Reliability
Installation
User documentation
Operator procedures

8.8. REGRESSION AND ACCEPTANCE TESTING


As we know reusability of code is the focal point of object-oriented
programming. So changes to any one component could have
potentially unintended side-effects on the clients that use the
component. Also, because of cascaded effects of changes resulting
from properties like inheritance, it makes sense to catch the defects as
early as possible.

What Is Regression Testing?


It is defined as the selective retesting of system or component after
changes have been made.

Or
It is defined as the process to verify absence of unintended effects.
Or
It is defined as the process of verifying compliance with all old and
new requirements.

Approach used
Reveal faults in new or modified modules. This requires running
new test cases and typically reusing old test cases.
Reveal faults in unchanged modules. This requires re-running old
test cases.
Requires reusable library of test suites.

Scope of regression test


Regression test suite with functional or interface subsets.
Run entire test suite periodically.
Run subset on every new integration.
The selection of subset and frequency requires evaluation of
time/cost effectiveness trade offs.

When to do?
Periodically, every three months.
After every integration of fixes and enhancements.
Frequency, volume, and impact must be considered.

What to test?
Changes result from new requirements or fixes.
Analysis of the requirements hierarchy may suggest which subset
to select.
If new modules have been added, you should redetermine the call
paths required for CI coverage.

Possible test suites


Maximal: Test changed component, all ancestors, and
descendants.
Minimal: Test changed component and immediate ancestors.
Old test cases may need to be revised.

Acceptance Testing
On completion of the developer administered system test, three
additional forms of system testing may be appropriate.
a. Alpha test: Its main features are:
1. It is generally done “in-house” by an independent test
organization.
2. The focus is on simulating real-world usage.
3. Scenario-based tests are emphasized.
b. Beta test: Its main features are:
1. It is done by representative groups of users or customers with
prerelease system installed in an actual target environment.
2. Customer attempts routine usage under typical operating
conditions.
3. Testing is completed when failure rate stabilizes.
c. Acceptance test: Its main features are:
1. Customer runs test to determine whether or not to accept the
system.
2. Requires meeting of the minds on acceptance criterion and
acceptance test plan.

8.9. MANAGING THE TEST PROCESS


A comprehensive test plan package is described in IEEE 83b, which is
an accepted industry standard for test planning. It recommends the
following main documents:
1. The test plan: It defines the features to be tested and excluded from
testing. It covers approach, deliverables, suspend/resume criteria,
environmental needs, responsibilities, staffing and training,
schedule, risks and contingencies, and approvals.
2. The test design: It defines the features/functions to test and the
pass/fail criterion. It designates all test cases to be used for each
feature/function.
3. The test cases: It defines the items to be tested and provides
traceability to the SRS, SDD, and user operations or installation guides.
It specifies the input, output, environment, procedures, and intercase
dependencies of each test case.
4. Test procedures: It describes and defines the procedures necessary
to perform each test.
Each item, section, and sub-section should have an identifying
number and designate date prepared and revised, authors, and
approvals.
How should we go about testing a module, program, or system?
What activities and deliverables are necessary? A general approach is
described in IEEE 87a, an accepted industry standard for unit testing.
It recommends four main steps:
Step 1. Prepare a testing plan: Document the approach, the
necessary resources, and the exit criterion.
Step 2. Design the test:
2.1 Develop an architecture for the test, organize by goals.
2.2 Develop a procedure for each test case.
2.3 Prepare the test cases.
2.4 Package the plan per IEEE 82a.
2.5 Develop test data.
Step 3. Test the components:
3.1 Run the test cases.
3.2 Check and classify the results of each test case:
3.2.1 Actual results meet expected results.
3.2.2 Failure observed:
Implementation fault.
Design fault.
Undetermined fault.
3.2.3 Unable to execute test case:
Fault in test specification or test data.
Fault in test procedure.
Fault in test environment.

Step 4. Prepare a test summary report: Identify the component,
document observed variances from specifications, summarize test case
results, evaluate component usability, summarize testing activities,
and obtain necessary approvals.

A Test Plan Schema


A detailed test case schema, presented in [Berard], is given below.
This schema is a tree, so there will typically be many branches and
leaves for a single root (object test case). The template of a test plan
schema is given next.
How much testing is enough?
Three general types of exit criteria have been proposed:
1. Sufficient coverage: Testing ceases when:
statement coverage is achieved
decision coverage is achieved
path coverage is achieved
all du-path coverage is achieved
call path coverage is achieved
You will probably not be able to do 100% predicate coverage;
85% is a practical goal.
2. Failure rate stabilization: Testing ceases when:
the rate of failures versus CPU test time is low and steady.
3. Residual fault threshold: Testing ceases when:
the ratio of detected faults to the estimated number of total
faults becomes acceptable.
The economic, ethical, legal, and practical considerations for an
application determine what constitutes adequate testing. They are
summarized below:

Type               Example                               Appropriate testing

Life critical      Avionics, medical, motion control     Parallel development,
                                                         comparison tests
Mission critical   Weapon control systems, secure        Independent V&V
                   communications, process control
Asset management   Wire transfer, product inventory,     Security testing
                   payables/receivables
External market    Communications, OS, GUI               Alpha, beta
Internal users     Order processing                      Formal and informal
One-shot project   Test drivers, command files, tools    Informal
Personal use       Tools, games                          Hack and go

Choice of Standards
The planning aspects are proactive measures that can have an across-
the-board influence on all testing projects.
Standards comprise an important part of planning in any
organization. Standards are of two types:
1. External standards
2. Internal standards
External standards are standards that a product should comply
with, are externally visible, and are usually stipulated by external
consortia. From a testing perspective, these standards include standard
tests supplied by external consortia and acceptance tests supplied by
customers.
Internal standards are standards formulated by a testing
organization to bring in consistency and predictability. They
standardize the processes and methods of working within the
organization. Some of the internal standards include
1. Naming and storage conventions for test artifacts
2. Document standards
3. Test coding standards
4. Test reporting standards
These standards provide a competitive edge to a testing
organization. They increase the confidence level one can have in the
quality of the final product. In addition, any anomalies can be brought
to light in a timely manner.
Testing requires a robust infrastructure to be planned upfront. This
infrastructure is made up of three essential elements. They are as
follows:
a. A test case database
b. A defect repository
c. SCM repository and tool
A test case database captures all of the relevant information about
the test cases in an organization.
A defect repository captures all of the relevant details of defects
reported for a product. It is an important vehicle of communication
that influences the work flow within a software organization.
A software configuration management (SCM) repository/CM
repository keeps track of change control and version control of all the
files/entities that make up a software product. A particular case of the
files/entities is test files.

8.10. DESIGN FOR TESTABILITY (DFT)


Design for testability (DFT) is a strategy to align the development
process for maximum effectiveness under either a reliability-driven or
resource-limited regime.
A reliability-driven process uses testing to produce evidence that a
pre-release reliability goal has been met.
A resource-limited process views testing as a way to remove as
many rough edges from a system as time or money permits.
Testability is important in either case. It
Reduces cost in a reliability-driven process.
Increases reliability in a resource-limited process.
Object-oriented systems present some unique obstacles to
testability as well as sharing many with conventional development.
This cost and difficulty can be reduced by following some basic
design principles and planning for test.
Broadly conceived, software testability is a result of many factors.
Some of them are:
a. Characteristics of the representation.
b. Characteristics of the implementation.
c. Built-in test capabilities.
d. The test-suite.
e. The test support environment.
f. The software process in which testing is conducted.
We will now discuss a fishbone chart to consider testability
relationships.
FIGURE 8.30 Fishbone Chart.

1. Representation: The presence of a representation and its
usefulness in test development is a critical testability factor. This is
because
a. Testing without a representation is simply experimental
prototyping.
b. It cannot be decided that a test has passed or failed without an
explicit statement of expected result.
c. It may force production of a partial representation as part of test
plan.
There are many approaches to developing object-oriented
representations, generically known as object-oriented analysis (OOA)
and object-oriented design (OOD).
2. Implementation: An object-oriented program that complies with
generally accepted principles of OOP poses the fewest obstacles to
testing.
Structural testability can be assessed by a few simple metrics. A
metric may indicate testability, scope of testing, or both.
For example, with high coupling among classes, it is typically
more difficult to control the class-under-test (CUT), thus reducing
testability.
The effect of all intrinsic testability metrics is the same:
relatively high value = decreased testability
relatively low value = increased testability
Scope metrics indicate that the number of tests is proportional to
the value of the metric.
A good design always improves testability but some applications
may be inherently hard to test. Testability metrics provide information
useful for resolving design trade-offs. They do not represent some
kind of score to be maximized or minimized.
3. Built-in test:

FIGURE 8.31 Fishbone Chart.

Built-in test features are shown in Figure 8.31 and are summarized
below:
1. Assertions automate basic checking and provide “set and forget”
runtime checking of basic conditions for correct execution.
2. Set/Reset provides controllability.
3. Reporters provide observability.
4. A test suite is a collection of test cases and a plan to use them. IEEE
standard 829 defines the general contents of a test plan.
5. Test tools: Testing requires automation. Without automation,
greater costs will be incurred to achieve a given reliability goal.
The absence of tools inhibits testability.
6. Test process: The overall software process capability and maturity
can significantly facilitate or hinder testability. This corresponds to
the key process area of the defined level for software product
engineering.

8.11. GUI TESTING


The main characteristic of any graphical user interface (GUI)
application is that it is event driven. Users can cause any of several
events in any order. Many GUIs have an event sequence that is not
guided. One benefit of GUI applications to testers is that there is little
need for integration testing. Unit testing is typically at the “button
level”; that is, buttons have functions and these can be tested in the
usual unit-level sense. The essence of system-level testing for GUI
applications is to exercise the event-driven nature of the application.
Unfortunately, most of the models in UML are of little help with
event-driven systems. The main exception is behavioral models,
specifically statecharts and their simpler case, finite-state machines
(FSMs).

8.12. COMPARISON OF CONVENTIONAL AND OBJECT-ORIENTED TESTING
Although there is basically no difference in the method of testing
whether one is testing a non-object-oriented application or an object-
oriented application, the following aspects, which apply mostly to
object-oriented applications, should be kept in mind and the testing
tactics devised accordingly. The main points of comparison are given below:
1. Object-oriented application development requires object-oriented
analysis (OOA) and object-oriented design (OOD), and very often
the code is generated from the design models (OOM, object-oriented
modeling). This overwhelming importance of OOAD (object-oriented
analysis and design) and OOM requires that testing lay special stress
on verifying the analysis and design models, these being critical.
2. Unit testing in OOT context would mean testing not an isolated
method or function or procedure, but a class as a whole.
3. In non-OOT, we independently test Module-X and Module-Y and
then under integration testing, integrate them together and test
mainly for the correctness of their interface. In OOT, however,
Module-X and Module-Y may share many components and may
also have classes inherited from a common base class. One of the
distinguishing OOT approaches would therefore be to do layered
testing. In layered testing we test base classes (bottom layer) first,
followed by the next layer and so on; that is, we proceed from
independent classes to dependent classes.
4. A properly designed OOApp (object-oriented application) is an
excellent candidate for black-box testing, as each component/class
ought to behave as a black-box with well defined sets of inputs and
outputs.
5. Model base testing (MBT) that is a black-box testing strategy is
greatly facilitated by OOM.
6. For systematic testing, partition testing at the class level can be
used. Partition testing can be categorized under:
a. State-based partitioning: Operations that change the state of
the class (like credit, debit, etc.) are separated from those that
merely fetch information (like date of maturity, balance, etc.);
see the sketch after this list.
b. Attribute-based partitioning: Class operations are partitioned
based on attributes, that is, operations that use a given attribute
(like balance, etc.).
c. Function-group based partitioning: Some examples of such
partitioning are open, close, transact, report, etc.
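The sketch below illustrates state-based partitioning on a hypothetical BankAccount class (an assumed example, not from the text): the state-changing operations form one partition and the information-fetching operations another, and test cases are drawn from each:

#include <cassert>

// Hypothetical class whose operations are partitioned for testing.
class BankAccount {
public:
    // State-changing operations (one partition).
    void credit(double amount) { balance_ += amount; }
    void debit(double amount)  { balance_ -= amount; }
    // Information-fetching operations (another partition).
    double balance() const { return balance_; }
private:
    double balance_ = 0.0;
};

int main() {
    BankAccount a;
    a.credit(100.0);               // test cases from the state-changing partition
    a.debit(40.0);
    assert(a.balance() == 60.0);   // the fetch-only partition verifies the state
    return 0;
}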

8.13. TESTING USING ORTHOGONAL ARRAYS


Although some applications have a small number of input values,
listing all of the possible combinations may yield a very large number of
test cases. One technique to reduce the number of test cases is to apply
a combinational method known as orthogonal array testing (OAT).
This method can be used for both normal software applications as well
as object-oriented software.
To apply the technique, the software must have independent sets of
states. The goal is to pair each state (from one set) with every other
state (from another set) at least once. We need to identify unique pairs
of states.
Let us consider an example of a bookstore application. Further
assume that there are three classes:

FIGURE 8.32 Classes and States in a Bookstore.

Each of these three classes has a finite number of possible states
in an object-oriented application, whereas in a procedural application,
if we have three procedures, each will have arguments with a finite set
of values.
Now, from the previous figure we find that two classes (book and
purchase) have three states each and one class has four states, so
testing every combination of states requires 3 × 3 × 4 = 36 test cases.
Covering every pair of states (known as pair-wise combination) requires
only 12 test cases, as shown in the table below, thereby reducing the
number of possible test cases.

TABLE 8.7 An Orthogonal Array.


Test case 1 in the table above says to test the combination that has
book set to “in-stock,” purchase set to “cash,” and shipping set to
“overnight.”

How to implement this test case?


In a procedural language, the test consists of passing the parameters
“in-stock,” “cash,” and “overnight” to the procedures order-book,
purchase-book, and ship-book, respectively. However, in object-
oriented programming, this test consists of having an object of class
book, in state “in-stock,” send a message that passes an object of
class purchase, in state “cash,” to an object of class shipping, in state
“overnight.”
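A hedged C++ sketch of this object-oriented form of test case 1 is given below; the Book, Purchase, and Shipping classes and their string-valued states are assumptions made only for illustration:

#include <cassert>
#include <string>

// Hypothetical classes; the "state" of each is simplified to a string.
class Purchase {
public:
    explicit Purchase(const std::string& method) : method_(method) {}
    std::string method() const { return method_; }
private:
    std::string method_;
};

class Shipping {
public:
    explicit Shipping(const std::string& mode) : mode_(mode) {}
    bool ship(const Purchase& p) {            // accepts the purchase message
        return !p.method().empty() && !mode_.empty();
    }
private:
    std::string mode_;
};

class Book {
public:
    explicit Book(const std::string& stock) : stock_(stock) {}
    bool order(const Purchase& p, Shipping& s) {   // passes the purchase on
        return stock_ == "in-stock" && s.ship(p);
    }
private:
    std::string stock_;
};

// Test case 1: book "in-stock", purchase "cash", shipping "overnight".
int main() {
    Book book("in-stock");
    Purchase purchase("cash");
    Shipping shipping("overnight");
    assert(book.order(purchase, shipping));
    return 0;
}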
Test cases 9-12 contain some improbable combinations involving the
“out-of-print” state of the book. A user cannot purchase a book that is
out of stock or out of print. A software tester may find this a non-feasible
combination and may remove it from the set of test cases. On the other
hand, executing such a test ensures that the application traps this
condition and returns the proper error message.
When reducing the number of test cases, the possibility of missing
a crucial combination always exists. One can use risk analysis to
identify such crucial tests and give them priorities.
Orthogonal array testing is a powerful technique to use any time a
system presents a small input domain with a large number of
permutations.

8.14. TEST EXECUTION ISSUES


After defining the test cases, the next step is to formulate an
environment in which to execute the tests. This requires special test
software, known as a driver. It invokes the software-under-test (SUT).
The typical driver consists of the following features:
a. Packaging input and data
b. Invoking the software-under-test (SUT)
c. Capturing the resulting outcome
Creating a driver to test a class calls for additional considerations. The
driver must take care of encapsulation so as to control and observe a
class’s internal data and methods. There are several ways to achieve
this, such as:
Providing a test case method for each class
Creating a parallel class, which appears identical to the original
except for the addition of code needed for testing
Creating a child class that inherits the methods to be tested
All of these methods alter the class and thus modify the actual
application. Consequently, the tested implementation may not be
identical to the released code. It is important that the testing
environment be as close as possible to the deployment environment;
otherwise, you are not fully testing the end product.
Drivers for testing classes can take on many forms. The testers
must analyze the trade offs when deciding on a driver scheme.
Creating effective and reusable test drivers takes significant planning
and effort, even if commercial tools are used.
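As a minimal sketch of such a driver (the add function standing in for an arbitrary software-under-test), the code below packages input data with expected outcomes, invokes the SUT, and captures and compares the results:

#include <iostream>
#include <vector>

// Hypothetical software-under-test (SUT).
int add(int a, int b) { return a + b; }

// One packaged test case: inputs plus the expected outcome.
struct TestCase {
    int a, b;
    int expected;
};

int main() {
    // (a) Package input data.
    std::vector<TestCase> suite = { {1, 2, 3}, {0, 0, 0}, {-5, 5, 0} };

    int failures = 0;
    for (const TestCase& tc : suite) {
        int actual = add(tc.a, tc.b);          // (b) Invoke the SUT.
        if (actual != tc.expected) {           // (c) Capture and compare the outcome.
            ++failures;
            std::cout << "FAIL: add(" << tc.a << "," << tc.b
                      << ") = " << actual << ", expected " << tc.expected << "\n";
        }
    }
    std::cout << failures << " failure(s)\n";
    return failures == 0 ? 0 : 1;
}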

8.15. CASE STUDY—CURRENCY CONVERTER APPLICATION
Problem Statement: The currency converter application converts
U.S. dollars to any of the four currencies: Brazilian real, Canadian
dollars, European euros, and Japanese yen. The user can revise inputs
and perform repeated currency conversion. This program is an event-
driven program that emphasizes code associated with a GUI. The GUI is
shown in Figure 8.33.
The user selects the country whose currency equivalent is desired
using radio buttons which are mutually exclusive. When the compute
button is clicked, the result is displayed. The clear button will clear the
screen. Clicking on the quit button ends the application.
Now, we will perform the following on this GUI application:

FIGURE 8.33

1. GUI testing of this application: To test a GUI application, we begin
by identifying all of the user input events and all of the system output
events. Some of them are given below:

Input events                              Output events

inp1    Enter U.S. dollar amount          op1     Display U.S. dollar amount
inp2    Click on a country button         op2     Display currency name
inp2.1  Click on Brazil                   op2.1   Display Brazilian reals
inp2.2  Click on Canada                   op2.2   Display Canadian dollars
inp2.3  Click on Europe                   op2.3   Display European euros
inp2.4  Click on Japan                    op2.4   Display Japanese yen
inp3    Click on compute button           op2.5   Display ellipsis
inp4    Click on clear button             op3     Indicate selected country
inp5    Click on quit button              op4     Reset selected country
inp6    Click on OK in error message      op5     Display foreign currency value
                                          op6     Error message: must select a country

2. Unit testing of currency conversion program: The command
buttons (i.e., Compute, Clear, and Quit) on the form have event-driven
code attached to them. Hence, they are sensible places to
perform unit-level testing. Unit testing of the compute button should
also consider invalid U.S. dollar amount entries such as non-
numerical inputs, negative inputs, and very large inputs.

What are the best methods for performing unit-level testing?


One method is to run test cases from a specially coded driver that would
provide values for input data and check output values against expected
values.
A second method is to use the GUI as a test bed. This looks like
system-level unit testing, which seems oxymoronic, but it is workable. It
is system level in the sense that test case inputs are provided via system-
level user input events. And the test case result comparisons are based on
system-level output events. This works for small applications but it does
raise some serious questions, such as: what if the computation is correct
but a fault occurs in the output software? Another problem is that it will be
harder to capture test execution results. It also has other problems; for
example, repeating a set of test cases is time consuming.
Hence, unit testing with a test driver looks preferable.
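A minimal sketch of such a driver-based unit test is given below; the convert function and its rejection of invalid amounts are assumptions standing in for the compute-button logic, and the rate used is illustrative only:

#include <cassert>

// Hypothetical conversion routine extracted from the compute-button handler.
// The rate is illustrative, not an actual exchange rate.
double convert(double usDollars, double rate) {
    if (usDollars < 0.0 || rate <= 0.0) return -1.0;   // invalid input rejected
    return usDollars * rate;
}

int main() {
    assert(convert(10.0, 2.0) == 20.0);   // nominal case
    assert(convert(0.0, 2.0) == 0.0);     // boundary: zero dollars
    assert(convert(-5.0, 2.0) == -1.0);   // invalid: negative amount rejected
    return 0;
}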
Unit testing of the U.S. dollar amount text box can also be done. It is
language dependent. If it is implemented in VB, then there is little need to
unit test the text box. But if we use an OOP language, then we would need
to verify that the inputs observed by the keyboard handler are correctly
displayed on the GUI and are correctly stored in the object’s attributes.
3. Integration testing of currency conversion program: Whether
integration testing is suitable for this problem depends on how it is
implemented. Three main choices are available for the compute
button.
First choice concentrates all logic in one place and simply uses the
status of the option buttons as in IF tests. Such IF tests would be
thoroughly done at the unit level. So, there is little need for integration
testing.
The second and more object-oriented choice would have methods in
each option button object that send the exchange rate value for the
corresponding country in response to a click event. In this case, we notice
that as units, there is very little to test. That is why the choice of methods
as units depends on size. Everything of interest is at the integration level
and at that level we have two concerns—are the option/radio buttons
sending the right exchange rate values and is the equivalent amount
calculation correct?
The third choice is the Visual Basic style. The VB code includes
a global variable for the exchange rate. There are event procedures for
each option button. The result is very similar to the pure object-oriented
version. Unit testing could be replaced by simple code reading for the
option button event procedures and the testing of the command compute
procedure is similarly trivial.
Note that in all the three variations given previously there is little
need for integration testing. This is true for small GUI applications only.
By “small” we mean to say an application that is implemented by one
person.
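As an illustration, the second (object-oriented) choice described above might look like the minimal C++ sketch below, in which each country button object supplies its exchange rate in response to a click. The class names and rate values are assumptions made for illustration only and are not taken from the application's code.

#include <iostream>

// Each country button object "sends" its own exchange rate when clicked
// (second, object-oriented design choice; the rate values are illustrative).
class CountryButton {
public:
    explicit CountryButton(double rate) : rate_(rate) {}
    double onClick() const { return rate_; }    // responds to a click event
private:
    double rate_;
};

// The compute handler only multiplies. At the integration level the two
// concerns are: does each button send the right rate, and is this
// equivalent-amount calculation correct?
double computeEquivalent(double usDollars, const CountryButton& selected) {
    return usDollars * selected.onClick();
}

int main() {
    CountryButton europe(0.93);    // euros per U.S. dollar (assumed value)
    std::cout << computeEquivalent(10.0, europe) << " euros\n";   // prints 9.3
    return 0;
}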
4. System testing for currency conversion program: As we have
already discussed, unit and integration testing are minimally needed
for small GUI applications. The onus therefore shifts to system
testing.
System testing may be
a. UML-based system testing.
b. Statechart-based system testing.
In the first step, also called project inception, the customer/user
describes the application in very general terms. Three types of system
functions are identified—evident, hidden, and frill. Evident functions are
the obvious ones. Hidden functions might not be discovered immediately
and frills are the “bells and whistles” that so often occur. The table below
lists the system functions for the currency converter application.
Reference no. Function Category
R1 Start application Evident
R2 End application Evident
R3 Input U.S. dollar amount Evident
R4 Select country Evident
R5 Perform conversion calculation Evident
R6 Clear user inputs and program outputs Evident
R7 Maintain EX-OR relationship among countries Hidden
R8 Display country flag images Frill

We first discuss UML-based system testing.


Our formulation lets us be very specific about system-level testing.
There are four levels with corresponding coverage metrics for GUI
applications. Two of these are naturally dependent on the UML
specification.
First Level: To test the system functions by developing an
incidence matrix as shown below:

TABLE 8.8 Use Case Incidence with System Functions.


Examining the incidence matrix, we can see several possible ways
to cover the seven system functions. One way would be to derive test
cases from real-use cases that correspond to extended essential-use
cases 1, 2, 5, and 6. These will need to be real-use cases as opposed to
the expanded essential-use cases. The difference is that specific
countries and dollar values are used, instead of the higher level
statements such as “click on a country button” and enter a dollar
amount.
The process of deriving system test cases from real-use cases is
mechanical. The use case preconditions are the test-case preconditions
and the sequences of actor actions and system responses map directly
into sequences of user input events and system output events.
The set of extended essential-use cases 1, 2, 5, and 6 is a nice
example of a set of regression test cases because test-case 5 in the
above table covers four system functions and test-case 6 covers 3.
Second Level: To develop test cases from all of the real-use cases.
We can develop system-level test cases from a real-use case based
on an extended essential-use case. For example, this has been applied to
real-use-case-3 (RUC-3), which is obtained from extended essential-use-
case-3. Our RUC-3 is given below.
RUC-3               Normal usage
Actor(s)            User
Preconditions       txtDollar has focus
Type                Primary
Description         The user inputs U.S. $10 and selects the European option;
                    the application computes and displays the equivalent:
                    9.30 euros
Sequence            Actor action                        System response
                    1. User enters 10 from keyboard     2. 10 appears in txtDollar
                    3. User clicks on European button   4. Euros appears in lblEquiv
                    5. User clicks cmdCompute button    6. 9.30 appears in lblEqAmt
Alternative         Actions 1 and 3 can be reversed, and consequently the
sequence            corresponding responses are reversed as well
Cross-reference     R3, R4, R5
Postconditions      cmdClear has focus
This is RUC-3. Based on this real-use case, we derive system-level test
cases also. They are given below.
SysTC3              Normal usage ($ amount entered)
Test operator       Rajiv Chopra
Preconditions       txtDollar has focus
Test operator       Tester inputs                       Expected system response
sequence            1. Enters 10 from keyboard          2. Observe 10 appears in txtDollar
                    3. Clicks on the European button    4. Observe euros appears in lblEquiv
                    5. Clicks cmdCompute button         6. Observe 9.30 appears in lblEqAmt
Postconditions      cmdClear has focus
Test result         Pass/Fail
Date run            September 6, 2007

Third Level: To derive test cases from a finite state machine
description of the external appearance of the GUI. This is shown below:

FIGURE 8.34 GUI Finite State Machine.

A test case in this formulation is a circuit: a path whose start
node and end node are the same, usually an idle state. Nine such test cases are
shown in the table below. The numbers in the table show the sequence in
which the states are traversed by the test case. The test cases, TC1 to
TC9, are as follows:
FIGURE 8.35 Test Cases Derived from FSM.

Note that these numbers in columns TC1-TC9 show the sequence in


which states are traversed by the test case. Many other cases exist but this
shows how to identify them.
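The notion of a test case as a circuit can also be sketched programmatically. The C++ fragment below enumerates circuits (paths that leave the idle state and return to it) over a small, hypothetical version of the GUI state machine; the state names and transitions are assumed for illustration and are not copied from Figure 8.34.

#include <algorithm>
#include <iostream>
#include <map>
#include <string>
#include <vector>

// A hypothetical GUI state machine: each state maps to its successor states.
using Graph = std::map<std::string, std::vector<std::string>>;

// Depth-first search for circuits: paths that leave the start (idle) state
// and eventually return to it, visiting no intermediate state twice.
void findCircuits(const Graph& g, const std::string& node, const std::string& start,
                  std::vector<std::string>& path,
                  std::vector<std::vector<std::string>>& circuits) {
    path.push_back(node);
    auto it = g.find(node);
    if (it != g.end()) {
        for (const std::string& next : it->second) {
            if (next == start) {                                // circuit closed
                std::vector<std::string> circuit = path;
                circuit.push_back(start);
                circuits.push_back(circuit);
            } else if (std::find(path.begin(), path.end(), next) == path.end()) {
                findCircuits(g, next, start, path, circuits);
            }
        }
    }
    path.pop_back();
}

int main() {
    Graph gui = {
        {"Idle",            {"DollarEntered"}},
        {"DollarEntered",   {"CountrySelected"}},
        {"CountrySelected", {"EquivalentShown"}},
        {"EquivalentShown", {"Idle"}},             // clear/quit returns to idle
    };
    std::vector<std::string> path;
    std::vector<std::vector<std::string>> circuits;
    findCircuits(gui, "Idle", "Idle", path, circuits);
    for (const auto& c : circuits) {               // each circuit is one test case
        for (std::size_t i = 0; i < c.size(); ++i)
            std::cout << c[i] << (i + 1 < c.size() ? " -> " : "\n");
    }
    return 0;
}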
Fourth Level: To derive test cases from state-based event tables.
This would have to be repeated for each state. We might call this the
exhaustive level because it exercises every possible event for each state.
However, it is not truly exhaustive because we have not tested all
sequences of events across states. The other problem is that it is an
extremely detailed view of system testing that is likely very redundant
with integration and even unit-level test cases.
Now, we will discuss statechart-based system testing.
Statecharts are a fine basis for system testing. The problem is that
Statecharts are prescribed to be at the class level in UML. There is no
easy way to compose Statecharts of several classes to get a system-level
Statechart. A possible solution is to translate each class-level Statechart
into a set of event-driven petri nets (EDPNs) to describe threads to be
tested. Then the atomic system functions (ASFs) and the data places are
identified. For our GUI application, they are as follows:
FIGURE 8.36 ASFs and Data Places.

The next step in building an EDPN description of the currency


conversion GUI is to develop the EDPNs for the individual atomic
system functions. For example, one such EDPN uses a port input p1 and a
data place d1, where p1 is “Enter US dollar amount” and d1 is “US dollar
amount entered.”

Then, system-level threads are built up by composing atomic system


functions into sequences. We are finally in a position to describe various
sets of system-level test cases for the currency conversion GUI. The
lowest level is to simply exercise every atomic system function. This is a
little artificial because many atomic system functions (ASFs) have data
outputs that are not visible system-level outputs. Even worse, the
possibility always exists that an ASF has no port outputs, as with S1:
Store US dollar amount.
FIGURE 8.37 Directed Graph of ASF Sequences.

At the system testing level, we cannot tell if an amount is correctly


stored, although we can look at the screen and see that the correct amount
has been entered.
The next level of system testing is to exercise a “suitable” set of
threads. By suitable we mean that we should exercise a set of threads
that:
Uses every atomic system function.
Uses every port input.
Uses every port output.
Beyond this we can look at a directed graph of ASFs as shown in
Figure 8.37.
This is only a partial graph showing mainline behavior. The thread
<S1, S2, S6, S7>
is one of the 16 paths. Half of these end with the clear button clicked and
the other half end with the quit button.
Consider the set, T, of threads {T1, T2, T3, T4} where their ASF
sequences are given below:
T1 = <S1, S4, S6, S7>
T2 = <S1, S2, S6, S7>
T3 = <S1, S3, S6, S7>
T4 = <S1, S5, S7, S8>
The threads (T1-T4) given above in set T provide the following
coverage:
Every ASF
Every port input
Every port output
As such, set T constitutes a reasonable minimum level of system
testing for the currency conversion GUI. Similarly, we can go into more
detail by exploring some next-level user behavior.
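As a small illustration of how such coverage claims can be checked mechanically, the C++ sketch below verifies that the thread set T given above exercises every ASF S1 through S8. The mapping of ASFs to port inputs and outputs is not modeled here; checking those coverages would follow the same pattern with additional sets.

#include <iostream>
#include <set>
#include <string>
#include <vector>

int main() {
    // Thread set T from the text: each thread is a sequence of ASFs.
    std::vector<std::vector<std::string>> threads = {
        {"S1", "S4", "S6", "S7"},   // T1
        {"S1", "S2", "S6", "S7"},   // T2
        {"S1", "S3", "S6", "S7"},   // T3
        {"S1", "S5", "S7", "S8"},   // T4
    };
    // ASFs that must be exercised (assumed here to be S1 through S8).
    std::set<std::string> required = {"S1", "S2", "S3", "S4",
                                      "S5", "S6", "S7", "S8"};

    // Union of all ASFs actually touched by the thread set.
    std::set<std::string> covered;
    for (const auto& t : threads)
        covered.insert(t.begin(), t.end());

    bool complete = true;
    for (const auto& asf : required) {
        if (covered.count(asf) == 0) {
            std::cout << "ASF not covered: " << asf << "\n";
            complete = false;
        }
    }
    std::cout << (complete ? "Every ASF is covered by set T\n"
                           : "Coverage is incomplete\n");
    return 0;
}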

SUMMARY
Various key object-oriented concepts can be tested using some testing
methods and tools. They are summarized below:

Key object-oriented concept Testing methods and tools


1. Object orientation Tests need to integrate data and methods
more tightly.
2. Unit testing of classes BVA, equivalence partitioning to test
variables.
Code coverage methods for methods of a
class.
Alpha-Omega method of exercising
methods.
State diagram to test states of a class.
Stress testing to detect memory leaks.
3. Abstract classes Requires retesting for every new
implementation of the abstract class.
4. Encapsulation and Requires unit testing at class level and
inheritance incremental class testing when
encapsulating.
Inheritance introduces extra context;
each combination of different contexts
has to be tested.
5. Polymorphism                  Each of the different methods of the
                                 same name should be tested separately.
                                 Maintainability of code may suffer.
6. Dynamic binding               Conventional code coverage has to be
                                 modified to be applicable for dynamic
                                 binding.
                                 Possibility of unanticipated runtime
                                 defects is higher.
7. Inter-object Message sequencing.
communication via Sequence diagrams.
messages
8. Object reuse and parallel Needs more frequent integration tests
development of objects and regression tests.
Integration testing and unit testing are
not as clearly separated as in the case of
a procedure-oriented language.
Errors in interfaces between objects
likely to be more common in object-
oriented systems and hence needs
thorough interface testing.

MULTIPLE CHOICE QUESTIONS


1. UML stands for:
a. Unified modeling language
b. Universal modeling language
c. Uniform modeling language
d. None of the above.
2. A general purpose mechanism for organizing elements into
groups in UML is called
a. Package
b. Class
c. Deployment
d. None of the above.
3. Which one of the following is an example of call and return
architecture?
a. Object-oriented architecture
b. Layered architecture
c. Both (a) and (b)
d. Agent architecture
4. A predicate expression associated with an event is known as
a. Transition
b. Class
c. Guard
d. None of the above.
5. Statecharts refers to
a. State diagrams + depth + orthogonality + broadcast
communication
b. State diagrams only
c. State diagrams + depth
d. None of the above.
6. Which of the following is OOA/OOD method?
a. OMT
b. ROOM
c. Fusion
d. All of the above.
7. The formation of a subclass that has no locally defined features
other than the minimum requirement of a class definition is
called
a. Concatenation
b. Binding
c. Polymorphism
d. None of the above.
8. Testing of nested loops was suggested by
a. Little wood
b. Beizer
c. Boehm
d. None of the above.
9. An accepted industry standard for unit testing is
a. IEEE 83b
b. IEEE 87a
c. IEEE 9126
d. None of the above.
10. A powerful technique to use any time a system presents a small
input domain with a large number of permutations is known as
a. Regression testing
b. Object-oriented testing
c. Orthogonal array testing
d. None of the above.

ANSWERS
1. a.
2. a.
3. c.
4. c.
5. a.
6. d.
7. a.
8. b.
9. b.
10. c.

CONCEPTUAL SHORT QUESTIONS WITH ANSWERS


Q. 1. What is LCOM metric?
Ans. The LCOM metric is defined as the count of the method pairs
that do not share a common instance variable minus the count
of method pairs that do. Please note that the larger the
number of similar methods, the more cohesive is the class.
LCOM (for a class) = 0 if none of the methods of a class
display any instance behavior, i.e., do not use any instance
variables. Low cohesion increases complexity, thereby
increasing the likelihood of errors during the development
process.
Q. 2. How do we compute the component weightage (CW) of an
object-oriented software?
Ans. We try to compute the CWs of individual design attributes
like encapsulation, inheritance, coupling, and cohesion to
show the complexity relationship with the design properties.
That is,
Complexity = [0.16 * Encapsulation] + [2.78 * Inheritance]
+ [1.57 * Coupling]
Q. 3. Explain the two main variants of state models?
Ans. (a) Moore Machine:
Transitions do not have output.
An output action is associated with each state.
States are active.
(b) Mealy Machine:
Transitions have output.
No output action is associated with state.
States are passive.
Q. 4. The big bang is estimated to have occurred about 18 billion
years ago.
Given: Paths/second = 1 × 10^3
Seconds/year = 3.154 × 10^7
If we start testing 10^3 paths/second at the instant of the big bang,
how many paths could we test? At what loop value of x
would we run out of time?
Ans. Given: 18 billion years = 1.8 × 10^9 years
and 3.154 × 10^7 seconds/year
We have, 1.8 × 10^9 years × 3.154 × 10^7 seconds/year
= 5.6772 × 10^16 seconds (since
the big bang)
If we test 10^3 paths/second starting at the instant of the big bang, we
could test
= 5.6772 × 10^16 × 10^3 paths
= 5.6772 × 10^19 paths
Also, we find x = ?
Now, 2^x = 5.6772 × 10^19
Taking log on both sides, we get:
log (2^x) = log (5.6772 × 10^19)
or x log 2 = log 5.6772 + 19
x (0.3010) = 0.754134 + 19 = 19.754134
or x = 3.321928 × 19.754134
∴ x ≈ 65.62
Q. 5. What is pesticide paradox?
Ans. As defined by Beizer,
1. The purpose of a test process is to find faults.
2. An effective process will eventually find fewer faults.
3. By definition, this process becomes increasingly
ineffective, i.e., its ability to find faults decreases because
the initial fault sources have been eliminated.
4. Generally, applications, platforms, and development tools
become more complex, creating new opportunities for
errors.
This is called the pesticide paradox.
For example, a given pesticide (P1) kills 98% of the first
generation of bugs B1, having survivors S1. The 2% that
survive are immune to P1. You have helped the survivors by
reducing their competition. The survivors regenerate and
become generation B2. P1 is useless against the second
generation, B2.
Moral: Process monitoring is important to identify both
chronic and emerging fault causes.

REVIEW QUESTIONS
1. Object-oriented languages like JAVA do not support pointers.
This makes testing easier. Explain how.
2. Consider a nested class. Each class has one method. What kind
of problems will you encounter during testing of such nested
classes? What about their objects?
3. Explain the following:
a. Unit and integration testing
b. Object-oriented testing
4. Write and explain some common inheritance related bugs.
5. How is object-oriented testing different from procedural
testing? Explain with examples.
6. Describe all the methods for class testing.
7. Write a short paragraph on issues in object-oriented testing.
8. Explain briefly about object-oriented testing methods with
examples. Suggest how you test object-oriented systems by
use-case approach.
9. Illustrate “how do you design interclass test cases.” What are
the various testing methods applicable at the class level?
10. a. What are the implications of inheritance and polymorphism
in object-oriented testing?
b. How does GUI testing differ from normal testing? How is
GUI testing done?
11. With the help of suitable examples, demonstrate how
integration testing and system testing is done for object-
oriented systems?
12. How reusability features can be exploited by object-oriented
testing approach?
13. a. Discuss the salient features of GUI testing. How is it
different from class testing?
b. Explain the testing process for object-oriented programs
(systems).
14. Draw a state machine model for a two-player game and also
write all possible control faults from the diagram.
CHAPTER
9

AUTOMATED TESTING
Inside this Chapter:
9.0. Automated Testing
9.1. Consideration During Automated Testing
9.2. Types of Testing Tools-Static V/s Dynamic
9.3. Problems with Manual Testing
9.4. Benefits of Automated Testing
9.5. Disadvantages of Automated Testing
9.6. Skills Needed for Using Automated Tools
9.7. Test Automation: “No Silver Bullet”
9.8. Debugging
9.9. Criteria for Selection of Test Tools
9.10. Steps for Tool Selection
9.11. Characteristics of Modern Tools
9.12. Case Study on Automated Tools, Namely, Rational Robot, WinRunner, Silk Test, and
Load Runner

9.0. AUTOMATED TESTING


Developing software to test the software is called test
automation/automated testing. In simple terms, automated testing is
automating the manual testing process. It is used to replace or
supplement manual testing with a suite of testing tools. Automated
testing tools assist software testers to evaluate the quality of the
software by automating the mechanical aspects of the software testing
task. Automated testing tools vary in their underlying approach,
quality, and ease of use.
Manual testing is used to document tests, produce test guides
based on data queries, provide temporary structures to help run tests,
and measure the results of the tests. Manual testing is considered to be
costly and time-consuming. Therefore, automated testing is used to cut
down time and cost.

9.1 CONSIDERATION DURING AUTOMATED TESTING


While performing testing with automated tools, the following points
should be noted:
1. Clear and reasonable expectations should be established in order to
know what can and what cannot be accomplished with automated
testing in the organization.
2. There should be a clear understanding of the requirements that
should be met in order to achieve successful automated testing.
This requires that the technical personnel should use the tools
effectively.
3. The organization should have detailed, reusable test cases which
contain exact expected results and a stand alone test environment
with a restorable database.
4. The testing tool should be cost-effective. The tool must ensure that
test cases developed for manual testing are also useful for
automated testing.
5. Select a tool that allows the implementation of automated testing in
a way that conforms to the specified long-term testing strategy.
Automated tools like Mothora are used to create and execute test
cases, measure test case adequacy, determine input-output correctness,
locate and remove faults or bugs, and control and document the test.
Similarly, Bug Trapper is used to perform white-box testing. This
tool traces the path of execution and captures the bug along with the
path of execution and the different input values that resulted in the
error.
We now list some commonly used automated test tools.
Whatever the phase of the software development life cycle (SDLC),
automated tools are available. Tools like MS Project are project
management tools. Design tools like the object-modeling technique
(OMT) and testing tools like a flow graph generator are also available
in the market. Regression testing tools are finally used during the
testing life cycle. These testing tools are often known as computer-aided
software testing (CAST) tools.

TABLE 9.1 Software Testing Tools and Their Vendors/Manufacturers.

S. no. Manufacturer Testing tools


1. Segue Silk Test
Silk Performer
Silk Central
2. IBM/Rational Requirements Pro
Robot
Clear Case
3. Mercury Interactive WinRunner
Load Runner
Test Director
4. Compuware Reconcile
QA Load
QA Run

See Table on next page for certain other tools used during testing
and their field of applications.

9.2. TYPES OF TESTING TOOLS-STATIC V/S DYNAMIC


Because testing is of two types
a. Static testing
b. Dynamic testing

The tools used during testing are named


a. Static testing tools.
b. Dynamic testing tools.

Static testing tools seek to support the static testing process whereas
dynamic testing tools support the dynamic testing process. Note that
static testing is different from dynamic testing. We tabulate the
differences between static and dynamic testing before discussing its
tools. See Table 9.2.
Software testing tools are frequently used to ensure consistency,
thoroughness, and efficiency in testing software products and to fulfill
the requirements of planned testing activities. These tools may
facilitate unit (module) testing and subsequent integration testing (e.g.,
drivers and stubs) as well as commercial software testing tools.
Testing tools can be classified into one of the two categories listed
below:

TABLE 9.2 Static Testing V/s Dynamic Testing.


S. no.  Static testing                          Dynamic testing
1.      Static testing does not require the     Dynamic testing involves testing the
        actual execution of the software.       software by actually executing it.
2.      It is more cost effective.              It is less cost effective.
3.      It may achieve 100% statement           It achieves less than 50% statement
        coverage in a relatively short time.    coverage because it finds bugs only in
                                                the part of the code that is actually
                                                executed.
4.      It usually takes a shorter time.        It may involve running several test
                                                cases, each of which may take longer
                                                than compilation.
5.      It may uncover a variety of bugs.       It uncovers a limited type of bugs that
                                                are exposed through execution.
6.      It can be done before compilation.      It can take place only after executables
                                                are ready.

a. Static Test Tools: These tools do not involve actual input and
output. Rather, they take a symbolic approach to testing, i.e., they
do not test the actual execution of the software. These tools include
the following:
a. Flow analyzers: They ensure consistency in data flow from
input to output.
b. Path tests: They find unused code and code with contradictions.
c. Coverage analyzers: It ensures that all logic paths are tested.
d. Interface analyzers: It examines the effects of passing variables
and data between modules.
b. Dynamic Test Tools: These tools test the software system with
“live” data. Dynamic test tools include the following:
a. Test driver: It inputs data into a module-under-test (MUT).
b. Test beds: It simultaneously displays source code along with
the program under execution.
c. Emulators: The response facilities are used to emulate parts of
the system not yet developed.
d. Mutation analyzers: The errors are deliberately “fed” into the
code in order to test fault tolerance of the system.

9.3. PROBLEMS WITH MANUAL TESTING


The main problems with manual testing are listed below:
1. Not Reliable: Manual testing is not reliable as there is no yardstick
available to find out whether the actual and expected results have
been compared. We just rely on the tester’s words.
2. High Risk: A manual testing process is subject to high risks of
oversights and mistakes. People get tired, they may be temporarily
attentive, they may have too many tasks on hand, they may be
insufficiently trained, and so on. Hence, unintentionally mistakes
happen in entering data, in setting parameters, in execution, and in
comparisons.
3. Incomplete Coverage: Testing is quite complex when we have a
mix of multiple platforms, OS servers, clients, channels, business
processes, etc. Testing is non exhaustive. Full manual regression
testing is impractical.
4. Time Consuming: Limited test resources make manual testing
simply too time consuming. As per a study done, 90% of all IT
projects are delivered late due to manual testing.
5. Fact and Fiction: The fiction is that manual testing is done while
the fact is only some manual testing is done depending upon the
feasibility.
Please note that manual testing is used to document tests, produce
test guides based on data queries, provide temporary structures to help
run tests, and measure the results of the test. Manual testing is
considered to be costly and time-consuming; so we use automated
testing to cut down time and cost.

9.4. BENEFITS OF AUTOMATED TESTING


Automated testing is the process of automating the manual testing
process. It is used to replace or supplement manual testing with a suite
of testing tools. Automated testing tools assist software testers to
evaluate the quality of the software by automating the mechanical
aspects of the software testing task. The benefits of automation include
increased software quality, improved time to market, repeatable test
procedures, and reduced testing costs. We will now list some more
benefits of test automation. They are given below:
1. Automated execution of test cases is faster than manual execution.
This saves time. This time can also be utilized to develop
additional test cases, thereby improving the coverage of testing.
2. Test automation can free test engineers from mundane tasks and
make them focus on more creative tasks.
3. Automated tests can be more reliable. This is because manually
running the tests may result in boredom, fatigue, and more chances
of human error, while automated testing overcomes all these
shortcomings.
4. Automation helps in immediate testing as it need not wait for the
availability of test engineers. In fact,
Automation = Lesser Person Dependence
5. Test cases for certain types of testing such as reliability testing,
stress testing, and load and performance testing cannot be executed
without automation. For example, if we want to study the behavior
of a system with millions of users logged in, there is no way one
can perform these tests without using automated tools.
6. Manual testing requires the presence of test engineers but
automated tests can be run around the clock, in a 24 × 7
environment.
7. Tests, once automated, take comparatively far fewer resources to
execute. A manual test suite requiring 10 persons to execute over
31 days, i.e., 31 × 10 = 310 man-days, may take just 10 man-days
for execution, if automated. Thus, a ratio of 1: 31 is achieved.
8. Automation produces a repository of different tests which helps us
to train test engineers to increase their knowledge.
9. Automation does not end with developing programs for the test
cases. Automation includes many other activities like selecting the
right product build, generating the right test data, analyzing results,
and so on.
Automation should have scripts that produce test data to maximize
coverage of permutations and combinations of input and expected
output for result comparison. They are called test data generators.
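A minimal sketch of such a test data generator is shown below in C++; it emits every combination of a set of dollar amounts and countries together with an expected output for later result comparison. The value sets and the expected-output rule are assumptions chosen only to illustrate the idea.

#include <iostream>
#include <string>
#include <utility>
#include <vector>

int main() {
    // Candidate input values to combine (purely illustrative).
    std::vector<double> amounts = {0.0, 1.0, 10.0, 1000000.0};
    std::vector<std::pair<std::string, double>> countries = {
        {"Brazil", 5.00}, {"Canada", 1.36}, {"Europe", 0.93}, {"Japan", 150.0}};

    // Emit every amount x country combination with its expected output,
    // in a form a data-driven test script can later read back and compare.
    std::cout << "dollars,country,expected\n";
    for (double amount : amounts) {
        for (const auto& country : countries) {
            double expected = amount * country.second;   // assumed oracle rule
            std::cout << amount << "," << country.first << ","
                      << expected << "\n";
        }
    }
    return 0;
}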
It is important for automation to relinquish the control back to test
engineers in situations where a further set of actions to be taken are
not known.
As the objective of testing is to catch defects early, the automated
tests can be given to developers so that they can execute them as part
of unit testing.

9.5. DISADVANTAGES OF AUTOMATED TESTING


Despite many benefits, the pace of test-automation is slow. Some of its
disadvantages are given below:
1. An average automated test suite development is normally 3-5 times
the cost of a complete manual test cycle.
2. Automation is too cumbersome. Who would automate? Who would
train? Who would maintain? This complicates the matter.
3. In many organizations, test automation is not even a discussion
issue.
4. There are some organizations where there is practically no
awareness or only some awareness on test automation.
5. Automation is not a high priority item for management. It does not
make much difference to many organizations.
6. Automation would require additional trained staff. There is no staff
for the purpose of automation.
Automation actually allows testing professionals to concentrate on
their real profession of creating tests and test-cases rather than doing
the mechanical job of test execution.

9.6. SKILLS NEEDED FOR USING AUTOMATED TOOLS


The skills required depends on what generation of automation the
company is in.
1. Capture/playback and test harness tools (first generation).
2. Data driven tools (second generation).
3. Action driven (third generation).

We discuss these three generations on which skills depends, one by


one.

I. Capture/Playback and Test Harness Tools


One of the most boring and time-consuming activities during testing
life cycle is to rerun manual tests a number of times. Here,
capture/playback tools are of great help to the testers. These tools do
this by recording and replaying the test input scripts. As a result, tests
can be replayed unattended for long hours, especially during
regression testing. Also, these recorded test scripts can be edited as per
need, i.e., whenever changes are made to the software. These tools can
even capture human operations, e.g., mouse activity, keystrokes, etc.
A capture/playback tool can be either intrusive or non intrusive.
Intrusive capture/playback tools are also called native tools as they
along with software-under-test (SUT) reside on the same machine.
Non intrusive capture/playback tools, on the other hand, reside on
a separate machine and are connected to the machine containing the
software to be tested using special hardware. One advantage of these
tools is to capture errors that users frequently make and developers
cannot reproduce.
Test harness tools are the category of capture/playback tools used
for capturing and replaying a sequence of tests. These tools enable
running a large volume of tests unattended and help in generating
reports by using comparators. These tools are very important at CMM
level-2 and above.

II. Data-driven Tools


This method helps in developing test scripts that generate the set of
input conditions and corresponding expected output. The approach
takes as much time and effort as the product. However, changes to the
application do not require the automated test cases to be changed as
long as the input conditions and expected output are still valid. This
generation of automation focuses on input and output conditions using
the black-box testing approach.
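A minimal sketch of a data-driven test script is shown below in C++; it reads input conditions and expected outputs from a data file and replays them against the unit under test. The file name, the comma-separated numeric format, and the convert() function are assumptions made only for illustration.

#include <cmath>
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical unit under test: applies an exchange rate to a dollar amount.
double convert(double dollars, double rate) { return dollars * rate; }

int main() {
    // Each line of the (assumed) data file holds three numeric fields:
    // dollars,rate,expected
    std::ifstream data("testdata.csv");
    std::string line;
    int failures = 0;
    while (std::getline(data, line)) {
        std::istringstream row(line);
        std::string dollars, rate, expected;
        if (!std::getline(row, dollars, ',') ||
            !std::getline(row, rate, ',') ||
            !std::getline(row, expected, ','))
            continue;                                   // skip malformed rows
        double actual = convert(std::stod(dollars), std::stod(rate));
        if (std::fabs(actual - std::stod(expected)) > 1e-6) {
            std::cout << "FAIL on row: " << line << "\n";
            ++failures;
        }
    }
    return failures;                                    // nonzero exit flags failures
}

Because the test logic never changes when new rows are added, extending coverage is only a matter of adding data, which is the essence of the data-driven approach.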

III. Action-driven Tools


This technique enables a layman to create automated tests. There are
no input and expected output conditions required for running the tests.
All actions that appear on the application are automatically tested,
based on a generic set of controls defined for automation.
From the above approaches/generations of automation, it is clear
that different levels of skills are needed based on the generation of
automation selected.

9.7. TEST AUTOMATION: “NO SILVER BULLET”


Test automation is a partial solution and not a complete solution. One
does not go into automation because it is easy. It is a painful and
resource-consuming exercise but once it is done, it has numerous
benefits. For example, developing software to automate inventory
management may be a time-consuming, painful, costly, and resource-
intensive exercise, but once done, inventory management becomes a
breeze.
Of course, test-automation is “no silver bullet.” It is only one of
the many factors that determine software quality.

9.8. DEBUGGING
Debugging occurs as a consequence of successful testing. That is,
when a test case uncovers an error, debugging is the process that
results in the removal of the error. After testing, we begin an
investigation to locate the error, i.e., to find out which module or
interface is causing it. Then that section of the code is to be studied to
determine the cause of the problem. This process is called debugging.
Debugging is an activity of locating and correcting errors.
Debugging is not testing but it always occurs as a consequence of
testing. The debugging process begins with the execution of a test
case. The debugging process will always have one of the two
outcomes.
1. The cause will be found, corrected, and removed.
2. The cause will not be found.
During debugging, we encounter errors that range from mildly
annoying to catastrophic. Some guidelines that are followed while
performing debugging are:
1. Debugging is the process of solving a problem. Hence, individuals
involved in debugging should understand all of the causes of an
error before starting with debugging.
2. No experimentation should be done while performing debugging.
The experimental changes often increase the problem by adding
new errors in it.
3. When there is an error in one segment of a program, there is a high
possibility of the presence of another error in that program. So, the
rest of the program should be properly examined.
4. It is necessary to confirm that the new code added in a program to
fix errors is correct. And to ensure this, regression testing should be
performed.

The Debugging Process


During debugging, errors are encountered that range from less
damaging (like input of an incorrect function) to catastrophic (like
system failure, which leads to economic or physical damage). The
various levels of errors and their damaging effects are shown in Figure
9.1.
FIGURE 9.1 Level of Error and its Effect.

Note that in this graph, as the number of errors increases, the


amount of effort to find their causes also increases.
Once errors are identified in a software system, to debug the
problem, a number of steps are followed:
Step 1. Identify the errors.
Step 2. Design the error report.
Step 3. Analyze the errors.
Step 4. Debugging tools are used.
Step 5. Fix the errors.
Step 6. Retest the software.
After the corrections are made, the software is retested using
regression tests to ensure that no new errors have been introduced during the
debugging process.
Please note that debugging is an integral component of the
software testing process. Debugging occurs as a consequence of
successful testing and revealing the bugs from the software-under-test
(SUT). When a test case uncovers an error, debugging is the process
that results in the removal of the bugs. Also note that debugging is not
testing, but it always occurs as a consequence of testing. The
debugging process begins with the execution of a test case. This is
shown in Figure 9.2.

FIGURE 9.2 Overall Debugging Life Cycle.

The debugging process attempts to match symptom with cause


thereby leading to error correction. The purpose of debugging is to
locate and fix the offending code responsible for a symptom violating a
known specification.
Testing uses unit-, integration-, and system-level approaches for
doing fault detection. On the other hand, debugging checks the
correctness and the performance of software to do fault detection. The
differences between these two techniques are shown in Table.

S. no. Software testing Debugging


1. It is the process of executing the It is an activity of locating and
program with the intent of correcting errors.
finding faults.
2. It is a phase of the software It occurs as a consequence of
development life cycle (SDLC). testing.
3.
Once code is written, testing The debugging process begins
commences. with the execution of a test case.
4. It is further comprised of It attempts to match symptom with
validation and verification of cause, thereby leading to error
software. correction.
5. It uses unit-, integration-, and It checks the correctness and
system-level testing. performance of the software.

Debugging Approaches
Several approaches have been discussed in literature for debugging
software-under-test (SUT). Some of them are discussed below.
1. Brute Force Method: This method is most common and least
efficient for isolating the cause of a software error. We apply this
method when all else fails. In this method, a printout of all registers
and relevant memory locations is obtained and studied. All dumps
should be well documented and retained for possible use on
subsequent problems.
2. Back Tracking Method: It is a fairly common debugging
approach that can be used successfully in small programs.
Beginning at the site where a symptom has been uncovered, the
source code is traced backward until the site of the cause is found.
Unfortunately, as the number of source lines increases, the number
of potential backward paths may become unmanageably large.
3. Cause Elimination: The third approach to debugging, cause
elimination, is manifested by induction or deduction and introduces
the concept of binary partitioning. This approach is also called
induction and deduction. Data related to the error occurrence are
organized to isolate potential causes. A cause hypothesis is devised
and the data are used to prove or disprove the hypothesis.
Alternatively, a list of all possible causes is developed and tests are
conducted to eliminate each. If initial tests indicate that a particular
cause hypothesis shows promise, the data are refined in an attempt
to isolate the bug.
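The binary partitioning idea behind cause elimination can be illustrated with a small C++ sketch that repeatedly halves a set of suspect inputs, keeping the half that still reproduces the failure, until a single cause is isolated. The failure oracle below is purely hypothetical, and the sketch assumes a single culprit input.

#include <iostream>
#include <vector>

// Hypothetical failure oracle: returns true if running the program on this
// batch of inputs reproduces the error being debugged.
bool failsOn(const std::vector<int>& inputs) {
    for (int x : inputs)
        if (x == 42) return true;      // assume input 42 is the hidden culprit
    return false;
}

int main() {
    // Data related to the error occurrence, organized as a list of suspects.
    std::vector<int> suspects = {3, 7, 42, 19, 8, 5};

    // Binary partitioning: keep whichever half still reproduces the failure.
    while (suspects.size() > 1) {
        std::vector<int> left(suspects.begin(),
                              suspects.begin() + suspects.size() / 2);
        std::vector<int> right(suspects.begin() + suspects.size() / 2,
                               suspects.end());
        suspects = failsOn(left) ? left : right;
    }
    std::cout << "Isolated cause: input " << suspects.front() << "\n";
    return 0;
}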
Tools for Debugging: Each of the above debugging approaches can
be supplemented with debugging tools. For debugging we can apply a
wide variety of debugging tools such as debugging compilers,
dynamic debugging aids, automatic test case generators, memory
dumps, and cross reference maps. The following are the main
debugging tools:
1. Turbo Debugger for Windows: The first debugger that comes to
mind when you think of a tool especially suited to debug your
Delphi code is Borland’s own Turbo Debugger for Windows.
2. Heap Trace: Heap Trace is a shareware heap debugger for Delphi
1.x and 2.x applications that enables debugging of heap memory
use. It helps you to find memory leaks, dangling pointers, and
memory overruns in your programs. It also provides optional
logging of all memory allocations, de-allocations, and errors. Heap
trace is optimized for speed, so there is only a small impact on
performance even on larger applications. Heap trace is
configurable and you can also change the format and destination of
logging output, choose what to trace, etc. You can trace each
allocation and de-allocation and find out where each block of
memory is being created and freed. Heap trace can be used to
simulate out of memory condition to test your program in stress
conditions.
3. MemMonD: MemMonD is another shareware memory monitor
for Delphi 1.x applications. It is a stand-alone utility for
monitoring memory and stack use. You don’t need to change your
source code but only need to compile with debug information
included. MemMonD detects memory leaks and deallocations with
wrong size.
4. Re-Act: Re-Act is not really a debugger for Delphi but more a
Delphi component tester. However, this tool is a real pleasure to
use. So we could not resist including it in this list of debugging
tools. Currently, Re-Act is available for Delphi 2 only, but a 16-bit version
is in the works. Re-Act version 2.0 is a really nice component. It
can view and change properties at run time with the built in
component inspector, monitor events in real time, set breakpoints,
and log events visually and dynamically. You can find elusive
bugs, evaluate third-party components, and learn how poorly
documented components really work. If you build or purchase
components you need this tool. It’s totally integrated with Delphi
2.0 and the CDK 2.0.

9.9 CRITERIA FOR SELECTION OF TEST TOOLS


The main criteria for the selection of test tools are given below:
1. Meeting requirements
2. Technology expectations
3. Training/skills
4. Management aspects
We now discuss these criteria one by one.
1. Meeting Requirements
a. There are many tools available in the market today but rarely
do they meet all the requirements of a given product or a given
organization. Evaluating different tools for different
requirements involves lot of effort, money, and time. A huge
delay is involved in selecting and implanting test tools.
b. Test tools may not provide backward or forward compatibility
with the product-under-test (PUT).
c. Test tools may not go through the same amount of evaluation
for new requirements. Some tools had Y2K-problems too.
d. A number of test tools cannot distinguish between a product
failure and a test failure. This increases analysis time and
manual testing. The test tools may not provide the required
amount of trouble-shooting/debug/error messages to help in
analysis. For example, in the case of GUI testing, the test tools
may determine the results based on messages and screen
coordinates at run-time. So, if the screen elements of the
product are changed, it requires the test suite to be changed.
The test tool must have some intelligence to proactively find
out the changes that happened in the product and accordingly
analyze the results.
2. Technology Expectations
a. In general, test tools may not allow test developers to
extend/modify the functionality of the framework. It involves
going back to the tool vendor with additional cost and effort.
Very few tools available in the market provide source code for
extending functionality or fixing some problems. Extensibility
and customization are important expectations of a test tool.
b. A good number of test tools require their libraries to be linked
with product binaries. When these libraries are linked with the
source code of the product, it is called the “instrumented code.”
This causes portions of testing to be repeated after those
libraries are removed, as the result of certain types of testing
will be different and better when those libraries are removed.
For example, the instrumented code has a major impact on
performance testing because the test tools introduce additional
code and there could be a delay in executing the additional
code.
c. Finally, test tools are not 100% cross-platform. They are
supported only on some OS platforms and the scripts generated
from these tools may not be compatible on other platforms.
Moreover, many of the test tools are capable of testing only the
product, not the impact of the product/test tool to the system or
network. When there is an impact analysis of the product on the
network or system, the first suspect is the test tool and it is
uninstalled when such analysis starts.
3. Training Skills
Test tools require plenty of training but very few vendors provide the
training to the required level. Organization-level training is needed to
deploy the test tools, as the users of the test suite are not only the test
team but also the development team and other areas like SCM
(software configuration management). Test tools expect the users to
learn new language/scripts and may not use standard
languages/scripts. This increases skill requirements for automation and
increases the need for a learning curve inside the organization.
4. Management Aspects
A test tool increases the system requirement and requires the hardware
and software to be upgraded. This increases the cost of the already-
expensive test tool. When selecting the test tool, it is important to note
the system requirements and the cost involved in upgrading the
software and hardware needs to be included with the cost of the tool.
Migrating from one test tool to another may be difficult and requires a
lot of effort. This is difficult not only because the test suite already
written cannot be used with other test tools but also because of the cost
involved. The tools are expensive, and unless management feels that
the return on investment (ROI) is justified, changing tools is
generally not permitted.
Deploying a test tool requires as much effort as deploying a
product in a company. However, due to project pressures, the effort for
deploying the test tool gets diluted or not spent at all. Later, this becomes one
of the reasons for delay or for automation not meeting expectations.
The support available on the tool is another important point to be
considered while selecting and deploying the test tool.

9.10. STEPS FOR TOOL SELECTION


There are seven steps to select and deploy a test tool in an
organization. These steps are:
Step 1. Identify your test suite requirements among the generic
requirements discussed. Add other requirements, if any.
Step 2. Make sure experiences discussed in previous sections are
taken care of.
Step 3. Collect the experiences of other organizations which used
similar test tools.
Step 4. Keep a checklist of questions to ask the vendors on
cost/effort/support.
Step 5. Identify a list of tools that meet the above requirements and
give priority for the tool that is available with the source
code.
Step 6. Evaluate and shortlist one/set of tools and train all of the test
developers on the tool.
Step 7. Deploy the tool across test teams after training all of the
potential users of the tool.

9.11. CHARACTERISTICS OF MODERN TESTING TOOLS


The modern testing tools available today have some salient features
that are discussed below:
i. It should use one or more testing strategies for performing testing on
host and target platforms.
ii. It should support GUI-based test preparation.
iii. It should provide complete code coverage and create test
documentation in various formats like .doc, .html, .rtf, etc.
iv. These tools should be able to adapt to the underlying hardware.
v. It should be easy to use.
vi. It should provide a clear and correct report on test case, test case
status (PASS/FAIL), etc.

9.12. CASE STUDY ON AUTOMATED TOOLS , NAMELY,


RATIONAL ROBOT, WIN RUNNER, SILK TEST,
AND LOAD RUNNER
Rational Robot: Its main features are:
1. Rational refers to the name of the company, Rational Software
Corporation (IBM). It is called “Robot” because like a robot it
operates application and does data-entry on screens.
2. Rational Robot is highly useful and effective in automating
screen-based testing.
3. Operations on Windows/Web GUI, data entry on screens, and
other screen operations can be automatically captured by Robot
into a script.
4. The recorded scripts can be played back.
5. Once the operations and data-entry are recorded, they can be
automatically played back anytime, dispensing with human
intervention and manual data-entry.
6. The scripts can be edited. Scripts are in SQA Basic. SQA Basic is
a powerful language providing many options. For example,
instead of recording/capturing data-entry, data can be read-in
from a repository/database and entered into target screens. One
way would be to capture screen operations and entry of just one
record into a script through Robot and then to suitably edit the
script to provide for data-entry by reading-in values from a
repository/ database.
7. Rational Robot and SQA basic provide an excellent, user-friendly
development environment.
8. GUI applications such as VS.NET, VB6.0, Oracle Forms, HTML,
and JAVA can be tested through Robot.
9. Robot uses object-oriented recording technology (OORT). It
identifies objects by their internal object names and lets you
generate scripts automatically by simply running and using the
application-under-test (AUT).
10. There are many similar tools like Mercury Interactive’s
WinRunner, Segue’s Silk Test, etc.
WinRunner: Its main features are:
1. WinRunner is a recording and playback tool of Mercury
Interactive.
2. WinRunner captures, verifies, and replays user GUI interactions
automatically.
3. Its DataDriver facilitates the process of preparing test data and
scripts. It examines and compares expected and actual results
using multiple verifications for text, GUI, bitmaps, URLs, and
databases.
4. WinRunner enables the same test to be used to validate
applications in Internet Explorer, Netscape, etc.
5. It integrates with other testing solutions of Mercury like Load
Runner for load testing and Test Director for global test
management.
6. WinRunner supports the Windows platform, Web browsers, and
various ERP/CRM applications.
7. Its recovery manager and exception handling mechanism
automatically troubleshoot unexpected events, errors, and
application crashes to ensure smooth test completion.
8. WinRunner has an interactive reporting tool, also. It generates
reports so that the results can be interpreted.
9. Testers need not modify multiple tests when the application’s
screens are modified over time. Instead, they can apply changes to
the GUI map, a central repository of test-related information, and
WinRunner will automatically propagate the changes to the relevant
scripts.
10. WinRunner supports VB6, ActiveX, Oracle, C/C++, Java,
Javascript, JDK, Delphi, Centura, and Windows platform and
Terminal Emulators like VT 100.
Segue’s Silk Test: Its main features are
1. Silk Test is a recording and playback tool of Segue Software Inc.
2. It provides user simulation capabilities, driving the application
exactly how an end-user would.
3. Its recovery system allows tests to be run unattended. If
unexpected errors such as application crashes occur, the errors are
logged, and the application is restored to its original base state so
that subsequent tests can be run.
4. Its 4Test scripting language is an object-based fourth-generation
language (4GL).
5. Silk Test supports Internet Explorer, Netscape, Windows
Platform, VB 6.0, Power Builder, etc.
LoadRunner: Its main features are:
1. LoadRunner is a load/performance testing tool of Mercury
Interactive.
2. Using minimal hardware resources, it emulates hundreds to
thousands of concurrent users with real-life user loads, monitors
performance of servers and network, and helps identify
performance bottle necks and scalability issues before they impact
end-users.
3. It measures the response times of key business processes and
transactions, captures and displays performance data from every
tier-server and component.
4. Its analysis module allows drill down to specific source of
bottlenecks and generate reports.
5. It allows organizations to minimize testing cycles, optimise
performance, and reduce risks.
6. Its WAN emulation capability allows testing for network errors,
bandwidth, and latency.
7. It supports Citrix’s ICA for testing applications deployed with
Citrix Metaframe.
8. LoadRunner supports a very wide variety of platforms.

SUMMARY
Testing is an expensive and laborious phase of the software process.
As a result, testing tools were among the first software tools to be
developed. These tools now offer a range of facilities, and their use
significantly reduces the cost of the testing process. Different testing
tools may be integrated into the testing workbench.
These tools are:
1. Test manager: It manages the running of program tests. It keeps
track of test data, expected results, and program facilities tested.
2. Test data generator: It generates test data for the program to be
tested. This may be accomplished by selecting data from a
database.
3. Oracle: It generates predictions of expected test results.
4. File comparator: It compares the results of program tests with
previous test results and reports difference between them.
5. Report generator: It provides report definition and generation
facilities for test results.
6. Dynamic analyzer: It adds code to a program to count the number
of times each statement has been executed. After the tests have
been run, an execution profile is generated showing how often each
program statement has been executed.
Testing workbenches invariably have to be adapted to the test suite
of each system. A significant amount of effort and time is usually
needed to create a comprehensive testing workbench. Most testing
work benches are open systems because testing needs are organization
specific.
Automation, however, makes life easier for testers but is not a
“silver bullet.” It helps testers through better reproduction
of test results, better coverage, and reduced effort. With automation we
can produce better and more effective metrics that can help in
understanding the state of health of a product in a quantifiable way,
thus taking us to the next change.

MULTIPLE CHOICE QUESTIONS


1. Automated testing is
a. To automate the manual testing process
b. To ensure quality of software
c. To increase costs
d. None of the above.
2. Automated tools can be used during the
a. SRS phase
b. Design phase
c. Coding phase
d. Testing and maintenance phases
e. All of the above.
3. Testing system with live data is done using
a. Static test tools
b. Dynamic test tools
c. Both (a) and (b)
d. None of the above.
4. Tools used to record and replay the test input scripts are known as
a. Test harness tools
b. Data driven tools
c. Action driven tools
d. None of the above.
5. Thread testing is used for testing
a. Real-time systems
b. Object-oriented systems
c. Event-driven systems
d. All of the above
6. Debugging is
a. An activity of locating and correcting errors
b. A process of testing
c. A process of regression testing
d. None of the above.
7. Which of the following is a debugging approach?
a. Brute-Force method
b. Inheritance
c. Data flow diagrams
d. None of the above.
8. Which of these is a debugging tool
a. Windows
b. Heap-trace
c. SCM
d. None of the above.
9. CASE stands for
a. Computer aided software engineering
b. Case aided system engineering
c. Computer aided system engineering
d. None of the above.
10. Which of these can be used for testing as a tool
a. Rational Robot
b. Waterfall model
c. MS-WORD
d. None of the above.

ANSWERS
1. a.
2. e.
3. b.
4. a.
5. d.
6. a.
7. a.
8. b.
9. a.
10. a.

CONCEPTUAL SHORT QUESTIONS WITH ANSWERS


Q. 1. List one example of a test objective and two examples of a
test requirement.
Ans. Test objective example: Creating a new order with the
system.
Test requirement example:
a. Verifying that the insertion done appears in the status bar of
the application’s window.
b. Verifying that an order number is displayed in the order
number box of the AUT (Application Under Test).
Q. 2. What is the importance of learning AUT before creating
automated test scripts?
Ans. You can record a basic test with the correct user actions in a
short amount of time.
Q. 3. How can you estimate the total number of testers required
for your project?
Ans. We have a simple formula:

If total testing hours = 100


Total no. of weeks = 4
Hours/week = 25

Then, No. of testers required = Total testing hours / (Total no. of weeks × Hours per week) = 100 / (4 × 25) = 1 tester.

Q. 4. What should you do when you evaluate test automation?


Ans. When evaluating test automation, we should:
1. Look for the tests that take the most time.
2. Look for tests that could otherwise not be run like server
tests.
3. Consider acceptance tests.
4. Look for stable application components.
Q. 5. Which is costlier—an automated test suite or a manual test
suite?
Ans. An average automated test suite development is normally 3-
5 times the cost of a complete manual test cycle.

REVIEW QUESTIONS
1. Answer the following:
a. What is debugging?
b. What are different approaches to debugging?
c. Why is exhaustive testing not possible?
2. Explain the following:
a. Modern testing tools.
3. a. Differentiate between static and dynamic testing tools with
examples in detail?
b. Will exhaustive testing guarantee that the program is 100%
correct?
4. Compare testing with debugging.
5. Differentiate between static and dynamic testing.
6. Write a short paragraph on testing tools.
7. Compare testing with debugging.
8. Explain back tracking method for debugging.
9. Differentiate between static and dynamic testing tools.
10. What are the benefits of automated testing tools over
conventional testing tools?
11. Discuss various debugging approaches with some examples.
12. a. What is debugging? Describe various debugging
approaches.
b. Differentiate between static testing tools and dynamic
testing tools.
13. Briefly discuss dynamic testing tools.
14. Write a short paragraph on any two:
a. Static testing tools.
b. Dynamic testing tools.
c. Characteristics of modern tools.
15. a. Discuss in detail automated testing and tools. What are the
advantages and disadvantages?
b. Explain in brief modern tools in the context of software
development and their advantages and disadvantages.
16. List and explain the characteristics of modern testing tools.
17. Explain modern testing tools.
18. Write a short paragraph on static and dynamic testing tools.
CHAPTER
10

TEST POINT ANALYSIS (TPA)


Inside this Chapter:
10.0. Introduction
10.1. Methodology
10.2. Case Study
10.3. TPA for Case Study
10.4. Phase Wise Breakup Over Testing Life Cycle
10.5. Path Analysis
10.6. Path Analysis Process

10.0. INTRODUCTION
There are a number of accepted techniques for estimating the size of
the software. This chapter describes the test estimate preparation
technique known as test point analysis (TPA). TPA can be applied for
estimating the size of testing effort in black-box testing, i.e., system
and acceptance testing. The goal of this technique is to outline all the
major factors that affect testing projects and to ultimately do an
accurate test effort estimation. On time project delivery cannot be
achieved without an accurate and reliable test effort estimate.
Effective test effort estimation is one of the most challenging and
important activities in software testing. There are many popular models
for test effort estimation in vogue today. One of the most popular
methods is FPA. However, this technique can only be used for white-
box testing. Organizations specializing in niche areas need an
estimation model that can accurately calculate the testing effort of the
application-under-test.
TPA is one such method that can be applied for estimating test
effort in black-box testing. It is a 6-step approach to test estimation
and planning. We believe that our approach has a good potential for
providing test estimation for various projects. Our target audience for
using this approach would be anyone who would want to have a
precise test effort estimation technique for any given application-
under-test (AUT).
Ineffective test effort estimation leads to schedule and cost
overruns. This is due to a lack of understanding of the development
process and constraints faced in the process. But we believe that our
approach overcomes all these limitations.
To this end, the problem will be approached from a mathematical
perspective. We will be implementing the following testing metrics in
C++: static test points, dynamic test points, total number of test points,
and primary test hours.
We will be illustrating TPA using the following case study.
DCM Data Systems Ltd. had a number of software products. One
of the newly developed products, vi editor, was installed locally and
abroad.
Reports and surveys showed that some of the claimed program functionality did not work adequately. The management of
the company then handed over the project to an ISO certified CMM
level 5 company, KRV&V. KRV&V decided to use the TPA method
to estimate black-box testing effort.

10.1. METHODOLOGY

10.1.1. TPA PHILOSOPHY


The effort estimation technique TPA is based on three fundamental
elements:
Size of the information system to be tested
Test strategy
Productivity
Size denotes the size of the information system to be tested.
Test strategy implies the quality characteristics that are to be tested
on each subsystem.
Productivity is the amount of time needed to perform a given
volume of testing work.

FIGURE 10.1 TPA Philosophy.

10.1.2. TPA MODEL


FIGURE 10.2 Schematic Overview of TPA.

The detailed procedure is as follows:

Step 1: Calculation of static test points (ST):


ST depends on the total FP of the information system (size) and
the static quality characteristics of the system.
ISO 9126 has listed the following quality characteristics as static:
a. Functionality
b. Usability
c. Efficiency
d. Portability
e. Maintainability
f. Reliability
Method of calculation of ST:

ST = (FP × Qi)/500

where FP = total number of function points assigned to an information system
Qi = weighing factor for statically measurable quality characteristics
500 = minimum number of FPs that can be calculated in a day

Method of Calculation of Qi: If a quality characteristic is tested statically by means of a checklist, Qi gets the value 16. For each subsequent quality characteristic to be included in a static test, another 16 is added to the Qi factor rating.
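Since the chapter plans to implement these metrics in C++, a minimal sketch of this step might look as follows; the function names are ours, and the example values (1250 FP, four statically tested characteristics) anticipate the case study later in the chapter:

#include <iostream>

// Qi: 16 for each quality characteristic that is tested statically (by checklist).
double staticQualityFactor(int staticallyTestedCharacteristics) {
    return 16.0 * staticallyTestedCharacteristics;
}

// ST = (FP * Qi) / 500, where 500 is the minimum number of FPs handled per day.
double staticTestPoints(double totalFP, double qi) {
    return (totalFP * qi) / 500.0;
}

int main() {
    double qi = staticQualityFactor(4);              // four characteristics -> Qi = 64
    std::cout << staticTestPoints(1250, qi) << "\n"; // 1250 FP -> 160 static test points
}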

Step 2: Calculation of dynamic test points (DT):


Dynamic test point is assigned to each individual function of the
system to be tested.
Mathematically,

DT = FPf × Df × QD

where DT = Dynamic test points


FPf = Number of function points assigned to the
function
Df = Weighing factor for function dependent factors
QD = Weighing factor for dynamic quality
characteristics

The details of each of these parameters is as follows:


a. FPf: It is determined during FPA. It indicates the size of each
function.
b. Df: The function dependent factors (Df) are defined per function
and a weight will be assigned to each factor. One of the given weights must be selected; intermediate weights are not allowed. If insufficient information is available to determine the weight of a given factor, the nominal weight should be assigned.

Various function dependent factors are described below:


i. User importance (Up): It implies how important the function is to
the users related to other system functions.
Rule: About 25% of functions should be placed in the high
category, 50% in normal category, and 25% in low category.

Weights:

Weight Category Description


3 LOW The importance of function relative to
other functions is low.
6 NOMINAL The importance of function relative to
other functions is nominal.
12 HIGH The importance of function relative to
other functions is high.

ii. Usage intensity (Ui): It depicts how many users process a function
and how often.

Weights:

Weight Category Description


2 LOW The function is only used a few times per
day or per week.
4 NOMINAL The function is being used a great many
times per day.
12 HIGH The function is used continuously
throughout the day.

iii. Interfacing (I): It implies how much one function affects the other
parts of the system. The degree of interfacing is determined by
first ascertaining the logical data sets (LDSs) which the function in
question can modify, then the other functions which access these
LDSs. An interface rating is assigned to a function by reference to
a table in which the number of LDSs affected by the function are
arranged vertically and the number of the other functions accessing
LDSs are arranged horizontally. When working out the number of
“other functions” affected, a given function may be counted
several times if it accesses several LDSs, all of which are
maintained by the function for which the calculation is being
made.

TABLE 10.1 Complexity Interface Factor Table.


where L = Low interfacing
A = Average interfacing
H = High interfacing

If a function does not modify any of the LDSs, it is given a low


interface rating.

Weights:

Weight Category Description


2 LOW The degree of interfacing associated with
the function is low.
4 NOMINAL The degree of interfacing associated with
the function is nominal.
8 HIGH The degree of interfacing associated with
the function is high.

iv. Complexity (C): The complexity of a function is determined on the


basis of its algorithm, i.e., how complex the algorithm is in a
specific function. The complexity rating of the function depends
on the number of conditions in the function's algorithm.

Conditions:
When counting the conditions, only the processing algorithm
should be considered.
Conditions which are the results of database checks such as
domain validations or physical presence checks do not count
because these are implicitly included in FPA.
Composite conditions such as “IF a AND b, THEN” have C = 2
because without the AND statement we would need 2 IF
statements.
A CASE statement with “n” cases has complexity = (n – 1) because replacing the CASE statement would require (n – 1) IF conditions.
Count only the simple conditions and not the operators for
calculating complexity (C).

Weights:

Weight Category Description


3 Simple 0-5 conditions
6 Medium 6-11 conditions
12 Complex More than 11 conditions

The sum of the nominal (medium) weights for these four factors comes out to be 20; this value is used as the divisor in the Df calculation below.
v. Uniformity (U): It checks the reusability of the code. A uniformity
factor of 0.6 is assigned in case of the 2nd occurrence (reuse) of
unique, clone, and dummy functions. Otherwise in all cases a
uniformity factor 1 is assigned.
Method of calculation of Df: The Df factor is calculated by adding together the ratings of the first four function-dependent factors, i.e., Up, Ui, I, and C, dividing the sum by 20 (the sum of the nominal weights of these factors), and then multiplying the result by the uniformity factor (U). A Df factor is calculated for each function:

Df = ((Up + Ui + I + C)/20) × U
where Up = User importance


Ui = Usage intensity
I = Interfacing
C = Complexity
U = Uniformity
Df for some standard functions: For some of the standard
functions like the error report function, help screen function, and menu
function, the standard number of test points can be assigned, as shown
in the Table 10.2.

TABLE 10.2 Test Points of Some Standard Functions.

c. Dynamic quality characteristics (QD):


In the TPA process, the dynamic measurable quality
characteristics are taken into account for each function.
Four dynamically explicit measurable quality characteristics are
defined in TPA. They are:
i. Functionality/suitability: Characteristics relating to the
achievement of the basic purpose for which the software is
being engineered.
ii. Security: Ability to prevent unauthorized access.
iii. Usability: Characteristics relating to the effort needed for use
and on the individual assessment of such use by a set of users.
iv. Efficiency: Characteristics related to the relationship between
the level of performance of software and the amount of
resources used.

The importance of these characteristics is rated as follows:

Rate Importance level


0 Not important
3 Relatively unimportant
4 Medium importance
5 Very important
6 Extremely important

Weights:

Quality characteristics Weight


Functionality/suitability 0.75
Security 0.05
Usability 0.10
Efficiency 0.10

The dynamic quality characteristics can also be measured


implicitly. For each dynamic quality characteristic that is tested
implicitly, a value of 0.02 is added to DT.
Method of calculation of QD:

QD = Σ (Ri × Wi), for i = 1 to 4 (the number of explicit dynamic quality characteristics)

Ri denotes the rating for each dynamically explicit measurable


quality characteristic.
Wi is the weighing factor for each of them.
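A corresponding C++ sketch for one function's dynamic test points is given below; the function names are ours, the weights 0.75/0.05/0.10/0.10 are the ones tabulated above, and the 0.02 addition for implicitly tested characteristics is left out for brevity:

#include <iostream>

// Df = ((Up + Ui + I + C) / 20) * U; 20 is the sum of the nominal weights.
double dependencyFactor(double up, double ui, double interfacing,
                        double complexity, double uniformity) {
    return ((up + ui + interfacing + complexity) / 20.0) * uniformity;
}

// QD = sum of (rating * weight) over the four explicit dynamic quality characteristics.
double dynamicQualityFactor(double suitability, double security,
                            double usability, double efficiency) {
    return 0.75 * suitability + 0.05 * security + 0.10 * usability + 0.10 * efficiency;
}

// DT = FPf * Df * QD for a single function.
double dynamicTestPoints(double fpF, double df, double qd) {
    return fpF * df * qd;
}

int main() {
    double df = dependencyFactor(6, 4, 4, 6, 1.0); // an all-nominal, non-reused function
    double qd = dynamicQualityFactor(4, 6, 5, 4);  // ratings on the 0-6 importance scale
    std::cout << dynamicTestPoints(100, df, qd) << "\n"; // DT for a 100-FP function (= 420)
}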

Step 3: Calculation of total number of test points (TP):

TP = ΣDT + ST = ΣDT + (FP × Qi)/500
where TP = Total number of test points assigned to the system
as a whole.
ST = Total number of static test points.
ΣDT = Total number of dynamic test points assigned to
all functions in the information system.
FP = Total number of function points assigned to an
information system (minimum value is 500).
Qi = Weighing factor for statically measurable quality
characteristics.

Step 4: Productivity factor (PF):


The productivity factor is defined as the number of test hours
required per test point.
Mathematically,

PF = Number of test hours / Number of test points

PF is a measure of the experience, knowledge, and skill of the test team. Less experienced testers have less product knowledge and thus will take more time to complete testing.
PF can vary from one organization to another organization. So,
this factor can be called the organization dependent factor.
PF determination requires historical data of the projects.
PF values:

PF value Description
0.7 If test team is highly skilled
2.0 If test team has insufficient skills
Step 5: Environmental factor (EF):
The number of test hours required for each test point is not only
influenced by PF but also by the environmental factor.
EF includes a number of environmental variables.
The following EFs might affect the testing effort:
a. Test tools: It reflects the extent to which the testing is
automated, i.e., how much of the primary testing activities
employ automatic tools for testing.

Rating:

Rate Description
1 Testing involves the use of SQL, record, and
playback tool. These tools are used for test
specification and testing.
2 Testing involves the use of SQL only. No record and
playback tool is being used. Tools are used for test
specification and testing.
4 No testing tools are available.

b. Development testing: The development testing variable reflects


the quality ratings of earlier testing. The more thoroughly the
preceding test is done, the less likely one is to encounter time
consuming problems.

Ratings:

Rate Description
2 A development testing plan is available and the
testing team is familiar with the actual test cases and
results.
4 If development testing plan is available.
8 If no development testing plan is available.
c. Test basis: The test basis variable reflects the quality of
documentation upon which the test under consideration is
based. This includes documents like SRS, DFD, etc. The more
detailed and higher quality the documentation is, the less time
is necessary to prepare for testing (preparation and
specification phases).

Ratings:

Rate Description
3 Documentation standards and documentation
templates are used, inspections are carried out.
6 Documentation standards and documentation
templates are used.
12 No documentation standards and templates are used.

d. Development environment: It reflects the nature of the


environment within which the information system was
realized, i.e., the environment in which the code was
developed. It also signifies the degree to which the
development environment will have prevented errors and
inappropriate working methods.

Ratings:

Rate Description
2 System was developed in 4GL programming
language with integrated DBMS.
4 System was developed using 4GL and 3GL
programming language.
8 System was developed using only 3GL programming
language such as COBOL and PASCAL.

e. Test environment: This variable depicts the extent to which the


test infrastructure in which the testing is to take place has been
tried out. Fewer problems and delays are likely during the
execution phase in a well tried and tested infrastructure.

Ratings:

Rate Description
1 Environment has been used for testing several times
in the past.
2 Test environment is new but similar to earlier used
environment.
4 Test environment is new and setup is experimental.

f. Testware: Testware variable reflects the extent to which the


tests can be conducted using existing testware where testware
includes all of the testing documentation created during the
testing process, for example, test specification, test scripts, test
cases, test data, and environment specification.

Ratings:

Rate Description
1 A usable, general initial data set and specified test
cases are available for test.
2 Usable, general initial data set available.
4 No usable testware is available.

Method of calculation of EF: EF is calculated by adding together the ratings for the various environmental variables, i.e., test tools, development testing, test basis, development environment, test environment, and testware, and then dividing the sum by 21 (the sum of the nominal ratings). Mathematically,

EF = (Rating on test tools + development testing + test basis + development environment + test environment + testware) / 21
Step 6: Calculation of primary test hours (PT): The number of primary test hours is obtained by multiplying the number of test points by the productivity factor (PF) and the environmental factor (EF):

PT = TP × PF × EF
where PT = Total number of primary test hours


TP = Total number of test points assigned to the system
as a whole
PF = The productivity factor
EF = The environmental factor

Step 7: Control factor (CF): The standard value of CF is 10%. The


CF value may be increased or decreased depending on the following 2
factors:
1. Team size (T)
2. Management tools (planning and control tools) (C)

Team Size (T): This factor reflects the number of people making up
the team, i.e., test manager, test controller, test contractors, and part-
time testers. The bigger the team, the more effort it will take to
manage the project.

Ratings:

Rate Description
3 The team consists of up to 4 persons.
6 The team consists of between 5 and 10 persons.
12 The team consists of more than 10 persons.

Management tools (planning and control tools) (C): This variable


reflects the extent to which automated resources are to be used for
planning and control. The more tools that are used to automate management and planning, the less effort is required.

Ratings:
Rate Description
2 Both an automated time registration system and
automated defect tracking system are available.
4 Either an automated time registration system or
automated defect tracking system is available.
8 No automated systems are available.

Method of calculation of CF: The planning and control management percentage is obtained by adding together the ratings for the two influential factors, i.e., team size and planning and control (management) tools. Mathematically,

CF (%) = Rating on team size + Rating on management tools
This will help us in calculating:


i. The test allowance in hours (THA).
ii. The total test hours (TTH).
The “allowance” in hours is calculated by multiplying the primary test hour count by this percentage: THA = PT × CF(%).

Step 8: Calculation of total test hours (TTH): The “total number of test hours” is obtained by adding the primary test hours and the planning and control allowance:

TTH = PT + THA
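These last two steps reduce to a few lines of C++; the sketch below assumes the control factor is expressed directly as a percentage (team-size rating plus management-tools rating), as in the case study that follows:

#include <iostream>

// CF (%) = rating on team size + rating on management (planning and control) tools.
double controlFactorPercent(double teamSizeRating, double managementToolsRating) {
    return teamSizeRating + managementToolsRating;
}

// THA = PT * CF%; TTH = PT + THA.
double totalTestHours(double primaryTestHours, double cfPercent) {
    double allowance = primaryTestHours * cfPercent / 100.0; // planning and control allowance
    return primaryTestHours + allowance;
}

int main() {
    // With PT = 2953 hours and ratings 6 + 2 = 8%, TTH is about 3189 hours.
    std::cout << totalTestHours(2953, controlFactorPercent(6, 2)) << "\n";
}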

Phasewise breakdown: If a structured testing approach is used, the


testing process can be divided into 5 life cycle phases:

Phase Estimate
I. Planning and control THA
II. Preparation 10%

III. Specification 40%

IV. Execution 45%

V. Completion 5%

The breakdown between phases can vary from one organization to


another or even from one organizational unit to another. Suitable
phase percentages can be calculated by analyzing completed test
projects. Therefore, historical data on such projects is necessary for
breaking down the total estimate.
Experience with the TPA technique suggests that the percentages
given in the above Table are generally appropriate.

10.2. CASE STUDY


Next, we apply the above stated methodology to our case study (given
earlier).

Step 1: System study by KRV&V Company: KRV&V requests a 2-


day systems and requirements study to understand the scope of testing
work and assess the testing requirements to arrive at a TPA estimate.
This study and discussions with DCM DATA Systems Ltd. reveal the
following:
a. User importance: It was observed that 20% of the functionality of
the product is of low importance to user, 60% is of medium
importance, and 20% is of high importance.
b. Usage intensity of the functionality: 10% of functions are less used,
70% are medium used, and 20% are extensively used.
c. Interfacing of functions with other functions: 50% of functions are
almost independent and do not interface with other functions. The
remaining 50% are highly dependent and interface with 50% of
independent functions.
d. Complexity of functions: All of the functions are of medium
complexity.
e. Structural uniformity (uniformity factor): 40% of test cases are
repetitive/dummy.
f. Quality characteristics:
Suitability- Medium importance
Security- Extremely important
Usability- Highly important
Efficiency- Medium importance

Step 2: Environmental assessment by KRV&V Company:


KRV&V carried out environmental assessment and noted the
following:
KRV&V would use a query language (SQL) and record and playback tools in testing. (Rate = 1)
The development test plan is available from DCM DataSystems
Ltd. But the test team is not familiar with the earlier test cases
executed and results. (Rate = 4)
Documentation templates and documentation standards were used
by DCM but due to shortage of time for product release, no
inspection of documentation was carried out. (Rate = 6)
Product is developed in 4GL with integrated databases. (Rate = 2)
Test environment is new for KRV&V but is similar to earlier used
test environment. (Rate = 2)
Testware: DCM would make the existing testware available to
KRV&V but the testware consists of only general initial data set
and not specific test cases. (Rate = 2)

Step 3: Planning and control technique followed by KRV&V:


Use of 5-tester team for similar assignments. (Rate = 6)
Use of automated time registration and automated defect tracking
tools (Rate = 2).

10.3. TPA FOR CASE STUDY


I. Dynamic test points (DT): From Eqn. (2), we have:
DT = FPf * Df * QD
where, FPf = Transaction FP = 600 (given)
Df = Dependency factor = weighted rating on importance to
user, usage intensity, interfacing of functions, and
complexity of functions.
Rating on User Importance (Up):
Up = 3 * 20% + 6 * 60% + 12 * 20%
= 0.6 + 3.6 + 2.4 = 6.6
Rating on Usage Intensity (Ui):
Ui = 2 * 10% + 4 * 70% + 12 * 20%
= 0.2 + 2.8 + 2.4 = 5.4
Rating on Interfacing (I):
I = 2 * 50% + 8 * 50% = 5
Rating on Complexity (C):
C = 6 (all functions are of medium complexity)
Rating on Uniformity (U):
U = 0.6 * 40% + 1 * 60% = 0.24 + 0.6 = 0.84
Putting these values into the Df equation, we get:

Df = ((6.6 + 5.4 + 5 + 6)/20) * 0.84 = (23/20) * 0.84 = 1.15 * 0.84 ≈ 0.97

and QD (dynamic quality characteristic) = weighted score on the


following 4 quality characteristics:
Suitability (weight = 0.75, medium importance – rate = 4)
Security (weight = 0.05, extremely important – rate = 6)
Usability (weight = 0.10, highly important – rate = 5)
Efficiency (weight = 0.10, medium importance – rate = 4)
So, weighted score = (0.75 * 4 +0.05 * 6 + 0.10 * 5 + 0.10 * 4)
= 3 + 0.3 + 0.5 + 0.4 = 4.2
So, QD = 4.2

Hence, DT (total dynamic test points)

DT = FPf * Df * QD = 600 * 0.97 * 4.2 = 2444.4

II. Static test points (ST):

Now, Total FP = Data FP + Transaction FP = 650 + 600 = 1250
ST = (Total FP × Qi)/500 = (1250 × 64)/500 = 160
(taking Qi = 64, i.e., four statically tested quality characteristics at 16 each)
III. Total test points (TP):

TP = DT + ST = 2444.4 + 160 = 2604.4

IV. Productivity factor (PF): PF = 1.4 test hours per test point

V. Environmental factor (EF):

EF = (1 + 4 + 6 + 2 + 2 + 2)/21 = 17/21 ≈ 0.81

where
Rating on test tools = 1
Rating on development testing = 4
Rating on test basis = 6
Rating on development environment = 2
Rating on test environment = 2
Rating on testware = 2

VI. Primary test hours (PT):
PT = TP * PF * EF = 2604 * 1.4 * 0.81 ≈ 2953
VII. Planning and control allowance:
= Rating on team size factor + Rating on management tools
factor
= 6% + 2% = 8%
VIII. Total test hours:
= Primary test hours + 8% of primary test hours
= 2953 + 8% of 2953 = 3189 hours
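Pulling the whole calculation together, the short C++ program below reproduces the chapter's figures; it deliberately mirrors the text's intermediate rounding (Df to 0.97, TP truncated to 2604, EF to 0.81), and Qi = 64 is the assumption already noted under Step II:

#include <cmath>
#include <iostream>

int main() {
    // Function-dependent ratings for the vi editor case study.
    double up = 3 * 0.20 + 6 * 0.60 + 12 * 0.20;   // user importance  = 6.6
    double ui = 2 * 0.10 + 4 * 0.70 + 12 * 0.20;   // usage intensity  = 5.4
    double in = 2 * 0.50 + 8 * 0.50;               // interfacing      = 5.0
    double c  = 6;                                 // medium complexity
    double u  = 0.6 * 0.40 + 1.0 * 0.60;           // uniformity       = 0.84

    double df = (up + ui + in + c) / 20.0 * u;     // ~0.966
    df = std::round(df * 100) / 100;               // the text rounds Df to 0.97

    double qd = 0.75 * 4 + 0.05 * 6 + 0.10 * 5 + 0.10 * 4;  // = 4.2
    double dt = 600 * df * qd;                     // transaction FP = 600 -> 2444.4

    double st = 1250 * 64 / 500.0;                 // total FP = 1250, Qi = 64 (assumed) -> 160
    double tp = std::floor(dt + st);               // 2604 (the text truncates 2604.4)

    double pf = 1.4;                                                      // productivity factor
    double ef = std::round((1 + 4 + 6 + 2 + 2 + 2) / 21.0 * 100) / 100;  // EF = 17/21 ~ 0.81
    double pt = std::round(tp * pf * ef);          // ~2953 primary test hours
    double tth = std::round(pt * 1.08);            // +8% planning and control -> ~3189

    std::cout << "Primary test hours: " << pt << "\nTotal test hours: " << tth << "\n";
}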

10.4. PHASE WISE BREAKUP OVER TESTING LIFE


CYCLE
For this case study, the breakup is as follows:

Phases Time required (hours)


Plan (10%) 319
Development (40%) 1276
Execute (45%) 1435
Management (5%) 159

10.5. PATH ANALYSIS


Path analysis is a process specially developed for turning use cases
into test cases. This is a tool for the testers who are required to test
applications that have use cases as requirements. Use case can be
executed in many possible ways and each thread of execution through
a use case is called a path. Each use-case path is derived from a
possible combination of following use-case elements:
Basic courses
Alternate courses
Exception courses
Extends and uses relationships
Thus, any use case has multiple paths and each path through the
use case is a potential test case.
Path analysis is a scientific technique to identify all possible test
cases without relying on intuition and experience. It reduces the
number of test cases and the complexity and bulk of the test script by
moving the non-path related variability of test cases to test data. It
simplifies the coupling of test cases to test data. As an example, if we
have a use case with 10 paths and each path needs to be tested with 5
sets of data in traditional methodology we would have written 10 × 5
= 50 test cases. Using path analysis we write 10 test cases (one per
path) and 10 data tables (each with 5 columns). The advantages can be
summarized as
Better coverage through scientific analysis
Both positive and negative test cases can be easily identified
through path analysis
Easier coupling of path and data
Reduction of test cases
Better management of testing risks
Extremely effective in identifying and correcting use-case errors

10.6. PATH ANALYSIS PROCESS


The path analysis process consists of following four major steps:
1. Draw flow diagram for the use case
2. Determine all possible paths
3. Analyze and rank the paths
4. Decide which paths to use for testing

Step 1: Draw flow diagram for the use case:

A use case is a textual document and hence cannot be easily analyzed.


The first task is to convert this into a graphical format called a flow
diagram. Let us define some concepts.
Flow diagram: Graphical depiction of the use case.
Node: The point where flow deviates or converges.
Branch: Connection between two nodes. Each branch has a
direction attached to it. Node is also where more than one branch
meets.
In order to illustrate this process, we will take the example of the use
case that is given in Figure A.

Each use case has only one start and can have multiple end points.
Using UML terminology, the start is indicated by a plain circle and a
circle with a dot inside indicates the end.
In order to draw a diagram for the use case the following steps
should be followed:
1. Draw the basic flow.
Identify nodes
Combine sequential steps into one branch
Annotate the branches with the text summarizing the action in
those branches
Connect the nodes indicating the direction of flow
2. Repeat the step above for each alternate and exception flow.
The complete step-by-step process is illustrated in the attached
diagram at Figure B. The use-case example of Figure A has been
used to illustrate the process. As explained above, the flow diagram
is an excellent tool to identify the use-case flow and other problems
in the early stages. This feature of the process can save a lot of time
and effort in the earlier stages of the software process. Figure 10.3
also covers how to identify use-case problems and then correcting
them early in the software process.

Step 2: Determine all possible paths:


As discussed earlier, use-case path is a single thread of execution
through the use case. The path determination process is basically very
simple and can be stated as “Beginning from the start each time, list
all possible ways to reach the end, keeping the direction of flow in
mind.” As you can see from the diagram, there could be a large
number of paths through the use cases. In some complex use cases,
especially when there are a lot of feedback branches, there could
potentially be a very large number of paths through the use case. The
aim here is to list all these paths. However, if there are too many paths,
it may be necessary to use judgement at this stage itself. The process
consists of following steps:
Path ID designation: Each path should be suitably designated as P1,
P2, etc.
Path name: This is the sequence of branch numbers that comprises
the path. For the example given above, path ID P2 has a path name
of 2,3,4,5.
Path description: This is a textual description of the complete
sequence of user actions and system responses taking place in that
path. It is a good practice to describe the path in good plain English
that is meaningful. The path description is ultimately nothing but a
description of the test case itself. All this information should be
combined in a table called Table of Paths.

Step 3: Analyze and rank the paths:


For simple use cases, which have about 10 separate paths, it is
straightforward to select all of the paths for testing and hence, this step
can be skipped. However, for some complex use cases, the number of
possible paths can easily exceed 50, in which case it may be necessary
to select only limited paths for testing. Quite frequently testers may be
able to use judgement and experience to select those paths. However, in a lot of cases, we may need some objectivity and subject matter
expert’s (SME) guidance. Typically, the SMEs are business users who
have high stakes in the project. We define two attributes for each path
called:
Frequency: This attribute can have a value of 1 to 10, with 1
being least frequent and 10 most frequent. This attribute states the
likelihood of this path being exercised by the user.
Criticality: This attribute describes how critical the failure of this
path could be with 1 being least and 10 being most.
Having defined these attributes, we can compute a path factor
which is Frequency + Criticality. This is an equal weight path factor.
However, we can provide different weights to these attributes to arrive
at a proper path factor.
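A small C++ sketch of this ranking step is given below; the path IDs, branch sequences, and attribute values are invented purely for illustration, and equal weights are assumed for frequency and criticality:

#include <algorithm>
#include <iostream>
#include <string>
#include <vector>

struct UseCasePath {
    std::string id;        // e.g., "P2"
    std::string name;      // branch sequence, e.g., "2,3,4,5"
    int frequency;         // 1 (least frequent) .. 10 (most frequent)
    int criticality;       // 1 (least critical)  .. 10 (most critical)
    int pathFactor() const { return frequency + criticality; }  // equal-weight path factor
};

int main() {
    std::vector<UseCasePath> paths = {
        {"P1", "1",       9, 10},   // basic flow
        {"P2", "2,3,4,5", 6,  7},
        {"P3", "2,3,6,1", 3,  5},
    };
    // Rank the candidate paths so that the highest path factor is considered first.
    std::sort(paths.begin(), paths.end(),
              [](const UseCasePath& a, const UseCasePath& b) {
                  return a.pathFactor() > b.pathFactor();
              });
    for (const auto& p : paths)
        std::cout << p.id << " (" << p.name << ") path factor = " << p.pathFactor() << "\n";
}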

Step 4: Selection of paths for testing:


In order to have adequate test coverage and minimize the risk of not
testing while balancing the resources, there is a need to follow some
guidelines. They are as follows:
Always select basic flow for testing as
• It is a critical functionality.
• If the basic flow fails, there may not be much of a point in testing
other paths.
• Basic flow should be included in sanity test suites and also in
acceptance testing.
Some other paths are too important to be ignored from a
functionality point of view. These are generally obvious from the
table of paths.
The path factor should be used for selecting a path among several
possibilities that do not meet the above criteria.
The aim is to select a minimum set of paths which would
adequately test the critical functionality while ensuring that all
branches have been included in at least one path. The path selection
process has been adequately explained in the example at Figure B.
Step 5: Writing test cases from the table of paths:
The table of paths provides us with adequate information for testing a
use case. The path description is in fact a summary description of the
test case itself and even a tester with very little experience can write
test cases from that description. Before proceeding further, let me
introduce one more concept called test scenario. Any path with
specific test data becomes a test scenario. Each path is generally
associated with a number of test scenarios to adequately test the data
width.
Let me take an example of use case called withdraw money from
an ATM. One of the paths in the use case that is a basic flow will have
a path description like this:

“User inserts his card in the machine, system successfully


validates the card, and prompts for 4 digit pin. User enters a valid
pin. System successfully validates the pin and prompts for amount.
User enters a valid amount. System ejects user card and correct
amount for collection by the user.”

Now this is a happy day path but as we know there may be certain
minimum and maximum amounts that a user can withdraw from the
ATM. Let us say it is $10 and $500, respectively. In order to
adequately test this path using boundary value analysis (BVA), we
need to test this withdrawal for $10, $500, and $200 (middle). Thus,
we need to create 3 test scenarios for the same path.
FIGURE 10.3 Relationship Between Path and Scenario.
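The sketch below shows, in C++, how the three boundary-value scenarios drive one and the same basic-flow test case; the $10-$500 limits and the validation function are the hypothetical business rule of this example, not part of any real ATM interface:

#include <iostream>
#include <vector>

// Hypothetical rule for the "withdraw money" basic flow: $10 <= amount <= $500.
bool isValidWithdrawal(double amount) {
    return amount >= 10.0 && amount <= 500.0;
}

int main() {
    // One path (the basic flow), three data-driven scenarios chosen by boundary value analysis.
    std::vector<double> scenarios = {10.0, 200.0, 500.0};  // lower bound, middle, upper bound
    for (double amount : scenarios)
        std::cout << "Withdraw $" << amount << " -> "
                  << (isValidWithdrawal(amount) ? "accepted (basic flow)" : "rejected") << "\n";
    // Amounts below $10 or above $500 belong to a different path (re-enter the amount),
    // so they are covered by a separate test case, not by this one.
}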

The point to note here is that all three scenarios have the exact
same path through the use case. Why not test <$10 and >$500
withdrawal in the same path? The reason we do not do it is that such
tests belong to a different path where a “user” tries to withdraw <$10
or >$500 and he gets an appropriate message and the user is prompted
to reenter an acceptable amount. The user reenters the correct amount
and the system then lets the user withdraw the money.
The following guidelines should be followed to create test cases:
1. Create one test case for each path that has been selected for testing.
As explained previously, the path description provides enough
information to write a proper test case.
2. Create multiple test scenarios within each test case. We
recommend using data tables that have data elements as rows and
each column is a data set and also a test scenario. See Appendix B
where this is explained with reference to the example.
3. Add GUI details in the steps, if necessary.

Use Case: Add/Delete/Modify student information.


Purpose/Description: This use case describes the way the
registrar can maintain student information system which consists of
adding, deleting, or modifying students from the system.
Type: Concrete
Actors/Roles: Registrar

Preconditions:
1. The registrar must be logged on the system (Use Case: Login).
2. System offers registrar choices: Add, Delete, and Modify.
Basic Course:
1. Registrar chooses to add a student in the system.
Alternate Course: Delete Student
Alternate Course: Modify Student
2. Registrar enters the following information:
Name
DOB
SS#
Status
Graduation Date
3. Registrar confirms the information.
4. System returns a unique ID number for the student.
Post Conditions: System displays the list of students with the
selection of the new student that has been added.
Alternate Courses:
Delete Student:
1. At basic course Step 1, instead of selecting to Add a student,
registrar selects to Delete a student.
2. System requests the student ID.
3. Registrar enters student ID. Alternate Course: Invalid Student ID.
4. System displays the student information.
5. System prompts the registrar to confirm the student deletion.
Alternate Course: Cancel Delete.
6. Registrar verifies the decision to delete.
7. System deletes the student from the system.
Post Conditions: Student name and other data no longer displayed to
the user.
Modify Student:
1. At basic course Step 1, instead of selecting to Add a student,
registrar selects to Modify a student.
2. System requests the student ID.
3. Registrar enters student ID. Alternate Course: Invalid Student ID.
4. System displays the student information.
5. Registrar changes any of the student attributes. Alternate Course:
Cancel Modify.
6. Registrar confirms the modification.
7. System updates the student information.
Post Conditions: Student is not deleted from the system.
Cancel Modify:
1. At Step 5 of alternate course Modify, instead of confirming the
deletion, registrar decides not to modify the student.
2. System cancels the modify.
3. Return to Step 1 of basic course.
Post Condition: Student information is not modified.
Invalid ID:
1. At Step 3 of alternate courses (Delete/Modify), system determines
that the student ID is invalid. (Collaboration case validate student
ID.)
2. System displays a message that the student ID is invalid.
3. Registrar chooses to reenter student ID.
4. Go to Step 1 of Modify/Delete course.
Post Condition: Student not deleted/student information is not
modified.
We follow all the steps for the use case of Figure A above.

Draw the Basic Flow


1. Draw start and end symbols.
Mark start and end in the diagram with UML symbols.
2. Connect with a vertical line. This is your basic flow.
The basic flow steps lie on this line.
3. Identify nodes in the basic course.
Node is a point where flow deviates. In this case start is also a
node. Why?
If there is more than one node, identify them.
4. Annotate the branches. Use text description depending on the
contents of the steps.
The branch is the line/action between the nodes. It can have
multiple steps.
5. Go to first identified node and start with first alternate or exception
course.

6. Identify nodes within each alternate course.


7. Connect the nodes within each alternate course. Arrows should
show direction of the flow.
8. Annotate each branch of the alternate course, depending on the
contents of the steps.
9. Go to each identified node in the alternate course and draw its
alternate and exception course.

10. Connect the nodes within each alternate course. Arrows should
show direction of the flow.
11. Annotate branches as before.
12. Repeat Steps 5 to 10 for other alternate or exception courses.

What Is Wrong With This Case?


Invalid ID should return to Step 3 rather than to Step 1.
Needs to have one more node.
Needs to divide each of these branches into 2 branches:
• Select Delete and Enter ID.
• Select Modify and Enter ID.

Correct the use case and the flow diagram by showing two more
nodes. The node is also a point where two or more flows meet.

What Else Is Wrong With This Use Case?


Endless loops?
Raise an issue:
How to exit from an endless loop?

What else could be wrong?


Where should Cancel Delete or Cancel Modify return to?

Number the branches:


Number basic course starting from 1.
Number each alternate course and their exceptions continuously.
List all paths:
Identify the paths.
Begin from start.
Try to go to finish in all possible ways keeping the direction of
flow in mind.

Determine all Possible Paths


Begin from start and determine all possible ways to reach end keeping
the direction of flow in mind.
List all possible paths and for each path:
Designate path ID
Designate path name
• Designate path description
• Describe what is taking place in the path.
• Use branch designations to come up with the description.
• This description is a summary of what needs to be tested.
TABLE 10.3 Partial Table of Paths.
Analyze and Rank Paths
Three Stage Process
Add Attribute Values -> Consult Business Users
Criticality
Frequency
Compute Path Factor

Path Factor = Criticality + Frequency (can use weighted attributes


also)

Sort paths as per path factor in descending order (optional).

TABLE 10.4 Adding Attributes.


Selection of Paths for Testing
Guidelines for minimizing the risk and balancing the resources.
Always test the basic flow:
• It is critical.
• Failure of the basic flow may be a show stopper.
• If the basic flow fails the test, there may be no need to test other
paths for that use case in this build.
• Good tests for sanity checking and also for inclusion in
acceptance testing.
Although not basic flows, some other paths are too important to
be missed. Test them next.
Determine a set of paths that would:
• Combine other paths in which case the path factor is cumulative.
• Ensure that all the branches are tested at least once.
This set should test all the flows and vital combinations.
Example: Path Selection

How to Select Paths

Path selected:
P1 Basic Flow
P5 or P6 : Choose one of them. I choose P6
P7 : Just to ensure testing of multiple invalid loops
P8
P11 or P12. Choose one of them. I choose P12
P13

What Did We Achieve?


By testing 6 paths out of 13 identified (may be many more) we
have almost tested everything worth testing in this use case.
Have we tested all the branches?

Path Selection
Note: In a large number of cases, you will probably be testing all use-
case paths. The prioritization process helps you in selecting critical or
more important paths for testing.

Data Table Example 1


Data Table G (For Path P7)
Note: Notice how we show valid and invalid data.

Data Table Example 2


Data Table A (For Path P1)

SUMMARY
In our opinion, one of the most difficult and critical activities in IT is
the estimation process. We believe this is because when we say that a project will be accomplished in a certain amount of time at a certain cost, it must happen. If it does not happen, several things may
follow: from peers’ comments and senior management’s warnings to
being fired depending on the reasons and seriousness of the failure.
Before even thinking of moving to systems test at our
organization, we always heard from the development group members
that the estimations made by the systems test group were too long and
expensive. We tried to understand the testing estimation process.
The testing estimation process in place was quite simple. The
inputs for the process provided by the development team were the size
of the development team and the number of working days needed for
building a solution before starting systems tests.
The testing estimation process said that the number of testing
engineers would be half of the number of development engineers and
one-third of the number of development working days.
A spreadsheet was created in order to find out the estimation and
calculate the duration of tests and testing costs. They are based on the
following formulas:
Testing working days = (Development working days)/3
Testing engineers = (Development engineers)/2
Testing costs = Testing working days * Testing engineers * Person daily costs
As the process was only playing with numbers, it was not necessary to
register anywhere how the estimation was obtained.
To show how the process worked, if one development team said
that to deliver a solution for systems testing it would need 4 engineers
and 66 working days, then the systems test would need 2 engineers
(half) and 22 working days (one-third). So the solution would be ready for delivery to the customer after 88 (66 + 22) working days.
Just to be clear, the testing time did not include the time for developing the test cases and preparing the testing environment.
Normally, it would need an extra 10 days for the testing team.
Besides being simple, that process worked fine for different
projects and years. But, we were not happy with this approach and the
development group were not either. Metrics, project analogies,
expertise, and requirements were not being used to support the
estimation process.
We mentioned our thoughts to the testing group. We could not
stand the estimation process for very long. We were not convinced to
support it any more. Then some rules were implemented in order to
establish a new process.
Those rules are shared below. We know that they are not a complete estimation method but, for now, they give us strong arguments to defend our estimates when someone doubts our numbers.
The Rules:
First Rule: Estimation should be always based on the software
requirements:
All estimation should be based on what would be tested, i.e., the
software requirements.
Normally, the software requirements were only established by the
development team without any or just a little participation from the
testing team. After the specifications have been established and the
project costs and duration have been estimated, the development team
asks how long it would take for testing the solution. The answer
should be said almost right away.
Then the software requirements should be read and understood by
the testing team, too. Without the testing participation, no serious
estimation can be considered.
Second Rule: Estimation should be based on expert judgement:
Before estimating, the testing team classifies the requirements in the
following categories:
Critical: The development team has little knowledge in how to
implement it.
High: The development team has good knowledge in how to
implement it but it is not an easy task.
Normal: The development team has good knowledge in how to
implement it.
The experts in each requirement should say how long it would take for
testing them. The categories would help the experts in estimating the
effort for testing the requirements.
Third Rule: Estimation should be based on previous projects:
All estimation should be based on previous projects. If a new project
has similar requirements from a previous one, the estimation is based
on that project.
Fourth Rule: Estimation should be based on metrics:
Our organization has created an OPD (organization process database)
where the project metrics are recorded. We have recorded metrics
from three years ago obtained from dozens of projects.
The number of requirements is the basic information for estimating
a testing project. From it, our organization has metrics that guide us to
estimate a testing project. The table below shows the metrics used to
estimate a testing project. The metrics assume a team size of one testing engineer.

Metric                                                 Value
1. Number of testcases created for each requirement    4.53
2. Number of testcases developed by working day        14.47
3. Number of testcases executed by working day         10.20
4. Number of ARs for testcase                          0.77
5. Number of ARs verified by working day               24.64

For instance, if we have a project with 70 functional requirements and


a testing team size of 2 engineers, we reach the following estimates:

Metric                                        Value
Number of testcases – based on metric 1       317.10
Preparation phase – based on metric 2         11 working days
Execution phase – based on metric 3           16 working days
Number of ARs – based on metric 4             244 ARs
Regression phase – based on metric 5          6 working days

The testing duration is estimated at 22 (16 + 6) working days, plus 11 working days for preparing it.
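The same estimate can be reproduced with a few lines of C++; the rounding policy below is ours, and with a simple ceiling per phase the regression phase comes out as 5 working days rather than the 6 reported above:

#include <cmath>
#include <iostream>

int main() {
    // Organizational metrics from the table above (per single testing engineer).
    const double casesPerRequirement  = 4.53;
    const double casesDevelopedPerDay = 14.47;
    const double casesExecutedPerDay  = 10.20;
    const double arsPerCase           = 0.77;
    const double arsVerifiedPerDay    = 24.64;

    const double requirements = 70;
    const double teamSize     = 2;

    double testCases   = requirements * casesPerRequirement;                       // ~317.1
    double prepDays    = std::ceil(testCases / casesDevelopedPerDay / teamSize);   // 11
    double execDays    = std::ceil(testCases / casesExecutedPerDay / teamSize);    // 16
    double anomalyReps = std::round(testCases * arsPerCase);                       // 244 ARs
    double regressDays = std::ceil(anomalyReps / arsVerifiedPerDay / teamSize);    // 5 (text: 6)

    std::cout << "Test cases: "  << testCases   << "\n"
              << "Preparation: " << prepDays    << " working days\n"
              << "Execution: "   << execDays    << " working days\n"
              << "ARs: "         << anomalyReps << "\n"
              << "Regression: "  << regressDays << " working days\n";
}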
Fifth Rule: Estimation should never forget the past:
We have not sent away the past. The testing team continues using the
old process and the spreadsheet. After the estimation is done following
the new rules, the testing team estimates again using the old process in
order to compare both results.
Normally, the results from the new estimate process are cheaper
and faster than the old one by about 20 to 25%. If the testing team gets
a different percentage, the testing team returns to the process in order
to understand if something was missed.
Sixth Rule: Estimation should be recorded:
All decisions should be recorded. It is very important because if
requirements change for any reason, the records would help the testing
team to estimate again. The testing team would not need to repeat all the steps and take the same decisions again. Sometimes, it is an
opportunity to adjust the estimation made earlier.
Seventh Rule: Estimation should be supported by tools:
A new spreadsheet has been created containing metrics that help to
reach the estimation quickly. The spreadsheet calculates automatically
the costs and duration for each testing phase.
There is also a letter template that contains some sections such as:
cost table, risks, and free notes to be filled out. This letter is sent to the
customer. It also shows the different options for testing that can help
the customer decide which kind of test he or she needs.
Eighth Rule: Estimation should always be verified:
Finally, all estimation should be verified. We’ve created another
spreadsheet for recording the estimations. The estimation is compared
to the previous ones recorded in a spreadsheet to see if they have
a similar trend. If the estimation has any deviation from the recorded
ones, then a re-estimation should be made.
We can conclude from this chapter that the effort calculation can
be done even for black-box testing. It is indeed a challenging activity
during software testing. Test point analysis (TPA) is one such
technique. Other techniques like use case analysis, however, can also
be used. It is also a very powerful method to generate realistic test
cases.

MULTIPLE CHOICE QUESTIONS


1. TPA stands for
a. Total point analysis
b. Test point analysis
c. Terminal point analysis
d. None of the above.
2. TPA involves
a. 2 steps
b. 4 steps
c. 6 steps
d. None of the above.
3. TPA is
a. An effort estimation technique during black-box testing.
b. A testing technique.
c. A design process.
d. None of the above.
4. TPA involves basically,
a. 2 inputs
b. 3 inputs
c. 4 inputs
d. None of the above.
5. Total test points are given by
a. Sum of static and dynamic test points.
b. Product of static and dynamic test points.
c. Standard deviation of static and dynamic test points.
d. None of the above.
6. Use cases form the basis for:
a. Design
b. Test cases
c. GUI design
d. All of the above.
7. Use cases are of two types:
a. Concrete and abstract
b. Concrete and rapid
c. Abstract and rapid
d. None of the above.

ANSWERS
1. b.
2. c.
3. a.
4. b.
5. a.
6. d.
7. a.

CONCEPTUAL SHORT QUESTIONS WITH ANSWERS


Q. 1. Define test points.
Ans. Test points allow data to be inspected or modified at various
points in the system.
Q. 2. When are use cases written? How are they useful?
Ans. Use cases have become an industry standard method of
specifying user interaction with the system. They have become
part of the requirements analysis phase of the SDLC. Use cases can be used for the derivation of test cases; the technique is called use-case path analysis. They can be written using a CASE tool like Rational Rose, but are usually written in MS Word as it is easily available.
Q. 3. What is path analysis?
Ans. Path analysis is a process specially developed for turning use
cases into test cases. It is a tool for testers. Use cases can be
executed in many possible ways. Each thread of execution
through a use case is called a path. Path analysis is a scientific
technique to identify all possible test cases without relying on
intuition and experience. It reduces the number of test cases and
the complexity and the bulk of the test script by moving the non-
path related variability of test cases to test data. Note that it
simplifies the coupling of test cases to test data.
Q. 4. If we have a use case with 15 paths, each path needs to be tested
with 6 sets of data. How many test cases have we written with
traditional methodology and with the path analysis technique?
Ans. Given that:
Number of paths = 15
Sets of data = 6
a. In traditional methodology, we have written 15 * 6 = 90 test
cases.
b. In the path analysis technique, we write 15 test cases (one per
path) and 15 data tables, each with 6 columns.
Q. 5. Give some advantages of the path analysis technique.
Ans. The advantages are as follows:
1. Better coverage through scientific analysis.
2. Both positive and negative test cases can be easily identified
through path analysis.
3. Easier coupling of path and data.
4. Test case(s) reduction.
5. Better management of testing risks.
6. Very effective in identifying and correcting use-case errors.

REVIEW QUESTIONS
1. What are the three main inputs to a TPA process? Explain.
2. With a flowchart explain the TPA model.
3. Explain the test points of some standard functions.
4. “The bigger the team, the more effort it will take to manage the
project.” Comment.
5. Write short paragraphs on:
a. Testware
b. Planning and control tools
CHAPTER
11

TESTING YOUR WEBSITES—


FUNCTIONAL AND NON-FUNCTIONAL
TESTING
Inside this Chapter:
11.0. Abstract
11.1. Introduction
11.2. Methodology

11.0. ABSTRACT
Today everyone depends on websites for business, education, and
trading purposes. Websites run on the Internet, and today hardly any work gets done without it. There are many types
of users connected to websites who need different types of
information. So, websites should respond according to the user
requirements. At the same time, the correct behavior of sites has
become crucial to the success of businesses and organizations and thus
should be tested thoroughly and frequently. In this chapter, we are
presenting various methods (functional and non-functional) to test a
website. However, testing a website is not an easy job because we
have to test not only the client-side but also the server-side. We
believe our approach will help any website engineer to completely test
a website with a minimum number of errors.

11.1 INTRODUCTION
The client end of the system is represented by a browser which
connects to the website server via the Internet. The centerpiece of all
web applications is a relational database which stores dynamic
contents. A transaction server controls the interactions between the
database and other servers (often called “application servers”). The
administration function handles data updates and database
administration.

Web Application Architecture


It is clear from Figure 11.1 that we have to conduct the following
tests:
What are the expected loads on the server and what kind of
performance is required under such loads? This may include web
server response time and database query response times.
What kinds of browsers will be used? What kinds of connection
speeds will they have? Are they intra-organization (with high-
connection speeds and similar browsers) or Internet-wide (with a
wide variety of connection speeds and browser types)?

FIGURE 11.1 Web Application Architecture.

What kind of performance is expected on the client side (e.g.,


how fast should pages appear, how fast should animations,
applets, etc. load and run)?
There are many possible terms for the web app development life cycle
including the spiral life cycle or some form of the iterative life cycle.
A more cynical way to describe the most commonly observed
approach is to describe it as the unstructured development similar to
the early days of software development before software engineering
techniques were introduced. The “maintenance phase” often fills the
role of adding missed features and fixing problems.
Will down time for server and content maintenance/upgrades be
allowed? How much?
What kinds of security (firewalls, encryptions, passwords, etc.)
will be required and what is it expected to do? How can it be
tested?
How reliable are the Internet connections? And how does that
affect the backup system or redundant connection requirements
and testing?
What processes will be required to manage updates to the
website’s content, and what are the requirements for maintaining,
tracking, and controlling page content, graphics, links, etc.?
Will there be any standards or requirements for page appearance
and/or graphics throughout a site or parts of a site?
How will internal and external links be validated and updated?
How often?
How many times will the user login and do they require testing?
How are CGI programs, Applets, Javascripts, ActiveX
components, etc. to be maintained, tracked, controlled, and
tested?
The table below shows the differences between testing a software
project that is not web based and testing a web application project.

Typical software project vs. web application project

1. Gathering user requirements:
Typical software project: What are we going to build? How does it compare to products currently available? This is typically supported by a detailed requirements specification.
Web application project: What services are we going to offer our customers? What is the best user interface and navigation to reach the most important pages with a minimum number of clicks? What are the current trends and hot technologies? This is typically based on discussions, notes, and ideas.

2. Planning:
Typical software project: How long will it take our available resources to build this product? How will we test this product? Typically involves experience-based estimation and planning.
Web application project: We need to get this product out now. Purely driven by the available time window and resources.

3. Analysis and Design:
Typical software project: What technologies should we use? Any design patterns we should follow? What kind of architecture is most suitable? Mostly based on well-known technologies and design methods. Generally complete before implementation starts.
Web application project: How should the site look? What kinds of logos and graphics will we use? How do we develop a "brand" for our site? Who is our "typical" customer? How can we make it usable? What technologies will we use? Short, iterative cycles of design in parallel with implementation activities.

4. Implementation:
Typical software project: Let us decide on the sequence of building blocks that will optimize our integration of a series of builds. Sequential development of design components.
Web application project: Let us put in the framework and hang some of the key features. We can then show it as a demo or pilot site to our customers. Iterative prototyping with transition of the prototype to a website.

5. Integration:
Typical software project: How does the product begin to take shape as the constituent pieces are bolted together? Are we meeting our requirements? Are we creating what we set out to create in the first place? Assembly of components to build the specified system.
Web application project: This phase typically does not exist. It is a point in time when prototyping stops and the site goes live.

6. Testing:
Typical software project: Have we tested the product in a reproducible and consistent manner? Have we achieved complete test coverage? Have all serious defects been resolved in some manner? Systematic testing of functionality against specifications.
Web application project: It's just a website - the designer will test it as (s)he develops it, right? How do you test a website? Make sure the links all work? Testing of implied features based on a general idea of desired functionality.

7. Release:
Typical software project: Have we met our acceptance criteria? Is the product stable? Has QA authorized the product for release? Have we implemented version control methods to ensure we can always retrieve the source code for this release? Building a release candidate and burning it to CD.
Web application project: Go live NOW! We can always add the rest of the features later! Transfer of the development site to the live server.

8. Maintenance:
Typical software project: What features can we add for a future release? What bug fixes? How do we deal with defects reported by the end-user? Periodic updates based on feature enhancements and user feedback.
Web application project: We just publish new stuff when it's ready... we can make changes on the fly, because there's no installation required. Any changes should be transparent to our users. Integral part of the extended development life cycle for web apps.

Average timeframe for the above: one to three years for a typical software project versus about 4 months for a web application project.

11.2. METHODOLOGY
11.2.1. NON-FUNCTIONAL TESTING (OR WHITE-BOX TESTING)

11.2.1.1. CONFIGURATION TESTING
This type of test includes
The operating system platforms used.
The type of network connection.
Internet service provider type.
Browser used (including version).
The real work for this type of test is ensuring that the requirements
and assumptions are understood by the development team, and that
test environments with those choices are put in place to properly test
it.

11.2.1.2. USABILITY TESTING

For usability testing, there are standards and guidelines that have been
established throughout the industry. The end-users can blindly accept
these sites because the standards are being followed. But the designer
shouldn’t completely rely on these standards. While following these
standards and guidelines during the making of the website, he or she
should also consider the learnability, understandability, and operability
features so that the user can easily use the website.

11.2.1.3. PERFORMANCE TESTING

Performance testing involves testing a program for timely responses.


The time needed to complete an action is usually benchmarked, or
compared, against either the time to perform a similar action in a
previous version of the same program or against the time to perform
the identical action in a similar program. The time to open a new file
in one application would be compared against the time to open a new
file in previous versions of that same application, as well as the time to
open a new file in the competing application. When conducting
performance testing, also consider the file size.
In this testing the designer should also consider the loading time of
the web page during more transactions. For example, a web page loads
in less than eight seconds, or can be as complex as requiring the
system to handle 10,000 transactions per minute, while still being able
to load a web page within eight seconds.
Another variant of performance testing is load testing. Load
testing for a web application can be thought of as multi-user
performance testing, where you want to test for performance slow-
downs that occur as additional users use the application. The key
difference in conducting performance testing of a web application
versus a desktop application is that the web application has many
physical points where slow-downs can occur. The bottlenecks may be
at the web server, the application server, or at the database server, and
pinpointing their root causes can be extremely difficult.
Typical steps to create performance test cases are as follows:
Identify the software processes that directly influence the overall
performance of the system.
For each of the identified processes, identify only the essential
input parameters that influence system performance.
Create usage scenarios by determining realistic values for the
parameters based on past use. Include both average and heavy
workload scenarios. Determine the window of observation at this
time.
If there is no historical data on which to base the parameter values,
use estimates based on requirements, an earlier version, or similar
systems.
If there is a parameter where the estimated values form a range,
select values that are likely to reveal useful information about the
performance of the system. Each value should be made into a
separate test case.
Performance testing can be done through the "window" of the
browser, or directly on the server. If done on the server, the time that
the browser itself consumes is not taken into consideration.
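The steps above can be automated. A minimal sketch of a multi-user load test follows, assuming Python's standard library and a hypothetical page address (http://example.com/); a real project would more likely use a dedicated load-simulation tool.

# A minimal load-test sketch: several virtual users request the same page
# concurrently and each response time is recorded.
import statistics
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

URL = "http://example.com/"   # hypothetical page under test
VIRTUAL_USERS = 20            # concurrent simulated users
REQUESTS_PER_USER = 5

def one_request(url):
    """Fetch the page once and return the elapsed time in seconds."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=30) as response:
        response.read()
    return time.perf_counter() - start

def user_session(url):
    """One virtual user issues several sequential requests."""
    return [one_request(url) for _ in range(REQUESTS_PER_USER)]

if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=VIRTUAL_USERS) as pool:
        sessions = list(pool.map(user_session, [URL] * VIRTUAL_USERS))
    times = [t for session in sessions for t in session]
    print("requests:", len(times))
    print("average response time: %.3f s" % statistics.mean(times))
    print("worst response time:   %.3f s" % max(times))
    # Flag a failure against the eight-second requirement mentioned above.
    assert max(times) < 8.0, "page exceeded the 8-second requirement"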
11.2.1.4. SCALABILITY TESTING

The term "scalability" can be defined as a web application's ability to
sustain its required number of simultaneous users and/or transactions
while maintaining adequate response times for its end users.
When testing scalability, the configuration of the server under test is
critical. All logging levels, server timeouts, etc. need to be configured.
In an ideal situation, all of the configuration files should simply be
copied from the test environment to the production environment, with
only minor changes to the global variables.
In order to test scalability, the web traffic loads must be
determined in order to know what the threshold requirement for
scalability should be. To do this, use existing traffic levels if there is
an existing website, or choose a representative arrival distribution
(exponential, constant, Poisson) to simulate how the user "load" enters
the system.
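A minimal sketch of the Poisson option follows, assuming Python's standard library; the arrival rate and observation window are illustrative values only.

# Simulate how user "load" enters the system under a Poisson arrival
# model: inter-arrival gaps are exponentially distributed.
import random

ARRIVAL_RATE = 50.0     # assumed average arrivals per minute
WINDOW_MINUTES = 10     # assumed observation window

def poisson_arrival_times(rate_per_min, window_min, seed=42):
    """Return the arrival times (in minutes) of the simulated users."""
    rng = random.Random(seed)
    t, arrivals = 0.0, []
    while True:
        t += rng.expovariate(rate_per_min)  # next inter-arrival gap
        if t > window_min:
            break
        arrivals.append(t)
    return arrivals

if __name__ == "__main__":
    arrivals = poisson_arrival_times(ARRIVAL_RATE, WINDOW_MINUTES)
    print("simulated users in", WINDOW_MINUTES, "minutes:", len(arrivals))
    per_minute = [sum(1 for a in arrivals if m <= a < m + 1)
                  for m in range(WINDOW_MINUTES)]
    # The busiest minute gives a rough idea of the burst the site must absorb.
    print("busiest minute:", max(per_minute), "arrivals")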

11.2.1.5. SECURITY TESTING

Probably the most critical criterion for a web application is that of
security. The need to regulate access to information, to verify user
identities, and to encrypt confidential information is of paramount
importance. Credit card information, medical information, financial
information, and corporate information must be protected from
persons ranging from the casual visitor to the determined hacker.
There are many layers of security from password-based security to
digital certificates, each of which has its pros and cons. The test cases
for security testing can be derived as follows:
The web server should be set up so that unauthorized users cannot
browse directories or the log files in which the website's data is
stored.
Early in the project, encourage developers to use the POST
command wherever possible, because POST does not place data in
the URL and is suited to sending large data.
When testing, check URLs to ensure that there are no
"information leaks" caused by sensitive information being placed in
the URL when a GET command is used (see the sketch after this list).
A cookie is a text file placed on a website visitor's system that
identifies the user; the cookie is retrieved when the user revisits the
site at a later time. Users can control whether or not to accept
cookies. If the user does not accept cookies, will the site still work?
Is sensitive information stored in the cookie? If multiple people use a
workstation, the second person may be able to read the sensitive
information saved from the first person's visit. Information in a
cookie should be encoded or encrypted.
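A minimal sketch of the URL check suggested above, assuming a list of captured request URLs and an illustrative set of parameter names treated as sensitive.

# Scan captured request URLs for sensitive data passed via GET, which
# would be an "information leak" in the sense described above.
from urllib.parse import parse_qs, urlparse

# Illustrative keywords; a real project would maintain its own list.
SENSITIVE_KEYS = {"password", "ssn", "creditcard", "card_number"}

def leaks_in_url(url):
    """Return the sensitive query parameters found in a URL, if any."""
    query = parse_qs(urlparse(url).query)
    return sorted(key for key in query if key.lower() in SENSITIVE_KEYS)

if __name__ == "__main__":
    captured = [
        "http://example.com/login?user=alice&password=secret",  # hypothetical
        "http://example.com/report?period=2009-Q1",
    ]
    for url in captured:
        leaked = leaks_in_url(url)
        if leaked:
            print("LEAK in", url, ":", leaked)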

11.2.1.6. RECOVERABILITY TESTING
A website should have a backup or redundant server to which
traffic is rerouted when the primary server fails, and this rerouting
mechanism must be tested. If users find your service unavailable for
an excessive period of time, they will switch over to a competitor's
website. If the site cannot recover quickly, then inform users when it
will be available and functional again.

11.2.1.7. RELIABILITY TESTING

Reliability testing is done to evaluate the product's ability to perform
its required functions and give responses under stated conditions for a
specified period of time.
For example, users trust an online banking web application
(service) to complete all of their banking transactions; they expect the
results to be consistent, up to date, and in accordance with their
requirements.

11.2.2. FUNCTIONAL TESTING (OR BLACK-BOX TESTING)

11.2.2.1. WEB BROWSER-PAGE TESTS

This type of test covers the objects and code that executes within the
browser but does not execute the server-based components. For
example, JavaScript and VB Script code within HTML that does
rollovers and other special effects. This type of test also includes field
validations that are done at the HTML level. Additionally, browser-
page tests include Java applets that implement screen functionality or
graphical output. The test cases for web browser testing can be derived
as follows:
If all mandatory fields on the form are not filled in, a message
should be displayed when the Submit button is pressed (see the
sketch after this list).
Complete sensitive data, such as a full credit card number or
social security number (SSN), should not be displayed.
Passwords should be hidden (masked).
The user must log in before accessing sensitive information.
The limits of all fields on the form should be checked.
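A minimal sketch of the first three checks, assuming a form submission represented as a Python dictionary and hypothetical field names; in practice these rules are exercised through the browser or a UI automation tool.

# Check mandatory fields and masking rules for a hypothetical form.
MANDATORY_FIELDS = ["name", "stc_code", "amount_of_tax_paid"]  # assumed names

def validate_form(form):
    """Return a list of validation error messages for the submitted form."""
    errors = []
    for field in MANDATORY_FIELDS:
        if not form.get(field, "").strip():
            errors.append("'%s' is mandatory" % field)
    return errors

def mask_card(card_number):
    """Display only the last four digits of a card number."""
    return "*" * (len(card_number) - 4) + card_number[-4:]

if __name__ == "__main__":
    submitted = {"name": "A. User", "stc_code": "", "amount_of_tax_paid": "1200"}
    print(validate_form(submitted))       # ["'stc_code' is mandatory"]
    print(mask_card("4111111111111111"))  # ************1111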

11.2.2.2. TRANSACTION TESTING
In this testing, test cases are designed to confirm that information
entered by the user at the web page level makes it to the database, in
the proper way, and that when database calls are made from the web
page, the proper data is returned to the user.
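A minimal sketch of such a transaction test, assuming an in-memory SQLite database standing in for the real back end and a hypothetical save_return() call in the application layer.

# Verify that data entered at the web page level reaches the database
# and reads back unchanged.
import sqlite3

def save_return(conn, stc_code, tax_paid):
    """Hypothetical application-layer call that the web page would invoke."""
    conn.execute("INSERT INTO returns (stc_code, tax_paid) VALUES (?, ?)",
                 (stc_code, tax_paid))
    conn.commit()

def test_round_trip():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE returns (stc_code TEXT, tax_paid REAL)")
    save_return(conn, "ABCDE1234FST001", 1500.0)
    row = conn.execute("SELECT stc_code, tax_paid FROM returns").fetchone()
    assert row == ("ABCDE1234FST001", 1500.0), "unexpected row: %r" % (row,)
    print("transaction test passed")

if __name__ == "__main__":
    test_round_trip()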

SUMMARY
It is clear from this chapter that for the failure-free operation of a
website we must follow both non-functional and functional testing
methods. With these methods one can test the performance, security,
reliability, user interfaces, etc. which are the critical issues related to
the website. Web testing is a challenging exercise and by following
the methods described in this chapter, some of those challenges may
be mitigated.

MULTIPLE CHOICE QUESTIONS


1. Websites can be tested using
a. Black-box techniques
b. White-box techniques
c. Both (a) and (b)
d. None of the above.
2. Which of the following is a functional testing technique?
a. Transaction testing
b. Web-browser page testing
c. Both (a) and (b)
d. None of the above.
3. Maintenance of websites may involve
a. 2 months (average)
b. 4 months (average)
c. 6 months (average)
d. None of the above.
4. Which type of testing involves testing a program for timely
responses?
a. Usability testing
b. Performance testing
c. Scalability testing
d. None of the above.
5. User’s identity is identified by a
a. Cookie file
b. EOF
c. Header files
d. All of the above.
ANSWERS
1. c. 2. c. 3. b. 4. b. 5. a.

CONCEPTUAL SHORT QUESTIONS WITH ANSWERS


Q. 1. Consider a web server supporting 10,000 concurrent users who
request documents from a pool of 10 different HTML documents
(with an average size of 2K each) every 3.5 minutes. Calculate
the bandwidth requirement for handling this throughput?
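Ans. A worked estimate (assuming each of the 10,000 users requests one 2 K document per 3.5-minute interval, and 1 K = 1,024 bytes): 10,000 × 2 KB = 20,000 KB of data every 210 seconds, i.e., roughly 95 KB per second. Multiplying by 8 bits per byte gives approximately 780 kbps, so a link of about 0.8 Mbps, plus an allowance for protocol overhead and a safety margin, would be needed.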

Q. 2. Discuss any one load/performance testing tool.


Ans. Tool type: Web-load simulator and performance analysis
Input: Simulated user requests
Output: Various performance and analytical reports
Primary user: Tester
Secondary user: Developer
Technology principle: This tool enables us to simulate
thousands of users accessing the website, in addition to other e-
commerce and e-business activities. Virtual load can also
simulate various versions of web browsers and network
bandwidth. The simulated load is applied to the server whereas
the performance data is collected and plotted in several useful
report formats for further analysis.
Q. 3. Distinguish between an inspection and a walkthrough.
Ans. We tabulate the differences between the two:

Inspection:
1. It is a five-step process that is well-formalized.
2. It uses checklists for locating errors.
3. It is used to analyze the quality of the process.
4. This process takes a long time.
5. It focuses on training of junior staff.
Walkthrough:
1. It has fewer steps than inspection and is a less formal process.
2. It does not use a checklist.
3. It is used to improve the quality of the product.
4. It does not take a long time.
5. It focuses on finding defects.

Q. 4. How may test cases be developed for website testing?


Ans. Generated loads may be designed to interact with servers via a
web browser user interface (WBUI). Consideration must be
given to the types of requests that are sent to the server under test
by the load generator and the resources available to the load
generator.
Q. 5. A web-based project has various attributes to be used as metrics.
List some of them.
Ans. Some measurable attributes of web-based projects are:
a. Number of static web pages used in a web-project. By static,
we mean no databases.
b. Number of dynamic web pages used in a web-project. By
dynamic, we mean databases are also connected to the web-
pages.
c. Word count metric which counts the total number of words on
a page.

REVIEW QUESTIONS
1. How is website testing different from typical software testing?
2. Discuss various white-box testing techniques for websites.
3. Discuss various black-box testing techniques for websites.
4. Write short paragraphs on:
a. Scalability testing of websites.
b. Transaction testing of websites.
CHAPTER
12

REGRESSION TESTING OF A
RELATIONAL DATABASE
Inside this Chapter:
12.0. Introduction
12.1. Why Test an RDBMS?
12.2. What Should We Test?
12.3. When Should We Test?
12.4. How Should We Test?
12.5. Who Should Test?

12.0. INTRODUCTION
Relational databases are tabular databases that are used to store target
related data that can be easily reorganized and queried. They are used
in many applications by millions of end users. Testing databases
involves three aspects:
Testing of the actual data
Database integrity
Functionality testing of database application
These users may access, update, delete, or append to the database.
The modified database should be error free. To make the database
error free and to deliver the quality product, regression testing of the
database must be done. Regression testing involves retesting of the
database again and again to ensure that it is free of all errors. It is a
relatively new idea in the data community. Agile software developers
take this approach to the application code.
In this chapter, we will focus on the following issues in regression
testing:
Why test an RDBMS?
What should we test?
When should we test?
How should we test?
Who should test?
To this end, the problem will be approached from a practical
perspective.

12.1. WHY TEST AN RDBMS?


Extensive testing of an RDBMS is done due to the following reasons:
1. Quality data is an important asset
Recently a survey was done by Scott W. Ambler on the importance
of quality data, and he came to the following conclusions:
95.7% of people believed that data is a corporate asset.
4.3% believed that data is not a corporate asset.
Of the 95.7%, 40.3% had a test suite for data validation.
31.6% discussed the importance of data.
2. Target related business functions are implemented in RDBMS
RDBMS should focus on mission-critical business functionality.
3. Present approaches of RDBMS testing are inefficient
Presently we develop a database by setting up the database, writing
code to access it, running the code, and doing a SELECT operation to
inspect the query results. Although such visual inspection is a good
start, it may help us find problems but it does not prevent them.
4. Testing provides a concrete test suite to regression test an
RDBMS
Database regression testing is the act of running the database test
suite on a regular basis. This includes testing of actual data,
database integrity, ensuring that database is not corrupted, and
schemas are correct as well as the functionality testing of database
applications.
5. Verification of all modifications
Making changes to the database may result in some serious errors
like missing data and regression testing may help us in detecting
such missing data.

12.2. WHAT SHOULD WE TEST?


We will be discussing both black-box and white-box testing
approaches on relational databases. Black-box testing will involve:
I/O validation: Regression testing will help us in validating
incoming data values and outgoing data values from queries, stored
functions, and views.
Error handling: Regression testing of an RDBMS allows us to
test quasi-nulls, that is, empty strings where they are not allowed.
Table structure can be easily validated: We can validate the
relationships between the rows in different tables. This is known
as referential integrity. For example, if a row in an employee
table references a row within the position table, then that row
should actually exist (see the SQL sketch after this list).
Testing the interaction between SQL and other components
such as scripts: Regression testing allows testing of interfaces
between SQL and scripts by techniques such as parameter
passing.
Testing of stored data: Data stored in the form of tables can be
tested extensively by regression testing.
Testing of modified data: Updating the tables might introduce
new errors which can be easily detected by regression testing.
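A minimal sketch of the referential-integrity check from the employee/position example above, assuming SQLite and illustrative table names.

# Find "orphan" employee rows whose position_id does not exist in the
# position table, i.e., a referential-integrity violation.
import sqlite3

ORPHAN_QUERY = """
    SELECT e.id, e.position_id
    FROM employee e
    LEFT JOIN position p ON p.id = e.position_id
    WHERE p.id IS NULL
"""

def find_orphans(conn):
    return conn.execute(ORPHAN_QUERY).fetchall()

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE position (id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE employee (id INTEGER PRIMARY KEY, position_id INTEGER);
        INSERT INTO position VALUES (1, 'Tester');
        INSERT INTO employee VALUES (10, 1);   -- valid reference
        INSERT INTO employee VALUES (11, 99);  -- broken reference
    """)
    orphans = find_orphans(conn)
    assert orphans == [(11, 99)], orphans
    print("orphan employee rows:", orphans)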
White-box testing will involve:
i. Testing of the entire structure of stored procedures and
functions: Entire schema can be tested by regression testing. We
can refactor our database tables into structures which are more
performant. The process of refactoring here means a small change
to a database schema which improves its design without changing
its semantics. It is an evolutionary improvement of our database
schema which will support three things:
1. New needs of our customers.
2. Evolutionary software development.
3. Fix legacy database design problems.
ii. Testing various stimulations: Regression testing allows unit
testing of stored procedures, functions, and triggers. The idea is
that the test is automatically run via a test framework and success
or failure is indicated via a Boolean flag.
iii. Testing all views: Regression testing allows extensive testing
of all three views, viz., conceptual, logical, and physical.
iv. Testing of all data constraints: Regression testing allows testing
of all data constraints like null values, handling single quote in a
string field, handling comma in an integer field, handling wrong
data types, large size value, large size string, etc.
v. Improving the quality of data: Data quality may range from
syntactic mistakes to undetectable dirty data. Data quality
involves four C’s, i.e., correctness, completeness, comprehension,
and consistency.
Correctness: Regression testing provides a correct database by
removing the following errors:
Incorrect manipulation through the use of views.
Incorrect joins performed using non-key attributes.
Integrity constraints incorrectly used.
CHECK, UNIQUE, and NULL constraints which cause problems
with data insertion, updates, and deletions.

12.3. WHEN SHOULD WE TEST?


Testing databases involves initial testing of database and database
refactoring. This strategy can be applied concurrently to both the
application code and the database schema. Testing of databases is
done not only during the release but also during the development.

FIGURE 12.1 Test-First Approach.

New software developers follow the Test-First Approach, wherein
a test case is first written and then the code that will fulfill this test
is written.
The step-by-step approach is as follows:
Step 1: Add a test (just enough code for it to fail).
Step 2: Run the tests to make sure that the new test does in fact fail.
Step 3: Update the functional code so that it passes the new test.
Step 4: Run the tests again.
Step 5: If the tests fail, update the functional code again and retest.
Step 6: Once the tests pass, start again with the next test.
Test First Approach: TFA is also known as initial testing of
database.
Test Driven Development: TDD is a progressive approach. It
comprises TFA and refactoring (regression testing). Thus, we express
this in the form of an equation:

TDD = TFA + Refactoring
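A minimal sketch of the test-first loop applied to a database routine, assuming Python's unittest module and an in-memory SQLite database; a plain function stands in for a stored procedure.

# Test-first: the tests below are written before get_balance() exists.
# Running them first fails (Steps 1-2); writing get_balance() makes them
# pass (Steps 3-4); the cycle then repeats for the next behavior.
import sqlite3
import unittest

def get_balance(conn, account_id):
    """Functional code written after the tests, to make them pass."""
    row = conn.execute(
        "SELECT balance FROM account WHERE id = ?", (account_id,)
    ).fetchone()
    return row[0] if row else None

class BalanceTest(unittest.TestCase):
    def setUp(self):
        # Put the database into a known state before every test.
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE account (id INTEGER, balance REAL)")
        self.conn.execute("INSERT INTO account VALUES (1, 250.0)")

    def test_known_balance(self):
        self.assertEqual(get_balance(self.conn, 1), 250.0)

    def test_missing_account(self):
        self.assertIsNone(get_balance(self.conn, 999))

if __name__ == "__main__":
    unittest.main()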


12.4. HOW SHOULD WE TEST?
a. Database testing requires copies of the database, called
sandboxes.
Functionality sandbox: In this we check the new functionality of
the database and refactor the existing functionality. Then we pass
the tested sandbox to the next stage which is the integrated
sandbox.
Integrated sandbox: In this we integrate all of the sandboxes and
then test the system.
QA sandbox: After the system is tested, sandboxes are sent for
acceptance testing. This will ensure the quality of the database.
b. Development of test cases: The step-by-step procedure for the
development of test cases is:
Step 1: Setting up the test cases: Set up the database to a known
state. The sources of test data are
External test data
Test scripts
Test data with known values
Real-world data

FIGURE 12.2 Types of Sandboxes and the Flow of Database Testing.

Step 2: Running the test cases: The test cases are run. The running
of the database test cases is analogous to usual development
testing.

Traditional Approach
Test cases are executed on the browser side. Inputs are entered on
web input forms and data is submitted to the back-end database via
the web browser interface. The results sent back to the browser are
then validated against expected values.
Advantages: It is simple and no programming skill is required.
It not only addresses the functionality of stored procedures, rules,
triggers, and data integrity but also the functionality of the web
application as a whole.
Disadvantages: Sometimes the results sent to the browser
after test case execution do not necessarily indicate that the data
itself is properly written to a record in the table. When erroneous
results are sent back to the browser after the execution of test
cases, it doesn’t necessarily mean that the error is a database error.
A crucial danger with database testing, and with regression
testing specifically, is coupling between tests. If we put the database
into a known state and run several tests against that known state
before resetting it, then those tests are potentially coupled to one
another.

Advanced Approach
Preparation for Database Testing
Generate a list of database tables, stored procedures, triggers,
defaults, rules, and so on. This will help us have a good handle on
the scope of testing required for database testing. The points which
we can follow are:
1. Generate data schemata for tables. Analyzing the schema will
help us determine:
Can a certain field value be Null?
What are the allowed or disallowed values?
What are the constraints?
Is the value dependent upon values in another table?
Will the values of this field be in the look-up table?
What are user-defined data types?
What are primary key and foreign key relationships among
tables?
2. At a high level, analyze how the stored procedures, triggers,
defaults, and rules work. This will help us determine:
What is the primary function of each stored procedure and
trigger? Does it read data and produce outputs, write data,
or both?
What are the accepted parameters?
What are the return values?
When is the stored procedure called and by whom?
When is a trigger fired?
3. Determine what the configuration management process is, that
is, how new tables, stored procedures, triggers, and the like are
integrated.
Step 3: Checking the results: Actual database test results and
expected database test results are compared in this step as shown
in the following example:

The following test cases were derived for this code snippet:
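A minimal illustration of comparing actual and expected database results, assuming an in-memory SQLite table and illustrative expected values.

# Compare actual query results against expected results for a known
# database state.
import sqlite3

EXPECTED = [("ABCDE1234FST001", 1500.0), ("FGHIJ5678KST002", 300.0)]

def load_known_state(conn):
    conn.execute("CREATE TABLE returns (stc_code TEXT, tax_paid REAL)")
    conn.executemany("INSERT INTO returns VALUES (?, ?)", EXPECTED)
    conn.commit()

def actual_results(conn):
    return conn.execute(
        "SELECT stc_code, tax_paid FROM returns ORDER BY stc_code"
    ).fetchall()

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load_known_state(conn)
    actual = actual_results(conn)
    expected = sorted(EXPECTED)
    print("PASS" if actual == expected else "FAIL: %r != %r" % (actual, expected))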
12.5. WHO SHOULD TEST?
The main people responsible for doing database testing are application
developers and agile database administrators. They will typically pair
together and will perform pair testing which is an extension of pair
programming. Pair database testing has the following advantages.
First, testing becomes a real-time interaction. Second, discussion
is involved throughout.
The database testers are also responsible for procuring database
testing tools for the organization. Some of the database testing CASE
tools are:

Category of testing tools | Meaning | Examples
UNIT TESTING TOOLS | Tools that enable you to regression test your database. | DBUnit, SQL Unit
LOAD TESTING TOOLS | These tools test whether our system will be able to withstand high load. | Mercury Interactive, Rational Suite Test Studio
TEST DATA GENERATORS | They help to generate large amounts of data for stress and load testing. | Data Factory, Turbo Data

SUMMARY
In this chapter, we have studied the regression testing of relational
databases. We have also done the black-box testing of a database code
example.

MULTIPLE CHOICE QUESTIONS


1. RDBMS should focus on:
a. Logic only
b. Mission critical business functionality
c. Both (a) and (b)
d. None of the above.
2. When can database regression testing be done?
a. Sometimes
b. Not required
c. Regular basis
d. None of the above.
3. A process of making a small change to a database schema which
improves its design without changing its semantics is known as:
a. Refactoring
b. Regression testing
c. Unit testing
d. None of the above.
4. Regression testing allows an extensive testing of:
a. Conceptual level
b. Logical level and physical level
c. Conceptual, logical, and physical level
d. None of the above.
5. Which of the following is true?
a. TDD = TFA + Refactoring
b. TDD = TFA – Refactoring
c. TDD = TFA
d. None of the above.
6. The copy of a database is called as:
a. Instance
b. Alias
c. Sandbox
d. None of the above.
7. A load testing CASE TOOL is:
a. SQL Unit
b. Rational Suite Test Studio
c. Turbo Data
d. None of the above.

ANSWERS
1. b.
2. c.
3. a.
4. c.
5. a.
6. c.
7. b.

CONCEPTUAL SHORT QUESTIONS WITH ANSWERS


Q. 1. How can you unit test your databases?
Ans. We can test stored procedures by executing SQL statements one
at a time against known results. Then the results can be validated
with expected results. This is similar to unit testing.
Q. 2. What are certain points that are to be kept in mind during
database testing?
Ans. There are four important points to be kept in mind. They are as
follows:
1. I/O validations and error handling must be done outside of the
stored procedures.
2. Do thorough analysis to design black-box test cases that
produce problematic inputs that would break the constraints.
3. Testing the interaction between SQL and other components
like scripts.
4. Understanding how to use database tools to execute SQL
statements can improve our ability to analyze web-based
errors. It helps us to determine whether an error is in the
stored procedure code, the data itself, or in the components
outside of the database.
Q. 3. What sort of tests may be carried out during database testing?
Ans. 1. SQL databases may not be able to accept special characters
(e.g., $, @, &) as valid inputs.
2. Data sent to the database server may be corrupted due to
packet losses caused by slow connections.
3. Check for proper implementation of database rollback logic;
otherwise, data corruption may result. Design your test cases to
exercise those critical areas.
4. Check for complete loading of tables in the database.
5. Check for proper error handling.
6. Check that your server does not run out of disk space.
Q. 4. What are milestone tests?
Ans. Milestone tests are performed prior to each development
milestone. They are scheduled according to the milestone plan.
Q. 5. What sort of errors are handled at the client side and at the
server side?
Ans. Simple errors such as invalid inputs should be handled at the
client side. Handling error conditions can be done at the server
side.

REVIEW QUESTIONS
1. Why should an RDBMS be tested extensively?
2. How can you do black-box testing of a database?
3. What is refactoring? What are its three main objectives?
4. Explain with the help of a flowchart/an algorithm, the Test-First
approach used to test an RDBMS.
5. Comment on the flow of database testing.
6. Name some unit testing and load testing CASE tools and some test
data generators used during database testing.
CHAPTER
13

A CASE STUDY ON TESTING OF E-LEARNING MANAGEMENT SYSTEMS
ABSTRACT

Software testing is the process of executing a program or system with
the intent of finding errors. It involves any activity aimed at evaluating
an attribute or capability of a program or system and determining that
it meets its required results. To deliver successful software products,
quality has to be ensured in each and every phase of a development
process. Whatever the organizational structure may be, the most
important point is that the output of each phase should be of very high
quality. The SQA team is responsible to ensure that all the
development team should follow the quality-oriented process. Any
modifications to the system should be thoroughly tested to ensure that
no new problems are introduced and that the operational performance
is not degraded due to the changes. The goal of testing is to determine
and ensure that the system functions properly beyond the expected
maximum workload. Additionally, testing evaluates the performance
characteristics like response times, transaction rates, and other time
sensitive issues.

CHAPTER ONE

INTRODUCTION

NIIT Technologies is a global IT and business process management
services provider with a footprint that spans 14 countries across the
world. It has been working with global corporations in the USA,
Europe, Japan, Asia Pacific, and India for over two decades. NIIT
Technologies provides independent validation and verification
services for your high-performance applications. Their testing services
help organizations leverage their experience in testing to significantly
reduce or eliminate functional, performance, quality, and reliability
issues. NIIT Technologies helps enterprises and organizations make
their software development activities more successful and finish
projects in time and on budget by providing systematic software
quality assurance.
The government of India Tax Return Preparers scheme to train
unemployed and partially employed persons to assist small and
medium taxpayers in preparing their returns of income has now
entered its second phase. During its launch year, on a pilot basis, close
to 5,000 TRPs at 100 centers in around 80 cities across the country
were trained. 3737 TRPs were certified by the Income Tax
Department to act as Tax Return Preparers who assisted various
people in filing their IT returns. The government has now decided to
increase their area of operations by including training on TDS returns
and service tax returns to these TRPs. The quality assurance and
testing team of NIIT who constantly indulges in testing and
maintaining the product quality have to test such online learning
content management websites such as www.trpscheme.com in the
following manner:
Functional and regression testing
System testing: Load/stress testing, compatibility testing
Full life cycle testing

CHAPTER TWO

SOFTWARE REQUIREMENT SPECIFICATIONS

Inside this Chapter:


2.1. Introduction
2.2. Overall Descriptions
2.3. Specific Requirements
2.4. Change Management Process
2.5. Document Approval
2.6. Supporting Information

2.1. INTRODUCTION
This document aims at defining the overall software requirements for
“testing of an online learning management system
(www.trpscheme.com).” Efforts have been made to define the
requirements exhaustively and accurately.

2.1.1. PURPOSE

This document describes the functions and capabilities that will be
provided by the website, www.trpscheme.com. Its purpose is that the
resource center will be responsible for the day-to-day administration
of the scheme. The functions of the resource center will include to
specify the curriculum and all other matters relating to the training of
the Tax Return Preparers and maintain the particulars relating to the
Tax Return Preparers. Also, any other function that is assigned to it by
the Board for the purposes of implementation of the scheme.

2.1.2. SCOPE

The testing of the resource center section for service tax is done
manually mainly using functional and regression testing. Other forms
of testing may also be used such as integration testing, load testing,
installation testing, etc.

2.1.3. DEFINITIONS, ACRONYMS, AND ABBREVIATIONS


Definitions:
1. Test plan: We write test plans for two very different purposes.
Sometimes the test plan is a product; sometimes it’s a tool. In
software testing, a test plan gives detailed testing information
regarding an upcoming testing effort.
2. Test case: A set of conditions or variables under which a tester will
determine if a requirement upon an application is partially or fully
satisfied. It may take many test cases to determine that a
requirement is fully satisfied.
3. Manual testing: The most popular method of software application
testing. It can be improved by using requirements traceability, test
cases, test plan, SQA and debugging, and testing techniques and
checklist.
4. Test report: A management planning document that shows: Test
Item Transmittal Report, Test Log, Test Incident Report, and Test
Summary Report.
5. Staging server: A staging server is a web server used to test the
various components of, or changes to, a website before
propagating them to a production server.
Acronym and Abbreviation:
1. TRP–Tax Return Preparer
2. STRP–service tax return preparers scheme
3. QA–quality assurance
4. HTML–hyper text mark-up language
5. STC–service tax code
6. EAR File–enterprise archive file
7. VSS–visual source safe
8. STD–service tax department

2.1.4. REFERENCES

Books
Chopra, R. 2017. Software testing and quality assurance: A practical
approach. S.K. Kataria & Sons, New Delhi.
Aggarwal, K. K. 2005. Software engineering. New Age International,
New Delhi.

Sites
http://en.wikipedia.org/Software_testing

2.1.5. OVERVIEW
The rest of the SRS document describes the various system
requirements, interfaces, features, and functionalities in detail.

2.2. OVERALL DESCRIPTIONS

2.2.1. PRODUCT PERSPECTIVE


The application will be self-contained.

FIGURE 2.1

2.2.1.1. SYSTEM INTERFACES

None.

2.2.1.2. USER INTERFACES

The application will have a user-friendly and menu-based interface.


The login page will entertain both user and admin. The following
forms and pages will be included in the menu:
Login screen
Homepage
Return filed report
Return filed form
STRP wise report
Service wise report
Zone commissionerate wise report
STRP summary report

2.2.1.3. HARDWARE INTERFACES

1. Processor: Intel Pentium (4) Processor


2. Ram: 512 MB and above
3. Storage Space: 5 GB and above
4. A LAN card for the Internet

2.2.1.4. SOFTWARE INTERFACES

1. Language: Java, XML


2. Software: Bugzilla, Putty, Toad
3. Database: Oracle
4. Platform: Windows 2000 (Server) / Linux

2.2.1.5. COMMUNICATION INTERFACES

The application should support the following communication
protocols:
1. Http
2. Proxy server: In computer networks, a proxy server is a server
(a computer system or an application program) that acts as a go-
between for requests from clients seeking resources from other
servers.
2.2.1.6. MEMORY CONSTRAINTS

At least 512 MB RAM and 2 GB hard disk will be required.

2.2.1.7. SITE ADAPTATION REQUIREMENTS

The terminals at the client site will have to support the hardware and
software interfaces specified in the above sections.

2.2.2. PRODUCT FUNCTIONS


According to customer use and needs, the website functions shall
include:
i. To specify, with prior approval of the Board,
a. The number of persons to be enrolled during a financial year
for training to act as Tax Return Preparers;
b. The number of centers for training and their location where
training is to be imparted during a financial year;
c. The number of persons to be trained at each center for
training during a financial year;
ii. To specify the curriculum and all other matters relating to the
training of Tax Return Preparers;
iii. Maintain the particulars relating to the Tax Return Preparers;
iv. Any other function which is assigned to it by the Board for the
purposes of implementation of the scheme.

2.2.3. USER CHARACTERISTICS

Education level: The user should be able to understand one of the
languages of the browser (English, Hindi, Telugu). The user must
also have a basic knowledge of tax return and payment rules and
regulations.
Technical expertise: The user should be comfortable using
general-purpose applications on a computer.

2.2.4. CONSTRAINTS

Monitor sizes, aspect ratios, and color or black-and-white monitors
render it virtually impossible to design pages that look good on
all device types.
Font sizes and colors need to be changeable to fit the
requirements of sight-impaired viewers.

2.2.5. ASSUMPTIONS AND DEPENDENCIES

Some pages display incorrectly in some browsers.
A webmaster may have programmed browser-specific code
somewhere along the way.

2.2.6. APPORTIONING OF REQUIREMENTS


None.

2.3. SPECIFIC REQUIREMENTS


This section contains the software requirements to a level of detail
sufficient to enable designers to design the system and the testers to
test the system.

2.3.1. USER INTERFACES AND VALIDATIONS


The following interfaces will be provided:
Login screen
Visit www.trpscheme.com. The following page appears:
When the STRPs click on the login button of resource center
service tax on TRPscheme.com, the following page will be displayed.
Homepage
When the STRP logs in by user id and password, the homepage is
displayed. The homepage has “Reported by STRP” menu on the left
under which the user will see two links, “Return Filed” and
“Monthly/Quarterly Tax Paid Form.” The user will also see two report
links, “Return Filed” and “Monthly/Quarterly Tax Paid Report” under
the “Reports” menu. A message “This is the Resource Center site. By
using this site you can get information on Resource Center Service
Tax” also appears on the homepage.
Monthly/Quarterly tax paid form
This form is used by the STRPs to fill out the Monthly/Quarterly Tax
Paid by the Assessees in order to capture the information so that the
total tax paid can be assessed. On this page, STRP Details will be
displayed and the fields that need to be filled in for the completion of
the Monthly/Quarterly Tax Paid form. Filling out the
“Monthly/Quarterly Tax Paid Form” is mandatory before filling out
the Return Filed form. This will not be applied when STRP is filling
return for the STC code first time.
Form validation: This form will require validation of the data such as
all of the mandatory fields cannot be left blank and “STC Code” must
be filled in otherwise the form will not be submitted. Fields such as
“Amount of Tax Payable,” “Amount of Tax Paid,” and “Interest Paid”
will only be numeric.
To complete a form, the user must fill out the following fields.
All of the fields are mandatory in this form.
• Name of Assessee
• STC Code
• Period
• Monthly/Quarterly
• Month
• Amount of Tax Payable
• Amount of Tax Paid
Field format/length for the STC Code will be as follows: characters
1-5 alphabetic, 6-9 numeric, 10 alphabetic, 11-12 the literal "ST,"
and 13-15 numeric (see the regex sketch after this list).
“Month” drop-down list will be populated based on the “Period”
and “Monthly/Quarterly” selection. “Month” will be selected. If
the user has selected "Period" as April 1 through Sept 30, 2009
and “Monthly” in “Monthly/Quarterly” drop down then he or she
will see April, May, June, July, August, and September in the
“Month” drop down. If the TRP has selected “Quarterly” in
“Monthly/Quarterly” drop down then the drop down will show
Apr_May_June and July_Aug_Sep.
The STRP can only fill in the details for the same STD code,
period, and month only once.
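A minimal sketch of the STC Code format check referred to above, assuming a Python regular expression is acceptable in the validation layer.

# Validate the STC Code format: positions 1-5 alphabetic, 6-9 numeric,
# position 10 alphabetic, positions 11-12 the literal "ST", and
# positions 13-15 numeric.
import re

STC_PATTERN = re.compile(r"^[A-Za-z]{5}[0-9]{4}[A-Za-z]ST[0-9]{3}$")

def is_valid_stc(code):
    return bool(STC_PATTERN.match(code))

if __name__ == "__main__":
    print(is_valid_stc("ABCDE1234FST001"))  # True  -- well-formed code
    print(is_valid_stc("AB1DE1234FST001"))  # False -- digit in first five
    print(is_valid_stc("ABCDE1234FXX001"))  # False -- missing "ST"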
Report to view Monthly/Quarterly form data
This report will allow STRPs to view Monthly/Quarterly Tax Paid
form data and to generate a report of that data. STRPs will
generate reports in HTML format and will also be able to export them
into Excel format.
To view the report data the STRP is required to provide the
“Period” in the given fields that are the mandatory fields.
The STRP can also use the other field, STC Code, to generate the report.
The user must click on the “Generate Report” button to view the
report in HTML format or on “Export to Excel” if he or she
wants to export the data into Excel format.
The "Cancel" button will take the user to the login page.
Service Wise Report (Admin Report)
This report will allow the admin to generate a Report Service Wise of
STRPs. This report will be generated in HTML format as well as in
Excel format.
Validations:
This page should contain a “Service Type” drop down and “Date
from” and “To” textboxes.
To view the Service Wise Report data the admin can select
multiple services from the “Service Type” list box and the data
for those services will be populated. “Service Type” will be a
mandatory field so the user has to select at least one service to
view the report data.
The user must click on the "Generate Report" button to view the
report in HTML format, or on "Export to Excel" if he or she
wants to export the data into Excel format.
The TRP id, TRP name, and service type will also be provided in
the Excel sheet.
The “Cancel” button will take the user to the login page.
The user needs to fill in both the “Date from” and “To” fields.
“Date from” and “To” will extract the data based on “Date of
Filling Return.”
STRPs Wise Report (Admin Report)
This report will allow the admin to search the data of the STRPs
and will be able to generate a report of the data. The admin will
generate reports in HTML format and also in Excel format.

To view the STRPs Wise Report data, users have to give a
"Period" because it is a mandatory field, while the rest of the fields
are non-mandatory.
The user can also provide the date range if the user wants data
from a particular date range. If no date range is provided then all
the data from all of the STRPs will be populated for the given
period.
The user needs to fill in both “Date from” and “To” fields. “Date
from” and “To” will extract the data based on “Date of Filling
Return.”
The user must click on the “Generate Report” button to view the
report in HTML format or on “Export to Excel” if he or she
wants to export the data into Excel format.
The “TRP id” and “TRP name” will also be provided in the Excel
sheet.
The “Cancel” button will take the user to the login page.
STRP Summary Report (Admin Report).
This report will allow the admin to generate a report for the top ten
STRPs based on the highest amount of tax paid for each return filled
by the TRP. This report will be generated in HTML format as well as
in Excel format.

Validations:
To view this report the user will have to select a “Zone” as well
as a “Period.” These are mandatory filters.
There will be an option of “ALL” in the “Zone” drop down if the
report needs to be generated for all the zones.
The user must click on the “Generate Report” button to view the
report in HTML format or on “Export to Excel” if he or she
wants to export the data into Excel format.
The “Cancel” button will take the user to the login page.
The user needs to fill both “Date from” and “To” fields. “Date
from” and “To” will extract the data based on “Date of Filling
Return.”
The user can either select the “Period” or “Date from” and “To” to
generate the report. Both of the fields cannot be selected.
Zone/Commissionerate Wise Report (Admin Report)
This report will allow the admin to generate the report
Zone/Commissionerate Wise of STRPs. This report will be generated
in HTML format as well as in Excel format.

Validations:
To view the Commissionerate Wise Report data, the admin can
provide "Zone," "Commissionerate," and "Division" to view the
data, but if no input is provided then the data will include the
entire "Zone," "Commissionerate," and "Division." The
user will have to select "Zone" because it will be a mandatory
field. There will be an option of "ALL" in the "Zone" drop down
if the report needs to be generated for all of the "Zones."
“Commissionerate” will be mapped to the “Zone” and “Division”
will be mapped to “Commissionerate,” i.e., if a user selects a
“Zone” then all the “Commissionerate” under that “Zone” will
come in to the “Commissionerate” drop down and if a user
selects a “Commissionerate” then only those “Division” will be
populated in the “Division” drop down that are under that
“Commissionerate.” If any LTU is selected in the “Zone” drop
down, then no other field will be populated.
The user must click on the “Generate Report” button to view the
report in HTML format or on “Export to Excel” if he or she
wants to export the data into Excel format.
The “TRP id,” “TRP name,” “Commissionerate,” and “Division”
will also be provided in the Excel sheet.
The “Cancel” button will take the user to the login page.

2.3.2. FUNCTIONS
It defines the fundamental actions that must take place in the software
in accepting and processing the inputs and generating the outputs. The
system will perform the following:
VALIDITY CHECKS
The address should be correct.
An Internet connection should be present.
RESPONSES TO ABNORMAL SITUATIONS
An error message will be generated if the date format is wrong.
An error message will be generated if the STC code is entered
incorrectly.
An error message will be generated if two users are assigned the
same STC code.

2.3.3. MODULES

Test Plan
We write test plans for two very different purposes. Sometimes the
test plan is a product; sometimes it’s a tool. It’s too easy but also too
expensive to confuse these goals. In software testing, a test plan gives
detailed testing information regarding an upcoming testing effort
including:
Scope of testing
Schedule
Test deliverables
Release criteria risks and contingencies
How the testing will be done?
Who will do it?
What will be tested?
How long it will take?
What the test coverage will be, i.e., what quality level is required?

Test Cases
A test case is a set of conditions or variables under which a tester will
determine if a requirement upon an application is partially or fully
satisfied. It may take many test cases to determine that a requirement
is fully satisfied. In order to fully test that all of the requirements of an
application are met, there must be at least one test case for each
requirement unless a requirement has sub requirements. In that
situation, each sub requirement must have at least one test case. There
are different types of test cases.
Common test case
Functional test case
Invalid test case
Integration test case
Configuration test case
Compatibility test case

2.3.4. PERFORMANCE REQUIREMENTS


Static numerical requirements are:
HTTP should be supported
HTML should be supported
Any number of users can be supported
Dynamic numerical requirements include the number of
transactions and tasks and the amount of data to be processed within
certain time periods for both normal and peak workload conditions;
these depend upon the connection speed of the user.

2.3.5. LOGICAL DATABASE REQUIREMENTS

The database must be updated and maintained on a regular basis by a
database administrator.

2.3.6. DESIGN CONSTRAINTS

None.

2.3.7. SOFTWARE SYSTEM ATTRIBUTES

Quality attributes that can serve as requirements:


Reliability: It supports the latest functions as per the user
requirements.
Availability: It can be downloaded from the site.
Security: It supports the privacy mode.
Portability: It supports all operating systems.
Efficiency: Appropriate amount of computing resources and
code.

2.4. CHANGE MANAGEMENT PROCESS


Changes in project scope and requirements will be done if there is:
Software update
Change in technology (presence of any future OS)
Change in user requirements

2.5. DOCUMENT APPROVAL


None.

2.6. SUPPORTING INFORMATION


The table of contents is given.

CHAPTER THREE

SYSTEM DESIGN
CHAPTER FOUR

REPORTS AND TESTING

4.1. TEST REPORT


A management planning document that shows the following:
Test Item Transmittal Report: Reporting on when tested
software components have progressed from one stage of testing
to the next.
Test Log: Recording which test cases were run, who ran them, in
what order, and whether each test passed or failed.
Test Incident Report: Detailing for any test that failed, the
actual versus expected result, and other information intended to
throw light on why a test has failed.
Test Summary Report: A management report providing any
important information uncovered by the tests accomplished, and
including assessments of the quality of the testing effort, the
quality of the software system under test, and statistics derived
from incident reports. The report also records what testing was
done and how long it took in order to improve any future test
planning. This final document is used to indicate whether the
software system under test is fit for its purpose according to
whether or not it has met acceptance criteria defined by project
stakeholders.

4.2. TESTING
The importance of software testing and its implications with respect to
software quality cannot be overemphasized. Software testing is a
critical element of software quality assurance and represents the
ultimate review of specification, design, and code generation.

4.2.1. TYPES OF TESTING


White-Box Testing: This type of testing goes inside the program and
checks all the loops, paths, and branches to verify the program's
intention.
Black-Box Testing: This type of testing is done to identify whether
the outputs are the expected ones or not. This type of testing verifies
that the software generates the expected output with a given set of
inputs.
Static Analysis: In this type of testing, code is examined rather than
exercised to verify its conformance to a set of criteria.
4.2.2. LEVELS OF TESTING

Unit Testing: This is the first phase of testing. The unit test of the
system was performed using a unit test plan. The parameters that are
required to be tested during a unit testing are as follows:
Validation check: Validations are all being performed correctly. For
this, two kinds of data are entered for each entry going into the
database—valid data and invalid data.
Integration Testing: It is a systematic technique for constructing the
program structure while at the same time conducting tests to uncover
errors associated with interfacing. The objective is to take unit-tested
components and build a program structure that has been dictated by
design.
In this testing we followed the bottom-up approach. This approach
implies construction and testing with atomic modules.
Stress Testing: The final test to be performed during unit testing is
the stress test. Here the program is put through extreme stress, like all
of the keys of the keyboard being pressed or junk data being put
through. The system being tested should be able to handle that stress.
Functional Testing: Functional testing verifies that your system is
ready for release. The functional tests define your working system in a
useful manner. A maintained suite of functional tests:
Captures user requirements in a useful way.
Gives the team (users and developers) confidence that the system
meets those requirements.
Load Testing: Load testing generally refers to the practice of
modeling the expected usage of a software program by simulating
multiple users accessing the program’s services concurrently. As such,
this testing is most relevant for multi-user systems, often ones built
using a client/server model, such as web servers.
Regression Testing: Regression testing is initiated after the
programmer has attempted to fix a recognized problem or has added
the source code to a program that may have inadvertently introduced
errors. It is a quality control measure to ensure that the newly modified
code still complies with its specified requirements and that unmodified
code has not been affected by the maintenance activity.
Installation Testing: Installation testing will check the installation
and configuration procedure as well as any missing dependencies.
Installation tests test the installation and configuration procedures.
These tests are a set of scripts that automatically download all
necessary packages and install them. To test that the installation was
successful there is an edge-java-security-test RPM which contains a
simple script to test the SSL handshake with the secure service.

CHAPTER FIVE

TEST CASES

5.1. RETURN FILED REPORT


5.2. MONTHLY/QUARTERLY TAX PAID FORM
5.3. MONTHLY/QUARTERLY TAX PAID FORM
DATABASE TEST CASE

5.4. MONTHLY /QUARTERLY TAX PAID FORM

POSITIVE FUNCTIONAL TEST CASES


NEGATIVE FUNCTIONAL TEST CASES
5.5. SERVICE WISE REPORT (ADMIN REPORT)

POSITIVE FUNCTIONAL TEST CASE


NEGATIVE FUNCTIONAL TEST CASE
5.6. STRPS WISE REPORT (ADMIN REPORT)

NEGATIVE FUNCTIONAL TEST CASES


CONCLUSION
a. Advantages:
Delivery of a quality product and software met all quality
requirements.
Website is developed within the time frame and budget.
A disciplined approach to software development.
Quality products lead to happy customers.
Software is developed within the time frame and budget.
b. Limitations:
Quality is compromised to deliver the product on time and within
budget.
It is a time consuming process.
It requires a large number of employees that leads to an increase
in the product cost.
CHAPTER
14

THE GAME TESTING PROCESS1


Inside this Chapter:
14.1. “Black-Box” Testing
14.2. “White-Box” Testing
14.3. The Life Cycle of a Build
14.4. On Writing Bugs Well

Developers don’t fully test their own games. They don’t have time to,
and even if they did, it’s not a good idea. Back at the dawn of the
video game era, the programmer of a game was also its artist,
designer, and tester. Even though games were very small—the size of
email— the programmer spent most of his time designing and
programming. Little of his time was spent testing. If he did any
testing, it was based on his own assumptions about how players would
play his game. The following sidebar illustrates the type of problem
these assumptions could create.

THE PLAYER WILL ALWAYS SURPRISE YOU


The programmer of Astrosmash, a space shooter released for the
Intellivision® system in 1981, made an assumption when he designed
the game that no player would ever score 10 million points. As a
result, he didn’t write a check for score overflowing. He read over his
own code and—based on his own assumptions—it seemed to work
fine. It was a fun game—its graphics were breathtaking (for the time)
and the game went on to become one of the best sellers on the
Intellivision platform.
Weeks after the game was released, however, a handful of
customers began to call the game’s publisher, Mattel Electronics, with
an odd complaint: when they scored more than 9,999,999 points, the
score displayed negative numbers, letters, and symbol characters. This
in spite of the promise of “unlimited scoring potential” in the game’s
marketing materials. The problem was exacerbated by the fact that the
Intellivision console had a feature that allowed players to play the
game in slow motion, making it much easier to rack up high scores.
John Sohl, the programmer, learned an early lesson about video
games: the player will always surprise you.
The sidebar story demonstrates why video game testing is best
done by testers who are: (a) professional, (b) objective, and (c)
separated—either physically or functionally—from the game’s
development team. That remove and objectivity allows testers to think
independently of the developers, to function as players, and to figure
out new and interesting ways to break the game. This chapter
discusses how, like the gears of a watch, the game testing process
meshes into the game development process.

“BLACK-BOX” TESTING

Almost all game testing is black-box testing, testing done from outside
the application. No knowledge of, or access to, the source code is
granted to the tester. Game testers typically don’t find defects by
reading the game code. Rather, they try to find defects using the same
input devices available to the average player, be it a mouse, a
keyboard, a console gamepad, a motion sensor, or a plastic guitar.
Black-box testing is the most cost-effective way to test the extremely
complex network of systems and modules that even the simplest video
game represents.
Figure 14.1 illustrates some of the various inputs you can provide
to a videogame and the outputs you can receive back. The most basic
of inputs are positional, and control data in the form of button presses
and cursor movements, or vector inputs from accelerometers, or even
full-body cameras. Audio input can come from microphones fitted in
headsets or attached to a game controller. Input from other players can
come from a second controller, a local network, or the Internet.
Finally, stored data such as saved games and options settings can be
called up as input from memory cards or a hard drive.

FIGURE 14.1 Black-box testing: planning inputs and examining outputs.

Once some or all of these types of input are received by the game,
it reacts in interesting ways and produces such output as video, audio,
vibration (via force feedback devices), and data saved to memory
cards or hard drives.
The input path of a video game is not one-way, however. It is a
feedback loop, where the player and the game are constantly reacting
to each other. Players don’t receive output from a game and stop
playing. They constantly alter and adjust their input “on the fly,” based
on what they see, feel, and hear in the game. The game, in turn, makes
similar adjustments in its outputs based on the inputs it receives from
the player. Figure 14.2 illustrates this loop.

FIGURE 14.2 The player's feedback loop adjusts to the game's input, and vice versa.
If the feedback received by the player were entirely predictable all
the time, the game would be no fun. Nor would it be fun if the
feedback received by the player were entirely random all the time.
Instead, feedback from games should be just random enough to be
unpredictable. It is the unpredictability of the feedback loop that
makes games fun. Because the code is designed to surprise the player
and the player will always surprise the programmer, black-box testing
allows testers to think and behave like players.

“WHITE-BOX” TESTING

In contrast to black-box testing, white-box testing gives the tester
opportunities to exercise the source code directly in ways that no
player ever could. It can be a daunting challenge for the white-box
tester to read a piece of game code and predict every single interaction
it will have with every other bit of code, and whether the programmer
has accounted for every combination and order of inputs possible.
Testing a game using only white-box methods is also extremely
difficult because it is nearly impossible to account for the complexity
of the player feedback loop. There are, however, situations in which
white-box testing is more practical and necessary than black-box
testing. These include the following:
Tests performed by developers prior to submitting new code for
integration with the rest of the game
Testing code modules that will become part of a reusable library
across multiple games or platforms
Testing code methods or functions that are essential parts of a
game engine or middleware product
Testing code modules within your game that might be used by
third-party developers or "modders" who, by design, could expand
or modify the behavior of your game to their own liking
Testing low-level routines that your game uses to support specific
functions in the newest hardware devices, such as graphics cards
or audio processors
In performing white-box tests, you execute specific modules and
the various paths that the code can follow when you use the module in
various ways. Test inputs are determined by the types and values of
data that can be passed to the code. Results are checked by examining
values returned by the module, global variables that are affected by the
module, and local variables as they are processed within the module.
To get a taste of white-box testing, consider the TeamName routine
from Castle Wolfenstein: Enemy Territory:
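(The shipped source is not reproduced here; the following is a rough sketch of such a routine, reconstructed only from the behavior described below, so the exact signature and formatting in the game's code may differ.)

const char *TeamName( int team ) {
    if ( team == TEAM_AXIS )
        return "RED";
    else if ( team == TEAM_ALLIES )
        return "BLUE";
    else if ( team == TEAM_SPECTATOR )
        return "SPECTATOR";
    return "FREE";   /* any other value, such as TEAM_NONE */
}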

Four white-box tests are required for this module to test the proper
behavior of each line of code within the module. The first test would
be to call the TeamName function with the parameter TEAM_AXIS and
then check that the string “RED” is returned. Second, pass the value of
TEAM_ALLIES and check that “BLUE” is returned. Third, pass
TEAM_SPECTATOR and check that “SPECTATOR” is returned. Finally, pass
some other value such as TEAM_NONE, which makes sure that “FREE” is
returned. Together these tests not only exercise each line of code at
least once, they also test the behavior of both the “true” and “false”
branches of each if statement.
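These four checks lend themselves to automation. As a sketch only, assuming a simple assert-style harness (the helper check_team_name and the test function are hypothetical names, and TeamName and the TEAM_* constants are assumed to be available from the game's headers):

#include <assert.h>
#include <string.h>

/* Hypothetical helper: fail if TeamName() does not return the expected string. */
static void check_team_name( int team, const char *expected ) {
    assert( strcmp( TeamName( team ), expected ) == 0 );
}

void test_TeamName( void ) {
    check_team_name( TEAM_AXIS,      "RED" );        /* first if, "true" branch  */
    check_team_name( TEAM_ALLIES,    "BLUE" );       /* second if, "true" branch */
    check_team_name( TEAM_SPECTATOR, "SPECTATOR" );  /* third if, "true" branch  */
    check_team_name( TEAM_NONE,      "FREE" );       /* every "false" branch     */
}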
This short exercise illustrates some of the key differences between
a white-box testing approach and a black-box approach:
Black-box testing should test all of the different ways you could
choose a test value from within the game, such as different menus
and buttons. White-box testing requires you to pass that value to
the routine in one form—its actual symbolic value within the
code.
By looking into the module, white-box testing reveals all of the
possible values that can be provided to and processed by the
module being tested. This information might not be obvious from
the product requirements and feature descriptions that drive
black-box testing.
Black-box testing relies on a consistent configuration of the game
and its operating environment in order to produce repeatable
results. White-box testing relies only on the interface to the
module being tested and is concerned only about external files
when processing streams, file systems, or global variables.

THE LIFE CYCLE OF A BUILD

Game testers are often frustrated that, like players, they must wait (and
wait) for the work product of the development team before they can
spring into action. Players wait for game releases; testers wait for code
releases, or builds. The test results from each build are how all
stakeholders in the project—from QA to the project manager to the
publisher—measure the game’s progress toward release.
A basic game testing process consists of the following steps:
1. Plan and design the test. Although much of this is done early in
the planning phase, planning and design should be revisited with
every build. What has changed in the design spec since the last
build? What additional test cases have been added? What new
configurations will the game support? What features have been
cut? The scope of testing should ensure that no new issues were
introduced in the process of fixing bugs prior to this release.
2. Prepare for testing. Code, tests, documents, and the test
environment are updated by their respective owners and aligned
with one another. By this time the development team should have
marked the bugs fixed for this build in the defect database so the
QA team can subsequently verify those fixes and close the bugs.
3. Perform the test. Run the test suites against the new build. If you
find a defect, test “around” the bug to make certain you have all
the details necessary to write as specific and concise a bug report
as possible. The more research you do in this step, the easier and
more useful the bug report will be.
4. Report the results. Log the completed test suite and report any
defects you found.
5. Repair the bug. The test team participates in this step by being
available to discuss the bug with the development team and to
provide any directed testing a programmer might require to track
the defect down.
6. Return to Step 1 and re-test. With new bugs and new test results
comes a new build.
These steps not only apply to black-box testing, they also describe
white-box testing, configuration testing, compatibility testing, and any
other type of QA. These steps are identical no matter what their scale.
If you substitute the word “game” or “project” for the word “build” in
the preceding steps, you will see that they can also apply to the entire
game, a phase of development (Alpha, Beta, and so on), or an
individual module or feature within a build. In this manner, the
software testing process can be considered fractal—the smaller system
is structurally identical to the larger system, and vice versa.
As illustrated in Figure 14.3, the testing process itself is a feedback
loop between the tester and developer. The tester plans and executes
tests on the code, then reports the bugs to the developer, who fixes
them and compiles a new build, which the tester plans and executes,
and so on.

FIGURE 14.3 The testing process feedback loop.

A comfortable scale from which to examine this process is at the
level of testing an individual build. Even a relatively small game
project could consist of dozens of builds over its development cycle.

Test Cases and Test Suites


As discussed in the previous chapter, a single test performed to answer
a single question is a test case; a collection of test cases is a test suite.
The lead tester, primary tester, or any other tester tasked with test
creation should draft these documents prior to the distribution of the
build. Each tester will take his or her assigned test suites and perform
them on the build. Any anomalies not already present in the database
should be written up as new bugs.
In its simplest form, a test suite is a series of incremental steps that
the tester can perform sequentially. Subsequent chapters in this book
discuss in depth the skillful design of test cases and suites through
such methods as combinatorial tables and test flow diagrams. For the
purposes of this discussion, consider a short test suite you might
execute on Minesweeper, a simple game available with most versions
of Microsoft Windows®. A portion of this suite is shown in Figure
14.4.

FIGURE 14.4 A portion of a test suite for Minesweeper.

This is a very small portion of a very simple test suite for a very
small and simple game. The first section (steps one through seven)
tests launching the game, ensuring that the default display is correct,
and exiting. Each step either gives the tester one incremental
instruction or asks the tester one simple question. Ideally, these
questions are binary and unambiguous. The tester performs each test
case and records the result.
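As a hypothetical illustration only (the wording and numbering are not taken from the figure), the launch-and-exit section of such a suite might read:

1. Launch Minesweeper.
2. Does the game window open? (Pass/Fail)
3. Is the default Beginner board displayed? (Pass/Fail)
4. Is the mine counter at its starting value? (Pass/Fail)
5. Is the timer stopped at zero? (Pass/Fail)
6. Select Game > Exit.
7. Does the game close without error? (Pass/Fail)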
Because the testers will inevitably observe results that the test
designer hadn’t planned for, the Comments field allows the tester to
elaborate on a Yes/No answer, if necessary. The lead or primary tester
who receives the completed test suite can then scan the Comments
field and make adjustments to the test suite as needed for the next
build.
Where possible, the questions in the test suite should be written in
such a way that a “yes” answer indicates a “pass” condition—the
software is working as designed and no defect is observed. “No”
answers, in turn, should indicate that there is a problem and a defect
should be reported. There are several reasons for this: it’s more
intuitive, because we tend to group “yes” and “pass” (both positives)
together in our minds the same way we group “no” and “fail.” Further,
by grouping all passes in the same column, the completed test suite
can be easily scanned by both the tester and test managers to
determine quickly whether there were any fails. A clean test suite will
have all the checks in the Pass column.
For example, consider a test case covering the display of a tool tip
—a small window with instructional text incorporated into many
interfaces. A fundamental test case would be to determine whether the
tool tip text contains any typographical errors. The most intuitive
question to ask in the test case is:
Does the text contain any typographical errors?

The problem with this question is that a pass (no typos, hence no
bugs) would be recorded as a “no.” It would be very easy for a hurried
(or tired) tester to mistakenly mark the Fail column. It is far better to
express the question so that a “yes” answer indicates a “pass”
condition:
Is the text free of typographical errors?
As you can see, directed testing is very structured and methodical.
After the directed testing has concluded, or concurrently with directed
testing, a less structured, more intuitive form of testing, known as ad
hoc testing, takes place.

Entry Criteria

It’s advisable to require that any code release meets some criteria for
being fit to test before you risk wasting your time, or your team’s time,
testing it. This is similar to the checklists that astronauts and pilots use
to evaluate the fitness of their vehicle systems before attempting flight.
Builds submitted to testing that don’t meet the basic entry criteria are
likely to waste the time of both testers and programmers. The
countdown to testing should stop until the test “launch” criteria are
met.
The following is a list of suggestions for entry criteria. Don’t keep
these a secret from the rest of the development team. Make the team
aware of the purpose—to prevent waste—and work with them to
produce a set of criteria that the whole team can commit to.
The game code should be built without compiler errors. Any new
compiler warnings that occur are analyzed and discussed with the
test team.
The code release notes should be complete and should provide the
detail that testers need to plan which tests to run or to re-run for
this build.
Defect records for any bugs closed in the new release should be
updated so they can be used by testers to make decisions about
how much to test in the new build.
Tests and builds should be properly version-controlled, as
described in the sidebar, “Version Control: Not Just for
Developers.”
When you are sufficiently close to the end of the project, you also
want to receive the game on the media on which it will ship.
Check that the media provided contains all of the files that would
be provided to your customer.
VERSION CONTROL: NOT JUST FOR DEVELOPERS
A fundamental principle of software development is that every build
of an application should be treated as a separate and discrete version.
Inadvertent blending of old code with new is one of the most common
(and most preventable) causes of software defects. The process of
tracking builds and ensuring that all members of a development team
are checking current code and assets into the current version is known
as version control.
Test teams must practice their own form of version control. There
are few things more time wasting than for a test team to report a great
number of bugs in an old build. This is not only a waste of time, but it
can cause panic on the part of the programmer and the project
manager.
Proper version control for the test team includes the following
steps:
1. Collect all prior physical (e.g., disk-based) builds from the test
team before distributing the new build. The prior versions should
be stacked together and archived until the project is complete.
(When testing digital downloads, uninstall and delete or archive
prior digital builds.)
2. Archive all paperwork. This includes not only any build notes you
received from the development team, but also any completed test
suites, screen shots, saved games, notes, video files, and any other
material generated during the course of testing a build. It is
sometimes important to retrace steps along the paper trail, whether
to assist in isolating a new defect or determining in what version an
old bug was re-introduced.
3. Verify the build number with the developer prior to distributing it.
4. In cases where builds are transmitted electronically, verify the byte
count, file dates, and directory structure before building it. It is
vital in situations where builds are sent via FTP, email, Dropbox
(www.dropbox.com) or other digital means that the test team makes
certain to test a version identical to the version the developers
uploaded. Confirm the integrity of the transmitted build before
distributing it to the testers.
5. Renumber all test suites and any other build-specific paperwork or
electronic forms with the current version number.
6. Distribute the new build for smoke testing.

Configuration Preparation

Before the test team can work with the new build, some housekeeping
is in order. The test equipment must be readied for a new round of
testing. The test lead must communicate the appropriate hardware
configuration to each tester for this build. Configurations typically
change little over the course of game testing. To test a single-player-
only console game, you need the game console, a controller, and a
memory card or hard drive. That hardware configuration typically will
not change for the life of the project. If, however, the new build is the
first in which network play is enabled, or a new input device or PC
video card has been supported, you will perhaps need to augment the
hardware configuration to perform tests on that new code.
Perhaps the most important step in this preparation is to eliminate
any trace of the prior build from the hardware. “Wiping” the old build
of a disk-based game on a Nintendo Wii™ is simple, because the only
recordable media for that system is an SD card or its small internal
flash memory drive. All you have to do is remove and archive the
saved game you created with the old build. More careful test leads
will ask their testers to go the extra step of reformatting the media,
which completely erases it, to ensure that not a trace of the old build’s
data will carry forward during the testing of the new build.

TIP

Save your saves! Always archive your old player-created data,
including game saves, options files, and custom characters, levels, or
scenarios.

Not surprisingly, configuration preparation can be much more
complicated for PC games. The cleanest possible testing configuration
for a PC game is:
A fresh installation of the latest version of the operating system,
including any patches or security updates.
The latest drivers for all components of the computer. This not
only includes the obvious video card and sound card drivers, but
also chipset drivers, motherboard drivers, Ethernet card drivers,
WiFi® firmware, and so on.
The latest version of any “helper apps” or middleware the game
requires in order to run. These can range from Microsoft’s
DirectX® multimedia libraries to third-party multiplayer
matchmaking software.
The only other software installed on the computer should be the
new build.

CHASING FALSE DEFECTS


We once walked into a QA lab that was testing a (then) very cutting-
edge 3D PC game. Testing of the game had fallen behind, and we had
been sent from the publisher to investigate. We arrived just as the
testers were breaking for lunch, and were appalled to see the testers
exit the game they were testing and fire up email, instant messenger
clients, Web browsers, and file sharing programs—a host of
applications that were installed on their test computers. Some even
jumped into a game of Unreal Tournament. We asked the assistant test
manager why he thought it was a good idea for the testers to run these
extraneous programs on their testing hardware. “It simulates real-
world conditions,” he shrugged, annoyed by our question.
As you perhaps have already guessed, this lab’s failure to wipe
their test computers clean before each build led to a great deal of
wasted time chasing false defects—symptoms testers thought were
defects in the game, but which were in fact problems brought about
by, for example, email or file sharing programs running in the
background, taxing the system’s resources and network bandwidth.
This wasted tester time also meant a good amount of wasted
programmer time, as the development team tried to figure out what in
the game code might be causing such (false) defects.
The problem was solved by reformatting each test PC, freshly
installing the operating system and latest drivers, and then using a
drive image backup program to create a system restore file. From that
point forward, testers merely had to reformat their hard drive and copy
the system restore file over from a CD-ROM.

Testing takes place in the lab and labs should be clean. So should
test hardware. It’s difficult to be too fastidious or paranoid when
preparing test configurations. When you get a new build, reformat your PC rather than merely uninstalling the old build.

TIP

Delete your old builds! Reformat your test hardware—whether it’s a PC, a tablet or a smartphone. If it’s a browser game, delete the cache.

Browser games should be purged from each browser’s cache and the browser should be restarted before you open the new game build.
In the case of Flash® games, you can right-click on the old build and
select “Global Settings…” This will launch a separate browser process
and will connect you to the Flash Settings Manager. Choosing the
“Website Storage Settings panel” will launch a Flash applet. Click the
“Delete all sites” button and close all of your browser processes. Now
you can open the new build of your Flash game.
iOS™ games should be deleted both from the device and the
iTunes® client on the computer the device is synched to. When
prompted by iTunes, choose to delete the app entirely (this is the
“Move to Recycle Bin” or “Move to Trash” button). Now, synch your
device and make certain the old build has been removed both from
iTunes and your device. Empty the Recycle Bin (or the Trash),
relaunch iTunes, copy the new build, and synch your device again.
Android™ games, like iOS games, should be deleted entirely
from the device and your computer. Always synch your device to
double-check that you have scrubbed the old build off before you
install the new build.
Whatever protocol is established, config prep is crucial prior to the
distribution of a new build.
Smoke Testing

The next step after accepting a new build and preparing to test it is to
certify that the build is worthwhile to submit to formal testing. This
process is sometimes called smoke testing, because it’s used to
determine whether a build “smokes” (malfunctions) when run. At a
minimum, it should consist of a “load & launch,” that is, the lead or
primary tester should launch the game, enter each module from the
main menu, and spend a minute or two playing each module. If the
game launches with no obvious performance problems and each
module implemented so far loads with no obvious problems, it is safe
to certify the build, log it, and duplicate it for distribution to the test
team.
Now that the build is distributed, it’s time to test for new bugs,
right? Not just yet. Before testing can take a step forward, it must first
take a step backward and verify that the bugs the development team
claims to have fixed in this build are indeed fixed. This process is
known as regression testing.

Regression Testing

Fix verification can be at once very satisfying and very frustrating. It
gives the test team a good sense of accomplishment to see the defects
they report disappear one by one. It can be very frustrating, however,
when a fix of one defect creates another defect elsewhere in the game,
as can often happen.
The test suite for regression testing is the list of bugs the
development team claims to have fixed. This list, sometimes called a
knockdown list, is ideally communicated through the bug database.
When the programmer or artist fixes the defect, all they have to do is
change the value of the Developer Status field to “Fixed.” This allows
the project manager to track the progress on a minute-to-minute basis.
It also allows the lead tester to sort the regression set (by bug author or
by level, for example). At a minimum, the knockdown list can take the
form of a list of bug numbers sent from the development team to the
lead tester.
TIP

Don’t accept a build into test unless it is accompanied by a


knockdown list. It is a waste of the test team’s time to regress every
open bug in the database every time a new build enters test.

Each tester will take the bugs they’ve been assigned and perform
the steps in the bug write-up to verify that the defect is indeed fixed.
The fixes for many defects are easily verified (typos, missing features,
and so on). Some defects, such as hard-to-reproduce crashes, could
seem fixed, but the lead tester might want to err on the side of caution
before he closes the bug. By flagging the defect as verify fix, the bug
can remain in the regression set (i.e., stay on the knockdown list) for
the next build (or two), but out of the set of open bugs that the
development team is still working on. Once the bug has been verified
as fixed in two or three builds, the lead tester can then close the bug
with more confidence.
At the end of regression testing, the lead tester and project
manager can get a very good sense of how the project is progressing.
A high fix rate (the number of bugs closed divided by the number of bugs claimed to have been fixed; if, say, 45 of 50 claimed fixes are verified and closed, the fix rate is 90%) means the developers are working efficiently. A low fix rate could be cause for concern. Are the
programmers arbitrarily marking bugs as fixed if they think they’ve
implemented new code that might address the defect, rather than
troubleshooting the defect itself? Are the testers not writing clear
bugs? Is there a version control problem? Are the test systems
configured properly? While the lead tester and project manager mull
over these questions, it’s time for you to move on to the next step in
the testing process: performing structured tests and reporting the
results.

Testing “Around” a Bug

The old saying in carpentry is “measure twice, cut once.” Good game
testers thoroughly investigate a defect before they write it up,
anticipating any questions the development team might have.
Before you begin to write a defect report, ask yourself some
questions:
1. Is this the only location or level where the bug occurs?
2. Does the bug occur while using other characters or units?
3. Does the bug occur in other game modes (for example, multiplayer
as well as single player, skirmish as well as campaign)?
4. Can I eliminate any steps along the path to reproducing the bug?
5. Does the bug occur across all platforms (for example, does it occur
on both the Xbox One and PlayStation 4 builds)?
6. Is the bug machine-specific (for example, does it occur only on
PCs with a certain hardware configuration)?
These are the types of questions you will be asked by the lead
tester, project manager, or developer. Try to develop the habit of
second-guessing such questions by performing some quick additional
testing before you write the bug. Test to see whether the defect occurs
in other areas. Test to determine whether the bug happens when you
choose a different character. Test to check which other game modes
contain the issue. This practice is known as testing “around” the bug.
Once you are satisfied that you have anticipated any questions that
the development team might ask, and you have all your facts ready,
you are finally ready to write the bug report.

ON WRITING BUGS WELL

Good bug writing is one of the most important skills a tester must
learn. A defect can be fixed only if it is communicated clearly and
effectively. One of the oldest jokes in software development goes
something like this:
Q: How many programmers does it take to screw in a light bulb?
A: None—it’s not dark where they’re sitting.
Good bug report writing gets the development team to “see the
light” of the bug. The developers are by no means the only people who
will read your bug, however. Your audience could include:
The lead tester or primary tester, who might wish to review the
bug before she gives it an “open” status in the bug database.
The project manager, who will read the bug and assign it to the
appropriate member of the development team.
Marketing and other business executives, who might be asked to
weigh in on the possible commercial impact of fixing (or not
fixing) the bug.
Third parties, such as middleware developers, who could be
asked to review a bug that is possibly related to a project they
supply to the project team
Customer service representatives, who might be asked to devise
workarounds for the bug
Other testers, who will reproduce the steps if they are asked to
verify a fix during regression testing
Because you never know exactly who will be reading your bug
report, you must always write in as clear, objective, and dispassionate
a manner as possible. You can’t assume that everyone reading your
bug report will be as familiar with the game as you are. Testers spend
more time in the game—exploring every hidden path, closely
examining each asset—than almost anyone else on the entire project
team. A well-written bug will give a reader who is not familiar with
the game a good sense of the type and severity of the defect it
describes.

Just the Facts, Ma’am

The truth is that defects stress out development teams, especially
during “crunch time.” Each new bug added to the database means
more work still has to be done. An average-sized project can have
hundreds or thousands of defects reported before it is completed.
Developers can feel overwhelmed and might, in turn, get hostile if
they feel their time is being wasted by frivolous or arbitrary bugs.
That’s why good bug writing is fact based and unbiased.
The guard’s hat would look better if it was blue.
This is neither a defect nor a fact; it’s an unsolicited and arbitrary
opinion about design. There are forums for such opinions—
discussions with the lead tester, team meetings, play testing feedback
—but the bug database isn’t one of them.
A common complaint in many games is that the artificial
intelligence, or AI, is somehow lacking. (AI is a catch-all term that
means any opponents or NPCs controlled by the game code.)
The AI is weak.

This could indeed be a fact, but it is written in such a vague and
general way that it is likely to be considered an opinion. A much better
way to convey the same information is to isolate and describe a
specific example of AI behavior and write up that specific defect. By
boiling issues down to specific facts, you can turn them into defects
that have a good chance of being addressed.

TIP

Before you begin to write a bug report, you have to be certain that you
have all your facts.

Brief Description

Larger databases could contain two description fields: Brief
Description (or Summary) and Full Description (or Steps). The Brief
Description field is used as a quick reference to identify the bug. This
should not be a cute nickname, but a one-sentence description that
allows team members to identify and discuss the defect without having
to read the longer, full description each time. Think of the brief
description as the headline of the defect report.
Crash to desktop.

This is not a complete sentence, nor is it specific enough to be a
brief description. It could apply to one of dozens of defects in the
database. The brief description must be brief enough to be read easily
and quickly, but long enough to describe the bug.
The saving system is broken.

This is a complete sentence, but it is not specific enough. What did
the tester experience? Did the game not save? Did a saved game not
load? Does saving cause a crash?
Crash to desktop when choosing “Options” from Main Menu.

This is a complete sentence, and it is specific enough so that
anyone reading it will have some idea of the location and severity of
the defect.
Game crashed after I killed all the guards and doubled
back through the level to get all the pick-ups and
killed the first re-spawned guard.

This is a run-on sentence that contains far too much detail. A good
way to boil it down might be
Game crashes after killing respawned guards.

The one-sentence program descriptions used by cable television
guides and download stores can provide excellent examples of brief
description writing—they boil an entire one-hour cop show or two-
hour movie into one sentence.

TIP

Write the full description first, and then write the brief description.
Spending some time polishing the full description will help you
understand the most important details to include in the brief
description.

Full Description

If the brief description is the headline of a bug report, the Full
Description field provides the gory details. Rather than a prose
discussion of the defect, the full description should be written as a
series of brief instructions so that anyone can follow the steps and
reproduce the bug. Like a cooking recipe—or computer code, for that
matter—the steps should be written in second person imperative, as
though you were telling someone what to do. The last step is a
sentence (or two) describing the bad result.

1. Launch the game.
2. Watch the animated logos. Do not press ESC to skip through them.
--> Notice the bad strobing effect at the end of the Developer logo.

The fewer steps, the better; and the fewer words, the better.
Remember Brad Pitt’s warning to Matt Damon in Ocean’s Eleven:
don’t use seven steps when four will do. Time is a precious resource
when developing a game. The less time it takes a programmer to read,
reproduce, and understand the bug, the more time he has to fix it.

1. Launch game.
2. Choose multiplayer.
3. Choose skirmish.
4. Choose “Sorrowful Shoals” map.
5. Choose two players.
6. Start game.

These are very clear steps, but for the sake of brevity they can be
boiled down to

1. Start a two-player skirmish game on “Sorrowful Shoals.”

Sometimes, however, you need several steps. The following bug
describes a problem with a power-up called “mugging,” which steals
any other power-up from any other unit.

1. Create a game against one human player. Choose Serpent tribe.
2. Send a swordsman into a Thieves Guild to get the Mugging power-up.
3. Have your opponent create any unit and give that unit any power-up.
4. Have your Swordsman meet the other player’s unit somewhere neutral on the map.
5. Activate the Mugging power-up.
6. Attack your opponent’s unit.
--> Crash to desktop as Swordsman strikes.

This might seem like many steps, but it is the quickest way to
reproduce the bug. Every step is important to isolate the behavior of
the mugging code. Even small details, like meeting in a neutral place,
are important, because meeting in occupied territory might bring allied
units from one side or another into the fight, and the test might then be
impossible to perform.

TIP

Good bug writing is precise yet concise.

Great Expectations

Oftentimes, the defect itself will not be obvious from the steps in the
full description. Because the steps produce a result that deviates from
player expectation, but does not produce a crash or other severe or
obvious symptom, it is sometimes necessary to add two additional
lines to your full description: Expected Result and Actual Result.
FIGURE 14.5 Fallout 4: One would expect player-placed structures
to appear grounded on the terrain rather than floating
above it.

Expected Result describes the behavior that a normal player would
reasonably expect from the game if the steps in the bug were followed.
This expectation is based on the tester’s knowledge of the design
specification, the target audience, and precedents set (or broken) in
other games, especially games in the same genre.
Actual Result describes the defective behavior. Here’s an example.

1. Create a multiplayer game.
2. Click Game Settings.
3. Using your mouse, click any map on the map list.
Remember the map you clicked on.
4. Press up or down directional keys on your
keyboard.
5. Notice the highlight changes. Highlight any other
map.
6. Click Back.
7. Click Start Game.
Expected Result: Game loads map you chose with the
keyboard.
Actual Result: Game loads map you chose with the mouse.

Although the game loaded a map, it wasn’t the map the tester
chose with the keyboard (the last input device he used). That’s a bug,
albeit a subtle one. Years of precedent creates the expectation in the
player’s mind that the computer will execute a command based on the
last input the player gave. Because the map-choosing interface failed
to conform to player expectation and precedent, it could be confusing
or annoying, so it should be written up as a bug.
Use the Expected/Actual Result steps sparingly. Much of the time,
defects are obvious (see Figure 14.5). Here’s an example of “stating
the obvious” in a crash bug.
4. Choose “Next” to continue.
Expected Result: You continue.
Actual Result: Game locks up. You must reboot the
console.

It is understood by all members of the project team that the game
shouldn’t crash. Don’t waste time and space stating that with an
unnecessary statement of Expected and Actual Results.
You should use these statements sparingly in your bug reports, but
you should use them when necessary. They can often make a
difference when a developer wants to close a bug in the database by
declaring it “by design,” “working as intended,” or “NAB” (Not a
Bug).

INTERVIEW
More players are playing games than ever before. As any human
population grows—and the pool of game players has grown
exponentially over the last decade—that population becomes more
diverse. Players are different from each other, have different levels of
experience with games, and play games for a range of different
reasons. Some players want a competitive experience, some an
immersive experience, some want a gentle distraction.
The pool of game testers in any organization is always less diverse
than the player base of the game they are testing. Game testers are
professionals, they have skills in manipulating software interfaces,
they are generally (but not necessarily) experienced game players. It’s
likely that if your job is creating games, you’ve played video
games—a lot of them. But not every player is like you.
Brent Samul, QA Lead for developer Mobile Deluxe, put it this
way: “The biggest difference when testing for mobile is your
audience. With mobile you have such a broad spectrum of users.
Having played games for so long myself, it can sometimes be really
easy to overlook things that someone who doesn’t have so much
experience in games would get stuck on or confused about.”
It’s a big job. “With mobile, we have the ability to constantly
update and add or remove features from our games. There are always
multiple things to test for with all the different configurations of
smartphones and tablets that people have today,” Mr. Samul says.
Although testers should write bugs against the design
specification, the authors of that specification are not omniscient. As
the games on every platform become more and more complex, it’s the
testers’ job to advocate for the players—all players—in their bug
writing. (Permission Brent Samul)

Habits to Avoid

For the sake of clarity, effective communication, and harmony among members of the project team, try to avoid two common bug writing pitfalls: humor and jargon.
Although humor is often welcome in high-stress situations, it is
not welcome in the bug database. Ever. There are too many chances
for misinterpretation and confusion. During crunch time, tempers are
short, skins are thin, and nerves are frayed. The defect database could
already be a point of contention. Don’t make the problem worse with
attempts at humor (even if you think your joke is hilarious). Finally, as
the late William Safire warned, you should “avoid clichés like the
plague.”
It perhaps seems counterintuitive to want to avoid jargon in such a
specialized form of technical writing as a bug report, but it is wise to
do so. Although some jargon is unavoidable, and each project team
quickly develops its own nomenclature specific to the project they’re
working on, testers should avoid using (or misusing) too many
obscure technical terms or acronyms. Remember that your audience
could range from programmers to financial or marketing executives,
so use plain language as much as possible.
Although testing build after build might seem repetitive, each new
build provides exciting new challenges with its own successes (fixed
bugs and passed tests) and shortfalls (new bugs and failed tests). The
purpose of going about the testing of each build in a structured manner
is to reduce waste and to get the most out of the game team. Each time
around, you get new build data that is used to re-plan test execution
strategies and update or improve your test suites. From there, you
prepare the test environment and perform a smoke test to ensure the
build is functioning well enough to deploy to the entire test team.
Once the test team is set loose, your top priority is typically regression
testing to verify recent bug fixes. After that, you perform many other
types of testing in order to find new bugs and to check that old ones
have not re-emerged. New defects should be reported in a clear,
concise, and professional manner after an appropriate amount of
investigation. Once you complete this journey, you are rewarded with
the opportunity to do it all over again.

EXERCISES

1. Briefly describe the difference between the Expected Result and
the Actual Result in a bug write-up.
2. What’s the purpose of regression testing?
3. Briefly describe the steps in preparing a test configuration.
4. What is a “knockdown list”? Why is it important?
5. True or False: Black-box testing refers to examining the actual
game code.
6. True or False: The Brief Description field of a defect report should
include as much information as possible.
7. True or False: White-box testing describes the testing of gameplay.
8. True or False: Version control should be applied only to the
developers’ code.
9. True or False: A “Verify Fix” status on a bug means it will remain
on the knockdown list for at least one more test cycle.
10. True or False: A tester should write as many steps as possible
when reporting a bug to ensure the bug can be reliably reproduced.
11. On a table next to a bed is a touch-tone landline telephone. Write
step-by-step instructions for using that phone to dial the following
local number: 555-1234. Assume the person reading the
instructions has never seen or used a telephone before.

1 This chapter appeared in Game Testing, Third Edition, C. Schultz and R. D. Bryant. Copyright 2017 Mercury Learning and Information. All rights reserved.
CHAPTER
15

BASIC TEST PLAN TEMPLATE1


Game Name
1. Copyright Information

Table of Contents
SECTION I: QA TEAM (and areas of responsibility)
1. QA Lead
a. Office phone
b. Home phone
c. Mobile phone
d. Email / IM / VOIP addresses
2. Internal Testers
3. External Test Resources

SECTION II: TESTING PROCEDURES


1. General Approach
a. Basic Responsibilities of Test Team
i. Bugs
1. Detect them as soon as possible after they enter the build
2. Research them
3. Communicate them to the dev team
4. Help get them resolved
5. Track them
ii. Maintain the Daily Build
iii. Levels of Communication. There’s no point in testing
unless the results of the tests are communicated in some
fashion. There is a range of possible outputs from QA. In
increasing levels of formality, they are:
1. Conversation
2. ICQ/IM/Chat
3. Email to individual
4. Email to group
5. Daily Top Bugs list
6. Stats/Info Dump area on DevSite
7. Formal Entry into Bug Tracking System
2. Daily Activities
a. The Build
i. Generate a daily build.
ii. Run the daily regression tests, as described in “Daily Tests”
which follows.
iii. If everything is okay, post the build so everyone can get it.
iv. If there’s a problem, send an email message to the entire
dev team that the new build cannot be copied, and contact
whichever developers can fix the problem.
v. Decide whether a new build needs to be run that day.
b. Daily Tests
i. Run through a predetermined set of single-player levels,
performing a specified set of activities.
1. Level #1
a. Activity #1
b. Activity #2
c. Etc.
d. The final activity is usually to run an automated
script that reports the results of the various tests and
posts them in the QA portion of the internal Web
site.
2. Level #2
3. Etc.
ii. Run through a predetermined set of multiplayer levels,
performing a specified set of activities.
1. Level #1
a. Activity #1
b. Activity #2
c. Etc.
d. The final activity is usually for each tester involved
in the multiplayer game to run an automated script
that reports the results of the various tests and posts
them in the QA portion of the internal Web site.
2. Level #2
3. Etc.
iii. Email showstopper crashes or critical errors to the entire
team.
iv. Post showstopper crashes or critical errors to the daily top
bugs list (if one is being maintained).
3. Daily Reports
a. Automated reports from the preceding daily tests are posted in
the QA portion of the internal Web site.
4. Weekly Activities
a. Weekly tests
i. Run through every level in the game (not just the preset ones
used in the daily test), performing a specified set of
activities and generating a predetermined set of tracking
statistics. The same machine should be used each week.
1. Level #1
a. Activity #1
b. Activity #2
c. Etc.
2. Level #2
3. Etc.
ii. Weekly review of bugs in the Bug Tracking System
1. Verify that bugs marked “fixed” by the development
team really are fixed.
2. Check the appropriateness of bug rankings relative to
where the project is in the development.
3. Acquire a “feel” for the current state of the game, which
can be communicated in discussions with the producer and
department heads.
4. Generate a weekly report of closed-out bugs.
b. Weekly Reports
i. Tracking statistics, as generated in the weekly tests.
5. Ad Hoc Testing
a. Perform specialized tests as requested by the producer, tech
lead, or other development team members
b. Determine the appropriate level of communication to report the
results of those tests.
6. Integration of Reports from External Test Groups
a. If at all possible, ensure that all test groups are using the same
bug tracking system.
b. Determine which group is responsible for maintaining the
master list.
c. Determine how frequently to reconcile bug lists against each
other.
d. Ensure that only one consolidated set of bugs is reported to the
development team.
7. Focus Testing (if applicable)
a. Recruitment methods
b. Testing location
c. Who observes them?
d. Who communicates with them?
e. How is their feedback recorded?
8. Compatibility Testing
a. Selection of external vendor
b. Evaluation of results
c. Method of integrating filtered results into bug tracking system

SECTION III: HOW TESTING REQUIREMENTS ARE GENERATED

1. Some requirements are generated by this plan.
2. Requirements can also be generated during project meetings, or
other formal meetings held to review current priorities (such as the
set of predetermined levels used in the daily tests).
3. Requirements can also result from changes in a bug’s status within
the bug tracking system. For example, when a bug is marked
“fixed” by a developer, a requirement is generated for someone to
verify that it has been truly killed and can be closed out. Other
status changes include “Need More Info” and “Can’t Duplicate,”
each of which creates a requirement for QA to investigate the bug
further.
a. Some requirements are generated when a developer wants QA
to check a certain portion of the game (see “Ad Hoc Testing”).

SECTION IV: BUG TRACKING SOFTWARE


1. Package name
2. How many seats will be needed for the project?
3. Access instructions (Everyone on the team should have access to
the bug list)
4. “How to report a bug” instructions for using the system

SECTION V: BUG CLASSIFICATIONS


1. “A” bugs and their definition
2. “B” bugs and their definition
3. “C” bugs and their definition

SECTION VI: BUG TRACKING


1. Who classifies the bug?
2. Who assigns the bug?
3. What happens when the bug is fixed?
4. What happens when the fix is verified?

SECTION VII: SCHEDULING AND LOADING


1. Rotation Plan. How testers will be brought on and off the project,
so that some testers stay on it throughout its life cycle while “fresh
eyes” are periodically brought in.
2. Loading Plan. Resource plan that shows how many testers will be
needed at various points in the life of the project.

SECTION VIII: EQUIPMENT BUDGET AND COSTS


1. QA Team Personnel with Hardware and Software Toolset
a. Team Member #1
i. Hardware
1. Testing PC
a. Specs
2. Console Debug Kit
a. Add-ons (TV, controllers, etc.)
3. Record/capture hardware or software (see
“Capture/Playback Testing”)
ii. Software Tools Needed
1. Bug tracking software
2. Other
b. Team Member #2
c. Etc.
2. Equipment Acquisition Schedule and Costs (summary of who
needs what, when they will need it, and how much it will cost)

1 This chapter appeared in Game Testing, Third Edition, C. Schultz and R. D. Bryant. Copyright 2017 Mercury Learning and Information. All rights reserved.
APPENDIX
A

QUALITY ASSURANCE AND TESTING TOOLS

IEEE/ANSI Standard, Software Test Process, and Purpose

829–1983, Software Test Documentation: This standard covers the entire testing process.

1008–1987, Software Unit Testing: This standard defines an integrated approach to systematic and documented unit testing.

1012–1986, Software Verification and Validation Plans: This standard provides uniform and minimum requirements for the format and content of software verification and validation plans.

1028–1988, Software Reviews and Audits: This standard provides direction to the reviewer or auditor on the conduct of evaluations.

730–1989, Software Quality Assurance Plans: This standard establishes a required format and a set of minimum contents for software quality assurance plans.

828–1990, Software Configuration Management Plans: This standard is similar to IEEE standard 730, but deals with the more limited subject of software configuration management. This standard identifies requirements for configuration identification, configuration control, configuration status reporting, configuration audits and reviews.

1061–1992, Software Quality Metrics Methodology: This standard provides a methodology for establishing quality requirements. It also deals with identifying, implementing, analyzing, and validating the process of software quality metrics.

Description / Tools

Functional/Regression Testing: WinRunner, SilkTest, Quick Test Pro (QTP), Rational Robot, Visual Test, In-house Scripts

Load/Stress Testing (Performance): LoadRunner, Astra Load Test, Application Center Test (ACT), In-house Scripts, Web Application Stress Tool (WAS)

Test Case Management: Test Director, Test Manager, In-house Test Case Management tools

Defect Tracking: TestTrack Pro, Bugzilla, Elementool, ClearQuest, TrackRecord, In-house Defect Tracking tools of clients

Unit/Integration Testing: C++ Test, JUnit, NUnit, PhpUnit, Check, Cantata++
APPENDIX
B

SUGGESTED PROJECTS
1. ONLINE CHATTING
Develop a software package that will act as an online community,
which you can use to meet new friends using voice, video, and text.
The community has rooms for every interest, where people can
communicate by using real-time multi-point video, text, and voice in
“calls.” The package should also have video and voice instant messages with different colours, fonts, and overlays to choose from, or
you can participate in a real-time text chat room.
Also incorporate broadcast, where one host shares his/her video,
voice, and text chat with up to ten other viewers. Everyone involved in
the broadcast can communicate with text chat. Also add the option of
profiles; wherein each community member has the opportunity to post
a picture and some optional personal information that can be accessed
from the directory by any other community member.

2. DESIGN AN ANSWERING MACHINE


Design a software package, which acts as an answering machine for
your voice modem. It features call monitoring and logging, Caller ID
with pop-ups and voice alerts, customisable and personalized
greetings, and conversation recording. The answering machine can
send call notifications to your pager and voice messages via email. It
can also block certain types of incoming calls, even without Caller ID
information. The package also provides integration with sound cards,
doesn’t require an extra speaker attached to the modem, and has the
ability to compress voice messages to be forwarded via email. Add the
facility of remote message retrieval and address book.
3. BROWSER
Design a fast, user-friendly, versatile Internet/intranet browser,
Monkey, that also includes a newsreader. The keyboard plays an
integral role in surfing, which can make moving around the Web easy
and fast. You can run multiple windows, even at start-up, and special
features are included for users with disabilities. Other options include
the ability to design your own look for the buttons in the program; file
uploading support to be used with forms and mail; an option to turn off tables; advanced cookie filtering; and a host of other powerful
features, including an email client enhancement, keyboard shortcuts.
Also ensure integrated search and instant messaging, email support, and accessibility to different Web sites.

4. WEB LOADER
Design a software package, named Web Loader, that enables you to
download Web sites quickly to your local drive and navigate offline. It
uses project files to store lists of sites and includes several project files
ready to use. Web Loader should include predefined filters to download a complete site, a directory (and all subdirectories in it), or a single page. One should be able to define one’s own customised filters by levels, external links, directories, include/exclude links, external images, wildcards, and so on. The tool should be able to download up to 100 files simultaneously. It should also print an entire site or only specific parts of it. The tool should allow the user to specify an account name and password to access secure Web sites.

5. WEB CATCHER
Design a multipurpose software package, Web Catcher, that allows
you to capture Web pages and store them on the hard disk for offline
viewing. The tool should create photo albums, greeting cards, etc.
Package any item created or captured in a special format so that it can be emailed from within the program using your personal email account. Develop a drop-down menu that records the addresses from any open browser window, allowing you to easily select which pages to
download. Also specify the link depth of each capture. The Web
Catcher engine should be able to capture different file formats and
Web-based publishing languages.

6. WEATHER MATE
Develop a software package that receives up-to-date weather
information on your desktop from the Internet. Retrieve weather
forecasts from hundreds of major cities in countries around the world.
You can also generate assorted full-color weather maps. Configure
Weather Mate to update weather information at set intervals and to reside in the system tray. The system tray icon changes to reflect the current weather. Also add a “Speak Forecast” option using
Microsoft Agent 2.0, which allows you to have your local or extended
forecast read to you.

7. PARTY PLANNER
With Party Planner, you can create guest lists for parties, weddings,
banquets, reunions, and more. You can also assign meals and plan
seating arrangements. You can have up to 1,000 guests with 100 on a
reserve list, plan ten different meals, and arrange up to 99 tables (with
up to 50 guests at each). You can print several different reports for all
of these lists.

8. SMILING JOKER
Develop a software package, Smiling Joker, which uses text-to-speech
and voice recognition technology to make it fully interactive. It should
be able to tell you jokes, useless facts, and even sing to you. It also has
a built-in calendar, which it uses to remind you of appointments,
birthdays, due dates, or anything else you want it to remind you about.
It should also have the ability to read aloud your email, Web page, or
anything else you want it to read.
9. MYTOOL
Prepare a project, named MyTool, which is an all-in-one desktop and
system utility program. Its features include an easy-to-use personal
calendar that helps you manage your daily events and can remind you
of important events in a timely manner. You can create quick notes on
your desktop with the embedded WordPad applet. Schedule events,
shut down Windows, download software, and more, all automatically
at intervals you set.

10. TIMETRAKER
Track the time using your computer. Set alarms for important
appointments, lunch, and so on. Sum the time spent on projects
weekly, monthly, yearly, and do much more.

11. SPEAKING CALENDAR


Develop a software package, named Speaking Calendar, that announces and speaks the time. It also speaks the date and appointments to
you. You may set as many appointments or alarms as you wish. You
can also choose whether you want to confirm the message and whether
the message should be repeated every hour, week, or month, or played
only once. This tool also allows you to select your favourite music or
song to provide a wake-up tune for your morning routine. Let it
support various formats, such as MP3, WAV files, and all basic CD
formats for your musical selection. You may use Microsoft Speaking
Agent, if you like.

12. DATA VIEWER


Design a software tool, called Data Viewer, which provides a common
interface to databases. You should be able to view tables and schemas in a convenient tree view and execute queries and updates. Exhibit results in table format. Results may be saved or viewed as HTML and also
forwarded as email.

13. CLASS BROWSER


Design a tool, named Class Browser, which allows developers to view methods, fields, string constants, referred classes, and referred
methods defined in the Java/C++ class file. Classes can be either
locally stored or downloaded from the Web site. This tool should
allow users to browse into any referred class.

14. CODE ANALYZER


Develop a tool, called Code Analyzer, that checks that Java/C++/C/Visual Basic source code conforms to a set of coding convention rules.

15. CODE COVERAGE


Design a software tool, named Code Coverage, which allows
automatic location of untested code in Java/C++/C/Visual Basic
applications. It should tell exactly what method, line and block of code
in the program did or did not execute, listed per class, file, and
method. It should measure and track code execution.

16. VERNACULAR WORD


Develop a word processor for Spanish. You should be able to send
emails, build Web sites, and export the language text into graphics software such as Adobe Photoshop.

17. FEECHARGER
Translators, language professionals, and international writers charge
based upon the number of words, lines, and/or characters in the
documents they handle. They need to check the currency conversion
rates very frequently. They go online, find and calculate the current
rate. Design a tool that does all this.

18. ENGLISH-SPANISH TRANSLATOR


Design a tool to translate words from English to Spanish by right
clicking. Select a word in any active window, then right-click; the
translator will pop up and show you the translation. Click on
synonyms to find similar words. A hierarchical display should show
you the translation, organized by parts of speech. If you want the
adjective form of a noun, a click should show it. Select the word you
want, and it should be replaced in the active document. You should be
able to get conjugation, tense, and gender, as well as synonym phrases
and root words.

19. PROTECT PC
Develop a software tool, named Protect PC, which locks and hides
desired folders, files, data, and applications. The tool should provide a
user-friendly solution for security and privacy. It should have the
security and privacy of 128-bit encryption. Also have the feature so
that hackers cannot find encrypted files. Also provide Network, FDD,
and CD-ROM Locking functions, and a Delete function for List of
Recent Documents and Windows Explorer.

20. SECURE DEVICE


Design a software tool, named, Secure Device, which gives network
administrators control over which users can access what removable
devices (DVDs, etc.) on a local computer. It should control access to
DVDs, or any other device, depending on the time and date. Windows
System Administrators should have access control and be able to
control removable disk usage. It should protect network and local
computers against viruses, Trojans, and other malicious programs
often introduced by removable disks. Network administrators should
be able to flush a storage device’s buffers. Try to build remote control
also, if possible.

21. DATA ENCRYPTOR


Develop a software package, named Data Encryptor, which allows
using five well-known, highly secure encryption algorithms. The tool
should help to encrypt single files, groups of files, or entire folders,
including all subfolders, quickly and easily. One should be able to
work with encrypted folders as simply as with usual folders (except
entering the unlocking password). The tool should assess the
efficiency of encryption and decryption algorithms also.

22. INTERNET CONTROLLER


Parents, schools, libraries, and others may wish to control access to the
Internet. It may also be desirable to monitor which Websites users visit,
what they type (via keystroke logging), which programs they access,
and the time they spend using the programs. You may also like to
secure Windows so that users cannot run unauthorized programs or
modify Windows configurations such as wallpaper and network
settings. Develop a software tool to do all this, with additional features
such as screenshot capturing, enhanced keystroke capturing (capturing
lowercase and special characters), and the ability to email log files to a
specified email address.

23. WEB SPY


Design a software tool, named Web Spy, which automatically records
every screen that appears on your PC. It records the Websites visited,
incoming and outgoing email, chat conversations, passwords, and
every open application invisibly. Frames can be played back like a
video recording, enabling you to see what users have done. Include the
following features: inactivity suspend, color, password protection, single-frame
viewing, fast forward, rewind, save, search, and print.

24. FAXMATE
Design a software tool, FaxMate, which creates, sends, and receives
faxes. Incoming faxes should be announced by customizing the
program with a sound file. Include a library of cover sheet templates,
and provide a phone number database for frequently used numbers.
Use the latest fax technology, including fax-modem autodetection.

25. PROJECT TRACKER


Develop a software tool, Project Tracker, which helps to manage
software development projects by tracking software bugs, action
items, and change requests with problem reports. Try to include the
following features as far as possible:
Records problem reports in a network database.
Supports simultaneous access to the database by multiple users.
Supports multiple projects.
Supports configuration of data collection for each project.
Supports configuration of workflow for each project.
Provides redundant data storage for speed and backup.
Classifies problems according to priority, type, and other
properties.
Supports sort and search operations on the database of problem
reports.
Assigns personnel to perform the tasks required to correct bugs.
Tracks the tasks of problem investigation, resolution, and
verification.
Determines which testing methods are most effective in detecting
bugs.
Helps estimate the time required to complete projects.
Estimates personnel workload.
Exports reports in HTML format.
Provides email notification.
Supports document attachments.

26. SOURCE TRANSLATOR


Design a software tool that translates Visual Basic program source
code into different languages. Include the following features:
Autosearching of strings (text lines) in source code.
The ability to manually translate strings.
Functions with string lists, including search, delete, and so on.
The ability to import strings from a file.
The ability to automatically replace translated strings.

Allow at least five files to be translated.

27. SOURCE LINE COUNTER


Develop a software tool, named Source Line Counter, which counts
the number of lines of code in a given software file or project written
in any high-level language (C/C++/Java/Pascal/Visual Basic). In fact,
this tool should accurately count almost any language that uses
comment-line and comment-block delimiters. SLC should also permit
recursively counting the files in a directory based on a selected
HLL, or automatically counting all of the known file types in a directory.
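A minimal sketch of the counting logic, assuming C/Java-style line and block comment delimiters; file handling, language selection, and directory recursion are left to the reader, and the class and method names are illustrative only.

import java.util.List;

public class SourceLineCounter {
    // Counts non-blank, non-comment lines, assuming C/Java-style comment delimiters.
    public static int countSourceLines(List<String> lines) {
        int count = 0;
        boolean inBlockComment = false;
        for (String raw : lines) {
            String line = raw.trim();
            if (inBlockComment) {
                if (line.contains("*/")) inBlockComment = false;   // block comment ends here
                continue;
            }
            if (line.isEmpty() || line.startsWith("//")) continue; // blank or line comment
            if (line.startsWith("/*")) {                           // block comment starts
                if (!line.contains("*/")) inBlockComment = true;
                continue;
            }
            count++;                                               // a countable source line
        }
        return count;
    }
}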

28. IMAGE VIEWER


Develop a software, named Image Viewer, which views and edits
images and supports all major graphic formats, including BMP, DIB,
JPEG, GIF, animated GIF, PNG, PCX, multipage TIFF, TGA, and
others. It should also have features such as drag-and-drop, directory
viewing, TWAIN support, slide shows, batch conversions, and
modifications such as color depth, crop, blur, and sharpen. Incorporate
thumbnail, print, batch, and view menu options as well.
29. MOVIEEDITOR
Develop a software tool, named MovieEditor, which can edit and
animate video clips, add titles and audio, or convert video formats. It
should support real-time preview and allow experimenting as well.
Build Internet functionality such as RealVideo, ASF, and QuickTime
into this tool. The tool should also have the capability so that all
current video, graphic, and audio formats can be imported into a video
production, animated, and played back in different formats. The tool
should convert all current image, video, and audio files, and may be
used as a multimedia browser for displaying, searching, and
organizing images, videos, and audio files. It should provide support
for both native DV and FireWire interfaces with integrated device
control for all current camcorders.

30. TAXMASTER
Develop a software package, named TaxMaster, which helps to
complete income tax and sales tax returns. It should make available all
the rules, forms, schedules, worksheets, and information applicable in
the US scenario for completing the tax forms. The tool should calculate
returns for the user, then review the return and alert the user to possible
errors. It should provide the facility to file the return electronically or
print a paper return, so your tool should be available on the Web.

31. CALENDAR DESIGNER


Develop a software tool, named Calendar Designer, which creates
personalized calendars and has scheduling features as well. Build into
it dialogs and a wizard that guide you through the process of creating
and modifying event entries for your calendar. Provide a preview
feature for the calendar. Create monthly printouts with font and color
formatting, using different color combinations to make the calendar
attractive. Also include a feature that reminds you of upcoming
events at system startup.
32. DIGITAL CIRCUITS
Develop the graphical design tool, named Digital Circuits, which
enables you to construct digital logic circuits and to analyze their
behavior. Circuits can be composed of simple gates (AND, OR,
NAND, NOR, XOR, XNOR, NOT) and simple flip-flops (D, RS, and
JK). You can also use tri-state logic to construct systems with buses.
Digital Circuits also provides mechanisms for detecting race
conditions and bus contention.
The package should have the feature to create macros, so that you
can convert a circuit into a logic element itself. The new logic element
can be used as a building block in the construction of more complex
circuits. The complex circuit can also be converted into a new logic
element, and so on.
This would enable you to create a hierarchy of digital objects, with
each new level hiding the complexity of its implementation. Some
examples of macros are counters, shift registers, and data registers. You
should even be able to create integrated circuits, e.g., a 74HC08.

33. RANDOM TEST GENERATOR


Develop a package, named Random Test Generator, which will create
tests made up of randomly selected questions from test data banks that
you create. The tool should let you choose how many questions you
want from each data bank. The program should not select the same
question twice for the same test. You should be able to create as many
tests as you need, so that each student can have a different test. Add
Internet testing capabilities with automatic HTML creation and a Test
Item Analysis feature to track every test question for purposes of
analysis.
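One way to guarantee that no question is selected twice within a test is to shuffle a copy of the data bank and take a prefix. A minimal sketch, with illustrative class and method names:

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class RandomTestGenerator {
    // Picks 'howMany' distinct questions from one data bank, with no repetition within a test.
    public static List<String> pickQuestions(List<String> dataBank, int howMany) {
        List<String> pool = new ArrayList<>(dataBank);  // copy, so the bank itself is untouched
        Collections.shuffle(pool);                      // random order
        return new ArrayList<>(pool.subList(0, Math.min(howMany, pool.size())));
    }
}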

34. METRIC
Study and define metrics for any of the programming languages
Java/C++/C/Perl/Visual Basic. Develop a software tool, called Metric,
which determines the source-code metrics and McCabe metrics for these
languages.
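As a rough starting point (not a full parser), the cyclomatic complexity of a single routine can be approximated by counting decision keywords and short-circuit operators and adding one. The keyword list below is an assumption for Java/C-like source; a real tool would parse the code properly.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Metric {
    // Decision points: if, for, while, case, catch, and the short-circuit operators.
    private static final Pattern DECISIONS =
            Pattern.compile("\\b(if|for|while|case|catch)\\b|&&|\\|\\|");

    // Approximate McCabe complexity of one routine = number of decisions + 1.
    public static int approxCyclomaticComplexity(String routineSource) {
        Matcher m = DECISIONS.matcher(routineSource);
        int decisions = 0;
        while (m.find()) decisions++;
        return decisions + 1;
    }
}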

35. FUNCTIONPTESTIMATOR
Study the function point method for size estimation for software
projects. Design a software package, named FunctionPtEstimator,
which computes the function points and the corresponding KLOC for
a project. Your tool should be able to prepare a comparative analysis
for different programming languages.
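The core computation can be sketched as follows, using Albrecht's average complexity weights; the LOC-per-FP figure is supplied by the caller because it varies by language and by the conversion table used, so any figures you plug in are assumptions to be replaced by your own data.

public class FunctionPtEstimator {
    // Average complexity weights for the five component types (Albrecht's FPA).
    static final int W_EI = 4, W_EO = 5, W_EQ = 4, W_ILF = 10, W_EIF = 7;

    // Unadjusted function points from component counts (average complexity assumed).
    public static int unadjustedFP(int ei, int eo, int eq, int ilf, int eif) {
        return ei * W_EI + eo * W_EO + eq * W_EQ + ilf * W_ILF + eif * W_EIF;
    }

    // Adjusted FP: VAF = 0.65 + 0.01 * (sum of the 14 general system characteristics, each rated 0-5).
    public static double adjustedFP(int ufp, int gscTotal) {
        return ufp * (0.65 + 0.01 * gscTotal);
    }

    // KLOC estimate from FP, given a LOC-per-FP figure for the chosen language.
    public static double kloc(double fp, double locPerFP) {
        return fp * locPerFP / 1000.0;
    }
}

Running kloc() with different LOC-per-FP figures then gives the comparative analysis across languages.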

36. RISK ANALYZER


Develop a software tool, named Risk Analyzer, which estimates and
analyzes the risk involved in any software project at the different stages of
its development, such as Requirement Specification and Analysis,
Design, Coding, Testing, and so on.

37. WEB ANALYZER


Develop a front-end software tool, in Java, for Web log analysis
providing the following capabilities: different reports, customized
reports, path analysis, management of reports, auto scheduling of
report generation, exporting data to spreadsheets, support for proxy
logs, intranet reports, and customizable log format.

38. CLASS VIEWER


Design a tool, named Class Viewer, in Java, which allows developers
to view methods, fields, string constants, referred classes and referred
methods defined in a Java class file. Classes can be either locally
stored or downloaded from a Web site. Class Viewer should allow
users to browse into any referred class.
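For locally stored classes that can be placed on the classpath, the java.lang.reflect API is one possible starting point; parsing .class files directly, or fetching them from a Web site, needs additional machinery. A minimal sketch (names are illustrative):

import java.lang.reflect.Field;
import java.lang.reflect.Method;

public class ClassViewerSketch {
    // Prints the declared fields and methods of a class available on the classpath.
    public static void describe(String className) throws ClassNotFoundException {
        Class<?> c = Class.forName(className);
        System.out.println("Class: " + c.getName());
        for (Field f : c.getDeclaredFields()) {
            System.out.println("  field:  " + f.getType().getSimpleName() + " " + f.getName());
        }
        for (Method m : c.getDeclaredMethods()) {
            System.out.println("  method: " + m.getReturnType().getSimpleName() + " " + m.getName());
        }
    }

    public static void main(String[] args) throws Exception {
        describe(args.length > 0 ? args[0] : "java.util.ArrayList");
    }
}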

39. STRUCTURE TESTER


Develop a software tool, named Structure Tester, in Java, which
analyzes C/C++/Java source code. It should perform function coverage
(whether each function is invoked), branch coverage (whether each
branch point is followed in every possible direction), condition/decision
coverage (whether each condition within a decision is exercised), and
multiple condition coverage (whether all permutations of conditions
are exercised).

40. WORLDTIMER
Develop a software tool, named WorldTimer, which displays the date
and time of cities around the globe in eight clocks. It should provide
information, such as language, population, currency, and telephone
codes, for the capital cities of all the countries. It should show the
sunrise/sunset line in a world map and the difference in time between
any two cities.
APPENDIX C

GLOSSARY
Abstract class: A class that cannot be instantiated, i.e., it cannot have
any instances.
Abstract test case: See high-level test case.
Acceptance: See acceptance testing.
Acceptance criteria: The exit criteria that a component or system
must satisfy in order to be accepted by a user, customer, or other
authorized entity. [IEEE 6.10]
Acceptance testing: It is done by the customer to check whether the
product is ready for use in the real-life environment. Formal testing
with respect to user needs, requirements, and business processes
conducted to determine whether or not a system satisfies the
acceptance criteria and to enable the user, customers, or other
authorized entity to determine whether or not to accept the system.
[After IEEE 610]
Accessibility testing: Testing to determine the ease by which users
with disabilities can use a component or system. [Gerrard]
Accuracy: The capability of the software product to provide the right
or agreed results or effects with the needed degree of precision. [ISO
9126] See also functionality testing.
Activity: A major unit of work to be completed in achieving the
objectives of a hardware/ software project.
Actor: An actor is a role played by a person, an organization, or any
other device that interacts with the system.
Actual outcome: See actual result.
Actual result: The behavior produced/observed when a component or
system is tested.
Ad hoc review: See informal review.
Ad hoc testing: Testing carried out informally; no formal test
preparation takes place, no recognized test design technique is used,
there are no expectations for results and randomness guides the test
execution activity.
Adaptability: The capability of the software product to be adapted for
different specified environments without applying actions or means
other than those provided for this purpose for the software considered.
[ISO 9126] See also portability testing.
Agile testing: Testing practice for a project using agile
methodologies, such as extreme programming (XP), treating
development as the customer of testing and emphasizing the test-first
design paradigm.
Aggregation: The process of building up complex objects out of
existing objects.
Algorithm test [TMap]: See branch testing.
Alpha testing: Simulated or actual operational testing by potential
users/customers or an independent test team at the developers’ site, but
outside the development organization. Alpha testing is often employed
as a form of internal acceptance testing.
Analyst: An individual who is trained and experienced in analyzing
existing systems to prepare SRS (software requirement specifications).
Analyzability: The capability of the software product to be diagnosed
for deficiencies or causes of failures in the software, or for the parts to
be modified to be identified. [ISO 9126] See also maintainability
testing.
Analyzer: See static analyzer.
Anomaly: Any condition that deviates from expectation based on
requirements specifications, design documents, user documents,
standards, etc. or from someone’s perception or experience.
Anomalies may be found during, but not limited to, reviewing, testing,
analysis, compilation, or use of software products or applicable
documentation. [IEEE 1044] See also defect, deviation, error, fault,
failure, incident, or problem.
Arc testing: See branch testing.
Atomicity: A property of a transaction that ensures it is completed
entirely or not at all.
Attractiveness: The capability of the software product to be attractive
to the user. [ISO 9126] See also usability testing.
Audit: An independent evaluation of software products or processes
to ascertain compliance to standards, guidelines, specifications, and/or
procedures based on objective criteria, including documents that
specify: (1) the form or content of the products to be produced, (2) the
process by which the products shall be produced, and (3) how
compliance to standards or guidelines shall be measured. [IEEE 1028]
Audit trail: A path by which the original input to a process (e.g., data)
can be traced back through the process, taking the process output as a
starting point. This facilitates defect analysis and allows a process
audit to be carried out. [After TMap]
Automated testware: Testware used in automated testing, such as
tool scripts.
Availability: The degree to which a component or system is
operational and accessible when required for use. Often expressed as a
percentage. [IEEE 610]
Back-to-back testing: Testing in which two or more variants of a
component or system are executed with the same inputs, the outputs
compared, and analyzed in cases of discrepancies. [IEEE 610]
Baseline: A specification or software product that has been formally
reviewed or agreed upon, that thereafter serves as the basis for further
development, and that can be changed only through a formal change
control process. [After IEEE 610] A hardware/software work product
that has been formally reviewed and agreed upon, which serves as the
basis for further development.
Basic block: A sequence of one or more consecutive executable
statements containing no branches.
Basis test set: A set of test cases derived from the internal structure or
specification to ensure that 100% of a specified coverage criterion is
achieved.
Bebugging: See error seeding. [Abbott]
Behavior: The response of a component or system to a set of input
values and preconditions.
Benchmark test: (1) A standard against which measurements or
comparisons can be made. (2) A test that is to be used to compare
components or systems to each other or to a standard as in (1). [After
IEEE 610]
Bespoke software: Software developed specifically for a set of users
or customers. The opposite is off-the-shelf software.
Best practice: A superior method or innovative practice that
contributes to the improved performance of an organization in a
given context, usually recognized as “best” by other peer
organizations.
Beta testing: Operational testing by potential and/or existing
users/customers at an external site not otherwise involved with the
developers, to determine whether or not a component or system
satisfies the user/customer needs and fits within the business
processes. Beta testing is often employed as a form of external
acceptance testing in order to acquire feedback from the market.
Big-bang testing: Testing all modules at once. A type of integration
testing in which software elements, hardware elements, or both are
combined all at once into a component or an overall system, rather
than in stages. [After IEEE 610] See also integration testing.
Black-box technique: See black-box test design technique.
Black-box test design technique: Documented procedure to derive
and select test cases based on an analysis of the specification, either
functional or non-functional, of a component or system without
reference to its internal structure.
Black-box testing: Testing, either functional or non-functional,
without reference to the internal structure of the component or system.
Blocked test case: A test case that cannot be executed because the
preconditions for its execution are not fulfilled.
Bohr bug: A repeatable bug; one that manifests reliably under a
possibly unknown but well-defined set of conditions.
Bottom-up testing: An incremental approach to integration testing
where the lowest level components are tested first, and then used to
facilitate the testing of higher level components. This process is
repeated until the component at the top of the hierarchy is tested. See
also integration testing.
Boundary value: An input value or output value which is on the edge
of an equivalence partition or at the smallest incremental distance on
either side of an edge, for example, the minimum or maximum value
of a range.
Boundary value analysis: A black-box test design technique in which
test cases are designed based on boundary values.
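For example (illustrative), if an input field accepts integers from 1 to
100, boundary value analysis selects values such as 0, 1, 2, 99, 100, and
101, i.e., the boundaries themselves and the values immediately inside
and outside them.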
Boundary value coverage: The percentage of boundary values that
have been exercised by a test suite.
Boundary value testing: See boundary value analysis.
Brainstorming: The process of obtaining ideas, opinions, and
answers to a question in groups. The members of the group are given
the opportunity to contribute.
Branch: A basic block that can be selected for execution based on a
program construct in which one of two or more alternative program
paths are available, e.g., case, jump, go to, if-then-else.
Branch condition: See condition.
Branch condition combination coverage: See multiple condition
coverage.
Branch condition combination testing: See multiple condition
testing.
Branch condition coverage: See condition coverage.
Branch coverage: The percentage of branches that have been
exercised by a test suite. 100% branch coverage implies both 100%
decision coverage and 100% statement coverage.
Branch testing: A white-box test design technique in which test cases
are designed to execute branches.
Budget: A statement of management plans and expected results
expressed in numbers, quantitative, and monetary terms.
Bug: See defect.
Build: Builds are versions or redesigns of a project that is in
development.
Business process-based testing: An approach to testing in which test
cases are designed based on descriptions and/or knowledge of
business processes.
Capability maturity model (CMM): A five-level staged framework
that describes the key elements of an effective software process. The
capability maturity model covers practices for planning, engineering,
and managing software development and maintenance. [CMM]
Capability maturity model integration (CMMI): A framework that
describes the key elements of an effective product development and
maintenance process. The capability maturity model integration covers
practices for planning, engineering, and managing product
development and maintenance. CMMI is the designated successor of
the CMM. [CMMI]
Capture/playback tool: A type of test execution tool where inputs are
recorded during manual testing in order to generate automated test
scripts that can be executed later (i.e., replayed). These tools are often
used to support automated regression testing.
Capture/replay tool: See capture/playback tool.
CASE: Acronym for Computer Aided Software Engineering.
CAST: Acronym for Computer Aided Software Testing. See also test
automation.
Cause-effect analysis: See cause-effect graphing.
Cause-effect decision table: See decision table.
Cause-effect graph: A graphical representation of inputs and/or
stimuli (causes) with their associated outputs (effects) which can be
used to design test cases.
Cause-effect graphing: A black-box test design technique in which
test cases are designed from cause-effect graphs. [BS 7925/2]
Certification: The process of confirming that a component, system, or
person complies with its specified requirements, e.g., by passing an
exam.
Changeability: The capability of the software product to enable
specified modifications to be implemented. [ISO 9126] See also
maintainability.
Checker: See reviewer.
Chow’s coverage metrics: See N-switch coverage. [Chow]
Class: It provides a blueprint for an object.
Class diagram: A diagram used to show the static view of the system
in terms of classes and their relationships.
Classification tree method: A black-box test design technique in
which test cases, described by means of a classification tree, are
designed to execute combinations of representatives of input and/or
output domains. [Grochtmann]
Code analyzer: See static code analyzer.
Code coverage: An analysis method that determines which parts of
the software have been executed (covered) by the test suite and which
parts have not been executed, e.g., statement coverage, decision
coverage, or condition coverage.
Code-based testing: See white-box testing.
Co-existence: The capability of the software product to co-exist with
other independent software in a common environment sharing
common resources. [ISO 9126] See portability testing.
Collaboration diagram: A diagram that focuses on object
interactions irrespective of time.
Commercial off-the-shelf (COTS) software: Hardware or software
that can be purchased and placed in service without additional
development cost for the system or component. See off-the-shelf
software.
Comparator: See test comparator.
Comparison/back-to-back testing: It detects test failures by
comparing the output of two or more programs implemented to the
same specification.
Compatibility testing: See interoperability testing.
Complete testing: See exhaustive testing.
Completion criteria: See exit criteria.
Complexity: The degree to which a component or system has a design
and/or internal structure that is difficult to understand, maintain, and
verify. See also cyclomatic complexity.
Compliance: The capability of the software product to adhere to
standards, conventions, or regulations in laws and similar
prescriptions. [ISO 9126]
Compliance testing: The process of testing to determine the
compliance of a component or system.
Component: A minimal software item that can be tested in isolation.
Component integration testing: Testing performed to expose defects
in the interfaces and interaction between integrated components.
Component specification: A description of a component’s function in
terms of its output values for specified input values under specified
conditions, and required non-functional behavior (e.g., resource-
utilization).
Component testing: The testing of individual software components.
[After IEEE 610]
Compound condition: Two or more single conditions joined by
means of a logical operator (AND, OR, or XOR), e.g., “A>B AND
C>1000.”
Computer Aided Software Engineering (CASE): Automated tools
that help a software engineer to perform tasks which, if done
manually, would be very tedious and cumbersome.
Concrete test case: See low-level test case.
Concurrency testing: Testing to determine how the occurrence of
two or more activities within the same interval of time, achieved either
by interleaving the activities or by simultaneous execution, is handled
by the component or system. [After IEEE 610]
Condition: A logical expression that can be evaluated as True or
False, e.g., A>B. See also test condition.
Condition combination coverage: See multiple condition coverage.
Condition combination testing: See multiple condition testing.
Condition coverage: The percentage of condition outcomes that have
been exercised by a test suite. 100% condition coverage requires each
single condition in every decision statement to be tested as True and
False.
Condition determination coverage: The percentage of all single
condition outcomes that independently affect a decision outcome that
have been exercised by a test case suite. 100% condition determination
coverage implies 100% decision condition coverage.
Condition determination testing: A white-box test design technique
in which test cases are designed to execute single condition outcomes
that independently affect a decision outcome.
Condition outcome: The evaluation of a condition to True or False.
Condition testing: A white-box test design technique in which test
cases are designed to execute condition outcomes.
Confidence test: See smoke test.
Configuration: The composition of a component or system as defined
by the number, nature, and interconnections of its constituent parts.
Configuration auditing: The function to check on the contents of
libraries of configuration items, e.g., for standards compliance. [IEEE
610]
Configuration control: An element of configuration management,
consisting of the evaluation, coordination, approval or disapproval,
and implementation of changes to configuration items after formal
establishment of their configuration identification. [IEEE 610]
Configuration identification: An element of configuration
management, consisting of selecting the configuration items for a
system and recording their functional and physical characteristics in
technical documentation. [IEEE 610]
Configuration item: An aggregation of hardware, software, or both,
that is designated for configuration management and treated as a
single entity in the configuration management process. [IEEE 610]
Configuration management: A discipline applying technical and
administrative direction and surveillance to: identify and document the
functional and physical characteristics of a configuration item, control
changes to those characteristics, record and report change processing
and implementation status, and verify compliance with specified
requirements. [IEEE 610]
Configuration testing: See portability testing.
Confirmation testing: See retesting.
Conformance testing: See compliance testing.
Consistency: The degree of uniformity, standardization, and freedom
from contradiction among the documents or parts of a component or
system. [IEEE 610]
Control flow: An abstract representation of all possible sequences of
events (paths) in the execution through a component or system.
Control flow graph: See control flow.
Control flow path: See path.
Conversion testing: Testing of software used to convert data from
existing systems for use in replacement systems.
Correctness: The extent to which software is free from faults.
Cost-benefit analysis: The process of deciding whether to do
something by evaluating the costs of doing it and the benefits of doing
it.
COTS: Acronym for Commercial off-the-shelf software.
Coverage: The degree to which a software feature or characteristic is
tested or analyzed. A measure of test completeness. The degree,
expressed as a percentage, to which a specified coverage item has
been exercised by a test suite.
Coverage analysis: Measurement of achieved coverage to a specified
coverage item during test execution referring to predetermined criteria
to determine whether additional testing is required and, if so, which
test cases are needed.
Coverage item: An entity or property used as a basis for test
coverage, e.g., equivalence partitions or code statements.
Coverage tool: A tool that provides objective measures of what
structural elements, e.g., statements and branches have been exercised
by the test suite.
Critical path method (CPM): A method that shows the analysis of
paths in an activity graph among different milestones of the project.
Custom software: Software that is developed to meet the specific
needs of a particular customer. See bespoke software.
Cyclomatic complexity: The number of independent paths through a
program. Cyclomatic complexity is defined as L - N + 2P, where
L = the number of edges/links in the graph, N = the number of nodes in
the graph, and P = the number of disconnected parts of the graph (e.g., a
calling graph and a subroutine). [After McCabe]
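For example, the control flow graph of a single if-then-else statement
has L = 4 edges, N = 4 nodes, and P = 1 connected part, giving a
cyclomatic complexity of 4 - 4 + 2(1) = 2, i.e., two independent paths
through the code.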
Cyclomatic number: See cyclomatic complexity.
Data definition: An executable statement where a variable is assigned
a value.
Data driven testing: A scripting technique that stores test input and
expected results in a table or spreadsheet, so that a single control script
can execute all of the tests in the table. Data driven testing is often
used to support the application of test execution tools such as
capture/playback tools. [Fewster and Graham] See also keyword
driven testing.
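A minimal illustration of the idea (the table contents and the evaluate() stand-in for the system under test are hypothetical): a single control script iterates over rows of input/expected pairs.

public class DataDrivenExample {
    public static void main(String[] args) {
        // Each row: { input, expected result } -- in practice read from a table or spreadsheet.
        String[][] table = {
            {"2+2", "4"},
            {"10-3", "7"},
        };
        for (String[] row : table) {
            String actual = evaluate(row[0]);   // call to the system under test (stand-in below)
            System.out.println(row[0] + " -> " + actual
                    + (actual.equals(row[1]) ? "  PASS" : "  FAIL, expected " + row[1]));
        }
    }

    // Stand-in for the system under test: evaluates a simple "a+b" or "a-b" expression.
    static String evaluate(String expr) {
        String[] parts = expr.split("[+-]");
        int a = Integer.parseInt(parts[0]), b = Integer.parseInt(parts[1]);
        return String.valueOf(expr.contains("+") ? a + b : a - b);
    }
}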
Data flow: An abstract representation of the sequence and possible
changes of the state of data objects, where the state of an object is a
creation, usage, or destruction. [Beizer]
Data flow analysis: A form of static analysis based on the definitions
and usage of variables.
Data flow coverage: The percentage of definition-use pairs that have
been exercised by a test case suite.
Data flow test: A white-box test design technique in which test cases
are designed to execute definitions and use pairs of variables.
Dead code: See unreachable code.
Debugger: See debugging tool.
Debugging: The process of finding, analyzing, and removing the
causes of failures in software.
Debugging tool: A tool used by programmers to reproduce failures,
investigate the state of programs, and find the corresponding defect.
Debuggers enable programmers to execute programs step by step to
halt a program at any program statement and to set and examine
program variables.
Decision: A program point at which the control flow has two or more
alternative routes. A node with two or more links to separate branches.
Decision condition coverage: The percentage of all condition
outcomes and decision outcomes that have been exercised by a test
suite. 100% decision condition coverage implies both 100% condition
coverage and 100% decision coverage.
Decision condition testing: A white-box test design technique in
which test cases are designed to execute condition outcomes and
decision outcomes.
Decision coverage: The percentage of decision outcomes that have
been exercised by a test suite. 100% decision coverage implies both
100% branch coverage and 100% statement coverage.
Decision outcome: The result of a decision (which therefore
determines the branches to be taken).
Decision table: A table showing combinations of inputs and/or stimuli
(causes) with their associated outputs and/or actions (effects) which
can be used to design test cases. It lists various decision variables, the
conditions assumed by each of the decision variables, and the actions
to take in each combination of conditions.
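For example (an illustrative table, not taken from any specification), a
decision table for a simple login check might be:

    Condition             Rule 1   Rule 2   Rule 3
    Valid user ID?          Y        Y        N
    Correct password?       Y        N        -
    Action
    Grant access            X
    Show error message               X        X

Each column (rule) then yields at least one test case.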
Decision table testing: A black-box test design technique in which
test cases are designed to execute the combinations of inputs and/or
stimuli (causes) shown in a decision table. [Veenendaal]
Decision testing: A white-box test design technique in which test
cases are designed to execute decision outcomes.
Defect: A flaw in a component or system that can cause the
component or system to fail to perform its required function, e.g., an
incorrect statement or data definition. A defect, if encountered during
execution, may cause a failure of the component or system.
Defect bash: Ad hoc testing done by people performing different roles
in the same time duration during the integration testing phase.
Defect density: The number of defects identified in a component or
system divided by the size of the component or system (expressed in
standard measurement terms, e.g., lines of code, number of classes, or
function points).
Defect detection percentage (DDP): The number of defects found by
a test phase, divided by the number found by that test phase and any
other means afterwards.
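For example, if system testing finds 90 defects and 10 further defects
are found afterwards (in acceptance testing or in the field), the DDP of
system testing is 90 / (90 + 10) = 90%.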
Defect management: The process of recognizing, investigating,
taking action, and disposing of defects. It involves recording defects,
classifying them, and identifying the impact. [After IEEE 1044]
Defect management tool: See incident management tool.
Defect masking: An occurrence in which one defect prevents the
detection of another. [After IEEE 610]
Defect report: A document reporting on any flaw in a component or
system that can cause the component or system to fail to perform its
required function. [After IEEE 829]
Defect tracking tool: See incident management tool.
Definition-use pair: The association of the definition of a variable
with a use of that variable. Variable uses include computational use
(e.g., multiplication) and predicate use (directing the execution of a
path).
Deliverable: Any (work) product that must be delivered to someone
other than the (work) product’s author.
Delphi technique: Several participants make their individual cost
estimates and then share them.
Design-based testing: An approach to testing in which test cases are
designed based on the architecture and/or detailed design of a
component or system (e.g., tests of interfaces between components or
systems).
Desk checking: Testing of software or specification by manual
simulation of its execution. A manual analysis of a work product to
discover errors.
Development testing: Formal or informal testing conducted during
the implementation of a component or system, usually in the
development environment by developers. [After IEEE 610]
Deviation: See incident.
Deviation report: See incident report.
Dirty testing: See negative testing.
Divide-and-conquer (or Decomposition): The process of dividing
something large into smaller units (called modules) so as to simplify
our tasks. It is applicable to cost estimation, design, and testing.
Documentation testing: To ensure that the documentation is
consistent with the product. Testing the quality of the documentation,
e.g., user guide or installation guide.
Domain: The set from which valid input and/or output values can be
selected.
Domain testing: The execution of test cases developed by analysis of
input data relationships and constraints. Domain testing exploits the
tester’s domain knowledge to test the suitability of the product to what
the users do on a typical day.
Driver: In bottom-up integration, we start with the leaves of the
decomposition tree and test them with specially coded drivers. Less
throw-away code exists in drivers than there is in stubs. A software
component or test tool that replaces a component that takes care of the
control and/or the calling of a component or system. [After TMap]
Duplex testing: A test bed where the test driver runs on a computer
system separate from the computer system where the system-under-
test (SUT) runs.
Dynamic analysis: The process of evaluating behavior, e.g., memory
performance, CPU usage, of a system, or component during execution.
[After IEEE 610]
Dynamic comparison: Comparison of actual and expected results,
performed while the software is being executed, for example, by a test
execution tool.
Dynamic testing: Testing that involves the execution of the software
of a component or system.
Efficiency: The capability of the software product to provide
appropriate performance, relative to the amount of resources used
under stated conditions. [ISO 9126]
Efficiency testing: The process of testing to determine the efficiency
of a software product.
Elementary comparison testing: A black-box test design technique
in which test cases are designed to execute combinations of inputs
using the concept of condition determination coverage. [TMap]
Embedded software: Software that runs in specific hardware devices;
it is embedded in ROM.
Emulator: A device, computer program, or system that accepts the
same inputs and produces the same outputs as a given system. [IEEE
610] See also simulator.
Encapsulation: The wrapping up of data and function into a single
unit.
Entry criteria: The set of generic and specific conditions for
permitting a process to go forward with a defined task, e.g., test phase.
The purpose of entry criteria is to prevent a task from starting which
would entail more (wasted) effort compared to the effort needed to
remove the failed entry criteria. [Gilb and Graham]
Entry point: The first executable statement within a component.
Equivalence class: See equivalence partition.
Equivalence partition: A portion of an input or output domain for
which the behavior of a component or system is assumed to be the
same, based on the specification.
Equivalence partition coverage: The percentage of equivalence
partitions that have been exercised by a test suite.
Equivalence partitioning: A black-box test design technique in
which test cases are designed to execute representatives from
equivalence partitions. In principle, test cases are designed to cover
each partition at least once.
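For example (illustrative), if an input accepts integers from 1 to 100,
three partitions can be identified: values below 1, values from 1 to 100,
and values above 100. One representative from each partition, say -5,
50, and 150, is then chosen as a test case.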
Error: A human action that produces an incorrect result. [After IEEE
610]
Error guessing: A test design technique where the experience of the
tester is used to anticipate what defects might be present in the
component or system-under-test as a result of errors made, and to
design tests specifically to expose them.
Error seeding: It determines whether a set of test cases is adequate by
inserting known error types into the program and executing it with test
cases. The process of intentionally adding known defects to those
already in the component or system for the purpose of monitoring the
rate of detection and removal, and estimating the number of remaining
defects. [IEEE 610]
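For example (illustrative figures), if 10 known defects are seeded and
testing finds 8 of them together with 40 unseeded defects, the 80%
detection rate suggests roughly 40 / 0.8 = 50 real defects in total, i.e.,
about 10 real defects still remaining.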
Error tolerance: The ability of a system or component to continue
normal operation despite the presence of erroneous inputs. [After
IEEE 610]
Evaluation: See testing.
Event: Something that causes a system or object to change state.
Exception handling: Behavior of a component or system in response
to erroneous input, from either a human user or from another
component or system, or to an internal failure.
Executable statement: A statement which, when compiled, is
translated into object code, and which will be executed procedurally
when the program is running and may perform an action on data.
Exercised: A program element is said to be exercised by a test case
when the input value causes the execution of that element, such as a
statement, decision, or other structural element.
Exhaustive testing: A test approach in which the test suite comprises
all combinations of input values and preconditions.
Exit criteria: The set of generic and specific conditions, agreed upon
with the stakeholders, for permitting a process to be officially
completed. The purpose of exit criteria is to prevent a task from being
considered completed when there are still outstanding parts of the task
which have not been finished. Exit criteria are used by testing to report
against and to plan when to stop testing. [After Gilb and Graham]
Exit point: The last executable statement within a component.
Expected outcome: See expected result.
Expected result: The behavior predicted by the specification, or
another source, of the component or system under specified
conditions.
Exploratory testing: Testing where the tester actively controls the
design of the tests as those tests are performed and uses information
gained while testing to design new and better tests. [Bach]
Fail: A test is deemed to fail if its actual result does not match its
expected result.
Failure: Actual deviation of the component or system from its
expected delivery, service, or result. [After Fenton]
Failure mode: The physical or functional manifestation of a failure.
For example, a system in failure mode may be characterized by slow
operation, incorrect outputs, or complete termination of execution.
Failure mode and effect analysis (FMEA): A systematic approach to
risk identification and analysis of identifying possible modes of failure
and attempting to prevent their occurrence.
Failure rate: The ratio of the number of failures of a given category
to a given unit of measure, e.g., failures per unit of time, failures per
number of transactions, failures per number of computer runs. [IEEE
610]
Fault: See defect.
Fault density: See defect density.
Fault detection percentage (FDP): See Defect detection percentage
(DDP).
Fault masking: See defect masking.
Fault spawning: The introduction of new faults when a fault is
removed.
Fault tolerance: The capability of the software product to maintain a
specified level of performance in cases of software faults (defects) or
of infringement of its specified interface. [ISO 9126] See also
reliability.
Fault tree analysis: A method used to analyze the causes of faults
(defects).
Feasible path: A path for which a set of input values and
preconditions exists which causes it to be executed.
Feature: An attribute of a component or system specified or implied
by requirements documentation (for example, reliability, usability, or
design constraints). [After IEEE 1008]
Field testing: See beta testing.
Finite state machine: A computational model consisting of a finite
number of states and transitions between those states, possibly with
accompanying actions. [IEEE 610]
Finite state testing: See state transition testing.
Fork: A symbol in an activity diagram to show splitting of control
into multiple threads.
Formal language: A language that uses mathematics for the purpose
of modelling.
Formal review: A review characterized by documented procedures
and requirements, e.g., inspection. Meetings are carefully planned,
reviewers have defined responsibilities, and review reports are
generated and acted upon.
Formal testing: Testing conducted in accordance with test plans and
procedures that have been reviewed and approved by a customer, user,
or designated level of management.
Frozen test basis: A test basis document that can only be amended by
a formal change control process. See also baseline.
Function point analysis (FPA): Method aiming to measure the size
of the functionality of an information system. The measurement is
independent of the technology. This measurement may be used as a
basis for the measurement of productivity, the estimation of the
needed resources, and project control.
Functional integration: An integration approach that combines the
components or systems for the purpose of getting a basic functionality
working early. See also integration testing.
Functional requirement: A requirement that specifies a function that
a component or system must perform. [IEEE 610]
Functional test design technique: Documented procedure to derive
and select test cases based on an analysis of the specification of the
functionality of a component or system without reference to its
internal structure. See also black-box test design technique.
Functional testing: Testing based on an analysis of the specification
of the functionality of a component or system. See also black-box
testing.
Functionality: The capability of the software product to provide
functions which meet stated and implied needs when the software is
used under specified conditions. [ISO 9126]
Functionality testing: The process of testing to determine the
functionality of a software product.
Gantt chart: A diagram used to graphically represent the start and
end dates of each software engineering task. These charts can be
drawn using MS-Project as a CASE tool.
Generalization: A form of abstraction that specifies that two or
more entities sharing common attributes can be generalized into a
higher-level entity type called a supertype or generic entity.
Generic software: Software that performs functions on general-purpose
computers.
Glass-box testing: See white-box testing.
Glue code: Code that is written to connect reused commercial off-the-
shelf applications.
Gold plating: Building a list of requirements that does more than
needed.
Gray-box testing: A combined (or hybrid) approach of white-box and
black-box techniques. Gray-box testing may involve 95% white-box
testing strategies and 5% black-box, or vice versa.
Guard condition: A condition that determines whether a certain
transition will occur in a state diagram when an event happens.
Heuristic evaluation: A static usability test technique to determine
the compliance of a user interface with recognized usability principles
(the so-called “heuristics”).
High-level test case: A test case without concrete (implementation
level) values for input data and expected results.
Horizontal traceability: The tracing of requirements for a test level
through the layers of test documentation (e.g., test plan, test design
specification, test case specification, and test procedure specification).
Impact analysis: The assessment of change to the layers of
development documentation, test documentation, and components in
order to implement a given change to specified requirements.
Incident: Any event occurring during testing that requires
investigation. [After IEEE 1008] It is the symptom associated with a
failure that alerts the user to the occurrence of a failure.
Incident management: The process of recognizing, investigating,
taking action, and disposing of incidents. It involves recording
incidents, classifying them, and identifying the impact. [After IEEE
1044]
Incident management tool: A tool that facilitates the recording and
status tracking of incidents found during testing. They often have
workflow-oriented facilities to track and control the allocation,
correction, and re-testing of incidents and provide reporting facilities.
Incident report: A document reporting on any event that occurs
during the testing which requires investigation. [After IEEE 829]
Incremental development model: A development life cycle where a
project is broken into a series of increments, each of which delivers a
portion of the functionality in the overall project requirements. The
requirements are prioritized and delivered in priority order in the
appropriate increment. In some (but not all) versions of this life cycle
model, each subproject follows a “mini-model” with its own design,
coding, and testing phases.
Incremental testing: Testing where components or systems are
integrated and tested one or some at a time until all of the components
or systems are integrated and tested.
Independence: Separation of responsibilities, which encourages the
accomplishment of objective testing. [After DO-178b]
Infeasible path: A path that cannot be exercised by any set of
possible input values.
Informal review: A review not based on a formal (documented)
procedure. There are no planned meetings, reviewers have no assigned
responsibilities, and they do not produce review reports.
Informal testing: Testing conducted in accordance with test plans and
procedures that have not been reviewed and approved by a customer,
user, or designated level of management.
Input: A variable (whether stored within a component or outside) that
is read by a component.
Input domain: The set from which valid input values can be selected.
See also domain.
Input value: An instance of an input. See also input.
Inspection: A type of review that relies on visual examination of
documents to detect defects, e.g., violations of development standards
and non-conformance to higher level documentation. The most formal
review technique and therefore always based on a documented
procedure. [After IEEE 610, IEEE 1028]
Inspection leader: See moderator.
Inspector: See reviewer.
Installability: The capability of the software product to be installed in
a specified environment. [ISO 9126] See also portability.
Installability testing: The process of testing the installability of a
software product. See also portability testing.
Installation guide: Supplied instructions on any suitable media which
guides the installer through the installation process. This may be a
manual guide, step-by-step procedure, installation wizard, or any other
similar process description.
Installation wizard: Supplied software on any suitable media which
leads the installer through the installation process. It normally runs the
installation process, provides feedback on installation results, and
prompts for options.
Instrumentation: The insertion of additional code into the program in
order to collect information about program behavior during execution.
Instrumenter: A software tool used to carry out instrumentation.
Intake test: A special instance of a smoke test to decide if the
component or system is ready for detailed and further testing. An
intake test is typically carried out at the start of the test execution
phase.
Integration: The process of combining components or systems into
larger assemblies.
Integration testing: Testing performed to expose defects in the
interfaces and in the interactions between integrated components or
systems. See also component integration testing and system
integration testing.
Integration testing in the large: See system integration testing.
Integration testing in the small: See component integration testing.
Interaction diagram: A sequence diagram or collaboration diagram
used to model the dynamic aspects of software.
Interface testing: An integration test type that is concerned with
testing the interfaces between components or systems.
Interoperability: The effort required to couple one system to another.
The capability of the software product to interact with one or more
specified components or systems. [After ISO 9126] See also
functionality.
Interoperability testing: The process of testing to determine the
interoperability of a software product. See also functionality testing.
Invalid testing: Testing using input values that should be rejected by
the component or system. See also error tolerance.
Isolation testing: Testing of individual components in isolation from
surrounding components, with surrounding components being
simulated by stubs and drivers, if needed.
Item transmittal report: See release note.
Key performance indicator: See performance indicator.
Keyword driven testing: A scripting technique that uses data files to
contain not only test data and expected results, but also keywords
related to the application being tested. The keywords are interpreted
by special supporting scripts that are called by the control script for
the test. See also data driven testing.
LCSAJ: A linear code sequence and jump, consisting of the following
three items (conventionally identified by line numbers in a source
code listing): the start of the linear sequence of executable statements,
the end of the linear sequence, and the target line to which control
flow is transferred at the end of the linear sequence.
LCSAJ coverage: The percentage of LCSAJs of a component that
have been exercised by a test suite. 100% LCSAJ coverage implies
100% decision coverage.
LCSAJ testing: A white-box test design technique in which test cases
are designed to execute LCSAJs.
Learnability: The capability of the software product to enable the
user to learn its application. [ISO 9126] See also usability.
Limited entry decision tables: Decision tables in which all
conditions are binary.
Link testing: See component integration testing.
Load test: A test type concerned with measuring the behavior of a
component or system with increasing load, e.g., number of parallel
users and/or numbers of transactions to determine what load can be
handled by the component or system.
Locale: An environment where the language, culture, laws, currency,
and many other factors may be different.
Locale testing: It focuses on testing the conventions for number,
punctuation, date and time, and currency formats.
Logic-coverage testing: See white-box testing. [Myers]
Logic-driven testing: See white-box testing.
Logical test case: See high-level test case.
Low-level test case: A test case with concrete (implementation level)
values for input data and expected results.
Maintainability: The ease with which a software product can be
modified to correct defects, modified to meet new requirements,
modified to make future maintenance easier, or adapted to a changed
environment. [ISO 9126]
Maintainability testing: The process of testing to determine the
maintainability of a software product.
Maintenance: Modification of a software product after delivery to
correct defects, to improve performance or other attributes, or to adapt
the product to a modified environment. [IEEE 1219]
Maintenance testing: Testing the changes to an operational system or
the impact of a changed environment to an operational system.
Management review: A systematic evaluation of software
acquisition, supply, development, operation, or maintenance process,
performed by or on behalf of management that monitors progress,
determines the status of plans and schedules, confirms requirements
and their system allocation, or evaluates the effectiveness of
management approaches to achieve fitness for purpose. [After IEEE
610, IEEE 1028]
Mandel bug: A bug whose underlying causes are so complex and
obscure as to make its behavior appear chaotic or even non-
deterministic.
Master test plan: See project test plan.
Maturity: (1) The capability of an organization with respect to the
effectiveness and efficiency of its processes and work practices. See
also capability maturity model and test maturity model. (2) The
capability of the software product to avoid failure as a result of defects
in the software. [ISO 9126] See also reliability.
Measure: The number or category assigned to an attribute of an entity
by making a measurement. [ISO 14598]
Measurement: The process of assigning a number or category to an
entity to describe an attribute of that entity. [ISO 14598]
Measurement scale: A scale that constrains the type of data analysis
that can be performed on it. [ISO 14598]
Memory leak: A situation in which a program requests memory but
does not release it when it is no longer needed. A defect in a
program’s dynamic store allocation logic that causes it to fail to
reclaim memory after it has finished using it, eventually causing the
program to fail due to lack of memory.
Message: A programming language mechanism by which one unit
transfers control to another unit.
Messages: They show how objects communicate. Each message
represents one object making a function call to another.
Metric: A measurement scale and the method used for measurement.
[ISO 14598]
Migration testing: See conversion testing.
Milestone: A point in time in a project at which defined (intermediate)
deliverables and results should be ready.
Mistake: See error.
Moderator: The leader and main person responsible for an inspection
or other review process.
Modified condition decision coverage: See condition determination
coverage.
Modified condition decision testing: See condition determination
coverage testing.
Modified multiple condition coverage: See condition determination
coverage.
Modified multiple condition testing: See condition determination
coverage testing.
Module: Modules are parts, components, units, or areas that comprise
a given project. They are often thought of as units of software code.
See also component.
Module testing: See component testing.
Monitor: A software tool or hardware device that runs concurrently
with the component or system under test and supervises, records,
and/or analyzes the behavior of the component or system. [After IEEE
610]
Monkey testing: Randomly testing the product after all planned test
cases are done.
Multiple condition: See compound condition.
Multiple condition coverage: The percentage of combinations of all
single condition outcomes within one statement that have been
exercised by a test suite. 100% multiple condition coverage implies
100% condition determination coverage.
Multiple condition testing: A white-box test design technique in
which test cases are designed to execute combinations of single
condition outcomes (within one statement).
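For example, for the compound condition “A>B AND C>1000”:
condition coverage requires each of A>B and C>1000 to take both the
values True and False (achievable with two test cases), decision
coverage requires the whole expression to evaluate to both True and
False, and multiple condition coverage requires all four True/False
combinations of the two single conditions.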
Multiplicity: Information placed at each end of an association
indicating how many instances of one class can be related to instances
of the other class.
Mutation analysis: A method to determine test suite thoroughness by
measuring the extent to which a test suite can discriminate the
program from slight variants (mutants) of the program.
N-switch coverage: The percentage of sequences of N+1 transitions
that have been exercised by a test suite. [Chow]
N-switch testing: A form of state transition testing in which test cases
are designed to execute all valid sequences of N+1 transitions. [Chow]
See also state transition testing.
Negative functional testing: Testing the software with invalid inputs.
Negative testing: Tests aimed at showing that a component or system
does not work. Negative testing is related to the testers’ attitude rather
than a specific test approach or test design technique. [After Beizer]
Nonconformity: Non-fulfillment of a specified requirement. [ISO
9000]
Nonfunctional requirement: A requirement that does not relate to functionality, but to attributes such as reliability, efficiency, usability, maintainability, and portability.
Nonfunctional testing: Testing the attributes of a component or
system that do not relate to functionality, e.g., reliability, efficiency,
usability, maintainability, and portability.
Nonfunctional test design techniques: Methods used to design or
select tests for nonfunctional testing.
Non-reentrant code: A piece of program code that modifies itself.
Object: It encapsulates information and behavior. It represents a real-
world thing, e.g., ATM screen, card reader, etc.
Off point: A value outside of a domain.
Off-the-shelf software: A software product that is developed for the
general market, i.e., for a large number of customers, and that is
delivered to many customers in identical format.
Operability: The capability of the software product to enable the user
to operate and control it. [ISO 9126] See also usability.
Operational environment: Hardware and software products installed
at users’ or customers’ sites where the component or system-under-test
will be used. The software may include operating systems, database
management systems, and other applications.
Operational profile testing: Statistical testing using a model of
system operations (short duration tasks) and their probability of typical
use. [Musa]
Operational testing: Testing conducted to evaluate a component or
system in its operational environment. [IEEE 610]
Oracle: See test oracle.
Outcome: See result.
Output: A variable (whether stored within a component or outside)
that is written by a component.
Output domain: The set from which valid output values can be
selected. See also domain.
Output value: An instance of an output. See also output.
Pair programming: A software development approach whereby lines
of code (production and/or test) of a component are written by two
programmers sitting at a single computer. This implicitly means
ongoing real-time code reviews are performed.
Pair testing: Two testers work together to find defects. Typically,
they share one computer and trade control of it while testing.
Pareto Principle: A rule that states that 80% of the benefit can be
obtained with 20% of the work. For example, 80% of CPU time is
spent executing 20% of the statements.
Partition testing: See equivalence partitioning. [Beizer]
Pass: A test is deemed to pass if its actual result matches its expected
result.
Pass/fail criteria: Decision rules used to determine whether a test
item (function) or feature has passed or failed a test. [IEEE 829]
Path: A sequence of events, e.g., executable statements, of a
component or system from an entry point to an exit point.
Path coverage: The percentage of paths that have been exercised by a
test suite. 100% path coverage implies 100% LCSAJ coverage.
Path sensitizing: Choosing a set of input values to force the execution
of a given path.
Path testing: A white-box test design technique in which test cases
are designed to execute paths.
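A minimal illustrative sketch in Python (not taken from the book’s worked examples): the single decision creates two entry-to-exit paths, so one test case per path exercises every path through this fragment.

def classify(x):
    if x < 0:                    # decision: two paths through the function
        label = "negative"
    else:
        label = "non-negative"
    return label

# One test case per path:
assert classify(-5) == "negative"
assert classify(3) == "non-negative"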
Peer review: See technical review.
Performance: The degree to which a system or component
accomplishes its designated functions within given constraints
regarding processing time and throughput rate. [After IEEE 610] See
efficiency.
Performance indicator: A high level metric of effectiveness and/or
efficiency used to guide and control progressive development, e.g.,
defect detection percentage (DDP) for testing. [CMMI]
Performance testing: The process of testing to determine the
performance of a software product. See efficiency testing.
Performance testing tool: A tool to support performance testing and
that usually has two main facilities: load generation and test
transaction measurement. Load generation can simulate either multiple
users or high volumes of input data. During execution, response time
measurements are taken from selected transactions and these are
logged. Performance testing tools normally provide reports based on
test logs and graphs of load against response times.
Person-month: One person-month is the amount of work done by one
person in one month if they are working full time.
Phase test plan: A test plan that typically addresses one test level.
Polymorphism (poly = many, morphs = forms): The ability of the same function to take many forms. The compiler resolves such calls based on either the number of parameters or their data types.
Portability: The ease with which the software product can be
transferred from one hardware or software environment to another.
[ISO 9126]
Portability testing: The process of testing to determine the portability
of a software product.
Positive functional testing: Testing the software with valid inputs.
Post-execution comparison: Comparison of actual and expected
results, performed after the software has finished running.
Postcondition: Environmental and state conditions that must be
fulfilled after the execution of a test or test procedure.
Precondition: Environmental and state conditions that must be
fulfilled before the component or system can be executed with a
particular test or test procedure.
Predicted outcome: See expected result.
Pretest: See intake test.
Priority: The level of (business) importance assigned to an item, e.g.,
defect.
Problem: See defect.
Problem management: See defect management.
Problem report: See defect report.
Process: A set of interrelated activities which transform inputs into
outputs. [ISO 12207]
Process cycle test: A black-box test design technique in which test
cases are designed to execute business procedures and processes.
[TMap]
Production environment: It refers to the environment in which the
final software will run.
Program instrumenter: See instrumenter.
Program testing: See component testing.
Project: A project is a unique set of coordinated and controlled activities with start and finish dates, undertaken to achieve an objective conforming to specific requirements, including the constraints of time, cost, and resources. [ISO 9000]
Project Evaluation Review Technique (PERT): It shows different
project task activities and their relationship with each other.
Project test plan: A test plan that typically addresses multiple test
levels.
Proof of correctness: A formal technique to prove mathematically
that a program satisfies its specifications.
Prototyping: It helps to examine the probable results of implementing
software requirements.
Pseudo-random: A series which appears to be random but is in fact
generated according to some prearranged sequence.
Quality: The degree to which a component, system, or process meets
specified requirements and/or user/customer needs and expectations.
[After IEEE 610]
Quality assurance: Part of quality management focused on providing
confidence that quality requirements will be fulfilled. [ISO 9000]
Quality attribute: A feature or characteristic that affects an item’s
quality. [IEEE 610]
Quality characteristic: See quality attribute.
Quality management: Coordinated activities to direct and control an
organization with regard to quality. Direction and control with regard
to quality generally includes the establishment of the quality policy
and quality objectives, quality planning, quality control, quality
assurance, and quality improvement. [ISO 9000]
Random testing: A black-box test design technique where test cases
are selected, possibly using a pseudo-random generation algorithm, to
match an operational profile. This technique can be used for testing
nonfunctional attributes such as reliability and performance.
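A small Python sketch of pseudo-random test case generation (the unit under test and the input range are invented for illustration); fixing the seed makes the “random” sequence reproducible, so a failing input can be replayed.

import random

def abs_value(x):
    # Unit under test (illustrative).
    return -x if x < 0 else x

random.seed(42)                  # prearranged sequence: reproducible runs
for _ in range(100):
    x = random.randint(-1000, 1000)
    assert abs_value(x) == abs(x), "failed for input %d" % x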
Rapid (or throwaway) prototyping: An approach in which a “quick and dirty” partial solution to the system is constructed prior to the requirements stage.
Real-time software: Software that operates under strict time constraints.
Recorder: See scribe.
Record/playback tool: See capture/playback tool.
Recoverability: The capability of the software product to re-establish
a specified level of performance and recover the data directly affected
in case of failure. [ISO 9126] See also reliability.
Recoverability testing: The process of testing to determine the
recoverability of a software product. See also reliability testing.
Recovery testing: See recoverability testing.
Reentrant code: A piece of program code that does not modify itself.
Regression testing: The process of retesting a system after changes
have been made to it. Testing of a previously tested program following
modification to ensure that defects have not been introduced or
uncovered in unchanged areas of the software, as a result of the
changes made. It is performed when the software or its environment is
changed.
Release (or golden master): The build that will eventually be shipped
to the customer, posted on the Web, or migrated to the live Web site.
Release note: A document identifying test items, their configuration,
current status, and other delivery information delivered by
development to testing, and possibly other stakeholders, at the start of
a test execution phase. [After IEEE 829]
Reliability: Probability of failure-free operation of software for a
specified time under specified operating conditions. The ability of the
software product to perform its required functions under stated
conditions for a specified period of time, or for a specified number of
operations. [ISO 9126]
Reliability testing: The process of testing to determine the reliability
of a software product.
Replaceability: The capability of the software product to be used in
place of another specified software product for the same purpose in the
same environment. [ISO 9126] See also portability.
Requirement: A condition or capability needed by a user to solve a
problem or achieve an objective that must be met or possessed by a
system or system component to satisfy a contract, standard,
specification, or other formally imposed document. [After IEEE 610]
Requirements-based testing: An approach to testing in which test
cases are designed based on test objectives and test conditions derived
from requirements, e.g., tests that exercise specific functions or probe
nonfunctional attributes such as reliability or usability.
Requirements management tool: A tool that supports the recording
of requirements, requirements attributes (e.g., priority, knowledge
responsible) and annotation, and facilitates traceability through layers
of requirements and requirements change management. Some
requirements management tools also provide facilities for static
analysis, such as consistency checking and violations to pre-defined
requirements rules.
Requirements phase: The period of time in the software life cycle
during which the requirements for a software product are defined and
documented. [IEEE 610]
Requirements tracing: It is a technique of ensuring that the product,
as well as the testing of the product, addresses each of its
requirements.
Resource utilization: The capability of the software product to use
appropriate amounts and types of resources, for example, the amounts
of main and secondary memory used by the program and the sizes of
required temporary or overflow files, when the software performs its
function under stated conditions. [After ISO 9126] See also efficiency.
Resource utilization testing: The process of testing to determine the
resource utilization of a software product.
Result: The consequence/outcome of the execution of a test. It
includes outputs to screens, changes to data, reports, and
communication messages sent out. See also actual result and expected
result.
Resumption criteria: The testing activities that must be repeated
when testing is restarted after a suspension. [After IEEE 829]
Retesting: Testing that runs test cases that failed the last time they
were run, in order to verify the success of corrective actions.
Review: An evaluation of a product or project status to ascertain
discrepancies from planned results and to recommend improvements.
Examples include management review, informal review, technical
review, inspection, and walkthrough. [After IEEE 1028]
Reviewer: The person involved in the review who shall identify and
describe anomalies in the product or project under review. Reviewers
can be chosen to represent different viewpoints and roles in the review
process.
Risk: A factor that could result in future negative consequences;
usually expressed as impact and likelihood. The possibility of loss or
injury; a problem that might occur.
Risk analysis: The process of assessing identified risks to estimate
their impact and probability of occurrence (likelihood).
Risk-based testing: Testing oriented towards exploring and providing
information about product risks. [After Gerrard]
Risk control: The process through which decisions are reached and
protective measures are implemented for reducing risks to, or
maintaining risks within, specified levels.
Risk identification: The process of identifying risks using techniques
such as brainstorming, checklists, and failure history.
Risk management: Systematic application of procedures and
practices to the tasks of identifying, analyzing, prioritizing, and
controlling risk.
Risk mitigation: See risk control.
Robustness: The degree to which a component or system can function
correctly in the presence of invalid inputs or stressful environmental
conditions. [IEEE 610] See also error tolerance and fault-tolerance.
Root cause: An underlying factor that caused a nonconformance and
possibly should be permanently eliminated through process
improvement.
Root cause analysis: The process of determining the ultimate reason
why a software engineer made the error that introduced a defect.
Safety: The capability of the software product to achieve acceptable
levels of risk of harm to people, business, software, property, or the
environment in a specified context of use. [ISO 9126]
Safety testing: The process of testing to determine the safety of a
software product.
Sanity test: See smoke test.
Scalability: The capability of the software product to be upgraded to
accommodate increased loads. [After Gerrard]
Scalability testing: Testing to determine the scalability of the
software product.
Scenario testing: See use-case testing.
Scribe: The person who has to record each defect mentioned and any
suggestions for improvement during a review meeting on a logging
form. The scribe has to make sure that the logging form is readable
and understandable.
Scripting language: A programming language in which executable
test scripts are written, used by a test execution tool (e.g., a
capture/replay tool).
Security: Attributes of software products that bear on their ability to
prevent unauthorized access, whether accidental or deliberate, to
programs and data. [ISO 9126]
Security testing: Testing to determine the security of the software
product.
Serviceability testing: See maintainability testing.
Severity: The degree of impact that a defect has on the development
or operation of a component or system. [After IEEE 610]
Shelfware: Software that is not used.
Simulation: A technique that uses an executable model to examine
the behavior of the software. The representation of selected behavioral
characteristics of one physical or abstract system by another system.
[ISO 2382/1]
Simulator: A device, computer program, or system used during
testing, which behaves or operates like a given system when provided
with a set of controlled inputs. [After IEEE 610, DO178b] See also
emulator.
Sink node: It is a statement fragment at which program execution
terminates.
Slicing: It is a program decomposition technique used to trace an
output variable back through the code to identify all code statements
relevant to a computation in the program.
Smoke test: It is a condensed version of a regression test suite. A subset of all defined/planned test cases that cover the main functionality of a component or system, used to ascertain that the most crucial functions of a program work, without bothering with finer details. A daily build and smoke test are among industry best practices. See also intake test.
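An illustrative, self-contained sketch of a condensed suite using Python’s unittest module (the order-creation function stands in for the application’s core behavior): only the most crucial function is checked on every build, leaving finer details to the full regression suite.

import unittest

def create_order(item, qty):
    # Stand-in for the application's most crucial function (illustrative).
    if qty <= 0:
        raise ValueError("quantity must be positive")
    return {"item": item, "qty": qty, "status": "created"}

class SmokeTest(unittest.TestCase):
    def test_order_can_be_created(self):
        order = create_order("book", 1)
        self.assertEqual(order["status"], "created")

if __name__ == "__main__":
    unittest.main()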
Software feature: See feature.
Software quality: The totality of functionality and features of a
software product that bear on its ability to satisfy stated or implied
needs. [After ISO 9126]
Software quality characteristic: See quality attribute.
Software runaways: Large projects that failed due to a lack of systematic techniques and tools.
Software test incident: See incident.
Software test incident report: See incident report.
Software Usability Measurement Inventory (SUMI): A
questionnaire based usability test technique to evaluate the usability,
e.g., user-satisfaction, of a component or system. [Veenendaal]
Source node: A source node in a program is a statement fragment at
which program execution begins or resumes.
Source statement: See statement.
Specialization: The process of taking subsets of a higher-level entity
set to form lower-level entity sets.
Specification: A document that specifies, ideally in a complete,
precise, and verifiable manner, the requirements, design, behavior, or
other characteristics of a component or system, and, often, the
procedures for determining whether these provisions have been
satisfied. [After IEEE 610]
Specification-based test design technique: See black-box test design
technique.
Specification-based testing: See black-box testing.
Specified input: An input for which the specification predicts a result.
Stability: The capability of the software product to avoid unexpected
effects from modifications in the software. [ISO 9126] See also
maintainability.
Standard software: See off-the-shelf software.
Standards testing: See compliance testing.
State diagram: A diagram that depicts the states that a component or
system can assume, and shows the events or circumstances that cause
and/or result from a change from one state to another. [IEEE 610]
State table: A grid showing the resulting transitions for each state
combined with each possible event, showing both valid and invalid
transitions.
State transition: A transition between two states of a component or
system.
State transition testing: A black-box test design technique in which
test cases are designed to execute valid and invalid state transitions.
See also N-switch testing.
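An illustrative Python sketch using the classic turnstile machine (states, events, and names are invented for this example): one test case exercises a valid transition and another an invalid one.

TRANSITIONS = {
    ("locked", "coin"):   "unlocked",
    ("unlocked", "push"): "locked",
}

def step(state, event):
    if (state, event) not in TRANSITIONS:
        raise ValueError("invalid transition")
    return TRANSITIONS[(state, event)]

assert step("locked", "coin") == "unlocked"      # valid transition
try:
    step("locked", "push")                       # invalid transition
    raise AssertionError("invalid transition was accepted")
except ValueError:
    pass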
Statement: An entity in a programming language, which is typically
the smallest indivisible unit of execution.
Statement coverage: The percentage of executable statements that
have been exercised by a test suite.
Statement testing: A white-box test design technique in which test
cases are designed to execute statements.
Static analysis: Analysis of software artifacts, e.g., requirements or
code, carried out without execution of these software artifacts.
Static analyzer: A tool that carries out static analysis.
Static code analysis: Analysis of program source code carried out
without execution of that software.
Static code analyzer: A tool that carries out static code analysis. The
tool checks source code, for certain properties such as conformance to
coding standards, quality metrics, or data flow anomalies.
Static testing: Testing of a component or system at specification or
implementation level without execution of that software, e.g., reviews
or static code analysis.
Statistical testing: A test design technique in which a model of the
statistical distribution of the input is used to construct representative
test cases. See also operational profile testing.
Status accounting: An element of configuration management,
consisting of the recording and reporting of information needed to
manage a configuration effectively. This information includes a listing
of the approved configuration identification, the status of proposed
changes to the configuration, and the implementation status of the
approved changes. [IEEE 610]
Storage: See resource utilization.
Storage testing: See resource utilization testing.
Stress testing: Testing conducted to evaluate a system or component
at or beyond the limits of its specified requirements. [IEEE 610]
Structural coverage: Coverage measures based on the internal
structure of the component.
Structural test design technique: See white-box test design
technique.
Structural testing: See white-box testing.
Structured walkthrough: See walkthrough.
Stub: A skeletal or special-purpose implementation of a software
component, used to develop or test a component that calls or is
otherwise dependent on it. It replaces a called component. [After IEEE
610]
Stubs: These are the dummy modules or pieces of throw-away code
that emulate a called unit. Top-down integration begins with the main
program (root of the tree). Any lower-level unit that is called by the
main program appears as a stub.
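A minimal Python sketch (all names are invented for illustration): the unit under test calls a tariff lookup that is not yet implemented, so a stub supplies a canned answer in its place.

def tariff_lookup_stub(country_code):
    # Canned answer; the real lookup component does not exist yet.
    return 1

def shipping_cost(weight_kg, country_code, lookup=tariff_lookup_stub):
    return weight_kg * 5 + lookup(country_code) * weight_kg

assert shipping_cost(2, "IN") == 12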
Subpath: A sequence of executable statements within a component.
Suitability: The capability of the software product to provide an
appropriate set of functions for specified tasks and user objectives.
[ISO 9126] See also functionality.
Suspension criteria: The criteria used to (temporarily) stop all or a
portion of the testing activities on the test items. [After IEEE 829]
Syntax testing: A black-box test design technique in which test cases
are designed based upon the definition of the input domain and/or
output domain.
System: A collection of components organized to accomplish a
specific function or set of functions. [IEEE 610]
System integration testing: Testing the integration of systems and
packages; testing interfaces to external organizations (e.g., electronic
data interchange, Internet).
System testing: The process of testing an integrated system to verify
that it meets specified requirements. [Hetzel]
Technical review: A peer group discussion activity that focuses on
achieving consensus on the technical approach to be taken. A technical
review is also known as a peer review. [Gilb and Graham, IEEE 1028]
Technology transfer: The awareness, convincing, selling, motivating,
collaboration, and special effort required to encourage industry,
organizations, and projects to make good use of new technology
products.
Test: A test is the act of exercising software with test cases. A set of
one or more test cases. [IEEE 829]
Test approach: The implementation of the test strategy for a specific
project. It typically includes the decisions made that follow based on
the (test) project’s goal and the risk assessment carried out, starting
points regarding the test process, the test design techniques to be
applied, exit criteria, and test types to be performed.
Test automation: The use of software to perform or support test
activities, e.g., test management, test design, test execution, and results
checking.
Test basis: All documents from which the requirements of a
component or system can be inferred. The documentation on which
the test cases are based. If a document can be amended only by way of
formal amendment procedure, then the test basis is called a frozen test
basis. [After TMap]
Test bed: An environment containing the hardware, instrumentation,
simulators, software tools, and other support elements needed to
conduct a test. See also test environment.
Test case: A test that, ideally, executes a single well-defined test
objective, i.e., a specific behavior of a feature under a specific
condition. A set of input values, execution preconditions, expected
results and execution postconditions, developed for a particular
objective or test condition, such as to exercise a particular program
path or to verify compliance with a specific requirement. [After IEEE
610]
Test case design technique: See test design technique.
Test case specification: A document specifying a set of test cases
(objective, inputs, test actions, expected results, and execution
preconditions) for a test item. [After IEEE 829]
Test case suite: See test suite.
Test charter: A statement of test objectives, and possibly test ideas.
Test charters are used, among other places, in exploratory testing. See also
exploratory testing.
Test comparator: A test tool to perform automated test comparison.
Test comparison: The process of identifying differences between the
actual results produced by the component or system under test and the
expected results for a test. Test comparison can be performed during
test execution (dynamic comparison) or after test execution.
Test completion criterion: See exit criteria.
Test condition: An item or event of a component or system that could
be verified by one or more test cases, e.g., a function, transaction,
quality attribute, or structural element.
Test coverage: See coverage.
Test data: Data that exists (for example, in a database) before a test is
executed, and that affects or is affected by the component or system
under test.
Test data preparation tool: A type of test tool that enables data to be
selected from existing databases or created, generated, manipulated,
and edited for use in testing.
Test deliverables: List of test materials developed by the test group
during the test cycles that are to be delivered before the completion of
the project.
Test design: See test design specification.
Test design specification: A document specifying the test conditions
(coverage items) for a test item, the detailed test approach, and
identifying the associated high-level test cases. [After IEEE 829]
Test design technique: A method used to derive or select test cases.
Test design tool: A tool that supports the test design activity by
generating test inputs from a specification that may be held in a CASE
tool repository, e.g., requirements management tool, or from specified
test conditions held in the tool itself.
Test driver: It automates the execution of a test case. See also driver.
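A minimal driver sketch in Python, using a deliberately simplified stand-in for the triangle problem (validity checks omitted): the driver feeds each test case to the unit under test and compares the actual result with the expected one.

def triangle_type(a, b, c):
    # Simplified unit under test (illustrative only).
    if a == b == c:
        return "equilateral"
    if a == b or b == c or a == c:
        return "isosceles"
    return "scalene"

test_cases = [
    ((3, 3, 3), "equilateral"),
    ((3, 3, 4), "isosceles"),
    ((3, 4, 5), "scalene"),
]

for inputs, expected in test_cases:
    actual = triangle_type(*inputs)
    print("PASS" if actual == expected else "FAIL", inputs, actual)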
Test environment: An environment containing hardware,
instrumentation, simulators, software tools, and other support elements
needed to conduct a test. [After IEEE 610]
Test evaluation report: A document produced at the end of the test
process summarizing all testing activities and results. It also contains
an evaluation of the test process and lessons learned.
Test execution: The process of running a test on the component or system under test, producing actual result(s).
Test execution automation: The use of software, e.g.,
capture/playback tools, to control the execution of tests, the
comparison of actual results to expected results, the setting up of test
preconditions, and other test control and reporting functions.
Test execution phase: The period of time in a software development
life cycle during which the components of a software product are
executed, and the software product is evaluated to determine whether
or not requirements have been satisfied. [IEEE 610]
Test execution schedule: A scheme for the execution of test
procedures. The test procedures are included in the test execution
schedule in their context and in the order in which they are to be
executed.
Test execution technique: The method used to perform the actual test
execution, either manually or automated.
Test execution tool: A type of test tool that is able to execute other
software using an automated test script, e.g., capture/playback.
[Fewster and Graham]
Test fail: See fail.
Test generator: See test data preparation tool.
Test harness: A tool that performs automated testing of the core
components of a program or system. It is the driver of test drivers.
It runs tests under central control. A test environment comprised of stubs
and drivers needed to conduct a test.
Test incident: See incident.
Test incident report: See incident report.
Test infrastructure: The organizational artifacts needed to perform
testing, consisting of test environments, test tools, office environment,
and procedures.
Test item: The individual element to be tested. There usually is one
test object and many test items. See also test object.
Test item transmittal report: See release note.
Test level: A group of test activities that are organized and managed
together. A test level is linked to the responsibilities in a project.
Examples of test levels are component test, integration test, system
test, and acceptance test. [After TMap]
Test log: A chronological record of relevant details about the
execution of tests. [IEEE 829]
Test logging: The process of recording information about tests
executed into a test log.
Test management: The planning, estimating, monitoring, and control
of test activities, typically carried out by a test manager.
Test manager: The person responsible for testing and evaluating a test object. The individual who directs, controls, administers, plans, and regulates the evaluation of a test object.
Test maturity model (TMM): A five level staged framework for test
process improvement, related to the capability maturity model (CMM)
that describes the key elements of an effective test process.
Test object: The component or system to be tested. See also test item.
Test objective: A reason or purpose for designing and executing a
test.
Test oracle: It is a mechanism, different from the program itself that
can be used to check the correctness of the output of the program for
the test cases. It is a process in which test cases are given to test
oracles and the program under testing. The output of the two is then
compared to determine if the program behaved correctly for the test
cases. A source to determine expected results to compare with the
actual result of the software under test. An oracle may be the existing
system (for a benchmark), a user manual, or an individual’s
specialized knowledge, but should not be the code. [After Adrion]
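A small Python sketch (both functions are invented for illustration): a slower but trusted reference computation, kept separate from the code under test, supplies the expected result for each test case.

def sort_under_test(items):
    # Implementation being tested (illustrative).
    return sorted(items)

def reference_oracle(items):
    # Trusted, brute-force source of expected results; not the code under test.
    out = list(items)
    for i in range(len(out)):
        for j in range(i + 1, len(out)):
            if out[j] < out[i]:
                out[i], out[j] = out[j], out[i]
    return out

case = [5, 1, 4, 2]
assert sort_under_test(case) == reference_oracle(case)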
Test outcome: See result.
Test pass: See pass.
Test performance indicator: A metric, in general high level,
indicating to what extent a certain target value or criterion is met.
Often related to test process improvement objectives, e.g., defect
detection percentage (DDP).
Test phase: A distinct set of test activities collected into a manageable
phase of a project, e.g., the execution activities of a test level. [After
Gerrard]
Test plan: A management document outlining risks, priorities, and
schedules for testing. A document describing the scope, approach,
resources, and schedule of intended test activities. It identifies, amongst
others, test items, the features to be tested, the testing tasks,
who will do each task, degree of tester independence, the test
environment, the test design techniques and test measurement
techniques to be used, and the rationale for their choice, and any risks
requiring contingency planning. It is a record of the test planning
process. [After IEEE 829]
Test planning: The activity of establishing or updating a test plan.
Test point analysis (TPA): A formula-based test estimation method
based on function point analysis. [TMap]
Test points: They allow data to be modified or inspected at various
points in the system.
Test policy: A high-level document describing the principles,
approach, and major objectives of the organization regarding testing.
Test procedure: See test procedure specification.
Test procedure specification: A document specifying a sequence of
actions for the execution of a test. Also known as test script or manual
test script. [After IEEE 829]
Test process: The fundamental test process comprises planning,
specification, execution, recording, and checking for completion. [BS
7925/2]
Test process improvement (TPI): A continuous framework for test
process improvement that describes the key elements of an effective
test process, especially targeted at system testing and acceptance
testing.
Test record: See test log.
Test recording: See test logging.
Test repeatability: An attribute of a test indicating whether the same
results are produced each time the test is executed.
Test report: See test summary report.
Test requirement: A document that describes items and features that
are tested under a required condition.
Test result: See result.
Test run: Execution of a test on a specific version of the test object.
Test run log: See test log.
Test script: Step-by-step instructions that describe how a test case is
to be executed. A test script may contain one or more test cases.
Commonly used to refer to a test procedure specification, especially an
automated one.
Test session: One set of tests for a specific configuration of actual code and stubs.
Test situation: See test condition.
Test specification: A set of test cases, input, and conditions that are
used in the testing of a particular feature or set of features. A test
specification often includes descriptions of expected results. A
document that consists of a test design specification, test case
specification, and/or test procedure specification.
Test stage: See test level.
Test strategy: A high-level document defining the test levels to be
performed and the testing within those levels for a program (one or
more projects).
Test stub: Dummy function/component to simulate a real component.
Test suite: A collection of test scripts or test cases that is used for
validating bug fixes or finding new bugs within a logical or physical
area of a product. For example, an acceptance test suite contains all of
the test cases that are used to verify that the software has met certain
predefined acceptance criteria. On the other hand, a regression suite
contains all of the test cases that are used to verify that all previously
fixed bugs are still fixed. A set of several test cases for a component or
system under test, where the post condition of one test is often used as
the precondition for the next one.
Test summary report: A document summarizing testing activities
and results. It also contains an evaluation of the corresponding test
items against exit criteria. [After IEEE 829]
Test target: A set of exit criteria.
Test tool: A software product that supports one or more test activities,
such as planning and control, specification, building initial files and
data, test execution, and test analysis. [TMap] See also CAST.
Test type: A group of test activities aimed at testing a component or
system regarding one or more interrelated quality attributes. A test
type is focused on a specific test objective, i.e., reliability test,
usability test, regression test, etc., and may take place on one or more
test levels or test phases. [After TMap]
Testable requirements: The degree to which a requirement is stated
in terms that permit establishment of test designs (and subsequently
test cases) and execution of tests to determine whether the
requirements have been met. [After IEEE 610]
Testability: The capability of the software product to enable modified
software to be tested. [ISO 9126] See also maintainability.
Testability hooks: The code that is inserted into the program
specifically to facilitate testing.
Testability review: A detailed check of the test basis to determine
whether the test basis is at an adequate quality level to act as an input
document for the test process. [After TMap]
Tester: A technically skilled professional who is involved in the
testing of a component or system.
Testing: The process of executing the program with the intent of
finding faults. The process consisting of all life cycle activities, both
static and dynamic, concerned with planning, preparation, and
evaluation of software products and related work products to
determine that they satisfy specified requirements, to demonstrate that
they are fit for purpose, and to detect defects.
Testing interface: A set of public properties and methods that you can
use to control a component from an external testing program.
Testware: Artifacts produced during the test process required to plan,
design, and execute tests, such as documentation, scripts, inputs,
expected results, set-up and clear-up procedures, files, databases,
environment, and any additional software or utilities used in testing.
[After Fewster and Graham]
Thread testing: A version of component integration testing where the
progressive integration of components follows the implementation of
subsets of the requirements, as opposed to the integration of
components by levels of a hierarchy.
Time behavior: See performance.
Top-down testing: An incremental approach to integration testing
where the component at the top of the component hierarchy is tested
first, with lower level components being simulated by stubs. Tested
components are then used to test lower level components. The process
is repeated until the lowest level components have been tested.
Traceability: The ability to identify related items in documentation
and software, such as requirements with associated tests. See also
horizontal traceability and vertical traceability.
Transaction: A unit of work seen from a system user’s point of view.
UML (Unified Modeling Language): A standard language used for
developing software blueprints using nine different diagrams.
Understandability: The capability of the software product to enable
the user to understand whether the software is suitable, and how it can
be used for particular tasks and conditions of use. [ISO 9126] See also
usability.
Unit testing: See component testing.
Unreachable code: Code that cannot be reached and, therefore, is
impossible to execute.
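A one-line illustration in Python (invented example): the print statement can never execute because control always leaves the function at the return above it.

def area(radius):
    return 3.14159 * radius * radius
    print("done computing")      # unreachable: never executed

area(2.0)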
Usability: The capability of the software to be understood, learned,
used, and attractive to the user when used under specified conditions.
[ISO 9126]
Usability testing: The testing that validates the ease of use, speed, and
aesthetics of the product from the user’s point of view. Testing to
determine the extent to which the software product is understood, easy
to learn, easy to operate, and attractive to the users under specified
conditions. [After ISO 9126]
Use-case testing: A black-box test design technique in which test
cases are designed to execute user scenarios.
User acceptance testing: See acceptance testing.
User scenario testing: See use-case testing.
User test: A test whereby real-life users are involved to evaluate the
usability of a component or system.
V-model: A framework to describe the software development life
cycle activities from requirements specification to maintenance. The
V-model illustrates how testing activities can be integrated into each
phase of the software development life cycle.
Validation: It is the process of evaluating a system or component
during or at the end of the development process to determine whether
it satisfies the specified requirements. It involves executing the actual
software. It is a “computer-based testing” process. Confirmation by
examination and through provision of objective evidence that the
requirements for a specific intended use or application have been
fulfilled. [ISO 9000]
Variable: An element of storage in a computer that is accessible by a
software program by referring to it by a name.
Verification: It is the process of evaluating a system or component to
determine whether the products of a given development phase satisfy
the conditions imposed at the start of that phase. It is a “human
testing” activity. Confirmation by examination and through the
provision of objective evidence that specified requirements have been
fulfilled. [ISO 9000]
Vertical traceability: The tracing of requirements through the layers
of development documentation to components.
Volume testing: Testing where the system is subjected to large
volumes of data. See also resource-utilization testing.
Walkthrough: A step-by-step presentation by the author of a
document in order to gather information and to establish a common
understanding of its content. [Freedman and Weinberg, IEEE 1028]
White-box test design technique: Documented procedure to derive
and select test cases based on an analysis of the internal structure of a
component or system.
White-box testing: Testing based on an analysis of the internal
structure of the component or system.
Wideband Delphi: An expert-based test estimation technique that
aims at making an accurate estimation using the collective wisdom of
the team members.
Widget: A synonym for user interface component and control.
Wizard: A user assistance device that guides users step by step
through a procedure.
Work breakdown structure: A structure diagram that shows the
project in terms of phases.
APPENDIX D

SAMPLE PROJECT DESCRIPTION


[N.B.: Students may be encouraged to prepare descriptions of projects on these lines and then develop the following deliverables]

1. SRS Document  2. Design Document  3. Codes  4. Test Oracles

Title of the Project


Development of a practical Online Help Desk (OHD) for the facilities
in the campus.

Abstract of the Project


This project is aimed at developing an Online Help Desk (OHD) for
the facilities in the campus. This is an Intranet-based application that
can be accessed throughout the campus. This system can be used to
automate the workflow of service requests for the various facilities in
the campus. This is one integrated system that covers different kinds
of facilities like classrooms, labs, hostels, mess, canteen, gymnasium,
computer center, faculty club, etc. Registered users (students, faculty,
lab-assistants, and others) will be able to log in a request for service
for any of the supported facilities. These requests will be sent to the
concerned people, who are also valid users of the system, to get them
resolved. There are features like email notifications/reminders, the
addition of a new facility to the system, report generators, etc. in this
system.

Keywords
Generic Technology Keywords: databases, network and middleware,
programming.
Specific Technology Keywords: MS-SQL server, HTML, Active
Server Pages.
Project Type Keywords: analysis, design, implementation, testing, user
interface.

Functional Components of the Project


The following is a list of functionalities of the system. Other
functionalities that you find appropriate can be added to this list. Other
facilities that are appropriate to your college can be included in the
system. And, in places where the description of a functionality is not
adequate, you can make appropriate assumptions and proceed.
There are registered people in the system (students, faculty, lab-
assistants, and others). Some of them are responsible for maintaining
the facilities (for example, the lab-assistant is responsible for keeping
the lab ready with all the equipment in proper condition, the student
council is responsible for taking forward students’ complaints/requests
to the faculty/administration, etc.).
There are three kinds of users for this system:
1. Those who use the system to create a request (end-users).
2. Those who look at the created requests and assign them to the
concerned people (facility-heads).
3. Those who work on the assigned requests and update the status on
the system (assignees).
There is also an “Administrator” for doing the Admin-level
functions such as creating user accounts, adding new facilities to the
system, etc.
1. A person should be able to —
• login to the system through the first page of the application.
• change the password after logging into the system.
• see the status of the requests created by him/her (the status could
be one of unassigned/assigned/work in progress/closed/rejected).
• see the list of requests (both open and closed) created by him/her
in the past.
• create a new request by specifying the facility, the severity of the
request (there may be several levels of severity defined), and a
brief description of the request.
• close a request created by him/her by giving an appropriate
reason.
• see the requests that are assigned to him/her by the facility-heads
and update the status of requests (after working on them).
• view the incoming requests (if he/she is a facility-head) and
assign them to registered users of the system.
• get help about the OHD system on how to use the different
features of the system.
2. As soon as a request is created, an automatic email should be sent
to the person who created the request and the concerned facility-
head. The email should contain the request details.
3. Similarly, when any status-change occurs for a request (such as the
request getting completed etc.), an automatic email should be sent
to the person who created the request and the concerned facility-
head.
4. A summary report on the requests that came in and requests that
were serviced should be sent to every facility-head periodically
(say, once in a month).

Steps to Start-off the Project


The following steps will be helpful to start-off the project.
1. Study and be comfortable with technologies such as Active Server
Pages/HTML and SQL server. Some links to these technologies are
given in the “Guidelines and References” section of this document.
2. Decide on the list of facilities that would be supported and define it
formally.
3. Make a database of different kinds of users (End-users, Facility-
heads, Assignees).
4. Assign a system-admin who will create mail-ids for the people in
the Intranet of your lab or in the Internet. These mail-ids will be
used for sending automatic notifications and reports. The system-
admin will also take care of assigning the logins to the users of the
OHD system.
5. Create the front-page of the OHD system giving a brief description
about the system and a login box.
6. Create the help-pages of the system in the form of a Q&A. This
will help you also when implementing the system.
7. Create other subsystems like automatic notification and screens for the various functions (like create_new_request, view_open_requests, forward_new_request_to_assignee, etc.); a minimal sketch of the underlying request life cycle is given after this list.
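The following is a minimal, illustrative Python sketch of the request life cycle used in this project (the status names come from the description above; which transitions are allowed is an assumption, the function and variable names are invented, and a real implementation would persist this state in the MS-SQL database rather than in memory).

STATUS_FLOW = {
    "unassigned":       ["assigned", "rejected"],       # assumed transitions
    "assigned":         ["work in progress", "rejected"],
    "work in progress": ["closed"],
    "closed":           [],
    "rejected":         [],
}

def change_status(current, new):
    # Reject any status change the workflow above does not allow.
    if new not in STATUS_FLOW.get(current, []):
        raise ValueError("cannot move a request from %r to %r" % (current, new))
    return new

# e.g., a facility-head assigning a newly created request:
print(change_status("unassigned", "assigned"))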
Requirements:
Hardware requirements:

Number  Description                              Alternatives (if available)
1.      PC with 2 GB hard-disk and 256 MB RAM    Not-Applicable
2.

Software requirements:

Number  Description                              Alternatives (if available)
1.      Windows 95/98/XP with MS-Office          Not-Applicable
2.      MS-SQL server                            MS-Access
3.

Manpower requirements:
2-3 students can complete this in 4-6 months if they work full-time on
it.

Milestones and timelines:


Guidelines and References
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnasp/html/asptutorial.asp (ASP tutorial)
http://www.functionx.com/sqlserver/ (SQL-server tutorial)
APPENDIX E

BIBLIOGRAPHY
Special thanks to the great researchers without whose help this book
would not have been possible:
1. Jorgensen Paul, “Software Testing—A Practical Approach”, CRC
Press, 2nd Edition 2007.
2. Srinivasan Desikan and Gopalaswamy Ramesh, “Software testing
—Principles and Practices”, Pearson Education Asia, 2002.
3. Tamres Louise, “Introduction to Software Testing”, Pearson
Education Asia, 2002.
4. Mustafa K., Khan R.A., “Software Testing—Concepts and
Practices”, Narosa Publishing, 2007.
5. Puranik Rajnikant, “The Art of Creative Destruction”, Shroff
Publishers, First Reprint, 2005.
6. Agarwal K.K., Singh Yogesh, “Software Engineering”, New Age
Publishers, 2nd Edition, 2007.
7. Khurana Rohit, “Software Engineering—Principles and Practices”,
Vikas Publishing House, 1998.
8. Agarwal Vineet, Gupta Prabhakar, “Software Engineering”, Pragati
Prakashan, Meerut.
9. Sabharwal Sangeeta, “Software Engineering—Principles, Tools
and Techniques”, New Age Publishers, 1st Edition, 2002.
10. Mathew Sajan, “Software Engineering”, S. Chand and Company
Ltd., 2000.
11. Kaner, “Lessons Learned in Software Testing”, Wiley, 1999.
12. Rajani Renu, Oak Pradeep, “Software Testing”, Tata McGraw Hill,
First Edition, 2004.
13. Nguyen Hung Q., “Testing Applications on Web”, John Wiley,
2001.
14. “Testing Object-Oriented Systems—A Workshop Workbook”, by
Quality Assurance Institute (India) Ltd., 1994-95.
15. Pressman Roger, “A Manager’s Guide to Software Engineering”,
Tata McGraw Hill, 2004.
16. Perry William E., “Effective Methods for Software Testing”, Wiley
Second Edition, 1995.
17. Cardoso Antonio, “The Test Estimation Process”, June 2000,
Stickyminds.com.
18. Van Veenendaal Erik P.W.M., Dekkers Ton, “Test Point Analysis: A Method for Test Estimation”.
19. Mathur P. Aditya, “Foundations of Software Testing”, Pearson
Education Asia, First Impression, 2008.
20. Black Rex, “Test Estimation”, 2002, STQE Magazine.com.
21. Nangia Rajan, “Software Testing”, Cyber Tech. Publications, First
Edition, 2008.
22. Stout A. Glenn, “Testing a Website: Best Practices”, http://www.stickyminds.com/sitewide.asp/XUS893545file/.pdf.
23. Kota Krishen, “Testing Your Web Application—A Quick 10–Step
Guide”, Stickyminds.com.
24. Intosh Mac, Strigel W., “The Living Creature”—Testing Web
Applications, 2000,
http://www.stickyminds.com/docs_index/XUS11472file/.pdf.
25. Binder Robert V., “Testing Object Oriented Systems—Models,
Patterns and Tools”, Addison Wesley, 1999.
26. Websites: www.stickyminds.com, www.testingcenter.com.
INDEX

A
Acceptance testing 268, 371
for critical applications 272
types of 271
Advantages of white box testing 191
Automated testing 409
disadvantages of 415
benefits of 414
Available testing tools, techniques and metrics 20

B
Basic concepts of state machines 310
Basic terminology related to software testing 11
Basic test plan template 583–589
Basic unit for testing, inheritance and testing 302
Basis path testing 149
Benefits of automated testing 414
Beta testing 255
Big bang integration 240
Black box testing 489, 560
Black box (or functional) testing techniques 65
Bottom-up integration approach 239
Boundary value analysis (BVA) 66
guidelines for 74
limitations of 67
robustness testing 67
worst-case testing 68
Business vertical testing (BVT) 252
BVA. See Boundary value analysis

C
Call graph based integration 241
Categorizing V&V techniques 33
Cause-effect graphing technique 97
causes and effects 97
guidelines for 101
Certification, standards and testing for compliance 257
Characteristics of modern testing tools 425
Chasing false defects 569
Classification of integration testing 238
Code complexity testing 141
Code coverage testing 136
Coming up with entry/exit criteria 258
Comparison of
black box and white box testing in tabular form 188
conventional and object oriented testing 390
various regression testing techniques 223
white box, black box and gray box testing approaches 209
Comparison on
black box (or functional) testing techniques 102
various white box testing techniques 190
Concept of balance 107
Condition coverage 139
Configuration testing 486
Consideration during automated testing 410
Corrective regression testing 221
Coupling between objects (CBO) 145
Criteria for selection of test tools 422
Cyclomatic complexity
properties and meaning 141, 147
Cyclomatic density (CD) 143

D
Data flow testing 171
DD path testing 165
Debugging 418
Decision table based testing 86
guidelines for 96
Decision tables 87
advantages, disadvantages and applications of 87
Decomposition-based integration 238
Deployment acceptance test 272
Deployment testing 253
Design for testability (DFT) 387
Design/architecture verification 251
Differences between QA and QC 31
Differences between regression and normal testing 220
Differences between verification and validation 30
Disadvantages of automated testing 415
Dynamic white box testing techniques 135

E
Equivalence class test cases for next date function 79
Equivalence class test cases for the commission problem 84
Equivalence class test cases for the triangle problem 78
Equivalence class testing 74
guidelines for 85
strong normal 76
strong robust 77
weak normal 75
weak robust 76
Essential density metric (EDM) 142
Evolving nature of area 31
Executing test cases 270

F
Factors governing performance testing 274
Formal verification 37
Function coverage 140
Functional acceptance simple test (FAST) 271
Functional system testing techniques 250
Functional testing (or black box testing) 489
guidelines for 105
techniques 65
Functional versus non-functional system testing 248

G
Game testing process 559
Good bug writing 573
Graph matrices technique 169
Gray box testing 207
various other definitions of 208
GUI testing 390
Guidelines for
BVA 74
cause-effect functional testing technique 101
choose integration method and
conclusions 241
decision table based testing 96
equivalence class testing 85
functional testing 105
scalability testing 262

H
Heuristics for class testing 356

I
Implementation-based class testing/white box or structural testing 333
Incremental testing approach 10
Independent V&V contractor (IV&V) 49
Integration complexity 143
Integration testing of classes 367
Integration testing 237
Interoperability testing 266
K
Kiviat charts 105

L
Levels of object oriented testing 363
Levels of testing 235–291
Life cycle of a build 563
Limitations of BVA 67
Limitations of testing 19

M
Managing key resources 259
Managing the test process 383
Measurement of testing 10
Mutation testing versus error seeding 186

N
Neighborhood integration 242
Non-functional testing (or white box testing) 486
Non-functional testing techniques 258

O
Object oriented testing 301
levels of 363

P
Pairwise integration 241
Path analysis 453
process 454
coverage 138
Path-based integration with its pros and cons 243
Pathological complexity 143
Performance testing 273, 486
challenges 280
factors governing 274
steps of 279
tools for 290
Phase wise breakup over testing life cycle 453
Positive and negative effect of software V&V on projects 48
Practical challenges in white box testing 190
Principles of testing 18
Prioritization guidelines 215
Prioritization of test cases for regression testing 224
Priority category scheme 216
Problems with manual testing 413
Progressive regression testing 221
Proof of correctness (formal verification) 37
Pros and cons of decomposition-based techniques 240
Pros and cons 242

R
Rationale for STRs 43
Recoverability testing 488
Regression and acceptance testing 381
Regression testing at integration level 222
Regression testing at system level 223
Regression testing at unit level 222
Regression testing in object oriented software 224
Regression testing of a relational database 493–500
Regression testing of global variables 223
Regression testing technique 225
Regression testing 220, 381, 571
types of 221
Release acceptance test (RAT) 271
Reliability testing 262, 489
Requirements tracing 38
Response for class (RFC) 145
Responsibility-based class testing/black-box/functional specification-based testing of classes 345
Retest-all strategy 221
Risk analysis 217
Robustness testing 67
Role of V&V in SDLC 33

S
Sandwich integration approach 239
Scalability testing 260, 487
Security testing 488
Selecting test cases 269
Selection of good test cases 9
Selective strategy 221
Setting up the configuration 258
Simulation and prototyping 38
Skills needed for using automated tools 416
Slice based testing 226
Smoke testing 571
Software technical reviews 43
rationale for 43
review methodologies 46
types of 45
Software testing 2
basic terminology related to 11
Software V&V planning (SVVP) 39
Software verification and validation 29–57
Standard for software test documentation (IEEE829) 50
State machines
basic concepts of 310
Statement coverage 136
Static versus dynamic white box testing 134
Steps for tool selection 424
Steps of performance testing 279
Stress testing 263
Strong normal equivalence class testing 76
Strong robust equivalence class testing 77
System testing 246, 371

T
Test automation: “no silver bullet” 431
Test cases for commission problem 72, 96
Test cases for next date function 71, 91
Test cases for payroll problem 100
Test cases for the triangle problem 69, 90, 98
Test execution issues 394
Test point analysis (TPA) 435
for case study 450
methodology 436
model 437
philosophy 436
Testing
effectiveness 104
efficiency 104
effort 102
levels of 235–291
life cycle 17
measurement of 10
number of 9
object oriented systems 333
performance See Performance testing
principles of 18
purpose 6
Testing “around” a bug 572
Testing of e-learning management systems 505–557
Testing process 2
Testing using orthogonal arrays 392
Tools for performance testing 290
TPA. See Test point analysis
Transaction testing 489
Types of
acceptance testing 271
decomposition based techniques top-down 238
integration approach 238
RDBMS 494
regression testing 221
STRs 45
testing tools-static v/s dynamic 411

U
Unit testing a class 364
Unit, integration, system and acceptance testing relationship 236
Unit/code functional testing 135
Usability testing 486

V
Version control 567
V&V limitations 32

W
Weak normal equivalence class testing 75
Weak robust equivalence class testing 76
Web browser-page tests 489
Weighted methods for class (WMC) 145
When to stop testing 18
White box testing 486, 562. See also Black box testing
advantages of 191
comparison of black box and 188
practical challenges in 190
static versus dynamic 134
White box (or structural) testing techniques 133
Worst-case testing 68
