Indian Institute of Technology (BHU), Varanasi: Software Metrics (CS-7003) "Empirical Investigation "

This document outlines the principles and techniques for conducting empirical software engineering investigations through surveys, case studies, and formal experiments. It discusses four key principles: choosing an investigative technique, stating a hypothesis, maintaining control over variables, and making the investigation meaningful. It then covers planning formal experiments, including experimental procedures, principles of experimental design, types of designs, and selecting a design. Finally, it discusses planning case studies by considering sister projects, baselines, and random selections. The overall goal is to scientifically evaluate tools, techniques, methods and models in the field of software engineering.

Uploaded by

Kiran Kumar

Indian Institute of Technology (BHU), Varanasi

Department of Computer Science and Engineering

Software Metrics
(CS-7003)
“Empirical Investigation ”

Course Instructor: Dr. Amrita Chaturvedi, Assistant Professor
Presented By: B Kiran Kumar, PhD scholar

Outline
• Four principles of investigation
◦ Choosing an investigative technique
◦ Stating the hypothesis
◦ Maintaining control over variables
◦ Making your investigation meaningful
• Planning formal experiments
◦ Procedure for performing experiments
◦ Principles of experimental design
◦ Types of experimental designs
◦ Selecting an experimental design
• Planning case studies
◦ Sister projects
◦ Baselines
◦ Random selections

Four principles of investigation
• Suppose your manager proposes using a new tool, technique, or method.
• You may then decide to investigate that tool, technique, or method in a scientific way.
• A scientific investigation can take the form of a survey, a case study, or a formal experiment.
• This section addresses how to conduct a scientific investigation and describes the different types of investigation, with examples.

Choosing an investigative technique
• A survey is a retrospective study of a situation that tries to document relationships and outcomes.
• A survey is always done after an event has occurred.
• In software engineering, surveys are often conducted as polls to determine which option receives the majority of votes.
• Based on the majority vote, the preferred method, technique, tool, or relationship is determined.

Example

• Because a survey is retrospective, we have no control over the situation being studied.
• Case studies and formal experiments are usually not retrospective. We decide in advance what we want to investigate and plan what data to capture to support the investigation.
• A case study is a research technique in which you identify key factors that may affect the outcome of an activity and then document the activity: its inputs, constraints, resources, and outputs.
• A formal experiment is a rigorous, controlled investigation of an activity, in which key factors are identified and manipulated to document their effects on the outcome.
• Because formal experiments require a great deal of control, they tend to be small, involving a small number of people or events.

• Case studies generally look at typical projects, rather than trying to capture information about all possible cases.
• A case study can be thought of as “research in the typical”, whereas a survey is “research in the large”.
• Several guidelines help you choose among surveys, formal experiments, and case studies.
• The first step is deciding whether the investigation is retrospective.
• If the activity you are investigating has already occurred, you must perform a survey or a case study; if it has not yet occurred, you may choose between a case study and a formal experiment.
• The central factor in that choice is the level of control that a formal experiment requires.

• If you have a high level of control over the variables that can affect the outcome, then you can consider an experiment.
• If you do not have that control, then a case study is the preferred technique.

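The selection guidelines above can be sketched as a small decision function. This is only an illustration of the stated rules, not a definitive procedure; the function name `choose_technique` and its two boolean parameters are assumptions introduced here.

```python
from typing import List


def choose_technique(already_occurred: bool, high_control: bool) -> List[str]:
    """Return the candidate techniques suggested by the guidelines above."""
    if already_occurred:
        # Retrospective: the activity can no longer be manipulated.
        return ["survey", "case study"]
    if high_control:
        # The variables can be manipulated, so an experiment is feasible.
        return ["formal experiment"]
    # Not retrospective, but little control over the variables.
    return ["case study"]


print(choose_technique(already_occurred=False, high_control=True))
```
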
Stating the hypothesis
• After deciding how to investigate, the first step is to decide what you want to investigate.
• This step helps us identify which type of research technique is most appropriate for the situation.
• The goal of the research can be expressed as a hypothesis, that is, a statement of what we want to know from the investigation.
• The hypothesis is a tentative theory that explains what we want to explore.
• For example: method A provides better results than method B.

Maintaining control over variables
• Once we have an explicit hypothesis, we can figure out which variables can affect its truth.
• After identifying the variables, we must decide how much control we have over them.
• A state variable is a factor that can characterize your project and influence your evaluation results.
• State variables are sometimes called independent variables, because they can be manipulated to affect the outcome.
• A dependent variable is one whose value is affected by changing one or more independent variables.
• Variables are selected so that they exhibit the characteristics of the project.

• In the above example, the case study might involve choosing a language that is used in most of the projects, rather than trying to choose a set of projects that covers as many languages as possible.
• Thus a state variable is used to distinguish between

Making your investigation meaningful
• Confirming theories and conventional wisdom
• Many techniques are adopted because conventional wisdom suggests that they are the best approaches in software engineering.
• Many national and international standards follow this approach.
• Many organizations use standard limits on structural measures, or rules of thumb about module size, to assure the quality of software.
• For example, a limit on McCabe's cyclomatic number, or a rule that module size must be less than 200 lines of code.
• Case studies and surveys can be used to check the validity of such claims within the organization.
• Formal experiments can be used to provide a context in which certain standards, methods, and tools are recommended for use.

Exploring relationships
• Software practitioners are often interested in the relationships among various attributes of resources and software products:
◦ How does the project team's experience with the application area affect the quality of the resulting code?
◦ How does requirements quality affect the productivity of the designers?
◦ How does the design structure affect the maintainability of the code?
◦ Such a relationship can be suggested by a case study or by a survey.
◦ For example, a survey of completed projects may reveal that software written in Ada has fewer faults than software written in other languages.
◦ Clearly, understanding and verifying these relationships is crucial to the success of future projects.
◦ Each relationship can be expressed as a hypothesis, and a formal experiment can be designed to test the degree to which the relationship holds.

Evaluating the accuracy of the models
• Models are often used to predict the outcome of an activity or to guide the use of methods or tools.
• For example, size models such as function points suggest how large the code may be.
• Cost models predict how much development or maintenance is likely to cost.
• Capability maturity models guide the use of techniques such as configuration management or the introduction of testing tools and methods.
• Formal experiments can confirm or refute the accuracy and dependability of these models.
• Models present a particularly difficult problem when designing an experiment or case study, because their predictions can affect the outcome.
• Predictions become goals, and the developers strive to achieve those goals, intentionally or not.

• This effect is common when cost and schedule models are used and project managers turn the predictions into targets for completion.
• For this reason, experiments evaluating models can be designed as double-blind experiments, in which the participants do not know the predictions until the experiment is done.
• Some models, such as reliability models, do not influence the outcome, since reliability measured as mean time to failure cannot be evaluated until the project is completed.

Validating measures
• In general, a study can be conducted to test whether a given measure appropriately reflects changes in the attribute it is supposed to capture.
• Validating measures is fraught with problems.
• Validation is often performed by correlating one measure with another.
• It is very important that the second measure is itself a direct and valid measure of the factor it reflects.
• A measure must conform to human notions of the attribute, and it should be expressed in mathematical notation.
• That notation must preserve the relationships observed in the attribute, whether the measure is subjective or objective.

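Validation by correlation can be sketched with a Pearson correlation coefficient computed from first principles. This is a minimal sketch: the choice of Pearson's r, the function name `pearson_r`, and the two data series are assumptions made for illustration, not values from the lecture.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equally long series of measurements."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical per-module values: a candidate measure and a second measure
# that is assumed to be a direct, already validated measure of the attribute.
candidate = [12, 18, 25, 31, 40]
reference = [3, 5, 6, 9, 11]

r = pearson_r(candidate, reference)
print(f"correlation between candidate and reference measure: {r:.3f}")
```

A high correlation only supports the candidate measure to the extent that the reference measure is itself valid, which is the caveat stated in the slide.
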
Planning formal experiments
Formal experiments require a great deal of care and planning if they are to provide meaningful, useful results.
• Procedure for performing experiments
◦ Conception
◦ Design
◦ Preparation
◦ Execution
◦ Analysis
◦ Dissemination and decision making

Conception
• The first step is to decide what you want to learn and to define the goal of your experiment.
• The conception stage includes the type of analysis required, to ensure that a formal experiment is the most appropriate research technique to use.
• Next, the objective of the study must be stated clearly and precisely.
• The objective may be to show that one tool or method is superior in some way to another.
• It may also be to show how differences in environmental conditions or in the quality of resources affect the use or output of a method or tool.
• The objective must be stated so that it can be clearly evaluated at the end of the experiment; that is, it should be stated as a question you want to answer.
• The next step is to design an experiment that will provide the answer.

Design
• Once the objective is clearly stated, you must translate it into a formal hypothesis.
• There are usually two hypotheses: (a) the null hypothesis and (b) the experimental (alternative) hypothesis.
• The null hypothesis assumes that there is no significant difference between two treatments (methods, tools, techniques, etc.) with respect to the dependent variable you are measuring (such as productivity, cost, or quality).
• The alternative hypothesis, on the other hand, posits a significant difference between the two treatments.
• It is always easy to tell which hypothesis is the null and which is the alternative from the statistical assumption.
• The null hypothesis is assumed to be true unless the data indicate otherwise.
• Testing the null hypothesis means determining whether the data are convincing enough to reject the null hypothesis and accept the alternative as true.

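Testing a null hypothesis of "no difference between two treatments" can be sketched with an exact permutation test. The slides do not name a specific test, so the permutation test, the function name `exact_permutation_p_value`, and the productivity figures below are all assumptions chosen for illustration.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test on the difference of group means.

    Under the null hypothesis the treatment labels are interchangeable, so
    every way of splitting the pooled data into two groups is equally likely.
    The p-value is the share of splits at least as extreme as the observed one.
    """
    pooled = list(group_a) + list(group_b)
    indices = range(len(pooled))
    observed = abs(mean(group_a) - mean(group_b))
    extreme = 0
    total = 0
    for chosen in combinations(indices, len(group_a)):
        chosen_set = set(chosen)
        first = [pooled[i] for i in chosen]
        second = [pooled[i] for i in indices if i not in chosen_set]
        # Small tolerance guards against floating-point rounding.
        if abs(mean(first) - mean(second)) >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total


# Hypothetical productivity figures for two treatments (methods A and B).
method_a = [10, 11, 12, 13]
method_b = [20, 21, 22, 23]
p = exact_permutation_p_value(method_a, method_b)
print(f"p-value: {p:.4f}")
```

A small p-value means the data would be unusual if the null hypothesis were true, which is the sense in which the data are "convincing enough" to reject it. The exact enumeration is only practical for small groups.
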
Preparation
◦ Preparation involves readying the subjects for the application of the treatment.
◦ For example, preparing for the experiment may involve purchasing tools, training staff, or configuring hardware in a certain way.
◦ Instructions must be written out or recorded properly.
◦ To ensure that the plan is complete and the instructions are understandable, a dry run of the experiment on a small set of people may be useful.

Execution
• Following the steps laid out in the plan, and measuring the attributes as prescribed in the plan, you apply the treatment to the experimental subjects.
• For the results to be sensible, the items must be measured carefully and the treatments applied consistently.
• Finally, the experiment can be executed.

Analysis
• The analysis has two parts.
• First, you must review all the measurements taken to make sure that they are valid and useful.
• Organize the measurements into sets of data that will be examined as part of the hypothesis-testing process.
• Second, analyze the sets of data according to statistical principles.
• These statistical tests will tell whether the null hypothesis is supported or refuted by the results of the experiment.

Dissemination and decision making
• At the end of the analysis phase, a conclusion can be drawn about how the different characteristics examined affected the outcome.
• Document the conclusion in a way that will allow others to duplicate the experiment and confirm the conclusion in a similar setting.
• All the key aspects of the research, such as the objectives, the hypothesis, the experimental subjects and objects, the response, the state variables, the treatments, and the resulting data, must be documented properly.
• You must state the conclusion clearly, making sure to address any problems experienced while running the experiment.
◦ For example, a staff change or tool upgrade must be noted.

• The experimental results may be used in three ways:
◦ First, you may use them to support decisions about how you will develop and maintain software in the future: what tools or methods will be used, and in what situations.
◦ Second, others may use your results to suggest changes to their development environments.
◦ Others are likely to replicate your experiment to confirm the results on their own similar projects.
◦ Third, you and others may perform similar experiments with variations in the experimental subjects or state variables.
◦ These new experiments will help to show how the results are affected by carefully controlled changes.
◦ For example, if your experiment demonstrates a positive change in quality from using C++, others may test whether quality can be improved still further by using C++ in concert with a C++-related tool or in a particular application domain.

Principles of experimental design
• Useful results depend upon careful, rigorous, and complete experimental design.
• A simple design helps to make the experiment practical, minimizing the use of time, money, personnel, and experimental resources.
• Simple designs are also easy to analyze.
• Experimental design has two important concepts:
◦ Experimental units
◦ Experimental error
◦ An experimental unit is an experimental object to which a single treatment is applied.
◦ A single treatment is applied many times in different groups to check the difference in results even if the conditions are almost the same.

• Experimental error describes the failure of two identically treated experimental units to yield identical results.
• The error reflects a host of problems:
◦ Errors of experimentation
◦ Errors of observation
◦ Errors of measurement
◦ The variation in experimental resources
◦ The combined effect of all extraneous factors that can influence the characteristic under study but that have not been singled out for attention in the investigation
◦ The aim of good experimental design is to control for as many variables as possible, both to minimize variability among participants and to minimize the effects of irrelevant variables.

• Eliminate the effects of other variables so that only the effects of the independent variables are reflected in the values of the dependent variables.
• By doing this, we are eliminating experimental error.
• Complete elimination is not possible in reality.
• Experiments are therefore designed so that the effects of irrelevant variables are distributed across all the experimental conditions.
• This can be addressed by using replication, randomization, and local control.

Randomization
• Replication makes possible a statistical test of the significance of the results.
• But it does not ensure the validity of the results.
• Randomization is the random assignment of subjects to groups, or of treatments to experimental units, so that the results can be considered valid.
• Randomization does not guarantee independence, but it allows us to assume that the correlation in any comparison of treatments is as small as possible.
• By randomly assigning treatments to experimental units, the results can be kept from being biased by sources of variation over which we have no control.
• For example, the results of an experimental trial can sometimes be affected by the time, the place, or unknown characteristics of the participants.
• These uncontrollable factors can have effects that hide or skew the results of the controllable variables.

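Random assignment of treatments to experimental units can be sketched in a few lines. This is a minimal illustration; the function name `randomize`, the project names, and the two tool labels are hypothetical.

```python
import random


def randomize(units, treatments, seed=None):
    """Randomly assign one treatment to each experimental unit.

    The units are shuffled and then dealt out to the treatments in turn,
    so group sizes differ by at most one.
    """
    rng = random.Random(seed)  # a fixed seed makes the plan reproducible
    shuffled = list(units)
    rng.shuffle(shuffled)
    return {
        unit: treatments[position % len(treatments)]
        for position, unit in enumerate(shuffled)
    }


# Hypothetical experimental units and treatments.
projects = [f"project_{i}" for i in range(1, 13)]
assignment = randomize(projects, ["tool A", "tool B"], seed=42)
for project in projects:
    print(project, "->", assignment[project])
```

Recording the seed alongside the results is one way to let others reproduce the assignment when they replicate the experiment.
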
Replication
• Replication is the repetition of the basic experiment.
• The experiment is repeated under identical conditions, rather than repeating measurements on the same experimental unit.
• Replication provides an estimate of experimental error that acts as a basis for assessing the importance of observed differences in an independent variable, using statistical techniques.
• Replication enables us to estimate the mean effect of any experimental factor.
• Confounded effects must be avoided in replication.
• Two or more variables are confounded when it is impossible to separate their effects in the subsequent analysis.

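One common way to turn replicates into an estimate of experimental error is the pooled within-treatment variance. The slides do not prescribe a formula, so the formula, the function name `experimental_error`, and the data below are assumptions for illustration.

```python
from statistics import mean


def experimental_error(replicates):
    """Pooled within-treatment variance of identically treated units.

    `replicates` maps each treatment to the responses of its replicated
    experimental units. Variation inside a treatment cannot be caused by
    the treatment itself, so it serves as an estimate of experimental error.
    """
    sum_of_squares = 0.0
    degrees_of_freedom = 0
    for values in replicates.values():
        if len(values) < 2:
            raise ValueError("each treatment needs at least two replicates")
        centre = mean(values)
        sum_of_squares += sum((value - centre) ** 2 for value in values)
        degrees_of_freedom += len(values) - 1
    return sum_of_squares / degrees_of_freedom


# Hypothetical fault counts from three replicates of each treatment.
faults = {"method A": [14, 16, 15], "method B": [21, 19, 20]}
print(experimental_error(faults))
```
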
Local control
• One of the key factors that distinguishes a formal experiment from a case study is the degree of control.
• Local control is the aspect of experimental design that reflects how much control you have over the placement of subjects in experimental units and the organization of those units.
• Local control is usually discussed in terms of two characteristics of the design: blocking and balancing of the units.
• Blocking means allocating experimental units to blocks or groups so that the units within a block are relatively homogeneous.
• The blocks are designed so that the predictable variation among units is confounded with the effects of the blocks.
• The experimental design thus captures the anticipated variation in the blocks, so that this variation does not contribute to the experimental error.

• Balancing is the blocking and assigning of treatments so that an equal number of subjects is assigned to each treatment, whenever possible.
• Balancing is desirable because it simplifies the statistical analysis.
• Designs can range from completely balanced to having little or no balance.
• In experiments investigating only one factor, blocking and balancing play important roles.
• The experimental design should include a description of the blocks defined and the allocation of treatments to each.

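Blocking and balancing can be sketched together: subjects are first grouped into homogeneous blocks, and treatments are then assigned at random and in equal numbers inside each block. The function name `blocked_assignment`, the developer names, and the experience levels are hypothetical.

```python
import random
from collections import defaultdict


def blocked_assignment(block_of, treatments, seed=None):
    """Assign treatments at random within each block, in a balanced way.

    `block_of` maps each subject to its block (for example an experience
    level). Within a block, subjects are shuffled and dealt out to the
    treatments in turn, so each treatment gets an equal share of the block
    whenever the block size allows it.
    """
    rng = random.Random(seed)
    blocks = defaultdict(list)
    for subject, block in block_of.items():
        blocks[block].append(subject)

    assignment = {}
    for block in sorted(blocks):
        members = blocks[block]
        rng.shuffle(members)
        for position, subject in enumerate(members):
            assignment[subject] = treatments[position % len(treatments)]
    return assignment


# Hypothetical subjects, blocked by experience.
experience = {
    "dev1": "junior", "dev2": "junior", "dev3": "junior", "dev4": "junior",
    "dev5": "senior", "dev6": "senior", "dev7": "senior", "dev8": "senior",
}
plan = blocked_assignment(experience, ["method A", "method B"], seed=1)
for developer, method in sorted(plan.items()):
    print(developer, experience[developer], "->", method)
```

Because each method is used by the same number of juniors and seniors, differences in experience cannot masquerade as a difference between the methods.
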
Types of experimental design
• Many types of experimental design are likely to be used in software engineering research.
• It is useful to understand several types of design, because the type of design constrains the type of analysis that can be performed and the type of conclusions that can be drawn.
• For example, the F-statistic for an analysis of variance can be calculated in several ways.
• The choice of calculation depends on the experimental design, including the number of variables and the way in which the subjects are grouped and balanced.
• Most designs in software engineering research are based on two simple relations between factors:
◦ Crossing
◦ Nesting

Crossing
• Expressing the design in terms of factors, called a factorial design, tells you how many different treatment combinations are required.
• Two factors A and B in a design are said to be crossed if each level of each factor appears with each level of the other factor, denoted A × B.
Example of crossed design

• In the figure, a_i represents the levels of factor A.
• b_j represents the levels of factor B.
• The first row of the figure shows level 1 of A occurring with level 1 of B, with level 2 of B, and with level 3 of B.
• The first column shows level 1 of B occurring with each of the two levels of A.
• A crossed design with m levels of the first factor and n levels of the second factor has mn cells, each cell representing a particular situation.
• The figure has 2 levels of A and 3 levels of B, resulting in 6 possible treatments in total.

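The cells of a crossed design A × B are simply the Cartesian product of the factor levels, which can be sketched as follows. The function name `crossed_design` and the level labels are hypothetical.

```python
from itertools import product


def crossed_design(levels_a, levels_b):
    """Return every treatment combination of a crossed design A x B."""
    return list(product(levels_a, levels_b))


# Two levels of factor A crossed with three levels of factor B.
cells = crossed_design(["a1", "a2"], ["b1", "b2", "b3"])
for cell in cells:
    print(cell)
print("number of treatments:", len(cells))
```

With m levels of A and n levels of B the list always has m times n entries, matching the cell count stated above.
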
Nesting
• Factor B is nested within factor A if each meaningful level of B occurs with only one level of factor A.
• The relationship is written B(A), where B is the nested factor and A is the nest factor.
Example of a two-factor nested design

• In the previous figure there are two levels of factor A and three levels of factor B.
• Here B depends on A, and each level of B occurs with only one level of A; that is, B is nested within A.

Understanding nesting with an example

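The defining property of nesting, that each level of B occurs with only one level of A, can be checked mechanically. This is a minimal sketch; the function name `is_nested` and the level labels are hypothetical.

```python
def is_nested(pairs):
    """Return True if factor B is nested within factor A.

    `pairs` is an iterable of (level_of_a, level_of_b) treatment
    combinations. B is nested within A when no level of B is ever
    paired with more than one level of A.
    """
    owner = {}
    for level_a, level_b in pairs:
        # Remember the first A level seen with this B level; any other
        # A level appearing later breaks the nesting property.
        if owner.setdefault(level_b, level_a) != level_a:
            return False
    return True


# Hypothetical designs: in the first, every B level belongs to one A level.
nested = [("a1", "b1"), ("a1", "b2"), ("a2", "b3"), ("a2", "b4")]
crossed = [("a1", "b1"), ("a1", "b2"), ("a2", "b1"), ("a2", "b2")]
print(is_nested(nested), is_nested(crossed))
```
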
Advantages of expressing design in terms of factorials
• Factorial designs ensure that resources are used most efficiently.
• The information obtained in the experiment is complete and reflects the various possible interactions among variables.
• A factorial design involves implicit replication, yielding the related benefit of reduced experimental error.

Selecting an experimental design
Choosing the number of factors
• Experiments may involve one variable (factor) or several.
• The more comparisons there are, the more complex the experiment is to conduct.
• A comparison here means, for example, the effect of using different languages or of several different tools.
• One-variable experiments are simple to analyze, because the effects of the single factor are isolated from other variables that may affect the outcome.

Example: no interaction between factors versus interaction between factors

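Whether two factors interact can be sketched for a 2 × 2 design by comparing the effect of one factor at each level of the other. The function name `interaction_contrast`, the level labels, and the cell means are assumptions for illustration, not data from the lecture.

```python
def interaction_contrast(cell_means):
    """Interaction contrast for a 2 x 2 design.

    `cell_means` maps (level_of_a, level_of_b) to the mean response, with
    levels "a1", "a2" and "b1", "b2". The contrast is the effect of B at
    level a2 minus the effect of B at level a1. A value of zero means the
    two lines in an interaction plot are parallel (no interaction).
    """
    effect_of_b_at_a1 = cell_means[("a1", "b2")] - cell_means[("a1", "b1")]
    effect_of_b_at_a2 = cell_means[("a2", "b2")] - cell_means[("a2", "b1")]
    return effect_of_b_at_a2 - effect_of_b_at_a1


# Hypothetical mean productivity per cell.
parallel = {("a1", "b1"): 10, ("a1", "b2"): 15, ("a2", "b1"): 20, ("a2", "b2"): 25}
crossing = {("a1", "b1"): 10, ("a1", "b2"): 15, ("a2", "b1"): 20, ("a2", "b2"): 12}
print(interaction_contrast(parallel), interaction_contrast(crossing))
```

In practice a nonzero contrast still has to be judged against the experimental error before concluding that an interaction is real.
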
Factors versus blocks
• Blocking can be used to improve an experimental design.
• Blocking can be applied only after deciding the number of factors appropriate to the experiment.
• To determine which approach is best, blocking or adding a factor, consider the basic hypothesis.
• If we are interested in whether design A is better than design B, then staff experience should be treated as a blocking variable.
• If we are interested in whether the results of using designs A and B are influenced by staff experience, then experience should be treated as a factor.
• Note: if interactions are not of interest, blocking is better; if interactions are important, multiple factors are needed.

Guidelines for blocking
• If you are deciding between two methods or tools, identify the state variables that are likely to affect the results and sample over those variables using blocks, to ensure an unbiased assignment of experimental units to the alternative methods or tools.
• If you are deciding between methods or tools in a variety of circumstances, identify the state variables that define the different circumstances and treat each variable as a factor.
• To keep it simple: use blocks to eliminate bias and use factors to distinguish cases or circumstances.

Choosing between nested and crossed design
• After selecting the appropriate number of factors for the experiment, we need to select a structure that supports the investigation and answers the questions.
Example of crossed design for design methods and tool usage

• In the above example, 12 projects are organized and 3 projects are assigned randomly to each treatment in the design.
Nested design

• A nested design is useful for investigating one factor with two or more conditions.
• A crossed design is useful for looking at two factors, each with two or more conditions.
• The more factors there are, the more complex the analysis.
• The next slide shows how to choose a design in software engineering.

Fixed or random effects
• Some factors allow us to have complete control over them in the design.
• Examples are the programming language used to design the system or the word processor the system is designed on.
• Other factors, such as staff experience, are not easy to control.
• The degree of control over factor levels is an important consideration in choosing an experimental design.
• A fixed-effects model has factor levels or blocks that are controlled.
• A random-effects model has factor levels or blocks that are random samples from a population of values.

Matched or same subject design
• Economy sometimes prevents us from using different subjects for each type of treatment in an experimental design.
• For example, there might not be enough programmers to participate in the experiment, or there might be a shortage of funds.
• We can then use the same subjects for different treatments.
• For example, we can ask the same programmer to use tool A in one situation and tool B in another.
• To design an experiment, you should decide how many and what type of subjects you want to use.
• One can also use the same subjects for one factor but different subjects for another factor, yielding a mixed between- and within-subjects design.

Repeated measurements
• Repeating a measurement can be helpful in validating it, by assessing the error associated with the measurement process.
• The figure depicts the results of an experiment involving one product and three developers.

• Each developer was asked to calculate the number of function points in the product at three times during development: specification, design, and code.
• In the figure, 3 points are marked at each of these 3 estimation times.
• The horizontal variation indicates the variation over time, while the vertical differences at each measurement time indicate the variation due to differences among the developers.
• Repeated measurements add value to the experiment, but they make it more complex and require careful analysis.
• The horizontal variation helps us to understand the error about the line connecting the means at each measurement time, and the vertical variation helps us to identify observational error.

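The two kinds of variation described above can be sketched numerically: the mean at each measurement time traces the change over time, and the spread among developers at one time reflects observational differences. The function point counts below are invented for illustration, and the function name `summarize` is hypothetical.

```python
from statistics import mean, stdev


def summarize(counts_by_time):
    """Return (mean, standard deviation) of the estimates at each time.

    The means, read in order, show the variation over time. The standard
    deviation at a single time shows how much the developers disagree.
    """
    return {
        time: (mean(values), stdev(values))
        for time, values in counts_by_time.items()
    }


# Hypothetical function point counts from three developers at each stage.
counts = {
    "specification": [100, 120, 110],
    "design": [130, 125, 135],
    "code": [150, 148, 152],
}
summary = summarize(counts)
for time, (average, spread) in summary.items():
    print(f"{time}: mean={average}, spread among developers={spread:.1f}")
```
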
Planning case studies
• Every case study requires conception, design, preparation, execution, analysis, dissemination, and decision making.
• A case study compares one situation with another.
• For example, it compares the results of one method or tool with those of another.
• To avoid bias and to be sure that the hypothesis is tested properly, the case study is organized in one of three ways: sister projects, baseline, or random selection.
Sister projects
• Sister projects are projects that are similar in terms of application domain, implementation language, specification language, specification technique, and design method.

Baseline project
• If there are no similar (sister) projects, the new project can be compared with a baseline.
• Here the organization collects data from various projects, regardless of how different one project is from another.
• The baseline is like the average situation of the projects in the organization.
Random selection
• Sometimes it is possible to partition a project into parts and assign each part either to the new technique or to the older one.
• This kind of case study is similar to a formal experiment, because you take advantage of randomization and replication in performing the analysis.
• This type of case study is useful in situations where the method being studied can take on a variety of values.

THANK YOU

QUERIES
