Chapter 1
Data Structure
A data structure is an organization and representation of data.
Representation: data can be stored in various ways according to its type, e.g., signed, unsigned, etc.
Example: integer representation in memory
Properties of Data Structure
Efficient utilization of the storage medium
Efficient algorithms for:
creation
manipulation (insertion/deletion)
data retrieval (find)
Abstract Data Types
An ADT consists of an abstract data structure together with the operations defined on it.
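For instance, a stack of integers can be specified as an ADT: the operations (push, pop, empty) are public, while the underlying representation stays hidden and could be replaced without affecting users. The following C++ sketch is only an illustration of this idea; the class name IntStack and its fixed capacity are assumptions, not part of the course material.

#include <stdexcept>

class IntStack {                 // ADT: last-in, first-out collection of ints
public:
    void push(int value) {       // operation: add an element on top
        if (top_ == MAX) throw std::overflow_error("stack full");
        data_[top_++] = value;
    }
    int pop() {                  // operation: remove and return the top element
        if (top_ == 0) throw std::underflow_error("stack empty");
        return data_[--top_];
    }
    bool empty() const { return top_ == 0; }
private:
    static const int MAX = 100;  // fixed capacity, chosen arbitrarily for this sketch
    int data_[MAX];              // hidden representation (an array here; a linked list would also do)
    int top_ = 0;
};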
Abstraction
Abstraction is the process of classifying characteristics as relevant or irrelevant for the particular purpose at hand, and ignoring the irrelevant ones.
Applying abstraction correctly is the essence of
successful programming.
How do data structures model the world or some part of
the world?
The value held by a data structure represents some specific
characteristic of the world
The characteristic being modeled restricts the possible
values held by a data structure
The characteristic being modeled restricts the possible
operations to be performed on the data structure.
Note: Notice the relation between characteristic, value,
and data structures
Classification of Data Structure
Data structures are broadly divided into two categories:
1. Primitive data structures: These are the basic data structures that are directly operated upon by machine instructions, i.e., at the machine level. They include integers, floating-point numbers, characters, string constants, pointers, etc. These primitive data structures are the basis for the discussion of more sophisticated (non-primitive) data structures.
2. Non-primitive data structures: These are more sophisticated data structures that emphasize structuring a group of homogeneous (same type) or heterogeneous (different type) data items. Arrays, lists, files, linked lists, trees, and graphs fall in this category.
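As a brief C++ illustration of this split (the variable names are made up for the example), primitives are single machine-level values, while non-primitives group items together:

int count = 42;          // primitive: integer
double ratio = 0.75;     // primitive: floating-point number
char grade = 'A';        // primitive: character
int* p = &count;         // primitive: pointer

int scores[5] = {90, 85, 70, 95, 60};   // non-primitive: array of homogeneous items
struct Student {                        // non-primitive: record of heterogeneous items
    char name[20];
    int id;
    double gpa;
};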
Classification of Data Structure
[Figure: classification diagram not reproduced]
Algorithm
A program is written in order to solve a problem.
A solution to a problem actually consists of two
things:
A way to organize the data
Sequence of steps to solve the problem
The way data are organized in a computer
memory is said to be Data Structure and the
sequence of computational steps to solve a
problem is said to be an algorithm.
Therefore, an algorithm is a finite, clearly specified sequence of instructions to be followed to solve a problem.
A program is nothing but data structures plus algorithms.
Cont…
Algorithm is also a well-defined computational
procedure that takes some value or a set of values
as input and produces some value or a set of values
as output.
Data structures model the static part of the world.
They are unchanging while the world is changing.
In order to model the dynamic part of the world we
need to work with algorithms. Algorithms are the
dynamic part of a program’s world model.
An algorithm transforms data structures from one
state to another state in two ways:
An algorithm may change the value held by a data
structure
An algorithm may change the data structure itself
Properties of an algorithm
Finiteness: Algorithm must complete after a finite
number of steps.
Definiteness: Each step must be clearly defined, having
one and only one interpretation. At each point in
computation, one should be able to tell exactly what
happens next.
Sequence: Each step must have a unique defined
preceding and succeeding step. The first step (start step)
and last step (halt step) must be clearly noted.
Feasibility: It must be possible to perform each
instruction.
Correctness: It must compute the correct answer for all possible legal inputs.
Language Independence: It must not depend on any
one programming language.
Cont…
Completeness: It must solve the problem
completely.
Effectiveness: It must be possible to perform each
step exactly and in a finite amount of time.
Efficiency: It must solve the problem with the least amount of computational resources such as time and space.
Generality: Algorithm should be valid on all possible
inputs.
Input/Output: There must be a specified number of
input values, and one or more result values.
Analysis of Algorithm
Analysis investigates
What are the properties of the algorithm?
in terms of time and space
How good is the algorithm?
according to the properties
How does it compare with others?
not always exact
Algorithm Analysis Concepts
Algorithm analysis refers to the process of determining the
amount of computing time and storage space required by
different algorithms.
In other words, it’s a process of predicting the resource
requirement of algorithms in a given environment.
In order to solve a problem, there are many possible algorithms.
One has to be able to choose the best algorithm for the problem
at hand using some scientific method.
To classify some data structures and algorithms as good, we need precise ways of analysing them in terms of resource requirements:
Running Time
Memory Usage
Communication Bandwidth
Running time is usually treated as the most important since
computational time is the most precious resource in most
problem domains.
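Analysis predicts running time; measurement confirms it. As a minimal sketch (not from the slides), the wall-clock time of a code fragment can be measured in C++ with the standard <chrono> library:

#include <chrono>
#include <iostream>

int main() {
    long long sum = 0;
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < 1000000; i++)       // the work being measured
        sum += i;
    auto stop = std::chrono::steady_clock::now();
    auto us = std::chrono::duration_cast<std::chrono::microseconds>(stop - start);
    std::cout << "sum = " << sum << ", elapsed = " << us.count() << " microseconds\n";
    return 0;
}

Such measurements depend on the machine and compiler, which is exactly why the platform-independent analysis that follows is preferred for comparing algorithms.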
Complexity Analysis
Complexity Analysis is the systematic study of
the cost of computation, measured either in time
units or in operations performed, or in the
amount of storage space required.
The goal is to have a meaningful measure that
permits comparison of algorithms independent
of operating platform.
There are two things to consider:
Time Complexity: Determine the approximate
number of operations required to solve a
problem of size n.
Space Complexity: Determine the approximate
memory required to solve a problem of size n.
Analysis Rules
1. We assume an arbitrary time unit.
2. Execution of one of the following operations takes time 1:
Assignment Operation
Single Input/Output Operation
Single Boolean Operations
Single Arithmetic Operations
Function Return
3. Running time of a selection statement (if, switch) is the time for the
condition evaluation + the maximum of the running times for the
individual clauses in the selection.
4. Loops: Running time for a loop is equal to the running time for the
statements inside the loop * number of iterations.
The total running time of a statement inside a group of nested loops is the running time of the statement multiplied by the product of the sizes of all the loops.
For nested loops, analyse inside out.
Always assume that the loop executes the maximum number of iterations
possible.
5. Running time of a function call is 1 for setup + the time for any parameter
calculations + the time required for the execution of the function body.
Cont…
Example
Calculate T(n) for the following:
1.
k = 0;
cout << "enter an integer";
cin >> n;
for (i = 0; i < n; i++)
    k++;

T(n) = 1 + 1 + 1 + (1 + (n+1) + n + n)
     = 3n + 5
Cont…
2.
i = 0;
while (i < n)
{
    x++;
    i++;
}
j = 1;
while (j <= 10)
{
    x++;
    j++;
}

T(n) = 1 + (n+1) + n + n + 1 + 11 + 10 + 10
     = 3n + 34
Cont…
3.
for (i = 1; i <= n; i++)
    for (j = 1; j <= n; j++)
        k++;

T(n) = 1 + (n+1) + n + n·(1 + (n+1) + n + n) = 3n² + 4n + 2
Cont…
4.
sum = 0;
if (test == 1)
{
    for (i = 1; i <= n; i++)
        sum = sum + i;
}
else
{
    cout << sum;
}

T(n) = 1 + 1 + Max(1 + (n+1) + n + n + n, 1) = 4n + 4
Cont…
Loop Incrementation Other than 1
for (int i = 0; i < n; i += c)
    statement(s);
Adding a constant c to the loop counter means that the loop runtime grows linearly with its maximum value n.
The loop executes its body approximately n/c times.
for (int i = 1; i < n; i *= c)
    statement(s);
Multiplying the loop counter means that the maximum value n must grow exponentially to linearly increase the loop runtime; therefore, the runtime is logarithmic. (Note the counter must start at 1 rather than 0, otherwise it would never grow.)
The loop executes its body approximately log_c(n) times.
for (int i = 0; i < n * n; i += c)
    statement(s);
The loop maximum is n², so the runtime is quadratic.
The loop executes its body approximately n²/c times.
A short sketch verifying these counts follows.
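Below is a small C++ check (an illustration, not part of the slides; the values of n and c are arbitrary) that counts the iterations of the additive and multiplicative loops and compares them with n/c and log_c(n):

#include <iostream>

int main() {
    const int n = 1000, c = 3;

    int additive = 0;
    for (int i = 0; i < n; i += c)   // counter grows by c each time
        additive++;

    int multiplicative = 0;
    for (int i = 1; i < n; i *= c)   // counter grows by a factor of c each time
        multiplicative++;

    std::cout << "i += c ran " << additive << " times (n/c is about " << n / c << ")\n";
    std::cout << "i *= c ran " << multiplicative << " times (log_c(n) is about 6.3)\n";
    return 0;
}

With n = 1000 and c = 3, the additive loop runs 334 times and the multiplicative loop 7 times, matching the stated counts up to rounding.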
Exercise
Calculate T(n) for the following codes
A.
sum = 0;
for (i = 1; i <= n; i++)
    for (j = 1; j <= m; j++)
        sum++;
B.
sum = 0;
for (i = 1; i <= n; i++)
    for (j = 1; j <= i; j++)
        sum++;
Cont…
Suppose we count only the additions that are done. With one addition per iteration of a single loop over N elements, there are N additions in total.
Nested Loops: Formally
Nested for loops translate into multiple
summations, one for each for loop.
for (int i = 1; i <= N; i++) {
    for (int j = 1; j <= M; j++) {
        sum = sum + i + j;
    }
}

Σ_{i=1}^{N} Σ_{j=1}^{M} 2 = Σ_{i=1}^{N} 2M = 2MN
Cont…
Again, count the number of additions. The
outer summation is for the outer for loop.
Consecutive Statements: Formally
Add the running times of the separate blocks of your code.

for (int i = 1; i <= N; i++) {
    sum = sum + i;
}
for (int i = 1; i <= N; i++) {
    for (int j = 1; j <= N; j++) {
        sum = sum + i + j;
    }
}

Σ_{i=1}^{N} 1 + Σ_{i=1}^{N} Σ_{j=1}^{N} 2 = N + 2N²
Cont…
Conditionals: Formally
If (test) s1 else s2: Compute the maximum of
the running time for s1 and s2.
if (test == 1) {
    for (int i = 1; i <= N; i++) {
        sum = sum + i;
    }
} else {
    for (int i = 1; i <= N; i++) {
        for (int j = 1; j <= N; j++) {
            sum = sum + i + j;
        }
    }
}

max( Σ_{i=1}^{N} 1, Σ_{i=1}^{N} Σ_{j=1}^{N} 2 ) = max(N, 2N²) = 2N²
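These formal counts can be checked by instrumenting the code. The sketch below (illustrative; N, M, and the counter variable are assumptions) counts the additions performed by the nested-loop pattern and compares the total with 2MN:

#include <iostream>

int main() {
    const int N = 10, M = 5;
    long sum = 0;
    long additions = 0;

    for (int i = 1; i <= N; i++)
        for (int j = 1; j <= M; j++) {
            sum = sum + i + j;   // two additions per inner iteration
            additions += 2;
        }

    std::cout << "counted " << additions << " additions, formula 2MN = " << 2 * M * N << "\n";
    return 0;
}

Both the counter and the formula give 100 additions for N = 10 and M = 5.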
Measures of Time
In order to determine the running time of an
algorithm it is possible to define three functions
Tbest(n), Tavg(n) and Tworst(n) as the best, the
average and the worst case running time of the
algorithm respectively.
Average Case (Tavg): The amount of time the
algorithm takes on an "average" set of inputs.
Worst Case (Tworst): The amount of time the
algorithm takes on the worst possible set of
inputs.
Best Case (Tbest): The amount of time the
algorithm takes on the smallest possible set of
inputs.
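A classic illustration of the three cases (not taken from the slides) is linear search in C++: the target may be found immediately, at the very end, or anywhere in between.

// Returns the index of target in a[0..n-1], or -1 if absent.
int linearSearch(const int a[], int n, int target) {
    // Best case (Tbest):    target is a[0]             -> 1 comparison
    // Worst case (Tworst):  target absent or at a[n-1] -> n comparisons
    // Average case (Tavg):  target equally likely anywhere -> about n/2 comparisons
    for (int i = 0; i < n; i++)
        if (a[i] == target)
            return i;
    return -1;
}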
We are interested in the worst-case time, since it gives a guaranteed upper bound on the running time for any input.
Asymptotic Analysis
Asymptotic analysis is concerned with how the
running time of an algorithm increases with the
size of the input in the limit, as the size of the
input increases without bound.
There are five notations used to describe a
running time function. These are:
Big-Oh Notation (O)
Big-Omega Notation (Ω)
Theta Notation (Θ)
Little-o Notation (o)
Little-Omega Notation (ω)
The Big-Oh Notation
Big-Oh notation is a way of comparing algorithms and is used for computing the complexity of algorithms, i.e., the amount of time that it takes for a computer program to run.
It is only concerned with what happens for very large values of n.
Therefore only the largest term in the expression (function) is needed.
For example, if the number of operations in an algorithm is n² − n, then n is insignificant compared to n² for large values of n, hence the n term is ignored. Of course, for small values of n, it may be important. However, Big-Oh is mainly concerned with large values of n.
Cont…
We use O-notation to give an upper bound on a function,
to within a constant factor.
Since O-notation describes an upper bound, when we use
it to bound the worst-case running time of an algorithm, by
implication we also bound the running time of the
algorithm on arbitrary inputs as well.
The following points are facts that you can use for Big-Oh problems:
1 <= n for all n >= 1
n <= n² for all n >= 1
2ⁿ <= n! for all n >= 4
log₂n <= n for all n >= 2
n <= n·log₂n for all n >= 2
Cont…
1. f(n) = 10n + 5 and g(n) = n. Show that f(n) is O(g(n)).
To show that f(n) is O(g(n)) we must show that there exist constants c and k such that
f(n) <= c·g(n) for all n >= k,
i.e., 10n + 5 <= c·n for all n >= k.
Try c = 15. Then we need to show that 10n + 5 <= 15n.
Solving for n we get: 5 <= 5n, or 1 <= n.
So f(n) = 10n + 5 <= 15·g(n) for all n >= 1 (c = 15, k = 1).
Cont…
2. f(n) = 3n² + 4n + 1. Show that f(n) = O(n²).
4n <= 4n² for all n >= 1, and 1 <= n² for all n >= 1, so
3n² + 4n + 1 <= 3n² + 4n² + n² for all n >= 1
3n² + 4n + 1 <= 8n² for all n >= 1
So we have shown that f(n) <= 8n² for all n >= 1.
Therefore, f(n) is O(n²) (c = 8, k = 1).
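The constants found in the two proofs can be spot-checked numerically. The short C++ program below (an illustration, not a proof; the loop bounds are arbitrary) verifies f(n) <= c·g(n) at several values of n:

#include <iostream>

int main() {
    // Example 1: 10n + 5 <= 15n        for n >= 1 (c = 15, k = 1)
    // Example 2: 3n^2 + 4n + 1 <= 8n^2 for n >= 1 (c = 8,  k = 1)
    for (long long n = 1; n <= 1000000; n *= 10) {
        long long f1 = 10 * n + 5,            cg1 = 15 * n;
        long long f2 = 3 * n * n + 4 * n + 1, cg2 = 8 * n * n;
        std::cout << "n = " << n
                  << "  example 1 holds: " << (f1 <= cg1)
                  << "  example 2 holds: " << (f2 <= cg2) << "\n";
    }
    return 0;
}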
Cont…
Diagrammatically, the function f(n) and its Big-Oh bound g(n) can be drawn on the x-y plane (the y coordinate is time, the x coordinate is input size).
[Figure: f(n) bounded above by c·g(n) for n >= k, not reproduced]
Generally, f(n) = O(g(n)) means that the growth rate of f(n) is less than or equal to that of g(n).
Cont…
Typical Orders
Here is a table of some typical cases. It uses logarithms to base 2, but these are simply proportional to logarithms in other bases.

N     O(1)   O(log₂n)   O(n)   O(n·log₂n)   O(n²)   O(n³)
1     1      0          1      0            1       1
2     1      1          2      2            4       8
4     1      2          4      8            16      64
8     1      3          8      24           64      512
16    1      4          16     64           256     4,096
Cont…
Demonstrating that a function f(n) is big-O of a
function g(n) requires that we find specific
constants c and k for which the inequality holds
(and show that the inequality does in fact hold).
Big-O expresses an upper bound on the growth
rate of a function, for sufficiently large values of
n.
An upper bound is the best algorithmic solution
that has been found for a problem. “What is the
best that we know we can do?”
Running times of different algorithms on a 1 GHz (10⁹ clock cycles per second) computer, assuming one operation per clock cycle, are indicated in the following table.
[Table: running times for various growth rates, not reproduced]
Big-O Theorems
For all the following theorems, assume that f(n) is a
function of n and that k is an arbitrary constant.
Theorem 1: k is O(1)
Theorem 2: A polynomial is O(the term containing the highest power of n). A polynomial's growth rate is determined by its leading term: if f(n) is a polynomial of degree d, then f(n) is O(n^d).
In general, f(n) is big-O of the dominant term of f(n).
Theorem 3: k·f(n) is O(f(n))
Constant factors may be ignored.
E.g., f(n) = 7n⁴ + 3n² + 5n + 1000 is O(n⁴)
Theorem 4 (Transitivity): If f(n) is O(g(n)) and g(n) is O(h(n)), then f(n) is O(h(n)).
Cont…
Theorem 5: For any base b, log_b(n) is O(log n). All logarithms grow at the same rate: log_b(n) is O(log_d(n)) for all b, d > 1.
Theorem 6: Each of the following functions is big-O of its successors:
k (a constant)
log_b(n)
n
n·log_b(n)
n²
n to higher powers
2ⁿ
3ⁿ
Cont…
larger constants to the nth power
n!
nⁿ
Properties of the O Notation
Higher powers grow faster:
n^r = O(n^s) if 0 <= r <= s
The fastest-growing term dominates a sum:
If f(n) is O(g(n)), then f(n) + g(n) is O(g(n))
E.g., 5n⁴ + 6n³ = O(n⁴)
Exponential functions grow faster than powers:
e.g., n^k is O(bⁿ) for every b > 1 and fixed k
Big-Omega Notation
Just as O-notation provides an asymptotic upper bound on a function, Ω-notation provides an asymptotic lower bound.
Formal Definition: A function f(n) = Ω(g(n)) if there exist constants c, k ∈ ℝ⁺ such that
f(n) >= c·g(n) for all n >= k.
f(n) = Ω(g(n)) means that f(n) is greater than or equal to some constant multiple of g(n) for all values of n greater than or equal to some k.
Example: If f(n) = n², then f(n) = Ω(n).
In simple terms, f(n) = Ω(g(n)) means that the growth rate of f(n) is greater than or equal to that of g(n).
Theta Notation
A function f(n) belongs to the set Θ(g(n)) if there exist positive constants c₁ and c₂ such that it can be sandwiched between c₁·g(n) and c₂·g(n) for sufficiently large values of n.
Formal Definition: A function f(n) is Θ(g(n)) if it is both O(g(n)) and Ω(g(n)). In other words, there exist constants c₁, c₂, and k > 0 such that
c₁·g(n) <= f(n) <= c₂·g(n) for all n >= k.
If f(n) = Θ(g(n)), then g(n) is an asymptotically tight bound for f(n).
In simple terms, f(n) = Θ(g(n)) means that f(n) and g(n) have the same rate of growth.
Cont…
Example:
1. If f(n) = 2n + 1, then f(n) = Θ(n); for instance, 2n <= 2n + 1 <= 3n for all n >= 1 (c₁ = 2, c₂ = 3, k = 1).
2. If f(n) = 2n², then
f(n) = O(n⁴)
f(n) = O(n³)
f(n) = O(n²)
All three are correct, but only O(n²) is asymptotically tight; hence f(n) = Θ(n²).
Relational Properties of the Asymptotic Notations
Transitivity
if f(n) = Θ(g(n)) and g(n) = Θ(h(n)) then f(n) = Θ(h(n)),
if f(n) = O(g(n)) and g(n) = O(h(n)) then f(n) = O(h(n)),
if f(n) = Ω(g(n)) and g(n) = Ω(h(n)) then f(n) = Ω(h(n)),
if f(n) = o(g(n)) and g(n) = o(h(n)) then f(n) = o(h(n)), and
if f(n) = ω(g(n)) and g(n) = ω(h(n)) then f(n) = ω(h(n)).
Symmetry
f(n) = Θ(g(n)) if and only if g(n) = Θ(f(n)).
Cont…
Transpose symmetry
f(n) = O(g(n)) if and only if g(n) = Ω(f(n)),
f(n) = o(g(n)) if and only if g(n) = ω(f(n)).
Reflexivity
f(n) = Θ(f(n)),
f(n) = O(f(n)),
f(n) = Ω(f(n)).
Cont…
Growth Rates

Function    Name
c           Constant
log n       Logarithmic
log²n       Log-squared
n           Linear
n log n     Linear-log
n²          Quadratic
n³          Cubic
2ⁿ          Exponential
n!          Factorial
nⁿ          n to the n (grows faster than n!)

Fig: algorithm growth-rate comparison (figure not reproduced)
Any Questions?