
Chapter One

Introduction to Data Structures and Algorithm Analysis

Course Title: Data Structures and Algorithms
Instructor: Feyisa K.
Outline
Definition of algorithm and data structure
Properties of algorithms
Analysis of algorithms
Complexity of algorithms

2
Data Structure
 Data structure is the organization and
representation of data
 Representation: data can be stored in
various ways according to its type
 signed, unsigned, etc.
 example: integer representation in memory

 Organization: the way data are stored
changes with the organization
 ordered,
 unordered,
 tree
 example: if you have more than one integer

3
Properties of Data Structure
 Efficient utilization of the storage medium
 Efficient algorithms for
 creation
 manipulation (insertion/deletion)
 data retrieval (find)

 A well-designed data structure uses few
resources:
 execution time
 memory space

4
Abstract Data Types
 An ADT consists of an abstract data structure and operations.

 The ADT specifies:

1. What can be stored in the Abstract Data Type


2. What operations can be done on/by the Abstract Data Type.
For example, if we are going to model employees of an organization:
 This ADT stores employees with their relevant attributes
(name, salary, hire date, …) and discards irrelevant ones.
 This ADT supports hiring, firing, retiring, … operations.

5
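The employee example above can be sketched in C++ as a class that hides its representation behind hire/fire/count operations. The class name, fields, and vector-based storage below are illustrative assumptions, not part of the chapter:

```cpp
#include <cassert>
#include <string>
#include <vector>
#include <algorithm>

// Hypothetical Employee ADT: stores only the attributes relevant
// to our purpose (name, salary) and exposes operations on them.
struct Employee {
    std::string name;
    double salary;
};

class EmployeeRegistry {
public:
    void hire(const std::string& name, double salary) {
        staff_.push_back({name, salary});
    }
    void fire(const std::string& name) {
        staff_.erase(std::remove_if(staff_.begin(), staff_.end(),
            [&](const Employee& e) { return e.name == name; }),
            staff_.end());
    }
    std::size_t count() const { return staff_.size(); }
private:
    std::vector<Employee> staff_;  // representation hidden from callers
};
```

Callers interact only through the operations; the underlying std::vector could be swapped for another structure without changing the interface, which is exactly the point of an ADT.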
Abstraction
 Abstraction is a process of classifying characteristics as
relevant and irrelevant for the particular purpose at hand
and ignoring the irrelevant ones.
 Applying abstraction correctly is the essence of
successful programming.
 How do data structures model the world or some part of
the world?
 The value held by a data structure represents some specific
characteristic of the world
 The characteristic being modeled restricts the possible
values held by a data structure
 The characteristic being modeled restricts the possible
operations to be performed on the data structure.
 Note: Notice the relation between characteristic, value,
and data structures
6
Classification of Data Structure
 Data structures are broadly divided into two:
1. Primitive data structures: These are the basic
data structures, operated on directly by
machine instructions.
They include integers, floating-point numbers, characters,
string constants, pointers, etc.
 These primitive data structures are the basis for the
discussion of more sophisticated (non-primitive) data
structures.
2. Non-primitive data structures: These are more
sophisticated data structures that emphasize
structuring a group of homogeneous (same type) or
heterogeneous (different type) data items. Arrays, lists,
files, linked lists, trees and graphs fall in this category.

7
Classification of Data
Structure

8
Algorithm
 A program is written in order to solve a problem.
A solution to a problem actually consists of two
things:
 A way to organize the data
 Sequence of steps to solve the problem
 The way data are organized in computer
memory is called a data structure, and the
sequence of computational steps to solve a
problem is called an algorithm.
 Therefore, an algorithm is a finite, clearly specified
sequence of instructions to be followed to solve a
problem.
 A program is nothing but data structures plus
algorithms.

9
Cont…
 Algorithm is also a well-defined computational
procedure that takes some value or a set of values
as input and produces some value or a set of values
as output.
 Data structures model the static part of the world.
They are unchanging while the world is changing.
 In order to model the dynamic part of the world we
need to work with algorithms. Algorithms are the
dynamic part of a program’s world model.
 An algorithm transforms data structures from one
state to another state in two ways:
 An algorithm may change the value held by a data
structure
 An algorithm may change the data structure itself

10
Properties of an algorithm
 Finiteness: Algorithm must complete after a finite
number of steps.
 Definiteness: Each step must be clearly defined, having
one and only one interpretation. At each point in
computation, one should be able to tell exactly what
happens next.
 Sequence: Each step must have a unique defined
preceding and succeeding step. The first step (start step)
and last step (halt step) must be clearly noted.
 Feasibility: It must be possible to perform each
instruction.
 Correctness: It must compute correct answer for all
possible legal inputs.
 Language Independence: It must not depend on any
one programming language.
11
Cont…
 Completeness: It must solve the problem
completely.
 Effectiveness: It must be possible to perform each
step exactly and in a finite amount of time.
 Efficiency: It must solve with the least amount of
computational resources such as time and space.
 Generality: Algorithm should be valid on all possible
inputs.
 Input/Output: There must be a specified number of
input values, and one or more result values.

12
Analysis of Algorithm
Analysis investigates:
 What are the properties of the algorithm?
in terms of time and space
 How good is the algorithm?
according to the properties
 How does it compare with others?
not always exact

13
Algorithm Analysis Concepts
 Algorithm analysis refers to the process of determining the
amount of computing time and storage space required by
different algorithms.
 In other words, it’s a process of predicting the resource
requirement of algorithms in a given environment.
 In order to solve a problem, there are many possible algorithms.
One has to be able to choose the best algorithm for the problem
at hand using some scientific method.
 To classify some data structures and algorithms as good, we
need precise ways of analysing them in terms of resource
requirement.
 Running Time
 Memory Usage
 Communication Bandwidth
 Running time is usually treated as the most important since
computational time is the most precious resource in most
problem domains.
14
Complexity Analysis
 Complexity Analysis is the systematic study of
the cost of computation, measured either in time
units or in operations performed, or in the
amount of storage space required.
 The goal is to have a meaningful measure that
permits comparison of algorithms independent
of operating platform.
 There are two things to consider:
 Time Complexity: Determine the approximate
number of operations required to solve a
problem of size n.
 Space Complexity: Determine the approximate
memory required to solve a problem of size n.
15
Analysis Rules
1. We assume an arbitrary time unit.
2. Execution of one of the following operations takes time 1:
 Assignment Operation
 Single Input/Output Operation
 Single Boolean Operations
 Single Arithmetic Operations
 Function Return
3. Running time of a selection statement (if, switch) is the time for the
condition evaluation + the maximum of the running times for the
individual clauses in the selection.
4. Loops: Running time for a loop is equal to the running time for the
statements inside the loop * number of iterations.
 The total running time of a statement inside a group of nested loops is the
running time of the statements multiplied by the product of the sizes of all
the loops.
 For nested loops, analyse inside out.
 Always assume that the loop executes the maximum number of iterations
possible.
5. Running time of a function call is 1 for setup + the time for any parameter
calculations + the time required for the execution of the function body.
16
Cont…
Example
Calculate T(n) for the following code fragments.
1. k = 0;
cout << "enter an integer";
cin >> n;
for (i = 0; i < n; i++)
k++;
• T(n) = 1+1+1+(1+(n+1)+n+n)
= 3n+5

17
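The count for example 1 can be checked mechanically. The sketch below (function name is ours) tallies one unit per operation exactly as the analysis rules prescribe:

```cpp
#include <cassert>

// Time units for example 1: k=0; cout<<...; cin>>n; for(i=0;i<n;i++) k++;
long T1(long n) {
    long t = 0;
    t += 3;        // k=0, one output, one input: 1 unit each
    t += 1;        // loop initialization i=0
    t += n + 1;    // condition i<n, evaluated n+1 times
    t += n;        // body k++, n times
    t += n;        // increment i++, n times
    return t;      // = 3n + 5
}
```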
Cont…
• 2. i = 0;
while (i < n)
{
x++;
i++;
}
j = 1;
while (j <= 10)
{
x++;
j++;
}
• T(n) = 1+(n+1)+n+n + 1+11+10+10
= 3n+34
18
Cont…
3. for (i = 1; i <= n; i++)
for (j = 1; j <= n; j++)
k++;
T(n) = 1+(n+1)+n + n(1+(n+1)+n+n) = 3n²+4n+2

19
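The nested-loop count in example 3 can be verified the same way; the inner loop's full cost (3n+2 units) is paid once per outer iteration. The function name below is ours:

```cpp
#include <cassert>

// Time units for example 3: for(i=1;i<=n;i++) for(j=1;j<=n;j++) k++;
long T3(long n) {
    long t = 0;
    t += 1;                          // i=1
    t += n + 1;                      // outer condition, n+1 times
    t += n;                          // i++, n times
    t += n * (1 + (n + 1) + n + n);  // inner loop (3n+2 units), n times
    return t;                        // = 3n^2 + 4n + 2
}
```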
Cont…
4. sum = 0;
if (test == 1)
{
for (i = 1; i <= n; i++)
sum = sum + i;
}
else
{
cout << sum;
}
• T(n) = 1+1+max(1+(n+1)+n+2n, 1) = 4n+4

20
Cont…
 Loop Increments Other than 1
for (int i = 0; i< n; i += c)
statement(s);
 Adding to the loop counter means that the loop runtime grows
linearly when compared to its maximum value n.
 The loop executes its body about n/c (exactly ⌈n/c⌉) times.
for (int i = 0; i< n; i *= c)
statement(s);
 Multiplying the loop counter means that the maximum value n
must grow exponentially to linearly increase the loop runtime;
therefore, the runtime is logarithmic.
 The loop executes its body about log_c n times.
for (int i = 0; i< n * n; i += c)
statement(s);
 The loop bound is n², so the runtime is quadratic.
 The loop executes its body about n²/c times.
21
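The three iteration counts above can be confirmed empirically. Note that the multiplicative loop must start at i = 1, since 0 *= c would never advance; the helper names below are ours:

```cpp
#include <cassert>

// Count iterations of: for (int i = 0; i < n; i += c)
long additiveIters(long n, long c) {
    long count = 0;
    for (long i = 0; i < n; i += c) count++;
    return count;           // about n/c (exactly ceil(n/c))
}

// Count iterations of: for (int i = 1; i < n; i *= c)
long multiplicativeIters(long n, long c) {
    long count = 0;
    for (long i = 1; i < n; i *= c) count++;
    return count;           // about log_c(n)
}
```

The quadratic case `i < n*n; i += c` is just the additive count with bound n², i.e. `additiveIters(n*n, c)`.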
Exercise
Calculate T(n) for the following code fragments.
A. sum = 0;
for (i = 1; i <= n; i++)
for (j = 1; j <= m; j++)
sum++;
B. sum = 0;
for (i = 1; i <= n; i++)
for (j = 1; j <= i; j++)
sum++;

C. int doubler(int n)
{
int res = 0;
for (int i = 1; i <= n; i += 2)
res = res + i;
return res;
}
22
Formal Approach to Analysis
 In the above examples we have seen that
analysis is a bit complex.
 However, it can be simplified by using
a formal approach in which case we
can ignore initializations, loop control, and
loop keeping.
 for Loops: Formally
 In general, a for loop translates to a
summation. The index and bounds of the
summation are the same as the index and
bounds of the for loop.

for (int i = 1; i <= N; i++) {
    sum = sum + i;
}

translates to  ∑ (i=1 to N) 1  =  N
23
Cont…
 Suppose we count the number of additions
that are done. There is 1 addition per iteration
of the loop, hence N additions in total.
 Nested Loops: Formally
 Nested for loops translate into multiple
summations, one for each for loop.

for (int i = 1; i <= N; i++) {
    for (int j = 1; j <= M; j++) {
        sum = sum + i + j;
    }
}

translates to  ∑ (i=1 to N) ∑ (j=1 to M) 2  =  ∑ (i=1 to N) 2M  =  2MN

24
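The double summation above can be checked by instrumenting the loop: counting the two additions in `sum = sum + i + j` per inner iteration gives exactly 2MN. The function name is ours:

```cpp
#include <cassert>

// Count the additions performed by the nested loop above,
// matching the double summation: sum over i=1..N, j=1..M of 2 = 2MN.
long countAdditions(long N, long M) {
    long sum = 0, additions = 0;
    for (long i = 1; i <= N; i++) {
        for (long j = 1; j <= M; j++) {
            sum = sum + i + j;
            additions += 2;   // two '+' operations in the statement
        }
    }
    return additions;
}
```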
Cont…
 Again, count the number of additions. The
outer summation is for the outer for loop.
 Consecutive Statements: Formally
 Add the running times of the separate blocks
of your code.

for (int i = 1; i <= N; i++) {
    sum = sum + i;
}
for (int i = 1; i <= N; i++) {
    for (int j = 1; j <= N; j++) {
        sum = sum + i + j;
    }
}

translates to  ( ∑ (i=1 to N) 1 ) + ( ∑ (i=1 to N) ∑ (j=1 to N) 2 )  =  N + 2N²

25
Cont…
 Conditionals: Formally
 If (test) s1 else s2: Compute the maximum of
the running time for s1 and s2.

if (test == 1) {
    for (int i = 1; i <= N; i++) {
        sum = sum + i;
    }
}
else {
    for (int i = 1; i <= N; i++) {
        for (int j = 1; j <= N; j++) {
            sum = sum + i + j;
        }
    }
}

translates to  max( ∑ (i=1 to N) 1 ,  ∑ (i=1 to N) ∑ (j=1 to N) 2 )  =  max(N, 2N²)  =  2N²

26
Measures of Time
 In order to determine the running time of an
algorithm it is possible to define three functions
Tbest(n), Tavg(n) and Tworst(n) as the best, the
average and the worst case running time of the
algorithm respectively.
 Average Case (Tavg): The amount of time the
algorithm takes on an "average" set of inputs.
 Worst Case (Tworst): The amount of time the
algorithm takes on the worst possible set of
inputs.
 Best Case (Tbest): The amount of time the
algorithm takes on the best possible set of
inputs.
27
We are interested in the worst-case time, since it provides a guaranteed upper bound on the running time for any input.
Asymptotic Analysis
 Asymptotic analysis is concerned with how the
running time of an algorithm increases with the
size of the input in the limit, as the size of the
input increases without bound.
 There are five notations used to describe a
running time function. These are:
 Big-Oh Notation (O)
 Big-Omega Notation (Ω)
 Theta Notation (Θ)
 Little-o Notation (o)
 Little-Omega Notation (ω)

28
The Big-Oh Notation
 Big-Oh notation is a way of comparing
algorithms and is used for computing the
complexity of algorithms; i.e., the amount of
time that it takes for computer program to run.
 It is only concerned with what happens for
very large values of n.
 Therefore only the largest term in the
expression (function) is needed.
 For example, if the number of operations in an
algorithm is n2 – n, n is insignificant compared
to n2 for large values of n. Hence the n term is
ignored. Of course, for small values of n, it may
be important. However, Big-Oh is mainly
concerned with large values of n.
29
Cont…
 We use O-notation to give an upper bound on a function,
to within a constant factor.
 Since O-notation describes an upper bound, when we use
it to bound the worst-case running time of an algorithm, by
implication we also bound the running time of the
algorithm on arbitrary inputs as well.

 Formal Definition: f(n) = O(g(n)) if there exist constants
c, k ∊ ℛ+ such that f(n) ≤ c.g(n) for all n ≥ k.

 The following points are facts that you can use for Big-Oh
problems:
 1<=n for all n>=1
 n<=n2 for all n>=1
 2n<=n! for all n>=4
 log2n<=n for all n>=2
 n<=nlog2n for all n>=2
30
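The facts listed above can be spot-checked for concrete n. The sketch below (function name is ours) uses an integer floor(log2) so no floating point is involved:

```cpp
#include <cassert>

// Check the listed Big-Oh facts for a single value of n.
bool factsHold(long n) {
    bool ok = true;
    if (n >= 1) ok = ok && (1 <= n) && (n <= n * n);
    if (n >= 4) {                       // 2^n <= n! for all n >= 4
        long fact = 1, pow2 = 1;
        for (long i = 1; i <= n; i++) { fact *= i; pow2 *= 2; }
        ok = ok && (pow2 <= fact);
    }
    if (n >= 2) {                       // log2(n) <= n <= n*log2(n)
        long lg = 0;
        for (long v = n; v > 1; v /= 2) lg++;   // floor(log2(n))
        ok = ok && (lg <= n) && (n <= n * lg);
    }
    return ok;
}
```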
Cont…
1. f(n)=10n+5 and g(n)=n. Show that f(n) is
O(g(n)).
To show that f(n) is O(g(n)) we must find
constants c and k such that
f(n) <= c.g(n) for all n >= k
or 10n+5 <= c.n for all n >= k
Try c=15. Then we need to show that
10n+5 <= 15n
Solving for n we get: 5 <= 5n or 1 <= n.
So f(n) = 10n+5 <= 15.g(n) for all n >= 1
(c=15, k=1).

31
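The witness pair (c=15, k=1) from the proof above can be checked over a range of n; the inequality fails at n = 0, which is why k must be at least 1. The helper name is ours:

```cpp
#include <cassert>

// Verify f(n) = 10n + 5 <= 15n for all 1 <= n <= upTo.
bool bigOhWitnessHolds(long upTo) {
    for (long n = 1; n <= upTo; n++) {
        if (!(10 * n + 5 <= 15 * n)) return false;
    }
    return true;
}
```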
Cont…
2. f(n) = 3n²+4n+1. Show that f(n) = O(n²).
4n <= 4n² for all n>=1 and 1 <= n² for all n>=1
3n²+4n+1 <= 3n²+4n²+n² for all n>=1
3n²+4n+1 <= 8n² for all n>=1
So we have shown that f(n) <= 8n² for all n>=1
Therefore, f(n) is O(n²) (c=8, k=1)

32
Cont…
 Diagrammatically, the function f(n) and its big-Oh
g(n) can be drawn on the x-y plane (y
coordinate is time, x coordinate is input size)
as follows. Generally, f(n) = O(g(n))
means that the growth rate of f(n) is less than
or equal to that of g(n).

33
Cont…
Typical Orders
 Here is a table of some typical cases. This uses
logarithms to base 2, but these are simply
proportional to logarithms in other bases.

N      O(1)   O(log₂n)   O(n)    O(n log₂n)   O(n²)       O(n³)
1      1      0          1       0            1           1
2      1      1          2       2            4           8
4      1      2          4       8            16          64
8      1      3          8       24           64          512
16     1      4          16      64           256         4,096
1024   1      10         1,024   10,240       1,048,576   1,073,741,824

34
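Each column of the table can be recomputed from first principles. The small helpers below (names are ours) use an integer floor(log2), which is exact here because every N in the table is a power of two:

```cpp
#include <cassert>

// Integer floor(log2(n)) for the O(log2 n) column.
long lg2(long n) { long r = 0; while (n > 1) { n /= 2; r++; } return r; }

long nLogNCol(long n)  { return n * lg2(n); }   // O(n log2 n) column
long squareCol(long n) { return n * n; }        // O(n^2) column
long cubeCol(long n)   { return n * n * n; }    // O(n^3) column
```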
Cont…
 Demonstrating that a function f(n) is big-O of a
function g(n) requires that we find specific
constants c and k for which the inequality holds
(and show that the inequality does in fact hold).
 Big-O expresses an upper bound on the growth
rate of a function, for sufficiently large values of
n.
 An upper bound is the best algorithmic solution
that has been found for a problem. “What is the
best that we know we can do?”
Running times of different algorithms on a 1 GHz
(10⁹ clock cycles per second) computer are indicated in the
following table. The table assumes one operation per clock cycle.
35
Cont…
[Table: running times of different algorithms on a 1 GHz computer]
36
Big-O Theorems
 For all the following theorems, assume that f(n) is a
function of n and that k is an arbitrary constant.
Theorem 1: k is O(1)
Theorem 2: A polynomial is O(the term containing the
highest power of n). A polynomial's growth rate is
determined by the leading term. If f(n) is a polynomial
of degree d, then f(n) is O(n^d).
In general, f(n) is big-O of the dominant term of f(n).
Theorem 3: k*f(n) is O(f(n))
Constant factors may be ignored
E.g. f(n) =7n4+3n2+5n+1000 is O(n4)
Theorem 4(Transitivity): If f(n) is O(g(n))and g(n) is
O(h(n)), then f(n) is O(h(n))

37
Cont…
Theorem 5: For any base b, log_b(n) is O(log n). All logarithms
grow at the same rate: log_b n is O(log_d n) for all b, d > 1.
Theorem 6: Each of the following functions is big-O of its
successors:
 k
 log_b n
 n
 n log_b n
 n²
 n to higher powers
 2^n
 3^n
38
Cont…
 larger constants to the nth power
 n!
 n^n

f(n) = 3n log_b n + 4 log_b n + 2 is O(n log_b n) and O(n²) and O(2^n)

39
Properties of the O Notation
 Higher powers grow faster
n^r = O(n^s) if 0 <= r <= s
 Fastest growing term dominates a sum
If f(n) is O(g(n)), then f(n) + g(n) is O(g(n))
E.g. 5n⁴ + 6n³ = O(n⁴)
 Exponential functions grow faster than powers, i.e. n^k is
O(b^n) for b > 1 and k >= 0
E.g. n²⁰ = O(1.05^n)
 Logarithms grow more slowly than powers
log_b n = O(n^k) for b > 1 and k > 0
E.g. log₂ n = O(n^0.5)

40
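Two of the properties can be spot-checked numerically with doubles (helper names are ours). The second check also illustrates that n²⁰ = O(1.05^n) only holds eventually: at n = 100 the power is still far ahead of the exponential:

```cpp
#include <cassert>
#include <cmath>

// log2(n) <= sqrt(n), i.e. logarithms grow more slowly than powers.
bool logBelowRoot(double n) {
    return std::log2(n) <= std::sqrt(n);
}

// n^20 < 1.05^n, i.e. powers eventually fall below exponentials.
bool powerBelowExponential(double n) {
    return std::pow(n, 20.0) < std::pow(1.05, n);
}
```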
Big-Omega Notation
 Just as O-notation provides an asymptotic upper
bound on a function, Ω-notation provides an
asymptotic lower bound.
 Formal Definition: A function f(n) = Ω(g(n)) if
there exist constants c, k ∊ ℛ+ such that
f(n) >= c.g(n) for all n >= k.
 f(n) = Ω(g(n)) means that f(n) is greater than or
equal to some constant multiple of g(n) for all
values of n greater than or equal to some k.
 Example: If f(n) = n², then f(n) = Ω(n)
 In simple terms, f(n) = Ω(g(n)) means that the growth
rate of f(n) is greater than or equal to that of g(n).
41
Cont…

Fig: Big-Omega growth

42
Theta Notation
 A function f(n) belongs to the set Θ(g(n)) if
there exist positive constants c1 and c2 such that
it can be sandwiched between c1.g(n) and c2.g(n)
for sufficiently large values of n.
 Formal Definition: A function f(n) is Θ(g(n)) if it is
both O(g(n)) and Ω(g(n)). In other words, there
exist constants c1, c2, and k > 0 such that
 c1.g(n) <= f(n) <= c2.g(n) for all n >= k
 If f(n) = Θ(g(n)), then g(n) is an asymptotically
tight bound for f(n).
 In simple terms, f(n) = Θ(g(n)) means that f(n) and g(n)
have the same rate of growth.
43
Cont…
Example:
1. If f(n) = 2n+1, then f(n) = Θ(n)
2. If f(n) = 2n², then
f(n) = O(n⁴)
f(n) = O(n³)
f(n) = O(n²)

Fig: Theta growth

 All these are technically correct, but the last expression is
the best and tightest one. Since 2n² and n² have the same
growth rate, it can be written as f(n) = Θ(n²).
44
Cont…
Example:
1. Show that f(n) = 10n³+5n²+17 is Θ(n³)
10n³ <= f(n) <= 10n³+5n³+17n³ for all n >= 1
10n³ <= f(n) <= 32n³
f(n) = Θ(n³) (c1=10, c2=32, k=1)
2. Show that 5n log n + 10n is Θ(n log n)
5n log n <= f(n) <= 5n log n + 10n log n for all n >= 2
5n log n <= f(n) <= 15n log n
f(n) = Θ(n log n) (c1=5, c2=15, k=2)
Exercise
Show that f(n) = 10n³-5n² is Θ(n³)
45
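The sandwich in example 2 can be checked numerically with log base 2 (function name is ours). Note that at n = 1 the upper bound 15·n·log₂(n) is 0 while f(1) = 10, so the inequality only holds from n = 2 on:

```cpp
#include <cassert>
#include <cmath>

// Check 5 n log2(n) <= 5 n log2(n) + 10 n <= 15 n log2(n) for 2 <= n <= upTo.
bool thetaSandwichHolds(long upTo) {
    for (long n = 2; n <= upTo; n++) {
        double lo = 5.0 * n * std::log2((double)n);
        double f  = lo + 10.0 * n;
        double hi = 15.0 * n * std::log2((double)n);
        if (!(lo <= f && f <= hi)) return false;
    }
    return true;
}
```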
Little-o Notation
 Big-Oh notation may or may not be asymptotically
tight, for example:
2n² = O(n²)
2n² = O(n³)
 f(n)=o(g(n)) means for all c>0 there exists some
k>0 such that f(n)<c.g(n) for all n>=k. Informally,
f(n)=o(g(n)) means f(n) becomes insignificant
relative to g(n) as n approaches infinity.
Example: f(n) = 3n+4 is o(n²)
 In simple terms, f(n) has a smaller growth rate
than g(n).
 E.g. if g(n) = 2n², then g(n) = o(n³) and g(n) = O(n²),
but g(n) is not o(n²).
46
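The little-o definition can be probed numerically: for f(n) = 3n+4 and g(n) = n², even a tiny c works once n is large enough (the helper name is ours):

```cpp
#include <cassert>

// Is f(n) = 3n + 4 already below c * g(n) = c * n^2 at this n?
bool dominatedFor(double c, long n) {
    return 3.0 * n + 4.0 < c * (double)n * (double)n;
}
```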
Little-Omega ( notation)
 Little-omega (ω) notation is to big-omega (Ω)
notation as little-o notation is to Big-Oh notation.
 We use ω-notation to denote a lower bound that is not
asymptotically tight.
 Formal Definition: f(n) = ω(g(n)) if for any constant c > 0
there exists a constant k > 0 such that 0 <= c.g(n) < f(n)
for all n >= k.
Example: 2n² = ω(n) but it's not ω(n²).

47
Relational Properties of the Asymptotic
Notations
Transitivity
if f(n) = Θ(g(n)) and g(n) = Θ(h(n)) then f(n) = Θ(h(n)),
if f(n) = O(g(n)) and g(n) = O(h(n)) then f(n) = O(h(n)),
if f(n) = Ω(g(n)) and g(n) = Ω(h(n)) then f(n) = Ω(h(n)),
if f(n) = o(g(n)) and g(n) = o(h(n)) then f(n) = o(h(n)),
and
if f(n) = ω(g(n)) and g(n) = ω(h(n)) then f(n) = ω(h(n)).
Symmetry
• f(n) = Θ(g(n)) if and only if g(n) = Θ(f(n)).

48
Cont…
Transpose symmetry
f(n) = O(g(n)) if and only if g(n) = Ω(f(n)),
f(n) = o(g(n)) if and only if g(n) = ω(f(n)).
Reflexivity
f(n) = Θ(f(n)),
f(n) = O(f(n)),
f(n) = Ω(f(n)).

49
Cont…
Growth Rates

Function   Name
c          Constant
log n      Logarithmic
log²n      Log-squared
n          Linear
n log n    Linear-log
n²         Quadratic
n³         Cubic
2^n        Exponential
n!         Factorial
n^n        n to the n (grows faster than n!)

Fig: algorithm growth rate comparison
50
Any Questions?

51
