
UNIT - I

INTRODUCTION

Algorithm analysis: Time and space complexity - Asymptotic Notations


and its properties Best case, Worst case and average case analysis –
Recurrence relation: substitution method - Lower bounds – searching:
linear search, binary search and Interpolation Search, Pattern search:
The naïve string-matching algorithm - Rabin-Karp algorithm -
Knuth-Morris-Pratt algorithm. Sorting: Insertion sort – heap sort

1.1 ALGORITHM

1.1.1 Definition
The sequence of steps to be performed in order to solve a problem by the
computer is known as an algorithm.

Programs = Algorithms + Data

Another way to describe an algorithm is as a sequence of unambiguous instructions.


It starts from an initial input and instructions that describe a computation, and proceeds
through a finite number of well-defined successive steps, producing an output and a
final ending state.
The notion of the algorithm was first developed in the 9th century by the Persian
scientist, astronomer and mathematician Abdullah Muhammad bin Musa al-Khwarizmi. He is
often cited as "the father of algebra", and the term "algorithm" derives from
his name.

1.1.2 Examples on Algorithms

Example 1
Problem statement: Calling a friend on the telephone

Input: The telephone number of your friend.

Output: Talk to your friend

Algorithm Steps
(i) Pick up the phone and listen for a dial tone

(ii) Press each digit of the phone number on the phone

(iii) If busy, hang up phone, wait 2 minutes, jump to step 2

(iv) If no one answers, leave a message then hang up

(v) If no answering machine, hang up and wait 2 hours, then jump to step 2

(vi) Talk to friend

(vii) Hang up phone

Example 2
Problem statement: Find the largest number in the given list of numbers

Input: A list of positive integer numbers.

Output: Largest number.

Algorithm Steps
(i) Define a variable ’max’ and initialize with ’0’.

(ii) Compare first number (say ’x’) in the list ’L’ with ’max’.

(iii) If ’x’ is larger than ’max’, set ’max’ to ’x’.

(iv) Repeat step 2 and step 3 for all numbers in the list ’L’.

(v) Display the value of ’max’ as a result.
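The steps above translate directly into Python; a minimal sketch (the function name is ours):

```python
def find_largest(L):
    # Step (i): define a variable 'max' and initialise it with 0
    # (safe here because the input is a list of positive integers)
    max_value = 0
    # Steps (ii)-(iv): compare each number x in the list with max
    for x in L:
        if x > max_value:      # Step (iii)
            max_value = x
    # Step (v): the value of max is the result
    return max_value

print(find_largest([12, 5, 10, 15, 31, 20, 25, 2, 40]))  # 40
```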

1.1.3 Properties of an Algorithm


1. Finiteness: The algorithm must always terminate after a finite number of
steps.

2. Definiteness: Each instruction must be clear, well-defined and precise. There


should not be any ambiguity.

3. Effectiveness: Each Instruction must be simple and be carried out in a


finite amount of time.

4. Input: An algorithm has zero or more inputs, taken from a specified set
of objects.

5. Output: An algorithm has one or more outputs, which have a specified


relation to the inputs.

6. Feasibility: It must be possible to perform each instruction.

7. Generality: The algorithm must be able to work for a set of inputs rather
than a single input.

8. Efficiency: Efficiency is measured in terms of the time and memory space an
algorithm requires. A good algorithm takes as little time and as little memory
space as possible.

9. Independent: An algorithm must be language independent. It means that it


should mainly focus on the input and the procedure required to get the
output instead of depending upon the language.

1.1.4 Necessity to analyse the algorithm


If we want to go from city “A” to city “B”, there can be many ways of doing
this. We can go by flight, by bus, by train and also by bicycle. Depending on the
availability and convenience, we choose the one which suits us. Similarly, in computer

science, there are multiple algorithms to solve a problem. When we have more than
one algorithm to solve a problem, we need to select the best one. Performance analysis
helps us to select the best algorithm from multiple algorithms to solve a problem.

The performance of an algorithm depends on parameters such as:

1. Whether that algorithm provides the exact solution for the problem
statement

2. Whether it is easy to understand

3. Whether it is easy to implement

4. How much space (memory) is required to solve the problem

5. How much time is required to solve the problem

1.2 ALGORITHM ANALYSIS

Algorithm analysis is an important part of computational complexity theory,


which provides theoretical estimation for the required resources of an algorithm to
solve a specific computational problem. Most algorithms are designed to work with
inputs of arbitrary length. Algorithm analysis is the process of calculating space and
time required by that algorithm. The term “analysis of algorithms” was coined by
Donald Knuth.

Algorithm analysis is performed by using the following measures -

1. Space Complexity: Space required to complete the task. It includes program


space and data space

2. Time Complexity: Time required to complete the task.

1.2.1 Space complexity


Space complexity is the amount of memory used by an algorithm (including the
input values of the algorithm) to execute completely and produce the result. We
know that to execute an algorithm it must be loaded in the main memory. The memory
can be used in different forms:

 Variables (This includes the constant values and temporary values)



 Program Instruction

 Execution

Space complexity includes both Auxiliary space and space used by input.
Auxiliary Space is the extra space or temporary space used by an algorithm.

Memory Usage during program execution


 Instruction Space → used to save compiled instruction in the memory.

 Environmental Stack → used for storing the addresses while a module calls
another module or functions during execution.

 Data space → used to store data, variables, and constants which are stored
by the program and it is updated during execution.

Space complexity is a parallel concept to time complexity. If we need to create
an array of size n, this will require O(n) space. If we create a two-dimensional array of
size n * n, this will require O(n²) space.

1.2.2 Time complexity


Time complexity of an algorithm measures the amount of time taken by an
algorithm, i.e. the time taken to execute each statement of code in the algorithm.

Example
 Time taken to execute 1 statement = x milliseconds.

 Time taken to execute n statements = x * n milliseconds.

 To execute n statements inside a FOR loop = x * n + y milliseconds,

where y milliseconds is the time taken to execute the FOR loop itself.

1.2.3 Asymptotic Notation


To perform analysis of an algorithm, it is necessary to calculate the complexity
of that algorithm. Calculating the complexity exactly would require the precise amount
of resources used, which is generally not available. So instead of the exact amount
of resources, we represent the complexity in a general form (notation) for the analysis
process.

In asymptotic notation, the complexity of an algorithm is represented only by
its most significant term, ignoring the least significant terms (here, complexity means
space complexity or time complexity).

Example

 Algorithm 1 : 25n³ + 2n + 1

 Algorithm 2 : 1223n² + 8n + 3

The term '2n + 1' has less significance than the term '25n³', and the term
'8n + 3' in Algorithm 2 has less significance than the term '1223n²'.

Definition
Asymptotic notations are mathematical tools to represent the time and space
complexity of algorithms for asymptotic analysis.

There are mainly three asymptotic notations:

1. Big-O Notation (O-notation)

2. Omega Notation (Ω-notation)

3. Theta Notation (Θ-notation)

1. Big-Oh Notation (O-notation)


 Big-Oh notation is used to define the upper bound of an algorithm, i.e. it
indicates the maximum time required by an algorithm for all input values.
Therefore, it gives the worst-case complexity of an algorithm.

 Consider function f(n) as the time complexity of an algorithm and g(n) as its
most significant term. If f(n) ≤ C·g(n) for all n ≥ n₀, where C > 0 and n₀ ≥ 1,
then we can represent f(n) as O(g(n)).

[Figure: Big-Oh notation, f(n) = O(g(n))]

Examples

 100, log(2000), 10⁴ → O(1)

 n/4, 2n + 3, n/100 + log(n) → O(n)

 n² + n, 2n², n² + log(n) → O(n²)

 O provides upper bounds.

2. Omega Notation (Ω-notation)


 Omega notation represents the lower bound of an algorithm, i.e. it indicates
the minimum time required by an algorithm for all input values. Thus, it
provides the best case complexity of an algorithm.

 Consider function f(n) as the time complexity of an algorithm and g(n) as its
most significant term. If f(n) ≥ C·g(n) for all n ≥ n₀, where C > 0 and n₀ ≥ 1, then
we can represent f(n) as Ω(g(n)).

[Figure: Omega notation, f(n) = Ω(g(n))]

Examples
 100, log(2000), 10⁴ → Ω(1)

 n/4, 2n + 3, n/100 + log(n) → Ω(n)

 n² + n, 2n², n² + log(n) → Ω(n²)

Ω provides lower bounds.

" CWTcP =^cPcX^] Φ=^cPcX^]


 Theta notation indicates the average time required by an algorithm. Since it
represents both the upper and the lower bound of the running time of an
algorithm, it is used for analyzing the average-case complexity of an
algorithm.

 Consider function f(n) as the time complexity of an algorithm and g(n) as its
most significant term. If C₁·g(n) ≤ f(n) ≤ C₂·g(n) for all n ≥ n₀, where C₁ > 0,
C₂ > 0 and n₀ ≥ 1, then we can represent f(n) as Θ(g(n)).

[Figure: Theta notation, f(n) = Θ(g(n))]

Examples
 100, log(2000), 10⁴ → Θ(1)

 n/4, 2n + 3, n/100 + log(n) → Θ(n)

 n² + n, 2n², n² + log(n) → Θ(n²)

Θ provides exact bounds.



Asymptotic Notation

f(n) = O(g(n))   Big-Oh Notation: Upper Bound ⇒ Worst case

f(n) = Ω(g(n))   Omega Notation: Lower Bound ⇒ Best case

f(n) = Θ(g(n))   Theta Notation: Upper & Lower Bound ⇒ Average case

1.3 WORST CASE, AVERAGE CASE AND BEST CASE IN ALGORITHM ANALYSIS

 Best case: Function which performs the minimum number of steps on input
data of size n.
 Worst case: Function which performs the maximum number of steps on
input data of size n.
 Average case: Function which performs an average number of steps on
input data of size n.

1.3.1 Best Case Analysis (Very Rarely used)


 In best case analysis, we calculate the lower bound of the execution time
of an algorithm. We must know the case which causes the execution
of the minimum number of operations.
Example: Linear Search

In linear search, the best case occurs when x is present at the first location. The
best case time complexity would be Ω(1).

1.3.2 Worst Case Analysis (Mostly used)


 In worst case analysis, we calculate the upper bound of the execution time
of an algorithm. We must know the case which causes the execution of the
maximum number of operations.
Example – Linear Search

In linear search, Worst case occurs when x is NOT present in the array. The
worst case time complexity of the linear search would be O(n).

1.3.3 Average Case Analysis (Rarely used)

 In average case analysis, take all possible inputs and calculate the computing
time for all of the inputs. Sum all the calculated values and divide the sum
by the total number of inputs.

Example
In linear search, assume all cases are uniformly distributed (including the case
of x not being present in the array). After summing all the cases, divide the sum by
(n + 1).
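The averaging described above can be checked with a short sketch (assuming the n hit positions and the single "not present" case are equally likely; finding x at 0-based position i costs i + 1 comparisons, and a miss costs n):

```python
def average_comparisons(n):
    # Cost of each of the n + 1 equally likely cases:
    # a hit at position i costs i + 1 comparisons, a miss costs n.
    total = sum(i + 1 for i in range(n)) + n
    return total / (n + 1)

# The average grows linearly with n, i.e. O(n)
print(average_comparisons(5))   # (15 + 5) / 6 ≈ 3.33
```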

Types of time complexities

Big O Notation   Name           Example(s)
---------------  -------------  --------------------------------------------------
O(1)             Constant       1. Odd or even number checking
                                2. Look-up table (on average)
O(n)             Linear         1. Find max element in unsorted array
                                2. Duplicate elements in array with hash map
O(n²)            Quadratic      1. Duplicate elements in array
                                2. Bubble sort
O(log n)         Logarithmic    Binary search
O(n log n)       Linearithmic   Merge sort
O(2ⁿ)            Exponential    1. Travelling salesman problem using dynamic
                                   programming
                                2. Fibonacci series generation

1. O(1) - Constant time


O(1) describes algorithms that take the same amount of time to compute
regardless of the input size. For example, if a function takes the same time to process
ten elements and 1 million items, then it is O(1).

Examples
 Find if a number is even or odd.

 Check if an item on an array is null.

 Print the first element from a list.

 Find a value on a map.

2. O(n) - Linear time


Linear time complexity O(n) means that the algorithms take proportionally longer
to complete as the input grows. These algorithms imply that the program visits every
element from the input.

Examples
 Get the max/min value in an array.

 Find a given element in a collection.

 Print all the values in a list.

3. O(n²) - Quadratic time


A function with quadratic time complexity has a growth rate of n². If the
input size is 2, it will do four operations. If the input size is 8, it will take 64, and
so on.

Examples
 Check if a collection has duplicated values.

 Sorting using bubble sort, insertion sort, or selection sort.

 Find all possible ordered pairs in an array.

4. O(log n) - Logarithmic time


Logarithmic time complexities usually apply to algorithms that divide problems
in half every time. For example, to find a word in a book which is sorted
alphabetically, there are two ways to do it.

Method 1:
 Start on the first page of the book and go word by word until you find
matching word.

Method 2:
 Open the book in the middle and check the first word on it.

 If the word you are looking for comes after it alphabetically, then look
in the right half. Otherwise, look in the left half.

 Divide the remainder in half again, and repeat above step until you find
matching.

Method 1 - go word by word - O(n)

Method 2 - split the problem in half for each iteration - O(log n)

Example
 Binary search.

5. O(n log n) - Linearithmic


A linearithmic algorithm is slightly slower than a linear algorithm.
However, it is still much better than a quadratic algorithm.

Examples
 Sorting algorithms like merge sort, quicksort, and others.

6. O(2ⁿ) - Exponential time


Exponential (base 2) running time means the work performed by an
algorithm doubles each time the input size grows by one.

Examples:
 Fibonacci series generation

 Travelling salesman problem using dynamic programming



1.4 RECURRENCE RELATION

A recurrence relation is an equation that defines a sequence by a rule that
gives the next term as a function of the previous term(s). If we know the previous
terms of a given series, we can determine the next term.

Example 1
 Recursive definition for the factorial function

n!=(n-1)! * n

Example 2
 Recursive definition for Fibonacci sequence

Fib(n)=Fib(n-1)+Fib(n-2)

Recurrence relations are often used to model the cost of recursive functions. For
example, the number of multiplications required by a recursive version of the factorial
function for an input of size n will be zero when n = 0 or n = 1 (the base cases),
and it will be one plus the cost of calling fact on a value of n − 1.
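That multiplication count can be modeled and checked with a small sketch (the function name is ours): M(0) = M(1) = 0, otherwise M(n) = 1 + M(n − 1), which closes to n − 1 multiplications for n ≥ 1:

```python
def fact_mults(n):
    # M(0) = M(1) = 0: the base cases need no multiplication
    if n <= 1:
        return 0
    # one multiplication plus the cost of calling fact on n - 1
    return 1 + fact_mults(n - 1)

# closed form: n - 1 multiplications for n >= 1
for n in range(1, 10):
    assert fact_mults(n) == n - 1
```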

1.4.1 Expansion of the Recurrence Equations

Example 1
Let us see the expansion of the following recurrence equation.

T (n) = T (n − 1) + 1 for n > 1

T (0) = T (1) = 0.

Step 1:
T (n) = 1 + T (n − 1),
Step 2:
T (n) = 1 + (1 + T (n − 2)),

Step 3:
T (n) = 1 + (1 + (1 + T (n − 3))),
Step 4:
T (n) = 1 + (1 + (1 + (1 + T (n − 4)))),
Step 5:
This pattern will continue till we reach a sub-problem of size 1.
T (n) = 1 + (1 + (1 + (1 + … + (1 + T (1)))))

Step 6:
Thus the closed form of T (n) = 1 + T (n − 1) can be modeled as
T (n) = ∑_{i=1}^{n−1} 1 = n − 1

Example 2
Let us see the expansion of the following recurrence equation.

T (n) = T (n − 1) + n

T (1) = 1.

Step 1:
T (n) = n + T (n − 1)
Step 2:
T (n) = n + (n − 1 + T (n − 2))
Step 3:
T (n) = n + (n − 1 + (n − 2 + T (n − 3)))
Step 4:
T (n) = n + (n − 1 + (n − 2 + (n − 3 + T (n − 4))))

Step 5:
This pattern will continue till we reach a sub-problem of size 1.
T (n) = n + (n − 1 + (n − 2 + (n − 3 + (n − 4 + … + 1))))

Step 6:
Thus the closed form of T (n) = n + T (n − 1) can be modeled as
T (n) = ∑_{i=1}^{n} i = n (n + 1)/2

1.4.2 Methods for Solving Recurrences


 Substitution Method
 Iteration Method
 Recursion Tree Method
 Master Method
1. Substitution Method
In the substitution method, we have a known recurrence, and we use induction
to prove that our guess is a good bound for the recurrence’s solution.

Steps
 Guess a solution through your experience.
 Use induction to prove that the guess is an upper bound solution for the
given recurrence relation.
Example:
T (n) = 1            if n = 1
     = 2T (n − 1)    if n > 1

T (n) = 2T (n − 1)
     = 2 [2T (n − 2)] = 2² T (n − 2)
     = 4 [2T (n − 3)] = 2³ T (n − 3)
     = 8 [2T (n − 4)] = 2⁴ T (n − 4)

Repeating the procedure i times gives

T (n) = 2ⁱ T (n − i)      … (Eq. 1)

Put n − i = 1, i.e. i = n − 1, in (Eq. 1):
T (n) = 2^(n − 1) T (1)
     = 2^(n − 1) · 1    { T (1) = 1 … given }
     = 2^(n − 1)
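The guess 2^(n − 1) can be checked against the recurrence directly (a quick numerical sketch, not a substitute for the formal induction proof):

```python
def T(n):
    # the recurrence: T(1) = 1, T(n) = 2 T(n - 1) for n > 1
    if n == 1:
        return 1
    return 2 * T(n - 1)

# the derived solution: T(n) = 2^(n - 1)
for n in range(1, 16):
    assert T(n) == 2 ** (n - 1)
print(T(10))  # 512
```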

2. Iteration Method
The iteration method expands the recurrence and expresses it as a summation of terms
involving n and the initial condition.

EXAMPLE 1
Consider the Recurrence
T (n) = 1 if n = 1
= 2T (n − 1) if n > 1

 Solution:
T (n) = 2T (n − 1)
     = 2 [2T (n − 2)] = 2² T (n − 2)
     = 4 [2T (n − 3)] = 2³ T (n − 3)
     = 8 [2T (n − 4)] = 2⁴ T (n − 4)

Repeating the procedure i times gives

T (n) = 2ⁱ T (n − i)      … (Eq. 1)

Put n − i = 1, i.e. i = n − 1, in (Eq. 1):
T (n) = 2^(n − 1) T (1)
     = 2^(n − 1) · 1    { T (1) = 1 … given }
     = 2^(n − 1)

EXAMPLE 2

Consider the Recurrence T (n) = T (n − 1) + 1 and T (1) = θ (1).

 Solution:
T (n) = T (n − 1) + 1
     = (T (n − 2) + 1) + 1 = T (n − 2) + 2
     = (T (n − 3) + 1) + 2 = T (n − 3) + 3
     = T (n − 4) + 4
     = T (n − 5) + 5
     = T (n − k) + k
where k = n − 1:
T (n − k) = T (1) = θ (1)
T (n) = θ (1) + (n − 1) = θ (n)
3. Recursion Tree Method
Recursion is a fundamental concept in computer science and mathematics that
allows functions to call themselves, enabling the solution of complex problems through
iterative steps. One visual representation commonly used to understand and analyze
the execution of recursive functions is a recursion tree.

How to Use a Recursion Tree to Solve Recurrence Relations?


The cost of a sub-problem in the recursion tree technique is the amount of
time needed to solve that sub-problem. Therefore, whenever you see the word "cost"
linked with the recursion tree, it simply refers to the amount of time needed to solve
a certain sub-problem.

Let’s understand all of these steps with a few examples.

EXAMPLE 1
Consider the recurrence relation,

T (n) = 2T (n/2) + K

 Solution
The given recurrence relation shows the following properties,

A problem size n is divided into two sub-problems each of size n/2. The cost
of combining the solutions to these sub-problems is K.
Each problem size of n/2 is divided into two sub-problems each of size n/4 and
so on.
At the last level, the sub-problem size will be reduced to 1. In other words, we
finally hit the base case.
Let’s follow the steps to solve this recurrence relation,

Step 1: Draw the Recursion Tree

T (n) = 2T (n/2) + K

Step 2: Calculate the Height of the Tree


Since we know that when we continuously divide a number by 2, there comes
a time when this number is reduced to 1. Same as with the problem size N, suppose
after K divisions by 2, N becomes equal to 1, which implies, (n/2 ∧ k) = 1

Here n/2 ∧ k is the problem size at the last level and it is always equal to 1.
Now we can easily calculate the value of k from the above expression by taking
log() to both sides. Below is a more clear derivation,
n=2∧k

 log (n) = log (2 ∧ k)


Introduction 1.19

 log (n) = k∗ log (2)

 k = log (n)/log (2)

 k = log (n) base 2

So the height of the tree is log (n) base 2.

Step 3: Calculate the cost at each level


 Cost at Level-0 = K; the two sub-problems are merged once.

 Cost at Level-1 = K + K = 2K; merging is done at two nodes.

 Cost at Level-2 = K + K + K + K = 4K; merging is done at four nodes,

and so on…

Step 4: Calculate the number of nodes at each level


Let’s first determine the number of nodes in the last level. From the recursion
tree, we can deduce this

 Level-0 have 1 (2^0) node

 Level-1 have 2 (2^1) nodes

 Level-2 have 4 (2^2) nodes

 Level-3 have 8 (2^3) nodes

So level log₂ (n) should have 2^(log₂ (n)) nodes, i.e. n nodes.

Step 5: Sum up the cost of all the levels


 The total cost can be written as,

 Total Cost = Cost of all levels except last level + Cost of last level

 Total Cost = Cost for level-0 + Cost for level-1 + Cost for level-2
+ … + Cost for level-log (n) + Cost for last level

The cost of the last level is calculated separately because it is the base case
and no merging is done at the last level so, the cost to solve a single problem at
this level is some constant value. Let’s take it as O (1).

Let’s put the values into the formulae,

 T (n) = K + 2K + 4K + … (log₂ (n) terms) + O (1) · n

 T (n) = K (1 + 2 + 4 + … up to log₂ (n) terms) + O (n)

 T (n) = K (2^0 + 2^1 + 2^2 + … + 2^(log₂ (n) − 1)) + O (n)

If you look closely at the above expression, the bracketed part forms a geometric
progression (a, ar, ar^2, ar^3, …). The sum of the first m terms of a GP is
S = a (r^m − 1)/(r − 1), where a is the first term and r is the common ratio. Here
a = 1, r = 2 and m = log₂ (n), so the sum is 2^(log₂ (n)) − 1 = n − 1. Therefore:

T (n) = K (n − 1) + O (n) = O (n)
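The O(n) conclusion can be sanity-checked numerically (a sketch assuming K = 1 and a base-case cost of 1, so the exact value for a power-of-two n is K(n − 1) + n = 2n − 1):

```python
def T(n, K=1):
    # the recurrence: T(1) = O(1) (taken as 1), T(n) = 2 T(n/2) + K
    if n == 1:
        return 1
    return 2 * T(n // 2, K) + K

# for n a power of two: merging costs K*(n - 1), the n leaves cost 1 each
for p in range(1, 12):
    n = 2 ** p
    assert T(n) == (n - 1) + n   # = 2n - 1, i.e. Theta(n)
print(T(1024))  # 2047
```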

4. Master Method
The Master Method is used for solving recurrences of the form

T (n) = aT (n/b) + f (n)

where a ≥ 1 and b > 1 are constants and f (n) is a function; n/b can be interpreted
as ⌊n/b⌋ or ⌈n/b⌉. Let T (n) be defined on the non-negative integers by this
recurrence.

In the analysis of a recursive algorithm, the constants and function
take on the following significance:

 n is the size of the problem.

 a is the number of subproblems in the recursion.



 n/b is the size of each subproblem. (Here it is assumed that all subproblems
are essentially the same size.)

 f (n) is the sum of the work done outside the recursive calls, which includes
the sum of dividing the problem and the sum of combining the solutions
to the subproblems.

 It is not always possible to bound the function as required, so we
distinguish three cases which tell us what kind of bound we can
apply to the function.

Master Theorem
It is possible to give an asymptotically tight bound in these three cases:

Case 1: If f (n) = O (n^(log_b a − ε)) for some constant ε > 0, then
T (n) = Θ (n^(log_b a)).

Case 2: If f (n) = Θ (n^(log_b a)), then T (n) = Θ (n^(log_b a) · log n).

Case 3: If f (n) = Ω (n^(log_b a + ε)) for some constant ε > 0, and
a · f (n/b) ≤ c · f (n) for some constant c < 1 and all sufficiently large n, then
T (n) = Θ (f (n)).

EXAMPLE 1
Apply the master theorem to T (n) = 8T (n/2) + 1000n².

 Solution:
Compare T (n) = 8T (n/2) + 1000n² with
T (n) = aT (n/b) + f (n), where a ≥ 1 and b > 1:

a = 8, b = 2, f (n) = 1000n², log_b a = log₂ 8 = 3

Check Case 1: f (n) = O (n^(log_b a − ε)), i.e. 1000n² = O (n^(3 − ε)).

If we choose ε = 1, we get 1000n² = O (n^(3 − 1)) = O (n²).

Since this holds, the first case of the master theorem applies to the
given recurrence relation, thus resulting in the conclusion:

T (n) = Θ (n^(log_b a))

Therefore: T (n) = Θ (n³)
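The Θ(n³) bound can also be sanity-checked numerically: if T(n) grows like n³, doubling n should multiply T(n) by about 8 (a sketch with T(1) taken as 1):

```python
def T(n):
    # the recurrence: T(n) = 8 T(n/2) + 1000 n^2, with T(1) = 1
    if n == 1:
        return 1
    return 8 * T(n // 2) + 1000 * n * n

# cubic growth: T(2n)/T(n) should approach 8 for large n
for p in range(8, 12):
    n = 2 ** p
    ratio = T(2 * n) / T(n)
    assert 7.5 < ratio < 8.5
```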

1.5 SEARCHING

Searching is a technique that helps to find whether a given element is present
in a set of elements. A search is said to be successful or unsuccessful depending
on whether the element being searched for is found or not. Some of the standard
searching techniques are:

 Linear Search or Sequential Search

 Binary Search

 Interpolation Search

1.5.1 Linear Search


Linear search is one of the simplest and most straightforward search algorithms. You
traverse the list and compare each element with the target element.
If a match is found, you stop the search; otherwise you continue.

Linear search is implemented using the following steps:

Step 1: Read the search element from the user.
Step 2: Compare the search element with the first element in the array.
Step 3: If both match, then display "Given element found!!!" and terminate
the function.
Step 4: If they do not match, then compare the search element with the next element
in the array.
Step 5: Repeat steps 3 and 4 until the search element has been compared with the last
element in the array.
Step 6: If the last element in the array also does not match, then display "Element
not found!!!" and terminate the function.

1. Python Program to search the given element in the list of items using
Linear Search

Example

Given an array, search for a given element in the array.

Case 1
Input: Search 20

Value: 12   5  10  15  31  20  25   2  40
Index:  0   1   2   3   4   5   6   7   8

Output: True (20 is present in array)

Case 2
Input: Search 26

Value: 12   5  10  15  31  20  25   2  40
Index:  0   1   2   3   4   5   6   7   8

Output: False (26 is not present in array)

Given the array of elements 59, 58, 96, 78, 23 and the search element 96,
linear search proceeds as follows: 96 is compared with 59 (no match), then with
58 (no match), and then with 96 (match).

The element is FOUND at index 2. Hence stop the searching process.

def LinearSearch(mylist, n, k):
    for j in range(0, n):
        if (mylist[j] == k):
            return j
    return -1

mylist = [1, 3, 5, 7, 9]
print("Given Elements : ", mylist)
k = int(input("Enter the element to be searched : "))
n = len(mylist)
result = LinearSearch(mylist, n, k)
if (result == -1):
    print("Element not found")
else:
    print("Element found at index: ", result)

Execution:

Input
Given Elements : [1, 3, 5, 7, 9]

Enter the element to be searched : 3

Output
Element found at index: 1

2. Complexity Analysis of Linear Search

Time Complexity

 Best case - O(1)


The best case occurs when the target element is found at the beginning of
the list/array. Since only one comparison is made, the time complexity is
O(1).

Example:

Array A[] = {3, 4, 0, 9, 8} and target element = 3

Here, the target is found at A[0].

 Worst-case - O(n), where n is the size of the list/array.


The worst-case occurs when the target element is found at the end of the
list or is not present in the list/array. Since you need to traverse the entire
list, the time complexity is O(n), as n comparisons are needed.

 Average case - O(n)


The average case complexity of the linear search is also O(n).

Space Complexity

 The space complexity of the linear search is O(1), as we don’t need any
auxiliary space for the algorithm.

1.5.2 Binary Search


Binary search is a searching algorithm which works efficiently on sorted elements.
It uses the divide-and-conquer method, in which we compare the target element with
the middle element of the list. If they are equal, the target is
found at the middle position; else, we reduce the search space by half, i.e. we apply
binary search to either the left or the right half of the list depending on whether
the target element is smaller or larger than the middle element. We continue this
until a match is found or the size of the array reaches 1.

Binary search is implemented using the following steps:

Step 1: Read the search element from the user.
Step 2: Find the middle element in the sorted array.
Step 3: Compare the search element with the middle element in the sorted array.
Step 4: If both match, then display "Given element found!!!" and terminate
the function.
Step 5: If they do not match, then check whether the search element is smaller
or larger than the middle element.
Step 6: If the search element is smaller than the middle element, then repeat steps 2,
3, 4 and 5 for the left sub-array of the middle element.
Step 7: If the search element is larger than the middle element, then repeat steps 2,
3, 4 and 5 for the right sub-array of the middle element.
Step 8: Repeat the same process until we find the search element in the array or
until the sub-array contains only one element.
Step 9: If that element also doesn't match the search element, then display
"Element not found in the array!!!" and terminate the function.

1. Python Program to search the given element in the list of items using
Binary Search (Iterative approach)

Method 1 – Iterative approach


Given an array of elements: 6, 12, 17, 23, 38, 45, 77, 84, 90

The element to be searched: 45

Formula for calculating the middle index: Mid = (start + end) / 2

The Element is FOUND. Hence stop the searching process.



def mybinarySearch(myarray, x, low, high):
    # Binary Search using Iterative approach
    while low <= high:
        mid = low + (high - low) // 2
        if myarray[mid] == x:
            return mid
        elif myarray[mid] < x:
            low = mid + 1
        else:
            high = mid - 1
    return -1

myarray = [3, 4, 5, 6, 7, 8, 9]
print("Elements in the array: ", myarray)
x = int(input("Enter the element to be searched : "))
result = mybinarySearch(myarray, x, 0, len(myarray) - 1)
if result != -1:
    print("Element is present at index : " + str(result))
else:
    print("Element not found ")

Execution:

Input
Elements in the array:  [3, 4, 5, 6, 7, 8, 9]

Enter the element to be searched : 6

Output
Element is present at index : 3

2. Python Program to search the given element in the list of items using
Binary Search (Recursive approach)

Method 2 – Recursive approach


Method 2 is the recursive approach, in which the function calls itself again and
again. We declare a recursive function and its base condition: the lowest index must
be smaller than or equal to the highest index. We calculate the middle index as in
the previous program, and use if statements to proceed with the binary search:

 If the middle value is equal to the number we are looking for, the middle
index is returned.

 If the middle value is less than the value we are looking for, we call the
recursive function again with low set to mid + 1.

 If the middle value is greater than the value we are looking for, we call the
recursive function again with high set to mid − 1.

Program
def mybinary_search(myarr, low, high, x):
    if high >= low:
        mid = (high + low) // 2
        if myarr[mid] == x:
            return mid
        # If element is smaller than mid, then it can only
        # be present in left subarray
        elif myarr[mid] > x:
            return mybinary_search(myarr, low, mid - 1, x)
        # Else the element can only be present in right subarray
        else:
            return mybinary_search(myarr, mid + 1, high, x)
    else:
        # Element is not present in the array
        return -1

# Test data
myarr = [2, 3, 4, 10, 40]
print("Elements in the array :", myarr)
x = int(input("Enter the element to be searched : "))
# Function call
result = mybinary_search(myarr, 0, len(myarr) - 1, x)
if result != -1:
    print("Element is present at index : ", str(result))
else:
    print("Element is not present in array")

Execution:

Input
Elements in the array : [2, 3, 4, 10, 40]

Enter the element to be searched : 10

Output
Element is present at index : 3

Complexity Analysis of Binary Search

Time Complexity
 Best case - O(1)
The best case occurs when the target element is found in the middle of
list/array. Since only one comparison is made, the time complexity is O(1).

 Worst-case - O(logn)
The worst case occurs when the algorithm keeps halving the search space
until the size of the sub-array reduces to 1. Since the number of
comparisons required is logn, the time complexity is O(logn).

 Average case - O(logn)


Binary search has an average-case complexity of O(logn).

Space Complexity
 Since no extra space is needed, the space complexity of the iterative
binary search is O(1). (The recursive version uses O(logn) stack space.)

1.5.3 Interpolation Search


The interpolation search is basically an improved version of the binary search.
This searching algorithm resembles the method by which one might search a telephone
book for a name. It performs very efficiently when there are uniformly distributed
elements in the sorted list. In a binary search, we always start searching from the
middle of the list, whereas in the interpolation search we determine the starting position
depending on the item to be searched. In the interpolation search algorithm, the starting
search position is most likely to be the closest to the start or end of the list depending

on the search item. If the search item is near to the first element in the list, then
the starting search position is likely to be near the start of the list.

Important points on Interpolation Search


 Interpolation search is an improvement over binary search.

 Binary Search always checks the value at middle index. But, interpolation
search may check at different locations based on the value of element being
searched.

 For interpolation search to work efficiently the array elements/data should


be sorted and uniformly distributed.

Interpolation search is implemented using following steps:


Step 1: Let A - Array of elements, e - element to be searched, pos - current position

Step 2: Assign start = 0 & end = n-1

Step 3: Calculate the position (pos) to start searching by using the formula:

pos = start + [ (end − start) / (A[end] − A[start]) ] ∗ (e − A[start])

Step 4: If A[pos] == e, element found at index pos.

Step 5: Otherwise, if e > A[pos] we make start = pos + 1

Step 6: Else if e < A[pos] we make end = pos - 1

Step 7: Repeat steps 3, 4, 5, 6.

While : start <= end && e >= A[start] && e <= A[end]

 start <= end is checked so that we still have elements in the sub-array.

 e >= A[start] holds while the element we are looking for is greater than
or equal to the starting element of the sub-array we are looking in.

 e <= A[end] holds while the element we are looking for is less than or
equal to the last element of the sub-array we are looking in.

Example: Assume a sorted array of 9 elements (indices 0 to 8) with A[0] = 1,
A[2] = 4 and A[8] = 15. Element to be searched e = 4.

start  end  pos
0      8    0 + (8 − 0)/(15 − 1) ∗ (4 − 1) = 8/14 ∗ 3 = 0.57 ∗ 3 = 1.71 ≈ 1
2      8    2 + (8 − 2)/(15 − 4) ∗ (4 − 4) = 2 + 6/11 ∗ 0 = 2

The first probe lands on index 1; since A[1] < 4, the search continues with
start = 2, and the second probe finds the element at index 2.

/* Python Program to search the given element in the list of items using
Interpolation Search */

Program
def interpolationSearch(arr, lo, hi, x):
    if lo <= hi and arr[lo] <= x <= arr[hi]:
        if arr[hi] == arr[lo]:
            # All values in this range are equal; avoid division by zero
            return lo if arr[lo] == x else -1
        # Probe proportionally to where x lies between arr[lo] and arr[hi]
        pos = lo + ((x - arr[lo]) * (hi - lo)) // (arr[hi] - arr[lo])
        if arr[pos] == x:
            return pos
        if arr[pos] < x:
            return interpolationSearch(arr, pos + 1, hi, x)
        return interpolationSearch(arr, lo, pos - 1, x)
    return -1
arr = [10, 12, 13, 16, 18, 19, 20,
21, 22, 23, 24, 33, 35, 42, 47]
print("Elements in the array :", arr)
x = int(input("Enter the element to be searched : "))
n = len(arr)
index = interpolationSearch(arr, 0, n - 1, x)

if index != -1:
print("Element found at index", index)
else:
print("Element not found")

Execution:

Input
Elements in the array : [10, 12, 13, 16, 18, 19, 20, 21, 22, 23, 24, 33, 35, 42, 47]

Enter the element to be searched : 20

Output
Element found at index 6

Complexity Analysis of Interpolation Search

Time Complexity
 Best case - O(1)
The best case occurs when the target is found exactly at the first probe
position computed using the formula. As we only perform one comparison,
the time complexity is O(1).

 Worst-case - O(n)
The worst case occurs when the given data set is exponentially distributed.

 Average case - O(log(log(n)))


If the data set is sorted and uniformly distributed, then it takes O(log(log(n)))
time as on an average (log(log(n))) comparisons are made.

Space Complexity
 Since no extra space is needed, the space complexity of the interpolation
search is O(1).

1.5.4 Comparative Analysis

                                 Time Complexity                   Space
Algorithm              Best case   Worst case   Average case       Complexity

Linear Search          O(1)        O(n)         O(n)               O(1)
Binary Search          O(1)        O(logn)      O(logn)            O(1)
Interpolation Search   O(1)        O(n)         O(log(log(n)))     O(1)
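As a rough illustration of the O(n) versus O(logn) gap summarised in the table above, the following sketch (the counting helpers are our own, not from the text) counts comparisons when searching a sorted list of 1024 elements:

```python
def linear_search_count(arr, x):
    """Return (index, comparisons) for a simple linear scan."""
    comparisons = 0
    for i, v in enumerate(arr):
        comparisons += 1
        if v == x:
            return i, comparisons
    return -1, comparisons

def binary_search_count(arr, x):
    """Return (index, comparisons) for an iterative binary search."""
    low, high, comparisons = 0, len(arr) - 1, 0
    while low <= high:
        mid = (low + high) // 2
        comparisons += 1
        if arr[mid] == x:
            return mid, comparisons
        elif arr[mid] < x:
            low = mid + 1
        else:
            high = mid - 1
    return -1, comparisons

data = list(range(1024))               # sorted input of size n = 1024
print(linear_search_count(data, 1023)) # roughly n comparisons
print(binary_search_count(data, 1023)) # roughly log2(n) comparisons
```

Searching the last element, the linear scan makes about n comparisons while binary search makes about log2(n), which is the gap the table records.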

1.6 PATTERN SEARCH

The Pattern Searching algorithms are sometimes also referred to as String


Searching Algorithms. These algorithms are useful in the case of searching a pattern
in a string.

Algorithms used for String Matching:

Various string matching algorithms are:

 The Naive String Matching Algorithm

 The Rabin-Karp-Algorithm

 Finite Automata

 The Knuth-Morris-Pratt Algorithm

 The Boyer-Moore Algorithm

Algorithms based on character comparison

Naive Match Algorithm


It slides the pattern over text one by one and checks for a match. If a match
is found, then slides by 1 again to check for subsequent matches.

KMP (Knuth Morris Pratt) Algorithm


KMP algorithm is used to find a “Pattern” in a “Text”. This algorithm compares
character by character from left to right. But whenever a mismatch occurs, it uses a
pre-processed table called “Prefix Table” to skip characters comparison while matching.

Algorithms based on Hashing Technique

Rabin Karp Algorithm


It matches the hash value of the pattern with the hash value of current substring
of text, and if the hash values match then only it starts matching individual characters.

1.6.1 Naive Match Algorithm


This is a simple brute force approach. It compares the first character of the pattern
with the given string. If a match is found, pointers in both strings are advanced.
If a match is not found, the pointer to the text is incremented and the pointer of
the pattern is reset. This process is repeated till the end of the text. The naïve
approach does not require any pre-processing.
Given a text array, T [1.....n], of n characters and a pattern array, P [1......m],
of m characters, the algorithm is to find an integer s, called a valid shift, where
0 ≤ s ≤ n-m. In other words, we need to find whether P is in T, i.e., whether P is
a substring of T. The items of P and T are characters drawn from some finite alphabet
such as {0, 1} or {A, B .....Z, a, b..... z}.

Steps
1. n → length [T]
2. m → length [P]
3. for s ← 0 to n -m
4. do if P [1.....m] = T [s + 1....s + m]
5. then print “Pattern occurs with shift” s
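The shift enumeration in the steps above can be sketched very compactly with Python slicing (the function name valid_shifts is our own, not from the text):

```python
def valid_shifts(T, P):
    """Return every shift s (0 <= s <= n-m) at which pattern P occurs in text T."""
    n, m = len(T), len(P)
    # Compare the window T[s : s+m] against P at every possible shift
    return [s for s in range(n - m + 1) if T[s:s + m] == P]

print(valid_shifts("AABAACAADAABAABA", "AABA"))  # [0, 9, 12]
```

The result matches the sample output below (indices 0, 9 and 12); the character-by-character program given later in this section does the same work without slicing.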

Input
string = “This is my class room”

pattern = “class”

Output
Pattern found at index 11

Input:
string = “AABAACAADAABAABA”

pattern = = “AABA”

Output
Pattern found at index 0

Pattern found at index 9

Pattern found at index 12

Working of Naïve Pattern matching algorithm



1. Python Program to search the pattern in the given string using Naïve
Match algorithm
def naive_algorithm(string, pattern):
    n = len(string)
    m = len(pattern)
    if m > n:
        print("Pattern not found")
        return
    for i in range(n - m + 1):
        j = 0
        # Compare the pattern against the window starting at i
        while j < m:
            if string[i + j] != pattern[j]:
                break
            j += 1
        if j == m:
            print("Pattern found at index: ", i)

string = "hellohihello"
print("Given String : ", string)
pattern = input("Enter the pattern to be searched :")
naive_algorithm(string, pattern)

Execution:

Input
Given String : hellohihello

Enter the pattern to be searched :hi

Output
Pattern found at index: 5

2. Complexity Analysis of Naïve Match

Time Complexity
 Best Case Complexity- O(n).
Best case complexity occurs when the first character of the pattern is not
present in string.
String = “HIHELLOHIHELLO”
Pattern = “ LI”

The number of comparisons in best case is O(n).

 Worst Case Complexity - O(m*(n-m+1)).


Worst case complexity of Naive Pattern Searching occurs in following cases.

Case 1: When all the characters of the string and pattern are the same.

String = “HHHHHHHHHHHH”

Pattern = “ HHH”

Case 2: When only the last character is different.

String = “HHHHHHHHHHHM”

Pattern = “ HHM”

The number of comparisons in the worst case is O(m*(n-m+1)).

Space Complexity
 Since no extra space is needed, the space complexity of the naïve search
is O(1).

3. Merits & Demerits

Advantages
 The comparison of the pattern with the given string can be done in any
order

 No extra space required

 Since it doesn’t require a pre-processing phase, the running time is
equal to the matching time

Disadvantage
 Naive method is inefficient because information from a shift is not used
again.

1.6.2 Rabin Karp Algorithm

Rabin-Karp algorithm is an algorithm used for searching/matching patterns in
the text using a hash function. Unlike the naive string matching algorithm, it does
not compare every character in the initial phase; rather, it filters out the positions
whose hash values do not match and performs character comparison only for those that do.

 Initially calculate the hash value of the pattern.

 Start iterating from the starting of the string:

• Calculate the hash value of the current substring having length m.

• If the hash value of the current substring and the pattern are same,
check if the substring is same as the pattern.

• If they are same, store the starting index as a valid answer. Otherwise,
continue for the next substrings.

 Return the starting indices as the required answer.

Hash(acad) = 1466 Hash(acad) = 1466


Hash(abra) = 1493 Hash(brac) = 1533
Hash(acad) ≠ Hash(abra) Hash(acad) ≠ Hash(brac)
Hence, it is mismatch Hence, it is mismatch

Hash(acad) = 1466 Hash(acad) = 1466


Hash(raca) = 1595 Hash(acad) = 1466
Hash(acad) ≠ Hash(raca) Hash(acad) = Hash(acad)
Hence, it is mismatch Match found at index 3

Steps in Rabin-Karp Algorithm

Step 1:
 Take the input string and the pattern, which we want to match.

Given string:

A B C C D D A E F G

Pattern:

C D D

Step 2:
 Here, we have taken first ten alphabets only (i.e. A to J) and given the
weights.

A B C D E F G H I J
1 2 3 4 5 6 7 8 9 10

Step 3:

n → Length of the text

m → Length of the pattern

Here, n = 10 and m = 3.

d → Number of characters in the input set.

Here, we have taken input set {A, B, C, ..., J}. So, d = 10.

Note: we can assume any suitable value for d.

Step 4:
 Calculate the hash value of the pattern (CDD)

hash value for pattern(p) = Σ (vᵢ ∗ d^(m−i)) mod 13, for i = 1 ... m

= ((3 ∗ 10²) + (4 ∗ 10¹) + (4 ∗ 10⁰)) mod 13

= 344 mod 13

= 6

In the calculation above, choose a prime number (here, 13) in such a way that
we can perform all the calculations with single-precision arithmetic.

 Now calculate the hash value for the first window (ABC)

hash value for text(t) = Σ (vᵢ ∗ d^(m−i)) mod 13, for i = 1 ... m

= ((1 ∗ 10²) + (2 ∗ 10¹) + (3 ∗ 10⁰)) mod 13

= 123 mod 13

= 6

 Compare the hash value of the pattern with the hash value of the text. If
they match then, character-matching is performed. In the above examples,
the hash value of the first window (i.e. text) matches with pattern, so go
for character matching between ABC and CDD. Since they do not match
so, go for the next window.

Step 5:
 We calculate the hash value of the next window by subtracting the first
term and adding the next term as shown below.

 Simple Numerical example

• Pattern length is 3 and string is “23456”

• Let us assume that we computed the value of the first window as


234.

• How to compute the value of the next window “345”? It’s just (234
– 2*100)*10 + 5 and we get 345.

hash value for the next window (BCC):

t = ((123 − 1 ∗ 10²) ∗ 10 + 3) mod 13

= 233 mod 13

= 12

For BCC, t = 12 (≠ 6). Therefore, go for the next window.

After a few windows, we will get the match for the window CDD in the text.
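The rolling update above can be sketched in Python using the same weights (A=1 ... J=10), base d = 10 and modulus q = 13 as in the steps; the dict-based weight table and variable names are our own:

```python
text, pattern = "ABCCDDAEFG", "CDD"
value = {c: i + 1 for i, c in enumerate("ABCDEFGHIJ")}  # weights A=1 ... J=10
d, q = 10, 13
m = len(pattern)

# Initial hashes of the pattern and of the first window of the text
p = sum(value[c] * d ** (m - 1 - i) for i, c in enumerate(pattern)) % q
t = sum(value[c] * d ** (m - 1 - i) for i, c in enumerate(text[:m])) % q

matches = []
for s in range(len(text) - m + 1):
    # Hash match first, then verify character by character
    if t == p and text[s:s + m] == pattern:
        matches.append(s)
    if s < len(text) - m:
        # Rolling update: drop the leading character, shift, append the next one
        t = ((t - value[text[s]] * d ** (m - 1)) * d + value[text[s + m]]) % q

print("pattern hash:", p, "matches:", matches)  # pattern hash: 6 matches: [3]
```

Each window hash is obtained from the previous one in constant time, which is the whole point of the Rabin-Karp rolling hash.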

/* 1. Python Program to search the pattern in the given string using


Rabin-Karp algorithm */
d = 10
def search(pattern, text, q):
    m = len(pattern)
    n = len(text)
    p = 0    # hash value of pattern
    t = 0    # hash value of current window of text
    h = 1    # d^(m-1) mod q, the weight of the leading character
    for i in range(m - 1):
        h = (h * d) % q
    # Calculate hash value for pattern and first window of text
    for i in range(m):
        p = (d * p + ord(pattern[i])) % q
        t = (d * t + ord(text[i])) % q
    # Slide the window over the text and find the matches
    for i in range(n - m + 1):
        if p == t:
            # Hash values match: verify character by character
            for j in range(m):
                if text[i + j] != pattern[j]:
                    break
            else:
                print("Pattern is found at position: " + str(i + 1))
        if i < n - m:
            t = (d * (t - ord(text[i]) * h) + ord(text[i + m])) % q
            if t < 0:
                t = t + q
text = "hihellohi"

print("Given String : ", text)


pattern = input("Enter the pattern to be searched :")
q = int(input("Enter the prime number :"))
search(pattern, text, q)

Execution:

Input
Given String : hihellohi

Enter the pattern to be searched :hello

Enter the prime number :3

Output
Pattern is found at position: 3

2. Complexity Analysis of Rabin-Karp algorithm

Time Complexity
 Best Case Complexity - O(n+m).
The average and best-case running time of the Rabin-Karp algorithm is
O(n+m), but its worst-case time is O(nm).

 Worst Case Complexity - O(nm).


The worst case of the Rabin-Karp algorithm occurs when all characters of
pattern and text are the same as the hash values of all the substrings of
text matches with the hash value of pattern.

Space Complexity
 Since no extra space is needed, the space complexity of the Rabin-Karp
algorithm is O(1).

3. Merits & Demerits

Advantages
 Extends to 2D patterns.

 Extends to finding multiple patterns.



Disadvantage:
 Arithmetic operations are slower than character comparisons.

1.6.3 Knuth-Morris-Pratt Algorithm


KMP Algorithm is one of the most popular patterns matching algorithms. KMP
stands for Knuth Morris Pratt algorithm. KMP algorithm was the first linear time
complexity algorithm for string matching. KMP algorithm is used to find a “Pattern”
in a “Text”. This algorithm compares character by character from left to right. But
whenever a mismatch occurs, it uses a pre-processed table called “Prefix Table” to
skip characters comparison while matching. Sometimes prefix table is also known as
LPS Table. Here LPS stands for “Longest proper Prefix which is also Suffix”.

Steps for Creating LPS Table (Prefix Table)


Step 1: Define a one-dimensional array with the size equal to the length of the
Pattern. (LPS[size])

Step 2: Define variables i & j. Set i = 0, j = 1 and LPS[0] = 0.

Step 3: Compare the characters at Pattern[i] and Pattern[j].

Step 4: If both are matched then set LPS[j] = i+1 and increment both i & j values
by one. Goto Step 3.

Step 5: If both are not matched then check the value of variable ’i’. If it is ’0’
then set LPS[j] = 0 and increment ’j’ value by one; if it is not ’0’ then set i =
LPS[i-1]. Goto Step 3.

Step 6: Repeat above steps until all the values of LPS[] are filled.

Example:
Given Pattern

A B C D A B D
Initialize LPS[] table with size 7 which is equal to the length of the pattern

Step 1:
 Define variables i & j.

 Set i = 0, j= 1 and LPS[0] = 0.

0 1 2 3 4 5 6
LPS 0

Step 2:

 Compare Pattern[i] with Pattern[j] ⇒ A is compared with B. Since both were
not matching, check the value of i.

 i = 0, so set LPS[j] = 0 and increment ‘j’ value by 1.

0 1 2 3 4 5 6
LPS 0 0

 Now, i = 0 & j = 2

Step 3:

 Compare Pattern[i] with Pattern[j] ⇒ A is compared with C. Since both


were not matching, check the value of i.

 i = 0, so set LPS[j] = 0 and increment ‘j’ value by 1.

0 1 2 3 4 5 6
LPS 0 0 0
 Now, i = 0 & j = 3

Step 4:

 Compare Pattern[i] with Pattern[j] ⇒ A is compared with D. Since both


were not matching, check the value of i.

 i = 0, so set LPS[j] = 0 and increment ‘j’ value by 1.

0 1 2 3 4 5 6
LPS 0 0 0 0

 Now, i = 0 & j = 4

Step 5:

 Compare Pattern[i] with Pattern[j] ⇒ A is compared with A. Since both


are matching, set LPS[j] = i+1 and increment both ‘i’ & ‘j’ value by 1.

0 1 2 3 4 5 6
LPS 0 0 0 0 1

 Now, i = 1 & j = 5

Step 6:

 Compare Pattern[i] with Pattern[j] ⇒ B is compared with B. Since both


are matching, set LPS[j] = i+1 and increment both ‘i’ & ‘j’ value by 1.

0 1 2 3 4 5 6
LPS 0 0 0 0 1 2

 Now, i = 2 & j = 6

Step 7:

 Compare Pattern[i] with Pattern[j] ⇒ C is compared with D. Since both


were not matching, check the value of i.

 i ≠ 0, so set i = LPS[i-1] ⇒ LPS[2-1] = LPS[1] = 0

 i = 0

0 1 2 3 4 5 6
LPS 0 0 0 0 1 2

 Now, i = 0 & j = 6

Step 8:

 Compare Pattern[i] with Pattern[j] ⇒ A is compared with D. Since both


were not matching, check the value of i.

 i = 0, so set LPS[j] = 0 and increment ‘j’ value by 1.

0 1 2 3 4 5 6
LPS 0 0 0 0 1 2 0

 Now, i = 0 & j = 7

Final LPS[] table is as follows:

0 1 2 3 4 5 6
LPS 0 0 0 0 1 2 0
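The LPS construction described in Steps 1-6 can be sketched as a Python function (the name build_lps is our own; the textbook's own prefix-table routine appears later in this section):

```python
def build_lps(pattern):
    """Longest proper Prefix which is also Suffix, one entry per position."""
    lps = [0] * len(pattern)
    i, j = 0, 1          # i: length of the current matched prefix, j: position
    while j < len(pattern):
        if pattern[i] == pattern[j]:
            i += 1
            lps[j] = i
            j += 1
        elif i != 0:
            i = lps[i - 1]   # fall back in the pattern, do not advance j
        else:
            lps[j] = 0
            j += 1
    return lps

print(build_lps("ABCDABD"))  # [0, 0, 0, 0, 1, 2, 0]
```

The output reproduces the final LPS[] table shown above for the pattern ABCDABD.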

1. Working mechanism of KMP


We use the LPS table to decide how many characters are to be skipped for
comparison when a mismatch has occurred. When a mismatch occurs, check the LPS
value of the character before the mismatched character in the pattern.

 If it is ’0’ then start comparing the first character of the pattern with the
next character to the mismatched character in the text.

 If it is not ’0’ then start comparing the character which is at an index value
equal to the LPS value of the previous character to the mismatched character
in pattern with the mismatched character in the Text.

Example
Consider the following Text and Pattern

Text: ABC ABCDAB ABCDABCDABDE


Pattern: ABCDABD

LPS[] table for the above pattern is as follows:

0 1 2 3 4 5 6
LPS 0 0 0 0 1 2 0

Step 1:
 Start comparing the first character of the pattern with the first character of
Text from left to right.

Text A B C A B C D A B A B C D A B C D A B D E

0 1 2 3 4 5 6
Pattern A B C D A B D

 Here mismatch occurs at pattern[3], so we need to consider LPS[2]. Since its
value is ‘0’, we must compare the first character in the pattern with the
next character in the Text.

Step 2:
 Start comparing first charater in pattern with next character in Text.

Text A B C A B C D A B A B C D A B C D A B D E

0 1 2 3 4 5 6
Pattern A B C D A B D

 Here mismatch occurs at pattern[6], so we need to consider LPS[5].
LPS[5] = 2, so now we must compare pattern[2] with the mismatched
character in the Text.

Step 3:
 Since LPS value is ‘2’ no need to compare Pattern[0] & Pattern[1] values.

Text A B C A B C D A B A B C D A B C D A B D E

0 1 2 3 4 5 6
Pattern A B C D A B D

 Here mismatch occurs at pattern[2]. We need to consider LPS[1], whose value
is ‘0’. Hence compare the first character in the pattern with the next
character in the Text.

Step 4:
 Since LPS value is ‘2’ no need to compare Pattern[0] & Pattern[1] values.

Text A B C A B C D A B A B C D A B C D A B D E

0 1 2 3 4 5 6
Pattern A B C D A B D

 Here mismatch occurs at pattern[6]. We need to consider LPS[5].
LPS[5] = 2, so now we must compare pattern[2] with the mismatched
character in the Text.

Step 5:
 Since LPS value is ‘2’ no need to compare Pattern[0] & Pattern[1] values.
Compare pattern[2] with mismatched character in Text.

Text A B C A B C D A B A B C D A B C D A B D E

0 1 2 3 4 5 6
Pattern A B C D A B D

 Here all the characters of the pattern matched with the substring in the
Text, which starts at index value 15. Hence, conclude that pattern found at
index 15.

/* 1. Python Program to search the pattern in the given string using


Knuth-Morris-Pratt Algorithm*/
def KMP_String(pattern, text):
    a = len(text)
    b = len(pattern)
    prefix_arr = get_prefix_arr(pattern, b)
    initial_point = []
    m = 0    # index into the text
    n = 0    # index into the pattern
    while m != a:
        if text[m] == pattern[n]:
            m += 1
            n += 1
            if n == b:
                # Full match: record start index, fall back via the prefix table
                initial_point.append(m - n)
                n = prefix_arr[n - 1]
        elif n != 0:
            # Mismatch after some matches: fall back in the pattern only
            n = prefix_arr[n - 1]
        else:
            m += 1
    return initial_point

def get_prefix_arr(pattern, b):
    prefix_arr = [0] * b
    n = 0
    m = 1
    while m != b:
        if pattern[m] == pattern[n]:
            n += 1
            prefix_arr[m] = n
            m += 1
        elif n != 0:
            n = prefix_arr[n - 1]
        else:
            prefix_arr[m] = 0
            m += 1
    return prefix_arr

string = "hihellohihellohi"
print("Given String : ", string)
pat = input("Enter the pattern to be searched :")
initial_index = KMP_String(pat, string)
for i in initial_index:
    print('Pattern is found at index: ', i)

Execution:

Input
Given String : hihellohihellohi

Enter the pattern to be searched :hi



Output
Pattern is found at index: 0

Pattern is found at index: 7

Pattern is found at index: 14

2. Complexity Analysis of Knuth-Morris-Pratt Algorithm

Time Complexity
 Worst case complexity of KMP algorithm is O(m+n).

• O(m) time is taken for LPS table creation.

• Once this prefix suffix table is created, actual search complexity is O(n).

Space Complexity
 Space complexity of the KMP algorithm is O(m) because some pre-processing
work is involved.

3. Merits & Demerits

Advantages
 The running time of the KMP algorithm is O(m + n), which is very fast.
 The algorithm never needs to move backwards in the input text T. This makes
the algorithm good for processing very large files.

Disadvantage
 Doesn’t work so well as the size of the alphabets increases.

1.6.4 Comparative Analysis

Algorithm                      Pre-processing the Pattern   Time Complexity      Space Complexity

Naive Match Algorithm          No pre-processing            O(m*(n-m+1))         O(1)
Rabin-Karp Algorithm           No pre-processing            O(nm) (worst case)   O(1)
Knuth-Morris-Pratt Algorithm   Pre-process the pattern      O(m + n)             O(m)

1.7 SORTING

Sorting is the process of arranging the data in ascending or descending order.
There are several types of sorting in data structures, namely,

 Bubble sort

 Insertion sort

 Selection sort

 Bucket sort

 Heap sort

 Quick sort

 Radix sort etc.

1.7.1 Insertion Sort

Insertion sort is a simple sorting algorithm that works similar to the way you
sort playing cards in your hands. The array is virtually split into a sorted and an
unsorted part. Values from the unsorted part are picked and placed at the correct
position in the sorted part.

Insertion sort

Steps

Step 1:
 The first element in the array is assumed to be sorted.

Step 2:
 Take the second element and store it separately in currentvalue. Compare
currentvalue with the first element. If the first element is greater than
currentvalue, then currentvalue is placed in front of the first element.
Now, the first two elements are sorted.

Step 3:
 Take the third element and compare it with the elements on its left. Place
it just after the element smaller than it. If there is no element smaller
than it, place it at the beginning of the array.

Step 4:
 Similarly, place every unsorted element at its correct position. Repeat until
list is sorted.

Working of Insertion Sort algorithm

Example
List = [12, 11, 13, 5, 6]

First Pass
 Initially, the first two elements of the array are compared in insertion sort.

12 11 13 5 6

 Here, 12 is greater than 11. They are not in the ascending order and 12 is
not at its correct position. Hence, swap 11 and 12.

 So, for now 11 is stored in a sorted sub-array.

11 12 13 5 6

Second Pass
 Now, move to the next two elements and compare them

11 12 13 5 6

 Here, 13 is greater than 12; both elements are already in ascending
order, hence no swapping occurs. 12 is also stored in the sorted sub-array
along with 11

Third Pass
 Now, two elements are present in the sorted sub-array which are 11 and
12

 Moving forward to the next two elements which are 13 and 5

11 12 13 5 6

 Both 5 and 13 are not present at their correct place so swap them

11 12 5 13 6

 After swapping, elements 12 and 5 are not sorted, thus swap again

11 5 12 13 6

 Here, again 11 and 5 are not sorted, hence swap again

5 11 12 13 6

Fourth Pass
 Now, the elements which are present in the sorted sub-array are 5, 11 and
12

 Moving to the next two elements 13 and 6

5 11 12 13 6

 Clearly, they are not sorted, thus perform swap between both

5 11 12 6 13

 Now, 6 is smaller than 12, hence, swap again



5 11 6 12 13

 Here also, the swap leaves 11 and 6 unsorted; hence, swap again

5 6 11 12 13

Finally, the list is completely sorted.

/* 1. Python Program to sort the elements in the list using Insertion sort */
def insertionSort(arr):
    for index in range(1, len(arr)):
        currentvalue = arr[index]
        position = index
        # Shift larger elements one place to the right
        while position > 0 and arr[position-1] > currentvalue:
            arr[position] = arr[position-1]
            position = position - 1
        arr[position] = currentvalue
arr = [54,26,93,17,77,91,31,44,55,20]
print("Given list : ", arr)
insertionSort(arr)
print("Sorted list : ",arr)

Execution:

Input
Given list : [54, 26, 93, 17, 77, 91, 31, 44, 55, 20]

Output
Sorted list : [17, 20, 26, 31, 44, 54, 55, 77, 91, 93]

2. Complexity Analysis of Insertion sort

Time Complexity
 Best case complexity - O(n)
It occurs when there is no sorting required, i.e. the array is already sorted.

 Worst case complexity - O(n²)


It occurs when the array elements are required to be sorted in reverse order.

It means suppose we need to sort the array elements in ascending order,


but its elements are in descending order.

 Average case complexity - O(n²)


It occurs when the array elements are in jumbled order that is not properly
ascending and not properly descending.
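One way to see these cases concretely is to count element shifts; the counting wrapper below is our own sketch around the same algorithm:

```python
def insertion_sort_shifts(arr):
    """Sort a copy of arr with insertion sort; return the number of element shifts."""
    a = list(arr)
    shifts = 0
    for index in range(1, len(a)):
        currentvalue = a[index]
        position = index
        while position > 0 and a[position - 1] > currentvalue:
            a[position] = a[position - 1]   # shift a larger element right
            position -= 1
            shifts += 1
        a[position] = currentvalue
    return shifts

n = 100
print(insertion_sort_shifts(range(n)))         # already sorted: 0 shifts, O(n) work
print(insertion_sort_shifts(range(n, 0, -1)))  # reversed: n*(n-1)/2 = 4950 shifts
```

The sorted input does no shifting at all (only n-1 comparisons), while the reversed input shifts every element past all previously sorted ones, giving the quadratic worst case.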

Space Complexity
 Space complexity of insertion sort is O(1)

1.7.2 Heap Sort

Heap sort is a comparison-based sorting technique based on the Binary Heap data
structure. It is similar to selection sort in that we repeatedly find the extreme
(minimum or maximum) element, place it in its final position, and repeat the same
process for the remaining elements. Heap sort processes the elements by creating a
min-heap or max-heap from the elements of the given array. A min-heap or max-heap
represents an ordering of the array in which the root element is the minimum or
maximum element of the array.

1. Heap
 A heap is a complete binary tree; a binary tree is a tree in which each
node can have at most two children. A complete binary tree is a binary
tree in which all the levels except the last are completely filled, and
the nodes in the last level are left-justified.

2. Relationship between Array Indexes and Tree Elements


 A complete binary tree has an interesting property that we can use to find
the children and parents of any node.

 If the index of any element in the array is i, the element at index 2i+1
is its left child and the element at index 2i+2 is its right child. Also,
the parent of any element at index i is given by the floor of (i-1)/2.

Example
Given array elements: [1, 12, 9, 5, 6]

Steps to convert array elements to Heap


Left child of 1 (index 0) Right child of 1
= element in (2*0+1) index = element in (2*0+2) index
= element in 1 index = element in 2 index
= 12 = 9
Left child of 12 (index 1) Right child of 12
= element in (2*1+1) index = element in (2*1+2) index
= element in 3 index = element in 4 index
= 5 = 6

Rules to find parent of any node


Parent of 9 (position 2)          Parent of 12 (position 1)
= (2-1)/2 = 0.5 ≈ index 0         = (1-1)/2 = index 0
= element 1                       = element 1
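These index relationships can be checked with small helper functions (the names are our own) against the example above:

```python
def left(i):   return 2 * i + 1       # index of the left child
def right(i):  return 2 * i + 2       # index of the right child
def parent(i): return (i - 1) // 2    # floor of (i-1)/2

heap = [1, 12, 9, 5, 6]               # the array from the example above
print(heap[left(0)], heap[right(0)])  # children of 1  -> 12 9
print(heap[left(1)], heap[right(1)])  # children of 12 -> 5 6
print(parent(2), parent(1))           # parents of 9 and 12 -> 0 0
```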

3. Heap Data Structure


Heap is a special tree-based data structure. A binary tree is said to follow a
heap data structure if

 it is a complete binary tree

 All nodes in the tree follow the property that they are greater than their
children, i.e. the largest element is at the root and both its children are
smaller than the root, and so on. Such a heap is called a max-heap. If
instead all nodes are smaller than their children, it is called a min-heap

Max Heap and Min Heap

4. “Heapify” process
 Starting from a complete binary tree, we can modify it to become a
Max-Heap by running a function called heapify on all the non-leaf elements
of the heap. Heapify process uses recursion.

Pseudocode
heapify(array)
    Root = array[0]
    Largest = largest(array[0], array[2*0 + 1], array[2*0 + 2])
    if (Root != Largest)
        Swap(Root, Largest)

 The tree rooted at the top element isn’t a max-heap, but all its sub-trees
are max-heaps. To maintain the max-heap property for the entire tree, we
keep pushing the out-of-place root element downwards until it reaches its
correct position.
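A minimal sketch of this push-down step (often called sift-down; the helper name is our own) on a small array whose root is out of place while both sub-trees are already max-heaps:

```python
def sift_down(heap, i, n):
    """Push the element at index i down until the subtree rooted at i is a max-heap.
    Assumes both child subtrees of i already satisfy the max-heap property."""
    largest = i
    l, r = 2 * i + 1, 2 * i + 2
    if l < n and heap[l] > heap[largest]:
        largest = l
    if r < n and heap[r] > heap[largest]:
        largest = r
    if largest != i:
        # Swap downwards and continue from the child we swapped into
        heap[i], heap[largest] = heap[largest], heap[i]
        sift_down(heap, largest, n)

h = [2, 12, 9, 5, 6]      # root 2 violates the property; sub-trees are heaps
sift_down(h, 0, len(h))
print(h)                  # [12, 6, 9, 5, 2]
```

The root value 2 sinks past 12 and then past 6, after which every node is greater than its children.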

Steps
Step 1: Construct a Binary Tree with the given list of elements.

Step 2: Transform the Binary Tree into a Max Heap.

Step 3: Since the tree satisfies the Max-Heap property, the largest item is stored
at the root node. Three operations at each step are -

• Swap: Remove the root element and put it at the end of the array (nth
position). Put the last item of the tree (heap) at the vacant place.

• Remove: Reduce the size of the heap by 1.

• Heapify: Heapify the root element again so that we have the highest
element at root.

Step 4: Put the removed element into the Sorted list.

Step 5: Repeat the same until the Max Heap becomes empty.

Step 6: Display the sorted list.

Working of Heap Sort Algorithm

Example:
Construct binary heap with the given list of elements

Given array elements: [81, 89, 9, 11, 14, 76, 54, 22]

Convert the constructed heap to max heap using heapify algorithm

After converting the given heap into max heap, the array elements are -

0 1 2 3 4 5 6 7
89 81 76 22 14 9 54 11

Next, we have to delete the root element (89) from the max heap. To delete
this node, we have to swap it with the last node, i.e. (11). After deleting the root
element, we again have to heapify it to convert it into max heap.

After swapping the array element 89 with 11, and converting the heap into
max-heap, the elements of array are –

0 1 2 3 4 5 6 7
81 22 76 11 14 9 54 89

In the next step, again, we have to delete the root element (81) from the max
heap. To delete this node, we have to swap it with the last node, i.e. (54). After
deleting the root element, we again have to heapify it to convert it into max heap.

After swapping the array element 81 with 54 and converting the heap into
max-heap, the elements of array are –

0 1 2 3 4 5 6 7
76 22 54 11 14 9 81 89

In the next step, we have to delete the root element (76) from the max heap
again. To delete this node, we have to swap it with the last node, i.e. (9). After
deleting the root element, we again have to heapify it to convert it into max heap.

After swapping the array element 76 with 9 and converting the heap into
max-heap, the elements of array are –

0 1 2 3 4 5 6 7
54 22 9 11 14 76 81 89

In the next step, again we have to delete the root element (54) from the max
heap. To delete this node, we have to swap it with the last node, i.e. (14). After
deleting the root element, we again have to heapify it to convert it into max heap.

After swapping the array element 54 with 14 and converting the heap into
max-heap, the elements of array are –

0 1 2 3 4 5 6 7
22 14 9 11 54 76 81 89

In the next step, again we have to delete the root element (22) from the max
heap. To delete this node, we have to swap it with the last node, i.e. (11). After
deleting the root element, we again have to heapify it to convert it into max heap.

After swapping the array element 22 with 11 and converting the heap into
max-heap, the elements of array are –

0 1 2 3 4 5 6 7
14 11 9 22 54 76 81 89

In the next step, again we have to delete the root element (14) from the max
heap. To delete this node, we have to swap it with the last node, i.e. (9). After
deleting the root element, we again have to heapify it to convert it into max heap.

After swapping the array element 14 with 9 and converting the heap into
max-heap, the elements of array are –

0 1 2 3 4 5 6 7
11 9 14 22 54 76 81 89

In the next step, again we have to delete the root element (11) from the max
heap. To delete this node, we have to swap it with the last node, i.e. (9). After
deleting the root element, we again have to heapify it to convert it into max heap.

After swapping the array element 11 with 9, the elements of array are –

0 1 2 3 4 5 6 7
9 11 14 22 54 76 81 89

Now, the heap has only one element left. After deleting it, the heap will be empty.

After completion of sorting, the array elements are –

0 1 2 3 4 5 6 7
9 11 14 22 54 76 81 89

Now, the array is completely sorted.
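Every deletion step traced above repeats the same two operations: swap the root with the last element of the heap region, then sift the new root down. The following minimal sketch (variable and function names are illustrative, not from the program below) replays the trace starting from the state 81 22 76 11 14 9 54 89, printing the array after each step so it can be compared against the tables above.

```python
def sift_down(a, n, i):
    # Restore the max-heap property in a[0:n], sifting down from index i.
    while True:
        largest, left, right = i, 2 * i + 1, 2 * i + 2
        if left < n and a[left] > a[largest]:
            largest = left
        if right < n and a[right] > a[largest]:
            largest = right
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest

a = [81, 22, 76, 11, 14, 9, 54, 89]   # 89 is already in its final place
for end in range(6, 0, -1):           # shrink the heap region one slot at a time
    a[0], a[end] = a[end], a[0]       # move the current maximum to the end
    sift_down(a, end, 0)              # re-heapify the remaining elements
    print(a)                          # matches the tables above, step by step
```

Each printed line corresponds to one of the array states shown above, ending with the fully sorted array.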



# 5. Python program to sort the elements in the list using Heap sort
def heapify(array, a, b):
    # Sift array[b] down until the subtree rooted at index b
    # (within array[0:a]) satisfies the max-heap property.
    largest = b
    l = 2 * b + 1          # left child index
    r = 2 * b + 2          # right child index
    if l < a and array[largest] < array[l]:
        largest = l
    if r < a and array[largest] < array[r]:
        largest = r
    if largest != b:
        # Swap the parent with its larger child and continue sifting down.
        array[b], array[largest] = array[largest], array[b]
        heapify(array, a, largest)

# Sort an array of given size
def Heap_Sort(array):
    a = len(array)
    # Build the max-heap, starting from the last internal node.
    for b in range(a // 2 - 1, -1, -1):
        heapify(array, a, b)
    # Repeatedly move the maximum (the root) to the end and re-heapify the rest.
    for b in range(a - 1, 0, -1):
        array[b], array[0] = array[0], array[b]
        heapify(array, b, 0)

array = [81, 89, 9, 11, 14, 76, 54, 22]
print("Original Array :", array)
Heap_Sort(array)
print("Sorted Array : ", array)

6. Complexity Analysis of Heap sort

Time Complexity
 Best case complexity - O(nlogn)
Even when the array is already sorted, heap sort still builds the heap and performs n - 1 delete-and-heapify steps, so the best case remains O(nlogn).

 Worst case complexity - O(nlogn)
Even when the array elements are in reverse order (for example, we need to sort the elements in ascending order but they are given in descending order), each of the n deletions costs at most O(logn), so the worst case is O(nlogn).

 Average case complexity - O(nlogn)
When the array elements are in jumbled order, neither properly ascending nor properly descending, the cost is likewise O(nlogn).

Space Complexity
 Space complexity of Heap sort is O(1), since sorting is done in place. (A recursive heapify uses O(logn) stack space; writing it iteratively brings this down to O(1).)
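These bounds can also be checked empirically. The sketch below (an illustration, with names of my own choosing) counts element comparisons while heap-sorting sorted, reverse-sorted and shuffled inputs of the same size; the counts come out roughly equal, consistent with O(nlogn) behaviour in every case.

```python
import random

def heap_sort_comparisons(a):
    # Heap sort that returns the number of element comparisons performed.
    count = 0
    def sift(n, i):
        # Iterative sift-down within a[0:n], counting each comparison.
        nonlocal count
        while True:
            largest, left, right = i, 2 * i + 1, 2 * i + 2
            if left < n:
                count += 1
                if a[left] > a[largest]:
                    largest = left
            if right < n:
                count += 1
                if a[right] > a[largest]:
                    largest = right
            if largest == i:
                return
            a[i], a[largest] = a[largest], a[i]
            i = largest
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):   # build the max-heap
        sift(n, i)
    for end in range(n - 1, 0, -1):       # repeated delete-max
        a[0], a[end] = a[end], a[0]
        sift(end, 0)
    return count

n = 1024
for name, data in [("sorted ", list(range(n))),
                   ("reverse", list(range(n, 0, -1))),
                   ("random ", random.sample(range(n), n))]:
    print(name, heap_sort_comparisons(data))
```

All three counts land in the same narrow band near 2·n·log₂n, unlike insertion sort, whose cost swings between O(n) and O(n²) depending on the input order.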

7. Comparative Analysis

                 ------------- Time Complexity -------------
Algorithm        Best case     Worst case    Average case    Space Complexity
Insertion sort   O(n)          O(n²)         O(n²)           O(1)
Heap Sort        O(nlogn)      O(nlogn)      O(nlogn)        O(1)

 IMPORTANT QUESTIONS

PART - A QUESTIONS

1. Define time complexity and space complexity. Write an algorithm for adding n
natural numbers and find the space required by that algorithm.
2. List the steps to write an algorithm.
3. Define Big ‘Oh’ notation.
4. Differentiate between best, average and worst case efficiency.
5. Define recurrence relation.
6. How do you measure the efficiency of an algorithm?
7. Write an algorithm to find the area and circumference of a circle.
8. How do you measure an algorithm’s running time?
9. List the desirable properties of algorithms.
10. Write the recursive Fibonacci algorithm and its recurrence relation.
11. Write an algorithm to compute the GCD of two numbers.

PART - B QUESTIONS
1. Discuss the concept of asymptotic notations and their properties.
2. What is the divide and conquer strategy? Explain the binary search problem in detail.
3. Solve the following using a Brute-Force algorithm:
4. Find whether the given string follows the specified pattern and return 0 or 1
accordingly.

EXAMPLES
Pattern: “abba”, input ”redblueredblue” should return 1

Pattern: “aaaa”, input ”asdasdasdasd” should return 1

Pattern: “aabb”, input ”xyzabcxyzabc” should return 0
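Part-B question 4 can be solved by brute-force backtracking: for each new pattern letter, try every possible non-empty substring of the input as its value, and undo the choice when a later position fails to match. The sketch below is illustrative (the function name is my own, and it assumes distinct pattern letters must map to distinct substrings).

```python
def matches(pattern, text):
    # Return 1 if text can be split so that equal pattern letters
    # correspond to equal non-empty substrings, else 0.
    def solve(pi, ti, mapping, used):
        if pi == len(pattern):
            return ti == len(text)          # both consumed together
        ch = pattern[pi]
        if ch in mapping:
            piece = mapping[ch]
            # A letter seen before must reproduce its substring here.
            if text.startswith(piece, ti):
                return solve(pi + 1, ti + len(piece), mapping, used)
            return False
        # New letter: brute-force every possible non-empty substring.
        for end in range(ti + 1, len(text) + 1):
            piece = text[ti:end]
            if piece in used:               # keep the mapping one-to-one
                continue
            mapping[ch] = piece
            used.add(piece)
            if solve(pi + 1, end, mapping, used):
                return True
            del mapping[ch]                 # backtrack
            used.remove(piece)
        return False
    return 1 if solve(0, 0, {}, set()) else 0

print(matches("aaaa", "asdasdasdasd"))    # 1  (a = "asd")
print(matches("aabb", "xyzabcxyzabc"))    # 0
print(matches("abba", "redbluebluered"))  # 1  (a = "red", b = "blue")
print(matches("abab", "redblueredblue"))  # 1
```

Note that under this reading, "redbluebluered" is the input that follows "abba" with a = "red" and b = "blue", while "redblueredblue" follows the pattern "abab".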
