
[23A05302T] ADVANCED DATA STRUCTURES & ALGORITHM ANALYSIS

Academic Year: 2024 – 25

Class & Branch: II B.Tech, I Semester

Common to CSE, CSE-AI, CSE-IOT, AIML

Subject Name: ADVANCED DATA STRUCTURES & ALGORITHM ANALYSIS

Subject Code: 23A05302T

Regulation: R23

II/IV B.TECH, I SEM, 2024-25, PBR VITS (Autonomous): Kavali



(23A05302T) ADVANCED DATA STRUCTURES & ALGORITHM ANALYSIS

[Common To CSE, CSE-AI, CSE-IOT & AIML Branches]

Course Outcomes: After completion of the course, students will be able to


CO1: Illustrate the working of advanced tree data structures, analyze the complexity
of algorithms and apply asymptotic notations. (L4)
CO2: Understand the graph data structure and apply the divide and conquer strategy for
various applications. (L3)
CO3: Analyze the efficiency of Greedy and Dynamic Programming design techniques to
solve optimization problems. (L4)
CO4: Apply Backtracking and Branch-and-Bound algorithms for various applications. (L3)
CO5: Define and classify deterministic and non-deterministic algorithms, and NP-Hard and
NP-Complete problems. (L4)

UNIT – I:
Introduction to Algorithm Analysis, Space and Time Complexity analysis, Asymptotic
Notations. AVL Trees – Creation, Insertion, Deletion operations and Applications. B-Trees –
Creation, Insertion, Deletion operations and Applications.

UNIT – II:
Heap Trees (Priority Queues) – Min and Max Heaps, Operations and Applications.
Graphs – Terminology, Representations, Basic Search and Traversals, Connected
Components and Biconnected Components, applications.
Divide and Conquer: The General Method, Quick Sort, Merge Sort, Strassen’s matrix
multiplication, Convex Hull.

UNIT – III:
Greedy Method: General Method, Job Sequencing with deadlines, Knapsack Problem,
Minimum cost spanning trees, Single Source Shortest Paths.
Dynamic Programming: General Method, All pairs shortest paths, Single Source Shortest
Paths – General Weights (Bellman Ford Algorithm), Optimal Binary Search Trees, 0/1
Knapsack, String Editing, Travelling Salesperson problem.

UNIT – IV:
Backtracking: General Method, 8-Queens Problem, Sum of Subsets problem, Graph
Coloring, 0/1 Knapsack Problem.
Branch and Bound: The General Method, 0/1 Knapsack Problem, Travelling Salesperson
problem.

UNIT – V:
NP Hard and NP Complete Problems: Basic Concepts, Cook’s theorem.
NP Hard Graph Problems: Clique Decision Problem (CDP), Chromatic Number Decision
Problem (CNDP), Traveling Salesperson Decision Problem (TSP).
NP Hard Scheduling Problems: Scheduling Identical Processors, Job Shop Scheduling.


Text Books

1. Fundamentals of Data Structures in C++, Ellis Horowitz, Sartaj Sahni and Dinesh Mehta,
2nd Edition, Universities Press.
2. Computer Algorithms/C++, Ellis Horowitz, Sartaj Sahni and Sanguthevar Rajasekaran,
2nd Edition, Universities Press.

Reference Books

1. Data Structures and Program Design in C, Robert Kruse, Pearson Education Asia.
2. An Introduction to Data Structures with Applications, Tremblay & Sorenson,
McGraw-Hill.
3. The Art of Computer Programming, Vol. 1: Fundamental Algorithms, Donald E.
Knuth, Addison-Wesley, 1997.
4. Data Structures Using C & C++, Langsam, Augenstein & Tenenbaum, Pearson, 1995.
5. Algorithms + Data Structures = Programs, N. Wirth, PHI.
6. Fundamentals of Data Structures in C++, Horowitz, Sahni & Mehta, Galgotia Publications.
7. Data Structures in Java, Thomas Standish, Pearson Education Asia.

Online Learning Resources

1. https://www.tutorialspoint.com/advanced_data_structures/index.asp
2. http://peterindia.net/Algorithms.html
3. Abdul Bari, Introduction to Algorithms (youtube.com)

***


UNIT – I

Introduction to Algorithm Analysis, Space and Time Complexity Analysis, Asymptotic
Notations. AVL Trees – Creation, Insertion, Deletion operations and Applications. B-Trees
– Creation, Insertion, Deletion operations and Applications.
***

ALGORITHM

An algorithm is a step-by-step procedure for solving a given problem statement.

Algorithms are designed using pseudo code. Pseudo code is a language-independent notation.
All algorithms must satisfy the following characteristics.

 Input: Values supplied for processing the instructions are known as input
values. Here, zero or more quantities are externally supplied.
 Output: Values generated after processing the instructions are known as output
values. Here, at least one quantity must be produced.
 Definiteness: Each instruction must be clear and unambiguous.
Example: Add 10 to X [ Valid Statement ]
Add 10 or 20 to X [ Invalid Statement ]
 Finiteness: If we trace out the instructions, the algorithm must terminate after a
finite number of steps.
 Effectiveness: Every instruction must be basic enough to be carried out.

Example: 1. Addition of given two numbers

Step 1: START
Step 2: READ x, y
Step 3: sum ← x + y
Step 4: WRITE sum
Step 5: STOP

The study of algorithms includes four important areas:

a) How to devise algorithms: Algorithms are designed by using design strategies like
the Divide-and-Conquer strategy, the Greedy method, Dynamic Programming, Branch and
Bound, etc.

b) How to validate algorithms: Once an algorithm is designed, it is necessary to show
that it computes the correct answer for all possible legal inputs. This process is known
as algorithm validation.
After validation, the algorithm is converted into a program; showing the program correct
is referred to as program proving/verification.

c) How to analyze algorithms: As an algorithm is executed, it uses the computer's
central processing unit to perform operations and its memory to hold the program and
data.


Analysis of algorithms, or performance analysis, refers to the task of determining
how much computing time and memory space an algorithm requires.

d) How to test a program: Testing a program consists of two phases: debugging and
profiling.
Debugging is the process of detecting and correcting errors so that the program
executes properly.
Profiling/performance measurement is the process of executing a correct program
on data sets and measuring the time and space it takes to compute the results.

Performance evaluation thus includes performance analysis and performance
measurement. In performance analysis, the estimates of time and space are
machine independent, whereas in performance measurement they are machine dependent.

PERFORMANCE ANALYSIS / ANALYSIS OF ALGORITHMS

Any problem statement can be analyzed at two stages:

 Apriori analysis
 Posteriori analysis

Apriori analysis means the analysis is done for the problem before it is run on a
machine. Posteriori analysis is done after running the problem in a specific programming
language.

Analysis of algorithms (or performance analysis) refers to the task of determining
how much computing time (time complexity) and storage (space complexity) an
algorithm requires.

Algorithm efficiency describes the properties of an algorithm that relate to the
amount of resources it uses. An algorithm must be analyzed to determine its resource usage.

The time complexity of an algorithm is the amount of computer time it needs to run
to completion. The space complexity of an algorithm is the amount of memory it needs
to run to completion.

These complexities are calculated based on the size of the input. Accordingly, the analysis
can be divided into three cases: best case analysis, worst case analysis and average case
analysis.

Best case analysis: In best case analysis, the problem instance takes the minimum
number of computations for the given input parameters.

Worst case analysis: In worst case analysis, the problem instance takes the
maximum number of computations for the given input parameters.

Average case analysis: In average case analysis, the problem instance takes an
average number of computations over the given input parameters.


SPACE COMPLEXITY

The process of estimating the amount of memory space a program needs to run to
completion is known as space complexity analysis.

The space complexity S(P) of any program P is the sum of its fixed space requirements
and its variable space requirements:

Space requirement S(P) = Fixed Space Requirements + Variable Space Requirements

1. Fixed space that is independent of the characteristics (Ex: number, size) of the input
and outputs. It includes the instruction space, space for simple variables and fixed-
size component variables, space for constants and so on.

2. Variable space that consists of the space needed by component variables whose size is
dependent on the particular problem instance being solved, the space needed by the
referenced variables and the recursion stack space.

When analyzing the space complexity of any program, concentrate solely
on estimating the variable space requirements. First determine which instance characteristics
to use to measure the space requirements. Hence, the total space requirement S(P) of any
program can be represented as:

S(P) = C + SP(I)

Where C is a constant representing the fixed space requirements and SP(I) is the
variable space requirement for the instance characteristics I.

Example 1: Algorithm sum(a, b, c)


// a, b and c are ordinary variables
{
return a + b + c;
}

Here, the variables a, b and c are simple variables, so no instance-dependent space
is needed.

Therefore Ssum = 0.

Example 2: Algorithm sum(K, n)


// K is an array with a maximum size as n
{
s : = 0;
for i : = 1 to n do
s : = s + K[i];
return s;
}

Here, the instance characteristic is n. Space is needed for the array K (n words) and
one word each for n, i and s.
Therefore Ssum (n) = n + 3.


Example 3: Algorithm Rsum(K, n)


// K is an array with a maximum size as n
{
if (n ≤ 0)
return 0.0;
else
return Rsum(K, n-1) + K[n];
}

Here, the instance characteristic is n. Each level of recursion needs space for the formal
parameters, local variables and the return address: one word for n, one for the return
address and one for the array reference K. The depth of recursion is n + 1, and each
call requires these three words.

Therefore SRsum (n) = 3(n + 1).

TIME COMPLEXITY

The process of estimating the amount of computing time a program needs to run to
completion is known as time complexity analysis.

The time T(P) taken by a program P is the sum of its compile time and its run time.

i.e., Time complexity T(P) = Compile time + Run time

Here, compile time is a fixed component and does not depend on the instance
characteristics. Hence,

T(P) = C + TP (instance characteristics)

Where C is a fixed constant value, and

T(P) ≥ TP(I)

Where I refers to the instance characteristics.

The time complexity of a program is calculated by determining the number of steps that
the program/function needs to solve an instance, known as the step count, and then
expressing it in terms of asymptotic notations.


STEP COUNT METHOD

In the step count method, we determine the number of steps that a program or function
needs to solve a particular instance by creating a global variable count with an initial
value of 0. Count is incremented by the number of program steps required by each executable
statement. Finally, the total number of times the count variable is incremented gives the
step count.

Example 1: Algorithm sum(a, b, c) count = 0


// a, b and c are ordinary variables
{
return a + b + c; count = count + 1
} count = count + 1

Here, the count variable is incremented twice: once for the addition operation and once
for the return statement.

Therefore Tsum = 2.

Example 2: Algorithm sum(K, n) count = 0


// K is an array with a maximum size as n
{
s : = 0; count = count + 1 /* Assignment */
for i : = 1 to n do
{ count = count + 1 /* i Assignment */
s : = s + K[i]; count = count + 1 /* s Assignment */
} count = count + 1 /* i last assignment */
return s; count = count + 1 /* return statement */
}

Here, inside the loop the count variable is incremented twice and the loop executes
n times, giving 2n steps. Outside the loop, count is incremented 3 more times.

Therefore Tsum (n) = 2n + 3.
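The count instrumentation above can be reproduced directly in Python. This is a sketch; the function name `sum_with_count` is mine, not from the text:

```python
def sum_with_count(K):
    """Sum the list K, incrementing count exactly as the step-count
    method prescribes for the iterative sum algorithm."""
    count = 0
    s = 0
    count += 1              # s := 0 assignment
    for x in K:
        count += 1          # loop-control assignment/test of i
        s += x
        count += 1          # s := s + K[i] assignment
    count += 1              # final loop test that exits the loop
    count += 1              # return statement
    return s, count
```

For a list of n elements the counter ends at 2n + 3, matching the analysis above.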

Example 3: Algorithm Rsum(K, n) count = 0


// K is an array with a maximum size as n
{
if (n ≤ 0) count = count + 1
return 0.0; count = count + 1
else
return Rsum(K, n-1) + K[n]; count = count + 1
}

Here, if n = 0 then count is incremented by 2 steps. Otherwise it is also incremented by
2 steps, plus the step count of the recursive call Rsum(K, n-1).


Therefore TRsum (n) = 2                  ; if n = 0
                    = 2 + TRsum (n-1)    ; if n > 0

These recursive formulas are referred to as recurrence relations. Recurrence
relations are solved by making repeated substitutions for each occurrence of the recursive
term until all such occurrences disappear.

TRsum (n) = 2 + TRsum (n-1)
          = 2 + 2 + TRsum (n-2)
          = 2 + 2 + 2 + TRsum (n-3)
          = 3(2) + TRsum (n-3)
          ...
          = n(2) + TRsum (n-n)
          = 2n + TRsum (0)
          = 2n + 2

Therefore TRsum (n) = 2n + 2.
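The closed form can be cross-checked by evaluating the recurrence directly. A small Python sketch (the function name `rsum_steps` is mine, not from the text):

```python
def rsum_steps(n):
    """Step count of the recursive sum, following the recurrence
    T(0) = 2 and T(n) = 2 + T(n-1) derived in the text."""
    if n == 0:
        return 2
    return 2 + rsum_steps(n - 1)
```

Evaluating the recurrence for any n agrees with the closed form 2n + 2.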

STEP COUNT TABLE METHOD

Another way to obtain the step count is the tabular method. In this method:

 First determine the step count of each statement, known as steps per execution (s/e).
 Note the number of times that each statement is executed, known as its frequency.
The frequency of a non-executable statement is 0.
 Multiplying s/e by frequency gives the total steps for each statement.
 Finally, adding these totals gives the step count of the entire function.

Example 1:

STATEMENT                    s/e    Frequency    Total Steps

Algorithm ADD(X, Y)           0         0             0
{                             0         0             0
    z := X + Y;               1         1             1
    return z;                 1         1             1
}                             0         0             0
STEP COUNT VALUE                                      2


Example 2:

STATEMENT                    s/e    Frequency    Total Steps

Algorithm ADD(K, n)           0         0             0
{                             0         0             0
    s := 0;                   1         1             1
    for i := 1 to n do        1        n+1           n+1
        s := s + K[i];        1         n             n
    return s;                 1         1             1
}                             0         0             0
STEP COUNT VALUE                                     2n+3

Example 3:

STATEMENT                            s/e    Frequency    Total Steps
                                                         n=0    n>0
Algorithm ADD(K, n)                   0         0         0      0
{                                     0         0         0      0
    if (n == 0)                       1         1         1      1
        return 0;                     1         1         1      -
    else                              0         0         -      0
        return K[n] + ADD(K, n-1);   1+x        1         -     1+x
}                                     0         0         0      0
STEP COUNT VALUE                                          2     2+x

Where x = TADD (n-1)

Based on the size of the input, these complexities can vary. Hence an exact
representation of time and space complexities is not possible, but they can be expressed
approximately using mathematical notations known as asymptotic notations.

ASYMPTOTIC NOTATIONS

Asymptotic notations are mathematical representations of time and space
complexities in approximate form. The notations are:

 Big ‘Oh’ notation
 Omega notation
 Theta notation
 Little ‘Oh’ notation
 Little omega notation


Big ‘Oh’ notation (O):

The function f(n) = O(g(n)) iff there exist two positive constants c and n0 such that
f(n) ≤ c * g(n) for all n, n ≥ n0.

In a graph of n values on the X-axis against f(n) values on the Y-axis, the curve f(n)
always lies below the estimated curve c*g(n) for all n ≥ n0. Thus, the function g(n)
acts as an upper bound for the function f(n). Hence, Big ‘Oh’ notation is treated as an
“Upper Bounded Function”.

Example:

Consider f(n) = 3n+2

Assume that 3n+2 ≤ 4n

Let n = 1:  3(1) + 2 ≤ 4(1) → 5 ≤ 4    FALSE
    n = 2:  3(2) + 2 ≤ 4(2) → 8 ≤ 8    TRUE
    n = 3:  3(3) + 2 ≤ 4(3) → 11 ≤ 12  TRUE
    ...

From this,
c=4 g(n) = n and n0 = 2

Hence, the function 3n+2 = O(n) iff there exist two positive constants 4 and 2 such
that 3n+2 ≤ 4n for all n, n ≥ 2.

Example: The function n² + n + 3 = O(n²) iff there exist two positive constants 2 and 3
such that n² + n + 3 ≤ 2n² for all n, n ≥ 3.

In these complexities,
O(1) means constant
O(log n) means logarithmic
O(n) means linear
O(n²) means quadratic
O(n³) means cubic
O(2ⁿ) means exponential.

O(1) < O(log n) < O(n) < O(n log n) < O(n²) < O(n³) < - - - - - - - < O(2ⁿ).
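This ordering can be checked numerically. A small Python sketch (the function name `growth_values` is mine, not from the text) evaluates each complexity function at a single n:

```python
import math

def growth_values(n):
    """Evaluate the common complexity functions at n, in the order
    O(1), O(log n), O(n), O(n log n), O(n^2), O(n^3), O(2^n)."""
    return [1, math.log2(n), n, n * math.log2(n), n ** 2, n ** 3, 2 ** n]

# For a reasonably large n the values are strictly increasing,
# mirroring the chain of complexity classes above.
print(growth_values(64))
```

At n = 64 the list is 1, 6, 64, 384, 4096, 262144, 2⁶⁴, in strictly increasing order.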


Omega notation (Ω):

The function f(n) = Ω(g(n)) iff there exist two positive constants c and n0 such that
f(n) ≥ c * g(n) for all n, n ≥ n0.

In a graph of n values on the X-axis against f(n) values on the Y-axis, the curve f(n)
always lies above the estimated curve c*g(n) for all n ≥ n0. Thus, the function g(n)
acts as a lower bound for the function f(n). Hence, Omega notation is treated as a
“Lower Bounded Function”.

Example:

Consider f(n) = 3n+2

Assume that 3n+2 ≥ 3n

Let n = 1:  3(1) + 2 ≥ 3(1) → 5 ≥ 3  TRUE
    n = 2:  3(2) + 2 ≥ 3(2) → 8 ≥ 6  TRUE
    ...

From this,
c=3 g(n) = n and n0 = 1

Hence, the function 3n+2 = Ω (n) iff there exist two positive constants 3 and 1 such
that 3n+2 ≥ 3n for all n, n ≥ 1.

Theta notation (Ө):

The function f(n) = Ө(g(n)) iff there exist three positive constants c1, c2 and n0 such
that c1 * g(n) ≤ f(n) ≤ c2 * g(n) for all n, n ≥ n0.

In a graph of n values on the X-axis against f(n) values on the Y-axis, the curve f(n)
always lies between the estimated curves c1*g(n) and c2*g(n) for all n ≥ n0. Thus, the
function g(n) acts as both a lower bound and an upper bound for the function f(n).
Hence, Theta notation is treated as a “Bounded Function”.


Example:

Consider f(n) = 3n+2

Assume that 3n ≤ 3n+2 ≤ 4n

Let n = 1:  3(1) ≤ 3(1) + 2 ≤ 4(1) → 3 ≤ 5 ≤ 4    FALSE
    n = 2:  3(2) ≤ 3(2) + 2 ≤ 4(2) → 6 ≤ 8 ≤ 8    TRUE
    n = 3:  3(3) ≤ 3(3) + 2 ≤ 4(3) → 9 ≤ 11 ≤ 12  TRUE
    ...

From this,

c1 = 3 c2 = 4 g(n) = n and n0 = 2

Hence, the function 3n+2 = Ө(n) iff there exist three positive constants 3, 4 and 2
such that 3n ≤ 3n+2 ≤ 4n for all n, n ≥ 2.
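These definitions can be spot-checked mechanically. The Python sketch below is mine, not from the text; a finite scan of n values is evidence for the chosen constants, not a proof:

```python
def holds_big_oh(f, g, c, n0, n_max=10_000):
    """Finite check of the Big 'Oh' condition f(n) <= c*g(n)
    for all n0 <= n < n_max."""
    return all(f(n) <= c * g(n) for n in range(n0, n_max))

def holds_omega(f, g, c, n0, n_max=10_000):
    """Finite check of the Omega condition f(n) >= c*g(n)
    for all n0 <= n < n_max."""
    return all(f(n) >= c * g(n) for n in range(n0, n_max))

def f(n):          # the example function from the text
    return 3 * n + 2

def g(n):
    return n
```

With c = 4 and n0 = 2 the Big ‘Oh’ condition holds (it fails at n = 1, as the worked example shows); with c = 3 and n0 = 1 the Omega condition holds; both together give the Theta bound from n0 = 2.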

Little ‘Oh’ notation (o):

The function f(n) = o(g(n)) iff for every positive constant c there exists an n0 such that
f(n) < c * g(n) for all n, n ≥ n0.

(OR)

The function f(n) = o(g(n)) iff Lim (n→∞) f(n)/g(n) = 0

Little Omega notation (ω):

The function f(n) = ω(g(n)) iff for every positive constant c there exists an n0 such that
f(n) > c * g(n) for all n, n ≥ n0.

(OR)

The function f(n) = ω(g(n)) iff Lim (n→∞) g(n)/f(n) = 0

***


HEIGHT BALANCED TREES

Trees with a worst-case height of O(log n) are called height-balanced trees.

Examples: AVL tree, Red-Black tree, B-tree, etc.

Here, AVL trees and Red-Black trees are useful for internal memory applications,
whereas B-trees are useful for external memory applications.

i) AVL Tree

AVL tree is a height balanced tree introduced in 1962 by Adelson-Velskii and Landis.

“An empty binary tree T is an AVL tree. If T is a non-empty binary tree with TL and TR
as its left and right subtrees, then T is an AVL tree iff:

 TL and TR are also AVL trees.
 | hL – hR | ≤ 1, where hL and hR are the heights of TL and TR respectively.”

In an AVL tree, every node is associated with a value called the balance factor,
denoted for a node x as:

bf (x) = Height of left subtree – Height of right subtree

From the definition of AVL tree, the allowable balance factors are 0, 1 and -1.

Example (balance factor shown beside each node):

                20 (0)
               /      \
         15 (0)        40 (-1)
         /    \             \
     90 (0)  77 (0)        16 (0)

Note: If the AVL tree satisfies the properties of binary search tree, then it is referred to as an
AVL search tree.
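The balance-factor rule can be checked with a small Python sketch. The nested-tuple tree encoding `(value, left, right)` is my own, not from the text:

```python
def height(node):
    """Height of a tree given as (value, left, right) tuples;
    the empty tree (None) has height -1, a leaf has height 0."""
    if node is None:
        return -1
    _, left, right = node
    return 1 + max(height(left), height(right))

def balance_factor(node):
    """bf(x) = height of left subtree - height of right subtree."""
    _, left, right = node
    return height(left) - height(right)

def is_avl(node):
    """True when every node's balance factor is -1, 0 or 1."""
    if node is None:
        return True
    _, left, right = node
    return (abs(height(left) - height(right)) <= 1
            and is_avl(left) and is_avl(right))
```

Applied to the example tree above, the root 20 has bf 0 and the node 40 has bf -1, so the tree is a valid AVL tree.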


OPERATIONS ON AVL SEARCH TREE

Insertion Operation:

Inserting an element into an AVL search tree follows the same procedure as insertion
into a binary search tree. However, the insertion may lead to a situation where the
balance factor of some node becomes other than -1, 0 or 1, and the tree becomes
unbalanced.

If the insertion makes an AVL search tree unbalanced, the heights of the subtrees must
be adjusted by operations called rotations.

For this, let N be the newly inserted node and A the nearest ancestor of N whose balance
factor is 2 or -2. The imbalance rotations are classified into four types:

 LL Rotation: N is in the left subtree of the left subtree of A.
 RR Rotation: N is in the right subtree of the right subtree of A.
 LR Rotation: N is in the right subtree of the left subtree of A.
 RL Rotation: N is in the left subtree of the right subtree of A.

Here, the transformations done to LL and RR imbalances are often called single
rotations, while those done for LR and RL imbalances are called double rotations.

LL Rotation: In LL rotation, every node moves one position clockwise from its current
position.

          A                              B
         /                             /   \
        B         LL Rotation         N     A
       /         ------------>
      N

RR Rotation: In RR rotation, every node moves one position anticlockwise from its
current position.

      A                                  B
       \                               /   \
        B         RR Rotation         A     N
         \       ------------>
          N

II/IV B.TECH, I SEM, 2024-25, PBR VITS (Autonomous]: Kavali Page 15


[23A05302T] ADVANCED DATA STRUCTURES & ALGORITHM ANALYSIS

LR Rotation: The LR rotation is a single left rotation followed by a single right
rotation, i.e., LR Rotation = RR Rotation (at B) + LL Rotation (at A).

        A                     A                        N
       /         RR          /          LL           /   \
      B        ------>      N         ------>       B     A
       \                   /
        N                 B

RL Rotation: The RL rotation is a single right rotation followed by a single left
rotation, i.e., RL Rotation = LL Rotation (at B) + RR Rotation (at A).

      A                   A                            N
       \         LL        \            RR           /   \
        B      ------>      N         ------>       A     B
       /                     \
      N                       B

Example: Create an AVL search tree with the keys: 70 25 47 98 101.
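As a worked version of this example, here is a short Python sketch of AVL search tree insertion. It is my own code, not from the text; duplicate keys go to the right:

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right, self.h = key, None, None, 1

def height(n):
    return n.h if n else 0

def bf(n):
    return height(n.left) - height(n.right)

def update(n):
    n.h = 1 + max(height(n.left), height(n.right))

def rotate_right(a):          # single clockwise rotation (LL case)
    b = a.left
    a.left, b.right = b.right, a
    update(a)
    update(b)
    return b

def rotate_left(a):           # single anticlockwise rotation (RR case)
    b = a.right
    a.right, b.left = b.left, a
    update(a)
    update(b)
    return b

def insert(root, key):
    """Ordinary BST insertion followed by rebalancing on the way up."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    update(root)
    if bf(root) > 1:                      # left-heavy ancestor A
        if key >= root.left.key:          # LR: left-rotate the left child first
            root.left = rotate_left(root.left)
        return rotate_right(root)         # then the LL rotation
    if bf(root) < -1:                     # right-heavy ancestor A
        if key < root.right.key:          # RL: right-rotate the right child first
            root.right = rotate_right(root.right)
        return rotate_left(root)          # then the RR rotation
    return root

def inorder(n):
    return inorder(n.left) + [n.key] + inorder(n.right) if n else []
```

Inserting 70, 25, 47, 98, 101 in order triggers an LR rotation at 70 (making 47 the root) and then an RR rotation at 70 when 101 arrives, leaving 47 as the root with children 25 and 98.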

Deletion Operation:

Deletion of an element from an AVL search tree follows the same procedure as deletion
from a binary search tree. Due to the deletion, the balance factors of some or all of the
nodes on the path may change and the tree may become unbalanced. Rotations are again
required to rebalance it.

In this case, the deleted node is no longer available after the deletion operation.
Here, based on the balance factor of the sibling of the deleted node, the rotations are
classified into six types: L0, L1, L-1 and R0, R1, R-1 rotations.


R0 Rotation: Assume a node is deleted from the right subtree of node C. If, after the
deletion, C's left child B has balance factor 0, then an R0 rotation is used to rebalance
the tree:

          C                                B
        /   \                            /   \
      B(0)   CR        R0 Rotation     BL     C
     /   \            ------------>          /  \
    BL    BR                               BR    CR

R1 Rotation: Assume a node is deleted from the right subtree of node C. If, after the
deletion, C's left child B has balance factor 1, then an R1 rotation is used to rebalance
the tree:

          C                                B
        /   \                            /   \
      B(1)   CR        R1 Rotation     BL     C
     /   \            ------------>          /  \
    BL    BR                               BR    CR

R-1 Rotation: Assume a node is deleted from the right subtree of node C. If, after the
deletion, C's left child A has balance factor -1, then an R-1 rotation (a double rotation)
is used to rebalance the tree:

            C                                    B
          /   \                                /   \
       A(-1)   CR        R-1 Rotation        A       C
       /   \            ------------->      / \     / \
     AL     B                             AL   BL  BR   CR
           / \
         BL   BR

L0 Rotation: Assume a node is deleted from the left subtree of node B. If, after the
deletion, B's right child C has balance factor 0, then an L0 rotation is used to rebalance
the tree:

          B                                 C
        /   \                             /   \
      BL    C(0)       L0 Rotation       B     CR
           /   \      ------------>     / \
         CL    CR                     BL   CL

L1 Rotation: Assume a node is deleted from the left subtree of node B. If, after the
deletion, B's right child C has balance factor 1, then an L1 rotation (a double rotation)
is used to rebalance the tree; C's left child D is raised to the root:

        B                                        D
      /   \                                   /     \
    BL     C(1)          L1 Rotation         B        C
          /   \         ------------>      /  \     /  \
         D     CR                        BL    DL  DR    CR
        / \
      DL   DR

L-1 Rotation: Assume a node is deleted from the left subtree of node B. If, after the
deletion, B's right child C has balance factor -1, then an L-1 rotation is used to
rebalance the tree:

          B                                 C
        /   \                             /   \
      BL    C(-1)      L-1 Rotation      B     CR
           /   \      ------------->    / \
         CL    CR                     BL   CL

Example: Delete 40 from the tree

            70
           /  \
         40    85
              /  \
            75    99

ii) B-TREE

m-way Search Tree:

An m-way search tree T may be an empty tree. If T is non-empty, then it satisfies the
following properties:

 Every node can have a maximum of m subtrees.
 A node with k subtrees contains k-1 elements, where k ≤ m.
 The elements of a node are in ascending order: if a node contains the elements k1,
k2, ..., kn then k1 < k2 < ... < kn. Let its subtrees be c0, c1, ..., cn.
 The elements in c0 are less than k1, the elements in ci lie between ki and ki+1,
and the elements in cn are greater than kn.
 All subtrees are themselves m-way search trees.

Example: 3-way search tree

             [ 20 | 40 ]
           /      |      \
    [10 15]   [25 30]   [45 50]
                 |
               [28]

(28 lies in the middle subtree of [25 30], since 25 < 28 < 30.)

B-TREE Definition

A B-tree of order m is an m-way search tree and it may be empty. If it is non-empty,
then the following properties are satisfied (in terms of extended trees):

 The root node has a minimum of two children and a maximum of m children.
 All internal nodes except the root have a minimum of ⌈m/2⌉ non-empty
children and a maximum of m non-empty children.
 All external nodes are at the same level.

Example: B-tree of order 3 (2-3 tree)

           [ 43 | 75 ]
          /     |     \
     [6 24]  [52 64]  [87]


Note:

1. A B-tree of order 3 is also referred to as a 2-3 tree, since its internal nodes can have
only two or three children.
2. A B-tree of order 4 is also referred to as a 2-3-4 tree (or 2-4 tree), since its internal
nodes can have two, three or four children.

OPERATIONS ON B-TREE

Insertion Operation:

Inserting a new element into a B-tree of order m begins with a search for the proper
location in a node. When the search terminates at a particular node, the insertion falls
into one of two cases:

Case-1: If node X contains space for insertion, the element is inserted in its proper
position and the child pointers are adjusted accordingly.

Example: B-tree of order 5

              [ 40 | 82 ]
            /      |      \
   [11 25 38]  [58 74]  [86 89 93 97]

Insertion (64):

              [ 40 | 82 ]
            /      |      \
   [11 25 38]  [58 64 74]  [86 89 93 97]

Case-2: If node X is full, first insert the element into its list of elements, then split
the node into two nodes at the median value. The elements less than the median become
the left node and those greater than the median become the right node. The median
element is shifted up into the parent node of X. Sometimes this process propagates all
the way up to the root.


Example: B-tree of order 5

              [ 40 | 82 ]
            /      |      \
   [11 25 38]  [58 74]  [86 89 93 97]

Insertion (99):

              [ 40 | 82 | 93 ]
            /     |      |     \
   [11 25 38] [58 74] [86 89] [97 99]
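The split step of Case-2 can be sketched in Python. The helper name `split_full_leaf` is mine; it handles only a leaf's key list, not child pointers:

```python
def split_full_leaf(keys, new_key):
    """Case-2 of B-tree insertion for a leaf: add new_key to an already
    full key list, split at the median, and return
    (left_keys, median, right_keys). The median is the element that is
    shifted up into the parent node."""
    ordered = sorted(keys + [new_key])
    mid = len(ordered) // 2
    return ordered[:mid], ordered[mid], ordered[mid + 1:]
```

For the order-5 example above, inserting 99 into the full leaf [86, 89, 93, 97] yields left node [86, 89], median 93 (shifted into the root), and right node [97, 99].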

Deletion Operation:

Deleting an element from a B-tree of order m is performed according to one of four
cases.

Case-1: If the key is in a leaf node and its deletion does not violate the B-tree
properties, simply delete the key from the node and adjust the child pointers.

Example: B-tree of order 5

              [ 40 | 82 ]
            /      |      \
   [11 25 38]  [58 74]  [86 89 93 97]

Deletion (89):

              [ 40 | 82 ]
            /      |      \
   [11 25 38]  [58 74]  [86 93 97]

Case-2: If the key is in a non-leaf node, replace the key with the largest element of
its left subtree or the smallest element of its right subtree.

Example: B-tree of order 5

              [ 40 | 82 ]
            /      |      \
   [11 25 38]  [58 74]  [86 89 93 97]

Deletion (40): 40 is replaced by 38, the largest element of its left subtree.

              [ 38 | 82 ]
            /      |      \
     [11 25]   [58 74]  [86 89 93 97]

Case-3: If deleting an element k from a node leaves it with fewer than its minimum
number of elements, an element can be borrowed from one of its sibling nodes. If the
left sibling can spare an element, its largest element is shifted into the parent node; if
the right sibling can spare an element, its smallest element is shifted into the parent
node. The intervening element of the parent node is then shifted down to fill the vacancy
created by the deleted element.

Example: B-tree of order 5

              [ 40 | 82 ]
            /      |      \
   [11 25 38]  [58 74]  [86 89 93 97]


Deletion (58): the leaf [58 74] is left with fewer than the minimum number of elements,
so the largest element 38 of the left sibling is shifted into the parent and the
intervening element 40 is shifted down.

              [ 38 | 82 ]
            /      |      \
     [11 25]   [40 74]  [86 89 93 97]

Case-4: If deleting an element leaves a node with fewer than its minimum number of
elements and neither sibling can spare an element, the node is merged with one of its
siblings together with the intervening element from the parent node.

Example: B-tree of order 5

              [ 38 | 82 ]
            /      |      \
     [11 25]   [40 74]  [86 89 93 97]

Deletion (25): the leaf [11] underflows and its sibling cannot spare an element, so it is
merged with [40 74] together with the intervening parent element 38.

                 [ 82 ]
               /        \
    [11 38 40 74]    [86 89 93 97]

END


UNIT – II

Heap Trees (Priority Queues) – Min and Max Heaps, Operations and Applications.
Graphs – Terminology, Representations, Basic Search and Traversals, Connected
Components and Biconnected Components, Applications.
Divide and Conquer: The General Method, Quick Sort, Merge Sort, Strassen’s Matrix
Multiplication, Convex Hull.
***

HEAP TREE

Suppose H is a complete binary tree. Then it is termed a heap tree (max heap) if it
satisfies the following property:

 For each node N in H, the value at N is greater than or equal to the value of each of
the children of N.

Example:

              95
            /    \
          84      48
         /  \
       76    23

In addition to this, a min heap is possible, where the value at N is less than or equal to
the value of each of the children of N.

Example:

              25
            /    \
          45      78
         /  \
       83    71


REPRESENTATION OF A HEAP TREE

Since a heap tree is a complete binary tree, it is best represented with a single
dimensional array; there is then no wastage of memory space between two non-null entries.

Example:

              95                  Array representation:
            /    \
          84      48              index:   1    2    3    4    5
         /  \                     value:  95   84   48   76   23
       76    23
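The array representation relies on simple index arithmetic: for the node stored at index i (counting from 1), the parent is at i/2 (integer division) and the children are at 2i and 2i+1. A Python sketch of these helpers (names mine, with index 0 left unused as a sentinel):

```python
def parent(i):
    """Index of the parent of the node at index i (heap stored from 1)."""
    return i // 2

def left(i):
    return 2 * i

def right(i):
    return 2 * i + 1

def is_max_heap(K):
    """K[0] is unused; K[1..n] holds the heap, as in the example above."""
    n = len(K) - 1
    return all(K[parent(i)] >= K[i] for i in range(2, n + 1))
```

For the example array [95, 84, 48, 76, 23], every node is at least as large as its children, so `is_max_heap` confirms the heap property.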

OPERATIONS ON HEAP TREE

Basic operations on a heap tree are: insertion, deletion, merging etc.

Insertion into a heap tree: This operation inserts a new element into a
heap tree. Let K be an array that stores the n elements of a heap tree, and let the new
element be given in the variable Key. The insertion procedure works as follows:

 First adjoin Key at the end of K, so that the tree is still a complete binary tree but
not necessarily a heap.
 Then raise Key to its appropriate position so that the result is again a heap tree.

The basic principle is: first add the data element to the complete binary tree. Then
compare it with the value in its parent node; if it is greater than the parent, interchange
the two values. This procedure continues between every pair of nodes on the path from
the newly inserted node towards the root, until we get a parent whose value is greater
than its child or we reach the root node.

Example: Insert the value 124 into the existing heap tree

               140
              /    \
            85      45
           /  \    /  \
         75   25  35   15
        /  \
      55   65

124 is first placed as the left child of 25, then raised past 25 and 85 (both
smaller), giving:

               140
              /    \
           124      45
           /  \    /  \
         75   85  35   15
        /  \  /
      55  65 25


Algorithm InHeap (Key): Let K be an array that stores a heap tree with ‘n’
elements. This procedure inserts a new element Key into the heap tree.

Step 1: n : = n+1;
Step 2: K[n] : = Key;
Step 3: i : = n;
p : = i/2;
Step 4: Repeat WHILE p > 0 AND K[p] < K[i]
Temp : = K[i];
K[i] : = K[p];
K[p] : = Temp;
i : = p;
p : = p/2;
End Repeat
Step 5: Return
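The InHeap steps above can be sketched directly in C. This is a minimal sketch,
assuming the heap is kept in a 1-indexed array (index 0 unused, so that the
parent of index i is i/2); the name heap_insert is illustrative.

```c
/* Insert Key into the max heap stored in K[1..*n]; *n grows by one. */
void heap_insert(int K[], int *n, int Key)
{
    int i = ++(*n);                    /* Steps 1-3: adjoin Key at the end */
    K[i] = Key;
    while (i > 1 && K[i / 2] < K[i]) { /* Step 4: raise Key while the      */
        int tmp = K[i];                /* parent holds a smaller value     */
        K[i] = K[i / 2];
        K[i / 2] = tmp;
        i = i / 2;
    }
}
```

Inserting 85, 75 and then 140 into an empty heap leaves 140 at the root, since
it is raised past 85 on the way up.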

Deletion of a node from heap tree: This operation deletes the root element
from the existing heap tree. The deletion procedure works as:

 Assign the root node value to a temporary variable, say Key.

 Replace the root node location by the last node in the heap tree. Then
re-heap the tree as:
o Let the newly modified root node be the current node. Compare its value
with its two children's values, and let X be the child with the larger
value. If X is larger than the current node, interchange the two values.
o Make X the current node.
o Continue the re-heap process as long as the current node has a child with
a larger value.

Example: Delete the root element from the following heap tree

               140
              /    \
            85      45
           /  \    /  \
         75   25  35   15
        /  \
      55   65

140 is removed and replaced by the last node 65, which is then pushed down past
85 and 75, giving:

                85
              /    \
            75      45
           /  \    /  \
         65   25  35   15
        /
      55


Algorithm Deletion ( ): This procedure removes root element of the heap tree
and rearranges the elements into heap tree format.

Step 1: Temp : = K[1];


K[1] : = K[n];
Step 2: n : = n-1;
i : = 1;
Flag : = FALSE;
Step 3: Repeat WHILE Flag = FALSE AND i < n
lchild : = 2 * i;
rchild : = 2 * i + 1;
IF lchild ≤ n THEN
X : = K[lchild];
ELSE
X : = -999;
ENDIF
IF rchild ≤ n THEN
Y: = K[rchild];
ELSE
Y : = -999;
ENDIF
IF K[i] ≥ X AND K[i] ≥ Y THEN
Flag : = TRUE;
ELSEIF X ≥ Y THEN
SWAP(K[i], K[lchild])
i : = lchild;
ELSE
SWAP(K[i], K[rchild])
i : = rchild;
ENDIF
ENDREPEAT
Step 4: RETURN Temp

APPLICATION OF HEAP TREES

Two important applications of heap trees are: Heap sort and Priority queue
implementations.

Heap Sort: Heap sort sorts a list by repeatedly deleting the root of a heap
tree. The procedure of heap sort is as follows:

Step 1: Build a heap tree with the given set of data elements.
Step 2: a) Delete the root node from the heap and place it in last location.
b) Rebuild the heap with the remaining elements.
Step 3: Continue Step-2 until the heap tree is empty.

Example: Sort the elements 33, 28, 65, 12 and 99 using heap sort.

Note: The worst case time complexity of heap sort is O(n logn) time.
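The steps above can be sketched in C as follows; sift_down performs the rebuild
of Step 2(b), and both names are illustrative. The array is 1-indexed with
index 0 unused.

```c
/* Restore the max heap property of K[1..n] downward from index i. */
static void sift_down(int K[], int n, int i)
{
    while (2 * i <= n) {
        int c = 2 * i;                        /* pick the larger child */
        if (c + 1 <= n && K[c + 1] > K[c])
            c = c + 1;
        if (K[i] >= K[c])
            break;                            /* heap property holds   */
        int tmp = K[i]; K[i] = K[c]; K[c] = tmp;
        i = c;
    }
}

/* Heap sort on the 1-indexed array K[1..n], in ascending order. */
void heap_sort(int K[], int n)
{
    for (int i = n / 2; i >= 1; i--)          /* Step 1: build the heap     */
        sift_down(K, n, i);
    for (int i = n; i > 1; i--) {             /* Step 2: move the root to   */
        int tmp = K[1]; K[1] = K[i]; K[i] = tmp;  /* the last location and  */
        sift_down(K, i - 1, 1);               /* rebuild the smaller heap   */
    }
}
```

Sorting the example data 33, 28, 65, 12, 99 this way yields 12, 28, 33, 65, 99
in O(n log n) worst-case time.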


GRAPHS

A graph G=(V,E) consists of a finite non-empty set of vertices V also called as nodes
and a finite set of edges E also called as arcs.

Example: (figure: an undirected graph on vertices a, b, c, d)

Where, V = {a, b, c, d} and E = {e1 , e2 , e3 , e4 , e5}

e1 = (a,b)   e2 = (a,c)   e3 = (b,c)   e4 = (c,d)   e5 = (b,d)

GRAPH TERMINOLOGY

Digraph: A graph in which every edge is directed is called a digraph. A digraph is also
known as a directed graph.

Example: (figure: a directed graph on vertices a, b, c, d)

Where, V = {a, b, c, d} and E = {e1 , e2 , e3 , e4}

e1 = <a,b>   e2 = <a,c>   e3 = <c,b>   e4 = <c,d>

Undirected Graph: A graph in which every edge is undirected is called an undirected


graph. Here, an edge (Vi , Vj) is equivalent to (Vj , Vi).

Example: (figure: an undirected graph on vertices a, b, c)

Where, V = {a, b, c} and E = {e1 , e2 , e3}

e1 = (a,b)   e2 = (a,c)   e3 = (b,c)


Mixed Graph: A graph in which some edges are directed and some edges are
undirected is called a mixed graph.

Example: (figure: a mixed graph on vertices a, b, c, d)

Where, V = {a, b, c, d} and E = {e1 , e2 , e3 , e4}

e1 = (a,b)   e2 = <a,c>   e3 = <c,d>   e4 = (a,d)

Weighted Graph: A graph is termed as a weighted graph if all the edges in it are labeled
with some weight values.
Example: (figure: a weighted graph on vertices a, b, c, d with edge weights
5, 7, 3 and 9)

Adjacent Vertices: A vertex Vi is adjacent to (a neighbour of) another vertex
Vj if there is an edge from Vi to Vj.

Self Loop: If there is an edge whose starting and ending vertices are same is called
as a loop or self loop.

Example: (figure: a graph on vertices a, b, c with edges e1 to e4, where edge
e1 is a self loop at vertex c)

In-degree of a Vertex: The number of edges coming into the vertex Vi is called the in-
degree of vertex Vi.

Out-degree of a Vertex: The number of edges going out from a vertex Vi is called the
out-degree of vertex Vi.


Degree of a Vertex: Sum of out-degree and in-degree of a node V is called the total
degree of the node V and is denoted by degree (V).

Example: (figure: a digraph in which vertex a has one incoming edge and two
outgoing edges)

In-degree (a) = 1        Out-degree (a) = 2

Path: A path from vertex vi to vj is a sequence of vertices vi , vi+1 , - - - ,
vj such that (vi , vi+1) , (vi+1 , vi+2) , - - - are edges in the graph. The
length of the path is the number of edges on it.
Example: (figure: a graph on vertices V1, V2, V3, V4)

Paths from V1 to V4:

P1 = {V1, V2, V4}        P2 = {V1, V3, V4}

Complete Graph:

A graph G is said to be a complete graph if each vertex vi is adjacent to every
other vertex vj in G.

Example: (figure: the complete graph on the four vertices V1, V2, V3, V4)

Note: An n-vertex, undirected graph with exactly n(n-1) / 2 edges is said to be complete
graph.


Connected Graph:

In a graph G, two vertices vi and vj are said to be connected if there is a
path in G from vi to vj.

 An undirected graph is said to be a connected graph if every pair of
distinct vertices vi and vj is connected.

Example: (figure: a connected undirected graph on vertices V1, V2, V3, V4)

 A digraph is said to be a strongly connected graph if for every pair of
distinct vertices vi , vj in G, there is a directed path from vi to vj and
also from vj to vi.

Example: (figure: a strongly connected digraph on vertices V1, V2, V3, V4)

Acyclic Graph: If there is a path containing one or more edges which starts from a
vertex vi and terminates into the same vertex then the path is known as a cycle. If a graph
does not have any cycle then it is called as acyclic graph.

Example: (figure: an acyclic digraph on vertices V1, V2, V3, V4)

Sub Graph: A sub graph of G is a graph G1 such that V(G1) is a subset of V(G)
and E(G1) is a subset of E(G).

Example: (figure: a graph on vertices 0, 1, 2)


For this, some of the sub graphs are:

(figure: three sub graphs formed from subsets of the vertices 0, 1, 2 and
their edges)

GRAPH REPRESENTATIONS

A graph can be represented in many ways. Some of the important representations are:
Set representation, Adjacency matrix representation, Adjacency list representations
etc.,
Set representation:

One of the most straightforward methods of representing any graph is set
representation. In this method two sets are maintained: V as the set of
vertices and E as the set of edges, which is a subset of V x V.
In the case of a weighted graph, V is the set of vertices and E is the set of
weighted edges, which is a subset of W x V x V.
Example: (figure: an undirected graph on vertices V1 to V7)

V(G) = {V1, V2, V3, V4, V5, V6, V7}
E(G) = { (V1,V2), (V1,V3), (V2,V4), (V2,V5), (V3,V4), (V3,V6),
         (V4,V7), (V5,V7), (V6,V7) }

Example: (figure: a weighted digraph on vertices A, B, C, D)

V(G) = {A, B, C, D}
E(G) = { (3,A,C), (5,B,A), (1,B,C), (7,B,D), (2,C,A), (4,C,D),
         (6,D,B), (8,D,C) }


Linked representation:

Linked representation is a space saving way of graph representation. It is also
known as adjacency list representation. In this representation, two types of
node structures are assumed:

Node structure of a non-weighted graph:  | Node Information | Adjacency List |

Node structure of a weighted graph:  | Weight Value | Node Information | Adjacency List |

In linked representation, the number of lists depends on the number of vertices in the
graph.

Example: (adjacency lists of the seven-vertex graph from the previous example)

V1 → V2 → V3
V2 → V1 → V4 → V5
V3 → V1 → V4 → V6
V4 → V2 → V3 → V7
V5 → V2 → V7
V6 → V3 → V7
V7 → V4 → V5 → V6

Example: (figure: the weighted digraph on vertices A, B, C, D from the set
representation example; each list node also carries the edge weight)


A → (3, C)
B → (1, C) → (7, D) → (5, A)
C → (2, A) → (4, D)
D → (6, B) → (8, C)

Sequential representation:

Sequential (matrix) representation is the most widely used way of representing
a graph. For this, different matrices are used, such as the adjacency matrix,
incidence matrix, circuit matrix, cut set matrix, path matrix etc.

Adjacency Matrix Representation

The adjacency matrix representation of a graph G with n vertices is an n × n
matrix A such that each element is defined as:

Aij = 1 ; if there is an edge between vi and vj
    = 0 ; otherwise

Example: (figure: an undirected graph on vertices V1 to V4)

        0  1  1  0
A  =    1  0  1  0
        1  1  0  1
        0  0  1  0

Example: (figure: a digraph on vertices V1 to V4)

        0  1  1  1
A  =    0  0  0  1
        0  1  0  1
        0  0  0  0


GRAPH TRAVERSAL TECHNIQUES

Traversing a graph means visiting all the vertices in the graph exactly once. Graph
traversal techniques are:
a) Breadth First Search (BFS) Traversal
b) Depth First Search (DFS) Traversal

a) Breadth First Search (BFS) Traversal

The traversal starts from a vertex u, which is marked as visited. Then all
unvisited vertices Vi adjacent to u are visited, then the unvisited vertices
Wij adjacent to each Vi, and so on. This process continues till all the
vertices of the graph are visited.

BFS uses a queue data structure to keep track of the vertices whose adjacent
vertices are yet to be visited.

Algorithm BFS (u): This procedure visits all the vertices of the graph starting from the
vertex u.
Step 1: Initialize a queue as Q
Visited(u) : = 1;
Enqueue(Q, u)
Step 2: Repeat WHILE NOT EMPTY(Q)
Dequeue(Q,u);
WRITE (u);
For all vertices V adjacent to u
IF Visited (V) = 0 THEN
Enqueue(Q,V);
Visited (V) : = 1;
ENDIF
EndFor
EndRepeat
Step 3: RETURN

Example: Apply breadth first search traversal for the graph

(figure: a sample graph on the vertices 1 to 10)

Breadth First Search Traversal Output : 1 5 6 7 4 9 10 3 2 8
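The BFS algorithm above can be sketched in C over an adjacency matrix. The
bound MAXV, the array-based queue, and the name bfs are assumptions for
illustration; out[] receives the visit order.

```c
#define MAXV 100

/* Breadth first search from vertex u in an n-vertex graph given as an
 * adjacency matrix; writes the visit order to out[] and returns how many
 * vertices were reached. */
int bfs(int adj[MAXV][MAXV], int n, int u, int out[])
{
    int visited[MAXV] = {0};
    int queue[MAXV];
    int front = 0, rear = 0, count = 0;

    visited[u] = 1;
    queue[rear++] = u;                 /* Enqueue(Q, u)      */
    while (front < rear) {             /* WHILE NOT EMPTY(Q) */
        int v = queue[front++];        /* Dequeue(Q, v)      */
        out[count++] = v;
        for (int w = 0; w < n; w++)    /* all vertices adjacent to v */
            if (adj[v][w] && !visited[w]) {
                visited[w] = 1;
                queue[rear++] = w;
            }
    }
    return count;
}
```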


Analysis: Since each vertex is placed on the queue exactly once, the while
loop is iterated at most n times. For the adjacency list representation, this
loop has a total cost of O(e). For the adjacency matrix representation, the
while loop takes O(n) time for each vertex visited, so the total time is O(n2).

b) Depth First Search (DFS) Traversal

The traversal starts from a vertex u, which is marked as visited. The vertices
Vi adjacent to u are collected and the first unvisited adjacent vertex Vj is
visited. The vertices Wk adjacent to Vj are then collected and the first
unvisited one is visited. The traversal progresses in this way until no more
visits are possible.

DFS uses a stack data structure (here, the recursion stack) to keep track of
the vertices whose adjacent vertices are yet to be visited.

Algorithm DFS (u): This procedure visits all the vertices of the graph starting from the
vertex u.

Step 1: Visited(u) : = 1;
WRITE (u);
Step 2: For each vertex V adjacent to u
IF Visited (V) = 0 THEN
Call DFS(V);
ENDIF
EndFor
Step 3: RETURN

Example: Apply depth first search traversal for the graph

(figure: a sample graph on the vertices A, B, C, D, E)

Depth First Search Traversal Output : A B E C D

Analysis: For the adjacency list representation, scanning the adjacency lists
of all vertices takes O(e) time in total. For the adjacency matrix
representation, determining all vertices adjacent to a vertex requires O(n)
time, so the total time is O(n2).
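The recursive DFS above can be sketched in C; the recursion stack plays the
role of the explicit stack. MAXV, the global Visited array, and the name
dfs_visit are illustrative.

```c
#define MAXV 100

static int Visited[MAXV];   /* assumed cleared to 0 before the first call */

/* Depth first search from vertex u over an adjacency matrix; appends the
 * visit order to out[] through *count. */
void dfs_visit(int adj[MAXV][MAXV], int n, int u, int out[], int *count)
{
    Visited[u] = 1;                    /* Step 1: mark and output u */
    out[(*count)++] = u;
    for (int v = 0; v < n; v++)        /* Step 2: recurse on each   */
        if (adj[u][v] && !Visited[v])  /* unvisited neighbour       */
            dfs_visit(adj, n, v, out, count);
}
```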


CONNECTED COMPONENTS
A connected component of an undirected graph G is a maximal connected subgraph
of G.

Example: (figure: a connected undirected graph on vertices 1 to 4; its only
connected component is the graph itself)

If the graph is connected undirected graph, then we can visit all the vertices of the
graph by using either breadth first search or depth first search. The subgraph which has been
obtained after traversing the graph using either BFS or DFS represents the connected
component of the graph.

Example: (figure: a connected undirected graph on vertices 0 to 7, together
with the spanning trees produced as its connected component by DFS and by BFS)


Functional implementation of connected components is as follows:

void components(G, n)
{
    int i;
    for (i = 0; i < n; i++)
        Visited[i] = 0;
    for (i = 0; i < n; i++)
    {
        if (Visited[i] == 0)
            dfs(i);
    }
}

Analysis:

1. If the graph G is represented by its adjacency lists, then the total time needed to
generate all the connected components is O(n+e) time.
2. If the graph G is represented by its adjacency matrix, then the total time needed to
generate all the connected components is O(n2) time.
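A complete, runnable version of this idea, with the dfs helper and Visited
array filled in and the number of components returned, might look like the
following (adjacency matrix form; the names and the MAXV bound are
illustrative):

```c
#define MAXV 100

static int Visited[MAXV];

static void dfs_cc(int adj[MAXV][MAXV], int n, int u)
{
    Visited[u] = 1;
    for (int v = 0; v < n; v++)
        if (adj[u][v] && !Visited[v])
            dfs_cc(adj, n, v);
}

/* Count the connected components of an undirected n-vertex graph: every
 * DFS started from an unvisited vertex marks exactly one component. */
int components(int adj[MAXV][MAXV], int n)
{
    int count = 0;
    for (int i = 0; i < n; i++)
        Visited[i] = 0;
    for (int i = 0; i < n; i++)
        if (Visited[i] == 0) {
            count++;
            dfs_cc(adj, n, i);
        }
    return count;
}
```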

BICONNECTED COMPONENTS

Articulation Point: A vertex in a connected undirected graph is an articulation
point (or cut vertex) if removing the vertex (and the edges through it)
disconnects the graph into two or more sub graphs.


Biconnected Graph: A biconnected graph is a connected graph that has no
articulation points.

Example: (figure: a biconnected graph on the vertices 1 to 5)


Biconnected Components: A biconnected component of a connected undirected
graph is a maximal biconnected subgraph.

Key observations regarding biconnected components are:

 The biconnected component is a maximal biconnected subgraph.


 Two different biconnected components should not have any common edges.
 Two different biconnected components can have common vertex.
 A common vertex attaching two or more biconnected components is an
articulation point.

Example: (figure: a connected graph on the vertices 0 to 4 and its biconnected
components)

Note: The computing time of biconnected component is O(n+e) time.

APPLICATION OF GRAPHS

Graphs are used to represent networks, road maps, social networks such as
Facebook, etc. In addition, graphs are used in different application areas
which include:

 Shortest path problem


 Topological sorting
 Spanning trees
 Hamilton path etc.


DIVIDE AND CONQUER STRATEGY

Consider a function to compute on n inputs. The divide and conquer strategy
suggests splitting the input into K distinct subsets, 1 ≤ K ≤ n, yielding K sub
problems. These sub problems are solved individually, and the sub solutions are
combined to produce the solution of the given problem statement.
If a sub problem is still large, divide and conquer is reapplied to it, until
the sub problem is small enough to solve without splitting.

Assume P is a problem statement to be solved. Then the control abstraction for
divide and conquer can be shown as:

Algorithm DAndC ( P )
{
    if Small ( P ) then
        return S ( P );
    else
    {
        Divide P into smaller instances P1 , P2 , - - - - , PK ;
        Apply DAndC to each of these sub problems;
        return Combine ( DAndC(P1) , DAndC(P2) , - - - , DAndC(PK) );
    }
}

Where,

 Small(P) is a Boolean-valued function that determines whether the size of the


problem is small or not.
 If the size of the problem is large, then P is divided into small problems as P1 , P2 , - -
- - , PK .
 These sub problems are solved by recursive applications of DAndC.
 Combine is a function that determines the solution to P using sub solutions of K sub
problems.

If the size of P is n and the sizes of the K sub problems are n1, n2, - - - , nK respectively
then the computing time of DAndC is described by the recurrence relation

T(n) = g(n)                                    ; n is small
     = T(n1) + T(n2) + - - - + T(nK) + f(n)    ; otherwise

Where,

 T(n) is the time for DAndC on any input of size n and g(n) is the time to compute the
answer directly for small inputs.
 The function f(n) is the time for dividing P and combining the solutions to sub
problems.


Complexities of many divide and conquer algorithms is given by the recurrence relation
of the form:

T(n) = T(1)              ; n = 1
     = a T(n/b) + f(n)   ; n > 1

Here,
a and b are known constants, T(1) is known and n is a power of b.

In general, recurrence relations are solved by applying the substitution
method. This method repeatedly substitutes for each occurrence of the function
T on the right-hand side until all such occurrences disappear.

Example: Solve the recurrence relation when a = 2, b = 2, T(1) = 2 and f(n) = n.

T(n) = 2 T(n/2) + n
= 2 [ 2 T(n/4) + n/2 ] + n
= 4 T(n/4) + n + n
= 4 T(n/4) + 2n
= 4 [ 2 T(n/8) + n/4 ] + 2n
= 8 T(n/8) + 3n
-
-
-
= 2i T(n/2i) + in for any log 2 n ≥ i ≥ 1.

Let 2i = n

Take log2 on both sides, then

log2 2i = log2 n
i = log2 n

From this,

T(n) = n T(n/n) + (log 2 n) n


= n T(1) + n log 2 n

T(n) = 2 n + n log 2 n

Applications

Some of the important applications of Divide and Conquer strategy are


 Binary Search
 Quick Sort
 Merge Sort
 Strassen’s Matrix Multiplication etc,


QUICK SORT (PARTITION EXCHANGE SORT)

Quick sort is a sorting technique based on the divide-and-conquer strategy. In
this technique, the array elements are divided into two sub arrays around a
specialized element called the “pivot” element.

Let K be an array that consists of ‘n’ elements from index 1 to index n.
Sorting refers to the process of rearranging the given elements of K in
ascending order such that: K[1] ≤ K[2] ≤ . . . . . . . . ≤ K[n]. For this, the
quick sort procedure works as:

Step 1: Initialize the first element as pivot element.


Step 2: Initialize a variable i at the first index and another variable j at last index+1.
Step 3: Increment i value by 1 until K[i] ≥ pivot element.
Step 4: Decrement j value by 1 until K[j] ≤ pivot element.
If i < j Then
Interchange K[i] & K[j]
EndIf
Step 5: Repeat Step 3 and 4 until i ≥ j
Step 6: Interchange the values of K[j] and pivot element.

The above process refers to one pass. At the end of the pass, the pivot element is
positioned at its sorted position. At this stage, the elements before the pivot element are less
than or equal to pivot element and after the pivot element are greater than or equal to the
pivot element.
Now, the same procedure is repeated on the elements before the pivot element as well
as on the elements after the pivot element.
When all passes are completed, then list of array elements are available in sorted
order.

Example: Sort the elements 12 9 17 16 94 using quick sort.

ALGORITHM

Algorithm QSort(K , P, Q): Let K be an array that consists of ‘n’ elements
from index 1 to index n. Assume P refers to the first index 1 and Q refers to
the last index n at the initial call.

This procedure splits the array into two sub arrays at every pass.

{
if ( P < Q ) then
{
j := Partition(K, P, Q);
QSort( K, P, j-1);
QSort( K, j+1, Q);
}
}


Algorithm Partition(K , LB, UB)


{
Pivot := K[LB];
i := LB;
j := UB+1;
repeat
{
repeat
i := i + 1;
until (K[i] ≥ Pivot);
repeat
j := j – 1;
until (K[j] ≤ Pivot);
if i < j then
Interchange K[i] and K[j] Elements
}
until ( i ≥ j );

Interchange K[j] and Pivot elements


return j;
}
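The QSort and Partition algorithms above translate almost line for line into C
(0-indexed here). The extra guard i <= q replaces the sentinel element that the
pseudocode implicitly assumes past the end of the array; the names are
illustrative.

```c
/* Partition K[p..q] around the pivot K[p]; returns the pivot's final,
 * sorted position. */
static int partition(int K[], int p, int q)
{
    int pivot = K[p];
    int i = p, j = q + 1;
    for (;;) {
        do { i++; } while (i <= q && K[i] < pivot);
        do { j--; } while (K[j] > pivot);
        if (i >= j)
            break;
        int tmp = K[i]; K[i] = K[j]; K[j] = tmp;
    }
    K[p] = K[j];               /* place the pivot at its sorted position */
    K[j] = pivot;
    return j;
}

/* Quick sort on K[p..q] in ascending order. */
void quick_sort(int K[], int p, int q)
{
    if (p < q) {
        int j = partition(K, p, q);
        quick_sort(K, p, j - 1);
        quick_sort(K, j + 1, q);
    }
}
```

Sorting the example data 12, 9, 17, 16, 94 places the pivot 12 at index 1
after the first pass, then recurses on the two sub arrays.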

Analysis of Quick sort

1. The Worst case time complexity of quick sort is O (n2). It occurs when the list of
elements already in sorted order.

To calculate the worst-case time complexity of quick sort, let f(n) be the
number of comparisons needed to sort an array of n elements.

An array of zero or one element is a sorted list.

Hence, f(0) = 0
f(1) = 0.

If n ≥ 2 then the recurrence relation is given by the function f(n) = f(n-1) + n

This equation can be solved recursively as:

f(n) = f(n-1) + n
f(n-1) = f(n-1-1) + n – 1 = f(n-2) + n – 1
f(n-2) = f(n-2-1) + n – 2 = f(n-3) + n – 2
.
.
.
f(1) = f(0) + 1
f(0) = 0


Therefore,
f(n) = n + (n-1) + (n-2) + - - - - - - - - + 1 + 0
= (n(n+1)) / 2
= (n2+n)/2
= (n2 / 2) + (n / 2)
= O(n2)
Hence,
Worst case time complexity of Quick sort is O(n2) time.

2. The Average case time complexity of quick sort is O(n logn), which is less
compare to worst case time complexity.

In this case, the pivot element is positioned at one appropriate location and splits the
array elements into two sub arrays. From this, the recurrence relation is given by the function

f(n) = (2/n) [ f(0) + f(1) + - - - - - + f(n-1) ] + n

Multiply both sides of this equation by n:

n f(n) = 2 [ f(0) + f(1) + - - - - - + f(n-1) ] + n2                 → (1)

Replace n with n-1, then:

(n-1) f(n-1) = 2 [ f(0) + f(1) + - - - - - + f(n-2) ] + (n-1)2       → (2)

Subtract (2) from (1):

n f(n) – (n-1) f(n-1) = 2 f(n-1) + n2 – (n-1)2
n f(n) – n f(n-1) + f(n-1) = 2 f(n-1) + n2 – ( n2 – 2n + 1 )
n f(n) = 2 f(n-1) – f(n-1) + n f(n-1) + 2n – 1
n f(n) = (n+1) f(n-1) + 2n – 1

Divide both sides by n(n+1):

f(n) / (n+1) = f(n-1) / n + 2/(n+1) – 1/(n(n+1))

Solving this equation recursively with n-1, n-2, - - - - - gives approximately

f(n) = (n+1) log n = n log n + log n = O(n log n)
Therefore,
The Average case time complexity of quick sort is O(n log n) time.


MERGE SORT

Merge sort is also a sorting technique designed based on divide-and-conquer strategy.


Let K be an array that consists of ‘n’ elements from index 1 to index n.
Sorting refers to the process of rearranging the given elements of K in
ascending order such that: K[1] ≤ K[2] ≤ . . . . . . . . ≤ K[n]. For this, the
merge sort procedure works as:

First divide the array elements into two sub arrays based on

Mid = (Low + High) / 2

Where,
Low is the first index of the array and High is the last index of the array.

Once, the sub arrays are formed, each set is individually sorted and the resulting sub
sequences are merged to produce a single sorted sequence of data elements.

Divide-and-Conquer strategy is applicable as splitting the array into sub arrays; and
combining operation is merging the sub arrays into a single sorted array.

Merging is the process of combining two sorted lists into a single sorted list. While
performing merging operation, the two sub lists must be in sorted order.

Example: Sort the elements 25 19 17 82 46 using merge sort.

ALGORITHM

// Let K be a global array that consists of ‘n’ elements from index 1 to index n.

Algorithm MSort ( Low , High ):

// Low refers to the first index 1 and High refers to the last index n at the initial call.

// This procedure sorts elements of K in ascending order.

{
if ( Low < High ) then
{
Mid : = (Low+High) / 2;
MSort(Low,Mid);
MSort(Mid+1,High);
Merge(Low,Mid,High);
}
}


Algorithm Merge(Low, Mid, High)

// This procedure merges the two sub sorted arrays into a single sorted array.

{
h : = Low;
i : = Low;
j : = Mid+1;
while (( h ≤ Mid) AND (j ≤ High)) do
{
if ( K[h] ≤ K[j] ) then
{
S[i] : = K[h];
h : = h + 1;
}
else
{
S[i] : = K[j];
j : = j + 1;
}
i : = i + 1;
}
if ( h > Mid ) then
{
for p : = j to High do
{
S[i] : = K[p];
i : = i + 1;
}
}
else
{
for p : = h to Mid do
{
S[i] : = K[p];
i : = i + 1;
}
}
for p : = Low to High do
{
K[p] : = S[p];
}
}
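The MSort and Merge algorithms above can be sketched in C (0-indexed); S is
the auxiliary array noted at the end of this section, and MAXN is an
illustrative bound.

```c
#define MAXN 1000

static int S[MAXN];        /* auxiliary array, same size as the input */

static void merge(int K[], int low, int mid, int high)
{
    int h = low, j = mid + 1, i = low;
    while (h <= mid && j <= high)            /* take the smaller head    */
        S[i++] = (K[h] <= K[j]) ? K[h++] : K[j++];
    while (h <= mid)                         /* copy any leftover run    */
        S[i++] = K[h++];
    while (j <= high)
        S[i++] = K[j++];
    for (i = low; i <= high; i++)            /* copy the merged run back */
        K[i] = S[i];
}

/* Merge sort on K[low..high] in ascending order. */
void merge_sort(int K[], int low, int high)
{
    if (low < high) {
        int mid = (low + high) / 2;
        merge_sort(K, low, mid);
        merge_sort(K, mid + 1, high);
        merge(K, low, mid, high);
    }
}
```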


Analysis of Merge Sort

Merge sort consists of several passes over the input. The first pass merges segments
of size1, second pass merges segments of size2, and so on. The computing time for merge
sort is described by the recurrence relation:

T(n) = a                ; n = 1 , a is a constant value
     = 2 T(n/2) + c n   ; n > 1 , c is a constant value

Apply substitution method, then it becomes

T(n) = 2 T(n/2) + c n
= 2 [ 2 T(n/4) + (c n) / 2 ] + c n
= 4 T(n/4) + 2 c n
= 4 [ 2 T(n/8) + (c n) / 4 ] + 2 c n
= 8 T(n/8) + 3 c n
-
-
-
= 2i T(n/2i) + i c n for any log 2 n ≥ i ≥ 1.

Let 2i = n

Take log2 on both sides, then

log2 2i = log2 n
i = log2 n

From this,

T(n) = n T(n/n) + (log 2 n) c n


= n T(1) + c n log 2 n
= n a + c n log 2 n

T(n) = O ( n log 2 n )

Therefore,
The worst case and average case time complexity of Merge sort is O ( n log 2 n )
time.

Note:

The main disadvantage of merge sort is its storage representation. In merge sort
technique, merge process required an auxiliary (temporary) array which has same size as the
original array. Hence, it requires more space compared to other sorting techniques.


STRASSEN’S MATRIX MULTIPLICATION

Let A and B be two n × n matrices. The product matrix C = A B is also an n × n
matrix whose (i, j)th element is formed by taking the elements in the ith row
of A and the jth column of B and multiplying them to get

C(i, j) = ∑ A(i, k) B(k, j) ; summed over 1 ≤ k ≤ n, for all i and j between
1 and n.

Time complexity for the above conventional method is O(n3) time.

The divide and conquer strategy suggests another way to compute the product of
two n × n matrices. Here, assume n is a power of 2. The two matrices A and B
are partitioned into 4 sub matrices, each of dimension n/2 × n/2. Then the
product AB can be computed by using the formula:

[ A11  A12 ] [ B11  B12 ]     [ C11  C12 ]
[ A21  A22 ] [ B21  B22 ]  =  [ C21  C22 ]

Here,
C11 = A11 B11 + A12 B21
C12 = A11 B12 + A12 B22
C21 = A21 B11 + A22 B21
C22 = A21 B12 + A22 B22

Here, to compute the product AB, it performs 8 multiplication operations and 4


addition operations. Then, the overall computing time T(n) is given by the recurrent relation:

T(n) = b                 ; n ≤ 2
     = 8 T(n/2) + c n2   ; n > 2

T(n) = O ( n3 )

Volker Strassen has discovered a way to compute Cij `s using only 7 multiplications
and 18 additions or subtractions. His method involves first computing the seven n/2 X n/2
matrices P, Q, R, S, T, U and V as:

P = ( A11 + A22 ) ( B11 + B22 )


Q = ( A21 + A22 ) B11
R = A11 ( B12 - B22 )
S = A22 ( B21 - B11 )
T = ( A11 + A12 ) B22
U = ( A21 - A11 ) ( B11 + B12 )
V = ( A12 - A22 ) ( B21 + B22 )


Then, Cij `s can be obtained as:

C11 = P+S–T+V
C12 = R+T
C21 = Q+S
C22 = P+R–Q+U

The resulting recurrence relation for T(n) is:

T(n) = b                 ; n ≤ 2
     = 7 T(n/2) + a n2   ; n > 2

T(n) = O ( n2.81 )

Example: Calculate the product of the given two matrices using Strassen's
matrix multiplication, where

    [ 1  2  4  1 ]        [ 1  2  3  4 ]
A = [ 2  3  2  4 ]    B = [ 4  3  2  1 ]
    [ 1  5  1  2 ]        [ 1  3  1  2 ]
    [ 3  1  4  2 ]        [ 4  1  2  3 ]
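At the base case n = 2 the seven products and the four combining formulas can
be written out directly; a short C check is shown below (strassen2 is an
illustrative name). For larger n the same formulas are applied blockwise to
n/2 × n/2 sub matrices.

```c
/* Multiply two 2 x 2 matrices with Strassen's seven products: c = a b. */
void strassen2(int a[2][2], int b[2][2], int c[2][2])
{
    int P = (a[0][0] + a[1][1]) * (b[0][0] + b[1][1]);
    int Q = (a[1][0] + a[1][1]) * b[0][0];
    int R = a[0][0] * (b[0][1] - b[1][1]);
    int S = a[1][1] * (b[1][0] - b[0][0]);
    int T = (a[0][0] + a[0][1]) * b[1][1];
    int U = (a[1][0] - a[0][0]) * (b[0][0] + b[0][1]);
    int V = (a[0][1] - a[1][1]) * (b[1][0] + b[1][1]);

    c[0][0] = P + S - T + V;    /* C11 */
    c[0][1] = R + T;            /* C12 */
    c[1][0] = Q + S;            /* C21 */
    c[1][1] = P + R - Q + U;    /* C22 */
}
```

For a = [[1,2],[3,4]] and b = [[5,6],[7,8]] this reproduces the conventional
product [[19,22],[43,50]] using seven multiplications instead of eight.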

CONVEX HULL

The convex hull of a set S of points in the plane is defined to be the smallest convex
polygon containing all points of S.

Note: A polygon is defined to be convex if for any two points p1 and p2 inside the polygon,
the directed line segment from p1 to p2 is fully contained in the polygon.

Example: (figure: an input set of points in the plane and its convex hull)


To obtain convex hull for the given set of points apply divide-and-conquer strategy using
Quick hull algorithm as:

The Quick Hull algorithm is a Divide and Conquer algorithm similar to Quick Sort. Let
a[0…n-1] be the input array of points. Following are the steps for finding the convex hull
of these points.

1. Find the point with minimum x-coordinate as min_x and similarly the point with
maximum x-coordinate as max_x.

2. Make a line joining these two points, say L. This line will divide the whole set into two
parts. Take both the parts one by one and proceed further.

3. For a part, find the point P with maximum distance from the line L. P forms a triangle
with the points min_x, max_x. It is clear that the points residing inside this triangle can
never be the part of convex hull.

4. The above step divides the problem into two sub-problems, which are solved
recursively. The line joining the points P and min_x and the line joining the
points P and max_x become the new lines, and the points residing outside the
triangle form the remaining candidate set. Repeat step 3 for each new line
until no point is left outside it, then add the end points of the line to the
convex hull.

 The average case time complexity of Quick hull algorithm is O(n log n) time.
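The four steps above can be sketched recursively in C. The signed cross
product plays the role of the distance test in step 3, since it is
proportional to a point's distance from the line. This sketch assumes at least
two points with distinct x-coordinates; the names are illustrative.

```c
typedef struct { double x, y; } Point;

/* Signed area test: positive if r lies to the left of the line p -> q. */
static double side(Point p, Point q, Point r)
{
    return (q.x - p.x) * (r.y - p.y) - (q.y - p.y) * (r.x - p.x);
}

/* Steps 3/4: find the farthest point on the left of p -> q and recurse on
 * the two new lines; when no point is left, p -> q is a hull edge. */
static void hull_side(Point *pts, int n, Point p, Point q,
                      Point *hull, int *h)
{
    int far = -1;
    double best = 0.0;
    for (int i = 0; i < n; i++) {
        double d = side(p, q, pts[i]);
        if (d > best) { best = d; far = i; }
    }
    if (far < 0) {
        hull[(*h)++] = q;    /* record q; p is recorded by another edge */
        return;
    }
    hull_side(pts, n, p, pts[far], hull, h);
    hull_side(pts, n, pts[far], q, hull, h);
}

/* Steps 1/2: anchor the recursion at the min-x and max-x points.
 * Returns the number of hull vertices written to hull[]. */
int quick_hull(Point *pts, int n, Point *hull)
{
    int lo = 0, hi = 0, h = 0;
    for (int i = 1; i < n; i++) {
        if (pts[i].x < pts[lo].x) lo = i;
        if (pts[i].x > pts[hi].x) hi = i;
    }
    hull_side(pts, n, pts[lo], pts[hi], hull, &h);  /* one side of the line */
    hull_side(pts, n, pts[hi], pts[lo], hull, &h);  /* the other side       */
    return h;
}
```

For the five points (0,0), (1,0), (1,1), (0,1) and (0.5,0.5), the four corners
of the square are reported as the hull and the interior point is discarded.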

END


UNIT – III

Greedy Method: General Method, Job Sequencing with deadlines, Knapsack Problem,
Minimum cost spanning trees, Single Source Shortest Paths.
Dynamic Programming: General Method, All pairs shortest paths, Single Source Shortest
Paths – General Weights (Bellman Ford Algorithm), Optimal Binary Search Trees, 0/1
Knapsack, String Editing, Travelling Salesperson problem.
***

GREEDY METHOD
Greedy method is a straightforward design technique applicable to problems in
which the solution is expressed as a subset that satisfies some constraints.

 Any subset that satisfies the constraints of the problem statement is called a “feasible
solution”.
 A feasible solution that either maximizes or minimizes the given objective function is
called an “optimal solution”.

To obtain an optimal solution, the greedy method suggests devising an
algorithm that works in stages, considering one input at a time. At each
stage, a decision is made regarding whether a particular input belongs in an
optimal solution.
If the inclusion of the next input into the partially constructed optimal
solution would result in an infeasible solution, then this input is not added
to the partial solution. Otherwise, it is added.
This version of the greedy technique, in which the solution is built up as a
subset of the inputs, is called the “subset paradigm”.

 Control abstraction algorithm of greedy strategy is as follows:

Algorithm Greedy ( a , n )
{
// a[1 : n] contains n inputs
solution : = Φ;
for i : = 1 to n do
{
x : = Select ( a );
if Feasible ( solution , x ) then
solution : = Union ( solution , x );
}
return solution;
}

Here,
 The function Select( ) selects an input from a[ ] and is assigned to x.
 Feasible( ) is a Boolean valued function that determines whether x can be included
into the solution vector.
 Union( ) function combines x with the solution and updates the objective function.


APPLICATIONS

Greedy strategy can be applicable to solve different problems such as:

 Knapsack problem
 Job sequencing with dead lines
 Minimum cost spanning trees
 Single source shortest path etc,

KNAPSACK PROBLEM

Consider n objects and a knapsack (bag). Every object i has a weight wi
and the knapsack has a capacity m. If a fraction xi, 0 ≤ xi ≤ 1, of object i is placed into the
knapsack, then a profit of pixi is earned. The objective is to obtain a filling of the knapsack
that maximizes the total profit earned.

Formally, the problem statement can be stated as:

maximize ∑ pi xi → 1
1≤i≤n

subject to ∑ wi xi ≤ m → 2
1≤i≤n

and 0 ≤ xi ≤ 1 , 1≤i≤n → 3

 A feasible solution is any set ( x1, x2, - - - - -, xn ) satisfying equation 2 and 3.


 An optimal solution is a feasible solution for which equation 1 is maximized.

Example: Find an optimal solution to the knapsack instance n = 3, m = 20, (p1, p2, p3) =
(25, 24, 15) and (w1, w2, w3) = (18, 15, 10).

Algorithm GreedyKnapsack ( m , n )
{
// p[1 : n] and w[1 : n] contain the profits and weights respectively of n objects
such that p[i] / w[i] ≥ p[i+1] / w[i+1] ≥ - - - - - - - -.

for i : = 1 to n do
x[i] : = 0.0;
U : = m;
for i : = 1 to n do
{
if ( w[i] > U ) then
break;


else
{
x[i] : = 1.0;
U : = U – w[i];
}
}
if ( i ≤ n ) then
x[i] : = U / w[i];
}

Here,
 If sum of all weights is ≤ m, then xi = 1 ; 1 ≤ i ≤ n is an optimal solution.
 If p1 / w1 ≥ p2 / w2 ≥ - - - - ≥ pn / wn , then GreedyKnapsack generates an optimal
solution to the given instance of the Knapsack problem
 If the objects are already sorted by pi / wi ratio, the time complexity of GreedyKnapsack is O(n).
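The GreedyKnapsack pseudocode above can be sketched in Python; the sort by profit/weight ratio, which the pseudocode assumes has already been done, is made explicit here, and the function name is ours:

```python
def greedy_knapsack(profits, weights, m):
    """Fractional knapsack: consider objects in decreasing profit/weight ratio."""
    order = sorted(range(len(profits)),
                   key=lambda i: profits[i] / weights[i], reverse=True)
    x = [0.0] * len(profits)
    remaining = m
    total = 0.0
    for i in order:
        if weights[i] <= remaining:
            x[i] = 1.0                      # take the whole object
            remaining -= weights[i]
            total += profits[i]
        else:
            x[i] = remaining / weights[i]   # take a fraction and stop
            total += profits[i] * x[i]
            break
    return x, total
```

For the example instance n = 3, m = 20 above, the greedy order is object 2, object 3, object 1, giving x = (0, 1, 1/2) and a total profit of 31.5.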

JOB SEQUENCING WITH DEADLINES


Consider a set of n jobs. Every job i has a deadline di ≥ 0 and a profit pi > 0. For any
job i, the profit pi is earned iff the job is completed by its deadline. Only one machine is
available to process these jobs, and each job takes one unit of processing time on the
machine.

 A feasible solution for this problem is a subset J of the jobs such that each job in this
subset can be completed by its deadline. The value of the feasible solution J is the sum of
the profits of the jobs in J, i.e., ∑ pi over i ∈ J.
 An optimal solution is a feasible solution with maximum profit value.

Example: Obtain optimal solution of Job sequencing with deadlines for the instance n=5,
( p1, p2, p3, p4, p5 ) = (20, 15, 10, 5, 1) and ( d1, d2, d3, d4, d5 ) = (2, 2, 1, 3, 3).

Algorithm GreedyJob ( d, J, n )
{
J : = { 1 };
for i : = 2 to n do
{
if ( all jobs in J U { i } can be completed by their deadlines ) then
J : = J U { i };
}
}

 The computing time of Job sequencing with deadlines is O(n²).
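The GreedyJob loop can be sketched in Python; the feasibility test schedules the chosen jobs in non-decreasing deadline order, one time unit each. Function names and the 0-based indexing are ours. For the example instance above it selects jobs 1, 2 and 4 with profit 40:

```python
def job_sequencing(profits, deadlines):
    """Greedy: try jobs in decreasing profit order, keep a job only if all
    selected jobs can still meet their deadlines."""
    n = len(profits)
    order = sorted(range(n), key=lambda i: profits[i], reverse=True)

    def feasible(jobs):
        # schedule in non-decreasing deadline order, one unit of time each
        t = 0
        for j in sorted(jobs, key=lambda i: deadlines[i]):
            t += 1
            if t > deadlines[j]:
                return False
        return True

    J = []
    for i in order:
        if feasible(J + [i]):
            J.append(i)
    return sorted(J), sum(profits[i] for i in J)
```

For p = (20, 15, 10, 5, 1), d = (2, 2, 1, 3, 3) this returns ([0, 1, 3], 40), i.e., jobs 1, 2, 4 in the 1-based numbering of the notes.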


MINIMUM COST SPANNING TREES

Let G = (V, E) be an undirected connected graph. A subgraph T = (V, E1) of G is a
spanning tree iff T is a tree.

Example: For a given connected graph, several different spanning trees are possible
(graph and spanning tree figures omitted).

If the graph contains n vertices, a spanning tree contains exactly n-1 edges connecting
the n vertices.

Consider a weighted graph in which every edge is labeled with a cost / weight value.
Cost of the spanning tree is obtained by adding the cost of the selected edges.

“A minimum cost spanning tree of G is a spanning tree T having minimum cost”.

Most important methods used to find the minimum cost spanning trees are:

 Prim’s Algorithm
 Kruskal’s Algorithm

PRIM’s ALGORITHM

In this method, the minimum-cost spanning tree is built edge by edge. The next
edge to include is chosen according to some optimization criterion. The simplest criterion is to
choose the edge that results in a minimum increase in the sum of the costs of the edges so far included.

Procedure:

 The procedure starts with a tree T that contains the starting vertex.
 Add a least-cost edge (u, v) to T such that T U { (u, v) } is also a tree.
 The edge (u, v) is selected such that exactly one of u and v is in T.
 If selecting the edge (u, v) would form a cycle, discard it; otherwise, include it in the tree
structure.
 Repeat the above procedure until T contains n-1 edges.


Example: Obtain a minimum-cost spanning tree for a given weighted graph (figure omitted).

Algorithm Prim( )
{
T : = Φ ;
TV : = { 1 } ; // start from vertex 1
while ( T contains less than n-1 edges ) do
{
Let (u, v) be a least cost edge such that u ∈ TV and v ∉ TV;
if ( there is no such edge ) then
break;
else
{
Add v to TV;
Add (u, v) to T;
}
}
}

 Time complexity of Prim's algorithm is O(n²).
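A Python sketch of Prim's procedure; instead of the O(n²) scan over edges implied by the pseudocode, it keeps candidate edges in a min-heap, which is a common implementation choice. Names are ours:

```python
import heapq

def prim_mst(n, adj):
    """adj: dict vertex -> list of (neighbour, weight); graph assumed connected.
    Returns (total cost, list of tree edges)."""
    in_tree = {0}                            # start from vertex 0
    heap = [(w, 0, v) for v, w in adj[0]]
    heapq.heapify(heap)
    total, tree = 0, []
    while heap and len(in_tree) < n:
        w, u, v = heapq.heappop(heap)
        if v in in_tree:                     # edge would form a cycle: discard
            continue
        in_tree.add(v)
        total += w
        tree.append((u, v))
        for x, wx in adj[v]:
            if x not in in_tree:
                heapq.heappush(heap, (wx, v, x))
    return total, tree
```

On a small triangle graph with edge costs 1, 2, 3 the tree keeps the two cheapest edges, for total cost 3.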

KRUSKAL’s ALGORITHM

Kruskal’s algorithm builds a minimum-cost spanning tree T by adding the edges one
at a time. It selects the edges in non-decreasing order of their costs. An edge is added to T if
it does not create a cycle.

Procedure:

 The procedure starts with each vertex forming a separate tree in a forest.
 Select an edge (u, v) of minimum cost and add it to T if it does not form a cycle.
 Repeat the above step until T contains n-1 edges.

Example: Obtain a minimum-cost spanning tree for a given weighted graph (figure omitted).


Algorithm Kruskal( )
{
T:=Φ;
while (( T has less than n-1 edges) and (E ≠ Φ )) do
{
Choose an edge (u,v) from E of Lowest cost;
Delete (u, v) from E;
if (u, v) does not create a cycle in T then
add (u,v) to T;
else
discard (u, v);
}
}

 The time complexity of Kruskal’s algorithm is O( |E| log n ) time.
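A Python sketch of Kruskal's procedure using a union-find (disjoint-set) structure for the cycle test, the usual implementation device. Names are ours:

```python
def kruskal_mst(n, edges):
    """edges: list of (weight, u, v). Returns (total cost, list of tree edges)."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]    # path halving
            x = parent[x]
        return x

    total, tree = 0, []
    for w, u, v in sorted(edges):            # non-decreasing cost order
        ru, rv = find(u), find(v)
        if ru != rv:                         # joins two different trees: no cycle
            parent[ru] = rv
            total += w
            tree.append((u, v))
    return total, tree
```

On the same triangle graph as before, the cheapest two edges are accepted and the third is discarded because it would close a cycle.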

Note: The main differences between Prim's and Kruskal's algorithms are:
 Prim's algorithm maintains a single tree at every stage, whereas Kruskal's partial
solution is a forest and need not be a tree.
 Finally, both methods produce a spanning tree of the same minimum cost.

SINGLE-SOURCE SHORTEST PATHS


Graphs can be used to represent a highway structure, with vertices representing cities
and edges representing communication links between the cities. These edges are labeled
with weights representing the distances between the cities.
In general, one wants to move from one city to another covering the minimum
distance. This leads to the shortest path problem.

 Consider G = (V, E) be a weighted directed graph and source vertex as v0. The
problem is to determine the shortest path from v0 to all the remaining vertices of
G.
 The greedy method generates shortest paths from vertex v0 to the remaining
vertices in non-decreasing order of path length.
 First, a shortest path to the nearest vertex is generated. Then, a shortest path to the
second nearest vertex is generated and so on.

Example: Identify the shortest paths of a given graph with starting vertex 5 (figure omitted).


Algorithm ShortestPath ( v, Cost, Dist, n )
{
// Graph G is represented by the cost adjacency matrix Cost[1:n , 1:n]; v is the source.

for i : = 1 to n do
{
S[i] : = False;
Dist[i] : = Cost[v, i];
}
S[v] : = True;
Dist[v] : = 0.0;
for num : = 1 to n-1 do
{
Choose u from among those vertices not in S such that Dist[u] is minimum;
S[u] : = True;
for ( each w adjacent to u with S[w] = False ) do
{
if ( Dist[w] > Dist[u] + Cost[u, w] ) then
Dist[w] : = Dist[u] + Cost[u, w];
}
}
}

 The time complexity of the single-source shortest path algorithm is O(n²).
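The greedy procedure above is Dijkstra's algorithm. A Python sketch with a min-heap replacing the linear scan for the minimum-Dist vertex (names are ours):

```python
import heapq

def dijkstra(n, adj, src):
    """adj: dict vertex -> list of (neighbour, weight); all weights >= 0."""
    dist = [float('inf')] * n
    dist[src] = 0
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:                  # stale heap entry, already improved
            continue
        for v, w in adj[u]:
            if d + w < dist[v]:          # relax edge (u, v)
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist
```

On a small 4-vertex digraph this generates the shortest paths in non-decreasing order of length, as described above.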

***

DYNAMIC PROGRAMMING

Dynamic programming is an algorithm design method that can be used when the
solution to a problem statement can be viewed as the result of a sequence of decisions.

To obtain an optimal solution, first generate all possible decision sequences, and then apply
the principle of optimality.

“The principle of optimality states that an optimal sequence of decisions has the
property that whatever the initial state and decisions are, the remaining decisions must
constitute an optimal decision sequence with respect to the state resulting from the first
decision”.

Note: The essential difference between the Greedy method and the Dynamic programming
is, in the greedy method only one decision sequence is ever generated. In dynamic
programming, many decision sequences may be generated.

Applications: Some of the problem statements that can be solved using dynamic
programming are:
 All pairs shortest paths
 Single-source shortest paths
 Optimal binary search trees
 0/1 Knapsack problem etc,


ALL-PAIRS SHORTEST PATHS

Let G = (V, E) be a directed graph with n vertices. Let cost be a cost adjacency
matrix such that cost<i, i> = 0, 1 ≤ i ≤ n. Then cost<i, j> is the length of edge <i, j> if <i, j>
∈ E(G), and cost<i, j> = ∞ if <i, j> ∉ E(G).

The all-pairs shortest path problem is to determine a matrix A such that A(i, j) is the
length of a shortest path from i to j. The matrix A could be obtained by solving n single-source
shortest path problems.

 By applying dynamic programming it can be solved as:

Step 1: To examine a shortest path from i to j: if k is an intermediate vertex on that path,
then the path from i to j decomposes into a shortest path from i to k and a shortest
path from k to j.
Step 2: Let A^k(i, j) be the length of a shortest path from vertex i to j such that every
intermediate vertex is ≤ k. Then compute A^k for k = 1, 2, 3, - - - , n.

When the intermediate vertex k is considered, the two possible cases are:

 The path goes from i to j via k
 The path does not go via k
Thus, the principle of optimality holds.
Step 3: The shortest paths can be computed using the formula

A^k(i, j) = min { A^(k-1)(i, j) , A^(k-1)(i, k) + A^(k-1)(k, j) } , k ≥ 1

Here,
If vertex k is used, then A^k(i, j) = A^(k-1)(i, k) + A^(k-1)(k, j);

otherwise, A^k(i, j) = A^(k-1)(i, j).

Initially, A^0(i, j) = cost(i, j) , 1 ≤ i, j ≤ n.

Example: Compute all-pairs shortest paths for a given graph (figure omitted).


Algorithm AllPaths ( cost, A, n )


{
// cost[1:n , 1:n] is the Cost Adjacency Matrix
// A[i, j] is the cost of shortest path from vertex i to j.
// cost[i, i] = 0 , ∀ 1 ≤ i ≤ n.

for i : = 1 to n do
for j : = 1 to n do
A[i, j] : = cost[i, j];
for k : = 1 to n do
for i : = 1 to n do
for j : = 1 to n do
A[i, j] : = min [ A[i, j] , A[i, k] + A[k, j] ] ;
}

Analysis:

Here, the first nested loop takes O(n²) time and the second nested loop takes O(n³) time.
Hence, the overall time complexity is O(n³).
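A direct Python transcription of AllPaths (the Floyd–Warshall algorithm), computing the A^k values in place; the function name and the small three-vertex test instance are ours:

```python
INF = float('inf')

def all_pairs(cost):
    """After iteration k, A[i][j] is the shortest i-to-j length using only
    intermediate vertices from {0, ..., k}."""
    n = len(cost)
    A = [row[:] for row in cost]        # A^0 = cost
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if A[i][k] + A[k][j] < A[i][j]:
                    A[i][j] = A[i][k] + A[k][j]
    return A
```

For cost = [[0, 4, 11], [6, 0, 2], [3, ∞, 0]] the result is [[0, 4, 6], [5, 0, 2], [3, 7, 0]].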

SINGLE-SOURCE SHORTEST PATHS [ General Weights ]


Consider a directed graph G that may have edges of negative length. When negative edge
lengths are permitted, we require that the graph have no cycles of negative length. When
there are no cycles of negative length, there is a shortest path between every two vertices of an
n-vertex graph that has at most n-1 edges on it.

Let dist^l[u] be the length of a shortest path from the source vertex v to vertex u that
contains at most l edges. Then,

dist^1[u] = cost[v, u] , 1 ≤ u ≤ n

The main objective is to compute dist^(n-1)[u] for all u. This can be done by
dynamic programming, using the Bellman and Ford recurrence:

 If the shortest path from v to u with at most k, k > 1, edges has no more than
k-1 edges, then
dist^k[u] = dist^(k-1)[u]

 If the shortest path from v to u with at most k, k > 1, edges has exactly k edges, then
it is made up of a shortest path from v to some vertex i followed by the edge
<i, u>, so
dist^k[u] = min { dist^(k-1)[i] + cost[i, u] }
i
From this, dist^k can be written as:

dist^k[u] = min { dist^(k-1)[u] , min { dist^(k-1)[i] + cost[i, u] } }
i
This recurrence is used to compute dist^k from dist^(k-1), for k = 2, 3, - - - , n-1.


Example: Obtain single-source shortest paths for a given graph with general edge weights (figure omitted).

Algorithm BellmanFord ( v, cost, dist, n )


{
for i : = 1 to n do
dist[ i ] := cost [ v, i ];
for i : = 2 to n-1 do
for each u such that u ≠ v and u has atleast one incoming edge do
for each <i, u> in the graph do
if dist[ u ] > dist[ i ] + cost[ i, u ] then
dist[ u ] : = dist[ i ] + cost[ i, u ];
}

 The time complexity of the above representation is O(n³).
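A Python sketch of BellmanFord over an edge list; the n-1 relaxation passes correspond to the dist^k recurrence above (names are ours, and the sketch assumes no negative-length cycle):

```python
def bellman_ford(n, edges, src):
    """edges: list of (u, v, w) directed edges; w may be negative."""
    INF = float('inf')
    dist = [INF] * n
    dist[src] = 0
    for _ in range(n - 1):               # paths with at most n-1 edges
        for u, v, w in edges:
            if dist[u] + w < dist[v]:    # relax edge <u, v>
                dist[v] = dist[u] + w
    return dist
```

With edges (0→1, 6), (0→2, 5), (1→2, −2) from source 0, the shortest distances are [0, 6, 4]: the negative edge makes the two-edge path to vertex 2 cheaper than the direct one.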

OPTIMAL BINARY SEARCH TREES


Consider a fixed set of identifiers from which a binary search tree is to be created.
Different arrangements lead to different binary search trees with different performance
characteristics.

Example: Assume the set of identifiers are: for, do, while, int, if.

For this, several binary search trees are possible; two of them, referred to below as the
first and the second tree, are compared (tree figures omitted).


 The first tree takes 1, 2, 2, 3, 4 comparisons to find the identifiers. Thus, the average
number of comparisons is (1 + 2 + 2 + 3 + 4) / 5 = 12/5.
 The second tree takes 1, 2, 2, 3, 3 comparisons to find the identifiers. Thus, the
average number of comparisons is (1 + 2 + 2 + 3 + 3) / 5 = 11/5.
 The number of comparisons at different levels to search a particular node is
considered as a cost.
 For a given set of identifiers, design a binary search tree with minimum cost is known
as “Optimal Binary Search Tree”.

For this, let us assume that the given set of identifiers is { a1 , a2 , - - - , an }
with a1 < a2 < - - - < an. Let p(i) be the probability of a successful search for the
element ai. Let q(i) be the probability of an unsuccessful search for an element x with
ai < x < ai+1. Then,

∑ p(i) + ∑ q(i) = 1
1≤i≤n 0≤i≤n

To obtain cost function of binary search trees, it is useful to add an external node at
every empty subtree indicated with square nodes.

(Figure omitted: the first tree above redrawn with square external nodes added at every
empty subtree.)

If a binary search tree represents ‘n’ identifiers, then there will be exactly n internal
nodes and n+1 external nodes. Every internal node represents a point where a successful
search may terminate. Every external node represents a point where an unsuccessful search
may terminate.
Now, if a successful search terminates at the internal node for ai, its expected cost
contribution is p(i) * level(ai). If an unsuccessful search terminates at the external node Ei,
its expected cost contribution is q(i) * ( level(Ei) – 1 ).
Finally, the cost of the binary search tree is

∑ p(i) * level(ai) + ∑ q(i) * ( level(Ei) – 1 )
1≤i≤n 0≤i≤n

The aim of the problem is to construct a binary search tree for which the above
cost is minimum.


 To apply dynamic programming, a decision is made as to which of the ai's should be
assigned as the root, say ak. This splits the remaining identifiers into a left subtree l
(over a1 , - - - , ak-1) and a right subtree r (over ak+1 , - - - , an). (Tree figure omitted.)

Here,

cost(l) = ∑ p(i) * level(ai) + ∑ q(i) * ( level(Ei) – 1 )
1≤i<k 0≤i<k

cost(r) = ∑ p(i) * level(ai) + ∑ q(i) * ( level(Ei) – 1 )
k<i≤n k≤i≤n

Finally,

c(i, j) = min { c(i, k-1) + c(k, j) } + w(i, j)
i<k≤j

The problem is solved for c(0, n) by computing the c(i, j) for j – i = 1, 2, - - - .

Initially, c(i, i) = 0 ;
r(i, i) = 0 ;
w(i, i) = q(i) , 0 ≤ i ≤ n and
w(i, j) = p(j) + q(j) + w(i, j – 1).

During this computation, record the root r(i, j) of each tree.

Example: Construct an optimal binary search tree for n = 4 and
(a1 , a2 , a3 , a4) = (do, if, int, while).
The values of the p's and q's are p(1:4) = (3, 3, 1, 1) and q(0:4) = (2, 3, 1, 1, 1).

Algorithm OBST ( p, q, n )
{
for i : = 0 to n-1 do
{
w[i, i] : = q[i];
c[i, i] : = 0.0;
r[i, i] : = 0;
w[i, i+1] : = q[i] + q[i+1] + p[i+1] ;
c[i, i+1] : = q[i] + q[i+1] + p[i+1] ;
r[i, i+1] : = i+1;
}
w[n, n] : = q[n];
c[n, n] : = 0.0;
r[n, n] : = 0;


for m : = 2 to n do
{
for i : = 0 to n - m do
{
j : = i + m;
w[i, j] : = w[i, j-1] + p[ j ] + q[ j ];
k : = Find(c, r, i, j);
c[i, j] = w[i, j] + c[i, k – 1] + c[k, j];
r[i, j] : = k;
}
}
write ( c[0, n], w[0, n], r[0, n] );
}

 The time complexity of the optimal binary search tree algorithm is O(n³).
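A compact Python sketch of the c, w, r tables built by Algorithm OBST (names and the 0-based table layout are ours). For the example instance p(1:4) = (3, 3, 1, 1), q(0:4) = (2, 3, 1, 1, 1) it computes c(0, 4) = 32, w(0, 4) = 16 and root r(0, 4) = 2, i.e., a2 = 'if' at the root:

```python
def obst(p, q):
    """p[1..n]: success frequencies (p[0] unused); q[0..n]: failure frequencies.
    Returns the cost, weight and root tables c, w, r."""
    n = len(p) - 1
    c = [[0] * (n + 1) for _ in range(n + 1)]
    w = [[0] * (n + 1) for _ in range(n + 1)]
    r = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        w[i][i] = q[i]                       # empty tree: weight q(i), cost 0
    for m in range(1, n + 1):                # m = j - i, the subproblem size
        for i in range(n - m + 1):
            j = i + m
            w[i][j] = w[i][j - 1] + p[j] + q[j]
            # choose root k in (i, j] minimizing c(i, k-1) + c(k, j)
            best, bestk = min((c[i][k - 1] + c[k][j], k)
                              for k in range(i + 1, j + 1))
            c[i][j] = w[i][j] + best
            r[i][j] = bestk
    return c, w, r
```

The triple nested work (two table indices plus the minimization over k) gives the O(n³) bound quoted above.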

0 / 1 KNAPSACK PROBLEM
Consider n objects and a knapsack (bag). Every object i has a weight wi
and the knapsack has a capacity m. If object i is placed into the knapsack, then a profit of
pixi is earned, where xi = 0 or 1. The objective is to fill the knapsack so as to maximize the total
profit earned.
Formally, the problem statement can be stated as:

maximize ∑ pi xi → 1
1≤i≤n

subject to ∑ wi xi ≤ m → 2
1≤i≤n

and xi = 0 or 1 , 1≤i≤n → 3

 Solution to the Knapsack problem can be obtained by making a sequence of decision


on the variables x1, x2, - - - - -, xn .

Let us assume that decision of xi is made in the order xn , xn-1 , - - - - , x1. For a
decision on xn possible states are:

 xn = 0 ; The knapsack capacity is m and no profit earned.


 xn = 1 ; The knapsack capacity is m – wn and a profit of pn is earned.

Now, the remaining decisions xn-1 , - - - - , x1 must be optimal with respect to the
problem statement resulting from the decision on x n.


 To solve this problem using dynamic programming, follow the procedure as:

Step 1: Each S^i is a set of pairs (P, W), where P is a profit total and W the corresponding
weight total.

Initially, S^0 = { (0, 0) }.

Then, S1^(i-1) is calculated by adding (pi , wi) to each pair of S^(i-1), i = 1, 2, - - -.

Now, S^i is obtained by merging S^(i-1) and S1^(i-1) and applying the purging rule.

Purging Rule: If S^i contains (Pj , Wj) and (Pk , Wk) with the condition that Wj > Wk
and Pj ≤ Pk , then the dominated pair (Pj , Wj) can be discarded.

Step 2: Generate all the S^i's.

Step 3: Select a pair (P1 , W1) with W1 ≤ m from the last set S^n. Set xn = 0 if
(P1 , W1) ∈ S^(n-1); otherwise, set xn = 1.

Now, select the new pair (P1 – pn , W1 – wn) and repeat Step 3 until all
decisions are completed.

Example: Obtain the 0/1 Knapsack solution for the instance n = 3, m = 6, (p1 , p2 , p3) = (1, 2, 5)
and (w1 , w2 , w3) = (2, 3, 4).

Algorithm DKP ( p, w, n, m )
{
S^0 : = { (0, 0) };
for i : = 1 to n – 1 do
{
S1^(i-1) : = { (P, W) | (P – p[i] , W – w[i]) ∈ S^(i-1) and W ≤ m } ;
S^i : = MergePurge( S^(i-1) , S1^(i-1) );
}
(PX, WX) : = last pair in S^(n-1);
(PY, WY) : = (P1 + p[n] , W1 + w[n]) , where (P1 , W1) is a pair in S^(n-1)
with W1 + w[n] ≤ m;
if ( PX > PY ) then
x[n] : = 0;
else
x[n] : = 1;
TraceBackFor( x[n-1] , - - - - , x[1] );
}

 The time complexity of 0/1 Knapsack using dynamic programming is O(2ⁿ).


STRING EDITING
Consider two strings X = x1, x2, - - -, xn and Y = y1 , y2 , - - - , ym, where the xi , 1 ≤ i ≤ n,
and the yj , 1 ≤ j ≤ m, are members of a finite set of symbols known as the alphabet. We want to
transform X into Y using a sequence of edit operations on X. The permissible edit operations
are insert, delete, and change, and each operation has an associated cost. The cost of a sequence
of operations is the sum of the costs of the individual operations in the sequence.

 “The problem of string editing is to identify a minimum-cost sequence of edit
operations that will transform X into Y”.

For this,

Let D(xi) be the cost of deleting the symbol xi from X,
I(yj) be the cost of inserting the symbol yj into X, and
C(xi , yj) be the cost of changing the symbol xi of X into yj.

A dynamic programming solution for this problem can be stated as:

 Define cost(i, j) as the minimum cost of any edit sequence transforming x1, - - -, xi
into y1, - - -, yj. cost(i, j) is given by the recurrence:

cost(i, j) = 0 ; i = j = 0
= cost(i-1, 0) + D(xi) ; j = 0 , i > 0
= cost(0, j-1) + I(yj) ; j > 0 , i = 0
= cost1(i, j) ; i > 0 , j > 0

Where,

cost1(i, j) = min { cost(i-1, j) + D(xi) , cost(i, j-1) + I(yj) , cost(i-1, j-1) + C(xi , yj) }

 Compute cost(i, j) for all possible values of i and j. There are (n+1)(m+1) values.
These values can be computed in the form of a table M, where each row corresponds
to the value i and column corresponds to j.

 The final cost(n, m) shows the cost of an optimal edit sequence.

 Time complexity of the string editing problem is O(mn).

Example: Solve string edit problem for X = a, a, b, a, a, b and Y = b, a, b, b.
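A Python sketch of the cost(i, j) table with unit delete/insert/change costs (names are ours). For the example instance X = aabaab, Y = babb the minimum edit cost is 3:

```python
def edit_distance(X, Y, D=1, I=1, C=1):
    """cost[i][j]: min cost to transform x1..xi into y1..yj."""
    n, m = len(X), len(Y)
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = cost[i - 1][0] + D          # delete all of x1..xi
    for j in range(1, m + 1):
        cost[0][j] = cost[0][j - 1] + I          # insert all of y1..yj
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            change = 0 if X[i - 1] == Y[j - 1] else C
            cost[i][j] = min(cost[i - 1][j] + D,          # delete xi
                             cost[i][j - 1] + I,          # insert yj
                             cost[i - 1][j - 1] + change) # change xi -> yj
    return cost[n][m]
```

The table has (n+1)(m+1) entries, each filled in constant time, matching the O(mn) bound.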


TRAVELING SALESPERSON PROBLEM


Consider a directed graph G = < V, E >, where V denotes the set of
vertices and E denotes the set of edges. Each edge <i, j> is assigned a weight cij, with
cij > 0 if there is an edge between i and j, and cij = ∞ otherwise.

 A tour of the graph visits every vertex exactly once, and the cost of the tour is the
sum of the costs of its edges. The traveling salesperson problem is to find such a tour
with minimum cost.

 To solve this problem, dynamic programming strategy can be stated as:

Step 1: Let the tour start and end at vertex 1. Then the function value g( 1, V – {1} ) is
the total length of an optimal tour.
 In general, let g(i, S) be the length of the shortest path starting at vertex i, going
through all vertices in S, and terminating at vertex 1.

Step 2: According to the principle of optimality, the function values satisfy:

g(i, S) = min { cij + g( j, S – { j } ) } , i ∉ S
j∈S

Initially, g( i, Φ ) = ci1 , 1 ≤ i ≤ n.

Obtain g(i, S) for all S with |S| = 1, then |S| = 2, - - - , up to |S| = n-1.

 The time complexity of the Traveling salesperson problem using the dynamic programming
approach is O(n²2ⁿ).

END


UNIT – IV

Backtracking: General Method, 8-Queens Problem, Sum of Subsets problem, Graph


Coloring, 0/1 Knapsack Problem.
Branch and Bound: The General Method, 0/1 Knapsack Problem, Travelling Salesperson
problem.
***

BACKTRACKING

Backtracking is one of the general techniques used to obtain solutions of a problem
that satisfy some constraints. Conceptually, it enumerates the possible candidate solutions and
selects those that satisfy the criterion or produce the optimum result.

 In backtracking concept, the desired solution is expressible as an n-tuple ( x1, - - , xn ),


where xi is chosen from a finite set Si.
 Backtracking strategy determines the solution by systematic searching the solution
space that consists all possible solutions.
 Problems solved with the backtracking strategy must satisfy a complex set of
constraints, classified as Explicit and Implicit constraints.

 Explicit constraints are rules that restrict each xi can take values from a given set Si.

Example: Si = { 0 , 1 } => xi = 0 or 1.

 Implicit constraints are rules that determines which of the tuples in the solution
space satisfies the criterion function.

Basic Terminology:

Backtracking strategy determines the problem solution by systematic searching for the
solution using Tree structure.

 Each node in the tree is called as “Problem State”.


 A node which has been generated and all of its children have not yet been generated is
called a “Live Node”.
 A live node whose children are currently being expanded is called “E-Node”.
 A node which is not to be expanded further or all of its children have been generated
is called a “Dead Node”.
 “Bounding Function” is used to kill live nodes without generating its children.
 All paths from root to other nodes define “State Space” of the problem statement.
 State space defines all possible solutions called “Solution Space”.
 Finally, from the solution space, identify the “Answer State” that satisfies implicit
constraints.
 Depth first node generation with bounding function is called “Backtracking”.


The control abstraction algorithm can be shown as:

Algorithm BackTracking ( n )
{
k : = 1;
while ( k ≠ 0 ) do
{
if ( there remains an untried x[k] ∈ T( x[1], x[2], - - - , x[k-1] ) and
Bk( x[1], - - - , x[k] ) is True ) then
{
if ( x[1], - - - , x[k] ) is a path to answer node then
write x[1 : k];
k : = k + 1;
}
else
k : = k – 1;
}
}

APPLICATIONS

Some of the problems can be solved using backtracking are:

 8-Queens problem
 Sum of subsets problem
 Graph coloring
 0/1 Knapsack problem etc,

N-QUEENS PROBLEM

Let us consider N=4, then the problem is termed as “4 - Queens Problem”.

4 – QUEENS PROBLEM

The problem statement is to place 4 queens on a 4 X 4 chessboard such that no two
queens attack each other, i.e., no two of them are on the same row, column or diagonal.

 Solution of a 4 – Queens problem can be represented as a 4-tuple ( x1 , x2 , x3 , x4 )


where xi is a column number on which queen i is placed.
 Here, solution space generates 4! tuples.
 It satisfies the specific constraints as:

 Explicit constraints: Si = { 1, 2, 3, 4 }
Xi should select from Si
 Implicit constraints: No two queens can be placed in the same row,
column or diagonal.


Suppose two queens are placed at positions ( i, j ) and ( k, l ). Then they are on the same
diagonal only if

i – j = k – l or i + j = k + l

i.e., j – l = i – k or j – l = k – i

i.e., | j – l | = | i – k |.

8 – QUEENS PROBLEM

The problem statement is to place 8 queens on an 8 X 8 chessboard such that no two
queens attack each other, i.e., no two of them are on the same row, column or diagonal.

 Solution of a 8 – Queens problem can be represented as a 8-tuple ( x1 , x2 , x3 , x4 , x5 ,


x6 , x7 , x8 ) where xi is a column number on which queen i is placed.
 Here, solution space generates 8! tuples.
 It satisfies the specific constraints as:

 Explicit constraints: Si = { 1, 2, 3, 4, 5, 6, 7, 8 }
Xi should select from Si
 Implicit constraints: No two queens can be placed in the same row,
column or diagonal.

N – QUEENS PROBLEM

The problem statement is to place N queens on an N X N chessboard such that no two
queens attack each other, i.e., no two of them are on the same row, column or diagonal.

Solution of a N – Queens problem can be represented as a N-tuple ( x1 , x2 - - - xn )


where xi is a column number on which queen i is placed. Here, solution space generates N!
tuples.

Algorithm NQueens ( k, n )
{
for i : = 1 to n do
{
if Place ( k, i ) then
{
x[ k ] : = i;
if ( k = n ) then
write x [ 1 : n ];
else
NQueens( k+1 , n );
}
}
}


Algorithm Place ( k , i )
{
// Returns True if a queen can be placed in kth row and ith column.

for j : = 1 to k – 1 do
{
if ( ( x[ j ] = i ) or (Abs ( x[ j ] – i ) = Abs ( j – k ) ) ) then
return False;
}
return True;
}

Analysis:

 Place ( k, i ) returns the Boolean value True if a queen can be placed in row k, column i.
This check takes O(k – 1) time.
 Placing a queen in row k therefore costs O(k·n) over all n columns; the complete
backtracking search over the state space tree is exponential in the worst case.
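NQueens and Place can be sketched together in Python (0-based indexing; names are ours). It finds all 2 solutions for n = 4 and all 92 solutions for n = 8:

```python
def n_queens(n):
    """Backtracking: x[k] is the column of the queen placed in row k."""
    solutions = []
    x = [0] * n

    def place(k, i):
        # safe iff no earlier queen shares column i or a diagonal with (k, i)
        return all(x[j] != i and abs(x[j] - i) != abs(j - k) for j in range(k))

    def solve(k):
        for i in range(n):
            if place(k, i):
                x[k] = i
                if k == n - 1:
                    solutions.append(x[:])   # answer state found
                else:
                    solve(k + 1)             # extend the partial solution

    solve(0)
    return solutions
```

Each recursive call is an E-node whose children are the legal columns of the next row; rows with no legal column cause an implicit backtrack.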

SUM OF SUBSETS
Consider n distinct positive numbers given by the set w = ( w1 , w2 , - - - , wn ). The
aim of the problem is to find all subsets of these numbers whose sums equal m.

 For this problem statement solution is represented by an n-tuple ( x1 , x2 - - - xn ) such


that xi = 0 or 1.
 xi = 1 means wi is included
 xi = 0 means wi is not included

 In solution space tree, left subtree of root defines all subsets containing wi i.e., xi = 1.
 Right subtree of the root defines all subsets not containing w i i.e., xi = 0.

Example: Solve sum of subsets problem when n = 4, w = (7, 11, 13, 24) and m = 31.

Algorithm SumOfSub ( s, k, r )
{
x [ k ] : = 1;
if ( s + w[k] = m ) then
write ( x [ 1 : k ] );
else if ( s + w [ k ] + w [ k + 1 ] ≤ m ) then
SumOfSub ( s + w[k], k+1, r-w[k] );
if ( ( s + r – w[k] ≥ m ) and ( s + w[k+1] ≤ m ) ) then
{
x [ k ] : = 0;
SumOfSub ( s , k+1, r-w[k] );
}
}

 In the worst case, the sum of subsets algorithm explores O(2ⁿ) nodes of the state space tree.
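SumOfSub can be sketched in Python; s is the sum of the included weights so far, r the sum of the remaining weights, and the two bounding tests mirror the pseudocode (weights assumed sorted in non-decreasing order; names are ours). For the example instance w = (7, 11, 13, 24), m = 31 it finds {7, 11, 13} and {7, 24}:

```python
def sum_of_subsets(w, m):
    """All subsets of w summing to m, via bounded depth-first search."""
    w = sorted(w)                 # the pruning below relies on this order
    solutions = []

    def solve(k, s, r, chosen):
        if s == m:                           # answer state
            solutions.append(chosen[:])
            return
        if k == len(w) or s + w[k] > m:      # smallest remaining weight overshoots
            return
        if s + r >= m:                       # remaining weights can still reach m
            chosen.append(w[k])
            solve(k + 1, s + w[k], r - w[k], chosen)   # include w[k]
            chosen.pop()
            solve(k + 1, s, r - w[k], chosen)          # exclude w[k]

    solve(0, 0, sum(w), [])
    return solutions
```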


GRAPH COLORING
Let G be a graph and m a given positive integer. The graph coloring problem is to
color each vertex of G in such a way that no two adjacent vertices have the same color and only
m colors are used.

 In general, this is treated as the m-colorability decision problem; the smallest integer m
for which the graph can be colored is called the chromatic number of the graph.

Example: Consider a graph on the vertices A, B, C, D, E, F with m = 3, the colors being
1, 2, 3. (Figure omitted: the graph and one valid coloring, e.g. A = 1, B = 2, C = 3,
D = 1, E = 2, F = 3.)

 Suppose we represent a graph by its adjacency matrix G[1:n , 1:n] where G[i, j] = 1 if
(i, j) is an edge of G; otherwise, G[i, j] = 0;

 The colors are represented by the integers 1, 2, - - -- , m and the solutions are given
by the n-tuple ( x1 , x2 - - - xn ) where xi is the color of node i.

Algorithm mColoring ( k )
{
// k is the index of the next vertex to color.

repeat
{
NextValue ( k ); // Assign to x[k] a legal color
if ( x[ k ] = 0 ) then
return;
if ( k = n ) then
write ( x [ 1 : n ] ) ;
else
mColoring ( k+1 ) ;
}
until(false);
}


Algorithm NextValue ( k )
{
repeat
{
x [ k ] : = ( x [k] + 1 ) mod (m + 1);
if ( x[ k ] = 0 ) then
return;
for j : =1 to n do
{
if ( ( G[k, j] ≠ 0 ) and ( x[k] = x[j] )) then
break;
}
if ( j = n + 1 ) then
return;
}
until (false);
}

Analysis:

 Computing time for NextValue to determine a legal color is O(mn).
 In the worst case, the state space tree has O(mⁿ) internal nodes, each of which invokes NextValue.
 Hence, the time complexity of the graph coloring algorithm is O(n·mⁿ).
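mColoring and NextValue can be sketched together in Python over an adjacency matrix (0-based; names are ours):

```python
def m_coloring(graph, m):
    """graph: n x n adjacency matrix (1 = edge). Returns every assignment of
    colors 1..m in which no two adjacent vertices share a color."""
    n = len(graph)
    x = [0] * n
    colorings = []

    def solve(k):
        for c in range(1, m + 1):
            # legal color: no neighbour among vertices 0..k-1 already uses c
            if all(not (graph[k][j] and x[j] == c) for j in range(k)):
                x[k] = c
                if k == n - 1:
                    colorings.append(x[:])
                else:
                    solve(k + 1)
                x[k] = 0                     # undo and try the next color

    solve(0)
    return colorings
```

On the triangle K3 with m = 3 it finds all 3! = 6 colorings, and none with m = 2, since the chromatic number of K3 is 3.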

0/1 KNAPSACK PROBLEM

Consider there are n objects and a Knapsack or bag. Every object i has a weight w i
and the knapsack has a capacity m. If the object i is placed into the knapsack, then a profit of
pixi is earned where xi = 0 or 1. The objective is to fill the knapsack that maximizes the total
profit earned.
Formally, the problem statement can be stated as:

maximize ∑ pi xi
1≤i≤n

subject to ∑ wi xi ≤ m
1≤i≤n

and xi = 0 or 1 , 1≤i≤n

 In backtracking, the 0/1 knapsack solution is represented by an n-tuple ( x1, x2, - - - , xn )
such that xi = 0 or 1.
 Assume the given objects are arranged in decreasing order of their profit / weight ratio.
Now, generate a solution space tree in which the left child corresponds to xi = 0 and the
right child to xi = 1.

Example: Obtain the 0/1 Knapsack solution for the instance n = 3, m = 6, (p1 , p2 , p3) = (5, 3, 6)
and (w1 , w2 , w3) = (3, 2, 4).


Algorithm Bound ( cp, cw, k )


{
// cp is the current profit total and cw is the current weight total
// k is the index of the last removed item

b : = cp;
c : = cw;
for i : = k + 1 to n do
{
c : = c + w[ i ];
if ( c < m ) then
b : = b + p[ i ];
else
return b + ( 1 – (c – m )/w[ i ] ) * p[ i ];
}
return b;
}

Algorithm BKnap( k, cp, cw )


{
if ( cw + w[k] ≤ m ) then
{
y[ k ] : = 1;
if ( k < n ) then
BKnap ( k+1, cp+p[k] , cw+w[k] );
if ( ( cp+p[k] > fp) and (k = n) ) then
{
fp : = cp + p[k];
fw : = cw + w[k];
for j : = 1 to k do
x[ j ] : = y [ j ];
}
}
if ( Bound( cp, cw, k ) ≥ fp ) then
{
y[ k ] : = 0;
if ( k < n ) then
BKnap ( k+1, cp , cw );
if ( ( cp > fp) and (k = n) ) then
{
fp : = cp;
fw : = cw;
for j : = 1 to k do
x[ j ] : = y [ j ];
}
}
}

 Time complexity of the above representation is O(2ⁿ).
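BKnap with the fractional-profit Bound can be sketched in Python (items assumed sorted by decreasing p/w, as in the notes; names and the simplification of returning only the best profit are ours). For the example instance n = 3, m = 6 above it returns the maximum profit 9:

```python
def bknap(p, w, m):
    """Backtracking 0/1 knapsack with a fractional upper bound for pruning."""
    n = len(p)
    best = [0]

    def bound(cp, cw, k):
        # greedy fractional fill of items k..n-1: an upper bound on any
        # completion of the state (cp, cw)
        b, c = cp, cw
        for i in range(k, n):
            if c + w[i] <= m:
                c += w[i]
                b += p[i]
            else:
                return b + (m - c) / w[i] * p[i]
        return b

    def solve(k, cp, cw):
        if k == n:
            best[0] = max(best[0], cp)
            return
        if cw + w[k] <= m:                       # branch x[k] = 1
            solve(k + 1, cp + p[k], cw + w[k])
        if bound(cp, cw, k + 1) > best[0]:       # branch x[k] = 0, if promising
            solve(k + 1, cp, cw)

    solve(0, 0, 0)
    return best[0]
```

The bound plays the role of the bounding function above: a subtree is entered only when its optimistic fractional profit can still beat the best complete solution found so far.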


BRANCH AND BOUND


Branch and bound is also used to find an optimal solution of a given problem. The term branch and bound refers to all state space search methods in which all children of the E-node are generated before any other live node can become the E-node.

 Here, a node's children are explored using either BFS or D-search.
 In the branch and bound technique, a BFS state space search is called FIFO (First In First Out) search, as the list of live nodes is a first-in-first-out list (queue).
 A D-search state space search is called LIFO (Last In First Out) search, as the list of live nodes is a last-in-first-out list (stack).
 As in backtracking, bounding functions are used to kill live nodes that cannot lead to an answer state.

LEAST COST (LC) SEARCH

With FIFO and LIFO branch and bound, the selection of the next E-node is rigid and can be time consuming. To improve performance, a cost is associated with each node and the live node of least cost becomes the next E-node; this method is called least cost (LC) search.

In LC search, the cost function c( . ) is defined as:

 If x is an answer node, then c(x) is the cost of reaching x from the root of the state space tree.
 If x is not an answer node, then c(x) = ∞.

Control abstraction algorithm for LC Search can be shown as:

Algorithm LCSearch ( t )
{
    E := t;
    repeat
    {
        for each child x of E do
        {
            if x is an answer node then
                write the path from x to t and return;
            Add ( x );              // add x to the list of live nodes
            x → parent := E;
        }
        if there are no more live nodes then
        {
            write ' No Answer Node ';
            return;
        }
        E := Least ( );             // next E-node: live node of least cost
    } until ( false );
}
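One way to realize this control abstraction concretely is to keep the live nodes in a min-heap keyed on cost, so that Least( ) is simply a heap pop. The sketch below is illustrative only: the function names and the toy state space (a complete binary tree on 1..15 with a made-up cost estimate) are my own inventions for the demo.

```python
import heapq

def lc_search(root, children, is_answer, c_hat):
    """LC-search: repeatedly expand the live node with least c_hat value."""
    live = [(c_hat(root), root)]          # min-heap of (cost, node)
    parent = {root: None}
    while live:
        _, e = heapq.heappop(live)        # E-node = least-cost live node
        for x in children(e):
            parent[x] = e
            if is_answer(x):
                path = []                 # recover the path from root to x
                while x is not None:
                    path.append(x)
                    x = parent[x]
                return path[::-1]
            heapq.heappush(live, (c_hat(x), x))
    return None                           # no answer node

# Toy state space: complete binary tree on 1..15, answer node is 11.
children = lambda n: [2 * n, 2 * n + 1] if 2 * n <= 15 else []
is_answer = lambda n: n == 11
c_hat = lambda n: abs(11 - n)             # a heuristic cost estimate
print(lc_search(1, children, is_answer, c_hat))   # [1, 2, 5, 11]
```

As in the pseudocode, an answer node is recognized as soon as it is generated, and the path back to the root is recovered through the parent links.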


BOUNDING
Bounding functions are used to avoid the generation of subtrees that do not contain an answer node. With a bounding function, lower and upper bound values are computed at each node.

 Assume each node x has a cost c(x) associated with it and a minimum-cost answer node is to be found.
 A cost function Ĉ( . ) such that Ĉ(x) ≤ c(x) is used to provide a lower bound on the solution obtainable from any node x.
 If upper is an upper bound on the cost of a minimum-cost solution, then all live nodes x with Ĉ(x) > upper can be killed.
 Initially, upper is set to ∞. Each time a new answer node is found, the value of upper is updated.

Note:

1) If the list of live nodes is implemented as a Queue with Least( ) and Add( ) functions,
then LC search is treated as “FIFO Branch and Bound”.
2) If the list of live nodes is implemented as a Stack with Least( ) and Add( ) functions,
then LC search is treated as “LIFO Branch and Bound”.

0/1 KNAPSACK PROBLEM

Consider n objects and a knapsack or bag. Every object i has a weight wi and the knapsack has a capacity m. If object i is placed into the knapsack, then a profit pixi is earned, where xi = 0 or 1. The objective is to fill the knapsack so as to maximize the total profit earned.
In the branch and bound strategy, the problem statement is converted into a minimization problem:

minimize - ∑ pi xi
1≤i≤n

subject to ∑ wi xi ≤ m
1≤i≤n

and xi = 0 or 1 , 1≤i≤n


 For a minimum-cost answer node, calculate for every node x:

        u(x) = - ∑ pi xi , 1 ≤ i ≤ n

        Ĉ(x) = u(x) - ( ( m – current total weight ) / actual weight of the remaining object ) * actual profit of the remaining object

Example: Obtain the LCBB solution for the knapsack instance n = 4, m = 15,
(p1, p2, p3, p4) = (10, 10, 12, 18) and (w1, w2, w3, w4) = (2, 4, 6, 9).
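A hedged sketch of LCBB on this instance: profits are negated so the search minimizes, Ĉ(x) serves as the heap priority, and u(x) is used to tighten upper as answer nodes are generated. The helper names (`lcbb_knapsack`, `bounds`) are my own, and objects are assumed sorted by profit/weight ratio.

```python
import heapq

def lcbb_knapsack(p, w, m):
    """LC branch and bound for 0/1 knapsack, minimizing -profit.
       Objects assumed sorted by p/w in decreasing order."""
    n = len(p)

    def bounds(k, cp, cw):
        # u: -(profit from greedily adding whole remaining objects that fit)
        # c_hat: u further relaxed by one fractional object (a lower bound)
        b, c, i = cp, cw, k
        while i < n and c + w[i] <= m:
            c += w[i]
            b += p[i]
            i += 1
        u = -b
        c_hat = u if i >= n else -(b + (m - c) / w[i] * p[i])
        return c_hat, u

    c_hat0, upper = bounds(0, 0, 0)
    heap = [(c_hat0, 0, 0, 0)]        # (c_hat, level k, profit cp, weight cw)
    while heap:
        c_hat, k, cp, cw = heapq.heappop(heap)
        if c_hat > upper or k == n:   # kill node / leaf already accounted for
            continue
        for take in (1, 0):           # left child xk = 1, right child xk = 0
            if take and cw + w[k] > m:
                continue              # object k does not fit
            ncp, ncw = cp + take * p[k], cw + take * w[k]
            ch, u = bounds(k + 1, ncp, ncw)
            upper = min(upper, u)     # a new answer node may lower upper
            if ch <= upper:
                heapq.heappush(heap, (ch, k + 1, ncp, ncw))
    return -upper                     # maximum achievable profit

print(lcbb_knapsack([10, 10, 12, 18], [2, 4, 6, 9], 15))   # 38
```

For the instance above, the optimal profit is 38, obtained by taking objects 1, 2 and 4 (weights 2 + 4 + 9 = 15).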

TRAVELING SALESPERSON
Let G = (V, E) be a directed graph defining an instance of the traveling salesperson problem. Let cij be the cost of edge <i, j> є E, and cij = ∞ if <i, j> ∉ E. Assume every tour starts and ends at vertex 1.

To use LCBB to search the traveling salesperson state space tree, define a cost
function c( . ) and other two functions Ĉ(.) and u( . ) such that Ĉ(r) ≤ c(r) ≤ u(r) for all
nodes r. The cost c( . ) is such that the solution node with least c( . ) corresponds to the
shortest tour in G.

To obtain traveling salesperson solution, apply branch and bound strategy as:

Step 1: Calculate the reduced cost matrix corresponding to G. A row (column) is said to be
        reduced iff it contains at least one zero and all remaining entries are non-negative.
        A matrix is said to be reduced iff every row and column is reduced.

        For this, choose the minimum entry in row i (column j) and subtract it from all
        entries in row i (column j).

Step 2: The total amount subtracted from the columns and rows is a lower bound on the
length of a minimum-cost tour and can be used as Ĉ value for the root of the state
space tree.

Step 3: Obtain a reduced cost matrix for every node in the traveling salesperson state
space tree.

Let A be the reduced cost matrix for node R. Let S be a child of R such that the tree
edge (R, S) corresponds to including edge <i, j> in the tour. If S is not a leaf node, then
reduced cost matrix for S may be obtained as follows:


 Change all entries in row i and column j of A to ∞.
 Set A(j, 1) to ∞.
 Reduce all rows and columns in the resulting matrix, except for rows and columns containing only ∞.
 Let the resulting matrix be B. If r is the total amount subtracted in the previous sub-step, then
        Ĉ(S) = Ĉ(R) + A[i, j] + r
 Initially, assume upper = ∞. After reaching an answer (leaf) node, kill the live nodes with Ĉ(x) > upper.
 When the process reaches a leaf node, it yields an optimal tour of the traveling salesperson problem.

Example: Obtain the optimal tour for the cost matrix

         ∞   20   30   10   11
        15    ∞   16    4    2
         3    5    ∞    2    4
        19    6   18    ∞    3
        16    4    7   16    ∞
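Steps 1 and 2 can be checked mechanically on this matrix. The sketch below (the function name `reduce_matrix` is my own) performs the row and column reductions; the total amount subtracted is the Ĉ value for the root of the state space tree.

```python
INF = float('inf')

def reduce_matrix(a):
    """Row- and column-reduce a cost matrix.
       Returns (reduced matrix, total amount subtracted)."""
    a = [row[:] for row in a]          # work on a copy
    n = len(a)
    total = 0
    for i in range(n):                 # reduce each row by its minimum
        r = min(a[i])
        if r != INF and r > 0:
            total += r
            a[i] = [x - r if x != INF else INF for x in a[i]]
    for j in range(n):                 # reduce each column by its minimum
        c = min(a[i][j] for i in range(n))
        if c != INF and c > 0:
            total += c
            for i in range(n):
                if a[i][j] != INF:
                    a[i][j] -= c
    return a, total

cost = [[INF, 20, 30, 10, 11],
        [15, INF, 16, 4, 2],
        [3, 5, INF, 2, 4],
        [19, 6, 18, INF, 3],
        [16, 4, 7, 16, INF]]
reduced, lb = reduce_matrix(cost)
print(lb)   # 25: lower bound (C-hat) for the root node
```

Here the row minima (10, 2, 2, 3, 4) contribute 21 and the remaining column minima contribute 4, so the root's lower bound is 25.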

END


UNIT – V

NP Hard and NP Complete Problems: Basic Concepts, Cook’s theorem.


NP Hard Graph Problems: Clique Decision Problem (CDP), Chromatic Number Decision
Problem (CNDP), Traveling Salesperson Decision Problem (TSP).
NP Hard Scheduling Problems: Scheduling Identical Processors, Job Shop Scheduling.
***

P & NP-Class Problems: Depending on their computational complexity, problems can be classified into two classes: P-class and NP-class problems.

A set of decision problems that can be solved in deterministic polynomial time are called P-class problems.
Example: Searching : O(log n)
         Sorting techniques : O(n²), O(n log n)
         Matrix multiplication : O(n³) etc.

A problem for which no deterministic polynomial-time algorithm is known, but whose solution can be verified in non-deterministic polynomial time, is known as an NP-class problem.
Example: Knapsack problem
         Traveling salesperson problem etc.

Closely related to NP are two further classes of problems: NP-Hard and NP-Complete problems.

NP-Hard Problem

A problem L is NP-Hard iff the satisfiability (SAT) problem reduces to L.

 If any one NP-Hard problem can be solved in polynomial time, then all NP-Hard problems can be solved in polynomial time.
 If an NP-Hard problem can be solved in polynomial time, then all NP-Complete problems can be solved in polynomial time.

NP-Complete Problem

A problem L is NP-Complete if
 it belongs to NP and is NP-Hard;
 it can be solved in polynomial time iff all other NP-Complete problems can also be solved in polynomial time.

 All NP-Complete problems are NP-Hard, but some NP-Hard problems are not NP-Complete.

Properties:

P and NP: P is the set of all decision problems solvable by deterministic algorithms in
polynomial time. NP is the set of all decision problems solvable by nondeterministic
algorithms in polynomial time.


 Since deterministic algorithms are just a special case of nondeterministic ones, we conclude that P ⊆ NP.
 A problem L is NP-Hard iff the satisfiability problem reduces to L. A problem L is NP-Complete iff L is NP-Hard and L є NP.
 Let L1 and L2 be two problems. Problem L1 reduces to L2 iff any polynomial-time algorithm for L2 can be used to solve L1 in polynomial time.

Cook’s Theorem
Cook’s theorem states that satisfiability is in P iff P = NP. Stephen Cook proved in 1971 that the Boolean satisfiability problem is NP-Complete.

SAT is in NP because a non-deterministic algorithm can guess an assignment of truth values for the variables. An expression is satisfiable if its value evaluates to true for some assignment of Boolean values.

Example: Consider the Boolean formula:

        ( x1 V x2 V ¬x3 ) ^ ( ¬x1 V ¬x2 V ¬x3 ) ^ ( x1 V ¬x2 )

Let x1 = 1, x2 = 0, x3 = 0.

Then, result of the formula = ( 1 V 0 V 1 ) ^ ( 0 V 1 V 1 ) ^ ( 1 V 1 )
                            = 1 ^ 1 ^ 1
                            = 1 (TRUE)

The non-deterministic algorithm can then evaluate the expression under the guessed assignment in polynomial time and accept if the entire expression is true.
Cook’s theorem therefore shows that if the Boolean satisfiability problem can be solved in polynomial time, then every problem in NP can be solved in polynomial time.
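For contrast with polynomial-time verification of a single guessed assignment, a deterministic brute-force satisfiability check must try all 2^n assignments. A small illustrative sketch, with the example formula encoded as a lambda (an assumption of mine, purely for demonstration):

```python
from itertools import product

def satisfiable(formula, n):
    """Brute-force SAT check over all 2^n truth assignments (exponential);
    only *verifying* one given assignment is polynomial."""
    return any(formula(*bits) for bits in product([0, 1], repeat=n))

# The example formula above, encoded as a lambda for illustration
f = lambda x1, x2, x3: (x1 or x2 or not x3) and \
                       (not x1 or not x2 or not x3) and \
                       (x1 or not x2)

print(satisfiable(f, 3))   # True, e.g. with x1 = 1, x2 = 0, x3 = 0
```

The gap between the exponential search and the polynomial verification of one assignment is exactly what separates solving an NP problem from verifying its certificate.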


NP Hard Graph Problems


Clique Decision Problem: A clique is a subgraph of a graph such that all the vertices in this subgraph are connected with each other; that is, the subgraph is a complete graph. The Maximal Clique Problem is to find the maximum-sized clique of a given graph G, that is, a complete graph which is a subgraph of G and contains the maximum number of vertices. This is an optimization problem.

The Clique Decision Problem belongs to NP – If a problem belongs to the NP class, then it should have polynomial-time verifiability; that is, given a certificate, we should be able to verify in polynomial time whether it is a solution to the problem.

Proof:
1. Certificate – Let the certificate be a set S consisting of the nodes of the claimed clique; S induces a subgraph of G.

2. Verification – We have to check whether there exists a clique of size k in the graph. Verifying that the number of nodes in S equals k takes O(1) time. Verifying that each vertex has degree (k−1) within S takes O(k²) time. (In a complete graph, each vertex is connected to every other vertex through an edge; hence the total number of edges in a complete graph is kC2 = k(k−1)/2.) Therefore, checking whether the graph formed by the k nodes in S is complete takes O(k²) = O(n²) time (since k ≤ n, where n is the number of vertices in G).

Therefore, the Clique Decision Problem has polynomial time verifiability and hence
belongs to the NP Class.
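The verification argument can be made concrete. Below is a hypothetical certificate verifier of my own (names `verify_clique`, `edges` are assumptions), with the graph given as a set of undirected edges; it runs the O(1) size check followed by the O(k²) pairwise edge checks described above.

```python
def verify_clique(adj, S, k):
    """Polynomial-time verifier for the clique decision problem.
       adj: set of frozenset edges; S: candidate vertex set; k: target size."""
    if len(S) != k:                       # O(1) size check
        return False
    nodes = list(S)
    for i in range(k):                    # O(k^2) pairwise edge checks
        for j in range(i + 1, k):
            if frozenset((nodes[i], nodes[j])) not in adj:
                return False
    return True

# Toy graph: triangle 1-2-3 plus a pendant vertex 4 attached to 3
edges = {frozenset(e) for e in [(1, 2), (2, 3), (1, 3), (3, 4)]}
print(verify_clique(edges, {1, 2, 3}, 3))   # True
print(verify_clique(edges, {1, 2, 4}, 3))   # False: edges (1,4), (2,4) missing
```

Note that the verifier only checks a given certificate; finding a maximum clique in the first place remains the hard part.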

The Clique Decision Problem belongs to NP-Hard – A problem L belongs to NP-Hard if every NP problem is reducible to L in polynomial time. Now, let the Clique Decision Problem be C. To prove that C is NP-Hard, we take an already known NP-Hard problem, say S, and reduce it to C for a particular instance. If this reduction can be done in polynomial time, then C is also an NP-Hard problem. The Boolean Satisfiability Problem (S) is an NP-Complete problem, as proved by Cook’s theorem. Therefore, every problem in NP can be reduced to S in polynomial time. Thus, if S is reducible to C in polynomial time, every NP problem can be reduced to C in polynomial time, thereby proving C to be NP-Hard.

Conclusion
The Clique Decision Problem is NP and NP-Hard. Therefore, the Clique decision problem
is NP-Complete.

Traveling Salesperson problem

The traveling salesman problem consists of a salesman and a set of cities. The
salesman has to visit each one of the cities starting from a certain one and returning to the
same city. The challenge of the problem is that the traveling salesman wants to minimize the
total length of the trip.

THE END