
Sub: Data Structures Unit-I

Topics:
1. Definition of Data Structures
2. Classification of Data Structures
3. Abstract Data Type(ADT)
4. Analysis of Algorithms
5. Recursion - Examples
6. Analysis of Recursive Algorithms
7. Sorting:
- Quick Sort
- Merge Sort
- Selection Sort
- Radix Sort
8. Comparison of Sorting Algorithms

Data structure:
In the modern world, data and information are essential: data is simply a collection of facts or a set of values in a particular format, and information is the processed data.

If the data is not organized effectively, it is very difficult to perform any task on large amount
of data. If it is organized effectively then any operation can be performed easily on that data.

If the data is stored in a well-organized way on storage media and in computer's memory
then it can be accessed quickly for processing that further reduces the latency and the user
is provided a fast response.

A data structure is a particular way of organizing a large amount of data more efficiently in a
computer so that any operation on that data becomes easy.

In other words, Data structure is a way of collecting and organizing data in such a way that
we can perform operations on these data in an effective way.

A data structure arranges data elements in terms of some relationship, for better organization, storage and performance.

The logical or mathematical model for a particular organization of data is termed as a data
structure.



In simple words, Data structures are structures programmed to store ordered data, so that
various operations can be performed on it easily.

It represents how the data is to be organized in memory. It should be designed and implemented in such a way that it reduces complexity and increases efficiency.

Data Structure is useful in representing the real world data and its operations in the
computer program.

Based on the organizing method of a data structure, data structures are divided into two
types:

1. Primitive Data Structures (Built-In Data Structures)


2. Non-primitive Data Structures (User-defined Data Structures)

Primitive data structures are those which are the predefined way of storing data by the
system. And the set of operations that can be performed on these data are also predefined.

Primitive data structures are char, int, double and float.

Most of the programming languages have built-in support for the primitive data structures.

Non-primitive data structures are more complicated data structures and they are derived
from primitive data structures.

Non-primitive data structures are used to store large and connected data. Some example of
non-primitive data structures are: Linked List, Tree, Graph, Stack and Queue.

The non-primitive data structures are subcategorized into two types: linear data structures and non-linear data structures.

If a data structure is organizing the data in sequential order then that data structure is called
a linear data structure.

Some of the examples are Arrays, Linked Lists, Stacks and Queues.

If a data structure is organizing the data in random order or hierarchical order, not in
sequential order, then that data structure is called as non-linear data structure.

Some of the examples are Trees, Graphs, Dictionaries and Heaps.



Abstract Data Types (ADTs):
To simplify the process of solving problems, we combine the data structures with their
operations and we call this Abstract Data Types (ADTs). An ADT consists of two parts:
1. Declaration of data
2. Declaration of operations.
Commonly used ADTs include: Linked Lists, Stacks, Queues, Priority Queues, Binary Trees,
Dictionaries, Disjoint Sets (Union and Find), Hash Tables, Graphs, and many others.
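
For instance, a stack ADT bundles the stored data with operations such as push and pop. A minimal Python sketch is added here for illustration (it is not part of the original notes):

class Stack:
    """A minimal Stack ADT: the data (a list) plus its operations."""

    def __init__(self):
        self._items = []          # declaration of data

    # declaration of operations
    def push(self, item):
        self._items.append(item)

    def pop(self):
        if not self._items:
            raise IndexError("pop from an empty stack")
        return self._items.pop()

    def peek(self):
        return self._items[-1]

    def is_empty(self):
        return len(self._items) == 0

# Example usage
s = Stack()
s.push(10)
s.push(20)
print(s.pop())   # 20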

Introduction to Algorithms
● An Algorithm is a finite set of instructions or logic, written in order, to accomplish a
certain predefined task.
● An Algorithm is independent of the programming language. An Algorithm is the core
logic to solve a given problem.
● An Algorithm is expressed generally as flow chart or as an informal high level
description called as pseudocode
● Algorithm can be defined as “a sequence of steps to be performed for getting the
desired output for a given input.”
● Before attempting to write an algorithm, one should find out what the expected
inputs and outputs are for the given problem.

The properties of an algorithm are:


1. Input: An algorithm should accept zero or more externally supplied inputs.
2. Output: An algorithm should generate at least one output.
3. Definiteness: Each step of an algorithm must be precisely defined, so that it performs a clearly defined task without ambiguity.
4. Finiteness: An algorithm must always terminate after a finite number of steps.
5. Effectiveness: The efficiency of the steps and the accuracy of the output determine the
effectiveness of the algorithm.
6. Correctness: Each step of the algorithm must generate a correct output.

An algorithm is said to be efficient and fast, if it takes less time to execute and consumes
less memory space.

The performance of an algorithm is measured on the basis of following properties:


● Time complexity
● Space complexity

Space and Time Complexities


Space complexity is the amount of memory used by the algorithm (including the input
values to the algorithm) to execute and produce the result.

When we design an algorithm to solve a problem, it needs some computer memory to execute.

For any algorithm to execute it requires memory for the following purposes:
1. Memory required to store the program instructions. (Also called as Instruction space)
2. Memory required to store constants and variables. (Also called as Data space )
3. Memory that is to be dynamically allocated. Memory that is required for storing data
between functions. (Also called as Environment space)

An algorithm may require inputs, variables, and constants of different data types.

For calculating the space complexity, we need to know the amount of memory used by
variables of different data types, which generally varies for different operating systems.



Let us see the procedure for calculating the space complexity of a code with an example:
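
(The code discussed below is not reproduced in this copy of the notes; the following Python sketch shows the kind of function being analyzed, assuming two int parameters x and y and one local int z.)

def add(x, y):     # x and y: two int inputs passed as formal parameters
    z = x + y      # z: a local int variable holding the sum
    return z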

The above code takes two inputs x and y of type int as formal parameters.

In the code, another local variable z of type int is used for storing the sum of x and y.

The int data type takes 2 bytes of memory, so the total space complexity is 3 (number of
variables) * 2 (Size of each variable) = 6 bytes.

The space requirement of this algorithm is fixed for any input given to the algorithm, hence
it is called as constant space complexity

Let us understand the calculation of space complexity with one more example:
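
(Again the referenced code is not reproduced in this copy; a plausible sketch, assuming an array a[] of n ints and int variables x, n and i:)

def sum_array(a, n):   # array a[] of n ints
    x = 0              # int variables x, n and i
    for i in range(n):
        x = x + a[i]
    return x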

In the above code, 2 * n bytes of memory (size of int data type is 2) is required by the array
a[ ] and 2 bytes of memory for each variable of x, n and i.

Hence the total space requirement for the above code would be (2 * n + 6).

Since the space requirement of the program increases linearly with the size of the input (the array size n), it is called Linear Space Complexity.

Similarly, when the memory requirement of the algorithm increases quadratically with the given input, it is called "Quadratic Space Complexity".

Similarly, when the memory requirement of the algorithm increases cubically with the given input, it is called "Cubic Space Complexity", and so on.



Time complexity is the computational complexity that measures or estimates the time
taken to run an algorithm.

The total time required by the algorithm to run till its completion depends on the number of
(machine) instructions the algorithm executes during its running time.

The actual time taken differs from machine to machine as it depends on several factors like
CPU speed, memory access speed, OS, processor etc.

So, we will take the number of operations executed in the algorithm as the time complexity.

Time complexity is commonly estimated by counting the number of elementary operations performed by the algorithm, supposing that each elementary operation takes a fixed amount of time to perform.

Thus, the amount of time taken and the number of elementary operations performed by the
algorithm differ by at most a constant factor.

Examples:
1.
   Algorithm sum(A,n):          Time taken by each operation
       s = 0                    - 1 unit
       i = 0                    - 1 unit
       while(i < n):            - n+1 units
           s = s + A[i]         - n units
           i++                  - n units
       return s                 - 1 unit

   Total: 3n+4, i.e., O(n)

2. Algorithm add(A,B,n):
       for i in range(0,n):                 - n units
           for j in range(0,n):             - n*n units
               C[i,j] = A[i,j] + B[i,j]     - n*n units

   Total: n + 2n² i.e., O(n²)

3. Algorithm multiply(A,B,n):
       for i in range(0,n):                          - n
           for j in range(0,n):                      - n*n
               C[i,j] = 0                            - n*n
               for k in range(0,n):                  - n*n*n
                   C[i,j] = C[i,j] + A[i,k] * B[k,j] - n*n*n

   Total: n + 2n² + 2n³ ~ O(n³)



4. for i in range(0,n,2):
       stmt
   stmt executes n/2 times, so the time complexity is O(n).

5. i = n
   while(i >= 1):
       i = i / 2

1st iteration: i = n
2nd iteration: i = n/2
.
.
.
kth iteration: i = n/2^k
Assume the loop terminates when i < 1, i.e., n/2^k < 1; approximately n/2^k = 1
=> n = 2^k
=> log2(n) = log2(2^k)    [taking log on both sides]
=> log2(n) = k
So, the time complexity of the above loop is O(log2(n)).

Constant Time — O(1)


An algorithm is said to have a constant time when it is not dependent on the input data (n).
No matter the size of the input data, the running time will always be the same.

Ex1:
if a > b:
    return True
else:
    return False

Ex2:
def get_first(data):
    return data[0]

if __name__ == '__main__':
    data = [1, 2, 9, 8, 3, 4, 7, 6, 5]
    print(get_first(data))

Logarithmic Time — O(log n)


An algorithm is said to have a logarithmic time complexity when it shrinks the portion of the input it still has to examine by a constant factor at each step (it does not need to look at all values of the input data).

Ex1:
index = 1
while index < len(data):
    print(data[index])
    index *= 2

Ex2:
def binary_search(data, value):
    n = len(data)
    left = 0
    right = n - 1
    while left <= right:
        middle = (left + right) // 2
        if value < data[middle]:
            right = middle - 1
        elif value > data[middle]:
            left = middle + 1
        else:
            return middle
    raise ValueError('Value is not in the list')

if __name__ == '__main__':
    data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
    print(binary_search(data, 8))

Linear Time — O(n)


An algorithm is said to have a linear time complexity when the running time increases at
most linearly with the size of the input data

Ex1:
for value in data:
    print(value)



Ex2:
def linear_search(data, value):
    for index in range(len(data)):
        if value == data[index]:
            return index
    raise ValueError('Value not found in the list')

if __name__ == '__main__':
    data = [1, 2, 9, 8, 3, 4, 7, 6, 5]
    print(linear_search(data, 7))

Quasilinear Time — O(n log n)


An algorithm is said to have a quasilinear time complexity when it performs a logarithmic-time operation for each of the n elements of the input data. It is commonly seen in sorting algorithms (e.g. mergesort, timsort, heapsort).

For example: for each value in the data1 (O(n)) use the binary search (O(log n)) to search the
same value in data2.

result = []
for value in data1:
    result.append(binary_search(data2, value))

Quadratic Time — O(n²)


An algorithm is said to have a quadratic time complexity when it needs to perform a linear
time operation for each value in the input data,

for example:
for x in data:
    for y in data:
        print(x, y)

Exponential Time — O(2^n)


An algorithm is said to have an exponential time complexity when the growth doubles with
each addition to the input data set. This kind of time complexity is usually seen in
brute-force algorithms.
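
A classic illustration (not from the original notes) is the naive recursive computation of Fibonacci numbers, whose number of calls roughly doubles as n grows:

def fibonacci(n):
    # Naive recursion: each call spawns two more calls, giving O(2^n) growth.
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

print(fibonacci(10))   # 55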

Asymptotic Notations
Asymptotic notations are a set of mathematical tools used in computer science to describe
and analyze the growth rate of functions. These notations allow us to simplify the
mathematical representation of the functions, making it easier to compare the relative
growth rates of different functions. Some of the most commonly used asymptotic notations
are big O, little o, big Theta, big Omega, and little omega.



Here are a few examples of each of these notations:

1. Big O notation (O(n)): This notation represents an upper bound on the growth rate of
a function. If f(n) = O(g(n)), it means that there exists a constant c > 0 and a constant
n0 such that for all n >= n0, f(n) <= c * g(n). This means that the function f grows at
most as fast as g, as n approaches infinity. For example, the function f(n) = n^2 is
O(n^2).

2. Little o notation (o(n)): This notation represents a strict upper bound on the growth
   rate of a function. If f(n) = o(g(n)), it means that for every constant c > 0 there exists a
   constant n0 such that for all n >= n0, f(n) < c * g(n). This means that the function f
   grows strictly slower than g, as n approaches infinity. For example, the function f(n) =
   n^2 is o(n^3).

3. Big Theta notation (Θ(n)): This notation represents a tight bound on the growth rate
   of a function. If f(n) = Θ(g(n)), it means that there exist constants c1, c2 > 0 and a
   constant n0 such that for all n >= n0, c1 * g(n) <= f(n) <= c2 * g(n). This means that
   the function f grows at the same rate as g, as n approaches infinity. For example, the
   function f(n) = n^2 is Θ(n^2).

4. Big Omega notation (Ω): Big-Omega (Ω) notation gives a lower bound for a function
   f(n) to within a constant factor. We write f(n) = Ω(g(n)) if there are positive constants
   n0 and c such that, to the right of n0, f(n) always lies on or above c*g(n). Formally,
   Ω(g(n)) = { f(n) : there exist positive constants c and n0 such that 0 ≤ c*g(n) ≤ f(n) for all n ≥ n0 }.

5. Little omega notation (ω(n)): This notation represents a strict lower bound on the growth
   rate of a function. If f(n) = ω(g(n)), it means that for every constant c > 0 there exists a
   constant n0 such that for all n >= n0, f(n) > c * g(n). This means that the function f
   grows strictly faster than g, as n approaches infinity. For example, the function f(n) =
   n^2 is ω(n).

General Properties:
1. If f(n) is O(g(n)) then a*f(n) is also O(g(n)) ; where a is a constant.

Example:
f(n) = 2n²+5 is O(n²)
then 7*f(n) = 7(2n²+5)
= 14n²+35 is also O(n²)

Similarly, this property satisfies both Θ and Ω notation. We can say


If f(n) is Θ(g(n)) then a*f(n) is also Θ(g(n)); where a is a constant.
If f(n) is Ω (g(n)) then a*f(n) is also Ω (g(n)); where a is a constant.
2. Reflexive Properties:
If f(n) is given then f(n) is O(f(n)).

Example: f(n) = n² is O(n²), i.e., O(f(n))

Similarly, this property satisfies both Θ and Ω notation. We can say


If f(n) is given then f(n) is Θ(f(n)).
If f(n) is given then f(n) is Ω (f(n)).

3. Transitive Properties :
If f(n) is O(g(n)) and g(n) is O(h(n)) then f(n) = O(h(n)) .
Example: if f(n) = n , g(n) = n² and h(n)=n³
n is O(n²) and n² is O(n³) then n is O(n³)

Similarly this property satisfies for both Θ and Ω notation. We can say
If f(n) is Θ(g(n)) and g(n) is Θ(h(n)) then f(n) = Θ(h(n)) .
If f(n) is Ω (g(n)) and g(n) is Ω (h(n)) then f(n) = Ω (h(n))

4. Symmetric Properties :
If f(n) is Θ(g(n)) then g(n) is Θ(f(n)) .

Example: f(n) = n² and g(n) = n² then f(n) = Θ(n²) and g(n) = Θ(n²)
This property only satisfies for Θ notation.

5. Transpose Symmetric Properties :


If f(n) is O(g(n)) then g(n) is Ω (f(n)).

Example: f(n) = n , g(n) = n² then n is O(n²) and n² is Ω (n)


This property only satisfies for O and Ω notations.

Some More Properties :


1. If f(n) = O(g(n)) and f(n) = Ω(g(n)) then f(n) = Θ(g(n))
2. If f(n) = O(g(n)) and d(n)=O(e(n))
then f(n) + d(n) = O( max( g(n), e(n) ))

Recursion:
In programming, recursion is a technique where a function calls itself with different input
parameters until a specific condition is met.
Here are some key concepts related to recursion :
Base case: The base case is the simplest version of the problem that can be solved without
further recursion. It serves as the stopping condition for the recursive algorithm. Without a
base case, the recursion will continue indefinitely and lead to a stack overflow error.



Recursive case: The recursive case is part of the algorithm that breaks down the problem
into smaller subproblems and calls itself with these subproblems as inputs. The recursive
case typically involves a function calling itself with a smaller input and then combining the
results of these smaller subproblems to solve the original problem.
Call stack: The call stack is a data structure that keeps track of the function calls made
during the execution of a program. Each time a function is called, a new entry is added to the
call stack. When a function completes, its entry is removed from the call stack. Recursion
can cause the call stack to grow very large, which can lead to a stack overflow error.
Tail recursion: Tail recursion is a special case of recursion where the recursive call is the last
operation performed by the function. In this case, the compiler or interpreter can optimize
the recursion to use a constant amount of memory, which can reduce the risk of a stack
overflow error.

Tail recursion:
● Tail recursion is a technique used in recursive functions where the last operation
performed in the recursive function is the recursive call itself.
● In other words, the function call is made at the end of the function, and there is no
pending operation after the call.
● In a traditional recursive function, each recursive call adds a new stack frame to the
call stack, which can lead to stack overflow errors if the recursion depth is too deep.
● However, in a tail-recursive function, the compiler can optimize the code so that the
function does not create a new stack frame for each recursive call, but instead reuses
the existing stack frame.
● This optimization is possible because the function's final operation is the recursive
call, so there is no need to keep the previous stack frames in memory.
● By reusing the same stack frame, the function can avoid stack overflow errors and
improve its performance.

Here is the tail-recursive version of the function that computes the factorial of a number

def factorial_tail(n, acc):
    if n == 0:
        return acc
    else:
        return factorial_tail(n - 1, acc * n)

def factorial(n):
    return factorial_tail(n, 1)

In the tail-recursive version, the function factorial_tail takes two arguments: n, which
represents the number to compute the factorial of, and acc, which represents the
accumulator that holds the intermediate result of the computation. The function starts with
an initial value of acc=1 and multiplies it by n in each recursive call. When n reaches 0, the



function returns the final result in the acc variable. The factorial function simply calls factorial_tail with the initial value of acc=1. This tail-recursive version of the factorial function can be optimized, in languages whose compilers or interpreters perform tail-call elimination, to avoid stack overflow errors and improve performance.

The flow of recursive programs:


Recursive programs call themselves to solve a problem by breaking it down into smaller
subproblems. The flow of execution in recursive programs can be a bit tricky to follow, but
the general pattern is as follows:

The initial call: The program starts by calling the recursive function with some initial input.
Base case: The recursive function checks if the input meets some base case condition, which
is a simple case that can be solved directly without recursion. The function returns a result if
the input meets the base case condition.
Recursive case: If the input does not meet the base case condition, the function breaks down
the problem into smaller subproblems, which are solved by calling the function recursively
with the subproblems as input.
Backtracking: Once the function has solved the subproblems, it combines their results to
solve the original problem and returns the result to the caller.

The Call Stack:The call stack is a data structure used by the computer to keep track of the
order in which function calls are made and their corresponding variables and parameters.
In recursive programs, the call stack plays an important role in keeping track of the state of
the program as it executes the recursive function calls.

When a recursive function is called, a new frame is added to the call stack to keep track of
the function's local variables and parameters.

The function then proceeds to call itself recursively with a smaller subproblem.

Each recursive call adds a new frame to the call stack, pushing the previous frames down.
As the function calls continue, the call stack grows taller and taller until a base case is
reached.

At this point, the recursive function returns a value, and the computer begins to unwind the
call stack, popping the frames off the stack one by one.

As each frame is popped off the stack, the function returns to the state it was in before the
recursive call, with its original variables and parameters.

Here's an example to illustrate the call stack in a recursive program in python:

def factorial(n):
    if n == 0:
        return 1
    else:
        return n * factorial(n-1)

result = factorial(5)
print(result)

The initial call: factorial(5) is called, and a new frame is added to the call stack.
Recursive call 1: factorial(4) is called, and a new frame is added to the call stack.
Recursive call 2: factorial(3) is called, and a new frame is added to the call stack.
Recursive call 3: factorial(2) is called, and a new frame is added to the call stack.
Recursive call 4: factorial(1) is called, and a new frame is added to the call stack.
Recursive call 5: factorial(0) is called, and a new frame is added to the call stack.
Base case: factorial(0) returns 1, and its frame is popped off the call stack.
Backtracking 1: factorial(1) returns 1*1=1, and its frame is popped off the call stack.
Backtracking 2: factorial(2) returns 2*1=2, and its frame is popped off the call stack.
Backtracking 3: factorial(3) returns 3*2=6, and its frame is popped off the call stack.
Backtracking 4: factorial(4) returns 4*6=24, and its frame is popped off the call stack.
Backtracking 5: factorial(5) returns 5*24=120, and its frame is popped off the call stack.
So the final result is 120, and the call stack is empty.

Analysis of Recursive Algorithms:


Analysis of Recursive algorithms can be known by using the recurrence relations
A recursive relation is a mathematical model that captures the underlying time complexity of
the algorithm

T(n) denotes the number of comparisons incurred by an algorithm on an input of size 'n'.

Recurrence Relation for Linear Search:


A recurrence relation for linear search can be expressed mathematically as follows:

Let T(n) represent the time it takes to perform a linear search on an array of size n.

If the element is found at the first position (the best-case scenario), it takes only 1 comparison to find the element. Therefore, T(n) = 1 in the best case.

In the worst-case scenario, you may have to search through the entire array. If the element is
not present, you will perform n comparisons. Therefore, T(n) = n.

In the average case, you can assume that the element you are searching for is equally likely
to be at any position in the array. So, on average, you will search half of the array, which
would be (n/2) comparisons.



The recurrence relation for the average case of linear search can be expressed as:

T(n) = T(n-1) + 1

This recurrence relation states that the time it takes to perform a linear search on an array of
size n is equal to the time it takes to perform a linear search on an array of size (n-1) plus
one comparison.

The base cases for this recurrence relation are T(1) = 1 (best-case) and T(n) = n
(worst-case).

Solving this recurrence relation will give you the average-case time complexity of linear
search.
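
For reference, a recursive linear search whose running time follows the relation T(n) = T(n-1) + 1 might look like this (a sketch, not from the original notes):

def linear_search_recursive(data, value, index=0):
    # Base case: past the end of the list -> the value is absent
    if index == len(data):
        raise ValueError('Value not found in the list')
    # One comparison ...
    if data[index] == value:
        return index
    # ... plus a search over the remaining n-1 elements: T(n) = T(n-1) + 1
    return linear_search_recursive(data, value, index + 1)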

Recurrence Relation for Binary Search:


A recurrence relation for binary search can be described as follows:

Let T(n) represent the number of comparisons or steps required to perform a binary search
on a sorted array of size n.

The binary search algorithm divides the array into two halves and checks the middle
element to determine whether the target value is in the left or right half of the array. This
leads to a recurrence relation:

T(n) = T(n/2) + 1

Here's what each part of the relation means:

T(n) represents the total number of comparisons or steps required to perform a binary
search on an array of size n.
T(n/2) represents the number of comparisons required to perform a binary search on one
half of the array (either the left or right half).
1 represents the initial comparison to check the middle element.

The base case for this recurrence relation is T(1) = 1 because when you have only one
element, you've already found the target element (or determined that it's not present) with
a single comparison.
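
A recursive form of the earlier binary_search function, matching T(n) = T(n/2) + 1, could look like this (a sketch; the notes give only the iterative version):

def binary_search_recursive(data, value, left, right):
    if left > right:                      # empty range: value absent
        raise ValueError('Value is not in the list')
    middle = (left + right) // 2          # the single comparison counted as +1
    if data[middle] == value:
        return middle
    elif value < data[middle]:
        return binary_search_recursive(data, value, left, middle - 1)   # T(n/2)
    else:
        return binary_search_recursive(data, value, middle + 1, right)  # T(n/2)

# Initial call: binary_search_recursive(data, value, 0, len(data) - 1)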

Recurrence Relation for Ternary Search:


T(n) = T(n/3) + 2

Recurrence Relation for Factorial Search:


The recurrence relation for factorial search can be defined as follows:



T(n) = T(n-1) + 1

Here's what each part of the relation means:

T(n) represents the total number of comparisons or steps required to perform a factorial
search on a list of size n.
T(n-1) represents the number of comparisons required to perform a factorial search on the
list of size n-1, as you eliminate some permutations.
1 represents the comparison needed to check the current permutation with the target
permutation.
The base case for this recurrence relation is T(1) = 1 because, when you have only one
permutation left in the list, you've already found the target permutation with a single
comparison.

Solving Recursive relations:


Using Substitution Method:

1.T(n) = T(n-1) + 1 if n>0


T(n) = 1 if n=0

sol:
Find T(n-1) = T(n-2) + 1 and substitute it into the equation for T(n), which gives
T(n) = T(n-2) + 2
.
.
Repeat for k times which gives
T(n) = T(n-k) + k

Assume n-k =0 => n=k. Then T(n) becomes


T(n) = T(0) + n
= 1+n
~O(n)

2. T(n) = n + T(n-1) if n>0


= 1 if n=0

Sol:
T(n) = n + T(n-1)
     = n + (n-1) + T(n-2)              [substitute T(n-1) = (n-1) + T(n-2)]
     = n + (n-1) + (n-2) + T(n-3)      [substitute T(n-2) = (n-2) + T(n-3)]
     .... k times
     = n + (n-1) + (n-2) + .... + (n-(k-1)) + T(n-k)
     = T(0) + 1 + 2 + 3 + .... + n     {Assume n-k=0 => n=k}
     = T(0) + n(n+1)/2 ~ O(n²)

3. T(n) = T(n-1) + log(n) if n>0


=1 if n=0

Sol:
T(n) = T(n-1) + log(n)
     = T(n-2) + log(n-1) + log(n)             [substitute T(n-1) = T(n-2) + log(n-1)]
     = T(n-3) + log(n-2) + log(n-1) + log(n)  [substitute T(n-2) = T(n-3) + log(n-2)]
     .... k times
     = T(n-k) + log(n-(k-1)) + log(n-(k-2)) + .... + log(n)
     = T(0) + log(1) + log(2) + .... + log(n) {Assume n-k=0 => n=k}
     = 1 + log(1*2*3*....*(n-1)*n)
     = 1 + log(n!)
     ~ O(n log n)

4. T(n) = 2*T(n-1) + 1 if n>0


=1 if n=0

Sol:
T(n) = 2*T(n-1) + 1
     = 2*(2*T(n-2) + 1) + 1            [substitute T(n-1) = 2*T(n-2) + 1]
     = 2^2 * T(n-2) + 2 + 1
     .... k times
     = 2^k * T(n-k) + 2^(k-1) + 2^(k-2) + .... + 2^2 + 2 + 1
     = 2^n * T(0) + 1 + 2 + .... + 2^(n-1)   {Assume n-k=0 => n=k}
     = 2^n * 1 + 2^n - 1                     [since 1 + 2 + 4 + .... + 2^(n-1) = 2^n - 1]
     = 2^n + 2^n - 1
     = 2^(n+1) - 1
     ~ O(2^n)

5. T(n) =1 if n=1
= n * T(n-1) if n>1

Sol:
T(n) = n * T(n-1)
     = n * (n-1) * T(n-2)
     .... n-1 steps
     = n * (n-1) * (n-2) * .... * (n-(n-2)) * T(n-(n-1))
     = n * (n-1) * (n-2) * .... * 2 * T(1)
     = n * (n-1) * (n-2) * .... * 2 * 1
     = n!
     ~ O(n^n)    [since n! <= n^n]

Using Master Method:

Let's see how the master theorem can also be applied to dividing functions.

Master theorem for Dividing functions:

Dividing functions having form like:


T(n) = T(n / 2) + c (constant), T(n) = 3T(n / 2) + log n etc.. .

The generalized form is written as:


T(n) = aT(n / b) + f(n).
Here a and b are constants such that a >= 1, b >1

Sol: T(n) = n^(log_b a) * u(n)

u(n) depends on h(n), where
h(n) = f(n) / n^(log_b a)

Relation between h(n) and u(n):

1. If h(n) = n^r with r > 0, then u(n) = O(n^r)
2. If h(n) = n^r with r < 0, then u(n) = O(1)
3. If h(n) = (log2 n)^i with i >= 0, then u(n) = (log2 n)^(i+1) / (i+1)

1. T(n) = 8 * T(n/2) + n^2

Sol:
a = 8, b = 2, f(n) = n^2
T(n) = n^(log2 8) * u(n) = n^3 * u(n)

h(n) = n^2 / n^3 = 1/n = n^(-1), so u(n) = O(1)    [case 2]

Then T(n) = n^3 * O(1)
          ~ O(n^3)

2. T(n) = T(n/2) + c
Sol:
a = 1, b = 2, f(n) = c
T(n) = n^(log2 1) * u(n) = n^0 * u(n)

h(n) = c / n^(log2 1) = c / n^0 = c / 1 = c

Consider case 3:

h(n) = c * (log2 n)^0    [multiply by the term (log2 n)^0], so u(n) = (log2 n)^(0+1) / (0+1) ~ O(log2 n)

Hence T(n) = n^0 * O(log2 n) ~ O(log2 n).

Sorting:
- Merge Sort
- Quick Sort
- Selection Sort
- Radix Sort

Another way of classifying sorting algorithms is:


• Internal Sort
• External Sort

Internal Sort
Sort algorithms that use main memory exclusively during the sort are called internal sorting
algorithms. This kind of algorithm assumes high-speed random access to all memory.

External Sort
Sorting algorithms that use external memory, such as tape or disk, during the sort come
under this category.

The phrase "external sorting" refers to a group of sorting algorithms that can process enormous volumes of data. External sorting is necessary when the data being sorted does not fit in the computer's primary memory (usually RAM) and must instead live in slower external memory, such as a disk drive.

A hybrid sort-merge approach is frequently used for external sorting. Data chunks small
enough to fit in the main memory are read, sorted, and written off to a temporary file during
the sorting process. The sorted sub-files are then merged into a single, larger file during the merging step.

Example:
The external merge sort method first divides the file into runs of a manageable size, so that each run fits in main memory. Each run is then sorted in main memory using merge sort. Once every run is sorted, the resulting runs are combined into gradually larger runs until the whole file is sorted.
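
A compact sketch of this idea in Python is shown below (hypothetical file paths and chunk size; not part of the original notes). Each run is sorted with an internal sort and the runs are then merged with heapq.merge:

import heapq
import tempfile
from itertools import islice

def external_sort(input_path, output_path, chunk_size=1000):
    # Phase 1: read chunks that fit in main memory, sort each one (an
    # internal sort), and write it out as a temporary sorted run.
    runs = []
    with open(input_path) as f:
        while True:
            chunk = [int(line) for line in islice(f, chunk_size)]
            if not chunk:
                break
            chunk.sort()
            run = tempfile.TemporaryFile(mode='w+')
            run.writelines(f"{x}\n" for x in chunk)
            run.seek(0)
            runs.append(run)

    # Phase 2: k-way merge of the sorted runs into one output file.
    with open(output_path, 'w') as out:
        for value in heapq.merge(*(map(int, run) for run in runs)):
            out.write(f"{value}\n")
    for run in runs:
        run.close()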

When to use External Sorting?


1. When the unsorted data is too large to be sorted in the computer's internal memory, we use external sorting.
2. External sorting uses secondary storage devices, such as disks or tape drives.

Examples of external sort:


1. Merge sort
2. Tape sort
3. Polyphase sort
4. External radix
5. External merge
Merge sort

The process of combining two or more sorted lists or files into a third sorted list or file is
called merging.

Merge sort is a divide and conquer algorithm, it divides input array in two halves, calls itself
for the two halves and then merges the two sorted halves.

The concept of divide and conquer involves three steps:


1. Divide the problem into multiple small problems.
2. Conquer the subproblems by solving them. The idea is to break down the problem
into atomic subproblems, where they are actually solved.
3. Combine the solutions of the subproblems to find the solution of the actual problem.

The merge() function is used for merging two halves.

The merge(arr, low, mid, high) is key process that assumes that arr[low..mid] and
arr[mid+1..high] are sorted and merges the two sorted sub-arrays into one.

As an illustration, consider merge sort on the example array {38, 27, 43, 3, 9, 82, 10}.

The array is recursively divided in two halves till the size becomes 1. Once the size becomes 1, the merge process comes into action and starts merging arrays back till the complete array is merged.



The working procedure for merge sort is as follows:

1. Let us consider an array of n elements (i.e., a[n]) to be sorted.


2. Split the elements into two halves, List-1: a[0], a[1], ..., a[n/2 - 1] and List-2: a[n/2], a[n/2 + 1], ..., a[n - 1].
3. List-1 is recursively divided into two halves, and finally each sublist is individually sorted and the sorted data is stored back into List-1.
4. Similarly, List-2 is recursively divided into two halves, and finally each sublist is individually sorted and the sorted data is stored back into List-2.
5. Finally, the two individually sorted lists are merged and the resulting sorted sequence is stored in the output list.



# Python program for implementation of MergeSort

# Merges two subarrays of arr[].
# First subarray is arr[l..m]
# Second subarray is arr[m+1..r]
def merge(arr, l, m, r):
    n1 = m - l + 1
    n2 = r - m

    # create temp arrays
    L = [0] * (n1)
    R = [0] * (n2)

    # Copy data to temp arrays L[] and R[]
    for i in range(0, n1):
        L[i] = arr[l + i]

    for j in range(0, n2):
        R[j] = arr[m + 1 + j]

    # Merge the temp arrays back into arr[l..r]
    i = 0    # Initial index of first subarray
    j = 0    # Initial index of second subarray
    k = l    # Initial index of merged subarray

    while i < n1 and j < n2:
        if L[i] <= R[j]:
            arr[k] = L[i]
            i += 1
        else:
            arr[k] = R[j]
            j += 1
        k += 1

    # Copy the remaining elements of L[], if there are any
    while i < n1:
        arr[k] = L[i]
        i += 1
        k += 1

    # Copy the remaining elements of R[], if there are any
    while j < n2:
        arr[k] = R[j]
        j += 1
        k += 1

# l is for left index and r is right index of the
# sub-array of arr to be sorted
def mergeSort(arr, l, r):
    if l < r:
        # Same as (l+r)//2, but avoids overflow for large l and r
        m = l + (r - l) // 2

        # Sort first and second halves
        mergeSort(arr, l, m)
        mergeSort(arr, m + 1, r)
        merge(arr, l, m, r)

# Driver code to test above
arr = [12, 11, 13, 5, 6, 7]
n = len(arr)
print("Given array is")
for i in range(n):
    print("%d" % arr[i], end=" ")

mergeSort(arr, 0, n - 1)
print("\n\nSorted array is")
for i in range(n):
    print("%d" % arr[i], end=" ")



Merge Sort advantages and disadvantages:

Advantages:
1. Suitable for large lists
2. The merge operation can be implemented efficiently with linked lists
3. Works well for external sorting
4. Stable: keeps the relative order of equal elements

Disadvantages:

1. Takes extra space (not an in-place sort)
2. Not the best choice for small inputs
3. Recursive

Recursive Relation for Merge Sort:

The recursive relation for mergesort( in all the cases) can be expressed as follows:

T(n) = 2 * T(n/2) + O(n)

In this recursive relation:

T(n) represents the time it takes to sort an array of size 'n' using mergesort.

The term "2 * T(n/2)" represents the time required to recursively sort the two halves of the
array (each of size n/2) using mergesort. Since mergesort operates on two halves
independently, you can consider the time taken to sort each half separately.

The term "O(n)" represents the time required to merge the two sorted halves of the array
back together. The merging step takes linear time because it involves comparing and
combining elements from the two halves while preserving their order.
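
Expanding this relation with the same substitution method used earlier (writing the O(n) term as c*n for some constant c) shows where the familiar O(n log n) bound comes from:

T(n) = 2T(n/2) + cn
     = 4T(n/4) + 2cn
     = 8T(n/8) + 3cn
     .... k times
     = 2^k * T(n/2^k) + k*cn

Assume n/2^k = 1 => n = 2^k => k = log2(n). Then
T(n) = n*T(1) + cn*log2(n) ~ O(n log n)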

QuickSort

Like merge sort, quick sort is a divide and conquer algorithm, where it picks an element as
pivot and partitions the given array around the picked pivot element.

There are many different versions of quick sort that pick pivot in different ways:
1. Always pick first element as pivot
2. Always pick last element as pivot
3. Pick a random element as pivot.
4. Pick median as pivot



Quick sort first divides a large array into two smaller sub-arrays as low elements and the
high elements. Quick sort can then recursively sort the sub-arrays.

The steps are:


● Pick the first element, called a pivot, from the array.
● Partitioning: reorder the array so that all elements with values less than the pivot come
before the pivot, while all elements with values greater than the pivot come after it
(equal values can go either way). After this partitioning, the pivot is in its final position.
This is called the partition operation.
● Recursively apply the above steps to the sub-array of elements with smaller values and
separately to the sub-array of elements with greater values.

The base case of the recursion is arrays of size zero or one, which are in order by definition,
so they never need to be sorted.

The quickSort() function is used to partition the list into two halves and recursively calls
the two halves.

The partition() is a key process, where given an array and an element x of array as pivot,
put x at its correct position in sorted array and put all smaller elements (smaller than x)
before x, and put all greater elements (greater than x) after x.



# Python program for implementation of Quicksort

# Function to find the partition position
def partition(array, low, high):

    # choose the rightmost element as pivot
    pivot = array[high]

    # pointer for greater element
    i = low - 1

    # traverse through all elements
    # compare each element with pivot
    for j in range(low, high):
        if array[j] <= pivot:
            # If element smaller than pivot is found
            # swap it with the greater element pointed by i
            i = i + 1

            # Swapping element at i with element at j
            (array[i], array[j]) = (array[j], array[i])

    # Swap the pivot element with the greater element specified by i
    (array[i + 1], array[high]) = (array[high], array[i + 1])

    # Return the position from where partition is done
    return i + 1

# function to perform quicksort
def quickSort(array, low, high):
    if low < high:
        # Find pivot element such that
        # elements smaller than pivot are on the left
        # elements greater than pivot are on the right
        pi = partition(array, low, high)

        # Recursive call on the left of pivot
        quickSort(array, low, pi - 1)

        # Recursive call on the right of pivot
        quickSort(array, pi + 1, high)

data = [1, 7, 4, 1, 10, 9, -2]

print("Unsorted Array")
print(data)

size = len(data)

quickSort(data, 0, size - 1)

print('Sorted Array in Ascending Order:')
print(data)

Example:

Input:

Index: 0 1 2 3 4
Value: 5 4 2 1 3

quicksort(A,0,4) calls partition(A,0,4). The pivot P is the first element (5); the pointer L scans forward from index 1 and the pointer H scans backward from index 4.

[5 4 2 1 3]  L at index 1 (value 4): 4 < 5, so increment L.
[5 4 2 1 3]  L at index 2 (value 2): 2 < 5, so increment L.
[5 4 2 1 3]  L at index 3 (value 1): 1 < 5, so increment L.
[5 4 2 1 3]  L at index 4 (value 3): 3 < 5 and this is the end of the list, so stop traversing forward and start traversing backwards with H.
[5 4 2 1 3]  H at index 4 (value 3): 3 is not greater than 5, so stop traversing backwards and swap the pivot with the element pointed to by H.

[3 4 2 1 5]  After this partition the two sublists are [3,4,2,1] and [].

Again quicksort(A,0,3) is called, which calls partition(A,0,3); the pivot is 3.

[3 4 2 1 5]  L at index 1 (value 4): 4 is not less than 3, so stop traversing forward and start traversing backwards.
[3 4 2 1 5]  H at index 3 (value 1): 1 is not greater than 3, so stop traversing backwards and swap 1 and 4.

[3 1 2 4 5]  Again start traversing in the forward direction. L at index 1 (value 1): 1 < 3, so increment L.
[3 1 2 4 5]  L at index 2 (value 2): 2 < 3, so increment L.
[3 1 2 4 5]  L at index 3 (value 4): 4 ⊀ 3, so stop traversing forward and traverse backwards.
[3 1 2 4 5]  H at index 3 (value 4): 4 > 3, so decrement H. Since all the elements have now been scanned, this completes the partition process; swap the pivot with the element pointed to by H (index 2).

[2 1 3 4 5]  partition(A,0,3) creates the sublists [2,1] and [4].

Again quicksort on the sublist [2,1] is called, which calls partition(A,0,1); the pivot is 2.

[2 1 3 4 5]  L at index 1 (value 1): 1 < 2 and this is the end of the sublist, so stop traversing forward and traverse backwards.
[2 1 3 4 5]  H at index 1 (value 1): 1 ⊁ 2, so stop traversing backwards and swap the pivot with the element pointed to by H.

[1 2 3 4 5]  partition(A,0,1) divides the list into the sublists [1] and [].

quicksort(A,0,0) is called; since it has only one element, it terminates.
Then quicksort(A,2,1) is called; since the range is empty, it terminates.
Then quicksort(A,3,3) is called; since it has only one element, it terminates.
Then quicksort(A,5,4) is called; since the range is empty, it terminates, and the whole sort finishes.

Recursive tree: (diagram of the recursive quicksort calls not reproduced here)

Quicksort Advantages:
1. It is an in-place algorithm, since it only requires a modest auxiliary stack.
2. Sorting n objects takes only O(n log n) time on average.

Quicksort Disadvantages:
1. It is a recursive process.
2. In the worst-case scenario, it takes quadratic (i.e., O(n²)) time.

Recursive Relation of Quicksort:


Best Case:
In the best-case scenario, the recursive relation for quicksort depends on how evenly the
input array is divided at each step. Ideally, in the best case, the pivot chosen in each step
should divide the array into two nearly equal subarrays. Therefore, in the best-case
scenario, the recursive relation for quicksort can be expressed as follows:

T(n) = 2 * T(n/2) + O(n)

In this recursive relation:

1. T(n) represents the time it takes to sort an array of size 'n' using quicksort in the best
case.
2. The term "2 * T(n/2)" represents the time required to recursively sort the two nearly
equal subarrays, each of size 'n/2'.
3. The term "O(n)" represents the time required for the partitioning step, where the pivot
ideally divides the array into two nearly equal subarrays.



In the worst-case scenario, the recursive relation for quicksort can be expressed as follows:

T(n) = T(n-1) + O(n)

In this recursive relation:

1. T(n) represents the time it takes to sort an array of size 'n' using quicksort in the worst
case.
2. The term "T(n-1)" represents the time required to sort the subarray of size 'n-1' that remains after the pivot element is placed in its final position.
3. The term "O(n)" represents the time required for the partitioning step.

The recursive relation for quicksort in the average case is typically more complex because it
depends on the probability distribution of pivot choices and how well-balanced the
partitions are during the sorting process. However, on average, quicksort can be analyzed
using the following probabilistic recursive relation:

T(n) = n + (1/n) * Σ(T(i) + T(n-i-1)), for i = 0 to n-1

In this recursive relation:

T(n) represents the average time it takes to sort an array of size 'n' using quicksort.
The term "n" represents the time required for the partitioning step, where the pivot divides
the array into two subarrays.
Σ(T(i) + T(n-i-1)) is the sum of average times for sorting the left and right subarrays after
partitioning, considering all possible choices of the pivot.

Randomized Algorithm:

Randomized Quicksort is a variant of the Quicksort algorithm that uses randomization to


select the pivot element. The random selection of the pivot helps mitigate the risk of
worst-case behavior and ensures a more balanced partitioning of the array, leading to good
average-case performance. Here's the algorithm:
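
The algorithm itself is not reproduced in this copy of the notes; a minimal sketch, reusing the partition() function defined above but first moving a randomly chosen pivot into the last position, could look like this:

import random

def randomized_partition(array, low, high):
    # Choose a random index in [low, high] as the pivot and move it to the
    # end, so the ordinary partition() above can be reused unchanged.
    pivot_index = random.randint(low, high)
    array[pivot_index], array[high] = array[high], array[pivot_index]
    return partition(array, low, high)

def randomized_quickSort(array, low, high):
    if low < high:
        pi = randomized_partition(array, low, high)
        randomized_quickSort(array, low, pi - 1)
        randomized_quickSort(array, pi + 1, high)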



Radix Sort

In radix sort, the given list of numbers is sorted based on the digits of the individual numbers.

Sorting is performed from least significant digit of the numbers to the most significant digit
of the numbers.

Radix sort algorithm requires number of passes which are equal to the number of digits
present in the largest number among the list of numbers.

Consider a list of numbers: 99,125,186, 34,67,12,43,78. Here, the biggest number is 186
and the number of digits in 186 is 3, so the number of passes required to sort all the
numbers is 3.

# Python program for implementation of Radix Sort

# A function to do counting sort of arr[] according to the digit represented by exp.
def countingSort(arr, exp1):

    n = len(arr)

    # The output array elements that will have sorted arr
    output = [0] * (n)

    # initialize count array as 0
    count = [0] * (10)

    # Store count of occurrences in count[]
    for i in range(0, n):
        index = arr[i] // exp1
        count[index % 10] += 1

    # Change count[i] so that count[i] now contains the actual
    # position of this digit in the output array
    for i in range(1, 10):
        count[i] += count[i - 1]

    # Build the output array
    i = n - 1
    while i >= 0:
        index = arr[i] // exp1
        output[count[index % 10] - 1] = arr[i]
        count[index % 10] -= 1
        i -= 1

    # Copying the output array to arr[], so that arr now contains sorted numbers
    for i in range(0, len(arr)):
        arr[i] = output[i]

# Method to do Radix Sort
def radixSort(arr):

    # Find the maximum number to know the number of digits
    max1 = max(arr)

    # Do counting sort for every digit. Note that instead of passing the digit
    # number, exp is passed. exp is 10^i where i is the current digit number
    exp = 1
    while max1 / exp >= 1:
        countingSort(arr, exp)
        exp *= 10

# Driver code
arr = [170, 45, 75, 90, 802, 24, 2, 66]

# Function Call
radixSort(arr)
print(arr)

Complexity Analysis of Radix Sort:


Time Complexity:
Radix sort is a non-comparative integer sorting algorithm that sorts data with integer
keys by grouping the keys by the individual digits which share the same significant
position and value. It has a time complexity of O(d * (n + b)), where d is the number of
digits, n is the number of elements, and b is the base of the number system being used.

In practical implementations, radix sort is often faster than other comparison-based sorting algorithms, such as quicksort or merge sort, for large datasets, especially when the keys have many digits. However, its time complexity grows linearly with the number of digits, and so it is not as efficient for small datasets.

Auxiliary Space:
Radix sort also has a space complexity of O(n + b), where n is the number of elements
and b is the base of the number system. This space complexity comes from the need to
create buckets for each digit value and to copy the elements back to the original array
after each digit has been sorted.



Example:

Input:

001 453 246 123 089

The maximum number is 453 and the number of digits in the max number is 3. So we
need 3 passes to sort the elements.

Pass 1:

First create a count array which contains the frequency of unit place value
Scan the input from left to right
The first element is 001 and the unit place value is 1. Store 1 in the index ‘1’ in the
count array
Count Array:
0 1 2 3 4 5 6 7 8 9
0 1 0 0 0 0 0 0 0 0

The second element is 453 and the units place value is 3. Store 1 at index '3' in the count array.

0 1 2 3 4 5 6 7 8 9
0 1 0 1 0 0 0 0 0 0

The third element is 246 and the units place value is 6. Store 1 at index '6' in the count array.

0 1 2 3 4 5 6 7 8 9
0 1 0 1 0 0 1 0 0 0

The fourth element is 123 and the units place value is 3. Add 1 to the value at index '3' in the count array.

0 1 2 3 4 5 6 7 8 9
0 1 0 2 0 0 1 0 0 0

The fifth element is 089 and the units place value is 9. Store 1 at index '9' in the count array.

0 1 2 3 4 5 6 7 8 9
0 1 0 2 0 0 1 0 0 1

Compute Cumulative Count Array:

0 1 2 3 4 5 6 7 8 9
0 1 1 3 3 3 4 4 4 5

Now scan the input from right to left to sort the elements on their units place values.
The first element (from the right) is 089 and its units place value is 9. Go to index '9' in the cumulative count array, decrement it by 1, and use the decremented value (4) as an index into the output array to store 089.

Output Array:

0 1 2 3 4
-   -   -   -   089

Updated Compute Cumulative Count Array:

0 1 2 3 4 5 6 7 8 9
0 1 1 3 3 3 4 4 4 4

The next element is 123 and its units place value is 3. Go to index '3' in the cumulative count array, decrement it by 1, and use the decremented value (2) as an index into the output array to store 123.

output Array:

0 1 2 3 4
-   -   123 -   089

Updated Compute Cumulative Count Array:

0 1 2 3 4 5 6 7 8 9
0 1 1 2 3 3 4 4 4 4



The next element is 246 and its units place value is 6. Go to index '6' in the cumulative count array, decrement it by 1, and use the decremented value (3) as an index into the output array to store 246.

Output Array:

0 1 2 3 4
-   -   123 246 089

Updated Compute Cumulative Count Array:

0 1 2 3 4 5 6 7 8 9
0 1 1 2 3 3 3 4 4 4

The next element is 453 and its units place value is 3. Go to index '3' in the cumulative count array, decrement it by 1, and use the decremented value (1) as an index into the output array to store 453.

Output Array:

0 1 2 3 4
-   453 123 246 089

Updated Compute Cumulative Count Array:

0 1 2 3 4 5 6 7 8 9
0 1 1 1 3 3 3 4 4 4

The next element is 001 and its units place value is 1. Go to index '1' in the cumulative count array, decrement it by 1, and use the decremented value (0) as an index into the output array to store 001.

Output Array:

0 1 2 3 4
001 453 123 246 089



Updated Compute Cumulative Count Array:

0 1 2 3 4 5 6 7 8 9
0 0 1 1 3 3 3 4 4 4

After the first pass, the elements are sorted on their units place values.

Pass II:
Input Array:

0 1 2 3 4
001 453 123 246 089

Scan the input from left to right


The first element is 001 and the tens place value is 0. Store 1 in the index ‘0’ in the
count array
Count Array:
0 1 2 3 4 5 6 7 8 9
1 0 0 0 0 0 0 0 0 0

The second element is 453 and the tens place value is 5. Store 1 at index '5' in the count array.

0 1 2 3 4 5 6 7 8 9
1 0 0 0 0 1 0 0 0 0

The third element is 123 and the tens place value is 2. Store 1 at index '2' in the count array.

0 1 2 3 4 5 6 7 8 9
1 0 1 0 0 1 0 0 0 0

The fourth element is 246 and the tens place value is 4. Store 1 at index '4' in the count array.

0 1 2 3 4 5 6 7 8 9
1 0 1 0 1 1 0 0 0 0

The fifth element is 089 and the tens place value is 8. Store 1 at index '8' in the count array.

0 1 2 3 4 5 6 7 8 9
1 0 1 0 1 1 0 0 1 0

Compute Cumulative Count Array:

0 1 2 3 4 5 6 7 8 9
1 1 2 2 3 4 4 4 5 5

Now scan the input from right to left to sort the elements on their tens place values.
The first element (from the right) is 089 and its tens place value is 8. Go to index '8' in the cumulative count array, decrement it by 1, and use the decremented value (4) as an index into the output array to store 089.

Output Array:

0 1 2 3 4
-   -   -   -   089

Updated Compute Cumulative Count Array:

0 1 2 3 4 5 6 7 8 9
1 1 2 2 3 4 4 4 4 5

The next element is 246 and its tens place value is 4. Go to index '4' in the cumulative count array, decrement it by 1, and use the decremented value (2) as an index into the output array to store 246.

Output Array:

0 1 2 3 4
-   -   246 -   089

Updated Compute Cumulative Count Array:

0 1 2 3 4 5 6 7 8 9
1 1 2 2 2 4 4 4 4 5



The next element is 123 and its tens place value is 2. Go to index '2' in the cumulative count array, decrement it by 1, and use the decremented value (1) as an index into the output array to store 123.
Output Array:

0 1 2 3 4
-   123 246 -   089

Updated Compute Cumulative Count Array:

0 1 2 3 4 5 6 7 8 9
1 1 1 2 2 4 4 4 4 5

The next element is 453 and its tens place value is 5. Go to index '5' in the cumulative count array, decrement it by 1, and use the decremented value (3) as an index into the output array to store 453.

Output Array:
0 1 2 3 4
-   123 246 453 089

Updated Compute Cumulative Count Array:

0 1 2 3 4 5 6 7 8 9
1 1 1 2 2 3 4 4 4 5

The next element is 001 and its tens place value is 0. Go to index '0' in the cumulative count array, decrement it by 1, and use the decremented value (0) as an index into the output array to store 001.

Output Array:

0 1 2 3 4
001 123 246 453 089

Updated Compute Cumulative Count Array:

0 1 2 3 4 5 6 7 8 9
0 1 1 2 2 3 4 4 4 5



After the second pass, the elements are sorted on their tens place values.

Pass III:
Input Array:
0 1 2 3 4
001 123 246 453 089
Count array (hundreds place):

0 1 2 3 4 5 6 7 8 9
2 1 1 0 1 0 0 0 0 0

Cumulative Count Array:

0 1 2 3 4 5 6 7 8 9
2 3 4 4 5 5 5 5 5 5

Sorted Array:
0 1 2 3 4
001 089 123 246 453

Selection Sort

The selection sort process can be done in two ways: the largest-element method and the smallest-element method.

The working procedure for selection sort largest element method is as follows:
1. Let us consider an array of n elements (i.e., a[n]) to be sorted.
2. In the first step, the largest element in the list is searched. Once the largest element is
found, it is exchanged with the element which is placed at the last position. This
completes the first pass.
3. In the next step, the second largest element in the list is searched for and interchanged with the element in the second-to-last position. This is done in the second pass.
4. This process is repeated for n - 1 passes to sort all the elements.

Let us consider an example of array numbers "80 10 50 20 40", and sort the array from
lowest number to greatest number using selection sort by the largest element

Pass - 1 :



( 80 10 50 20 40 ) -> ( 40 10 50 20 80 ) // First finds the largest element and it is exchanged
with the last position element.

After completion of Pass - 1, the largest element is moved to the end of the array.

Now, Pass - 2 can find the next largest element with out considering the last position
element.

Pass - 2 :
( 40 10 50 20 80 ) -> ( 40 10 20 50 80 ) // Largest in 40 10 50 20 is 50 and it is replaced with
next last position of the array.

After completion of Pass - 2 the second largest element is moved to the second last position
of the array.

Now, Pass - 3 can find the next largest element with out considering the last two position
elements because they are already sorted.

Pass - 3 :
( 40 10 20 50 80 ) -> ( 20 10 40 50 80 ) // Largest in 40 10 20 is 40 and it is replaced with
next last position of the array.

After completion of Pass - 3 the third largest element is moved to the third last position of
the array.

Now, Pass - 4 can find the next largest element with out considering the last three position
elements because they are already sorted.

Pass - 4 :
( 20 10 40 50 80 ) -> ( 10 20 40 50 80 ) // Largest in 20 10 is 20 and it is replaced with next
last position of the array.

After completion of Pass - 4 all the elements of the array are sorted. So, the result is 10 20
40 50 80.
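
The notes give no code for selection sort; the following short Python sketch of the smallest-element variant (the mirror image of the largest-element walkthrough above) is added here for reference:

def selection_sort(a):
    n = len(a)
    for i in range(n - 1):
        # Find the index of the smallest element in the unsorted part a[i..n-1]
        min_index = i
        for j in range(i + 1, n):
            if a[j] < a[min_index]:
                min_index = j
        # Move it to its correct position at the front of the unsorted part
        a[i], a[min_index] = a[min_index], a[i]

data = [80, 10, 50, 20, 40]
selection_sort(data)
print(data)   # [10, 20, 40, 50, 80]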

Selection sort is a simple comparison-based sorting algorithm that repeatedly selects the
minimum (or maximum) element from an unsorted portion of the array and moves it to its
correct position in the sorted portion. The recursive relation for selection sort is not
commonly used because selection sort is typically implemented as an iterative algorithm.
However, you can express its behavior in terms of a recursive relation:

T(n) = T(n-1) + O(n)



In this recursive relation:

T(n) represents the time it takes to sort an array of size 'n' using selection sort in the worst
case.
The term "T(n-1)" represents the time required to sort an array of size 'n-1', as you first find
the minimum element and place it in the correct position.
The term "O(n)" represents the time required to find the minimum element among the
remaining unsorted elements.

This recursive relation indicates that selection sort has a worst-case time complexity of
O(n^2) because, in each iteration, it needs to compare and potentially swap elements,
resulting in a nested loop structure.

However, it's important to note that selection sort is not typically implemented using
recursion due to its inefficiency compared to more efficient sorting algorithms like quicksort,
mergesort, or even insertion sort. In practice, selection sort is often implemented using
iterative loops for sorting small datasets where simplicity may be more important than
efficiency.

Comparative Analysis of algorithms


In-place sorting algorithms: A sorting algorithm is in-place if it does not use extra space proportional to the input to manipulate it, though it may require a small amount of additional space for its operation.
In-place sorting algorithms: Bubble, Insertion, Quick, Selection, Heap
Sorting algorithms needing extra space: Merge, Radix

Stable sorting algorithms: A sorting algorithm is stable if it does not change the relative order of elements with the same value.
Stable sorting algorithms: Bubble, Insertion, Merge, Radix
Non-stable sorting algorithms: Selection, Quick, Heap



Time Complexity:

Quicksort: Quicksort has an average-case time complexity of O(n log n), where 'n' is
the number of elements to be sorted. However, in the worst case (when the pivot
choice is poor), it can degrade to O(n^2). The worst-case behavior can be mitigated
with good pivot selection strategies.

Mergesort: Mergesort always has a time complexity of O(n log n), regardless of the
input data. This makes it a reliable choice when a guaranteed worst-case performance
is needed.

Radix Sort: Radix sort has a time complexity of O(k * n), where 'n' is the number of
elements and 'k' is the number of digits or maximum value of the elements. It is
efficient when 'k' is relatively small compared to 'n'. However, it may not be suitable
for sorting data with a wide range of values or floating-point numbers.

Selection Sort: Selection sort has a time complexity of O(n^2) in all cases. It is not
efficient for large datasets

Space Complexity:

Quicksort: Quicksort typically has a space complexity of O(log n) in the best case and O(n) in the worst case, due to the recursion stack.

Mergesort: Mergesort has a space complexity of O(n) because it requires additional memory for creating temporary arrays during the merging phase.
Radix Sort: Radix sort has a space complexity of O(n + k) due to the need for
additional data structures, where 'k' is the range of values.

Selection Sort: Selection sort has a space complexity of O(1) because it sorts the array
in-place without requiring additional memory.

Suitability for Different Data:

Quicksort: Quicksort is a general-purpose sorting algorithm that works well on most types of data, especially when average-case performance is acceptable. It may not be the best choice for nearly sorted or already sorted data.

Mergesort: Mergesort is suitable for all types of data and is especially useful when
stability and guaranteed worst-case performance are important. It does not have the
best constant factors, so it may be less efficient for small datasets.

Radix Sort: Radix sort is ideal for sorting integers or fixed-length strings where 'k' is
small compared to 'n'. It's not suitable for sorting data with a wide range of values or
floating-point numbers.

Selection Sort: Selection sort is not recommended for sorting large datasets, and it is
not particularly suited for any specific type of data. It is mainly used for small datasets
where simplicity is preferred.

*********************** ************ The End **************************************

Prepared by Dr. Srilatha P
