M Tech ADS Question Paper With Answers

ADS MTech-1 SEM 2024 QUESTIONS:

UNIT-I
1a) Summarize on different types of data structures 7M
1b) What is the importance of Double Hashing Technique? 8M
OR
2a) Explain the different operations of Stack and Queue 7M
2b) Following elements are inserted into an empty hash table with hash function f(x) = x % 17
and quadratic probing. Explain. 58, 48, 79, 46, 54, 32, 24, 19, 18 8M
UNIT-II
3a) Define Binary Tree. Explain the Infix, Prefix, and Postfix functions of a
binary tree. 7M
3b) Distinguish between Quick Sort and Merge Sort with example 8M
OR
4a) Explain an Algorithm of Depth First Traversal with Example. 7M
4b) What is the need of Randomizing data structure? Give an Example 8M
UNIT-III
5a) Explain Double Hashing with example. 8M
5b) Define Queue ADT. Construct different function of Queue ADT with example 7M
OR
6a) Explain Hash Functions with example 7M
6b) Explain Collision Resolution of Hashing with example 8M
UNIT-IV
7a) Analyse Priority Queue using Heaps with example 7M
7b) Explain about Brute-Force Pattern Matching 8M
OR
8a) Write about the Boyer-Moore Algorithm and explain in detail. 8M
8b) Write an algorithm for inorder, preorder and postorder traversals of a binary search
tree 7M
UNIT-V
9a) Define the Red Black tree algorithm with example 8M
9b) Describe Splay Trees algorithm with example 7M
OR
10a) Define B-tree and Height of B-Tree. Explain insertion algorithm with
example 7M
10b) Explain K-D trees in detail with example 8M
I M.TECH. I Semester Regular Examinations (MR23), March 2024
ADVANCED DATA STRUCTURES AND ALGORITHMS
(COMPUTER SCIENCE AND ENGINEERING)

UNIT-I
1a) Summarize on different types of data structures 7M

Answer:
Data structure
A data structure is a specialized format for organizing and storing data. General data structure
types include the array, the file, the record, the table, the tree, and so on. Any data structure is
designed to organize data to suit a specific purpose so that it can be accessed and worked with in
appropriate ways.
Abstract Data Type
In computer science, an abstract data type (ADT) is a mathematical model for data types where
a data type is defined by its behavior (semantics) from the point of view of a user of the data,
specifically in terms of possible values, possible operations on data of this type, and the behavior of
these operations. When a class is used as a type, it is an abstract type that refers to a hidden
representation. In this model an ADT is typically implemented as a class, and each instance of the
ADT is usually an object of that class. In an ADT, all the implementation details are hidden.

Two Types of Data Structures

1. Linear data structures: data structures in which data is arranged in a list or in a sequence.
2. Non-linear data structures: data structures in which data may be arranged in a hierarchical manner.

Characteristics of Data Structures
A data structure is a systematic way to organise data. The characteristics of data structures are:
Linear or Non-Linear
This characteristic describes how the data is arranged: linear structures (such as arrays) store data in sequential order, while non-linear structures (such as graphs and trees) do not.
Static and Dynamic
Static data structures have fixed formats, sizes and memory locations, decided at compile time; dynamic data structures can grow and shrink at run time.
Linear Data Structures
Data elements in a linear data structure are linked to one another in a sequential arrangement,
with each element linked to the elements in front of and behind it. In this manner, a single run
can traverse the structure. Linear data structures consist of four types. They are:
 Stack
 Array
 Queue
 Linked list
Stack
The stack is a linear data structure that stores elements in 'last-in/first-out' (LIFO) order, also described as 'first-in/last-out' (FILO). In a stack, elements are added and removed only at the same end, called the top. In Python, a stack can be implemented in the following ways:
1. queue.LifoQueue
2. list
3. collections.deque

A Stack is a data type that only allows users to access the newest member of the list. It is analogous
to a stack of paper, where you add and remove paper from the top, but never look at the papers below
it.
A typical Stack implementation supports 3 operations: Push(), Pop(), and Top().
1. Push() will add an item to the end of the list. This takes constant time.
2. Pop() will remove the item at the end of the list. This takes constant time.
3. Top() will return the value of the item at the top.
All operations on a stack happen in constant time, because no matter what, the stack is always
working with the top- most value, and the stack always knows exactly where that is. This is the
main reason why Stacks are so amazingly fast.

In Stack, the terms ‘Push’ and ‘Pop’ are used instead of ‘insert’ and ‘delete’.
Array
It is a collection of elements of the same data type stored in contiguous memory locations. Arrays are used in Python as well. Array indices run from 0 to (n-1), where 'n' denotes the size of the array. Arrays are of two types. They are:

1. One-dimensional Array
2. Multi-dimensional Array

Queue
The queue is a linear data structure that follows the FIFO (First In, First Out) order: the elements inserted first are the first to be removed. The basic operations on a queue are:

1. Inserting an element
2. Deleting an element
3. Accessing the front element
A Queue is a data structure where you can only access the oldest item in the list. It is analogous to a line
in the grocery store, where many people may be in the line, but the person in the front gets serviced first.
A typical Queue implementation has 3 operations, which are similar to the functions in Stacks. They
are: enqueue(), dequeue(), and Front().

1. Enqueue() will add an item to the end of the list. This takes constant time.
2. Dequeue() will remove an item from the beginning of the list. This takes constant time.
3. Front() will return the value of front-most item.
Queues, like Stacks, are very fast because all of the operations are simple, and constant-time.
A sample implementation is sketched below. Note that this queue cannot resize when it runs out of room.
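A minimal sketch of such a fixed-capacity queue, written here in Python (class and method names are illustrative, not from the original text):

    class FixedQueue:
        """A fixed-capacity circular queue that deliberately cannot grow."""
        def __init__(self, capacity):
            self.items = [None] * capacity
            self.capacity = capacity
            self.front = 0      # index of the oldest item
            self.count = 0      # number of items currently stored

        def enqueue(self, item):
            if self.count == self.capacity:
                raise OverflowError("queue is full")   # no resizing, by design
            self.items[(self.front + self.count) % self.capacity] = item
            self.count += 1

        def dequeue(self):
            if self.count == 0:
                raise IndexError("queue is empty")
            item = self.items[self.front]
            self.front = (self.front + 1) % self.capacity
            self.count -= 1
            return item

        def front_item(self):
            if self.count == 0:
                raise IndexError("queue is empty")
            return self.items[self.front]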
Linked List
Unlike arrays, linked lists do not store data in contiguous memory locations; each node holds data together with a reference linking it to the next node. The first element of the list is known as the Head of the List. Linked lists simplify memory allocation, since nodes can be placed anywhere in memory. There are three types of Linked Lists. They are:
1. Single Linked List
2. Double Linked List
3. Circular Linked List

Non-Linear Data Structures


Non-linear data structures are those in which the data elements are not arranged sequentially; the elements may be present at different levels, and there can be more than one path from one element to another. Each data element may be connected to one or more other elements. There are two types of non-linear data structures.
They are:
 Tree Data Structure

 Graph Data Structure


Tree Data Structure

Tree data structures are completely different from arrays, stacks, queues and linked lists: they are hierarchic. A tree collects nodes together to represent and simulate a hierarchy rather than a sequence; it does not store the data sequentially, but on multiple levels. The top node of the Tree Data Structure is known as the Root Node, and any type of data can be stored in it. Every node contains data, and the nodes reached through a node's branches are known as its children.

In computer science, a tree is a widely used abstract data type (ADT)—or data structure
implementing this ADT—that simulates a hierarchical tree structure, with a root value and subtrees of
children with a parent node, represented as a set of linked nodes. A tree data structure can be defined
recursively (locally) as a collection of nodes (starting at a root node), where each node is a data
structure consisting of a value, together with a list of references to nodes (the "children"), with the
constraints that no reference is duplicated, and none points to the root.

DEFINITION: A tree is a data structure made up of nodes or vertices and edges without having any
cycle. The tree with no nodes is called the null or empty tree. A tree that is not empty consists of a
root node and potentially many levels of additional nodes that form a hierarchy.
The different parts of the Tree Data Structure are:
1. Root Node
2. Child Node
3. Edge
4. Siblings
5. Leaf Node
6. Internal Nodes
7. Height of the tree
8. Degree of the Node
Graph Data Structure

In a Graph Data Structure, one node is simply connected to another node through an edge of the
graph. The graph is a non-linear data structure: its elements are not sequentially arranged. A graph
consists of edges and vertices (nodes), represented by E and V respectively. Graph Data Structures
do not have root nodes, and there is no standard order of arranging the data. Every tree is also a
graph with n-1 edges, where 'n' represents the total number of vertices in the graph. There are
various categories of graphs, such as undirected, unweighted, directed and weighted.

The different parts of the graph are as follows.


1. Vertex
2. Edges
3. Directed Edge
4. Undirected Edge
5. Weighted Edge
6. Degree
7. Indegree
8. Outdegree

Advantages of data structure:


1. Improved data organization and storage efficiency.
2. Faster data retrieval and manipulation.
3. Facilitates the design of algorithms for solving complex problems.
4. Eases the task of updating and maintaining the data.
5. Provides a better understanding of the relationships between data elements.

Disadvantages of Data Structures:


1. Increased computational and memory overhead.
2. Difficulty in designing and implementing complex data structures.
3. Limited scalability and flexibility.
4. Complexity in debugging and testing.
5. Difficulty in modifying existing data structures.

1b) What is the importance of Double Hashing Technique? 8M

Double Hashing

While working with data structures, we sometimes need to store two objects that have the same hash value. Storing two objects at the same hash location directly is not possible, so data structures provide collision resolution techniques. In this section, we focus on double hashing: its advantages, an example, and its formula.

Data structure provides the following collision resolution techniques:

1. Separate Chaining (also known as Open Hashing)


2. Open Addressing (also known as Closed Hashing)

o Linear Probing
o Quadratic Probing
o Double Hashing

What is double hashing?

It is a collision resolution technique for open addressing hash tables. A collision occurs when two keys are hashed to the same index in a hash table; collisions arise because every slot in the hash table is supposed to store a single element.

Generally, a hashing scheme consists of a hash function that takes a key and produces a hash table index for that key. The double hashing technique uses two hash functions, which is why it is called double hashing: the second hash function provides an offset (step size) whenever the first hash function produces a collision. In other words, when two different objects have the same hash, we call it a collision.

Hash Function
A hash function is a function that accepts a group of characters (key) and maps that key to a value of a
certain length (called a hash value or hash). The process is called hashing. It is done for indexing and
locating items in databases. It provides an easy way to find longer value associated with the shorter hash
value. It is widely used in encryption. It is also known as a hashing algorithm or message digest
function.

Double Hash Function


The first hash function determines the initial location for the key, and the second hash
function determines the size of the jumps in the probe sequence. The following function is
an example of double hashing:

h(key, i) = (firstHashFunction(key) + i * secondHashFunction(key)) % tableSize

Or

h(k, i) = (h1(k) + i · h2(k)) mod tableSize

In the above function, the value of i keeps incrementing until an empty slot is found.

firstHashFunction(key) = key % tableSize

Double hashing works well if the table size is prime.

secondHashFunction(key) = PRIME - (key % PRIME)

where PRIME is a prime smaller than tableSize.

If the slot computed by these functions is already occupied by another object, there is a
collision and probing continues. A good second hash function must have the following properties:

o Quick to evaluate.
o Differ from the primary hash function.
o Never evaluate to 0.

Advantages of Double Hashing


o The technique does not produce clusters.
o It is the best form of probing because it can find the next free slot in the hash table more quickly than linear probing.
o It produces a uniform distribution of records throughout the hash table.
Double Hashing Example
Suppose, we have a hash table of size 11. We want to insert keys 20, 34, 45, 70, 56 in the hash table.
Let's insert the keys into hash table using the following double hash functions:

h1(k) = k mod 11 (first hash function)

h2(k) = 8 - (k mod 8) (second hash function)

First, we will create a hash table of size 11.

Let's insert key one by one.

Step 1: Key 20. h1(20) = 20 mod 11 = 9. No collision occurs; place 20 at index 9.

Step 2: Key 34. h1(34) = 34 mod 11 = 1. No collision occurs; place 34 at index 1.

Step 3: Key 45. h1(45) = 45 mod 11 = 1. A collision occurs because index 1 is already occupied by 34, so we use the second hash function: h2(45) = 8 - (45 mod 8) = 3, and h(45, 1) = (1 + 1·3) mod 11 = 4. Index 4 is empty, so place 45 at index 4. Here we take i = 1 because this is the first collision. Note that h2(k) and 11 are relatively prime, and the value of h2(k) must be less than the table size.

Step 4: Key 70. h1(70) = 70 mod 11 = 4. A collision occurs because index 4 is already occupied by 45, so: h2(70) = 8 - (70 mod 8) = 2, and h(70, 1) = (4 + 1·2) mod 11 = 6. Index 6 is empty, so place 70 at index 6.

Step 5: Key 56. h1(56) = 56 mod 11 = 1. A collision occurs because index 1 is already occupied by 34, so: h2(56) = 8 - (56 mod 8) = 8.
h(56, 1) = (1 + 1·8) mod 11 = 9: again a collision, since index 9 is already occupied by 20.
h(56, 2) = (1 + 2·8) mod 11 = 6: i is incremented because a collision occurred a second time, and again there is a collision, since index 6 is already occupied by 70.
h(56, 3) = (1 + 3·8) mod 11 = 3: index 3 is empty, so place 56 at index 3.

After inserting all the keys, the hash table looks like the following:

Index: 0   1    2   3    4    5   6    7   8   9    10
Key:   -   34   -   56   45   -   70   -   -   20   -

Now that double hashing is clear, we can contrast it with linear and quadratic probing.

In linear probing, if a collision occurs at any index, we look at the immediately next index; if that index is also occupied, we look at the next one, and the process repeats until we find an empty index.

In quadratic probing, if a collision occurs at the home index h, we probe the indices h + 1², h + 2², h + 3², and so on: the i-th probe looks i² slots ahead of the home index. For example, if the home index is occupied, we first look 1 slot ahead; if that is also occupied, we look 4 slots ahead (2²), then 9 slots ahead (3²), and so on.

Double Hashing Algorithm for Inserting an Element


1. Set index = H(K); offset = H2(K)
2. If table location index already contains the key, no need to insert it. Done!
3. Else if table location index is empty, insert key there. Done!
4. Else collision. Set index = (index + offset) mod M.
5. If index == H(K), table is full! (Throw an exception, or enlarge table.) Else go to step 2.
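A minimal Python sketch of this insertion algorithm (function and variable names are illustrative), reproducing the size-11 example above:

    def double_hash_insert(table, key):
        """Sketch of the steps above, with h1(k) = k % m and h2(k) = 8 - (k % 8)."""
        m = len(table)
        index = key % m            # Step 1: index = H(K)
        offset = 8 - (key % 8)     # Step 1: offset = H2(K)
        while True:
            if table[index] == key:        # Step 2: key already present
                return index
            if table[index] is None:       # Step 3: empty slot found
                table[index] = key
                return index
            index = (index + offset) % m   # Step 4: collision, step by the offset
            if index == key % m:           # Step 5: back at the start -- table is full
                raise OverflowError("hash table is full")

    table = [None] * 11
    for k in (20, 34, 45, 70, 56):
        double_hash_insert(table, k)
    print(table)   # [None, 34, None, 56, 45, None, 70, None, None, 20, None]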

OR

2a) Explain the different operations of Stack and Queue 7M

Stack: A stack is a linear data structure in which elements can be inserted and deleted
only from one side of the list, called the top. A stack follows the LIFO (Last In First Out)
principle, i.e., the element inserted at the last is the first element to come out. The insertion
of an element into the stack is called push operation, and the deletion of an element from
the stack is called pop operation. In stack, we always keep track of the last element
present in the list with a pointer called top.
The diagrammatic representation of the stack is given below:
STACK: Stack is a linear data structure which works under the principle of last in, first out. Basic
operations: push, pop, display.

1. PUSH: if (top == MAX), display "Stack overflow"; else read the data, set stack[top] = data, and increment the top value by doing top++.
2. POP: if (top == 0), display "Stack underflow"; else print the element at the top of the stack and decrement the top value by doing top--.
3. DISPLAY: if (top == 0), display "Stack is empty"; else print the elements in the stack from stack[0] to stack[top].
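A minimal Python sketch of these three operations (the MAX limit and the function names are illustrative):

    MAX = 5
    stack = []          # Python list used as a fixed-size stack

    def push(data):
        if len(stack) == MAX:
            print("Stack overflow")
        else:
            stack.append(data)            # stack[top] = data; top++

    def pop():
        if len(stack) == 0:
            print("Stack underflow")
        else:
            print("Popped:", stack.pop()) # print stack[top]; top--

    def display():
        if len(stack) == 0:
            print("Stack is empty")
        else:
            print("Stack contents:", stack)  # stack[0] .. stack[top]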

Queue is a linear data structure in which elements can be inserted only from one side
of the list called rear, and the elements can be deleted only from the other side called
the front. The queue data structure follows the FIFO (First In First Out) principle, i.e.
the element inserted at first in the list, is the first element to be removed from the list.
The insertion of an element in a queue is called an enqueue operation and the deletion
of an element is called a dequeue operation. In queue, we always maintain two
pointers, one pointing to the element which was inserted at the first and still present in
the list with the front pointer and the second pointer pointing to the element inserted at
the last with the rear pointer.
The diagrammatic representation of the queue is given below:
Procedure for Queue operations using array:

In order to create a queue we require a one dimensional array Q(1:n) and two variables front and rear.
The conventions we shall adopt for these two variables are that front is always 1 less than the actual front
of the queue and rear always points to the last element in the queue. Thus, front = rear if and only if there
are no elements in the queue. The initial condition then is front = rear = 0.

The various queue operations to perform creation, deletion and display the elements in a queue are as
follows:

1. insertQ(): inserts an element at the end of queue Q.

2. deleteQ(): deletes the first element of Q.

3. displayQ(): displays the elements in the queue.

Difference between Stack and Queue Data Structures:

1. Definition: A stack is a data structure that stores a collection of elements, with operations to push (add) and pop (remove) elements at the top of the stack. A queue is a data structure that stores a collection of elements, with operations to enqueue (add) elements at the back of the queue and dequeue (remove) elements from the front of the queue.

2. Principle: Stacks are based on the LIFO principle, i.e., the element inserted last is the first element to come out of the list. Queues are based on the FIFO principle, i.e., the element inserted first is the first element to come out of the list.

3. Typical tasks: Stacks are often used for tasks that require backtracking, such as parsing expressions or implementing undo functionality. Queues are often used for tasks that involve processing elements in a specific order, such as handling requests or scheduling tasks.

4. Ends used: Insertion and deletion in stacks take place only at one end of the list, called the top. Insertion and deletion in queues take place at opposite ends of the list: insertion at the rear and deletion at the front.

5. Insert operation: called push in a stack, enqueue in a queue.

6. Delete operation: called pop in a stack, dequeue in a queue.

7. Implementation: both stacks and queues are implemented using an array or linked list data structure.

8. Pointers: In stacks we maintain only one pointer to access the list, called the top, which always points to the last element present in the list. In queues we maintain two pointers: the front pointer always points to the first element inserted that is still present, and the rear pointer always points to the last inserted element.

9. Problem style: Stack is used in solving problems that work on recursion; queue is used in solving problems that need sequential processing.

10. Applications: Stacks are often used for recursive algorithms or for maintaining a history of function calls. Queues are often used in multithreaded applications, where tasks are added to a queue and executed by a pool of worker threads.

11. Variants: The stack has no common variants, while the queue is of three types: 1. circular queue, 2. priority queue, 3. double-ended queue.

12. Visualization: A stack can be considered a vertical collection; a queue a horizontal one.

13. Examples: Stack-based languages include PostScript and Forth. Queue-based algorithms include Breadth-First Search (BFS) and printing a binary tree level by level.
2b) Following elements are inserted into an empty hash table with hash function
f(x) = x % 17 and quadratic probing. Explain. 58, 48, 79, 46, 54, 32, 24, 19, 18 8M

Quadratic probing is a technique used to handle collisions in hash tables. When a collision
occurs (i.e., when two elements hash to the same index), quadratic probing searches for the
next available slot by adding a quadratic offset to the original hash index:

index_i = (f(x) + i²) mod 17

Here, i is the number of times we have probed for an empty slot, so the probe offsets are
0, 1, 4, 9, 16, ….
Let's go through the insertion process for the given elements into an empty hash table with
the hash function f(x) = x % 17 and quadratic probing:
Insert 58:
 f(58) = 58 % 17 = 7 (no collision)
 Place 58 in slot 7.
Insert 48:
 f(48) = 48 % 17 = 14 (no collision)
 Place 48 in slot 14.

Insert 79:
 f(79) = 79 % 17 = 11 (no collision)
 Place 79 in slot 11.

Insert 46:
 f(46) = 46 % 17 = 12 (no collision)
 Place 46 in slot 12.

Insert 54:
 f(54) = 54 % 17 = 3 (no collision)
 Place 54 in slot 3.
Insert 32:
 f(32) = 32 % 17 = 15 (no collision)
 Place 32 in slot 15.

Insert 24:
 f(24) = 24 % 17 = 7 (collision with 58, which occupies slot 7)
 Probe with i = 1: (7 + 1²) mod 17 = 8, which is empty.
 Place 24 in slot 8.

Insert 19:
 f(19) = 19 % 17 = 2 (no collision)
 Place 19 in slot 2.

Insert 18:
 f(18) = 18 % 17 = 1 (no collision)
 Place 18 in slot 1.

The final state of the hash table (indices 0 to 16) after these insertions is:
 [empty, 18, 19, 54, empty, empty, empty, 58, 24, empty, empty, 79, 46, empty, 48, 32, empty]
 Note that only one collision occurred (key 24 at slot 7); the slots the probing sequence did not use remain empty.
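A short Python sketch (names are illustrative) that reproduces this walkthrough:

    def quadratic_insert(table, key):
        """Insert key using f(x) = x % 17 with probe offsets i*i."""
        m = len(table)
        home = key % m
        for i in range(m):
            index = (home + i * i) % m   # probe offsets 0, 1, 4, 9, ...
            if table[index] is None:
                table[index] = key
                return index
        raise OverflowError("no free slot found")

    table = [None] * 17
    for k in (58, 48, 79, 46, 54, 32, 24, 19, 18):
        quadratic_insert(table, k)
    print(table)
    # [None, 18, 19, 54, None, None, None, 58, 24, None, None, 79, 46, None, 48, 32, None]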
Unit II

3a) Define Binary Tree. Explain the Infix, Prefix, and Postfix functions of a binary
tree. 7M

A Binary Tree Data Structure is a hierarchical data structure in which each node has at most two
children, referred to as the left child and the right child. It is commonly used in computer science for
efficient storage and retrieval of data, with various operations such as insertion, deletion, and
traversal.

There are three ways to read a binary tree:

 Prefix: Root node, then left child, then right child


 Infix: Left child, then root node, then right child
 Postfix: Left child, then right child, then root node
Take, for example, this really simple binary tree: a root node + whose left child is 2 and whose right child is 3.
The ways to read this are:


 Prefix: + 2 3
 Infix: 2 + 3
 Postfix: 2 3 +
The infix reading of this tree resembles (and, in fact, is) the standard way we write and interpret
simple mathematical equations: "two plus three equals...". As an aside, every simple
mathematical equation can be expressed as a binary tree.

The postfix reading should be familiar to anyone who owns a Hewlett-Packard graphing
calculator. This form of representing mathematical equations is most commonly referred to
as Reverse Polish notation. Postfix ordering of mathematical expressions is commonly used
for rendering stack-based calculators, usually in assignments for a programming class.

The prefix reading resembles the standard way we use constructs in programming languages.
If we had to represent "2 + 3" using a function, we would write something like plus( 2, 3 ).
This is most clearly shown with LISP's construct ( + 2 3 ). Haskell's backtick operators around
infix operators, e.g. `div`, have a side effect of reminding programmers that most functions are
prefix-oriented.
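A small Python sketch (class and function names are illustrative, not from the original notes) makes the three readings concrete for this tree:

    class Node:
        def __init__(self, value, left=None, right=None):
            self.value, self.left, self.right = value, left, right

    def prefix(node):    # root, then left child, then right child
        return [] if node is None else [node.value] + prefix(node.left) + prefix(node.right)

    def infix(node):     # left child, then root, then right child
        return [] if node is None else infix(node.left) + [node.value] + infix(node.right)

    def postfix(node):   # left child, then right child, then root
        return [] if node is None else postfix(node.left) + postfix(node.right) + [node.value]

    tree = Node('+', Node('2'), Node('3'))
    print(' '.join(prefix(tree)))    # + 2 3
    print(' '.join(infix(tree)))     # 2 + 3
    print(' '.join(postfix(tree)))   # 2 3 +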

3b) Distinguish between Quick Sort and Merge Sort with example 8M

Sorting is the arrangement of a collection of data in a particular order, such as ascending or
descending. Generally, it is used to arrange homogeneous data in a sorted manner. Using
sorting algorithms, we can arrange the data in sequence and then search for an element easily and
quickly. The choice of sorting technique depends on two factors: the total time and the total space
required to execute a program. In this section, we will discuss quick sort and merge sort and also
compare them with each other.
Quick Sort
Quick sort is a comparison-based sorting algorithm that follows the divide and conquer technique
to sort arrays. In quick sort, we use a pivot (key) element, comparing and interchanging
the positions of elements relative to it; when the pivot element reaches its final fixed position
in the array, the comparison-and-interchange pass terminates. After that, the
array is divided into two sub-arrays: the first partition contains all the elements that are
less than the pivot (key) element, and the other part contains all the elements that are greater than
the pivot element. A pivot element is then selected in each of the sub-arrays, and the
same process is repeated until the whole array is sorted.

Algorithm of Quick Sort


Partition(A, p, r)
    x <- A[r]
    i <- p - 1
    for j <- p to r - 1
        do if A[j] <= x
            then i <- i + 1
                 exchange A[i] <-> A[j]
    exchange A[i + 1] <-> A[r]
    return i + 1

Quicksort(A, p, r)
    if p < r
        then q <- Partition(A, p, r)
             Quicksort(A, p, q - 1)
             Quicksort(A, q + 1, r)

Steps to sort an array using the quick sort algorithm

Suppose we have an array X with elements X[1], X[2], X[3], …, X[n] that are to be sorted. Let's
follow the steps below to sort the array using quick sort.
Step 1: Set the first element of the array as the pivot (key) element: here we assume the pivot is
X[key] = X[0]. The left pointer is placed at the first element and the right pointer at the last index of the array.

Step 2: Scan the array elements from the right-side index:

1. While X[key] < X[right], keep decrementing the right pointer.
2. When X[key] > X[right], interchange the positions of the key element and X[right], and set key = right.

Step 3: Now scan the elements from the left side and compare each element
with the key element:

1. While X[key] > X[left], keep incrementing the left pointer.
2. When X[key] < X[left], interchange the positions of X[key] and X[left], set key = left, and go to Step 2.

Step 4: Repeat Steps 2 and 3 until X[left] becomes equal to X[key]; when X[left]
= X[key], the partitioning pass terminates.

Step 5: After that, all the elements on the left side are smaller than the key element and the
elements on the right side are larger than it, so the array is partitioned into two sub-arrays
around the key.

Step 6: Similarly, repeat the above procedure on the sub-arrays until the
entire array becomes sorted.

Let's see an example of quick sort.

Example: Consider an array of 6 elements. Sort the array using quick sort.

arr[] = {50, 20, 60, 30, 40, 56}

After partitioning, 50 reaches its right place: the elements that are less than the pivot go into
one sub-array and the elements that are larger than the pivot go into another sub-array.
Sorting each sub-array the same way, we get the sorted array {20, 30, 40, 50, 56, 60}.
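A runnable Python sketch matching the Partition/Quicksort pseudocode above (it uses the last element as the pivot, per that pseudocode, rather than the first; names are illustrative):

    def partition(a, p, r):
        x = a[r]                       # pivot: last element
        i = p - 1
        for j in range(p, r):
            if a[j] <= x:
                i += 1
                a[i], a[j] = a[j], a[i]
        a[i + 1], a[r] = a[r], a[i + 1]  # put the pivot in its final place
        return i + 1

    def quicksort(a, p, r):
        if p < r:
            q = partition(a, p, r)
            quicksort(a, p, q - 1)     # sort elements smaller than the pivot
            quicksort(a, q + 1, r)     # sort elements larger than the pivot

    arr = [50, 20, 60, 30, 40, 56]
    quicksort(arr, 0, len(arr) - 1)
    print(arr)   # [20, 30, 40, 50, 56, 60]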

Merge sort
Merge sort is an important sorting technique that works on the divide and conquer strategy. It
is a popular technique for sorting data that resides externally in a file. The merge
sort algorithm divides the given array into two halves (N/2) and then recursively divides each
half until only single elements remain and no more division can take place. After that, it compares
the corresponding elements while merging, so that, finally, all the pieces are combined into the
final sorted array.
Steps to sort an array using the merge sort algorithm:
1. Given an array, first divide it into two sub-arrays of (roughly) half the size.
2. Divide each of those sub-arrays into two further sub-arrays in the same way.
3. This process is repeated continuously until each sub-array contains a single element and no more partitions are possible.
4. After that, compare the corresponding elements and start merging them so that each element is placed in ascending order.
5. The merging process continues until all the elements are merged in ascending order.

Let's see an example of merge sort.

Example: Consider an array of 9 elements. Sort the array using the merge sort.

arr[] = {70, 80, 40, 50, 60, 11, 35, 85, 2}


Hence, we get the sorted array using the merge sort.
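As a brief illustration, a short Python merge sort (function name is illustrative) applied to the example array:

    def merge_sort(a):
        if len(a) <= 1:
            return a
        mid = len(a) // 2
        left = merge_sort(a[:mid])     # recursively sort each half
        right = merge_sort(a[mid:])
        merged, i, j = [], 0, 0
        while i < len(left) and j < len(right):   # merge in ascending order
            if left[i] <= right[j]:
                merged.append(left[i]); i += 1
            else:
                merged.append(right[j]); j += 1
        return merged + left[i:] + right[j:]      # append any leftovers

    print(merge_sort([70, 80, 40, 50, 60, 11, 35, 85, 2]))
    # [2, 11, 35, 40, 50, 60, 70, 80, 85]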
Quick Sort vs. Merge Sort

1. Definition: Quick sort arranges the given elements in ascending order by comparing and interchanging the positions of elements around a pivot. Merge sort arranges the given set of elements in ascending order using the divide and conquer technique, then compares corresponding elements while merging.

2. Principle: Both work on divide and conquer techniques.

3. Partition of elements: In quick sort, the array can be divided in any ratio. Merge sort partitions the array into two halves (N/2).

4. Efficiency: Quick sort is more efficient and works faster on smaller arrays; merge sort is more efficient and works faster on larger data sets, compared to each other.

5. Sorting method: Quick sort is an internal sorting method that sorts data held in main memory; merge sort is an external sorting method that can sort data sets held in external files.

6. Time complexity: Quick sort's worst-case time complexity is O(n²), whereas merge sort's worst-case time complexity is O(n log n).

7. Preferred for: Quick sort is preferred for large unsorted arrays, whereas merge sort is preferred for sorting linked lists.

8. Stability: Quick sort is an unstable sorting algorithm, though it can be made stable with some changes to the code. Merge sort is a stable sorting algorithm: two equal elements keep their relative order in the sorted output.

9. Space required: Quick sort does not require any additional array to perform the sort; merge sort requires additional space for a temporary array used to merge the two sub-arrays.

10. Functionality: Quick sort compares each element with the pivot until all elements are arranged in ascending order, whereas merge sort splits the array into two parts (N/2) and continuously divides it until single elements are left.

OR
4a) Explain an Algorithm of Depth First Traversal with Example. 7M

DFS (Depth First Search) algorithm


In this article, we will discuss the DFS algorithm in the data structure. It is a recursive algorithm to
search all the vertices of a tree data structure or a graph. The depth-first search (DFS) algorithm
starts with the initial node of graph G and goes deeper until we find the goal node or the node
with no children.

Because of the recursive nature, stack data structure can be used to implement the DFS algorithm.
The process of implementing the DFS is similar to the BFS algorithm.

The step-by-step process to implement the DFS traversal is given as follows -

1. First, create a stack with the total number of vertices in the graph.
2. Now, choose any vertex as the starting point of the traversal, and push that vertex onto the stack.
3. After that, push a non-visited vertex (adjacent to the vertex on the top of the stack) onto the top of the stack.
4. Repeat step 3 until no unvisited vertices are left to visit from the vertex on the stack's top.
5. If no unvisited vertex is left, go back and pop a vertex from the stack.
6. Repeat steps 3, 4, and 5 until the stack is empty.

Applications of DFS algorithm


The applications of using the DFS algorithm are given as follows -

o DFS algorithm can be used to implement the topological sorting.


o It can be used to find the paths between two vertices.
o It can also be used to detect cycles in the graph.
o DFS algorithm is also used for puzzles that have only one solution.
o DFS is used to determine if a graph is bipartite or not.

Algorithm
Step 1: SET STATUS = 1 (ready state) for each node in G

Step 2: Push the starting node A on the stack and set its STATUS = 2 (waiting state)

Step 3: Repeat Steps 4 and 5 until STACK is empty

Step 4: Pop the top node N. Process it and set its STATUS = 3 (processed state)

Step 5: Push on the stack all the neighbors of N that are in the ready state (whose STATUS = 1)
and set their STATUS = 2 (waiting state)

[END OF LOOP]

Step 6: EXIT

Pseudocode

DFS(G, v)                   ( v is the vertex where the search starts )
    Stack S := {};          ( start with an empty stack )
    for each vertex u, set visited[u] := false;
    push S, v;
    while (S is not empty) do
        u := pop S;
        if (not visited[u]) then
            visited[u] := true;
            for each unvisited neighbour w of u
                push S, w;
        end if
    end while
END DFS()
Example of DFS algorithm
Now, let's understand the working of the DFS algorithm by using an example. In the example
given below, there is a directed graph having 8 vertices.
Now, let's start examining the graph starting from node H.

Step 1 - First, push H onto the stack.

STACK: H

Step 2 - POP the top element from the stack, i.e., H, and print it. Now, PUSH all the neighbors of
H onto the stack that are in the ready state.

Print: H
STACK: A

Step 3 - POP the top element from the stack, i.e., A, and print it. Now, PUSH all the neighbors of
A onto the stack that are in the ready state.

Print: A
STACK: B, D

Step 4 - POP the top element from the stack, i.e., D, and print it. Now, PUSH all the neighbors of
D onto the stack that are in the ready state.

Print: D
STACK: B, F

Step 5 - POP the top element from the stack, i.e., F, and print it. Now, PUSH all the neighbors of
F onto the stack that are in the ready state.

Print: F
STACK: B

Step 6 - POP the top element from the stack, i.e., B, and print it. Now, PUSH all the neighbors of
B onto the stack that are in the ready state.

Print: B
STACK: C

Step 7 - POP the top element from the stack, i.e., C, and print it. Now, PUSH all the neighbors of
C onto the stack that are in the ready state.

Print: C
STACK: E, G

Step 8 - POP the top element from the stack, i.e., G, and print it. Now, PUSH all the neighbors of
G onto the stack that are in the ready state.

Print: G
STACK: E

Step 9 - POP the top element from the stack, i.e., E, and print it. Now, PUSH all the neighbors of
E onto the stack that are in the ready state.

Print: E
STACK: (empty)

Now, all the graph nodes have been traversed, and the stack is empty.

Complexity of Depth-first search algorithm


The time complexity of the DFS algorithm is O(V+E), where V is the number of vertices and E is the
number of edges in the graph.

The space complexity of the DFS algorithm is O(V).

Implementation of DFS algorithm


Now, let's see the implementation of DFS algorithm in Java.

In this example, the graph that we are using to demonstrate the code is given as follows -
/* A sample Java program to implement the DFS algorithm */

import java.util.*;

class DFSTraversal {
    private LinkedList<Integer> adj[];   /* adjacency list representation */
    private boolean visited[];

    /* Creation of the graph */
    DFSTraversal(int V)                  /* 'V' is the number of vertices in the graph */
    {
        adj = new LinkedList[V];
        visited = new boolean[V];

        for (int i = 0; i < V; i++)
            adj[i] = new LinkedList<Integer>();
    }

    /* Adding an edge to the graph */
    void insertEdge(int src, int dest) {
        adj[src].add(dest);
    }

    void DFS(int vertex) {
        visited[vertex] = true;          /* Mark the current node as visited */
        System.out.print(vertex + " ");

        Iterator<Integer> it = adj[vertex].listIterator();
        while (it.hasNext()) {
            int n = it.next();
            if (!visited[n])
                DFS(n);
        }
    }

    public static void main(String args[]) {
        DFSTraversal graph = new DFSTraversal(8);

        graph.insertEdge(0, 1);
        graph.insertEdge(0, 2);
        graph.insertEdge(0, 3);
        graph.insertEdge(1, 3);
        graph.insertEdge(2, 4);
        graph.insertEdge(3, 5);
        graph.insertEdge(3, 6);
        graph.insertEdge(4, 7);
        graph.insertEdge(4, 5);
        graph.insertEdge(5, 2);
        System.out.println("Depth First Traversal for the graph is:");
        graph.DFS(0);
    }
}

Output:

Depth First Traversal for the graph is:

0 1 3 5 2 4 7 6
4b) What is the need of Randomizing data structure? Give an Example 8M
Randomizing data structures can enhance their efficiency and security by reducing the predictability
of operations. One example is in hash functions used for hash tables. Without randomization, an
adversary could intentionally craft inputs leading to worst-case performance. Randomization, like
using random hash functions, helps mitigate such attacks, ensuring a more balanced and
unpredictable distribution of data.

Randomizing data structures can enhance security, prevent performance vulnerabilities, and improve
overall robustness in certain applications. For example, consider hash tables. Randomizing the hash
function or using techniques like "chaining" with randomized linked lists can mitigate the risk of
hash collisions. This makes it more challenging for adversaries to manipulate data in ways that could
lead to performance degradation or security vulnerabilities, such as denial-of-service attacks.
Randomization adds an element of unpredictability, making it harder for malicious actors to exploit
weaknesses in the system.

Randomizing data structures can be useful in various applications to achieve specific goals such as
improving performance, security, or fairness. Here are a few examples of why randomizing data
structures might be necessary:

Security and Cryptography:

Example: In cryptographic applications, using randomized data structures can enhance security by
introducing unpredictability. For instance, using randomized hash functions or salt values in
password hashing can make it more difficult for attackers to exploit patterns in the data.

Load Balancing:

Example: In distributed systems, randomizing data structures like load balancing algorithms can
ensure that requests are evenly distributed among servers. This prevents hotspots and improves
overall system performance.

Preventing Denial-of-Service Attacks:


Example: Randomized data structures can be employed to defend against certain types of denial-of-
service attacks. For instance, randomizing the order in which requests are processed can make it
harder for an attacker to predict and exploit system weaknesses.

Algorithmic Fairness:

Example: In scenarios where fairness is a concern, randomizing the order of processing data or
making decisions can help reduce bias. For instance, in a job interview process, randomly ordering
the evaluation of candidates can minimize unintentional bias in the assessment.

Network Routing:

Example: Randomized algorithms in network routing can be used to avoid congestion and find more
efficient paths for data transmission. This helps prevent predictable patterns that could be exploited
by malicious entities.

Cache Management:

Example: Randomizing cache eviction policies can be beneficial in preventing cache-based side-
channel attacks. By introducing randomness in cache replacement decisions, it becomes more
challenging for attackers to predict and exploit patterns.

Privacy Preservation:

Example: Randomizing data structures in privacy-preserving algorithms, such as differential


privacy, can help protect individuals' sensitive information by adding noise to the data. This makes
it more difficult for adversaries to infer specific details about individuals in the dataset.
In summary, randomizing data structures can be a valuable tool in various domains to enhance
security, improve performance, promote fairness, and address specific challenges in algorithmic
design.
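As one concrete sketch of the hash-table case discussed above: universal hashing chooses the hash-function parameters at random at startup, so an adversary cannot precompute colliding keys. The prime and the names below are illustrative assumptions, not from the question paper:

    import random

    P = 2_147_483_647   # a large prime (2**31 - 1)

    def make_random_hash(m):
        """Return a randomly chosen hash function h(k) = ((a*k + b) mod P) mod m."""
        a = random.randrange(1, P)   # chosen once, at startup
        b = random.randrange(0, P)
        return lambda k: ((a * k + b) % P) % m

    h = make_random_hash(17)
    print([h(k) for k in (58, 48, 79, 46, 54)])  # indices differ from run to run

Because a and b are unknown to an attacker, crafting a set of keys that all collide (the worst case exploited in denial-of-service attacks) becomes impractical.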
UNIT –III

5a) Explain Double Hashing with example. 8M

Double hashing is a technique used in hash tables to resolve collisions that occur when
two different keys hash to the same index. In this method, a secondary hash function is
applied to the original key, and the result is used to determine the next possible index for
the key. If the calculated index is already occupied, the process is repeated with a
different offset until an empty slot is found.
Here's a step-by-step explanation of double hashing with an example:
Hash Functions:
Assume you have a hash table with m slots, and define two hash functions:
The primary hash function, h1(key) = key mod m, maps the key to the initial index.
The secondary hash function, h2(key) = 1 + (key mod (m - 1)), determines the step size for probing.

Insertion:
When inserting a key into the hash table, calculate the initial index using the primary
hash function: index = h1(key).
If the slot at the calculated index is empty, insert the key there.
If the slot is occupied, use the secondary hash function to calculate the next index:
index = (index + h2(key)) mod m.
Repeat this process until an empty slot is found.
Example:
Suppose you have a hash table with 10 slots (m = 10).
Primary hash function:
h1(key)=key mod 10
Secondary hash function:
h2(key)=1+(key mod 9)
Inserting keys 12, 22, and 32:

h1(12) = 12 mod 10 = 2: slot 2 is empty, so insert 12 at index 2.
h1(22) = 22 mod 10 = 2: collision! Use double hashing to find the next index:
h2(22) = 1 + (22 mod 9) = 1 + 4 = 5
index = (2 + 5) mod 10 = 7: slot 7 is empty, so insert 22 at index 7.
h1(32) = 32 mod 10 = 2: collision again! Use double hashing:
h2(32) = 1 + (32 mod 9) = 1 + 5 = 6
index = (2 + 6) mod 10 = 8: slot 8 is empty, so insert 32 at index 8.


This process ensures that even if keys collide at the same initial index, they will be placed
at different positions in the hash table, reducing the likelihood of further collisions.
There are two important rules to be followed for the second hash function:
1. It must never evaluate to zero.
2. It must make sure that all cells can be probed.

The formula used for double hashing is:

index_i = (h1(key) + i · h2(key)) mod m

Double hashing requires a second hash function whose probing behaviour is as good as the first, so that even adversarial collisions are handled well. Double hashing is more complex to implement than quadratic probing, but quadratic probing is the faster technique of the two.

5b) Define Queue ADT. Construct different function of Queue ADT with example
7M

Answer:
The queue abstract data type is defined by the following structure and operations. A queue is
structured, as described above, as an ordered collection of items which are added at one end,
called the “rear,” and removed from the other end, called the “front.” Queues maintain a FIFO
ordering property. The queue operations are given below.
 Queue() creates a new queue that is empty. It needs no parameters and returns an empty queue.
 enqueue(item) adds a new item to the rear of the queue. It needs the item and returns nothing.
 dequeue() removes the front item from the queue. It needs no parameters and returns the item. The
queue is modified.
 isEmpty() tests to see whether the queue is empty. It needs no parameters and returns a boolean
value.
 size() returns the number of items in the queue. It needs no parameters and returns an integer.
As an example, if we assume that q is a queue that has been created and is currently empty,
then Table 1 shows the results of a sequence of queue operations. The queue contents are shown
such that the front is on the right. 4 was the first item enqueued so it is the first item returned by
dequeue.
Table 1: Example Queue Operations

Queue Operation       Queue Contents           Return Value

q.isEmpty()           []                       True
q.enqueue(4)          [4]
q.enqueue('dog')      ['dog', 4]
q.enqueue(True)       [True, 'dog', 4]
q.size()              [True, 'dog', 4]         3
q.isEmpty()           [True, 'dog', 4]         False
q.enqueue(8.4)        [8.4, True, 'dog', 4]
q.dequeue()           [8.4, True, 'dog']       4
q.dequeue()           [8.4, True]              'dog'
q.size()              [8.4, True]              2

Here's a basic definition of the Queue ADT and the common operations/functions
associated with it:
Queue ADT:
A queue is an ordered collection of elements where an element is inserted at one end (rear)
and deleted from the other end (front).
The order is First In, First Out (FIFO).
Queue Functions:
Enqueue (Insert):
 Adds an element to the rear of the queue.
 If the queue is full, it may result in an overflow.
Dequeue (Delete):
 Removes an element from the front of the queue.
 If the queue is empty, it results in an underflow.
Front:
 Returns the element at the front without removing it.

A minimal implementation sketch is given below.
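A minimal Python realization of this ADT (a sketch; it keeps the front of the queue at the right end of a Python list, matching the convention of Table 1):

    class Queue:
        def __init__(self):
            self.items = []              # front of the queue is the end of the list

        def isEmpty(self):
            return self.items == []

        def enqueue(self, item):
            self.items.insert(0, item)   # insert at the rear

        def dequeue(self):
            return self.items.pop()      # remove from the front

        def size(self):
            return len(self.items)

    q = Queue()
    print(q.isEmpty())    # True
    q.enqueue(4); q.enqueue('dog'); q.enqueue(True)
    print(q.size())       # 3
    print(q.dequeue())    # 4  (the first item enqueued is returned first)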
6a) Explain Hash Functions with example 7M
A hash function is a mathematical function that takes an input (or 'key') and produces a
fixed-size string or number, which is typically a hash code. The primary purpose of a hash
function is to map data of arbitrary size to a fixed size, making it suitable for various
applications, such as indexing data structures like hash tables.
Here are the key characteristics of a good hash function:
Deterministic: Given the same input, a hash function must always produce the same output.
Efficient: The hash function should be computationally efficient and quick to compute.
Uniformity: Hash values should be distributed as evenly as possible across the output space
to minimize collisions.
Avalanche Effect: A small change in the input should result in a significantly different hash
code.
HASHING AND HASH FUNCTIONS:

Hash table is a data structure used for storing and retrieving data very quickly. Insertion of
data in the hash table is based on the key value. Hence every entry in the hash table is
associated with some key.

Using the hash key, the required piece of data can be found in the hash table with only a few
key comparisons. The search time thus depends on the size of the hash table.

The effective representation of dictionary can be done using hash table. We can place
the dictionary entries in the hash table using hash function.

Hash function is a function which is used to put the data in the hash table. Hence one can use
the same hash function to retrieve the data from the hash table. Thus hash function is used to
implement the hash table.
The integer returned by the hash function is called hash key.

For example: Suppose we want to place some employee records in the hash table, keyed by
employee ID. The employee ID is a 7-digit number, and to place a record the 7-digit number is
reduced to 3 digits by taking only the last three digits of the key.
If the key is 4967000, the record is stored at position 0 (4967000 % 1000 = 0); for the key
8421002, the record is placed at position 2 in the array. Hence the hash function is H(key) =
key % 1000, where key % 1000 is the hash function and the value obtained from it is called the
hash key.

Bucket and home bucket: The hash function H(key) is used to map several dictionary
entries into the hash table. Each position of the hash table is called a bucket.
H(key) is the home bucket for the dictionary pair whose key is key.

TYPES OF HASH FUNCTION


There are various types of hash functions that are used to place the record in the hash table-
1. Division Method: The hash function depends upon the remainder of a division. Typically the
divisor is the table length.
For example, if the records 54, 72, 89, 37 are placed in a hash table of size 10, then
h(key) = record % table size:

54 % 10 = 4
72 % 10 = 2
89 % 10 = 9
37 % 10 = 7

Resulting table: index 2 -> 72, index 4 -> 54, index 7 -> 37, index 9 -> 89.

2. Mid Square
In the mid-square method, the key is squared and the middle part of the result is used as the index.
If the key is a string, it has to be preprocessed to produce a number. Consider placing the
record 3111:

3111² = 9678321

For a hash table of size 1000, H(3111) = 783 (the middle 3 digits).

3. Multiplicative hash function:

The given key is multiplied by some constant value. The formula for computing the hash key is
H(key) = floor(p * (fractional part of key * A)), where p is an integer constant and A is a constant
real number. Donald Knuth suggested using the constant A = 0.61803398987.

If key = 107 and p = 50, then

key * A = 107 * 0.61803398987 = 66.1296...
fractional part = 0.1296...
H(key) = floor(50 * 0.1296...) = 6

So the record 107 will be placed at location 6 in the hash table.

4. Digit Folding:

The key is divided into separate parts, and these parts are combined using some simple operation to
produce the hash key.
For example, consider the record 12365412. It is divided into the separate parts 123, 654, 12, and these are
added together:

H(key) = 123 + 654 + 12 = 789

The record will be placed at location 789.

5. Digit Analysis:

The digit analysis is used in a situation when all the identifiers are known in advance. We first
transform the identifiers into numbers using some radix, r. Then examine the digits of each identifier.
Some digits having most skewed distributions are deleted. This deleting of digits is continued until
the number of remaining digits is small enough to give an address in the range of the hash table. Then
these digits are used to calculate the hash address.
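A short, hedged Python sketch of the first four methods; the digit-extraction details are assumptions chosen to match the worked numbers above (a 7-digit square for mid-square, 3-digit chunks for folding):

    import math

    def division(key, table_size=10):
        return key % table_size                  # e.g. 54 % 10 = 4

    def mid_square(key):
        s = str(key * key)                       # 3111 * 3111 = 9678321
        mid = len(s) // 2
        return int(s[mid - 1:mid + 2])           # middle 3 digits -> 783

    def multiplicative(key, p=50, A=0.61803398987):
        return math.floor(p * ((key * A) % 1))   # fractional part of key*A -> 6 for key 107

    def digit_folding(key):
        s = str(key)
        return sum(int(s[i:i + 3]) for i in range(0, len(s), 3))  # 123 + 654 + 12 = 789

    print(division(54), mid_square(3111), multiplicative(107), digit_folding(12365412))
    # 4 783 6 789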
6b) Explain Collision Resolution of Hashing with example 8M

COLLISION

The hash function is a function that returns the key value using which the record can be placed
in the hash table. This function helps us place the record at an appropriate position in the hash
table, so that we can later retrieve the record directly from that location.
The function needs to be designed very carefully: it should not return the same hash key
address for two different records, which is an undesirable situation in hashing.

Definition: The situation in which the hash function returns the same hash key (home
bucket) for more than one record is called a collision, and two different records that hash to
the same key are called synonyms.
Similarly, when there is no room for a new pair in the hash table, the situation is
called overflow. Sometimes handling a collision leads to an overflow condition.
Frequent collisions and overflows indicate a poor hash function.
For example, consider the hash function

H(key) = record key % 10, with a hash table of size 10.

The record keys to be placed are 131, 44, 43, 78, 19, 36, 57 and 77:

131 % 10 = 1
44 % 10 = 4
43 % 10 = 3
78 % 10 = 8
19 % 10 = 9
36 % 10 = 6
57 % 10 = 7

Resulting table: index 1 -> 131, index 3 -> 43, index 4 -> 44, index 6 -> 36, index 7 -> 57, index 8 -> 78, index 9 -> 19.

Now if we try to place 77 in the hash table, we get the hash key 7, but index 7 is already
occupied by the record key 57. This situation is called collision. Looking from index 7 for the
next vacant position at the subsequent indices 8 and 9, we find there is no room to place 77 in the
hash table. This situation is called overflow.

COLLISION RESOLUTION TECHNIQUES

If a collision occurs, it should be handled by applying some technique, called a collision
handling technique:
1. Chaining
2. Open addressing (linear probing)
3. Quadratic probing
4. Double hashing
5. Rehashing

CHAINING
In the chaining method of collision handling, an additional field, the chain, is kept with each data item.
A separate chain is maintained for colliding data: when a collision occurs, a linked list (chain)
is maintained at the home bucket.

For example, consider the keys 131, 3, 4, 21, 61, 7, 97, 8, 9 to be placed in their home buckets
using the hash function H(key) = key % D, where D = 10 is the size of the table.

A chain is maintained for colliding elements: for instance, 131 has home bucket 1, and keys 21
and 61 also demand home bucket 1, so a chain is maintained at index 1 (similarly, 7 and 97 share
home bucket 7).
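A compact Python sketch of chaining (using plain lists as the chains; the names are illustrative):

    D = 10
    table = [[] for _ in range(D)]     # one chain (list) per bucket

    def chain_insert(key):
        table[key % D].append(key)     # colliding keys share the home bucket's chain

    for k in (131, 3, 4, 21, 61, 7, 97, 8, 9):
        chain_insert(k)

    print(table[1])   # [131, 21, 61] -- all keys whose home bucket is 1
    print(table[7])   # [7, 97]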

OPEN ADDRESSING – LINEAR PROBING

This is the easiest method of handling collisions. When a collision occurs, i.e., when two records
demand the same home bucket in the hash table, the collision is resolved by placing the second
record linearly down wherever an empty bucket is found. When using linear probing (open addressing), the
hash table is represented as a one-dimensional array with indices that range from 0 to the desired table size minus
1. Before inserting any elements into this table, we must initialize the table to represent the situation where
all slots are empty. This allows us to detect overflows and collisions when we insert elements into the table.
Then, using some suitable hash function, the element can be inserted into the hash table.

For example:

Consider that following keys are to be inserted in the hash table 131, 4, 8, 7, 21, 5, 31, 61, 9, 29
Initially, we will put the following keys in the hash table.

We will use Division hash function. That means the keys are placed using the formula H(key) = key %

tablesize

H(key) = key % 10

For instance, the element 131 can be placed at H(key) = 131 % 10 = 1.

Index 1 will be the home bucket for 131. Continuing in this fashion, we place 4, 8, and 7. The next key to

be inserted is 21. According to the hash function, H(21) = 21 % 10 = 1.

But the index 1 location is already occupied by 131, i.e., a collision occurs. To resolve this collision, we linearly
move down and probe the element into the next empty location, so 21 is placed at index
2. The next element is 5; its home bucket, index 5, is empty, so we
put the element 5 at index 5.
Continuing, key 31 hashes to index 1 (31 % 10 = 1); indices 1 and 2 are occupied, so 31 goes to
index 3. Key 61 also hashes to index 1; indices 1 through 5 are all occupied, so 61 goes to index 6.

The next record key is 9. According to the division hash function, it demands home bucket 9,
which is empty, so 9 is placed at index 9. The final record key is 29, which also hashes to
bucket 9. But home bucket 9 is already occupied, and there is no next empty bucket, as the
table size is limited to index 9: overflow occurs. To handle it, we wrap around to bucket 0;
that location is empty, so 29 is placed at index 0.

The final hash table is:

Index: 0    1    2   3   4  5  6   7  8  9
Key:   29   131  21  31  4  5  61  7  8  9
Problem with linear probing:
One major problem with linear probing is primary clustering: a contiguous block of occupied
slots builds up in the hash table as collisions are resolved, as the following keys show (they all
crowd around indices 8 and 9):

19%10 = 9
18%10 = 8
39%10 = 9
29%10 = 9
8%10 = 8
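A short Python sketch of linear probing with wrap-around, reproducing the 131, 4, 8, ... example above (function names are illustrative):

    def linear_insert(table, key):
        m = len(table)
        index = key % m                      # home bucket
        for step in range(m):
            probe = (index + step) % m       # move linearly down, wrapping to 0
            if table[probe] is None:
                table[probe] = key
                return probe
        raise OverflowError("hash table is full")

    table = [None] * 10
    for k in (131, 4, 8, 7, 21, 5, 31, 61, 9, 29):
        linear_insert(table, k)
    print(table)   # [29, 131, 21, 31, 4, 5, 61, 7, 8, 9]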
QUADRATIC PROBING:
Quadratic probing operates by taking the original hash value and adding successive
values of an arbitrary quadratic polynomial to the starting value. This method uses the
following formula:

H_i(key) = (H(key) + i²) % table size, for i = 0, 1, 2, …
DOUBLE HASHING
Double hashing is a technique in which a second hash function is applied to the key when a collision occurs. The second hash function gives the step size, i.e. the number of positions from the point of collision at which to try the next slot.
There are two important rules for the second function: it must never evaluate to zero, and it must make sure that all cells can be probed. The formula used for double hashing is

H(key, i) = (H1(key) + i * H2(key)) % tablesize

where a common choice for the second function is H2(key) = R - (key % R), with R a prime smaller than the table size.
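A short Python sketch of double hashing under these rules, assuming H1(key) = key % 10 and H2(key) = R - (key % R) with R = 7; the table size and R are illustrative (in practice the table size is chosen prime so that every cell can be probed):

    TABLE_SIZE = 10
    R = 7  # prime smaller than the table size

    def h1(key):
        return key % TABLE_SIZE

    def h2(key):
        return R - (key % R)  # never zero, so probing always advances

    def double_hash_insert(table, key):
        for i in range(TABLE_SIZE):
            probe = (h1(key) + i * h2(key)) % TABLE_SIZE
            if table[probe] is None:
                table[probe] = key
                return probe
        raise OverflowError("hash table is full")

    table = [None] * 10
    for key in [89, 18, 49, 58, 69]:
        double_hash_insert(table, key)
    print(table)  # [69, None, None, 58, None, None, 49, None, 18, 89]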
UNIT-IV

7a) Analyse Priority Queue using Heaps with example 7M


A Priority Queue is an abstract data type that stores elements with associated
priorities and allows efficient access to the element with the highest (or lowest)
priority. One common implementation of a priority queue is using a heap data
structure.
A heap is a specialized binary tree-based data structure that satisfies the heap
property. In a Max Heap, each parent node has a value greater than or equal to its
children, making the maximum element easily accessible. In a Min Heap, each
parent node has a value less than or equal to its children, allowing for easy
retrieval of the minimum element.

Since a heap is a complete binary tree, it can be stored compactly in an array, and both insertion and removal of the maximum run in O(log n) time. Let's analyze the Priority Queue using a Max Heap as an example.

Operations in a Priority Queue using a Max Heap:

Insert (Enqueue):

Add a new element to the Priority Queue.


Place it in the next available position (usually at the bottom, rightmost spot).
Reheapify the heap to maintain the Max Heap property.
Extract-Max (Dequeue):
Remove and return the element with the highest priority (root of the Max Heap).
Replace the root with the last element in the heap.
Reheapify the heap to maintain the Max Heap property.
Peek (Get Max):
Return the element with the highest priority without removing it (root of the Max
Heap).

Example (in Python):

class MaxHeapPriorityQueue:
    def __init__(self):
        self.heap = []  # array representation of the complete binary tree

    def insert(self, item):
        # Place the item at the bottom-rightmost position, then sift it up.
        self.heap.append(item)
        self._heapify_up()

    def extract_max(self):
        # Remove and return the root (highest-priority element).
        if not self.heap:
            raise IndexError("Priority Queue is empty")
        if len(self.heap) == 1:
            return self.heap.pop()
        max_item = self.heap[0]
        self.heap[0] = self.heap.pop()  # move the last element to the root
        self._heapify_down()
        return max_item

    def peek(self):
        # Return the root without removing it.
        if not self.heap:
            raise IndexError("Priority Queue is empty")
        return self.heap[0]

    def _heapify_up(self):
        # Swap the new element with its parent until the heap property holds.
        index = len(self.heap) - 1
        while index > 0:
            parent_index = (index - 1) // 2
            if self.heap[index] > self.heap[parent_index]:
                self.heap[index], self.heap[parent_index] = \
                    self.heap[parent_index], self.heap[index]
                index = parent_index
            else:
                break

    def _heapify_down(self):
        # Swap the root downward with its larger child until the heap
        # property holds.
        index = 0
        while True:
            left_child_index = 2 * index + 1
            right_child_index = 2 * index + 2
            largest = index
            if (left_child_index < len(self.heap)
                    and self.heap[left_child_index] > self.heap[largest]):
                largest = left_child_index
            if (right_child_index < len(self.heap)
                    and self.heap[right_child_index] > self.heap[largest]):
                largest = right_child_index
            if largest != index:
                self.heap[index], self.heap[largest] = \
                    self.heap[largest], self.heap[index]
                index = largest
            else:
                break

# Example usage:
priority_queue = MaxHeapPriorityQueue()
priority_queue.insert(10)
priority_queue.insert(30)
priority_queue.insert(20)

print("Peek:", priority_queue.peek())                    # Peek: 30
max_item = priority_queue.extract_max()
print("Extracted Max:", max_item)                        # Extracted Max: 30
print("Peek after extraction:", priority_queue.peek())   # Peek after extraction: 20

In this example, we've implemented a Max Heap-based Priority Queue with basic
operations like insert, extract_max, and peek. The underlying heap is adjusted to
maintain the Max Heap property after each operation. Keep in mind that a Min Heap
would be used for a Priority Queue where the element with the smallest priority should
be accessed first.
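As a usage note, Python's standard library already ships a binary heap in the heapq module; it is a min-heap, so max-heap behaviour is commonly obtained by negating priorities, as in this small sketch:

    import heapq

    heap = []
    for priority in [10, 30, 20]:
        heapq.heappush(heap, -priority)  # negate to get max-heap behaviour

    print(-heap[0])               # peek: 30
    print(-heapq.heappop(heap))   # extract max: 30
    print(-heap[0])               # peek after extraction: 20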

7b) Explain about Brute-Force Pattern Matching 8M

Answer:

Brute-force pattern matching is a simple and straightforward algorithm used to find


occurrences of a pattern within a text. The basic idea behind this approach is to
systematically check all possible positions in the text for the occurrence of the pattern.
The algorithm slides the pattern over the text one position at a time, checking for a
match at each position. If a match is found, the algorithm reports the position of the
match in the text.

Here's a step-by-step explanation of the brute-force pattern matching algorithm:

Initialization:
Begin at the leftmost position of the text.
Comparison:
Compare each character of the pattern to the corresponding characters in the text,
starting from the current position.
Matching:
If all characters in the pattern match the corresponding characters in the text, a match is
found.
Shift:
Move the pattern one position to the right and repeat steps 2-3.
Repeat:
Continue shifting the pattern until the entire text is covered.
Reporting:
Report the positions where matches are found.
public class BruteForcePatternMatching {
    public static void main(String[] args) {
        String text = "ABABDABACDABABCABAB";
        String pattern = "ABABC";
        int[] matches = bruteForcePatternMatching(text, pattern);
        System.out.print("Pattern found at positions: ");
        for (int match : matches) {
            System.out.print(match + " ");
        }
    }

    public static int[] bruteForcePatternMatching(String text, String pattern) {
        int n = text.length();
        int m = pattern.length();
        int[] positions = new int[n]; // array to store positions of matches
        int count = 0;                // count of matches

        // Slide the pattern over the text one position at a time.
        for (int i = 0; i <= n - m; i++) {
            int j;
            // Compare the pattern with the text at the current alignment.
            for (j = 0; j < m; j++) {
                if (text.charAt(i + j) != pattern.charAt(j)) {
                    break; // mismatch: stop comparing at this alignment
                }
            }
            if (j == m) {
                positions[count++] = i; // full match found, store position
            }
        }

        // Copy the matches into an array of the exact size.
        int[] result = new int[count];
        System.arraycopy(positions, 0, result, 0, count);
        return result;
    }
}
Output:
Pattern found at positions: 10
In this example, the bruteForcePatternMatching function is called with a sample text
and pattern. The pattern "ABABC" is found at position 10 in the text
"ABABDABACDABABCABAB". The output displays the position where the pattern
is found.

Keep in mind that the output varies with the specific text and pattern provided. If the pattern is not found, the returned array is empty, so no positions are printed.
In this Java example:

The bruteForcePatternMatching function takes a text and a pattern as input and returns
an array of positions where the pattern is found in the text.
The algorithm iterates through each position in the text, comparing characters with the
pattern.
If a match is found, the position is stored in an array.
The positions array is then copied into a new array of the exact size of the matches
found, and this array is returned.
This simple example demonstrates how the brute-force pattern matching algorithm
works in Java. Keep in mind that for larger texts and patterns, more efficient algorithms
like the Knuth-Morris-Pratt (KMP) or Boyer-Moore might be preferred due to their
better time complexity.

8a) Write about ABT boyer-Moore Algorithm and explain in detail.


8M

The Boyer-Moore algorithm is a well-known string-searching algorithm; "ABT" does not correspond to a standard term in this context, so the answer below explains the Boyer-Moore algorithm itself.

Boyer-Moore Algorithm:
The Boyer-Moore algorithm is a highly efficient algorithm for string searching and
pattern matching. It was developed by Robert S. Boyer and J Strother Moore in 1977.
Unlike some other algorithms, Boyer-Moore takes advantage of information gained
from previous comparisons to skip unnecessary comparisons.

Key Components:
Bad Character Rule:

The algorithm preprocesses the pattern and creates a table of the rightmost occurrence
of each character in the pattern. This information is used to skip comparisons when a
mismatch occurs.
Good Suffix Rule:

Boyer-Moore also employs a good suffix rule to skip portions of the pattern when a
mismatch occurs. It involves finding the longest suffix of the pattern that matches a
substring of the pattern before the mismatch.
Steps of the Algorithm:

Preprocessing:
 Create a Bad Character Shift table, which stores the rightmost occurrence of
each character in the pattern.
 Create a Good Suffix Shift table, which stores information about how much to
shift when a mismatch occurs.
Search:

 Start comparing the pattern to the text from right to left.


 When a mismatch occurs at position i, calculate two shifts:
• Bad Character Shift: Use the Bad Character Rule to shift the pattern
so that the mismatched character in the text aligns with its rightmost
occurrence in the pattern.
• Good Suffix Shift: Use the Good Suffix Rule to shift the pattern based
on the occurrence of the mismatched suffix in the pattern.
Continue Searching:
Repeat the search until the end of the text or a match is found.
Example:
Let's take an example:

Text: "ABABCABABCD"
Pattern: "ABAB"

1. Preprocessing:

Bad Character (last-occurrence) table for "ABAB": {'A': 2, 'B': 3} (0-indexed rightmost positions). 'C' and 'D' do not occur in the pattern, so a mismatch on them lets the pattern shift completely past that text character.

2. Search (comparing the pattern to the text from right to left at each alignment):

• Shift 0: text[0..3] = "ABAB" matches the whole pattern. Match found at position 0. Shift the pattern by 2, the shift the good suffix rule gives after a full match (the pattern's period).

• Shift 2: pattern[3] = 'B' is compared with text[5] = 'A'; mismatch. The bad character rule aligns the rightmost 'A' of the pattern (index 2) under text[5]: shift by 3 - 2 = 1.

• Shift 3: 'B' and 'A' match, but then pattern[1] = 'B' meets text[4] = 'C'. 'C' is not in the pattern, so the pattern shifts completely past text[4], to shift 5.

• Shift 5: text[5..8] = "ABAB" matches the whole pattern. Match found at position 5.

• Shift 7: pattern[3] = 'B' meets text[10] = 'D'; 'D' is not in the pattern, so the pattern would shift past the end of the text and the search terminates.

Advantages:
• Boyer-Moore is particularly effective for searching in long texts with shorter patterns.
• In the best case it is sublinear, because many text characters are skipped without ever being examined; on typical inputs its running time is well below that of the brute-force approach.
Conclusion:
The Boyer-Moore algorithm is widely used in practice due to its efficiency in handling
real-world scenarios. Its ability to skip comparisons based on preprocessed information
makes it a powerful tool for pattern matching
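To make the bad character rule concrete, here is a hedged Python sketch that applies only that rule (the full Boyer-Moore algorithm combines it with the good suffix rule, so this simplified variant is an illustration rather than the complete algorithm):

    def boyer_moore_bad_char(text, pattern):
        n, m = len(text), len(pattern)
        # Rightmost occurrence of each character in the pattern.
        last = {ch: i for i, ch in enumerate(pattern)}
        matches = []
        shift = 0
        while shift <= n - m:
            j = m - 1
            while j >= 0 and pattern[j] == text[shift + j]:
                j -= 1                    # compare right to left
            if j < 0:
                matches.append(shift)     # full match
                shift += 1                # conservative shift after a match
            else:
                bad = text[shift + j]
                # Align the rightmost occurrence of the bad character,
                # or move past it if it is not in the pattern at all.
                shift += max(1, j - last.get(bad, -1))
        return matches

    print(boyer_moore_bad_char("ABABCABABCD", "ABAB"))  # [0, 5]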

8b) Write an algorithm for inorder, preorder and postorder traversals of a
binary search tree 7M

Binary Tree Traversal Algorithms:


1. In-order Traversal:
In in-order traversal, we visit the left subtree, then the root, and finally the right
subtree.

Algorithm for In-order Traversal:

1. Traverse the left subtree using in-order traversal.


2. Visit the root node.
3. Traverse the right subtree using in-order traversal.
2. Pre-order Traversal:
In pre-order traversal, we visit the root node first, then the left subtree, and finally the
right subtree.

Algorithm for Pre-order Traversal:

1. Visit the root node.


2. Traverse the left subtree using pre-order traversal.
3. Traverse the right subtree using pre-order traversal.
3. Post-order Traversal:
In post-order traversal, we visit the left subtree first, then the right subtree, and
finally the root.
Algorithm for Post-order Traversal:

1. Traverse the left subtree using post-order traversal.


2. Traverse the right subtree using post-order traversal.
3. Visit the root node.
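A minimal recursive sketch of the three traversals in Python, using the seven-node letter-keyed tree from the traversal examples below (node names are illustrative):

    class Node:
        def __init__(self, key, left=None, right=None):
            self.key, self.left, self.right = key, left, right

    def inorder(node):    # Left, Root, Right
        if node:
            inorder(node.left)
            print(node.key, end=" ")
            inorder(node.right)

    def preorder(node):   # Root, Left, Right
        if node:
            print(node.key, end=" ")
            preorder(node.left)
            preorder(node.right)

    def postorder(node):  # Left, Right, Root
        if node:
            postorder(node.left)
            postorder(node.right)
            print(node.key, end=" ")

    # Example tree: A is the root, B (children D, E) on the left,
    # C (children F, G) on the right.
    root = Node("A",
                Node("B", Node("D"), Node("E")),
                Node("C", Node("F"), Node("G")))

    inorder(root)    # D B E A F C G
    print()
    preorder(root)   # A B D E C F G
    print()
    postorder(root)  # D E B F G C A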

A binary search tree is a binary tree made up of nodes, each carrying a key that signifies its value.
The values of the nodes in the left subtree are smaller than the value of the root node, and the values of the nodes in the right subtree are larger than the value of the root node.
The root node is the parent node of both subtrees.
The diagram below shows the main parts of a binary tree:

[Diagram of a binary search tree: root A; children B and C; D and E under B; H and I under D; F and G under C.]
Let us look at the relationships between the nodes.

• A is the root node.
• The left subtree begins at B while the right subtree begins at C.
• Node A has two child nodes – B and C.
• Node C is the parent node of F and G. F and G are siblings.
• Nodes F and G are known as leaf nodes because they do not have children.
• Node B is the parent node of D and E.
• Node D is the parent node of H and I.
• D and E are siblings, as are H and I.
• Node E is a leaf node.
So here are some important terms that we just used to describe the tree above:

Root: The topmost node in the tree.

Parent: A node with a child or children.

Child: A node extended from another node (parent node).

Leaf: A node without a child.


Binary search trees help us speed up our binary search as we are able to find items
faster.
We can use the binary search tree for the addition and deletion of items in a tree.
We can also represent data in a ranked order using a binary tree. And in some cases, it
can be used as a chart to represent a collection of information.
Next, we'll look at some techniques used in traversing a binary tree.
Traversing a tree means visiting and outputting the value of each node in a particular order. Here we will use the inorder, preorder, and postorder tree traversal methods.

An important property of tree traversal is that there are multiple meaningful ways of carrying out the traversal, unlike linear data structures such as arrays, bitmaps, and matrices, where traversal is done in a single linear order. Each of these methods of traversing a tree follows a particular order:
• For Inorder, you traverse from the left subtree to the root then to the right
subtree.
• For Preorder, you traverse from the root to the left subtree then to the right
subtree.
• For Post order, you traverse from the left subtree to the right subtree then to the
root.
Here is another way of representing the information above:

Inorder => Left, Root, Right.

Preorder => Root, Left, Right.

Post order => Left, Right, Root.

How to Traverse a Tree Using Inorder Traversal


We are going to work with a tree similar to the one in the last section, again keyed A through G: root A, left subtree B with children D and E, right subtree C with children F and G. Remember that in a binary search tree the keys of the nodes in the left subtree are always smaller than the key of the root node, and the keys in the right subtree are larger.
Here is the diagram we will be working with:

[Diagram: binary search tree with root A, children B and C, leaves D and E under B, F and G under C.]

Recall that the order for inorder traversal is Left, Root, Right.
This is the result we get after using inorder traversal:
D, B, E, A, F, C, G
To verify the result, trace the traversal: fully traverse the left subtree (D, B, E), visit the root (A), then traverse the right subtree (F, C, G).

Preorder Traversal
The order here is Root, Left, Right.
Using the same tree, we have:
A, B, D, E, C, F, G

Postorder Traversal
The order for postorder traversal is Left, Right, Root.
Here is the output:
D, E, B, F, G, C, A
Note that in postorder each node is output only after both of its subtrees, so the root A always comes last.
UNIT-V

9a) Define the Red Black tree algorithm with example 8M


Introduction:
When it comes to searching and sorting data, one of the most fundamental data
structures is the binary search tree. However, the performance of a binary
search tree is highly dependent on its shape, and in the worst case, it can
degenerate into a linear structure with a time complexity of O(n). This is
where Red Black Trees come in, they are a type of balanced binary search tree
that use a specific set of rules to ensure that the tree is always balanced. This
balance guarantees that the time complexity for operations such as insertion,
deletion, and searching is always O(log n), regardless of the initial shape of the
tree.
Red Black Trees are self-balancing, meaning that the tree adjusts itself
automatically after each insertion or deletion operation. It uses a simple but
powerful mechanism to maintain balance, by coloring each node in the tree
either red or black.
Red Black Tree-
A Red-Black tree is a binary search tree in which every node is colored either red or black. It is a type of self-balancing binary search tree, with an efficient O(log n) worst-case running time for its operations.

Properties of Red Black Tree:

The Red-Black tree satisfies all the properties of a binary search tree; in addition, it satisfies the following properties –
1. Root property: The root is black.
2. External property: Every leaf (Leaf is a NULL child of a node) is black in
Red-Black tree.
3. Internal property: The children of a red node are black. Hence the parent of a red node must be a black node.
4. Depth property: All the leaves have the same black depth.
5. Path property: Every simple path from root to descendant leaf node
contains same number of black nodes.
The result of all these above-mentioned properties is that the Red-Black tree
is roughly balanced.
Rules That Every Red-Black Tree Follows:
1. Every node has a color either red or black.
2. The root of the tree is always black.
3. There are no two adjacent red nodes (A red node cannot have a red
parent or red child).
4. Every path from a node (including root) to any of its descendants NULL
nodes has the same number of black nodes.
5. Every leaf (i.e. NULL node) must be colored BLACK.
Why Red-Black Trees?
Most of the BST operations (e.g., search, max, min, insert, delete.. etc) take
O(h) time where h is the height of the BST. The cost of these operations may
become O(n) for a skewed Binary tree. If we make sure that the height of the
tree remains O(log n) after every insertion and deletion, then we can
guarantee an upper bound of O(log n) for all these operations. The height of
a Red-Black tree is always O(log n) where n is the number of nodes in the
tree.

Sr. No. Algorithm Time Complexity

1. Search O(log n)

2. Insert O(log n)

3. Delete O(log n)

“n” is the total number of elements in the red-black tree.


Comparison with AVL Tree:
The AVL trees are more balanced compared to Red-Black Trees, but they
may cause more rotations during insertion and deletion. So if your
application involves frequent insertions and deletions, then Red-Black trees
should be preferred. And if the insertions and deletions are less frequent and
search is a more frequent operation, then AVL tree should be preferred over
the Red-Black Tree.
How does a Red-Black Tree ensure balance?
A simple example to understand balancing: a chain of 3 nodes is not possible in a Red-Black tree. Trying every combination of colors for such a chain shows that each one violates some Red-Black tree property.
Interesting points about Red-Black Tree:
1. The black height of the red-black tree is the number of black nodes on a
path from the root node to a leaf node. Leaf nodes are also counted as
black nodes. So, a red-black tree of height h has black height >= h/2.
2. Height of a red-black tree with n nodes is h<= 2 log2(n + 1).
3. All leaves (NIL) are black.
4. The black depth of a node is defined as the number of black nodes from
the root to that node i.e the number of black ancestors.
5. Every red-black tree is a special case of a binary tree.
Black Height of a Red-Black Tree :
Black height is the number of black nodes on a path from the root to a leaf.
Leaf nodes are also counted black nodes. From the above properties 3 and
4, we can derive, a Red-Black Tree of height h has black-height >= h/2.
The number of nodes on the path from a node to its farthest descendant leaf is no more than twice the number of nodes on the path to its nearest descendant leaf.
Every Red-Black Tree with n nodes has height <= 2 log2(n+1).
This can be proved using the following facts:
1. For a general Binary Tree, let k be the minimum number of nodes on all
root to NULL paths, then n >= 2^k - 1 (Ex. If k is 3, then n is at least 7).
This expression can also be written as k <= log2(n+1).
2. From property 4 of Red-Black trees and above claim, we can say in a
Red-Black Tree with n nodes, there is a root to leaf path with at-most
Log2(n+1) black nodes.
3. From properties 3 and 5 of Red-Black trees, we can claim that the
number of black nodes in a Red-Black tree is at least ⌊ n/2 ⌋ where n is
the total number of nodes.
From the above points, we can conclude the fact that Red Black Tree
with n nodes has a height <= 2Log2(n+1)
Search Operation in Red-black Tree:
As every red-black tree is a special case of a binary tree so the searching
algorithm of a red-black tree is similar to that of a binary tree.
Algorithm:
searchElement (tree, val)
Step 1:
If tree = NULL OR tree -> data = val
    Return tree
Else
    If val < tree -> data
        Return searchElement (tree -> left, val)
    Else
        Return searchElement (tree -> right, val)
    [ End of if ]
[ End of if ]

Step 2: END
Example: Searching for 11 in a red-black tree.

Solution:
1. Start from the root.
2. Compare the element being searched for with the root; if it is less than the root, recurse into the left subtree, else recurse into the right subtree.
3. If the element is found anywhere, return true, else return false.


To summarize: the coloring rules ensure that the tree stays roughly balanced, and searching follows the ordinary binary search tree algorithm. The hard part is maintaining balance when keys are added and removed; insertion and deletion restore the red-black properties through recoloring and rotations.
Exercise:
1) Is it possible to have all black nodes in a Red-Black tree?
2) Draw a Red-Black Tree that is not an AVL tree structure-wise?
Applications:
1. Most of the self-balancing BST library functions like map, multiset, and
multimap in C++ ( or java packages like java.util.TreeMap and
java.util.TreeSet ) use Red-Black Trees.
2. It is used to implement CPU scheduling in Linux: the Completely Fair
Scheduler uses it.
3. It is also used in the k-means clustering algorithm in machine learning to
reduce time complexity.
4. Moreover, MySQL also uses the Red-Black tree for indexes on tables in
order to reduce the searching and insertion time.
5. Red Black Trees are used in the implementation of the virtual memory
manager in some operating systems, to keep track of memory pages and
their usage.
6. Many programming languages such as Java, C++, and Python have
implemented Red Black Trees as a built-in data structure for efficient
searching and sorting of data.
7. Red Black Trees are used in the implementation of graph algorithms such
as Dijkstra’s shortest path algorithm and Prim’s minimum spanning tree
algorithm.
8. Red Black Trees are used in the implementation of game engines.
Advantages:
1. Red Black Trees have a guaranteed time complexity of O(log n) for basic
operations like insertion, deletion, and searching.
2. Red Black Trees are self-balancing.
3. Red Black Trees can be used in a wide range of applications due to their
efficient performance and versatility.
4. The mechanism used to maintain balance in Red Black Trees is relatively
simple and easy to understand.
Disadvantages:
1. Red Black Trees require one extra bit of storage for each node to store
the color of the node (red or black).
2. They are comparatively complex to implement.
3. Although Red Black Trees provide efficient performance for basic
operations, they may not be the best choice for certain types of data or
specific use cases.
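As a hedged sketch of how the one extra bit of color information per node is typically represented, here is a minimal node definition in Python (field names are illustrative, not a complete implementation):

    RED, BLACK = True, False

    class RBNode:
        def __init__(self, key):
            self.key = key
            self.color = RED   # new nodes are inserted red, then fixed up
            self.left = None   # None plays the role of the black NIL leaf
            self.right = None
            self.parent = None

    def is_red(node):
        # NIL leaves count as black, by the external property.
        return node is not None and node.color == RED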
9b) Describe Splay Trees algorithm with example 7M
Splay tree is a self-adjusting binary search tree data structure, which means
that the tree structure is adjusted dynamically based on the accessed or
inserted elements. In other words, the tree automatically reorganizes itself so
that frequently accessed or inserted elements become closer to the root
node.
1. The splay tree was first introduced by Daniel Dominic Sleator and Robert
Endre Tarjan in 1985. It has a simple and efficient implementation that
allows it to perform search, insertion, and deletion operations in O(log n)
amortized time complexity, where n is the number of elements in the tree.
2. The basic idea behind splay trees is to bring the most recently accessed
or inserted element to the root of the tree by performing a sequence of
tree rotations, called splaying. Splaying is a process of restructuring the
tree by making the most recently accessed or inserted element the new
root and gradually moving the remaining nodes closer to the root.
3. Splay trees are highly efficient in practice due to their self-adjusting
nature, which reduces the overall access time for frequently accessed
elements. This makes them a good choice for applications that require
fast and dynamic data structures, such as caching systems, data
compression, and network routing algorithms.
4. However, the main disadvantage of splay trees is that they do not
guarantee a balanced tree structure, which may lead to performance
degradation in worst-case scenarios. Also, splay trees are not suitable for
applications that require guaranteed worst-case performance, such as
real-time systems or safety-critical systems.
Overall, splay trees are a powerful and versatile data structure that offers
fast and efficient access to frequently accessed or inserted elements. They
are widely used in various applications and provide an excellent tradeoff
between performance and simplicity.
A splay tree is a self-balancing binary search tree, designed for efficient
access to data elements based on their key values.
 The key feature of a splay tree is that each time an element is accessed,
it is moved to the root of the tree, creating a more balanced structure for
subsequent accesses.
 Splay trees are characterized by their use of rotations, which are local
transformations of the tree that change its shape but preserve the order of
the elements.
 Rotations are used to bring the accessed element to the root of the tree,
and also to rebalance the tree if it becomes unbalanced after multiple
accesses.
Operations in a splay tree:
 Insertion: To insert a new element into the tree, start by performing a
regular binary search tree insertion. Then, apply rotations to bring the
newly inserted element to the root of the tree.
 Deletion: To delete an element from the tree, first locate it using a binary
search tree search. Then, if the element has no children, simply remove
it. If it has one child, promote that child to its position in the tree. If it has
two children, find the successor of the element (the smallest element in its
right subtree), swap its key with the element to be deleted, and delete the
successor instead.
 Search: To search for an element in the tree, start by performing a binary
search tree search. If the element is found, apply rotations to bring it to
the root of the tree. If it is not found, apply rotations to the last node
visited in the search, which becomes the new root.
 Rotation: The rotations used in a splay tree are the zig (single) rotation and the zig-zig / zig-zag (double) rotations. A zig rotation is used when the accessed node is a child of the root, while the double rotations move the node up two levels at a time, rebalancing the tree along the access path.
Here's a step-by-step explanation of the rotation operations:
 Zig Rotation: used when the node is a child of the root. If the node is a left child, perform a right rotation; if it is a right child, perform a left rotation.
 Zig-Zig Rotation: used when the node and its parent are both left children (or both right children). Rotate at the grandparent first and then at the parent, moving the node up two levels.
 Zig-Zag Rotation: used when the node is a right child of a left child, or a left child of a right child. Perform a double rotation, e.g. a left rotation at the parent followed by a right rotation at the grandparent.
 Note: The specific implementation details, including the exact rotations
used, may vary depending on the exact form of the splay tree.
Rotations in Splay Tree
 Zig Rotation
 Zag Rotation
 Zig – Zig Rotation
 Zag – Zag Rotation
 Zig – Zag Rotation
 Zag – Zig Rotation
1) Zig Rotation:
The Zig Rotation in splay trees operates in a manner similar to the single
right rotation in AVL Tree rotations. This rotation results in nodes moving one
position to the right from their current location. For example, consider the
following scenario:

Zig Rotation (Single Rotation)

2) Zag Rotation:

The Zag Rotation in splay trees operates in a similar fashion to the single left
rotation in AVL Tree rotations. During this rotation, nodes shift one position
to the left from their current location. For instance, consider the following
illustration:

Zag Rotation (Single left Rotation)

3) Zig-Zig Rotation:

The Zig-Zig Rotation in splay trees is a double zig rotation. This rotation
results in nodes shifting two positions to the right from their current location.
Take a look at the following example for a better understanding:

Zig-Zig Rotation (Double Right Rotation)

4) Zag-Zag Rotation:
In splay trees, the Zag-Zag Rotation is a double zag rotation. This rotation
causes nodes to move two positions to the left from their present position.
For example:

Zag-Zag Rotation (Double left rotation)


5) Zig-Zag Rotation:

The Zig-Zag Rotation in splay trees is a combination of a zig rotation


followed by a zag rotation. As a result of this rotation, nodes shift one
position to the right and then one position to the left from their current
location. The following illustration provides a visual representation of this
concept:

Zig- Zag rotation

6) Zag-Zig Rotation:

The Zag-Zig Rotation in splay trees is a series of zag rotations followed by a


zig rotation. This results in nodes moving one position to the left, followed by
a shift one position to the right from their current location. The following
illustration offers a visual representation of this concept:

Zag-Zig Rotation
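A minimal Python sketch of the two primitive rotations from which all six splay cases are built; parent-pointer bookkeeping is omitted for brevity, so this is an illustration rather than a full splay implementation:

    class Node:
        def __init__(self, key, left=None, right=None):
            self.key, self.left, self.right = key, left, right

    def rotate_right(y):
        # Zig step: the left child x moves up, y becomes its right child.
        x = y.left
        y.left = x.right
        x.right = y
        return x  # x is the new subtree root

    def rotate_left(x):
        # Zag step: the right child y moves up, x becomes its left child.
        y = x.right
        x.right = y.left
        y.left = x
        return y  # y is the new subtree root

    # A zig-zig (double right) rotation is rotate_right applied twice:
    # first at the grandparent, then at the new subtree root (the parent).
    g = Node(30, Node(20, Node(10)))   # 30 -> 20 -> 10, a left-left chain
    g = rotate_right(rotate_right(g))
    print(g.key)  # 10 is now the root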
10a) Define B-tree and Height of B-Tree. Explain insertion
algorithm with example 7M

A B Tree is a specialized m-way tree widely used for disk access. A B-Tree of order m can have at most m-1 keys and m children. One of the main reasons for using a B tree is its capability to store a large number of keys in a single node and large key values while keeping the height of the tree relatively small.

A B tree of order m has all the properties of an m-way tree. In addition, it has the following properties:

1. Every node in a B-Tree contains at most m children.
2. Every node in a B-Tree, except the root node and the leaf nodes, contains at least ⌈m/2⌉ children.
3. The root node must have at least 2 children, unless it is a leaf.
4. All leaf nodes must be at the same level.

It is not necessary that all the nodes contain the same number of children, but each internal node must have at least ⌈m/2⌉ children.

[Diagram: a B tree of order 4.]

While performing operations on a B Tree, a property of the B Tree may be violated, such as the minimum number of children a node must have. To maintain the properties of the B Tree, nodes may be split or joined.

Operations
Searching :
Searching in B Trees is similar to searching in a binary search tree. For example, suppose we search for the item 49 in the following B Tree. The process goes as follows:

1. Compare item 49 with the root node 78. Since 49 < 78, move to its left sub-tree.
2. Since 40 < 49 < 56, traverse the right sub-tree of 40.
3. Since 49 > 45, move to the right and compare with 49.
4. Match found; return.

Searching in a B tree depends upon the height of the tree. The search algorithm takes
O(log n) time to search any element in a B tree.
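A hedged Python sketch of B-tree search, assuming a simple node layout with a sorted keys list and a children list (empty for leaves); it mirrors the O(log n) descent described above:

    from bisect import bisect_left

    class BTreeNode:
        def __init__(self, keys, children=None):
            self.keys = keys                 # sorted list of keys
            self.children = children or []   # empty list for leaf nodes

    def btree_search(node, key):
        i = bisect_left(node.keys, key)      # first position with keys[i] >= key
        if i < len(node.keys) and node.keys[i] == key:
            return node                      # key found in this node
        if not node.children:
            return None                      # reached a leaf without finding it
        return btree_search(node.children[i], key)

    # Tiny illustrative tree: root [40], leaves [10, 20] and [49, 56]
    root = BTreeNode([40], [BTreeNode([10, 20]), BTreeNode([49, 56])])
    print(btree_search(root, 49) is not None)  # True
    print(btree_search(root, 7))               # None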

Inserting
Insertions are done at the leaf node level. The following algorithm needs to be followed
in order to insert an item into B Tree.

1. Traverse the B Tree to find the appropriate leaf node at which the new element can be inserted.
2. If the leaf node contains fewer than m-1 keys, insert the element in increasing order of keys.
3. Else, if the leaf node already contains m-1 keys, then follow these steps:
o Insert the new element in increasing order of keys.
o Split the node into two nodes at the median.
o Push the median element up to its parent node.
o If the parent node also contains m-1 keys, split it too by following the same steps.

Example:

Insert the key 8 into a B Tree of order 5 (shown in the accompanying image).

8 will be inserted to the right of 5; therefore insert 8 into that leaf.

The leaf now contains 5 keys, which exceeds the maximum of (5 - 1 = 4) keys. Therefore split the node at the median, i.e. 8, and push the median up to its parent node, as shown in the accompanying image.
10b) Explain K-D trees in detail with example 8M

A K-D tree, short for k-dimensional tree, is a data structure used for organizing
points in a k-dimensional space. It is particularly useful for efficient
multidimensional search operations, such as nearest neighbor search. The K-D
tree is a binary tree where each node represents an axis-aligned hyperrectangle (a
region in the k-dimensional space), and the points are associated with the leaves
of the tree.

Here's a basic explanation of how a K-D tree works:

Building the Tree:

The construction of a K-D tree starts with selecting a dimension along which to
split the data. The dimension is chosen based on various criteria, often
alternating between dimensions at each level of the tree.
Once a dimension is chosen, the data is sorted along that dimension, and the
median point is selected as the splitting point.
The selected dimension and splitting point define a hyperplane that divides the
data into two subsets. The points on one side of the hyperplane go to the left
subtree, and the points on the other side go to the right subtree.
This process is recursively applied to each subset until each leaf node of the tree
corresponds to a single point.
Searching in the Tree:

To perform a search in a K-D tree, start at the root and move down the tree,
making decisions at each level based on the comparison between the search point
and the splitting value in the current dimension.
At each level, you traverse either the left or right subtree based on the
comparison. This process continues until you reach a leaf node.
Once at a leaf node, you consider the point associated with that leaf as a
candidate and backtrack to check other branches if there might be closer points.
Nearest Neighbor Search:

The efficiency of K-D trees becomes evident in nearest neighbor searches.


During a nearest neighbor search, you traverse the tree to find the leaf node that
would contain the target point.
After reaching the leaf, you backtrack, checking other nodes if there might be
points closer to the target point, updating the closest point found so far.
This process continues until you reach the root of the tree, and you have
determined the nearest neighbor.
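A compact Python sketch of building a K-D tree and answering a nearest neighbor query, following the median-split and backtracking steps described above (2-D points and squared Euclidean distance; illustrative rather than optimized):

    class KDNode:
        def __init__(self, point, axis, left=None, right=None):
            self.point, self.axis = point, axis
            self.left, self.right = left, right

    def build_kdtree(points, depth=0):
        if not points:
            return None
        axis = depth % len(points[0])        # cycle through the dimensions
        points = sorted(points, key=lambda p: p[axis])
        mid = len(points) // 2               # median point becomes the root
        return KDNode(points[mid], axis,
                      build_kdtree(points[:mid], depth + 1),
                      build_kdtree(points[mid + 1:], depth + 1))

    def nearest(node, target, best=None):
        if node is None:
            return best
        dist = sum((a - b) ** 2 for a, b in zip(node.point, target))
        if best is None or dist < best[1]:
            best = (node.point, dist)        # update the closest point so far
        diff = target[node.axis] - node.point[node.axis]
        near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
        best = nearest(near, target, best)   # descend the near side first
        if diff ** 2 < best[1]:              # the far side may hide a closer point
            best = nearest(far, target, best)
        return best

    tree = build_kdtree([(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)])
    print(nearest(tree, (9, 2))[0])  # (8, 1)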
K-D trees are widely used in various applications, including computer graphics,
pattern recognition, and database systems, where efficient multidimensional
search operations are crucial. They provide a balance between storage efficiency
and search speed, making them a valuable data structure in many fields.
