Unit 4 PPT Merged

The document provides an overview of tree data structures, including their definitions, properties, and terminologies such as root, edge, parent, child, and traversal methods. It discusses various types of trees, including binary trees and their specific characteristics, as well as tree traversal techniques like pre-order, in-order, and post-order. Additionally, it covers representations of trees, including linked list representation and left child-right sibling representation.

21CSC201J

DATA STRUCTURES AND


ALGORITHMS

UNIT-4
Topic : Trees
Classification of Data Structures
Introduction to Trees
• The study of trees in mathematics can be traced to Gustav Kirchhoff in the
middle nineteenth century and several years later to Arthur Cayley, who
used trees to study the structure of algebraic formulas.
• Cayley’s work undoubtedly laid the framework for Grace Hopper’s use of
trees in 1951 to represent arithmetic expressions.
• Hopper’s work bears a strong resemblance to today’s binary tree formats.
• Trees are used extensively in computer science: to represent algebraic
formulas; as an efficient method for searching large, dynamic lists; and
for such diverse applications as artificial intelligence systems and
encoding algorithms.
General Trees
• Linear access time of linked lists is too high for large collections.
• Solution – group data into trees.
• Trees – non-linear data structure
• Used to represent data that contains a hierarchical relationship among
elements, example: family trees, records
• Worst-case access time for a balanced tree – O(log n)
• A tree can be defined in many ways, e.g. recursively
General Trees
• A tree is a non-linear data structure.
• Tree is a non-linear data structure which organizes data in
hierarchical structure and this is a recursive definition.
• A tree is a collection of nodes connected by directed (or
undirected) edges.
• A tree can be empty with no nodes or a tree is a structure
consisting of one node called the root and zero or one or more
subtrees.
What is tree data structure ?
• The tree is a non-linear data structure that consists of nodes connected by
edges.
• A tree data structure is a hierarchical structure that is used to represent and
organize data in a way that is easy to navigate and search.
• It is a collection of nodes that are connected by edges and has a hierarchical
relationship between the nodes.

• The topmost node of the tree is called the root,
and the nodes below it are called the child nodes.
Each node can have multiple child nodes, and these
child nodes can also have their own child nodes,
forming a recursive structure.
Basic Terminologies In Tree Data Structure:
Tree Terminology
Root, Edge, Parent, Child, Sibling, Degree, Internal node, Leaf node,
Level, Height, Depth, Subtree, Forest, Path, Ancestor, Descendant
Tree Terminology - Root
• First node is called as Root Node.
• Every tree must have a root node.
• The root node is the origin of the tree data structure.
• In any tree, there must be only one root node. We never have
multiple root nodes in a tree.
Tree Terminology - Root
• The starting node of a tree is the root node
• Only one root node
Tree Terminology - Edge
• The connecting link between any two nodes is called an EDGE.
• In a tree with 'N' number of nodes there will be a maximum of 'N-1'
number of edges.
Tree Terminology - Edge
• Nodes are connected using links called edges
• A tree with n nodes has exactly (n-1) edges
Tree Terminology - Parent
• A predecessor of any node is called a PARENT NODE.
• The node which has a branch from it to any other node is called a parent
node.
• Parent node can also be defined as "The node which has child / children".
Tree Terminology - Parent
• A node that has children or branches connecting to other
nodes
• A parent node has one or more children
Tree Terminology - Child
• A node that is a descendant of some node is a child
• All nodes except the root node are child nodes
• Any parent node can have any number of child nodes.
Tree Terminology - Child
Tree Terminology - Siblings
• Nodes with the same parent are siblings
Tree Terminology - Siblings
Tree Terminology - Degree
• Degree of node – number of children per node
• Degree of tree – highest degree of a node among all nodes in tree

Degree A – 2
Degree B – 3
Degree C – 0
Degree D – 0
Degree E – 0
Degree F – 0
Degree of entire tree - 3
Tree Terminology - Degree
• Total number of children of a node is called as DEGREE of that Node
Tree Terminology – Internal Node
• Node with at least one child – internal node
• Also known as non – terminal nodes
Tree Terminology – Internal Node
Tree Terminology – Leaf Node
• Node with no child – Leaf node
• Also known as External nodes or Terminal nodes
Tree Terminology – Leaf Node
• The node which does not have a child is called a LEAF node
Tree Terminology – Level
• Each step of the tree is a level number
• Starting with the root as 0
• the root node is said to be at Level 0 and the children of root node
are at Level 1
Tree Terminology – Height
• Number of edges in longest path from the node to any leaf
• Height of any leaf node is 0
• Height of Tree = Height of Root node

Height A – 3
Height B – 2
Height D – 1
Height C,G,E,F – 0
Height of tree - 3
Tree Terminology – Height
• Total number of edges from a leaf node to a particular node in the
longest path
Tree Terminology – Depth
• Number of edges from root node to particular node is Depth
• Depth of root node is 0
• Depth of Tree = Depth of longest path from root to leaf

Depth A – 0
Depth B ,C – 1
Depth D, E, F – 2
Depth G – 3
Depth of tree - 3
Tree Terminology – Depth
• Total number of edges from the root node to a particular node
Tree Terminology – Subtree
• Each child of a tree forms a subtree recursively
Tree Terminology – Forest
• Set of disjoint trees
Tree Terminology - Path
• The sequence of Nodes and Edges from one node to another
node is called as PATH between that two Nodes.
• The length of a path is the total number of edges in that path.
Tree Terminology - Path
• A path from node a1 to ak is defined as a sequence of nodes a1, a2, . . .
, ak such that ai is the parent of ai+1 for 1 ≤ i < k

Path A-G: A,B,D,G


Tree Terminology – length of Path
• Number of edges in the path

Path A-G: A – B – D – G

Length of path A-G : 3


Tree Terminology – Ancestor & Descendant
• If there is a path from A to B, then A is an ancestor of B and B is a
descendant of A

Path A-G: A – B – D – G

A – Ancestor for G, B, D…

G – Descendant of D,B,A
Tree Representations
A tree data structure can be represented in two methods.
• List Representation
• Left Child - Right Sibling Representation
Tree Representations
List Representation
1. Two types of nodes one for representing the node with data
called 'data node' and another for representing only references
called 'reference node'.
2. Start with a 'data node' for the root node of the tree. It
is linked to an internal node through a 'reference node' which is
further linked to any other node directly.
Tree Representations
• List Representation
Tree Representations
Left Child - Right Sibling Representation
• A list with one type of node which consists of three fields namely
Data field, Left child reference field and Right sibling reference field.
• Data field stores the actual value of a node
• left reference field -> address of the left child
• right reference field -> address of the right sibling node.
Tree Representations
Left Child - Right Sibling Representation
• In this representation, every node's data field stores the actual value
of that node.
• If that node has left a child, then left reference field stores the
address of that left child node otherwise stores NULL.
• If that node has the right sibling, then right reference field stores the
address of right sibling node otherwise stores NULL.
Tree Representations
Left Child - Right Sibling Representation
Tree Representations
Left Child - Right Sibling Representation
Binary Tree
• A tree in which every node can have a maximum of two children is
called Binary Tree.
• In a binary tree, every node can have either 0 children or 1 child or 2
children but not more than 2 children.
Binary Tree
• A binary tree is a data structure specified as a set of node elements.
• The topmost element in a binary tree is called the root node, and each
node has 0, 1, or at most 2 kids.
Binary Tree
Properties of Binary Trees
• At each level n, the maximum number of nodes is 2^n
• Maximum number of nodes possible in a tree of height h is 2^(h+1) - 1
• The minimum number of nodes possible in a tree of height h is h+1
Strict Binary Tree
• Each node must contain either 0 or 2 children.
• Equivalently, a tree in which every node except the leaf
nodes has exactly 2 children.
Strict Binary Tree / full Binary tree
• A full Binary tree is a special type of binary tree in which every parent
node/internal node has either two or no children.
Complete Binary Tree
• Except for the last level, all levels must be completely filled
• All nodes filled from left to right
Complete Binary Tree
• The complete binary tree is a tree in which all the nodes are completely
filled except the last level.
• In the last level, all the nodes must be as left as possible. The nodes
should be added from the left.
• All nodes filled from left to right
Representation of Binary Tree
• Each node has three portions – a data portion, a left child pointer,
and a right child pointer
Linked List Representation of Tree
struct node
{
struct node *left;
int value;
struct node *right;
};
Linked List Representation of Tree
struct node
{
    int data;
    struct node *left, *right;
};

struct node *create();

void main()
{
    struct node *root;
    root = create();
}

struct node *create()
{
    int data, choice;
    struct node *temp;
    printf("Press 0 to exit");
    printf("\nPress 1 for new node");
    printf("\nEnter your choice : ");
    scanf("%d", &choice);
    if(choice == 0)
    {
        return NULL;
    }
    else
    {
        temp = (struct node *)malloc(sizeof(struct node));
        printf("Enter the data:");
        scanf("%d", &data);
        temp->data = data;
        printf("Enter the left child of %d", data);
        temp->left = create();
        printf("Enter the right child of %d", data);
        temp->right = create();
        return temp;
    }
}
21CSC201J
DATA STRUCTURES AND
ALGORITHMS

UNIT-4
Topic : Tree Traversal
Tree Traversal
• Traversal – visiting each node exactly once
• Based on the order of visiting :
• In – order traversal
• Pre – order traversal
• Post – order traversal
Tree traversal
• The term 'tree traversal' means traversing or visiting each node of a
tree. Traversing can be performed in three ways
Pre-order traversal
In-order traversal
Post-order traversal
Pre-order traversal
Algorithm
Step 1 - Visit the root node
Step 2 - Traverse the left subtree recursively.
Step 3 - Traverse the right subtree recursively.
The output of the preorder traversal of the above tree is -
A→B→D→E→C→F→G
Tree TRAVERSAL

In-order traversal
• First the left subtree is visited, then the root node is traversed, and
finally, the right subtree is traversed.
Algorithm
Step 1 - Traverse the left subtree recursively.
Step 2 - Visit the root node.
Step 3 - Traverse the right subtree recursively.

The output of the inorder traversal of the above tree is -


D → B → E →A→ F→ C →G
Tree TRAVERSAL
Post-order traversal
First the left subtree of the root node is traversed, then the right
subtree is traversed recursively, and finally, the root node is visited.
Algorithm
Step 1 - Traverse the left subtree recursively.
Step 2 - Traverse the right subtree recursively.
Step 3 - Visit the root node.

The output of the postorder traversal of the above tree is -


D→E→B→F→G→C→A
Tree Traversal In detail
In-order Traversal
• Traverse left subtree, visit root node, traverse right subtree

Traverse left node – B
  No left child for B subtree
  Visit root node B
  No right child for B subtree
Visit root node A
Traverse right node – C
  No left child for C subtree
  Visit root node C
  No right child for C subtree

Result: B – A – C
In-order Traversal
Algorithm:
Inorder(Tree)
1. Repeat step 2 – 4 while Tree != Null
2. Inorder(Tree->left)
3. Write(Tree->Data) // root
4. Inorder(Tree->right)
5. End
Result : G, D, H, L, B, E, A, C, I, F, K, J
In-order Traversal

struct node
{
    int key;
    struct node *left;
    struct node *right;
};

struct node *insertNode(struct node *root, int val)
{
    if(root == NULL)
    {
        struct node *newNode;
        newNode = malloc(sizeof(struct node));
        newNode->key = val;
        newNode->left = NULL;
        newNode->right = NULL;
        return newNode;
    }
    if(root->key < val)
        root->right = insertNode(root->right, val);
    if(root->key > val)
        root->left = insertNode(root->left, val);
    return root;
}

void inorder(struct node *root)
{
    if(root == NULL)
        return;
    //traverse the left subtree
    inorder(root->left);
    //visit the root
    printf("%d ", root->key);
    //traverse the right subtree
    inorder(root->right);
}
Pre – order Traversal
• Visit root node, traverse left subtree, traverse right subtree

Visit root node A
Traverse left node – B
  Visit root node B
  No left child for B subtree
  No right child for B subtree
Traverse right node – C
  Visit root node C
  No left child for C subtree
  No right child for C subtree

Result: A – B – C
Pre-order Traversal
Algorithm: Preorder(Tree)
1. Repeat step 2 – 4 while Tree != Null
2. Write(Tree->Data)
3. Preorder(Tree->left)
4. Preorder(Tree->right)
5. End
Result: A, B, D, G, H, L, E, C, F, I, J, K
Post– order Traversal
• Traverse left subtree, traverse right subtree, visit root node

Traverse left node – B
  No left child for B subtree
  No right child for B subtree
  Visit root node B
Traverse right node – C
  No left child for C subtree
  No right child for C subtree
  Visit root node C
Visit root node A

Result: B – C – A
Post-order Traversal
Algorithm: Postorder(Tree)
1. Repeat step 2 – 4 while Tree != Null
2. Postorder(Tree->left)
3. Postorder(Tree->right)
4. Write(Tree->Data)
5. End
Result : G, L, H, D, E, B, I, K, J, F, C, A
Tree Traversal

#include <stdio.h>
#include <stdlib.h>

struct Node{
    int data;
    struct Node *left;
    struct Node *right;
};
struct Node *root = NULL;
int count = 0;

struct Node* insert(struct Node*, int);
void display(struct Node*);

void main(){
    int choice, value;
    printf("\n----- Binary Tree -----\n");
    while(1){
        printf("1. Insert\n2. Display\n3. Exit");
        printf("\nEnter your choice: ");
        scanf("%d", &choice);
        switch(choice)
        {
            case 1: printf("\nEnter the value to be inserted: ");
                    scanf("%d", &value);
                    root = insert(root, value); break;
            case 2: display(root); break;
            case 3: exit(0);
            default: printf("\nPlease select a correct operation!!!\n");
        }
    }
}

struct Node* insert(struct Node *root, int value)
{
    if(root == NULL)
    {
        struct Node *newNode;
        newNode = (struct Node*)malloc(sizeof(struct Node));
        newNode->data = value;
        newNode->left = newNode->right = NULL;
        root = newNode;
        count++;
    }
    else
    {
        if(count % 2 != 0)
            root->left = insert(root->left, value);
        else
            root->right = insert(root->right, value);
    }
    return root;
}

void display(struct Node *root)
{
    if(root != NULL)
    {
        display(root->left);
        printf("%d\t", root->data);
        display(root->right);
    }
}
21CSC201J
DATA STRUCTURES AND
ALGORITHMS

UNIT-4
Topic : BINARY SEARCH TREE
BINARY SEARCH TREE
• Binary Search Tree is a binary tree in which every node contains only smaller
values in its left subtree and only larger values in its right subtree.
• Also called an ORDERED binary tree
BST– properties:
• It should be Binary tree.
• Left subtree < Root Node <= Right subtree
(or)
Left subtree <= Root Node < Right subtree
BINARY SEARCH TREE
Binary search trees or not ?
• Operations: Searching, Insertion, Deletion of a Node
• Time Complexity:

  Operation   Best case   Worst case
  Search      O(log n)    O(n)
  Insertion   O(log n)    O(n)
  Deletion    O(log n)    O(n)

(Note: in the worst case the binary search tree is skewed and its height
becomes n)

(a) Left skewed, and (b) right skewed binary search trees
Binary search tree Construction
• Create a binary search tree using the following data elements: 45, 39,
56, 12, 34, 78, 32, 10, 89, 54, 67, 81
SEARCHING A NODE IN BST
SearchElement (TREE, VAL)

Step 1: IF TREE-> DATA = VAL OR TREE = NULL


Return TREE
ELSE
IF VAL < TREE-> DATA
Return searchElement(TREE-> LEFT, VAL)
ELSE
Return searchElement(TREE-> RIGHT, VAL)
[END OF IF]
[END OF IF]
Step 2: END
EXAMPLE 1:
Searching a node with value 12 in the given binary search tree

We start our search from the root node 45.


As 12 < 45, so we search in 45’s LEFT subtree.
As 12 < 39, so we search in 39’s LEFT subtree.
The left child of 39 is 12 – a match.
So, we conclude that 12 is present in the above BST.
EXAMPLE 2:
Searching a node with value 52 in the given binary search tree

We start our search from the root node 45.


As 52 > 45, so we search in 45’s RIGHT subtree.
As 52 < 56 so we search in 56’s LEFT subtree.
As 52 < 54 so we search in 54’s LEFT subtree.
But 54 is a leaf node, so the search cannot continue.
So, we conclude that 52 is not present in the above BST.
INSERTING A NODE IN BST
Insert (TREE, VAL)
Step 1: IF TREE = NULL
Allocate memory for TREE
SET TREE-> DATA = VAL
SET TREE-> LEFT = TREE-> RIGHT = NULL
ELSE
IF VAL < TREE-> DATA
Insert(TREE-> LEFT, VAL)
ELSE
Insert(TREE-> RIGHT, VAL)
[END OF IF]
[END OF IF]

Step 2: END
EXAMPLE : Inserting nodes with values 55 in the given binary search tree

We start searching for value 55 from the root node 45.


As 55 > 45, so we search in 45’s RIGHT subtree.
As 55 < 56 so we search in 56’s LEFT subtree.
As 55 > 54, so we add 55 as 54’s right child.
Deletion Operation in BST
• Case 1: Deleting a Leaf node (A node with no children)
• Case 2: Deleting a node with one child
• Case 3: Deleting a node with two children
Case 1: Deleting a Leaf node (A node with no children)
EXAMPLE : Deleting node 78 from the given binary search tree
Case 2: Deleting a node with one child
EXAMPLE : Deleting node 54 from the given binary search tree
Case 3: Deleting a node with two children
EXAMPLE 1 : Deleting node 56 from the given binary search tree

 Visit the left subtree of the node being deleted.
 Grab the greatest-value element, called the in-order predecessor.
 Replace the deleted element with its in-order predecessor.
Case 3: Deleting a node with two children
EXAMPLE 2 : Deleting node 15 from the given binary search tree

 Visit the right subtree of the node being deleted.
 Pluck the least-value element, called the in-order successor.
 Replace the deleted element with its in-order successor.
Deletion Operation in BST
Delete (TREE, VAL)
Step 1: IF TREE = NULL
            Write "VAL not found in the tree"
        ELSE IF VAL < TREE->DATA
            Delete(TREE->LEFT, VAL)
        ELSE IF VAL > TREE->DATA
            Delete(TREE->RIGHT, VAL)
        ELSE IF TREE->LEFT AND TREE->RIGHT
            SET TEMP = findLargestNode(TREE->LEFT)   (INORDER PREDECESSOR)
            SET TREE->DATA = TEMP->DATA
            Delete(TREE->LEFT, TEMP->DATA)
            (OR)
            SET TEMP = findSmallestNode(TREE->RIGHT)   (INORDER SUCCESSOR)
            SET TREE->DATA = TEMP->DATA
            Delete(TREE->RIGHT, TEMP->DATA)
        ELSE
            SET TEMP = TREE
            IF TREE->LEFT = NULL AND TREE->RIGHT = NULL
                SET TREE = NULL
            ELSE IF TREE->LEFT != NULL
                SET TREE = TREE->LEFT
            ELSE
                SET TREE = TREE->RIGHT
            [END OF IF]
            FREE TEMP
        [END OF IF]
Step 2: END
21CSC201J
DATA STRUCTURES AND
ALGORITHMS

UNIT-4
Topic : AVL TREE
AVL TREE
• An AVL tree is a balanced binary search tree.
• In AVL Tree balance factor of every node is either -1, 0 or +1.

• Balance factor = height Of Left Subtree – height Of Right Subtree


AVL TREE
• Named after Adelson-Velskii and Landis as AVL tree
• Also called as self-balancing binary search tree
AVL tree – properties:
• It should be Binary search tree
• Balancing factor: balance of every node is either -1 or 0 or 1
where balance(node) = height(node.left subtree) – height(node.right subtree)
• Maximum possible number of nodes in an AVL tree of height H
= 2^(H+1) – 1
• Operations: Searching, Insertion, Deletion of a Node
• TimeComplexity : O(log n)
Balancing Factor
Balance factor = heightOfLeftSubtree – heightOfRightSubtree
Example 1 : Check - AVL Tree?
Example 2: Check - AVL Tree?
Tree: 6 is the root; its children are 4 and 8; 4's children are 1 and 5;
8's children are 7 and 11; 1 has a right child 2.
operations on AVL tree

1.SEARCHING
2.INSERTION
3.DELETION
Search Operation in AVL Tree
ALGORITHM:
STEPS:
1 : Get the search element
2 : check search element == root node in the tree.
3 : If both match exactly, then display "element found" and end.
4 : If they do not match, then check whether the search element is less than or greater than that node's value.
5 : If the search element is smaller, continue the search in the left sub tree.
6 : If the search element is larger, continue the search in the right sub tree.
7 : Repeat steps 1 to 6 until the element is found.
8 : If the search element is still not found after reaching a leaf node, display "element not found".
INSERTION or DELETION
• After performing any operation on AVL tree - the balance factor of each node is to be checked.
• After insertion or deletion there exists either any one of the following:

Scenario 1:
• After insertion or deletion , the balance factor of each node is either 0 or 1 or -1.
• If so AVL tree is considered to be balanced.
• The operation ends.
Scenario 2:
• After insertion or deletion, the balance factor is not 0 or 1 or -1 for at least one node then
• The AVL tree is considered to be imbalanced.
• If so, Rotations are need to be performed to balance the tree in order to make it as AVL TREE.
Rotation
• After insertion/deletion, the balance factor of every node in the tree should be 0, 1 or -1.
• Otherwise we must make the tree balanced.
• Whenever the tree becomes imbalanced due to any operation we
do rotation operations to make the tree balanced.

• Rotation is the process of moving nodes either to left or to right to make the tree
balanced.
AVL TREE ROTATION
LL ROTATION
new node is inserted in the left sub-tree of the left sub-tree of the node (
LL imbalanced)
LL ROTATION
When new node is inserted in the left sub-tree of the left sub-tree of the
critical node
LL ROTATION- Example
RR ROTATION
new node is inserted in the right sub-tree of the right sub-tree of the node
(RR Imbalanced)
RR ROTATION
When new node is inserted in the right sub-tree of the right sub-tree of
the critical node
RR Rotation - Example
LR ROTATION
When new node is inserted in the left sub-tree of the right sub-
tree of the node LR Imbalance
LR ROTATION
When new node is inserted in the left sub-tree of the right sub-
tree of the node
LR ROTATION
When new node is inserted in the left sub-tree of the right sub-
tree of the critical node
LR Rotation - Example
RL ROTATION
When new node is inserted in the right sub-tree of the left sub-
tree of the node ( RL imbalance)
RL ROTATION
When new node is inserted in the right sub-tree of the left sub-
tree of the node ( RL imbalance)
RL ROTATION
When new node is inserted in the right sub-tree of the left sub-
tree of the node ( RL imbalance)
RL ROTATION
When new node is inserted in the right sub-tree of the left sub-
tree of the critical node
RL Rotation - Example
AVL TREE CONSTRUCTION / NODE INSERTION - Example
Construct an AVL tree by inserting the following elements in the given order 63, 9, 19, 27, 18, 108,
99, 81.
AVL TREE CONSTRUCTION / NODE INSERTION - Example
In an organization, 10 employees joined with their ID 50,20, 60, 10, 8, 15, 32, 46, 11, 48. Kindly insert
them one by one but follow the Height balance while constructing the tree. Write the Steps to follow the
insertion and solve it stepwise
AVL TREE – NODE DELETION - Example
• Delete nodes 52, 36, and 61 from the AVL tree given
Construct AVL Tree
• 35, 15,5,20,25,17,45
21CSC201J
DATA STRUCTURES AND
ALGORITHMS

UNIT-4
Topic : B TREE
B-Trees
• A B-tree of order m is an m-way tree: each node can have a maximum of m children.

• Each non-leaf (internal) node must have a minimum of ⌈m/2⌉ children.

• The root must have a minimum of 2 children (unless it is a leaf); leaf nodes have 0 children.

• Each node can have a maximum of m – 1 keys.

• The minimum number of keys is 1 at the root and ⌈m/2⌉ – 1 at every other node.

• All the leaf nodes must be at the same level.
Structure of an m-way
search tree node
• The structure of an m-way search tree node is shown in figure

• Where P0, P1, P2, ..., Pn are pointers to the node’s sub-trees and K0, K1, K2, ..., Kn–1 are the key values of the node.
• All the key values are stored in ascending order.
• A B tree is a specialized m-way tree developed by Rudolf Bayer and Ed
McCreight in 1970 that is widely used for disk access.
Source:
http://masterraghu.com/subjects/Datastructures/ebooks/rema%20thareja.pdf
B-Trees - Examples

B-Tree of order 3
B-Trees - Examples

B-Tree of order 4
Searching in a B-Tree
• Similar to searching in a binary search tree.
• Consider the B-Tree shown here.
• If we wish to search for 72 in this tree, first consider
the root, the search key is greater than the values in
the root node. So go to the right sub-tree.
• The right sub tree consists of two values and again 72 is greater than 63 so traverse to right sub tree of
63.
• The right sub tree consists of two values 72 and 81. So we found our value 72.
Insertion in a B-Tree
• Insertions are performed at the leaf level.
• Search the B-Tree to find suitable place to insert the new element.
• If the leaf node is not full, the new element can be inserted in the leaf
level.
• If the leaf is full,
• insert the new element in order into the existing set of keys.
• split the node at its median into two nodes.
• push the median element up to its parent’s node. If the parent’s node is
already full, then split the parent node by following the same steps.
Insertion - Example
• Consider the B-Tree of order 5

• Insert 8, 9, 39 and 4 into it.


Insertion - Example
• Insert 8, 9, 39 and 4 into it.
Insertion - Example
Exercise
• Consider the B-Tree of order 3, try to insert 121 and 87.

• Create a B-Tree of order 5, by inserting the following elements


• 3, 14, 7, 1, 8, 5, 11, 17, 13, 6, 23, 12, 20, 26, 4, 16, 18, 24, 25, and 19.
Deletion in a B-Tree
• Like insertion, deletion also should be performed from leaf nodes.
• There are two cases in deletion.
• To delete a leaf node.
• To delete an internal node.
• Deleting leaf node
• Search for the element to be deleted, if it is in the leaf node and the leaf node
has more than m/2 elements then delete the element.
• If the leaf node does not contain m/2 elements then take an element from
either left or right sub tree.
• If both the left and right sub tree contain only minimum number of elements
then create a new leaf node.
Deletion in a B-Tree
• Deleting an internal node
• If the element to be deleted is in an internal node then find the predecessor
or successor of the element to be deleted and place it in the deleted element
position.
• The predecessor or successor of the element to be deleted will always be in
the leaf node.
• So the procedure will be similar to deleting an element from the leaf node.
Deletion - Example
• Consider the B-Tree of order 5

Try to delete values 93, 201, 180 and 72 from it.


Deletion - Example
• Try to delete values 93, 201, 180 and 72 from it.
Deletion - Example
Deletion - Example
Deletion - Example
Exercise
• Consider the B-Tree of order 3, try to delete 36 and 109
21CSC201J
DATA STRUCTURES AND
ALGORITHMS

UNIT-4
Topic : Heap
Heaps
• A heap is a complete binary tree in which each node can have at most
two children.
• Heap is a balanced binary tree where the root-node key is compared
with its children and arranged accordingly.

• Min-Heap − the value of the root node is less than or equal to either
of its children
• Max-Heap − the value of the root node is greater than or equal to
either of its children.
Heaps
• A heap (here, a max heap) is a binary tree whose left and right subtrees hold
values less than or equal to their parent's.
• The root of a maxheap is guaranteed to hold the largest node in the
tree; its subtrees contain data that have lesser values.
• Unlike the binary search tree, however, the lesser-valued nodes of a
heap can be placed on either the right or the left subtree.
• Heaps have another interesting facet: they are often implemented in
an array rather than a linked list.
Heaps
• Max-Heap − the value of the root node is greater than or equal to
either of its children.
Heaps : Max-Heap Construction/Insertion
• Step 1 − Create a new node at the end of heap.
• Step 2 − Assign new value to the node.
• Step 3 − Compare the value of this child node with its parent.
• Step 4 − If value of parent is less than child, then swap them.
• Step 5 − Repeat step 3 & 4 until Heap property holds.
Heaps : Max-Heap Construction/Insertion
• Construct heap data structure: 35, 33, 42, 10, 14, 19, 27, 44, 26, 31
Heaps : Max-Heap Deletion
• Step 1 − Remove root node.
• Step 2 − Move the last element of last level to root.
• Step 3 − Compare the value of this child node with its parent.
• Step 4 − If value of parent is less than child, then swap them.
• Step 5 − Repeat step 3 & 4 until Heap property holds.
Heaps : Max-Heap Deletion
• In Deletion in the heap tree, the root node is always deleted and it is
replaced with the last element.
Definition

• A heap, as shown in Figure 9-1, is a binary tree structure with the following properties:
• 1. The tree is complete or nearly complete.
• 2. The key value of each node is greater than or equal to the key value in each of its descendants.

A heap is a complete or nearly complete binary tree in which the key value in a node is greater than or equal to the key values in all of its subtrees, and the subtrees are in turn heaps.
Heap Operations
• Two basic maintenance operations are performed on a heap:
• insert a node
• and delete a node.

• To implement the insert and delete operations, we need two basic algorithms:
reheap up and reheap down.
Reheap Up
• The reheap up operation reorders a “broken” heap by floating the
last element up the tree until it is in its correct location in the heap.
Heap Implementation
• Although a heap can be built in a dynamic tree structure, it is most often implemented in an array. This implementation is possible because the heap is, by definition, complete or nearly complete. Therefore, the relationship between a node and its children is fixed and can be calculated as shown below.
• 1. For a node located at index i, its children are found at:
  • a. Left child: 2i + 1
  • b. Right child: 2i + 2
• 2. The parent of a node located at index i is located at [(i – 1) / 2].
• 3. Given the index for a left child, j, its right sibling, if any, is found at j + 1. Conversely, given the index for a right child, k, its left sibling, which must exist, is found at k – 1.
• 4. Given the size, n, of a complete heap, the location of the first leaf is [n / 2]. Given the location of the first leaf element, the location of the last non-leaf element is one less.

Example: parent = (i – 1) / 2. For the node 40 at index i = 4, parent = 3 / 2 = 1, the node holding 75.

A heap can be implemented in an array because it must be a complete or nearly complete binary tree, which allows a fixed relationship between each node and its children.
Max Heap Construction Algorithm
Example: max heap. Insert a new node with value 85.
• Step 1 - Insert the newNode with value 85 as the last leaf from left to right. That means newNode is added as a right child of the node with value 75. After adding, the max heap is as follows...
• Step 2 - Compare the newNode value (85) with its parent node value (75). That means 85 > 75.
• Step 3 - Here the newNode value (85) is greater than its parent value (75), so swap both of them. After swapping, the max heap is as follows...
• Step 4 - Now, again compare the newNode value (85) with its parent node value (89).
• Here, newNode value (85) is smaller than its parent node value
(89). So, we stop insertion process. Finally, max heap after
insertion of a new node with value 85 is as follows...
Max Heap Deletion Algorithm
Problem Example:

Step 0: Change the Array Data Structure into a Tree


In order to change an array structure into the tree version of the binary heap, we start from the
left to the right of the array, and then insert values into the binary heap from top to bottom
and left to right.
• Step 1: Delete the node that contains the value you want deleted in the heap. The value that we want to delete is the maximum value or element in the array, which is at the root of the tree. This is the node that contains the value “10”.
• Step 2: Replace the deleted node with the farthest right node.
Step 3: Heapify (Fix the heap):

• The value “7” at the root of the tree is less than both of its children,
the nodes containing the value “8” and “9”. We need to swap the “7”
with the largest child, the node containing the value “9”.
Heap sort
• Heaps can be used in sorting an array.
• In max-heaps, maximum element will always be at the root. Heap Sort
uses this property of heap to sort the array.
• Consider an array Arr which is to be sorted using Heap Sort.
• 1. Initially build a max heap of elements in Arr.
• 2. The root element, that is Arr[1], will contain maximum element of Arr.
After that, swap this element with the last element of Arr and heapify
the max heap excluding the last element which is already in its correct
position and then decrease the length of heap by one.
• 3. Repeat the step 2, until all the elements are in their correct position.
Heap sort-complexity

• max_heapify has complexity O(log N).

• build_maxheap has complexity O(N), and we run max_heapify N−1 times in the heap_sort function; therefore the complexity of heap_sort is O(N log N).
Heap Applications
Three common applications of heaps
• Three common applications of heaps are
• selection algorithms,
• Priority queues, and
• sorting.
Selection Algorithms
• There are two solutions to the problem of determining the kth element in
an
• unsorted list.
• We could first sort the list and select the element at location k, or we could
create a heap and delete k – 1 elements from it, leaving the desired
element at the root.
• Rather than simply discard the elements at the top of the heap, a better
solution is to place the deleted element at the end of the heap and reduce
the heap size by 1.
• After the kth element has been processed, the temporarily removed
elements can then be reinserted into the heap.
For example, if we want to know the fourth-largest
element in a list, we can create the heap shown in Figure
9-14. After deleting three times, we have the fourth-largest
element, 21, at the top of the heap.
After selecting 21 we re-heap to restore the heap so that it
is complete and we are ready for another selection.
Priority Queues
• The heap is an excellent structure to use for a priority queue.
• As an event enters the queue, it is assigned a priority number that
determines its position relative to the other events already in the
queue.
• It is assigned a priority number even though the new event can enter
the heap in only one place at any given time, the first empty leaf.
Hashing and Collision Resolution

Hashing
Hashing is the process of mapping large amount of data item to smaller table with
the help of hashing function. Hashing is also known as Hashing Algorithm or Message Digest
Function.

It is a technique to convert a range of key values into a range of indexes of an array.


Hashing is a well-known technique to search any particular element among several elements. It
minimizes the number of comparisons while performing the search.

Hashing Mechanism
In hashing, an array data structure called as Hash table is used to store the data
items.
Based on the hash key value, data items are inserted into the hash table.
Hash Key Value (index)
Hash key value is a special value that serves as an index for a data item. It
indicates where the data item should be stored in the hash table. Hash key value is generated
using a hash function.
Hash Table
Hash table or hash map is a data structure used to store key-value pairs. It is a collection
of items stored to make it easy to find them later. It uses a hash function to compute an index
into an array of buckets or slots from which the desired value can be found.

Figure 1: Hash Table


The above figure shows the hash table with the size of n = 10. Each position of the hash table is
called as Slot. In the above hash table, there are n slots in the table, names = {0, 1, 2, 3, 4, 5, 6, 7,
8, 9}. Slot 0, slot 1, slot 2 and so on. Hash table contains no items, so every slot is empty.
As we know the mapping between an item and the slot where item belongs in the hash table is
called the hash function. The hash function takes any item in the collection and returns an integer
in the range of slot names between 0 to n-1.
Figure 2: Hashing Mechanism
Hash Function
Hash function is a function that maps any big number or string to a small integer value.
Hash function takes the data item as an input and returns a small integer value as an output. The
small integer value is called as a hash value. Hash value of the data item is then used as an index
for storing it into the hash table.

Types of Hash Functions


There are various types of hash functions available such as-
 Mid Square Hash Function
The key K is multiplied by itself and the address is obtained by selecting an appropriate number of digits from the middle of the square. The number of digits selected depends on the size of the table. For example, if a 3-digit address is required, positions 5 to 7 of the square could be chosen, giving address 138.

 Division Hash Function
Suppose we have integer items {26, 70, 18, 31, 54, 93}. One common method of determining a
hash key is the division method of hashing and the formula is :
Hash Key = Key Value % Number of Slots in the Table
Division method or reminder method takes an item and divides it by the table size and returns the
remainder as its hash value.

Data Item    Value % No. of Slots    Hash Value
26           26 % 10 = 6             6
70           70 % 10 = 0             0
18           18 % 10 = 8             8
31           31 % 10 = 1             1
54           54 % 10 = 4             4
93           93 % 10 = 3             3

Figure 3: Hash Table


After computing the hash values, we can insert each item into the hash table at the designated position as shown in the above figure. In the hash table, 6 of the 10 slots are occupied; this ratio, 6/10 = 0.6, is referred to as the load factor.
 Folding Hash Function
The key K is partitioned into a number of parts, each of which has the same length as the required address, with the possible exception of the last part. The parts are then added together, ignoring the final carry, to form an address. For example, to map the key 356942781 into a three-digit address, the parts P1=356, P2=942, P3=781 are added to yield 2079; ignoring the final carry gives the address 079.
Collision Handling
When two or more keys are given the same hash value, it is called a collision. To handle this
collision, we use collision resolution techniques.

Figure 4: Collision Handling
Keys: 5, 28, 19, 15, 20, 33, 12, 17, 10
HT slots: 9
hash function = h(k) = k % 9
h(5) = 5 % 9 = 5
h(28) = 28 % 9 = 1
h(19) = 19 % 9 = 1
h(15) = 15 % 9 = 6
h(20) = 20 % 9 = 2
h(33) = 33 % 9 = 6
h(12) = 12 % 9 = 3
h(17) = 17 % 9 = 8
h(10) = 10 % 9 = 1
Collision resolution techniques
There are two types of collision resolution techniques.
1. Separate chaining (open hashing)
2. Open addressing (closed hashing)

Figure 5: Techniques in Collision Resolution
Separate chaining
In this technique, a linked list is created from the slot in which collision has occurred, after
which the new key is inserted into the linked list. This linked list of slots looks like a chain, so it
is called separate chaining. It is used more when we do not know how many keys to insert or
delete.
Problem-
Using the hash function ‘key mod 7’, insert the following sequence of keys in the hash table-
50, 700, 76, 85, 92, 73 and 101
Use separate chaining technique for collision resolution.
Solution-
The given sequence of keys will be inserted in the hash table as-
Step-01:
Draw an empty hash table.
For the given hash function, the possible range of hash values is [0, 6].
So, draw an empty hash table consisting of 7 buckets as-

Figure 6: Empty Hash Table
Step-02:
Insert the given keys in the hash table one by one.
The first key to be inserted in the hash table = 50.
Bucket of the hash table to which key 50 maps = 50 mod 7 = 1.
So, key 50 will be inserted in bucket-1 of the hash table as-

Figure 7: Insert 50
Step-03:
The next key to be inserted in the hash table = 700.
Bucket of the hash table to which key 700 maps = 700 mod 7 = 0.
So, key 700 will be inserted in bucket-0 of the hash table as-

Figure 8: Insert 700
Step-04:
 The next key to be inserted in the hash table = 76.
 Bucket of the hash table to which key 76 maps = 76 mod 7 = 6.
 So, key 76 will be inserted in bucket-6 of the hash table as-

Figure 9: Insert 76

Step-05:
 The next key to be inserted in the hash table = 85.
 Bucket of the hash table to which key 85 maps = 85 mod 7 = 1.
 Since bucket-1 is already occupied, so collision occurs.
 Separate chaining handles the collision by creating a linked list to bucket-1.
 So, key 85 will be inserted in bucket-1 of the hash table as-

Figure 10: Insert 85

Step-06:
 The next key to be inserted in the hash table = 92.
 Bucket of the hash table to which key 92 maps = 92 mod 7 = 1.
 Since bucket-1 is already occupied, so collision occurs.
 Separate chaining handles the collision by creating a linked list to bucket-1.
 So, key 92 will be inserted in bucket-1 of the hash table as-

Figure 11: Insert 92

Step-07:
 The next key to be inserted in the hash table = 73.
 Bucket of the hash table to which key 73 maps = 73 mod 7 = 3.
 So, key 73 will be inserted in bucket-3 of the hash table as-

Figure 12: Insert 73

Step-08:
 The next key to be inserted in the hash table = 101.
 Bucket of the hash table to which key 101 maps = 101 mod 7 = 3.
 Since bucket-3 is already occupied, so collision occurs.
 Separate chaining handles the collision by creating a linked list to bucket-3.
 So, key 101 will be inserted in bucket-3 of the hash table as-

Figure 13: Insert 101


Open addressing
Open addressing is a collision-resolution method that is used to control collisions in the hash table. No key is stored outside of the hash table; therefore, the size of the hash table is always greater than or equal to the number of keys. It is also called closed hashing.
The following techniques are used in open addressing:
 Linear probing
 Quadratic probing
 Double hashing
1. Linear Probing-
In linear probing,
When collision occurs, we linearly probe for the next bucket.
We keep probing until an empty bucket is found.
Problem-
Using the hash function ‘key mod 7’, insert the following sequence of keys in the hash table-
50, 700, 76, 85, 92, 73 and 101
Use linear probing technique for collision resolution.
Solution-
The given sequence of keys will be inserted in the hash table as-

Step-01:
 Draw an empty hash table.
 For the given hash function, the possible range of hash values is [0, 6].
 So, draw an empty hash table consisting of 7 buckets as-

Figure 14: Empty Hash Table

Step-02:
 Insert the given keys in the hash table one by one.
 The first key to be inserted in the hash table = 50.
 Bucket of the hash table to which key 50 maps = 50 mod 7 = 1.
 So, key 50 will be inserted in bucket-1 of the hash table as-

Figure 15: Insert 50


Step-03:
 The next key to be inserted in the hash table = 700.
 Bucket of the hash table to which key 700 maps = 700 mod 7 = 0.
 So, key 700 will be inserted in bucket-0 of the hash table as-

Figure 16: Insert 700
Step-04:
 The next key to be inserted in the hash table = 76.
 Bucket of the hash table to which key 76 maps = 76 mod 7 = 6.
 So, key 76 will be inserted in bucket-6 of the hash table as-

Figure 17: Insert 76


Step-05:
 The next key to be inserted in the hash table = 85.
 Bucket of the hash table to which key 85 maps = 85 mod 7 = 1.
 Since bucket-1 is already occupied, so collision occurs.
 To handle the collision, linear probing technique keeps probing linearly until an empty
bucket is found.
 The first empty bucket is bucket-2.
 So, key 85 will be inserted in bucket-2 of the hash table as-

Figure 18: Insert 85
Step-06:
 The next key to be inserted in the hash table = 92.
 Bucket of the hash table to which key 92 maps = 92 mod 7 = 1.
 Since bucket-1 is already occupied, so collision occurs.
 To handle the collision, linear probing technique keeps probing linearly until an empty
bucket is found.
 The first empty bucket is bucket-3.
 So, key 92 will be inserted in bucket-3 of the hash table as-

Figure 19: Insert 92


Step-07:
 The next key to be inserted in the hash table = 73.
 Bucket of the hash table to which key 73 maps = 73 mod 7 = 3.
 Since bucket-3 is already occupied, so collision occurs.
 To handle the collision, linear probing technique keeps probing linearly until an empty
bucket is found.
 The first empty bucket is bucket-4.
 So, key 73 will be inserted in bucket-4 of the hash table as-

Figure 20: Insert 73
Step-08:
 The next key to be inserted in the hash table = 101.
 Bucket of the hash table to which key 101 maps = 101 mod 7 = 3.
 Since bucket-3 is already occupied, so collision occurs.
 To handle the collision, linear probing technique keeps probing linearly until an empty
bucket is found.
 The first empty bucket is bucket-5.
 So, key 101 will be inserted in bucket-5 of the hash table as-

Figure 21: Insert 101


PRIORITY QUEUE
A priority queue is a special type of queue in which each element is associated with a
priority value. And, elements are served on the basis of their priority. That is, higher priority
elements are served first. However, if elements with the same priority occur, they are served
according to their order in the queue.

UNIT -4
COLLISION
The hash function is a function that returns the hash key value using which the record can be placed in the hash table. Thus this function helps us in placing the record in the hash table at an appropriate position, and due to this we can retrieve the record directly from that location. This function needs to be designed very carefully, and it should not return the same hash key address for two different records. This is an undesirable situation in hashing.

Definition: The situation in which the hash function returns the same hash key (home bucket) for
more than one record is called collision and two same hash keys returned for different records is
called synonym.

Similarly, when there is no room for a new pair in the hash table, such a situation is called overflow. Sometimes when we handle a collision it may lead to an overflow condition. Collision and overflow indicate poor hash functions.

For example, consider a hash function

H(key) = recordkey % 10, with a hash table of size 10.

The record keys to be placed are 131, 44, 43, 78, 19, 36, 57 and 77:

131 % 10 = 1
44 % 10 = 4
43 % 10 = 3
78 % 10 = 8
19 % 10 = 9
36 % 10 = 6
57 % 10 = 7
77 % 10 = 7

Index  Key
0
1      131
2
3      43
4      44
5
6      36
7      57
8      78
9      19

Now if we try to place 77 in the hash table, we get the hash key 7, and at index 7 the record key 57 is already placed. This situation is called collision. From index 7, if we look for the next vacant position at the subsequent indices 8, 9, we find that there is no room to place 77 in the hash table. This situation is called overflow.

COLLISION RESOLUTION TECHNIQUES


If a collision occurs then it should be handled by applying some technique. Such a technique is called a collision handling technique.
1. Chaining
2. Open addressing (linear probing)
3. Quadratic probing
4. Double hashing
5. Rehashing

CHAINING
In collision handling method chaining is a concept which introduces an additional field with data
i.e. chain. A separate chain table is maintained for colliding data. When collision occurs then a
linked list(chain) is maintained at the home bucket.

For eg;

Consider the keys to be placed in their home buckets are


131, 3, 4, 21, 61, 7, 97, 8, 9

then we will apply a hash function as H(key) = key % D

Where D is the size of table. The hash table will be-

Here D = 10

Index  Chain
0
1  ->  131 -> 21 -> 61 -> NULL
2
3  ->  3 -> NULL
4  ->  4 -> NULL
5
6
7  ->  7 -> 97 -> NULL
8  ->  8 -> NULL
9  ->  9 -> NULL

A chain is maintained for colliding elements. For instance, 131 has home bucket (key) 1; similarly, keys 21 and 61 demand home bucket 1. Hence a chain is maintained at index 1.

OPEN ADDRESSING – LINEAR PROBING


This is the easiest method of handling collisions. When a collision occurs, i.e. when two records demand the same home bucket in the hash table, it can be resolved by placing the second record linearly down from the home bucket, wherever an empty bucket is found. When we use linear probing (open addressing), the hash table is represented as a one-dimensional array with indices that range from 0 to the desired table size - 1. Before inserting any elements into this table, we must initialize the table to represent the situation where all slots are empty. This allows us to detect overflows and collisions when we insert elements into the table. Then, using some suitable hash function, the element can be inserted into the hash table.

For example:

Consider that following keys are to be inserted in the hash table

131, 4, 8, 7, 21, 5, 31, 61, 9, 29


Initially, we will put the following keys in the hash table.


We will use Division hash function. That means the keys are placed using the formula

H(key) = key % tablesize


H(key) = key % 10

For instance the element 131 can be placed at

H(key) = 131 % 10
=1

Index 1 will be the home bucket for 131. Continuing in this fashion we will place 4, 8, 7.

Now the next key to be inserted is 21. According to the hash function

H(key)=21%10
H(key) = 1

But the index 1 location is already occupied by 131, i.e. a collision occurs. To resolve this collision we linearly move down, and at the next empty location we place the element. Therefore 21 will be placed at index 2. If the next element is 5 then we get the home bucket for 5 as index 5, and since this bucket is empty we put the element 5 at index 5.

The table below shows three stages: after placing 131, 4, 8, 7; after also placing 21 and 5; and after placing 31 and 61.

Index  131,4,8,7   +21,5   +31,61
0      NULL        NULL    NULL
1      131         131     131
2      NULL        21      21
3      NULL        NULL    31
4      4           4       4
5      NULL        5       5
6      NULL        NULL    61
7      7           7       7
8      8           8       8
9      NULL        NULL    NULL

The next record key is 9. According to the division hash function it demands home bucket 9. Hence we will place 9 at index 9. Now comes the final record key, 29, which hashes to key 9. But home bucket 9 is already occupied, and there is no next empty bucket as the table size is limited to index 9: overflow occurs. To handle it we wrap around to bucket 0, and as the location there is empty, 29 will be placed at the 0th index.
Problem with linear probing:
One major problem with linear probing is primary clustering. Primary clustering is a process in
which a block of data is formed in the hash table when collision is resolved.
19 % 10 = 9
18 % 10 = 8
39 % 10 = 9
29 % 10 = 9
8 % 10 = 8

Index  Key
0      39
1      29    <- cluster is formed
2      8
3-7    (rest of the table is empty)
8      18
9      19

This cluster problem can be solved by quadratic probing.

QUADRATIC PROBING:

Quadratic probing operates by taking the original hash value and adding successive values of an
arbitrary quadratic polynomial to the starting value. This method uses following formula.

Hi(key) = (Hash(key) + i²) % m

where m can be table size or any prime number.

for eg; If we have to insert the following elements in a hash table of size 10:

37, 90, 55, 22, 11, 17, 49, 87

37 % 10 = 7
90 % 10 = 0
55 % 10 = 5
22 % 10 = 2
11 % 10 = 1

Index  Key
0      90
1      11
2      22
3
4
5      55
6
7      37
8
9

Now if we want to place 17, a collision will occur, as 17 % 10 = 7 and bucket 7 already has the element 37. Hence we will apply quadratic probing to insert this record in the hash table.
Hi(key) = (Hash(key) + i²) % m

Consider i = 0; then
(17 + 0²) % 10 = 7
(17 + 1²) % 10 = 8, when i = 1

The bucket 8 is empty, hence we will place the element at index 8. Then comes 49, which will be placed at index 9 (49 % 10 = 9).

Index  Key
0      90
1      11
2      22
3
4
5      55
6
7      37
8      17
9      49
Now to place 87 we will use quadratic probing.

(87 + 0²) % 10 = 7 .. already occupied
(87 + 1²) % 10 = 8 .. already occupied
(87 + 2²) % 10 = 1 .. already occupied
(87 + 3²) % 10 = 6

Index  Key
0      90
1      11
2      22
3
4
5      55
6      87
7      37
8      17
9      49

It is observed that if we want to place all the necessary elements in the hash table, the size of the divisor (m) should be twice as large as the total number of elements.
DOUBLE HASHING
Double hashing is a technique in which a second hash function is applied to the key when a collision occurs. By applying the second hash function we get the number of positions from the point of collision at which to insert.
There are two important rules to be followed for the second function:
 it must never evaluate to zero.
 must make sure that all cells can be probed.
The formulas to be used for double hashing are

H1(key) = key mod tablesize
H2(key) = M – (key mod M)

where M is a prime number smaller than the size of the table.

Consider the following elements to be placed in a hash table of size 10:

37, 90, 45, 22, 17, 49, 55

Initially insert the elements using the formula for H1(key).

Insert 37, 90, 45, 22, 49:

H1(37) = 37 % 10 = 7
H1(90) = 90 % 10 = 0
H1(45) = 45 % 10 = 5
H1(22) = 22 % 10 = 2
H1(49) = 49 % 10 = 9

index : 0   1   2   3   4   5   6   7   8   9
key   : 90  -   22  -   -   45  -   37  -   49
Now if 17 is to be inserted, then

H1(17) = 17 % 10 = 7 → collision

H2(key) = M – (key % M)

Here M is a prime number smaller than the size of the table. The prime number smaller than table size 10 is 7, hence M = 7.

H2(17) = 7 – (17 % 7) = 7 – 3 = 4

That means we have to insert the element 17 at 4 places from the point of collision. In short, we take 4 jumps: (7 + 4) % 10 = 1. Therefore 17 will be placed at index 1.
Now to insert the number 55:

H1(55) = 55 % 10 = 5 → collision
H2(55) = 7 – (55 % 7) = 7 – 6 = 1

That means we take one jump from index 5 to place 55, i.e. at index 6. Finally the hash table will be:

index : 0   1   2   3   4   5   6   7   8   9
key   : 90  17  22  -   -   45  55  37  -   49
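The two-function scheme above can be sketched as follows (an illustrative sketch of the worked example; the name `double_hash_insert` is my own, and M = 7 is the prime below the table size as in the notes):

```python
def double_hash_insert(table, key, M=7):
    """Insert with H1(key) = key % m; on collision jump by H2(key) = M - (key % M)."""
    m = len(table)
    slot = key % m                 # H1: the home slot
    step = M - (key % M)           # H2: never zero, because key % M < M
    for _ in range(m):
        if table[slot] is None:
            table[slot] = key
            return slot
        slot = (slot + step) % m   # take 'step'-sized jumps from the collision point
    raise RuntimeError("no free slot found")

table = [None] * 10
for k in (37, 90, 45, 22, 17, 49, 55):
    double_hash_insert(table, k)
print(table)   # [90, 17, 22, None, None, 45, 55, 37, None, 49]
```

17 jumps 4 slots from index 7 to index 1, and 55 jumps 1 slot from index 5 to index 6, matching the hand computation.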
Comparison of Quadratic Probing & Double Hashing

Double hashing requires a second hash function; with a well-chosen second function its probing behaviour approaches that of uniform (random) probing when handling collisions.
Double hashing is more complex to implement than quadratic probing, and quadratic probing is the faster technique.

REHASHING

Rehashing is a technique in which the table is resized, i.e., the size of the table is doubled by creating a new table. It is preferable that the total size of the new table is a prime number. There are situations in which rehashing is required:

 when the table is completely full;
 with quadratic probing, when the table is half full;
 when insertions fail due to overflow.

In such situations, we have to transfer entries from the old table to the new table by recomputing their positions using the hash function.

Consider we have to insert the elements 37, 90, 55, 22, 17, 49 and 87. The table size is 10 and we use the hash function

H(key) = key mod tablesize

37 % 10 = 7
90 % 10 = 0
55 % 10 = 5
22 % 10 = 2
17 % 10 = 7 → collision, solved by linear probing (17 is placed at index 8)
49 % 10 = 9

Now this table is almost full, and if we try to insert more elements collisions will occur and eventually further insertions will fail. Hence we rehash by doubling the table size. The old table size is 10, so doubling gives 20; but 20 is not a prime number, so we prefer to make the new table size 23. The new hash function will be
H(key) = key mod 23

37 % 23 = 14
90 % 23 = 21
55 % 23 = 9
22 % 23 = 22
17 % 23 = 17
49 % 23 = 3
87 % 23 = 18

New table (size 23): index 3 → 49, index 9 → 55, index 14 → 37, index 17 → 17, index 18 → 87, index 21 → 90, index 22 → 22; all other slots are empty.

Now the hash table is sufficiently large to accommodate new insertions.
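The resize-to-next-prime step can be sketched as below (an illustrative sketch; the helper names `is_prime` and `rehash` are my own, and direct placement is enough here because the size-23 table has no collisions in this example):

```python
def is_prime(n):
    """Trial-division primality test; sufficient for small table sizes."""
    return n > 1 and all(n % d != 0 for d in range(2, int(n ** 0.5) + 1))

def rehash(old_table):
    """Double the table size, bump to the next prime, recompute every position."""
    new_size = 2 * len(old_table)      # 10 -> 20
    while not is_prime(new_size):
        new_size += 1                  # 20 -> 21 -> 22 -> 23
    new_table = [None] * new_size
    for key in old_table:
        if key is not None:
            # Direct placement; in general, probing would be applied here too,
            # but with size 23 this example produces no collisions.
            new_table[key % new_size] = key
    return new_table

old = [90, None, 22, None, None, 55, None, 37, 17, 49]   # the mod-10 table above
new = rehash(old)
print(len(new))    # 23
print(new[14])     # 37, since 37 % 23 = 14
```

Every entry lands at its freshly computed position, e.g. 90 at index 21 and 49 at index 3.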

Advantages:

1. This technique gives the programmer the flexibility to enlarge the table size if required.
2. Only the space gets doubled, and with a simple hash function the occurrence of collisions is reduced.

EXTENSIBLE HASHING

 Extensible hashing is a technique which handles a large amount of data. The data is placed in the hash table by extracting a certain number of bits from each key.
 Extensible hashing grows and shrinks in a manner similar to B-trees.
 In extensible hashing, the size of the directory determines which bucket an element is placed in. The levels (depths) are indicated in parentheses.

For example:

Directory:    0      1
Depth:       (0)    (1)
Buckets:     001    111    ← data placed in buckets
             010

 A bucket can hold the data of its depth. If the data in a bucket exceeds the depth, the bucket is split and the directory is doubled.

Consider we have to insert 1, 4, 5, 7, 8, 10. Assume each page (bucket) can hold 2 data entries (2 is the depth).

Step 1: Insert 1, 4.

1 = 001
4 = 100

We examine the last bit of each data item and insert it into the bucket:

 0
(0)
001
100

Insert 5. The bucket is full, hence we double the directory.
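The bit-extraction step that indexes the directory can be sketched as follows (an illustrative fragment only, not the full split algorithm; the name `last_bits` is my own, and a global depth of 1 is assumed, i.e. the state just after the first directory doubling):

```python
def last_bits(key, depth):
    """Directory index = the low-order 'depth' bits of the key."""
    return key & ((1 << depth) - 1)

# With global depth 1, the directory has 2 entries indexed by the last bit.
directory = {0: [], 1: []}
for k in (1, 4, 5):                    # 1 = 001, 4 = 100, 5 = 101
    directory[last_bits(k, 1)].append(k)
print(directory)                       # {0: [4], 1: [1, 5]}
```

Keys ending in 0 (here 4 = 100) go to directory entry 0; keys ending in 1 (1 = 001, 5 = 101) go to entry 1. When entry 1's bucket overflows its capacity of 2, it would be split and the directory doubled again, now indexing by the last two bits.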
