Unit-III DS Search Trees
Search Trees: Binary Search Trees, Definition, Implementation, Operations- Searching, Insertion and
Deletion, B- Trees, B+ Trees, AVL Trees, Definition, Height of an AVL Tree, Operations – Insertion,
Deletion and Searching, Red –Black, Splay Trees.
Introduction to Tree
o In linear data structures such as arrays, linked lists, stacks and queues, all the elements are arranged in a
sequential manner.
o A tree data structure is a nonlinear hierarchical data structure that consists of nodes connected by edges.
o A tree is a connected graph without any circuits.
o In a tree data structure, every individual element is called a node. A node stores the actual data of that
element and the links to the next elements in the hierarchical structure.
o The topmost node of the tree is called the root, and the nodes below it are called the child nodes.
o Each node can have multiple child nodes, and these child nodes can also have their own child nodes,
forming a recursive structure.
o In a tree data structure, if we have N nodes then we have exactly N-1 links, because every node except
the root is linked to exactly one parent.
Definition
A tree is a finite set of one or more nodes such that there is a specially designated node called the root, and the
remaining nodes are partitioned into n>=0 disjoint sets T1, T2, ..., Tn, where each of these sets is a tree.
T1, T2, ..., Tn are called the subtrees of the root.
2. Edge
o In a tree data structure, the connecting link between any two nodes is called as EDGE.
o In a tree with 'N' nodes there are exactly 'N-1' edges.
3. Parent
o In a tree data structure, the node which is a predecessor of any node is called as PARENT NODE.
o In simple words, the node which has a branch from it to any other node is called a parent node.
o Parent node can also be defined as "The node which has child / children".
5. Siblings
o In a tree data structure, nodes which belong to same Parent are called as SIBLINGS.
o In simple words, the nodes with the same parent are called Sibling nodes.
6. Leaf
o In a tree data structure, the node which does not have a child is called as LEAF Node.
o In simple words, a leaf is a node with no child.
o In a tree data structure, the leaf nodes are also called as External Nodes.
o External node is also a node with no child.
o In a tree, leaf node is also called as 'Terminal' node.
8. Degree
o In a tree data structure, the total number of children of a node is called as DEGREE of that Node.
o In simple words, the Degree of a node is total number of children it has.
o The highest degree of a node among all the nodes in a tree is called as 'Degree of Tree'.
9. Level
o In a tree data structure, the root node is said to be at Level 0 and the children of root node are at
Level 1 and the children of the nodes which are at Level 1 will be at Level 2 and so on...
o In simple words, each step from top to bottom in a tree is called a Level; the Level count starts
at '0' and is incremented by one at each level (step).
11. Depth
o In a tree data structure, the total number of edges from root node to a particular node is called as
DEPTH of that Node.
o In a tree, the total number of edges from root node to a leaf node in the longest path is said to be
Depth of the tree.
o In simple words, the highest depth of any leaf node in a tree is said to be depth of that tree.
o In a tree, depth of the root node is '0'.
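The terms above (degree, depth, number of edges) can be checked on a small example. The following is a minimal Python sketch, not part of the original notes; the class name `TreeNode`, the helper functions and the sample tree are all assumptions chosen for illustration.

```python
class TreeNode:
    """A general tree node: data plus any number of children (hypothetical helper)."""

    def __init__(self, data):
        self.data = data
        self.children = []

    def add(self, child):
        self.children.append(child)
        return child


def degree(node):
    # Degree of a node = number of its children.
    return len(node.children)


def degree_of_tree(root):
    # Degree of a tree = highest degree among all of its nodes.
    return max([degree(root)] + [degree_of_tree(c) for c in root.children])


def depth_of_tree(root):
    # Depth of a tree = number of edges on the longest root-to-leaf path.
    if not root.children:
        return 0
    return 1 + max(depth_of_tree(c) for c in root.children)


def count_edges(root):
    # Every node except the root has exactly one incoming edge.
    return sum(1 + count_edges(c) for c in root.children)


# Sample tree (an assumption for illustration):
#        A
#      / | \
#     B  C  D
#    / \
#   E   F
a = TreeNode("A")
b = a.add(TreeNode("B"))
a.add(TreeNode("C"))
a.add(TreeNode("D"))
b.add(TreeNode("E"))
b.add(TreeNode("F"))

print(degree(a), degree_of_tree(a), depth_of_tree(a), count_edges(a))  # 3 3 2 5
```

The sample tree has 6 nodes and 5 edges, which agrees with the N-1 rule stated earlier.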
12. Path
o In a tree data structure, the sequence of nodes and edges from one node to another node is called
the PATH between those two nodes.
o In these notes, the length of a path is the total number of nodes on that path (many textbooks count edges instead).
o In the below example, the path A - B - E - J has length 4.
TREE REPRESENTATIONS
o A tree data structure can be represented in two methods.
o Those methods are as follows:
a) List Representation
b) Left Child - Right Sibling Representation
Consider the following tree:
a) List Representation
o In this representation, two types of nodes are used, one for representing the node with data called
'data node' and another for representing only references called 'reference node'.
o We start with a 'data node' from the root node in the tree. Then it is linked to an internal node through
a 'reference node' which is further linked to any other node directly. This process repeats for all the
nodes in the tree.
b) Left Child - Right Sibling Representation
o In this representation, every node's data field stores the actual value of that node. If that node has a
left child, then the left reference field stores the address of that left child node; otherwise it stores NULL.
o If that node has a right sibling, then the right reference field stores the address of the right sibling node;
otherwise it stores NULL.
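The left-child and right-sibling fields described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the names `LcrsNode`, `add_child` and `children` are hypothetical, and the small tree built here (R with children X, Y, Z) is a made-up example, not the example tree from these notes.

```python
class LcrsNode:
    """Node with a data field, a left-child link and a right-sibling link."""

    def __init__(self, data):
        self.data = data
        self.left_child = None     # first (leftmost) child, or None
        self.right_sibling = None  # next sibling to the right, or None


def add_child(parent, child):
    # Attach child as the last child of parent.
    if parent.left_child is None:
        parent.left_child = child
    else:
        last = parent.left_child
        while last.right_sibling is not None:
            last = last.right_sibling
        last.right_sibling = child
    return child


def children(node):
    # Walk the sibling chain that starts at the left child.
    result = []
    current = node.left_child
    while current is not None:
        result.append(current.data)
        current = current.right_sibling
    return result


root = LcrsNode("R")
x = add_child(root, LcrsNode("X"))
add_child(root, LcrsNode("Y"))
add_child(root, LcrsNode("Z"))
add_child(x, LcrsNode("P"))

print(children(root))  # ['X', 'Y', 'Z']
print(children(x))     # ['P']
```

Each node needs only two references no matter how many children it has, which is the main appeal of this representation.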
The above example tree can be represented using Left Child - Right Sibling representation as follows:
There can be n number of subtrees in a general tree. In the general tree, the subtrees are unordered as the
nodes in the subtree cannot be ordered.
Every non-empty tree has a downward edge, and these edges are connected to the nodes known as child
nodes. The root node is labeled with level 0. The nodes that have the same parent are known as siblings.
2. Binary tree:
Here, the name binary itself suggests two numbers, i.e., 0 and 1. In a binary tree, each node can have
at most two child nodes. Here, at most means that a node may have 0, 1 or 2 child nodes.
Definition
A binary tree is a finite set of nodes that either is empty or consists of a root and two disjoint binary trees
called the left subtree and right subtree.
Figures: Strictly Binary Tree, Left Skewed Binary Tree, Right Skewed Binary Tree
Steps:
Visit the current node (e.g., print the node's data).
Traverse the left subtree in a pre-order manner.
Traverse the right subtree in a pre-order manner.
Use Case: Pre-order traversal is useful for creating a copy of the tree or for prefix notation expressions.
Recursion Algorithm
preorder(root):
    if root ≠ NULL, then
        Print root → info
        preorder(root → left)
        preorder(root → right)
Non-Recursion Algorithm
preorder(root):
1. If root = NULL, then print "Empty tree" and exit.
2. top = -1 //[Initialize stack top]
3. Push(root) //[Push root address onto stack]
4. ptr = root //[Initialize ptr with root of the binary tree]
5. Repeat steps 6 and 7 while the stack is not empty
6. ptr = Pop() //[Get stored address and assign to ptr]
7. If ptr ≠ NULL, then
   (a) Print ptr → info //[Print node info]
   (b) Push(ptr → right) //[Store address of right subtree]
   (c) Push(ptr → left) //[Store address of left subtree]
8. Exit
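The non-recursive algorithm above can be sketched in Python with a list used as the stack. This is an illustrative sketch, not the notes' own implementation; the class `Node`, the function name `preorder_iterative` and the sample tree are assumptions.

```python
class Node:
    def __init__(self, info, left=None, right=None):
        self.info = info
        self.left = left
        self.right = right


def preorder_iterative(root):
    """Return the pre-order sequence of node values using an explicit stack."""
    if root is None:
        return []  # empty tree
    visited = []
    stack = [root]
    while stack:
        ptr = stack.pop()
        if ptr is not None:
            visited.append(ptr.info)
            stack.append(ptr.right)  # pushed first, so it is popped last
            stack.append(ptr.left)   # pushed last, so it is popped first
    return visited


# Sample tree (an assumption for illustration):
#       1
#      / \
#     2   3
#    / \
#   4   5
tree = Node(1, Node(2, Node(4), Node(5)), Node(3))
print(preorder_iterative(tree))  # [1, 2, 4, 5, 3]
```

The right child is pushed before the left child because a stack is last-in, first-out, so the left subtree is processed first.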
3. Post-Order Traversal
In post-order traversal, the nodes are traversed in a left-right-root order. Here, you first visit the left and
right subtrees and then the node itself.
In-Order: D, B, E, A, C, F
Pre-Order: A, B, D, E, C, F
Post-Order: D, E, B, F, C, A
Each traversal serves different purposes and offers a unique view of the elements in the binary tree.
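The three orders listed above can be reproduced with short recursive functions. In the sketch below, the tree shape is an assumption reconstructed from the listed In-Order and Pre-Order sequences (A is the root, B has children D and E, and C has only a right child F); the class and function names are hypothetical.

```python
class Node:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


def inorder(node):
    # left subtree, then the node, then right subtree
    if node is None:
        return []
    return inorder(node.left) + [node.data] + inorder(node.right)


def preorder(node):
    # the node, then left subtree, then right subtree
    if node is None:
        return []
    return [node.data] + preorder(node.left) + preorder(node.right)


def postorder(node):
    # left subtree, then right subtree, then the node
    if node is None:
        return []
    return postorder(node.left) + postorder(node.right) + [node.data]


#       A
#      / \
#     B   C
#    / \   \
#   D   E   F
tree = Node("A", Node("B", Node("D"), Node("E")), Node("C", None, Node("F")))

print(inorder(tree))    # ['D', 'B', 'E', 'A', 'C', 'F']
print(preorder(tree))   # ['A', 'B', 'D', 'E', 'C', 'F']
print(postorder(tree))  # ['D', 'E', 'B', 'F', 'C', 'A']
```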
Now, the creation of binary search tree is completed. After that, let's move towards the operations that can be
performed on Binary search tree.
We can perform insert, delete and search operations on the binary search tree.
Let's understand how a search is performed on a binary search tree.
Searching in Binary search tree
Searching means to find or locate a specific element or node in a data structure. In Binary search tree, searching
a node is easy because elements in BST are stored in a specific order.
The steps of searching a node in Binary Search tree are listed as follows -
1. First, compare the element to be searched with the root element of the tree.
2. If root is matched with the target element, then return the node's location.
3. If it is not matched, then check whether the item is less than the root element, if it is smaller than the
root element, then move to the left subtree.
Mr. Mohammed Afzal, Asst. Professor in CSE (AI&ML) dept.
Mob: +91-8179700193, Email: [email protected]
4. If it is larger than the root element, then move to the right subtree.
5. Repeat the above procedure recursively until the match is found.
6. If the element is not found or not present in the tree, then return NULL.
Algorithm to search an element in Binary search tree
Search(root, item):
Step 1 - if (root = NULL) or (item = root → data)
             return root
         else if (item < root → data)
             return Search(root → left, item)
         else
             return Search(root → right, item)
         END if
Step 2 - END
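The search algorithm above can be sketched in Python as follows. The `insert` helper is included only so that the example can build a tree to search; the class name, function names and the key values are assumptions made for illustration.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    # Standard BST insertion; duplicate keys are ignored.
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root


def search(root, item):
    # Check for an empty subtree first, so root.key is never read on None.
    if root is None or root.key == item:
        return root
    if item < root.key:
        return search(root.left, item)
    return search(root.right, item)


root = None
for key in [45, 15, 79, 90, 10, 55, 12, 20, 50]:  # hypothetical keys
    root = insert(root, key)

found = search(root, 20)
missing = search(root, 99)
print(found.key if found else None)  # 20
print(missing)                       # None
```

Testing for an empty subtree before comparing keys matters: comparing first would dereference a NULL pointer whenever the item is absent.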
Now, let's understand searching in a binary search tree using an example. We take the binary search tree
formed above. Suppose we have to find node 20 in the below tree.
Step – 1:
Step – 2:
This is where the B-tree comes in: B-trees are m-way search trees with additional rules that keep them balanced.
B-Trees:
B-Trees were introduced by Rudolf Bayer and Edward M. McCreight at Boeing Research Labs in
1970 under the name Height Balanced M-way Search Tree. The structure was later named B-Tree.
A B-tree is a self-balancing tree in which all the leaf nodes are at the same level, which allows for efficient
searching, insertion and deletion of records. Because all the leaf nodes are on the same level, every search
follows a root-to-leaf path of the same length, so the access time is predictable and grows only
logarithmically with the size of the data set.
b. The leaf does not have a sibling that has more than the required number of keys.
o We need to merge the leaf with either its left or right sibling.
o Because two leaves are being merged, we also need to take the key from the parent whose left
and right pointers used to point to these two leaves and add this key to the merged node.
B+ Tree
A B+ tree is a self-balancing tree data structure that keeps data sorted and allows for efficient search,
insertion, and deletion operations. It is often used in databases and file systems to organize and access
large amounts of data.
B+ Trees retain the basic principles of B-Trees but introduce significant modifications that enhance
performance in disk-based storage systems. These modifications include storing all data in leaf nodes
and linking these leaf nodes to facilitate efficient sequential access, which is particularly valuable for
range queries.
Properties or Characteristics of B+ Trees
In a B+ Tree, all data records are stored at the leaf level. This contrasts with a B-tree, where data can be at
any level.
Leaf nodes in a B+ Tree are linked together in a linked list fashion, facilitating efficient range queries
and sequential access.
The internal nodes (non-leaf nodes) of a B+ Tree do not store data. Instead, they act as indexes
containing only keys that guide the search towards the leaf nodes.
Due to the high degree (number of child pointers per node), B+ Trees are typically shallower than B-trees
with the same number of elements. This results in fewer disk reads, which is crucial for performance in
disk-based storage.
All leaf nodes are at the same depth, which ensures that every access path from the root to a leaf node is
of the same length, providing consistent query performance.
By storing values only in leaves and maintaining a high fan-out, B+ Trees use disk space more efficiently
and reduce the number of disk I/O operations needed.
Like B-trees, B+ Trees are self-balancing. Insertions and deletions may cause redistribution of keys
among nodes or splitting/merging of nodes but maintain the tree's balance.
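To illustrate why linked leaves help range queries, here is a minimal Python sketch of the leaf level only. It is a simplification under stated assumptions: the names `Leaf`, `link_leaves` and `range_query` are hypothetical, the index (internal) nodes are omitted, and the scan starts at the first leaf, whereas a real B+ Tree would first descend the index to the leaf that contains the lower bound.

```python
class Leaf:
    """A B+ Tree leaf: sorted keys plus a link to the next leaf (simplified)."""

    def __init__(self, keys):
        self.keys = keys  # assumed to be sorted
        self.next = None


def link_leaves(leaves):
    # Chain the leaves left to right, like a singly linked list.
    for current, following in zip(leaves, leaves[1:]):
        current.next = following
    return leaves[0]


def range_query(first_leaf, low, high):
    # Follow the leaf chain and collect every key in [low, high].
    result = []
    leaf = first_leaf
    while leaf is not None:
        for key in leaf.keys:
            if key > high:
                return result  # keys are sorted, so nothing later can match
            if key >= low:
                result.append(key)
        leaf = leaf.next
    return result


first = link_leaves([Leaf([10, 20]), Leaf([30, 40]), Leaf([50, 60])])
print(range_query(first, 20, 50))  # [20, 30, 40, 50]
```

Once the starting leaf is known, the scan never has to go back up into the index nodes, which is the sequential-access advantage described above.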
B+ Tree Insertion in data structure
Case 1: Inserting into a Non-Full Leaf Node
o If the leaf node has fewer than m-1 keys, insert the key into the node in its correct sorted position.
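Case 1 can be sketched in a few lines of Python. The function name `insert_into_leaf` and the representation of a leaf as a plain sorted list are assumptions for illustration; the full case (splitting a full leaf) is deliberately not shown.

```python
import bisect


def insert_into_leaf(keys, key, m):
    """Insert key into a sorted leaf if it holds fewer than m-1 keys.

    Returns True on success and False when the leaf is full
    (a split would then be needed, which this sketch does not cover).
    """
    if len(keys) >= m - 1:
        return False
    bisect.insort(keys, key)  # keeps the list sorted
    return True


leaf = [10, 30]
print(insert_into_leaf(leaf, 20, 4), leaf)  # True [10, 20, 30]
print(insert_into_leaf(leaf, 25, 4), leaf)  # False [10, 20, 30]
```

With order m = 4 a leaf holds at most 3 keys, so the second insertion is refused.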
Deletion in B+ Tree
Step-1: Delete the key and data from the leaves.
200 is present in the right sub-tree of 190, after 195. Delete it.
Merge the two nodes by using 195, 190, 154 and 129.
Now, element 120 is the single element present in the node which is violating the B+ Tree properties.
Therefore, we need to merge it by using 60, 78, 108 and 120.
Now, the height of B+ tree will be decreased by 1.
For an AVL tree to be balanced, the balance factor of every node must be -1, 0, or 1.
Here is what these values signify:
0: The heights of the left and right subtrees are equal. The subtree is perfectly balanced.
1: The left subtree is taller than the right subtree by one level.
-1: The right subtree is taller than the left subtree by one level.
If the balance factor of any node falls outside this range (-1, 0, 1), it indicates that the tree is unbalanced at
that node, and a rotation (or series of rotations) is required to bring it back into balance.
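The balance factor described above (height of the left subtree minus height of the right subtree) can be computed as in the following Python sketch. The class and function names are assumptions; heights are counted in levels, so an empty subtree has height 0. Note that `is_balanced` checks only the balance condition, not the binary search tree ordering.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def height(node):
    # Height in levels: empty subtree = 0, single node = 1.
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))


def balance_factor(node):
    # Positive: left subtree is taller. Negative: right subtree is taller.
    return height(node.left) - height(node.right)


def is_balanced(node):
    # True if every node has a balance factor of -1, 0 or 1.
    if node is None:
        return True
    return (abs(balance_factor(node)) <= 1
            and is_balanced(node.left)
            and is_balanced(node.right))


balanced = Node(2, Node(1), Node(3))
skewed = Node(1, None, Node(2, None, Node(3)))

print(balance_factor(balanced), is_balanced(balanced))  # 0 True
print(balance_factor(skewed), is_balanced(skewed))      # -2 False
```

Recomputing heights on every call is simple but slow; practical AVL implementations usually store the height in each node.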
An AVL tree is given in the following figure. We can see that the balance factor associated with each node is
between -1 and +1; therefore, it is an example of an AVL tree.
In the above example, node A has balance factor -2 because node C is inserted in the right subtree of A's
right subtree. We perform the RR rotation on the edge below A.
2. LL Rotation
When the BST becomes unbalanced because a node is inserted into the left subtree of the left subtree of C,
we perform LL rotation. LL rotation is a clockwise rotation, which is applied on the edge below the
node having balance factor 2.
In the above example, node C has balance factor 2 because node A is inserted in the left subtree of C's left
subtree. We perform the LL rotation on the edge below C.
3. LR Rotation
LR rotation = RR rotation + LL rotation, i.e., first an RR rotation is performed on the subtree and then an LL
rotation is performed on the full tree. By full tree we mean the tree rooted at the first node, on the path up
from the inserted node, whose balance factor is other than -1, 0, or 1.
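The rotations named above can be sketched in Python as pointer rearrangements. This is an illustrative sketch, not the notes' own code: the function names are assumptions, `rotate_right` is the clockwise rotation used for the LL case, `rotate_left` is the opposite rotation used for the RR case, and the LR case is written as their composition.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def rotate_right(node):
    # Clockwise rotation (LL case): the left child becomes the subtree root.
    new_root = node.left
    node.left = new_root.right
    new_root.right = node
    return new_root


def rotate_left(node):
    # Opposite rotation (RR case): the right child becomes the subtree root.
    new_root = node.right
    node.right = new_root.left
    new_root.left = node
    return new_root


def rotate_left_right(node):
    # LR case: first rotate the left subtree left, then rotate the node right.
    node.left = rotate_left(node.left)
    return rotate_right(node)


# LR shape: C's left child is A, and A's right child is B.
c = Node("C", left=Node("A", right=Node("B")))
root = rotate_left_right(c)
print(root.key, root.left.key, root.right.key)  # B A C
```

After the double rotation the middle key B is at the top, with A and C as its children, so every balance factor is 0.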
Let us understand each and every step very clearly:
On inserting the above elements, in particular H, the BST becomes unbalanced because the
Balance Factor of H is -2. Since the BST is right-skewed, we perform RR Rotation on node H.
2. Insert B, A The resultant balance tree is:
On inserting the above elements, in particular A, the BST becomes unbalanced because the Balance
Factor of both H and I is 2. We consider the first unbalanced node on the path up from the last inserted
node, i.e., H. Since the BST from H is left-skewed, we perform LL Rotation on node H.
3. Insert E
4. Insert C, F, D
On inserting C, F, D, the BST becomes unbalanced because the Balance Factor of B and H is -2. Travelling
from D to B, we find that D is inserted in the right subtree of the left subtree of B, so we perform RL
Rotation on node I. RL = LL + RR rotation.
5. Insert G
On inserting G, the BST becomes unbalanced because the Balance Factor of H is 2. Travelling from G to H,
we find that G is inserted in the left subtree of the right subtree of H, so we perform LR Rotation on node I.
LR = RR + LL rotation.
a) We first perform RR rotation on node C. The resultant tree after RR rotation is:
b) We then perform LL rotation on node H. The resultant balanced tree after LL rotation is:
On inserting K, the BST becomes unbalanced because the Balance Factor of I is -2. Since the BST is
right-skewed from I to K, we perform RR Rotation on node I.
7. Insert L
On inserting L, the tree is still balanced because the Balance Factor of each node is -1, 0 or +1.
Hence the tree is a balanced AVL tree.
In the above tree, suppose we want to search for 80. We first compare 80 with the root node. 80 is greater
than the root node, i.e., 10, so the search continues in the right subtree. Next, 80 is compared
with 15; 80 is greater than 15, so we move to the right of 15, i.e., to 20.
Now we have reached the leaf node 20, and 20 is not equal to 80, so the search reports that the element is not
found in the tree. After each comparison, the remaining search space is roughly halved, so the above BST
takes O(log n) time to search for an element.
The above tree shows a right-skewed BST. If we want to search for 80 in this tree, we compare 80
with every node until we find the element or reach the leaf node. So, the above right-skewed BST
takes O(n) time to search for an element.
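The difference between O(log n) and O(n) can be made concrete by counting how many nodes a search visits. The sketch below is an illustration with made-up data: the keys 1 to 15, the two insertion orders and all function names are assumptions, not the trees shown in the figures.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    # Plain BST insertion with no rebalancing.
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root


def build(keys):
    root = None
    for key in keys:
        root = insert(root, key)
    return root


def nodes_visited(root, key):
    # Count the nodes compared with key during a search.
    count = 0
    node = root
    while node is not None:
        count += 1
        if key == node.key:
            break
        node = node.left if key < node.key else node.right
    return count


# Ascending insertion order gives a right-skewed tree.
skewed = build(range(1, 16))
# This insertion order gives a perfectly balanced tree of the same 15 keys.
balanced = build([8, 4, 12, 2, 6, 10, 14, 1, 3, 5, 7, 9, 11, 13, 15])

print(nodes_visited(skewed, 80))    # 15
print(nodes_visited(balanced, 80))  # 4
```

Both trees hold the same keys; only their shape differs, which is why self-balancing trees such as AVL and Red-Black trees are useful.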
Step 2: The next node is 18. As 18 is greater than 10, it is placed at the right of 10 as shown below.
By the second rule of the Red-Black tree, if the tree is not empty then the newly created node is
colored Red. Therefore, node 18 is Red, as shown in the figure:
Now we check the third rule of the Red-Black tree, i.e., whether the parent of the new node is black. In the above
figure, the parent of the node is black; therefore, the tree is still a Red-Black tree.
Step 3: Now, we create the new node having value 7 with Red color. As 7 is less than 10, it is placed at the
left of 10 as shown below.
Now we check the third rule of the Red-Black tree, i.e., whether the parent of the new node is black. As we can
observe, the parent of node 7 is black, so the tree obeys the Red-Black tree's properties.
Step 4: The next element is 15, and 15 is greater than 10, but less than 18, so the new node will be created at the
left of node 18. The node 15 would be Red in color as the tree is not empty.
We also have to check whether the parent's parent of the new node is the root node or not. As we can observe in
the above figure, the parent's parent of a new node is the root node, so we do not need to recolor it.
Step 5: The next element is 16. As 16 is greater than 10, less than 18 and greater than 15, node 16
becomes the right child of node 15. The tree is not empty, so node 16 is colored Red, as shown in the below
figure:
In the above figure, we can observe a red-red parent-child relationship, which violates the Red-Black tree
properties. We have to apply some rules to restore a Red-Black tree. Since the new node's
parent is Red and the parent of the new node has no sibling, rule 4a is applied. Rule 4a says
that rotations and recoloring are performed on the tree.
Node 16 is the right child of node 15, and node 15 is the left child of its parent, node 18. Here we
have an LR relationship, so two rotations are required: first a left rotation, then a right rotation.
The left rotation is performed on nodes 15 and 16, where node 16 moves
upward and node 15 moves downward. Once the left rotation is performed, the tree looks as shown in
the below figure:
After rotation, node 16 and node 18 would be recolored; the color of node 16 is red, so it will change to black,
and the color of node 18 is black, so it will change to a red color as shown in the below figure:
Step 6: The next element is 30. Node 30 is inserted at the right of node 18. As the tree is not empty, so the color
of node 30 would be red.
The color of the parent and parent's sibling of a new node is Red, so rule 4b is applied. In rule 4b, we have to do
only recoloring, i.e., no rotations are required. The color of both the parent (node 18) and parent's sibling (node
15) would become black, as shown in the below image.
Step 7: The next element to insert is 25. Since 25 is greater than 10, 16 and 18 but less
than 30, it becomes the left child of node 30. As the tree is not empty, node 25 is colored Red. Here a
Red-red conflict occurs because the parent of the newly created node is Red.
Since the parent has no sibling, rule 4a is applied, in which both rotation and recoloring are performed.
First, we perform the rotations. The newly created node is the left child of its parent and the parent is
the right child of its own parent, so an RL relationship is formed. First, the right rotation is performed, in which
node 25 moves upwards and node 30 moves downwards, as shown in the below figure.
After the first rotation, there is an RR relationship, so a left rotation is performed. After the left rotation, the
median element, i.e., 25, becomes the root of this subtree; node 30 is at the right of node 25 and node 18 is at
the left of node 25.
Step 8: The next element is 40. Since 40 is greater than 10, 16, 18, 25, and 30, so node 40 will come at the right
of node 30. As the tree is not empty, node 40 would be Red in color. There is a Red-red conflict between nodes
40 and 30, so rule 4b will be applied.
As the color of parent and parent's sibling node of a new node is Red so recoloring would be performed. The
color of both the nodes would become black, as shown in the below image.
After recoloring, we also have to check the parent's parent of the new node, i.e., 25. It is not the root node, so
it is recolored, and the color of node 25 changes to Red.
After this recoloring, a red-red conflict occurs between nodes 25 and 16. Node 25 is now treated as the
new node. Since the parent of node 25 is Red and the parent's sibling is Black, rule 4a is
applied. Node 25 is the right child of node 16 and 16 is the right child of its parent, so there is an RR relationship.
In the RR relationship, a left rotation is performed. After the left rotation, the median element 16 becomes the root
node, as shown in the below figure.
Step 9: The next element is 60. Since 60 is greater than 16, 25, 30, and 40, so node 60 will come at the right of
node 40. As the tree is not empty, the color of node 60 would be Red.
As we can observe in the above tree, a Red-red conflict occurs. The parent node is Red, and
it has no sibling in the tree, so rule 4a is applied. The rotation is performed first:
an RR relationship exists between the nodes, so a left rotation is performed.
After rotation, the recoloring is performed on nodes 30 and 40. The color of node 30 would become Red, while
the color of node 40 would become black.
The above tree is a Red-Black tree as it follows all the Red-Black tree properties.
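The properties used in the steps above (black root, no red node with a red child, equal number of black nodes on every root-to-nil path) can be checked mechanically. The following Python sketch is an illustration only: the names `Node`, `black_height` and `is_red_black` are assumptions, the checker does not verify the BST ordering, and the two small trees are built by hand from steps 3 and 5, with the black color of node 7 in the second tree assumed from step 4.

```python
RED, BLACK = "red", "black"


class Node:
    def __init__(self, key, color, left=None, right=None):
        self.key = key
        self.color = color
        self.left = left
        self.right = right


def black_height(node):
    """Return the black height of the subtree, or -1 if a rule is broken."""
    if node is None:
        return 1  # nil leaves count as black
    if node.color == RED:
        for child in (node.left, node.right):
            if child is not None and child.color == RED:
                return -1  # red-red conflict
    left = black_height(node.left)
    right = black_height(node.right)
    if left == -1 or right == -1 or left != right:
        return -1
    return left + (1 if node.color == BLACK else 0)


def is_red_black(root):
    if root is None:
        return True
    return root.color == BLACK and black_height(root) != -1


# Tree after step 3: black root 10 with red children 7 and 18.
after_step_3 = Node(10, BLACK, Node(7, RED), Node(18, RED))

# Tree in step 5 before fixing: red 16 under red 15 (red-red conflict).
step_5_conflict = Node(
    10, BLACK,
    Node(7, BLACK),
    Node(18, BLACK, Node(15, RED, None, Node(16, RED))),
)

print(is_red_black(after_step_3))     # True
print(is_red_black(step_5_conflict))  # False
```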
Deletion in Red-Black tree
Let's understand how we can delete the particular node from the Red-Black tree. The following are the rules
used to delete the particular node from the tree:
Step 1: First, we perform BST rules for the deletion.
Step 2:
Case 1: if the node is Red, which is to be deleted, we simply delete it.
Let's understand case 1 through an example.
Suppose we want to delete node 30 from the tree, which is given below.
Initially, we have the address of the root node. First, we apply the BST search to locate the node. Since 30 is
greater than 10 and 20, 30 is the right child of node 20. Node 30 is a leaf node and Red in
color, so it is simply deleted from the tree.
If we want to delete an internal node that has one child, we first replace the value of the internal node with the
value of the child node and then simply delete the child node.
Let's take another example in which we want to delete the internal node, i.e., node 20.
We cannot delete the internal node directly; we can only replace the value of that node with another value. Node 20 is at
the right of the root node, and it has only one child, node 30. So, the value 20 is replaced with the value 30, but
the color of the node remains the same, i.e., Black. In the end, the leaf node that originally held 30 is deleted from the tree.
If we want to delete an internal node that has two child nodes, we have to decide which subtree supplies the
replacement value for the internal node (either the left subtree or the right subtree). We have two ways:
o In-order predecessor: We will replace with the largest value that exists in the left subtree.
o In-order successor: We will replace with the smallest value that exists in the right subtree.
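The two choices above can be sketched in Python for a node that has both children. The function names and the sample tree are assumptions made for illustration.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def inorder_predecessor(node):
    # Largest key in the left subtree: go left once, then right as far as possible.
    current = node.left
    while current is not None and current.right is not None:
        current = current.right
    return current


def inorder_successor(node):
    # Smallest key in the right subtree: go right once, then left as far as possible.
    current = node.right
    while current is not None and current.left is not None:
        current = current.left
    return current


# Sample BST (an assumption for illustration):
#         30
#        /  \
#      20    40
#     /  \  /  \
#    10  25 35  50
root = Node(30, Node(20, Node(10), Node(25)), Node(40, Node(35), Node(50)))

print(inorder_predecessor(root).key)  # 25
print(inorder_successor(root).key)    # 35
```

Either value can replace 30 without breaking the BST ordering, because 25 is the largest key below 30 and 35 is the smallest key above it.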
Suppose we want to delete node 30 from the tree, which is shown below:
Case 2: If the root node is also double black, then simply remove the double black and make it a single black.
Case 3: If the double black's sibling is black and both its children are black.
o Remove the double black node.
o Add the color of the node to the parent (P) node.
1. If the color of P is red then it becomes black.
2. If the color of P is black, then it becomes double black.
o The color of double black's sibling changes to red.
o If still double black situation arises, then we will apply other cases.
Let's understand this case through an example.
Suppose we want to delete node 15 in the below tree.
We cannot simply delete node 15 from the tree as node 15 is Black in color. Node 15 has two children, which
are nil. So, we replace the 15 value with a nil value. As node 15 and nil node are black in color, the node
becomes double black after replacement, as shown in the below figure.
In the above tree, we can observe that the double black's sibling is black and its children are nil,
which are also black. As the double black's sibling and its children are all black, the sibling cannot give a black
color to either of them. The double black's parent node is Red, so the double black node adds its black color to its
parent node. The color of node 20 changes to black, while the nil node becomes single black,
as shown in the below figure.
After adding the color to its parent node, the color of the double black's sibling, i.e., node 30 changes to red as
shown in the below figure.
In the above tree, we can observe that the double black problem no longer exists, and the tree is again a
Red-Black tree.
Case 4: If double black's sibling is Red.
o Swap the color of its parent and its sibling.
o Rotate the parent node in the double black's direction.
o Reapply cases.
Let's understand this case through an example.
Suppose we want to delete node 15.
In the above tree, we can observe that double black situation still exists in the tree. It satisfies the case 3 in
which double black's sibling is black as well as both its children are black. First, we remove the double black
from the node and add the black color to its parent node. At the end, the color of the double black's sibling, i.e.,
node 25 changes to Red as shown in the below figure.
In the above tree, we can observe that the double black situation has been resolved. It also satisfies the
properties of the Red Black tree.
Case 5: If double black's sibling is black, sibling's child who is far from the double black is black, but near child
to double black is red.
o Swap the color of double black's sibling and the sibling child which is nearer to the double black node.
o Rotate the sibling in the opposite direction of the double black.
o Apply case 6
Suppose we want to delete the node 1 in the below tree.
We can observe in the above figure that the double black problem still exists in the tree, so we reapply
the cases. We apply case 5 because the sibling of node 5 is node 30, which is black; the child of
node 30 that is far from node 5 is black; and the child of node 30 that is near to node 5 is Red. In this
case, we first swap the colors of node 30 and node 25, so node 30 becomes Red and
node 25 becomes Black, as shown below.
As we can observe in the above tree, the double black situation still exists, so we need to apply case 6. Let's first see
what case 6 is.
Case 6: If double black's sibling is black, far child is Red
o Swap the color of Parent and its sibling node.
o Rotate the parent towards the Double black's direction
o Remove Double black
o Change the Red color to black.
Now we will apply case 6 in the above example to solve the double black's situation.
In the above example, the double black is node 5, and the sibling of node 5 is node 25, which is black in color.
The far child of the double black node is node 30, which is Red in color as shown in the below figure:
First, we swap the colors of the parent and its sibling. The parent of node 5 is node 10, and the sibling node is
node 25. Both nodes are black, so the swap changes nothing.
In the second step, we rotate the parent in the double black's direction. After the rotation, node 25 moves
upwards, whereas node 10 moves downwards. Once the rotation is performed, the tree looks as shown
in the below figure:
Splay Tree
The splay tree was developed by Daniel Dominic Sleator and Robert Endre Tarjan in 1985. A Splay Tree is
a self-adjusting binary search tree data structure that automatically reorganizes itself to optimize access times
for frequently accessed elements by moving them closer to the root. It accomplishes this by performing a splay
operation after every search, insert, or delete operation, bringing the accessed node to the root of the tree.
No Height Balance Guarantee: Unlike AVL trees and Red-Black trees, Splay Trees do not guarantee a
balanced height. However, they exhibit a form of local balance, which means nodes that are frequently
accessed tend to move closer to the root.
function rightRotate(node):
    // Right rotation operation
    newRoot = node.left
    node.left = newRoot.right
    newRoot.right = node
    return newRoot

function leftRotate(node):
    // Left rotation operation
    newRoot = node.right
    node.right = newRoot.left
    newRoot.left = node
    return newRoot

// After splaying, if the key is not at the root, it's not in the tree
if key != root.key:
    return root

function findMax(node):
    // Find the maximum node in a subtree
    while node.right is not null:
        node = node.right
    return node
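The rotation helpers above can be combined into a splay operation. The following Python sketch uses a common recursive formulation and is offered as one possible way to write it, not as the notes' own implementation; the names `splay`, `right_rotate` and `left_rotate`, and the sample chain 10, 15, 17, are assumptions.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def right_rotate(node):
    new_root = node.left
    node.left = new_root.right
    new_root.right = node
    return new_root


def left_rotate(node):
    new_root = node.right
    node.right = new_root.left
    new_root.left = node
    return new_root


def splay(root, key):
    """Bring key (or the last node reached while looking for it) to the root."""
    if root is None or root.key == key:
        return root
    if key < root.key:
        if root.left is None:
            return root
        if key < root.left.key:        # zig-zig (left-left)
            root.left.left = splay(root.left.left, key)
            root = right_rotate(root)
        elif key > root.left.key:      # zig-zag (left-right)
            root.left.right = splay(root.left.right, key)
            if root.left.right is not None:
                root.left = left_rotate(root.left)
        return root if root.left is None else right_rotate(root)
    if root.right is None:
        return root
    if key > root.right.key:           # zig-zig (right-right)
        root.right.right = splay(root.right.right, key)
        root = left_rotate(root)
    elif key < root.right.key:         # zig-zag (right-left)
        root.right.left = splay(root.right.left, key)
        if root.right.left is not None:
            root.right = right_rotate(root.right)
    return root if root.right is None else left_rotate(root)


# Right-skewed chain 10 -> 15 -> 17; splaying 17 needs two left rotations.
chain = Node(10, right=Node(15, right=Node(17)))
root = splay(chain, 17)
print(root.key, root.left.key, root.left.left.key)  # 17 15 10
```

Because 17 has both a parent and a grandparent on the same side, the example exercises the zig-zig case.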
Step 3: The next element is 17. As 17 is greater than 10 and 15, it becomes the right child of node 15.
Now, we perform splaying. As 17 has a parent as well as a grandparent, we perform zig zig
rotations.
Node 7 is still not the root node; it is the left child of the root node, i.e., 17. So, we need to perform one more
right rotation to make node 7 the root node, as shown below: