DSA-Module 1_ Notes on Search Trees and Their Operations
Module – 1
Downloaded
Downloaded by armar
Kinjal abdul
Algud([email protected])
NIS
lOMoARcPSD|45529801
3. Characteristics:
• Each node holds one object (with its key) and has two children (left and
right), so every node offers a three-way decision: left, right, or found here.
• Traversing an interior node requires two comparisons (one for
the left/right decision and one for equality).
• The maximum number of objects in a tree of height h is
2^(h+1) - 1.
4. Example Structure:
• The structure is similar to Model 1, but with the understanding
that each node can also contain an object:
typedef struct tr_n_t {
    key_t key;              // The key for comparison
    object_t *object;       // The object associated with the key
    struct tr_n_t *left;    // Points to the left subtree
    struct tr_n_t *right;   // Points to the right subtree
} tree_node_t;
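A minimal sketch of a lookup in this second model, to make the two comparisons per node concrete. The function name find_model2 and the choice of int for key_t and object_t are assumptions made for illustration; they are not part of the notes.

```c
#include <assert.h>
#include <stddef.h>

typedef int key_t;      /* assumption: integer keys */
typedef int object_t;   /* assumption: placeholder object type */

typedef struct tr_n_t {
    key_t          key;
    object_t      *object;
    struct tr_n_t *left;
    struct tr_n_t *right;
} tree_node_t;

/* Model 2 lookup: every node holds an object, so each step first tests
   for equality and then decides between the left and right subtree. */
object_t *find_model2(const tree_node_t *node, key_t query_key) {
    while (node != NULL) {
        if (query_key == node->key) {       /* comparison 1: equality */
            return node->object;
        }
        node = (query_key < node->key)      /* comparison 2: direction */
                   ? node->left
                   : node->right;
    }
    return NULL;                            /* ran off the tree: key absent */
}
```

In this sketch a missing child is represented by NULL, so an unsuccessful search ends when the walk leaves the tree.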
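// Right rotation around n: the left child's key moves up into n,
// and n's old key moves down into the right child.
void right_rotation(tree_node_t *n) {
    tree_node_t *tmp_node;
    key_t tmp_key;
    tmp_node = n->right;
    tmp_key = n->key;
    n->right = n->left;
    n->key = n->left->key;
    n->left = n->right->left;
    n->right->left = n->right->right;
    n->right->right = tmp_node;
    n->right->key = tmp_key;
}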
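The mirror-image operation can be sketched the same way. This is a hedged illustration rather than code from the notes: the name left_rotation, the reduced node type without an object field, and int keys are assumptions. As in the code above, the node n itself stays in place and only keys and child pointers are exchanged, so the parent of n needs no update.

```c
#include <assert.h>
#include <stddef.h>

typedef int key_t;  /* assumption: integer keys */

typedef struct tr_n_t {
    key_t          key;
    struct tr_n_t *left;
    struct tr_n_t *right;
} tree_node_t;

/* Left rotation around n: the right child's key moves up into n,
   and n's old key moves down into the left child. */
void left_rotation(tree_node_t *n) {
    /* The right child must be an interior node (it needs two children). */
    assert(n != NULL && n->right != NULL && n->right->right != NULL);
    tree_node_t *tmp_node = n->left;
    key_t        tmp_key  = n->key;
    n->left        = n->right;          /* old right child becomes left child */
    n->key         = n->right->key;
    n->right       = n->left->right;
    n->left->right = n->left->left;
    n->left->left  = tmp_node;          /* old left subtree moves one level down */
    n->left->key   = tmp_key;
}
```

Applying a left rotation and then a right rotation to the same node restores the original shape, which is a convenient sanity check when experimenting.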
Theorem:
A search tree for n objects has height at least ⌈log n⌉ and at most n − 1
(logarithms are base 2). It is easy to see that both bounds can be reached.
The height is the worst-case distance we have to traverse to reach
a specific object in the search tree. Another related measure of quality of
a search tree is the average depth of the leaves, that is, the average over
all objects of the distance we have to go to reach that object. Here the
bounds are:
Theorem:
A search tree for n objects has average depth at least log n and at most
(n−1)(n+2) / (2n) ≈ n/2.
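As a small worked illustration of the two theorems, the bounds can be evaluated for a concrete n. The function names below are hypothetical helpers introduced only for this example. For n = 8 they give a minimum height of 3, a maximum height of 7, and a maximum average depth of 7·10/16 = 4.375.

```c
#include <assert.h>

/* Minimum height: the smallest h with 2^h >= n, i.e. ceil(log2 n). */
int min_height(int n) {
    assert(n >= 1);
    int  h = 0;
    long capacity = 1;              /* leaves a tree of height h can hold */
    while (capacity < n) {
        capacity *= 2;
        h++;
    }
    return h;
}

/* Maximum height: reached by a path-shaped tree. */
int max_height(int n) {
    assert(n >= 1);
    return n - 1;
}

/* Upper bound on the average leaf depth: (n-1)(n+2) / (2n).
   In the path-shaped tree the leaf depths are 1, 2, ..., n-1, n-1. */
double max_average_depth(int n) {
    assert(n >= 1);
    return (double)(n - 1) * (n + 2) / (2.0 * n);
}
```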
In the first case, one uses that the function x log x is convex, so
a log a + b log b ≥ (a + b) log((a + b)/2).
• find(tree, query_key): Returns the object associated with query_key, if there is
one;
• insert(tree, key, object): Inserts the (key, object) pair in the tree; and
• delete(tree, key): Deletes the object associated with key from the tree.
1. Find Operation
Definition: The find operation searches for a specific key in the search tree and
returns the associated object if the key exists.
Algorithm:
• Start at the root of the tree. If the tree is empty, return NULL.
• Traverse the tree based on comparisons until a leaf is reached:
• If the query key is less than the current node's key, move to the
left child.
• Otherwise (the query key is greater than or equal to the current
node's key), move to the right child.
• At the leaf, if the key matches the query key, return the associated object.
• If the key does not match, return NULL.
Code Implementation: Here are two versions of the find function: an iterative
version and a recursive version.
Recursive Version:
object_t *find(tree_node_t *tree, key_t query_key) {
    if (tree->left == NULL ||
        (tree->right == NULL && tree->key != query_key)) {
        return NULL; // Tree is empty, or the leaf reached has a different key
    } else if (tree->right == NULL && tree->key == query_key) {
        return (object_t *) tree->left; // Leaf found: its left pointer holds the object
    } else {
        if (query_key < tree->key) {
            return find(tree->left, query_key);  // Search left subtree
        } else {
            return find(tree->right, query_key); // Search right subtree
        }
    }
}
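Iterative Version: a minimal sketch using the same leaf-tree conventions as the recursive code (an empty tree has left == NULL, a leaf has right == NULL and stores its object in the left pointer). The name find_iterative, the three-field node type, and int for key_t and object_t are assumptions made so the example is self-contained.

```c
#include <assert.h>
#include <stddef.h>

typedef int key_t;      /* assumption: integer keys */
typedef int object_t;   /* assumption: placeholder object type */

typedef struct tr_n_t {
    key_t          key;
    struct tr_n_t *left;
    struct tr_n_t *right;
} tree_node_t;

object_t *find_iterative(tree_node_t *tree, key_t query_key) {
    assert(tree != NULL);
    if (tree->left == NULL) {
        return NULL;                        /* empty tree */
    }
    tree_node_t *tmp_node = tree;
    while (tmp_node->right != NULL) {       /* interior node: keep descending */
        if (query_key < tmp_node->key) {
            tmp_node = tmp_node->left;
        } else {
            tmp_node = tmp_node->right;
        }
    }
    /* tmp_node is now a leaf; its left pointer stores the object. */
    if (tmp_node->key == query_key) {
        return (object_t *) tmp_node->left;
    }
    return NULL;
}
```

The loop does one comparison per level and tests for equality only once, at the leaf, so it avoids the recursion overhead without changing the result.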
2. Insert Operation
Definition: The insert operation adds a new (key, object) pair to the search tree.
Algorithm:
• Start at the root of the tree. If the tree is empty, store the new key
and object in the root.
• Traverse the tree to find the correct position for the new key:
• If the new key is less than the current node's key, move to the
left child.
• Otherwise (the new key is greater than or equal to the current
node's key), move to the right child.
• If a leaf node is reached and the key already exists, return an error.
• Create a new interior node and a new leaf node for the new key and
object.
• Adjust the pointers to maintain the binary search tree properties.
Code Implementation:
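A minimal sketch of the insertion steps above, assuming the leaf-tree conventions of the find code (objects stored in the left pointer of a leaf, right == NULL marking a leaf). It uses malloc directly, and int for key_t and object_t; both are assumptions for illustration. In this variant the leaf that was reached becomes the new interior node and two fresh leaves are allocated below it, which yields the same tree shape as adding one interior node and one leaf.

```c
#include <assert.h>
#include <stdlib.h>

typedef int key_t;      /* assumption: integer keys */
typedef int object_t;   /* assumption: placeholder object type */

typedef struct tr_n_t {
    key_t          key;
    struct tr_n_t *left;
    struct tr_n_t *right;
} tree_node_t;

/* Returns 0 on success, -1 if the key already exists or memory runs out. */
int insert(tree_node_t *tree, key_t new_key, object_t *new_object) {
    assert(tree != NULL && new_object != NULL);
    if (tree->left == NULL) {               /* empty tree: root becomes a leaf */
        tree->left  = (tree_node_t *) new_object;
        tree->key   = new_key;
        tree->right = NULL;
        return 0;
    }
    tree_node_t *tmp_node = tree;
    while (tmp_node->right != NULL) {       /* descend to the candidate leaf */
        tmp_node = (new_key < tmp_node->key) ? tmp_node->left
                                             : tmp_node->right;
    }
    if (tmp_node->key == new_key) {
        return -1;                          /* duplicate key */
    }
    tree_node_t *old_leaf = malloc(sizeof *old_leaf);
    tree_node_t *new_leaf = malloc(sizeof *new_leaf);
    if (old_leaf == NULL || new_leaf == NULL) {
        free(old_leaf);
        free(new_leaf);
        return -1;
    }
    *old_leaf       = *tmp_node;            /* copy of the leaf that was found */
    new_leaf->key   = new_key;
    new_leaf->left  = (tree_node_t *) new_object;
    new_leaf->right = NULL;
    if (tmp_node->key < new_key) {          /* new key goes to the right */
        tmp_node->left  = old_leaf;
        tmp_node->right = new_leaf;
        tmp_node->key   = new_key;
    } else {                                /* new key goes to the left */
        tmp_node->left  = new_leaf;
        tmp_node->right = old_leaf;
    }
    return 0;
}
```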
3. Delete Operation
Definition: The delete operation removes a (key, object) pair from the search
tree based on the specified key. If the key exists, the associated object is
returned, and the tree is restructured to maintain its properties.
Algorithm
1. Check for Empty Tree: If the tree is empty (i.e., the left child of the
root is NULL), return NULL since there is nothing to delete.
2. Leaf Node Deletion: If the current node is a leaf (i.e., it has no right
child):
• If the key of the current node matches the delete_key, store the
object, set the left child to NULL, and return the object.
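A minimal sketch of deletion in the same leaf-tree representation. The empty-tree and single-leaf cases follow steps 1 and 2; the interior case (the sibling of the deleted leaf takes the place of their common parent) is the usual leaf-tree approach and is an assumption here, since that step is not part of the text above. The name delete_object, the use of free, and int for key_t and object_t are also assumptions; all nodes except the root are expected to come from malloc.

```c
#include <assert.h>
#include <stdlib.h>

typedef int key_t;      /* assumption: integer keys */
typedef int object_t;   /* assumption: placeholder object type */

typedef struct tr_n_t {
    key_t          key;
    struct tr_n_t *left;
    struct tr_n_t *right;
} tree_node_t;

/* Returns the deleted object, or NULL if the key is not in the tree. */
object_t *delete_object(tree_node_t *tree, key_t delete_key) {
    assert(tree != NULL);
    if (tree->left == NULL) {
        return NULL;                        /* step 1: empty tree */
    }
    if (tree->right == NULL) {              /* step 2: the root is a leaf */
        if (tree->key != delete_key) {
            return NULL;
        }
        object_t *deleted = (object_t *) tree->left;
        tree->left = NULL;                  /* tree is empty again */
        return deleted;
    }
    tree_node_t *tmp_node = tree, *upper_node = NULL, *other_node = NULL;
    while (tmp_node->right != NULL) {       /* track parent and sibling */
        upper_node = tmp_node;
        if (delete_key < tmp_node->key) {
            tmp_node   = upper_node->left;
            other_node = upper_node->right;
        } else {
            tmp_node   = upper_node->right;
            other_node = upper_node->left;
        }
    }
    if (tmp_node->key != delete_key) {
        return NULL;
    }
    *upper_node = *other_node;              /* sibling replaces the parent */
    object_t *deleted = (object_t *) tmp_node->left;
    free(tmp_node);
    free(other_node);
    return deleted;
}
```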
• Implementation:
• Traverse the tree to find the appropriate leaf node for the
key.
• Insert the new object at the beginning of the linked list
associated with that key.
3. Delete Operation:
• Objective: Delete all objects associated with a given key.
• Time Complexity: O(h), where h is the height of the tree.
• Implementation:
• Traverse the tree to find the leaf node for the key.
• Remove the linked list of objects associated with that key.
• To efficiently manage memory, maintain an additional node
between the leaf and the linked list that contains pointers to
the beginning and end of the list. This allows for quick
deletion of the entire list in O(1) time.
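The list handling described above can be sketched as follows. All type and function names (list_item_t, object_list_t, list_push_front, list_release_all, free_list) are hypothetical, and recycling items through a global free list is an assumption about how the O(1) bulk deletion is meant to work: because the extra node knows both ends of the list, the whole list can be spliced onto the free list with two pointer updates.

```c
#include <assert.h>
#include <stdlib.h>

typedef int object_t;   /* assumption: placeholder object type */

typedef struct list_item {
    object_t         *object;
    struct list_item *next;
} list_item_t;

/* The additional node between a leaf and its list of objects. */
typedef struct {
    list_item_t *head;
    list_item_t *tail;
} object_list_t;

static list_item_t *free_list = NULL;   /* recycled items, ready for reuse */

/* Insert at the beginning of the list: O(1) once the leaf is found. */
int list_push_front(object_list_t *list, object_t *object) {
    assert(list != NULL);
    list_item_t *item;
    if (free_list != NULL) {            /* reuse a recycled item if possible */
        item = free_list;
        free_list = free_list->next;
    } else {
        item = malloc(sizeof *item);
        if (item == NULL) {
            return -1;
        }
    }
    item->object = object;
    item->next   = list->head;
    if (list->head == NULL) {
        list->tail = item;              /* first item is also the last */
    }
    list->head = item;
    return 0;
}

/* Delete every object for the key: O(1), independent of list length. */
void list_release_all(object_list_t *list) {
    assert(list != NULL);
    if (list->head == NULL) {
        return;
    }
    list->tail->next = free_list;       /* splice whole list onto free list */
    free_list  = list->head;
    list->head = NULL;
    list->tail = NULL;
}
```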
An interval (range) query finds all the keys within a specified range [a, b]
in a search tree, such as a Binary Search Tree (BST). The goal is to
efficiently retrieve all the keys that fall within the given interval.
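A minimal sketch of such a query, assuming an ordinary BST with one distinct int key per node (the names bst_node_t, collect, and range_query are hypothetical). Subtrees that cannot contain keys in [a, b] are skipped, so the work is proportional to the height of the tree plus the number of keys reported.

```c
#include <assert.h>
#include <stddef.h>

typedef struct bst_node {
    int              key;
    struct bst_node *left;
    struct bst_node *right;
} bst_node_t;

/* In-order walk that only enters subtrees which may overlap [a, b]. */
static void collect(const bst_node_t *node, int a, int b,
                    int *out, size_t capacity, size_t *count) {
    if (node == NULL) {
        return;
    }
    if (a < node->key) {                    /* left subtree may hold keys >= a */
        collect(node->left, a, b, out, capacity, count);
    }
    if (a <= node->key && node->key <= b) {
        if (*count < capacity) {
            out[*count] = node->key;        /* keys arrive in sorted order */
        }
        (*count)++;
    }
    if (node->key < b) {                    /* right subtree may hold keys <= b */
        collect(node->right, a, b, out, capacity, count);
    }
}

/* Stores up to `capacity` keys from [a, b] in `out`, in increasing order,
   and returns the total number of keys found in the interval. */
size_t range_query(const bst_node_t *root, int a, int b,
                   int *out, size_t capacity) {
    assert(out != NULL || capacity == 0);
    size_t count = 0;
    collect(root, a, b, out, capacity, &count);
    return count;
}
```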
An optimal search tree, also known as an optimal binary search tree (OBST), is a binary
search tree that minimizes the expected search cost, given the probabilities of searching for
each key. The process of building an OBST involves dynamic programming.
Here's a simplified explanation:
1. Identify Frequencies: Determine the probability of searching for each
key, as well as the probability of searches that result in no match (i.e.,
between keys).
2. Create Cost and Root Matrices: Use dynamic programming to
create two matrices: one for the cost and one for the root of the
subtree.
3. Fill Matrices: Calculate the optimal cost for subtrees of increasing
sizes, using the probabilities and previously computed values.
4. Construct Tree: Use the root matrix to construct the tree.
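The four steps can be sketched with the textbook dynamic program. The conventions here are assumptions chosen for illustration: p[1..n] holds the probabilities of the keys (p[0] is unused), q[0..n] holds the probabilities of unsuccessful searches falling between keys, every search pays depth + 1, and the name optimal_bst and the limit MAX_KEYS are hypothetical.

```c
#include <assert.h>
#include <float.h>

#define MAX_KEYS 16     /* assumption: small fixed upper limit for the sketch */

/* Fills root[i][j] with the best root for keys i..j and returns the
   minimal expected search cost for keys 1..n. */
double optimal_bst(const double p[], const double q[], int n,
                   int root[MAX_KEYS + 2][MAX_KEYS + 1]) {
    assert(n >= 0 && n <= MAX_KEYS);
    double e[MAX_KEYS + 2][MAX_KEYS + 1];   /* cost matrix */
    double w[MAX_KEYS + 2][MAX_KEYS + 1];   /* total probability of a range */

    for (int i = 1; i <= n + 1; i++) {      /* empty ranges: one failed search */
        e[i][i - 1] = q[i - 1];
        w[i][i - 1] = q[i - 1];
    }
    for (int len = 1; len <= n; len++) {    /* subtrees of increasing size */
        for (int i = 1; i + len - 1 <= n; i++) {
            int j = i + len - 1;
            w[i][j] = w[i][j - 1] + p[j] + q[j];
            e[i][j] = DBL_MAX;
            for (int r = i; r <= j; r++) {  /* try every key as the root */
                double cost = e[i][r - 1] + e[r + 1][j] + w[i][j];
                if (cost < e[i][j]) {
                    e[i][j]    = cost;
                    root[i][j] = r;
                }
            }
        }
    }
    return e[1][n];
}
```

The tree is then read off the root matrix: root[1][n] is the overall root, root[1][r-1] and root[r+1][n] are the roots of its left and right subtrees, and so on. With the three nested loops this version takes O(n^3) time.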
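Diagram:
Keys:          10   20   30
Probabilities: p1   p2   p3

    [10]
    /  \
 NULL  [20]
          \
          [30]

The diagram shows an example where 10, 20, and 30 are the keys, and the tree
is constructed to minimize the expected search cost. Which shape is optimal
depends on the probabilities; ignoring unsuccessful searches, the chain shown
is optimal when p1 ≥ p2 + p3 and p2 ≥ p3.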
This essentially performs rotations in the root till the left-lower neighbour is a leaf; then it
returns that leaf, moves the root down to the right, and returns the previous root.
Height-Balanced Trees:
A tree is height-balanced if, in each interior node, the height of the right
subtree and the height of the left subtree differ by at most 1. This is the
oldest balance criterion for trees, introduced and analyzed by G.M.
Adel'son-Vel'skiĭ and E.M. Landis (1962), and still the most popular
variant of balanced search trees (AVL trees). A height-balanced tree
necessarily has small height: at most about 1.44 log2(n) for n leaves.
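The balance criterion can be checked with one bottom-up pass. This is a hedged sketch: the node type hb_node_t and the function names are hypothetical, a single node counts as height 0, and a missing child is treated as an empty subtree of height -1 (in the leaf-tree model every interior node has two children, so that case does not arise there).

```c
#include <assert.h>
#include <stddef.h>

typedef struct hb_node {
    struct hb_node *left;
    struct hb_node *right;
} hb_node_t;

#define UNBALANCED (-2)     /* sentinel: some node violates the criterion */

/* Returns the height of the subtree, or UNBALANCED if any node in it has
   left and right subtree heights that differ by more than 1. */
static int checked_height(const hb_node_t *node) {
    if (node == NULL) {
        return -1;                          /* empty subtree */
    }
    int hl = checked_height(node->left);
    if (hl == UNBALANCED) {
        return UNBALANCED;
    }
    int hr = checked_height(node->right);
    if (hr == UNBALANCED) {
        return UNBALANCED;
    }
    int diff = hl - hr;
    if (diff < -1 || diff > 1) {
        return UNBALANCED;
    }
    return 1 + (hl > hr ? hl : hr);
}

/* Returns 1 if the tree is height-balanced, 0 otherwise. */
int is_height_balanced(const hb_node_t *root) {
    return checked_height(root) != UNBALANCED;
}
```

A real AVL implementation would store the height in each node and repair violations with rotations during insert and delete rather than recomputing heights like this.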
Weight-Balanced Trees:
When Adel'son-Vel'skiĭ and Landis invented the height-balanced search trees in 1962,
computers were extremely memory limited, so the applicability of the structure at that time
was small and only very few other papers on balanced search trees appeared in the 1960s.
But by 1970, technological development made it a feasible and useful structure, generating
much interest in the topic, and several alternative ways to maintain search trees at O(log n)
height were proposed. One natural alternative balance criterion is to balance the weight, that
is, the number of leaves, instead of the height of the subtrees.
The weight of a tree is the number of its leaves, so in a weight-balanced tree, the weight of
the left and right subtrees in each node should be "balanced". An α-weight-balanced tree
necessarily has small height.
A different method to keep the height of the trees small is to allow tree nodes of higher
degree. This idea was introduced as B-trees by Bayer and McCreight (1972) and turned out
to be very fruitful. It was originally intended as an external-memory data structure. The
characteristic of external memory is that access to it is very slow compared to main memory
and is done in blocks: units much larger than single main-memory locations, which are
transferred into main memory as a whole. In the 1970s, computers were still very
memory limited but usually already had a large external memory, so it was necessary
to consider how a structure operates when a large part of it is not in main memory but on
external memory. This situation is now less important, but it is still relevant for database
applications, where B-tree variants are still much used as index structures.
The problem with normal binary search trees as an external-memory structure is that each tree
node could be in a different external memory block, which becomes known only when the
previous block has been retrieved from external memory. So we might need as many
external memory block accesses as the height of the tree, which is at least log2(n), and
in each of these blocks, which are large enough to hold many nodes, we would be interested in
just a single node. The idea of B-trees is to take each block as a single node of high degree.
In the original version, each node has degree between a and 2a − 1, where a is chosen as
large as possible under the condition that a block must have room for 2a − 1 pointers and
keys. Then balance was maintained by the criterion that all leaves should be at the same
depth.
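As a small worked example of choosing a, the sketch below computes the largest a for which 2a − 1 keys and 2a − 1 pointers fit into one block. The function name btree_min_degree is hypothetical, and the calculation assumes a node stores nothing but keys and pointers (no header or padding). For a 4096-byte block with 8-byte keys and 8-byte pointers, 256 key/pointer pairs fit, so 2a − 1 ≤ 256 gives a = 128.

```c
#include <assert.h>
#include <stddef.h>

/* Largest a such that (2a - 1) keys and (2a - 1) pointers fit in a block. */
size_t btree_min_degree(size_t block_size, size_t key_size,
                        size_t pointer_size) {
    size_t slot = key_size + pointer_size;  /* one key plus one pointer */
    assert(slot > 0);
    size_t slots = block_size / slot;       /* upper limit for 2a - 1 */
    return (slots + 1) / 2;                 /* 2a - 1 <= slots */
}
```

With a = 128 each node has between 128 and 255 children, so a tree over a million keys needs only about three block accesses per search instead of the twenty or so levels of a binary tree.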
1. Red-Black Trees
Definition: A Red-Black Tree is a binary search tree with an additional color property for
each node, which can be either red or black. This coloring helps maintain balance during
insertions and deletions.
Properties: A Red-Black Tree must satisfy the following properties:
1. Node Color: Each node is either red or black.
2. Root Property: The root node is always black.
3. Red Property: Red nodes cannot have red children (no two reds in a row).
4. Black Property: Every path from a node to its descendant leaves must have the
same number of black nodes (black height).
5. Leaf Property: All leaves (NIL nodes) are considered black.
These properties ensure that the tree remains approximately balanced, leading to a height of
O(log n), where n is the number of nodes.
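The coloring properties can be verified with a short recursive check. This is a hedged sketch: the node layout rb_node_t and the function names are hypothetical, NULL pointers stand for the black NIL leaves, and only the color rules are checked (the search-tree ordering of the keys is not).

```c
#include <assert.h>
#include <stddef.h>

typedef enum { RED, BLACK } color_t;

typedef struct rb_node {
    int             key;
    color_t         color;
    struct rb_node *left;
    struct rb_node *right;
} rb_node_t;

/* Returns the black height of the subtree (counting the NIL leaf),
   or -1 if the red property or the black property is violated. */
static int black_height(const rb_node_t *node) {
    if (node == NULL) {
        return 1;                           /* NIL leaves count as black */
    }
    if (node->color == RED) {               /* red property: no red child */
        if ((node->left  != NULL && node->left->color  == RED) ||
            (node->right != NULL && node->right->color == RED)) {
            return -1;
        }
    }
    int lh = black_height(node->left);
    int rh = black_height(node->right);
    if (lh < 0 || rh < 0 || lh != rh) {     /* black property */
        return -1;
    }
    return lh + (node->color == BLACK ? 1 : 0);
}

/* Returns 1 if the coloring satisfies properties 2 to 5, 0 otherwise. */
int is_red_black(const rb_node_t *root) {
    if (root == NULL) {
        return 1;                           /* empty tree is trivially valid */
    }
    if (root->color != BLACK) {             /* root property */
        return 0;
    }
    return black_height(root) > 0;
}
```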
Operations:
1. Insertion:
• Insert the new node as in a regular binary search tree.
• Color the new node red.
• If the parent node is red, perform rotations and recoloring to restore the
Red-Black properties.
2. Deletion:
• Remove the node as in a regular binary search tree.
• If the node is black, additional adjustments are needed to maintain
the properties, which may involve rotations and recoloring.
Complexity:
• Time Complexity: O(log n) for search, insertion, and deletion.
• Space Complexity: O(n) for storing the nodes.
2. Trees of Almost Optimal Height
Definition: Trees of almost optimal height are trees whose height stays close to the
theoretical minimum for a given number of nodes. For binary search trees, the minimum
possible height is about log2(n). Red-Black Trees stay within a constant factor of this by
ensuring that the longest path from the root to a leaf is no more than twice the length of
the shortest path.
Height Characteristics:
Conclusion
Red-Black Trees are a powerful data structure that maintains balance through color
properties, ensuring efficient operations for dynamic datasets. Their height characteristics
guarantee that they remain close to optimal height, providing logarithmic time complexity
for
Search Trees: Two Models of Search Trees, General Properties, and Transformations
Introduction
Search trees are a fundamental data structure used to maintain a dynamic set of items,
allowing for efficient search, insertion, and deletion operations. They are particularly useful
for applications that require frequent access to sorted data.
Two Models of Search Trees