Index techniques
Two different techniques:
B trees
B+ trees
Why not use a binary search tree (BST) as an index?
A BST may not be balanced
• E.g., one subtree may have many nodes while the
other has only a few, giving poor performance
The depth of a balanced BST is still large
• A search visits about log2 n nodes, so it may need
about log2 n disk accesses, and each disk access is
very time-consuming
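To put rough numbers on this argument, the sketch below counts how many levels a tree needs before it can distinguish n keys, for branching factor 2 (a balanced BST) and branching factor 100 (a B tree with m = 100). The helper name levels_needed is hypothetical, and the count ignores the exact number of keys stored per node.

```python
def levels_needed(n, branching):
    """Smallest h such that branching**h >= n."""
    levels = 0
    capacity = 1
    while capacity < n:
        capacity *= branching
        levels += 1
    return levels


n = 1_000_000
bst_levels = levels_needed(n, 2)      # one node (one disk access) per level
btree_levels = levels_needed(n, 100)  # far fewer levels with a wide node
print(bst_levels, btree_levels)
```

With one disk access per level, the wider node trades a little in-memory work per node for far fewer disk accesses.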
B tree
A B tree is a balanced tree with the following
properties:
The root is either a leaf or has at least two
children.
Each node, except for the root and the leaves,
has between ⌈m/2⌉ and m children.
• m is usually very large, e.g., m = 100
All leaves are at the same level in the tree, so
the tree is always height balanced
B tree example: 2-3 tree, i.e., m = 3
Each internal node in a 2-3 tree has 2 or 3 children
A node contains one or two keys
All leaves are at the same level
The 2-3 Tree has a property analogous to the BST:
left subtree < 1st key;
1st key ≤ mid subtree < 2nd key;
right subtree ≥ 2nd key
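The ordering property above drives search in the same way it does in a BST. The following is a minimal sketch, assuming a hypothetical Node23 class and an example tree whose layout is reconstructed from the keys mentioned in these notes (it is an assumption, not a copy of the original figure).

```python
class Node23:
    def __init__(self, keys, children=None):
        self.keys = keys                # one or two keys, in sorted order
        self.children = children or []  # empty for a leaf


def search(node, key):
    """Return True if key is stored somewhere in the 2-3 tree."""
    while node is not None:
        if key in node.keys:
            return True
        if not node.children:
            return False
        # left subtree < 1st key; 1st key <= mid subtree < 2nd key;
        # right subtree >= 2nd key
        i = 0
        while i < len(node.keys) and key >= node.keys[i]:
            i += 1
        node = node.children[i]
    return False


# Assumed example tree (root 18, 33)
root = Node23([18, 33], [
    Node23([12], [Node23([10]), Node23([15])]),
    Node23([23, 30], [Node23([20, 21]), Node23([24]), Node23([31])]),
    Node23([48], [Node23([45, 47]), Node23([50, 52])]),
])
print(search(root, 21), search(root, 19))
```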
2-3 Tree
The advantage of the 2-3 Tree over the BST is
that it can be updated at low cost, e.g., insert 14
2-3 Tree Insertion, insert 55
Split the node that holds keys 50 and 52, and
promote the median of 50, 52, 55, i.e., 52, to its parent
2-3 Tree Insertion, insert 19
Split the node that holds 20, 21 and promote 20 to the node that holds 23, 30
Then split the node that holds 23, 30 and promote 23 to the root, which holds 18, 33
2-3 Tree Insertion, insert 19 (continued)
The root, which holds 18, 33, overflows when 23 is
promoted into it, so split the root and
promote 23 by creating a new root
The tree height increases by 1
But all leaves are still at the same level
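The split-and-promote steps above can be sketched in code. This is a minimal illustration, not a full 2-3 tree implementation: the Node class, the helper names, and the starting tree are assumptions reconstructed from the keys in these notes, and deletion is not handled.

```python
class Node:
    def __init__(self, keys, children=None):
        self.keys = list(keys)
        self.children = list(children) if children else []  # empty for a leaf


def _insert(node, key):
    """Insert below node. Return (promoted_key, new_right_node) on a split, else None."""
    i = 0
    while i < len(node.keys) and key >= node.keys[i]:
        i += 1
    if not node.children:                 # leaf: store the key here
        node.keys.insert(i, key)
    else:
        split = _insert(node.children[i], key)
        if split is None:
            return None
        promoted, right = split           # child split: absorb the promoted key
        node.keys.insert(i, promoted)
        node.children.insert(i + 1, right)
    if len(node.keys) <= 2:
        return None
    # Overflow (three keys): split the node and promote the median
    median = node.keys[1]
    right = Node(node.keys[2:], node.children[2:])
    node.keys = node.keys[:1]
    node.children = node.children[:2]
    return median, right


def insert(root, key):
    split = _insert(root, key)
    if split is None:
        return root
    median, right = split
    return Node([median], [root, right])  # new root: height grows by 1


def leaf_depths(node, depth=0):
    """Set of depths at which leaves occur (a single value if balanced)."""
    if not node.children:
        return {depth}
    depths = set()
    for child in node.children:
        depths |= leaf_depths(child, depth + 1)
    return depths


# Assumed starting tree (root 18, 33)
root = Node([18, 33], [
    Node([12], [Node([10]), Node([15])]),
    Node([23, 30], [Node([20, 21]), Node([24]), Node([31])]),
    Node([48], [Node([45, 47]), Node([50, 52])]),
])
for k in (14, 55, 19):
    root = insert(root, k)
print(root.keys, leaf_depths(root))
```

Inserting 14 fits in an existing leaf, 55 causes one leaf split, and 19 cascades splits up to the root, which is the only way the height ever grows.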
Node deletion in B tree
B+-Tree Example with order m=4
Each internal node should have from
m/2 = 2 to m = 4 children
A leaf has no more than m+1 = 5 records,
but at least ⌈(m+1)/2⌉ = 3 records
Nodes on the same level are linked in
order
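One benefit of linking the leaves in order is that a range query can walk sideways along the leaf level instead of returning to the root. A minimal sketch, assuming a hypothetical Leaf class with a next pointer and example leaf contents chosen to respect the 3-to-5 record bounds above:

```python
class Leaf:
    def __init__(self, keys):
        self.keys = keys   # sorted records in this leaf
        self.next = None   # link to the next leaf on the same level


def range_scan(leaf, lo, hi):
    """Collect all keys in [lo, hi], starting from the given leaf."""
    found = []
    while leaf is not None:
        for k in leaf.keys:
            if k > hi:
                return found       # keys are sorted, so we can stop early
            if k >= lo:
                found.append(k)
        leaf = leaf.next           # follow the sibling link
    return found


# Assumed leaf level of a small B+ tree
leaves = [Leaf([10, 12, 15]), Leaf([18, 19, 20, 21, 22]),
          Leaf([23, 30, 31]), Leaf([33, 45, 47]), Leaf([48, 50, 52])]
for a, b in zip(leaves, leaves[1:]):
    a.next = b

print(range_scan(leaves[1], 19, 31))
```

In a real B+ tree the starting leaf would be found by descending from the root with the lower bound; here it is passed in directly to keep the sketch short.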
B+-Tree Insertion
Insert 55
Similar to insertion in a B tree
B+-Tree Deletion (1)-delete 18
Just remove key 18 from its leaf node
B+-Tree Deletion (2)-delete 12
Borrow one record, 18, from its sibling so that
the leaf again has at least 3 records
B+-Tree Deletion (3)-delete 33
The node holding 33, 45, 47 cannot borrow from its
siblings, so it merges with its sibling node 48, 50, 52
Node 48 now has one fewer child, so it borrows one child from
its sibling node holding 18, 23, and the guide keys are updated
B-Tree Space Analysis (1)
Asymptotic cost of search, insertion, and deletion
of records in B-Trees is Θ(log n).
The base of the log is the (average) branching
factor of the tree.
Example: Consider a B+-Tree of order 100 with leaf nodes
containing m=100 records.
1 level B+-tree: Min: 0 records. Max: 100 records.
2 levels: Min: 2 leaves of 50 (100 records). Max: 100
leaves of 100 (10,000 records).
3 levels: Min: 2 x 50 leaves of 50 (5,000 records).
Max: 100^3 = 1,000,000 records.
4 levels: Min: 250,000 records (2 * 50 * 50 * 50). Max:
100^4 = 100 million records.
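The figures above follow one pattern, which the sketch below reproduces. The function name bplus_capacity and its parameters are hypothetical. It assumes that the root has at least 2 children, that every other internal node has between order/2 and order children, and that each leaf holds between leaf_max/2 and leaf_max records.

```python
def bplus_capacity(levels, order=100, leaf_max=100):
    """Return (min_records, max_records) for a B+-tree with the given levels."""
    if levels == 1:
        return 0, leaf_max                # the root is the only leaf
    half = order // 2
    # Root has at least 2 children; each further internal level multiplies by half
    min_leaves = 2 * half ** (levels - 2)
    max_leaves = order ** (levels - 1)
    return min_leaves * (leaf_max // 2), max_leaves * leaf_max


for h in range(1, 5):
    print(h, bplus_capacity(h))
```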
Advanced Tree Structures
13.1 Tries
Object space decomposition in a regular BST
Key space decomposition in tries
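To make the contrast concrete: a BST branches on the keys that happen to be stored (object space), while a trie branches on the structure of the key itself, here one character per level (key space). A minimal sketch of a character trie, with hypothetical names TrieNode, trie_insert, and trie_contains:

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # next character -> child node
        self.is_key = False  # True if a stored key ends at this node


def trie_insert(root, word):
    node = root
    for ch in word:
        # The branch taken depends only on the key's characters,
        # not on which other keys were inserted earlier
        node = node.children.setdefault(ch, TrieNode())
    node.is_key = True


def trie_contains(root, word):
    node = root
    for ch in word:
        node = node.children.get(ch)
        if node is None:
            return False
    return node.is_key


root = TrieNode()
for w in ("ant", "anteater", "goat"):
    trie_insert(root, w)
print(trie_contains(root, "ant"), trie_contains(root, "an"))
```

Because the shape depends only on the keys and not on insertion order, the same key set always yields the same trie.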
Balanced Trees
AVL Tree: based on BST, rotations during
insertion and deletion (2 kinds of rotations)
Splay Tree: rotation operations during insertion,
deletion, and searching (3 kinds of rotations)
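Both structures are built from the same primitive, the single rotation, which reshapes a subtree without changing its in-order sequence. The sketch below shows the two single rotations and how a double rotation composes them. The names N, rotate_left, rotate_right, and inorder are hypothetical, and the balance bookkeeping of an AVL tree and the splaying rules are deliberately left out.

```python
class N:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def rotate_right(y):
    """Lift y's left child above y; return the new subtree root."""
    x = y.left
    y.left = x.right
    x.right = y
    return x


def rotate_left(x):
    """Lift x's right child above x; return the new subtree root."""
    y = x.right
    x.right = y.left
    y.left = x
    return y


def inorder(n):
    return [] if n is None else inorder(n.left) + [n.key] + inorder(n.right)


# Single rotation: left-leaning chain 3 -> 2 -> 1
single = rotate_right(N(3, left=N(2, left=N(1))))

# Double rotation: zig-zag 3 -> 1 -> 2, rotate the child left, then the root right
zigzag = N(3, left=N(1, right=N(2)))
zigzag.left = rotate_left(zigzag.left)
double = rotate_right(zigzag)

print(single.key, inorder(single), double.key, inorder(double))
```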
Advanced Tree Structures II
Deadline: Dec. 24