Advanced Set Representation Methods: AVL Trees. 2-3 (-4) Trees. Union-Find Set ADT
AVL Trees
Problem with BSTs: a worst-case operation may take O(n) time. One solution is the AVL tree: a binary search tree with a balance condition. For every node in an AVL tree T, the heights of the left (TL) and right (TR) subtrees can differ by at most 1: $|h_L - h_R| \le 1$.
Two BSTs
Let n(h) denote the minimum number of nodes of an AVL tree of height h. It is easy to see that n(1) = 1 and n(2) = 2. For h > 2, a minimal AVL tree of height h contains:
the root node, one AVL subtree of height h-1 and another of height h-2.
DSA - lecture 4 - T.U.Cluj-Napoca - M. Joldos 3
That is, $n(h) = 1 + n(h-1) + n(h-2)$. Knowing $n(h-1) > n(h-2)$, we get $n(h) > 2n(h-2)$.
By induction: $n(h) > 2n(h-2)$, $n(h) > 4n(h-4)$, $n(h) > 8n(h-6)$, ..., $n(h) > 2^i n(h-2i)$.
Solving the base case we get: $n(h) > 2^{h/2-1}$.
Taking logarithms: $h < 2\log_2 n(h) + 2$, so the height of an AVL tree with n nodes is O(log n).
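The recurrence and the bound can be checked numerically. The following is a minimal sketch, assuming the slide's convention that a single-node tree has height 1; the function name `avl_min_nodes` is hypothetical.

```c
#include <assert.h>

/* Minimum number of nodes of an AVL tree of height h, from
   n(1) = 1, n(2) = 2, n(h) = 1 + n(h-1) + n(h-2).
   avl_min_nodes is an illustrative name, not taken from the lecture. */
unsigned long long avl_min_nodes(int h)
{
    assert(h >= 1);
    unsigned long long prev = 1;   /* n(1) */
    unsigned long long cur = 2;    /* n(2) */
    if (h == 1)
        return prev;
    for (int i = 3; i <= h; i++) {
        unsigned long long next = 1 + cur + prev;
        prev = cur;
        cur = next;
    }
    return cur;
}
```

The bound $n(h) > 2^{h/2-1}$ is equivalent to $4\,n(h)^2 > 2^h$, which can be tested in integer arithmetic for small h without floating point.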
Rotation features:
Nodes not in the subtree of the node rotated are unaffected.
A rotation takes constant time.
Before and after the rotation the tree is still a BST.
Code for a left rotation is symmetric to code for a right rotation.
Single Rotation
k1 < k2
All elements in subtree A are smaller than k1.
All elements in subtree C are larger than k2.
All elements in subtree B are in between k1 and k2.
Note that subtrees B and C are empty
typedef struct AVLNode *AVLPtr;   /* declared first so the struct can refer to it */
typedef struct AVLNode { ElementT element; AVLPtr left; AVLPtr right; int height; } AVLNode;
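A single rotation on this node type might look like the sketch below. It assumes `ElementT` is `int`, that an empty subtree has height 0, and that k2 is the subtree root with left child k1, as in the ordering conditions above. The helper names (`height`, `update_height`, `rotate_right`, `rotate_left`) are illustrative and not taken from the lecture.

```c
#include <assert.h>
#include <stddef.h>

typedef int ElementT;                 /* assumption: integer elements */
typedef struct AVLNode *AVLPtr;
typedef struct AVLNode { ElementT element; AVLPtr left; AVLPtr right; int height; } AVLNode;

/* Height of a possibly empty subtree (empty = 0, single node = 1). */
static int height(AVLPtr t) { return t ? t->height : 0; }

static void update_height(AVLPtr t)
{
    int hl = height(t->left), hr = height(t->right);
    t->height = 1 + (hl > hr ? hl : hr);
}

/* Single right rotation: k1 (left child of k2) becomes the subtree root.
   Subtree B (right child of k1) is reattached as the left child of k2. */
AVLPtr rotate_right(AVLPtr k2)
{
    assert(k2 != NULL && k2->left != NULL);
    AVLPtr k1 = k2->left;
    k2->left = k1->right;
    k1->right = k2;
    update_height(k2);                /* k2 is now below k1: update it first */
    update_height(k1);
    return k1;
}

/* Single left rotation: the mirror image of rotate_right. */
AVLPtr rotate_left(AVLPtr k1)
{
    assert(k1 != NULL && k1->right != NULL);
    AVLPtr k2 = k1->right;
    k1->right = k2->left;
    k2->left = k1;
    update_height(k1);
    update_height(k2);
    return k2;
}
```

Each rotation touches only two nodes and a constant number of pointers, which is why it takes constant time. The caller is expected to store the returned pointer in place of the old subtree root.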
Subtree X is empty
Single rotation left: when a node is inserted in the right subtree of the right child (B) of the nearest ancestor (A) with balance factor -2.
Single rotation right: when a node is inserted in the left subtree of the left child (B) of the nearest ancestor (A) with balance factor +2.
Right-left double rotation: when a node is inserted in the left subtree of the right child (B) of the nearest ancestor (A) with balance factor -2.
Left-right double rotation: when a node is inserted in the right subtree of the left child (B) of the nearest ancestor (A) with balance factor +2.
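The four cases can be told apart from two balance factors alone. The sketch below assumes the balance factor is height(left) minus height(right), which matches the signs used above (-2 means right-heavy). The enum and the function `choose_rotation` are hypothetical names introduced for illustration.

```c
#include <assert.h>

typedef enum {
    NO_ROTATION,
    SINGLE_LEFT,
    SINGLE_RIGHT,
    DOUBLE_RIGHT_LEFT,
    DOUBLE_LEFT_RIGHT
} Rotation;

/* bf_a: balance factor of the unbalanced ancestor A.
   bf_b: balance factor of the child B on A's heavy side
         (right child if bf_a == -2, left child if bf_a == +2).
   After an insertion bf_b is -1 or +1; the value 0 can only arise
   after a deletion, where a single rotation is sufficient. */
Rotation choose_rotation(int bf_a, int bf_b)
{
    assert(bf_a >= -2 && bf_a <= 2);
    if (bf_a == -2)
        return (bf_b <= 0) ? SINGLE_LEFT : DOUBLE_RIGHT_LEFT;
    if (bf_a == 2)
        return (bf_b >= 0) ? SINGLE_RIGHT : DOUBLE_LEFT_RIGHT;
    return NO_ROTATION;
}
```

For example, inserting into the left subtree of the right child leaves B with balance factor +1 and A with -2, so the function selects the right-left double rotation.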
A node is deleted using the standard inorder successor (predecessor) logic for binary search trees.
Imbalance is fixed using rotations.
Identify the parent of the actual node that was deleted, then:
Demos from:
http://webpages.ull.es/users/jriera/Docencia/AVL/AVL%20tree%20applet.htm
http://www.site.uottawa.ca/~stan/csi2514/applets/avl/BT.html
A single restructuring (rotation) is O(1) using a linked-structure binary tree.
find is O(log n): the height of the tree is O(log n), and no restructuring is needed.
insert is O(log n): the initial find is O(log n); restructuring up the tree while maintaining heights is O(log n).
remove is O(log n): the initial find is O(log n); restructuring up the tree while maintaining heights is O(log n).
2-3 Trees
2-3 tree properties:
Each interior node has two or three children.
Each path from the root to a leaf has the same length.
A tree with zero or one node is a special case of a 2-3 tree.
Elements are placed at the leaves. If element a is to the left of element b, then a < b must hold.
The ordering of elements is based on one field of a record: a key.
Each interior node records the key of the smallest descendant of its second child and, if there is a third child, the key of the smallest descendant of the third child.
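The two keys stored at each interior node are enough to route a search down to the right leaf. The following is a minimal sketch of such a lookup, assuming integer keys; the type `Node23`, its field names, and the function `find23` are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Node23 {
    int is_leaf;              /* 1 for a leaf, 0 for an interior node */
    int key;                  /* leaf: key of the stored element */
    int low2;                 /* interior: smallest key under the second child */
    int low3;                 /* interior: smallest key under the third child, if any */
    struct Node23 *child[3];  /* interior: child[2] is NULL when there are two children */
} Node23;

/* Returns the leaf holding key, or NULL if the key is absent.
   Handles the special cases of an empty tree (NULL) and a single leaf. */
const Node23 *find23(const Node23 *t, int key)
{
    while (t != NULL && !t->is_leaf) {
        assert(t->child[0] != NULL && t->child[1] != NULL);
        if (key < t->low2)
            t = t->child[0];
        else if (t->child[2] == NULL || key < t->low3)
            t = t->child[1];
        else
            t = t->child[2];
    }
    return (t != NULL && t->key == key) ? t : NULL;
}
```

Because every root-to-leaf path has the same length, the loop runs the same number of times for every key, which is O(log n) in the number of leaves.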
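If the parent already has three children, split it into two nodes, node and node'.
The two smallest elements among the four children of node stay with node.
The two larger become the children of node'.
Continue the process up the tree: node' must now be inserted among the children of the parent of node.
Special case when splitting the root: a new root is created whose two children are node and node', so the tree grows by one level.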
2-3-4 trees.
The 2, 3 and 4 in the name 2-3-4 tree refer to how many links to child nodes can potentially be contained in a given node. For non-leaf nodes, three arrangements are possible:
A node with one data item always has two children.
A node with two data items always has three children.
A node with three data items always has four children.
In short, a non-leaf node must always have one more child than it has data items.
Symbolically, if the number of child links is L and the number of data items is D, then L = D + 1.
Empty nodes are not allowed.
Insertion is similar to insertion in 2-3 trees.
Items are inserted at the leaves.
Since a 4-node cannot take another item, 4-nodes are split up during the insertion process.
Strategy
On the way from the root down to the leaf, split up all 4-nodes encountered.
Insertion can then be done in one pass (remember: in 2-3 trees, a reverse pass might be necessary).
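The one-pass strategy can be sketched as follows. This is an illustrative implementation under stated assumptions, not the lecture's code: keys are integers, duplicates are simply stored again, and all names (`Node234`, `split_child`, `insert234`, `contains234`, `leaf_depth`) are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Node234 {
    int nkeys;                  /* D: number of data items, 1..3 */
    int keys[3];                /* items in increasing order */
    struct Node234 *child[4];   /* L = D + 1 links; all NULL in a leaf */
} Node234;

static Node234 *new_node(void)
{
    Node234 *n = calloc(1, sizeof *n);
    assert(n != NULL);
    return n;
}

/* Split the 4-node p->child[i]; its middle item moves up into p (p is not full). */
static void split_child(Node234 *p, int i)
{
    Node234 *c = p->child[i], *r = new_node();
    r->nkeys = 1;
    r->keys[0] = c->keys[2];
    r->child[0] = c->child[2];
    r->child[1] = c->child[3];
    c->child[2] = c->child[3] = NULL;
    c->nkeys = 1;
    for (int k = p->nkeys; k > i; k--) {   /* make room in the parent */
        p->keys[k] = p->keys[k - 1];
        p->child[k + 1] = p->child[k];
    }
    p->keys[i] = c->keys[1];
    p->child[i + 1] = r;
    p->nkeys++;
}

/* Top-down insertion: every 4-node met on the way down is split first. */
Node234 *insert234(Node234 *root, int key)
{
    if (root == NULL) {
        root = new_node();
        root->keys[0] = key;
        root->nkeys = 1;
        return root;
    }
    if (root->nkeys == 3) {                /* splitting the root adds a level */
        Node234 *s = new_node();
        s->child[0] = root;
        split_child(s, 0);
        root = s;
    }
    Node234 *n = root;
    while (n->child[0] != NULL) {          /* descend through interior nodes */
        int i = 0;
        while (i < n->nkeys && key > n->keys[i]) i++;
        if (n->child[i]->nkeys == 3) {
            split_child(n, i);
            if (key > n->keys[i]) i++;
        }
        n = n->child[i];
    }
    int k = n->nkeys;                      /* the leaf reached has room */
    while (k > 0 && n->keys[k - 1] > key) { n->keys[k] = n->keys[k - 1]; k--; }
    n->keys[k] = key;
    n->nkeys++;
    return root;
}

int contains234(const Node234 *n, int key)
{
    while (n != NULL) {
        int i = 0;
        while (i < n->nkeys && key > n->keys[i]) i++;
        if (i < n->nkeys && n->keys[i] == key) return 1;
        n = n->child[i];
    }
    return 0;
}

/* Common depth of all leaves, or -1 if the leaves are at different depths. */
int leaf_depth(const Node234 *n)
{
    if (n == NULL) return 0;
    if (n->child[0] == NULL) return 1;
    int d = leaf_depth(n->child[0]);
    for (int i = 1; i <= n->nkeys; i++)
        if (d < 0 || leaf_depth(n->child[i]) != d) return -1;
    return d + 1;
}
```

Because a 4-node is split before the search enters it, the parent of any node being split always has room for the item that moves up, so no pass back up the tree is needed.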
Deletion is similar to deletion in 2-3 trees.
Items are deleted at the leaves: swap the item of an internal node with its inorder successor.
Note: a 2-node leaf creates a problem, because removing its only item would leave an empty node.
Start with a collection of objects, each in a set by itself; combine sets in some order, and from time to time ask which set a particular object is in.
Equivalence classes:
If set S has an equivalence relation (reflexive, symmetric, transitive) defined on it, then S can be partitioned into disjoint subsets $S_1, S_2, \ldots$ with $S = \bigcup_k S_k$.
Equivalence problem:
Given a set S and a sequence of statements of the form $a \equiv b$, process the statements in order in such a way that at any time we are able to determine to which equivalence class a given element belongs.
union(A, B) takes the union of the components A and B and calls the result either A or B, arbitrarily.
find(x) is a function that returns the name of the component of which x is a member.
initial(A, x) creates a component named A that contains only the element x.
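The three operations can be sketched with a plain array that stores, for each element, the name of its component. This is a deliberately simple illustration under assumptions: elements and component names are small integers, the merged component is always called A, and the merge is named `set_union` because `union` is a reserved word in C. The type `Partition` and the constant names are hypothetical.

```c
#include <assert.h>

#define MAX_ELEMS 100
#define NO_COMPONENT (-1)

typedef struct {
    int name[MAX_ELEMS];   /* name[x] = name of the component containing x */
} Partition;

/* Mark every element as belonging to no component yet. */
void partition_reset(Partition *p)
{
    for (int x = 0; x < MAX_ELEMS; x++)
        p->name[x] = NO_COMPONENT;
}

/* initial(A, x): create a component named a containing only x. */
void initial(Partition *p, int a, int x)
{
    assert(x >= 0 && x < MAX_ELEMS);
    p->name[x] = a;
}

/* find(x): name of the component of which x is a member. */
int find(const Partition *p, int x)
{
    assert(x >= 0 && x < MAX_ELEMS);
    return p->name[x];
}

/* union(A, B): merge component b into component a; the result is called a. */
void set_union(Partition *p, int a, int b)
{
    for (int x = 0; x < MAX_ELEMS; x++)
        if (p->name[x] == b)
            p->name[x] = a;
}
```

In this representation find takes constant time, but every union scans all elements, so it costs time proportional to the size of the universe.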
Examples
List based
Tree based
When performing a union, make the root of the smaller tree point to the root of the larger one.
Each time we follow a pointer, we are going to a subtree of size at least double the size of the previous subtree.
Thus, we will follow at most O(log n) pointers for any find.
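The union-by-size rule above can be sketched with two arrays, one for parent pointers and one for tree sizes. The names (`DisjointSets`, `ds_init`, `ds_find`, `ds_union`, `ds_hops`) and the fixed capacity are assumptions made for this illustration.

```c
#include <assert.h>

#define DS_MAX 1024

typedef struct {
    int parent[DS_MAX];   /* parent[x] == x marks a root */
    int size[DS_MAX];     /* size[r] = number of elements in the tree rooted at r */
    int n;
} DisjointSets;

/* Start with n singleton sets 0 .. n-1. */
void ds_init(DisjointSets *ds, int n)
{
    assert(n >= 0 && n <= DS_MAX);
    ds->n = n;
    for (int i = 0; i < n; i++) {
        ds->parent[i] = i;
        ds->size[i] = 1;
    }
}

/* Follow parent pointers up to the root of x's tree. */
int ds_find(const DisjointSets *ds, int x)
{
    assert(x >= 0 && x < ds->n);
    while (ds->parent[x] != x)
        x = ds->parent[x];
    return x;
}

/* Number of pointers followed by a find on x. */
int ds_hops(const DisjointSets *ds, int x)
{
    int hops = 0;
    while (ds->parent[x] != x) {
        x = ds->parent[x];
        hops++;
    }
    return hops;
}

/* Union by size: the root of the smaller tree points to the root of the larger. */
int ds_union(DisjointSets *ds, int a, int b)
{
    int ra = ds_find(ds, a), rb = ds_find(ds, b);
    if (ra == rb)
        return ra;
    if (ds->size[ra] < ds->size[rb]) {
        int tmp = ra; ra = rb; rb = tmp;
    }
    ds->parent[rb] = ra;
    ds->size[ra] += ds->size[rb];
    return ra;
}
```

With 16 elements merged pairwise into one set, no find follows more than 4 pointers, in line with the O(log n) bound.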
After performing a find, compress all the pointers on the path just traversed so that they all point to the root.
This implies O(n log* n) time for performing n union-find operations.
A tighter time bound is O(m α(n)) for a sequence of m operations on n elements, where α(n) is a very slowly growing function.
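Path compression can be sketched as a two-pass find over a parent array: the first pass locates the root, the second redirects every node on the path to it. The function name `find_compress` and the convention that a root is its own parent are assumptions made for this illustration.

```c
#include <assert.h>

/* find with path compression on a parent-pointer forest.
   parent[r] == r marks a root. */
int find_compress(int parent[], int x)
{
    assert(x >= 0);
    int root = x;
    while (parent[root] != root)       /* pass 1: locate the root */
        root = parent[root];
    while (parent[x] != root) {        /* pass 2: repoint the path at the root */
        int next = parent[x];
        parent[x] = root;
        x = next;
    }
    return root;
}
```

For instance, on the chain 4 -> 3 -> 2 -> 1 -> 0 a single find on element 4 leaves every element pointing directly at 0, so later finds on those elements follow just one pointer.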
$$A_k(j) = \begin{cases} j + 1, & \text{if } k = 0 \\ A_{k-1}^{(j+1)}(j), & \text{if } k \ge 1 \end{cases}$$
where $A_{k-1}^{(0)}(j) = j$ and $A_{k-1}^{(i)}(j) = A_{k-1}(A_{k-1}^{(i-1)}(j))$ for $i \ge 1$. Here k is called the level of the function and i is the number of iterations.
$A_k(j)$ strictly increases with both j and k. Let us see how quick the increase is.
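Inverse of $A_k(n)$: $\alpha(n)$
$\alpha(n) = \min\{k : A_k(1) \ge n\}$
$$\alpha(n) = \begin{cases} 0 & \text{for } 0 \le n \le 2 \\ 1 & \text{for } n = 3 \\ 2 & \text{for } 4 \le n \le 7 \\ 3 & \text{for } 8 \le n \le 2047 \\ 4 & \text{for } 2048 \le n \le A_4(1) \end{cases}$$
Extremely slowly increasing function: $\alpha(n) \le 4$ for all practical purposes.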
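The small values of $A_k(1)$ and the resulting table for $\alpha(n)$ can be reproduced directly from the definitions. The sketch below is only meant for tiny arguments, since $A_k(j)$ overflows any machine integer almost immediately; the names `ack` and `alpha` are hypothetical.

```c
#include <assert.h>

typedef unsigned long long u64;

/* A_k(j): j + 1 for k = 0, otherwise A_{k-1} applied (j + 1) times to j.
   Only safe for very small k and j; larger arguments overflow u64. */
u64 ack(int k, u64 j)
{
    assert(k >= 0);
    if (k == 0)
        return j + 1;
    u64 r = j;
    for (u64 i = 0; i < j + 1; i++)
        r = ack(k - 1, r);
    return r;
}

/* alpha(n) = min { k : A_k(1) >= n }.
   A_4(1) is far larger than any u64 value, so every n above
   A_3(1) = 2047 that fits in a u64 maps to 4. */
int alpha(u64 n)
{
    for (int k = 0; k <= 3; k++)
        if (ack(k, 1) >= n)
            return k;
    return 4;
}
```

Evaluating `ack(k, 1)` for k = 0, 1, 2, 3 gives 2, 3, 7 and 2047, which are exactly the boundaries in the table for $\alpha(n)$.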
Reading
AHU, chapter 5, sections 5.4, 5.5
Preiss, chapter: Search Trees, section: AVL Search Trees
CLR, chapter 22, sections 1-4
Notes