
MODULE II

ADVANCED DATA STRUCTURES AND GRAPH ALGORITHMS
Self Balancing Tree - AVL Trees (Insertion and deletion operations with all
rotations in detail, algorithms not expected); Disjoint Sets- Disjoint set
operations, Union and find algorithms.
Graphs: DFS and BFS traversals - Analysis, Strongly Connected Components of
a Directed graph, Topological Sorting.

AVL TREES
Height balanced Trees
• A height balanced tree (self-balancing tree) is a binary tree that
automatically maintains its height, and that of its subtrees, on each
insertion and deletion of a node.
• AVL trees and red-black trees are examples of height balanced trees.
• An AVL tree uses a balance factor to keep the height of the tree balanced.
• The balance factor is simply the difference between the heights of the left
subtree and the right subtree.
AVL TREES: Adelson-Velsky and Landis
Definition:
• An empty binary tree B is an AVL tree.
• If B is non-empty with BL and BR its left and right subtrees, then B is an
AVL tree if
i. BL and BR are AVL trees, and
ii. |hL - hR| <= 1, where hL and hR are the heights of the left and right
subtrees respectively.
 To implement an AVL tree, each node must store a balance factor.
 Balance factor bf = hL - hR
 The bf of any node in an AVL tree can only take the values -1, 0, or 1.
Rotations in AVL trees
When we insert a node into an AVL tree, the insertion may make the tree
unbalanced. To rebalance, we first find the pivot node after calculating the
balance factor (bf) of each node. The node whose absolute value of bf changes
from 1 to 2 is marked as the pivot. If there is more than one such node, choose
the node nearest to the newly inserted node as the pivot.
There are basically four types of rotations which are as follows:
 LL rotation: the inserted node is in the left subtree of the left child of the pivot
 RR rotation: the inserted node is in the right subtree of the right child of the pivot
 LR rotation: the inserted node is in the right subtree of the left child of the pivot
 RL rotation: the inserted node is in the left subtree of the right child of the pivot

LL rotation
An AVL tree may become unbalanced if a node is inserted in the left subtree of
the left child of the pivot. The tree then needs a single right rotation.
RR rotation
If the tree becomes unbalanced when a node is inserted into the right subtree
of the right child of the pivot, we perform a single left rotation.
LR rotation
If the tree becomes unbalanced because a node was inserted in the right
subtree of the left child of the pivot, a left-right rotation is performed: a left
rotation followed by a right rotation.
RL rotation
If the tree becomes unbalanced because a node was inserted in the left subtree
of the right child of the pivot, a right-left rotation is performed: a right
rotation followed by a left rotation.
AVL TREE INSERTION
 Step 1: Insert the new element into the tree using BST (Binary Search
Tree) insertion logic.
 Step 2: After the insertion, check the balance factor of each node.
 Step 3: If the balance factor of every node is 0, 1, or -1, the tree is
balanced and the algorithm proceeds to the next operation.
 Step 4: If the balance factor of any node takes a value other than those
three, the tree is imbalanced. Perform the suitable rotation to rebalance
it, and then proceed to the next operation.
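Although the syllabus notes that full algorithms are not expected, the steps above can be sketched as a short recursive implementation. This is a minimal illustration assuming distinct keys; the class and function names are illustrative, not from the original notes:

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 1            # height of the subtree rooted here

def height(n):
    return n.height if n else 0

def balance_factor(n):
    return height(n.left) - height(n.right)   # bf = hL - hR

def update(n):
    n.height = 1 + max(height(n.left), height(n.right))

def rotate_right(y):               # single right rotation (LL case)
    x = y.left
    y.left = x.right
    x.right = y
    update(y); update(x)
    return x

def rotate_left(x):                # single left rotation (RR case)
    y = x.right
    x.right = y.left
    y.left = x
    update(x); update(y)
    return y

def insert(root, key):
    # Step 1: ordinary BST insertion
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    # Steps 2-4: update height, check bf, rotate if |bf| > 1
    update(root)
    bf = balance_factor(root)
    if bf > 1 and key < root.left.key:        # LL
        return rotate_right(root)
    if bf < -1 and key > root.right.key:      # RR
        return rotate_left(root)
    if bf > 1 and key > root.left.key:        # LR
        root.left = rotate_left(root.left)
        return rotate_right(root)
    if bf < -1 and key < root.right.key:      # RL
        root.right = rotate_right(root.right)
        return rotate_left(root)
    return root
```

For example, inserting 10, 20, 30, 40, 50, 25 in that order triggers an RR, another RR, and finally an RL rotation, leaving 30 at the root.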
AVL TREE DELETION
 Step 1: Find the node where the key k is stored.
 Step 2: Delete the contents of that node (call the node x).
 Step 3: Deleting a node in an AVL tree can always be reduced to deleting a
leaf. There are three possible cases:
 If x has no children, delete x itself.
 If x has one child, let x' be that child; replace the contents of x
with the contents of x', then delete x' (a leaf).
 If x has two children, find x's in-order successor z (which has no
left child), replace x's contents with z's contents, and delete z.
In all three cases we end up removing a leaf. After the deletion, retrace the
path back to the root, updating balance factors and performing rotations
wherever a node becomes unbalanced.
DISJOINT-SET
 Represents the mathematical concept of a set.
 A disjoint-set data structure is also called a union-find data
structure or a merge-find set.
 It keeps track of a collection of elements partitioned into a
number of disjoint (non-overlapping) subsets.
 It provides near-constant-time operations to add new sets, to
merge existing sets, and to determine whether two elements are in
the same set.
 Plays a key role in Kruskal's algorithm for finding the minimum
spanning tree of a graph.
 It can also be used to detect cycles in a graph.
How a Disjoint Set is constructed:
 A disjoint-set forest consists of a number of elements, each of
which stores an id and a parent pointer.
 The parent pointers of the elements are arranged to form one or
more trees, each representing a set.
 If an element's parent pointer points to no other element, that
element is the root of a tree and the representative member of
its set.
 A set may consist of only a single element. If an element has a
parent, the element is part of whatever set is identified by
following the chain of parents upwards until a representative
element (one without a parent) is reached at the root of the tree.

Disjoint Set Operations:


MakeSet(x): makes a new set by creating a new element whose parent
pointer points to itself. The self-pointer indicates that the element is
the representative member of its own set. The MakeSet operation has
O(1) time complexity.
Find(x): follows the chain of parent pointers from x upwards through
the tree until an element is reached whose parent is itself. That element
is the root of the tree and the representative member of the set to
which x belongs, and may be x itself.
Union(x, y): uses Find to determine the roots of the trees containing x
and y. If the roots are distinct, the trees are combined by attaching the
root of one to the root of the other. If this is done naively, such as by
always making x a child of y, the height of the trees can grow as O(n).
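The three operations above can be sketched as follows. This is a deliberately naive version, before the optimizations discussed below; the class name is illustrative:

```python
class DisjointSet:
    def __init__(self):
        self.parent = {}

    def make_set(self, x):
        self.parent[x] = x        # self-pointer: x is its own representative, O(1)

    def find(self, x):
        # follow parent pointers up to the representative (root)
        while self.parent[x] != x:
            x = self.parent[x]
        return x

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx != ry:
            self.parent[rx] = ry  # naive attachment: trees can grow to O(n) height
```

For example, after make_set on 1..4 and union(1, 2), union(3, 4), union(2, 3), all four elements share one representative.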

EXAMPLE:
Applications using Disjoint sets:
1. Representing network connectivity.
2. Image processing.
3. Game algorithms.
4. Kruskal's minimum spanning tree algorithm.
5. Detecting cycles in undirected graphs.

Union-Find Algorithm – Union by rank and path compression


How the Disjoint Set is optimized:
 Union by rank.
 Path compression.
Union by rank:
As noted above, if Union is done naively, such as by always making x a
child of y, the height of the trees can grow as O(n). Union by rank
optimizes this by always attaching the shorter tree to the root of the
taller tree.
To implement union by rank, each element is associated with a rank.
Initially a set has one element and a rank of zero.
If we union two sets and:
 both trees have the same rank, the resulting set's rank is one larger;
 the trees have different ranks, the resulting set's rank is the larger of
the two.
Ranks are used instead of heights or depths because path compression
changes the trees' heights over time.
Worst-case complexity: O(log N)

Path compression:
Path compression is a way of flattening the structure of the tree
whenever Find is used on it. Since each element visited on the way to a
root is part of the same set, all of these visited elements can be
reattached directly to the root. The resulting tree is much flatter,
speeding up future operations not only on these elements, but also on
those referencing them.
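Both optimizations can be sketched together. This is a minimal illustration in which the n elements are assumed to be numbered 0..n-1; the class name is illustrative:

```python
class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n          # rank: an upper bound on tree height

    def find(self, x):
        # path compression: reattach every visited element directly to the root
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return
        if self.rank[rx] < self.rank[ry]:   # attach the shorter tree
            rx, ry = ry, rx                 # to the root of the taller one
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:  # equal ranks: result is one larger
            self.rank[rx] += 1
```

With both optimizations combined, a sequence of operations runs in nearly constant amortized time per operation.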
GRAPHS

A graph is a pictorial representation of a set of objects in which some pairs
of objects are connected by links. The interconnected objects are represented
by points termed vertices, and the links that connect the vertices are
called edges. A graph G = (V, E) consists of a set of vertices V = {v1, v2,
v3, …} and a set of edges E = {e1, e2, e3, …}, such that each edge ek is
identified with an unordered pair (vi, vj) of vertices. The vertices vi and
vj associated with edge ek are called the end vertices of ek. The most
common representation of a graph is by means of a diagram.

Representations of Graphs
We can choose between two standard ways to represent a graph G = (V, E):
as a collection of adjacency lists or as an adjacency matrix.
The adjacency-list representation of a graph G = (V, E) consists of an array
Adj of |V| lists, one for each vertex in V. For each u ∈ V, the adjacency
list Adj[u] contains all the vertices v such that there is an edge (u, v) ∈ E.
That is, Adj[u] consists of all the vertices adjacent to u in G.

For the adjacency-matrix representation of a graph G = (V, E), we assume
that the vertices are numbered 1, 2, …, |V| in some arbitrary manner.
Then the adjacency-matrix representation of G consists of a |V| x |V|
matrix A = (aij) such that aij = 1 if (i, j) ∈ E, and aij = 0 otherwise.
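Both representations can be built in a few lines. This is a sketch assuming an undirected graph given as a list of edge pairs; the function names are illustrative:

```python
def adjacency_list(n, edges):
    """Build Adj for vertices numbered 1..n from a list of undirected edges."""
    adj = {u: [] for u in range(1, n + 1)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)     # omit this line for a directed graph
    return adj

def adjacency_matrix(n, edges):
    """Build the |V| x |V| 0/1 matrix; row and column 0 are unused."""
    a = [[0] * (n + 1) for _ in range(n + 1)]
    for u, v in edges:
        a[u][v] = a[v][u] = 1
    return a
```

The list form uses O(V + E) space and suits sparse graphs; the matrix uses O(V^2) space but answers "is (i, j) an edge?" in O(1).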

BREADTH FIRST SEARCH


Breadth-first search is one of the simplest algorithms for searching a
graph. Given a graph G = (V, E) and a distinguished source vertex s, breadth-
first search systematically explores the edges of G to “discover” every
vertex that is reachable from s. It computes the distance (smallest number
of edges) from s to each reachable vertex. It also produces a “breadth-first
tree” with root s that contains all reachable vertices. For any vertex v
reachable from s, the simple path in the breadth-first tree from s to v
corresponds to a “shortest path” from s to v in G, that is, a path containing
the smallest number of edges. The algorithm works on both directed and
undirected graphs.

To keep track of progress, breadth-first search colors each vertex white,
gray, or black. All vertices start out white and may later become gray and
then black. A vertex is discovered the first time it is encountered during
the search, at which time it becomes non-white. Breadth-first search
constructs a breadth-first tree, initially containing only its root, which is
the source vertex s. Whenever the search discovers a white vertex v in the
course of scanning the adjacency list of an already discovered vertex u, the
vertex v and the edge (u, v) are added to the tree. We say that u is the
predecessor or parent of v in the breadth-first tree.

The breadth-first-search procedure BFS below assumes that the input


graph G =(V,E) is represented using adjacency lists.
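A minimal sketch of the procedure in Python, using the white/gray/black coloring described above and assuming the graph is given as a dictionary of adjacency lists (the function and variable names are illustrative):

```python
from collections import deque

def bfs(adj, s):
    """adj: dict mapping each vertex to a list of its neighbours; s: source."""
    color = {u: 'white' for u in adj}
    dist = {u: None for u in adj}     # smallest number of edges from s
    parent = {u: None for u in adj}   # predecessor in the breadth-first tree
    color[s], dist[s] = 'gray', 0
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if color[v] == 'white':   # discovered for the first time
                color[v] = 'gray'
                dist[v] = dist[u] + 1
                parent[v] = u         # (u, v) becomes a tree edge
                q.append(v)
        color[u] = 'black'            # adjacency list fully scanned
    return dist, parent
```

Because only white vertices are enqueued and each is grayed immediately, every vertex enters the queue at most once, which is exactly what the analysis below relies on.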
Analysis
Before proving the various properties of breadth-first search, we take on
the somewhat easier job of analyzing its running time on an input graph
G = (V, E). After initialization, breadth-first search never whitens a vertex,
and thus the test for a white vertex before enqueuing ensures that each
vertex is enqueued at most once, and hence dequeued at most once. The
operations of enqueuing and dequeuing take O(1) time, and so the total
time devoted to queue operations is O(V). Because the procedure scans the
adjacency list of each vertex only when the vertex is dequeued, it scans
each adjacency list at most once. Since the sum of the lengths of all the
adjacency lists is Θ(E), the total time spent in scanning adjacency lists is
O(E). The overhead for initialization is O(V), and thus the total running
time of the BFS procedure is O(V + E). Thus, breadth-first search runs in
time linear in the size of the adjacency-list representation of G.
DEPTH FIRST SEARCH
Depth-first search explores edges out of the most recently discovered vertex v
that still has unexplored edges leaving it. Once all of v's edges have been
explored, the search “backtracks” to explore edges leaving the vertex from which
v was discovered. This process continues until we have discovered all the vertices
that are reachable from the original source vertex. If any undiscovered vertices
remain, then depth-first search selects one of them as a new source, and it
repeats the search from that source. The algorithm repeats this entire process
until it has discovered every vertex. As in breadth-first search, depth-first search
colors vertices during the search to indicate their state. Each vertex is initially
white, is grayed when it is discovered in the search, and is blackened when it is
finished, that is, when its adjacency list has been examined completely. Besides
creating a depth-first forest, depth-first search also timestamps each vertex.
Each vertex v has two timestamps: the first timestamp v.d records when v is first
discovered (and grayed), and the second timestamp v.f records when the search
finishes examining v's adjacency list (and blackens v). The procedure DFS
records when it discovers vertex u in the attribute u.d and when it finishes
vertex u in the attribute u.f. These timestamps are integers between 1 and
2|V|, since there is one discovery event and one finishing event for each of the
|V| vertices.
Depth-first search is another strategy for exploring a graph
■ Explore “deeper” in the graph whenever possible
■ Edges are explored out of the most recently discovered vertex v that
still has unexplored edges
■ When all of v’s edges have been explored, backtrack to the vertex
from which v was discovered
● Vertices initially colored white
● Then colored gray when discovered
● Then black when finished
ALGORITHM:
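A minimal sketch of the procedure in Python, using the white/gray/black coloring and the d/f timestamps described above; the graph is assumed to be a dictionary of adjacency lists, and the names are illustrative:

```python
def dfs(adj):
    """adj: dict mapping each vertex to a list of its neighbours."""
    color = {u: 'white' for u in adj}
    d, f = {}, {}              # discovery and finishing timestamps
    time = [0]                 # mutable counter shared with the inner function

    def visit(u):
        time[0] += 1
        d[u] = time[0]         # u discovered: gray it
        color[u] = 'gray'
        for v in adj[u]:
            if color[v] == 'white':
                visit(v)       # explore deeper before backtracking
        color[u] = 'black'     # adjacency list examined completely
        time[0] += 1
        f[u] = time[0]

    for u in adj:              # restart from a new source if vertices remain
        if color[u] == 'white':
            visit(u)
    return d, f
```

Each vertex receives one discovery and one finishing timestamp, so the timestamps are integers between 1 and 2|V|, as stated above.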
ANALYSIS:
● This running-time argument is an informal example of amortized analysis
■ “Charge” the exploration of each edge to the edge itself:
○ Each iteration of the loop in DFS_Visit can be attributed to an
edge in the graph
○ The loop runs once per edge in a directed graph, twice in an
undirected graph
○ Thus the loop runs in O(E) time, and the algorithm in O(V+E)
 This is considered linear for a graph, because an adjacency
list requires O(V+E) storage
■ It is important to be comfortable with this kind of reasoning and
analysis
EXAMPLE:
Classification of edges
Another interesting property of depth-first search is that the search can be used
to classify the edges of the input graph G = (V, E). We can define four edge types
in terms of the depth-first forest produced by a depth-first search on G:
1. Tree edge: leads to a new (white) vertex. The tree edges form a spanning
forest. Edge (u, v) is a tree edge if v was first discovered by exploring edge
(u, v). All edges followed by the depth-first search are tree edges.
2. Back edge: from a descendant to an ancestor; the search encounters a gray
vertex (gray to gray). Back edges are those edges (u, v) connecting a vertex u
to an ancestor v in a depth-first tree. We consider self-loops, which may
occur in directed graphs, to be back edges.
3. Forward edge: from an ancestor to a descendant, though not a tree edge;
from a gray node to a black node. Forward edges are those nontree edges
(u, v) connecting a vertex u to a descendant v in a depth-first tree. Edges
forming an alternative path to a descendant are forward edges.
4. Cross edge: between trees or subtrees; from a gray node to a black node.
Cross edges are all other edges. They can go between vertices in the same
depth-first tree, as long as one vertex is not an ancestor of the other, or they
can go between vertices in different depth-first trees. Edges that move to a
blackened vertex that is not a descendant are cross edges.

STRONGLY CONNECTED COMPONENTS


A strongly connected component of a directed graph G = (V, E) is a maximal set of
vertices C ⊆ V such that every pair of vertices u and v in C is reachable from
each other, i.e., there is a path from u to v and a path from v to u.
ALGORITHM:
1. Call DFS(G) to compute finishing times u.f for each vertex u.
2. Compute the transpose GT (G with the direction of every edge reversed).
3. Call DFS(GT), but in the main loop of DFS, consider the vertices in order of
decreasing u.f.
4. Output the vertices of each tree in the depth-first forest formed in step 3 as a
separate strongly connected component.
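The four steps above can be sketched in Python. This is a minimal illustration assuming the graph is a dictionary of adjacency lists; the function names are illustrative:

```python
def strongly_connected_components(adj):
    # Step 1: DFS on G, recording vertices in order of increasing finish time
    visited, order = set(), []
    def dfs1(u):
        visited.add(u)
        for v in adj[u]:
            if v not in visited:
                dfs1(v)
        order.append(u)              # appended exactly when u finishes

    for u in adj:
        if u not in visited:
            dfs1(u)

    # Step 2: compute the transpose graph GT (reverse every edge)
    gt = {u: [] for u in adj}
    for u in adj:
        for v in adj[u]:
            gt[v].append(u)

    # Step 3: DFS on GT, taking vertices in decreasing order of finish time
    visited.clear()
    sccs = []
    def dfs2(u, comp):
        visited.add(u)
        comp.append(u)
        for v in gt[u]:
            if v not in visited:
                dfs2(v, comp)

    for u in reversed(order):
        if u not in visited:
            comp = []
            dfs2(u, comp)
            sccs.append(comp)        # Step 4: each tree is one component
    return sccs
```

For the cycle 0 → 1 → 2 → 0 with an extra edge 2 → 3, this yields the two components {0, 1, 2} and {3}.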

EXAMPLE:
TOPOLOGICAL SORTING
• Given a digraph G = (V, E), find a linear ordering of its vertices such that for
every edge (v, w) in E, v precedes w in the ordering.
• A directed graph with a cycle cannot be topologically sorted.

ALGORITHM -1

1. Identify the vertices that have no incoming edges, and select one such vertex.
2. Delete this vertex of in-degree 0 and all its outgoing edges from the graph. Place it
in the output.
3. Repeat steps 1 and 2 until the graph is empty.
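The three steps above (often called Kahn's algorithm) can be sketched as follows, assuming the graph is a dictionary of adjacency lists; the names are illustrative:

```python
from collections import deque

def topological_sort(adj):
    """adj: dict mapping each vertex to the list of vertices it points to."""
    indegree = {u: 0 for u in adj}
    for u in adj:
        for v in adj[u]:
            indegree[v] += 1
    # Step 1: collect the vertices with no incoming edges
    q = deque(u for u in adj if indegree[u] == 0)
    order = []
    while q:
        u = q.popleft()
        order.append(u)              # Step 2: place the vertex in the output
        for v in adj[u]:             # "delete" u's outgoing edges
            indegree[v] -= 1
            if indegree[v] == 0:     # v now has no incoming edges
                q.append(v)
    if len(order) != len(adj):       # leftover vertices imply a cycle
        raise ValueError("graph has a cycle; no topological order exists")
    return order                     # Step 3 loop ends when the graph is empty
```

The length check at the end also doubles as a cycle detector: in a graph with a cycle, the vertices on the cycle never reach in-degree 0.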


ALGORITHM -2

TOPOLOGICAL SORT (G)


1. Call DFS(G) to compute finishing times v.f for each vertex v.
2. As each vertex is finished, insert it at the front of a linked list.
3. Return the linked list of vertices.
