
Data Structures and Algorithms

Dr. L. Rajya Lakshmi


Hash table: Hash functions
• Hash codes:
• Summation hash code: a poor choice when keys are strings or other
multiple-length objects viewed as a k-tuple (x0, x1, . . ., xk-1) in which the
order of the xi's is significant
• “temp01” and “temp10” collide
• temp01 = (116, 101, 109, 112, 48, 49)
• “spot”, “pots”, “stop”, and “tops” collide
• spot = (115, 112, 111, 116)
Hash table: Hash functions
• Hash codes:
• Integer representation of key x is (x0, x1, . . ., xk-1)
• Polynomial hash code: choose a nonzero constant “a” and use as a hash code
the value
x0·a^(k-1) + x1·a^(k-2) + . . . + x(k-2)·a + x(k-1),
which can be evaluated by Horner's rule as:
x(k-1) + a(x(k-2) + a(x(k-3) + . . . + a(x2 + a(x1 + a·x0)). . .))
• This hash code uses the components of key “x” as coefficients of a
polynomial in “a” (hence “polynomial hash code”)
• Taking “a” to be 33, 37, 39, or 41 produced fewer than 7 collisions on a list of
50,000 English words
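The contrast between the summation and polynomial hash codes can be sketched in Python (a minimal illustration; the function names are mine, and a = 33 is one of the constants quoted above):

```python
def sum_hash(key: str) -> int:
    # summation hash code: order-insensitive, so anagrams collide
    return sum(ord(ch) for ch in key)

def poly_hash(key: str, a: int = 33) -> int:
    # polynomial hash code evaluated with Horner's rule: order-sensitive
    h = 0
    for ch in key:
        h = h * a + ord(ch)   # one Horner step: h = h*a + xi
    return h
```

Here `sum_hash("temp01") == sum_hash("temp10")`, while `poly_hash` distinguishes them because each character is weighted by a distinct power of a.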
Hash table: Hash functions
• Second action (compression map):
• Once the key object is converted into a hash code, the code has to be
mapped to an integer in the range [0, N-1]
• A simple compression map (the division method) is:
h(k) = |k| mod N
• Keys {20, 25, 30, 35, 40, 45, 50} with N = 10 land in only the two buckets 0 and 5
• With N = 11 the same keys occupy seven distinct buckets
• Choosing N to be a prime spreads out the
distribution of hashed values
• If N = 2^p, then h(k) is simply the p lower-order bits of k
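The division-method example above can be checked directly (a small demo; the variable names are mine):

```python
N_BAD, N_GOOD = 10, 11
keys = [20, 25, 30, 35, 40, 45, 50]

# N = 10 shares a factor with every key, so only buckets 0 and 5 are used
buckets_bad = {k % N_BAD for k in keys}

# N = 11 is prime and coprime to the keys' common stride: all distinct
buckets_good = {k % N_GOOD for k in keys}
```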
Hash table: Hash functions
• If key values are of the form iN + j for several different i’s, then there
will be collisions though N is a prime
• The MAD method:
• Multiply, add, and divide
• The hash function is defined as: h(k) = (ak + b) mod N, where N is prime, a and b are
nonnegative integers chosen at random, and a mod N ≠ 0
• Gets close to a “good” hash function, in the sense that the probability of two
distinct keys hashing to the same bucket is at most 1/N
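A MAD-style compression map can be sketched as a small factory function (illustrative; the helper name is mine, and N is assumed prime as the slide requires):

```python
import random

def make_mad_hash(N: int):
    """Return a MAD compression map h(k) = (a*k + b) mod N,
    with a, b random nonnegative integers and a mod N != 0."""
    a = random.randrange(1, N)   # 1 <= a < N guarantees a mod N != 0
    b = random.randrange(0, N)
    return lambda k: (a * k + b) % N

h = make_mad_hash(11)   # fixed (a, b) per table, so h is deterministic
```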
Hash table: Hash functions
• The multiplication method
• Multiply key k by a constant A in the range 0 < A < 1
• Extract the fractional part, f, of the result
• Multiply f by m and take the floor of the result (m is the hash table capacity)
• h(k) = ⌊m(kA mod 1)⌋
• kA mod 1 = kA − ⌊kA⌋
• The value of m is not critical here
• Can choose m as a power of 2 (say 2^p) for easy implementation of the hash
function
• Assume that the word size of the machine is “w” bits and k fits in a word
• Restrict A to be of the form s/2^w, where 0 < s < 2^w
• Multiply k by the w-bit integer s = A·2^w
• The result is a 2w-bit value r1·2^w + r0
• The most significant p bits of r0 give the hash value of k
Hash table: Hash functions
• Assume that the word size of the machine is 8 bits
• Select m = 2^3 = 8 and A = 0.25
• Consider key k = 51
• h(k) = ⌊m(kA mod 1)⌋ = ⌊8(0.75)⌋ = 6
• s = A·2^w = 0.25·256 = 64
• ks = 51·64 = 3264
• 3264 = 12·2^8 + 192, so r0 = 192
• 192 = 1100 0000 in binary; its most significant p = 3 bits are 110 = 6
Collision handling schemes
• Consider two items (k1, e1) and (k2, e2)
• If h(k1) = h(k2), then we have a collision
• Which operations are affected?
• insertItem() and findItem()
• A simple and efficient approach is to have each bucket A[i] store a
reference to an unordered sequence (list), Si, that holds all items
mapped to bucket A[i]
• Each bucket is a miniature dictionary
• This way of resolving collisions is called separate chaining
• Assume that each nonempty bucket is implemented as a list
Separate chaining
Algorithm findElement(k)
    B ← A[h(k)]
    if B is empty then
        return NO_SUCH_KEY
    else
        {search for key in the list for this bucket}
        return B.findElement(k)
Separate chaining
Algorithm insertItem(k, e)
    if A[h(k)] is empty then
        create a new list B, which is initially empty
        A[h(k)] ← B
    else
        B ← A[h(k)]
    B.insertItem(k, e)
Separate chaining
Algorithm removeElement(k)
    B ← A[h(k)]
    if B is empty then
        return NO_SUCH_KEY
    else
        return B.removeElement(k)
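The three separate-chaining operations above can be sketched together as a small Python class (a minimal illustration, not a production table; the class and method names are mine, and Python's built-in `hash` plays the role of the hash code):

```python
class ChainedHashTable:
    """Separate chaining: each bucket is a list of (key, value) pairs."""

    def __init__(self, capacity=11):
        self.capacity = capacity
        self.buckets = [[] for _ in range(capacity)]

    def _bucket(self, key):
        return self.buckets[hash(key) % self.capacity]

    def insert(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:               # key already present: replace
                bucket[i] = (key, value)
                return
        bucket.append((key, value))

    def find(self, key):
        for k, v in self._bucket(key):
            if k == key:
                return v
        return None                    # plays the role of NO_SUCH_KEY

    def remove(self, key):
        bucket = self._bucket(key)
        for i, (k, v) in enumerate(bucket):
            if k == key:
                del bucket[i]
                return v
        return None
```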
Separate chaining
• Insertion runs in constant time (item is not present in the table)
• In the worst-case the time to search an item is θ(n)
• Consider a hash table T of capacity m that stores n items
• Average-case performance depends on how well the hash function
distributes keys among m buckets on average
• Assume that any given item is equally likely to hash to any of the m
buckets independent of where any other item has hashed to (simple
uniform hashing)
• The load factor of T, α, is defined as n/m, that is, the average number of
elements stored in each list/chain
• α can be less than, equal to, or greater than 1
Separate chaining
• nj is the length of the list pointed to by T[j], where j = 0, 1, . . ., m-1
• n = n0 + n1 + . . .+ nm-1
• E[nj] = α = n/m
• Assume that the time to compute hash function is O(1)
• The time required to search for an item with key k is linearly dependent
on the length n_h(k) of the list referred to by T[h(k)]
• Analyse the expected number of items examined by the search
algorithm, that is, the number of items in the list referred to by T[h(k)]
• Consider two cases: unsuccessful search and successful search
Separate chaining
Theorem: In a hash table, if collisions are resolved by chaining, an
unsuccessful search takes average-case time θ(1 + α), under the
assumption of simple uniform hashing.
Proof:
Any key k which is not present in table T is equally likely to hash
to any of the m buckets
The expected time to perform an unsuccessful search is the
expected time to search to the end of the list of T[h(k)]
The expected length of the list at T[h(k)] is E[n_h(k)] = α
The expected number of items examined in an unsuccessful
search is α
The total time required for an unsuccessful search is θ(1 + α)
Separate chaining
Theorem: In a hash table if collisions are resolved by chaining, a
successful search takes average-case time θ(1 + α), under the
assumption of simple uniform hashing.
Proof:
Item to be searched is equally likely to be any of n items stored in
the table
The number of items searched in a successful search for an item
x is one more than the number of items that precede x in x’s list (why?)
Find the number of items that were inserted after x was inserted
in x’s list
Separate chaining
Proof (contd):
Let xi be the ith item inserted into the table, for i = 1, 2, . . ., n, and
let ki = xi.key
For keys ki and kj define a random variable Xij = I{h(ki) = h(kj)}
Under the assumption of simple uniform hashing, Pr{h(ki) = h(kj)}
= 1/m, so E[Xij] = 1/m
The number of items examined in a successful search for xi is:
1 + Σ_{j=i+1}^{n} Xij
The expected number of items examined in a successful search is:
E[(1/n) Σ_{i=1}^{n} (1 + Σ_{j=i+1}^{n} Xij)]
Separate chaining
Proof (contd):
E[(1/n) Σ_{i=1}^{n} (1 + Σ_{j=i+1}^{n} Xij)]
= (1/n) Σ_{i=1}^{n} (1 + Σ_{j=i+1}^{n} E[Xij])    (linearity of expectation)
= (1/n) Σ_{i=1}^{n} (1 + Σ_{j=i+1}^{n} 1/m)
= 1 + (1/nm) Σ_{i=1}^{n} (n − i)
= 1 + (1/nm)(n² − n(n+1)/2)
= 1 + (n − 1)/2m
= 1 + α/2 − α/2n
Separate chaining
Proof (Contd):
Thus the total time required for a successful search is: θ(2 + α/2
– α/2n), which is θ(1 + α)
• If hash table capacity is at least proportional to the number of items
in the table, then n is O(m)
• α = n/m is O(m)/m, which is O(1)
• Thus searching takes constant time
• If the lists are maintained using doubly linked lists, then removal also
takes constant time
Open Addressing
• Separate chaining requires maintaining a large number of pointers
• Space is also wasted on the list nodes
• There is another collision resolution technique that uses only the space in
the hash table itself
Open addressing
• All items occupy the hash table itself (no separate lists maintained)
• Saves memory space; can use that space for defining a large hash
table; reduces collisions
• For each item a sequence of buckets is probed
• The buckets are not probed in the fixed sequence 0, 1, . . ., N-1
• The buckets probed depend upon the key of the item
• Modified hash function h, which takes the probe number as an
argument in addition to hash key
• The probe sequence for key k (assumption: k is an integer) is: h(k, 0),
h(k, 1), . . ., h(k, N-1)
• It is a permutation of 0, 1, . . ., N-1
Open addressing
• Assume that the items are the keys themselves and keys are integers
• Each bucket contains either a key or NIL
Algorithm Hash_Insert(T, k)
    i ← 0
    while i < N do
        j ← h(k, i)
        if T[j] is NIL
            T[j] ← k
            return j
        else i ← i+1
    raise an error “hash table overflow”
Open addressing
• The search algorithm probes the same sequence
• It stops when it encounters an empty bucket (unsuccessful search) or a bucket
holding the search key
Algorithm Hash_Search(T, k)
    i ← 0
    j ← h(k, i)
    while i < N and T[j] is not NIL
        if T[j] == k
            return j
        i ← i+1
        j ← h(k, i)
    return NIL
Open addressing
• Why can we not simply delete keys?
• Suppose the buckets i1, i2, i3, . . . were probed while inserting key k into the
hash table, and k was finally placed in bucket j
• If the key stored in bucket i3 is later deleted and its bucket reset to NIL, a
search for k would stop early at i3 and wrongly report k absent
• This is handled by marking such buckets differently, say storing
“DELETED” instead of NIL
Open addressing
Algorithm Hash_Delete(T, k)
    i ← 0
    j ← h(k, i)
    while i < N and T[j] is not NIL
        if T[j] == k
            T[j] ← DELETED
            return j
        i ← i+1
        j ← h(k, i)
    return NIL
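The insert/search/delete trio with tombstones can be sketched as one class (illustrative only; the class name is mine, linear probing is used as a concrete h(k, i), and unlike the pseudocode above, insert also reuses DELETED slots, which is the usual refinement):

```python
NIL = None
DELETED = object()   # tombstone marker

class OpenAddressingTable:
    """Open addressing with tombstone deletion, h(k, i) = (k + i) mod N."""

    def __init__(self, capacity=13):
        self.N = capacity
        self.T = [NIL] * capacity

    def _probe(self, k, i):
        return (k + i) % self.N

    def insert(self, k):
        for i in range(self.N):
            j = self._probe(k, i)
            if self.T[j] is NIL or self.T[j] is DELETED:
                self.T[j] = k          # tombstones count as free on insert
                return j
        raise RuntimeError("hash table overflow")

    def search(self, k):
        for i in range(self.N):
            j = self._probe(k, i)
            if self.T[j] is NIL:       # an empty bucket ends the sequence
                return None
            if self.T[j] == k:
                return j
        return None

    def delete(self, k):
        j = self.search(k)
        if j is not None:
            self.T[j] = DELETED        # keep later probe chains intact
        return j
```

Because the tombstone is not NIL, a search for a key inserted after the deleted one still walks past the deleted slot instead of stopping early.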
Open addressing
• Three techniques are commonly used to compute probe sequences
• Linear probing
• Quadratic probing
• Double hashing
• Linear probing always visits every bucket; quadratic probing and double
hashing do so only for suitable choices of the constants and the table size
Linear probing
• Assume that h’ is the hash function used for compression mapping
(auxiliary hash function)
• Hash function used by linear probing is:
h(k, i) ← (h’(k) + i) mod N, for i=0, 1, . . ., N-1
• We first probe the bucket given by the auxiliary hash function,
followed by other buckets
Linear probing
• h’(k) ← k mod N; N = 10; h(k, i) = (h’(k) + i) mod N
• Inserting {89, 18, 49, 58, 69}: 89 → 9, 18 → 8, 49 collides at 9 and lands in 0,
58 collides at 8, 9, 0 and lands in 1, 69 collides at 9, 0, 1 and lands in 2
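The linear-probing example can be replayed with a few lines of Python (a sketch; the function name is mine):

```python
def linear_probe_insert(keys, N=10):
    """Insert keys into a size-N table with linear probing,
    h(k, i) = (k % N + i) % N; returns the final table."""
    table = [None] * N
    for k in keys:
        for i in range(N):
            j = (k % N + i) % N
            if table[j] is None:
                table[j] = k
                break
    return table

table = linear_probe_insert([89, 18, 49, 58, 69])
```

The final layout shows the cluster: slots 0, 1, 2 hold 49, 58, 69 right after the occupied slots 8 and 9.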
Linear probing
• Suffers from a problem called primary clustering
• Blocks of consecutively occupied buckets build up (primary clusters)
• Why is this an issue?
• Any key that hashes into a cluster joins it after several attempts to
resolve the collision, so clusters keep growing and probe sequences lengthen
Quadratic probing
• Assume that h’ is the hash function used for compression mapping
• Hash function used by quadratic probing is:
h(k, i) ← (h’(k) + c1i + c2 i2) mod N, where c1 and c2 are auxiliary constants, i=0, 1,
. . ., N-1
• We first probe the bucket given by the auxiliary hash function,
followed by other buckets provided by the hash function
Quadratic probing
• h’(k) ← k mod N; N = 10; h(k, i) = (h’(k) + i²) mod N
• Inserting {89, 18, 49, 58, 69}: 89 → 9, 18 → 8, 49 collides at 9 and lands in 0,
58 collides at 8, 9 and lands in 2, 69 collides at 9, 0 and lands in 3
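The quadratic-probing example can be replayed the same way (a sketch; the function name is mine, with c1 = 0 and c2 = 1 as in the example above):

```python
def quadratic_probe_insert(keys, N=10):
    """Insert keys with quadratic probing h(k, i) = (k % N + i*i) % N."""
    table = [None] * N
    for k in keys:
        for i in range(N):
            j = (k % N + i * i) % N
            if table[j] is None:
                table[j] = k
                break
    return table

table = quadratic_probe_insert([89, 18, 49, 58, 69])
```

Compared with linear probing on the same keys, the colliding keys spread out (slots 0, 2, 3 instead of the contiguous run 0, 1, 2).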
Quadratic probing
• If two keys hash to the same initial bucket, their entire probe
sequences are the same; h(k1, 0) = h(k2, 0) implies h(k1, i) = h(k2, i)
• The same alternative buckets are probed
• This milder problem is called secondary clustering
Double hashing
• It uses hash function of the form
h(k, i) = (h’(k) + i h’’(k)) mod N,
h’ and h’’ are auxiliary hash functions and i = 0, 1, . . ., N-1
• The initial probe goes to position T[h’(k)]
• The successive attempts to resolve the collision will probe the
positions that are offset from the previous positions by the amount
h’’(k) mod N
• The probe sequence depends upon the key in two ways
Double hashing
• The value of h’’(k) must be relatively prime to the table size N in order
to probe the entire table
• Can select N as a power of 2 and design h’’ so that it always produces
an odd number
• Can select N as a prime and design h’’ so that it always returns a
positive integer less than N
• Another common choice: N prime, N’ a prime smaller than N,
and h’’(k) = N’ − (k mod N’)
• Ex: can select N as 13, then
h’(k) = k mod 13
h’’(k) = 1 + (k mod N’),
where N’ is chosen slightly smaller than N, say 11
Double hashing
• N = 13, N’ = 11, h’(k) = k mod N; h’’(k) = 1 + (k mod N’)
• h(k, i) = (h’(k) + i·h’’(k)) mod N
• Suppose the table already holds 79 (slot 1), 69 (slot 4), 98 (slot 5),
72 (slot 7), and 50 (slot 11)
• Insert 14: h’(14) = 1 and h’’(14) = 1 + 3 = 4, so the probes are slot 1
(occupied by 79), slot 5 (occupied by 98), and slot 9, which is empty;
14 goes into slot 9
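The double-hashing example can be reproduced in Python (a sketch; the function name is mine, and the order in which the earlier keys were inserted is an assumption chosen to reproduce the slot layout above):

```python
def double_hash_insert(table, k, N=13, Np=11):
    """Insert k with double hashing h(k, i) = (h1 + i*h2) mod N,
    where h1 = k mod N and h2 = 1 + (k mod Np)."""
    h1, h2 = k % N, 1 + (k % Np)
    for i in range(N):
        j = (h1 + i * h2) % N
        if table[j] is None:
            table[j] = k
            return j
    raise RuntimeError("table full")

T = [None] * 13
for key in (79, 72, 69, 98, 50):   # assumed insertion order
    double_hash_insert(T, key)
slot = double_hash_insert(T, 14)   # probes 1, 5, then lands in 9
```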
Data Structures and Algorithms
Dr. L. Rajya Lakshmi
Analysis: open addressing
• How many distinct probe sequences are possible with
• Linear probing?
• N: the initial probe determines the entire sequence
• Quadratic probing?
• Also N: the subsequent probes depend only on the initial probe
• Double hashing?
• When N is a prime or a power of 2, double hashing uses Θ(N²) probe sequences, since
each possible (h’(k), h’’(k)) pair yields a distinct probe sequence
• For the analysis, we assume uniform hashing
• Uniform hashing: the probe sequence for each key is equally likely to
be any one of N! permutations of (0, 1, . . ., N-1)
• When the values of parameters are selected appropriately, double
hashing performs close to ideal scheme of uniform hashing
Analysis: open addressing
• Analysis is in terms of load factor α = n/N (n is the number of items in
the hash table and N is the capacity of the hash table)
• With open addressing, n ≤ N, so α ≤ 1
• Assume that we are using uniform hashing
• The probe sequence (h(k,0), h(k,1), . . ., h(k, N-1)) used for key k is
equally likely to be any of the permutation of (0, 1, . . ., N-1)
Analysis: open addressing
Theorem: Given an open addressing hash table with load factor α =
n/N < 1, the expected number of probes in an unsuccessful search is at
most 1/(1- α), assuming uniform hashing.
Proof:
Every probe except the last accesses an occupied bucket
that does not contain the required item
The last bucket accessed is an empty one
Let X denote the number of probes made in an unsuccessful search
Ai: the event that an ith probe occurs and it is to an occupied
bucket
Analysis: open addressing
• Now consider the event {X ≥ i}
• The 1st probe occurs and is to an occupied bucket, the 2nd probe occurs and
is to an occupied bucket, . . ., the (i-1)th probe occurs and is to an
occupied bucket
• The event {X ≥ i} is the intersection of the events A1, A2, . . ., Ai-1
• {X ≥ i} is A1 ∩ A2 ∩ . . . ∩ Ai-1
Analysis: open addressing
For a collection of events A1, A2, . . ., Ai-1 the following relation
holds:
Pr{A1 ∩ A2 ∩ . . . ∩ Ai-1} = Pr{A1}·Pr{A2 | A1}·Pr{A3 | A1 ∩ A2} . . .
Pr{Ai-1 | A1 ∩ A2 ∩ . . . ∩ Ai-2}
Pr{A1} = n/N
Consider the event that the jth probe occurs and is to an occupied
bucket, given that the first j-1 probes were to occupied buckets, j > 1
On the jth probe we would find one of the remaining (n-(j-1))
items in one of the remaining (N-(j-1)) buckets (uniform hashing)
The probability of this event is (n-(j-1))/(N-(j-1))
Observing that n < N, (n-j)/(N-j) ≤ n/N for all j such that 0 ≤ j < N
Pr{A2 | A1} = (n-1)/(N-1); Pr{A3 | A1 ∩ A2} = (n-2)/(N-2); . . .;
Pr{Ai-1 | A1 ∩ A2 ∩ . . . ∩ Ai-2} = (n-i+2)/(N-i+2)
Analysis: open addressing
Since the event {X ≥ i} is A1 ∩ A2 ∩ . . . ∩ Ai-1,
Pr{X ≥ i} = (n/N)·((n-1)/(N-1))·((n-2)/(N-2)) . . . ((n-i+2)/(N-i+2))
≤ (n/N)^(i-1)
= α^(i-1)
When a random variable X takes values from the set of natural
numbers {0, 1, . . .}, we have a formula for its expectation:
E[X] = Σ_{i=0}^{∞} i·Pr{X = i}
= Σ_{i=0}^{∞} i·(Pr{X ≥ i} − Pr{X ≥ i+1})
= Σ_{i=1}^{∞} Pr{X ≥ i}
Analysis: open addressing
Using the relation above,
E[X] = Σ_{i=1}^{∞} Pr{X ≥ i}
≤ Σ_{i=1}^{∞} α^(i-1)
= Σ_{i=0}^{∞} α^i
= 1/(1 − α)
• An unsuccessful search runs in O(1) if α is a constant
• If the hash table is half full, the average number of probes in an
unsuccessful search is 2
• If the hash table is 90% full, the average number of probes in an
unsuccessful search is 10
Analysis: open addressing
Corollary: Inserting an item into an open addressing hash table with
load factor α requires at most 1/(1- α) probes on average assuming
uniform hashing.
Proof:
An item is inserted into a hash table if and only if there is room,
that is α < 1
Inserting an item: an unsuccessful search followed by placing the
item into the empty bucket found
The expected number of probes is at most 1/(1- α)
Analysis: open addressing
Theorem: Given an open addressing hash table with load factor α < 1,
the expected number of probes in a successful search is at most
(1/α)·ln(1/(1 − α))
assuming uniform hashing and assuming that each key in the table is
equally likely to be searched for.
Proof:
A search for key k reproduces the probe sequence that was used
while inserting that key into hash table
By the corollary, if the key k was the (i+1)st item inserted into
hash table, then the expected number of probes in search for key is:
1/(1-i/N), that is, N/(N-i)
Analysis: open addressing
Averaging over all n items in the table, the expected number of probes is:
(1/n) Σ_{i=0}^{n-1} N/(N-i) = (N/n) Σ_{i=0}^{n-1} 1/(N-i)
A = (1/α) Σ_{k=N-n+1}^{N} 1/k
For a monotonically decreasing f:
∫_{p}^{q+1} f(x) dx ≤ Σ_{k=p}^{q} f(k) ≤ ∫_{p-1}^{q} f(x) dx
A ≤ (1/α) ∫_{N-n}^{N} (1/x) dx
= (1/α) ln(N/(N − n))
= (1/α) ln(1/(1 − α))
Analysis: open addressing
• If the table is half full, then the expected number of probes in a
successful search is 1.387
• If the table is 90% full, then the expected number of probes in a
successful search is: 2.559
Selection sort
• As in insertion sort, the input array can be divided into two
parts: a sorted part and an unsorted part
• Basic principle: take the smallest element from the unsorted part
and move it to the end of the sorted part
Selection sort
Algorithm Selection_Sort(A[0. .n-1], n)
    for i ← 0 to n-2 do
        m ← i
        for j ← i+1 to n-1 do
            if A[j] < A[m]
                m ← j
        swap A[i] and A[m]
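The pseudocode above translates almost line for line into Python (a direct sketch; the function name is mine):

```python
def selection_sort(a):
    """Selection sort: on pass i, find the index m of the smallest
    element in the unsorted suffix a[i:] and swap it into place."""
    n = len(a)
    for i in range(n - 1):           # i = 0 .. n-2
        m = i
        for j in range(i + 1, n):    # scan the unsorted part
            if a[j] < a[m]:
                m = j
        a[i], a[m] = a[m], a[i]      # swap A[i] and A[m]
    return a
```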
Selection sort
4 4

i m
i m

4 2

i m i m

2 2

i m i m

2 2

i m i m
Selection sort

2 2

i m i m

2 2

i m i m

2 2

i m i m

2 2

i m i m
Selection sort
Algorithm Selection_Sort(A[0. .n-1], n)
    for i ← 0 to n-2 do
        m ← i
        for j ← i+1 to n-1 do
            if A[j] < A[m]
                m ← j
        swap A[i] and A[m]
• T(n) = Σ_{i=0}^{n-2} Σ_{j=i+1}^{n-1} c
= Σ_{i=0}^{n-2} c(n − i − 1)
= c Σ_{i=0}^{n-2} (n − 1) − c Σ_{i=0}^{n-2} i
= c(n-1)(n-1) − c(n-2)(n-1)/2
= cn(n-1)/2
Heapsort
• Runs in O(n log n) time
• An example of in-place sorting algorithm
• A new data structure, the “heap”, is used (different from the heap that we
use for dynamic memory allocation and garbage collection)
Heapsort
• A binary tree: an ordered rooted tree in which each node has
at most two children (a left child and a right child)
• Leaf nodes, level of a node
• A nearly complete binary tree: a binary tree in which every level except
possibly the last is completely filled
(Figure: a nearly complete binary tree: root 16; children 14 and 10;
then 8, 7, 9, 3; and leaves 2 and 4 under 8, and 1 under 7.)
Heapsort
• The binary heap data structure is an array object which can be viewed
as a nearly complete binary tree

(Figure: the same tree shown both as nodes 16, 14, 10, 8, 7, 9, 3, 2, 4, 1
and as the array A = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1].)
Heapsort
• An array A that represents a heap is an object with two attributes:
• A.length (represents the size of A)
• A.heap_size (represents the number of elements in the heap that are stored
within A)
• Though A[1 . . A.length] may contain numbers, only the elements in
A[1 . . A.heap_size], where 0 ≤ A.heap_size ≤ A.length are valid
elements of the heap
• A[1] is the root
• Given an index “i“, can easily locate its parent, left and right child
Heapsort
Algorithm Parent(i)
    return ⌊i/2⌋
Algorithm Left(i)
    return 2i
Algorithm Right(i)
    return (2i + 1)
Heapsort
• There are two kinds of binary heaps: max-heaps and min-heaps
• The elements stored in nodes satisfy a heap order property
• These two types of heaps differ in terms of heap order property
Heapsort
• Max-heap property:
• For every node “i“ other than the root, A[Parent(i)] ≥ A[i]
• The maximum element is stored at the root
• The subtree rooted at a node stores elements no larger than the one
stored at that node
Heapsort
• Min-heap property:
• For every node “i“ other than the root, A[Parent(i)] ≤ A[i]
• The minimum element is stored at the root
• The subtree rooted at a node stores elements no smaller than the one
stored at that node
Heapsort
(Figure: a min-heap example drawn as a tree.)
Heapsort
• Height of a node: the length (number of edges) of the longest
downward path from that node to a leaf
• Height of the tree: height of the root
• A heap of “n” elements is a nearly complete binary tree, its height is
θ(log n)
• Basic operations on heap run in time proportional to the height of the
tree, thus the time complexity of these operations is O(log n)
Heapsort
(Figures: Max_Heapify examples on the tree above, e.g. with 2 at the root
above 16 and 10; the offending key sinks level by level until the
max-heap property is restored.)
Heapsort
Algorithm Max_Heapify(A, i)
    l ← Left(i)
    r ← Right(i)
    if l ≤ A.heap_size and A[l] > A[i]
        largest ← l
    else
        largest ← i
    if r ≤ A.heap_size and A[r] > A[largest]
        largest ← r
    if largest ≠ i
        swap A[i], A[largest]
        Max_Heapify(A, largest)
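Max_Heapify can be sketched in Python as follows (the function name is mine; note it uses 0-based indexing, so left = 2i+1 and right = 2i+2, unlike the 1-based pseudocode):

```python
def max_heapify(a, i, heap_size):
    """Sink a[i] until the max-heap property holds, assuming the
    subtrees rooted at its children are already max-heaps."""
    l, r = 2 * i + 1, 2 * i + 2
    largest = i
    if l < heap_size and a[l] > a[largest]:
        largest = l
    if r < heap_size and a[r] > a[largest]:
        largest = r
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        max_heapify(a, largest, heap_size)

a = [2, 16, 10, 14, 7, 9, 3, 8, 4, 1]
max_heapify(a, 0, len(a))   # 2 sinks down; 16 rises to the root
```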
Heapsort
• Max_Heapify assumes that the binary trees rooted at Left(i) and
Right(i) are max-heaps
• The running time on a subtree of size n rooted at node i is:
• the time required to fix up the relationships among the elements A[i], A[Left(i)],
A[Right(i)], plus
• the time to run Max_Heapify on the subtree rooted at one of the children of node i
• Each child’s subtree has size at most 2n/3 (the worst case occurs when
the bottom level of the tree is exactly half full)
• T(n) ≤ T(2n/3) + Θ(1)
• The solution to this recurrence is O(log n)
Analysis of insertion sort
for j ← 1 to n-1 do                              cost: n
    key ← A[j]                                   n-1
    {insert A[j] into the sorted
    sequence A[0. .j-1]}
    i ← j-1                                      n-1
    while i ≥ 0 and A[i] > key do                Σ_{j=1}^{n-1} t_j
        A[i+1] ← A[i]                            Σ_{j=1}^{n-1} (t_j − 1)
        i ← i-1                                  Σ_{j=1}^{n-1} (t_j − 1)
    A[i+1] ← key                                 n-1

Analysis of insertion sort
T(n) = n + (n-1) + (n-1) + Σ_{j=1}^{n-1} t_j + Σ_{j=1}^{n-1} (t_j − 1)
+ Σ_{j=1}^{n-1} (t_j − 1) + (n-1)
In the worst case:
= n + 3(n-1) + (n-1)n/2 + (n-1)(n-2)/2
= an² + bn + c
• O-notation describes an upper bound; when we use it to bound the
worst-case running time of an algorithm, we have a bound on the
running time of that algorithm on every input
• Does a Θ(n²) bound on the worst-case running time of insertion sort
imply a Θ(n²) bound on every input?
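The procedure analysed in the cost table can be sketched in Python (the function name is mine):

```python
def insertion_sort(a):
    """Insertion sort: grow a sorted prefix a[0..j-1], shifting
    larger elements right to make room for key = a[j]."""
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]   # shift right (the t_j - 1 moves)
            i -= 1
        a[i + 1] = key
    return a
```

On an already sorted input the while loop never iterates, which is exactly why the best case is linear while the worst case is quadratic.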
Heapsort
(Figures: a Build_Max_Heap trace on a sample tree, applying Max_Heapify
bottom-up until the max-heap property holds everywhere.)
Heapsort
Algorithm Build_Max_Heap(A)
    A.heap_size ← A.length
    for i ← ⌊A.length/2⌋ downto 1
        Max_Heapify(A, i)
Heapsort
• The time Max_Heapify takes at a node varies with the height of the node
• The time required to run Max_Heapify at a node of height h is O(h)
• An n-element heap has height ⌊log n⌋
• At height h, there are at most ⌈n/2^(h+1)⌉ nodes
• The total cost of Build_Max_Heap:
Σ_{h=0}^{⌊log n⌋} ⌈n/2^(h+1)⌉·O(h), which is O(n Σ_{h=0}^{⌊log n⌋} h/2^h)
Heapsort
Σ_{i=0}^{∞} x^i = 1/(1 − x), if |x| < 1    {differentiate both sides}
Σ_{i=1}^{∞} i·x^(i-1) = 1/(1 − x)²         {multiply both sides by x}
Σ_{i=1}^{∞} i·x^i = x/(1 − x)²
Taking x = 1/2 yields
Σ_{i=1}^{∞} i/2^i = (1/2)/(1 − 1/2)² = 2
So O(n Σ_{h=0}^{⌊log n⌋} h/2^h) is O(n Σ_{h=0}^{∞} h/2^h), which can be written
as O(n·(1/2)/(1 − 1/2)²), which is O(n)
Heapsort
• Consider an array A[1 . . n], n is A.length
• Run Build_Max_Heap on A
• The largest element is sitting in A[1]
• Swap A[1] and A[n], decrement A.heap_size by 1, and run
Max_Heapify on the new root
• Repeat the above step until A.heap_size becomes 1
Heapsort
Algorithm Heapsort(A)
    Build_Max_Heap(A)
    for i ← A.length downto 2
        swap A[1] and A[i]
        A.heap_size ← A.heap_size − 1
        Max_Heapify(A, 1)
• Build_Max_Heap() takes O(n) time
• Each of the n-1 calls of Max_Heapify() takes O(log n) time
• Running time is O(n log n)
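The whole algorithm, Build_Max_Heap followed by the extract-max loop, can be sketched in one Python function (names are mine; 0-based indices, and an iterative sift-down replaces the recursive Max_Heapify, which does not change the behavior):

```python
def heapsort(a):
    """Heapsort: build a max-heap, then repeatedly swap the root
    with the last heap element, shrink the heap, and re-heapify."""
    def sift_down(i, size):
        while True:
            l, r, largest = 2 * i + 1, 2 * i + 2, i
            if l < size and a[l] > a[largest]:
                largest = l
            if r < size and a[r] > a[largest]:
                largest = r
            if largest == i:
                return
            a[i], a[largest] = a[largest], a[i]
            i = largest

    n = len(a)
    for i in range(n // 2 - 1, -1, -1):   # Build_Max_Heap, bottom-up
        sift_down(i, n)
    for end in range(n - 1, 0, -1):       # extract max, shrink heap
        a[0], a[end] = a[end], a[0]
        sift_down(0, end)
    return a
```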
Heapsort
(Slide figures: a full heapsort trace on the array
[5, 13, 2, 25, 7, 17, 20, 8, 4, 51]. Build_Max_Heap turns it into
[51, 25, 20, 8, 13, 17, 2, 5, 4, 7]; each subsequent step swaps the root
with the last heap element, shrinks the heap, and re-heapifies, ending
with the sorted array [2, 4, 5, 7, 8, 13, 17, 20, 25, 51].)
Data Structures and Algorithms
CS F211

Vishal Gupta
Department of Computer Science and Information Systems
BITS Pilani, Birla Institute of Technology and Science, Pilani Campus

Non-linear Data Structures: Trees


Tree

A tree represents a hierarchy.

Examples of trees:
- Directory tree
- Family tree
- Company organization chart
- Table of contents

The structure resembles the branches of a “tree”, hence the name.

BITS Pilani, Pilani Campus


Trees
Trees have nodes. They also have edges that
connect the nodes.
• Between any two nodes there is exactly one path.

(Figure: a tree with its nodes and edges labelled.)


Trees: More Definitions
• Trees that we consider are rooted. Once the root is defined
(by the user), every node has a specific level.
• Trees have internal nodes and leaves. Every node except
the root has a parent, and every node has zero or more children.

(Figure: a rooted tree with levels 0-3, showing the root, internal
nodes, a parent-child pair, and the leaves.)
Tree Terminology (1)

• A vertex (or node) is an object that can have a name and can
carry other associated information.
• The first or top node in a tree is called the root node.
• An edge is a connection between two vertices.
• A path in a tree is a list of distinct vertices in which successive
vertices are connected by edges in the tree.
• The defining property of a tree is that there is precisely one
path connecting any two nodes.
• A disjoint set of trees is called a forest.
• Nodes with no children are called leaves (also terminal or external nodes).
Tree Terminology (2)

Child of a node u: any node reachable from u by one edge.

Parent node: if b is a child of a, then a is the parent of b.
- All nodes except the root have exactly one parent.
Subtree: any node of a tree together with all of its descendants.
Depth of a node:
- The depth of the root node is 0.
- The depth of any other node is 1 greater than the depth of its
parent.


Tree Terminology (3)

The size of a tree is the number of nodes in it.

Height: the maximum of all depths.

Each node except the root has exactly one node above it in
the tree (its parent), and we extend the family analogy,
speaking of children, siblings, or grandparents.
Nodes that share a parent are called siblings.


Binary Trees
Definition: A binary tree is either empty or it consists
of a root together with two binary trees called the
left subtree and the right subtree.
Equivalently, a binary tree is a tree in which each node has at most
2 children.


Complete Binary Trees
• Nodes in trees can contain keys (letters, numbers, etc.)
• Complete binary tree: a binary tree in which every
level, except possibly the deepest, is completely filled;
at the deepest level, all nodes are as far left as possible.

(Figure: a complete binary tree with root 14; children 10 and 16;
then 8, 12, 15, 18; and leaves 7, 9, 11, 13.)
Complete Binary Trees: Array Representation

Complete binary trees can be represented in
memory with an array A so that all
nodes can be accessed in O(1) time:
– Label nodes sequentially top-to-bottom and
left-to-right
– The left child of A[i] is at position A[2i]
– The right child of A[i] is at position A[2i + 1]
– The parent of A[i] is at A[⌊i/2⌋]


Complete Binary Trees: Array Representation

(Figure: the tree above stored in an array, with nodes labelled 1-11:
A[1..11] = 14, 10, 16, 8, 12, 15, 18, 7, 9, 11, 13.)


Binary Trees: Linked List Representation

A binary tree is a linked data structure. Each node
contains data (including a key and satellite data) and
pointers left, right and p.
Left points to the left child of the node.
Right points to the right child of the node.
p points to the parent of the node.
If a child is missing, the corresponding pointer is NIL.
If the parent is missing, p is NIL.
The root of the tree is the only node for which p is NIL.
Nodes for which both left and right are NIL are leaves.


Height, Depth, and Level

• The height of a node is the number of edges on the
longest downward path between that node and a leaf.

• The depth of a node is the number of edges from the
node to the tree's root node.

• Level and depth are the same, although some textbooks
say that level = depth + 1.


Problem - 1

1. Draw a binary tree with height 7 and the maximum number
of nodes.

2. Is a complete binary tree balanced? Is the converse also true?

3. What is the minimum and maximum number of external
nodes for a binary tree with height h? Justify your answer.




Problem 2

(The questions below refer to a directory-tree figure rooted at cs016/.)

What are the internal nodes?
What are the leaf nodes?
How many descendants does node cs016/ have?
How many ancestors does node cs016/ have?
What are the siblings of node homeworks/?
Which nodes are in the subtree rooted at node projects/?
What is the depth of node papers/?
What is the height of the tree?


Problem

What does this algorithm compute?

(Figure: algorithm pseudocode shown on the slide.)


Binary Tree Traversals

– A binary tree is defined recursively: it consists of a
root, a left subtree and a right subtree
– To traverse (or walk) the binary tree is to visit each
node in the binary tree exactly once
– Tree traversals are naturally recursive
– Since a binary tree has three parts, there are six
possible ways to traverse it:
• root, left, right: preorder (mirror image: root, right, left)
• left, root, right: inorder (mirror image: right, root, left)
• left, right, root: postorder (mirror image: right, left, root)
Binary Tree Traversals

(Figure: the tree with root 1 and children 2 and 3.)
preorder: 1 2 3
inorder: 2 1 3
postorder: 2 3 1
Tree Traversal: InOrder

In-order traversal
• Visit the left subtree (if there is one) in order
• Print the key of the current node
• Visit the right subtree (if there is one) in order
Tree Traversal: PreOrder
Another common traversal is PreOrder.
It goes as deep as possible (visiting as it goes) then left to
right
• print root
• Visit left subtree in PreOrder
• Visit right subtree in PreOrder
Tree Traversal: PostOrder
PostOrder traversal also goes as deep as possible; a node is
visited only after both of its subtrees have been visited, so
internal nodes are visited during backtracking.
recursive:
•Visit left subtree in PostOrder
•Visit right subtree in PostOrder
•print root
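The three traversals above can be sketched in Python (the minimal Node class is an assumed representation); on the example tree with root 1 and children 2 and 3 they reproduce the orders listed earlier:

```python
# Hypothetical minimal node class, not from the slides.
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def preorder(n, out):
    if n:
        out.append(n.key)          # visit root first
        preorder(n.left, out)      # then the left subtree
        preorder(n.right, out)     # then the right subtree

def inorder(n, out):
    if n:
        inorder(n.left, out)       # left subtree first
        out.append(n.key)          # root in the middle
        inorder(n.right, out)

def postorder(n, out):
    if n:
        postorder(n.left, out)
        postorder(n.right, out)
        out.append(n.key)          # root visited last, while backtracking
```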
Data Structures and Algorithms
CS F211

Vishal Gupta
Department of Computer Science and Information Systems
BITS Pilani Birla Institute of Technology and Science
Pilani Campus Pilani Campus, Pilani
BITS Pilani
Pilani Campus

Binary Search trees


Pseudo code for INORDER

BITS Pilani, Deemed to be University under Section 3 of UGC Act, 1956


Binary Search Trees

A Binary Search Tree (BST) is a binary tree


with the following properties:
– The key of a node is always greater than the
keys of the nodes in its left subtree

– The key of a node is always smaller than the


keys of the nodes in its right subtree



Binary Search Trees: Examples

[Figure: example BSTs — one with root 14, children 10 and 16, and leaves 8, 11, 15, 18; one with root C, left child A and right child D.]


Building a BST

Build a BST from a sequence of nodes read one a time


Example: Inserting C A B L M (in this order!)

1) Insert C: C becomes the root.
2) Insert A: A < C, so A becomes C's left child.


Building a BST

3) Insert B: B < C, B > A, so B becomes A's right child.
4) Insert L: L > C, so L becomes C's right child.
5) Insert M: M > C, M > L, so M becomes L's right child.

[Figure: the resulting BST — root C, left child A (with right child B), right child L (with right child M).]


Building a BST

Is there a unique BST for letters A B C L M ?


NO! Different input sequences result in
different trees
Inserting A B C L M produces a right-skewed chain A → B → C → L → M, while inserting C A B L M produces the tree with root C built on the previous slides.

[Figure: the two different BSTs produced by the two insertion orders.]


Example Binary Searches

• Find(2): 10 > 2, go left; 5 > 2, go left; 2 = 2, found.

[Figure: two example BSTs over the keys 2, 5, 10, 25, 30, 45, showing the comparisons made while searching for key 2.]



Recursive Search of Binary Tree

Node Find( Node n, Value key ) {
    if (n == null)               // Not found
        return n;
    else if (n.data == key)      // Found it
        return n;
    else if (n.data > key)       // In left subtree
        return Find( n.left, key );
    else                         // In right subtree
        return Find( n.right, key );
}



Complexity of Search

• Running time of searching in a BST is


proportional to the height of the tree.
If n is the number of nodes in a BST, then
• Best case (balanced tree) – O(log n)
• Worst case (degenerate, skewed tree) – O(n)

Binary Search Tree - Insertion

Insert Algorithm
• If value we want to insert < key of current
node, we have to go to the left subtree
• Otherwise we have to go to the right subtree
• If the current node is empty (not existing)
create a node with the value we are
inserting and place it here.
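The insert algorithm above can be sketched recursively in Python (a minimal sketch; the Node class and the choice to send duplicate keys to the right subtree are assumptions). Inserting C A B L M reproduces the tree built earlier:

```python
# Hypothetical minimal BST node; names are illustrative.
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(root, key):
    """Return the (possibly new) root after inserting key."""
    if root is None:                  # empty spot found: place the node here
        return Node(key)
    if key < root.key:                # smaller keys go to the left subtree
        root.left = insert(root.left, key)
    else:                             # equal/larger keys go to the right
        root.right = insert(root.right, key)
    return root
```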



Insertion - Example

For example, inserting ’15’ into the BST?



Binary Search Tree - Deletion

There are 3 possible cases


Case1 : Node to be deleted has no children
 We just delete the node.
Case2 : Node to be deleted has only one child
 Replace the node with its child
and make the parent of the
deleted node to be a parent of the
child of the deleted node
Case3 : Node to be deleted has two children



Binary Search Tree - Deletion

Node to be deleted has two children



Binary Search Tree - Deletion

Node to be deleted has two children


Steps:
• Find minimum value of right subtree
• Delete minimum node of right subtree but
keep its value
• Replace the value of the node to be deleted
by the minimum value whose node was
deleted earlier.
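Combining the three cases, deletion can be sketched as follows (a minimal Python sketch under the same assumed Node layout; in case 3 the minimum of the right subtree is copied up and then deleted from the right subtree):

```python
# Hypothetical minimal BST node and insert helper, for illustration only.
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(root, key):
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root

def min_node(n):
    while n.left:                    # minimum is the leftmost node
        n = n.left
    return n

def delete(root, key):
    """Return the new root after deleting key (if present)."""
    if root is None:
        return None
    if key < root.key:
        root.left = delete(root.left, key)
    elif key > root.key:
        root.right = delete(root.right, key)
    else:
        if root.left is None:        # cases 1 & 2: zero or one child
            return root.right
        if root.right is None:
            return root.left
        succ = min_node(root.right)  # case 3: copy minimum of right subtree,
        root.key = succ.key          # then delete that minimum from there
        root.right = delete(root.right, succ.key)
    return root
```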



Binary Search Tree - Deletion



Convert the following into a
pseudo code.




Agenda: Red-Black Trees


Red-Black Trees
• Red-black trees:
– Binary search trees augmented with node color
– Operations designed to guarantee that the height
h = O(lg n)



Red-Black Properties
• The red-black properties:
1. Every node is either red or black
2. Every leaf (NULL pointer) is black
• Note: this means every “real” node has 2 children
3. If a node is red, both children are black
• Note: can’t have 2 consecutive reds on a path
4. Every path from node to descendent leaf
contains the same number of black nodes
5. The root is always black



Red-Black Tree
• Each node has the following attributes:
– Color
– Key
– Left child
– Right child
– Parent
• If a child of the parent of a node does not exist, the
corresponding pointer attribute of the node contains the value
NIL.
• We shall regard these NIL’s as being pointers to leaves (external
nodes) of the BST and the normal, key bearing nodes as being
internal nodes of the tree.
• By constraining the node colors on any simple path from the
root to a leaf, red black trees ensure that no such path is more
than twice as long as any other path.
Black-Height
• Black-height: The black-height of a node, X, in a
red-black tree is the number of Black nodes on
any path to a NULL (or leaf), not counting X.



[Figure: a red-black tree with NULLs shown; one internal node is labeled X.]

Black-Height of the tree (the root) = 3
Black-Height of node "X" = 2

[Exercise figure: a red-black tree whose black-height is to be filled in: Black-Height = _____]
Red-Black Trees: An Example
• Color this tree: 7

5 9

12

Red-black properties:
1. Every node is either red or black
2. Every leaf (NULL pointer) is black
3. If a node is red, both children are black
4. Every path from node to descendent leaf
contains the same number of black nodes
5. The root is always black
Red-Black Trees:
The Problem With Insertion
• Insert 8 7
– Where does it go? 5 9

12

1. Every node is either red or black


2. Every leaf (NULL pointer) is black
3. If a node is red, both children are black
4. Every path from node to descendent leaf
contains the same number of black nodes
5. The root is always black
Red-Black Trees:
The Problem With Insertion
• Insert 8 7
– Where does it go? 5 9
– What color
should it be? 8 12

1. Every node is either red or black


2. Every leaf (NULL pointer) is black
3. If a node is red, both children are black
4. Every path from node to descendent leaf
contains the same number of black nodes
5. The root is always black
Red-Black Trees:
The Problem With Insertion
• Insert 8 7
– Where does it go? 5 9
– What color
should it be? 8 12

1. Every node is either red or black


2. Every leaf (NULL pointer) is black
3. If a node is red, both children are black
4. Every path from node to descendent leaf
contains the same number of black nodes
5. The root is always black
Red-Black Trees:
The Problem With Insertion
• Insert 11 7
– Where does it go? 5 9

8 12

1. Every node is either red or black


2. Every leaf (NULL pointer) is black
3. If a node is red, both children are black
4. Every path from node to descendent leaf
contains the same number of black nodes
5. The root is always black
Red-Black Trees:
The Problem With Insertion
• Insert 11 7
– Where does it go? 5 9
– What color?
8 12

11
1. Every node is either red or black
2. Every leaf (NULL pointer) is black
3. If a node is red, both children are black
4. Every path from node to descendent leaf
contains the same number of black nodes
5. The root is always black
Red-Black Trees:
The Problem With Insertion
• Insert 11 7
– Where does it go? 5 9
– What color?
• Can’t be red! (#3) 8 12

11
1. Every node is either red or black
2. Every leaf (NULL pointer) is black
3. If a node is red, both children are black
4. Every path from node to descendent leaf
contains the same number of black nodes
5. The root is always black
Red-Black Trees:
The Problem With Insertion
• Insert 11 7
– Where does it go? 5 9
– What color?
• Can’t be red! (#3) 8 12
• Can’t be black! (#4) 11
1. Every node is either red or black
2. Every leaf (NULL pointer) is black
3. If a node is red, both children are black
4. Every path from node to descendent leaf
contains the same number of black nodes
5. The root is always black
Red-Black Trees:
The Problem With Insertion
• Insert 11 7
– Where does it go? 5 9
– What color?
• Solution: 8 12
recolor the tree
11
1. Every node is either red or black
2. Every leaf (NULL pointer) is black
3. If a node is red, both children are black
4. Every path from node to descendent leaf
contains the same number of black nodes
5. The root is always black
Red-Black Trees:
The Problem With Insertion
• Insert 10 7
– Where does it go? 5 9

8 12

11
1. Every node is either red or black
2. Every leaf (NULL pointer) is black
3. If a node is red, both children are black
4. Every path from node to descendent leaf
contains the same number of black nodes
5. The root is always black
Red-Black Trees:
The Problem With Insertion
• Insert 10 7
– Where does it go? 5 9
– What color?
8 12

11
1. Every node is either red or black
2. Every leaf (NULL pointer) is black 10
3. If a node is red, both children are black
4. Every path from node to descendent leaf
contains the same number of black nodes
5. The root is always black
Red-Black Trees:
The Problem With Insertion
• Insert 10 7
– Where does it go? 5 9
– What color?
• A: no color! Tree 8 12
is too imbalanced
11
• Must change tree structure
to allow recoloring
10
– Goal: restructure tree in
O(lg n) time



RB Trees: Rotation
• Our basic operation for changing tree
structure is called rotation:

y rightRotate(y) x
x C A y
leftRotate(x)
A B B C

• Does rotation preserve inorder key ordering?


y rightRotate(y) x
x C A y
leftRotate(x)
A B B C
Left-Rotate(T,x)
y = right(x) ; assume right(x) <> NIL
right(x) = left(y) ; move y's child over
if left(y) <> NIL
then parent(left(y)) = x
parent(y) = parent(x) ; move y up to x's position
if parent(x) = NIL
then root(T) = y
else if x = left(parent(x))
then left(parent(x)) = y
else right(parent(x)) = y
left(y) = x ; move x down
parent(x) = y
y rightRotate(y) x
x C A y
leftRotate(x)
A B B C
Right-Rotate(T,y)
x = left(y) ; assume left(y) <> NIL
left(y) = right(x)
if right(x) <> NIL
then parent(right(x)) = y
parent(x) = parent(y)
if parent(y) = NIL
then root(T) = x
else if y = left(parent(y))
then left(parent(y)) = x
else right(parent(y)) = x
right(x) = y
parent(y) = x
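The Left-Rotate pseudocode above translates almost line-for-line into Python (a sketch; the one-element `tree` list standing in for root(T) is an assumption so the root can be reassigned):

```python
# Hypothetical node with parent pointers, mirroring the pseudocode's fields.
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None

def left_rotate(tree, x):
    """tree is a one-element list [root] so root(T) can be updated."""
    y = x.right                 # assume x.right is not NIL
    x.right = y.left            # move y's left subtree over to x
    if y.left is not None:
        y.left.parent = x
    y.parent = x.parent         # y moves up into x's position
    if x.parent is None:
        tree[0] = y             # y becomes the new root
    elif x is x.parent.left:
        x.parent.left = y
    else:
        x.parent.right = y
    y.left = x                  # x moves down as y's left child
    x.parent = y
```

Rotating left about 9 in the slide's example subtree (9 with children 8 and 12, 12 with left child 11) makes 12 the subtree root with 9 as its left child, preserving the inorder order 8, 9, 11, 12.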
Rotation Example
• Rotate left about 9:

5 9

8 12

11



Rotation Example
• Rotate left about 9:

[Figure: after the rotation, 12 takes 9's position; 9 becomes 12's left child, keeping children 8 and 11.]


Red-Black Trees: Insertion
• Insertion: the basic idea
– Insert x into tree, color x red
– Which of the red-black properties might be violated?
• Root is always black
• Red node cannot have a red child

– Fix the violated properties.



Red-Black Trees: Insertion
Insertion
1. Insert node into tree using BST Insert(T,x) and
color node Red
2. Fix violated RBT properties
1. Root is always black
2. Red node cannot have a red child
3. Color root Black



Red-Black Trees: Insertion
• If the parent of the newly inserted node was Black, then no changes are necessary.
• If not, then there are following cases to consider for each of the
orientations below.

• Move up the tree until there are no violations or we are at the root.
• In the following discussion we will assume the parent is a left child
(if the parent is a right child perform the same steps swapping
``right'' and ``left'')
Red-Black Trees: Insertion
RB-Insert(T,x)
Case I: x's uncle is Red
• Change x's grandparent to Red
• Change x's uncle and parent to Black
• Change x to x's grandparent

How to get uncle (x)


if parent(x) = left(parent(parent(x)))
then uncle(x) = right(parent(parent(x)))
else uncle(x) = left(parent(parent(x)))



Red-Black Trees: Insertion
RB-Insert(T,x)
Case II: x's uncle is Black, x is the right child of its parent
• Change x to x's parent
• Rotate x's parent (now x) left to make Case III
• Case II is now Case III
Case III: x's uncle is Black, x is the left child of its parent
• Set x's parent to Black
• Set x's grandparent to Red
• Rotate x's grandparent right



Red-Black Trees: Insertion
Example:
Insert the following keys in a Red Black Tree in
order:
3 2 5 6 9 4 7 8



Theorem 1 – In a red-black tree, any subtree rooted
at x contains at least 2^bh(x) − 1 internal nodes,
where bh(x) is the black height of node x.
Proof: by induction on height of x.



Theorem 2 – In a red-black tree, at least half the nodes on any
path from the root to a NULL (i.e. leaf) must be Black.

Proof – If there is a Red node on the path, there must


be a corresponding Black node.

Algebraically this theorem means


bh( x ) ≥ h/2
where x is the root node.



Theorem 3 – In a red-black tree, no path from any node, X, to a
NULL (i.e. leaf) is more than twice as long as any other path
from X to any other NULL (i.e. leaf).

Proof: By definition, every path from a node to any NULL


contains the same number of Black nodes. By Theorem 2,
at least ½ the nodes on any such path are Black. Therefore,
there can be no more than twice as many nodes on any path
from X to a NULL as on any other path. Therefore the length
of every path is no more than twice as long as any other
path.



Theorem 4 – A red-black tree with n internal nodes has
height h ≤ 2 lg(n + 1).

Proof: Let h be the height of the red-black tree with root x.

By Theorem 2,
bh(x) ≥ h/2
From Theorem 1, n ≥ 2^bh(x) − 1
Therefore n ≥ 2^(h/2) − 1
n + 1 ≥ 2^(h/2)
lg(n + 1) ≥ h/2
2 lg(n + 1) ≥ h



RB Trees: Worst-Case Time
• So we’ve proved that a red-black tree has
O(lg n) height
• Corollary: These operations take O(lg n) time:
– Minimum(), Maximum()
– Successor(), Predecessor()
– Search()
• Insert() and Delete():
– Will also take O(lg n) time
– But will need special care since they modify tree



Red-Black Trees: Deletion
• As insert had three cases, delete has four
different cases.
• Do it yourself.




Agenda: B Trees
B Tree: Motivation



Motivation for B-Trees
• Index structures for large datasets cannot be
stored in main memory
• Storing it on disk requires different approach to
efficiency

• Assuming that a disk spins at 3600 RPM, one


revolution occurs in 1/60 of a second, or 16.7ms
• Crudely speaking, one disk access takes about
the same time as 200,000 instructions

Motivation (cont.)
• Assume that we use an AVL tree to store about
20 million records
• We end up with a very deep binary tree with lots
of different disk accesses; log2 20,000,000 is
about 24, so this takes about 0.2 seconds
• We know we can’t improve on the log n lower
bound on search for a binary tree
• But, the solution is to use more branches and
thus reduce the height of the tree!
– As branching increases, depth decreases
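The effect of a larger branching factor can be checked numerically: the tree height grows as log_b(n), so for the slide's 20 million records, a branching factor around 100 needs only a handful of levels (the specific branching factors below are illustrative):

```python
import math

n = 20_000_000
for branching in (2, 101, 1001):
    # height grows as log_b(n): more children per node, shallower tree
    h = math.ceil(math.log(n, branching))
    print(f"branching factor {branching:>5}: height ~ {h}")
```

With branching factor 2 the tree is roughly 24–25 levels deep; with branching factor ~100 it is only 3–4 levels, i.e. 3–4 disk reads instead of ~24.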
B-Trees
B-Trees are useful in the following cases:

• The number of objects is too large to fit in memory.


• Need external storage.
• Disk accesses are slow, thus need to minimize the number of
disk accesses

Is RB Tree good in these situations?



B Tree Example



A B-tree of height 2 containing
over one billion keys.



B-Trees
• B-Trees are balanced, like RB trees.
• They have a large number of children (large branching factor), unlike RB trees.
• The branching factor is determined by the size of disk transfers (page size).
• Each object (node) referenced requires a DiskRead.
• Each object modified requires a DiskWrite.
• The root of the tree is kept in memory at all times.
• Insert, Delete, Search = O(h), where h is the height of the tree:
O(lg n) disk accesses, though far fewer in practice (≈ log_B n for branching factor B).



Properties of B-Trees
• Every leaf has the same depth equal to the height of the tree.
• The number of keys is bounded in terms of the minimum degree t
≥ 2. (WHY?)
• n(x) ≥ t-1 (except root ≥ 1)
• #children(x) ≥ t (except root ≥ 0), leaves = 0
• n(x) ≤ 2t - 1
• #children ≤ 2t (except leaves which = 0)
• If n(x) = 2t - 1 then x is a full node.



What is the height of B-Tree
• Given a B-Tree of height h, minimum degree t ≥ 2, and number of keys n ≥ 1,
prove that the height of the B-tree is:

h ≤ log_t((n + 1) / 2)





Searching a key

What is the time complexity?
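The search procedure (whose figure is not reproduced here) scans the keys of a node and descends into exactly one child per level, so it costs O(t · log_t n) comparisons and O(log_t n) disk reads. A sketch, with nodes represented as plain dicts (an assumed representation, not the slides' notation):

```python
def btree_search(x, k):
    """x is a node dict {'keys': [...], 'children': [...], 'leaf': bool}.
    Returns (node, index) if k is found, else None."""
    i = 0
    while i < len(x['keys']) and k > x['keys'][i]:
        i += 1                                # find first key >= k
    if i < len(x['keys']) and k == x['keys'][i]:
        return (x, i)                         # found in this node
    if x['leaf']:
        return None                           # nowhere left to look
    return btree_search(x['children'][i], k)  # one disk read per level
```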



Inserting a Node
Overview
• If node x is a non-full (< 2t-1 keys) leaf, then insert new key k in node x
• If node x is non-full but not a leaf, then recurse to appropriate child of x
• If node x is full (2t-1 keys), then ``split'' the node into x1 and x2, and
recurse to appropriate node x1 or x2.



Splitting a Node

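Splitting a full child (the figure is not reproduced here) can be sketched as follows, using the same assumed dict representation of nodes: for minimum degree t, a full node has 2t − 1 keys; its median key moves up into the parent, and the upper t − 1 keys move into a new sibling node.

```python
def split_child(x, i, t):
    """Split x's full child y = x['children'][i] (y has 2t-1 keys).
    The median key of y moves up into x; y keeps the first t-1 keys
    and a new node z takes the last t-1 keys."""
    y = x['children'][i]
    z = {'leaf': y['leaf'],
         'keys': y['keys'][t:],              # upper t-1 keys go to z
         'children': y['children'][t:]}      # and the upper t children
    median = y['keys'][t - 1]
    y['keys'] = y['keys'][:t - 1]            # lower t-1 keys stay in y
    y['children'] = y['children'][:t]        # with the lower t children
    x['keys'].insert(i, median)              # median rises into the parent
    x['children'].insert(i + 1, z)           # z sits just right of y
```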


Inserting a Node



Inserting a Node



Insertion Example

Insert B:



Insertion Example

Insert Q:



Insertion Example

Insert L:



Insertion Example

Insert F:





Deleting a Node
Overview

Deletion: B-Tree-Delete(x, k)
1) Search down the tree for node containing k
2) When B-Tree-Delete is called recursively, the number of keys in x must
be at least the minimum degree t (the root can have < t keys)
1) If x is a leaf, just remove key k and still have at least t-1 keys in x
2) If there are not ≥ t keys in x, then borrow keys from other nodes.



Deleting a Node

[Figures: a sequence of slides illustrating the deletion cases step by step.]
Deletion Example



1. If the key k is in node x and x is a leaf, delete the key k from x.

2. If the key k is in node x and x is an internal node, do the following.

a)If the child y that precedes k in node x has at least t keys, then find
the predecessor k' of k in the subtree rooted at y. Recursively
delete k', and replace k by k' in x. (Finding k' and deleting it can be
performed in a single downward pass.)

b)Symmetrically, if the child z that follows k in node x has at


least t keys, then find the successor k' of k in the subtree rooted at z.
Recursively delete k', and replace k by k' in x. (Finding k' and
deleting it can be performed in a single downward pass.)

c) Otherwise, if both y and z have only t - 1 keys, merge k and all


of z into y, so that x loses both k and the pointer to z, and y now
contains 2t - 1 keys. Then, free z and recursively delete k from y.
3. If the key k is not present in internal node x, determine the root ci[x]
of the appropriate subtree that must contain k, if k is in the tree at all.
If ci[x] has only t - 1 keys, execute step 3a or 3b as necessary to
guarantee that we descend to a node containing at least t keys. Then,
finish by recursing on the appropriate child of x.

a) If ci[x] has only t - 1 keys but has a sibling with t keys, give ci[x]
an extra key by moving a key from x down into ci[x], moving a
key from ci[x]'s immediate left or right sibling up into x, and
moving the appropriate child from the sibling into ci[x].

b) If ci[x] and all of ci[x]'s siblings have t - 1 keys, merge ci[x] with one
sibling, which involves moving a key from x down into the new
merged node to become the median key for that node.
Deletion Example

Delete F



Deletion Example

Delete M



Deletion Example

Delete G



Deletion Example

Delete D



Deletion Example

Delete B



Deletion Example



Reasons for using B-Trees
• When searching tables held on disc, the cost of each
disc transfer is high but doesn't depend much on the
amount of data transferred, especially if consecutive
items are transferred
– If we use a B-tree of order 101, say, we can transfer each
node in one disc read operation
– A B-tree of order 101 and height 3 can hold 101^4 − 1 items
(approximately 100 million) and any item can be accessed
with 3 disc reads (assuming we hold the root in memory)
• If we take m = 3, we get a 2-3 tree, in which non-leaf
nodes have two or three children (i.e., one or two keys)
– B-Trees are always balanced (since the leaves are all at the
same level), so 2-3 trees make a good type of balanced
tree

Comparing Trees
• Binary trees
– Can become unbalanced and lose their good time complexity
(big O)
– AVL trees are strict binary trees that overcome the balance
problem
– Heaps remain balanced but only prioritise (not order) the keys

• Multi-way trees
– B-Trees can be m-way, they can have any (odd) number of
children
– One B-Tree, the 2-3 (or 3-way) B-Tree, approximates a
permanently balanced binary tree, exchanging the AVL tree’s
balancing operations for insertion and (more complex) deletion
operations


Agenda: Binomial Heaps


Binomial Heaps
DATA STRUCTURES: MERGEABLE HEAPS
• MAKE-HEAP ( )
– Creates & returns a new heap with no elements.
• INSERT (H,x)
– Inserts a node x into heap H. key field of the node
has already been filled.
• MINIMUM (H)
– Returns a pointer to the node in heap H whose key
is minimum.



Mergeable Heaps

• EXTRACT-MIN (H)
– Deletes the node from heap H whose key is
minimum. Returns a pointer to the node.

• DECREASE-KEY (H, x, k)
– Assigns to node x within heap H the new value k
where k is smaller than its current key value.



Mergeable Heaps
• DELETE (H, x)
– Deletes node x from heap H.

• UNION (H1, H2)


– Creates and returns a new heap that contains all
nodes of heaps H1 & H2.
– Heaps H1 & H2 are destroyed by this operation



Binomial Trees
• A binomial heap is a collection of binomial
trees.
• The binomial tree Bk is an ordered tree defined
recursively
B0 consists of a single node.
Bk consists of two binomial trees Bk-1 linked together:
the root of one is the leftmost child of the root of the other.



Binomial Trees
[Figure: the binomial trees B0, B1, B2, B3 and B4; each Bk consists of two linked copies of Bk-1.]
Binomial Trees

[Figure: the recursive structure of Bk — the root's children are, from left to right, the roots of Bk-1, Bk-2, ..., B1, B0.]
Properties of Binomial Trees
LEMMA: For the binomial tree Bk:
1. There are 2^k nodes,
2. The height of the tree is k,
3. There are exactly C(k, i) nodes at depth i
for i = 0, 1, ..., k, and
4. The root has degree k > degree of any other
node; if the children of the root are numbered
from left to right as k-1, k-2, ..., 0, child i is the
root of a subtree Bi.



Properties of Binomial Trees
PROOF: By induction on k
Each property holds for the basis B0
INDUCTIVE STEP: assume that Lemma
holds for Bk-1
1. Bk consists of two copies of Bk-1, so
|Bk| = |Bk-1| + |Bk-1| = 2^(k-1) + 2^(k-1) = 2^k
2. Height(Bk-1) = k-1 by induction, so
Height(Bk) = Height(Bk-1) + 1 = (k-1) + 1 = k



Properties of Binomial Trees
3. Let D(k, i) denote the number of nodes at depth i of Bk. Since Bk is
two linked copies of Bk-1, a node at depth i of Bk lies either at depth i
of one copy or at depth i - 1 of the other copy (which hangs one level
lower). By the induction hypothesis and Pascal's rule:

D(k, i) = D(k-1, i-1) + D(k-1, i) = C(k-1, i-1) + C(k-1, i) = C(k, i)



Properties of Binomial Trees(Cont.)

4. The only node with greater degree in Bk than in Bk-1 is the root:

• the root of Bk has one more child than the root of Bk-1, so

degree(root of Bk) = degree(root of Bk-1) + 1 = (k-1) + 1 = k



Properties of Binomial Trees (Cont.)
• COROLLARY: The maximum degree of any
node in an n-node binomial tree is lg n.

The term BINOMIAL TREE comes from the 3rd property:
there are C(k, i) nodes at depth i of Bk, and the
terms C(k, i) are the binomial coefficients.



Binomial Heaps
A BINOMIAL HEAP H is a set of BINOMIAL
TREES that satisfies the following “Binomial
Heap Properties”
1. Each binomial tree in H is HEAP-ORDERED
• the key of a node is ≥ the key of the parent
• Root of each binomial tree in H contains the
smallest key in that tree.



Binomial Heaps
2. There is at most one binomial tree in H whose
root has a given degree:
– an n-node binomial heap H consists of at most
⌊lg n⌋ + 1 binomial trees;
– the binary representation of n has ⌊lg n⌋ + 1 bits,

n = ⟨b_⌊lg n⌋, b_⌊lg n⌋-1, ..., b_1, b_0⟩ = Σ_{i=0}^{⌊lg n⌋} b_i 2^i

By property 1 of the lemma (Bi contains 2^i nodes), Bi
appears in H iff bit b_i = 1.



Binomial Heaps
Example: A binomial heap with n = 13 nodes.
13 = ⟨1, 1, 0, 1⟩₂ (bits 3 2 1 0), so H consists of B0, B2 and B3.

[Figure: head[H] points to the root list B0 (root 10), B2 (root 1), B3 (root 6); the remaining keys shown are 8, 11, 12, 14, 17, 18, 25, 27, 29, 38.]


Representation of Binomial Heaps
• Each binomial tree within a binomial heap is stored in
the left-child, right-sibling representation
• Each node X contains POINTERS
– p[x] to its parent
– child[x] to its leftmost child
– sibling[x] to its immediately right sibling
• Each node X also contains the field degree[x] which
denotes the number of children of X.



Representation of Binomial Heaps
[Figure: the root list is a linked list starting at HEAD[H]; each node stores its parent, key, degree, child and sibling pointers.]



Representation of Binomial Heaps

• Let x be a node with sibling[x] ≠ NIL:

– degree[sibling[x]] = degree[x] − 1 if x is NOT a root

– degree[sibling[x]] > degree[x] if x is a root



Operations on Binomial Heaps
CREATING A NEW BINOMIAL HEAP

MAKE-BINOMIAL-HEAP ( )
allocate H RUNNING-TIME= Θ(1)
head [ H ]  NIL
return H
end



Operations on Binomial Heaps
BINOMIAL-HEAP-MINIMUM (H)
x  Head [H]
y  x
min  key [x]
x  sibling [x]
while x ≠ NIL do
    if key [x] < min then
        min  key [x]
        y  x
    endif
    x  sibling [x]
endwhile
return y
end

BITS Pilani, Deemed to be University under Section 3 of UGC Act, 1956


Operations on Binomial Heaps

Since binomial heap is HEAP-ORDERED

The minimum key must reside in a ROOT NODE

Above procedure checks all roots

NUMBER OF ROOTS ≤ ⌊lg n⌋ + 1

RUNNING-TIME = O(lg n)



Uniting Two Binomial Heaps
BINOMIAL-HEAP-UNION
Procedure repeatedly link binomial trees whose roots
have the same degree
BINOMIAL-LINK
Procedure links the Bk-1 tree rooted at node y to
the Bk-1 tree rooted at node z to make z the parent of y
i.e. Node z becomes the root of a Bk tree



Uniting Two Binomial Heaps

BINOMIAL-LINK (y,z)
p [y]  z
sibling [y]  child [z]
child [z]  y
degree [z] degree [z] + 1
end
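The BINOMIAL-LINK pseudocode above translates directly into code (a sketch; the BNode class is an assumed minimal node layout with the same fields as the pseudocode):

```python
# Hypothetical minimal binomial-tree node, mirroring the pseudocode's fields.
class BNode:
    def __init__(self, key):
        self.key = key
        self.degree = 0
        self.parent = self.child = self.sibling = None

def binomial_link(y, z):
    """Make y (root of a B_{k-1}) the leftmost child of z (root of
    another B_{k-1}); z becomes the root of a B_k."""
    y.parent = z
    y.sibling = z.child      # y's sibling pointer takes z's old child list
    z.child = y              # y becomes the leftmost child
    z.degree += 1
```

Linking two B0's gives a B1; linking two B1's gives a B2 whose root has degree 2, matching the lemma's property 4.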



Uniting Two Binomial Heaps
[Figure: BINOMIAL-LINK(y, z) — y becomes the leftmost child of z, y's sibling pointer takes over z's old child list, and degree[z] increases by 1.]
Uniting Two Binomial Heaps: Cases
We maintain 3 pointers into the root list

x = points to the root currently being examined
prev-x = points to the root PRECEDING x on the root list (sibling [prev-x] = x)
next-x = points to the root FOLLOWING x on the root list (sibling [x] = next-x)



Uniting Two Binomial Heaps
• Initially, there are at most two roots of the same
degree
• BINOMIAL-HEAP-MERGE guarantees that if two roots in H
have the same degree, they are adjacent in the root list
• During the execution of union, there may be three
roots of the same degree appearing on the root list at
some time



Uniting Two Binomial Heaps
CASE 1: Occurs when degree [x] ≠ degree [next-x]
Action: the pointers simply march down the root list (prev-x ← x, x ← next-x).
[Figure omitted: root list a, b, c, d, where x = b is a Bk and next-x = c is a Bl with l > k.]


Uniting Two Binomial Heaps: Cases
CASE 2: Occurs when x is the first of 3 roots of equal degree:
degree [x] = degree [next-x] = degree [sibling [next-x]]
Action: march as in CASE 1 (prev-x ← x, x ← next-x); the linking happens on the next iteration.
[Figure omitted: roots b, c, d are all Bk; the pointers advance one position.]


Uniting Two Binomial Heaps: Cases
CASE 3 & 4: Occur when x is the first of 2 roots of equal degree
degree [x] = degree [next-x] ≠ degree [sibling [next-x]]

• Occur on the next iteration after any case
• Always occur immediately following CASE 2
• Two cases are distinguished by whether x or next-x has the
smaller key
• The root with the smaller key becomes the root of the linked tree



Uniting Two Binomial Heaps: Cases
CASE 3 & 4 CONTINUED
[Figure omitted: root list a, b, c, d, where x = b and next-x = c are both Bk and d is a Bl with l > k.]
CASE 3 (key [b] ≤ key [c]): c is linked under b, and x stays at b.
CASE 4 (key [b] > key [c]): b is linked under c, and x moves to c.


Uniting Two Binomial Heaps: Cases
The running time of binomial-heap-union operation is
O (lgn)

• Let H1 & H2 contain n1 & n2 nodes respectively, where n = n1 + n2
• Then H1 contains at most ⌊lg n1⌋ + 1 roots, and
  H2 contains at most ⌊lg n2⌋ + 1 roots



Uniting Two Binomial Heaps: Cases
• So H contains at most
⌊lg n1⌋ + ⌊lg n2⌋ + 2 ≤ 2 ⌊lg n⌋ + 2 = O(lg n) roots
immediately after BINOMIAL-HEAP-MERGE

• Therefore, BINOMIAL-HEAP-MERGE runs in O(lgn)


time and

• BINOMIAL-HEAP-UNION runs in O (lgn) time



Binomial-Heap-Union Procedure

BINOMIAL-HEAP-MERGE PROCEDURE

- Merges the root lists of H1 & H2 into a single linked list
- Sorted by degree into monotonically increasing order



Binomial-Heap-Union Procedure
BINOMIAL-HEAP-UNION (H1, H2)
H ← MAKE-BINOMIAL-HEAP ( )
head [H] ← BINOMIAL-HEAP-MERGE (H1, H2)
free the objects H1 & H2 but not the lists they point to
prev-x ← NIL
x ← head [H]
next-x ← sibling [x]
while next-x ≠ NIL do
if degree [x] ≠ degree [next-x] OR
   (sibling [next-x] ≠ NIL and degree [sibling [next-x]] = degree [x]) then
prev-x ← x                             CASE 1 and 2
x ← next-x                             CASE 1 and 2
elseif key [x] ≤ key [next-x] then
sibling [x] ← sibling [next-x]         CASE 3



Binomial-Heap-Union Procedure (Cont.)
BINOMIAL-LINK (next-x, x)              CASE 3
else
if prev-x = NIL then
head [H] ← next-x                      CASE 4
else                                   CASE 4
sibling [prev-x] ← next-x              CASE 4
endif
BINOMIAL-LINK (x, next-x)              CASE 4
x ← next-x                             CASE 4
endif
next-x ← sibling [x]
endwhile
return H
end



Uniting Two Binomial Heaps vs
Adding Two Binary Numbers

A binomial heap with n nodes contains one Bi for every 1-bit in the binary representation of n (bit positions 5 4 3 2 1 0 below):

ex: n1 = 39 : H1 = < 1 0 0 1 1 1 > = { B0, B1, B2, B5 }
    n2 = 54 : H2 = < 1 1 0 1 1 0 > = { B1, B2, B4, B5 }
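The bit correspondence is easy to check in code: the tree degrees present in an n-node binomial heap are exactly the set-bit positions of n, and uniting two heaps adds the node counts. A small sketch (not from the slides):

```python
def tree_degrees(n):
    """Degrees B_i present in a binomial heap with n nodes:
    the positions of the 1-bits in the binary representation of n."""
    return [i for i in range(n.bit_length()) if (n >> i) & 1]

h1 = tree_degrees(39)          # 39 = 100111 in binary
h2 = tree_degrees(54)          # 54 = 110110 in binary
union = tree_degrees(39 + 54)  # 93 = 1011101 in binary
```

Uniting the two heaps of the example therefore yields trees { B0, B2, B3, B4, B6 }, matching the binary addition 39 + 54 = 93.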



MERGE produces the root list of H:  B0 B1 B1 B2 B2 B4 B5 B5

bit 0: 1 + 0 = 1         CASE 1 (march); B0 stays
bit 1: 1 + 1 = 10        CASE 3 or 4 (link the two B1s):  B0 B2 B2 B2 B4 B5 B5
bit 2: 1 + 1 + Cin = 11  CASE 2 (march), then CASE 3 or 4 (link two B2s):  B0 B2 B3 B4 B5 B5
bit 3: 0 + 0 + Cin = 1   CASE 1 (march); B3 stays
bit 4: 0 + 1 = 1         CASE 1 (march); B4 stays
bit 5: 1 + 1 = 10        CASE 3 or 4 (link the two B5s):  B0 B2 B3 B4 B6

Final: H = { B0, B2, B3, B4, B6 }, n = 39 + 54 = 93 = < 1 0 1 1 1 0 1 >


Inserting a Node
BINOMIAL-HEAP-INSERT (H, x)
H' ← MAKE-BINOMIAL-HEAP ( )
p [x] ← NIL
child [x] ← NIL
sibling [x] ← NIL
degree [x] ← 0
head [H'] ← x
H ← BINOMIAL-HEAP-UNION (H, H')
end

RUNNING-TIME = O(lg n)



Relationship Between Insertion &
Incrementing a Binary Number
H : n1 = 51 : H = < 1 1 0 0 1 1 > = { B0, B1, B4, B5 }

Insert a new node x (a B0) and unite:

MERGE (H, H'):       B0 B0 B1 B4 B5
LINK the two B0s:    B1 B1 B4 B5
LINK the two B1s:    B2 B4 B5

Result: H = { B2, B4, B5 }, n = 52 = < 1 1 0 1 0 0 >, exactly the binary increment 110011 + 1 = 110100.
Extracting the Node with the Minimum Key

BINOMIAL-HEAP-EXTRACT-MIN (H)
(1) find the root x with the minimum key in the
root list of H and remove x from the root list of H
(2) H' ← MAKE-BINOMIAL-HEAP ( )
(3) reverse the order of the linked list of x's children
and set head [H'] ← head of the resulting list
(4) H ← BINOMIAL-HEAP-UNION (H, H')
return x
end



Extracting the Node with the Minimum Key

Consider H with n = 27, H = < 1 1 0 1 1 > = { B0, B1, B3, B4 };
assume that x = root of B3 is the root with the minimum key.

[Figure omitted: x (the B3 root) sits on the root list between B1 and B4; its children are the roots of a B2, a B1, and a B0.]


Extracting the Node with the Minimum Key

[Figure omitted: head [H] now points to the root list { B0, B1, B4 }, and head [H'] points to x's children in reversed order: B2, B1, B0.]



Extracting the Node with the Minimum Key

• Unite binomial heaps H = { B0, B1, B4 } and H' = { B0, B1, B2 }
• Running time, if H has n nodes: each of lines 1-4 takes O(lg n) time, so the total is O(lg n).



Decreasing a Key
BINOMIAL-HEAP-DECREASE-KEY (H, x, k)
if k > key [x] then
    error "new key is greater than current key"
key [x] ← k
y ← x
z ← p [y]
while z ≠ NIL and key [y] < key [z] do
exchange key [y] ↔ key [z]
exchange satellite fields of y and z
y ← z
z ← p [y]
endwhile
end



Decreasing a Key

• Similar to DECREASE-KEY in BINARY HEAP

• BUBBLE-UP the key in the binomial tree it


resides in

• RUNNING TIME: O(lgn)



Deleting a Key
BINOMIAL- HEAP- DELETE (H,x)
y←x
z ← p [y]
while z ≠ NIL do RUNNING-TIME= O(lg n)
key [y] ← key [z]
satellite field of y ← satellite field of z
y ← z ; z ← p [y]
endwhile
H’← MAKE-BINOMIAL-HEAP
remove root z from the root list of H
reverse the order of the linked list of z’s children
and set head [H’] ← head of the resulting list
H ← BINOMIAL-HEAP-UNION (H, H’)



BITS Pilani
Pilani|Dubai|Goa|Hyderabad
Data Structures and Algorithms
CS F211
Vishal Gupta
Department of Computer Science and Information Systems
BITS Pilani Birla Institute of Technology and Science
Pilani|Dubai|Goa|Hyderabad
Pilani Campus, Pilani

Agenda: Elementary Graph Algorithms
Definitions
• A graph G=(V,E), V and E are two sets
– V: finite non-empty set of vertices
– E: set of pairs of vertices, edges
• Undirected graph
– The pair of vertices representing any edge is unordered.
Thus, the pairs (u,v) and (v,u) represent the same edge
• Directed graph
– each edge is represented by an ordered pair <u,v>



Examples of Graph G1

• G1
– V(G1) = {0,1,2,3}
– E(G1) = {(0,1),(0,2),(0,3),(1,2),(1,3),(2,3)}
[Figure omitted: G1 is the complete graph on four vertices.]



Examples of Graph G2

• G2
– V(G2) = {0,1,2,3,4,5,6}
– E(G2) = {(0,1),(0,2),(1,3),(1,4),(2,5),(2,6)}
• G2 is also a tree
– Tree is a special case of graph
[Figure omitted: G2 drawn as a complete binary tree rooted at 0.]



Examples of Graph G3

• G3
– V(G3) = {0,1,2}
– E(G3) = {<0,1>, <1,0>, <1,2>}
• Directed graph (digraph)



Adjacent and Incident

• If (u,v) is an edge in an undirected graph,


– Adjacent: u and v are adjacent
– Incident: The edge (u,v) is incident on vertices u
and v
• If <u,v> is an edge in a directed graph
– Adjacent: u is adjacent to v, and v is adjacent from u
– Incident: The edge <u,v> is incident on u and v



Cycle

• Cycle
– a simple path whose first and last vertices are the same
• 0,1,2,0 is a cycle
• Acyclic graph
– a graph containing no cycle



Degree
• Degree of Vertex
– is the number of edges incident to that vertex
• Degree in directed graph
– Indegree
– Outdegree



Graph Representations
• Adjacency Matrix
• Adjacency Lists



Adjacency Matrix
• Adjacency Matrix: let G = (V, E) with n vertices, n ≥ 1.
The adjacency matrix of G is a 2-dimensional n × n matrix, A
– A(i, j) = 1 iff (vi, vj) ∈ E(G)
(<vi, vj> for a digraph)
– A(i, j) = 0 otherwise.
• The adjacency matrix for an undirected graph is
symmetric; the adjacency matrix for a digraph need
not be symmetric



Example

Adjacency matrix of G1 (the complete graph on vertices 0-3):

    0 1 1 1
    1 0 1 1
    1 1 0 1
    1 1 1 0

Adjacency matrix of the digraph G3 (not symmetric):

    0 1 0
    1 0 1
    0 0 0

Adjacency matrix of G4 (8 vertices, two components):

    0 1 1 0 0 0 0 0
    1 0 0 1 0 0 0 0
    1 0 0 1 0 0 0 0
    0 1 1 0 0 0 0 0
    0 0 0 0 0 1 0 0
    0 0 0 0 1 0 1 0
    0 0 0 0 0 1 0 1
    0 0 0 0 0 0 1 0

The matrix of an undirected graph is symmetric, so about n²/2 entries suffice; a digraph may need all n² entries.
Merits of Adjacency Matrix

• From the adjacency matrix, it is easy to determine whether two vertices are adjacent
• The degree of vertex i is the row sum Σ_{j=0}^{n-1} adj_mat[i][j]
• For a digraph, the row sum is the out-degree, while the column sum is the in-degree:
  outd(vi) = Σ_{j=0}^{n-1} A[i][j]
  ind(vi)  = Σ_{j=0}^{n-1} A[j][i]
Adjacency Lists
• Replace n rows of the adjacency matrix with n
linked list



Data Structures for Adjacency Lists
Each row in adjacency matrix is represented as an adjacency list.

#define MAX_VERTICES 50
typedef struct node *node_pointer;
struct node {
    int vertex;
    struct node *link;
};
node_pointer graph[MAX_VERTICES];
int n = 0; /* vertices currently in use */
Example(1)

G1:  0 → 1 → 2 → 3
     1 → 0 → 2 → 3
     2 → 0 → 1 → 3
     3 → 0 → 1 → 2

G3:  0 → 1
     1 → 0 → 2
     2 → (empty)


Example(2)

G4:  0 → 1 → 2
     1 → 0 → 3
     2 → 0 → 3
     3 → 1 → 2
     4 → 5
     5 → 4 → 6
     6 → 5 → 7
     7 → 6
Interesting Operations

degree of a vertex in an undirected graph
– # of nodes in its adjacency list
# of edges in a graph
– determined in O(n+e)
out-degree of a vertex in a directed graph
– # of nodes in its adjacency list
in-degree of a vertex in a directed graph
– requires traversing the whole data structure



Breadth-first search

Breadth First Search

• Breadth-first search is one of the simplest


algorithms for searching a graph.
• Dijkstra's single-source shortest-paths
algorithm Prim's minimum-spanning-tree
algorithm use ideas similar to those in BFS.

Breadth First Search

• Given a graph G = (V, E) and a


distinguished source vertex s, BFS systematically
explores the edges of G to "discover" every vertex
that is reachable from s.
• It computes the distance (fewest number of edges)
from s to all such reachable vertices.
• It also produces a "breadth-first tree" with
root s that contains all such reachable vertices.
• BFS discovers all vertices at distance k from s before
discovering any vertices at distance k + 1.

Breadth First Search
Breadth_First_Search(G, s)
    for each vertex u ∈ G.V - {s}
        u.color = WHITE
        u.d = ∞
        u.π = NIL
    s.color = GRAY
    s.d = 0
    s.π = NIL
    Q = ∅
    ENQUEUE (Q, s)
    while Q ≠ ∅
    {
        u = DEQUEUE (Q)
        for each v ∈ G.Adj[u]
        {
            if v.color == WHITE
                v.color = GRAY
                v.d = u.d + 1
                v.π = u
                ENQUEUE (Q, v)
        }
        u.color = BLACK
    }
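The pseudocode translates almost line for line to Python. A minimal sketch on the tree G2 from the earlier examples (the color attribute is folded into a membership test on the distance table):

```python
from collections import deque

def bfs(adj, s):
    """Return a dict of fewest-edge distances from s; unreachable vertices are absent."""
    dist = {s: 0}          # s.d = 0; every other vertex starts "WHITE" (absent)
    q = deque([s])         # ENQUEUE (Q, s)
    while q:
        u = q.popleft()    # DEQUEUE (Q)
        for v in adj[u]:
            if v not in dist:          # v.color == WHITE
                dist[v] = dist[u] + 1  # v.d = u.d + 1
                q.append(v)            # ENQUEUE (Q, v)
    return dist

# G2, the tree from the examples: edges 0-1, 0-2, 1-3, 1-4, 2-5, 2-6
g2 = {0: [1, 2], 1: [0, 3, 4], 2: [0, 5, 6],
      3: [1], 4: [1], 5: [2], 6: [2]}
d = bfs(g2, 0)
```

From source 0, the distances are 0 for the root, 1 for its children, and 2 for the leaves, so BFS discovers all vertices at distance k before any at distance k + 1.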

Depth first search
Global Variable: time

DEPTH_FIRST_SEARCH (G)
    for each vertex u ∈ G.V
    {
        u.color = WHITE
        u.π = NIL
    }
    time = 0
    for each vertex u ∈ G.V
    {
        if u.color == WHITE
            DFS-VISIT (G, u)
    }

DFS-VISIT (G, u)
    time = time + 1
    u.d = time
    u.color = GRAY
    for each v ∈ G.Adj[u]
    {
        if v.color == WHITE
            v.π = u
            DFS-VISIT (G, v)
    }
    u.color = BLACK
    time = time + 1
    u.f = time
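A Python sketch of DFS with discovery and finishing times; the helper at the end checks the parenthesis structure of the [d, f] intervals discussed on the next slide (the example digraph is hypothetical):

```python
def dfs(adj):
    """Return dicts d (discovery times) and f (finishing times)."""
    color = {u: 'WHITE' for u in adj}
    d, f = {}, {}
    time = 0

    def visit(u):
        nonlocal time
        time += 1; d[u] = time        # u discovered
        color[u] = 'GRAY'
        for v in adj[u]:
            if color[v] == 'WHITE':
                visit(v)
        color[u] = 'BLACK'
        time += 1; f[u] = time        # u finished

    for u in adj:
        if color[u] == 'WHITE':
            visit(u)
    return d, f

adj = {'a': ['b', 'c'], 'b': ['d'], 'c': ['d'], 'd': []}
d, f = dfs(adj)

def nested_or_disjoint(u, v):
    """Parenthesis theorem: the intervals [d, f] of any two vertices
    are either entirely disjoint or properly nested."""
    return (f[u] < d[v] or f[v] < d[u]
            or (d[u] < d[v] and f[v] < f[u])
            or (d[v] < d[u] and f[u] < f[v]))
```

Running this on the example, every pair of intervals is nested or disjoint, and a's interval contains its descendant b's interval.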

DFS: Properties
• discovery and finishing times have parenthesis structure.

Parenthesis Theorem
In any depth first search of G = (V, E), for any two vertices u and v, exactly
one of the following three conditions holds:

a) The intervals [u.d, u.f] and [v.d, v.f] are entirely disjoint, and neither u nor
v is a descendant of the other in the depth first forest.
b) The interval [u.d, u.f] is contained entirely within the interval [v.d, v.f],
and u is a descendent of v in a depth first tree, or
c) The interval [v.d, v.f] is contained entirely within the interval [u.d, u.f],
and v is a descendent of u in a depth first tree

Corollary: Vertex v is a proper descendant of vertex u in the depth first forest


for a (directed or undirected) graph G if and only if
u.d < v.d < v.f < u.f

Classification of Edges
• DFS can be used to classify the edges of G into:
a) Tree Edge: Edge (u, v) is a tree edge if v was first discovered by exploring edge
(u, v).
b) Back edges: Edges (u, v) connecting a vertex u to an ancestor v in a depth-first
tree. Self-loops, which may occur in directed graphs, are considered to be back
edges.
c) Forward edges: Those non-tree edges (u, v) connecting a vertex u to a
descendant v in a depth-first tree.
d) Cross edges are all other edges. They can go between vertices in the same
depth-first tree, as long as one vertex is not an ancestor of the other, or they can
go between vertices in different depth-first trees.
• Edge (u, v) can be classified by the color of the vertex v that is reached when
the edge is first explored
1. WHITE indicates a tree edge,
2. GRAY indicates a back edge, and
3. BLACK indicates a forward or cross edge.
Properties
• In a depth-first search of an undirected graph G, every edge of
G is either a tree edge or a back edge.
• A directed graph is acyclic if and only if a depth-first search
yields no “back” edges.
• For a weighted graph, a DFS traversal yields a spanning tree of each connected
component, but in general it is neither a minimum spanning tree nor a shortest-path tree.
• We can specialize the DFS algorithm to find a path between
two given vertices u and z.
i. Call DFS(G, u) with u as the start vertex.
ii. Use a stack S to keep track of the path between the start
vertex and the current vertex.
iii. As soon as destination vertex z is encountered, return the
path as the contents of the stack

Topological sort
• A topological sort of a DAG G = (V, E) is a linear ordering of all its vertices
such that if G contains an edge (u, v), then u appears before v in the
ordering.
• A topological sort of a graph can be viewed as an ordering of its vertices
along a horizontal line so that all directed edges go from left to right.
• If the graph is not acyclic, then no linear ordering is possible.

TOPOLOGICAL-SORT (G)
1. call DFS(G) to compute finishing times u.f for each vertex u
2. As each vertex is finished, insert it onto the front of a linked list
3. Return the linked list of vertices
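The three steps of TOPOLOGICAL-SORT can be sketched in Python; prepending each vertex as it finishes produces the reverse order of finishing times (the dressing-order DAG below is hypothetical):

```python
def topological_sort(adj):
    """DFS-based topological sort: prepend each vertex as it finishes."""
    color = {u: 'WHITE' for u in adj}
    order = []

    def visit(u):
        color[u] = 'GRAY'
        for v in adj[u]:
            if color[v] == 'WHITE':
                visit(v)
        color[u] = 'BLACK'
        order.insert(0, u)   # insert at the front of the "linked list"

    for u in adj:
        if color[u] == 'WHITE':
            visit(u)
    return order

dag = {'pants': ['belt'], 'belt': ['jacket'], 'shirt': ['tie', 'belt'],
       'tie': ['jacket'], 'jacket': []}
order = topological_sort(dag)
pos = {u: i for i, u in enumerate(order)}
```

Every edge (u, v) of the DAG ends up pointing left to right: u appears before v in the returned ordering.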

Data Structures and Algorithms
CS F211
Vishal Gupta
Department of Computer Science and Information Systems
BITS Pilani Birla Institute of Technology and Science
Pilani|Dubai|Goa|Hyderabad
Pilani Campus, Pilani

Agenda: Minimum Spanning Trees


Four classes of graph problem
.. that can be solved efficiently (in polynomial time)

1. Shortest path – find a shortest path between two vertices in a graph


2. Minimum spanning tree – find subset of edges with minimum total
weights
3. Matching – find set of edges without common vertices
4. Maximum flow – find the maximum flow from a source vertex to a sink
vertex

A wide array of graph problems that can be solved in polynomial time are
variants of these above problems.

In this class, we’ll cover the first two problems – shortest path and minimum
spanning tree

Minimum Spanning Trees

• It’s the 1920’s. Your friend at the electric company needs to choose where to build
wires to connect all these cities to the plant.
[Figure omitted: cities A-E and a plant, with candidate wires and their costs.]

She knows how much it would cost to lay electric wires between any pair of locations, and wants the cheapest way to make sure electricity gets from the plant to every city.



Minimum Spanning Trees

• It's the 1950's. Your boss at the phone company needs to choose where to build wires to connect all these cities to each other.

[Figure omitted: the same cities, with candidate phone wires and their costs.]

She knows how much it would cost to lay phone wires between any pair of locations, and wants the cheapest way to make sure everyone can call everyone else.



Minimum Spanning Trees

• It's today. Your ISP needs to choose where to lay fiber optic cable so that all these cities can reach the Internet.

[Figure omitted: the same cities plus a server, with candidate cable runs and their costs.]

She knows how much it would cost to lay cable between any pair of locations, and wants the cheapest way to make sure everyone can reach the server.



Minimum Spanning Trees
• What do we need? A set of edges such that:
– Every vertex touches at least one of the edges.
(the edges span the graph)
– The graph on just those edges is connected.
– The minimum weight set of edges that meet those
conditions.
• Assume all edge weights are positive.
• Claim: The set of edges we pick never has a
cycle. Why?



Aside: Trees

• Our BSTs had:


– A root
– Left and/or right children
– Connected and no cycles
• Our heaps had:
– A root
– Varying numbers of children (but same at each level of the tree)
– Connected and no cycles
• On graphs, our trees:
– Don't need a root (the vertices aren't ordered, and we can start BFS from anywhere)
– Varying numbers of children (can also vary at each level)
– Connected and no cycles

Tree (when talking about graphs): an undirected, connected, acyclic graph.



MST Problem
Given an undirected graph G = (V, E), and for each edge (u, v) ∈ E, we have a
weight w(u, v) specifying the cost between u and v. We then wish to find an
acyclic subset T ⊆ E that connects all of the vertices and whose total weight
is minimized.

Generic MST Algorithm
GENERIC_MST (G, w)
A = ∅
while A does not form a spanning tree
    find an edge (u, v) that is safe for A
    A = A ∪ {(u, v)}
return A

Definitions
• A cut (S, V - S) of an undirected graph G = (V, E) is a partition of V.
• We say that an edge (u, v)  E crosses the cut (S, V - S) if one of its
endpoints is in S and the other is in V - S.
• We say that a cut respects a set A of edges if no edge in A crosses the cut
• An edge is a light edge crossing a cut if its weight is the minimum of any
edge crossing the cut. Note that there can be more than one light edge
crossing a cut in the case of ties.

Theorem
Let G = (V, E) be a connected, undirected graph with a real-valued weight
function w defined on E. Let A be a subset of E that is included in some
minimum spanning tree for G, let (S, V - S) be any cut of G that respects A,
and let (u, v) be a light edge crossing (S, V - S). Then, edge (u, v) is safe for A.

Kruskal’s Algorithm
MST_KRUSKAL (G, w)
A = ∅
for each vertex v ∈ G.V
    MAKE-SET (v)
sort the edges of G.E into non-decreasing order by weight w
for each edge (u, v) ∈ G.E, taken in non-decreasing order by weight
    if FIND-SET (u) ≠ FIND-SET (v)
        A = A ∪ {(u, v)}
return A
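A minimal Python sketch of MST_KRUSKAL with a simple union-find in place of the MAKE-SET / FIND-SET machinery (the example graph and its weights are hypothetical):

```python
def kruskal(vertices, edges):
    """edges: list of (w, u, v). Returns (total MST weight, chosen edges)."""
    parent = {v: v for v in vertices}       # MAKE-SET for every vertex

    def find(x):                            # FIND-SET with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total, mst = 0, []
    for w, u, v in sorted(edges):           # non-decreasing order by weight
        ru, rv = find(u), find(v)
        if ru != rv:                        # safe edge: joins two components
            parent[ru] = rv                 # UNION
            mst.append((u, v))
            total += w
    return total, mst

edges = [(4, 'A', 'B'), (2, 'A', 'C'), (1, 'B', 'C'),
         (5, 'B', 'D'), (8, 'C', 'D'), (10, 'C', 'E'), (2, 'D', 'E')]
weight, mst = kruskal('ABCDE', edges)
```

On this graph Kruskal picks B-C (1), A-C (2), D-E (2), rejects A-B (would close a cycle), and finishes with B-D (5), for a total weight of 10 and |V| - 1 = 4 edges.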

Prim’s Algorithm
MST_PRIM (G, w, r)
for each u ∈ G.V
    u.key = ∞
    u.π = NIL
r.key = 0
Q = G.V
while Q ≠ ∅
    u = EXTRACT_MIN (Q)
    for each v ∈ G.Adj[u]
        if v ∈ Q and w(u, v) < v.key
            v.π = u
            v.key = w(u, v)
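A sketch of MST_PRIM with the min-priority queue realized by Python's heapq; lazy deletion of stale entries stands in for DECREASE-KEY. The graph is hypothetical, with total MST weight 10 from any root:

```python
import heapq

def prim(adj, r):
    """adj: {u: [(w, v), ...]}. Returns total MST weight, starting from root r."""
    in_tree = set()                 # the set V - Q of extracted vertices
    total = 0
    pq = [(0, r)]                   # (key, vertex); duplicates allowed
    while pq:
        key, u = heapq.heappop(pq)  # EXTRACT_MIN (Q)
        if u in in_tree:
            continue                # stale entry: u was already extracted
        in_tree.add(u)
        total += key                # key = w(u.π, u), or 0 for the root
        for w, v in adj[u]:
            if v not in in_tree:    # v ∈ Q: relax v.key toward w(u, v)
                heapq.heappush(pq, (w, v))
    return total

adj = {'A': [(4, 'B'), (2, 'C')],
       'B': [(4, 'A'), (1, 'C'), (5, 'D')],
       'C': [(2, 'A'), (1, 'B'), (8, 'D'), (10, 'E')],
       'D': [(5, 'B'), (8, 'C'), (2, 'E')],
       'E': [(10, 'C'), (2, 'D')]}
total = prim(adj, 'A')
```

Pushing duplicate entries and skipping stale ones keeps the heap simple at the cost of O(E) heap entries; the asymptotic bound with a binary heap is still O(E lg V).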

CS F211
(Data Structures and Algorithms)
Vishal Gupta
Department of Computer Science and Information Systems
BITS Pilani Birla Institute of Technology and Science
Pilani|Dubai|Goa|Hyderabad Pilani Campus, Pilani

Agenda: Single Source Shortest Paths


Slides Source: http://jupiter.math.nctu.edu.tw/~huilan/2011Spring/10Single-
Source%20Shortest%20Paths.pdf
Overview
► Let G = (V, E) be a weighted directed graph with weight function w.
The weight of a (directed) path p = 〈v0, v1, ..., vk〉 is
    w(p) = Σ_{i=1}^{k} w(v_{i-1}, v_i).

► Define
    δ(u, v) = min{ w(p) : p is a path from u to v }, if there is a path from u to v;
    δ(u, v) = ∞, otherwise.

► A shortest path from u to v is any path p with weight w(p) = δ(u, v).

► Example. [Figure omitted.]
► The breadth-first-search algorithm is a shortest-paths algorithm that
works on unweighted graphs (i.e., each edge has unit weight).

Variants

 Single-source shortest-paths problem: find a shortest path from a given


source vertex s to each vertex v ∈ V.
Many other problems can be solved by the algorithm for the
single-source problem, including the following variants.

 Single-destination shortest-paths problem: Find a shortest path to a
given destination vertex t from each vertex v.

 Single-pair shortest-path problem: Find a shortest path from u to v for


given vertices u and v.
No algorithms for this problem are known that run asymptotically
faster than the best single-source algorithms in the worst case.

 All-pairs shortest-paths problem: Find a shortest path from u to v for


every pair of vertices u and v.
It can usually be solved faster (Chapter 25).

Optimal substructure of a shortest path

► Optimal-substructure property: A shortest path between two vertices


contains other shortest paths within it.

Lemma 24.1: (Subpaths of shortest paths are shortest paths)
Given a weighted, directed graph G = (V, E) with weight function w, let
p = 〈 v1, v2,..., vk〉 be a shortest path from v1 to vk, then the
subpath pij = 〈 vi, vi+1,..., vj〉 is a shortest path from vi to vj.

• Dijkstra's algorithm is a greedy algorithm.


• Floyd-Warshall algorithm for all-pairs shortest-paths problem is a
dynamic-programming algorithm.

Negative-weight edges

► If G = (V, E) contains no negative-weight cycles reachable from the source


s, then for all v ∈ V , the shortest-path weight δ(s, v) remains well defined,
even if it has a negative value.
► If there is a negative-weight cycle on some path from s to v, δ(s, v) = -∞.
[Figure omitted: a graph in which some vertices are not reachable from s, and others lie on a negative-weight cycle reachable from s.]

► Dijkstra's algorithm assumes that all edge weights are nonnegative.


Bellman-Ford algorithm allows negative-weight edges and produces a
correct answer as long as no negative-weight cycles are reachable from the
source.

Cycles

► Can a shortest path contain a cycle?

• It cannot contain a negative-weight cycle.


• Nor can it contain a positive-weight cycle (it can be removed!).
• We can remove a 0-weight cycle from any path to produce another
path whose weight is the same.

Therefore, without loss of generality we can assume that when we are


finding shortest paths, they have no cycles.

► Since any acyclic path in a graph G = (V, E) contains at most |V| distinct
vertices, it also contains at most |V| - 1 edges. Thus, we can restrict our
attention to shortest paths of at most |V| - 1 edges.

Representing shortest paths

► Besides finding the shortest path weight, we wish to find the vertices on
shortest paths.

► Given a graph G = (V, E), π[v] denotes the predecessor of v ∈ V (as in BFS).
PRINT-PATH(G, s, v) can print a shortest path from s to v, assuming π[v]
has already been computed for every v.

PRINT-PATH(G, s, v)
1 if v = s
2 then print s
3 else if π[v] = NIL
4 then print "no path from" s "to" v "exists"
5 else PRINT-PATH(G, s, π[v])
6 print v

Predecessor subgraph Gπ = (Vπ, Eπ) induced by the π values, where
Vπ = {v ∈ V : π[v] ≠ NIL} ∪ {s}
and Eπ = {(π[v], v) : v ∈ Vπ - {s}}.

► A shortest-paths tree rooted at s is a directed subgraph G' = (V', E'), where
V' ⊆ V and E' ⊆ E, such that
1. V' is the set of vertices reachable from s in G,
2. G' forms a rooted tree with root s, and
3. for all v∈V', the unique simple path from s to v in G' is a shortest
path from s to v in G.

Relaxation

► For each vertex v, d[v] is an upper bound on the weight of a shortest path
from s to v and is called a shortest-path estimate.

INITIALIZE-SINGLE-SOURCE(G, s) RELAX(u, v, w)
1 for each vertex v ∈ V[G] 1 if d[v] > d[u] + w(u, v)
2 do d[v] ← ∞ 2 then d[v] ← d[u] + w(u, v)
3 π[v] ← NIL 3 π[v] ← u
4 d[s] ← 0

► The process of relaxing an edge (u, v) consists of testing whether we can
improve the shortest path to v found so far by going through u and, if so,
updating d[v] and π[v].

Dijkstra's algorithm and the shortest-paths algorithm for directed acyclic


graphs: each edge is relaxed exactly once.
Bellman-Ford algorithm: each edge is relaxed many times.

Properties of shortest paths and relaxation

 Triangle inequality (Lemma 24.10)


For any edge (u, v)∈E, we have δ(s, v) ≤ δ(s, u) + w(u, v)

 Upper-bound property (Lemma 24.11)


d[v] ≥ δ(s, v) for all vertices v∈V, and once d[v] achieves the value δ(s,
v), it never changes.

 No-path property (Corollary 24.12)
If there is no path from s to v, then we always have d[v] = δ(s, v) = ∞.

 Convergence property (Lemma 24.14)


If s ⇝ u → v is a shortest path from s to v for some u, and d[u] = δ(s, u)
prior to relaxing edge (u, v), then d[v] = δ(s, v) at all times afterward.

 Path-relaxation property (Lemma 24.15)


If p =〈 s, v1,..., vk〉 is a shortest path from s to vk, and the edges of p
are
relaxed in the order (s, v1), (v1, v2),..., (vk-1, vk), then d[vk] = δ(s, vk).

 Predecessor-subgraph property (Lemma 24.17)


Once d[v] = δ(s, v) for all v∈V, the predecessor subgraph is a
shortest-paths tree rooted at s.

1. The Bellman-Ford algorithm

► The Bellman-Ford algorithm solves the single-source shortest-paths


problem (edge weights may be negative)

► If there is a negative-weight cycle that is reachable from the source, the


algorithm indicates that no solution exists.

► Bellman-Ford algorithm:

BELLMAN-FORD(G, w, s)

1 INITIALIZE-SINGLE-SOURCE(G, s)

2 for i ← 1 to |G.V| - 1
3 do for each edge (u, v) ∈ G.E
4 do RELAX(u, v, w)

5 for each edge (u, v) ∈ G.E


6 do if v.d > u.d + w(u, v)
7 then return FALSE
8 return TRUE

 After initializing the d and π values of all vertices in line 1, the


algorithm makes |V| - 1 passes over the edges of the graph.

► Each pass relaxes the edges in the order (t, x), (t, y), (t, z), (x, t), (y, x), (y, z),
(z, x), (z, s), (s, t), (s, y). There are four passes!!

► Analysis of running time:
The Bellman-Ford algorithm runs in time O(V E):
the initialization takes Θ(V) time,
each of the |V| - 1 passes takes Θ(E) time, and
the for loop of lines 5-7 takes O(E) time.
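BELLMAN-FORD maps directly onto Python. A sketch with a hypothetical edge list; the second loop is the negative-cycle check of lines 5-7:

```python
INF = float('inf')

def bellman_ford(vertices, edges, s):
    """edges: list of (u, v, w). Returns (True, dist), or (False, None)
    if a negative-weight cycle is reachable from s."""
    dist = {v: INF for v in vertices}
    dist[s] = 0
    for _ in range(len(vertices) - 1):   # |V| - 1 passes
        for u, v, w in edges:            # RELAX (u, v, w)
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    for u, v, w in edges:                # any edge still relaxable => cycle
        if dist[u] + w < dist[v]:
            return False, None
    return True, dist

edges = [('s', 'a', 4), ('s', 'b', 2), ('b', 'a', -1), ('a', 'c', 3)]
ok, dist = bellman_ford('sabc', edges, 's')

# Adding c -> b with weight -10 creates the negative cycle b -> a -> c -> b.
ok2, _ = bellman_ford('sabc', edges + [('c', 'b', -10)], 's')
```

The negative edge b → a is handled correctly (δ(s, a) = 1 via s → b → a), while the second call reports the negative-weight cycle.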

2. Single-source shortest paths in directed acyclic graphs

DAG-SHORTEST-PATHS(G, w, s)

topologically sort the vertices of G


INITIALIZE-SINGLE-SOURCE(G, s)
for each vertex u, taken in topologically sorted order
do for each vertex v∈Adj[u]
do RELAX(u, v, w)

► Running time: Θ(V + E) time.
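DAG-SHORTEST-PATHS can be sketched as a topological sort followed by one relaxation sweep; each edge is relaxed exactly once. The DAG below is hypothetical (note the negative edge, which is fine here):

```python
INF = float('inf')

def dag_shortest_paths(adj, s):
    """adj: {u: [(v, w), ...]} for a DAG. Returns shortest-path weights from s."""
    # Topologically sort the vertices via DFS finishing times.
    order, seen = [], set()
    def visit(u):
        seen.add(u)
        for v, _ in adj[u]:
            if v not in seen:
                visit(v)
        order.insert(0, u)
    for u in adj:
        if u not in seen:
            visit(u)

    dist = {u: INF for u in adj}
    dist[s] = 0
    for u in order:                      # vertices in topologically sorted order
        if dist[u] < INF:
            for v, w in adj[u]:          # RELAX each outgoing edge once
                if dist[u] + w < dist[v]:
                    dist[v] = dist[u] + w
    return dist

dag = {'s': [('t', 2), ('x', 6)], 't': [('x', 3), ('y', 4)],
       'x': [('y', -1)], 'y': []}
dist = dag_shortest_paths(dag, 's')
```

Here δ(s, x) = 5 via s → t → x, and δ(s, y) = 4 via the negative edge x → y, illustrating why the single left-to-right sweep suffices.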

Theorem 24.5
If a weighted, directed graph G = (V, E) has source vertex s and
no cycles, DAG-SHORTEST-PATHS returns d[v] = δ(s, v)
for all vertices v ∈ V, and Gπ is a shortest-paths tree.

Proof For a shortest path p = 〈v0, v1, ..., vk〉, the edges on p are relaxed
in the order (v0, v1), (v1, v2), ..., (vk-1, vk). The path-relaxation property
implies that d[vi] = δ(s, vi) at termination, for i = 0, 1, ..., k. ▓

DIJKSTRA’s ALGORITHM

DIJKSTRA (G, w, s)
INITIALIZE-SINGLE-SOURCE (G, s)
S = ∅
Q = G.V
while Q ≠ ∅
    u = EXTRACT-MIN (Q)
    S = S ∪ {u}
    for each vertex v ∈ G.Adj[u]
        RELAX (u, v, w)
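A Python sketch of DIJKSTRA with heapq as the min-priority queue, using lazy deletion in place of DECREASE-KEY (the digraph with non-negative weights is hypothetical):

```python
import heapq

def dijkstra(adj, s):
    """adj: {u: [(v, w), ...]}, all w >= 0. Returns a dict of δ(s, v)."""
    dist = {}                      # the set S: vertices with final distances
    pq = [(0, s)]
    while pq:
        d, u = heapq.heappop(pq)   # EXTRACT-MIN (Q)
        if u in dist:
            continue               # stale entry: u already in S
        dist[u] = d                # d[u] = δ(s, u) when u enters S
        for v, w in adj[u]:
            if v not in dist:      # RELAX (u, v, w)
                heapq.heappush(pq, (d + w, v))
    return dist

adj = {'A': [('B', 1), ('C', 4)],
       'B': [('C', 2), ('D', 6)],
       'C': [('D', 3)],
       'D': []}
dist = dijkstra(adj, 'A')
```

The greedy choice is visible in the pop order: A (0), B (1), C (3 via B), then D (6 via C), each distance final the moment its vertex is extracted.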
3. Dijkstra's algorithm

► Dijkstra's algorithm solves the single-source shortest-paths problem on a


weighted, directed graph in which all edge weights are nonnegative.

► Dijkstra's algorithm uses a greedy strategy (always chooses the "lightest"
or "closest" vertex in V - S to add to set S).

► The key of the correctness of Dijkstra's algorithm is each time a vertex u


is added to set S, we have d[u] = δ(s, u).

Theorem 24.6: (Correctness of Dijkstra's algorithm)


Dijkstra's algorithm, run on a weighted, directed graph G = (V, E) with
non-negative weight function w and source s, terminates with d[u] = δ(s, u)
for all vertices u ∈ V.

Proof Claim: For each vertex u ∈ V, we have d[u] = δ(s, u) at the time when
u is added to set S.
Suppose this is not true. Let u be the first vertex for which d[u] ≠ δ(s, u)
when it is added to S. Let P be a shortest path from s to u.
Let y be the first vertex along P such that y ∉ S, and let x ∈ S be y's
predecessor.
