Data Structures and Algorithms
Lecture 12: Randomized Algorithms
[GT 19.1 and 19.6]
Lecturer: Dr. Karlos Ishac
School of Computer Science
Some content is taken from the textbook (published by Wiley) and from the previous coordinator, Dr. Andre van Renssen.
The University of Sydney Page 1
Randomized algorithms
Randomized algorithms are algorithms where the behaviour
doesn’t depend solely on the input. It also depends (in part) on
random choices or the values of a number of random bits.
Reasons for using randomization:
- Sampling data from a large population or dataset
- Avoid pathological worst-case examples
- Avoid predictable outcomes
- Allow for simpler algorithms
Real World Applications
Generating random permutations
Input: An integer n.
Output: A permutation of {1, …, n} chosen uniformly at random, i.e.,
every permutation has the same probability of being generated.
Example: n = 6, output <1,4,6,3,2,5>
[Table: each number 1-6 paired with its position in the permutation]
Generating random permutations
What are random permutations used for?
– Many algorithms whose input is an array perform better in practice
after randomly permuting the input (for example, QuickSort).
– Can be used to sample k elements without knowing k in advance by
picking the next element in the permuted order when needed.
– Can be used to assign scarce resources.
– Can be a building block for more complex randomized algorithms.
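The sampling idea above can be sketched in a few lines of Python (a sketch of mine, not from the textbook): permute the data once up front, then take elements of the permuted order one at a time, so the first k elements form a uniform sample for whatever k turns out to be.

```python
import random

def lazy_sample(data):
    """Yield the elements of data in a uniformly random order.

    The first k yielded elements form a uniform random sample of
    size k, for any k decided only while iterating.
    """
    order = list(data)
    random.shuffle(order)  # uniform random permutation
    yield from order

sampler = lazy_sample(range(100))
first_three = [next(sampler) for _ in range(3)]  # a uniform sample of size 3
```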
First (incorrect) attempt
def permute(A):
    # permute A in place
    n ← length of array A
    for i in {0, ..., n-1} do
        # swap A[i] with a random position
        j ← random number in {0, ..., n-1}
        A[i], A[j] ← A[j], A[i]
    return A
Note that since j is picked at random, different executions lead to different outcomes
So, why is this incorrect?
For all permutations to be equally likely, we want that every
permutation is generated by the same number of possible executions.
[Figure: a set of executions e1, ..., e5 mapped to permutations p1, ..., p5; uniformity requires each permutation to receive the same number of executions]
First (incorrect) attempt: Analysis

def permute(A):
    # permute A in place
    n ← length of array A
    for i in {0, ..., n-1} do
        # swap A[i] with a random position
        j ← random number in {0, ..., n-1}
        A[i], A[j] ← A[j], A[i]
    return A

Number of executions: n · n · ... · n (n times) = n^n
Number of permutations: 1 · 2 · 3 · ... · n = n!
For n ≥ 3, n^n isn't divisible by n!, so the executions cannot be split evenly among the permutations.

Example: n = 3
n^n = 27, n! = 6
27 isn't a multiple of 6, so some permutations are more likely than others.
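Since the loop draws n independent values of j, we can enumerate all n^n executions for n = 3 and count how often each permutation appears. This small check (my own illustration, not part of the slides) confirms the bias:

```python
from collections import Counter
from itertools import product

def permute_with_choices(a, choices):
    """Run the incorrect shuffle, with the random j-values fixed by choices."""
    a = list(a)
    for i, j in enumerate(choices):
        a[i], a[j] = a[j], a[i]
    return tuple(a)

n = 3
counts = Counter(
    permute_with_choices(range(n), choices)
    for choices in product(range(n), repeat=n)  # all n**n = 27 executions
)
assert sum(counts.values()) == n ** n                 # 27 executions in total
assert len(counts) == 6                               # all 3! permutations occur
assert min(counts.values()) != max(counts.values())   # ... but not equally often
```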
Second attempt
def FisherYates(A):
    # permute A in place
    n ← length of array A
    for i in {0, ..., n-1} do
        # swap A[i] with a random position
        j ← random number in {i, ..., n-1}
        A[i], A[j] ← A[j], A[i]
    return A
Note that since j is picked at random, different executions lead to different outcomes
Second attempt
[Animation of the Fisher-Yates shuffle. Credit: Shusen Wang YT Channel]
Second attempt: Analysis

def FisherYates(A):
    # permute A in place
    n ← length of array A
    for i in {0, ..., n-1} do
        # swap A[i] with a random position
        j ← random number in {i, ..., n-1}
        A[i], A[j] ← A[j], A[i]
    return A

Number of executions: 1 · 2 · 3 · ... · n = n!
Number of permutations: 1 · 2 · 3 · ... · n = n!
Observation: Every execution leads to a different permutation.
Example: To generate <3,2,4,1> starting from <1,2,3,4>
<1,2,3,4> → <3,2,1,4>, i=0 and j=2
<3,2,1,4> → <3,2,1,4>, i=1 and j=1
<3,2,1,4> → <3,2,4,1>, i=2 and j=3
<3,2,4,1> → <3,2,4,1>, i=3 and j=3
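The observation can be checked exhaustively for small n (a sketch of mine, not from the slides): enumerating all n! valid choice sequences of Fisher-Yates shows that each execution produces a distinct permutation, so all n! permutations are equally likely.

```python
from itertools import product

def fisher_yates_with_choices(a, choices):
    """Run Fisher-Yates, with choices[i] playing the random j from {i,...,n-1}."""
    a = list(a)
    for i, j in enumerate(choices):
        a[i], a[j] = a[j], a[i]
    return tuple(a)

n = 4
# All valid choice sequences: position i draws j from {i, ..., n-1}.
executions = list(product(*(range(i, n) for i in range(n))))
results = {fisher_yates_with_choices(range(n), c) for c in executions}
assert len(executions) == 24   # n! = 4! executions
assert len(results) == 24      # each execution yields a different permutation
```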
Skip lists
Another way to implement a Map.
So why look at yet another way of doing this?
– Relatively simple data structure that’s built in a randomized way
– No need for rebalancing like in AVL trees
– Still has O(log n) expected time on any input; the expectation is over the structure's own random choices, NOT an average over inputs
Applications:
– Indexing in File Systems
– Range Queries
– Various database systems use it
– Concurrent/parallel computing environments
Remember the Sorted Linked List
1 → 2 → 3 → 4 → 5
The Map ADT (recap)
– get(k): if the map M has an entry with key k, return its
associated value
– put(k, v): if key k is not in M, then insert (k, v) into the map M;
else, replace the existing value associated to k with v
– remove(k): if the map M has an entry with key k, remove it
– size(), isEmpty()
– entrySet(): return an iterable collection of the entries in M
– keySet(): return an iterable collection of the keys in M
– values(): return an iterable collection of the values in M
© Goodrich, Tamassia, Goldwasser
Skip lists
Leveled structure, where every level contains a subset of the level below it.
Whether an element is promoted to the next level is determined by coin flips.
S3: −∞ → +∞
S2: −∞ → 14 → +∞
S1: −∞ → −23 → 14 → +∞
S0: −∞ → −23 → 6 → 14 → 30 → +∞
Skip lists
A node p has pointers to:
• after(p): the node following p on the same level.
• before(p): the node preceding p on the same level.
• above(p): the node above p in the same tower.
• below(p): the node below p in the same tower.
              above(p)
                 ↑
before(p) ← [ 14 ] → after(p)
                 ↓
              below(p)
Search
def search(p, k):
    while below(p) ≠ null do
        p ← below(p)
        while key(after(p)) ≤ k do
            p ← after(p)
    return p
S3: −∞ → +∞
S2: −∞ → 14 → +∞
S1: −∞ → −23 → 14 → +∞
S0: −∞ → −23 → 6 → 14 → 30 → +∞
Example: search(topleft node, 30)
Insertion
def insert(p, k):
    p ← search(p, k)
    q ← insertAfterAbove(p, null, k)
    while coin flip is heads do
        while above(p) = null do
            p ← before(p)
        p ← above(p)
        q ← insertAfterAbove(p, q, k)
S3: −∞ → +∞
S2: −∞ → 14 → +∞
S1: −∞ → −23 → 14 → +∞
S0: −∞ → −23 → 6 → 14 → 30 → +∞
Example: insert(topleft node, 25)
Insertion
def insert(p, k):
    p ← search(p, k)
    q ← insertAfterAbove(p, null, k)
    while coin flip is heads do
        while above(p) = null do
            p ← before(p)
        p ← above(p)
        q ← insertAfterAbove(p, q, k)
S3: −∞ → +∞
S2: −∞ → 14 → +∞
S1: −∞ → −23 → 14 → +∞
S0: −∞ → −23 → 6 → 14 → 25 → 30 → +∞
Example: insert(topleft node, 25)
Insertion
def insert(p, k):
    p ← search(p, k)
    q ← insertAfterAbove(p, null, k)
    while coin flip is heads do
        while above(p) = null do
            p ← before(p)
        p ← above(p)
        q ← insertAfterAbove(p, q, k)
S3: −∞ → +∞
S2: −∞ → 14 → +∞
S1: −∞ → −23 → 14 → 25 → +∞
S0: −∞ → −23 → 6 → 14 → 25 → 30 → +∞
Example: insert(topleft node, 25)
Removal
def remove(p, k):
    p ← search(p, k)
    if key(p) ≠ k then return null
    repeat
        q ← above(p)
        remove p from its level
        p ← q
    until p = null

S3: −∞ → +∞
S2: −∞ → 14 → +∞
S1: −∞ → −23 → 14 → 25 → +∞
S0: −∞ → −23 → 6 → 14 → 25 → 30 → +∞
Example: remove(topleft node, 14)
Removal
def remove(p, k):
    p ← search(p, k)
    if key(p) ≠ k then return null
    repeat
        q ← above(p)
        remove p from its level
        p ← q
    until p = null

S3: −∞ → +∞
S2: −∞ → 14 → +∞
S1: −∞ → −23 → 14 → 25 → +∞
S0: −∞ → −23 → 6 → 25 → 30 → +∞
Example: remove(topleft node, 14)
Removal
def remove(p, k):
    p ← search(p, k)
    if key(p) ≠ k then return null
    repeat
        q ← above(p)
        remove p from its level
        p ← q
    until p = null

S3: −∞ → +∞
S2: −∞ → 14 → +∞
S1: −∞ → −23 → 25 → +∞
S0: −∞ → −23 → 6 → 25 → 30 → +∞
Example: remove(topleft node, 14)
Removal
def remove(p, k):
    p ← search(p, k)
    if key(p) ≠ k then return null
    repeat
        q ← above(p)
        remove p from its level
        p ← q
    until p = null

S3: −∞ → +∞
S2: −∞ → +∞
S1: −∞ → −23 → 25 → +∞
S0: −∞ → −23 → 6 → 25 → 30 → +∞
Example: remove(topleft node, 14)
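The three operations fit together in a compact, runnable Python sketch (my own illustration; the class and method names are not from the textbook). The coin flip is simulated with random.random(), and a new empty top level is created whenever a tower grows past the current top.

```python
import random

class _Node:
    """A skip-list node; sentinel nodes carry -inf / +inf keys."""
    __slots__ = ("key", "before", "after", "above", "below")

    def __init__(self, key):
        self.key = key
        self.before = self.after = self.above = self.below = None

class SkipList:
    def __init__(self):
        self._head = None
        self._new_top_level()

    def _new_top_level(self):
        """Add an empty level (just the two sentinels) above the current top."""
        head, tail = _Node(float("-inf")), _Node(float("inf"))
        head.after, tail.before = tail, head
        if self._head is not None:
            old_head = self._head
            old_tail = old_head
            while old_tail.after is not None:   # walk to the old +inf sentinel
                old_tail = old_tail.after
            head.below, old_head.above = old_head, head
            tail.below, old_tail.above = old_tail, tail
        self._head = head

    def _search(self, k):
        """Return the bottom-level node with the largest key <= k."""
        p = self._head
        while True:
            while p.after.key <= k:   # scan forward on this level
                p = p.after
            if p.below is None:       # bottom level reached
                return p
            p = p.below               # drop down a level

    def __contains__(self, k):
        return self._search(k).key == k

    def _insert_after_above(self, p, q, k):
        """Insert a new node with key k after p (same level) and above q."""
        node = _Node(k)
        node.before, node.after = p, p.after
        p.after.before = node
        p.after = node
        node.below = q
        if q is not None:
            q.above = node
        return node

    def insert(self, k):
        p = self._search(k)
        q = self._insert_after_above(p, None, k)
        while random.random() < 0.5:        # coin flip: heads grows the tower
            while p.above is None:
                if p.before is None:        # at the top-left sentinel:
                    self._new_top_level()   # make room for a taller tower
                else:
                    p = p.before
            p = p.above
            q = self._insert_after_above(p, q, k)

    def remove(self, k):
        p = self._search(k)
        if p.key != k:
            return False
        while p is not None:                # unlink the whole tower, bottom-up
            up = p.above                    # save the pointer before unlinking
            p.before.after = p.after
            p.after.before = p.before
            p = up
        return True

sl = SkipList()
for x in [30, -23, 14, 6, 25]:
    sl.insert(x)
```

Note that remove saves above(p) before unlinking p, so the whole tower is deleted even when p is the topmost copy.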
Skip lists: Top layer
Keep a pointer to the top left node.
Choices for the top layer:
• Keep at a fixed level, say max{10, 3·log n}
• Insertion needs to take this into account
• Variable level
• Continue insertion until coin comes up tails
• No modification required
• Probability that this gives more than O(log n) levels is very low
Skip lists: Analysis
Theorem:
The expected height of a skip list is O(log n).
Proof:
• The probability that an element is present at height i is 1/2^i, i.e., the probability that its coin comes up heads i times in a row.
• By the union bound, the probability that level i contains at least one item is at most n/2^i.
• The skip list has height at least h exactly when level h contains at least one element.
• So the probability that the skip list has height larger than c·log n is at most
  n / 2^(c·log n) = n / n^c = 1/n^(c−1)
• Hence the probability that the skip list has height O(log n) is at least
  1 − 1/n^(c−1)
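As a quick numeric sanity check of the bound (my own, not from the slides): with n = 1024 and c = 2, the probability of exceeding 2·log2(n) = 20 levels is at most 1/n.

```python
import math

n, c = 1024, 2
# Pr[height > c*log2(n)]  <=  n / 2**(c*log2(n))  =  n / n**c  =  1 / n**(c-1)
bound = n / 2 ** (c * math.log2(n))
assert bound == 1 / n ** (c - 1) == 1 / n   # here: 1/1024
```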
Skip lists: Search Analysis
Theorem:
The expected search time of a skip list is O(log n).
Proof:
• Searching consists of vertical (downward) and horizontal (rightward) steps.
• There are h vertical steps, so O(log n) with high probability.
• To take a horizontal step on level i, the node stepped to must not appear on level i+1 (otherwise the search would already have passed it on the level above).
• The probability of this is 1/2 (one coin flip).
• So the expected number of horizontal steps per level is at most 2.
• That is, we expect to spend O(1) time per level.
• Expected search time: O(log n).
Insertion and deletion take expected O(log n) time using similar analysis.
Skip lists: Space Analysis
Theorem:
The expected space used by a skip list is O(n).
Proof:
• Space per node: O(1).
• The expected number of nodes at level i is n/2^i.
• Thus the expected total number of nodes is
  ∑_{i=0}^{h} n/2^i = n · ∑_{i=0}^{h} 1/2^i < 2n
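The geometric sum above can be checked numerically (my own check, not from the slides):

```python
n, h = 1000, 20
expected_nodes = sum(n / 2 ** i for i in range(h + 1))
assert n <= expected_nodes < 2 * n   # the sum stays strictly below 2n
```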
Skip lists: Summary
Expected space: O(n)
Expected search/insert/delete time: O(log n)
Works very well in practice and doesn’t require any complicated
rebalancing operations.