End Sem

The document discusses greedy algorithms and minimum spanning trees. It provides an example of a scheduling problem that can be solved with a greedy algorithm by always picking the interval with the earliest end time. It also discusses the set cover problem and proves that the greedy strategy of always picking the largest uncovered set provides an optimal solution within a logarithmic factor. Finally, it defines minimum spanning trees and some of their key properties.

Greedy Algorithms

Example: A Scheduling Problem

Input: A set of intervals A = {(x_1, y_1), (x_2, y_2), …}, with x_i ≤ y_i and x_i, y_i ∈ ℝ
Output: Schedule S ⊆ A of non-overlapping intervals (null pairwise intersection)
Goal: Maximize |S|
[Figure: intervals I_1, I_2, I_3 — more than one schedule can be optimal.]

Algo Skeleton:
1. Set S = ∅
2. While there exists (x, y) ∈ A \ S that does not overlap with any (x, y) ∈ S:
   • Choose an optimal (x*, y*) ∈ A \ S and add it to S
3. Return S

Strategy?!

Greedy because it does not revisit its decisions [here, choosing the optimal interval].
Strategy:
Pick the non-intersecting (with S) interval with the earliest end time!

Complexity? With n = |A|: O(n log n) to sort by end time + O(n) for the scan.

How to prove the optimality?
- Let G = (g_1, g_2, …) be the greedy solution.
- Let S = (s_1, s_2, …) be any optimal solution.
Both sorted by end time.

Let's compare G and S:
Say g_1 = s_1 but g_2 ≠ s_2.
Consider S′ = (g_1, g_2, s_3, s_4, …).
Is this a valid solution?
- g_2 does not intersect with the intervals before it.
- Does g_2 go with the intervals after it?
g_2 was chosen when s_2, s_3 and all other options were available.
Now if s_2 can co-exist with all the intervals that follow it, then g_2, with an earlier or the same end time, can as well!
So S′ = (g_1, g_2, s_3, s_4, …) is valid (optimal).
Like this we can obtain more optimal solutions and end with the greedy solution without ever losing optimality.

Formal proof by contradiction:
Suppose G is NOT optimal.
Pick an optimal S ≠ G which is closest to G (maximizes |S ∩ G|).
We can modify S to S′ as above so that S′ is also optimal and |S′ ∩ G| > |S ∩ G|. Contradiction!
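A minimal runnable sketch of this earliest-end-time greedy (my own Python illustration, not from the slides; it assumes intervals that merely touch at endpoints do not count as overlapping):

def schedule(intervals):
    # Greedy interval scheduling: sort by end time (O(n log n)), then scan (O(n)).
    S = []
    last_end = float("-inf")
    for x, y in sorted(intervals, key=lambda iv: iv[1]):
        if x >= last_end:          # does not overlap anything already in S
            S.append((x, y))
            last_end = y
    return S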
Set Cover
Input: I = {S_i}, i = 1…m, with ∪_{k=1…m} S_k = B and S_k ≠ ∅
Minimize |J|, J ⊆ {1, …, m}, such that ∪_{i∈J} S_i = B (|J| ≤ m always suffices)

Example: B is a universal set of topics covered by m available ALGO courses. S_j is the set of topics covered by the j-th ALGO course. Taking the minimum number of which ALGO courses will allow us to learn all the topics being covered by all m of them?
[Figure: bipartite picture of courses and topics — more than one cover can be optimal!]

A graph? (number of points (n) = number of sets (m))
Example: B is a universal set of n towns. S_j is the set of towns (that will be) covered by the school of the j-th town (m = n; the relation is symmetric, not transitive). What minimum number of schools can cover all n towns? [Vertex Cover]
A school in a town will also cover every adjacent (edge) town.
Algo Skeleton:
1. Set J = ∅
2. While ∪_{i∈J} S_i ≠ B: pick i ∉ J (S_i ∈ I \ J) and add it to J
3. Return J

Strategy?
Choose the largest uncovered set: max_{i∉J} |S_i \ ∪_{k∈J} S_k|
Valid, as the worst case is simply all S_i being chosen.

Optimal? NO. [Figure: a counterexample where greedy picks a first, second and third set while a smaller cover exists.]
Complexity: O(m²n) — cubic time.

Greedy algos may not be optimal!
Hence the notion of approximation algos.
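A short sketch of this greedy cover rule (my own Python illustration, not from the slides; sets is a list of Python sets and B the universe):

def greedy_set_cover(sets, B):
    uncovered = set(B)
    J = []
    while uncovered:
        # pick the set covering the most still-uncovered elements: O(mn) per round
        i = max(range(len(sets)), key=lambda k: len(uncovered & sets[k]))
        if not uncovered & sets[i]:
            raise ValueError("the sets do not cover B")
        J.append(i)
        uncovered -= sets[i]
    return J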


Greedy Algorithms

How close to optimal?
|J*| (optimal, small) vs |J_G| (greedy, large) vs m — we want an upper bound on |J_G|.

Let n_t be the number of uncovered points after t iterations.
So n_0 = |B| = n, and n_0 > n_1 > n_2 > … > n_T.
After some finite number of iterations T, we hope to get n_T < 1, i.e., everything covered.

Property: n_{t+1} ≤ n_t − n_t/|J*|
In the (t+1)-th iteration, at least n_t/|J*| points are covered by the strategy of choosing the largest uncovered set.
Proof:
|J*| many sets cover the entire B, that is, all n elements.
Therefore, one set among the |J*| must have at least n/|J*| elements!
Likewise, as at most all |J*| optimal sets cover the n_t still-uncovered elements, one set must contain at least n_t/|J*| of the n_t elements!
The strategy: choose the largest uncovered set!
So at least n_t/|J*| of the n_t elements will be present in the chosen set!
Now:
n_t ≤ n_{t−1} − n_{t−1}/|J*| = n_{t−1} (1 − 1/|J*|)
    ≤ n_{t−2} (1 − 1/|J*|)² ≤ … ≤ n_0 (1 − 1/|J*|)^t

Need to iterate till n_t < 1!

Using 1 − x ≤ e^{−x}:
n_t ≤ n_0 (1 − 1/|J*|)^t ≤ n_0 exp(−t/|J*|)

With n = n_0 and |J*| < ∞, putting t = |J*| log(n):
n_t < n exp(−t/|J*|) = n exp(−log n) = 1

So |J*| log n is an upper bound on the number of sets to be chosen for complete coverage.
Approximation factor: upper bound / optimal ≤ log n — the best possible by any polynomial-time algo!
Greedy Algorithms

Minimum Spanning Trees (MST)

Input: Undirected graph G = (V, E), weights ω: E → ℤ
Output: an MST, that is, a "spanning" tree T in G that minimizes ω(T) = Σ_{e∈E(T)} ω(e), with E(T) ⊆ E(G).
Spanning — touches all vertices in G.
Weights (positive) are costs: minimize the cost without compromising connectivity!

Observations:
A tree has no 2 different paths between the same nodes.
Hence, Minimum Spanning Trees!

[Figure: one optimal solution with cost (weight sum) = 16.]

All possible MSTs will have the same number of edges, |V| − 1.

Without loss of generality:
• All edge weights are considered non-negative (for ease of analysis).
• Negative edges can be made positive by adding a suitably large positive number to all edges without affecting the MST outcome: every spanning tree has exactly |V| − 1 edges, so all tree costs shift by the same amount.
We are not searching for shortest paths — there this trick does not work!
Trees: No roots needed!
Def: A tree is a connected acyclic graph.
Property: A tree G on n nodes has n − 1 edges.
A connected graph with |V| vertices: if acyclic then it has |V| − 1 edges. (|E(G)| < |V| − 1 is not possible for a connected graph!)
Let's start from an empty graph with |V| vertices and add the tree's edges one by one.
- Beginning: |V| connected components and 0 edges.
- At the end: 1 connected component.
- At each iteration the number of connected components goes C → C − 1,
  … as each edge we add must join two different components — an edge inside one component would create a cycle!
So we will conclude exactly after |V| − 1 iterations, adding |V| − 1 edges.
Property: Any connected undirected graph G = (V, E) with |E| = |V| − 1 is a tree.
That is, a connected graph with |V| vertices and |V| − 1 edges is acyclic — the converse of the previous property.
Let's start from the connected graph G.
- While G has a cycle, remove an edge from that cycle. This does not affect connectivity.
- After this step G → G′. G′ is obviously acyclic.
- G′ has |V| − 1 edges [from the previous property].
- So if G already has only |V| − 1 edges, we cannot have removed any edge; essentially G ≡ G′. So such a G is acyclic.

Another…
Property: An undirected graph is a tree if and only if there is a unique path between any pair of nodes.
Tree G = (V, E) ↔ a unique path exists between u and v, ∀ u, v ∈ V.
A Greedy Approach

Let's consider the following greedy approach:
• Start with the empty graph.
• Repeatedly add the next lightest edge that doesn't produce a cycle.
This requires ordering the edges!

Correctness? Optimality?
The cut property

Cut def: A cut in G is any pair of the form (S, V − S).
Property (cut property): Suppose the edge set X is part of some MST and no edge of X crosses a given cut; then the cheapest edge e across that cut can be added, and X ∪ {e} is part of some MST.
Greedy Algorithms

Proof:
Pick any MST T that contains X. Assume that e ∉ T.
But the MST must cross the cut! So let e′ be the edge in the MST that does the crossing.
Now, get a T′ by deleting e′ from T and adding e:
T contains X but not e, and T′ contains X ∪ {e}.
- Is T′ connected? Yes.
  The broken parts of T (after the deletion of e′) are individually connected, and e connects them!
  (Had we added e without deleting e′, it would have formed a cycle; deleting an edge of a cycle does not affect connectivity!)
- Does T′ have |V| − 1 edges? Yes.
  We removed 1 edge from an MST and added 1!
So T′ is a tree!

Now, the total weight of T′:
ω(T′) = ω(T) + ω(e) − ω(e′)
But e is the cheapest / lightest (or one of the lightest) edges across the partition! So:
ω(T′) ≤ ω(T)
But as T is an MST we can only have: ω(T′) = ω(T)
And T′ is an MST!
Hence, e and X ∪ {e} are both parts of the MST T′.
Kruskal's Algorithm (graph stored as an adjacency list of pairs, vertex: (vertex, weight))

The approach:
1. Sort the edges by weight (in increasing order).
2. For each edge e in this order, if e does not create a cycle, add it to X (start with 0 edges).
3. Output T = X (after |V| − 1 additions).

Correctness:
First step: ∅ ∪ {e_1} — the cheapest edge is part of some MST!
Inductive step (i-th step): X ∪ {e_i} = X′.
X can represent more than one connected component (CC) on either side of a cut. The edge e_i can connect two already-formed CCs, or one CC and one singleton, or two singletons; it will never connect 2 nodes inside 1 CC (cycle!). Therefore e_i goes across a cut not crossed by X, and e_i is the cheapest such edge outside X that does not create a cycle — the cut property applies.

Time complexity: Step 1: O(|E| log |E|). Step 2: O(|E| (|V| + |E|)).
Quadratic in the # of edges because we are running DFS for every edge. OK, but not great!
(DFS revealing a back-edge in an undirected graph means there is a cycle — 2 different paths between the same nodes.)
But the DFS is possibly doing a lot of redundant work every time it is run, as the graph only changes by one edge every iteration!
A way out is to use a special data structure:

Union Find
Faster Kruskal: the # of CCs at the start is |V|.

- Maintain the connected components CC(v) along the way.
- For the next cheapest edge e = (u, v), check whether CC(u) = CC(v), and add e only when they are not equal.
- After adding e, update the 2 connected components (labels) by merging them into 1.

Costs:
- Sorting the edges: O(|E| log |E|) ≤ O(|E| log |V|²) = O(|E| log |V|).
- Creating the |V| initial components: O(|V|).
- Each component lookup (find): O(log |V|); each merge (union): O(log |V|).
Return X.
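A compact runnable sketch of Kruskal with union by rank (my own Python illustration under the slide's conventions; vertices are 0 … n−1 and edges come as (weight, u, v) triples):

def kruskal(n, edges):
    parent = list(range(n))
    rank = [0] * n

    def find(x):                       # walk parent pointers to the root
        while parent[x] != x:
            x = parent[x]
        return x

    def union(ru, rv):                 # union by rank: shorter tree under taller
        if rank[ru] < rank[rv]:
            ru, rv = rv, ru
        parent[rv] = ru
        if rank[ru] == rank[rv]:
            rank[ru] += 1

    X = []
    for w, u, v in sorted(edges):      # O(|E| log |E|)
        ru, rv = find(u), find(v)
        if ru != rv:                   # e does not create a cycle
            X.append((u, v, w))
            union(ru, rv)
    return X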
The functions
Maintain a collection C of disjoint subsets S_i of {1, 2, …, n}:
1. makeset(x): C ← C ∪ {{x}}
2. find(x): return the "name" of S_x (a representative element / label)
3. union(x, y): remove S_x, S_y from C and C ← C ∪ {S_x ∪ S_y}

Naïve: an array where A[i] stores the name of the set of vertex i (S_i).
Find: O(1); Union: O(n) — we need to scan the array and replace one particular name everywhere; Makeset: O(1) each, O(n) for all n.

Kruskal's then costs: O(m log m) [sort] + O(n) [makesets] + O(m) [finds] + O(n²) [the n − 1 unions].
Dense graphs (m ≈ n²): O(n² log n) > O(n²) — sorting dominates anyway.
Sparse graphs (m ≈ n): O(n log n) < O(n²) — the unions dominate; too slow.
Prim's Algorithm
From the cut property: any algorithm conforming to the following greedy schema is guaranteed to work.

In Prim's:
- X always forms a subtree of an MST.
- S is chosen as the set of X's vertices.
- X grows into X ← X ∪ {e}, where e is the lightest edge between a node in S and a node outside S.
Equivalently: S grows by including the vertex v ∉ S with the cheapest attaching edge.
Greedy Algorithms

Very similar in procedure (not purpose) to Dijkstra's!
Same run time, based on a priority queue — keyed by edge weights instead of path-length estimates.

[Slide walk-through of the array-based implementation: initialization O(n); each extract-min O(n); each key update O(1). One extra check is needed when scanning a neighbour z: that z is still outside S. The worked example runs in O(n²).]
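A heap-based sketch of Prim (my own Python illustration, not from the slides; adj[u] lists (v, w) pairs and the graph is assumed connected):

import heapq

def prim(adj, s=0):
    n = len(adj)
    in_S = [False] * n
    tree = []
    pq = [(0, s, -1)]                  # (attaching edge weight, vertex, parent)
    while pq:
        w, u, p = heapq.heappop(pq)
        if in_S[u]:                    # the extra check: u may already be inside S
            continue
        in_S[u] = True
        if p >= 0:
            tree.append((p, u, w))
        for v, wv in adj[u]:
            if not in_S[v]:
                heapq.heappush(pq, (wv, v, u))
    return tree                        # |V| - 1 MST edges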
MST growth:

- Prim with a Fibonacci heap: O(m + n log n)
- 1995 (Karger, Klein & Tarjan): O(m) [randomized (expected runtime)]
  http://cs.brown.edu/research/pubs/pdfs/1995/Karger-1995-RLT.pdf
- 1994 (Fredman & Willard): O(m) [beyond the use of comparisons (Word RAM model)]
  http://csclub.uwaterloo.ca/~gzsong/papers/Trans-dichotomous%20Algorithms%20for%20Minimum%20Spanning%20Trees%20and%20Shortest%20Paths.pdf
Huffman Encoding
Sampling and quantization lead to a string of length T over an alphabet Γ.

Example from the slide: 50 min of audio, quantized to 4 levels and binary-encoded with the fixed-length codewords 00, 01, 10, 11 (all the same length for the sequence at hand) — about 260 megabits!
But decoding?
With variable-length codewords we need the prefix-free property: no codeword can be a prefix of another codeword.
Represent the code as a full binary tree with the symbols at the leaves: left = 0, right = 1.
Decoding algo: for every bit, one check, one traversal step down the tree; at a leaf, output its symbol and restart from the root.
But how to form the tree?
The cost is the number of bits required: Σ_i f_i · depth(i) over the symbols — equivalently, the sum of the frequencies attached to all nodes of the tree except the root.
The two symbols with the smallest frequencies must be at the bottom of the optimal tree.

A Greedy Approach to Construct the Tree!
Repeatedly merge the two lowest-frequency nodes into a new node carrying the sum of their frequencies (the working set shrinks by 1 every iteration).
Correctness / optimality: no intermediate node formed otherwise can have a lower frequency associated! The sum over the leaves remains the same!
Complexity: O(n²) with an array; O(n log n) with a binary heap.
Form an adjacency list (the code tree); # of nodes: 2n − 1.
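A heap-based sketch of the construction (my own Python illustration; freqs maps symbols to frequencies, and the counter only breaks ties so the heap never compares tree nodes):

import heapq
from itertools import count

def huffman(freqs):
    tie = count()
    heap = [(f, next(tie), sym) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, a = heapq.heappop(heap)        # two smallest frequencies...
        f2, _, b = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tie), (a, b)))  # ...merged
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):           # internal node
            walk(node[0], prefix + "0")       # left edge labelled 0
            walk(node[1], prefix + "1")       # right edge labelled 1
        else:
            codes[node] = prefix or "0"       # degenerate one-symbol alphabet
    walk(heap[0][2], "")
    return codes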
Greedy Algorithms

Union Find

A set as a directed tree (poly tree):
- Nodes are the set elements (in any order).
- Every node has a pointer (π) to its parent that eventually leads to the root.
- The root element (its parent pointer is a self-loop) is the name of the set.
2
Every node also has a rank value.
For now, consider it the height of the subtree hanging from that node.

makeset(x) is O(1): every node starts as a singleton (π(x) = x, rank 0); parents and ranks live in arrays.
find(x) means finding the root of the tree which contains x, traversing through parents — time proportional to the height of the tree.
Therefore, while merging 2 sets (trees), we will merge the smaller-height tree into the larger, so that the new height is at most 1 more than the original larger height.
Union: make the root of one tree point to the root of the other —
specifically, make the root of the shorter tree point to the root of the taller tree.
This is union by rank (we don't compute heights explicitly).
Its cost is similar to find's (two root look-ups plus O(1) pointer work)!
Therefore, the union function is what forms the trees!

[Figure: an example of union operations, with the order of edges from the underlying graph given.]
Property: For any x, rank(x) ≤ rank(π(x)), strictly when π(x) ≠ x.
By definition, when we move up a path in a tree toward a root node, the rank values along the way are strictly increasing.

Property: Any root node of rank k has at least 2^k nodes in its tree.
- As we start from singletons, union creates all further trees.
- In the union function, the root rank increases only when the ranks of the two merging roots are the same.
- A root node with rank k is created by merging two trees with roots of rank k − 1.
- By induction:
  two rank-0 roots (singletons) produce one rank-1 root (# nodes = 2);
  two rank-1 roots produce one rank-2 root (min # nodes = 4);
  two rank-2 roots produce one rank-3 root (min # nodes = 8);
  two rank-(k−1) roots produce one rank-k root (min # nodes = 2^{k−1} + 2^{k−1} = 2^k).
  (* One rank-k root and one rank < k root can also produce a rank-k root — with even more nodes.)
- Different rank-k nodes do not have common descendants (by the first property as well)!
Therefore:
Property: If there are n elements overall, there can be at most n/2^k nodes of rank k.
Therefore, the maximum value of k (rank) is ⌊log₂ n⌋, and every tree "height" is ≤ log n.

This is the upper bound on the theoretical time complexity of the find and union functions!

Kruskal's with union by rank:
O(m log m) [edge sort] + O(n) [makesets] + O(m log n) [finds] + O(n log n) [unions]
Now, if we assume that the edges are available already sorted, can we make find and union even faster?

How about this: in find, when a series of parent pointers is followed up to the root of a tree, we change all these pointers so that they point directly to the root!

Path compression. There is a slight increase in the base cost due to the recursion, but the long walks run very rarely.

Note: with path compression the rank remains the same, but it is no longer the height!
Looking at sequences of find and union operations, starting from an empty data structure, and determining the average (amortized) time per operation reveals the cost to be only a little more than O(1)!
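A sketch of find with path compression in the array representation (my own Python illustration, iterative rather than recursive):

def find(parent, x):
    root = x
    while parent[root] != root:        # first pass: locate the root
        root = parent[root]
    while parent[x] != root:           # second pass: re-point the path at the root
        parent[x], x = root, parent[x]
    return root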
Dynamic Programming

Consider Fibonacci.

Iterative (bottom-up): fill the table from small to large — that's dynamic programming! In fact, one can just store the last 2 values and use a dummy to swap: better at storage!

Recursive (top-down) with memoization:

function fib1fast(n)
    create an array f[0 … n], all entries empty
    return fibmemo(n, f)

function fibmemo(n, f)
    if n ≤ 1: return n
    elif f[n] is not empty: return f[n]      # always being picked from memory!
    else:
        f[n] = fibmemo(n − 1, f) + fibmemo(n − 2, f)
        return f[n]

The subproblems n, n−1, n−2, n−3, … form a DAG structure: each value depends only on smaller ones.
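The two-value bottom-up variant the slide alludes to, as a runnable sketch (my own Python illustration):

def fib(n):
    a, b = 0, 1                 # f[i-2], f[i-1] -- the only state we need
    for _ in range(n):
        a, b = b, a + b         # the "dummy swap": O(1) storage, O(n) time
    return a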
Dynamic Programming

Shortest Path in DAG (from a starting node)

The DAG structure means it can be solved by recursion!
We were considering vertices u in linearized (topological) order and then applying the update
dist(v) = min over edges (u, v) of { dist(u) + l(u, v) }.

[Example worked on the slide. Bookkeeping of the best predecessor is needed to get the path, not just the distance till each node.]

Iteration?!
Take the linearization S, C, A, B, D, E numbered 0, 1, 2, 3, 4, 5 (S / 0 is the starting node). We need the implicit DAG to avoid infinite recursion.
Careful: here m and n denote vertices, not the # of edges and nodes.

function Sspdag(n)
    create an array dist[0 … |V| − 1] with all values ∞
    return Sspdagmemo(n, dist)

function Sspdagmemo(n, d)
    if n = 0: return 0
    elif d[n] ≠ ∞: return d[n]
    elif in-degree(n) = 0: return ∞
    else:
        for every edge (m, n):        # can use the adjacency list of the reverse graph
            d[n] = min(d[n], l(m, n) + Sspdagmemo(m, d))
        return d[n]

Complexity? Topological sort — linear; reversing the graph — linear, so in-degree — linear; edge presence — linear; PLUS the loop over all edges (lookup, add, min) — linear.
Time: O(|V| + |E|); memory for memoization: O(|V|).
Dynamic programming is a very powerful algorithmic paradigm in which a problem is solved by identifying a collection of subproblems and tackling them one by one, smallest first, using the answers to small problems to help figure out larger ones, until the whole lot of them is solved. Conceptually, the DAG of subproblems is implicit.

The path as well…

function Sspdag(n)
    create an array dist[0 … |V| − 1] with all values ∞
    create a list prev[0 … |V| − 1] with all empty lists [ ]
    D, P = Sspdagmemo(n, dist, prev)
    return D, P

function Sspdagmemo(n, d, p)
    if n = 0: return 0, [n]
    elif d[n] ≠ ∞: return d[n], p[n]
    elif in-degree(n) = 0: return ∞, [ ]
    else:
        for every edge (m, n):
            δ, ρ = Sspdagmemo(m, d, p)
            v = l(m, n) + δ
            d[n] = min(d[n], v)
            if d[n] == v:
                p[n] = ρ + [n]        # bookkeeping: the best path to n so far
        return d[n], p[n]

We can also find the longest path, or the smallest product (instead of sum), with the same scheme.
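A runnable bottom-up counterpart (my own Python illustration; it assumes the vertices are already numbered in topological order, with the source first):

import math

def dag_shortest_paths(n, edges, s=0):
    dist = [math.inf] * n
    dist[s] = 0
    prev = [None] * n
    for u, v, l in sorted(edges):      # increasing u = topological order of sources
        if dist[u] + l < dist[v]:
            dist[v] = dist[u] + l
            prev[v] = u                # bookkeeping to reconstruct the path
    return dist, prev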
Longest Increasing Subsequence
A sequence: a_1, a_2, …, a_n.
A subsequence keeps the relative order but need not be contiguous.
An increasing subsequence: a_{i1} < a_{i2} < … < a_{ik}, with i1 < i2 < … < ik.
Find: the longest increasing subsequence.
[Example worked on the slide.]
Let's build a graph:
- Consider each number a_i in the sequence as a node.
- Starting from the first (leftmost) node, add a directed edge (i, j) whenever i < j and a_i < a_j.

Time complexity and space: up to n(n − 1)/2 edges, i.e., O(n²), with n → |V|.
The result will be a DAG, as for every edge (i, j) we have i < j.
All paths in the graph G will represent increasing subsequences!
All edges have the same weight, so find the longest path in G — equivalently, negate the edge weights and find a shortest path in the DAG.

Top-down / bottom-up using the DAG (|E| can be Θ(|V|²)):
But naïvely we would compute the longest path from every node to every node! Memory for memoization: |V| × |V|; TD time complexity: |V|²(|V| + |E|); BU time complexity: |V| |E|. It actually can be cheaper than this!

Better: build another DAG by adding a dummy source with 0-length edges to every node; the order in the sequence is already a linearization. Bottom-up, with 0 initialization:
L(j) = length of the longest path (increasing subsequence) ending at j
     = 1 + max { L(i) : (i, j) ∈ E },
with bookkeeping of the arg max. The answer is the maximum of the longest paths, max_j L(j).

End (Longest increasing subsequence):
Memory: linear in |V| (n, the # of elements in the sequence). Time complexity: O(|E| + |V|) — the inverse graph is linear to build, max over L is linear, and within the loop a constant cost is applied once per edge; |E| ranges between n and ~n².
In the top-down approach, initialize with 0 and use max, just like bottom-up! The complexity stays linear in the graph size, and space is linear.
*There is an O(n log n) solution available.

Dynamic, without the DAG? Same complexity O(n²), NO DAG — just the loop: ∀ i < j where a_i < a_j.
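The O(n²) loop version as a runnable sketch (my own Python illustration):

def lis(a):
    n = len(a)
    L = [1] * n                        # L[j]: LIS length ending at j
    prev = [-1] * n                    # bookkeeping for the path
    for j in range(n):
        for i in range(j):
            if a[i] < a[j] and L[i] + 1 > L[j]:
                L[j], prev[j] = L[i] + 1, i
    j = max(range(n), key=L.__getitem__)
    out = []
    while j != -1:                     # walk the bookkeeping backwards
        out.append(a[j])
        j = prev[j]
    return out[::-1]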
Dynamic Programming

Worst implementation:
Plain recursion — no memory used for storing values, but exponential complexity.
Redundant computations: you compute the same stuff again and again!

Preferred implementation: time O(n log n), space O(n). It involves binary search.
Edit Distance
Input: two strings x[1 … m], y[1 … n]
Task: compute the fewest number of edits to convert x into y.

Operations allowed:
insert a char into x, delete a char from x, substitute a char for another in x.

A DP solution:
Let E(i, j) be the edit distance between the prefixes x[1 … i] and y[1 … j]. The following edits are possible among the right-most entities to match the strings:
- Del (delete x[i]): cost 1, on top of E(i − 1, j);
- Ins (insert y[j]): cost 1, on top of E(i, j − 1);
- Sub (substitute): cost diff(i, j) — 0 if x[i] = y[j], else 1 — on top of E(i − 1, j − 1).
So E(i, j) = min{ 1 + E(i − 1, j), 1 + E(i, j − 1), diff(i, j) + E(i − 1, j − 1) }.
Prev (next) pointers can record the operation chosen at each cell, to recover the edit script.

Base cases, anchored at (0, 0): E(0, j) = j and E(i, 0) = i.

Bottom-up: computation can be done moving left-to-right row-wise or top-to-bottom column-wise.
Top-down: start from E(m, n) and recursively call functions giving the 'top', 'left' and 'top-left' values [very similar].
Run-time complexity: O(mn).

Memory: storing the DP table, O(mn) — needed in the top-down version.
Bottom-up: inside the loop we need to store only the values from the current & previous rows, or the current & previous columns — O(min(m, n)) if we sweep along the longer dimension.
The underlying DAG:
Edit distance is a shortest path in a DAG over the table cells, with edge weights 0 or 1 (or different weights for different operations).
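A two-row bottom-up sketch (my own Python illustration):

def edit_distance(x, y):
    m, n = len(x), len(y)
    prev = list(range(n + 1))                 # E(0, j) = j
    for i in range(1, m + 1):
        cur = [i] + [0] * n                   # E(i, 0) = i
        for j in range(1, n + 1):
            diff = 0 if x[i - 1] == y[j - 1] else 1
            cur[j] = min(1 + prev[j],         # delete x[i]
                         1 + cur[j - 1],      # insert y[j]
                         diff + prev[j - 1])  # substitute / match
        prev = cur
    return prev[n]                            # O(mn) time, O(n) space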
Dynamic Programming

Knapsack
Input: a list of n items with weights w_i and values v_i, and a knapsack with capacity W.
Task: pack items into the knapsack maximizing the total value.

For the slide's example instance:
- Only 1 quantity of each item available (0/1 knapsack): one item 1 and one item 3 — 46 dollars.
- Unlimited quantity of each item available (unbounded knapsack): one item 1 and two of item 4 — 48 dollars.
Note: greedy (by value) won't work. Example: W = 20, item 1 (w = 11, v = 15), item 2 (w = 10, v = 10), item 3 (w = 10, v = 10). Greedy takes item 1 (value 15), after which nothing else fits, while items 2 + 3 give 20.

Unbounded case, subproblem: K(w) = maximum value achievable with capacity w:
K(w) = max over i with w_i ≤ w of { K(w − w_i) + v_i } — which i attains the max is the bookkeeping.
Algorithm: fill K(0 … W) bottom-up with the above recurrence, with bookkeeping.

Time complexity: O(nW)
Memory: O(W)

Note: the input is usually considered to be the n items and the log₂(W) bits representing the capacity. Therefore, the time complexity is actually exponential with respect to the input size (pseudo-polynomial!).

Knapsack in this form is just a variant of finding the longest path in a DAG!
Solution for the 0/1 case: K(w, j) = best value with capacity w using items 1 … j:
K(w, j) = max{ K(w − w_j, j − 1) + v_j, K(w, j − 1) } — j is taken vs. j is not taken —
a 2D table of size (n + 1) × (W + 1).
Time: O(nW). Memory: O(W) bottom-up (two rows suffice), O(nW) top-down.
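A one-row bottom-up sketch of the 0/1 case (my own Python illustration; items is a list of (w, v) pairs):

def knapsack_01(items, W):
    K = [0] * (W + 1)
    for w_i, v_i in items:
        for c in range(W, w_i - 1, -1):   # downward: each item used at most once
            K[c] = max(K[c], K[c - w_i] + v_i)
    return K[W]                           # O(nW) time, O(W) memory

Iterating the capacities upward instead would re-use items — i.e., the unbounded variant.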


Dynamic Programming

Chain Matrix Multiplication
The greedy approach of smallest cost first does not work!
[Counterexample on the slide: count the number of scalar multiplications involved under each parenthesization; the cheapest first product can force a more expensive overall order.]

Subproblem: with A_k of dimensions m_{k−1} × m_k, let C(i, j) be the minimum cost of computing A_i × A_{i+1} × … × A_j.
Sol: C(i, j) = min over i ≤ k < j of { C(i, k) + C(k + 1, j) + m_{i−1} m_k m_j }, with C(i, i) = 0.
Optimal: O(n²) subproblems stored, each taking O(n) for the min over k.
Running complexity: O(n³).
Bookkeeping: store k (which multiplication is done last).
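A runnable sketch (my own Python illustration; dims = [m0, m1, …, mn] so that A_k is m_{k−1} × m_k):

def matrix_chain(dims):
    n = len(dims) - 1
    INF = float("inf")
    C = [[0] * (n + 1) for _ in range(n + 1)]
    split = [[0] * (n + 1) for _ in range(n + 1)]
    for span in range(2, n + 1):              # chain length, smallest first
        for i in range(1, n - span + 2):
            j = i + span - 1
            C[i][j] = INF
            for k in range(i, j):             # last multiplication splits at k
                cost = C[i][k] + C[k + 1][j] + dims[i - 1] * dims[k] * dims[j]
                if cost < C[i][j]:
                    C[i][j], split[i][j] = cost, k
    return C[1][n], split                     # O(n^3) time, O(n^2) space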
Travelling Salesman Problem
There are n places {1, 2, …, n} and the distance between each pair of places is given by d_ij (complete graph).

Goal: starting from place 1, visit all the places exactly once by travelling the minimum distance in total.

So an exhaustive algo will take O(n!) time.

Let's see what dynamic programming can do… Subproblems?
For S ⊆ {1, 2, …, n} with 1, j ∈ S, let C(S, j) be the minimum distance travelled to visit each place in S exactly once, starting from 1 and ending at j.
If |S| = 1: C({1}, 1) = 0.
C(S, 1) = ∞ for |S| > 1 — we cannot end at the starting place 1.

Travelling from i to j:
C(S, j) = min over i ∈ S, i ≠ j of { C(S \ {j}, i) + d_ij }, with bookkeeping of the arg min.

Each of the ≤ n·2ⁿ subproblems is linear to evaluate: time O(n² 2ⁿ); memory O(n 2ⁿ).
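A bitmask sketch of this DP (Held-Karp; my own Python illustration with the cities renumbered 0 … n−1 and city 0 as the start):

from itertools import combinations

def tsp(d):
    n = len(d)
    C = {(1, 0): 0}                          # subset {0}, ending at 0
    for size in range(2, n + 1):
        for rest in combinations(range(1, n), size - 1):
            S = 1 | sum(1 << j for j in rest)
            for j in rest:                   # end city j (never 0)
                C[(S, j)] = min(C[(S ^ (1 << j), i)] + d[i][j]
                                for i in [0] + list(rest)
                                if i != j and (S ^ (1 << j), i) in C)
    full = (1 << n) - 1
    return min(C[(full, j)] + d[j][0] for j in range(1, n))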
Dynamic Programming

Floyd-Warshall Algorithm

Applying Bellman-Ford with every vertex as the starting node will cost O(|V|²|E|).
A DP solution achieves O(|V|³)!

Subproblem:
Let's start with the shortest paths between all pairs involving no intermediate nodes, and grow from there, considering 1 more node of the graph at a time till we consider all of them.
Number the vertices 1 … |V| and define dist(i, j, k) as the shortest path from i to j considering only the nodes {1, 2, …, k} as the intermediates.
Base case:
dist(i, j, 0) is the edge length l(i, j) if an edge exists between i and j [0 for i = j], otherwise ∞.

Assuming no negative cycles, with the introduction of the node k into the set of intermediate nodes being considered {1, …, k − 1}, we need to recalculate all distance values:
dist(i, j, k) = min{ dist(i, j, k − 1), dist(i, k, k − 1) + dist(k, j, k − 1) }

Bookkeeping (store the min path).
O(n³) time complexity.
Memory: TD — O(n³); BU — O(n²), since layer k only needs layer k − 1.
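An in-place bottom-up sketch (my own Python illustration; dist is an n × n matrix with 0 on the diagonal and math.inf for missing edges):

def floyd_warshall(dist):
    n = len(dist)
    for k in range(n):                 # admit node k as an intermediate
        for i in range(n):
            for j in range(n):
                via_k = dist[i][k] + dist[k][j]
                if via_k < dist[i][j]:
                    dist[i][j] = via_k
    return dist                        # O(n^3) time, O(n^2) memory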
Reliable Shortest Path
Shortest path algos do not consider the number of hops (edges / nodes) in them. In some applications more hops may represent a higher chance of a connection drop!

Given a starting node s, get the shortest path to any node v that uses only k (or fewer) edges.

Subproblem: dist(v, i) — the shortest path from s to v using at most i edges, ∀v.
Optimal: dist(v, i) = min{ dist(v, i − 1), min over edges (u, v) of { dist(u, i − 1) + l(u, v) } }
The answer is dist(v, k). Cost: O(mk); storage: O(n) (previous round only) or O(nk) (full table).
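A Bellman-Ford-style sketch (my own Python illustration; edges is a list of (u, v, length) triples):

import math

def reliable_shortest_paths(n, edges, s, k):
    dist = [math.inf] * n
    dist[s] = 0
    for _ in range(k):                 # one more allowed hop per round
        new = dist[:]                  # dist(., i) built from dist(., i-1)
        for u, v, l in edges:
            if dist[u] + l < new[v]:
                new[v] = dist[u] + l
        dist = new
    return dist                        # O(mk) time, O(n) working memory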
Flow Network

Flow in Networks
Goal: a directed graph; send as much quantity as possible from the source s to the sink t. There is a maximum capacity associated with every edge in the graph.

Such a scheme — per-edge flows respecting the capacities and conserving quantity at every intermediate node — is called a flow (network flow). Any assignment satisfying the constraints is a feasible flow; the task is to find the one of maximum size.

Given a set of linear constraints and a linear objective function to maximize — this is a linear program! The max-flow problem is a linear programming problem.
[In the slide's example LP: 11 variables (# edges) and 5 conservation constraints (# of intermediate nodes).]

So the problem is a generic LP one! But dedicated max-flow algorithms exist independently.
Basically, apply a path-finding algorithm (like DFS): choose an s-t path and send as much quantity through it as its bottleneck(s) allow. Reduce the edge capacities by the corresponding current flows (remove the saturated bottleneck edges), and then repeat. Greedy, and guaranteed to stop!
O(m(m + n)), but optimality is not guaranteed: the result depends on where the chosen paths start and end, and an unlucky early path can block better ones [example on the slide].
We can get around this issue! Ford-Fulkerson Algorithm — cancel existing flow.

For every edge (u, v) in a chosen s-t path of the flow at an iteration, an edge (v, u) is introduced with zero capacity and negative flow. Then all the positive and negative flows are subtracted from their corresponding edge capacities to get the available capacities for the next iteration; in particular, the reverse edge gets residual capacity 0 − (−f_uv) = f_uv. The network thus formed in every iteration is called the residual network.
- If an edge already existed in the opposite direction, that can easily be incorporated in the framework.
- The s-t path finding, say using linear-time DFS, happens on the residual network. (On a network with no flow yet, this yields the same result as before!)
- The flow is updated by adding the corresponding edge weights in the selected s-t path to the current flow.

[Example on the slide: the current flow on the left, the residual graph on the right.]
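A compact sketch of Ford-Fulkerson with BFS path selection — the Edmonds-Karp rule discussed later (my own Python illustration; cap is a dict of dicts of residual capacities in which every edge also has its reverse entry, 0 if absent in the original graph):

from collections import deque

def max_flow(cap, s, t):
    flow = 0
    while True:
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:          # BFS for an augmenting path
            u = q.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:                   # no augmenting path left: done
            return flow
        path, v = [], t                       # recover the path s -> t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        b = min(cap[u][v] for u, v in path)   # bottleneck value
        for u, v in path:
            cap[u][v] -= b                    # consume residual capacity
            cap[v][u] += b                    # allow cancelling this flow later
        flow += b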
Flow Network

Cut:
An (s, t)-cut partitions the vertices into two disjoint groups L and R such that s is in L and t is in R. Its capacity is the total capacity of the edges from L to R, and it is an upper bound on any flow.
[Slide example: a tight upper bound of 7 and a loose upper bound of 19.]

This leads to the proof of correctness as well!


- In the original graph, we have seen that any cut capacity upper-bounds the flow size; we now show that the minimum cut capacity gives the maximum flow size.
- [Loose argument*] The Ford-Fulkerson algorithm converges, as the flow into the sink increases with each iteration (cancelled flows cannot help there), leaving no incoming residual capacity at some point. (This need not hold for irrational capacities.)
When the Ford-Fulkerson algorithm is applied:
- If the algorithm has achieved the maximum flow size, no 'augmenting' path will be there in the residual graph to increase the flow size.
- Question: if the algorithm stops due to an 'augmenting' path being unavailable in the residual graph, does that mean the algorithm has achieved the maximum flow size?
* It is enough to prove that stopping results in a cut with capacity equal to the flow — hence the minimum capacity!
Let f be the final flow, so that t is no longer reachable from s in the residual network.
Let L be the set of all nodes reachable from s in the residual network, and R = V \ L.
Then (L, R) is an s-t cut.
To prove [tight*]: size(f) = capacity(L, R) — which is then the minimum cut.

In the residual graph (with e, e′ denoting edges of the original graph):
- By definition, as one cannot reach from L to R, any original edge from L to R must be full in capacity.
- Secondly, if there was any original edge from R to L that contributed to the flow, there would be a corresponding 'reverse' residual edge from L to R — not possible, as one cannot reach from L to R.
So all L→R edges are saturated and all R→L edges carry zero flow: size(f) = capacity(L, R).
Dinic's implementation: O(mn²). Edmonds-Karp algorithm: O(m²n).
- Each DFS (or BFS) takes linear time, O(m + n).
- How many iterations?
- If the edge weights are integers (fixed-point real values) with maximum value U (otherwise the algorithm may not terminate):
  each s-t path's bottleneck edge will have a weight of at least 1. The s node can have at most n − 1 neighbours and the edges to the neighbours can have the maximum value U, so the maximum possible number of iterations is (n − 1)U ≥ val(f*). You won't be iterating beyond val(f*) anyway.
  So the time complexity is O((m + n) val(f*)) — but U can be very high compared to m and n.
- The number of iterations can be restricted to O(mn) if BFS is used to find paths with the fewest edges (Edmonds-Karp).
Flow Network

Edmonds-Karp approach complexity:
Let f be a flow at which the edge (u, v) is critical (a bottleneck), and f′ a later flow at which (v, u) is selected (the first such after f). Write d_f(s, ·) for shortest-hop distances in the residual graph at f (edge weights taken as 1 in the BFS).

Since (u, v) lies on a BFS-shortest path at f:
  d_f(s, v) = d_f(s, u) + 1; (u, v) ∈ E, u, v ∈ V.
Saturating (u, v) introduces (if not already present) (v, u) ∈ E. When (v, u) is chosen at f′:
  d_f′(s, u) = d_f′(s, v) + 1 ≥ d_f(s, v) + 1 = d_f(s, u) + 2,
using the property (annexure) that a shortest-path distance for f ≤ the same for f′.
So between consecutive times an edge is critical, d(s, u) grows by at least 2. The maximum shortest-path (no looping!) distance from s to any u ≠ t is n − 2, so, as d_f(s, u) ≥ 0, an edge (u, v) will be a bottleneck at most (n − 2)/2 times.
# of iterations: O(mn) (some edge becomes critical in each iteration).

Notes:
• A shortest path from s to t also contains shortest paths from s to all the intermediate nodes.
• [Slide example: with an arbitrary path choice a pathological network needs 2 × 10¹⁰ iterations; with the shortest-path choice, 2 iterations.]
Annexure: proof of "a shortest-path distance for f ≤ the same for f′"
Let f¹ be the flow obtained from f using the chosen augmenting path, which changes G_f into G_{f¹}.
Take the base case: d_f(s, s) = d_{f¹}(s, s) = 0.
By induction, assume d_f(s, x) ≤ d_{f¹}(s, x) for every x with d_{f¹}(s, x) < k.
Let us assume [for contradiction] there is a vertex v for which d_f(s, v) > d_{f¹}(s, v), with d_{f¹}(s, v) = k.
In the residual graph G_{f¹} take the u so that d_{f¹}(s, v) = d_{f¹}(s, u) + 1
[u is the node just before v in the shortest path from s to v; it exists, with finite d_{f¹}(s, u)].
As d_{f¹}(s, u) = k − 1 < k, we have d_f(s, u) ≤ d_{f¹}(s, u).
Given the contradicting assumption, G_f cannot contain the edge (u, v): if it did, then by the triangular inequality of shortest distances,
d_f(s, v) ≤ d_f(s, u) + 1 ≤ d_{f¹}(s, u) + 1 = d_{f¹}(s, v) — which does not agree with the assumption!
So (u, v) does not exist in G_f, but it does in G_{f¹}! Hence (v, u) existed in G_f and became critical while augmenting f, which yielded (u, v) in G_{f¹} ((v, u) was removed). As (v, u) was then on a shortest path:
d_f(s, u) = d_f(s, v) + 1
d_f(s, v) = d_f(s, u) − 1 ≤ d_{f¹}(s, u) − 1 = d_{f¹}(s, v) − 2
indicating d_f(s, v) ≤ d_{f¹}(s, v), which REJECTS our assumption!
As f¹ is the flow just after augmenting one path onto f, chaining this over successive augmentations gives d_f(s, v) ≤ d_f′(s, v) for any f′ that is obtained after f.
NP-Complete & NP-Hard

Introduction
We saw efficient POLYNOMIAL-TIME algorithms for finding a shortest path in a graph, a graph's minimum spanning tree, etc.
They are intelligent ways of reducing the search space of solutions!
A graph with n vertices can have up to n^{n−2} spanning trees (complete graph) [Kirchhoff's theorem], and a typical graph has an exponential number of paths from s to t (complete graph: on the order of (n − 2)!).
So exhaustive search would be of exponential (or worse) complexity.
But for certain problems, the most efficient solution invented till now is still of exponential complexity — possibly not much simpler than exhaustive search.

Satisfiability (SAT): a Boolean formula in conjunctive normal form — an AND of clauses, each an OR of literals.
Either find an assignment such that the result is TRUE, or report that none exists.
3-SAT (the problem with max 3 literals in a clause) is the canonical hard case; 1- and 2-SAT are polynomial.
n variables in the expression means 2ⁿ possibilities, and an exhaustive search through them performs the check on all clauses for each.

A search problem must have the property that any proposed solution S to an instance I can be quickly checked for correctness — in time polynomial in the input size.
Travelling Salesman Problem (TSP)
Seeing TSP as a search problem (slightly different!): given a budget b, check whether there is a tour of total length at most b — a proposed tour is quickly verified.
Exhaustive search would be worse than exponential! Our DP TSP algorithm was also exponential!
Non-deterministic* polynomial time
We denote the class of all search problems by NP.
Is P ≠ NP? Most experts think so, but it is not proven.
(* as in a non-deterministic Turing machine)
Reduction: a polynomial-time transformation of instances of problem A into instances of problem B, together with a polynomial-time map of B's solutions back to A's.
If A reduces to B: if A is hard then B is also hard; if B is efficient, so is A. The converse may not be true!

A problem is NP-hard if all search problems in NP reduce to it. If an NP-hard problem is in NP (or is a search problem), then it is NP-complete.
If even one NP-complete problem is in P, then P = NP. Note: P ⊆ NP.
Examples from the slide's reduction picture:
- Independent Set: the largest subset of graph vertices with no edge between them;
- Vertex Cover (related to Set Cover);
- Knapsack;
- Clique: the largest vertex subset that forms a complete graph (all edges present).
Why only P, NP? Moving from subsets to supersets beyond P and NP:

EXPTIME — the set of all decision problems that are solvable by a deterministic Turing machine in exponential time.
NEXPTIME — the set of decision problems that can be solved by a non-deterministic Turing machine in exponential time.
PSPACE — the set of all decision problems that can be solved by a Turing machine using a polynomial amount of space.
EXPSPACE — the set of all decision problems solvable by a Turing machine in an exponential amount of space.

We also have L, NL, NL-hard, NL-complete… with NL ⊆ P.
Branch & Bound for Travelling Salesman
Explore partial tours from the starting point as a search tree, pruning a branch as soon as its lower bound already exceeds the best complete tour found so far.
Cost (lower bound) = cost of the completed partial tour + (at least) the cost of completing the partial tour.
A lower bound on the cost of completing the partial tour: a sum of cheap completion parts — e.g., the lightest edges connecting the partial tour's two ends to the remaining cities plus an MST of the remaining cities (one standard choice).
[Worked example on the slide, ending in the TSP solution.]
Approximation for TSP
Assume a complete graph in which the distances (edge lengths) satisfy the triangle inequality.
If we remove an edge from the TSP solution, it will be a spanning tree! So MST cost ≤ optimal TSP cost. (Recall: in a TSP tour we cannot go through the same node (city) twice.)
Let's consider the MST and drop the requirement that we cannot visit the same city twice. We can then traverse each MST edge at most twice to visit all places and return. So the cost is at most 2 times that of the MST, i.e., at most 2 times the optimal TSP cost (the two bounds meet when opt. TSP cost = MST cost).
Now, invoking the 'no 2 visits' clause: traversing directly from one place to the next unvisited one, without going through the MST edges twice, can only reduce the cost, by the triangle inequality! Remove duplicate visits by direct paths — a polynomial-time 2-approximation!
