
Lecture 12
Bellman-Ford, Floyd-Warshall, and Dynamic Programming!
Announcements
• HW5 due Friday
• Midterms have been graded!
• Pick up your exam after class.
• Average: 84, Median: 87
• Max: 100 (x4)
• I am very happy with how well y’all did!
• Regrade policy:
• Write out a regrade request as you would on
Gradescope.
• Hand your exam and your request to me after class on
Wednesday or in my office hours Tuesday (or by
appointment).
Last time
• Dijkstra’s algorithm!
• Solves single-source shortest path in weighted graphs.

[Figure: a weighted directed graph with vertices s, u, v, a, b, t, as in last lecture.]
Today
• Bellman-Ford algorithm
• Another single-source shortest path algorithm
• This is an example of dynamic programming
• We’ll see what that means
• Floyd-Warshall algorithm
• An “all-pairs” shortest path algorithm
• Another example of dynamic programming
Recall
• A weighted directed graph: weights on edges represent costs.
• The cost of a path is the sum of the weights along that path.
• A shortest path from s to t is a directed path from s to t with the smallest cost.
• The single-source shortest path problem is to find the shortest path from s to v for all v in the graph.
[Figure: a weighted directed graph in which one path from s to t has cost 22, and another, of cost 10, is the shortest path from s to t.]
One drawback to Dijkstra
• Might not work with negative edge weights.
• On your homework!
• Why would we ever have negative weights?
• Negative costs might mean benefits.
• eg, it costs me -$2 when I get $2.
[Figure: the same graph with some edge weights made negative.]
Bellman-Ford Algorithm
• Slower (but arguably simpler) than Dijkstra’s algorithm.
• Works with negative edge weights.

Bellman-Ford Algorithm
• We keep* an array d(k) of length n for each k = 0, 1, …, n-1.
• Formally, we will maintain the loop invariant: d(k)[b] is the cost of the shortest path from s to b with at most k edges in it, for all b in V.
• For example, s-v-t is the shortest path from s to t with at most two edges in it.
• But it’s not the shortest path from s to t (with any number of edges).
• That’s s-u-v-t.
[Figure: directed graph with vertices s, u, v, t and edges s→u (weight 2), u→s (1), s→v (5), u→v (2), v→t (-2).]
*We won’t actually store all these, but let’s pretend we do for now.

Bellman-Ford Algorithm
• We keep* an array d(k) of length n for each k = 0, 1, …, n-1.
• Loop invariant: d(k)[b] is the cost of the shortest path from s to b with at most k edges in it, for all b in V.
• Initially:

          s    u    v    t
  d(0):   0    ∞    ∞    ∞

  (d(1), d(2), d(3) are not yet filled in.)
Now update!
(While maintaining: d(k)[b] is the cost of the shortest path from s to b with at most k edges in it.)
• We will use the table d(0) to fill in d(1)
• Then use d(1) to fill in d(2)
• …
• Then use d(k-1) to fill in d(k)
• …
• Then use d(n-2) to fill in d(n-1)
• This eventually gives us what we want:
• d(k)[a] is the cost of the shortest path from s to a with at most k edges.
• Eventually we’ll get all the shortest paths…
How do we get d(k)[b] from d(k-1)?
Want to maintain: d(k)[b] is the cost of the shortest path from s to b with at most k edges in it.
• Two cases:
• Case 1: the shortest path from s to b with at most k edges actually has at most k-1 edges.
  • Then d(k)[b] = d(k-1)[b].
• Case 2: the shortest path from s to b with at most k edges really has k edges.
  • Then d(k)[b] = d(k-1)[a] + w(a,b) for some a, so:
  • d(k)[b] = min_a { d(k-1)[a] + w(a,b) }
[Figure: the two cases illustrated with k = 3.]
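For instance, on the example graph, take k = 2 and b = t; the only edge into t is (v,t), with w(v,t) = -2, and d(1)[v] = 5 (the direct edge s→v). So the update reads: d(2)[t] = min{ d(1)[t], d(1)[v] + w(v,t) } = min{ ∞, 5 + (-2) } = 3, which is exactly the cost of the path s-v-t.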
Bellman-Ford Algorithm*
• Bellman-Ford*(G,s):
  • Initialize d(k) for k = 0, …, n-1
  • d(0)[v] = ∞ for all v other than s
  • d(0)[s] = 0
  • For k = 1, …, n-1:
    • For b in V:
      • d(k)[b] ← min{ d(k-1)[b], min_a { d(k-1)[a] + weight(a,b) } }
        (This minimum is over all a so that (a,b) is in E.)
  • Return d(n-1)
• If we set d(k)[b] to be the minimum of the previous two cases, then we maintain the loop invariant: d(k)[b] is the cost of the shortest path from s to b with at most k edges in it.
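As a concrete reference, here is a direct Python transcription of the pseudocode above — a sketch, not a definitive implementation. The graph representation (a dict of incoming edges plus a dict of edge weights) is an assumption for illustration, and, like the pseudocode, it keeps all n tables.

    import math

    def bellman_ford_tables(n, incoming, weight, s):
        # n        -- number of vertices, labeled 0..n-1
        # incoming -- incoming[b] lists the vertices a with an edge (a, b)
        # weight   -- weight[(a, b)] is the cost of edge (a, b)
        # s        -- the source vertex
        d = [[math.inf] * n for _ in range(n)]  # d[k][b]
        d[0][s] = 0
        for k in range(1, n):
            for b in range(n):
                # Case 1: the shortest path with <= k edges has <= k-1 edges.
                best = d[k - 1][b]
                # Case 2: it has exactly k edges, ending with some edge (a, b).
                for a in incoming[b]:
                    best = min(best, d[k - 1][a] + weight[(a, b)])
                d[k][b] = best
        return d[n - 1]

    # The four-node example (s=0, u=1, v=2, t=3), with edges as read off the figure:
    weight = {(0, 1): 2, (1, 0): 1, (0, 2): 5, (1, 2): 2, (2, 3): -2}
    incoming = {0: [1], 1: [0], 2: [0, 1], 3: [2]}
    print(bellman_ford_tables(4, incoming, weight, 0))  # [0, 2, 4, 2]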
Bellman-Ford Algorithm* Example
d(k)[b] is the cost of the shortest path from s to b with at most k edges in it.
• For k = 1,…,n-1:
  • For b in V:
    • d(k)[b] ← min{ d(k-1)[b], min_a { d(k-1)[a] + weight(a,b) } }
• Running this on the example graph fills in one table per round:

          s    u    v    t
  d(0):   0    ∞    ∞    ∞
  d(1):   0    2    5    ∞
  d(2):   0    2    4    3
  d(3):   0    2    4    2

[Figure: the example graph, with edges s→u (2), u→s (1), s→v (5), u→v (2), v→t (-2).]
Bellman-Ford Algorithm* Example
SANITY CHECK (reading off the table above):
• The shortest path with at most 1 edge from s to t has cost ∞. (There is no such path.)
• The shortest path with at most 2 edges from s to t has cost 3. (s-v-t)
• The shortest path with at most 3 edges from s to t has cost 2. (s-u-v-t)
• And this one is the shortest path!!!
How do we actually implement this?
(This is what the * on all the previous slides was for.)
• Don’t actually keep all the arrays d(k) around.
• Just keep two of them at a time; that’s all we need.
• Running time: O(mn)
  • That’s worse than Dijkstra, but BF can handle negative edge weights.
• Space complexity:
  • We need space to store the graph and two arrays of size n.

*WARNING: This is slightly different from the version of Bellman-Ford in CLRS. But we will stick with what we just saw for pedagogical reasons. See Lecture Notes 11.5 (listed on the webpage in the Lecture 12 box) for notes on the analysis of the slightly different CLRS version.
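A minimal sketch of that space-saving idea, assuming the same graph representation as the earlier sketch (only d(k-1) and d(k) are ever kept):

    import math

    def bellman_ford(n, incoming, weight, s):
        prev = [math.inf] * n  # plays the role of d(k-1)
        prev[s] = 0
        for _ in range(1, n):  # n-1 rounds of updates
            cur = list(prev)   # Case 1 by default: d(k)[b] = d(k-1)[b]
            for b in range(n):
                for a in incoming[b]:
                    # Case 2: a path with exactly k edges, ending with (a, b)
                    cur[b] = min(cur[b], prev[a] + weight[(a, b)])
            prev = cur
        return prev  # this is d(n-1)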
Bellman-Ford Algorithm*
• Bellman-Ford*(G,s):
  • Initialize d(k) for k = 0, …, n-1
  • d(0)[v] = ∞ for all v other than s
  • d(0)[s] = 0
  • For k = 1, …, n-1:
    • For b in V:
      • d(k)[b] ← min{ d(k-1)[b], min_a { d(k-1)[a] + weight(a,b) } }
  • Return d(n-1)
Why does it work?
• First, we’ve been asserting that: d(n-1)[b] is the cost of the shortest path from s to b with at most n-1 edges in it.
• Technically, this requires proof!
• We’ve basically already seen the proof!
• It follows from induction with the inductive hypothesis: d(k)[b] is the cost of the shortest path from s to b with at most k edges in it.
• Work out the details of this proof on your own! To help you, there’s an outline on the next slide. (Which we’ll skip now.)
Sketch of proof [skip this in lecture]
that this thing we’ve been asserting is really true
• Inductive hypothesis: d(k)[b] is the cost of the shortest path from s to b with at most k edges in it.
• Base case: for k = 0, d(0) = [0, ∞, ∞, …, ∞], which is correct.
• Inductive step: d(k)[b] ← min{ d(k-1)[b], min_a { d(k-1)[a] + weight(a,b) } }
  • Case 1: the shortest path from s to b has < k edges (the first term).
  • Case 2: the shortest path from s to b of length at most k edges has exactly k edges (the second term).
  • In either case, we make the correct update.
• Conclusion: when k = n-1, the inductive hypothesis reads: d(n-1)[b] is the cost of the shortest path from s to b with at most n-1 edges in it.
Is this the conclusion we want?
d(n-1)[b] is the cost of the shortest path from s to b with at most n-1 edges in it.
• We still need to prove that this implies BF* is correct.
• We return d(n-1).
• Need to show d(n-1)[a] = distance(s,a).
• Enough to show: the shortest path with at most n-1 edges is the shortest path with any number of edges.

DANGER!
Is the shortest path with at most n-1 edges really the shortest path with any number of edges?
• A negative cycle is a directed cycle with negative total cost.
• If the graph has a negative cycle, this might not be true.
• If there is a negative cycle, there may not be a shortest path between two vertices!
[Figure: the example graph with an added edge of weight -5, creating a negative cycle.]
But if there is no negative cycle
• Then not only are there shortest paths, but actually there’s always a simple shortest path.
  • [Figure: a path from s to t that loops around a cycle of non-negative cost. This cycle isn’t helping; just get rid of it.]
• A simple path in a graph with n vertices has at most n-1 edges in it.
  • “Simple” means that the path has no cycles in it.
  • [Figure: a simple path through s, u, v, t. Can’t add another edge without making a cycle!]
Let’s go after a new conclusion.
• Theorem:
  • The Bellman-Ford Algorithm* is correct as long as G has no negative cycles.

*We will prove this for our version of Bellman-Ford. See Notes 11.5 or CLRS for the CLRS version.
Proof
• By induction, d(n-1)[b] is the cost of the shortest path from s to b with at most n-1 edges in it.
• If there are no negative cycles, the shortest path with at most n-1 edges is the shortest path with any number of edges.
• This is because the shortest path is WLOG simple, and all simple paths have at most n-1 edges.
• So the thing we return is equal to the thing we want to return.
So that proves:
• Theorem:
  • The Bellman-Ford Algorithm* is correct as long as G has no negative cycles.
• Further, if G has a negative cycle, Bellman-Ford can detect that.
  • (See Notes 11.5)
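The standard detection trick (a sketch; the lecture defers the details to Notes 11.5) is to run one extra round of updates after the n-1 rounds: if any entry would still improve, some path keeps getting cheaper, which can only happen if a negative cycle is reachable from s.

    def has_negative_cycle(n, incoming, weight, dist):
        # dist is the array returned by Bellman-Ford, i.e. d(n-1).
        # If an n-th round of updates would improve anything, a negative
        # cycle is reachable from the source.
        for b in range(n):
            for a in incoming[b]:
                if dist[a] + weight[(a, b)] < dist[b]:
                    return True
        return False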
What have we learned?
• The Bellman-Ford algorithm is slower than Dijkstra:
  • O(mn) time.
• But it works with negative edge weights.
• You’ll see how Dijkstra does with negative edge weights in HW5.
• It doesn’t work with negative cycles, but in that case shortest paths don’t even make sense.
Bellman-Ford is also used in practice.
• eg, Routing Information Protocol (RIP) uses something
like Bellman-Ford.
• Older protocol, not used as much anymore.
• Each router keeps a table of distances to every other router:

  Destination     Cost to get there   Send to whom?
  172.16.1.0      34                  172.16.1.1
  10.20.40.1      10                  192.168.1.2
  10.155.120.1    9                   10.13.50.0

• Periodically we do a Bellman-Ford update.
• This also means that if there are changes in the network, this will propagate. (maybe slowly…)
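A toy sketch of one such update at a single router — everything here (the table format, names, and costs) is hypothetical, just to show the Bellman-Ford flavor of the protocol:

    def rip_update(my_table, neighbor_tables, link_cost):
        # my_table[dest] = (cost, next_hop)
        # neighbor_tables[nbr][dest] = the cost nbr advertises to dest
        # link_cost[nbr] = cost of our direct link to neighbor nbr
        for nbr, table in neighbor_tables.items():
            for dest, advertised in table.items():
                new_cost = link_cost[nbr] + advertised
                # Bellman-Ford-style relaxation: keep the cheaper route.
                if dest not in my_table or new_cost < my_table[dest][0]:
                    my_table[dest] = (new_cost, nbr)
        return my_table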
This was an example of…
What is dynamic programming?
• It is an algorithm design paradigm
• like divide-and-conquer is an algorithm design paradigm.
• Usually it is for solving optimization problems
• eg, shortest path
Elements of dynamic programming
• Big problems break up into little problems.
  • eg, shortest path with at most k edges.
• The optimal solution of a problem can be expressed in terms of optimal solutions of smaller sub-problems.
  • eg, d(k)[b] ← min{ d(k-1)[b], min_a { d(k-1)[a] + weight(a,b) } }
• We call this “optimal sub-structure.”

Elements of dynamic programming II
• The sub-problems overlap a lot.
  • eg, lots of different entries of d(k) ask for d(k-1)[a].
• This means that we can save time by solving a sub-problem just once and storing the answer.
• We call this “overlapping sub-problems.”
Elements of dynamic programming III
• Optimal substructure.
  • Optimal solutions to sub-problems are sub-solutions to the optimal solution of the original problem.
• Overlapping subproblems.
  • The subproblems show up again and again.
• Using these properties, we can design a dynamic programming algorithm:
  • Keep a table of solutions to the smaller problems.
  • Use the solutions in the table to solve bigger problems.
  • At the end we can use information we collected along the way to find the optimal solution.
  • eg, recover the shortest path (not just its cost), as in the sketch below.
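For example, here is a sketch of that last point for Bellman-Ford: while filling in the table, also record which edge produced each improvement, then walk those records backwards. The pred convention is an assumption for illustration, building on the two-array sketch above.

    import math

    def bellman_ford_with_paths(n, incoming, weight, s):
        prev = [math.inf] * n
        prev[s] = 0
        pred = [None] * n  # pred[b] = previous vertex on a shortest path to b
        for _ in range(1, n):
            cur = list(prev)
            for b in range(n):
                for a in incoming[b]:
                    if prev[a] + weight[(a, b)] < cur[b]:
                        cur[b] = prev[a] + weight[(a, b)]
                        pred[b] = a  # remember how we got here
            prev = cur
        return prev, pred

    def reconstruct_path(pred, s, t):
        # Walk predecessor pointers back from t (assumes t is reachable from s).
        path = [t]
        while path[-1] != s:
            path.append(pred[path[-1]])
        return path[::-1]  # e.g. [s, u, v, t] on the running example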
Two ways to think about and/or
implement DP algorithms

• Top down

• Bottom up

This picture isn’t hugely relevant but I like it.


Bottom up approach
• What we just saw.
• Solve the small problems first
• fill in d(0)
• Then bigger problems
• fill in d(1)
•…
• Then bigger problems
• fill in d(n-2)
• Then finally solve the real problem.
• fill in d(n-1)
Top down approach
• Think of it like a recursive algorithm.
• To solve the big problem:
• Recurse to solve smaller problems
• Those recurse to solve smaller problems
• etc..

• The difference from divide and


conquer:
• Memo-ization
• Keep track of what small problems you’ve
already solved to prevent re-solving the
same problem twice.
Example: top-down** version of BF*
The actual pseudocode here isn’t important, I just want to talk about the structure of it.
• Bellman-Ford*(G,s):
  • Initialize a bunch of empty tables d(k) for k=0,…,n-1
  • Fill in d(0)
  • for b in V:
    • BF*_helper(G, s, b, n-1)
• BF*_helper(G, s, b, k):
  • For each a so that (a,b) in E, and also for a=b:
    • If d(k-1)[a] is not already in the table:
      • d(k-1)[a] = BF*_helper( G, s, a, k-1 )
  • return min{ d(k-1)[b], min_a { d(k-1)[a] + weight(a,b) } }

*Not the actual Bellman-Ford algorithm; we don’t want to keep all these tables around
**Probably not the best way to think about Bellman-Ford: this is for DP pedagogy only!
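A hedged Python sketch of the same top-down structure (for DP pedagogy only, like the pseudocode above; the graph representation is the same assumption as in the earlier sketches):

    import math

    def bf_top_down(n, incoming, weight, s):
        memo = {}  # memo[(k, b)] = cost of a shortest s->b path with <= k edges

        def helper(b, k):
            if (k, b) in memo:
                return memo[(k, b)]  # already solved: this is the memo-ization
            if k == 0:
                ans = 0 if b == s else math.inf
            else:
                ans = helper(b, k - 1)        # Case 1: at most k-1 edges
                for a in incoming[b]:         # Case 2: last edge is (a, b)
                    ans = min(ans, helper(a, k - 1) + weight[(a, b)])
            memo[(k, b)] = ans
            return ans

        return [helper(b, n - 1) for b in range(n)]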
Visualization of the top-down approach
[Figure: a recursion tree. Computing d(n-1)[u] and d(n-1)[v] requires many d(n-2)[·] subproblems, which require d(n-3)[·] subproblems, and so on.]
• This is a really big recursion tree! Naively, n layers, so at least 2^n time!
Visualization of the top-down approach, with memo-ization
[Figure: the same subproblems, but each d(k)[·] node appears only once and is shared.]
• Now it’s a much smaller “recursion DAG!”
What have we learned?
• Dynamic programming is a paradigm in algorithm design.
• Useful when there’s optimal substructure:
  • Optimal solutions to a big problem break up into optimal sub-solutions of subproblems.
• Useful when there are overlapping subproblems:
  • Use memo-ization (aka, put it in a table) to prevent repeated work.
• Can be implemented bottom-up or top-down.
• It’s a fancy name for a pretty common-sense idea:
  • Don’t duplicate work if you don’t have to!
Why “dynamic programming” ?
• Programming refers to finding the optimal “program.”
• as in, a shortest route is a plan aka a program.
• Dynamic refers to the fact that it’s multi-stage.
• But also it’s just a fancy-sounding name.

[Picture: manipulating computer code in an action movie?]
Why “dynamic programming” ?
• Richard Bellman invented the name in the 1950’s.
• At the time, he was working for the RAND
Corporation, which was basically working for the
Air Force, and government projects needed flashy
names to get funded.
• From Bellman’s autobiography:
• “It’s impossible to use the word, dynamic, in the
pejorative sense…I thought dynamic programming was
a good name. It was something not even a
Congressman could object to.”
Another example
• Floyd-Warshall Algorithm
• This is an algorithm for All-Pairs Shortest Paths (APSP)
• That is, I want to know the shortest path from u to v for ALL
pairs u,v of vertices in the graph.
• Not just from a special single source s.
            Destination
             s    u    v    t
  Source s   0    2    4    2
         u   1    0    2    0
         v   ∞    ∞    0   -2
         t   ∞    ∞    ∞    0

[Figure: the example graph, with edges s→u (2), u→s (1), s→v (5), u→v (2), v→t (-2).]
Another example
• Floyd-Warshall Algorithm
• This is an algorithm for All-Pairs Shortest Paths (APSP)
• That is, I want to know the shortest path from u to v for ALL
pairs u,v of vertices in the graph.
• Not just from a special single source s.
• Naïve solution (if we want to handle negative edge weights):
• For all s in G:
• Run Bellman-Ford on G starting at s.
• Time O(n⋅nm) = O(n^2·m), which may be as bad as n^4 if m = n^2.
Optimal substructure
Label the vertices 1, 2, …, n. (We omit edges in the picture below.)
Sub-problem: For all pairs u,v, find the cost of the shortest path from u to v, so that all the internal vertices on that path are in {1,…,k-1}.
• Let D(k-1)[u,v] be the solution to this sub-problem.
[Figure: u and v, a highlighted set {1, 2, 3, …, k-1}, and the remaining vertices k, k+1, …, n. The shortest path from u to v through the highlighted set has length D(k-1)[u,v].]
• Question: How can we find D(k)[u,v] using D(k-1)?
How can we find D(k)[u,v] using D(k-1)?
D(k)[u,v] is the cost of the shortest path from u to v so that all internal vertices on that path are in {1, …, k}.
• Case 1: we don’t need vertex k.
  • Then D(k)[u,v] = D(k-1)[u,v].
• Case 2: we need vertex k.
[Figure: in Case 1, the shortest path stays inside {1,…,k-1}; in Case 2, it passes through vertex k.]
Case 2 continued: we need vertex k.
• Suppose there are no negative cycles.
• Then WLOG the shortest path from u to v through {1,…,k} is simple.
• If that path passes through k, it must look like this: a path from u to k, followed by a path from k to v.
  • The first part is the shortest path from u to k through {1,…,k-1}.
  • The second part is the shortest path from k to v through {1,…,k-1}.
  • (Sub-paths of shortest paths are shortest paths.)
• So D(k)[u,v] = D(k-1)[u,k] + D(k-1)[k,v].
[Figure: a path u → k → v whose two halves stay inside {1,…,k-1}.]
How can we find D(k)[u,v] using D(k-1)?
• D(k)[u,v] = min{ D(k-1)[u,v], D(k-1)[u,k] + D(k-1)[k,v] }
  • Case 1: cost of the shortest path through {1,…,k-1}.
  • Case 2: cost of the shortest path from u to k and then from k to v, through {1,…,k-1}.
• Optimal substructure:
  • We can solve the big problem using smaller problems.
• Overlapping sub-problems:
  • D(k-1)[k,v] can be used to help compute D(k)[u,v] for lots of different u’s.
• Using our paradigm, this recurrence immediately gives us an algorithm!
Floyd-Warshall algorithm
• Initialize n-by-n arrays D(k) for k = 0,…,n
  • D(k)[u,u] = 0 for all u, for all k
  • D(k)[u,v] = ∞ for all u ≠ v, for all k
  • D(0)[u,v] = weight(u,v) for all (u,v) in E.
  • (The base case checks out: the only paths through zero other vertices are edges directly from u to v.)
• For k = 1, …, n:
  • For pairs u,v in V^2:
    • D(k)[u,v] = min{ D(k-1)[u,v], D(k-1)[u,k] + D(k-1)[k,v] }
• Return D(n)
This is a bottom-up algorithm.
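A minimal Python sketch of this bottom-up algorithm, using the storage-saving observation mentioned on the next slide (one n-by-n array updated in place, rather than all n+1 of the D(k); the weight-dict representation is the same assumption as in the Bellman-Ford sketches):

    import math

    def floyd_warshall(n, weight):
        # D[u][v] will hold the cost of the shortest path from u to v.
        D = [[math.inf] * n for _ in range(n)]
        for u in range(n):
            D[u][u] = 0                      # base case: the empty path
        for (u, v), w in weight.items():
            D[u][v] = min(D[u][v], w)        # base case: direct edges
        for k in range(n):                   # now allow k as an internal vertex
            for u in range(n):
                for v in range(n):
                    # Case 1: keep D[u][v].  Case 2: route through vertex k.
                    if D[u][k] + D[k][v] < D[u][v]:
                        D[u][v] = D[u][k] + D[k][v]
        return D

On the running four-node example, floyd_warshall(4, weight) reproduces the table above; e.g., its first row is [0, 2, 4, 2].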


We’ve basically just shown
• Theorem: If there are no negative cycles in a weighted directed graph G, then the Floyd-Warshall algorithm, running on G, returns a matrix D(n) so that: D(n)[u,v] = distance between u and v in G.
• Work out the details of the proof! (Or see Lecture Notes 12 for a few more details.)
• Running time: O(n^3)
  • Better than running BF n times!
  • Not really better than running Dijkstra n times.
  • But it’s simpler to implement and handles negative weights.
• Storage:
  • Enough to hold two n-by-n arrays, and the original graph.
  • As with Bellman-Ford, we don’t really need to store all n of the D(k).
What if there are negative cycles?
• Just like Bellman-Ford, Floyd-Warshall can detect negative cycles.
• If there is a negative cycle, then some vertex v on it has a path from v back to v, using internal vertices from {1,…,n}, with cost < 0.
  • That’s just the definition of a negative cycle.
• So D(n)[v,v] < 0.
• So check for that at the end.
  • If there is such a v, return “negative cycle.”
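As a sketch, that check is one scan of the diagonal of the matrix returned by the floyd_warshall sketch above:

    def has_negative_cycle_fw(D):
        # A negative diagonal entry D[v][v] < 0 means some cycle
        # through v has negative total cost.
        return any(D[v][v] < 0 for v in range(len(D)))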
What have we learned?
• The Floyd-Warshall algorithm is another example of
dynamic programming.

• It computes All Pairs Shortest Paths in a directed weighted graph in time O(n^3).
Another Example?
• Longest simple path (say all edge weights are 1):
[Figure: a four-vertex directed graph on s, a, b, t.]
• What is the longest simple path from s to t?
This is an optimization problem…
• Can we use Dynamic Programming?
• Optimal Substructure?
  • Longest path from s to t = longest path from s to a + longest path from a to t?
• NOPE! This doesn’t work.
[Figure: the same four-vertex graph.]
What went wrong?
• The subproblems we came up with aren’t independent:
  • Once we’ve chosen the longest path from a to t, which uses b, our longest path from s to a shouldn’t be allowed to use b, since b was already used.
• Actually, the longest simple path problem is NP-complete.
  • We don’t know of any polynomial-time algorithms for it, DP or otherwise!
Recap
• Two more shortest-path algorithms:
• Bellman-Ford for single-source shortest path
• Floyd-Warshall for all-pairs shortest path
• Dynamic programming!
• This is a fancy name for:
• Break up an optimization problem into smaller problems
• The optimal solutions to the sub-problems should be sub-
solutions to the original problem.
• Build the optimal solution iteratively by filling in a table of sub-
solutions.
• Take advantage of overlapping sub-problems!
Next time
• More examples of dynamic programming!
• We will stop bullets with our action-packed coding skills, and also maybe find longest common subsequences.