Coursera Advanced Algorithms and Complexity
Network flows show up in many real-world situations in which a good needs to be transported across
a network with limited capacity: shipping goods across highways, routing packets across the internet.
In this unit, we will discuss the mathematical underpinnings of network flows and some important
flow algorithms. We will also give some surprising examples of seemingly unrelated problems that
can be solved with our knowledge of network flows.
Linear programming is a very powerful algorithmic tool. Essentially, a linear programming problem
asks you to optimize a linear function of real variables constrained by some system of linear
inequalities. This is an extremely versatile framework that immediately generalizes flow problems,
but can also be used to discuss a wide variety of other problems from optimizing production
procedures to finding the cheapest way to attain a healthy diet. Surprisingly, this very general
framework admits efficient algorithms. In this unit, we will discuss some important linear
programming problems along with some of the tools used to solve them.
Although many of the algorithms you've learned so far are applied in practice a lot, it turns out that
the world is dominated by real-world problems without a known provably efficient algorithm. Many of
these problems can be reduced to one of the classical problems called NP-complete problems, which
either cannot be solved by a polynomial algorithm, or for which solving any one would win you a million
dollars (see the Millennium Prize Problems) and eternal worldwide fame for resolving the main problem
of computer science, called P vs NP. It's good to know this before trying to solve a problem by
tomorrow's deadline :) Although these problems are very unlikely to become efficiently solvable in the
near future, people always come up with various workarounds. In this module you will study the
classical NP-complete problems and the reductions between them. You will also practice solving
large instances of some of these problems despite their hardness using very efficient specialized
software based on tons of research in the area of NP-complete problems.
After the previous module you might be sad: you've just gone through 5 courses in Algorithms only to
learn that they are not suitable for most real-world problems. However, don't give up yet! People are
creative, and they need to solve these problems anyway, so in practice there are often ways to cope
with an NP-complete problem at hand. We first show that some special cases of NP-complete
problems can, in fact, be solved in polynomial time. We then consider exact algorithms that find a
solution much faster than the brute-force algorithm. We conclude with approximation algorithms that
work in polynomial time and find a solution that is close to being optimal.
In most previous lectures we were interested in designing algorithms with fast (e.g. small polynomial)
runtime, and assumed that the algorithm has random access to its input, which is loaded into
memory. In many modern applications in big data analysis, however, the input is so large that it
cannot be stored in memory. Instead, the input is presented as a stream of updates, which the
algorithm scans while maintaining a small summary of the stream seen so far. This is precisely the
setting of the streaming model of computation, which we study in this lecture. The streaming model
is well-suited for designing and reasoning about small space algorithms. It has received a lot of
attention in the literature, and several powerful algorithmic primitives for computing basic stream
statistics in this model have been designed, several of them impacting the practice of big data
analysis. In this lecture we will see one such algorithm (CountSketch), a small space algorithm for
finding the top k most frequent items in a data stream.
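To make the streaming primitive mentioned above concrete, here is a minimal, illustrative CountSketch in Python. The class shape, parameter names (`d` rows, `w` buckets), and the tuple-hash scheme are our own choices for this sketch, not code from the course; a production implementation would use proper pairwise-independent hash families.

```python
import random

class CountSketch:
    """Minimal CountSketch: d rows of w signed counters.

    Each row hashes an item to one bucket and a random sign; an item's
    frequency is estimated as the median of its d signed counter
    readings, so collisions with other items tend to cancel out.
    """

    def __init__(self, d=5, w=256, seed=0):
        rng = random.Random(seed)
        self.d, self.w = d, w
        # One salt per row makes the d hash functions differ.
        self.salts = [rng.getrandbits(32) for _ in range(d)]
        self.table = [[0] * w for _ in range(d)]

    def _bucket_and_sign(self, row, item):
        h = hash((self.salts[row], item))
        return (h >> 1) % self.w, (1 if h & 1 else -1)

    def update(self, item, count=1):
        # Process one stream element: add `count` to item's counters.
        for r in range(self.d):
            b, s = self._bucket_and_sign(r, item)
            self.table[r][b] += s * count

    def estimate(self, item):
        # Median of the d signed readings is the frequency estimate.
        readings = []
        for r in range(self.d):
            b, s = self._bucket_and_sign(r, item)
            readings.append(s * self.table[r][b])
        readings.sort()
        return readings[len(readings) // 2]
```

Maintaining a small heap of the largest estimates as items go by is one way to extract the top k most frequent items from such a sketch.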
Flows in Networks
Introduction
Hello everybody, welcome to our course on Advanced Algorithms and Complexity. As the first unit in
this course, we're going to talk about network flow algorithms and, in particular, in this very first
lecture, we're just going to give an example of a problem to give you a feel for what types of things
we're going to be talking about in this unit.
So to set things up, suppose that you're a disaster relief manager for a city, and among other things,
you want to know how quickly the city could be evacuated in the case of an emergency. Well, to do
this you have to look at the roads leading out of the city, and so you see that
there's the main highway out that will handle 5000 cars an hour. Of course, this isn't the only road
leading out of the city. There are some secondary highways that can each handle 2000 cars an hour.
Of course, things are a little bit more complicated than that. These other roads each bifurcate
into two halves, and each of these halves can handle 1000 cars an hour. So you're maybe okay so far. But
it turns out that two of those halves merge together a little ways down into just a single road that
can only handle 1000 cars an hour. And you can imagine that in real life, there are many, many, many
more roads than this in the full road network, but this is a toy example. And we'd like to know, given
that this is the network, how quickly can we evacuate? Well, it's not hard to start playing around with
this. We can take 5000 cars an hour and send them out along the main road. We can send another
thousand cars an hour along this northern path here. Another thousand cars an hour can go along the
northern road and then split off and join in on the merged road, and finally, another thousand cars an
hour can go off on the third highway. Now, putting this all together, we have a total of 8000 cars an hour
that we can evacuate, but we'd like to know, is this the best that we can do or can you do better?
Well, if you play around with this a little bit, there's no obvious way to make an improvement, and
you might suspect that this is the best you can do, and in fact, you'd be correct. One way to show this
is to imagine a river where this blue line is on the diagram: you'll note that there are only four
bridges that cross that river, and the total capacity of all the bridges is only 8000 cars an hour.
So if only 8000 cars an hour can cross this river and you need to cross the river to get out of the city,
only 8000 cars an hour can evacuate the city. And that proves that this plan that we have for
evacuation is really the best you can do. It's bottlenecked at the river, you can't do any faster. So
network flow problems are problems that will allow us to study things like this problem. And this is
what we're going to be talking about in this unit. And next lecture, what we're going to do is we're
going to take a little bit of a more careful look at this problem. We're going to come up with a formal
framework to discuss this kind of issue, and then we're going to discuss some examples of where
these sorts of problems might show up in real life. So that's what you have to look forward to in the
next lecture. I hope to see you then.
Network Flows
Hello everybody.
Welcome back to our network flows unit.
Today we're going to be talking a little bit about formal definitions,
getting a concrete statement of our problem,
and then some examples of what sorts of problems fall into this category.
So, remember last time we discussed the disaster management problem.
Today what we're going to do is we're going to have a sort of formal framework
for talking about this problem and some similar problems.
Basic Tools
Residual Networks
Hello everybody, welcome back to our network flows unit. Today we're going to be talking about a
very useful tool called the residual network, for coming up with new flows, or adding a little bit of
flow to an existing flow.
So remember, last time we formally defined these things. We defined what a network was, we defined
what a flow on this network was, and then we defined the maxflow problem, which is the one we're
working towards solving. There's a very basic technique for solving maxflow, and it is basically
what we are going to be working towards for the next bunch of lectures. The idea is to build up
your flow a little bit at a time, and this is really what we did in the original example in the first lecture,
where we routed a bunch of cars along one road, and then routed some more cars along another
road and so on and so forth, and built up the final flow as the sum of a bunch of little flows. So how do
we do this in practice? Well, suppose that we have the following network. Here, all the edges have
capacity 1, for simplicity, and what we can do is just add flows together a little bit at a time. We can
say, hey, we can send a unit of flow along this top path, and if we just have a unit of flow on each of
these edges, everything balances. But after we do that, we can send another unit of flow along the
bottom path, and then another unit of flow along the middle. And once we've done this, we now have
a maximum flow, but we built it up in nice convenient little pieces.
Okay, so let's consider another example, this one's actually a little bit simpler. We have our network
here, the maximum flow is 2, as we've shown, but we're going to try to add flow incrementally. So
let's start by adding flow along this path; it's a perfectly valid path, we can route a unit of flow
through it. And now we want to try to add our second unit of flow, and there's a bit of a problem. We
can't readily add a second unit: we've already used up these edges, and the remaining edges just
don't connect to each other, so we can't actually get the flow to work. Now it turns out there is a way
around this, which of course there must be since the maximum flow is 2, and it involves routing flow
along this blue path,
which is a little bit weird, since we cannot actually do that. We can't actually send flow down along
the middle edge, since there is no edge there. But if you think about it in the right way, you can
think of sending flow down this middle edge as cancelling out the flow that we currently send in the
up direction. If the flow going up and the flow going down are thought to cancel each other, then
once we add these two flows together, we just get this flow, which is perfectly valid, because there's
no flow running along the middle edge.
And so, the moral of the story is that if you want to be able to appropriately add your little bit of flow,
it's sometimes not enough to just add flow along new edges; sometimes you also have to let your new
flow cancel flow along existing edges.
So given a network G and a flow f, what we're going to do is construct what's called the residual
network, G sub f, and this is a new network that represents the places where flow can be added to f.
Play video starting at 3 minutes 19 seconds and follow transcript3:19
But this includes not just edges where there's more room for extra flow to go along that edge, but
also places where we could cancel out existing flows.
So to define this formally: for each edge e of our graph, our residual network is going to have an
edge along e, and its capacity is going to be the original capacity of the edge minus the flow along
that edge. The point is that this is the amount of remaining capacity. Of course, if the flow is equal
to the capacity, we can ignore this edge, because it would have no capacity.
We also need to have an edge opposite e with capacity equal to the flow along e, because this will
represent the amount of flow that we can cancel going in the other direction.
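The definition above can be sketched in a few lines of Python. This is our own illustrative code, not the course's, and it assumes the network has at most one directed edge per ordered vertex pair:

```python
def residual_network(capacity, flow):
    """Build the residual network G_f.

    `capacity` and `flow` map directed edges (u, v) to numbers.
    Each edge contributes a forward residual edge with the leftover
    capacity c(e) - f(e), and a backward edge with capacity f(e),
    representing flow that could be cancelled. Edges with zero
    residual capacity are dropped.
    """
    residual = {}
    for (u, v), c in capacity.items():
        f = flow.get((u, v), 0)
        if c - f > 0:
            residual[(u, v)] = c - f   # room left on the original edge
        if f > 0:
            residual[(v, u)] = f       # flow we may cancel
    return residual
```

For instance, an edge of capacity 4 carrying 1 unit of flow yields a forward residual edge of capacity 3 and a backward edge of capacity 1.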
And so, for example, in this network that we had corresponding to our city evacuation problem, we
can define a cut that contains these four vertices. And the size of the cut, well, is the sum of the
capacities of these four roads, which ends up being 8000.
Okay, so to make sure we're all on the same page, here is a pretty simple network. There is a cut,
which is this blue square that contains four vertices on the inside. What's the size of this cut?
Well, you just have to look at which edges cross from inside the cut to outside the cut. These have
capacities one, two, and three. And so you add those up, and you get six as the answer.
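That computation is easy to express in code. Here is a small illustrative sketch of our own, with a cut represented by the set of vertices on its inside:

```python
def cut_size(capacity, cut):
    """Size of a cut: total capacity of the edges leaving the vertex set.

    Only edges from a vertex inside `cut` to one outside count; edges
    into the cut, or staying inside it, contribute nothing.
    """
    return sum(c for (u, v), c in capacity.items()
               if u in cut and v not in cut)
```

With three crossing edges of capacities 1, 2, and 3 (as in the quiz), this returns 6.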
Okay, the important thing, though, is that cuts provide upper bounds on the sizes of flows. In
particular, for any flow f and any cut C, the size of f is at most the size of C.
And this was sort of exactly the argument that we had, any piece of flow needs to cross the cut.
There's only so much capacity that lets you cross the cut, and so that's an upper bound on the flow.
Now, to make this rigorous, let's give a proof. The size of the flow is the sum over our sources of the
total flow out of that vertex minus the total flow into that vertex.
Now for vertices that aren't a source or sink, this term is zero. So we can extend this to a sum over
vertices inside our cut of the flow out of that vertex minus the flow into that vertex. On the other
hand, you'll note that, I mean, if you have an edge that stays within the cut, it comes out of one
vertex and into another and cancels out of the sum. So this is the same as the sum over edges that
leave the cut of the flow through that edge, minus the sum over edges that go into the cut of the flow
through that edge. Now of course, the flow on edges leaving the cut is at most the capacity of the
edge, and the flow on edges into the cut is at least zero. And so this thing is at most the sum over the
edges that leave the cut of the capacity of that edge, which is exactly the size of the cut.
So this proves the theorem.
And what this says is that if you have any cut C, that gives you an upper bound on the maximum flow.
The size of the maximum flow is at most the size of the cut.
Now it's good we've got some upper bounds, but the question is, is this good enough? I mean, there
are lots of ways to prove upper bounds. But what we really want is a sharp upper bound, one that is
good enough that once we've found a maximum flow, we'll have a matching upper bound telling us
you actually can't do any better than this. And, somewhat surprisingly, bounds of this form are
actually good enough.
So the big theorem here is known as the maxflow-mincut theorem. For any network G, the maximum
over flows of the size of the flow is equal to the minimum over cuts of the size of the cut.
In other words, there's always going to be a cut that's small enough to give the correct upper bound
on maximum flows.
So to prove this theorem, let's start with a very special case. What happens when the maximum flow
is equal to zero?
If this is the case, it has to be that there's no path from a source to a sink. If there were any path
from a source to a sink, then you could route a little bit of flow along that path, and your maxflow
would be positive.
So what we're going to do is we're going to let C be the set of vertices that are reachable from
sources. And it turns out there can't be any edges out of C at all because if there were, if there was an
edge that left C, then wherever you ended up, that would also be reachable from the source. And it
should be in C as well.
Now, since there are no edges leaving C, the size of the cut has to be 0.
Now, in the general case, we can do something similar. We're going to let f now be a maximum flow
for G. And then, we're going to look at the residual graph.
Now, if the residual graph, which is a way to talk about ways of adding flow to f, if that had any flow
that you could put in it, f couldn't be a maxflow.
So the residual graph has maxflow zero.
And what that means is that there's a cut C with size zero in this residual graph. And I claim that, in
the original network, this cut C has size exactly equal to the size of our flow f.
And the proof isn't hard. The size of f for any cut is actually the total flow out of that cut minus the
total flow into that cut.
But if C has size 0 in the residual graph, that means that all the edges leaving the cut need to have
been completely saturated, they need to have used the full capacity. And the edges coming in to C
had to have no flow, because otherwise the residual graph would have an edge pointing in the
opposite direction.
And so the size of f is just the total sum over edges leaving C of their capacity, minus the sum over
edges into C of zero, which is just the size of the cut. And so what we've found is a flow f and a cut C
where the size of the flow is equal to the size of the cut.
Now, by the previous lemma, you can't have any flows bigger than that cut, or any cuts smaller than
that flow.
And so this is the maximum flow, and it's equal to the minimum cut size.
So in summary, you can always check whether or not a flow is maximal by seeing if there's a matching
cut.
In particular, f is going to be a maxflow if and only if there's no source-to-sink path in the residual
graph, and this is the key criterion that we'll be using in the algorithm we'll discuss next time.
So, I hope to see you for the next lecture.
Maxflow algorithms
The Ford–Fulkerson Algorithm
Hello everybody, welcome back to our Flows in Networks unit. Today we're actually going to, finally,
give an algorithm to compute maximum flows. So the idea of this algorithm is very much along the
lines that we've been sort of hinting at the entire time. We're going to start with zero flow, in our
network, so the trivial flow, no flow along any edge. And we're going to repeatedly add a tiny bit of
flow, sort of building up the flow a little bit at a time, until we reach a state where it's impossible to
add any more flow, and then we'll be done. So how do we add flow?
You have some flow f. We then compute the residual network, Gf. And this really does represent the
ways in which flow can be added. So any new flow that we would have would be of the form f + g,
where g is a flow in our residual network. So if we want to replace f by a slightly larger flow, all we
need is a slightly positive flow in the residual network.
And to do that, all we want to do is see if there's a source to sink path in this network.
So, what happens if there's no path? If there's no source to sink path in our residual network, then
the set of vertices that we can reach from the source defines a cut of size 0.
That says there's no flow in the residual of positive size. And so any flow f + g has size at most the size
of f and f is a maximum flow. And so if that's the case, we're done. We already have a maximum flow
and we can just stop.
Now if there is a path, it turns out we can always add flow along that path.
What you do is add x units of flow to each edge along that path; then you have conservation of flow,
with x units in and x units out of each vertex on that path. And as long as x is at most the
minimum capacity of any of these edges in the residual graph, this is actually a flow in the residual
network.
So if we do this, we find some flow g for our residual network with size bigger than 0. Then we
replace f by f + g, and we've found a new flow where the size of f + g is strictly bigger than the size of
f. We've found a flow that's slightly bigger than the one we had before.
So to make this formal, we produced what's known as the Ford-Fulkerson algorithm for max flow. You
start by letting f be the trivial flow. And then you repeat the following. You compute the residual
graph for f.
You then try and find an s to t path, P, in this residual graph.
If there is no such path, we know that we already have a max flow so we can just return f.
Otherwise, what we're going to do is we're going to let X be the minimum capacity of any edge along
this path in the residual network.
We're going to let g be a flow, where g assigns X units of flow to each edge along this path.
And then we're going to let f be f + g. And when we do this, we've increased our flow by a little bit,
and we just keep repeating until we can't increase our flow anymore. So, for example, we've got the
network here. Here's our residual network. How much flow do we end up adding in one step?
Well, to figure this out you have to do two things. You first have to find your S to T path, which is this
one. And then you ask, how much capacity is there on the edges? Which edge has minimum
capacity? And that's this edge of capacity 4. And so in this case you'd route four units of flow on your
first step.
But, to really see how this algorithm works let's take the following example. So, we have a graph up
top. Initially we have no flow running through that graph so the graph below is the residual, is the
same network that we started with. And now what we want to do is we want to find paths in the
residual network. So here's an S to T path. The minimum capacity along this path is 5, so we route 5
units of flow along each of these edges. Now this updates the residual: we've got a new edge, an
edge that wasn't there before, and so on.
We now want to again find an S to T path in the residual graph. This one works out pretty well. Again,
the minimum capacity of these edges is 5, so we route 5 more units of flow along each of those edges
and we update the residual graph again. Once again, we find an S to T path in the residual graph. This
one works pretty well. The minimum capacity here on these edges is 2. So we route 2 more units of
flow along each of those edges.
And, at this point, once we've updated the residual we will note there is no S to T path. In fact, there's
a cut right here that prevents us from routing any more flow. And so given that cut you can actually
see that this flow which routes 12 total units of flow is actually a maximum flow and so we're done.
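The whole procedure can be sketched compactly in Python. This is our own illustrative implementation, not the course's code: it keeps residual capacities in a dictionary, finds augmenting paths by depth-first search, and assumes integer capacities so the loop terminates.

```python
from collections import defaultdict

def ford_fulkerson(capacity, s, t):
    """Maxflow by Ford-Fulkerson with DFS-found augmenting paths.

    `capacity` maps directed edges (u, v) to integer capacities.
    Residual capacities live in `res`; pushing flow along (u, v)
    lowers res[(u, v)] and raises the backward edge res[(v, u)].
    """
    res = defaultdict(int)
    adj = defaultdict(set)
    for (u, v), c in capacity.items():
        res[(u, v)] += c
        adj[u].add(v)
        adj[v].add(u)  # the backward residual edge may get capacity later

    def dfs(u, pushed, visited):
        # Find an augmenting path to t; return the flow pushed (0 if none).
        if u == t:
            return pushed
        visited.add(u)
        for v in adj[u]:
            if v not in visited and res[(u, v)] > 0:
                d = dfs(v, min(pushed, res[(u, v)]), visited)
                if d > 0:
                    res[(u, v)] -= d  # use up residual capacity
                    res[(v, u)] += d  # allow cancelling this flow later
                    return d
        return 0

    flow = 0
    while True:
        pushed = dfs(s, float("inf"), set())
        if pushed == 0:
            return flow  # no s-t path in the residual: flow is maximal
        flow += pushed
```

Because capacities are integers, each iteration adds at least one unit of flow, matching the runtime bound discussed in the lecture.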
So before we get into analyzing the run time of this algorithm, there's an important point to make.
We should note that if all the capacities that we have are integers in our original network, then all the
flows that we produce are also integer. Because every time we try and augment our flow along some
path, we look at the smallest capacity, which is always an integer. And so we put an integer amount of
flow everywhere and everything remains integer if we started with integers.
And there's an interesting lemma that we get out of this, which actually will prove useful to us later,
that says if you have a network G with integer capacities, there's always a maximum flow with integer
flow rates. And you can get it just by using the Ford-Fulkerson algorithm. Okay but now let's look at
the analysis.
And for this analysis to work I'm going to have to assume that all capacities are integers.
Now what does this algorithm do? Every time through this loop, we compute the residual graph and
then we try to find a path P in it. And each of these runs in O of the number of edges time.
Now, every time we do that, we increase the total flow by a little bit, in fact by at least 1. So the
number of times we do it is at most the total flow on our graph. So our total runtime is bounded by
the number of edges in our graph times the size of the maximum flow.
Now this is a little bit weird as a runtime, because it depends not just on sort of the structure of the
graph that we're working on, but also the capacities of the edges and the size of the maximum flow.
This leads us to a problem, where, potentially at least, if we have numerically very, very large
capacities in our graph, it could actually take us a very, very long time to compute the flow.
One other thing I should note about this algorithm is that it's not quite a full algorithm. What it says is
at every step I need to find some source to sink path in our residual.
Now, there might be many valid paths to choose from, and the Ford-Fulkerson algorithm, as I've
stated, doesn't really tell you which one to use.
Now you might just want to run depth-first search because it's very fast, but maybe that's not the
best way to do it. And as we'll see a little bit later in fact, finding the right way to pick these
augmenting paths can actually have a substantial impact on the runtime of the algorithm. But that's
for a little bit later. That's all for our lecture today. Next time, we'll talk a little bit more about the
runtime of this particular algorithm. So I hope to see you then.
Slow Example
Hello everybody, welcome back to our network flows unit. Today we're going to be talking about an
example of a network on which the Ford-Fulkerson algorithm might not be very efficient.
So last time we had this great algorithm for maxflow called the Ford-Fulkerson algorithm. The
runtime was O of the number of edges of the graph times the size of the maximum flow. Now, this is
potentially very bad if the size of the flow is large. On the other hand, this is sort of a theoretical
problem at this point. We don't know for sure whether or not this is ever actually a problem.
The Edmonds–Karp Algorithm
Hello everybody, and welcome back to our Network Flows unit. Today we're going to be talking about
a new algorithm for network flows, or maybe just a version of the old algorithm, that will do a little bit
better than what we had previously. So last time, we were still talking about the Ford-Fulkerson
algorithm for maxflow. The runtime, in general, is O of the number of edges times the size of the
flow. And last time we showed that this can actually be very, very slow on graphs with large capacities.
And in particular, we had this example where, if you routed flow along the wrong paths, you only got
one unit of flow every iteration, and it took millions of iterations to actually finish.
Fortunately though, we know that the Ford-Fulkerson algorithm gives us a choice as to which
augmenting path to use. And the hope is that maybe by picking the right path we can guarantee that
our algorithms won't take that long. And so in particular what we want to do is we want to find sort of
a principled way of picking these augmenting paths in order to ensure that our algorithm doesn't run
through too many iterations. And one way to do this is via what's known as the Edmonds-Karp
algorithm.
The idea of the Edmonds-Karp algorithm is as follows. We'd like to use the Ford-Fulkerson algorithm,
but we're always going to use the shortest possible augmenting path, that is, the shortest in the
number of edges being used.
And, basically all that this means is that if we want to find our augmenting paths, we want to use a
breadth-first search, rather than a depth-first search.
So, for example, if we're trying to run Edmonds-Karp on this example, then we can't use the zig-zag
path with three edges. We're required to pick this augmenting path with only two edges instead.
After we've done that there's another path with only two edges, and after we've done that there's
nothing left to be done. So at least on this example the Edmonds-Karp algorithm gives us the good
execution rather than the bad one.
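A minimal sketch of this shortest-augmenting-path variant can help make the idea concrete. This is our own illustrative code, not the course's implementation; it keeps residual capacities in a dictionary and assumes integer capacities:

```python
from collections import defaultdict, deque

def edmonds_karp(capacity, s, t):
    """Maxflow by Ford-Fulkerson with BFS (shortest) augmenting paths."""
    res = defaultdict(int)
    adj = defaultdict(set)
    for (u, v), c in capacity.items():
        res[(u, v)] += c
        adj[u].add(v)
        adj[v].add(u)  # backward residual edges may gain capacity later

    flow = 0
    while True:
        # Breadth-first search for a shortest s-t path in the residual.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and res[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow  # no augmenting path left: flow is maximal
        # Walk back from t to collect the path's edges.
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        x = min(res[e] for e in path)  # bottleneck capacity of the path
        for u, v in path:
            res[(u, v)] -= x
            res[(v, u)] += x
        flow += x
```

On the slow example with capacities of 1000 on the side edges and 1 in the middle, BFS never picks the three-edge zig-zag while two-edge paths remain, so the maxflow of 2000 is found in a couple of augmentations rather than thousands.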
Now, to really look into how well this works, we need to analyze these augmenting paths. So if you
have an S to T path, you'll note that when you add your augmenting flow, it always saturates some
edge; that is, it uses up all the available capacity of that edge.
And this is because the amount of flow we decided to run along this path was the minimum capacity of any of the edges in the residual graph. So whichever edge had only that much capacity left got saturated.
Now, once we add this augmenting flow, we have to modify the residual network. We end up with edges pointing backwards along each of these places, because we can now cancel out the flow we just added. And the edge that we saturated disappears from the residual graph, because we used up all of its remaining capacity.
So that's the idea of our analysis and the way we're going to show that this works is we're going to
start with a critical lemma.
The Edmonds-Karp algorithm is very concerned about distances in the residual graph because it looks
for short paths there.
And so we'd like to know how these distances change as the algorithm executes. Because as you run
your algorithm your residual graph keeps changing, and so the distances inside the residual graph
change.
Now the Lemma that we want is the following. As the Edmonds-Karp algorithm executes, if you take
any vertex v and look at the distances from the source to v, those distances only get bigger.
Similarly, look at the distance from v to t, or the distance from s to t; again, those can only increase, never decrease.
Once we have this lemma, the rest of the analysis is actually pretty easy. The distance between s and t in the residual graph can only increase, and it's never more than the number of vertices. So it can increase at most O(|V|) many times.
Now, between times that this distance increases, no edge can be saturated more than once, because once an edge is saturated it cannot be used again until the distance increases. So between increases you can only have O(|E|) many edge saturations.
But each augmenting path has to saturate an edge, so you can only have O(|E|) many such paths between increases in the distance between s and t. And that distance can increase only O(|V|) many times.
So there are only O(|V| · |E|) many augmenting paths used by this algorithm.
Each augmenting path takes only O(|E|) time to find with a BFS. And so the total runtime is at most O(|V| · |E|²).
Now, this is maybe not so great, because O(|V| · |E|²) might be the number of vertices to the fifth power, or the number of edges cubed. But it is polynomial, and it has no dependence on the edge capacities; it doesn't blow up when the actual size of our flow becomes very, very large.
Okay. So here's a quick problem to review the properties of this Edmonds-Karp algorithm. Which of the following are true about the Edmonds-Karp algorithm?
One, that no edge is saturated more than size of V many times. Two, the lengths of the augmenting
paths decrease as the algorithm progresses.
Or three, that changing the capacities of edges will not affect the final runtime.
Well, it turns out that only one of these is true.
Yes, edges only become resaturated after the distance between S and T increases, which only
happens V many times. However, the lengths of the augmenting paths increase as the algorithm
progresses, not decrease. And finally, although the runtime does not have an explicit dependence on
the edge capacities, like it did in the Ford-Fulkerson algorithm, they can still affect the runtime. If all
the capacities are zero, you don't need to do any augmenting paths. If the capacities are weird, they
might make you do a little bit more work than you'd have to do otherwise.
But the nice thing about Edmonds-Karp is that there's a bound to how bad it can be.
So in summary, if we choose augmenting paths based on length, it removes the bad dependence that we had on the numerical sizes of the capacities. We get a runtime bound that is independent of the total flow. Now, max flow is an incredibly well-studied algorithmic problem, and there are better, more complicated algorithms that we're just not going to get into in this course. The state of the art is a little better than what we had; it's O of the number of vertices times the number of edges.
And if you want to look it up, I mean, feel free to look up these more complicated algorithms. But this
is all that we're going to do in this course. The next two lectures, we're going to sort of talk about
some applications of these maxflow algorithms to a couple other problems where it's not quite
obvious that this is the right thing to do. So I'll see you next time.
Application
Bipartite Matching
Hello, everybody.
Welcome back to our network flows unit.
Today we're going to talk about an application of some of these network flow
algorithms we've been discussing, to a problem called bipartite matching.
So to get started on this,
suppose you're trying to coordinate housing in a college dormitory.
So what you have is, you've got n students and m rooms.
Each student is giving you a list of rooms that they consider to be acceptable, and
what you'd like to do is place as many students as possible
in an acceptable room.
Now, of course, there's a limitation here that you can't place more than one student
in the same room.
Okay, so this is the problem.
How do we organize this data? Well, you've got a bunch of students, you've got a bunch of rooms, and there are some pairs of a student and a room where the student is willing to live in that room.
And so a great way to organize this data pictorially is with a bipartite graph.
A bipartite graph is a graph G whose vertex set is partitioned into two subsets, U and V, students and rooms. There are two types of vertices, and all edges in the graph go between a vertex of U and a vertex of V; every edge connects a student to a room. And so if we just redraw that graph and call the two sides U and V instead of Students and Rooms, it's exactly a bipartite graph.
So what we'd like to do on this graph is find what is called a matching. We want to find a bunch of pairings of students with rooms, that is, a bunch of edges in the graph. But it needs to be the case that each student gets assigned only one room, and each room is assigned to only one student, which says that no two of the edges we pick can share an endpoint.
So in our example, the blue edges here give us a matching. We've got a bunch of students paired with rooms that they consider acceptable, each student is assigned at most one room, and each room is assigned to at most one student.
So the big problem we're going to try to solve is known as bipartite matching: given a bipartite graph G, find a matching of G that consists of as many edges as possible, ideally one that pairs up all of the vertices. So, just to be sure that we're on the same page: if I give you this bipartite graph, what's the number of edges in the largest possible matching?
Well, you have to play around with it for a bit. You can find that you can actually get a matching of size three here, and it takes a while, but you should be able to convince yourself that it's not actually possible to get a matching of size four or five.
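A standard way to compute a maximum bipartite matching is with augmenting paths, in the style of Kuhn's algorithm (equivalent to running max flow on the graph with unit capacities). A small sketch; the adjacency-list input format and the example instance are my own:

```python
def max_bipartite_matching(adj, n_right):
    """Maximum matching in a bipartite graph via augmenting paths.

    adj[u] lists the right-side vertices acceptable to left vertex u.
    Returns the number of matched pairs.
    """
    match_right = [None] * n_right  # match_right[v] = left vertex matched to v

    def try_augment(u, visited):
        for v in adj[u]:
            if v in visited:
                continue
            visited.add(v)
            # Take v if it's free, or if its current partner can move elsewhere.
            if match_right[v] is None or try_augment(match_right[v], visited):
                match_right[v] = u
                return True
        return False

    matched = 0
    for u in range(len(adj)):
        if try_augment(u, set()):
            matched += 1
    return matched

# 4 students and 3 rooms; the acceptability lists force competition for rooms 0 and 1.
print(max_bipartite_matching([[0], [0, 1], [1], [1, 2]], 3))  # 3
```

In the example, one of the four students necessarily goes unhoused, so the matching has size three, mirroring the situation in the lecture's picture.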
So, let's talk about applications. Bipartite matching actually has a bunch of applications. One example is matchmaking. Suppose you have a bunch of men and women, some pairs of them are attracted to each other, and you would like to pair them off into as many couples as possible, such that nobody is dating more than one person at the same time.
Now, we have to be a little bit careful here. If there are gay people, then this doesn't quite fit into the
context of bipartite matching, because there are men attracted to men or women attracted to
women. The graph is no longer bipartite. And there's nothing wrong with this necessarily, but it will
make the problem computationally more complicated.
Another example that you might want to consider is maybe a scheduling problem. You have sort of a
bunch of events that need to be scheduled at different times. Each event has some blocks of time that
would work for it and you need to make sure that no two events get the same time block. Once again,
sort of a bipartite matching problem.
Image Segmentation
Hello, everybody, welcome back to our Flows in Networks unit.
Today we're going to be talking about an interesting problem called image segmentation.
This is a problem in image processing,
and we'll actually show that there's some surprising connections
to this max-flow min-cut type of things that we've been talking about.
So the problem we're trying to solve is image segmentation.
Given an image, separate the foreground of the image from the background.
And we don't want to get too much into image processing, so
here's the basic setup.
The image is a grid of pixels.
We need to decide which pixels are in the foreground and
which are in the background.
And I don't know much about how you actually process images, but we're going
to assume that there's some other program that gives you some sort of idea about
which pixels are in the foreground and which are in the background.
So, in particular, there's some other algorithm which looks at each pixel and
makes a guess as to whether it's foreground or the background.
It assigns each pixel two numbers: av, which is the likelihood that it's in the foreground, and bv, which is the likelihood that it's in the background.
So in the simple version of this problem, the inputs are these values a and b, and the output should be a partition of the pixels into foreground and background, such that the sum over v in the foreground of av, plus the sum over v in the background of bv, is as large as possible.
So to be sure that we're on the same page, here's a really simple version. We've got three pixels and
we've got some a and b values. What's the best possible value that we can get out of this problem?
Well, it turns out that this problem is actually not that hard to solve in general. Basically, for any pixel, if you put it in the foreground you get av points, and if you put it in the background you get bv points. So if av is bigger than bv, the pixel goes in the foreground, and if bv is bigger than av, it goes in the background. So here, pixel 1 goes in the background and gives us 4, pixel 2 goes in the foreground and gives us 5, and pixel 3 goes in the foreground and gives us 6. And so the answer is 4 + 5 + 6 = 15.
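The penalty-free rule just described, where each pixel independently picks the larger of its two scores, is a one-liner. The concrete a and b values below are made up (the lecture only gives the answer, 15), chosen so pixel 1 prefers the background and pixels 2 and 3 prefer the foreground:

```python
def segment_simple(a, b):
    """Penalty-free segmentation: each pixel independently picks its larger score.

    Returns (total_score, set_of_foreground_pixel_indices).
    """
    foreground = {v for v in range(len(a)) if a[v] >= b[v]}
    total = sum(a[v] if v in foreground else b[v] for v in range(len(a)))
    return total, foreground

# Hypothetical scores consistent with the lecture's answer of 4 + 5 + 6 = 15.
print(segment_simple([2, 5, 6], [4, 1, 3]))  # (15, {1, 2})
```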
Now, this problem is maybe a little bit too easy. But let's take a little bit more information into
account. We sort of expect that nearby pixels should be on the same side of the foreground-
background divide. They're not going to be sort of randomly spattered throughout the picture, they
tend to be more or less connected regions.
So for each pair of pixels v and w, we're going to introduce a penalty pvw for putting v in the
foreground and putting w in the background.
So the full problem is the following. As input we take a, b, and p.
Again, we want a partition of our pixels into foreground and background. And now we want to
maximize the following. The sum of v in the foreground of av and the sum of v in the background of
bv, as before.
But now we subtract the sum over all pairs, where v is in the foreground and w is in the background,
of pvw.
And now we want this thing to be as large as possible.
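This full objective can at least be checked by brute force on tiny instances (the efficient solution via max-flow min-cut is where this lecture is headed). A sketch; the penalty dictionary in the example is hypothetical:

```python
from itertools import combinations

def segment_with_penalties(a, b, p):
    """Brute-force the full objective over all 2^n partitions (tiny inputs only).

    p[(v, w)] is the penalty for putting v in the foreground and w in the background.
    Returns (best_score, best_foreground_set).
    """
    n = len(a)
    best_score, best_fg = float('-inf'), None
    for k in range(n + 1):
        for fg in combinations(range(n), k):
            fg_set = set(fg)
            score = sum(a[v] for v in fg_set)
            score += sum(b[v] for v in range(n) if v not in fg_set)
            score -= sum(pen for (v, w), pen in p.items()
                         if v in fg_set and w not in fg_set)
            if score > best_score:
                best_score, best_fg = score, fg_set
    return best_score, best_fg

# Same made-up scores as before, plus a steep penalty for separating pixels 1 and 0.
print(segment_with_penalties([2, 5, 6], [4, 1, 3], {(1, 0): 10}))  # (13, {0, 1, 2})
```

Notice how the penalty changes the answer: without it, the best partition scored 15 with pixel 0 in the background, but the separation penalty now makes it better to pull pixel 0 into the foreground.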
QUIZ • 10 MIN
Flow Algorithms
TOTAL POINTS 5
1.Question 1
Which vertices are in the minimum S-T cut in the network below?
1 point
S
T
2.Question 2
What is the augmenting path that will be used by the Edmonds-Karp algorithm to increase the flow
given below?
1 point
S-B-A-C-D-T
S-B-T
S-B-D-C-T
S-A-C-T
S-B-A-C-T
3.Question 3
Which of the statements below is true?
1 point
The Ford-Fulkerson algorithm runs in polynomial time on graphs with unit edge capacities.
The sum of the capacities of the edges of a network equals the sum of the capacities of the edges
of any residual network.
4.Question 4
What is the size of the maximum matching of the following graph?
1 point
5.Question 5
Consider the image segmentation problem on a picture that is given by an n by n grid of pixels.
Suppose that separation penalties are imposed only for adjacent pairs of pixels. If we use the
Edmonds-Karp algorithm to solve this problem as described in class, the final runtime is O(n^a)
for some a. What is the best such a?
Programming Assignment: 1
Week 2
Advanced Algorithms and Complexity
Linear Programming
Linear programming is a very powerful algorithmic tool. Essentially, a linear programming problem asks
you to optimize a linear function of real variables constrained by some system of linear inequalities. This is
an extremely versatile framework that immediately generalizes flow problems, but can also be used to
discuss a wide variety of other problems from optimizing production procedures to finding the cheapest
way to attain a healthy diet. Surprisingly, this very general framework admits efficient algorithms. In this
unit, we will discuss some of the importance of linear programming problems along with some of the tools
used to solve them.
Key Concepts
Generate examples of problems that can be formulated as linear programs.
Interpret linear programming duality in the context of various linear programs.
Solve systems of linear equations.
Compute optimal solutions to linear programs.
Illustrate convex polytopes.
Reading: Slides and Resources on Linear Programming (10 min)
Introduction
Video: Introduction (5 min)
Video: Linear Programming (8 min)
Basic Tools
Video: Convexity (9 min)
Video: Duality (12 min)
Algorithms
Quiz: 5 questions
Programming Assignment (3h)
Reading
Chapter 7 in [DPV], Chapter 29 in [CLRS].
[DPV] Sanjoy Dasgupta, Christos H. Papadimitriou, and Umesh V. Vazirani. Algorithms. McGraw-
Hill, 2008.
[CLRS] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein.
Introduction to Algorithms (3. ed.). MIT Press, 2009.
Introduction
Hello, everybody, welcome back to our course on advanced algorithms and complexity. Today we're
starting a new unit, we're starting to talk about linear programming problems. And in particular today
we're going to give just a simple example of the sort of problem that we will be trying to solve during
this unit.
So imagine that you're running a widget factory and you'd like to optimize your production
procedures in order to save money.
Now these widgets, you can make them using some combination of machines and workers. You have only 100 machines in stock, so you can't use more than that, but you can hire an unlimited number of workers. However, each machine that you use requires two workers to operate it. Workers not operating a machine can build widgets on their own, but every machine in use takes two workers.
Now in addition to this, each machine that you use makes a total of 600 widgets a day. And each
worker that's not currently involved in operating a machine makes 200 widgets a day.
Finally, the total demand for widgets is only 100,000 widgets a day. So if you make any more than
this, they just won't sell, and that's no good for anybody.
So writing these constraints down in a reasonable way: if we let W be the number of workers that we have, and M the number of machines, we have a bunch of constraints. The number of workers should be non-negative, and the number of machines should be between 0 and 100. The number of workers needs to be at least twice the number of machines. And finally, 100,000 is at least the number of widgets produced: 200 times the number of unoccupied workers, which is W minus 2M, plus 600 times the number of machines. These constraints determine which combinations of machines and workers are allowable, and we can try to graph them. Here we've got a plane of possible values of M and W. If we just start with the requirement that M and W both be non-negative, we have this quadrant as the allowable values. When we require that M be at most 100, we're reduced to this strip here. When we add the constraint based on the total demand, we find that M + W is at most 500, and we're now constrained to this region. And when we add the final constraint that the number of workers be at least twice the number of machines, we finally come to this diagram of the possible configurations of machines and workers that we can use.
What's next? Profit. Well, suppose that profits are determined as follows: each widget that you make earns you $1, but each worker that you hire costs you $100 a day. The total profit, in dollars per day, is the number of widgets, 200 times (W minus 2M) plus 600 times M, minus the total salaries paid to the workers, 100 times the number of workers. That comes out to 100 times the number of workers plus 200 times the number of machines. And if we want to plot that on our graph, we can do it as follows. These lines that I've drawn are lines of equal profit: a line at $30,000 a day, then $40,000 a day, then $50,000 a day. And as you go from left to right, or from bottom to top, you make more profit.
So what we're trying to do now is we're trying to say well, what can we do to get the most profit? And
it turns out, the best you can do is at this point here. Note that it's a corner of the allowable region,
it's where we have 100 machines and 400 workers. And the total profit is $60,000 a day.
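The claim that the best point is a corner of the allowable region can be checked numerically: enumerate every intersection of two constraint boundary lines, keep the feasible ones, and take the one with the highest profit. A pure-Python, two-variable sketch of my own:

```python
from itertools import combinations

# Constraints in the form  a*M + b*W <= c,  with x = (M, W):
constraints = [
    (-1, 0, 0),          # M >= 0
    (1, 0, 100),         # M <= 100
    (0, -1, 0),          # W >= 0
    (2, -1, 0),          # W >= 2M
    (200, 200, 100000),  # 200M + 200W <= 100000  (demand)
]

def intersect(c1, c2):
    """Intersection of the boundary lines a*M + b*W = c, or None if parallel."""
    (a1, b1, r1), (a2, b2, r2) = c1, c2
    det = a1 * b2 - a2 * b1
    if det == 0:
        return None
    return ((r1 * b2 - r2 * b1) / det, (a1 * r2 - a2 * r1) / det)

def feasible(point, eps=1e-9):
    return all(a * point[0] + b * point[1] <= c + eps for a, b, c in constraints)

# The optimum of a linear objective lies at a vertex of the feasible polygon,
# so it suffices to check every feasible pairwise intersection.
vertices = [p for c1, c2 in combinations(constraints, 2)
            if (p := intersect(c1, c2)) is not None and feasible(p)]
best = max(vertices, key=lambda p: 200 * p[0] + 100 * p[1])
print(best, 200 * best[0] + 100 * best[1])  # (100.0, 400.0) 60000.0
```

This recovers the corner from the picture: 100 machines, 400 workers, $60,000 a day.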
Now it's clear from this diagram that this is the best that you can do. But if you actually want to prove
it, there's a clever way you can do that.
So two of the constraints that we have, one of them is that the number of machines is at most 100.
And another is that 200 times the number of machines plus 200 times the number of workers is at
most 100,000. Now if we take 100 times the first constraint and add it to a half times the second
constraint, what you find is that 200 times the number of machines plus 100 times the number of
workers has to be at most 60,000. And that says the profit that we make has to be at most 60,000.
And so this is a very convenient way to prove that the 60,000 that we could attain is actually the best
we can do. So in summary, what we did is we solved this problem where we maximized this function,
200M + 100W. Subject to the following list of five constraints.
And because the thing we're trying to maximize is a linear function, and the constraints we have are
linear inequalities, this makes this an example of the type of problem we're going to be looking at.
That is, a linear program. So come back next lecture and we'll sort of formally define this problem and
get us started on our investigation.
Linear Programming
Hello everybody, welcome back to our unit on linear programming. Today, what we're going to do is put everything on a more solid, rigorous basis. So remember, last time we had this factory problem, where what we wanted to do was maximize, in terms of M and W, 200M + 100W, a linear expression, subject to a list of linear inequalities that they had to satisfy. And in general, basically, this is what linear programming is. We want to find real numbers x1 through xn that satisfy a bunch of linear inequalities, a11x1 + a12x2 + ... >= b1, and then a bunch more of those. And subject to these constraints, we would like a linear objective function, v1x1 + v2x2 + etc., to be as large, or possibly as small, as possible.
To clean up the notation a bit, we're really going to store this by having a matrix A that encodes the
coefficients of all these inequalities along with vectors b and v.
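As a concrete sketch of this matrix form, here is the factory example encoded as A, b, and v, with every constraint rewritten as a row of A x >= b. The signs and row order are my own encoding choices:

```python
# Encode the factory problem in matrix form, with x = [M, W].
# Each row of A, paired with an entry of b, is one constraint  A[i] . x >= b[i].
A = [
    [1, 0],        #  M            >= 0
    [-1, 0],       # -M            >= -100      (i.e. M <= 100)
    [0, 1],        #  W            >= 0
    [-2, 1],       # -2M + W       >= 0         (W >= 2M)
    [-200, -200],  # -200M - 200W  >= -100000   (200M + 200W <= 100000)
]
b = [0, -100, 0, 0, -100000]
v = [200, 100]  # objective: maximize v . x = 200M + 100W

def satisfies(x):
    return all(sum(ai * xi for ai, xi in zip(row, x)) >= bi
               for row, bi in zip(A, b))

def objective(x):
    return sum(vi * xi for vi, xi in zip(v, x))

print(satisfies([100, 400]), objective([100, 400]))  # True 60000
```

The optimal configuration from the first lecture, 100 machines and 400 workers, satisfies every row and attains the objective value 60,000, while an over-equipped point like (150, 300) violates the M <= 100 row.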
Basic Tools
Convexity
Hello everybody. Welcome back to our linear programming unit. Today, we're going to talk about convex polytopes. In particular, we're going to try to understand what the solution set of the system of linear inequalities that we need to deal with actually looks like. So remember, in a linear program we're trying to optimize a linear function subject to a bunch of linear inequality constraints. Today, we're going to ask: what does the region of points defined by these inequalities actually look like? For example, in the factory example that we looked at way back at the beginning, if you look at the set of solutions to those five inequalities, you get this nice trapezoid. So the question is, what do things look like in general?
Well, as another example, if you look at the system where x, y, and z in three dimensions are all between zero and one, you get the unit cube. In general, you get much more complicated-looking regions, but you'll always get what's called a convex polytope. And don't worry, we'll unravel what this means as we go. So the first thing to know is what a single linear equation gives you. A linear equality defines a hyperplane, an infinite flat surface.
Now, if instead you have an inequality, it gives you what's called a halfspace: a hyperplane and everything on one side of that hyperplane.
So the solution set of a system of linear inequalities is defined by a bunch of halfspaces, and we want the intersection of all of these halfspaces. We want everything that's inside all of them; we want to satisfy all of the inequalities.
Okay, so the other Lemma is about polytopes. Suppose, that you have a polytope and there's a linear
function on this that you're trying to minimize or maximize.
The claim is that it takes its minimum or maximum values on vertices.
This is clearly relevant to our linear programs, because we're exactly trying to minimize and maximize linear functions on these convex polytopes. We saw this in the original factory example, where the maximum was at this vertex, and it turns out that this happens in general.
Now, to maybe get some intuition for why this is true: we've got our polytope, and this polytope is sort of spanned by its corners; it's got corners, and things in between those corners. But because we have a linear function, the things in between the corners are never better than the extreme points, and so the optimum must be at a corner.
Now, to actually prove this, the thing to note is that a linear function defined on a line segment always takes its extreme values at the two endpoints.
And we're going to use this to push our point toward the corners, letting the value get bigger and bigger.
So we start at any point in our polytope, and what we do is draw a line through it. And you'll note that the biggest value our linear function takes on this line comes where the line hits the boundary of the polytope. So it takes an extreme value at an endpoint of that line, which lies on a face of the polytope.
Now, once you're on the face or some facet, you can repeat this. You draw a line through that point
and what you know is that the extreme values will be at the end points of this line and that lets you
push it to a lower dimensional facet. And you keep doing this until you end up at a vertex.
And so, we start at any point and we keep going until we hit a vertex. And that vertex has at least as large a value as the point we started at.
And so, the maximum values must be attained at some vertex.
So in summary, the region defined by a linear program is always convex. The optimum of the linear program is always attained at a vertex. And finally, if you have a point that's not in the region, you can always separate it from the points on the inside by an appropriate hyperplane.
So these are some basic facts about linear programs and their solution sets. Come back next time,
we'll talk about another interesting property of linear programs called duality.
Duality
Hello everybody. Welcome back to our Linear Programming Unit. Today we're going to talk about an
interesting phenomenon in linear programs called duality.
So let's recall the first example that we looked at. We wanted to maximize 200M + 100W subject to this bunch of constraints. Now, it turns out we had a very clever way of proving that the optimum was correct once we found it. The best you could do was 60,000, and there was a great way of proving it: we took one constraint and multiplied it by a hundred, we took another constraint and multiplied it by a half, and we added those together. We got a new constraint which said that, if we satisfied our original constraints, then 200M + 100W, the thing we were trying to optimize, was at most 60,000. And this is a very interesting and general technique: if you want to bound your objective, you can try to do it by combining the constraints that you have to prove a bound.
So let's see what happens in general. You have a linear program: say you want to minimize v1x1 + v2x2 + ... + vnxn,
subject to a bunch of linear inequality constraints, a11x1 + a12x2 + ... >= b1, and so on. So how can we try to do this?
Well, if you give me any constants ci >= 0, you can take the first constraint and multiply it by c1, the second constraint and multiply it by c2, and so on, and add those all up. What you'll get is a new linear inequality, w1x1 + w2x2 + ... >= t. Here the wi are a combination of the c's: wi is the sum over j of cj·aji, and t is the sum over j of cj·bj. Now, if wi is equal to vi for all i, then v1x1 + v2x2 + ..., the thing we were trying to minimize, is at least t. And so, if we can arrange for wi to equal vi for all i, we've proven a lower bound on the thing we're trying to minimize. So we'd like to find ci's, all non-negative, such that vi is the sum over j = 1 to m of cj·aji for all i, and such that, subject to these constraints, t, the sum of cj·bj, is as large as possible; we'd like the biggest lower bound we can get. Now, the very interesting thing is that the system we just wrote down is itself just another linear program. We want to find c in R^m such that the sum of cj·bj is as large as possible, subject to a bunch of linear inequalities, ci >= 0, and a few linear equalities, vi equals the sum over j of cj·aji.
And so, to put this formally: given any linear program, which we often call the primal, say minimize v·x subject to Ax >= b, there is a dual linear program: maximize y·b subject to y^T A = v, which is just another way of writing our equality constraints, and y >= 0. And it should be noted that even if your linear program wasn't exactly in this form, you can still write a dual program; it's simply the linear program of trying to find a combination of the constraints that bounds the thing you're trying to optimize. And so it's not hard to show that any solution to the dual program bounds the optimum of the primal.
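A small sanity-checker makes this concrete: given non-negative multipliers y with y^T A = v, the value y·b is a valid lower bound on the primal minimum. Below, the factory proof is replayed in minimization form (signs flipped to match "minimize v·x subject to Ax >= b"); this encoding is my own:

```python
def dual_bound(A, b, v, y):
    """Check y is dual-feasible (y >= 0 and y^T A == v) and return the bound y.b.

    Any such y proves: min v.x subject to Ax >= b is at least y.b.
    """
    m, n = len(A), len(A[0])
    assert all(yi >= 0 for yi in y), "multipliers must be non-negative"
    for i in range(n):
        combo = sum(y[j] * A[j][i] for j in range(m))
        assert abs(combo - v[i]) < 1e-9, "y^T A must equal v"
    return sum(yj * bj for yj, bj in zip(y, b))

# Factory proof in min form: minimize -200M - 100W subject to
#   -M >= -100   and   -200M - 200W >= -100000.
# Multipliers (100, 0.5) are exactly the lecture's "100 times one constraint
# plus half the other", and give the bound -60000, i.e. profit at most 60000.
A = [[-1, 0], [-200, -200]]
b = [-100, -100000]
v = [-200, -100]
print(dual_bound(A, b, v, [100, 0.5]))  # -60000.0
```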
Algorithms
Linear Programming Formulations
Hello everybody, welcome back to our unit on linear programming. Today, we're going to talk about some different types of linear programming problems. They're all related, but not quite the same.
So the point is that there are actually several different types of problems that go under the heading of linear programming. The one that we've been talking about so far we might call the full optimization version: minimize or maximize a linear function subject to a system of linear inequality constraints, or report that the constraints have no solution if they don't.
Now, it turns out there are a number of related problems, dealing with systems of linear inequalities, that you might want to solve, and that are maybe a little bit easier. These will actually be important: when we start coming up with algorithms in the next couple of lectures, those algorithms will actually be solving these other formulations.
So the first one is optimization from a starting point. Given a system of linear inequalities and a vertex of the polytope they define, optimize a linear function with respect to these constraints; so you're given a place to start.
Another version is one we call solution finding. Given a system of linear inequalities, with no objective whatsoever, find some solution to the system, assuming one exists. This is also somewhat easier. And finally, we have satisfiability: given a system of linear inequalities, determine whether or not there is a solution.
So it turns out that these problems are actually equivalent. If you can solve any one of these, you can
solve the others. Which is very convenient, because the algorithms we'll be looking at will each only
solve one of these versions.
First off, it's clear that the full optimization version of the problem is the strongest; using it, you can solve anything else you want. If you have optimization from a starting point, you can just ignore the starting point and solve the problem from scratch. If you're trying to find a solution, well, the optimal solution is a solution. And if you merely want to know whether there is a solution, you run the full optimization: if it outputs a solution, you have a solution, great.
But the nice thing is that you can go the other direction. If you can only solve optimization from a
starting point, you can actually do the full optimization.
And the problem here is: how do you find the starting point? If you had the starting point, you could run the algorithm and you'd be done. But somehow you need a solution to this system, and there's actually a clever way to do this: you add inequalities one at a time. So, say we have a solution to the first seven of our inequalities, and we now need to turn it into a solution to the first eight. Maybe our current solution doesn't satisfy the eighth inequality. What we can do is optimize: not only do we want to satisfy these seven, we want to make this eighth inequality as true as possible. And that will give us a solution that satisfies all of them.
So, to see how this works, let's look at an example. We start with this rectangle and a nice corner of it. We now want to add an inequality that chops our rectangle at this line. So what we're going to do is say that we want our point to be as far below this line as possible; that's just a linear optimization question. So we can solve it using our optimization-from-a-starting-point algorithm. We get that vertex, and what do you know? It's a solution to the bigger system.
Next, we want to add this thing as an inequality. So again, we solve our optimization from a starting point, we find a solution, and we can now add the additional inequality. We add another one, find a solution, and then finally we've got all of our inequalities in the mix. We now get to do our full optimization, and we can do that.
So this is basically how you do it: you add one inequality at a time. There is a technical point to keep in mind: things are a bit messier if some of the intermediate systems that you're trying to solve don't have optima. They might be unbounded systems, where things can get as large as you like.
Now, to fix this, I mean, it's not hard, you just need to play around a bit. First, you want to start with n constraints; that means you actually have a single vertex to start your system from. And then, when you are trying to add the constraint v.x at least t, you don't just maximize v.x; that would be a problem, because it might be unbounded. So what you'll do is add the additional constraint that v.x is at most t. This guarantees that v.x will actually have a maximum, at most t, and once you find a point attaining t, that'll be good.
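This incremental procedure can be sketched as follows (illustrative only: SciPy's `linprog` does not accept a starting point, so here it simply stands in for the optimization-from-a-starting-point black box, and the constraint data is made up). We fold in each constraint a.x >= s by maximizing a.x, capped at s so the intermediate LP is never unbounded:

```python
import numpy as np
from scipy.optimize import linprog

def maximize_over(c, A_ge, b_ge):
    """Maximize c . x subject to A_ge x >= b_ge, with x free."""
    return linprog(c=-c, A_ub=-A_ge, b_ub=-b_ge,
                   bounds=[(None, None)] * A_ge.shape[1])

# Full system, one row per constraint a_i . x >= t_i:
# x >= 0, y >= 0, x + y <= 10 (written as -x - y >= -10), x >= 2.
A = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0], [1.0, 0.0]])
t = np.array([0.0, 0.0, -10.0, 2.0])

# Start with the first n = 2 constraints (their intersection is the single
# vertex (0, 0)), then fold in the remaining constraints one at a time.
x = np.zeros(2)
for k in range(2, len(A)):
    a, s = A[k], t[k]
    A_cur = np.vstack([A[:k], -a])        # add a . x <= s as -a . x >= -s
    t_cur = np.append(t[:k], -s)
    x = maximize_over(a, A_cur, t_cur).x  # now satisfies constraints 0..k

# Finally, the full optimization: maximize x + y over all four constraints.
final = maximize_over(np.array([1.0, 1.0]), A, t)
print(final.x, -final.fun)                # optimum value is 10.0
```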
Okay, so that was that. Let's talk about solution finding: how do we go from being able to find a solution to finding the best one? We somehow need to guarantee that the solution we found is the best solution. But there's actually a good way to do that: we can use duality. The point is that duality gives you a good way to verify that your solution is optimal, by solving the dual program and providing a matching bound. So what you want to do is find both a solution to the original program and a matching dual solution.
Although many of the algorithms you've learned so far are applied in practice a lot, it turns out that the world is dominated by real-world problems without a known provably efficient algorithm. Many of these problems can be reduced to one of the classical problems called NP-complete problems, for which either no polynomial algorithm exists, or solving any one of them in polynomial time would win you a million dollars (see Millennium Prize Problems) and eternal worldwide fame for resolving the main open problem of computer science, called P vs NP. It's good to know this before trying to solve a problem before tomorrow's deadline :) Although these problems are very unlikely to be solvable efficiently in the near future, people always come up with various workarounds. In this module you will study the classical NP-complete problems and the reductions between them. You will also practice solving large instances of some of these problems despite their hardness, using very efficient specialized software based on tons of research in the area of NP-complete problems.
Key Concepts
Give examples of NP-complete problems
Interpret the famous P versus NP open problem
Develop a program for assigning frequencies to the cells of a GSM network
Develop a program for determining whether there is a way to allocate advertising budget given a
set of constraints
Slides and Resources on NP-complete Problems
Slides
17_np_complete_problems_1_search_problems.pdf PDF File
17_np_complete_problems_2_reductions.pdf PDF File
Reading
Chapter 8 in [DPV], Chapter 8 in [KT], Chapter 34 in [CLRS].
[DPV] Sanjoy Dasgupta, Christos H. Papadimitriou, and Umesh V. Vazirani. Algorithms. McGraw-
Hill, 2008.
[KT] Jon M. Kleinberg and Eva Tardos. Algorithm design. Addison-Wesley, 2006.
[CLRS] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein.
Introduction to Algorithms (3. ed.). MIT Press, 2009.
Sudoku Solver
sudokusolver.py
Brute Force Search
Hello and welcome to the next module of the Advanced Algorithms and Complexity class. In this module, we are going to meet problems that are computationally very hard. In all previous modules, we considered many efficient algorithms for various combinatorial problems. By saying efficient, we usually mean a polynomial algorithm, and this is why. Consider the four algorithms shown here on the slide. The running time of the first algorithm is just n, where as usual we denote by n the size of the input. The running time of the second algorithm is n squared, so it is a quadratic-time algorithm; the third one is a cubic-time algorithm; and the last one has running time 2 to the n. So here we have three polynomial-time algorithms, and the last one is exponential-time. The second row in the table shows the maximum value of n for which the total number of steps performed by the corresponding algorithm stays below 10 to the 9. Why 10 to the 9? Well, just because this is roughly the number of operations performed by modern computers in one second. So we're interested in the maximum value of n for which the running time of the corresponding algorithm stays below one second. It is not difficult to compute these values. For the first algorithm, this is 10 to the 9, of course. For the second algorithm, this is 10 to the 4.5, and for the third one it is 10 to the 3. So polynomial-time algorithms are able to handle instances of size in the thousands, or even millions, while for the exponential-time algorithm the maximum value of n for which it performs fewer than 10 to the 9 operations is roughly 30. So it allows us to process only very small instances. Recall that any exponential function grows faster than any polynomial function. For this reason, exponential-time algorithms are usually considered impractical.
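The row of the table described above can be recomputed directly. This short sketch finds, for each running time, the largest n with at most 10^9 steps:

```python
def max_n(steps, limit=10**9):
    """Largest n with steps(n) <= limit, via exponential + binary search."""
    hi = 1
    while steps(hi) <= limit:   # find an infeasible upper bound
        hi *= 2
    lo = 0
    while lo < hi:              # binary search for the largest feasible n
        mid = (lo + hi + 1) // 2
        if steps(mid) <= limit:
            lo = mid
        else:
            hi = mid - 1
    return lo

print(max_n(lambda n: n))       # 1000000000 -- linear time
print(max_n(lambda n: n ** 2))  # 31622      -- about 10^4.5
print(max_n(lambda n: n ** 3))  # 1000       -- 10^3
print(max_n(lambda n: 2 ** n))  # 29         -- roughly 30 for 2^n
```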
Note, however, the following. Usually, for many computational problems, the corresponding set of all candidate solutions is exponential. Let me illustrate this with a few examples. Assume that we are given n objects and our goal is to find an optimal, in some sense, permutation of these objects. A naive way to do this would be to go through all possible such permutations and to select an optimal one. The running time of the corresponding algorithm, however, is going to be at least n factorial, because there are n factorial different permutations of n given objects. And n factorial grows even faster than an exponential function such as 2 to the n, which means that the corresponding algorithm is going to be extremely slow. Another example is the following. Assume that we're given n objects and we need to split them into two sets. For example, we need to partition the set of vertices of a graph into two sets to find a cut. Then again, a naive way to do this would be to go through all possible partitions into two sets and to select an optimal one. However, there are 2 to the n ways to split n given objects into two sets, and if we do this, the running time of the algorithm is going to be at least 2 to the n, and we know that this is very slow: it only allows us to handle instances of size roughly 30 in less than one second. As another example, assume we need to find a minimum spanning tree in a complete graph, that is, in a graph where we have an edge between every pair of vertices. A naive way to do this would be to go through all possible spanning trees and to select one of minimum weight. However, the total number of spanning trees in a complete graph on n vertices is n to the n-2. Again, this grows even faster than 2 to the n, and this makes the corresponding algorithm completely impractical. So, once again, in many cases a polynomial algorithm is called efficient in particular because it avoids going through the set of all possible candidate solutions, which usually has exponential size.
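To get a feel for these counts, here is a quick computation at n = 20 (the spanning-tree count is Cayley's formula: the complete graph on n vertices has n^(n-2) spanning trees):

```python
import math

n = 20
permutations   = math.factorial(n)  # n!      = 2432902008176640000
partitions     = 2 ** n             # 2^n     = 1048576
spanning_trees = n ** (n - 2)       # n^(n-2) = 262144 * 10^18 (Cayley's formula)

# Even at n = 20, all three search spaces dwarf the ~10^9 operations-per-second
# budget, and n! and n^(n-2) grow much faster than 2^n.
print(partitions < permutations < spanning_trees)   # True
```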
In the rest of this module, we will learn that there are many computational problems that arise frequently in practice for which we do not know an efficient, that is, polynomial-time, algorithm. For such problems, roughly the best we can do is to go naively through all possible candidate solutions and to select the best one. It will also turn out, surprisingly, that all these seemingly different problems, well, millions of problems, are related to each other. Namely, if you design an efficient algorithm, a polynomial-time algorithm, for at least one of them, this will automatically give you a polynomial-time algorithm for all of these problems.
At the same time, constructing such an algorithm turns out to be an extremely difficult task. In
particular, there is a one million dollar prize for constructing such an algorithm or proving that there is
no such algorithm.
Search Problems
We will now give a formal definition of a Search Problem. And we will do this by considering the
famous boolean satisfiability problem.
The input for this problem is a formula in conjunctive normal form, which is usually abbreviated just
as CNF.
So a formula in conjunctive normal form is just a set of clauses. In this example, we have five clauses: this is the first one, this is the second one, then the third one, the fourth one, and the last one.
Each clause is a logical or, that is, a disjunction, of a few literals. For example, the first one is a disjunction of the literals x, y, and z. The second one is the disjunction of x and the negation of y. The third one is a disjunction of y and not z, that is, the negation of z, and so on. So x, y, and z are Boolean variables: variables that take Boolean values. The Boolean values are true and false, and we will usually use 1 instead of true and 0 instead of false. So what this formula tells us is the following. The first clause constrains the values of x, y, and z so that either x = 1, or y = 1, or z = 1, right? This is just x, or y, or z. The second clause tells us that either x must be true, or the negation of y must be true; that is, either x = 1 or y = 0, and so on. For example, the last clause tells us that either x = 0, or y = 0, or z = 0.
Then, the Boolean Satisfiability problem, or just Satisfiability problem, which is also abbreviated as
SAT usually, is stated as follows. Given a formula in conjunctive normal form, we would like to check
whether it is satisfiable or not. That is, whether it is possible to assign Boolean values to all variables
so that all clauses are satisfied. If it is possible, we need to output a satisfying assignment. If it is not
possible, we need to report that no such assignment exists.
Now we give a few examples.
In the first example, we're given a formula over two variables, x and y. It contains three clauses, and it
is satisfiable. To satisfy it, we can assign the value 1 to x and the value 0 to y.
Let's check that it indeed satisfies all three clauses. Well, in the first clause, x is satisfied. In the second clause, not y is satisfied. And in the last clause, x is satisfied. In the second example, we illustrate that a formula may have more than one satisfying assignment. For example, for this formula there is a satisfying assignment which assigns the value 1 to x, y, and z, and there is another satisfying assignment which is shown here.
Okay, for the last formula, the last formula is unsatisfiable. And probably the most straightforward
way to check this is just to list all possible truth assignments to x, y and z.
So there are eight such assignments. Let me list them all.
Then, for each of these assignments, we need to check that it falsifies at least one clause. For example, the first one falsifies the first clause: when x, y, and z are all 0, the first clause is falsified, right? The second one falsifies the clause y or not z. The third one falsifies the clause x or not y, and so on. So it can be checked that each of these eight assignments falsifies at least one clause.
So another way of showing that this formula is unsatisfiable is the following. Let's first try to assign the value 0 to x. Then let's take a look at the clause x or not y. x is already assigned 0, so the only way to satisfy this clause is to assign the value 0 to y. So setting x to 0 forces us to set y to 0 as well. Now, look at the clause y or not z: it forces us to set z to 0, too. But then we see that the first clause is already falsified, which tells us that our initial move, assigning 0 to x, was a wrong move: we need to assign the value 1 to x. Let's try to do this. If x = 1, then in the clause containing not x and z, not x is already falsified, so we need to assign the value 1 to z. Then, in the clause y or not z, not z is already falsified, so we need to assign the value 1 to y. But then the last clause is falsified. So no matter how we assign x, it forces some other assignments, and in the end we falsify some clause, which proves that this formula is unsatisfiable. SAT is a canonical hard problem.
SAT has applications in various branches of computer science, in particular because many hard combinatorial problems reduce very easily to the satisfiability problem. I mean, many hard combinatorial problems can be stated very naturally in terms of SAT, and once a problem is stated in terms of SAT, we can use a so-called SAT solver, which is a program that solves the satisfiability problem. There are many such programs, and there is even a competition of SAT solvers.
The last hard problem that we mention here deals with graphs again. It is called the independent set problem. Here, we're given a graph and a budget b, and our goal is to select at least b vertices such that there is no edge between any pair of selected vertices. For example, in this graph, which we've seen before in this lecture, there is an independent set of size seven. The selected vertices are shown here in red, and it is not difficult to check that there is no edge between any pair of red vertices. This checking procedure implies that independent set is indeed a search problem: it is easy to check whether a given set of vertices is an independent set and has size at least b.
It is interesting to note that the problem can be easily solved if the given graph is a tree. Namely, it can be solved by the following simple greedy strategy. Given a tree, if we want to find an independent set of maximum size, we can do the following. First, take all the leaves into the solution. Then, remove all the leaves from the tree together with all their parents. And then just continue this process. To prove that this algorithm produces an optimal solution, we need to show that taking all the leaves into our solution is a safe move, that is, that it is consistent with an optimal solution.
This is usually done as follows. Assume that there is some optimal solution in which not all the leaves are taken.
Just for concreteness, assume that this is such an optimal solution. Not all the leaves are taken here, because we have this leaf, this leaf, and this leaf, which are not in the solution. Then we show that it can be transformed, without decreasing its size, into another solution that contains all the leaves. Indeed, let's take all these marked leaves into the solution. This may require us to discard all their parents from the solution, but it will not decrease the size of the solution. So what we get is another solution whose size is, in this case, actually the same, but which contains all the leaves. This proves that there always exists an optimal solution containing all the leaves, and this in turn means that it is safe to take all the leaves.
We will see the details of this algorithm later in this class. But in general, once again, if we are given a tree, then we can find an independent set of maximum size in this tree very efficiently, in linear time. At the moment, however, we have no polynomial algorithm that finds, or even checks whether there exists, an independent set of size b in a general graph.
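The leaves-and-parents greedy described above can be sketched as follows (a minimal implementation for a tree or forest given as an adjacency dictionary; the example path is made up):

```python
def max_independent_set_tree(adj):
    """Greedy maximum independent set on a tree (or forest).
    adj maps each vertex to the set of its neighbours."""
    adj = {u: set(vs) for u, vs in adj.items()}  # work on a copy
    chosen = set()
    while adj:
        # take every leaf (degree <= 1, so isolated vertices count too)
        leaves = [u for u, vs in adj.items() if len(vs) <= 1]
        parents = set()
        for u in leaves:
            if u in parents:        # two adjacent leaves: take only one of them
                continue
            chosen.add(u)
            parents |= adj[u]
        # remove the taken leaves together with their parents
        for u in set(leaves) | parents:
            for v in adj.pop(u):
                if v in adj:
                    adj[v].discard(u)
    return chosen

# Example: a path on seven vertices 1 - 2 - 3 - 4 - 5 - 6 - 7.
path = {i: {j for j in (i - 1, i + 1) if 1 <= j <= 7} for i in range(1, 8)}
print(sorted(max_independent_set_tree(path)))   # [1, 3, 5, 7] -- size 4, optimal
```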
P and NP
Now that we have a formal definition of a search problem, and we've seen a few examples of search problems, we are ready to state the most important open problem in computer science: the problem about the classes P and NP. So recall once again that a search problem is defined by an algorithm C that takes an instance I and a candidate solution S, and checks in time polynomial in the size of I whether S is indeed a solution for I.
In other words, we say that S is a solution for I if and only if the corresponding algorithm C(I, S) returns true;
then the class NP is defined just as the class of all search problems.
The name of this class stands for non-deterministic polynomial time. This essentially means that we
can guess a solution and then check its correctness in polynomial time. That is a solution for a search
problem can be verified in polynomial time.
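For a concrete instance of such a checking algorithm C, here is a minimal verifier for the independent set problem from the previous lecture (the example graph is made up):

```python
def check_independent_set(graph, b, candidate):
    """The checker C(I, S): instance I = (graph, budget b), candidate S.
    Runs in time polynomial in the size of the instance."""
    if len(candidate) < b:
        return False
    # no edge may connect two selected vertices
    return all(v not in graph[u] for u in candidate for v in candidate)

# A path A - B - C - D:
graph = {'A': {'B'}, 'B': {'A', 'C'}, 'C': {'B', 'D'}, 'D': {'C'}}
print(check_independent_set(graph, 2, {'A', 'C'}))  # True
print(check_independent_set(graph, 2, {'A', 'B'}))  # False: A and B are adjacent
```

Note that C only *verifies* a given candidate; nothing here says how to *find* one efficiently, which is exactly the gap between NP and P.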
The class P, on the other hand, contains all search problems that can be solved in polynomial time, that is, all search problems for which we can find a solution in polynomial time. So, to summarize: the class P contains all search problems whose solution can be found efficiently. This class contains, in particular, the minimum spanning tree problem, the shortest path problem, the linear programming problem, and the independent set on trees problem.
The class NP contains all the problems whose solution can be verified efficiently: that is, given an instance and a candidate solution for this instance, we can check in time polynomial in the size of the instance whether it is indeed a solution. This class contains such problems as the SAT problem, the longest path problem, and the independent set problem on general graphs.
The main open problem in computer science asks whether these two classes are equal, namely whether the class P is equal to the class NP. This is also known as the P versus NP question. The problem is open: we do not know whether these two classes are equal, and the problem turns out to be very difficult. It is a so-called Millennium Prize Problem; there is a $1 million prize from the Clay Mathematics Institute for resolving it. Note that if P is equal to NP, then all search problems, I mean all the problems for which we can efficiently verify a solution, can be solved in polynomial time.
In other words, for all the problems for which we can efficiently verify a solution, we can also
efficiently find a solution.
On the other hand, if P is not equal to NP, then there are search problems for which there are no efficient algorithms: problems for which we can quickly check whether a given candidate solution is indeed a solution, but for which there is no polynomial-time algorithm that finds such a solution. At this point, we do not know whether P equals NP or not, that is, whether there are such problems with no polynomial-time algorithms.
In the next part, we will show that the problems we mentioned in this lecture, namely the SAT problem, the longest path problem, the traveling salesman problem, and the integer linear programming problem, are in some sense the most difficult search problems in the class NP.
Reductions
Hello and welcome to the next module of the Advanced Algorithms and Complexity class. This module is devoted to reductions. Reductions allow us to say that one search problem is at least as hard as another search problem. Intuitively, the fact that a search problem A reduces to a search problem B just means that we can use an efficient, namely polynomial-time, algorithm for the problem B to solve the problem A, also in polynomial time. And we can use it just as a black box.
Pictorially, the fact that the search problem A reduces to the search problem B means that we have the following pipeline. Assume that we have an instance I of the problem A. We are going to design an algorithm that solves the instance I using a polynomial-time algorithm for B as a black box; for this reason, it is shown here as a black box. Okay, the first thing we need to do is transform the instance I of the problem A into an instance of the problem B. We do this by calling an algorithm f: we feed the instance I of the problem A into the algorithm f, and it gives us the instance f(I) of the problem B. We then use the algorithm for B as a black box to solve it efficiently, and it gives us one of two outputs. Either there is no solution for the instance f(I); in this case, we report that there is also no solution for the instance I of the problem A.
Otherwise, it gives us a solution S for the instance f(I). In this case, we need to transform it back into a solution of I. We do this by using a second algorithm, h, which transforms the solution S of f(I) into a solution h(S) of the initial instance I.
We can now state this formally. Given two search problems A and B, we say that A reduces to B, and write A → B, if there is a pair of polynomial-time algorithms f and h. The algorithm f transforms any instance I of A into an instance f(I) of the problem B, such that the following holds. If there is no solution for the instance f(I) of the problem B, then there is no solution for the instance I of the problem A. Otherwise, if there is a solution S for the instance f(I), then by applying the algorithm h to this solution S, we get a solution h(S) of the initial instance I.
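To make this definition concrete, here is a classic textbook reduction (not one from this lecture): independent set reduces to vertex cover, because S is an independent set in G exactly when its complement V \ S touches every edge. The algorithms f and h are then a couple of lines each:

```python
def f(instance):
    """Map an independent-set instance (graph, b) to a
    vertex-cover instance (graph, n - b)."""
    graph, b = instance
    return graph, len(graph) - b

def h(graph, cover):
    """Map a vertex cover of f(I) back to an independent set of I."""
    return set(graph) - set(cover)

# A path A - B - C - D: {B, D} is a vertex cover of size 2,
# so its complement {A, C} is an independent set of size 4 - 2 = 2.
graph = {'A': {'B'}, 'B': {'A', 'C'}, 'C': {'B', 'D'}, 'D': {'C'}}
g, cover_budget = f((graph, 2))
independent = h(graph, {'B', 'D'})
print(cover_budget, sorted(independent))   # 2 ['A', 'C']
```

Both f and h clearly run in polynomial time, which is exactly what the definition requires.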
Now that we have a notion of reduction, we can imagine a huge graph containing all search problems. This graph corresponds to the class NP of all search problems. In this graph, there is a vertex for each search problem, and we put a directed edge from the search problem A to the search problem B if A reduces to B, okay?
Then by definition, we say that this search problem is NP-complete if all other search problems
reduce to it.
Pictorially, it looks as follows. So the red vertex here corresponds to an NP-complete search problem.
So in some sense, this problem attracts all other search problems, all other search problems reduce to
it.
Put otherwise, a polynomial-time algorithm for an NP-complete problem can be used as a black box to solve all other search problems, also in polynomial time.