Coursera Advanced Algorithms and Complexity

The document describes a course on advanced algorithms and complexity. The first unit discusses network flow algorithms. The first lecture provides an example of an evacuation problem to illustrate the types of problems addressed. It presents a toy road network with different road capacities and asks how many cars can be evacuated per hour. The optimal solution is 8000 cars per hour by sending flows along different roads respecting the capacity limits. The next lecture will provide a formal framework to define network flow problems and discuss real-world examples.

Uploaded by

yousef shaban
Copyright
© All Rights Reserved

Shokoufeh Mirzaei YouTube channel (linear programming)

Ryan O'Donnell YouTube channel


Arabic Competitive Programming
Syllabus - What you will learn from this course
Content rating: 87% (2,586 ratings)
WEEK 1
5 hours to complete
Flows in Networks

Network flows show up in many real-world situations in which a good needs to be transported across
a network with limited capacity. You can see this when shipping goods across highways and routing
packets across the internet. In this unit, we will discuss the mathematical underpinnings of network
flows and some important flow algorithms. We will also give some surprising examples of seemingly
unrelated problems that can be solved with our knowledge of network flows.

9 videos (Total 72 min), 3 readings, 2 quizzes


WEEK 2
5 hours to complete
Linear Programming

Linear programming is a very powerful algorithmic tool. Essentially, a linear programming problem
asks you to optimize a linear function of real variables constrained by some system of linear
inequalities. This is an extremely versatile framework that immediately generalizes flow problems,
but can also be used to discuss a wide variety of other problems, from optimizing production
procedures to finding the cheapest way to attain a healthy diet. Surprisingly, this very general
framework admits efficient algorithms. In this unit, we will discuss some important linear
programming problems along with some of the tools used to solve them.

10 videos (Total 84 min), 1 reading, 2 quizzes


WEEK 3
5 hours to complete
NP-complete Problems

Although many of the algorithms you've learned so far are applied in practice a lot, it turns out that
the world is dominated by real-world problems without a known provably efficient algorithm. Many of
these problems can be reduced to one of the classical problems called NP-complete problems. Either
these problems cannot be solved by a polynomial-time algorithm, or solving any one of them would
win you a million dollars (see the Millennium Prize Problems) and eternal worldwide fame for solving
the central problem of computer science, called P vs NP. It's good to know this before trying to solve
a problem due tomorrow :) Although these problems are very unlikely to be solvable efficiently in the
near future, people always come up with various workarounds. In this module you will study the
classical NP-complete problems and the reductions between them. You will also practice solving
large instances of some of these problems despite their hardness, using very efficient specialized
software based on tons of research in the area of NP-complete problems.

16 videos (Total 115 min), 2 readings, 2 quizzes


WEEK 4
5 hours to complete
Coping with NP-completeness

After the previous module you might be sad: you've just gone through five courses in algorithms only
to learn that they are not suitable for most real-world problems. However, don't give up yet! People
are creative, and they need to solve these problems anyway, so in practice there are often ways to
cope with an NP-complete problem at hand. We first show that some special cases of NP-complete
problems can, in fact, be solved in polynomial time. We then consider exact algorithms that find a
solution much faster than the brute force algorithm. We conclude with approximation algorithms that
work in polynomial time and find a solution that is close to being optimal.

11 videos (Total 119 min), 1 reading, 2 quizzes



WEEK 5
5 hours to complete

Streaming Algorithms (Optional)

In most previous lectures we were interested in designing algorithms with fast (e.g. small polynomial)
runtime, and assumed that the algorithm has random access to its input, which is loaded into
memory. In many modern applications in big data analysis, however, the input is so large that it
cannot be stored in memory. Instead, the input is presented as a stream of updates, which the
algorithm scans while maintaining a small summary of the stream seen so far. This is precisely the
setting of the streaming model of computation, which we study in this lecture. The streaming model
is well-suited for designing and reasoning about small space algorithms. It has received a lot of
attention in the literature, and several powerful algorithmic primitives for computing basic stream
statistics in this model have been designed, several of them impacting the practice of big data
analysis. In this lecture we will see one such algorithm (CountSketch), a small space algorithm for
finding the top k most frequent items in a data stream.

10 videos (Total 72 min)

Flows in Networks
Introduction
Hello everybody, welcome to our course on Advanced Algorithms and Complexity. As the first unit in
this course, we're going to talk about network flow algorithms and, in particular, in this very first
lecture, we're just going to give an example of a problem to give you a feel for what types of things
we're going to be talking about in this unit.
So to set things up, suppose that you're a disaster relief manager. Among other things, you have this
city and you want to know how quickly it could be evacuated in the case of an emergency. Well, to do
this you have to look at the roads leading out of the city, and so you see that there's the main
highway out that will handle 5000 cars an hour. Of course, this isn't the only road leading out of the
city. There are some secondary highways that can each handle 2000 cars an hour.
Of course, things are a little bit more complicated than that. These other roads each bifurcate
into two halves, and each of these halves can handle 1000 cars an hour. So you're maybe okay so far.
But it turns out that two of those halves merge together a little ways down into just a single road
that can only handle 1000 cars an hour. And you can imagine that in real life there are many, many,
many more roads than this in the full road network, but this is a toy example. And we'd like to know,
given that this is the network, how quickly can we evacuate? Well, it's not hard to start playing
around with this. We can take 5000 cars an hour and send them out along the main road. We can send
another thousand cars an hour along this northern path here. Another thousand cars an hour can go
up along the northern road and then split off and join in on the merged road, and finally, another
thousand cars an hour can go off on the third highway. Now, putting this all together, we have a total
of 8000 cars an hour that we can evacuate, but we'd like to know: is this the best that we can do, or
can we do better?
Well, if you play around with this a little bit, there's no obvious way to make an improvement, and
you might suspect that this is the best you can do, and in fact, you'd be correct. One way to show
this: suppose that there was a river where this blue line is on the diagram. You'll note that there are
only four bridges that cross that river, and the total capacity of all the bridges is only 8000 cars
an hour.
So if only 8000 cars an hour can cross this river and you need to cross the river to get out of the city,
only 8000 cars an hour can evacuate the city. And that proves that this plan that we have for
evacuation is really the best you can do. It's bottlenecked at the river, you can't do any faster. So
network flow problems are problems that will allow us to study things like this problem. And this is
what we're going to be talking about in this unit. And next lecture, what we're going to do is we're
going to take a little bit of a more careful look at this problem. We're going to come up with a formal
framework to discuss this kind of issue, and then we're going to discuss some examples of where
these sorts of problems might show up in real life. So that's what you have to look forward to in the
next lecture. I hope to see you then.
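The river argument above is plain arithmetic: every evacuating car has to cross one of the four bridges, so their combined capacity caps any evacuation plan. A tiny sketch, using the capacity numbers from the toy example in this lecture:

```python
# Capacities of the four bridges crossing the river, in cars per hour,
# taken from the toy road network in the lecture.
bridge_capacities = [5000, 1000, 1000, 1000]

# Every evacuating car must cross the river, so no plan can move more
# cars per hour than the bridges' total capacity.
upper_bound = sum(bridge_capacities)
print(upper_bound)  # 8000
```

Since the plan found above also achieves 8000 cars an hour, it must be optimal.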
Network Flows

Hello everybody. 
Welcome back to our network flows unit. 
Today we're going to be talking a little bit about sort of formal definitions and 
sort of getting a concrete definition of our problem and 
then some examples of sort of what sorts of problems fall into this category. 
So, remember last time we discussed the disaster management problem. 
Today what we're going to do is we're going to have a sort of formal framework 
for talking about this problem and some similar problems.

So to begin with, we're going to want to define what a network is. 


A network should be thought of like this network of roads that we saw in this 
previous example.
So a network is a directed graph G representing all the roads, but 
with some additional information.
Each edge E is assigned a positive real number called its capacity. 
This is how much traffic can be handled by that road.
Additionally, one or more vertices are labeled as sources. This is the city, 
this is the place where the traffic is coming from.
And then one or more vertices are labelled as sinks, 
which are sort of the edges of the graph where everything is going to. 
So flow goes from sources to sinks along these edges, each of which has a capacity.
So with the example from last time, 
we have this network of roads, and we can turn it into a graph where the city is a node, 
each of the intersections gives us another vertex, and 
then we have some extra vertices at the ends where the cars escape to. 
The city itself is the source of our flow. 
The four exits give us sinks, labeled T here.
And each of the edges has a capacity which says how many cars 
an hour can drive along that.
Fair enough. The next thing we want to be able to discuss is flows. We want to be able to talk about
flows of traffic through this network, and talk about what's a valid flow and what isn't. Before we
get into any more detail on that: in the last example we actually talked about exactly which routes
different cars take. You know, a thousand cars traveled this route and another thousand traveled
that route and so on and so forth. But this is a little bit more complicated than we want. Rather
than talking about where each individual car goes, we're instead just going to concern ourselves
with the total number of cars, the total amount of flow along each edge.
So in particular, we're just going to figure out how much flow goes along each edge. But this, of
course, needs to satisfy a couple of conditions. It can't just be any number we like.
The first of these is rate limitation. For any edge E, for one thing the flow along that edge needs to be
non-negative. You can't send a negative number of cars along this road.
And secondly, the flow needs to be at most the capacity of the edge. You can't run more cars along
the road than the road's total capacity. The second condition is a little more subtle: it's
conservation of flow. The idea here is that if a car drives into an intersection, then eventually
it needs to drive out of the intersection.
And what this says is that at any vertex except for sources where flow gets created and sinks where
flow gets destroyed it needs to be the case that the total flow of cars into that vertex is the same as
the total flow coming out of that vertex. So for every vertex, the sum over all edges pointing into the
vertex of the flow is the same as the sum over all edges going out of that vertex of the flow along that
edge.
And so, putting this together formally, we define a flow on a network as an assignment of a real
number, f sub e, to each edge e, such that these two conditions hold: for each edge e, the flow on
that edge is between 0 and the capacity of the edge, and for every vertex v except for sources and
sinks, the total flow into that vertex is the same as the total flow out of that vertex.
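These two conditions are easy to check mechanically. Here is a minimal sketch; the representation (one dict mapping each edge (u, v) to its capacity, another mapping edges to flow values) is an illustrative choice, not anything fixed by the course:

```python
def is_valid_flow(capacity, flow, sources, sinks):
    """Check the two flow conditions from the lecture."""
    # Rate limitation: 0 <= f_e <= c_e for every edge e.
    for edge, f in flow.items():
        if f < 0 or f > capacity[edge]:
            return False
    # Conservation: flow in equals flow out at every vertex
    # that is neither a source nor a sink.
    vertices = {v for edge in capacity for v in edge}
    for v in vertices - set(sources) - set(sinks):
        inflow = sum(f for (a, b), f in flow.items() if b == v)
        outflow = sum(f for (a, b), f in flow.items() if a == v)
        if inflow != outflow:
            return False
    return True
```

For instance, on a two-edge path s → a → t with both capacities 2, a flow of 1 on each edge is valid, while 1 on the first edge and 2 on the second violates conservation at a.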
So that's what a flow is. For example, in our example from last time, we have all of these cars
traveling in various directions, and on each road, we can compute the total amount of flow, the total
number of cars flowing along that road. So there are 5,000 along the main highway, and the northern
highway going up has 2,000 cars an hour, half of them going one way, half of them splitting off the
other way. And for each road, you just label how many cars an hour are traveling along that road. And
if you look at this for a while, you can actually determine that yes, these satisfy the properties
that we want. No road has more flow than its capacity, and flow is conserved at each of these three
vertices that aren't sources or sinks.
So to make sure that we're all on the same page here, we have a network listed up above, and then
three possible assignments of flow. Which of these three are valid flows for the given network?
Well, you look at these for a while, you compare to the definitions, and you'll find out that C is the
only valid flow. A has the problem that it doesn't conserve flow at the denoted vertex. It has six units
of flow going into that vertex but seven units of flow coming out. B conserves flow everywhere, but
the edge that's highlighted has six units of flow whereas the capacity is five.
On the other hand if you look at diagram C, everything works out. Flow is conserved, nothing goes
above capacity, it's all great.
Okay, so this is what a network flow is, but network flows are actually very useful to study because
they actually model a large number of real life phenomena. We've already talked about flows of
goods or people along a transportation network which fit very cleanly into this model, but you can
also look at flows of electricity along a power grid or flows of water through pipes or even flows of
information through a communications network. All of these are going to be examples of network
flows in which this sort of formalism that we've developed will be useful for analyzing problems.
Now, what types of problems are we going to be studying? Well, the big one that you want to know is
the size of a flow. You want to really know how much stuff is actually flowing. How many cars are
actually evacuating the city? How many can we get to evacuate the city? And for this, we need to
define the size of a flow.
And it turns out this can be computed by looking only at the sources. The idea is that all flow gets
created at the sources, travels until it hits a sink, and then goes away. So if we just measure how
much flow is coming out of the sources, that will tell us how much there is in total.
So given a flow, we define its size to be the sum, over all edges that leave a source, of the flow
coming out of that source, minus the sum, over all edges going into a source, of the flow through
those edges. So it's the total flow going out of sources minus the total flow going into sources;
that's the size of the flow.
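As a small sketch (using the same illustrative edge-dict representation as before), the size computed at the sources is:

```python
def flow_size(flow, sources):
    # Total flow leaving the sources minus total flow entering them.
    out_of_sources = sum(f for (u, v), f in flow.items() if u in sources)
    into_sources = sum(f for (u, v), f in flow.items() if v in sources)
    return out_of_sources - into_sources
```

On the path s → a → t carrying 3 units of flow on each edge, the size is 3.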
Now, it turns out you can equally well compute this by looking only at sinks. The Lemma says the size
of the flow is equal to the sum of flow going into a sink minus the sum of flow going out of a sink. And
the argument here is pretty nice, we'll be seeing similar things a lot.
So the thing to note is that if you take the sum, over all vertices, of the total flow going into
that vertex minus the total flow going out of that vertex, that's actually zero, because for each
edge, some flow leaves one vertex but then goes into another vertex, so the two terms cancel out.
On the other hand, for vertices that aren't sources or sinks, by conservation of flow that inner
term is zero. So this is the same as the sum, only over sources and sinks, of the total flow into
that vertex minus the total flow out of that vertex.
Now if we look at the sum, only over sources, of the flow into the vertex minus the flow out of the
vertex, that's minus the size of the flow.
And the other term is just the flow into sinks minus the flow out of sinks. And since the sum is
zero, the total flow into a sink minus the total flow out of a sink is the same as the size of the
flow, which is what we were trying to prove.



Okay, so that's what the size of the flow is. The big problem that we're going to be trying to solve,
and we'll really be discussing how to solve this problem for the next several lectures, is: how much
flow can you fit through a network?
Formally this is called the maxflow problem. The input should be a network G, so a graph with these
capacities and some designated sources and sinks, and the output should be a flow f for the graph G
such that the size of the flow f is as large as possible.
And this is the problem that we're going to be spending the next several lectures on. So, come back,
next lecture we'll start talking about some of the tools that will be useful in designing these
algorithms. So I'll see you then.

Basic Tools
Residual Networks
Hello everybody, welcome back to our network flows unit. Today we're going to be talking about a
tool that is actually very useful, called the residual network, for coming up with new flows, or
adding a little bit of flow to an existing flow.
So remember, last time we formally defined these things. We defined what a network was, we defined
what a flow on this network was, and then we defined the maxflow problem, which is the one we're
working towards solving. There's a very basic technique for solving maxflow, and it is basically
what we are going to be working towards for the next bunch of lectures. The idea is to build up
your flow a little bit at a time, and this is really what we did in the original example in the
first lecture, where we routed a bunch of cars along one road, and then routed some more cars along
another road and so on and so forth, and built up the final flow as the sum of a bunch of little
flows. So how do we do this in practice? Well, suppose that we have the following network. Here,
all the edges have capacity 1, for simplicity, and what we can do is just add flows together a
little bit at a time. We can say, hey, we can send a unit of flow along this top path, and if we
just have a unit of flow on each of these edges, everything balances. But after we do that, we can
send another unit of flow along the bottom path, and then another unit of flow along the middle.
And once we've done this, we now have a maximum flow, but we built it up in nice convenient little
pieces.
Okay, so let's consider another example; this one's actually a little bit simpler. We have our
network here, and the maximum flow is 2, as we've shown here, but we're going to try to add flow
incrementally. So let's start by adding flow along this path. It's a perfectly valid path; we can
route a unit of flow through it. And now we want to try to add our second unit of flow, and there's
a bit of a problem. We can't readily add a second unit if we've already used up these edges; the
remaining edges just don't connect to each other, so we can't actually get the flow to work. Now it
turns out there is a way around this, which of course there must be since the maximum flow is 2,
and it involves routing flow along this blue path,
which is a little bit weird, since we cannot actually do that. We can't actually send flow down
along the middle edge, since there is no edge there. But if you think about it in the right way, you
can think of sending flow down this middle edge as cancelling out the flow that we currently send in
the up direction. If the flow going up and the flow going down are thought to cancel each other,
then once we add these two flows together, we just get this flow, which is perfectly valid, because
there's no flow running along the middle edge.
And so, the moral of the story is that if you want to be able to appropriately add your little bit
of flow, sometimes it's not enough to just add flow along new edges; sometimes you also have to let
your new flow cancel flow along existing edges.
So given a network G and a flow f what we're going to do is construct what's called the residual
network, g sub f, and this is a new network that represents the places where flow can be added to f.
But this includes not just edges where there's more room for extra flow to go along that edge, but
also places where we could cancel out existing flows.
So to define this formally: for each edge e of our graph, our residual network is going to have an
edge along e, and its capacity is going to be the original capacity of the edge minus the flow along
that edge. The point is that this is the amount of remaining capacity that we have. Of course, if
the flow is equal to the capacity, we can ignore this edge, because it would have no capacity.
We also need to have an edge opposite e with capacity equal to the flow along e, because this will
represent the amount of flow that we can cancel going in the other direction.
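The construction follows directly from this definition. A minimal sketch, assuming for simplicity at most one original edge in each direction between any pair of vertices (the edge-dict representation is illustrative):

```python
def residual_network(capacity, flow):
    residual = {}
    for (u, v), c in capacity.items():
        f = flow.get((u, v), 0)
        if c - f > 0:
            # Leftover capacity in the original direction.
            residual[(u, v)] = c - f
        if f > 0:
            # An edge opposite e with capacity equal to the flow along e:
            # the amount of flow that could later be cancelled.
            residual[(v, u)] = residual.get((v, u), 0) + f
    return residual
```

On an edge of capacity 5 carrying 5 units of flow, this leaves no forward edge and a reverse edge of capacity 5; on an edge of capacity 7 carrying 5 units, it leaves a forward edge of capacity 2 and a reverse edge of capacity 5, matching the lecture's example.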



So for example, up top we have a network with a flow assigned to it; there are various units of flow
assigned to various edges. Down below we have the residual network. So if you look at the edge on
the left, for example: well, we used up all five units of its capacity. So what does this mean?
Well, we've got no edge left pointing down, because there's no extra flow that we can push in that
direction.
However, we do have a new edge pointing up with capacity five, saying there are five units of flow
going the other way that we might cancel out later.
If you look at the top edge, we used five out of seven total units of capacity, so there are two
units of capacity left. So there's this one edge up top with capacity two, and then there's this
additional edge going the opposite direction representing the five units of flow that can still be
cancelled. And so we do that also for all the other edges of the graph, and this gives us the
residual network.
So if we look at what this does to our previous example: we have this graph, and we route flow like
this. Now we can't add to it directly, but if you look at the residual network, we're actually going
to have an edge going back in the opposite direction from each of these. And in this residual graph
there is actually a path that supports more flow, and it involves this middle edge, which says that
we're cancelling out flow along the middle.
Okay, so given a network g and a flow f, any flow g on the residual graph it turns out can be added to f
to get a new flow on the original graph. So, the point is that if you have flow along this edge in the
same direction that you had in the original graph, that's saying, you should add that much flow along
that edge.
However, if you've got flow on one of these opposite-direction pointing edges, that's saying that
that much flow should be cancelled from the flow that you had before along that edge. So just to
make it clear, let's look at this problem.
So we have a network with a flow f on it, in the upper left corner. Down below it we show the
residual network corresponding to that flow. Now in the upper right we have a flow, little g, for
the residual network.
And the question is: if we want to compute the sum of the flows, f plus g, what is the flow of f
plus g along this highlighted edge from S to T? Well, what do we get? The original flow along that
edge was two, and we need to add the flow of g along that same edge. That's four extra units of
flow, and we need to subtract off the flow in the cancelling direction, so that's two plus four
minus two, a total of four units of flow from S to T. And you can compute the other edges, and yes,
f + g does give you a valid flow for the original network.
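The rule used in this computation can be sketched directly: flow on an original-direction residual edge adds, and flow on a reversed edge cancels. The representation is the same illustrative edge dict as before:

```python
def add_residual_flow(flow, g):
    # g is a flow on the residual network: entries on original-direction
    # edges add flow; entries on reversed edges cancel existing flow.
    result = dict(flow)
    for (u, v), amount in g.items():
        cancel = min(amount, result.get((v, u), 0))
        if cancel > 0:
            result[(v, u)] -= cancel  # cancel flow going the other way
            amount -= cancel
        if amount > 0:
            result[(u, v)] = result.get((u, v), 0) + amount
    return result
```

On the highlighted edge from the quiz: original flow 2, plus 4 forward, minus 2 cancelled, gives 4.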
In fact, the theorem is as follows: if you have a graph G with a flow f, and then any flow g you
like on the residual network, a few things happen. Firstly, f + g is always a flow on the original
network, which is nice.
If you want to look at the size of the flow, the size of f + g is just the size of f plus the size of g.
And finally, and importantly, any flow on the original network can always be obtained by finding
some appropriate residual flow g and adding it to f.
Now the proof of this is actually not that hard. If you want to look at conservation of flow:
conservation of flow for f and conservation of flow for g, combined, imply that you have
conservation of flow for f + g.
Next, if you want to look at the total flow f + g sends through an edge: well, the flow it sends
through edge e is at most the flow of f along that edge plus the flow of g along that edge, which
is at most the flow of f plus the capacity of that edge in the residual network. But that capacity
is just the original capacity minus the flow sent by f, and so the total is at most the capacity in
our original network.
On the other hand, you can't end up with negative flow along an edge because g isn't allowed to
cancel more flow along that edge than you had originally. And so, putting this together, f + g has to be
a flow.
Next, if you look at the flow of f + g out of a source, this can be shown to be the flow of f out of
that source plus the flow of g out of that source. So, combining this, the size of the sum of the
flows is the sum of the sizes. And finally, if you're given any flow h for our original network,
it's not hard to construct a g that is essentially h - f, and show that it is a flow on the residual
graph.
And so you can then write h as f + g for some appropriate flow g on the residual graph.
So, in summary, flows on the residual network exactly correspond to ways to add flow to our
original f. And this is very useful, because our big-picture idea for our algorithm is going to be:
start with some flow, and then add little bits of flow, and the residual graph will tell us exactly
how we can add those little bits of flow. So that is all we have for this lecture. Come back next
time, and we will talk a little bit about how to show that we actually have the best flow when we
do.
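The big-picture idea just summarized, starting from zero flow and repeatedly adding flow found in the residual network, can be sketched as an augmenting-path loop. This is only an illustrative sketch of the approach developed in later lectures; it assumes a single source with no incoming edges and at most one original edge per direction, and scanning every edge during the search keeps it short rather than fast:

```python
from collections import deque

def max_flow_value(capacity, source, sink):
    # The residual network starts out equal to the capacities;
    # reverse edges start with zero residual capacity.
    residual = dict(capacity)
    for (u, v) in capacity:
        residual.setdefault((v, u), 0)
    while True:
        # BFS for a source-to-sink path with positive residual capacity.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for (a, b), r in residual.items():
                if a == u and r > 0 and b not in parent:
                    parent[b] = u
                    queue.append(b)
        if sink not in parent:
            break  # no augmenting path left, so the flow is maximal
        # Walk back from the sink, find the bottleneck, push that much flow.
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[e] for e in path)
        for (u, v) in path:
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck
    # With a single source that has no incoming edges, the size of the flow
    # is the total capacity leaving the source minus the unused residual.
    return sum(c - residual[e] for e, c in capacity.items() if e[0] == source)
```

On a small diamond network with unit capacities (s → a, s → b, a → b, a → t, b → t), this finds the maximum flow of 2.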
Maxflow-Mincut
Hello everybody. 
Welcome back to our unit on flows and networks. 
Today we're going to be talking a little bit about sort 
of how to bound the size of our flows. 
And in particular, I mean, we've got this problem. 
In order to find maxflows, 
we're going to need a way to verify the flows that we have are actually optimal.
So, in particular, what we're going to do is we're going to need techniques for 
bounding the size of a maximum flow.
And it turns out we actually have a way to do this. 
So, in our original example, we said we have the city we're trying to evacuate. 
And if we have a river at a particular location, and we just 
look at the total capacity of all the roads that cross the river, 
this gives us an upper bound on the rate at which we could evacuate the city, 
because everyone evacuating the city needs to cross the river at some point.
And this is going to be our basic idea for bounding maxflows. 
The idea is, we want to find a bottleneck in the flow. 
We want to sort of find some region where in order to 
cross from one side of this bottleneck to the other, there's not a lot of capacity. 
And the total capacity across this bottleneck 
will give us a bound on the flow.
So to make this a little bit more rigorous, we're going to define a cut. 
So given the network G, a cut, this is going to be a set of vertices of G. 
And you should think of these as the set of vertices on the source side of the river. 
So this is a set of vertices such that C contains all the sources of our graph and 
none of the sinks.
Now the size of the cut is given by the total capacity 
of all edges that leave the cut, that go from inside the cut to outside the cut, 
which is the sum of all that capacity.

And so, for example, in this network that we had corresponding to our city evacuation problem, we
can define a cut that contains these four vertices. And size of the cut, well, is the sum of the capacities
of these four roads, which ends up being 8000.
Okay, so to make sure we're all on the same page, here is a pretty simple network. There is a cut,
which is this blue square that contains four vertices on the inside. What's the size of this cut?
Well, you just have to look at which edges cross from inside the cut to outside the cut. These have
capacities one and two and three. And so you add those up, and you get six as the answer.
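The computation in this quiz can be sketched as follows; the vertex names are made up for illustration, since the lecture's figure is not reproduced here:

```python
def cut_size(capacity, C):
    """Total capacity of edges leaving the cut C (a set of vertices).

    Only edges from inside C to outside C count; edges that stay inside
    the cut, or that point into it, contribute nothing.
    """
    return sum(c for (u, v), c in capacity.items() if u in C and v not in C)

# Hypothetical edges mimicking the quiz: three edges of capacities 1, 2
# and 3 leave the cut {a, b, c}; the edge (a, b) stays inside it.
edges = {('a', 'x'): 1, ('b', 'y'): 2, ('c', 'z'): 3, ('a', 'b'): 7}
print(cut_size(edges, {'a', 'b', 'c'}))  # 6
```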
Okay, so the important thing though is that cuts provide upper bounds on the sizes of flows. In particular, for any flow f and any cut C, the size of f is at most the size of C.
And this was sort of exactly the argument that we had, any piece of flow needs to cross the cut.
There's only so much capacity that lets you cross the cut, and so that's an upper bound on the flow.
Now, to make this rigorous, let's give a proof. The size of the flow is the sum over our sources of the total flow out of that vertex minus the total flow into that vertex.
Now for vertices that aren't a source or sink, this term is zero. So we can extend this to a sum over
vertices inside our cut of the flow out of that vertex minus the flow into that vertex. On the other
hand, you'll note that, I mean, if you have an edge that stays within the cut, it comes out of one
vertex and into another and cancels out of the sum. So this is the same as the sum over edges that
leave the cut of the flow through that edge, minus the sum over edges that go into the cut of the flow
through that edge. Now of course, the flow on edges leaving the cut is at most the capacity of the edge, and the flow on edges into the cut is at least zero. So this is at most the sum over edges that leave the cut of the capacity of that edge, which is exactly the size of the cut.
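Written out as a chain of equalities and inequalities (with $f^{\mathrm{out}}(v)$ and $f^{\mathrm{in}}(v)$ denoting the flow out of and into vertex $v$, notation chosen here for readability), the proof reads:

```latex
\begin{align*}
|f| &= \sum_{s \in \mathrm{sources}} \bigl(f^{\mathrm{out}}(s) - f^{\mathrm{in}}(s)\bigr) \\
    &= \sum_{v \in C} \bigl(f^{\mathrm{out}}(v) - f^{\mathrm{in}}(v)\bigr)
       && \text{(terms for non-source, non-sink $v$ are zero)} \\
    &= \sum_{e \text{ leaves } C} f(e) \;-\; \sum_{e \text{ enters } C} f(e)
       && \text{(edges with both endpoints in $C$ cancel)} \\
    &\le \sum_{e \text{ leaves } C} c(e) \;=\; |C|
       && \text{(since $f(e) \le c(e)$ and $f(e) \ge 0$).}
\end{align*}
```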
So this proves the theorem.
And what this says is that if you have any cut C, that gives you an upper bound on the maximum flow.
The size of the maximum flow is at most the size of the cut.
Now it's good we've got some upper bounds, but the question is, is this good enough? I mean, there
are lots of ways to prove upper bounds. But what we really want is a sharp upper bound, one good enough that, once we have found a maximum flow, we will have a matching upper bound telling us that we actually can't do any better than this. And, somewhat surprisingly, bounds of this form are
actually good enough.
So the big theorem here is known as the maxflow-mincut theorem. For any network G, the maximum
over flows of the size of the flow is equal to the minimum over cuts of the size of the cut.
In other words, there's always going to be a cut that's small enough to give the correct upper bound
on maximum flows.
So to prove this theorem, let's start with a very special case. What happens when the maximum flow
is equal to zero?
If this is the case, it has to be the case that there's no path from source to sink. If there were any path from a source to a sink, then you could route a little bit of flow along that path, and your maxflow would be positive.
So what we're going to do is we're going to let C be the set of vertices that are reachable from
sources. And it turns out there can't be any edges out of C at all because if there were, if there was an
edge that left C, then wherever you ended up, that would also be reachable from the source, and so it would be in C as well.
Now, since there are no edges leaving C, the size of the cut has to be 0.
Now, in the general case, we can do something similar. We're going to let f now be a maximum flow
for G. And then, we're going to look at the residual graph.
Now, if the residual graph, which is a way to talk about ways of adding flow to f, if that had any flow
that you could put in it, f couldn't be a maxflow.
So the residual graph has maxflow zero.
And what that means is there's a cut C with size zero in this residual graph. And I claim this cut C has
size exactly equal to the size of our flow f.
And the proof isn't hard. The size of f for any cut is actually the total flow out of that cut minus the
total flow into that cut.
But if C has size 0 in the residual graph, that means that all the edges leaving the cut need to have
been completely saturated, they need to have used the full capacity. And the edges coming in to C
had to have no flow, because otherwise the residual graph would have an edge pointing in the
opposite direction.
And so the size is just the total sum over edges leaving C of their capacity, minus the sum over edges into C of zero, which is just the size of the cut. And so what we have found is a flow f and a cut C where the size of the flow is equal to the size of the cut.
Now, by the previous lemma, you can't have any flows bigger than that cut, or any cuts smaller than
that flow.
And so this is the maximum flow, and it's equal to the minimum cut size.
So in summary, you can always check whether or not a flow is maximal by seeing if there's a matching
cut.
In particular, f is going to be a maxflow if and only if there's no source to sink path in the residual
graph, and this is a key criterion that we'll be using in the algorithm that we'll be discussing next time.
So, I hope to see you for the next lecture.
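This criterion can be sketched directly in code. The edge-dict residual representation below is an assumed convention for this sketch, not the lecture's notation:

```python
def is_max_flow(residual, source, sink):
    """A flow is maximal iff the sink is unreachable in its residual graph.

    `residual` maps edge pairs (u, v) to their positive remaining
    capacities.  We run a plain graph search from the source over
    residual edges and check whether the sink was ever reached.
    """
    seen, stack = {source}, [source]
    while stack:
        u = stack.pop()
        for (a, b), cap in residual.items():
            if a == u and cap > 0 and b not in seen:
                seen.add(b)
                stack.append(b)
    return sink not in seen

# A saturated single edge s->t: the residual contains only the back edge
# t->s, so t is unreachable from s and the flow is maximal.
print(is_max_flow({('t', 's'): 1}, 's', 't'))  # True
```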

Maxflow algorithms
The Ford–Fulkerson Algorithm

Hello everybody, welcome back to our Flows in Networks unit. Today we're actually going to, finally,
give an algorithm to compute maximum flows. So the idea of this algorithm is very much along the
lines that we've been sort of hinting at the entire time. We're going to start with zero flow, in our
network, so the trivial flow, no flow along any edge. And we're going to repeatedly add a tiny bit of
flow, sort of building up the flow a little bit at a time, until we reach a state where it's impossible to
add anymore flow, and then we'll be done. So how do we add flow?

You have some flow f. We then compute the residual network, Gf. And this really does represent the
ways in which flow can be added. So any new flow that we would have would be of the form f + g,
where g is a flow in our residual network. So if we want to replace f by a slightly larger flow, all we
need is a slightly positive flow in the residual network.
And to do that, all we want to do is see if there's a source to sink path in this network.
So, what happens if there's no path? If there's no source to sink path in our residual network, then
the set of vertices that we can reach from the source defines a cut of size 0.
That says there's no flow in the residual of positive size. And so any flow f + g has size at most the size
of f and f is a maximum flow. And so if that's the case, we're done. We already have a maximum flow
and we can just stop.
Now if there is a path, it turns out we can always add flow along that path.
What you do is if you add x units of flow to each edge along that path, well, you have conservation of flow: there are x units in and x units out of each vertex on that path. And as long as x is at most the minimum capacity of any of these edges in the residual graph, this is actually a flow in the residual network.
So if we do this, we find some flow g for our residual network with the size of g is bigger than 0. Then
we'll replace f by f + g, we found a new flow where the size of f + g is strictly bigger than the size of f.
We found flow that's slightly bigger than the one we had before.
So to make this formal, we produced what's known as the Ford-Fulkerson algorithm for max flow. You
start by letting f be the trivial flow. And then you repeat the following. You compute the residual
graph for f.
You then try and find an s to t path, P, in this residual graph.
If there is no such path, we know that we already have a max flow so we can just return f.
Otherwise, what we're going to do is we're going to let X be the minimum capacity of any edge along
this path in the residual network.
We're going to let g be a flow, where g assigns X units of flow to each edge along this path.
And then we're going to let f be f + g. When we do this, we increase our flow by a little bit, and we just keep repeating until we can't increase our flow anymore. So, for example, we've got the network
here. Here's our residual network. How much flow do we end up adding in one step?
Well to figure this out you have to do two things. You first have to find your S to T path, which is this
one. And then you say, well, how much capacity is there on the edges? Which edge has minimum capacity? And that's this edge of capacity 4. And so in this case you'd route four units of flow on your
first step.
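The loop described above can be sketched in Python as follows. The dict-of-dicts residual representation and the use of depth-first search for the path step are implementation choices of this sketch; the lecture deliberately leaves the path-finding method open:

```python
def ford_fulkerson(capacity, s, t):
    """A sketch of the Ford-Fulkerson algorithm.

    `capacity` maps (u, v) vertex pairs to integer edge capacities.
    Returns the size of a maximum flow from s to t.
    """
    # residual[u][v] = remaining capacity of edge u -> v in G_f
    residual = {}
    for (u, v), c in capacity.items():
        residual.setdefault(u, {})
        residual.setdefault(v, {})
        residual[u][v] = residual[u].get(v, 0) + c
        residual[v].setdefault(u, 0)        # back edge, initially empty
    flow = 0
    while True:
        # try to find an s-to-t path P in the residual graph (DFS here)
        parent, stack = {s: None}, [s]
        while stack and t not in parent:
            u = stack.pop()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    stack.append(v)
        if t not in parent:                 # no path: f is already maximal
            return flow
        # X = minimum residual capacity of any edge along the path
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        x = min(residual[u][v] for u, v in path)
        for u, v in path:                   # f <- f + g: push X units
            residual[u][v] -= x
            residual[v][u] += x
        flow += x
```

For example, `ford_fulkerson({('s','a'): 4, ('a','t'): 3, ('s','t'): 2}, 's', 't')` returns 5: two units along the direct edge plus three along s-a-t.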
But, to really see how this algorithm works let's take the following example. So, we have a graph up
top. Initially we have no flow running through that graph so the graph below is the residual, is the
same network that we started with. And now what we want to do is we want to find paths in the
residual network. So here's an S to T path. The minimum capacity along this path is 5, so we route 5
units of flow along each of these edges. Now this updates the residual: we have a couple of changes, we've got a new edge that wasn't there before.
We now want to again find an S to T path in the residual graph. This one works out pretty well. Again,
the minimum capacity of these edges is 5, so we route 5 more units of flow along each of those edges
and we update the residual graph again. Once again, we find an S to T path in the residual graph. This
one works pretty well. The minimum capacity here on these edges is 2. So we route 2 more units of
flow along each of those edges.
And, at this point, once we've updated the residual we will note there is no S to T path. In fact, there's
a cut right here that prevents us from routing any more flow. And so given that cut you can actually
see that this flow which routes 12 total units of flow is actually a maximum flow and so we're done.
So before we get into analyzing the run time of this algorithm, there's an important point to make.
We should note that if all the capacities that we have are integers in our original network, then all the
flows that we produce are also integer. Because every time we try and augment our flow along some
path, we look at the smallest capacity, which is always an integer. And so we put an integer amount of
flow everywhere and everything remains integer if we started with integers.
And there's an interesting lemma that we get out of this, which actually will prove useful to us later,
that says if you have a network G with integer capacities, there's always a maximum flow with integer
flow rates. And you can get it just by using the Ford-Fulkerson algorithm. Okay but now let's look at
the analysis.
And for this analysis to work I'm going to have to assume that all capacities are integers.
Now what does this algorithm do? Every time through this loop, we compute the residual graph and
then we try to find a path P in it. And each of these runs in O of number of edges time.
Now, every time we do that, we increase the total flow by a little bit, in fact by at least 1. So the number of times we do it is at most the total flow on our graph. So our total runtime is bounded by the
number of edges in our graph times the size of the maximum flow.
Now this is a little bit weird as a runtime, because it depends not just on sort of the structure of the
graph that we're working on, but also the capacities of the edges and the size of the maximum flow.
This leads us to a problem, where, potentially at least, if we have numerically very, very large
capacities in our graph, it could actually take us a very, very long time to compute the flow.
One other thing I should note about this algorithm is that it's not quite a full algorithm. What it says is
at every step I need to find some source to sink path in our residual.
Now, there might be many valid paths to choose from, and the Ford-Fulkerson algorithm, as I've
stated, doesn't really tell you which one to use.
Now you might just want to run depth-first search because it's very fast, but maybe that's not the
best way to do it. And as we'll see a little bit later in fact, finding the right way to pick these
augmenting paths can actually have a substantial impact on the runtime of the algorithm. But that's
for a little bit later. That's all for our lecture today. Next time, we'll talk a little bit more about the
runtime of this particular algorithm. So I hope to see you then.
Slow Example

Hello everybody, welcome back to our network flows unit. Today we're going to be talking about an example of a network on which the Ford-Fulkerson algorithm might not be very efficient.
So last time we had this great algorithm for maxflow called the Ford-Fulkerson algorithm. The runtime was O of the number of edges of the graph times the size of the maximum flow. Now, this is potentially very bad if the size of the flow is large. On the other hand, this is sort of a theoretical problem at this point. We don't know for sure whether or not this is ever actually a problem.

So today we're going to consider the following example.
Here is a graph. Some of the capacities are pretty large, a bunch of them have capacity one million, and then there's one edge with only capacity one.
So the max flow here is big, we can route a million units of flow over the top and another million over
the bottom, so the max flow for this graph is two million, fine.
Let's look at possible executions of the Ford-Fulkerson algorithm on this graph, one in particular. So we start with no flow, we have a residual graph, let's look for a source to sink path. Here's one.
What's the minimum capacity on this path? Well it's one coming from that middle edge. So we're
going to route one unit of flow along that path.
Update the residual, find a source to sink path. Here's one. One unit of capacity along the middle
edge. So we route one more unit of flow along this path.
Update the residual, find the path, one more unit of flow, residual, one more unit of flow, and we can
keep going like this for a while.
So the question here is, if we keep iterating the Ford-Fulkerson algorithm like this, how many iterations will it actually take to compute the maximum flow, assuming that it keeps choosing augmenting paths according to this pattern? Well, quite a lot actually. Each step here adds only one unit of flow because we keep being limited by this middle edge.
In order to find a max flow, we need a total of two million units. So the Ford-Fulkerson algorithm requires something like two million iterations before it converges on this graph. And that's a really big number for a graph with only four vertices.
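To see the effect concretely, here is a small simulation of the adversarial schedule on the lecture's four-vertex graph, scaled down to big-edge capacity C = 1000 instead of one million so it runs quickly. The explicit alternating path schedule is my own, chosen to mimic the bad execution:

```python
# Vertices: s, a, b, t.  Big edges s->a, s->b, a->t, b->t of capacity C,
# plus the middle edge a->b with capacity 1 (the lecture uses C = one million).
C = 1000
residual = {('s', 'a'): C, ('s', 'b'): C, ('a', 't'): C, ('b', 't'): C,
            ('a', 'b'): 1, ('b', 'a'): 0}

def augment(path):
    """Push the bottleneck amount of flow along `path`, updating residuals."""
    x = min(residual[e] for e in path)
    for u, v in path:
        residual[(u, v)] -= x
        residual[(v, u)] = residual.get((v, u), 0) + x
    return x

# Adversarial schedule: alternate the two zig-zag paths through the middle
# edge, each of which carries only 1 unit because of its capacity-1 bottleneck.
flow, iterations = 0, 0
while residual[('s', 'a')] > 0 or residual[('s', 'b')] > 0:
    path = ([('s', 'a'), ('a', 'b'), ('b', 't')] if iterations % 2 == 0
            else [('s', 'b'), ('b', 'a'), ('a', 't')])
    flow += augment(path)
    iterations += 1

print(flow, iterations)  # 2000 units of flow after 2000 iterations
```

With C restored to one million, the same schedule would take two million iterations, while the good schedule over the top and bottom needs only two.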
On the other hand, if you think about it, it doesn't need to be this bad. I mean here's another
perfectly valid execution of the Ford-Fulkerson algorithm on this graph.
We've got no flow. Let's find a path in the residual. There's this one. We can route a million units of flow along that path, update the residual. Here's another path. Put a million units of flow along
path and suddenly we've got a cut. We're done.

And so there's a big difference between these two different executions of more or less the same
algorithm. And what would be really nice is if we had a way to ensure that we always had something that looked more like the latter execution than like the former execution. And next time we're going
to be talking about sort of a way to go about this. A sort of principled way of choosing our paths to
guarantee that we don't have the type of problem presented by the first of these examples. So that's
what we will be discussing next time. I hope to see you then.
The Edmonds–Karp Algorithm

Hello everybody and welcome back to our Network Flows Unit. Today we're going to be talking about a new algorithm for network flows, or maybe just a version of the old algorithm, that will do a little bit better than what we had previously. So last time, we were still talking about the Ford-Fulkerson algorithm for maxflow. The runtime, in general, is O of the number of edges times the size of the flow. And last time we showed that this can actually be very, very slow on graphs with large capacities.
And in particular, we had this example, where sort of every time, if you routed flow, at least if you
were picking the wrong paths, then you just got one unit of flow every iteration and it took millions of
iterations to actually finish.
Fortunately though, we know that the Ford-Fulkerson algorithm gives us a choice as to which
augmenting path to use. And the hope is that maybe by picking the right path we can guarantee that
our algorithms won't take that long. And so in particular what we want to do is we want to find sort of
a principled way of picking these augmenting paths in order to ensure that our algorithm doesn't run
through too many iterations. And one way to do this is via what's known as the Edmonds-Karp
algorithm.
The idea of the Edmonds-Karp algorithm is as follows. We'd like to use the Ford-Fulkerson algorithm, but we're always going to be using the shortest possible augmenting path. That is, shortest in the number of edges that are being used.
And, basically all that this means is that if we want to find our augmenting paths, we want to use a
breadth-first search, rather than a depth-first search.
So, for example, if we're trying to run Edmonds-Karp on this example, then we can't use the zig-zag
path with three edges. We're required to pick this augmenting path with only two edges instead.
After we've done that there's another path with only two edges, and after we've done that there's
nothing left to be done. So at least on this example the Edmonds-Karp algorithm gives us the good
execution rather than the bad one.
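A sketch of Edmonds-Karp in Python: it is Ford-Fulkerson except that the path search is a breadth-first search, so the augmenting path found always has the fewest edges. The dict-of-dicts graph representation is an implementation choice of this sketch:

```python
from collections import deque

def edmonds_karp(capacity, s, t):
    """Ford-Fulkerson, always augmenting along a shortest (BFS) path."""
    residual = {}
    for (u, v), c in capacity.items():
        residual.setdefault(u, {})
        residual.setdefault(v, {})
        residual[u][v] = residual[u].get(v, 0) + c
        residual[v].setdefault(u, 0)        # back edge, initially empty
    flow = 0
    while True:
        # breadth-first search finds a fewest-edge s-to-t path
        parent, queue = {s: None}, deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:                 # no augmenting path: done
            return flow
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        x = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= x
            residual[v][u] += x
        flow += x
```

On the bad example from the previous lecture, BFS never picks the three-edge zig-zag while a two-edge path exists, so the maximum flow of two million is found in just two augmentations.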
Now to really look into how well this works, we need to analyze these augmenting paths. So if you have an S to T path, you'll note that when you add your augmenting flow, it always saturates some edge. That is, it uses up all the remaining capacity of that edge.
And this is because the way we decided the amount of flow to run along this path was that we took the minimum capacity of any of these edges in the residual graph. And so whichever edge had only that much capacity left got saturated.
Now, once we add this augmenting flow, we have to modify the residual network. We end up with
edges pointing backwards along each of these places because we can now cancel out that flow we
just added. And at least the one edge that we ended up saturating gets destroyed: we used up all of its remaining capacity.

Okay, so we'd like to now analyze the Edmonds-Karp algorithm. And the basic idea is that whenever
we have an augmenting path, we always saturate some edge. And we're going to show that we don't
have too many different augmenting paths by showing that no edge is saturated too many times. Now
we'll note this really fails to hold in the bad case that we looked at, because the middle edge kept on being saturated over and over again; it just flipped from pointing up to pointing down in the residual graph over and over again. And this was the real thing that was limiting us to adding one unit of flow per iteration. Okay.

So that's the idea of our analysis and the way we're going to show that this works is we're going to
start with a critical lemma.
The Edmonds-Karp algorithm is very concerned about distances in the residual graph because it looks
for short paths there.
And so we'd like to know how these distances change as the algorithm executes. Because as you run
your algorithm your residual graph keeps changing, and so the distances inside the residual graph
change.
Now the Lemma that we want is the following. As the Edmonds-Karp algorithm executes, if you take
any vertex v and look at the distances from the source to v, those distances only get bigger.
Similarly, look at the distances from v to t, or the distance from s to t; again, those can only increase, never decrease.
And the proof is not so bad but it's a little subtle. So, whenever we have an augmenting path, we
introduce a bunch of new edges that point backwards along this augmenting path.
Now the augmenting path sort of by assumption was always the shortest path from source to sink.
And what that means is that the new edges point from vertices that were further away from s to
vertices that are closer to s.
And the key observation is that new edges of that form never give you any faster path from the source to v.
And this is because, well if I told you someone introduced a great, fast, one-way highway that went
from a city 1,000 miles away from your house to a city 10 miles away from your house, it would not
actually be useful for you to get anywhere from home. Now it would be incredibly useful getting back
from this other place, but if you wanted to get to this place 10 miles away, you could just drive 10
miles instead of driving 1,000 miles and taking the new highway.
Similarly, these edges that only point from vertices farther from s to vertices closer to s never help you get anywhere from s any faster than before. Now, the saturated edges that got removed might make things become slightly further away than they were before, but the new edges never make anything closer. And that basically completes our proof. The fact that distances from vertices to t increase is completely analogous, as is the proof that the distance from s to t increases.
So, with that under our belts, the critical lemma now is the following. We want to show that there's a
limit on how often edges can be resaturated.
And so we have the following lemma. When running the Edmonds-Karp algorithm, if an edge e is
saturated, that edge cannot be used again in any augmenting path, at least until the distance between s and t in the residual graph has increased.
Now the proof of this is a little bit subtle, so we're going to first consider the path that caused us to
saturate the edge.
So the path went from s to u, this had length x, then from u to v which was our edge. And then from v
to t which had length y.
Now, this had to be a shortest path. And so the distance from s to t had to be x+y+1. Now, when we use
that edge again, we use this edge from v back to u. Well, we need to have some path from s going to v
then to u and then from u to t.
Play video starting at 7 minutes 17 seconds and follow transcript7:17
Now this has to be the shortest path.
Now what's the distance from s to v? The distance from s to v is at least what the distance from s to v
was before, which was x + 1.
Then the distance from v to u is one and the distance from u to t is at least what it was before, which
is at least y + 1. So that means when this edge gets used again, the distance from s to t has to be at least (x + 1) + (y + 1) + 1, which is at least x + y + 3. Which means that when this edge gets used again, it has to be the case that the distance between s and t is bigger than it was before. And that completes
our proof.
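The distance accounting in this proof, written out (with $d$ denoting distances at the time the edge $(u, v)$ was saturated and $d'$ the distances when the reversed edge is later used; the primed notation is introduced here for clarity):

```latex
\begin{align*}
d(s, t) &= d(s, u) + 1 + d(v, t) = x + 1 + y,
\intertext{and since distances in the residual graph can only grow, when
the reversed edge $(v, u)$ is used on a later shortest path,}
d'(s, t) &= d'(s, v) + 1 + d'(u, t)
          \;\ge\; (x + 1) + 1 + (y + 1) \;=\; x + y + 3.
\end{align*}
```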

Once we have this lemma the rest of this analysis is actually pretty easy. The distance between s and t
in the residual graph can only increase, and it's never more than the number of vertices. So it can only increase at most the number of vertices many times.
Now between times that it increases, no edge can be saturated more than once because once it's
saturated you can never use it again. And so between times you can only have O of E many saturated
edges.
But each augmenting path has to saturate an edge. You can only have O of E many such paths
between increases in this distance between s and t. And that can happen only O of E many times.
So there are only O of size of V times size of E many augmenting paths used by this algorithm.
Each path here takes only O of E much time. And so the total run time, is at most, O of V times E
squared.
Now, this is maybe not so great, because O of V times E squared might be the number of vertices to the fifth, or the number of edges cubed. But it is polynomial, and it doesn't blow up when the actual size of our flow becomes very, very large.
Okay. So, as a quick review of the properties of this Edmonds-Karp algorithm, which of the following are true about the Edmonds-Karp algorithm?
One, that no edge is saturated more than size of V many times. Two, the lengths of the augmenting
paths decrease as the algorithm progresses.
Or three, that changing the capacities of edges will not affect the final runtime.
Well, it turns out that only one of these is true.
Yes, edges only become resaturated after the distance between S and T increases, which only
happens V many times. However, the lengths of the augmenting paths increase as the algorithm
progresses, not decrease. And finally, although the runtime does not have an explicit dependence on
the edge capacities, like it did in the Ford-Fulkerson algorithm, they can still affect the runtime. If all
the capacities are zero, you don't need to do any augmenting paths. If the capacities are weird, they
might make you do a little bit more work than you'd have to do otherwise.
But the nice thing about Edmonds-Karp is that there's a bound to how bad it can be.
So, in summary, if we choose augmenting paths based on length, it removes the bad dependence that we had on the numerical sizes of the capacities. We have a runtime bound that is independent of the total flow. And max flow is an incredibly well studied algorithmic problem. There are actually better, more complicated algorithms that we're just not going to get into in this course. The state of the art is a little better than what we had: it's O of the number of vertices times the number of edges.
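The Edmonds-Karp algorithm described above can be sketched as follows: BFS finds a shortest augmenting path in the residual graph each round, and we push the bottleneck amount along it. The dictionary-of-capacities representation and function name here are my own choices, not from the lecture:

```python
from collections import deque

def edmonds_karp(capacity, s, t):
    """Max flow via shortest (BFS) augmenting paths.
    capacity: dict mapping (u, v) -> capacity; missing pairs mean capacity 0."""
    residual = dict(capacity)
    adj = {}
    for (u, v) in capacity:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
        residual.setdefault((v, u), 0)  # reverse edge for undoing flow
    flow = 0
    while True:
        # BFS for a shortest augmenting path in the residual graph
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow  # no augmenting path left: flow is maximum
        # find the bottleneck capacity along the path
        bottleneck, v = float('inf'), t
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[(parent[v], v)])
            v = parent[v]
        # augment: update residual capacities in both directions
        v = t
        while parent[v] is not None:
            residual[(parent[v], v)] -= bottleneck
            residual[(v, parent[v])] += bottleneck
            v = parent[v]
        flow += bottleneck
```

For example, on a small network with edges s-a (3), s-b (2), a-b (1), a-t (2), b-t (3), the maximum flow is 5, matching the capacity of the cut around the source.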
And if you want to look it up, I mean, feel free to look up these more complicated algorithms. But this
is all that we're going to do in this course. The next two lectures, we're going to sort of talk about
some applications of these maxflow algorithms to a couple other problems where it's not quite
obvious that this is the right thing to do. So I'll see you next time.

Application
Bipartite Matching
Hello, everybody. 
Welcome back to our network flows unit. 
Today we're going to talk about an application of some of these network flow 
algorithms we've been discussing, to a problem called bipartite matching. 
So to get started on this, 
suppose you're trying to coordinate housing in a college dormitory. 
So what you have is, you've got n students and m rooms. 
Each student is giving you a list of rooms that they consider to be acceptable, and 
what you'd like to do is place as many students as possible 
in an acceptable room. 
Now, of course, there's a limitation here that you can't place more than one student 
in the same room.
Okay, so this is the problem.
How do we organize this data? Well, you've got a bunch of students, you've got a bunch of rooms, and there are some pairs of a student and a room where the student is willing to be in that room.
And so a great way to organize this data pictorially is with a bipartite graph.
A bipartite graph is a graph G whose vertex set is partitioned into two subsets, U and V: students and rooms. There are two types of vertices, and all edges in the graph go between a vertex of U and a vertex of V; here, every edge connects a student to a room. And so if we just redraw that graph and call the two sides U and V instead of Students and Rooms, it's exactly a bipartite graph.
So what we'd like to do on this graph is find what is called a matching. We want to find a bunch of pairs of students and rooms, that is, a bunch of edges in the graph, but it needs to be the case that each student gets assigned only one room and each room is assigned to only one student. That says that no two of the edges we pick can share an endpoint.
So in our example, if you look at the blue edges here, that gives us a matching. We've got a bunch of students paired with rooms that they find acceptable, each student is assigned at most one room, and each room is assigned to at most one student.
So the big problem we're going to try to solve is known as bipartite matching. Given a bipartite graph G, we try to find a matching of G that consists of as many edges as possible, ideally one that pairs up all of the vertices. So, just to be sure that we're on the same page: if I give you this bipartite graph, what's the number of edges in the largest possible matching?
Well, if you play around with it for a bit, you can find that you can actually get matchings of size three here. It takes a while, but you should be able to convince yourself that it's not actually possible to get a matching of size four or five.
So, let's talk about applications. Bipartite matching actually has a bunch of applications. One example is matchmaking. Suppose you have a bunch of men and women, some pairs of them are attracted to each other, and you would like to pair them off into as many couples as possible, such that nobody is dating more than one person at the same time.
Now, we have to be a little bit careful here. If there are gay people, then this doesn't quite fit into the
context of bipartite matching, because there are men attracted to men or women attracted to
women. The graph is no longer bipartite. And there's nothing wrong with this necessarily, but it will
make the problem computationally more complicated.
Another example that you might want to consider is maybe a scheduling problem. You have sort of a
bunch of events that need to be scheduled at different times. Each event has some blocks of time that
would work for it and you need to make sure that no two events get the same time block. Once again,
sort of a bipartite matching problem.



So how are we going to solve this problem? The key idea is that there's a connection between this problem and network flows. What you want to do in bipartite matching is connect nodes on the left to nodes on the right without putting too many connections through any given node.
This sounds sort of like a flow problem. You want to have flows running from left to right without too
much flow running through any given node.
So to make this work, you add a source node connected to the left-hand vertices, connect the right-hand vertices to a sink node, and build up a network.
So in particular, we start with our bipartite graph. You direct all of the edges left to right.
We're going to add a source and a sink node. We connect the source node to the vertices on the left and the vertices on the right to the sink, and we define all the edges of this graph to have capacity one.
This gives us a network associated to our bipartite graph, and it turns out that for every matching in
our bipartite graph there's a corresponding flow on the network.
And to be formal about this: if G is the bipartite graph and G' the corresponding network, there's actually a one-to-one correspondence between bipartite matchings on G and integer-valued flows on G'.
And just to prove this: if you have a matching, we can produce a flow by running one unit of flow through each edge of the matching. Then, to make everything balance out, each vertex of U with flow running through it needs flow coming in from s, and the matched edge goes to some vertex of V, which needs to pass the flow on through its edge to t. That gives us a flow.
Now if we have a flow and want to go back to a matching, you just look at the middle edges between U and V, ask which of them carry flow, and use those edges in the matching. We can't have two edges coming out of the same vertex of U, because there won't be enough flow going into that vertex: there is at most one unit of flow going in, so there can't be two units coming out. And we also can't have two edges sharing the same vertex of V, for basically the same reason.
And so there's a one-to-one relationship between bipartite matchings and integer-valued flows. And recall the lemma we proved that you can always find an integer-valued maximum flow. So our max flow algorithms already work for solving this problem.
So this gives a very simple algorithm for solving bipartite matching.
You construct the corresponding network G'. You compute a max flow for G' in a way that gives you an integer max flow. You then find the corresponding matching and return it. That solves the problem.
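The reduction above can be sketched in code. Rather than materializing the network G' explicitly, the standard augmenting-path formulation below (often called Kuhn's algorithm) implicitly performs exactly the unit-capacity max flow computation just described; the representation and names are my own choices:

```python
def max_bipartite_matching(adj, n_right):
    """adj[u] lists the right-side vertices acceptable to left vertex u.
    Equivalent to unit-capacity max flow on the network G' from the lecture."""
    match_right = [-1] * n_right  # match_right[v] = left vertex matched to v

    def try_augment(u, visited):
        # search for an augmenting path starting at unmatched left vertex u
        for v in adj[u]:
            if not visited[v]:
                visited[v] = True
                # v is free, or v's current partner can be rematched elsewhere
                if match_right[v] == -1 or try_augment(match_right[v], visited):
                    match_right[v] = u
                    return True
        return False

    size = 0
    for u in range(len(adj)):
        if try_augment(u, [False] * n_right):
            size += 1
    return size, match_right
```

For instance, with left vertices 0, 1, 2 whose acceptable right vertices are [0, 1], [0], and [1, 2] respectively, all three can be matched.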
Now, we could just say that we're done here, but there's something very interesting going on. The max-flow min-cut theorem relates the maximum flow to the minimum cut, which is nice as a theoretical tool. In these bipartite graphs, the maximum matching corresponds to a max flow, so let's see what the cuts correspond to.
So if we have the network corresponding to a matching and look at a cut in this network, the cut contains the source, some set X of vertices on the left, and some set Y of vertices on the right.
And we'd like to make this cut as small as possible.
Now if we fix X, when do the vertices on the right contribute to the cut?
Well, each vertex in Y has an edge to t, which contributes one edge to the cut. But an edge from X to a vertex not in Y would also cross the cut. And because of this, it can be shown that you can basically afford to let the elements of Y be exactly those vertices on the right-hand side that are connected to some element of X.
Now if we do that, which edges cross the cut? Well, you've got edges from s to elements of U that aren't in X.
Now, by the way we constructed Y, vertices in X only connect to vertices in Y, which are also in the cut. But vertices in Y connect to t, and those edges also cross the cut.
So the total size of the cut is the size of U minus X plus the size of Y. However, you'll note that all
edges in G connect to either a vertex in Y or a vertex in U minus X. So one way to find a bound on your
matching is by finding a set of vertices such that every edge in your graph connects to one of those
vertices.



And working this out gives us what's called König's theorem, which says that if G is a bipartite graph and k is the size of the maximum matching, then there is a set S of only k vertices of the graph such that each edge in G touches one of the vertices of S.
And you'll note that if you have such an S, it gives you a bound on the maximum matching, because each edge needs to use up one of those vertices, and no two edges can share a vertex. So König's theorem says that the size of the maximum matching of the graph equals the minimum size of what's called a vertex cover: a set S of vertices that touches all edges.
So, for example, if we have the following graph, you'll note that these four vertices touch every single edge in the graph. That says immediately that the maximum matching has size at most four, and it turns out that in this case the bound is tight.
Now there's one more special case of König's theorem that's worth mentioning: the case where G is a bipartite graph with n vertices on each side.
One thing that you might want to do is produce what's called a perfect matching on G, that is, a matching that uses every single vertex on both sides.
Now, it's a theorem, which you can show by specializing König's theorem to this case, that there is always a perfect matching unless there is some set of only m vertices on the left-hand side such that the total number of vertices they connect to is strictly less than m.
So you can always pair up your n men with your n women, unless there is some collection of m men who between them are attracted to a total of fewer than m possible women. If that's the case, these m men cannot all simultaneously have distinct dates, so it's clearly not possible to produce a perfect matching. So in summary, we've got this interesting problem of maximum matching, and we can solve it by reducing it to the problem of finding maximum flows.
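This marriage condition is Hall's theorem, and on small graphs it can be checked directly by brute force over subsets of the left-hand side; a minimal sketch with my own naming (not an efficient method, just an illustration of the condition):

```python
from itertools import combinations

def satisfies_halls_condition(adj, n):
    """adj[u] = set of right-side vertices adjacent to left vertex u.
    Returns True iff every set of m left vertices has at least m neighbors,
    which (by Hall's theorem) is exactly when a perfect matching exists."""
    for m in range(1, n + 1):
        for subset in combinations(range(n), m):
            neighborhood = set().union(*(adj[u] for u in subset))
            if len(neighborhood) < m:
                return False  # these m vertices cannot all be matched
    return True
```

For example, three left vertices attracted only to right vertices {0, 1} between them fail the condition, while a one-to-one adjacency passes it.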
Furthermore, max-flow min-cut gives us some interesting characterizations of the sizes of these maximum matchings. So, that's all I have to say about bipartite matching. Come back next session, and we'll talk about one more problem that you can solve using this max flow technology that we've developed.

Image Segmentation
Hello, everybody, welcome back to our Flows in Networks unit. Today we're going to be talking about an interesting problem in image segmentation. This is a problem in image processing, and we'll actually show that there are some surprising connections to the max-flow min-cut ideas that we've been talking about.
So the problem we're trying to solve is image segmentation. 
Given an image, separate the foreground of the image from the background. 
And we don't want to get too much into image processing, so 
here's the basic setup. 
The image is a grid of pixels. 
We need to decide which pixels are in the foreground and 
which are in the background. 
And I don't know much about how you actually process images, but we're going 
to assume that there's some other program that gives you some sort of idea about 
which pixels are in the foreground and which are in the background.
So, in particular, there's some other algorithm which looks at each pixel and 
makes a guess as to whether it's foreground or the background.
It assigns each pixel two numbers: av, which is a likelihood that the pixel is in the foreground, and bv, which is a likelihood that it's in the background.
So in the simple version of this problem, the input is these values a and b, and the output should be a partition of the pixels into foreground and background, such that the sum over v in the foreground of a sub v plus the sum over v in the background of b sub v is as large as possible.
So to be sure that we're on the same page, here's a really simple version. We've got three pixels and
we've got some a and b values. What's the best possible value that we can get out of this problem?
Well, it turns out that this problem is actually not that hard to solve in general. Basically, for any pixel, if you put it in the foreground you get a points, and if you put it in the background you get b points. So if a is bigger than b it goes in the foreground, and if b is bigger than a it goes in the background. So in our example, pixel 1 goes in the background and gives us 4, pixel 2 goes in the foreground and gives us 5, and pixel 3 goes in the foreground and gives us 6. And so the answer is 4 plus 5 plus 6, which is 15.

Now, this problem is maybe a little bit too easy. But let's take a little bit more information into
account. We sort of expect that nearby pixels should be on the same side of the foreground-
background divide. They're not going to be sort of randomly spattered throughout the picture, they
tend to be more or less connected regions.
So for each pair of pixels v and w, we're going to introduce a penalty pvw for putting v in the
foreground and putting w in the background.
So the full problem is the following. As input we take a, b, and p.
Again, we want a partition of our pixels into foreground and background. And now we want to
maximize the following. The sum of v in the foreground of av and the sum of v in the background of
bv, as before.
But now we subtract the sum over all pairs, where v is in the foreground and w is in the background,
of pvw.
And now we want this thing to be as large as possible.



Now, before we get into too much depth on this, I'm going to do a tiny bit of algebra. I'm going to subtract the sum over all pixels v of av plus bv. The point is that this is a constant that doesn't depend on our foreground-background split, so it doesn't really affect our maximization problem; it just shifts the numbers around a bit. We now want to maximize the negative of: the sum over v in the foreground of bv, plus the sum over v in the background of av, plus the sum over pairs with v in the foreground and w in the background of pvw. And instead of maximizing a negative quantity, of course, we can try to minimize the corresponding positive quantity.
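Transcribing the spoken derivation above into symbols (with F the foreground and B the background):

```latex
% Original objective:  \sum_{v \in F} a_v + \sum_{v \in B} b_v - \sum_{v \in F,\, w \in B} p_{vw}
% Subtracting the constant C = \sum_v (a_v + b_v) gives
\sum_{v \in F} a_v + \sum_{v \in B} b_v - \sum_{v \in F,\, w \in B} p_{vw} - C
  = -\Bigl(\sum_{v \in F} b_v + \sum_{v \in B} a_v + \sum_{v \in F,\, w \in B} p_{vw}\Bigr),
% so maximizing the original objective is equivalent to minimizing
%   \sum_{v \in F} b_v + \sum_{v \in B} a_v + \sum_{v \in F,\, w \in B} p_{vw}.
```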
Okay. That changed things around a bit. What do we do now?
Well, the thing to note is that we want to split the vertices into two sets.
And we pay a cost, and the cost is mostly based on the boundary between these two sets: the pairs that are split across the boundary are where we pay the penalty.
And this looks like kind of a familiar problem. This looks a lot like a minimum cut problem.
So to make this all formal, let's try and build a network so that this is a minimum cut problem.
The first thing we have to do is add two new vertices, a source and a sink.
Now, we add edges from a source to vertex v with capacity av and an edge from v to t with capacity
bv. We also add an edge from v to w with capacity pvw. And this gives us a network.
Now, if we have a cut in this network, the cut contains s and not t, and then some of the pixels but not others.
Now, what's the size of this cut?
Well, if v is inside our cut, there's an edge from v to t with capacity bv.
If v is not in our cut, there's an edge from S to v with capacity av. And then if v is in our cut but w isn't,
there's an edge from v to w with capacity pvw.
But if you stare at this for a bit, you'll note that if we let the foreground be the pixels inside the cut and the background be the pixels outside it, this is exactly the quantity that we're trying to minimize.
So the original problem of this image segmentation boils down exactly to solving this minimum cut
problem. And now, maybe we don't know directly how to solve mincut, but we know that mincut is
equal to maxflow.
And so we're just going to use our maxflow algorithms. We're going to construct this network,
compute the maximum flow, and then find the corresponding minimum cut.
So the algorithm for image segmentation is really not that hard. You construct the corresponding
network G.
You then compute a max flow f for G using Edmonds-Karp or whatever other algorithm you want. Then we need to find the corresponding cut, so we compute the residual network and let C be the collection of vertices reachable from the source in this residual network.
Then the foreground should just be C, and the background should be everything else. That is the optimal solution to our image segmentation.
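Here is a minimal end-to-end sketch of the procedure just described on a toy instance: build the network, run BFS-based augmenting paths to a max flow, and read the foreground off as the set of vertices reachable from the source in the residual graph. The dictionary-based representation and names are my own choices:

```python
from collections import deque

def segment(a, b, p):
    """a[v], b[v]: foreground/background likelihoods; p[(v, w)]: separation
    penalty for v in foreground and w in background. Returns the foreground."""
    # network: s -> v with capacity a[v], v -> t with capacity b[v],
    # and v -> w with capacity p[(v, w)]
    residual, nodes = {}, set(a) | {'s', 't'}
    for v in a:
        residual[('s', v)] = a[v]
        residual[(v, 't')] = b[v]
    for (v, w), pen in p.items():
        residual[(v, w)] = residual.get((v, w), 0) + pen
    for (u, v) in list(residual):
        residual.setdefault((v, u), 0)  # reverse edges
    # push flow along BFS-shortest augmenting paths until none remain
    while True:
        parent, queue = {'s': None}, deque(['s'])
        while queue and 't' not in parent:
            u = queue.popleft()
            for v in nodes:
                if v not in parent and residual.get((u, v), 0) > 0:
                    parent[v] = u
                    queue.append(v)
        if 't' not in parent:
            break
        path, v = [], 't'
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[e] for e in path)
        for (u, v) in path:
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck
    # min cut: vertices reachable from s in the residual graph = foreground
    reachable, queue = {'s'}, deque(['s'])
    while queue:
        u = queue.popleft()
        for v in nodes:
            if v not in reachable and residual.get((u, v), 0) > 0:
                reachable.add(v)
                queue.append(v)
    return reachable - {'s'}
```

On two pixels where pixel 0 strongly prefers the foreground, pixel 1 strongly prefers the background, and a small penalty links them, the cut correctly splits them.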
And so in summary, we started with this basic problem in image processing, we found a nice
mathematical formulation, and then we noted that it looked a lot like minimum cut. And we're able to
construct a network, use the relationship between maxflow and mincut, and then use our existing
maximum flow algorithm to just solve this problem. And so this is one final application of these flow
algorithms that we've been discussing. That's really all that we have to say for the moment about
these flows and network algorithms. Come back next time, we'll start another unit on linear
programming problems, where we'll discuss some problems that are actually somewhat more general
than the ones we've been discussing here that turn out to be very useful in practice. So I hope to see
you then.

QUIZ • 10 MIN

Flow Algorithms

TOTAL POINTS 5
1.Question 1
Which vertices are in the minimum S-T cut in the network below?
1 point

S
T

2.Question 2
What is the augmenting path that will be used by the Edmonds-Karp algorithm to increase the flow
given below?

1 point

S-B-A-C-D-T

S-B-T
S-B-D-C-T

S-A-C-T

S-B-A-C-T

3.Question 3
Which of the statements below is true?

1 point

The Ford-Fulkerson algorithm runs in polynomial time on graphs with unit edge capacities.

The Edmonds-Karp algorithm is always faster than the Ford-Fulkerson algorithm.

The sum of the capacities of the edges of a network equals the sum of the capacities of the edges
of any residual network.

4.Question 4
What is the size of the maximum matching of the following graph?
1 point

5.Question 5
Consider the image segmentation problem on a picture that is given by an n by n grid of pixels.
Suppose that separation penalties are imposed only for adjacent pairs of pixels. If we use the
Edmonds-Karp algorithm to solve this problem as described in class, the final runtime is O(n^a)
for some a. What is the best such a?

Programming Assignment 1
Week 2
Advanced Algorithms and Complexity

Linear Programming

Linear programming is a very powerful algorithmic tool. Essentially, a linear programming problem asks
you to optimize a linear function of real variables constrained by some system of linear inequalities. This is
an extremely versatile framework that immediately generalizes flow problems, but can also be used to
discuss a wide variety of other problems from optimizing production procedures to finding the cheapest
way to attain a healthy diet. Surprisingly, this very general framework admits efficient algorithms. In this
unit, we will discuss some of the importance of linear programming problems along with some of the tools
used to solve them.
Key Concepts
 Generate examples of problems that can be formulated as linear programs.
 Interpret linear programming duality in the context of various linear programs.
 Solve systems of linear equations.
 Compute optimal solutions to linear programs.
 Illustrate convex polytopes.


Slides and Resources on Linear Programming


Reading: Slides and Resources on Linear Programming

. Duration:10 min

Introduction

Video: LectureIntroduction

. Duration:5 min


Video: LectureLinear Programming

. Duration:8 min

Video: LectureLinear Algebra: Method of Substitution

. Duration:5 min

Video: LectureLinear Algebra: Gaussian Elimination

. Duration:10 min
Basic Tools

Video: LectureConvexity

. Duration:9 min

Video: LectureDuality

. Duration:12 min

Video: Lecture(Optional) Duality Proofs

. Duration:7 min

Algorithms

Video: LectureLinear Programming Formulations

. Duration:8 min

Video: LectureThe Simplex Algorithm

. Duration:10 min

Video: Lecture(Optional) The Ellipsoid Algorithm

. Duration:6 min

End of Module Quiz


Quiz: Linear Programming Quiz

5 questions

Due Aug 2, 11:59 PM PDT

Programming Assignment


Programming Assignment: Programming Assignment 2

. Duration:3h

Slides and Resources on Linear Programming



Slides
16_LP_1_1_introduction.pdf
16_LP_1_2_LP.pdf
16_LP_2_GaussianElimination.pdf
16_LP_2_1_Substitution.pdf
16_LP_2_2_GaussianElimination.pdf
16_LP_3_Convexity.pdf
16_LP_4_1_Duality.pdf
16_LP_4_2_DualityProofs.pdf
16_LP_5_Formulations.pdf
16_LP_6_simplex.pdf
16_LP_7_ellipsoid.pdf

Reading
Chapter 7 in [DPV], Chapter 29 in [CLRS].

[DPV] Sanjoy Dasgupta, Christos H. Papadimitriou, and Umesh V. Vazirani. Algorithms. McGraw-
Hill, 2008.

[CLRS] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein.
Introduction to Algorithms (3. ed.). MIT Press, 2009.
Introduction
Hello, everybody, welcome back to our course on advanced algorithms and complexity. Today we're
starting a new unit, we're starting to talk about linear programming problems. And in particular today
we're going to give just a simple example of the sort of problem that we will be trying to solve during
this unit.
So imagine that you're running a widget factory and you'd like to optimize your production
procedures in order to save money.
Now, you can make these widgets using some combination of machines and workers. You have only 100 machines in stock, so you can't use more than that, but you can hire an unlimited number of workers. However, each machine that you use requires two workers to operate it. Workers not assigned to a machine can build widgets on their own, but each machine in use ties up two of them.
Now in addition to this, each machine that you use makes a total of 600 widgets a day. And each
worker that's not currently involved in operating a machine makes 200 widgets a day.
Finally, the total demand for widgets is only 100,000 widgets a day. So if you make any more than
this, they just won't sell, and that's no good for anybody.
So let's write these constraints down in a reasonable way. If we let W be the number of workers and M the number of machines, we have a bunch of constraints. The number of workers should be non-negative, and the number of machines should be between 0 and 100. The number of workers needs to be at least twice the number of machines. And finally, 100,000 must be at least 200 times the number of unoccupied workers, that's W minus 2M, plus 600 times the number of machines. These constraints determine which combinations are allowable. Now we can try to graph them. Here we've got a plane of possible values of M and W that satisfy these constraints. Starting with the requirement that M and W both be non-negative, we have this quadrant of allowable values. When we require that M be at most 100, we're reduced to this strip. When we look at the constraint based on the total demand, we find that M + W is at most 500, and so we're now constrained to this region. And when we add the final constraint, that the number of workers be at least twice the number of machines, we finally arrive at this diagram of possible configurations of machines and workers that we can use.
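Collected in symbols, the constraints just described are:

```latex
W \ge 0, \qquad M \ge 0, \qquad M \le 100, \qquad W \ge 2M,
200\,(W - 2M) + 600\,M \le 100{,}000
\quad\text{(equivalently } M + W \le 500\text{)}.
```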
What's next? Profit. Suppose that profits are determined as follows: each widget that you make earns you $1, but each worker that you hire costs you $100 a day. The total profit, in dollars per day, is the number of widgets, which is 200 times (workers minus twice machines) plus 600 times the number of machines, minus the total salaries you paid to workers, 100 times the number of workers. That works out to 100 times the number of workers plus 200 times the number of machines. If we want to plot that on our graph, we can do it as follows. These lines that I've drawn are lines of equal profit: a line at $30,000 a day, then $40,000 a day, then $50,000 a day. And as you go from left to right, or from bottom to top, you make more profit.
So what we're trying to do now is we're trying to say well, what can we do to get the most profit? And
it turns out, the best you can do is at this point here. Note that it's a corner of the allowable region,
it's where we have 100 machines and 400 workers. And the total profit is $60,000 a day.
Now it's clear from this diagram that this is the best that you can do. But if you actually want to prove
it, there's a clever way you can do that.
So two of the constraints that we have, one of them is that the number of machines is at most 100.
And another is that 200 times the number of machines plus 200 times the number of workers is at
most 100,000. Now if we take 100 times the first constraint and add it to a half times the second
constraint, what you find is that 200 times the number of machines plus 100 times the number of
workers has to be at most 60,000. And that says the profit that we make has to be at most 60,000.
And so this is a very convenient way to prove that the 60,000 that we could attain is actually the best
we can do. So in summary, what we did is we solved this problem where we maximized this function,
200M + 100W. Subject to the following list of five constraints.
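Since the optimum of this problem (as the picture suggests) sits at a corner of the feasible region, the two-variable case can be solved by brute force: intersect every pair of constraint boundary lines and keep the best feasible corner. This is only a sketch for illustration, not how real LP solvers work:

```python
from itertools import combinations

# Factory constraints written in the form a . (M, W) >= b.
constraints = [
    ((1, 0), 0),              # M >= 0
    ((0, 1), 0),              # W >= 0
    ((-1, 0), -100),          # M <= 100
    ((-2, 1), 0),             # W >= 2M
    ((-200, -200), -100000),  # widgets 200(W - 2M) + 600M <= 100,000
]

def intersect(c1, c2):
    """Corner candidate: solve the 2x2 system where both constraints are tight."""
    (a1, b1), (a2, b2) = c1, c2
    det = a1[0] * a2[1] - a1[1] * a2[0]
    if abs(det) < 1e-12:
        return None  # parallel boundary lines, no single intersection point
    return ((b1 * a2[1] - a1[1] * b2) / det,
            (a1[0] * b2 - b1 * a2[0]) / det)

def feasible(p):
    return all(a[0] * p[0] + a[1] * p[1] >= b - 1e-9 for a, b in constraints)

def best_corner():
    best = None
    for c1, c2 in combinations(constraints, 2):
        p = intersect(c1, c2)
        if p is not None and feasible(p):
            profit = 200 * p[0] + 100 * p[1]  # the objective from the lecture
            if best is None or profit > best[0]:
                best = (profit, p)
    return best

profit, (m, w) = best_corner()  # $60,000/day at M = 100, W = 400
```

Enumerating pairwise intersections works here because an LP optimum, when one exists, is always attained at a vertex of the feasible polygon.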
And because the thing we're trying to maximize is a linear function, and the constraints we have are
linear inequalities, this makes this an example of the type of problem we're going to be looking at.
That is, a linear program. So come back next lecture and we'll sort of formally define this problem and
get us started on our investigation.
Linear Programming
Hello everybody, welcome back to our unit on linear programming. Today, what we're going to do is
we're going to put everything on a more solid, rigorous basis. So remember, last time what we did is we had this factory problem, where what we wanted to do was maximize, in terms of M and W, 200M + 100W, which is a linear expression, subject to a list of linear inequalities that had to be satisfied. And in general, this is basically what linear programming is. It says we want to find real numbers x1 through xn that satisfy a bunch of linear inequalities, so a11x1 + a12x2 + ... is at least b1, and then a bunch more of those. And subject to these constraints, we would like a linear objective function, v1x1 + v2x2 + etc., to be as large or possibly as small as possible.
To clean up the notation a bit, we're really going to store this by having a matrix A that encodes the
coefficients of all these inequalities along with vectors b and v.
And our output should be a vector x in Rn such that A times x is at least b. And what I mean by this is that if you multiply the matrix A by the vector x, you get a new vector whose first component is at least the first component of b, whose second component is at least the second component of b, and so on and so forth. And note that if you just unroll what that means, it's
exactly the system of linear inequalities that we had on the previous slide.
Now, subject to this constraint, we would like v · x to be as large or as small as possible.
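To make the matrix form concrete, here is a minimal sketch in plain Python (the helper names are my own, not standard) that encodes the factory problem as A, b, and v, and checks the componentwise condition Ax ≥ b at a candidate point:

```python
def mat_vec(A, x):
    """Multiply a matrix (a list of rows) by a vector."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def satisfies(A, x, b):
    """The componentwise condition A x >= b."""
    return all(lhs >= rhs for lhs, rhs in zip(mat_vec(A, x), b))

# The factory problem over (M, W), with every inequality flipped into ">=" form.
A = [[1, 0], [0, 1], [-1, 0], [-2, 1], [-200, -200]]
b = [0, 0, -100, 0, -100000]
v = [200, 100]  # objective coefficients

x = [100, 400]  # the optimal corner from the last lecture
assert satisfies(A, x, b)
objective = sum(vi * xi for vi, xi in zip(v, x))  # 200*100 + 100*400 = 60,000
```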
So linear programming turns out to be incredibly useful, because there are an extraordinary number of problems that can be put into this framework.
To begin with the factory example that we solved in the last lecture was exactly of this form. Optimize
a linear function with respect to some linear inequality constraints. But there's a ton more problems
that fit into this. One of them is the diet problem, which was studied by George Stigler in the 1930s
and 40s. And intuitively, it's a very simple problem. How cheaply can you purchase food for a healthy
diet? This is really important if, say, you need to feed an army full of soldiers and you want to do it cheaply.
So how do you do this? Well, you've got a whole bunch of variables, one for every type of food that you could
possibly eat. You need to know how many servings per day of that food you're going to have. So
you've got a variable x of bread, x of milk, x of apples, so on and so forth.
And then you've got a bunch of constraints. Firstly, for each type of food, you need to have a non-
negative number of servings of that food. It wouldn't do to have minus three servings of bread a day.
And additionally, you need to have enough nutritional content; you need sufficiently many calories per day. How many calories do you have? That's just (calories per serving of bread) · x_bread + (calories per serving of milk) · x_milk + so on and so forth, over every type of food. And this should be at least 2,000, or whatever your minimum calories per day for your diet is. And in addition to this constraint, we have another similar-looking constraint for each other nutritional need: vitamin C, protein, what have you.
And so we have a bunch of linear inequalities as our constraints on these variables.
And subject to these constraints, we want to minimize our total cost. And the cost of our diet is (cost per serving of bread) · x_bread + (cost per serving of milk) · x_milk + so on and so forth.
So we want to minimize a linear function of these variables, subject to a bunch of linear inequality
constraints. This is a linear program. You can solve it, and it would tell you, in some sense, the cheapest diet that you could live on. Unfortunately, I should warn you that actually doing this is maybe not the best idea. When you solve this, the solution tends to optimize for a few very efficient foods for getting calories and protein, and then maybe a few random things to fill in your other dietary needs very cheaply. It might say that you should eat mostly potatoes and peanut butter
and then a bunch of vitamin pills or something.
And so, these tend not to produce diets you'd actually want to subsist on entirely. But maybe if you want to think about what you can do to eat more cheaply, it's something to look at.
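A sketch of how one might assemble the diet LP's data. All of the foods, costs, and nutrient numbers below are invented for illustration, not real nutrition data:

```python
# Hypothetical per-serving data; real values would come from a nutrition table.
foods = {
    "bread":  {"cost": 0.30, "calories": 250, "protein": 8},
    "milk":   {"cost": 0.60, "calories": 150, "protein": 12},
    "apples": {"cost": 0.50, "calories": 95,  "protein": 0},
}
requirements = {"calories": 2000, "protein": 50}

def build_diet_lp(foods, requirements):
    """Return (A, b, costs) for: minimize costs . x subject to A x >= b."""
    names = sorted(foods)
    A, b = [], []
    # One ">=" row per nutritional requirement.
    for nutrient, minimum in sorted(requirements.items()):
        A.append([foods[f][nutrient] for f in names])
        b.append(minimum)
    # Non-negativity: one row x_f >= 0 per food.
    for i in range(len(names)):
        A.append([1 if j == i else 0 for j in range(len(names))])
        b.append(0)
    costs = [foods[f]["cost"] for f in names]
    return A, b, costs
```

Feeding the resulting A, b, and costs to any LP solver would then give the cheapest diet under these (made-up) numbers.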
So another problem that fits very nicely in this linear programming formulation is network flow. It
turns out that the network flow problems that we discussed in the last lecture are actually just a
special case of linear programming problems. So if you want to solve max flow, then you've got a bunch
of variables, f sub e, the flow along each edge e.
And they satisfy some constraints: for each edge e, f sub e is between 0 and the capacity of the edge.
And then for every vertex that's not a source or a sink, you have conservation of flow. So the total
flow into that vertex is the same as the total flow out of that vertex.
Now when you first look at this, this might not seem to be an inequality, it's an equality, not an
inequality. But you could actually just write it by writing down two linear inequalities. You could say
the flow into the vertex is at least the flow out of the vertex. And on the other hand, the flow into the
vertex is at most the flow out of the vertex. And if we put these two inequalities together, it's equivalent to this one equality.
So once we put these constraints on, we now have an objective function: subject to these constraints, we'd like to maximize the flow. That's the total flow going out of sources minus the total flow going into sources, which is a nice linear function. And so the max flow problem is just a special case; when you phrase it this way, it's exactly a linear program.
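The two kinds of constraints, capacity and conservation, can be checked directly for any candidate flow. A minimal sketch with a hypothetical helper, run on a toy network:

```python
def is_valid_flow(capacity, flow, source, sink):
    """capacity and flow both map directed edges (u, v) to numbers.
    Checks 0 <= f_e <= c_e and conservation at every internal vertex."""
    # Capacity constraints on every edge.
    for e, cap in capacity.items():
        if not 0 <= flow.get(e, 0) <= cap:
            return False
    # Conservation: flow in equals flow out, except at the source and sink.
    vertices = {u for u, v in capacity} | {v for u, v in capacity}
    for w in vertices - {source, sink}:
        inflow = sum(flow.get((u, v), 0) for (u, v) in capacity if v == w)
        outflow = sum(flow.get((u, v), 0) for (u, v) in capacity if u == w)
        if inflow != outflow:
            return False
    return True

# Toy network: s -> a -> t plus a direct edge s -> t.
capacity = {("s", "a"): 2, ("a", "t"): 2, ("s", "t"): 3}
good = {("s", "a"): 2, ("a", "t"): 2, ("s", "t"): 3}  # value 5
bad = {("s", "a"): 2, ("a", "t"): 1, ("s", "t"): 3}   # leaks flow at vertex a
```

Each `if` in the helper corresponds to one linear inequality of the LP formulation above.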
Now a lot of the time, when you look at a linear program, it's exactly what I said. You're subject to
these constraints, there's a unique optimum which attains the best possible value of the objective function. However, there are a couple of edge cases that you need to keep in mind where
things don't quite work out this way.
The first is you could actually have a system where there is just no solution. You could have constraints saying x is at least 1, y is at least 1, and x + y is at most 1. If you have the system of constraints which is
graphed here, there's actually no solution. Because if x and y are each at least 1, x + y needs to be at
least 2. So there's no solution to the system, so you can't even start with trying to find a maximum.
It could also be the case that even though your system has solutions there's no actual optimum.
And a way that this could happen is as follows. If you have the system where x is at least 0, y is at least 0, and x − y is at least 1, there's actually no maximum value for x here. The region is graphed here, but you can go higher and higher up; x is actually unbounded in this system. And so in some sense, your solution should say that there's no maximum.
Now just to review this I've got three pretty simple systems here, each with two equations and two
unknowns. Now one of these systems has no solution. One of them has solutions but no solution with
a maximum x value, it's unbounded. And the third one actually does have a unique maximum x value.
And so I'd like you to take a little while to think about which one of these is which.
Okay, so if we actually graph these three systems, A it turns out has no solution. One equation says
we're supposed to be bigger than one, the other one says we have to be less than zero, you can't do
both of those. B, if you write it down, does have a unique maximum, at x equals one and a half I think, as plotted there; it's the red point. And C, although it does have plenty of solutions: if you graph this region, you'll note that as you slide up along this line, x equals y, you can make the x value as large as you want.
And so there are solutions, but there's no max. In any case, that's all that I had to say about this basic introduction to linear programs. Come back next time and we'll start by looking at a special case: dealing with linear equalities rather than inequalities.
Linear Algebra: Method of Substitution
Hello everybody, welcome back. Today we're talking more about linear programming, well actually
we're not. We're looking at sort of a simpler problem first. So linear programming talks about dealing
with systems of linear inequalities. Today we're going to look at sort of a simple special case of this
where we look at systems of linear equalities. So for example we have a system of linear equalities x +
y = 5, 2x + 4y = 12 and we'd like to be able to solve this for x and y. So the very general way to do this
is by what's known as the method of substitution. You use the first equation to solve for one variable
in terms of the others. You then take that variable and substitute it into the other equations. You now
have a bunch of equations with n-1 variables and you recursively solve those. Then once you have the
answer to those equations, you substitute them back into the first equation to get the value of the
initial variable. Okay, let's see how this works in practice. So, x + y = 5. 2x + 4y = 12. Using the first
equation, we solve x as 5- y.
We substitute that into the second equation and we find that 12 = 10 + 2y. Solving that for y, we find
out that y = 1. And substituting back into the first equation, x = 5 − 1 = 4. So x = 4, y = 1; that's the solution to the system.
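The three steps of the method, solve for x, substitute, back-substitute, can be written out directly for a two-variable system. A small sketch (assuming a1 ≠ 0 so the first equation can be solved for x, and that the system has a unique solution):

```python
def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 by substitution.
    Assumes a1 != 0 and that the system has a unique solution."""
    # Step 1: from the first equation, x = (c1 - b1*y) / a1.
    # Step 2: substitute into the second equation and solve for y.
    y = (c2 - a2 * c1 / a1) / (b2 - a2 * b1 / a1)
    # Step 3: substitute y back into the first equation to recover x.
    x = (c1 - b1 * y) / a1
    return x, y

assert solve_2x2(1, 1, 5, 2, 4, 12) == (4, 1)  # the worked example above
```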
Now, just to make sure we're on the same page, if we have the system x + 2y = 6, and 3x- y = -3, what
is the value of x in the solution to that system?
Well the answer is 0 here. So from the first equation, we get x = 6- 2y. Substituting into the second,
we get that -3 is 18-7y. Solving that tells us that y = 3, so x = 6- twice 3 = 0. And that's the answer.
Okay, so that was our first example. Let's look at another example. We have a system of linear
equations x + y + z = 5. 2x + y- z = 1.
So we solve this by substitution, great. From the first equation, x = 5 − y − z. We substitute that into the second equation, which gives 10 − y − 3z = 1, and we solve for y. We find that y = 9 − 3z. Great, we now know what y is, and we want to solve for z, but we can't. There are no equations left. We've already used the first equation to solve for x and the second to solve for y. We can't solve for z because there's nothing left. But this is actually fine for us. It turns out that any value that we assign z will give us an actual solution. You give me any value for z, we set y = 9 − 3z, and then x is 5 − y − z, or −4 + 2z. And any value of z gives us a solution. So there's an entire family of solutions. We can let z be a free variable, and for any value of z, we have a unique solution.
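A one-parameter family like this is easy to check numerically. Working the substitution through carefully gives y = 9 − 3z and hence x = −4 + 2z, and substituting back satisfies both equations for every z:

```python
def family(z):
    """One solution of x + y + z = 5 and 2x + y - z = 1 for each value of z."""
    y = 9 - 3 * z
    x = 5 - y - z  # equals -4 + 2z
    return x, y, z

# Every choice of the free variable z gives a genuine solution.
for z in [-2, 0, 0.5, 3, 10]:
    x, y, z = family(z)
    assert x + y + z == 5
    assert 2 * x + y - z == 1
```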
So in general, your solution set will not necessarily be a point, but it will be a subspace. You'll have
some free variables and no matter what settings you give those, your other variables will be functions
of your free variables. Now this subspace has a dimension which is just the number of free variables,
the number of parameters you need to describe a point on it. And generally speaking, each equation
that you have gives you one variable in terms of the others. And so generally speaking, the dimension of your set of solutions is going to be the total number of variables minus the number given in terms of others, that is, the total number of variables minus the number of equations. So generally speaking, if you
have n equations and n unknowns, there'll be no free variables left and you'll have a unique solution.
However, if you have n+1 equations and n unknowns, the first n of your equations solve for the unique solution, and then the extra equation probably is something that isn't satisfied by that solution. So generally, if you've got too many equations, there are no solutions to the system. However, if you have n − 1 equations and n unknowns, you generally solve those and you'll still have one free variable left. So generally speaking, you'll have a full one-dimensional subspace; you'll have a line as your solution instead of just a point. Okay, so in summary,
we can solve systems of linear equations using the method of substitution. And generally speaking,
and this isn't always the case. But generally, each equation reduces the number of degrees of
freedom by one. Now if all you want to do is solve systems of linear equations, you could basically
stop here. But we want to do more than that. So next time, what we're going to do is talk about how to systematize this whole thing and simplify the notation some, to make this into an honest algorithm. How to tell your computer to solve systems of linear equations: that is what we're going to talk about in the next lecture.
So I'll see you then.
Linear Algebra: Gaussian Elimination
Hello everybody, welcome back to our linear programming unit. Today we're going to talk about Gaussian elimination.
So the basic idea is that last time we talked about how to solve linear systems by substitution. Today,
we're going to make it into an algorithm. So remember last time we could solve systems of linear
equations using this method of substitution. And to begin with, we'd like to simplify the notation a little bit, because the way we did it, you had to write down these full equations, x + y = 5, 2x + 4y = 12; these have variables and addition signs and equality signs and all this mess. The only thing that really matters are the coefficients of these equations.
So what we're going to do is we're going to simplify notation, and just store these coefficients in
what's known as an augmented matrix. That's a matrix with this little bar coming down in the middle
of the entries. So here, each row is going to correspond to a single equation, and the entries in that row are going to be the coefficients. So the first row, 1, 1, 5, translates to 1 times x plus 1 times y equals 5, that is, x plus y equals 5. The second row, 2, 4, 12, means that 2x plus 4y equals 12. And so, this little
matrix is sort of just a convenient way of storing that system of linear equations.
Now, one complication that this method runs into is when we're storing things in this matrix. How do
we implement substitution? How do we solve for x? For example, there's a sense in which we can't
write a row that corresponds to the equation x = 5- y. Every row corresponds to an equation where x
and y are on the same side of the equality. So this sort of doesn't work.
On the other hand, the row 1, 1, 5 is almost as good, it corresponds to the equation x + y = 5 which is
equivalent. The next question we ask ourselves is how do we substitute this into the second
equation? Because once again, the immediate thing you get when you substitute has x's and y's still on the same side, I guess, but it's got constants on the wrong side of the equation now.
And so you can't substitute directly, but really you can do something almost as good. The point of the substitution was just to get rid of the x's in that second equation, and you can do that by subtracting. If you subtract twice the equation (x + y = 5) from the equation (2x + 4y = 12), that tells you that (2y = 2), which is exactly what you would have gotten from substitution. And this corresponds to a very nice operation on the matrix rows: you just subtract twice the first row from the second to get the row corresponding to this equation.
Okay, so we're given this augmented matrix and what we're going to do is we're going to manipulate
it. We're going to use what are called Basic Row Operations. These are ways of transforming your
matrix to give you an equivalent system of equations.
Now the first piece is addition, just what we just saw. You add or subtract a multiple of one row from
another.
So for example, we subtract twice the first row from the second and 2, 4, 12 becomes 0, 2, 2 which is
good. Next off though we have 2y = 2, we want to change that to y = 1, so to do that we need to use
scaling. We need to multiply or divide a row by a non-zero constant and that's just multiplying the
equation by a constant. So we should be good. So if we divide the second row by two instead of
getting 0, 2, 2 it becomes 0, 1, 1 y equals 1. Now in some sense these two operations are enough but
for bookkeeping purposes we might want to reorder the way in which we list the rows. So a final
operation is to swap the order of two of the rows. So just list the second row up top and the first row
down bottom. This clearly doesn't really affect anything; we're just sort of listing our equations in
another order.
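Each of the three row operations is a one-line transformation of the matrix. A sketch, replayed on the example x + y = 5, 2x + 4y = 12:

```python
def add_multiple(M, src, dst, factor):
    """Row addition: add factor times row src to row dst, in place."""
    M[dst] = [d + factor * s for d, s in zip(M[dst], M[src])]

def scale(M, row, factor):
    """Scaling: multiply a row by a non-zero constant, in place."""
    M[row] = [factor * v for v in M[row]]

def swap(M, i, j):
    """Swap two rows, in place."""
    M[i], M[j] = M[j], M[i]

# Augmented matrix for x + y = 5, 2x + 4y = 12.
M = [[1, 1, 5], [2, 4, 12]]
add_multiple(M, src=0, dst=1, factor=-2)  # row 2 becomes [0, 2, 2]: 2y = 2
scale(M, 1, 0.5)                          # row 2 becomes [0, 1, 1]: y = 1
```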
So these are the three basic row operations. And what we're going to do is we're going to combine
them into an algorithm called row reduction, or sometimes Gaussian elimination. We're going to use
these operations to put our matrix in a simple standard form.
And the idea is actually very simple, we're just going to simulate the method of substitution.
Okay, so let's consider this example. We have a big matrix here, it corresponds to a system of three
equations in four unknowns, which I'll call x, y, z and w and we'd like to solve them.
So method of substitution, what do we do? We use the first equation to solve for the first variable x.
And in some sense this first equation 2x + 4y- 2z = 2 already implicitly solves for x in terms of the
other variables. But for simplicity, we'd like to rescale it so that the x entry is 1, so we divide the row by two. And now it says x + 2y − z = 1 or, equivalently, x = 1 − 2y + z. We now want to substitute that
value into the other equations, which we do by adding this row to others, to clear out the x entries.
So we add the first row to the second and subtract twice the first row from the third.
And now there are no other entries in that column in any of our other rows.
Okay, so now we're done with x; we want to solve for the next variable, y, using the two other equations.
We actually can't use the second to solve for y, because there's no y in that equation; the entry is 0.
But we can use the third equation. So, for bookkeeping purposes, we're going to swap the second and third rows, just so that the row we're using to solve for y is up top.
We now want to solve for the second variable, by which we mean rescale so that its entry is 1, so we divide by −2.
And now, we want to substitute this value into the other equations.
So we subtract twice the second row from the first and actually the third row is okay.
Now the thing to note is we can't actually solve for z. The last equation doesn't have z. The first two
equations have z terms in them, but we've already used those equations to solve for x and y.
So actually, what we're going to do here, is we're going to skip Z and move on to W.
We can solve for w, so we divide the last row by minus 2. We get the equation w = 0, and we substitute that into the other equations: we subtract twice the third row from the first and then add the third row to the second. And now we're actually done; this is basically as simple as our matrix is going to get.
But now that we have this, it's actually very easy to read off the solutions. We have this matrix; it corresponds to the equations x + z = −1, y − z = 1, and w = 0. And here you can basically read off the solution. For any value of z, we have a solution x = −1 − z, y = 1 + z, and w = 0. And that's the general solution.
Great, so how do we row reduce a matrix in general? What we're going to do is just take the leftmost non-zero entry; this is the thing we're going to solve for. We swap that row to the very top of the matrix, just for bookkeeping purposes. That leftmost entry we're going to call a pivot, and we're going to use it to clear out the other entries in its column. We rescale the row to make that entry one, because we want to actually solve for that variable.
We then subtract that row from all the others to make the other entries in that column 0.
Then we just substitute that into the other equations, and we're going to repeat this, with slight modifications. We want the leftmost non-zero entry not already in a row with a pivot;
we then swap that row to be the top of the non-pivot rows.
We make the new entry a pivot, we rescale, we clear out the rest of its column, and we keep repeating until all of the non-zero entries are in pivot rows, and then we're done. Once you have that, it's actually pretty easy to read off the answer. Each row is going to have one pivot entry and maybe a few other non-pivot entries.
This will give an equation that writes the pivot variable in terms of a bunch of non-pivot variables.
Now, there's a special case: if there's a pivot in the constants column (the column after the bar), that means that we have the equation 0 = 1. If that happens, we have a contradiction, and there are no solutions.
But otherwise, the non-pivot variables, the variables corresponding to columns with no pivot in them, are actually free variables. We can set those to whatever we want, and once we've done that, each of our rows tells us what the pivot variables should be in terms of the non-pivot variables. And that's it.
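The procedure just described can be sketched as one short routine. This is a bare-bones version for small augmented matrices, not a numerically careful implementation (practical ones pivot on the largest available entry to control rounding error):

```python
def row_reduce(M, tol=1e-9):
    """Row-reduce the augmented matrix M (a list of rows) in place.
    Returns the list of pivot column indices."""
    rows, cols = len(M), len(M[0])
    pivots, r = [], 0  # r is the row where the next pivot will live
    for c in range(cols - 1):  # the last column holds the constants
        # Find a row at or below r with a non-zero entry in column c.
        pr = next((i for i in range(r, rows) if abs(M[i][c]) > tol), None)
        if pr is None:
            continue  # no pivot in this column: its variable is free
        M[r], M[pr] = M[pr], M[r]            # swap it up, for bookkeeping
        M[r] = [v / M[r][c] for v in M[r]]   # rescale so the pivot is 1
        for i in range(rows):                # clear the rest of the column
            if i != r and abs(M[i][c]) > tol:
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
        if r == rows:
            break
    return pivots

M = [[1, 1, 5], [2, 4, 12]]
row_reduce(M)  # M becomes [[1, 0, 4], [0, 1, 1]]: x = 4, y = 1
```

Run on a system with a free variable, say x + y + z = 5 and 2x + y − z = 1, it leaves rows encoding x − 2z = −4 and y + 3z = 9, which is exactly the parametric family from the last lecture.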
One final thing to discuss is the runtime of this operation.
If you have n equations and m variables, there are at most the minimum of n and m pivots.
Whenever you find a pivot, you need to subtract a multiple of that row from each other row.
Now, each row has about m entries that you need to update for this subtraction, and there are n rows, so each pivot takes O(nm) time.
And so the total runtime is O(nm · min(n, m)). And this is pretty good; it's polynomial in n and m. You could maybe expect to do a little bit better, and in fact there are sophisticated algorithms that do. But for practical purposes, this is actually a pretty good runtime, and it's a very usable algorithm.
So that's basically all for our linear algebra sidebar. Next lecture, we're going to go back to talking about linear programs: systems of linear inequalities and how to deal with them. So until next time.
Basic Tools
Convexity
Hello everybody. Welcome back to our Linear Programming unit. Today, we're going to talk about
convex polytopes. In particular, we're going to try to understand what the solution set to the system of
linear inequalities that we need to deal with actually looks like. So remember, in a Linear Program
we're trying to Optimize a linear function subject to a bunch of linear inequality constraints. Today,
we're going to ask the question, what does the region of points defined by these inequalities actually
look like? For example, take the factory example that we looked at way back at the beginning. If you look at the set of solutions to those five inequalities, you get this nice trapezoid here. So the question is, what do things look like in general?
Well, another example: if you look at the system where x, y, and z, in three dimensions, are all between zero and one, you've got the unit cube. And in general, you get much more complicated looking regions, but you'll always get what's called a convex polytope. And don't worry, we'll unravel what these words mean as we go. So the first thing to ask is, what does a single linear equation give you? Well, a linear equality defines a hyperplane, an infinite flat surface.
Now, if instead you have an inequality, it gives you what's called a halfspace: a hyperplane and everything on one side of that hyperplane.
So if we want the solutions to a system of linear inequalities, we have a region defined by a bunch of halfspaces; we want the intersection of all these halfspaces. We want everything that's inside all of them; we want to satisfy all of the inequalities.
And so, we sort of get a thing that's defined by these hyperplanes. In fact, what we'll always get is a polytope: a region in Rn that's bounded by finitely many flat surfaces. But, in fact, polytopes have a little bit more structure than that. If you think about the cube, not only do we have the six faces, but these faces intersect at edges, and those intersect at vertices. And so, in a polytope in general, these surfaces may intersect at lower-dimensional facets, like edges or facets of other dimensions, where the zero-dimensional facets are called vertices. But it turns out that not every polytope is actually possible as the set of solutions to such a system of linear inequalities. For example, the donut pictured here is a polytope, but it's not the solution set of one of these systems. Because if you look at some of these inward-pointing faces: well, these faces lie in a hyperplane, but you've got portions of your region on both sides of that hyperplane.
Whereas, if you have a polytope defined by one of these systems of linear inequalities, each bounding hyperplane is actually coming from one of those linear inequalities, and you can only have points on one side of that hyperplane.
So you have this extra condition that everything must be on only one side of each face.
And that leads us to the condition of convexity. A region C in Rn is called convex if, for each pair of points x and y in C, the line segment connecting x and y is entirely contained in the region.
So the Lemma is that any intersection of halfspaces, or the solution set of one of these systems, is convex.
And the Proof is not that hard. Our system is defined by Ax at least b.
We need to show that if two points x and y are in this set, then everything on the segment between them is also in the set.
Well, the line segment can be parameterized as the points of the form tx + (1 - t)y, where t is a real number between 0 and 1.
To see that such a point is in our set, we compute A(tx + (1 - t)y) = tAx + (1 - t)Ay. Since x and y are in the set, that's at least tb + (1 - t)b, which is b. And so every point on the line segment is in our set, so the set is convex. Great.
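The proof above can be sanity-checked numerically. Below is a minimal sketch in Python (the system Ax >= b shown is a made-up example, not one from the lectures): it samples points along the segment between two feasible points and verifies that each sampled point satisfies all of the inequalities.

```python
def matvec(A, x):
    """Multiply matrix A (list of rows) by vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def feasible(A, b, x, eps=1e-9):
    """Check Ax >= b componentwise (with a small tolerance)."""
    return all(ax >= bi - eps for ax, bi in zip(matvec(A, x), b))

def segment_stays_feasible(A, b, x, y, steps=100):
    """Sample points t*x + (1-t)*y for t in [0, 1] and verify
    each one satisfies Ax >= b, as the convexity lemma predicts."""
    for k in range(steps + 1):
        t = k / steps
        z = [t * xi + (1 - t) * yi for xi, yi in zip(x, y)]
        if not feasible(A, b, z):
            return False
    return True

# A hypothetical system: x >= 0, y >= 0, x + y >= 1
A = [[1, 0], [0, 1], [1, 1]]
b = [0, 0, 1]
```

Any two feasible points can be plugged in; by the lemma, the whole segment between them stays feasible.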
So the Theorem is the region defined by a system of linear inequalities is always a convex polytope
which is nice.
So, as a quiz: we've got three pictures here. Which of these three regions, A, B, C, is a convex polytope?
Well, it turns out only B is. A is not convex, because we have these line segments whose endpoints are in A but some of whose middle points are not. C is not a convex polytope, because there's this curved region of the boundary here, whereas if it were a polytope, all of the bounding regions would have to be straight lines. B, on the other hand, is actually a convex polytope.
Okay, so to conclude this lecture, we're going to actually prove a couple of important Lemmas about
convex polytopes.
So the first one is Separation. If C is, in fact, any convex region, and x is a point not in C, then it turns out there's always a hyperplane H that separates x from C, with x on one side and C on the other.
Now, if C is given by a system of linear inequalities, this is actually easy to prove: if x isn't in C, it violates one of the defining inequalities, and that inequality gives you a hyperplane with C on one side and x on the other.
But in general, you can prove this as well. You start with x, and let y be the closest point in C to x.
And it turns out you can just take the perpendicular bisector of xy, that is, the hyperplane of points equidistant from x and y.
Now, this is clearly a hyperplane. To show that it separates x from C, suppose that there were some point z in C on the wrong side of this hyperplane.
Well, z and y are both in C, so everything on the line segment between z and y is also in C.
But you can show that there's actually always a point on this segment zy, that's closer to x than y was.
And this is a contradiction, because by assumption y was the closest point in C to x, and we just found a closer one. So that completes the proof.
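For the special case where C is an axis-aligned box, the closest point y is found by simply clamping each coordinate of x, so the construction above can be carried out directly. A small sketch under that assumption (the box and the outside point are hypothetical):

```python
def closest_point_in_box(x, lo, hi):
    """Closest point of the box [lo, hi]^n to x: clamp each coordinate."""
    return [min(max(xi, l), h) for xi, l, h in zip(x, lo, hi)]

def separating_hyperplane(x, y):
    """Perpendicular bisector of segment xy: the points z with n.z = c,
    where n = x - y and c = n . midpoint(x, y)."""
    n = [xi - yi for xi, yi in zip(x, y)]
    mid = [(xi + yi) / 2 for xi, yi in zip(x, y)]
    c = sum(ni * mi for ni, mi in zip(n, mid))
    return n, c

def side(n, c, z):
    """Positive on x's side of the hyperplane, negative on the other."""
    return sum(ni * zi for ni, zi in zip(n, z)) - c

# A point outside the unit box [0,1]^2
x = [2.0, 3.0]
y = closest_point_in_box(x, [0, 0], [1, 1])
n, c = separating_hyperplane(x, y)
```

Evaluating `side` at x and at the box's corners shows x strictly on one side and the whole box on the other, just as the lemma promises.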

Okay, so the other Lemma is about polytopes. Suppose that you have a polytope and a linear function on it that you're trying to minimize or maximize.
The claim is that it takes its minimum or maximum values on vertices.
This is clearly relevant to our linear programs, because we're exactly trying to minimize and maximize linear functions on these convex polytopes. We saw this in our original factory example, where the maximum was at this vertex, and it turns out that happens in general.
Now, to maybe get some intuition for why this is true: our polytope is sort of spanned by its corners; it's got corners and things in between them. But because our function is linear, the things in between the corners are never better than the extreme points, and so the optimum must be at a corner.
Now, to actually prove this, the thing to note is that if you have a linear function defined on a line segment, it always takes its extreme values at the two endpoints.
And we're going to use this to push our point toward the corners, letting the value get bigger and bigger.
So we start at any point in our polytope, and what you do is you draw a line through it. You'll note that the biggest value our linear function takes on this line is where the line hits an end of the polytope. So it takes an extreme value at an endpoint of that line, which is on a face of your polytope.
Now, once you're on the face or some facet, you can repeat this. You draw a line through that point
and what you know is that the extreme values will be at the end points of this line and that lets you
push it to a lower dimensional facet. And you keep doing this until you end up at a vertex.
And so, we started at any point and we kept going until we hit a vertex. And that vertex has at least as large a value as the point we started at.
And so, the maximum values must be attained at some vertex.
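One can illustrate this claim numerically by brute force: take a small polytope given by its vertices (the unit square here, a hypothetical example) and check that a linear objective, evaluated at random convex combinations of the vertices, never beats its best vertex value.

```python
import random

def max_at_vertices(v, vertices, samples=1000, seed=0):
    """For a linear objective v, compare its maximum over the polytope's
    vertices with its value at random convex combinations of the vertices
    (which fill out the polytope). Returns True if no sampled point
    beats the best vertex, as the lemma predicts."""
    dot = lambda a, b: sum(p * q for p, q in zip(a, b))
    best_vertex = max(dot(v, p) for p in vertices)
    rng = random.Random(seed)
    for _ in range(samples):
        # random convex combination of the vertices = a point in the polytope
        w = [rng.random() for _ in vertices]
        s = sum(w)
        p = [sum(wi * vert[j] for wi, vert in zip(w, vertices)) / s
             for j in range(len(v))]
        if dot(v, p) > best_vertex + 1e-9:
            return False
    return True

square = [(0, 0), (1, 0), (0, 1), (1, 1)]
```

The check passes for any objective, since a linear function of a convex combination is the same convex combination of the vertex values, hence at most their maximum.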
So in summary, the region defined by a linear program is always convex. The optimum of a linear program is always attained at a vertex. And finally, if you have a point that's not in the region, you can always separate it from the points inside by an appropriate hyperplane.
So these are some basic facts about linear programs and their solution sets. Come back next time,
we'll talk about another interesting property of linear programs called duality.
Duality
Hello everybody. Welcome back to our Linear Programming Unit. Today we're going to talk about an
interesting phenomenon in linear programs called duality.
So let's recall the first example that we looked at here. We wanted to maximize 200M + 100W subject to a bunch of constraints. Now, it turns out we had a very clever way of proving that the optimum was correct once we found it. The best you could do was 60,000, and there was a great way of proving it. We took one constraint and multiplied it by a hundred, we took another constraint and multiplied it by a half, and we added those together. We got a new constraint: if we satisfied our original constraints, it had to be the case that 200M + 100W, the thing we were trying to optimize, was at most 60,000. And this is a very interesting and general technique: if you want to bound your objective, you can try to do this by combining the constraints that you have to prove a bound.
So let's see what happens in general. You have a linear program; say you want to minimize v1x1 + v2x2 + ... + vnxn.
Subject to a bunch of linear inequality constraints: a11x1 + a12x2 + ... is at least b1, and so on. So how can we try to do this? Well, if you give me any constants ci bigger than 0, you can take the first constraint and multiply it by c1, the second constraint and multiply it by c2, and so on, and add those all up. What you'll get is a new linear inequality, w1x1 + w2x2 + ... is at least t. Here the wi are some combination of the c's: wi is the sum over j of cj aji, and t is the sum over j of cj bj, and this is a new valid inequality. Now, if it's the case that wi is equal to vi for all i, what we have is that v1x1 + v2x2 + ..., the thing we were trying to minimize, is at least t. And so, if we can arrange for wi to equal vi for all i, we've proven a lower bound on the thing that we're trying to minimize. So we'd like to find a bunch of ci's, all non-negative, such that vi is the sum over j = 1 to m of cj aji for all i, and such that, subject to these constraints, t, the sum of cj bj, is as large as possible. We'd like the biggest lower bound we can get. Now, the very interesting thing is that the system we just wrote down is actually just another linear program. We want to find c in Rm such that the sum of cj bj is as large as possible, subject to a bunch of linear inequalities, ci at least 0, and a few linear equalities: vi equals the sum of cj aji.
And so, to put this formally: given any linear program, which we often call the primal, say minimize v.x subject to Ax at least b, there's a dual linear program: maximize y.b subject to y transpose A equals v (that's just another way of rewriting our equality constraints) and y at least 0. And it should be noted that even if your linear program wasn't exactly in this form, you can still write a dual program; it's the linear program of trying to find a combination of the constraints to bound the thing that you're trying to optimize. And so it's not hard to show that a solution to the dual program bounds the optimum for the primal.
Suppose that you have a solution for the dual, you've got a y bigger than or equal to 0, such that y
transpose A is equal to v.
Then for any x where Ax is at least b, well, x dot v is equal to y transpose Ax.
That's at least y transpose b, which is y dot b. And so y dot b, the value of the dual solution, is a lower bound on the solution to the primal.
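This weak-duality argument is easy to check mechanically. A minimal sketch (the instance A, b, v below is made up): given a dual-feasible y and a primal-feasible x, the chain x.v = y^T(Ax) >= y^T b always holds.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def matvec(A, x):
    return [dot(row, x) for row in A]

def weak_duality_holds(A, b, v, x, y, eps=1e-9):
    """If y >= 0, y^T A = v, and Ax >= b, then y.b <= x.v."""
    assert all(yi >= 0 for yi in y)
    # y^T A = v: y combines the rows of A into the objective vector
    yTA = [sum(y[i] * A[i][j] for i in range(len(A))) for j in range(len(v))]
    assert all(abs(a - c) < eps for a, c in zip(yTA, v))
    # x is primal-feasible
    assert all(ax >= bi - eps for ax, bi in zip(matvec(A, x), b))
    return dot(y, b) <= dot(x, v) + eps

# Hypothetical instance: minimize x1 + x2 subject to x1 >= 1, x2 >= 2
A = [[1, 0], [0, 1]]
b = [1, 2]
v = [1, 1]
```

For this instance, y = [1, 1] certifies the lower bound y.b = 3, and the primal point x = [1, 2] attains it, matching the duality theorem stated below.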
Now the surprising thing is that not only is this a way to get lower bounds: these two linear programs actually have the same optimum value. If you find the best solution for the dual program, it always gives you a tight lower bound for the primal.
And the theorem here is linear programming duality, which says that a linear program and its dual have the same numerical answer.
And this is incredibly useful. On the one hand, it says that if you have a linear program and want to prove that your answer is optimal, you can try to solve the dual to provide a matching upper bound or lower bound.
It also means that if all you care about is the numerical answer, you can try to solve the dual program rather than the primal. And often the dual program is easier to solve, so this makes things more convenient. And even if the dual program isn't easier, often looking at the dual gives you some insight into the solution to the primal.
Okay, so that's linear programming duality. Let's look at some examples. For example, let's look at the max flow problem. The size of your flow is the total flow going out of the source minus the total flow going into the source.
Now, we have a bunch of these conservation of flow equations, and we can add any multiples of
those that we like. And the objective stays the same. So when we do that, the thing we're trying to
maximize is the same as the sum over all vertices V of some constant C sub V times the total flow out
of vertex V minus the total flow into vertex V.
Here C sub s needs to be 1 if s is the source, and C sub t needs to be 0 if t is the sink, but for any other vertex v we can take C sub v to be anything we like. Okay, so we have this expression; what do we get when we write this down?
And this is sum over edges from V to W of the flow along the edge times C sub V minus C sub W.
We can now try to bound this from above using our capacity constraints. And so the best we can do here, it's not hard to show, is the sum over edges v to w of the capacity of the edge times the maximum of C sub v minus C sub w and zero.
Okay, so this gives us an upper bound, and we want this upper bound to be as small as possible. It's not hard to show that we should pick our C sub v to always be either zero or one.
Now if we do that, let C be the set of vertices where C sub v equals one.
The bound that we prove then reduces to the sum over edges v to w, where v is in C and w isn't, of the capacity of the edge. But you'll note, C is just a cut, and this bound that we proved is just the size of the cut. And so this dual program, in some sense, is just trying to find the minimum cut. Hence, linear programming duality, in this special case, just gives us max-flow equals min-cut.
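The flow-versus-cut side of this can be illustrated on a tiny hand-built network (hypothetical, not the one from the lectures): any cut separating s from t has capacity at least the value of any feasible flow, and at a max flow the two quantities meet.

```python
def flow_value(flow, source):
    """Net flow out of the source vertex."""
    out_f = sum(f for (u, w), f in flow.items() if u == source)
    in_f = sum(f for (u, w), f in flow.items() if w == source)
    return out_f - in_f

def cut_capacity(cap, C):
    """Total capacity of edges leaving the vertex set C."""
    return sum(c for (u, w), c in cap.items() if u in C and w not in C)

# A hypothetical network and a feasible (in fact maximum) flow on it
cap = {('s', 'a'): 3, ('s', 'b'): 2, ('a', 't'): 2, ('b', 't'): 3, ('a', 'b'): 1}
flow = {('s', 'a'): 3, ('s', 'b'): 2, ('a', 't'): 2, ('b', 't'): 3, ('a', 'b'): 1}
```

Checking every s-t cut confirms each has capacity at least the flow value, and here the minimum cut capacity equals the flow value of 5, so this flow is maximum.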
Okay, let's look at this other problem, for example: the diet problem. Here we want to minimize the total cost of the foods you need to buy, subject to the constraints that the diet meets your daily requirements for various nutrients, and that you get a non-negative amount of each type of food.
So you've got this system, what's the dual program?
Well okay, so for each nutrient N, we have some multiple C sub N of the equation for that nutrient. And then we can add on positive multiples of the constraints that you get a non-negative amount of each type of food.
Okay, so when we combine all of those together, we're supposed to get a lower bound on the cost of our diet. And so, if you compare coefficients, the coefficient we need to end up with for a food f is the cost of that food.
And this should be equal to the sum over nutrients N of C sub N times the amount of that nutrient in the food f, plus some non-negative amount that we got by adding whatever multiple we had of the constraint that we get a non-negative amount of that food.
So what this says is that for each food f, the cost of food f should be at least the sum over nutrients N of C sub N times the amount of that nutrient showing up in the food.
But there's a nice way now of interpreting this C sub N. We can interpret it as the cost of buying a unit of nutrient N. And so if there were a market where you could just buy calories at the cost of C sub calories, and protein at its own cost, what the above equations are saying is that for each food, you can't cheat the system by buying that food: you can't get nutrients more cheaply than you could by buying the nutrients individually.
And if the cheapest way to get nutrients is buying them individually, it's pretty clear the total cost of a balanced diet is at least the sum over nutrients of the cost of that nutrient times the amount of that nutrient you're required to have in your diet.
And so, what this dual linear program tries to do is find non-negative costs for the various nutrients that satisfy these no-food-lets-you-cheat inequalities, such that the resulting lower bound on the total cost of your diet is as large as possible.
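A tiny numerical illustration of this dual (all foods, nutrients, prices, and requirements below are made up): the nutrient prices shown are dual-feasible, meaning no food is cheaper than its priced-out nutrient content, so pricing the requirements gives a lower bound on the cost of any valid diet.

```python
# Hypothetical diet instance:
# nutrients: protein (need >= 50), calories (need >= 2000)
# foods: cost per unit, then [protein, calories] per unit
foods = {'bread': (2.0, [5, 200]), 'eggs': (3.0, [12, 150])}
requirements = [50, 2000]

def valid_nutrient_prices(prices, foods):
    """Dual feasibility: no food costs less than its nutrient content
    priced at these per-unit nutrient prices (can't cheat by buying food)."""
    return all(cost >= sum(p * a for p, a in zip(prices, amounts)) - 1e-9
               for cost, amounts in foods.values())

def diet_cost_bound(prices, requirements):
    """Dual objective: a lower bound on the cost of any valid diet."""
    return sum(p * r for p, r in zip(prices, requirements))

prices = [0.15, 0.005]   # price per unit of protein, per calorie (made up)
```

With these numbers, bread prices out at 1.75 against a cost of 2, eggs at 2.55 against 3, so the prices are feasible and certify that any diet meeting the requirements costs at least 0.15*50 + 0.005*2000 = 17.5.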
Now, there's one interesting observation about the solution. Suppose that we actually exactly achieve this lower bound.
That would mean that you could never afford to buy overpriced food. You could never afford to buy foods where the cost of the food was strictly bigger than the total cost of all the nutrients that make up that food. You could only buy foods where the cost of the food is exactly the cost of the nutrients in that food.
And this gives us an example of a general phenomenon called complementary slackness. Basically, what this says is that if you look at the solution to the dual, you should look at which equations you needed to use in the dual program. That tells you which inequalities in the primal program need to be tight.
So in particular, complementary slackness is the following theorem. If you give me a primal linear program, minimize v.x subject to Ax at least b, and its dual, then take solutions to these two. If you use a positive multiple of an equation in the dual, that is, if yi is strictly bigger than 0 in the dual,
This happens only if the ith equation in the solution to the primal is actually tight. The ith inequality is
actually an equality in the optimal solution.
So let's see what this means. Suppose that we have a linear program defined by the five linear inequalities labelled 1, 2, 3, 4, 5 in the diagram below. The allowed region is this gray region, and the red point is located at the optimum.
Now, suppose that we're looking for solutions of the dual program. Which of these five equations might actually be used, with a positive multiple, in the solution to the dual program?
Well, it turns out the only equations that you could actually use are two and four, because complementary slackness says that the only equations that get used in the solution to the dual are ones where the corresponding inequality is tight in the primal. And in this case, two and four are the only lines that the solution to the primal actually lies on, so those are the only equations that could actually be used in the solution to the dual.
So in summary, every linear program has a dual linear program. Solutions to the dual bound the solutions to the primal. And surprisingly, the LP and its dual have the same answer, which means that the solution to the dual actually gives a tight bound on the solution to the primal. In addition, we have complementary slackness, where knowing the solution to the dual tells you a lot about where the solution to the primal lies; in fact, it tells you which inequalities in the solution to the primal need to be tight.
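Complementary slackness can be checked on a small made-up instance: at the optimum, every constraint carrying a positive dual multiplier is tight, and the slack constraint carries multiplier zero.

```python
def complementary_slackness_holds(A, b, x, y, eps=1e-9):
    """Check: y_i > 0 only when the i-th primal inequality is tight at x."""
    Ax = [sum(a * xi for a, xi in zip(row, x)) for row in A]
    return all(yi <= eps or abs(axi - bi) < eps
               for yi, axi, bi in zip(y, Ax, b))

# Hypothetical primal: minimize x1 + x2
#   subject to x1 >= 1, x2 >= 2, x1 + x2 >= 2
A = [[1, 0], [0, 1], [1, 1]]
b = [1, 2, 2]
x_opt = [1, 2]       # primal optimum, objective value 3
y_opt = [1, 1, 0]    # dual optimum: y.b = 1*1 + 1*2 + 0*2 = 3
```

Here the first two constraints are tight at x_opt and carry positive multipliers, while the third is slack (1 + 2 > 2) and correctly gets y3 = 0; the matching values 3 = 3 also illustrate duality.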
So that's basically everything we have for this lecture. Next lecture we're going to talk about proofs of these things; that material is not strictly required, but if you'd like to see it, it's informative.
(Optional) Duality Proofs
Hello everybody, welcome back to the linear programming unit. 
Today we're going to talk with proofs from the duality lecture.
So remember, last time we showed that with each linear program we can associate a dual program, which is basically attempting to find a non-negative combination of our constraints that puts a bound on the objective function.
So in particular, we have the duality theorem, which says a linear program and 
its dual always have the same numerical answer. 
Today we're going to prove that. 
So before we get into the proof, let's talk a little bit about intuition. One way to think about this is that the region defined by linear constraints is sort of a well: it's bounded by these linear walls, and you want to minimize a linear objective, say the height of a point inside this well. Now, if you understand some physics, one way to do this is you just take a ball that's pulled down by gravity, you drop it into the well, and gravity will pull it down to the lowest point.
Now when it reaches this point the ball is at rest. 
And that means the forces acting upon it need to balance. 
So the force of gravity pulling it down needs to be balanced by the normal forces 
from the walls pushing it back up.
Now when you write this down, you have that a linear combination of these normal vectors, pointing orthogonally to the walls, has to equal the downward-pointing vector from gravity. And if you work out what this means in terms of the equations, the downward-pointing vector corresponds to the direction of your objective, and the normal vectors are given by the vectors corresponding to the equations defining the walls.
And so, if you work out exactly what this means and put it in terms of linear programming, you actually have a solution to the dual program that matches your primal. And if you note which walls you used, only the walls that the ball is actually touching, that gives you a taste of why complementary slackness might be true: you only get to use the walls that the ball is actually touching, which are the ones for which your inequalities are tight. In any case, let's look at an actual proof. The first thing that we're going to do is, instead of looking at an optimization problem, we're going to look at a solvability problem. We'd like to say when there is a solution to Ax at least b and x.v at most t.
So, we claim there's a solution to this system unless some combination of these constraints yields a contradiction, namely the inequality 0 at least 1. Of course, if you can derive that from a combination of your constraints, it's clear your constraints cannot all be simultaneously satisfied. But it turns out that this is an if and only if. And if we can prove this, that will actually be enough to solve the original optimization problem, because the system above is unsolvable exactly when, by combining the constraints other than the last one, we can conclude that x.v must be strictly bigger than t. Okay, so let's see how that works. Suppose that we have a bunch of constraints, and we want to look at all possible combinations of them: c1E1 + c2E2 + ... + cmEm, where the ci are some non-negative constants. Now the thing to note is that the set of inequalities that can be derived in this way, the set C, is actually a convex set.
Now what happens if the equation zero bigger than 1 is not in this set C? Well, you've got a point
that's not in the convex set. That means that there has to be a separating hyperplane.
Now, what this separating hyperplane is is a little bit weird. It's sort of a hyperplane in the space of linear inequalities, but it turns out that from this hyperplane, you can extract a solution to the original system of inequalities.
This is a little bit abstract, so let's look at something slightly more concrete. We have a system of linear inequalities, all of the form (something)x + (something)y at least 1, and we want to know: is there a solution to this system?
Well, we're going to plot these inequalities. Here we've got a bunch of inequalities of the form ax + by at least 1, and we've plotted each one as the point (a, b).
Now, it's clear that none of these inequalities is the contradiction 0 at least 1, which would be the point at the origin. We also have to consider linear combinations of these, and it turns out the combinations you can get are exactly the gray region here.
Now it's still the case that 0 bigger than 1 is not in this region, but it's a convex region so there has to
be a separating hyperplane.
Here we have the separating hyperplane: a + b is bigger than 0. So what does this mean?
Well, it means that all of our inequalities were of the form ax + by at least 1, with a + b at least zero.
Now what this means though is if we take x equals y equals the same big number,
ax plus by is equal to a plus b times this big number.
Since a plus b is positive and this other number is big, that's actually more than 1, and so we actually have a solution: take x equals y equal to a big number. In this particular example, x equals y equals 1 gives us a solution.
And so what we were able to do here is, we said, well, 0 bigger than 1 is not a linear combination of our constraints. We have a separating hyperplane, and by looking at the form of this hyperplane, it allowed us to find an actual solution to our system. And this is how it works in general.
Okay, so that's the proof of duality. We should also look at complementary slackness. Remember what this said: we have our linear program and its dual, and if you have a solution to the dual where yi is strictly bigger than zero, this can happen only if the ith inequality in the primal is tight.
To prove this is really actually not that hard. You take a solution x to the primal and a matching
solution y to the dual.
So the best you can do in the primal is x.v equals t. But by duality, you get a matching lower bound in the dual. So there's a combination of these linear inequalities Ei, the sum of yi Ei, that yields the inequality x.v is at least t. Now, each of these inequalities Ei was a true inequality for that value of x, and the final inequality that we get is actually tight.
So we have a sum of a bunch of inequalities that gives us a tight inequality. The only way that can happen is if each of the inequalities that we used is tight.
And so that means for each i, either the inequality Ei is tight, or yi = 0. And that proves
complementary slackness. Okay, so that's all we have for this lecture. Come back next time and we
will go back and start talking about some formulations and different ways of looking at linear
programming problems.
Linear Programming Formulations
Hello everybody, welcome back to our unit on Linear Programming. Today, we're going to talk about some different types of linear programming problems. They're all related, but not quite the same.
So the point is, there are actually several different types of problems that go under the heading of linear programming. The one that we've been talking about so far we might call the full optimization version: minimize or maximize a linear function subject to a system of linear inequality constraints, or report that the constraints have no solution if they don't.
Now, it turns out there are a number of related problems dealing with systems of linear inequalities that you might want to solve, that are maybe a little bit easier. And this will actually be important when we start coming up with algorithms in the next couple of lectures, because those algorithms will actually be solving these other formulations.
So the first one is optimization from a starting point. Given a system of linear inequalities, and a vertex of the polytope they define, optimize a linear function with respect to these constraints; so you're given a place to start.
Another version is one we call Solution Finding: given a system of linear inequalities, with no objective whatsoever, find some solution to the system, assuming it exists; this is also somewhat easier. And finally, we have Satisfiability: given a system of linear inequalities, determine whether or not there is a solution.
So it turns out that these problems are actually equivalent. If you can solve any one of these, you can
solve the others. Which is very convenient, because the algorithms we'll be looking at will each only
solve one of these versions.
First off, it's clear that the full optimization problem is the strongest; using that, you can solve anything else you want. If you have optimization from a starting point, you can just ignore the starting point and solve the problem from scratch. If you're trying to find a solution, well, the optimal solution is a solution. And if you merely want to know whether there is a solution, well, you run a full optimization, and if it outputs a solution, you have a solution. Great.
But the nice thing is that you can go the other direction. If you can only solve optimization from a
starting point, you can actually do the full optimization.
Play video starting at 2 minutes 16 seconds and follow transcript2:16
The problem here is: how do you find the starting point? If you had the starting point, you could run the algorithm and you'd be done. But somehow you need a solution to this system, and there's actually a clever way to get one: you add inequalities one at a time. Suppose we have a solution to the first seven of our inequalities and now need to extend it to a solution of the first eight. The issue is that our current solution might not satisfy the eighth inequality. What you can do is optimize: subject to the first seven constraints, make the eighth inequality as true as possible. That will give you a solution that satisfies all of them.
To see how this works, let's look at an example. We start with this rectangle, at a corner of it. We now want to add an inequality that chops our rectangle at this line. So what we're going to do is say: we want our point to be as far below this line as possible. That's just a linear optimization question, so we can solve it using our optimization-from-a-starting-point algorithm. We get that vertex, and what do you know? It's a solution to the bigger system.
Next, we want to add this line as an inequality. So again, we solve our optimization from a starting point, we find a solution, and we can then add the additional inequality. We add another one, find a solution, and then finally we've got all of our inequalities in the mix. We now just need to do our original optimization, and we can do that.
So this is basically how you do it: you add one inequality at a time. There is a technical point to keep in mind: things are a bit messier if some of the intermediate systems that you're trying to solve don't have optima. They might be unbounded systems, where the objective can get as large as you like.
Now, to fix this is not hard, you just need to play around a bit. First, you want to start with n constraints that intersect at a single vertex, so you actually have a vertex to start your system from. Then, when you are trying to add a constraint v·x ≥ t, you don't just maximize v·x; that would be a problem, because the maximum might be unbounded. Instead, you also add the additional constraint that v·x is at most t. This guarantees that v·x actually has a finite maximum, at most t, and once you find a point achieving t, you're good.
Okay, so that was that. Let's talk about solution finding. How do we go from being able to find some solution to finding the best one? We somehow need to guarantee that the solution we found is the best solution. There's actually a good way to do that: we can use duality. The point is that duality gives you a way to verify that your solution is optimal, by solving the dual program and providing a matching upper bound. So what you want to do is find both a solution to the original program and a matching dual solution.



So if you want to minimize v·x subject to Ax ≥ b, what you can do is instead solve this bigger system: Ax ≥ b, y ≥ 0, yᵀA = v. That says that x is a solution to the primal and y is a solution to the dual. And then we also require x·v = y·b, so that we have matching solutions to the primal and the dual. The fact that we have this solution to the dual means that you can't do any better in the primal. So for any solution to this bigger system, if you look at x, it gives an optimal solution to the original problem.
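As a sanity check on this system, here is a minimal sketch (the tiny example LP is mine, not from the lecture) that verifies a matching primal/dual pair in pure Python:

```python
# Toy primal/dual certificate check for: minimize v.x subject to A x >= b.
# A matching pair (x, y) with A x >= b, y >= 0, y^T A = v, and v.x = y.b
# certifies that x is optimal.  (Illustrative example, not from the lecture.)

def dot(u, w):
    return sum(a * b for a, b in zip(u, w))

def mat_vec(A, x):
    return [dot(row, x) for row in A]

def vec_mat(y, A):
    # y^T A: combine the rows of A with weights y
    return [sum(y[i] * A[i][j] for i in range(len(A)))
            for j in range(len(A[0]))]

def is_optimal_pair(A, b, v, x, y):
    primal_ok = all(lhs >= rhs for lhs, rhs in zip(mat_vec(A, x), b))
    dual_ok = all(yi >= 0 for yi in y) and vec_mat(y, A) == v
    matching = dot(v, x) == dot(y, b)
    return primal_ok and dual_ok and matching

# minimize x1 + x2 subject to x1 >= 1, x2 >= 2; the optimum is (1, 2), value 3.
A = [[1, 0], [0, 1]]
b = [1, 2]
v = [1, 1]
assert is_optimal_pair(A, b, v, x=[1, 2], y=[1, 1])      # certified optimal
assert not is_optimal_pair(A, b, v, x=[2, 2], y=[1, 1])  # value 4 != y.b = 3
```

The point of the check is exactly the lecture's: the dual solution y gives an upper bound (here, a lower bound for a minimization) that matches the primal value, so no better x can exist.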
Finally, let's talk about satisfiability. If all you can do is determine whether or not there is a solution, how is that going to help you find an actual solution? Well, we know there's always a solution at a vertex of the polytope, and that means that some of the inequalities are actually tight there. So all we need to do is figure out which inequalities are tight; then we can solve for the intersection of those equalities using Gaussian elimination. So how does this work? We have a bunch of linear inequalities; here are the lines where we've got the equalities, and the solution set of the system is this hidden triangle. We want to find a point in this triangle, so what are we going to do? We pick some inequality and ask: is there a solution not only to this linear system, but to the system where that inequality is actually an equality?
For this first one, the answer is no, because there is no point in the triangle that also lies on that line. That means we can actually throw out that line; it doesn't help us at all. Next we try another one: it doesn't work either, so we throw it out. This one, yes, there are solutions that are both inside the triangle and on this line, so we keep that inequality around as an equality. Then we try another one: is there a solution to this system where we're on both of those lines? No, the intersection of those lines is not a solution, so we keep going. What about these lines? Yes, the intersection of those lines gives you that point, which is a solution. But now we know there's a solution at the intersection of those lines, so we solve for that intersection with Gaussian elimination, and we're done. And so, by solving a whole bunch of satisfiability questions, we're able to actually find a point that satisfies the system.
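The reduction just described can be sketched in code. This is a toy 2D version I'm adding for illustration: the satisfiability oracle here is a brute-force check over pairwise intersections of the constraint boundary lines, which is enough for a small bounded example like the triangle, but is of course not a real satisfiability algorithm.

```python
# Sketch: solution finding via repeated satisfiability queries, in 2D.
# Constraints are half-planes a*x + b*y >= c; `tight` marks the ones
# already forced to be equalities.  (Toy code, not from the lecture.)
from fractions import Fraction as F
from itertools import combinations

def intersect(l1, l2):
    # Solve a1 x + b1 y = c1, a2 x + b2 y = c2 (2x2 Gaussian elimination).
    (a1, b1, c1), (a2, b2, c2) = l1, l2
    det = a1 * b2 - a2 * b1
    if det == 0:
        return None
    return (F(c1 * b2 - c2 * b1, det), F(a1 * c2 - a2 * c1, det))

def satisfiable(cons, tight):
    # Brute-force oracle: some pairwise boundary intersection must satisfy
    # all half-planes, with the `tight` constraints holding with equality.
    for l1, l2 in combinations(cons, 2):
        p = intersect(l1, l2)
        if p is None:
            continue
        if all(a * p[0] + b * p[1] >= c for (a, b, c) in cons) and \
           all(a * p[0] + b * p[1] == c for (a, b, c) in tight):
            return True
    return False

def find_solution(cons):
    # One satisfiability call per constraint: keep it as an equality if the
    # system stays satisfiable; two tight lines pin a vertex in 2D.
    tight = []
    for con in cons:
        if len(tight) < 2 and satisfiable(cons, tight + [con]):
            tight.append(con)
    return intersect(tight[0], tight[1])

# Triangle: x >= 0, y >= 0, x + y <= 2 (written as -x - y >= -2).
cons = [(1, 0, 0), (0, 1, 0), (-1, -1, -2)]
x, y = find_solution(cons)
assert all(a * x + b * y >= c for (a, b, c) in cons)  # point is feasible
```

Note that `find_solution` makes at most one oracle call per constraint, matching the m-calls count discussed next.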



So, to make sure we understand this: if we want to find a solution to a linear system with m inequalities and n variables, how many times would we have to call the satisfiability algorithm for this to work? Well, the answer is that we need to call it m times: you need to test each inequality once. When you find inequalities that work, you keep them around as equalities, and you don't need to test them again; you just keep checking the other ones. Each inequality gets tested once, for a total of m calls. So you run the satisfiability algorithm m times, and that gives you enough information to actually find a point. So those are the different formulations. Next time we're going to come back and start looking at honest algorithms for solving linear programs.

The Simplex Algorithm


Hello everybody, welcome back to our unit on linear programming. Today, we're finally going to get to an actual algorithm to solve linear programs.
In particular, we're going to talk about the simplex method, which is basically the oldest algorithm for solving linear programs. And as it turns out, it's still one of the most efficient. Now unfortunately, as we'll see, the runtime of this algorithm isn't quite as good as we would like, but it's still pretty reasonable in many contexts. First off, remember from our last lecture that this is going to solve a specific formulation of linear programming; in particular, it's going to solve optimization from a starting point. So if you want to use this method to solve a full optimization problem, you'll have to remember how to do that based on the last lecture.
So, what's the idea here? We start at some vertex of our polytope, and we know that the optimum is at another vertex. So what we're going to do is find a path between vertices, such that as we go, our objective gets better and better, until we reach the optimum.
So, how do vertices work? Well, you get a vertex of your polytope when you look at the intersection of n of the defining equations: you take n of your defining equations, make them all tight, they intersect at a vertex, and you can solve for that vertex using Gaussian elimination.
Now, if you relax one of these equations, then instead of having a zero-dimensional thing, you have a one-dimensional thing. You end up with an edge: a set of points of the form p + tw, where the constraint that you relaxed requires that t be non-negative.
This edge continues from t equal to zero up until you violate some other constraint in your linear program, and there you get another vertex at the other end of the edge. Now, if v·w is bigger than 0, where v is the objective, then following this edge gives a larger value of the objective at the new vertex.
So here's the pseudocode for the simplex algorithm; it's actually pretty easy. You start at a vertex p and repeat the following. For each equation passing through p, you relax that equation to get an edge. If travelling along that edge improves your objective, you replace p by the vertex you find at the other end of that edge, and then you break out and go back to iterating over the equations running through the new p.
If, however, there was no improvement, you tried every edge going out of p and none of them did any better, then you actually know that you're at the optimal vertex, and you return p.
Now, what does it mean to go to the other end of this edge? It's basically what we said before. Your vertex p was defined by n equations. You relax one equation, and now the general solution of the remaining n − 1 equations is the set of points of the form p + tw over real numbers t; you solve for this using Gaussian elimination. The inequality that you relaxed requires, say, that t be at least 0. And each other inequality in the system might put other bounds on which t give valid solutions. Some of them might put upper bounds on t: they might say t is at most 7, or at most 10, or whatever. Of those upper bounds, you take the smallest of them and call it t0.
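The computation of t0 can be sketched as follows (a toy example of mine, not code from the lecture): for constraints aᵢ·x ≥ bᵢ, moving from p along w, each constraint with aᵢ·w < 0 imposes t ≤ (bᵢ − aᵢ·p)/(aᵢ·w), and t0 is the smallest such bound.

```python
# Sketch of the step-along-an-edge computation: given vertex p, direction w,
# and constraints a_i . x >= b_i, find the largest t with p + t*w feasible.
# (Illustrative code with made-up data, not from the lecture.)
from fractions import Fraction as F

def max_step(A, b, p, w):
    t0 = None  # None means unbounded: no constraint limits t from above
    for a, bi in zip(A, b):
        aw = sum(ai * wi for ai, wi in zip(a, w))
        if aw < 0:  # moving along w decreases a.x, so this bounds t above
            ap = sum(ai * pi for ai, pi in zip(a, p))
            bound = F(bi - ap, aw)
            if t0 is None or bound < t0:
                t0 = bound
    return t0

# Unit square: x >= 0, y >= 0, -x >= -1, -y >= -1.
A = [[1, 0], [0, 1], [-1, 0], [0, -1]]
b = [0, 0, -1, -1]
t0 = max_step(A, b, p=[0, 0], w=[1, 0])   # walk along the bottom edge
assert t0 == 1                            # next vertex is p + t0*w = (1, 0)
```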
Then p + t0·w is the vertex you get at the other end of that edge. Now, to show that the algorithm is correct, we need to prove the following theorem: if p is a vertex that is not optimal, then there is some vertex adjacent to p that does better.
And this means that we will keep finding better and better vertices until we find the best one, the optimum, and then we can return it. Now, the proof of this isn't so hard. You've got a vertex p; it's the intersection of n equations, E1 through En.
What you'd like to do to prove that p is optimal is to use the dual program: you want to find a positive linear combination of E1 through En that gives an upper bound, x·v is at most whatever.
Now you, of course, can't always do this, but it is basically always the case that you can write x·v less than or equal to something as some linear combination of these constraints, possibly using negative coefficients.
If all the coefficients are non-negative, then of course we have an actual solution to the dual program certifying that p is optimal.
However, if some of the coefficients are negative, you can actually show that if you relax the equation with a negative coefficient, that gives you an edge such that moving along it makes your objective better.
And so that proves the theorem: if we're not at the best point, then we can find an edge to follow that does better.
Okay, so how long does the simplex algorithm take? Well, taking one step on this path isn't so bad: it's a nice polynomial-time operation involving trying a bunch of edges and doing some Gaussian elimination. But we do have to take the whole path from wherever we started to the optimum, and how long the algorithm takes will depend on how long that path is. And unfortunately, the path might be somewhat long. So suppose we have this almost-cube-like polytope defined by our linear system. We're trying to optimize the height: we're trying to go up as far as possible. We start at the marked vertex, and we're going to use the simplex method to travel to other vertices, increasing the height as we go.
Now the question, of course, is: what's the longest path that we might need to take? What's the largest number of steps the simplex algorithm might need before it completes on this example? Well, unfortunately, it might take as many as seven steps to find the optimum, because it's possible that we take the path shown, which passes through every single vertex of the polytope. Some paths are better; other paths could take as few as three steps, but seven is big. And in fact, if you do an n-dimensional version of this, you have only 2n inequalities and yet might take 2 to the n steps to get where you're going.
And this really isn't so good.
And so it turns out that the runtime of simplex is proportional to the path length. Now, the path length in practice is very often quite reasonable. However, there are some unusual examples where the path length is actually exponential, and this means that the simplex algorithm sometimes takes quite a long time to finish.
Now there's one other technical problem: degeneracy. In this analysis, we assumed that only n hyperplanes intersect at any vertex, and that's not always the case. For example, in the picture that we have here, we've got a pyramid, and the apex of that pyramid lies on four of the defining hyperplanes, even though we're in only three dimensions. If you have some of these degenerate vertices, it's actually a little bit harder to run the algorithm, because we don't know which equation we're supposed to relax in order to follow an edge: we'd actually have to relax two of our equations to get to an edge, and we wouldn't know which ones to use.
So there's actually a fix for this, which is not very hard. If you take all of your equations and tweak them by just a tiny, tiny bit, this basically doesn't change your solutions at all, but it avoids having any of these degenerate intersections. Now, in fact, if you're willing to be a little bit more sophisticated, you can run a version of this fix that doesn't involve actually changing anything: you can make infinitesimal changes, changes that are only formally there. So you number your constraints 1, 2, 3, 4, 5, etc. You strengthen the first constraint by epsilon, the next by epsilon squared, and the next by epsilon cubed, where epsilon is just some incredibly tiny number, so tiny that it just doesn't matter for anything.
Then in practice, when you want to solve the system, you don't actually need to change any of your equations.
If you're at a degenerate point, you need to keep track of which n equations you are really on, which are really the n equations defining this point.
And then, when you're travelling along an edge that hits a degenerate point, you need to figure out which new equation you're actually at. For this, you should always add the lowest-numbered constraint at the new corner. So if you hit a new corner that actually has three hyperplanes passing through it, you pick the lowest-numbered of them to be the real one.
Now when you do this, you might in fact have some edges that pass from a degenerate corner to itself. And this is fine, as long as you keep track of which n hyperplanes you're actually on at any given time, and you make sure you only actually use such an edge if it improves your objective: that is, if the edge that you followed, which had zero length, was in a direction that at least formally made things better for you.
But once you do this, we have a nice algorithm called the simplex method. It solves linear programs by moving between adjacent vertices trying to hit an optimum. It actually works pretty well in practice, but is potentially exponential time. And if you're really worried about that, come see the next, optional lecture on the ellipsoid algorithm, which is a much more modern technique for solving linear programs.

(Optional) The Ellipsoid Algorithm


Hello, everybody, welcome back to the linear programming unit. Today, we're going to talk about one more algorithm for solving linear programs, namely the ellipsoid algorithm.
So, remember, last time we had the simplex algorithm. It solves linear programs and works pretty well in most cases, but some of the time it's actually exponential, which is a problem. Today we're going to talk about the ellipsoid algorithm. This again solves linear programs; it's actually polynomial time in all cases, but it turns out that in practice it's often not as good as the simplex method.
So, to begin with, the ellipsoid algorithm solves a particular formulation: it solves the satisfiability version of a linear program. Given a set of constraints, it will just tell you whether or not there is a solution.
So, here's how the algorithm works. The first thing you do is take all the inequalities and relax them by a tiny, tiny bit; you make them a tiny bit more lenient. If there weren't solutions before, there still won't be solutions. But if there were solutions before, even if beforehand there was only a single point that was a solution, there is now going to be some small positive volume in your set of solutions.
The next thing you do is bound the set of solutions by a large ball: you find a large ball that either contains all of your solutions, or at least some notable fraction of them, assuming they exist. And this isn't too hard to do by just taking the ball really, really, really big. So now you have a ball, or in general an ellipsoid, that contains all of your solutions.
What you do is look at the center of this ellipsoid and ask: is it a solution to your system? On the one hand, it might be, in which case you've found a solution, your system is satisfiable, and you're done.
On the other hand, it might not be a solution. If it's not a solution, we have a point that is not in a convex region. So you can find a separating hyperplane that separates this center from your region.
What this means is that your entire convex region is on one side of the hyperplane. So instead of being contained in your ellipsoid, it's actually contained in a half-ellipsoid.
However, given a half-ellipsoid, you can actually find a new ellipsoid, with volume smaller than the one you started with, that contains this entire half-ellipsoid. And thus, we now have a smaller ellipsoid that also contains your set of solutions.
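To make the half-ellipsoid update concrete, here is a sketch of the standard update formulas in two dimensions (the example data is mine, not from the lecture). For an ellipsoid {x : (x−c)ᵀP⁻¹(x−c) ≤ 1} and a cut keeping the half where a·(x−c) ≤ 0, the new center is c′ = c − Pa/((n+1)·√(aᵀPa)) and the new shape matrix is P′ = n²/(n²−1) · (P − 2/(n+1) · (Pa)(Pa)ᵀ/(aᵀPa)).

```python
# One step of the ellipsoid update in 2D (an illustrative sketch,
# not code from the lecture).
from math import sqrt

def ellipsoid_step(c, P, a, n=2):
    Pa = [P[0][0] * a[0] + P[0][1] * a[1],
          P[1][0] * a[0] + P[1][1] * a[1]]
    aPa = a[0] * Pa[0] + a[1] * Pa[1]
    g = [x / sqrt(aPa) for x in Pa]          # normalized Pa
    c_new = [c[i] - g[i] / (n + 1) for i in range(n)]
    coef = n * n / (n * n - 1.0)
    P_new = [[coef * (P[i][j] - 2.0 / (n + 1) * g[i] * g[j])
              for j in range(n)] for i in range(n)]
    return c_new, P_new

def det2(P):
    return P[0][0] * P[1][1] - P[0][1] * P[1][0]

# Start with the unit disk and cut it with the hyperplane x <= 0.
c, P = [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]
c1, P1 = ellipsoid_step(c, P, a=[1.0, 0.0])
assert abs(c1[0] + 1.0 / 3.0) < 1e-9   # new center at (-1/3, 0)
assert det2(P1) < det2(P)              # volume strictly shrinks
```

The determinant check is the key point of the algorithm: each step shrinks the ellipsoid's volume by a constant factor, which is what drives the iteration count.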
So now what we're going to do is iterate this. We keep finding smaller and smaller ellipsoids that contain our solution set, and eventually one of two things happens. Either at some point we find that the center of our ellipsoid is actually contained in our solution set, in which case we're done.
Or eventually we end up with ellipsoids that are really, really tiny and yet still guaranteed to contain our entire set of solutions.
But from step one, when we did the relaxation, we knew that if there were solutions, our set of solutions has some small positive volume. Eventually we'll find ellipsoids that are smaller than that, and therefore too small to contain our solution set if it existed. And then we will know that there must be no solutions.
So, what's the runtime of this? Well, you have to figure out how many iterations it takes, and it's a little bit of a mess, but the runtime of the ellipsoid algorithm is something like O((m + n²) · n⁵ · log(nU)). Here, n is the dimension of the space that you're working in, m is the number of inequalities that you're trying to solve, and U is the numerical size of the coefficients. So, things to notice about this. One, it's polynomial, hooray! We have a polynomial-time algorithm for solving linear programs, and this is pretty great. However, it's a bad polynomial: it runs in something like n-to-the-seventh time, and n to the seventh is really not that great. There are lots of circumstances where I'd rather take an exponential algorithm, or at least a mildly exponential one, over an n-to-the-seventh algorithm. Finally, we'll note that the runtime actually depends, albeit logarithmically, on the size of the coefficients. And this might be a problem if you have really, really complicated coefficients in your linear program.
This size affects how much you can relax your equations and how big the starting ball needs to be. And if you have these sorts of problems, ellipsoid will run very slowly. Whereas if you're running the simplex method, no matter how big your coefficients are, at least the number of algebraic operations that you need to perform doesn't depend on the size of the coefficients.
Now there's one final thing to note about the ellipsoid algorithm. We don't really need that much information about our system of equations or inequalities. All we really need is what's called a separation oracle. That is, given a point x, we need to either be able to tell that x satisfies our system, or, if not, find some hyperplane that separates x from our set of solutions.
And there are actually some circumstances where you don't have explicit sets of inequalities defining your system, but can produce a separation oracle. In these cases you can actually use the ellipsoid algorithm to solve linear programs even though you don't have an explicit list of finitely many inequalities. And this is really useful in some cases. So in summary, the ellipsoid algorithm is another way to solve linear programs. It has better worst-case performance than simplex.
However, it's usually going to be slower; in practice, it's generally not as good.
However, on the plus side, you can run the ellipsoid algorithm with access only to a separation oracle, which is nice, and there are definitely a few contexts where being able to do this is quite useful. In any case, that wraps up our unit on linear programs. I hope you enjoyed it. Come back next time, and Sasha will start talking about complexity theory. In particular, we'll be talking about various aspects of NP-complete problems.
So, I hope you come back for that, and I'll see you then.
QUIZ • 10 MIN
Linear Programming Quiz

Week 3
Advanced Algorithms and Complexity
NP-complete Problems

Although many of the algorithms you've learned so far are applied in practice a lot, it turns out that the world is dominated by real-world problems without a known provably efficient algorithm. Many of these problems can be reduced to one of the classical problems called NP-complete problems, which either cannot be solved by a polynomial algorithm, or solving any one of them in polynomial time would win you a million dollars (see the Millennium Prize Problems) and eternal worldwide fame for solving the main problem of computer science, P vs NP. It's good to know this before trying to solve a problem before tomorrow's deadline :) Although these problems are very unlikely to be solvable efficiently in the nearest future, people always come up with various workarounds. In this module you will study the classical NP-complete problems and the reductions between them. You will also practice solving large instances of some of these problems despite their hardness, using very efficient specialized software based on tons of research in the area of NP-complete problems.
Key Concepts
 Give examples of NP-complete problems
 Interpret the famous P versus NP open problem
 Develop a program for assigning frequencies to the cells of a GSM network
 Develop a program for determining whether there is a way to allocate advertising budget given a
set of constraints

Slides and Resources on NP-complete Problems

Reading: Slides and Resources on NP-complete Problems (10 min)

Search Problems

Video: Brute Force Search (5 min)
Video: Search Problems (9 min)
Video: Traveling Salesman Problem (7 min)
Video: Hamiltonian Cycle Problem (8 min)
Video: Longest Path Problem (1 min)
Video: Integer Linear Programming Problem (3 min)
Video: Independent Set Problem (3 min)
Video: P and NP (4 min)

Reductions

Video: Reductions (5 min)
Video: Showing NP-completeness (6 min)
Video: Independent Set to Vertex Cover (5 min)
Video: 3-SAT to Independent Set (14 min)
Video: SAT to 3-SAT (7 min)
Video: Circuit SAT to SAT (12 min)
Video: All of NP to Circuit SAT (5 min)
Video: Using SAT-solvers (14 min)
Reading: Minisat Installation Guide (10 min)

End of Module Quiz

Quiz: NP-complete Problems (6 questions, due Aug 9, 11:59 PM PDT)

Programming Assignment

Programming Assignment: Programming Assignment 3 (3h)
Slides and Resources on NP-complete Problems

Slides
17_np_complete_problems_1_search_problems.pdf (PDF)
17_np_complete_problems_2_reductions.pdf (PDF)

Reading
Chapter 8 in [DPV], Chapter 8 in [KT], Chapter 34 in [CLRS].

[DPV] Sanjoy Dasgupta, Christos H. Papadimitriou, and Umesh V. Vazirani. Algorithms. McGraw-
Hill, 2008.

[KT] Jon M. Kleinberg and Eva Tardos. Algorithm design. Addison-Wesley, 2006.

[CLRS] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein.
Introduction to Algorithms (3. ed.). MIT Press, 2009.

Sudoku Solver

sudokusolver.py
Brute Force Search
Hello and welcome to the next module of the Advanced Algorithms and Complexity class. In this
module, we are going to meet problems that are computationally very hard. In all previous modules, we
considered many efficient algorithms for various combinatorial problems. By saying efficient, we
usually mean a polynomial-time algorithm, and this is why. Consider the four algorithms shown here on the slide.
The running time of the first algorithm is just n, where n is the size of the input. The
running time of the second algorithm is n squared, so it is a quadratic-time algorithm; the third
one is a cubic-time algorithm; and the last one has running time 2 to the n. So here we have three polynomial-time
algorithms, and the last one is exponential time. The second
row here in the table shows the maximum value of n for which the total number of steps performed
by the corresponding algorithm stays below 10 to the 9. Why 10 to the 9? Just because this
is roughly the number of operations performed by modern computers in one
second. So we're interested in the maximum value of n for which the running time of the
corresponding algorithm stays below one second. It is not difficult to compute these values. For
the first algorithm, this is 10 to the 9, of course. For the second algorithm, this is about 10 to the 4.5, and for the
third one, it is 10 to the 3. So polynomial-time algorithms are able to handle instances of size roughly
thousands or even millions, while for the exponential-time algorithm, the maximum value of n for which it
performs fewer than 10 to the 9 operations is roughly 30. So it allows us to process only very small
instances. Recall that any exponential function grows faster than any polynomial function.
For this reason, exponential-time algorithms are usually considered impractical.
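The values in this table can be reproduced with a short sketch. The helper name `max_n` and the budget of 10 to the 9 operations per second are our own illustration, not from the course materials:

```python
def max_n(cost, budget=10**9):
    """Largest n with cost(n) <= budget (assumes cost(1) <= budget and
    cost is nondecreasing). Doubling to bracket, then binary search."""
    lo, hi = 1, 2
    while cost(hi) <= budget:
        hi *= 2
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if cost(mid) <= budget:
            lo = mid
        else:
            hi = mid - 1
    return lo

print(max_n(lambda n: n))       # -> 1000000000, i.e. 10^9
print(max_n(lambda n: n * n))   # -> 31622, about 10^4.5
print(max_n(lambda n: n ** 3))  # -> 1000, i.e. 10^3
print(max_n(lambda n: 2 ** n))  # -> 29, roughly 30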
Note, however, the following. For many computational problems, the corresponding
set of all candidate solutions is exponential. Let me illustrate this with a few examples. Assume that
we are given n objects and our goal is to find an optimal permutation, optimal in some sense,
of these objects. A naive way to do this would be to go through all possible such
permutations and to select an optimal one. The running time of the corresponding algorithm, however, is
going to be at least n factorial, because there are n factorial different permutations of n given
objects. And n factorial grows even faster than any exponential function, 2 to the n, for example,
which means that the corresponding algorithm is going to be extremely slow. Another example is the
following. Assume that we're given n objects and we need to split them into two sets. For example, we
need to partition the set of vertices of a graph into two sets to find a cut. Again, a naive way to do
this would be to go through all possible partitions into two sets and to select an optimal one.
However, there are 2 to the n ways to split n given objects into two sets, so the
running time of the algorithm is going to be at least 2 to the n, and we know that this is very slow: it
only allows us to handle instances of size roughly 30 in less than one second. As a final example, assume
we need to find a minimum spanning tree in a complete graph, that is, in a graph where we have an
edge between every pair of vertices. A naive way to do this would be to go through all possible
spanning trees and to select one of minimum weight. However, the total number of
spanning trees in a complete graph on n vertices is n to the n-2, by Cayley's formula. Again, this grows even faster than 2 to the n, and
this makes the corresponding algorithm completely impractical. So, once again, in many cases an
algorithm is called efficient in particular because it avoids going through the set of all
possible candidate solutions, which usually has exponential size.
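These counts are easy to sanity-check directly; the particular values of n below are our own illustration:

```python
import math

print(math.factorial(15))  # -> 1307674368000: beyond 10^12 already at n = 15
print(2 ** 30)             # -> 1073741824: about 10^9 subsets of a 30-element set
print(10 ** (10 - 2))      # -> 100000000: spanning trees of K_10, by Cayley's n^(n-2)
```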
In the rest of this module, we will learn that there are many computational problems that arise
frequently in practice for which we don't know an efficient, that is, polynomial-time, algorithm.
For such problems, roughly the best we can do is to go naively through all possible candidate solutions
and to select the best one. It will also turn out, surprisingly, that all these seemingly different problems
are related to each other. Namely, if you design an efficient algorithm, a polynomial-time algorithm, for at least one of them, this will
automatically give you a polynomial-time algorithm for all of these problems.
At the same time, constructing such an algorithm turns out to be an extremely difficult task. In
particular, there is a one million dollar prize for constructing such an algorithm or proving that
no such algorithm exists.
Search Problems
We will now give a formal definition of a search problem, and we will do this by considering the
famous Boolean satisfiability problem.
The input for this problem is a formula in conjunctive normal form, which is usually abbreviated
as CNF.
A formula in conjunctive normal form is just a set of clauses. In this example, we have
five clauses: this is the first one, this is the second one, the third one, the fourth one, and the last
one.
Each clause is a logical or, that is, a disjunction, of a few literals. For example, the first one is a
disjunction of the literals x, y, and z. The second one is the disjunction of x and the negation of y. The third
one is a disjunction of y and the negation of z, and so on. Here x, y, and z are Boolean variables,
that is, variables that take Boolean values. The Boolean values are true and false, and we will
usually use 1 instead of true and 0 instead of false. So what this formula tells us is the following: the first clause
constrains the values of x, y, and z so that either x = 1, or y = 1, or z = 1, right? This is
just x, or y, or z. The second clause tells us that either x must be true, or the negation of y must be
true. That is, either x = 1, or y = 0, and so on. For example, the last clause tells us that either x = 0, or y
= 0, or z = 0.
Then the Boolean satisfiability problem, or just satisfiability, usually abbreviated as
SAT, is stated as follows. Given a formula in conjunctive normal form, we would like to check
whether it is satisfiable or not. That is, whether it is possible to assign Boolean values to all variables
so that all clauses are satisfied. If it is possible, we need to output a satisfying assignment. If it is not
possible, we need to report that no such assignment exists.
Now we give a few examples.
In the first example, we're given a formula over two variables, x and y. It contains three clauses, and it
is satisfiable. To satisfy it, we can assign the value 1 to x and the value 0 to y.
Let's check that this indeed satisfies all three clauses. In the first clause, x is satisfied. In the second
clause, not y is satisfied. And in the last clause, x is satisfied. The second example illustrates that
a formula may have more than just one satisfying assignment. For this formula, there is a
satisfying assignment which assigns the value 1 to x, y, and z, and there is another satisfying
assignment, which is shown here.
Okay, the last formula is unsatisfiable, and probably the most straightforward
way to check this is just to list all possible truth assignments to x, y, and z.
There are eight such assignments; let me list them all.
Then for each of these assignments, we check that it falsifies at least one clause.
For example, the first one falsifies the first clause: when x, y, and z are all 0, the first clause is falsified,
right? The second one falsifies the clause y or not z. The third one falsifies the clause x or
not y, and so on. So it can be checked that each of these eight assignments falsifies at least one
clause.
Another way of showing that this formula is unsatisfiable is the following. Let's first try to assign the
value 0 to x, and take a look at the clause x or not y. x is already assigned
0, so the only way to satisfy this clause is to assign the value 0 to y. So setting x to 0 forces us to set
y to 0 also. Now the clause y or not z forces us to set z to 0 as well. But then we
see that the clause x or y or z is already falsified, which tells us that our initial move, assigning 0 to
x, was a wrong move: we need to assign the value 1 to x.
Let's try to do this. If x = 1, take a look at the clause containing not x. Not x is already falsified in this
clause, so we need to assign the value 1 to z. Now, in the clause y or not z, not z is already falsified,
so we need to assign the value 1 to y. But then the last clause is falsified. So no matter how we
assign x, we are forced into further assignments, and in the end we falsify some clause, which shows
that this formula is unsatisfiable. SAT is a canonical hard problem.
It has applications in various branches of computer science, in particular because many hard
combinatorial problems reduce very easily to the satisfiability problem. That is, many hard
combinatorial problems can be stated very naturally in terms of SAT, and once a problem is
stated in terms of SAT, we can use a so-called SAT solver, which is a program that solves the
satisfiability problem. There are many such programs, and there is even a competition of
SAT solvers.
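A brute-force SAT check over all 2^n assignments can be sketched as follows. The DIMACS-style clause encoding (a positive integer i means variable i, a negative one means its negation) and the helper name are our choices, and the example formula mirrors the unsatisfiable formula discussed above:

```python
from itertools import product

def satisfiable(formula, num_vars):
    """Brute-force SAT: try every assignment of num_vars Boolean variables.
    formula is a list of clauses; each clause is a list of nonzero ints.
    Returns a satisfying assignment as a dict, or None."""
    for bits in product([False, True], repeat=num_vars):
        assignment = {i + 1: bits[i] for i in range(num_vars)}
        if all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in formula):
            return assignment
    return None

# (x v y v z)(x v -y)(y v -z)(z v -x)(-x v -y v -z): unsatisfiable
formula = [[1, 2, 3], [1, -2], [2, -3], [3, -1], [-1, -2, -3]]
print(satisfiable(formula, 3))  # -> None
```

Note that this runs through 2^n assignments, exactly the exponential search the lecture warns about; real SAT solvers prune this space aggressively.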

SAT is also a classical example of a so-called search problem. In a search problem, we're given an
instance I, and our goal is to find a solution S for this instance, or to report that there is no such solution.
For example, in the case of the SAT problem, an instance I is a formula in conjunctive normal form and S is
a satisfying assignment. For this formula, we need to check whether there is a satisfying assignment
and return one if it exists, or report that the formula is unsatisfiable, that is, that there is no
satisfying assignment. A natural property to require from a search problem is that we can quickly check
whether a given solution S is indeed a solution for I. In the case of SAT, this is easy. If we are given
an assignment of truth values to all the variables, we can quickly check whether it satisfies all the
clauses. Namely, we just scan all the clauses from left to right, and for each clause, we check whether
it contains a literal that satisfies this clause. Another natural property is to require the length of S to be
bounded by a polynomial in the length of I. That is, we want S to be not very large. We do not want S
to have, for example, exponential size in the length of I; in that case, it would take
exponential time just to write down a solution for the instance I. So once again, the natural property
of a search problem is the following: we have an algorithm which checks whether a given solution S
is indeed a solution for an instance I in time
bounded by a polynomial in the length of I only. This also forces the length of S to be
bounded by a polynomial in the length of I.
In fact, it is convenient to define a search problem through such a verifying algorithm. Namely, we say
that a search problem is defined by an algorithm C that takes two parameters as input: an
instance I and a candidate solution S.
It should run in time polynomial in the length of I, and we say that S is a solution for the instance I if
C(I, S) returns true. For example, SAT is clearly a search problem. In this case, once again, I is a CNF
formula and S is an assignment of Boolean values to the variables. The algorithm C just scans all
the clauses and checks whether each clause contains a literal that is satisfied by the given assignment
S. Of course, its running time is polynomial in the length of the formula.
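For SAT, such a verifier C(I, S) is only a few lines. We reuse the clause encoding from before (a clause is a list of nonzero integers, positive for a variable, negative for its negation); the helper is our sketch:

```python
def check(formula, assignment):
    """Verifier C(I, S): scans every clause once, so it runs in time
    linear (hence polynomial) in the length of the formula."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in formula
    )

# (x or y)(not x or not y) is satisfied by x = 1, y = 0:
print(check([[1, 2], [-1, -2]], {1: True, 2: False}))  # -> True
```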
Great. In the next part, we will see a few examples of search problems that arise frequently in practice
and for which we still don't know polynomial-time algorithms.
Traveling Salesman Problem
Our first hard problem is the Traveling Salesman Problem. In this case, we are given a graph with n
vertices in which we know the distance between any two vertices. Together with this graph, we are given a
budget b, and our goal is to find a cycle in this graph that visits each vertex exactly once and has total
length at most b. Finding a short cycle visiting all the given points is a usual task solved by delivery
companies. For example, this is how an optimal cycle looks if we need to deliver something to
the 15 biggest cities in Germany. Another application is drilling holes in circuit boards.
Assume that we have a machine that needs to visit some specific places on a circuit board to drill holes in
these places. Of course, we would like our machine to visit all these places as quickly as possible, and for
this we need to find a cycle that visits all these places and whose length is as short as possible. Note the
following subtlety.
The traveling salesman problem is of course an optimization problem. Usually we are given just the
graph, and our goal is to find the optimal cycle that visits each vertex exactly once, that is, a cycle of
minimum total weight, of minimum total length. At the same time, in our statement of this problem,
we also have a budget b, and our goal is to check whether there is a cycle that visits every vertex
exactly once and has total length at most b. We did this to ensure that this is a search problem.
Indeed, it is very easy to check whether a given candidate solution is indeed a solution. For this, we need to
check that what is given to us is a sequence of vertices that forms a cycle, that it visits each vertex
exactly once, and that it has total length at most b. This is easy to do: we just trace the cycle and check that
its length is at most b, right? However, it is not so clear for the optimization version. If you are given a
cycle, how are you going to check whether it is optimal or not?
Once again, we stated the decision version of the traveling salesman problem to ensure that it is a search problem. At the same time, in terms of
algorithms, these two versions of the problem, the optimization version where we need to find
an optimal cycle, and the decision version where we need to check whether there is a cycle of total
length at most b, are hardly different. Namely, if we have an algorithm that solves the
optimization problem, we can of course use it to solve the decision version: an algorithm
that finds an optimal cycle can of course be used to check whether there is a cycle of length at most
b. And vice versa: if we have an algorithm that, for every b, checks whether there is a cycle of
length at most b, we can use it to find the optimal value of b by using binary search. Namely,
we first, for example, check whether there is a cycle of length at most 100. If yes, we check whether
there is a cycle of length at most 50. If there is no such cycle, we then check whether
there is a cycle of length at most 75, and so on. Eventually, we will find
a value b such that there is a cycle of length b but there is no cycle of smaller length. At this
point, we have found the optimal length of a cycle that visits each vertex exactly once. And
this is done by calling our decision algorithm a logarithmic number of times.
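This decision-to-optimization reduction via binary search can be sketched like this. The decision oracle is assumed to be given, and `lo`/`hi` are assumed bounds on the optimal length:

```python
def optimal_length(has_tour_of_length_at_most, lo, hi):
    """Smallest budget b in [lo, hi] for which the decision oracle says yes,
    found with O(log(hi - lo)) oracle calls; None if even hi is infeasible."""
    if not has_tour_of_length_at_most(hi):
        return None
    while lo < hi:
        mid = (lo + hi) // 2
        if has_tour_of_length_at_most(mid):
            hi = mid
        else:
            lo = mid + 1
    return lo

# a stand-in oracle whose hidden optimum is 42:
print(optimal_length(lambda b: b >= 42, 0, 100))  # -> 42
```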
A naive way to solve the traveling salesman problem is to check all possible n factorial
permutations of the vertices. This gives an algorithm whose running time is roughly n factorial,
which is a very quickly growing function. For example, already for n equal to 15, n factorial is
about 10 to the 12, which means that this algorithm is completely impractical.
There is a better algorithm, though its running time is still exponential. It is based on dynamic
programming, and we will see it later in our class. Its running time is n squared times 2 to the n,
where n is the number of vertices. So it is still exponential, but it is much better than n factorial. In
fact, we have no better algorithm for this problem, unfortunately; this is the best upper bound that
we can prove. In particular, we have no algorithm that solves this problem in time, for example, 1.99
to the n. At the same time, there are algorithms that solve this problem in practice quite well,
even when n is equal to several thousand. It is usually solved by heuristic algorithms; such
algorithms solve practical instances quite well, but we have no guarantee on the
running time of such algorithms. There are also approximation algorithms for this problem. For
such algorithms, we do have a guarantee on the running time. At the same time, what they
return is not an optimal solution, but a solution which is not much worse than optimal. For
example, an approximation algorithm that we will study later can find in polynomial time a
cycle which may not be optimal but is guaranteed to be at most two times longer than an optimal
one.
It is instructive to compare the Traveling Salesman Problem with the Minimum Spanning Tree
problem.
Recall that in the Minimum Spanning Tree problem, we are given a graph, or just a set of cities, and
our goal is to connect all the cities to each other by adding n minus 1 edges of minimum possible total
length. For example, the minimum spanning tree for this set of six cities might look as follows:
we added five edges to connect all six cities. Now consider the traveling salesman problem for the
same set of cities, and for a moment assume that in the traveling salesman problem we need to find
not a cycle but a path, okay? Then in this case, the optimal path for this set of six cities might look like
this.
Play video starting at 6 minutes 44 seconds and follow transcript6:44
Note that in this case, this path, what we're looking for in optimal paths is also a 3, right? So this is a 3
with 5 edges that spans all the vertices.
Play video starting at 7 minutes 1 second and follow transcript7:01
This means that the travel and salesman problem is a problem that we get from the minimum
spanning tree problem by posing an additional restriction that the tree that we're looking for should
be actually a path, right? So this is emphasized here. And by posing this additional restriction to the
minimum spanning tree problem. We get a problem for which we know no polynomial time
algorithm. So once again, for this problem. For the minimum spanning tree problem, we have an
algorithm whose running time is almost linear. For this problem, we have no polynomial time
algorithm. We have no algorithm whose running time is quadratic or cubic or even something like n to
the one to 1,000. This is a very difficult problem, resulting from the minimum spending tree problem,
by posing some additional small restriction.

Hamiltonian Cycle Problem


Our next search problem is the Hamiltonian Cycle Problem.
The input of this problem is a graph, directed or undirected, without weights on the edges, and the goal is
to check whether there is a cycle that visits every vertex of this graph exactly once. For example,
for this graph there is such a cycle; it is shown here on this slide. It is not difficult to check that it
indeed visits every vertex exactly once.
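This membership check can be sketched as follows; the graph is given as an adjacency dict and the helper name is ours:

```python
def is_hamiltonian_cycle(adj, order):
    """Verifier: `order` must list every vertex exactly once, and consecutive
    vertices (wrapping around) must be joined by an edge."""
    n = len(adj)
    if len(order) != n or set(order) != set(adj):
        return False
    return all(order[(i + 1) % n] in adj[order[i]] for i in range(n))

square = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
print(is_hamiltonian_cycle(square, [0, 1, 2, 3]))  # -> True
print(is_hamiltonian_cycle(square, [0, 2, 1, 3]))  # -> False: 0-2 is not an edge
```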
For this reason, this problem is a search problem: given some sequence of vertices, it
is easy to check that each vertex appears in this sequence exactly once and that there is an edge
between any two consecutive vertices in the sequence. The Eulerian cycle problem looks very similar
to the Hamiltonian cycle problem. In this problem, we're given a graph again and our goal is to find a
cycle that visits every edge exactly once. So in the Hamiltonian cycle problem, we need a cycle that
visits every vertex exactly once; in the Eulerian cycle problem, we're looking for a cycle that visits
every edge exactly once. It turns out that the Eulerian cycle problem can be solved very efficiently.
Namely, there is a very simple way to check whether the input graph is Eulerian or not, that is, whether it
contains an Eulerian cycle or not. This is given by the following theorem.
The theorem deals with undirected graphs. Assume that we are given an undirected graph; it contains an
Eulerian cycle if and only if it is connected and the degrees of all its vertices are even.
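Euler's criterion translates directly into code; the adjacency-dict representation and helper name are our own sketch:

```python
from collections import deque

def is_eulerian(adj):
    """Euler's theorem: an undirected graph has an Eulerian cycle iff every
    vertex has even degree and all vertices with edges lie in one component."""
    if any(len(ns) % 2 for ns in adj.values()):
        return False
    verts = [v for v in adj if adj[v]]
    if not verts:
        return True  # no edges at all: trivially Eulerian
    seen, queue = {verts[0]}, deque([verts[0]])
    while queue:  # BFS to test connectivity of the non-isolated vertices
        v = queue.popleft()
        for u in adj[v]:
            if u not in seen:
                seen.add(u)
                queue.append(u)
    return all(v in seen for v in verts)

triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
print(is_eulerian(triangle))          # -> True: connected, all degrees even
print(is_eulerian({0: [1], 1: [0]}))  # -> False: both degrees are odd
```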

We now give two toy examples that will, in particular, shed some light on how to prove the theorem just
mentioned.
Our first example is a graph in which there is no Eulerian cycle, that is, a non-Eulerian graph.
There is no Eulerian cycle in this graph, in particular because the degree of this vertex is equal to
3. Let's prove that if the degree of some vertex v in a graph is equal to 3, then this graph has no
Eulerian cycle for sure. First, assume that such a cycle existed and that it visited this vertex,
which is denoted by v, exactly once. So this is a cycle that visits every edge of our
graph exactly once and goes through v exactly once. But in this case, v would have degree exactly
two:
there would be two edges, one used to go out of v and another used to come back
to this vertex.
But in our case, the degree of v is equal to 3. Now assume that there is an Eulerian cycle that visits
the vertex v at least two times. Let's start our cycle from the vertex v. We walk through our
graph, we get back to v, then we walk again, and then we get back to v again. But since our
cycle visits each edge exactly once, in this case we see that there are at least four edges adjacent
to v.
So either the degree of v is equal to 2 or it is at least 4. And in general, it is not
difficult to see that if a graph has an Eulerian cycle, then the degree of each vertex must be even:
each time we come into some vertex, we need a matching edge which we use
to go out of this vertex.
I'm now going to show an example of an Eulerian cycle in a graph, and also how to find it
quickly. Consider the following graph. In this graph, we can see that the
degrees of all vertices are even. Namely, the degree of this vertex is 2, the degree of this vertex is 4,
this is also 4, this is 6, this is 2, this is 4. So all of them are even, and this graph is connected, which
means that it contains an Eulerian cycle. Let's try to find it. For concreteness, let's start
from this vertex and just walk through the graph. We first traverse this edge, then we traverse
this edge, then we get back to the starting vertex. At this point, we have returned to a vertex for which
there are no more unused edges. However, there are still some unused edges in our graph. Let me
mark the cycle we just constructed as the first one.
So we constructed some cycle, but there are still some unused edges. Let's start traversing the
unused edges from some vertex.
For example, from this one, we might traverse this edge and then again get back to the initial vertex;
this is cycle number two. There are still some
unused edges, so let's start from another vertex and traverse a third cycle.
We go here, then here, then we get
back here, and then we get back here. At this point, we have used all the edges. However, what we have
is not one single cycle, but a bunch of cycles. The nice property is that if we have several
cycles sharing vertices, it is easy to glue them together into a single cycle. Schematically, it can be shown as follows.
Assume that we have some cycle, at some point of it another cycle, and at some other point
yet another cycle. Then what we can do is traverse these cycles as follows: we first go here, then
we go this way, then we go this way, and finally we go this way.
Let me illustrate how to do this on our example graph.
We first go along this edge, then we use the second cycle, then we continue along our first cycle, then we traverse
the third cycle, and finally we get back to the initial vertex. And this is how, in general,
an Eulerian cycle can be constructed. We just walk through the graph, and when we return to a vertex
which has no unused edges, we start traversing another cycle from some vertex. The fact that the
initial graph is connected ensures that all the constructed cycles are connected to each other,
so we can glue them together easily to construct a single cycle visiting each edge exactly once.
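The walk-and-splice procedure just described is known as Hierholzer's algorithm. A compact stack-based sketch of ours, assuming the adjacency-dict input satisfies the theorem's conditions (connected, all degrees even):

```python
def eulerian_cycle(adj):
    """Hierholzer's algorithm: walk until stuck, then splice detours in as
    the stack unwinds. Linear in the number of edges."""
    remaining = {v: list(ns) for v, ns in adj.items()}  # edges left to use
    stack, cycle = [next(iter(adj))], []
    while stack:
        v = stack[-1]
        if remaining[v]:
            u = remaining[v].pop()
            remaining[u].remove(v)  # drop the reverse copy of the edge
            stack.append(u)
        else:
            cycle.append(stack.pop())
    return cycle  # first and last vertex coincide

triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
print(eulerian_cycle(triangle))  # -> [0, 1, 2, 0]
```

The stack unwinding is exactly the gluing step: a detour cycle finished mid-walk is emitted in place, spliced into the larger cycle.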
Let's now summarize. We have two similarly looking problems. In the first one, the Eulerian cycle
problem, we need to find a cycle that visits every edge of a given graph exactly once. This problem
can be solved efficiently, in time linear in the size of the input graph. In the second problem, the
Hamiltonian cycle problem, we are looking for a cycle that visits every vertex of our graph exactly
once. For this problem, we have no polynomial-time algorithm.
Longest Path Problem
Integer Linear Programming Problem
Our next hard search problem deals with integers and linear inequalities. Namely, the problem is called
integer linear programming. The input to this problem is a set, or a collection, or a system of linear
inequalities, which we present here in matrix form.
Our goal is to find integer values for all the variables that satisfy all the inequalities. To give
an example, consider the following three inequalities. The first one says that x1 should be at least one
half. The second one says that minus x1 plus 8 x2 should be nonnegative. And the last one says that
minus x1 minus 8 x2 should be at least minus 8. As usual, we can represent the set of all solutions,
all feasible points,
to this system of linear inequalities as a convex polygon, as follows. We first draw the half-plane that
contains all points which satisfy the first inequality, shown here in green. So, once again, in the
green half-plane, all pairs of points (x1, x2) satisfy the inequality that x1 is at least one half. The second,
blue, half-plane contains all the points that satisfy the second inequality. Finally, the red half-plane
shown here contains all the points that satisfy the last inequality. In particular, the
intersection of these three half-planes, this triangle, contains all the points that satisfy our three
inequalities. Recall, however, that what we need to find is an integer solution; that is, we would like
x1 and x2 to have integer values. And though this intersection is non-empty, it contains no integer points.
The closest integer points are here,
but none of them is inside this region, right? So it turns out that this additional restriction, namely
the restriction that the solution should be integral, gives us a very hard problem. In particular, if we
just have a system of linear inequalities and we would like to check whether there is a point that
satisfies them, then we can use, for example, the simplex method to solve it
in practice.
Play video starting at 2 minutes 34 seconds and follow transcript2:34
The running time of simplex method is not bounded by polynomial, so on some pathological cases, it
can have exponential running time. But there are other methods like ellipsoid method or interior
point method that have polynomial upper bounds in the running time. So in any case, we can solve
systems of linear inequalities efficiently in practice. But if we additionally require that we need the
solution to be integer, then we get a very difficult problem for which we have no polynomial
algorithm at the moment.
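For this toy system, one can confirm by brute force that the feasible triangle contains no integer point. The search box below is our own choice, justified by the constraints themselves (they confine any solution to 0 <= x1 <= 8 and 0 <= x2 <= 1); exact rational arithmetic via `fractions` avoids rounding issues:

```python
from itertools import product
from fractions import Fraction

# The system from the lecture, written as A x >= b:
# x1 >= 1/2,  -x1 + 8*x2 >= 0,  -x1 - 8*x2 >= -8
A = [[1, 0], [-1, 8], [-1, -8]]
b = [Fraction(1, 2), 0, -8]

def satisfies(x):
    return all(sum(a * xi for a, xi in zip(row, x)) >= bi
               for row, bi in zip(A, b))

# the region is non-empty: (1/2, 1/2) is feasible
print(satisfies((Fraction(1, 2), Fraction(1, 2))))  # -> True

# ...but an exhaustive scan over the relevant box finds no integer point
solutions = [x for x in product(range(0, 9), range(0, 2)) if satisfies(x)]
print(solutions)  # -> []
```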

Independent Set Problem


Does this graph have an independent set of size 3?

Yes, it does. That's right: {B, D, F} is an independent set.
Our last hard search problem deals with graphs again. It is called the
independent set problem. Here, we're given a graph and a budget b, and our goal is to select at
least b vertices such that there is no edge between any pair of selected vertices.
For example, in this graph, which we've seen before in this lecture, there is an independent set of size
seven. The selected vertices are shown here in red, and it is not difficult to check that there is no
edge between any pair of red vertices. This in particular implies that independent set is
indeed a search problem: it is easy to check whether a given set of vertices is an independent set and
whether it has size at least b.
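That verifier is a one-liner plus bookkeeping; the adjacency-dict representation and helper name are our own sketch:

```python
def is_independent_set(adj, chosen, b):
    """Verifier: at least b vertices are chosen, and no edge joins two of
    them (i.e. no neighbour of a chosen vertex is itself chosen)."""
    chosen = set(chosen)
    return len(chosen) >= b and all(
        u not in chosen for v in chosen for u in adj[v]
    )

square = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
print(is_independent_set(square, {0, 2}, 2))  # -> True
print(is_independent_set(square, {0, 1}, 2))  # -> False: 0 and 1 share an edge
```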

It is interesting to note that the problem can be easily solved if the given graph is a tree. Namely, it
can be solved by the following simple greedy strategy. Given a tree, if we want to find an
independent set of maximum size, we can do the following. First, let's just take all
the leaves into the solution. Then, let's remove all the leaves from the tree together with their parents.
And then, let's just continue this process. To prove that this algorithm produces an optimal solution,
we need to show that taking all the leaves into our solution is a safe move, that it is consistent with an
optimal solution.
This is usually done as follows. Assume that there is some optimal solution in which not all the leaves
are taken.
For concreteness, assume it is the solution shown here: not all the leaves are
taken, because we have this leaf, this leaf, and this leaf, which are not in the solution. Then
we show that it can be transformed, without decreasing its size, into another solution
that contains all the leaves. Indeed, let's just take all these marked leaves into the solution. This may
require us to discard their parents from the solution, but it will not decrease the size of the
solution. So what we get is another solution whose size is, in this case, actually the same, but which
contains all the leaves. This proves that there always exists an optimal solution which contains all
the leaves, and this in turn means that it is safe to take all the leaves.
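The leaf-taking greedy can be sketched as follows. Our implementation takes one leaf (or isolated vertex) at a time and deletes it together with its parent, which amounts to the same safe move; vertices are numbered 0..n-1:

```python
def max_independent_set_tree(n, edges):
    """Greedy for a tree (or forest): repeatedly take a leaf or isolated
    vertex into the solution and delete it together with its parent."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    alive, chosen = set(range(n)), []
    while alive:
        leaf = min(alive, key=lambda v: len(adj[v]))  # degree 0 or 1 in a forest
        chosen.append(leaf)
        for v in {leaf} | adj[leaf]:  # the leaf plus its parent, if any
            alive.discard(v)
            for u in adj[v]:
                adj[u].discard(v)
            adj[v] = set()
    return chosen

# path 0-1-2-3-4: the optimum takes the two endpoints and the middle vertex
print(max_independent_set_tree(5, [(0, 1), (1, 2), (2, 3), (3, 4)]))
```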

We will see the details of this algorithm later in this class. But in general, once again, if we are given a
tree, then we can find an independent set of maximum size in this tree very efficiently, in linear time. For general graphs, however, we currently
have no polynomial-time algorithm that even checks whether there is an
independent set of size b in a given graph.
P and NP

Now that we have a formal definition of a search problem, and have seen a few examples of search problems, we are ready to state the most important open problem in computer science: the problem about the classes P and NP. Recall once again that a search problem is defined by an algorithm C that takes an instance I and a candidate solution S, and checks in time polynomial in |I| whether S is indeed a solution for I.
In other words, we say that S is a solution for I if and only if the corresponding algorithm C(I, S) returns true.
The class NP is then defined just as the class of all search problems.
The name of this class stands for non-deterministic polynomial time. This essentially means that we
can guess a solution and then check its correctness in polynomial time. That is a solution for a search
problem can be verified in polynomial time.
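To make the verifier C concrete, here is a minimal Python sketch, not from the lecture, of such a polynomial-time checker for the independent set problem; the function name and the (edge list, b) instance encoding are my own choices.

```python
def verify_independent_set(edges, b, candidate):
    """Polynomial-time verifier C(I, S) for the independent set
    search problem, where the instance I is (edges, b) and the
    candidate solution S is a collection of vertices.

    Returns True iff `candidate` is an independent set of size >= b.
    """
    s = set(candidate)
    if len(s) < b:
        return False
    # No edge may have both endpoints inside the candidate set.
    return all(not (u in s and v in s) for u, v in edges)
```

The check runs in time polynomial in the instance size, which is exactly what membership in NP requires; it says nothing about how hard it is to *find* such a set.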
The class P, on the other hand, contains all search problems that can be solved in polynomial time, that is, all such problems for which we can find a solution in polynomial time. So to summarize, once again, the class P contains all search problems whose solution can be found efficiently. This class contains, in particular, the minimum spanning tree problem, the shortest path problem, the linear programming problem, and the independent set on trees problem.
The class NP contains all problems whose solution can be verified efficiently: given an instance and a candidate solution for this instance, we can check, in time polynomial in the size of the instance, whether it is indeed a solution. This class contains such problems as the satisfiability problem, the longest path problem, the traveling salesman problem, and the independent set problem on general graphs.
The main open problem in computer science asks whether these two classes are equal, namely whether the class P is equal to the class NP. This is also known as the P versus NP question. The problem is open: we do not know whether these two classes are equal, and this problem turns out to be very difficult. It is a so-called Millennium Prize Problem; there is a $1 million prize from the Clay Mathematics Institute for resolving it. Note that if P is equal to NP, then all search problems, that is, all the problems for which we can efficiently verify a solution, can be solved in polynomial time.
In other words, for all the problems for which we can efficiently verify a solution, we can also
efficiently find a solution.
On the other hand, if P is not equal to NP, then there are search problems for which there are no efficient algorithms: problems for which we can quickly check whether a given candidate solution is indeed a solution, but for which there is no polynomial-time algorithm that finds such a solution. At this point, we do not know whether P is equal to NP or not, that is, whether there are such problems with no polynomial-time algorithms.

In the next parts, we will show that all the problems that we mentioned in this lecture, namely the satisfiability problem, the longest path problem, the traveling salesman problem, and the integer linear programming problem, are in some sense the most difficult search problems in the class NP.

Reductions

Hello and welcome to the next module of the Advanced Algorithms and Complexity class. This module is devoted to reductions. Reductions allow us to say that one search problem is at least as hard as another search problem. Intuitively, the fact that a search problem A reduces to a search problem B means that we can use an efficient, namely polynomial-time, algorithm for the problem B to solve the problem A, also in polynomial time, and we can use it just as a black box.

Pictorially, the fact that the search problem A reduces to the search problem B means that we have the following pipeline. Assume that we have an instance I of the problem A. We are going to design an algorithm that solves the instance I using a polynomial-time algorithm for B as a black box; for this reason, it is shown here as a black box. The first thing we need to do is transform the instance I of the problem A into an instance of the problem B. We do this by calling an algorithm f: we feed the instance I of the problem A into the algorithm f, and it gives us the instance f(I) of the problem B. We then use the algorithm for B as a black box to solve it efficiently, and it gives us one of two outputs. Either there is no solution for the instance f(I); in this case, we report that there is also no solution for the instance I of the problem A.
Otherwise, it gives us a solution S for the instance f(I). In this case, we need to transform it back into a solution of I. We do this by using a second algorithm h, which transforms the solution S of f(I) into a solution h(S) of the initial instance I.
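This pipeline can be sketched as a small Python function, not from the lecture; the function and parameter names are my own, and returning None is used here to stand for "no solution".

```python
def solve_via_reduction(I, f, solve_B, h):
    """Solve an instance I of problem A via a reduction (f, h) to B.

    f:        polynomial-time map from instances of A to instances of B
    solve_B:  black-box solver for B; returns a solution or None
    h:        polynomial-time map from a solution of f(I) back to
              a solution of the original instance I
    """
    S = solve_B(f(I))
    if S is None:
        return None  # f(I) has no solution, hence neither does I
    return h(S)
```

As a toy illustration, reducing "given n, find x with 2x = n" to "given m, find y with y = m" via f(n) = n/2 (undefined for odd n) and h(S) = S solves the first problem using a trivial solver for the second.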
We can now state this formally. Given two search problems A and B, we say that A reduces to B, and write A → B, if there is a pair of polynomial-time algorithms f and h. The algorithm f transforms any instance I of A into an instance f(I) of the problem B, such that the following holds. If there is no solution for the instance f(I) of the problem B, then there is no solution for the instance I of the problem A.
Otherwise, if there is a solution S for the instance f(I), then by applying the algorithm h to this solution
S, we get a solution h(S) of the initial instance I.
Now that we have a notion of reduction, we can imagine a huge graph containing all search problems. This graph corresponds to the class NP of all search problems. In this graph, there is a vertex for each search problem, and we put a directed edge from the search problem A to the search problem B if A reduces to B. Then, by definition, we say that a search problem is NP-complete if all other search problems reduce to it.
Pictorially, it looks as follows: the red vertex here corresponds to an NP-complete search problem. In some sense, this problem attracts all other search problems; all other search problems reduce to it.
Put otherwise, a polynomial-time algorithm for an NP-complete problem can be used as a black box to solve all other search problems, also in polynomial time.



It is not at all clear that such NP-complete problems exist in our graph of all search problems, but we will show that they do exist. We will show, actually, that all the search problems that we have seen in the previous modules, namely the satisfiability problem, the traveling salesman problem, the maximum independent set problem, the longest path problem, and the integer linear programming problem, are all NP-complete. Namely, if you design a polynomial-time algorithm for any of them, you will solve all search problems in polynomial time.
Showing NP-completeness
