Mixed Integer Programming: Models and Methods - Nicolai Pisaruk
All content following this page was uploaded by Nicolai Pisaruk on 04 May 2019.
Preface
Initially, I wrote a short manual for users of the optimization library MIPCL (Mixed
Integer Programming Class Library). Later I decided to rework the manual to make
it useful for a wider audience. As a result, this book was written, in which almost
all the main theoretical results, already implemented in modern commercial mixed-
integer programming (MIP) software, are presented. All algorithms and techniques
described here (and many others) are implemented in MIPCL. Having the experience of developing such a complex software product as MIPCL, I dared to teach others how to use MIP in practice.
It is clear that in a book of this size it is impossible to cover the diversity of research in MIP. In particular, specialized algorithms for solving numerous special cases of mixed integer programs (MIPs) are not considered here, since the number of publications on this topic is enormous. Therefore, when selecting material, I decided to discuss only those results that are already implemented in modern computer MIP libraries, or that can potentially be included in these libraries in the near future.
The strength of MIP as a modeling tool for practical problems was recognized immediately upon its appearance in the 1950s and 1960s. Unfortunately, for a long time the available computers and software could not solve those models, and the initial illusions melted away. Even today, many potential users still believe that MIP is just a tool for writing models, with very limited capacity for solving them. In fact, the situation has changed dramatically over the past twenty years. Today, we can solve many difficult practical MIPs using standard software.
What is the reason for the wide use of MIP in practice? A brief answer is that,
using binary variables that take only two values, 0 or 1, we can model many types of
nonlinearities, in particular, almost any logical conditions. And the latter are present
in almost every non-trivial practical application. It is also very important that MIP
models are easily expandable. When developing a decision-making system, be careful about using highly specialized models and software. Even if you do not encounter problems at the development stage, they can appear later, during system operation, when the requirements change and can no longer be accommodated in the model currently in use.
For many years, the basic approach to solving MIPs has remained unchanged:
it was a linear-programming-based branch-and-bound method proposed by Land
and Doig back in 1960. And this was despite the fact that at the same time there
was significant progress in the theory of linear programming and related areas of
combinatorial optimization. Many of the ideas developed there "passed" through intensive computational experiments, but until recently only a few of them were implemented in commercial software products used by practitioners. Nowadays, the best MIP programs incorporate many of these theoretical achievements: modern MIP solvers preprocess and automatically reformulate the problems being solved, generate cuts of various types, and apply a variety of heuristics in the nodes of the search tree to build feasible solutions. This allowed R.E. Bixby to state that the gap between theory and practice is being closed.
Next, we briefly present the contents of this book. In the introduction, we discuss
the specific features of the MIPs that distinguish them from other mathematical pro-
gramming problems. Next, we give examples of the formulations of various types of
nonlinearities in MIP models. Then we try to understand why one MIP formulation is stronger (better) than another, and also discuss some ways of strengthening existing formulations. The understanding that not all MIP formulations are equally good in practice has come relatively recently; before that, as a rule, preference was given to more compact formulations.
In typical situations, one does not need to be an expert in theory to use MIP in practice; some skill is needed only to formulate practical problems as MIPs. This can be learned by studying applications and their formulations that have already become classical. Chapter 2 presents a number of such applications and their formulations. Even more applications are considered in the other chapters as examples for demonstrating some of the techniques used in MIP. Descriptions of many applications are also found in the exercises given after each chapter. Many exercises are deliberately formulated as requests to justify the validity of a proposed answer; this makes the exercises an additional source of information on the topic under discussion.
Because of its universality, the general MIP is a very difficult computational
problem. Many of the most complex problems of combinatorial optimization are
very simply formulated as MIPs. A number of results from computational complexity theory indicate that efficient algorithms are unlikely to be found for solving
such problems. Nor can we expect that in the foreseeable future a computer program will be developed that can solve with equal efficiency all MIPs that arise in practice. Therefore, modern MIP libraries are designed to allow the users to redefine (reprogram) many of their functions, replacing them with ones that take into account specific features of the problem being solved. One cannot effectively use these libraries without knowing the theory on which they are based.
The rest of the book is devoted to the study of the algorithms implemented in the
modern MIP libraries.
Contents
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Integerality and Nonlinearity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1.1 Discrete Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.1.2 Fixed and Variable Costs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1.3 Approximation of Nonlinear Functions . . . . . . . . . . . . . . . . . . 3
1.1.4 Approximation of Convex Functions . . . . . . . . . . . . . . . . . . . . 5
1.1.5 Logical Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2 Multiple Alternatives and Disjunctions . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.2.1 Floor Planning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.2.2 Linear Complementarity Problem . . . . . . . . . . . . . . . . . . . . . . 9
1.2.3 Quadratic Programming Under Linear Constraints . . . . . . . . 9
1.3 How an LP May Turn Into a MIP? . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.4 Polyhedra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.5 Good and Ideal Formulations, Reformulation . . . . . . . . . . . . . . . . . . . 14
1.6 Strong Inequalities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.7 Extended Formulations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.7.1 Single-Product Lot-Sizing Problem . . . . . . . . . . . . . . . . . . . . . 18
1.7.2 Fixed Charge Network Flows . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.8 Alternative Formulations for Scheduling Problems . . . . . . . . . . . . . . . 22
1.8.1 Continuous Time Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
1.8.2 Time-Index Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
1.9 Knapsack Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
1.9.1 Integer Knapsack . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
1.9.2 0,1-Knapsack . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
1.10 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
1.11 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2 MIP Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.1 Set Packing, Partitioning, and Covering Problems . . . . . . . . . . . . . . . 37
2.2 Service Facility Location . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
2.3 Portfolio Management: Index Fund . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
2.4 Multiproduct Lot-Sizing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
2.5 Balancing Assembly Lines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
2.6 Electricity Generation Planning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
2.7 Designing Telecommunication Networks . . . . . . . . . . . . . . . . . . . . . . . 46
2.8 Placement of Logic Elements on the Surface of a Crystal . . . . . . . . . 47
2.9 Assigning Aircrafts to Flights . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
2.10 Optimizing the Performance of a Hybrid Car . . . . . . . . . . . . . . . . . . . . 51
2.11 Short-Term Financial Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
2.12 Planning Treatment of Cancerous Tumors . . . . . . . . . . . . . . . . . . . . . . 54
2.13 Project Scheduling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
2.14 Short-Term Scheduling in Chemical Industry . . . . . . . . . . . . . . . . . . . 58
2.15 Multidimensional Orthogonal Packing . . . . . . . . . . . . . . . . . . . . . . . . . 61
2.15.1 Basic IP Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
2.15.2 Tightening Basic Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
2.15.3 Rotations and Complex Packing Items . . . . . . . . . . . . . . . . . . . 65
2.16 Single Depot Vehicle Routing Problem . . . . . . . . . . . . . . . . . . . . . . . . 65
2.16.1 Classical Vehicle Routing Problem . . . . . . . . . . . . . . . . . . . . . 68
2.17 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
2.18 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
3 Linear Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
3.1 Basic Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
3.2 Primal Simplex Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
3.2.1 How to Find a Feasible Basic Solution . . . . . . . . . . . . . . . . . . 79
3.2.2 Pricing Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
3.3 Dual Simplex Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
3.3.1 Adding New Constraints and Changing Bounds . . . . . . . . . . . 83
3.3.2 How to Find a Dual Feasible Basic Solution? . . . . . . . . . . . . . 83
3.3.3 The Dual Simplex Method Is a Cutting Plane Algorithm . . . 84
3.3.4 Separation Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
3.4 Why an LP Does Not Have a Solution? . . . . . . . . . . . . . . . . . . . . . . . . 87
3.5 Duality in Linear Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
3.6 Linear Programs With Two-Sided Constraints . . . . . . . . . . . . . . . . . . . 89
3.7 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
3.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
4 Cutting Planes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
4.1 Cutting Plane Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
4.2 Chvátal-Gomory Cuts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
4.3 Mixed Integer Rounding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
4.4 Fractional Gomory Cuts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
6 Branch-And-Cut . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
6.1 Branch-And-Bound . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
6.2 Branch-And-Cut . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
6.3 Branching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
6.3.1 Priorities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
6.3.2 Special Ordered Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
6.4 Global Gomory Cuts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
6.5 Preprocessing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
6.5.1 Disaggregation of Inequalities . . . . . . . . . . . . . . . . . . . . . . . . . 171
6.5.2 Probing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
6.6 Traveling Salesman Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
6.7 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
6.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
7 Branch-And-Price . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
7.1 Column Generation Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
7.1.1 One-Dimensional Cutting Stock Problem . . . . . . . . . . . . . . . . 183
7.1.2 Column Generation Approach . . . . . . . . . . . . . . . . . . . . . . . . . 184
7.1.3 Finding a Good Initial Solution . . . . . . . . . . . . . . . . . . . . . . . . 185
7.1.4 Cutting Stock Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
Abbreviations and Notations
G = (V, E): graph or directed graph (digraph) with vertex set V , and edge (arc)
set E
E(S, T): set of edges in graph G = (V, E) with one end in set S and the other in set T; or set of arcs in digraph G = (V, E) leaving set S and entering set T
f(n) = O(g(n)) if there exists a constant c > 0 such that f(n) ≤ c·g(n) for sufficiently large n ∈ Z₊ (for example, 5n² + 7n + 100 = O(n²))
(Ω, A, P): probability space, where
Ω: space of elementary events or sample space,
A: algebra or σ-algebra of subsets of Ω (elements of A are called events),
P: probability measure on A (P(S) is the probability that a randomly chosen ω ∈ Ω belongs to S ∈ A)
E(ξ): (mathematical) expectation (expected value) of a random variable ξ: Ω → R; by definition, E(ξ) = ∫_Ω ξ(ω) P(dω)
IP: Integer Programming
IP: Integer Program (IP problem)
LP: Linear Programming
LP: Linear Program (LP problem)
MIP: Mixed Integer Programming
MIP: Mixed Integer Program (MIP problem)
NP: class of decision problems (with two answers, "yes" or "no") that can be solved by a nondeterministic Turing machine in polynomial time
P: class of decision problems (with two answers, "yes" or "no") that can be solved by a deterministic Turing machine in polynomial time
Chapter 1
Introduction
We will see many times later that the condition "x is integer" can be used to express many nonlinear constraints. But first we note that this restriction itself can be given by means of a single smooth equation:

sin(πx) = 0.

Another important condition, "x is binary" (x can take only one of two values, 0 or 1), is written as one quadratic equation:

x² − x = 0.
c^T x → max,
Ax = e,
x_i² = x_i, i = 1, …, n.
Here and below e denotes a vector of suitable size all components of which are equal
to 1.
Suppose now that an integer variable x is non-negative and bounded above, that is, 0 ≤ x ≤ d, where d is a positive integer. In the binary system, d can be written as a k = ⌊log₂ d⌋ + 1 digit number. Therefore, introducing k new continuous variables s_0, …, s_{k−1}, we can represent the condition x ∈ {0, 1, …, d} by the following system of equations:
x = ∑_{i=0}^{k−1} 2^i s_i,
s_i² = s_i, i = 0, …, k − 1.
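As a quick sanity check (a Python sketch of mine, not from the book; the bound d = 11 is an arbitrary choice), we can enumerate all 0/1 assignments of s_0, …, s_{k−1} and confirm that, together with the bound x ≤ d, the expansion yields exactly the values {0, 1, …, d}:

```python
from itertools import product
from math import floor, log2

d = 11                      # arbitrary positive integer upper bound on x
k = floor(log2(d)) + 1      # number of binary digits, k = floor(log2 d) + 1

# every s in {0,1}^k encodes the value x = sum_i 2^i * s_i;
# keeping only the values with x <= d recovers the set {0, 1, ..., d}
values = {sum(2 ** i * s[i] for i in range(k))
          for s in product((0, 1), repeat=k)}
print(sorted(v for v in values if v <= d))  # → [0, 1, ..., 11]
```

Note that the expansion alone ranges over {0, …, 2^k − 1}, so the original bound x ≤ d must be kept in the model.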
So, we can conclude that any MIP is reduced to a quadratic programming prob-
lem and, consequently, the general MIP is not more difficult than the general
quadratic programming problem. But a distinctive feature of integer programming
(IP) is that here the integer-valued variables are handled in a very special way at the
algorithmic level by branching on integer variables and by generating cuts.
From a practical point of view, it is more important that, introducing additional
integer (most often binary) variables, we can model many nonlinearities by linear
constraints.
1.1.1 Discrete Variables

A discrete variable x can take only a finite number of values v_1, …, v_k. For example,
in the problem of designing a communication network, the capacity of a link can be,
say, 1, 2 or 4 gigabytes. This discrete variable x can be represented as an ordinary
continuous variable by introducing k binary variables y1 , . . . , yk and writing down
the constraints
x − v1 y1 − v2 y2 − . . . − vk yk = 0, (1.2a)
y1 + y2 + . . . + yk = 1, (1.2b)
yi ∈ Z+ , i = 1, . . . , k. (1.2c)
Let us also note that, instead of declaring all variables y_i integer, it is enough to specify that (1.2b) is a generalized upper bound, i.e., a constraint in which only one variable can take a non-zero value. Generalized upper bounds are often referred to in MIP software manuals as special ordered sets of type 1 (SOS1), and they are handled by performing a special type of branching (see Sect. 6.3.2).
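For the link-capacity example, constraints (1.2) can be checked by brute force; the sketch below (plain Python of my own, with the capacities 1, 2, 4 taken from the text) enumerates the binary vectors y satisfying (1.2b)–(1.2c) and collects the values of x fixed by (1.2a):

```python
from itertools import product

v = (1, 2, 4)   # admissible capacities from the example

# enumerate binary vectors y with y1 + ... + yk = 1 (constraint (1.2b));
# constraint (1.2a) then fixes x = v1*y1 + ... + vk*yk
attainable = {sum(vi * yi for vi, yi in zip(v, y))
              for y in product((0, 1), repeat=len(v))
              if sum(y) == 1}
print(sorted(attainable))   # → [1, 2, 4]
```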
1.1.2 Fixed and Variable Costs

One of the most significant limitations of linear programming with respect to solving economic problems is that linear models cannot take fixed costs into account. In MIP, accounting for fixed costs is simple.
Let us assume that the cost of producing x units of some product is calculated as follows (Fig. 1.1):

c(x) = f + px if 0 < l ≤ x ≤ u,  c(x) = 0 if x = 0,

where f is a fixed production cost, p is the cost of producing one product unit, and l and u are the minimum and maximum production capacities.
Introducing a new binary variable y (y = 1 if the product is produced, and y = 0 otherwise) and adding the variable lower and upper bounds ly ≤ x ≤ uy, we transform the nonlinear c(x) into a linear function of two variables, c(x, y) = px + f y.
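A minimal check of this linearization (a sketch of my own; the numbers for f, p, l, u are made up): with y = 0 the bounds force x = 0 and the linear cost is 0, while with y = 1 they force l ≤ x ≤ u and the linear cost equals f + px.

```python
f, p, l, u = 5.0, 2.0, 1.0, 4.0   # fixed cost, unit cost, capacity bounds

def c(x):
    """Original nonlinear cost function."""
    return 0.0 if x == 0 else f + p * x

def c_lin(x, y):
    """Linearized cost, valid under the bounds l*y <= x <= u*y."""
    return p * x + f * y

assert c_lin(0, 0) == c(0)        # y = 0 branch: x is forced to 0
for x in (1.0, 2.5, 4.0):         # y = 1 branch: any x with l <= x <= u
    assert c_lin(x, 1) == c(x)
print("linearization agrees on both branches")
```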
1.1.3 Approximation of Nonlinear Functions

Let a nonlinear function y = f(x) be given on an interval [a, b], and let us choose a partition of this interval: a = x̄_1 < x̄_2 < · · · < x̄_r = b.
Connecting the neighboring break-points (x̄_k, ȳ_k = f(x̄_k)) and (x̄_{k+1}, ȳ_{k+1} = f(x̄_{k+1})) by line segments, we obtain a piecewise-linear approximation f̃(x) of the function f(x) (Fig. 1.2). Now we can write down the following system to describe the set of points (x, y) lying on the graph of f̃:
[Fig. 1.2: the piecewise-linear approximation of f(x) built on the break-points a = x̄_1 < x̄_2 < · · · < x̄_6 = b]
x = ∑_{k=1}^{r} λ_k x̄_k, (1.3a)
y = ∑_{k=1}^{r} λ_k ȳ_k, (1.3b)
∑_{k=1}^{r} λ_k = 1, (1.3c)
λ_k ≤ δ_k, k = 1, …, r, (1.3d)
δ_i + δ_j ≤ 1, j = 3, …, r, i = 1, …, j − 2, (1.3e)
λ_k ≥ 0, δ_k ∈ {0, 1}, k = 1, …, r. (1.3f)
Equations (1.3a)–(1.3c) ensure that the point (x, y) belongs to the convex hull of the points (x̄_1, ȳ_1), …, (x̄_r, ȳ_r). The other relations, (1.3d)–(1.3f), require that no more than two variables λ_k take nonzero values, and that the indices of these non-zero variables be consecutive. These conditions reflect the requirement that the point (x, y) must lie on some line segment connecting two neighboring break-points.
It should be noted that almost all modern commercial MIP solvers take Ineqs. (1.3d) and (1.3e) into account algorithmically by organizing branching in a special way (see Sect. 6.3.2). In this case, it is not necessary to specify these inequalities explicitly; it suffices to indicate that Eq. (1.3c) is of type SOS2 (Special Ordered Set of Type 2).
1.1.4 Approximation of Convex Functions

If f(x) is a convex function, then in many cases we can represent the relation y = f(x) without introducing integer variables. As before, given a partition a = x̄_1 < x̄_2 < · · · < x̄_r = b of the interval [a, b], we approximate f(x) with a piecewise-linear function f̃ (see Fig. 1.3).
[Fig. 1.3: a piecewise-linear approximation of a convex function f(x) on the break-points a = x̄_1 < · · · < x̄_5 = b]
1.1.5 Logical Conditions

Formally, we write down logical conditions using boolean variables and formulas. A boolean variable can take only two values: true and false. From boolean variables, using the binary logical operations ∨ (or) and ∧ (and) and the unary operation ¬ (¬x means not x), we can make up boolean formulas in much the same way that we make up algebraic expressions using arithmetic operations over real variables. For example,
(x1 ∨ ¬x2 ) ∧ (¬x1 ∨ x3 ) (1.5)
is a boolean formula. Substituting values for the boolean variables, we can calculate the value of the formula using the rules presented in Table 1.1. For example, for the truth assignment (x_1, x_2, x_3) = (true, false, false), (1.5) takes the value false.
Any boolean formula of n boolean variables can be represented in a conjunctive normal form (CNF):

⋀_{i=1}^{m} ( ⋁_{j∈S_i} x_j^{σ_j^i} ), (1.6)
where S_i ⊆ {1, …, n} (i = 1, …, m) and all σ_j^i ∈ {0, 1}. Here we use the following notation: x¹ = x and x⁰ = ¬x. Note that (1.5) is already represented as a CNF.
CNF (1.6) takes the value true only if every clause ⋁_{j∈S_i} x_j^{σ_j^i} contains at least one literal (a literal is a variable or its negation) with the value true. If we identify false with 0 and true with 1, then the negation ¬ converts x into 1 − x. In view of what has been said, the truth assignments on which (1.6) takes the value true are exactly the solutions to the following system of inequalities:
∑_{j∈S_i^1} x_j + ∑_{j∈S_i^0} (1 − x_j) ≥ 1, i = 1, …, m, (1.7)
x_j ∈ {0, 1}, j = 1, …, n.

Here, for δ ∈ {0, 1}, we use the notation S_i^δ = {j ∈ S_i : σ_j^i = δ}. For example, the CNF
(x1 ∨ x2 ∨ x3 ) ∧ (x1 ∨ ¬x2 ) ∧ (x2 ∨ ¬x3 ) ∧ (x3 ∨ ¬x1 )
takes the value of true on the sets that are solutions to the system
x1 + x2 + x3 ≥ 1,
x1 + (1 − x2 ) ≥ 1,
x2 + (1 − x3 ) ≥ 1,
x3 + (1 − x1 ) ≥ 1,
x1 , x2 , x3 ∈ {0, 1}.
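For this CNF, the equivalence between the formula and system (1.7) can be verified by enumerating all eight truth assignments (a small Python sketch of my own, not part of the book):

```python
from itertools import product

def cnf(x1, x2, x3):
    """The CNF (x1 v x2 v x3) & (x1 v ~x2) & (x2 v ~x3) & (x3 v ~x1)."""
    return ((x1 or x2 or x3) and (x1 or not x2)
            and (x2 or not x3) and (x3 or not x1))

def system(x1, x2, x3):
    """System (1.7) written for this CNF (false = 0, true = 1)."""
    return (x1 + x2 + x3 >= 1 and x1 + (1 - x2) >= 1
            and x2 + (1 - x3) >= 1 and x3 + (1 - x1) >= 1)

for x in product((0, 1), repeat=3):
    assert bool(cnf(*x)) == system(*x)
print("all 8 assignments agree")    # only (1, 1, 1) satisfies both
```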
1.2 Multiple Alternatives and Disjunctions

Suppose we require that, among m systems of inequalities

A_i x ≤ b_i, i = 1, …, m,

at least q be satisfied. For example, if two jobs i and j are executed on the same machine, then we must require the validity of the following disjunction:
ei − s j ≤ 0 or e j − si ≤ 0,
where si and ei are, respectively, the start and end times of job i.
Introducing binary variables y_i for i = 1, …, m, with y_i = 1 if A_i x ≤ b_i holds (and y_i = 0 otherwise), we can take the required condition into account as follows:

A_i x ≤ b_i + M(1 − y_i), i = 1, …, m,
∑_{i=1}^{m} y_i ≥ q,
y_i ∈ {0, 1}, i = 1, …, m,

where M is a sufficiently large constant.
x1 ≥ a or x2 ≥ b.
1.2.1 Floor Planning

0 ≤ x_i ≤ W − w_i,  0 ≤ y_i ≤ H − h_i,  i = 1, …, n. (1.8)

To ensure that two modules, i and j, do not intersect, at least one of the following four inequalities must be valid:

x_i + w_i ≤ x_j,  x_j + w_j ≤ x_i,  y_i + h_i ≤ y_j,  y_j + h_j ≤ y_i.
Introducing four binary variables z_ij^l, z_ij^r, z_ij^b, and z_ij^a, we can represent this disjunction by the following system of inequalities:

x_i + w_i ≤ x_j + W(1 − z_ij^l),
x_j + w_j ≤ x_i + W(1 − z_ij^r),
y_i + h_i ≤ y_j + H(1 − z_ij^b), (1.9)
y_j + h_j ≤ y_i + H(1 − z_ij^a),
z_ij^l + z_ij^r + z_ij^b + z_ij^a ≥ 1.
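We can brute-force the four binaries to confirm that system (1.9) is satisfiable exactly when the two modules do not overlap (a sketch of my own, under assumed board dimensions W = H = 10 and made-up module data):

```python
from itertools import product

W, H = 10, 10   # assumed board dimensions (big-M constants of (1.9))

def system_19_feasible(pi, pj):
    """Brute-force the four binaries of system (1.9) for two placed
    modules pi = (x, y, w, h) and pj."""
    (xi, yi, wi, hi), (xj, yj, wj, hj) = pi, pj
    for zl, zr, zb, za in product((0, 1), repeat=4):
        if (xi + wi <= xj + W * (1 - zl) and
                xj + wj <= xi + W * (1 - zr) and
                yi + hi <= yj + H * (1 - zb) and
                yj + hj <= yi + H * (1 - za) and
                zl + zr + zb + za >= 1):
            return True
    return False

print(system_19_feasible((0, 0, 2, 1), (2, 0, 1, 1)))  # → True (disjoint)
print(system_19_feasible((0, 0, 2, 2), (1, 1, 2, 2)))  # → False (overlap)
```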
If the modules may be rotated by 90°, we can introduce binary variables δ_i (δ_i = 1 if module i is rotated, so that its width and height are swapped) and rewrite the system as:

0 ≤ x_i ≤ W − ((1 − δ_i)w_i + δ_i h_i), i = 1, …, n,
0 ≤ y_i ≤ H − ((1 − δ_i)h_i + δ_i w_i), i = 1, …, n,
x_i + (1 − δ_i)w_i + δ_i h_i ≤ x_j + W(1 − z_ij^l), i = 1, …, n − 1; j = i + 1, …, n,
x_j + (1 − δ_j)w_j + δ_j h_j ≤ x_i + W(1 − z_ij^r), i = 1, …, n − 1; j = i + 1, …, n,
y_i + (1 − δ_i)h_i + δ_i w_i ≤ y_j + H(1 − z_ij^b), i = 1, …, n − 1; j = i + 1, …, n,
y_j + (1 − δ_j)h_j + δ_j w_j ≤ y_i + H(1 − z_ij^a), i = 1, …, n − 1; j = i + 1, …, n,
1.2.2 Linear Complementarity Problem

Ax + y = b, (1.10a)
x^T y = 0, (1.10b)
x, y ≥ 0. (1.10c)

If upper bounds x_i ≤ g_i and y_i ≤ h_i are known,¹ the complementarity condition (1.10b) can be replaced by the linear constraints

x_i ≤ g_i z_i,  y_i ≤ h_i(1 − z_i),  z_i ∈ {0, 1},  i = 1, …, n.
1 For an integer matrix A, we can estimate the values of xi and yi from Cramer’s rule (do this as an
exercise). But from a practical point of view, such estimates are too rough.
x ≥ 0, y ≥ 0,
Ax ≥ b,
c + Dx − A^T y ≥ 0, (1.13)
y^T(Ax − b) = 0,
(c + Dx − A^T y)^T x = 0.
If (x, y) is a solution to (1.13), then the point x is called a stationary point (or KKT-
point) for (1.12). If D is a positive semi-definite matrix, then the objective function
is convex, and, consequently, every stationary point is an optimal solution to (1.12).
Let us consider the following MIP:

z → max,
0 ≤ Au − bz ≤ e − α,
0 ≤ Du − A^T v + cz ≤ e − β,
0 ≤ u ≤ β, (1.14)
0 ≤ v ≤ α,
0 ≤ z ≤ 1,
α ∈ {0, 1}^m, β ∈ {0, 1}^n.
We denote by z∗ the optimal objective value in (1.14). It is not difficult to verify the
following statements.
1. If z∗ = 0, then problem (1.12) does not have stationary points.
2. If z* > 0 and (u*, v*, z*) is an optimal solution to (1.14), then the vectors x* = (1/z*)u*, y* = (1/z*)v* make up a solution to (1.13), and hence x* is a stationary point for (1.12).
Objective (1.15a) is to minimize the sum of the fixed and variable transportation costs. Inequalities (1.16d) ensure that each consumer j has at least k_j suppliers. Together, (1.16e) and (1.16f) imply that y_ij = 0 only if x_ij = 0. Inequalities (1.16f) require that the volume of each delivery be no less than the minimum delivery volumes of the supplier and consumer involved.
1.4 Polyhedra
We can say that a polyhedron is the intersection of a finite number of half-spaces. Let P ⊆ R^n be a polyhedron of dimension d, and let H(a, β) be a hyperplane. If P completely belongs to one of the half-spaces H≤(a, β) or H≥(a, β) and touches the hyperplane H(a, β) (P ∩ H(a, β) ≠ ∅), then P ∩ H(a, β) is called a face, and H(a, β) a supporting hyperplane, of the polyhedron P. We specifically distinguish three types of faces:
• facet: face of dimension d − 1;
• vertex: face of dimension 0 (a point);
• edge: face of dimension 1 (a segment).
Two vertices of a polyhedron are called adjacent if they are connected by an edge (lie on one edge).
A supporting hyperplane whose face is a facet is called a facet defining hyperplane. For a full-dimensional polyhedron (of dimension n), the facet defining hyperplanes are uniquely defined (up to multiplication by a positive scalar). If, in a system of inequalities Ax ≤ b, a hyperplane H(A_i, b_i) is not facet defining for the polyhedron P(A, b) = {x ∈ R^n : Ax ≤ b}, then the corresponding inequality can be excluded from the system without expanding the set of its solutions. In practice, we can recognize facet defining hyperplanes based on the following statement.
Proposition 1.2. A hyperplane is facet defining for a polyhedron P of dimension d
if and only if it contains at least d vertices of P.
A three-dimensional polytope P, shown in Fig. 1.4, is formed by the intersection
of half-spaces, which are given by the inequalities:
x1 + x2 + x3 ≤ 4,
x2 ≤ 2,
x3 ≤ 3,
3x1 + x3 ≤ 6,
x1 ≥ 0,
x2 ≥ 0,
x3 ≥ 0.
It has 7 facets, 8 vertices (depicted as bold dots), and 13 edges (segments that connect the vertices).
[Fig. 1.4: the polytope P with vertices (0, 0, 0), (2, 0, 0), (2, 2, 0), (0, 2, 0), (0, 2, 2), (0, 1, 3), (0, 0, 3), (1, 0, 3)]
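The vertices of this polytope can be recovered computationally: every vertex is the unique solution of three of the seven defining equations that also satisfies all the inequalities. The sketch below (my own, using exact rational arithmetic) enumerates them by brute force.

```python
from itertools import combinations
from fractions import Fraction

# the seven half-spaces a·x <= b defining the polytope of Fig. 1.4
A = [(1, 1, 1), (0, 1, 0), (0, 0, 1), (3, 0, 1),
     (-1, 0, 0), (0, -1, 0), (0, 0, -1)]
b = [4, 2, 3, 6, 0, 0, 0]

def solve3(rows, rhs):
    """Solve a 3x3 linear system by Gauss-Jordan elimination over the
    rationals; return None if the system is singular."""
    M = [[Fraction(v) for v in row] + [Fraction(r)]
         for row, r in zip(rows, rhs)]
    for col in range(3):
        piv = next((r for r in range(col, 3) if M[r][col] != 0), None)
        if piv is None:
            return None
        M[col], M[piv] = M[piv], M[col]
        for r in range(3):
            if r != col and M[r][col] != 0:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return tuple(M[i][3] / M[i][i] for i in range(3))

vertices = set()
for idx in combinations(range(7), 3):
    x = solve3([A[i] for i in idx], [b[i] for i in idx])
    if x is not None and all(sum(a * xi for a, xi in zip(row, x)) <= bi
                             for row, bi in zip(A, b)):
        vertices.add(x)

print(len(vertices))   # → 8, matching the vertices listed for Fig. 1.4
```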
1.5 Good and Ideal Formulations, Reformulation

In many cases, the same problem can be formulated in several different ways, and not all formulations are equivalent. In this section we will try to understand why one formulation is better than another.

Let us consider a mixed integer set

P(A, b; S) = {x ∈ R^n : Ax ≤ b, x_j ∈ Z for j ∈ S}. (1.17)
[Fig. 1.5: three formulations of the same set of seven integer points in the plane; the rightmost polyhedron is the convex hull of these points]
Figure 1.5 presents three different formulations for a set of seven points in the plane. The rightmost figure shows the ideal formulation, whose polyhedron coincides with the convex hull of the given set of points. If it is possible to write down an ideal formulation for some MIP, then this MIP can be solved as an LP.
If a MIP is solved by the branch-and-bound method (see Sect. 6.1), as a rule,
the use of a stronger formulation leads to a reduction in the solution time due to
the decrease in the number of branchings. An obvious way to strengthen an existing
formulation P(A, b; S) is to add new inequalities valid for P(A, b; S), but not valid for
the relaxation polyhedron P(A, b). We say that an inequality is valid for some set if
all points from this set satisfy this inequality. Let inequalities aT x ≤ u and α T x ≤ v
hold for all points in P(A, b; S). It is said that the inequality aT x ≤ u is stronger than
the inequality α T x ≤ v (or the inequality aT x ≤ u dominates the inequality α T x ≤ v)
if
P(A, b) ∩ H≤ (a, u) ⊂ P(A, b) ∩ H≤ (α, v).
and

x ∈ {0, 1}^{n+1},  x_j ≤ x_{n+1}, j = 1, …, n. (1.19)
Solution. Let

P_1 = {x ∈ R^{n+1} : ∑_{j=1}^{n} x_j ≤ n·x_{n+1}, 0 ≤ x_j ≤ 1, j = 1, …, n + 1},
P_2 = {x ∈ R^{n+1} : x_j ≤ x_{n+1}, j = 1, …, n, 0 ≤ x_j ≤ 1, j = 1, …, n + 1}

be the relaxation polytopes for (1.18) and (1.19), respectively. Summing the inequalities x_j ≤ x_{n+1} for j = 1, …, n, we obtain the inequality ∑_{j=1}^{n} x_j ≤ n·x_{n+1}.
If 0 < x̄_{n+1} < 1, then, writing

x̄ = (1 − x̄_{n+1}) · 0 + x̄_{n+1} · (1/x̄_{n+1}) x̄,

where (1/x̄_{n+1}) x̄ ∈ P_2, we see that x̄ lies on the segment joining two points of the polytope P_2. Hence, x̄ is not a vertex of P_2.
So, if x̄ is a vertex of P_2, then its component x̄_{n+1} is 0 or 1. If x̄_{n+1} = 0, then all other components x̄_j are also equal to zero. If x̄_{n+1} = 1, then the inequalities x_j ≤ x_{n+1} become x_j ≤ 1, and the point (x̄_1, …, x̄_n) must be one of the vertices of the cube [0, 1]^n, which are all integer. ⊓⊔
Let us note that the disaggregation of inequalities, i.e., replacing one inequality with a system of inequalities that is stronger than the initial one, is a powerful preprocessing technique (see Sect. 6.5).
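The strict inclusion P_2 ⊂ P_1 from the example above is easy to exhibit numerically: for n = 4 (an arbitrary choice of mine), the fractional point with x_1 = 1, x_2 = x_3 = x_4 = 0, x_5 = 1/4 satisfies the aggregated constraint of P_1 but violates the disaggregated constraints of P_2.

```python
n = 4
x = [1.0] + [0.0] * (n - 1)    # components x_1, ..., x_n
xn1 = 1.0 / n                  # the fractional component x_{n+1}

in_P1 = sum(x) <= n * xn1              # aggregated: sum x_j <= n * x_{n+1}
in_P2 = all(xj <= xn1 for xj in x)     # disaggregated: x_j <= x_{n+1}
print(in_P1, in_P2)            # → True False
```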
α^T x̄ = u^T A x̄ ≤ u^T b ≤ β.

Let ū ∈ R₊^m be an optimal solution to the right-hand LP. Then α = A^T ū and β ≥ ū^T b. ⊓⊔
Objective (1.20a) is to minimize total expenses over all T periods. Each balance
equation in (1.20b) relates two neighboring periods: the amount of product, st−1 , in
the warehouse at the end of period t − 1 plus the amount, xt , produced in period t
equals the demand, dt , in period t plus the amount, st , stored in the warehouse at the
end of period t. The inequalities in (1.20c) impose the implications: yt = 0 ⇒ xt = 0.
If all fixed costs f_t are positive, then for an optimal solution (x*, y*) of the relaxation LP for (1.20), we have y_t* = x_t*/D_t (here D_t = d_t + · · · + d_T is the total remaining demand). Consequently, for every producing period t (one with x_t* > 0), except for the last one, y_t* is fractional, since y_t* = x_t*/D_t ≤ d_t/D_t < 1. Many integer variables taking fractional values is a clear indicator that the formulation being used is weak.
To obtain an ideal formulation for our lot-sizing problem, we need to add to (1.20) the system of (l, S)-inequalities:

∑_{t∈S} x_t + ∑_{t∈S̄} d_{tl} y_t ≥ d_{1l},  l = 1, …, T,  S ⊆ {1, …, l}, (1.21)

where S̄ = {1, …, l} \ S and d_{ij} = ∑_{t=i}^{j} d_t. These inequalities reflect the following simple observation: the amount of product produced in periods t ∈ S (∑_{t∈S} x_t) plus the maximum amount of product that can be produced in periods t ∈ S̄ for use in the first l periods (∑_{t∈S̄} d_{tl} y_t) must be no less than the product demand in these first l periods (d_{1l}).
We see that the ideal formulation for the set of feasible solutions to (1.20) contains
exponentially many inequalities. Although this is not a big obstacle to using
this formulation in practice, it is impossible to feed such a MIP directly to standard
software; therefore, to represent any exponentially large family of inequalities,
we need to implement a specialized separation procedure (see, for example,
Sect. 6.6).
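For the (l, S)-inequalities, the standard separation idea can be sketched as follows: for each l, put a period t ≤ l into S whenever x*_t ≤ d_{tl} y*_t, and report a cut if the resulting left-hand side falls below d_{1l}. The Python sketch below is a hedged illustration; the function name and the O(T²) implementation are chosen for clarity, not efficiency:

```python
def separate_lS(x, y, d):
    """Separation sketch for the (l,S)-inequalities: for each l, the most
    violated cut puts t into S when x_t <= d_{tl} * y_t; a cut is found if
    sum_{t in S} x_t + sum_{t not in S} d_{tl} y_t < d_{1l}.
    Periods are 0-based; d_{tl} = d_t + ... + d_l."""
    T = len(d)
    for l in range(T):
        d1l = sum(d[:l + 1])
        S, lhs = [], 0.0
        for t in range(l + 1):
            dtl = sum(d[t:l + 1])
            if x[t] <= dtl * y[t]:
                S.append(t)
                lhs += x[t]
            else:
                lhs += dtl * y[t]
        if lhs < d1l - 1e-9:
            return l, S            # violated (l, S)-inequality
    return None                    # all (l, S)-inequalities hold
```

For d = (2, 3) and the fractional point x = (2, 3), y = (0.4, 0.6), the routine reports a violated cut already for l = 0, while the integer point x = (2, 3), y = (1, 1) passes.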
We can strengthen (1.20) in another way by disaggregating the decision variables:
x_t = ∑_{τ=t}^T x_{tτ}, where, for t = 1, . . . , T and τ = t, . . . , T, the new variable x_{tτ}
represents the amount of the product produced in period t for period τ.
First, we exclude from (1.20) the variables s_t. Adding together the balance equations
s_{k−1} + x_k = d_k + s_k for k = 1, . . . , t, we obtain
s_t = s_0 + ∑_{k=1}^t x_k − ∑_{k=1}^t d_k.
Using these equalities, we rewrite the objective function in the following way:
∑_{t=1}^T ( f_t y_t + c_t x_t + h_t (s_0 + ∑_{k=1}^t x_k − ∑_{k=1}^t d_k) )
  = ∑_{t=1}^T ( f_t y_t + w_t x_t ) + K = ∑_{t=1}^T f_t y_t + ∑_{t=1}^T ∑_{τ=t}^T w_t x_{tτ} + K,
where w_t = c_t + h_t + · · · + h_T and K = ∑_{t=1}^T h_t (s_0 − ∑_{k=1}^t d_k).
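The rewriting above can be sanity-checked numerically. The sketch below (random data, y fixed to one; all names are illustrative) evaluates the original objective through the stock recursion and the transformed expression ∑ f_t y_t + ∑ w_t x_t + K, which agree:

```python
import random

random.seed(1)
T = 5
f = [random.uniform(1, 10) for _ in range(T)]   # fixed costs
c = [random.uniform(1, 5) for _ in range(T)]    # production costs
h = [random.uniform(0.1, 1) for _ in range(T)]  # holding costs
d = [random.randint(1, 6) for _ in range(T)]    # demands
x = [random.uniform(0, 8) for _ in range(T)]    # arbitrary production amounts
y = [1.0] * T
s0 = 3.0

# original objective, with stocks from s_t = s_{t-1} + x_t - d_t
s, orig = s0, 0.0
for t in range(T):
    s = s + x[t] - d[t]
    orig += f[t] * y[t] + c[t] * x[t] + h[t] * s

# transformed objective: sum_t (f_t y_t + w_t x_t) + K
w = [c[t] + sum(h[t:]) for t in range(T)]                  # w_t = c_t + h_t + ... + h_T
K = sum(h[t] * (s0 - sum(d[:t + 1])) for t in range(T))
trans = sum(f[t] * y[t] + w[t] * x[t] for t in range(T)) + K
```

The two values coincide because the coefficient of x_k collected from all holding terms is exactly ∑_{t≥k} h_t.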
In the new variables x_{tτ}, (1.20) can be reformulated as follows:
∑_{t=1}^T f_t y_t + ∑_{t=1}^T ∑_{τ=t}^T w_t x_{tτ} → min,
∑_{t=1}^τ x_{tτ} = d_τ ,  τ = 1, . . . , T,      (1.22)
0 ≤ x_{tτ} ≤ d_τ y_t ,  t = 1, . . . , T; τ = t, . . . , T,
y_t ∈ {0, 1},  t = 1, . . . , T.
It is not difficult to show that, among the solutions of the relaxation LP for (1.22),
there are solutions (x*, y*) for which all components of the vector y* are integer,
and x*_{tτ} > 0 implies x*_{tτ} = d_τ, i.e., the whole demand for the product in
any period τ is fully produced in a single period. For this reason, (1.22) can be
considered an "almost" ideal formulation.
The main drawback of many extended formulations is their large size. In our case,
we replaced Formulation (1.20), having 3T variables and 2T nontrivial³ constraints,
with Formulation (1.22), having T(T + 1)/2 variables and 2T constraints. For example,
if T = 100, we have only 300 variables in the first case, and 5050 in the second.
The difference is huge! Sometimes it is more efficient in practice to use so-called
approximate extended formulations; such a formulation is obtained by adding
to the basic compact formulation only a part of the "most important" variables and
constraints of the extended formulation.
A transportation network is given by a directed graph (digraph) G = (V, E). For each
node v ∈ V , we know the demand dv for some product. If dv > 0, then v is a demand
node; if dv < 0, then v is a supply node; dv = 0 for transit nodes. It is assumed
that supply and demand are balanced: ∑v∈V dv = 0. The capacity of an arc e ∈ E is
ue > 0, and the cost of shipping xe > 0 units of product along this arc is fe + ce xe .
Naturally, if the product is not moved through the arc (xe = 0), then nothing is paid.
The fixed charge network flow problem (FCNF) is to decide on how to transport the
product from the supply to the demand nodes so that the transportation expenses are
minimum.
The FCNF problem appears as a subproblem in many practical applications such
as designing transportation and telecommunication networks, or optimizing supply
chains.
Introducing the variables
• xe : flow (quantity of shipping product) through arc e ∈ E,
• ye = 1 if product is shipped (xe > 0) through arc e, and ye = 0 otherwise,
we formulate the FCNF problem as follows:
³ Normally, the lower and upper bounds for variables are called trivial constraints.
∑_{e∈E} ( f_e y_e + c_e x_e ) → min,      (1.23a)
∑_{e∈E(V,v)} x_e − ∑_{e∈E(v,V)} x_e = d_v ,  v ∈ V,      (1.23b)
0 ≤ x_e ≤ u_e y_e ,  e ∈ E,      (1.23c)
y_e ∈ {0, 1},  e ∈ E.      (1.23d)
Here E(V, v) (resp., E(v, V)) denotes the set of arcs from E that enter (resp.,
leave) a node v ∈ V.
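As a small illustration of this notation, the sketch below (a hypothetical three-node network) computes the incoming and outgoing flow of each node and checks the balance equations (1.23b):

```python
def balance_residuals(V, E, flows, demand):
    """Residual of (1.23b) at each node: inflow - outflow - d_v.
    E is a list of arcs (i, j); flows[k] is the flow x_e on arc E[k]."""
    res = {}
    for v in V:
        inflow = sum(flows[k] for k, (i, j) in enumerate(E) if j == v)   # arcs in E(V, v)
        outflow = sum(flows[k] for k, (i, j) in enumerate(E) if i == v)  # arcs in E(v, V)
        res[v] = inflow - outflow - demand[v]
    return res

# hypothetical network: supply node 'a' (d = -4), transit node 'b', demand node 'c'
V = ['a', 'b', 'c']
E = [('a', 'b'), ('b', 'c')]
flows = [4.0, 4.0]
demand = {'a': -4.0, 'b': 0.0, 'c': 4.0}
res = balance_residuals(V, E, flows, demand)
```

A residual of zero at every node means the flow satisfies (1.23b).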
Objective (1.23a) is to minimize the total transportation expenses. Each balance
equation in (1.23b) requires that the number of flow units entering a particular node
be equal to the number of flow units leaving this node. The variable upper bounds
(1.23c) are capacity restrictions with the following meaning:
• the flow through any arc cannot exceed the arc capacity;
• if some arc is not used for shipping product (ye = 0), then the flow through this
arc is zero (xe = 0).
Since, for any optimal solution of the relaxation LP, y_e = x_e/u_e, (1.23) cannot
be a strong formulation if the capacities, u_e, of many arcs are much greater than the
flows, x_e, along these arcs. This happens, for example, in problems without capacity
limitations, when the numbers u_e are rough upper estimates of the values of
the arc flows x_e.
We can strengthen (1.23) by disaggregating the flow variables x_e. In what follows,
we assume that all the values f_e and c_e are non-negative. In the new formulation,
for each flow unit along any arc, we will indicate its origin supply node and its
destination demand node. Let us denote by S and T, respectively, the sets of supply
(d_v < 0) and demand (d_v > 0) nodes. We introduce two new families of variables:
• q_{st}: number of product units supplied from node s ∈ S to node t ∈ T;
• z^{st}_e: the part of the flow sent from s ∈ S to t ∈ T that goes through arc e ∈ E.
With these new variables, we rewrite (1.23) as follows:
∑_{e∈E} ( f_e y_e + c_e x_e ) → min,      (1.24a)
∑_{e∈E(V,s)} z^{st}_e − ∑_{e∈E(s,V)} z^{st}_e = −q_{st} ,  s ∈ S, t ∈ T,      (1.24b)
∑_{s∈S} q_{st} = d_t ,  t ∈ T,      (1.24e)
∑_{(s,t)∈S×T} z^{st}_e = x_e ,  e ∈ E,      (1.24f)
In this formulation, (1.24b) and (1.24c) are flow conservation constraints: for
s ∈ S and t ∈ T, z^{st} ∈ R^E is a flow from s to t of value q_{st}, i.e., q_{st} flow units are sent
from s to t, and, for all nodes other than s and t, the incoming and outgoing flows are
equal. Equations (1.24d) and (1.24e) ensure that each supply node sends and each
demand node receives the required quantity of product. Equations (1.24f) determine
the total flow along any arc by summing up all the flows going from supply to
demand nodes. The variable upper bounds (1.24g) impose the capacity limitations
for the arc flows: the flow z^{st}_e along arc e sent from s to t can exceed neither the capacity
u_e of arc e, nor the supply −d_s at node s, nor the demand d_t at node t. In fact, it is these more
precise variable bounds that make (1.24) stronger than (1.23).
In general, not all jobs can be processed. For a given schedule, let U_j = 0 if job j
is processed, and U_j = 1 otherwise. Then the problem is to find a schedule
for which the weighted number of unprocessed jobs, ∑_{j=1}^n w_j U_j, is minimum.
Alternatively, we can say that our goal is to maximize the weighted number of processed
jobs, which is ∑_{j=1}^n w_j (1 − U_j).
In models with continuous time, the main variables correspond to events that are
defined as the moments when individual jobs begin or end. In our model, we use the
following variables:
• s j : start time of job j;
• yi j = 1 if job j is accomplished by processor i, and yi j = 0 otherwise;
• xi, j1 , j2 = 1 if both jobs, j1 and j2 , are carried out on processor i, and j1 is finished
before j2 starts, and xi, j1 , j2 = 0 otherwise.
In these variables, we formulate our scheduling problem as follows:
∑_{j=1}^n w_j ∑_{i∈P_j} y_{ij} → max,      (1.25a)
∑_{i∈P_j} y_{ij} ≤ 1,  j = 1, . . . , n,      (1.25b)
t_j = ∑_{i∈P_j} p_{ij} y_{ij} ,  j = 1, . . . , n,      (1.25c)
r_j ≤ s_j ≤ d_j − t_j ,  j = 1, . . . , n,      (1.25d)
s_{j2} − s_{j1} + M(1 − x_{i,j1,j2}) ≥ p_{i,j1} ,  j1, j2 = 1, . . . , n, j1 ≠ j2, i ∈ P_{j1} ∩ P_{j2},      (1.25e)
x_{i,j1,j2} + x_{i,j2,j1} ≤ y_{i,j2} ,  j1, j2 = 1, . . . , n, j1 ≠ j2, i ∈ P_{j1} ∩ P_{j2},      (1.25f)
y_{i,j1} + y_{i,j2} − x_{i,j1,j2} − x_{i,j2,j1} ≤ 1,  j1 = 1, . . . , n − 1, j2 = j1 + 1, . . . , n, i ∈ P_{j1} ∩ P_{j2},      (1.25g)
s_{j1} + t_{j1} ≤ s_{j2} ,  (j1, j2) ∈ E,      (1.25h)
y_{ij} ∈ {0, 1},  j = 1, . . . , n, i ∈ P_j,      (1.25i)
x_{i,j1,j2} ∈ {0, 1},  j1, j2 = 1, . . . , n, j1 ≠ j2, i ∈ P_{j1} ∩ P_{j2},      (1.25j)
where
M = max_{1≤j≤n} d_j − min_{1≤j≤n} r_j.
y_{j1} − y_{j2} ≥ 0,  (j1, j2) ∈ E,      (1.26f)
s_{j2} − s_{j1} ≥ t_{j1} ,  (j1, j2) ∈ E,      (1.26g)
x_{jit} ∈ {0, 1},  i ∈ P_j, t = r_j, . . . , d_j − p_{ij}, j = 1, . . . , n,      (1.26h)
y_j ∈ {0, 1},  j = 1, . . . , n.      (1.26i)
∑_{j=1}^n w_j ∑_{i∈P_j} ∑_{t=l_j}^{u_j − p_{ij}} (t + p_{ij}) x_{jit} → min,
relaxation schedule is obtained by slicing jobs into pieces and then each piece is
processed without interruption.
The main disadvantage of the time-index formulation is its size: even for one-machine
problems, there are n + T constraints and up to nT variables. As a
consequence, for instances with many jobs and long processing intervals [r_j, d_j], the
relaxation LPs will be very large, and their solution times will be long. Nevertheless,
the time-index formulation can be used in practice for solving scheduling
problems with relatively short planning horizons.
F(0) = 0,
F(β) = max_{j: a_j ≤ β} { F(β − a_j) + c_j },  β = 1, . . . , b.      (1.29)
As usual, we assume that the maximum over an empty set of alternatives is equal
to −∞.
Calculating the values of F(β) using (1.29) is called the direct step of dynamic programming.
When all values F(β) are calculated, an optimal solution, x*, to (1.27)
can be found by performing the following reverse step.
Start with β ∈ arg max_{0≤q≤b} F(q), and set x*_j = 0 for j = 1, . . . , n.
While β > 0, do the following computations:
find an index j such that F(β) = F(β − a_j) + c_j,
and set x*_j := x*_j + 1, β := β − a_j.
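The direct and reverse steps can be sketched in Python as follows; the data a = (5, 4, 2, 3), c = (4, 5, 1, 2), b = 7 are inferred from the computations of Example 1.2 below:

```python
NEG_INF = float('-inf')

def knapsack_dp(a, c, b):
    """Direct step (1.29): F[beta] = max over j with a_j <= beta
    of F[beta - a_j] + c_j; F[beta] = -inf if beta is unreachable."""
    n = len(a)
    F = [0] + [NEG_INF] * b
    for beta in range(1, b + 1):
        for j in range(n):
            if a[j] <= beta and F[beta - a[j]] != NEG_INF:
                F[beta] = max(F[beta], F[beta - a[j]] + c[j])
    return F

def reverse_step(F, a, c, b):
    """Reverse step: recover an optimal x* from the table F."""
    n = len(a)
    beta = max(range(b + 1), key=lambda q: F[q])
    x = [0] * n
    while beta > 0:
        # find an index j with F(beta) = F(beta - a_j) + c_j
        j = next(j for j in range(n)
                 if a[j] <= beta and F[beta] == F[beta - a[j]] + c[j])
        x[j] += 1
        beta -= a[j]
    return x

# data inferred from the computations of Example 1.2
a, c, b = [5, 4, 2, 3], [4, 5, 1, 2], 7
F = knapsack_dp(a, c, b)
x_opt = reverse_step(F, a, c, b)
```

With these data the table reproduces the values F(2) = 1, . . . , F(7) = 7 computed in the example, and the reverse step returns x* = (0, 1, 0, 1).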
Example 1.2 We need to solve the problem
4x_1 + 5x_2 + x_3 + 2x_4 → max,
5x_1 + 4x_2 + 2x_3 + 3x_4 ≤ 7,  x ∈ Z^4_+.
F(0) = 0,
F(1) = −∞,
F(2) = F(0) + 1 = 1,
F(3) = max{F(1) + 1, F(0) + 2} = max{−∞, 2} = 2,
F(4) = max{F(0) + 5, F(2) + 1, F(1) + 2} = max{5, 2, −∞} = 5,
F(5) = max{F(0) + 4, F(1) + 5, F(3) + 1, F(2) + 2}
= max{4, −∞, 3, 3} = 4,
F(6) = max{F(1) + 4, F(2) + 5, F(4) + 1, F(3) + 2}
= max{−∞, 6, 6, 4} = 6,
F(7) = max{F(2) + 4, F(3) + 5, F(5) + 1, F(4) + 2} = max{5, 7, 5, 7} = 7.
Now we can find an optimal solution x*. As F(7) = max_{0≤q≤7} F(q), we start with
β = 7 and x* = (0, 0, 0, 0)^T. Since F(7) = F(7 − a_2) + c_2, we set x*_2 = 0 + 1 = 1
and β = 7 − a_2 = 3. Next, as F(3) = F(3 − a_4) + c_4, we set x*_4 = 0 + 1 = 1 and
β = 3 − a_4 = 0.
Therefore, the point x* = (0, 1, 0, 1)^T is a solution to the knapsack problem of
Example 1.2. □
1.9.2 0,1-Knapsack
F_0(0) = 0,  F_0(β) = −∞ for β = 1, . . . , b.
Formula (1.30) is not the only possible one. Suppose that all values c j are integer.
Let C ∈ Z be an upper bound for the optimal objective value in (1.28). For k =
1, . . . , n and z = 0, . . . ,C, we define
G_k(z) def= min { ∑_{j=1}^k a_j x_j : ∑_{j=1}^k c_j x_j = z, x_j ∈ {0, 1}, j = 1, . . . , k }.
After calculating all values G_k(z), we can find an optimal solution, x*, to (1.28) by
performing the following reverse step.
Start with z ∈ arg max{q : G_n(q) ≤ b}.
For k = n, . . . , 1:
if G_k(z) = G_{k−1}(z), set x*_k = 0; otherwise, set x*_k = 1 and z := z − c_k.
The calculations by Formula (1.31) can be performed in O(nC) time using
O(nC) memory cells.
Comparing the computational complexity of the two recurrence formulas, (1.30)
and (1.31), we conclude that (1.30) should be used if b < C; otherwise, we use (1.31).
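A sketch of recursion (1.31) together with its reverse step, run on the data of Example 1.4 (a = (35, 24, 69, 75), c = (2, 1, 3, 4), b = 100, C = 5):

```python
INF = float('inf')

def knapsack01_by_profit(a, c, b, C):
    """Recursion (1.31): G[k][z] is the minimum weight of a 0,1-solution
    over items 1..k with profit exactly z; optimum = max{z : G[n][z] <= b}."""
    n = len(a)
    G = [[INF] * (C + 1) for _ in range(n + 1)]
    G[0][0] = 0
    for k in range(1, n + 1):
        for z in range(C + 1):
            G[k][z] = G[k - 1][z]                      # choice x_k = 0
            if z >= c[k - 1]:                          # choice x_k = 1
                G[k][z] = min(G[k][z], G[k - 1][z - c[k - 1]] + a[k - 1])
    z_opt = max(z for z in range(C + 1) if G[n][z] <= b)
    # reverse step: if G_k(z) = G_{k-1}(z) then x_k = 0, otherwise x_k = 1
    x, z = [0] * n, z_opt
    for k in range(n, 0, -1):
        if G[k][z] != G[k - 1][z]:
            x[k - 1] = 1
            z -= c[k - 1]
    return z_opt, x

# data of Example 1.4
z_opt, x_opt = knapsack01_by_profit([35, 24, 69, 75], [2, 1, 3, 4], 100, 5)
```

The routine recovers the optimal value 5 (with weight G_4(5) = 99 ≤ 100) and the solution x* = (0, 1, 0, 1), matching the example.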
Example 1.3 We need to solve the problem
Solution. Obviously, the optimal objective value in this problem is greater than
b = 7. Therefore, we use Formula (1.30). The calculations are presented in Table 1.2.
To find an optimal solution x∗ , we need to execute the reverse step starting with
β = 7:
Hence, the point x* = (1, 0, 0, 1)^T is an optimal solution to the 0,1-knapsack problem
from Example 1.3. □
Solution. First, we estimate from above the optimal objective value. To do this,
we solve the relaxation LP that is obtained from the original problem by allowing
all binary variables to take values from the interval [0, 1]. The algorithm for solving
LPs with only one constraint is very simple (see Exercise 3.4).
1. First, we sort the ratios c_j/a_j in non-increasing order:
c_1/a_1 = 2/35 ≥ c_4/a_4 = 4/75 ≥ c_3/a_3 = 3/69 ≥ c_2/a_2 = 1/24.
2. Then we compute the solution x̂:
x̂_1 = 1,  x̂_4 = (100 − 35)/75 = 13/15,  x̂_3 = x̂_2 = 0.
As an upper bound, we take the number C = ⌊c^T x̂⌋ = ⌊2 + 4 · (13/15)⌋ = 5.
Since C = 5 < 100 = b, we will use Formula (1.31). The calculations are presented
in Table 1.3.
Since G4 (5) = 99 < 100 = b, the optimal objective value to our example problem
is 5. To find an optimal solution x∗ , we need to execute the reverse step starting from
z = 5:
Thus, the point x* = (0, 1, 0, 1)^T is an optimal solution to the 0,1-knapsack problem
from Example 1.4. □
1.10 Notes
Sect. 1.2. The general linear complementarity problem is NP-hard, but a number of
special cases of this problem can be solved by simplex-like algorithms (see [95]).
Nevertheless, formulating the linear complementarity problem as a MIP allows
us to seek a solution with additional properties by assigning an appropriate
objective function.
The Karush-Kuhn-Tucker (KKT) optimality conditions for non-linear constrained
optimization problems can be found in almost any book on non-linear programming,
for example, in [29, 94].
Sect. 1.4. Fundamental works on systems of linear inequalities and polyhedral theory
are [66, 34]. Many aspects of polyhedral theory related to optimization are also
presented in [122, 102].
Sect. 1.5. The importance of strong MIP formulations was not recognized immediately.
In the literature of, say, thirty years ago, we can find MIP formulations that
are considered weak (bad) today. The principles on which strong formulations are
built are discussed in the reviews [141, 144].
Sect. 1.7.1. A single-product lot-sizing model with unlimited production capacities,
as well as a dynamic programming algorithm for its solution, are described in [135].
Approximate and extended formulations for various lot-sizing problems are discussed
in [133]. Inequalities (1.21), which complete Formulation (1.20) to an ideal
one, were obtained in [17].
Sect. 1.8. MIP formulations are known for a wide variety of scheduling problems
[114, 2]. The formulations with indexing by time were first introduced in [47].
The handbook [111] provides an up-to-date coverage of theoretical models and practical
applications of modern scheduling theory.
Sect. 1.9. The classic work on using dynamic programming to solve the knapsack
problems is [56]. The idea to change the roles of the objective and the constraint,
which allowed us to write Formula (1.31), was proposed in [75]. For the complexity
of the knapsack problems, see, for example, [110, 109].
Sect. 1.11. Theorem 1.4 was proved in [73] (see also more accessible sources [110,
122]). The result of Exercise 5.8 was obtained in [145].
1.11 Exercises
1.1. Describe the following sets by systems of linear inequalities, introducing, where
necessary, binary variables:
a) X_1 = {x ∈ R : |x| ≥ a};
b) X_2 = P(A, b) \ {x̄}, where x̄ ∈ P(A, b);
c) X_3 = { x ∈ R^3 : x_3 = min{x_1, x_2}, 0 ≤ x_1, x_2 ≤ d };
d) X_4 = {(x, y) ∈ {0, 1}^n × {0, 1}^m : ∑_{j=1}^n x_j ≥ a ⇒ ∑_{i=1}^m y_i ≥ b}.
1.2. The sum of the largest components of a vector. Consider the set
X_{r,α} def= { x ∈ R^n : ∑_{i=1}^r x_{π_x(i)} ≤ α },
where π_x is a permutation of {1, . . . , n} that orders the components of x non-increasingly: x_{π_x(1)} ≥ x_{π_x(2)} ≥ · · · ≥ x_{π_x(n)}.
Let us also note that a compact extended formulation for the set Xr,α is given in
Exercise 3.3.
1.3. Sudoku is a popular logic puzzle. An n × n-matrix A is formed from an m × m-matrix
by replacing each of its elements with an m × m-matrix, which we call a block; so n =
m². Some positions in A are filled with numbers.
It is necessary to fill all empty positions with numbers from 1 to n so that no row,
column, or block contains the same number twice. An example of such a puzzle
for m = 3 is presented below, with the puzzle on the left and its solution on the
right.
[A 9 × 9 Sudoku puzzle (left) and its solution (right).]
Formulate an IP to find a solution to the Sudoku puzzle so that the sum of the
diagonal elements of the resulting matrix A is maximum.
Hint. Use binary variables xi jk with xi jk = 1 if k is written into position (i, j).
1.4. Show that the IP
where M is a sufficiently large number. Estimate the value of M for given integer-valued
A, b, and c.
1.5. Consider the quadratic knapsack problem
∑_{j=1}^n c_j x_j + ∑_{j=2}^n ∑_{i=1}^{j−1} c_{ij} x_i x_j → min,
∑_{j=1}^n a_j x_j ≥ b,  x ∈ {0, 1}^n.
1.6. The binary classification problem is among the most important problems in
machine learning. We are given a set {x^1, . . . , x^k} of "positive" points from R^n and
a set {y^1, . . . , y^l} of "negative" points, also from R^n. Ideally, we would like to find
a hyperplane H(a, 1), which is called a linear classifier, that separates the positive and
negative points:
a^T x^i ≤ 1,  i = 1, . . . , k,
a^T y^j > 1,  j = 1, . . . , l.      (1.32)
In most practical cases this is impossible, and therefore we are looking for a hyperplane
that minimizes some "classification error". Intuitively, the most natural
measure of the classification error is the number of points, both positive and negative,
that are on the "wrong" side of the hyperplane H(a, 1). With this measure, we
need to find a vector a ∈ R^n that violates the minimum number of inequalities in
(1.32)⁶. Formulate this problem as a MIP.
1.7. Consider a bimatrix game in which the gains of the first and second players
are given by two matrices A = [a_{ij}]_{m×n} and B = [b_{ij}]_{m×n}. A pair of vectors (mixed
strategies) (p, q) ∈ R^m × R^n is a Nash equilibrium if it satisfies the following constraints:
∑_{j=1}^n a_{ij} q_j ≤ ∑_{i=1}^m ∑_{j=1}^n a_{ij} p_i q_j ,  i = 1, . . . , m,
∑_{i=1}^m b_{ij} p_i ≤ ∑_{i=1}^m ∑_{j=1}^n b_{ij} p_i q_j ,  j = 1, . . . , n,      (1.33)
∑_{i=1}^m p_i = 1,  ∑_{j=1}^n q_j = 1,
p_i ≥ 0,  i = 1, . . . , m,
q_j ≥ 0,  j = 1, . . . , n.
a) Prove that for any solution (p, q) to (1.33) the following complementary slackness
conditions are valid:
p_i ( ∑_{i=1}^m ∑_{j=1}^n a_{ij} p_i q_j − ∑_{j=1}^n a_{ij} q_j ) = 0,  i = 1, . . . , m,
q_j ( ∑_{i=1}^m ∑_{j=1}^n b_{ij} p_i q_j − ∑_{i=1}^m b_{ij} p_i ) = 0,  j = 1, . . . , n.
b) Defining
w → max,      (1.34a)
p_i + x_i ≤ 1,  i = 1, . . . , m,      (1.34b)
−U_1 x_i ≤ ∑_{j=1}^n a_{ij} q_j − v_1 ≤ 0,  i = 1, . . . , m,      (1.34c)
q_j + y_j ≤ 1,  j = 1, . . . , n,      (1.34d)
−U_2 y_j ≤ ∑_{i=1}^m b_{ij} p_i − v_2 ≤ 0,  j = 1, . . . , n,      (1.34e)
∑_{i=1}^m p_i = 1,  ∑_{j=1}^n q_j = 1,      (1.34f)
Let (p∗ , q∗ , x∗ , y∗ , v∗1 , v∗2 , w∗ ) be an optimal solution to (1.34). Using the statement
of item a), prove that (p∗ , q∗ ) is a Nash equilibrium such that the minimum, w∗ =
min{v∗1 , v∗2 }, of the gains of both players is maximum.
1.8. Solve the following knapsack problems:
1.9. Consider the single-product lot-sizing problem from Sect. 1.7.1. Let H(t) denote
the cost of the optimal solution to the subproblem in which the number of
periods in the planning horizon is t (0 ≤ t ≤ T). Let us introduce the notations
w_t def= c_t + ∑_{τ=t}^T h_τ ,  d_{τt} def= ∑_{k=τ}^t d_k ,  Ĥ(t) def= H(t) + ∑_{τ=1}^t h_τ d_{1τ} .
Ĥ(0) = 0,
Ĥ(t) = min_{1≤τ≤t} { Ĥ(τ − 1) + f_τ + w_τ d_{τt} },  t = 1, . . . , T.      (1.35)
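Recursion (1.35) can be sketched in Python as follows; the one-period instance at the end is hypothetical and serves only as a hand-checkable test:

```python
def lot_sizing_dp(d, c, h, f):
    """Recursion (1.35) for the uncapacitated lot-sizing problem;
    returns H(T), the optimal cost over the whole horizon."""
    T = len(d)
    w = [c[t] + sum(h[t:]) for t in range(T)]           # w_t = c_t + h_t + ... + h_T
    Hhat = [0.0] * (T + 1)
    for t in range(1, T + 1):
        Hhat[t] = min(Hhat[tau - 1] + f[tau - 1]
                      + w[tau - 1] * sum(d[tau - 1:t])  # w_tau * d_{tau,t}
                      for tau in range(1, t + 1))
    # H(T) = Hhat(T) - sum_{tau=1}^T h_tau * d_{1,tau}
    return Hhat[T] - sum(h[t] * sum(d[:t + 1]) for t in range(T))

# hypothetical one-period instance: produce d_1 = 5 units in period 1,
# so the optimal cost is f_1 + 5 c_1 = 3 + 10 = 13
cost = lot_sizing_dp([5], [2], [1], [3])
```

The inner minimum over τ chooses the last period before t in which production takes place, exactly as in (1.35).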
b) Using (1.35), solve the example of the lot-sizing problem with the following
parameters: T = 4, d = (2, 4, 4, 2)T , c = (3, 2, 2, 3)T , h = (1, 2, 1, 1)T and f =
(10, 20, 16, 10)T .
The most reliable way to learn how to formulate complex practical problems as
MIPs is to study those MIP models that are now regarded as classical. In this
chapter, we consider examples of MIP formulations for various practical applications.
Not all of our formulations are the strongest, because in some cases elaborating
a strong formulation requires an in-depth analysis of the structure
of the problem being solved in order to take into account its specific features. Nevertheless,
each of the models studied here can be a starting point for developing a
solution tool for the corresponding practical application. Let us also note that many
practical MIP applications are discussed in the other chapters; in addition, a lot
of applications are presented in the exercises.
Let a finite set S and a family E = {S_1, . . . , S_n} of its subsets be given. The pair
H = (S, E) is often called a hypergraph. By analogy with graphs, the elements of the
set S are called vertices, and the subsets in E are called hyperedges. A subset of
hyperedges J ⊆ E is called a packing if every vertex from S belongs to at most
one hyperedge from J. If every vertex of S belongs to exactly one hyperedge from
J, then J is called a partition. Finally, if every vertex of S belongs to at least one
hyperedge from J, then J is called a cover.
Let us assign to each hyperedge S j its cost c j . The set packing problem is to find
a packing with the maximum total cost of its hyperedges. The set covering (resp.,
set partitioning) problem is to find a cover (resp., partition) with the minimum total
cost of its hyperedges. Note that if H is a graph (all sets S j contain two elements),
then the set packing problem is known as the weighted matching problem.
To simplify the exposition, we assume that S = {1, . . . , m}. The incidence matrix
A of the hypergraph H = (S, E) has m rows and n columns; its element a_{ij} equals 1
if i ∈ S_j, and a_{ij} = 0 otherwise. For j = 1, . . . , n, we introduce a binary variable x_j
that takes the value 1 if hyperedge S_j is included in the packing, partition, or cover.
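These definitions can be made concrete with a short sketch (the hypergraph below is hypothetical): build the incidence matrix A and classify a 0/1 selection x of hyperedges by comparing Ax with the all-ones vector:

```python
def incidence_matrix(m, subsets):
    """A[i][j] = 1 if vertex i + 1 belongs to hyperedge S_{j+1}."""
    return [[1 if i + 1 in S else 0 for S in subsets] for i in range(m)]

def classify(A, x):
    """For a 0/1 hyperedge selection x: packing iff Ax <= 1,
    partition iff Ax = 1, cover iff Ax >= 1 (componentwise)."""
    counts = [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]
    return {'packing': all(c <= 1 for c in counts),
            'partition': all(c == 1 for c in counts),
            'cover': all(c >= 1 for c in counts)}

# hypothetical hypergraph on S = {1, 2, 3, 4}
subsets = [{1, 2}, {3, 4}, {2, 3}]
A = incidence_matrix(4, subsets)
kinds = classify(A, [1, 1, 0])     # choose S_1 and S_2
```

Choosing S_1 and S_2 gives a partition (hence also a packing and a cover), while adding S_3 destroys the packing property because vertices 2 and 3 become covered twice.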
Crew Scheduling
A number of flight legs are given (taken from the timetable of some company). A
leg is a flight taking off from its departure airport at some time and landing later at
its destination airport. The problem is to partition these legs into routes, and then
assign exactly one crew to each route. A route is a sequence of flight legs such that
the destination of one leg is the departure point of the next, and the destination of
the last leg is the departure point of the first. For example, there might be the
following short route:
• leg 1: from Paris to Berlin, departing at 9:20 and arriving at 10:50;
• leg 2: from Berlin to Rome, departing at 12:30 and arriving at 14:00;
• leg 3: from Rome to Paris, departing at 16:30 and arriving at 18:00.
A schedule is good if the crews spend as much of the elapsed time flying as possible,
subject to the safety regulations and contract terms. These terms regulate the maximum
number of hours a pilot can fly in a day, the maximum number of days before
returning to the base, and the minimum overnight rest times. The cost of a schedule
depends on several of its attributes, with the total wasted time of all crews being
the main one.
In practice, the crew scheduling problem is solved in two stages. Let all the legs
be numbered from 1 to m. First, a collection of n reasonable routes S_j ⊂ {1, . . . , m}
(routes that meet all the constraints mentioned above) is selected, and the cost c_j of each
route j is calculated. This route-selection problem is far from trivial, but it is
not our main interest here.
Given the set of potential routes, E = {S_1, . . . , S_n}, the second stage is to identify
a subset of them, J ⊆ E, so that each leg is covered by exactly one route and the
total cost of all routes in J is minimum. Of course, this second-stage problem is a
set partitioning problem.
Combinatorial Auctions
At an auction, m objects are put up for sale. Suppose that in a certain round of trades,
the auctioneer received n bids from the buyers. The difference between a combinatorial
auction and a usual one is that in its bid a buyer is allowed to value not
only a single object but also any group of objects. Therefore, each bid j is described
by a pair (S_j, c_j), where S_j is a subset of objects for which the buyer who submitted
the bid agrees to pay c_j. Naturally, no object can be sold twice; therefore, two bids
(S_{j1}, c_{j1}) and (S_{j2}, c_{j2}) with S_{j1} ∩ S_{j2} ≠ ∅ cannot be satisfied simultaneously.
The auctioneer must decide which of the bids to satisfy so that the seller's profit is
maximum.
Clearly, here we have a set packing problem.
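For tiny instances, the auctioneer's problem can be solved by brute force. The sketch below (hypothetical bids) enumerates all pairwise-disjoint bid subsets; it is exponential in the number of bids and is meant only to make the set packing structure concrete:

```python
from itertools import combinations

def best_bid_selection(bids):
    """Brute-force set packing: pick pairwise-disjoint bids (S_j, c_j)
    maximizing the total payment; exponential in the number of bids."""
    best_value, best_sel = 0, set()
    n = len(bids)
    for r in range(1, n + 1):
        for sel in combinations(range(n), r):
            sets = [bids[j][0] for j in sel]
            # the chosen bids are pairwise disjoint iff no object is counted twice
            if sum(len(s) for s in sets) == len(set().union(*sets)):
                value = sum(bids[j][1] for j in sel)
                if value > best_value:
                    best_value, best_sel = value, set(sel)
    return best_value, best_sel

# hypothetical round with objects {1, 2, 3, 4}
bids = [({1, 2}, 5), ({2, 3}, 7), ({3, 4}, 5), ({1}, 2)]
value, chosen = best_bid_selection(bids)
```

Here the single highest bid ({2, 3}, 7) is not selected: the disjoint pair of bids 0 and 2 yields the larger profit 10.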
Objective (2.4a) is to minimize the total cost of locating centers and serving customers.
Let us note that if all c_{ij} = 0 and all f_i = 1, then the objective is to minimize
the number of service centers needed to serve all customers. Equations (2.4b) ensure
that each customer is served. Inequalities (2.4c) reflect the capacity limitations: if
a service center is established at site i (y_i = 1), then at most u_i customers can be
served from this site. Inequalities (2.4d), which are logically implied by (2.4c) and
therefore redundant, are introduced to strengthen the formulation.
shares for T previous periods. Let Ri (t) be the return (per one enclosed dollar) of
share i in period t. Then we can calculate
ρ_{ij} = ∑_{t=1}^T p^{T−t} (R_i(t) − R_j(t))²,
Objective (2.5a) is to build an index fund that most accurately represents the
market index. Inequality (2.5b) does not allow the index fund to contain more than q
shares. Equalities (2.5c) require that each share of the market index be represented
by a share from the index fund. Inequalities (2.5d) do not allow shares that are
not in the index fund to represent other shares.
This may seem strange, but it turns out that (2.5) is a special case of (2.4), the
IP formulation of the problem of locating service centers, with m = n, c_{ij} =
ρ_{ij}, b_i = n, f_i = 0. It is worth noting that such coincidences between the formulations of
seemingly completely different problems are encountered quite often.
When (2.5) is solved and the set of shares in the index fund, I = {i : y_i = 1}, is
known, we can proceed with the formation of the portfolio. First, we calculate the
weights w_i def= ∑_{j=1}^n V_j x_{ij} of all shares i ∈ I, where V_j is the total value of all shares
of type j in the market index. In other words, w_i is the total market value of all
shares represented by share i in the index fund. Therefore, the proportion of capital
invested in each share i from the index fund (i ∈ I) must be equal to w_i / (∑_{k∈I} w_k).
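The weight computation can be sketched as follows (the assignment x and the values V below are hypothetical):

```python
def portfolio_proportions(x, V):
    """w_i = sum_j V_j x_ij is the market value represented by fund share i;
    invest the fraction w_i / sum_k w_k of the capital in share i."""
    n = len(V)
    I = [i for i in range(n) if any(x[i])]       # shares included in the index fund
    w = {i: sum(V[j] * x[i][j] for j in range(n)) for i in I}
    total = sum(w.values())
    return {i: w[i] / total for i in I}

# hypothetical solved assignment: share 0 represents shares 0 and 1,
# share 2 represents itself; V holds the shares' total market values
x = [[1, 1, 0],
     [0, 0, 0],
     [0, 0, 1]]
V = [100.0, 50.0, 150.0]
props = portfolio_proportions(x, V)
```

The resulting proportions sum to one by construction, so they can be applied directly to the invested capital.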
We need to work out an aggregate production plan for n different products processed
on a number of machines of m types for a planning horizon that extends over T
periods.
Input parameters:
• lt : duration (length) of period t;
• mit : number of machines of type i available in period t;
• fit : fixed cost of producing on one machine of type i in period t;
• T_i^min , T_i^max : minimum and maximum working time of one machine of type i;
• c jt : per unit production cost of product j in period t;
• h jt : inventory holding cost per unit of product j in period t;
• d jt : demand for product j in period t;
• ρ jk : number of units of product j used for producing one unit of product k;
• τi j : per unit production time of product j on machine of type i;
• s_j^i : initial stock of product j at the beginning of the planning horizon;
• s_j^f : final stock of product j at the end of the planning horizon.
A production plan specifies on which machines and in which quantities each
product is produced in each of the periods. The goal is to determine a production
plan that can be implemented on the existing equipment and whose total production
and inventory cost is minimum.
Let us introduce the following variables:
• x jt : amount of product j produced in period t;
• s jt : amount of product j in stock at the end of period t;
• yit : number of machines of type i working in period t.
Now we formulate the problem as follows:
∑_{t=1}^T ∑_{j=1}^n (h_{jt} s_{jt} + c_{jt} x_{jt}) + ∑_{t=1}^T ∑_{i=1}^m f_{it} y_{it} → min,      (2.6a)
s_j^i + x_{j1} = d_{j1} + s_{j1} + ∑_{k=1}^n ρ_{jk} x_{k1} ,  j = 1, . . . , n,      (2.6b)
s_{j,t−1} + x_{jt} = d_{jt} + s_{jt} + ∑_{k=1}^n ρ_{jk} x_{kt} ,  j = 1, . . . , n, t = 2, . . . , T,      (2.6c)
∑_{j=1}^n τ_{ij} x_{jt} ≤ l_t y_{it} ,  i = 1, . . . , m, t = 1, . . . , T,      (2.6d)
s_{jT} = s_j^f ,  j = 1, . . . , n,      (2.6e)
0 ≤ s_{jt} ≤ u_j , x_{jt} ≥ 0,  j = 1, . . . , n, t = 1, . . . , T,      (2.6f)
0 ≤ y_{it} ≤ m_{it} , y_{it} ∈ Z,  i = 1, . . . , m, t = 1, . . . , T.      (2.6g)
Objective (2.6a) is to minimize the total production and inventory expenses. Each
of the balance equations in (2.6b) and (2.6c) joins two adjacent periods for each of
the products: the stock at the end of period t − 1 plus the amount of product produced in
period t equals the demand in period t plus the amount of product used when producing
other products, plus the stock at the end of period t. Inequalities (2.6d) require that the
working times of all machines be within given limits; besides, if no machine of type i
works in period t (y_{it} = 0), then no product is produced on machines of this type.
Assembly lines are special product-layout production systems that are typical for
the industrial production of standardized commodities in high quantities. An assembly
line consists of a number of work stations arranged along a conveyor belt. The work
pieces are consecutively launched down the conveyor belt and are moved from one
station to the next. At each station, one or several operations necessary
to manufacture the product are performed. The operations in an assembly process
are usually interdependent, i.e., there may be precedence relations that must be enforced.
The problem of distributing the operations among the stations with respect
to some objective function is called the assembly line balancing problem (ALBP).
We will consider here the simple assembly line balancing problem, which is the core
of many other ALBPs.
The manufacturing of some product consists of a set of operations O = {1, . . . , n}.
We denote by t_o the processing time of operation o ∈ O. The precedence relations
between the operations are represented by a digraph G = (O, E), where (o_1, o_2) ∈ E
means that operation o_1 must be finished before operation o_2 starts. Suppose that
the demand for the product is such that the assembly line must have cycle time
C, which means that the running time of each station on one product unit must not
exceed C.
The simple assembly line balancing problem (SALBP) is to decide the
minimum number of stations sufficient for a line running with the given cycle
time to fulfill all the operations in an order consistent with the precedence relations.
An example of SALBP is presented in Fig. 2.1. Here we have n = 11 operations
that correspond to the vertices of the digraph representing the precedence relations
between these operations. The numbers over the vertices are the processing times of the
operations.
To formulate SALBP as an IP, we need to know an upper bound, m, on the number
of needed stations. In particular, we can set m to be the number of stations in
a solution built by one of the heuristics developed for solving SALBPs.
For example, consider the heuristic that assigns operations, respecting the
precedence relations, first to Station 1, then to Station 2, and so on, until all the
operations are assigned to stations. If we apply this heuristic to the example of
Fig. 2.1 with cycle time C = 45, we get the following assignment:
• operations 1 and 2 are accomplished by station 1,
[Fig. 2.1 A precedence digraph on 11 operations; the numbers over the vertices are the operation processing times.]
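The station-filling heuristic described above can be sketched in Python. The processing times below are hypothetical (the figure's data are not reproduced here), chosen so that, as in the text, operations 1 and 2 end up on station 1:

```python
def salbp_first_fit(times, C):
    """Assign operations (taken in the given order, assumed consistent with
    the precedence relations) to Station 1, then Station 2, ...; a new
    station is opened whenever the current one would exceed cycle time C."""
    stations, current, load = [], [], 0
    for op, t in enumerate(times, start=1):
        if load + t > C:            # current station is full: open the next one
            stations.append(current)
            current, load = [], 0
        current.append(op)
        load += t
    if current:
        stations.append(current)
    return stations

# hypothetical processing times of 11 operations, cycle time C = 45
times = [16, 25, 13, 11, 8, 13, 7, 8, 9, 10, 15]
stations = salbp_first_fit(times, 45)
```

Since stations are filled in order and the operation order respects precedence, every predecessor automatically lands on the same or an earlier station.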
Objective (2.7a) is to minimize the number of open stations. Equations (2.7b) require
that each operation be assigned to exactly one station. Inequalities (2.7c) reflect
the capacity restrictions: the total running time of each open station must not
exceed the cycle time. Equations (2.7d) establish the relation between the
assignment variables, binary x and integer z. Each precedence relation constraint
in (2.7e) requires that, for a pair (o_1, o_2) ∈ E of related operations, operation o_1 be
assigned to the same or an earlier station than operation o_2; this guarantees that operation
o_1 is finished before operation o_2 starts. Inequalities (2.7f) and (2.7g) ensure
that earlier stations are opened first.
Objective (2.8a) is to minimize the total (for all n generators over all T periods)
cost of producing electricity plus the sum of start-up costs. Equations (2.8b) guar-
antee that, for each period, the total amount of electricity produced by all working
generators meets the demand at that period. Inequalities (2.8c) require that, in any
period, the total capacity of all working generators be at least q times the demand
at that period. The lower and upper bounds in (2.8d) impose the capacity
restrictions for each generator in each period; simultaneously, these constraints en-
sure that non-working generators do not produce electricity. Two-sided inequalities
(2.8e) guarantee that generators cannot increase (ramp up) or decrease (ramp down)
their outputs by more than the values of their ramping parameters. Let us note that
period ((t − 2 + T) mod T) + 1 is immediately followed by period t. Inequalities
(2.8f) and (2.8g) reflect the fact that a generator is working in a given period only
if it has been switched on in that period or it was working in the preceding period.
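The cyclic predecessor index and the ramping conditions of (2.8e) can be illustrated by the following sketch; the function names and the dictionary-based data layout are assumptions for illustration, not notation from the model.

```python
def prev_period(t, T):
    """Cyclic predecessor of period t (periods numbered 1..T):
    period ((t - 2 + T) % T) + 1, so the predecessor of period 1 is T."""
    return ((t - 2 + T) % T) + 1

def ramping_ok(p, delta_up, delta_down, T):
    """Check that a generator's output p[t] (t = 1..T, cyclically) never
    increases by more than delta_up or decreases by more than delta_down
    between a period and its cyclic predecessor."""
    for t in range(1, T + 1):
        change = p[t] - p[prev_period(t, T)]
        if change > delta_up or -change > delta_down:
            return False
    return True
```

For example, the profile {1: 10, 2: 12, 3: 11, 4: 10, 5: 10} respects ramp limits of 2 up / 2 down, but not a ramp-up limit of 1 (the jump from period 1 to period 2 is 2).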
∑_{e∈E} d_e y_{er} ≤ U, r = 1, . . . , R, (2.9c)
∑_{e∈E(i,V)} d_e y_{er} ≤ ∑_{t=1}^{T} c_t z_{irt}, i ∈ V, r = 1, . . . , R, (2.9d)
∑_{t=1}^{T} z_{irt} ≤ S x_{ir}, i ∈ V, r = 1, . . . , R, (2.9e)
y_{er} ≤ x_{ir}, y_{er} ≤ x_{jr}, e = (i, j) ∈ E, r = 1, . . . , R, (2.9f)
x_{ir} ∈ {0, 1}, i ∈ V, r = 1, . . . , R, (2.9g)
y_{er} ∈ {0, 1}, e ∈ E, r = 1, . . . , R, (2.9h)
z_{irt} ∈ Z+, i ∈ V, r = 1, . . . , R, t = 1, . . . , T. (2.9i)
Objective (2.9a) is to minimize the total cost of installed multiplexers and cards.
Equations (2.9b) assign every demand to exactly one ring. Inequalities (2.9c) guar-
antee that the bandwidth of any ring is not exceeded. Similar inequalities (2.9d)
require that the bandwidth of any multiplexer, which is the sum of the capacities of
all cards installed on the multiplexer, be not exceeded. Here E(i,V ) stands for the
set of edges e = (i, j) ∈ E incident to node i. Inequalities (2.9e) forbid inserting
more cards into any multiplexer than there are slots, and they also prevent the
insertion of cards into slots of unused multiplexers (if xir = 0, then zirt = 0 for all
t = 1, . . . , T ). Finally, each pair of inequalities in (2.9f) implies that, if a demand
e = (i, j) is assigned to ring r (yer = 1), then the multiplexers must be installed on
this ring at both nodes i and j (xir = x jr = 1).
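As a small illustration of constraints (2.9b), (2.9c), and (2.9f), the following sketch checks a candidate assignment for feasibility; the dictionary-based data layout is an assumption for illustration, not from the model.

```python
def ring_assignment_ok(demands, rings, U, x, y):
    """demands: dict edge (i, j) -> demand d_e; rings: iterable of ring ids;
    y: dict (edge, ring) -> 0/1 assignment of demands to rings;
    x: dict (node, ring) -> 0/1 multiplexer installation.
    Checks (2.9b): each demand on exactly one ring, (2.9c): ring bandwidth U
    not exceeded, and (2.9f): multiplexers installed at both endpoints."""
    for e in demands:
        if sum(y.get((e, r), 0) for r in rings) != 1:
            return False                      # (2.9b) violated
    for r in rings:
        load = sum(d * y.get((e, r), 0) for e, d in demands.items())
        if load > U:
            return False                      # (2.9c) violated
    for (i, j) in demands:
        for r in rings:
            if y.get(((i, j), r), 0) == 1:
                if x.get((i, r), 0) == 0 or x.get((j, r), 0) == 0:
                    return False              # (2.9f) violated
    return True
```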
We have a set C of logic elements (gates) that implement basic boolean functions.
Geometrically, the surface of a crystal of size k × q can be considered as a rectangular
area with a uniform rectangular grid on it (like a sheet of a school notebook
in a box). The cells of the crystal are numbered from 1 to kq. For simplicity of
exposition, we will assume that each gate can be placed into one cell of the crystal,
i.e., each gate can be considered as a unit square¹. The gates are connected to each
other by signal circuits (hereinafter simply "circuits"). Any circuit n ∈ N is given
as a subset N(n) ⊆ C of the gates that it connects. Our goal is to place the set of
gates, C, into a subset of cells, I ⊆ {1, . . . , kq} (|I| ≥ |C|), so as to minimize
the sum of the semiperimeters of the minimal rectangles bounding the circuits.
The problem of placing a set of gates on the surface of a crystal is very complex
and usually it is solved in two stages. At the stage of global placement, the crystal
surface is divided into a set of disjoint rectangles, each of which is assigned a sub-
set of gates (without specifying specific positions). At the stage of detailed (local)
placement, for each of these rectangles, it is necessary to solve the placement prob-
lem with an indication of exact positions of all the gates. Here we will consider only
the problem of detailed placement.
Input data:
• C : set of gates;
• I : subset of crystal cells;
• N : set of circuits;
• N (n) ⊆ C : set of gates that are connected by circuit n ∈ N ;
• ai : distance from the center of cell i ∈ I to the crystal left side;
• bi : distance from the center of cell i ∈ I to the crystal top side.
Let us introduce the following variables:
• z_{ci} = 1 if gate c ∈ C is placed into cell i, and z_{ci} = 0 otherwise;
• x_n^min, x_n^max, y_n^min, y_n^max: the pairs (x_n^min, y_n^min) and (x_n^max, y_n^max) are the coordinates
of the left-bottom and right-top corners of the minimal rectangle that contains all
gates c ∈ N(n).
In these variables our IP formulation is as follows:
∑_{i∈I} z_{ci} = 1, c ∈ C, (2.10c)
∑_{i∈I} b_i z_{ci} ≥ y_n^min, c ∈ N(n), n ∈ N, (2.10f)
∑_{i∈I} b_i z_{ci} ≤ y_n^max, c ∈ N(n), n ∈ N, (2.10g)
z_{ci} ∈ {0, 1}, c ∈ C, i ∈ I, (2.10h)
x_n^min, x_n^max, y_n^min, y_n^max ≥ 0, n ∈ N. (2.10i)
¹ In rare cases, when a gate occupies several cells, it can be represented as several unit-square gates that must be placed adjacently.
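The objective of the detailed placement problem — the sum of the semiperimeters of the minimal rectangles bounding the circuits — can be evaluated directly for a given placement. The sketch below is illustrative; the data layout is an assumption.

```python
def placement_cost(circuits, place, a, b):
    """Sum of semiperimeters of the minimal axis-aligned rectangles bounding
    each circuit.  circuits: dict n -> set of gates of circuit n;
    place: dict gate -> cell; a[i], b[i]: center coordinates of cell i
    (distances to the left and top crystal sides)."""
    total = 0.0
    for gates in circuits.values():
        xs = [a[place[c]] for c in gates]
        ys = [b[place[c]] for c in gates]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total
```

Such an evaluator is useful for comparing heuristic placements against the IP optimum.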
The flight schedule of even an average airline is huge and is usually stored in a
database in which the information is presented in a form similar to that shown in
Table 2.1. In this particular example, we see that there is a flight from XYZ airport
to ZYX airport departing at 6:55 and arriving at 9:45. This flight can be performed
by a Boeing-734, Boeing-757, or Boeing-767 aircraft with flight costs of $8570,
$12085, or $13095, respectively.
Any feasible assignment of aircraft to flights obeys the following requirements:
• no more aircraft of each type can be used than are in stock;
• aircraft arriving at an airport must either fly away or remain on the ground;
• aircraft must depart from the airports where they landed earlier.
The problem of assigning aircraft to flights is to find a feasible assignment of min-
imum cost.
Input data:
• n: number of flights;
∑_{i∈T_j} x_{ij} = 1, j = 1, . . . , n, (2.11b)
i = 1, . . . , m, k = 1, . . . , l, e = 0, . . . , r_k − 1,
∑_{j: i∈T_j, t(a_j^d, e_j^d) > t(a_j^a, e_j^a)} x_{ij} + ∑_{k=1}^{l} f_{i,k,r_k−1} ≤ q_i, i = 1, . . . , m, (2.11d)
Objective (2.11a) of this IP is to minimize the total cost of all flights. Equations
(2.11b) ensure that each flight is assigned to exactly one aircraft type. Accord-
ing to the balance equations (2.11c), for each airport k and every event e that occurs
there, the number of aircraft of any type i at the airport until the next event equals
their number at time t(k, e), plus the number of aircraft landing at time t(k, e),
minus the number of aircraft taking off at time t(k, e). Because (2.11c) holds, the
number of aircraft of each type remains constant during a day. Inequalities (2.11d)
require that at midnight the total number of aircraft of each type, in the air and on
the ground, be no more than the number available.
2.10 Optimizing the Performance of a Hybrid Car
A hybrid car among many other things has an internal combustion engine, a mo-
tor/generator connected to a battery, and a braking system. We will consider an
extremely simple parallel car model in which the motor/generator and the internal
combustion engine are directly connected to the driving wheels. The internal com-
bustion engine transfers mechanical energy to the wheels, and the braking system
takes away this energy from the wheels turning it into heat. The motor/generator can
work as an electric motor using the energy of the battery and feeding it to the wheels,
or as a generator when it consumes mechanical energy from the wheels or directly
from the internal combustion engine, and converts this mechanical energy into elec-
tricity charging the battery. When the generator consumes mechanical energy of the
wheels and charges the battery, it is called a regenerative brake.
A diagram illustrating energy flows in a hybrid car is presented in Fig. 2.2. Here
the arrows indicate positive directions of energy transmission. The engine power
peng is always positive and is transmitted in the direction from the engine to the
wheels. The power of the braking system, pbr , is always non-negative, and it is
positive when the car brakes. The energy consumption of the wheels, preq , is positive
when it is spent on driving the car (when the car accelerates, goes uphill or evenly
moves along the road), and preq is negative when the car brakes or goes down the
hill. We consider the motor/generator as two devices operating in turn. When the
motor is running, the energy pm is fed from it to the wheels, and when the generator
is running, it receives the energy pg from the wheels.
[Fig. 2.2: energy flows in a hybrid car between the engine, brakes, motor/generator, battery, and wheels; the arrows indicate positive directions of energy transmission.]
The car is tested on a track with fixed characteristics. The speed of the car on
each section of the route is predefined. Therefore, the time of passing the track is
also known and equals T seconds. We will build a discrete-time model with T
time intervals, each lasting one second. Because the profile of the route is known and
the speed on all route sections is also set, it is possible to calculate the power
P_t^req required to be fed to the wheels. We also know the following parameters:
• P_max^eng: maximum engine power;
• P_max^g: maximum generator power;
• P_max^m: maximum motor power;
• E_max^batt: maximum battery energy (charge);
• η: fraction of energy lost when converting mechanical energy into electricity,
and then into battery charge, and vice versa;
• t1, t2: in any time interval of t2 consecutive seconds, the electric motor should run
no more than t1 seconds.
If the engine runs at a power of p, then per unit of time it consumes F(p) fuel
units. We assume that F : R+ → R+ is an increasing convex function.
For t = 1, . . . , T , we introduce the following variables:
• p_t^eng: engine power in period t;
• p_t^m: motor power in period t;
• p_t^g: generator power in period t;
• p_t^br: braking system power in period t;
• E_t: battery charge (energy) in period t;
• y_t = 1 if the motor/generator works as a motor, and y_t = 0 if it works as a generator.
We can determine an optimal operation mode of a hybrid car by solving the following program:
∑_{t=1}^{T} F(p_t^eng) → min, (2.12a)
p_t^eng + p_t^m − p_t^g − p_t^br = P_t^req, t = 1, . . . , T, (2.12b)
E_t − (1 + η) p_t^m + (1 − η) p_t^g = E_{t+1}, t = 1, . . . , T, (2.12c)
∑_{τ=t−t2+1}^{t} y_τ ≤ t1, t = t2, . . . , T, (2.12d)
E_{T+1} ≥ E_1, (2.12e)
0 ≤ E_t ≤ E_max^batt, t = 1, . . . , T, (2.12f)
0 ≤ p_t^eng ≤ P_max^eng, t = 1, . . . , T, (2.12g)
0 ≤ p_t^m ≤ P_max^m y_t, t = 1, . . . , T, (2.12h)
0 ≤ p_t^g ≤ P_max^g, t = 1, . . . , T, (2.12i)
p_t^br ≥ 0, t = 1, . . . , T, (2.12j)
y_t ∈ {0, 1}, t = 1, . . . , T. (2.12k)
Inequality (2.12e) ensures a fair comparison of a hybrid car with a non-hybrid one:
at the finish, the battery charge must be no less than the battery charge at the start.
Since the objective function is nonlinear, (2.12) is not a MIP. But we can approx-
imate the convex function F with a piecewise linear function, and then, using the
method described in Sect. 1.1.4, we can represent this piecewise linear function as
linear by introducing new variables and constraints.
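As a sketch of that approximation step, the function below interpolates a convex cost function between user-chosen breakpoints; the breakpoints and names are illustrative, and the actual method of Sect. 1.1.4 introduces MIP variables and constraints rather than a plain function.

```python
def piecewise_linear(F, breakpoints):
    """Return a function that linearly interpolates the convex cost function
    F between the given breakpoints.  Because F is convex, the interpolant
    overestimates F, so minimizing it never underestimates the true cost."""
    pts = [(p, F(p)) for p in sorted(breakpoints)]

    def approx(p):
        for (p0, f0), (p1, f1) in zip(pts, pts[1:]):
            if p0 <= p <= p1:
                w = (p - p0) / (p1 - p0)
                return (1 - w) * f0 + w * f1
        raise ValueError("p outside the approximation range")
    return approx
```

For F(p) = p², breakpoints 0, 1, 2 give approx(0.5) = 0.5 and approx(1.5) = 2.5, both above the true values 0.25 and 2.25, as convexity guarantees.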
Objective (2.13a) is to maximize the firm's "wealth" at the end of the planning
horizon. Equations (2.13c) and (2.13e) balance the budgets of, respectively, cash and
securities in periods 2, . . . , T. The similar balance constraints, (2.13b) and (2.13d),
apply only to period 1. Inequalities (2.13f) ensure that the total borrowing
from any credit line does not exceed the volume of that credit line. Inequalities
(2.13g) require that the necessary minimum of cash be available in every period.
In the past, oncologists used devices with two or three comparatively large
beams (10 × 10 centimeters in cross section) of fixed orientation. The number of
beams in modern devices is constantly increasing, and each beam is divided into
several smaller rays, the size and intensity of which can vary within certain limits.
The intensity of a ray at the points along its path (and, to a much lesser extent, at the
points closest to it) is measured in doses. The unit of dose is the Gray (Gy), defined
as the amount of energy per unit of mass received from the beam in the ionization
process around a given point. Treatment of cancerous tumors with such devices is
known as intensity-modulated radiation therapy (IMRT).
Treatment with the IMRT method begins with elaborating a treatment plan. The
aim of the planning is to guarantee that, as a result of the treatment, the cancer cells
receive the required dose, and the dose received by healthy tissues must be safe
(there must be no irreversible damage). Planning begins with the definition of the
critical area around the tumor. The critical region is covered by a three-dimensional
uniform rectangular lattice. Let us assume that the lattice points are numbered from
1 to l, and let L = {1, . . . , l}. Let T ⊂ L denote the set of lattice points inside the
tumor. The rest of the critical area is broken (by type of tissue) into subdomains. Let
K denote the number of such subdomains, and Hk ⊂ L be the set of lattice points
inside subdomain k. Table 2.2 presents the parameters that characterize an example
of planning treatment for prostate cancer. Here |T| = 2438 points belong to the
tumor, |H1| = 1566 points belong to the region immediately surrounding the tumor
(the "collar"), and |H2| = 1569 additional unclassified points (other) lie near the tumor.
The bladder (|H3| = 1292 points) and the rectum (|H4| = 1250 points) require special
attention.
Suppose that each beam can be directed under one of n possible angles. A sam-
ple is a particular way of breaking the beam into rays, indicating their dimensions
and intensities. Let Pj denote the set of possible samples for a beam directed at an
angle j. Using special software, one can calculate the dose ai j p obtained at point i
from a beam of unit intensity directed at angle j if sample p (p ∈ Pj ) is used. It is
assumed that the dose at any point is approximately equal to the sum of the doses
received from all the beams. It is necessary to determine the intensities x j p of all the
beams in order to satisfy a number of conditions for doses at points of the critical
regions. We will introduce these conditions when we explain the constraints of the
following model:
t → max, (2.14a)
t ≤ ∑_{j=1}^{n} ∑_{p∈P_j} a_{ijp} x_{jp} ≤ (1/α) t, i ∈ T, (2.14b)
∑_{j=1}^{n} ∑_{p∈P_j} a_{ijp} x_{jp} ≥ s, i ∈ T′, (2.14c)
∑_{j=1}^{n} ∑_{p∈P_j} a_{ijp} x_{jp} ≤ b_k, i ∈ H_k, k = 1, . . . , K, (2.14d)
∑_{j=1}^{n} ∑_{p∈P_j} a_{ijp} x_{jp} ≤ d_k + (b_k − d_k) y_i, i ∈ H_k, k ∈ K̂, (2.14e)
∑_{i∈H_k} y_i ≤ f_k |H_k|, k ∈ K̂, (2.14f)
x_{jp} ≥ 0, p ∈ P_j, j = 1, . . . , n, (2.14g)
y_i ∈ {0, 1}, i ∈ ∪_{k∈K̂} H_k. (2.14h)
In this model, the variable t represents the minimum dose at the tumor points.
The goal (2.14a) is to maximize this minimum dose. The uniformity coefficient, α,
in (2.14b) sets the lower bound for the ratio of the minimum and maximum doses at
the tumor points. In the example from Table 2.2 this coefficient is equal to 0.9.
In order not to miss the microscopic areas of affected tissue in the immediate
vicinity of the tumor, a set of points, T′, surrounding the tumor is selected, and it
is required that each of these points receive at least the minimum dose s. This
condition is expressed by (2.14c).
Inequalities (2.14d) limit the doses received by healthy tissues. Here b_k is the
limiting dose for subdomain k. In the example from Table 2.2, the dose limits are set
for the "collar", bladder, rectum, and other tissues.
To prevent irreversible damage to certain tissue types k ∈ K̂ ⊆ {1, . . . , K}, it is re-
quired that the fraction of points with a dose exceeding a given threshold dk (dk < bk )
be not greater than fk (0 < fk < 1). In the example from Table 2.2 such proportions
are given for the bladder and rectum. In our model, this condition is expressed by
Ineqs. (2.14e) and (2.14f), where, for each point i ∈ Hk (k ∈ K̂), we introduce an
auxiliary binary variable yi taking the value of 1 only if the dose at point i exceeds
the threshold dk .
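The threshold-fraction condition can be checked directly for a candidate dose distribution; the following sketch uses an assumed dictionary layout for the doses, purely for illustration.

```python
def dose_fraction_ok(doses, points, d_k, f_k):
    """Check the tissue-protection condition: the fraction of points in a
    subdomain whose received dose exceeds the threshold d_k must not be
    greater than f_k.  doses: dict point -> received dose;
    points: the lattice points of the subdomain H_k."""
    over = sum(1 for i in points if doses[i] > d_k)
    return over <= f_k * len(points)
```

For example, with doses {1: 10, 2: 30, 3: 30, 4: 10} and threshold 20, half of the points are over the threshold, so the condition holds for f_k = 0.5 but not for f_k = 0.25.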
Concluding the discussion of MIP (2.14), it should be noted that we cannot solve
such MIPs using standard software. Since each of the sets Pj consists of a very
large number of samples, the number of variables x j p is usually huge. To solve such
problems, it is necessary to develop a branch-and-price algorithm that is based on
the technique of column generation, which is discussed in Chap. 7.
Let us consider a project that consists of n jobs. There are q^r renewable² (non-
perishable) resources, with R_i^r units of resource i available per unit of
time. There are also q^n nonrenewable³ (perishable) resources, with R_i^n
² Renewable resources are available in the same quantities in every period. Manpower, machines,
and storage spaces are renewable resources.
³ In contrast to a renewable resource, whose consumption is limited in each period, the overall con-
sumption of a nonrenewable resource is limited for the entire project. Money, energy, and raw
materials are nonrenewable resources.
units of resource i available for the entire project. It is assumed that all resources are
available when the project starts.
The jobs can be processed in different modes. No job can be interrupted;
thus, once a job is started in some mode, it must be completed in the same mode.
If job j is processed in mode m (m = 1, . . . , M_j ), then
• it takes p_j^m units of time to process the job,
• ρ_{jmi}^r units of renewable resource i (i = 1, . . . , q^r) are used in each period when
job j is processed,
• and ρ_{jmi}^n units of nonrenewable resource i (i = 1, . . . , q^n) are consumed in total.
Precedence relations between jobs are given by an acyclic digraph G = (J, R)
defined on the set of jobs J = {1, . . . , n}: for any arc (j1, j2) ∈ R, job j2 cannot
start until job j1 is finished.
A project schedule specifies when each job starts and in which mode it is pro-
cessed. The goal is to find a schedule with the minimum makespan that is defined to
be the completion time of the last job.
Let us assume that we know an upper bound H on the optimal makespan value.
We can take as H the makespan of a schedule produced by one of numerous project
scheduling heuristics. The planning horizon is divided into H periods numbered
from 1 to H, and period t starts at time t − 1 and ends at time t.
To tighten our formulation, we can estimate (say, by the critical path method) the
earliest, es j , and latest, ls j , start times of all jobs j.
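The earliest start times can be computed by a standard forward pass over a topological order of G. The sketch below uses one duration per job — for a multi-mode project, the shortest mode duration gives valid lower bounds; the names are illustrative.

```python
from collections import deque

def earliest_starts(durations, arcs):
    """Forward pass of the critical path method: es[j] is the earliest time
    job j can start if each job j runs for durations[j] time units and all
    precedence arcs (j1, j2) are respected.  The digraph is assumed acyclic."""
    jobs = set(durations)
    succ = {j: [] for j in jobs}
    indeg = {j: 0 for j in jobs}
    for j1, j2 in arcs:
        succ[j1].append(j2)
        indeg[j2] += 1
    es = {j: 0 for j in jobs}
    queue = deque(j for j in jobs if indeg[j] == 0)
    while queue:                      # process jobs in topological order
        j = queue.popleft()
        for j2 in succ[j]:
            es[j2] = max(es[j2], es[j] + durations[j])
            indeg[j2] -= 1
            if indeg[j2] == 0:
                queue.append(j2)
    return es
```

The latest start times ls_j are obtained symmetrically by a backward pass from the horizon H.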
First, we define the family of decision binary variables:
• x jmt = 1 if job j is processed in mode m and starts in period t (at time t − 1), and
x jmt = 0 otherwise.
For modeling purposes, we also need the following families of auxiliary variables:
• T : schedule makespan;
• d j ∈ R: duration of job j (d j depends on the mode in which job j is processed);
• s j ∈ R: start time of job j.
In these variables the model is written as follows:
T → min, (2.15a)
∑_{m=1}^{M_j} ∑_{t=es_j}^{ls_j} x_{jmt} = 1, j = 1, . . . , n, (2.15b)
s_j = ∑_{m=1}^{M_j} ∑_{t=es_j}^{ls_j} t x_{jmt}, j = 1, . . . , n, (2.15c)
d_j = ∑_{m=1}^{M_j} ∑_{t=es_j}^{ls_j} p_j^m x_{jmt}, j = 1, . . . , n, (2.15d)
T ≥ s_j + d_j, j = 1, . . . , n, (2.15e)
∑_{j=1}^{n} ∑_{m=1}^{M_j} ∑_{t=max(τ−p_j^m+1, es_j)}^{min(τ, ls_j)} ρ_{jmi}^r x_{jmt} ≤ R_i^r, i = 1, . . . , q^r, τ = 1, . . . , H, (2.15f)
∑_{j=1}^{n} ∑_{m=1}^{M_j} ∑_{t=es_j}^{ls_j} ρ_{jmi}^n x_{jmt} ≤ R_i^n, i = 1, . . . , q^n, (2.15g)
s_{j2} − s_{j1} ≥ d_{j1}, (j1, j2) ∈ R, (2.15h)
x_{jmt} ∈ {0, 1}, t = es_j, . . . , ls_j, m = 1, . . . , M_j, j = 1, . . . , n, (2.15i)
d_j, s_j ∈ R, j = 1, . . . , n. (2.15j)
It is easier to start with an example. Two products, 1 and 2, are produced from three
raw products, A, B, and C, according to the following technological process.
• Heating. Heat A for 1 h.
• Reaction 1. Mix 50% feed B and 50% feed C and let them react for 2 h to form inter-
mediate BC.
• Reaction 2. Mix 40% hot A and 60% intermediate BC and let them react for 2 h
to form intermediate AB (60%) and product 1 (40%).
• Reaction 3. Mix 20% feed C and 80% intermediate AB and let them react for 1 h
to form impure E.
• Separation. Distill impure E to separate pure product 2 (90%, after 1 h) and pure
intermediate AB (10%, after 2 h). Discard the small amount of residue remaining
at the end of the distillation. Recycle the intermediate AB.
The above technological process is represented by the State-Task-Network (STN)
shown in Fig. 2.3.
The following processing equipment and storage capacities are available.
• Equipment:
– Heater: capacity 100 kg, suitable for task 1;
[Fig. 2.3: State-Task-Network of the process: states Feed A, Feed B, Feed C, Hot A, Int BC, Int AB, Imp. E, Product 1, and Product 2; tasks Heating, Reaction 1, Reaction 2, Reaction 3, and Separation, annotated with durations and input/output fractions.]
• d_i: duration of task i, defined as d_i = max_{s∈S_i^out} p_{is}.
MIP Formulation
Our formulation is based on a discrete representation of time. The planning hori-
zon is divided into a number of periods of equal duration. We number these periods
from 1 to T and assume that period t starts at time t − 1 and ends at time t. Events
of any type — such as the start or end of processing a batch of a task, changes in
the availability of equipment units, etc. — happen only at the beginning or
end of periods.
Preemption is not allowed, and materials are transferred instanta-
neously from states to tasks and vice versa.
We introduce the following variables:
• xi jt = 1 if unit j starts processing task i at the beginning of period t, and xi jt = 0
otherwise;
• yi jt : total amount of products (batch size) used to start a batch of task i in unit j
at the beginning of period t;
• zst : amount of material stored in state s at the beginning of period t.
Now the MIP model is written as follows:
∑_{s=1}^{q} c_s z_{s,T} − ∑_{s=1}^{q} ∑_{t=1}^{T} h_s z_{st} → max, (2.16a)
∑_{i∈I_j} ∑_{τ=max{0, t−d_i+1}}^{min{t, T−d_i}} x_{ijτ} ≤ 1, j = 1, . . . , m, t = 1, . . . , T, (2.16b)
V_{ij}^min x_{ijt} ≤ y_{ijt} ≤ V_{ij}^max x_{ijt}, j = 1, . . . , m, i ∈ I_j, t = 1, . . . , T, (2.16c)
0 ≤ z_{st} ≤ u_s, s = 1, . . . , q, t = 1, . . . , T, (2.16d)
z_s^0 = z_{s1} + ∑_{i∈T_s^in} ρ_{is}^in ∑_{j∈U_i} y_{ij1}, s = 1, . . . , q, (2.16e)
s = 1, . . . , q, t = 1, . . . , T, (2.16f)
x_{ijt} = 0, t > T − d_i, j = 1, . . . , m, i ∈ I_j, (2.16g)
x_{ijt} ∈ {0, 1}, y_{ijt} ∈ R+, j = 1, . . . , m, i ∈ I_j, t = 1, . . . , T, (2.16h)
z_{st} ∈ R+, s = 1, . . . , q, t = 1, . . . , T. (2.16i)
Objective (2.16a) is to maximize the total profit that equals the total cost of the
products in all states at the end of the planning horizon minus the expenses for
storing products during the planning horizon. Inequalities (2.16b) ensure that at any
time any unit cannot process more than one task. The variable bounds in (2.16c)
restrict the batch size of any task to be within the minimum and maximum capacities
of the unit performing the task. The stock limitations are imposed by Ineqs. (2.16d):
the amount of product stored in any state s must not exceed the storage capacity
for this state. The product balance relations from (2.16f) ensure that, for any state
s in each period t > 1, the amount of product entering the state (the stock from
the previous period plus the input from the tasks ending in period t − 1) equals the
amount of product leaving the state (the stock at the end of period t plus the amount
of product consumed by the tasks that started in period t). Equations (2.16e) are
specializations of the balance relations for period 1, which has no preceding period.
In (2.16g) we set the values of some variables to zero in order to force all tasks to
finish within the planning horizon.
Here we consider a modeling approach that combines the disjunctive approach with
the discrete representation of the container. The latter will allow us to formulate the
non-overlapping disjunctions by linear inequalities with small coefficients.
First, let us define two families of decision binary variables:
• zr = 1 if item r is placed into the knapsack, and zr = 0 otherwise;
• yri j = 1 if item r is placed at a point p ∈ L with pi = j, and yri j = 0 otherwise.
For modeling purposes, we also need two families of auxiliary variables:
• x_{rij} = 1 if the open unit strip U_j^i = {w ∈ R^m : j < w_i < j + 1} intersects item r,
and x_{rij} = 0 otherwise;
• s_{r1,r2,i} = 1 if items r1 and r2 are separated by a hyperplane that is orthogonal to
axis i, and s_{r1,r2,i} = 0 otherwise.
In these variables the m-KP is written as follows:
∑_{r=1}^{n} c_r z_r → max, (2.17a)
∑_{j=0}^{L_i − l_i^r} y_{rij} = z_r, i = 1, . . . , m, r = 1, . . . , n, (2.17b)
x_{rij} = ∑_{j1=max{0, j−l_i^r+1}}^{min{j, L_i−l_i^r}} y_{r,i,j1}, j = 0, . . . , L_i − 1, i = 1, . . . , m, r = 1, . . . , n, (2.17c)
∑_{i=1}^{m} s_{r1,r2,i} ≥ z_{r1} + z_{r2} − 1, r2 = r1 + 1, . . . , n, r1 = 1, . . . , n − 1, (2.17d)
x_{r1,i,j} + x_{r2,i,j} + s_{r1,r2,i} ≤ z_{r1} + z_{r2}, j = 1, . . . , L_i, i = 1, . . . , m, r2 = r1 + 1, . . . , n, r1 = 1, . . . , n − 1, (2.17e)
z_r ∈ {0, 1}, r = 1, . . . , n, (2.17f)
y_{rij} ∈ {0, 1}, j = 0, . . . , L_i − l_i^r, i = 1, . . . , m, r = 1, . . . , n, (2.17g)
x_{rij} ∈ {0, 1}, j = 0, . . . , L_i − 1, i = 1, . . . , m, r = 1, . . . , n, (2.17h)
s_{r1,r2,i} ∈ {0, 1}, i = 1, . . . , m, r2 = r1 + 1, . . . , n, r1 = 1, . . . , n − 1. (2.17i)
Objective (2.17a) is to maximize the total cost of items placed into the knapsack.
Equations (2.17b) ensure that the values of y-variables uniquely determine the po-
sitions of all items placed into the knapsack. Simultaneously, these equations set
to zero the values of those y-variables that correspond to the items not placed into
the knapsack (z_r = 0). Equations (2.17c) reflect the relationship between the x-
and y-variables: a strip U_j^i crosses item r only if coordinate i of the corner of r
nearest to the origin is between max{0, j − l_i^r + 1} and min{j, L_i − l_i^r}. Together
with (2.17b), these equations also impose the restrictions on the item sizes: if item r is placed
into the knapsack, then, for any i = 1, . . . , m, the number of strips U_j^i crossing r is
l_i^r, and these strips are consecutive. Two families of inequalities, (2.17d) and (2.17e),
imply that each pair of items placed into the knapsack is separated by at least
one hyperplane orthogonal to a coordinate axis. The remaining relations, (2.17f)–
(2.17i), declare that all variables are binary.
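The geometric meaning of the s-variables can be stated directly: two placed boxes do not overlap exactly when they are separated along at least one axis. A minimal sketch (the names are illustrative):

```python
def separated(pos1, len1, pos2, len2):
    """Return the set of axes i along which two axis-aligned boxes are
    separated by a hyperplane orthogonal to axis i (the role the s-variables
    play in (2.17d)-(2.17e)).  pos*: corner nearest to the origin,
    len*: side lengths."""
    m = len(pos1)
    return {i for i in range(m)
            if pos1[i] + len1[i] <= pos2[i] or pos2[i] + len2[i] <= pos1[i]}

def overlap(pos1, len1, pos2, len2):
    """Two axis-aligned boxes overlap iff they are separated along no axis."""
    return not separated(pos1, len1, pos2, len2)
```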
Our basic formulation (2.17) can be strengthened by adding the following knapsack inequalities:
∑_{r=1}^{n} vol_r z_r ≤ Vol, (2.18)
∑_{r=1}^{n} (vol_r / l_i^r) x_{rij} ≤ Vol / L_i, j = 1, . . . , L_i, i = 1, . . . , m. (2.19)
Inequality (2.18) imposes a natural restriction that the sum of item volumes cannot
exceed the knapsack volume. Inequalities (2.19) reflect the fact that the sum of the
volumes of the intersections of all the items with any unit strip orthogonal to some
coordinate axis cannot exceed the volume of the intersection of the knapsack with
that strip.
Computational experiments showed that these knapsack inequalities may greatly
tighten our basic IP formulation.
In this section we show how to extend our IP formulation of m-KP to cover the
cases when the rotation of items is allowed, and when the items are the unions of
rectangular boxes.
To model the first case, let us consider a version of m-KP in which all n items are
partitioned into k groups, I_1, . . . , I_k, and from each group at most one item can
be placed into the container. To take this additional restriction into account, we
add to (2.17) the following inequalities:
∑_{i∈I_q} z_i ≤ 1, q = 1, . . . , k.
If it is allowed to rotate the items, we put into one group all the items resulting
from all possible rotations of a particular item.
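For axis-aligned rotations of a rectangular box, such a group can be generated by permuting the box's side lengths; a minimal sketch:

```python
from itertools import permutations

def rotation_group(sizes):
    """All distinct axis-aligned orientations of a box, obtained by permuting
    its side lengths.  At most one member of the returned group may be packed,
    which is exactly what the group inequality above enforces."""
    return sorted(set(permutations(sizes)))
```

A 1 × 2 × 3 box yields 6 orientations, while a 1 × 1 × 2 box yields only 3, since permutations that coincide are merged.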
To model the case when some items are unions of two or more rectangular
boxes, let us assume that the n input items are divided into groups of one or more
items, and all items of a group are put into the knapsack together or not at all. In
each group we choose one item, r̄, called the base; any non-base item r in
the group is assigned a reference, ref(r) = r̄, to the base item, and ref(r̄) = −1. Each
non-base item r is also assigned a vector v_r ∈ R^m: if o_r̄ is the corner of the base item
r̄ = ref(r) nearest to the origin, then o_r̄ + v_r is the corner of item r nearest to the origin.
In other words, the shape of any group is determined by its base r̄ and the
vectors v_r assigned to all non-base items r of the group.
The following equations model the group restrictions defined above:
There is a depot that supplies customers with some goods. This depot has a fleet
of vehicles of K different types. There are q_k vehicles of type k, and each such
vehicle has capacity u_k (maximum weight it can carry). We also know the fixed cost,
f_k, of using one vehicle of type k during a day.
At a particular day, n customers have ordered some goods to be delivered from
the depot to their places: customer i is expecting to get goods of total weight di .
To simplify the notations, we will also consider the depot as a customer with zero
demand. For a vehicle of type k, it costs ckij to travel from customer i to customer j.
A route for a vehicle is given by a list of customers, (i_0 = 0, i_1, . . . , i_r, i_{r+1} = 0),
in which none of the customers, except for customer 0 (the depot), appears twice. This
list determines the order of visiting the customers. A route is feasible for vehicles of
type k if the total demand of the customers on the route does not exceed the vehicle
capacity: ∑_{s=1}^{r} d_{i_s} ≤ u_k. The cost of assigning a vehicle of type k to this route is
f_k + ∑_{s=1}^{r+1} c_{i_{s−1}, i_s}^k.
The vehicle routing problem (VRP) is to select a subset of routes such that each
customer is on exactly one route, and then to assign a vehicle of sufficient capacity
(from the depot fleet) to each selected route so that the total cost of assigning
vehicles to the routes is minimum.
[Fig. 2.4: a solution of an example VRP with depot 0 and customers 1-15; the numbers at the nodes are customer demands, and the numbers on the arcs are traveling costs.]
Figure 2.4 displays a solution to an example VRP. Here we have 15 customers
(represented by the nodes numbered from 1 to 15) served from one depot (node 0);
the numbers next to the nodes are customer demands, and the number adjacent to
each arc is the traveling cost for the vehicle assigned to the route that contains this arc.
In our example we have three routes:
• 0 → 4 → 14 → 3 → 6 → 2 → 8 → 0 of total demand 18 and cost 9,
• 0 → 7 → 1 → 9 → 13 → 11 → 0 of total demand 15 and cost 11,
• 0 → 12 → 10 → 5 → 15 → 0 of total demand 9 and cost 7.
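Feasibility and cost of a single route, as defined above, can be checked with the following sketch; the dictionary-based data layout is an assumption for illustration.

```python
def route_ok(route, demand, capacity):
    """A route (0, i1, ..., ir, 0) is feasible for a vehicle of the given
    capacity if the total demand of its customers does not exceed capacity."""
    assert route[0] == 0 and route[-1] == 0
    return sum(demand[i] for i in route[1:-1]) <= capacity

def route_cost(route, fixed_cost, travel):
    """Fixed cost of the vehicle plus the travel costs of consecutive arcs;
    travel: dict (i, j) -> traveling cost."""
    return fixed_cost + sum(travel[i, j] for i, j in zip(route, route[1:]))
```

For example, a route (0, 1, 2, 0) with demands 5 and 6 is feasible for capacity 12 but not for capacity 10.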
To formulate the VRP as an IP, we need the following family of decision binary
variables:
• xikj = 1 if some vehicle of type k travels directly from customer i to customer j,
and xikj = 0 otherwise.
In addition, we also need one family of auxiliary variables:
• yi j : weight of goods that are carried by the vehicle assigned to the route from
customer i to customer j.
In these variables the model is written as follows:
∑_{k=1}^{K} ∑_{j=1}^{n} (f_k + c_{0j}^k) x_{0j}^k + ∑_{k=1}^{K} ∑_{i=1}^{n} ∑_{j∈{0,...,n}\{i}} c_{ij}^k x_{ij}^k → min, (2.20a)
∑_{j=1}^{n} x_{0j}^k ≤ q_k, k = 1, . . . , K, (2.20b)
∑_{k=1}^{K} ∑_{i=0}^{n} x_{ij}^k = 1, j = 1, . . . , n, (2.20c)
∑_{i=0}^{n} x_{ij}^k − ∑_{i=0}^{n} x_{ji}^k = 0, j = 1, . . . , n, k = 1, . . . , K, (2.20d)
∑_{i=0}^{n} y_{ij} − ∑_{i=0}^{n} y_{ji} = d_j, j = 1, . . . , n, (2.20e)
0 ≤ y_{ij} ≤ ∑_{k=1}^{K} (u_k − d_i) x_{ij}^k, i, j = 0, . . . , n, (2.20f)
x_{ii}^k = 0, i = 0, . . . , n, k = 1, . . . , K, (2.20g)
x_{ij}^k ∈ {0, 1}, i, j = 0, . . . , n, k = 1, . . . , K. (2.20h)
Objective (2.20a) is to minimize the total fixed cost of using vehicles plus the
traveling cost of all used vehicles along their routes. Inequalities (2.20b) ensure that
the number of vehicles of any particular type that leave the depot does not exceed
the number of such vehicles in the depot fleet. Equations (2.20c) guarantee that each
customer will be visited by just one vehicle, while (2.20d) guarantee that any vehicle
that arrives at a customer will leave that customer. The next family of equations,
(2.20e), reflects the fact that any vehicle before leaving a customer must upload the
goods ordered by that customer. Each inequality from (2.20f) imposes the capacity
restriction: if a vehicle of type k travels directly from customer i to customer j,
then the total cargo weight on board cannot exceed $u_k - d_i$; moreover, if no
vehicle (of any type) travels directly from customer i to customer j, then $y_{ij} = 0$.
One can easily argue that these capacity restrictions guarantee that each used vehicle
will never carry goods of total weight greater than the vehicle capacity.
A note of precaution is appropriate here. Formulation (2.20) may be very weak
because the variable upper bounds in (2.20f) are usually not tight. Assume
that some inequality $y_{ij} \le \sum_{k=1}^{K}(u_k - d_i)x^k_{ij}$ holds as an equality. If the capacity $u_k$ is
big and the demand $d_i$ is small, then $u_k - d_i$ is big. If we further assume that the
total demand of the customers that are on the same route as customer i and
are visited after customer i is small, then $y_{ij}$ is also small and, therefore, $x^k_{ij}$ takes a
small fractional value. Many binary variables taking fractional values is usually an
indicator of a difficult-to-solve problem. Therefore, one can hardly expect that (2.20)
can be used for solving to optimality even VRPs of moderate size. Nevertheless, if
implemented properly, an application based on this formulation can produce rather
good approximate solutions for VRPs of practical importance.
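Although, as just noted, (2.20) is mainly useful for producing good approximate solutions, any candidate set of routes is easy to validate directly. The sketch below (plain Python written for this text, not part of MIPCL; demands, costs and capacity are hypothetical illustration values) checks that every customer is served exactly once and that no route exceeds the vehicle capacity, and computes the total traveling cost:

```python
def check_routes(routes, demand, cost, capacity):
    """Each route is a list [0, i1, ..., ip, 0] starting and ending at depot 0.
    Returns (feasible, total_cost): feasible is True iff every customer is
    visited exactly once and no route exceeds the vehicle capacity."""
    visited = []
    total = 0.0
    feasible = True
    for route in routes:
        load = sum(demand[i] for i in route if i != 0)
        if load > capacity:                      # capacity restriction, cf. (2.20f)
            feasible = False
        visited += [i for i in route if i != 0]
        total += sum(cost[i, j] for i, j in zip(route, route[1:]))
    if sorted(visited) != sorted(demand):        # each customer served once, cf. (2.20c)
        feasible = False
    return feasible, total

demand = {1: 4, 2: 6, 3: 5}
cost = {(0, 1): 2, (1, 2): 1, (2, 0): 2, (0, 3): 3, (3, 0): 3}
ok, c = check_routes([[0, 1, 2, 0], [0, 3, 0]], demand, cost, capacity=10)
print(ok, c)  # True 11.0
```

For the heterogeneous-fleet model (2.20) one would, in addition, check each route against the capacity of the vehicle type assigned to it.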
Here we consider a special case of the VRP that is more "uniformly" structured. Let
us assume that there are m vehicles in the depot fleet, all of the same
type (K = 1) and of the same capacity U. Fixed costs of using vehicles are not taken
into account. This special VRP is known as the classical vehicle routing problem
(CVRP).
Let $N \stackrel{\mathrm{def}}{=} \{0,1,\dots,n\}$, and let $r(S)$ denote the minimum number of vehicles
needed to serve a subset $S \subseteq N \setminus \{0\}$ of customers. The value of $r(S)$ can be computed
by solving the 1-BPP (see Sect. 2.15) with all bins of capacity U and the item
set S, where the length of item i is $l^1_i = d_i$. Since the 1-BPP is NP-hard in the strong
sense, in practice $r(S)$ is approximated from below by the value $\bigl\lceil \bigl(\sum_{i\in S} d_i\bigr)/U \bigr\rceil$.
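This rounded-capacity bound is a one-liner; a minimal sketch (hypothetical demands, written for this text), using the fact that each vehicle carries at most U units of demand:

```python
import math

def capacity_lower_bound(demands, U):
    # r(S) >= ceil(sum of demands / U): each vehicle carries at most U units.
    return math.ceil(sum(demands) / U)

print(capacity_lower_bound([4, 6, 5], 10))  # 2
```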
To formulate the CVRP as an IP, we use the following family of decision binary
variables:
• $x_{ij} = 1$ if some vehicle travels directly from customer i to customer j, and $x_{ij} = 0$
otherwise.
In these variables the model is written as follows:
$$\begin{aligned}
&\sum_{i\in N}\ \sum_{j\in N\setminus\{i\}} c_{ij}x_{ij} \to \min, &&\text{(2.21a)}\\
&\sum_{j\in N\setminus\{0\}} x_{0j} \le m, &&\text{(2.21b)}\\
&\sum_{j\in N\setminus\{i\}} x_{ji} = 1, \quad i \in N\setminus\{0\}, &&\text{(2.21c)}\\
&\sum_{j\in N\setminus S}\ \sum_{i\in S} x_{ji} \ge r(S), \quad S \subseteq N\setminus\{0\},\ S \neq \emptyset, &&\text{(2.21d)}\\
&x_{ij} \in \{0,1\}, \quad i \in N,\ j \in N\setminus\{i\}. &&\text{(2.21e)}
\end{aligned}$$
Objective (2.21a) is to minimize the total delivery cost. Inequality (2.21b) ensures
that no more than m vehicles may leave the depot and, therefore, there may be no
more than m routes. Equations (2.21c) guarantee that each customer will be visited
exactly once. Inequalities (2.21d) guarantee that the routes defined by the values of
the $x_{ij}$-variables are feasible, i.e., each of them leaves the depot and, due to the definition
of r(S), the total weight of all customer demands on each route does not exceed the
vehicle capacity.
Since (2.21d) contains exponentially many inequalities, IP (2.21) can be solved
only by a cutting plane algorithm (see Sect. 4), and this is possible only if the separation
problem (see Sect. 4.7) for (2.21d) can be solved very quickly. Unfortunately,
in general this is not the case. However, when x is an integer vector that satisfies
all constraints from (2.21) but (2.21d), the problem of finding in (2.21d) an
inequality that is violated at x is trivial, and we leave it to the reader to elaborate
such a separation procedure.
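As a hint toward that exercise, here is one possible sketch (not the book's code; the encoding is an assumption made for illustration): at an integer point x every customer has exactly one successor, so we trace each route from the depot and report its customer set if its demand exceeds one vehicle capacity; any customers left over lie on subtours disconnected from the depot, and their set also violates (2.21d).

```python
import math

def find_violated_set(depot_succ, succ, demand, U):
    """depot_succ: the first customer of every route leaving the depot;
    succ[i]: the node that follows customer i (0 denotes the depot).
    Returns a customer set S whose inequality (2.21d), with r(S) replaced
    by ceil(sum of demands / U), is violated at x, or None."""
    on_route = set()
    for first in depot_succ:
        route, i = [], first
        while i != 0:                      # trace the route back to the depot
            route.append(i)
            i = succ[i]
        on_route.update(route)
        if math.ceil(sum(demand[j] for j in route) / U) > 1:
            return set(route)              # only 1 arc enters S, but r(S) >= 2
    left = set(succ) - on_route            # customers on depot-free subtours
    return left if left else None          # no arc enters S, but r(S) >= 1

print(find_violated_set([1], {1: 2, 2: 0, 3: 4, 4: 3},
                        {1: 1, 2: 1, 3: 1, 4: 1}, 10))  # {3, 4}
```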
2.17 Notes
Sect. 2.1. In more detail, the set packing, set partitioning and set covering problems
are discussed in [98, 123]. The polyhedral structure of the set packing problem is
also considered in Sect. 5.5.
The problems similar to the problem of crew scheduling have always been a
fruitful area for applications of the set covering and set partitioning problems (see
[80]). If you are interested in how trades are organized at combinatorial auctions,
see [100].
Sect. 2.2. The problem of locating service centers was first formulated as an IP in
[15]. The polyhedral structure of this problem is studied in [98, 142].
Sect. 2.3. Exact and approximate methods of solving the problem of the formation
of the index fund are studied in [40].
Sect. 2.4. A classification of multi-product lot-sizing models is given in [143]. The
polyhedral structures of some of these models are studied in [98, 142].
Sect. 2.5. The assembly line balancing problems and the algorithms for their solu-
tion are discussed in [120].
Sect. 2.6. In the literature on optimizing the performance of energy systems, the
unit commitment problem is among those given the highest priority [126].
Sect. 2.7. From the sources [21, 131], you can learn more about the problems of
designing telecommunication networks and the methods for solving them.
Sect. 2.8. A good source about the detailed placement problem is the survey [125].
Sect. 2.9. The problem of assigning aircraft to flights and its IP formulation is
studied in [1].
Sect. 2.10. The model for determining an optimal operation mode of a hybrid car is
an extended version of the model from [29].
Sect. 2.11. The problem of short-term financial management was formulated back
in 1969 [101], and its essence has not changed since then.
Sect. 2.12. A historical reference on the application of MIP for the treatment of
cancer tumors by the method of intensive modulated radiation therapy is given in
[112]. A description of the column generation algorithm for solving (2.14) is also
proposed there.
Sect. 2.13. A good survey on the resource-constrained project scheduling is given
in [128].
Sect. 2.14. Using STNs for describing technological processes was proposed in [82].
A MIP formulation for the problem of optimizing technological processes was also
presented there.
Sect. 2.15. The disjunctive approach for modeling multidimensional packing problems
originates from the floor-planning applications [130]. The first IP formulation
of a two-dimensional packing problem (namely, the cutting stock problem) based
on the discretization of the container space was given by Beasley [20] (see also
Exercise 2.12). Formulation (2.17) is published in this book for the first time.
Sect. 2.16. Many alternative IP formulations have been proposed for different variations
of the vehicle routing problem. The single-commodity flow formulation (2.20)
was first presented in [53]. Formulation (2.21) was given in [84] as an extension of
the IP formulation for the traveling salesman problem proposed in [43] (see also
(6.13)). For a survey on vehicle routing, see [38].
2.18 Exercises
2.8. Modify IP (2.7), which is a formulation of the simple assembly line balancing
problem, to take into account the following requirements for uniformly loading the stations:
a) the work time of any open station (on one product) must be at least q1 percent
of the cycle time; b) the work times of the maximum and minimum loaded stations
must not differ by more than q2 percent.
2.9. Clearing problem. There are m banks in a country. The current balance of bank i
is bi . Several times a day, the Interbank Settlement Center receives a list of payments
Pk = (ik , jk , A1k , A2k , Sk ), k = 1, . . . , n. The fields of the tuple Pk are interpreted as
follows: it is necessary to transfer the sum Sk from the account A1k in bank ik to the
account A2k in bank jk . The goal is to accept as many payments as possible, provided
that the new balance (calculated taking into account the payments made) of each of
the banks will be non-negative. Note that for this optimization problem the fields A1k
and A2k of the payment records are insignificant.
Formulate this clearing problem as an IP.
2.10. Sport tournament scheduling. The teams participating in the basketball cham-
pionship are divided into two conferences: Western and Eastern. There are n1 teams
in the western conference, and n2 teams in the eastern conference. One round of the
championship lasts T weeks. Each team must play no more than once a week and
must play 2k times with each team in its conference and 2q times with each team in
the other conference. For each pair of opponents, half of the games must be played on
the site of each team. Of all the round-robin schedules, the best is the one for which the
minimum interval between games of the same pair of teams is maximum.
Formulate the problem of finding the best tournament schedule as an IP.
2.11. Nearest substring problem4 . Let A be some finite set of symbols (alphabet).
The sequence of characters s = ”s1 s2 . . . sk ” (si ∈ A ) is called a string of length
|s| = k in the alphabet A . A substring of string s is a string ”si1 si2 . . . sim ” composed
of the characters of string s and written in the same order in which they are present
in s, i.e., 1 ≤ i1 < i2 < · · · < im ≤ |s|. The distance d(s1 , s2 ) between two strings s1
and s2 of the same length is defined as the number of positions in which these strings
differ. For example, if s1 = ”ACT ” and s2 = ”CCA”, then d(s1 , s2 ) = 2. If |s1 | < |s2 |,
then the distance d(s1 , s2 ) is defined to be the maximum distance d(s1 , s̄2 ) for all
substrings s̄2 of length |s1 | in string s2 .
Given a list (s1 , s2 , . . . , sn ) of strings in some alphabet A , the length of each
string si is at least m. We need to find a string s of length m such that the maximum
distance d(s, si ) (i = 1, . . . , n) is minimum. Formulate this problem as an IP.
2.12. Formulate the 2-KP as an IP using only the following binary variables: $y^r_{ij}$,
$j = 0,\dots,L_i - l^r_i$, $i = 1,2$, $r = 1,\dots,n$, where $y_{r,1,s} = y_{r,2,t} = 1$ only if $(s,t)$ is the
corner of item r nearest to the knapsack origin.
2.13. Balanced airplane loading. The cargo plane has three cargo compartments:
front (1st), central (2nd) and tail (3rd). The base of compartment i is a rectangle of
width Wi and length Li ; the total weight of cargo in compartment i must not exceed
Gi tonnes, i = 1, 2, 3.
We need to load n containers into the plane, the containers cannot be stacked
on top of each other. The weight of container j is g j , and its base is of width w j
and length l j . The goal is to load the containers into the plane so that the weights of
cargo in different compartments are balanced: the difference between the largest and
smallest ratios of the total weight of the cargo in the compartment to the maximum
allowable weight of cargo in this compartment must be minimum.
Formulate the problem of balanced airplane loading as an IP.
4 Problems of comparing strings, in various formulations, are often encountered in many applications
of computational biology.
Chapter 3
Linear Programming
After the discovery of interior point methods, it seemed that the new methods would
completely displace the simplex algorithms from practical use. The new methods
proved to be very efficient in practice, especially when solving large-scale LPs. A
crucial requirement for an LP algorithm to be used in MIP is its ability to quickly
perform reoptimization, i.e., having found an optimal solution of some LP, the
method must be able to quickly find an optimal solution of a "slightly" modified
version of the just solved LP. None of the interior point algorithms is able to do
reoptimization quickly. The wide use of MIPs in practice and the ability of the dual
simplex method to quickly perform reoptimization enabled the latter to survive the
competition with the interior point methods. Moreover, the acute practical need
for efficient MIP algorithms stimulated research that resulted in a significant increase
in the practical efficiency of the simplex algorithms. Today the best
implementations of the simplex algorithms are quite competitive with the best implementations
of the interior point methods. Therefore, here we will study only
the simplex algorithms, and the main attention will be paid to the dual simplex
method.
from a set I ⊆ M and columns from a set J ⊆ N. If J = N (resp., I = M), then instead
of $A^N_I$ (resp., $A^J_M$) we write $A_I$ (resp., $A^J$).
A subset I of n linearly independent rows of A is called a (row) basic set, the
matrix $A_I$ is called a basic matrix, and the unique solution $\bar x = A_I^{-1}b_I$ of the linear
system $A_I x = b_I$ is a basic solution. If in addition $\bar x$ is feasible, i.e., it satisfies the
system of inequalities $Ax \le b$, then $\bar x$ is called a feasible basic solution, and I is
called a feasible basic set. Note also that the feasible basic solutions are nothing
other than the vertices of the polyhedron $P(A,b) \stackrel{\mathrm{def}}{=} \{x \in \mathbb{R}^n : Ax \le b\}$.
To clarify the above definitions, let us consider the feasible region of an LP with
n = 3 variables and m = 6 constraints
−x1 ≤ 0, H1
− x2 ≤ 0, H2
− x3 ≤ 0, H3
x1 + x2 + x3 ≤ 4, H4
2x1 ≤ 5, H5
3x2 ≤ 7. H6
These constraints are shown in Fig. 3.1. The basic solutions are depicted as bold
dots, and feasible basic solutions in addition are circled.
[Fig. 3.1: the feasible polyhedron; each basic solution is labeled by its basic set, e.g. $x_{\{246\}}$, $x_{\{346\}}$, $x_{\{124\}}$.]
polyhedron P(A, b). For example, the two vertices $(1,0,3)^T$ and $(2,2,0)^T$ of the polytope
from Fig. 1.4 are degenerate, and all the others are nondegenerate. A polyhedron
P(A, b) that has degenerate vertices is called degenerate. Similarly, an LP having
degenerate feasible basic solutions is called degenerate.
A basic set I and the corresponding basic solution $\bar x = A_I^{-1}b_I$ are called dual
feasible if the vector of potentials $\pi^T = c^T A_I^{-1}$ is non-negative. In this case, the
point $\bar y = (\bar y_I = \pi,\ \bar y_{M\setminus I} = 0)$ is a feasible solution to the dual LP1 with the same
objective value:
$$c^T\bar x = c^T A_I^{-1}b_I = \pi^T b_I + 0^T b_{M\setminus I} = b^T\bar y.$$
On the other hand, for any feasible solution x of the primal LP (3.1) and any
feasible solution y of the dual LP (3.2), we have
cT x ≤ yT Ax ≤ yT b. (3.3)
It follows that x̄ and ȳ are optimal solutions to the primal, (3.1), and dual, (3.2),
LPs if the basic set I is simultaneously primal and dual feasible. In the context of
duality, feasible basic sets and solutions are also called primal feasible, and the
solutions to the dual LP are called dual solutions (for the primal LP). We also note
that the components of an optimal dual solution are also called shadow prices (for
the primal LP).
Let I be a feasible basic set, B = AI and x̄ = B−1 bI be the corresponding basic matrix
and feasible basic solution. Now and in what follows, we assume that the order of
the elements in the basic set is fixed, i.e., I is a list, and we denote the i-th element
of this list by I[i].
Let us remove from the basic set some row index I[t]. The solution set of the linear
system $A_{I\setminus I[t]}\,x = b_{I\setminus I[t]}$ is the line $\{x(\lambda) \stackrel{\mathrm{def}}{=} \bar x - \lambda B^{-1}e_t : \lambda \in \mathbb{R}\}$. Let us imagine
that we put an n-dimensional chip at the point $\bar x = x(0)$, and then, increasing λ,
move this chip within the feasible polyhedron P(A, b) along the ray $\{x(\lambda) : \lambda \ge 0\}$
until it rests against some hyperplane given by $A_s x = b_s$ for $s \notin I$. Let $\hat x = x(\hat\lambda_t)$ be
the intersection point of our ray with this hyperplane. Notice that
$$\hat\lambda_t = \frac{\|\hat x - \bar x\|}{\|B^{-1}e_t\|} = \min_{\substack{i \notin I,\\ A_iB^{-1}e_t < 0}} \frac{b_i - A_i\bar x}{-A_iB^{-1}e_t} = \frac{b_s - A_s\bar x}{-A_sB^{-1}e_t}$$
1Duality in linear programming is discussed in Sect. 3.5. The variables yi of the dual LP (3.2) are
dual variables for the primal LP (3.1).
and $\hat x = \hat B^{-1}b_{\hat I}$, where $\hat B = A_{\hat I}$ and the new basic row set $\hat I$ is defined by the rule
$$\hat I[i] = \begin{cases} s, & i = t,\\ I[i], & i \neq t, \end{cases}\qquad\text{(3.4)}$$
so that
$$\hat B = I(t,u)\,B,\qquad \hat B^{-1} = B^{-1}I(t,u)^{-1},\qquad\text{(3.5)}$$
where $u^T = A_sB^{-1}$, and the matrix $I(t,u)$ is obtained from the identity matrix I by
substituting the row vector $u^T$ for row t. Notice that
$$I(t,u)^{-1} = I - \frac{1}{u_t}\,e_t\bigl(u^T - e_t^T\bigr) =
\begin{pmatrix}
1 & 0 & \cdots & 0 & 0 & 0 & \cdots & 0\\
0 & 1 & \cdots & 0 & 0 & 0 & \cdots & 0\\
\vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots\\
0 & 0 & \cdots & 1 & 0 & 0 & \cdots & 0\\
-\frac{u_1}{u_t} & -\frac{u_2}{u_t} & \cdots & -\frac{u_{t-1}}{u_t} & \frac{1}{u_t} & -\frac{u_{t+1}}{u_t} & \cdots & -\frac{u_n}{u_t}\\
0 & 0 & \cdots & 0 & 0 & 1 & \cdots & 0\\
\vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots\\
0 & 0 & \cdots & 0 & 0 & 0 & \cdots & 1
\end{pmatrix}\qquad\text{(3.6)}$$
In LP, such a change in the basis is called a pivot operation. Column t is called a
pivot column, row s is a pivot row, and the element As B−1 et , which in the matrix
AB−1 is in row s and column I[t], is called a pivot element.
Let us illustrate the pivot operation using the example polytope from Fig. 3.1.
Let I = {2, 4, 6} and t = 1. Then $\bar x = x_{\{246\}}$ and the ray $\{x(\lambda) : \lambda \ge 0\}$ is directed
from the point $x_{\{246\}}$ along the edge $[x_{\{246\}}, x_{\{346\}}]$ to the point $x_{\{346\}}$, which lies
on the hyperplane H3. Therefore, $\hat I = \{3, 4, 6\}$ and $\hat x = x_{\{346\}}$ are the new feasible
basic set and solution.
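The inverse formula (3.6) is easy to check numerically. The sketch below (plain Python written for this text, with hypothetical 3×3 data) builds $I(t,u)$, builds the claimed inverse, and verifies that their product is the identity:

```python
def eta_matrix(u, t):
    # Identity matrix with row t replaced by u^T.
    n = len(u)
    M = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    M[t] = list(map(float, u))
    return M

def eta_inverse(u, t):
    # The claimed inverse: row t is (-u_1/u_t, ..., 1/u_t, ..., -u_n/u_t).
    n = len(u)
    E = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for j in range(n):
        E[t][j] = (1.0 / u[t]) if j == t else -u[j] / u[t]
    return E

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

u, t = [2.0, 3.0, 4.0], 1
P = matmul(eta_matrix(u, t), eta_inverse(u, t))
print(all(abs(P[i][j] - (i == j)) < 1e-12 for i in range(3) for j in range(3)))
# True
```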
To ensure that, after performing the pivot operation, the objective function
increases,
$$c^T\hat x - c^T\bar x = -\frac{\|\hat x - \bar x\|}{\|B^{-1}e_t\|}\,c^TB^{-1}e_t = -\frac{\|\hat x - \bar x\|}{\|B^{-1}e_t\|}\,\pi_t > 0,\qquad\text{(3.7)}$$
the index t is chosen so that the directing vector $-B^{-1}e_t$ forms an acute angle with
the gradient, c, of the objective function: $c^TB^{-1}e_t = \pi_t < 0$.
It may happen that all components of the vector $AB^{-1}e_t$ are nonnegative. Then
$\hat\lambda_t = \infty$ (we assume that the minimum over an empty set of alternatives is infinite),
and this means that we can move our chip along the ray $\{x(\lambda) : \lambda \ge 0\}$ infinitely
long, remaining within the polyhedron P(A, b). If, in addition, $\pi_t < 0$, then the objective
function will increase to infinity.
Example 3.1 Solve the following LP by the primal simplex method:
x1 + 2x3 → max,
x1 + 2x2 + x3 ≤ 4,
x1 + x3 ≤ 3,
− x2 + x3 ≤ 1,
−x1 ≤ 0,
− x2 ≤ 0,
− x3 ≤ 0.
Solution. The polyhedron of feasible solutions for this LP is depicted in Fig. 3.2.
When the simplex method starts working with the feasible basic set I = (4, 5, 6), its
iterations are as follows.
[Fig. 3.2: the polyhedron of feasible solutions; the vertices $x^{(0)}, x^{(1)}, x^{(2)}, x^{(3)}$ visited by the method are marked.]
0. $I = (4,5,6)$, $B^{-1} = \begin{pmatrix}-1 & 0 & 0\\ 0 & -1 & 0\\ 0 & 0 & -1\end{pmatrix}$, $x^{(0)} = (0,0,0)^T$, $\pi = (-1,0,-2)^T$.
A feasible basic set is one of the inputs to the simplex procedure. But how to find
such a set? It is possible to transform the initial LP in order to obtain an equivalent
LP for which a feasible basic set can be simply identified. There are several ways to
perform such a transformation, but we will consider only one of them.
Again, let us consider LP (3.1). First, say, by the method of Gaussian elimination,
we find a set I of n linearly independent rows of the constraint matrix A. Let B = AI
and x̄ = B−1 bI . If the vector of the residuals b − Ax̄ is non-negative, then we are
finished: x̄ is a feasible solution (Ax̄ ≤ b), and I is a feasible basic set. Otherwise,
we solve the following LP:
−xn+1 → max,
Ax + axn+1 ≤ b, (3.8)
0 ≤ xn+1 ≤ 1,
Alternatively, we can solve the LP
cT x − Mxn+1 → max,
Ax + axn+1 ≤ b, (3.9)
0 ≤ xn+1 ≤ 1,
where M is a sufficiently large number. We could estimate the value of M, but any
theoretical estimate is usually too large, and it is not easy (for reasons of numerical
stability) to use it in practice. Therefore in practice, the solution of (3.9) begins with
a moderately large value of M. If it turns out that in the obtained optimal solution
the component xn+1 is non-zero, then M is doubled and the solution of this LP
is continued by the simplex method. This is repeated until either a solution with
xn+1 = 0 is found or the value of M exceeds a certain threshold value. In the latter
case, it is concluded that (3.1) does not have feasible solutions.
The calculation of the potential vector π in the simplex procedure is called pricing,
or a pricing operation. Obviously, there can be many negative components $\pi_i$ and,
therefore, we need a rule for an unambiguous choice of the index t. The following rules
(strategies) are best known:
• "first negative": $t = \min\{i : 1 \le i \le n,\ \pi_i < 0\}$;
• "most negative": $t \in \arg\min_{1\le i\le n} \pi_i$;
• "steepest edge": $t \in \arg\min_{1\le i\le n} \dfrac{\pi_i}{\|B^{-1}e_i\|}$;
• "maximum increase": $t \in \arg\min_{1\le i\le n} \hat\lambda_i\pi_i$.
The meaning of these rules, with the exception of the steepest edge rule, should
be clear from their names. When the steepest edge rule is used, the simplex method
moves from the current vertex to the next one along an edge whose directing vector,
$-B^{-1}e_t$, forms the most acute angle, of value $\varphi_t$, with the objective (gradient)
vector c:
$$\cos(\varphi_t) = \frac{-c^TB^{-1}e_t}{\|c\|\cdot\|B^{-1}e_t\|} = \frac{1}{\|c\|}\cdot\frac{-\pi_t}{\|B^{-1}e_t\|}.$$
The larger $\cos(\varphi_t)$, the sharper the angle between the vectors $-B^{-1}e_t$ and c.
In practice, for many years the most negative rule (also known as Dantzig's rule)
prevailed. The first negative rule is the easiest to implement, but in comparison,
say, with the most negative rule, the number of iterations of the simplex method
can increase substantially. The maximum increase rule requires computations that
take too much time at each iteration and, therefore, this rule is not practical. The
same could be said about the steepest edge rule until formulas were found for
recalculating the squares of the column norms.
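Three of the rules can be sketched in a few lines (hypothetical data, written for this text; the steepest edge variant here takes the precomputed squared norms $\gamma_i = \|B^{-1}e_i\|^2$ as input rather than recomputing them):

```python
import math

def first_negative(pi):
    # Smallest index with a negative potential, or None if pi >= 0.
    return next((i for i, p in enumerate(pi) if p < 0), None)

def most_negative(pi):
    # Dantzig's rule: the most negative potential.
    t = min(range(len(pi)), key=lambda i: pi[i])
    return t if pi[t] < 0 else None

def steepest_edge(pi, gamma):
    # Minimize pi_i / ||B^{-1} e_i||, given gamma_i = ||B^{-1} e_i||^2.
    t = min(range(len(pi)), key=lambda i: pi[i] / math.sqrt(gamma[i]))
    return t if pi[t] < 0 else None

pi = [0.5, -1.0, -3.0]
gamma = [1.0, 1.0, 100.0]
print(first_negative(pi), most_negative(pi), steepest_edge(pi, gamma))
# 1 2 1
```

Note how the large norm of column 2 makes the steepest edge rule prefer column 1 even though $\pi_2$ is more negative.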
Lemma 3.1. Let I be a feasible basic set, $\hat I$ be the basic set determined by (3.4) for
some $t \in \{1,\dots,n\}$, and let $B = A_I$, $\hat B = A_{\hat I}$ and
$$\gamma_i \stackrel{\mathrm{def}}{=} \|B^{-1}e_i\|^2 = e_i^T\bigl(B^{-1}\bigr)^TB^{-1}e_i,\quad i = 1,\dots,n.$$
Then
$$\hat\gamma_i \stackrel{\mathrm{def}}{=} \|\hat B^{-1}e_i\|^2 =
\begin{cases}
\dfrac{1}{u_t^2}\,\gamma_t, & i = t,\\[8pt]
\gamma_i - 2\dfrac{u_i}{u_t}\,\alpha_i + \dfrac{u_i^2}{u_t^2}\,\gamma_t, & i \neq t,
\end{cases}\qquad\text{(3.10)}$$
where $u^T = A_sB^{-1}$, $v = B^{-1}e_t$, $\alpha^T = v^TB^{-1}$.
Proof. In view of (3.5), (3.6), and since
$$\Bigl(I - \frac{1}{u_t}\,e_t\bigl(u^T - e_t^T\bigr)\Bigr)e_i =
\begin{cases}
e_i - \dfrac{u_i}{u_t}\,e_t, & i \neq t,\\[8pt]
\dfrac{1}{u_t}\,e_t, & i = t,
\end{cases}$$
we obtain $\hat B^{-1}e_i = B^{-1}e_i - (u_i/u_t)v$ for $i \neq t$ and $\hat B^{-1}e_t = (1/u_t)v$; taking
squared norms yields (3.10). □
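A quick numeric sanity check of (3.10) on hypothetical 2×2 data (written for this text): the values produced by the update formula must agree with the norms $\|\hat B^{-1}e_i\|^2$ computed directly from the pivoted basic matrix.

```python
def inv2(M):
    # Inverse of a 2x2 matrix.
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

B = [[2.0, 1.0], [1.0, 3.0]]          # current basic matrix A_I
A_s = [1.0, -1.0]                     # entering row; position t = 0 leaves
t = 0
Binv = inv2(B)
u = [sum(A_s[k] * Binv[k][j] for k in range(2)) for j in range(2)]   # u^T = A_s B^{-1}
v = [Binv[i][t] for i in range(2)]                                    # v = B^{-1} e_t
alpha = [sum(v[k] * Binv[k][j] for k in range(2)) for j in range(2)]  # alpha^T = v^T B^{-1}
gamma = [sum(Binv[k][i] ** 2 for k in range(2)) for i in range(2)]

Bhat = [A_s, B[1]]                    # row t replaced by A_s
Bhinv = inv2(Bhat)
direct = [sum(Bhinv[k][i] ** 2 for k in range(2)) for i in range(2)]

formula = [gamma[t] / u[t] ** 2 if i == t else
           gamma[i] - 2 * (u[i] / u[t]) * alpha[i] + (u[i] / u[t]) ** 2 * gamma[t]
           for i in range(2)]
print(all(abs(a - b) < 1e-9 for a, b in zip(direct, formula)))  # True
```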
In this section we consider another version of the simplex method known as the
dual simplex method. This simplex method is called dual, because, solving an LP, it
essentially repeats the work of the primal simplex method applied to the dual LP. It
should also be noted that the dual simplex method is the main LP method in MIP.
Again, we consider an LP of the form (3.1). Let I be a dual feasible basic set and
let $B = A_I$, $\bar x = B^{-1}b_I$, $\pi^T = c^TB^{-1} \ge 0$ and $\bar y = (\bar y_I = \pi,\ \bar y_{M\setminus I} = 0)$. Let us recall
that $\bar y$ is a feasible solution to the dual LP (3.2). If the basic solution $\bar x$ is feasible, then
it is optimal. Otherwise, there is an inequality $A_sx \le b_s$, $s \notin I$, that is violated at $\bar x$.
The dual objective function bT y decreases if we move from the point ȳ along the ray
y(θ ) = ȳ − θ v (θ ≥ 0), where, for uT = As B−1 , the components of the vector v are
determined by the rule
$$v_i = \begin{cases}
-1, & i = s,\\
u_j, & i = I[j],\ j = 1,\dots,n,\\
0, & i \in \{1,\dots,m\}\setminus(I\cup\{s\}).
\end{cases}$$
Let $\hat\theta_s$ denote the maximum value of θ such that y(θ) is still a feasible solution
to the dual LP, and let $\hat y = y(\hat\theta_s)$. Notice that
$$\hat\theta_s = \min_{1\le i\le n,\ u_i > 0} \frac{\pi_i}{u_i},$$
and if
$$t \in \mathop{\arg\min}_{1\le i\le n,\ u_i > 0} \frac{\pi_i}{u_i},$$
then the basic set $\hat I$ determined by (3.4) is again dual feasible.
If all components of u are non-positive, then $\hat\theta_s = \infty$ and the dual objective function
decreases indefinitely along the ray $\{y(\theta) : \theta \ge 0\}$. If (3.1) had a feasible
solution x, then by (3.3) we would have cT x ≤ bT y(θ ) for any positive θ . But since
cT x is finite and limθ →∞ bT y(θ ) = −∞, we conclude that (3.1) does not have feasible
solutions.
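One dual ratio test can be sketched as follows (hypothetical data, written for this text, not Listing 3.2 itself): given the current potentials $\pi \ge 0$ and $u^T = A_sB^{-1}$ for the violated row s, it returns the leaving position t, or None when $u \le 0$, which signals primal infeasibility.

```python
def dual_ratio_test(pi, u):
    # Choose t minimizing pi_i / u_i over the positive components of u.
    candidates = [i for i in range(len(u)) if u[i] > 0]
    if not candidates:
        return None                 # u <= 0: the primal LP is infeasible
    return min(candidates, key=lambda i: pi[i] / u[i])

print(dual_ratio_test([1.0, 1.0], [-1.0, 2.0]))   # 1
print(dual_ratio_test([1.0, 1.0], [-1.0, -2.0]))  # None
```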
A detailed description of the dual simplex method is presented in Listing 3.2.
The input of the dual simplex procedure is composed of a triple (c, A, b) describing
an LP, and a dual feasible basic set I. The method terminates for one of two reasons.
1) The current dual feasible basic set I becomes also primal feasible. In this case, the output
is a triple
(true, x, y = (yI = π, yM\I = 0)),
where x is an optimal solution to the primal LP (3.1), and y is an optimal solution
to the dual LP (3.2).
2) If u ≤ 0, then (3.1) does not have feasible solutions. In this case, the procedure
returns a pair (false, y), where y is a "certificate of infeasibility" (see Sect. 3.4).
The dual simplex method has a specific feature that predetermined its wide use in
MIP.
Suppose that we have already solved an instance of (3.1). And now we want to
solve one of the following modifications of the just solved LP:
max{cT x : Ax ≤ b, Hx ≤ β }, (3.12)
T
max{c x : Ax ≤ b̃}. (3.13)
If I is an optimal basic set for (3.1), then I will be a dual feasible basic set for
both new LPs, (3.12) and (3.13), which allows us to use I as an initial basic set in the
dual-simplex procedure. If the changes are small (just a few inequalities in Hx ≤ β ,
or kb − b̃k is small enough), we can expect that the dual simplex method will need
to perform relatively few iterations to build a solution to the modified program.
In a very similar way, the primal simplex method can be used for reoptimization
when the objective function is changed or new columns (variables) are added. This
is due to the fact that with such changes the primal feasibility of a feasible basic
solution is preserved.
In practice, LPs very often appear in the following most general two-sided form:
max{cT x : b1 ≤ Ax ≤ b2 , d 1 ≤ x ≤ d 2 }, (3.14)
Let $x^0$ be a maximizer of the objective function over the bounds alone,
max{cT x : d 1 ≤ x ≤ d 2 },
i.e., $x^0_j = d^2_j$ if $c_j \ge 0$, and $x^0_j = d^1_j$ otherwise.
From what was said in Sect. 3.3.1, it follows that $x^0$ is a dual feasible basic solution
for (3.14). The dual feasible basic set corresponding to $x^0$ is $I = \{m + j : c_j \ge 0\} \cup \{-m - j : c_j < 0\}$.
The dual simplex method for solving LP (3.1) can be considered as a cutting plane
algorithm. Let us illustrate this with an example.
Example 3.2 We need to solve the LP
x1 + 2x2 → max,
x1 + x2 ≤ 4,
−x1 + x2 ≤ 1,
(3.15)
−2x1 − x2 ≤ −2,
0 ≤ x1 ≤ 3,
0 ≤ x2 ≤ 3.
Solution. We begin with the dual feasible basic solution $x^{(0)} = (3,3)^T$, at which
the objective function attains its maximum over the parallelepiped
$P_0 = \{x \in \mathbb{R}^2 : 0 \le x_1 \le 3,\ 0 \le x_2 \le 3\}$ (Fig. 3.3.a).
[Fig. 3.3: interpretation of the dual simplex method as a cutting plane algorithm; panels a, b, c show the polytopes P0, P1, P2 and the points $x^{(0)}, x^{(1)}, x^{(2)}$.]
1. Since the point $x^{(0)}$ does not satisfy the first inequality from (3.15), we cut it
off by the hyperplane $x_1 + x_2 = 4$ (Fig. 3.3.b). After this, we perform an iteration of
the dual simplex method, arriving at the point $x^{(1)} = (1,3)^T$, the maximizer of the
objective function over the polytope
$$P_1 = \{x \in \mathbb{R}^2 : 0 \le x_1 \le 3,\ 0 \le x_2 \le 3,\ x_1 + x_2 \le 4\}.$$
2. Since x(1) violates the second inequality in (3.15), we cut it off using the hyper-
plane −x1 + x2 = 1 (Fig. 3.3.c). Then we perform the iteration of the dual simplex
method:
$$s = 2,\quad u = (-1,\,2)^T,\quad \lambda = \frac12,\quad t = 2,\quad I = (1,2),$$
$$B^{-1} = \frac12\begin{pmatrix}1 & -1\\ 1 & 1\end{pmatrix},\quad x^{(2)} = \frac12\begin{pmatrix}3\\5\end{pmatrix},\quad \pi = \frac12\begin{pmatrix}3\\1\end{pmatrix}.$$
Note that x(2) is the maximizer of the objective function over the polytope
P2 = {x ∈ R2 : 0 ≤ x1 ≤ 3, 0 ≤ x2 ≤ 3, x1 + x2 ≤ 4, −x1 + x2 ≤ 1}.
Since the point $x^{(2)}$ satisfies all constraints in (3.15), it is an optimal solution
to (3.15). □
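The final claim of the example is easy to confirm numerically (a check written for this text, not from the book): the point $(3/2, 5/2)$ satisfies every constraint of (3.15), and its objective value is 13/2.

```python
x1, x2 = 1.5, 2.5
constraints = [
    x1 + x2 <= 4,
    -x1 + x2 <= 1,
    -2 * x1 - x2 <= -2,
    0 <= x1 <= 3,
    0 <= x2 <= 3,
]
print(all(constraints), x1 + 2 * x2)  # True 6.5
```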
The search for a violated inequality As x ≤ bs in the dual simplex method is called
separation. It is clear that a point x can violate many inequalities and, therefore, we
need a rule for an unambiguous choice of index s. The following rules (strategies)
are best known:
• "first violated": $s = \min\{i : A_ix > b_i,\ i \notin I\}$;
• "most violated": $s \in \arg\max_{i\notin I}\,(A_ix - b_i)$;
• "steepest edge": $s \in \arg\max_{i\notin I}\,\dfrac{A_ix - b_i}{\sqrt{1 + \|A_iB^{-1}\|^2}}$;
• "maximum decrease": $s \in \arg\max_{i\notin I}\,\hat\theta_i(A_ix - b_i)$.
About all these rules we can say almost the same as was said in Sect. 3.2.2 about
the corresponding pricing rules. In practice, various variations of the "most violated"
and "steepest edge" rules are used. To make separation based on the steepest
edge rule practical, formulas were obtained for recalculating the row norms of
the matrix $AB^{-1}$.
Lemma 3.2. Let I be a basic set, and let the basic set $\hat I$ be constructed according to
(3.4) for some $t \in \{1,\dots,n\}$, $B = A_I$, $\hat B = A_{\hat I}$ and
$$\eta_i \stackrel{\mathrm{def}}{=} 1 + \|A_iB^{-1}\|^2 = 1 + A_iB^{-1}\bigl(B^{-1}\bigr)^TA_i^T,\quad i = 1,\dots,m.$$
Then
$$\hat\eta_i \stackrel{\mathrm{def}}{=} 1 + \|A_i\hat B^{-1}\|^2 =
\begin{cases}
2, & i \in \hat I,\\[4pt]
\dfrac{1}{u_t^2}\,\eta_s, & i = I[t],\\[8pt]
\eta_i - \dfrac{2}{u_t}(A_i\alpha)(A_iv) + \dfrac{(A_iv)^2}{u_t^2}\,\eta_s, & i \notin \hat I\cup\{I[t]\},
\end{cases}\qquad\text{(3.16)}$$
where $u^T = A_sB^{-1}$, $v = B^{-1}e_t$, $\alpha = B^{-1}u$.
Indeed, by (3.5),
$$\hat B^{-1} = B^{-1} - \frac{1}{u_t}\,vu^T + \frac{1}{u_t}\,ve_t^T,$$
and therefore, for $i \notin \hat I\cup\{I[t]\}$, we have
$$\begin{aligned}
\hat\eta_i &= 1 + A_i\hat B^{-1}\bigl(\hat B^{-1}\bigr)^TA_i^T\\
&= \eta_i - \frac{2}{u_t}(A_i\alpha)(A_iv) - \frac{2}{u_t}(A_iv)^2 + \frac{\|u\|^2}{u_t^2}(A_iv)^2 + \frac{2}{u_t}(A_iv)^2 + \frac{1}{u_t^2}(A_iv)^2\\
&= \eta_i - \frac{2}{u_t}(A_i\alpha)(A_iv) + \frac{(A_iv)^2}{u_t^2}\,\eta_s.
\end{aligned}$$
For $i = I[t]$ we have $\eta_i = 2$, $A_i = e_t^TB$, $A_iv = e_t^TBB^{-1}e_t = 1$ and $A_i\alpha = e_t^TBB^{-1}u = u_t$,
so the same computation gives $\hat\eta_{I[t]} = \eta_s/u_t^2$. Finally, for $i \in \hat I$ the vector $A_i\hat B^{-1}$ is a
unit vector, whence $\hat\eta_i = 2$. □
3.4 Why an LP Does Not Have a Solution?

Having solved an LP on the computer and received a message that the problem
does not have a solution, we would probably want to know the reason, in
particular, in order to try to correct possible errors in our formulation.
It is said that an LP has no solution if: 1) its constraint system is infeasible (there
are no feasible solutions) or 2) the objective value is unbounded over the set of
feasible solutions.
To understand the reason for the inconsistency of a system of linear inequalities,
let us consider a simple example:
2x1 + 5x2 + x3 ≤ 5,
x1 + 2x2 ≥ 3,
x2 ≥ 0,
x3 ≥ 0.
Summing together the first inequality, the second multiplied by −2, and the third
and the fourth, multiplied by −1, we obtain the false inequality 0 ≤ −1. Hence, we
can conclude that the system of inequalities in question is incompatible.
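The multipliers used in this derivation, written against the system in the form Ax ≤ b (with the ≥ rows negated), form a vector y ≥ 0 with y^T A = 0 and y^T b < 0 — exactly a certificate in the sense of Farkas' lemma. A quick numeric check (written for this text):

```python
A = [
    [2.0, 5.0, 1.0],    #  2x1 + 5x2 + x3 <= 5
    [-1.0, -2.0, 0.0],  #  x1 + 2x2 >= 3, negated
    [0.0, -1.0, 0.0],   #  x2 >= 0, negated
    [0.0, 0.0, -1.0],   #  x3 >= 0, negated
]
b = [5.0, -3.0, 0.0, 0.0]
y = [1.0, 2.0, 1.0, 1.0]    # the multipliers 1, 2, 1, 1 used above
yA = [sum(y[i] * A[i][j] for i in range(4)) for j in range(3)]
yb = sum(yi * bi for yi, bi in zip(y, b))
print(yA, yb)  # [0.0, 0.0, 0.0] -1.0
```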
Strange as it may seem, in the general case a system of linear inequalities is
incompatible if and only if the false inequality 0 ≤ −1 can be derived from it. Let
us give a more precise formulation of this criterion, known as Farkas' lemma.
Lemma 3.3 (Farkas). A system of inequalities Ax ≤ b has no solutions if and only
if there exists a vector y ≥ 0 such that yT A = 0 and yT b < 0.
Proof. We call a vector y that satisfies the conditions of Lemma 3.3 a certificate
of infeasibility for the system of inequalities Ax ≤ b.
The sufficiency part of Lemma 3.3 is obvious. Let us prove the necessity.
First we recall that the dual simplex method decides that the system of
inequalities of LP (3.1) is infeasible if, while performing an iteration with a basic set I, it
turns out that the vector $u^T = A_sB^{-1}$ is non-positive. At this point, we can determine
a certificate of infeasibility, $y \in \mathbb{R}^m$, by the rule:
$$y_s = 1,\qquad y_{I[j]} = -u_j,\ j = 1,\dots,n,\qquad y_i = 0,\ i \in \{1,\dots,m\}\setminus(I\cup\{s\}).$$
Indeed, $y \ge 0$ since $u \le 0$; moreover, $y^TA = A_s - u^TA_I = A_s - A_sB^{-1}B = 0$ and
$y^Tb = b_s - u^Tb_I = b_s - A_s\bar x < 0$, because the inequality $A_sx \le b_s$ is violated at $\bar x$. □
The objective function of LP (3.1) is not bounded if and only if there exists a
feasible ray
{x(λ ) = x0 + λ v : λ ≥ 0},
along which the objective function strictly increases, i.e., cT v > 0, Ax0 ≤ b and
Av ≤ 0. Such a pair (x0 , v) is called a certificate of unboundedness for LP (3.1).
Note that the simplex procedure from Listing 3.1, after detecting that the objective
function is unbounded, returns a certificate of unboundedness.
Justifying the correctness of the primal and dual simplex methods, we established
that there is a close relationship between the dual LPs (3.1) and (3.2). This relation-
ship is expressed in the following theorems.
Theorem 3.1 (duality). For the pair of dual LPs (3.1) and (3.2), the following
alternatives take place:
1) both LPs have solutions, and then $z_P = z_D$;
2) if one of the LPs, (3.1) or (3.2), has a feasible solution, and the other has not, then the
objective function of the LP that has a feasible solution is unbounded;
3) both LPs have no feasible solutions.
Proof. We have actually proved assertions 1) and 2) when justifying the correct-
ness of the primal and dual simplex methods. To prove the validity of assertion 3),
it suffices to give an example of a pair of dual LPs for which this assertion holds:
max{−x : 0 x ≤ −1}, min{−y : 0 y = −1, y ≥ 0}.
Theorem 3.2. Let x̄ and ȳ be feasible solutions of the primal, (3.1), and dual, (3.2),
LPs respectively. Then the following conditions are equivalent:
a) x̄ and ȳ are optimal solutions to the primal and dual LPs;
b) cT x̄ = bT ȳ;
c) (complementary slackness conditions)
$$\bar x^T\bigl(c - A^T\bar y\bigr) = 0\quad\text{and}\quad \bar y^T\bigl(b - A\bar x\bigr) = 0.$$
Proof. The equivalence of conditions a) and b) follows from the duality theorem.
Let us prove the equivalence of conditions b) and c). Taking into account the
inequalities $A\bar x \le b$ and $\bar y^TA \ge c^T$, we have
$$c^T\bar x \le \bar y^TA\bar x \le \bar y^Tb.$$
Therefore, if $c^T\bar x = b^T\bar y$, then
$$c^T\bar x = \bar y^TA\bar x = \bar y^Tb,$$
hence
$$\bar x^T\bigl(c - A^T\bar y\bigr) = 0\quad\text{and}\quad \bar y^T\bigl(b - A\bar x\bigr) = 0.$$
Informally, we say that two LPs are dual to each other if all the statements of
Theorems 3.1 and 3.2 hold for them. A formal rule for writing the dual LP for a
given LP is presented in Exercise 3.1.
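Condition c) of Theorem 3.2 can be checked mechanically for a given pair of feasible solutions. A minimal sketch, assuming exact integer or rational data (the function name is illustrative):

```python
def complementary_slack(A, b, c, xbar, ybar):
    """Verify condition c) of Theorem 3.2 for feasible xbar, ybar of the pair
    max{c^T x : Ax <= b} and min{b^T y : A^T y >= c, y >= 0}."""
    m, n = len(A), len(A[0])
    # ybar_i * (b_i - A_i xbar) = 0 for every row i
    row_ok = all(ybar[i] * (b[i] - sum(A[i][j] * xbar[j] for j in range(n))) == 0
                 for i in range(m))
    # xbar_j * (c_j - (A^T ybar)_j) = 0 for every column j
    col_ok = all(xbar[j] * (c[j] - sum(A[i][j] * ybar[i] for i in range(m))) == 0
                 for j in range(n))
    return row_ok and col_ok

# max{x1 + x2 : x1 <= 1, x2 <= 1}: xbar = (1, 1) and ybar = (1, 1) are optimal
print(complementary_slack([[1, 0], [0, 1]], [1, 1], [1, 1], [1, 1], [1, 1]))
```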
d^1_j ≤ x̄_j ≤ d^2_j, j ∈ J,
b^1_i ≤ A_i x̄ ≤ b^2_i, i ∈ M \ I.
The feasible basic solution x̄ is optimal if the following conditions are satisfied:
• for i ∈ I, if b_i = b^2_i > b^1_i, then ȳ_i ≥ 0, and if b_i = b^1_i < b^2_i, then ȳ_i ≤ 0;
• for j ∈ N \ J, if d_j = d^2_j > d^1_j, then c̄_j ≥ 0, and if d_j = d^1_j < d^2_j, then c̄_j ≤ 0.
For the LP in standard form,
max{c^T x : Ax = b, x ≥ 0},
considered in most LP manuals, the row basic set I contains all m rows, and each pivot operation consists in substituting a non-basic column for a basic one.
3.7 Notes
The first to propose an algorithm for solving a general LP was L.V. Kantorovich, winner of the Nobel Prize in Economics in 1975 (see Exercise 3.11). Still, G. Dantzig is deservedly considered to be the father of linear programming for his invention of the simplex method. Dantzig's book [42] remains a classic text on linear programming. The author's views on linear programming changed significantly after studying an excellent brochure of L.G. Khachiyan [81].
Those who want to learn more about linear programming are referred to the sources [37, 102, 110, 117, 122, 134].
Sects. 3.2.2 and 3.3.4. The rules for recalculating column and row norms were ob-
tained in [57]. The computational efficiency of using the steepest edge rules was
proved in [51].
Sect. 3.8. The statements of Exercises 3.11 and 3.14 were taken respectively from
[122] and [78]. The DEA method described in Exercise 3.13 was proposed in [33].
3.8 Exercises
3.1. The rules for writing the dual LP for a given LP are presented in Table 3.1.
Prove that the statements of Theorems 3.1 and 3.2 are also true for the pair of LPs
from this table.
Hint. Convert the primal LP to the canonical form, write down the dual to the
obtained LP, compare the new pair of dual LPs with the original pair from the table.
3.8 Exercises 91
3.2. Using the rules from Table 3.1, write down the dual LPs for the following LPs:
a) 2x1 − 4x2 + 3x3 → max,
   x1 + x2 − x3 = 9,
   −2x1 + x2 ≤ 5,
   x1 − 3x3 ≥ 4,
   x1 ≥ 0, x3 ≤ 0;
b) 5x1 − x2 + 4x3 → max,
   x1 + x2 + x3 = 12,
   3x1 − 2x3 ≥ 1,
   x2 − x3 ≤ 2,
   x1, x3 ≥ 0;
Then write down the dual to the LP in the right-hand side of this inequality.
3.4. Consider the LP
max{ ∑_{j=1}^n c_j x_j : ∑_{j=1}^n a_j x_j ≤ b, 0 ≤ x_j ≤ u_j, j = 1, …, n }. (3.17)
Assume that a_j > 0 for all j, let π be a permutation of {1, …, n} ordering the variables by non-increasing ratios, c_{π(1)}/a_{π(1)} ≥ ⋯ ≥ c_{π(n)}/a_{π(n)}, and let r be the minimum index such that ∑_{j=1}^r a_{π(j)} u_{π(j)} ≥ b. Show that the components of an optimal solution to (3.17) are defined by the rule:
x*_{π(j)} = u_{π(j)}, j = 1, …, r − 1,
x*_{π(r)} = ( b − ∑_{j=1}^{r−1} a_{π(j)} u_{π(j)} ) / a_{π(r)},
x*_{π(j)} = 0, j = r + 1, …, n.
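The greedy rule of Exercise 3.4 can be sketched in a few lines, assuming all a_j > 0 and using exact Fraction arithmetic (the function name is illustrative):

```python
from fractions import Fraction

def knapsack_lp(c, a, u, b):
    """Greedy solution of max{sum c_j x_j : sum a_j x_j <= b, 0 <= x_j <= u_j}
    with all a_j > 0: fill variables in order of non-increasing c_j / a_j
    until the capacity b is exhausted."""
    order = sorted(range(len(c)), key=lambda j: Fraction(c[j], a[j]), reverse=True)
    x, rest = [Fraction(0)] * len(c), Fraction(b)
    for j in order:
        x[j] = min(Fraction(u[j]), rest / a[j])   # take as much of item j as fits
        rest -= a[j] * x[j]
    return x

# c = (3, 2, 1), a = (2, 1, 1), u = (1, 1, 1), b = 2:
# ratios 3/2, 2, 1, so x2 = 1, then x1 = (2 - 1)/2 = 1/2, then x3 = 0
print(knapsack_lp([3, 2, 1], [2, 1, 1], [1, 1, 1], 2))
```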
3.5. Can we solve LPs using a computer program that solves the systems of linear
inequalities Ax ≤ b?
3.6. Use Theorem 3.1 to prove the following important result from game theory.
Theorem 3.3 (von Neumann). For every real m × n-matrix A,
max_{x∈Σ_m} min_{1≤j≤n} ∑_{i=1}^m a_{ij} x_i = min_{y∈Σ_n} max_{1≤i≤m} ∑_{j=1}^n a_{ij} y_j,
where Σ_m denotes the m-dimensional simplex {x ∈ R^m_+ : ∑_{i=1}^m x_i = 1}.
3.7. Arbitrage. We have at our disposal n financial assets, the price of j-th of them
at the beginning of the investment period is p j . At the end of the investment period,
the price of asset j is a random variable v j . Suppose that m scenarios (outcomes)
are possible at the end of the investment period, and then v j is a discrete random
variable. Let vi j be the value of v j when scenario i occurs. From the elements vi j ,
we compose the m × n-matrix V = [vi j ].
A trading strategy is represented by a vector x = (x1 , . . . , xn )T : if x j > 0, then we
buy x j units of asset j, and if x j < 0, then −x j units of asset j are sold. A trading
strategy is called an arbitrage, if it allows us to earn today without any risk of losses
at the end of the investment period:
pT x < 0, (3.18)
V x ≥ 0. (3.19)
The strict inequality (3.18) means that at the beginning of the investment period we receive more than we spend. The validity of all inequalities ∑_{j=1}^n v_{ij} x_j ≥ 0 from (3.19) means that closing the position at the end of the period (i.e., executing the reverse strategy, −x, at the new prices) will not produce a loss in any of the m possible scenarios.
Since prices adjust very quickly on the market, an opportunity to earn from an
arbitrage also disappears very quickly. Therefore, in mathematical financial models,
it is often assumed that arbitrage does not exist. Prove the following statements.
If both LPs
p_j^min = min{ p_j : V^T y = p, y ≥ 0, p_j ≥ 0 },
p_j^max = max{ p_j : V^T y = p, y ≥ 0, p_j ≥ 0 }
have solutions, then there exists no arbitrage if and only if p_j^min ≤ p_j ≤ p_j^max.
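Checking whether a given trading strategy is an arbitrage in the sense of (3.18)-(3.19) is a direct computation. A minimal sketch with made-up illustrative data:

```python
def is_arbitrage(p, V, x):
    """Check conditions (3.18)-(3.19): strategy x is an arbitrage iff
    p^T x < 0 (we earn today) and Vx >= 0 (no loss in any scenario)."""
    cost = sum(pj * xj for pj, xj in zip(p, x))
    payoffs = [sum(vij * xj for vij, xj in zip(row, x)) for row in V]
    return cost < 0 and all(w >= 0 for w in payoffs)

# Two assets, two scenarios; asset 2 never pays less than asset 1 but is
# cheaper, so selling asset 1 and buying asset 2 is an arbitrage:
p = [2, 1]
V = [[1, 1],
     [1, 2]]
print(is_arbitrage(p, V, [-1, 1]))
```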
3.8. Cycling in the simplex method is a situation in which a sequence of basic sets repeats cyclically. Cycling can occur only when solving degenerate LPs (see Sect. 3.1 for the definition of degeneracy); in practice, it often occurs when solving combinatorial optimization problems.
Solve the following LP
(3/4)x1 − 20x2 + (1/2)x3 − 6x4 → max,
1: (1/4)x1 − 8x2 − x3 + 9x4 ≤ 0,
2: (1/2)x1 − 12x2 − (1/2)x3 + 3x4 ≤ 0,
3: x3 ≤ 1,
4: −x1 ≤ 0,
5: −x2 ≤ 0,
6: −x3 ≤ 0,
7: −x4 ≤ 0
by the primal simplex method starting with the basic set I = (4, 5, 6, 7) and using
the following rules for resolving ambiguities when selecting rows for entering and
leaving the basic set:
a) the entering row is the row I[s] for s ∈ arg min_{1≤i≤n} π_i;
b) the leaving row has the minimum index, t, among the row indices on which the
value of λ is attained (see Listing 3.1).
3.9. In the literature, several rules were proposed for eliminating cycling in the simplex algorithms. Perhaps the most useful in practice is the lexicographic rule. A
nonzero vector x ∈ Rn is said to be lexicographically positive if its first non-zero
component is positive. A vector x is lexicographically greater than a vector y ∈ Rn
if the vector x − y is lexicographically positive. Prove the following theorem, which
conveys the essence of the lexicographic rule as applied to the dual simplex method.
Theorem 3.4. Suppose that, when the dual simplex method starts, all columns of the matrix
A(I) := ( c^T ; A ) A_I^{-1}
(the matrix whose first row is c^T A_I^{-1} and whose remaining rows form A A_I^{-1}) are lexicographically positive. Then, during the execution of the algorithm, the columns of the matrix A(I) remain lexicographically positive, the vector of residuals
r(x) := ( −c^T x ; b − Ax )
strictly lexicographically increases from iteration to iteration, and the method terminates after a finite number of iterations if the row I[t] leaving the basic set is selected according to the following rule:
t ∈ arg lexmin{ A(I)_j / u_j : u_j > 0, j = 1, …, n }.
Here the operator lexmin means the choice of the lexicographically minimal vector.
Problem (3.20) is easy to solve: its solutions are the solutions of the system of linear equations A^T A x = A^T b. The other two problems, which are not smooth optimization problems, are harder. Formulate (3.21) and (3.22) as LPs.
3.11. Show that each LP can be reduced to the LP
λ → max,
∑_{j=1}^m t_{ij} = 1, i = 1, …, n,
∑_{j=1}^m ∑_{i=1}^n a_{ijk} t_{ij} = λ, k = 1, …, q,
t_{ij} ≥ 0, i = 1, …, n; j = 1, …, m,
which was investigated by L.V. Kantorovich. This LP admits the following interpretation. To produce a unit of some final product, one unit of each of q intermediate products is used. There are n machines that can perform m tasks. When machine i performs task j, a_{ijk} units of intermediate product k are produced per shift. If t_{ij} is the fraction of time during which machine i performs task j, then λ is the number of final product units produced.
3.12. Hypothesis testing. Let X be a discrete random variable taking values from the
set {1, . . . , n} and having a probability distribution that depends on the value of a
parameter θ ∈ {1, …, m}. The probability distributions of X for the m possible values of θ are given by an n × m-matrix P with elements p_{ki} = P(X = k | θ = i), i.e., the i-th column of P defines the probability distribution of X provided that θ = i.
We consider the problem of estimating the value of parameter θ by observing (sampling) values of X. In other words, a value of X is generated from one of the m possible distributions (values of θ), and we need to determine which distribution (value of θ) was used. The values of θ are called hypotheses, and guessing which of the m hypotheses is true is called hypothesis testing.
A probabilistic classifier for θ is a discrete random variable θ̂ , which depends
on the observed value of X and takes values from {1, . . . , m}. Such a classifier can
be represented by an m × n-matrix T with the elements tik = P(θ̂ = i|X = k). If we
observe the value X = k, then the classifier with probability tik chooses the value θ̂ =
i as an estimate of the parameter θ . The quality of the classifier can be determined
by the m × m-matrix D = T P with the elements di j = P(θ̂ = i|θ = j), i.e., di j is the
probability of predicting θ̂ = i when θ = j.
It is necessary to determine a probabilistic classifier for which the maximum of
the probabilities of classification errors,
If γ0 < 1, then department 0 does not work efficiently, and it should adopt the experience of other departments i for which E_i(u*, v*) = 1, where (u*, v*) is an optimal solution to (3.23).
Formulate (3.23) as an LP.
3.14. In a Markov decision process (with a finite number of states and discrete time),
if at a given period of time the system is in state i from a finite set of states S, we
choose an action a from a finite set of actions A(i), which provides profit ria ; then the
system passes to a new state j ∈ S with a given probability pia j that depends on the
action a undertaken in state i. Our goal is to find a decision strategy that maximizes
the expected average profit for one period during an infinite time horizon. It is known
that an optimal strategy can be sought among stationary strategies. A stationary
strategy (or policy) π, every time the system is in state i, prescribes to undertake
the same action π(i) ∈ A(i). We also note that there is a stationary strategy, which
is optimal for all initial states of the system, and therefore, such a strategy is called
uniform.
Let (x∗ , y∗ ) be an optimal solution to the following LP
Show that any stationary strategy π ∗ such that π ∗ (i) ∈ A∗ (i) is optimum.
3.15. A firm produces some products using a number of identical machines. At the
beginning of each week, each machine is in one of the following states: excellent,
good, average, or bad. Working a week, the machine generates the following income
depending on its state: $100 in excellent state, $80 in good state, $50 in average
state, and $10 in bad state. After inspecting each of the machines at the end of the
week, the firm can decide to replace it with a new one in excellent state. A new
machine costs $200. The state of any machine deteriorates over time as shown in
the table below.
From \ To  Excellent  Good  Average  Bad
Excellent  0.7        0.3   0.0      0.0
Good       0.0        0.7   0.3      0.0
Average    0.0        0.0   0.6      0.4
Bad        0.0        0.0   0.0      1.0
It is necessary to determine a machine replacement strategy that generates the maximum per-week profit in the long run. Write down LP (3.24) for this example, and solve that LP using your favorite LP solver.
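Although the exercise asks for LP (3.24), a stationary replacement policy for this example can also be evaluated directly: build the one-week transition matrix under the policy, compute its stationary distribution by power iteration, and average the weekly rewards. The sketch below encodes one modeling assumption (a machine found in a "replace" state at the end-of-week inspection starts the next week in excellent state); all names are illustrative:

```python
INCOME = [100, 80, 50, 10]         # weekly income in states exc, good, avg, bad
COST = 200                          # price of a new machine (excellent state)
DECAY = [                           # P(next state | current state) if kept
    [0.7, 0.3, 0.0, 0.0],
    [0.0, 0.7, 0.3, 0.0],
    [0.0, 0.0, 0.6, 0.4],
    [0.0, 0.0, 0.0, 1.0],
]

def average_profit(replace, iters=5000):
    """Long-run average weekly profit of the stationary policy: replace the
    machine whenever the end-of-week inspection finds it in a state j with
    replace[j] == True (it then starts the next week in excellent state)."""
    P = [[0.0] * 4 for _ in range(4)]
    r = list(map(float, INCOME))
    for i in range(4):
        for j in range(4):
            if replace[j]:
                P[i][0] += DECAY[i][j]      # replaced -> excellent next week
                r[i] -= COST * DECAY[i][j]  # expected replacement cost
            else:
                P[i][j] += DECAY[i][j]
    # stationary distribution of the policy's chain by power iteration
    d = [1.0, 0.0, 0.0, 0.0]
    for _ in range(iters):
        d = [sum(d[i] * P[i][j] for i in range(4)) for j in range(4)]
    return sum(d[i] * r[i] for i in range(4))

# replace only machines that have deteriorated to the 'bad' state:
print(average_profit([False, False, False, True]))
```

Under these assumptions the "replace only bad machines" policy earns 630/11 ≈ 57.27 dollars per week, while never replacing drives every machine into the bad state and earns only 10.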
Chapter 4
Cutting Planes
As noted in Sect. 1.5, we can strengthen a MIP formulation by adding to it new inequalities that are valid for all feasible solutions but invalid for the relaxation polyhedron. Such inequalities are called cuts. Cuts can be added when formulating (reformulating) the problem; new inequalities can also be added in the process of solving it. The cutting plane method for solving MIPs can be viewed as an extension of the dual simplex method in which the separation procedure (for searching for violated inequalities) is not limited to verifying the constraints of the current formulation but can also generate new cuts.
In this chapter, we study cuts that are used for solving general IPs and MIPs, and we demonstrate the use of these cuts in cutting plane algorithms. Cuts are useful in practice only if they can be generated (computed) by very fast separation procedures. In the last section of this chapter, we discuss the relationship between the optimization and separation problems.
Let us demonstrate how a cutting plane algorithm works on the following simple
example:
x1 + 2x2 → max,
3x1 + 2x2 ≤ 9,
(4.1)
x2 ≤ 2,
x1 , x2 ∈ Z+ .
First, we solve the relaxation LP for IP (4.1), which is obtained from (4.1) by allowing the integer variables to also take real values. The feasible polytope, P0, of this relaxation LP and its optimal solution, x(0) = (5/3, 2)^T, are depicted in Fig. 4.1.a. Since x(0) is not an integer point, it is not a solution to (4.1).
A cutting plane, or simply cut, is an inequality that "cuts off" the point x(0) from the set X of feasible solutions of (4.1).
[Fig. 4.1 The feasible polytope P0 of the relaxation LP with optimal solution x(0) (a), and the polytope P1 obtained after adding the cut, with optimal solution x(1) (b)]
The inequality 3x1 + 3x2 ≤ 10 holds for X, but not for x(0) (3 · (5/3) + 3 · 2 = 11 > 10). We can strengthen this inequality if we first divide it by 3 and then round down the right-hand side:
x1 + x2 ≤ ⌊10/3⌋ = 3.
So, we have found the cut x1 + x2 ≤ 3, and now we add it to (4.1) as an additional constraint. As a result, a piece of the relaxation polytope P0 is cut off. Let us note that this cut-off area (the shaded area in Fig. 4.1.b) does not contain integer points; therefore, no feasible point from X has been cut off, and X is contained in the feasible polytope P1 of the new relaxation LP (with the added cut). An optimal solution
to this LP is the point x(1) = (1, 2)T , which is integer and, therefore, x(1) is an optimal
solution to (4.1).
Let A be a real m × n-matrix, b ∈ R^m, and let us suppose that the polyhedron P(A, b) belongs to R^n_+. Surprisingly, we can build all inequalities that define the convex hull of the set P(A, b) ∩ Z^n using a procedure based on the following very simple observation.
Rounding principle: the inequality x ≤ bbc is valid for the set {x ∈ Z : x ≤ b}.
Proceeding from this principle, we present the following very general procedure for
constructing cuts for the set X = P(A, b) ∩ Zn .
Chvátal-Gomory procedure:
1) choose u ∈ R^m_+;
2) since u ≥ 0, the inequality u^T A x ≤ u^T b is valid for X;
3) since x ≥ 0, the inequality ∑_{j=1}^n ⌊u^T A_j⌋ x_j ≤ u^T b is valid for X;
4) since x is integer, the inequality
∑_{j=1}^n ⌊u^T A_j⌋ x_j ≤ ⌊u^T b⌋ (4.2)
is valid for X.
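One application of this procedure is a short exact computation. The sketch below (illustrative names, Fraction arithmetic) reproduces, with multipliers u = (1/3, 1/3) applied to 3x1 + 2x2 ≤ 9 and x2 ≤ 2, the cut x1 + x2 ≤ 3 of the introductory example:

```python
from fractions import Fraction
from math import floor

def chvatal_gomory_cut(A, b, u):
    """One round of the Chvatal-Gomory procedure for X = P(A, b) with integer x:
    with multipliers u >= 0, return (alpha, beta) of the valid inequality
    sum_j floor(u^T A_j) x_j <= floor(u^T b), cf. (4.2)."""
    m, n = len(A), len(A[0])
    alpha = [floor(sum(Fraction(u[i]) * A[i][j] for i in range(m)))
             for j in range(n)]
    beta = floor(sum(Fraction(u[i]) * b[i] for i in range(m)))
    return alpha, beta

# 3x1 + 2x2 <= 9 and x2 <= 2 with u = (1/3, 1/3) give x1 + x2 <= floor(11/3) = 3
print(chvatal_gomory_cut([[3, 2], [0, 1]], [9, 2],
                         [Fraction(1, 3), Fraction(1, 3)]))
```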
Theorem 4.1. If P(A, b) ⊆ R^n_+, then each inequality valid for P(A, b) ∩ Z^n can be obtained by applying the Chvátal-Gomory procedure a finite number of times.
Theorem 4.1 motivates the following definition. For a given set X = P(A, b) ∩ Z^n, the Chvátal rank of an inequality α^T x ≤ β valid for X is the minimum number of applications of the Chvátal-Gomory procedure necessary to obtain an inequality that is not weaker than α^T x ≤ β. The Chvátal rank of the set X is the maximum Chvátal rank of an inequality in the description of its convex hull.
For example, consider the integer program
∑_{e∈E} c_e x_e → max, (4.3a)
∑_{e∈E(v,V)} x_e ≤ 1, v ∈ V, (4.3b)
x_e ∈ Z_+, e ∈ E, (4.3c)
where E(S, T) denotes the set of edges with one end in S and the other in T. The convex hull of the vectors x satisfying (4.3b) and (4.3c) is called a matching polytope. We need to show that, for each subset S ⊆ V of odd cardinality, the inequality
∑_{e∈E(S,S)} x_e ≤ (|S| − 1)/2 (4.4)
is a Chvátal-Gomory cut. Indeed, summing the degree constraints
∑_{e∈E(v,V)} x_e ≤ 1, v ∈ S,
we have
∑_{e∈E(S,V)} x_e = 2 ∑_{e∈E(S,S)} x_e + ∑_{e∈E(S,V\S)} x_e ≤ |S|.
Dividing by 2 gives
∑_{e∈E(S,S)} x_e + (1/2) ∑_{e∈E(S,V\S)} x_e ≤ |S|/2.
Rounding down first the right- and then the left-hand side of this inequality, we derive (4.4). ⊓⊔
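The derivation above is one CG round with all multipliers equal to 1/2. A minimal sketch for a concrete graph (a triangle), with illustrative names:

```python
from fractions import Fraction
from math import floor
from itertools import combinations

def odd_set_cut(S, edges):
    """Derive the odd-set inequality (4.4) as a Chvatal-Gomory cut: sum the
    degree constraints over v in S with multipliers 1/2, then floor the
    left-hand coefficients and the right-hand side."""
    coeff = {e: floor(sum(Fraction(1, 2) for v in e if v in S)) for e in edges}
    rhs = floor(Fraction(len(S), 2))   # floor(|S|/2) = (|S|-1)/2 for odd |S|
    return coeff, rhs

# Triangle on {1, 2, 3}: every edge gets coefficient 1 and the cut is
# x12 + x13 + x23 <= 1
edges = list(combinations([1, 2, 3], 2))
print(odd_set_cut({1, 2, 3}, edges))
```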
From a practical point of view, a class of cuts is useful only if one can efficiently solve the separation problem for this class. With respect to Ineqs. (4.2), the separation problem is formulated as follows:
Given a point x̃ ∈ R^n_+, find u ∈ R^m_+ such that the corresponding inequality in (4.2) is violated at x̃, or prove that x̃ satisfies all inequalities from (4.2).
It is known that this separation problem is NP-hard. On the other hand, there are a number of important special cases in which it can be solved efficiently. We consider only one such special case, when x̃ is a vertex of the relaxation polyhedron P(A, b) ⊆ R^n_+.
Theorem 4.2. Let x̃ = A_I^{-1} b_I be a feasible basic solution to the system of inequalities Ax ≤ b (a vertex of P(A, b)), where I ⊆ {1, …, m} is a feasible basic set. Suppose that x̃_i ∉ Z, and denote by v^T = e_i^T A_I^{-1} the i-th row of the inverse basic matrix. Then the inequality (Gomory cut)
∑_{j=1}^n ⌊(v − ⌊v⌋)^T A_j^I⌋ x_j ≤ ⌊(v − ⌊v⌋)^T b_I⌋ (4.5)
is valid for X = P(A, b) ∩ Z^n but violated at x̃.
Proof. Since (4.5) is a Chvátal-Gomory cut, it is valid for X. Let us prove that this inequality is violated at x̃. Note that v^T A_I = e_i^T; hence, for integer A and b, each coefficient (v − ⌊v⌋)^T A_j^I is an integer, and, since A_I x̃ = b_I,
∑_{j=1}^n ⌊(v − ⌊v⌋)^T A_j^I⌋ x̃_j − ⌊(v − ⌊v⌋)^T b_I⌋ =
( x̃_i − ∑_{j=1}^n ⌊v⌋^T A_j^I x̃_j ) − ( ⌊x̃_i⌋ − ⌊v⌋^T b_I ) = x̃_i − ⌊x̃_i⌋ > 0.
x1 + 2x2 → max,
1: 3x1 + 2x2 ≤ 6,
2: −3x1 + 2x2 ≤ 0,
3: 0 ≤ x1 ≤ 2,
4: 0 ≤ x2 ≤ 2,
x1 , x2 ∈ Z
[Fig. 4.2 The polytopes of the successive relaxation LPs and the points x(2) (a), x(3) (b), and x(4) (c); the areas removed by the Gomory cuts are shaded]
3. The point x(2) satisfies all constraints of the relaxation LP. We have only one fractional component, x2, which will be used to build the Gomory cut. For
v^T = (1/4, 1/4), (v − ⌊v⌋)^T = (1/4, 1/4),
taking into account that B = [ 3 2 ; −3 2 ], we write down the cut
⌊(1/4, 1/4)(3, −3)^T⌋ x1 + ⌊(1/4, 1/4)(2, 2)^T⌋ x2 ≤ ⌊(1/4, 1/4)(6, 0)^T⌋.
Having carried out the calculations, we obtain the inequality x2 ≤ 1 (Fig. 4.2.a), which is added to our program as the 5-th constraint. This inequality is violated at x(2) by β = 1 − 3/2 = −1/2.
Having the violated inequality (s = 5), we perform the next iteration of the dual simplex method. First we calculate
u^T = (0, 1) · [ 1/6 −1/6 ; 1/4 1/4 ] = (1/4, 1/4),
λ = min{ 3, 4/3 } = 4/3 ⇒ t = 2 ⇒ I = (1, 5),
4. The point x(3) satisfies all constraints of the relaxation LP (including one cut) but is not integer. We have only one fractional component, x1, which we will use to build the Gomory cut. For
v^T = (1/3, −2/3), (v − ⌊v⌋)^T = (1/3, 1/3),
taking into account that B = [ 3 2 ; 0 1 ], we write down the cut
⌊(1/3, 1/3)(3, 0)^T⌋ x1 + ⌊(1/3, 1/3)(2, 1)^T⌋ x2 ≤ ⌊(1/3, 1/3)(6, 1)^T⌋.
After the calculations, we obtain the inequality x1 + x2 ≤ 2 (Fig. 4.2.b), which will be the 6-th constraint in our IP. This new inequality is violated at x(3) by β = 2 − 4/3 − 1 = −1/3.
Having the violated inequality (s = 6), we perform the next iteration of the dual simplex method. First we calculate
u^T = (1, 1) · [ 1/3 −2/3 ; 0 1 ] = (1/3, 1/3),
λ = min{1, 4} = 1 ⇒ t = 1 ⇒ I = (6, 5),
The point x(4) is integer and satisfies all constraints (including cuts) of our IP (Fig. 4.2.c). Therefore, x(4) is an optimal solution of our example IP. ⊓⊔
It is clear that the rounding principle is not applicable when deriving cuts for mixed
integer sets, i.e., when there are variables of both types, integer and continuous.
Another simple observation will be useful here.
Disjunctive principle: if an inequality is valid for both sets X1 and X2 , then it
is also valid for their union X1 ∪ X2 .
Proof. Let
The statement of Lemma 4.1 is illustrated in Fig. 4.3, where the set X is repre-
sented by the straight horizontal lines. Inequality (4.6) cuts off from the relaxation
polyhedron, {(x, y) ∈ R × R+ : x − y ≤ b}, the shaded triangle.
is valid for Y .
[Fig. 4.3 The set X (horizontal segments), the line x − y = b, and the triangle cut off from the relaxation polyhedron by Ineq. (4.6)]
∑_{j∈N1} ⌊a_j⌋ x_j + ∑_{j∈N2} a_j x_j − y2 ≤ b,
w = ∑_{j∈N1} ⌊a_j⌋ x_j + ∑_{j∈N2} ⌈a_j⌉ x_j ∈ Z and z = y2 + ∑_{j∈N2} (1 − f_j) x_j ≥ 0,
Inequality (4.7) is also known as the mixed integer rounding of the inequality ∑_{j=1}^n a_j x_j + y1 − y2 ≤ b.
Note that, for an inequality ∑_{j=1}^n a_j x_j ≤ b with non-negative integer variables, its mixed integer rounding is
∑_{j=1}^n ( ⌊a_j⌋ + max{0, f_j − f}/(1 − f) ) x_j ≤ ⌊b⌋,
where f = b − ⌊b⌋ and f_j = a_j − ⌊a_j⌋.
Consider the set
X = {(x, y) ∈ Z_+ × R_+ : x + y ≤ 5, y − x ≤ 2}.
Introducing slack variables s1, s2 ≥ 0, we rewrite the defining inequalities as equations:
x + y + s1 = 5,
−x + y + s2 = 2.
In the extended space, the point (x̃, ỹ) = (3/2, 7/2) corresponds to the point
(x̃, ỹ, s̃1, s̃2) = (3/2, 7/2, 0, 0).
Subtracting the second equation from the first, we eliminate y:
2x + s1 − s2 = 3,
which is divided by 2:
x + (1/2)s1 − (1/2)s2 = 3/2.
Let us note that, in general, the equation is divided by the coefficient of an integer variable taking a fractional value. If there are several such variables, then as a divisor we can try each of their coefficients.
Applying Theorem 4.3 to the inequality
x + (1/2)s1 − (1/2)s2 ≤ 3/2,
we obtain its mixed integer rounding
x − s2 ≤ 1.
[Fig. 4.4 The set X (five bold segments and the point (5, 0)), the point (x̃, ỹ), and the area cut off from the relaxation polyhedron by the cut y ≤ 3]
Substituting s2 = 2 + x − y into the last inequality, we obtain the cut y ≤ 3. In Fig. 4.4, the set X is represented by five bold segments and the point (5, 0); the region that is cut off (by y ≤ 3) from the relaxation polyhedron is shaded. ⊓⊔
In this section we will learn how to construct fractional Gomory cuts for a mixed integer set P(A, b; S) ⊆ R^n_+. Let x̃ = A_I^{-1} b_I be a feasible basic solution, where I is a feasible basic set of rows for the system of linear inequalities Ax ≤ b. Introducing a vector of slack variables, s ∈ R^n_+, we rewrite the subsystem of basic inequalities, A_I x ≤ b_I, in equality form A_I x + s = b_I, or
x + A_I^{-1} s = A_I^{-1} b_I = x̃. (4.8)
We will denote the elements of the inverse basic matrix A_I^{-1} by ā_{ij}. Let us pick an integer variable x_i whose current value x̃_i is not integer. Let N1(i) and N2(i) be, respectively, the index sets of the integer and non-integer variables s_j in the i-th equation of (4.8). We consider a slack variable s_j to be integer if all variables and coefficients in both parts of the inequality A_{I[j]} x ≤ b_{I[j]} are integer. Now, let us rewrite the i-th equation of (4.8) in the form:
Theorem 4.4. If x_i is an integer variable whose value x̃_i is not integer, f0 = x̃_i − ⌊x̃_i⌋, and f_j = ā_{ij} − ⌊ā_{ij}⌋ for j ∈ N1(i) ∪ N2(i), then the following fractional Gomory cut
∑_{j∈N1(i): f_j ≤ f0} f_j s_j + ( f0/(1 − f0) ) ∑_{j∈N1(i): f_j > f0} (1 − f_j) s_j
+ ∑_{j∈N2(i): ā_{ij} > 0} ā_{ij} s_j − ( f0/(1 − f0) ) ∑_{j∈N2(i): ā_{ij} < 0} ā_{ij} s_j ≥ f0 (4.10)
is valid for all (x_i, s) ∈ Z_+ × R^n_+ such that (4.9) is satisfied and s_j ∈ Z for all j ∈ N1(i).
Proof. Let us write down the mixed integer rounding of (4.9):
x_i + ∑_{j∈N1(i): f_j ≤ f0} ⌊ā_{ij}⌋ s_j + ∑_{j∈N1(i): f_j > f0} ( ⌊ā_{ij}⌋ + (f_j − f0)/(1 − f0) ) s_j
+ ( 1/(1 − f0) ) ∑_{j∈N2(i): ā_{ij} < 0} ā_{ij} s_j ≤ ⌊x̃_i⌋. (4.11)
Using (4.9), we express the variable x_i in terms of the variables s_j and then substitute the resulting expression into (4.11); as a result, we obtain (4.10). ⊓⊔
Gomory fractional cuts (4.10) are written in terms of the slack variables s j . To
return to the original variables xi , we need to substitute s = bI − AI x into (4.10) to
get a cut for the set P(A, b; S).
Example 4.4 We need to solve the IP
x1 + x2 → max
1: 2x1 + 2x2 + x3 ≤ 9,
2: x2 − x3 ≤ 0,
3: 0 ≤ x1 ≤ 4, (4.12)
4: 0 ≤ x2 ≤ 3,
5: 0 ≤ x3 ≤ 5,
x1 , x2 ∈ Z
by the cutting plane algorithm that generates only fractional Gomory cuts.
Solution. This time, we will not practice the dual simplex method to solve the relaxation LP for (4.12); we just start with its optimal basic solution:
I = (1, 2, 3), B^{-1} = [ 0 0 1 ; 1/3 1/3 −2/3 ; 1/3 −2/3 −2/3 ], x(1) = (4, 1/3, 1/3)^T, π = (1/3, 1/3, 1/3)^T.
Let us choose the variable x2, which is integer and whose current value is not integer, to start computing the first cut:
x2 + (1/3)s1 + (1/3)s2 − (2/3)s3 = 1/3.
Here s3 is the only integer slack variable. Next we compute the coefficients
f0 = 1/3, f3 = −2/3 − (−1) = 1/3,
and then we write down the cut
(1/3)s1 + (1/3)s2 + (1/3)s3 ≥ 1/3,
or
s1 + s2 + s3 ≥ 1. (4.13)
Substituting the expressions
s1 = 9 − 2x1 − 2x2 − x3, s2 = −x2 + x3, s3 = 4 − x1
into (4.13), after simplification we obtain the cut in the original variables x1, x2, x3:
6: x1 + x2 ≤ 4,
The point x(2) is integer and satisfies all constraints in (4.12). Therefore, x(2) is an optimal solution to (4.12). ⊓⊔
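The computation of Example 4.4's first cut can be sketched from formula (4.10); the function name and interface are illustrative:

```python
from fractions import Fraction
from math import floor

def gomory_fractional_cut(abar, xbar_i, integer_slack):
    """Coefficients of the fractional Gomory cut (4.10) built from the row
    x_i + sum_j abar[j]*s_j = xbar_i of system (4.8); integer_slack[j] tells
    whether s_j is an integer slack.  Returns (coeffs, f0) for the cut
    sum_j coeffs[j]*s_j >= f0."""
    f0 = xbar_i - floor(xbar_i)
    coeffs = []
    for aij, is_int in zip(abar, integer_slack):
        if is_int:                                   # j in N1(i)
            fj = aij - floor(aij)
            coeffs.append(fj if fj <= f0 else f0 * (1 - fj) / (1 - f0))
        else:                                        # j in N2(i)
            coeffs.append(aij if aij > 0 else -f0 * aij / (1 - f0))
    return coeffs, f0

# Row of Example 4.4: x2 + (1/3)s1 + (1/3)s2 - (2/3)s3 = 1/3, only s3 integer;
# the cut is (1/3)(s1 + s2 + s3) >= 1/3, i.e. s1 + s2 + s3 >= 1
row = [Fraction(1, 3), Fraction(1, 3), Fraction(-2, 3)]
print(gomory_fractional_cut(row, Fraction(1, 3), [False, False, True]))
```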
The union X1 ∪ X2 of two sets X1 and X2 is also called the disjunction of X1 and X2 ,
since the condition x ∈ X1 ∪ X2 is also written as the disjunction x ∈ X1 or x ∈ X2 .
The disjunctive principle was introduced in Sect. 4.3, where we studied the mixed integer rounding cuts. Let us recall the essence of this simple principle: if an inequality holds for both sets, X1 and X2, then it is also valid for X1 ∪ X2. In this section, proceeding from this disjunctive principle, we will develop another version of the disjunctive cuts.
From the algorithmic point of view, the disjunction of polyhedra is of special
interest.
Theorem 4.5. Let Pi = {x ∈ Rn+ : Ai x ≤ bi } for i = 1, 2. If both polyhedra, P1 and
P2 , are non-empty, then an inequality α T x ≤ β is valid for the union P1 ∪ P2 if and
only if there exists a pair of vectors y1 , y2 ≥ 0 such that (Ai )T yi ≥ α and (yi )T bi ≤ β
for i = 1, 2.
Proof. The sufficiency is verified simply. Suppose that there exists a pair of vectors y1, y2 ≥ 0 satisfying the conditions of the theorem. Then, for any x ∈ P_i,
α^T x ≤ (y^i)^T A^i x ≤ (y^i)^T b^i ≤ β.
To prove the necessity, for i = 1, 2, consider the pair of dual LPs
max{α^T x : A^i x ≤ b^i, x ≥ 0},
min{(b^i)^T y : (A^i)^T y ≥ α, y ≥ 0}.
The following method of solving the separation problem for the disjunction of two polyhedra follows directly from Theorem 4.5.
Separation procedure for conv(P1 ∪ P2):
Check whether the point x̃ ∈ R^n belongs to the set P1 ∪ P2, and if not, find a separating hyperplane by solving the LP
x̃^T α − β → max,
(A^i)^T y^i ≥ α, i = 1, 2,
(b^i)^T y^i ≤ β, i = 1, 2, (4.14)
y^i ≥ 0, i = 1, 2,
−1 ≤ β ≤ 1.
x1 + 2x2 → max,
1: −2x1 + 3x2 ≤ 4,
2: x1 + x2 ≤ 5,
(4.15)
3: x1 ≥ 0,
4: x2 ≥ 0,
x1 , x2 ∈ Z
The solution x(1) is non-integer, and we will try to cut it off. The polytope P(1) of feasible solutions of the relaxation LP is shown in Fig. 4.5.a. Since there are no integer points in the strip 2 < x1 < 3, we can remove from P(1) all points of this strip (in Fig. 4.5.a the deleted area is shaded). Now the feasible domain of our problem is contained in the union of two polyhedra, P_1^(1) and P_2^(1), that are given by the following systems of inequalities:
P_1^(1): 1: −2x1 + 3x2 ≤ 4, 2: x1 + x2 ≤ 5, 3: x1 ≤ 2, x1 ≥ 0, x2 ≥ 0;
P_2^(1): 1: −2x1 + 3x2 ≤ 4, 2: x1 + x2 ≤ 5, 3: −x1 ≤ −3, x1 ≥ 0, x2 ≥ 0.
To separate x(1) from conv(P_1^(1) ∪ P_2^(1)), we need to solve the following LP:
[Fig. 4.5 The polytopes P(1) (a), P(2) (b), and P(3) (c) of the successive relaxation LPs with the points x(1), x(2), x(3); the strips removed by the disjunctions are shaded]
Using any LP solver available to you, you can check that this LP has an optimal solution in which the variables α1, α2, and β take the following values:
α1* = 1/6, α2* = 1/4, β* = 1.
Thus, we have found the cut
(1/6)x1 + (1/4)x2 ≤ 1,
or
2x1 + 3x2 ≤ 12.
We add this cut to (4.15) as the 5-th constraint. The polytope P(2) of the feasible solutions of the new relaxation LP is shown in Fig. 4.5.b.
Now let us proceed to the reoptimization. At the point x(1), the added inequality (s = 5) is violated by 12 − 2 · (11/5) − 3 · (14/5) = −4/5. We calculate
" #
− 15 35
T 1 12
u = (2, 3) · 1 2 = , ,
5 5
5 5
7
λ= ⇒ t = 2, I = (1, 5),
12
and then perform the pivot operation:
" # " 1 #
−1 −1 −1 −1 3
1 0 −4 1
B := B · I(2, u) = 15 5
· 1 5 =
4
,
2 − 12 12 1 1
5 5 6 6
! !
1 7 1 1
5 − 12 ·5 12
π= 7
= 7
,
12 12
! !
11 1
(2) (1) −1 5 4 4 2
x =x + β B e2 = 14
− · 1
= 8 .
5
5 6 3
The new solution x(2) is still non-integer, and we will try to cut it off. Since there are no integer points in the strip 2 < x2 < 3, we can remove from P(2) all points of this strip (the deleted area is shaded in Fig. 4.5.b). Now the feasible domain of our IP is contained in the union of two polyhedra, P_1^(2) and P_2^(2), that are the solution sets of the following systems of inequalities:
From Fig. 4.5.b it is clear that the polyhedron P_2^(2) is empty and, therefore, the second system of inequalities is incompatible. But we will not rely on the drawing.
To separate x(2) from conv(P_1^(2) ∪ P_2^(2)), we solve the following LP:
2α1 + (8/3)α2 − β → max,
−2y_1^1 + y_2^1 + 2y_3^1 − α1 ≥ 0,
3y_1^1 + y_2^1 + 3y_3^1 + y_4^1 − α2 ≥ 0,
4y_1^1 + 5y_2^1 + 12y_3^1 + 2y_4^1 − β ≤ 0,
−2y_1^2 + y_2^2 + 2y_3^2 − α1 ≥ 0,
3y_1^2 + y_2^2 + 3y_3^2 − y_4^2 − α2 ≥ 0,
4y_1^2 + 5y_2^2 + 12y_3^2 − 3y_4^2 − β ≤ 0,
−1 ≤ β ≤ 1,
y_1^1, y_2^1, y_3^1, y_4^1, y_1^2, y_2^2, y_3^2, y_4^2 ≥ 0.
Its solution yields the cut
6: x2 ≤ 2,
which is added to our already extended IP as the 6-th constraint. The feasible polytope, P(3), of the relaxation LP for this new IP is shown in Fig. 4.5.c.
Now let us proceed to the reoptimization. At the point x(2), the last added inequality (s = 6) is violated by 2 − 8/3 = −2/3. We calculate
" #
− 14 14
T 1 1
u = (0, 1) · 1 1 = , ,
6 6
6 6
114 4 Cutting Planes
1 7 1
λ = min , = ⇒ t = 1, I = (6, 5),
2 2 2
In this section we will consider MIPs in which all integer variables are binary:
cT x → max,
Ax ≤ b, (4.16)
x j ∈ {0, 1}, j = 1, . . . , p,
x_j (Ax − b) ≤ 0, (1 − x_j)(Ax − b) ≤ 0,
by substituting x_j for x_j^2 and a new continuous variable y_i for x_i x_j (i ≠ j). Let M_j denote the polyhedron of solutions of the resulting system of linear inequalities.
Project the polyhedron M_j ⊆ R^n × R^{n−1} onto the space of the x-variables. Let P_j = lfpr(P) denote the resulting polyhedron.
Recall that the projection of a set Q ⊆ U × V onto the set U is the set
proj_U(Q) := {u ∈ U : ∃ v ∈ V such that (u, v) ∈ Q}.
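The projection of a polyhedron given by linear inequalities can also be computed explicitly by Fourier-Motzkin elimination (a classical method, not the implicit separation route taken in the text). A minimal sketch eliminating one variable:

```python
from fractions import Fraction

def eliminate(rows, k):
    """One Fourier-Motzkin step: given inequalities
    sum_j rows[i][j]*x_j <= rows[i][-1], return inequalities describing the
    projection onto the variables other than x_k."""
    pos = [r for r in rows if r[k] > 0]
    neg = [r for r in rows if r[k] < 0]
    out = [list(r) for r in rows if r[k] == 0]   # already free of x_k
    for p in pos:
        for q in neg:
            # scale p by 1/p[k] and q by -1/q[k], then add: x_k cancels
            out.append([Fraction(pj) / p[k] - Fraction(qj) / q[k]
                        for pj, qj in zip(p, q)])
    return out

# Project {(x, y) : x + y <= 2, -x + y <= 0, -y <= 0} onto y (eliminate x);
# the result describes 0 <= y <= 1
print(eliminate([[1, 1, 2], [-1, 1, 0], [0, -1, 0]], 0))
```

Repeating the step for every lifted y-variable would, in principle, produce an explicit description of P_j, though the number of inequalities can grow quickly; this is why the text prefers a separation procedure.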
The inclusion X ⊆ P_j follows from the fact that, for x ∈ X, the point (x, y) belongs to M_j if we define y_i = x_i x_j for all i ≠ j. The inclusion P_j ⊆ P is valid because each of the inequalities A_i x ≤ b_i is the sum of two inequalities defining M_j:
x_j A_i x ≤ b_i x_j and (1 − x_j) A_i x ≤ b_i (1 − x_j).
The above lift-and-project procedure does not give an explicit description (in the form of a system of linear inequalities) of the polyhedra P_j. But we can still describe each polyhedron P_j implicitly by providing a separation procedure. The polyhedron M_j is described by the following system of inequalities:
(A_j − b) x_j + A_{N\{j}} y ≤ 0,
A_{N\{j}} x_{N\{j}} + b x_j − A_{N\{j}} y ≤ b,
where N := {1, …, n}. The point x̄ ∈ R^n belongs to P_j if and only if the following system of inequalities is compatible:
A_{N\{j}} y ≤ x̄_j (b − A_j),
−A_{N\{j}} y ≤ (1 − x̄_j) b − A_{N\{j}} x̄_{N\{j}}. (4.17)
By Farkas' lemma (Lemma 3.3), the system of inequalities (4.17) has a solution if and only if
x̄_j (b − A_j)^T u + ( (1 − x̄_j) b − A_{N\{j}} x̄_{N\{j}} )^T v ≥ 0
for all u, v ∈ R^m_+ such that
u^T A_{N\{j}} − v^T A_{N\{j}} = 0.
We can verify this condition of Farkas’ lemma by solving the following LP:
or after rearranging,
separates x̄ from Pj .
Example 4.6 We need to separate the point x̄ = (1/3, 2) from the solution set of the
following system:
1: 3x1 + 2x2 ≤ 5,
2: −x1 ≤ 0,
3: x1 ≤ 1,
4: −x2 ≤ 0,
5: x2 ≤ 2,
x1 ∈ {0, 1}.
Solution. First, for j = 1, we write down (4.18) applied to our problem instance:
z = (2/3)u1 + (1/3)u2 + (2/3)u5 − (2/3)v1 + (2/3)v3 + 2v4 − (2/3)v5 → min,
2u1 − u4 + u5 − 2v1 + v4 − v5 = 0,
u1 + u2 + u3 + u4 + u5 + v1 + v2 + v3 + v4 + v5 = 1,
u1, u2, u3, u4, u5, v1, v2, v3, v4, v5 ≥ 0.
The vectors ū = (1/3, 0, 0, 0, 0)^T, v̄ = (0, 0, 0, 0, 2/3)^T constitute an optimal solution to this LP, and the optimal objective value is z̄ = −2/9. Since z̄ < 0, by (4.19), we can write the cut
(2/3)x1 + (2/3)x2 ≤ 4/3, or x1 + x2 ≤ 2. ⊓⊔
The following theorem shows that each polyhedron P_j built by the lift-and-project procedure is the convex hull of the union of two polyhedra. This means that the lift-and-project cuts are specialized disjunctive cuts.
Theorem 4.6. P_j = P_j* := conv( (P ∩ {x ∈ R^n : x_j = 0}) ∪ (P ∩ {x ∈ R^n : x_j = 1}) ).
Proof. We assume that P ≠ ∅; otherwise, the result is trivial. First we prove the inclusion P_j ⊆ P_j*.
If P ∩ {x ∈ R^n : x_j = 0} = ∅, then P_j* = P ∩ {x ∈ R^n : x_j = 1}. We already know that P_j ⊆ P. Therefore, to prove the inclusion P_j ⊆ P_j*, it suffices to show that the inequality x_j ≥ 1 holds for P_j. Since P ∩ {x ∈ R^n : x_j = 0} = ∅, we have ε := min{x_j : x ∈ P} > 0, and the inequality x_j ≥ ε is valid for P. Since x_j ≥ ε is a linear combination of some inequalities from Ax ≤ b, the inequality (1 − x_j) x_j ≥ (1 − x_j) ε is valid for the nonlinear system constructed in the lift step. Replacing x_j^2 with x_j, we obtain the inequality (1 − x_j) ε ≤ 0, from which it follows that x_j ≥ 1.
The inclusion P_j ⊆ P_j* is proved similarly when P ∩ {x ∈ R^n : x_j = 1} = ∅.
αT = (λ0)T A + γ0 ej and β ≥ (λ0)T b,
αT = (λ1)T A − γ1 ej and β ≥ (λ1)T b − γ1.
(1 − xj)(αT x − γ0 xj − β) ≤ 0,
xj(αT x − γ1(1 − xj) − β) ≤ 0
Xj def= {x ∈ Rn : Ax ≤ b, xi ∈ {0, 1} for i = 1, . . . , j}.
By definition, P̄0 = P = X0 . Let j ≥ 1 and assume that P̄j−1 = conv(X j−1 ). By The-
orem 4.6, we have
In the most general form, the separation problem is formulated as follows. Given
a set X ⊂ Rn and a point x̃ ∈ Rn , we need to prove that x̃ ∈ conv(X), or find a
hyperplane H(a, β ) separating x̃ from X, i.e., such that
aT x ≤ β for all x ∈ X, and aT x̃ > β.
We have already considered some special cases of this general problem. In par-
ticular, studying the Gomory cuts, we solved the problem of separating a vertex x̃ of
a polyhedron P(A, b) from a mixed integer set X = P(A, b; S). Note that the problem
of separating an arbitrary point x̃ (not a vertex of P(A, b)) from the set P(A, b; S) is
NP-hard. It is not surprising that the general separation problem is also NP-hard,
and we can hardly hope to develop an efficient algorithm for solving it. Here we
describe a procedure for solving the general separation problem, which is efficient
in practice for a number of special sets X. Let us also note that in practice we need
very fast separation procedures, since they are repeatedly called by the cutting plane
algorithms. For this reason, in the modern MIP solvers, very often exact separation
procedures are replaced with fast heuristics.
It is known that the separation problem for the set X is polynomially equivalent
to the optimization problem
Here our main interest is not in investigating the complexity aspects of this equiv-
alence. We are going to present two LP based approaches for solving each of the
problems, optimization or separation, provided that there is an efficient procedure
for solving the other problem.
We already know how separation procedures are used for solving MIPs. In this section we use a particular quadratic optimization problem to demonstrate how to apply the cutting plane approach to optimization problems with convex constraints that are represented by separation procedures.
We want to invest some amount of money in some of n assets (stocks, bonds, etc.). Let pi be the relative change in price of asset i during some planning horizon (for example, one year), i.e., pi is the change in the price of this asset during the planning horizon divided by its price at the beginning of the horizon (the return per invested dollar). We assume that p1, . . . , pn are dependent normal random variables, and p = (p1, . . . , pn)T is a random price vector with known mean (mathematical expectation) vector p̄ and covariance matrix Σ.
In this problem, we maximize the average return of the portfolio at a limited risk
of r2 (see Sect. 8.3 for a discussion of risk measures).
To solve (4.21) with the dual simplex method, we need to represent the convex
set
XΣ def= {x ∈ Rn : xT Σ x ≤ r2}
by linear cuts. Since Σ is positive semidefinite, we can write Σ = BT B for some matrix B, and then the constraint xT Σ x ≤ r2 takes the form
yT y ≤ r2, y = Bx.
Example 4.7 Consider the problem of forming a portfolio of four assets. The mean
values and standard deviations of the future random returns of these assets are
presented in the following table
Asset 1 2 3 4
p̄i 1.03 1.06 1.08 1.1
σi 0 0.05 0.1 0.2
Here asset 1 is a risk-free asset with a return of 3%. The correlation coefficients
between risky assets are the following: ρ24 = −0.04, ρ34 = 0.03, and ρ23 = 0.
We need to find an approximately optimal portfolio whose risk is not greater than r2 for r = 0.04.
Therefore,
We start with solving the LP (LP(0) in what follows) obtained from (4.22) after
removing the quadratic inequality. The point x(0) = (0, 0, 0, 1)T is the only optimal solution to this LP. Then y(0) = Bx(0) = (0, −0.008, 0.006, 0.1997)T, and since ‖y(0)‖ = 0.19995 > r = 0.04, we compute the first cut:
(1/r)(y(0))T Bx = −0.01x2 + 0.015x3 + 0.999502x4 ≤ ‖y(0)‖ = 0.19995.
Adding this inequality to LP(0) and reoptimizing the extended LP (denoted as
LP(1)), we get its optimal solution x(1) , which is presented in Table 4.1. Column 2
of this table contains optimal solutions to all the LPs solved by our cutting plane
algorithm: LP(i + 1) is obtained from LP(i) by adding to the latter the inequality (1/r)(y(i))T Bx ≤ ‖y(i)‖, which is presented in Column 3 of Row i, and which separates from XΣ an optimal solution, x(i), to LP(i). Here y(i) def= Bx(i).
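The cut used at each step of this example admits a compact sketch. The function below is illustrative only (its name and calling convention are assumptions, not MIPCL code); given B with Σ = BT B, it returns the coefficients and right-hand side of the separating inequality (1/r)(yT B)x′ ≤ ‖y‖ with y = Bx whenever x violates ‖Bx‖ ≤ r:

```python
import math

def norm_cut(B, x, r):
    """If y = Bx has ||y|| > r, return (coeffs, rhs) of the cut
    (1/r) * y^T B x' <= ||y||, which is valid for every x' with
    ||Bx'|| <= r by the Cauchy-Schwarz inequality and violated at x;
    otherwise return None."""
    m, n = len(B), len(x)
    y = [sum(B[i][j] * x[j] for j in range(n)) for i in range(m)]
    norm_y = math.sqrt(sum(v * v for v in y))
    if norm_y <= r:
        return None  # x already satisfies the quadratic constraint
    coeffs = [sum(y[i] * B[i][j] for i in range(m)) / r for j in range(n)]
    return coeffs, norm_y
```

The validity argument is one line: yT Bx′ ≤ ‖y‖ · ‖Bx′‖ ≤ r‖y‖, and dividing by r gives the stated right-hand side; at x itself the left-hand side equals ‖y‖²/r > ‖y‖, so the cut is violated there.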
Here we show how to separate a given point x̃ ∈ Rn from a given set X ⊆ Rn using
an algorithm for solving (4.20).
First, solving (4.20) with different objective vectors c = c1, . . . , cm, we find a family of vectors {x1, . . . , xm} ⊆ X. We can also start with an empty family by setting m = 0. Then we look for a hyperplane H(a, β) separating our point x̃ from the set {x1, . . . , xm}. To do this we solve the following LP:
x̃T v − α → max,
(xi )T v − α ≤ 0, i = 1, . . . , m,
(4.23)
−1 ≤ vi ≤ 1, i = 1, . . . , n,
−1 ≤ α ≤ 1.
Since the optimal primal and dual objective values are equal, then y∗m+i = 0 for i = 1, . . . , 2n + 2 and, therefore, the vector equality ∑_{i=1}^{m} y∗i xi = x̃ holds, which means that
x̃ ∈ conv({x1 , . . . , xm }) ⊆ conv(X).
It is clear that in practice the separation procedure described above can be used
only for those sets X for which (4.20) is solved easily.
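For a finite set X this alternation between the LP (4.23) and the optimization oracle can be sketched as follows. The sketch is illustrative and deliberately LP-free: instead of solving (4.23) exactly, it only tries normals v from the vertices of the box [−1, 1]^n, so it may fail to find a separating hyperplane that the exact LP would find; any hyperplane it does return, however, is correct, and None is returned only when no tried direction is violated (which is always the outcome when x̃ ∈ conv(X)).

```python
from itertools import product

def dot(u, w):
    return sum(a * b for a, b in zip(u, w))

def separate(x_t, X, tol=1e-9):
    """Try to separate x_t from conv(X) for a finite list of points X.
    The optimization problem (4.20) is solved by brute force over X;
    candidate normals are restricted to v in {-1, 1}^n (a heuristic)."""
    pts = []                               # points x^1, ..., x^m found so far
    while True:
        best = None
        for v in product((-1.0, 1.0), repeat=len(x_t)):
            # for a fixed v, the tightest valid right-hand side over pts
            beta = max((dot(v, p) for p in pts), default=-1.0)
            val = dot(v, x_t) - beta       # violation of v.x <= beta at x_t
            if val > tol and (best is None or val > best[0]):
                best = (val, v, beta)
        if best is None:
            return None                    # no violated direction found
        _, v, beta = best
        x_star = max(X, key=lambda p: dot(v, p))   # the optimization oracle
        if dot(v, x_star) > beta + tol:
            pts.append(x_star)             # hyperplane invalid for X: refine it
        else:
            return v, beta                 # v.x <= beta on X and v.x_t > beta
```

On the data of Example 4.9 below (X2 and x̃ = (3/4, 4/5, 1/2)T) the loop terminates with the hyperplane x1 + x2 + x3 = 2; on the data of Example 4.8 it finds no separating hyperplane, consistent with x̃ ∈ conv(X1).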
Example 4.8 We need to separate the point x̃ = (1, 4/5, 1/5)T from the knapsack set
X1 = {x ∈ {0, 1}3 : 4x1 + 5x2 + 5x3 ≤ 9}.
Solution. First, we set m = 0 and solve the following LP:
v1 + (4/5)v2 + (1/5)v3 − α → max,
−1 ≤ v1, v2, v3 ≤ 1,
−1 ≤ α ≤ 1.
Its solution is v∗ = (1, 1, 1)T, α∗ = −1, and therefore we set a = v∗ = (1, 1, 1)T, β = α∗ = −1. Since aT x̃ = 2 > −1 = β, we need to solve the following 0,1-knapsack problem:
x1 + x2 + x3 → max,
4x1 + 5x2 + 5x3 ≤ 9,
x1 , x2 , x3 ∈ {0, 1}.
x1 + x3 → max,
4x1 + 5x2 + 5x3 ≤ 9,
x1 , x2 , x3 ∈ {0, 1}.
⊓⊔
We will solve one more similar example, but with a different outcome.
Example 4.9 We need to separate the point x̃ = (3/4, 4/5, 1/2)T from the knapsack set
X2 = {x ∈ {0, 1}3 : 4x1 + 5x2 + 2x3 ≤ 9}.
Solution. We set m = 0 and solve the following LP:
(3/4)v1 + (4/5)v2 + (1/2)v3 − α → max,
−1 ≤ v1, v2, v3 ≤ 1,
−1 ≤ α ≤ 1.
x1 + x2 + x3 → max,
4x1 + 5x2 + 2x3 ≤ 9,
x1 , x2 , x3 ∈ {0, 1}.
Its solution is v∗ = (−1, 0, 1)T, α∗ = −1, and therefore we set a = (−1, 0, 1)T, β = −1. Since aT x̃ = −1/4 > −1 = β, we need to solve the following 0,1-knapsack problem:
−x1 + x3 → max,
4x1 + 5x2 + 2x3 ≤ 9,
x1 , x2 , x3 ∈ {0, 1}.
−1 ≤ v1 , v2 , v3 ≤ 1,
−1 ≤ α ≤ 1.
x2 + x3 → max,
4x1 + 5x2 + 2x3 ≤ 9,
x1 , x2 , x3 ∈ {0, 1}.
x1 + x3 → max,
4x1 + 5x2 + 2x3 ≤ 9,
x1 , x2 , x3 ∈ {0, 1}.
(1/2)x1 + (1/2)x2 + (1/2)x3 → max,
4x1 + 5x2 + 2x3 ≤ 9,
x1, x2, x3 ∈ {0, 1}.
The hyperplane
(1/2)x1 + (1/2)x2 + (1/2)x3 = 1
separates x̃ from X2, and the inequality x1 + x2 + x3 ≤ 2 is valid for X2. ⊓⊔
4.8 Notes
The reviews [88], [144] and [39] are a good addition to the material of this and the
next chapters.
Sect. 4.2. Theorem 4.1 was proved by Chvátal [35] for the case when P(A, b) is a
polytope. Schrijver showed in [121] that this result is also valid for arbitrary poly-
hedra P(A, b). Historically, the first cutting plane algorithm for solving MIPs was
developed by Gomory [58, 60]. This algorithm uses the cuts described in Theo-
rem 4.2. In the general case, the separation problem for the Chvátal-Gomory in-
equalities is NP-hard in the strong sense [50]. Efficient separation procedures have
been proposed for totally tight cuts [86] (see Exercise 4.7), as well as for some
special polyhedra, provided that all components of the vector u are 0 or 1/2 [32].
The inequalities (4.4), known as the flower inequalities, were introduced in [48].
Sect. 4.3. Mixed integer rounding was studied in [98, 99]. Many known classes of
strong cuts for a series of structured mixed integer sets can be obtained by applying
mixed integer rounding [89].
Sect. 4.4. The fractional Gomory cuts were introduced in [59]. In the same paper it
was proved that the cutting plane algorithm based on these cuts solves any general
MIP in a finite number of steps, provided that the objective function takes integral
values on all feasible solutions of this MIP.
Sect. 4.5. The disjunctive principle was first implicitly present in the derivation of
the fractional Gomory cuts. The approach we outlined is based on the Balas char-
acterization of the disjunction of polyhedra [11]. The disjunction of polyhedra from
different spaces can be represented more efficiently [12] (see Exercise 4.12). In [85]
it is shown that many classes of facet defining inequalities for some well known
combinatorial optimization problems can be obtained by applying the disjunctive
technique.
Sect. 4.6. The lift-and-project procedure presented here was proposed in [13]. A
more powerful procedure, in which lifting is performed simultaneously for all binary
variables, was studied in [127]. An even more powerful lift-and-project procedure was proposed in [87], but there the separation procedure is reduced to solving a semidefinite programming problem.
Sect. 4.7. The equivalence of the optimization and separation problems was estab-
lished in [63] (see also [64, 122]).
Markowitz received the 1990 Nobel Prize in Economics for his portfolio opti-
mization model [90].
Sect. 4.9. The statements of exercises 4.2, 4.5, 4.6, 4.7, 4.12 are, respectively, taken
from [91], [31], [49], [86], [12].
4.9 Exercises
4.1. Let A be a real m × n-matrix, b ∈ Rm , S ⊆ {1, . . . , n}. Prove that if the mixed-
integer set P(A, b; S) is bounded, then its convex hull, conv(P(A, b; S)), is a polytope.
4.2. Let us consider a very simple generalization of the mixed-integer set X from
Lemma 4.1:
X̄ = {(x, y) ∈ Z × Rm : x − yi ≤ bi , i = 1, . . . , m}.
Prove that conv(X̄) is the solution set of the following system of inequalities:
x − yi ≤ bi, i = 1, . . . , m,
x − yi/(1 − fi) ≤ ⌊bi⌋, i = 1, . . . , m,
yi ≥ 0, i = 1, . . . , m.
Here fi def= bi − ⌊bi⌋ for i = 1, . . . , m.
4.3. Let A be a real m × n-matrix, and b ∈ Rm. Let f(r) def= r − ⌊r⌋ denote the fractional part of r ∈ R. Assume that f(uT b) < 1/2 holds for all u ∈ Rm+, and t is a positive integer such that 1/2 ≤ t f(uT b) < 1. Let ū = (f(t u1), . . . , f(t um)). Prove that, for the set P(A, b) ∩ Zn+, the cut ⌊ūT A⌋x ≤ ⌊ūT b⌋ is not weaker than the cut ⌊uT A⌋x ≤ ⌊uT b⌋.
4.4. A function h : R → R is called superadditive if h(r1 + r2) ≥ h(r1) + h(r2) for all r1, r2 ∈ R.
Let β ∈ (0, 1) and f(r) = r − ⌊r⌋. Prove that the functions ⌊r⌋ and
are superadditive.
4.5. Prove the validity of the following generalization of the rounding principle: if
g is a non-decreasing superadditive function and g(0) = 0, then the inequality
∑_{j=1}^{n} g(αj)xj ≤ g(β)
X = {x ∈ Zn+ : xi + xj ≤ 1, i, j = 1, . . . , n, i ≠ j},
prove that
conv(X) = {x ∈ Rn+ : ∑_{j=1}^{n} xj ≤ 1},
and that the Chvátal rank of the inequality ∑_{j=1}^{n} xj ≤ 1 is O(n log n).
4.9. Prove that the Chvátal rank of the solution set of the following system
tx1 + x2 ≤ 1 + t,
−tx1 + x2 ≤ 1,
x1 ≤ 1,
x1 , x2 ≥ 0
is equal to t − 1 for t = 1, 2, . . . .
4.10. Apply the cutting plane algorithm that generates only Chvátal-Gomory cuts to
solve the following IPs:
a) x1 + x2 → max,
−x1 + x2 ≤ 1,
3x1 + 2x2 ≤ 4,
x1, x2 ∈ Z+;
b) x1 + x2 + x3 → max,
2x1 + 2x2 + x3 ≤ 6,
x1 + x3 ≤ 2,
x2 + x3 ≤ 2,
x1, x2, x3 ∈ Z+.
4.11. Apply the cutting plane algorithm that generates only fractional Gomory cuts
to solve the following MIPs:
0 ≤ xkj ≤ 1, j = 1, . . . , nk, k = 1, 2,
∑_{j∈S1} (a1i1,j / (1 − ∑_{l∈N1\S1} a1i1,l)) x1j + ∑_{j∈S2} (a2i2,j / (1 − ∑_{l∈N2\S2} a2i2,l)) x2j ≥ 1,   (4.24)
S1 ⊆ F1i1, S2 ⊆ F2i2, i1 ∈ M1, i2 ∈ M2.
b) For k ∈ {1, 2}, x ∈ [0, 1]nk and α ∈ [0, 1], let Sk(α, x) def= {j ∈ Nk : xj ≤ α} and, for i ∈ Mk, let
δki(x) def= max{xj : ∑_{l∈Sk(xj,x)} akil xl ≤ xj (1 − ∑_{l∈Nk\Sk(xj,x)} akil)},
αki(x) def= ∑_{j∈Sk(δki(x),x)} akij xj / (1 − ∑_{l∈Nk\Sk(δki(x),x)} akil).
Prove that a given point (x̃1, x̃2) ∈ [0, 1]n1 × [0, 1]n2 satisfies all inequalities in (4.24) if and only if α1i1(x̃1) + α2i2(x̃2) ≥ 1 for all pairs (i1, i2) ∈ M1 × M2. If for some (i1, i2) ∈ M1 × M2, α1i1(x̃1) + α2i2(x̃2) < 1, then for S1 = S1(δ1i1(x̃1), x̃1) and S2 = S2(δ2i2(x̃2), x̃2) the corresponding inequality in (4.24) is violated at (x̃1, x̃2).
4.13. Elaborate separation procedures for the following sets:
a) (solution set of a convex quadratic constraint)
X1 = {x ∈ Rn : xT Qx + 2cT x ≤ d},
5 Cuts for Structured Mixed-Integer Sets

Many optimization problems include various specific "local" structures. For example, very often, MIPs contain inequalities involving only binary variables, or, sometimes, a part of the problem constraints formulates a network flow problem. For such
local structures, we can get stronger inequalities using specific features of these
structures. It should be noted that there exist a great deal of such special structures.
In this chapter we consider a number of structural cuts for those mixed-integer sets
that are most often encountered in practical problems, and the separation procedures
for which have already been included in many modern MIP solvers.
Constraints involving binary variables are so common in practice that they deserve
special attention. Consider a knapsack set of 0, 1-vectors
K = {x ∈ {0, 1}n : ∑_{j=1}^{n} aj xj ≤ b}   (5.1)
when all coefficients, b and aj, are positive. A set C ⊆ N def= {1, . . . , n} is called a knapsack cover (or simply cover) if its excess λ(C) def= ∑_{j∈C} aj − b is positive. Each knapsack cover C defines the cover inequality:
∑_{j∈C} xj ≤ |C| − 1,   (5.2)
which is valid for K, and which implies that not all variables x j for j ∈ C can simul-
taneously be equal to 1.
A knapsack cover C is called minimal if a j ≥ λ (C) for all j ∈ C. If C is not a
minimal knapsack cover, then the cover inequality written for C is redundant, since
it is the sum of the inequalities
For a given point x̃ ∈ [0, 1]n, we need to find an inequality violated at this point if such an inequality exists. Let us rewrite (5.2) in the following form:
∑_{j∈C} (1 − xj) ≥ 1.
Hence, it is clear that in order to solve the separation problem, we need to answer the following question: is there a subset C ⊆ N such that ∑_{j∈C} aj > b and ∑_{j∈C} (1 − x̃j) < 1? This question is answered by solving the 0,1-knapsack problem
min{∑_{j=1}^{n} (1 − x̃j) zj : ∑_{j=1}^{n} aj zj > b, z ∈ {0, 1}n}:   (5.3)
a violated cover inequality exists if and only if the optimal objective value of (5.3) is less than 1, and then the cover is C = {j : z∗j = 1} for an optimal solution z∗. Since separation procedures are called by the cutting plane algorithms very often, in practice, (5.3) does not need to be solved to optimality; instead, we can solve it approximately, but very quickly. Of course, solving (5.3) approximately, we risk not finding a cover inequality violated at x̃, even if such an inequality exists.
Next we describe a simple efficient algorithm that solves (5.3) approximately.
1. List the ratios (1 − x̃j)/aj in non-decreasing order:
(1 − x̃π(1))/aπ(1) ≤ (1 − x̃π(2))/aπ(2) ≤ . . . ≤ (1 − x̃π(n))/aπ(n).
2. Find a minimal index k such that ∑_{i=1}^{k} aπ(i) > b.
3. Return C = {π(1), . . . , π(k)}.
The above algorithm is also known as the LP-heuristic because, when all coefficients aj are integers, it constructs a solution that can be obtained by rounding up the components of an optimal solution to the following LP:
min{∑_{j=1}^{n} (1 − x̃j) zj : ∑_{j=1}^{n} aj zj ≥ b + 1, z ∈ [0, 1]n}.
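A minimal sketch of steps 1–3 in code (illustrative; the final violation check, ∑_{j∈C}(1 − x̃j) < 1, is an addition here, so that the function only reports covers whose inequality actually cuts off x̃):

```python
def cover_heuristic(a, b, x_t, tol=1e-9):
    """Greedy LP-heuristic for cover separation: sort items by
    (1 - x_t[j]) / a[j] and take them until the total weight exceeds b."""
    order = sorted(range(len(a)), key=lambda j: (1 - x_t[j]) / a[j])
    C, weight = [], 0
    for j in order:
        C.append(j)
        weight += a[j]
        if weight > b:           # C is now a cover
            break
    else:
        return None              # even N is not a cover
    # the cover inequality sum_{j in C} x_j <= |C| - 1 is violated at x_t
    # iff sum_{j in C} (1 - x_t[j]) < 1
    if sum(1 - x_t[j] for j in C) < 1 - tol:
        return C
    return None                  # no violated cover inequality was found
```

For example, for a = (8, 7, 6, 6, 5), b = 14 and x̃ = (0, 0, 3/4, 3/4, 1) (the data of Exercise 5.2 a), here in 0-based indexing) the heuristic returns the cover C = {2, 3, 4}, whose cover inequality x3 + x4 + x5 ≤ 2 is violated at x̃.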
is valid for Xδ.
(δ = 0, lifting up) If X1 = ∅, then xn = 0 for all x ∈ X. If X1 ≠ ∅, then the inequality
∑_{j=1}^{n−1} aj xj + αn xn ≤ b   (5.5)
(δ = 1, lifting down) If X0 = ∅, then xn = 1 for all x ∈ X. If X0 ≠ ∅, then the inequality
∑_{j=1}^{n−1} aj xj + γn xn ≤ b + γn   (5.6)
The lifting technique described in Theorem 5.2 can be applied successively: the
inequality obtained as a result of lifting one variable can be lifted further by another
variable. Next, we apply this technique to strengthen the cover inequalities.
Let C ⊆ N = {1, . . . , n} be a knapsack cover for the set K defined by (5.1). We want
to strengthen the cover inequality ∑ j∈C x j ≤ |C| − 1, which is valid for K. We do this
using the lifting technique described in Theorem 5.2.
∑_{j∈C1} xj + ∑_{j∈N\C1} αj xj ≤ |C1| − 1 + ∑_{j∈C2} αj   (5.8)
that is valid for K. For j ∈ N \C1 , the coefficients α j depend on the order in which
the variables x j are lifted. Let j1 , . . . , jk be some ordering of the elements of the
set N \ C1 , γ = |C1 | − 1, β = b − ∑ j∈C2 a j , and χ C2 ∈ {0, 1}N be the characteristic
vector of C2 (χiC2 = 1 if i ∈ C2 , and χiC2 = 0 if i ∈ N \C2 ). The coefficients α ji can
be calculated in series as follows.
Suppose that we have already obtained the inequality
∑_{j∈C1} xj + ∑_{i=1}^{r−1} αji xji ≤ γ
We calculate the coefficient αjr for the next variable xjr to guarantee that the resulting inequality is valid for the set Kr. By virtue of Theorem 5.2, it is necessary to solve the following 0,1-knapsack problem:
∑_{j∈C1} xj + ∑_{i=1}^{r−1} αji xji → max,
∑_{j∈C1} aj xj + ∑_{i=1}^{r−1} aji xji ≤ β + (−1)^(1−χjrC2) ajr.   (5.9)
Denoting by ξr the optimal objective value of (5.9), we set:
αjr = γ − ξr, if jr ∈ N \ C;
αjr = ξr − γ, γ := ξr, β := β + ajr, if jr ∈ C2.
If (5.9) is solved using the recurrence formula (1.31), the above lifting procedure
runs in polynomial time. Let us also note again that different orderings of the set
N \C1 result in different lifted cover inequalities.
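For the special case C2 = ∅ (only liftings up, so γ and β stay fixed at |C1| − 1 and b), the procedure can be sketched as follows; `knapsack_max` is the standard dynamic program behind a recurrence such as (1.31), and all names here are assumptions:

```python
def knapsack_max(profits, weights, cap):
    """Max-profit 0/1 knapsack by dynamic programming over capacities."""
    if cap < 0:
        return float("-inf")     # the subproblem is infeasible
    best = [0] * (cap + 1)
    for p, w in zip(profits, weights):
        for c in range(cap, w - 1, -1):
            best[c] = max(best[c], best[c - w] + p)
    return best[cap]

def lift_up(a, b, coef, rhs, order):
    """Lift the inequality sum_j coef[j] * x_j <= rhs (valid when
    x_j = 0 for all j in `order`) by the variables in `order`."""
    coef = dict(coef)
    for jr in order:
        idx = list(coef)
        xi = knapsack_max([coef[j] for j in idx],
                          [a[j] for j in idx], b - a[jr])
        coef[jr] = rhs - xi      # the largest coefficient keeping validity
    return coef
```

On the data of Example 5.4 (a = (3, 4, 6, 7, 9, 18), b = 21, starting inequality x1 + x2 + x3 + x4 ≤ 4, lifting order x5, x6) this computes α5 = 2, in agreement with the computation in the text, and then α6 = 3, i.e., the lifted inequality x1 + x2 + x3 + x4 + 2x5 + 3x6 ≤ 4.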
separates x̃ from K 1 .
Solution. First, let us write down (5.3) applied to our knapsack set K 1 :
(1/3)z1 + z2 + (1/4)z3 + 0z4 + 0z5 + z6 → min,
5z1 + 6z2 + 4z3 + 6z4 + 3z5 + 8z6 > 16,
z1, z2, z3, z4, z5, z6 ∈ {0, 1}.
x2 + x3 + x4 + x5 → max,
6x2 + 4x3 + 6x4 + 3x5 ≤ 16 − 5 = 11,
x2 , x3 , x4 , x5 ∈ {0, 1}.
x1 + x2 + x3 + x4 + x5 → max,
5x1 + 6x2 + 4x3 + 6x4 + 3x5 ≤ 16 − 8 = 9,
x1 , x2 , x3 , x4 , x5 ∈ {0, 1}.
x1 + x2 + x3 + x4 + x5 + x6 ≤ 3
Example 5.2 We need to find a lifted cover inequality that separates the point x̃ = (0, 0.3, 0.5, 0.5, 0.7, 1)T from the knapsack set
K2 = {x ∈ {0, 1}6 : 13x1 + 7x2 + 6x3 + 5x4 + 3x5 + 10x6 ≤ 22}.
Solution. Let us first try to find a maximally violated cover inequality by solving the following 0,1-knapsack problem:
z1 + 0.7z2 + 0.5z3 + 0.5z4 + 0.3z5 + 0z6 → min,
13z1 + 7z2 + 6z3 + 5z4 + 3z5 + 10z6 > 22,
z1, . . . , z6 ∈ {0, 1}.
Its optimal solution is z∗ = (1, 0, 0, 0, 0, 1), and the optimal objective value is ξ∗ = 1. Therefore, by Theorem 5.1, the point x̃ satisfies all cover inequalities valid for K2. In particular, x̃ satisfies the cover inequality x1 + x6 ≤ 1 written for the cover C = {1, 6} determined by z∗. Moreover, one can easily verify that, regardless of the choice of
the partition (C1 ,C2 ) and regardless of the ordering of the set N \C1 , all coefficients
α j calculated by the lifting procedure are zeros. As a result, we have the inequality
x1 + x6 ≤ 1, which is not violated at x̃.
Nevertheless, there is still a lifted cover inequality that separates x̃ from K 2 . We
will find such an inequality later in the continuation of this example. ⊓⊔
Example 5.2 shows that if we are going to solve the separation problem for the
class of lifted cover inequalities, the choice of the most violated cover inequality
as the starting one is not always justified. This example motivates the use of the
following heuristic procedure for finding an initial knapsack cover C.
For δ ∈ {0, 1}, we define Nδ def= {j ∈ N : x̃j = δ}, and then compute b̄ = b − ∑_{j∈N1} aj. Next, sorting the values x̃j for j ∈ N2 = N \ (N0 ∪ N1) in non-increasing order,
x̃j1 ≥ x̃j2 ≥ · · · ≥ x̃jk,
we find an index r such that
∑_{i=1}^{r−1} aji ≤ b̄ and ∑_{i=1}^{r} aji > b̄.
Example 5.3 (continuation of Example 5.2) We need to apply the above proce-
dure for separating the point x̃ from the set K 2 .
Solution. First, we write the partition of N = {1, 2, 3, 4, 5, 6}: N 0 = {1}, N 1 = {6}
and N 2 = N \ (N 0 ∪ N 1 ) = {2, 3, 4, 5}. Next we sort the components x̃ j for j ∈ N 2 :
x̃5 = 0.7 > x̃3 = x̃4 = 0.5 > x̃2 = 0.3.
Since b̄ = b−a6 = 22−10 = 12, a5 +a3 = 3+6 < 12 and a5 +a3 +a4 = 3+6+5 >
12, then r = 3 and C1 = {3, 4, 5}, C2 = N 1 = {6}, C = C1 ∪C2 = {3, 4, 5, 6}. Listing
the elements of N \C1 in the order of (2, 6, 1), we will lift the inequality
x3 + x4 + x5 ≤ 2.
Note that this inequality is not valid for K 2 . It is valid only for the set {x ∈ K 2 : x6 =
1}. Initially, we set β = b − a6 = 22 − 10 = 12, γ = 2.
To compute α2 , we solve the following 0,1-knapsack problem:
x3 + x4 + x5 → max,
6x3 + 5x4 + 3x5 ≤ 12 − 7 = 5,
x3 , x4 , x5 ∈ {0, 1}.
ξ2 = x2 + x3 + x4 + x5 → max,
7x2 + 6x3 + 5x4 + 3x5 ≤ 12 + 10 = 22,
x2 , x3 , x4 , x5 ∈ {0, 1}.
ξ3 = x2 + x3 + x4 + x5 + 2x6 → max,
7x2 + 6x3 + 5x4 + 3x5 + 10x6 ≤ 22 − 13 = 9,
x2 , x3 , x4 , x5 , x6 ∈ {0, 1}.
2x1 + x2 + x3 + x4 + x5 + 2x6 ≤ 4,
The lifting technique presented in Theorem 5.2 can be used to strengthen any in-
equalities involving only binary variables. Let us again consider the knapsack set K
defined by (5.1). If a set S ⊂ N is not a knapsack cover (it is called a feasible set),
then ∑ j∈S a j ≤ b, and, for any vector w ∈ ZS++ , the inequality
∑ w jx j ≤ ∑ w j
j∈S j∈S
is valid for K. If we lift this trivial inequality, we can sometimes get a much stronger inequality. Let us demonstrate this with an example.
Example 5.4 Consider the knapsack set
K3 = {x ∈ {0, 1}6 : 3x1 + 4x2 + 6x3 + 7x4 + 9x5 + 18x6 ≤ 21}.
x1 + x2 + x3 + x4 ≤ 4,
which, clearly, is valid for K 3 . We need to strengthen this inequality lifting the vari-
ables in the following order: j1 = 5, j2 = 6.
Solution. We have γ = 4 and β = 21. To compute α5 , we solve the following
0,1-knapsack problem:
x1 + x2 + x3 + x4 → max,
3x1 + 4x2 + 6x3 + 7x4 ≤ 21 − 9 = 12,
x1 , x2 , x3 , x4 ∈ {0, 1}.
x1 + x2 + x3 + x4 + 2x5 → max,
3x1 + 4x2 + 6x3 + 7x4 + 9x5 ≤ 21 − 18 = 3,
x1 , x2 , x3 , x4 , x5 ∈ {0, 1}.
and, if (5.10) is valid for X̃ BK+1 , then it is valid also for X BK+1 .
After the substitution of x̄j for 1 − xj, the inequality
∑_{j∈C} aj xj − y ≤ b
transforms into
∑_{j∈C} (−aj/ak) x̄j − (1/ak) y ≤ −λ/ak,   (5.11)
to which we apply Theorem 4.3. Taking into account that ak > λ and ak ≥ aj for all j ∈ C, we calculate
f = −λ/ak − (−1) = (ak − λ)/ak,
fj = −aj/ak − (−1) = (ak − aj)/ak,
(fj − f)/(1 − f) = (λ − aj)/λ = 1 − aj/λ.
Applying (4.7) to (5.11), we obtain the inequality
−∑_{j∈C} min{1, aj/λ} x̄j − (1/λ) y ≤ −1,
which, after multiplication by λ and the inverse substitution of 1 − xj for x̄j, transforms into (5.10). ⊓⊔
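Written out explicitly, the cut obtained above reads ∑_{j∈C} min{aj, λ} xj − y ≤ ∑_{j∈C} min{aj, λ} − λ (multiply the last displayed inequality by λ and substitute 1 − xj back for x̄j). A small sketch with illustrative names:

```python
def mixed_cover_cut(a, b, C):
    """Cut for the mixed knapsack set
    {(x, y) in {0,1}^n x R_+ : sum_j a_j x_j - y <= b} and a cover C:
        sum_{j in C} min(a_j, lam) x_j - y <= sum_{j in C} min(a_j, lam) - lam,
    where lam = sum_{j in C} a_j - b must be positive."""
    lam = sum(a[j] for j in C) - b
    assert lam > 0, "C must be a cover"
    coef = {j: min(a[j], lam) for j in C}
    return coef, sum(coef.values()) - lam
```

With a = (5, 1, 3, 2, 1, 8), b = 9 and the cover C = {2, 4, 5, 6} of Example 5.5 (in 0-based indexing: {1, 3, 4, 5}) we get λ = 3 and the cut x2 + 2x4 + x5 + 3x6 − s ≤ 4, exactly the inequality derived in that example.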
Example 5.5 We need to separate the point (x̃, ỹ) = (0, 1, 0, 1, 1, 3/4; 1/2, 0, 0) from the
Solution. We should not be confused by the fact that there are more than one real
variable here. Since y3 ≥ 0, the set X is contained in the set
X 0 = {(x, y) ∈ {0, 1}6 × R2+ : 5x1 + x2 + 3x3 + 2x4 + x5 + 8x6 − 2y1 − y2 ≤ 9},
and any inequality valid for X 0 will be also valid for X. Further, after the substitution
s for 2y1 + y2 , the set X 0 transforms into the set
Let us start with the cover C = {2, 4, 5, 6} that is produced by the LP heuristic applied to x̃ and the knapsack set obtained from X̄ by dropping the variable s. Since λ = 1 + 2 + 1 + 8 − 9 = 3, by Theorem 5.3, the inequality
x2 + 2x4 + x5 + 3x6 − s ≤ 1 + 2 + 1 + 3 − 3 = 4
Let Cλ def= {j ∈ C : aj > λ} and let r = |Cλ| > 0. List the elements of Cλ in order of their weights, aπ1 ≥ aπ2 ≥ . . . ≥ aπr > λ, and then set Aj = ∑_{i=1}^{j} aπi for j = 1, . . . , r. It is easy to see that
φC(u) = λ(j − 1) if Aj−1 ≤ u ≤ Aj − λ,
φC(u) = λ(j − 1) + u − (Aj − λ) if Aj − λ ≤ u ≤ Aj,
φC(u) = λ(r − 1) + u − (Ar − λ) if u ≥ Ar − λ,
The following theorem establishes the role of superadditivity in sequence-independent lifting, i.e., when the order of lifting the variables does not affect the lifting coefficients.
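A direct transcription of the piecewise definition of φC given above (an illustrative sketch; the names and the handling of the cover data are assumptions):

```python
def phi(u, a, C, lam):
    """Lifting function phi_C: the elements of C with a_j > lam are
    listed in non-increasing weight order, and A[j] are the partial
    sums A[0] = 0, A[j] = a_{pi_1} + ... + a_{pi_j}."""
    w = sorted((a[j] for j in C if a[j] > lam), reverse=True)
    r = len(w)
    A = [0.0]
    for v in w:
        A.append(A[-1] + v)
    if u >= A[r] - lam:
        return lam * (r - 1) + u - (A[r] - lam)
    for j in range(1, r + 1):
        if A[j - 1] <= u <= A[j] - lam:
            return lam * (j - 1)
        if A[j] - lam <= u <= A[j]:
            return lam * (j - 1) + u - (A[j] - lam)
```

For instance, with weights (5, 4) in Cλ and λ = 2 the function is 0 on [0, 3], rises to 2 on [3, 5], stays at 2 on [5, 7] and then rises with slope 1 — the staircase shape behind the superadditivity of φC.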
Proof. It is necessary to prove that any point (x∗, y∗) ∈ X BK+1 satisfies (5.12). Let Q = {j ∉ C : x∗j = 1}. As φC is superadditive, we have
xe ≤ ue ye, e ∈ E(v)},
Xv− = {(x, y) ∈ RE(v)+ × {0, 1}E(v) : ∑_{e∈E(v,V)} xe − ∑_{e∈E(V,v)} xe ≤ −dv,
xe ≤ ue ye, e ∈ E(v)}.
Here E(v) def= E(v,V) ∪ E(V, v).
A pair (C1, C2), where C1 ⊆ N1 and C2 ⊆ N2, is called a generalized cover for the set X if
∑_{j∈C1} uj − ∑_{j∈C2} uj = b + λ(C1, C2),
∑_{j∈C1} xj − ∑_{j∈N2\(C2∪L2)} xj + ∑_{j∈C1} max{0, uj − λ} · (1 − yj) − λ ∑_{j∈L2} yj ≤ b + ∑_{j∈C2} uj   (5.13)
is valid for X.
Proof. We have to prove that any point (x, y) ∈ X satisfies (5.13). Let C1+ = {j ∈ C1 : uj > λ} and T = {j ∈ N1 ∪ N2 : yj = 1}. Consider two cases.
1. |C1+ \ T| + |L2 ∩ T| = 0. Then
∑_{j∈C1} xj + ∑_{j∈C1} max{0, uj − λ} · (1 − yj)
= ∑_{j∈C1∩T} xj + ∑_{j∈C1+\T} (uj − λ)
= ∑_{j∈C1∩T} xj (since C1+ \ T = ∅)
≤ ∑_{j∈N1} xj (since xj ≥ 0)
≤ b + ∑_{j∈C2} xj + ∑_{j∈L2∩T} xj + ∑_{j∈N2\(C2∪L2)} xj
≤ b + ∑_{j∈C2} uj + λ ∑_{j∈L2} yj + ∑_{j∈N2\(C2∪L2)} xj.
2. |C1+ \ T| + |L2 ∩ T| ≥ 1. Then
∑_{j∈C1} xj + ∑_{j∈C1} max{0, uj − λ} · (1 − yj)
= ∑_{j∈C1∩T} xj + ∑_{j∈C1+\T} (uj − λ)
≤ ∑_{j∈C1} uj − |C1+ \ T| · λ (since xj ≤ uj)
≤ b + ∑_{j∈C2} uj + λ ∑_{j∈L2} yj
≤ b + ∑_{j∈C2} uj + λ ∑_{j∈L2} yj + ∑_{j∈N2\(C2∪L2)} xj. ⊓⊔
Inequality (5.13) cuts off from conv(X) a number of vertices of the relaxation
polyhedron
If we assume that L2 = N2 \ C2, xj = uj yj and uj > λ for all j ∈ C1, then (5.13) takes the form
∑_{j∈C1} uj yj + ∑_{j∈C1} (uj − λ)(1 − yj) ≤ b + ∑_{j∈C2} uj + λ ∑_{j∈N2\C2} yj,
or, equivalently,
∑_{j∈C1} uj − λ ∑_{j∈C1} (1 − yj) ≤ b + ∑_{j∈C2} uj + λ ∑_{j∈N2\C2} yj.
Substituting λ for ∑_{j∈C1} uj − b − ∑_{j∈C2} uj and then dividing the result by λ, we have
∑_{j∈C1} (1 − yj) + ∑_{j∈N2\C2} yj ≥ 1
or
∑_{j∈C1} (1 − yj) − ∑_{j∈C2} yj ≥ 1 − ∑_{j∈N2} yj.
The last inequality is nothing else than a cover inequality for the knapsack set
{y ∈ {0, 1}n : ∑_{j∈N1} uj yj − ∑_{j∈N2} uj yj ≤ b},
written for the cover C = C1 ∪ C2 subject to the condition
Based on the above reasoning, we can write down the following heuristic that
separates a given point (x̃, ỹ) from the set X by a flow cover inequality.
1. Solve the following 0,1-knapsack problem:
∑_{j∈N1} (1 − ỹj) zj − ∑_{j∈N2} ỹj zj → min,
∑_{j∈N1} uj zj − ∑_{j∈N2} uj zj > b,
zj ∈ {0, 1}, j ∈ N1 ∪ N2.
x∗ = (3, 0, 2, 1, 0, 0),
y∗ = (y∗2 , y∗3 , y∗4 , y∗5 , y∗6 ) = (0, 2/3, 1, 0, 0)
and the hyperplane given by the equation y1 = 1. In addition, the point (x∗ , y∗ ) is
mapped into the point (x0 , y0 ) with
Next, we try to separate (x0 , y0 ) from X̄. Therefore, we solve the following 0,1-
knapsack problem:
z2 + (1/3)z3 − z4 → min,
3z1 + 3z2 + 6z3 − 3z4 − 4z5 − z6 > 4,   (5.14)
z1, z2, z3, z4, z5, z6 ∈ {0, 1}.
We should not be confused by the fact that this problem is not quite similar to the standard 0,1-knapsack problem (1.28). But after the change of variables z̄j = 1 − zj for j = 1, 2, 3, and z̄j = zj for j = 4, 5, 6,
(5.14) is rewritten as
1 − z̄2 + (1/3)(1 − z̄3) − z̄4 → min,
3(1 − z̄1) + 3(1 − z̄2) + 6(1 − z̄3) − 3z̄4 − 4z̄5 − z̄6 ≥ 5,
z̄1, z̄2, z̄3, z̄4, z̄5, z̄6 ∈ {0, 1},
Let C ⊆ N be the set of vertices of some clique in G(A) (any two vertices from C
are connected by an edge in G(A)). Then the clique inequality
∑_{j∈C} xj ≤ 1
is valid for the packing polytope. Note that in this way we can obtain new inequali-
ties not present in the original system. For example, for the system
x1 + x2 ≤ 1, x2 + x3 ≤ 1, x3 + x1 ≤ 1,
the set {1, 2, 3} is a clique in G(A), and the clique inequality x1 + x2 + x3 ≤ 1 does not belong to the system.
The separation problem for the class of clique inequalities is formulated as
follows: given a point x̃ ∈ [0, 1]n , find in G(A) a clique C∗ of maximum weight
w = ∑ j∈C∗ x̃ j . If w > 1, the clique inequality ∑ j∈C∗ x j ≤ 1 is violated at x̃; otherwise,
x̃ satisfies all the clique inequalities.
It is known that this separation problem for the clique inequalities is NP-hard.
Surprisingly, there is a wider class of inequalities that includes all clique inequal-
ities and for which there is a polynomial separation algorithm. Unfortunately, this
polynomial algorithm is too time consuming to be used in practice. Therefore, in
practice, the separation problem for the clique inequalities is solved with the help of
several heuristic procedures. One of them was specially developed for the case when
x̃ is a solution (or only a part of solution) to the relaxation LP of a MIP containing a
system of binary inequalities, Ax ≤ e, among its constraints.
Suppose that one of the inequalities from the system Ax ≤ e is fulfilled as an equality; let it be inequality i0, and let C = {j : ai0,j = 1, x̃j > 0}. First, we sort the components x̃j for j ∈ N \ C in non-increasing order; let j1, . . . , jk be the resulting order. Then, for i = 1, . . . , k, if C ∪ {ji} is a clique in G(A), we add ji to C. If we can thus add at least one index ji with x̃ji > 0, then as a result we get a clique inequality ∑_{j∈C} xj ≤ 1 that is violated at x̃.
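A sketch of this greedy procedure (illustrative: the adjacency structure adj of G(A), the tolerance, and all names are assumptions):

```python
def clique_separation(adj, x_t, row, tol=1e-9):
    """Greedy clique extension: start from the positive-value support of
    a tight row (a clique in G(A)), then try to add the remaining
    vertices in non-increasing order of x_t[j]; adj[j] is the set of
    neighbours of j in G(A)."""
    C = {j for j in row if x_t[j] > tol}
    rest = sorted((j for j in range(len(x_t)) if j not in C),
                  key=lambda j: -x_t[j])
    for j in rest:
        if all(j in adj[k] for k in C):   # C + {j} is still a clique
            C.add(j)
    if sum(x_t[j] for j in C) > 1 + tol:
        return C                          # clique inequality violated at x_t
    return None
```

For the triangle system x1 + x2 ≤ 1, x2 + x3 ≤ 1, x3 + x1 ≤ 1 and x̃ = (0.5, 0.5, 0.3), starting from the tight first row gives C = {1, 2}; the vertex 3 is then added, producing the violated clique inequality x1 + x2 + x3 ≤ 1.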
∑_{j∈C} xj ≤ (|C| − 1)/2,   (5.15)
which is valid for the packing polytope. In fact, (5.15) is a Chvátal-Gomory cut
since we can derive it by adding together the inequalities
x j1 + x jk ≤ 1 and x ji + x ji+1 ≤ 1, i = 1, . . . , k − 1,
∑_{j1∈C} xj + ∑_{j0∈C} (1 − xj) ≤ 1
is valid for X. For example, for the set X of the solutions to the following system
x1 + 2x2 − x3 ≤ 1,
3x1 + x2 − 2x3 ≤ 2,
x1 , x2 , x3 ∈ {0, 1} ,
the conflict graph, GX , has three edges (11 , 21 ), (21 , 30 ) and (11 , 30 ). Therefore, the
set of vertices, C = {11 , 21 , 30 }, is a clique in GX , and the inequality
x1 + x2 + (1 − x3 ) ≤ 1 or x1 + x2 − x3 ≤ 0
is valid for X.
Similarly, if C is the vertex set of an odd cycle in GX , then the inequality
∑_{j1∈C} xj + ∑_{j0∈C} (1 − xj) ≤ (|C| − 1)/2
is valid for X.
5.6 Notes
5.7 Exercises
5.1. Write down an inequality that cuts off just one given point a ∈ {0, 1}n from the
0,1-cube {0, 1}n .
5.2. Find a lifted cover inequality that separates a given point x̃ from a given set X:
a) x̃ = (0, 0, 3/4, 3/4, 1)T, X = {x ∈ {0, 1}5 : 8x1 + 7x2 + 6x3 + 6x4 + 5x5 ≤ 14};
b) x̃ = (0, 6/8, 3/4, 3/4, 0)T, X = {x ∈ {0, 1}5 : 12x1 + 8x2 + 6x3 + 6x4 + 7x5 ≤ 15};
c) x̃ = (0, 0, 1/2, 1/6, 1)T, X = {x ∈ {0, 1}5 : 10x1 − 9x2 + 8x3 + 6x4 − 3x5 ≤ 2}.
5.3. For given set X and point (x̃, ỹ), find a flow cover inequality that separates (x̃, ỹ)
from X:
Here, in the representations of (x̃, ỹ), the x̃ and ỹ parts are separated by semicolons.
5.4. Consider the set
X = {(x, y) ∈ {0, 1}n × Rn+ : ∑_{j=1}^{n} yj ≤ b, yj ≤ aj xj for j = 1, . . . , n}.
Let C ⊆ {1, . . . , n} and λ def= ∑_{j∈C} aj − b > 0. Using Theorem 5.3, prove that the inequality
∑_{j∈C} (yj + max{0, aj − λ}(1 − xj)) ≤ b
is valid for X.
5.5. Consider the solution set X of the following system:
s + ∑_{j∈N1} xj − ∑_{j∈N2} xj ≥ b,
0 ≤ xj ≤ yj, j = 1, . . . , n,   (5.16)
s ≥ 0, y ∈ Zn+,
where (N1 , N2 ) is a partition of the set N = {1, . . . , n}. Let f = b − bbc. Prove the
following statements:
a) conv(X) is described by the inequalities that determine the relaxation polyhedron
for (5.16), and the inequalities
s + f ∑_{j∈L1} yj + ∑_{j∈R1} xj ≥ f ⌈b⌉ + ∑_{j∈L2} (xj − (1 − f) yj),   (5.17)
5.6. Let us consider (1.7), which is the system of inequalities that describes the truth
sets of the CNF given by (1.6). Suppose that q ∈ Sk1 ∩ Sl0 . Prove that the following
inequality
∑_{j∈(Sk1∪Sl1)\{q}} xj + ∑_{j∈(Sk0∪Sl0)\{q}} (1 − xj) ≥ 1
Prove that this polytope coincides with the set of solutions to the following sys-
tem of linear inequalities:
0 ≤ x j ≤ 1, j = 1, . . . , n.
6 Branch-And-Cut

Currently the main method of solving MIPs is the branch-and-cut method, since it is used in all (in all!) known competitive modern MIP solvers. Briefly, the branch-and-cut method is a combination of the branch-and-bound and cutting-plane algorithms. In this chapter we present a general schema of the branch-and-cut method,
and also discuss its most important components. In the last section of this chapter
we demonstrate an application of this method for solving MIPs with exponentially
many inequalities. Specifically, we consider a branch-and-cut algorithm for the traveling salesman problem. This algorithm is considered one of the most impressive and successful applications of the branch-and-cut method.
6.1 Branch-And-Bound
We will consider MIP (1.1) with two sided constraints. The basic structure of the
branch-and-bound method is the search tree. The root (or root node) of the search
tree corresponds to the original MIP. The search tree grows through a process called
branching that creates two or more descendants for one of the leaves of the current
search tree. Each of the MIPs in the child nodes is obtained from the parent MIP
by adding one or more new constraints that are usually upper or lower bounds for
integer variables. It should also be noted that in the process of branching, we should
not lose feasible solutions: the union of the feasible domains of the child MIPs must
be the feasible domain for their parent MIP.
But if the search tree only grew (via branching), then even for relatively small MIPs its size could become huge. Therefore, one of the main ideas of the branch-and-bound method is to prevent uncontrolled growth of the search tree. This is achieved by cutting off the "hopeless" branches of the search tree. We evaluate the prospects of the nodes of the search tree by comparing their upper bounds with the current lower bound. In the LP-based branch-and-bound method, the upper bound at any node k is the optimal objective value, γ(k), of the relaxation LP at this node. The lower bound (or record), R, is the largest value of the objective
function attained on the already found feasible solutions of the original MIP. The
best of these solutions is called a record solution. If γ(k) ≤ R, then node k and all its
descendants are cut off from the search tree.
branch-and-bound(c, b1, b2, A, d1, d2, S; xR, R)
{
    compute x0 ∈ arg max {cT x : b1 ≤ Ax ≤ b2, d1 ≤ x ≤ d2};
    if (x0_S ∈ Z^S) {   // change the record and record solution
        R = cT x0; xR = x0; return;
    }
    initialize the list of active nodes with the single node (x0, d1, d2);
    while (the list of active nodes is not empty) {
        select a node, N = (x0, d1, d2), from the list of active nodes;
        if (cT x0 ≤ R)   // the node cannot improve the record
            continue;
        select a fractional component x0_i, i ∈ S;
        // left child: impose the upper bound x_i ≤ ⌊x0_i⌋
        compute x1 ∈ arg max {cT x : b1 ≤ Ax ≤ b2, d1 ≤ x ≤ d2(i, ⌊x0_i⌋)};
        if (cT x1 > R) {
            if (x1_S ∈ Z^S) {   // change the record and record solution
                R = cT x1; xR = x1;
            } else
                add the node (x1, d1, d2(i, ⌊x0_i⌋)) to the list of active nodes;
        }
        // right child: impose the lower bound x_i ≥ ⌈x0_i⌉
        compute x2 ∈ arg max {cT x : b1 ≤ Ax ≤ b2, d1(i, ⌈x0_i⌉) ≤ x ≤ d2};
        if (cT x2 > R) {
            if (x2_S ∈ Z^S) {   // change the record and record solution
                R = cT x2; xR = x2;
            } else
                add the node (x2, d1(i, ⌈x0_i⌉), d2) to the list of active nodes;
        }
    }
}
On input, the procedure receives xR (an initial feasible solution) and R = cT xR (the initial record). There are MIPs for which it is difficult to find an initial feasible solution; in such cases R is set to −∞. If R = −∞ when the branch-and-bound procedure terminates, then the MIP being solved has no feasible solutions; otherwise, xR is an optimal solution to this MIP. In the description of the method we use the following notation:

d(i, α)_j = d_j if j ≠ i, and d(i, α)_j = α if j = i,

i.e., d(i, α) is the vector d with its i-th component replaced by α.
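Listing 6.1 can be turned into running code. The sketch below is a self-contained Python version of LP-based branch-and-bound, specialized to the 0/1 knapsack problem so that the relaxation LP can be solved by the greedy ratio rule (cf. Exercises 3.4 and 6.5) instead of the dual simplex method; everything else (record, pruning by bound, branching on a fractional variable) follows the listing. All names are my own choices.

```python
import math

def knapsack_bb(c, a, b):
    """Branch-and-bound for max c.x s.t. a.x <= b, x binary (a sketch)."""
    n = len(c)
    order = sorted(range(n), key=lambda j: c[j] / a[j], reverse=True)

    def lp_bound(fixed):
        """Greedy optimum of the knapsack LP relaxation under fixings."""
        cap = b - sum(a[j] for j, v in fixed.items() if v == 1)
        if cap < 0:                         # fixings exceed the capacity
            return -math.inf, None
        val = float(sum(c[j] for j, v in fixed.items() if v == 1))
        x = {j: 0.0 for j in range(n)}
        x.update(fixed)
        for j in order:
            if j in fixed:
                continue
            if a[j] <= cap:
                cap -= a[j]; val += c[j]; x[j] = 1.0
            else:                           # at most one fractional item
                x[j] = cap / a[j]; val += c[j] * x[j]
                break
        return val, x

    record, record_x = -math.inf, None      # R = -inf: no solution known
    active = [dict()]                       # the list of active nodes
    while active:
        fixed = active.pop()
        gamma, x = lp_bound(fixed)
        if x is None or gamma <= record:    # prune: infeasible or bound <= R
            continue
        frac = [j for j in order if 0 < x[j] < 1]
        if not frac:                        # integral: update the record
            record, record_x = gamma, x
            continue
        j = frac[0]                         # branch on a fractional variable
        active.append({**fixed, j: 0})
        active.append({**fixed, j: 1})
    return record, record_x

# The data of Exercise 6.5: max 16x1 + 6x2 + 14x3 + 19x4,
# subject to 6x1 + 3x2 + 7x3 + 9x4 <= 13, x binary.
R, xR = knapsack_bb([16, 6, 14, 19], [6, 3, 7, 9], 13)
print(R, [round(xR[j]) for j in range(4)])   # 30.0 [1, 0, 1, 0]
```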
Example 6.1 We need to solve the following IP by the branch-and-bound method:

x1 + 2x2 → max,
1: −2x1 + 3x2 ≤ 4,
2: 2x1 + 2x2 ≤ 11,
3: 1 ≤ x1 ≤ 4,
4: 1 ≤ x2 ≤ 5,
x1, x2 ∈ Z.
Solution. The search tree is shown in Fig. 6.1. The nodes of this tree are num-
bered from 0 (the root node corresponding to the original MIP) to 5. Each node is
represented as a rectangle that, for the relaxation LP at this node, stores the feasible
intervals of both variables, as well as an optimal solution and the optimal objective
value (upper bound). The relaxation LPs are solved by the dual simplex method
starting with an optimal basis for the parent node relaxation LP.
[Fig. 6.1. The search tree:
node 0: 1 ≤ x1 ≤ 4, 1 ≤ x2 ≤ 5; x(0) = (5/2, 3)T, γ(0) = 8;
node 1: 1 ≤ x1 ≤ 2, 1 ≤ x2 ≤ 5; x(1) = (2, 8/3)T, γ(1) = 7;
node 2: 3 ≤ x1 ≤ 4, 1 ≤ x2 ≤ 5; x(2) = (3, 5/2)T, γ(2) = 8;
node 3: 3 ≤ x1 ≤ 4, 1 ≤ x2 ≤ 2; x(3) = (7/2, 2)T, γ(3) = 7;
node 4: 3 ≤ x1 ≤ 4, 3 ≤ x2 ≤ 5; the relaxation LP has no solutions;
node 5: 3 ≤ x1 ≤ 3, 1 ≤ x2 ≤ 2; x(5) = (3, 2)T, γ(5) = 7.]
Initially, we set R = −∞. In this example, the objective function takes only integer values at all feasible solutions; therefore, we take as the upper bound γ(k) not the value cT x(k) itself but its integer part ⌊cT x(k)⌋. Here x(k) denotes an optimal solution to the relaxation LP at node k. The iterations performed by the algorithm follow below. They are numbered by pairs i.j, where i is the node index, and j is the iteration index of the dual simplex method.
0.0. I = (3, 4), B⁻¹ = [[1, 0], [0, 1]], x = (4, 5)T, π = (1, 2)T.
0.1. s = 1, u = (−2, 3)T, λ = 2/3, t = 2, I = (3, 1), B⁻¹ = [[1, 0], [2/3, 1/3]], x = (4, 4)T, π = (7/3, 2/3)T.
0.2. s = 2, u = (10/3, 2/3)T, λ = 7/10, t = 1, I = (2, 1), B⁻¹ = [[3/10, −1/5], [1/5, 1/5]], x = x(0) = (5/2, 3)T, π = (7/10, 1/5)T, γ(0) = 8.
Since the solution x(0) to the relaxation LP is not integral, we form the root
(node 0) of the search tree, and then we perform branching on the variable x1 taking
a fractional value.
1.1. s = 3, u = (3/10, −1/5)T, λ = 7/3, t = 1, I = (3, 1), B⁻¹ = [[1, 0], [2/3, 1/3]], x(1) = (2, 8/3)T, π = (7/3, 2/3)T, γ(1) = 7.
Since the solution x(1) is not integer, and γ(1) = 7 > −∞ = R, we add node 1 to
the search tree.
2.1. s = −3, u = (−3/10, 1/5)T, λ = 1, t = 2, I = (2, −3), B⁻¹ = [[0, −1], [1/2, 1]], x(2) = (3, 5/2)T, π = (1, 1)T, γ(2) = 8.
Since the solution x(2) at node 2 is also not integer and γ(2) = 8 > −∞ = R, we
add this node to the search tree.
Among the active nodes, which are the tree leaves, node 2 has the maximum
upper bound. So, we choose this node to branch on the variable x2 .
3.1. s = 4, u = (1/2, 1)T, λ = 1, t = 2, I = (2, 4), B⁻¹ = [[1/2, −1], [0, 1]], x(3) = (7/2, 2)T, π = (1/2, 1)T, γ(3) = 7.
Since the solution x(3) at node 3 is not integer, and γ(3) = 7 > −∞ = R, we add
this node to the search tree.
4.1. s = −4, u = (−1/2, −1)T. Since all components of the vector u are non-positive, the relaxation LP at this node has no feasible solutions. In this case, we do not need to add node 4 to the search tree; nevertheless, it is shown in Fig. 6.1 to make the tree more informative.
Of the two active nodes, 1 and 3, with the maximal upper bound 7, we select node 3, which is farther from the root, and perform branching on the variable x1.
5.1. s = 3, u = (1/2, −1)T, λ = 1, t = 1, I = (3, 4), B⁻¹ = [[1, 0], [0, 1]], x(5) = (3, 2)T, π = (1, 2)T, γ(5) = 7.
Since the solution x(5) is integer and γ(5) = 7 > −∞ = R, we change the record and the record solution: R = 7, xR = (3, 2)T. In Fig. 6.1, node 5 is also shown for informative purposes. As the upper bounds of node 1 and of the right branch of node 3 (which we have not created yet) do not exceed the current record, they are cut off.
Since there are no more unprocessed nodes in the search tree (the list of active nodes is empty), the current record solution xR = (3, 2)T is optimal. ⊓⊔
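For an IP this small, the outcome of the branch-and-bound run is easy to confirm by brute-force enumeration over the integer box:

```python
from itertools import product

# Exhaustive check of Example 6.1: max x1 + 2*x2 subject to
# -2*x1 + 3*x2 <= 4, 2*x1 + 2*x2 <= 11, 1 <= x1 <= 4, 1 <= x2 <= 5, x integer.
feasible = [(x1, x2)
            for x1, x2 in product(range(1, 5), range(1, 6))
            if -2 * x1 + 3 * x2 <= 4 and 2 * x1 + 2 * x2 <= 11]
best = max(feasible, key=lambda p: p[0] + 2 * p[1])
print(best, best[0] + 2 * best[1])   # (3, 2) 7
```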
6.2 Branch-And-Cut
[Fig. 6.2. Flow chart of the branch-and-cut method. START → preprocessing → select an active node → solve the node LP. If the LP has a solution, cuts are generated, and while new cuts are found the LP is re-solved; then, if γ > R: when the LP solution is integer feasible, the record solution is changed, otherwise a node heuristic is run (possibly changing the record solution), branching is performed, and the new nodes are added to the list of active nodes. If the LP has no solution, or γ ≤ R, the node is discarded and the next active node is selected; the method stops when the list of active nodes is empty.]
A cut generated at some node is valid for that node and all its descendants; for other nodes it may not be valid. Therefore, such an inequality cannot be present in the active constraint matrix permanently, and the cut pool is the best place to store it.
Another way to reduce the duality gap at a node is to use node heuristics to increase the lower bound (record). The idea of node heuristics is simple. Each time a tree node is processed, we can try to somehow "round off" a solution to its relaxation LP. Usually the rounding consists in performing some type of "diving": the values of a group of integer variables with "almost integer" values are fixed, the resulting LP is reoptimized, then another group of variables is fixed, and so on, until an integer solution is obtained or it is proved that fixing the variables has made the constraint system inconsistent. When we are lucky enough to get a new feasible solution better than the record one, the lower bound increases, allowing us to eliminate some active nodes of the search tree and thereby speed up the solution process.
Example 6.2 We need to solve the following IP:
Solution. Let us agree to generate only cover inequalities, which are always global and, therefore, valid for all nodes of the search tree shown in Fig. 6.3.
[Fig. 6.3. The search tree: node 0 with x(5) = (1/2, 1/2, 1/2, 1/2)T and γ(0) = 7/2; branching on x1 creates node 1 (x1 = 0) with x(6) = (0, 0, 1, 1)T and γ(1) = 3, and node 2 (x1 = 1) with x(7) = (1, 0, 0, 1)T and γ(2) = 3.]
0. First, we solve the relaxation LP for (6.1). Its optimal solution is the point x(1) = (0, 1, 0, 3/5)T, which violates the inequality
x2 + x4 ≤ 1
written for the knapsack cover C11 = {2, 4} of the first knapsack constraint. Adding this inequality to the constraint system and reoptimizing, we get the solution x(2) = (1/3, 1, 5/9, 0)T, which violates the inequality
x1 + x2 ≤ 1
induced by the cover C21 = {1, 2} of the first knapsack inequality. Adding this inequality and reoptimizing, we get the third solution x(3) = (0, 1, 5/6, 0)T, which violates the inequality
x2 + x3 ≤ 1
written for the cover C12 = {2, 3} of the second knapsack constraint. Again adding this inequality and reoptimizing, we find the solution x(4) = (5/9, 4/9, 5/9, 5/9)T, which violates the inequality
x1 + x3 ≤ 1
induced by the cover C22 = {1, 3} of the second knapsack constraint. Adding this inequality and reoptimizing, we obtain the solution x(5) = (1/2, 1/2, 1/2, 1/2)T, which satisfies all cover inequalities for both knapsack sets of our IP. So, we turn to branching on the variable x1.
1. Now we solve the relaxation LP for node 1 (x1 = 0). Its optimal solution,
x(6) = (0, 0, 1, 1)T , is integer and, therefore, it is declared as the first record solution,
xR = (0, 0, 1, 1)T , and the record is set to R = cT xR = 3.
2. An optimal solution x(7) = (1, 0, 0, 1)T to the relaxation LP at node 2 (x1 = 1)
is also integer. The objective value of this solution is also equal to 3. Therefore, both points, x(6) and x(7), are optimal solutions to (6.1). ⊓⊔
6.3 Branching
Most Fractional Branching
This is one of the simplest and most frequently used rules: a variable whose fractional part is closest to 0.5 is chosen. This corresponds to calculating the estimate min(f_i, 1 − f_i), where f_i is the fractional part of the value of x_i in the LP optimum, and branching on a variable that maximizes this estimate. In other words, this rule chooses for branching the variable that is "hardest" to round off. Unfortunately, numerical experiments have shown that in practice this rule is no better than the rule that chooses the branching variable randomly.
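A minimal sketch of this rule (the function name and the tolerance eps are my own choices; the estimate min(f_i, 1 − f_i) is maximized over the fractional integer variables):

```python
def most_fractional(x_lp, int_vars, eps=1e-6):
    """Pick i in int_vars maximizing min(f_i, 1 - f_i), f_i = frac(x_i);
    returns None when every integer variable is (almost) integral."""
    scores = {}
    for i in int_vars:
        f = x_lp[i] - int(x_lp[i])        # fractional part (for x_lp >= 0)
        if min(f, 1 - f) > eps:
            scores[i] = min(f, 1 - f)
    return max(scores, key=scores.get) if scores else None

print(most_fractional({0: 2.5, 1: 3.0, 2: 1.9}, [0, 1, 2]))   # 0
```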
Pseudocost Branching
This complex rule remembers all successful branchings over all variables. The values

ζ_i^−(q) = (z(q) − z_i^−(q)) / (x̄_i^q − ⌊x̄_i^q⌋),
ζ_i^+(q) = (z(q) − z_i^+(q)) / (⌈x̄_i^q⌉ − x̄_i^q)
define the average (per unit of change of the variable) decrement of the objective
function for the left and right descendants of node q, respectively. By this definition,
the value of ζi− (q) or ζi+ (q) is infinity if the corresponding relaxation LP does not
have a solution. In such a case, we can set either of these values to be equal to the
integrality gap at node q if the latter is finite, or to some predefined penalty. Further,
let ξi− (q) (resp., ξi+ (q)) denote the sum of ζi− (q0 ) (resp., ζi+ (q0 )) for all nodes q0
processed before the processing of node q starts, and for which the branching was
performed on the variable xi . Let ηi− (resp., ηi+ ) be the number of such nodes. For
any variable xi , the left and right pseudocosts are determined by the formulas
Ψ_i^−(q) = ξ_i^−(q)/η_i^−(q),   Ψ_i^+(q) = ξ_i^+(q)/η_i^+(q).
Defining the branching estimate of x_i as a combination of the pseudocosts Ψ_i^−(q) and Ψ_i^+(q) (e.g., their weighted sum or product), we branch on a variable with the largest estimate.
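The pseudocost bookkeeping can be sketched as follows; the class name and the numbers in the usage lines are illustrative, while ξ and η are accumulated exactly as defined above:

```python
import math

class Pseudocosts:
    """Bookkeeping for pseudocost branching (a sketch; names are mine)."""
    def __init__(self, n):
        self.xi = {s: [0.0] * n for s in "-+"}   # accumulated zeta values
        self.eta = {s: [0] * n for s in "-+"}    # number of branchings

    def update(self, i, xbar_i, z, z_minus, z_plus):
        """Record one branching on x_i at a node with LP value z and
        child LP values z_minus (left) and z_plus (right)."""
        f = xbar_i - math.floor(xbar_i)          # fractional part of x_i
        self.xi["-"][i] += (z - z_minus) / f         # zeta_i^-(q)
        self.xi["+"][i] += (z - z_plus) / (1.0 - f)  # zeta_i^+(q)
        for s in "-+":
            self.eta[s][i] += 1

    def psi(self, i, sign):
        """Pseudocost Psi_i = xi_i / eta_i (None before any branching)."""
        return self.xi[sign][i] / self.eta[sign][i] if self.eta[sign][i] else None

pc = Pseudocosts(3)
pc.update(0, xbar_i=2.5, z=8.0, z_minus=7.0, z_plus=8.0)  # illustrative data
print(pc.psi(0, "-"), pc.psi(0, "+"))   # 2.0 0.0
```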
Strong Branching
In its pure form, strong branching assumes the calculation of exact estimates for all integer variables taking fractional values. Since the calculation of all the estimates takes too much time, the strong branching procedure can be modified as follows. First, a relatively small subset C of integer variables with fractional values (for example, 10% of all candidates with the largest pseudocosts) is chosen. The next simplification is that, when estimating the decrements z(q) − z_i^−(q) and z(q) − z_i^+(q), only some fixed number of iterations of the dual simplex method is performed.
This simplification is motivated by the fact that, for the dual simplex method, the av-
erage per iteration decrease of the objective value usually decreases with the number
of iterations performed. However, this observation is not valid for many problems
with built-in combinatorial structures, since, as a rule, the relaxation LPs of such
problems are strongly degenerate.
6.3.1 Priorities
Let us recall the representation (1.2) of a discrete variable (see Sect. 1.1.1). Suppose that we have a MIP with such a constraint. If, in an optimal solution to the relaxation LP, not all values λ̃_i of the variables λ_i are integer, then, using any standard branching rule, we choose a fractional component λ̃_i∗ to divide the feasible domain of the λ variables,

K = { λ ∈ {0,1}^k : ∑_{i=1}^k λ_i = 1 },

into the sets

K̄0 = {λ ∈ K : λ_i = 0, i = 1,…,r},
K̄1 = {λ ∈ K : λ_i = 0, i = r+1,…,k},

where

r = arg min_{1≤j<k} | ∑_{i=1}^j λ̃_i − ∑_{i=j+1}^k λ̃_i |.
and the requirement that no more than two components λ_i take non-zero values and that, if there are two such components, they must be consecutive. Let

K1 = {λ ∈ K : λ_i = 0, i = 1,…,r},
K2 = {λ ∈ K : λ_i = 0, i = r,…,k},
Let us recall that global cuts are valid for all nodes of the search tree, while local cuts are valid only for a particular node and all its descendants. Therefore, as a rule, global cuts are more useful in practice. Recall also that we declare a cut to be local if other local inequalities were used in its derivation. Most often, such inequalities are the lower and upper bounds on the values of integer variables. Changing a bound (lower or upper) of a binary variable means fixing its value. This simple observation makes it possible to generate a global fractional Gomory cut whenever no local inequalities, other than the bounds of binary variables, were used in the derivation of the base inequality (to which mixed-integer rounding is applied). Let us demonstrate this with a simple example.
Example 6.3 Let us imagine that we are solving the following IP. Suppose we have solved the root relaxation LP, and now we need to process the child node obtained from the root node after fixing x1 to 1.
Solution. First, let us write down the relaxation LP for this node:
Its optimal basic feasible solution, basic set and inverse basic matrix are the following:

x̄ = (1, 0, 1/2)T, I = (1, −2, −3), B⁻¹ = [[0, −1, 0], [0, 0, −1], [1/2, 3/2, 1]].
As x3 is the only variable taking a fractional value, we will build the fractional Gomory cut starting with the equation

x3 + (1/2)s1 + (3/2)s2 + s3 = 1/2.   (6.3)

Next, we compute the coefficients

f0 = 1/2, f0/(1 − f0) = 1, f1 = 1/2, f2 = 1/2, f3 = 0

and write down the cut

(1/2)s1 + (1/2)s2 ≥ 1/2,

or

s1 + s2 ≥ 1.
Substituting 4 − 3x1 − 2x2 − 2x3 for s1, and −1 + x1 for s2, after simplification and rearranging we get the cut in the initial variables:
x1 + x2 + x3 ≤ 1.
This inequality, as it should, cuts off the point x̄, but it is not global, since it also cuts off the point (0, 1, 1)T, which is a feasible solution to (6.2). This happened because, when deriving this cut, we used the local bound x1 ≥ 1.
Now let us build a global cut. Since we have the equation x1 = 1, we can include in the basic set either of the two inequalities: x1 ≥ 1, which is local, or x1 ≤ 1, which is global. This time we include in the basic set the global inequality x1 ≤ 1. So, we have the basic set Ī = (1, 2, −3) and the inverse basic matrix
B̄⁻¹ = [[0, 1, 0], [0, 0, −1], [1/2, −3/2, 1]].

Repeating the derivation of the Gomory cut with this basis, we obtain the cut

2x1 + x2 + x3 ≤ 2.

In the derivation of this cut, we did not use local inequalities, and therefore this cut is global. ⊓⊔
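The coefficient computation in such a derivation is mechanical; below is a sketch for pure-integer rows, replayed on the row (6.3) of Example 6.3 (the function name is my own):

```python
import math

def gomory_fractional_cut(row, rhs):
    """Fractional Gomory cut sum_j frac(a_j) s_j >= frac(rhs) derived
    from a simplex row sum_j a_j s_j = rhs in which all s_j are integer."""
    f0 = rhs - math.floor(rhs)
    return [a - math.floor(a) for a in row], f0

# The row (6.3): x3 + (1/2)s1 + (3/2)s2 + s3 = 1/2 (s-coefficients only)
coeffs, f0 = gomory_fractional_cut([0.5, 1.5, 1.0], 0.5)
print(coeffs, f0)   # [0.5, 0.5, 0.0] 0.5, i.e. (1/2)s1 + (1/2)s2 >= 1/2
```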
6.5 Preprocessing
We have not yet discussed one block of the flow chart of the branch-and-cut method shown in Fig. 6.2: preprocessing, or automatic reformulation. All modern commercial MIP solvers begin solving any MIP with an attempt to simplify it (narrow the feasible intervals of variables or even fix their values, classify variables and constraints, strengthen inequalities, scale the constraint matrix, etc.). The following simple statements give an idea of what actions are performed in the preprocessing step.
⌈l_i⌉ ≤ x_i ≤ ⌊u_i⌋.

∑_{j: a_j>0} a_j u_j + ∑_{j: a_j<0} a_j l_j ≤ b.

∑_{j: a_j>0} a_j l_j + ∑_{j: a_j<0} a_j u_j > b.

max{cT x : Ax ≤ b, l ≤ x ≤ u}.
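The two activity tests sketched by these formulas (a constraint is redundant when even its maximal activity satisfies it, and the system is infeasible when even the minimal activity violates it) are easy to implement; the function names and test data below are my own:

```python
def activity_bounds(a, l, u):
    """Minimal and maximal value of a.x over the box l <= x <= u."""
    lo = sum(aj * (lj if aj > 0 else uj) for aj, lj, uj in zip(a, l, u))
    hi = sum(aj * (uj if aj > 0 else lj) for aj, lj, uj in zip(a, l, u))
    return lo, hi

def classify(a, b, l, u):
    """Classify the constraint a.x <= b using only the variable bounds."""
    lo, hi = activity_bounds(a, l, u)
    if hi <= b:
        return "redundant"    # even the maximal activity satisfies it
    if lo > b:
        return "infeasible"   # even the minimal activity violates it
    return "active"

print(classify([1, -2], 10, [0, 0], [3, 4]))   # redundant
print(classify([2, 1], 1, [1, 1], [5, 5]))     # infeasible
```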
Example 6.4 We need to apply Propositions 6.1 and 6.2 to strengthen the following
formulation:
3x1 + 2x2 − x3 → max,
1 : 4x1 − 3x2 + 2x3 ≤ 13,
2 : 7x1 + 3x2 − 4x3 ≥ 8,
3: x1 + 2x2 − x3 ≥ 4,
x1 ∈ Z+ ,
x2 ∈ {0, 1},
x3 ≥ 1.
Solution. In view of Proposition 6.2, we fix the value of x3 to 1. Further, from Ineq. 3 we have
x1 ≥ 4 − 2x2 + x3 = 4 − 2·1 + 1 = 3.
So, we have fixed the values of all variables and found the solution of this example, x* = (3, 1, 1), already at the preprocessing stage. Of course, this is an exception rather than the rule. ⊓⊔
in which all the variables x_j are binary and the coefficients a_j and β are positive, is equivalent to the inequality

∑_{j=1}^n min{a_j, β} x_j ≥ β.

and let

U = ∑_{j: a_j>0} a_j u_j + ∑_{j: a_j<0} a_j l_j.
and, consequently,

ᾱ · 1 + ∑_{j=1}^n a_j x_j ≤ β.

Now consider the case when α > 0. Let us rewrite (6.5) in the form

−α(1 − y) + ∑_{j=1}^n a_j x_j ≤ β − α.   (6.9)

From the already proved part of this proposition for the case α < 0, it follows that, for

−α̂ = max{−α, β − α − U},

the inequality

−α̂(1 − y) + ∑_{j=1}^n a_j x_j ≤ β − α
Example 6.5 We need to reduce the coefficients of the binary variables in the in-
equality
92x1 − 5x2 + 72x3 − 10x4 + 2x5 ≤ 100 (6.10)
under the following conditions:
x1 , x2 , x3 ∈ {0, 1}, 0 ≤ x4 ≤ 5, 0 ≤ x5 ≤ 2.
Solution. Let us apply Proposition 6.3 sequentially for the variables x1 , x2 and x3 .
Since α = a1 = 92 > 0, U = −5 · 0 + 72 − 10 · 0 + 2 · 2 = 76, then
After solving the relaxation LP at some node of the search tree, we have additional information that can be used to strengthen the formulation at this node.
Proposition 6.4. Let γ be the optimal objective value of the relaxation LP at the node being processed, and let R be the record value at that time. Let x*_j and c̄_j ≠ 0 be the value and the reduced cost of an integer variable x_j, l_j ≤ x_j ≤ u_j. Further, let δ = ⌊(γ − R)/c̄_j⌋. Then, for any optimal solution of the relaxation LP, the following holds:
if x*_j = l_j, then x_j ≤ l_j + δ, and if x*_j = u_j, then x_j ≥ u_j − δ.
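A sketch of the resulting reduced-cost bound tightening (I divide by |c̄_j| so that δ is non-negative under either sign convention for reduced costs; the function name and the numbers in the usage line are illustrative):

```python
import math

def tighten(gamma, R, c_bar, xj, lj, uj):
    """Tighten the bounds of integer x_j sitting at a bound in the node
    LP optimum; gamma is the node LP value, R the record (gamma >= R)."""
    delta = math.floor((gamma - R) / abs(c_bar))
    if xj == lj:
        return lj, min(uj, lj + delta)    # x_j <= l_j + delta
    if xj == uj:
        return max(lj, uj - delta), uj    # x_j >= u_j - delta
    return lj, uj                         # x_j is basic: nothing to do

print(tighten(gamma=8.5, R=7.0, c_bar=-1.0, xj=0, lj=0, uj=10))  # (0, 1)
```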
There exist methods that allow us to replace (aggregate) a system of linear inequalities with a single linear inequality so that the integer solutions of the system and of this single inequality are the same. But in practice it is better to disaggregate inequalities; usually this strengthens existing formulations. In Example 1.1 we strengthened an IP formulation of a binary set by replacing an inequality with a family of inequalities that imply the original inequality. The next proposition presents a simple but rather general disaggregation technique.
∑_{j∈B} f_j x_j + ∑_{i∈I} a_i y_i ≤ b,   (6.11)

where all variables x_j (j ∈ B) are binary, all variables y_i (i ∈ I) are integer (or binary), all coefficients f_j (j ∈ B) are positive with ∑_{j∈B} f_j ≤ 1, and b and all coefficients a_i (i ∈ I) are integers. Then the inequalities

x_j + ∑_{i∈I} a_i y_i ≤ b,   j ∈ B,   (6.12)

The inequalities

f_j x_j + ∑_{i∈I} a_i y_i ≤ b,   j ∈ B,

are valid for X, since all f_j and x_j are non-negative. As the x_j are binary and b − ∑_{i∈I} a_i y_i takes only integer values, all inequalities from (6.12) are also valid for X.
Multiplying the j-th inequality in (6.12) by f_j / ∑_{k∈B} f_k, and then summing all |B| resulting inequalities, we get the inequality

(1 / ∑_{j∈B} f_j) ∑_{j∈B} f_j x_j + ∑_{i∈I} a_i y_i ≤ b,

which, since ∑_{j∈B} f_j ≤ 1, implies (6.11). ⊓⊔
6.5.2 Probing
When solving a MIP with a substantial share of binary variables, we can also apply the technique of probing the values of binary variables. At each iteration of the probing procedure, the value of one binary variable, x_i, is fixed to α ∈ {0, 1}, the basic preprocessing techniques are applied, and then we explore the consequences.
1. If infeasibility is detected, then, for any feasible solution, the variable x_i cannot take the value α and, therefore, its value must be set to 1 − α. For demonstration, consider a simple example:
Setting x1 = 1, from the second inequality we obtain that x2 = 0, and from the first one we derive the upper bound x3 ≤ −2, which contradicts the lower bound x3 ≥ 1. Therefore, we can set x1 = 0.
2. If one of the bounds of a constraint l ≤ aT x ≤ u is changed, then we can strengthen this constraint. Let l̄, ū be the new bounds established as a result of preprocessing. Then the following inequalities hold.
In the example
x1 − x2 ≤ 0, x1 − x3 ≤ 0, x2 + x3 ≥ 1,
fixing x1 = 1 raises the lower bound of the last constraint to l̄ = 2, which gives the strengthened inequality
x2 + x3 ≥ 1 + (2 − 1)x1, or −x1 + x2 + x3 ≥ 1.
We can also consider the bounds on variables as inequalities. Consider the system
Fixing x1 to 1, from the first inequality we get the upper bounds x2 ≤ 1 and x3 ≤ 2.
This allows us to derive the following inequalities:
2x1 + x2 ≤ 3 and 3x1 + x3 ≤ 5.
Despite the apparent simplicity, the probing procedure is a very powerful tool
for enhancing the formulations of MIPs with binary variables. Besides, the probing
techniques subsume some of the preprocessing techniques that we discussed earlier.
Disaggregation of Inequalities
Consider the inequality
x1 + x2 + · · · + xn ≤ k · y,
where all variables are binary and 1 ≤ k ≤ n. Fixing y to 0, we get the new upper bounds x_i ≤ 0, i = 1,…,n. Consequently, the following inequalities are valid:
x_i ≤ y, i = 1,…,n.
Probing binary variables, we can achieve much more than Proposition 6.3 allows.
Example 6.6 We need to strengthen the first inequality in the following system
Solution. First we note that we cannot strengthen the first inequality in the way
specified in Proposition 6.3. Therefore, let us proceed to probing the variables.
Setting x1 = 1, from the second inequality we conclude that x2 = 0, and from
the first inequality it follows that x3 = 0. Then the maximum value of the left-hand
side of the first inequality is u = 4 + 2 = 6 < 9. Therefore, if x1 = 1, this inequality
becomes redundant, and it can be strengthened as follows:
7x1 + 5x2 + 7x3 + 2x4 ≤ 9.   (*)
Setting x2 = 1 and arguing in a similar way as above, we can further strengthen (*) to the inequality
7x1 + 7x2 + 7x3 + 2x4 ≤ 9. ⊓⊔
6.6 Traveling Salesman Problem
Often, a formulation of a MIP contains too many inequalities, and they cannot all be stored in computer memory. In such cases, some of the inequalities are excluded from the formulation, the truncated problem is solved by the branch-and-cut method, and the excluded inequalities are treated as cuts. But such cuts, which constitute a part of the MIP formulation, must be represented by an exact separation procedure; otherwise, we could get an infeasible solution to our MIP. Let us demonstrate what has been said with the famous example of the minimum Hamiltonian cycle problem.
Given an undirected graph G = (V, E), each edge e ∈ E of which is assigned a cost c_e, we need to find a Hamiltonian cycle (a simple cycle that covers all vertices) with the minimum total cost of edges. We note that the minimum Hamiltonian cycle
problem on complete graphs is also called the traveling salesman problem (TSP)
because of the following interpretation. There are n cities and the distance ci j is
known between any pair of cities i and j. A traveling salesman wants to find the
shortest ring route, which visits each of the n cities exactly once. As a subproblem,
the TSP appears in practical applications in the following context. A multifunctional
device, processing a unit of some product, performs over it n operations in any order.
The readjustment time of the device after performing operation i for operation j is
ti j . It is necessary to find the order of performing operations for which the total time
spent on readjustments is minimum.
Introducing binary variables xe , e ∈ E, with xe = 1 if edge e is included in the
Hamiltonian cycle, and xe = 0 otherwise, we can formulate the minimum Hamilto-
nian cycle problem as follows:
∑_{e∈E} c_e x_e → min,   (6.13a)
∑_{e∈E(v,V)} x_e = 2,   v ∈ V,   (6.13b)
∑_{e∈E(S,V\S)} x_e ≥ 2,   ∅ ≠ S ⊂ V,   (6.13c)
Here we use the notation E(S, T ) for the subset of edges from E with one end vertex
in S, and the other in T .
Equations (6.13b) require that each vertex be incident to exactly two selected edges. The subtour elimination inequalities (6.13c) are needed to exclude "short cycles" (see the solution of Example 6.7 below). System (6.13c) contains too many inequalities: even for relatively small graphs (say, with 50 vertices), we cannot store any conceivable description of all subtour elimination inequalities in the memory of a modern computer. But we can treat the subtour elimination inequalities as cuts and add them to the active node LPs as needed. To do this, we only need to solve the following separation problem:
given a point x̃ ∈ [0, 1]E that satisfies (6.13b), it is needed to verify whether all
inequalities in (6.13c) are valid, and if there exist violated inequalities, find
one (or a few) of them.
This separation problem can be formulated as the minimum cut problem in which we need to find a proper subset S̃ of the vertex set V (∅ ≠ S̃ ⊂ V) such that the value q = ∑_{e∈E(S̃,V\S̃)} x̃_e is minimum. If q < 2, then the inequality ∑_{e∈E(S̃,V\S̃)} x_e ≥ 2 is violated at x̃; otherwise x̃ satisfies all inequalities from (6.13c).
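For small graphs the separation problem can even be solved by brute force over all proper subsets S. The sketch below does this for an illustrative fractional point consisting of two disjoint triangles on six vertices and recovers a violated subtour elimination inequality:

```python
from itertools import combinations

# Illustrative point: two disjoint triangles 1-2-3 and 4-5-6 with
# x_e = 1 on the triangle edges and x_e = 0 on the connecting edges.
V = [1, 2, 3, 4, 5, 6]
x = {frozenset(e): 1.0 for e in [(1, 2), (2, 3), (3, 1), (4, 5), (5, 6), (6, 4)]}
x.update({frozenset(e): 0.0 for e in [(1, 4), (2, 5), (3, 6)]})

def cut_value(S):
    """q = sum of x_e over the edges of the cut (S, V \\ S)."""
    return sum(v for e, v in x.items() if len(e & S) == 1)

best = min((set(S) for k in range(1, len(V)) for S in combinations(V, k)),
           key=cut_value)
print(best, cut_value(best))   # {1, 2, 3} 0.0 -> subtour inequality violated
```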
There are efficient deterministic and probabilistic algorithms for finding a minimum cut; we can use one of them to solve the separation problem for the subtour elimination inequalities. In addition, we can find several minimum cuts (violated inequalities) at once by constructing the Gomory-Hu tree. We say that a cut (S, V \ S) separates two vertices s and t if exactly one of these vertices belongs to S; such a cut is also called an s,t-cut. The Gomory-Hu tree, TGH = (V, Ẽ), is defined on the vertex set V of the graph G, but the edges e ∈ Ẽ need not be edges of G. Each edge e ∈ Ẽ is assigned a number f_e. For two given vertices s, t ∈ V, we can find a minimum s,t-cut as follows. First, on the unique path connecting s and t in TGH, we find an edge e with the minimum value f_e. Removing this edge from the tree TGH, we get two subtrees. Let S and V \ S be the vertex sets of these subtrees; then the partition (S, V \ S) is a minimum s,t-cut. In spite of the fact that an n-vertex graph has n(n − 1)/2 different pairs s, t, we can build the Gomory-Hu tree by a procedure that solves only n − 1 minimum s,t-cut problems.
Example 6.7 Consider an example of the minimum Hamiltonian cycle problem de-
fined on the graph depicted in Fig. 6.4, where the numbers near the edges are their
costs. We need to solve this example by the branch-and-cut method, when a) α = 4,
β = 10; b) α = β = 0.
[Fig. 6.4. A graph on the six vertices 1,…,6; the numbers near the edges are their costs, two of which are the parameters α and β.]
Here the variable xi, j corresponds to the variable xe for the edge e = (i, j).
Case a). Use your favorite LP solver to verify that, for α = 4 and β = 10, an
optimal solution to (6.14) is the point x(1) with the coordinates:
x(1)_{1,2} = x(1)_{2,3} = x(1)_{3,1} = x(1)_{4,5} = x(1)_{5,6} = x(1)_{6,4} = 1,
x(1)_{1,4} = x(1)_{2,5} = x(1)_{3,6} = 0.
1→2→3→1 and 4 → 5 → 6 → 4.
The point x(1) violates the subtour elimination inequality for S = {1, 2, 3}:
1 → 3 → 2 → 5 → 6 → 4 → 1,
For such a small example, it is not difficult to verify (even without solving the sep-
aration problem) that the point x(3) satisfies all the subtour elimination inequalities.
This suggests that Formulation (6.13), containing so many inequalities, is not ideal.
Many other classes of inequalities are known for the minimum Hamiltonian cycle
problem. But, unlike the subtour elimination inequalities, all other classes of in-
equalities are usually not part of the problem formulation.
It is easy to see that, of the six edges
(1, 2), (2, 3), (3, 1), (1, 4), (2, 5), (3, 6),
no more than four can belong to a Hamiltonian cycle. Therefore, the following inequality must hold:
x1,2 + x2,3 + x3,1 + x1,4 + x2,5 + x3,6 ≤ 4,   (6.16)
which is violated at x(3). Adding this inequality to the constraints of (6.14), after reoptimization we again get the solution x(2) given by (6.15), but now the cost of x(2) is 4. ⊓⊔
|H ∩ T_i| ≥ 1, |T_i \ H| ≥ 1, i = 1,…,k,
T_i ∩ T_j = ∅, i = 1,…,k−1, j = i+1,…,k,   (6.17)

[Figure: a comb with handle H and teeth T1, T2, T3.]

x(H) + ∑_{i=1}^k x(T_i) ≤ |H| + ∑_{i=1}^k (|T_i| − 2) + ⌊k/2⌋   (6.18)

is called a comb-inequality.
Let us show that (6.18) is a Chvátal-Gomory cut. Using (6.13b) and the following equivalent representation of the subtour elimination inequalities,
x(S) ≤ |S| − 1, ∅ ≠ S ⊂ V,
one can combine these constraints so that, dividing both sides of the resulting inequality by 2 and rounding down the right-hand side, we obtain (6.18).
Even for small n, the total number of comb-inequalities is huge; there are many more of them than there are subtour elimination inequalities. To use the comb-inequalities in computational algorithms, we need an efficient separation procedure for these inequalities. In the general case, the separation problem for comb-inequalities has not been solved, but there are several heuristic separation procedures. These procedures may fail to find an inequality violated at a given point, even if such inequalities exist.
An efficient exact separation procedure is known only for a subclass of comb-
inequalities, when
|H ∩ Ti | = 1, |Ti \ H| = 1, i = 1, . . . , k.
Such comb-inequalities are also called flower inequalities because these inequalities
are sufficient to describe the 2-matching polyhedron, which is the convex hull of
points x ∈ {0, 1}E satisfying (6.13b).
From a practical point of view, the main difference between the cuts that are part of the problem formulation and ordinary cuts is that an exact separation procedure is necessary for the former, while heuristic separation procedures can be used for the latter. There are many examples where, even though theoretically efficient separation procedures exist, in practice preference is given to faster heuristics.
6.7 Notes
Sect. 6.1. The LP-based branch-and-bound method for integer programming was proposed by Land and Doig in [83].
Sect. 6.2. Articles [62, 106] were among the first to describe the use of cuts in the
branch-and-bound method.
Sect. 6.3. Now standard branching on an integer variable with a fractional value
appeared in [41], pseudocost branching was proposed in [23], strong branching was
introduced in CPLEX 7.5 (see also [6]), and GUB/SOS-branching was proposed in
[19].
Sect. 6.4. The idea to generate global cuts avoiding local bounds for binary variables
was specified in [14].
Sect. 6.5. Many preprocessing methods are considered as folklore, since it is very
difficult to trace their origins. Various aspects of preprocessing are discussed in
[4, 30, 70, 74, 119, 129, 132].
Sect. 6.6. Dantzig, Fulkerson and Johnson [44] were the first to use cuts for solving a traveling salesman problem with 49 cities. Later, their method was significantly extended and improved by many researchers. An overview of these results is provided in [96]. An implementation of the branch-and-cut method for solving very large traveling salesman problems is discussed in [7].
The minimum cut problem can be efficiently solved by both deterministic [97]
and probabilistic [79] algorithms. An efficient implementation of the procedure for
constructing the Gomory-Hu tree was proposed in [71].
The comb inequalities were introduced in [36] for a particular case of comb struc-
tures, where each tooth contains exactly one vertex from the handle, and the general
comb inequalities appeared in [65]. The separation procedure for flower inequalities
was developed in [105]. A number of heuristic separation procedures for comb-like
inequalities were described in [107].
Sect. 6.8. The statement of Exercise 6.11 was taken from [92] (see also [110]).
6.8 Exercises
6.1. Prove that for odd n, the branch-and-bound method from Listing 6.1 processes an
exponential (in n) number of nodes.
6.2. How many branchings can the branch-and-bound method perform in the worst
case when solving an IP with one integer variable?
6.3. Solve again Example 6.1 by the branch-and-bound method, but now first apply
preprocessing.
6.4. Solve the following IPs by the branch-and-bound method:
6.5. Using the result of Exercise 3.4, solve the following 0,1-knapsack problem by
the branch-and-bound method:
16x1 + 6x2 + 14x3 + 19x4 → max,
6x1 + 3x2 + 7x3 + 9x4 ≤ 13,
x1 , x2 , x3 , x4 ∈ {0, 1}.
6.6. Solve the next IP by the branch-and-cut method that at each node generates
only fractional Gomory cuts:
3x1 − x2 → max,
3x1 − 2x2 ≤ 3,
−5x1 − 4x2 ≤ −10,
2x1 + x2 ≤ 5,
0 ≤ x1 ≤ 2,
0 ≤ x2 ≤ 3,
x1 , x2 ∈ Z.
its first inequality can be replaced with the following much simpler inequality
x1 + x2 − y ≤ 0.
⌈qa_j⌉ · b/⌈qb⌉ ≤ a_j, j = 1, . . . , n,
and at least one of the above inequalities is strict. Prove that the inequality
∑_{j=1}^n ⌈qa_j⌉ x_j ≥ ⌈qb⌉
∑_{e∈E} w_e x_e → max,          (6.20a)
∑_{e∈C} x_e ≤ |C| − 1, C ∈ C_G,          (6.20b)
x_e ∈ {0, 1}, e ∈ E.          (6.20c)
The column generation method is used for solving LPs with a large (usually expo-
nentially large) number of columns (variables). Technically, this method is similar
to the cutting plane method. A standard problem for demonstrating this method is
the one-dimensional cutting stock problem.
Materials such as paper, textiles, cellophane, and metal foil are produced in rolls of
great length, from which shorter stocks are then cut. For example, from a roll of
length 1000 cm we can cut 20 stocks of length 30 cm and 11 stocks of length 36 cm,
with 4 cm going to waste. When many different types of stocks must be cut out in
different quantities, it is not always easy to find the most
economical way (with the minimum amount of waste) to do this.
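The waste arithmetic of a single cutting pattern is easy to check; below is a small sketch (the helper name is ours, not from the text):

```python
def waste(roll_length, counts, lengths):
    """Leftover length of a roll cut according to `counts`,
    where counts[i] stocks of length lengths[i] are cut out.
    Returns -1 if the pattern does not fit on the roll."""
    used = sum(c * l for c, l in zip(counts, lengths))
    return roll_length - used if used <= roll_length else -1

# The example from the text: a 1000 cm roll yielding 20 stocks of 30 cm
# and 11 stocks of 36 cm leaves 4 cm of waste.
print(waste(1000, [20, 11], [30, 36]))  # 4
```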
The problem of finding the most economical method of cutting is known as the
cutting stock problem. In the simplest form, it is formulated as follows. From rolls of
length L, we need to cut out pieces of length l1 , . . . , lm , respectively, in the quantities
q1 , . . . , qm . Our goal is to use the minimum number of rolls.
x j ∈ Z+ , j = 1, . . . , n.
Remark. This basic model can easily be modified for the case when rolls of
different lengths must be cut. When cutting expensive materials, such as silk, a
more appropriate criterion is to minimize the cost of leftovers ∑_{j=1}^n c_j x_j, where c_j
is the waste cost for the pattern a_j.
x j ≥ 0, j = 1, . . . , k,
has a feasible solution. Let x* and y* be optimal primal and dual basic solutions to
this LP. The solution x* can be extended to a solution of the full LP (the relaxation LP
for (7.1)) by setting to zero the values of all variables x_j for j = k + 1, . . . , n. Clearly,
this extended solution is optimal to (7.1) if y* is its optimal dual solution. By the
complementary slackness condition (see item c) of Theorem 3.2), the latter holds
if all reduced costs are non-negative:
c̄_j = 1 − ∑_{i=1}^m a_{ij} y*_i ≥ 0, j = 1, . . . , n.          (7.3)
To illustrate how the column generation algorithm works, let us apply it to solve
an instance of the cutting-stock problem with the following numeric parameters:
L = 100, l1 = 45, l2 = 36, l3 = 31, l4 = 14, q1 = 97, q2 = 610, q3 = 395, q4 = 211.
First, let us apply the heuristic from Sect. 7.1.3 to find an initial set of patterns.
Initialization. Set b = (97, 610, 395, 211), I = (1, 2, 3, 4), k = 0.
Step 1. Set W = 100, x₁ = ∞. Compute in sequence
a₁₁ = ⌊100/45⌋ = 2, W = 100 − 2 · 45 = 10, x₁ = ⌈97/2⌉ = 49;
a₁₂ = ⌊10/36⌋ = 0; a₁₃ = ⌊10/31⌋ = 0; a₁₄ = ⌊10/14⌋ = 0.
Set b = (97, 610, 395, 211) − 49(2, 0, 0, 0) = (−1, 610, 395, 211), I = (2, 3, 4).
Step 2. Set W = 100, x₂ = ∞, a₂₁ = 0. Compute in sequence
a₂₂ = ⌊100/36⌋ = 2; W = 100 − 2 · 36 = 28, x₂ = ⌈610/2⌉ = 305;
a₂₃ = ⌊28/31⌋ = 0;
a₂₄ = ⌊28/14⌋ = 2, W = 28 − 2 · 14 = 0, x₂ = ⌈211/2⌉ = 106.
Set b = (−1, 610, 395, 211) − 106(0, 2, 0, 2) = (−1, 398, 395, −1), I = (2, 3).
Step 3. Set W = 100, x₃ = ∞, a₃₁ = 0, a₃₄ = 0. Compute in sequence
a₃₂ = ⌊100/36⌋ = 2; W = 100 − 2 · 36 = 28, x₃ = ⌈398/2⌉ = 199;
a₃₃ = ⌊28/31⌋ = 0.
Set b = (−1, 398, 395, −1) − 199(0, 2, 0, 0) = (−1, 0, 395, −1), I = (3).
Step 4. Set W = 100, x₄ = ∞, a₄₁ = 0, a₄₂ = 0, a₄₄ = 0. Compute
a₄₃ = ⌊100/31⌋ = 3; W = 100 − 3 · 31 = 7, x₄ = ⌈395/3⌉ = 132.
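The four steps above follow one greedy scheme: pack each remaining stock type in turn, then use the resulting pattern as many times as the tightest remaining demand requires. A possible sketch of the heuristic from Sect. 7.1.3, reconstructed from this trace (our code, not the book's listing; it assumes every stock fits on a roll):

```python
def greedy_patterns(L, lengths, demands):
    """Greedily build cutting patterns until all demands are covered.
    Returns a list of (pattern, multiplicity) pairs."""
    b = list(demands)
    active = [i for i in range(len(lengths)) if b[i] > 0]
    plan = []
    while active:
        W, x, a = L, float('inf'), [0] * len(lengths)
        for i in active:
            a[i] = W // lengths[i]            # fit as many stocks of type i as possible
            if a[i] > 0:
                W -= a[i] * lengths[i]
                x = min(x, -(-b[i] // a[i]))  # ceil(b_i / a_i): rolls to cover demand i
        plan.append((a, x))
        for i in active:                      # update remaining demands
            b[i] -= x * a[i]
        active = [i for i in active if b[i] > 0]
    return plan

plan = greedy_patterns(100, [45, 36, 31, 14], [97, 610, 395, 211])
# Reproduces the four steps: patterns (2,0,0,0), (0,2,0,2), (0,2,0,0), (0,0,3,0)
# used 49, 106, 199, and 132 times, respectively.
```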
x1 + x2 + x3 + x4 → min,
2x1 ≥ 97,
2x2 + 2x3 ≥ 610,
(7.5)
3x4 ≥ 395,
2x2 ≥ 211,
x1 , x2 , x3 , x4 ≥ 0.
Its optimal primal and dual solutions are, respectively, the following vectors:
x¹ = (48½, 105½, 199½, 131⅔)ᵀ, y¹ = (½, ½, ⅓, 0)ᵀ.
The vector z¹ = (0, 1, 2, 0)ᵀ is its optimal solution. Since (y¹)ᵀz¹ = 1/2 + 2/3 =
7/6 > 1, the column a⁵ = z¹ is added to (7.5), and, as a result, we get the
following LP:
x1 + x2 + x3 + x4 + x5 → min,
2x1 ≥ 97,
2x2 + 2x3 + x5 ≥ 610,
3x4 + 2x5 ≥ 395,
2x2 ≥ 211,
x1 , x2 , x3 , x4 , x5 ≥ 0.
After reoptimizing, we obtain the following optimal primal and dual solutions to the
above LP:
x² = (48½, 105½, 100¾, 0, 197½)ᵀ, y² = (½, ½, ¼, 0)ᵀ.
ξ = ½z₁ + ½z₂ + ¼z₃ + 0z₄ → max,
45z₁ + 36z₂ + 31z₃ + 14z₄ ≤ 100,
z₁, z₂, z₃, z₄ ∈ Z₊.
The point z² = (1, 1, 0, 0)ᵀ is its optimal solution. Since (y²)ᵀz² = 1, x² determines an optimal solution to the full relaxation LP.
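Each pricing problem above is an integer knapsack with the current dual values as profits. One way to solve it exactly is a dynamic program over the roll length; the sketch below is our code (not MIPCL's) and uses exact fractions:

```python
from fractions import Fraction as F

def integer_knapsack(values, weights, capacity):
    """Maximize sum(values[i] * z[i]) subject to
    sum(weights[i] * z[i]) <= capacity, z[i] in Z+.
    Returns (optimal value, an optimal vector z)."""
    best = [F(0)] * (capacity + 1)   # best[w]: max profit within capacity w
    take = [-1] * (capacity + 1)     # item added to reach best[w]; -1 = none
    for w in range(1, capacity + 1):
        best[w], take[w] = best[w - 1], -1
        for i, (v, wt) in enumerate(zip(values, weights)):
            if wt <= w and best[w - wt] + v > best[w]:
                best[w], take[w] = best[w - wt] + v, i
    z, w = [0] * len(values), capacity
    while w > 0:                     # backtrack to recover z
        if take[w] >= 0:
            z[take[w]] += 1
            w -= weights[take[w]]
        else:
            w -= 1
    return best[capacity], z

# Pricing with the duals y1 = (1/2, 1/2, 1/3, 0) of the initial master LP (7.5):
val, z = integer_knapsack([F(1, 2), F(1, 2), F(1, 3), F(0)], [45, 36, 31, 14], 100)
# val == 7/6, z == [0, 1, 2, 0]: the column z1 added to (7.5)
```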
Rounding up the solution x², we get the following approximate solution of our
example: cut 49 rolls according to the pattern (2, 0, 0, 0), 106 rolls according to
the pattern (0, 2, 0, 2), 101 rolls according to the pattern (0, 2, 0, 0), and 198 rolls
according to the pattern (0, 1, 2, 0). In this case, 454 rolls are used in total.
Since any cutting plan must use at least
⌈x₁² + x₂² + x₃² + x₄² + x₅²⌉ = ⌈452¼⌉ = 453
rolls, if there is a more economical way of cutting, it can save only one roll.
As a rule, a cutting plan obtained by rounding up a solution to the relaxation LP is
very "close" to an optimal one. Therefore, in practice, we can almost always limit
ourselves to the search for such approximate cutting plans. □
7.2 Dantzig-Wolfe Decomposition

Dantzig-Wolfe decomposition was originally developed (in 1960) for solving large
structured LPs. Much later, in the 1980s, this method began to be used to reformu-
late MIPs in order to strengthen them.
Consider the IP in the following form:
∑_{k=1}^K (cᵏ)ᵀxᵏ → max,
∑_{k=1}^K Aᵏxᵏ ≤ b,          (7.6)
xᵏ ∈ Xᵏ, k = 1, . . . , K,

∑_{k=1}^K ∑_{a∈Xᵏ} (aᵀcᵏ)λₐᵏ → max,
∑_{k=1}^K ∑_{a∈Xᵏ} (Aᵏa)λₐᵏ ≤ b,          (7.7)
∑_{a∈Xᵏ} λₐᵏ = 1, k = 1, . . . , K,
λₐᵏ ∈ Z₊, a ∈ Xᵏ, k = 1, . . . , K.
A natural question arises: what gives us the transition from Formulation (7.6),
which is compact, to Formulation (7.7) with a huge number of variables (columns)?
One of the advantages is that, in cases where each of the sets X k is a set of integer
points of a polyhedron, (7.7) is usually stronger than (7.6), since only points xk from
conv(X k ) are feasible to the relaxation LP for (7.7). For example, if
then the point x̄ = (1, ½), which satisfies all inequalities defining Xᵏ, does not belong
to the convex hull of the integer points (0, 0), (1, 0), and (0, 1) of Xᵏ.
Let z_LPM denote the optimal objective value of the relaxation LP for (7.7). It is
not difficult to see that the following equality holds:
z_LPM = z_CUT ≝ max ∑_{k=1}^K (cᵏ)ᵀxᵏ,
∑_{k=1}^K Aᵏxᵏ ≤ b,
xᵏ ∈ conv(Xᵏ), k = 1, . . . , K.
This means that the branch-and-bound method applied to (7.7) is equivalent (from
the point of view of the accuracy of upper bounds) to the branch-and-cut method
applied to (7.6), assuming that this method uses exact separation procedures for the
sets conv(Xᵏ).
The number of variables in (7.7) can be astronomically large. Therefore, its relax-
ation LP can only be solved by a column generation algorithm. First, relatively small
subsets Sk ⊆ X k are chosen and the master problem is solved:
∑_{k=1}^K ∑_{a∈Sᵏ} (aᵀcᵏ)λₐᵏ → max,
∑_{k=1}^K ∑_{a∈Sᵏ} (Aᵏa)λₐᵏ ≤ b,          (7.8)
∑_{a∈Sᵏ} λₐᵏ = 1, k = 1, . . . , K,
λₐᵏ ≥ 0, a ∈ Sᵏ, k = 1, . . . , K.
Let (y, v) ∈ ℝ₊ᵐ × ℝᴷ be an optimal dual solution to (7.8). The nonzero components of an optimal primal solution to (7.8) are the nonzero components of an optimal
solution to the relaxation LP for (7.7) if the inequalities
yᵀAᵏa + vₖ ≥ aᵀcᵏ, a ∈ Xᵏ, k = 1, . . . , K,          (7.9)
hold. To verify this condition, for each k = 1, . . . , K, we need to solve the following
pricing problem
((cᵏ)ᵀ − yᵀAᵏ)xᵏ → max,
xᵏ ∈ Xᵏ.          (7.10)
If for some k the optimal objective value in (7.10) is greater than vk , then an optimal
solution to (7.10), x̄k , is added to the set Sk . Next we solve the extended master LP.
We continue to act in this way until an optimal solution to the relaxation LP for (7.7)
is found.
7.2.2 Branching
Applying the column generation technique makes it difficult (or even impossible)
to use conventional branching on integer variables in the branch-and-bound
method. For example, if at a particular node of the search tree we set some vari-
able λₐᵏ to zero, then in an optimal solution to the relaxation LP at this node the
reduced cost of λₐᵏ can be positive, which makes it possible for the column corre-
sponding to λₐᵏ to again be an optimal solution to the pricing problem. To prevent this, an
additional constraint must be added to the pricing problem. As a result, even at
search tree nodes of low depth, an initially easy-to-solve pricing problem
can turn into a difficult IP.
Everything is greatly simplified for problems involving only binary variables.
The point x̃ᵏ = ∑_{a∈Sᵏ} λₐᵏ a belongs to {0, 1}^{n_k} if and only if all λₐᵏ are integer-valued.
Therefore, if some variables λₐᵏ take fractional values, then x̃ᵏ has fractional
components x̃ⱼᵏ, one of which can be chosen for branching.
When processing the branch xⱼᵏ = α (α ∈ {0, 1}), all elements a with aⱼ = 1 − α
must be removed from the set Sᵏ. In addition, we need to exclude the generation
of such elements (columns) when solving the corresponding pricing problem. The
latter can be done by setting xⱼᵏ = α in the pricing problem. If α = 1, then we also
need to exclude from the other sets Sᵏ̄ (k̄ ≠ k) all elements a with aⱼ = 1. In the
pricing problems it is necessary to set xⱼᵏ̄ = 0 for all k̄ ≠ k. Note that the addition
of such simple constraints usually does not destroy the structure of each pricing
problem.
This combination of the branch-and-bound and column generation methods is
known as the branch-and-price method.
Objective (7.11a) is to minimize the total assignment cost. Equations (7.11b) re-
quire that each task be assigned to exactly one machine. Inequalities (7.11c) impose
the capacity restrictions.
Setting
Xᵏ = {z ∈ {0, 1}ᵐ : ∑_{i=1}^m pᵢᵏzᵢ ≤ lₖ},
∑_{a∈Xᵏ} λₐᵏ ≤ 1, k = 1, . . . , K,          (7.12c)
λₐᵏ ∈ {0, 1}, a ∈ Xᵏ, k = 1, . . . , K.          (7.12d)
Here e denotes the m-vector of all ones. Observe that we write (7.12c) as
inequalities (instead of equations, as in (7.7)) since Xᵏ contains the
point 0 ∈ ℝᵐ.
∑_{a∈Sᵏ} λₐᵏ ≤ 1, k = 1, . . . , K,          (7.13c)
λₐᵏ ≥ 0, a ∈ Sᵏ, k = 1, . . . , K,          (7.13d)
sᵢ ≥ 0, i = 1, . . . , m.          (7.13e)
Introducing slack variables also allows us to start with the simplest master problem,
in which all Sᵏ = ∅.
Let (ỹ, ṽ) ∈ ℝᵐ × ℝ₊ᴷ be an optimal dual solution to (7.13). Here the dual variable yᵢ
corresponds to the i-th equation in (7.13b), and the dual variable vk corresponds to
the k-th inequality in (7.13c). For k = 1, . . . , K, the pricing problem is the following
0,1-knapsack problem:
∑_{i=1}^m (−cᵢᵏ − ỹᵢ)zᵢ → max,
∑_{i=1}^m pᵢᵏzᵢ ≤ lₖ,          (7.14)
zᵢ ∈ {0, 1}, i = 1, . . . , m.
This pricing problem can be solved by the recurrence formula (1.30).
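For completeness, here is a standard table-filling 0,1-knapsack sketch in the spirit of such recurrences (our code; we do not reproduce (1.30) itself). Items with non-positive profit are never selected, which suits the pricing objective above:

```python
def knapsack01(values, weights, capacity):
    """0,1-knapsack: maximize sum(values[i]*z[i]) subject to
    sum(weights[i]*z[i]) <= capacity, z[i] in {0, 1}.
    Items with non-positive profit are never taken."""
    n = len(values)
    # best[i][w]: max profit using items 0..i-1 within capacity w
    best = [[0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        v, wt = values[i - 1], weights[i - 1]
        for w in range(capacity + 1):
            best[i][w] = best[i - 1][w]
            if wt <= w and best[i - 1][w - wt] + v > best[i][w]:
                best[i][w] = best[i - 1][w - wt] + v
    z, w = [0] * n, capacity
    for i in range(n, 0, -1):          # backtrack to recover z
        if best[i][w] != best[i - 1][w]:
            z[i - 1] = 1
            w -= weights[i - 1]
    return best[n][capacity], z

# A machine-1 pricing problem from the example below
# (profits 0, 1, -1; weights 3, 2, 2; capacity 4):
print(knapsack01([0, 1, -1], [3, 2, 2], 4))  # (1, [0, 1, 0])
```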
Let z∗ ∈ {0, 1}m be an optimal solution to (7.14). If the inequality
∑_{i=1}^m (−cᵢᵏ − ỹᵢ)z*ᵢ > ṽₖ
7.3.3 Branching
Branching on the variables λₐᵏ is not efficient for two reasons. First, such branch-
ing results in an unbalanced search tree, since the branch λₐᵏ = 0 excludes only one
column, while the branch λₐᵏ = 1 excludes all columns that have a 1 in at least one
row i such that aᵢ = 1. Second, such branching violates the structure of the pricing
problem (see Sect. 7.3.2).
As we have already noted, if in an optimal solution λ̃ of the master LP one of
the values λ̃ₐᵏ is non-integer, then the vector
x̃ᵏ = ∑_{a∈Sᵏ} λ̃ₐᵏ a
Among the variables xiq , we choose for branching a variable xrq whose current value
x̃rq is closest to 0.5.
7.3.4 Example
After we have specified all the elements of the branch-and-price method, let us apply
it for solving a small example.
Node 0: γ(0) = −5, with (x₁¹, x₂¹, x₃¹)ᵀ = (½, ½, ½)ᵀ and (x₁², x₂², x₃²)ᵀ = (½, ½, ½)ᵀ.
Branching on x₁¹ creates two child nodes:
Node 1 (x₁¹ = 0): γ(1) = −6, with (x₁¹, x₂¹, x₃¹)ᵀ = (0, 0, 1)ᵀ and (x₁², x₂², x₃²)ᵀ = (1, 1, 0)ᵀ;
Node 2 (x₁¹ = 1): γ(2) = −7, node eliminated since its upper bound is less than the record.
The point z∗ = (0, 1, 1)T and ξ ∗ = 8 are its optimal solution and objective value.
Since ξ ∗ = 8 > 0 = v1 , we add z∗ to the set S1 : S1 = {(0, 1, 1)T }.
For machine 2, we solve the next 0,1-knapsack problem:
Its optimal solution and objective value are z∗ = (1, 1, 0)T and ξ ∗ = 8. Since ξ ∗ =
8 > 0 = v2 , we add z∗ to the set S2 : S2 = {(1, 1, 0)T }.
0.2. Solve the next extended master LP:
−4λ₁¹ − 4λ₁² − 6s₁ − 6s₂ − 6s₃ → max,
λ₁² + s₁ = 1,
λ₁¹ + λ₁² + s₂ = 1,
λ₁¹ + s₃ = 1,
λ₁¹ ≤ 1,
λ₁² ≤ 1,
λ₁¹, λ₁², s₁, s₂, s₃ ≥ 0.
Its optimal solution and objective value are z∗ = (1, 0, 0)T and ξ ∗ = 5. Since ξ ∗ =
5 > 0 = v1 , we add z∗ to S1 : S1 = {(0, 1, 1)T , (1, 0, 0)T }.
For machine 2, we solve the next 0,1-knapsack problem:
Its optimal solution and objective value are z∗ = (0, 0, 1)T and ξ ∗ = 5. Since ξ ∗ =
5 > 0 = v2 , we add z∗ to S2 : S2 = {(1, 1, 0)T , (0, 0, 1)T }.
0.3. We need to solve one more master LP:
−4λ₁¹ − λ₂¹ − 4λ₁² − λ₂² − 6s₁ − 6s₂ − 6s₃ → max,
λ₂¹ + λ₁² + s₁ = 1,
λ₁¹ + λ₁² + s₂ = 1,
λ₁¹ + λ₂² + s₃ = 1,
λ₁¹ + λ₂¹ ≤ 1,
λ₁² + λ₂² ≤ 1,
λ₁¹, λ₂¹, λ₁², λ₂², s₁, s₂, s₃ ≥ 0.
ξ = 0z₁ + z₂ − z₃ → max,
3z₁ + 2z₂ + 2z₃ ≤ 4,
z₁, z₂, z₃ ∈ {0, 1}.
Its optimal solution and objective value are z∗ = (0, 1, 0)T and ξ ∗ = 1. Since ξ ∗ =
1 > 0 = v1 , we add z∗ to S1 : S1 = {(0, 1, 1)T , (1, 0, 0)T , (0, 1, 0)T }.
For machine 2, we solve the next 0,1-knapsack problem:
Its optimal solution and objective value are z∗ = (0, 1, 0)T and ξ ∗ = 1. Since ξ ∗ =
1 > 0 = v2 , we add z∗ to S2 : S2 = {(1, 1, 0)T , (0, 0, 1)T , (0, 1, 0)T }.
0.4. Solve the next master LP:
−4λ₁¹ − λ₂¹ − 2λ₃¹ − 4λ₁² − λ₂² − 2λ₃² − 6s₁ − 6s₂ − 6s₃ → max,
λ₂¹ + λ₁² + s₁ = 1,
λ₁¹ + λ₃¹ + λ₁² + λ₃² + s₂ = 1,
λ₁¹ + λ₂² + s₃ = 1,
λ₁¹ + λ₂¹ + λ₃¹ ≤ 1,
λ₁² + λ₂² + λ₃² ≤ 1,
λ₁¹, λ₂¹, λ₃¹, λ₁², λ₂², λ₃², s₁, s₂, s₃ ≥ 0.
The non-zero components of its optimal primal and dual solutions are the following:
λ₁¹ = λ₂¹ = λ₁² = λ₂² = ½,
y = (−2, −3, −2), v₁ = v₂ = 1.          (7.15)
Next we solve the pricing problems. For machine 1, we solve the following 0,1-
knapsack problem:
ξ = z₁ + z₂ + 0z₃ → max,
3z₁ + 2z₂ + 2z₃ ≤ 4,
z₁, z₂, z₃ ∈ {0, 1}.
Its optimal solution and optimal objective value are z∗ = (1, 0, 0)T and ξ ∗ = 1. Since
ξ ∗ = 1 = v1 , we cannot extend the set S1 .
For machine 2, we solve the next 0,1-knapsack problem:
ξ = 0z₁ + z₂ + z₃ → max,
2z₁ + 2z₂ + 3z₃ ≤ 4,
z₁, z₂, z₃ ∈ {0, 1}.
Its optimal solution and optimal objective value are z∗ = (0, 1, 0)T and ξ ∗ = 1. Since
ξ ∗ = 1 = v2 , we cannot extend the set S2 .
We were not able to extend either set S¹ or S². This means that the current
solution given by (7.15) is optimal for the root node LP. Knowing the values of the
variables λₐᵏ, we calculate the values of the variables xᵢᵏ:
(x₁¹, x₂¹, x₃¹)ᵀ = ½ · (0, 1, 1)ᵀ + ½ · (1, 0, 0)ᵀ = (½, ½, ½)ᵀ,
(x₁², x₂², x₃²)ᵀ = ½ · (1, 1, 0)ᵀ + ½ · (0, 0, 1)ᵀ = (½, ½, ½)ᵀ.
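This conversion is just the weighted sum x̃ᵏ = ∑ₐ λₐᵏ a of the columns; as a two-line sketch (the function name is ours):

```python
from fractions import Fraction as F

def x_from_lambda(columns, lambdas):
    """Component-wise: x_i = sum over columns a of lambda_a * a[i]."""
    return [sum(lam * a[i] for a, lam in zip(columns, lambdas))
            for i in range(len(columns[0]))]

# Machine 1 at the root node: S1 = {(0,1,1), (1,0,0)}, both lambdas equal 1/2
x1 = x_from_lambda([(0, 1, 1), (1, 0, 0)], [F(1, 2), F(1, 2)])  # [1/2, 1/2, 1/2]
```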
In addition, for this node and all its descendants, when setting the pricing problems
for machine 1, it will always be necessary to set z1 = 0. The set S2 remains the same
as that of the parent node1 : S2 = {(1, 1, 0)T , (0, 0, 1)T , (0, 1, 0)T }.
Now we solve the following master LP:
Its optimal solution and objective value are z* = (0, 0, 1)ᵀ and ξ* = 3/2. Since ξ* =
3/2 > 0 = v₁, we add z* to S¹: S¹ = {(0, 1, 1)ᵀ, (0, 1, 0)ᵀ, (0, 0, 1)ᵀ}.
For machine 2, we solve the next 0,1-knapsack problem:
ξ = 4z₁ − (3/2)z₂ + (5/2)z₃ → max,
2z₁ + 2z₂ + 3z₃ ≤ 4,
z₁, z₂, z₃ ∈ {0, 1}.
Its optimal solution and objective value are z* = (1, 0, 0)ᵀ and ξ* = 4. Since ξ* =
4 > 5/2 = v₂, we add z* to S²: S² = {(1, 1, 0)ᵀ, (0, 0, 1)ᵀ, (0, 1, 0)ᵀ, (1, 0, 0)ᵀ}.
1.2. Now we need to solve the next master LP:
1 Since we only have two machines, task 1 must be processed by machine 2. Therefore, we could
remove from the set S2 all vectors a with a1 = 0. These are the second and third vectors. But we
do not do this, because we do not want to use any specifics of this concrete example.
−4λ₁¹ − 2λ₂¹ − 2λ₃¹ − 4λ₁² − λ₂² − 2λ₃² − 2λ₄² − 6s₁ − 6s₂ − 6s₃ → max,
λ₁² + λ₄² + s₁ = 1,
λ₁¹ + λ₂¹ + λ₁² + λ₃² + s₂ = 1,
λ₁¹ + λ₃¹ + λ₂² + s₃ = 1,
λ₁¹ + λ₂¹ + λ₃¹ ≤ 1,
λ₁² + λ₂² + λ₃² + λ₄² ≤ 1,
λ₁¹, λ₂¹, λ₃¹, λ₁², λ₂², λ₃², λ₄², s₁, s₂, s₃ ≥ 0.
The non-zero components of its optimal primal and dual solutions are the following:
λ₃¹ = λ₁² = 1,
y = (−3, −2, −2), v₁ = 0, v₂ = 1.          (7.16)
Next we solve the pricing problems. For machine 1, we solve the following 0,1-
knapsack problem:
Its optimal solution and objective value are z∗ = (0, 0, 0)T and ξ ∗ = 0. Since ξ ∗ =
0 = v1 , we cannot extend the set S1 .
For machine 2, we solve the next 0,1-knapsack problem:
ξ = z₁ + 0z₂ + z₃ → max,
2z₁ + 2z₂ + 3z₃ ≤ 4,
z₁, z₂, z₃ ∈ {0, 1}.
Its optimal solution and objective value are z∗ = (0, 0, 1)T and ξ ∗ = 1. Since ξ ∗ =
1 = v2 , we cannot extend the set S2 .
We were not able to extend either set S¹ or S². This means that the current
solution given by (7.16) is optimal for node 1. Since all λₐᵏ are integers, we can
compute a feasible solution to the original problem:
x¹ = (0, 0, 1)ᵀ, x² = (1, 1, 0)ᵀ.
According to this solution, task 3 is processed on machine 1, and tasks 1 and 2 are
processed on machine 2. This is our first solution, which is therefore remembered
as the record solution in the form of the vector πᴿ = (2, 2, 1)ᵀ (πᵢᴿ = k if task i is
assigned to machine k). The optimal objective value of the solved relaxation LP,
which equals −6, is our new record: R = −6. Note that the cost of the record
assignment πᴿ is −R = 6.
2.1. Since x₁¹ = 1 at node 2, which means that task 1 is assigned to machine 1,
we must exclude all vectors a with a₁ = 1 from the set S² that is inherited from the
parent node 0. As a result, we have S² = {(0, 0, 1)ᵀ, (0, 1, 0)ᵀ}.
In addition, for this node and all its descendants, when setting the pricing problems
for machine 1, it will always be necessary to set z1 = 1, and for all other machines
(in this example only for machine 2), we need to set z1 = 0. The set S1 remains the
same as that at the parent node: S1 = {(0, 1, 1)T , (1, 0, 0)T , (0, 1, 0)T }.
Now we solve the next master LP:
−4λ₁¹ − λ₂¹ − 2λ₃¹ − λ₁² − 2λ₂² − 6s₁ − 6s₂ − 6s₃ → max,
λ₂¹ + s₁ = 1,
λ₁¹ + λ₃¹ + λ₂² + s₂ = 1,
λ₁¹ + λ₁² + s₃ = 1,
λ₁¹ + λ₂¹ + λ₃¹ ≤ 1,
λ₁² + λ₂² ≤ 1,
λ₁¹, λ₂¹, λ₃¹, λ₁², λ₂², s₁, s₂, s₃ ≥ 0.
The non-zero components of its optimal primal and dual solutions are the following:
λ₁¹ = λ₂¹ = λ₁² = λ₂² = s₁ = ½,
y = (−6, −5, −4), v₁ = 5, v₂ = 3.          (7.17)
Next we solve the pricing problems. For machine 1, we solve the following 0,1-
knapsack problem:
Its optimal solution and objective value are z∗ = (1, 0, 0)T and ξ ∗ = 5. Since ξ ∗ =
5 = v1 , we cannot extend the set S1 .
For machine 2, we solve the next 0,1-knapsack problem:
Its optimal solution and objective value are z∗ = (0, 0, 1)T and ξ ∗ = 3. Since ξ ∗ =
3 = v2 , we cannot extend the set S2 .
We were not able to extend either set S¹ or S². This means that the current
solution given by (7.17) is optimal for node 2. But since the upper bound at this
node is −7, which is less than the current record of −6, node 2 must be
eliminated from the search tree.
Since there are no more unprocessed nodes in the search tree, the current record
assignment πᴿ = (2, 2, 1)ᵀ is optimal. □
After solving this example, someone might have doubts about the efficiency of
branch-and-price algorithms (at least the one that we just used to solve our
example): so many calculations to solve an almost trivial example. In fact, this is
not so. But even this simple example should convince you that the implementation
of almost any branch-and-price algorithm is not a trivial exercise.
7.4 Symmetry Issues

As we noted in Sect. 7.2, one of the reasons why Formulation (7.7) with a huge
number of variables may be preferable to the compact formulation (7.6) is that the
former provides more accurate upper bounds. We also noted that we get the
same bounds if we solve (7.6) by the branch-and-cut method using exact separation
procedures for the sets conv(Xᵏ). Another reason why we may prefer (7.7) is the
presence of symmetric structures, when, for several k, the objects Xᵏ, Aᵏ, and cᵏ are
the same. In such cases, there are many symmetric (essentially identical) solutions:
by interchanging the values of the corresponding components of the vectors xᵏ¹
and xᵏ² for two identical structures k₁ and k₂, we get a feasible solution with the
same objective value. As a rule, branch-and-cut algorithms are very inefficient
at solving problems with symmetric structures. The reason is that, by adding
a cut valid for the set Xᵏ¹ or by changing a bound on some variable xⱼᵏ¹, we do
not exclude the later appearance in the search tree of a node with the symmetric LP
solution obtained from the previously cut-off solution by simply exchanging the vectors
xᵏ¹ and xᵏ². In some cases, we can cut off some of the symmetric solutions by adding
new constraints. But it is not always possible to completely overcome these symmetry
issues. They can be resolved more effectively by developing a
specialized branching scheme. Often, the transition to Formulation (7.7) and the use
of a special branching scheme allow us to completely eliminate the symmetry. Next
we demonstrate this with the example of the generalized assignment problem.
Let us modify the statement of the problem given in Sect. 7.3. Suppose now that
we have K groups of machines, and group k contains nₖ identical machines. Now
the parameters pᵢᵏ and cᵢᵏ characterize all machines of group k: they are, respectively,
the processing time and cost of performing task i on any machine from group k. For
the new statement, the master problem (7.13) changes only slightly: we only need to
replace (7.13c) with the following inequalities:
∑_{a∈Sᵏ} λₐᵏ ≤ nₖ, k = 1, . . . , K.          (7.18)
At the root node of the search tree, the pricing problem (7.14) for each group of
machines remains unchanged. At other nodes, additional constraints will be added
to the pricing problems.
To eliminate symmetry, we no longer distinguish between machines of the same group.
For this reason, branching on the variables xᵢᵏ is impossible. On the other hand, branching
on the variables λₐᵏ is inefficient. We can nevertheless develop an efficient branching scheme
based on the following observation.
Proposition 7.1. Let λ̃ be a solution to the master LP at some node of the search
tree. If not all values λ̃ak (a ∈ Sk ) are integers, then there exist two tasks, r and s,
such that
∑_{a∈Sᵏ: a_r = a_s} λ̃ₐᵏ < 1.          (7.19)
So, if for some k not all λ̃ₐᵏ are integers, we seek a pair of tasks, r and s, such that
(7.19) is satisfied. Branching is performed by dividing the set of feasible solutions
into two subsets: the first includes all assignments in which tasks r and
s are performed on the same machine, and the second those in
which tasks r and s are performed on different machines. In the first case, when
solving the pricing problem for group k, tasks r and s are combined into one
task with processing time pᵣᵏ + pₛᵏ and cost cᵣᵏ + cₛᵏ. At the same time, it is necessary to
exclude from the node master LP all variables λₐᵏ for those k and a such that a_r ≠ a_s.
In the second case, we need to add the inequality z_r + z_s ≤ 1 to the pricing problem
for group k, and exclude from the master LP all variables λₐᵏ for those k and a such
that ar = as . Of course, the addition of new inequalities destroys the structure of the
pricing problem, and now it is not a 0,1-knapsack problem and, therefore, it must be
solved as an ordinary IP. But, despite this, computational experiments have proved
the efficiency of such branching.
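In the first ("same machine") branch the pricing problem stays a 0,1-knapsack: tasks r and s are simply merged into one item with summed cost and processing time. A sketch of that data transformation (function name and data layout are ours):

```python
def merge_tasks(costs, times, r, s):
    """Pricing data for the branch where tasks r and s must share a machine:
    the two items are replaced by a single item with summed cost and time."""
    keep = [i for i in range(len(costs)) if i not in (r, s)]
    new_costs = [costs[i] for i in keep] + [costs[r] + costs[s]]
    new_times = [times[i] for i in keep] + [times[r] + times[s]]
    return new_costs, new_times

# Merging tasks 0 and 2 of a three-task group:
print(merge_tasks([1, 2, 3], [4, 5, 6], 0, 2))  # ([2, 4], [5, 10])
```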
7.5 Designing Telecommunication Networks

In this section we show that applications of the branch-and-price method are not
limited to the framework of the decomposition approach of Dantzig and Wolfe. In a
number of cases, with a natural choice of variables, their number can be exponen-
tially large (recall the cutting stock problem from Sect. 7.1.1). In addition, there may
be restrictions that are difficult to formulate compactly by means of linear inequali-
ties. Sometimes it is better to take such complex restrictions into account in the pricing
problems, which can be solved in some other way (say, by dynamic or constraint
programming).
The input data in the problem of designing a reliable telecommunication network
are specified by two graphs: the channel graph G = (V, E) and the demand graph
H = (V̄, D). The set V consists of logical network nodes (offices, routers, etc.);
V̄ is a subset of V (offices are in V̄, but routers are not). The edges e ∈ E of the
channel graph represent the set of physical communication lines (channels) that
∑_{e∈E} cₑxₑ → min,          (7.20a)
∑_{q∈D} ∑_{P∈P(s,q): e∈E(P)} f_P^{sq} ≤ uₑxₑ, e ∈ Eₛ, s = 0, . . . , S,          (7.20b)
Since the number of variables in (7.20) is huge, we can solve it only by the branch-
and-price method. The master problem is written very simply:
∑_{q∈D} ∑_{P∈P̂(s,q): e∈E(P)} f_P^{sq} ≤ uₑxₑ, e ∈ Eₛ, s = 0, . . . , S,          (7.21b)
Here P̂(s, q) is a subset of the set P(s, q). We have also introduced slack variables
z_{sq} so that the master MIP always has a solution. In particular, this allows us to start
the branch-and-price algorithm with the empty sets P̂(s, q). As is common in LP,
M denotes a sufficiently large number.
Before proceeding to the discussion of the pricing problem, let us note that in our
branch-and-price algorithm we can apply the standard branching on integer variables,
as all integer variables xₑ are always present in the active master problem.
In formulating (7.20), we hid the restriction on the length of the communication paths
(which is difficult to formulate by a system of inequalities) in the definition of
the sets P(s, q), thereby moving the handling of this requirement into the pricing
problem.
Let us denote the dual variables for the master relaxation LP as follows:
• αse ≤ 0 is associated with the inequality in (7.21b) written for s ∈ {0, . . . , S} and
e ∈ E;
• βsq ≥ 0 corresponds to the inequality in (7.21c) written for s ∈ {0, . . . , S} and
q ∈ D.
Let the pair α̃ ∈ ℝ^{{0,...,S}×E} and β̃ ∈ ℝ^{{0,...,S}×D} constitute an optimal
dual solution to the relaxation LP for (7.21). For given s ∈ {0, . . . , S}, q ∈ D, and
P ∈ P(s, q), the reduced cost of the variable f_P^{sq} is
−∑_{e∈E(P)} α̃_{se} − β̃_{sq}.
This is the shortest path problem between nodes v and w in Gₛ, where the weight of
every edge e ∈ Eₛ is −α̃_{se} ≥ 0. If the weight of a shortest path P̂ is less than β̃_{sq},
then P̂ is added to the set P̂(s, q).
For the emergency states (s = 1, . . . , S), (7.22) is the shortest path problem in an
undirected graph. For the normal state (s = 0), (7.22) is the problem of finding
a shortest path of limited length. For those who do not know how to solve this
problem, let us say that some shortest path algorithms are easily adapted to search
for shortest paths of limited length. In particular, if the path length should not exceed
k, the famous Bellman-Ford algorithm must execute only k iterations (unless, of course,
the algorithm stops earlier).
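The remark about the Bellman-Ford algorithm can be made concrete: after k relaxation rounds, the labels are the shortest distances among walks with at most k edges. A sketch (the graph encoding is ours; edges are undirected, as in the pricing problem):

```python
def bellman_ford_limited(n, edges, source, k):
    """Shortest distances from `source` over walks using at most k edges.
    edges: list of (u, v, weight) for an undirected graph on nodes 0..n-1."""
    INF = float('inf')
    dist = [INF] * n
    dist[source] = 0
    for _ in range(k):
        new = dist[:]             # relax against the previous round only,
        for u, v, w in edges:     # so each round adds at most one edge
            if dist[u] + w < new[v]:
                new[v] = dist[u] + w
            if dist[v] + w < new[u]:
                new[u] = dist[v] + w
        dist = new
    return dist

# A path 0-1-2-3 of unit-weight edges plus a direct edge (0, 3) of weight 10:
edges = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (0, 3, 10)]
print(bellman_ford_limited(4, edges, 0, 2)[3])  # 10 (at most 2 edges)
print(bellman_ford_limited(4, edges, 0, 3)[3])  # 3 (the 3-edge path)
```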
Shortest path algorithms are studied in almost every textbook on graph theory or
network flows, and therefore are not discussed here.
7.6 Notes
Sect. 7.1. The works [54, 55], devoted to solving the one-dimensional cutting stock
problem, can be considered the first applications of the column generation technique
to solving IPs.
Sect. 7.2. The work of Dantzig and Wolfe [45] is fundamental for using decomposi-
tion in LP. Johnson [76] was one of the first to realize both the potential and the im-
plementation complexity of branch-and-price algorithms. The ideas and method-
ology of the branch-and-price method are discussed in [18], where an overview of
7.7 Exercises
7.1. Solve an instance of the cutting stock problem in which, from rolls of length 128,
it is necessary to cut out 12 stocks of length 66, 34 of length 25, 9 of length 85, 20
of length 16, and 7 of length 45.
7.2. Write down a compact formulation for the one-dimensional cutting stock prob-
lem from Sect. 7.1.1.
Hint. Suppose that the desired number of stocks can be cut out from at most
k rolls. For example, as k we can take the number of rolls in a cutting plan
built by the heuristic algorithm from Sect. 7.1.3. Use the following variables:
• xi j : number of stocks of type i to be cut out from roll j, i = 1, . . . , m, j = 1, . . . , k;
• y j = 1 if roll j is cut out, and y j = 0 otherwise, j = 1, . . . , k.
7.3. Use the branch-and-price method to solve the generalized assignment problem
with the following parameters: m = 3, K = 2,
        ⎛1 2⎞            ⎛2 1⎞
[cᵢᵏ] = ⎜2 2⎟ ,  [pᵢᵏ] = ⎜1 1⎟ ,  l = (2, 2)ᵀ.
        ⎝2 1⎠            ⎝1 2⎠
7.4. Clustering problem. Given a graph G = (V, E) in which each edge e ∈ E is
assigned a cost cₑ and each vertex v ∈ V a weight w_v. We need to
partition the set of vertices V into K clusters V₁, . . . , V_K (some of which may be empty)
so that the sum of the vertex weights in each cluster does not exceed a given
limit W, and the sum of the costs of the edges between clusters is minimum.
Formulate this clustering problem as an IP so that the branch-and-price method
can be used to solve it; write down the master and pricing problems, and elaborate
a branching rule.
7.5. If you have mastered Exercise 7.4, you can test yourself by numerically solving
an instance of the clustering problem on the complete graph on three vertices with
three possible clusters, where the weights of all vertices and the costs of all edges
equal one, and the sum of vertex weights in any cluster is at most two.
7.6. Formulate the single depot vehicle routing problem (VRP) from Sect. 2.16 as a
set partitioning problem defined on a hypergraph whose vertices correspond to the
customers and whose hyperedges represent all feasible routes. Elaborate a branch-and-
price algorithm that solves the IP formulation of this set partitioning problem: write
down the master and pricing problems, and specify a branching rule.
7.7. Recall the problem of designing a reliable telecommunication net-
work from Sect. 7.5. Suppose now that the designed network will not reroute infor-
mation flows in case of failure of some of its elements. In the design of such networks,
to increase reliability, a diversification strategy is used, which requires that, for each
demand edge q = (v, w) ∈ D, no more than δ_q d_q (0 < δ_q ≤ 1) information units,
circulating between nodes v and w, pass through any channel and any node. The im-
plementation of this diversification strategy guarantees that the failure of one
channel or one node will not affect (1 − δ_q) · 100% of the total amount of information
between nodes v and w.
Write down a compact IP as a model of the problem of designing diversified
telecommunication networks.
7.8. Elaborate a branch-and-price algorithm for the facility location problem formu-
lated in Sect. 2.2.
7.9. Reformulate the problem of detailed placement from Sect. 2.8 as an IP so that the branch-and-price method can be used to solve it; write down the master and pricing problems, and show how to solve the pricing problem.
7.10. In Sect. 7.2, for IP (7.7), we established the equivalence of the two bounds z_LPM and z_CUT. Prove that the Lagrangian relaxation provides the same bound:

    z_LPM = z_CUT = z_LD def= min_{u ∈ R^m_+} L(u),

where

    L(u) = Σ_{k=1}^{K} max{(c^k − (A^k)^T u)^T x^k : x^k ∈ X^k} + u^T b.
7.11. Steiner tree problem. Given a graph G = (V, E) and a set T ⊆ V of terminals. A Steiner tree is a minimal (by inclusion) subgraph of G containing a path between every pair of terminals. Let each edge e ∈ E be assigned a cost c_e ≥ 0. The Steiner tree problem is to find in G a Steiner tree of minimum cost, where the cost of a tree is the sum of the costs of its edges.
Let P denote the set of all paths in G connecting any pair of terminals, and let V(P) and E(P) denote the sets of, respectively, vertices and edges on a path P.
Introducing two families of binary variables
• xe = 1 if e ∈ E is an edge of the Steiner tree, and xe = 0 otherwise,
• yP = 1 if P ∈ P is a path in the Steiner tree, and yP = 0 otherwise,
we formulate the Steiner tree problem as the following IP:
    Σ_{e∈E} c_e x_e → min,
    Σ_{P∈P: t∈V(P)} y_P ≥ 1,   t ∈ T,
    y_P ≤ x_e,   e ∈ E, P ∈ P, e ∈ E(P),          (7.23)
    y_P ∈ {0, 1},   P ∈ P,
    x_e ∈ {0, 1},   e ∈ E.
8 Optimization With Uncertain Parameters

Stochastic programming models include two types of variables: expected and adaptive. Expected variables represent decisions that are taken here-and-now: they do not depend on the future realization of the random parameters. Decisions described by adaptive variables are made after the values of the random parameters become known.
For example, consider the two-stage stochastic programming problem that is formulated as follows:

    c^T x + E(h(ω)^T y(ω)) → max,
    A(ω) x + G(ω) y(ω) ≤ b(ω),                    (8.1)
    x ∈ X,
    y(ω) ∈ R_+^{n_y}.
In (8.1) a decision for use in the current (first) period is represented by the vector x ∈ X of expected variables, where X ⊆ R^{n_x} is some set (for example, X = R_+^{n_x}, X = Z_+^{n_x}, or X = P(Ā, b̄; S)). A decision x ∈ X must be made in the current period, before an elementary event ω from the probability space (Ω, A, P) occurs in the next period.
A decision y(ω) is made in this next period after observing ω. Therefore, the vector
y of adaptive variables is a vector-function of ω. The system A(ω)x + G(ω)y(ω) ≤
b(ω) of stochastic constraints connects the expected and adaptive variables. The
objective function in (8.1) is the sum of two terms: deterministic cT x, estimating
the quality of the solution x, and the expected value E(h(ω)T y(ω)) of the random
variable h(ω)T y(ω), which estimates the quality of the solution y(ω).
Problem (8.1) can be reformulated as follows:

    f(x) → max,  x ∈ X,                           (8.2)

where f(x) = E(f(x, ω)), and the random variable f(x, ω) (see Exercise 8.3) is determined by the rule:

    f(x, ω) def= c^T x + max h(ω)^T y(ω),
            G(ω)y(ω) ≤ b(ω) − A(ω)x,              (8.3)
            y(ω) ∈ R_+^{n_y}.
If the sample space Ω is infinite, then computing f (x) can be a very difficult
problem. One approach is to approximate an infinite probability space with a finite
space. Discussion of how this is done is beyond the scope of this book. In what
follows we assume that Ω = {ω1 , . . . , ωK } is a finite set and the event (scenario)
ω_k occurs with probability p_k (k = 1, . . . , K). For k = 1, . . . , K, we introduce the following notation: h_k = h(ω_k), w_k = p_k h_k, A_k = A(ω_k), G_k = G(ω_k), b_k = b(ω_k), y_k = y(ω_k), n_k = n_y. The deterministic equivalent of the stochastic problem (8.1) is
written as follows:
    c^T x + Σ_{k=1}^{K} w_k^T y_k → max,
    A_k x + G_k y_k ≤ b_k,   k = 1, . . . , K,    (8.4)
    x ∈ X,
    y_k ∈ R_+^{n_k},   k = 1, . . . , K.
Having solved (8.4), we get a decision x for use in the current period. This decision, x, must be adequate for everything that can happen in the next period. If we knew which scenario ω_k would happen in the next period, we would solve the problem
    c^T x + h_k^T y_k → max,
    A_k x + G_k y_k ≤ b_k,
    x ∈ X,
    y_k ∈ R_+^{n_k}
that takes into account the constraints only for this scenario. But since we do not
know which scenario will be realized in the future, in (8.4) we require that the
constraints Ak x + Gk yk ≤ bk be valid for all scenarios k = 1, . . . , K.
If the number of scenarios K is very large, then (8.4) can be a very difficult opti-
mization problem. Obviously, with a huge number of scenarios, the effect of each
of them on the solution x to be made in the current period is different. The Benders
decomposition approach allows us to reformulate the problem in such a way that
information about the scenarios will be provided through cuts.
Let us first assume that the vector of expected variables x is fixed. Then we can find the values of the remaining variables y_k by solving K LPs:

    z_k(x) def= max{w_k^T y_k : G_k y_k ≤ b_k − A_k x, y_k ≥ 0},   k = 1, . . . , K.    (8.5)
Now let us write down the dual for each of these LPs:

    z_k(x) = min{u_k^T (b_k − A_k x) : G_k^T u_k ≥ w_k, u_k ≥ 0},   k = 1, . . . , K.

Let Q_k = {u_k ≥ 0 : G_k^T u_k ≥ w_k} denote the feasible polyhedron of the k-th dual, let V_k be the set of its vertices, and let C_k be the set of its extreme rays.
Theorem 8.1. If at least one polyhedron Qk is empty, then (8.4) does not have a
solution. If all polyhedra Qk (k = 1, . . . , K) are nonempty, then (8.4) is equivalent
to the following problem:
    η → max,                                                             (8.6a)
    c^T x + Σ_{k=1}^{K} u_k^T (b_k − A_k x) ≥ η,   (u_1, . . . , u_K) ∈ V_1 × · · · × V_K,   (8.6b)
    u_k^T (b_k − A_k x) ≥ 0,   u_k ∈ C_k,  k = 1, . . . , K,             (8.6c)
    x ∈ X,  η ∈ R.                                                       (8.6d)
Proof. If Q_k = ∅ for some k, then by Theorem 3.1 (of duality), for all x, LP (8.5) written for this k does not have a solution (its objective function is unbounded or there are no feasible solutions). Therefore, in this case, (8.4) does not have a solution either.
Suppose now that all sets Qk are nonempty. Then (8.4) can be formulated as
follows:
    c^T x + Σ_{k=1}^{K} min_{u_k ∈ Q_k} u_k^T (b_k − A_k x) → max,
    u_k^T (b_k − A_k x) ≥ 0,   u_k ∈ C_k,  k = 1, . . . , K,             (8.7)
    x ∈ X.
Introducing a new variable

    η = Σ_{k=1}^{K} min_{u_k ∈ Q_k} u_k^T (b_k − A_k x),

we arrive at Formulation (8.6). ⊓⊔
Problem (8.6) is Benders' reformulation of Problem (8.4). Despite the fact that the number of inequalities in (8.6b) and (8.6c) can be huge, we can still solve (8.6) by the branch-and-cut method that generates these inequalities, also known as Benders' cuts, in a separation procedure. Given an input vector x̄ ∈ R_+^{n_x} and a number η̄, this procedure can work as follows. For each k = 1, . . . , K, solve LP (8.5) with x = x̄:

    max{w_k^T y_k : G_k y_k ≤ b_k − A_k x̄, y_k ≥ 0},                    (8.8)

and let ū_k denote an optimal dual solution when this LP is solvable.
If for some k this LP has no feasible solutions, and ū_k is a certificate of infeasibility (see Sect. 3.4), then return the cut

    ū_k^T (b_k − A_k x) ≥ 0.
If (8.8) has an optimal solution ū_k for each k, and if Σ_{k=1}^{K} ū_k^T (b_k − A_k x̄) < η̄ − c^T x̄, then return the cut

    c^T x + Σ_{k=1}^{K} ū_k^T (b_k − A_k x) ≥ η.
Otherwise, (x̄, η̄) satisfies all inequalities in both families, (8.6b) and (8.6c).
What do we gain by the transition from the relatively compact Formulation (8.4) to Formulation (8.6) with its huge number of constraints? The obvious advantage of Benders' reformulation is that we substantially reduced the number of continuous variables: we excluded the vectors y_k for k = 1, . . . , K, and added only one new variable η. Another advantage is less obvious. As a rule, the fewer continuous variables we have, the stronger the cuts (for example, the fractional Gomory cuts) we can generate.
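To make the separation procedure concrete, here is a minimal sketch (ours, not from the book or MIPCL) of the separation step for a single scenario: it solves the dual of (8.5) with scipy.optimize.linprog and reports the coefficients of a violated optimality cut. The function name benders_cut and its interface are assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import linprog

def benders_cut(c, A, G, b, w, x_bar, eta_bar):
    """Separation for one scenario of (8.5): solve the dual
       min{u^T (b - A x_bar) : G^T u >= w, u >= 0}
    and return the multipliers u of a violated optimality cut
       c^T x + u^T (b - A x) >= eta,  or None if no cut is found."""
    rhs = b - A @ x_bar
    res = linprog(c=rhs, A_ub=-G.T, b_ub=-w, bounds=[(0, None)] * len(b))
    if not res.success:
        # dual unbounded means the primal (8.5) is infeasible at x_bar; a full
        # implementation would extract a ray u and return the feasibility cut
        # u^T (b - A x) >= 0
        return None
    u = res.x
    if c @ x_bar + u @ rhs < eta_bar - 1e-9:   # (8.6b) is violated at (x_bar, eta_bar)
        return u
    return None
```

In a full branch-and-cut run one would loop over k = 1, . . . , K and add every violated cut to the master problem (8.6).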
Example 8.1 We need to solve the following MIP, first carrying out Benders' reformulation:
The system

    2u_1 + u_2 ≥ 2,
    u_1 − u_2 ≥ −2,        (G_1^T u ≥ w_1)
    u_1 + 2u_2 ≥ 2

defines the polyhedron Q_1 depicted in Fig. 8.1. This polyhedron has three vertices:

    u^1 = (0, 2)^T,   u^2 = (2/3, 2/3)^T,   u^3 = (2, 0)^T,
and two extreme rays, (1, 0)^T and (1, 1)^T. Benders' reformulation of our example is:

    η → max,
    x_1 + 3x_2 + 2(−2 + x_1 − x_2) ≥ η,
    x_1 + 3x_2 + (2/3)(10 − 2x_1 − 4x_2) + (2/3)(−2 + x_1 − x_2) ≥ η,
    x_1 + 3x_2 + 2(10 − 2x_1 − 4x_2) ≥ η,
    10 − 2x_1 − 4x_2 ≥ 0,
    (10 − 2x_1 − 4x_2) + (−2 + x_1 − x_2) ≥ 0,
    x_1, x_2 ∈ Z_+,  x_1 ≤ 3.

Rearranging, we obtain:
    η → max,
    η − 3x_1 − x_2 ≤ −4,
    3η − x_1 + x_2 ≤ 16,
    η + 3x_1 + 5x_2 ≤ 20,
    x_1 + 2x_2 ≤ 5,                               (8.10)
    x_1 + 5x_2 ≤ 8,
    x_1 ≤ 3,
    x_1, x_2 ∈ Z_+.
MIP (8.10) can be solved quite easily. From the inequality x_1 + 5x_2 ≤ 8, by integrality of x_2, we have x_2 ≤ 1. Taking into account the inequalities 0 ≤ x_1 ≤ 3 and 0 ≤ x_2 ≤ 1, from the first three inequalities we calculate an upper bound for η:

    η ≤ −4 + 3x_1 + x_2 ≤ 6,
    η ≤ (16 + x_1 − x_2)/3 ≤ 19/3,
    η ≤ 20 − 3x_1 − 5x_2 ≤ 20.
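Since x_1 ∈ {0, . . . , 3} and x_2 ∈ {0, 1}, the whole feasible set of (8.10) can be enumerated directly; the following throwaway check (ours, not part of the book) confirms the bounds above.

```python
# Enumerate the integer points of MIP (8.10) and keep the best value of eta.
def solve_810():
    best = None
    for x1 in range(4):              # 0 <= x1 <= 3
        for x2 in range(2):          # x1 + 5*x2 <= 8 forces x2 <= 1
            if x1 + 2 * x2 > 5 or x1 + 5 * x2 > 8:
                continue
            # eta is bounded by the first three inequalities of (8.10)
            eta = min(-4 + 3 * x1 + x2, (16 + x1 - x2) / 3, 20 - 3 * x1 - 5 * x2)
            if best is None or eta > best[0]:
                best = (eta, x1, x2)
    return best

print(solve_810())  # → (6, 3, 1): the optimum eta = 6 is attained at x = (3, 1)
```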
8.3 Risks
Maximization of the expected profit implicitly assumes that the decision making process is repeated a sufficiently large number of times under the same conditions. Only then do asymptotic statements, such as the law of large numbers, guarantee the convergence in probability of random variables to their expected values. In other situations, we cannot ignore the risk of obtaining a profit that is significantly lower than the expected value. The identification of suitable risk measures is the subject of active research. We are not going to investigate the problem of risk modeling in
its entirety, but we will only discuss one concept of risk that is convenient for use in
optimization models.
Here we will try to extend the two-stage stochastic programming model (8.1) by adding to it a system of inequalities that limits the risk of the decision x. The concept of risk is more convenient to introduce in terms of the loss function g(x, ω), which depends on the decision x and is a random variable defined on some probability space (Ω, A, P) (ω ∈ Ω).
Historically, the first and perhaps most famous notion of risk was introduced by H. Markowitz, Nobel Prize winner in Economics in 1990. He defined the risk as the variance of the random loss:

    Risk(x) = E (g(x, ω) − E g(x, ω))^2.
Conceptually, this measure of risk has several drawbacks. The most important of them is that the measure is symmetric: it penalizes equally for losses both smaller and larger than the expected value. From the point of view of use in MIP, a drawback is that this risk measure introduces a quadratic (nonlinear) constraint into optimization models.
Another no less well-known risk measure, called the Value-at-Risk, was developed by the financial engineers of J. P. Morgan. Let

    G(x, η) def= P{ω ∈ Ω : g(x, ω) ≤ η}

be the distribution function of the random variable g(x, ω). For a given probability 0 < α < 1, the risk of making a decision x is

    VaR_α(x) def= min{η : G(x, η) ≥ α}.
For example, let

    Ω = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9},

where the first three events, 0, 1, 2, occur with probability 1/6 each, the next two events, 3, 4, with probability 1/8 each, and the remaining events, 5, 6, 7, 8, 9, with probability 1/20 each.
Let α = 0.9 and g(x, ω) = x − ω for x ∈ Z+ . As ωk = k − 1, then for x = 4 we have
gk (4) = g(4, k − 1) = 5 − k, k = 1, . . . , 10. Next we need to sort the values gk (4):
    i             1    2    3    4    5    6     7     8     9     10
    π(i)         10    9    8    7    6    5     4     3     2     1
    g_{π(i)}(4)  −5   −4   −3   −2   −1    0     1     2     3     4
    p_{π(i)}    1/6  1/6  1/6  1/8  1/8  1/20  1/20  1/20  1/20  1/20
Since

    Σ_{i=1}^{8} p_{π(i)} = 3 · (1/6) + 2 · (1/8) + 3 · (1/20) = 0.9,

we conclude that VaR_{0.9}(4) = g_{π(8)}(4) = 2.
A closely related measure, the Conditional Value-at-Risk, is defined as CVaR_α(x) def= min_{η ∈ R} g_α(x, η), where

    g_α(x, η) def= η + (1/(1 − α)) ∫_Ω max{g(x, ω) − η, 0} P(dω).
For a finite sample space Ω = {ω_1, . . . , ω_K} this becomes

    CVaR_α(x) = min_{η ∈ R} { η + (1/(1 − α)) Σ_{k=1}^{K} p_k max{g_k(x) − η, 0} },    (8.11)

where g_k(x) def= g(x, ω_k).
Continuing the example in which we calculated VaR0.9 (4) = 2, we compute
Again, consider the two-stage stochastic program (8.1). But now we want to maxi-
mize the expected profit by limiting the risk: CVaRα (x) ≤ r, where r is the maximum
allowable risk level. Introducing new variables zk to represent max{gk (x) − η, 0} in
Formula (8.11), we can extend the deterministic equivalent (8.4) as follows:
    c^T x + Σ_{k=1}^{K} w_k^T y_k → max,
    A_k x + G_k y_k ≤ b_k,   k = 1, . . . , K,
    η + (1/(1 − α)) Σ_{k=1}^{K} p_k z_k ≤ r,      (8.12)
    g_k(x) − η − z_k ≤ 0,   k = 1, . . . , K,
    η ∈ R,  x ∈ X,
    z_k ≥ 0,  y_k ∈ R_+^{n_k},   k = 1, . . . , K.
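For a discrete loss distribution, both VaR_α and CVaR_α can be computed directly by sorting, exactly as in the table above; the helper below is our own sketch (the name var_cvar is an assumption), not code from the book.

```python
import numpy as np

def var_cvar(losses, probs, alpha):
    """VaR_alpha and CVaR_alpha of a discrete loss distribution:
    VaR is the smallest eta with P(g <= eta) >= alpha, and CVaR evaluates
    eta + (1/(1-alpha)) * sum_k p_k * max(g_k - eta, 0) at eta = VaR."""
    order = np.argsort(losses)
    g = np.asarray(losses, dtype=float)[order]
    p = np.asarray(probs, dtype=float)[order]
    j = np.searchsorted(np.cumsum(p), alpha)   # first index with cum. prob. >= alpha
    eta = g[j]
    cvar = eta + (np.maximum(g - eta, 0.0) @ p) / (1.0 - alpha)
    return eta, cvar
```

For instance, four equally likely losses 1, 2, 3, 4 give VaR_{0.7} = 3 and CVaR_{0.7} = 3 + 0.25/0.3 ≈ 3.83.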
Program (8.12) is a MIP if the functions gk (x) are linear and X is a mixed-integer
set P(Ā, b̄; S). Of particular interest is also the case when g(x, ω) = − f (x, ω), where
f (x, ω) is defined in (8.3) and is a nonlinear function. Then (8.12) can be rewritten
as follows:
    c^T x + Σ_{k=1}^{K} w_k^T y_k → max,
    A_k x + G_k y_k ≤ b_k,   k = 1, . . . , K,
    η + (1/(1 − α)) Σ_{k=1}^{K} p_k z_k ≤ r,      (8.13)
    c^T x + h_k^T y_k + η + z_k ≥ 0,   k = 1, . . . , K,
    η ∈ R,  x ∈ X,
    z_k ≥ 0,  y_k ∈ R_+^{n_k},   k = 1, . . . , K.
Let us verify that both programs, (8.12) and (8.13), are equivalent when g(x, ω) =
− f (x, ω). By definition
and since
    c^T x + Σ_{k=1}^{K} w_k^T y_k = Σ_{k=1}^{K} p_k (c^T x + h_k^T y_k),
then in an optimal solution (x∗ ; y∗1 , . . . , y∗K ; η ∗ ; z∗1 , . . . , z∗K ) to (8.13) each point y∗k is
an optimal solution to the LP
max{hTk yk : Gk yk ≤ bk − Ak x∗ },
Credit risk is the risk caused by obligors failing to fully fulfill their obligations, or by a decrease in the market price of assets due to a fall in credit ratings. For example, a portfolio of bonds from emerging markets (Brazil, India, Russia, etc.) will most likely generate revenue, but at the same time there is a small probability of large losses. For such investments, the distribution functions of returns (future incomes) are asymmetric and, consequently, symmetric risk measures are not entirely appropriate here. But the VaR measure (as well as its derivative, CVaR) was invented precisely to assess the risks in such situations.
Consider the problem of optimizing a portfolio of n potential investments (such as shares). We need to determine the share x_j of each investment j in the portfolio. Then the portfolio is represented by the vector x = (x_1, . . . , x_n)^T. The set X of feasible solutions (portfolios) is described by the system

    Σ_{j=1}^{n} x_j = 1,
    l_j ≤ x_j ≤ u_j,   j = 1, . . . , n,
    g_k(x) = (q − µ^k)^T x,

where q is the return vector provided that the credit rating of each investment does not change. We define the risk of the portfolio x to be CVaR_α(x) and limit this risk by a given value r. Note that with this setting, the "security level" of our portfolio x is determined by choosing two parameters, α and r.
Under the above assumptions, the problem of maximizing the expected return of
the portfolio at a limited risk is written as follows:
    µ^T x → max,                                                  (8.14a)
    η + (1/(1 − α)) Σ_{k=1}^{K} p_k z_k ≤ r,                      (8.14b)
    (q − µ^k)^T x − η − z_k ≤ 0,   k = 1, . . . , K,              (8.14c)
    Σ_{j=1}^{n} x_j = 1,                                          (8.14d)
    l_j ≤ x_j ≤ u_j,   j = 1, . . . , n,                          (8.14e)
    z_k ≥ 0,   k = 1, . . . , K,                                  (8.14f)
    η ∈ R.                                                        (8.14g)
Note that (8.14) is an LP. However, it can turn into a MIP after taking into account a number of additional logical conditions that are standard in portfolio optimization. One of these conditions is the requirement to diversify the investments. Suppose that the set N = {1, . . . , n} of all investments is divided into subsets (groups) N_1, . . . , N_m, say, according to a sectoral or territorial principle. It is required that no more than n_i investments from group N_i be present in the portfolio, and also that the portfolio have investments from at least s groups.
We introduce two families of binary variables:
• y j = 1 if investment j is present in the portfolio, and y j = 0 otherwise ( j =
1, . . . , n);
• δi = 1 if at least one investment from group i is present in the portfolio, and δi = 0
otherwise (i = 1, . . . , m).
To take into account the above requirements, we need to replace (8.14e) with the following system:

    l_j y_j ≤ x_j ≤ u_j y_j,   j = 1, . . . , n,
    Σ_{j∈N_i} y_j ≤ n_i,   i = 1, . . . , m,
    y_j ≤ δ_i,   j ∈ N_i,  i = 1, . . . , m,
    Σ_{i=1}^{m} δ_i ≥ s,
    y_j ∈ {0, 1},   j = 1, . . . , n,
    δ_i ∈ {0, 1},   i = 1, . . . , m.
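The diversification system is easy to sanity-check on candidate selections; the small verifier below (our illustration, with hypothetical names) mirrors the constraints on the y and δ variables.

```python
def diversified(y, groups, n_max, s):
    """y[j] = 1 iff investment j is in the portfolio; groups[i] is the index
    set N_i; n_max[i] caps the number of investments taken from group i; at
    least s groups must contribute (delta_i is implied by the group counts)."""
    counts = [sum(y[j] for j in Ni) for Ni in groups]
    caps_ok = all(cnt <= cap for cnt, cap in zip(counts, n_max))
    return caps_ok and sum(1 for cnt in counts if cnt > 0) >= s
```

With two groups {0, 1} and {2, 3}, caps n_max = (1, 1) and s = 2, the selection y = (1, 0, 1, 0) is admissible while y = (1, 1, 0, 0) is not.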
8.4 Multistage Stochastic Programming Problems

Although multistage stochastic programming models have been studied for several decades, for a long time they could not be used in practice to solve problems of realistic size. Only recently, with the advent of sufficiently powerful computers, have stochastic programming models begun to be used in practice, and stochastic programming itself has begun to develop at a rapid pace.
Multistage stochastic programming problems are applied when the planning horizon includes more than one period (stage). Let T denote the number of periods, and let ω^t ∈ Ω^t be the events that can occur in period t, t = 1, . . . , T. At the beginning of the planning horizon (at stage 0), an expected decision x is taken, when the event ω^1 has not yet occurred. A decision y(ω^1, . . . , ω^t) is made at stage t, when the events ω^1, . . . , ω^t have already occurred and the event ω^{t+1} has not yet happened. The decision y(ω^1, . . . , ω^t) depends on the decision y(ω^1, . . . , ω^{t−1}) made at the previous stage. The multistage model is written as follows:
    c^T x + Σ_{t=1}^{T} h(ω^1, . . . , ω^t)^T y(ω^1, . . . , ω^t) → max,
    A(ω^1) x + G(ω^1) y(ω^1) ≤ b(ω^1),
    A(ω^1, ω^2) y(ω^1) + G(ω^1, ω^2) y(ω^1, ω^2) ≤ b(ω^1, ω^2),
    A(ω^1, ω^2, ω^3) y(ω^1, ω^2) + G(ω^1, ω^2, ω^3) y(ω^1, ω^2, ω^3) ≤ b(ω^1, ω^2, ω^3),
    . . . . . .                                                        (8.15)
    A(ω^1, . . . , ω^T) y(ω^1, . . . , ω^{T−1}) + G(ω^1, . . . , ω^T) y(ω^1, . . . , ω^T) ≤ b(ω^1, . . . , ω^T),
    x ∈ X,
    y(ω^1, . . . , ω^t) ∈ Y_t,   t = 1, . . . , T.
    Σ_{(ω^1, . . . , ω^t) ∈ Ω^1 × · · · × Ω^t} p(ω^1, . . . , ω^t) = 1.
Since many of these probabilities may be zeros, to formulate the deterministic equiv-
alent of the stochastic problem (8.15), it is convenient to introduce the concept of
the scenario tree.
The nodes of the scenario tree are numbered from 0 to n. Node 0 is the root of the
tree. The nodes that are at a distance of t from the root belong to stage t. We denote
by t(i) the stage to which node i belongs. We assume that the edges are oriented
in the direction from the root to the leaves, and the directed edges are called arcs.
Note that any node j, except for the root, is entered by only one arc (i, j), and then
node i is called the parent of node j and is denoted by parent( j). Each arc (i, j) of
the tree is associated with an event ω(i, j) from Ω t(i) . The problem input data are
distributed among the tree nodes as follows. The set X and the vector c0 = c are
assigned to node 0. Each of the remaining nodes, j ∈ {1, . . . n}, is associated with
the following parameters:
    p_j def= p(ω(0, i_1), ω(i_1, i_2), . . . , ω(i_{t(j)−1}, j)),
    c_j def= h(ω(0, i_1), ω(i_1, i_2), . . . , ω(i_{t(j)−1}, j)) · p_j,
    b_j def= b(ω(0, i_1), ω(i_1, i_2), . . . , ω(i_{t(j)−1}, j)),
    A_j def= A(ω(0, i_1), ω(i_1, i_2), . . . , ω(i_{t(j)−1}, j)),
    G_j def= G(ω(0, i_1), ω(i_1, i_2), . . . , ω(i_{t(j)−1}, j)),
where the sequence (0, i_1, . . . , i_{t(j)−1}, j) specifies the unique path in the tree leading from the root (node 0) to node j. Note that, by definition of p_j, the following equations hold:

    Σ_{j: t(j)=τ} p_j = 1,   τ = 1, . . . , T.

The deterministic equivalent of (8.15) is then written as follows:

    c_0^T x_0 + Σ_{j=1}^{n} c_j^T x_j → max,
    A_j x_{parent(j)} + G_j x_j ≤ b_j,   j = 1, . . . , n,             (8.16)
    x_0 ∈ X,
    x_j ∈ Y_{t(j)},   j = 1, . . . , n.
It should be noted that (8.16) is a MIP only if X and all Y_t are polyhedral mixed-integer sets, i.e., X = P(Ā_0, b̄_0; S_0) and Y_t = P(Ā_t, b̄_t; S_t) for t = 1, . . . , T. In the next two sections we consider two concrete examples of multistage stochastic programming problems.
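The node probabilities p_j are simply products of period probabilities along root-to-node paths, so they are cheap to tabulate. A toy sketch (ours; the encoding — a parent array with parent[j] < j and per-arc probabilities — is an assumption):

```python
def path_probs(parent, arc_prob):
    """p[j] = product of arc probabilities on the path from the root (node 0)
    to node j; nodes must be numbered so that parent[j] < j, with parent[0] = -1
    and arc_prob[0] unused."""
    p = [1.0] * len(parent)
    for j in range(1, len(parent)):
        p[j] = p[parent[j]] * arc_prob[j]
    return p
```

On a two-stage tree where nodes 1 and 2 are reached with probability 1/2 each and their children with probabilities 2/3 and 1/3, the four leaves get p = 1/3, 1/6, 1/3, 1/6, which sum to 1 as the equations above require.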
8.5 Synthetic Options

When forming an investment portfolio, one of the most important goals is to prevent the portfolio yield from falling below some critical level. This can be done by including in the portfolio derivative financial assets, such as options. In situations where derivative assets are not available, we can achieve the desired result by forming a portfolio based on the "synthetic option" strategy.
The input parameters of the portfolio optimization problem are the following:
• n: number of assets;
• T : number of periods in the planning horizon, period t begins at time t − 1 and
ends at time t;
• z0 : amount of cash at the beginning of the planning horizon;
• xi0 : amount of money invested in asset i at the beginning of the planning horizon;
• R: interest on capital (1 + rate of interest) in terms of one period;
• r_{it} = r_i(ω^1, . . . , ω^t): random return (per invested dollar) of asset i in period t;
• ρit : transaction cost when buying and selling asset i in period t; it is assumed that
all transactions are made at the very beginning of each period;
• qi : maximum share of asset i in the portfolio.
Expected variables:
• x_{i1}^b: amount of money spent in period 1 on buying asset i;
• x_{i1}^s: amount of money received in period 1 from selling asset i.
Adaptive variables:
• x_{it} = x_i(ω^1, . . . , ω^t): amount of money invested in asset i in period t, t = 1, . . . , T;
• z_t = z(ω^1, . . . , ω^t): amount of cash at the end of period t, t = 1, . . . , T;
• x_{it}^b = x_i^b(ω^1, . . . , ω^{t−1}): amount of money spent in period t on buying asset i, i = 1, . . . , n, t = 2, . . . , T;
• x_{it}^s = x_i^s(ω^1, . . . , ω^{t−1}): amount of money received in period t from selling asset i, i = 1, . . . , n, t = 2, . . . , T;
• ξ = ξ(ω^1, . . . , ω^T): random component of the portfolio value at the end of the planning horizon (at the end of period T);
• w: risk-free component (attained if the worst scenario occurs) of the portfolio value at the end of the planning horizon.
In terms of the selected variables, the portfolio optimization problem is formulated as follows:
Example 8.2 An investor wants to invest an amount z_0 in one risky asset. The planning horizon consists of T = 2 periods. As in the general model, let R denote the interest on capital in one period. In period 1 the return of the asset is r_1^+ or r_1^− with equal probability, and in period 2 the return is r_2^+ with probability 2/3 and r_2^− with probability 1/3. The transaction cost when buying or selling one asset unit is constant and equal to ρ. It is necessary to write a deterministic equivalent for (8.17) applied to this investment problem.
[Figure: the scenario tree for this example — the root, node 0, has children 1 and 2 (probability 1/2 each); node 1 has children 3 and 4, and node 2 has children 5 and 6, with probabilities 2/3 and 1/3, respectively.]
    z_2 + (1 − ρ)x_2^s − (1 + ρ)x_2^b = (1/R) z_5,       (node 5)
    x_2 + x_2^b − x_2^s = (1/r_2^+) x_5,
    z_2 + (1 − ρ)x_2^s − (1 + ρ)x_2^b = (1/R) z_6,       (node 6)
    x_2 + x_2^b − x_2^s = (1/r_2^−) x_6,
    z_3 + (1 − ρ)x_3 = w + ξ_3,
    z_4 + (1 − ρ)x_4 = w + ξ_4,       (isolating the risk-free part
    z_5 + (1 − ρ)x_5 = w + ξ_5,        of the portfolio value)
    z_6 + (1 − ρ)x_6 = w + ξ_6,
    x_1, z_1, x_2, z_2, x_3, z_3, x_4, z_4, x_5, z_5, x_6, z_6 ≥ 0,
    x_0^b, x_0^s, x_1^b, x_1^s, x_2^b, x_2^s ≥ 0,
    ξ_3, ξ_4, ξ_5, ξ_6 ≥ 0.   ⊓⊔
8.6 Yield Management

D = 60 days can be divided into T = 4 periods of length 30, 20, 7 and 3 days). The airline can use one of K available planes; the seats in all planes are divided into the same number, I, of classes. Plane k (k = 1, . . . , K) costs f_k to hire and has q_{ki} seats of class i (i = 1, . . . , I). For example, plane k may have q_{k,1} = 30 first class seats, q_{k,2} = 40 business class seats, and q_{k,3} = 60 economy class seats. In plane k, up to r_{ki}^l and r_{ki}^h seats of class i can be transformed into seats of the lower, i − 1, and higher, i + 1, classes, i = 1, . . . , I. It is assumed that r_{k,1}^l = 0 and r_{k,I}^h = 0.
For administrative simplicity, in each period t (t = 1, . . . , T ) only O price options
can be used, and let ctio denote the price of a seat of class i (i = 1, . . . , I) in period t
if option o is used.
Demand is uncertain but it is affected by ticket prices. Let us assume that S scenarios are possible in each period. The probability of scenario s (1 ≤ s ≤ S) in period t is p_{ts}, with Σ_{s=1}^{S} p_{ts} = 1. The results of demand forecasting are at our disposal:
if scenario s occurs in period t, and price option o is used in this period, then the
demand for seats of class i will be dtsoi .
We have to choose a plane to hire, and to decide, for each of T periods, which
price option to use, how many seats to sell in each class (depending on demand).
Our goal is to maximize the expected yield.
Let us also note that period t starts at time t − 1 and ends at time t. Therefore, it is
assumed that the decision which option to use in period t is made at time t − 1. The
other decision how many seats of each class to sell depends on the demand in this
period; therefore, this decision is assumed to be made at time t (the end of period t).
To write a deterministic model for this stochastic problem, we need to describe a scenario tree. In this application the scenario tree has n + 1 = Σ_{t=0}^{T} |V_t| nodes, where V_t denotes the set of nodes at level t, t = 0, 1, . . . , T. Let us also assume that the root of the scenario tree is indexed by 0; then V_0 = {0} and V = ∪_{t=0}^{T} V_t.
Each node j ∈ V_t (t = 1, . . . , T) corresponds to one of the histories, h(j) = (s_1, s_2, . . . , s_t), that may happen after t periods, where s_τ is the index of a scenario for period τ. By definition, the history of the root node is empty, h(0) = (). The parent of node j, denoted by parent(j), is the node in V_{t−1} whose history is (s_1, s_2, . . . , s_{t−1}), i.e., h(j) = (h(parent(j)), s_t). Note that the root node 0 is the parent of all nodes in V_1 (of level 1). In what follows, if we say that something is done at node j, this means that it is done when the history h(j) is realized.
For j ∈ V \ {0}, the likelihood of history h(j) is p̄_j def= Π_{τ=1}^{t} p_{τ,s_τ}; p̄_0 def= 1. If price option o is used at node j, the demand for seats of class i is d̄_{joi} def= d_{t,s_t,o,i}, and their price is c_{toi}. Let us define c̄_{joi} def= p̄_j c_{toi}.
Now we introduce the variables. The first family of binary variables is to decide
which plane to use. For each plane k we define
• vk = 1 if plane k is used, and vk = 0 otherwise.
Having hired a plane, we need to decide how to transform the seats in that plane.
So, we define two families of integer variables:
• w_i^l: number of seats of class i to be transformed into seats of class i − 1, i = 2, . . . , I;
Objective (8.18a) is to maximize the profit from selling seats minus the ex-
penses for hiring a plane. Equation (8.18b) prescribes to hire just one plane, and
Eqs. (8.18c) determine the capacities of all seat classes in the hired plane. Inequal-
ities (8.18d) and (8.18e) restrict the number of seats in any class that can be trans-
formed into seats of the lower and higher classes. Equations (8.18f) prescribe to choose only one price option at each non-leaf node. The variable upper bounds
(8.18g) guarantee that, in any period and for any price option, the number of sold
seats of each class does not exceed the demand for these seats. Equations (8.18h)
calculate the total number of seats in each class that are sold in any of T periods.
Inequalities (8.18i)–(8.18k) imply that, for each class, the number of sold seats plus
the number of seats transformed into seats of the adjacent classes does not exceed
the total number of seats of this class plus the number of seats of the adjacent classes
transformed into seats of this class.
When (8.18) is solved, we know which option, o_1, to use and how many seats of each class to offer for sale in period 1. When period 1 is over, we will know the actual
number of seats, s1i , of each class i sold in this period. To determine a price option
and the number of seats of each class for sale in period 2, we will solve a new
planning problem for the time horizon that extends from period 2 to the end of the
planning horizon. Writing (8.18) for this new problem, we need to modify (8.18c)
in order to take into account the seats sold in period 1:
    u_i = Σ_{k=1}^{K} q_{ki} v_k − s_{1i},   i = 1, . . . , I.
Similarly, we will determine a price option and the number of seats of each class for
sale in any subsequent period t when period t − 1 is over.
8.7 Robust MIPs

    c^T x → min,
    A x ≤ b for all A ∈ A,
    l ≤ x ≤ u,                                    (8.19)
    x_j ∈ Z,   j ∈ S,
    z → min,
    c^T x − z ≤ 0.
1 A conic linear program is a problem of minimizing a linear function over the intersection of an affine subspace and a convex cone. In particular, an LP in standard form is a conic linear program whose convex cone is polyhedral.
    c^T x → min,
    sup_{a ∈ A_i} a^T x ≤ b_i,   i = 1, . . . , m,    (8.20)
    l ≤ x ≤ u,
    x_j ∈ Z,   j ∈ S,

where A = {A : A_i ∈ A_i, i = 1, . . . , m}.
To separate a point x̃ from the set X_i = {x : a^T x ≤ b_i for all a ∈ A_i}, we solve the problem

    max{x̃^T a : a ∈ A_i}.                        (8.21)

To simplify further arguments, let us assume that this problem has a solution, denoted a(x̃). If a(x̃)^T x̃ > b_i, then the inequality a(x̃)^T x ≤ b_i is valid for X_i but not for x̃; otherwise, x̃ belongs to X_i.
For example, let us consider the case of ellipsoidal uncertainties when, for i = 1, . . . , m, A_i = {a ∈ R^n : a = a_i + P_i u, ‖u‖ ≤ 1} with a_i ∈ R^n and P_i a symmetric and positive definite n × n-matrix. Then (8.21) is rewritten as follows:

    max{x̃^T (a_i + P_i u) : ‖u‖ ≤ 1} = a_i^T x̃ + ‖P_i x̃‖.

The point u^∗ = (1/‖P_i x̃‖) P_i x̃ is the unique optimal solution to this problem (prove this!). Therefore, for a(x̃) = a_i + (1/‖P_i x̃‖) P_i^2 x̃, if a(x̃)^T x̃ = a_i^T x̃ + ‖P_i x̃‖ > b_i, then the inequality a(x̃)^T x ≤ b_i is valid for X_i but not for x̃; otherwise, x̃ belongs to X_i.
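Since the maximizer has the closed form above, the separation step for ellipsoidal uncertainty needs no LP solver at all; here is a minimal numpy sketch (ours; the name ellipsoid_cut is an assumption):

```python
import numpy as np

def ellipsoid_cut(a_i, P_i, b_i, x_tilde, tol=1e-9):
    """For A_i = {a_i + P_i u : ||u|| <= 1} with P_i symmetric positive definite,
    return the coefficients a(x~) = a_i + P_i^2 x~ / ||P_i x~|| of a violated
    inequality a(x~)^T x <= b_i, or None when x~ already belongs to X_i."""
    Px = P_i @ x_tilde
    norm = np.linalg.norm(Px)
    a_x = a_i + (P_i @ Px) / norm if norm > tol else a_i
    return a_x if a_x @ x_tilde > b_i + tol else None
```

For a_i = (1, 0)^T, P_i = I and b_i = 1, the point x̃ = (1, 0)^T is cut off by 2x_1 ≤ 1, while x̃ = (0.4, 0)^T is already feasible.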
    c^T x → min,
    g_i^T z^i ≤ b_i,   i = 1, . . . , m,
    H_i^T z^i = x,   i = 1, . . . , m,            (8.22)
    z^i ≥ 0,   i = 1, . . . , m,
    l ≤ x ≤ u,
    x_j ∈ Z,   j ∈ S,
as an LP with a as the vector of variables, and then write down its dual:

    γ_i = max{x^T a : H_i a ≤ g_i} = min{g_i^T z^i : H_i^T z^i = x, z^i ≥ 0}.

Substituting these expressions into (8.20), we obtain (8.22). ⊓⊔
In this section we consider a model in which the uncertain parameters of the con-
straint matrix take values from given intervals, and the level of conservatism is ex-
pressed by the combinatorial requirement that the number of uncertain parameters
with values different from the standard ones is limited in each row of the constraint
matrix.
Let us consider (8.20) when, for i = 1, . . . , m,

    A_i = {(a_{i1}, . . . , a_{in})^T : a_{ij} ∈ [ā_{ij} − α_{ij}, ā_{ij} + α_{ij}] for j = 1, . . . , n,
           Σ_{j=1}^{n} |a_{ij} − ā_{ij}| / α_{ij} ≤ q_i}.                (8.23)

Equivalently, (a_{i1}, . . . , a_{in})^T ∈ A_i if and only if, for some δ_{i1}, . . . , δ_{in},

    ā_{ij} − α_{ij} δ_{ij} ≤ a_{ij} ≤ ā_{ij} + α_{ij} δ_{ij},   j = 1, . . . , n,
    Σ_{j=1}^{n} δ_{ij} ≤ q_i,
    0 ≤ δ_{ij} ≤ 1,   j = 1, . . . , n.
Theorem 8.4. Robust MIP (8.20) with the uncertainties given by (8.23) is equiva-
lent to the following MIP:
    Σ_{j=1}^{n} c_j x_j → min,                                          (8.24a)
    Σ_{j=1}^{n} ā_{ij} x_j + q_i v_i + Σ_{j=1}^{n} w_{ij} ≤ b_i,   i = 1, . . . , m,   (8.24b)
    v_i + w_{ij} ≥ α_{ij} y_j,   j = 1, . . . , n,  i = 1, . . . , m,   (8.24c)
    −y_j ≤ x_j ≤ y_j,   j = 1, . . . , n,                               (8.24d)
    l_j ≤ x_j ≤ u_j,   j = 1, . . . , n,                                (8.24e)
    x_j ∈ Z,   j ∈ S,                                                   (8.24f)
    v_i ≥ 0,   i = 1, . . . , m,                                        (8.24g)
    w_{ij} ≥ 0,   i = 1, . . . , m,  j = 1, . . . , n.                  (8.24h)
Proof. For a fixed x, the value γ_i(x) def= sup_{a ∈ A_i} a^T x is the optimal value of the LP with variables a_{ij} and δ_{ij}:

    max Σ_{j=1}^{n} a_{ij} x_j,
    ā_{ij} − α_{ij} δ_{ij} ≤ a_{ij} ≤ ā_{ij} + α_{ij} δ_{ij},   j = 1, . . . , n,
    Σ_{j=1}^{n} δ_{ij} ≤ q_i,
    0 ≤ δ_{ij} ≤ 1,   j = 1, . . . , n.

It is easy to see that this LP has an optimal solution (a_{i1}, . . . , a_{in}; δ_{i1}, . . . , δ_{in}) such that

    a_{ij} − ā_{ij} = α_{ij} δ_{ij}   if x_j ≥ 0,
    a_{ij} − ā_{ij} = −α_{ij} δ_{ij}  if x_j < 0.
Therefore, a_{ij} x_j = ā_{ij} x_j + α_{ij} δ_{ij} |x_j|, and by Theorem 3.1 (of duality) we have
    γ_i(x) − Σ_{j=1}^{n} ā_{ij} x_j
    = max{ Σ_{j=1}^{n} α_{ij} |x_j| δ_{ij} : Σ_{j=1}^{n} δ_{ij} ≤ q_i,  0 ≤ δ_{ij} ≤ 1 for j = 1, . . . , n }   (8.25)
    = min{ q_i v_i + Σ_{j=1}^{n} w_{ij} : v_i + w_{ij} ≥ α_{ij} |x_j|,  v_i ≥ 0,  w_{ij} ≥ 0 for j = 1, . . . , n }.
Now, to get (8.24), it remains to replace sup_{a ∈ A_i} a^T x in (8.20) with the above expression for γ_i(x). Note that the variables y_j are introduced in (8.24) to represent the absolute values |x_j|. Therefore, for all non-negative variables x_j, it is better to substitute x_j for y_j. ⊓⊔
If the value of the parameter q_i is a non-negative integer, then the first LP in (8.25) always has an integer optimal solution and, therefore, the definition of A_i means that no more than q_i of the uncertain elements a_{ij} (those with α_{ij} > 0) in row i may take values other than ā_{ij}.
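Because the first LP in (8.25) is a continuous knapsack, its optimal value — the "protection" added to row i — can also be computed greedily; a small sketch (ours, for illustration only):

```python
def protection(alpha, x, q):
    """Optimal value of the inner LP in (8.25):
    max{sum_j alpha_j*|x_j|*delta_j : sum_j delta_j <= q, 0 <= delta_j <= 1}.
    Greedy solution: take the q largest terms alpha_j*|x_j|, plus a
    fractional piece of the next one when q is not an integer."""
    vals = sorted((a * abs(v) for a, v in zip(alpha, x)), reverse=True)
    full = min(int(q), len(vals))
    total = sum(vals[:full])
    if full < len(vals):
        total += (q - full) * vals[full]   # zero when q is integral
    return total
```

For instance, protection([1, 1, 1], [1, 0, 1], 1) = 1: with budget q = 1, only the largest deviation term counts.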
Example 8.3 We need to solve the following robust IP
when, in each inequality, at most one non-zero coefficient can vary by not more than one.
Solution. In this example, q_1 = q_2 = 1, α_{1,1} = α_{1,2} = α_{1,3} = 1, α_{2,1} = α_{2,2} = 1, and α_{2,3} = 0. Now, let us write down (8.24) applied to our instance:
Note that here we have excluded the variables y j since all the variables x j are non-
negative and, consequently, for any optimal solution, y j = x j for j = 1, 2, 3.
It is easy to verify that an optimal solution to (8.27) has the following compo-
nents:
    x_1 = 1,  x_2 = 0,  x_3 = 1,  v_1 = v_2 = 1,
    w_{11} = w_{12} = w_{13} = w_{21} = w_{22} = 0.
Consequently, the point $x^* = (1, 0, 1)^T$ is a solution to our robust program. It is worth noting that $x^*$ would not be an optimal solution to (8.26) if that program were not required to be robust. ⊓⊔
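Since instance (8.26) is not reproduced here, the mechanics of such an example can still be illustrated by brute force on a small hypothetical binary program: a point $x$ is robust-feasible if every row holds even after an adversary increases its $q_i$ most damaging coefficients by their full deviations $\alpha_{ij}$ (for $x \ge 0$ and $\le$-constraints, upward deviations are the worst case). All data below are made up:

```python
from itertools import product

# Hypothetical data (NOT instance (8.26), which is shown in the book):
# maximize c^T x  s.t.  A x <= b,  x binary, where in row i at most q[i]
# coefficients may each deviate upward by up to alpha[i][j].
c = [3, 2, 4]
A = [[2, 3, 4], [3, 1, 2]]
alpha = [[1, 1, 1], [1, 1, 0]]
b = [7, 5]
q = [1, 1]

def worst_lhs(i, x):
    # worst-case value of row i: nominal value plus the q[i] largest bumps
    nominal = sum(A[i][j] * x[j] for j in range(len(x)))
    bumps = sorted((alpha[i][j] * x[j] for j in range(len(x))), reverse=True)
    return nominal + sum(bumps[:q[i]])

best, best_x = -1, None
for x in product((0, 1), repeat=len(c)):
    if all(worst_lhs(i, x) <= b[i] for i in range(len(b))):
        val = sum(cj * xj for cj, xj in zip(c, x))
        if val > best:
            best, best_x = val, x
print(best_x, best)   # prints (1, 1, 0) 5
```

For these made-up numbers, $x = (1, 0, 1)$ with nominal value 7 is cut off by robustness of the second row, and the robust optimum drops to 5.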
It is important to notice that, from the point of view of robust optimization, the balance relations written in form (8.29) are preferable to those written in form (8.28). This is because in (8.29) the expression for $s_t$ contains all the uncertain parameters, $d_1, \dots, d_t$, affecting the value of $s_t$.
Having determined
$$A_t \stackrel{\mathrm{def}}{=} \Bigl\{ (d_1, \dots, d_t) :\ \sum_{\tau=1}^{t} \frac{|d_\tau - \bar d_\tau|}{\alpha_\tau} \le q_t,\ \ \bar d_\tau - \alpha_\tau \le d_\tau \le \bar d_\tau + \alpha_\tau \ \text{for } \tau = 1, \dots, t \Bigr\},$$
Objective (8.30a) is to minimize the total expenses over all $T$ periods. Here, in view of (8.29), (8.30b) and (8.30c), each variable $z_t$ represents the value of $\max\{h_t s_t, -p_t s_t\}$, which is either the cost of storing $s_t$ product units in period $t$ if $s_t \ge 0$, or, if $s_t < 0$, the penalty for not supplying $-s_t$ product units in the first $t$ periods. Inequalities (8.30d) express the storage capacity restrictions. Inequalities (8.30e) impose the production capacity restrictions and the implications $y_t = 0 \Rightarrow x_t = 0$.
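The accounting behind (8.29) and (8.30b)-(8.30c) is easy to trace for a fixed production plan, assuming the cumulative balance $s_t = \sum_{\tau \le t}(x_\tau - d_\tau)$ of (8.29); all numbers below are made up:

```python
# Tracing the accounting of (8.29)-(8.30c) for a fixed production plan:
# s_t = sum_{tau<=t} (x_tau - d_tau) is the stock (if >= 0) or the
# accumulated shortage (if < 0), and z_t = max(h_t * s_t, -p_t * s_t)
# is the storage cost or the shortage penalty of period t.
x = [10, 0, 8, 6]   # production quantities (made up)
d = [6, 5, 7, 6]    # demands (made up)
h = [1, 1, 1, 1]    # unit storage costs
p = [3, 3, 3, 3]    # unit shortage penalties

s, total = 0, 0
for t in range(len(x)):
    s += x[t] - d[t]
    z = max(h[t] * s, -p[t] * s)
    total += z
    print(f"t={t}: s_t={s}, z_t={z}")
print("total storage/penalty cost:", total)   # total is 7 here
```

Period 1 carries a stock of 4 units (cost 4), period 2 ends one unit short (penalty 3), and the last two periods balance exactly.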
Proof. We cannot apply Theorem 8.4 directly because the uncertain parameters $d_t$ do not appear in the constraint matrix; instead, their linear combinations appear as the constant terms of inequalities (8.30b), (8.30c) and (8.30d). Nevertheless, we can replace in (8.30b) and (8.30d) each occurrence of an uncertain parameter $d_\tau$ with $-d_\tau \gamma_\tau^-$, where $\gamma_\tau^-$ is a new variable fixed to $-1$. Similarly, we replace in (8.30c) each occurrence of an uncertain parameter $d_\tau$ with $d_\tau \gamma_\tau^+$, where $\gamma_\tau^+$ is a new variable fixed to $1$. Then we apply Theorem 8.4 to this modified version of (8.30) to obtain an equivalent MIP that is transformed into (8.31) after substituting $-1$ for each variable $\gamma_\tau^-$ and $1$ for each variable $\gamma_\tau^+$. ⊓⊔
8.8 Notes
Sect. 8.3. Theorem 8.2 was proved in [115]. The problem of measuring risk in stochastic programming models with integer variables is discussed in [124]. The application from Sect. 8.3.2 is explored in more detail in [5].
Sect. 8.4. A more detailed introduction to multi-stage models of stochastic integer programming is given in [116]. Synthetic options are considered in [146]. The model of yield management in the airline industry was adapted from [137].
Sect. 8.7. Theorem 8.4 was proved in [26]. The single-product lot-sizing robust
problem was studied in [27]. The book [24] and the survey [25] are devoted to
robust optimization and its applications.
Sect. 8.9. See [28] if you have problems with Exercise 8.2.
8.9 Exercises
8.1. You want to invest $50 000. Today XYZ shares are sold at $20 per share. A European option costing $700 gives the right (but not the obligation) to buy, in six months, 100 XYZ shares at $15 per share. In addition, six-month risk-free bonds with a face value of $100 are now sold at $90. You have decided not to buy more than 20 options.
Six months later, three equally likely scenarios for the XYZ share price are pos-
sible: 1) the price will not change; 2) the price will rise to $40; 3) the price will drop
to $12.
Formulate and solve three MIPs in which you want to form a portfolio in order
to maximize:
a) expected income;
b) expected income, provided that the income is at least $2000 in each of the three scenarios;
c) risk-free income, defined as the income in the worst of the three possible scenarios.
Compare optimal solutions to your three models.
8.2. A newspaper seller decides how many newspapers to buy at a price of α apiece in order to sell them at α + β, given that the demand, u, is a random variable with distribution function G. The seller's goal is to maximize the expected profit. Solve this stochastic programming problem.
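Before attacking Exercise 8.2 analytically, it can be explored numerically: the expected profit $(\alpha+\beta)\,\mathrm{E}[\min(Q,u)] - \alpha Q$ of ordering $Q$ papers can be estimated by Monte Carlo and maximized over a grid. The demand distribution and prices below are made up for illustration:

```python
import random

# Monte Carlo sketch of the newsvendor exercise (all numbers are made up):
# buy Q papers at cost a (= alpha), sell min(Q, u) of them at a + b (= alpha + beta);
# the expected profit E[(a + b) * min(Q, u)] - a * Q is estimated from
# samples of the demand distribution G and maximized over a grid of Q.
random.seed(1)
a, b = 1.0, 0.5
demand = [max(random.gauss(100, 20), 0.0) for _ in range(10000)]  # samples of u ~ G

def expected_profit(Q):
    return sum((a + b) * min(Q, u) - a * Q for u in demand) / len(demand)

best_Q = max(range(201), key=expected_profit)
```

For these numbers the grid maximizer lands near the critical fractile $G^{-1}(\beta/(\alpha+\beta))$, which is the quantity the analytical solution should recover.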
8.3. Prove that, for a fixed x, the function f (x, ω) defined by (8.3) is in fact a random
variable.
8.4. Specify how to calculate VaRα and CVaRα for a discrete probability space
when all events are equally likely.
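One common convention for Exercise 8.4, sketched in code: with $N$ equally likely scenarios, $\mathrm{VaR}_\alpha$ is the $\lceil \alpha N \rceil$-th smallest loss, and $\mathrm{CVaR}_\alpha$ can be evaluated through the representation of [115], $\mathrm{CVaR}_\alpha = \min_c \{ c + \mathrm{E}[(L-c)^+]/(1-\alpha) \}$, whose minimum is attained at one of the observed losses:

```python
import math

def var_cvar(losses, alpha):
    # N equally likely loss scenarios (larger loss = worse outcome).
    xs = sorted(losses)
    n = len(xs)
    # VaR_alpha: smallest observed loss level ell with P(L <= ell) >= alpha
    var = xs[math.ceil(alpha * n) - 1]
    # CVaR_alpha via Rockafellar-Uryasev: min_c c + E[(L - c)+] / (1 - alpha);
    # the piecewise-linear objective attains its minimum at an observed loss
    cvar = min(c + sum(max(L - c, 0.0) for L in xs) / (n * (1 - alpha)) for c in xs)
    return var, cvar

var, cvar = var_cvar([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 0.8)   # VaR = 8, CVaR = 9.5
```

With ten equally likely losses and α = 0.8, CVaR is the average of the two worst losses, consistent with the tail-average interpretation asked for in the exercise.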
8.5. Write explicitly (as in Example 8.1) and then solve Benders' reformulation of the following MIP:
8.6. Write Benders’ reformulation for (1.20), which is a MIP formulation of the
single-product lot-sizing problem.
8.7. The robust optimization problem with scenario-type uncertainties is formulated
as follows:
$$\min\ \max_{0 \le k \le K}\, \{ c_k^T x : x \in X \}, \tag{8.32}$$
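Problem (8.32) is the minimization of a pointwise maximum of finitely many linear functions, so it is equivalent to the epigraph form $\min\{z : z \ge c_k^T x,\ 0 \le k \le K,\ x \in X\}$. For a small finite $X$ this equivalence can be checked by enumeration; the feasible set and scenario cost vectors below are hypothetical:

```python
from itertools import product

# Scenario-robust objective (8.32): min over x in X of max_k c_k^T x.
# Epigraph form: min z subject to z >= c_k^T x for all k, x in X.
# Brute force over a hypothetical binary set X = {x in {0,1}^3 : sum x_j >= 2}.
C = [[4, 1, 3], [2, 5, 2], [3, 3, 1]]   # made-up scenario cost vectors c_0, c_1, c_2

def robust_cost(x):
    # the tightest feasible z for this x is max_k c_k^T x
    return max(sum(ck[j] * x[j] for j in range(len(x))) for ck in C)

X = [x for x in product((0, 1), repeat=3) if sum(x) >= 2]
x_star = min(X, key=robust_cost)
print(x_star, robust_cost(x_star))   # prints (0, 1, 1) 7
```

Here every two-element selection has worst-case cost 7, while selecting all three items costs 9 in the worst scenario.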
$$\begin{aligned} c^T x &\to \max, \\ Ax &\le b, \\ P\{Hx \ge \xi(\omega)\} &\ge \alpha, \end{aligned} \tag{8.33}$$
References
1. Abara, J.: Applying integer linear programming to the fleet assignment problem. Interfaces
19, 20–28 (1989)
2. van den Akker, M., van Hoesel, C.P.M., Savelsbergh, M.W.P.: A polyhedral approach to single-machine scheduling problems. Math. Program. 85, 541–572 (1999)
3. Alevras, D., Grötschel, M., Wessäly, R.: Capacity and survivability models for telecommunication networks. Tech. Rep. SC 97-22, Konrad-Zuse-Zentrum für Informationstechnik, Berlin (1997)
4. Andersen, E., Andersen, K.: Presolving in linear programming. Math. Program. 71, 221–245
(1995)
5. Andersson, F., Mausser, H., Rosen, D., Uryasev, S.: Credit risk optimization with conditional value-at-risk criterion. Math. Program. 89, 273–291 (2001)
6. Applegate, D., Bixby, R., Chvátal, V., Cook, W.: Finding cuts in the TSP. DIMACS Tech. Rep. 95-05, Rutgers University, New Brunswick, NJ (1995)
7. Applegate, D., Bixby, R., Chvátal, V., Cook, W.: Implementing the Dantzig-Fulkerson-Johnson algorithm for large traveling salesman problems. Math. Program. 97, 91–153 (2003)
8. Atamtürk, A., Nemhauser, G.L., Savelsbergh, M.W.P.: Conflict graphs in solving integer programming problems. European Journal of Oper. Res. 121, 40–45 (2000)
9. Atamtürk, A., Rajan, D.: On splittable and unsplittable flow capacitated network design arc-
set polyhedra. Math. Program. 92, 315–333 (2002)
10. Balas, E.: Facets of the knapsack polytope. Math. Program. 8, 146–164 (1975)
11. Balas, E.: Disjunctive programming. Annals of Discrete Mathematics 5, 3–51 (1979)
12. Balas, E., Bockmayr, A., Pisaruk, N., Wolsey, L.: On unions and dominants of polytopes.
Math. Program. 99, 223–239 (2004)
13. Balas, E., Ceria, S., Cornuéjols, G.: A lift-and-project cutting plane algorithm for mixed 0-1
programs. Math. Program. 58, 295–324 (1993)
14. Balas, E., Ceria, S., Cornuéjols, G., Natraj, N.: Gomory cuts revisited. Oper. Res. Lett. 19,
1–9 (1996)
15. Balinski, M.L.: On finding integer solutions to linear programs. Tech. rep., Mathematica, Princeton, N.J. (1964)
16. Balinski, M.L., Quandt, R.: On an integer program for a delivery problem. Oper. Res. 12, 300–304 (1964)
17. Barany, I., Van Roy, T., Wolsey, L.A.: Uncapacitated lot-sizing: the convex hull of solutions.
Math. Program. Study 22, 32–43 (1984)
18. Barnhart, C., Johnson, E.L., Nemhauser, G.L., Savelsbergh, M.W.P., Vance, P.: Branch-and-
price: column generation for solving huge integer programs. Oper. Res. 46, 316–329 (1998)
19. Beale, E.M.L., Tomlin, J.A.: Special facilities in a general mathematical programming sys-
tem for nonconvex problems using ordered sets of variables. In: Proceedings of the Fifth
Annual Conference on Operational Research, pp. 447–454. J. Lawrence (ed.), Tavistock Pub-
lications (1970)
20. Beasley, J.E.: An exact two-dimensional non-guillotine cutting tree search procedure. Oper.
Res. 33, 49–64 (1985)
21. Belvaux, G., Boissin, N., Sutter, A., Wolsey, L.A.: Optimal placement of add/drop multiplex-
ers: static and dynamic models. European Journal of Oper. Res. 108, 26–35 (1998)
22. Benders, J.F.: Partitioning procedures for solving mixed-variables programming problems.
Numerische Mathematik 4, 238–252 (1962)
23. Benichou, M., Gauthier, J., Girodet, P., Hentges, G., Ribiere, G., Vincent, O.: Experiments
in mixed-integer programming. Math. Program. 1, 76–94 (1971)
24. Ben-Tal, A., El Ghaoui, L., Nemirovski, A.: Robust Optimization. Princeton University Press (2009)
25. Bertsimas, D., Brown, D.B., Caramanis, C.: Theory and applications of robust optimization.
SIAM Rev. 53(3), 464–501 (2011)
26. Bertsimas, D., Sim, M.: The price of robustness. Oper. Res. 52, 35–53 (2004)
27. Bertsimas, D., Thiele, A.: A robust optimization approach to supply chain management. In:
D. Bienstock, G. Nemhauser (eds.) Integer Programming and Combinatorial Optimization.
IPCO 2004. Lecture Notes in Computer Science, vol. 3064, pp. 86–100. Springer, Berlin,
Heidelberg (2004)
28. Birge, J.R., Louveaux, F.V.: Introduction to stochastic programming. Springer Verlag, New
York (2011)
29. Boyd, S., Vandenberghe, L.: Convex optimization. Cambridge University Press (2004)
30. Brearley, A.L., Mitra, G., Williams, H.P.: Analysis of mathematical programming problems
prior to applying the simplex algorithm. Math. Program. 8, 54–83 (1975)
31. Burdett, C.A., Johnson, E.L.: A subadditive approach to solve linear integer programs. Ann.
Discrete Math. 1, 117–144 (1977)
32. Caprara, A., Fischetti, M.: {0, 1/2}-Chvátal-Gomory cuts. Math. Program. 74, 221–236 (1996)
33. Charnes, A., Cooper, W.W., Rhodes, E.: Measuring the efficiency of decision-making units. European Journal of Operational Research 2, 429–444 (1978)
34. Chernikov, S.N.: Linear Inequalities (in Russian). Nauka, Moscow (1968)
35. Chvátal, V.: Edmonds polytopes and a hierarchy of combinatorial problems. Discrete Math.
4, 305–337 (1973)
36. Chvátal, V.: Edmonds polytopes and weakly hamiltonian graphs. Math. Program. 5, 29–40 (1973)
37. Chvátal, V.: Linear Programming. Freeman, New York (1983)
38. Cordeau, J.F., Laporte, G., Savelsbergh, M.W., Vigo, D.: Chapter 6: Vehicle routing. In: C. Barnhart, G. Laporte (eds.) Transportation, Handbooks in Operations Research and Management Science, vol. 14, pp. 367–428. Elsevier (2007)
39. Cornuéjols, G.: Valid inequalities for mixed integer linear programs. Math. Program. 112,
3–44 (2008)
40. Cornuéjols, G., Fisher, M.L., Nemhauser, G.L.: Location of bank accounts to optimize float:
An analytic study of exact and approximate algorithms. Management Science 23, 789–810
(1977)
41. Dakin, R.: A tree-search algorithm for mixed integer programming problems. Computer Journal 8, 250–255 (1965)
42. Dantzig, G.: Linear programming and extensions. Princeton University Press, Princeton (1963)
43. Dantzig, G., Fulkerson, D., Johnson, S.: On a linear programming combinatorial approach to the traveling salesman problem. Oper. Res. 7, 58–66 (1959)
44. Dantzig, G.B., Fulkerson, D., Johnson, S.: Solution of a large-scale traveling salesman problem. Oper. Res. 2, 393–410 (1954)
45. Dantzig, G., Wolfe, P.: Decomposition principle for linear programs. Oper. Res. 8, 101–111
(1960)
46. Desrosiers, J., Dumas, Y., Solomon, M., Soumis, F.: Time constrained routing and scheduling. Handbooks in Operations Research and Management Science 8, 35–139 (1995)
47. Dyer, M., Wolsey, L.: Formulating the single-machine sequencing problem with release dates
as a mixed integer program. Discrete Appl. Math. 26, 255–270 (1990)
48. Edmonds, J.: Paths, trees and flowers. Canadian Journal of Mathematics 17, 449–467 (1965)
49. Edmonds, J., Giles, R.: A min-max relation for submodular functions on graphs. Ann. Discrete Math. 1, 185–204 (1977)
50. Eisenbrand, F.: On the membership problem for the elementary closure of a polyhedron.
Combinatorica 19, 297–300 (1999)
51. Forrest, J.J., Goldfarb, D.: Steepest-edge simplex algorithms for linear programming. Math.
Program. 57, 341–374 (1992)
52. Fulkerson, D.R.: Blocking and anti-blocking pairs of polyhedra. Math. Program. 1, 160–194
(1971)
53. Gavish, B., Graves, S.: The traveling salesman problem and related problems. Tech. rep.,
Graduate School of Management, University of Rochester, New York (1979). Working Paper
54. Gilmore, P.C., Gomory, R.E.: A linear programming approach to the cutting-stock problem. Oper. Res. 9, 849–859 (1961)
55. Gilmore, P.C., Gomory, R.E.: A linear programming approach to the cutting-stock problem, Part II. Oper. Res. 11, 863–888 (1963)
56. Gilmore, P.C., Gomory, R.E.: The theory and computation of knapsack functions. Oper. Res.
14, 1045–1077 (1966)
57. Goldfarb, D., Reid, J.K.: A practical steepest-edge simplex algorithm. Math. Program. 12,
361–371 (1977)
58. Gomory, R.E.: Outline of an algorithm for integer solutions to linear programs. Bull. Amer. Math. Soc. 64, 275–278 (1958)
59. Gomory, R.E.: An algorithm for the mixed integer problem. Tech. Rep. RM-2597, The RAND Corporation (1960)
60. Gomory, R.E.: Solving linear programming problems in integers. In: Proceedings of Sym-
posia in Applied Mathematics, vol. 10 (1960)
61. Gomory, R.E.: Some polyhedra related to combinatorial problems. Linear Algebra and its Applications 2, 451–558 (1969)
62. Grötschel, M., Jünger, M., Reinelt, G.: A cutting plane algorithm for the linear ordering
problem. Oper. Res. 32, 1195–1220 (1984)
63. Grötschel, M., Lovász, L., Schrijver, A.: The ellipsoid method and its consequences in com-
binatorial optimization. Combinatorica 1, 169–197 (1981)
64. Grötschel, M., Lovász, L., Schrijver, A.: Geometric algorithms and combinatorial optimiza-
tion. Springer, Berlin (1988)
65. Grötschel, M., Padberg, M.: On the symmetric travelling salesman problem II: lifting theorems and facets. Math. Program. 16, 281–302 (1979)
66. Grünbaum, B.: Convex polytopes. Wiley, New York (1967)
67. Gu, Z., Nemhauser, G.L., Savelsbergh, M.W.P.: Lifted cover inequalities for 0-1 integer pro-
grams: computation. INFORMS J. Comput. 10, 427–437 (1998)
68. Gu, Z., Nemhauser, G.L., Savelsbergh, M.W.P.: Lifted flow cover inequalities for mixed 0-1
programs. Math. Program. A 85, 436–467 (1999)
69. Gu, Z., Nemhauser, G.L., Savelsbergh, M.W.P.: Sequence independent lifting in mixed inte-
ger programming. J. Combinat. Optim. 4, 109–129 (2000)
70. Guignard, M., Spielberg, K.: Logical reduction methods in zero-one programming. Oper.
Res. 29, 49–74 (1981)
71. Gusfield, D.: Very simple methods for all pairs network flow analysis. SIAM J. Comput. 19, 143–155 (1990)
72. Hammer, P.L., Johnson, E.L., Peled, U.N.: Facets of regular 0-1 polytopes. Math. Program.
8, 179–206 (1975)
73. Heller, I., Tompkins, C.B.: An extension of a theorem of Dantzig's. In: Linear inequalities and related systems, ed H.W. Kuhn and A.W. Tucker, pp. 247–252. Princeton University Press, Princeton, N.J. (1956)
74. Hoffman, K., Padberg, M.: Improving representations of zero-one linear programs for
branch-and-cut. ORSA Journal of Computing 3, 121–134 (1991)
75. Ibarra, O.H., Kim, C.E.: Fast approximation algorithms for the knapsack and sum of subset problems. Journal of the ACM 22, 463–468 (1975)
76. Johnson, E.L.: Modeling and strong linear programs for mixed integer programming. In:
Wallace S.W. (eds) Algorithms and Model Formulations in Mathematical Programming.
NATO ASI Series (Series F: Computer and Systems Sciences), vol. 51, pp. 1–41. Springer,
Berlin, Heidelberg (1989)
77. Kall, P., Wallace, S.: Stochastic programming. Wiley (1994)
78. Kallenberg, L.C.M.: Linear programming and finite Markovian control problems. Tech. Rep. 148, Mathematisch Centrum, Math. Centre Tract, Amsterdam (1983)
79. Karger, D.R.: Minimum cuts in near-linear time. Journal of the ACM 47, 46–76 (2000)
80. Kaufmann, A., Henry-Labordère, A.: Méthodes et modèles de la recherche opérationnelle. Dunod, Paris-Bruxelles-Montréal (1974)
81. Khachian, L.G.: Complexity of linear programming problems (in Russian). Moscow (1987)
82. Kondili, E., Pantelides, C.C., Sargent, R.W.H.: A general algorithm for short-term scheduling
of batch operations – I. MILP formulation. Computers chem. Engng. 17, 211–227 (1993)
83. Land, A.H., Doig, A.G.: An automatic method for solving discrete programming problems.
Econometrica 28, 497–520 (1960)
84. Laporte, G., Nobert, Y.: A branch and bound algorithm for the capacitated vehicle routing
problem. OR Spektrum 5, 77–85 (1983)
85. Letchford, A.N.: On disjunctive cuts for combinatorial optimization. Journal of Combinatorial Optimization 5, 299–315 (2001)
86. Letchford, A.N.: Totally tight Chvátal-Gomory cuts. Oper. Res. Lett. 30, 71–73 (2002)
87. Lovász, L., Schrijver, A.: Cones of matrices and set-functions and 0-1 optimization. SIAM
J. Optim. 1, 166–190 (1991)
88. Marchand, H., Martin, A., Weismantel, R., Wolsey, L.: Cutting planes in integer and mixed
integer programming. Discrete Appl. Math. 123, 397–446 (2002)
89. Marchand, H., Wolsey, L.A.: Aggregation and mixed integer rounding to solve MIPs. Oper. Res. 49, 363–371 (2001)
90. Markowitz, H.: Portfolio Selection: Efficient Diversification of Investments. Wiley, New
York (1959)
91. Miller, A.J., Wolsey, L.A.: Tight formulations for some simple mixed integer programs and
convex objective integer programs. Math. Program. 98, 73–88 (2003)
92. Miller, C.E., Tucker, A.W., Zemlin, R.A.: Integer programming formulations and the travel-
ing salesman problem. J. Assoc. Comput. Mach. 7, 326–329 (1960)
93. Minoux, M.: Optimum synthesis of a network with non-simultaneous multicommodity flow
requirements. In: P. Hansen (ed.) Studies on Graphs and Discrete Programming, pp. 269–
277. North-Holland Publishing Company (1981)
94. Minoux, M.: Programmation Mathémattique. Bordas et C.N.E.T.-E.N.S.T., Paris (1989)
95. Murty, K.G., Yu, F.T.: Linear complementarity, linear and nonlinear programming (internet edition). http://ioe.engin.umich.edu/people/fac/books/murty/linear complementarity webbook (1993)
96. Naddef, D.: Polyhedral theory and branch-and-cut algorithms for the symmetric TSP. In: G. Gutin, A. Punnen (eds.) The traveling salesman problem and its variations, pp. 29–116. Kluwer Academic Publishers (2002)
97. Nagamochi, H., Ibaraki, T.: Computing edge connectivity in multigraphs and capacitated
graphs. SIAM J. Disc. Math. 5, 54–66 (1992)
98. Nemhauser, G.L., Wolsey, L.A.: Integer and Combinatorial Optimization. Wiley (1988)
99. Nemhauser, G.L., Wolsey, L.A.: A recursive procedure to generate all cuts for 0–1 mixed
integer programs. Math. Program. 46, 379–390 (1990)
100. Nisan, N.: Bidding and allocation in combinatorial auctions. In: Proceedings ACM Confer-
ence on Electronic Commerce (EC-00), pp. 1–12. Minneapolis, MN (2000)
101. Orgler, Y.: An unequal period model for cash management decisions. Management Science
16, B77–B92 (1969)
102. Padberg, M.: Linear Optimization and Extensions. Springer-Verlag, Berlin, Heidelberg
(1995)
103. Padberg, M.W.: On the facial structure of set packing polyhedra. Math. Program. 5, 199–215
(1973)
104. Padberg, M.W.: A note on zero-one programming. Oper. Res. 23, 833–837 (1975)
105. Padberg, M.W., Rao, M.R.: Odd minimum cut-sets and b-matchings. Math. Oper. Res. 7,
67–80 (1982)
106. Padberg, M.W., Rinaldi, G.: Optimization of a 532 city symmetric traveling salesman prob-
lem by branch and cut. Oper. Res. Lett. 6, 1–7 (1987)
107. Padberg, M.W., Rinaldi, G.: Facet identification for the symmetric traveling salesman poly-
tope. Math. Program. 47, 219–257 (1990)
108. Padberg, M.W., Van Roy, T.J., Wolsey, L.A.: Valid linear inequalities for fixed charge prob-
lems. Oper. Res. 33, 842–861 (1985)
109. Papadimitriou, C.H.: Computational complexity. Addison-Wesley Publishing Company
(1994)
110. Papadimitriou, C.H., Steiglitz, K.: Combinatorial Optimization: Algorithms and Complexity. Prentice-Hall, Englewood Cliffs, NJ (1982)
111. Pinedo, M.L.: Handbook of Scheduling: Algorithms, Models, and Performance Analysis. Springer (2012). Discusses the basic properties of scheduling models and provides up-to-date coverage of important theoretical models in the scheduling literature as well as significant scheduling problems that occur in the real world.
112. Preciado-Walters, F., Rardin, R., Langer, M., Thai, V.: A coupled column generation, mixed integer approach to optimal planning of intensity modulated radiation therapy for cancer. Math. Program. 101, 319–338 (2004)
113. Prékopa, A.: Stochastic programming. Kluwer Academic Publishers, Dordrecht (1995)
114. Queyranne, M.A., Schulz, A.S.: Polyhedral approaches to machine scheduling. Tech. Rep. 408/1994, Institut für Mathematik, Technische Universität Berlin, Berlin (1994)
115. Rockafellar, R.T., Uryasev, S.: Optimization of conditional value-at-risk. J. Risk 2, 21–41
(2000)
116. Römisch, W., Schultz, R.: Multistage stochastic integer programs: an introduction. In: M.
Grötschel, S.O. Krumke, J. Rambau (eds.) Online Optimization of Large Scale Systems, pp.
581–600. Springer, Berlin, Heidelberg (2001)
117. Saigal, R.: Linear programming: a modern integrated analysis. Kluwer Academic Publishers,
Boston/Dordrecht/London (1995)
118. Savelsbergh, M.W.P.: A branch and price algorithm for the generalized assignment problem.
Tech. Rep. COC-93-02, Computational Optimization Center, Georgia Institute of Technol-
ogy, Atlanta (1993)
119. Savelsbergh, M.W.P.: Preprocessing and probing techniques for mixed integer programming
problems. ORSA Journal on Computing 6, 445–454 (1994)
120. Scholl, A.: Balancing and sequencing of assembly lines. Physica-Verlag, Berlin, Heidelberg
(1999)
121. Schrijver, A.: On cutting planes. Ann. Discrete Math. 9, 291–296 (1980)
122. Schrijver, A.: Theory of linear and integer programming. Wiley (1986)
123. Schrijver, A.: Combinatorial optimization. Springer Verlag (2004)
124. Schultz, R.: Stochastic programming with integer variables. Math. Program. 97, 285–309
(2003)
125. Shahookar, K., Mazumder, P.: VLSI cell placement techniques. ACM Computing Surveys
23, 143–220 (1991)
126. Sheble, G.B., Fahd, G.N.: Unit commitment literature synopsis. IEEE Transactions on Power
Systems 9, 128–135 (1994)
127. Sherali, H., Adams, W.: A hierarchy of relaxations between the continuous and convex hull representations for zero-one programming problems. SIAM J. Discr. Math. 3, 411–430 (1990)
128. Hartmann, S., Briskorn, D.: A survey of variants and extensions of the resource-constrained project scheduling problem. European Journal of Oper. Res. 207, 1–14 (2010)
129. Suhl, U.H., Szymanski, R.: Supernode processing of mixed-integer models. Computational
Optimization and Applications 3, 317–331 (1994)
130. Sutanthavibul, S., Shragowitz, E., Rosen, J.B.: An analytical approach to floorplan design
and optimization. IEEE Transactions on Computer-Aided Design 10, 761–769 (1991)
131. Sutter, A., Vanderbeck, F., Wolsey, L.A.: Optimal placement of add/drop multiplexers:
heuristic and exact algorithms. Oper. Res. 46, 719–728 (1998)
132. Tomlin, J.A.: On scaling linear programming problems. Math. Program. 4, 144–166 (1975)
133. Van Vyve, M., Wolsey, L.A.: Approximate extended formulations. Math. Program. 105,
501–522 (2006)
134. Vanderbei, R.J.: Linear Programming: Foundations and Extensions. Kluwer Academic Pub-
lishers (2001)
135. Wagner, H.M., Whitin, T.M.: Dynamic version of the economic lot size model. Management
Science 5, 89–96 (1958)
136. Weismantel, R.: On the 0/1 knapsack polytope. Math. Program. 77, 49–68 (1997)
137. Williams, H.P.: Model Building in Mathematical Programming, 5th Edition. Wiley (2013)
138. Wolsey, L.A.: Faces for a linear inequality in 0-1 variables. Math. Program. 8, 165–178
(1975)
139. Wolsey, L.A.: Facets and strong valid inequalities for integer programs. Oper. Res. 24, 367–
372 (1976)
140. Wolsey, L.A.: Valid inequalities and superadditivity for 0/1 integer programs. Math. Oper.
Res. 2, 66–77 (1977)
141. Wolsey, L.A.: Strong formulations for mixed integer programs: a survey. Math. Program. 45,
173–191 (1989)
142. Wolsey, L.A.: Integer Programming. Wiley (1998)
143. Wolsey, L.A.: Solving multi-item lot-sizing problems with an MIP solver using classification and reformulation. Management Science 48, 1587–1602 (2002)
144. Wolsey, L.A.: Strong formulations for mixed integer programs: valid inequalities and ex-
tended formulations. Math. Program. 97, 423–447 (2003)
145. Yannakakis, M.: Expressing combinatorial optimization problems by linear programs. J.
Comp. Syst. Sci. 43, 441–466 (1991)
146. Zhao, Y., Ziemba, W.T.: The Russell-Yasuda model: A stochastic programming model using an endogenously determined worst case risk measure for dynamic asset allocation. Math. Program. 89, 293–309 (2001)
Index
problem, 22
  continuous time formulation, 23
  time-index formulation, 24
sport tournament, 71
separation, 85
  for convex quadratic constraints, 129
  for cover inequalities, 132
  for flow cover inequalities, 145
  for norm cones, 130
  problem, 118
  rule
    first violated, 85
    maximum decrease, 85
    most violated, 85
    steepest edge, 85
set
  basic, 74
  dual feasible, 75
  feasible, 74
  of columns, 89
  of rows, 89
  primal feasible, see feasible
  covering problem, 37
  packing problem, 37, 39
  partitioning problem, 2, 37–39, 207
  polyhedral, 17
  special ordered
    of type 1, see SOS1
    of type 2, see SOS2
shadow price, 75
simplex method
  cycling, 93
    lexicographic rule for preventing, 93
  dual, 81
  primal, 77
solution
  basic, 74
  degenerate, 74
  dual feasible, 75
  feasible, 74
  optimal, 75
  primal feasible, see feasible
  dual, 75
  feasible, 74
SOS1, 3
SOS2, 4, 165
stationary point, see KKT point
Steiner tree, 207
  packing problem, 208
  problem, 207
stochastic programming, 209
  multistage problem, 220
  two-stage problem, 209
strong branching, 163
telecommunication network design problem, 46, 207
transportation problem, 11
traveling salesman problem, 174
unit commitment problem, 45
variable
  adaptive, 209
  binary, 1
  boolean, 6
  discrete, 2, 164
  dual, 75
  expected, 209
  priority, 163
vehicle routing problem, 66
  classical, 68
vertex
  of hypergraph, 37
  of polyhedron, 74
  degenerate, 74
Weyl’s theorem, 12
yield management, 225