1. INTRODUCTION
The problem of maximizing a concave quadratic function whose variables are subject to
linear inequality constraints has been the subject of several recent studies, from both the com-
putational side and the theoretical (see Bibliography). Our aim here has been to develop a
method for solving this non-linear programming problem which should be particularly well
adapted to high-speed machine computation.
The quadratic programming problem as such, called PI, is set forth in Section 2.
We find in Section 3 that with the aid of generalized Lagrange multipliers the solutions
of PI can be exhibited in a simple way as parts of the solutions of a new quadratic programming
problem, called PII, which embraces the multipliers. The maximum sought in PII is known to
be zero. A test for the existence of solutions to PI arises from the fact that the boundedness of
its objective function is equivalent to the feasibility of the (linear) constraints of PII.
In Section 4 we apply to PII an iterative process in which the principal computation is
the simplex method change-of-basis. One step of our "gradient and interpolation" method,
given an initial feasible point, selects by the simplex routine a secondary basic feasible point
whose projection along the gradient of the objective function at the initial point is sufficiently
large. The point at which the objective is maximized for the segment joining the initial and
secondary points is then chosen as the initial point for the next step.
The values of the objective function on the initial points thus obtained converge to zero;
but a remarkable feature of the quadratic problem is that in some step a secondary point which
is a solution of the problem will be found, insuring the termination of the process.
A simplex technique machine program requires little alteration for the employment of
this method. Limited experience suggests that solving a quadratic program in n variables and
m constraints will take about as long as solving a linear program having m + n constraints and
a "reasonable" number of variables.
Section 5 discusses, for completeness, some other computational proposals making use
of generalized Lagrange multipliers.
Section 6 carries over the applicable part of the method, the gradient-and-interpolation
routine, to the maximization of an arbitrary concave function under linear constraints (with one
qualification). Convergence to the maximum is obtained as above, but termination of the process
in an exact solution is not, although an estimate of error is readily found.
In Section 7 (the Appendix) are accumulated some facts about linear programs and concave functions which are used throughout the paper.
2. THE PROBLEM

The problem at hand is that of maximizing the quadratic function

f(x) = Σ_j p_j x_j − Σ_{i,j} C_{ij} x_i x_j

of the variables x_1, ..., x_n, subject to the constraints

x_j ≥ 0 (j = 1, ..., n),

Σ_j A_{ij} x_j ≤ b_i (i = 1, ..., m).
Matrix notation will be used exclusively below: x is the n × 1 matrix (i.e., column vector) of variables x_1, ..., x_n; A, C, p, and b are, respectively, m × n, n × n, 1 × n, and m × 1 matrices. A_i· = (A_i1, ..., A_in) will denote the ith row of A, and likewise A_·j the jth column. e_j is the jth coordinate (column) vector of n dimensions. The symbol ' denotes matrix transposition. For any function f(x), the gradient 1 × n matrix is denoted by ∂f. C may without loss of generality be supposed symmetric.
In matrix terms, we may rewrite the quadratic problem as

PI: Maximize f(x) = px − x'Cx subject to

(I) x ≥ 0, Ax ≤ b.
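For concreteness, PI can be written down numerically in a few lines. The following sketch (numpy assumed, the data hypothetical) fixes the conventions used in the examples below.

    import numpy as np

    # Hypothetical instance of PI: maximize f(x) = px - x'Cx
    # subject to x >= 0, Ax <= b.  (Illustration only.)
    p = np.array([3.0, 2.0])                 # 1 x n
    C = np.array([[2.0, 0.5],
                  [0.5, 1.0]])               # n x n, symmetric positive semidefinite
    A = np.array([[1.0, 1.0]])               # m x n
    b = np.array([1.0])                      # m x 1

    def f(x):
        return p @ x - x @ C @ x

    def is_feasible(x, tol=1e-9):
        return bool(np.all(x >= -tol) and np.all(A @ x <= b + tol))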
An x satisfying (I) will be called feasible. The set of all feasible x is the constraint
set. The problem PI is called feasible if the constraint set is not empty, that is, if there exists
a feasible x. If the objective function has a supremum on the constraint set, this supremum will
be denoted by M. An x for which f(x) = M is called a solution of PI and the set of all solutions
will be called the solution set. It will be assumed in this and the following section that PI is
feasible, but not that solutions necessarily exist.
The postulated concavity of the objective function is equivalent to the requirement that C be the matrix of a positive semidefinite quadratic form [App. d]. A function is said to be concave if interpolation never overestimates its values: that is, if

f(λx + (1 − λ)y) ≥ λ f(x) + (1 − λ) f(y) for all x, y and 0 ≤ λ ≤ 1.
It follows that any local maximum point x of f on a convex constraint set is also a global maxi-
mum: for otherwise a y such that f (y) > f (x) could be found, and then any point on the segment
joining x to y, no matter how close to x, would yield a higher value for f than would x. Simi-
larly for a convex function, any local minimum on a convex set is a global minimum. It is this
fact which makes maximizing a concave function (or minimizing a convex function) one of the
simpler of the non-linear programming problems.
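The interpolation property is easy to spot-check numerically on the hypothetical instance above; a minimal sketch:

    # Interpolation never overestimates a concave f: check one pair of points.
    lam = 0.3
    x1, x2 = np.array([0.2, 0.3]), np.array([0.5, 0.1])
    mix = lam * x1 + (1.0 - lam) * x2
    assert f(mix) >= lam * f(x1) + (1.0 - lam) * f(x2) - 1e-12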
Concavity of the objective function implies that a "mixture" of two equally good sets of
activities can never result in a poorer set of activities. Conversely, the worst possible program
3. LAGRANGE MULTIPLIERS
Kuhn and Tucker's generalization [6] of the method of Lagrange multipliers to the maximization of a function f whose variables x_1, ..., x_n lie in the constraint set given by the inequalities

f_i(x) ≤ 0 (i = 1, ..., m)
is based on the observation that f has a local extreme value at x if, and only if, the normal hyperplane to the gradient vector ∂f(x) at x is, locally, a supporting hyperplane of the constraint set. (This formulation applies equally well to the classical case f_i(x) = 0.) If x is to
be a local maximum, then the constraint set and the gradient drawn from x must lie on opposite
sides of the supporting hyperplane. If, furthermore, the constraint set is convex (as when the
functions f i are convex or, in particular, linear), and f is concave, then, as seen in Section 2,
this condition is necessary and sufficient that f have a global maximum at x.
Noting that a hyperplane touching the convex constraint set at a point x satisfies the above condition if, and only if, x has the maximum projection (of all points of the convex set) along the outward normal to the hyperplane, a necessary and sufficient condition that x⁰ be a solution of PI is that

(3.2) ∂f(x⁰) x⁰ = Max {∂f(x⁰) x | x ≥ 0, Ax ≤ b}.
But by the Duality Theorem for linear programming [App. a] (x⁰ is fixed), the right side of (3.2) is equal to

(3.3) Min {ub | u ≥ 0, uA ≥ ∂f(x⁰)}.
Substituting this for the right side of (3.2) and transposing, we obtain that x⁰ is a solution of PI if, and only if,

(3.4) Max_u {∂f(x⁰) x⁰ − ub | u ≥ 0, uA ≥ ∂f(x⁰)} = 0.

The function

(3.5) g(x, u) ≡ ∂f(x) x − ub = px − ub − 2x'Cx,
extracted from (3.4), which is similar to the generalized Lagrangian of [6], plays a dominant
role in the sequel. It is evidently a concave function of the variables (x, u). We suppose hence-
forth that (x,u) is constrained by the linear relations used in PI, (3.2), and (3.4), namely,
(3.6) x ≥ 0, u ≥ 0, Ax ≤ b, ∂f(x) ≤ uA.
Under these constraints g(x, u) ≤ 0, since g(x, u) = ∂f(x)x − ub ≤ uAx − ub = u(Ax − b) ≤ 0. But then the Max of (3.4) is never positive, and we have proved the result on which our use of the generalized Lagrange multiplier method rests, the
SOLUTION CRITERION:² x is a solution of PI if, and only if, for some u such that (x, u) satisfies (3.6),

(3.8) g(x, u) = 0.

Moreover, for any (x, u) satisfying (3.6) and any feasible y, concavity gives [App. e]

(3.10) f(y) ≤ f(x) + ∂f(x)(y − x) ≤ f(x) − g(x, u).

Evidently if f is not bounded above for all feasible y, then such (x, u) cannot be found. On the other hand, [App. i] if f is so bounded then there exists a solution, x⁰, and thus, by the Solution Criterion, (x⁰, u) satisfying (3.6) exists. We have thus proved the
BOUNDEDNESS CRITERION: f is bounded above on the constraint set if, and only if,
the joint constraints (3.6) are feasible.
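Computationally, the Boundedness Criterion is a linear feasibility test. The sketch below (scipy assumed, continuing the hypothetical data above) stacks the variables as (x, u') and asks an LP solver whether (3.6) admits a point.

    from scipy.optimize import linprog

    n, m = C.shape[0], A.shape[0]
    # (3.6) as A_ub @ (x, u') <= b_ub, with x >= 0, u >= 0:
    #   Ax <= b   and   -2Cx - A'u' <= -p'   (i.e., df(x) <= uA, transposed).
    A_ub = np.block([[A,      np.zeros((m, m))],
                     [-2 * C, -A.T            ]])
    b_ub = np.concatenate([b, -p])
    res = linprog(c=np.zeros(n + m), A_ub=A_ub, b_ub=b_ub,
                  bounds=[(0, None)] * (n + m), method="highs")
    bounded_above = (res.status == 0)        # status 2 would signal infeasibility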
Finally, supposing in (3.10) that y is a solution of PI, that is, that f(y) = M, we have the

APPROXIMATION CRITERION: for any (x, u) satisfying (3.6),

(3.11) f(x) ≤ M ≤ f(x) − g(x, u).
²The Solution Criterion was originally obtained from Lemma 1 of [6]: our g(x, u) is φₓ⁰x⁰ − u⁰φᵤ⁰' in Kuhn and Tucker's notation, so that (3.8) is equivalent to part of the conditions of their Lemma 1. (3.6) is equivalent to the remaining conditions, and Theorems 1 and 2 of [6] then establish the Solution Criterion. Barankin and Dorfman [1] obtain the Criterion and the transformed problem PII in this way.
Thus PI is solved as soon as the maximum, zero, of the function g(x, u) is achieved under the constraints (3.6). It is, in fact, this latter problem, PII below, which is solved in Section 4. Also, the Approximation Criterion (3.11) yields the estimate f(x) ≤ M ≤ f(x) − g(x, u) for M when any (x, u) satisfying (3.6) is at hand.
As is customary in solving linear programs [3, p. 339], the constraints (3.6) will be written as equations by adjoining one non-negative variable y_i (i = 1, ..., m) to each of the m inequalities of Ax ≤ b, and one v_j (j = 1, ..., n) to each of the n inequalities of ∂f(x) ≤ uA (written for our purposes as −∂f(x) + uA ≥ 0, or, using matrix transposition, as 2Cx + A'u' ≥ p'), obtaining
(3.12) Ax + y = b,
       2Cx + A'u' − v' = p'

(y an m × 1 matrix, v a 1 × n matrix).
We also have, after substitution from (3.12) in (3.5),

(3.13) g(x, u) = −(vx + uy),

so that the Solution Criterion (3.8) takes the form

(3.14) vx + uy = 0.
This condition is a geometric dual of the condition stated at the beginning of this section, and
has been obtained from it by the duality theorem for linear programming.
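The identity (3.13) is a mechanical consequence of (3.12) and can be verified in code; a sketch continuing the data above (the particular (x, u) is hypothetical, and the identity holds whether or not (3.6) is satisfied):

    def slacks(x, u):
        y = b - A @ x                         # slack of Ax <= b
        v = 2 * (C @ x) + A.T @ u - p         # slack of 2Cx + A'u' >= p', i.e. v'
        return y, v

    def g(x, u):
        return (p - 2 * (x @ C)) @ x - u @ b  # df(x)x - ub, as in (3.5)

    x0, u0 = np.array([0.4, 0.2]), np.array([3.0])
    y0, v0 = slacks(x0, u0)
    assert abs(g(x0, u0) + (v0 @ x0 + u0 @ y0)) < 1e-12   # (3.13)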
³After Dantzig. Under the non-degeneracy hypothesis of [3], a basic feasible vector has exactly m + n positive components.
Since all the variables are non-negative, (3.14) holds if, and only if,

(3.15) x_j v_j = 0 (j = 1, ..., n), u_i y_i = 0 (i = 1, ..., m).

Now the conditions (3.15) entail that at most m + n components of any solution of PII are positive, so that we may hope to find a solution of PII among the basic feasible vectors of the constraint set for PII. We show in Section 4 that this is indeed the case.⁴
This remarkable consequence of the use of Lagrange multipliers might suggest the
direct use of the simplex method to maximize the objective function; however, as discussed in
Section 5, this is not possible. Nevertheless, the simplex method machinery is usefully
exploited in the "gradient-and-interpolation" method presented in the next section.
4. THE COMPUTATION
PRELIMINARIES: For present purposes the four interdependent vectors x, u, y, v
may be combined into a single constrained vector z. Accordingly let
(4.1) z = [x', u, y', v]', d = [b', p]', and

R = [ A    0   I    0 ]
    [ 2C   A'  0   −I ],
the two instances of I being identity matrices of appropriate rank. z, R, and d are then 2(m + n) × 1, (m + n) × 2(m + n), and (m + n) × 1 matrices, respectively; and the system of constraints (3.12) for PII assumes the form

(II) Rz = d, z ≥ 0.
For z = [x', u, y', v]', let z̄ denote the 1 × 2(m + n) matrix

(4.2) z̄ = (v, y', u, x').

The new objective function may be conveniently expressed using this linear operation, for by (3.13)

(4.3) g(x, u) = −(vx + uy) = −½ z̄z.
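In code, the data of PII and the bar operation take the following concrete form (a sketch continuing the arrays defined above):

    I_m, I_n = np.eye(m), np.eye(n)
    R = np.block([[A,     np.zeros((m, m)), I_m,              np.zeros((m, n))],
                  [2 * C, A.T,              np.zeros((n, m)), -I_n            ]])
    d = np.concatenate([b, p])               # [b', p']'

    def bar(z):                              # the linear operation (4.2)
        x, u, y, v = np.split(z, [n, n + m, n + 2 * m])
        return np.concatenate([v, y, u, x])

    z0 = np.concatenate([x0, u0, y0, v0])
    assert np.allclose(R @ z0, d)                          # constraints (II)
    assert abs(g(x0, u0) + 0.5 * (bar(z0) @ z0)) < 1e-12   # (4.3)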
Further, for z = [x', u, y', v]' and Z = [X', U, Y', V]' feasible with respect to (II) above, z̄Z = Z̄z, and

(4.4) (Z̄ − z̄)(Z − z) ≥ 0.

The first relation is evident. The second relation is given by

(Z̄ − z̄)(Z − z) = 4 (X − x)' C (X − x) ≥ 0,

using (3.12) and the positive semidefiniteness of C.
In these terms, the final version of the Lagrange multiplier problem obtained in the last section may be stated as

PII: Find z attaining the maximum, zero, of the inner product g = −½ z̄z under the constraints (II).
By the Boundedness Criterion, PI has a (finite) solution if, and only if, PII is feasible; by the Solution Criterion, x solves PI if, and only if, z = [x', u, y', v]' solves PII for some y, u, and v.
Although this has not been necessary before, for application of the simplex method we
suppose that the constraint equations (II) have, where necessary, been multiplied by −1, so that
the right side is non-negative.
THE ALGORITHM: The process below will yield a basic feasible vector solution of
PII. Phase I initiates the process, while Phase II is iterated until it yields the required vector.
Phase I: The constraints (II) are tested for feasibility. (Most commonly employed is the artificial basis [App. c].) If they are feasible, a basic feasible vector z₁ is produced with which to begin Phase II. If not, the last n equations may be discarded and the remainder similarly tested for feasibility. If these are feasible, then the quadratic problem is feasible, but unbounded above; otherwise, the quadratic problem is infeasible.
Phase II: This phase is defined inductively: At the kth instance of Phase II we have the feasible (II) vector w_k which does not solve PII, and also a basic feasible vector z_k with which to start the simplex machinery. (At the first instance, let w₁ = z₁.)
Employ the simplex technique in the maximization of the linear form −w̄_k z, obtaining a sequence of basic feasible vectors z¹ = z_k, z², z³, ... until some zʰ satisfies either

(4.5) z̄ʰ zʰ = 0, or

(4.6) w̄_k zʰ ≤ ½ w̄_k w_k.
⁵In the likely event that the constraints (II) are degenerate, "=" may occur here for a while, but not for long. Dantzig's method for handling degeneracy [2, 4] is exceptionally easy to use here, owing to the presence (except for sign) of an identity matrix in the constraint matrix.
(a) Interpolation. If some zʰ satisfies (4.5), then zʰ solves PII (by (4.3) and the Solution Criterion) and the process terminates. Otherwise, writing w = w_k and Z for the first zʰ satisfying (4.6), set z_{k+1} = Z and take for the next instance

(4.7) w_{k+1} = W = w + ρ(Z − w),

where ρ (0 ≤ ρ ≤ 1) is chosen so as to minimize W̄W, that is, to maximize g on the segment joining w and Z.
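The whole of Phase II can then be sketched as follows. For brevity the linear maximization is handed to a generic LP solver instead of an early-stopped simplex pass, and the exact test (4.5) is replaced by a tolerance; scipy is assumed, together with R, d, and bar from the sketches above.

    from scipy.optimize import linprog

    def phase2(z1, max_iter=500, tol=1e-10):
        # Gradient-and-interpolation for PII: maximize g = -bar(z) z / 2
        # subject to Rz = d, z >= 0, starting from a feasible z1.
        w = np.asarray(z1, dtype=float)
        N = w.size
        for _ in range(max_iter):
            if bar(w) @ w <= tol:            # g(w) = 0: w solves PII
                return w
            # Maximize the linear form -bar(w) z over (II), i.e. minimize bar(w) z.
            res = linprog(c=bar(w), A_eq=R, b_eq=d,
                          bounds=[(0, None)] * N, method="highs")
            Z = res.x                        # a basic feasible vector of (II)
            if bar(Z) @ Z <= tol:            # (4.5): Z itself solves PII
                return Z
            # Interpolation (4.7): minimize bar(W) W over W = w + rho (Z - w).
            delta = Z - w
            denom = bar(delta) @ delta       # nonnegative, by (4.4)
            rho = 1.0 if denom <= tol else min(1.0, (bar(w) @ (w - Z)) / denom)
            w = w + rho * delta
        return w

Given a basic feasible vector z₁ from Phase I, phase2(z1) plays the part of the iteration described above; the x-part of the result is then (approximately) a solution of PI.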
(b) Convergence. Dropping subscripts as in (a), when Z = zʰ satisfying (4.6) is obtained in the kth instance of Phase II, we have

(4.8) W̄W = w̄w + 2ρ w̄(Z − w) + ρ² (Z̄ − w̄)(Z − w)
         = w̄w + ρ w̄(Z − w) + ρ [ρ (Z̄ − w̄)(Z − w) − w̄(w − Z)]
         ≤ w̄w − ½ ρ w̄w
         = (1 − ½ ρ) w̄w,

the inequality holding because w̄(Z − w) ≤ −½ w̄w by (4.6), while ρ may be chosen so that the bracketed term is not positive.
Letting F be the (compact) convex hull of the set of all basic feasible vectors of (II), let

(4.9) L = Max {(z̄¹ − z̄²)(z¹ − z²) | z¹, z² in F}.
If ρ < 1, then

ρ = w̄(w − Z) / (Z̄ − w̄)(Z − w) ≥ ½ w̄w / L, by (4.6) and (4.9),

so that

(4.10) W̄W ≤ (1 − w̄w/4L) w̄w.

Resuming subscripts, let a_k = w̄_k w_k / 4L. Then (4.10) becomes

(4.11) a_{k+1} ≤ (1 − a_k) a_k.
1/a_{k+1} ≥ 1/[(1 − a_k) a_k] = (1 + a_k + a_k² + ⋯)/a_k ≥ 1/a_k + 1,

so that

1/a_k ≥ 1/a₁ + k − 1.

Thus

(4.12) a_k ≤ a₁ / (1 + (k − 1) a₁).
Hence w̄_k w_k approaches zero at worst like 1/k, and the convergence is proved.
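The recursion (4.11) and the bound (4.12) are easy to watch directly; a small sketch (the starting value a₁ is hypothetical):

    # Iterate (4.11), taken with equality as the worst case, and check (4.12).
    a1 = 0.9
    a = a1
    for k in range(2, 10001):
        a = (1.0 - a) * a
        assert a <= a1 / (1.0 + (k - 1) * a1) + 1e-15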
(c) Finiteness. Now suppose that the linear program of Phase II never yields a basic feasible vector which solves PII; then in particular no z_k obtained in the iteration of Phase II solves PII. Since each w_k is a convex combination of {z_{k'} | k' ≤ k}, there is a point of accumulation w⁰ of {w_k}, belonging to the convex hull of the finite set of basic feasible vectors {z_k | z̄_k z_k > 0}, for which w̄⁰w⁰ = 0; yet w⁰ = Σ t_k z_k with t_k ≥ 0, Σ t_k = 1, so that w̄⁰w⁰ = Σ_{j,k} t_j t_k z̄_j z_k > 0 (each term is non-negative and the diagonal terms are positive): a contradiction. The proof that Phase II terminates in a solution is complete.
(d) Approximation. Some iterations of Phase II can be avoided if only a solution of PI to a predetermined degree of approximation is desired; for the Approximation Criterion (3.11) yields that, where x_k is the x-part of w_k,

(4.13) f(x_k) ≤ M ≤ f(x_k) + ½ w̄_k w_k.
5. OBITER DICTA
We briefly consider here some alternative computational proposals which aim directly
to exploit the fact that some basic feasible vector of (II) is a solution of the problem PII by
using the simplex change of basis to pass from a basic feasible vector to adjacent basic feasible
vectors.
Perhaps the most attractive possibility is that of the existence of a function defined for
basic feasible vectors which assumes its minimum at a solution and which can always be
decreased with a single change of basis if the solution has not been reached. In view of the Solution Criterion (3.8), the function −g(x, u) = ½ z̄z naturally merits consideration from this angle, as well as the integer-valued function which counts the number of positive components of z̄ that coincide in position with the positive components of z; these functions have been
studied by Barankin and Dorfman [l]. Quadratic programming problems may be constructed,
however, in which certain non-solving basic feasible vectors occur as "local minima" for these
functions, which cannot be decreased with a single change of basis. Whether or not a function
of the desired type exists is not known.
One could also systematically explore all the basic feasible vectors of (II) for a solution, changing basis to pass from vertex to adjacent vertex in the graph consisting of all the basic feasible vectors and their connecting edges in (II). Such a process has been proposed by Charnes [2] for obtaining all solutions of a linear programming problem, given one. The amount of information that must be recorded if a graph of unknown design is to be traced out has been studied as the "labyrinth problem" [5]. The tracing process becomes less efficient (in terms of edges retraced) the less the data recorded. A minimum record is a list of all vertices already traversed. The best available upper bound [7] for the number of vertices of (II) being of the order of the binomial coefficient C(2(m + n), m + n), it seems that such an approach to PII would be infeasible for large-scale problems because of the excessive demands made on the memory of a high-speed computer.
6. CONCAVE PROGRAMMING
The gradient-and-interpolation method of Section 4, applied there to the quadratic concave function g(x, u), may be equally well applied to the problem of maximizing a more general concave function f, satisfying the hypothesis (A) below, whose variables are constrained by linear relations. Generalized Lagrange multipliers do not seem to have the same utility here that they do in the quadratic case, since the constraint ∂f(x) ≤ uA entailed by their use is no longer linear, and thus the features which ensured the termination of the quadratic programming process in a finite number of steps are not found here.
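In outline, the routine is the same as Phase II, with f in place of g and a one-dimensional search on f along the segment. The sketch below assumes scipy, and the arguments f and grad_f (the function and its 1 × n gradient) are hypothetical.

    import numpy as np
    from scipy.optimize import linprog, minimize_scalar

    def concave_program(f, grad_f, A, b, x1, max_iter=200, tol=1e-10):
        # Gradient-and-interpolation for PC: maximize concave f over x >= 0, Ax <= b.
        x = np.asarray(x1, dtype=float)
        n = x.size
        for _ in range(max_iter):
            grad = grad_f(x)                 # the 1 x n matrix df(x)
            # Maximize df(x) y over the constraint set; hypothesis (A) below
            # guarantees this linear program is bounded.
            res = linprog(c=-grad, A_ub=A, b_ub=b,
                          bounds=[(0, None)] * n, method="highs")
            z = res.x
            if grad @ (z - x) <= tol:        # the estimate for M is then tight
                return x
            # Interpolate: maximize f on the segment joining x and z.
            r = minimize_scalar(lambda t: -f(x + t * (z - x)),
                                bounds=(0.0, 1.0), method="bounded").x
            x = x + r * (z - x)
        return x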
Let f be a concave function of n variables possessing continuous second derivatives. The concave programming problem is:

PC: Maximize f(x) under the constraints (I): x ≥ 0, Ax ≤ b,

∂²f(x) denoting below the n × n matrix (∂²f(x)/∂x_i ∂x_j). The problem will be supposed to satisfy the hypothesis

(A): For each feasible x, {∂f(x) y | y feasible} is bounded above.
In particular, (A) holds if the constraint set is bounded, and in any case it implies that f is bounded above on the constraint set, since for x fixed we have [App. e] the consequence of concavity

(6.2) f(w) ≤ f(x) + ∂f(x)(w − x).

From (6.2), for any feasible w,

f(w) ≤ f(x) + ∂f(x)(w − x) ≤ f(x) + Max {∂f(x) y | y feasible} − ∂f(x) x.
Since, moreover,
⁶Of course s ≤ L. For this method, s should be chosen as small as possible, but we have no suggestions for finding it.
(6.9) a_{k+1} ≤ (1 − a_k) a_k,

which is the same as (4.11) with w̄_k w_k / 4L replaced by a_k = (M − f(x_k)) / 4L, so that (M − f(x_k)) / 4L converges to zero at worst like 1/k, and f(x_k) → M.
Finally, we have in analogy to the Approximation Criterion of Section 3 the estimate for M given by (6.7):

f(x_k) ≤ M ≤ f(x_k) + ∂f(x_k) y_k, where y_k = z_{k+1} − x_k.

This estimate improves throughout the process; we show that ∂f(x_k) y_k → 0. Let δ_k = Min {∂f(x_k) y_k, L}. Then, for some w between x_k and x_k + τ y_k,
7. APPENDIX
(a) The duality theorem for linear programming asserts, where A, b, c are, respectively, m × n, m × 1, 1 × n matrices:

sup {cx | x ≥ 0, Ax ≤ b} = inf {ub | u ≥ 0, uA ≥ c},

the supremum (infimum) over an empty set being −∞ (+∞); the extrema are assumed if finite.
(b) The simplex method for solving the linear program Max {cx | x ≥ 0, Ax = b} employs at each step a basis consisting of m columns of A such that their totality B is nonsingular and B⁻¹b ≥ 0. The vector having components B⁻¹b, with appropriate zeroes adjoined, is a basic feasible vector. A change of basis replaces one column of B by another column from A in such a way as to increase the objective cx on the basic feasible vector. This is continued until the maximum is attained. The constraints are non-degenerate if always B⁻¹b > 0. The non-degeneracy assumption of [3] has been removed [4].
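In code, the basic feasible vector attached to a basis is one linear solve (numpy assumed; the index set cols, a hypothetical choice of m columns of A, stands for the basis):

    import numpy as np

    def basic_feasible_vector(A, b, cols):
        # Return the basic feasible vector for basis columns `cols`,
        # or None if B^{-1} b >= 0 fails.
        B = A[:, cols]                       # m x m, assumed nonsingular
        xB = np.linalg.solve(B, b)
        if np.any(xB < 0):
            return None
        x = np.zeros(A.shape[1])
        x[cols] = xB
        return x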
(c) An initial basic feasible vector for the linear program is found by applying the simplex method to the expanded problem Max {−Σ y_i | x ≥ 0, y ≥ 0, Ax + Iy = b}, x = 0, y = b ≥ 0 being an initial basic feasible vector for this problem. If Max < 0, then the original problem is not feasible; otherwise, for the Max, y = 0, so x is a basic feasible vector. This formulation is used with a further modification to eliminate degeneracy [2, 4].
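A sketch of (c), with the simplex details delegated to a generic solver (scipy assumed; b is taken non-negative, as in the text):

    import numpy as np
    from scipy.optimize import linprog

    def initial_basic_feasible_vector(A, b):
        # Artificial-basis Phase I for {x >= 0, Ax = b}:
        # solve Max {-sum(y) | x >= 0, y >= 0, Ax + Iy = b}.
        m, n = A.shape
        res = linprog(c=np.concatenate([np.zeros(n), np.ones(m)]),
                      A_eq=np.hstack([A, np.eye(m)]), b_eq=b,
                      bounds=[(0, None)] * (n + m), method="highs")
        if res.status != 0 or res.fun > 1e-9:    # Max < 0: original problem infeasible
            return None
        return res.x[:n]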
(d) The function f (having continuous second derivatives) on the convex set S is concave if, and only if, for all x, y in S and 0 ≤ p ≤ 1, using h(p) = f(x + p(y − x)), we have h''(p) ≤ 0; whence

(7.1) (y − x)' ∂²f(w) (y − x) ≤ 0

for all w between x and y is a necessary and sufficient condition for concavity. If S has interior points, (7.1) is equivalent to the condition that the matrix ∂²f(w) be negative semidefinite.

If f(x) = px − x'Cx, then ∂²f = −2C, so that f is concave for all x if, and only if, C is positive semidefinite.
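Concavity of the quadratic objective can thus be checked by inspecting the eigenvalues of the symmetrized C; a one-line sketch (numpy assumed, C as above):

    # f(x) = px - x'Cx is concave iff (C + C')/2 is positive semidefinite.
    is_concave = bool(np.all(np.linalg.eigvalsh(0.5 * (C + C.T)) >= -1e-12))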
(e) For f as in (d), if x, y ∈ S, then by Taylor's theorem f(y) = f(x) + ∂f(x)(y − x) + ½ (y − x)' ∂²f(w) (y − x) for some w between x and y, so that

(7.2) f(y) ≤ f(x) + ∂f(x)(y − x)

is a necessary condition for concavity of f [6, Lemma 3]. If the left side of (7.1) is positive for some x, y, w, then x̄, ȳ can be chosen sufficiently close to w that (ȳ − x̄)' ∂²f(w̄) (ȳ − x̄) > 0 for all w̄ between x̄ and ȳ, so that (7.2) fails, whereby it is also sufficient.
(f) Since, in general, if f is concave then {x | f(x) ≥ c} is convex for all c, the solution set {x ∈ S | f(x) = Max f} of any concave program on a convex set S is convex.
(g) If C is positive semidefinite and symmetric and x'Cx = 0, then for all μ and z,

0 ≤ (x + μz)' C (x + μz) = 2μ x'Cz + μ² z'Cz,

so that x'Cz = 0 for all z, whence x'C = (Cx)' = 0.
(h) If x and y belong to the solution set of the quadratic problem, then for all 0 ≤ p ≤ 1,
REFERENCES
[1] Barankin, E. W., and Dorfman, R., Toward Quadratic Programming, O.N.R. Logistics Projects at Columbia Univ. and U.C.L.A., Univ. of California at Berkeley.
[2] Charnes, A., Cooper, W. W., and Henderson, A., An Introduction to Linear Programming,
Wiley, 1953.
[3] Dantzig, G. B., "Maximization of a linear function of variables subject to linear inequalities,"
Activity Analysis of Production and Allocation (T. C. Koopmans ed.), Wiley, 1951.
[4] Dantzig, G. B., Orden, A., and Wolfe, P., "The generalized simplex method for minimizing
a linear form under linear inequality restraints," Pacific J. Math. 5, 183-195 (June 1955).
[5] König, D., Theorie der endlichen und unendlichen Graphen, Leipzig, 1936.
[6] Kuhn, H. W., and Tucker, A. W., "Nonlinear programming," Proceedings of the Second
Berkeley Symposium on Mathematical Statistics and Probability, 481-492 (1951).
[7] Saaty, T. L., "The number of vertices of a polyhedron," Amer. Math. Monthly 62, 326-331 (May 1955).
ADDITIONAL BIBLIOGRAPHY
(Items 4-7 discuss computational proposals.)
1. Debreu, G., "Definite and semidefinite quadratic forms," Econometrica 20, 295-300 (1952).
2. Mann, H. B., "Quadratic forms with linear constraints," Amer. Math. Monthly 50, 430-433 (1943).
3. Phipps, C. G., "Maxima and minima under restraint," Amer. Math. Monthly 59, 230-235 (1952).
4. Arrow, K. J., and Hurwicz, L., "A gradient method for approximating saddle points and
constrained maxima," RAND P-223 (June 1951). To appear in Proceedings of the Third
Berkeley Symposium on Mathematical Statistics and Probability.
5. Charnes, A., and Lemke, C. E., "Minimization of non-linear separable convex functionals
(Computational theory of linear programming IV)," O.N.R. Research Memorandum No. 16,
Carnegie Institute of Technology (May, 1954).
6. Manne, A. S., "Concave programming for gasoline blends," RAND P-383 (April 1953).
7. Markowitz, H., "The optimization of quadratic functions subject to linear constraints,"
Naval Research Logistics Quarterly, Vol. 3, Nos. 1, 2, 1956.
* * *