The Kuhn-Tucker Conditions
Hao Zhang
1. General Case
$$\begin{aligned}
\text{minimize} \quad & f(\mathbf{x}), \quad \mathbf{x} \in \mathbb{R}^n \\
\text{such that} \quad & g_j(\mathbf{x}) \ge 0, \quad j = 1, \dots, n_g \\
& h_j(\mathbf{x}) = 0, \quad j = 1, \dots, n_e
\end{aligned}$$
In general, this problem may have several local minima; only under the convexity conditions discussed in Section 3 is a local minimum guaranteed to be the global one.

Consider first the problem with equality constraints only. We define the Lagrangian function

$$L(\mathbf{x}, \boldsymbol{\lambda}) = f(\mathbf{x}) - \sum_{j=1}^{n_e} \lambda_j h_j(\mathbf{x})$$

where the $\lambda_j$ are Lagrange multipliers.
At a stationary point of the Lagrangian

$$\frac{\partial L}{\partial x_i} = \frac{\partial f}{\partial x_i} - \sum_{j=1}^{n_e} \lambda_j \frac{\partial h_j}{\partial x_i} = 0, \quad i = 1, \dots, n$$

$$\frac{\partial L}{\partial \lambda_j} = -h_j(\mathbf{x}) = 0, \quad j = 1, \dots, n_e$$
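As a concrete check of these conditions, the following minimal Python (sympy) sketch solves the stationarity system for an illustrative toy problem (minimize $f = x^2 + y^2$ subject to $h = x + y - 1 = 0$; the problem data are assumptions for illustration only):

```python
import sympy as sp

x, y, lam = sp.symbols('x y lambda1')

f = x**2 + y**2      # illustrative objective
h = x + y - 1        # illustrative equality constraint, h(x) = 0

L = f - lam * h      # Lagrangian, as defined above

# Stationarity with respect to x, y and the multiplier
eqs = [sp.diff(L, v) for v in (x, y, lam)]
print(sp.solve(eqs, (x, y, lam), dict=True))
# [{x: 1/2, y: 1/2, lambda1: 1}]
```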
These conditions require that the gradients of the constraints be linearly independent; at a point whose constraint gradients are linearly dependent, the multipliers may fail to exist or may not be unique. A point at which the constraint gradients are linearly independent is called a regular point.

The method of Lagrange multipliers can be extended to inequality constraints by adding slack variables. That is, the inequality constraints are written as
$$g_j(\mathbf{x}) - t_j^2 = 0, \quad j = 1, \dots, n_g$$
where $t_j$ is a slack variable which measures how far the $j$th constraint is from being critical.
The Lagrangian function then becomes

$$L(\mathbf{x}, \mathbf{t}, \boldsymbol{\lambda}) = f - \sum_{j=1}^{n_g} \lambda_j \left( g_j - t_j^2 \right)$$
Differentiating the Lagrangian function with respect to $\mathbf{x}$, $\boldsymbol{\lambda}$, and $\mathbf{t}$ we obtain
$$\frac{\partial L}{\partial x_i} = \frac{\partial f}{\partial x_i} - \sum_{j=1}^{n_g} \lambda_j \frac{\partial g_j}{\partial x_i} = 0, \quad i = 1, \dots, n$$

$$\frac{\partial L}{\partial \lambda_j} = -\left( g_j - t_j^2 \right) = 0, \quad j = 1, \dots, n_g$$

$$\frac{\partial L}{\partial t_j} = 2 \lambda_j t_j = 0, \quad j = 1, \dots, n_g$$
The last of these equations shows that either a constraint is active ($t_j = 0$) or its multiplier vanishes: if a constraint is not critical (so that the corresponding slack variable is non-zero) then the corresponding multiplier $\lambda_j$ must be zero. As in the equality-constrained case, these conditions hold at a regular point. Note that for inequality constraints a regular point is one at which the gradients of the active constraints are linearly independent.
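The two branches of $2\lambda_j t_j = 0$ can be seen symbolically. In the sketch below, the one-variable problem (minimize $f = (x-2)^2$ subject to $g = x \ge 0$) is illustrative; sympy returns the inactive-constraint solution with $\lambda = 0$ and an active-constraint candidate whose negative multiplier disqualifies it:

```python
import sympy as sp

x, lam, t = sp.symbols('x lambda t', real=True)

f = (x - 2)**2                 # illustrative objective
g = x                          # illustrative constraint, g(x) >= 0

L = f - lam * (g - t**2)       # Lagrangian with slack variable t

eqs = [sp.diff(L, x),          # 2(x - 2) - lambda = 0
       sp.diff(L, lam),        # -(g - t^2) = 0
       sp.diff(L, t)]          # 2 lambda t = 0
for sol in sp.solve(eqs, (x, lam, t), dict=True):
    print(sol)
# x = 2 with lambda = 0 (constraint inactive, t^2 = 2), and
# x = 0 with lambda = -4 (rejected: the multiplier must be nonnegative)
```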
These results are summarized in the Kuhn-Tucker conditions: a point $\mathbf{x}$ is a candidate minimum only if a set of nonnegative $\lambda_j$'s may be found such that

$$\frac{\partial L}{\partial x_i} = \frac{\partial f}{\partial x_i} - \sum_{j=1}^{n_g} \lambda_j \frac{\partial g_j}{\partial x_i} = 0, \quad i = 1, \dots, n \qquad (1)$$

is satisfied, with $\lambda_j = 0$ for every inactive constraint.
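Numerically, searching for the nonnegative $\lambda_j$'s is a nonnegative least-squares problem. In the sketch below the helper `kt_multipliers` and the sample gradients are hypothetical, not from these notes; it simply tests whether $\nabla f$ lies in the nonnegative span of the active constraint gradients:

```python
import numpy as np
from scipy.optimize import nnls

def kt_multipliers(grad_f, active_grads, tol=1e-8):
    """Look for nonnegative lambda_j with grad_f = sum_j lambda_j grad_g_j;
    active_grads holds the gradients of the active constraints."""
    A = np.column_stack(active_grads)
    lambdas, resid = nnls(A, grad_f)     # nonnegative least squares
    return lambdas, resid < tol          # small residual => KT satisfied

# Illustrative data: grad f = (2, 1), active gradients (1, 0) and (0, 1).
lam, ok = kt_multipliers(np.array([2.0, 1.0]),
                         [np.array([1.0, 0.0]), np.array([0.0, 1.0])])
print(lam, ok)    # [2. 1.] True
```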
A geometrical interpretation of the Kuhn-Tucker conditions is illustrated in Fig. 1 for the case of two constraints. $\nabla g_1$ and $\nabla g_2$ denote the gradients of the two constraints, which are orthogonal to the respective constraint surfaces. Clearly from Fig. 1, any vector which forms an acute angle with $-\nabla f$ will also form an acute angle with either $-\nabla g_1$ or $-\nabla g_2$.
A feasible direction $\mathbf{s}$, one that does not immediately violate the active constraints, may be characterized as

$$\mathbf{s}^T \nabla g_j \ge 0, \quad j \in I_A$$

where $I_A$ is the set of active constraints. Equality is permitted only for linear constraints. For the objective function to decrease along $\mathbf{s}$ we also need

$$\mathbf{s}^T \nabla f < 0$$
Multiplying Eq. (1) by $s_i$ and summing over $i$ we obtain

$$\mathbf{s}^T \nabla f = \sum_{j=1}^{n_g} \lambda_j \, \mathbf{s}^T \nabla g_j$$
Since the $\lambda_j$ are nonnegative and $\mathbf{s}^T \nabla g_j \ge 0$ for a feasible direction, it follows that $\mathbf{s}^T \nabla f \ge 0$: at a Kuhn-Tucker point it is impossible to find a direction with a negative slope for the objective function that does not violate the constraints. However, it may still be possible to find a direction along which, to first order, neither the objective nor the active constraints change,

$$\mathbf{s}^T \nabla f = \mathbf{s}^T \nabla g_j = 0, \quad j \in I_A$$

Depending on the higher-order terms, moving along this direction could reduce the objective function without violating the constraints. This is why the Kuhn-Tucker conditions are necessary but not sufficient for optimality.
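This conclusion, that no feasible descent direction exists at a KT point, can be spot-checked by random sampling. The multipliers and constraint gradients below are made-up data chosen to satisfy Eq. (1):

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up KT point data: grad f is a nonnegative combination of the
# active constraint gradients, exactly as in Eq. (1).
grad_g = np.array([[1.0, 0.0],      # grad g1
                   [0.0, 1.0]])     # grad g2
lam = np.array([2.0, 1.0])          # nonnegative multipliers
grad_f = lam @ grad_g

# Every feasible direction (s^T grad g_j >= 0 for all active j) has
# s^T grad f = sum_j lam_j (s^T grad g_j) >= 0: no feasible descent.
for _ in range(1000):
    s = rng.standard_normal(2)
    if np.all(grad_g @ s >= 0):
        assert s @ grad_f >= 0
print("no feasible descent direction found")
```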
When the number of active constraints is not equal to the number of design variables, second-order information is needed to confirm a minimum. The Hessian of the Lagrangian is

$$\nabla^2 L = \nabla^2 f - \sum_{j=1}^{n_e} \lambda_j \nabla^2 h_j$$

The sufficient condition for optimality is that

$$\mathbf{s}^T (\nabla^2 L)\, \mathbf{s} > 0$$

for all directions $\mathbf{s}$ satisfying

$$\mathbf{s}^T \nabla g_j = 0, \quad \text{when } g_j = 0 \text{ and } \lambda_j > 0$$

$$\mathbf{s}^T \nabla g_j \ge 0, \quad \text{when } g_j = 0 \text{ and } \lambda_j = 0$$
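For the active constraints with strictly positive multipliers, one way to apply this test is to restrict $\nabla^2 L$ to the null space of their gradients and check that the reduced matrix is positive definite. The sketch below does only that (it ignores the cone condition for constraints with $\lambda_j = 0$), and the matrices are illustrative:

```python
import numpy as np
from scipy.linalg import null_space

def second_order_check(hess_L, active_grads):
    """Check s^T (hess L) s > 0 for directions s tangent to the active
    constraints with positive multipliers (rows of active_grads)."""
    A = np.atleast_2d(np.array(active_grads, dtype=float))
    Z = null_space(A)                    # basis of the tangent directions
    if Z.size == 0:                      # no tangent directions: vertex point
        return True
    reduced = Z.T @ hess_L @ Z           # Hessian restricted to tangent space
    return bool(np.all(np.linalg.eigvalsh(reduced) > 0))

# Illustrative data: Hessian diag(2, -1), one active constraint with
# gradient (0, 1); the tangent direction (1, 0) sees curvature +2 > 0.
print(second_order_check(np.diag([2.0, -1.0]), [[0.0, 1.0]]))   # True
```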
2. Example
Minimize

$$f(x_1, x_2) = -x_1^3 + 10 x_1 - 2 x_2^2 - 2 x_2^3$$

subject to

$$g_1 = 10 - x_1 x_2 \ge 0,$$

$$g_2 = x_1 \ge 0,$$

$$g_3 = 10 - x_2 \ge 0$$
The Hessian of the Lagrangian (only $g_1$ is nonlinear, so only $\lambda_1$ appears) is

$$\nabla^2 L = \begin{pmatrix} -6 x_1 & \lambda_1 \\ \lambda_1 & -4 - 12 x_2 \end{pmatrix}$$
Assume first that $g_1$ is active while $g_2$ and $g_3$ are not, so that $\lambda_2 = \lambda_3 = 0$. The stationarity conditions and the active constraint give the equations

$$-3 x_1^2 + 10 + \lambda_1 x_2 = 0$$

$$-4 x_2 - 6 x_2^2 + \lambda_1 x_1 = 0$$

$$10 - x_1 x_2 = 0$$
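These equations can be solved numerically, for instance with sympy's `nsolve` and a rough starting guess:

```python
import sympy as sp

x1, x2, lam1 = sp.symbols('x1 x2 lambda1', real=True)

eqs = [-3*x1**2 + 10 + lam1*x2,       # dL/dx1 = 0
       -4*x2 - 6*x2**2 + lam1*x1,     # dL/dx2 = 0
       10 - x1*x2]                    # g1 active

# Starting from a rough guess; converges to about (3.847, 2.599, 13.24)
print(sp.nsolve(eqs, (x1, x2, lam1), (4, 2.5, 13)))
```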
The only solution of these equations that satisfies the constraints on $x_1$ and $x_2$ is $x_1 = 3.847$, $x_2 = 2.599$, with $\lambda_1 = 13.24$.
Since $\lambda_1$ is positive, this point satisfies the Kuhn-Tucker conditions. Substituting into the Hessian of the Lagrangian at that point,

$$\nabla^2 L = \begin{pmatrix} -23.08 & 13.24 \\ 13.24 & -35.19 \end{pmatrix}$$

which is negative definite, so the sufficiency condition is not satisfied and the point is not a minimum.
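Both the definiteness of this matrix and the failure of the sufficiency test along the direction tangent to $g_1$ are easy to confirm numerically:

```python
import numpy as np

hess_L = np.array([[-23.08, 13.24],
                   [13.24, -35.19]])
print(np.linalg.eigvalsh(hess_L))        # both eigenvalues negative

# Direction tangent to g1 = 10 - x1*x2 at (3.847, 2.599):
grad_g1 = np.array([-2.599, -3.847])     # (dg1/dx1, dg1/dx2) = (-x2, -x1)
s = np.array([-grad_g1[1], grad_g1[0]])  # orthogonal to grad_g1
print(s @ hess_L @ s)                    # negative: s^T (hess L) s < 0
```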
Next we consider the possibility that $g_1$ is not active, so that $\lambda_1 = 0$. With $\lambda_2$ and $\lambda_3$ denoting the multipliers of $g_2$ and $g_3$, the stationarity conditions become

$$-3 x_1^2 + 10 - \lambda_2 = 0$$

$$-4 x_2 - 6 x_2^2 + \lambda_3 = 0$$
With $g_2$ active ($x_1 = 0$) and $\lambda_3 = 0$, the second equation yields two candidate points. Both points satisfy the KT conditions for a minimum, but not the sufficient conditions: the feasible directions tangent to the active constraints ($x_1 = 0$ is the only one) have the form $\mathbf{s}^T = (0, a)$, and it is easy to check that $\mathbf{s}^T \nabla^2 L \, \mathbf{s} < 0$. It is also easy to check that these points are indeed not minima by reducing $x_2$ slightly.
It is easy to check that $\nabla^2 L$ is negative definite in this case, so that the sufficiency condition cannot be satisfied.
Now the KT conditions are satisfied, and the number of active constraints is equal to the number of design variables, so that no direction tangent to all the active constraints remains and the second-order test is not needed.
3. Convex Problems
For convex problems, the Kuhn-Tucker conditions are not only necessary but also sufficient for a global minimum.
A function $f$ is convex if

$$f[\alpha \mathbf{x}_2 + (1 - \alpha) \mathbf{x}_1] \le \alpha f(\mathbf{x}_2) + (1 - \alpha) f(\mathbf{x}_1), \quad 0 < \alpha < 1$$

for any two points $\mathbf{x}_1$ and $\mathbf{x}_2$.
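This defining inequality can be spot-checked numerically; `convexity_holds` below is a hypothetical helper that samples the inequality on a grid of $\alpha$ values:

```python
import numpy as np

rng = np.random.default_rng(1)

def convexity_holds(f, x1, x2, n_alpha=50):
    """Spot-check f[a*x2 + (1-a)*x1] <= a*f(x2) + (1-a)*f(x1) for 0 < a < 1."""
    alphas = np.linspace(0.01, 0.99, n_alpha)
    return all(f(a * x2 + (1 - a) * x1) <= a * f(x2) + (1 - a) * f(x1) + 1e-12
               for a in alphas)

f_convex = lambda x: float(x @ x)        # ||x||^2 is convex
f_concave = lambda x: -float(x @ x)      # -||x||^2 is not convex

x1, x2 = rng.standard_normal(3), rng.standard_normal(3)
print(convexity_holds(f_convex, x1, x2))    # True
print(convexity_holds(f_concave, x1, x2))   # False
```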
The optimization problem is convex if the objective function $f$ is convex, all the inequality constraints $g_j$ are concave, and the equality constraints $h_j$ are linear.
Convexity is important in structural optimization, as we often approximate optimization problems by a sequence of convex approximations.