Slack Variable
• Consider the problem

    $\min_x \; f(x)$   (1)

    subject to $h_j(x) \ge 0, \quad j = 1(1)r$   (2)

• Introduce slack variables $\theta_j$ defined by $\theta_j^2 = h_j(x) \ge 0$, $j = 1, 2, \ldots, r$, so that each inequality in (2) becomes the equality constraint

    $h_j(x) - \theta_j^2 = 0, \quad j = 1(1)r$   (3)
• The Lagrangian is $L(x, \lambda, \theta) = f(x) + \sum_{j=1}^{r} \lambda_j \bigl( h_j(x) - \theta_j^2 \bigr)$, and the necessary conditions for a minimum are

    $\frac{\partial L}{\partial x_i} = \left[ \frac{\partial f}{\partial x_i} + \sum_{j=1}^{r} \lambda_j \frac{\partial h_j}{\partial x_i} \right]_{x = x^*,\; \lambda = \lambda^*} = 0, \quad i = 1(1)n$   (4)
    $\frac{\partial L}{\partial \lambda_j} = \Bigl[ h_j(x) - \theta_j^2 \Bigr]_{x = x^*,\; \theta = \theta^*} = 0, \quad j = 1(1)r$   (5)

    $\frac{\partial L}{\partial \theta_j} = -2 \lambda_j^* \theta_j^* = 0, \quad j = 1(1)r$   (6)
• From the last expression (6), it is obvious that either $\lambda_j^* = 0$ or $\theta_j^* = 0$, or both.
• Case 1: $\lambda_j^* = 0$, $\theta_j^* \ne 0$.
  In this case, the constraint $h_j(x) \ge 0$ is ignored (inactive), since $h_j(x^*) = (\theta_j^*)^2 > 0$.
  If all $\lambda_j^* = 0$, then (4) implies that $\nabla f(x^*) = 0$, which means that $x^*$ is the unconstrained minimum.
• Case 2: $\lambda_j^* \ne 0$, $\theta_j^* = 0$.
  Here the constraint is active, $h_j(x^*) = 0$, and the minimum lies on the boundary of the feasible region.
• Case 3: $\theta_j^* = 0$ and $\lambda_j^* = 0$ for all $j$.
  The constraints are active, but the unconstrained minimum itself happens to lie on the boundary. A small symbolic check of these cases is sketched below.
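As a computational illustration, the stationarity system (4)-(6) can be solved symbolically. The toy problem below (minimize $x^2$ subject to $x - 1 \ge 0$) is an assumed example chosen for this sketch, not one from the notes:

    # Minimal sketch (assumed toy problem): minimize f(x) = x**2
    # subject to h(x) = x - 1 >= 0, via the slack-variable system (4)-(6).
    import sympy as sp

    x, lam, th = sp.symbols('x lambda theta', real=True)
    f = x**2
    h = x - 1
    L = f + lam * (h - th**2)           # Lagrangian with slack variable theta

    eqs = [sp.diff(L, v) for v in (x, lam, th)]   # conditions (4)-(6)
    print(sp.solve(eqs, (x, lam, th), dict=True))
    # The real solution is x = 1, lambda = -2, theta = 0: the constraint is
    # active (Case 2) and the minimum sits on the boundary x = 1.

Note the negative multiplier: with this sign convention ($L = f + \lambda(h - \theta^2)$ for constraints $h \ge 0$), a boundary minimum carries a non-positive multiplier.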
• Example: consider the one-dimensional function $f(x) = (x - a)^2 + b$ subject to the constraint $x \ge c$.

[Figure: plot of $f(x) = (x - a)^2 + b$, showing the unconstrained minimum at $x = a$ with value $b$, and the boundary $x = c$ of the feasible region $x \ge c$.]
• If $c > a$ then the minimum lies at $x = c$, which is on the boundary of the feasible region defined by $x \ge c$.
• Introducing a single slack variable, $\theta^2 = x - c \ge 0$, the constraint becomes

    $x - c - \theta^2 = 0$
and we can write the Lagrangian as

    $L(x, \lambda, \theta) = (x - a)^2 + b + \lambda (x - c - \theta^2)$

where $\lambda$ is the Lagrange multiplier.
• The necessary conditions are

    $\frac{\partial L}{\partial x} = 2(x^* - a) + \lambda^* = 0$   (7)

    $\frac{\partial L}{\partial \lambda} = x^* - c - (\theta^*)^2 = 0$   (8)

    $\frac{\partial L}{\partial \theta} = -2 \lambda^* \theta^* = 0$   (9)
• From (9), assume $\lambda^* = 0$, $\theta^* \ne 0$. Then from (7) $x^* = a$, and from (8) $a - c - (\theta^*)^2 = 0$, which gives $(\theta^*)^2 = a - c$; hence $\theta^*$ is real only for $c \le a$.
• In this case $L(x^*, \lambda^*, \theta^*) = f(x^*)$, and since $\lambda^* = 0$,

    $\left. \frac{\partial L}{\partial x} \right|_{x^*} = \left. \frac{\partial f}{\partial x} \right|_{x^*} = 0,$

so $x^* = a$ is the unconstrained minimum.
• Conversely, assuming $\theta^* = 0$, (8) gives $x^* = c$ and (7) gives $\lambda^* = -2(c - a)$; this boundary solution is the one that applies when $c > a$.
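Both cases can be checked symbolically. In this sketch the numerical values $a = 2$, $b = 0$, and $c \in \{1, 3\}$ are arbitrary choices for illustration:

    # Sketch: verify the two cases of the 1-D example.
    # a = 2, b = 0 are arbitrary; c is varied to cover c < a and c > a.
    import sympy as sp

    x, lam, th = sp.symbols('x lambda theta', real=True)
    a, b = 2, 0
    for c in (1, 3):
        L = (x - a)**2 + b + lam * (x - c - th**2)
        eqs = [sp.diff(L, v) for v in (x, lam, th)]   # conditions (7)-(9)
        sols = sp.solve(eqs, (x, lam, th), dict=True)
        # solve returns all stationary candidates; the minimizer is the
        # feasible candidate with the smallest objective value.
        for s in sols:
            print(c, s[x], (s[x] - a)**2 + b)
    # c = 1 (< a): minimum at x* = a = 2 (interior, lambda* = 0).
    # c = 3 (> a): minimum at x* = c = 3 (boundary, lambda* = -2(c - a)).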
Example: As a two-dimensional example, consider

    $\min_x \; f(x) = (x_1 - 3)^2 + 2(x_2 - 5)^2$

    subject to $g(x) = 2x_1 + 3x_2 - 5 \le 0$
[Figure: contour plot of $f(x) = (x_1 - 3)^2 + 2(x_2 - 5)^2$ in the $(x_1, x_2)$ plane, together with the constraint line $g(x) = 2x_1 + 3x_2 - 5 = 0$.]
• Unless one were to draw a very accurate contour plot, it is hard to find the minimum by such a graphical method.
• It is obvious from the graph, though, that the minimum will lie on the line $g(x) = 0$.
• We introduce a single slack variable, $\theta^2$, and construct the Lagrangian as

    $L(x, \lambda, \theta) = (x_1 - 3)^2 + 2(x_2 - 5)^2 + \lambda (2x_1 + 3x_2 - 5 + \theta^2).$

• The inequality constraint was changed to the equality constraint $g(x) + \theta^2 = 0$, using the slack variable $\theta^2 = -g(x) \ge 0$.
• The necessary conditions become

    $\frac{\partial L}{\partial x_1} = 2(x_1^* - 3) + 2\lambda^* = 0$   (10)

    $\frac{\partial L}{\partial x_2} = 4(x_2^* - 5) + 3\lambda^* = 0$   (11)

    $\frac{\partial L}{\partial \theta} = 2 \theta^* \lambda^* = 0$   (12)

    $\frac{\partial L}{\partial \lambda} = 2x_1^* + 3x_2^* - 5 + (\theta^*)^2 = 0$   (13)
From (10) and (11):

    $x_1^* = 3 - \lambda^*, \qquad x_2^* = 5 - \tfrac{3}{4} \lambda^*$

substituting these expressions in (13) we have:

    $2(3 - \lambda^*) + 3 \left( 5 - \tfrac{3}{4} \lambda^* \right) - 5 + (\theta^*)^2 = 0$

    $16 - \tfrac{17}{4} \lambda^* + (\theta^*)^2 = 0.$

From (12), either $\lambda^* = 0$ or $\theta^* = 0$; taking $\lambda^* = 0$ would require $(\theta^*)^2 = -16 < 0$, which is impossible, so $\theta^* = 0$ and $\lambda^* = \tfrac{64}{17}$. Hence

    $x_1^* = -\tfrac{13}{17}, \qquad x_2^* = \tfrac{37}{17}$
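The same system (10)-(13) can be handed to a computer algebra system; a minimal sketch with sympy:

    # Sketch: solve the stationarity conditions (10)-(13) symbolically.
    import sympy as sp

    x1, x2, lam, th = sp.symbols('x1 x2 lambda theta', real=True)
    L = (x1 - 3)**2 + 2*(x2 - 5)**2 + lam*(2*x1 + 3*x2 - 5 + th**2)

    eqs = [sp.diff(L, v) for v in (x1, x2, th, lam)]
    print(sp.solve(eqs, (x1, x2, lam, th), dict=True))
    # Real solution: x1* = -13/17, x2* = 37/17, lambda* = 64/17, theta* = 0,
    # i.e. the constraint is active and the minimum lies on g(x) = 0.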
Kuhn-Tucker Conditions
• Consider the general problem of minimizing $f(x)$ subject to inequality constraints $g_i(x) \le 0$, $i = 1(1)r$. Note that an equality constraint $h(x) = 0$ can always be written as the two inequalities $h(x) \le 0$ and $h(x) \ge 0$, or $-h(x) \le 0$.
• Now assume that $f(x)$ and $g_i(x)$ are differentiable functions; the Lagrangian is:

    $L(x, \lambda) = f(x) + \sum_{i=1}^{r} \lambda_i g_i(x)$
The necessary conditions for $x^*$ to be the solution to the above problem are:

    $\frac{\partial}{\partial x_j} f(x^*) + \sum_{i=1}^{r} \lambda_i^* \frac{\partial}{\partial x_j} g_i(x^*) = 0, \quad j = 1, 2, \ldots, n$   (14)

    $g_i(x^*) \le 0, \quad i = 1(1)r$   (15)

    $\lambda_i^* g_i(x^*) = 0, \quad i = 1(1)r$   (16)

    $\lambda_i^* \ge 0, \quad i = 1(1)r$   (17)
• These are known as the Kuhn-Tucker stationary conditions; written compactly as:

    $\nabla_x L(x^*, \lambda^*) = 0$   (18)

    $\nabla_\lambda L(x^*, \lambda^*) = g(x^*) \le 0$   (19)

    $(\lambda^*)^T g(x^*) = 0$   (20)

    $\lambda^* \ge 0$   (21)
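Conditions (18)-(21) are easy to verify numerically at a candidate point. The helper below, kt_check, is an illustrative name and a minimal sketch (the tolerance and the hand-coded gradients are assumptions); it re-checks the solution of the earlier two-dimensional example:

    # Sketch: check the Kuhn-Tucker conditions (18)-(21) at a candidate point.
    # Example used: f = (x1-3)^2 + 2(x2-5)^2, g = 2x1 + 3x2 - 5 <= 0.
    import numpy as np

    def kt_check(x, lam, grad_f, grads_g, g_vals, tol=1e-8):
        stat = grad_f + sum(l*gg for l, gg in zip(lam, grads_g))
        return (np.allclose(stat, 0, atol=tol)        # (18) stationarity
                and all(g <= tol for g in g_vals)      # (19) feasibility
                and abs(np.dot(lam, g_vals)) <= tol    # (20) complementarity
                and all(l >= -tol for l in lam))       # (21) lam >= 0

    x = np.array([-13/17, 37/17]); lam = [64/17]
    grad_f = np.array([2*(x[0] - 3), 4*(x[1] - 5)])
    grads_g = [np.array([2.0, 3.0])]
    g_vals = [2*x[0] + 3*x[1] - 5]
    print(kt_check(x, lam, grad_f, grads_g, g_vals))   # True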
• If instead the objective is to be maximized (maximize $f(x)$ subject to $g_i(x) \le 0$), we can minimize $-f(x)$, and condition (14) becomes

    $-\frac{\partial}{\partial x_j} f(x^*) + \sum_{i=1}^{r} \lambda_i^* \frac{\partial}{\partial x_j} g_i(x^*) = 0, \quad j = 1, 2, \ldots, n$   (22)

or, multiplying through by $-1$,

    $\frac{\partial}{\partial x_j} f(x^*) + \sum_{i=1}^{r} (-\lambda_i^*) \frac{\partial}{\partial x_j} g_i(x^*) = 0, \quad j = 1, 2, \ldots, n.$   (23)

• The Kuhn-Tucker conditions therefore keep the same form, with the multipliers now non-positive:

    $\nabla_x L(x^*, \lambda^*) = 0$   (24)

    $\nabla_\lambda L(x^*, \lambda^*) = g(x^*) \le 0$   (25)

    $(\lambda^*)^T g(x^*) = 0$   (26)

    $\lambda^* \le 0$   (27)
Transformation via the Penalty Method
• From a practical computer-algorithm point of view, we are not much further along than when we started. The general problem is still

    $\min_x \; f(x), \quad x \in \mathbb{R}^n$   (28)

    subject to $g_j(x) \le 0, \quad j = 1(1)J$   (29)

    $h_k(x) = 0, \quad k = 1(1)K$   (30)
• The idea of the penalty method is to minimize, without constraints, a composite function $P(x; R)$ formed by adding a penalty term to the objective. The penalty term is a function of $R$ and the constraint functions, $g(x)$, $h(x)$.
• The purpose of adding this term to the objective function is to penalize the objective whenever an infeasible set of decision variables $x$ is chosen.
Use of a parabolic penalty term
• A parabolic penalty term is used to handle the equality constraints, $h(x)$:

    $\min_x \; P(x; R) = f(x) + \sum_{k=1}^{K} R_k \{ h_k(x) \}^2.$   (31)
• As the penalty parameters $R_k \to \infty$, more weight is attached to satisfying the $k$-th constraint.
• If a specific parameter is chosen as zero, say $R_k = 0$, then the $k$-th equality constraint is ignored.
• The user specifies the value of each $R_k$ according to the importance of satisfying the corresponding equality constraint.
Example:

    $\min_x \; x_1^2 + x_2^2$

    subject to: $x_2 = 1$

Using a single penalty parameter $R$, the penalty function is $P(x; R) = x_1^2 + x_2^2 + R (x_2 - 1)^2$, and the necessary conditions give

    $\frac{\partial P}{\partial x_1} = 2 x_1^* = 0 \;\Rightarrow\; x_1^* = 0$

    $\frac{\partial P}{\partial x_2} = 2 x_2^* + 2R (x_2^* - 1) = 0 \;\Rightarrow\; x_2^* = \frac{R}{1 + R}$

so that

    $x_2^* = \lim_{R \to \infty} \frac{R}{1 + R} = 1.$
[Figure: example of the use of a parabolic penalty function. The minimizer of $P(x; R)$ moves from the unconstrained minimum at $R = 0$, through $R = 1$ and $R = 2$, toward the constraint line $x_2 = 1$, which is reached only as $R \to \infty$.]
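This behaviour is easy to reproduce numerically; a minimal sketch (scipy's default BFGS minimizer and the chosen schedule of $R$ values are incidental choices):

    # Sketch: minimize P(x; R) = x1^2 + x2^2 + R*(x2 - 1)^2 for increasing R.
    # Closed form: x1* = 0, x2* = R/(1 + R).
    import numpy as np
    from scipy.optimize import minimize

    def P(x, R):
        return x[0]**2 + x[1]**2 + R * (x[1] - 1)**2

    for R in (0, 1, 2, 10, 100, 1000):
        res = minimize(P, x0=np.zeros(2), args=(R,))
        print(R, res.x, R / (1 + R))   # numeric minimizer vs. closed-form x2*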
Inequality constrained problems

    $\min_x \; f(x), \quad x \in \mathbb{R}^n$   (32)

    subject to $g_j(x) \le 0, \quad j = 1(1)J$   (33)

• A suitable penalty function is

    $P(x; R) = f(x) + \sum_{i=1}^{J} R_i [g_i(x)]^2 u(g_i)$   (34)

where

    $u(g_i) = \begin{cases} 0 & \text{if } g_i(x) \le 0 \\ 1 & \text{if } g_i(x) > 0 \end{cases}$   (35)
• The term $[g_i(x)]^2 u(g_i)$ is sometimes called the bracket operator and is denoted $\langle g_i(x) \rangle^2$.
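In code, the step function (35) and the bracket operator are one-liners (a minimal sketch; the function names are illustrative):

    # Sketch: the step function u of (35) and the bracket operator <g>^2.
    def u(g):
        return 1.0 if g > 0 else 0.0        # 0 when feasible, 1 when violated

    def bracket_sq(g):
        return g**2 * u(g)                  # [g]^2 * u(g), zero when feasible

    print(bracket_sq(-2.0), bracket_sq(3.0))   # 0.0 (satisfied), 9.0 (violated)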
[Figure: the penalty term $\Omega = \sum_{i=1}^{J} R_i [g_i(x)]^2 u(g_i)$ plotted against $g_i$: it is zero where the constraint is satisfied ($g_i \le 0$) and grows parabolically, at a rate set by $R_i$, where the constraint is violated ($g_i > 0$).]
• An alternative for inequality constraints is the inverse penalty function, applied to problems of the form

    minimize $f(x)$   (36)

    subject to $g_j(x) \ge 0, \quad j = 1(1)J$   (37)

• For a single constraint, the penalty term is

    $\Omega = \frac{R}{g(x)}$

where $g(x)$ is the single constraint.

[Figure: plot of $\Omega = R/g(x)$ against $g(x)$: the penalty grows without bound as $g(x) \to 0^+$ from inside the feasible region, and becomes negative for $g(x) < 0$.]
• As can be deduced from the figure, it is important to start only from feasible points; because of this, the method is classified as an interior method.
• Example:

    $\min_x \; f(x) = (x_1 - 4)^2 + (x_2 - 4)^2$

    subject to: $g(x) = 5 - x_1 - x_2 \ge 0$

• Using the parabolic penalty

    $P(x; R) = f(x) + R [g(x)]^2 u(-g(x))$

we have

    $P(x; R) = (x_1 - 4)^2 + (x_2 - 4)^2 + R (5 - x_1 - x_2)^2 u(-g(x))$

• Thus, when $g(x) < 0$, i.e. when the decision variables are infeasible, a penalty of $R(5 - x_1 - x_2)^2$ is applied.
• In the infeasible region, where the penalty is active, the necessary conditions are

    $\frac{\partial P}{\partial x_1} = 2(x_1^* - 4) + 2R(5 - x_1^* - x_2^*)(-1) = 0$

    $\frac{\partial P}{\partial x_2} = 2(x_2^* - 4) + 2R(5 - x_1^* - x_2^*)(-1) = 0$

• Subtracting the second condition from the first gives

    $2(x_1^* - 4) - 2(x_2^* - 4) = 0 \;\Rightarrow\; x_1^* = x_2^*.$

• Substituting $x_2^* = x_1^*$ into the first condition,

    $(x_1^* - 4) - R(5 - 2x_1^*) = 0$

and therefore

    $x_1^* = \frac{5R + 4}{2R + 1}$

so that

    $\lim_{R \to \infty} x_1^* = \frac{5}{2}$

and the constrained minimum is:

    $x^* = \left( \frac{5}{2}, \frac{5}{2} \right)$
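A numerical sketch of this exterior penalty sequence (the starting point, the $R$ schedule, and the use of Nelder-Mead are incidental choices):

    # Sketch: exterior penalty for f = (x1-4)^2 + (x2-4)^2, g = 5 - x1 - x2 >= 0.
    # Closed form: x1* = x2* = (5R + 4)/(2R + 1) -> 5/2 as R -> infinity.
    import numpy as np
    from scipy.optimize import minimize

    def P(x, R):
        g = 5 - x[0] - x[1]
        pen = R * g**2 if g < 0 else 0.0    # penalize only infeasible points
        return (x[0] - 4)**2 + (x[1] - 4)**2 + pen

    for R in (1, 10, 100, 1000):
        res = minimize(P, x0=np.array([4.0, 4.0]), args=(R,),
                       method='Nelder-Mead')
        print(R, res.x, (5*R + 4) / (2*R + 1))   # numeric vs. closed form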
• Using instead the inverse penalty

    $P(x; R) = f(x) + R [g(x)]^{-1}$

we have

    $P(x; R) = (x_1 - 4)^2 + (x_2 - 4)^2 + R (5 - x_1 - x_2)^{-1}$

• Whether or not $g(x) < 0$, i.e. whether or not the decision variables are infeasible, a penalty of $R(5 - x_1 - x_2)^{-1}$ is applied.
• We must make sure that we remain feasible during the execution of any
algorithm we may employ.
• Proceeding analytically to find the necessary conditions for a minimum, we have

    $\frac{\partial P}{\partial x_1} = 2(x_1^* - 4) + \frac{R}{(5 - x_1^* - x_2^*)^2} = 0$

    $\frac{\partial P}{\partial x_2} = 2(x_2^* - 4) + \frac{R}{(5 - x_1^* - x_2^*)^2} = 0$

• By symmetry $x_1^* = x_2^*$; substituting into the first condition and multiplying through by $(5 - 2x_1^*)^2 / 2$ gives the cubic

    $4(x_1^*)^3 - 36(x_1^*)^2 + 105 x_1^* - 100 + \frac{R}{2} = 0$

• This equation can be solved for its roots, and the minimum of $P(x; R)$ for particular values of $R$ can be found.
      R        x1* = x2*      f(x*)
    ---------------------------------
    100         0.5864       23.3053
     10         1.7540       10.0890
      1         2.2340        6.32375
      0.1       2.4113        5.0479
      0.01      2.4714        4.6732
      0.001     2.4909        4.5548
      0         2.5000        4.5000
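The table can be reproduced with a short script (a sketch under assumptions: Nelder-Mead, warm starts, and returning infinity outside the feasible region are illustrative choices):

    # Sketch: interior (inverse) penalty P = f + R/g, from a feasible point.
    import numpy as np
    from scipy.optimize import minimize

    def P(x, R):
        g = 5 - x[0] - x[1]
        if g <= 0:
            return np.inf                   # reject infeasible points (interior)
        return (x[0] - 4)**2 + (x[1] - 4)**2 + R / g

    x0 = np.array([0.5, 0.5])               # feasible start: g(x0) = 4 > 0
    for R in (100, 10, 1, 0.1, 0.01, 0.001):
        res = minimize(P, x0, args=(R,), method='Nelder-Mead')
        x0 = res.x                          # warm-start the next, smaller R
        print(R, res.x, (res.x[0] - 4)**2 + (res.x[1] - 4)**2)
    # The printed values track the table above, approaching (2.5, 2.5) as R -> 0.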