OD5 PL Dynamic Programming A
PL #5 Dynamic Programming
Alexandra Moutinho
The cost per unit produced with overtime for each week is $100 more than for regular time. The cost
of storage is $50 per unit for each week it is stored. There is already an inventory of two widgets on
hand currently, but the company does not want to retain any widgets in inventory after the 3 weeks.
Management wants to know how many units should be produced in each week to minimize the
total cost of meeting the delivery schedule.
Solve this problem using dynamic programming.
Resolution:
The decisions that need to be made are the number of units to produce each week, so the
decision variables are
$x_n$ = number of widgets to produce in week $n$, for $n = 1, 2, 3$.
To choose the value of $x_n$, it is necessary to know the number of widgets already on hand at the beginning of week $n$. Therefore, the state of the system is
$s_n$ = number of widgets on hand at the beginning of week $n$, for $n = 1, 2, 3$.
Because of the shipment of three widgets to the customer each week, note that
$s_{n+1} = s_n + x_n - 3$.
Also note that two widgets are on hand at the beginning of week 1, so
$s_1 = 2$.
To introduce symbols for the data given in the above table for each week n, let
$c_n$ = unit production cost in regular time,
$r_n$ = maximum regular-time production,
$m_n$ = maximum total production.

| $n$ | $r_n$ | $m_n$ | $c_n$ |
|---|---|---|---|
| 1 | 2 | 4 | 300 |
| 2 | 3 | 5 | 500 |
| 3 | 1 | 3 | 400 |
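We wish to minimize the total cost of meeting the delivery schedule, so our measure of performance is total cost. For $n = 1, 2, 3$, let
$p_n(s_n, x_n)$ = cost incurred in week $n$ when starting the week in state $s_n$ and producing $x_n$ widgets.
Thus,
$$p_n(s_n, x_n) = c_n x_n + 100 \max(0, x_n - r_n) + 50 \max(0, s_n + x_n - 3).$$
Also let
$f_n^*(s_n)$ = optimal cost for week $n$ onward (through the end of week 3) when starting week $n$ in state $s_n$, for $n = 1, 2, 3$.
Then
$$f_n^*(s_n) = \min_{3 - s_n \le x_n \le m_n} \left[ p_n(s_n, x_n) + f_{n+1}^*(s_n + x_n - 3) \right],$$
where $f_4^*(s_4) = 0$ for $s_4 = 0$.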
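The weekly cost can be sketched in Python as a quick way to check individual table entries. The function and constant names here are illustrative assumptions, not part of the original solution; the numbers come from the data table and the problem statement.

```python
# Weekly data from the table (week -> value). Names are illustrative.
UNIT_COST = {1: 300, 2: 500, 3: 400}   # c_n: regular-time cost per unit
REGULAR_MAX = {1: 2, 2: 3, 3: 1}       # r_n: max regular-time production
DEMAND = 3                             # widgets shipped every week
OVERTIME_EXTRA = 100                   # extra cost per unit made in overtime
STORAGE_COST = 50                      # cost per unit stored for one week


def week_cost(n, s, x):
    """Cost of week n when s widgets are on hand and x are produced."""
    overtime_units = max(0, x - REGULAR_MAX[n])
    stored_units = max(0, s + x - DEMAND)
    return (UNIT_COST[n] * x
            + OVERTIME_EXTRA * overtime_units
            + STORAGE_COST * stored_units)


# Week 2, nothing on hand, produce 4: one overtime unit, one unit stored.
print(week_cost(2, 0, 4))
```

Adding the optimal cost of the following week to such a value gives one entry of the stage tables.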
Recall that the company does not want to retain any widgets in inventory after the three weeks, so $s_4 = 0$. Therefore, the optimal policy for week 3 obviously is to produce just enough widgets to have a total of three to ship to the customer, so
$x_3^* = 3 - s_3$.
Given $x_3^*$ and $f_3^*(s_3)$, we can use the recursive relationship to solve for the optimal policy for week 2 and then for week 1. These calculations are shown below.
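The backward recursion can also be run mechanically. The following is a minimal sketch, assuming the cost structure described above (regular-time cost, a 100 overtime surcharge, a 50 storage charge, three widgets shipped per week, and no inventory allowed after week 3); all identifiers are hypothetical.

```python
from functools import lru_cache

UNIT_COST = {1: 300, 2: 500, 3: 400}   # c_n
REGULAR_MAX = {1: 2, 2: 3, 3: 1}       # r_n
TOTAL_MAX = {1: 4, 2: 5, 3: 3}         # m_n
DEMAND, OVERTIME_EXTRA, STORAGE_COST = 3, 100, 50
LAST_WEEK = 3
INF = float("inf")


def week_cost(n, s, x):
    """Cost of week n when s widgets are on hand and x are produced."""
    return (UNIT_COST[n] * x
            + OVERTIME_EXTRA * max(0, x - REGULAR_MAX[n])
            + STORAGE_COST * max(0, s + x - DEMAND))


@lru_cache(maxsize=None)
def solve(n, s):
    """Return (optimal cost from week n onward, optimal production in week n).

    The cost is INF when no feasible plan ends with zero inventory.
    """
    if n > LAST_WEEK:
        # Boundary condition: only an empty warehouse is acceptable.
        return (0, None) if s == 0 else (INF, None)
    best_cost, best_x = INF, None
    # Produce at least enough to ship DEMAND, at most the weekly capacity.
    for x in range(max(0, DEMAND - s), TOTAL_MAX[n] + 1):
        total = week_cost(n, s, x) + solve(n + 1, s + x - DEMAND)[0]
        if total < best_cost:
            best_cost, best_x = total, x
    return best_cost, best_x


print(solve(1, 2))  # start of week 1 with two widgets on hand
```

Memoizing on `(n, s)` means each state is evaluated once, which mirrors filling in one table row per state.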
For $n = 3$:

| $s_3$ | $f_3^*(s_3)$ | $x_3^*$ |
|---|---|---|
| 0 | 1,400 | 3 |
| 1 | 900 | 2 |
| 2 | 400 | 1 |
| 3 | 0 | 0 |
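For $n = 2$, each entry is $f_2(s_2, x_2) = p_2(s_2, x_2) + f_3^*(s_2 + x_2 - 3)$:

| $s_2$ \ $x_2$ | 0 | 1 | 2 | 3 | 4 | 5 | $f_2^*(s_2)$ | $x_2^*$ |
|---|---|---|---|---|---|---|---|---|
| 0 | (+) | (+) | (+) | 2,900 | 3,050 | 3,200 | 2,900 | 3 |
| 1 | (+) | (+) | 2,400 | 2,450 | 2,600 | 2,850 | 2,400 | 2 |
| 2 | (+) | 1,900 | 1,950 | 2,000 | 2,250 | 2,900(*) | 1,900 | 1 |
| 3 | 1,400 | 1,450 | 1,500 | 1,650 | 2,300(*) | 2,950(*) | 1,400 | 0 |

Sample calculations for $s_2 = 0$:
$$f_2(0,3) = p_2(0,3) + f_3^*(0 + 3 - 3) = 500 \cdot 3 + 100 \max(0, 3 - 3) + 50 \max(0, 0 + 3 - 3) + 1400 = 2900$$
$$f_2(0,4) = p_2(0,4) + f_3^*(0 + 4 - 3) = 500 \cdot 4 + 100 \max(0, 4 - 3) + 50 \max(0, 0 + 4 - 3) + 900 = 3050$$
$$f_2(0,5) = p_2(0,5) + f_3^*(0 + 5 - 3) = 500 \cdot 5 + 100 \max(0, 5 - 3) + 50 \max(0, 0 + 5 - 3) + 400 = 3200$$
(+) Options that do not assure minimum delivery.
(*) Are these values necessary? No, because they imply that $s_4 > 0$, since $s_3 = s_2 + x_2 - 3 > 3$. The values shown for these options are the week-2 cost alone.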
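The week-2 table can be regenerated from the week-3 results with a short script. This is a sketch under the assumption that the week-3 optimal costs are 1,400, 900, 400 and 0 for starting inventories 0 to 3; the helper name and the string markers are illustrative.

```python
# Optimal week-3 costs by starting inventory, taken from the n = 3 table.
F3 = {0: 1400, 1: 900, 2: 400, 3: 0}
C2, R2, M2 = 500, 3, 5          # week-2 unit cost, regular max, total max
DEMAND = 3


def stage2_entry(s2, x2):
    """Return f_2(s2, x2), or a marker for options that are ruled out."""
    if s2 + x2 < DEMAND:
        return "(+)"             # cannot ship three widgets this week
    s3 = s2 + x2 - DEMAND
    if s3 not in F3:
        return "(*)"             # leaves more than 3 in stock, so s4 > 0
    week_cost = C2 * x2 + 100 * max(0, x2 - R2) + 50 * s3
    return week_cost + F3[s3]


table = {(s2, x2): stage2_entry(s2, x2)
         for s2 in range(4) for x2 in range(M2 + 1)}

for s2 in range(4):
    row = [table[(s2, x2)] for x2 in range(M2 + 1)]
    best = min(v for v in row if isinstance(v, int))
    print(s2, row, "min =", best, "at x2 =", row.index(best))
```

Each printed row lists the entries for production levels 0 through 5, followed by the row minimum and the production level that attains it.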
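For $n = 1$:

| $s_1$ \ $x_1$ | 0 | 1 | 2 | 3 | 4 | $f_1^*(s_1)$ | $x_1^*$ |
|---|---|---|---|---|---|---|---|
| 2 | (+) | 3,200 | 3,050 | 3,000 | 2,950 | 2,950 | 4 |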
Hence, the optimal plan is to produce four widgets in period 1, store three of them until period 2,
and produce three in period 3, with a total cost of $2,950.
| Week | On hand at start | Produce | Deliver | Store |
|---|---|---|---|---|
| 1 | 2 | 4 | 3 | 3 |
| 2 | 3 | 0 | 3 | 0 |
| 3 | 0 | 3 | 3 | 0 |

Total cost: $2,950
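As a cross-check, a production plan can be simulated forward week by week. The sketch below assumes the same data as the table of weekly costs and capacities; `simulate` is a hypothetical helper, and it does not enforce the weekly capacity limits.

```python
UNIT_COST = {1: 300, 2: 500, 3: 400}   # regular-time cost per unit
REGULAR_MAX = {1: 2, 2: 3, 3: 1}       # max regular-time production
DEMAND, OVERTIME_EXTRA, STORAGE_COST = 3, 100, 50


def simulate(plan, start_inventory=2):
    """Return (total cost, final inventory) for a list of weekly productions."""
    inventory, total = start_inventory, 0
    for week, produced in enumerate(plan, start=1):
        available = inventory + produced
        if available < DEMAND:
            raise ValueError(f"week {week}: cannot ship {DEMAND} widgets")
        inventory = available - DEMAND          # what is left goes to storage
        total += (UNIT_COST[week] * produced
                  + OVERTIME_EXTRA * max(0, produced - REGULAR_MAX[week])
                  + STORAGE_COST * inventory)
    return total, inventory


print(simulate([4, 0, 3]))  # the plan found by the recursion
print(simulate([1, 3, 3]))  # producing just enough every week costs more
```

Comparing the two plans shows why front-loading production in the cheapest week pays off despite the overtime and storage charges.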
Let $x_1$ be the decision variable at stage 1 and $x_2$ be the decision variable at stage 2. The key to applying dynamic programming to this problem is to interpret the right-hand side of the constraint as the amount of a resource being made available to activities 1 and 2 (whose levels are $x_1$ and $x_2$). Then the state of the system entering stage 1 (before choosing $x_1$ and $x_2$) and entering stage 2 (before choosing $x_2$) is the amount of the resource still available for allocation to the remaining activities. Consequently, letting $s_n$ denote the state of the system entering stage $n$, we have $s_1 = 2$ and $s_2 = 2 - x_1^2$.
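For $n = 2$:
$$f_2^*(s_2) = \max_{x_2 \le s_2} x_2 = s_2, \quad \text{with } x_2^* = s_2.$$
For $n = 1$, with $s_1 = 2$:
$$f_1(2, x_1) = x_1^2 \, f_2^*(2 - x_1^2) = 2x_1^2 - x_1^4,$$
$$\frac{\partial f_1(2, x_1)}{\partial x_1} = -4x_1^3 + 4x_1 = 0 \;\Rightarrow\; x_1 = -1, 0, 1.$$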
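Computing the second partial derivative to determine which critical points are maxima and which are minima:
$$\frac{\partial^2 f_1(2, x_1)}{\partial x_1^2} = -12x_1^2 + 4.$$
Hence, the second derivative is $< 0$ for $x_1 = -1$ and $x_1 = 1$, and $> 0$ for $x_1 = 0$.
Since $f_1(2, -1) = f_1(2, 1) = 1$, whereas $f_1(2, -\sqrt{2}) = f_1(2, \sqrt{2}) = 0$ at the endpoints of the feasible region ($x_1^2 \le 2$), we have both $x_1^* = -1$ and $x_1^* = 1$ as global maximizers and $f_1^*(2) = 1$.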
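The stage-1 maximization can be checked numerically. This sketch assumes the stage-1 objective is $2x_1^2 - x_1^4$ on $x_1^2 \le 2$, which is the form consistent with the second derivative $-12x_1^2 + 4$ and the values quoted above; the function names are illustrative.

```python
import math


def f1(x1):
    """Assumed stage-1 objective with 2 units of resource available."""
    return 2 * x1**2 - x1**4


def second_derivative(x1):
    return -12 * x1**2 + 4


bound = math.sqrt(2)                       # feasible region: x1**2 <= 2
candidates = [-bound, -1.0, 0.0, 1.0, bound]
for x in candidates:
    print(f"x1 = {x:+.4f}  f1 = {f1(x):+.4f}  f1'' = {second_derivative(x):+.1f}")

# Compare the critical points and the endpoints directly.
best_value = max(f1(x) for x in candidates)
maximizers = [x for x in candidates if math.isclose(f1(x), best_value)]

# Independent check: a fine grid over the feasible interval.
grid = [-bound + 2 * bound * i / 2000 for i in range(2001)]
grid_max = max(f1(x) for x in grid)
print(maximizers, best_value, grid_max)
```

The grid search is only a sanity check: it should never exceed the value found at the critical points.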
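Returning to stage 2 with $x_1^* = \pm 1$ gives $s_2 = 2 - (x_1^*)^2 = 1$, so $x_2^* = s_2 = 1$. The optimal solutions are therefore $(x_1^*, x_2^*) = (-1, 1)$ and $(x_1^*, x_2^*) = (1, 1)$.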