DPOCexam2017 Solution BB

This document contains a final exam, with solutions, for a course on dynamic programming and optimal control. The exam lasts 150 minutes, permits no calculators, and consists of 4 problems covering optimal control of linear systems, dynamic programming with probabilistic forecasts, minimizing the expected value of a random variable, and maximizing the volume of a box under a constraint.

Final Exam January 25th, 2018

Dynamic Programming & Optimal Control (151-0563-01) Prof. R. D’Andrea

Solutions

Exam Duration: 150 minutes

Number of Problems: 4

Permitted aids: One A4 sheet of paper.


No calculators allowed.

Problem 1 [29 points]

a) Consider the system

x_{k+1} = 1ᵀu_k x_k + u_kᵀ R u_k,    k = 0, 1

where

1 = [1, 1]ᵀ,    R = [ 2  0
                      0  1 ].

Furthermore, the state x_k ∈ R and the control input u_k ∈ R². The cost function is given by

∑_{k=0}^{2} x_k.

Calculate an optimal policy µ*_1(x_1) using the dynamic programming algorithm and show that it is indeed optimal. Simplify the expression as much as possible. [4 points]

b) Consider the system

xk+1 = fk (xk , uk , wk ), k = 0, 1, ..., N − 1.

The cost function is given by

∑_{k=0}^{N−1} x_k².

At the beginning of each period k, we receive a prediction y_k that w_{k+2} will attain a probability distribution out of a given finite collection of distributions {p_{w_{k+2}|y_k}(·|1), p_{w_{k+2}|y_k}(·|2), ..., p_{w_{k+2}|y_k}(·|m)}. In particular, we receive a forecast that y_k = i and thus p_{w_{k+2}|y_k}(·|i) is used to generate w_{k+2}. Furthermore, the forecast itself has a given a-priori probability distribution, namely,

y_{k+1} = ξ_k,

where ξ_k is a random variable taking value i ∈ {1, 2, ..., m} with probability p_{ξ_k}(i). Given y_{k−2}, w_k is independent of all variables before time k; ξ_k is independent of all variables before time k.

Convert the above problem into the standard problem formulation of dynamic programming. In particular, write down the state vector, the system dynamics, and the disturbance vector with its probability density function (PDF) expressed as a function of the given PDFs. The dimension of the state space should be as small as possible. You do not have to solve the dynamic programming problem. [4 points]

c) Consider a discrete random variable x which is defined by the set of all its possible outcomes X = {1, 2, 3}, and a PDF p_x. Each element of X can only occur with probability 0, 0.5, or 1. The objective is to find the PDF that attains the minimum expected value of x:

minimize_{p_x ∈ P} E[x]    (1)

where P is the set of all possible PDFs of x. Define the state y_k as ∑_{i=1}^{k} p_x(i) for all k ≥ 1. [11 points]

i) Formulate an equivalent problem that matches the standard form to which the dynamic programming algorithm can directly be applied, that is, explicitly state the

• dynamics f_k(y_k, u_k, w_k) such that y_{k+1} = f_k(y_k, u_k, w_k), initial condition y_0, and what u_k and w_k correspond to in the original problem.
• number of stages N such that k = 0, ..., N − 1.
• state-space S_k such that y_k ∈ S_k.
• control-space U_k(y_k) such that u_k ∈ U_k(y_k).
• stage costs g_k(y_k, u_k, w_k) and terminal cost g_N(y_N), such that the total cost is ∑_{k=0}^{N−1} g_k(y_k, u_k, w_k) + g_N(y_N).

You do not have to solve the dynamic programming problem.

ii) Is it possible to convert the above problem to a deterministic shortest path problem? If so, draw the corresponding graph. In particular, draw all the vertices including the starting node and the terminal node, and all the edges with the associated arc lengths. If not, explain why.

d) Consider a rectangular box with side lengths l > 0, w > 0, and h > 0, as shown in Fig.
1. The problem is to determine the side lengths that maximize the volume of the box,
subject to the constraint l + w + h = 1. [10 points]

Figure 1: A rectangular box, also known as a right rectangular prism, or a cuboid.

i) Formulate an equivalent problem that matches the standard form to which the dynamic programming algorithm¹ can directly be applied, that is, explicitly state the

• dynamics f_k(x_k, u_k) such that x_{k+1} = f_k(x_k, u_k), initial condition x_0, and what x_k and u_k correspond to in the original problem.
• number of stages N such that k = 0, ..., N − 1.
• state-space S_k such that x_k ∈ S_k.
• control-space U_k(x_k) such that u_k ∈ U_k(x_k).
• stage costs g_k(x_k, u_k) and terminal cost g_N(x_N), such that the total cost is ∑_{k=0}^{N−1} g_k(x_k, u_k) + g_N(x_N).

You do not have to solve the dynamic programming problem.

ii) Is it possible to convert the above problem to a deterministic shortest path problem? If so, draw the corresponding graph. In particular, draw all the vertices including the starting node and the terminal node, and all the edges with the associated arc lengths. If not, explain why.

¹ You may replace min with max, and correspondingly the g_k(·) are rewards.

Solution 1

a) k = N = 2:

J_2(x_2) = x_2

k = 1:

J_1(x_1) = min_{u_1 ∈ R²} (x_1 + J_2(f_1(x_1, u_1)))
         = min_{u_1} (x_1 + 1ᵀu_1 x_1 + u_1ᵀ R u_1)

Setting the derivative of the minimand to zero:

1ᵀ x_1 + u_1ᵀ (2R) = 0
⇒ 2R u_1 = −1 x_1
⇒ u_1 = −(1/2) R⁻¹ 1 x_1 = −(1/4) [1, 2]ᵀ x_1

so µ*_1(x_1) = −(1/4)[1, 2]ᵀ x_1. Furthermore, the Hessian of the minimand is

∂²J_1(x_1)/∂u_1² = 2R.

Its eigenvalues are 4 and 2, which are positive, and thus the matrix is positive definite. The sufficient condition for optimality is therefore satisfied.
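The closed-form policy above can be sanity-checked numerically. Below is a minimal sketch (not part of the exam solution) that compares the closed-form minimizer u_1 = −(1/2)R⁻¹1 x_1 against a brute-force grid search over candidate inputs:

```python
import numpy as np

ones = np.array([1.0, 1.0])
R = np.array([[2.0, 0.0], [0.0, 1.0]])

def cost_to_go(x1, u1):
    """Candidate J1: x1 + J2(f1(x1, u1)) with J2(x2) = x2."""
    return x1 + (ones @ u1) * x1 + u1 @ R @ u1

def mu1_star(x1):
    """Closed-form optimal policy: u1 = -(1/2) R^{-1} 1 x1 = -(1/4)[1, 2] x1."""
    return -0.5 * np.linalg.solve(R, ones) * x1

x1 = 3.0
u_star = mu1_star(x1)  # expect [-0.75, -1.5] for x1 = 3
grid = np.linspace(-5.0, 5.0, 201)
grid_best = min(cost_to_go(x1, np.array([a, b])) for a in grid for b in grid)
assert cost_to_go(x1, u_star) <= grid_best + 1e-9
```

The grid search can only do as well as the closed-form minimizer, confirming the first-order condition.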

b) • Let s_k := y_{k−2} and r_k := y_{k−1}; then the augmented state vector is x̃_k := (x_k, y_k, r_k, s_k). Since the forecasts y_{k−2}, y_{k−1}, y_k are known at time k, we still have perfect state information.
• We define our new disturbance as w̃_k := (w_k, ξ_k), with probability distribution

p(w̃_k | x̃_k, u_k) = p(w_k, ξ_k | x_k, y_k, r_k, s_k, u_k)
                   = p(w_k | x_k, y_k, r_k, s_k, u_k) p(ξ_k | x_k, y_k, r_k, s_k, u_k)
                   = p(w_k | s_k) p(ξ_k).

Note that w_k depends only on x̃_k (in particular s_k), and ξ_k does not depend on anything.
• The dynamics therefore become

x̃_{k+1} = (x_{k+1}, y_{k+1}, r_{k+1}, s_{k+1}) = (f_k(x_k, u_k, w_k), ξ_k, y_k, r_k) =: f̃_k(x̃_k, u_k, w̃_k),

which now matches the standard form.
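As a sketch of the augmented update (using a hypothetical f_k, since the dynamics are left abstract in the problem), the whole transition can be written as one function:

```python
def augmented_step(x_tilde, u, w_tilde, f_k):
    """One step of the augmented system.
    x_tilde = (x_k, y_k, r_k, s_k), with r_k = y_{k-1} and s_k = y_{k-2};
    w_tilde = (w_k, xi_k): w_k drives the original dynamics and the fresh
    forecast xi_k becomes y_{k+1}; the old forecasts shift down one slot."""
    x, y, r, s = x_tilde
    w, xi = w_tilde
    return (f_k(x, u, w), xi, y, r)

# Illustrative scalar dynamics (hypothetical): x_{k+1} = x_k + u_k + w_k.
f = lambda x, u, w: x + u + w
print(augmented_step((0.0, 1, 2, 3), 1.0, (0.5, 7), f))  # (1.5, 7, 1, 2)
```

Note how y_k and r_k simply shift into the r and s slots, matching the definition above.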

c) i) • Stage index: there are 3 elements in the outcome set X, so there are 3 stages.
• State: y_k = ∑_{i=1}^{k} p_x(i) for k = 1, 2, 3, with y_0 = 0.
• State space: S_0 = {0}; S_k = {0, 1/2, 1}, k = 1, 2; S_3 = {1}.
• Control input: u_k = p_x(k + 1), k = 0, 1, 2.
• Dynamics: y_{k+1} = y_k + u_k.
• Control space: U_2(y_2) = {1 − y_2}; U_k(y_k) = {u ∈ R | u = n/2, n ∈ N, 0 ≤ u ≤ 1 − y_k}, k = 0, 1.
• Disturbance: there are no disturbances.
• Stage cost: g_k(y_k, u_k) = (k + 1) u_k, k = 0, 1, 2.
• Terminal cost: there is no terminal cost.

ii) Yes. See Figure 2: each node is a stage–state pair (k, y_k); the start node is (0, 0), and the node (3, 1) is connected to the terminal node T with arc length 0. For each admissible control u ∈ U_k(y_k), there is an edge from (k, y) to (k + 1, y + u) with arc length equal to the stage cost (k + 1)u; for example, (0, 0) → (1, 0.5) has length 0.5, (1, 0) → (2, 1) has length 2, and (2, 0.5) → (3, 1) has length 1.5.

Figure 2
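Since the state and control spaces are finite, the formulation can be cross-checked by brute force. A quick sketch (not part of the exam solution) that compares the DP recursion against enumeration of all admissible PDFs:

```python
from itertools import product

VALS = (0.0, 0.5, 1.0)  # admissible probability values

# Brute force: all PDFs on {1, 2, 3} with entries in VALS summing to 1.
pdfs = [p for p in product(VALS, repeat=3) if abs(sum(p) - 1.0) < 1e-9]
brute = min(sum((i + 1) * p[i] for i in range(3)) for p in pdfs)

# DP over y_k = cumulative probability, with stage cost (k + 1) * u_k.
def J(k, y):
    if k == 3:  # terminal: probabilities must sum to 1
        return 0.0 if abs(y - 1.0) < 1e-9 else float("inf")
    return min((k + 1) * u + J(k + 1, y + u)
               for u in VALS if y + u <= 1.0 + 1e-9)

assert abs(J(0, 0.0) - brute) < 1e-9
print(J(0, 0.0))  # 1.0, attained by p_x = (1, 0, 0)
```

Both approaches put all probability mass on the smallest outcome, giving E[x] = 1.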

d) The original problem can be written as follows:

maximize lwh
s.t. l + w + h = 1
l, w, h > 0

This is equivalent to

maximize ln(lwh) = ln(l) + ln(w) + ln(h)
s.t. l + w + h = 1
     l, w, h > 0

since ln(·) is a monotonically increasing function. We have seen its use in the Viterbi algorithm.

i) • Let u_k represent the length of side k, k = 0, 1, 2, and x_k represent the sum of the lengths of sides 0 to k − 1 (inclusive). Thus we start with x_0 = 0 and the dynamics are

x_{k+1} = x_k + u_k

• Since there are three sides, k = 0, 1, 2, and thus N = 3.
• S_0 = {0}, S_k = (0, 1], k = 1, 2, S_3 = {1}.
• U_k(x_k) = (0, 1 − x_k], k = 0, 1; U_2(x_2) = {1 − x_2}.
• g_k(x_k, u_k) = ln(u_k), g_N(x_N) = 0.



Method 2 (defining it as a one-stage problem)

Something like the following:
• Let u_0 = (l, w, h) be a tuple of the edge lengths. The dynamics x_k can be anything; they do not matter.
• k = 0, 1, so N = 1.
• The state space should be consistent with the dynamics.
• U_0 = {u | ‖u‖_1 = 1}
• g_0(x_0, u_0) = −lwh, g_N(x_N) = 0.

ii) No, the state space is not finite.
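Either formulation should recover the known optimum l = w = h = 1/3 with volume 1/27. A quick numerical check (not part of the exam solution) of the first, three-stage formulation on a discretized grid; the grid size N_GRID is an arbitrary choice here:

```python
import math
from functools import lru_cache

N_GRID = 300  # side lengths restricted to multiples of 1/N_GRID

@lru_cache(maxsize=None)
def V(k, x):
    """Max sum of ln(u) over sides k..2, with x grid units already used."""
    if k == 2:                 # the last side must take the remainder
        rem = N_GRID - x
        return math.log(rem / N_GRID) if rem > 0 else -math.inf
    remaining = 2 - k          # sides still to place after this one
    return max(math.log(u / N_GRID) + V(k + 1, x + u)
               for u in range(1, N_GRID - x - remaining + 1))

volume = math.exp(V(0, 0))
print(volume)  # 1/27, exact here since 1/3 lies on the grid
assert abs(volume - 1 / 27) < 1e-9
```

Since 300 is divisible by 3, the DP picks 100 grid units (length 1/3) per side, matching the analytic optimum.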

Problem 2 [13 points]

For problems marked with *: Answers left blank are worth 0 points. Each wrong answer is
worth -1 point. You do not have to explain your answer. Each correct answer is worth 1 point.
The minimum score of Problem 2 is 0.

a) True or False questions. You do not have to explain your answer. [3 points]

i)* In dynamic programming, every finite state problem can be converted to a determin-
istic shortest path problem.
ii)* In the Viterbi algorithm, we are given a measurement sequence Z_N = (z_1, ..., z_N), and we want to find the "most likely" state trajectory X_N = (x_0, ..., x_N). In particular, we solve for a maximum a-posteriori estimate X̂_N := (x̂_0, ..., x̂_N) where

X̂_N = arg max_{X_N} p(X_N | Z_N).

Let Z_k := (z_1, ..., z_k) and X_k := (x_1, ..., x_k) for some time k < N. The estimate X̂_k that maximizes p(X_k | Z_k) can always (in theory) be computed at the end of time k.
iii)* Consider any deterministic shortest path problem, and let −C be the smallest arc length, where C is positive. If we add C to every arc length, so that the smallest arc length becomes 0 and all arc lengths are thus non-negative, then we can always apply the label correcting algorithm to find the shortest path.

b) Suppose the label correcting algorithm was applied to a shortest path problem, producing
the following table:

Table 1

# Remove OPEN dS d1 d2 d3 d4 d5 d6 d7 d8 d9 d10 d11 dT


0 – S 0 ∞ ∞ ∞ ∞ ∞ ∞ ∞ ∞ ∞ ∞ ∞ ∞
1 S 1,2,3 0 3 2 1 ∞ ∞ ∞ ∞ ∞ ∞ ∞ ∞ ∞
2 3 1,2,4,6 0 3 2 1 3 ∞ 4 ∞ ∞ ∞ ∞ ∞ ∞
3 2 1,4,6,5 0 3 2 1 3 7 3 ∞ ∞ ∞ ∞ ∞ ∞
4 5 1,4,6,7,9 0 3 2 1 3 7 3 17 ∞ 9 ∞ ∞ ∞
5 4 1,6,7,9,5 0 3 2 1 3 6 3 17 ∞ 9 ∞ ∞ ∞
6 9 1,6,7,5,10,11 0 3 2 1 3 6 3 17 ∞ 9 10 10 ∞
7 11 1,6,7,5,10 0 3 2 1 3 6 3 17 ∞ 9 10 10 12
8 10 1,6,7,5 0 3 2 1 3 6 3 17 ∞ 9 10 10 11
9 5 1,6,7,9 0 3 2 1 3 6 3 17 ∞ 8 10 10 11
10 6 1,7,9 0 3 2 1 3 6 3 8 ∞ 8 10 10 11
11 7 1,9,8 0 3 2 1 3 6 3 8 9 8 10 10 11
12 8 1,9 0 3 2 1 3 6 3 8 9 8 10 10 10
13 1 9 0 3 2 1 3 6 3 8 9 7 10 10 10
14 9 10,11 0 3 2 1 3 6 3 8 9 7 8 8 10
15 11 10 0 3 2 1 3 6 3 8 9 7 8 8 10
16 10 - 0 3 2 1 3 6 3 8 9 7 8 8 9

Answer the following questions pertaining to the above table. You do not have to explain
your answer. [10 points]

i) What is the shortest path from S to T?

ii) Find three paths from S to node 8.
iii)* True or False: the associated graph can have more than 4 distinct paths from S to
node 8.
iv)* True or False: The Depth-First Search method is used to remove nodes from OPEN.
v)* True or False: c4,5 ≥ c2,5 . Recall ci,j is the edge length from node i to node j.
vi)* Which of the following is correct:
• c1,8 < ∞
• c1,8 = ∞
• There is insufficient data to determine c1,8 .
vii)* Which of the following is correct:
• c1,4 < 1
• c1,4 ≥ 1
• There is insufficient data to determine c1,4 .
viii)* Which of the following is correct:
• c6,7 = 5
• c6,7 ≠ 5
• There is insufficient data to determine c6,7 .

Solution 2
a) i) False
ii) True
iii) False

b) i) (S, 1, 9, 10, T).


ii) From the table we can deduce 4 possible paths:

• (S, 2, 6, 7, 8)
• (S, 3, 4, 5, 7, 8)
• (S, 3, 6, 7, 8)
• (S, 2, 5, 7, 8).
iii) True.
iv) False.
v) False.
vi) There is insufficient data to determine this.
vii) There is insufficient data to determine this.
viii) c6,7 = 5.
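For reference, the label correcting method that produced Table 1 can be sketched generically. The graph below is a made-up toy example, not the exam's graph (which is not fully recoverable from the table):

```python
from collections import deque

def label_correcting(graph, start, goal):
    """Generic label correcting method, FIFO (Bellman–Ford) flavour.
    graph: dict mapping node -> list of (neighbor, arc_length)."""
    d = {start: 0.0}           # labels (best known cost from start)
    parent = {start: None}
    OPEN = deque([start])
    upper = float("inf")       # best known cost to the goal
    while OPEN:
        i = OPEN.popleft()
        for j, cij in graph.get(i, []):
            # admit j only if it improves its label and can still beat upper
            if d[i] + cij < d.get(j, float("inf")) and d[i] + cij < upper:
                d[j] = d[i] + cij
                parent[j] = i
                if j == goal:
                    upper = d[j]
                elif j not in OPEN:
                    OPEN.append(j)
    path, node = [], goal
    while node is not None:
        path.append(node)
        node = parent[node]
    return list(reversed(path)), upper

# Toy graph (hypothetical, for illustration only):
g = {"S": [("1", 3), ("2", 1)], "1": [("T", 1)], "2": [("1", 1), ("T", 4)]}
path, cost = label_correcting(g, "S", "T")
print(path, cost)  # ['S', '2', '1', 'T'] 3.0
```

As in Table 1, a node such as "1" here re-enters OPEN after its label is corrected via a cheaper route.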

Problem 3 [17 points]

For problems marked with *: Each correct answer is worth 1 point. Answers left blank are worth
0 points. Each wrong answer is worth -1 point. You do not have to explain your answer. The
minimum score of Problem 3 is 0.

a) True or False questions. You do not have to explain your answer. [5 points]

i)* In stochastic shortest path problems, the value iteration algorithm always converges
after a finite number of iterations.
ii)* In stochastic shortest path problems, the value iteration algorithm involves solving a
system of linear equations.
iii)* In stochastic shortest path problems, the policy iteration algorithm in discounted
problems can be initialized with an arbitrary admissible policy.
iv)* In stochastic shortest path problems, the policy iteration algorithm involves solving
a system of linear equations.
v)* In stochastic shortest path problems, let the state space be S = {0, 1, ..., n} with the
termination state 0, and Pµ ∈ Rn×n be the probability transition matrix associated
with a policy µ, whose (i, j)th entry is Pij (µ(i)) with i, j ∈ S\{0}. The invertibility
of the matrix (I − Pµ ) for the policy µ is equivalent to the properness of that policy.

b) You are implementing the policy iteration algorithm for a stochastic shortest path problem
on the computer. You printed out the cost vectors you solved at each iteration and the
cost vectors at the second and third iteration are shown in Table 2:

Table 2

second iteration third iteration


state 1 5.03 4.93
state 2 4.67 4.32
state 3 2.87 2.89
state 4 1.50 1.28

Which of the following is correct? Explain your answer. [2 points]

• The implementation is definitely correct


• The implementation is definitely wrong
• Nothing can be deduced from Table 2

c) Consider the stochastic shortest path problem represented in Figure 3, where at state
i ∈ {0, 1, 3}, the control action u can either be A or B, and at state i = 2, the control
action u can only be A. [10 points]

i)* True or False: The policy µ(1) = A, µ(2) = A, µ(3) = B is proper.


ii)* True or False: The policy µ(1) = A, µ(2) = A, µ(3) = A is proper.
iii)* True or False: When solving this problem using the policy iteration algorithm, we
can initialize the algorithm with the policy µ(1) = B, µ(2) = A, µ(3) = B.

iv) For the policy in part i), construct the transition probability matrix Pµ ∈ R3×3 , whose
(i, j)th entry is Pij (µ(i)) with i, j ∈ {1, 2, 3}. Is this matrix invertible or not?

Figure 3: Probability transition graph, with the associated probabilities p and stage costs g denoted on each arc; panel (a) shows the transitions under u = A, panel (b) under u = B. That is,

P00(A) = 1        P00(B) = 1
P10(A) = 1        P11(B) = 1
P21(A) = 0.5      P32(B) = 0.5
P20(A) = 0.5      P33(B) = 0.5
P33(A) = 1
g(i, B, j) = 1  ∀i ∈ {1, 3}, ∀j
g(i, A, j) = 1  ∀i ≠ 0, ∀j
g(0, B, 0) = 0
g(0, A, 0) = 0

v) The optimal cost vector to the above stochastic shortest path problem can be obtained
by solving a linear program of the generic form

minimize_V  fᵀV
subject to  MV ≤ h

where V , f and h are vectors, and M is a matrix. Write down a choice for f , h, and
M such that the optimal cost vector is obtained by solving the above linear program.

Solution 3
a) i) False
ii) False
iii) True
iv) True
v) True

b) The implementation is definitely wrong, since the cost associated with state 3 increases
after iteration 2. In policy iteration, the cost stays the same or decreases for any state
after each iteration.

c) i) True
ii) False
iii) False
iv)

Pµ = [ 0    0    0
       0.5  0    0
       0    0.5  0.5 ]

No, it is not invertible.²
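This can be checked numerically; the same computation also recovers the cost of the proper policy from part i) by solving (I − Pµ)Jµ = q with expected stage costs q = (1, 1, 1)ᵀ (a sketch, not part of the exam solution):

```python
import numpy as np

P_mu = np.array([[0.0, 0.0, 0.0],
                 [0.5, 0.0, 0.0],
                 [0.0, 0.5, 0.5]])

assert abs(np.linalg.det(P_mu)) < 1e-12           # P_mu itself is singular
A = np.eye(3) - P_mu
assert abs(np.linalg.det(A)) > 1e-12              # but (I - P_mu) is invertible

J_mu = np.linalg.solve(A, np.ones(3))             # policy cost: J = q + P_mu J
print(J_mu)  # costs [1, 1.5, 3.5] for states 1, 2, 3
```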
v) The optimization problem for the stochastic shortest path problem has the form

minimize_V  ∑_{i=1}^{3} f_i V(i)
subject to  V(i) ≤ q(i, u) + ∑_{j=1}^{3} P_ij(u) V(j),  ∀u ∈ U(i), ∀i ∈ S\{0}

Write out the inequalities:

V(1) ≤ q(1, A)
V(1) ≤ q(1, B) + V(1)
V(2) ≤ q(2, A) + 0.5 V(1)
V(3) ≤ q(3, A) + V(3)
V(3) ≤ q(3, B) + 0.5 V(2) + 0.5 V(3)

Since every stage cost is 1, the expected stage cost q(i, u) is equal to 1 for all i ∈ S and u ∈ U(i). Hence we get the inequalities in matrix form:

[  1     0    0   ]       [1]
[  0     0    0   ]       [1]
[ -0.5   1    0   ] V  ≤  [1]
[  0     0    0   ]       [1]
[  0    -0.5  0.5 ]       [1]
² Note that (I − Pµ) is invertible.

therefore

M = [  1     0    0
       0     0    0
      -0.5   1    0
       0     0    0
       0    -0.5  0.5 ],    h = [1, 1, 1, 1, 1]ᵀ,

and

f = [−1, −1, −1]ᵀ.
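A quick numerical check of this LP (a sketch using scipy's linprog; not required by the exam): the maximizer should be the optimal cost vector V = (1, 1.5, 3.5), matching the cost of the proper policy µ(1) = A, µ(2) = A, µ(3) = B.

```python
import numpy as np
from scipy.optimize import linprog

f = np.array([-1.0, -1.0, -1.0])        # linprog minimizes, so f'V = -(V1+V2+V3)
M = np.array([[ 1.0,  0.0, 0.0],
              [ 0.0,  0.0, 0.0],
              [-0.5,  1.0, 0.0],
              [ 0.0,  0.0, 0.0],
              [ 0.0, -0.5, 0.5]])
h = np.ones(5)

res = linprog(c=f, A_ub=M, b_ub=h, bounds=[(None, None)] * 3)
assert res.success
print(res.x)  # approximately [1, 1.5, 3.5]
```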

Problem 4 [22 points]

Consider a ground vehicle traveling on a horizontal plane at a constant speed:

ẋ(t) = cos(θ(t))
ẏ(t) = sin(θ(t))
θ̇(t) = u

where (x(t), y(t)) is the vehicle’s position on the plane at time t, θ(t) is its heading (see Fig. 4),
and u(t) ∈ [−1, 1], for all t, is the control input.

θ
y

Figure 4: A ground vehicle.

The vehicle starts off at position (0, 0) with a heading of 0 at t = 0. The objective is to determine
the time-optimal trajectory that transfers the vehicle to position (0, 3).

a) Compute Pontryagin’s necessary conditions for optimality, including any singular arc con-
ditions. [9 points]

b) In a couple of sentences, motivate why the optimal trajectory must end in a singular arc,
and that the optimal u(0) = 1. [1 point]

c) Compute the optimal state and input trajectories, and the optimal terminal time T , using
the hints from part b). Show that your solution satisfies the conditions from part a). [12
points]

Table 3: Trigonometry table for reference.

φ       sin(φ)   cos(φ)   tan(φ)
0       0        1        0
π/6     1/2      √3/2     1/√3
π/4     1/√2     1/√2     1
π/3     √3/2     1/2      √3
π/2     1        0        ∞
2π/3    √3/2     −1/2     −√3
3π/4    1/√2     −1/√2    −1
5π/6    1/2      −√3/2    −1/√3
π       0        −1       0

Solution 4

a) Let x_1 := x, x_2 := y, x_3 := θ. The Hamiltonian is

H(x, u, p) := 1 + p_1 cos(x_3) + p_2 sin(x_3) + p_3 u

Evaluating the partial derivatives along (x(t), u(t), p(t)):

ẋ_1(t) = ∂H/∂p_1 = cos(x_3(t))
ẋ_2(t) = ∂H/∂p_2 = sin(x_3(t))
ẋ_3(t) = ∂H/∂p_3 = u(t)

ṗ_1(t) = −∂H/∂x_1 = 0  ⇔  p_1(t) = c_1
ṗ_2(t) = −∂H/∂x_2 = 0  ⇔  p_2(t) = c_2
ṗ_3(t) = −∂H/∂x_3 = c_1 sin(x_3(t)) − c_2 cos(x_3(t))    (2)

Boundary conditions:

x(0) = (0, 0, 0)
x(T) = (0, 3, free)
p_3(T) = 0

Since the problem data is time-invariant and T is free,

H(x(t), u(t), p(t)) = 1 + p_1(t) cos(x_3(t)) + p_2(t) sin(x_3(t)) + p_3(t)u(t) = 0  ∀t    (3)

Optimal input:

u(t) = arg min_{u ∈ [−1, 1]} H(x(t), u, p(t)) = arg min_{u ∈ [−1, 1]} p_3(t)u
     = −1 if p_3(t) > 0;  1 if p_3(t) < 0;  undetermined if p_3(t) = 0,

where the last case corresponds to a potential singular arc. Check if p_3(t) = 0 can occur non-trivially:

ṗ_3(t) = 0
⇔ c_1 sin(x_3(t)) = c_2 cos(x_3(t))
⇔ x_3(t) = arctan2(c_2, c_1)
⇒ u(t) = 0

The optimal input trajectory is then

u(t) = −1 if p_3(t) > 0;  1 if p_3(t) < 0;  0 if p_3(t) = 0 and x_3(t) = arctan2(c_2, c_1) =: c_3

b) Note that p3 (T ) = 0, so we can end with a singular arc. The intuition is this: apply
u(t) = 1 until the vehicle heading faces the target position at some time t = t̃, at which
point we switch to the singular arc, stop turning, and drive into the target position. Any
other solution would take more time.

c) For T ≥ t ≥ t̃ (singular arc, where t̃ is the switching time):

x_3(t) = c_3
ẋ_1 = cos(c_3)  ⇒  x_1(t) = (t − T) cos(c_3)
ẋ_2 = sin(c_3)  ⇒  x_2(t) = (t − T) sin(c_3) + 3
p_3(t) = 0
u(t) = 0    (4)
H(x(t), u(t), p(t)) = 1 + c_1 cos(c_3) + c_2 sin(c_3) = 0    (5)

For t̃ > t ≥ 0 (regular arc): assume p_3(t) < 0 so that u(t) = 1 (we will verify this later). Then

u(t) = 1
x_3(t) = t
x_1(t) = sin(t)
x_2(t) = 1 − cos(t)

At the switch, we must have:

x_3(t̃) = t̃ = c_3 = arctan2(c_2, c_1)    (6)
x_1(t̃) = sin(t̃) = (t̃ − T) cos(t̃)    (7)
x_2(t̃) = 1 − cos(t̃) = (t̃ − T) sin(t̃) + 3  ⇒  cos(t̃) + 2 = −(t̃ − T) sin(t̃)    (8)
p_3(t̃) = 0    (9)

Now multiply eq. (7) by sin(t̃) and eq. (8) by cos(t̃) and add:

1 + 2 cos(t̃) = 0  ⇒  t̃ = 2π/3
⇒ T = 2π/3 + √3

(note we reject the solution t̃ = 4π/3 since this will not get us to the terminal position).
Now we must determine c_1 and c_2 and show that p_3(t) < 0 up to t = t̃ = 2π/3. From (5),

1 − (1/2) c_1 + (√3/2) c_2 = 0

and from (6),

tan(2π/3) = c_2/c_1 = −√3,

and thus c_1 = 1/2, c_2 = −√3/2.
From (2) and (9):

p_3(t) = ∫ ((1/2) sin(t) + (√3/2) cos(t)) dt + c_4
       = −(1/2) cos(t) + (√3/2) sin(t) + c_4

p_3(2π/3) = 0  ⇒  c_4 = −1

and indeed p_3(t) < 0 for t ∈ [0, 2π/3), since p_3(0) = −1.5 and p_3(t) is at most 0 (one can maximize it to show this, or plug in values from the table), which first happens after t = 0 at t = 2π/3.
Finally, we must show that (3) holds during the regular arc as well:

H(x(t), u(t), p(t)) = 1 + (1/2) cos(t) − (√3/2) sin(t) + p_3(t) = 0.
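The closed-form trajectory can also be checked numerically (a sketch, not part of the exam solution): turn with u = 1 until t̃ = 2π/3, then go straight; the vehicle should arrive exactly at (0, 3) at T = 2π/3 + √3.

```python
import math

t_switch = 2 * math.pi / 3
T = t_switch + math.sqrt(3)

def state(t):
    """Closed-form optimal trajectory: (x, y, theta) at time t."""
    if t <= t_switch:                       # turning (regular) arc, u = 1
        return (math.sin(t), 1 - math.cos(t), t)
    dt = t - t_switch                       # straight (singular) arc, u = 0
    x0, y0 = math.sin(t_switch), 1 - math.cos(t_switch)
    return (x0 + dt * math.cos(t_switch),
            y0 + dt * math.sin(t_switch),
            t_switch)

assert state(0.0) == (0.0, 0.0, 0.0)        # initial condition
x, y, th = state(T)
assert abs(x) < 1e-9 and abs(y - 3.0) < 1e-9  # terminal position (0, 3)
```

At the switch the vehicle sits at (√3/2, 3/2) with heading 2π/3, and the remaining straight segment of length √3 carries it exactly to (0, 3).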
