Slides CO-course C4C Part I
Organization
1. Introduction.
General considerations. Motivation. Problem Formulation. Classes of problems. Issues in optimal control theory
4. Additional issues
Nonsmoothness. Degeneracy. Nondegenerate conditions with and without a priori normality assumptions.
Key References
• Pravin Varaiya, "Notes on Optimization", Van Nostrand Reinhold Company, 1972.
• Francis Clarke, "Optimization and Nonsmooth Analysis", John Wiley, 1983.
• Richard Vinter, "Optimal Control", Birkhäuser, 2000.
• João Sousa, Fernando Pereira, "A set-valued framework for coordinated motion control of networked vehicles", J. Comp. & Syst. Sci. Inter., Springer, 45, 2006, pp. 824-830.
• David Luenberger, "Optimization by Vector Space Methods", Wiley, 1969.
• Aram Arutyunov, "Optimality Conditions: Abnormal and Degenerate Problems", Kluwer Academic Publishers, 2000.
• Martino Bardi, Italo Capuzzo-Dolcetta, "Optimal Control and Viscosity Solutions of Hamilton-Jacobi-Bellman Equations", Birkhäuser, 1997.
• Vladimir Girsanov, "Lectures on Mathematical Theory of Extremum Problems", Lect. Notes in Econom. and Math. Syst., 67, Springer Verlag, 1972.
• Francesco Borrelli, http://www.me.berkeley.edu/ frborrel/Courses.html
Applications
Useful for optimization problems with inter-temporal constraints.
o Management of renewable and non-renewable resources,
o Investment strategies,
o Management of financial resources,
o Resource allocation,
o Planning and control of production systems (manufacturing, chemical processes, ...),
o Planning and control of populations (cells, species),
o Definition of therapy protocols,
o Motion planning and control in autonomous mobile robotics,
o Aerospace navigation,
o Synthesis in decision support systems,
o Etc.
Fernando Lobo Pereira, João Tasso Borges de Sousa FEUP, Porto 5
Introduction to Optimal Control
(P ) Minimize g(x(1))
by choosing (x, u) : [0, 1] → IRn × IRm
satisfying : ẋ(t) = f (t, x(t), u(t)), [0, 1] L-a.e., (1)
x(0) = x0, (2)
u(t) ∈ Ω(t), [0, 1] L-a.e. (3)
A(1; (x0, 0)) - the set of points in IRn that can be reached at time 1 from x0 at time 0.
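The attainable set can be explored numerically by sampling admissible controls. A minimal sketch, assuming a hypothetical double-integrator instance of (1)-(3) with f(t, x, u) = (x2, u) and Ω(t) = [−1, 1] (this data is illustrative, not from the slides):

```python
import numpy as np

# Monte-Carlo sketch of A(1; (x0, 0)) for the hypothetical dynamics
# x_dot = (x2, u), u(t) in Omega(t) = [-1, 1] (a double integrator).
rng = np.random.default_rng(0)

def simulate(x0, u_values, dt):
    """Forward-Euler integration of x_dot = (x2, u) on [0, 1]."""
    x = np.array(x0, dtype=float)
    for u in u_values:
        x = x + dt * np.array([x[1], u])
    return x

n_steps, n_samples = 100, 500
dt = 1.0 / n_steps
x0 = [0.0, 0.0]
# Each random piecewise-constant control in [-1, 1] gives one admissible
# process; its endpoint x(1) is a point of A(1; (x0, 0)).
endpoints = np.array([simulate(x0, rng.uniform(-1, 1, n_steps), dt)
                      for _ in range(n_samples)])
print(endpoints.min(axis=0), endpoints.max(axis=0))
```

The printed componentwise bounds give a rough picture of the extent of the attainable set.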
General definitions
Dynamic System - System whose state variable conveys its past history: its future
evolution depends not only on the future inputs but also on the current value of
the state variable.
Trajectory - Solution of the differential equation (1) with the boundary condition
(2) for a given control function satisfying (3).
Admissible Control Process - A pair (x, u) satisfying the constraints (1)-(3).
Attainable set - A(1; (x0, 0)) is the set of state space points that can be reached
from x0 with admissible control strategies:
A(1; (x0, 0)) := {x(1) : (x, u) is an admissible control process}.
Boundary process - Control process whose trajectory (or a given function of it)
remains on the boundary of the attainable set.
Local/global minimum - Point at which the value of the objective function is no
greater than at any other feasible point within a neighborhood (local) or at any
other feasible point (global).
Types of Problems
o Bolza - g(x(1)) + ∫_0^1 L(s, x(s), u(s)) ds.
o Lagrange - ∫_0^1 L(s, x(s), u(s)) ds.
o Mayer - g(x(1)).
Other types of constraints besides the above:
o Mixed constraints - g(t, x(t), u(t)) ≤ 0, ∀t ∈ [0, 1].
o Isoperimetric constraints, ∫_0^1 h(s, x(s), u(s)) ds = a.
o Endpoints and intermediate state constraints, y(1) ∈ S.
o State constraints, hi(t, x(t)) ≤ 0 for all t ∈ [0, 1], i = 1, . . . , s.
Algorithms
Let V : [0, 1] × IRn → IR be a smooth function s.t., in a neighborhood of (t, x∗(t)),
V (1, z) = g(z),
V (0, x0) ≥ g(x∗(1)),
Vt(t, x∗(t)) − sup{Vx(t, x∗(t))f (t, x∗(t), u) : u ∈ Ω(t)} = 0, [0, 1] L-a.e., (7)
where x∗ is the solution to (1) with u = u∗ and x∗(0) = x0. Then the control process
(x∗, u∗) is optimal for (P ).
V - the solution to the Hamilton-Jacobi-Bellman equation (7) - is the verification
function, which under certain conditions coincides with the value function.
Although these conditions have a local character, there are results giving conditions of
a global nature.
New types of solutions - viscosity, proximal, Dini, ... - generalize the classical concept.
Maximum Principle
The control strategy u∗ is optimal for (P1) if and only if u∗(t) maximizes
v → pT (t)B(t)v, on Ω(t), [0, 1] L-a.e.,
where p : [0, 1] → IRn is an a.c. function s.t.:
−ṗT (t) = pT (t)A(t), [0, 1] L-a.e.
p(1) = c.
For this problem, the Maximum Principle is a necessary and sufficient condition.
Geometric Interpretation:
o Existence of a boundary control process associated with the optimal trajectory.
o The adjoint variable vector is perpendicular to the attainable set at the optimal
state value for all times.
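The maximum condition can be exercised numerically. A minimal sketch, assuming hypothetical time-invariant data A, B, c (not from the slides) and Ω(t) = [−1, 1], for which the maximizer of v → pT(t)Bv is the sign of pT(t)B:

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical data: double integrator with terminal condition p(1) = c.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
c = np.array([1.0, 1.0])

def adjoint(t):
    # -p_dot^T = p^T A with p(1) = c has the closed form
    # p^T(t) = c^T e^{A(1 - t)}.
    return c @ expm(A * (1.0 - t))

def u_star(t):
    # On Omega = [-1, 1], v -> p^T(t) B v is maximized by sign(p^T(t) B).
    return float(np.sign((adjoint(t) @ B)[0]))

controls = [u_star(t) for t in (0.0, 0.5, 1.0)]
print(controls)  # here p^T(t)B = 2 - t > 0, so u*(t) = 1 throughout
```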
Geometric Interpretation
[Figure: the optimal trajectory x∗(t), with adjoint vectors p(t1) and p(t2) normal to the attainable sets at x∗(t1) and x∗(t2).]
Fig. 2. Relation between the adjoint variable and the attainable set (inspired by [17]).
Proposition
Let cT x∗(1) ≥ cT z, ∀z ∈ A(1; (x0, 0)), with c ≠ 0, i.e.,
−pT (1) = c is perpendicular to A(1; (x0, 0)) at x∗(1) ∈ ∂A(1; (x0, 0)).
Then, ∀t ∈ [0, 1),
o x∗(t) ∈ ∂A(t; (x0, 0)),
o −pT (t) is perpendicular to A(t; (x0, 0)) at x∗(t).
Analytic Interpretation
It consists in showing that the switching function
σ : [0, 1] → IRm, σ(t) := −pT (t)B(t),
is the gradient of the objective functional J(u) := −cT x(1) with respect to the value
of the control function at time t, u(t).
By computing the directional derivative and using the time-response formula for the
linear dynamic system, we have:
J′(u; w) = ∫_0^1 σ(t)w(t) dt = ⟨∇uJ(u), w⟩.
Here, ∇uJ(u) : [0, 1] → IRm is the gradient of the cost functional w.r.t. the control,
and ⟨·, ·⟩ is the inner product in the function space.
o Express the optimality conditions as a function of the state variable at the final
time:
Check that {x∗(1)} and A(1; (x0, 0)) fulfill the conditions needed to apply a
separation theorem.
After showing the equivalence between optimality of the trajectory and its being
a boundary process,
observe that (cT x∗(1), x∗(1)) ∈ ∂{(z, y) : z ≥ cT y, y ∈ A(1; (x0, 0))}, and
write the condition of perpendicularity of the vector c to A(1; (x0, 0)).
o Express the conditions obtained above in terms of the control variable at each
instant of the given time interval by using the time-response formula.
In this step, the control maximum condition, the o.d.e. and the boundary
conditions satisfied by the adjoint variable are jointly obtained.
Example
Let t ∈ [0, 1], u(t) ∈ [−1, 1],
A = [ 0 1 0 ; 0 0 1 ; 6 −11 6 ], B = [ 0 ; 0 ; 1 ], and C = [ 1 0 0 ].
By writing e^{Aτ} = α0(τ)I + α1(τ)A + α2(τ)A², where τ = 1 − t, we get
σ(t) := pT (t)B = C e^{A(1−t)} B = α2(1 − t).
The eigenvalues of A - the roots of the characteristic polynomial of A,
p(λ) = det(λI − A) = 0 - are 1, 2 and 3. By the Cayley-Hamilton theorem,
α0(τ) + α1(τ) + α2(τ) = e^τ
α0(τ) + 2α1(τ) + 4α2(τ) = e^{2τ}
α0(τ) + 3α1(τ) + 9α2(τ) = e^{3τ}.
Thus,
α2(τ) = (e^{3τ} − 2e^{2τ} + e^τ)/2.
Since σ(t) = α2(1 − t) > 0 for all t ∈ [0, 1), we have u∗(t) = 1, ∀t ∈ [0, 1].
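The closed form for α2 can be checked directly against the matrix exponential; a small sketch using SciPy's expm with the example's matrices:

```python
import numpy as np
from scipy.linalg import expm

# sigma(t) = C e^{A(1-t)} B should equal alpha_2(1 - t).
A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [6.0, -11.0, 6.0]])
B = np.array([0.0, 0.0, 1.0])
C = np.array([1.0, 0.0, 0.0])

def alpha2(tau):
    # alpha_2(tau) = (e^{3 tau} - 2 e^{2 tau} + e^{tau}) / 2
    return (np.exp(3 * tau) - 2 * np.exp(2 * tau) + np.exp(tau)) / 2

for t in np.linspace(0.0, 1.0, 11):
    tau = 1.0 - t
    sigma = C @ expm(A * tau) @ B
    assert np.isclose(sigma, alpha2(tau))
    assert sigma >= 0.0  # nonnegative, and positive for t < 1
print("closed form confirmed")
```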
Maximum Principle
These conditions are necessary and sufficient.
Let (x∗, u∗) be an admissible control process for (P2), i.e., s.t. x∗(0) ∈ X0,
u∗(t) ∈ Ω(t) and x∗(1) ∈ X1. Then:
A) Necessity
If (x∗(0), u∗) is optimal, then ∃p : [0, 1] → IRn and λ ≥ 0 s.t.:
λ + ‖p(t)‖ ≠ 0, (8)
−ṗT (t) = pT (t)A(t), [0, 1] L-a.e., (9)
p(1) − λc is perpendicular to X1 at x∗(1), (10)
p(0) is perpendicular to X0 at x∗(0), (11)
u∗(t) maximizes the map v → pT (t)B(t)v on Ω(t), [0, 1] L-a.e. (12)
B) Sufficiency
If (8)-(12) hold with λ > 0, then (x∗(0), u∗) is optimal.
Geometric Interpretation
[Figure: the hyperplanes Sa and Sb separating the attainable set A(1; (X0, 0)) from the level sets of cT x and from the target set X1 at x∗(1).]
Fig. 2. Separation of the optimal state at the final time subject to affine constraints (inspired by [17]).
For all x ∈ A(1; (X0, 0)), we have, ∀z ∈ X0, ∀v ∈ A(1; (0, 0)):
(λc + q)T x = pT (1)x
= pT (1)[Φ(1, 0)z + v]
= pT (1)Φ(1, 0)[z − x∗(0)] + pT (1)Φ(1, 0)x∗(0) + pT (1)v
= pT (0)[z − x∗(0)] + pT (1)[Φ(1, 0)x∗(0) + v],
Note that the first term is null and that Φ(1, 0)x∗(0) + v in the second one is a point of
A(1; (x∗(0), 0)) ⊂ A(1; (X0, 0)).
Thus, (λc + q)T x ≤ pT (1)x∗(1) = (λc + q)T x∗(1).
Since q is perpendicular to X1 at x∗(1), for x ∈ A(1; (X0, 0)) ∩ X1 this yields
λcT x ≤ λcT x∗(1). Hence the sufficiency.
Example - Formulation
Minimize cT x(1) + α0 ∫_0^1 u(t) dt
where ẋ(t) = Ax(t) + Bu(t), [0, 1] L-a.e.
x1(0) + x2(0) = 0
x1(1) + 3x2(1) = 1
u(t) ∈ [0, 1], [0, 1] L-a.e.,
where α0 > 0, A = [ 0 1 ; −2 3 ], B = [ 0 ; 1 ], and c = [ 1 ; 1 ].
a) Determine the values of α0 for which there exist optimal control switches within the
time interval [0, 1].
b) Determine the switching function as a function of α0.
For a given λ ∈ {0, 1}, the system of equations - p(1) − λc is perpendicular to X1,
p(0) is perpendicular to X0, and pT (0) = pT (1)e^A - fully determines the adjoint variable.
Let λ = 1. Thus, we have
e^{At} = [ 2e^t − e^{2t} , e^{2t} − e^t ; −2(e^{2t} − e^t) , 2e^{2t} − e^t ],
p(1) = [ 1 + p1 ; 1 + 3p1 ], and p(0) = (e^A)T p(1) = [ p0 ; p0 ].
Note that these last two relations determine p0 and p1.
To put the problem in the canonical form, add a component to the state variable; the
maximum condition then becomes:
u∗(t) maximizes, over v ∈ [0, 1], the map v → [pT (1)e^{A(1−t)}B − α0]v.
There exists an interval of values of α0 for which the switching point is in (0, 1).
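The closed form of e^{At} used above can be verified numerically; a quick sketch with the example's A:

```python
import numpy as np
from scipy.linalg import expm

# A has eigenvalues 1 and 2, so e^{At} = alpha_0(t) I + alpha_1(t) A with
# alpha_0 = 2 e^t - e^{2t} and alpha_1 = e^{2t} - e^t.
A = np.array([[0.0, 1.0], [-2.0, 3.0]])

def expA(t):
    e1, e2 = np.exp(t), np.exp(2 * t)
    return np.array([[2 * e1 - e2, e2 - e1],
                     [-2 * (e2 - e1), 2 * e2 - e1]])

for t in (0.0, 0.3, 1.0):
    assert np.allclose(expm(A * t), expA(t))
print("e^{At} formula confirmed")
```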
(P3) Minimize T
by choosing (x, u) : [0, T ] → IRn × IRm such that:
ẋ(t) = A(t)x(t) + B(t)u(t), [0, T ] L-a.e.,
x(0) = x0 ∈ IRn,
x(T ) ∈ O(T ) ⊂ IRn,
u(t) ∈ Ω(t), [0, T ] L-a.e.,
where T is the final time and the multifunction O : [0, T ] → P(IRn) defines the target
to be attained in minimum time, P(IRn) being the set of subsets of IRn.
Typically, this multifunction is continuous and takes compact sets as values. For
example, O(t) = {z(t)}, where z : [0, T ] → IRn is a continuous function.
Generalization: Objective function defined by g(t0, x(t0), t1, x(t1)); Terminal
Constraints given by (t0, x(t0), t1, x(t1)) ∈ O ⊂ IR2(n+1).
Geometric Interpretation
[Figure: the optimal trajectory x∗(t) and the moving target z(t), with attainable sets A(t1; (x0, 0)) and A(t∗; (x0, 0)); at the minimum time, x∗(t∗) = z(t∗).]
The optimal state at t∗ is the intersection of sets O(t∗) and A(t∗; (x0, 0)), and, thus,
necessarily in the boundary of both sets.
Time t∗ is given by
inf{ t > 0 : O(t) ∩ A(t; (x0, 0)) = {x∗(t)}}.
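The infimum above can be approximated by scanning t. A minimal sketch for a hypothetical scalar example (not from the slides): ẋ = u with |u| ≤ 1, so A(t; (x0, 0)) = [x0 − t, x0 + t], and a fixed target O(t) = {z}:

```python
import numpy as np

# Minimum-time sketch: x_dot = u, |u| <= 1, fixed target O(t) = {z}.
x0, z = 0.0, 1.0

def target_attainable(t):
    # O(t) intersects A(t; (x0, 0)) = [x0 - t, x0 + t] iff |z - x0| <= t.
    return abs(z - x0) <= t

ts = np.linspace(0.0, 2.0, 2001)
t_star = next(t for t in ts if t > 0 and target_attainable(t))
print(t_star)  # close to the true minimum time t* = 1
```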
Maximum Principle
Let (t∗, u∗) be optimal.
Then, there exists h ∈ IRn and an a.c. p : [0, t∗] → IRn s.t.
‖p(t)‖ ≠ 0, (17)
−ṗT (t) = pT (t)A(t), [0, t∗] L-a.e., (18)
p(t∗) = h, (19)
u∗(t) maximizes v → pT (t)B(t)v on Ω(t), [0, t∗] L-a.e., (20)
x∗(t∗) maximizes z → hT z on O(t∗). (21)
Deduction: By geometric considerations, ∃h ∈ IRn, h ≠ 0, simultaneously
perpendicular to A(t∗; (x0, 0)) and to O(t∗) at x∗(t∗), i.e., ∀z ∈ O(t∗) and
∀x ∈ A(t∗; (x0, 0)),
hT z ≤ hT x∗(t∗) ≤ hT x.
From here, we obtain (21); then, by expressing x and x∗(t∗) as the state responses at
t∗ to an arbitrary admissible control and to the optimal control, respectively, we
obtain (20) and the remaining optimality conditions.
(P4) Minimize (1/2) xT (1)Sx(1) + (1/2) ∫_0^1 [ xT (t)Q(t)x(t) + uT (t)R(t)u(t) ] dt,
by choosing (x, u) : [0, 1] → IRn × IRm such that:
ẋ(t) = A(t)x(t) + B(t)u(t), [0, 1] L-a.e.,
x(0) = x0 ∈ IRn.
S ∈ IRn×n and Q(t) ∈ IRn×n are positive semi-definite, ∀t ∈ [0, 1], and
R(t) ∈ IRm×m is positive definite, ∀t ∈ [0, 1].
Optimality Conditions.
The solution to (P4) is given by
u∗(t) = −R−1(t)B T (t)S(t)x∗(t)
where S(·) is solution to the Riccati equation:
−Ṡ(t) = AT (t)S(t) + S(t)A(t) − S(t)B(t)R−1(t)B T (t)S(t) + Q(t), ∀t ∈ [0, 1], (22)
S(1) = S. (23)
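Equation (22) can be integrated backward in time from (23) with standard ODE tools. A minimal sketch for a hypothetical scalar instance (A = 0, B = Q = R = 1, S(1) = 0; not from the slides), whose analytic solution is S(t) = tanh(1 − t):

```python
import numpy as np
from scipy.integrate import solve_ivp

def riccati_rhs(t, S):
    A, B, Q, R_inv = 0.0, 1.0, 1.0, 1.0
    # -S_dot = A^T S + S A - S B R^{-1} B^T S + Q  (scalar case)
    return -(2 * A * S - (S * B) ** 2 * R_inv + Q)

# Integrate backward in time from the terminal condition S(1) = 0.
sol = solve_ivp(riccati_rhs, (1.0, 0.0), [0.0], dense_output=True,
                rtol=1e-8, atol=1e-10)
for t in (0.0, 0.5, 1.0):
    assert np.isclose(sol.sol(t)[0], np.tanh(1.0 - t), atol=1e-5)
print("S(0) =", sol.sol(0.0)[0])  # ~ tanh(1) ≈ 0.7616
```

The Kalman gain K(t) = R−1(t)BT(t)S(t) of observation (b) then follows pointwise from the computed S(·).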
Observations:
(a) The optimal control is defined as a linear state feedback law.
(b) The Kalman gain, K(t) := R−1(t)B T (t)S(t), can be computed a priori.
Exercise: Given ‖a‖²_P := aT P a, show that the cost function on [t, 1] is:
(1/2) xT (t)S(t)x(t) + (1/2) ∫_t^1 ‖R−1(s)BT (s)S(s)x(s) + u(s)‖²_{R(s)} ds.
Obviously, for u∗ the above integrand vanishes, so the optimal cost on [0, 1]
equals (1/2) x0T S(0)x0.
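This identity can also be checked numerically. A sketch on the same hypothetical scalar instance (A = 0, B = Q = R = 1, S(1) = 0, for which S(t) = tanh(1 − t); data not from the slides), with an arbitrary constant control:

```python
import numpy as np

# Verify: J(u) = (1/2) x0^2 S(0) + (1/2) ∫_0^1 (S x + u)^2 dt
# for the scalar instance A = 0, B = Q = R = 1, S(t) = tanh(1 - t).
dt = 1e-4
ts = np.arange(0.0, 1.0, dt)
x0, u = 1.0, 0.5                       # arbitrary admissible control

x = x0 + u * ts                        # x_dot = u  =>  x(t) = x0 + u t
S = np.tanh(1.0 - ts)
J = 0.5 * np.sum(x**2 + u**2) * dt     # terminal term vanishes: S(1) = 0
rhs = 0.5 * x0**2 * np.tanh(1.0) + 0.5 * np.sum((S * x + u)**2) * dt
print(J, rhs)  # the two values agree up to discretization error
```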
Example - Formulation
Let us consider the following dynamic system:
ẋ(t) = Ax(t) + Bu(t), [0, 1] L-a.e.,
x(0) = x0,
y(t) = Cx(t),
with the following objective function
(1/2) xT (T )Sx(T ) + (1/2) ∫_0^T [ ‖y(t) − yr (t)‖² + uT (t)R(t)u(t) ] dt,
where S and R(t) are symmetric, positive definite matrices.
a) Write down the maximum principle conditions for this problem.
b) Derive the optimal control in a state feedback form when yr (t) = Cxr (t),
xr (0) = x0 and ẋr (t) = Ar xr (t).
Example - Solution
a) Let
H(t, x(t), u(t), p(t)) := pT (t)[Ax(t) + Bu(t)] + (1/2)[ ‖y(t) − yr (t)‖² + uT (t)R(t)u(t) ],
where p : [0, T ] → IRn satisfies:
p(T ) = Sx∗(T ),
−ṗ(t) = AT p(t) + C T C[x∗(t) − xr (t)].
Additional Bibliography
• Pontryagin, L., Boltyanskii, V., Gamkrelidze, R., Mishchenko, E., "The Mathematical Theory of Optimal Processes", Pergamon-Macmillan, 1964.
• Anderson, B., Moore, J., "Linear Optimal Control", Prentice-Hall, 1971.
• Athans, M., Falb, P., "Optimal Control", McGraw Hill, 1966.
• Aubin, J., Cellina, A., "Differential Inclusions: Set-Valued Maps and Viability Theory", Springer-Verlag, 1984.
• Bryson, A., Ho, Y., "Applied Optimal Control", Hemisphere, 1975.
• Clarke, F.H., Ledyaev, Yu.S., Stern, R.J., Wolenski, P.R., "Nonsmooth Analysis and Control Theory", Springer, 1998.
• Grace, A., "Optimization Toolbox, User's Guide", The MathWorks Inc., 1992.
• Lewis, F., "Optimal Control", John Wiley & Sons, 1987.
• Macki, J., Strauss, A., "Introduction to Optimal Control Theory", Springer-Verlag, 1981.
• Monk, J. et al., "Control Engineering, Unit 15 - Optimal Control", 1978.
• Neustadt, L., "Optimization, A Theory of Necessary Conditions", Princeton University Press, 1976.
• Pesch, H., Bulirsch, R., "The Maximum Principle, Bellman's Equation and Carathéodory's Work", Historical Paper in J. of Optimization Theory and Applications, Vol. 80, No. 2, pp. 199-225, 1994.
• Tu, P., "Introductory Optimization Dynamics", Springer-Verlag, 1984.