Infinite Horizon Linear Quadratic Regulator

The document discusses the infinite horizon linear quadratic regulator (LQR) problem. It presents the dynamic programming solution, which uses the Hamilton-Jacobi equation to derive the algebraic Riccati equation (ARE) that characterizes the optimal value function and feedback gains. The solution involves computing the unique positive semidefinite solution to the ARE. Receding horizon LQR control is also discussed, where the optimal input is computed by solving a finite horizon LQR problem at each time step.

Uploaded by

Sayan Mandal
EE363 Winter 2008-09

Lecture 3
Infinite horizon linear quadratic regulator

• infinite horizon LQR problem

• dynamic programming solution

• receding horizon LQR control

• closed-loop system

Infinite horizon LQR problem

discrete-time system $x_{t+1} = Ax_t + Bu_t$, $x_0 = x_{\rm init}$

problem: choose $u_0, u_1, \ldots$ to minimize

$$J = \sum_{\tau=0}^{\infty} \left( x_\tau^T Q x_\tau + u_\tau^T R u_\tau \right)$$

with given constant state and input weight matrices

$Q = Q^T \geq 0$, $R = R^T > 0$

. . . an infinite dimensional problem



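problem: it's possible that $J = \infty$ for all input sequences $u_0, \ldots$

for example, with any $Q > 0$:

$x_{t+1} = 2x_t + 0u_t$, $x_{\rm init} = 1$

(here $x_t = 2^t$ regardless of the input, so the cost diverges)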

let’s assume (A, B) is controllable

then for any $x_{\rm init}$ there's an input sequence

$u_0, \ldots, u_{n-1}, 0, 0, \ldots$

that steers $x$ to zero at $t = n$, and keeps it there

for this $u$, $J < \infty$

and therefore, $\min_u J < \infty$ for any $x_{\rm init}$
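The steering argument can be checked numerically. The sketch below is only an illustration: the double-integrator matrices, the choice $Q = I$, $R = I$, and the helper names `steer_to_zero` and `simulate` are assumptions, not part of the lecture. It solves $A^n x_{\rm init} + [A^{n-1}B \;\cdots\; B]\,u = 0$ for the stacked inputs and confirms that the resulting cost is finite.

```python
import numpy as np

def steer_to_zero(A, B, x_init):
    """Return inputs u_0, ..., u_{n-1} (one per row) that give x_n = 0."""
    n, m = B.shape
    # x_n = A^n x_init + [A^{n-1} B, ..., A B, B] [u_0; ...; u_{n-1}]
    C = np.hstack([np.linalg.matrix_power(A, n - 1 - k) @ B for k in range(n)])
    if np.linalg.matrix_rank(C) < n:
        raise ValueError("(A, B) is not controllable")
    target = -np.linalg.matrix_power(A, n) @ x_init
    u_stacked, *_ = np.linalg.lstsq(C, target, rcond=None)
    return u_stacked.reshape(n, m)

def simulate(A, B, x_init, inputs):
    """Return the states x_0, ..., x_n generated by the given inputs."""
    x = x_init.copy()
    states = [x]
    for u in inputs:
        x = A @ x + B @ u
        states.append(x)
    return np.array(states)

# Assumed example system: a discrete-time double integrator.
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
x_init = np.array([1.0, 0.0])

u = steer_to_zero(A, B, x_init)
states = simulate(A, B, x_init, u)
x_final = states[-1]

# With Q = I and R = I (assumed), the cost stops accumulating once x = 0, u = 0.
J = float(np.sum(states**2) + np.sum(u**2))
print(x_final, J)
```

Once the state reaches zero, applying $u = 0$ keeps it there, so the infinite sum has only finitely many nonzero terms.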



Dynamic programming solution

define value function $V : \mathbf{R}^n \to \mathbf{R}$

$$V(z) = \min_{u_0, \ldots} \sum_{\tau=0}^{\infty} \left( x_\tau^T Q x_\tau + u_\tau^T R u_\tau \right)$$

subject to $x_0 = z$, $x_{\tau+1} = Ax_\tau + Bu_\tau$

• $V(z)$ is the minimum LQR cost-to-go, starting from state $z$

• doesn't depend on time-to-go, which is always $\infty$; infinite horizon problem is shift invariant



Hamilton-Jacobi equation

fact: $V$ is quadratic, i.e., $V(z) = z^T P z$, where $P = P^T \geq 0$


(can be argued directly from first principles)
HJ equation:

$$V(z) = \min_w \left( z^T Q z + w^T R w + V(Az + Bw) \right)$$

or

$$z^T P z = \min_w \left( z^T Q z + w^T R w + (Az + Bw)^T P (Az + Bw) \right)$$

minimizing $w$ is $w^* = -(R + B^T P B)^{-1} B^T P A z$
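The minimizer follows from setting the gradient with respect to $w$ to zero. A short derivation, using only the symbols already defined:

```latex
\begin{align*}
f(w) &= z^T Q z + w^T R w + (Az + Bw)^T P (Az + Bw) \\
\nabla_w f(w) &= 2 R w + 2 B^T P (Az + Bw) = 0 \\
(R + B^T P B)\, w &= -B^T P A z \\
w^* &= -(R + B^T P B)^{-1} B^T P A z
\end{align*}
```

Since $R > 0$ and $P \geq 0$, the matrix $R + B^T P B$ is positive definite, so it is invertible and the stationary point is the unique minimizer.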


so HJ equation is

$$z^T P z = z^T Q z + w^{*T} R w^* + (Az + Bw^*)^T P (Az + Bw^*)$$

$$= z^T \left( Q + A^T P A - A^T P B (R + B^T P B)^{-1} B^T P A \right) z$$
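The last equality can be checked by expanding the terms that involve $w^*$ and using $(R + B^T P B) w^* = -B^T P A z$:

```latex
\begin{align*}
w^{*T} R w^* + (Az + Bw^*)^T P (Az + Bw^*)
  &= z^T A^T P A z + 2 w^{*T} B^T P A z + w^{*T} (R + B^T P B) w^* \\
  &= z^T A^T P A z + 2 w^{*T} B^T P A z - w^{*T} B^T P A z \\
  &= z^T A^T P A z + w^{*T} B^T P A z \\
  &= z^T A^T P A z - z^T A^T P B (R + B^T P B)^{-1} B^T P A z
\end{align*}
```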



this must hold for all z, so we conclude that P satisfies the ARE

$$P = Q + A^T P A - A^T P B (R + B^T P B)^{-1} B^T P A$$

and the optimal input is constant state feedback $u_t = Kx_t$,

$$K = -(R + B^T P B)^{-1} B^T P A$$

compared to finite-horizon LQR problem,

• value function and optimal state feedback gains are time-invariant


• we don't have a recursion to compute $P$; we only have the ARE
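As a worked illustration (the scalar values $A = B = Q = R = 1$ are an assumed example, not from the lecture), the ARE can be solved by hand:

```latex
\begin{align*}
p &= 1 + p - \frac{p^2}{1 + p}
  \;\Longrightarrow\; p^2 = 1 + p
  \;\Longrightarrow\; p = \frac{1 + \sqrt{5}}{2} \approx 1.618 \\
K &= -\frac{p}{1 + p} = -\frac{1}{p} \approx -0.618,
  \qquad A + BK \approx 0.382
\end{align*}
```

Only the positive root is kept, since $P \geq 0$ is required.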



fact: under suitable conditions (e.g., $(A, B)$ controllable and $(Q, A)$ observable), the ARE has only one positive semidefinite solution $P$

i.e., ARE plus $P = P^T \geq 0$ uniquely characterizes value function

consequence: the Riccati recursion

$$P_{k+1} = Q + A^T P_k A - A^T P_k B (R + B^T P_k B)^{-1} B^T P_k A, \qquad P_1 = Q$$

converges to the unique PSD solution of the ARE (when $(A, B)$ controllable)

(later we’ll see direct methods to solve ARE)

thus, infinite-horizon LQR optimal control is the same as steady-state finite horizon optimal control
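The Riccati recursion is easy to run numerically. The following is a minimal sketch, assuming a double-integrator system with $Q = I$, $R = I$; the function names `riccati_step`, `solve_are_by_iteration`, and `lqr_gain` are hypothetical helpers. It iterates from $P_1 = Q$ until the iterates stop changing, then checks the ARE residual.

```python
import numpy as np

def riccati_step(P, A, B, Q, R):
    """One step of the Riccati recursion: map P_k to P_{k+1}."""
    BtPA = B.T @ P @ A
    return Q + A.T @ P @ A - BtPA.T @ np.linalg.solve(R + B.T @ P @ B, BtPA)

def solve_are_by_iteration(A, B, Q, R, tol=1e-10, max_iter=10_000):
    """Iterate from P_1 = Q until successive iterates agree to within tol."""
    P = Q.copy()
    for _ in range(max_iter):
        P_next = riccati_step(P, A, B, Q, R)
        if np.max(np.abs(P_next - P)) < tol:
            return P_next
        P = P_next
    raise RuntimeError("Riccati recursion did not converge")

def lqr_gain(P, A, B, R):
    """State feedback gain K = -(R + B^T P B)^{-1} B^T P A."""
    return -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

# Assumed example data.
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.eye(1)

P = solve_are_by_iteration(A, B, Q, R)
K = lqr_gain(P, A, B, R)

# How well P satisfies the ARE, and how fast the closed loop decays.
are_residual = np.max(np.abs(P - riccati_step(P, A, B, Q, R)))
spectral_radius = np.max(np.abs(np.linalg.eigvals(A + B @ K)))
print(are_residual, spectral_radius)
```

Using `np.linalg.solve` instead of forming the inverse explicitly is a numerical convenience; the formula is the same.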



Receding-horizon LQR control
consider cost function

$$J_t(u_t, \ldots, u_{t+T-1}) = \sum_{\tau=t}^{t+T} \left( x_\tau^T Q x_\tau + u_\tau^T R u_\tau \right)$$

• T is called horizon
• same as infinite horizon LQR cost, truncated after T steps into future

if $(u_t^*, \ldots, u_{t+T-1}^*)$ minimizes $J_t$, $u_t^*$ is called ($T$-step ahead) optimal receding horizon control
in words:

• at time $t$, find input sequence that minimizes $T$-step-ahead LQR cost, starting at current time
• then use only the first input



example: 1-step ahead receding horizon control

find $u_t$, $u_{t+1}$ that minimize

$$J_t = x_t^T Q x_t + x_{t+1}^T Q x_{t+1} + u_t^T R u_t + u_{t+1}^T R u_{t+1}$$

first term doesn't matter; optimal choice for $u_{t+1}$ is 0; optimal $u_t$ minimizes

$$x_{t+1}^T Q x_{t+1} + u_t^T R u_t = (Ax_t + Bu_t)^T Q (Ax_t + Bu_t) + u_t^T R u_t$$

thus, 1-step ahead receding horizon optimal input is

$$u_t = -(R + B^T Q B)^{-1} B^T Q A x_t$$

. . . a constant state feedback



in general, optimal $T$-step ahead LQR control is

$$u_t = K_T x_t, \qquad K_T = -(R + B^T P_T B)^{-1} B^T P_T A$$

where

$$P_1 = Q, \qquad P_{i+1} = Q + A^T P_i A - A^T P_i B (R + B^T P_i B)^{-1} B^T P_i A$$

i.e.: same as the optimal finite horizon LQR control, $T - 1$ steps before the horizon $N$

• a constant state feedback


• state feedback gain converges to infinite horizon optimal as horizon
becomes long (assuming controllability)
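The claim that the first input of the $T$-step problem equals $K_T x_t$ can be checked by solving the truncated problem directly. The sketch below is an illustration under assumptions: the double-integrator data, $Q = I$, $R = I$, and the helper names `receding_horizon_gain` and `first_input_by_batch_solve` are not from the lecture. It stacks the predicted states as $Fx_t + GU$ and minimizes the resulting quadratic in $U = (u_t, \ldots, u_{t+T-1})$.

```python
import numpy as np

def receding_horizon_gain(A, B, Q, R, T):
    """K_T computed from the recursion P_1 = Q, ..., P_T."""
    P = Q.copy()
    for _ in range(T - 1):
        BtPA = B.T @ P @ A
        P = Q + A.T @ P @ A - BtPA.T @ np.linalg.solve(R + B.T @ P @ B, BtPA)
    return -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

def first_input_by_batch_solve(A, B, Q, R, T, x):
    """Minimize the T-step cost over (u_t, ..., u_{t+T-1}); return u_t."""
    n, m = B.shape
    # [x_{t+1}; ...; x_{t+T}] = F x + G [u_t; ...; u_{t+T-1}]
    F = np.vstack([np.linalg.matrix_power(A, i + 1) for i in range(T)])
    G = np.zeros((T * n, T * m))
    for i in range(T):          # block row i predicts x_{t+i+1}
        for j in range(i + 1):  # which depends on u_{t+j} for j <= i
            G[i*n:(i+1)*n, j*m:(j+1)*m] = np.linalg.matrix_power(A, i - j) @ B
    Q_bar = np.kron(np.eye(T), Q)
    R_bar = np.kron(np.eye(T), R)
    # Cost is (F x + G U)^T Q_bar (F x + G U) + U^T R_bar U, plus a constant.
    H = G.T @ Q_bar @ G + R_bar
    U = -np.linalg.solve(H, G.T @ Q_bar @ F @ x)
    return U[:m]

# Assumed example data.
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.eye(1)
x = np.array([1.0, -2.0])

matches = [
    np.allclose(receding_horizon_gain(A, B, Q, R, T) @ x,
                first_input_by_batch_solve(A, B, Q, R, T, x))
    for T in (1, 3, 10)
]
K_short = receding_horizon_gain(A, B, Q, R, 50)
K_long = receding_horizon_gain(A, B, Q, R, 200)
print(matches, K_short, K_long)
```

The batch solve is only a cross-check; in practice the recursion is far cheaper, since the gain $K_T$ is computed once and reused at every time step.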



Closed-loop system

suppose K is LQR-optimal state feedback gain

$x_{t+1} = Ax_t + Bu_t = (A + BK)x_t$

is called closed-loop system

($x_{t+1} = Ax_t$ is called open-loop system)

is closed-loop system stable? consider

$x_{t+1} = 2x_t + u_t$, $Q = 0$, $R = 1$

optimal control is $u_t = 0x_t$, i.e., closed-loop system is unstable

fact: if $(Q, A)$ observable and $(A, B)$ controllable, then closed-loop system is stable
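The scalar example can be reproduced in a few lines. This is a minimal sketch: the helper `scalar_lqr` is hypothetical, and the second case with $Q = 1$ is an added assumption chosen so that $(Q, A)$ is observable. It runs the scalar Riccati recursion from $p = q$ and compares the closed-loop coefficient $a + bk$ in both cases.

```python
def scalar_lqr(a, b, q, r, iters=200):
    """Run the scalar Riccati recursion from p = q; return (p, k)."""
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * p * b) ** 2 / (r + b * b * p)
    k = -(b * p * a) / (r + b * b * p)
    return p, k

a, b, r = 2.0, 1.0, 1.0

# Lecture example: Q = 0, so the state is not penalized at all.
p_unobs, k_unobs = scalar_lqr(a, b, q=0.0, r=r)
closed_loop_unobs = a + b * k_unobs

# Assumed variation: Q = 1, so (Q, A) is observable.
p_obs, k_obs = scalar_lqr(a, b, q=1.0, r=r)
closed_loop_obs = a + b * k_obs

print(closed_loop_unobs, closed_loop_obs)
```

With $Q = 0$ the recursion stays at $p = 0$, giving $k = 0$ and an unchanged, unstable system; with $Q = 1$ the gain moves the closed-loop coefficient inside the unit interval.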
