MPC and Data Driven Control
arXiv:2309.17307v1 [eess.SY] 29 Sep 2023

Abstract: Designing data-driven controllers in the presence of noise is an important research problem, in particular when guarantees on stability, robustness, and constraint satisfaction are desired. In this paper, we propose a data-driven min-max model predictive control (MPC) scheme to design state-feedback controllers from noisy data for unknown linear time-invariant (LTI) systems. The considered min-max problem minimizes the worst-case cost over the set of system matrices consistent with the data. We show that the resulting optimization problem can be reformulated as a semidefinite program (SDP). By solving the SDP, we obtain a state-feedback control law that stabilizes the closed-loop system and guarantees input and state constraint satisfaction. A numerical example demonstrates the validity of our theoretical results.

F. Allgöwer is thankful that his work was funded by Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany's Excellence Strategy - EXC 2075 - 390740016 and under grant 468094890. F. Allgöwer acknowledges the support by the Stuttgart Center for Simulation Science (SimTech). The authors thank the International Max Planck Research School for Intelligent Systems (IMPRS-IS) for supporting Yifan Xie.
Yifan Xie, Julian Berberich, Frank Allgöwer are with the Institute for Systems Theory and Automatic Control, University of Stuttgart, 70550 Stuttgart, Germany. {yifan.xie, julian.berberich, frank.allgower}@ist.uni-stuttgart.de.

I. INTRODUCTION

Model Predictive Control (MPC) is an advanced control technique that can handle nonlinear multi-input multi-output systems with constraints [15]. The basic idea of MPC is to solve an open-loop optimal control problem at each sampling time, which uses the current state as the initial condition and system dynamics to predict future open-loop states.

Ensuring robust constraint satisfaction in the presence of noise or model uncertainty is challenging. Several MPC methods have been proposed to deal with this issue. Tube-based robust MPC [12], [13], where a constraint tightening is included in the MPC optimization problem, ensures that all possible realizations of the state trajectory lie in an uncertainty tube around a nominal system. This approach typically assumes that the nominal system is known, whereas the noise is unknown and bounded. In [9] and [10], model uncertainty is characterized within a set, and additional measurements are leveraged to reduce model uncertainty and enhance the performance of the tube-based MPC scheme. Furthermore, min-max MPC can effectively address scenarios involving model uncertainty, as discussed in [1], [7], and [16]. The goal of min-max MPC is to design control inputs that minimize the worst-case cost w.r.t. disturbances or uncertainties. State-feedback control laws are commonly employed in the min-max MPC framework to decrease the computational complexity [1], [7], [16].

Standard model-based control techniques, in particular standard MPC schemes, rely on a priori knowledge of system models that are either identified from measured data by system identification methods [8] or derived based on first principles. In contrast, data-driven control approaches can design controllers directly from the available data. The Fundamental Lemma proposed by Willems et al. [20] has led to control strategies in a behavioral setting [11], and more recently in the context of MPC [2], [5]. For example, [2] proposes a robust data-driven MPC scheme and proves theoretical guarantees in case of noisy data. The behavioral framework requires persistently exciting data, enabling the unique representation of the system from the data in the noise-free scenarios. On the other hand, it has been shown in the informativity framework that the data need not be sufficiently informative to uniquely identify the system [19]. In this framework, control strategies are proposed to stabilize the system based on a representation of the set of system matrices consistent with the data [6], [17]–[19]. For example, [17] proposes a controller design method directly from noisy input-state data. However, the design of MPC schemes, known for their effectiveness in handling constraints, remains unaddressed in this framework.

In this paper, we present a data-driven min-max MPC scheme which uses noisy data to control linear time-invariant (LTI) systems with unknown system matrices. Our approach relies on a representation of the system matrices consistent with a sequence of noisy input-state data that was employed, e.g., by [3], [4], [17]. We reformulate the data-driven min-max MPC problem with input and state constraints to a semidefinite program (SDP) that yields a state-feedback control law. Further, we show that the proposed data-driven min-max MPC guarantees closed-loop properties including recursive feasibility, constraint satisfaction and exponential stability.

The remainder of this paper is organized as follows. In Section II, we introduce necessary preliminaries. In Section III, we propose the data-driven min-max MPC scheme and show that the scheme ensures recursive feasibility and exponential stability. We apply the developed scheme to a numerical example in Section IV. Finally, we conclude the paper in Section V.

II. PRELIMINARIES

Let $\mathbb{I}_{[a,b]}$ denote the set of integers in the interval $[a, b]$ and $\mathbb{I}_{\geq 0}$ denote the set of nonnegative integers. For a matrix $P$, we write $P \succ 0$ if $P$ is positive definite and $P \succeq 0$ if $P$ is positive semi-definite. For a vector $x$ and a matrix
$P \succ 0$, we write $\|x\|_P = \sqrt{x^\top P x}$. For matrices $A$ and $B$ of compatible dimensions, we abbreviate $ABA^\top$ to $AB[\star]^\top$.

A. System representation

We consider an unknown discrete-time LTI system

$$x_{t+1} = A_s x_t + B_s u_t + \omega_t, \tag{1}$$

where $x_t \in \mathbb{R}^n$ denotes the state, $u_t \in \mathbb{R}^m$ denotes the input, and $\omega_t \in \mathbb{R}^n$ denotes the unknown noise for $t \in \mathbb{N}$. The matrices $A_s \in \mathbb{R}^{n \times n}$ and $B_s \in \mathbb{R}^{m \times n}$-compatible $B_s \in \mathbb{R}^{n \times m}$ are unknown. We define a sequence of input, noise and corresponding state measurements, which is denoted by

$$U_- := \begin{bmatrix} u_0 & u_1 & \dots & u_{T-1} \end{bmatrix}, \quad W_- := \begin{bmatrix} \omega_0 & \omega_1 & \dots & \omega_{T-1} \end{bmatrix}, \quad X := \begin{bmatrix} x_0 & x_1 & \dots & x_T \end{bmatrix}.$$

Throughout this paper, we assume that data of the form $X$ and $U_-$ are available, whereas $W_-$ is unknown. The noise $\omega_t$ is assumed to satisfy the following instantaneous constraint.

Assumption 1: For all $t \in \mathbb{N}$, the noise $\omega_t \in \mathbb{R}^n$ satisfies $\|\omega_t\|_2^2 \leq \epsilon$ for a known bound $\epsilon \geq 0$.

We define the set of system matrices $(A, B)$ consistent with the data $x_i, u_i, x_{i+1}$, $i \in \mathbb{N}$, by

$$\Sigma_i := \left\{ (A, B) : \text{(1) holds for some } \omega_i \text{ satisfying } \|\omega_i\|_2^2 \leq \epsilon \right\}.$$

This set includes all system matrices for which there exists a noise satisfying Assumption 1 and the system dynamics (1). We proceed as in [3], [4], [17] to derive a data-driven parametrization of the system matrices. The system dynamics (1) can be rewritten as

$$\omega_i = x_{i+1} - A_s x_i - B_s u_i,$$

with which the set of system matrices $\Sigma_i$ can be equivalently characterized by the following quadratic matrix inequality

$$\Sigma_i = \left\{ (A, B) : \begin{bmatrix} I & A & B \end{bmatrix} \begin{bmatrix} I & x_{i+1} \\ 0 & -x_i \\ 0 & -u_i \end{bmatrix} \begin{bmatrix} \epsilon I & 0 \\ 0 & -I \end{bmatrix} [\star]^\top \succeq 0 \right\}.$$

The set of system matrices consistent with the sequence of input-state measurements $(U_-, X)$ is defined by

$$\mathcal{C} = \bigcap_{i=0}^{T-1} \Sigma_i.$$

We can characterize $\mathcal{C}$ by the following quadratic matrix inequality [3], [4]

$$\mathcal{C} = \left\{ (A, B) : \begin{array}{l} \begin{bmatrix} I & A & B \end{bmatrix} \Pi(\tau) \begin{bmatrix} I & A & B \end{bmatrix}^\top \succeq 0, \\ \forall \tau = (\tau_0, \dots, \tau_{T-1}),\ \tau_i \geq 0,\ i \in \mathbb{I}_{[0,T-1]} \end{array} \right\}, \tag{2}$$

where

$$\Pi(\tau) = \sum_{i=0}^{T-1} \tau_i \begin{bmatrix} I & x_{i+1} \\ 0 & -x_i \\ 0 & -u_i \end{bmatrix} \begin{bmatrix} \epsilon I & 0 \\ 0 & -I \end{bmatrix} \begin{bmatrix} I & x_{i+1} \\ 0 & -x_i \\ 0 & -u_i \end{bmatrix}^\top.$$

B. Problem setup

In this paper, we employ data-driven min-max MPC for the unknown system $x_{t+1} = A_s x_t + B_s u_t$ to stabilize the origin, while the input and state satisfy given constraints. As explained in Section II-A, the offline measurements $(U_-, X)$ are affected by noise satisfying the instantaneous constraint in Assumption 1. On the other hand, the data collected online during closed-loop operation are assumed to be noise-free. This is assumed for simplicity, to avoid an additional maximization w.r.t. the noise in the min-max MPC problem. Extending the proposed framework to handle online noise is an interesting issue for future research. In Section V, we show with a numerical example that the proposed approach produces reliable results also in the presence of online noise.

We consider ellipsoidal constraints on the input and the state, i.e.,

$$\|u_t\|_{S_u} \leq 1, \quad \|x_t\|_{S_x} \leq 1, \quad \forall t \in \mathbb{N},$$

where $S_u \succ 0$ and $S_x \succeq 0$. The centers of the ellipsoids for the input and state constraints are at the origin, but our results can be adapted for non-zero centers. In order to stabilize the origin, we define the quadratic stage cost function

$$l(u, x) = \|u\|_R^2 + \|x\|_Q^2,$$

where $R, Q \succ 0$. The following results can be adapted for non-zero equilibria $(u_s, x_s) \neq (0, 0)$.

III. DATA-DRIVEN MIN-MAX MPC

In Section III-A, we define a general data-driven min-max MPC problem with input and state constraints. In Section III-B, we restrict the optimization to state-feedback control laws, which allows us to reformulate the data-driven min-max MPC problem as an SDP. The state-feedback control law at each time step can be obtained from a receding-horizon algorithm, which is proposed in Section III-C. Finally, we prove recursive feasibility, constraint satisfaction and exponential stability for the closed-loop system in Section III-D.

A. Min-max MPC problem

At time $t$, given an initial state $x_t$, the data-driven min-max MPC optimization problem is formulated as follows:

\begin{align}
J_\infty^*(x_t) := \min_{\bar{u}(t)} \max_{(A,B) \in \mathcal{C}} \ & \sum_{k=0}^{\infty} l(\bar{u}_k(t), \bar{x}_k(t)) \tag{3a} \\
\text{s.t.} \ & \bar{x}_{k+1}(t) = A \bar{x}_k(t) + B \bar{u}_k(t), \tag{3b} \\
& \bar{x}_0(t) = x_t, \tag{3c} \\
& \|\bar{u}_k(t)\|_{S_u} \leq 1, \ \forall t \in \mathbb{N}, \tag{3d} \\
& \|\bar{x}_k(t)\|_{S_x} \leq 1, \ \forall (A, B) \in \mathcal{C},\ t \in \mathbb{N}. \tag{3e}
\end{align}

The objective function is a minimization of the worst-case cost over all consistent system matrices in $\mathcal{C}$ by adapting the control input $\bar{u}_k(t)$, $\forall k \in \mathbb{N}$. In the optimization problem, $\bar{x}_k(t)$ and $\bar{u}_k(t)$ are the predicted state and control input at time $t+k$ based on the measurement at time $t$. The prediction model employs the system matrices consistent with the data
trajectory in constraint (3b). In constraint (3c), we initialize $\bar{x}_0(t)$ as the state measurement at time $t$. We consider that the input and state should lie in the ellipsoidal constraints in (3d) and (3e). The state constraint (3e) must be satisfied for any states predicted using any system matrices in $\mathcal{C}$. In order to effectively address this problem and obtain a tractable solution, we limit our focus to a state-feedback control law of the form $u_t = F_t x_t$, where $F_t \in \mathbb{R}^{m \times n}$.

B. Reformulation as an SDP

In this subsection, we first prove that the state-feedback gain $F_t$ that minimizes the upper bound on the optimal cost of the min-max MPC problem (4) can be obtained by solving the SDP (8). Then, we reformulate the input and state constraints as linear matrix inequalities (LMIs).

We first neglect the input and state constraints (3d) and (3e) and obtain the state-feedback gain for the following data-driven min-max MPC problem

\begin{align}
J_\infty^*(x_t) := \min_{\bar{u}(t)} \max_{(A,B) \in \mathcal{C}} \ & \sum_{k=0}^{\infty} l(\bar{u}_k(t), \bar{x}_k(t)) \tag{4a} \\
\text{s.t.} \ & \bar{x}_{k+1}(t) = A \bar{x}_k(t) + B \bar{u}_k(t), \tag{4b} \\
& \bar{x}_0(t) = x_t. \tag{4c}
\end{align}

The method to reformulate the input and state constraints (3d)-(3e) will be proposed later.

Our goal is to derive an upper bound on the worst-case cost over the set $\mathcal{C}$ and then to find a state-feedback control law to minimize this upper bound. In order to derive the upper bound of the worst-case cost over all system matrices in $\mathcal{C}$, we define a quadratic function $V(x) = x^\top P x$ for $x \in \mathbb{R}^n$, where $P \succ 0$. Suppose $V$ satisfies the following inequality for all states and inputs $\bar{x}_k(t), \bar{u}_k(t)$, $k \in \mathbb{N}$, predicted by the system dynamics (4b) with any $(A, B) \in \mathcal{C}$:

$$V(\bar{x}_{k+1}(t)) - V(\bar{x}_k(t)) \leq -l(\bar{u}_k(t), \bar{x}_k(t)). \tag{5}$$

To ensure that the cost in equation (4a) is finite, we must have $\lim_{k \to \infty} \bar{x}_k(t) = 0$. Therefore, we have $\lim_{k \to \infty} V(\bar{x}_k(t)) = 0$. Summing the inequality (5) from $k = 0$ to $k = T$ along an arbitrary trajectory and letting $T \to \infty$, we obtain

$$-V(\bar{x}_0(t)) \leq -\sum_{k=0}^{\infty} l(\bar{u}_k(t), \bar{x}_k(t)). \tag{6}$$

Since $\bar{x}_0(t) = x_t$ and the inequality (6) holds for any matrices $(A, B) \in \mathcal{C}$, it also holds for the worst-case value, i.e.,

$$\max_{(A,B) \in \mathcal{C}} \sum_{k=0}^{\infty} l(\bar{u}_k(t), \bar{x}_k(t)) \leq V(x_t). \tag{7}$$

This provides an upper bound on the cost (4a). The above method to derive the upper bound on the worst-case cost, i.e., (5)-(7), is inspired by the existing LMI-based min-max MPC approach in [7].

The goal of our data-driven min-max MPC problem is to synthesize a state-feedback control law $u_t = F_t x_t$ to minimize the upper bound $V(x_t)$ satisfying (5) for any $(A, B) \in \mathcal{C}$. As the following theorem shows, this is possible based on LMIs.

Theorem 1: Suppose that there exist $\gamma > 0$, $H \in \mathbb{R}^{n \times n}$, $L \in \mathbb{R}^{m \times n}$, $\tau \in \mathbb{R}^T$ such that the inequalities (8) hold

\begin{align}
& \begin{bmatrix} 1 & x_t^\top \\ x_t & H \end{bmatrix} \succeq 0, \tag{8a} \\
& \begin{bmatrix} \begin{bmatrix} -H & 0 \\ 0 & 0 \end{bmatrix} + \Pi(\tau) & \begin{bmatrix} 0 \\ H \\ L \end{bmatrix} & 0 \\ \begin{bmatrix} 0 & H & L^\top \end{bmatrix} & -H & \Phi^\top \\ 0 & \Phi & -\gamma I \end{bmatrix} \prec 0, \tag{8b} \\
& \tau = (\tau_0, \dots, \tau_{T-1}),\ \tau_i \geq 0,\ \forall i \in \mathbb{I}_{[0,T-1]}, \tag{8c}
\end{align}

where $\Phi = \begin{bmatrix} M_R L \\ M_Q H \end{bmatrix}$ and $M_R^\top M_R = R$, $M_Q^\top M_Q = Q$. Then $\gamma$ is an upper bound on the optimal cost of (4). Applying the state-feedback control $u_t = F x_t$ with $F = L H^{-1}$ to the system (1) leads to a cost that is guaranteed to be at most $\gamma$.

Proof: As discussed in (5)-(7), the quadratic function $V(x_t) = x_t^\top P x_t$ with $P \succ 0$ is an upper bound on the optimal cost of (4). Suppose $x_t^\top P x_t \leq \gamma$ holds and define $H = \gamma P^{-1} \succ 0$. Using the Schur complement, $x_t^\top P x_t \leq \gamma$ is equivalent to the inequality (8a).

Additionally, $V$ is required to satisfy the inequality (5) for any $(A, B) \in \mathcal{C}$, $k \in \mathbb{N}$. We provide sufficient conditions for inequality (5) based on LMIs. By substituting $\bar{u}_k(t) = F \bar{x}_k(t)$, the inequality (5) holds for all $\bar{x}_k(t), \bar{u}_k(t)$, $k \in \mathbb{N}$ and $(A, B) \in \mathcal{C}$ if

$$(A + BF)^\top P (A + BF) - P + F^\top R F + Q \prec 0 \tag{9}$$

holds for any $(A, B) \in \mathcal{C}$. Multiplying both sides of the inequality (9) with $H = \gamma P^{-1}$, defining $L = FH$ and dividing by $\gamma$, we obtain

$$(AH + BL)^\top H^{-1} (AH + BL) - H + \frac{1}{\gamma}\left(L^\top R L + H Q H\right) \prec 0. \tag{10}$$

Using the Schur complement with $H \succ 0$ and defining $\Phi = \begin{bmatrix} M_R L \\ M_Q H \end{bmatrix}$, the inequality (10) is equivalent to

$$\begin{bmatrix} H - \frac{1}{\gamma} \Phi^\top \Phi & (AH + BL)^\top \\ (AH + BL) & H \end{bmatrix} \succ 0.$$

Using the Schur complement again, we obtain the equivalent inequalities

\begin{align}
H - (AH + BL)\left(H - \frac{1}{\gamma} \Phi^\top \Phi\right)^{-1} (AH + BL)^\top &\succ 0, \tag{11a} \\
H - \frac{1}{\gamma} \Phi^\top \Phi &\succ 0. \tag{11b}
\end{align}

The inequality (11a) is equivalent to

$$\begin{bmatrix} I \\ A^\top \\ B^\top \end{bmatrix}^\top \begin{bmatrix} H & 0 \\ 0 & -\begin{bmatrix} H \\ L \end{bmatrix} \left(H - \frac{1}{\gamma} \Phi^\top \Phi\right)^{-1} \begin{bmatrix} H \\ L \end{bmatrix}^\top \end{bmatrix} \begin{bmatrix} I \\ A^\top \\ B^\top \end{bmatrix} \succ 0. \tag{12}$$
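As the set $\mathcal{C}$ is characterized by (2), the inequality (12) holds for any $(A, B) \in \mathcal{C}$ if there exists $\tau = (\tau_0, \dots, \tau_{T-1})$, $\tau_i \geq 0$, $i \in \mathbb{I}_{[0,T-1]}$, such that the following inequality holds

$$\begin{bmatrix} -H & 0 \\ 0 & \begin{bmatrix} H \\ L \end{bmatrix} \left(H - \frac{1}{\gamma} \Phi^\top \Phi\right)^{-1} \begin{bmatrix} H \\ L \end{bmatrix}^\top \end{bmatrix} + \Pi(\tau) \prec 0. \tag{13}$$

Applying again the Schur complement, (13) together with (11b) is equivalent to [...]

[...] holds for any $(A, B) \in \mathcal{C}$, $k \in \mathbb{N}$. Since $l(\bar{u}_k(t), \bar{x}_k(t)) \geq 0$, we have

$$\bar{x}_{k+1}(t)^\top P \bar{x}_{k+1}(t) \leq \bar{x}_k(t)^\top P \bar{x}_k(t) \tag{14}$$

for any $(A, B) \in \mathcal{C}$, $k \in \mathbb{N}$. Therefore, if $x_t^\top P x_t = \bar{x}_0(t)^\top P \bar{x}_0(t) \leq \gamma$, then we have

$$x_{t+1}^\top P x_{t+1} \leq \max_{(A,B) \in \mathcal{C}} \bar{x}_1(t)^\top P \bar{x}_1(t) \overset{(14)}{\leq} \gamma.$$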
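The decrease condition (5), the cost bound (7) and the monotonicity (14) can be illustrated numerically. The following is a minimal sketch, not the paper's SDP: the two candidate models, the gain `F` and the choice $P = 2I$ are hypothetical values picked by hand so that (9) holds for both candidates, and the finite list `CANDIDATES` merely stands in for the set $\mathcal{C}$.

```python
# Hypothetical illustration data (not from the paper): two candidate models
# (A, B) standing in for the consistent set C, with n = 2 states, m = 1 input.
CANDIDATES = [
    ([[0.5, 0.1], [0.0, 0.4]], [0.0, 1.0]),
    ([[0.45, 0.1], [0.05, 0.4]], [0.0, 1.0]),
]
F = [0.0, -0.1]   # hand-picked state-feedback gain, u = F x
P_SCALE = 2.0     # V(x) = x^T P x with P = P_SCALE * I (not computed from (8))
Q_SCALE = 1.0     # stage cost weight Q = Q_SCALE * I
R = 1.0           # stage cost weight on the scalar input


def V(x):
    """Quadratic function V(x) = x^T P x."""
    return P_SCALE * (x[0] ** 2 + x[1] ** 2)


def stage_cost(u, x):
    """Stage cost l(u, x) = ||u||_R^2 + ||x||_Q^2."""
    return R * u ** 2 + Q_SCALE * (x[0] ** 2 + x[1] ** 2)


def step(A, B, x):
    """Apply u = F x and propagate the noise-free dynamics x+ = A x + B u."""
    u = F[0] * x[0] + F[1] * x[1]
    x_next = [A[i][0] * x[0] + A[i][1] * x[1] + B[i] * u for i in range(2)]
    return u, x_next


def simulate(A, B, x0, steps=50):
    """Return (accumulated cost, V along the trajectory, whether (5) held at every step)."""
    x, cost, values, decrease_ok = list(x0), 0.0, [V(x0)], True
    for _ in range(steps):
        u, x_next = step(A, B, x)
        l = stage_cost(u, x)
        # Inequality (5): V(x+) - V(x) <= -l(u, x), with a small float tolerance.
        decrease_ok = decrease_ok and (V(x_next) - V(x) <= -l + 1e-12)
        cost += l
        values.append(V(x_next))
        x = x_next
    return cost, values, decrease_ok


x0 = [0.6, -0.4]
results = [simulate(A, B, x0) for A, B in CANDIDATES]
worst_cost = max(cost for cost, _, _ in results)
print("worst-case accumulated cost:", worst_cost)
print("upper bound V(x0):", V(x0))
```

For these hand-picked values, the worst accumulated cost over the candidates stays below $V(x_0)$, as in (7), and $V$ is non-increasing along each trajectory, as in (14). In the scheme described above, $P$ and $F$ would instead come from a solution of the SDP (8) via $P = \gamma H^{-1}$ and $F = L H^{-1}$, and the bound would hold for every model in $\mathcal{C}$ rather than for a finite sample.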
This figure "state.jpg" is available in "jpg" format from:
http://arxiv.org/ps/2309.17307v1