
2021 American Control Conference (ACC)

New Orleans, USA, May 25-28, 2021

Distributed inverse optimal control for discrete-time nonlinear multi-agent systems

João P. Belfo1, A. Pedro Aguiar2, Senior Member, IEEE, and João M. Lemos3, Senior Member, IEEE

This work was performed within the framework of the project HARMONY, Distributed Optimal Control for Cyber-Physical Systems Applications, financed by FCT under contract AAC n2/SAICT/2017 - 031411, project IMPROVE - POCI-01-0145-FEDER-031823, and pluriannual INESC-ID funding UIDB/50021/2020.
1 INESC-ID, Instituto Superior Técnico, Universidade de Lisboa, Lisboa, Portugal (e-mail: [email protected]).
2 SYSTEC, Faculdade de Engenharia da Universidade do Porto, Porto, Portugal (e-mail: [email protected]).
3 INESC-ID, Instituto Superior Técnico, Universidade de Lisboa, Lisboa, Portugal (e-mail: [email protected]).

Abstract— This paper describes a robust distributed inverse optimal control framework for a multi-agent discrete-time nonlinear system, where the dynamics of each agent is directly affected by terms that depend on the state and input of the neighboring agents and on other disturbance signals. An individual local cost is formulated and a control solution for each agent is derived using an inverse optimal control approach. To address the interaction between the agents, a coordination method based on a non-cooperative game is proposed. Using Lyapunov and Input-to-State Stability (ISS) arguments, we derive conditions under which the proposed game converges to a fixed point and the overall multi-agent system is ISS with respect to the disturbance signals. Simulation results for a coupled pendula system are presented.

I. INTRODUCTION

Distributed control of networked systems, in which many practical systems are composed of a large number of interacting linear or nonlinear subsystems spread over wide areas, is a topic that currently receives major attention. Efforts have been made to develop nonlinear distributed optimal control strategies in order to design nonlinear local controllers, associated with each subsystem in the network, such that the overall system is stable [1], [2].

In this work, contrary to many results in the literature, where linear dynamics and no dynamic interaction between the agents in open loop are assumed, a multi-agent system is considered in which each agent is modeled by disturbed discrete-time, nonlinear, time-invariant dynamics. Moreover, the local agents interact through the states and inputs of the neighboring agents, and some of these interactions may be unknown. A typical example of such a system is the water delivery canal presented in [3]. The contribution of this work is a distributed sub-optimal control strategy that combines robust inverse optimal control (RIOC) [4] for the design of the local controllers with a game-based coordination algorithm. Conditions under which the coordination algorithm converges and the overall controlled multi-agent system is input-to-state stable (ISS) are presented. The proposed distributed control strategy is applied to a nonlinear coupled pendula system.

II. PROBLEM STATEMENT

Consider a multi-agent system with a total of $N_a$ agents that interact with each other according to the discrete-time nonlinear dynamics

$x_{i,t+1} = F_i(x_{\mathcal{V},t}) + \sum_{j \in S_i} g_{ij}(x_{S_i,t})\, u_{j,t}, \quad i \in \mathcal{V},$   (1)

where $\mathcal{V} = \{1, \ldots, N_a\}$, $x_{\mathcal{V},t} = \{x_{1,t}, \ldots, x_{N_a,t}\}$ is the set of states of all agents at sampling time $t$, $x_{i,t} \in \mathbb{R}^n$ is the state of agent $i$, $u_{j,t} \in \mathbb{R}^m$ is the control signal of agent $j$, $F_i : \mathbb{R}^n \to \mathbb{R}^n$ and $g_{ij} : \mathbb{R}^{n \times m} \to \mathbb{R}^n$ are smooth mappings such that $F_i(0) = 0$ and $g_{ii}(x_{S_i}(t)) \neq 0$ for all $x_{S_i}(t) \neq 0$, $S_i$ is the set of agents with which agent $i$ interacts, including itself, and $x_{S_i}$ corresponds to the set of states $x_p$, for all $p \in S_i$. It is important to stress that, in this setup, the interaction among agents may happen not only through the manipulated input signals, but also through the states.

In (1), we consider two types of interaction: one that is directly due to the neighboring agents, and another that can be viewed as a disturbance. More precisely, we consider that the function $F_i$ can be written as

$F_i(x_{\mathcal{V},t}) = f_i(x_{S_i,t}) + \bar{f}_i(x_{\mathcal{V},t}),$   (2)

where we assume that the disturbance term is bounded by

$\|\bar{f}_i(x_{\mathcal{V},t})\| \leq Z_i(x_{\mathcal{V},t}) + \delta_i\, \alpha_i(\|x_{S_i,t}\|),$   (3)

with $Z_i(x_{\mathcal{V},t}) \leq z_i$, for positive constants $\delta_i, z_i > 0$, and $\alpha_i(\|x_{S_i,t}\|)$ a $\mathcal{K}_\infty$-function.

The multi-agent system can be represented by an undirected graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, where $\mathcal{V} = \{1, \ldots, N_a\}$ is the set of agents associated with the graph nodes, and $\mathcal{E} \subseteq \mathcal{V} \times \mathcal{V}$ is the set of edges. Two agents interact (and their respective controllers may communicate) if there is an edge connecting them in the graph $\mathcal{G}$. If two agents $i$ and $j$ satisfy $j \in S_i$, then there must be an edge between the corresponding nodes in $\mathcal{E}$. In this paper, we consider that $\mathcal{G}$ is connected and that its topology does not change over time.
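To make the structure of the model (1)-(2) concrete, the following Python sketch propagates all agents one sampling step. It is illustrative only: the particular choices of f_i, the disturbance term and g_ij below, and the interaction sets, are placeholder assumptions (the disturbance is bounded by construction, in the spirit of (3)), not quantities taken from the paper.

```python
import numpy as np

# Illustrative sketch of the model (1)-(2): each agent i evolves as
#   x_{i,t+1} = f_i(x_{S_i,t}) + fbar_i(x_{V,t}) + sum_{j in S_i} g_ij(x_{S_i,t}) u_{j,t}
# The concrete f_i, fbar_i, g_ij below are placeholders, not the paper's.

Na, n, m = 3, 2, 1                        # number of agents, state and input dimensions
S = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2]}  # interaction sets S_i (0-indexed)

def f_i(i, xS):
    """Nominal local drift f_i, depending only on neighbour states (placeholder)."""
    return 0.9 * xS[i] + 0.05 * sum(xS[j] for j in S[i] if j != i)

def fbar_i(i, xV):
    """Bounded disturbance term (placeholder); bounded as assumed in (3)."""
    return 0.01 * np.tanh(sum(xV[j] for j in xV))

def g_ij(i, j, xS):
    """Input map g_ij (placeholder); g_ii is nonzero, as required."""
    return np.ones((n, m)) * (1.0 if i == j else 0.2)

def step(x, u):
    """One sampling step of (1) for all agents: x, u are dicts indexed by agent."""
    x_next = {}
    for i in range(Na):
        xS = {j: x[j] for j in S[i]}
        x_next[i] = f_i(i, xS) + fbar_i(i, x) + sum(g_ij(i, j, xS) @ u[j] for j in S[i])
    return x_next

x = {i: np.ones(n) * (i + 1) for i in range(Na)}
u = {i: np.zeros(m) for i in range(Na)}
print(step(x, u))
```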
A. Distributed topology

In general, the goal of the control system is to optimize a global objective function $J$ with respect to the control variables $u_{i,t}$, $i = 1, \ldots, N_a$. This problem can be solved using a centralized topology, in which there is a central controller that optimizes $J$ and then sends the control signals to all the agents. In contrast, under certain conditions, the problem can also be solved using a distributed topology, where $J$ is separated into different objective functions $J_i$ that are optimized by the local controller associated with each agent $i$.

The local controllers communicate (using a certain coordination protocol) in order to achieve a consensus. In this case, we consider that the global cost $J$ can be expressed as

$J(u_1, \ldots, u_{N_a}) = \sum_{i=1}^{N_a} J_i(x_{i,t}, u_{i,t}),$   (4)

where $J_i$ is the local cost associated with agent $i$. In this paper, the local optimization problem solved by the local controller of agent $i \in \mathcal{V}$ is defined as

$\min_{u_i}\; J_i(x_{i,t}, u_{i,t})$
$\text{s.t.}\;\; x_{i,t+1} = F_i(x_{\mathcal{V},t}) + \sum_{j \in S_i} g_{ij}(x_{S_i,t})\, u_{j,t},$   (5)
$\phantom{\text{s.t.}\;\;} x_{i,t=0} = x_{i0},$

where the local cost is given by

$J_i = \sum_{t=0}^{\infty} l(x_{i,t}) + u_{i,t}' R_i u_{i,t},$   (6)

with $l(\cdot)$ denoting a positive semidefinite function of the state, and $R_i = R_i' \succ 0$ a constant weight.

It is clear that solving the optimization problems (5) separately for $i = 1, \ldots, N_a$ does not yield the $(u_1, \ldots, u_{N_a})$ that minimizes $J$. Since the local optimization problem (5) depends on the states and inputs of the neighboring agents, two approaches can be envisioned: i) the fully decentralized case, which corresponds to a topology with no communication between the agents; ii) the distributed case, in which the agents communicate with their neighbors.

In the former, if possible, each local controller would have to be robust to the dynamic interactions with the neighboring agents, making the design more conservative and far from the performance of the centralized case. In the latter, coordination is needed to adequately incorporate and exploit the information that the neighbors are transmitting, allowing in this way the design of less conservative local controllers.

III. NON-LINEAR DISTRIBUTED CONTROL

Since nonlinear dynamics are considered for the agents in the network, inverse optimal control techniques are used to solve the local optimization problems (5). To simplify the presentation, consider first the disturbance-free case, that is, consider that the term $\bar{f}_i(x_{\mathcal{V},t})$ is zero, and define the local cost functional

$V(x_{i,t}) = \sum_{n=t}^{\infty} l(x_{i,n}) + u_{i,n}' R_i u_{i,n}.$   (7)

Proposition 1. The discrete-time Hamilton-Jacobi-Bellman (HJB) equation related to the local optimization (5)-(6) in the disturbance-free case can be written as

$V^*(x_{i,t}) = l(x_{i,t}) + u_{i,t}^{*\prime} R_i u_{i,t}^* + V^*(x_{i,t+1}),$   (8)

and the optimal control law satisfies

$u_{i,t}^* = -\frac{1}{2} R_i^{-1} g_{ii}' \left.\frac{\partial V(x_{i,t+1})}{\partial x_{i,t+1}}\right|_{x_{i,t+1} = f_i + g_{ii} u_{i,t}^* + h_{i,t}},$   (9)

where, for simplification,

$f_i = f_i(x_{S_i,t}), \quad g_{ij} = g_{ij}(x_{S_i,t}), \quad h_{i,t} = \sum_{j \in S_i \setminus \{i\}} g_{ij}(x_{S_i,t})\, u_{j,t}.$   (10)

Proof. From Bellman's principle of optimality, the value function $V(x_{i,t})$ (i.e., the optimal cost that drives the system from the initial state $x_{i,t}$, at time $t$, up to infinity) satisfies [5]

$V^*(x_{i,t}) = \min_{u_{i,t}} \left\{ l(x_{i,t}) + u_{i,t}' R_i u_{i,t} + V^*(x_{i,t+1}) \right\},$   (11)

which progresses backwards in time. Computing the gradient of the right-hand side of (11) with respect to $u_{i,t}$ yields$^1$

$\frac{\partial \left( l(x_{i,t}) + u_{i,t}' R_i u_{i,t} \right)}{\partial u_{i,t}} + \left( \frac{\partial x_{i,t+1}}{\partial u_{i,t}} \right)' \frac{\partial V^*(x_{i,t+1})}{\partial x_{i,t+1}} = 0,$   (12)

which leads to the control law (9), written in implicit form. Substituting (9) in (11), (8) is obtained.

With the exception of the Linear Quadratic Regulator (LQR) case (linear system and quadratic $l(x_t)$), in which (8) yields the Riccati equation and $V^*$ therefore has a quadratic form, solving (8) is in general difficult.

A. Robust inverse optimal control approach

In the Inverse Optimal Control (IOC) problem [4], [7], an optimal control law $u_{i,t}^*$ is first formulated and then the objective function for which that $u_{i,t}^*$ is optimal is calculated. When considering nonlinear dynamics and the presence of disturbances, the Robust Inverse Optimal Control (RIOC) approach [4] is a suitable method, in which a controller is designed such that, for the class of bounded unknown disturbances (3), the system (1) is ISS with respect to the disturbance.

More precisely, consider the discrete-time nonlinear system described in (1) with the associated local objective functional (7), defined for each sampling time $t$, together with the assumption (3). Then, using the results in [4], we introduce the following definition.

Definition 1. The control law (9) is robust inverse optimal (globally) stabilizing if
(i) it achieves (global) input-to-state stability (ISS) for the system (1) with respect to the disturbance $\bar{f}_i$;
(ii) $V(x_{i,t})$ is (radially unbounded) positive definite and the inequality

$V(x_{i,t+1}) - V(x_{i,t}) + u_{i,t}^{*\prime} R_i u_{i,t}^* \leq -\sigma(x_{S_i,t}) + l_{d_i} \|\bar{f}\|$   (13)

is satisfied, where $\sigma(x_{S_i,t})$ is a positive definite function that represents a desired amount of negativity [4], and $l_{d_i}$ is a positive constant.

This definition is based on the knowledge of $V(x_{i,t})$. To solve RIOC problems, a $V(x_{i,t})$ expression that satisfies (i) and (ii) is chosen. Depending on the dynamics of the system, this function can be very different.

$^1$ For the case in which the value function is not differentiable, one must use the machinery of non-smooth analysis and viscosity solutions of the HJB equation [6].

2662
Authorized licensed use limited to: Peter the Great St. Petersburg Polytechnic Univ. Downloaded on February 21,2024 at 00:31:05 UTC from IEEE Xplore. Restrictions apply.
However, a starting candidate is the quadratic function

$V(x_{i,t}) = \frac{1}{2} x_{i,t}' P_i x_{i,t},$   (14)

with $P_i = P_i' \succ 0$. Once an expression for $V(x_{i,t})$, $\forall x_{i,t} \neq 0$, is guessed, it is possible to calculate $l(x_{i,t})$ from

$l(x_{i,t}) = V(x_{i,t+1}) - V(x_{i,t}) + u_{i,t}^{*\prime} R_i u_{i,t}^*,$   (15)

which for the quadratic case simplifies to

$l(x_{i,t}) = \frac{1}{4} P_{i1}' [R_i + P_{i2}]^{-1} P_{i1} - V_{fh_i},$   (16)

where $P_{i1} = g_{ii}' P_i (f_i + h_{i,t})$, $P_{i2} = \frac{1}{2} g_{ii}' P_i g_{ii}$, and $V_{fh_i} = \frac{1}{2} (f_i + h_{i,t})' P_i (f_i + h_{i,t})$, with $f_i$, $g_{ii}$ and $h_{i,t}$ defined in (10).

In this case, in order to define the matrix $P_i$, we make use of (13) with $\sigma(z) = \xi_i \alpha_i(z)$, $\xi_i > 0$, and the following result from [4], [7]: select $P_i = P_i' \succ 0$ such that the inequality

$V(x_{i,t+1}) - V(x_{i,t}) + u_{i,t}^{*\prime} R_i u_{i,t}^* \leq -\xi_i \alpha_i(\|x_{S_i,t}\|)$   (17)

holds for

$\forall x_{S_i,t} : \|x_{S_i,t}\| \geq \alpha_i^{-1}\!\left( \frac{l_{d_i} z_i}{\beta_i \xi_i} \right),$   (18)

where $u_{i,t}^*$ is defined in (9), with $\delta_i$ in (3) satisfying

$\delta_i < \frac{\eta_i}{l_{d_i}},$   (19)

where $l_{d_i} > 0$ and $\eta_i = (1 - \beta_i)\, \xi_i$, for $0 < \beta_i < 1$.

Lemma 1. With the choice of (14), the control law becomes

$u_{i,t}^* = -\frac{1}{2} [R_i + P_{i2}]^{-1} P_{i1}.$   (20)

Moreover, the closed-loop system (1) with (20) is ISS.

Proof. The proof can be found in [4].

There are different methods that can be used to find an appropriate value for $P_i$: for instance, the particle swarm optimization (PSO) algorithm defined in [8] and used in [4], the BB-BC algorithm defined in [9] and used in [10], the heuristic dynamic programming (HDP) algorithm used in [5], and many others. In this paper, the PSO algorithm is considered.
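For the quadratic candidate (14), the explicit law (20) and the stage cost implied by (15) can be evaluated directly. The following sketch does this for a single agent at one sampling instant; the matrices P_i, R_i and the toy data are placeholder assumptions, whereas in the paper P_i would instead be tuned (e.g., by PSO) so that (17)-(19) hold.

```python
import numpy as np

# Sketch of the explicit RIOC law (20) for a quadratic value function
# V(x_i) = 0.5 * x_i' P_i x_i.  The matrices below are toy placeholders.

P_i = np.array([[2.0, 0.2],
                [0.2, 1.5]])          # candidate P_i = P_i' > 0
R_i = np.array([[0.1]])               # input weight R_i = R_i' > 0

def rioc_control(f_i, g_ii, h_it):
    """Explicit control law (20): u* = -1/2 (R_i + P_i2)^{-1} P_i1,
    with P_i1 = g_ii' P_i (f_i + h_it) and P_i2 = 1/2 g_ii' P_i g_ii."""
    P_i1 = g_ii.T @ P_i @ (f_i + h_it)
    P_i2 = 0.5 * g_ii.T @ P_i @ g_ii
    return -0.5 * np.linalg.solve(R_i + P_i2, P_i1)

def inferred_stage_cost(x_i, f_i, g_ii, h_it, u_star):
    """Stage cost implied by (15): l = V(x+) - V(x) + u*' R_i u*."""
    V = lambda x: 0.5 * x @ P_i @ x
    x_next = f_i + g_ii @ u_star + h_it          # closed-loop successor state
    return V(x_next) - V(x_i) + u_star @ R_i @ u_star

# toy data for one agent at one sampling instant
x_i = np.array([0.5, -0.3])
f_i = 0.9 * x_i                                  # placeholder nominal drift f_i(x_{S_i,t})
g_ii = np.array([[1.0], [0.5]])                  # local input map
h_it = np.array([0.05, 0.0])                     # neighbours' contribution h_{i,t} (assumed known)

u_star = rioc_control(f_i, g_ii, h_it)
print("u* =", u_star, " inferred l(x) =", inferred_stage_cost(x_i, f_i, g_ii, h_it, u_star))
```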
B. Coordination method

Since the control law (9) depends on the knowledge of $u_{j,t}$, for $j \in S_i \setminus \{i\}$, and vice-versa, a coordination method is needed for the local controllers to reach a consensual decision. The coordination method considered here is a non-cooperative game based on game theory.

Define $U_i^t(k)$, $i \in \mathcal{V}$, $k = 0, \ldots, N-1$, where $N$ is the total number of iterations of the game, with $U^t(0) = \left[ U_1^t(0), \ldots, U_{N_a}^t(0) \right]' = U_0^t$, for $U_0^t$ given, as the signals that capture the sequence of decisions (i.e., the solutions of the local optimization problems during the coordination phase) of each local controller. Note that, using this notation, the optimal solution is taken to be $u_{i,t}^* = U_i^t(N)$ at each sampling time $t$, for $i \in \mathcal{V}$.

The coordination method consists of recursively performing the steps in Algorithm 1, where $H$ is the total period for which the system operates.

Algorithm 1 Non-cooperative game
Initialization: $x_{i,t=0} = x_{i0}$, $u_{i,t=0} = u_{i0}$, for all $i \in \mathcal{V}$
1: for $t = 0$ to $t = H - 1$ do
2:   for $k = 0$ to $k = N - 1$ do
3:     for all $i \in \mathcal{V}$ do
4:       local controller $i$ solves (5), obtaining an optimal control solution $U_i^t(k+1)$ given by (20), for all $i \in \mathcal{V}$.
5:       local controller $i$ broadcasts its solution $U_i^t(k+1)$ to its neighbors $j$, with $j \in S_i \setminus \{i\}$.
6:     end for
7:   end for
8:   $U_i^{t+1}(0) = U_i^t(N)$.
9:   agent $i$ applies $u_{i,t}^* \leftarrow U_i^t(N)$ and broadcasts its state update $x_{i,t+1}$, according to (1), to its neighbors, for all $i \in \mathcal{V}$.
10: end for

Besides the evolution of the dynamics of each agent at each sampling time $t$, there is also an evolution of each control signal indexed by $k$, for each $t$, given by the signals $U_i^t(k)$. This internal dynamics is dictated by the sequence of solutions of the local optimization problems, which are different for each $k$, since each local optimization problem depends on the control signals of the neighbors, which also change with $k$.
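The following Python sketch mirrors the structure of Algorithm 1: an outer loop over sampling instants t, an inner game of N best-response rounds in k with broadcasts to the neighbours, and a warm start of the next game with U^{t+1}(0) = U^t(N). The best_response and plant_step functions are placeholders standing in for the local problem (5)/(20) and the dynamics (1); they are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

# Schematic rendering of Algorithm 1 (non-cooperative game coordination).

Na, m = 3, 1
S = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2]}   # interaction sets (0-indexed)

def best_response(i, x, U):
    # placeholder for agent i solving (5) with the neighbours' latest
    # broadcast decisions held fixed (e.g. the explicit law (20))
    return -0.3 * x[i][:m] - 0.2 * sum(U[j] for j in S[i] if j != i)

def plant_step(x, u):
    # placeholder for the true dynamics (1); kept linear here for brevity
    return {i: 0.9 * x[i] + 0.3 * sum(u[j] for j in S[i]) * np.ones_like(x[i])
            for i in range(Na)}

def run(x0, H=20, N=5):
    x = dict(x0)
    U = {i: np.zeros(m) for i in range(Na)}   # U_i^t(0); carried over between steps (step 8)
    for t in range(H):                        # outer loop over sampling instants
        for k in range(N):                    # inner game: N rounds of simultaneous updates
            U = {i: best_response(i, x, U) for i in range(Na)}
        u_star = U                            # u*_{i,t} <- U_i^t(N)
        x = plant_step(x, u_star)             # agents apply u* and broadcast x_{i,t+1} (step 9)
    return x

x0 = {i: np.array([1.0, 0.0]) for i in range(3)}
print(run(x0))
```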
C. Linear dynamics

Consider the case in which the agents have linear dynamics, i.e., $F_i(x_{\mathcal{V},t}) = A_i x_{i,t}$ and $g_{ij}(x_{S_i,t}) = b_{ij}$ for all $j \in S_i$ in (1), and thus

$x_{i,t+1} = A_i x_{i,t} + \sum_{j \in S_i} b_{ij} u_{j,t}.$   (21)

Considering that $V^*(x_{i,t}) = \frac{1}{2} x_{i,t}' P_i x_{i,t}$ and $l(x_{i,t}) = x_{i,t}' Q_i x_{i,t}$, with $P_i$ and $Q_i$ positive definite matrices (LQR problem), equation (8) reduces to the algebraic Riccati equation

$P_i = A_i' P_i \left( I + \frac{1}{2} (\rho_i R_i)^{-1} b_{ii} b_{ii}' P_i \right)^{-1} A_i + I.$   (22)

Furthermore, the control law (20) becomes

$u_{i,t}^* = -K_{LQ_i} x_{i,t} - K_{LQff_i} h_{i,t},$   (23)

where $h_{i,t}$ is defined in (10), considering that $g_{ij}(x_{S_i,t}) = b_{ij}$. The constants $K_{LQ_i}$ and $K_{LQff_i}$ are defined in [3].
of each local controller. Note that, using this notation, the 1 ∂V (xi,t+1 )
Ui t (k + 1) = − R−1 g0 ,
optimal solution is considered to be u∗i,t = Ui t (N) at each 2 i ii ∂ xi,t+1 xi,t+1 = fi +gii Uit (k+1)+Wit (k)
sampling time t, for i ∈ V . (24)

where $W_i^t(k) = \sum_{j \in S_i \setminus \{i\}} g_{ij} U_j^t(k)$. Now, observe that for each sampling time $t$ the game evolves in $k$, that is, $t$ is frozen, which implies that all the terms are constant except the ones that depend on $k$. The following result provides conditions under which each game played at each sampling time $t$ converges to a fixed point.

Theorem 1. Assume that all the dynamic sub-systems in (24) are ISS with respect to the input signals $U_j^t(k)$, in the sense that there exist $\mathcal{KL}$-functions $\beta_i$ and $\mathcal{K}$-functions $\gamma_{ij}$ ($1 \leq i \leq N_a$ and, for each $i$, $j \in S_i \setminus \{i\}$) such that

$|U_i^t(k+1)| \leq \beta_i(|U_i^t(0)|, k) + \sum_{j \in S_i \setminus \{i\}} \gamma_{ij}(\|U_j^t\|),$   (25)

for all $i \in \mathcal{V}$. If the following set of small-gain conditions holds for each $r = 2, \ldots, N_a$,

$\gamma_{i_1 i_2} \circ \gamma_{i_2 i_3} \circ \cdots \circ \gamma_{i_r i_1} < \mathrm{Id},$   (26)

for all $1 \leq i_j \leq N_a$, with $i_j \neq i_{j'}$ if $j \neq j'$, and for all $x_{i_j,t}$, where $\mathrm{Id}$ is the identity function, then the interconnected system (24) is input-to-state stable (ISS) and, consequently, the game converges at each sampling time $t$.

The symbol "$\circ$" above denotes the composition of functions. For instance, for $\mathcal{K}$-functions $\gamma_1, \gamma_2$, $(\gamma_1 \circ \gamma_2)(x) = \gamma_1(\gamma_2(x))$. Note also that $\gamma_1 \circ \gamma_2 < \mathrm{Id}$ means that $(\gamma_1 \circ \gamma_2)(x) < x$ for all $x > 0$.

For the case with $N_a = 3$, the set of small-gain conditions is the following:

$\gamma_{12} \circ \gamma_{21} < \mathrm{Id}, \quad \gamma_{13} \circ \gamma_{31} < \mathrm{Id}, \quad \gamma_{23} \circ \gamma_{32} < \mathrm{Id},$
$\gamma_{12} \circ \gamma_{23} \circ \gamma_{31} < \mathrm{Id}, \quad \gamma_{13} \circ \gamma_{32} \circ \gamma_{21} < \mathrm{Id}.$   (27)

Proof. The ISS conditions in (25) imply that each system in (24) is globally stable. Following the steps in [11], it is possible to verify that, under the conditions in (26), the overall system composed of the interconnected systems in (24) is globally stable. Moreover, invoking the small-gain theorem for networks, see Theorem 1 in [12] or, for the discrete-time case, [13], the result follows.
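For given gains, the cyclic small-gain conditions (26)-(27) can be checked numerically, for instance on a grid of positive arguments. In the sketch below the gamma_ij are toy linear K-functions chosen only for illustration; in practice they would come from the ISS estimates (25) of each local game dynamics.

```python
import numpy as np

# Numerical illustration of the cyclic small-gain conditions (26)-(27) for
# N_a = 3.  The gains gamma_ij below are toy choices, not the paper's.

gamma = {
    (1, 2): lambda s: 0.6 * s, (2, 1): lambda s: 0.7 * s,
    (1, 3): lambda s: 0.5 * s, (3, 1): lambda s: 0.4 * s,
    (2, 3): lambda s: 0.3 * s, (3, 2): lambda s: 0.8 * s,
}

def cycle_ok(cycle, s_grid):
    """Check (gamma_{i1 i2} o ... o gamma_{ir i1})(s) < s on a grid of s > 0."""
    pairs = list(zip(cycle, cycle[1:] + cycle[:1]))
    def composed(s):
        for i, j in reversed(pairs):          # apply innermost gain first
            s = gamma[(i, j)](s)
        return s
    return all(composed(s) < s for s in s_grid)

s_grid = np.logspace(-3, 3, 200)
cycles = [[1, 2], [1, 3], [2, 3], [1, 2, 3], [1, 3, 2]]   # the five conditions in (27)
for c in cycles:
    print(c, "small-gain condition holds on the grid:", cycle_ok(c, s_grid))
```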

For the particular case in which $V(x_{i,t})$ is quadratic, the following result holds.

Proposition 2. Consider the multi-agent system (1). Assume that there is a robust inverse optimal local controller (9) for each agent, with $V(x_{i,t})$ given by (14), such that all the eigenvalues of $B = [B_1, \cdots, B_{N_a}]'$ have magnitude strictly smaller than 1, where $B_i$ is a row vector of dimension $N_a$ that contains the term $-\frac{1}{2} [R_i + P_{i2}]^{-1} g_{ii}' P_i g_{ij}$ for $j \in S_i \setminus \{i\}$, and zero otherwise. Then, the game dynamics (29) converges to a fixed point.

Proof. The dynamics of the game for each agent $i \in \mathcal{V}$ with $V(x_{i,t})$ defined as in (14) consists of iterating in $k$ the equation

$U_i^t(k+1) = -\frac{1}{2} [R_i + P_{i2}]^{-1} g_{ii}' P_i \left( f_i + \sum_{j \in S_i \setminus \{i\}} g_{ij} U_j^t(k) \right),$   (28)

which is a discrete-time affine dynamical system. Thus, rewriting (28) for all $i \in \mathcal{V}$ in a more compact form yields

$U^t(k+1) = B\, U^t(k) + C^t,$   (29)

$U^t(k) = \begin{bmatrix} U_1^t(k) \\ \vdots \\ U_{N_a}^t(k) \end{bmatrix}, \qquad C^t = \begin{bmatrix} C_1 \\ \vdots \\ C_{N_a} \end{bmatrix},$   (30)

where $C_i = -\frac{1}{2} [R_i + P_{i2}]^{-1} g_{ii}' P_i f_i$, and therefore the result follows.
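Proposition 2 reduces convergence of the game, in the quadratic case, to a spectral-radius condition on the block matrix B of (29). A minimal sketch of that check, with placeholder P_i, R_i and g_ij, is given below.

```python
import numpy as np

# Convergence check suggested by Proposition 2: with quadratic V_i, one game
# round is the affine map U(k+1) = B U(k) + C^t of (29); it converges to a
# fixed point when the spectral radius of B is below 1.  The numerical data
# below (P_i, R_i, g_ij) are placeholders, not the paper's.

Na, n, m = 3, 2, 1
S = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2]}

P = [np.eye(n) * 2.0 for _ in range(Na)]          # candidate P_i = P_i' > 0
R = [np.eye(m) * 0.1 for _ in range(Na)]
g = {(i, j): (np.array([[1.0], [0.5]]) if i == j else np.array([[0.2], [0.1]]))
     for i in range(Na) for j in S[i]}

def game_matrix():
    """Assemble the (Na*m x Na*m) block matrix B with blocks
    B_ij = -1/2 (R_i + P_i2)^{-1} g_ii' P_i g_ij for j in S_i (j != i), 0 otherwise."""
    B = np.zeros((Na * m, Na * m))
    for i in range(Na):
        P_i2 = 0.5 * g[(i, i)].T @ P[i] @ g[(i, i)]
        for j in S[i]:
            if j == i:
                continue
            B_ij = -0.5 * np.linalg.solve(R[i] + P_i2, g[(i, i)].T @ P[i] @ g[(i, j)])
            B[i * m:(i + 1) * m, j * m:(j + 1) * m] = B_ij
    return B

B = game_matrix()
rho = max(abs(np.linalg.eigvals(B)))
print("spectral radius of B =", rho, "-> game converges" if rho < 1 else "-> no guarantee")
```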
B. Stability

Theorem 2. Consider the multi-agent system (1). Assume that there is a robust inverse optimal local controller (9) for each agent, with a generic $V(x_{i,t})$ such that the inequality (17) holds under the condition (18). Then, the dynamics of each agent $i \in \mathcal{V}$ with the local control signal defined in (9) is ISS with respect to the perturbation $\bar{f}_i$.

Proof. Consider the Lyapunov difference for the disturbed system

$\Delta V_d(x_{i,t}, \bar{f}_i) = V_i(x_{i,t+1}) - V_i(x_{i,t}) = V_i(x_{i,t+1}) - V_{i,no}(x_{i,t+1}) + V_{i,no}(x_{i,t+1}) - V_i(x_{i,t}),$   (31)

where $V_{i,no}(x_{i,t+1})$ is the control Lyapunov function (CLF) for the nominal system, meaning the system in (1) without the disturbance $\bar{f}_i$. Define $\Lambda V_i(x_{i,t}, \bar{f}_i) = V_i(x_{i,t+1}) - V_{i,no}(x_{i,t+1})$ and $\Delta \bar{V}_i(x_{i,t}) = V_{i,no}(x_{i,t+1}) - V_i(x_{i,t})$. Then, considering $\sigma(\|x_{S_i,t}\|) = \xi_i \alpha_i(\|x_{S_i,t}\|)$, $\xi_i > 0$, the following is obtained:

$\Delta \bar{V}_i(x_{i,t}) = V(x_{i,t+1}) - V(x_{i,t}) + u_{i,t}^{*\prime} R_i u_{i,t}^* \leq -\xi_i \alpha_i(\|x_{S_i,t}\|),$   (32)

with $u_{i,t}^*$ defined in (9). Furthermore,

$|\Lambda V_i(x_{i,t}, \bar{f}_i)| = |V_i(x_{i,t+1}) - V_{i,no}(x_{i,t+1})| \leq l_{d_i} \Big\| f_i + \bar{f}_i + \sum_{j \in S_i} g_{ij} u_{j,t} - f_i - \sum_{j \in S_i} g_{ij} u_{j,t} \Big\| = l_{d_i} \|\bar{f}_i\| \leq l_{d_i} z_i + l_{d_i} \delta_i \alpha_i(\|x_{S_i,t}\|),$   (33)

where $l_{d_i}$ and $z_i$ are the positive constants defined previously. Thus, the Lyapunov difference becomes

$\Delta V_d(x_{i,t}, \bar{f}_i) = V_i(x_{i,t+1}) - V_i(x_{i,t}) \leq |\Lambda V_i(x_{i,t}, \bar{f}_i)| + \Delta \bar{V}_i(x_{i,t}) \leq l_{d_i} z_i + l_{d_i} \delta_i \alpha_i(\|x_{S_i,t}\|) - \xi_i \alpha_i(\|x_{S_i,t}\|) = -(1-\beta_i)\, \xi_i \alpha_i(\|x_{S_i,t}\|) - \beta_i \xi_i \alpha_i(\|x_{S_i,t}\|) + l_{d_i} \delta_i \alpha_i(\|x_{S_i,t}\|) + l_{d_i} z_i \leq -(1-\beta_i)\, \xi_i \alpha_i(\|x_{S_i,t}\|) + l_{d_i} \delta_i \alpha_i(\|x_{S_i,t}\|),$   (34)

where $0 < \beta_i < 1$. This result is valid for all $x_{S_i,t}$ such that condition (18) is satisfied. Simplifying the previous expression, we obtain

$\Delta V_d(x_{i,t}, \bar{f}_i) \leq -\left( \eta_i - l_{d_i} \delta_i \right) \alpha_i(\|x_{S_i,t}\|),$   (35)

which is negative definite if the inequality (19) is satisfied, with $\eta_i = (1 - \beta_i)\, \xi_i$. Then, the closed-loop system (1) with (9) is ISS [4].

Proposition 3. Consider the multi-agent system (1). Assume that there is a robust inverse optimal local controller (9) for each agent, with $V(x_{i,t})$ given by (14), such that the inequality (17) holds under the condition (18). Then, the dynamics of each agent $i \in \mathcal{V}$ with the local control signal defined in (9) is ISS with respect to the perturbation $\bar{f}_i$.

Proof. This follows as a consequence of Theorem 2, by considering the specific quadratic $V(x_{i,t})$.

V. SIMULATION RESULTS

Fig. 1. Schematic representation of three pendulums with length $L_i$ and mass $m_i$, connected by horizontal springs $S_i$ attached to the suspension points at distance $l_i$, for $i = 1, 2, 3$.

Consider the coupled pendula system presented in Figure 1, composed of three pendulums (agents) with length $L_i$ and mass $m_i$, for $i = 1, 2, 3$. The pendulums are coupled by the horizontal springs $S_i$, which are attached at a distance $l_i$ from the suspension points, for $i = 1, 2, 3$. In the figure, $\theta_i$ corresponds to the angle between pendulum $i$ and the vertical axis.

Consider the tracking problem in which the state of each pendulum, $x_{i,t}$, is given by the error between the angle $\theta_i$ and some desired reference $r_i$, and its time derivative, that is,

$x_{i,t} = \begin{bmatrix} \tilde{\theta}_i \\ \dot{\tilde{\theta}}_i \end{bmatrix} = \begin{bmatrix} \theta_i - r_i \\ \dot{\theta}_i - \dot{r}_i \end{bmatrix}, \qquad \dot{x}_{i,t} = \begin{bmatrix} \dot{\tilde{\theta}}_i \\ \ddot{\theta}_i - \ddot{r}_i \end{bmatrix}.$   (36)

Without external actuation, the coupled pendula system satisfies the following dynamics:

$\ddot{\theta}_1 = H_1 = -\frac{m_1}{I_1} g L_1 s_1 + \frac{k_1}{I_1} l_1^2 (s_2 - s_1) c_1 + \frac{k_3}{I_1} l_3^2 (s_3 - s_1) c_1$
$\ddot{\theta}_2 = H_2 = -\frac{m_2}{I_2} g L_2 s_2 - \frac{k_1}{I_2} l_1^2 (s_2 - s_1) c_2 + \frac{k_2}{I_2} l_2^2 (s_3 - s_2) c_2$   (37)
$\ddot{\theta}_3 = H_3 = -\frac{m_3}{I_3} g L_3 s_3 - \frac{k_3}{I_3} l_3^2 (s_3 - s_1) c_3 - \frac{k_2}{I_3} l_2^2 (s_3 - s_2) c_3$

where $I_i$ is the moment of inertia of pendulum $i$ around its suspension point, $k_i$ is the elastic constant of spring $S_i$, $g$ is the acceleration due to gravity, $s_i = \sin(\theta_i)$, and $c_i = \cos(\theta_i)$, for $i = 1, 2, 3$. The manipulated variables are added to (37) such that

$\ddot{\theta}_1 = H_1(\theta_1, \theta_2, \theta_3) + b_{11} u_{1,t} + b_{12} u_{2,t}$
$\ddot{\theta}_2 = H_2(\theta_1, \theta_2, \theta_3) + b_{21} u_{1,t} + b_{22} u_{2,t} + b_{23} u_{3,t}$   (38)
$\ddot{\theta}_3 = H_3(\theta_1, \theta_2, \theta_3) + b_{32} u_{2,t} + b_{33} u_{3,t}$

where $b_{ij}$, for $i, j = 1, 2, 3$, are constant parameters. Thus,

$\dot{x}_{i,t} = D_i(x_{\mathcal{V},t}) + \sum_{j \in S_i} g_{ij} u_{j,t},$   (39)

where

$D_i = \begin{bmatrix} \dot{\tilde{\theta}}_i \\ H_i - \ddot{r}_i \end{bmatrix}, \qquad g_{ij} = \begin{bmatrix} 0 \\ b_{ij} \end{bmatrix},$   (40)

with $\mathcal{V} = \{1, 2, 3\}$. With some abuse of notation (for time), the corresponding discrete-time nonlinear system is given by

$x_{i,t+1} = F_i(x_{\mathcal{V},t}) + T \sum_{j \in S_i} g_{ij} u_{j,t},$   (41)

where $F_i = x_{i,t} + T D_i$ and $T$ is the sampling time.

In this example, we set $S_1 = \{1, 2\}$, $S_2 = \{1, 2, 3\}$, and $S_3 = \{2, 3\}$; thus, there is no communication between the local controllers associated with pendulums 1 and 3, but those pendulums are directly influenced by each other through the spring $S_3$.
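For reference, the open-loop model (37)-(38) and an explicit-Euler discretization in the spirit of (41) can be coded directly from the stated parameters. The gravity constant g = 9.81 m/s^2 and the sampling time T = 0.01 s used below are assumptions; the excerpt does not fix them.

```python
import numpy as np

# Open-loop coupled-pendula model (37)-(38) and an Euler discretization as in (41).
# Parameters follow the example in the text; g and T are assumed values.

m_p = [1.0, 1.0, 1.0]; L = [1.0, 1.0, 1.0]; I = [1.0, 1.0, 1.0]
k = [0.1, 0.1, 0.4]; l = [0.5, 0.5, 0.7]; grav = 9.81
b = np.array([[2.0, 0.5, 0.0],
              [0.5, 2.0, 0.5],
              [0.0, 0.5, 2.0]])          # actuation couplings b_ij as in (38)
T = 0.01

def H(theta):
    """Unforced angular accelerations H_1, H_2, H_3 of (37)."""
    s = np.sin(theta); c = np.cos(theta)
    return np.array([
        -m_p[0]/I[0]*grav*L[0]*s[0] + k[0]/I[0]*l[0]**2*(s[1]-s[0])*c[0] + k[2]/I[0]*l[2]**2*(s[2]-s[0])*c[0],
        -m_p[1]/I[1]*grav*L[1]*s[1] - k[0]/I[1]*l[0]**2*(s[1]-s[0])*c[1] + k[1]/I[1]*l[1]**2*(s[2]-s[1])*c[1],
        -m_p[2]/I[2]*grav*L[2]*s[2] - k[2]/I[2]*l[2]**2*(s[2]-s[0])*c[2] - k[1]/I[2]*l[1]**2*(s[2]-s[1])*c[2],
    ])

def euler_step(theta, omega, u):
    """One sampling step: explicit Euler on (38), mirroring the form of (41)."""
    omega_dot = H(theta) + b @ u
    return theta + T * omega, omega + T * omega_dot

theta = np.array([0.1, 0.0, -0.1]); omega = np.zeros(3); u = np.zeros(3)
for _ in range(100):
    theta, omega = euler_step(theta, omega, u)
print(theta)
```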
A. Robust stabilization

Since the state of agent $i$ depends on the angles $\theta_j$, for $j \in S_i$, and not on the errors $x_{j,t}$, the information that each agent broadcasts to its neighbors, corresponding to step 9 of Algorithm 1, is

$y_{i,t} = [1, 0]\, x_{i,t} + r_i.$   (42)

Furthermore, since the local controllers 1 and 3 do not communicate, we have

$f_1 = -\frac{m_1}{I_1} g L_1 s_1 + \frac{k_1}{I_1} l_1^2 (s_{y_2} - s_1) c_1 + \frac{k_3}{I_1} l_3^2 (s_{r_3} - s_1) c_1 + \ddot{r}_1,$   (43)

where $s_{y_2} = \sin(y_{2,t})$ and $s_{r_3} = \sin(r_3)$, and $\bar{f}_1 = \frac{k_3}{I_1} l_3^2 (s_{y_3} - s_{r_3}) c_1$. Similarly,

$f_3 = -\frac{m_3}{I_3} g L_3 s_3 - \frac{k_3}{I_3} l_3^2 (s_3 - s_{r_1}) c_3 - \frac{k_2}{I_1} l_2^2 (s_3 - s_{y_2}) c_3 + \ddot{r}_3,$   (44)

and $\bar{f}_3 = \frac{k_3}{I_3} l_3^2 (s_{y_1} - s_{r_1}) c_3$. In this example, we consider $m_i = 1$, $L_i = 1$, $b_{ii} = 2$, $I_i = 1$, $R_i = 0.00001$, and $b_{ij} = 0.5$, with $j \in S_i$, for all $i \in \mathcal{V}$, and $k_1 = k_2 = 0.1$, $k_3 = 0.4$, $l_1 = l_2 = 0.5$, $l_3 = 0.7$. In this case, it follows that

$\|\bar{f}_1\| \leq 2 \frac{k_3}{I_1} l_3^2 + \delta_1 \|x_{1,t}\|, \qquad \|\bar{f}_3\| \leq 2 \frac{k_3}{I_3} l_3^2 + \delta_3 \|x_{3,t}\|.$   (45)

Therefore, it is possible to set $\delta_1 = \delta_3 = 0.01$, $l_{d_1} = l_{d_3} = \xi_1 = \xi_3 = 0.01$, $\beta_1 = \beta_3 = 0.9$, $z_1 = 2 \frac{k_3}{I_1} l_3^2$, and $z_3 = 2 \frac{k_3}{I_3} l_3^2$, so that the matrices $P_1$ and $P_3$ satisfy inequality (17) for all $\|x_{S_1,t}\| \geq \alpha_1^{-1}\!\left( \frac{l_{d_1} z_1}{\beta_1 \xi_1} \right)$ and all $\|x_{S_3,t}\| \geq \alpha_3^{-1}\!\left( \frac{l_{d_3} z_3}{\beta_3 \xi_3} \right)$, respectively. In Figure 2, the angles $\theta_i$ are represented, for $i = 1, 2, 3$, where $a_1 = a_2 = 0.1$, $a_3 = 0.2$, $\omega_1 = \pi$, $\omega_2 = \frac{\pi}{2}$ and $\omega_3 = \frac{3}{2}\pi$. In this situation,

$\alpha_1^{-1}\!\left( \frac{l_{d_1} z_1}{\beta_1 \xi_1} \right) = \alpha_3^{-1}\!\left( \frac{l_{d_3} z_3}{\beta_3 \xi_3} \right) = 0.4355,$   (46)

and, thus, the matrices $P_1$ and $P_3$ satisfy the inequality (17) for $\|x_{1,t}\|, \|x_{3,t}\| \geq 0.4355$. As can be seen, the pendulums oscillate according to their desired references (which are different for each pendulum) in spite of the spring and actuation interactions between them.
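As a quick sanity check of (46): with the stated parameters, and assuming (as the reported value suggests) that alpha_1 and alpha_3 are the identity, the threshold evaluates to about 0.4356, consistent with the 0.4355 reported up to rounding.

```python
# Quick arithmetic check of the ISS threshold in (46), assuming alpha_1 = Id.
k3, l3, I1 = 0.4, 0.7, 1.0
ld1, beta1, xi1 = 0.01, 0.9, 0.01
z1 = 2 * k3 / I1 * l3**2          # bound z_1 = 2 (k_3 / I_1) l_3^2 from (45)
threshold = ld1 * z1 / (beta1 * xi1)
print(z1, threshold)              # 0.392 and ~0.4356, matching 0.4355 in (46) up to rounding
```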

Fig. 2. Pendulum angles $\theta_i(t)$, for $i = 1, 2, 3$, as a function of time $t$, following the respective references $r_i(t)$. The signals $\theta_i^{com}(t)$, for $i = 1, 2, 3$, represent the pendulum angles as a function of time for the case in which the communication graph and the decentralized case commute periodically.

Figure 2 also displays the signals $\theta_i^{com}$, which represent the pendulum angles for a different scenario. In this case, we impose a periodic commutation, every 0.2 seconds, between the communication graph of the previous scenario and the completely decentralized case (no communication between the agents). Note that in the decentralized mode the controllers do not have access to the neighbors' communicated states and control signals and, thus, their design was made to be robust to the coupled springs that generate the bounded unknown disturbance terms. As can be observed, the signals $\theta_i$ and $\theta_i^{com}$ are equal at the beginning, as expected, until the time reaches 0.2 seconds, when the graph first commutes to the decentralized case. When the error between the angles and the references is approximately zero, for instance at 0.5 seconds, the disturbance terms and the manipulated variables are also very small, and thus the difference between the solutions after that time is negligible.

In order to characterize the performance of the proposed distributed topology, a comparison with the centralized and the decentralized topologies is made. In the centralized topology, the manipulated variables are computed by a central agent that has access to the dynamics of all agents and then communicates the solutions to the local controllers. In the decentralized topology, each local controller solves its local optimization problem without knowledge of the states and decisions of the neighbors. Simulating the three topologies over a longer simulation horizon, the total costs, obtained through (4), are $J_{cen} = 214.5$, $J_{decen} = 1345.4$ and $J_{dis} = 221.8$ for the centralized, decentralized and distributed topologies, respectively. Clearly, for this example, the solution given by the distributed topology is closer to the centralized one than the solution given by the decentralized topology.

VI. CONCLUSIONS

This work proposes a distributed sub-optimal controller for a class of networked systems in which the local nodes are described by a broad class of nonlinear systems. The control algorithm combines robust inverse optimal control for the design of the local control agents with a coordination algorithm based on a non-cooperative game. Under conditions that are verified by a broad class of plants, it is proven that the resulting controlled system is input-to-state stable and is robust with respect to unknown disturbances that may depend on the state. The extension of the results to time-varying communication graphs and to the stochastic case is considered for future work.

REFERENCES

[1] Q. Hui and W. M. Haddad: Distributed nonlinear control algorithms for network consensus. Automatica, vol. 44(9), pp. 2375–2381, 2008. doi: 10.1016/j.automatica.2008.01.011.
[2] J. Monteil and G. Russo: On the Design of Nonlinear Distributed Control Protocols for Platooning Systems. IEEE Control Systems Letters, vol. 1, pp. 140–145, 2017. doi: 10.1109/LCSYS.2017.2710907.
[3] J. P. Belfo, A. P. Aguiar and J. M. Lemos: Distributed LQ Control of a Water Delivery Canal Based on a Selfish Game. Proceedings of the 14th APCA International Conference on Automatic Control and Soft Computing, Portugal, vol. 695, pp. 466–476, 2020. doi: 10.1007/978-3-030-58653-9_45.
[4] F. Ornelas-Tellez, E. N. Sanchez, A. G. Loukianov and J. J. Rico: Robust inverse optimal control for discrete-time nonlinear system stabilization. European Journal of Control, vol. 20(1), pp. 38–44, 2014.
[5] A. Al-Tamimi, F. L. Lewis and M. Abu-Khalaf: Discrete-Time Nonlinear HJB Solution Using Approximate Dynamic Programming: Convergence Proof. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), vol. 38, no. 4, pp. 943–949, 2008. doi: 10.1109/TSMCB.2008.926614.
[6] D. Liberzon: Calculus of Variations and Optimal Control Theory: A Concise Introduction. Princeton University Press, 2012.
[7] F. Ornelas, A. G. Loukianov and E. N. Sanchez: Discrete-Time Robust Inverse Optimal Control for a Class of Nonlinear Systems. Proceedings of the 18th IFAC World Congress, Milano, 2011. doi: 10.3182/20110828-6-IT-1002.03386.
[8] R. Ruiz-Cruz, E. N. Sanchez, F. Ornelas-Tellez, A. G. Loukianov and R. G. Harley: Particle Swarm Optimization for Discrete-Time Inverse Optimal Control of a Doubly Fed Induction Generator. IEEE Transactions on Cybernetics, vol. 43, no. 6, pp. 1698–1709, 2013. doi: 10.1109/TSMCB.2012.2228188.
[9] O. K. Erol and I. Eksin: A new optimization method: Big Bang-Big Crunch. Advances in Engineering Software, vol. 37(2), pp. 106–111, 2006. doi: 10.1016/j.advengsoft.2005.04.005.
[10] L. Ulusoy, M. Güzelkaya and I. Eksin: Inverse optimal control approach to model predictive control for linear system models. 2017 10th International Conference on Electrical and Electronics Engineering (ELECO), Bursa, pp. 823–827, 2017.
[11] S. Dashkovskiy, B. Rüffer and F. R. Wirth: An ISS small gain theorem for general networks. Mathematics of Control, Signals, and Systems, vol. 19, pp. 93–122, 2007.
[12] Z.-P. Jiang and Y. Wang: A generalization of the nonlinear small-gain theorem for large-scale complex systems. Proceedings of the 7th World Congress on Intelligent Control and Automation, vol. 1, pp. 1188–1193, 2008.
[13] Z.-P. Jiang, Y. Lin and Y. Wang: Nonlinear Small-Gain Theorems for Discrete-Time Large-Scale Systems. Proceedings of the 27th Chinese Control Conference, pp. 704–708, 2008.
