Model-Based Control Using Koopman Operators

Ian Abraham, Gerardo De La Torre, and Todd D. Murphey


Department of Mechanical Engineering
Northwestern University, Evanston, Illinois 60208
Email: [email protected]
[email protected],
[email protected]

arXiv:1709.01568v1 [cs.RO] 5 Sep 2017

Abstract—This paper explores the application of Koopman operator theory to the control of robotic systems. The operator is introduced as a method to generate data-driven models that have utility for model-based control methods. We then motivate the use of the Koopman operator towards augmenting model-based control. Specifically, we illustrate how the operator can be used to obtain a linearizable data-driven model for an unknown dynamical process that is useful for model-based control synthesis. Simulated results show that with increasing complexity in the choice of the basis functions, a closed-loop controller is able to invert and stabilize cart- and VTOL-pendulum systems. Furthermore, the specification of the basis functions is shown to be of importance when generating a Koopman operator for specific robotic systems. Experimental results with the Sphero SPRK robot explore the utility of the Koopman operator in a reduced state representation setting where increased complexity in the basis functions improves open- and closed-loop controller performance in various terrains, including sand.

I. INTRODUCTION

Modeling of complex dynamical systems has typically been the first step when designing control, planning, or state-estimation algorithms. System design and specifications have been dependent on the use of high-fidelity models. However, any derivation of a dynamical model from first principles is typically a demanding task when the complexity of state interactions is high. Moreover, analytical models do not capture external disturbances. As a result, derived models, for use in model-based control settings, often have limited use or poor prediction over longer time spans. Nevertheless, a representation of the behavior of a dynamical system is central to most model-based engineering and scientific applications.

Within the field of systems and control theory, model uncertainty has typically been mitigated with the use of robust and adaptive control architectures. Typically, adaptive controllers are self tuning and reactive to incoming state information while robust controllers are designed to be invariant to model uncertainty [1]–[4]. Motion planning for uncertain dynamical systems has also been extensively investigated. Generally, in this approach, uncertainty is explicitly modeled and incorporated into the decision making process [5]–[7]. However, like robust and adaptive control approaches, the need for an explicit uncertainty model often limits its utility in general settings.

Machine learning offers a much more general approach [8]–[10]. In particular, recent advances have utilized large sets of data to perform model-based control of various dynamical systems [11]. Nonetheless, several questions about the training data, stability, convergence properties, computational complexity, and mechanical property conservation of the models remain open.

Recently, the use of data-driven techniques to mitigate the effects of model uncertainty has sparked interest in the Koopman operator [12]. The Koopman operator is an infinite-dimensional linear operator that is able to exactly capture the behavior of nonlinear dynamical systems. In application, the Koopman operator is approximated with a finite-dimensional linear operator [13]. This approximation can be computed in a solely data-driven manner without any prior information of the dynamical system. Complex fluid flow systems have been accurately modeled using this approach [14]. Furthermore, it has been shown that the spectral properties of the approximate Koopman operator can be examined to investigate system-level behavior like ergodicity and stability [12], [15], [16]. In addition, recent work has shown its utility in human-machine systems [17]. In this paper, we investigate the utility of Koopman operator theory for control in robotic systems.

The work is motivated by the desire to generate or augment dynamical models of robotic systems through data collection. In particular, it is of interest to synthesize model-based controllers using these data-driven models. Thus, the main contribution of this paper is the application of Koopman operator theory to the control of robotic systems. The Koopman operator is shown to yield a linearizable data-driven model of the dynamical system that is amenable to model-based control methods. Closed-loop and open-loop controllers are then formulated using the proposed data-driven model. Furthermore, we explore the consequences of the specific choice of basis function as well as complexity order for swing-up control of simulated cart- and vertical take-off and landing (VTOL)-pendulum systems. Last, experiments applying the Koopman operator to a Sphero SPRK robot are shown. We conclude the paper with recommendations for future work.

The organization of this paper is as follows. Section II gives an overview of Koopman operator theory and its application to data-driven approximations of dynamical systems. Section III formulates open- and closed-loop controllers using the Koopman operator, Section IV describes the experimental setup with the Sphero SPRK robot, and Section V presents the simulation and experimental results. Conclusions are in Section VI.

This work was supported by Army Research Office grant W911NF-14-1-0461.
II. KOOPMAN OPERATOR

An overview of Koopman operator theory is given in this section. For the purposes of this paper, we focus more on the practical implementation of the theory and omit much of the theoretical presentation. However, the interested reader can find a complete treatment of the Koopman operator in [13].

To begin, consider a discrete-time dynamical system evolving as

$$x_{k+1} = F(x_k), \tag{1}$$

where $x_k \in \mathcal{M}$ is the (possibly unobserved) state of the system. Furthermore, define an observation function

$$y_k = g(x_k), \tag{2}$$

where $y_k \in \mathbb{C}$, $g \in \mathcal{G} : \mathcal{M} \to \mathbb{C}$, and $\mathcal{G}$ is a function space. For the purposes of this paper, we assume that $\mathcal{G}$ is the $L^2$ space. The Koopman operator, $\mathcal{K} : \mathcal{G} \to \mathcal{G}$, is defined as

$$[\mathcal{K} g](x) = g(F(x)). \tag{3}$$

Note that the Koopman operator maps elements in $\mathcal{G}$ to elements in $\mathcal{G}$. Therefore, unlike $F$, it does not map system states to system states. Furthermore, note that (3) can be written as

$$[\mathcal{K} g](x_k) = g(F(x_k)) = g(x_{k+1}). \tag{4}$$

Therefore, the Koopman operator propagates the output of the system forward. Finally, the observable equation can be easily extended to the case where multiple observations are available, $g : \mathcal{M} \to \mathbb{C}^K$.

The Koopman operator defined in (3) is linear when $\mathcal{G}$ is a vector space. This property holds even if the considered discrete-time dynamical system is nonlinear. However, since the Koopman operator maps $\mathcal{G}$ to elements in $\mathcal{G}$, it is infinite dimensional. Therefore, a nonlinear dynamical system given by (1) can be equivalently described by a linear infinite dimensional operator. From a practical standpoint, there is not much benefit from this infinite dimensional representation even if the operator could be defined for a specific system of interest. However, the Koopman operator can be approximated with a linear finite dimensional operator using data-driven approaches.

A. Approximating a Koopman Operator

In order to define an approximate Koopman operator, the observation function (2) is redefined as

$$y_k = g(x_k) = \Psi(x_k), \tag{5}$$

where $\Psi(x)$ is a user-defined vector-valued function

$$\Psi(x) = [\psi_1(x), \psi_2(x), \ldots, \psi_N(x)]. \tag{6}$$

Next, the relation described by (4) is now given as

$$\Psi(x_{k+1}) = \Psi(x_k) K + r(x_k), \tag{7}$$

where $K \in \mathbb{C}^{N \times N}$ and $r(x_k)$ is a residual (approximation error). Note that the matrix $K$ advances $\Psi$ forward one time step. Next, it is assumed that the trajectory of the system has been collected such that

$$X = [x_1, \ldots, x_P], \tag{8}$$

where $P$ is the number of recorded data points.

The matrix $K$ can be computed in a number of ways. In this paper, we adopt the least-squares approach described in [18], where $K$ is determined by minimizing

$$J = \frac{1}{2} \sum_{p=1}^{P-1} |r(x_p)|^2 \tag{9}$$

$$= \frac{1}{2} \sum_{p=1}^{P-1} |\Psi(x_{p+1}) - \Psi(x_p) K|^2. \tag{10}$$

Solving the least-squares problem yields

$$K = G^{\dagger} A, \tag{11}$$

where $\dagger$ denotes the Moore–Penrose pseudoinverse and

$$G = \frac{1}{P} \sum_{p=1}^{P-1} \Psi(x_p)^T \Psi(x_p), \tag{12}$$

$$A = \frac{1}{P} \sum_{p=1}^{P-1} \Psi(x_p)^T \Psi(x_{p+1}). \tag{13}$$

Note that the computational burden of this approach grows as the dimension of $\Psi$ increases. In return, the approach generally yields a better approximation as the dimension of $\Psi$ increases. Furthermore, the number of data points and their distribution across the state space will have a large effect on the computed $K$ matrix.

The definitions in (8)–(13) can be generalized. The recorded data points need not come from a single trajectory nor be sequential [18]. Multiple trajectories and trajectories with missing data points can be used. The only requirement is that the sum of residuals given in (9) be defined by consecutive states $(x_k, x_{k+1})$ spaced equally in time. Even this could be avoided by choosing another optimization to solve for $K$.

B. Approximating Dynamical Systems

For predicting dynamical systems, the approximation to the Koopman operator can be used to generate a data-driven model of a system by defining $\Psi$ as

$$\Psi(x) = [x^T, \psi_{n+1}(x), \ldots, \psi_N(x)]. \tag{14}$$

Note that the state of the system, $x \in \mathbb{R}^n$, is now included in $\Psi(x)$. Thus we can write the approximate dynamical equations of the considered system as

$$x_{k+1} \approx \hat{K}^T \Psi(x_k)^T, \tag{15}$$

where $\hat{K}^T \in \mathbb{R}^{n \times N}$ is the transpose of the first $n$ columns of $K$. Note that equation (15) simply propagates forward the quantities of interest (e.g. system states). Furthermore, in this work, $x_{k+1}$ is described as a linear combination of the system state, $x_k$, and the functions $\psi_i(x_k)$.
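The least-squares construction in (9)–(13) and the state prediction in (15) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`psi`, `fit_koopman`, `predict_next`) and the small basis (state, constant, squared state) are hypothetical choices made only for this example.

```python
import numpy as np

def psi(x):
    """Hypothetical basis Psi(x) = [x, 1, x^2] as a flat row vector.
    The state comes first, as in (14); the remaining terms are an assumption."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    return np.concatenate([x, [1.0], x**2])

def fit_koopman(X):
    """Approximate K from states X of shape (P, n), assumed to be
    consecutive and equally spaced in time, following (11)-(13)."""
    Psi = np.array([psi(x) for x in X])        # shape (P, N)
    Psi_now, Psi_next = Psi[:-1], Psi[1:]      # pairs (x_p, x_{p+1})
    P = len(X)
    G = Psi_now.T @ Psi_now / P                # (12)
    A = Psi_now.T @ Psi_next / P               # (13)
    return np.linalg.pinv(G) @ A               # (11): K = G^dagger A

def predict_next(K, x, n):
    """One-step prediction (15): K_hat holds the first n columns of K."""
    K_hat = K[:, :n]
    return K_hat.T @ psi(x)

# Toy system (an assumption for illustration): x_{k+1} = 0.9 x_k + 0.1 x_k^2.
xs = [0.5]
for _ in range(49):
    xs.append(0.9 * xs[-1] + 0.1 * xs[-1] ** 2)
X = np.array(xs).reshape(-1, 1)
K = fit_koopman(X)
```

Because the toy dynamics lie exactly in the span of the chosen basis, the first column of `K` recovers the true coefficients; for a real system the residual in (7) is generally nonzero and the fit depends on how well the data cover the state space.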
III. CONTROL SYNTHESIS: OPEN- AND CLOSED-LOOP CONTROLLERS

In this section we formulate open- and closed-loop model-based controllers using the Koopman operator. It is first shown that for a differentiable choice of basis function $\Psi$, the Koopman operator has a linearization that can be computed for model-based control methods. Given the linearizable Koopman operator, a model-based optimal control problem is formulated for open- and closed-loop controllers.

A. Koopman Operator Linearization

By choosing a $\Psi$ that is differentiable, the Koopman operator approximation to the dynamical system can be linearized:

$$x_{k+1} \approx \hat{K}^T \frac{\partial \Psi}{\partial x} x_k \tag{16}$$

$$\approx A(x_k) x_k. \tag{17}$$

Control inputs are readily incorporated into the definition of $\Psi$ as an augmented state,

$$\Psi(x, u) = [x^T, u^T, \psi_1(x, u), \psi_2(x, u), \ldots, \psi_N(x, u)]. \tag{18}$$

This yields the approximate dynamical equations,

$$x_{k+1} \approx \hat{K}^T \Psi(x_k, u_k)^T, \tag{19}$$

and the linearization of the approximate dynamical equations,

$$x_{k+1} \approx \hat{K}^T \frac{\partial \Psi}{\partial x} x_k + \hat{K}^T \frac{\partial \Psi}{\partial u} u_k \tag{20}$$

$$\approx A(x_k, u_k) x_k + B(x_k, u_k) u_k. \tag{21}$$

Note that linearizable equations of motion of a dynamical system can be computed solely from data.

B. Optimal Control Problem

Control synthesis for trajectory optimization is generated for mobile robot dynamics of the form

$$x_{k+1} = f(x_k, u_k), \tag{22}$$

where $x \in \mathbb{R}^n$ is the state and $u \in \mathbb{R}^m$ is the control input. For a discrete system, we can solve for a trajectory that minimizes the objective defined as

$$J = \sum_{k=0}^{N} \frac{1}{2} (x_k - \tilde{x}_k)^T P (x_k - \tilde{x}_k) + \frac{1}{2} u_k^T R u_k, \tag{23}$$

where $P \in \mathbb{R}^{n \times n}$ and $R \in \mathbb{R}^{m \times m}$ are positive definite weight matrices on state and control and $\tilde{x}_k$ is the reference trajectory at time $k$. Note that the accuracy of the system model (22) will largely determine the effectiveness of the synthesized optimal control.

Open-Loop: Open-loop trajectory optimization precomputes the set of trajectory and control actions that minimize the objective function (23) subject to the modeled dynamical constraints in (22). Projection-based optimization [19] is used in discrete time to generate the set of trajectory and control actions given an initial trajectory $x_k$ and control $u_k$ for $k \in [0, N]$. In the experiment, the projection-based optimization algorithm first generates the control actions based on the dynamical model, and then the command signals are sent at a fixed rate via Bluetooth communication to the robot. Odometry data is collected only for post-processing and is not used to update the command signals.

Closed-Loop: In the simulated and the experimental work, a discrete-time version of Sequential Action Control (SAC) [20] is used with the Koopman operator to generate closed-loop optimal control calculations. However, any MPC technique can be used with the Koopman operator. Here, SAC operates by first forward simulating an open-loop trajectory for some horizon $N$ for a control-affine dynamical system given by

$$x_{k+1} = f(x_k, u_k) = g(x_k) + h(x_k) u_k. \tag{24}$$

The sensitivity of the objective function to a control injection at any given discrete time is given as

$$\frac{dJ}{d\lambda_k} = \rho_k (f_2(k) - f_1(k)), \tag{25}$$

where

$$f_1(k) = f(x_k, u_{0,k}), \tag{26}$$

$$f_2(k) = f(x_k, u_k^*) \tag{27}$$

are the dynamics subject to the default control $u_{0,k}$ and derived control $u_k^*$. The co-state variable $\rho_k \in \mathbb{R}^n$ is computed by backwards simulating the following discrete equation

$$\rho_{k-1} = \frac{\partial l_k}{\partial x} + \frac{\partial f_k}{\partial x}^T \rho_k, \tag{28}$$

where $l_k = \frac{1}{2} (x_k - \tilde{x}_k)^T P (x_k - \tilde{x}_k) + \frac{1}{2} u_k^T R u_k$ and $f_k = f(x_k, u_{0,k})$ for some default $u_{0,k}$, subject to $\rho_N = \vec{0}$. The optimal control $u_k^*$ is computed by first defining a secondary objective function as

$$J_u = \sum_{k=0}^{N} \frac{1}{2} \left( \frac{dJ}{d\lambda_k} - \alpha_d \right)^2 + \frac{1}{2} \| u_k^* - u_{0,k} \|_R^2. \tag{29}$$

The objective (29) is convex in $u_k^*$ and has a minimizer when

$$u_k^* = (\Lambda + R^T)^{-1} h(x_k)^T \rho_k \alpha_d + u_{0,k}, \tag{30}$$

where $\Lambda = h(x_k)^T \rho_k \rho_k^T h(x_k)$. Given the sequence of actions $u_k^*$, it is then possible to calculate the time of control application $t_k^*$ as

$$t_k^* = \operatorname{argmin} \frac{dJ}{d\lambda_k}. \tag{31}$$

The control duration in discrete time is found using an outward line search [21] for a sufficient descent on the cost.
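As a rough illustration of the linearization in (20)–(21), the sketch below forms $A = \hat{K}^T \partial\Psi/\partial x$ and $B = \hat{K}^T \partial\Psi/\partial u$ for a given $\hat{K}$. Everything specific here is an assumption made for the example: the basis `psi`, the helper names, and the use of central finite differences for the Jacobians (an analytic or automatic derivative of $\Psi$ would serve equally well).

```python
import numpy as np

def psi(x, u):
    """Hypothetical differentiable basis following the layout of (18):
    [x, u, 1, sin(x_0), x_1 * u_0]."""
    return np.concatenate([x, u, [1.0], [np.sin(x[0])], [x[1] * u[0]]])

def jacobian(fun, z, eps=1e-6):
    """Central finite-difference Jacobian of fun at z (illustrative choice)."""
    z = np.asarray(z, dtype=float)
    f0 = fun(z)
    J = np.zeros((f0.size, z.size))
    for i in range(z.size):
        dz = np.zeros_like(z)
        dz[i] = eps
        J[:, i] = (fun(z + dz) - fun(z - dz)) / (2.0 * eps)
    return J

def linearize(K_hat, x, u):
    """Return A = K_hat^T dPsi/dx and B = K_hat^T dPsi/du at (x, u).
    K_hat has shape (N, n): the first n columns of the Koopman matrix."""
    dPsi_dx = jacobian(lambda z: psi(z, u), x)   # shape (N, n)
    dPsi_du = jacobian(lambda z: psi(x, z), u)   # shape (N, m)
    return K_hat.T @ dPsi_dx, K_hat.T @ dPsi_du

# Example with an arbitrary (not fitted) K_hat, n = 2 states and m = 1 input.
K_hat = np.arange(12, dtype=float).reshape(6, 2)
x0 = np.array([0.3, -0.2])
u0 = np.array([0.5])
A, B = linearize(K_hat, x0, u0)
```

In practice `K_hat` would come from a least-squares fit on recorded state and input data rather than from a fixed array as above.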
IV. EXPERIMENTS USING SPHERO SPRK

In this section, we describe the experimental set-up for use of the Sphero SPRK robot with model-based control algorithms that utilize a state-space model generated via the Koopman operator. In particular, we define data-driven closed- and open-loop model predictive controllers as well as motivate and explore the utility of the Koopman operator for control of a robotic system.

In the experiments with the SPRK, trajectory optimization is run both in open-loop form and closed-loop feedback form. Here, the tracked states of the robot are position $x, y$ and velocity $\dot{x}, \dot{y}$, and the inputs to the robot are desired velocities $u_1, u_2$. The objective function parameters are defined as $P = \mathrm{diag}([60, 60, 0.1, 0.1])$ and $R = \mathrm{diag}([20, 20])$ and are held constant through both open-loop and closed-loop experiments. An additional set of experiments is done to show the use of the Koopman operator for control in a sand environment.

A. SPRK

The SPRK is a differential drive mobile robot enclosed in a spherical case. The dynamics of the SPRK are driven by the nonlinear coupling between the internal mechanism and the outer spherical encasing. In addition, proprietary underlying controllers govern how the command velocities are interpreted by the low-level motors. The proprietary embedded software uses the on-board gyro-accelerometers to balance the robot upright while rolling. The caster wheels on top of the internal mechanism ensure constant contact of the lower wheels that are driven via two motors. The embedded software interfaces with heading and velocity (or $x$-$y$ velocity) command inputs sent via Bluetooth communication. A high fidelity model of the robot would include several internal states that characterize the internal mechanism and controller. However, rather than seeking to approximate a high dimensional model, a reduced state model was sought.

Figure 1 shows a closer look at the SPRK robot. Odometry is collected using an Xbox Kinect with OpenCV [23] image processing. More details about odometry and motion capture are stated in the caption of Fig. 1.

Fig. 1. Sphero SPRK Robot is shown with its clear spherical casing revealing the underlying mechanism. The internal mechanism shifts the center of mass by rolling and rotating within the spherical enclosure, causing the SPRK to roll. RGB LEDs on the top of the SPRK are utilized to track the odometry of the robot through an Xbox Kinect with OpenCV and OpenKinect libraries for image processing and motion capture. ROS [22] is used to transmit and collect data at 20 Hz.

B. SPRK Koopman Operator

The representation of the system consists of the position of the robot $(x, y)$, its velocity $(\dot{x}, \dot{y})$, and the commanded velocity $(u_x, u_y)$. Odometry data from the Kinect paired with recorded velocity commands are used to generate the approximate Koopman operator. The vector-valued functions used in this experiment are polynomial basis functions given as

$$\Psi(x) = [x, y, \dot{x}, \dot{y}, u_x, u_y, 1, \psi_1, \psi_2, \ldots, \psi_M], \tag{32}$$

$$\psi_i(x) = \dot{x}^{\alpha_i} \dot{y}^{\beta_i}, \tag{33}$$

where $\alpha_i$ and $\beta_i$ are nonnegative integers, index $i$ tabulates all the combinations such that $\alpha_i + \beta_i \leq Q$, and $Q \geq 1$ defines the largest allowed polynomial degree. We ignore higher order position dependence in the operator in order to prevent any possible overfitting of position-based external disturbances. The approximated Koopman operator was computed using data captured when the robot was operating at velocities under 1 m/s for the open-loop trials.

V. RESULTS

A. Simulation: Mechanical Energy

In this section, the equations of motion of a double pendulum are approximated with the method described in Section II. Each pendulum has a mass of 1 kilogram and a length of 1 meter. The masses of the pendulums are assumed to be concentrated at their ends. The system is conservative and subject to a gravitational field (9.81 m/s²).

The state of the system, $x$, is described by the relative angles of the pendulums with respect to the vertical ($\theta_1$ and $\theta_2$) and their time derivatives ($\dot{\theta}_1$ and $\dot{\theta}_2$). Data was collected by simulating the system multiple times with random initial conditions given by

$$x_0 = [U(-1, 1) l_{\theta_1}, U(-1, 1) l_{\theta_2}, U(-1, 1) l_{\dot{\theta}_1}, U(-1, 1) l_{\dot{\theta}_2}],$$

where $U(-1, 1)$ is a uniformly distributed random variable with range $-1$ to $1$. Furthermore, $l_{\theta_1} = l_{\theta_2} = \frac{\pi}{3}$ and $l_{\dot{\theta}_1} = l_{\dot{\theta}_2} = 0.5$. Therefore, the initial condition is uniformly distributed around the origin (and the stable equilibrium) and its range is defined by $L = [l_{\theta_1}, l_{\theta_2}, l_{\dot{\theta}_1}, l_{\dot{\theta}_2}]$. Any data point that falls outside of the range defined by $L$ was not used to approximate the Koopman operator. Data collection occurred at 100 Hz and was stopped when 2,000 data points were collected.

The vector-valued functions used in this numerical experiment are polynomial basis functions given as

$$\Psi(x) = [\theta_1, \theta_2, \dot{\theta}_1, \dot{\theta}_2, 1, \psi_1, \psi_2, \ldots, \psi_M], \tag{34}$$

$$\psi_i(x) = (\theta_1 / l_{\theta_1})^{\alpha_i} (\theta_2 / l_{\theta_2})^{\beta_i} (\dot{\theta}_1 / l_{\dot{\theta}_1})^{\gamma_i} (\dot{\theta}_2 / l_{\dot{\theta}_2})^{\delta_i}, \tag{35}$$
where $\alpha_i$, $\beta_i$, $\gamma_i$, and $\delta_i$ are nonnegative integers, index $i$ tabulates all the combinations such that $\alpha_i + \beta_i + \gamma_i + \delta_i \leq Q$, and $Q \geq 1$ defines the largest allowed polynomial degree. Note that $-1 \leq \psi_i \leq 1$ when the state of the system is within the defined range. The polynomial basis functions were scaled by the maximum expected value of the state to prevent numerical instability when higher order polynomials were utilized.

Figure 2 shows a simulated trajectory and the corresponding predicted trajectories when approximated Koopman operators were used to propagate the system's configuration. As expected, the accuracy of the predicted trajectories improves when $Q$ is increased. Figure 2 also shows how the accuracy of the predicted trajectories depends on the initial conditions. The prediction error of a trajectory is computed as

$$\frac{1}{N} \sum_{i}^{N} (x_{sim,i} - x_{K,i})^2, \tag{36}$$

where $x_{sim}$ is the simulated trajectory, $x_K$ is the system's trajectory predicted by the approximated Koopman operator, and $N$ is the total run-time of the simulation. The prediction error tended to increase with total mechanical energy. Recall that the dynamics of a double pendulum are described by transcendental functions. Therefore, any approximation of these dynamics by polynomials will deteriorate as the relative angle increases in magnitude. However, when the relative angles are small (total mechanical energy is small) a polynomial approximation is accurate. As expected, selection of $\Psi$ plays a critical role in determining the quality of the computed Koopman operator.

[Figure 2: panels for Q = 1, Q = 2, and Q = 3 show simulated and predicted trajectories versus time (sec); a fourth panel shows prediction error versus total mechanical energy (Joule).]

Fig. 2. Simulated trajectories when the approximate Koopman operator was used to propagate the system's configuration. As the complexity of Ψ increases, so does the accuracy in prediction. 100 trials with uniformly random initial conditions were conducted to investigate the relationship between accuracy and total mechanical energy. The prediction error tended to increase with total mechanical energy.

B. Simulation: Inversion and Stabilization of Pendulum Systems

In this section, we describe the results of utilizing the Koopman operator for inverting a cart-pendulum system and a VTOL-pendulum system. In particular, this section examines the effect that the choice of basis functions has on systems that have components in SO(n) for n > 1.

For the cart-pendulum system, the Koopman states are given as

$$\Psi(x) = [\theta, x, \dot{\theta}, \dot{x}, u, 1, \psi_1, \psi_2, \ldots, \psi_M], \tag{37}$$

where we use

$$\psi_i(x) = \theta^{\alpha_i} x^{\beta_i} \dot{\theta}^{\gamma_i} \dot{x}^{\delta_i} u \tag{38}$$

as the polynomial basis function set and compare with a Fourier basis function,

$$\psi_i(x) = \prod_{[x]_i} \prod_{\kappa_j} \cos([x]_i \kappa_j) \sin([x]_i \kappa_j) u, \tag{39}$$

where $[x]_i$ is the $i$th state of the system and $\kappa_j$ is the $j$th basis order such that $\sum_j \kappa_j \leq Q$.

In this simulation, a nominal model given by

$$x_{k+1} = x_k + \begin{bmatrix} \dot{\theta}_k \\ \dot{x}_k \\ u \\ u \end{bmatrix} \delta t \tag{40}$$

is utilized as an initial guess for the controller in order to bootstrap the data-driven process. Figure 3 presents the use of increasing complexity orders of a polynomial and Fourier basis function for the cart-pendulum system. Both test cases begin with the same initial condition and the same nominal model. At intervals of 20 s, a Koopman operator is computed with either the polynomial or Fourier basis functions using the initial 20 s of data collected. Due to the existence of the pendulum on SO(1), the Fourier basis function immediately generates a Koopman operator model that allows the controller to balance and stabilize the pendulum. Moreover, the use of the Fourier basis illustrates that increasing the complexity of the operator basis set is not always guaranteed to return an improved data-driven model. In particular, when Q = 2, the Koopman operator matches the system model identically. As a result, any further additions in complexity using the Fourier basis for this system are not beneficial (this is not always the case if the system has higher order dependencies). In contrast, the polynomial basis function does show improvement as complexity is increased. Although it would require an infinite set of polynomials to approximate a cosine or sine function, the controller using this operator model provides the desired energy pumping cart motion that is commonly witnessed in inverting a pendulum.

Simulated examples are further investigated with the use of a vertical take-off and landing (VTOL) pendulum system
[24]. For this example, the problem of inverting the pendulum attached to a VTOL is slightly modified. Specifically, it is assumed that a well known model of the VTOL exists, but the interaction between the VTOL and the pendulum remains unknown. Thus, the goal of this simulated example is to generate a Koopman operator that describes the interaction of the VTOL on the pendulum.

[Figure 3: snapshots of the cart-pendulum over time (s) for the nominal model, Q = 1, and Q = 2, with Fourier and polynomial bases; cart position x (m) versus time (s), with attempted swing-ups marked.]

Fig. 3. The progressive improvement in control as the basis order of complexity Q of the Koopman operator increases is shown. Each pendulum configuration is taken as a snapshot in time. Koopman operators with complexity Q are trained on the first 20 seconds of data collected with the nominal model. Note that because of the SO(1) configuration of the pendulum, a Fourier basis of complexity Q = 1 is sufficient to invert and stabilize the cart-pendulum. Adding a higher complexity Q = 2 does not provide a different Koopman matrix (this does not necessarily hold true for non-simulated systems). It is interesting to note that as the complexity of the polynomial basis increases, so does the number of attempts at swinging up the cart-pendulum. Link to multimedia provided: https://vimeo.com/219458009 .

In this example, the Koopman operator is redefined as an augmentation to a dynamical system

$$x_{k+1} = f(x_k, u_k) + \tilde{K}^T \Psi(x_k, u_k)^T. \tag{41}$$

By subtracting the current nominal model of the system $f(x_k, u_k)$ from both sides of equation (41) and treating $x_{k+1}$ as the measurement of state, we can define the following as a nonlinear process that can be used to generate a Koopman operator:

$$\Psi(x_{k+1}) = x_{k+1} - f(x_k, u_k) = \tilde{K}^T \Psi(x_k, u_k)^T. \tag{42}$$

Given the previous cart-pendulum result, we see that the interaction between the VTOL and the pendulum can be captured solely via a vast set of basis functions across the state of the VTOL-pendulum system. In Fig. 4, the VTOL is shown attempting to invert and balance the attached pendulum with the use of the Koopman operator. Each sequential Koopman operator with increasing complexity is generated from the first 20 seconds worth of data. Originating from the nominal model, it can be seen that the swinging behavior captures a portion of the energy pumping maneuvers required to invert the pendulum. As the Koopman basis order increases, so does the refinement in control authority. When Q = 2 for the polynomial basis, it can be seen that swing up attempts are more successful. Once the Koopman operator generated from the Fourier basis functions is used, the controller generates the appropriate control strategy to swing up and invert the pendulum.

In the following section, our discussion on the use of the Koopman operator is extended to control of a Sphero SPRK robot in a reduced state setting.

C. SPRK Experiments

1) Open-Loop Trajectory Optimization: Figure 5 shows trajectories generated using the open-loop controller with varying Q. The reference trajectory is given as

$$\begin{bmatrix} \tilde{x} \\ \tilde{y} \\ \dot{\tilde{x}} \\ \dot{\tilde{y}} \end{bmatrix} = \begin{bmatrix} r \cos(vt) \\ r \sin(2vt) \\ -rv \sin(vt) \\ 2rv \cos(2vt) \end{bmatrix}, \tag{43}$$

where $r = 0.5$ and $v = 1.3$. The reference trajectory was made sufficiently aggressive to excite the system's internal nonlinearities.

As expected, the system improves in performance when tracking the reference trajectory with increasing Q. In particular, as Q goes from 1 to 2, less drift in the resulting open-loop trajectory is visually noted at the end of the path. As Q is further increased, more complexity is added to the description of the SPRK via the Koopman operator, which in turn reduces drift and improves the tracking performance. Furthermore, the standard deviation of tracking error across trials is shown to reduce as Q is increased. This implies consistency in the behavior of the robot subject to the controller. Therefore, it can be concluded that the approximated Koopman operator is better able to represent the dynamics of the system by increasing the complexity of Ψ.
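For concreteness, the reference trajectory in (43) can be sampled as in the sketch below. The function name `reference`, the 10-second duration, and the 20 Hz sampling rate are assumptions made for illustration (the 20 Hz figure mirrors the data rate quoted in the Fig. 1 caption, but the paper does not state that the reference is discretized this way).

```python
import numpy as np

def reference(t, r=0.5, v=1.3):
    """Reference state [x, y, x_dot, y_dot] from (43) at time(s) t.
    Defaults r = 0.5 and v = 1.3 are the open-loop values given in the text."""
    t = np.asarray(t, dtype=float)
    return np.stack(
        [
            r * np.cos(v * t),              # x position
            r * np.sin(2.0 * v * t),        # y position
            -r * v * np.sin(v * t),         # x velocity (d/dt of x position)
            2.0 * r * v * np.cos(2.0 * v * t),  # y velocity (d/dt of y position)
        ],
        axis=-1,
    )

# Sample 10 s of the reference at an assumed 20 Hz rate.
dt = 1.0 / 20.0
t_grid = np.arange(0.0, 10.0, dt)
x_ref = reference(t_grid)   # shape (200, 4)
```

The velocity rows are the exact time derivatives of the position rows, so a model-based controller tracking `x_ref` is asked for a dynamically consistent position and velocity pair at every step.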
Nominal Model Polynomial (Q=1) Polynomial (Q=2) Fourier (Q=1)

VTOL Time snapshot

Pendulum end
point
time

Fig. 4. Each Koopman operator is trained on the residual modeling error of 20 seconds attempted pendulum inversion using the nominal model. As the order
of the polynomial basis increases from 1 → 2, the number of swing up attempts also increases. Notably, a first order Fourier basis captures the necessary
features that allow the controller to invert and stabilize the pendulum. Link to multimedia provided: https://vimeo.com/219458009 .

A) Open-Loop Trials 3 x Standard Deviation


Mean Trajectory
Q=1 Q=2 Q=3 Q=4 Target Trajectory

B) Tracking Error Across Trials


x (m)

Standard Deviation

Integrated Error
y (m)

Time (s) Basis Function Order

Fig. 5. Here we show reference tracking using open-loop trajectory optimization. The reference trajectory was made sufficiently aggressive to excite the
system’s internal nonlinearities that cannot be captured completely by the minimal state representation. Respective integrated tracking errors are shown to
decrease with an increase in Q. This suggests that the approximate Koopman operator better represents the dynamics of the system with increasing complexity
of Ψ.

2) Closed-Loop Trajectory Tracking: Figure 6 shows the Koopman operator did not have a sparse enough data set that
experimental results for trajectory tracking on a tarp and sand spans the higher order terms in the operator. This can be fixed
terrain using closed-loop model-based controllers with the by collecting more data that spans the robot’s operating region.
Koopman operator. The optimal control signal was updated at Here, the nonlinear dynamics driven by the internal mech-
20 Hz and the reference trajectory was given by equation (43) anism become more apparent as the order of the operator is
where r is split into two components, rx = 0.7 and ry = 0.4, increased. In particular, equation (6) provides some insight
with v = 0.9. The nominal linear model is given by into the output of the data-driven model of the Koopman
operator for the update equation of the SPRK’s velocity subject
xk+1 = Axk + Buk , (44) to control inputs. Because the effect of the internal mecha-
nism’s configuration (typically described on SO(3)) cannot
where A and B are defined as a fully controllable double
be linearly approximated, the Koopman operator begins to
integrator system.
approximate a Taylor expansion (6). Therefore, the Koopman
The effectiveness of the closed-loop controller is bench-
operator captures the inherent nonlinearities that are utilized
marked by comparing the model generated from the Koopman
by the model-based controller with respect to the terrain.
operator to that of a simulated example of the controller
However, achieving a representation that performs consistently
knowing the true system model (Fig. 6). Using only the first
across all operating terrains seems infeasible with such limited
20 seconds worth of data from the nominal model controller,
information, without extra structure on the Koopman operator,
we can see in Fig. 6 A) that as the operator increases in com-
such as global Lie group structure or mechanical properties
plexity, so shows the performance of the controller relative to
(e.g. symmetries). VI. C ONCLUSION
the benchmark test. Specifically, Fig. 6 B) shows the tracking
error for experimental trials with increasing complexity of We present Koopman operator theory and focus on the
the Koopman operator. Notably, when Q = 3 in sand, the practical implementation of the theory for model-based con-
Target Trajectory
A) Experimental Results with Sphero SPRK Tarp Results
Sand Results B) Tracking Error Across
1.0 Simulated Baseline
Nominal Q=1 Q=2 Q=3 Operator Orders
Model
x (m)

250 Baseline Performance

Integrated Error
-1.0

1.0
y (m)

125
0 1 2 3
-1.0 Basis Order
0 20 40 60 80
Time (s)

Fig. 6. Here, we show closed-loop model-based control using sequentially increasing basis complexity Q in the Koopman operator. Two examples using the SPRK robot are run on a tarp and on sand. A baseline simulated example is provided to show the best-case performance of the controller subject to the nominal model used. As the complexity of the operator's basis functions is increased, the tracking performance improves. Note that in B), the 3rd order operator used in sand (shown as the dashed red line) did not have a sparse enough set of data to provide a stable model, although it performed better than the nominal model. Link to multimedia provided: https://vimeo.com/219458009 .

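\[
\begin{bmatrix} \dot{x}_{k+1} \\ \dot{y}_{k+1} \end{bmatrix}
=
\begin{bmatrix}
0.08 x_k - 0.35 y_k + 0.76 \dot{x}_k + 0.21 \dot{y}_k + 1.06 u_1 - 0.17 u_2 - 0.05 \dot{x}_k^2 - 0.19 \dot{y}_k^2 - 1.09 \dot{x}_k \dot{y}_k^2 - 0.71 \dot{y}_k \dot{x}_k^2 + 0.40 \\
-0.06 x_k + 0.16 y_k + 0.17 \dot{x}_k + 0.87 \dot{y}_k - 0.38 u_1 + 0.57 u_2 - 0.20 \dot{x}_k^2 - 0.52 \dot{y}_k^2 - 0.45 \dot{x}_k \dot{y}_k^2 - 3.17 \dot{y}_k \dot{x}_k^2 - 0.04
\end{bmatrix}
\tag{6}
\]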
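As an illustration only, the velocity update in (6) can be evaluated as a linear map applied to a vector of monomials in the state and control. The sketch below is not the implementation used in the experiments: the names `K_ROWS`, `basis`, and `predict_velocity` are hypothetical, the ordering of the monomials is an arbitrary choice made here, and the coefficients are simply copied from (6).

```python
import numpy as np

# Coefficients copied from Eq. (6). Column order follows basis() below.
# Row 0 predicts the next x-velocity, row 1 the next y-velocity.
K_ROWS = np.array([
    [0.08, -0.35, 0.76, 0.21, 1.06, -0.17, -0.05, -0.19, -1.09, -0.71, 0.40],
    [-0.06, 0.16, 0.17, 0.87, -0.38, 0.57, -0.20, -0.52, -0.45, -3.17, -0.04],
])


def basis(x, y, vx, vy, u1, u2):
    """Hypothetical helper: stack the monomials that appear in Eq. (6)."""
    return np.array([
        x, y, vx, vy, u1, u2,      # linear terms in state and control
        vx**2, vy**2,              # quadratic velocity terms
        vx * vy**2, vy * vx**2,    # cubic cross terms
        1.0,                       # constant offset
    ])


def predict_velocity(x, y, vx, vy, u1, u2):
    """One-step prediction of (x-velocity, y-velocity) according to Eq. (6)."""
    return K_ROWS @ basis(x, y, vx, vy, u1, u2)


if __name__ == "__main__":
    # With zero state and zero control only the constant terms remain.
    print(predict_velocity(0.0, 0.0, 0.0, 0.0, 0.0, 0.0))
```

Because the prediction is linear in the stacked monomials, the nonlinearity lives entirely in `basis`. At zero state and zero control the prediction reduces to the constant terms 0.40 and -0.04 from (6).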

trol. We derive a linearizable data-driven model using the Koopman operator. Closed-loop and open-loop controllers were formulated using the proposed data-driven model. The open-loop experiments reveal the Koopman operator improves performance as the complexity of the basis increases. Closed-loop experiments reveal the Koopman operator is able to capture the nonlinear dynamics of the simulated cart- and VTOL-pendulum examples and of the SPRK robot.

Future research directions include an in-depth analysis of the choice of basis for dynamical systems with distinct structure (e.g. conservative systems, mechanical systems, etc.). The relationship between available states and the accuracy of the approximate Koopman operator needs rigorous stability analysis. Moreover, numerical stability analysis and algorithmic optimization are other possible research avenues.

REFERENCES

[1] N. Hovakimyan and C. Cao, L1 Adaptive Control Theory: Guaranteed Robustness with Fast Adaptation. SIAM, 2010.
[2] K. J. Åström and B. Wittenmark, Adaptive control. Courier Corporation, 2013.
[3] K. Zhou and J. C. Doyle, Essentials of robust control. Prentice Hall, Upper Saddle River, NJ, 1998, vol. 104.
[4] D. S. Bernstein and W. M. Haddad, “LQG control with an H∞ performance bound: a Riccati equation approach,” IEEE Transactions on Automatic Control, vol. 34, no. 3, pp. 293–305, 1989.
[5] S. C. Ong, S. W. Png, D. Hsu, and W. S. Lee, “Planning under uncertainty for robotic tasks with mixed observability,” The International Journal of Robotics Research, vol. 29, no. 8, pp. 1053–1068, 2010.
[6] A. Bry and N. Roy, “Rapidly-exploring random belief trees for motion planning under uncertainty,” in International Conference on Robotics and Automation (ICRA), 2011, pp. 723–730.
[7] J. Van Den Berg, S. Patil, and R. Alterovitz, “Motion planning under uncertainty using differential dynamic programming in belief space,” in Robotics Research. Springer, 2017, pp. 473–490.
[8] D. Nguyen-Tuong and J. Peters, “Model learning for robot control: a survey,” Cognitive processing, vol. 12, no. 4, pp. 319–340, 2011.
[9] M. Jordan and T. Mitchell, “Machine learning: Trends, perspectives, and prospects,” Science, vol. 349, no. 6245, pp. 255–260, 2015.
[10] C. G. Atkeson, A. W. Moore, and S. Schaal, “Locally weighted learning for control,” in Lazy learning. Springer, 1997, pp. 75–113.
[11] G. Williams, N. Wagener, B. Goldfain, P. Drews, J. M. Rehg, B. Boots, and E. A. Theodorou, “Information theoretic MPC for model-based reinforcement learning,” International Conference on Robotics and Automation (ICRA), 2017.
[12] B. O. Koopman, “Hamiltonian systems and transformation in Hilbert space,” Proceedings of the National Academy of Sciences, vol. 17, no. 5, pp. 315–318, 1931.
[13] D. Henrion, I. Mezić, and M. Putinar, “Applied Koopmanism,” 2016.
[14] I. Mezić, “Analysis of fluid flows via spectral properties of the Koopman operator,” Annual Review of Fluid Mechanics, vol. 45, pp. 357–378, 2013.
[15] A. Mauroy and I. Mezić, “Global stability analysis using the eigenfunctions of the Koopman operator,” IEEE Transactions on Automatic Control, vol. 61, no. 11, pp. 3356–3369, 2016.
[16] I. Mezić, “On applications of the spectral theory of the Koopman operator in dynamical systems and control theory,” in Decision and Control (CDC), 2015, pp. 7034–7041.
[17] A. Broad, T. D. Murphey, and B. Argall, “Learning models for shared control of human-machine systems with unknown dynamics,” Robotics: Science and Systems Proceedings, 2017.
[18] M. O. Williams, I. G. Kevrekidis, and C. W. Rowley, “A data–driven approximation of the Koopman operator: Extending dynamic mode decomposition,” Journal of Nonlinear Science, vol. 25, no. 6, pp. 1307–1346, 2015.
[19] J. Hauser, “A projection operator approach to the optimization of trajectory functionals,” IFAC Proceedings Volumes, vol. 35, no. 1, pp. 377–382, 2002.
[20] A. R. Ansari and T. D. Murphey, “Sequential action control: Closed-form optimal control for nonlinear and nonsmooth systems,” IEEE Transactions on Robotics, vol. 32, no. 5, pp. 1196–1214, Oct 2016.
[21] J. Nocedal and S. J. Wright, “Numerical optimization 2nd,” 2006.
[22] M. Quigley, K. Conley, B. P. Gerkey, J. Faust, T. Foote, J. Leibs, R. Wheeler, and A. Y. Ng, “ROS: an open-source robot operating system,” in ICRA Workshop on Open Source Software, 2009.
[23] G. Bradski, Dr. Dobb’s Journal of Software Tools, 2000.
[24] T. Luukkonen, “Modelling and control of quadcopter,” Independent research project in applied mathematics, Espoo, 2011.
