
Received August 19, 2021, accepted September 7, 2021, date of publication September 14, 2021, date of current version September 22, 2021.


Digital Object Identifier 10.1109/ACCESS.2021.3112560

Model Predictive Control With Learned Vehicle Dynamics for Autonomous Vehicle Path Tracking
MOHAMMAD ROKONUZZAMAN, NAVID MOHAJER, SAEID NAHAVANDI, (Fellow, IEEE), AND SHADY MOHAMED
Institute for Intelligent Systems Research and Innovation (IISRI), Deakin University, Waurn Ponds, VIC 3216, Australia
Corresponding author: Mohammad Rokonuzzaman ([email protected])

ABSTRACT Model Predictive Controller (MPC) is a capable technique for designing Path Tracking
Controller (PTC) of Autonomous Vehicles (AVs). The performance of MPC can be significantly enhanced
by adopting a high-fidelity and accurate vehicle model. This model should be capable of capturing the full
dynamics of the vehicle, including nonlinearities and uncertainties, without imposing a high computational
cost for MPC. A data-driven approach realised by learning vehicle dynamics using vehicle operation data
can offer a promising solution by providing a suitable trade-off between accurate state predictions and the
computational cost for MPC. This work proposes a framework for designing an MPC with a Neural Network
(NN)-based learned dynamic model of the vehicle using the plethora of data available from modern vehicle
systems. The objective is to integrate an NN-based model with higher accuracy than the conventional vehicle
models for the required prediction horizon into MPC for improved tracking performances. The proposed
NN-based model is highly capable of approximating latent system states, which are difficult to estimate, and
provides more accurate predictions in the presence of parametric uncertainties. The results in various road
conditions show that the proposed approach outperforms the MPCs with conventional vehicle models.

INDEX TERMS Autonomous vehicles, path tracking controller, model predictive control.

The associate editor coordinating the review of this manuscript and approving it for publication was Qing Yang.

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/

VOLUME 9, 2021

M. Rokonuzzaman et al.: MPC With Learned Vehicle Dynamics for AV Path Tracking

I. INTRODUCTION
A Path Tracking Controller (PTC) is an integral subsystem of an Autonomous Vehicle (AV). For designing the PTC of an AV, the complexity and fidelity of the chosen vehicle model vary depending on the type of the tracking task and the control technique [1]. For example, to implement path tracking at low speed, a simpler vehicle model is sufficient and provides reasonable accuracy [2]. A kinematic vehicle model is a popular choice for designing PTCs for the lower-speed operation of AVs [3]–[5]. However, due to unmodelled dynamics, a kinematic vehicle model is not viable at higher speeds [2]. In these situations, a dynamic vehicle model is more accurate. Vehicle dynamic models of different fidelity, such as bicycle and dual-track models, have been used depending on the complexity of the path tracking task.

A number of techniques have been proposed to design the PTC [1], [6]. Geometry-based controllers such as Pure Pursuit [7] and the Stanley controller [8] are quite popular due to their low computational cost, ease of implementation and acceptable performance at low speeds. Direct Lyapunov-based controllers [9] and feedback linearisation-based controllers [10] have also been found to be practical solutions for the PTCs of AVs. These controllers generally do not consider the dynamic effects of the vehicle and suffer from a high level of uncertainties, so they are not suitable for AVs in the urban environment [2]. To reduce the effect of the unmodelled vehicle dynamics, dynamic vehicle models are used to design PTCs. These types of controllers include Sliding Mode Control (SMC) [11], Optimal Control [12], Adaptive Control [13] and Model Predictive Control (MPC) [14]–[16].

MPC has been shown to be a suitable option for AVs that can accommodate controller design with a reasonable level of complexity and computational cost. In addition, this control approach has the potential to ensure and increase the comfort of the AV's passengers by improving the handling performance of the AV [17]. One of the major advantages of MPC is its ability to handle multiple variables and constraints. Besides, it has inherent robustness against uncertainties. It can address the physical limits of the actuators, making it highly useful for the PTC design of AVs. Moreover, additional constraints on the
states can be imposed to ensure the safety and stability of the vehicle. Nevertheless, the performance of the controller directly depends on the accuracy of the vehicle model and requires careful consideration. As MPC predicts the states of the vehicle for a certain time horizon at each sampling time, the accuracy of these predictions affects the performance of the controller significantly. A high fidelity dynamic vehicle model can improve the performance of the controller. However, the computational cost associated with a complex vehicle model may not be suitable for real-time operation, as MPC solves an optimisation problem at each time step.

For PTCs, two different approaches are generally adopted to design MPCs. These include Linear MPC (LMPC) using a linearised vehicle model [14], [18], [19] and Nonlinear MPC (NMPC) where a fully nonlinear model is utilised [15], [20]. For linearisation of vehicle dynamics, successive linearisation at each operating point is a common approach that transforms the model into a linear time-varying model [14], [18], [19]. However, the use of a linearised vehicle model is only applicable for certain operating regions. For example, the force approximation using a linear tyre model becomes invalid for large slip angles [14], [21].

In conventional MPC design, uncertainties are either tackled by designing robust controllers or by estimating the values of the parameters. A robust controller can handle uncertainties up to a certain limit where a bound on the uncertainty needs to be known. In this regard, robust MPC such as tube-based MPC has been proposed by researchers. In the tube-based MPC approach [22], a feedback controller is used to keep the state within an invariant tube even under the influence of uncertainties. This approach has been used for the active safety of the vehicle by ensuring the state and input constraints are satisfied in the presence of disturbances and uncertainties due to model mismatch [23]–[26].

The nonlinear dynamic model provides more accuracy; however, it still may not completely capture the dynamics of a vehicle. The design of an analytical mathematical model generally requires choosing some specific physical aspects that are most significant for the control task and ignoring others. However, the efficacy of these choices depends on the designer's capability and the required control task. A simpler model may perform well for some specific situations, yet, in some cases, the unmodelled dynamics may introduce uncertainty and significantly affect the controller's performance. On the other hand, a highly complex model may not be the best option to be used in the MPC context due to the computation cost of the online optimisation. In this regard, a learning-based MPC where the vehicle dynamics are learned using the vehicle operation data can provide a suitable trade-off between accuracy, unmodelled dynamics and complexity.

Different types of vehicles are currently available, and a general model formulation for all vehicle types is difficult. Even for the same kind of vehicles, some properties will be inherently different. Besides, several subsystems in a vehicle affect the motion, such as the steering, brake and suspension systems, which may also need to be integrated into the vehicle model in some capacity. For instance, involving the steering dynamics has been recommended to improve the performance of MPC [27], [28]. Furthermore, some aspects change due to the operating conditions, including parametric and non-parametric uncertainties affecting vehicle motion. For example, different environmental conditions such as the friction coefficient of the road surface, wind speed, vehicle weight and load transfer also impact vehicle motion. In addition, some vehicle parameters change during the vehicle lifetime. A rigidly defined vehicle model may not be suitable for different vehicles in various operation conditions. As the design and development of AVs become more advanced and optimised and AVs are eventually adopted by more people, an adaptive data-driven approach may be required to identify and design the vehicle dynamic model.

A Neural Network (NN) is a highly capable solution for approximating nonlinear functions and can be used for learning a vehicle dynamic model using the measured state and input data of the vehicle. A properly designed and trained NN does not generally suffer from unmodelled dynamics and provides more accurate predictions. Besides, it can handle the parametric uncertainties given that it is trained with sufficient data. For any dynamic system, identifying latent system states is demanding and is generally circumvented by estimating some parameters. However, identification of these states is not necessary when an NN-based approach is adopted. A properly trained NN with sufficient data can identify its internal representation of time-varying dynamics [29]. In addition, an NN trained with state and input history can identify variation in latent states such as vehicle load and friction coefficient.

NN-based model identification has been used for controller design in different systems. This approach has been used for controlling helicopters [30]–[32], autopilot control of aerial vehicles [33], underwater vehicles [34], [35], and different industrial systems such as wastewater treatment [36], interface level in a flotation column [37] and a pH maintaining system [38]. In the context of AVs, NN-based system identification has been adopted in a number of reported studies. For example, in [39], [40], this approach was used for identifying longitudinal dynamics, and in [41], it was used for modelling the steering dynamics. A more detailed combined lateral and longitudinal dynamics vehicle model was designed using this approach in [42]. For the control of AVs, a number of control techniques have been adopted with an NN-based system model. For instance, a backstepping variable mode controller was reported in [43] and a sliding mode fuzzy controller was proposed in [44]. Furthermore, more recently, feedforward control [45] and iterative LQR [46] with an NN-based vehicle model have been proposed.

In the context of AVs, the NN-based vehicle model identification approach has been proposed in combination with other control techniques. However, the use of this model in the MPC for PTC design has been unexplored. MPC can be potentially used to improve the path tracking performance
using a more accurate NN-based vehicle model. Especially with the modern vehicle data acquisition system providing abundant operational data, a learning-based approach can provide a more reliable solution for vehicle dynamics approximation. In this work, we propose a new data-driven MPC where two sets of state and input measurement histories are maintained for a few previous steps for NN prediction. The first set contains the controlled vehicle's state and input measurements, and the second set is used for the predicted states and corresponding inputs during the MPC optimisation. These histories, together with the current measurements, are used to estimate future states. This approach allows more accurate prediction in the presence of uncertainties in the vehicle's parameters, such as surface friction coefficient and load variation. The use of the state and input measurement history allows more accurate predictions up to a specific prediction horizon.

The main contributions of this work include: (1) Learning a more accurate vehicle dynamic prediction model than the current analytical vehicle models using an NN. The learned model can be reliably used in MPC to provide more accurate state estimations up to a certain prediction horizon without significantly increasing the MPC computational cost. (2) Demonstrating that the resulting MPC with an NN-based prediction model can improve the tracking accuracy of an AV in the presence of parametric uncertainty. (3) Designing a novel Switched MPC (SMPC) with an adaptive NN-model where the NN's weights and biases are updated online using vehicle measurement data. In the switching scheme, a choice between a nonlinear analytical model and the adaptive NN model is made based on a cost function. We propose two different approaches for designing MPC with an NN-based lateral vehicle model. In the first approach, the NN is trained offline with the data collected for various operating conditions and used by the MPC for the prediction of the states. In the second approach, an adaptive technique is adopted for training the NN where the network's weights and biases are updated based on real-time data from the vehicle. An SMPC is used to accommodate the use of the online trained NN-model. Ultimately, the performances of the proposed approaches are compared to the existing LMPC and NMPC.

The rest of the paper is organised as follows. In section II, a discussion of the AV and the conventional physical models used for MPC design is provided. In section III, the details of the NN-based vehicle model are discussed. Next, the design of MPC using the NN-based vehicle model is reported in section IV. The implementation procedure is addressed in section V, and the performances of the proposed vehicle models and the controllers are evaluated on various conditions, with the results reported in section VI. Finally, the conclusion of the work is drawn in section VII.

II. PRELIMINARIES
This section provides a brief introduction to the AV system, including the chosen states and inputs. A preliminary discussion on the most commonly used vehicle dynamic model and their variation in the MPC context is also provided. These vehicle models are used for comparing the performance of the NN-based vehicle model and the proposed MPC controller.

The considered AV system can be expressed as

\[ x_{t+1} = f(x_t, u_t, w_\theta), \tag{1} \]

where $t$ is time, $x_t \in \mathbb{R}^n$ is the state vector, $u_t \in \mathbb{R}^m$ is the input of the system, and $f : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^n$ is the vehicle transition function. In addition, $w_\theta$ represents a set of variables that represents the parametric uncertainty of the system.

In this work, we limit the study only to parametric uncertainties $w_\theta$ of the system. For the sake of simplicity, we assume that other forms of uncertainties (including noise) and delays are negligible for the current system. In addition, the state values are assumed to be directly measurable, and state estimations are not required. Our primary objective is to design an MPC with an NN-based vehicle model that provides improved performance under different parametric uncertainties. We assume that certain vehicle parameters such as vehicle load and road surface friction vary during vehicle operation. These parameters differ from their initial values, and the variations in parameters are not known. Here, the objective is to use the NN which can identify the underlying changes in parameter values and provide more accurate predictions for the MPC.

The aim is to design a lateral MPC controller for which the vehicle states $x = [X, Y, \psi, v_x, v_y, a_y, r]$ are considered. Here, $[X, Y]$ is the vehicle position in the global coordinate frame, $\psi$ is the yaw angle, $v_x$ is the longitudinal velocity, $v_y$ is the lateral velocity, $a_y$ is the lateral acceleration and $r$ is the yaw rate of the vehicle. The control action is the steering angle $u = \delta$ for the system.

A. DYNAMIC VEHICLE MODEL
The bicycle model is the most commonly used model for designing MPC. Here, the considered vehicle has a mass $m$ and a moment of inertia $I_z$ at the vehicle centre of gravity. The dynamic model of the vehicle can be expressed as [47], [48]

\begin{align}
\dot{v}_x &= \frac{1}{m}\left(F_{xf}\cos(\delta) - F_{xb} + F_{yf}\sin(\delta)\right) + v_y r, \tag{2a} \\
\dot{v}_y &= \frac{1}{m}\left(F_{yf}\cos(\delta) + F_{yb} - F_{xf}\sin(\delta)\right) - v_x r, \tag{2b} \\
\dot{r} &= \frac{1}{I_z}\left(F_{yb}\, l_b + \left(F_{xf}\sin(\delta) - F_{yf}\cos(\delta)\right) l_f\right). \tag{2c}
\end{align}

Here, the front and rear wheels are represented by the subscripts $f$ and $b$. Besides, $F_x$ and $F_y$ are the longitudinal and lateral forces, respectively. In addition, $l_f$ is the distance of the front wheel from the centre of gravity, and $l_b$ is the distance between the rear wheel and the centre of gravity. Finally, $\delta$ represents the steering angle of the vehicle. Figure 1 shows the schematics of a vehicle dynamic model.
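
As a rough illustration, the right-hand sides of (2a)-(2c) can be evaluated directly once the tyre forces are known. The following Python sketch follows the equations as read from the text above; the function names, the forward-Euler step and all numerical values are assumptions made here for illustration and are not taken from this work.

```python
import math

def bicycle_derivatives(vx, vy, r, delta, F_xf, F_xb, F_yf, F_yb, m, I_z, l_f, l_b):
    """Evaluate (2a)-(2c) and return (vx_dot, vy_dot, r_dot)."""
    vx_dot = (F_xf * math.cos(delta) - F_xb + F_yf * math.sin(delta)) / m + vy * r
    vy_dot = (F_yf * math.cos(delta) + F_yb - F_xf * math.sin(delta)) / m - vx * r
    r_dot = (F_yb * l_b + (F_xf * math.sin(delta) - F_yf * math.cos(delta)) * l_f) / I_z
    return vx_dot, vy_dot, r_dot

def euler_step(vx, vy, r, derivs, dt):
    """One forward-Euler step (an assumed integrator, not specified in the text)."""
    vx_dot, vy_dot, r_dot = derivs
    return vx + vx_dot * dt, vy + vy_dot * dt, r + r_dot * dt

# Hypothetical parameter values, chosen only to exercise the functions.
params = dict(m=1500.0, I_z=2500.0, l_f=1.2, l_b=1.6)

# With all tyre forces set to zero, only the coupling terms vy*r and vx*r remain.
d = bicycle_derivatives(vx=10.0, vy=1.0, r=0.1, delta=0.0,
                        F_xf=0.0, F_xb=0.0, F_yf=0.0, F_yb=0.0, **params)
state = euler_step(10.0, 1.0, 0.1, d, dt=0.01)
```

In an MPC setting, the tyre forces passed to such a function would come from a tyre model rather than being fixed numbers.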

FIGURE 1. Geometry of a dynamic bicycle model of a car-like vehicle.

1) LINEAR TYRE MODEL
Different tyre models have been proposed for designing MPC depending on the slip angle of the vehicle. For small slip angles, a linear tyre model is often used where the relationship between the cornering stiffness and the generated force is linear. For the linear tyre model, the wheel slip angles for the front wheel $\alpha_f$ and the rear wheel $\alpha_b$ can be expressed as [47]

\begin{align}
\alpha_f &= \delta - \tan^{-1}\left(\frac{v_y + l_f \dot{\psi}}{v_x}\right), \tag{3a} \\
\alpha_b &= -\tan^{-1}\left(\frac{v_y - l_b \dot{\psi}}{v_x}\right). \tag{3b}
\end{align}

For the small slip angle approximation, a linear relationship between lateral tyre force and tyre slip angle can be represented as [48]

\begin{align}
F_{yf} &= -C_f \alpha_f, \tag{4} \\
F_{yb} &= -C_b \alpha_b, \tag{5}
\end{align}

where $C_f$ is the cornering stiffness of the front wheels and $C_b$ is the cornering stiffness of the rear wheels.

2) NONLINEAR TYRE MODEL
The linear tyre model is only effective for small slip conditions. For larger slip conditions, nonlinear tyre models perform significantly better than linear models. An analytical model such as the Brush model [49] is one of the commonly used approaches for approximating vehicle tyre forces. In this approach, the tyre forces are calculated using the wheel's lateral slip angle ($\alpha$) and the normal force ($F_z$). In the context of MPC, a modified brush model is often used [50], [51]. For this model, the lateral tyre force is expressed as [49]

\[
F_y =
\begin{cases}
-C_\alpha \tan\alpha + \dfrac{C_\alpha^2}{3\mu F_z}\,|\tan\alpha|\tan\alpha - \dfrac{C_\alpha^3}{27\mu^2 F_z^2}\tan^3\alpha & \text{if } |\alpha| < \alpha_s, \\[2ex]
-\mu F_z\,\mathrm{sgn}(\alpha) & \text{otherwise.}
\end{cases}
\tag{6}
\]

Here, $\mu$ is the tyre-road friction coefficient, $C_\alpha$ is the cornering stiffness, $F_z$ is the normal load and $\alpha_s$ is the slip saturation angle. This $\alpha_s$ is calculated as

\[ \alpha_s = \tan^{-1}\left(\frac{3\mu F_z}{C_\alpha}\right). \tag{7} \]

These vehicle models will be used for the performance comparison with the proposed approach. We implement two different MPCs using these vehicle models and compare their performances with the MPC using the proposed NN-based vehicle model.

III. LEARNING NEURAL NETWORK VEHICLE MODEL
To learn a lateral vehicle model using an NN, a subset of vehicle states is taken as $\chi = [v_x, v_y, r]$, and a vector $q_t = [(\chi_t, \dots, \chi_{t-N_h+1}), (u_t, \dots, u_{t-N_h+1})]$ representing the current values and the history of the states and control values up to a certain time period $N_h$ is considered at each time instance $t$. One of the primary objectives of this work is to train a multilayer feedforward NN that provides the following relationship between the current values and history of inputs and states and the next step of the system:

\begin{align}
\hat{y}_t &= f_{NN}(q_t, \omega), \tag{8a} \\
u_t &= u_{t-1} + \Delta u_t, \tag{8b}
\end{align}

where $\hat{y}_t = [\dot{v}_{y,t}, \dot{r}_t]$ is the estimated value of the lateral and yaw accelerations of the vehicle, $f_{NN}$ represents the NN with two hidden layers, each with $N$ units, and $\omega$ is the set of weights and biases. In this work, we chose a feedforward multilayer NN even though a number of other NN architectures can be used for this proposed work. The multilayer NN is one of the commonly used architectures for system identification due to the simplicity of its design and ease of implementation. This NN architecture has successfully been used for identification of different systems such as helicopter systems [31], aerial vehicles [33] and underwater vehicles [34], [35]. In this work, different hyper-parameters of the NN are chosen based on our previous experience of NN-based designs and trial-and-error. The architecture of the NN is shown in Fig. 2.

Two different approaches are used for training the NN. In the first approach, data from human driving is collected for different road conditions and then used to train the NN offline. In the second approach, the NN is trained online in parallel with the vehicle operation.

A. OFFLINE TRAINED MODEL
The training dataset of $n$ trajectories with a sampling step size $\Delta t$ is assumed as

\[ D^{(i)} = \{ q^i_t, q^i_{t-1}, \dots, q^i_{t-p} \}. \tag{9} \]
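
The history vector $q_t$ of Section III stacks the $N_h$ most recent values of $\chi = [v_x, v_y, r]$ and of the steering input. A minimal Python sketch of how such a vector might be maintained at run time is given below; the class and method names are hypothetical, and the flat, newest-first layout is one possible reading of the definition of $q_t$ rather than a detail confirmed by the text.

```python
from collections import deque

class HistoryBuffer:
    """Keeps the N_h most recent (chi, u) samples, newest first."""

    def __init__(self, n_h):
        self.n_h = n_h
        self._chi = deque(maxlen=n_h)
        self._u = deque(maxlen=n_h)

    def push(self, chi_t, u_t):
        # appendleft on a full deque silently drops the oldest sample.
        self._chi.appendleft(tuple(chi_t))
        self._u.appendleft(float(u_t))

    def ready(self):
        return len(self._chi) == self.n_h

    def q(self):
        """Return q_t as a flat list, or None until N_h samples are stored."""
        if not self.ready():
            return None
        states = [value for chi in self._chi for value in chi]
        return states + list(self._u)

buf = HistoryBuffer(n_h=2)
buf.push((10.0, 0.1, 0.01), 0.02)   # chi = (v_x, v_y, r), u = steering angle
first = buf.q()                      # not enough history yet
buf.push((10.0, 0.2, 0.02), 0.03)
q_t = buf.q()
```

The buffer returns no vector until $N_h$ samples are available, which reflects the need for a few previous steps before the NN can make a prediction.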

FIGURE 2. The proposed architecture for NN.
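
For illustration, a feedforward network of the kind shown in Fig. 2 (two hidden layers, with output $\hat{y}_t = [\dot{v}_y, \dot{r}]$) could be evaluated as in the following Python sketch. The tanh activation, the tiny layer sizes and all weight values are placeholders chosen here, not the settings used in this work.

```python
import math

def dense(W, b, x):
    """Affine map W x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def f_nn(q, params, phi=math.tanh):
    """Two-hidden-layer forward pass: W_L2 phi(W_L1 phi(W_I q + b1) + b2) + b3."""
    W_I, b1, W_L1, b2, W_L2, b3 = params
    h1 = [phi(z) for z in dense(W_I, b1, q)]
    h2 = [phi(z) for z in dense(W_L1, b2, h1)]
    return dense(W_L2, b3, h2)   # linear output layer (an assumption)

# Placeholder network: one input, one unit per hidden layer, two outputs.
example_params = (
    [[1.0]], [0.0],          # W_I,  b1
    [[1.0]], [0.0],          # W_L1, b2
    [[2.0], [-1.0]], [0.0, 0.0],  # W_L2, b3
)
y_hat = f_nn([0.5], example_params)
```

In practice the input would be the full history vector and the weights would come from training.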

In (9), $i = 1, 2, \dots, n$ indexes the trajectories and $p$ is the time step in each trajectory.

In the standard NN approach, the objective is to find a set of weights and biases that reduces the error between the estimated network output and the observed data from the system. The NN shown in Fig. 2 establishes the following relationship

\[ \hat{y}_t = W_{L2}\,\phi\left(W_{L1}\,\phi(W_I a + b_1) + b_2\right) + b_3, \tag{10a} \]

where $a := (\chi, u) \in \mathbb{R}^{|a|}$ is the input to the NN and $\phi(\cdot)$ is the activation function. $W_I$, $W_{L1}$, $W_{L2}$ represent the weights of the input layer, first hidden layer, and the second hidden layer, respectively. Similarly, $b_1$, $b_2$, $b_3$ represent the biases of the input layer, first hidden layer, and the second hidden layer, respectively. The choice of training algorithms and other hyper-parameters such as the number of neurons in each layer, activation function, and learning rate are discussed in more detail in section V.

The output of the NN can be used to estimate the lateral velocity and yaw rate of the vehicle as follows

\[ \begin{bmatrix} v_{y,t} \\ r_t \end{bmatrix} = \begin{bmatrix} v_{y,t-1} \\ r_{t-1} \end{bmatrix} + \hat{y}_t\, dt. \tag{11a} \]

The trained network is used to predict the states' output in the context of MPC for the lateral control of a vehicle. The formulation of MPC with the NN-based vehicle state prediction model will be shown in section IV.

B. ONLINE TRAINED MODEL
An efficient PTC should be able to perform under different operating conditions. Approximating vehicle transition dynamics using the offline trained NN needs a large dataset containing information from different road conditions with different vehicle states and controls. Moreover, a static NN model may not be sufficient for a highly dynamic system such as an AV operating in various environmental conditions. To circumvent this, an adaptive approach to learning the NN-based model is proposed. Here, the network is trained online when the data from the vehicle is updated.

It is assumed that no data is available prior to the start of the vehicle operation. In this approach, training the NN before the start of the vehicle operation is not required. The NN weights and biases are initialised using the Nguyen-Widrow method (NW) [52] and then updated sequentially when a new set of data is available from the vehicle. The network weights and biases are always updated using $N_s$ vehicle measurement data points. The training of the NN starts when $N_s$ steps of vehicle state measurements and corresponding input data are available from the vehicle operation. The weights and biases are updated periodically after a specific update delay of $d = N_n$ steps, which allows the use of $N_n$ new data points for each update. A mixture of old and new data $N_s = N_c + N_n$ is used to update the network, where $N_c$ is the number of old data points and $N_n$ is the number of new operation data points. The change in weight after each iteration can be expressed as

\[ \Delta W(t+1) = M\,\Delta W(t) + (1 - M)\, L\, W(t), \tag{12} \]

where $\Delta W$ is the change in weights, $M$ is the momentum constant, and $L$ is the learning rate. This operation is conducted in parallel with the path tracking task. More details on the choice of the dataset size and hyperparameter values will be reported in section V.

IV. CONTROLLER DESIGN WITH LEARNED MODEL
The formulation of the MPC for the lateral control of an AV based on the NN-based lateral transition model is discussed in this section. Firstly, the MPC with the offline trained NN-based vehicle model is discussed. Then, a switched MPC approach for the online trained NN-based vehicle model is reported.

A. MPC WITH OFFLINE TRAINED MODEL
For the AV system of (1), based on the discussion in section III, the NN-based transition model can be
expressed as

\begin{align}
\begin{bmatrix} \dot{v}_y & \dot{r} \end{bmatrix}^T &= f_{NN}(x_k, \dots, x_{k-N_h+1}, u_k, \dots, u_{k-N_h+1}), \tag{13a} \\
\dot{X} &= v_x \cos\psi - v_y \sin\psi, \tag{13b} \\
\dot{Y} &= v_x \sin\psi + v_y \cos\psi, \tag{13c} \\
\dot{\psi} &= r. \tag{13d}
\end{align}

For the sake of compactness, the vehicle transition model is expressed as

\[ x_{k+1} = x_k + \dot{x}_k \Delta t = f_{NN}(x_k, u_k, \theta), \tag{14} \]

where $\theta = [W_I, W_{L1}, W_{L2}, b_1, b_2, b_3]$ are the parameters of the NN. In an MPC approach, the optimal control problem is solved using a receding horizon approach. At any time $t$, the MPC problem can be expressed as

\begin{align}
\arg\min_{U}\ & \sum_{k=0}^{N_p-1} J(x_{k|t}, u_{k|t}) \tag{15a} \\
\text{subject to}\quad & \hat{x}_{t+k+1|t} = f_{NN}(\hat{x}_{t+k|t}, u_{t+k|t}, \theta), \tag{15b} \\
& u_k = u_{k-1} + \Delta u_k, \tag{15c} \\
& x(0|t) = x(t), \tag{15d} \\
& u(k) \in \mathcal{U} \quad \forall k \in [t, t+N_p], \tag{15e} \\
& \hat{x}(k) \in \mathcal{X} \quad \forall k \in [t, t+N_p]. \tag{15f}
\end{align}

Here, $N_p$ is the prediction horizon, and $J$ is the stage cost. The state and input constraints are represented by the sets $\mathcal{X}$ and $\mathcal{U}$. In addition, $\hat{x}$ represents the predicted state of the vehicle based on the current measured state. At each time step, an optimal solution for the control action $U^* = [u^*_t, u^*_{t+1}, \dots, u^*_{t+N_p}]$ is found and only the first control action is sent to the real system. Then, the whole process is repeated at the next time step. Fig. 3 shows the architecture of MPC with the offline trained NN transition model.

FIGURE 3. MPC with NN-based vehicle model.

During the MPC control process, the future states are predicted for a certain time horizon at each sampling time based on the current state. The NN vehicle transition model requires a history of states and corresponding control actions for certain previous time steps along with the current states and control input. To facilitate this, two separate sets of state and input histories are maintained: 1) the history of the real vehicle $H_r$, including vehicle state measurements during vehicle operation, and 2) the history of the predictions $H_p$, calculated using the vehicle model during the optimisation process. At each sampling time, the optimisation process is started by replacing $H_p$ with the real vehicle history $H_r$. These $H_p$ values are then used for the prediction of the states using the NN model. During the optimisation process, the history of the predicted states $H_p$ is updated based on the estimated states for the corresponding input generated by the optimiser. The algorithm for the MPC with an NN-based vehicle model is shown in Algorithm 1.

Algorithm 1 MPC With NN-Based Vehicle Model
Input: Initial state $x_0$ (feedback from real vehicle), history of real vehicle states and inputs $H_r = [x_{t-1}, x_{t-2}, \dots, x_{t-N}, u_{t-1}, u_{t-2}, \dots, u_{t-N}]$, history of estimated states and corresponding inputs $H_p$, prediction horizon $N_p$, cost function $J$, NN-based vehicle model $f_{NN}$
 1: Form the optimisation problem using (13), (14) and (15)
 2: while MPC is running do
 3:   Measure current state $x_t$.
 4:   Update $H_p = H_r$ with current state measurement and control (if available).
 5:   Start the optimisation problem
 6:   while Optimisation is running do
 7:     for $i = 1 : N_p$ do
 8:       Estimate next state $\hat{x}_{t+i}$ using $f_{NN}$, $H_p$ and $u_i$
 9:       Update $H_p$ using the estimated states and control
10:     end for
11:     Find optimal control sequence
12:   end while
13:   Apply only $u(t|t)$
14:   $t = t + 1$
15: end while

B. SWITCHED MPC FOR ONLINE TRAINED MODEL
In the online training approach, the NN-model is capable of adapting to new data collected during vehicle operation. However, this requires a certain amount of data (vehicle operation for a certain time) and a certain number of learning iterations to be effective. This is true at the start of vehicle operation or when a completely different operating condition is faced. During this time, the network needs to be trained for a number of iterations with new data to perform better than a nonlinear dynamic model.

To circumvent the aforementioned problem, a switched MPC is suggested. In this approach, both the NN-based vehicle model and the nonlinear dynamic model (discussed in section II-A) can be in effect. The vehicle model used for state prediction is switched based on the accuracy of the predictions of these models. At each time interval, the prediction from both models is compared with the vehicle's current state using the same input sent to the vehicle. This prediction error is calculated based on the difference between the models' predicted states and the vehicle's current state. In this case,
the switched MPC formulation is expressed as

\begin{align}
\arg\min_{U}\ & \sum_{k=0}^{N_p-1} J(x_{k|t}, u_{k|t}) \tag{16a} \\
\text{subject to}\quad & \hat{x}_{t+k+1|t} = f^p(\hat{x}_{t+k|t}, u_{t+k|t}), \tag{16b} \\
& f^p = h(e_p), \tag{16c} \\
& u_k = u_{k-1} + \Delta u_k, \tag{16d} \\
& x(0|t) = x(t), \tag{16e} \\
& u(k) \in \mathcal{U} \quad \forall k \in [t, t+N_p], \tag{16f} \\
& \hat{x}(k) \in \mathcal{X} \quad \forall k \in [t, t+N_p]. \tag{16g}
\end{align}

Here, $f^p$ is the vehicle transition function where $p$ represents either the NN-based model or the nonlinear dynamic model, $e_p$ is the prediction error of the $p$th vehicle model and $h(\cdot)$ is the switching function. The prediction error is calculated as

\[ e_p(f^p, t) = |\hat{x}^p_t - x_t|^2, \tag{17} \]

where $\hat{x}^p$ is the predicted state using the vehicle model $f^p$ and $x$ is the measured vehicle state. Fig. 4 shows the architecture of the SMPC with the adaptive NN transition model. The same approach discussed in Algorithm 1 is used if the SMPC selects the NN model.

FIGURE 4. Switched MPC with NN-based vehicle model. Here, NN represents the adaptively learned NN-based vehicle model and NL represents the conventional nonlinear analytical vehicle model.

V. IMPLEMENTATION
A simulated testbed using Matlab and the physics simulator 'Unreal Engine' is developed to evaluate the performance of the proposed NN-based MPC controllers. A complex, high fidelity 14 Degree of Freedom (DoF) vehicle model including several other subsystems such as steering, suspension, transmission and driveline is used to simulate the vehicle. Details of the vehicle system are discussed in the following subsection. This model represents an actual vehicle that is too complex to be used in the MPC optimisation process.

The performance of the proposed NN-based MPC is compared with two implementations of existing MPC: i) LMPC using a linear tyre model and ii) NMPC using a nonlinear tyre model. These two models are commonly used for designing a dynamic model of a vehicle in the context of MPC, as discussed in section II. For both implementations, the same vehicle is controlled on various road conditions for different manoeuvres. In addition, to evaluate the performance of the proposed controller in the presence of parametric uncertainty, tests are conducted for different parameter values of the system. The performance has been evaluated for the variation of two parameters: 1) road surface friction and 2) vehicle load (mass).

In this section, first, the controlled vehicle and the corresponding simulated environment are briefly described. Then, the data collection process for training the NN-based model is reported. Finally, the formulation and performances of the MPC are reported.

A. SIMULATED VEHICLE
The simulated real vehicle controlled by the MPC has 14 DoF. The vehicle body has six DoF (longitudinal, lateral, vertical, yaw, pitch and roll) with four wheels, and each of them has two DoF (vertical and rolling). The vehicle body is connected to each wheel by a spring-damper suspension system. In addition, this model also includes a front-wheel-drive driveline, mapped spark-ignition engine, transmission, brake hydraulics and steering subsystems. This vehicle model is implemented in the MATLAB/Simulink environment. Fig. 5 shows the architecture of the simulated vehicle model. The nominal values for different parameters of this model are shown in Table 1.

TABLE 1. Nominal parameter of the simulated vehicle.

B. DATA COLLECTION
To implement the proposed offline NN-based MPC, first, the NN representing the dynamics of the vehicle needs to be trained. Data from a number of driving scenarios were collected from the simulated environment. During this process, the high fidelity model described in section V-A is used to drive the vehicle on the road. A 3D simulation environment, 'Unreal Engine', is used for rendering the road environment. The Unreal Engine was interfaced with Simulink, which performs the vehicle dynamics operations. A Logitech G290 steering-pedal system is used to control the vehicle while the data is collected through communication between the Unreal Engine and the vehicle dynamic model. Figs. 6 and 7 show the architecture of the data collection system.

As the path profile has a significant effect on a vehicle's handling performance [53], for collecting data, the vehicle

VOLUME 9, 2021 128239


M. Rokonuzzaman et al.: MPC With Learned Vehicle Dynamics for AV Path Tracking

was driven on three different road types numerous times, including 1) straight road (highway), 2) curved road (racetrack), and 3) city block (a mix of different turns). Different manoeuvres such as single-lane and double-lane changes were also performed frequently for each road condition.

FIGURE 5. Architecture of the complex 14 DoF vehicle model.

FIGURE 6. System architecture for data collection.

FIGURE 7. Driving controller and environment rendering in Unreal Engine for human demonstration.

The ability of the controller is tested for different parameter variations. To achieve this, data for different surface conditions such as dry road (friction coefficient, µ = 1) and wet road (µ = 0.6) were collected. Moreover, data for different loading conditions were considered. To simulate this, the mass of the vehicle is varied to represent the presence of a passenger in the vehicle. For the no-passenger condition, the nominal mass of the vehicle is considered. For the single-passenger condition, 70 kg is added to the mass of the vehicle. For the sake of simplicity, the additional mass is assumed to be distributed evenly over the vehicle.

For all driving tasks, at each time step, all vehicle states and inputs are recorded. The vehicle states include: current global position $X$ and $Y$, yaw angle $\psi$, longitudinal velocity $v_x$, lateral velocity $v_y$, longitudinal acceleration $a_x$, lateral acceleration $a_y$, yaw velocity $r$ and yaw acceleration $\dot{r}$. The corresponding steering wheel angle input $\delta_w$ and steering angle $\delta$ are also recorded. All data were collected at 100 Hz and then down-sampled to 33 Hz.

C. TRAINING NEURAL NETWORK
1) OFFLINE TRAINING
Here, using the collected data, the NN is trained offline. The NN model represents the vehicle transition dynamics, so it can be used in the MPC to predict the future states of the vehicle. As the MPC is designed for the lateral control of the vehicle, the objective is to train an NN that estimates the transition of the vehicle states based on the history of the vehicle states and steering wheel angle input.

The gradient-based optimisation technique, ‘Adam’, is used for training the multilayer network. To this aim, the collected dataset is separated into three segments. A randomly chosen 70% of the total data is used for training, 15% for validation, and the remaining 15% for testing. The ReLU activation function is used for each hidden layer, each having N = 100 units. A minibatch size of 50 is used with a learning rate of 0.001. The NN is trained for 10,000 iterations. A total of 230,000 trajectory steps (around 115 min of driving data) is used for training. The time required for training is 55 minutes on a computer with an Intel i7 processor and 32 GB RAM using MATLAB’s Deep Learning Toolbox.

2) ONLINE TRAINING
In this approach, the network weight and bias values are initialised using the NW method [52]. To reduce the computational burden of online training, a smaller number of units for each hidden layer, N = 50, is used. The network weights and biases are then updated using the gradient descent with momentum algorithm when a certain amount of data is available. Data is collected when the vehicle starts its operation. After collecting $N_s = 750$ steps of vehicle states and input data, the network weights and biases are updated. This process is repeated after $N_n = 50$ time steps. After every $N_n$ time steps, a new dataset is assembled which contains $N_s = N_c + N_n$ states and control sequence data from the vehicle. Here, $N_c$ is the number of sequences from the old dataset. This process can be conducted in parallel with the MPC with


a separated processing core, so it is not considered a part of the real-time optimisation process of the MPC.

D. MPC FORMULATION
Using the offline trained NN-based vehicle transition model, the MPC controller is created based on the discussion in section IV. The following cost function is used for the MPC optimisation

$$J(x_{k|t}, u_{k|t}) = \sum_{k=0}^{N_p-1} w_d|\xi_d|^2 + w_\psi|\xi_\psi|^2 + w_\delta|\Delta\delta|^2 \tag{18a}$$

where,

$$\xi_d = \sqrt{\left(\hat{X}_{t+k|t} - X^{ref}_{t+k|t}\right)^2 + \left(\hat{Y}_{t+k|t} - Y^{ref}_{t+k|t}\right)^2} \tag{18b}$$
$$\xi_\psi = \sqrt{\left(\hat{\psi}_{t+k|t} - \psi^{ref}_{t+k|t}\right)^2} \tag{18c}$$

Here, the first term of the cost function, $\xi_d$, represents the distance error between the current position of the vehicle and the closest point of the reference path. $\hat{X}$ and $\hat{Y}$ are the predicted position of the vehicle using the transition model. Similarly, $\xi_\psi$ expresses the angle error between the current yaw angle of the vehicle and the path angle in the global coordinate, where $\hat{\psi}$ is the estimated yaw angle of the vehicle. $w_{(\cdot)}$ are the corresponding weights of each term. $X^{ref}$, $Y^{ref}$ and $\psi^{ref}$ represent the position and angle of the reference path to be followed by the vehicle. In addition, $\Delta\delta$ is the steering angle input rate used for lateral control of the vehicle.

The optimisation problem of the MPC is solved using the interior point optimisation method using the IPOPT package on a computer with an Intel i7 processor with multiple cores. The values of the prediction horizon, time step and weights of the cost function are listed in Table 2.

TABLE 2. Value of controller parameters.

For the switched MPC approach with the online trained NN-model, the same cost function of (18) is used. However, the vehicle is started with the nonlinear dynamic model, and prediction performances for both the nonlinear dynamic and the adaptive NN-model are compared at each step. Prediction error for each model is calculated using the following equation

$$e_p(f^p, t) = w_e|x^p_t - x^v_t|^2 \tag{19}$$

where, $x^v = [X, Y, \psi, v_y, r]$ is the measured vehicle state and $x^p$ is the state predicted by model $f^p$. To have an appropriate representation of each state, $w_e = [1, 1, 1, 10, 10]$ is used. The NN-model is adapted at a regular interval when sufficient data is available, and the updated network is used for the prediction error calculation. Based on this prediction error, the switched MPC uses the NN-model when it is more accurate than the nonlinear dynamic model.

For both cases, MPC is only used for the lateral control of the vehicle and a separate longitudinal controller for the vehicle is designed. For the longitudinal control, a simple PI controller is used to maintain a predefined speed of the vehicle.

$$a_x = K_P\left(v_x^{ref} - v_x\right) + K_I \int \left(v_x^{ref} - v_x\right) \tag{20}$$

where, $v^{ref}$ is the reference speed of the vehicle, $K_P$ is the proportional gain and $K_I$ is the integral gain.

E. PARAMETER TUNING
Parameter tuning of MPC weights plays an important role in the performance of the controller. MPC’s performance can be improved by tuning the parameters even for specific road conditions and manoeuvres. To properly compare different MPCs, it is essential to provide a baseline approach for tuning their parameters. To be able to have a consistent comparison, the parameters for each controller are tuned using a Genetic Algorithm (GA)-based optimiser. The objective of the optimiser is to find proper tuning parameters with similar effort for each controller.

The following features are used to compare the performance of the different controllers.

$$\text{maximum lateral error}: \quad \xi_{d,max} = \max_{t \in [0,T]} |\xi_d(t)|$$
$$\text{maximum orientation error}: \quad \xi_{\psi,max} = \max_{t \in [0,T]} |\xi_\psi(t)|$$
$$\text{average lateral error}: \quad \xi_{d,rms} = \sqrt{\frac{1}{T}\int_0^T \xi_d(t)^2 \, dt}$$
$$\text{average orientation error}: \quad \xi_{\psi,rms} = \sqrt{\frac{1}{T}\int_0^T \xi_\psi(t)^2 \, dt}$$

where, $\xi_d$ is the lateral error and $\xi_\psi$ is the orientation error. The optimal tuning parameters minimise the RMS and maximum tracking errors. The following cost function is used for the GA optimisation:

$$J_{tune} = \xi_{d,rms} + \xi_{d,max} + \xi_{\psi,rms} + \xi_{\psi,max} \tag{21}$$

In the GA optimisation, a population size of 50 and a maximum of 100 generations are used.

Using this GA optimiser, a set of suitable parameters is chosen for the proposed controllers. In addition, to remove the effect of the parameter tuning from the performance comparison, the same GA optimiser is used for the compared LMPC and NMPC.

VI. RESULTS AND DISCUSSION
In this section, the results of prediction performances of the designed vehicle models are shown. Moreover, the results of tracking performance of the proposed MPCs are presented. We also provide a detailed discussion on the results and corresponding comparisons with other controllers.
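The model selection that the switched MPC performs with the prediction error of (19) in section V-D can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `prediction_error` and `select_model`, the `nn_ready` flag and the sample states are hypothetical, and the weights $w_e$ are assumed to act element-wise on the squared state errors.

```python
# Minimal sketch of the switching rule, under the assumptions stated above.
# State order follows the text: [X, Y, psi, v_y, r].

W_E = (1.0, 1.0, 1.0, 10.0, 10.0)  # weights w_e reported in the text


def prediction_error(predicted, measured, weights=W_E):
    """Weighted squared prediction error in the spirit of Eq. (19).

    Assumption: each weight multiplies the squared error of one state.
    """
    return sum(w * (p - m) ** 2 for w, p, m in zip(weights, predicted, measured))


def select_model(pred_nl, pred_nn, measured, nn_ready):
    """Return "NN" only if the adaptive NN has been trained and is more accurate.

    `nn_ready` is a hypothetical flag meaning that enough data has been
    collected and at least one weight update has been performed.
    """
    if not nn_ready:
        return "NL"
    e_nl = prediction_error(pred_nl, measured)
    e_nn = prediction_error(pred_nn, measured)
    return "NN" if e_nn < e_nl else "NL"


if __name__ == "__main__":
    # Illustrative values only, not data from the paper.
    measured = [10.0, 5.0, 0.1, 0.2, 0.05]
    pred_nl = [10.1, 5.0, 0.1, 0.3, 0.05]
    pred_nn = [10.0, 5.1, 0.1, 0.2, 0.05]
    print(select_model(pred_nl, pred_nn, measured, nn_ready=False))
    print(select_model(pred_nl, pred_nn, measured, nn_ready=True))
```

Keeping the nonlinear model as the default until the NN is flagged as ready mirrors the described behaviour of starting the operation with the nonlinear dynamic model.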


A. MODEL PERFORMANCES
The prediction performance of the offline trained NN-based vehicle model is evaluated, and a comparison with the observed vehicle data is shown in Fig. 8. Here, the results for different road conditions are partitioned using the dotted vertical line. The first portion shows the prediction results for a road surface friction of µ = 1, and the second portion is for µ = 0.6.

FIGURE 9. Test road segments with different curvatures.

FIGURE 8. Performance comparison of observed data and trained NN’s prediction. The variation in road-surface friction is indicated by the dotted line. In the first portion road surface friction coefficient µ = 1 and in the second portion µ = 0.6 is used.

The prediction performance of the model is also tested for different vehicle load conditions. Moreover, the model’s prediction performance on roads with different curvatures for these parameter variations is recorded. The test road segments with different curvatures are shown in Fig. 9. Here, we can see that segment-2 has smaller curvatures than segment-1, whereas the curvatures of segment-3 are much higher than those of the other segments. The performance of the model is tested for these three road segments under the aforementioned parameter variations. Here, we use the Root Mean Square Error (RMSE) metric to express the NN’s performance. Figure 10 shows the RMSE for each output of the offline NN model using the test vehicle data and corresponding NN predictions.
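The per-output RMSE described above can be illustrated with a minimal sketch. The function name `rmse` and the sample values are hypothetical placeholders rather than data from the paper; the two outputs (lateral acceleration and yaw acceleration) follow the NN outputs named in the text.

```python
import math


def rmse(predicted, observed):
    """Root mean square error between two equally long sequences."""
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical test-set samples for one road segment (illustrative only).
observed = {
    "lateral_acc": [0.0, 1.0, 2.0, 3.0],
    "yaw_acc": [0.0, 0.1, 0.2, 0.3],
}
predicted = {
    "lateral_acc": [0.0, 1.0, 2.0, 5.0],
    "yaw_acc": [0.0, 0.1, 0.2, 0.3],
}

# One RMSE value per NN output, i.e. one bar per output and road segment.
rmse_per_output = {name: rmse(predicted[name], observed[name]) for name in observed}
print(rmse_per_output)
```

Repeating this computation for each road segment and each parameter setting (friction and load) would give the kind of comparison summarised in Fig. 10.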
Figure 10a shows the RMSE of the lateral acceleration for different road segments under different parameter values. A similar result for the yaw acceleration output is shown in Fig. 10b.

FIGURE 10. NN model performance for different road segments for parameter variations.

In an MPC approach, the vehicle model is used to predict future vehicle states, which are then used to generate the optimised control action. After applying the control action, the vehicle model is reinitialised using the state feedback. The estimation accuracy of the vehicle model up to a K-step ahead prediction horizon plays an important role in the performance of the MPC. Figure 11 shows the K-step ahead prediction error for different vehicle models. For this comparison, the MPC prediction horizon of K = 8 is used. This means that after every K time steps, the states of the prediction model are updated using the state feedback of the vehicle. The prediction error is calculated using (19).

Figure 11a shows the K-step ahead error comparison of the two NN-based models and the nonlinear dynamic model for a constant steering angle operation of the vehicle. In addition, a similar comparison for an increasing steering angle operation of the vehicle is shown in Fig. 11b. In both figures, at each step, the mean error for the K-step prediction horizon is shown. The dotted vertical line represents the condition


when the online NN has enough data and starts adapting the weights and biases of the NN. Here, the vehicle longitudinal speed is constant at 60 km h$^{-1}$.

FIGURE 11. K-step ahead prediction error for vehicle operation with a) a constant steering angle b) increased steering. At each step, the mean prediction error for the K = 8 prediction horizon is shown.

From these results, it is apparent that the NN-based models reflect superior performance when properly trained. The online trained adaptive NN provides better performance when it has sufficient data and is trained for a number of iterations. This can be observed from the results at the beginning of the operation. However, this model eventually performs better than other models after a specific time. A switching MPC is proposed to circumvent this problem, where the operation starts with a nonlinear dynamic model and switches to the NN model when it provides better performance.

B. TRACKING PERFORMANCE
1) MPC WITH OFFLINE NN-BASED MODEL
To test the proposed controller’s performance with the offline NN-based model, two different manoeuvres are considered: 1) Single-Lane Change (SLC) and 2) Double-Lane Change (DLC). For both manoeuvres, the reference trajectory is collected from human driving with the same vehicle. Each manoeuvre is performed for the variation of two parameters: friction coefficient and vehicle load. The controller’s performance is compared with two conventional MPCs: LMPC and NMPC.

To evaluate the controller’s efficacy with the parameter variations, the NN-based vehicle model is trained with the data from different variations of these parameters. For conventional MPCs, the vehicle models assume a constant value of these parameters, which is common in the literature of PTCs for AVs. Here, for the physical models, the friction coefficient is fixed at µ = 1, and the vehicle load is at the nominal vehicle load reported in Table 1. First, the vehicle is operated on two different road surface conditions and the results for each controller are recorded for both SLC and DLC manoeuvres. Fig. 12 shows the trajectory and yaw angle

FIGURE 12. Tracking performance of the controllers for SLC manoeuvre with a road surface friction coefficient of a-b) µ = 1 and c-d) µ = 0.6.


comparison of different controllers for two surface conditions (µ = 1 and 0.6) for the SLC manoeuvre. Similarly, for the DLC manoeuvre, the performance comparison for two different road surface conditions is shown in Fig. 13. Fig. 14 shows the RMS error comparison for all these operations. For all operations, a forward velocity of 60 km h$^{-1}$ is used. This vehicle speed is chosen based on the most common road speed limit information of Australia. According to the review report of the Victorian Government, Australia [54], 60 km h$^{-1}$ is the most common speed limit for Australian roads with little to no pedestrian activities with a high number of access points.

FIGURE 13. Tracking performance of the controllers for DLC manoeuvre with a road surface friction coefficient of a-b) µ = 1 and c-d) µ = 0.6.

The controller’s performance is also evaluated for the variation of vehicle load. Load conditions are considered with no passenger and a single passenger. For the no-passenger condition, the nominal mass of the vehicle reported in Table 1 is used. For the single-passenger condition, the average mass of a human, 70 kg, is added. Fig. 15 shows the SLC manoeuvre for the single-passenger condition for each controller. Similarly, Fig. 16 shows the trajectories and yaw angles for the DLC manoeuvre. A forward velocity of 60 km h$^{-1}$ is used for both cases. Finally, Fig. 17 shows the RMS error comparison for the no-passenger and single-passenger conditions.

FIGURE 14. Tracking error comparison of the controllers for different surface friction coefficient values for a) SLC and b) DLC manoeuvre.

2) MPC WITH ONLINE NN-BASED MODEL
The performance of the online NN model-based switched MPC is discussed here. For the performance evaluation, the same road surface with a number of different manoeuvres is chosen. From the discussion in section VI-A as well as the prediction performance results shown in Fig. 11a and 11b, it is apparent that the adaptive NN approach requires a certain amount of data and learning iterations to provide better performance than the physics-based dynamic model. To circumvent this problem, an SMPC is designed where the controller uses the nonlinear dynamic vehicle model until


the NN-model provides better performance. For the clarity of presentation, we refer to the nonlinear dynamic model as the ‘NL’ model. Fig. 18 shows the trajectory generated by the SMPC. Here, the red portion of the trajectory is generated while using the NL model, whereas for the rest of the blue coloured trajectory, the online NN-model is used.

FIGURE 15. Tracking error comparison of the controllers for SLC manoeuvre for single passenger load condition.

FIGURE 16. Tracking error comparison of the controllers for DLC manoeuvre for single passenger load condition.

FIGURE 17. Tracking error comparison of the controllers for vehicle load variation for a) SLC manoeuvre and b) DLC manoeuvre.

During the operation of the switched MPC, the weights and biases are adapted at a regular interval. This process can be conducted in parallel with the MPC with a separated processing core, so it is not considered a part of the real-time optimisation process of the MPC. Fig. 19 shows the prediction error comparison of the NL and NN vehicle models calculated using (19) during the tracking task. For the clarity of presentation, the data is shown when the NN model has enough data and starts adapting the network.

The performance of the proposed SMPC is also compared with other controllers. The trajectories generated by different controllers are depicted in Fig. 20. The RMS error for each controller for the same tracking task is shown in Fig. 21.

C. DISCUSSION
From the observation of Figs. 12-17, it is apparent that the MPC with the offline trained NN model performs significantly better than the other two controllers even in the presence of parameter variations. Two important aspects to


note here: first, the performance of the NN-MPC is significantly better even when the NMPC and LMPC have a good approximation of the underlying parameter. This is because the NN-based vehicle transition model approximates the dynamics of the vehicle more comprehensively than the mathematical models. In addition, the performance of the NN-MPC does not degrade significantly (which happens to other controllers) due to the change in road-friction parameters. This is because, by using the history of state and control input, the NN-based transition model can approximate latent states of the system without a significant increase in the computation cost.

FIGURE 18. Trajectory generated by the proposed SMPC. Here, for generating the red portion of the trajectory, SMPC used the nonlinear dynamic model, and for the green portion of the trajectory adaptive NN model is used.

FIGURE 19. Prediction error comparison during SMPC tracking task. Here, prediction error for two available models for the SMPC during the tracking task is shown.

FIGURE 20. Trajectories generated by different controllers. Only a portion of the trajectory is shown for clarity.

FIGURE 21. Tracking error comparison of the controllers for different surface friction coefficient values.

The proposed NN-based approach is highly beneficial when the formulation of a mathematical model is complex. Based on the design and application, an AV can have different shapes and sizes. In addition, even the same type of vehicles are not identical and are guaranteed to have some degree of variation. Designing an analytical model for each of them is difficult and time-consuming. For most cases, the simplified analytical model may introduce uncertainties due to unmodelled dynamics. Using the data-driven approach to learn the vehicle dynamics, simplification is not required anymore, and the full range of dynamics can be identified and simulated. In this work, we only tested the performance of this approach for parametric uncertainties. However, this approach can potentially provide similarly improved performance in the presence of nonparametric uncertainties, noise, and delay, which are in the scope of our future works.

One of the important aspects of the proposed approach is that the NN-based MPC’s efficacy depends on the size and quality of the dataset. A large amount of data from different road conditions is required for it to be highly efficient. Moreover, similar vehicles do not exhibit identical dynamics and can vary in different aspects. For example, some vehicle parameters change during the vehicle lifetime. A fixed vehicle model may not be an ideal solution in this case. To address this issue, the adaptive NN approach is proposed. In this case, the network weights and biases are adapted at a regular interval when a new set of data from the vehicle is available. This approach continuously changes the network based on the updated data. A mix of old and current information can be used for the adaptation, so the NN does not totally ignore the previous experiences when the new data is available. One of the bottlenecks of this approach is that the network needs a certain amount of data to be available before it provides proper results. An SMPC provides a good solution where the controller uses the nonlinear dynamic model when the prediction accuracy of the adaptive NN model is low.

Besides, from the results in Figs. 18-21, it is apparent that the proposed SMPC is capable of performing the optimal path tracking task. During the initial phase of the task, the controller uses the nonlinear dynamic model until the NN-model is ready. Then, the controller switches to the NN model. The performance of the proposed SMPC has been evaluated for different road surface friction conditions. From the tracking accuracy comparison shown in Fig. 21, it is clear that the adaptive NN-based SMPC reflects significantly superior performance than the conventional MPCs.

Another important performance criterion of MPC is the computational cost. A complex model may increase accuracy; however, it may not be fast enough for real-time operation. From the results of Fig. 22, it is evident that


the proposed NN-based MPC is capable of real-time operation with a step size of dt = 0.033 s. It provides a more accurate prediction performance without imposing a significant increase in the computational cost of online optimisation. Note that the online NN can be operated in parallel with the MPC and is therefore not considered a part of the MPC task.

FIGURE 22. MPC optimisation solution time range for different controllers a) LMPC b) NMPC c) NN-MPC.

For future work, the proposed approach will be implemented with more parameter variations and road conditions. In addition, this approach will be integrated with our previous works [55] on learning-based MPCs.

VII. CONCLUSION
The Path Tracking Controller (PTC) is an integral subsystem of Autonomous Vehicles (AVs), responsible for controlling the vehicle on a predefined reference path. Different approaches have been proposed for designing PTCs; among them, Model Predictive Control (MPC) has been shown to be a capable technique providing inherent robustness against uncertainties. However, an efficient MPC relies on the proper choice of the vehicle model used to predict future states. Moreover, a proper balance between complexity and accuracy is essential to have acceptable performance for the MPC. A simplified vehicle model can perform well on some operating conditions; however, due to unmodelled dynamics of the vehicle, the MPC’s performance may degrade for other conditions. It is noteworthy that an overly complex model may not be suitable for the real-time optimisation requirement of the MPC.

Learning the vehicle dynamics from the vehicle operation data can provide a highly efficient alternative with a proper trade-off between accuracy and complexity. This work proposes learning the dynamics of a vehicle using a Neural Network (NN). An NN-based vehicle model approach has the potential to 1) provide a balanced performance in terms of accuracy and complexity, 2) reduce the effect of unmodelled dynamics by providing more accurate predictions without significantly increasing the complexity of the model, 3) accommodate approximation of nonlinearities of the model, 4) estimate latent system states, and 5) identify the system’s internal representation of time-varying dynamics if properly trained with state and input history.

Here, two approaches for MPC with an NN-based vehicle model have been proposed. In the first approach, the NN model was trained offline. The dataset for training was collected by driving the simulated vehicle on various road conditions and in the presence of parameter variations. In the second approach, an adaptive NN model was used where no data from the vehicle was required before starting the operation. It has been observed that this latter approach requires a certain amount of data and a number of adaptation iterations before it performs better than a conventional analytical nonlinear dynamic model. To circumvent this, a Switched MPC (SMPC) was designed to switch to the NN-model when it outperforms the nonlinear dynamic model.

From the outcomes of the work, the MPC with the offline trained NN model outperforms the conventional MPC when the performance of the controllers is evaluated on different road conditions and in the presence of parameter variations. It is noted that this scheme requires a large dataset to be efficient in dynamic operating conditions. Furthermore, it has been observed that the SMPC with an adaptive NN-based model provides significantly superior performance compared to the proposed MPC with offline NN-model and the conventional MPC.

REFERENCES
[1] B. Paden, M. Cáp, S. Z. Yong, D. Yershov, and E. Frazzoli, "A survey of motion planning and control techniques for self-driving urban vehicles," IEEE Trans. Intell. Veh., vol. 1, no. 1, pp. 33–55, Mar. 2016.
[2] J. M. Snider, "Automatic steering methods for autonomous automobile path tracking," Robot. Inst., Pittsburgh, PA, USA, Tech. Rep. CMU-RITR-09-08, 2009.
[3] F. Kuhne, W. F. Lages, and J. G. da Silva, Jr., "Model predictive control of a mobile robot using linearization," in Proc. Mechatronics Robot., 2004, pp. 525–530.
[4] S. G. Vougioukas, "Reactive trajectory tracking for mobile robots based on non linear model predictive control," in Proc. IEEE Int. Conf. Robot. Automat., Apr. 2007, pp. 3074–3079.
[5] I. Batkovic, M. Zanon, M. Ali, and P. Falcone, "Real-time constrained trajectory planning and vehicle control for proactive autonomous driving with road users," in Proc. 18th Eur. Control Conf. (ECC), Jun. 2019, pp. 256–262.
[6] M. Rokonuzzaman, N. Mohajer, S. Nahavandi, and S. Mohamed, "Review and performance evaluation of path tracking controllers of autonomous vehicles," IET Intell. Transp. Syst., vol. 15, no. 5, pp. 646–670, May 2021.
[7] S. F. Campbell, "Steering control of an autonomous ground vehicle with application to the DARPA urban challenge," Ph.D. dissertation, Dept. Mech. Eng., Massachusetts Inst. Technol., Cambridge, MA, USA, 2007. [Online]. Available: http://dspace.mit.edu/handle/1721.1/42301
[8] S. Thrun, M. Montemerlo, and H. Dahlkamp, "Stanley: The robot that won the DARPA grand challenge," in Proc. Grand Challenge, Great Robot Race. Berlin, Germany: Springer, 2007, pp. 1–43.
[9] E. Alcala, V. Puig, J. Quevedo, T. Escobet, and R. Comasolivas, "Autonomous vehicle control using a kinematic Lyapunov-based technique with LQR-LMI tuning," Control Eng. Pract., vol. 73, pp. 1–12, Apr. 2018.
[10] A. De Luca, G. Oriolo, and C. Samson, "Feedback control of a nonholonomic car-like robot," in Robot Motion Planning and Control (Lecture Notes in Control and Information Sciences). Berlin, Germany: Springer, 1998, pp. 171–253.


[11] D. Chwa, "Sliding-mode tracking control of nonholonomic wheeled mobile robots in polar coordinates," IEEE Trans. Control Syst. Technol., vol. 12, no. 4, pp. 637–644, Jul. 2004.
[12] L. Li, G. Jia, J. Chen, H. Zhu, D. Cao, and J. Song, "A novel vehicle dynamics stability control algorithm based on the hierarchical strategy with constrain of nonlinear tyre forces," Veh. Syst. Dyn., vol. 53, no. 8, pp. 1093–1116, Aug. 2015.
[13] K. D. Do, Z. P. Jiang, and J. Pan, "Simultaneous tracking and stabilization of mobile robots: An adaptive approach," IEEE Trans. Autom. Control, vol. 49, no. 7, pp. 1147–1151, Jul. 2004.
[14] P. Falcone, F. Borrelli, J. Asgari, H. E. Tseng, and D. Hrovat, "Predictive active steering control for autonomous vehicle systems," IEEE Trans. Control Syst. Technol., vol. 15, no. 3, pp. 566–580, May 2007.
[15] P. Falcone, H. E. Tseng, F. Borrelli, J. Asgari, and D. Hrovat, "MPC-based yaw and lateral stabilisation via active front steering and braking," Veh. Syst. Dyn., vol. 46, pp. 611–628, Sep. 2008.
[16] B. Gutjahr, L. Gröll, and M. Werling, "Lateral vehicle trajectory optimization using constrained linear time-varying MPC," IEEE Trans. Intell. Transp. Syst., vol. 18, no. 6, pp. 1586–1595, Jun. 2017.
[17] N. Mohajer, S. Nahavandi, H. Abdi, and Z. Najdovski, "Enhancing passenger comfort in autonomous vehicles through vehicle handling analysis and optimization," IEEE Intell. Transp. Syst. Mag., vol. 13, no. 3, pp. 156–173, Oct. 2021.
[18] A. Katriniok and D. Abel, "LTV-MPC approach for lateral vehicle guidance by front steering at the limits of vehicle dynamics," in Proc. 50th IEEE Conf. Decis. Control Eur. Control Conf. (CDC-ECC), Dec. 2011, pp. 6828–6833.
[19] A. Katriniok, J. P. Maschuw, F. Christen, L. Eckstein, and D. Abel, "Optimal vehicle dynamics control for combined longitudinal and lateral autonomous vehicle guidance," in Proc. Eur. Control Conf. (ECC), Jul. 2013, pp. 974–979.
[32] M. K. Samal, S. Anavatti, and M. Garratt, "Neural network based system identification for autonomous flight of an eagle helicopter," IFAC Proc. Volumes, vol. 41, no. 2, pp. 7421–7426, 2008.
[33] V. A. Akpan and G. D. Hassapis, "Nonlinear model identification and adaptive model predictive control using neural networks," ISA Trans., vol. 50, no. 2, pp. 177–194, Apr. 2011.
[34] P. W. J. van de Ven, T. A. Johansen, A. J. Sørensen, C. Flanagan, and D. Toal, "Neural network augmented identification of underwater vehicle models," IFAC Proc. Volumes, vol. 37, no. 10, pp. 263–268, Jul. 2004.
[35] Z. Yan and J. Wang, "Model predictive control for tracking of underactuated vessels based on recurrent neural networks," IEEE J. Ocean. Eng., vol. 37, no. 4, pp. 717–726, Oct. 2012.
[36] G. M. Zeng, X. S. Qin, L. He, G. H. Huang, H. L. Liu, and Y. P. Lin, "A neural network predictive control system for paper mill wastewater treatment," Eng. Appl. Artif. Intell., vol. 16, no. 2, pp. 121–129, Mar. 2003.
[37] S. Mohanty, "Artificial neural network based system identification and model predictive control of a flotation column," J. Process Control, vol. 19, no. 6, pp. 991–999, Jun. 2009.
[38] A. Grancharova, J. Kocijan, and T. A. Johansen, "Explicit output-feedback nonlinear predictive control based on black-box models," Eng. Appl. Artif. Intell., vol. 24, no. 2, pp. 388–397, Mar. 2011.
[39] S. S. James, S. R. Anderson, and M. D. Lio, "Longitudinal vehicle dynamics: A comparison of physical and data-driven models under large-scale real-world driving conditions," IEEE Access, vol. 8, pp. 73714–73729, 2020.
[40] M. Da Lio, D. Bortoluzzi, and G. P. R. Papini, "Modelling longitudinal vehicle dynamics with neural networks," Vehicle Syst. Dyn., vol. 58, no. 11, pp. 1675–1693, Nov. 2020.
[41] G. Garimella, J. Funke, C. Wang, and M. Kobilarov, "Neural network modeling for steering control of an autonomous vehicle," in Proc. IEEE/RSJ Int. Conf. Intell. Robots Syst. (IROS), Sep. 2017, pp. 2609–2615.
[42] S. J. Rutherford and D. J. Cole, "Modelling nonlinear vehicle dynamics
[20] M. Rokonuzzaman, N. Mohajer, and S. Nahavandi, ‘‘NMPC-based con- with neural networks,’’ Int. J. Vehicle Des., vol. 53, no. 4, p. 260, 2010.
troller for autonomous vehicles considering handling performance,’’ in [43] X. Ji, X. He, C. Lv, Y. Liu, and J. Wu, ‘‘Adaptive-neural-network-based
Proc. 7th Int. Conf. Control, Mechatronics Autom. (ICCMA), Nov. 2019, robust lateral motion control for autonomous vehicle at driving limits,’’
pp. 266–270. Control Eng. Pract., vol. 76, pp. 41–53, Jul. 2018.
[21] J. Liu, P. Jayakumar, J. L. Stein, and T. Ersal, ‘‘A multi-stage optimization [44] H. Taghavifar and S. Rakheja, ‘‘Path-tracking of autonomous vehicles
formulation for MPC-based obstacle avoidance in autonomous vehicles using a novel adaptive robust exponential-like-sliding-mode fuzzy type-
using a LIDAR sensor,’’ in Proc. Dyn. Syst. Control Conf., vol. 2, Oct. 2014, 2 neural network controller,’’ Mech. Syst. Signal Process., vol. 130,
pp. 1–4. pp. 41–55, Sep. 2019.
[22] D. Q. Mayne, E. C. Kerrigan, E. J. van Wyk, and P. Falugi, ‘‘Tube- [45] N. A. Spielberg, M. Brown, N. R. Kapania, J. C. Kegelman, and
based robust nonlinear model predictive control,’’ Int. J. Robust Nonlinear J. C. Gerdes, ‘‘Neural network vehicle models for high-performance auto-
Control, vol. 21, no. 11, pp. 1341–1353, 2011. mated driving,’’ Sci. Robot., vol. 4, no. 28, Mar. 2019, Art. no. eaaw1975.
[23] S. Mata, A. Zubizarreta, and C. Pinto, ‘‘Robust tube-based model predic- [46] A. Nagariya and S. Saripalli, ‘‘An iterative LQR controller for off-road and
tive control for lateral path tracking,’’ IEEE Trans. Intell. Vehicles, vol. 4, on-road vehicles using a neural network dynamics model,’’ in Proc. IEEE
no. 4, pp. 569–577, Dec. 2019. Intell. Vehicles Symp. (IV), Oct. 2020, pp. 1740–1745.
[24] Y. Gao, A. Gray, H. E. Tseng, and F. Borrelli, ‘‘A tube-based robust nonlin- [47] R. N. Jazar, Vehicle Dynamics: Theory Application. New York, NY, USA:
ear predictive control approach to semiautonomous ground vehicles,’’ Veh. Springer, 2014.
Syst. Dyn., vol. 52, no. 6, pp. 802–823, Apr. 2014. [48] R. Rajamani, ‘‘Lateral vehicle dynamics,’’ in Vehicle Dynamics Control.
[25] E. Kayacan, E. Kayacan, H. Ramon, and W. Saeys, ‘‘Robust tube- Boston, MA, USA: Springer, 2012, pp. 15–46.
based decentralized nonlinear model predictive control of an autonomous [49] H. B. Pacejka, Tire Vehicle Dynamics, 3rd ed. Oxford, U.K.:
tractor-trailer system,’’ IEEE/ASME Trans. Mechatronics, vol. 20, no. 1, Butterworth-Heinemann, 2012.
pp. 447–456, Feb. 2015. [50] J. Funke, M. Brown, S. M. Erlien, and J. C. Gerdes, ‘‘Collision avoid-
ance and stabilization for autonomous vehicles in emergency scenar-
[26] P. Hang, X. Xia, G. Chen, and X. Chen, ‘‘Active safety control of auto-
ios,’’ IEEE Trans. Control Syst. Technol., vol. 25, no. 4, pp. 1204–1216,
mated electric vehicles at driving limits: A tube-based MPC approach,’’
Jul. 2017.
IEEE Trans. Transport. Electrific., early access, Jul. 28, 2021, doi:
[51] M. Brown, J. Funke, S. Erlien, and J. C. Gerdes, ‘‘Safe driv-
10.1109/TTE.2021.3100843.
ing envelopes for path tracking in autonomous vehicles,’’ Control
[27] O. Garcia, J. V. Ferreira, and A. M. Neto, ‘‘Design and simulation for path Eng. Pract., vol. 61, pp. 307–316, Apr. 2017. [Online]. Available:
tracking control of a commercial vehicle using MPC,’’ in Proc. Joint Conf. http://www.sciencedirect.com/science/article/pii/S0967066116300831
Robot., SBR-LARS Robot. Symp. Robocontrol, Oct. 2014, pp. 61–66. [52] D. Nguyen and B. Widrow, ‘‘Improving the learning speed of 2-layer neural
[28] E. Kim, J. Kim, and M. Sunwoo, ‘‘Model predictive control strategy networks by choosing initial values of the adaptive weights,’’ in Proc.
for smooth path tracking of autonomous vehicles with steering actua- IJCNN Int. Joint Conf. Neural Netw., Jun. 1990, pp. 21–26.
tor dynamics,’’ Int. J. Automot. Technol., vol. 15, no. 7, pp. 1155–1164, [53] N. Mohajer, M. Rokonuzzaman, D. Nahavandi, S. M. Salaken,
Dec. 2014, doi: 10.1007/s12239-014-0120-9. Z. Najdovski, and S. Nahavandi, ‘‘Effects of road path profiles on
[29] I. Lenz, R. Knepper, and A. Saxena, ‘‘DeepMPC: Learning deep latent autonomous vehicles’ handling behaviour,’’ in Proc. IEEE Int. Syst. Conf.
features for model predictive control,’’ in Proc. Robot., Sci. Syst., Jul. 2015, (SysCon), Apr. 2020, pp. 1–6.
pp. 1–14. [54] Vic Roads Australia. (Aug. 2012). Victorian Speed Limit Review.
[30] S. Bansal, A. K. Akametalu, F. J. Jiang, F. Laine, and C. J. Tomlin, ‘‘Learn- Accessed: Aug. 13, 2021. [Online]. Available: https://www.warrnambool.
ing quadrotor dynamics using neural network for flight control,’’ 2016, vic.gov.au/road-safety
arXiv:1610.05863. [Online]. Available: http://arxiv.org/abs/1610.05863 [55] M. Rokonuzzaman, N. Mohajer, S. Nahavandi, and S. Mohamed,
[31] A. Punjani and P. Abbeel, ‘‘Deep learning helicopter dynamics models,’’ ‘‘Learning-based model predictive control for path tracking control of
in Proc. IEEE Int. Conf. Robot. Automat. (ICRA), Seattle, WA, USA, autonomous vehicle,’’ in Proc. IEEE Int. Conf. Syst., Man, Cybern. (SMC),
May 2015, pp. 3223–3230. Oct. 2020, pp. 2913–2918.




MOHAMMAD ROKONUZZAMAN received the B.Sc. degree in electrical and electronic engineering in 2009, and the M.Sc. degree in space science and technology with the specialization in space robotics and automation from Aalto University, Finland, in 2015. He is currently pursuing the Ph.D. degree with the Institute for Intelligent Systems Research and Innovation (IISRI), Deakin University, Australia. His research interests include control of autonomous vehicles, human effects in autonomous driving, and learning-based control of autonomous and semi-autonomous systems.

NAVID MOHAJER received the B.Eng. degree in mechanical engineering and the M.Sc. degree in mechatronics engineering from the University of Tehran, Iran, in 2009 and 2012, respectively, and the Ph.D. degree in vehicle dynamics and mechanical engineering from Deakin University, Australia, in 2017. He is currently a Researcher with IISRI, Deakin University. His research interests include control and dynamics of autonomous vehicles, mechanical design and analysis of complex systems (kinematics and dynamics), and multibody systems (MBS).

SAEID NAHAVANDI (Fellow, IEEE) received the Ph.D. degree from Durham University, U.K., in 1991. He is currently working as Alfred Deakin Professor, the Pro Vice-Chancellor, the Chair of Engineering, and the Founding Director of the Institute for Intelligent Systems Research and Innovation, Deakin University. He has published over 1000 scientific papers in various international journals and conferences. His research interests include modeling of complex systems, robotics, and haptics.
Prof. Nahavandi is a fellow of Engineers Australia (FIEAust), the Institution of Engineering and Technology (FIET), and the Australian Academy of Technology and Engineering (ATSE). He is the Editor-in-Chief of IEEE Systems, Man, and Cybernetics Magazine, a Senior Editor of IEEE Systems Journal and IEEE Access, and an Associate Editor of IEEE Transactions on Cybernetics.

SHADY MOHAMED received the B.Sc. and M.Sc. degrees in information technology from Cairo University, Giza, Egypt, in 2000 and 2003, respectively, and the Ph.D. degree in control theory from Deakin University, Geelong, VIC, Australia, in 2009. He is currently an Associate Professor with the Institute for Intelligent Systems Research and Innovation (IISRI), Deakin University. His research interests include interdisciplinary research involving signal processing, control theory, human biodynamics, haptics, and medical imaging.
