Received 16 June 2024, accepted 13 August 2024, date of publication 16 August 2024, date of current version 29 August 2024.

Digital Object Identifier 10.1109/ACCESS.2024.3444899

Advancing Robotic Control: Data-Driven Model Predictive Control for a 7-DOF Robotic Manipulator
HAITHAM EL-HUSSIENY 1 , IBRAHIM A. HAMEED 2, (Senior Member, IEEE),
TAMER F. MEGAHED 3,4 , AND AHMED FARES 5,6
1 Department of Mechatronics and Robotics Engineering, Egypt-Japan University of Science and Technology (E-JUST), Alexandria 21934, Egypt
2 Department of ICT and Natural Sciences, Norwegian University of Science and Technology, 6009 Ålesund, Norway
3 Department of Electrical Power Engineering, Egypt-Japan University of Science and Technology (E-JUST), Alexandria 21934, Egypt
4 Electrical Engineering Department, Faculty of Engineering, Mansoura University, Mansoura 35516, Egypt
5 Department of Computer Science and Engineering, Egypt-Japan University of Science and Technology (E-JUST), Alexandria 21934, Egypt
6 Electrical Engineering Department, Faculty of Engineering at Shoubra, Benha University, Cairo 11629, Egypt

Corresponding authors: Haitham El-Hussieny ([email protected]) and Ibrahim A. Hameed ([email protected])

ABSTRACT In this study, we applied deep learning to improve the control of a KUKA LBR4 7 Degrees of
Freedom (DOF) robotic arm. We developed a dynamic model using a comprehensive dataset of joint angles
and actuator torques obtained from pick-and-place operations. This model was incorporated into a Model
Predictive Control (MPC) framework, enabling precise trajectory tracking without the need for traditional
analytical dynamic models. By integrating specific constraints within the MPC, we ensured adherence to
operational and safety standards. Experimental results demonstrate that deep learning models significantly
enhance robotic control, achieving precise trajectory tracking. This approach not only surpasses traditional
control methods in terms of accuracy and efficiency but also opens new avenues for research in robotics,
showcasing the potential of deep learning models in predictive control techniques.

INDEX TERMS Deep learning, model predictive control (MPC), robotic arm, trajectory tracking.

The associate editor coordinating the review of this manuscript and approving it for publication was Min Wang.

2024 The Authors. This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/

VOLUME 12, 2024
H. El-Hussieny et al.: Advancing Robotic Control: Data-Driven MPC for a 7-DOF Robotic Manipulator

I. INTRODUCTION
Robotic manipulation and movement describe how robotic systems engage with and transform their surroundings through meticulous control over their mechanical components [1]. Robotic manipulators are versatile mechanical devices used extensively in various industrial, medical, and research applications due to their precision, efficiency, and adaptability. In manufacturing, they are employed for tasks such as assembly, welding, and material handling, significantly enhancing productivity and safety by performing repetitive and hazardous tasks with high accuracy [2]. In the medical field, robotic manipulators are integral to minimally invasive surgeries, providing surgeons with enhanced precision and control, which leads to reduced patient recovery times and improved surgical outcomes [3]. Additionally, these manipulators play a crucial role in space exploration, where they perform tasks like satellite servicing and assembly of large structures in the harsh environment of space [4]. The versatility and reliability of robotic manipulators make them indispensable tools across various domains.

Feedback control at the joint level of robotic manipulators is essential for ensuring precision, stability, and responsiveness during various tasks [5]. By continuously monitoring and adjusting the joint positions, velocities, and torques based on sensor feedback, feedback control systems can correct errors in real-time and maintain the desired trajectory [6]. This is crucial in applications requiring high accuracy, such as in assembly lines or surgical procedures, where even minor deviations can lead to significant errors or safety hazards. One of the key benefits of joint-level feedback control is its ability to compensate for uncertainties and disturbances. These can include variations in payload, external forces, or changes in the robot's dynamic parameters over time. By providing immediate corrections, feedback control enhances the robustness and reliability of robotic operations [7]. Furthermore, feedback control allows for improved performance in dynamic environments. In tasks where the robot interacts with varying objects or operates in unstructured settings, feedback control ensures that the manipulator can adapt to changes and maintain performance without requiring manual recalibration or intervention [8].
FIGURE 1. A block diagram depicting the proposed data-driven model predictive control (MPC) strategy used to control the KUKA LBR4 robotic manipulator.

Model-based control of robotic manipulators involves developing mathematical models that represent the robot's dynamics and kinematics, which are then used to predict and optimize the robot's behavior in real-time [9], [10], [11]. By incorporating detailed models, controllers can anticipate the effects of control actions, compensate for nonlinearities, and account for interactions between different joints, leading to enhanced performance [12]. This level of control is critical in industries such as aerospace, where precision and reliability are paramount. Moreover, model-based approaches enable the implementation of advanced control strategies, such as Model Predictive Control (MPC), which optimizes control inputs over a future time horizon to achieve desired performance while respecting constraints [13], [14].

However, implementing model-based control presents several challenges. One of the primary difficulties is accurately identifying and modeling the robot's dynamics, which can be highly nonlinear and subject to various uncertainties, such as friction, backlash, and external disturbances [15]. Additionally, real-time computation requirements for complex models can be demanding, necessitating powerful computational resources and efficient algorithms [16]. Another challenge is ensuring robustness to model inaccuracies and parameter variations, which requires sophisticated adaptive or robust control techniques to maintain performance in the presence of uncertainties.

Data-driven methods, particularly neural networks, have shown potential in accurately modeling these nonlinear dynamical effects without relying on exhaustive mathematical modeling [17], [18]. Integrating such models as surrogates in MPC systems could facilitate meeting the real-time control requirements for robotic manipulators. Previous studies utilizing predictive controllers have typically employed one of two approaches: using linear predictive models by linearizing the system around a fixed point [19], or implementing gain scheduling to establish a multi-level controller where each level handles a specific operational mode [20]. While these methods can facilitate real-time operation, they tend to provide limited accuracy in predicting system responses. To improve prediction accuracy, some research has introduced nonlinear predictive models, such as in [21]. However, these models often fail to support real-time operation because solving the nonlinear equations involved requires extensive computation [22]. Meanwhile, alternative control strategies to MPC were employed. These strategies are characterized either by their non-predictive nature, as discussed in [23], or by their avoidance of online optimization, as seen in the works by [24] and [25].

In this study, we present a data-driven dynamic model for a seven Degrees-of-Freedom (DOF) KUKA LBR4 robotic manipulator utilizing deep learning techniques. This model is integrated into the Model Predictive Control (MPC) framework to enhance trajectory tracking, eliminating the need for traditional analytical dynamic models. We developed and validated the deep learning dynamic model using an extensive dataset of joint angles and actuator torques. Furthermore, we performed a theoretical analysis to ensure the stability and feasibility of the deep learning-based MPC. The model was successfully incorporated into the MPC framework with additional constraints to improve operational safety and efficiency. Experimental results demonstrated improved trajectory tracking capabilities, and we discussed the potential implications for future advancements in robotic control systems.

The structure of this paper is as follows: Section II describes the development of the data-driven Model Predictive Control (MPC) system. It covers the construction of the data-driven dynamic model for the 7 DOF robotic manipulator, sets forth the control objectives for the MPC, and includes an analysis of the system's stability. Section III provides experimental results that demonstrate the proposed data-driven MPC's ability to control the robotic manipulator to accurately follow predefined joint values and trajectories while adhering to specific constraints. Finally, Section IV concludes the paper and suggests potential directions for future research.

II. DATA-DRIVEN MODEL PREDICTIVE CONTROL
A. BACKGROUND
Model Predictive Control (MPC) is a multivariable control technique that uses a mathematical or data-driven model to predict the future state of the system being controlled. It then calculates a series of optimal control inputs within defined constraints. MPC fundamentally consists of three main components: the predictive model, the target trajectory, and the controller, which optimizes the outcomes in a rolling manner. The structure of a closed-loop MPC system is depicted in Figure 1. In this figure, $x_r$ represents the desired
state trajectory of the robot, $\tau$ indicates the manipulated torque variables, $x_p$ represents the robot's state predicted from the data-driven model, $x$ denotes the controlled joint state, and $\bar{\tau}$ refers to the sequence of optimized torques.

The model of the 7-DOF robotic manipulator can be described in discrete terms as follows:

$$x(k+1) = f(x(k), \tau(k)) \tag{1}$$

where $x = [q, \dot{q}]^T \in \mathbb{R}^{14}$ represents the joint state, encompassing both joint positions and velocities, $\tau \in \mathbb{R}^{7}$ denotes the joint torques, and $f(\cdot)$ is the system's unknown dynamic function. Given the nonlinear nature of the system, accurately identifying the precise function $f(\cdot)$ that mirrors the robotic behavior is challenging. Additionally, using the nonlinear system dynamics as the MPC prediction model to anticipate the robot's future states based on the sequence of actuation can be computationally exhaustive, making real-time control of the robot difficult. Therefore, the primary objective of this approach is to accurately predict the system's behavior under control and derive optimal control actions. In this research, we employ a data-driven dynamics model as the prediction model in the proposed MPC strategy.

B. DATA-DRIVEN DYNAMIC MODEL
The primary goal of developing a Deep Neural Network (DNN) model is to create a surrogate model that can effectively serve as a predictive model within the proposed MPC strategy for regulating the joint state of the robotic manipulator. Specifically, this effort aims to approximate the joint state $x(k+1)$ at the next time step $k+1$, based on the current joint state $x(k)$ and the applied input torque $\tau(k)$ at the current time step $k$. Thus, the DNN is trained to directly approximate the solution in Eq. (1),

$$\hat{f}(x(k), \tau(k); \theta) \approx f(x(k), \tau(k)) \tag{2}$$

where $\theta$ represents the free parameters of the DNN. This enhances the predictive capability and control precision within robotic manipulator systems by using the DNN model as the MPC prediction model, facilitating real-time decision-making.

A feed-forward shallow neural network, illustrated in Figure 2, functions as the dynamic predictive model for the robotic manipulator. This network includes an input layer that takes in the current joint state $x(k)$, comprising joint positions, velocities, and input torques $\tau$. It has two hidden layers: the first with 128 neurons and the second with 32 neurons, both utilizing the rectified linear unit (ReLU) activation function. The architecture is finalized with an output layer that uses a linear activation function to predict the subsequent joint state $x(k+1)$. In the experiment section, we evaluated different network architectures to assess their impact on prediction performance. Future work could explore using Recurrent Neural Networks (RNN) and Long Short-Term Memory (LSTM) networks to handle time-series data in building the deep learning model.

FIGURE 2. Structure of the DNN feed-forward neural network employed for predicting the joint state in the 7-DOF KUKA LBR4 robotic manipulator.

The dataset from [26] was utilized, consisting of ten trajectories generated during pick and place tasks executed by the KUKA LBR4 robot. The pick and place locations were randomly selected from two non-overlapping areas, each measuring 50 × 50 cm. The robots were considered to have successfully completed a task if they started and finished at the same location. The dataset includes approximately 18,000 samples, containing current joint positions, current joint velocities, applied torques, next joint positions, and next joint velocities. The data was split into 70% for training and 30% for validation.

In developing our deep learning-based MPC, it was essential to seamlessly integrate and efficiently execute the predictive model across varied computational requirements. To achieve this, we used the Open Neural Network Exchange (ONNX) framework [27] for model conversion and interoperability. The initial predictive model, designed and trained using the TensorFlow API, showed promising performance in preliminary experiments. Converting this model from TensorFlow to ONNX involved using the tf2onnx tool to transform the TensorFlow computational graph into an ONNX model file. This streamlined process generally involves specifying the input and output nodes of the model to maintain the integrity of its predictive capabilities after conversion.

C. CONTROL OBJECTIVE
The main objective of the proposed data-driven nonlinear MPC is to stabilize the robotic manipulator by ensuring it follows a predefined reference joint trajectory $x_r = [q_{1r}, q_{2r}, \ldots, q_{7r}]^T$ in joint space. This goal includes accounting for physical constraints, such as joint and actuation limits, when determining the optimal control actions, specifically the applied joint torques. Therefore, the cost function $J$ is formulated to evaluate both tracking performance and the effectiveness of control actions over a prediction horizon $N$, defined as:

$$J(k) = \sum_{j=1}^{N} e_{(k+j)}^T W_1 e_{(k+j)} + \sum_{j=1}^{N} \Delta\tau_{(k+j-1)}^T W_2 \Delta\tau_{(k+j-1)} \tag{3}$$
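A minimal sketch of how the cost in Eq. (3) might be evaluated for one candidate torque sequence, assuming scalar weights $W_1 = w_1 I$ and $W_2 = w_2 I$. The function name `mpc_cost`, the `model` callable that stands in for the trained DNN, and the one-joint toy model in the usage lines are illustrative assumptions, not the authors' implementation.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def mpc_cost(
    x0: Vector,
    tau_prev: Vector,
    tau_seq: Sequence[Vector],
    x_ref_seq: Sequence[Vector],
    model: Callable[[Vector, Vector], Vector],
    w1: float,
    w2: float,
) -> float:
    """Evaluate Eq. (3) over the horizon N = len(tau_seq).

    `model` maps (x(k), tau(k)) to the predicted x(k+1); here it is a
    placeholder for the learned surrogate (assumption).
    """
    cost = 0.0
    x = list(x0)
    prev = list(tau_prev)
    for tau, x_ref in zip(tau_seq, x_ref_seq):
        x = model(x, tau)                               # predicted state x(k+j)
        e = [xi - ri for xi, ri in zip(x, x_ref)]       # tracking error e = x - x_r
        dtau = [t - p for t, p in zip(tau, prev)]       # torque increment
        cost += w1 * sum(v * v for v in e)              # e^T W1 e
        cost += w2 * sum(v * v for v in dtau)           # dtau^T W2 dtau
        prev = list(tau)
    return cost


def toy_model(x: Vector, tau: Vector) -> Vector:
    """Hypothetical one-joint model used only for this example."""
    return [x[0] + 0.1 * tau[0]]


if __name__ == "__main__":
    J = mpc_cost([0.0], [0.0], [[1.0], [1.0]], [[0.1], [0.2]], toy_model, 100.0, 0.01)
    print(J)
```

Passing the model as a callable keeps the cost independent of how the surrogate is stored, so the same function could wrap a TensorFlow or ONNX predictor.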

Here, $e = x - x_r$ denotes the tracking error, while $\Delta\tau$ signifies the predicted change in control input. The matrices $W_1 = w_1 I_7 \geq 0$ and $W_2 = w_2 I_7 \geq 0$ are positive weighting matrices, which are assumed to stay constant throughout the prediction horizon $N$.

The optimal control problem minimizing Eq. (3) is constrained by the physical and actuator limits. The robot's joint values and torques are restricted within the lower and upper bounds $[q_{\min}, q_{\max}]$ and $[\tau_{\min}, \tau_{\max}]$, respectively, as determined from the movement dataset. Additionally, a nonlinear constraint function $g(\cdot)$ related to the robot's state, actuation, or system parameters can be incorporated when searching for the optimal control action:

$$g(x, \tau) \leq 0 \tag{4}$$

From the perspective of supervised learning, the task of determining the optimal control law can be viewed as a nonlinear mapping executed by a single-layer neural network [28]. As a result, Gradient Descent (GD) proves to be a suitable algorithm for this task. Therefore, the sequence of control laws, $\tau(k)$, is updated as follows:

$$\tau(k+1) = \tau(k) + \Delta\tau(k) \tag{5}$$

$$\Delta\tau(k) = \eta\left(-\frac{\partial J(k)}{\partial \tau}\right) \tag{6}$$

where $\eta > 0$ denotes the learning rate for the control sequence. According to [29], the control increment $\Delta\tau(k)$ is defined as:

$$\Delta\tau(k) = \frac{\eta w_1}{1 + \eta w_2}\left(\frac{\partial q}{\partial \tau}\right)^T e \tag{7}$$

In summary, Algorithm 1 delineates the procedure for two primary tasks: First, it details the construction of the surrogate (DNN) dynamic model for the 7-DOF robotic manipulator, as outlined from lines 1 to 5. Second, it describes the implementation of the data-driven model predictive control (MPC) for precise tracking of the reference joint positions, covered from lines 6 to 12.

Algorithm 1 Data-Driven MPC
Require:
  $(q_k, \tau_k, q_{k+1}), \forall k = [0, M]$: DNN training data
  $q_r$: Desired joint values (reference trajectory)
  $q_0, \tau_0$: Initial joints and torque
  $U_q, U_\tau$: Bounds of joints and torques
  $g(q, \tau)$: Nonlinear constraint function
  $W_1, W_2$: Weighting matrices for the cost function
  $N$: Prediction horizon
Ensure:
  $\tau_k^*$: Optimal sequence of control inputs (torques) over the prediction horizon
1: $\theta_k = \theta_0$ ▷ Initialize DNN model with random weights
2: For $k = 0$ to $M - 1$:
3:   $\hat{q}_{k+1} = f_\theta(q_k, \tau_k)$ ▷ Predict the next state
4:   $L = \mathrm{MSE}(\hat{q}_{k+1}, q_{k+1})$ ▷ Compute the loss
5:   $\theta_{k+1} \leftarrow \theta_k - \alpha \frac{\partial L}{\partial \theta}$ ▷ Update DNN parameters
6: $q_k = q_0$ ▷ Initial robot's joints
7: For $k = 0$ to $M - 1$:
8:   $e_k = \|q_{r_k} - q_k\|^2_{W_1}$ ▷ Compute the error
9:   $J_k = \sum_{j=1}^{N} e + \|\Delta\tau\|^2_{W_2}$ ▷ Compute cost
10:  $\bar{\tau} \leftarrow \mathrm{IPOPT}(J_k, f_\theta, g, e, U_q, U_\tau)$ ▷ Solve the optimization problem
11:  $\tau_k^* \leftarrow \bar{\tau}[0]$ ▷ Apply the first control input
12:  $q_{k+1} \leftarrow f_\theta(q_k, \tau_k^*)$ ▷ Get the next state

D. LYAPUNOV STABILITY ANALYSIS
To illustrate the stability of the system, a quadratic Lyapunov function expressed in terms of the tracking error is chosen as follows:

$$V(e) = \frac{1}{2} e^T e \tag{8}$$

where $e = x_r - x$. To ensure the global asymptotic stability of the system, the first derivative of $V(e)$ with respect to time must be negative definite, which implies that $e$ converges exponentially to zero.

The first time derivative of $V(e)$ is calculated as follows:

$$\dot{V}(e) = e^T \dot{e} \tag{9}$$

Substituting the value of $\dot{e}$ into Eq. (9) yields:

$$\begin{aligned}
\dot{V}(e) &= e^T (\dot{x}_r - \dot{x}) \\
&= e^T \left(\frac{\partial x_r}{\partial \tau} - \frac{\partial x}{\partial \tau}\right) \frac{\partial \tau}{\partial t} \\
&= -e^T \left(\frac{\partial x}{\partial \tau} \frac{\partial \tau}{\partial t}\right) \\
&\approx -e^T \left(\frac{\partial x}{\partial \tau}\right) \Delta\tau(k)
\end{aligned} \tag{10}$$

By substituting $\Delta\tau(k)$ in Eq. (7) into Eq. (10):

$$\begin{aligned}
\dot{V}(e) &= -e^T \left(\frac{\partial x}{\partial \tau}\right) \left\{ \frac{\eta w_1}{1 + \eta w_2} \left(\frac{\partial x}{\partial \tau}\right)^T e \right\} \\
&= -\frac{\eta w_1}{1 + \eta w_2} \left(\left(\frac{\partial x}{\partial \tau}\right)^T e\right)^T \left(\left(\frac{\partial x}{\partial \tau}\right)^T e\right)
\end{aligned} \tag{11}$$

Given that $\dot{V}(e) < 0$, following the principles of Lyapunov stability theory, we can conclude that the proposed control strategy is stable.

III. RESULTS AND DISCUSSION
A. PERFORMANCE OF THE DNN PREDICTION MODEL
In our initial experiments, we investigated how well our DNN-based model performs in predicting the movements of the 7 DOF robotic manipulator, as part of the MPC system.
We conducted both training and prediction tasks with the TensorFlow 2.x API on a standard desktop computer. The choice of computer hardware, especially the CPU's clock speed of 2.8 GHz, significantly affected how long it took to train our model.

To evaluate the performance of the proposed DNN regression model, we monitored the training and validation loss over the course of 100 epochs. The training process aimed to minimize the Mean Squared Error (MSE) between the predicted and actual future joint states. Figure 3 shows the training and validation loss curves. The x-axis represents the number of epochs, and the y-axis represents the loss (MSE) value. The training loss (blue curve) and the validation loss (red curve) both exhibit a decreasing trend, indicating that the model is learning effectively and generalizing well to the validation data.

FIGURE 3. Training and validation loss of the DNN regression model over 100 epochs. The training loss decreases steadily, while the validation loss shows a similar downward trend, suggesting good generalization.

The training curve indicates that the model achieves a lower error rate as the number of epochs increases, with both the training and validation losses converging towards lower values. This convergence suggests that the model is not overfitting, as there is no significant divergence between the training and validation losses. Further, the model's performance on the validation set is critical for assessing its ability to generalize to unseen data. The consistent decrease in validation loss confirms that the DNN model is capable of effectively learning the underlying patterns in the robot's dynamics, thereby making accurate predictions on new data.

To assess the performance of our DNN model, we tested it on a dataset containing 1800 samples. Figure 4 illustrates the comparison between the model's predictions and the actual values. For brevity, we present results only for the first joint, displaying the predicted next joint positions and velocities. The close alignment between the predicted outputs and the actual data demonstrates the minimal discrepancy, highlighting the model's accuracy. However, the noise present in the velocity data, particularly during position transitions, can adversely impact the system behavior predictions and consequently the control performance. This indicates that pre-processing the dataset should be considered in the future before using it to build the DNN regression model.

FIGURE 4. Comparison between actual next joint states $x(k+1)$ and estimated joint states $\hat{x}(k+1)$ using the proposed DNN-based prediction model, showing the first joint values on the left and the joint velocities on the right.

B. POINT STABILIZATION PREDICTIVE CONTROL
In this study, we aim to evaluate the performance of the proposed data-driven MPC controller in point stabilization tasks for the KUKA LBR4 robotic manipulator. This MPC controller was developed using the do-mpc framework [30]. The goal is to command the robot to reach fixed targets in joint space $q_r \in \mathbb{R}^7$, with these targets being updated every 500 samples. The experiment was conducted with a sampling time of $T = 0.05$ seconds and a prediction horizon of $N = 10$. For the optimization cost function, we used diagonal state and input weighting matrices, with $W_1 = 100 I_7$ to emphasize the importance of accurately reaching the target joint positions and $W_2 = 0.01 I_7$ to slightly penalize the input torques needed to achieve these positions.

This setup provides a rigorous test of the controller's ability to adapt to changing targets and maintain stability in the joint positions. The MPC controller uses the data-driven model to predict future states and optimize control inputs accordingly. The model was trained on a dataset encompassing a wide range of joint configurations and corresponding torques to ensure robust performance across various scenarios. To assess the controller's effectiveness, we used tracking error as the performance metric. Tracking error was measured as the difference between the desired and actual joint positions and is visualized through plots showing joint positions over time for multiple cycles of 500 samples each. These plots, illustrated in Figure 5, highlight the transitions between different targets and the controller's ability to follow the desired trajectories closely, together with the applied joint torques.

The results indicate that the data-driven MPC controller effectively tracks the desired joint trajectories with minimal error. The use of a data-driven model in the MPC framework proved beneficial, offering accurate predictions and efficient control. Meanwhile, the data-driven controller provides a nonlinear control approach while maintaining the real-time capability of the controller. Extending the approach to more dynamic goals could also provide valuable insights.

FIGURE 5. Results of point stabilization scenarios to evaluate the proposed deep learning-based MPC for reaching predefined joint references.

C. TRAJECTORY TRACKING PREDICTIVE CONTROL
To assess the performance of the proposed data-driven MPC controller in trajectory tracking, we utilized a reference trajectory. This trajectory is particularly useful in scenarios where the robotic manipulator must follow a specific sequence of goals, such as in pick-and-place applications. Specifically, the reference trajectory, denoted as $q_r \in \mathbb{R}^7$, consists of a series of desired joint values.

The robot starts with initial joint values $q_0 = [0.26, 0.24, -0.34, 1.76, -0.07, -1.61, -1.666]^T$ (rad). The controller's time step is set to $t = 0.01$ seconds, with a prediction horizon of $N = 12$, and the total simulation time is 18 seconds. The weighting matrices are chosen as $W_1 = 100$ and $W_2 = 0.01$. The performance of the MPC is depicted in Figure 6, demonstrating the MPC's effectiveness in minimizing the difference between the measured and reference trajectories. The Mean Squared Error (MSE) between the reference trajectory and the actual robot trajectory is calculated to be $2 \times 10^{-4}$ (in radians squared).

FIGURE 6. Results of trajectory tracking scenarios evaluating the proposed deep learning-based MPC in following reference joint trajectories.

Furthermore, Figure 7 illustrates the performance of the data-driven MPC in tracking a reference joint and the corresponding applied torque, considering a nonlinear inequality constraint on the robot's joint $(q_1 - 1)^2 \leq 0$. This constraint acts as a limitation on the first joint within the joint space. While this constraint does not significantly impact the overall system, it has been included to test the algorithm's robustness. The results demonstrate that the proposed MPC achieves satisfactory tracking performance while adhering to this constraint and effectively managing constraints on the other joints.

D. COMPARISON WITH PID CONTROL
In this simulation experiment, we compare the performance of our data-driven MPC controller with a PID independent joint control approach. The PID control loops were applied to the robot model obtained using a deep learning model trained from data. The PID gains were determined through trial and error. To evaluate the controller, we used tracking error, measuring the difference between the desired and actual joint positions. Within this PID controller, the torque $\tau_i$ for each joint is determined individually in the following manner:

$$\tau_i = K_p e_i + K_d \frac{de_i}{dt} + K_i \int e_i \, dt \tag{12}$$

In contrast, the PID independent joint control approach yielded poor results. Despite the effort to tune the PID gains, the controller struggled to maintain stability. The tracking error was significantly higher than that of the MPC. More concerningly, the applied torques exhibited signs of instability, with significant amplifications when the reference joints changed.
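For reference, Eq. (12) could be discretized as in the sketch below, using a rectangular rule for the integral and a backward difference for the derivative. The class name `JointPID`, the helper `independent_joint_pid`, and the gains in the usage lines are hypothetical; the paper does not report the gains or the discretization it used.

```python
from typing import List, Optional, Sequence


class JointPID:
    """Discrete-time PID for one joint, following the structure of Eq. (12)."""

    def __init__(self, kp: float, ki: float, kd: float, dt: float) -> None:
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error: Optional[float] = None

    def torque(self, q_ref: float, q: float) -> float:
        e = q_ref - q                                   # desired minus actual position
        self.integral += e * self.dt                    # rectangular integration
        # Backward-difference derivative; zero on the first call (no history yet).
        de = 0.0 if self.prev_error is None else (e - self.prev_error) / self.dt
        self.prev_error = e
        return self.kp * e + self.kd * de + self.ki * self.integral


def independent_joint_pid(controllers: Sequence[JointPID],
                          q_ref: Sequence[float],
                          q: Sequence[float]) -> List[float]:
    """Compute each joint torque separately, ignoring coupling between joints."""
    return [c.torque(r, x) for c, r, x in zip(controllers, q_ref, q)]


if __name__ == "__main__":
    # Illustrative gains only; seven identical loops for a 7-DOF arm.
    loops = [JointPID(kp=2.0, ki=1.0, kd=0.5, dt=0.01) for _ in range(7)]
    print(independent_joint_pid(loops, [0.1] * 7, [0.0] * 7))
```

Because each loop sees only its own joint error, this structure has no way to anticipate coupling between joints or upcoming reference changes, which is the limitation the comparison in this section points to.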

FIGURE 7. The results of the trajectory following experiment, highlighting the joint constraint on $q_1$ in the red area.

Figure 8 shows the tracking performance and applied torques for the PID controller. The plots reveal substantial deviations from the desired trajectories and unstable torque values, indicating that the PID control was not effective for this application.

FIGURE 8. Performance of the PID tracking controller in following the reference joint trajectories.

The comparison highlights the advantages of the data-driven MPC approach over PID independent joint control for the KUKA LBR4 robotic manipulator. The MPC controller's ability to predict and optimize future states based on the data-driven model provided superior tracking accuracy and stability. In contrast, the PID controller, even with carefully tuned gains, could not achieve the same level of performance and exhibited instability in the applied torques.

The superior performance of the data-driven MPC can be attributed to its predictive capabilities and optimization framework, which allow it to handle the nonlinear dynamics of the robotic system more effectively. The PID controller's poor performance underscores the challenges of tuning PID gains for complex, nonlinear systems and highlights the limitations of relying solely on feedback control without predictive modeling.

E. DNN MODEL SELECTION
We conducted a thorough evaluation of the proposed DNN model for the robotic manipulator using the K-Fold Cross-Validation technique, as outlined by Anguita et al. [31]. This approach is crucial for rigorously assessing the model's predictive capability under various architectural designs, ensuring its effectiveness and reliability. The dataset was divided into five distinct subsets, enabling a cyclic process of training and evaluation. This comprehensive evaluation provides valuable insights into the model's performance across different conditions, helping to identify the most effective architectural design and demonstrating the model's ability to generalize to real-world scenarios.

In this study, we developed a set of five deep neural network (DNN) models, all utilizing feed-forward architectures. These models vary in complexity, with trainable parameters ranging from 594 to 47,694. Each model consists of three fully connected layers, differentiated by the number of hidden units, as detailed in Table 1. The layers are denoted with an 'F' and a subscript indicating the number of neurons in each. The primary activation function used is the hyperbolic tangent (tanh), except for the output layer, which employs a linear activation function.

These models were trained over 100 epochs using the Adam optimizer with mean squared error (MSE) as the primary loss metric. To ensure comprehensive evaluation and validation, a 5-fold cross-validation method was employed. The convergence trends of the learning algorithm, encompassing all network configurations and validation folds, are depicted in Figure 9-a. Additionally, Figure 9-b provides a detailed analysis of the average MSE losses and their standard deviations across different model architectures. Notably, networks with a higher parameter count are highlighted in dark red, indicating that larger networks tend to converge more effectively towards lower MSE values. To simplify the selection process, we chose a neural network that performed well during both training and testing phases. Exploring architectures that are both compact and capable of
rapid learning could potentially enhance the overall system’s robustness.

FIGURE 9. (a) Training losses expressed as mean squared error (MSE) for five deep neural network (DNN) models during 5-fold cross-validation, and (b) mean and standard deviations.

TABLE 1. Chosen architectures for the 5-fold cross-validation experiment.

IV. CONCLUSION
To address the challenges of constrained nonlinear joint control for a seven Degrees of Freedom (DoF) robotic manipulator, this study introduces a new approach based on a data-driven nonlinear Model Predictive Control (MPC) method. Initially, a data-driven predictive model was developed, capable of forecasting future joint positions based on current joint values and applied torques. This model was then integrated within an MPC framework, employing an online optimization problem with process constraints to determine the optimal control torques for accurate point stabilization and joint trajectory tracking of the KUKA LBR4 robotic manipulator.

In conclusion, the data-driven MPC controller exhibited strong performance in terms of tracking accuracy, handling of joint and actuator constraints, and adaptation to changing targets. The data-driven MPC outperformed the PID independent joint control in tracking reference joint trajectories for the KUKA LBR4 robotic manipulator. The MPC approach provided better tracking accuracy, consistent stabilization times, and stable control efforts, demonstrating its effectiveness and reliability for robotic control applications.

Furthermore, a K-fold cross-validation was conducted to select the optimal model architecture, ensuring robust performance and generalization across different data splits. A promising avenue for future research is the development of data-driven Moving Horizon Estimation (MHE) [32] along with the MPC, potentially relaxing the assumption of full state observability made in this work. Such advancements promise to further refine the precision and applicability of MPC in robotic manipulator control, contributing valuable insights into the integration of deep learning techniques within complex control systems.

REFERENCES
[1] B. Siciliano, O. Khatib, and T. Kröger, Springer Handbook of Robotics, vol. 200. Germany: Springer, 2008.
[2] S. Kalpakjian and S. R. Schmid, Manufacturing Engineering and Technology. London, U.K.: Prentice-Hall, 2009, pp. 568–571.
[3] R. H. Taylor and D. Stoianovici, “Medical robotics in computer-integrated surgery,” IEEE Trans. Robot. Autom., vol. 19, no. 5, pp. 765–781, Oct. 2003.
[4] D. Arney, R. Sutherland, J. Mulvaney, D. Steinkoenig, C. Stockdale, and M. Farley, “On-orbit servicing, assembly, and manufacturing (OSAM) state of play,” Edition-NASA Tech. Rep. Server (NTRS), White Paper 20210022660, 2021. Accessed: May 1, 2023. [Online]. Available: https://ntrs.nasa.gov/api/citations/20210022660/downloads/osam_state_of_play%20(1).pdf
[5] C. C. Cheah, S. Kawamura, and S. Arimoto, “Feedback control for robotic manipulator with uncertain kinematics and dynamics,” in Proc. IEEE Int. Conf. Robot. Autom., vol. 4, May 1998, pp. 3607–3612.
[6] L. Sciavicco and B. Siciliano, Modelling and Control of Robot Manipulators. Germany: Springer, 2012.
[7] J. Son, H. Kang, and S. H. Kang, “A review on robust control of robot manipulators for future manufacturing,” Int. J. Precis. Eng. Manuf., vol. 24, no. 6, pp. 1083–1102, Jun. 2023.
[8] P. D. Nguyen, N. H. Nguyen, and H. T. Nguyen, “Adaptive control for manipulators with model uncertainty and input disturbance,” Int. J. Dyn. Control, vol. 11, no. 5, pp. 2285–2294, Oct. 2023.
[9] L. Sciavicco and B. Siciliano, Modeling and Control of Robot Manipulators. Germany: Springer, 2012.
[10] M. W. Spong, S. Hutchinson, and M. Vidyasagar, Robot Modeling and Control. Hoboken, NJ, USA: Wiley, 2006.
[11] W. Khalil and E. Dombre, Modeling, Identification and Control of Robots. London, U.K.: Butterworth, 2002.
[12] M. Ruderman, F. Hoffmann, and T. Bertram, “Modeling and identification of elastic robot joints with hysteresis and backlash,” IEEE Trans. Ind. Electron., vol. 56, no. 10, pp. 3840–3847, 2009.
[13] R. Fareh, S. Khadraoui, M. Y. Abdallah, M. Baziyad, and M. Bettayeb, “Active disturbance rejection control for robotic systems: A review,” Mechatronics, vol. 80, 2021, Art. no. 102671.
[14] V. Bargsten, P. Zometa, and R. Findeisen, “Modeling, parameter identification and model-based control of a lightweight robotic manipulator,” in Proc. IEEE Int. Conf. Control Appl., 2013, pp. 134–139.
[15] T. W. Yang, W. L. Xu, and J. D. Han, “Dynamic compensation control of flexible macro–micro manipulator systems,” IEEE Trans. Control Syst. Technol., vol. 18, no. 1, pp. 143–151, 2009.
[16] F. Aghili, “Adaptive control of manipulators forming closed kinematic chain with inaccurate kinematic model,” IEEE/ASME Trans. Mechtron., vol. 18, no. 5, pp. 1544–1554, 2012.
[17] T. Salzmann, E. Kaufmann, J. Arrizabalaga, M. Pavone, D. Scaramuzza, and M. Ryll, “Real-time neural MPC: Deep learning model predictive control for quadrotors and agile robotic platforms,” IEEE Robot. Autom. Lett., vol. 8, no. 4, pp. 2397–2404, Apr. 2023.
[18] H. El-Hussieny, I. A. Hameed, and A. A. Nada, “Deep CNN-based static modeling of soft robots utilizing absolute nodal coordinate formulation,” Biomimetics, vol. 8, no. 8, p. 611, Dec. 2023.
[19] P.-B. Wieber, “Trajectory free linear model predictive control for stable walking in the presence of strong perturbations,” in Proc. 6th IEEE-RAS Int. Conf. Humanoid Robots, Dec. 2006, pp. 137–142.
[20] H. Li, R. J. Frei, and P. M. Wensing, “Model hierarchy predictive control of robotic systems,” IEEE Robot. Autom. Lett., vol. 6, no. 2, pp. 3373–3380, Apr. 2021.
[21] G. García, R. Griffin, and J. Pratt, “MPC-based locomotion control of bipedal robots with line-feet contact using centroidal dynamics,” in Proc. IEEE-RAS 20th Int. Conf. Humanoid Robots (Humanoids), Jul. 2021, pp. 276–282.
[22] T. Ohtsuka and K. Ozaki, “Practical issues in nonlinear model predictive control: Real-time optimization and systematic tuning,” in Nonlinear Model Predictive Control: Towards New Challenging Applications. Berlin, Germany: Springer, 2009, pp. 447–460.


[23] T. Akbas, S. E. Eskimez, S. Ozel, O. K. Adak, K. C. Fidan, and K. Erbatur, “Zero moment point based pace reference generation for quadruped robots via preview control,” in Proc. 12th IEEE Int. Workshop Adv. Motion Control (AMC), Mar. 2012, pp. 1–7.
[24] S. Kolathaya, “Local stability of PD controlled bipedal walking robots,” Automatica, vol. 114, Apr. 2020, Art. no. 108841.
[25] M. Sombolestan, Y. Chen, and Q. Nguyen, “Adaptive force-based control for legged robots,” in Proc. IEEE/RSJ Int. Conf. Intell. Robots Syst. (IROS), Sep. 2021, pp. 7440–7447.
[26] A. S. Polydoros and L. Nalpantidis, “A reservoir computing approach for learning forward dynamics of industrial manipulators,” in Proc. IEEE/RSJ Int. Conf. Intell. Robots Syst. (IROS), Oct. 2016, pp. 612–618.
[27] J. Bai, F. Lu, and K. Zhang. (2019). ONNX: Open Neural Network Exchange. [Online]. Available: https://github.com/onnx/onnx
[28] H.-G. Han, X.-L. Wu, and J.-F. Qiao, “Real-time model predictive control using a self-organizing neural network,” IEEE Trans. Neural Netw. Learn. Syst., vol. 24, no. 9, pp. 1425–1436, Sep. 2013.
[29] G. Wang, Q.-S. Jia, J. Qiao, J. Bi, and M. Zhou, “Deep learning-based model predictive control for continuous stirred-tank reactor system,” IEEE Trans. Neural Netw. Learn. Syst., vol. 32, no. 8, pp. 3643–3652, Aug. 2021.
[30] F. Fiedler, B. Karg, L. Lüken, D. Brandner, M. Heinlein, F. Brabender, and S. Lucia, “do-mpc: Towards FAIR nonlinear and robust model predictive control,” Control Eng. Pract., vol. 140, Nov. 2023, Art. no. 105676.
[31] D. Anguita, A. Ghio, S. Ridella, and D. Sterpi, “K-fold cross validation for error rate estimate in support vector machines,” in Proc. DMIN, 2009, pp. 291–297.
[32] H. El-Hussieny, I. A. Hameed, and A. B. Zaky, “Plant-inspired soft growing robots: A control approach using nonlinear model predictive techniques,” Appl. Sci., vol. 13, no. 4, p. 2601, Feb. 2023.

HAITHAM EL-HUSSIENY received the M.Sc. and Ph.D. degrees in mechatronics and robotics engineering from Egypt-Japan University of Science and Technology (E-JUST), Alexandria, Egypt, in 2013 and 2016, respectively. He is currently an Associate Professor of robotics and artificial intelligence with E-JUST. His research interests include robotics, soft robotics, haptics, teleoperation, data-driven control, and applied intelligence.

IBRAHIM A. HAMEED (Senior Member, IEEE) received the Ph.D. degree in industrial systems and information engineering from Korea University, Seoul, South Korea, and the Ph.D. degree in mechanical engineering from Aarhus University, Aarhus, Denmark. He is currently a Professor with the Department of ICT and Natural Sciences, Faculty of Information Technology and Electrical Engineering, Norwegian University of Science and Technology (NTNU), Norway. He is the Deputy Head of research and innovation within the Department of ICT and Natural Sciences, Faculty of Information Technology and Electrical Engineering, NTNU. His current research interests include artificial intelligence, machine learning, optimization, and robotics. He is elected as the Chair of the IEEE Computational Intelligence Society (CIS) Norway Section.

TAMER F. MEGAHED received the B.Sc., M.Sc., and Ph.D. degrees in electrical engineering from Mansoura University, Mansoura, Egypt, in 2006, 2010, and 2015, respectively. Currently, he is an Associate Professor and the Chairperson of the Electrical Engineering Department, Egypt-Japan University of Science and Technology (E-JUST). Also, he has four patents in magnetic refrigeration, designed the IoT systems for monitoring electricity consumption, thrust-vector-control model rocket design, and magnetic gear. His research interests include power control, renewable energy sources, smart grid, electric vehicles, wireless charging, energy storage control and management, vector control, power system protection, model predictive control, and green hydrogen.

AHMED FARES received the Ph.D. degree in computer science and engineering from Egypt-Japan University of Science and Technology (E-JUST), Alexandria, Egypt, in May 2015. He is currently an Associate Professor with the Department of Computer Science and Engineering, E-JUST. He is also an Associate Professor with Shoubra Faculty of Engineering, Benha University (on-leave). His research interests include multimedia content analysis, cognitive science, psychological and brain science, and machine learning.
