Artificial Intelligence For Robotics and Autonomous Systems Applications
Studies in Computational Intelligence
Volume 1093
Series Editor
Janusz Kacprzyk, Polish Academy of Sciences, Warsaw, Poland
The series “Studies in Computational Intelligence” (SCI) publishes new develop-
ments and advances in the various areas of computational intelligence—quickly and
with a high quality. The intent is to cover the theory, applications, and design methods
of computational intelligence, as embedded in the fields of engineering, computer
science, physics and life sciences, as well as the methodologies behind them. The
series contains monographs, lecture notes and edited volumes in computational
intelligence spanning the areas of neural networks, connectionist systems, genetic
algorithms, evolutionary computation, artificial intelligence, cellular automata, self-
organizing systems, soft computing, fuzzy systems, and hybrid intelligent systems.
Of particular value to both the contributors and the readership are the short publica-
tion timeframe and the world-wide distribution, which enable both wide and rapid
dissemination of research output.
Indexed by SCOPUS, DBLP, WTI Frankfurt eG, zbMATH, SCImago.
All books published in the series are submitted for consideration in Web of Science.
Ahmad Taher Azar · Anis Koubaa
Editors
Artificial Intelligence for Robotics and Autonomous Systems Applications
Editors
Ahmad Taher Azar
College of Computer and Information Sciences, Prince Sultan University, Riyadh, Saudi Arabia
Automated Systems and Soft Computing Lab (ASSCL), Prince Sultan University, Riyadh, Saudi Arabia
Faculty of Computers and Artificial Intelligence, Benha University, Benha, Egypt
Anis Koubaa
College of Computer and Information Sciences, Prince Sultan University, Riyadh, Saudi Arabia
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature
Switzerland AG 2023
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
Robotics, autonomous control systems, and artificial intelligence technology are all
examples of the Fourth Industrial Revolution. To advance the age of the autonomous car, artificial intelligence is being integrated with robots and autonomous driving systems. The remarkable advancements in artificial intelligence technology have sparked the development of new services and online solutions to a variety of social problems. However, there is still more to be done to integrate artificial intelligence with physical space. Robotics, a technical and scientific discipline that deals with physical interactions with the real world, is mechanical in nature. There are several domain-specific skills in robotics, including sensing and perception, computation of kinematic/dynamic actions, and control theory. Robotics and artificial intelligence work together to revolutionize society by connecting cyberspace and physical space.
This book’s objective is to compile original papers and reviews that demonstrate
numerous uses of robotics and artificial intelligence (AI). It seeks to showcase cutting-
edge robotics and AI applications as well as developments in machine learning and
computational intelligence technologies in a variety of scenarios. Contributors were also encouraged to develop and critically evaluate data analysis methodologies using such approaches, in order to present a cogent and comprehensive strategy employing technology and analytics. For applied AI in robotics, this book should serve as a
useful point of reference for both beginners and experts.
Both novice and expert readers should find this book a useful reference in the
field of artificial intelligence, mathematical modelling, robotics, control systems,
and reinforcement learning.
Book Features
• The book chapters deal with the recent research problems in the areas of
artificial intelligence, mathematical modelling, robotics, control systems, and
reinforcement learning.
• The book chapters present advanced techniques of AI applications in robotics and
drones.
• The book chapters contain a good literature survey with a long list of references.
• The book chapters are well-written with a good exposition of the research problem,
methodology, block diagrams, and mathematical techniques.
• The book chapters are lucidly illustrated with numerical examples and simula-
tions.
• The book chapters discuss details of applications and future research areas.
Audience
The book is primarily meant for researchers from academia and industry, who are
working in the research areas such as robotics engineering, control engineering,
mechatronic engineering, biomedical engineering, medical informatics, computer
science, and data analytics. The book can also be used at the graduate or advanced undergraduate level, as well as by many other readers.
Acknowledgements
As the editors, we hope that the chapters in this well-structured book will stimulate
further research in artificial intelligence, mathematical modelling, robotics, control
systems, and reinforcement learning, and utilize them in real-world applications.
We hope sincerely that this book, covering so many different topics, will be very
useful for all readers.
We would like to thank all the reviewers for their diligence in reviewing the
chapters.
Special thanks go to Springer, especially the book's editorial team.
Abstract During the last decade, Convolutional Neural Networks (CNNs) have been
recognized as one of the most promising machine learning methods that are being
utilized for deep learning of autonomous robotic systems. Faced with everlasting
uncertainties while working in unstructured and dynamical real-world environments,
robotic systems need to be able to recognize different environmental scenarios and
make adequate decisions based on machine learning of the current environment’s
state representation. One of the main challenges in the development of machine
learning models based on CNNs is in the selection of appropriate model structure
and parameters that can achieve adequate accuracy of environment representation.
In order to address this challenge, the book chapter provides a comprehensive anal-
ysis of the accuracy and efficiency of CNN models for autonomous robotic applica-
tions. Particularly, different CNN models (i.e., structures and parameters) are trained,
validated, and tested on real-world image data gathered by a mobile robot’s stereo
vision system. The best performing CNN models according to two criteria, the number of frames per second and the mean intersection over union, are implemented on the real-world wheeled mobile robot RAICO (Robot with Artificial Intelligence based COgnition), developed in the Laboratory for Robotics and Artificial Intelligence (ROBOTICS&AI), and tested for obstacle avoidance tasks. The achieved exper-
imental results show that the proposed machine learning strategy based on CNNs
provides high accuracy of mobile robot’s current environment state estimation.
1 Introduction
The worldwide interest in Artificial Intelligence (AI) techniques has become evident
after the paper [1] achieved a substantially better result on the image classification task by utilizing Artificial Neural Networks (ANNs). Afterward, numerous ANN models
that achieve even better results on image classification and various other tasks have
been developed in [2]. Consequently, Deep Learning (DL) emerged as a new popular
AI subfield. DL represents the process of training and using ANNs that utilize much
deeper architectures, i.e., models with a large number of sequential layers. Another
important innovation provided by [1, 3] was that the deep ANNs provide better
results with convolutional layers instead of fully connected layers. Therefore, deep ANNs with convolution as the primary layer type are referred to as Convolutional Neural Networks (CNNs). CNN models such as ResNet [4], VGG [5], and Xception
[6] have become the industry and research go-to options, and many researchers tried
and succeeded in improving models’ accuracy or modifying the models for other
tasks and purposes. An introductory explanation of the CNN layers (e.g., Pooling,
ReLU, convolution, etc.) is beyond the scope of this chapter, and interested readers
are referred to the following literature [2].
The background of the research performed in this chapter lies in the tight interconnection of the robotics, computer vision, and AI fields, which has led numerous researchers in the robotics community to become interested in DL. Many robotics tasks that have a
highly non-linear nature can be effectively approximated by utilizing DL techniques.
Nowadays, the utilization of DL in robotics spans from Jacobian matrix approxima-
tion [7] to decision-making systems [8]. However, in the robotics context, AI is mainly
used when a high dimensional sensory input is utilized in control [9], simultaneous
localization and mapping [10], indoor positioning [11], or trajectory learning [12,
13]. One of the main challenges for utilizing state-of-the-art DL models in robotics
is related to processing time requirements. Keeping in mind that all robotic algo-
rithms need to be implemented in the real-world setting, where processing power is
limited by robot hardware, DL models need to be able to fulfill the time-sensitive
requirements of embedded robotic processes.
In the beginning, the accuracy of the utilized CNN models was the only relevant
metric researchers considered, and therefore the trend was to utilize larger models.
Larger models not only require more time, energy, and resources to train but are also
impossible to implement in real-world time-sensitive applications. Moreover, one
major wake-up call was the realization that the energy utilized for training one of
the largest DL models for natural language processing [14] was around 1,287 MWh
[15], whereas a detailed analysis of the power usage and pollution generated from
training DL models was shown in [16]. Having that in mind, the researchers started
exploring the models that do not utilize a large amount of power for training, as well
as the models that are usable in real time.
As it can be concluded from the previous elaboration, motivation for the research in
this chapter is in the possible utilization of highly accurate DL models within robotic
domain. Particularly, the DL models that provide effective tool for the mobile robot
perception system to further enhance the understanding of the robot’s surroundings and
utilize that information for further decision-making will be considered. The objective
of the research presented in this chapter is to identify how large (in terms of number
of parameters and layers) the CNN model needs to be to achieve high accuracy on the semantic segmentation task, while remaining implementable on the computationally restricted Nvidia Jetson Nano board.
The main contributions of this chapter include the analysis of the efficiency of
developed DL models implemented on Nvidia Jetson Nano board, within mobile
robot RAICO (Robot with Artificial Intelligence based COgnition) for real-world
evaluation. Particularly, efficient CNN models with different levels of computational
complexity are trained on well-known dataset and tested on mobile robot RAICO.
Different from other approaches (see, e.g., [17–19]) that usually utilize AI boards with a higher level of computational resources or even high-end GPUs (which are much harder to integrate within robotic systems), the authors analyze models that have far lower computational complexity and are implementable on the Jetson Nano.
After the model selection process and thorough analysis of achieved results, the best
CNN model is implemented within the obstacle avoidance algorithm of the mobile robot RAICO for experimental evaluation in a real robotic application.
The chapter outline is as follows. The formulation and initial analysis of the
problem at hand are given in Sect. 2. The related work regarding the efficient DL
models in robotic domain is presented in Sect. 3. Section 4 includes the imple-
mentation details of different efficient DL models. The methodology for mobile
robot obstacle avoidance algorithm based on DL is considered in Sect. 5. Section 6
is devoted to the analysis of experimental results and followed by discussion of
achieved results presented in Sect. 7. Section 8 has concluding remarks with future
research directions.
where P is the number of parameters, Fw and Fh are the width and height of the filter, N is the number of channels (depth) of the input feature maps, and M is the number of filters used; each quantity carries an additional index indicating which layer it refers to: c for standard convolution, d for depthwise convolution, p for pointwise convolution, and m for MobileNet. The difference between the standard and the MobileNet convolutional layer can be seen graphically in Fig. 1.
Moreover, Eqs. (3) and (4) represent the difference in the number of FLOPs
(without bias, padding is 0, and stride is 1) utilized for these two convolution layers,
where F represents the number of FLOPs, Dw and Dh are width and height of the
output feature map, with the same notation as in (1) and (2). As it can be seen,
both memory footprint (according to the number of parameters) and inference time
according to the FLOPs are four times lower for the MobileNet convolution in the
considered example. For larger layers, the difference can be even more significant
(up to 8 or 9 times [21]).
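Equations (1)–(4) themselves are not reproduced in this excerpt. Under the standard MobileNet analysis [21], and with the notation defined above, they presumably take the following form (an assumed reconstruction, not a verbatim copy of the chapter's equations):

Pc = Fw,c · Fh,c · Nc · Mc (1)
Pm = Fw,d · Fh,d · Nd + Np · Mp (2)
Fc = Fw,c · Fh,c · Nc · Mc · Dw · Dh (3)
Fm = (Fw,d · Fh,d · Nd + Np · Mp) · Dw · Dh (4)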
Another efficient general-purpose CNN model is ShuffleNet [22]. In the same
manner as MobileNet, ShuffleNet utilizes depthwise and pointwise convolution
layers. In contrast, it also utilizes group convolution to further reduce the number of
FLOPs. Additionally, the model also performs the channel shuffle between groups
to increase the overall information provided for feature maps. ShuffleNet achieves
better accuracy than MobileNet while having the same number of FLOPs. Both
MobileNet and ShuffleNet have novel versions of their models to further improve
their performance ([23, 24]).
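To make the channel shuffle operation concrete, a minimal PyTorch sketch is given below (assuming the channel count is divisible by the number of groups); it illustrates the operation described in [22] and is not code from the chapter.

import torch

def channel_shuffle(x: torch.Tensor, groups: int) -> torch.Tensor:
    """Shuffle channels between groups, as used in ShuffleNet [22].

    x is an (N, C, H, W) feature map and C must be divisible by `groups`.
    """
    n, c, h, w = x.shape
    # split channels into groups, transpose group and channel axes, flatten back
    x = x.view(n, groups, c // groups, h, w)
    x = x.transpose(1, 2).contiguous()
    return x.view(n, c, h, w)

# usage: shuffle a feature map with 8 channels arranged in 4 groups
features = torch.randn(1, 8, 16, 16)
shuffled = channel_shuffle(features, groups=4)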
Different CNN models have been developed for various computer vision appli-
cations. Therefore, the following related work is divided into two sections based on
computer vision tasks that are utilized in robotic applications.
The first frequently utilized computer vision application in robotics is object detec-
tion. Object detection represents the process of finding a specific object in the image,
defining its location with the bounding box, and classifying the object into one of the
predefined classes with the prediction confidence score. The most common efficient
detection networks are analyzed next. Faster R-CNN [25] represents one of the
first detection CNN models that brought the inference time so low that it encouraged
the further development of real-time detection models. Nowadays, detection models
can be used in real-time with significant accuracy, e.g., YOLO [26] and its variants
(e.g., [27, 28]), and SSD [29].
Object detection-based visual control of an industrial robot was presented in [30]. The authors utilized the Faster R-CNN model in conjunction with an RGBD camera to detect
objects and decide if the object was reachable for a manipulator. The human–robot
collaboration based on hand gesture detection was considered in [31]. The authors
improved the SSD network by exchanging the VGG backbone for a ResNet one and adding an extra
feature combination layer. The considered modifications improved the detection of
hand signs even when the human was far away from the robot. In [32], the authors
analyzed dynamic Simultaneous Localization And Mapping (SLAM) based on SSD
network. Dynamic objects were detected to enhance the accuracy of the standard
visual ORB-SLAM2 method by excluding parts of the image that were likely to move.
The proposed system significantly improved SLAM performance in both indoor and
outdoor environments. Human detection and tracking performed with SSD network
were investigated in [33]. The authors proposed a mobile robotic system that can
find, recognize, track, and follow (using visual control) a certain human in order
to achieve human–robot interaction. The authors of [34] developed a YOLOv3-
based bolt position detection algorithm to infer the orientation of the pallets the
industrial robot needs to fill up. The YOLOv3 model was improved by using a k-
means algorithm, a better detector, and a novel localization fitness function. The
human intention detection algorithm was developed in [35]. The authors utilized
YOLOv3 for object detection and an LSTM ANN for human action recognition. Both networks were integrated into a single human intention detection algorithm, and the robot's decision-making system utilizes that information for decision-making purposes.
The second common computer vision task that is utilized within robotic applications
is semantic segmentation (e.g., [36]). Semantic segmentation represents the process
of assigning (labeling) every pixel in the image with an object class. The accuracy of
DL models for semantic segmentation can be represented either in pixel accuracy or
mean Intersection over Union (mIoU) [37]. Few modern efficient CNN models for
semantic segmentation are analyzed next, followed by the ones that are integrated
into robotic systems.
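As a brief illustration of the mIoU metric used throughout this chapter, the following sketch computes per-class IoU and its mean from integer-labeled prediction and ground-truth masks; it is a simplified example that omits dataset-specific details such as ignore labels.

import numpy as np

def mean_iou(pred: np.ndarray, target: np.ndarray, num_classes: int) -> float:
    """Mean Intersection over Union for integer-labeled segmentation masks."""
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:
            continue  # class absent from both masks; skip it
        intersection = np.logical_and(pred_c, target_c).sum()
        ious.append(intersection / union)
    return float(np.mean(ious))

# usage with two small 2 x 3 masks and 3 classes
pred = np.array([[0, 1, 1], [2, 2, 0]])
gt = np.array([[0, 1, 2], [2, 2, 0]])
print(mean_iou(pred, gt, num_classes=3))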
The first analyzed CNN model was ENet [17]. The authors improved the effi-
ciency of ResNet model by adding a faster reduction of feature map resolution with
either max-pooling or convolution with stride 2. Moreover, the batch normalization
layer was added after each convolution. The results showed that ENet achieved mIoU
accuracy close to state-of-the-art while having a much lower inference time (e.g., 21
FPS on Jetson TX1). Another efficient CNN model, entitled ERFNet, was proposed
in [18]. ERFNet further increased the efficiency of a residual block by splitting 2D
convolution into two 1D convolution layers. Each n × n convolution layer was split
into 1 × n, followed by ReLU activation and another n × 1 convolution. ERFNet
achieved higher accuracy than ENet, at the expense of some inference time (ERFNet–
11 FPS on TX1). The authors of [38] proposed attention-based CNN for semantic
segmentation. Fast attention blocks represent the core contribution of the proposed work. ResNet was utilized as the backbone network. The utilized network was
evaluated on the Cityscapes dataset, where it achieved 75.0 mIoU, while being imple-
mented on Jetson Nano and achieving 31 FPS. The authors of [39] proposed a CNN
model that integrates U-net [40] with ResNet’s skip connection. The convolution
layers were optimized with CP decomposition. Moreover, the authors proposed an
iterative algorithm for fine-tuning the ratio of compression and achieved accuracy.
At the end, the compressed network achieved astonishingly low inference time (25
FPS–Jetson Nano), with decent mIoU accuracy. The authors in [19] proposed to
utilize CNNs for semantic segmentation of crops and weeds. The efficiency of the
proposed network was mainly achieved by splitting the residual 5 × 5 convolution
layer into the following combination of convolution layers 1 × 1—5 × 1—1 ×
5—1 × 1. The proposed system was tested on an unmanned agriculture robot with
both a 1080Ti NVidia GPU (20 FPS) and a Jetson TX2 (5 FPS). The novel semantic
segmentation CNN model (Mininet) was proposed in [41]. The main building block
included two subblocks; the first one has depthwise and pointwise convolution where
depthwise convolution was factorized into two layers with filter n × 1 and 1 × n,
and the second subblock includes Atrous convolution with a factor greater than 1.
Both subblocks included ReLU activation and Batch normalization. At the end of the
block, both subblocks were summed, and another 1 × 1 convolution was performed.
The proposed model achieves high accuracy with around 30 FPS on a high-end GPU. In regard to robotic applications, the network was evaluated on efficient keyframe selection for the ORB-SLAM2 method. The authors in [42] proposed a CNN model for
RGBD semantic segmentation. The system was based on ResNet18 backbone with
decoder that utilized ERFNet modules. The mobile robot had a Kinect2 RGBD camera, and the proposed model was implemented on a Jetson Xavier. The resulting system achieved high-accuracy person detection with a free-space representation based on floor discretization. Visual SLAM based on a depth map generated by a CNN model was
considered in [43]. The authors utilized a version of ResNet50 with improved inference time, so that it could be implemented on a Jetson TX2 in a near real-time manner (16 FPS). An overview of all the analyzed efficient CNN models utilized for different robotic tasks is summarized in Table 2.
As it can be seen from Sect. 3, numerous CNN models have been proposed for small
embedded devices that can be integrated into robotic systems. In Sect. 4, the authors
will describe the CNN models that will be trained and deployed to the NVidia Jetson
Nano single-board computer. Models will be trained on the Cityscapes dataset [44]
with images that have 512 × 256 resolution.
As a baseline model, we have utilized the CNN network proposed by the official
NVidia Jetson instructional guide for inference (real-time CNN vision library) [45].
The network is based on a fully convolutional network with a ResNet18 backbone
(entitled ResNet18_2222). Due to the limited processing power, the decoder of the
network is omitted, and the output feature map is of lower resolution. The baseline
model is created from several consecutive ResNet Basic Blocks (BB) and Basic
Reduction Blocks (BRB), see Fig. 2. When the padding and stride are symmetrical,
only one number is shown. The number of feature maps in each block is determined
by the level in which the block is, see Fig. 3.
The complete architecture of the baseline and selected 1D model with three levels
is presented in Fig. 3. As it can be seen, the architecture is divided into four levels.
In the baseline model, each level includes two blocks, either two BB or BB + BRB
(defined in Fig. 2). Size of the feature maps is given between each level and each
layer. For all architectures, the number of features per level is defined as follows:
level 1—64 features, level 2—128 features, level 3—256 features, and level 4—512
features, regardless of the number of blocks in each level.
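For illustration, a minimal PyTorch sketch of the building blocks of the baseline model (BB and BRB) is given below; since Fig. 2 is not reproduced here, the kernel sizes, strides, and shortcut projection follow the standard ResNet18 formulation and should be read as assumptions rather than the chapter's exact block definitions.

import torch.nn as nn

class BasicBlock(nn.Module):
    """Residual block with two 3x3 convolutions (BB); using stride 2 in the first
    convolution plus a 1x1 projection shortcut turns it into a reduction block (BRB)."""

    def __init__(self, in_ch: int, out_ch: int, stride: int = 1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_ch)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU(inplace=True)
        # projection shortcut only when the resolution or channel count changes
        self.shortcut = nn.Sequential()
        if stride != 1 or in_ch != out_ch:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False),
                nn.BatchNorm2d(out_ch),
            )

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + self.shortcut(x))

# level widths as defined above: 64, 128, 256, and 512 features
level1 = nn.Sequential(BasicBlock(64, 64), BasicBlock(64, 64))
level2 = nn.Sequential(BasicBlock(64, 128, stride=2), BasicBlock(128, 128))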
The first modification we propose to the baseline model is to change the number
of blocks in each level. The intuition for this modification is twofold: (i) a common
method of increasing the efficiency of CNN models is rapid reduction of feature maps
resolution, and (ii) the prediction mask resolution can be increased by not reducing
input resolution (since we do not use decoder). Classes that occupy small spaces
(e.g., poles in the Cityscapes dataset) cannot be accurately predicted if the whole
image with 256 × 512 resolution is represented by a prediction mask of 8 × 16;
therefore, a higher resolution of the prediction mask can increase both accuracy and
mIoU measure.
The second set of CNN models that are trained and tested include the decomposi-
tion of 3 × 3 layer into 1 × 3 and 3 × 1 convolution layers. Two types of blocks—1D
Block (DB) and 1D Reduction Block (DRB), created from this type of layer can be
seen in Fig. 2. CNN models with 1D blocks are entitled ResNet_1D (or RN_1D).
One of the 1D ResNet models is shown in Fig. 3. This model includes only the first
three levels with a larger number of blocks per level compared to the baseline model.
Since there is one less level, the output resolution is larger with the output mask of
16 × 32.
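A possible PyTorch realization of such a 1D block is sketched below; the exact placement of ReLU and batch normalization is an assumption, since the precise block definition is given only in Fig. 2.

import torch.nn as nn

def conv_1d_pair(channels: int) -> nn.Sequential:
    """Replace one 3 x 3 convolution with a 1 x 3 convolution, ReLU,
    and a 3 x 1 convolution, as in the 1D blocks (DB/DRB)."""
    return nn.Sequential(
        nn.Conv2d(channels, channels, kernel_size=(1, 3), padding=(0, 1), bias=False),
        nn.ReLU(inplace=True),
        nn.Conv2d(channels, channels, kernel_size=(3, 1), padding=(1, 0), bias=False),
    )

class OneDBlock(nn.Module):
    """Residual 1D block (DB): two factorized convolutions with batch normalization."""
    def __init__(self, channels: int):
        super().__init__()
        self.branch = nn.Sequential(
            conv_1d_pair(channels), nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
            conv_1d_pair(channels), nn.BatchNorm2d(channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(x + self.branch(x))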
Lastly, the depth-wise separable and pointwise convolution is added into 1D layers
to create a new set of blocks (Fig. 4). Additional important parameters of the separable block are the number of feature maps at the input and the output of the level.
Fig. 5 ResNet_sep_4400 architectures
Another eight architectures named separable ResNet models (RN_sep) are created
using separable blocks. The example of the separable convolutional model with only
two levels is shown in Fig. 5.
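Analogously, a compact sketch of a separable block (depthwise 1 x 3 and 3 x 1 convolutions followed by a pointwise 1 x 1 convolution) could look as follows; the ordering of the layers and the normalization are again assumptions.

import torch.nn as nn

def separable_1d_block(in_ch: int, out_ch: int) -> nn.Sequential:
    """Depthwise 1 x 3 and 3 x 1 convolutions followed by a pointwise 1 x 1
    convolution that mixes the channels (cf. the separable blocks in Fig. 4)."""
    return nn.Sequential(
        nn.Conv2d(in_ch, in_ch, (1, 3), padding=(0, 1), groups=in_ch, bias=False),
        nn.Conv2d(in_ch, in_ch, (3, 1), padding=(1, 0), groups=in_ch, bias=False),
        nn.Conv2d(in_ch, out_ch, 1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )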
After mobile robots receive high-level tasks that need to be performed (see, e.g.
[46–48]), a path planning step is required to ensure the safe and efficient execution
of tasks. Along the way, new obstacles can be detected in the defined plan; therefore,
local planning needs to occur to avoid collisions.
In this work, the efficient deep learning system is utilized to generate the semantic
map of the environment. Afterward, the semantic map is utilized to detect obstacles
in the mobile robot’s path. The considered mobile robot system moves in a hori-
zontal plane, and therefore the height of the camera remains the same for the whole
environment. Moreover, since the pose and intrinsic camera parameters are known,
it is possible to geometrically link the position of each group of pixels produced by
the semantic map to the position in the world frame. By exploiting the class of each
group of pixels, the mobile robot can determine how to avoid obstacles and reach the
desired pose. A mathematical and algorithmic explanation of the proposed system
is discussed next.
Mobile robot pose is defined by its position and orientation, included in the state
vector (5):
x = (z, x, θ )T (5)
where x and z are mobile robot coordinates, and θ is the current heading angle. The
camera utilized by the mobile robot RAICO is tilted downwards for the inclination
angle α. Camera angles of view in terms of image height and width are denoted as γ h
and γ w , respectively. As mentioned in Sect. 4, the output semantic mask is smaller
than the input image; therefore, the dimensions of the output mask are defined with
its width (W ) and height (H) defined in pixels. The geometric relationships between
the output mask and the area in front of the mobile robot, in the vertical plane, can
be seen in Fig. 6.
If the output semantic mask pixel belongs to the class “floor”, we can conclude that there is no obstacle in that area of the environment (e.g., between z1 and z2 in Fig. 6), and the mobile robot can move to that part of the environment. The same view in the horizontal
plane is shown in Fig. 7.
In order to calculate the geometric relationships, the first task is to determine the
increment of the angle between the edges of the output pixels in terms of both camera
width (β w ) and height (β h ) by using (6) and (7):
βw = γw /W , (6)
βh = γh /H . (7)
Afterward, the starting angles for width and height need to be determined by using Eqs. (8) and (9). Based on these angles, it is possible to calculate the edges of the area covered by each pixel of the semantic map, defined by their z and x coordinates through Eqs. (10) and (11).
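Since Eqs. (8)–(11) are not reproduced in this excerpt, the following sketch shows one plausible implementation of the described geometry, assuming a camera mounted at height h_cam above the floor and a flat floor; the variable names and the exact form of the expressions are illustrative rather than taken from the chapter.

import numpy as np

def ground_cell_edges(h_cam, alpha, gamma_w, gamma_h, W, H):
    """Map the edges of each semantic-mask pixel to ground coordinates (z, x).

    A plausible reconstruction of the geometry sketched in Figs. 6 and 7, with
    a camera at height h_cam, tilted down by alpha, and fields of view
    gamma_w x gamma_h. Angles are in radians, distances in the unit of h_cam.
    """
    beta_w = gamma_w / W                      # Eq. (6): horizontal angle per pixel
    beta_h = gamma_h / H                      # Eq. (7): vertical angle per pixel
    # depression angles of the horizontal pixel edges (top edge of the mask first)
    phi = alpha - gamma_h / 2 + beta_h * np.arange(H + 1)
    z_edges = h_cam / np.tan(phi)             # forward distance of each edge row
    # lateral angles of the vertical pixel edges (left edge of the mask first)
    psi = -gamma_w / 2 + beta_w * np.arange(W + 1)
    # lateral coordinate depends on the forward distance of the considered row
    x_edges = np.outer(z_edges, np.tan(psi))  # shape (H + 1, W + 1)
    return z_edges, x_edges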
The example of the generated map by the mobile robot is shown in Fig. 8, where
the green area is accessible while obstacles occupy the red areas.
The width that the mobile robot occupies while moving is denoted by B (see Fig. 8). Since obstacles that are too far away from the robot do not influence its movement, we define a threshold distance D. The area defined by B and D is utilized
to determine if the obstacle avoidance procedure needs to be initiated. Therefore, if
any of the pixels that correspond to this area include obstacles, the obstacle avoidance
procedure is initiated. The whole algorithm utilized for both goal-achieving behavior
and obstacle avoidance is represented in Fig. 9.
It is assumed that the mobile robot is localized (i.e., the initial pose of the mobile
robot is known), and the desired pose is specified. Therefore, the initial plan is to
rotate the mobile robot until it is directed toward the desired position and then perform translation until the position is reached. If an obstacle is detected within the planned path (according to
the robot width), the mobile robot performs an additional obstacle avoidance strategy
before computing new control parameters to achieve the desired pose.
There are five states (S1–S5) in which the mobile robot can be, and two actions it can take. The actions that the mobile robot performs are translation and rotation. At the
start of the movement procedure, the mobile robot is in state S1 , which indicates
that rotation to the desired pose needs to be performed. After the rotation is finished,
obstacle detection is performed. If the obstacle is not detected (O = 0), the mobile
robot transitions to state S2 ; otherwise, it transitions to state S3 . Within state S2 , the
mobile robot performs translational movement until the desired position is reached
or until the dynamic obstacle is detected in the robot’s path. In state S3 , the mobile
robot calculates a temporary goal (new goal) and rotates until there is no obstacle in
its direction. Afterward, it transitions to state S4 , where the mobile robot performs
translation until the temporary goal is achieved or a dynamical obstacle is detected.
If the translation is completed, the mobile robot starts rotating to the desired position
(S1 ). On the other hand, if the obstacle is detected in the S4 , the robot transitions to
S3 and generates a new temporary goal. The obstacle avoidance process is performed
until the mobile robot achieves state S5 , indicating the desired pose’s achievement.
An example of the mobile robot’s movement procedure with and without obstacles
is shown in Fig. 10.
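A compact sketch of the described movement logic could be expressed as the following state machine; the helper routines on the robot object (rotation, translation, obstacle checking, temporary-goal selection) are hypothetical placeholders for the actual perception and control functions.

def move_to_goal(robot, goal):
    """Finite-state movement procedure with obstacle avoidance (cf. Fig. 9).

    The helper methods on `robot` (rotate_towards, translate_step,
    obstacle_in_path, pick_temporary_goal, at) are hypothetical placeholders."""
    state = "S1"
    while state != "S5":
        if state == "S1":                      # rotate towards the desired pose
            robot.rotate_towards(goal)
            state = "S3" if robot.obstacle_in_path() else "S2"
        elif state == "S2":                    # translate towards the goal
            robot.translate_step(goal)
            if robot.at(goal):
                state = "S5"                   # desired pose achieved
            elif robot.obstacle_in_path():
                state = "S3"
        elif state == "S3":                    # aim at a temporary goal
            temp_goal = robot.pick_temporary_goal()
            robot.rotate_towards(temp_goal)
            state = "S4"
        elif state == "S4":                    # translate towards the temporary goal
            robot.translate_step(temp_goal)
            if robot.at(temp_goal):
                state = "S1"                   # re-aim at the original goal
            elif robot.obstacle_in_path():
                state = "S3"                   # new dynamic obstacle detected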
6 Experimental Results
The experimental results are divided into two sections. The first includes the results
of training of deep learning models, while the second one involves utilizing the best
model within an obstacle avoidance algorithm.
All CNN models are trained and tested on the same setup to ensure a fair compar-
ison. Models have been trained on the Cityscapes dataset [44] with input images
of 512 × 256 resolution. Low-resolution images are selected since the used NVidia
Jetson Nano has the lowest level of computation power out of all NVidia Jetson
devices (see Table 1). Models are trained on a deep learning workstation with three
NVidia Quadro RTX 6000 GPUs and two Xeon Silver 4208 CPUs using the PyTorch
v1.6.0 framework. All analyzed models are compared based on two metrics, the mIoU
and FPS achieved on Jetson Nano. At the same time, global accuracy and model size
are also shown to compare the utilized models better. All networks are converted to
TensorRT (using ONNX format) with FP16/INT8 precision to increase the models’
inference time.
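As an illustration of this deployment step, a typical PyTorch-to-ONNX export followed by a TensorRT conversion on the Jetson might look roughly as follows; the model definition and file names are placeholders, not the chapter's actual code.

import torch
import torch.nn as nn

# stand-in for the trained segmentation network; the real model is one of the
# RN_* architectures described above (placeholder for illustration only)
model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.Conv2d(16, 19, 1))
model.eval()

# dummy input fixing the 512 x 256 input resolution used in this chapter
dummy_input = torch.randn(1, 3, 256, 512)
torch.onnx.export(model, dummy_input, "segnet.onnx", opset_version=11,
                  input_names=["image"], output_names=["mask"])

# On the Jetson Nano, the ONNX file can then be converted to a TensorRT engine
# with reduced precision, for example:
#   trtexec --onnx=segnet.onnx --fp16 --saveEngine=segnet.trt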
Table 3 includes all the variations of all three CNN models, whose detailed expla-
nation is provided in Sect. 4. The experiment is designed so that the number of blocks used in each network does not change; only their position within the four levels changes. Since the networks need to be tested on a real-world mobile robot, we compare the efficiency of the networks in FPS instead of FLOPs. The network size in MB is also provided.
The CNN with the best mIoU value is the model RN_8000. The model with the
lowest memory footprint is RN_1D_8000, with its size being only 1.6 MB. The
model with the fastest inference time represented in FPS is RN_sep_1115. However,
since the primary motivation for these experiments was to determine the best network
in regards to the ratio of FPS and mIoU, the network selected for utilization in the
obstacle avoidance process is RN_2600, since it achieves both a high level of accuracy
and number of FPS. The primary motivation for training the CNNs on the Cityscapes
dataset is its popularity and complexity. Afterward, the selected network is trained
again on the SUN RGB-D indoor dataset [49] to be used in mobile robot applications.
By utilizing the algorithm proposed in Sect. 5, the obstacle avoidance ability
of the mobile robot RAICO is experimentally evaluated (Fig. 11). The mobile robot is positioned on the floor within the ROBOTICS&AI laboratory. The mobile robot is set to the initial pose x = (0, 0, 0), while the desired pose is set to xd = (600, 100, −0.78). The change in pose of the mobile robot is calculated according to
the dead-reckoning odometry by utilizing wheel encoders [50]. A spherical obstacle
is set to a position (300, 50) with a diameter of roughly 70 mm. The proposed
algorithm is started, and the mobile robot achieves the trajectory shown in Fig. 12.
The mobile robot representation is shown with different colors for the different states (see
Sect. 5), S1 is red, S2 is blue, S3 is dark yellow, and S4 is dark purple. Moreover, the
obstacle is indicated with a blue-filled circle. Desired and final positions are shown
with black and green dots, respectively.
Moreover, selected images the mobile robot acquired, with the semantic segmentation masks generated by the CNN overlaid on them, can be seen in Fig. 13. As it
can be seen, segmentation of floor (red), wall (teal), chairs (blue), tables (black), and
objects (yellow) is performed well, with precise edges between mentioned classes.
By utilizing accurate semantic maps, the mobile robot was able to avoid the obstacle,
and successfully achieve the desired pose.
Now, we show Fig. 14 with four examples of the influence of the semantic maps
on free and occupied areas in the environment generated during the experimental
evaluation. The green area is free, and it corresponds to the floor class, while the red-
occupied area corresponds to all other classes. In the first image, the mobile robot
detects the obstacle in its path and then rotates to the left until the robot can avoid the
obstacle. The second image corresponds to the moment the obstacle almost leaves
the robot’s field of view due to its translational movement. The third image represents
the moment at the end of the obstacle avoidance state, and the last image is generated
near the final pose. By analyzing the images in the final pose, it can be seen that the mobile robot accurately differentiates between free space (floor) in the environment and (in this case) the wall class that represents the occupied area. This indicates that it is possible to further utilize CNN models in conjunction with the proposed free space detection algorithm to define the areas in which the mobile robot can safely perform the desired tasks.
(Figure: mobile robot trajectory, Z axis [mm] versus X axis [mm].)
7 Discussion of Results
The experimental results are divided into two sections, one regarding finding the
optimal CNN model in terms of both accuracy and inference speed and the other
regarding experimental verification with the mobile robot. Within the first part of
the experimental evaluation, three types of CNN models with a different number of
layers in each level are analyzed. The experimental results show that the best network
(RN_8000) in terms of accuracy is the one with all layers concentrated in the first
level. This type of CNN has the highest output resolution, which is the main reason
why it provides the overall best results. Moreover, the general trend is that networks
with fewer levels have higher accuracy (the best CNN model has 40.8 mIoU and
84.6 accuracy). Regarding the model size, it is shown that networks with many layers
concentrated in the fourth level occupy more disk space (the largest model occupies
96 MB of disk space compared to the smallest model, which occupies only 1.6 MB).
The main reason for this occurrence is that the higher levels include a larger number
of filters. However, the largest CNN models also have the lowest inference time and,
therefore, the highest number of FPS, reaching even 48.7 average FPS. Moreover, it
is shown that both proposed improvements of the network (depth separable and 1D
convolution) show a marginal decrease in inference time at the expense of a slight
decrease in accuracy. By exploiting the ratio between the accuracy and inference
time, the RN_2600 CNN model is utilized in the experiment with a mobile robot.
This network achieved the second-best accuracy and is 4 FPS faster than the network
with the best accuracy. Moreover, since modern SSDs or micro-SD cards are readily available with capacities much larger than the sizes of the proposed models in MB, it can also be concluded that the disk space the models occupy is not a substantial restriction.
On the other hand, the main achievement of this chapter is shown through the
experiment with the mobile robot RAICO as it performed obstacle avoidance to
demonstrate a case study for the utilization of an accurate and fast CNN model.
The model is employed within the obstacle detection and avoidance algorithm. The
output of the network is processed to generate the semantic segmentation masks. Afterward, the geometric relationship between the camera position and its
parameters is utilized to determine the free area in the environment. If the obstacle
is detected close to the mobile robot path, the algorithm transitions the mobile robot
from goal-achieving states to obstacle avoidance states. The mobile robot avoids the obstacle
and transitions back to the goal-achieving states, all while checking for new obstacles
in the path. Experimental evaluation reinforces the validity of the proposed algorithm,
which can, in conjunction with the CNN model, successfully avoid obstacles and
achieve the desired position with a satisfactory error. Moreover, since the proposed
algorithm has shown accurate free space detection, it can be further utilized within
other mobile robotic tasks.
8 Conclusion
In this work, we propose an efficient deep learning model employed within the obstacle avoidance algorithm. The CNN model is used in real time on the Jetson Nano
development board. The utilized CNN model is inspired by the ResNet model inte-
grated with depth separable convolution and 1D convolution processes. We proposed
and trained 24 variants of CNN models for semantic segmentation. The best model
is selected according to the ratio of mIoU measure and the number of FPS it achieves
on Jetson Nano. The selected model is RN_2600 with two levels of layers and it
achieves 42.6 FPS with 40.6 mIoU. Afterward, the selected CNN model is employed
in the novel obstacle avoidance algorithm. Within obstacle avoidance, the mobile
robot has four states. Two states are reserved for goal achieving and two for obstacle
avoidance purposes. According to the semantic mask and known camera pose, the
area in front of the mobile robot is divided into free and occupied sections. According to those areas, the mobile robot transitions between goal-seeking and obstacle avoidance
states during the movement procedure. The experimental evaluation shows that the
mobile robot managed to successfully avoid the obstacle and achieve the desired position
with an error in the Z direction of −15 mm, and 23 mm in the X direction, generated
according to the wheel encoder data.
Further research directions include the adaptation of the proposed CNN models and
their implementation on an industrial-grade mobile robot with additional computa-
tional resources. The proposed method should be a subsystem of the entire mobile
robot decision-making framework.
Acknowledgements This work has been financially supported by the Ministry of Education,
Science and Technological Development of the Serbian Government, through the project “Inte-
grated research in macro, micro, and nano mechanical engineering–Deep learning of intelligent
manufacturing systems in production engineering”, under the contract number 451-03-47/2023-
01/200105, and by the Science Fund of the Republic of Serbia, Grant No. 6523109, AI-MISSION4.0,
2020-2022.
Appendix A
Abbreviation List
References
1. Krizhevsky, A., Sutskever, I., & Hinton, G.E. (2012) ImageNet classification with deep convolu-
tional neural networks. In Advances in neural information processing systems (pp. 1097–1105).
2. Guo, Y., Liu, Y., Oerlemans, A., Lao, S., Wu, S., & Lew, M. S. (2016). Deep learning for visual
understanding: A review. Neurocomputing, 187, 27–48.
3. LeCun, Y., & Bengio, Y. (1995) Convolutional networks for images, speech, and time series.
Handbook brain theory neural networks (Vol. 3361, no. 10).
4. He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In
IEEE Conference on Computer Vision and Pattern Recognition (pp. 770–778).
5. Simonyan, K., & Zisserman, A. (2015). Very deep convolutional networks for large-scale image
recognition. In 3rd International Conference on Learning Representations (pp. 1–14).
6. Chollet, F. (2017). Xception: Deep learning with depthwise separable convolutions. In Proceed-
ings of IEEE Conference on Computer Vision and Pattern Recognition (pp. 1251–1258).
7. Nguyen, H., & Cheah, C.C. (2022). Analytic deep neural network-based robot control.
IEEE/ASME Transactions Mechatronics (pp. 1–9).
8. Jokić, A., Petrović, M., & Miljković, Z. (2022). Mobile robot decision-making system based
on deep machine learning. In 9th International Conference on Electrical, Electronics and
Computer Engineering (IcETRAN 2022) (pp. 653–656).
9. Miljković, Z., Mitić, M., Lazarević, M., & Babić, B. (2013). Neural network reinforcement
learning for visual control of robot manipulators. Expert Systems with Applications, 40(5),
1721–1736.
10. Miljković, Z., Vuković, N., Mitić, M., & Babić, B. (2013). New hybrid vision-based control
approach for automated guided vehicles. International Journal of Advanced Manufacturing
Technology, 66(1–4), 231–249.
11. Petrović, M., Ciężkowski, M., Romaniuk, S., Wolniakowski, A., & Miljković, Z. (2021).
A novel hybrid NN-ABPE-based calibration method for improving accuracy of lateration
positioning system. Sensors, 21(24), 8204.
12. Mitić, M., Vuković, N., Petrović, M., & Miljković, Z. (2018). Chaotic metaheuristic algorithms
for learning and reproduction of robot motion trajectories. Neural Computing and Applications,
30(4), 1065–1083.
13. Mitić, M., & Miljković, Z. (2015). Bio-inspired approach to learning robot motion trajectories
and visual control commands. Expert Systems with Applications, 42(5), 2624–2637.
14. Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A.,
Shyam, P., Sastry, G., Askell, A., Agarwal, S., Herbert-Voss, A., Krueger, G., Henighan, T.,
Child, R., Ramesh, A., Ziegler, D.M., Wu, J., Winter, C., Hesse, C., Chen, M., Sigler, E., Litwin,
M., Gray, S., Chess, B., Clark, J., Berner, C., McCandlish, S., Radford, A., Sutskever, I., &
Amodei, D. (2020). Language models are few-shot learners. In Advances in Neural Information
Processing Systems (Vol. 33, pp. 1877–1901).
15. Patterson, D., Gonzalez, J., Le, Q., Liang, C., Munguia, L.-M., Rothchild, D., So, D., Texier,
M., & Dean, J. (2021). Carbon emissions and large neural network training (pp. 1–22).
arXiv:2104.10350.
16. Strubell, E., Ganesh, A., & McCallum, A. (2019). Energy and policy considerations for deep
learning in NLP. arXiv:1906.02243.
17. Paszke, A., Chaurasia, A., Kim, S., & Culurciello, E. (2016). ENet: a deep neural network
architecture for real-time semantic segmentation. arXiv:1606.02147.
18. Romera, E., Alvarez, J. M., Bergasa, L. M., & Arroyo, R. (2017). Erfnet: Efficient residual
factorized convnet for real-time semantic segmentation. IEEE Transactions on Intelligent
Transportation Systems, 19(1), 263–272.
19. Milioto, A., Lottes, P., & Stachniss, C. (2018) Real-time semantic segmentation of crop and
weed for precision agriculture robots leveraging background knowledge in CNNs. In 2018
IEEE International Conference on Robotics and Automation (pp. 2229–2235).
20. Iandola, F.N., Han, S., Moskewicz, M.W., Ashraf, K., Dally, W.J., & Keutzer, K. (2016).
SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size (pp. 1–
13). arXiv:1602.07360.
21. Howard, A.G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., &
Adam, H. (2017). MobileNets: Efficient convolutional neural networks for mobile vision
applications. arXiv:1704.04861.
22. Zhang, X., Zhou, X., Lin, M., & Sun, J. (2018). Shufflenet: An extremely efficient convolutional
neural network for mobile devices. Proceedings of the IEEE Conference on Computer Vision
Pattern Recognition (pp. 6848–6856).
23. Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., & Chen, L. C. (2018). MobileNetV2:
Inverted residuals and linear bottlenecks. IEEE Conference on Computer Vision Pattern
Recognition (pp. 4510–4520).
24. Ma, N., Zhang, X., Zheng, H.-T., & Sun, J. (2018). Shufflenet v2: Practical guidelines for
efficient CNN architecture design. Proceedings of European Conference on Computer Vision
(pp. 116–131).
25. Ren, S., He, K., Girshick, R., & Sun, J. (2016). Faster R-CNN: Towards real-time object
detection with region proposal networks. IEEE Transactions on Pattern Analysis and Machine
Intelligence, 39(6), 1137–1149.
26. Redmon, J., Divvala, S., Girshick, R., & Farhadi, A. (2016). You only look once: Unified, real-
time object detection. IEEE Conference on Computer Vision Pattern Recognition (pp. 779–788).
27. Redmon, J., & Farhadi, A. (2017) YOLO9000: Better, faster, stronger. IEEE Conference on
Computer Vision Pattern Recognition (pp. 7263–7271).
28. Redmon, J., & Farhadi, A. (2018) YOLOv3: An incremental improvement. arXiv:1804.02767.
29. Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C. Y., & Berg, A. C. (2016). SSD:
Single shot multibox detector. In European Conference Computer Vision (pp. 21–37).
30. Chen, X., & Guhl, J. (2018). Industrial robot control with object recognition based on deep
learning. Procedia CIRP, 76, 149–154.
31. Gao, Q., Liu, J., & Ju, Z. (2020). Robust real-time hand detection and localization for space
human–robot interaction based on deep learning. Neurocomputing, 390, 198–206.
32. Xiao, L., Wang, J., Qiu, X., Rong, Z., & Zou, X. (2019). Dynamic-SLAM: Semantic monocular
visual localization and mapping based on deep learning in dynamic environment. Robotics and
Autonomous Systems, 117, 1–16.
33. Hwang, C.-L., Wang, D.-S., Weng, F.-C., & Lai, S.-L. (2020). Interactions between specific
human and omnidirectional mobile robot using deep learning approach: SSD-FN-KCF. IEEE
Access, 8, 41186–41200.
34. Zhao, K., Wang, Y., Zuo, Y., & Zhang, C. (2022). Palletizing robot positioning bolt detection
based on improved YOLO-V3. Journal of Intelligent and Robotic Systems, 104(3), 1–12.
35. Liu, C., Li, X., Li, Q., Xue, Y., Liu, H., & Gao, Y. (2021). Robot recognizing humans intention
and interacting with humans based on a multi-task model combining ST-GCN-LSTM model
and YOLO model. Neurocomputing, 430, 174–184.
36. Jokić, A., Petrović, M., & Miljković, Z. (2022). Semantic segmentation based stereo visual
servoing of nonholonomic mobile robot in intelligent manufacturing environment. Expert
Systems with Applications, 190, 116203.
37. Garcia-Garcia, A., Orts-Escolano, S., Oprea, S., Villena-Martinez, V., Martinez-Gonzalez,
P., & Garcia-Rodriguez, J. (2018). A survey on deep learning techniques for image and video
semantic segmentation. Applied Soft Computing, 70, 41–65.
38. Hu, P., Perazzi, F., Heilbron, F. C., Wang, O., Lin, Z., Saenko, K., & Sclaroff, S. (2020). Real-
time semantic segmentation with fast attention. IEEE Robotics and Automation Letters, 6(1),
263–270.
39. Falaschetti, L., Manoni, L., & Turchetti, C. (2022). A low-rank CNN architecture for real-
time semantic segmentation in visual SLAM applications. IEEE Open Journal of Circuits and
Systems.
40. Ronneberger, O., Fischer, P., & Brox, T. (2015). U-net: Convolutional networks for biomedical
image segmentation. In International Conference on Medical Image Computing and Computer-
Assisted Intervention (pp. 234–241).
41. Alonso, I., Riazuelo, L., & Murillo, A. C. (2020). Mininet: An efficient semantic segmentation
convnet for real-time robotic applications. IEEE Transactions on Robotics, 36(4), 1340–1347.
42. Seichter, D., Köhler, M., Lewandowski, B., Wengefeld, T., & Gross, H.-M. (2021) Efficient
rgb-d semantic segmentation for indoor scene analysis. 2021 IEEE International Conference
on Robotics and Automation (pp. 13525–13531).
43. Bokovoy, A., Muravyev, K., & Yakovlev, K. (2019). Real-time vision-based depth reconstruc-
tion with NVidia Jetson. In 2019 European Conference on Mobile Robots (pp. 1–6).
44. Cordts, M., Omran, M., Ramos, S., Rehfeld, T., Enzweiler, M., Benenson, R., Franke, U., Roth,
S., & Schiele, B. (2016). The cityscapes dataset for semantic urban scene understanding. In
Proceedings IEEE Conference on Computer Vision and Pattern Recognition (pp. 3213–3223).
45. Dustin, F. (2022) Hello AI world NVidia Jetson. https://github.com/dusty-nv/jetson-inference.
46. Petrović, M., Miljković, Z., & Jokić, A. (2019). A novel methodology for optimal single mobile
robot scheduling using whale optimization algorithm. Applied Soft Computing, 81, 105520.
47. Petrović, M., Jokić, A., Miljković, Z., & Kulesza, Z. (2022). Multi-objective scheduling of
single mobile robot based on grey wolf optimization algorithm. SSRN.
48. Petrović, M., Miljković, Z., Babić, B., Vuković, N., & Čović, N. (2012). Towards a conceptual
design of intelligent material transport using artificial intelligence. Strojarstvo, 54(3), 205–219.
49. Song, S., Lichtenberg, S.P., & Xiao, J. (2015). SUN RGB-D: A RGB-D scene under-
standing benchmark suite. In Proceedings IEEE Conference on Computer Vision and Pattern
Recognition (pp. 567–576).
50. Corke, P. (2017). Robotics, vision and control: Fundamental algorithms in MATLAB®.
Springer.
UAV Path Planning Based on Deep
Reinforcement Learning
Abstract Currently, UAVs are used for both military and civil purposes. This is especially true of the rotary-wing UAV, which is capable of vertical take-off and landing, has six degrees of freedom, and can hover in the air; because of its high mobility, it has become a working platform for a wide range of environments and purposes. When a UAV performs an autonomous flight mission, it encounters environments with both static and dynamic obstacles; therefore, research on effective obstacle avoidance and path planning technology for unknown environments is very important. Traditional path planning technology relies on map information and highly real-time algorithms, which require large storage space and computing resources. In this chapter, the author
studies the deep reinforcement learning algorithm for UAV path planning. In view of
the current challenges faced by UAVs in autonomous flight in obstacle environments,
this chapter proposes an improved DQN algorithm combined with artificial poten-
tial fields, establishing a reward function to evaluate the behavior of the UAV, which guides the UAV to reach the target point as soon as possible while avoiding obstacles. The network structure, state space, action space, and reward function of the DQN algorithm are designed, and a UAV reinforcement learning path
planning system is established. In order to better verify the advantages of the algo-
rithm proposed in this chapter, a comparative experiment between the improved DQN
algorithm and the DQN algorithm is carried out. The path planning performance of
the DQN algorithm and the improved DQN algorithm in the same environment is
compared. The loss function and success rate are selected as the comparison criteria.
The experimental results show that the improved DQN algorithm is faster and more
stable than the DQN algorithm for UAV path planning, which verifies the superiority
of the improved DQN algorithm for path planning.
1 Introduction
The multirotor UAV is capable of vertical take-off and landing, has six degrees of freedom, and can hover in the air. UAVs can provide many conveniences for human society. In the military field, UAVs can be used for tasks such as target strikes, reconnaissance and monitoring, and information communication; in the civilian field, they are increasingly used for security inspection, plant protection, epidemic prevention and control, aerial photography, and other applications.
The development of artificial intelligence and information technology has put
forward higher and higher requirements for the autonomous intelligence of UAVs.
UAVs need path planning technology to perform various tasks autonomously. The
task of path planning is to obtain a path from the starting point to the target point,
preventing collisions with obstacles and making the path as short as possible. UAVs
have six degrees of freedom and can move in all directions in the air, so two-
dimensional path planning cannot meet the needs of UAVs. For 3D environments,
there are a lot of complex structures and uncertainties, especially in complex envi-
ronments such as forests, caves, and cities, and efficient 3D path planning algorithms
are required. However, research on 3D path planning algorithms for UAVs is full of challenges. Since the birth of UAV path planning algorithms, adaptive stability in scenes with high environmental complexity, unstable external lighting, and highly dynamic obstacles has always been a difficult point in this field.
UAVs are usually equipped with depth cameras and lidars to perceive the
surrounding environment, establish environmental maps through vision and laser
SLAM (Simultaneous Localization and Mapping) technologies, map the environ-
mental maps into a form that can be processed by computers, and then perform path
planning. This structured design is currently the most effective solution in the field of autonomous driving. However, path planning in an unknown environment without map information is more complicated than in a known environment. When the environment is unknown, complex obstacles and unexpected events force the UAV to rely on the data collected from its sensors and on the efficiency of the algorithm to quickly decide on a passable path, avoid obstacles, and navigate to the target location. Therefore, when there is an obstacle in the flight direction of the UAV, the path planning algorithm must not only decide to move in a certain direction to
avoid the obstacle, but also comprehensively consider the overall path condition to
form a good trajectory with a shorter route length and less time spent. In
recent years, R&D technicians have been continuously studying the use of different
algorithms and methods to deal with problems in path planning, so that the path plan-
ning of UAVs can achieve the best results. Research on effective obstacle avoidance
and path planning technology for unknown environments is crucial for autonomous
driving of UAVs.
Path planning is an important basis for drones to achieve autonomous flight tasks.
The purpose is to obtain a global path from the starting point to the target location,
reduce collisions with obstacles, and make the path as short as possible. At present, the
classical path planning algorithms are mainly divided into three categories, namely
artificial potential field, heuristic search, and sampling-based algorithms. The artificial
potential field method was published by Khatib [1] in 1985. By defining a potential
field function, a potential value is artificially assigned to each point in space so that
obstacles exert a repulsive force on the mobile robot while the target point exerts an
attractive force; the robot therefore moves toward the target point and effectively
avoids collisions thanks to the repulsive force of the obstacles [2].
The algorithm has the disadvantage of easily falling into local extrema: when the
attractive and repulsive forces acting on the mobile robot in the potential field cancel
out, the robot stops moving. Secondly, the artificial potential field does not introduce
kinematic or dynamic constraints; the mobile robot moves in the direction of the
resultant force in the configuration space without any restriction on flight angle. The
planned trajectory therefore does not conform to the dynamic model, and the UAV
cannot actually fly along it. Mabrouk [3] proposed an extended artificial potential
field method that uses dynamic internal agent states. The internal state is modeled as
a coupled first-order dynamical system whose equations manipulate the agent's
potential field; the internal-state dynamics are driven by the interaction between the
agent and the external environment, so that local equilibria of the potential field are
turned from stable into unstable ones, allowing escape from local minima. This
method successfully solves complex maze problems with multiple local minima that
traditional artificial potential fields cannot handle.
The heuristic search method is another representative class of path planning
algorithms. It is based on a sampling strategy that discretizes the configuration space
and transforms the path search problem into a graph search, which is easier to handle
than the continuous problem. Dijkstra's algorithm [4] can quickly plan the shortest
path. Its main idea is to find the unvisited node with the shortest current distance,
mark it as visited, update the distances of its adjacent nodes, and repeat until all nodes
have been visited. The algorithm is applicable to graphs whose edge weights are all
non-negative. The A* search algorithm improves Dijkstra's algorithm by combining
uniform-cost search with heuristic search, comprehensively evaluating both the cost
of the path already traversed and the estimated cost of the path still to be searched.
Because of the heuristic, the path planner stores fewer nodes and consumes less
time [5].
The A* search algorithm has shortcomings similar to those of Dijkstra's algorithm.
It requires nodes in the environment to be continuously expanded from the starting
point according to a certain strategy, with the expanded nodes stored in open and
closed lists and search and comparison operations performed repeatedly; search in
high-dimensional spaces therefore demands large amounts of memory and
computational resources. Podsedkowski compared a variety of heuristic functions and
discretized the nodes in the space; when a new obstacle is detected, the obstacle node
and the nodes related to it are removed from the open list, which improves the
algorithm's performance. The authors in [6, 7] proposed an improved A* algorithm
applied to automated guided vehicles, which traverses all nodes on the path and
removes unnecessary nodes and connections. Sedighi [8] combined A* search with
visibility graph search, introducing an application-aware cost function and using the
derived shortest path to provide correct waypoints so that A* can plan an optimal
path subject to nonholonomic constraints.
In addition to artificial potential field and heuristic search, the third kind is the
sampling-based algorithm. In 1998, LaValle proposed the Rapidly-exploring Random
Trees (RRT) algorithm, which randomly samples the free space and expands outward
from the starting point; it can always find a feasible path, but not necessarily the
shortest one [9]. Karaman [10] proved that RRT is not asymptotically optimal.
Kavraki proposed the roadmap algorithm PRM (Probabilistic Road Maps) [11], which
mainly builds a map through sampling and then applies the A* algorithm for path
finding. Karaman also proposed methods based on asymptotically optimal sampling,
including RRG (Rapidly Exploring Random Graph), PRM*, and RRT*; as the number
of samples increases, the solution converges to the global optimum. RRG is an
extension of the RRT algorithm that connects new samples not only to the nearest
node but to all other nodes in range, and searches for a path after building the graph.
The PRM* algorithm attempts to connect roadmap vertices within a range, and RRT*
is an asymptotically optimal form of RRT that uses a rewiring mechanism to
reconnect nodes locally in the tree and maintain the shortest path from the root node
to each leaf node.
Webb et al. [12] combined fixed-final-state, free-final-time optimal controllers with
the RRT* method to ensure the asymptotic optimality and dynamical feasibility of
the path. In this method, the optimal state transition trajectory connecting two states
is computed by solving a linear quadratic regulator problem. Bry and Roy [13]
proposed another method combining RRG with a belief roadmap, in which a partial
order is introduced to weigh beliefs against distances while expanding the graph in
belief space. There are also improvements that speed up the convergence rate, such
as RRT*-Smart [14] and Informed RRT* [15], which show advantages over the
classical RRT algorithm in various scenarios. Building on sampling-based algorithms,
the FAST Lab at ZJU [16] proposed a lightweight but effective topology-guided
kinodynamic planner, TGK-Planner, for fast quadrotor flight with limited on-board
computational resources. The system follows the traditional hierarchical planning
workflow and uses a novel design to improve the robustness and efficiency of the
pathfinding and trajectory optimization submodules. The method proposes a
topology-guided graph that steers the state sampling of the kinodynamic planner
using the rough topology of the environment, thereby improving the efficiency of
exploring safe and dynamically feasible trajectories; the proposed system was
integrated into a fully autonomous quadrotor and verified in a variety of simulated
and real-world scenarios.
• Bionic algorithms and fusion algorithms
Traditional path planning algorithms solve the problem of finding a passable path and
can control mobile robots to avoid obstacles, but they all have common defects. The
traditional path planning method requires a lot of data, so it needs a lot of storage and
computing space. In order to reduce the demand for computer hardware and reduce
the consumption of storage space, researchers often propose innovative solutions,
such as bionic algorithms and fusion algorithms.
The genetic algorithm is a random search algorithm inspired by the mechanisms of
biological evolution in nature. It abstracts the problem to be solved into chromosomes,
defines a population of chromosomes, and evaluates all individuals in the population
by their fitness to the environment, thereby guiding the evolution of the population
and finally iterating toward the optimal solution. The algorithm generates the next
generation of offspring through crossover operators, and many crossover operators
exist for each type of chromosome representation associated with different types of
optimization problems. The crossover operations designed for permutation-based
combinatorial optimization problems are computationally more expensive than in
other cases. This is mainly because duplicate numbers are not allowed in a
chromosome, so the offspring must be legalized after each substring exchange. The
time required to perform the crossover operation increases significantly with the
chromosome size, which can seriously affect the efficiency of these genetic
algorithms.
Koohestani [17] proposed a genetic algorithm using a partially mapped crossover,
representing the path as a chromosome. Numerical experiments on a benchmark
problem show that this crossover operator improves the effectiveness of
permutation-based genetic algorithms and helps handle path planning problems
efficiently. Lamini et al. [18] proposed an improved crossover operator for a
genetic-algorithm path planner that prevents premature convergence and produces
feasible offspring paths with better fitness values than their parents, making the
algorithm converge faster. A new fitness function that considers distance, stability,
and energy is also described. To verify the effectiveness of the scheme, it was applied
to many different environments, and the results show that the genetic algorithm with
the improved crossover operator and fitness function reaches the optimal solution
better than other methods. Combining the genetic algorithm with artificial potential
field techniques can overcome the tendency of the artificial potential field algorithm
to fall into local minima. Li et al. [19] integrated the two algorithms on a rasterized
environment: the genetic algorithm was first used for path planning, and on this basis
the artificial potential field method was used for local dynamic obstacle avoidance,
which handles local minima. This improves the search efficiency and the ability to
solve path planning problems.
Swarm intelligence algorithms, which integrate bionics with artificial intelligence
technology, are also a popular research direction. They mainly simulate the foraging
behavior of groups such as fish schools, bee colonies, and bird flocks, and optimize
the search direction by accumulating the experience of all members of the swarm.
Liang and Lee [20] provided an artificial bee colony path planning method for swarm
robots that reaches the target point without collision using the artificial bee colony
objective function, and proposed a real-time sharing strategy and a method for
adjusting the bee population size so that robots avoid obstacles and other members
of the group. The ant colony algorithm is an intelligent optimization algorithm that
imitates how ants find paths according to pheromone concentration. Because of its
good use of feedback information, strong robustness, and strong distributed
computing ability, it has been applied to the path planning of mobile robots, although
it suffers from slow convergence. Akka and Khaber [21] optimized the ant colony
algorithm by adding a stimulus probability to the pheromone-guided selection,
expanding the exploration range and improving the visibility accuracy. The improved
algorithm also introduces new pheromone update rules and dynamic regulation of the
evaporation rate, which improves the convergence rate and expands the search space.
Su et al. [22] designed an improved ant colony algorithm to address the tendency of
the traditional ant colony algorithm toward path redundancy and convergence to local
optima. They first analyzed the process of ant movement: because ants choose paths
probabilistically, the paths contain redundancy, backward motion, and wave-like
forward motion, so it is difficult to find the optimal path. To solve this problem, path
correction is used to modify the path toward the target point, which effectively
improves the convergence of the ant colony algorithm and yields a shorter optimal
path, while preventing pheromone updates on redundant paths from affecting the
path selection probabilities of later ant colonies. The nodes of the obtained optimal
path are then optimized to improve path smoothness. Simulation tests show that the
improved ant colony algorithm has better convergence and fewer nodes than the
conventional algorithm, which better matches the actual needs of robot motion. To
improve the performance of bird flock search, Cheng, Wang, and Xiong [23]
proposed a new improved bird flock search algorithm in which the search range is
expanded by adding exploration strategies, while the control parameters of step size
and discovery probability are adaptively adjusted according to the improvement rate
of the solution toward the optimal value.
Fuzzy logic is a method that does not specify an exact result but rather a range of
values; it is an implicit control strategy commonly used in motion control systems.
Khaksar, Hong, Khaksar, and Motlagh [24] proposed an algorithm based on
real-time sampling in which a genetic algorithm strategy is adopted to evaluate the
generated samples and tune the controller parameters, broadening the algorithm's
scope of application. Xiang [25] proposed an improved dynamic window approach
(DWA) that feeds the weight coefficients of the original DWA evaluation function
into a fuzzy controller so that the weights adapt automatically, allowing it to cope
with more complex environments and generate smoother paths.
Reinforcement learning methods have been tried in many scenarios and tested in
Atari games, using high-dimensional image information as input and game scores as
the evaluation signal, surpassing human performance through reinforcement learning
strategies [26]. In 2010, Jaradat et al. [27] proposed applying the Q-Learning method
to the path planning problem of mobile robots, limiting the number of states and
reducing the size of the Q table. Shi et al. [28] combined the objective function of
the ant colony algorithm with the Q-Learning algorithm and used pheromone to
spread information among swarm agents, realizing information interaction for
multi-robot path planning.
• Deep learning methods
Thanks to advances in high-performance computing hardware, deep neural networks
have shown great potential for complex computational problems. However, the
practice of deep learning in robotics is usually limited by various constraints. On the
one hand, the workspace is not fully observable and can change at any time [29]. On
the other hand, robots are usually deployed in complex working environments, which
greatly enlarges the sample space. To simplify the problem, the workspace is often
discretized [30, 31]. Owing to advances in graphics processing capabilities, deep
neural networks can also be applied to high-dimensional complex environments and
have been successfully used for obstacle avoidance based on depth images [32].
Some neural-network-based methods have been proposed to solve the autonomous
navigation problem of small UAVs in unknown environments, but the trained
networks are opaque, unintuitive, and difficult to understand, which hinders their use
in the real world. He et al. [33] proposed an interpretable deep neural network path
planning method for autonomous flight of quadrotors in unknown environments. The
navigation problem is described as a Markov decision process, and to better
understand the trained model, a new model interpretation method based on feature
attribution is proposed, generating easily interpretable textual and visual explanations
so that the end user can understand what triggers a particular behavior. In addition, a
global analysis is performed to evaluate and improve the network, and real-world
flight tests verify that the trained path planner can be applied directly to real-world
environments.
Jeong et al. [34] proposed a learning model that simplifies the processing steps: laser
information is fed into a neural network, and the A* algorithm is used to label the
data for supervised learning. After training, the model can directly output robot
motion commands from two-dimensional laser data and target coordinates. Chen
et al. [35] used semantic information obtained from images by deep neural networks
to make behavioral decisions for autonomous vehicles. Wu et al. [36] proposed a
deep neural network approach for real-time online path planning in unknown
cluttered environments and designed an end-to-end deep network architecture for
online 3D path planning that learns a local 3D planning strategy. It is based on
multi-value iterative computation approximated by a recurrent 2D convolutional
neural network to determine actions in 3D space. In addition, a path planning
framework is developed to achieve near-optimal real-time online path planning.
• Deep reinforcement learning method
Deep reinforcement learning combines the abstraction ability of deep learning with
the decision-making strategies of reinforcement learning, making it better suited to
solving practical problems in a way closer to human thinking. Maw et al. [37]
proposed a hybrid path planning algorithm that uses a graph-based path planning
algorithm for global planning and deep reinforcement learning for local planning,
applied to a real-time mission planning system for autonomous UAVs. It mainly
addresses the fact that local planning and collision avoidance are not fully considered
in shortest-path search. The work consists of two main parts, optimal flight path
generation and collision avoidance: a graph-based path planning algorithm is fused
with a learning-based local planning algorithm to develop a hybrid path planning
method that allows UAVs to avoid collisions in real time. The global path planning
problem is solved in the first stage using a novel on-the-fly incremental search
algorithm called Modified On-the-Fly A*, validated in the AirSim environment.
Gao et al. [38] proposed an incremental training mode that employs deep
reinforcement learning to solve the path planning problem. The related graph search
algorithm and reinforcement learning algorithm were first evaluated in a lightweight
two-dimensional environment. A deep reinforcement learning-based algorithm,
including observation states, reward functions, network structures, and parameter
optimization, was then designed in the 2D environment to avoid time-consuming
work in a 3D environment. The designed algorithm is transferred to a simple 3D
environment for retraining to obtain converged network parameters, namely the
weights and biases of the deep neural network. These parameters are used as initial
values to train the model in a complex 3D environment. To improve the
generalization of the model to different scenes, the deep reinforcement learning
algorithm TD3 (Twin Delayed Deep Deterministic Policy Gradient) is adopted. In
summary, UAV path planning in unknown environments still faces the following
challenges:
– The path planning of a UAV needs a three-dimensional algorithm; two-dimensional
path planning cannot handle more complex three-dimensional scenes. In
three-dimensional environments such as corridors, caves, and cities, complex
obstacles and various uncertain factors mean that classical algorithms easily run
into the curse of dimensionality and insufficient computing power, making
real-time path planning difficult. A single algorithm can no longer meet the actual
needs of UAV path planning. To find a better solution, constraints such as
environment, time, and performance must be considered comprehensively; at
present, a main approach is to combine, use, or improve several existing
algorithms.
– Traditional path planning methods for dynamic environments require lidar, a depth
camera, or a combination of the two to collect information about the surroundings,
form an environmental map, and complete the planning under the assumption that
the map information is known, so completing path planning in an unknown
environment is a challenge. For complex three-dimensional environments,
especially cities, creating three-dimensional map information also requires huge
storage space.
– The environmental navigation problem is usually divided into several processes,
such as perception, mapping, and planning, solved in sequence: first, equipment
such as lidar and depth cameras is used to build an environmental map, the point
cloud information is mapped into a grid map, and collision-free trajectories are
then calculated on it. This increases processing latency and reduces the correlation
between the steps.
– Reinforcement learning obtains rewards and punishments through continuous
trial-and-error interaction with the environment. For UAVs this implies high
trial-and-error costs and serious safety risks, and deep reinforcement learning in
real environments requires large amounts of data and is time-consuming and
labor-intensive, so training is usually performed in a simulated environment.
– The rational design of the reward function is challenging. Reinforcement learning
is an end-to-end decision learning model, and the design of the reward function
will affect the learning effect of the strategy.
– The structure of the real environment is uncertain, the lighting is unstable, and the
environment is highly dynamic, all of which differ greatly from the simulation
environment. Adaptive and stable navigation and the transfer from the simulation
environment to the real environment are the main difficulties in the field of UAV
autonomous navigation.
In summary, it is of great significance for UAV path planning to deeply discuss how
to improve the existing reinforcement learning algorithm, reduce the trial and error
cost of deep reinforcement learning training through a realistic physical simulation
engine, and set a reasonable reward function.
The third chapter introduces the basic theory of deep reinforcement learning together
with the DQN algorithm, and emphasizes the improvement of the DQN algorithm
proposed in this chapter to make it perform better in the UAV path planning task.
The fourth chapter compares and analyzes the simulation environments commonly
used in autonomous driving and UAV path planning research. Gazebo and AirSim are
selected as the environments used in this chapter, and the software and hardware
platform parameters are given. Based on these two simulators, simulation
environments for training indoor path planning, outdoor path planning, and dynamic
obstacle avoidance are designed and built, laying the foundation for training the UAV
path planning model.
The fifth chapter combines the UAV model, the reinforcement learning path planning
algorithm, and the simulation environment, and trains the model according to the
system state, action, and reward function of the UAV path planning reinforcement
learning task. Simulated flight tests of indoor path planning and outdoor dynamic
obstacle avoidance are carried out, and the results are analyzed. The path search
results before and after the improvement of the DQN algorithm are compared; by
comparing the number of UAV collisions and the evolution of the loss function, the
improvement in the efficiency and stability of the algorithm is established. The path
planning trajectories of the UAV are analyzed in the experimental verification
environment, and the feasibility and stability of the deep reinforcement learning path
planning method designed in this chapter are verified.
The sixth chapter presents the conclusions and contributions of this chapter, points
out the remaining deficiencies, and gives an outlook on future work in light of the
actual situation.
There are three main types of machine learning algorithms: supervised learning,
unsupervised learning, and reinforcement learning. Supervised learning requires
manually given labels; after learning from a large number of labeled samples, it
makes predictions on new data. Its essence is to label the existing data set, determine
the relationship between input and output, iteratively train an optimal model
according to this relationship, and finally use the model to make predictions on
samples outside the training set. The training data in supervised learning must be
labeled; the limitation is that the subjectivity and limitations of manual labeling, as
well as its low efficiency and high cost, seriously affect the learning effect.
Unsupervised learning can learn from unlabeled samples to mine the latent structural
information in the training data instead of relying on manually given labels. It is
suitable for situations where prior knowledge is unknown or insufficient and manual
labeling is difficult or expensive. The trade-off is that unsupervised learning requires
a huge sample set in order to find structural features without category information
(Table 1).
The training samples of reinforcement learning also need no labels; learning is driven
only by the reward signal given by the environment, which makes it an autonomous
learning mode. Reinforcement learning requires the agent to acquire state information
by exploring the environment. It is not given a solution directly; it must find the
answer autonomously through trial and error in the environment. In each state, the
agent can choose among a variety of actions, each chosen under a greedy strategy or
another strategy such as softmax, and the choices are then evaluated. The evaluation
is reflected by the reward value, but the reward can only be regarded as a score: it
does not determine whether the current selection is correct. Nevertheless, the better
the actions the agent chooses, the more reward it receives. Reinforcement learning
therefore needs no labeling in advance, and the agent can judge the quality of the
final result from the rewards obtained by executing its actions, completing a process
of autonomous learning and policy optimization.
From the above comparison it can be concluded that supervised learning is suitable
for situations where the environment cannot be fully explored and actions cannot be
evaluated, while reinforcement learning is suitable for situations where label
information is noisy or the available sample labels are not accurate enough. However,
it is difficult for reinforcement learning alone to extract features from
high-dimensional states. With the help of deep learning's perception of
high-dimensional input, it becomes possible to learn optimal strategies from
high-dimensional data such as images or videos. Deep reinforcement learning fuses
the abstraction ability of deep learning with the strategies of reinforcement learning,
making it better suited to solving practical problems in a way closer to human
thinking.
Because it is difficult for reinforcement learning to learn policies directly from high-
dimensional raw data, some methods have been proposed to use deep neural networks
for reinforcement learning. As an end-to-end learning approach, deep learning is a
branch of machine learning with powerful feature extraction capabilities; it
overcomes the limitations of traditional machine learning in feature extraction for
image processing, speech analysis, and other fields. When deep learning is used to
study classification and regression problems, it mainly involves four parts: the data,
the model, the loss function, and the optimizer.
For the model, the simplest is a single-layer perceptron, which takes several features
as input, multiplies each input feature by its corresponding weight, and sums the
results, similar to a neuron collecting information through synaptic weighting. The
linear operations involved, namely addition and scalar multiplication, can be
efficiently performed by matrix multiplication [39], as shown in Eq. (1):
y = g\left(W^{T}x + b\right) = g\left(\sum_{i=1}^{m} w_{i}x_{i} + b\right)   (1)
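As a minimal illustration of Eq. (1), the following NumPy sketch evaluates a single-layer perceptron; the feature values, weights, and the choice of a sigmoid activation for g are illustrative assumptions, not values from the chapter.

```python
import numpy as np

def perceptron(x, w, b, g=lambda z: 1.0 / (1.0 + np.exp(-z))):
    """Single-layer perceptron of Eq. (1): y = g(w^T x + b)."""
    return g(np.dot(w, x) + b)

# Illustrative values only (not from the chapter): three input features.
x = np.array([0.5, -1.2, 2.0])
w = np.array([0.3, 0.8, -0.5])
b = 0.1
print(perceptron(x, w, b))
```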
Stacking multiple such layers of perceptrons yields a fully connected network.
However, a fully connected network that is too deep will overfit, and Dropout can
alleviate this problem [40]. A fully connected network is also unsuitable when there
are too many inputs; its input must be selected manually or processed by feature
extraction in advance. When an image is the input, there are more than a million
parameters, and the larger the image and the deeper the network, the larger the
number of parameters. In this chapter, the deep reinforcement learning algorithm is
applied to the UAV path planning task, and the deep neural network acts as a
function approximator in the overall algorithm. The lidar data serve as the input of
the deep network; since they are one-dimensional, they can be fed directly into the
fully connected layer.
Reinforcement learning obtains rewards from interaction with the environment so
that the agent learns the desired behavior. The goal of solving a reinforcement
learning problem is to find the optimal policy for each state. This generally requires
two steps: constructing the mathematical model of reinforcement learning, the
Markov decision process, and then solving for the optimal solution of that Markov
decision model.
Intuitively, in a Markov process the state at each moment depends only on the state at
the previous moment. In a real environment, however, the state at a given time is
usually related to states at earlier times. Therefore, when all historical states are
folded into the current state according to some rule, the Markov property can be
satisfied. Let f_n and f_{n+1} denote the states at two adjacent moments. When
f_{n+1} is related not only to f_n but also to f_{n-1}, f_{n-2}, ..., f_{n-m}, the
historical states are transformed into the current state by letting
f_n = (f_{n-m}, f_{n-m+1}, ..., f_{n-1}, f_n). A typical Markov decision process is
shown in Fig. 4, where d_n and f_n denote the decision action and the state value at
the n-th moment [39], respectively.
The agent is situated in the environment; the state represents the agent's perception
of the current environment, and the agent performs actions to interact with the
environment. After an action is performed, the environment transitions from the
current state to another state with a certain probability, and a reward evaluating the
action is given to the agent according to the reward rules. Through this trial and error
the agent learns from experience and optimizes its action strategy. The state, action,
state transition probability, and reward function are the main components of the
reinforcement learning process, defined by a quadruple <S, A, T, R>, where S is the
system state space, A is the system action space, T : S × A → S is the state
transition probability function, and R is the reward function of the system.
Accordingly, the immediate reward obtained at time t when the system performs
action a_t ∈ A in state s_t ∈ S and transitions to state s_{t+1} ∈ S is
r_t = R(s_t, a_t, s_{t+1}). The interaction process is illustrated in Fig. 5 [39].
As can be seen from the figure, when the agent performs a task, it first obtains the
current state information by perceiving the environment, then selects an action to
perform and interacts with the environment to generate a new state, while the
environment gives a reinforcement signal, that is, a reward, evaluating the quality of
the action, and so on. The reinforcement learning algorithm executes the policy in the
environment to obtain new state data and uses these data, guided by the reward
function, to modify its own behavior policy. After many iterations, the agent learns
the optimal behavior policy required to complete the task.
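The interaction cycle just described can be sketched as the loop below. The Gym-style `reset`/`step` interface and the `agent.select_action`/`agent.observe` methods are hypothetical placeholders used only to make the state-action-reward cycle concrete; they are not the authors' implementation.

```python
def run_episode(env, agent, max_steps=50):
    """One perception-action-reward cycle of the reinforcement learning loop."""
    state = env.reset()                                  # agent perceives the initial state
    total_reward = 0.0
    for _ in range(max_steps):
        action = agent.select_action(state)              # choose an action under the current policy
        next_state, reward, done, _ = env.step(action)   # environment transitions and returns a reward
        agent.observe(state, action, reward, next_state, done)  # store experience for later learning
        total_reward += reward
        state = next_state
        if done:
            break
    return total_reward
```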
Solving the Markov decision problem means finding the distribution of actions in
each state that maximizes the accumulated reward. For model-free settings where the
environment is unknown, value-function-based methods are generally adopted: only
the value function is estimated during the solution, and the optimal policy is obtained
while iteratively solving for the value function. The DQN method used in this chapter
is a value-function-based method.
This section first analyzes the principle of the DQN algorithm, and then proposes
some improvements to the DQN algorithm according to the characteristics of the
UAV path planning task. It mainly studies and analyzes the boundaries and goals
of the interaction between the agent and the environment, so as to establish a deep
reinforcement learning model design that meets the requirements of the task.
Reinforcement learning realizes the learning of the optimal strategy by maxi-
mizing the cumulative reward. Formula (2) is a general cumulative reward model,
which represents the future cumulative reward value of the agent executing the
strategy from time t.
R_{t} = \sum_{k=t}^{n} \gamma^{k-t} r_{k}   (2)
where γ ∈ [0, 1] is the discount rate, used to adjust the effect of rewards in future
states on the value at the current moment [39].
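For illustration, the discounted cumulative reward of Eq. (2) could be computed as follows; the reward sequence and discount rate in the usage line are made-up values.

```python
def discounted_return(rewards, gamma=0.99):
    """Cumulative reward R_t of Eq. (2): sum over k of gamma^(k-t) * r_k, with rewards[0] = r_t."""
    ret = 0.0
    for k, r in enumerate(rewards):
        ret += (gamma ** k) * r
    return ret

print(discounted_return([1.0, 0.0, 0.0, 10.0], gamma=0.9))  # 1.0 + 0.9**3 * 10.0 = 8.29
```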
In the concrete algorithm design, the reinforcement learning model establishes a
value function based on the expectation of the accumulated reward in order to
evaluate the policy. Value functions include the action value function Q^π(s, a) and
the state value function V^π(s). The action value function Q^π(s, a) in Eq. (3) gives
the expected return of taking action a in state s and then following policy π; the state
value function V^π(s) in Eq. (4) gives the expected return of following policy π from
state s.
Substituting Eq. (3) into Eq. (5) and writing it in the iterative form shown in Eq. (8),
where s′ and a′ are the successor state and action of s and a respectively, Eq. (8) is
called the Bellman optimality equation for Q*(s, a).
It can be seen that the value function includes both the value at the next moment and
the immediate reward, which means that when reinforcement learning evaluates the
policy at a given moment, it considers not only the current value but also the
long-term value over future moments, that is, the possible cumulative reward. This
avoids the limitation of focusing only on the size of the immediate reward while
ignoring long-term value, which would not yield the optimal policy. The Bellman
equation is solved iteratively for the MDP, and the optimal action policy π*(s) is then
obtained from Q*(s, a) as shown in Eq. (9) [39].
When modeling and fitting the value function, the neural network is very sensitive to
the sample data, whereas the sequential data produced by executing the reinforcement
learning policy are strongly correlated. This seriously affects the fitting accuracy of
the neural network and can cause the iterative policy optimization process to fall into
local minima or even fail to converge. To overcome these problems, deep
reinforcement learning algorithms generally use experience replay to weaken the
coupling between data collection and policy optimization and to remove the
correlation between samples. Specifically, during the reinforcement learning process
the data obtained from the interaction between the agent and the environment are
temporarily stored in a database, as shown in Fig. 6, and the data are then drawn by
random sampling to guide the neural network update.
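A common way to realize the experience replay pool described above is a fixed-capacity buffer with uniform random sampling, as in this sketch; the capacity value is an assumption.

```python
import random
from collections import deque

class ReplayBuffer:
    """Experience replay pool D: stores transitions and returns decorrelated random mini-batches."""

    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)   # old transitions are discarded once capacity is reached

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform random sampling breaks the temporal correlation of consecutive transitions.
        batch = random.sample(self.buffer, batch_size)
        states, actions, rewards, next_states, dones = zip(*batch)
        return states, actions, rewards, next_states, dones

    def __len__(self):
        return len(self.buffer)
```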
In addition, to obtain the optimal policy it is desirable to obtain as high a reward
value as possible when selecting actions, while also giving the model a certain
exploration ability to ensure coverage of the state space. The traditional greedy
method obviously cannot satisfy both requirements, so soft policies are generally
used for action selection. The ε-greedy action selection method works as follows:
when executing an action, the high-value action given by π*(s) is selected with
probability (1 − ε), and a random action from the action space is selected with
probability ε, as expressed in Eq. (12):
\pi_{\varepsilon}(s) = \begin{cases} \pi^{*}(s), & \text{with probability } 1-\varepsilon \\ \text{random } a \in A, & \text{with probability } \varepsilon \end{cases}   (12)
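Eq. (12) can be implemented in a few lines; the sketch below assumes the Q values for all discrete actions of the current state are already available as an array.

```python
import random
import numpy as np

def epsilon_greedy(q_values, epsilon):
    """Eq. (12): exploit the highest-value action with probability 1 - epsilon,
    otherwise explore a random action from the action space."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))   # random exploration
    return int(np.argmax(q_values))              # greedy exploitation
```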
The resulting DQN algorithm is shown in Table 2, where M is the maximum number
of training steps, the subscript j indexes the state transition samples in the mini-batch
of size N_batch, s_i is the environmental state of the mobile robot, a_i is an
executable action in the action space, and D is the experience replay pool.
In the DQN algorithm, the state transition probability is not considered; only the state
space, the action space, and the reward function of the agent are described, and these
elements must be designed according to the specific task [40]. For the path planning
reinforcement learning task, it is first necessary to design a state space based on the
sensor parameters, an action space based on the UAV's motion characteristics, and a
reward function based on the characteristics of path planning, so as to build a UAV
path planning reinforcement learning system. To improve the robustness of the UAV's
path planning in an unknown environment and the learning efficiency of the path
planning model, a reward function suitable for the path planning task is designed
using the obstacle position information and target position information of the UAV's
environment, fully considering the influence of position and orientation. In addition,
this chapter supplements and improves the reward function based on the idea of the
artificial potential field method, in which the range of the repulsive field affects path
planning: through an analysis of the UAV's motion and collisions, a direction-penalty
obstacle avoidance term is added so as to evaluate the UAV's movement more
effectively and guide the UAV to reach the target position quickly while avoiding
obstacles.
Since the DQN method generally overestimates the Q values of the action value
function, it suffers from an over-optimization problem: the estimated value function is
larger than the true value function, and the error grows with the number of actions,
as shown in Fig. 7. Two networks are therefore generally used, an online Q network
and a target Q network, which perform action selection and action evaluation with
different value functions. The two networks have exactly the same structure. To
ensure the convergence and learning ability of the training process, the parameters of
the target Q network are updated more slowly than those of the online Q network; in
this section the target Q network is updated every 300 steps by default, which can be
adjusted according to actual training needs. Because the learning time of the agent,
that is, the cost of network training, increases with the model complexity, this chapter
designs a network structure with low model complexity that still meets the task
requirements and builds it with the Keras layered API; to avoid overfitting, random
deactivation (Dropout) is added after the fully connected layers.
Based on the above considerations, and given that the input of the deep network is
the feature state of the region perceived around the drone, the network input size and
scale are matched accordingly: the final network model consists of three fully
connected layers and one hidden layer. According to the actual requirements of the
mobile robot control task, the characteristic state of the robot is used as the input of
the network, the network outputs the Q values of the 7 actions, and the action with
the largest Q value is selected for execution. The network model is shown in Fig. 8.
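A lightweight Q-network of the kind described above might be sketched in Keras as follows. The 26-dimensional input (24 down-sampled lidar ranges plus target distance and angle), the layer widths, and the dropout rate are assumptions for illustration rather than the authors' exact configuration; the 7 outputs and the periodically synchronized target network follow the text.

```python
from tensorflow.keras import layers, models

STATE_DIM = 26    # assumed: 24 down-sampled lidar ranges + target distance + target angle
NUM_ACTIONS = 7   # fast left, left, straight, right, fast right, ascend, descend

def build_q_network():
    """Lightweight fully connected Q-network with Dropout to limit overfitting."""
    model = models.Sequential([
        layers.Dense(64, activation="relu", input_shape=(STATE_DIM,)),
        layers.Dropout(0.2),
        layers.Dense(64, activation="relu"),
        layers.Dropout(0.2),
        layers.Dense(NUM_ACTIONS, activation="linear"),   # one Q value per discrete action
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

# Online and target networks share the same structure; the target network is
# refreshed from the online network only every few hundred training steps.
q_network = build_q_network()
target_network = build_q_network()
target_network.set_weights(q_network.get_weights())
```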
The distance from the UAV to surrounding obstacles is the most intuitive indicator to
reflect the environmental state of the UAV. In this chapter, the current environmental
state is detected based on the distance information from the UAV to the surrounding
obstacles, and the lidar is used as a sensor to detect the relative position and distance
between the UAV and the obstacles. Therefore, the detection distance information
diagram as shown in Fig. 9 is designed.
The feature state information is mainly composed of three parts: sensor feedback
information, target position information, and motion state information, forming a
standard finite Markov process so that deep reinforcement learning can be used for
this task. The state space is represented by a one-dimensional array of lidar distance
values. The lidar provides at least 360 depth channels over the circular area centered
on the UAV; so many depth values are not needed for the task in this chapter, and an
overly large state space would increase the computational cost and weaken the
learning ability of the model. Therefore, as shown in Fig. 9, this chapter sets the
sampling interval of the lidar to 15°, yielding a down-sampled lidar data array of
length 24, where the first interval corresponds to the forward direction of the UAV;
the distance and included angle between the UAV and the target position are then
appended. The state space format is shown in Eq. (13), where state is the state space,
l_i is the distance value of the i-th lidar interval, d is the distance between the UAV
and the target area, and a is the angle between the UAV's forward direction and the
target.
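One possible way to assemble the state of Eq. (13) is shown below: a lidar scan is down-sampled every 15° to 24 range values and the target distance and heading angle are appended. The assumption of one beam per degree and the function name are illustrative.

```python
import numpy as np

def build_state(scan, target_distance, target_angle, step_deg=15):
    """Down-sample a lidar scan at 15-degree intervals and append the target
    distance d and heading angle a, as in Eq. (13)."""
    scan = np.asarray(scan, dtype=np.float32)   # assumed: one range reading per degree (360 beams)
    ranges = scan[::step_deg]                   # 360 / 15 = 24 intervals
    return np.concatenate([ranges, [target_distance, target_angle]])

# Illustrative usage: a synthetic scan reading 5 m in every direction.
state = build_state(np.full(360, 5.0), target_distance=3.2, target_angle=0.4)
print(state.shape)   # (26,)
```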
The action space should be designed so that the drone can explore the environment
as fully as possible in pursuit of rewards. The UAV controls its heading through
autopilot commands; by defining fixed yaw-rate changes plus ascent and descent
speeds, the movement of the UAV can essentially cover the entire exploration space
through speed and angular velocity control. The action space, that is, the range of
UAV actions, is shown in Table 3.
The DQN algorithm is discrete, so the UAV's actions are divided into seven options:
fast left turn, left turn, straight ahead, right turn, fast right turn, ascent, and descent,
with yaw rates of ±1.2 and ±0.6 rad/s for the fast and normal turns, 0 for straight
ahead, and ascent and descent speeds of 0.5 m/s and −0.5 m/s. Speed commands are
sent to the drone at a fixed frequency, so the actual path of the drone consists of
continuous arcs and polylines.
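The seven discrete actions can be mapped to velocity commands roughly as follows; the sign convention for yaw (positive = left turn, as in ROS) is an assumption, while the magnitudes follow the values given above.

```python
# Discrete action index -> (yaw rate [rad/s], vertical speed [m/s]).
# Sign convention (positive yaw rate = left turn) is assumed; forward speed is
# handled separately by the autopilot at a fixed command frequency.
ACTIONS = {
    0: ( 1.2,  0.0),   # fast left turn
    1: ( 0.6,  0.0),   # left turn
    2: ( 0.0,  0.0),   # straight ahead
    3: (-0.6,  0.0),   # right turn
    4: (-1.2,  0.0),   # fast right turn
    5: ( 0.0,  0.5),   # ascend
    6: ( 0.0, -0.5),   # descend
}

def action_to_command(action_index):
    """Translate a discrete DQN action into a velocity command for the autopilot."""
    yaw_rate, vz = ACTIONS[action_index]
    return {"yaw_rate": yaw_rate, "vz": vz}
```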
The flight mission of a UAV is generally flown along a planned route composed of
multiple connected waypoints, so the mission can be decomposed into multiple path
planning tasks between successive waypoints. The UAV starts from the starting point
and passes through the designated waypoints in turn; when the UAV in flight
encounters an obstacle and there is a danger of collision, it needs to avoid the
obstacle. Reinforcement learning is used for path planning, and since the UAV's
motion is selected from the action space, the resulting path is guaranteed to be
flyable. At the same time, the artificial potential field method is introduced into the
reinforcement learning algorithm: as shown in Fig. 10, an attractive potential is
assigned to the target waypoint and a repulsive potential to the obstacles, and the
multirotor flies under the combined action of the attraction of the target waypoint
and the repulsion of the obstacles, following the desired path to the target waypoint
while avoiding obstacles along the way. The artificial potential field method is
embodied in the reward function of reinforcement learning. The design of the
improved DQN algorithm is introduced below.
As shown in Fig. 11, the movement of the drone in the surrounding environment is
modeled as movement in an abstract electric field: the target point is assumed to
carry negative charge, while the drone and the obstacles are assumed to carry positive
charge. Because the target point and the drone carry opposite charges, an attractive
force acts between them; because the obstacles and the drone carry the same charge,
a repulsive force acts between them. The movement of the drone is guided by the
resultant force in space, where r_1 is the distance from the drone to the target point,
r_2 is the distance from the drone to the obstacle, Q_G is the amount of negative
charge assigned to the target point, Q_O is the amount of charge assigned to the
obstacle, Q_U is the amount of positive charge assigned to the UAV, k_a, k_b and
k_c are proportional coefficients, φ is the angle between the direction of the
attractive force and the movement direction of the UAV, and \vec{U}_B is the
resultant force acting on the UAV. In practice, to keep the UAV from oscillating back
and forth instead of moving toward the target point while avoiding collisions, the
attraction of the target point should be stronger than the repulsion of the obstacles,
so Q_G is set larger than Q_O to ensure that the UAV can avoid obstacles and still
reach the target point. As the drone approaches the target point the attractive force
increases, and as it approaches an obstacle the repulsive force increases. The reward
function of the DQN deep reinforcement learning algorithm is accordingly expressed
as the following three parts:
Attractive (gravitational) reward function (14), repulsive reward function (15), and direction reward function (16):

R_{UG} = \vec{U}_{G} \cdot k_{a} = k_{a} \frac{Q_{U} Q_{G}}{r_{1}^{2}} \frac{\vec{UG}}{|\vec{UG}|}   (14)

R_{UO} = \vec{U}_{O} \cdot k_{b} = k_{b} \frac{Q_{U} Q_{O}}{r_{2}^{2}} \frac{\vec{UO}}{|\vec{UO}|}   (15)

R_{\varphi} = k_{c} \arccos \frac{(\vec{U}_{O} + \vec{U}_{G}) \cdot \vec{U}_{C}}{|\vec{U}_{O} + \vec{U}_{G}| \, |\vec{U}_{C}|}   (16)

where \vec{U}_{C} is the force actually acting on the UAV, and \arccos \frac{(\vec{U}_{O} + \vec{U}_{G}) \cdot \vec{U}_{C}}{|\vec{U}_{O} + \vec{U}_{G}||\vec{U}_{C}|} is the angle between the actual motion direction and the expected motion direction. The overall reward function is shown in Eq. (17):

R = R_{UG} + R_{UO} + R_{\varphi}   (17)
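Equations (14)-(17) could be turned into a scalar reward roughly as in the sketch below. The charge magnitudes and coefficients are placeholder values, Q_G is chosen larger than Q_O as required in the text, and reducing the vector force terms to scalars is one possible reading of the formulation, so this should be treated only as an illustration.

```python
import numpy as np

def potential_field_reward(p_uav, p_goal, p_obs, v_uav,
                           q_u=1.0, q_g=2.0, q_o=1.0,
                           k_a=1.0, k_b=1.0, k_c=0.1):
    """Sketch of Eqs. (14)-(17): attraction toward the goal, repulsion from the
    nearest obstacle, and a penalty on the angle between the resultant force and
    the actual motion direction. All coefficients are placeholders; Q_G > Q_O so
    that the attraction of the target dominates."""
    ug = p_goal - p_uav                      # vector from UAV to goal
    uo = p_uav - p_obs                       # vector from obstacle to UAV (repulsion direction)
    r1 = np.linalg.norm(ug) + 1e-6
    r2 = np.linalg.norm(uo) + 1e-6

    f_att = k_a * q_u * q_g / r1**2 * ug / r1    # attractive force, Eq. (14)
    f_rep = k_b * q_u * q_o / r2**2 * uo / r2    # repulsive force, Eq. (15)
    f_total = f_att + f_rep

    # Eq. (16): angle between the resultant force and the actual motion direction.
    cos_phi = np.dot(f_total, v_uav) / (np.linalg.norm(f_total) * np.linalg.norm(v_uav) + 1e-6)
    r_phi = -k_c * np.arccos(np.clip(cos_phi, -1.0, 1.0))   # sign of the penalty is an assumption

    # Eq. (17): combine attraction (reward), repulsion (penalty) and the heading term.
    return k_a * q_u * q_g / r1**2 - k_b * q_u * q_o / r2**2 + r_phi
```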
This chapter has analyzed the advantages and disadvantages of the three major types
of machine learning, supervised learning, unsupervised learning, and reinforcement
learning, discussed the training methods and limitations of deep learning and
reinforcement learning, introduced the basic theory of both, and presented the DQN
algorithm in deep reinforcement learning, focusing on its improvement through
combination with the artificial potential field algorithm from classical path planning
so that it performs better in the UAV path planning decision task. Constraining the
agent to discrete actions makes the algorithm easier to converge at the cost of
reduced maneuverability. Using the fusion of the electric potential field and deep
reinforcement learning, obstacles generate a repulsive force and the target point
generates an attractive force, which are combined in the reward function to guide the
drone to reach the target point without collision and make the algorithm converge
quickly. A lightweight network structure with three fully connected layers and one
hidden layer is adopted, which enhances real-time performance. Finally, the
feasibility of the algorithm is verified.
This chapter covers training and testing. It applies the improved DQN algorithm to
the UAV path planning task and selects common UAV mission scenarios to design
multiple sets of experiments for feasibility testing, training, and verification of the
algorithm. Using the training model of the improved DQN algorithm, a feasibility
test of indoor path planning in the Gazebo environment and training of dynamic
obstacle avoidance were carried out. After training is complete, the system no longer
relies on the reinforcement learning strategy and outputs action values through the
deep neural network alone.
With the UAV model, the reinforcement learning path planning algorithm, and the
simulation environment in place, the system state, action, and reward function of the
reinforcement learning task are designed for UAV path planning. The process of this
experimental study is shown in Fig. 12. First, the UAV and the simulation
environment are initialized and the UAV is loaded into the environment; as the UAV
flies in the simulation environment, the state space S, the action space A, and the
corresponding reward R of the previous moment are obtained. The reward at the
current moment is stored in the data container, whose contents are updated in real
time as the drone moves. When enough samples have been collected, the training
process starts: the decision network in DQN fits the Q values, and the action in the
action space with the highest expected value is selected as the UAV's command.
When the UAV approaches or collides with an obstacle the reward R is small, and
when it approaches or reaches the target point the reward R is large; as training
progresses, the drone's behavior therefore tends toward avoiding obstacles and
reaching the target point. When the reward value meets the requirement or the set
number of training steps is reached, the weights and parameters of the best deep
neural network are saved and then verified in the test environment to check the
reliability of the model. After training, the weight and parameter files of the network
model are available; in the testing and application stage there is no need to train the
model or the reinforcement learning strategy, and the state information only needs to
be sent to the deep neural network module to output the action value.
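The training flow just described might be organized as the following loop, reusing the replay buffer, ε-greedy selection, and online/target networks sketched earlier; the hyperparameter values, the standard DQN temporal-difference target, and the save path are assumptions rather than the authors' exact settings.

```python
import numpy as np

GAMMA = 0.99        # discount rate (placeholder)
BATCH_SIZE = 64     # mini-batch size (placeholder)
TARGET_SYNC = 300   # target network refresh interval, as stated in the text
MIN_SAMPLES = 1000  # start training once enough transitions are stored (placeholder)

def train(env, q_network, target_network, buffer, episodes=1000, max_steps=50, epsilon=1.0):
    step_count = 0
    for _ in range(episodes):
        state = env.reset()
        for _ in range(max_steps):
            q_values = q_network.predict(state[None, :], verbose=0)[0]
            action = epsilon_greedy(q_values, epsilon)
            next_state, reward, done, _ = env.step(action)
            buffer.add(state, action, reward, next_state, done)
            state = next_state
            step_count += 1

            if len(buffer) >= MIN_SAMPLES:
                s, a, r, s2, d = buffer.sample(BATCH_SIZE)
                s, a, r = np.array(s), np.array(a), np.array(r)
                s2, d = np.array(s2), np.array(d, dtype=np.float32)
                # TD target: r + gamma * max_a' Q_target(s', a') for non-terminal transitions.
                q_next = target_network.predict(s2, verbose=0).max(axis=1)
                targets = q_network.predict(s, verbose=0)
                targets[np.arange(BATCH_SIZE), a] = r + GAMMA * q_next * (1.0 - d)
                q_network.train_on_batch(s, targets)

            if step_count % TARGET_SYNC == 0:
                target_network.set_weights(q_network.get_weights())
            if done:
                break
        epsilon = max(0.05, epsilon - 0.01)          # decay exploration as described in the chapter
    q_network.save_weights("dqn_uav.weights.h5")     # file name is an assumption
```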
This chapter mainly conducts three parts of experiments. The first part verifies the
feasibility of the algorithm in the indoor path planning environment of Gazebo. The
second part is agent training: the UAV path planning task is carried out in the training
environment, where the environment is observed, the observations are sent to the
neural network, and the network model is trained according to the DQN algorithm.
The third part is the agent test, which loads the trained model into the test
environment, executes the path task according to the network model obtained during
training, and finally counts the completion of the task. To train and test the UAV path
planning algorithm proposed above, Chap. 4 constructed a variety of training and
testing scenarios. Training in all environments is designed and developed in Python
based on TensorFlow, ROS, and Pixhawk under the Ubuntu 20.04 system.
To determine the path planning ability of the network model obtained after training,
testing is required: in the same simulation environment as used for training, the path
is determined entirely by the deep reinforcement learning network model, and 100
test rounds are designed. Each round begins with the drone's initialization and ends
when the drone reaches the target point. A total of 100 rounds are carried out, with
the target point placed within the range the drone can reach in 50 training steps; that
is, a limited number of points are defined within a certain range, and the drone's
target point appears randomly at one of these defined locations. At the beginning of
each round, the drone is initialized and returns to the set position.
To evaluate the path planning performance of the algorithm, this chapter designs
three evaluation indicators as quantitative measures of the algorithm's effect, defined
as follows:
• Loss function: the loss function is the error between the target value produced by
the target network and the predicted value output by the training network; the
training process uses gradient descent to reduce this value, and its decrease
indicates that the network is approaching convergence. The change of the loss
function during testing is recorded to judge the rate of convergence and the size
of the error.
• Maximum Q value: during training, each time a batch of observations is drawn
from the experience pool, the current state is fed into the training network to
obtain the Q values, the corresponding next state is fed into the target network,
and the maximum Q value is selected. The change of the maximum Q value
during testing is recorded to judge the learning effect of the reinforcement learning
strategy.
• Success rate: each path planning run in which the UAV finally reaches the target
smoothly is regarded as successful; if the target has not been reached after more
than 50 actions, or the drone moves beyond the specified region of motion during
execution, for example by hitting an obstacle or leaving the permitted area, the
run is regarded as unsuccessful. The number of successes in one hundred rounds
is counted and the success rate is computed.
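The success-rate indicator can be computed with a greedy test loop of the following shape; the 100 rounds and 50-action limit follow the text, while the environment interface and the reached_target flag are assumptions.

```python
import numpy as np

def evaluate(env, q_network, rounds=100, max_actions=50):
    """Run the trained policy greedily and count the fraction of rounds in which
    the UAV reaches the target within the action limit."""
    successes = 0
    for _ in range(rounds):
        state = env.reset()   # the drone returns to the set start position
        for _ in range(max_actions):
            q_values = q_network.predict(state[None, :], verbose=0)[0]
            state, _, done, info = env.step(int(np.argmax(q_values)))
            if done:
                # 'reached_target' is a hypothetical flag; collisions or leaving the
                # permitted area also end the round but do not count as success.
                successes += int(info.get("reached_target", False))
                break
    return successes / rounds
```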
The path planning capability of the UAV 3D path planning algorithm is first verified
in an indoor enclosed space without obstacles. The UAV collects obstacle information
through lidar, obtaining the distances of nearby obstacles relative to the agent as well
as the angle and distance to the target point. Training is performed according to the
method proposed in this chapter, with rewards and punishments following the
artificial potential field scheme throughout the training process. The network model
parameter settings are shown in Table 4. The initial value of the greedy coefficient
Epsilon is set to 1.0 and gradually decreases by 0.01 until it reaches 0.05, after which
it no longer decays. The deep neural network architecture consists of one input layer,
two fully connected layers, one hidden layer followed by another fully connected
layer, and finally the output layer. This chapter therefore designs a network structure
of low model complexity that meets the task requirements and builds it with the
Keras layered API; to avoid overfitting, random deactivation (Dropout) is added after
the fully connected layers. Since the input of the deep network is the feature state of
the region perceived around the UAV, the network input size and scale are matched
accordingly, and the final network model consists of three fully connected layers and
one hidden layer. According to the actual requirements of the mobile robot control
task, the characteristic state of the UAV is used as the network input, the network
outputs the Q values of the 7 actions, and the action with the largest Q value is
selected for execution.
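The Epsilon schedule described above (start at 1.0, decrease by 0.01 per episode, floor at 0.05) can be written as a small generator, for example:

```python
def epsilon_schedule(start=1.0, step=0.01, floor=0.05):
    """Greedy-coefficient decay: start at 1.0 and decrease by 0.01 until reaching 0.05."""
    eps = start
    while True:
        yield eps
        eps = max(floor, eps - step)

# Usage: sched = epsilon_schedule(); next(sched) -> 1.0, 0.99, 0.98, ..., 0.05, 0.05, ...
```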
Training is carried out according to the above method, and a UAV reinforcement
learning deep network model with path planning ability is obtained. Figure 13 shows
the change of the model's Epsilon value, and Fig. 14 shows the change of the model's
maximum Q value with the number of training steps. It can be concluded that as the
number of training steps increases, the maximum Q value gradually increases and the
model error tends to stabilize. Once the reinforcement learning path planning model
has been obtained, there is no need to train the model or the reinforcement learning
strategy further; the state information only needs to be sent to the deep neural
network module to output the action value. The motion command is selected from
the action space, so it necessarily conforms to the kinematic model, which enables
the application to three-dimensional path planning (Fig. 15).
In this path-planning task, the UAV starts from the starting point, moves toward the target point, and deflects itself toward the target. When approaching an obstacle it performs an obstacle-avoidance action, and it avoids colliding with the boundary or obstacles while always maintaining a safe distance. This indicates that the path-planning strategy has been learned and proves that the reinforcement learning strategy designed in this chapter can realize path planning for the UAV (Table 5).
After training is completed, the parameters and weights of the 1000-step training model are loaded; the state information is then simply sent to the deep neural network module, which outputs the action value. The action value is sent to the Pixhawk through ROS to control the drone's movement. Two sets of tests are conducted in which the starting point of the drone is (0, 0, 2) and the target points are (2, 0, 0) and (−2, −2, 2). Each group is tested 100 times, for a total of 200 tests, and the success rate of the drone reaching the target point and the number of collisions are counted. If the drone collides or stops, the next test is started. The test results are shown in the table below. As shown in Fig. 16, the results indicate that the UAV has a certain path-planning capability in the indoor environment.
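For illustration, a minimal sketch of forwarding the selected action to the flight controller through ROS is given below. The chapter only states that the action value is sent to the Pixhawk through ROS; the use of MAVROS velocity setpoints and the particular mapping from the 7 discrete actions to velocities are assumptions, not the chapter's exact interface.

```python
#!/usr/bin/env python
# Sketch of sending the selected DQN action to the Pixhawk over ROS.
# The MAVROS velocity-setpoint topic and the action-to-velocity table are
# illustrative assumptions; the chapter does not specify its exact interface.
import rospy
from geometry_msgs.msg import Twist

# Hypothetical mapping from the 7 discrete actions to body-frame velocities (m/s).
ACTION_TABLE = {
    0: (0.5, 0.0, 0.0),   # forward
    1: (-0.5, 0.0, 0.0),  # backward
    2: (0.0, 0.5, 0.0),   # left
    3: (0.0, -0.5, 0.0),  # right
    4: (0.0, 0.0, 0.5),   # up
    5: (0.0, 0.0, -0.5),  # down
    6: (0.0, 0.0, 0.0),   # hover
}

def send_action(pub, action):
    vx, vy, vz = ACTION_TABLE[action]
    cmd = Twist()
    cmd.linear.x, cmd.linear.y, cmd.linear.z = vx, vy, vz
    pub.publish(cmd)

if __name__ == "__main__":
    rospy.init_node("dqn_action_bridge")
    pub = rospy.Publisher("/mavros/setpoint_velocity/cmd_vel_unstamped",
                          Twist, queue_size=1)
    rate = rospy.Rate(10)  # publish setpoints at 10 Hz
    while not rospy.is_shutdown():
        send_action(pub, 6)  # placeholder: hover until an action is selected
        rate.sleep()
```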
Building on the previous section, and in order to better verify the performance of the improved DQN algorithm for UAV path planning, an improved DQN path-planning experiment was carried out in the indoor simulation environment. The classic DQN algorithm is configured to reward reaching the target point, punish collisions, and give a reward based on the distance to the target point; under the same environment, this traditional DQN algorithm serves as the comparative experiment. A total of 100 test rounds is designed: drone initialization marks the start of a test round, and the drone reaching the target point is the end condition of the round. The target point is placed within the range the drone can reach in 50 steps; that is, a limited number of points is defined within a certain range, and the target point appears randomly at one of these defined locations. At the beginning of each round, the drone initializes and returns to the set position. The loss change and the average path length are selected as the comparison criteria.
The reward function of the classic DQN algorithm is shown in formula (18):

R = R_a + R_d cos θ + R_c + R_s   (18)
(Figure panels: (c) the drone moves towards the target point; (d) the drone reaches the target point.)
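As a rough illustration of how a shaped reward of the form (18) might be computed, consider the sketch below; the individual reward constants and the way the angle θ is obtained are assumptions, since the chapter does not list them here.

```python
import math

# Illustrative sketch of a shaped reward in the spirit of formula (18):
# R = R_a + R_d * cos(theta) + R_c + R_s.  The constants and the exact
# definition of each term are assumptions, not the chapter's values.
def reward(reached_target, collided, theta, step_penalty=-0.01,
           r_arrival=10.0, r_distance=1.0, r_collision=-10.0):
    r_a = r_arrival if reached_target else 0.0   # reward for reaching the target
    r_d = r_distance * math.cos(theta)           # reward shaped by heading angle to target
    r_c = r_collision if collided else 0.0       # punishment for a collision
    r_s = step_penalty                           # small per-step cost
    return r_a + r_d + r_c + r_s
```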
Compared with the classic DQN algorithm, the improved DQN algorithm achieves a shorter average path length and a higher success rate, while the classic DQN algorithm still cannot reach the target point after 100 rounds. The comparison of the success-rate curves shows that the improved DQN algorithm performs better than the classic DQN algorithm in the obstacle-avoidance experiment.
Figure 17 shows the loss curves obtained by the DQN algorithm and the improved DQN algorithm in the indoor 3D path-planning environment of the UAV. The red curve represents the loss trend of the improved DQN algorithm, while the blue curve represents the loss trend of the classic DQN algorithm. The figure reflects the error obtained by the UAV at each step after training with the two reinforcement learning algorithms. It can be seen that the improved DQN algorithm has a smaller error and faster convergence.
Figures 18 and 19 show the path-planning trajectories of the UAV in the test environment. The improved DQN algorithm reaches the target point along the shortest path achievable in the discrete action space, while the loss curve obtained for the classic DQN algorithm remains higher than that of the improved DQN algorithm, showing that classic DQN is inferior to the improved DQN algorithm in terms of stability.
5 Conclusions
UAVs have become working platforms for a variety of environments and purposes. This chapter takes a rotor UAV as the experimental platform to study deep reinforcement learning path planning for UAVs. ROS is used for communication, decision instructions are sent to the Pixhawk to control the UAV and achieve path planning, and a path-planning method that improves the DQN algorithm is proposed. This algorithm combines the advantages of the artificial potential field
Drone Shadow Cloud: A New Concept
to Protect Individuals from Dangerous Sun
Exposure in GCC Countries
Abstract The peak temperature in the Gulf region is around 47° in the summer. The hot season lasts for six months in this region, starting at the end of April and ending in October. The average temperature in the same period exceeds 44° in the USA and Australia. High temperatures worldwide affect the body's capability to function outdoors. "Heat stress" refers to excessive amounts of heat that the body cannot handle without suffering physiological degeneration. Heat stress due to high ambient temperature seriously threatens workers worldwide. This issue increases the risk of limited physical ability, discomfort, injuries, and heat-related illnesses. According to Industrial Safety & Hygiene News, a worker's body must maintain a core temperature of 36° to maintain its normal function. Many companies have their workers wear UV-absorbing clothing to protect themselves from the sun's rays. This chapter presents a new concept to protect construction workers from dangerous sun exposure in hot temperatures: a flying umbrella drone with a UV-blocker fabric canopy that provides a stable shaded area. The solution minimizes heat stress and protects workers from UV rays when they work outdoors. According to the sun's position, the flying umbrella moves through an open space, providing shade for the workers.
Keywords Safety · Heat stress · Smart flying umbrella · Drone · Outdoor worker
1 Introduction
Global temperatures are rising, and how to reverse the trend is the subject of discus-
sion. A special report published by the Intergovernmental Panel on Climate Change
(IPCC) in October 2018 addressed the impacts of global warming of 1.5◦ , as shown
Fig. 2 Percentage of working hours lost to heat stress by subregion since 1995 and projected for
2030 [23]
In [32], the authors present a worldwide analysis showing that heat exposure has killed hundreds of U.S. workers. At least 384 workers in the United States have died from environmental heat exposure in the past decade. Across 37 states, the count includes people working essential jobs, such as farm workers in California, construction workers in Texas, and tree trimmers in North Carolina and Virginia. This shows that heat stress is not limited to GCC countries and is prevalent throughout the United States.
In [26], the authors, together with reporters from CJI and NPR, examined worker heat deaths recorded by OSHA between 2010 and 2020. They compared the high temperature of each incident day with historical averages for the previous 40 years. Most of the deaths happened on days that were unusually hot for that date, and approximately two-thirds of the incidents occurred on days when the temperature reached at least 50°.
In [12], the authors present a thorough longitudinal study describing the heat exposure and renal health of Costa Rican rice workers over three months of production. In this study, 72 workers with various jobs in a rice company provided pre-shift urine and blood samples at baseline and three months later. NIOSH guidelines and the WBGT index were used to calculate metabolic and ambient heat loads. Based on the results, the study recommended that efforts be made to provide adequate water, rest, and shade to heat-exposed workers in compliance with national regulations.
In [31], the authors explain that during the stages of the milk-manufacturing cycle, Italian dairy production exposes workers to uncomfortable temperatures and potentially to heat shock. That study aimed to assess the risks of heat stress for dairy workers who process buffalo milk in southern Europe.
The United States has a high rate of heat-related deaths, although they are generally considered preventable. In the United States, heat-related deaths averaged 702 per year between 2004 and 2018, as shown in Fig. 4. As part of the CDC's effort to study heat-related deaths by age group, gender, race/ethnicity, and urbanization level, and to evaluate comorbid conditions associated with heat-related deaths, the CDC analyzed mortality data from the National Vital Statistics System (NVSS). The highest heat-related mortality rates were observed among males 65 years and older, American Indians/Alaska Natives living in nonmetropolitan counties, and residents of large central metro counties [6]. To counteract this risk, both legal texts and technological solutions exist. Nations agree to stop all work if the wet-bulb globe temperature (WBGT) exceeds 32.1° in a particular workplace, regardless of the time of day. The WBGT index takes into account air temperature, humidity, sunlight, and wind strength. Construction workers should not stay outside in the heat for long periods because they will face serious health problems [5, 19]. In [8], the authors explain that the epidemic of chronic kidney disease in Central America is largely attributed to heat stress and dehydration from strenuous work in hot environments. That study describes efforts to reduce heat stress and increase efficiency among sugarcane workers in El Salvador and prompted the Salvadoran government to provide mobile canopies for workers, as shown in Fig. 5. Umbrellas are widely used for shade in Middle Eastern countries. This research aims to develop a flying umbrella that provides shade and safe working conditions for outdoor workers.
2 Related Work
2.1 Overview
Scientists and researchers have devoted much attention to this issue. In [18], the authors present a technique based on shade structures installed in primary schools to help reduce children's exposure to ultraviolet radiation (UVR) during their formative years and to provide areas where they can conduct outdoor activities safely. In [25], scientists such as Keith seek ways to mimic the volcanic effect artificially. They explain in a Nature journal that one method to cool the planet rapidly is to inject aerosol particles into the stratosphere to reflect away some of the inbound sunlight, as shown in Fig. 6. In [21, 29], the authors report that the process involves injecting sulfur aerosols into the stratosphere between 9 and 50 km above the Earth's surface. Solar geoengineering involves reflecting sunlight into space to limit global warming and climate change; after the aerosols combine with water particles, sunlight is reflected more than usual for one to three years.
One scientific team is developing a global sunshade that uses balloons or jets to shield the most vulnerable countries in the global south from the adverse effects of global warming [34]. In [7], the authors focus on modifying fabric as a primary protective layer for the skin against harmful radiation. Today, many people and outdoor workers use umbrellas to protect themselves from the sun and its UV rays, as shown in Fig. 7. In the modern world umbrellas are a necessity, and in sweltering conditions they are beneficial.
In addition, when we are working or doing something outdoors under changing weather conditions, umbrellas are a handy tool, as shown in Fig. 7a. Under such circumstances, however, they have noticeable shortcomings: one hand is always busy holding the umbrella, which limits some hand functions and requires further care and attention, so holding an umbrella by hand has clear disadvantages. Therefore, many companies and associations supply umbrella hats for generating a canopy, as shown in Fig. 7b. Several high-tech solutions help workers cope with and adapt to this climate, particularly outdoors. Providing shade to users through robotics technology requires a great deal of work.
Researchers at Dongseo University have developed a solution to finding the perfect spot to enjoy shade during a summer picnic [3]. A new type of portable architecture solves the mundane but frustrating problem of adjusting shades throughout the day. Researchers at this university have demonstrated that an adaptive canopy can change shape as the sun moves throughout the day, providing stable shade and shadowing regardless of the solar position or time of day, while still adapting its configuration irrespective of location.
Researchers at the fabrication laboratory in QSC have developed a prototype that provides cool air from a robot during a summer picnic [11]: a new type of air-conditioner robot that follows humans in outdoor applications. Several robotic systems are sensitive to solar radiation, including some integrated with solar panels, and can even use them for shading. The researchers conclude that "the resulting architectural system can autonomously reconfigure, and stable operation is driven by adaptive design patterns, rather than solely robotic assembly methods." Cyber-physical macro materials and aerial robotics are utilized to construct the canopy [44]. The drone is built from a lightweight carbon-fiber material with integrated electronics to sense, process, and communicate data [1, 13].
University of Maryland researchers developed a system called RoCo, which provides cooling while preserving the user's comfort [15, 16]. The engineering faculty of this university has created multiple versions of air-conditioner robots with different options. In today's fast-moving world, robots that assist humans are increasingly
relevant. In many fields and industries, robots help humans [35, 40]. Researchers have confirmed the possibility of creating an intelligent flying umbrella by combining drone and canopy technology. Drones significantly extend the production capabilities of individuals such as artists, designers, and scientists, and they can detect and track their users continuously. Over the past few years, UAV and mini-UAV technology has grown significantly [10]. It is present in our daily lives and helps us in various fields, most recently in the COVID-19 pandemic (spraying, surveillance, homeland security, etc.). Today, drones can perform a wide range of activities, from delivering packages to transporting patients [9].
In [20], a team of engineering experts at Asahi Power Service invented a drone called "Umbrella Sidekick," which can be described as an imaginative flying umbrella.
While the techniques and solutions described above are good, outdoor workers need better and more efficient methods, especially in view of global warming. The following subsection presents a new proposal for a flying umbrella.
2.2 Proposal
This work aims to develop a flying umbrella with stable shading and a motion-tracking system. As its name suggests, this umbrella performs the same function as a conventional one, flying above the heads of the workers and serving the same purpose as an umbrella hat. A UV-protective flying umbrella reduces the temperature in outdoor construction areas and prevents heat illness at work. The flying umbrella drone is designed mainly to:
• provide consistent shadowing and shading regardless of the angle of the sun's rays;
• keep a safe distance of approximately ten meters from the construction workers' field;
• prevent heat illness at work.
The sun heats the earth during the day, and clear skies allow more heat to reach the earth's surface, which increases temperatures. However, cloud droplets reflect some of the sun's rays back into space when the sky is cloudy, so less of the sun's energy reaches the earth's surface, the earth heats up more slowly, and temperatures are cooler [2, 43]. The prototype idea comes from the shadow cast by clouds, since clouds can block the sun's rays and provide shade, as shown in Fig. 8a. The study aims to develop an intelligent flying umbrella that improves efficiency and offers a comfortable outdoor environment for workers. We chose a height of ten meters above the workers' barn for several reasons. On the one hand, at this height the workers do not hear or feel the noise generated by the propellers [30, 36, 41]; on the other hand, the canopy reflects significant amounts of solar radiation. The shade position (x) is a function of the solar position and the umbrella's location, as shown in Fig. 8b. The umbrella is on standby in a dedicated parking area, awaiting the order to fly. Once it receives an order, it follows the workers automatically. The umbrella will return to its original position if it loses the target workers. A failed signal will cause the umbrella to issue an alarm and return smoothly to the parking area, relying on the ultrasonic sensor implemented in the umbrella, as shown in Fig. 9. The sunshades protect workers from solar and heat stress, and they need to be adjusted daily according to the solar radiation. The umbrella concept consists of several components: an aluminum frame, an electronics board, sensors, and radio-frequency communication.
The chapter is structured as follows: The first section is the introduction. Section 2
presents the materials and methods. Section 3 discusses the fabrication and testing
of the fly umbrella. Section 4 demonstrates the results, and finally, we conclude with
a prototype discussion and plan for the future.
3 Proposed Method
This flying umbrella includes a flight and operating system to provide shade to workers outdoors. The product is born from the combination of a drone equipped with the necessary hardware and a canopy umbrella. Control commands for the umbrella drone: this module sends specific control commands to the drone via the radio-frequency link to control the umbrella (i.e., pitch, roll, yaw, and throttle). RF remote controls are used to track and follow workers in an open environment. The VU0002 digital ultrasonic sensor employs the ultrasonic time-of-flight principle to measure the distance between the umbrella and an obstacle. Various communication protocols, including LIN bus and 2/3-wire IO, are used to output a digital distance signal and self-test data, making it suitable for an array of intelligent umbrella parking systems. This ultrasonic sensor is highly weather-resistant, stable, and interference-resistant.
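For illustration, the sketch below shows the ultrasonic time-of-flight principle mentioned above; it does not model the VU0002's actual LIN-bus or 2/3-wire IO interface, and the 2.5 m threshold simply mirrors the sensor range quoted later in this section.

```python
# Minimal sketch of the ultrasonic time-of-flight principle mentioned above.
# The VU0002's digital interface is not modeled; this only converts a measured
# echo round-trip time into a distance and checks it against a safety margin.
SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 degrees Celsius

def distance_from_echo(round_trip_time_s):
    """Distance to the obstacle, given the echo round-trip time in seconds."""
    return SPEED_OF_SOUND * round_trip_time_s / 2.0

def obstacle_too_close(round_trip_time_s, safety_margin_m=2.5):
    # The prototype uses a 2.5 m measurement range to prevent crashes.
    return distance_from_echo(round_trip_time_s) < safety_margin_m

# Example: a 10 ms round trip corresponds to about 1.7 m.
print(distance_from_echo(0.010))
```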
The logical design in Fig. 10 shows the interaction between the components of the flying umbrella and how data flows from the input layer to the actions performed by the umbrella. Through the onboard camera, the pilot perceives the environment. Via the RF link, the onboard computer receives commands and executes them by sending movement control commands to the umbrella's flight controller, which is the brain of the umbrella's positioning system. The transmitter sends orders that are received by the receiver and passed to the flight controller, which instructs the actuators to move. As part of the prototype, an ultrasonic proximity sensor detects obstacles in the umbrella's path, so the umbrella can track the worker efficiently while avoiding obstacles. The drone tracking module is responsible for monitoring the spatial relationship between real-world elements and their virtual representations. After an initial calibration process and using an IMU-based approach, tracking is possible.
Fig. 11 Three cases for the flying umbrella cloud geometry effect on workers seen from above (S: shadow shown on the image; E: shadow covered by the cloud; F: bright surface covered by the cloud)
A pilot must be able to receive a live video feed from the umbrella, as well as
have real-time access to flight data. The umbrella should be able to switch between
modes without malfunctioning. The umbrella should be able to adjust its position
automatically without human assistance. For example, the umbrella must not exceed
five seconds in latency when delivering pilot instructions to the drone and transmitting
telemetry information from the drone to the pilot. At ten meters in height, the canopy
creates a shade geometry around the workers' barn, as shown in Fig. 11. According to the position of the flying umbrella, the shade position changes. Three cases are considered: S, the shadow provided by the canopy; E, the shadow covered by the umbrella; and F, the bright surface covered by the canopy. The projected and actual positions of the umbrella will most likely differ significantly when the observing and solar zenith angles are relatively large.
For the umbrella structure frame, we use aluminum instead of carbon fiber in our prototype, since we do not have the facilities to produce it from carbon fiber. Plastic propellers are the most common, but carbon fiber propellers are of better quality, so we chose G32X 11CF carbon fiber propellers for our prototype. A motor's technical specification is imperative: more efficient motors save battery life and give the owner more flying time, which is what every pilot wants. We selected the U13 KV130 motor, which has high efficiency and can provide a thrust of 24 kg. Multirotor drones rely on electronic speed controllers, which allow the motors to move in two directions at the same speed. The prototype uses six U13II KV130 motors (engines) manufactured by T-MOTOR. Based on the data in Table 1, each engine produces 5659 W and a maximum thrust of 24 kg. A 2.5 m distance-measuring ultrasonic sensor is used to prevent crashes.
The specifications of the hardware and mechanical parts of the flying umbrella are described in Table 2. An exploded 3D view of the flying umbrella concept is shown in Fig. 15. The parts of the umbrella drone shadow are mounted and well secured, as depicted in Fig. 11b.
In the meantime, the umbrella remains on standby in a safe area and awaits an
order to take off, and it will follow workers after stabilizing at an altitude of ten
meters to provide shade. The pilot can move the position of the umbrella to give
the maximum area of shade to workers in the construction field. If the pilot loses
communication with the umbrella, a ringing alarm will sound and the umbrella will land automatically in the parking area under GPS guidance. The flowchart of the system algorithm is shown in Fig. 16.
(Figure panels: (a) umbrella frame ready; (b) fixing the Flame 180A 12s V2.0 and the U13II KV130 motors; (c) installation of the G32X11CF propeller; (d) verifying the weight of the umbrella body.)
(Figure panels: (a) an umbrella in 3D exploded view; (b) all parts of the umbrella mounted.)
We used a fabric canopy screen that blocks 95% of UV rays while allowing water and air to pass through: a polyethylene fabric of 110 grams per square meter with galvanized buttonholes and strong seams. It is very breathable, which makes the shaded space more comfortable, and it creates cool shadows while still letting light through. Raindrops pass through the fabric, so no water pools on the canopy.
The process of making an umbrella drone can be rewarding, but the choice of a flight controller can be challenging. This prototype can only be operated by specific drone controllers available on the market today. Once it is known exactly what the umbrella drone will look like, the list of potential autopilot boards can be narrowed down, making the decision easier. Several criteria should be considered when selecting an autopilot board. We analyzed seven of the top drone controllers on the market based on factors such as:
• Affordability
• Open Source Firmware
• FPV Racing friendly
• Autonomous functionality
• Linux or microcontroller-based environment
• Frame size typical
• Popularity
• CPU.
In this subsection, we present the leading flight controller boards and select the most suitable one for our prototype.
• APM Flight Controller: the APM was the forerunner that the Pixhawk later developed into a much more powerful flight controller; the Pixhawk uses a 32-bit processor, while the APM has an 8-bit processor. The APM was a massive leap for open-source drone controllers, so DIY drone builders used it widely. The Pixhawk is compatible with ArduPilot and PX4, two major open-source drone projects, and is also entirely open source.
• Pixhawk: after the original Pixhawk, the Pixhawk open-source hardware project produced many flight control boards, including the Cube. Open-source projects like ArduPilot therefore have a good chance of adding new functionality and support to the Cube, and the Cube and the Pixhawk are very similar.
• Navio2: the Navio2 uses a Raspberry Pi to control the flight; it is simply a shield that attaches to a Raspberry Pi 3. Debian OS images that come pre-installed with ArduPilot are available for free from Emlid, the maker of the Navio2, and a simple flash of an SD card will do the trick.
• BeagleBone Blue: the first Linux port of ArduPilot targeted the BeagleBone Black in 2014, and the BeagleBone Blue was created as a direct result of the success of that port.
• Naza Flight Controller: Naza-M V2 kits can be found on Amazon for about $200 and come with essential components like GPS. The flight control software is closed source, which means the community does not have access to its code, so Naza flight controllers are not appropriate for people who want to build a drone they can tinker with.
• Naze32: The Naze32 model boards are lightweight and affordable, costing about
$30-40. Many manufacturers offer Naze32 boards; choose one that is an F3 or F4
flight controller.
A significant problem with electrically powered robots is their battery life, and this is also an issue for the flying umbrella. Lithium Polymer (LiPo) batteries are used in this prototype because of their light weight and high capacity. It is also possible to use Nickel Metal Hydride (NiMH) batteries, which are cheaper but heavier than LiPo, causing problems and reducing the umbrella's efficiency. Because of battery weight, there is a trade-off between the umbrella's total weight and its flight time. Each U13II KV130 brushless DC motor can provide a thrust of 24.3 kg at a battery voltage of 48 VDC, as shown in Fig. 17, so with six U13II KV130 brushless motors the flying umbrella can carry 144 kg. Two sources power the flying umbrella:
• For six DC brushless motors at 5659 W each, the whole load at full thrust is 59.65 kW, so the total motor power is approximately 41.7 kW, supplied by six 44.44 VDC/8000 mAh batteries (each made of two 22.22 VDC/8000 mAh batteries in series). The total amount of energy produced by the umbrella is 41.7 kWh = 9721 kg CO2.
• A 12 VDC 5.5 Ah battery powers all sensors (ultrasonic, flight controller, GPS module, camera, etc.), for a total consumption of 35 W.
The time the umbrella can fly depends on the specifications of the batteries, as shown in Table 3. The umbrella cannot fly for more than 22 min with the CX48100 battery. The aluminum umbrella frame weighs about 33 kg, and the CX48100 batteries weigh 42 kg. The total weight of the six-propeller umbrella drone with a battery is 73 kg, with a thrust-to-weight ratio (TTWR) of about 120 kg. The battery was fixed in the center to ensure an appropriate weight distribution, and the flying umbrella can carry the total weight easily. Flight time calculation:
ACD = TFW × (P ÷ V)

where:
ACD: average current draw,
TFW: total flight weight,
BDM: battery discharge margin,
P: power-to-weight ratio,
V: voltage.
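For orientation, the sketch below applies this relation together with the usual capacity-based flight-time estimate; all numerical values are placeholders, and the battery discharge margin (BDM) is interpreted as the fraction of capacity that is actually usable, since Table 3 is not reproduced here.

```python
# Rough flight-time estimate from the quantities defined above.  All numbers
# are placeholders for illustration; the chapter's actual values come from
# Table 3 and the motor data in Table 1.
def average_current_draw(total_flight_weight_kg, power_to_weight_w_per_kg, voltage_v):
    """ACD = TFW x (P / V), in amperes."""
    return total_flight_weight_kg * (power_to_weight_w_per_kg / voltage_v)

def flight_time_minutes(capacity_ah, discharge_margin, acd_a):
    """Usable battery capacity divided by the average current draw."""
    return (capacity_ah * discharge_margin) / acd_a * 60.0

acd = average_current_draw(total_flight_weight_kg=73.0,    # assumed total weight
                           power_to_weight_w_per_kg=170.0,  # assumed hover power per kg
                           voltage_v=44.4)
print(flight_time_minutes(capacity_ah=48.0,      # assumed pack capacity
                          discharge_margin=0.8,  # BDM: use 80% of capacity
                          acd_a=acd))
```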
4 Experiment Phase
We chose to use radio-frequency technology to control the umbrella via a ground control operator in order to improve worker safety. A ground control station and GPS data manage the umbrella remotely. The large size of the umbrella (2500 mm × 2500 mm) necessitates the use of a remote controller and flight system; this ensures its usability and safety in urban areas. The umbrella flight control system maintains balance by taking all parameters into account. With the addition of a high-quality ultrasonic sensor (VU0002), the umbrella can take off and land easily and avoid obstacles. To ensure that the umbrella functions appropriately, the ultrasonic sensor must work together with an obstacle-avoidance algorithm.
A flight controller is one of the most comprehensive features of a flying umbrella. It supports everything from return-to-home and care-free flight to altitude hold. The altitude-hold and return-to-home features are particularly helpful for providing stable shade and following the sun's position, and they may help a pilot who becomes disoriented about the flight direction. Continuously flying around the yard without experiencing extreme ups and downs is easy. The Naza-M V2 used in this work is highly efficient, as shown in Fig. 18a. The flying umbrella is equipped with an accelerometer, gyroscope, magnetometer, barometric pressure sensor, and GPS receiver for greater safety in urban environments, as shown in Fig. 18b. As a safety precaution, a rope should be used to pull the umbrella drone back in case communication is lost, see Fig. 19.
During the flight test, one of the objectives was to determine how the control system would perform when the large umbrella structure was combined with the electronic system. Pilots were able to fly the umbrella via remote control during the flight tests. The first step in determining the parameters after moving the umbrella is to select the longitudinal axes. In this mode, the FCS controls the longitudinal axes while the pilot continues to direct the umbrella. The pilot steered the umbrella in and out of the airport and increased the gain until the umbrella was stable, in order to minimize steady-state errors. The height is adjusted in ascending and descending steps from left to right to ensure stability and maintain a moderate rise and fall rate. The shade can be controlled remotely and moved based on the worker's location.
The umbrella took off successfully, flew steadily, and landed in the desired area, as shown in Fig. 20. The flying umbrella provides shade while remaining stable in the air, as illustrated in Fig. 20b. The workers moved in different directions (left, right, and back) to evaluate the umbrella's mechanical response. We measured the trajectory of the flying umbrella and the distance error during the study, keeping the umbrella at a ten-meter altitude. During the testing phase, we observed that the umbrella's PID controller was affected by the wind speed, so one limitation of the umbrella is that it cannot fly in strong winds. Because of the high-efficiency propellers, the flying umbrella can be loud, since large quantities of air are rapidly displaced; as the propellers spin, pressure spikes are generated, resulting in a distinctive buzzing noise. The flying umbrella produced in-air levels of 80 dB, with fundamental frequencies at 120 Hz, and noise levels were around 95 dB when the umbrella flew at altitudes of 5 and 10 m. Noise levels in the construction area were already very high, so this was not a significant effect compared to heat stress.
Fig. 20 Flight test scenes: (a) flying umbrella successful take-off; (b) flying umbrella successfully flying; (c) landed
Fig. 21 The air temperature difference between shade and full sun in 05 June 2022 (Morning)
Table 4 The difference in temperature between objects in shade and full sunlight
Object | Time | Full sun | Shade provided by flying umbrella
Workers hat | 10.00 AM | 42.3° | 41.2°
Air temperature | 10.15 AM | 44° | 42.3°
Soil temperature | 10.20 AM | 46.5° | 44°
• A larger-scale version of the umbrella could be used in many more places after further R&D.
Comparing temperatures at two locations in the morning, one under the umbrella's shade and one in full sun, there was an average difference of 2.5° over 22 min, as shown in Fig. 21. The shade reduces air and soil temperatures by blocking the sun's rays, as shown in Table 4.
On the same date in the afternoon, the two locations showed an average difference of 2.7° over 22 min, as shown in Fig. 22; according to Table 5, the shade again reduces the air and soil temperatures by blocking the sun's rays.
Testing at the airport revealed some parameters that could increase the efficiency of the umbrella. Another power source, such as a solar system, should be considered to keep the umbrella flying longer. Based on the GPS signal, the tracking capability is very good. The umbrella produces high noise levels, which are acceptable in the workers' barn area.
Fig. 22 The air temperature difference between shade and full sun in 05 June 2022 (Afternoon)
Table 5 The difference in temperature between objects in shade and full sunlight (afternoon)
Object | Time | Full sun | Shade provided by flying umbrella
Workers hat | 4.00 PM | 41.2° | 40.4°
Air temperature | 4.15 PM | 40.4° | 39.8°
Soil temperature | 4.20 PM | 39.8° | 39.7°
6 Conclusion
ronment subject to rain, snow, and, most importantly, scorching heat is essential. In
future work, we propose the possibility of installing a wireless charging station in
the parking area for the fly umbrella drone.
References
1. Agarwal, G. (2022). Brief history of drones. In: Civilian Drones, Visual Privacy and EU Human
Rights Law, Routledge (pp. 6–26). https://doi.org/10.4324/9781003254225-2
2. Ahmad, L., Kanth, R. H., Parvaze, S., & Mahdi, S. S. (2017). Measurement of cloud cover.
Experimental Agrometeorology: A Practical Manual (pp. 51–54). Springer International Pub-
lishing. https://doi.org/10.1007/978-3-319-69185-5_8
3. Ahmadhon, K., Al-Absi, M. A., Lee, H. J., & Park, S. (2019). Smart flying umbrella drone on
internet of things: AVUS. In 2019 21st International Conference on Advanced Communication
Technology (ICACT), IEEE. https://doi.org/10.23919/icact.2019.8702024
4. Al-Bouwarthan, M., Quinn, M. M., Kriebel, D., & Wegman, D. H. (2019). Assessment of heat
stress exposure among construction workers in the hot desert climate of saudi arabia. Annals
of Work Exposures and Health, 63(5), 505–520. https://doi.org/10.1093/annweh/wxz033.
5. Al-Hatimy, F., Farooq, A., Abiad, M. A., Yerramsetti, S., Al-Nesf, M. A., Manickam, C.,
et al. (2022). A retrospective study of non-communicable diseases amongst blue-collar migrant
workers in qatar. International Journal of Environmental Research and Public Health, 19(4),
2266. https://doi.org/10.3390/ijerph19042266.
6. Ambarish Vaidyanathan PSSS Josephine Malilay. (2020). Heat-related deaths - united states,
2004–2018. Centers for Disease Control and Prevention, 69(24), 729–734.
7. Bashari, A., Shakeri, M., & Shirvan, A. R. (2019). UV-protective textiles. In The Impact and
Prospects of Green Chemistry for Textile Technology (pp. 327–365). Elsevier. https://doi.org/
10.1016/b978-0-08-102491-1.00012-5
8. Bodin, T., García-Trabanino, R., Weiss, I., Jarquín, E., Glaser, J., Jakobsson, K., et al. (2016).
Intervention to reduce heat stress and improve efficiency among sugarcane workers in el sal-
vador: Phase 1. Occupational and Environmental Medicine, 73(6), 409–416. https://doi.org/
10.1136/oemed-2016-103555.
9. Chaari, M. Z., & Al-Maadeed, S. (2021). The game of drones/weapons makers war on drones.
In Unmanned Aerial Systems (pp. 465–493). Elsevier. https://doi.org/10.1016/b978-0-12-
820276-0.00025-x
10. Chaari, M. Z., & Aljaberi, A. (2021). A prototype of a robot capable of tracking anyone with
a high body temperature in crowded areas. International Journal of Online and Biomedical
Engineering (iJOE), 17(11), 103–123. https://doi.org/10.3991/ijoe.v17i11.25463.
11. Chaari, M. Z., Abdelfatah, M., Loreno, C., & Al-Rahimi, R. (2021). Development of air condi-
tioner robot prototype that follows humans in outdoor applications. Electronics, 10(14), 1700.
https://doi.org/10.3390/electronics10141700.
12. Crowe, J., Rojas-Garbanzo, M., Rojas-Valverde, D., Gutierrez-Vargas, R., Ugalde-Ramírez,
J., & van Wendel de Joode, B. (2020). Heat exposure and kidney health of costa rican rice
workers. ISEE Conference Abstracts, 2020(1). https://doi.org/10.1289/isee.2020.virtual.o-os-
549
13. DeFrangesco, R., & DeFrangesco, S. (2022). The history of drones. In The big book of drones
(pp. 15–28). CRC Press. https://doi.org/10.1201/9781003201533-2
14. Desch, S. J., Smith, N., Groppi, C., Vargas, P., Jackson, R., Kalyaan, A., et al. (2017). Arctic
ice management. Earths Future, 5(1), 107–127. https://doi.org/10.1002/2016ef000410.
15. Dhumane, R., Ling, J., Aute, V., & Radermacher, R. (2017). Portable personal conditioning
systems: Transient modeling and system analysis. Applied Energy, 208, 390–401. https://doi.
org/10.1016/j.apenergy.2017.10.023.
16. Dhumane, R., Mallow, A., Qiao, Y., Gluesenkamp, K. R., Graham, S., Ling, J., & Radermacher,
R. (2018). Enhancing the thermosiphon-driven discharge of a latent heat thermal storage system
used in a personal cooling device. International Journal of Refrigeration, 88, 599–613. https://
doi.org/10.1016/j.ijrefrig.2018.02.005.
17. Geffroy, E., Masia, M., Laera, A., Lavidas, G., Shayegh, S., & Jolivet, R. B. (2018). Mcaa
statement on ipcc report “global warming of 1.5 c”. https://doi.org/10.5281/ZENODO.1689921
18. Gies, P., & Mackay, C. (2004). Measurements of the solar UVR protection provided by shade
structures in new zealand primary schools. Photochemistry and Photobiology. https://doi.org/
10.1562/2004-04-13-ra-138.
19. Hameed, S. (2021). India’s labour agreements with the gulf cooperation council coun-
tries: An assessment. International Studies, 58(4), 442–465. https://doi.org/10.1177/
00208817211055344.
20. Hayes, M. J., Levine, T. P., & Wilson, R. H. (2016). Identification of nanopillars on the cuticle
of the aquatic larvae of the drone fly (diptera: Syrphidae). Journal of Insect Science, 16(1), 36.
https://doi.org/10.1093/jisesa/iew019.
21. Haywood, J., Jones, A., Johnson, B., & Smith, W. M. (2022). Assessing the consequences of
including aerosol absorption in potential stratospheric aerosol injection climate intervention
strategies. https://doi.org/10.5194/acp-2021-1032.
22. How, V., Singh, S., Dang, T., Lee, L. F., & Guo, H. R. (2022). The effects of heat exposure
on tropical farm workers in malaysia: Six-month physiological health monitoring. Interna-
tional Journal of Environmental Health Research, 1–17. https://doi.org/10.1080/09603123.
2022.2033706
23. (ILO) ILO (2014) Informal Economy and Decent Work: A Policy Resource Guide Supporting
Transitions to Formality. INTL LABOUR OFFICE
24. Irvine, P., Emanuel, K., He, J., Horowitz, L. W., Vecchi, G., & Keith, D. (2019). Halving
warming with idealized solar geoengineering moderates key climate hazards. Nature Climate
Change, 9(4), 295–299. https://doi.org/10.1038/s41558-019-0398-8.
25. Irvine, P., Burns, E., Caldeira, K., Keutsch, F., Tingley, D., & Keith, D. (2021). Expert judge-
ments judgements on solar geoengineering research priorities and challenges. https://doi.org/
10.31223/x5bg8c
26. JULIA SHIPLEY DNRBSMCCWT BRIAN EDWARDS (2021) Hot days: Heat’s mounting
death toll on workers in the u.s.
27. Khan, H. T. A., Hussein, S., & Deane, J. (2017). Nexus between demographic change and
elderly care need in the gulf cooperation council (GCC) countries: Some policy implications.
Ageing International, 42(4), 466–487. https://doi.org/10.1007/s12126-017-9303-9.
28. Lee, J., Lee, Y. H., Choi, W. J., Ham, S., Kang, S. K., Yoon, J. H., et al. (2021). Heat exposure
and workers’ health: a systematic review. Reviews on Environmental Health, 37(1), 45–59.
https://doi.org/10.1515/reveh-2020-0158.
29. Lee, W. R., MacMartin, D. G., Visioni, D., & Kravitz, B. (2021). High-latitude stratospheric
aerosol geoengineering can be more effective if injection is limited to spring. Geophysical
Research Letters, 48(9). https://doi.org/10.1029/2021gl092696.
30. Lee, Y. K., Yeom, T. Y., & Lee, S. (2022). A study on noise analysis of counter-rotating
propellers for a manned drone. The KSFM Journal of Fluid Machinery, 25(2), 38–44. https://
doi.org/10.5293/kfma.2022.25.2.038.
31. Marucci, A., Monarca, D., Cecchini, M., Colantoni, A., Giacinto, S. D., & Cappuccini, A.
(2014). The heat stress for workers employed in a dairy farm. Journal of Agricultural Engi-
neering, 44(4), 170. https://doi.org/10.4081/jae.2013.218.
32. Mehta, B. (2021) Heat exposure has killed hundreds of u.s. workers - it’s time to do something
about it. Industrial Saftey & Hygiene News, 3(52).
33. Meyer, R. (2018). A radical new scheme to prevent catastrophic sea-level rise. The Atlantic
34. Ming, T., de_Richter, R., Liu, W., & Caillol, S. (2014). Fighting global warming by climate
engineering: Is the earth radiation management and the solar radiation management any option
for fighting climate change? Renewable and Sustainable Energy Reviews,31, 792–834. https://
doi.org/10.1016/j.rser.2013.12.032
35. Niedzielski, T., Jurecka, M., Miziński, B., Pawul, W., & Motyl, T. (2021). First successful
rescue of a lost person using the human detection system: A case study from beskid niski (SE
poland). Remote Sensing, 13(23), 4903. https://doi.org/10.3390/rs13234903.
36. Oliver Jokisch, D. F. (2019). Drone sounds and environmental signals - a first review. 30th
ESSV ConferenceAt: TU Dresden
37. Pan, Q., Sumner, D. A., Mitchell, D. C., & Schenker, M. (2021). Compensation incentives and
heat exposure affect farm worker effort. PLOS ONE, 16(11), e0259,459. https://doi.org/10.
1371/journal.pone.0259459
38. Peace, A. H., Carslaw, K. S., Lee, L. A., Regayre, L. A., Booth, B. B. B., Johnson, J. S., &
Bernie, D. (2020). Effect of aerosol radiative forcing uncertainty on projected exceedance year
of a 1.5 c global temperature rise. Environmental Research Letters, 15(9), 0940a6. https://doi.
org/10.1088/1748-9326/aba20c
39. Pradhan, B., Kjellstrom, T., Atar, D., Sharma, P., Kayastha, B., Bhandari, G., & Pradhan, P. K.
(2019). Heat stress impacts on cardiac mortality in nepali migrant workers in qatar. Cardiology,
143(1–2), 37–48. https://doi.org/10.1159/000500853.
40. Sankar, S., & Tsai, C. Y. (2019). ROS-based human detection and tracking from a wireless
controlled mobile robot using kinect. Applied System Innovation, 2(1), 5. https://doi.org/10.
3390/asi2010005.
41. Thalheimer, E. (2021). Community acceptance of drone noise. INTER-NOISE and NOISE-
CON Congress and Conference Proceedings, 263(6), 913–924. https://doi.org/10.3397/in-
2021-1694.
42. Uejio, C. K., Morano, L. H., Jung, J., Kintziger, K., Jagger, M., Chalmers, J., & Holmes,
T. (2018). Occupational heat exposure among municipal workers. International Archives of
Occupational and Environmental Health, 91(6), 705–715. https://doi.org/10.1007/s00420-
018-1318-3.
43. Woelders, T., Wams, E. J., Gordijn, M. C. M., Beersma, D. G. M., & Hut, R. A. (2018).
Integration of color and intensity increases time signal stability for the human circadian system
when sunlight is obscured by clouds. Scientific Reports, 8(1). https://doi.org/10.1038/s41598-
018-33606-5
44. Wood, D., Yablonina, M., Aflalo, M., Chen, J., Tahanzadeh, B., & Menges, A. (2018). Cyber
physical macro material as a UAV [re]configurable architectural system. Robotic Fabrication
in Architecture, Art and Design 2018 (pp. 320–335). Springer International Publishing. https://
doi.org/10.1007/978-3-319-92294-2_25
Accurate Estimation of 3D-Repetitive-Trajectories using Kalman Filter, Machine Learning and Curve-Fitting Method for High-speed Target Interception
Abstract Accurate estimation of trajectory is essential for the capture of any high-
speed target. This chapter estimates and formulates an interception strategy for the
trajectory of a target moving in a repetitive loop using a combination of estimation
and learning techniques. An extended Kalman filter estimates the current location of
the target using the visual information in the first loop of the trajectory to collect data
points. Then, a combination of Recurrent Neural Network (RNN) with least-square
curve-fitting is used to accurately estimate the future positions for the subsequent
loops. We formulate an interception strategy for the interception of a high-speed
target moving in a three-dimensional curve using noisy visual information from
a camera. The proposed framework is validated in the ROS-Gazebo environment
for interception of a target moving in a repetitive figure-of-eight trajectory. Astroid,
Deltoid, Limacon, Squircle, and Lemniscates of Bernoulli are some of the high-order
curves used for algorithm validation.
Nomenclature
f  Focal length of the camera
P  Error covariance matrix
1 Introduction
In automated robotics systems, a class of problems that have engaged the attention
of researchers is that of motion tracking and guidance using visual information by
which an autonomously guided robot can track and capture a target moving in an
approximately known or predicted trajectory. Interception of a target in an outdoor
environment is challenging, and it is important for the defence of military assets as well
as critical civilian infrastructure. Interception of intruder targets using UAVs has
the advantages of low cost and quick deployability; however, the performance of the
UAV is limited by its payload capability. In this case, the detection of the target is
performed using visual information. Interception of a target with UAVs using visual
information is reported in various literature such as [8, 17, 33] and it is difficult due to
limitations on sensing, payload and computational capability of UAVs. Interception
strategies are generally based on the estimation of target future trajectories [30, 31],
controller based on visual servoing [5, 12], or using vision based guidance law
[22, 34]. Controller based on visual servoing is mostly applicable for slow-moving
targets. Guidance strategies for the interception of a high-speed target are difficult as
the capturability region of the guidance law is small for the interceptor compared to
a low-speed target. Interception using the visual information is further difficult as the
visual output can be noisy in the outdoor environment, and the range of view is small
compared to other sensors such as radar. The interception strategy by prediction of
target trajectory over a shorter interval is not effective in the case of a high-speed
target. Therefore, accurate estimation of the trajectory of the target is important for
efficient interception of a high-speed target. Once the looping trajectory of the target
is obtained, the interceptor could be placed in a favourable position to increase the
probability of interception. In this chapter, we consider a problem where an aerial
target is moving in a repetitive loop at high speed and an aerial robot, or a drone, has
to observe the target’s motion via its imaging system and predict the target trajectory
in order to guide itself for effective capture of the target.
In this chapter, the strategy for interception of a high-speed target moving in a
repetitive loop is formulated after estimation and prediction of target trajectory using
the Extended Kalman Filter, Recurrent Neural Network (RNN) and least square curve
fitting techniques. An Extended Kalman filter (EKF) is used to track a manoeuvring
target moving in an approximately repetitive loop by using the first loop of the tra-
jectory to collect data points and then using a combination of machine learning
with least-square curve-fitting to accurately estimate future positions for the sub-
sequent loops. The EKF estimates the current location of the target from its visual
information and then predicts its future position by using the observation sequence.
We utilise noisy visual information of the target from the three-dimensional trajec-
tory to carry out the trajectory estimation. Several high-order curves, expressed as
univariate polynomials, are considered test cases. Some of these are Circle/Ellipse,
Astroid, Deltoid, Limacon, Nephroid, Quadrifolium, Squircle, and Lemniscates of
Bernoulli and Gerono, among others. The proposed algorithm is demonstrated in the
ROS-Gazebo environment and is implemented in field tests. The problem statement
is motivated by Challenge-1 of MBZIRC-2020, where the objective is to catch an
intruder target moving in an unknown repetitive figure-of-eight trajectory (shown
in Fig. 1). The ball attached to the target drone is moving in an approximate figure-
of-eight trajectory in 3D, and the parameters of the trajectory are unknown to the
interceptors. Interceptors need to detect, estimate and formulate strategies for grab-
bing or interception of the target ball. In this chapter, the main focus is to estimate
the target trajectory using visual information. The method proposed in the chapter is
used first to estimate the position of the target using the Kalman Filter techniques,
and then the geometry of the looping trajectory is estimated using the learning and
curve fitting techniques. The main contributions of the chapter are the following:
1. Estimation of target position using visual information in EKF framework.
2. Estimation of target trajectory moving in a standard geometric curve in a closed
loop using Recurrent Neural Network and Least-Square curve fitting techniques.
3. Development of a strategy for interception of a high-speed target moving in a
standard geometric curve in a repetitive loop.
The rest of the chapter is organised as follows: Relevant literature is presented
in Sect. 2. Estimation and prediction of target location using visual information are
presented in Sect. 3. Detailed curve fitting methods using learning in 2D and 3D are
presented in Sect. 4. The interception strategy is formulated in Sect. 5, and the results,
discussions, and conclusions are presented in Sects. 6, 7 and 8, respectively.
2 Related Work
Trajectory estimation using curve fitting methods is reported in [2, 10, 15, 16, 18].
In [2], bearing-only measurements are used for fitting splines for highly manoeuvring
targets. In [16], the target trajectory is estimated as a smooth parametric curve using a
sliding time window approach where the parameters of the curve are updated iteratively.
In [18], a Spline Fitting Filtering (SFF) algorithm is used to fit a cubic spline to estimate
the trajectory of the manoeuvring target. In [15], the trajectory is estimated using data-driven
regression analysis, known as "fitting for smoothing (F4S)", with the assumption that
the trajectory is a function of time.
Interception of a target using the estimation of the target’s future position is
reported in various literature such as [9, 13, 36]. In [13], the target trajectory is
estimated using a simple linear extrapolation method and uncertainty in the target
position is considered using a random variable. In [36], the interception strategy
is formulated using the prediction of target trajectory using the historical data and
selection of the optimal path using third-order Bezier curves. The proposed formulation
is validated in simulation only, not using real visual information.
Capturing an aerial target using a robotic manipulator after the target’s pose and
motion estimation using the adaptive extended Kalman filter and photogrammetry is
reported in [9].
Other research groups have approached a similar problem statement (as shown in
Fig. 1) [4, 6, 38], where the trajectory is estimated using filtering techniques under the
assumption that the target follows a figure-of-eight trajectory; however, a general
approach for estimating a trajectory that follows an unknown geometric curve is not
reported.
3 Estimation and Prediction of Target Location Using Visual Information
In this section, the global position of the target is estimated from visual information.
It is assumed that the target trajectory lies in the 2D plane, and thus measurements
of the target in the global X -Y plane are considered. The target’s motion is assumed
to be smooth; that is, the change in curvature of the trajectory remains bounded and
smooth over time. Let $x_{target}$ and $y_{target}$ be the coordinates of the target position, $V_{target}$ the target speed, and $\psi$ the flight path angle with respect to the horizontal. The target motion, without considering wind, can be expressed as
$\dot{x}_{target} = V_{target}\cos\psi$    (1)
$\dot{y}_{target} = V_{target}\sin\psi$    (2)
$\dot{\psi} = \omega$    (3)
The target trajectory, the instantaneous circle, and the important variables at the kth
sampling time are shown in Fig. 2. Let the position of the target at the kth sampling time
be $X^k$, given as
$X^k = \begin{bmatrix} x_{target}^k & y_{target}^k \end{bmatrix}^T$    (4)
where $x_{target}^k$ and $y_{target}^k$ are the inertial coordinates of the target in the global X-Y plane
at the kth sampling time. $\{x_c^k, y_c^k\}$ are the coordinates of the centre of the instantaneous
curvature of the target trajectory at that instant. From Fig. 2, the variables $\theta^k$ and $\psi^k$
are related as follows,
are related as follows,
π
θk = ψk − (5)
2
based on which Eqs. (1) and (2) can be simplified to
$\dot{x}_{target} = -V_{target}\sin\theta$    (6)
$\dot{y}_{target} = V_{target}\cos\theta$    (7)
Therefore, the motion of the target can be represented in discrete time-space by the
following equations,
$x_{target}^k = x_{target}^{k-1} - V_{target}\,\Delta t\,(y_{target}^k - y_c^k)\big/\sqrt{(y_{target}^k - y_c^k)^2 + (x_{target}^k - x_c^k)^2}$    (8)
$y_{target}^k = y_{target}^{k-1} + V_{target}\,\Delta t\,(x_{target}^k - x_c^k)\big/\sqrt{(y_{target}^k - y_c^k)^2 + (x_{target}^k - x_c^k)^2}$    (9)
where Vtarget is the speed of the target and Δt is the sampling time interval.
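As an illustration, the following minimal Python/NumPy sketch propagates a target along the instantaneous circle using the discrete model of Eqs. (8)–(9); it assumes the direction terms are evaluated at the previous sample so that the update is explicit, and all variable names are illustrative rather than taken from the chapter's implementation.

```python
import numpy as np

def propagate_target(x_prev, y_prev, xc, yc, v_target, dt):
    """One step of the discrete target motion model (Eqs. 8-9).

    The direction terms are evaluated at the previous position, which makes
    the update explicit; (xc, yc) is the instantaneous centre of curvature
    and v_target the (assumed constant) target speed.
    """
    r = np.hypot(y_prev - yc, x_prev - xc)   # distance to the centre of curvature
    x_new = x_prev - v_target * dt * (y_prev - yc) / r
    y_new = y_prev + v_target * dt * (x_prev - xc) / r
    return x_new, y_new

# Example: a target circling the origin on a 10 m radius at 5 m/s.
x, y = 10.0, 0.0
trajectory = []
for _ in range(200):
    x, y = propagate_target(x, y, xc=0.0, yc=0.0, v_target=5.0, dt=0.05)
    trajectory.append((x, y))
```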
The target position is estimated using the target pixel information from the image
plane of a monocular camera. The target position is obtained considering the per-
spective projection of the target in the image plane. If the coordinates of the estimated
target position are $(x_{target,vision}^k,\, y_{target,vision}^k)$, then
$x_{target,vision}^k = \dfrac{Z^k x^k}{f}$    (10)
$y_{target,vision}^k = \dfrac{Z^k y^k}{f}$    (11)
where $x^k$ and $y^k$ are the coordinates of the target pixel in the image plane, f is the
focal length of the camera, and $Z^k$ is the target depth.
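A short sketch of the back-projection in Eqs. (10)–(11) is given below; it assumes pixel coordinates measured relative to the principal point and expressed in units consistent with the focal length, and the function name is hypothetical.

```python
def pixel_to_position(u, v, depth, focal_length):
    """Back-project an image-plane measurement to a position estimate (Eqs. 10-11).

    u, v         : target pixel coordinates relative to the principal point
    depth        : target depth Z along the camera's optical axis
    focal_length : camera focal length in the same units as u and v
    """
    x_vision = depth * u / focal_length
    y_vision = depth * v / focal_length
    return x_vision, y_vision

# Example: a target 20 px right of and 5 px above the principal point,
# 15 m away, seen through a lens with an 800 px focal length.
print(pixel_to_position(20.0, 5.0, depth=15.0, focal_length=800.0))
```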
The target measurement $Y^k$ can be presented as
$Y^k = \begin{bmatrix} x_{target,vision}^k \\ y_{target,vision}^k \end{bmatrix} = \begin{bmatrix} x_{target}^k \\ y_{target}^k \end{bmatrix} + \eta^k$    (12)
The motion of the target along the instantaneous circle can be written as
$x_{target}^{k+1} = x_{target}^k - r\,\delta\sin\theta_k$    (13)
$y_{target}^{k+1} = y_{target}^k + r\,\delta\cos\theta_k$    (14)
$\theta_k = \theta_{k-1} + \delta$    (15)
where r is the radius of the instantaneous circle, and δ is the change in the target’s
flight path angle θ between the time steps. Let the sequence of last m observations
of target positions gathered at sample index k be,
$\{x_{target}^{k-i},\, y_{target}^{k-i}\}, \quad i = 0, 1, 2, \ldots, m-1$    (16)
We define the difference in the x-position and y-position of the target for the jth sequence
at the kth sample index as
$\Delta x_{target}(k, j) = x_{target}^{k-j} - x_{target}^{k-j-1} = -r\,\delta\sin\theta_{k-j-1}$    (17)
$\Delta y_{target}(k, j) = y_{target}^{k-j} - y_{target}^{k-j-1} = r\,\delta\cos\theta_{k-j-1}$    (18)
Equivalently,
$\Delta x_{target}(k, j) = -r\,\delta\sin\theta_{k-j-2}\cos\delta - r\,\delta\cos\theta_{k-j-2}\sin\delta$    (20)
Therefore, successive differences are related through a rotation by δ,
$\Delta x_{target}(k, j) = \Delta x_{target}(k, j-1)\cos\delta - \Delta y_{target}(k, j-1)\sin\delta$
$\Delta y_{target}(k, j) = \Delta x_{target}(k, j-1)\sin\delta + \Delta y_{target}(k, j-1)\cos\delta$
Since the parameter δ describes the evolution of the target’s states, the elements of
the evolution matrix contain (cos δ, sin δ). The difference in observations equations
are written in matrix form as Eq. 24, for j = 0, 1, ..., m − 1.
$\begin{bmatrix} \vdots \\ \Delta x_{target}(k, j) \\ \Delta y_{target}(k, j) \\ \vdots \end{bmatrix} = \begin{bmatrix} \vdots & \vdots \\ \Delta x_{target}(k, j-1) & -\Delta y_{target}(k, j-1) \\ \Delta y_{target}(k, j-1) & \Delta x_{target}(k, j-1) \\ \vdots & \vdots \end{bmatrix} \begin{bmatrix} \cos\delta \\ \sin\delta \end{bmatrix}$    (24)
The least squares solution of the observation sequence provides the estimation of
the evolution matrix at every sampling step, and we obtain the estimated value of δ
as δ̂.
Let $(x_c(k), y_c(k))$ be the coordinates of the instantaneous centre of curvature of
the target trajectory; then from Fig. 3 we can write
$x_c(k) = x_{target}^k - r\cos\theta_k$    (25)
$y_c(k) = y_{target}^k - r\sin\theta_k$    (26)
Therefore, using (17) and (18), $(x_c(k), y_c(k))$ is calculated as follows:
$x_c(k) = x_{target}^k - \dfrac{\Delta y_{target}(k, 1)}{\hat{\delta}}$    (27)
$y_c(k) = y_{target}^k + \dfrac{\Delta x_{target}(k, 1)}{\hat{\delta}}$    (28)
Steps for calculating the centre of curvature of the target trajectory are mentioned
in detail in Algorithm 1.
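The essence of Algorithm 1 can be sketched as follows, assuming the filtered target positions are supplied oldest-first; the linear system is assembled in the spirit of Eq. (24), and the centre of curvature follows Eqs. (27)–(28). The function and its names are illustrative only.

```python
import numpy as np

def estimate_delta_and_centre(xs, ys):
    """Estimate the turn increment delta and the instantaneous centre of
    curvature from the last m filtered target positions (cf. Eqs. 24, 27, 28).

    xs, ys : arrays of the m most recent positions, oldest first (m >= 3).
    """
    dx, dy = np.diff(xs), np.diff(ys)
    # Successive position differences are related by a rotation through delta;
    # stacking these relations gives a linear system in (cos delta, sin delta).
    A = np.vstack([np.column_stack((dx[:-1], -dy[:-1])),
                   np.column_stack((dy[:-1],  dx[:-1]))])
    b = np.concatenate([dx[1:], dy[1:]])
    (c, s), *_ = np.linalg.lstsq(A, b, rcond=None)
    delta_hat = np.arctan2(s, c)
    # Centre of curvature from the previous difference (Eqs. 27-28 with j = 1).
    x_c = xs[-1] - dy[-2] / delta_hat
    y_c = ys[-1] + dx[-2] / delta_hat
    return delta_hat, x_c, y_c
```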
Once the instantaneous centre of curvature of the target trajectory at the current
instant is estimated, the target position is estimated using the continuous-discrete
extended Kalman Filter (EKF) framework. The continuous target motion model is
represented as,
$\dot{X} = F(X, U, \xi)$    (29)
where
$F(X, U) = \begin{bmatrix} -V_{target}\,(y_{target} - y_c)\big/\sqrt{(y_{target} - y_c)^2 + (x_{target} - x_c)^2} \\ \phantom{-}V_{target}\,(x_{target} - x_c)\big/\sqrt{(y_{target} - y_c)^2 + (x_{target} - x_c)^2} \end{bmatrix}$    (30)
where ξ is the process noise and ηk is the measurement noise. It is assumed that
process noise and measurement noises are zero mean Gaussian white noise, that is,
ξ ∼ N (0, Q) and ηk ∼ N (0, R).
The prediction step is the first stage of the EKF algorithm, in which the previous state
and input values are propagated through the non-linear process model in continuous time
to obtain the predicted state estimate and error covariance,
$\dot{\hat{X}} = F(\hat{X}, U, 0)$    (32)
$\dot{P} = AP + PA^T + Q$    (33)
where the matrix $A = \dfrac{\partial F}{\partial X}\Big|_{\hat{X}}$. The matrix A can be derived as
$A = \dfrac{V_{target}}{\left((y_{target} - y_c)^2 + (x_{target} - x_c)^2\right)^{3/2}}\,\Gamma$    (34)
where
$\Gamma = \begin{bmatrix} (y_{target} - y_c)(x_{target} - x_c) & -(x_{target} - x_c)^2 \\ (y_{target} - y_c)^2 & -(x_{target} - x_c)(y_{target} - y_c) \end{bmatrix}$    (35)
The measurement update step then corrects the predicted estimate and covariance as
$\hat{X}_k^+ = \hat{X}_k^- + L_k\,(Y_k - C_k\hat{X}_k^-)$    (36)
$P_k^+ = (I - L_k C_k)\,P_k^-$    (37)
where
$L_k = P_k C_k^T\,(R + C_k P_k C_k^T)^{-1}$    (38)
$C_k = \dfrac{\partial H}{\partial X}\Big|_{\hat{X}}$    (39)
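A compact sketch of one continuous-discrete EKF cycle is shown below; it assumes a simple Euler discretisation of Eqs. (32)–(33) over one sampling interval and a measurement model that is linear in the position (so that $C_k$ reduces to the identity), which is a simplification of the general form in Eq. (39).

```python
import numpy as np

def ekf_step(x_hat, P, z, xc, yc, v_target, dt, Q, R):
    """One continuous-discrete EKF cycle for the target model of Sect. 3.

    x_hat : state estimate [x_target, y_target];  P : error covariance
    z     : position measurement derived from the camera (Eq. 12)
    (xc, yc) : estimated centre of curvature;  Q, R : noise covariances
    """
    dx, dy = x_hat[0] - xc, x_hat[1] - yc
    rho = np.hypot(dx, dy)

    # Prediction: Euler-integrate Eqs. (32)-(33) over one sampling interval.
    f = np.array([-v_target * dy / rho, v_target * dx / rho])
    A = (v_target / rho**3) * np.array([[dy * dx, -dx**2],
                                        [dy**2,  -dx * dy]])   # Eqs. (34)-(35)
    x_pred = x_hat + f * dt
    P_pred = P + (A @ P + P @ A.T + Q) * dt

    # Update: the measurement is the position itself, so C is the identity here.
    C = np.eye(2)
    L = P_pred @ C.T @ np.linalg.inv(R + C @ P_pred @ C.T)      # Eq. (38)
    x_new = x_pred + L @ (z - C @ x_pred)                       # Eq. (36)
    P_new = (np.eye(2) - L @ C) @ P_pred
    return x_new, P_new
```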
The EKF position estimation framework provides a filtered position of the target,
which is then used for predicting the target’s trajectory. The workflow of trajectory
prediction is divided into two phases, namely, the observation phase and the predic-
tion phase. During the observation phase, a predefined sequence of observations of
the estimated target position is gathered, and the trajectory is predicted in the near
future.
Prediction of the target position over a short horizon is important for ease of tracking
of the target. For prediction of the target position up to n steps ahead, we can write, for
$j = 0, 1, 2, \ldots, n-1$,
$\hat{x}_{target}^{k+j+1} = \hat{x}_{target}^{k+j} + \Delta\hat{x}_{target}(k, j+1)$    (40)
$\hat{y}_{target}^{k+j+1} = \hat{y}_{target}^{k+j} + \Delta\hat{y}_{target}(k, j+1)$    (41)
$\Delta\hat{x}_{target}(k, j+1) = \hat{x}_{target}^{k+j+1} - \hat{x}_{target}^{k+j} = \Delta\hat{x}_{target}(k, j)\cos\hat{\delta} - \Delta\hat{y}_{target}(k, j)\sin\hat{\delta}$    (42)
$\Delta\hat{y}_{target}(k, j+1) = \hat{y}_{target}^{k+j+1} - \hat{y}_{target}^{k+j} = \Delta\hat{x}_{target}(k, j)\sin\hat{\delta} + \Delta\hat{y}_{target}(k, j)\cos\hat{\delta}$    (43)
The steps for the trajectory prediction are described in Algorithm 2.
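For illustration, the n-step prediction of Eqs. (40)–(43) can be written as the short routine below, assuming the latest filtered position and position difference are available together with the estimated $\hat{\delta}$; names are illustrative.

```python
import numpy as np

def predict_n_steps(x_last, y_last, dx_last, dy_last, delta_hat, n):
    """Propagate the target position n steps ahead with the estimated turn
    increment delta_hat (Eqs. 40-43).

    (x_last, y_last)   : latest filtered target position
    (dx_last, dy_last) : latest position difference between two samples
    """
    c, s = np.cos(delta_hat), np.sin(delta_hat)
    x, y, dx, dy = x_last, y_last, dx_last, dy_last
    predictions = []
    for _ in range(n):
        dx, dy = dx * c - dy * s, dx * s + dy * c   # rotate the step (Eqs. 42-43)
        x, y = x + dx, y + dy                       # advance the position (Eqs. 40-41)
        predictions.append((x, y))
    return predictions
```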
4 Estimation of the Looping Trajectory Using Learning and Curve Fitting
In this section, the mathematical formulation for the estimation of the looping tra-
jectory is derived. The curve-fitting technique is applied in the next loop based on
the initial observations of the target position in the first loop. It is to be noted that
conventional curve fitting techniques using the regression method will fail to esti-
mate the complex curves with multiple loops; therefore, the curve fitting technique
is formulated using the learning techniques. The following assumptions are made
about the target motion.
• The target drone is moving continuously in a looping trajectory in standard geo-
metrical curves.
• The target trajectory is a closed loop curve.
To the best of our knowledge, we have considered all the standard high-order closed
curves, and the method fits the data to the appropriate curve equation without prior
knowledge of the shape of the figure. The closed curves taken into consideration are listed
in Table 1. We have considered curves with one parameter (for example, Astroid,
Nephroid, etc.) and with two parameters (for example, Limacon, Squircle, etc.).
Since Circle is a special case of the ellipse, we will include both in a single category.
This method is also applicable to any closed, mathematically derivable curve.
The curves mentioned above have been well studied, and their characteristics are
well known. They are usually the zero set of some multivariate polynomials. We can
write their equations as
$f(x, y) = 0$    (44)
For example, the Lemniscate of Bernoulli with parameter a can be written as
$(x^2 + y^2)^2 - 2a^2(x^2 - y^2) = 0$    (45)
In general, including the shape parameters a and b, the curves can be expressed as
f (x, y, a, b) = 0 (46)
where b may or may not be used based on the category of shape the points are being
fitted to.
Univariate polynomials of the form
$y = c_0 + c_1 x + c_2 x^2 + \cdots + c_{k-1} x^{k-1}$    (47)
can be solved using matrices if there are enough points to solve for the k unknown
coefficients. On the other hand, multivariate equations require different methods to
solve for their coefficients. One method for curve fitting uses an iterative least-squares
approach along with specifying related constraints.
The trained classifier maps a sequence of observed target positions to a shape category O,
$f : [x_0, x_1, \ldots, x_m, y_0, y_1, \ldots, y_m] \rightarrow O$    (48)
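The chapter does not specify the network architecture, so the following is only an illustrative PyTorch sketch of a recurrent classifier of the kind implied by Eq. (48): a sequence of observed (x, y) positions is mapped to one of the closed-curve categories (nine are mentioned in Sect. 4.3). Layer sizes and names are assumptions.

```python
import torch
import torch.nn as nn

class CurveClassifier(nn.Module):
    """Illustrative recurrent classifier for the mapping in Eq. (48): a sequence
    of observed (x, y) target positions is mapped to a closed-curve category."""

    def __init__(self, num_classes=9, hidden_size=64):
        super().__init__()
        self.lstm = nn.LSTM(input_size=2, hidden_size=hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, num_classes)

    def forward(self, points):
        # points: (batch, sequence_length, 2) normalised (x, y) samples of one loop
        _, (h_n, _) = self.lstm(points)
        return self.head(h_n[-1])      # class logits

# Example: classify a batch of two trajectories of 200 samples each.
model = CurveClassifier()
logits = model(torch.randn(2, 200, 2))
predicted_category = logits.argmax(dim=1)
```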
Considering any of the above-mentioned curves in two dimensions, the base equation
has to be modified to account for both offset and orientation in 2D. Therefore, let the
orientation be some θ , and the offset be (x0 , y0 ). On applying a counter-clockwise θ
rotation to a set of points, the rotation is defined by this matrix equation:
$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}$    (49)
Substituting (x, y) from Eq. 49 into Eq. 46 and denoting the resulting function of the
rotated coordinates by g, we have
$g(x, y, \theta, a, b) = 0$    (52)
To account for the offset from the origin, we can replace all x and y with $x'$ and $y'$,
respectively, where
$x' = x - x_0$    (53)
$y' = y - y_0$    (54)
and (x0 , y0 ) is the offset of the centre of the figure from the origin. Therefore, we
have
g(x, y, θ, a, b, x0 , y0 ) = 0 (55)
as the final equation of the figure we are trying to fit. Applying the least-squares
method to the above equation for curve fitting of m empirical points $(x_i, y_i)$, the
squared error is
$E^2 = \sum_{i=0}^{m} \left(g(x_i, y_i, \theta, a, b, x_0, y_0) - 0\right)^2$    (56)
Our aim is to find $x_0$, $y_0$, a, b and θ such that $E^2$ is minimised. This can only be
done by
$\dfrac{dE^2}{d\beta} = 0, \quad \text{where } \beta \in \{x_0, y_0, a, b, \theta\}$    (57)
If g had been a linear equation, simple matrix multiplication would have yielded the
optimum parameters. But since Eq. 55 is a complex nth (where n is 2, 4 or 6) order
nonlinear equation with trigonometric variables, we need to use iterative methods
in order to estimate the parameters, a, b, θ , x0 , and y0 . Therefore, this work uses
Levenberg-Marquardt [14, 21] least-squares algorithm to solve the non-linear Eq. 57.
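A minimal sketch of this fitting step is shown below using SciPy's Levenberg-Marquardt solver; it assumes the Lemniscate of Bernoulli as the classified shape (Eq. 45) and the offset-and-rotation parametrisation of Eqs. (49)–(55). The synthetic data, initial guess and noise level are purely illustrative.

```python
import numpy as np
from scipy.optimize import least_squares

def lemniscate_residuals(params, xs, ys):
    """Implicit-equation residuals for a rotated, offset Lemniscate of Bernoulli
    (Eqs. 45, 55-56); points lying on the curve give residuals close to zero."""
    x0, y0, a, theta = params
    # Undo the offset and the orientation so the base equation can be evaluated.
    xr = (xs - x0) * np.cos(theta) + (ys - y0) * np.sin(theta)
    yr = -(xs - x0) * np.sin(theta) + (ys - y0) * np.cos(theta)
    return (xr**2 + yr**2) ** 2 - 2.0 * a**2 * (xr**2 - yr**2)

# Synthetic noisy observations of a lemniscate with a = 5, rotated and shifted.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 2.0 * np.pi, 200)
a_true, theta_true, x0_true, y0_true = 5.0, 0.3, 1.0, -2.0
u = a_true * np.sqrt(2.0) * np.cos(t) / (1.0 + np.sin(t) ** 2)
v = a_true * np.sqrt(2.0) * np.sin(t) * np.cos(t) / (1.0 + np.sin(t) ** 2)
xs = x0_true + u * np.cos(theta_true) - v * np.sin(theta_true) + 0.05 * rng.standard_normal(t.size)
ys = y0_true + u * np.sin(theta_true) + v * np.cos(theta_true) + 0.05 * rng.standard_normal(t.size)

# Levenberg-Marquardt refinement of the shape parameters from a rough initial guess (Eq. 57).
fit = least_squares(lemniscate_residuals, x0=[0.0, 0.0, 4.0, 0.0],
                    args=(xs, ys), method="lm")
print(fit.x)   # estimated [x0, y0, a, theta]
```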
If the orientation of any shape is in 3D, the above algorithm will need some modifi-
cations. We first compute the equation of the plane in which the shape lies and then
transform the set of points to a form where the method in Sect. 4.2 can be applied.
In order to find the normal to the plane of the shape, we carry out singular value
decomposition (SVD) of the given points. Let the set of points (x, y, z) be represented
as matrix A ∈ Rn×3 . From each point, subtract the centroid and calculate SVD of A.
$A = U\Sigma V^T$    (58)
The plane normal $\mathbf{n} = (n_1, n_2, n_3)$ is the right singular vector corresponding to the
smallest singular value, i.e., the last column of V, and the plane containing the points
can be written as
$n_1 x + n_2 y + n_3 z = C$    (59)
where C is a constant. The next step is to transform the points to the X-Y plane. For
that, we first find the intersection of the above plane with the X-Y plane by substituting
z = 0 in Eq. 59. We get the equation of the line as
$n_1 x + n_2 y = C$    (60)
Then we rotate the points about the z-axis such that the above line is parallel to the
x-axis. The rotation angles $\alpha = 0$, $\beta = 0$ and $\gamma = \arctan(-n_2/n_1)$ need to be substituted
in the matrix R given in Eq. 61. The new points will be $A_z = A R$.
$R = \begin{bmatrix} \cos\beta\cos\gamma & \sin\alpha\sin\beta\cos\gamma - \cos\alpha\sin\gamma & \cos\alpha\sin\beta\cos\gamma + \sin\alpha\sin\gamma \\ \cos\beta\sin\gamma & \sin\alpha\sin\beta\sin\gamma + \cos\alpha\cos\gamma & \cos\alpha\sin\beta\sin\gamma - \sin\alpha\cos\gamma \\ -\sin\beta & \sin\alpha\cos\beta & \cos\alpha\cos\beta \end{bmatrix}$    (61)
We then rotate the points about the X-axis by the angle $\cos^{-1}\!\left(|n_3| \,/\, \|(n_1, n_2, n_3)\|\right)$ to
make the points lie in the X-Y plane. Then, we substitute the angles
$\alpha = \arccos\!\left(|n_3| \,/\, \|(n_1, n_2, n_3)\|\right)$, $\beta = 0$ and $\gamma = 0$ in the rotation matrix given in (61). Finally,
the set of points in the X-Y plane will be $A_{final} = A_z R$. We can then use the
neural network described in Sect. 4.1 to classify the curve into one of the nine cate-
gories. Then, we can compute the parameters (x0 , y0 , a, b, θ ) of the classified curve
using method given in Sect. 4.2. The combined algorithm for the shape detection and
parameter estimation is shown in Algorithm 3.
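The plane-fitting and flattening step of Sect. 4.3 can be sketched as follows; note that, instead of the two successive rotations built from Eq. (61), this sketch aligns the estimated plane normal with the z-axis in a single step using Rodrigues' rotation formula, which produces an equivalent in-plane point set. All names are illustrative.

```python
import numpy as np

def flatten_to_xy(points):
    """Fit a plane to 3D target positions with SVD (Eq. 58) and rotate the
    points so that they lie (approximately) in the X-Y plane.

    points : (n, 3) array of observed target positions.
    Returns the transformed, centred (n, 3) array and the unit plane normal.
    """
    centred = points - points.mean(axis=0)
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    normal = vt[-1]                       # direction of least variance = plane normal

    # Rotate the plane normal onto the global z-axis (Rodrigues' formula).
    z = np.array([0.0, 0.0, 1.0])
    v = np.cross(normal, z)
    s, c = np.linalg.norm(v), float(np.dot(normal, z))
    if s < 1e-12:                          # normal already (anti-)parallel to z
        R = np.eye(3) if c > 0 else np.diag([1.0, -1.0, -1.0])
    else:
        vx = np.array([[0.0, -v[2], v[1]],
                       [v[2], 0.0, -v[0]],
                       [-v[1], v[0], 0.0]])
        R = np.eye(3) + vx + vx @ vx * ((1.0 - c) / s**2)
    return centred @ R.T, normal
```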
5 Interception Strategy
In this section, the interception strategy is formulated considering that the target is
moving in a repetitive figure-of-eight trajectory; however, the proposed framework
could be extended to other geometric curves. Once the target trajectory is estimated,
the favourable location for the interceptor is selected where interception can lead to
almost head-on collision. Consider a target moving in the direction of the arrow
(marked in green), as shown in Fig. 5. Once the trajectory of the target is estimated
through the EKF, machine learning and curve fitting techniques, and the direction of
target motion is known, I1 and I2 are found to be the favourable locations for generating
an almost head-on interception scenario. The part of the curve between the red and
yellow lines is nearly straight, so the target can be detected earlier, and the interceptor
will have more response time for target engagement.
Once the target is detected, the interceptor applies the pursuit-based guidance strategy
to generate the desired velocity and desired yaw rate. Details of the guidance strategy
are mentioned in [11, 34]. Here, in Algorithm 4, we list the important
steps of the guidance command for the sake of completeness. The desired velocity
(Vdes ) for the interceptor is generated to drive the interceptor along the line joining
the interceptor and the projection of the target in the image plane. The desired yaw
rate (rdes ) is generated to keep the target in the field of view of the camera by using
a PD controller based on the error (eψ ) between the desired yaw and the actual
yaw. In the case of an interception scenario with multiple interceptors, the estimated
favourable standoff locations are allocated to multiple drones through task allocation
architecture considering the current states of target drones and interceptors.
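The guidance commands of Algorithm 4 are detailed in [11, 34]; the following is only a rough pursuit-style sketch of the idea, with placeholder gains and a placeholder speed limit that are not taken from the chapter.

```python
import numpy as np

def guidance_commands(p_interceptor, p_target, yaw, v_max,
                      kp=1.5, kd=0.2, prev_err=0.0, dt=0.05):
    """Rough pursuit-style guidance in the spirit of Algorithm 4.

    The desired velocity drives the interceptor along the line of sight to the
    target, and a PD law on the yaw error keeps the target in the camera's
    field of view.  Gains and the speed limit are placeholders.
    """
    los = np.asarray(p_target, dtype=float) - np.asarray(p_interceptor, dtype=float)
    v_des = v_max * los / (np.linalg.norm(los) + 1e-9)              # desired velocity
    yaw_des = np.arctan2(los[1], los[0])                            # desired heading
    err = np.arctan2(np.sin(yaw_des - yaw), np.cos(yaw_des - yaw))  # wrapped yaw error
    yaw_rate_des = kp * err + kd * (err - prev_err) / dt            # PD yaw-rate command
    return v_des, yaw_rate_des, err

# Example: interceptor at the origin facing east, target about 20 m away to the north-east.
v_cmd, r_cmd, e = guidance_commands([0.0, 0.0, 2.0], [14.0, 14.0, 3.0], yaw=0.0, v_max=8.0)
```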
6 Results
the perspective projection. The position of the ball is fed to the EKF module as the
measurement and used in the EKF module for filtered position estimation of the ball.
After one loop of data, the neural network predicts the category of the shape, after
which the curve fitting algorithm estimates the shape parameters using the appropriate
shape equation. The raw position data and the estimated shape of different curves are
shown in Figs. 9–17. The green colour dot shows the raw position of the ball obtained
using the information from the camera, and the blue curve represents the predicted
shape of the curve. As can be seen in Figs. 9–17 the overall estimation framework is
able to reasonably approximate the desired geometry of the curve.
The proposed estimation framework is tested for the estimation of 3D geometric
curves. Estimation of the Lemniscate of Gerono in 3D is shown in Fig. 18.
The proposed high-speed target interception strategy is tested by creating an inter-
ception scenario in the Gazebo environment where two interceptor drones are trying
to intercept the target ball moving in the figure-of-eight (Lemniscate of Gerono)
curve (similar to Fig. 1). Different snapshots of the experiments are shown in Figs. 19–
24. Figure 20 shows a snapshot during the tracking and estimation of the ball position
and the corresponding trajectory using the visual information. Figures 22–24 show
snapshots during the engagement once the target ball is detected by Drone 2 while
waiting at the standoff location.
The experimental setup (shown in Fig. 25) consists of two drones. One of the drones,
the target drone, has a red ball attached and will fly in a figure-of-8 trajectory. The
second drone is fitted with a See3CAM 130 monocular camera for the detection of the
ball. The second drone will follow the target drone and estimate the ball position and
velocity using computer vision techniques. The raw position data as obtained using
the information from the image plane is shown in Fig. 26. The raw position data
is fed to the EKF algorithm and subsequently through the RNN and Least-Square
curve fitting techniques. Figure 27 shows the estimated figure-of-eight and raw ball
position observed using visual information with the proposed framework (Figs. 13,
14, 15, 17, 21, 23).
7 Discussions
Interception of a target with low speed is easy compared to a target with a higher
speed as the region of capturability for a given guidance algorithm will be higher for
interception of a low-speed target. In the case of interception with small UAVs, the
target information is obtained from a camera’s information, and a camera’s detection
range is small compared to other sensors like radar. Interception of a high-speed target
using a conventional guidance strategy is difficult, even if the high-speed target is
moving in a repetitive loop. Therefore, the looping trajectory of the target needs to
be estimated so that the interceptor can be placed at a favourable position for ease in the
interception of the high-speed target. The target position obtained from the visual
sensor in an outdoor environment is too noisy to estimate the target trajectory directly, as
observed from field experiments. So, we have estimated the target position using the
extended Kalman filter framework. To obtain the target position, the interceptor needs to
keep tracking the target, so we have proposed to predict the target trajectory over a short
horizon using least-squares methods considering the sequence of observed target
positions. Once the initial observations of the target position are made, learning
techniques and curve fitting methods are applied to identify the curve. Once the
parameter of the curve is estimated, the interceptors are placed for the head-on
collision situation. We have successfully validated the estimation framework for
various geometric curves in the Gazebo and outdoor environments. The geometric
curves should be a standard closed loop curve. While formulating the motion model
for target position estimation, it is assumed that the target’s motion is smooth, i.e.,
the change in curvature of the target’s trajectory remains bounded and smooth over
time. This assumption is the basis of our formulation of the target motion model.
The interception strategy is checked only in simulation. The maximum speed of the
target should be within a limit such that tracking by the interceptor is possible in the
first loop. The standoff location for the interceptors is selected such that the
interceptor will have more reaction time to initiate the engagement. The proposed
framework provides a better interception strategy for a high-speed target than
directly chasing the target after detection, owing to the higher response time and better
alignment along the target's path.
8 Conclusions
In this chapter, we present a framework designed to estimate and predict
the position of a moving target that follows a repetitive path of some standard
shape. The proposed trajectory estimation algorithm is used to formulate the inter-
ception strategy for a target having a higher speed than the interceptor. The target
position is estimated using the EKF framework using visual information, and then
the target position is used to estimate the shape of the repetitive loop of the target.
Estimation of different curves such as the Lemniscate of Bernoulli, Deltoid, and Limacon
is performed using a realistic visual sensor setup in the Gazebo environment. The
proposed high-speed interception strategy is validated by simulating an interception
scenario of a high-speed target moving in a figure-of-eight trajectory in the ROS-
Gazebo framework. Future work includes the integration of the proposed estimation
and prediction algorithm in the interception framework and validation of the com-
plete architecture in the outdoor environment. The proposed technique can also be
used to help the motion planning of autonomous cars and develop driver-assistance
systems in traffic junctions.
Acknowledgements We would like to acknowledge the Robert Bosch Center for Cyber Physical
Systems, Indian Institute of Science, Bangalore, and Khalifa University, Abu Dhabi, for partial
financial support.
References
1. Abbas, M. T., Jibran, M. A., Afaq, M., & Song, W. C. (2020). An adaptive approach to vehicle
trajectory prediction using multimodel Kalman filter. Transactions on Emerging Telecommuni-
cations Technologies, 31(5), e3734.
2. Anderson-Sprecher, R., & Lenth, R. V. (1996). Spline estimation of paths using bearings-only
tracking data. Journal of the American Statistical Association, 91(433), 276–283.
3. Banerjee, P., & Corbetta, M. (2020). In-time UAV flight-trajectory estimation and tracking using
Bayesian filters. In 2020 IEEE Aerospace Conference (pp. 1–9). IEEE.
4. Barisic, A., Petric, F., & Bogdan, S. (2022). Brain over brawn: Using a stereo camera to detect,
track, and intercept a faster UAV by reconstructing the intruder's trajectory. Field Robotics, 2,
34–54.
5. Beul, M., Bultmann, S., Rochow, A., Rosu, R. A., Schleich, D., Splietker, M., & Behnke, S.
(2020). Visually guided balloon popping with an autonomous MAV at MBZIRC 2020. In 2020
IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR) (pp. 34–41).
IEEE.
6. Cascarano, S., Milazzo, M., Vannin, A., Andrea, S., & Stefano, R. (2022). Design and develop-
ment of drones to autonomously interact with objects in unstructured outdoor scenarios. Field
Robotics, 2, 34–54.
7. Chen, M., Liu, Y., & Yu, X. (2015). Predicting next locations with object clustering and tra-
jectory clustering. In Pacific-Asia Conference on Knowledge Discovery and Data Mining (pp.
344–356). Springer
8. Cheung, Y., Huang, Y. T., & Lien, J. J. J. (2015). Visual guided adaptive robotic intercep-
tions with occluded target motion estimations. In 2015 IEEE/RSJ International Conference on
Intelligent Robots and Systems (IROS) (pp. 6067–6072). IEEE
9. Dong, G., & Zhu, Z. H. (2016). Autonomous robotic capture of non-cooperative target by
adaptive extended Kalman filter based visual servo. Acta Astronautica, 122, 209–218.
10. Hadzagic, M., & Michalska, H. (2011). A Bayesian inference approach for batch trajectory
estimation. In 14th International Conference on Information Fusion (pp. 1–8). IEEE.
11. Jana, S., Tony, L. A., Varun, V., Bhise, A. A., & Ghose, D. (2022). Interception of an aerial
manoeuvring target using monocular vision. Robotica, 1–20
12. Kim, S., Seo, H., Choi, S., & Kim, H. J. (2016). Vision-guided aerial manipulation using a
multirotor with a robotic arm. IEEE/ASME Transactions On Mechatronics, 21(4), 1912–1923.
13. Kumar, A., Ojha, A., & Padhy, P. K. (2017). Anticipated trajectory based proportional navi-
gation guidance scheme for intercepting high maneuvering targets. International Journal of
Control, Automation and Systems, 15(3), 1351–1361.
14. Levenberg, K. (1944). A method for the solution of certain non-linear problems in least squares.
Quarterly of Applied Mathematics, 2(2), 164–168.
15. Li, T., Prieto, J., & Corchado, J. M. (2016). Fitting for smoothing: a methodology for
continuous-time target track estimation. In 2016 International Conference on Indoor Posi-
tioning and Indoor Navigation (IPIN) (pp. 1–8). IEEE
16. Li, T., Chen, H., Sun, S., & Corchado, J. M. (2018). Joint smoothing and tracking based on
continuous-time target trajectory function fitting. IEEE transactions on Automation Science
and Engineering, 16(3), 1476–1483.
17. Lin, L., Yang, Y., Cheng, H., & Chen, X. (2019). Autonomous vision-based aerial grasping for
rotorcraft unmanned aerial vehicles. Sensors, 19(15), 3410.
18. Liu, Y., Suo, J., Karimi, H. R., & Liu, X. (2014). A filtering algorithm for maneuvering target
tracking based on smoothing spline fitting. In Abstract and Applied Analysis (Vol. 2014).
Hindawi
19. Luo, C., McClean, S. I., Parr, G., Teacy, L., & De Nardi, R. (2013). UAV position estimation
and collision avoidance using the extended Kalman filter. IEEE Transactions on Vehicular
Technology, 62(6), 2749–2762.
20. Ma, H., Wang, M., Fu, M., & Yang, C. (2012). A new discrete-time guidance law based on
trajectory learning and prediction. In AIAA Guidance, Navigation, and Control Conference (p.
4471).
21. Marquardt, D. W. (1963). An algorithm for least-squares estimation of nonlinear parameters.
Journal of the Society for Industrial and Applied Mathematics, 11(2), 431–441.
22. Mehta, S. S., Ton, C., Kan, Z., & Curtis, J. W. (2015). Vision-based navigation and guidance
of a sensorless missile. Journal of the Franklin Institute, 352(12), 5569–5598.
23. Pang, B., Ng, E. M., & Low, K. H. (2020). UAV trajectory estimation and deviation analysis
for contingency management in urban environments. In AIAA Aviation 2020 Forum (p. 2919)
24. Prevost, C. G., Desbiens, A., & Gagnon, E. (2007). Extended Kalman filter for state estimation
and trajectory prediction of a moving object detected by an unmanned aerial vehicle. In 2007
American Control Conference (pp. 1805–1810). IEEE.
25. Qu, L., & Dailey, M. N. (2021). Vehicle trajectory estimation based on fusion of visual motion
features and deep learning. Sensors, 21(23), 7969.
26. Roh, G. P., & Hwang, S. W. (2010). Nncluster: an efficient clustering algorithm for road network
trajectories. In International Conference on Database Systems for Advanced Applications (pp.
47–61). Springer
27. Schulz, J., Hubmann, C., Löchner, J., & Burschka, D. (2018). Multiple model unscented Kalman
filtering in dynamic Bayesian networks for intention estimation and trajectory prediction. In
2018 21st International Conference on Intelligent Transportation Systems (ITSC) (pp. 1467–
1474). IEEE.
28. Shamwell, E. J., Leung, S., & Nothwang, W. D. (2018). Vision-aided absolute trajectory esti-
mation using an unsupervised deep network with online error correction. In 2018 IEEE/RSJ
International Conference on Intelligent Robots and Systems (IROS) (pp. 2524–2531). IEEE
29. Shrivastava, A., Verma, J. P. V., Jain, S., & Garg, S. (2021). A deep learning based approach
for trajectory estimation using geographically clustered data. SN Applied Sciences, 3(6), 1–17.
30. Strydom, R., Thurrowgood, S., Denuelle, A., & Srinivasan, M. V. (2015). UAV guidance: a
stereo-based technique for interception of stationary or moving targets. In Conference Towards
Autonomous Robotic Systems (pp. 258–269). Springer
31. Su, K., & Shen, S. (2016). Catching a flying ball with a vision-based quadrotor. In International
Symposium on Experimental Robotics (pp. 550–562). Springer
32. Sung, C., Feldman, D., & Rus, D. (2012). Trajectory clustering for motion prediction. In 2012
IEEE/RSJ International Conference on Intelligent Robots and Systems (pp. 1547–1552). IEEE
33. Thomas, J., Loianno, G., Sreenath, K., & Kumar, V. (2014). Toward image based visual servoing
for aerial grasping and perching. In 2014 IEEE International Conference on Robotics and
Automation (ICRA) (pp. 2113–2118). IEEE
34. Tony, L. A., Jana, S., Bhise, A. A., Gadde, M. S., Krishnapuram, R., Ghose, D., et al. (2022).
Autonomous cooperative multi-vehicle system for interception of aerial and stationary targets
in unknown environments. Field Robotics, 2, 107–146.
35. Yan, L., Zhao, J., Shen, H., & Li, Y. (2014). Biased retro-proportional navigation law for
interception of high-speed targets with angular constraint. Defence Technology, 10(1), 60–65.
36. Zhang, X., Wang, Y., & Fang, Y. (2016). Vision-based moving target interception with a mobile
robot based on motion prediction and online planning. In 2016 IEEE International Conference
on Real-time Computing and Robotics (RCAR) (pp. 17–21). IEEE
37. Zhang, Y., Wu, H., Liu, J., & Sun, Y. (2018). A blended control strategy for intercepting high-
speed target in high altitude. Proceedings of the Institution of Mechanical Engineers, Part G:
Journal of Aerospace Engineering, 232(12), 2263–2285.
38. Zhao, M., Shi, F., Anzai, T., Takuzumi, N., Toshiya, M., Kita, I., et al. (2022). Team JSK at
MBZIRC 2020: interception of fast flying target using multilinked aerial robot. Field Robotics,
2, 34–54.
Robotics and Artificial Intelligence
in the Nuclear Industry: From
Teleoperation to Cyber Physical Systems
Abstract This book chapter looks to address how upcoming technology can be used
to improve the efficiency of decommissioning processes within the nuclear industry.
Challenges associated with decommissioning are introduced with a brief overview
of the previous efforts and current practices of nuclear decommissioning. A high-
level cyber-physical architecture for nuclear decommissioning applications is then
proposed by drawing upon recent technological advances in the realm of Industry 4.0
such as internet of things, sensor networks, and increased use of data analytics and
cloud computing approaches. In the final section, based on demands and proposals
from industry, possible applications within the nuclear industry are identified and
discussed.
1 Introduction
1.1 Background
Around the world, many nuclear power plants are reaching the end of their active
life and are in urgent need of decontamination and decommissioning (D&D). In the
UK alone, seven advanced gas cooled reactors are due to enter decommissioning by
2028 [1]. This is in addition to the 11 Magnox nuclear reactors, along with research
sites, weapon production facilities, and fuel fabrication and reprocessing facilities
already at various stages of decommissioning. Current estimates are that the decom-
missioning programme will not be completed until 2135, at a cost of £237bn [2].
The D&D process requires that the systems, structures, and components (SSC) of
a nuclear facility be characterised and then handled accordingly. This may include
removing materials that are unhazardous, to dealing with highly radioactive mate-
rials such as nuclear fuel. The main objectives of the D&D process are to protect
workers, the public and environment, while also minimising waste and associated
costs [3]. Despite this, at present many D&D tasks are still reliant on manual labour,
putting workers at risk of radiation exposure. The need to limit this exposure in the
nuclear industry, as defined by as low as reasonably practical (ALARP) principles,
is a challenging problem [4].
1.2 Motivation
Certain facilities within a nuclear plant are not accessible by humans due to levels
of radioactivity present. Where access is possible such as in alpha-contaminated
areas, heavy duty personal protective equipment is required, including air fed
suits and leather overalls. This makes work cumbersome and strenuous, while
still not removing all risk to the worker. In addition, the high cost of material
disposal, including the associated equipment for segregation of materials, some of
which may be highly radioactive, has necessitated more efficient processes. Conse-
quently, decommissioning tasks such as cutting and dismantling components present
challenges to the nuclear industry and require novel solutions [5].
In the nuclear industry, due to the unique diversity and severity of its challenges,
significant barriers prevent robotic and autonomous system (RAS) deployment. These
challenges include, for example: (i) highly unstructured, uncertain and clut-
tered environments, (ii) high risk environments, with radioactive, thermal and chem-
ical hazards, (iii) exploration, mapping and modelling of unknown or partially known
extreme environments, (iv) powerful, precise, multi-axis manipulators needed with
complex multimodal sensing capabilities, (v) critically damaging effects of radiation
on electronic systems, (vi) need for variable robot supervision, from tele-immersion
to autonomous human–robot collaboration [6].
The Nuclear Decommissioning Authority (NDA) in the UK has set out 4 grand
challenges for nuclear decommissioning as shown in Table 1. As can be inferred
from Table 1, digitisation of the nuclear industry, in combination with autonomous
systems, advanced robotics, and wearable technologies, plays a significant role in
addressing these challenges. Sellafield Ltd., in charge of managing the world's largest
inventory of untreated nuclear waste, has also identified key areas for technical
innovation, based upon the grand challenges presented by the NDA [7]. A major
difficulty in decommissioning current nuclear power plants is the lack of available
data regarding storage and previous operation. Future builds would benefit from a
comprehensive accounting of all lifecycle activities and incidents, which would in
turn, make planning dismantling and decontamination a much easier process.
Robots are a natural solution to challenges faced with D&D processes. Neverthe-
less, uptake of robot systems in the nuclear industry has been slow, with handling
applications limited to teleoperated manipulators controlled through thick lead
glass windows, greatly reducing situational awareness [5]. Furthermore, while these
systems are robust and rugged, they lack feedback sensors or inverse kinematic
controllers, making them slow to operate and reliant on the experience and skill of
the operator. To address these challenges innovative technology is needed to improve
efficiency and reduce the time taken to decommission plants.
The need for robotic capability in the nuclear industry has long been recognised,
particularly in the aftermath of accidents where radioactive sources can become
dispersed and the danger to human life is much greater. Particular examples of these
accidents include Three Mile Island in the USA, Chernobyl in Ukraine, and more
recently Fukushima in Japan. While the risks posed by nuclear power are great, the need
for cleaner energy and net-zero carbon emissions is critical, and so there is a necessity
for systems that can deal with nuclear accidents, as well as provide support in the
handling of common decommissioning tasks [9].
The current study expands on earlier research on the use of robotics and
autonomous systems for nuclear decommissioning applications done at Lancaster
University [10, 11]. Although the hydraulically actuated robotic manipulators are
crucial for decommissioning operations, the inherent nonlinearities in the hydraulic
joints make modelling and control extremely difficult. For instance, in [12] it is
suggested to use a genetic algorithm technique to estimate the unknown parame-
ters of a hydraulically actuated, seven degree of freedom manipulator. In [13], the
estimation outcomes are enhanced by utilising a multi-objective cost function to
replace the output error system identification cost function. Another issue arising
in using hyper redundant manipulators in nuclear decommissioning is the need for
developing a numerically efficient inverse kinematic approach to be robust against
potential singularities [14]. An explanation of the earliest studies on the approaches
Industry 4.0 originated in Germany and has since expanded in scope to include digital-
isation of manufacturing processes, becoming known as the fourth industrial revolu-
tion. The industry 4.0 paradigm can be characterised by key technologies that enable
the shift towards greater digitalisation and the creation of cyber-physical systems
(CPSs). This includes merging of technology that was once isolated, providing oppor-
tunities for new applications. As the key area of Industry 4.0 is manufacturing, many
applications are focused on increasing productivity within factories, most commonly
accomplished by improving the efficiency of machinery.
Industrial automation in manufacturing encompasses a wide range of technologies
that can be used to improve processes. With the advent of industry 4.0, there is now
a trend for a holistic approach to increasing automation by considering all aspects
of a process and how they are interlinked. Autonomous industrial processes can be
used to replace humans in work that is physically hard, monotonous or performed
in extreme environments as well as perform tasks beyond human capabilities such
as handling heavy loads or working to fine tolerances. At the same time, they offer
the opportunity for data collection, analytics, and quality checks, while providing
improved efficiency and reduced operation costs [24].
Automation of industrial processes and manufacturing systems requires the use
of an array of technologies and methods. Some of the technologies used at present
include distributed control systems (DCS), supervisory control and data acquisi-
tion (SCADA), and programmable logic controllers (PLC). These can be combined
with systems such as robot manipulators, CNC machining centres and/or bespoke
machinery.
Industry 4.0 continues past developments and looks to enhance the level of
autonomy in an industrial process. This can include new concepts and technologies
such as:
The aim of this chapter is to provide an overview of the current challenges faced
during D&D operations at nuclear power plants, and how these can be addressed
by greater digital integration and advanced technology such as seen as part of the
Industry 4.0 initiative. A new conceptual design for the decommissioning process
using the cyber-physical framework is proposed and the challenges and opportunities
for the integrated technologies are discussed by drawing upon the literature. The start
of the chapter will give background on the D&D process and some of the challenges
faced. A review of the current state of the art in relevant research is then provided
for common D&D tasks. In the final section, a framework for autonomous systems
is developed, building upon the current advances in robot and AI system research.
This section will provide the background on the D&D process and some of the chal-
lenges faced at each stage of the process. There are several main processes that must
be completed during D&D, categorised here as characterisation, decontamination,
dismantling and demolition, and waste management.
2.1 Characterisation
2.2 Decontamination
2.3 Dismantling and Demolition
Dismantling (or segmentation) is often required for the components and systems
within a nuclear facility, a key example of this is the reactor. There is no standardised
method for dismantling; each facility will require a tailored approach depending on
the design and conditions. As with decontamination, variables such as component
types and ease of access need to be considered to determine the optimal approach.
There are several techniques available for dismantling, these include:
Each technique can have advantages for a given application, depending on require-
ments for cutting depth, speed, and type of waste generated. Off-the-shelf equip-
ment may be adapted for the nuclear environment, along with the application of
tele-operation and/or robotics to reduce the risk of radiation to operators. Once the
systems and components have been removed, and the building is free from radioac-
tivity and contamination, demolition can be carried out. This can be done using
conventional techniques such as ball and breakers, collapsing, or explosives where
appropriate [3].
There is a range of robots required in the nuclear industry, each designed for specific
applications. These can be water, land or air based, and often have to deal with chal-
lenging environments that include high radiation and restricted access. Robots used
in nuclear have additional requirements over other industries, particularly in relation
to the ability to cope with radioactive environments. This results in the need for high
equipment reliability with low maintenance requirements. While conventional off the
shelf equipment may be suitable for some applications in nuclear, invariably it will
need to be adapted to make it more suitable for deployment. This can involve tech-
niques such as relocating sensitive electronics, adding shielding, upgrading compo-
nents to radiation tolerant counterparts and/or adding greater redundancy through
alternative recovery methods [33].
Gloveboxes are used throughout the nuclear industry to provide a contained environ-
ment for handling hazardous objects and materials. They can however be difficult to
use and still present a risk of exposure to radioactive materials for the operator. In
addition, the gloves reduce tactile feedback and reduce mobility of the arms making
simple tasks tiring and challenging; it would therefore be beneficial to incorporate a
robotic system. There are however challenges for implementation such as integrating
robots within a glovebox which can be dark and cluttered, and protecting systems
from the harmful effects of radiation which can cause rapid deterioration to electrical
components. In [37], new technologies in robotics and AI are explored as to how
gloveboxes can be improved through the use of robotics. It is suggested that it is
preferable to design a robot to operate within the gloves to simplify maintenance
and protect it from contaminants. It is also noted while greater autonomy would
improve productivity, glovebox robotics mainly utilise teleoperation due to the risk
of breaking containment when using an autonomous system. Teleoperation methods
were further developed by a team at the University of Manchester which allowed
control of a manipulator using only the posture and gesture of bare hands. This was
implemented using virtual reality with Leap Motion and was successful at executing
a simple pick and place task. The system could be further improved with the addition
of haptic feedback, possibly achieved virtually using visual or audio feedback [38].
Currently under development by Sellafield Ltd. and the National Nuclear Laboratory
is the box encapsulation plant (BEP) which is intended to be used for post-processing
of decommissioning waste. Post-processing offers benefits including lower waste
classifications, reduced waste volume and safer waste packaging while also allowing
creation of an inventory. Due to the radioactivity of the waste materials, it is necessary
to have a remotely operated system. Current designs are tele-operated; however, they
are extremely difficult to operate. To address this, research [31] looks at how greater
autonomy can be applied to the manipulators used in the BEP. The key requirements
of the autonomous system can be defined as:
• Visual object detection, recognition, and localisation of waste canisters.
• Estimation of 6DOF manipulator pose and position.
• Decision making agent utilising vision data and acting with the manipulator
control system.
• Autonomous manipulation and disruption.
challenge on the inspection of spent fuel storage ponds. This project focused on
development of robust localisation and positioning techniques using Kalman filters
and model predictive control techniques. Experimental testing showed the MallARD
system was able the follow planned paths with a reduction in error of two orders of
magnitude [43].
The review in the previous section reveals that a large body of work remains to
complete all aspects of autonomy expected in the nuclear sector. Moreover, the
overview in the last two sections highlights the challenges the nuclear sector is
currently facing for decommissioning of power plants and clarifies the urgent need
for increasing the autonomy level in the whole process. Inspired by autonomous
industrial processes and drawing upon what is currently practised in various industrial
sectors after the fourth industrial revolution, in this section we aim to study the
underlying architecture and recent technological advances that motivate designing
an autonomous decommissioning process for the nuclear sector.
will vary dependent on how a system is designed. Early stages of development may
rely more on simulation due to the unpredictable nature of systems in the initial
phases. Considerations for design include robust design methods, facility design and
possible adaptions to accommodate the system. In addition, considerations regarding
the maintenance and decommissioning of a new system must be evaluated such as
methods of recovery in the event of power loss or other critical fault [46]. For a
tele-operated system this may include physical testing, user analysis and simulated
testing. Semi-autonomous systems require greater analysis and may require stability
proofs along with hardware in the loop testing. Such systems are at risk from events
such as loss of communication and may need to be continuously recalibrated to ensure
they are operating within safe limits. Research at the University of West of England
has investigated how simulation-based internal models can be used in the decision-
making process of robots to identify safety hazards at runtime. This approach
would allow a robot to adapt to a dynamic environment and predict dangerous actions,
allowing a system to be deployed across more diverse applications [47].
cyber physical production systems have been given as an adaptation to the manu-
facturing industry. This is based on the same framework as the conventional CPS,
mainly the 5Cs [54]. An example of a CPS developed for manufacturing automation
is proposed in [55]. The authors account for the hybrid system approach, integrating
the continuous-time physical process with the discrete-time control. In addition, the
paper notes that service-orientated architecture and multi-agent system approaches
are promising for the development of such a CPS. The concept system integrates the
robot movements with object recognition and fetching, all while taking safety into
consideration during human–robot interaction.
The interconnected nature of CPSs make them inherently susceptible to cyber-
attacks. These attacks can come from actors such as hacking organisations, govern-
ments, users (that may be intentional or not), or hacktivists. Hackers can exploit
vulnerabilities, such as cross site scripting or misconfigured security in order to
break into a system. These risks can be mitigated by maintaining security defined
by the CIA triad—confidentiality, integrity, and availability. Many cyber-attacks are
confined to the cyber domain and focus on sensitive data or disrupting computer
systems. In contrast, CPS attacks can have a direct impact in the physical world that
can cause damage to the physical systems as well as the potential to endanger life.
Preventing future attacks, which could have greater impact due to the greater number
of connected systems, is of utmost importance [56].
The cyber physical architecture discussed in the previous section can be realised
on the pillars of various recently proposed technologies, referred to as enabling
technologies here. In the following section, these technologies are reviewed in more
depth to set the scene for development of a cyber-physical system for the nuclear
industry.
Industrial Internet of Things. The Industrial Internet of Things (IIoT) is a subset
of the Internet of Things (IoT) concept whereby devices are all interconnected, facil-
itated by the internet. IoT has seen increasing development in many sectors such
as transportation and healthcare. Increasing connectivity has been enabled by the
greater use of mobile devices each of which can provide feedback data. IIoT covers
machine-to-machine (M2M) communication and similar industrial automation inter-
action technology. While there are some similarities with consumer based IoT, IIoT
differs in that connectivity must be structured, applications are critical, and data
volume is high [53]. While IoT tends to use wireless communications, industrial
applications often rely on wired connectivity due to greater reliability.
As IIoT is characterised by the substantial amounts of data transferred, the
methods for data exchange are a key consideration for IIoT systems. Industrial
communication networks have developed considerably over the years, utilising devel-
opments in fields such as IT. Initial industrial communication was developed using
fieldbus system networks which helped to improve communication between low level
devices. As the internet continued to grow, automation networks changed to incor-
porate more ethernet based technologies despite issues with real time capabilities.
More recently wireless networks have become more common as they allow easier
reconfiguration of systems and do not require extensive cabling; however, they still
have issues with real-time capabilities and concerns regarding reliability. In particular,
the new wave of communication technology that has aided the development of the
IoT and IIoT is more focused on consumer requirements and so is not currently
suitable for many of the demanding requirements of industry [57].
Using IIoT can help improve productivity and efficiency of processes. Real time
data acquisition and data analytics can be used in conjunction with IIoT to predict
equipment maintenance requirements and allow fast response to failures. In health-
care settings, IoT can be used to improve patient safety through better monitoring of
patient conditions.
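To make the data-analytics side of this concrete, the short sketch below streams simulated vibration readings from a piece of equipment and raises a maintenance flag when a smoothed value exceeds a threshold; the sensor model, smoothing factor, and threshold are illustrative assumptions rather than values taken from any particular IIoT deployment.

import random

def read_vibration(step):
    """Simulated vibration sensor (mm/s); drifts upwards as wear accumulates."""
    return 2.0 + 0.01 * step + random.gauss(0.0, 0.3)

def monitor(num_steps=500, alpha=0.1, threshold=5.0):
    """Exponentially weighted moving average with a simple maintenance alarm."""
    ewma = None
    for step in range(num_steps):
        value = read_vibration(step)
        ewma = value if ewma is None else alpha * value + (1 - alpha) * ewma
        if ewma > threshold:
            print(f"step {step}: smoothed vibration {ewma:.2f} mm/s - schedule maintenance")
            return step
    print("no maintenance required within the monitored window")
    return None

if __name__ == "__main__":
    monitor()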
There are many examples of IoT applications; some have been developed slowly,
while others are quickly being implemented as enabling technologies become
available. Examples include:
• Smart Cities—Smart highways can be used to improve traffic flow and reduce
accidents while monitoring of available parking spaces can be used to assist
drivers.
• Smart Agriculture—Monitoring weather along with soil moisture and quality can
ensure planting and harvesting are done at the correct time.
• Smart Homes—using IoT devices in the home can improve efficiency in the use of
utilities while also allowing better safety through detection of break-ins.
Internet of Things systems generally comprise a wireless sensor network
(WSN). WSNs have key objectives such as sensing properties of their environ-
ment, sampling signals to allow digital processing, and, in some cases, extracting
useful information from the collected data. Typically, WSNs will
use low-cost, low-power communication methods such as Wi-Fi, Bluetooth,
or near field communication (NFC); drawbacks of these methods
include interference and loss of data. This can be particularly problematic in a nuclear
environment where good reliability is required [58].
The convergence of IoT and robotics has resulted in the concept of an internet of
robotic things (IoRT), which can be seen in control approaches such as cloud
and ubiquitous robotics [59]. The term initially referred to the fusion of
sensor data and manipulation of objects, resulting in a cyber-physical approach to
robotics that shares characteristics with the CPS concept. Elementary capabilities
of an IoRT system include perception, motion, and manipulation, along with higher
level processing including decisional autonomy, cognitive ability, and interactive
capacity.
Cloud Computing. Cloud computing allows quick access to IT resources, providing
flexibility to match requirements. Resources are available on demand, over the
internet, and are used across a range of industries. Benefits of cloud computing
include reduced upfront infrastructure costs, lower running costs associated with
IT infrastructure, and the ability to quickly adapt capacity to requirements, reducing
the need for capacity planning. There are several deployment methods
for cloud computing which each offer benefits in terms of flexibility and management.
Some examples are shown in Fig. 3.
Digital twins can be used to give insight into the use of utilities and how it may be
possible to save energy. Within manufacturing, the real-time status of machine
performance can be obtained, helping to predict maintenance issues and increase
productivity. Within
medicine and healthcare, digital twins can be used to provide real time diagnostics
of the human body and may even allow simulation of the effects of certain drugs, or
surgical procedures [69].
There are some key enablers in related technology that have allowed the develop-
ment of more comprehensive digital twins. While digital modelling has been around
for decades, advances now allow real world scenarios to be replicated and testing
carried out without any risk to the physical system. Digital modelling, made possible
by advances in processor power, can likewise be used to make predictions about the
condition of a system.
As shown in Fig. 4, digital twins can have different applications depending on the
computing layer in which they operate. In addition, it is possible to have multiple
digital twins running simultaneously each providing a different application. This
could include deployment within the system of interest itself, allowing rapid
response to conditions that fall outside of nominal operating conditions. Simultane-
ously, a digital twin could be applied to work with historical data, possibly utilising
data from concurrent systems and allowing predictions that can influence mainte-
nance strategies and test possible scenarios. A key consideration in the development
of a digital twin is the modelling method used; these can be categorised as data-
driven or physics-based, although hybrid approaches combining the two techniques
also exist.
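As a minimal, hypothetical illustration of combining the two, the sketch below uses a simple physics-based model of a first-order thermal process and learns a data-driven correction for the residual between the model and (simulated) measurements; the model form, parameters, and data are assumptions made purely for illustration.

import numpy as np

def physics_step(T, u, dt=1.0, tau=50.0, T_amb=20.0):
    """Simplified physics-based model of a heated component's temperature."""
    return T + dt * (-(T - T_amb) / tau + u)

def plant_step(T, u, dt=1.0, tau=40.0, T_amb=20.0):
    """Stand-in for the real asset, with dynamics the physics model does not capture."""
    return T + dt * (-(T - T_amb) / tau + u)

# Collect operational data from the (simulated) physical asset
rng = np.random.default_rng(0)
T = 20.0
X, y = [], []
for k in range(200):
    u = 0.5 + 0.2 * np.sin(0.05 * k)                    # heating input
    T_pred = physics_step(T, u)                          # physics-based one-step prediction
    T_next = plant_step(T, u) + rng.normal(0.0, 0.05)    # noisy measurement of the next state
    X.append([T, u, 1.0])
    y.append(T_next - T_pred)                            # residual the data-driven part must explain
    T = T_next

# Data-driven part: least-squares fit of the residual as a function of state and input
theta, *_ = np.linalg.lstsq(np.array(X), np.array(y), rcond=None)

def hybrid_step(T, u):
    """Hybrid digital-twin update: physics model plus learned residual correction."""
    return physics_step(T, u) + np.array([T, u, 1.0]) @ theta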
Multi-agent Systems. The concept of multi-agent system (MAS) developed from
the field of distributed artificial intelligence (DAI) first seen in the 1970s. DAI can
be defined as “the study, construction, and application of multiagent systems, that
is, systems in which several interacting, intelligent agents pursue some set of goals
or perform some set of tasks” [71]. An agent can be either cyber based such as a
software program, sometimes referred to as a bot, or a physical robot that can interact
directly with its environment.
Research in DAI has developed as a necessity from the ever-increasing distributed
computing in modern systems. A major aspect of DAI is that agents are intelligent,
and therefore have some degree of flexibility while being able to optimise their
operations in relation to a given performance indicator. Typically, an agent will
have a limited set of possible actions which is known as its effectoric capability.
This capability may vary dependent on the current state of the environment. The
task environment for an agent can be characterised according to a set of properties.
According to [72], these can be defined as follows (a short code sketch illustrating these properties is given after the list):
• Observability—whether the complete state of the environment is available.
• Number of agents—how many agents are operating in the environment; this also
requires consideration of how an agent will act, whether it is competitive, cooperative,
or can be viewed as a simple entity.
• Causality—the environment may be deterministic allowing future states to be
predicted with certainty, or alternatively stochastic in which there will be a level
of uncertainty in actions.
• Continuity—whether previous decisions affect future decisions.
• Changeability—an environment can be static or dynamic and requiring continual
updates.
• State—can either be discrete or continuous.
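A minimal sketch of how these properties might be captured in code is given below; the enumeration values and the example environment are illustrative only and are not taken from [72].

from dataclasses import dataclass
from enum import Enum

class Observability(Enum):
    FULLY_OBSERVABLE = "fully observable"
    PARTIALLY_OBSERVABLE = "partially observable"

class Causality(Enum):
    DETERMINISTIC = "deterministic"
    STOCHASTIC = "stochastic"

@dataclass
class TaskEnvironment:
    """Characterisation of an agent's task environment (after the properties listed above)."""
    observability: Observability
    num_agents: int
    causality: Causality
    sequential: bool       # continuity: do previous decisions affect future ones?
    dynamic: bool          # changeability: does the environment change while the agent deliberates?
    continuous_state: bool

# Example: a mobile robot surveying an unknown, changing nuclear facility
decommissioning_env = TaskEnvironment(
    observability=Observability.PARTIALLY_OBSERVABLE,
    num_agents=3,
    causality=Causality.STOCHASTIC,
    sequential=True,
    dynamic=True,
    continuous_state=True,
)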
Agents can have software architectures not dissimilar from those of robots. Reflex
agents respond directly to stimuli, while model-based agents use an internal state to
maintain a belief of the environment. Goal based agents may involve some aspects
of search and planning in order to achieve a specific goal, utility agents have some
sense of optimising, and learning agents can improve their performance based on a
given performance indicator.
MASs can assist with many tasks required of the robots in a previously unknown
environment such as planning, scheduling, and autonomous operation. The perfor-
mance of a multi-agent system can be measured in terms of rationality, i.e., choosing
the correct action. The four main approaches to the implementation of a MAS are
shown in Table 3.
In a tele-operated system, the operator monitors and controls the robot during
operation using visual and haptic feedback through the workstation.
However, local sensors equipping the tools at the remote environment may provide
better quality or complementary sensory information. Additionally, the operator’s
performance may decrease with fatigue, and in general robotic control should be
adapted to the operator’s specific sensorimotor abilities. Therefore, empowering
robots with some autonomy would effectively regulate flexible interaction behaviour.
In this regard, how to understand and predict human motion intent is the funda-
mental problem for the robot side. Three human motion intents are typically consid-
ered in physical human–robot interaction, namely motion intent, action intent and
task-level intent [75], which covers human intent in the short to long time and even
full task horizon. A more intuitive approach is developing physical or semi-physical
connection between robot and human. Such method can be observed in human–
human interaction literature. In terms of the performances of physically interacting
subjects in a tracking task, subjects can improve their performance by interacting with
(even a worse) partner [76]. Similar performances obtained with a robotic partner
demonstrated that this property results from the sensory exchange between the two
agents via the haptic channel [77]. This haptic communication is currently exploited
to develop sensory augmentation in a human–robot system [78].
When both human and robot perform partner intent estimation and interaction
control design, it is essential to investigate the system performance under this process
of bilateral adaptation. Game theory is a suitable mathematical tool to address this
problem [79]. For example, for a novice or a fatigued operator, the robot controller
tends to compensate for human motion to mitigate any possible adverse effects due
to incorrect human manipulation. On the other hand, the human may still wish to operate
along his/her own target trajectory. With the estimated intent and gain from the robot
side, the human can counteract the robot’s effects by further increasing his/her control input
(gain). Thus, there is haptic communication between human and robot regarding
their respective behaviour. The strategies of the two controllers can converge to a
Nash equilibrium as defined in non-cooperative game theory.
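As an illustrative formulation (not necessarily the specific model used in [75] or [79]), the interaction can be written as a two-player differential game in which the human and the robot each minimise their own tracking cost, with x the shared system state and u_h, u_r the respective control inputs:
\[
J_i(u_h, u_r) = \int_0^{T} \Big[ (x - x_i^{*})^{\top} Q_i\, (x - x_i^{*}) + u_i^{\top} R_i\, u_i \Big]\, \mathrm{d}t, \qquad i \in \{h, r\}.
\]
A pair of strategies \((u_h^{*}, u_r^{*})\) is a Nash equilibrium if neither agent can reduce its own cost by unilaterally changing its strategy:
\[
J_h(u_h^{*}, u_r^{*}) \le J_h(u_h, u_r^{*}) \quad \text{and} \quad J_r(u_h^{*}, u_r^{*}) \le J_r(u_h^{*}, u_r) \quad \text{for all admissible } u_h,\, u_r .
\]
In this reading, a larger weight \(Q_i\) corresponds to a more assertive partner, which captures the negotiation of control authority described above.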
The overall architecture of the cyber physical system proposed for the nuclear decom-
missioning applications is illustrated in Fig. 5. This architecture consists of two layers
namely the ‘Cyber Environment’ and the ‘Extreme Physical Environment’. Each robot,
based on local information from its own sensors, information from other robots and the
cloud, and a mission defined by the ‘Decision Making and Action Invoking Unit’ or
‘Assisted Teleoperator’, can obtain a proper perception of the environment and plan a
suitable path that avoids obstacles and leads to the mission position.
As can be seen from Fig. 5, the proposed system involves various integrated
heterogeneous subsystems with a great level of interoperability and a high level of
automation in both vertical and horizontal dimensions of the system. In this sense,
the proposed architecture can be viewed as a complex system of systems with a
map of the nuclear site.
Fig. 5 A schematic block diagram of the proposed solution. The components studied in this project
and their relevance to the work packages are highlighted in light blue colour
For autonomous operation of individual robots and their
interaction with other robots, various algorithms such as motion planning, trajec-
tory tracking controller, simultaneous localisation and mapping, object detection,
and pose estimation techniques should be designed by respecting the environmental
constraints imposed on each robot. In the following, the most important components
of the proposed cyber physical system are explained in more depth.
The development of more advanced robotics has necessitated better structures for
designing robots based on how subsystems interact or communicate. Developments
have seen the formation of paradigms which provide approaches for solving robotic
design problems. Initial robots such as Shakey [80], were based on the sense-plan-act
(hierarchical) paradigm. As robotics developed, this paradigm became inadequate.
Planning would take too long, and operating in a dynamic environment meant that
plans could be outdated by the time they were executed.
To address the challenges in control architecture, new paradigms were created
such as using reactive planning. A reactive system is one of the simplest forms of
the control architecture depicted in Fig. 6. While not effective for general purpose
applications, this paradigm is particularly useful in applications where fast execution
is required. This structure also exhibits similarities with some biological systems.
Another alternative approach that has been widely adopted is the subsumption archi-
tecture in Fig. 7, which is built from layers of interacting behaviours. This utilised an
arbitration method that would allow higher-level behaviours to override lower-level
ones [81]. While this method proved popular, it was unable to deal with longer term
planning.
Fig. 7 Example of
subsumption architecture
[81]
The early generations of the multi-robot system proposed in Fig. 5 were tele-operated
ensuring that workers need not enter hazardous environments. Such systems are
complex to operate and require intensive human supervision. The next generation will
be autonomous multi-robot systems with the hierarchical architecture depicted
in Fig. 8.
A robot can be defined as an autonomous system if it exists in the physical world,
senses its environment, acts on what it senses and has a goal or objective. They
consist of a mixture of hardware and software components which are designed
to work together. Robot designs and suitable components will vary depending on
the intended application along with constraints regarding size, power, and budget
[84]. Autonomous robots require electronics for different purposes; often a micro
controller is used for motors while a separate and more powerful on-board computer
is used for processing sensor data. Since electronic components are vulnerable
to radiation, the current practice is to protect them using radiation-hardened
electronics. This may increase the total ionising dose (TID) the nuclear robot can tolerate;
however, the resulting operating time may still not be sufficient to complete a specific mission.
Also, deploying numerous single robots will not necessarily improve the progress of
the intended mission.
Power efficiency is a key consideration for designing embedded systems in robotic
applications. This may constrain practical implementation of advanced processing
algorithms such as deep learning techniques, which may alternatively be executed
using cloud-based computing.
A multi-robot system also requires hardware for external communication. The
simplest method of communication is using a wired connection. This is straightfor-
ward if the robot is static. However, for mobile robots a tether can become tangled or
caught. Alternatively, wireless communication methods are often used, for example
using Wi-Fi, allowing for greater system mobility and reconfiguration along with
quicker setup times due to the reduction in cabling required. Within a radioactive
environment, wireless communication, and in particular wireless sensor networks,
can be challenging to implement. Some of the challenges include a lack of acces-
sible power sources, radiation protection of communication system components,
and reinforced walls in nuclear plants that result in significant signal attenuation,
along with the need to ensure all communications are secure and reliable. A new
wireless communication design architecture using nodes with a base station has
recently been tested at the Sellafield site, UK, and was shown to be operationally
effective within reinforced concrete structures found in nuclear plants while the low
complexity sensor nodes used allow for greater radiation tolerance [85]. Low-power
but shorter-range communication methods such as Bluetooth and ZigBee can also be
considered for deployment of wireless sensor networks in nuclear environments.
Using multiple or modular low-cost robotic units with simple functionality can be
a possible solution to the radiation exposure, or TID, problem. Such a system is inherently
redundant, and the collective behaviour of the multi-robot system is significant.
A robot acts on its environment in two main ways: locomotion and manipulation. This
is achieved by various components depending on the task that is required. Effectors
are components or devices that allow a robot to carry out tasks such as moving and
grasping and consist of or are driven by actuators. Actuators typically produce linear
or rotary motion and may be powered by different means, including electric,
hydraulic, or pneumatic power. While an actuator typically has one degree of freedom, a
robot can be designed with mechanisms comprising joints and links giving multiple
degrees of freedom.
Dexterous operation of robotic manipulators in the restricted environments found
at nuclear sites requires more degrees of freedom than the task itself demands. In
this case, the manipulator is kinematically redundant, and the extra degrees of
freedom are used to satisfy various environmental and robotic constraints such as
obstacle avoidance, joint limit avoidance, and singularity avoidance. Figure 9 illus-
trates the detailed block diagram of the control system designed for the manipulation
and grasping of a single dual arm robot. Similarly, the control system designed for
autonomous operation of a single UAV in an unstructured environment is depicted
in Fig. 10.
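A standard way of exploiting this redundancy, given here only as a generic illustration rather than the specific scheme used in Figs. 9 and 10, is to resolve joint velocities through the pseudoinverse of the manipulator Jacobian and project a secondary objective into its null space:
\[
\dot q = J^{\dagger}(q)\, \dot x_d + \big( I - J^{\dagger}(q)\, J(q) \big)\, \dot q_0 ,
\]
where \(\dot x_d\) is the desired end-effector velocity, \(J^{\dagger}\) is the Moore–Penrose pseudoinverse of the Jacobian \(J(q)\), and \(\dot q_0\) is a secondary joint velocity (for example, the gradient of a joint-limit or obstacle-distance criterion) that is executed without disturbing the primary task.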
Improving control systems leads to better performance, in turn allowing faster
operation. In the development of a new system, better control can also allow the use
of smaller and lighter components for a given application. As many phys-
ical systems are inherently non-linear, there has been increasing demand for control
schemes that can improve control of such systems. A common example of a system
requiring non-linear control is the hydraulic manipulator. Greater availability of
computing power has allowed more complex control schemes to be implemented as
the real time control. Nevertheless, proportional-derivative and proportional-integral-
derivative control are still commonly used controllers for industrial robotics [86].
Developing control systems using non-linear control methods has many benefits
over classical methods [87]. An overview of these benefits includes:
Fig. 9 The schematic of the control system designed for a dual arm manipulator
Fig. 10 The schematic of the control system designed for autonomous operation of a single UAV
• Cost saving, using cheaper components which are not required to have a linear
operating region.
Control Algorithms. Using the inverse dynamics of a system, a linear response can
be obtained by an inverse dynamics control, sometimes also called computed torque.
Often when designing a control scheme, knowledge of the system parameters is not
perfect. One technique to address this is adaptive control. Adaptive control uses
system identification techniques in combination with a model-based control law that
allows model parameters to be updated in response to the error.
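For a rigid manipulator with dynamics \(M(q)\ddot q + C(q,\dot q)\dot q + g(q) = \tau\), one common form of the computed-torque law is
\[
\tau = M(q)\big( \ddot q_d + K_d \dot e + K_p e \big) + C(q, \dot q)\, \dot q + g(q), \qquad e = q_d - q,
\]
so that, with a perfect model, the tracking error obeys the linear dynamics \(\ddot e + K_d \dot e + K_p e = 0\). In an adaptive scheme, the parameters appearing in \(M\), \(C\) and \(g\) are replaced by online estimates that are updated in response to the tracking error.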
As with most control laws, the challenge of a rigorous proof of stability is of central
importance [88]. Unlike traditional continuous-time systems, the remote communi-
cation in cyber-physical nuclear robotic systems can result in time delays, data loss,
and even instability. Since the active nodes in the network are typically not synchro-
nised with each other, the number of delay sources tends to be stochastic. Given the
potential network congestion and packet loss in closed-loop robotic systems, the
delay is typically assumed to follow a known probability distribution [89]. For model
simplification, the time-delay sequence can be hypothesised to be independently and
identically distributed, or a Markov chain can be used to model the network-induced
delays [90, 91]. Therefore, main-
taining system stability in the presence of time delays is critical yet challenging
for systematic stability analysis and control synthesis. A classic control strategy is
built on the basis of passivity theory, which characterises the input–output property
of the system’s dissipation. The energy stored in a passive system does not exceed
the energy imported from the environment. Passivity-based time-delay stabilisa-
tion methods have been intensively studied and have yielded rich results, such as
scattering approach, wave variable, damping injection, and time domain passivity
approach [92]. In contrast to passivity-based methods, predictive control strategies
avoid passive analysis by compensating for the uncertainty of communication delays
in the system and provide unique advantages especially in handling constraints and
uncertain delays [93]. Another alternative approach is machine learning control
which is a subset of optimal control that allows improvements in performance as
the system being controlled repeats tasks, not necessarily involving the use of a para-
metric model. This allows compensation of difficult to model phenomena such as
friction and system nonlinearities. This type of control can also be used to validate
dynamic models of a system by studying how the torque error function varies over the
system’s operating envelope. Research detailed in [94] shows how a remote centre of
movement (RCM) strategy can be used to avoid collisions while working with small
openings. This is particularly relevant to the nuclear sector, where ports are used to access
highly radioactive environments.
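To illustrate the Markov-chain delay model mentioned earlier in this section, the sketch below samples a network-induced delay sequence from a simple two-state chain ('good' and 'congested' network conditions); the states, transition probabilities, and delay values are illustrative assumptions only.

import random

# Two network states with different typical round-trip delays (in ms)
DELAY = {"good": 10.0, "congested": 80.0}
# Transition probabilities P[current state] -> probability of moving to the "good" state
P_GOOD = {"good": 0.95, "congested": 0.30}

def sample_delays(n_steps=20, state="good"):
    """Generate a network-induced delay sequence from a two-state Markov chain."""
    delays = []
    for _ in range(n_steps):
        delays.append(DELAY[state] + random.uniform(-2.0, 2.0))  # small jitter around the nominal value
        state = "good" if random.random() < P_GOOD[state] else "congested"
    return delays

print([round(d, 1) for d in sample_delays()])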
The physical and computational components are intertwined through communi-
cation networks in cyber-physical systems (CPSs). When the control design for the
CPSs is considered, the CPSs can also be regarded as one type of nonlinear networked
control systems (NCSs), in which the plant is controlled through the communication
networks [95–98]. For the NCSs, one main concern is security, since the communication
network may suffer from different types of malicious attacks.
Path planning for visual servoing must account for the constraints and uncertainties
within a system; such approaches can be categorised into four groups: image space,
optimisation-based, potential field-based, and global [111].
Motion planning is the process of determining the sequence of actions and motions
required to achieve a goal such as moving from point A to B, in the case of a
manipulator this will be moving or rearranging objects in an environment. At present,
teleoperation is still widely used in the nuclear industry; it can be slow and tedious
for operators due to poor situational awareness and difficulty in using controllers.
Improved motion planning capability has the possibility of speeding up tasks such
as packing waste into safe storage containers [112].
The goal of a planner is to find a solution to a planning problem while satisfying
any constraints, such as the kinematic and dynamic constraints of the links, along with
constraints that arise from the environment such as obstacles. This must be done in
the presence of uncertainties arising from modelling, actuation, and sensing. Motion
planning has traditionally been split into macro planning for large scale movements,
and fine planning for high precision. Typically, as the level of constraints increases
such as during fine motor movements, feedback must be obtained at a higher rate
and actions are more computationally expensive [111].
One of the most basic forms of motion planning uses artificial potential fields, in which
attractive and repulsive potentials guide the robot through free space towards the goal.
This can be treated as an optimisation problem, using gradient descent to find a (possibly
local) minimum of the potential. Other methods include graph-based path planning, which
evaluates different path trees using a graph representation to get from the start to the goal
state. Examples of these are A*, Dijkstra, Breadth First Search (BFS), and Depth First
Search (DFS). Alternatively, sampling-based path planning can be used, which randomly
adds points to a tree until a solution is found. Examples include RRT and PRM [113].
Graphs are often used in motion planners, consisting of nodes which typically
represent states, and edges which represent the ability to move between two nodes.
Graphs can be directed or undirected, depending on whether edges are bidirectional.
Weightings can be given to edges as a cost associated with traversing it. A tree refers
to a graph with one root node, several leaf nodes, no cycles and at most one parent
node per node. Graphs can also be represented as matrices. Once a graph is built,
it can then be searched. A* is a popular best-first search algorithm which finds the
minimum-cost path for a graph. It is a popular and efficient search technique which
has been applied to general manipulators and those with revolute joints. In order to
allow a search such as A*, the configuration space must be discretised, which is most
easily done with a grid. Using a grid requires consideration of the appropriate costs
as well as how many directions a path planner can travel in. Multi-resolution grids
can be used which repeatedly subdivide cells that are in contact with an obstacle, and
therefore reduce the computational complexity that is one of the main drawbacks of
using grid methods [114].
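The sketch below shows a minimal A* search on a small occupancy grid with 4-connected motion and a Manhattan-distance heuristic; the grid, uniform edge costs, and connectivity are illustrative choices rather than a prescription for nuclear applications.

import heapq

def astar(grid, start, goal):
    """Minimal A* on a 2D occupancy grid (0 = free, 1 = obstacle), 4-connected."""
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])   # Manhattan heuristic
    open_set = [(h(start), 0, start)]            # entries are (f = g + h, g, node)
    came_from, g_cost = {}, {start: 0}
    while open_set:
        _, g, current = heapq.heappop(open_set)
        if current == goal:
            path = [current]
            while current in came_from:          # walk back to the start
                current = came_from[current]
                path.append(current)
            return path[::-1]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nb = (current[0] + dr, current[1] + dc)
            if 0 <= nb[0] < rows and 0 <= nb[1] < cols and grid[nb[0]][nb[1]] == 0:
                new_g = g + 1                    # uniform edge cost
                if new_g < g_cost.get(nb, float("inf")):
                    g_cost[nb] = new_g
                    came_from[nb] = current
                    heapq.heappush(open_set, (new_g + h(nb), new_g, nb))
    return None                                  # no path exists

grid = [[0, 0, 0, 0],
        [1, 1, 0, 1],
        [0, 0, 0, 0],
        [0, 1, 1, 0]]
print(astar(grid, (0, 0), (3, 3)))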
Sampling methods sacrifice resolution-optimality but can find satisficing solutions
quickly. These methods are split into two main groups: RRT for single-query and
PRM for multiple-query planning. RRT is a data structure sampling scheme that
can quickly search high dimensional spaces that have both algebraic and differential
constraints. It does this through biasing exploration in the state space to “pull” towards
unexplored areas [115]. PRM uses randomly generated free configurations of the
robot which are then connected and stored as a graph. This learning phase is then
followed by the query phase where a graph search is used to connect two nodes
within the roadmap from the start and goal configurations. Segments of the path are
then concatenated to find a full path for the robot. Difficulties found when querying
can then be used to improve the roadmap [116].
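A minimal 2D RRT sketch is given below to illustrate the idea of pulling exploration towards unvisited regions; the workspace bounds, step size, goal bias, and the absence of obstacles are simplifying assumptions made for illustration.

import math
import random

def rrt(start, goal, step=0.5, max_iter=2000, goal_tol=0.5, bounds=(0.0, 10.0)):
    """Minimal 2D RRT in an obstacle-free workspace; returns a path from start to goal."""
    nodes = [start]
    parent = {start: None}
    for _ in range(max_iter):
        # Sample a random point (occasionally the goal itself, to bias growth)
        sample = goal if random.random() < 0.1 else (
            random.uniform(*bounds), random.uniform(*bounds))
        # Find the nearest existing node and extend towards the sample by one step
        nearest = min(nodes, key=lambda n: math.dist(n, sample))
        d = math.dist(nearest, sample)
        if d == 0.0:
            continue
        new = (nearest[0] + step * (sample[0] - nearest[0]) / d,
               nearest[1] + step * (sample[1] - nearest[1]) / d)
        nodes.append(new)
        parent[new] = nearest
        if math.dist(new, goal) < goal_tol:      # close enough: reconstruct the path
            path, n = [goal], new
            while n is not None:
                path.append(n)
                n = parent[n]
            return path[::-1]
    return None

print(rrt((1.0, 1.0), (9.0, 9.0)))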
Research into robotic path planning is an active area; recent develop-
ments have utilised machine learning and deep neural networks to solve complex
multi-objective planning problems [117]. The increasing number of planning algo-
rithms necessitates greater capability for benchmarking and comparing algorithms
on the basis of performance indicators such as computational efficiency, success
rates, and optimality of the generated paths. Recently, the PathBench framework has
been created that allows comparison of traditional sample- and graph-based algo-
rithms and newer ML-based algorithms such as value iteration networks and long
short-term memory (LSTM) networks [118]. This is achieved by using a simulator
to test each algorithm, a generator and trainer component for the ML models, and an
analyser component to generate statistical data on each trial. Research on the Path-
Bench platform has shown that at present ML algorithms have longer path planning
times in comparison to classical planning approaches [117].
Another key component of planning manipulation tasks is the ability to properly
grasp an object. Research by Levine et al. [119] used a deep convolutional neural
network to predict the chance of a successful grasp, and a continuous servoing mech-
anism to update the motor commands. The CNN was trained with data from over
80,000 grasp attempts. These were obtained with the same robot model; however,
each robot is not identical and so the differences provided a diverse dataset for the
neural network to learn from. The proposed grasping methods can find non-obvious
grasping strategies and have the possibility to be extended to a wider range of grasping
strategies as the dataset increases.
Sensors are a key component of robotics, needed for measuring physical properties
such as position and velocity. They can be classified as either proprioceptive or
exteroceptive depending on whether they take measurements of the robot itself or
of the environment. They can also be categorised according to energy output, being
either active or passive. Commonly used sensors include LiDAR (Light Detection
and Ranging), SONAR (Sound Navigation and Ranging), RADAR (Radio Detection
and Ranging), RGB Camera, and RGB-D Camera.
Processing sensor data can be a difficult task in robotics as measurements are
often noisy, can be intermittent, and must sometimes be taken indirectly. Many
sensors provide a large amount of data which requires processing to extract useful
components such as for obstacle detection and object recognition. One of the most
commonly used forms of sensing is by visual data from a camera. Light has many
properties that can be measured such as intensity and wavelength and can interact by
different means such as absorption and reflection. Using an image, the size, shape,
and/or position of an object can be determined. Digital cameras use a light sensor to
convert a projection of the 3D world into a 2D image, a technique known as perspec-
tive projection. As images are collected using a lens, it is important to account for
how the image is formed as it passes through the lens such as using the thin lens
equation to relate the distances between the object and image [110].
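In the pinhole and thin-lens setting these relationships take the familiar forms
\[
\frac{1}{f} = \frac{1}{z_o} + \frac{1}{z_i}, \qquad u = f\,\frac{X}{Z}, \quad v = f\,\frac{Y}{Z},
\]
where \(f\) is the focal length, \(z_o\) and \(z_i\) are the object and image distances, \((X, Y, Z)\) is a point expressed in the camera frame, and \((u, v)\) is its perspective projection onto the image plane.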
Recently, deep neural networks have been used for a range of computer vision
tasks such as object recognition and classification, depth map inference, and pose
and motion estimation. This can be extended to visual servoing, using direct visual
servoing (DVS) which does not require feature extraction or tracking. Using a deep
neural network the convergence domain can be increased to create a CNN-based VS
as in [120].
Object Recognition and Detection. Object recognition has many applications such
as position measurement, inspection, sorting, counting, and detection. Requirements
for an object recognition task vary depending on the application and may include
evaluation time, accuracy, recognition reliability, and invariance. Invariance can be
with respect to illumination, scale, rotation, background clutter, partial occlusion, and
viewpoint change [121]. In unstructured environments such as with nuclear decom-
missioning, it is likely that all these aspects will in some way affect the object recog-
nition algorithm. A neural network can be used for image classification, outputting a
probability for a given object. Images can be converted to a standard array of pixels
which are then used as input to the neural network [74].
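A minimal sketch of this idea is shown below: an image is flattened into a pixel vector and passed through a single linear layer followed by a softmax to obtain class probabilities. The image size, class labels, and randomly initialised weights are illustrative placeholders; a practical system would use a trained deep network rather than this toy classifier.

import numpy as np

def softmax(z):
    """Convert raw scores into a probability distribution over classes."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

rng = np.random.default_rng(42)
num_classes = 4                      # e.g. PPE, tool, pipe, other (placeholder labels)
image = rng.random((32, 32, 3))      # stand-in for a 32x32 RGB image

x = image.reshape(-1)                # flatten pixels into a feature vector
W = rng.normal(0.0, 0.01, (num_classes, x.size))   # untrained weights, illustration only
b = np.zeros(num_classes)

probabilities = softmax(W @ x + b)
print("class probabilities:", np.round(probabilities, 3))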
Deformable part models (DPM) for discriminative training of classifiers have been
shown to be efficient and accurate on difficult datasets in [122]. This can also be
implemented using C++ with OpenCV which combines the DPM with a cascade
algorithm to speed up detection. Research in [123] looked to exploit visual context,
to aid object recognition, for example by identifying the room an object is in, and then
using that information to help narrow down possible objects. This allows recognition
with less local stimulus.
Advances in deep learning algorithms have opened up new avenues of research in
object detection. Some commonly used detectors include R-CNN, YOLO and SSD,
which generally operate by localising in terms of bounding boxes [124]. Research in
[125] looks to address the issues within the nuclear industry of detecting and cate-
gorising waste objects using an RGB-D camera. Common objects in decommissioning
include PPE, tools, and pipes. These need to be detected, categorised, sorted, and
segregated according to their radioactivity level. Presently DCNN methods are a good
solution for object detection and recognition; however, they rely on large amounts
of training data which may be unavailable for new applications. Research at the
University of Birmingham [125] looked to address this by using weakly-supervised
deep learning for detection and recognition of common nuclear waste objects. Using
minimally annotated data for initial training the network was able to handle sparse
examples and was able to be implemented in a real time recognition pipeline for
detecting and categorising unknown waste objects. Researchers at the University of
Birmingham have also been able to expand 3D geometric reconstruction to allow
semantic mapping and provide understanding of features within scene contexts. This
was achieved using a Pixel-Voxel network to process RGB image and point cloud
data [126].
Pose Estimation. 6D object detection is the combination of object detection with
6D pose estimation. Within manufacturing there is demand for algorithms that can
perform 6D object detection for tasks such as grasping and quality control. Task
situations in manufacturing have the benefit of known CAD models, good cameras,
and controlled environments in terms of lighting. In the nuclear industry this is
often not the case, and algorithms are required that can cope with the lack of these
factors along with other difficulties such as occlusions, lack of textures, unknown
instances and colours, and difficult surface properties. A common approach to training
6D object detection algorithms is via model-based training. This uses CAD models to
generate augmented images; for geometric manipulations producing training images
is straightforward, while generating images with variations in surface properties or
is straightforward, while generating images with variations in surface properties or
projections can be more difficult. It is still however more cost effective and time
efficient to use model based training when possible rather than creating real images,
which can be particularly problematic in varying environmental conditions [127].
The creation of digital twins has been identified by the UK’s National Nuclear Labo-
ratory (NNL) as an area that could be adapted for use in a nuclear environment. Digital
twins have the possibility to be utilised throughout the lifespan of a nuclear facility.
However, while the technology may be available, implementing digital twins could
be more difficult due to stringent requirements for safety and security demanded by
regulatory bodies. Despite this the nuclear industry is in a good position to make better
use of digital twin technology, having already built a strong system of documenta-
tion based on destructive and non-destructive testing and analysis of components
and infrastructure [128]. Some progress in the development of digital twin solutions for
nuclear has been made by the consortium for advanced simulation of INRs [129]. In
[130], digital twins are identified as a possible technology to help the UK develop
the next generation of nuclear plants. This can be achieved through benefits including
increased efficiency and improved safety analysis. This would be building on the
Integrated Nuclear Digital Environment (INDE) as proposed in [128].
Digital twins offer the opportunity to visualise and simulate work tasks in a virtual
environment using up to date data to plan work and improve efficiency and safety.
Task simulation may not require a digital twin, and a digital twin may not
necessarily be used for task simulation. The authors in [131] used Choreonoid to
simulate tasks to be performed by remotely controlled robots. To achieve this, they
developed and used several plug-ins for the software to emulate behaviour such as
underwater, aerial, camera-view modifications and disturbance, gamma camera, and
communication failure effects. In another study, a digital twin in a virtual
environment is used to analyse scenarios involving a remote teleoperation system.
A benefit of using the digital twin is the opportunity to test configuration changes,
including in the development of a convolutional neural network [132].
In [133] the authors developed a digital environment within Gazebo to allow the
simulation of ionising radiation to study the effects of interactions with radioactive
sources and how radiation detectors can be better developed. While this allowed
some research into the optimisation of robotic activities in radioactive environments,
due to the heavy computational burden of modelling complex radiation sources,
some simplifications had to be made such as point sources and assumption of a
constant radioactivity. Simulation is also often used in the development of control
systems, to aid with system design and operator training. A real-time simulator was
developed in [134] that was verified using open loop control experiments, and then
was subsequently applied to investigate the performance of trajectory tracking and
pipe-cutting tasks.
6 Conclusions
In this chapter, the concept of cyber-physical systems has been reviewed, along with
some of the progress made in utilising such systems in manufacturing as part of the
Industry 4.0 concept. Finally, an overview of the enabling tech-
nologies along with a concept framework for a nuclear decommissioning CPS is
developed with attention to how developments in Industry 4.0 can be transferred for
application in nuclear decommissioning activities.
References
1. NAO. (2022). The decommissioning of the AGR nuclear power stations. https://www.nao.
org.uk/report/the-decommissioning-of-the-agr-nuclear-power-stations/.
2. Nuclear Decommissioning Authority. (2022). Nuclear Decommissioning Authority Annual
Report and Account 2021/22. http://www.nda.gov.uk/documents/upload/Annual-Report-and-
Accounts-2010-2011.pdf.
3. NEA. (2014). R&D and Innovation Needs for Decommissioning Nuclear Facili-
ties. https://www.oecd-nea.org/jcms/pl_14898/r-d-and-innovation-needs-for-decommission
ing-nuclear-facilities.
4. Industry Radiological Protection Co-ordination Group. (2012). The application of ALARP
to radiological risk, (IRPCG) Group.
5. Marturi, N., et al. (2017). Towards advanced robotic manipulations for nuclear decommis-
sioning. In Robots operating in hazardous environments. https://doi.org/10.5772/intechopen.
69739.
6. Watson, S., Lennox, B., & Jones, J. (2020). Robots and autonomous systems for nuclear
environments.
7. Sellafield Ltd. (2021). Future research and development requirements 2021 (pp. 1–32).
8. NDA. (2019). Integrated waste management radioactive waste strategy. https://www.gov.uk/
government/consultations/nda-radioactive-waste-management-strategy.
9. Bogue, R. (2015). Robots in the nuclear industry: a review of technologies and applications.
10. Montazeri, A., & Ekotuyo, J. (2016). Development of dynamic model of a 7DOF hydraulically
actuated tele-operated robot for decommissioning applications. In Proceedings of American
Control Conference (Vol. 2016-July, pp. 1209–1214). https://doi.org/10.1109/ACC.2016.752
5082. (Jul 2016).
11. Montazeri, A., West, C., Monk, S. D., & Taylor, C. J. (2017). Dynamic modelling and param-
eter estimation of a hydraulic robot manipulator using a multi-objective genetic algorithm.
International Journal of Control, 90(4), 661–683. https://doi.org/10.1080/00207179.2016.
1230231.
12. West, C., Montazeri, A., Monk, S. D., & Taylor, C. J. (2016). A genetic algorithm approach
for parameter optimization of a 7DOF robotic manipulator. IFAC-PapersOnLine, 49(12),
1261–1266. https://doi.org/10.1016/j.ifacol.2016.07.688.
13. West, C., Montazeri, A., Monk, S. D., Duda, D. & Taylor, C. J. (2017). A new approach to
improve the parameter estimation accuracy in robotic manipulators using a multi-objective
output error identification technique. In RO-MAN 2017-26th IEEE International Symposium
on Robot and Human Interactive Communication, Dec. 2017 (Vol. 2017-Jan, pp. 1406–1411).
https://doi.org/10.1109/ROMAN.2017.8172488.
14. Burrell, T., Montazeri, A., Monk, S., & Taylor, C. J. J. (2016). Feedback control—based
inverse kinematics solvers for a nuclear decommissioning robot. IFAC-PapersOnLine, 49(21),
177–184. https://doi.org/10.1016/j.ifacol.2016.10.541.
15. Oveisi, A., Anderson, A., Nestorović, T., Montazeri, A. (2018). Optimal input excitation
design for nonparametric uncertainty quantification of multi-input multi-output systems (Vol.
51, no. 15, pp. 114–119). https://doi.org/10.1016/j.ifacol.2018.09.100.
16. Oveisi, A., Nestorović, T., & Montazeri, A. (2018). Frequency domain subspace identification
of multivariable dynamical systems for robust control design, vol. 51, no. 15, pp. 990–995.
https://doi.org/10.1016/j.ifacol.2018.09.065.
17. West, C., Monk, S. D., Montazeri, A., & Taylor, C. J. (2018) A vision-based positioning
system with inverse dead-zone control for dual-hydraulic manipulators. In 2018 UKACC
12th International Conference on Control, CONTROL 2018 (pp. 379–384). https://doi.org/
10.1109/CONTROL.2018.8516734. (Oct, 2018).
18. West, C., Wilson, E. D., Clairon, Q., Monk, S., Montazeri, A., & Taylor, C. J. (2018).
State-dependent parameter model identification for inverse dead-zone control of a hydraulic
manipulator∗ . IFAC-PapersOnLine, 51(15), 126–131. https://doi.org/10.1016/j.ifacol.2018.
09.102.
19. Burrell, T., West, C., Monk, S. D., Montezeri, A., & Taylor, C. J. (2018). Towards a cooperative
robotic system for autonomous pipe cutting in nuclear decommissioning. In 2018 UKACC
12th International Conference on Control, CONTROL 2018 (pp. 283–288). https://doi.org/
10.1109/CONTROL.2018.8516841. (Oct 2018).
20. Nemati, H., & Montazeri, A. (2018). Analysis and design of a multi-channel time-varying
sliding mode controller and its application in unmanned aerial vehicles. IFAC-PapersOnLine,
51(22), 244–249. https://doi.org/10.1016/j.ifacol.2018.11.549.
21. Nemati, H., & Montazeri, A. (2018). Design and development of a novel controller for robust
attitude stabilisation of an unmanned air vehicle for nuclear environments. In 2018 UKACC
12th International Conference on Control (CONTROL) (pp. 373–378). https://doi.org/10.
1109/CONTROL.2018.8516729.
22. Nemati, H., Montazeri, A. (2019). Output feedback sliding mode control of quadcopter using
IMU navigation. In Proceedings-2019 IEEE International Conference on Mechatronics, ICM
2019 (pp. 634–639). https://doi.org/10.1109/ICMECH.2019.8722899. (May 2019).
23. Nokhodberiz, N. S., Nemati, H., & Montazeri, A. (2019). Event-triggered based state esti-
mation for autonomous operation of an aerial robotic vehicle. IFAC-PapersOnLine, 52(13),
2348–2353. https://doi.org/10.1016/j.ifacol.2019.11.557.
24. Lamb, F. (2013). Industrial automation hands-on.
25. Weyer, S., Schmitt, M., Ohmer, M., & Gorecky, D. (2015). Towards industry 4.0-
Standardization as the crucial challenge for highly modular, multi-vendor production systems.
IFAC-PapersOnLine, 28(3), 579–584. https://doi.org/10.1016/j.ifacol.2015.06.143.
26. IAEA. (2004). The nuclear power industry’s ageing workforce : transfer of knowledge to the
next generation (p. 101). (no. June).
27. Department for Business Energy and Industrial Strategy UK. 2022 Civil Nuclear Cyber
Security Strategy. https://assets.publishing.service.gov.uk/government/uploads/system/
uploads/attachment_data/file/1075002/civil-nuclear-cyber-security-strategy-2022.pdf. (no.
May, 2022).
28. Emptage, M., Loudon, D., Mcleod, R., Milburn, H., & Row, N. (2016). Characterisation:
Challenges and opportunities–A UK perspective (pp. 1–10).
29. Euratom (2022) Cyber physicaL Equipment for unmAnned Nuclear DEcommissioning
Measurements. Horizon 2020. Retrieved September 08, 2022, from https://cordis.europa.
eu/project/id/945335.
30. OECD/NEA. (1999). Decontamination techniques used in decommissioning activities. In
Nuclear Energy Agency (p. 51).
31. Aitken, J. M., et al. (2018). Autonomous nuclear waste management. IEEE Intelligent Systems,
33(6), 47–55. https://doi.org/10.1109/MIS.2018.111144814.
32. Euratom (2020) PREDIS. Horizon 2020. https://doi.org/10.3030/945098.
33. Smith, R., Cucco, E., & Fairbairn, C. (2020). Robotic development for the nuclear envi-
ronment: Challenges and strategy. Robotics, 9(4), 1–16. https://doi.org/10.3390/robotics9
040094.
34. Vitanov, I., et al. (2021). A suite of robotic solutions for nuclear waste decommissioning.
Robotics, 10(4), 1–20. https://doi.org/10.3390/robotics10040112.
35. Monk, S. D., Grievson, A., Bandala, M., West, C., Montazeri, A., & Taylor, C. J. (2021).
Implementation and evaluation of a semi-autonomous hydraulic dual manipulator for cutting
pipework in radiologically active environments. Robotics, 10(2). https://doi.org/10.3390/rob
otics10020062.
36. Adjigble, M., Marturi, N., Ortenzi, V., Rajasekaran, V., Corke, P., & Stolkin, R. (2018).
Model-free and learning-free grasping by Local Contact Moment matching. In IEEE Inter-
national Conference on Intelligent Robots and Systems (pp. 2933–2940). https://doi.org/10.
1109/IROS.2018.8594226.
37. Tokatli, O., et al. (2021). Robot-assisted glovebox teleoperation for nuclear industry. Robotics,
10(3). https://doi.org/10.3390/robotics10030085.
38. Jang, I., Carrasco, J., Weightman, A., & Lennox, B. (2019). Intuitive bare-hand teleoperation
of a robotic manipulator using virtual reality and leap motion. In TAROS 2019 (pp. 283–294).
London: Springer.
39. Sayed, M. E., Roberts, J. O., & Donaldson, K. (2022). Modular robots for enabling operations
in unstructured extreme environments. Advanced Intelligent Systems. https://doi.org/10.1002/
aisy.202000227.
40. Cerba, Š, Lüley, J., Vrban, B., Osuský, F., & Nečas, V. (2020). Unmanned radiation-monitoring
system. IEEE Transactions on Nuclear Science, 67(4), 636–643. https://doi.org/10.1109/TNS.
2020.2970782.
41. Tsitsimpelis, I., Taylor, C. J., Lennox, B., & Joyce, M. J. (2019). A review of ground-based
robotic systems for the characterization of nuclear environments. Progress in Nuclear Energy,
111, 109–124. https://doi.org/10.1016/j.pnucene.2018.10.023. (no. Oct, 2018).
42. Groves, K., Hernandez, E., West, A., Wright, T., & Lennox, B. (2021). Robotic exploration of
an unknown nuclear environment using radiation informed autonomous navigation. Robotics,
10(2), 1–15. https://doi.org/10.3390/robotics10020078.
43. Groves, K., West, A., Gornicki, K., Watson, S., Carrasco, J., & Lennox, B. (2019). MallARD:
An autonomous aquatic surface vehicle for inspection and monitoring of wet nuclear storage
facilities. Robotics, 8(2). https://doi.org/10.3390/ROBOTICS8020047.
44. Parasuraman, R., Sheridan, T. B., & Wickens, C. D. (2000). A model for types and levels of
human interaction with automation. IEEE Transactions on Systems, Man, and Cybernetics-
Part A: Systems and Humans, 30(3), 286–297. https://doi.org/10.1109/3468.844354.
45. Gamer, T., Hoernicke, M., Kloepper, B., Bauer, R., & Isaksson, A. J. (2020). The autonomous
industrial plant–future of process engineering, operations and maintenance. Journal of Process
Control, 88, 101–110. https://doi.org/10.1016/j.jprocont.2020.01.012.
46. Luckcuck, M., Fisher, M., Dennis, L., Frost, S., White, A., & Styles, D. (2021). Princi-
ples for the development and assurance of autonomous systems for safe use in hazardous
environments. https://doi.org/10.5281/zenodo.5012322.
47. Blum, C., Winfield, A. F. T., & Hafner, V. V. (2018). Simulation-based internal models for
safer robots. Frontiers in Robotics and AI, 4. https://doi.org/10.3389/frobt.2017.00074. (no.
Jan, 2018).
48. Lee, E. A. (2008). Cyber physical systems: Design challenges. In Proceedings-11th IEEE
Symposium Object/Component/Service-Oriented Real-Time Distributed Computing ISORC
2008, (pp. 363–369). https://doi.org/10.1109/ISORC.2008.25.
49. NIST. (2017). Framework for Cyber-Physical Systems: Volume 1, Overview NIST Special
Publication 1500–201 Framework for Cyber-Physical Systems: Volume 1, Overview. https://
nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.1500-201.pdf.
50. Wang, L., Törngren, M., & Onori, M. (2015). Current status and advancement of cyber-
physical systems in manufacturing. Journal of Manufacturing Systems, 37, 517–527. (no.
Oct, 2020). https://doi.org/10.1016/j.jmsy.2015.04.008.
51. Lee, J., Bagheri, B., & Kao, H. A. (2015). A cyber-physical systems architecture for Industry
4.0-based manufacturing systems. Manufacturing Letters, 3, 18–23. https://doi.org/10.1016/
j.mfglet.2014.12.001.
52. Pivoto, D. G. S., de Almeida, L. F. F., da Rosa Righi, R., Rodrigues, J. J. P. C., Lugli, A. B., &
Alberti, A. M. (2021). Cyber-physical systems architectures for industrial internet of things
71. Weiss, G. (1999). Multiagent Systems: A Modern Approach to Distributed Artificial Intel-
ligence, (Vol. 3, no. 2). http://books.google.com/books?hl=nl&lr=&id=JYcznFCN3xcC&
pgis=1.
72. Russell, S., & Norvig, P. (2010). Artificial intelligence: A modern approach. Prentice Hall.
73. Alpaydın, E. (2010). Introduction to machine learning second edition. MIT Press. https://doi.
org/10.1007/978-1-62703-748-8_7.
74. Goodfellow, I., Bengio, Y., & Courville, A. (2012) Deep learning.
75. Li, Y., et al. (2022) A review on interaction control for contact robots through intent detection.
Progress in Biomedical Engineering, 4(3). https://doi.org/10.1088/2516-1091/ac8193.
76. Ganesh, G., Takagi, A., Osu, R., Yoshioka, T., Kawato, M., & Burdet, E. (2014). Two is better
than one: Physical interactions improve motor performance in humans. Science and Reports,
4(1), 3824. https://doi.org/10.1038/srep03824.
Deep Learning and Robotics, Surgical
Robot Applications
Abstract Surgical robots can perform tasks that are difficult or impossible for humans: they carry out repetitive work, handle hazardous materials, and manipulate objects that are hard to reach or handle. This has saved time and money while also preventing numerous accidents. The use of surgical robots, also known as robot-assisted surgery, allows medical professionals to perform a wide range of complex procedures with greater accuracy, flexibility, and control than traditional methods. Minimally invasive surgery, with which robotic surgery is most often associated, is performed through small incisions, although robotic assistance is also used in some traditional open procedures. This chapter discusses advanced robotic surgical systems and deep learning (DL). Its purpose is to provide an overview of the major issues in artificial intelligence (AI), including how they apply to, and limit, surgical robots. Each surgical system is explained in detail, along with its most recent AI-based improvements. Case studies accompany the discussion of recent advances and of the role of DL, and future surgical robotics applications in ophthalmology are also examined in depth. The chapter closes by summarizing new ideas, comparisons, and updates in surgical robotics and deep learning.
1 Introduction
The design, construction, and use of robots are the subject of robotics, an interdisciplinary field of science and engineering. This chapter aims to give the reader a broad understanding of robotic technology, including the main robot types and how they are used in different ventures [1, 2].
One barrier to robots mimicking humans is a lack of proprioception, an awareness of one's own muscles and body parts, a kind of internal sense that is essential for coordinating movement. Roboticists have been able to give robots a sense of sight through cameras, a sense of smell and taste through chemical sensors, and hearing through microphones, but they have struggled to give robots this internal sense of their own bodies. Progress is now being made using tactile materials and machine learning algorithms. In one case, randomly placed sensors detect contact and pressure and send data to a learning algorithm that interprets the signals. In another example, roboticists are attempting to develop a robotic arm that is as capable as a human arm and can grasp a variety of objects. Until recently, this required either training a robot separately for each task or giving a learning algorithm a huge dataset of experience to learn from. Robert Kwiatkowski and Hod Lipson of Columbia University are working on "task-agnostic self-modeling machines." Like an infant in its first year of life, the robot starts with no knowledge of its own body or of the physics of motion. As it repeats many thousands of movements, it observes the outcomes and builds a model of them. The learning algorithm is then used to help the robot plan future movements in light of its earlier motion; in this way, the robot learns to interpret its own actions. A team of researchers at the USC Viterbi School of Engineering believe they are the first to develop an AI-controlled robotic limb that can recover from tripping without being explicitly programmed to do so, work that shows robots learning by doing. Artificial intelligence underpins modern robotics: AI and machine learning help robots see, walk, talk, smell, and move in increasingly human-like ways [3–13].
In this chapter, we propose comparing the placement performance of the convolutional neural network-based surgical robot with that of other robots and surgical robots, as well as with the industry standard of expert manual assembly. Different convolutional neural network designs can be obtained by changing the number of feature maps and the encoding and decoding layers. Experiments are carried out to determine how performance is affected by the architectural design parameters. The chapter describes each surgical system in detail, as well as its most recent advancements through the use of AI. Future surgical robotics applications in ophthalmology are thoroughly discussed, with case studies provided alongside recent progress and the role of DL. This chapter summarizes the new concepts and comparisons, as well as updates on surgical robotics and deep learning. Figure 1 shows the PRISMA diagram of this chapter.
2 Related Work
Surgical robots have been available for some time, and robotic medicine has made significant progress over the past decade by collecting data and experimenting in unusual situations. Robotic and laparoscopic procedures are known to place less of a burden on patients because they can complete tasks with minimal intrusion [14–19]. A traditional open procedure requires a large surgical site when operating inside an organ, and obtaining a field of view is costly because of the intricate arrangement of the human organs. This motivated laparoscopic surgery, in which an endoscope is inserted to obtain a field of view; the abdominal cavity can then be examined and treated. It was also promoted as a more precise and meticulous method of treatment. Advances in imaging and visualization, contact force sensing, and control have made tissue palpation through the controller possible [20]. Robotic surgery does less damage to the affected area, and the patient's faster recovery reduces the time lost to convalescence [21–23]. However, because this kind of work is done by controlling a robot arm, it is hard for the surgeon to interact with the patient directly.
Completing such a task requires considerable skill and careful attention. On the operating table, the patient is surrounded by numerous machines, while the operating surgeon sits at the robot's console, well away from the surgical site. From the console the surgeon simultaneously controls and monitors the procedure, relying on visual information to understand how the robotic arm is behaving during surgery [24, 25]. This is closely tied to how well robotic surgery works. In addition, surgeons assert that robotic surgery requires greater caution than open surgery because of its heavier reliance on visual information, since the affected area cannot be observed directly. When the surgical site is narrow and the camera view is correspondingly restricted, the surgeon receives even less information [26, 27]; this is a major drawback of surgical robots. The abdomen is insufflated with gas to ensure a clear view, and the surgical instruments are then inserted, together with a camera that shows the surgeon the state of the abdominal cavity. The pinch-type master controller of the surgical robot enables extremely precise operation, but it is demanding and requires a great deal of skill. 3D imaging has recently been developed and integrated into these systems to address surgeons' concerns [26, 27]; combining images from several vantage points can improve the quality of the information provided, as demonstrated in this example. However, it is still difficult to match the visual information available to surgeons who operate directly, and because the operation is carried out away from the patient, the arrangement can also lead to poor judgment in an emergency.
Surgeons can use a variety of simulations to improve their accuracy and become more accustomed to performing operations. Several simulation devices have recently been developed to provide surgeons with a virtual practice area prior to robotic surgery [28]. The Da Vinci Research Kit, developed at Johns Hopkins, is the best-known tool; surgeons can practice in settings built from materials that resemble human tissue. Because minimally invasive surgery relies on what the camera sees in the treated area, manual dexterity remains essential even with this equipment. To improve the success rate of surgery performed with such a limited view, additional sensory feedback is required. A haptic feedback system is still missing even from the most widely used da Vinci robot [29, 30]. If RMIS can provide the surgeon with tactile sensation data in real time during surgery, this issue will be partially resolved. In robotic surgery, the proposed haptic system is expected to speed up decision-making and enhance surgical quality and accuracy [31–37]. Surgeons who need to operate with great dexterity may require haptic systems [31, 36, 37]. When the surgeon has access to the patient's real-time data, they are able to make decisions quickly and precisely. The human body's internal conditions can vary greatly: tissue stiffness may differ from that of the surrounding area if a tumor has not yet been found, and unless the deeply concealed area is touched directly, the issue might not be apparent. Tactile feedback can help with some of these issues. With a single tactile feedback device acting against a variety of body tissues, it should be possible to render the tactile perception of different organs and tissues in real time. Numerous tactile transmission devices have been investigated with these factors in mind. The vibration feedback system is the most widely used, as previously mentioned [38, 39]; the intensity of the vibration conveys the tactile sensation, although it is frequently used simply to issue a warning in response to external stimuli.
A piezoelectric-based vibration feedback system is well known to be usable as a tactile device. Numerous sources report that human organs and tissues are viscoelastic, so to ensure high surgical quality and safety, a tactile device whose properties are comparable or identical to those of human tissue should be used, providing the surgeon with more precise information about viscoelastic behavior. However, because of the time delay introduced by the viscous effect, rendering viscoelastic properties with a vibration feedback system is extremely challenging, which is why such a device cannot simply be attached to the surgical robot console. Piezoelectric actuation is another option [40]: depending on how it is arranged, it can provide a tactile sensation, and it can succeed given a sufficient range of forces [40]. However, conveying the state of body tissue with a simple force is inadequate; a method that can simultaneously express the viscoelastic properties of the human body is more suitable. A pneumatic tactile transmission device has been proposed to incorporate viscoelastic properties [41], but the compressibility of the gas prevents the behavior of incompressible tissue from being reproduced under pneumatic pressure. For robot-assisted minimally invasive surgery (RMIS), numerous tactile devices based on magnetorheological (MR) materials have recently been proposed to address these points [42–49]. The development of a haptic master with MR materials has been the subject of numerous studies [42, 50], and the MR tactile cell device has been proposed as a way to deliver haptic information directly to the surgeon's hand [51–56]. Because an MR-based tactile device can alter its yield stress by varying the intensity of the applied magnetic field, a single sample can represent the characteristics of different human organs and tissues.
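To make the last point concrete, the short sketch below maps a target yield stress for a "virtual tissue" to a coil current using a simple power-law (Bingham-plastic style) field-to-yield-stress model. The coefficients, the coil gain, the function names, and the tissue values are illustrative assumptions, not parameters of any published MR tactile cell.

    import numpy as np

    # Bingham-plastic style model for an MR fluid: yield stress grows with field H.
    # tau_y(H) = ALPHA * H**BETA  (ALPHA, BETA are assumed material constants)
    ALPHA, BETA = 0.06, 1.4          # hypothetical MR-fluid coefficients
    COIL_GAIN = 5e3                  # assumed field [A/m] produced per ampere of coil current

    def yield_stress_kpa(field_H):
        """Approximate MR-fluid yield stress [kPa] for a magnetic field H [A/m]."""
        return ALPHA * field_H**BETA / 1e3

    def current_for_tissue(target_stress_kpa, max_current=2.0):
        """Search for the coil current that makes the cell 'feel' like a target tissue."""
        currents = np.linspace(0.0, max_current, 2001)
        stresses = yield_stress_kpa(COIL_GAIN * currents)
        return currents[np.argmin(np.abs(stresses - target_stress_kpa))]

    # Hypothetical target yield stresses for different virtual tissues [kPa].
    tissues = {"fat": 2.0, "liver": 6.0, "tumour": 20.0}
    for name, stress in tissues.items():
        print(f"{name:>6}: coil current ~ {current_for_tissue(stress):.2f} A")

In practice the mapping from current to perceived stiffness would be identified experimentally for the specific device; the point here is only that one physical sample can be re-targeted to different tissues by changing the field.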
The Objective Structured Assessment of Technical Skills (OSATS) is one of many rating scales that expert raters can use to evaluate surgeons in areas such as efficiency, tissue handling, and operative flow [57]. Such scales have also been adapted for robotic platforms [58, 59], laparoscopic procedures [60], and specific research fields [61, 62]. Despite their widespread use in academic research, these scales are rarely used in clinical settings, because rating requires a professional reviewer, is prone to rater bias, and takes considerable time and effort.
These issues might be addressed with machine learning (ML), the scientific field concerned with how computers learn from data. Once trained, or constructed empirically, an ML model can quickly generate reproducible automated feedback without professional reviewers, and it can readily process the vast amount of data available from the modern operating room. With the ever-increasing availability of computational power, ML is being used in a variety of medical fields, including surgery. Postoperative mortality risk prediction [63], autonomous performance of simple tasks [64], and surgical workflow analysis [65] are just a few of the many surgical applications of ML and artificial intelligence (AI). The widespread use of ML has led to the emerging field of surgical data science, which aims to improve the value and quality of surgery through data collection, organization, analysis, and modeling [63, 66–68]. Over the past ten years, the use of ML for assessing surgical skill has grown rapidly, although the extent to which it can evaluate surgical performance is still unclear. Hidden Markov models (HMMs), support vector machines (SVMs), and artificial neural networks (ANNs) have been the ML techniques most commonly used to evaluate surgical performance. These three techniques also trace the research trends in this area, which initially emphasized HMMs before moving to SVM methods and, more recently, to ANNs and deep learning.
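As a rough illustration of the SVM-based skill assessment mentioned above, the sketch below classifies novice versus expert trials from per-trial kinematic summary features. The feature set and the synthetic data are hypothetical stand-ins for real robot logs; it is meant only to show the shape of such a pipeline, not to reproduce any published method.

    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC
    from sklearn.model_selection import cross_val_score

    # Hypothetical per-trial kinematic summary features extracted from robot logs:
    # [path length (m), completion time (s), mean speed (m/s), jerk RMS, idle-time ratio]
    rng = np.random.default_rng(0)
    novices = rng.normal([2.1, 310, 0.02, 1.8, 0.30], 0.3, size=(40, 5))
    experts = rng.normal([1.2, 180, 0.03, 0.9, 0.12], 0.2, size=(40, 5))

    X = np.vstack([novices, experts])
    y = np.array([0] * 40 + [1] * 40)     # 0 = novice, 1 = expert

    # SVM with an RBF kernel, one of the classifiers commonly reported for this task.
    model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
    scores = cross_val_score(model, X, y, cv=5)
    print(f"cross-validated accuracy: {scores.mean():.2f}")

A real study would replace the synthetic arrays with features computed from recorded instrument trajectories and would report validation on held-out surgeons rather than held-out trials.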
When humans and animals engage in object manipulation, the interaction inherently involves a fast feedback loop between perception and action. Even complex manipulation tasks, such as extracting a single object from a cluttered bin, can be performed without first localizing the objects or planning the scene, relying instead on continuous sensing from touch and vision. In contrast, robotic manipulation often (though not always) relies more heavily on advance planning and analysis, with relatively simple feedback, such as trajectory following, to ensure stability during execution. Part of the reason for this is that incorporating complex sensory inputs, such as vision, directly into a feedback controller is exceedingly challenging. Techniques such as visual servoing perform continuous feedback on visual features, but typically require the features to be specified by hand, and both open-loop perception and feedback (for example via visual servoing)
The first set of experiments provides an extensive evaluation of the proposed method, as well as comparisons with baselines and earlier techniques. The dataset used in these trials is available for download at https://sites.google.com/website/brainrobotdata/home. The second set of experiments was aimed at assessing whether grasping data collected by one kind of robot could be used to improve the grasping capability of a different robot. In these experiments, the authors collected more than 900,000 additional grasp attempts using a different robotic manipulator with a considerably larger assortment of objects. This second robotic platform was used to test whether combining data from multiple robots results in better overall grasping capability. The trials showed that the convolutional neural network grasping controller achieves a high success rate when grasping in clutter over a wide range of objects. The authors collected 800,000 grasp attempts to train the CNN grasp-prediction model, using objects that are large, small, hard, soft, deformable, and transparent. Supplemental recordings of the grasping system showed that the robot uses continuous feedback to constantly adjust its grasp, compensating for motion of the objects and for inaccurate actuation commands. The authors also compare the approach with open-loop variants to show the importance of continuous feedback, as well as with a hand-engineered grasping benchmark that uses manual hand-to-eye calibration and depth sensing. The method achieves the highest success rates in their experiments. Finally, the authors show how data collected for two different types of robots can be combined, using data from one robot to improve the grasping capability of another [71, 72].
Robotic grasping is one of the most widely investigated areas of manipulation. While a complete survey of grasping is beyond the scope of this work, the authors refer the reader to standard surveys on the subject for a more complete treatment. Broadly, grasping methods can be classified as geometrically driven or data driven. Geometric methods analyze the shape of a target object and plan a suitable grasp pose based on criteria such as force closure or caging. These methods typically need to reason about the geometry of the scene, using depth or stereo sensors and matching previously scanned models to observations. The approach is most closely related to recent work on self-supervised learning of grasp poses by Pinto and Gupta, as well as earlier work on learning from autonomous trial and error, which proposed to learn a network that predicts the optimal grasp orientation for a given image patch, trained with self-supervised data collected using a heuristic grasping system based on object proposals. In contrast to this earlier work, the approach achieves continuous hand-eye coordination for grasping by observing the gripper and choosing the best motor command to move it toward a successful grasp, rather than making open-loop predictions. Because the method uses no human annotations, the authors can also collect a large real-world dataset completely autonomously [73, 74].
Because the method makes considerably weaker assumptions about the available human supervision (none) and the available sensing (only an over-the-shoulder RGB camera), direct comparisons of grasp success rates against values reported in earlier work are not realistic. The set of objects used for evaluation includes extremely challenging items, such as transparent bottles, small round objects, deformable objects, and clutter. The mismatch in object difficulty between this work and earlier studies further complicates direct comparison of reported accuracy. The aim of the work is therefore not to establish which system is best, since such comparisons are impossible without standardized benchmarks, but rather to examine how well a grasping method based entirely on learning from raw, autonomously collected data can scale to complex and diverse grasping scenarios [75].
Another area related to the authors' method is robotic reaching, which deals with coordination and feedback for reaching motions, and visual servoing, which addresses moving a camera or end-effector to a desired pose using visual feedback. In contrast to this approach, visual servoing methods are normally concerned with reaching a pose relative to objects in the scene, and frequently (though not always) rely on manually designed or specified features for feedback control. Photometric visual servoing uses a target image instead of features, and several visual servoing methods have been proposed that do not directly require prior calibration between the robot and the camera. Some recent visual servoing methods have also used learning and computer vision techniques. To the authors' knowledge, no earlier learning-based method has been proposed that uses visual servoing to move directly into a pose that maximizes the probability of success on a given task (such as grasping) [76].
To predict the motor commands that maximize grasp success, the authors use convolutional neural networks (CNNs) trained on grasp success prediction. Although the technology behind CNNs has been known for decades, they have recently made remarkable progress on a wide range of challenging computer vision benchmarks, becoming the de facto standard for computer vision systems. Nonetheless, applications of CNNs to robotic control problems have been less prevalent than applications to passive perception tasks such as object recognition, localization, and segmentation. Several works have proposed using CNNs for deep reinforcement learning applications, including playing video games, executing simple task-space motions for visual servoing, controlling simple simulated robotic systems, and performing a variety of robotic manipulation tasks. Many of these applications have been in simple or synthetic domains, and all of them have focused on relatively constrained environments with small datasets [77].
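The following sketch illustrates the general idea of a grasp-success CNN used in a servoing-style loop: the network scores candidate motor commands against the current camera image, and the best-scoring command is executed. The architecture, input sizes, and command dimension are illustrative assumptions made here for the sketch (a PyTorch environment is assumed), not the network from the cited work.

    import torch
    import torch.nn as nn

    class GraspSuccessCNN(nn.Module):
        """Predict P(grasp succeeds | image, candidate motor command)."""
        def __init__(self, cmd_dim=5):
            super().__init__()
            self.vision = nn.Sequential(            # small convolutional trunk
                nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
                nn.Conv2d(16, 32, 3, stride=2), nn.ReLU(),
                nn.AdaptiveAvgPool2d(4), nn.Flatten())
            self.head = nn.Sequential(              # fuse image features with command
                nn.Linear(32 * 4 * 4 + cmd_dim, 64), nn.ReLU(),
                nn.Linear(64, 1))

        def forward(self, image, command):
            feats = self.vision(image)
            return torch.sigmoid(self.head(torch.cat([feats, command], dim=1)))

    # Servoing-style use: score a batch of sampled commands, execute the best one.
    net = GraspSuccessCNN()
    image = torch.rand(1, 3, 128, 128)              # current camera frame (placeholder)
    candidates = torch.randn(64, 5)                 # sampled gripper motions (placeholder)
    scores = net(image.expand(64, -1, -1, -1), candidates)
    best_cmd = candidates[scores.argmax()]

Repeating this scoring step at every control cycle is what gives the closed-loop behaviour described above, as opposed to committing to a single open-loop grasp prediction.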
After dominating the market with its Da Vinci system for many years, Intuitive Surgical now finally faces international companies vying for market share with their own iterations of cutting-edge robots [78]. These systems will typically feature open consoles, lighter hardware, and increased mobility. Even interest in automation, not seen in nearly 30 years, has been reignited. The STAR robot can suture inside the body more consistently than a human hand, without human intervention; to take part in the gastrointestinal anastomosis of a pig, it combined three-dimensional imaging and sensing (near-infrared fluorescent, NIRF, markers) with the concept of supervised autonomous suturing [63]. The Revo-I, a Korean robot, recently completed its first clinical trials, including Retzius-sparing robot-assisted radical prostatectomy (RARP). Even in the hands of skilled practitioners, three patients required blood transfusion and the positive margin rate was 23% [79], a commendable example of transparent reporting.
The new devices may bring the cost of a robotic procedure close to that of laparoscopy, even though the underlying equipment cost may still be substantial. The UK's Cambridge Medical Robotics plans to introduce newer costing models that cover maintenance, instruments, and even assistance as a whole package in addition to the equipment itself. This may attract high-volume open and laparoscopic specialists in the East to robotics and encourage multidisciplinary adoption. For instance, lower costs could support wider acceptance of robotic surgery in eastern India, where prostate cancer is rare but aggressive in those who develop it. According to data from the Vattikuti Foundation, there are currently about 60 Da Vinci installations in India, with urologists making up about 50% of the users and RARP being the most common procedure. In a review of a recent series of RARPs from Kolkata, 90% continence and a biochemical recurrence-free survival of 75% at 5 years were reported in cases of mostly high-risk prostate cancer. While effective multidisciplinary teamwork will reduce costs, it is almost certain that Markov modelling will be used to determine the medium-term cost-effectiveness of robotic surgery in the developing world.
Although cost may grab the headlines, the two developments in new robots that are generating the most excitement are artificial intelligence (AI) and faster digital communication. The era of surgical AI has begun, even though the concept is not new and can be traced back to Alan Turing, a genius whose code-breaking had a significant impact on the outcome of World War II. However fashionable it may sound, AI is likely to be the main force behind the digitization of surgical practice. Artificial intelligence is the umbrella term for a family of complex computer programs designed to achieve a goal by making decisions; with capabilities such as visual discrimination, speech recognition, and language translation, it is comparable to human intelligence in this respect. A subset of AI called machine learning (ML) uses adaptive computer algorithms to understand and respond to specific data. By determining, for example, whether a particular image represents prostate cancer, a prostate-recognition algorithm might enable the machine to reduce the variability in radiologists' interpretations of magnetic resonance imaging. Modern machine learning has been transformed by artificial neural networks, specifically deep learning, by graphics processing units, and by nearly unlimited data storage, making implementations faster, cheaper, and more powerful than ever before. Video recordings of experts performing RARP can now be converted into automated performance metrics through a "black box", revealing surprising findings, such as that the most prolific experts are not necessarily those who achieve the best results [80].
Robotic technology is intended to make medical intervention a more dependable, safer, and less invasive process [81, 82]. New developments are moving toward fully autonomous surgical agents and robot-assisted systems. The surgical system used most frequently to date is the da Vinci robot, which has already demonstrated its effectiveness in remote-controlled laparoscopic surgery in gynaecology, urology, and general surgery [81]. The data available at the surgical console of a robot-assisted system contains crucial details for intraoperative guidance that can support the decision-making process. Typically, this information is presented as 2D images or video showing surgical tools and human tissue. Understanding these details, which include estimating the pose of the surgical instruments within the surgical scene, is a complex problem. Semantic segmentation of the instruments seen at the surgical console is a fundamental component of this process. Semantic segmentation of robotic instruments is challenging because of the complexity and dynamic nature of background tissue, lighting changes such as shadows and specular reflections, and visual obstructions such as blood and camera lens fogging. Segmentation masks can make a significant contribution to instrument tracking and navigation systems. This creates a compelling need for accurate and robust computer vision techniques for the semantic segmentation of surgical instruments in operative images and video. Numerous vision-based techniques have been developed for robotic instrument detection and tracking [82]. Instrument–background segmentation can be viewed as a binary or instance segmentation problem, and classical machine learning algorithms using both texture and colour features have been applied to it [83, 84]. Later applications turned to semantic segmentation, which refers to the recognition of the different instruments or their parts [85, 86].
Deep learning-based approaches have recently demonstrated performance improvements over conventional machine learning methods for several biomedical problems [87, 88]. Convolutional neural networks have been used successfully in medical imaging for a variety of purposes, including the analysis of breast cancer histology images [89], bone disease prediction [90], age estimation [91], and others [87]. Deep learning-based approaches to robotic instrument segmentation have already shown solid performance in binary segmentation [92, 93] and promising results in multiclass segmentation [94]. Deep neural network variants suitable for fixed and mobile devices, such as clinical robots, are beginning to emerge [95]. The authors of that work offer a deep learning-based approach to the semantic segmentation of robotic instruments that produces state-of-the-art results in both two-class and multi-class settings. Using this method, the authors produced a solution for the Robotic Instrument Segmentation MICCAI 2017 Endoscopic Vision Sub-Challenge [96], placing first in binary and multi-class instrument segmentation and second in the instrument-parts segmentation sub-task. They describe the details of the solution, which is based on a modification of the U-Net model [97], and provide further improvements using other contemporary deep models.
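For orientation, the sketch below shows a miniature U-Net-style encoder-decoder of the kind used for binary instrument segmentation, with skip connections linking encoder and decoder stages. The depth and channel counts are deliberately small and purely illustrative; this is not the challenge-winning architecture (a PyTorch environment is assumed).

    import torch
    import torch.nn as nn

    def conv_block(c_in, c_out):
        return nn.Sequential(
            nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU(inplace=True))

    class TinyUNet(nn.Module):
        """Two-level U-Net: skip connections join encoder and decoder stages."""
        def __init__(self, n_classes=1):
            super().__init__()
            self.enc1, self.enc2 = conv_block(3, 32), conv_block(32, 64)
            self.pool = nn.MaxPool2d(2)
            self.bottleneck = conv_block(64, 128)
            self.up2 = nn.ConvTranspose2d(128, 64, 2, stride=2)
            self.dec2 = conv_block(128, 64)
            self.up1 = nn.ConvTranspose2d(64, 32, 2, stride=2)
            self.dec1 = conv_block(64, 32)
            self.head = nn.Conv2d(32, n_classes, 1)   # per-pixel instrument logits

        def forward(self, x):
            e1 = self.enc1(x)
            e2 = self.enc2(self.pool(e1))
            b = self.bottleneck(self.pool(e2))
            d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))
            d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
            return self.head(d1)

    mask_logits = TinyUNet()(torch.rand(1, 3, 256, 256))   # output: (1, 1, 256, 256)

The skip connections are what preserve the fine instrument boundaries that would otherwise be lost in downsampling, which is why this family of architectures dominates surgical-tool segmentation.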
Minimally invasive surgery (MIS) has recently gained popularity for procedures in oncological surgery, colorectal surgery, and general surgery [98]. Natural Orifice Transluminal Endoscopic Surgery (NOTES) with robotic assistance has the greatest potential and reliability of any technique for performing tasks inside the peritoneal cavity without abdominal incisions. Phee et al. created a master–slave flexible endoscopic surgical robot [99] to significantly enhance the dexterity of surgeons within the peritoneal cavity: a flexible endoscope delivers the slave robot arms to the desired locations, while the master robot moves according to instructions from the surgeon at the proximal end. Although the strength, adaptability, and consistency of robot-assisted NOTES quickly improved over time, the absence of precise haptic feedback remained a critical flaw, so surgeons rely heavily on experience and visual information to make decisions [100]. Numerous studies [31, 101–103] have shown that providing doctors with haptic feedback not only significantly shortens the time spent in the operating room, but also reduces instances of excessive or inadequate force transfer, limiting tissue damage.
Although Omega.7, CyberForce, and CyberGrasp [104] are among the well-known haptic devices available on the market, the force data linking the surgical robots and the manipulated objects is missing from the loop. Tendon-sheath mechanisms (TSMs) have been widely used for motion and force transmission in robotic systems for NOTES because of their high flexibility and controllability along constrained and convoluted paths. However, TSMs are frequently affected by issues such as backlash, hysteresis, and nonlinear tension loss caused by friction between the tendon and the surrounding sheath, so it is challenging to obtain precise haptic feedback with these systems. At the distal end of a surgical robot, various sensors based on displacement [105], current [106], pressure [107], resistance [108], capacitance [109], vibration [110], and optical properties [111] can be mounted to measure the interaction force directly for haptic feedback. In practice, however, these sensors are typically limited by the inability to sterilize them, the harsh environment during the procedure, the lack of mounting space at the distal end, problems with the associated wires and fittings, and other factors. Extensive efforts have therefore been made to model the force transmission of TSM-driven robots numerically, so that the force at the robot's distal end can be calculated from measurements at its proximal end. Kaneko et al. studied tension transmission in TSMs [112] using the Coulomb friction model. Lampaert, Pitkowski, and others [113, 114] proposed the Dahl, LuGre, and Leuven models as alternatives to the Coulomb model in an effort to refine the modelling procedure. However, as these modelling techniques became more accurate, they began to show inconsistencies between different hysteresis stages and could not accurately describe the contact force when the system was operating at zero velocity. In a subsequent development for clinical robots,
the nonlinear friction in TSMs was modelled using the Bouc-Wen model [115]. Backlash compensation is a common strategy in Bowden-cable control to reduce hysteresis [116, 117]. Do et al. [118, 119] proposed an improved Bouc-Wen model with dynamic properties that used velocity and acceleration data to describe the friction profile considerably more accurately. It is worth noting that springs have often been used in the literature to mimic the response of tissue; in fact, to perform force prediction effectively on a haptic device, the nonlinear behaviour of tissue must be taken into account. Wang et al. [120] used the Voigt, Kelvin, and Hunt-Crossley models and examined other approaches for modelling the tissue force response for TSMs in NOTES while keeping the task-space velocity constant. For each viscoelastic model to be constructed, a number of difficult parameters must be carefully identified for the scenarios in which the tendon tension is large enough to prevent any slack in the system and the types of interaction with tissue are constrained; this is necessary to predict the distal force in TSMs accurately with numerical models. In robotic control problems where robots derive policies directly from images, neural networks have demonstrated empirical success [100, 121]. Learning control policies with convolutional features suggests that these features may also capture additional properties of the underlying dynamical system. Dynamical systems theory motivates the authors' investigation of methods for integrating the transition-state model, with significant emphasis on segmentation. The authors acknowledge that accurate segmentation requires the appropriate selection of visual features, and that segmentation is an essential first step in many robot learning applications.
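As a concrete reference for the friction modelling discussed above, the sketch below integrates the standard Bouc-Wen hysteresis equations for a cyclic proximal displacement to produce a hysteretic force estimate. The parameter values are illustrative defaults, not values identified from any tendon-sheath mechanism.

    import numpy as np

    # Standard Bouc-Wen hysteresis: F = alpha*k*x + (1 - alpha)*k*z,
    # with the internal state z evolving as
    #   dz/dt = A*dx/dt - beta*|dx/dt|*|z|^(n-1)*z - gamma*(dx/dt)*|z|^n
    # Parameter values below are illustrative only.
    A, beta, gamma, n = 1.0, 0.5, 0.5, 1.0
    k, alpha = 10.0, 0.3

    def bouc_wen_force(x, dt=1e-3):
        """Hysteretic transmission force for a sampled displacement trajectory x(t)."""
        z, forces = 0.0, []
        for i in range(1, len(x)):
            dx = (x[i] - x[i - 1]) / dt
            dz = A * dx - beta * abs(dx) * abs(z) ** (n - 1) * z - gamma * dx * abs(z) ** n
            z += dz * dt
            forces.append(alpha * k * x[i] + (1 - alpha) * k * z)
        return np.array(forces)

    t = np.arange(0.0, 2.0, 1e-3)
    x = 0.01 * np.sin(2 * np.pi * t)      # cyclic proximal displacement [m]
    F = bouc_wen_force(x)                  # plotting F against x[1:] shows the hysteresis loop

The extended models cited above add velocity- and acceleration-dependent terms to this basic form so that the predicted distal force remains consistent across hysteresis stages and near zero velocity.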
Instrumentation: According to a significant portion of the studies considered in this review, the absence of instrumentation designed specifically for microsurgery severely limits the capabilities of current robotic surgical systems. The majority of the published articles examined the viability of robotic microsurgery performed with the da Vinci surgical robot. Even though this particular system is approved for seven different types of minimally invasive surgery, it is not recommended or intended for open plastic and reconstructive microsurgery. Most instruments compatible with the da Vinci are considered too large to handle the delicate tissue frequently encountered during microsurgery. Using the da Vinci's Black Diamond Micro Forceps, nerves and small vessels can be operated on successfully; however, handling submillimeter tissue and equipment remains time-consuming and difficult because a comprehensive set of appropriate microsurgical instruments is lacking. Compared with traditional microsurgery, using a surgical robot makes routine microsurgical steps such as dissecting blood vessels, applying vessel clamps, and handling fine sutures more difficult. The surgical toolkit does not fully cover the variety of tissues encountered during microsurgery, and the instruments are also large and costly, at roughly $1,800 to $4,600 per instrument over a limited number of operations. Training resources must be allocated to familiarize operating-room personnel with surgical robots, and additional staff are required to guarantee the systems' reliability outside the operating room. Because of their inherent complexity, repair and maintenance of surgical robots require specialized knowledge; hospitals that use them therefore have to negotiate service agreements with the manufacturers, which add roughly 10% to the system's annual cost. The rising costs brought on by these increased demands on personnel and supplies make surgical robots less appealing.
Hospitals may benefit if costly treatment options are linked to improved outcomes or increased revenue over time. However, there is little evidence that this holds for plastic and reconstructive microsurgery, so there are currently few reasons to spend heavily on surgical robots. Published data also indicate that operating times for robotic-assisted microsurgery are longer than for traditional microsurgery, which may lengthen waiting times and reduce the number of patients who can be treated. The cost savings claimed from shorter hospital stays and fewer post-operative complications do not yet outweigh the investment required to justify surgical robots in plastic and reconstructive microsurgery. Very few plastic surgery departments will currently be willing to invest in robotic-assisted surgery unless patient throughput and cost efficiency both improve.
In surgical training, the apprenticeship model is frequently utilized, in which
students first observe a skilled professional before becoming more involved in proce-
dures. Typically, surgical robots only permit a single surgeon to complete the proce-
dure and operate the entire system. As a result, assistants rarely have the opportunity
to participate in robotically assisted tasks. Surgeons’ exposure to surgical robotics and
opportunities to improve their skills may be limited if they are not actively involved.
This problem can be solved by switching surgeons in the middle of the procedure
or by using two or more complete surgical robotic systems. Even though switching
between different users is a quick process, clinical outcomes may be jeopardized if there is a delay at a critical moment. It might be safer to train new surgeons with
multiple surgical robots. Students can learn the skills necessary for robotic micro-
surgery while also providing the lead surgeon with an assistant who can assist during
the procedure. However, considering that each robotic system can cost more than
$2 million, it is difficult to justify purchasing one solely for training purposes. Last
but not least, it’s important to know that surgical robots shouldn’t replace traditional
microsurgery; rather, they should be seen as an additional tool. The skills required
for each type of microsurgery are very different. Due to the very different movements
and handling of delicate tissue, the skills required to successfully use a surgical robot
in these circumstances cannot be directly applied to conventional microsurgery. For
future surgeons to be able to deal with the many different problems that will arise
during their careers, they will need to receive training in both conventional and
robotic-assisted microsurgery. Therefore, surgical training ought to incorporate both
traditional and robotically assisted surgical experience.
Market competition in laparoscopic robot-assisted surgery (RAS) should begin to make RAS systems and supplies more affordable, and laparoscopic RAS should become cheaper as a result. Given the benefits to the patient and the accompanying cost savings, RAS should then be used more frequently for laparoscopic procedures, and the economies of scale arising from lower costs for RAS systems, supplies, and maintenance will make laparoscopic RAS surgery more affordable still [123].
Although da Vinci continues to dominate the market for single-port laparoscopic RAS surgery, a few rival systems are still in the testing phase. The availability of these systems should lower the cost of single-port laparoscopic RAS surgery and make it more widely used. Single-port laparoscopic RAS surgery is likely to become the technique of choice for both surgeons and patients, given the advantages of almost scar-free surgery and the decreasing costs. Hospitals that have purchased the da Vinci Xi system are likely to buy single-port EndoWrist instruments so that they can perform both single-port and multi-port laparoscopic surgery with the same RAS system. As single-port laparoscopic RAS systems become available in the operating room, we are likely to see an increase in the use of NOTES for genuinely scar-free procedures. Just as Intuitive Surgical introduced the dedicated single-port laparoscopic RAS system for the da Vinci SP [123], they will probably introduce instruments that the da Vinci SP can use with NOTES procedures to compete with the new NOTES-specific systems on the market.
Finally, both new RAS systems and upgrades to existing RAS systems are likely to include augmented reality as a standard feature. Surgeons will be able to overlay real-time endoscope camera feeds on top of elements of the operating-room workspace using augmented reality [53, 86]. Technology that can map features such as blood vessels, nerves, and even tumors and overlay their locations on the surgeon's display in real time has made this possible [54–56, 80]. Overlaid medical images can also include images acquired earlier for diagnosis or intervention planning. By helping the surgeon locate the area of interest and avoid major blood vessels and nerves that could cause problems for the patient after surgery, this will help the surgeon provide the safest and best care possible throughout the intervention. Research into new surgical systems must improve either manipulation or imaging, the two essential aspects of surgery. Given the widespread adoption of these technologies, it seems inevitable that new and improved imaging will be developed; it must keep pace with advances in robotic technology on the manipulation side [124].
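A minimal sketch of the overlay step described above is given below, assuming a vessel segmentation mask that has already been registered to the endoscope frame; it simply alpha-blends the mask onto the live image with OpenCV. The file names and the fixed blending weight are hypothetical placeholders.

    import cv2
    import numpy as np

    def overlay_structures(frame, mask, color=(0, 0, 255), alpha=0.4):
        """Blend a binary structure mask (e.g. vessels) onto an endoscope frame."""
        overlay = frame.copy()
        overlay[mask > 0] = color                      # paint the segmented pixels
        return cv2.addWeighted(overlay, alpha, frame, 1 - alpha, 0)

    # Hypothetical inputs: a captured endoscope frame and a registered vessel mask.
    frame = cv2.imread("endoscope_frame.png")          # BGR image from the camera feed
    mask = cv2.imread("vessel_mask.png", cv2.IMREAD_GRAYSCALE)
    augmented = overlay_structures(frame, mask)
    cv2.imwrite("augmented_view.png", augmented)

A clinical system would of course update the mask and its registration at video rate; this sketch only shows the compositing of one frame.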
The use of robotic surgery is still in its infancy. Equipment is incorporating new technologies to boost performance and cut down on downtime. Siemens' Balasubramaniac asserts that digital twins and AI will improve future performance: with digital twin technology the procedure can be recorded and analyzed later for educational and process-improvement purposes, which requires keeping a minute-by-minute record of the process. There is considerable hope that robotic surgery will eventually improve precision, efficiency, and safety while potentially lowering healthcare costs. It may also facilitate access to specialists in difficult-to-reach locations. Santosh Kesari, M.D., Ph.D., co-founder and director of neuro-oncology at the Pacific Neuroscience Institute in Santa Monica, California, stated, "Access to surgical expertise is limited in many rural areas of the United States as well as in many parts of the world." Robotic-assisted surgical equipment is expected to be used by a growing number of healthcare facilities for both in-person and remote procedures, and the technology will keep developing and improving.
The technology of the future will be more adaptable, portable, and AI-based. Additional robotic equipment, such as handheld devices, will be developed to accelerate telehealth and remote care. How quickly high-speed communication infrastructure is established will play a role in this. 5G will be useful, with its 20 Gbps peak data rate and 1 ms latency, but 6G is anticipated to be even better: with a latency of 0.1 ms, 6G's peak data rate should theoretically reach one terabit per second. However, speeds can vary significantly depending on the technology's application and location. Open Signal, a company that monitors 5G performance worldwide, reports that South Korea frequently leads in 5G performance, with Ultra-Wideband download speeds of 988.37 Mbps, while Verizon recently achieved a peak speed of 1.13 Gbps. Speed is significantly affected by the position of the 5G antennas, and a peak reached once is not necessarily sustained. 5G therefore still has a long way to go from its current roughly 1 Gbps to 20 Gbps. In conclusion, the medical field can benefit greatly from remote robotic-assisted surgery, and the advantages are numerous. Ramp-up time will depend on reliable communications systems and secure chips, as well as on the capacity to monitor each component in the many interconnected systems that must cooperate for RAS to be successful.
9 Discussion
Several works have proposed using CNNs for deep reinforcement learning applications. Verb Surgical is a joint venture between Johnson & Johnson's medical device division Ethicon and Google's life sciences division Verily. It has recently designed its first digital surgery prototype, boasting leading-edge robotic capabilities and best-in-class medical device technology; robotics, visualization, advanced instrumentation, data analytics, and connectivity are its principal pillars. IBM's Watson also aspires to be an intelligent surgical assistant. It is a harbinger of limitless clinical data, using natural language processing to answer a clinician's questions, and it is currently being used to analyze electronic medical records and profile tumour characteristics with the aim of forming more personalized treatment plans. Surgery may be further democratized by low-latency, ultrafast 5G connectivity: the Internet of Skills could make remote robotic surgery, teaching, and mentorship readily available, regardless of the location of the expert surgeon [125]. In summary, the three buzzwords for the future of robotic surgery are cost, data, and connectivity, and the effect of these advances on patient care is being watched with considerable interest. The authors intend to examine whether performance improves when the CNNs are trained with surgical images [85]. They will investigate how to extract consistent structure across inconsistent demonstrations and find that some surgical performances contain loops, i.e., repetitive motions in which the surgeon repeats a subtask until it succeeds; merging these motions into a single primitive is an important goal. The next step is to apply this and future automated segmentation methods to skill assessment and policy learning.
A potential next step of this work is to use the weighting factor matrix
with boosting techniques to train the unified state
estimation model more efficiently. Although modeled as an FSM, the fine-grained states within
each surgical task are estimated independently, without influence from the previous
state(s). Another potential next step is to perform state prediction based on
the previously estimated state sequence. In the future, the authors also plan to apply this state
estimation framework to applications such as smart assistance technologies and
supervised autonomy for surgical subtasks.
This study had a few limitations. First, the proposed framework was
applied to video sets of a training model and of patients with thyroid disease who
underwent BABA surgery. It is necessary to verify the effectiveness of
the proposed framework using other surgical methods and surgical regions.
Second, the authors could not directly compare the performance of the kinematics-based
and the proposed image-based methods, because access to the da Vinci Research
Interface is restricted, allowing most researchers to obtain only raw kinematic
data [85]. However, previous studies have reported that the kinematics method
using the da Vinci robot had an error of around 4 mm [9]. Direct comparison
of performance is difficult because the surgical images used
in the previous study and in this study differed. Nevertheless, the average RMSE of the
proposed image-based tracking algorithm was 3.52 mm, indicating that this
method is more accurate than the kinematics method and that the latter cannot be
described as superior. The performance of the current method could not be directly
compared with previous vision-based methods, because no comparable
study detected and tracked the tip trajectories of the surgical instruments (SIs). However, studies
have used deep learning-based detection methods to determine the bounding
boxes of the SIs and to show the trajectories of the center points of these boxes [94, 95].
Nevertheless, because this approach could not determine the specific locations of the
SIs, it cannot be regarded as a precise tracking method. Comparison of
the quantitative performance of the proposed method with other approaches is
important, making it necessary to examine other SI tracking techniques. Third,
since SIs are detected on two-dimensional views, errors may occur because of the
absence of depth information. Magnification errors were therefore limited
by measuring the width of the SIs in the view and converting pixels to
millimeters. Nevertheless, methods are needed that exploit three-dimensional information
based on stereoscopic matching of the left and right images during robotic
surgery [10, 11]. Fourth, because the proposed method is a combination of
several algorithms, longer videos can result in the accumulation of additional errors,
degrading the performance of the system. Consequently, it is particularly important
to train additional negative examples with the instance segmentation model, which is
the beginning of the pipeline. For example, gauze or tubes in the robotic surgery
view can be recognized as SIs (Supplementary Figure S4). Finally, since
errors from re-identification in the tracking framework could fundamentally
affect the ability to determine correct trajectories, accurate assessment of surgical skills
requires manual correction of errors.
Despite the progress in the present work, there still exist some limitations
of deep learning models toward a competent online skill assessment. First,
as confirmed by the results, the classification accuracy of supervised
deep learning depends heavily on the labeled samples. The primary concern
in this study lies with the JIGSAWS dataset and the absence of strict ground-truth
labels of skill levels. It is important to mention that there is a lack of
consensus on the ground-truth annotation of surgical skills. In the GRS-based
labeling, skill labels were annotated based on a predefined cutoff threshold of
GRS scores; however, no commonly accepted cutoff exists. For future work, a
refined labeling approach with stronger ground-truth knowledge of surgeon
skill might further improve the overall skill assessment [9, 10].
Second, the authors will pursue a detailed optimization of the deep architecture,
parameter settings, and augmentation strategies to better handle motion
time-series data and further improve the online performance. Likewise,
the interpretability of automatically learned representations is currently limited
because of the black-box nature of deep learning models. It would be interesting to
investigate a visualization of the deep hierarchical representations to understand hidden
skill patterns, so as to better justify the decision taken by a
deep learning classifier.
At this point, the authors acknowledge that the major limitation of the deep
architecture is its high computational cost. Running multiple deep neural networks
simultaneously requires multiple processing units, which limits the update rates of
the trackers. Lightweight deep neural networks would be well suited for real-time
surgical applications, provided they can be adapted without sacrificing accuracy.
As recent progress has been made on deep learning-based reconstruction and rendering
methods [117, 118], a future direction could be to use a learnable tissue tracker
and instrument tracker to further improve the perception framework. Another
direction to pursue is surgical task automation. By incorporating the perceived
environment as feedback, controllers applied to the surgical instruments will be able to
accomplish tasks in unstructured, deforming surgical scenes.
10 Conclusions
Deep learning techniques are currently surpassing the prior state of the art
in a wide range of robotics, medical robotics, and drug development
efforts. Regarding our main question: has surgical robotics research been transformed by
deep learning in light of this rapid progress? Although the answer depends
directly on the specific problem and domain, we believe that deep learning has yet
to realize or trigger a critical turning point in its transformative capacity. Despite being
regarded as having a strong position in many different areas, such issues have not yet been
addressed by decisive improvements in predictive capability. This research focuses on surgical
robots and deep learning: progress, achievements, and future perspectives. This area
requires more attention; in the future, more medical and surgical robots are needed.
References
1. Levine, S., Finn, C., Darrell, T., & Abbeel, P. (2016). End-to-end training of deep visuomotor
policies. The Journal of Machine Learning Research, 17(1), 1334–1373.
2. Bakshi, G., Kumar, A., & Puranik, A. N. (2022). Adoption of robotics technology in healthcare
sector. In Advances in communication, devices and networking (pp. 405–414). Singapore:
Springer.
3. Maibaum, A., Bischof, A., Hergesell, J., & Lipp, B. (2022). A critique of robotics in health
care. AI & Society, 37(2), 467–477.
4. Tasioulas, J. (2019). First steps towards an ethics of robots and artificial intelligence. Journal
of Practical Ethics, 7(1).
5. Hallevy, G. (2013). When robots kill: Artificial intelligence under criminal law. UPNE.
6. Bryndin, E. (2019). Robots with artificial intelligence and spectroscopic sight in hi-tech labor
market. International Journal of Systems Science and Applied Mathematics, 4(3), 31–37.
7. Lopes, V., Alexandre, L. A. & Pereira, N. (2019). Controlling robots using artificial
intelligence and a consortium blockchain. arXiv:1903.00660.
8. Bataev, A. V., Dedyukhina, N., & Nasrutdinov, M. N. (2020, February). Innovations in the
financial sphere: performance evaluation of introducing service robots with artificial intel-
ligence. In 2020 9th International Conference on Industrial Technology and Management
(ICITM) (pp. 256–260). IEEE.
9. Nitto, H., Taniyama, D., & Inagaki, H. (2017). Social acceptance and impact of robots and
artificial intelligence. Nomura Research Institute Papers, 211, 1–15.
10. Yoganandhan, A., Kanna, G. R., Subhash, S. D., & Jothi, J. H. (2021). Retrospective and
prospective application of robots and artificial intelligence in global pandemic and epidemic
diseases. Vacunas (English Edition), 22(2), 98–105.
11. Rajan, K., & Saffiotti, A. (2017). Towards a science of integrated AI and Robotics. Artificial
Intelligence, 247, 1–9.
12. Chatila, R., Renaudo, E., Andries, M., Chavez-Garcia, R. O., Luce-Vayrac, P., Gottstein, R.,
Alami, R., Clodic, A., Devin, S., Girard, B., & Khamassi, M. (2018). Toward self-aware
robots. Frontiers in Robotics and AI, 5, 88.
13. Gonzalez-Jimenez, H. (2018). Taking the fiction out of science fiction:(Self-aware) robots
and what they mean for society, retailers and marketers. Futures, 98, 49–56.
14. Schostek, S., Schurr, M. O., & Buess, G. F. (2009). Review on aspects of artificial tactile
feedback in laparoscopic surgery. Medical Engineering & Physics, 31(8), 887–898.
15. Naitoh, T., Gagner, M., Garcia-Ruiz, A., Heniford, B. T., Ise, H., & Matsuno, S. (1999). Hand-
assisted laparoscopic digestive surgery provides safety and tactile sensation for malignancy
or obesity. Surgical Endoscopy, 13(2), 157–160.
16. Schostek, S., Ho, C. N., Kalanovic, D., & Schurr, M. O. (2006). Artificial tactile sensing in
minimally invasive surgery–a new technical approach. Minimally Invasive Therapy & Allied
Technologies, 15(5), 296–304.
17. Kraft, B. M., Jäger, C., Kraft, K., Leibl, B. J., & Bittner, R. (2004). The AESOP robot system in
laparoscopic surgery: Increased risk or advantage for surgeon and patient? Surgical Endoscopy
And Other Interventional Techniques, 18(8), 1216–1223.
18. Troisi, R. I., Patriti, A., Montalti, R., & Casciola, L. (2013). Robot assistance in liver surgery:
A real advantage over a fully laparoscopic approach? Results of a comparative bi-institutional
analysis. The International Journal of Medical Robotics and Computer Assisted Surgery, 9(2),
160–166.
19. Dupont, P. E., Nelson, B. J., Goldfarb, M., Hannaford, B., Menciassi, A., O’Malley, M. K.,
Simaan, N., Valdastri, P., & Yang, G. Z. (2021). A decade retrospective of medical robotics
research from 2010 to 2020. Science Robotics, 6(60), eabi8017.
20. Fuchs, K. H. (2002). Minimally invasive surgery. Endoscopy, 34(02), 154–159.
21. Robinson, T. N., & Stiegmann, G. V. (2004). Minimally invasive surgery. Endoscopy, 36(01),
48–51.
22. McDonald, G. J. (2021) Design and modeling of millimeter-scale soft robots for medical
applications (Doctoral dissertation, University of Minnesota).
23. Currò, G., La Malfa, G., Caizzone, A., Rampulla, V., & Navarra, G. (2015). Three-dimensional
(3D) versus two-dimensional (2D) laparoscopic bariatric surgery: A single-surgeon prospec-
tive randomized comparative study. Obesity Surgery, 25(11), 2120–2124.
24. Dogangil, G., Davies, B. L., & Rodriguez, Y., & Baena, F. (2010) A review of medical robotics
for minimally invasive soft tissue surgery. Proceedings of the Institution of Mechanical
Engineers, Part H: Journal of Engineering in Medicine, 224(5), 653–679.
25. Yu, L., Wang, Z., Yu, P., Wang, T., Song, H., & Du, Z. (2014). A new kinematics method
based on a dynamic visual window for a surgical robot. Robotica, 32(4), 571–589.
26. Byrn, J. C., Schluender, S., Divino, C. M., Conrad, J., Gurland, B., Shlasko, E., & Szold,
A. (2007). Three-dimensional imaging improves surgical performance for both novice and
experienced operators using the da Vinci Robot System. The American Journal of Surgery,
193(4), 519–522.
27. Kim, S., Chung, J., Yi, B. J., & Kim, Y. S. (2010). An assistive image-guided surgical robot
system using O-arm fluoroscopy for pedicle screw insertion: Preliminary and cadaveric study.
Neurosurgery, 67(6), 1757–1767.
28. Nagy, T. D., & Haidegger, T. (2019). A dvrk-based framework for surgical subtask automation.
Acta Polytechnica Hungarica (pp.61–78).
29. Millan, B., Nagpal, S., Ding, M., Lee, J. Y., & Kapoor, A. (2021). A scoping review of
emerging and established surgical robotic platforms with applications in urologic surgery.
Société Internationale d’Urologie Journal, 2(5), 300–310
30. Nagyné Elek, R., & Haidegger, T. (2019). Robot-assisted minimally invasive surgical skill
assessment—Manual and automated platforms. Acta Polytechnica Hungarica, 16(8), 141–
169.
31. Okamura, A. M. (2009). Haptic feedback in robot-assisted minimally invasive surgery. Current
Opinion Urology, 19(1), 102.
32. Bark, K., McMahan, W., Remington, A., Gewirtz, J., Wedmid, A., Lee, D. I., & Kuchenbecker,
K. J. (2013). In vivo validation of a system for haptic feedback of tool vibrations in robotic
surgery. Surgical Endoscopy, 27(2), 656–664.
33. Van der Meijden, O. A., & Schijven, M. P. (2009). The value of haptic feedback in conventional
and robot-assisted minimal invasive surgery and virtual reality training: A current review.
Surgical Endoscopy, 23(6), 1180–1190.
34. Bethea, B. T., Okamura, A. M., Kitagawa, M., Fitton, T. P., Cattaneo, S. M., Gott, V. L.,
Baumgartner, W. A., & Yuh, D. D. (2004). Application of haptic feedback to robotic surgery.
Journal of Laparoendoscopic & Advanced Surgical Techniques, 14(3), 191–195.
35. Amirabdollahian, F., Livatino, S., Vahedi, B., Gudipati, R., Sheen, P., Gawrie-Mohan, S., &
Vasdev, N. (2018). Prevalence of haptic feedback in robot-mediated surgery: A systematic
review of literature. Journal of robotic surgery, 12(1), 11–25.
36. Okamura, A. M. (2004). Methods for haptic feedback in teleoperated robot-assisted surgery.
Industrial Robot: An International Journal, 31(6), 499–508.
37. Pacchierotti, C., Scheggi, S., Prattichizzo, D., & Misra, S. (2016). Haptic feedback for
microrobotics applications: A review. Frontiers in Robotics and AI, 3, 53.
38. Yeh, C. H., Su, F. C., Shan, Y. S., Dosaev, M., Selyutskiy, Y., Goryacheva, I., & Ju, M. S.
(2020). Application of piezoelectric actuator to simplified haptic feedback system. Sensors
and Actuators A: Physical, 303, 111820.
39. Okamura, A. M., Dennerlein, J. T., & Howe, R. D. (1998, May). Vibration feedback models for
virtual environments. In Proceedings of the 1998 IEEE International Conference on Robotics
and Automation (Cat. No. 98CH36146) (Vol. 1, pp. 674–679). IEEE.
40. Luostarinen, L. O., Åman, R., & Handroos, H. (2016, October). Haptic joystick for improving
controllability of remote-operated hydraulic mobile machinery. In Fluid Power Systems
Technology (Vol. 50473, p. V001T01A003). American Society of Mechanical Engineers.
41. Shang, W., Su, H., Li, G., & Fischer, G. S. (2013, November). Teleoperation system with hybrid
pneumatic-piezoelectric actuation for MRI-guided needle insertion with haptic feedback. In
2013 IEEE/RSJ International Conference on Intelligent Robots and Systems (pp. 4092–4098).
IEEE.
42. Kim, P., Kim, S., Park, Y. D., & Choi, S. B. (2016). Force modeling for incisions into various
tissues with MRF haptic master. Smart Materials and Structures, 25(3), 035008.
43. Hooshiar, A., Payami, A., Dargahi, J., & Najarian, S. (2021). Magnetostriction-based
force feedback for robot-assisted cardiovascular surgery using smart magnetorheological
elastomers. Mechanical Systems and Signal Processing, 161, 107918.
44. Shokrollahi, E., Goldenberg, A. A., Drake, J. M., Eastwood, K. W., & Kang, M. (2018,
December). Application of a nonlinear Hammerstein-Wiener estimator in the development
and control of a magnetorheological fluid haptic device for robotic bone biopsy. In Actuators
(Vol. 7, No. 4, p. 83). MDPI.
45. Najmaei, N., Asadian, A., Kermani, M. R., & Patel, R. V. (2015). Design and performance
evaluation of a prototype MRF-based haptic interface for medical applications. IEEE/ASME
Transactions on Mechatronics, 21(1), 110–121.
46. Song, Y., Guo, S., Yin, X., Zhang, L., Wang, Y., Hirata, H., & Ishihara, H. (2018). Design and
performance evaluation of a haptic interface based on MR fluids for endovascular tele-surgery.
Microsystem Technologies, 24(2), 909–918.
47. Kikuchi, T., Takano, T., Yamaguchi, A., Ikeda, A. and Abe, I. (2021, September). Haptic
interface with twin-driven MR fluid actuator for teleoperation endoscopic surgery system. In
Actuators (Vol. 10, No. 10, p. 245). MDPI.
48. Najmaei, N., Asadian, A., Kermani, M. R. & Patel, R. V. (2015, September). Performance
evaluation of Magneto-Rheological based actuation for haptic feedback in medical applica-
tions. In 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
(pp. 573–578). IEEE.
49. Gao, Q., Zhan, Y., Song, Y., Liu, J., & Wu, J. (2021, August). An MR fluid based master manip-
ulator of the vascular intervention robot with haptic feedback. In 2021 IEEE International
Conference on Mechatronics and Automation (ICMA) (pp. 158–163). IEEE.
50. Nguyen, N. D., Truong, T. D., Nguyen, D. H. & Nguyen, Q. H. (2019, March). Development
of a 3D haptic spherical master manipulator based on MRF actuators. In Active and Passive
Smart Structures and Integrated Systems XIII (Vol. 10967, pp. 431–440). SPIE.
51. Kim, S., Kim, P., Park, C. Y., & Choi, S. B. (2016). A new tactile device using magneto-
rheological sponge cells for medical applications: Experimental investigation. Sensors and
Actuators A: Physical, 239, 61–69.
52. Cha, S. W., Kang, S. R., Hwang, Y. H., & Choi, S. B. (2017, April). A single of MR sponge
tactile sensor design for medical applications. In Active and Passive Smart Structures and
Integrated Systems (Vol. 10164, pp. 520–525). SPIE.
53. Oh, J. S., Sohn, J. W., & Choi, S. B. (2018). Material characterization of hardening soft sponge
featuring MR fluid and application of 6-DOF MR haptic master for robot-assisted surgery.
Materials, 11(8), 1268.
54. Park, Y. J., & Choi, S. B. (2021). A new tactile transfer cell using magnetorheological materials
for robot-assisted minimally invasive surgery. Sensors, 21(9), 3034.
55. Park, Y. J., Yoon, J. Y., Kang, B. H., Kim, G. W., & Choi, S. B. (2020). A tactile device
generating repulsive forces of various human tissues fabricated from magnetic-responsive
fluid in porous polyurethane. Materials, 13(5), 1062.
56. Park, Y. J., Lee, E. S., & Choi, S. B. (2022). A cylindrical grip type of tactile device using
Magneto-Responsive materials integrated with surgical robot console: design and analysis.
Sensors, 22(3), 1085.
57. Martin, J. A., Regehr, G., Reznick, R., Macrae, H., Murnaghan, J., Hutchison, C., & Brown,
M. (1997). Objective structured assessment of technical skill (OSATS) for surgical residents.
British Journal of Surgery, 84(2), 273–278.
58. Vassiliou, M. C., Feldman, L. S., Andrew, C. G., Bergman, S., Leffondré, K., Stanbridge, D., &
Fried, G. M. (2005). A global assessment tool for evaluation of intraoperative laparoscopic
skills. The American Journal of Surgery, 190(1), 107–113.
59. Goh, A. C., Goldfarb, D. W., Sander, J. C., Miles, B. J., & Dunkin, B. J. (2012). Global
evaluative assessment of robotic skills: Validation of a clinical assessment tool to measure
robotic surgical skills. The Journal of Urology, 187(1), 247–252.
60. Insel, A., Carofino, B., Leger, R., Arciero, R., & Mazzocca, A. D. (2009). The development
of an objective model to assess arthroscopic performance. JBJS, 91(9), 2287–2295.
61. Champagne, B. J., Steele, S. R., Hendren, S. K., Bakaki, P. M., Roberts, P. L., Delaney, C. P.,
Brady, J. T., & MacRae, H. M. (2017). The American Society of Colon and Rectal Surgeons
assessment tool for performance of laparoscopic colectomy. Diseases of the Colon & Rectum,
60(7), 738–744.
62. Koehler, R. J., Amsdell, S., Arendt, E. A., Bisson, L. J., Bramen, J. P., Butler, A., Cosgarea, A.
J., Harner, C. D., Garrett, W. E., Olson, T., & Warme, W. J. (2013). The arthroscopic surgical
skill evaluation tool (ASSET). The American Journal of Sports Medicine, 41(6), 1229–1237.
63. Shademan, A., Decker, R. S., Opfermann, J. D., Leonard, S., Krieger, A., & Kim, P. C. (2016).
Supervised autonomous robotic soft tissue surgery. Science Translational Medicine, 8(337),
337ra64–337ra64.
64. Garrow, C. R., Kowalewski, K. F., Li, L., Wagner, M., Schmidt, M. W., Engelhardt, S.,
Hashimoto, D. A., Kenngott, H. G., Bodenstedt, S., Speidel, S., & Mueller-Stich, B. P. (2021).
Machine learning for surgical phase recognition: A systematic review. Annals of Surgery,
273(4), 684–693.
85. Pezzementi, Z., Voros, S., & Hager, G. D. (2009, May). Articulated object tracking by
rendering consistent appearance parts. In 2009 IEEE International Conference on Robotics
and Automation (pp. 3940–3947). IEEE.
86. Bouget, D., Benenson, R., Omran, M., Riffaud, L., Schiele, B., & Jannin, P. (2015). Detecting
surgical tools by modelling local appearance and global shape. IEEE Transactions on Medical
Imaging, 34(12), 2603–2617.
87. Ching, T., Himmelstein, D. S., Beaulieu-Jones, B. K., Kalinin, A. A., Do, B. T., Way, G. P.,
Ferrero, E., Agapow, P. M., Zietz, M., Hoffman, M. M., & Xie, W. (2018). Opportunities and
obstacles for deep learning in biology and medicine. Journal of The Royal Society Interface,
15(141), 20170387.
88. Kalinin, A. A., Higgins, G. A., Reamaroon, N., Soroushmehr, S., Allyn-Feuer, A., Dinov, I.
D., Najarian, K., & Athey, B. D. (2018). Deep learning in pharmacogenomics: From gene
regulation to patient stratification. Pharmacogenomics, 19(7), 629–650.
89. Yong, C. W., Teo, K., Murphy, B. P., Hum, Y. C., Tee, Y. K., Xia, K., & Lai, K. W. (2021).
Knee osteoarthritis severity classification with ordinal regression module. Multimedia Tools
and Applications, 1–13.
90. Tiulpin, A., Thevenot, J., Rahtu, E., Lehenkari, P., & Saarakkala, S. (2018). Automatic knee
osteoarthritis diagnosis from plain radiographs: A deep learning-based approach. Scientific
Reports, 8(1), 1–10.
91. Iglovikov, V. I., Rakhlin, A., Kalinin, A. A., & Shvets, A.A. (2018). Paediatric bone age assess-
ment using deep convolutional neural networks. In Deep learning in medical image analysis
and multimodal learning for clinical decision support (pp. 300–308). Cham: Springer.
92. Garcia-Peraza-Herrera, L. C., Li, W., Fidon, L., Gruijthuijsen, C., Devreker, A., Attilakos,
G., Deprest, J., Vander Poorten, E., Stoyanov, D., Vercauteren, T., & Ourselin, S. (2017,
September). Toolnet: holistically-nested real-time segmentation of robotic surgical tools.
In 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
(pp. 5717–5722). IEEE.
93. Attia, M., Hossny, M., Nahavandi, S., & Asadi, H. (2017, October). Surgical tool segmentation
using a hybrid deep CNN-RNN auto encoder-decoder. In 2017 IEEE International Conference
on Systems, Man, and Cybernetics (SMC) (pp. 3373–3378). IEEE.
94. Pakhomov, D., Premachandran, V., Allan, M., Azizian, M., & Navab, N. (2019, October). Deep
residual learning for instrument segmentation in robotic surgery. In International Workshop
on Machine Learning in Medical Imaging (pp. 566–573). Cham: Springer.
95. Solovyev, R., Kustov, A., Telpukhov, D., Rukhlov, V., & Kalinin, A. (2019, January).
Fixed-point convolutional neural network for real-time video processing in FPGA. In 2019
IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering
(EIConRus) (pp. 1605–1611). IEEE.
96. Shvets, A. A., Rakhlin, A., Kalinin, A. A., & Iglovikov, V. I. (2018, December). Automatic
instrument segmentation in robot-assisted surgery using deep learning. In 2018 17th IEEE
International Conference on Machine Learning and Applications (ICMLA) (pp. 624–628).
IEEE.
97. Ronneberger, O., Fischer, P., & Brox, T. (2015, October). U-net: Convolutional networks for
biomedical image segmentation. In International Conference on Medical Image Computing
and Computer-Assisted Intervention (pp. 234–241). Cham: Springer.
98. Hamad, G. G., & Curet, M. (2010). Minimally invasive surgery. The American Journal of
Surgery, 199(2), 263–265.
99. Phee, S. J., Low, S. C., Huynh, V. A., Kencana, A. P., Sun, Z. L. & Yang, K. (2009, September).
Master and slave transluminal endoscopic robot (MASTER) for natural orifice transluminal
endoscopic surgery. In 2009 Annual International Conference of the IEEE Engineering in
Medicine and Biology Society (pp. 1192–1195). IEEE.
100. Wang, Z., Sun, Z., & Phee, S. J. (2013). Haptic feedback and control of a flexible surgical
endoscopic robot. Computer Methods and Programs in Biomedicine, 112(2), 260–271.
101. Ehrampoosh, S., Dave, M., Kia, M. A., Rablau, C., & Zadeh, M. H. (2013). Providing haptic
feedback in robot-assisted minimally invasive surgery: A direct optical force-sensing solution
for haptic rendering of deformable bodies. Computer Aided Surgery, 18(5–6), 129–141.
102. Akinbiyi, T., Reiley, C. E., Saha, S., Burschka, D., Hasser, C. J., Yuh, D .D. & Okamura, A.
M. (2006, September). Dynamic augmented reality for sensory substitution in robot-assisted
surgical systems. In 2006 International Conference of the IEEE Engineering in Medicine and
Biology Society (pp. 567–570). IEEE.
103. Tavakoli, M., Aziminejad, A., Patel, R. V., & Moallem, M. (2006). Methods and mechanisms
for contact feedback in a robot-assisted minimally invasive environment. Surgical Endoscopy
and Other Interventional Techniques, 20(10), 1570–1579.
104. Hayward, V., Astley, O. R., Cruz-Hernandez, M., Grant, D., & Robles-De-La-Torre, G. (2004).
Haptic interfaces and devices. Sensor Review.
105. Rosen, J., Hannaford, B., MacFarlane, M. P., & Sinanan, M. N. (1999). Force controlled and
teleoperated endoscopic grasper for minimally invasive surgery-experimental performance
evaluation. IEEE Transactions on Biomedical Engineering, 46(10), 1212–1221.
106. Tholey, G., Pillarisetti, A., Green, W., & Desai, J. P. (2004, June). Design, development,
and testing of an automated laparoscopic grasper with 3-D force measurement capability. In
International Symposium on Medical Simulation (pp. 38–48). Berlin, Heidelberg: Springer.
107. Tadano, K., & Kawashima, K. (2010). Development of a master–slave system with force-
sensing abilities using pneumatic actuators for laparoscopic surgery. Advanced Robotics,
24(12), 1763–1783.
108. Valdastri, P., Harada, K., Menciassi, A., Beccai, L., Stefanini, C., Fujie, M., & Dario, P. (2006).
Integration of a miniaturised triaxial force sensor in a minimally invasive surgical tool. IEEE
Transactions on Biomedical Engineering, 53(11), 2397–2400.
109. Howe, R. D., Peine, W. J., Kantarinis, D. A., & Son, J. S. (1995). Remote palpation technology.
IEEE Engineering in Medicine and Biology Magazine, 14(3), 318–323.
110. Ohtsuka, T., Furuse, A., Kohno, T., Nakajima, J., Yagyu, K., & Omata, S. (1995). Application
of a new tactile sensor to thoracoscopic surgery: Experimental and clinical study. The Annals
of Thoracic Surgery, 60(3), 610–614.
111. Lai, W., Cao, L., Xu, Z., Phan, P. T., Shum, P., & Phee, S. J. (2018, May). Distal end force
sensing with optical fiber bragg gratings for tendon-sheath mechanisms in flexible endo-
scopic robots. In 2018 IEEE International Conference on Robotics and Automation (ICRA)
(pp. 5349–5255). IEEE.
112. Kaneko, M., Wada, M., Maekawa, H., & Tanie, K. (1991, January). A new consideration on
tendon-tension control system of robot hands. In Proceedings of the 1991 IEEE International
Conference on Robotics and Automation (pp. 1028–1029). IEEE Computer Society.
113. Lampaert, V., Swevers, J., & Al-Bender, F. (2002). Modification of the Leuven integrated
friction model structure. IEEE Transactions on Automatic Control, 47(4), 683–687.
114. Piatkowski, T. (2014). Dahl and LuGre dynamic friction models—The analysis of selected
properties. Mechanism and Machine Theory, 73, 91–100.
115. Do, T. N., Tjahjowidodo, T., Lau, M. W. S., & Phee, S. J. (2015). Nonlinear friction modelling
and compensation control of hysteresis phenomena for a pair of tendon-sheath actuated
surgical robots. Mechanical Systems and Signal Processing, 60, 770–784.
116. Dinh, B. K., Cappello, L., Xiloyannis, M., & Masia, L. Position control using adaptive
backlash.
117. Dinh, B. K., Cappello, L., Xiloyannis, M., & Masia, L. (2016, October). Position control using
adaptive backlash compensation for bowden cable transmission in soft wearable exoskeleton.
In 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
(pp. 5670–5676). IEEE.
118. Do, T. N., Tjahjowidodo, T., Lau, M. W. S., & Phee, S. J. (2014). An investigation of friction-
based tendon sheath model appropriate for control purposes. Mechanical Systems and Signal
Processing, 42(1–2), 97–114.
119. Do, T. N., Tjahjowidodo, T., Lau, M. W. S., & Phee, S. J. (2015). A new approach of fric-
tion model for tendon-sheath actuated surgical systems: Nonlinear modelling and parameter
identification. Mechanism and Machine Theory, 85, 14–24.
120. Do, T. N., Tjahjowidodo, T., Lau, M. W. S., Yamamoto, T., & Phee, S. J. (2014). Hysteresis
modeling and position control of tendon-sheath mechanism in flexible endoscopic systems.
Mechatronics, 24(1), 12–22.
121. Lenz, I., Lee, H., & Saxena, A. (2015). Deep learning for detecting robotic grasps. The
International Journal of Robotics Research, 34(4–5), 705–724.
122. Tan, Y. P., Liverneaux, P., & Wong, J. K. (2018). Current limitations of surgical robotics in
reconstructive plastic microsurgery. Frontiers in surgery, 5, 22.
123. Longmore, S. K., Naik, G., & Gargiulo, G. D. (2020). Laparoscopic robotic surgery: Current
perspective and future directions. Robotics, 9(2), 42.
124. Camarillo, D. B., Krummel, T. M., & Salisbury, J. K., Jr. (2004). Robotic technology in
surgery: Past, present, and future. The American Journal of Surgery, 188(4), 2–15.
125. Kim, S. S., Dohler, M., & Dasgupta, P. (2018). The Internet of Skills: Use of fifth-generation
telecommunications, haptics and artificial intelligence in robotic surgery. BJU International,
122(3), 356–358.
Deep Reinforcement Learning
for Autonomous Mobile Robot
Navigation
Abstract Numerous fields, such as the military, agriculture, energy, welding, and
automation of surveillance, have benefited greatly from autonomous robots’ contri-
butions. Since mobile robots need to be able to navigate safely and effectively, there
is a strong demand for cutting-edge algorithms. The four requirements for mobile
robot navigation are perception, localization, path planning, and motion
control. Numerous algorithms for autonomous robots have been developed
over the past two decades. The number of algorithms that can navigate and control
robots in dynamic environments is limited, even though the majority of autonomous
robot applications take place in dynamic environments. A qualitative comparison of
the most recent Autonomous Mobile Robot Navigation techniques for controlling
autonomous robots in dynamic environments with safety and uncertainty consid-
erations is presented in this paper. The work covers different aspects, such as the
basic methodology, benchmarking, and the teaching side of the development
process. The structure, pseudocode, tools, and practical, in-depth applications of the
particular Deep Reinforcement Learning algorithms for autonomous mobile robot
navigation are also included in the research. This study provides an overview of
the development of suitable Deep Reinforcement Learning techniques for various
applications.
A. de J. Plasencia-Salgueiro (B)
National Center of Animals for Laboratory (CENPALAB), La Habana, Cuba
e-mail: [email protected]
BioCubaFarma, National Center of Animal for Laboratory (CENPALAB), La Habana, Cuba
1 Introduction
The work is structured as follows. Section 2, Antecedents, briefly relates the
historical development of autonomous robot control, from conventional linear control
to DRL. Section 3, Background, presents the theoretical foundations of AMRN and
Machine Learning (ML), including the requirements and applications of ML algorithms
such as Reinforcement Learning (RL), Convolutional Neural Networks (CNN),
the different approaches to DRL, and Long Short-Term Memory (LSTM), as well as
application requirements, particularly navigation in dynamic environments, safety, and
uncertainty. Section 4, DRL Methods, gives a detailed description of the most
common methods in the recent scientific literature, including the
theoretical conception, representation, logical flow chart, and pseudocode of the
different DRL algorithms and their combinations. Section 5, Design Methodology,
describes the steps to follow in the design of DRL systems
for autonomous navigation and the particularities of benchmarking techniques
under different conceptions. Section 6, Teaching, presents the particularities of the
teaching process for DRL algorithms and two exercises to develop in class using
simulation. Section 7, Discussion, treats the principal concepts and
difficulties exposed in the work. Section 8, Conclusions, provides a brief
summary and the future perspective of the work.
The nomenclature used in this paper is listed in the Abbreviations section.
2 Antecedents
Electronics and ICs (Integrated Circuits) made it possible to control machines with more
flexibility and accuracy, using conventional linear control systems with sensors
providing feedback from the system output.
Linear control is grounded in Control Theory, using mathematical solutions,
specifically linear algebra, implemented on hardware through mechatronics, electronics,
ICs (Integrated Circuits), and micro-controllers.
These systems used sensors to feed back the error and tried to
minimize it in order to stabilize the system output. These linear control systems
relied on linear algebra to derive the function that maps
input to output. This field of interest was known as Automation, and the goal was
to create automatic systems [1].
Non-linear control became more crucial in order to derive the non-linear (or kernel)
function mathematically for more complicated tasks. The reason behind non-
linearity was the fact that input and output had different, and sometimes large,
dimensionality, and the complexity simply could not be modeled using linear control and linear
Fig. 1 Control modules for generating the controlling commands [2] (Pieter Abbeel—UC
Berkeley/OpenAI/Gradescope)
algebra. This was the main motivation and fuel for the rise of non-linear function
learning, that is, how to derive these functions [1].
With the advancement in the computer industry, non-linear control gave birth to
intelligent control, which uses AI for high-level control of robots and systems.
Classical robotics was the dominating approach. These approaches were mostly
application-dependent and highly platform-dependent. Generally speaking, they
were hand-crafted, hand-engineered, and referred to as shallow AI [1].
These architectures are also referred to as GNC (Guidance, Navigation, and
Control) architectures, mostly composed of perception, planning, and control
modules. Perception modules were mostly used for mapping the environment and
localization of the robot inside the environment, Planning modules (also referred to
as navigation modules) to plan the path in terms of motion and mission, and Control
modules for generating the controlling commands (controlling behaviors) required
for the robot kinematics [1] (see Fig. 1).
The main boost in reconsidering the use of Neural Networks (NN) was the intro-
duction of the Back Propagation algorithm as a fast optimization approach. In one
of Geoff Hinton's talks, he explained the biological foundation of back-propagation
and how it might happen in our brains [1].
DRL-based control was initially introduced and coined by a company called
Google DeepMind (www.deepmind.com), which started using this learning
approach for simulated agents in Atari games. The idea is to let the agent learn on
its own until it reaches a human level of play, or perhaps a superior one.
Recent excitement in AI was brought about by this DRL method, the Deep Q-network
(DQN), in Atari games, a simple simulated environment, and in robots used for testing [1].
In DRL, a Deep Neural Network (DNN) is used to extract features from high-
dimensional observations in Reinforcement Learning (RL). Figure 2 shows
how a DNN is used to approximate the Q value for each state and how the agent acts
by observing the environment accordingly.
With the implementation of DRL, the robotics state will transform like in Fig. 3.
3.1 Requirements
Examples of mobile robots (MR) include ships that move through their surroundings,
autonomous vehicles, and spacecraft. Their navigation involves searching for
an optimal or suboptimal route while simultaneously avoiding obstacles and consid-
ering their destination. To simplify this challenge, the majority of researchers have
concentrated solely on the navigation issue in two-dimensional space. The robot’s
sense of perception is its ability to perceive its surroundings.
The effectiveness with which an intelligent robot completes its mission is influ-
enced in part by the properties of the robot’s sensor and control systems, such as its
capacity to plan the trajectory and avoid obstacles.
Sensor monitoring for environments can be used in a broad range of locations.
Mounting sensors on robotic/autonomous systems is one approach to addressing the
issues of mobility and adaptability [4].
In order for efficient robot action to be realized in real time, particularly in
environments that are unknown or uncertain, strict requirements on the robot's
sensor and control system parameters must be met, among them [5]:
– Increasing the precision of the remote sensor information;
– Reduction of sensor signal formation time to a minimum;
– Reducing the processing time of the sensor data;
– Reducing the amount of time required for the robot’s control system to make
decisions in a dynamic or uncertain environment with obstacles;
– Spreading the robots’ functional characteristics through the use of fast calculation
algorithms and effective sensors.
The recommender software, which makes use of machine learning, gets to work
once the anomaly detection software has discovered an anomaly. Using the vehicle's
onboard navigation system or a sensor-equipped compass, together with warning
data from its collision avoidance system, the sensor data are
combined with the robot's current course. An off-policy deep learning (DL)
model is used by the recommender to make recommendations for the MR based on
the current conditions, surroundings, and sensor readings. Thanks to this DL model,
the MR can send the precise coordinates of the anomaly site and, if necessary,
sensor data back to the base for additional investigation as required. This is especially
important when safety is at stake or when investigators can only wear breathing
apparatus or hazardous material suits for a short time. The drone can go straight to
the tagged location while it analyzes additional sensors [4].
Localization
Localization is the method of determining where the robot is in its environment.
Ground or aerial vehicles’ precise positioning and navigation in complex
spatial environments are essential for effective planning, unmanned driving, and
autonomous operation [6].
In fact, the Kalman filter, combined with reinforcement learning, is
regarded as one of the more promising strategies for precise positioning. The RL-
AKF (adaptive Kalman filter navigation algorithm) uses the deep deterministic policy
gradient to find the optimal state estimate and process noise covariance
matrix from the continuous action space, taking the integrated navigation system
as the environment and the negative of the current positioning error as the reward.
When the GNSS signal is unavailable, the RL-AKF significantly improves integrated
navigation’s positioning performance [6].
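To make this idea concrete, the following minimal Python sketch (an illustration only, not the RL-AKF implementation of [6]; the one-dimensional constant-position model, the noise values, and the function names are all assumptions) shows a Kalman filter in which a continuous action sets the process noise covariance and the negative positioning error is returned as the reward.

import numpy as np

def kalman_step(x_est, p_est, z, q, r=0.5):
    """One predict/update cycle of a 1-D Kalman filter.
    x_est, p_est: previous state estimate and its covariance;
    z: position measurement; q: process noise covariance; r: measurement noise covariance."""
    x_pred, p_pred = x_est, p_est + q                        # predict (constant-position model)
    k = p_pred / (p_pred + r)                                # Kalman gain
    return x_pred + k * (z - x_pred), (1.0 - k) * p_pred     # update

def positioning_env_step(x_true, x_est, p_est, action):
    """Environment step in the spirit of RL-AKF: the continuous action chooses the
    process noise covariance, and the reward is the negative positioning error."""
    q = float(np.clip(action, 1e-4, 1.0))       # action -> process noise covariance
    z = x_true + np.random.normal(0.0, 0.7)     # noisy position measurement
    x_est, p_est = kalman_step(x_est, p_est, z, q)
    reward = -abs(x_true - x_est)               # negative of the current positioning error
    return x_est, p_est, reward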
Path-planning
In Path-planning, the robot chooses how to maneuver to reach the goal without
collision.
Even though the majority of mobile robot applications take place in dynamic
environments, there aren’t many algorithms that can guide robots through them [7].
For automatically mapping high-dimensional sensor data to robot motion
commands without referring to the ground truth, DRL algorithms are regarded
as powerful and promising tools. They only require a scalar reward function to
encourage the learning agent to experiment with the environment to determine the
best course of action for each state [8]. Building a modular DQN architecture to
combine data from a variety of vehicle-mounted sensors is demonstrated in [8]. In the
real world, the developed algorithm can fly without hitting anything. Path planning,
3D mapping, and expert demonstrations are not required for the proposed method.
Using an end-to-end CNN, it turns merged sensory data into a robot’s velocity control
input.
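As a rough illustration of such an end-to-end mapping (this is not the architecture of [8]; the input shape, layer sizes, and names below are assumptions), a small convolutional network can turn a fused sensor image into linear and angular velocity commands:

import torch
import torch.nn as nn

class SensorToVelocityNet(nn.Module):
    """Maps a fused sensor image (here assumed to be 1 x 64 x 64) to [linear_v, angular_v]."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2), nn.ReLU(),
            nn.Flatten(),
        )
        self.head = nn.Sequential(
            nn.Linear(32 * 14 * 14, 128), nn.ReLU(),
            nn.Linear(128, 2),              # velocity control input for the robot
        )

    def forward(self, x):
        return self.head(self.features(x))

net = SensorToVelocityNet()
velocity_cmd = net(torch.zeros(1, 1, 64, 64))   # one dummy frame -> tensor of shape (1, 2)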
Motion control
The robot’s movements are controlled in Motion control to follow the desired trajec-
tory. Linear and angular velocities, for example, fall under the category of motion
control [9].
In plain contrast to the conventional framework for hierarchical planning, data-
driven techniques are also being applied to the self-ruling navigation problem as a
result of recent advancements in ML research. Systems that use end-to-end learning
algorithms to find navigation systems that map directly from perceptual inputs to
motion commands, avoiding the traditional hierarchical paradigm, have been devel-
oped in early work of this multiplicity. Without the symbolic, rule-based human
knowledge or engineering design of these systems, a MR can move everywhere [9].
In Fig. 4, the mentioned requirements are linked.
Fig. 4 Requirement interrelation in AMRN [7]
Bengio's group, together with LeCun, was the first to introduce CNNs, and has consistently
proposed them as a leading AI architecture [14]. They successfully applied DL to OCR (Optical
Character Recognition) for document analysis.
Closer examination reveals that there are two versions of the DRL strategy: value-based
and policy-based approaches. Value-based DRL indirectly obtains the agent's
policy by iteratively updating the value function. When the agent reaches an optimal
value, the optimal policy is chosen using the optimal value function. Using the
function approximation method, the policy-based approach directly builds a policy
Fig. 5 a A RL structure using an ANN, b Function Q with ANN type MLP for Q-learning [13]
network. After that, it selects actions within the network to determine the value of
the reward and optimizes the policy network parameters in the gradient direction to
produce an optimized policy that maximizes the value of the reward [15].
Value-Based DRL Methods
Deep Q network
Mnih et al. published an influential preliminary work on DQN in
Nature in 2015, affirming that across 49 Atari games the trained network could perform at a
human level. In DQN, the action-value function is approximated with a DNN,
based on Q-learning and implemented as a CNN. The feedback from game rewards
is used to train the network. The following are DQN's fundamental characteristics [15]:
Double DQN
Hado van Hasselt introduced Double DQN (DDQN). By decomposing the max operation
in the target into action selection and action evaluation, it reduces the
overestimation problem to a minimum. A DQN and a second DNN (the target network)
are combined in this approach, which was developed to address the issue of overestimating Q values
in the models previously discussed. The agent assumes that the action with the highest Q value is
the best option for the next state, but the accuracy of that Q value depends on which
actions have been tried, which rewards were obtained, and which state follows in this trial.
At the beginning of the experiment the agent does not have enough Q values to estimate
the best possibility. Since there are fewer Q values at this point to choose from, the highest Q value may
lead it to take an incorrect action toward the target. DDQN is used to solve this
issue: one network is used to select the action, and the target network
calculates the target Q value for that particular action. DDQN thus helps limit
the overestimation of the Q values, which in turn helps decrease the training
time [3].
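The action-selection/action-evaluation split described above can be sketched in a few lines of Python (a generic illustration with assumed tensor shapes and names, not code taken from [3]):

import torch

def double_dqn_targets(online_net, target_net, rewards, next_states, dones, gamma=0.99):
    """Double DQN targets: the online network selects the next action,
    while the target network evaluates it."""
    with torch.no_grad():
        next_actions = online_net(next_states).argmax(dim=1, keepdim=True)   # select
        next_q = target_net(next_states).gather(1, next_actions).squeeze(1)  # evaluate
        return rewards + gamma * (1.0 - dones) * next_q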
Dueling Q network
A dueling Q network is used to tackle the issues in the DQN model by using
two networks, the current network and the target network. The current network
approximates the Q values, while the target network chooses the next
best action and performs the action it selects. It may not be necessary to
approximate the value of every action in all circumstances: in some gaming settings,
when a collision is imminent it matters whether the agent moves left or right, but in many
other situations the choice of action is less critical. The dueling Q network exploits this.
A dueling network is an architecture built on a single Q network that employs two streams
rather than a single stream after the convolution layers. The state-value estimate and
the advantage function are computed separately by these
two streams and then combined into a single Q value. As a result, the dueling network produces
the Q function, which can be trained with a variety of existing algorithms, such as DDQN and SARSA.
The progression of the dueling deep Q network is the dueling double deep Q network
(D3QN), which will be revisited later [3].
Policy-based DRL methods
The ability to carry out minibatch updates over a number of epochs increases the effectiveness of
sample utilization [15].
The PPO algorithm uses a surrogate objective to optimize the new policy
using the old policy. It is used to improve the new policy's actions in
comparison with the previous policy. However, the training algorithm will become
unstable if the new policy changes too drastically, so the PPO algorithm improves
(clips) the objective function. A detailed explanation of PPO is given
later in Sect. 4.5.
Due to its ability to strike a balance between sample complexity, simplicity, and
time efficiency, PPO outperforms A3C and other on-policy gradient methods [15].
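A minimal sketch of the clipped surrogate objective that PPO optimizes is given below (generic PyTorch code with assumed variable names, intended only as an illustration of the idea, not as the formulation discussed in Sect. 4.5):

import torch

def ppo_clip_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Clipped surrogate objective: keeps the new policy close to the old one."""
    ratio = torch.exp(new_log_probs - old_log_probs)         # pi_new(a|s) / pi_old(a|s)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    return -torch.min(unclipped, clipped).mean()              # minimize the negative objective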
Long Short-Term Memory (LSTM) Recurrent Neural Networks (RNNs)
RNNs are a family of NNs that are not constrained to the feed-forward architecture.
RNNs are obtained by introducing auto or backward connections, that is, recurrent
connections, into feed-forward neural networks [16].
Introducing a recurrent connection introduces the concept of time. This
allows RNNs to take context into account, that is, to remember inputs from the past
by capturing the dynamics of the signal.
Introducing recurrent connections changes the nature of the NN from static to
dynamic and is therefore suitable for analyzing time series.
The simplest recurrent neural unit consists of a network with just one single hidden
layer, with activation function tanh(), and with an auto connection. In this case, the
output, h(t), is also the state of the network, which is fed back into the input—that
is, into the input of the next copy of the unrolled network at time t + 1 [16].
This simple recurrent unit already shows some memory, in the sense that the
current output also depends on previously presented samples at the input layer.
However, this is often not enough to solve most tasks of interest. Something more
powerful is needed, something that can reach farther back into the past than the simple
recurrent unit can. LSTM units were introduced to solve this [16].
LSTM is a more complex type of recurrent unit, using an additional hidden vector,
the cell state or memory state, s(t), and the concept of gates. Figure 6 shows the
structure of an unrolled LSTM unit.
An LSTM layer contains three gates: a forget gate, an input gate, and an output
gate.
LSTM layers are a very powerful recurrent architecture, capable of keeping
the memory of a large number of previous inputs. These layers thus fit—and are
often used to solve—problems involving ordered sequences of data. If the ordered
sequences of data are sorted based on time, then we talk about time series. Indeed,
LSTM-based RNNs have been applied often and successfully to time series analysis
problems. A classic task to solve in time series analysis is demand prediction [16].
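As a small illustration of this use of LSTMs for time series (the library choice, layer sizes, and the demand-prediction framing are assumptions, not taken from [16]), the following sketch reads a window of past values and predicts the next one:

import torch
import torch.nn as nn

class DemandPredictor(nn.Module):
    """LSTM that reads a window of past demand values and predicts the next one."""
    def __init__(self, hidden_size=32):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, 1)

    def forward(self, x):                  # x: (batch, time_steps, 1)
        h, _ = self.lstm(x)                # h: (batch, time_steps, hidden_size)
        return self.out(h[:, -1, :])       # prediction from the last time step

model = DemandPredictor()
window = torch.randn(8, 24, 1)             # batch of 8 sequences, 24 past values each
next_value = model(window)                 # shape (8, 1)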
Extending the ideas behind the success of Deep Q-Learning (DQL) to the continuous
action domain, Lillicrap et al. [20] developed an actor-critic, model-free
algorithm based on the deterministic policy gradient that can operate over continuous
action spaces. The proposed algorithm successfully solves numerous simulated
physics problems, including well-known ones like cart-pole swing-up, dexterous
manipulation, legged locomotion, and car driving, by employing the same learning
algorithm, network architecture, and hyper-parameters. With full access to the
components and subsystems of the domain, the algorithm is able to identify strategies
whose performance resembles that of planning algorithms. It also demonstrates that the algorithm can
learn policies "end-to-end" for a variety of tasks, directly from raw sensor inputs.
The Bellman Eq. (1) provides the recognized definition of RL [21]:

V(s) = max_a [R(s, a) + γ V(s′)] (1)

where:
• V(s): Present value of state s;
• R(s, a): Reward related to action a in state s;
• V(s′): Future value in the future state s′;
• a: Action taken by the agent;
• s: Current agent state;
• γ: Discount factor.
In order for the Bellman equation to be applied to Q-learning, the formula is transformed
so that it calculates the quality of each action in the agent's state, updating the
current estimate (at time t) from the previous one (at time t − 1):

Q_t(s, a) = Q_{t−1}(s, a) + α [R(s, a) + γ max_{a′} Q(s′, a′) − Q_{t−1}(s, a)] (2)
The Q-learning Eq. (2) was the foundation for the DQL algorithm, which makes
Q values available based on the agent’s state so that actions can be taken.
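A direct tabular reading of update (2) can be written in a few lines of Python (a minimal sketch; the table sizes and the example transition are assumptions):

import numpy as np

def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """One application of Eq. (2): move Q(s, a) toward the bootstrapped target."""
    td_target = r + gamma * np.max(Q[s_next])      # R(s, a) + gamma * max_a' Q(s', a')
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q

Q = np.zeros((5, 4))                               # example table: 5 states, 4 actions
Q = q_learning_update(Q, s=0, a=2, r=1.0, s_next=3)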
The agent's Q values in its current state were determined using a dense artificial
NN with four inputs (the sensors) and a hidden layer with thirty artificial neurons, which
performs the Q-learning estimation, as displayed in the network diagram of Fig. 7. The
network's output contains four Q values.
Based on the Q values produced by the network, the action most likely to be chosen
was determined using the SoftMax Eq. (3). After making a decision, the agent received a
positive reward for a good decision and a negative reward for a bad one.

SoftMax(Q_i) = e^{Q_i} / Σ_j e^{Q_j} (3)
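A small sketch of the kind of network just described, four sensor inputs, a hidden layer of thirty neurons, and four Q-value outputs, with SoftMax action selection as in (3), could look as follows (a generic illustration; the layer names and the sampling step are assumptions):

import torch
import torch.nn as nn

q_net = nn.Sequential(              # 4 sensor inputs -> 30 hidden neurons -> 4 Q values
    nn.Linear(4, 30), nn.ReLU(),
    nn.Linear(30, 4),
)

def select_action(sensor_readings):
    """Turn the Q values into probabilities with SoftMax (3) and sample an action."""
    q_values = q_net(sensor_readings)
    probs = torch.softmax(q_values, dim=-1)
    return torch.multinomial(probs, num_samples=1).item()

action = select_action(torch.tensor([0.2, 0.8, 0.1, 0.5]))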
The agent observes a reward signal r(s_t, a_t) delivered by the reward function. The actions are
moving forward, half-turning left, turning left, turning left–right, and turning right. The
MR then moves on to the next observation, s_{t+1}. The cumulative discounted future
reward (4) is:

R_t = Σ_{τ=t}^{T} γ^{τ−t} r(s_τ, a_τ) (4)
and the MR's goal is to maximize the discounted reward. γ is a discount factor
between 0 and 1 that weighs the importance of immediate versus future rewards: the smaller
γ is, the more significant the immediate reward, and vice versa. The termination
time step is denoted by T. The algorithm's goal is to make the action-value function
Q as large as possible. In contrast to DQN, D3QN's Q function is (5):

Q(s, a; θ, α, β) = V(s; θ, β) + A(s, a; θ, α) − (1/|A|) Σ_{a′} A(s, a′; θ, α) (5)

where θ, α, and β are the parameters of the shared CNN and of the two streams of fully
connected layers. A loss function can be used to train the Q-network (6):

L(θ) = (1/n) Σ_{k=1}^{n} (y_k − Q(s_k, a_k; θ))² (6)
Fig. 10 The structure of D3QN network. It has a dueling network and a three-layer CNN [23]
step 2, the subsequent layer makes use of 64 3 × 3 convolution kernels. In the third layer,
64 2 × 2 convolution kernels are used with stride 2. The control network is the
dueling network, which consists of two streams of fully connected
(FC) layers, each stream independently estimating the state value and the advantages
of the actions. In the FC1 layer, there are 512 nodes in each of the two streams. In
the FC2 layer, there are two FC layers, each with six nodes. In the FC3 layer, there
is an FC layer with six nodes. The ReLU function serves as the activation function for
every layer. Figure 11 describes the D3QN model's parameters.
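A hedged PyTorch sketch of a dueling architecture in this spirit is given below; the input resolution, the parameters of the first convolution layer, the single-channel input, and the six-action output head are assumptions where the text does not fully specify them, and the value and advantage streams are combined as in Eq. (5):

import torch
import torch.nn as nn

class D3QNNet(nn.Module):
    """Dueling network: shared conv layers, then separate value and advantage streams."""
    def __init__(self, n_actions=6):
        super().__init__()
        self.conv = nn.Sequential(                       # three-layer CNN (first layer assumed)
            nn.Conv2d(1, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=2, stride=2), nn.ReLU(),
            nn.Flatten(),
        )
        self.value = nn.Sequential(nn.Linear(1024, 512), nn.ReLU(), nn.Linear(512, 1))
        self.advantage = nn.Sequential(nn.Linear(1024, 512), nn.ReLU(), nn.Linear(512, n_actions))

    def forward(self, x):                                # x: (batch, 1, 80, 80) assumed
        h = self.conv(x)
        v, a = self.value(h), self.advantage(h)
        return v + a - a.mean(dim=1, keepdim=True)       # aggregation as in Eq. (5)

q_values = D3QNNet()(torch.zeros(1, 1, 80, 80))          # output shape (1, 6)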
Hafner et al. developed the Dreamer algorithm for robot learning without the use of
simulators. Dreamer builds a world model from a replay buffer of previous experiences
gathered by acting in the environment. The predicted trajectories of the learned model
are used by an actor-critic algorithm to learn actions. Learning updates are decoupled
from data collection to meet latency requirements and permit rapid training without
waiting for the environment. In the implementation, a learner thread continuously trains the
world model and the actor-critic behavior while an actor thread simultaneously
computes actions for environment interaction [24].
World Model Learning: The world model, as shown in Fig. 13, is a DNN that
teaches itself to anticipate the dynamics of the environment (Left). Future represen-
tations are predicted rather than future inputs because sensory inputs can be large
images. This makes massively parallel training with large batch sizes possible and
reduces the number of errors that accumulate. As a result, the world model can be
interpreted as a compact simulation of the environment that the robot acquires on its own. As
it explores the real world, the model keeps improving. The Recurrent State-
Space Model (RSSM), which has four parts, is the foundation for the world model
[24]:
The various physical AMR sensors provide proprioceptive joint readings (the sense of one's
own movement, force, and body position), force sensors, and high-dimensional inputs like RGB
and depth camera images. The encoder network combines all of the sensory inputs x_t into the
stochastic representations z_t. Using its recurrent state h_t, the dynamics model learns to
predict the sequence of stochastic representations. The decoder reconstructs the sensory
inputs to provide a rich signal for learning representations and permits human inspection of
model predictions, but it is not required when learning behaviors from latent rollouts (Fig. 12).
In the authors’ real-world experiments, the AMR must interact with the real
world to discover task rewards, which the reward network learns to predict. It is
also possible to use rewards that are specified by hand in response to the decoded
sensory inputs. All of the world model’s components were jointly optimized using
stochastic backpropagation [24].
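A minimal sketch of the four RSSM components (encoder, dynamics model, decoder, and reward predictor) is given below; the dimensions, distribution choices, and layer types are illustrative assumptions, not the published Dreamer architecture.

```python
import torch
import torch.nn as nn

class TinyRSSM(nn.Module):
    """Sketch of the four world-model components described above: encoder, dynamics
    model, decoder, and reward predictor. All sizes are illustrative assumptions."""
    def __init__(self, obs_dim=64, z_dim=32, h_dim=128, action_dim=4):
        super().__init__()
        self.encoder = nn.Linear(obs_dim + h_dim, 2 * z_dim)    # x_t, h_t -> stochastic z_t
        self.dynamics = nn.GRUCell(z_dim + action_dim, h_dim)   # (z_t, a_t), h_t -> h_{t+1}
        self.decoder = nn.Linear(h_dim + z_dim, obs_dim)        # reconstruct x_t (training signal)
        self.reward = nn.Linear(h_dim + z_dim, 1)                # predict r_t

    def encode(self, x, h):
        mean, log_std = self.encoder(torch.cat([x, h], -1)).chunk(2, -1)
        return mean + log_std.exp() * torch.randn_like(mean)     # sample the representation z_t

    def step(self, z, a, h):
        return self.dynamics(torch.cat([z, a], -1), h)            # advance the recurrent state

    def decode(self, h, z):
        hz = torch.cat([h, z], -1)
        return self.decoder(hz), self.reward(hz)                  # reconstruction and reward
```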
Fig. 12 Dreamer algorithm uses a direct method for learning on AMR hardware without the use
of simulators. The robot’s experience is collected by the current learned policy. The replay buffer is
expanded by this experience. Through supervised learning, the world model is trained on replayed
off-policy sequences. An actor-critic algorithm uses imagined rollouts in the world model’s latent
space to improve a NN policy. Low-latency action computation is made possible by parallel data
collection and NN learning, and learning steps can continue while the AMR is moving [24]
Actor-Critic algorithm
The world model represents task-independent knowledge about the dynamics, whereas the
behavior that the actor-critic algorithm learns is specific to the task at hand. Behaviors are
learned from rollouts predicted by the world model in latent space, without decoding
observations, as depicted in Fig. 13 (right). Similar to specialized modern simulators, this
enables massively parallel behavior learning on a single GPU with typical batch sizes of
16 K. Two NNs make up the actor-critic algorithm [24]:
The job of the actor network is to find a distribution of successful actions that maximizes the
predicted total sum of task rewards for each latent model state s_t. The critic network learns
to anticipate the total amount of future task rewards through temporal difference learning.
This is important because it enables the algorithm to learn long-term strategies by, for
example, taking into account rewards beyond the H = 16 step planning horizon. The critic is
regressed toward the return of the predicted model state trajectory. A simple option is to
calculate the return as the sum of N intermediate rewards plus the critic's prediction at the
state N steps ahead. The computed returns average over all N in [1, H−1] rather than using an
arbitrary N value [24]:
Fig. 13 NN training in Hafner et al.'s Dreamer algorithm (2019, 2020) for rapid robot learning
in real-world situations. Dreamer comprises two neural network parts. Left: The world model is
structured as a deep Kalman filter trained on replay buffer subsequences. The encoder combines
all sensory modalities into discrete codes. The decoder provides a significant learning signal and
makes it possible for humans to examine model predictions by reconstructing the inputs from the
codes. A RSSM is trained to predict subsequent codes based on actions without observing
intermediate inputs. Right: The world model enables massively parallel policy optimization with
large batch sizes without having to reconstruct sensory inputs. Dreamer trains a value and policy
network using imagined rollouts and a learned reward function [24]
V_t^{\lambda} = r_t + \gamma \left[ (1 - \lambda)\, v(s_{t+1}) + \lambda V_{t+1}^{\lambda} \right], \qquad V_H^{\lambda} = v(s_H). \qquad (9)
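The recursion in (9) can be computed backwards over an imagined trajectory, as in the following sketch; the tensor layout and the γ and λ values are assumptions of this illustration.

```python
import torch

def lambda_returns(rewards, values, gamma=0.99, lam=0.95):
    """Recursive λ-returns of Eq. (9): V^λ_t = r_t + γ[(1-λ) v(s_{t+1}) + λ V^λ_{t+1}],
    with V^λ_H = v(s_H). `rewards` has length H, `values` length H+1 (bootstrap value last)."""
    H = rewards.shape[0]
    returns = torch.zeros(H)
    next_return = values[-1]                       # V^λ_H = v(s_H)
    for t in reversed(range(H)):
        next_return = rewards[t] + gamma * ((1 - lam) * values[t + 1] + lam * next_return)
        returns[t] = next_return
    return returns
```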
The actor network is trained to maximize returns, whereas the critic network is trained to
regress the λ-returns. Two examples of gradient estimators for the actor's policy gradient are
REINFORCE and the reparameterization trick (Rezende and others), the latter using the
differentiable dynamics network to directly backpropagate return gradients. Following Hafner
et al., REINFORCE gradients were selected for discrete action tasks and reparameterization
gradients for continuous control tasks. In addition to maximizing returns, the actor is
encouraged to maintain a high entropy level throughout training in order to avoid a
deterministic policy collapse and maintain some exploration [24]:
L(\pi) = -\mathrm{E}\left[ \sum_{t=1}^{H} \left( \ln \pi(a_t \mid s_t)\, \mathrm{sg}\!\left( V_t^{\lambda} - v(s_t) \right) + \eta\, \mathrm{H}\!\left[ \pi(a_t \mid s_t) \right] \right) \right] \qquad (10)
Both the actor and the critic were optimized using the Adam optimizer (Kingma and Ba, 2014).
As is typical in the literature (Mnih et al., 2015; Lillicrap et al., 2015), a slowly updated copy
of the critic network was used to calculate the λ-returns. The world model is not affected by
the gradients of the actor and critic, because propagating them would result in model
predictions that are incorrect and overly optimistic [24].
Based on the research in [25], the distribution of the AMR's current angular velocity is learned
using PPO. The PPO algorithm, which scales to large problems by using only first-order
gradients, can be considered an approximate variant of trust region policy optimization. The
continuous control learning algorithm using PPO is shown in pseudocode in Fig. 14. The
proposed algorithm makes use of the actor-critic architecture.
To begin, the policy π_θ stipulates how the mobile robot progresses through the environment
one step at a time. At each step, the state, the action, and the reward are gathered for later
training. The advantage function is then provided by the temporal difference (TD) error,
which is the difference between the state value V_ϕ(s_t) and the discounted rewards
\sum_{t_1 > t} \gamma^{t_1 - t} r_{t_1}. The actor updates θ by applying a gradient method to
J_PPO(θ), a surrogate function whose probability ratio π_θ(a_t|s_t)/π_old(a_t|s_t) is
maximized. The actor optimizes the new policy π_θ(a_t|s_t) based on the advantage function
and the old policy π_old(a_t|s_t). The larger the advantage function, the more probable the new
policy changes are. However, if the advantage function is too large, the algorithm is very
likely to diverge. Therefore, a KL penalty is introduced to limit the learning step from the
old policy π_old(a_t|s_t) to the new policy π_θ(a_t|s_t). The critic updates ϕ by a gradient
method on L_BL(ϕ), which minimizes the TD-error loss given data of length-T time steps. The
desired change per policy iteration is set by the hyperparameter KL_target. If the actual
change KL[π_old|π_θ] falls below or exceeds the range [β_low KL_target, β_high KL_target],
the scaling term α > 1 adjusts the coefficient of KL[π_old|π_θ].
The clipped surrogate objective is another approach that can be used in place of the KL
penalty coefficient for updating the actor network. The primary objective can be summarized as

L^{\mathrm{CLIP}}(\theta) = \mathrm{E}_t\!\left[ \min\!\left( r_t(\theta)\,\hat{A}_t,\ \mathrm{clip}\!\left(r_t(\theta),\, 1 - \epsilon,\, 1 + \epsilon\right)\hat{A}_t \right) \right] \qquad (11)

where ε = 0.2 is the hyperparameter. The clip term clip(r_t(θ), 1 − ε, 1 + ε)Â_t has the same
motivation as the KL penalty, which is also used to limit too-large policy updates.
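As a hedged illustration of (11), the clipped surrogate can be computed as in the sketch below; the ε value of 0.2 follows the text, while the negation (for use with a gradient-descent optimizer) is an implementation assumption.

```python
import torch

def ppo_clip_loss(log_probs_new, log_probs_old, advantages, eps=0.2):
    """Clipped surrogate objective of Eq. (11), negated for gradient descent."""
    ratio = torch.exp(log_probs_new - log_probs_old)        # r_t(theta) = pi_theta / pi_old
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - eps, 1.0 + eps) * advantages
    return -torch.min(unclipped, clipped).mean()
```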
Reward Function.
To simplify the reward function, only two distinct conditions are used, without normalization
or clipping:
r_t(s_t, a_t) = \begin{cases} r_{\mathrm{move}} & \text{if no collision} \\ r_{\mathrm{collision}} & \text{if collision} \end{cases} \qquad (12)
A positive reward r_move is given to the AMR for freely operating in the environment.
Otherwise, a significant negative reward r_collision is given if the AMR collides with an
obstacle, as detected by a minimum sensor scanning range check. This reward func-
tion encourages the AMR to maintain its lane and avoid collisions as it moves through
the environment.
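Read directly as code, the rule in (12) could be implemented as in the following sketch; the scan threshold and the reward magnitudes are illustrative values only.

```python
def reward(min_scan_range, collision_threshold=0.2, r_move=0.1, r_collision=-10.0):
    # Collision is flagged when the closest laser return falls below the threshold;
    # the threshold and reward magnitudes are illustrative, not taken from the cited work.
    return r_collision if min_scan_range < collision_threshold else r_move
```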
Fig. 15 Top-left: The CMAD-DDQN framework, in which each AMR (or UAV) j equipped with a DDQN
agent interacts with and learns from its closest neighbors in its state space. To boost the
performance of the system as a whole, the AMRs collaborate directly. Bottom-left: The
multi-agent decentralized double deep Q-network (MAD-DDQN) framework, in which each AMR j
equipped with a DDQN agent relies solely on the information it gathers from the surrounding
environment and does not collaborate directly with its immediate neighbors. At each time step,
UAV j's DDQN agent observes its current state s in the environment and updates its trajectory
by selecting an action in accordance with its policy, receiving a reward r and moving to a new
state s′ [25]
It was presumed that, as the agents collaborate with one another in a dynamic shared
environment, they might observe learning uncertainties brought on by other agents'
contradictory policies. The algorithm in Fig. 16 depicts Agent j's direct collaboration with
its neighbors' DDQNs. Agent j adheres to an ε-greedy policy by carrying out an action in its
current state s, then moving to a new state s′ and receiving a reward that reflects its
neighborhood's coverage performance. Moreover, the DDQN procedure depicted in lines 23–31
improves the agent's decisions (Fig. 16).
Due to the AMR's limited understanding of the environment when performing local path-planning
tasks, the issues of path redundancy and local deadlock arise during planning in unfamiliar and
complex environments. A novel algorithm based on the fusion (combination) of LSTM, NN, fuzzy
logic control, and RL was proposed by Guo et al. This algorithm uses the advantages of each
algorithm to overcome the disadvantages of the others. For the purpose of local path planning,
a NN model with LSTM units is first developed. Second, a low-dimensional-input fuzzy logic
control (FL) algorithm is used to collect training data, and the network model LSTM_FT is
pretrained by transfer learning to acquire the required skill. Third, RL is combined with
autonomous learning of new rules from the environment to better adapt to different situations.
In static and dynamic environments, the FL and LSTM_FT algorithms are compared with the fusion
algorithm LSTM_FTR. Numerical simulations show that LSTM_FTR can significantly improve path
planning success rate, path length optimization, and decision-making efficiency compared with
FL. LSTM_FTR can learn new rules and has a higher success rate than LSTM_FT [26]. The
research's simulation phase is still ongoing.
Fig. 16 DDQN for Agent j with direct collaboration with its neighbors [25]
Fig. 17 Representation of DNFS by combining the advantages of fuzzy systems and a DNN [27]
Fig. 18 Sequential DNFS: a fuzzy systems incorporated with a DNN and b a DNN incorporated
with fuzzy systems [28]
Another hybrid strategy is AMRN with DRL [28]. Model migration costs are reduced when the A*
and DDPG methods are used together.
The DDPG algorithm is modeled on the actor-critic algorithm. The procedure shown in Fig. 20,
following the pseudocode of the model in Fig. 19, uses two critics to speed up the training
process. One critic advises the actor on how to avoid collisions and estimates their
probability. The other critic, in addition to instructing the actor on how to reach the goal,
reduces the difference between the input speed and the output speed.
Targeting sampling efficiency and sim-to-real transfer capability, Wei Zhu and co-authors
describe a hierarchical DRL framework for fast and safe navigation in [29]. The low-level DRL
policy enables the robot to move toward the target position while simultaneously maintaining a
safe distance from obstacles; the high-level DRL policy is added to further enhance
navigational safety. A waypoint on the path from the robot to the ultimate goal is chosen as a
sub-goal to avoid sparse rewards and reduce the state space. The path can also be generated
with a local or global map, which can considerably improve the proposed DRL framework's
generalizability, safety, and sampling efficiency. The sub-goal can also be used to reduce the
action space and increase motion efficiency by creating a target-directed representation of
the action space.
The objective is a DRL strategy with high training efficiency for quick and secure
navigation in complex environments that can be used in a variety of environments
and robot platforms. The low-level DRL policy is in charge of quick motion, and
the high-level DRL policy was added to improve obstacle avoidance safety. As a
result, a two-layer DRL framework is built, as shown in Fig. 21. When the sub-goal,
which is a waypoint on the path from the robot to the ultimate goal, is chosen, the
observation space of RL is severely limited. When conventional global path planning
strategies are used to generate the path, which takes into account both obstacles and
the final goal position, the sampling space is further reduced. Due to the inclusion of
the sub-goal, the training effectiveness of this DRL framework is significantly higher
than that of pure DRL methods. The DNN only generates a discrete linear velocity, and the
sub-goal's angular velocity is inversely proportional to its orientation in the robot frame.
Consequently, both the exploration action space and the observation space are reduced.
Additionally, the proposed DRL framework is very adaptable to a variety of robot platforms and
environments for three reasons: (1) the DRL elements are represented by a sub-goal on a
feasible path; (2) the observation includes a high-dimensional sensor scan whose features are
extracted using a DNN; (3) generalized linear and angular velocities are used to convert the
actions into actuator commands [29].
The DQN algorithm is utilized by the value-based RL framework in both low-level
and high-level DRL. The discrete action space was chosen due to its simplicity, even
Fig. 20 The A* with DDPG architecture. The laser input is provided by the robot sensor. The
navigation input comes from the global navigation. Using the velocity output, the mobile base
controls the robot. The gradient and the computed mean squared error (MSE) are used to update
the actor's neural networks [28]
Fig. 21 DRL framework in a hierarchy. The high-level DRL strategy aims to safely avoid obstacles
while the low-level DRL policy is utilized for rapid motion. A 37-dimension laser scan, the robot’s
linear and angular velocities (v and w), and the sub-goal’s position in the robot frame (r and θ) are
all part of the same state input for both the low-level and high-level DRL policies. In contrast, the
high-level DRL policy generates two abstract choices, one of which relates to the low-level DRL
policy, while the low-level DRL policy generates five specific actions. [29]
though the DDPG algorithm and the soft-actor-critic (SAC) framework are required
for smoother motion.
5 Design Methodology
With DRL systems for autonomous navigation, the two most significant issues are
data inefficiency and a lack of generalizability to new goals.
A design methodology for designing DRL applications in autonomous systems is given in [30].
Hillebrand's methodology for the development of DRL-based systems was designed to accommodate
the need for design trade-offs and consecutive design decisions.
The V-Model’s fundamental principles serve as the methodology’s foundation.
Figure 22 describes the process's four consecutive steps.
The Operational Domain Analysis is the first step. In this phase, the operational environment
and the robot's tasks are defined in terms of the use case and requirements. The requirements
serve as the criteria for testing and evaluation.
The second stage is conceptual design. At this point, the primary characteristics of the
reinforcement learning problem are established. Key characteristics include the action space,
the observation space, the reward structure, and the relevant environment conditions.
The Systems Design is the third step. The various design decisions that need to
be made and an understanding of the fundamental factors that influence them are all
part of this phase.
The design of the reward is the first crucial aspect. The goal that the agent is supposed to
achieve is implicitly encoded in the reward.
The selection of an algorithm is the second design decision. When choosing an algorithm, there
are a few things to consider. The first is the kind of action space: the DRL algorithm must be
able to handle either a discrete or a continuous action space.
The design of the NN is the third design decision. In DRL, NNs are used to approximate the
value and policy functions.
The inductive bias is the fourth design decision. Domain heuristics that are utilized to
accelerate the algorithm's learning process or performance are referred to as an inductive bias.
The learning rate is the final design factor; it determines the rate at which the NN is
trained. Virtual commissioning is the fourth step. In this step, X-in-the-Loop techniques and
virtual testbeds are used to evaluate agent performance and integrate the model [30].
5.1 Benchmarking
and user- and environment-related aspects of behavior because of the inherent hetero-
geneity of AMR. This is related to the lack of interventions that provide quantitative
quality metrics as the test’s result and conduct a quantitative system analysis [32].
Techniques for benchmarking
Gazebo (https://gazebosim.org/home): An open-source 3D simulator for robotics
applications is called Gazebo. From 2004 to 2011, Gazebo was a part of the Player
Project. Gazebo became an independent project in 2012, and the Open Source
Robotics Foundation (OSRF) began providing support for it. Open Dynamics Engine
(ODE), Bullet, Simbody, and Dynamic Animation and Robotics Toolkit (DART) are
among the physics engines that are incorporated into Gazebo. Each of these physics
engines is capable of loading a physical model described in XML format as a Simulation
Description Format (SDF) or Unified Robot Description Format (URDF) file. Also, Gazebo allows
users to create their own world, model, sensor, system, visual, and GUI plugins by implementing
C++ Gazebo extensions. This capability enables users to extend the simulator further into more
complex scenarios. OSRF provides a bridge
between Gazebo and Robot Operating System (ROS) with the gazebo_ros plugin
package [33].
ROS is an open-source software framework for robot software development main-
tained by OSRF. ROS is a widely used middleware by robotics researchers to leverage
the communication between different modules in a robot and between different
robots and to maximize the re-usability of robotics code from simulation to the
physical devices. ROS allows different device modules to run as nodes and provides multiple
types of communication layers between the nodes, such as the service, publisher-subscriber,
and action communication models, to satisfy different
purposes. This allows robotics developers to encapsulate, package, and re-use each
of the modules independently. Additionally, it allows each module to be used in both
simulation and physical devices without any modification [33].
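As a minimal illustration of the publisher-subscriber model described above, the following rospy sketch subscribes to a laser scan and publishes velocity commands; the topic names, threshold, and speeds are illustrative assumptions, not part of the cited works.

```python
#!/usr/bin/env python
# Minimal publisher-subscriber sketch of the ROS communication model described above.
import rospy
from geometry_msgs.msg import Twist
from sensor_msgs.msg import LaserScan

def scan_callback(scan):
    # React to the laser scan: stop if an obstacle is closer than 0.5 m, else move forward
    cmd = Twist()
    cmd.linear.x = 0.0 if min(scan.ranges) < 0.5 else 0.2
    cmd_pub.publish(cmd)

if __name__ == "__main__":
    rospy.init_node("simple_navigator")
    cmd_pub = rospy.Publisher("/cmd_vel", Twist, queue_size=10)
    rospy.Subscriber("/scan", LaserScan, scan_callback)
    rospy.spin()   # hand control to the ROS event loop
```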
In robotics, as well as in many other real-world systems, continuous control frameworks are
required. Numerous works provide RL-compatible access to highly realistic robotic simulations
by combining ROS with physics engines like ODE or Bullet. Most of them can be run on real
robotic systems with the same software.
Two examples of the implementation of Gazebo in benchmarking deep reinforce-
ment learning algorithms are the following.
Yue et al. [34] address the case in which the AMR cannot construct an environment map before
moving to its desired position and instead relies only on what is currently visible. Within the
DRL framework, the DQN is used to map the current image to the mobile robot's best action. As
previously stated, it is difficult to directly apply RL in a real-world robot navigation
scenario due to the need for a large number of training examples. Before being used to tackle
the problem in a real mobile robot navigation scenario, the DQN is first trained in the Gazebo
simulation environment. The proposed method has been validated in both simulation and
real-world testing. The experimental results of autonomous mobile robot navigation in the
Gazebo simulation environment demonstrate that the trained DQN is able to
accurately map the current original image to the AMR's optimal action and approximate the
AMR's state action-value function. The experimental results in real-world indoor scenes
demonstrate that the DQN trained in a simulated environment can be utilized in a real-world
indoor environment. The AMR can likewise avoid obstacles and reach the planned area, even in
new conditions where there is interference. As a result, it can be used as an effective and
environmentally adaptable AMRN method by AMRs operating in an unknown environment.
For robots moving in tight spaces [35], the authors assert that mapping, localization, and
control noise could result in collisions when motion planning is based on the conventional
hierarchical autonomous system. In addition, such planning is disabled when no map is
available. To address these issues, the authors employ DRL, a self-decision-making technique,
to self-explore small spaces without a map and avoid collisions. The rectangular safety
region, which represents states and detects collisions
for robots with a rectangular shape, and a meticulously constructed reward function,
which does not require information about the destination, were suggested to be used
for RL using the Gazebo simulator. After that, they test five reinforcement learning
algorithms—DDPG, DQN, SAC, PPO, and PPO-discrete—in a narrow track simu-
lation. After training, the successful DDPG and DQN models can be applied to three
brand-new simulated tracks and three actual tracks. (https://sites.google.com/view/
rl4exploration).
Benchmarking Techniques for Autonomous Navigation
Article [36] confirms, that “a lack of an open-source benchmark and reproducible
learning methods specifically for autonomous navigation makes it difficult for
roboticists to choose what RL algorithm to use for their mobile robots and for
learning researchers to identify current shortcomings of general learning methods
for autonomous navigation”.
Before utilizing DRL approaches for AMRN, the four primary requirements that must be satisfied
are as follows: reasoning about safety, generalization to diverse and novel environments,
learning from limited trial-and-error data, and reasoning under the uncertainty of partially
observed sensory inputs. The four main categories of
learning methods that can satisfy one or more of the aforementioned requirements are
safe RL, memory-based NN architectures, model-based RL, and domain randomiza-
tion. A comprehensive investigation of the extent to which these learning strategies
are capable of meeting these requirements for RL-based navigation systems is carried
out by incorporating them into a brand-new open-source large-scale navigation
benchmark. This benchmarking’s codebase, datasets, and experiment configurations
can be found at https://github.com/Daffan/ros_jackal.
Benchmarking multi-agent deep reinforcement learning algorithms
An open-source framework for multi-robot deep reinforcement learning (MADRL), named
MultiRoboLearn, was proposed by Chen et al. [37]. In terms of generality, efficiency, and
capability in unstructured and large complex environments, it is also important to include
support for multi-robot systems in existing robot learning frameworks. More specifically,
complex tasks such as search and rescue, group formation control, or uneven terrain exploration
require robust, reliable, and dynamic collaboration among robots. MultiRoboLearn acts as a
bridge to link multi-agent DRL algorithms with real-world multi-robot systems. This framework
has two key characteristics compared with other frameworks. On the one hand, compared with
learning-based single-robot frameworks, it considers how robots collaborate to perform tasks
intelligently and how robots communicate with each other efficiently; on the other hand, this
work extends the system to the domain of learning algorithms
(https://github.com/JunfengChenrobotics/MultiRoboLearn).
6 Teaching
7 Discussion
8 Conclusions
Numerous fields have benefited greatly from the contributions of autonomous robots. Since
mobile robots need to navigate safely and effectively, there is a strong demand for innovative
algorithms; paradoxically, the number of algorithms that can navigate and control robots in
dynamic environments is limited, even though the majority of autonomous robot applications take
place in dynamic environments.
With the development of machine learning algorithms, in particular Reinforcement Learning and
Deep Learning, and the creation of Deep Reinforcement Learning algorithms, a wide field of
applications has opened at a new level for Autonomous Mobile Robot Navigation techniques in
dynamic environments under safety and uncertainty considerations.
However, this field evolves very quickly, and for its better development it is necessary to
establish a methodological conception that, first, selects and characterizes the fundamental
Deep Reinforcement Learning algorithms at the code level, makes a qualitative comparison of the
most recent Autonomous Mobile Robot Navigation techniques for control in dynamic environments
with safety and uncertainty considerations, and underlines the most complex and promising
techniques such as fusion, hybrid, and hierarchical frameworks. Second, it includes the design
methodology and establishes the different benchmarking techniques for selecting the best
algorithm according to the specific environment. Finally, and not least significant, it
recommends the tools and the most suitable examples, according to experience, for teaching
autonomous robot navigation using Deep Reinforcement Learning algorithms.
Regarding the future perspective of this work, it is necessary to continue developing the
methodology and to write a more homogeneous and practical document that permits the inclusion
of newly developed algorithms and a better comprehension of the exposed methodology. We hope
that this methodology helps students and researchers in their work.
Abbreviations
AC Actor-Critic Method
AI Artificial Intelligence
AMRN Autonomous Mobile Robot Navigation
ANN Artificial Neural Networks
AR Autonomous Robot
BDL Bayesian Deep Learning
CNN Convolutional Neural Networks
CMAD-DDQN Communication-Enabled Multiagent Decentralized DDQN
DRL Deep Reinforcement Learning
DNN Deep Neural Networks
DNFS Deep Neuro-fuzzy systems
DL Deep Learning
DQN Deep Q-network
DDQN Double DQN
D3QN Dueling Double Deep Q-network
DDPG Deep Deterministic Policy Gradient
References
1. Dargazany, A. (2021). DRL: Deep reinforcement learning for intelligent robot control—Concept,
literature, and future (p. 16). arXiv:2105.13806v1.
2. Abbeel, P. (2016). Deep learning for robotics. In DL-workshop-RS.
3. Balhara, S. (2022). A survey on deep reinforcement learning architectures, applications and
emerging trends. IET Communications, 16.
4. Hodge, V. J. (2020). Deep reinforcement learning for drone navigation using sensor data. Neural
Computing and Applications, 20.
5. Kondratenko, Y., Atamanyuk, I., Sidenko, I. (2022). Machine learning techniques for increasing
efficiency of the robot's sensor and control information processing. Sensors, 22(1062),
31.
6. Gao, X. (2020). RL-AKF: An adaptive kalman filter navigation algorithm based on reinforce-
ment learning for ground vehicles. Remote Sensing, 12(1704), 25.
7. Hewawasam, H. S. (2022). Past, present and future of path-planning algorithms for mobile
robot navigation in dynamic environments. IEEE Industrial Electronics Society, 3(2022), 13.
33. La, W. G. (2022). DeepSim: A reinforcement learning environment build toolkit for ROS and
Gazebo (p. 10). arXiv:2205.08034v1 [cs.LG].
34. Yue, P. (2019). Experimental research on deep reinforcement learning in autonomous
navigation of mobile robot.
35. Tian, Z. (2022). Reinforcement Learning for Self-exploration in Narrow Spaces (Vol. 17, p. 7).
arXiv:2209.08349v1 [cs.RO].
36. Xu, Z. Benchmarking reinforcement learning techniques for autonomous navigation.
37. Chen, J. (2022). MultiRoboLearn: An open-source Framework for Multi-robot Deep Reinforce-
ment Learning (p. 7). arXiv:2209.13760v1 [cs.RO].
38. Dietz, G. (2022). ARtonomous: Introducing middle school students to reinforcement learning
through virtual robotics. In IDC ’22: Interaction Design and Children.
39. Yang, T., Zuo (2022). Target-Oriented teaching path planning with deep reinforcement learning
for cloud computing-assisted instructions. Applied Sciences, 12(9376), 18.
40. Armando Plasencia, Y. S. (2019). Open source robotic simulators platforms for teaching deep
reinforcement learning algorithms. Procedia Computer Science, 150, 9.
41. Coppelia robotics. Retrieved October 10, 2022, from https://www.coppeliarobotics.com/.
42. Quiroga, F. (2022). Position control of a mobile robot through deep reinforcement learning.
Applied Sciences, 12(7194), 17.
43. Zeng, T. (2018). Learning continuous control through proximal policy optimization for mobile
robot navigation. In: 2018 International Conference on Future Technology and Disruptive
Innovation, Hangzhou, China.
44. Tai, L. (2017). Virtual-to-real deep reinforcement learning: continuous control of mobile robots
for mapless navigation. In IROS 2017, Hong Kong.
Event Vision for Autonomous Off-Road
Navigation
Abstract Robotic automation has always been employed to optimize tasks that are
deemed repetitive or hazardous for humans. One instance of such an application
is within transportation, be it in urban environments or other harsh applications.
In said scenarios, it is required for the platform’s operator to be at a heightened
level of awareness at all times to ensure the safety of on-board materials being
transported. Additionally, during longer journeys it is often the case that the driver
might also be required to traverse difficult terrain under extreme conditions. For
instance, low light, fog, or haze-ridden paths. To counter this issue, recent studies
have proven that the assistance of smart systems is necessary to minimize the risk
involved. In order to develop said systems, this chapter discusses a concept of a Deep
Learning (DL) based Vision Navigation (VN) approach capable of terrain analysis
and determining the appropriate steering angle within a margin of safety. Within the
framework of Neuromorphic Vision (NV) and Event Cameras (EC), the proposed concept tackles
several issues in the development of autonomous systems. In particular, a Transformer-based
backbone is used for off-road depth estimation with an event camera to obtain better accuracy
and processing time. The implementation of the above-mentioned deep learning system using an
event camera is supported by the necessary event data processing techniques prior to the
training phase.
Besides, binary convolutions (BN) and alternately spiking convolution paradigms using the
latest technology trends have been deployed as acceleration methods, with efficiency in terms
of energy, latency, and environmental robustness. Initial results hold promising potential for
the future development of real-time projects with event cameras.
H. AlRemeithi
Tawazun Technology and Innovation, Abu Dhabi, United Arab Emirates
e-mail: [email protected]
H. AlRemeithi · F. Zayer (B) · J. Dias · M. Khonji
Khalifa University, Abu Dhabi, United Arab Emirates
e-mail: [email protected]
J. Dias
e-mail: [email protected]
M. Khonji
e-mail: [email protected]
1 Introduction
Despite the advancements in autonomous driving algorithms, there still exists much
room for development within the realm of off-road systems. Current state-of-the-
art techniques for self-driving platforms have matured in the context of inter-city
travel [1, 2] and thus neglects the challenges faced when navigating environments
such as deserts. Uneven terrain and the lack of relevant landmarks and/or significant
features, pose serious challenges when the platform attempts to localize itself or
analyze the terrain to determine suitable navigation routes [3]. When discussing
desert navigation, even for skilled drivers, maneuverability is a complex task to
achieve that requires efficient decision-making. It has been shown that self-driving
platforms are capable of meeting such standards when using a combination of sensors
that measure the state of the robot and its surroundings [2].
Popular modern approaches include stereo-vision in addition to ranging sensors
like LIDARs to map the environment [4]. Utilizing several sensors allow for redun-
dancy and safer navigation, but also at the cost of increased development com-
plexity, system integration requirements, and financial burden [5]. Researchers have
addressed this concern by developing algorithms that minimize on-board sensors,
where in extreme cases a monocular vision-based approach is developed [6]. Whilst
this reduces the computational resources required to run the system in real-time, it
is still feasible to further reduce the system complexity. In the recent years, Neuro-
morphic computing has been researched heavily to further optimize these systems,
and to accommodate these novel architectures, a new kind of vision sensor, Event
cameras, are used in-place of traditional Frame-based cameras. So, from the above,
efficient off-road navigation is yet to be achieved, especially in the context of extreme
environments. The study presented discusses scenarios that may be experienced in
the UAE deserts.
The contribution presented in this chapter is a concept of an end-to-end neural network for
depth and steering estimation in the desert. To the best of the authors' knowledge, this work
is the first to investigate and argue the implications of utilizing a Transformer-based
backbone for off-road depth estimation using an event camera. The implementation of the
above-mentioned deep learning system using an event camera is supported by the necessary event
data processing techniques prior to the training phase. During inference, an acceleration
method, namely Binary Convolutions, is implemented, and initial results hold promising
potential for the future development of real-time projects with event cameras.
The remaining sections of the chapter are as follows. Section 2 presents related work in terms
of off-road navigation and neuromorphic vision, as well as the use of event cameras and their
feasibility. Section 3 discusses event-based vision navigation. Section 4 presents the proposed
end-to-end deep learning navigation model, including event processing, depth estimation,
steering prediction, and the data set. Section 5 discusses the implementation of the system and
its possible acceleration using the binary convolution method. In addition, energy-efficient
processing using a memristive-technology-based neuromorphic infrastructure is proposed in the
case study. Results and discussion are presented in Sect. 6, showing the obtained results and
the main achievements. Finally, conclusions and future work are drawn in Sect. 7.
2 Related Work
Given a desert setting, traditional navigation techniques may not be directly, or even
completely compatible with the setting. Considerable modifications are required
when navigating on off-road environments in terms of mechanical and algorith-
mic design when approaching this issue. In such environments, it is expected to be
exposed to high temperatures, visual distortions due to dust, and instability when
driving on uneven terrains [7, 8]. Due to these challenges, the risk involved for a
human operator is increased significantly and it is often the case that some sort of
enhancement is needed to existing systems [9]. Specifically, in night-time transportation,
convoys may be more inclined to halt the journey to minimize such risks, which results in
profit loss for businesses or delays in critical missions [10–12]. It is crucial to devise a
strategy that can be utilized around the clock and that is standalone, so as to allow system
integration flexibility with multiple platform types. Multiple
paradigms are used to navigate autonomously as mentioned in the section previously,
for example, the well established vision plus ranging sensor configuration (Camera
and LiDAR) [9]. Unfortunately, using such a configuration in harsh and unpredictable
environments is not reliable due to the degradation of LiDAR in performance due to
heat and diffraction by sand particles [13, 14]. The main challenges faced would be
to design a stronger filter to reconstruct the noisy LiDAR data due to the hot tempera-
tures, and the diffraction and refraction from the laser and the sand particles [15, 16].
Although there has been a study that managed to overcome this issue by employing a high-power
LiDAR [17], such a solution is not preferable. It is not viable in this scenario, as adding a
high-power variant will hinder performance when operating at high temperatures. Subsequently,
the complexity of integration is also increased, as larger power supplies and cooling solutions
must be integrated on the platform. Since
Fig. 2 Simplified Neuromorphic sensor architecture [22]
As mentioned previously, since only non-static artifacts are recorded, this means
that operationally the discarded pixels indirectly conserves power. This is seen in the
power consumption of a regular Event camera being approximately 100–140 mW in
an active state; four orders of magnitude less than a frame-based camera [35].
In summary, the advantages are, as listed by [22]:
• Minimal Internal Latency and High Temporal Resolution
• High Dynamic Range Analog Sensor
• Reduced Power Consumption and Heat Dissipation.
Event cameras output asynchronous sparse data according to log intensity differ-
ences through time [26]. Expanding on the advantages listed earlier in the chapter,
this section elaborates further. The ms-scale latency and high dynamic range are a benefit for
robotics applications, especially in navigation. The available readout data from the sensor is
provided in Address-Event Representation, which was first introduced in [36]. Since the data is
asynchronous, unlike frames, it must first be interpreted in terms of timestamps to correlate
an event with a pixel counterpart from frames, or with another event from the same stream, for
further processing. Techniques are discussed further below.
must first be interpreted as frames. Firstly, it must be noted that an event packet from a
neuromorphic camera contains the following data: e(p, x, y, ts). This form of data is referred
to as Address Event Representation (AER) [25, 26, 38]. In the AER packet, p is the polarity of
the event, which implies the direction of the intensity change, showing the old and new states
as seen in Fig. 1. The pixel position is represented by the pair (x, y), and ts indicates the
timestamp at which the event was recorded, i.e., when an external trigger was visually detected
by the silicon retina. The required pre-processing of AER streams consists of accumulating the
packets and overlaying them after passing through an integration block [27]. When raw frames
are also provided, image enhancement is achieved because of the characteristics inherited from
the event camera, as represented visually in Fig. 4.
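A minimal sketch of this accumulation step is shown below; the fixed time window and the signed polarity accumulation are illustrative choices rather than the exact integration block of [27].

```python
import numpy as np

def accumulate_events(events, height, width, window_us=33_000):
    """Integrate an AER stream of tuples (p, x, y, ts) into event frames by accumulating
    polarities over fixed time windows. Window length and signed accumulation are illustrative."""
    events = sorted(events, key=lambda e: e[3])                # sort by timestamp ts
    frames, frame = [], np.zeros((height, width), dtype=np.int16)
    window_end = events[0][3] + window_us if events else 0
    for p, x, y, ts in events:
        if ts >= window_end:                                   # close the current window
            frames.append(frame)
            frame = np.zeros((height, width), dtype=np.int16)
            window_end = ts + window_us
        frame[y, x] += 1 if p > 0 else -1                      # ON events add, OFF events subtract
    frames.append(frame)
    return frames
```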
One drawback of using the approach in Fig. 4 is the saturation of the frame over
time when no motion is present. It is expected to encounter this problem if no raw
frames are provided; generated frames are inferred from events only. On a con-
tinuously moving platform this might not pose a serious issue, but if applied to
surveillance applications, the added redundancy of having a dedicated frame sensor
can be beneficial. Another study [39] demonstrated this concept by using an Event camera unit
that houses both a frame and an event sensor (Fig. 5).
Feature Engineering has been a well-established topic of research for data gathered
using the traditional frame-based camera. Using a similar approach, object recog-
nition on event data can be accomplished through feature detection, followed by
classification. For event data, there are a few feature detection techniques. Corner
features, edges, and lines are the most commonly used features. This section describes
studies that detect corners from event data but were not further investigated for the
use of object recognition. However, characteristics around detected corners could be retrieved
and fed into a classifier for further classification. The stance of using an event-based camera
may be argued by addressing the issues commonly faced when operating a regular camera. In
high-speed vision, it is seldom the case that ideal lighting and an absence of motion blur
exist. As a result, degraded features are registered on the receiving sensor, which in turn
degrades whatever strategy is being employed.
Reconstruction techniques have been developed like in [8, 40, 41] to de-blur images
and generate high-frame rate video sequences. The method discussed in [40] presents
an efficient optimization process that rivals the state-of-the-art in terms of high-speed
reconstruction under varying lighting conditions and dynamic scenes. The approach is described
as an Event-based Double Integral model; the mathematical formulation and derivation can be
found in the original material, and the results are shown in Fig. 6. It was also noted in [8]
that reconstructing
a frame-based video output purely from events is not feasible, if the objective is to
achieve the same temporal resolution. This factor restricts the output video to be only
as fast as the physical limitations of the frame-based counterpart allow. This conclusion has
been debated for some time, and other published research has challenged this perspective,
suggesting the complete removal of frame-based cameras in such approaches. For instance, in
[42] the feature tracking technique defined was independent of actual frames, relying instead
on logarithmic intensities reconstructed from events, as depicted in Fig. 7. The authors also
argue that this approach retains the high-dynamic-range aspect and is preferable to using the
original frames. The tested scenario involved uneven lighting or objects with perceived motion
blur, where the reconstructed frames from events were resistant to these degradation factors,
unlike the original frames.
Since Event cameras are a recent commercial technology, often researchers investi-
gate the applicability of legacy computer vision techniques and their effectiveness
with neuromorphic sensors. Firstly, it is crucial to have a distinct pipeline to calibrate
a vision sensor prior to any algorithm deployment. One method shown in [43] tack-
les this by having flashing patterns in specific intervals which define sharp features
within the video stream. Although the screen is not moving, due to the flashing, the
intensity value is changing with time in that position, which emulates the concept of
motion within a frame, as such, calibration patterns are recorded. Moreover, other
techniques like in [44] surfaced which use a deep learning model to achieve a generic
event camera calibration system. The paper shows that neural-network-based image
reconstruction is ideally suited for the task of intrinsic and extrinsic calibration of
event cameras, rather than depending on blinking patterns or external screens like in
[43]. The benefit of the suggested method is that it allows employing conventional calibration
patterns that do not require active lighting. Furthermore, the technique enables extrinsic
calibration between frame-based and event-based sensors without adding complexity. Both the
simulation and real-world investigations from the paper show that image-reconstruction-based
calibration is accurate under typical distortion models and a wide range of distortion factors
(Fig. 8).
In this section, relevant techniques used for autonomous navigation are discussed.
Challenges related to said techniques will also be addressed with respect to the case
scenario presented by desert environments. Currently, state-of-the-art approaches uti-
lize a combination of Stereoscopic vision setups, LiDAR, and single-view cameras
around the vehicle for enhanced situational awareness (as seen in Tesla vehicles).
However, for this study, research has been limited to front-view for driving assis-
tance or full autonomy. As a result, the discussed methods include the well established
stereo-vision configurations, with the optional addition of a LiDAR, or in extreme
cases, a monocular setup is used. Additionally, is not much reliable research con-
ducted regarding steering using event data, apart from [45, 46]. The presented results
in these studies mainly address a driving scenario of that similar to inter-city naviga-
tion, which may not be fruitful when applied onto an off-road environment, and so,
further investigations such as this chapter must be conducted. Researchers in [47]
have also proposed an event-frame driving dataset for end-to-end neural networks.
This work shall also be extended to accommodate other environments like those in
the Middle East, specifically, the United Arab Emirates (Fig. 9).
Different feature extraction methods are discussed in the literature that handle 3D space
generation from 3D representations [48]. The algorithms mentioned opt for
LiDAR-camera data fusion to generate dense point-clouds for geometric and depth-
completion purposes. The images are first processed from the camera unit and the
LiDAR records the distances based on the surrounding obstacles. Algorithms such as
RANSAC are also employed during the correlation phase to determine the geomet-
ric information of the detected targets. Machine Learning based approaches are also
gaining popularity as they can optimize the disparity map when using a stereo config-
uration alongside a LiDAR, which reduces development complexity when designing
an algorithm and running it in real-time [1, 48, 49]. The purpose of this study is to
investigate reliable techniques and address viable real-time options for off-road navigation;
as such, the previous factors are taken into consideration. Environmental challenges are also
addressed in [15–17]. These works highlight the issues of using a ranging sensor, a LiDAR, in
adverse environments, which may yield sub-optimal data readings. Since the LiDAR is essentially
a laser, performance degradation is expected if the mode of operation is in a harsh off-road
setting, such as that seen in the Arab regions. A study discussing an attenuation profile
similar to that of the sand particles seen in the desert is reported next (Fig. 10).
The data fusion difficulties are apparent, as there has been a shift in the research of leading
institutes, such as DARPA [51] and NASA [52], toward purely vision-based strategies. Avoiding
such complexities allows for simpler hardware integration and simpler software design, with
improved real-time performance as well. Also, since monocular setups require less setup and
calibration than stereo configurations, it is possible to capitalize further on these
optimizations. The reliability of monocular
vision in regular frame cameras has been tested early on in the literature to push the
limits of autonomous systems in the context of off-road terrain analysis [51]. The
terrain traversability tests have been conducted in the DARPA challenge involving
an extensive 132 mile test in 7 h in off-road environments (Fig. 12).
Researchers have also proposed a deep neural network capable of combining both modalities and
achieving better results than the state-of-the-art (by ∼30%) for monocular depth estimation
when tested on the KITTI dataset.
The method developed in [39, 53] presents a network architecture in which voxel grids are
processed by recurrent neural networks to yield logarithmic depth estimations of the terrain.
The model may be reproduced in future work by implementing an encoder-decoder based model with
residual networks as a backbone; Vision Transformers may also be investigated due to their
recently proven reliability [54]. From [53], we see that although contrast information about
the scene is not provided, meaningful representations of events are recorded, which is evident
in the depth estimations shown in the following (Figs. 13 and 14).
From the previous findings and the discussed literature, we can theorize a possible
solution to address the gaps. The main contribution is towards a Realizable Event-
based Deep Learning Neural Network for Autonomous Off-road Navigation. The
system can be designed using well-established frameworks in the community, like
PyTorch, to ensure stability during development. Moreover, the hardware imple-
mentation may be done on a CUDA enabled single-board computer. This is done to
further enhance the real-time performance by the on-board GPU parallelism capa-
bilities [55].
The main task is to optimize the steering angle of the mobile platform during off-road
navigation with the assistance of a vision sensor. In principle, the strategy shall incorporate
depth/height estimation of the terrain and possible obstacles using event and frame fusion. The
system shall contain a pre-processor block to filter noise and enhance the fused data before
injecting them into the deep learning pipeline. In addition,
as seen in [39, 53], for meaningful latent space representation, Encoder-Decoder
based architectures are employed. Some examples to reduce unwanted artifacts and
recreate the scene from event-frame fusion are Variational Auto-Encoders, Adver-
sarial Auto-Encoders, and UNet Architectures. By including relevant loss functions
and optimizers to generate viable depth maps, the yielded results are then taken to the
steering optimizer block. At this stage, the depth maps are analyzed using modern computer
vision techniques, or the steering is inferred from a Neural Network as shown in [45]. The two
main branches of the model, adapted from the aforementioned literature, are the Depth
Estimation branch and the Steering branch (Fig. 15).
The end-to-end model is derived from two independent models developed by
the department of informatics in the University of Zurich. The main contribution
in our concept is to combine the federated systems into one trainable model and
adjust the asynchronous encoder-decoder based embedding backbone into a more
computationally efficient lightweight variant suitable for deployment under restricted
hardware; further details discussed in Sect. 6 of the chapter.
Due to the design of the sensors, they are particularly susceptible to the Background
Activity noise caused by cyclic noise and circuit leakage currents. Since background
activity rises when there is less light or when the sensitivity is increased, a filter
is frequently required for a variety of applications. A noise filter can be helpful in
certain situations for eliminating real events that are caused by slight changes in
the light and maintaining a greater delineation between the mobile target and the
surroundings. For this study, the Neuromorphic camera tool from the Swiss-based
startup iniVation, DV, is used for prototyping and selection of denoising algorithm.
From the literature, two prevalent noise removal techniques are used, coined as
knoise [56] and ynoise [57]. The former algorithm is reported to have O(N) memory complexity
for background-activity removal. The proposed method is preferred for memory-sensitive tasks
where near-sensor implementations and harsh energy and memory constraints are imposed. The
method stores events from the stream as long as they are unique per row and column within a
specific time window. Doing so minimizes the memory utilization of the on-board processor, and
the reported error rates were markedly better than previous spatiotemporal filter designs.
The main use-case for such a filter would be for mobile platforms with limited in-
memory computing resources, such as off-road navigation platforms. The latter,
ynoise, presents a two-stage filtering solution. The method discards background
activity based on the duration of events within a spatiotemporal window around
a hot pixel. Results of knoise, ynoise, and a generic noise removal algorithm from iniVation
are shown next (Fig. 16).
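To illustrate the general idea of such spatiotemporal filtering (this is not the exact knoise or ynoise algorithm), a minimal sketch can keep an event only when a neighbouring pixel has been active within a short time window:

```python
import numpy as np

def filter_background_activity(events, height, width, dt_us=10_000):
    """Keep an event (p, x, y, ts) only if one of its 8 neighbours produced an event
    within the last dt_us microseconds; parameters and window size are illustrative."""
    last_ts = np.full((height, width), -np.inf)
    kept = []
    for p, x, y, ts in sorted(events, key=lambda e: e[3]):
        y0, y1 = max(0, y - 1), min(height, y + 2)
        x0, x1 = max(0, x - 1), min(width, x + 2)
        recent = (ts - last_ts[y0:y1, x0:x1]) <= dt_us
        self_recent = (ts - last_ts[y, x]) <= dt_us
        if recent.sum() - int(self_recent) >= 1:   # at least one *neighbouring* pixel recently active
            kept.append((p, x, y, ts))
        last_ts[y, x] = ts
    return kept
```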
The depth estimation branch is adopted from [39] where the network architecture is
RAMNet. RAMNet is a Recurrent Asynchronous Multimodal Neural Network which
serves as a generalized variant of RNNs that can handle asynchronous datastreams
depending on sensor-specific learnable encoding parameters [39]. The architecture is
a fully convolutional encoder-decoder architecture based on U-Net. The architecture
of the depth estimation branch is shown in Fig. 17.
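For orientation, a strongly simplified U-Net-style encoder-decoder for log-depth prediction is sketched below; the channel counts, input channels, and single skip connection are assumptions and do not reproduce the RAMNet design.

```python
import torch
import torch.nn as nn

def block(c_in, c_out):
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU())

class TinyUNetDepth(nn.Module):
    """Minimal U-Net-style encoder-decoder mapping an event/frame tensor to a log-depth map."""
    def __init__(self, in_channels=5):
        super().__init__()
        self.enc1 = block(in_channels, 32)
        self.enc2 = block(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec1 = block(64, 32)                 # 64 = 32 upsampled + 32 skip channels
        self.head = nn.Conv2d(32, 1, 1)           # one-channel log-depth prediction

    def forward(self, x):
        s1 = self.enc1(x)
        s2 = self.enc2(self.pool(s1))
        d1 = self.dec1(torch.cat([self.up(s2), s1], dim=1))
        return self.head(d1)
```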
It was demonstrated in [45] how a deep learning model can take advantage of the
event stream from a moving platform to determine the steering angles. The steering
branch for this study is based on the previous study, where subtle motion cues during
desert navigation will be fed into the model to learn the necessary steering behaviour
in harsh terrain. For simplicity, this steering algorithm will not be adjusted to take
into consideration slip, but rather only the planar geometry of the terrain to avoid
collision with hills and overturning the vehicle. To implement the discussed approach
from the reference material, the events are dispatched into an accumulator first to
obtain the frames which will be used in a regression task to determine an appropriate
angular motion. As a backbone, ResNet models are to be investigated as a baseline
for the full model (Fig. 18).
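A hedged baseline for this regression task, assuming single-channel accumulated event frames and a torchvision ResNet-18 backbone, could be set up as follows:

```python
import torch.nn as nn
from torchvision.models import resnet18

def build_steering_regressor(in_channels=1):
    """ResNet-18 baseline for steering-angle regression from accumulated event frames.
    Input-channel count and the single regression output are assumptions of this sketch."""
    model = resnet18(weights=None)
    # Accept single-channel accumulated event frames instead of RGB
    model.conv1 = nn.Conv2d(in_channels, 64, kernel_size=7, stride=2, padding=3, bias=False)
    # Replace the classifier with a single-output regression head (steering angle)
    model.fc = nn.Linear(model.fc.in_features, 1)
    return model
```

Such a baseline would typically be trained with a mean-squared-error loss between the predicted and recorded steering angles.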
5 System Implementation
The proposed embedded system for processing and real-time deployment is the
NVIDIA Xavier AGX. From the official datasheet, the developer kit consists of a 512-core Volta
GPU with Tensor Cores, 32 GB of memory, and an 8-core ARM v8.2 64-bit CPU running a Linux-based
distribution. NVIDIA Volta also allows for various data
operations which gives flexibility during the reduction of convolution operations
[55, 59]. Moreover, it has a moderate power requirement of 30−65 W while deliver-
ing desktop-grade performance in a small form-factor. Consequently, the NVIDIA
Xavier is a suitable candidate for the deployment of a standalone real-time system
(Fig. 20).
For a more extreme approach it is also possible to fully implement the proposed
network on an FPGA board as demonstrated by [60–62]. Since the design of an
FPGA framework is not within the scope of the project, as a proof of concept to
motivate further research, this chapter will detail the implementation of convolution
optimization techniques in simple networks on the PYNQ-Z1. A key feature is that it allows for
ease of implementation and rapid prototyping because of the Python-enabled development
environment [60]. System features are highlighted in the next figure (Fig. 21).
When developing an autonomous driving platform, it is often the case that trade-offs between
performance and system complexity are taken into consideration [9]. Efforts to reduce the
challenges are seen in the systems discussed in the previous sections transitioning towards a
single-sensor approach. The solutions are bio-inspired, purely linked to vision and
neuromorphic computing paradigms, as they tend to offer a significant performance boost
[22, 63, 64]. Considerable studies have been
published in the Deep Learning field aiming towards less computationally expensive
inference setups, mainly through CUDA optimizations [55, 59] and FPGA Hardware
Acceleration [60].
The optimizations mentioned in the previous work take advantage of the hardware primarily used in autonomous system deployment, namely NVIDIA development boards. The enhancements improve the pipeline through CUDA programming and minimize the operations needed to perform a convolution. Of the discussed methods, one achieves performance comparable to regular convolution by converting floating-point operations into binary operations [55], while the other reduces the non-trivial elements by averaging the pixels around a point of interest and discarding the surrounding neighbours [59]; the latter is coined Perforated Convolution (Fig. 22).
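To illustrate the binary-operation idea of [55], the sketch below builds a Local Binary Convolution block: fixed, sparse, non-learnable {-1, 0, +1} anchor filters followed by a learnable 1x1 convolution. The sparsity level, number of anchors, and activation are assumptions for illustration.

```python
import torch
import torch.nn as nn

class LBCBlock(nn.Module):
    def __init__(self, in_ch, out_ch, num_anchors=32, sparsity=0.5, k=3):
        super().__init__()
        # Fixed sparse binary anchor weights with values in {-1, 0, +1}
        w = torch.sign(torch.randn(num_anchors, in_ch, k, k))
        w = w * (torch.rand_like(w) < sparsity).float()
        self.binary = nn.Conv2d(in_ch, num_anchors, k, padding=k // 2, bias=False)
        self.binary.weight.data = w
        self.binary.weight.requires_grad = False     # anchors are never trained
        self.act = nn.ReLU()
        self.mix = nn.Conv2d(num_anchors, out_ch, kernel_size=1)  # learnable 1x1 mixing

    def forward(self, x):
        return self.mix(self.act(self.binary(x)))

x = torch.randn(1, 3, 64, 64)
y = LBCBlock(3, 16)(x)        # output shape (1, 16, 64, 64)
```

Only the 1x1 mixing weights are updated during training, which is where the saving in learnable parameters and arithmetic operations comes from.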
When addressing the hardware implementation of deep learning systems, neuromorphic computing platforms are often mentioned. In the context of event-based vision, neuromorphic architectures have recently been favored over traditional von Neumann architectures. This is due to comparative improvements in metrics such as computational speed-up and reduced power consumption [65, 66]. Studies show that developing neuromorphic systems for mobile robotics, such as employing Spiking Neural Networks (SNN), yields faster and less computationally intensive pipelines with reduced power consumption [67, 68]. SNNs are a neuromorphic architecture that provides benefits such as improved parallelism, since neurons fire asynchronously and possibly at the same time. This is realizable because, by design, neuromorphic systems combine the functionality of processing and memory within the same subsystem, which carries out its tasks through the design of the artificial network rather than a set of algorithms as in von Neumann architectures [66]. The following figures demonstrate the main differences between both architectures in addition to the working concept of an SNN (Fig. 23).
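As a small illustration of the asynchronous, event-driven computation style described above, the following sketch simulates a single leaky integrate-and-fire (LIF) neuron, the basic unit of most SNNs; all constants are arbitrary and chosen only for demonstration.

```python
import numpy as np

def lif_step(v, input_current, tau=20.0, v_thresh=1.0, v_reset=0.0, dt=1.0):
    """One Euler step of a leaky integrate-and-fire membrane; returns (potential, spike)."""
    v = v + dt * (-v / tau + input_current)
    if v >= v_thresh:
        return v_reset, 1      # threshold crossed: emit a spike and reset the membrane
    return v, 0

v, spikes = 0.0, []
for current in np.random.uniform(0.0, 0.2, size=100):   # toy input current over 100 steps
    v, s = lif_step(v, current)
    spikes.append(s)
```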
Starting from the two available networks, we propose to change the encoder-decoder CNN backbone into a purely attention-based model, the Vision Transformer, and to use binary convolution layers for the steering regression. For the transformer, we argue that there is an apparent benefit to the real-time and hardware implementation. The advantages include reduced power consumption and execution time. This is seen through the reduced number of parameters in the Vision Transformer model, which implies less processing and, in terms of the hardware implementation of an accelerator, fewer arithmetic operations. Fewer physical computations are also preferable, as they yield lower power consumption for power-constrained platforms, such as those operating in a desert environment, which may have reduced computing capability to deal with extreme heat and other adverse weather conditions. Our proposed solution is to replace the architecture seen in Fig. 17 with the transformer model (Fig. 26).
Studies have proven similar concepts in different domains, and this chapter aims to contribute a first step towards efficient means of depth estimation in desert navigation, based on the above rationale and the following results [81].
Fig. 26 Proposed event transformer model for the system’s integration [80]
The first step of the experimental validation was to recreate the results on GPU and then validate the approach on FPGA. The specific
algorithm optimized was first discussed in [82]. The study by Hu et al. demon-
strated Single-Mesh Reconstruction (SMR) to construct a 3D model from a single
RGB image. The approach depends on consistency between interpolated features
and learnt features through regression:
• Single-view RGB
• Silhouette mask of detected object.
The applicability of SMR is useful in the context of autonomous platforms. Potentially, the platform can establish a 3D framework of itself with respect to the detected 3D obstacles (instead of the artifacts mentioned in [82]). Evidently, this can enhance navigation strategies in the absence of additional sensors. The adjustment of the convolution layers was aligned with the methods discussed to create an LBC layer [55]. The purpose of this experiment was to demonstrate the reduction in processing resources required when performing convolution, whether during training or inference (Table 1).
Clearly, we cannot establish a high level of confidence in this software-based acceleration technique without examining a more relevant network, and specifically the KITTI dataset for self-driving vehicles as a standard. The network described in [83] follows a similar approach to SMR, but applied to pose detection of vehicles. During training, the pipeline is given:
• Single-view RGB
• Egocentric 3D Vehicle Pose Estimation.
The network architecture is shown in Fig. 27. The 2D/3D Intermediate Representation stage is of interest to us, as the main objective is to recreate a 3D framework from 2D inputs. Instead of regular convolution, implementing an LBC block yields the results given in Table 2.
From the results shared above, it is seen that the approach is viable, but its potency decreases as the dataset complexity increases. In the first dataset, singular artifacts were provided without any background or ambiguous features. In the KITTI dataset, however, the vehicles exist in a larger scene, which may lead to incorrect results; the network thus requires more time to determine clear boundaries from the background before transforming the 2D information into a 3D representation.
7 Conclusions
Off-road navigation remains challenging for autonomy due to the complexity of terrain analysis, ranging from depth and steering to slip estimation. The computational requirements for these approaches to run in real time are often high, but with the proposed approach we believe our system to be a first attempt towards efficient real-time computing in constrained settings for off-road navigation. Secondly, the proposed pipeline may be implemented for other systems, such as unmanned aerial platforms, which tend to be deployed for search-and-rescue missions and seldom have sufficient on-board computing resources. This chapter served as a modest introduction to event-based camera systems within the context of off-road navigation. The chapter began by establishing the foundations behind the neuromorphic sensor hardware driving the camera, then moved on to the data processing aspect and the applicability of traditional techniques. The knowledge base was assessed to determine whether traditional techniques are indeed viable with this novel sensor. Furthermore, implementations of a deep learning system utilizing an event camera are also possible through the necessary data processing of the events prior to the training phase. During inference, an acceleration method, namely binary convolutions, was implemented, and the initial results hold promising potential for the future development of real-time projects with event cameras. Future work is still necessary, specifically when addressing data collection within the UAE environment. To summarize, the Event Transformer-Binary CNN (EvT-BCNN) concept proposed in this chapter is a first attempt towards the deployment of memristive-based systems and neuromorphic vision sensors as computing-efficient alternatives to classical vision systems.
Acknowledgements This project is funded by Tawazun Technology & Innovation (TTI), under
Tawazun Economic Council, through the collaboration with Khalifa University. The work shared
is part of an MSc thesis project by Hamad AlRemeithi, and all equipment is provided by TTI.
Professional expertise is also a shared responsibility between both entities, and the authors extend
their deepest gratitude for the opportunity to encourage research in this field.
References
1. Badue, C., Guidolini, R., Carneiro, R. V., Azevedo, P., Cardoso, V. B., Forechi, A., Jesus, L.,
Berriel, R., Paixão, T. M., Mutz, F., de Paula Veronese, L., Oliveira-Santos, T., & De Souza,
A. F. (2021). Self-driving cars: A survey. Expert Systems with Applications, 165.
2. Ni, J., Chen, Y., Chen, Y., Zhu, J., Ali, D., & Cao, W. (2020) A survey on theories and appli-
cations for self-driving cars based on deep learning methods. Applied Sciences (Switzerland),
10.
3. Chen, G., Cao, H., Conradt, J., Tang, H., Rohrbein, F., & Knoll, A. (2020). Event-based neuro-
morphic vision for autonomous driving: A paradigm shift for bio-inspired visual sensing and
perception. IEEE Signal Processing Magazine, 37.
4. Lin, M., Yoon, J., & Kim, B. (2020) Self-driving car location estimation based on a particle-
aided unscented kalman filter. Sensors (Switzerland), 20.
5. Mugunthan, N., Naresh, V. H., & Venkatesh, P. V. (2020). Comparison review on lidar vs camera
in autonomous vehicle. In International Research Journal of Engineering and Technology.
6. Ming, Y., Meng, X., Fan, C., & Yu, H. (2021) Deep learning for monocular depth estimation:
A review. Neurocomputing, 438.
7. Li, X., Tang, B., Ball, J., Doude, M., & Carruth, D. W. (2019). Rollover-free path planning for
off-road autonomous driving. Electronics (Switzerland), 8.
8. Pan, Y., Cheng, C. A., Saigol, K., Lee, K., Yan, X., Theodorou, E. A., & Boots, B. (2020).
Imitation learning for agile autonomous driving. International Journal of Robotics Research,
39.
9. Liu, O., Yuan, S., & Li, Z. (2020). A survey on sensor technologies for unmanned ground
vehicles. In Proceedings of 2020 3rd International Conference on Unmanned Systems, ICUS
2020.
10. Shin, J., Kwak, D. J., & Kim, J. (2021). Autonomous platooning of multiple ground vehicles
in rough terrain. Journal of Field Robotics, 38.
11. Naranjo, J. E., Jiménez, F., Anguita, M., & Rivera, J. L. (2020). Automation kit for dual-mode
military unmanned ground vehicle for surveillance missions. IEEE Intelligent Transportation
Systems Magazine, 12.
12. Browne, M., Macharis, C., Sanchez-diaz, I., Brolinson, M., & Illsjö, R. (2017). Urban traffic
congestion and freight transport : A comparative assessment of three european cities. Interdis-
ciplinary Conference on Production Logistics and Traffic.
13. Zhong, H., Zhou, J., Du, Z., & Xie, L. (2018). A laboratory experimental study on laser
attenuations by dust/sand storms. Journal of Aerosol Science, 121.
14. Koepke, P., Gasteiger, J., & Hess, M. (2015). Technical note: Optical properties of desert aerosol
with non-spherical mineral particles: Data incorporated to opac. Atmospheric Chemistry and
Physics Discussions, 15, 3995–4023.
15. Raja, A. R., Kagalwala, Q. J., Landolsi, T., & El-Tarhuni, M. (2007). Free-space optics chan-
nel characterization under uae weather conditions. In ICSPC 2007 Proceedings - 2007 IEEE
International Conference on Signal Processing and Communications.
16. Vargasrivero, J. R., Gerbich, T., Buschardt, B., & Chen, J. (2021). The effect of spray water
on an automotive lidar sensor: A real-time simulation study. IEEE Transactions on Intelligent
Vehicles.
17. Strawbridge, K. B., Travis, M. S., Firanski, B. J., Brook, J. R., Staebler, R., & Leblanc, T.
(2018). A fully autonomous ozone, aerosol and nighttime water vapor lidar: A synergistic
approach to profiling the atmosphere in the canadian oil sands region. Atmospheric Measure-
ment Techniques, 11.
18. Hummel, B., Kammel, S., Dang, T., Duchow, C., & Stiller, C. (2006). Vision-based path-
planning in unstructured environments. In IEEE Intelligent Vehicles Symposium, Proceedings.
19. Mueller, G. R., & Wuensche, H. J. (2018). Continuous stereo camera calibration in urban
scenarios. In IEEE Conference on Intelligent Transportation Systems, Proceedings, ITSC, 2018-
March.
20. Rankin, A. L., Huertas, A., & Matthies, L. H. (2009). Stereo-vision-based terrain mapping for
off-road autonomous navigation. Unmanned Systems Technology X, I, 7332.
21. Litzenberger, M., Belbachir, A. N., Donath, N., Gritsch, G., Garn, H., Kohn, B., Posch, C., &
Schraml, S. (2006). Estimation of vehicle speed based on asynchronous data from a silicon
retina optical sensor. In IEEE Conference on Intelligent Transportation Systems, Proceedings,
ITSC.
22. Gallego, G., Delbruck, T., Orchard, G., Bartolozzi, C., Taba, B., Censi, A., Leutenegger, S.,
Davison, A. J., Conradt, J., Daniilidis, K., & Scaramuzza, D. (2020). Event-based vision: A
survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44.
23. Delbrück, T., Linares-Barranco, B., Culurciello, E., & Posch, C. (2010). Activity-driven, event-
based vision sensors. In ISCAS 2010 - 2010 IEEE International Symposium on Circuits and
Systems: Nano-Bio Circuit Fabrics and Systems.
24. Rebecq, H., Ranftl, R., Koltun, V., & Scaramuzza, D. (2021). High speed and high dynamic
range video with an event camera. IEEE Transactions on Pattern Analysis and Machine Intel-
ligence, 43.
25. Lichtsteiner, P., Posch, C., & Delbruck, T. (2008). A 128× 128 120 db 15 µs latency asyn-
chronous temporal contrast vision sensor. IEEE Journal of Solid-State Circuits, 43, 566–576.
26. Brändli, C., Berner, R., Yang, M., Liu, S.-C., & Delbruck, T. (2014). A 240 × 180 130 db 3
µs latency global shutter spatiotemporal vision sensor. IEEE Journal of Solid-State Circuits,
49, 2333–2341.
27. Scheerlinck, C., Barnes, N., & Mahony, R. (2019). Continuous-time intensity estimation using
event cameras. Lecture notes in computer science (including subseries Lecture notes in artificial
intelligence and lecture notes in bioinformatics), 11365 LNCS.
28. Gallego, G., Lund, J. E. A., Mueggler, E., Rebecq, H., Delbruck, T., & Scaramuzza, D. (2018).
Event-based, 6-dof camera tracking from photometric depth maps. IEEE Transactions on Pat-
tern Analysis and Machine Intelligence, 40.
29. Mostafavi, M., Wang, L., & Yoon, K. J. (2021). Learning to reconstruct hdr images from events,
with applications to depth and flow prediction. International Journal of Computer Vision, 129.
30. Mueggler, E., Huber, B., & Scaramuzza, D. (2014). Event-based, 6-dof pose tracking for high-
speed maneuvers.
31. Posch, C., Matolin, D., & Wohlgenannt, R. (2011). A qvga 143 db dynamic range frame-free
pwm image sensor with lossless pixel-level video compression and time-domain cds. IEEE
Journal of Solid-State Circuits, 46.
32. Lee, S., Kim, H., & Kim, H. J. (2020). Edge detection for event cameras using intra-pixel-area
events. In 30th British Machine Vision Conference 2019, BMVC 2019.
33. Rebecq, H., Ranftl, R., Koltun, V., & Scaramuzza, D. (2019). Events-to-video: Bringing modern
computer vision to event cameras. In Proceedings of the IEEE Computer Society Conference
on Computer Vision and Pattern Recognition, 2019-June.
34. Xu, H., Gao, Y., Yu, F., & Darrell, T. (2017). End-to-end learning of driving models from
large-scale video datasets. In Proceedings - 30th IEEE Conference on Computer Vision and
Pattern Recognition, CVPR 2017, 2017-January.
35. Xu, H., Gao, Y., Yu, F., & Darrell, T. (2017). End-to-end learning of driving models from
large-scale video datasets. In Proceedings - 30th IEEE Conference on Computer Vision and
Pattern Recognition, CVPR 2017, 2017-January.
36. Boahen, K. A. (2004). A burst-mode word-serial address-event link - I: Transmitter design. IEEE Transactions on Circuits and Systems I: Regular Papers, 51.
37. Wang, C., Buenaposada, J. M., Zhu, R., & Lucey, S. (2018). Learning depth from monocular
videos using direct methods. In Proceedings of the IEEE Computer Society Conference on
Computer Vision and Pattern Recognition.
38. Guo, S., Kang, Z., Wang, L., Zhang, L., Chen, X., Li, S., & Xu, W. (2020). A noise filter for
dynamic vision sensors using self-adjusting threshold.
39. Gehrig, D., Ruegg, M., Gehrig, M., Hidalgo-Carrio, J., & Scaramuzza, D. (2021). Combining
events and frames using recurrent asynchronous multimodal networks for monocular depth
prediction. IEEE Robotics and Automation Letters, 6.
40. Pan, L., Scheerlinck, C., Yu, X., Hartley, R., Liu, M., & Dai, Y. (2019). Bringing a blurry frame
alive at high frame-rate with an event camera. In Proceedings of the IEEE Computer Society
Conference on Computer Vision and Pattern Recognition, 2019-June.
41. Pan, L., Hartley, R., Scheerlinck, C., Liu, M., Yu, X., & Dai, Y. (2022). High frame rate video
reconstruction based on an event camera. IEEE Transactions on Pattern Analysis and Machine
Intelligence, 44.
42. Gehrig, D., Rebecq, H., Gallego, G., & Scaramuzza, D. (2020). Eklt: Asynchronous photometric
feature tracking using events and frames. International Journal of Computer Vision, 128.
43. Saner, D., Wang, O., Heinzle, S., Pritch, Y., Smolic, A., Sorkine-Hornung, A., & Gross, M.
(2014). High-speed object tracking using an asynchronous temporal contrast sensor. In 19th
International Workshop on Vision, Modeling and Visualization, VMV 2014.
44. Muglikar, M., Gehrig, M., Gehrig, D., & Scaramuzza, D. (2021). How to calibrate your event
camera. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Workshops.
45. Maqueda, A. I., Loquercio, A., Gallego, G., Garcia, N., & Scaramuzza, D. (2018). Event-based
vision meets deep learning on steering prediction for self-driving cars. In Proceedings of the
IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
46. Galluppi, F., Denk, C., Meiner, M. C., Stewart, T. C., Plana, L. A., Eliasmith, C., Furber, S.,
& Conradt, J. (2014). Event-based neural computing on an autonomous mobile platform. In
Proceedings - IEEE International Conference on Robotics and Automation.
47. Hu, Y., Binas, J., Neil, D., Liu, S. C., & Delbruck, T. (2020). Ddd20 end-to-end event camera
driving dataset: Fusing frames and events with deep learning for improved steering prediction.
In 2020 IEEE 23rd International Conference on Intelligent Transportation Systems, ITSC 2020.
48. Zhong, H., Wang, H., Wu, Z., Zhang, C., Zheng, Y., & Tang, T. (2021). A survey of lidar and
camera fusion enhancement. Procedia Computer Science, 183.
49. Song, R., Jiang, Z., Li, Y., Shan, Y., & Huang, K. (2018). Calibration of event-based camera
and 3d lidar. In 2018 WRC Symposium on Advanced Robotics and Automation, WRC SARA
2018 - Proceeding.
50. Zhou, Y., Gallego, G., & Shen, S. (2021). Event-based stereo visual odometry. IEEE Transac-
tions on Robotics, 37.
51. Dahlkamp, H., Kaehler, A., Stavens, D., Thrun, S., & Bradski, G. (2007). Self-supervised
monocular road detection in desert terrain. Robotics: Science and Systems, 2.
52. Bayard, D. S., Conway, D. T., Brockers, R., Delaune, J., Matthies, L., Grip, H. F., Merewether,
G., Brown, T., & Martin, A. M. S. (2019). Vision-based navigation for the nasa mars helicopter.
AIAA Scitech 2019 Forum.
53. Hidalgo-Carrio, J., Gehrig, D., & Scaramuzza, D. (2020). Learning monocular dense depth
from events. In Proceedings - 2020 International Conference on 3D Vision, 3DV 2020.
54. Li, Z., Asif, M. S., & Ma, Z. (2022). Event transformer.
55. Juefei-Xu, F., Boddeti, V. N., & Savvides, M. (2017). Local binary convolutional neural net-
works. Proceedings - 30th IEEE Conference on Computer Vision and Pattern Recognition,
CVPR 2017, 2017-January.
56. Khodamoradi, A., & Kastner, R. (2021). O(n)-space spatiotemporal filter for reducing noise in
neuromorphic vision sensors. IEEE Transactions on Emerging Topics in Computing, 9.
57. Feng, Y., Lv, H., Liu, H., Zhang, Y., Xiao, Y., & Han, C. (2020). Event density based denoising
method for dynamic vision sensor. Applied Sciences (Switzerland), 10.
58. Meyer, L., Smíšek, M., Villacampa, A. F., Maza, L. O., Medina, D., Schuster, M. J., Steidle,
F., Vayugundla, M., Müller, M. G., Rebele, B., Wedler, A., & Triebel, R. (2021). The madmax
data set for visual-inertial rover navigation on mars. Journal of Field Robotics, 38.
59. Figurnov, M., Ibraimova, A., Vetrov, D., & Kohli, P. (2016). Perforatedcnns: Acceleration
through elimination of redundant convolutions. Advances in Neural Information Processing
Systems, 29.
60. Salman, A. M., Tulan, A. S., Mohamed, R. Y., Zakhari, M. H., & Mostafa, H. (2020). Compar-
ative study of hardware accelerated convolution neural network on pynq board. In 2nd Novel
Intelligent and Leading Emerging Sciences Conference, NILES 2020.
61. Yoshida, Y., Oiwa, R., & Kawahara, T. (2018). Ternary sparse xnor-net for fpga implementation.
In Proceedings - 7th International Symposium on Next-Generation Electronics. ISNE, 2018.
62. Ding, C., Wang, S., Liu, N., Xu, K., Wang, Y., & Liang, Y. (2019). Req-yolo: A resource-aware,
efficient quantization framework for object detection on fpgas. In FPGA 2019 - Proceedings
of the 2019 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays.
63. Li, J. N., & Tian, Y. H. (2021). Recent advances in neuromorphic vision sensors: A survey.
Jisuanji Xuebao/Chinese Journal of Computers, 44.
64. Chen, G., Cao, H., Aafaque, M., Chen, J., Ye, C., Röhrbein, F., Conradt, J., Chen, K., Bing, Z.,
Liu, X., Hinz, G., Stechele, W., & Knoll, A. (2018) Neuromorphic vision based multivehicle
detection and tracking for intelligent transportation system. Journal of Advanced Transporta-
tion, 2018.
65. Gutierrez-Galan, D., Schoepe, T., Dominguez-Morales, J. P., Jiménez-Fernandez, A., Chicca,
E., & Linares-Barranco, A. (2020). An event-based digital time difference encoder model
implementation for neuromorphic systems.
66. Schuman, C. D., Kulkarni, S. R., Parsa, M., Mitchell, J. P., Date, P., & Kay, B. (2022).
Opportunities for neuromorphic computing algorithms and applications. Nature Computational
Science, 2.
67. Richter, C., Jentzsch, S., Hostettler, R., Garrido, J. A., Ros, E., Knoll, A., et al. (2016). Muscu-
loskeletal robots: Scalability in neural control. IEEE Robotics & Automation Magazine, 23(4),
128–137.
68. Zenke, F., & Gerstner, W. (2014). Limits to high-speed simulations of spiking neural networks
using general-purpose computers. Frontiers in Neuroinformatics, 8.
69. Dupeyroux, J., Hagenaars, J. J., Paredes-Vallés, F., & de Croon, G. C. H. E. (2021). Neuromor-
phic control for optic-flow-based landing of mavs using the loihi processor. In Proceedings -
IEEE International Conference on Robotics and Automation, 2021-May.
70. Mitchell, J. P., Bruer, G., Dean, M. E., Plank, J. S. Rose, G. S., & Schuman, C. D. (2018).
Neon: Neuromorphic control for autonomous robotic navigation. In Proceedings - 2017 IEEE
5th International Symposium on Robotics and Intelligent Sensors, IRIS 2017, 2018-January.
71. Tang, G., Kumar, N., & Michmizos, K. P. (2020). Reinforcement co-learning of deep and
spiking neural networks for energy-efficient mapless navigation with neuromorphic hardware.
In IEEE International Conference on Intelligent Robots and Systems.
72. Rajendran, B., Sebastian, A., Schmuker, M., Srinivasa, N., & Eleftheriou, E. (2019). Low-
power neuromorphic hardware for signal processing applications: A review of architectural
and system-level design approaches. IEEE Signal Processing Magazine, 36.
73. Lahbacha, K., Belgacem, H., Dghais, W., Zayer, F., & Maffucci, A. (2021) High density rram
arrays with improved thermal and signal integrity. In 2021 IEEE 25th Workshop on Signal and
Power Integrity (SPI) (pp. 1–4).
74. Fakhreddine, Z., Lahbacha, K., Melnikov, A., Belgacem, H., de Magistris, M., Dghais, W.,
& Maffucci, A. (2021). Signal and thermal integrity analysis of 3-d stacked resistive random
access memories. IEEE Transactions on Electron Devices, 68(1), 88–94.
75. Zayer, F., Mohammad, B., Saleh, H., & Gianini, G. (2020). Rram crossbar-based in-memory
computation of anisotropic filters for image preprocessing. IEEE Access, 8, 127569–127580.
76. Bettayeb, M., Zayer, F., Abunahla, H., Gianini, G., & Mohammad, B. (2022). An efficient
in-memory computing architecture for image enhancement in ai applications. IEEE Access,
10, 48229–48241.
77. Ajmi, H., Zayer, F., Fredj, A. H., Hamdi, B., Mohammad, B., Werghi, N., & Dias, J.
(2022). Efficient and lightweight in-memory computing architecture for hardware security.
arXiv:2205.11895.
78. Zayer, F., Dghais, W., Benabdeladhim, M., & Hamdi, B. (2019). Low power, ultrafast synaptic
plasticity in 1r-ferroelectric tunnel memristive structure for spiking neural networks. AEU-
International Journal of Electronics and Communications, 100, 56–65.
79. Zayer, F., Dghais, W., & Belgacem, H. (2019). Modeling framework and comparison of mem-
ristive devices and associated stdp learning windows for neuromorphic applications. Journal
of Physics D: Applied Physics, 52(39), 393002.
80. Li, Z., Asif, M., & Ma, Z. (2022). Event transformer.
81. Varma, A., Chawla, H., Zonooz, B., & Arani, E. (2022). Transformers in self-supervised monoc-
ular depth estimation with unknown camera intrinsics.
82. Hu, T., Wang, L., Xu, X., Liu, S., & Jia, J. (2021). Self-supervised 3d mesh reconstruction
from single images. In Proceedings of the IEEE Computer Society Conference on Computer
Vision and Pattern Recognition.
83. Li, S., Yan, Z., Li, H., & Cheng, K. T. (2021). Exploring intermediate representation for
monocular vehicle pose estimation. In Proceedings of the IEEE Computer Society Conference
on Computer Vision and Pattern Recognition.
Multi-armed Bandit Approach for Task
Scheduling of a Fixed-Base Robot in the
Warehouse
1 Introduction
In this chapter, we propose a multi-armed bandit based stochastic scheduler that gives priority to the mobile robot with a higher probability estimate. The proposed approach further ensures coordination between the agents (fixed-base and mobile) considering the temporal and spatial constraints. Coordination between the fixed-base and mobile robots is ensured while scheduling the sub-tasks. The sub-tasks are scheduled so that the fixed-base robot
moves towards the parking spot of the mobile robot at the same time as the mobile
robot reaches the shelf.
We organise the chapter as follows. In Sect. 2, we present the literature review. In
Sect. 3, we elaborate on the problem formulation, along with a detailed explanation of
the motion planning algorithms used. In Sect. 4, we propose a multi-armed bandit
formulation to organise the sequence of tasks of the robot. We report the results with
their analysis and interpretation in Sect. 5. In Sect. 6, we conclude the chapter with a summary of the presented work and directions for future work.
2 Related Work
Task allocation is the assignment of tasks to the agents. In the context of MRS,
multi-robot task allocation (MRTA) was extensively investigated in the literature
(Zlot and Stentz [30], Wang et al. [28], Viguria et al. [26] and Tang and Parker [25]).
However, the assigned tasks are to be scheduled by the individual agents for execution. Task scheduling is the arrangement of tasks during execution. Researchers
investigated task scheduling for a robotic arm’s pick and place operation for several
applications. The work by Stavridis and Doulgeri [23] proposed an online task priority
strategy for assembling two parts considering the relative motions of robotic arms
while avoiding dynamic obstacles. Borrell Méndez et al. [1] investigated a decision
tree model to predict the optimal sequence of tasks for pick and place operation for
a dynamic scenario. The application is to assemble pieces which arrive in a tray to
manufacture footwear. Szczepanski et al. [24] explored a nature-inspired algorithm
for task sequence optimisation considering multiple objectives.
Wang et al. [27] investigated the heterogeneous multi-robot task scheduling for
a robotic arm by predicting the total time for execution of tasks and minimising
the pick and place execution time. However, the picking robot did not choose any
priority in the case when multiple mobile robots approached at the same time. Ho
and Liu [11] investigated the performance of nine pickup-dispatching rules for task
scheduling. The paper by Ho and Liu [11] found that LTIS (Longest Time In System)
rule has the worst performance, whilst GQL (Greater Queue Length) has the best
performance for the multiple-load pickup and dispatch problem. The station that was
not served for a long time will have the top priority in the LTIS rule. On the other
hand, the GQL rule gives priority to the station, which has more pickup requests that
need to be addressed. However, the study did not investigate the tasks that needed
collaboration between heterogeneous robots with complementary abilities. Zhang
and Parker [29] explored four heuristic approaches to solve the multi-robot task
scheduling in the case where robots needed to work in a coalition to accomplish a
task. The proposed methods have tried to schedule the tasks to reduce interference
with other tasks. However, the approach did not use the history of the tasks to prioritise
the scheduling process. The study by Kalempa et al. [12] reported a robust preemptive
task scheduling approach by categorising the tasks as ‘Minor’, ‘Normal’, ‘Major’
and ‘Critical’. The categories are decided based on the number of robots needed
to allocate for the task execution and urgency. ‘Minor’ tasks often do not require
any robot to perform the job. There are alternative means to accomplish these minor
tasks. ‘Normal’ tasks need one robot to finish the task. A task is ’Major’ when two
robots are required to complete the job. For the ‘Critical’ tasks, execution should
ideally be started as soon as the task is generated. A minimum of three robots are
required to accomplish the task. However, the proposed model did not consider the
criticality of the tasks within the categories. Kousi et al. [14] investigated a service-
oriented architecture (SOA) for controlling the execution of in-plant logistics. The
suggested scheduling algorithm in the architecture is search-based. The scheduler
finds all possibilities of alternatives available at the decision horizon and calculates
the utility for each of the alternatives. The task sequences with the highest utility is
then executed. The scheduler continues to generate the task sequences until the task
execution is completed. The utility is calculated by taking the weighted sum of the
consequence for the alternatives, considering the criteria such as distance travelled
and time for execution. However, the study did not consider robots working in a
coalition. To the best of our knowledge, none of the existing works in the current
literature has used the tasks’ history to set the task scheduler’s priority. In this chapter,
we investigate a multi-armed bandit approach to estimate the probability of a task
appearing in the future; using this information, the task scheduler assigns the priority accordingly.
In robotics, the Multi-armed bandit (MAB) approach has been utilised, where the
robots must learn the preferences of the environment to allocate limited resources
among multiple alternatives. Korein and Veloso [13] reported a MAB approach to
learning the users’ preferences to schedule the mobile robots during their spare time
while servicing the users. Claure et al. [2] suggested a MAB approach with fairness
constraints for a robot to distribute resources based on the skill level of humans in
a human collaboration task. Dahiya et al. [3] investigated a MAB formulation to
allocate limited human operators for multiple semi-autonomous robots. Pini et al.
[20] explored a MAB formulation for task partitioning problems in swarm robotics.
Task partitioning can be useful for saving resources, reducing physical interference, and increasing efficiency; however, it can also be costly to coordinate among the different sub-tasks linked to one another. The paper by Pini et al. [20] proposed a MAB
approach to estimate whether a task needed to be partitioned or not. The results are
compared with an ad-hoc algorithm given by Ozgul et al. [19], and the suggested
approach is shown to outperform the ad-hoc approach. Koval et al. [15] investigated a
MAB approach to select the most ‘robust’ trajectory under uncertainty for the rearrangement planning problem (Durrant-Whyte et al. [5]). Eppner and Brock [6] reported a
MAB approach to decide on the best trajectory for a robotic arm to grasp an object
exploiting the environment surrounding the object. Krishnasamy et al. [16] proposed
a MAB formulation to reduce the queue regret of service by learning the service
probabilities over time. The learned probabilities help the server choose a service
with more probability. In this chapter, we propose a MAB formulation to decide the
priority for scheduling pick and place operations for a fixed-base robot (a limited
resource) among multiple mobile robots (competing alternatives) carrying the load
(Fig. 1).
3 Problem Formulation
The task τ [i, j] can be decomposed into the following sub-tasks. (a) Mobile
robot moving to the pickup point Φi , (b) the mobile robot picks up the load
of type Ti , (c) the mobile robot carries the load towards the shelf υ j , (d) the
fixed-base robot moves towards the parking spot of the mobile robot, and (e)
the fixed-base robot picks and places the load Ti onto the shelf.
We simulate the tasks at every time step using a preset probability matrix defined
as P, where P[i, j] is the probability that a task τ [i, j] is going to be generated
within the next time step ‘t’. Note that P is a constant two-dimensional matrix used
to simulate the task requests. At every time-step ‘t’, a new set of tasks are generated
based on the probability matrix P. Hence, we can have multiple requests for a fixed-base robot to execute at any given instant, which are stored in a queue. We define
the queue for γ j as Q j , which contains the list of tasks of the type τ [:, j]. Now, we
investigate the assignment of a priority among the tasks in Q j , using the previous
history of tasks, to reduce the overall task completion time. We use the multi-armed bandit approach to schedule the tasks in Q_j by generating a priority queue Q_j^p. The tasks are scheduled based on the previous history of tasks and the estimated time of arrival(s) of the mobile robot(s) carrying the load towards the shelf.
Fig. 1 Simulation of the environment where η1, η2, and η3 are mobile robots approaching the fixed-base robot γ1 from their respective starting points carrying different load types
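The following sketch illustrates how the task requests described above could be simulated: at every time step, a task τ[i, j] is generated with probability P[i, j] and appended to the queue Q_j of the corresponding shelf. The matrix values and the queue layout are placeholders for illustration.

```python
import random

P = [[0.4, 0.1],     # rows: pickup points i, columns: shelves j
     [0.2, 0.3],
     [0.1, 0.5]]

queues = {j: [] for j in range(len(P[0]))}        # one queue Q_j per shelf

def simulate_step(t):
    """Generate the tasks for one time step according to the preset probability matrix P."""
    for i, row in enumerate(P):
        for j, p in enumerate(row):
            if random.random() < p:
                queues[j].append((i, j, t))       # task tau[i, j] generated at time t

for t in range(100):
    simulate_step(t)
```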
For the motion planning algorithm of the robotic arm, we use the RRT-connect algorithm of Kuffner and LaValle [17]. RRT-connect is a sampling-based motion planning algorithm and an extension of the RRT (Rapidly-exploring Random Tree) algorithm given by LaValle et al. [18], which is probabilistically complete. The RRT algorithm grows a uniformly exploring random tree with every iteration and finds a path if one exists. Initially, we label the entire workspace into free space and obstacle space, which are assumed to be known. If the state of the robot does not collide with the obstacles present in the environment, we consider that state to belong to free space; otherwise, it belongs to obstacle space. The algorithm samples a node (a state in the configuration space) with a bias towards the goal and determines whether it lies in free space or obstacle space. From the nearest neighbour, we steer towards the sampled node, determine a new node, and add it to the tree, provided that a straight, collision-free line exists from the nearest neighbour to the sampled node. The node to which the newly sampled node was attached in the tree is said to be its parent node. The exploration ends when a sampled node connected to the tree lies within a tolerance of the goal. Any node connected to the tree is reachable from the start node.
RRT-connect, on the other hand, explores the environment from both the start and goal regions. The two trees stop exploring when any newly sampled node connected to one tree falls within a tolerance of a node from the other tree. Figure 2 illustrates how the random trees from the start (red) and goal (green) approach each other for an increasing number of iterations. The full path from start to goal can then be found by following the parent nodes from the intersection point of the trees. This algorithm is known to give the quickest solution when the passages between obstacles are not very narrow. In our simulation, we did not have narrow passages to move the robotic arm through; hence RRT-connect is well suited for executing the pick and place task.
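A compact 2D sketch of the RRT-connect idea is given below to make the bidirectional growth concrete. The collision checker, workspace bounds, step size, and the single-step connect are simplifications; a real arm planner operates in configuration space with a proper collision model.

```python
import math
import random

STEP, TOL = 0.5, 0.5

def collision_free(p):
    """Placeholder obstacle test: a single disc-shaped obstacle centred at (5, 5)."""
    return math.dist(p, (5.0, 5.0)) > 1.0

def nearest(tree, q):
    return min(tree, key=lambda n: math.dist(n, q))

def steer(a, b):
    d = math.dist(a, b)
    if d <= STEP:
        return b
    return (a[0] + STEP * (b[0] - a[0]) / d, a[1] + STEP * (b[1] - a[1]) / d)

def extend(tree, parents, q_rand):
    q_near = nearest(tree, q_rand)
    q_new = steer(q_near, q_rand)
    if collision_free(q_new):
        tree.append(q_new)
        parents[q_new] = q_near                   # remember the parent for path recovery
        return q_new
    return None

def rrt_connect(start, goal, iters=2000):
    t_a, t_b = [start], [goal]
    par_a, par_b = {start: None}, {goal: None}
    for _ in range(iters):
        q_rand = (random.uniform(0, 10), random.uniform(0, 10))
        q_new = extend(t_a, par_a, q_rand)
        if q_new is not None:
            q_conn = extend(t_b, par_b, q_new)    # single-step connect towards the new node
            if q_conn is not None and math.dist(q_conn, q_new) < TOL:
                return True                       # trees met: a path exists via the parent maps
        t_a, t_b, par_a, par_b = t_b, t_a, par_b, par_a   # swap the roles of the two trees
    return False

print(rrt_connect((0.0, 0.0), (9.0, 9.0)))
```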
We used the ROS1 navigation framework (also called move_base) for moving the mobile robot to a given goal location while avoiding obstacles (Quigley et al. [21]). The goal of the navigation framework is to localize the robot within the indoor environment map and simultaneously move towards the goal. Fox et al. [7] proposed a probabilistic localization algorithm with great practical success, which places computation where it is needed. The mobile robot has a 360-degree laser scan sensor, which gives the distance of the obstacles (in two dimensions only) around the robot, that is, a 2D point cloud. The Adaptive Monte-Carlo Localization algorithm by Fox et al. [7] uses the 2D point cloud data and estimates the position and orientation of the robot in an indoor environment. The ROS navigation stack allows the robot to navigate
from its current localized position to the goal point. The navigation framework uses a two-level approach with global and local planning algorithms. The goal of the global planner is to generate a path avoiding the static obstacles in the environment. The goal of the local planner is to move the mobile robot along the planned global path while avoiding dynamic obstacles. The local planner greatly reduces the computational load of replanning the global path when the environment changes due to dynamic obstacles. A* (Hart et al. [10]) and Dijkstra's algorithm [4] are popular graph search algorithms which guarantee the optimal solution if a path exists from one point to another. Dijkstra's algorithm is an undirected search algorithm which follows a greedy approach, while A* is a directed search algorithm which uses heuristics to focus the search towards the goal. Both are proven to give optimal solutions, but A* takes less time to reach the goal because the search is directed. We used the A* algorithm as the global planner in the navigation framework. The global planner gives a set of waypoints on the map
for the mobile robot to follow to reach the goal. These waypoints avoid the static
obstacles on the map.
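For reference, sending a navigation goal to the move_base node of the ROS1 navigation framework can be done through actionlib as sketched below; the frame name and the coordinates of the parking spot are placeholders and assume an already configured navigation stack.

```python
import rospy
import actionlib
from move_base_msgs.msg import MoveBaseAction, MoveBaseGoal

rospy.init_node("send_nav_goal")
client = actionlib.SimpleActionClient("move_base", MoveBaseAction)
client.wait_for_server()

goal = MoveBaseGoal()
goal.target_pose.header.frame_id = "map"
goal.target_pose.header.stamp = rospy.Time.now()
goal.target_pose.pose.position.x = 2.0      # placeholder parking-spot coordinates
goal.target_pose.pose.position.y = 1.5
goal.target_pose.pose.orientation.w = 1.0

client.send_goal(goal)                       # global and local planners take over from here
client.wait_for_result()
```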
The mobile robot uses the Dynamic-Window Approach (DWA) Fox et al. [8], a
reactive collision avoidance approach, as the local path planning algorithm in the
navigation framework. The DWA algorithm is proposed for robots equipped with a
synchro-drive system. In a synchro-drive system, all the wheels of the robot orient
in the same direction and rotate with the same angular velocity. The control inputs
for such a system are linear and angular velocities. The DWA algorithm changes the
control inputs of the robot at a particular interval. This approach considers the mobile
robot’s dynamic constraints to narrow the search space for choosing the control input.
Based on the maximum angular and linear accelerations of the motors at any given
instant, the reachable set of control inputs (angular and linear velocity for the mobile
robot) was determined. The reachable set of inputs is discretized uniformly. For each
sample, a kinematic trajectory is generated, and the algorithm estimates the simulated
forward location of the mobile robot. Figure 3 shows the forward simulation of the
’reachable’ velocities for the robot. The reachable velocities are the set of linear and
angular velocities in the control input plane that can be reached within the next time
step of choosing another velocity. For each trajectory of the forward simulation, a
cost is computed. The original implementation of Fox et al. [8] computes the cost
based on clearance from obstacles, progress towards the goal and forward velocity.
The ROS’s implementation is as follows. The cost is the weighted sum of three
components. The weighted sum includes the distance of the path to the endpoint
of the simulated trajectory. Hence, increasing the weight of this component would
make the robot stay on the global path. The second component of the cost is the
Fig. 4 RViz
distance from the goal to the endpoint of the trajectory. Increasing the weight of this
component makes the robot choose any higher velocity to move towards the goal.
The other component is the obstacle cost along the simulated forward trajectory. Map points occupied by obstacles are assigned a very high cost when computing obstacle costs. Hence, if the simulated forward trajectory collides with an obstacle, the cost becomes very high and the (v, ω) pair will not be chosen by the DWA planner (Fig. 4).
This cost depends on the distance from obstacles, proximity to the goal point and
velocity. The trajectory with the least cost is chosen, and the process is repeated
periodically till the goal point is reached.
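The sampling-and-scoring loop of DWA can be summarised by the sketch below: each reachable (v, ω) pair is forward-simulated and scored by a weighted sum of goal distance, obstacle cost, and a velocity reward. The weights and cost terms are simplified stand-ins for the parameters of the ROS implementation, and the path-distance term is omitted.

```python
import math

def simulate(v, w, x=0.0, y=0.0, th=0.0, dt=0.1, horizon=1.0):
    """Forward-simulate a constant (v, w) command and return the trajectory endpoint."""
    t = 0.0
    while t < horizon:
        x += v * math.cos(th) * dt
        y += v * math.sin(th) * dt
        th += w * dt
        t += dt
    return x, y

def score(v, w, goal, obstacles, w_goal=1.0, w_obs=2.0, w_vel=0.1):
    ex, ey = simulate(v, w)
    goal_cost = math.dist((ex, ey), goal)
    clearance = min(math.dist((ex, ey), o) for o in obstacles)
    obs_cost = float("inf") if clearance < 0.3 else 1.0 / clearance   # collisions are forbidden
    return w_goal * goal_cost + w_obs * obs_cost - w_vel * v

goal, obstacles = (3.0, 0.5), [(1.5, 0.0)]
window = [(v * 0.1, w * 0.2) for v in range(0, 6) for w in range(-3, 4)]  # reachable (v, w) set
best_v, best_w = min(window, key=lambda c: score(c[0], c[1], goal, obstacles))
```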
4 Methodology
This section explains the proposed approach to schedule the tasks based on a multi-
armed bandit formulation solved with the ε-greedy algorithm. We can further opti-
mize by scheduling the tasks’ execution so that the fixed-base robot reaches the
parking spot of the mobile robot synchronously when the mobile robot reaches the
workspace of the fixed-base robot. The following subsections explain the modules
associated with the suggested approach and summarise the methodology.
This module prioritises the order of requests by calculating the estimated prob-
ability P ∗ , which is updated at every time stamp and helps us to schedule the
tasks.
$$P^*(i, j) = \frac{1}{N_j(a_T = a(i, j))} \sum_{T=1}^{T=\mathrm{cur}} \tau^*(i, j, T)\, \beta(a_T = a(i, j)) \qquad (1)$$
In Eq. 1, P ∗ (i, j) represents the estimated probability of task request from ith pickup
point to jth shelf. Here, ‘T’ is the variable for the time step when the tasks are
generated. Here, N j (aT = a(i, j)) represents the number of times the action a(i, j)
was chosen by the MAB solver corresponding to the shelf j until the current number
of time-stamps, that is, T=cur. The function β(X ) is a binary function which returns
one if the condition X is satisfied and zero if it is not satisfied. The denominator
$N_j(a_T = a(i, j)) = \sum_{T=1}^{T=\mathrm{cur}} \beta(a_T = a(i, j))$, represents the number of times the
arm i was chosen by the MAB solver at shelf j until the current time-step, that is,
T=cur.
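The sketch below illustrates an ε-greedy update of the estimate P* consistent with Eq. (1), together with the priority ordering derived from it; the array sizes, the value of ε, and the binary reward signal (whether a task from pickup point i actually appeared) are assumptions for illustration.

```python
import numpy as np

n_pickups, n_shelves, eps = 3, 1, 0.1
P_star = np.zeros((n_pickups, n_shelves))       # estimated request probabilities P*(i, j)
N = np.zeros((n_pickups, n_shelves))            # pull counts N_j(a_T = a(i, j))

def choose_arm(j):
    if np.random.rand() < eps:                  # explore with probability epsilon
        return np.random.randint(n_pickups)
    return int(np.argmax(P_star[:, j]))         # otherwise exploit the current estimate

def update(i, j, task_appeared):
    N[i, j] += 1
    # incremental form of the sample average in Eq. (1)
    P_star[i, j] += (float(task_appeared) - P_star[i, j]) / N[i, j]

def priority_order(j):
    """Pickup points sorted from the highest to the lowest estimated probability."""
    return list(np.argsort(-P_star[:, j]))

for t in range(1000):                           # toy interaction loop for a single shelf
    i = choose_arm(0)
    update(i, 0, np.random.rand() < [0.4, 0.2, 0.1][i])
```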
In this subsection, we present the scheduling of the task requests that a fixed-base
robot at a shelf must execute. We assign the priority among the task requests based on
P ∗ we receive from the MAB solvers and the estimated time of arrival(s) of the mobile
robot(s). As explained in Sect. 3.1, accomplishing a task requires a mobile robot and
a fixed-base robot to work in a coalition. The mobile robot carries the package and
parks itself at a parking slot within the reachable workspace of the fixed-base robot.
We schedule the tasks in such a way that the fixed-base robot reaches the mobile robot
at the same time. This formulation helps us achieve collaboration between them to
reduce the overall time to finish the task. From P*, we get the priority order (p) by finding the element with the highest probability in each column. Sorting the indices of the rows from the highest estimate to the lowest gives us the priority order.
The priority order of load types (p) is obtained by sorting the probabilities in the estimated probability of task requests (P*) from highest to least with respect to the maximum element in each row of P*. Hence, we check the highest probability value in every row and compare it across rows. The row numbers from highest to lowest are sorted into the priority order p. We conclude that robots that carry a load type which appears at the beginning of p have a higher prob-
ability of the tasks accumulated than the latter. We assume t j as the set of pickup
requests for the fixed-base robot at shelf j at any given instant. The estimated time of
arrival(s) of the mobile robot(s) is(are) calculated from the path length(s) with cur-
rent velocity as given in Fig. 5. We use the RRT-connect Kuffner and LaValle [17]
algorithm for the motion planning algorithm of the robotic arm. In this algorithm, a
tree from the source and a tree from the goal point are grown towards each other until
they meet. The shortest path from the set of nodes (tree) is then chosen to execute the
movement of the robotic arm. The scheduled requests are executed by the fixed-base
robot when the movement time of the fixed-base robot equals the estimated time
of arrival of the mobile robot. The estimated arrival time is calculated based on the
distance remaining for the mobile robot to travel, divided by the current velocity. The
time taken by the fixed-base robot is calculated by the angular distance (argument θ )
the base joint has to travel and the velocity profile of the controller. The mobile robot
navigates using a global planner A∗ and a local planner dynamic-window approach
Fox et al. [8], which is an online collision avoidance algorithm.
The fixed-base robot has the velocity profile shown in Fig. 6. The angular velocity ω increases with a uniform angular acceleration for θ < 10°, after which ω = 20°/s. For the last 10° of the angular displacement, ω decreases until it becomes
zero. As shown in Algorithm 2, the time taken to execute the movement of the fixed-
base robot, γ1 , is calculated using the angular displacement θ of the base joint of the
fixed-base robot from its current position.
Fig. 6 Velocity profile of the base joint of the fixed base robot
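Under the profile of Fig. 6, the movement time of the base joint can be computed as sketched below; the triangular-profile case for rotations shorter than 20° is an assumption, since only the ramp and cruise behaviour are stated above.

```python
import math

OMEGA_MAX = 20.0     # deg/s cruise speed
RAMP_ANGLE = 10.0    # deg covered during each ramp

def movement_time(theta_deg):
    theta = abs(theta_deg)
    if theta >= 2 * RAMP_ANGLE:
        ramp_time = 2 * (RAMP_ANGLE / (OMEGA_MAX / 2))   # average ramp speed is OMEGA_MAX / 2
        cruise_time = (theta - 2 * RAMP_ANGLE) / OMEGA_MAX
        return ramp_time + cruise_time
    # short move: accelerate and then decelerate without reaching OMEGA_MAX (assumed)
    alpha = OMEGA_MAX ** 2 / (2 * RAMP_ANGLE)            # implied angular acceleration
    return 2 * math.sqrt(theta / alpha)

print(movement_time(90.0))   # about 5.5 s under these assumptions
```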
Figure 7 shows the entire architecture of the proposed methodology and the modules involved in the work. The two modules of the architecture are titled ‘Multi-armed bandit’ and ‘Multi-agent coordination’. The module titled ‘Multi-armed bandit’
gives us the estimated probabilities of the task requests based on the history of
the tasks, as explained in Sect. 4.1. The module titled ‘Multi-agent coordination’
takes into consideration the movement time and estimated time of arrivals and the
probability estimates to plan the sequence of tasks.
Since this work focuses on task scheduling of fixed-base robots, we only consid-
ered one shelf and three pickup points. We have a total of four agents, one fixed-base
robotic arm γ1 , and three different load-carrying mobile robots (η1 , η2 , η3 ), which
start from three different pickup points. Tasks are allocated to the mobile robot, which
can carry the particular load type. Each mobile robot can carry a specific load type
from a pickup point. Hence, after finishing the task, the mobile robots move back to
the pickup point from the shelf to execute future tasks, if any. Figure 8 shows the
flow chart of the execution of requests by a load-carrying robot.
The detailed algorithm is explained in Algorithm 3. The input T is the list of tasks
which are not accomplished. If ‘i’ is in the list T , a load from pickup point ‘i’ is to be
carried to the shelf. We define four states of the mobile robot. State ‘start’ means the
robot is waiting for the load to be deployed at the pickup point. State ‘to shelf’ means
that the mobile robot will move with the load to the shelf at a parking point reachable
to the robotic arm. State ‘pick’ means the robot is waiting for the fixed-base robot
to reach the package to execute the pick and place operation. State ‘to start’ means
the robot has finished its task and is moving back to its corresponding pickup point.
Once the robot reaches the start position, it will follow the same loop if there are
unfinished tasks.
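A minimal sketch of the four mobile-robot states and their transitions is given below; the trigger flags are simplified placeholders for the checks performed in Algorithm 3.

```python
def next_state(state, load_assigned=False, reached_shelf=False,
               load_picked=False, reached_start=False):
    if state == "start" and load_assigned:
        return "to shelf"     # load deployed at the pickup point, drive towards the shelf
    if state == "to shelf" and reached_shelf:
        return "pick"         # parked inside the arm workspace, wait for the pick and place
    if state == "pick" and load_picked:
        return "to start"     # fixed-base robot has finished, return to the pickup point
    if state == "to start" and reached_start:
        return "start"        # ready for the next task, if any
    return state

state = "start"
state = next_state(state, load_assigned=True)   # -> "to shelf"
state = next_state(state, reached_shelf=True)   # -> "pick"
```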
Now, the fixed base robot at a given shelf prioritizes the pickup requests based
on the MAB scheduler algorithm, as explained in Algorithm 1. The output E of
the algorithm gives the order of the tasks that the fixed-base robot must execute.
The algorithm uses the multi-armed bandit formulation to estimate the priority to
be allocated among the mobile robots at that current instant. We only prioritize the
requests approaching the fixed-base robot within a threshold δ equal to the time
taken for executing a pick and place task. We move the fixed-base robot towards the
parking spot of the mobile robot in such a way that the robotic arm could reach for
pickup precisely when the mobile robot delivers the package to make the scheduler
robust. This can be achieved by moving the robotic arm when movement time equals
the estimated time of arrival of the mobile robot. We schedule the tasks based on the
priority p because we want to reduce the waiting time for the mobile robot, which
has more probability of getting the tasks accumulated in future. We investigate the
performance of the MAB task scheduler to a deterministic scheduler which works on
a first-come-first-serve (FCFS) approach. In the FCFS approach, the mobile robot,
which is estimated (based on the ETA) to arrive the earliest to the shelf, would be
scheduled first for the pick and place operation irrespective of the history of the
task requests. The position of the load is estimated using a classical colour-based
object detection approach. A mask is used on the image frame to recognise the object
in real-time, which only detects a particular colour. The mask is created using the
upper and lower limits of the hue, value of greyness and brightness. We can detect
any colour which falls within the specified range by the camera, which is attached to
the end-effector of the robotic arm. In our simulation, the red-coloured region in the
camera view is first detected using the colour-based object detection technique, as
explained in Algorithm 6. A contour is created around the red object, which is used
to detect the object’s centroid, the weighted average of all the pixels that make up the
object. A contour is a curve that joins all the continuous (along the boundary) points
having the same colour. As the depth (Z) was constant, the X and Y coordinates of
the block can be determined by the difference (in pixels) between the centre of the red bounded region and the image's centre (Fig. 9).
Fig. 9 View from the camera attached to the end-effector of the robotic arm
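The colour-based detection step can be sketched as below: the camera frame is thresholded in HSV for red, the largest contour is extracted, and its centroid gives the load position in pixels. The HSV limits are assumptions (red typically requires two hue bands), and the OpenCV 4 contour API is assumed.

```python
import cv2
import numpy as np

def detect_red_centroid(frame_bgr):
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    lower = cv2.inRange(hsv, (0, 120, 70), (10, 255, 255))      # low hue band of red
    upper = cv2.inRange(hsv, (170, 120, 70), (180, 255, 255))   # high hue band of red
    mask = cv2.bitwise_or(lower, upper)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    c = max(contours, key=cv2.contourArea)
    m = cv2.moments(c)
    if m["m00"] == 0:
        return None
    return int(m["m10"] / m["m00"]), int(m["m01"] / m["m00"])   # (u, v) centroid in pixels

frame = np.zeros((480, 640, 3), dtype=np.uint8)
frame[200:240, 300:340] = (0, 0, 255)        # synthetic red block for the example
centroid = detect_red_centroid(frame)
# The offset of the centroid from the image centre gives X and Y at the known depth Z
```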
The difference between the task completion time using the proposed multi-armed bandit based approach and the first-come-first-serve approach is 0.02 h in Simulation 1 (see Figs. 14 and 15). However, in Simulation 2, the FCFS approach is faster by 0.05 h. It was observed that the multi-armed bandit
based approach is faster for the last set of 6 tasks for Robot 2 by 0.45 h and 0.23 h in
Simulation 1 and Simulation 2, respectively. In Simulation 1, the difference between
the time for Robot 1 to complete the first 11 tasks using the suggested multi-armed
Table 1 Time taken (in hours) to complete 100 tasks in Simulation 1 and Simulation 2
Robot Simulation 1 Simulation 2
FCFS (h) MAB (h) FCFS (h) MAB (h)
1 7.2 5.4 7.2 10.4
2 1.2 0.4 1.2 1.2
3 34.9 20.3 34.9 30.7
Table 2 Time taken (in hours) to complete consecutive sets of 20 tasks by Robot 3 in Simulation
1 and Simulation 2
Tasks Simulation 1 Simulation 2
FCFS (h) MAB (h) FCFS (h) MAB (h)
20 2.52 1.42 2.52 1.76
40 6.33 3.73 6.33 5.41
60 10 6.25 10 9.42
80 15.29 8.54 15.29 13.53
Table 3 shows the cumulative time taken by Robot 2 to complete consecutive sets
of 6 tasks using the deterministic and the stochastic approach. Table 4 shows the
cumulative time taken by Robot 2 to complete consecutive sets of 11 tasks using the
deterministic and the stochastic approach.
We observe that the total task completion time for all the robots was more for the
FCFS approach than the MAB approach in both the simulations.
5.2 Discussion
We observe that the proposed approach can outperform the deterministic task sched-
uler. However, the uncertainty in executing the path by the mobile robot can affect the
task completion time. In Simulation 2, even though Robot 1 was given priority, the
task completion time was more than in Simulation 1. The reason for this difference is
the uncertainty in executing the path by the mobile robot. The mobile robot uses the
Dynamic-Window Approach Fox et al. [8], a robust local path planning algorithm,
to avoid dynamic obstacles. Even though the global path decided by the robot is the
same in every case, it keeps updating based on the trajectory decided by the local path
Table 3 Time taken (in hours) to complete consecutive sets of 6 tasks by Robot 2 in Simulation 1
and Simulation 2
Tasks Simulation 1 Simulation 2
FCFS (h) MAB (h) FCFS (h) MAB (h)
6 0.09 0.07 0.09 0.14
12 0.24 0.08 0.24 0.38
18 0.14 0.1 0.14 0.25
24 0.2 0.07 0.2 0.15
30 0.45 0.09 0.45 0.22
Table 4 Time taken (in hours) to complete consecutive sets of 11 tasks by Robot 1 in Simulation
1 and Simulation 2
Tasks Simulation 1 Simulation 2
FCFS (h) MAB (h) FCFS (h) MAB (h)
11 0.26 0.27 0.26 0.43
22 0.62 0.42 0.62 0.94
33 1.15 0.85 1.15 1.71
44 1.91 1.53 1.91 2.73
55 2.52 1.8 2.52 3.52
planner. Hence, the execution of the path by the mobile robot does not necessarily
have the same time taken for execution for the same initial and final goal point.
In this chapter, we have proposed a novel task scheduling approach in the context
of heterogeneous robot collaboration. However, we did not consider the case where
robots are semi-autonomous. We observe that the difference between the time taken
Fig. 19 The mobile robot carries the load towards the Fixed-base Station 1 when the user presses
the first button
The objective of the mobile robot is to autonomously avoid real and virtual obstacles and reach the desired goal. The
robot is clearly observed avoiding the virtual obstacles while reaching the goal. The
top part of the figure shows the HoloLens camera view from the user. HoloLens is a holographic device developed and manufactured by Microsoft. The bottom
part shows the real-world view of the environment.
In Fig. 20, the user pressed the other button, choosing the other station. Hence, a
new goal position is sent to the navigation stack. The robot changes the global path
and, consequently, stops at that moment. Immediately after that, the robot started
moving along the new global path. The arrows in the Hololens view and the real-
world view represent the direction of the robot’s velocity at that moment. It can be
observed that the direction of velocity is such that the robot avoids the virtual obstacles
present on both its sides. This is done by providing an edited map to the navigation
stack. We send the edited map to the move_base node of ROS as a parameter. We
use the unedited map as an input parameter to the amcl node, which is responsible
for the indoor localisation of the robot.
In Fig. 21a, we can observe that the mobile robot is avoiding both the real and
virtual obstacles while simultaneously progressing towards the new goal. The real-world
camera view of the figure shows the real obstacle and the end-effector of the fixed-
base robot. In Fig. 21b, we show that the mobile robot reached the workspace of the
fixed-base robot, and the fixed-base robot picks and places the load carried towards
Fig. 20 Mobile robot changes its trajectory when the second button is pressed
Fig. 21 Pick and place task execution in a mixed-reality warehouse environment with real and virtual obstacle avoidance - link to video: (a) Mobile robot moves towards Fixed-base Station 2, avoiding the real obstacle; (b) Fixed-base robot executes the pick and place task
it using a camera attached to the end-effector. This work can be extended to a case
where the scheduler considers human input and prioritises the mobile robots for
collaboration to finish the task.
6 Conclusion
References
1. Borrell Méndez, J., Perez-Vidal, C., Segura Heras, J. V., & Pérez-Hernández, J. J. (2020).
Robotic pick-and-place time optimization: Application to footwear production. IEEE Access,
8, 209428–209440.
2. Claure, H., Chen, Y., Modi, J., Jung, M. F. & Nikolaidis, S. (2019). Reinforcement learning with
fairness constraints for resource distribution in human-robot teams. arXiv:abs/1907.00313.
3. Dahiya, A., Akbarzadeh, N., Mahajan, A. & Smith, S. L. (2022). Scalable operator allocation for
multi-robot assistance: A restless bandit approach. IEEE Transactions on Control of Network
Systems, 1.
4. Dijkstra, E. W. (1959). A note on two problems in connexion with graphs. Numerische math-
ematik, 1(1), 269–271.
5. Durrant-Whyte, H., Roy, N., & Abbeel, P. (2012). A framework for Push-Grasping in clutter
(pp. 65–72).
6. Eppner, C., & Brock, O. (2017). Visual detection of opportunities to exploit contact in grasping
using contextual multi-armed bandits. In 2017 IEEE/RSJ international conference on intelligent
robots and systems (IROS) (pp. 273–278).
7. Fox, D., Burgard, W., Dellaert, F., & Thrun, S. (1999). Monte Carlo localization: Efficient
position estimation for mobile robots. AAAI/IAAI (343–349), 2.
8. Fox, D., Burgard, W., & Thrun, S. (1997). The dynamic window approach to collision avoid-
ance. IEEE Robotics and Automation Magazine, 4(1), 23–33.
9. Fragapane, G., de Koster, R., Sgarbossa, F., & Strandhagen, J. O. (2021). Planning and control of
autonomous mobile robots for intralogistics: Literature review and research agenda. European
Journal of Operational Research, 294(2), 405–426.
10. Hart, P., Nilsson, N., & Raphael, B. (1968). A formal basis for the heuristic determination of
minimum cost paths. IEEE Transactions on Systems Science and Cybernetics, 4(2), 100–107.
https://doi.org/10.1109/tssc.1968.300136
11. Ho, Y.-C., & Liu, H.-C. (2006). A simulation study on the performance of pickup-
dispatching rules for multiple-load agvs. Computers and Industrial Engineering, 51(3), 445–
463. Special Issue on Selected Papers from the 34th. International Conference on Comput-
29. Zhang, Y., & Parker, L. E. (2013). Multi-robot task scheduling. In 2013 IEEE international
conference on robotics and automation (pp. 2992–2998).
30. Zlot, R., & Stentz, A. (2006). Market-based multirobot coordination for complex tasks.
The International Journal of Robotics Research, 25(1), 73–101. https://doi.org/10.1177/
0278364906061160
Machine Learning and Deep Learning
Approaches for Robotics Applications
Abstract Robotics plays a significant part in raising the standard of living, with a
variety of useful applications in several service sectors such as transportation,
manufacturing, and healthcare. In order to make these services effective and efficient,
and to have robots obey the directions supplied to them by their programs, continuous
improvement is required. Intensive research has focused on ways to improve these
services, which has led to the use of sub-fields of artificial intelligence, represented
by ML and DL, whose state-of-the-art algorithms and architectures add positive
improvements to the field of robotics. Recent studies apply various ML/DL algorithms
to robotic system architectures to offer solutions for different issues related to robotic
autonomy and decision making. This chapter provides a thorough review of autonomous
and automatic robots along with their uses. Additionally, the chapter discusses machine
learning techniques for robotics, including extreme learning machines. Finally, it
discusses the issues and future of artificial intelligence applications in robotics.
L. E. Alatabani
Faculty of Telecommunications, Department of Data Communications and Network Engineering,
Future University, Khartoum, Sudan
E. S. Ali (B)
Faculty of Engineering, Department of Electrical and Electronics Engineering, Red Sea
University (RSU), Port Sudan, Sudan
e-mail: [email protected]
E. S. Ali · R. A. Saeed
Department of Electronics Engineering, College of Engineering, Sudan University of Science and
Technology (SUST), Khartoum, Sudan
R. A. Saeed · R. A. Saeed
Department of Computer Engineering, College of Computers and Information Technology, Taif
University, P.O. Box 11099, Taif 21944, Saudi Arabia
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 303
A. T. Azar and A. Koubaa (eds.), Artificial Intelligence for Robotics and Autonomous
Systems Applications, Studies in Computational Intelligence 1093,
https://doi.org/10.1007/978-3-031-28715-2_10
1 Introduction
Robotics has recently emerged as one of the most significant and pervasive technologies.
Artificial intelligence has played a significant role in the development of advanced
robots, making them more coherent and responsive [1].
Machine learning (ML) and deep learning (DL) approaches helped with the creation
of enhanced and intelligent control capabilities as well as the development of smart
solutions to a variety of problems affecting robotics applications [2]. Artificial intel-
ligence techniques have recently been used to create a variety of robots, giving them
the capacity to increase correlation, human traits, and productivity in addition to the
enhanced humanistic cognitive capacities [3]. Robots can learn using precise ML
techniques to increase their precision and understanding of spatial relations func-
tions, grab objects, control movement, and other tasks that let them comprehend
and respond to unobserved data and circumstances. Recently, the robotic process
automation and the capacity to interact with the environment have been incorporated
into the mechanisms of DL [4]. It also enables robots to perform various tasks by
understanding physical and logistical data patterns and act accordingly.
Due to the difficulty of translating and analyzing natural events, event control
is one of the hardest activities to design through code when creating robots. This
is especially true when there is a wide variety of actions that the robot performs
in reality [5]. Therefore, algorithms that can gain expert human knowledge of the
robot as structured parameters and improve control techniques are needed when
constructing robots [6]. These justifications state that ongoing modifications to the
robot’s programming are required because the world around it is constantly changing
and because a composite analytical model is required to create application solutions
[7]. One strategy that can address these challenges in a comprehensive and unique
manner is the usage of machine and DL architecture.
The rationale for utilizing ML and DL in robotics is that they generalize more broadly,
and deep networks are well suited to robots in unstructured environments since they are
capable of high-level reasoning and conceptualization [8]. Given the importance of AI
approaches for solving complex tasks in robotics, the contribution of this chapter is to
provide a brief overview of autonomous and automatic robots and the differences between
them. The chapter also discusses the most important robotics applications and the
solutions that AI approaches can provide, and reviews the concept of extreme learning
machine methods for robotics. Moreover, it reviews different robot learning approaches
such as multi-agent, self-supervised, and imitation learning.
The remaining sections of this chapter are arranged as follows. Section 2 reviews the
differences between autonomous robots and automatic robots. Section 3 reviews different
robotics applications with respect to AI solution approaches. Extreme learning machine
methods for robotics are presented in Sect. 4. The machine learning approach for soft
robotics and machine learning based robotics applications are presented in Sects. 5 and 6,
respectively. Section 7 discusses challenges and open issues in robotics applications.
The chapter is concluded in Sect. 8.
Automatic robots is a concept that defines expert systems in terms of having a software
robot that imitates human responses or actions. The automation of human processes is
accomplished through the application of robotic process automation (RPA). RPA is a
collection of tools that operate on the computer system interface and enable a robot to
act like a human [12, 15]. RPA has a variety of applications in today's industries, such
as agriculture, power plants, and manufacturing. This technology targets the automation
of simple, repetitive, and traditional work steps [13, 14]. Figure 2 represents the
potential of process automation in relation to cognitive and routine tasks.
[Fig. 2: task characteristics (size, variety, ambiguity, relationship, variability, novelty, temporal demand) mapped to task-technology fit type and performance impact]
[Fig. 3: AI technology capabilities (image recognition, search, prediction, speech recognition, data analysis, natural language understanding, optimization) plotted against automation level]
Artificial Intelligence (AI) gives the traditional concept of RPA more depth in multiple
areas, as AI has a range of capabilities that can add value to bots in two major respects:
(a) capturing information and (b) understanding the captured information. In capturing
information, the aim is speech recognition, image recognition, search, and data
analysis/clustering. In understanding the captured information, the aim is natural
language understanding (i.e., acting as a translator between humans and machines),
optimization, and prediction. An intelligent framework to be used with RPA involves
classifying tasks according to their characteristics and matching them with the AI
capabilities in order to select the tasks that are most suitable for automation [15].
The potential framework is illustrated in Fig. 3.
3 Robotics Applications
A. Convolutional layer
The network's convolutional layer is made up of several sets of filters (kernels) that
take in a certain input and output a feature map. Filters are a representation of a
multi-dimensional grid of discrete numbers. The numbers represent the weights of the
filter, which are learnt during the training phase of the network. CNNs use sub-sampling
to reduce the dimensions, resulting in a smaller output feature map. The output feature
map allows mild invariance to the scale and make-up of objects, which is useful for
applications such as object recognition in image processing.
Zero-padding is introduced when an image's spatial size needs to be kept constant or
enlarged after convolution, as in denoising, super-resolution, or segmentation, because
these operations require dense pixel-wise predictions. It also gives more room to design
deeper networks. In order to increase the size of the output feature map when processing
with multi-dimensional filters, zero-padding involves adding zeros around the input
feature map [18].
B. Pooling layers
This layer defines parts or blocks of the input feature map and then aggregates the
feature activations. This aggregation is represented by a pooling function, such as the
max or mean function, and requires specifying the size of the aggregation region. If we
consider a pooled region of size f × f, the size of the output feature map is calculated as

$$h' = \frac{h - f + s}{s}, \qquad w' = \frac{w - f + s}{s} \tag{1}$$

where h is the height, w is the width, and s is the stride. In order to extract a
compressed feature representation, the pooling operation efficiently down-samples the
input feature map [19].
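As a quick illustration of Eq. (1), the following sketch (a minimal NumPy example, not taken from the chapter; the array sizes and function names are our own) computes the pooled output size and applies max pooling with an f × f window and stride s.

```python
import numpy as np

def pooled_size(h, w, f, s):
    """Output height/width for an f x f pooling window with stride s (Eq. 1)."""
    return (h - f + s) // s, (w - f + s) // s

def max_pool2d(x, f, s):
    """Naive max pooling over a 2-D feature map x."""
    h, w = x.shape
    ph, pw = pooled_size(h, w, f, s)
    out = np.empty((ph, pw))
    for i in range(ph):
        for j in range(pw):
            out[i, j] = x[i * s:i * s + f, j * s:j * s + f].max()
    return out

feature_map = np.arange(16, dtype=float).reshape(4, 4)
print(pooled_size(4, 4, 2, 2))          # (2, 2)
print(max_pool2d(feature_map, f=2, s=2))
```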
C. Fully Connected Layers
This layer is placed at the end of the network. Some studies have shown that it can also
be effective to place it in the middle of the network. This layer corresponds to a layer
with a filter size of 1 × 1 in which each unit is densely connected to all units in the
previous layer. The layer's function is calculated by performing a straightforward matrix
multiplication, adding a bias vector, and then applying an element-wise nonlinear function.
$$y = f\left(W^{T} x + b\right) \tag{2}$$
where x and y denote the input and output vectors, b represents the bias vector, and
W contains the weights of the connections between the units [20].
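A minimal NumPy rendering of Eq. (2) is sketched below; the layer sizes, the choice of ReLU as the element-wise nonlinearity f, and the variable names are illustrative assumptions rather than the chapter's.

```python
import numpy as np

def fully_connected(x, W, b, f=lambda z: np.maximum(z, 0.0)):
    """Fully connected layer: y = f(W^T x + b), with ReLU as a stand-in for f."""
    return f(W.T @ x + b)

rng = np.random.default_rng(0)
x = rng.normal(size=128)          # input vector from the previous layer
W = rng.normal(size=(128, 10))    # connection weights (input units x output units)
b = np.zeros(10)                  # bias vector
y = fully_connected(x, W, b)
print(y.shape)                    # (10,)
```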
D. Region of Interest (ROI)
Used in object detection, this layer is an important element of CNN architectures. By
creating a bounding box and labeling each object with a specific object class, this method
The idea of a robot was first proposed in the early 1960s, when a mobile industrial
robot with two fingers was designed to transport goods and execute specified duties.
Therefore, much study was done to enhance the gripping and directional processes.
Learning through presentation, also known as imitation learning, is a theory that
has been demonstrated to exist in the performance of difficult maneuvering tasks
that recognize and copy human motion without the use of sophisticated behavior
algorithms [22, 27].
Deep Reinforcement Learning (DRL) is extremely valuable in the imitation
learning field because it has the ability to create policies on its own, which is not
possible with traditional imitation learning techniques that require prior knowledge of
the learning system’s full model. Robots can immediately learn actions from images
thanks to DRL, which combines decision-making and perception abilities. In the Markov
Decision Process (MDP), which is the basis of reinforcement learning, the action-state
value function gives the expected sum of discounted rewards:
$$Q^{\pi}(s, a) = E_{\pi}\!\left[\sum_{t=0}^{T} \gamma^{t} r_{t} \,\middle|\, s_{t} = s,\; a_{t} = a\right] \tag{3}$$
where Q^π(s, a) represents the action-state value, E_π is the expectation under the
motion strategy (policy) π, r_t is the reward value, and γ denotes the discount
factor [23, 24].
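To make Eq. (3) concrete, the short sketch below estimates an action-state value by Monte-Carlo rollouts of a policy π, averaging the discounted sum of rewards; the toy environment, the policy, and all parameter names are assumptions made purely for illustration.

```python
def q_value(env_step, policy, s, a, gamma=0.99, horizon=50, rollouts=200):
    """Monte-Carlo estimate of Q^pi(s, a): average discounted return (Eq. 3)."""
    total = 0.0
    for _ in range(rollouts):
        state, action, ret, discount = s, a, 0.0, 1.0
        for _ in range(horizon):
            state, reward, done = env_step(state, action)
            ret += discount * reward
            discount *= gamma
            if done:
                break
            action = policy(state)
        total += ret
    return total / rollouts

# Toy 1-D environment: move left/right, reward 1 when the goal cell 5 is reached.
def env_step(state, action):
    nxt = state + (1 if action == "right" else -1)
    return nxt, (1.0 if nxt == 5 else 0.0), nxt == 5

policy = lambda s: "right" if s < 5 else "left"
print(round(q_value(env_step, policy, s=0, a="right"), 3))   # ~0.961
```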
Robots learn motions and maneuvers by watching the expert’s demonstration
through the process known as imitation learning, which is the concept of exactly
mimicking the instructor’s behavior or action. In order to optimize the learning
process, the robot also learns to correlate the observed motion with the performance
[27, 28]. Instead of having to learn the entire process from scratch, training data
can be achieved by learning from available motion samples. This has a significant
positive impact on increasing learning efficiency. Combining many reinforcement
learning approaches can increase the speed and accuracy of imitation learning [25].
The three major types into which imitation learning is divided are presented in Fig. 5.
[Fig. 5: types of imitation learning — behavior cloning via supervised learning yielding an optimal policy, and reward-based learning of an optimal policy via reinforcement learning]
In behavior reproduction, learning is performed after the policy is acquired so that the
distribution of state-action trajectories generated by the agent matches the demonstrated
trajectories. A robotic arm or similar device would typically only be able to repeat a
specific movement after receiving manual instruction or a teaching procedure, and would
not adapt to an unfamiliar environmental change. The availability of data-driven machine
learning techniques allows the robot to recognize superior maneuvering primitives and
adjust to environmental changes.
In inverse reinforcement learning, a reward function is introduced to test whether the
action is performed as it should be. This approach outperforms traditional behavior
cloning because of its adaptation to different environment qualities, and it is described
as an efficient approach to imitation learning. The generative adversarial imitation
learning concept is satisfied by ensuring that the generated strategy is aligned with
the expert strategy [26]. The imitation learning framework contains trajectory and force
learning. In the trajectory learning approach, an existing trajectory profile for a task
is taken as input and used to create the nominal trajectory of the next task. The force
learning part of the framework uses a reinforcement learning agent and an equivalent
controller to learn both the position commands and the parameters of the controller [27].
Figure 6 illustrates an imitation learning framework in which dynamic movement primitives
(DMPs) are updated using a modular learning strategy.
[Fig. 6: imitation learning framework — trajectory learning (trajectory profile → IL → skill policy → DMPs, with position feedback, final-goal and sub-goal updates) and force learning (RL agent with force feedback and control)]
Self-supervised learning has been used to improve the functionality of several robotic
application characteristics, such as improving robot navigation, movement, and
vision. Since they rely on positional information for efficient movement and task
fulfillment, almost all robots navigate by evaluating input from sensors. Robots use
motion-capture and GPS systems, as well as other external sources, to determine
their positions. They can also use on-board sensors that are currently in vogue, such
as 3D LiDARs that record varied distances, to determine their positions [29]. Self-
supervised learning, in which the robot does not require supervision and the target
does not require labeling, is a successful learning technique. Self-supervised learning
is therefore most suitable for usage when the data being investigated is unlabeled [30].
A variety of machine learning approaches have been used in the visual localization area
to enhance the effectiveness of vision-based manipulation. Researchers
have established that self-supervised learning techniques for feature extraction based
on image production and translation enhanced the performance of robotic systems.
Feature extraction using Domain-Invariant Super Point (DISP) is satisfied through
two major tasks: key point detection and description. A function detects key points on
a certain image via map calculation, the aim of this process is to compare and match
key points found in other images using similarity matrix, the output is a fixed-length
vector to describe each pixel of the image.
With the goal of feature extraction domain-invariant function, picture domain
adaptation and self-supervised training are combined. The image is parsed into sub-
domains, and the network is instructed to train from a specified function to locate
associated key points. Instead of using domain translation for the feature detector and
descriptor, optimization is employed to reduce the matching loss, which aids in the
extraction of a better feature extraction function. This enables the network to have
the ability to filter essential points and match them under varying circumstances.
When applied within deep convolutional layers, this cross-domain idea expands the scope
of the learning process by using whole scenes and objects as opposed to a single visual
space. When particular conditions are being examined, feature extraction is stronger
when image-to-image translation is used [31].
The use of learning-based methods for multi-robot planning has yielded promising results
due to their ability to manage a multi-dimensional environment with a state-space
representation. Difficult robotics problems, such as teaching numerous robots or
multi-agents to perform a task simultaneously, have been resolved by applying
reinforcement learning [35, 36]. Multi-agent systems must create strategies to overcome
problems like energy consumption and computational complexity, for example by moving the
compute portion to the cloud. The performance of multi-agent systems has substantially
improved as a result of the integration of multi-robot systems with edge and cloud
computing, adding value and enhancing user experience.
Depending on the application, especially when deploying robots in the medical industry,
robotic applications have increased the demand for faster processing. Multi-agent
resource allocation makes it simpler to handle this issue. Resource allocation refers to
assigning a resource to a job based on its availability, in order to meet
Quality-of-Service standards that vary from one robot to another according to its
application. Robotics applications require a variety of latency-sensitive, data-intensive,
and computational operations. For these tasks, a variety of resources are needed,
including computing, network, and storage resources [37].
The Markov Decision Process is applied to a Markov model in Multi-Agent Reinforcement
Learning (MARL), where a group of N agents is considered together with a state space S,
a joint action space A, and a reward function R. At each time-step, the reward function
computes N rewards, one for each agent. T is the transition function, which gives the
likelihood that a state will be reached after taking a joint action a. At every
time-step, an observation function O samples an observation for each agent from its
observation space Z. Multi-agent systems can be either heterogeneous or homogeneous,
i.e., the agents have either distinct action spaces or share the same action space,
respectively. MARL system configurations vary depending on their reward functions, which
can be either cooperative or competitive, and on their learning setups, which directly
impact the type of policies learnt [38].
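As a minimal illustration of the tuple described above (N agents, state space S, joint action space A, reward function R, transition function T, and per-agent observations from Z), the sketch below steps a toy cooperative two-agent environment; the dynamics and all names are our own assumptions, not a formulation from the chapter.

```python
import random

def transition(state, joint_action):
    """T: the next state depends on the joint action of all agents."""
    return state + sum(joint_action)

def rewards(state, joint_action, n_agents):
    """R: one reward per agent each time-step (cooperative: shared signal)."""
    shared = 1.0 if state >= 5 else 0.0
    return [shared] * n_agents

def observe(state, agent_id):
    """O: each agent samples its own (here noisy) observation from Z."""
    return state + random.gauss(0.0, 0.1) + agent_id

n_agents, state = 2, 0
for t in range(5):
    obs = [observe(state, i) for i in range(n_agents)]
    joint_action = [1 for _ in range(n_agents)]      # each agent moves "forward"
    state = transition(state, joint_action)
    print(t, obs, rewards(state, joint_action, n_agents))
```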
classification accuracy, lessen the number of manual interventions, and reduce training
time.
ELM uses two sets of random parameters and freezes them during the training process.
The random parameters are kept in the ELM hidden layer. To make the training process
more efficient, the input vector is mapped into a random feature space with random
configurations and nonlinear activation functions. In the linear parameter solution
step, β_i is acquired by the Moore–Penrose inverse, since H β = T is a linear
problem [39, 41].
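The sketch below (a minimal NumPy implementation under our own assumptions about layer sizes and a sigmoid activation) shows the core ELM step just described: random, frozen input weights and biases, and output weights β obtained from the Moore–Penrose pseudoinverse of the hidden-layer matrix H in Hβ = T.

```python
import numpy as np

def elm_train(X, T, n_hidden=64, seed=0):
    """Train a basic ELM: random frozen hidden layer, beta = pinv(H) @ T."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))   # random input weights (frozen)
    b = rng.normal(size=n_hidden)                  # random biases (frozen)
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))         # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ T                   # Moore-Penrose solution of H beta = T
    return W, b, beta

def elm_predict(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta

# Tiny demo: fit y = sin(x) on random samples.
X = np.random.default_rng(1).uniform(-3, 3, size=(200, 1))
T = np.sin(X)
W, b, beta = elm_train(X, T)
print(np.abs(elm_predict(X, W, b, beta) - T).mean())   # small training error
```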
The Extreme Learning Machine and a colour-feature variant were combined to create a
single-layer feed-forward neural network. The CF-ELM includes a fully connected output
layer and a hidden layer with three tiers. During the training phase, the product of the
random weight values and the inputs is recorded in the hidden-layer output matrix H. The
activation function g(x) is used to process the input signal and convert it to an output;
the CF-ELM uses the soft-sign activation function g(x), which is represented by the
following function.
represented by the following function.
x
g(x) = (4)
1 + x
$$H = \begin{bmatrix}
g(W_1 \cdot Y_1 + b_1) & \cdots & g(W_{\tilde{N}} \cdot Y_1 + b_{\tilde{N}}) \\
\vdots & & \vdots \\
g(W_1 \cdot Y_N + b_1) & \cdots & g(W_{\tilde{N}} \cdot Y_N + b_{\tilde{N}}) \\
g(W_1 \cdot U_1 + b_1) & \cdots & g(W_{\tilde{N}} \cdot U_1 + b_{\tilde{N}}) \\
\vdots & & \vdots \\
g(W_1 \cdot U_N + b_1) & \cdots & g(W_{\tilde{N}} \cdot U_N + b_{\tilde{N}}) \\
g(W_1 \cdot V_1 + b_1) & \cdots & g(W_{\tilde{N}} \cdot V_1 + b_{\tilde{N}}) \\
\vdots & & \vdots \\
g(W_1 \cdot V_N + b_1) & \cdots & g(W_{\tilde{N}} \cdot V_N + b_{\tilde{N}})
\end{bmatrix}_{3N \times \tilde{N}} \tag{5}$$
where N represents the number of training samples, Ñ denotes the number of neurons in
the hidden layer, W represents the input weights, and b represents the biases. Y, U and V
are the colour input samples for each pixel. The difference between CF-ELM and ELM is
that ELM uses grey-scale images. The output of the hidden layer becomes an input
multiplier for the output weights β, and T represents the output target:

$$T = \beta \cdot H \tag{6}$$
where m is the layer’s total number of neurons. The following equation can be used
to represent the goal output T matrix:
$$T = \begin{bmatrix} t_{1}^{T} \\ \vdots \\ t_{\tilde{N}}^{T} \end{bmatrix}_{N \times m} \tag{8}$$
A vector of ones typically represents the value of each t, which is stored based on the
input training sample. The value of β can be obtained by making it the subject of the
equation:
$$\beta = H^{-1} \cdot T \tag{9}$$
$$y_i = \sum_{j=1}^{\tilde{N}} \Big( \beta_{ij}\left(W_j \cdot Y + b_j\right) + \beta_{i,\,j+\tilde{N}}\left(W_{j+\tilde{N}} \cdot U + b_{j+\tilde{N}}\right) + \beta_{i,\,j+2\tilde{N}}\left(W_{j+2\tilde{N}} \cdot V + b_{j+2\tilde{N}}\right) \Big) \tag{10}$$
where W_j represents the weight vector for the jth neuron [40].
Another joint approach, based on convolutional neural networks and ELM, has been proposed
to harness the strength of CNNs while avoiding the gradient calculations used for updating
network weights. Convolutional Extreme Learning Machines (CELM) are fast-training CNNs
that are used for feature extraction by effectively defining filters [41].
A Level-Based Learning Swarm Optimizer (LLSO) based ELM approach has been introduced to
address the limitation of ordinary ELM, namely its reduced generalization performance.
The concept is to use LLSO to find the best parameters for ELM; the optimization problem
in ELM is large-scale because of the fully
connected input and hidden layers. The essential considerations in the optimization
of ELM parameters are fitness and particle encoding. The input layer’s weight vector
and the hidden neurons’ bias vector make up the LLSO-ELM particles. The following
equations can be used to numerically represent particle P:
where n is the dimension of the input dataset, and L is the number of hidden neurons.
The following equations represent the particle’s length, Len of Particle:
Fitness is used to assess the quality of the particles, with smaller values indicating
better classification; the fitness value can be determined accordingly. After the LLSO
has searched for and obtained the optimal parameters, Acc is the ratio of the number of
correctly classified samples to the total number of samples obtained by the ELM
algorithm [42].
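The particle-encoding equations are not reproduced above, but a plausible reading (our own assumption, not the chapter's formula) is that each LLSO particle concatenates the n × L input weights and the L hidden biases, giving a length of (n + 1) × L; fitness can then be evaluated from the classification error of the ELM decoded from the particle. The sketch below illustrates only this assumed encode/decode step.

```python
import numpy as np

def particle_length(n, L):
    """Assumed particle length: n*L input weights plus L hidden biases."""
    return (n + 1) * L

def decode_particle(p, n, L):
    """Split a flat particle vector back into ELM input weights W and biases b."""
    W = p[:n * L].reshape(n, L)
    b = p[n * L:]
    return W, b

n, L = 8, 20                       # input dimension and number of hidden neurons
p = np.random.default_rng(0).normal(size=particle_length(n, L))
W, b = decode_particle(p, n, L)
print(W.shape, b.shape)            # (8, 20) (20,)
```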
The general concept of robotics has been known for being precise and rigid. In recent
years, however, an improved technology has been introduced with modern concepts such as
flexibility and adaptability. Soft robotics presents abilities that are not present in
the stiff form of robotics [43]. ML can use large and multidimensional data sets to
extract features, and when coupled with the application of soft sensors to robots, the
performance of soft robotics has increased significantly. By using soft sensors, a
robot's sensitivity and adaptability may be improved. Due to the non-linearity and high
hysteresis of soft materials, however, soft sensors are constrained in terms of their
capabilities and calibration difficulty.
Learning approaches can overcome the limitations of implementing soft sensors. ML enables
accurate calibration and characterization by accounting for their nonlinearity and
hysteresis [44]. Robotics tasks can be divided into two parts: perception and control.
Perception tasks aim at collecting information about the environment via sensors to
extract the target properties, while in the control policy the agent interacts with the
environment in order to learn a certain behavior depending on the received reward.
However, control in soft robotics is much more complicated and needs more effort for the
following reasons:
Two learning strategies are used by DQN. The first is the target network, which uses the
same architecture as the Q-network; the weights of the Q-network are updated while being
iteratively copied to the weights of the target network. The second is experience replay,
in which the network's input is gathered as state-action pairs together with their
rewards and saved in a replay memory, from which samples are later retrieved to serve as
mini-batches for learning. Gradient descent is used in the remaining learning tasks to
reduce the loss between the learnt Q-network and the target Q-network.
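A compact sketch of the two DQN ingredients described above (experience replay and a periodically synchronised target network) is given below; the buffer capacity, batch size, and dictionary-based weight stand-in are illustrative assumptions of ours, and the Q-network itself is abstracted away.

```python
import random
from collections import deque

class ReplayBuffer:
    """Experience replay: store (s, a, r, s', done) tuples and sample mini-batches."""
    def __init__(self, capacity=10_000):
        self.buffer = deque(maxlen=capacity)

    def add(self, s, a, r, s_next, done):
        self.buffer.append((s, a, r, s_next, done))

    def sample(self, batch_size=32):
        return random.sample(list(self.buffer), min(batch_size, len(self.buffer)))

def sync_target(q_weights, target_weights):
    """Periodically copy Q-network weights into the target network."""
    target_weights.clear()
    target_weights.update(q_weights)

buffer = ReplayBuffer()
buffer.add(s=0, a=1, r=0.0, s_next=1, done=False)
q_weights, target_weights = {"w": 0.3}, {}
sync_target(q_weights, target_weights)
print(buffer.sample(), target_weights)
```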
B. Deep Deterministic Policy Gradients (DDPG)
DDPG is a combination of actor-critic methods aimed at modelling problems with a
high-dimensional action space. DDPG is represented mathematically by the stochastic and
deterministic policy forms in Eqs. (17) and (18).
and

$$Q^{\mu}\left(s_{t}, a_{t}\right) = E_{R_{t+1},\, s_{t+1} \sim E}\!\left[ R_{t+1} + \gamma\, Q^{\mu}\!\left(s_{t+1}, \mu\left(s_{t+1}\right)\right) \right] \tag{18}$$
This method is one of the first algorithms in the field of DRL applied to soft robots.
C. Normalized Advantage Function (NAF)
With the aid of deep learning, Q-learning is made possible in continuous and
high-dimensional action spaces. This technique differs from standard DQN in that it
outputs V, μ, and L in the neural network's final layer. The advantage term required for
the learning strategy is predicted by μ and L. To lessen correlations in the observation
data gathered over time, NAF uses a target network and experience replay.
$$A\left(s, a; \theta^{\mu}, \theta^{L}\right) = -\frac{1}{2}\left(a - \mu\left(s; \theta^{\mu}\right)\right)^{T} P\left(s; \theta^{L}\right)\left(a - \mu\left(s; \theta^{\mu}\right)\right) \tag{19}$$

where

$$P\left(s; \theta^{L}\right) = L\left(s; \theta^{L}\right) L\left(s; \theta^{L}\right)^{T} \tag{20}$$
where θ^π and θ^V are the network parameters, s is the state, a is the action, and t is
the learning step.
Instead of updating the parameters sequentially, they are updated simultaneously,
negating the need for stabilizing strategies such as memory replay. The mechanism uses
actor-learners that help to inspect a larger portion of the environment and thereby aid
in learning the best course of action.
E. Trust Region Policy Optimization (TRPO)
This algorithm was introduced to overcome limitations that can occur with other
algorithms when optimizing large nonlinear policies, thereby enhancing accuracy. A cost
function is used in place of the reward function to achieve this. Utilizing a conjugate
gradient method followed by a line search to solve the optimization problem has been
shown to improve performance [46]. The following equation illustrates it:
$$\eta(\pi) = E_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t} c\left(s_{t}\right) \,\middle|\, s_{0} \sim \rho_{0}\right] \tag{22}$$
With the same concept, the state-value function is replaced, as represented by Eq. (23).
The result of the optimization would be an update rule for the policy, given by Eq. (24):
$$\eta(\pi) = \eta\left(\pi_{old}\right) + E_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t} A^{\pi_{old}}\left(s_{t}, a_{t}\right) \,\middle|\, s_{0} \sim \rho_{0}\right] \tag{24}$$
By incorporating improved sensorimotor capabilities that can give a robot the ability
to adapt to a changing environment, robotics technology is a field that is quickly
developing. This is made possible by the integration of AI into robotics, which
enables optimizing the level of autonomy through learning using ML techniques.
The usefulness of adding intelligence to machines is determined by their capacity
to predict the future through planning how to finish a job and interacting with the
environment through successfully manipulating or navigating [47, 48].
The development of modern life has made it clear that better decision-making
processes are required when it comes to addressing e-services to enhance client
decision making. These systems utilize personalized e-services with artificial
intelligence-based approaches and procedures to identify user profiles and prefer-
ences [49]. With the application of multiple ML methods, recommendation quality
was raised and user experience increased. Recommendation systems are primarily
designed to help people who lack the knowledge or skills to deal with a multitude
of alternatives through systems that estimate use preferences as a result of reviewing
data from multiple sources. Knowledge-based recommender systems, collaborative
filtering-based recommender systems, and content-based recommender systems are
the three types of recommender systems that exist [50].
Making recommendations for products that are comparable to those that have
previously caught the user’s attention is the aim of content-based recommender
systems. Item attributes are collected from documents or pictures using retrieval
techniques like the vector space model [51]. As a result, information about the user’s
preferences, or what they like and dislike, is included in their user profile. Collabora-
tive filtering (CF), the most popular technique in recommender systems, is predicated
on the idea that consumers who are similar to one another will typically consume
comparable products. A system based on user preferences, on the other hand, will
function using information about users with comparable interests. Memory-based and
model-based CF techniques are the two categories; memory-based is the more tradi-
tional type and uses heuristic algorithms to identify similarities between people and
things. The fundamental method utilized in memory-based CF is nearest neighbor, which is
simple, efficient, and precise. As a result, user-based CF and item-based CF
are two subcategories of memory-based CF.
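A minimal sketch of the nearest-neighbour, user-based CF idea is given below (cosine similarity over a toy rating matrix; the data, parameter values, and function names are illustrative, not taken from the chapter).

```python
import numpy as np

def predict_rating(ratings, user, item, k=2):
    """User-based CF: weight neighbours' ratings of `item` by cosine similarity."""
    target = ratings[user]
    sims = []
    for other, row in enumerate(ratings):
        if other == user or row[item] == 0:
            continue                      # skip self and users who haven't rated the item
        sim = np.dot(target, row) / (np.linalg.norm(target) * np.linalg.norm(row))
        sims.append((sim, row[item]))
    sims.sort(reverse=True)
    top = sims[:k]
    if not top:
        return 0.0
    return sum(s * r for s, r in top) / sum(s for s, _ in top)

# Rows = users, columns = items, 0 = unrated.
ratings = np.array([[5, 3, 0, 1],
                    [4, 0, 4, 1],
                    [1, 1, 5, 4]], dtype=float)
print(round(predict_rating(ratings, user=0, item=2), 2))   # weighted neighbour estimate
```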
Model-based CF was initially offered as a remedy for the shortcomings in the
prior methodology, but its use has since spread to suit additional applications.
Model-based CF uses machine learning and data mining techniques to forecast user
behavior. Knowledge-based recommender systems are built on the user knowledge
already in existence and are based on user knowledge gleaned from past behavior.
A popular method known as “case-based” is used by knowledge-based systems to
use previous difficulties to address present challenges. Knowledge-based applica-
tion fields include, but are not limited to, real estate sales, financial services, and
health decision assistance. Each of these application areas deals with a different
issue and necessitates a thorough knowledge of that issue. Figure 7 displays robotics
applications for machine learning systems.
Figure 8 illustrates how the implementation of AI methods has improved the
performance of many techniques in numerous fields, including knowledge engi-
neering, reasoning, planning, communication, perception, and motion [52]. Simply
defined, recommender systems use historical behavior to predict client needs [53].
The Root Mean Square Error (RMSE), which is widely used to evaluate prediction accuracy,
is the most basic performance evaluation technique among the quantitative evaluation
metrics. This evaluation is based on the mean squared error (MSE), which is calculated by
dividing the sum of the squared differences between the actual score and the predicted
score by the total number of predicted scores. Additional qualitative evaluations that
are covered by a confusion matrix and used to calculate the value of the qualitative
evaluation index include precision, recall, accuracy, F-measure, the ROC curve, and the
Area Under the Curve (AUC). By identifying whether or not an item matching the user's
preference is recommended by the system, this matrix enables the evaluation of a
recommender system. Each row in Table 1 represents a user-preferred item, and each column
shows whether a recommender model has suggested a related item.
where TP represents the number of items that fit the user's preferences, TN is the number
of user favorites that the recommendation system does not suggest, FP represents the
frequency at which the system suggests products that users dislike, and FN stands for the
number of cases for which the system does not offer a suggestion.
Table 2 also includes other qualitative measures: accuracy, which is the proportion of
recommendations that are successful; precision, which is the proportion of recommended
items that exactly match the user's preference; recall, which is the proportion of
relevant items, based on actual data gathered from users, that a recommender system
actually recommends; the F-measure, which is the harmonic mean of precision and recall;
and the ROC curve, which is a graph showing the relationship between the True Positive
Rate (TPR) and the False Positive Rate (FPR) [54].
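The sketch below computes these qualitative measures from the four confusion-matrix counts; it is a generic illustration with our own helper names, using the standard definitions of precision, recall, accuracy, F-measure, and the two ROC rates.

```python
def recommender_metrics(tp, fp, fn, tn):
    """Precision, recall, accuracy, F-measure, and FPR from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)                      # also the True Positive Rate (TPR)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f_measure = 2 * precision * recall / (precision + recall)
    fpr = fp / (fp + tn)                         # False Positive Rate for the ROC curve
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f_measure": f_measure, "fpr": fpr}

print(recommender_metrics(tp=40, fp=10, fn=20, tn=30))
```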
[Fig. 8: AI areas (knowledge engineering, reasoning, planning, communication, perception) mapped to techniques (transfer learning, active learning, deep neural networks, fuzzy technologies, evolutionary algorithms, reinforcement learning, natural language processing, computer vision)]
With the development of robots and ML applications toward smart life, nanotech-
nology is advancing quickly. Nanotechnology is still in its infancy, though, and this
has researchers interested in expanding this subject. The term “Nano” describes the
Table 2 Qualitative evaluation using confusion matrix components

Evaluation metric          Equation
Thoroughness (precision)   TP / (TP + FP)
Rendering (recall)         TP / (TP + FN)
Accuracy                   (TP + TN) / (TP + FN + FP + TN)
F-measure                  2 × (Precision × Recall) / (Precision + Recall)
ROC curve                  Ratio of TP rate = TP / (TP + FN) and FP rate = FP / (FP + TN)
AUC                        Area under the ROC curve
creation of objects with a diameter smaller than a human hair [55]. Reference [56]
claims that nanotechnology entails the process of creating, constructing, and manu-
facturing materials at the Nano scale. Robots that are integrated on the Nano scale are
known as Nano robots [57]. According to the authors, Nano robots are objects that
can sense, operate, and transmit signals, process information, exhibit intelligence, or
exhibit swarm behavior at the Nano scale.
They are made up of several parts that work together to carry out particular
functions; the parts are constructed at the Nano scale, where sizes range from 1 to
100 nm. In the medical industry, Nano robots are frequently utilized in procedures
including surgery, dentistry, sensing and imaging, drug delivery, and gene therapy [58].
Figure 9 shows an application of Nano robotics.
ML adds value to medical image processing through image recognition, grouping, and
classification, and this will improve the performance of Nano health applications. By
incorporating ML into biological analysis using microscopic images, disease
identification accuracy can be increased. In order to better understand the influence
that the characteristics of nanoparticles have on their interactions with the targeted
tissue and cells, ML methods have been utilized to predict the pathological response to
breast cancer treatment with a high degree of accuracy. Artificial neural networks, which
decreased the prediction error rate, enable the use of these techniques without the need
for enormous data sets [59].
Medical Applications
where u^c, v^c are the pixel coordinates, N represents the number of phase steps,
δ_n = 2π(n − 1)/N represents the phase shift, and a, b and φ represent the background
intensity, the fringe modulation and the phase, respectively. The desired phase is then
calculated using the least-squares algorithm represented in Eq. (26):

$$\varphi\left(u^{c}, v^{c}\right) = \arctan \frac{\sum_{n=1}^{N} \left[ I_{n}\left(u^{c}, v^{c}\right) \sin\left(\delta_{n}\right) \right]}{\sum_{n=1}^{N} \left[ I_{n}\left(u^{c}, v^{c}\right) \cos\left(\delta_{n}\right) \right]} \tag{26}$$

$$\Phi\left(u^{c}, v^{c}\right) = \varphi\left(u^{c}, v^{c}\right) + K\left(u^{c}, v^{c}\right) \times 2\pi \tag{27}$$
where K represents the fringe order. The mapping to 3D points in the world coordinate
frame (x^w, y^w, z^w) is represented by the following equations.
$$s^{c}\begin{bmatrix} u^{c} \\ v^{c} \\ 1 \end{bmatrix} = A^{c}\begin{bmatrix} x^{w} \\ y^{w} \\ z^{w} \\ 1 \end{bmatrix} = \begin{bmatrix} a_{11}^{c} & a_{12}^{c} & a_{13}^{c} & a_{14}^{c} \\ a_{21}^{c} & a_{22}^{c} & a_{23}^{c} & a_{24}^{c} \\ a_{31}^{c} & a_{32}^{c} & a_{33}^{c} & a_{34}^{c} \end{bmatrix}\begin{bmatrix} x^{w} \\ y^{w} \\ z^{w} \\ 1 \end{bmatrix} \tag{28}$$

$$s^{p}\begin{bmatrix} u^{p} \\ v^{p} \\ 1 \end{bmatrix} = A^{p}\begin{bmatrix} x^{w} \\ y^{w} \\ z^{w} \\ 1 \end{bmatrix} = \begin{bmatrix} a_{11}^{p} & a_{12}^{p} & a_{13}^{p} & a_{14}^{p} \\ a_{21}^{p} & a_{22}^{p} & a_{23}^{p} & a_{24}^{p} \\ a_{31}^{p} & a_{32}^{p} & a_{33}^{p} & a_{34}^{p} \end{bmatrix}\begin{bmatrix} x^{w} \\ y^{w} \\ z^{w} \\ 1 \end{bmatrix} \tag{29}$$

where c denotes the camera, p denotes the projector, s is a scaling factor, and A is the
product of the intrinsic and extrinsic matrices, which is represented as follows:

$$A^{c} = I^{c} \times \left[R^{c} \,\middle|\, T^{c}\right] = \begin{bmatrix} f_{x}^{c} & 0 & u_{0}^{c} \\ 0 & f_{y}^{c} & v_{0}^{c} \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} r_{11}^{c} & r_{12}^{c} & r_{13}^{c} & t_{1}^{c} \\ r_{21}^{c} & r_{22}^{c} & r_{23}^{c} & t_{2}^{c} \\ r_{31}^{c} & r_{32}^{c} & r_{33}^{c} & t_{3}^{c} \end{bmatrix} \tag{30}$$

$$A^{p} = I^{p} \times \left[R^{p} \,\middle|\, T^{p}\right] = \begin{bmatrix} f_{x}^{p} & 0 & u_{0}^{p} \\ 0 & f_{y}^{p} & v_{0}^{p} \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} r_{11}^{p} & r_{12}^{p} & r_{13}^{p} & t_{1}^{p} \\ r_{21}^{p} & r_{22}^{p} & r_{23}^{p} & t_{2}^{p} \\ r_{31}^{p} & r_{32}^{p} & r_{33}^{p} & t_{3}^{p} \end{bmatrix} \tag{31}$$
$$u^{p} = \frac{\Phi\left(u^{c}, v^{c}\right) \times \lambda}{2\pi} \tag{32}$$
which gives the matching between the points from the camera and the projector.
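As a small numerical illustration of the N-step phase-shifting relations in Eqs. (26) and (27), the sketch below recovers the wrapped phase of a single pixel from synthetic fringe intensities using the least-squares arctangent formula; the assumed fringe model I_n = a + b·cos(φ − δ_n), its parameter values, and the function names are our own, and the fringe-order unwrapping K is not shown.

```python
import numpy as np

def wrapped_phase(intensities):
    """Least-squares wrapped phase from N phase-shifted intensities (Eq. 26)."""
    N = len(intensities)
    deltas = 2 * np.pi * np.arange(N) / N          # delta_n = 2*pi*(n-1)/N, 0-indexed
    num = sum(I * np.sin(d) for I, d in zip(intensities, deltas))
    den = sum(I * np.cos(d) for I, d in zip(intensities, deltas))
    return np.arctan2(num, den)

# Synthetic pixel: I_n = a + b*cos(phi - delta_n), with ground-truth phi = 1.2 rad.
a, b, phi_true, N = 0.5, 0.4, 1.2, 4
deltas = 2 * np.pi * np.arange(N) / N
I = [a + b * np.cos(phi_true - d) for d in deltas]
print(round(float(wrapped_phase(I)), 4))           # ~1.2
```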
The back-end: Following the collection of high-quality data, rapid and accurate
mapping is required. This is accomplished by using a registration technique based on
$$\min_{R,\,T} \sum_{i=1}^{n} \left\| P_{1} - R\, P_{2} - T \right\|^{2} \tag{33}$$
where P1 and P2 represent the 3D data, P1: (x_{w1}, y_{w1}, z_{w1}) and P2: (x_{w2}, y_{w2}, z_{w2}).
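A minimal sketch of the minimisation in Eq. (33) is given below: for known point correspondences, the optimal rotation R and translation T follow from the SVD of the cross-covariance matrix (the classic Kabsch/ICP alignment step); the toy point sets and function names are our own assumptions.

```python
import numpy as np

def rigid_align(P1, P2):
    """Find R, T minimising sum ||P1 - (R @ P2 + T)||^2 for matched 3-D points."""
    c1, c2 = P1.mean(axis=0), P2.mean(axis=0)
    H = (P2 - c2).T @ (P1 - c1)                    # cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                       # guard against reflections
        Vt[-1, :] *= -1
        R = Vt.T @ U.T
    T = c1 - R @ c2
    return R, T

# Toy check: rotate/translate a point cloud, then recover the transform.
rng = np.random.default_rng(0)
P2 = rng.normal(size=(30, 3))
angle = 0.3
R_true = np.array([[np.cos(angle), -np.sin(angle), 0],
                   [np.sin(angle),  np.cos(angle), 0],
                   [0, 0, 1]])
P1 = P2 @ R_true.T + np.array([0.5, -0.2, 1.0])
R, T = rigid_align(P1, P2)
print(np.allclose(R, R_true), np.round(T, 3))      # True [ 0.5 -0.2  1. ]
```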
Pedestrian localization systems have been growing recently, and machine learning has been
applied to different types of pedestrian localization. There is a tendency to apply
supervised learning and scene analysis in pedestrian localization because of their
accuracy. DL, as a branch of ML, is employed for its high processing capacity. Scene
analysis is the most frequently used ML approach in pedestrian localization because of
its easy implementation and fair performance [61, 62].
Automated guided vehicles (AGVs) have developed into autonomous mobile robots (AMRs) as a
result of recent breakthroughs in robotic applications. To get to the vision-based systems
we have today, the guidance component of AGV material handling systems has advanced
through numerous mechanical, optical, inductive, inertial, and laser-guided milestones
[63]. The technologies that improve the performance of these systems include sensors,
strong on-board processors, artificial intelligence, and simultaneous localization and
mapping. Thanks to these technologies, the robot can comprehend the workplace.
AMRs apply AI algorithms to improve their navigation; they travel on their own through
unpredictable terrain. Machine learning methods can be used to identify and categorize
obstacles; a few examples include fuzzy logic, neural networks, genetic algorithms, and
neuro-fuzzy systems. All of these techniques are routinely used to move the robot from
one place to another while avoiding collisions. The ability of the brain to perform
specific tasks serves as the source of inspiration for these strategies [63, 64]. For
example, if we consider a dual-arm robot, we may construct and analyze a control
algorithm for the oscillation, position, and speed control of the dual-arm robot. This
requires the use of a dynamic model of the system. The system design incorporates
time-delay control and pole placement-based feedback control for the control of
oscillation (angular displacement), precise position control, and speed control,
respectively.
As robots are employed in homes, offices, the healthcare industry, operating auto-
mobiles, and education, robotics applications have become a significant part of our
life. In order to improve performance, including accuracy, efficiency, and security,
it is increasingly common to deploy bots to integrate several applications utilizing
machine learning techniques [65]. Open difficulties in AI for robots include the
type of learning to be used, ML application, ML architecture, standardization, and
incorporation of other performance evaluation criteria in addition to accuracy [66].
Exploring different learning approaches would be beneficial for performance and
advancement, even though supervised learning is the most typical type of learning
in robotic applications [67]. ML tools can also be used to solve issues brought on by
wireless connectivity, where increased multipath lowers system performance. Bots are
adopting DL architectures more frequently, particularly for localization, but their
use is restricted since DL designs require a significant amount of difficult-to-obtain
training data. To analyze the efficacy of ML systems, it is crucial to identify best
practices in these fields and take into account alternative evaluation criteria because
standard performance evaluation criteria are constrained [68].
There are many uses for machine learning, including in forensics, energy manage-
ment, health, and security. Since they are evolving so quickly, new trends in robotics
and machine learning require further study. Among the trends are end-to-end auto-
mated, common, and continuous DL approaches for data-driven intelligent systems.
Technologies that are quickly evolving, such as 5G, cloud computing, and blockchain,
offer new opportunities to improve the system as a whole [69, 70]. Issues with
user security, privacy, and safety must be resolved. Black-box smart systems have
opportunities in AI health applications because of their low deployment costs, rapid
performance, and accuracy [71, 72]. These applications will aid in monitoring,
rehabilitation, and diagnosis. Future research trends also include:
• AI algorithms, which play an essential role in data analytics and decision making for
robotics operations.
• IT infrastructure such as 5G, which plays an integral role by providing low latency,
high traffic capacity, and fast connections for robot-based industrial applications.
• Human–robot collaboration (HRC), which gained prominence lately in health applications
during the pandemic.
• Big data and cloud-based applications, which are expected to accelerate in the coming
years and to be applied with robotics for their powerful analytics that support the
decision-making process.
8 Conclusions
References
1. Saeed, M., Omri, S., Abdel-Khalek, E. S., Ali, M. F., & Alotaibi, M. (2022). Optimal path planning for
drones based on swarm intelligence algorithm. Neural Computing and Applications, 34, 10133–
10155. https://doi.org/10.1007/s00521-022-06998-9
2. Niko, S., et al. (2018). The limits and potentials of deep learning for robotics. The International
Journal of Robotics Research, 37(4), 405–420. https://doi.org/10.1177/0278364918770733
3. Ali, E. S., Zahraa, T., & Mona, H. (2021). Algorithms optimization for intelligent IoV applica-
tions. In J. Zhao & Vinoth K. (Eds.), Handbook of Research on Innovations and Applications
of AI, IoT, and Cognitive Technologies (pp. 1–25). Hershey, PA: IGI Global (2021). https://doi.
org/10.4018/978-1-7998-6870-5.ch001
4. Matt, L, Marie, F, Louise, A., Clare, D, & Michael, F. (2020). Formal specification and verifi-
cation of autonomous robotic systems: A survey. ACM Computing Surveys, 52I(5), 100, 1–41.
https://doi.org/10.1145/3342355.
5. Alexander, L., Konstantin, M., & Pavol. B. (2021). Convolutional Neural Networks Training
for Autonomous Robotics, 29, 1, 75–79. https://doi.org/10.2478/mspe-2021-0010.
6. Hassan, M., Mohammad, H., Othman, O., & Aisha, H. (2022). Performance evaluation of
uplink shared channel for cooperative relay based narrow band internet of things network. In
2022 International Conference on Business Analytics for Technology and Security (ICBATS).
IEEE.
7. Fahad, A., Alsolami, F., & Abdel-Khalek, S. (2022). Machine learning techniques in internet of
UAVs for smart cities applications. Journal of Intelligent & Fuzzy Systems, 42(4), 3203–3226
8. Salih, A., & Sayed A.: Machine learning in cyber-physical systems in industry 4.0. In A.
Luhach & E. Atilla (Eds.), Artificial Intelligence Paradigms for Smart Cyber-Physical Systems.
(pp. 20–41). Hershey, PA: IGI Global. https://doi.org/10.4018/978-1-7998-5101-1.ch002.
9. Gruber, T. R. (1995). Toward principles for the design of ontologies used for knowledge sharing.
International Journal of Human-Computer Studies, 43, 907–928.
10. Lim, G., Suh, I., & Suh, H. (2011). Ontology-Based unified robot knowledge for service robots
in indoor environments. IEEE Transactions on Systems, Man, and Cybernetics Part A: Systems
and Humans, 41, 492–509.
11. Mohammed, D., Aliakbar, A., Muhayy, U., & Jan, R. (2019). PMK—A knowledge processing
framework for autonomous robotics perception and manipulation. Sensors, 19, 1166. https://
doi.org/10.3390/s19051166
12. Wil, M., Martin, B., & Armin, H. (2018). Robotic Process Automation, Springer Fachmedien
Wiesbaden GmbH, part of Springer Nature (2018)
13. Aguirre, S., & Rodriguez, A. (2017). Automation of a business process using robotic
process automation (RPA): A case study. Applied Computational Science and Engineering
Communications in Computer and Information Science.
14. Ilmari, P., & Juha, L. (2021). Robotic process automation (RPA) as a digitalization related tool
to process enhancement and time saving. Research. https://doi.org/10.13140/RG.2.2.13974.
68161
15. Mona, B., & Sayed, A. (2021). Intelligence IoT Wireless Networks. Intelligent Wireless
Communications, IET Book Publisher.
16. Niall, O. et al. (2020). In K. Arai & S. Kapoor (Eds.), Deep Learning versus Traditional
Computer Vision. Springer Nature Switzerland AG 2020: CVC 2019, AISC 943 (pp. 128–144).
https://doi.org/10.1007/978-3-030-17795-9_10.
17. Othman, O., & Muhammad, H. et al. (2022). Vehicle detection for vision-based intelligent
transportation systems using convolutional neural network algorithm. Journal of Advanced
Transportation, Article ID 9189600. https://doi.org/10.1155/2022/9189600.
18. Ross, G., Jeff, D., Trevor, D., & Jitendra, M. (2019). Region-based convolutional networks
for accurate object detection and segmentation. IEEE Transactions on Pattern Analysis and
Machine Intelligence, 38(1), 142–158.
19. Ian, G., Yoshua, B., & Aaron. C. (2016). Deep Learning (Adaptive Computation and Machine
Learning series) Deep Learning. MIT Press.
20. Macaulay, M. O., & Shafiee, M. (2022). Machine learning techniques for robotic and
autonomous inspection of mechanical systems and civil infrastructure. Autonomous Intelligent
Systems, 2, 8. https://doi.org/10.1007/s43684-022-00025-3
21. Khan, S., Rahmani, H., Shah, S. A. A., Bennamoun, M. (2018). A guide to convolutional neural
networks for computer vision. Springer. https://doi.org/10.2200/S00822ED1V01Y201712CO
V01.
22. Kober, J., Bagnell, J. A., & Peters, J. (2013). Reinforcement learning in robotics: A survey.
International Journal of Robotics Research, 32, 1238–1274.
23. Bakri, H., & Elmustafa, A., & Rashid, A.: Machine learning for industrial IoT systems. In J.
Zhao & V. Vinoth Kumar, (Eds.), Handbook of Research on Innovations and Applications of
AI, IoT, and Cognitive Technologies (pp. 336–358). Hershey, PA: IGI Global, (2021). https://
doi.org/10.4018/978-1-7998-6870-5.ch023
24. Luong, N. C., Hoang, D. T., Gong, S., Niyato, D., Wang, P., Liang, Y. C., & Kim, D. I. (2019).
Applications of deep reinforcement learning in communications and networking: A survey.
IEEE Communications Surveys Tutorials, 21, 3133–3174.
25. Chen, Z., & Huang, X. (2017). End-to-end learning for lane keeping of self-driving cars. In
2017 IEEE Intelligent Vehicles Symposium (IV) (pp. 1856–1860). https://doi.org/10.1109/IVS.
2017.7995975.
26. Jiang, H., Liangcai, Z., Gongfa, L., & Zhaojie, J. (2021). Learning for a robot: Deep reinforce-
ment learning, imitation learning, transfer learning, learning for a robot: Deep reinforcement
learning, imitation Learning. Transfer Learning. Sensors, 21, 1278. https://doi.org/10.3390/
s21041278
27. Yan, W., Cristian, C., Beltran, H., Weiwei, W., & Kensuke, H. (2022). An adaptive imitation
learning framework for robotic complex contact-rich insertion tasks. Frontiers in Robotics and
AI, 8, 90–123.
28. Ali, E. S., Hassan, M. B., & Saeed, R. (2020). Machine learning technologies in internet of
vehicles. In: M. Magaia, G. Mastorakis, C. Mavromoustakis, E. Pallis & E. K Markakis (Eds.),
Intelligent Technologies for Internet of Vehicles. Internet of Things. Cham: Springer. https://
doi.org/10.1007/978-3-030-76493-7_7.
29. Alatabani, L. E., Ali, E. S., & Saeed, R. A. (2021). Deep learning approaches for IoV applica-
tions and services. In: N. Magaia, G. Mastorakis, C. Mavromoustakis, E. Pallis, E. K. Markakis
(Eds.), Intelligent Technologies for Internet of Vehicles. Internet of Things. Cham: Springer.
https://doi.org/10.1007/978-3-030-76493-7_8
30. Lina, E., Ali, E., & Mokhtar A. et al. (2022). Deep and reinforcement learning technologies on
internet of vehicle (IoV) applications: Current issues and future trends. Journal of Advanced
Transportation, Article ID 1947886. https://doi.org/10.1155/2022/1947886.
31. Venator, M. et al. (2021). Self-Supervised learning of domain-invariant local features for robust
visual localization under challenging conditions. IEEE Robotics and Automation Letters, 6(2).
32. Abbas, A., Rania, A., Hesham, A. et al. (2021). Quality of services based on intelligent IoT
wlan mac protocol dynamic real-time applications in smart cities. Computational Intelligence
and Neuroscience, 2021, Article ID 2287531. https://doi.org/10.1155/2021/2287531.
33. Maibaum, A., Bischof, A., Hergesell, J., et al. (2022). A critique of robotics in health care.
AI & Society, 37, 467–477. https://doi.org/10.1007/s00146-021-01206-z
34. Yanxue, C., Moorhe, C., & Zhangbo, X. (2021). Artificial intelligence assistive technology
in hospital professional nursing technology. Journal of Healthcare Engineering, Article ID
1721529, 7 pages. https://doi.org/10.1155/2021/1721529.
35. Amanda, P., Jan, B., & Qingbiao, L. (2022). The holy grail of multi-robot planning: Learning
to generate online-scalable solutions from offline-optimal experts. In International Conference
on Autonomous Agents and Multiagent Systems (AAMAS 2022).
36. Lorenzo, C., Gian, C., Cardarilli, L., et al. (2021). Multi-agent reinforcement learning: A
review of challenges and applications. Applied Sciences, 11, 4948. https://doi.org/10.3390/app
11114948
37. Mahbuba, A., Jiong, J., Akhlaqur, R., Ashfaqur, R., Jiafu, W., & Ekram, H. (2021). Resource
allocation and service provisioning in multi-agent cloud robotics: A comprehensive survey.
Manuscript. IEEE. Retrieved February 10, 2021.
38. Wang, Y., Damani, M., Wang, P., et al. (2022). Distributed reinforcement learning for robot
teams: A review. Current Robotics Reports. https://doi.org/10.1007/s43154-022-00091-8
39. Elfatih, N. M., et al. (2022). Internet of vehicle’s resource management in 5G networks using
AI technologies: Current status and trends. IET Communications, 16, 400–420. https://doi.org/
10.1049/cmu2.12315
40. Edmund, J., Greg, F., David, M., & David, W. (2021). The segmented colour feature extreme
learning machine: applications in agricultural robotics. Agronomy, 11, 2290. https://doi.org/10.
3390/agronomy11112290
41. Rodrigues, I. R., da Silva Neto, S. R.,Kelner, J., Sadok, D., & Endo, P. T. (2021). Convolutional
extreme learning machines: A systematic review. Informatics 8, 33. https://doi.org/10.3390/inf
ormatics8020033.
42. Jianwen, G., Xiaoyan, L, Zhenpeng, I., & Yandong, L. et al. (2021). Fault diagnosis of indus-
trial robot reducer by an extreme learning machine with a level-based learning swarm opti-
mizer. Advances in Mechanical Engineering 13(5), 1–10. https://doi.org/10.1177/168781402
11019540
43. Ali, Z., Lorena, D., Saleh, G., Bernard, R., Akif, K., & Mahdi, B. (2021). 4D printing soft robots
guided by machine learning and finite element models. Sensors and Actuators A: Physical, 322,
112774.
44. Elmustafa, S. et al. (2021). Machine learning technologies for secure vehicular communica-
tion in internet of vehicles: Recent advances and applications. Security and Communication
Networks, Article ID 8868355. https://doi.org/10.1155/2021/8868355.
45. Ho, S., Banerjee, H., Foo, Y., Godaba, H., Aye, W., Zhu, J., & Yap, C. (2017). Experimental
characterization of a dielectric elastomer fluid pump and optimizing performance via composite
materials. Journal of Intelligent Material Systems and Structures, 28, 3054–3065.
46. Sarthak, B., Hritwick, B., Zion, T., & Hongliang, R. (2019). Deep reinforcement learning for
soft, flexible robots: brief review with impending challenges. Robotics, 8, 4. https://doi.org/10.
3390/robotics8010004
47. Estifanos, T., & Mihret, M. Robotics and artificial intelligence. International Journal of
Artificial Intelligence and Machine Learning, 10(2).
48. Andrius, D., Jurga, S., Žemaitien, E., & Ernestas, Š. et al. (2022). Advanced applications of
industrial robotics: New trends and possibilities. Applied Science, 12, 135. https://doi.org/10.
3390/app12010135.
49. Elmustafa, S. A. A., & Mujtaba, E. Y. (2019). Internet of things in smart environment: Concept,
applications, challenges, and future directions. World Scientific News (WSN), 134(1), 151.
50. Ali, E. S., Sohal, H. S. (2017). Nanotechnology in communication engineering: Issues,
applications, and future possibilities. World Scientific News (WSN), 66, 134-148.
51. Reham, A. A., Elmustafa, S. A., Rania, A. M., & Rashid, A. S. (2022). Blockchain for IoT-Based
cyber-physical systems (CPS): Applications and challenges. In: D. De, S. Bhattacharyya, &
Rodrigues, J. J. P. C. (Eds.), Blockchain based Internet of Things. Lecture Notes on Data
Engineering and Communications Technologies (Vol. 112). Singapore: Springer. https://doi.
org/10.1007/978-981-16-9260-4_4.
52. Zhang, Q., Lu, J., & Jin, Y. (2020). Artificial intelligence in recommender systems. Complex &
Intelligent Systems. https://doi.org/10.1007/s40747-020-00212-w.
53. Abdalla, R. S., Mahbub, S. A., Mokhtar, R. A., Elmustafa, S. A., Rashid, A. S. (2021). IoE
design principles and architecture. In Book: Internet of Energy for Smart Cities: Machine
Learning Models and Techniques. USA: Publisher: CRC group, Taylor & Francis Group.
54. Hyeyoung, K., Suyeon, L., Yoonseo, P., & Anna, C. (2022). A survey of recommendation
systems: Recommendation models, techniques, and application fields. Electronics, 11, 141.
https://doi.org/10.3390/electronics11010141.
55. Yuanyuan, C., Dixiao, C., Shuzhang, L., et al. (2021). Recent advances in field-controlled micro–
nano manipulations and micro–nano robots. Advanced Intelligent Systems, 4(3), 2100116, 1–23.
https://doi.org/10.1002/aisy.202100116
56. Mona, B., et al. (2021). Artificial intelligence in IoT and its applications. Intelligent Wireless
Communications, IET Book Publisher.
57. Neto, A., Lopes, I. A., & Pirota, K. (2010). A Review on Nanorobotics. Journal of
Computational and Theoretical Nanoscience, 7, 1870–1877.
58. Gautham, G., Yaser, M., & Kourosh, Z. (2021). A Brief review on challenges in design and
development of nanorobots for medical applications. Applied Sciences, 11, 10385. https://doi.
org/10.3390/app112110385
59. Egorov, E., Pieters, C., Korach-Rechtman, H., et al. (2021). Robotics, microfluidics, nanotech-
nology and AI in the synthesis and evaluation of liposomes and polymeric drug delivery
systems. Drug Delivery and Translational Research, 11, 345–352. DOI: 10.1007/s13346-021-
00929-2.
60. Yang, Z., Kai, Z., Haotian, Y., Yi, Z., Dongliang, Z., & Jing, H. (2022). Indoor simultaneous
localization and mapping based on fringe projection profilometry. arXiv:2204.11020v1
[cs.RO].
61. Miramá, V. F., Díez, L. E., Bahillo, A., & Quintero, V. (2021). A survey of machine learning
in pedestrian localization systems: applications, open issues and challenges. IEEE Access, 9,
120138–120157. https://doi.org/10.1109/ACCESS.2021.3108073
62. Tian, Y., Adnane, C., & Houcine, C. (2021). A survey of recent indoor localization scenarios
and methodologies. Sensors, 21, 8086. https://doi.org/10.3390/s21238086
63. Giuseppe, F., René, D., Fabio, S., & Strandhagen, J. O. (2021). Planning and control of
autonomous mobile robots for intralogistics: Literature review and research agenda. European
Journal of Operational Research, 294(2), 405–426. https://doi.org/10.1016/j.ejor.2021.01.019.
64. Alfieri, A., Cantamessa, M., Monchiero, A., & Montagna, F. (2012). Heuristics for puzzle-based
storage systems driven by a limited set of automated guided vehicles. Journal of Intelligent
Manufacturing, 23(5), 1695–1705.
65. Ahmad, B., Xiaodong, Z., & Haiming, S. et al. (2022). Precise motion control of a power line
inspection robot using hybrid time delay and state feedback control. Frontiers in Robotics and
AI 9(24). https://doi.org/10.3389/frobt.2022.746991.
66. Elsa, J., Hung, K., & Emel, D. (2022). A survey of human gait-based artificial intelligence
applications. Frontiers in Robotics and AI, 8. https://doi.org/10.3389/frobt.2021.749274.
67. Xi, V., & Lihui, W. (2021). A literature survey of the robotic technologies during the COVID-
19 pandemic. Journal of Manufacturing Systems, 60, 823–836. https://doi.org/10.1016/j.jmsy.
2021.02.005
68. Ahir, S., Telavane, D., & Thomas, R. (2020). The impact of artificial intelligence, blockchain,
big data and evolving technologies in coronavirus disease-2019 (COVID-19) curtailment. In:
Proceedings of the International Conference of Smart Electronics Communication ICOSEC
2020 (pp. 113–120). https://doi.org/10.1109/ICOSEC49089.2020.9215294.
69. Lana, I. S., Elmustafa, S., & Saeed, A. (2022). Machine learning in healthcare: Theory, applica-
tions, and future trends. In R. El Ouazzani & M. Fattah & N. Benamar (Eds.), AI Applications
for Disease Diagnosis and Treatment (pp. 1–38). Hershey, PA: IGI Global, (2022). https://doi.
org/10.4018/978-1-6684-2304-2.ch001
70. Jat, D., & Singh, C. (2020). Artificial intelligence-enabled robotic drones for COVID-19
outbreak. SpringerBriefs in Applied Sciences and Technology, 37–46. https://doi.org/10.1007/
978-981-15-6572-4_5.
71. Schulte, P., Streit, J., Sheriff, F., Delclos, G., et al. (2020). Potential scenarios and hazards in
the work of the future: A systematic review of the peer-reviewed and gray literatures. Annals of
Work Exposures and Health, 64, 786–816. https://doi.org/10.1093/annweh/wxaa051.
72. Alsolami, F., Alqurashi, F., & Hasan, M. K. et al. (2021). Development of self-synchronized
drones’ network using cluster-based swarm intelligence approach. IEEE Access, 9, 48010–
48022. https://doi.org/10.1109/ACCESS.2021.3064905.
73. Alatabani, L. E., Ali, E. S., Mokhtar, R. A., Khalifa, O. O., & Saeed, R. A. (2022). Robotics
architectures based machine learning and deep learning approaches. In 8th International
Conference on Mechatronics Engineering (ICOM 2022), Online Conference, Kuala Lumpur,
Malaysia (pp. 107–113). https://doi.org/10.1049/icp.2022.2274.
74. Malik, A. A., Masood, T., & Kousar, R. (2020). Repurposing factories with robotics in the face
of COVID-19. IEEE Transactions on Automation Science and Engineering, 5(43), 133–145.
https://doi.org/10.1126/scirobotics.abc2782.
75. Yoon, S. (2020). A study on the transformation of accounting based on new technologies:
Evidence from Korea. Sustainability, 12, 1–23. https://doi.org/10.3390/su12208669
A Review on Deep Learning on UAV Monitoring Systems for Agricultural Applications
T. Petso and R. S. Jamisola Jr
1 Introduction
ing unit (GPU) hardware and advancements of deep learning algorithms architectures
in the computer vision domain [36].
2 Proposed Methodology
In this literature review study, we analyse the use of drones for different agricultural
applications through deep learning strategies. Google Scholar was mainly used as the
search engine for data collection. The keywords used include “animal agricultural
applications”, “computer vision”, “deep learning algorithms”, “plant agricultural
application”, and “UAV monitoring system”. The first step was the collection of related
work, and the second a detailed review of that work. The approach provides a detailed
overview of deep learning strategies, their advantages and disadvantages, and future
research ideas that can be exploited in agriculture.
stage detectors. The difference between the two is a trade-off between detection speed
and accuracy. One stage detectors offer higher detection speeds than two stage
detectors [63]; these include the single shot detector (SSD), the you only look once
(YOLO) models, and RetinaNet, to name a few [87]. Two stage detectors offer higher
detection accuracy than one stage detectors [26]; these include the region-based
convolutional neural network (R-CNN), the fast region-based convolutional neural
network (Fast R-CNN), Faster R-CNN, and Mask R-CNN, to name a few [102].
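To make this trade-off concrete, the short sketch below loads one representative detector from each family using torchvision's pretrained detection models and times a single forward pass on one UAV image. The library choice, the file name, and the confidence threshold are illustrative assumptions for this sketch, not the setup of any reviewed study.

```python
# Minimal sketch: comparing a one-stage and a two-stage detector on one image.
# Assumes torchvision >= 0.13 and a local file "uav_image.jpg"; both are assumptions.
import time
import torch
import torchvision
from torchvision.io import read_image
from torchvision.transforms.functional import convert_image_dtype

image = convert_image_dtype(read_image("uav_image.jpg"), torch.float32)

detectors = {
    "SSD (one stage)": torchvision.models.detection.ssd300_vgg16(weights="DEFAULT"),
    "Faster R-CNN (two stage)": torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT"),
}

for name, model in detectors.items():
    model.eval()
    start = time.perf_counter()
    with torch.no_grad():
        prediction = model([image])[0]            # dict with boxes, labels, scores
    elapsed = time.perf_counter() - start
    kept = (prediction["scores"] > 0.5).sum().item()
    print(f"{name}: {kept} detections above 0.5 confidence in {elapsed:.2f} s")
```

In general, the one stage model returns predictions faster, while the two stage model tends to score higher on accuracy-oriented metrics, which is the trade-off described above.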
The five basic steps for the development of a UAV agricultural monitoring application
are the UAV platform, data collection, the deep learning model, performance
evaluation, and the agricultural application (Fig. 3).
UAV Platform
The three most common types of drones used for agricultural applications are
rotary-wing, fixed-wing, and fixed-wing hybrid Vertical Take-Off and Landing
(VTOL) drones. These types hold different advantages and disadvantages. Rotary-wing
drones can fly at low altitude, hover, and are highly manoeuvrable, which is
beneficial for image-related agricultural monitoring. Their greatest challenge is low
endurance due to high power usage. Fixed-wing drones have high endurance and
payload capability, which can be used for agricultural pesticide spraying. Fixed-wing
hybrids combine rotary-wing and fixed-wing characteristics; they hold the attributes
needed for agricultural applications such as hovering, high endurance, and better
manoeuvrability.
Data Collection
The images collected for the development of deep learning agricultural monitoring
systems are obtained using remote sensing from the drone. The limited public datasets
for agricultural drone videos and images for development of deep learning algorithms
highlight the need to collect datasets. The remote sensing specification and the drone
altitude contribute strongly to the image quality, and therefore to the capability of the
monitoring system. Different environmental conditions such as brightness, contrast,
drone altitude, and sharpness, to name a few, are taken into account to ensure the
development of a robust deep learning agricultural monitoring system. Sample drone
altitude variations, which affect model performance through the feature extraction
characteristics, are highlighted in Figs. 4 and 5.
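As a rough illustration of how flight altitude drives image detail, the ground sampling distance (GSD) of a nadir image can be estimated with the standard photogrammetric relation; the camera values in the sketch below are illustrative assumptions, not specifications taken from the reviewed studies.

```python
# Minimal sketch: ground sampling distance (GSD) as a function of drone altitude.
# GSD = (altitude * sensor_width) / (focal_length * image_width); the default camera
# values below are illustrative assumptions, not from the reviewed studies.
def ground_sampling_distance_cm(altitude_m: float,
                                sensor_width_mm: float = 13.2,
                                focal_length_mm: float = 8.8,
                                image_width_px: int = 5472) -> float:
    """Return the ground footprint of one pixel, in centimetres."""
    gsd_m = (altitude_m * (sensor_width_mm / 1000.0)) / (
        (focal_length_mm / 1000.0) * image_width_px
    )
    return gsd_m * 100.0

for altitude in (10, 30, 60, 120):  # metres above ground level
    print(f"{altitude:>4} m altitude -> {ground_sampling_distance_cm(altitude):.2f} cm/pixel")
```

At roughly 10 m the example camera resolves about 0.3 cm per pixel, while at 120 m one pixel covers more than 3 cm on the ground, which is why small pests and seedlings become harder to detect at higher altitudes.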
Deep Learning Model
The selection of the deep learning algorithm used for an agricultural application
depends on the research objective and the available hardware. To ensure proper
training, data augmentation is applied and hyperparameters such as the optimizer,
batch size, and learning rate are tuned for optimum results during model training.
Data augmentation is primarily used to increase the size and quality of the training
dataset, so that a robust agricultural deep learning model can be developed [113].
The process of collecting a training dataset is often expensive, so data augmentation
also helps to enlarge a limited training dataset [23]. The hyperparameters are mainly
used to fine-tune the deep learning models for improved performance [83].
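A minimal sketch of these two ingredients, assuming a PyTorch/torchvision image-classification setup; the folder layout, transform choices, and hyperparameter values are illustrative starting points rather than those of any reviewed study.

```python
# Minimal sketch: data augmentation and hyperparameter setup for a crop-image classifier.
# Framework (PyTorch/torchvision), dataset layout, and all values are illustrative assumptions.
import torch
from torch import nn, optim
from torchvision import datasets, models, transforms

# Augmentations that mimic UAV capture variation (crop scale, viewpoint, brightness, contrast).
train_transforms = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.7, 1.0)),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(brightness=0.3, contrast=0.3),
    transforms.ToTensor(),
])

# Hypothetical folder of labelled UAV image crops, e.g. data/train/<class_name>/*.jpg
train_set = datasets.ImageFolder("data/train", transform=train_transforms)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=16, shuffle=True)

# Hyperparameters: optimizer, learning rate, and batch size chosen as a starting point.
model = models.resnet50(weights="DEFAULT")
model.fc = nn.Linear(model.fc.in_features, len(train_set.classes))
optimizer = optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

model.train()
for images, labels in train_loader:   # one pass over the data as a demonstration
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```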
Performance Evaluation
The four fundamental evaluation elements are graphically represented in Fig. 6. These
elements are used to compute the performance evaluation metrics: precision, recall,
F1 score, and accuracy. Equations (1)–(6) present their mathematical definitions.
Other metrics commonly used in research studies are the average precision (AP)
and the mean average precision (mAP). The fundamental evaluation elements are the
true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN)
arranged in the confusion matrix of Fig. 6.
$\mathrm{Precision}\ (P) = \dfrac{TP}{TP + FP} \quad (1)$

$\mathrm{Recall}\ (R) = \dfrac{TP}{TP + FN} \quad (2)$

$F1\ \mathrm{Score} = \dfrac{2PR}{P + R} \quad (3)$

$\mathrm{Accuracy}\ (A) = \dfrac{TP + TN}{TP + TN + FP + FN} \quad (4)$

$\mathrm{Average\ Precision}\ (AP) = \sum_{n} \left(\mathrm{Recall}_n - \mathrm{Recall}_{n-1}\right)\,\mathrm{Precision}_n \quad (5)$

$\mathrm{Mean\ Average\ Precision}\ (mAP) = \frac{1}{N}\sum AP \quad (6)$
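The sketch below implements Eqs. (1)–(6) directly from raw counts and a precision–recall list; the counts and the precision–recall curve used in the example are hypothetical.

```python
# Minimal sketch: detection/classification metrics following Eqs. (1)-(6).
from typing import List, Tuple

def precision(tp: int, fp: int) -> float:                     # Eq. (1)
    return tp / (tp + fp)

def recall(tp: int, fn: int) -> float:                        # Eq. (2)
    return tp / (tp + fn)

def f1_score(p: float, r: float) -> float:                    # Eq. (3)
    return 2 * p * r / (p + r)

def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:    # Eq. (4)
    return (tp + tn) / (tp + tn + fp + fn)

def average_precision(pr_curve: List[Tuple[float, float]]) -> float:
    """Eq. (5): sum of (recall_n - recall_{n-1}) * precision_n over a PR curve
    given as (recall, precision) pairs sorted by increasing recall."""
    ap, previous_recall = 0.0, 0.0
    for r, p in pr_curve:
        ap += (r - previous_recall) * p
        previous_recall = r
    return ap

def mean_average_precision(aps: List[float]) -> float:        # Eq. (6)
    return sum(aps) / len(aps)

# Example with hypothetical counts from one UAV test set.
tp, fp, fn, tn = 86, 9, 14, 120
p, r = precision(tp, fp), recall(tp, fn)
print(f"P={p:.3f} R={r:.3f} F1={f1_score(p, r):.3f} A={accuracy(tp, tn, fp, fn):.3f}")
print("AP =", round(average_precision([(0.2, 1.0), (0.5, 0.9), (0.8, 0.7), (1.0, 0.5)]), 3))
```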
Agricultural Applications
The agricultural applications reviewed here cover both plant and animal monitoring.
A graphical representation of these UAV monitoring systems is given in Fig. 7.
Fig. 7 UAV monitoring systems reviewed: plant/crop monitoring (pest infiltration, plant growth, fruit conditions, weed invasion) and livestock monitoring (livestock detection, livestock counting)
better management strategy [29]. It is vital for farmers to have proper yield estimates
to enable preparation for harvesting and market supply projections [9]. Pest outbreaks
in crops are unpredictable, so continuous monitoring to prevent crop losses is of
paramount importance [42]. The capability to detect pests in near real-time helps
farmers make immediate and appropriate decisions on time [25]. The implementation
of deep learning algorithms on drones contributes to appropriate and timely decision
making for prompt plant monitoring tactics [125]. It has been established to increase
agricultural productivity and efficiency and to save costs [29]. The study conducted
by [101] addressed insect monitoring through automated detection to protect
soft-skinned fruits; it was established to be more cost-effective and less time-consuming
and labour-demanding than the existing monitoring methods.
Pest infiltration greatly impacts the overall plant production yield [70]. Pests can
cause a production yield loss of roughly 20–40% [49]. Due to gradual climate change
over the years, pest outbreaks have become common [84]. The use of deep learning
techniques with drones to combat pests in plants is a practical approach [120]. The
study conducted by [25] automatically detected pests in near real-time with a deep
learning technique (YOLO v3 tiny). This capability enabled the drone to spray the
pests at the appropriate location, thereby ensuring less destruction and improving production
yield. The capability to detect, locate, and quantify pest distribution with the aid of
drones and deep learning techniques eliminates human labour, which is costly and
time-consuming [33]. The capability to detect pests during production minimises
yield loss in time [24]. Rice is one of the primary agricultural products in Asian
countries, and effective monitoring is vital to ensure quality and a high production
yield [17, 58].
Table 1 summarises the advantages and disadvantages of studies that use UAV
monitoring systems for automatic pest detection with deep learning models. Deep
learning models provide the capability of near real-time plant monitoring. The mobile
hardware required to achieve this needs a high-performance graphical processing unit
(GPU) to improve the model performance. Deep learning models hold great
opportunities for the better feature extraction capabilities vital for pest detection. An
increase in drone altitude and occlusion reduced the capability of automatic pest
detection.
Agriculture has a great impact on human survivability. Monitoring agricultural yield
during plant growth is ideal to ensure improvements in food planning and security.
The ability to acquire agricultural data at different growth stages is vital for better
farm management strategies [28]. The capability of near real-time monitoring of
seedling germination is vital for early crop yield estimation [48]. The determination
of plant flowering is vital for agricultural monitoring purposes [3]. It is also essential
to establish plant maturity at the appropriate time for decision-making purposes [79],
thus enabling yield prediction for better field management. The use of deep learning
models with drones has been established to save costs and time as compared to the
traditional physical approach [139]. The study conducted by [79] highlighted the
effectiveness of saving time and costs when estimating the maturity of soybeans over
time. The ability to attain pre-harvest information such as product number and size is
important for management decision-making and marketing planning [127].
Conventional maize plant growth monitoring was established to be time-demanding
and labour-intensive compared to the use of deep learning with drones [88]. A study
by [45] highlighted the capability of automatically assessing oil palms that are not
accessible to humans, to establish whether they are ready for harvest or not.
Table 2 highlights the capability of plant growth monitoring at different stages
such as germination, flowering, immaturity, and maturity for UAV agricultural
applications. Plant growth monitoring provides an effective approach to the early
decision making needed to ensure a high production yield and better agricultural
planning. Figure 8 demonstrates the capability of automatic seedling detection with
a lightweight deep learning algorithm from a UAV platform. Factors such as occlusion
due to plant growth, environmental background conditions, and vegetation thickness,
to name a few, impact the overall performance of UAV monitoring systems. Hardware
with graphics processing units is needed to provide the near real-time capability
essential for appropriate and timely action.
Table 1 Advantages and disadvantages of UAV-based pest infiltration monitoring applications
Main focus | Pests | Advantages | Disadvantages | References
Rice pest detection | Leafhopper, insect, arthropod, invertebrate, bug | Instant evaluation and prevention of rice damage in a timely manner | Requires an internet platform | [17]
Coconut tree pest detection | Rhinoceros beetle, red palm weevil, black headed caterpillar, coconut eriophyid mite, termite | Immediate pest monitoring capability | High-rate Wi-Fi required | [24]
Longan crop pest detection | Tessaratoma papillosa | Approach holds near real-time capability with optimum pest location and distribution | Limited power for GPU systems is a challenge | [25]
Maize pest detection | Spodoptera frugiperda | Early detection of infected maize leaves in time | Pest damage on some leaves under occlusion could not be detected | [33]
Maize leaves affected by fall armyworms | Fall armyworm | Capability of near real-time results | Challenge to establish exact location for future reference | [43]
Detect and measure leaf-cutting ant nests | Insects | Deep learning models (YOLOv5s) outperformed the traditional multilayer perceptron neural network | Hardware with high memory (GPU) needed | [106]
Soybean crop pest detection | Acrididae, Anticarsia gemmatalis, Coccinellidae, Diabrotica speciosa, Edessa meditabunda, Euschistus heros adult, Euschistus heros nymph, Gastropoda, Lagria villosa, Nezara viridula adult, Nezara viridula nymph, Spodoptera spp. | Deep learning approach outsmarted the other feature extraction models | Pest detection challenge at higher drone altitudes | [120]
Fig. 8 Sample seedling detection from a UAV platform with custom YOLO v4 tiny
The capability to estimate fruit yield and location is important for farm management
and planning purposes [45]. Combining drones with deep learning algorithms has been
established as an effective method to locate the ideal areas to pick fruits [64]. The
traditional approach has been determined to be time-consuming and labour-demanding
[130]. This approach enhances fruit detection in challenging terrains. The study
conducted by [8] established that automatic detection, counting, and sizing of citrus
fruits on individual trees aided yield estimation. Another study by [129] highlighted
the capability of automatic mango detection and estimation using a UAV monitoring
system with deep learning algorithms. The study conducted by [50] highlighted that
melon detection, size estimation, and location through a UAV monitoring system is
less labour-intensive.
Table 3 highlights the positive capabilities and research gaps of UAV monitoring
systems using deep learning algorithms for fruit condition monitoring. Some of the
advantages highlighted include effective approaches such as promising automatic
counting accuracy as compared to the manual approach. Figure 9 illustrates the
capability of automatic ripe lemon detection needed for harvest planning purposes.
High errors are contributed by the fruit tree canopy during fruit detection and counting
for yield estimation [78]. Environmental conditions such as occlusion and lighting
consistency persist as challenges to fruit condition monitoring.
The presence of weeds poses a great challenge to the overall agricultural production
yield [80]. Weeds are capable of causing a production loss of as much as 45% of the
yield [99]. Weeds, also known as unwanted crops, present in the agricultural field
compete for the needed resources such as sunlight, water, soil nutrients, and growing
space [107]. An effective approach to weed monitoring is vital for appropriate farm
management [66]. Early weed detection is important to improve agricultural
productivity and ensure sustainable agriculture [86]. The use of herbicides to combat
weeds has negative consequences, such as damage to the surrounding environment
and harm to human health [11]. Using an appropriate quantity of herbicides and
pesticides, based on correct weed identification and location, is therefore an important
factor [41, 56]. Drones hold the capability of precise localisation and appropriate
chemical usage [56]. The ability to detect weeds in near real-time is vital for farm
management [76, 124]. The traditional approach to weed detection is manual,
labour-intensive, and time-consuming [85].
Fig. 9 Automatic ripe lemon detection from a UAV platform with custom YOLO v4 tiny
Table 4 highlights advantages and disadvantages of UAV monitoring systems for
weed detection with deep learning models. Weed detection with deep learning models
is an effective approach for the early weed management strategies needed to ensure
a high agricultural production yield. Among the reviewed studies, some of the
challenges encountered include high drone altitude, lighting conditions, and the
automatic detection of weed species that were not classified during deep learning
model training.
Crop diseases have been established to hamper the production yield, hence the need for
extensive crop monitoring [19, 119, 138]. They also have an economic impact on
agricultural production, concerning both quality and quantity [55]. Automatic crop disease
detection is critically important for crop management and efficiency [54]. Early and
correct crop disease detection aids plant management strategies in a timely manner
and ensures a high production yield [52, 62]. Early disease symptoms are likely to look
identical, and proper classification is vital to tackle them [1, 35]. Plant diseases
have been established to affect the continuous food supply. The study by [117] used
Mask R-CNN for successful automatic detection of northern leaf blight, which affects
maize. The traditional manual methods of disease identification have been established
to be time-consuming and labour-demanding compared to drones [60]. They are also
susceptible to human error [46]. Figure 10 presents a visualisation of crop disease
symptoms vital for monitoring purposes. The advantages and disadvantages of crop
disease monitoring are highlighted in Table 5.
Table 6 UAV-based deep learning models used for plant monitoring applications
Deep learning | UAV platform | Application | Findings | Performance | References
YOLO v3; YOLO v3 tiny | Self-assembled APD-616X | Pest detection; pest location | Efficient pesticide usage | mAP: YOLO v3 93.00%, YOLO v3 tiny 89.00%. Speed: YOLO v3 2.95 FPS, YOLO v3 tiny 8.71 FPS | [25]
ResNeSt-50; ResNet-50; EfficientNet; RegNet | DJI Mavic Air 2 | Pest detection; pest location; pest quantification | Effective pest monitoring approach | Validation accuracy: ResNeSt-50 98.77%, ResNet-50 97.59%, EfficientNet 97.89%, RegNet 98.07% | [33]
VGG-16; VGG-19; Xception v3; MobileNet v2 | DJI Phantom 4 Pro | Pest damage | Effective approach to increase crop yield | Accuracy: VGG-16 96.00%, VGG-19 93.08%, Xception v3 96.75%, MobileNet v2 98.25% | [43]
YOLO v5xl; YOLO v5l; YOLO v5m; YOLO v5s | DJI Phantom 4 Adv | Ant nest pest detection | Precise monitoring approach | Accuracy: YOLO v5xl 97.62%, YOLO v5l 98.45%, YOLO v5m 97.89%, YOLO v5s 97.65% | [106]
Inception-V3; ResNet-50; VGG-16; VGG-19; Xception | DJI Phantom 4 Adv | Pest control | Alternative pest monitoring strategies | Accuracy: Inception-V3 91.87%, ResNet-50 93.82%, VGG-16 91.80%, VGG-19 91.33%, Xception 90.52% | [120]
Faster R-CNN; CNN | DJI Phantom 3 Pro | Maize tassel detection | Positive productivity growth monitoring | F1 Score: Faster R-CNN 97.90%, CNN 95.90% | [3]
VGG-16 | eBee Plus | Crop identification | Useful crop identification | F1 Score: 86.00% | [28]
YOLO v3 | DJI Phantom 4 Pro | Sea cucumber detection; sea cucumber density | Successful growth density estimation | mAP 85.50%; Precision 82.00%; Recall 83.00%; F1 Score 82.00% | [65]
CenterNet; MobileNet | DJI Phantom 4 Pro | Cotton stand detection; cotton stand count | Feasibility of early seedling monitoring | mAP: CenterNet 79.00%, MobileNet 86.00%. Average precision: CenterNet 73.00%, MobileNet 72.00% | [68]
Faster R-CNN | DJI Phantom 3 Pro | Citrus grove detection; count; size estimation | Positive yield estimation | SE: 6.59% | [8]
R-CNN | DJI Phantom 4 Pro | Apple detection; apple estimation | Effective approach for yield prediction | F1 Score > 87.00%; Precision > 90.00% | [9]
RetinaNet | DJI Phantom 4 Pro | Melon detection; melon number estimation; melon weight estimation | Successful yield estimation | Precision 92.00%; F1 Score > 90.00% | [50]
FPN; SSD; YOLO v3; YOLO v4; MobileNet-YOLO v4 | DJI Jingwei M600 Pro | Longan fruit detection; longan fruit location | Effective fruit detection | mAP: FPN 54.22%, SSD 66.53%, YOLO v3 72.56%, YOLO v4 81.72%, MobileNet-YOLO v4 89.73% | [64]
YOLO v2 | DJI Phantom 3 | Mango detection; mango estimation | Effective approach for mango estimation | mAP 86.40%; Precision 96.10%; Recall 89.00% | [129]
YOLO v3 | DJI Phantom 4 Pro | Flower detection; immature fruit detection; mature fruit detection | Effective approach for yield prediction | mAP 88.00%; AP 93.00% | [139]
YOLO v3 | DJI Matrice 600 Pro | Monocot weed detection; dicot weed detection | Capability of weed detection in the field | AP monocot 91.48%; AP dicot 86.13% | [32]
FCNN | DJI Phantom 3; DJI Mavic Pro; embedded devices | Hogweed detection | Positive results for hogweed detection | ROC AUC 96.00%; Speed 0.46 FPS | [76]
Faster R-CNN; SSD | DJI Matrice 600 Pro | Weed detection | Weed monitoring | Precision: Faster R-CNN 65.00%, SSD 66.00%. Recall: Faster R-CNN 68.00%, SSD 68.00%. F1 Score: Faster R-CNN 66.00%, SSD 67.00% | [124]
YOLO v3 tiny | DJI Phantom 3 | Weed detection | Effective approach for weed detection in wheat field | mAP 72.50% | [137]
SegNet | Quadcopter (Scanopy) | Mildew disease detection | Promising disease detection in grapes | Accuracy: grapevine-level > 92%, leaf-level > 87% | [54]
SqueezeNet; ResNet-18 | Quadcopter (customised) | Cercospora leaf spot disease detection | Promising disease detection | Validation accuracy: SqueezeNet 99.10%, ResNet-18 99.00% | [97]
DCNN | DJI S1000 | Yellow dust detection | Crop disease monitoring | Accuracy: 85.00% | [138]
Inception-V3; ResNet-50; VGG-19; Xception | DJI Phantom 3 Pro | Leaf disease detection | Crop disease monitoring | Accuracy: Inception-V3 99.04%, ResNet-50 99.02%, VGG-19 99.02%, Xception 98.56% | [119]
[DCNN—Deep Convolutional Neural Network; Faster R-CNN—Faster Region-based Convolutional Neural Network; FCNN—Fully Convolutional Neural Network; FPN—Feature Pyramid Network; VGG-16—Visual Geometry Group 16; VGG-19—Visual Geometry Group 19; YOLO v2—You Only Look Once version 2; YOLO v3—You Only Look Once version 3; YOLO v4—You Only Look Once version 4; YOLO v5—You Only Look Once version 5; R-CNN—Region-based Convolutional Neural Network; CNN—Convolutional Neural Network; SSD—Single Shot Detector; SE—Standard Error]
Fig. 11 Sample sheep detection from a UAV image with custom YOLO v4 tiny model
Table 7 Advantages and disadvantages of animal identification and population count monitoring applications
Main focus | Advantages | Disadvantages | References
Sheep detection and counting | Online approach yielded promising results immediately as compared to offline | Online approach used more power as compared to offline | [2]
Individual cattle identification | Capability of non-intrusive cattle detection and individual identification | Challenge of false positives exists in cases such as multi-cattle alignment and cattle with similar features on the coat | [5]
Aerial cattle monitoring | Several deep learning algorithms highlighted capability of livestock monitoring | Challenging conditions such as blurred images hampered the monitoring | [13]
Aerial livestock monitoring | Automatic sheep detection to aid sheep counting | Sheep in close contact are a challenge for individual identification, and sheep under trees and bushes could not be detected | [109]
Cattle detection and count | Capability for cattle management for grazing purposes | The model performance decreases with fast-moving animals | [111]
Detecting and counting cattle | Capability of cattle counting by deleting duplicate animals | Challenge of cattle movement hampers cattle counting capability | [115]
Livestock detection for counting capability | Capability of livestock monitoring based on mask detection | Challenge of overestimation due to limited training images | [132]
Table 8 UAV-based deep learning models used for animal monitoring applications
Deep learning | UAV platform | Application | Findings | Performance | References
YOLO | Self-assembled drone | Sheep detection; sheep counting | Promising on-board system | Accuracy: offline processing 60.00%; online pre-processing 89.00%; online processing 97.00% | [2]
LRCN | DJI Inspire Mk1 | Cattle detection; single-frame individual; video-based individual | Non-intrusive | mAP: cattle detection 99.30%, single-frame individual 86.07%. Accuracy: video-based individual 98.13% | [5]
YOLO v2; Inception v3 | DJI Matrice 100 | Individual cattle identification | Practical biometric identification | Accuracy: YOLO v2 92.40%, Inception v3 93.60% | [6]
CNN | DJI Phantom 4 Pro; DJI Mavic 2 | Cattle detection; cattle counting | Effective and efficient approach | Accuracy > 90.00% | [15]
YOLO v3; Deep Sort | DJI Tello | Cattle detection; cattle counting | Improved livestock monitoring | Not provided | [37]
YOLO v2 | DJI Phantom 4 | Cattle detection; cattle counting | Positive cattle grazing management | Precision 95.70%; Recall 94.60%; F1 Score 95.20% | [111]
VGG-16; VGG-19; ResNet-50 v2; ResNet-101 v2; ResNet-152 v2; MobileNet; MobileNet v2; DenseNet 121; DenseNet 169; DenseNet 201; Xception v3; Inception ResNet v2; NASNet Mobile; NASNet Large | DJI Phantom 4 Pro | Canchim cattle detection | Promising results for detection | Accuracy: VGG-16 97.22%, VGG-19 97.30%, ResNet-50 v2 97.70%, ResNet-101 v2 98.30%, ResNet-152 v2 96.70%, MobileNet 98.30%, MobileNet v2 78.70%, DenseNet 121 85.20%, DenseNet 169 93.50%, DenseNet 201 93.50%, Xception v3 97.90%, Inception ResNet v2 98.30%, NASNet Mobile 85.70%, NASNet Large 99.20% | [13]
Mask R-CNN | DJI Mavic Pro | Cattle detection; cattle counting | Potential cattle monitoring | Accuracy: pastures 94.00%, feedlot 92.00% | [131]
Mask R-CNN | DJI Mavic Pro | Livestock classification; livestock counting | Effective approach for livestock monitoring | Accuracy: cattle 96%, sheep 92% | [132]
[LRCN—Long-term Recurrent Convolutional Network; YOLO—You Only Look Once; YOLO v2—You Only Look Once version 2; YOLO v3—You Only Look Once version 3; R-CNN—Region-based Convolutional Neural Network; VGG-16—Visual Geometry Group 16; VGG-19—Visual Geometry Group 19]
of interest. The disadvantage it holds is a lower detection speed compared to one stage
detectors, which limits near real-time capability. Thus, applying a two stage detector is
appropriate for plant growth agricultural monitoring.
Effective fruit condition monitoring is beneficial for better agricultural decision-making
in relation to fruit quantity, size, weight, and degree of maturity estimation, to
name a few. These capabilities are needed for the production yield prediction required
for agricultural management planning such as fruit picking and market value estimation.
Fruit detection can help in planning fruit picking to consider the ease or difficulty of
picking and possible dangers. This will help acquire appropriate equipment to ensure
a smooth harvest process during fruit picking time. Fruit detection is performed at
different stages, flowering, mature, and immature, to help decide the harvest time
and ensure the maximum number of ripe fruits. The reviewed studies investigated
both lightweight and non-lightweight deep learning models. This approach requires
less time and labour and is less error-prone than manual fruit monitoring. The highest
performance evaluation among the reviewed studies was 91.10% precision for mango
detection and estimation. Though the model performance can be good, challenges
such as tree occlusion and lighting variations can hamper the overall model
performance.
The presence of weeds hampers plant growth, since weeds compete with the crop for
sunlight, water, space, and soil nutrients. Early detection and appropriate treatment
greatly contribute to a better agricultural production yield. Detecting weeds is a
challenging task because weeds and crop plants share similar characteristic features.
To improve the accuracy of weed detection, we have to increase our knowledge of the
weeds expected to be associated with a particular crop. Considering that there are many
types of weeds, we can concentrate on the ones most relevant to a specific crop and
disregard the others; this way, we can save time in deep learning model training. The
highest performance evaluation for weed detection was established to be 96.00% for
the classification model performance with an FCN. A possible reason for this high
model performance is that an FCN simplifies and speeds up the feature extraction
learning process, since it avoids dense layers in the model architecture.
Plant disease detection is commonly characterised by a change in leaf colour, such as
isolated spots, widespread spots, isolated lesions, or a cluster of lesions, to name a few.
SqueezeNet had the highest accuracy of 99.10%. Other studies highlighted high model
accuracies of over 85.00%. A possible reason for this high accuracy is that the change
in plant leaves provides a distinct cue for detection purposes. We recommend studies
on detecting fallen or broken leaves caused by an external force to help determine the
plant's health.
Automatic livestock identification and counting from a UAV also requires minimal
change in animal behaviour due to the presence of the drone. Most of the reviewed
studies identified livestock individually for population and health monitoring; however,
other studies count livestock without individual identification. The higher the drone
altitude, the greater the challenge of acquiring the distinguishing features required by
deep learning. There are limited studies establishing livestock responses towards
drones at different altitudes, thus great caution must be
taken into consideration in livestock monitoring applications [40]. The highest
performance evaluation for livestock detection among the reviewed studies was 99.30%
in terms of mean average precision, achieved with the LRCN model. LRCN is a model
approach well suited to visual features in videos, activity recognition, and image
classification. Incorporating deep learning algorithms and livestock grazing monitoring
capability on drones can aid an animal grazing management system. Ensuring proper
animal grazing management is essential for agricultural sustainability and maintaining
continuous animal production [111].
6 Conclusions
References
1. Abdulridha, J., Ampatzidis, Y., Kakarla, S. C., & Roberts, P. (2020). Detection of target spot
and bacterial spot diseases in tomato using UAV-based and benchtop-based hyperspectral
imaging techniques. Precision Agriculture, 21(5), 955–978.
2. Al-Thani, N., Albuainain, A., Alnaimi, F., & Zorba, N. (2020). Drones for sheep livestock
monitoring. In 2020 IEEE 20th Mediterranean Electrotechnical Conference (MELECON)
(pp. 672–676). IEEE.
3. Alzadjali, A., Alali, M. H., Sivakumar, A. N. V., Deogun, J. S., Scott, S., Schnable, J. C., &
Shi, Y. (2021). Maize tassel detection from UAV imagery using deep learning. Frontiers in
Robotics and AI, 8.
4. de Andrade Porto, J. V., Rezende, F. P. C., Astolfi, G., de Moraes Weber, V. A., Pache, M. C.
B., & Pistori, H. (2021). Automatic counting of cattle with faster R-CNN on UAV images. In
Anais do XVII Workshop de Visão Computacional, SBC (pp. 1–6).
5. Andrew, W., Greatwood, C., & Burghardt, T. (2017). Visual localisation and individual identi-
fication of holstein friesian cattle via deep learning. In Proceedings of the IEEE International
Conference on Computer Vision Workshops (pp. 2850–2859).
6. Andrew, W., Greatwood, C., & Burghardt, T. (2019). Aerial animal biometrics: Individual
friesian cattle recovery and visual identification via an autonomous UAV with onboard deep
inference. In 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems
(IROS) (pp. 237–243). IEEE.
7. Anghelache, D., Persu, C., Dumitru, D., Băltatu, C., et al. (2021). Intelligent monitoring of
diseased plants using drones. Annals of the University of Craiova-Agriculture, Montanology,
Cadastre Series, 51(2), 146–151.
8. Apolo, O. E. A., Guanter, J. M., Cegarra, G. E., Raja, P., & Ruiz, M. P. (2020). Deep learning
techniques for estimation of the yield and size of citrus fruits using a UAV. European Journal
of Agronomy: The Official Journal of the European Society for Agronomy, 115(4), 183–194.
9. Apolo-Apolo, O. E., Pérez-Ruiz, M., Martínez-Guanter, J., & Valente, J. (2020). A cloud-
based environment for generating yield estimation maps from apple orchards using UAV
imagery and a deep learning technique. Frontiers in Plant Science, 11, 1086.
10. Ayamga, M., Akaba, S., & Nyaaba, A. A. (2021). Multifaceted applicability of drones: A
review. Technological Forecasting and Social Change, 167(120), 677.
11. Bah, M. D., Hafiane, A., & Canals, R. (2018). Deep learning with unsupervised data labeling
for weed detection in line crops in UAV images. Remote Sensing, 10(11), 1690.
12. Barbedo, J. G. A., & Koenigkan, L. V. (2018). Perspectives on the use of unmanned aerial
systems to monitor cattle. Outlook on Agriculture, 47(3), 214–222.
13. Barbedo, J. G. A., Koenigkan, L. V., Santos, T. T., & Santos, P. M. (2019). A study on the
detection of cattle in UAV images using deep learning. Sensors, 19(24), 5436.
14. Barbedo, J. G. A., Koenigkan, L. V., & Santos, P. M. (2020). Cattle detection using oblique
UAV images. Drones, 4(4), 75.
15. Barbedo, J. G. A., Koenigkan, L. V., Santos, P. M., & Ribeiro, A. R. B. (2020). Counting
cattle in UAV images-dealing with clustered animals and animal/background contrast changes.
Sensors, 20(7), 2126.
16. Behjati, M., Mohd Noh, A. B., Alobaidy, H. A., Zulkifley, M. A., Nordin, R., & Abdullah,
N. F. (2021). Lora communications as an enabler for internet of drones towards large-scale
livestock monitoring in rural farms. Sensors, 21(15), 5044.
17. Bhoi, S. K., Jena, K. K., Panda, S. K., Long, H. V., Kumar, R., Subbulakshmi, P., & Jebreen, H.
B. (2021). An internet of things assisted unmanned aerial vehicle based artificial intelligence
model for rice pest detection. Microprocessors and Microsystems, 80(103), 607.
18. Bhoj, S., Tarafdar, A., Singh, M., Gaur, G. (2022). Smart and automatic milking systems:
Benefits and prospects. In Smart and sustainable food technologies (pp. 87–121). Springer.
19. Bouguettaya, A., Zarzour, H., Kechida, A., Taberkit, A. M. (2021). Recent advances on UAV
and deep learning for early crop diseases identification: A short review. In 2021 International
Conference on Information Technology (ICIT) (pp. 334–339). IEEE.
20. Bouguettaya, A., Zarzour, H., Kechida, A., & Taberkit, A. M. (2022). Deep learning tech-
niques to classify agricultural crops through UAV imagery: A review. Neural Computing and
Applications, 34(12), 9511–9536.
21. Brunberg, E., Eythórsdóttir, E., Dỳrmundsson, Ó. R., & Grøva, L. (2020). The presence of
icelandic leadersheep affects flock behaviour when exposed to a predator test. Applied Animal
Behaviour Science, 232(105), 128.
22. de Camargo, T., Schirrmann, M., Landwehr, N., Dammer, K. H., & Pflanz, M. (2021). Opti-
mized deep learning model as a basis for fast UAV mapping of weed species in winter wheat
crops. Remote Sensing, 13(9), 1704.
23. Cauli, N., & Reforgiato Recupero, D. (2022). Survey on videos data augmentation for deep
learning models. Future Internet, 14(3), 93.
24. Chandy, A., et al. (2019). Pest infestation identification in coconut trees using deep learning.
Journal of Artificial Intelligence, 1(01), 10–18.
25. Chen, C. J., Huang, Y. Y., Li, Y. S., Chen, Y. C., Chang, C. Y., & Huang, Y. M. (2021)
Identification of fruit tree pests with deep learning on embedded drone to achieve accurate
pesticide spraying. IEEE Access, 9, 21,986–21,997.
26. Chen, J. W., Lin, W. J., Cheng, H. J., Hung, C. L., Lin, C. Y., & Chen, S. P. (2021). A
smartphone-based application for scale pest detection using multiple-object detection meth-
ods. Electronics, 10(4), 372.
27. Chen, Y., Lee, W. S., Gan, H., Peres, N., Fraisse, C., Zhang, Y., & He, Y. (2019). Strawberry
yield prediction based on a deep neural network using high-resolution aerial orthoimages.
Remote Sensing, 11(13), 1584.
28. Chew, R., Rineer, J., Beach, R., O’Neil, M., Ujeneza, N., Lapidus, D., Miano, T., Hegarty-
Craver, M., Polly, J., & Temple, D. S. (2020). Deep neural networks and transfer learning for
food crop identification in UAV images. Drones, 4(1), 7.
29. Delavarpour, N., Koparan, C., Nowatzki, J., Bajwa, S., & Sun, X. (2021). A technical study on
UAV characteristics for precision agriculture applications and associated practical challenges.
Remote Sensing, 13(6), 1204.
30. Dileep, M., Navaneeth, A., Ullagaddi, S., & Danti, A. (2020). A study and analysis on various
types of agricultural drones and its applications. In 2020 Fifth International Conference on
Research in Computational Intelligence and Communication Networks (ICRCICN) (pp. 181–
185). IEEE
31. Espejo-Garcia, B., Mylonas, N., Athanasakos, L., Vali, E., & Fountas, S. (2021). Combining
generative adversarial networks and agricultural transfer learning for weeds identification.
Biosystems Engineering, 204, 79–89.
32. Etienne, A., Ahmad, A., Aggarwal, V., & Saraswat, D. (2021). Deep learning-based object
detection system for identifying weeds using UAS imagery. Remote Sensing, 13(24), 5182.
33. Feng, J., Sun, Y., Zhang, K., Zhao, Y., Ren, Y., Chen, Y., Zhuang, H., & Chen, S. (2022).
Autonomous detection of spodoptera frugiperda by feeding symptoms directly from UAV
RGB imagery. Applied Sciences, 12(5), 2592.
34. Fenu, G., & Malloci, F. M. (2021). Forecasting plant and crop disease: an explorative study
on current algorithms. Big Data and Cognitive Computing, 5(1), 2.
35. Görlich, F., Marks, E., Mahlein, A. K., König, K., Lottes, P., & Stachniss, C. (2021). UAV-
based classification of cercospora leaf spot using RGB images. Drones, 5(2), 34.
36. Guo, Y., Liu, Y., Oerlemans, A., Lao, S., Wu, S., & Lew, M. S. (2016). Deep learning for
visual understanding: A review. Neurocomputing, 187, 27–48.
37. Hajar, M. M. A., Lazim, I. M., Rosdi, A. R., & Ramli, L. (2021). Autonomous UAV-based
cattle detection and counting using YOLOv3 and deep sort.
38. Hande, M. J. (2021). Indoor farming hydroponic plant grow chamber. International Journal
of Scientific Research and Engineering Trends, 7, 2050–2052.
39. Hasan, A. M., Sohel, F., Diepeveen, D., Laga, H., & Jones, M. G. (2021). A survey of deep
learning techniques for weed detection from images. Computers and Electronics in Agricul-
ture, 184(106), 067.
40. Herlin, A., Brunberg, E., Hultgren, J., Högberg, N., Rydberg, A., & Skarin, A. (2021). Animal
welfare implications of digital tools for monitoring and management of cattle and sheep on
pasture. Animals, 11(3), 829.
41. Huang, H., Lan, Y., Yang, A., Zhang, Y., Wen, S., & Deng, J. (2020). Deep learning versus
object-based image analysis (OBIA) in weed mapping of UAV imagery. International Journal
of Remote Sensing, 41(9), 3446–3479.
42. Iost Filho, F. H., Heldens, W. B., Kong, Z., & de Lange, E. S. (2020). Drones: innovative
technology for use in precision pest management. Journal of Economic Entomology, 113(1),
1–25.
43. Ishengoma, F. S., Rai, I. A., & Said, R. N. (2021). Identification of maize leaves infected
by fall armyworms using UAV-based imagery and convolutional neural networks. Computers
and Electronics in Agriculture, 184(106), 124.
44. Islam, N., Rashid, M. M., Wibowo, S., Wasimi, S., Morshed, A., Xu, C., & Moore, S. (2020).
Machine learning based approach for weed detection in chilli field using RGB images. In
The International Conference on Natural Computation (pp. 1097–1105). Fuzzy Systems and
Knowledge Discovery: Springer.
45. Jintasuttisak, T., Edirisinghe, E., & Elbattay, A. (2022). Deep neural network based date palm
tree detection in drone imagery. Computers and Electronics in Agriculture, 192(106), 560.
46. Joshi, R. C., Kaushik, M., Dutta, M. K., Srivastava, A., & Choudhary, N. (2021). Virleafnet:
Automatic analysis and viral disease diagnosis using deep-learning in vigna mungo plant.
Ecological Informatics, 61(101), 197.
47. Junos, M. H., Mohd Khairuddin, A. S., Thannirmalai, S., & Dahari, M. (2022). Automatic
detection of oil palm fruits from UAV images using an improved YOLO model. The Visual
Computer, 38(7), 2341–2355.
48. Juyal, P., & Sharma, S. (2021). Crop growth monitoring using unmanned aerial vehicle for farm
field management. In 2021 6th International Conference on Communication and Electronics
Systems (ICCES) (pp. 880–884). IEEE
49. Kaivosoja, J., Hautsalo, J., Heikkinen, J., Hiltunen, L., Ruuttunen, P., Näsi, R., Niemeläi-
nen, O., Lemsalu, M., Honkavaara, E., & Salonen, J. (2021). Reference measurements in
developing UAV systems for detecting pests, weeds, and diseases. Remote Sensing, 13(7),
1238.
50. Kalantar, A., Edan, Y., Gur, A., & Klapp, I. (2020). A deep learning system for single and
overall weight estimation of melons using unmanned aerial vehicle images. Computers and
Electronics in Agriculture, 178(105), 748.
51. Kamilaris, A., & Prenafeta-Boldú, F. X. (2018). Deep learning in agriculture: A survey.
Computers and Electronics in Agriculture, 147, 70–90.
52. Kerkech, M., Hafiane, A., & Canals, R. (2018). Deep leaning approach with colorimetric
spaces and vegetation indices for vine diseases detection in UAV images. Computers and
Electronics in Agriculture, 155, 237–243.
53. Kerkech, M., Hafiane, A., & Canals, R. (2020). Vddnet: Vine disease detection network based
on multispectral images and depth map. Remote Sensing, 12(20), 3305.
54. Kerkech, M., Hafiane, A., & Canals, R. (2020). Vine disease detection in UAV multispec-
tral images using optimized image registration and deep learning segmentation approach.
Computers and Electronics in Agriculture, 174(105), 446.
55. Kerkech, M., Hafiane, A., Canals, R., & Ros, F. (2020). Vine disease detection by deep
learning method combined with 3D depth information. In International Conference on Image
and Signal Processing (pp. 82–90). Springer.
56. Khan, S., Tufail, M., Khan, M. T., Khan, Z. A., & Anwar, S. (2021). Deep learning-based
identification system of weeds and crops in strawberry and pea fields for a precision agriculture
sprayer. Precision Agriculture, 22(6), 1711–1727.
57. Kitano, B. T., Mendes, C. C., Geus, A. R., Oliveira, H. C., & Souza, J. R. (2019). Corn plant
counting using deep learning and UAV images. IEEE Geoscience and Remote Sensing Letters.
58. Kitpo, N., & Inoue, M. (2018). Early rice disease detection and position mapping system using
drone and IoT architecture. In 2018 12th South East Asian Technical University Consortium
(SEATUC) (Vol. 1, pp. 1–5). IEEE
59. Krul, S., Pantos, C., Frangulea, M., & Valente, J. (2021). Visual SLAM for indoor livestock
and farming using a small drone with a monocular camera: A feasibility study. Drones, 5(2),
41.
60. Lan, Y., Huang, Z., Deng, X., Zhu, Z., Huang, H., Zheng, Z., Lian, B., Zeng, G., & Tong,
Z. (2020). Comparison of machine learning methods for citrus greening detection on UAV
multispectral images. Computers and Electronics in Agriculture, 171(105), 234.
61. Lan, Y., Huang, K., Yang, C., Lei, L., Ye, J., Zhang, J., Zeng, W., Zhang, Y., & Deng, J.
(2021). Real-time identification of rice weeds by UAV low-altitude remote sensing based on
improved semantic segmentation model. Remote Sensing, 13(21), 4370.
62. León-Rueda, W. A., León, C., Caro, S. G., & Ramírez-Gil, J. G. (2022). Identification of
diseases and physiological disorders in potato via multispectral drone imagery using machine
learning tools. Tropical Plant Pathology, 47(1), 152–167.
63. Li, B., Yang, B., Liu, C., Liu, F., Ji, R., & Ye, Q. (2021) Beyond max-margin: Class margin
equilibrium for few-shot object detection. In Proceedings of the IEEE/CVF Conference on
Computer Vision and Pattern Recognition (pp. 7363–7372).
64. Li, D., Sun, X., Elkhouchlaa, H., Jia, Y., Yao, Z., Lin, P., Li, J., & Lu, H. (2021). Fast detection
and location of longan fruits using UAV images. Computers and Electronics in Agriculture,
190(106), 465.
65. Li, J. Y., Duce, S., Joyce, K. E., & Xiang, W. (2021). Seecucumbers: Using deep learning and
drone imagery to detect sea cucumbers on coral reef flats. Drones, 5(2), 28.
66. Liang, W. C., Yang, Y. J., & Chao, C. M. (2019). Low-cost weed identification system using
drones. In 2019 Seventh International Symposium on Computing and Networking Workshops
(CANDARW) (pp. 260–263). IEEE.
67. Lin, Y., Chen, T., Liu, S., Cai, Y., Shi, H., Zheng, D., Lan, Y., Yue, X., & Zhang, L. (2022).
Quick and accurate monitoring peanut seedlings emergence rate through UAV video and deep
learning. Computers and Electronics in Agriculture, 197(106), 938.
68. Lin, Z., & Guo, W. (2021). Cotton stand counting from unmanned aerial system imagery
using mobilenet and centernet deep learning models. Remote Sensing, 13(14), 2822.
69. Liu, C., Jian, Z., Xie, M., & Cheng, I. (2021). A real-time mobile application for cattle tracking
using video captured from a drone. In 2021 International Symposium on Networks (pp. 1–6).
IEEE: Computers and Communications (ISNCC).
70. Liu, J., & Wang, X. (2021). Plant diseases and pests detection based on deep learning: A
review. Plant Methods, 17(1), 1–18.
71. Liu, J., Abbas, I., & Noor, R. S. (2021). Development of deep learning-based variable rate
agrochemical spraying system for targeted weeds control in strawberry crop. Agronomy, 11(8),
1480.
72. Loey, M., ElSawy, A., & Afify, M. (2020). Deep learning in plant diseases detection for agri-
cultural crops: A survey. International Journal of Service Science, Management, Engineering,
and Technology (IJSSMET), 11(2), 41–58.
73. Maes, W. H., & Steppe, K. (2019). Perspectives for remote sensing with unmanned aerial
vehicles in precision agriculture. Trends in plant science, 24(2), 152–164.
74. Mathew, A., Amudha, P., & Sivakumari, S. (2020). Deep learning techniques: An overview.
In International Conference on Advanced Machine Learning Technologies and Applications
(pp. 599–608). Springer
75. Meena, S. D., & Agilandeeswari, L. (2021). Smart animal detection and counting frame-
work for monitoring livestock in an autonomous unmanned ground vehicle using restricted
supervised learning and image fusion. Neural Processing Letters, 53(2), 1253–1285.
76. Menshchikov, A., Shadrin, D., Prutyanov, V., Lopatkin, D., Sosnin, S., Tsykunov, E., Iakovlev,
E., & Somov, A. (2021). Real-time detection of hogweed: UAV platform empowered by deep
learning. IEEE Transactions on Computers, 70(8), 1175–1188.
77. van der Merwe, D., Burchfield, D. R., Witt, T. D., Price, K. P., & Sharda, A. (2020). Drones
in agriculture. In Advances in agronomy (Vo. 162, pp. 1–30).
78. Mirhaji, H., Soleymani, M., Asakereh, A., & Mehdizadeh, S. A. (2021). Fruit detection and
load estimation of an orange orchard using the YOLO models through simple approaches
in different imaging and illumination conditions. Computers and Electronics in Agriculture,
191(106), 533.
79. Moeinizade, S., Pham, H., Han, Y., Dobbels, A., & Hu, G. (2022). An applied deep learning
approach for estimating soybean relative maturity from UAV imagery to aid plant breeding
decisions. Machine Learning with Applications, 7(100), 233.
80. Mohidem, N. A., Che’Ya, N. N., Juraimi, A. S., Fazlil Ilahi, W. F., Mohd Roslim, M. H.,
Sulaiman, N., Saberioon, M., & Mohd Noor, N. (2021). How can unmanned aerial vehicles
be used for detecting weeds in agricultural fields? Agriculture, 11(10), 1004.
81. Monteiro, A., Santos, S., & Gonçalves, P. (2021). Precision agriculture for crop and livestock
farming-brief review. Animals, 11(8), 2345.
82. Nazir, S., & Kaleem, M. (2021). Advances in image acquisition and processing technologies
transforming animal ecological studies. Ecological Informatics, 61(101), 212.
83. Nematzadeh, S., Kiani, F., Torkamanian-Afshar, M., & Aydin, N. (2022). Tuning hyperpa-
rameters of machine learning algorithms and deep neural networks using metaheuristics: A
bioinformatics study on biomedical and biological cases. Computational Biology and Chem-
istry, 97(107), 619.
84. Nguyen, H. T., Lopez Caceres, M. L., Moritake, K., Kentsch, S., Shu, H., & Diez, Y. (2021).
Individual sick fir tree (abies mariesii) identification in insect infested forests by means of
UAV images and deep learning. Remote Sensing, 13(2), 260.
85. Ofori, M., El-Gayar, O. F. (2020). Towards deep learning for weed detection: Deep convolu-
tional neural network architectures for plant seedling classification.
86. Osorio, K., Puerto, A., Pedraza, C., Jamaica, D., & Rodríguez, L. (2020). A deep learning
approach for weed detection in lettuce crops using multispectral images. AgriEngineering,
2(3), 471–488.
87. Ouchra, H., & Belangour, A. (2021). Object detection approaches in images: A survey. In
Thirteenth International Conference on Digital Image Processing (ICDIP 2021) (Vol. 11878,
pp. 118780H). International Society for Optics and Photonics.
88. Pang, Y., Shi, Y., Gao, S., Jiang, F., Veeranampalayam-Sivakumar, A. N., Thompson, L., Luck,
J., & Liu, C. (2020). Improved crop row detection with deep neural network for early-season
maize stand count in UAV imagery. Computers and Electronics in Agriculture, 178(105), 766.
89. Petso, T., Jamisola, R. S., Mpoeleng, D., & Mmereki, W. (2021) Individual animal and herd
identification using custom YOLO v3 and v4 with images taken from a UAV camera at different
altitudes. In 2021 IEEE 6th International Conference on Signal and Image Processing (ICSIP)
(pp. 33–39). IEEE.
90. Petso, T., Jamisola, R. S., Jr., Mpoeleng, D., Bennitt, E., & Mmereki, W. (2021). Automatic
animal identification from drone camera based on point pattern analysis of herd behaviour.
Ecological Informatics, 66(101), 485.
91. Petso, T., Jamisola, R. S., & Mpoeleng, D. (2022). Review on methods used for wildlife
species and individual identification. European Journal of Wildlife Research, 68(1), 1–18.
92. Ponnusamy, V., & Natarajan, S. (2021). Precision agriculture using advanced technology of
iot, unmanned aerial vehicle, augmented reality, and machine learning. In Smart Sensors for
Industrial Internet of Things (pp. 207–229). Springer.
93. Qian, W., Huang, Y., Liu, Q., Fan, W., Sun, Z., Dong, H., Wan, F., & Qiao, X. (2020). UAV
and a deep convolutional neural network for monitoring invasive alien plants in the wild.
Computers and Electronics in Agriculture, 174(105), 519.
94. Rachmawati, S., Putra, A. S., Priyatama, A., Parulian, D., Katarina, D., Habibie, M. T., Sia-
haan, M., Ningrum, E. P., Medikano, A., & Valentino, V. (2021). Application of drone technol-
ogy for mapping and monitoring of corn agricultural land. In 2021 International Conference
on ICT for Smart Society (ICISS) (pp. 1–5). IEEE.
95. Raheem, D., Dayoub, M., Birech, R., & Nakiyemba, A. (2021). The contribution of cereal
grains to food security and sustainability in Africa: potential application of UAV in Ghana,
Nigeria, Uganda, and Namibia. Urban Science, 5(1), 8.
96. Rahman, M. F. F., Fan, S., Zhang, Y., & Chen, L. (2021). A comparative study on application
of unmanned aerial vehicle systems in agriculture. Agriculture, 11(1), 22.
97. Rangarajan, A. K., Balu, E. J., Boligala, M. S., Jagannath, A., & Ranganathan, B. N. (2022).
A low-cost UAV for detection of Cercospora leaf spot in okra using deep convolutional neural
network. Multimedia Tools and Applications, 81(15), 21,565–21,589.
98. Raoult, V., Colefax, A. P., Allan, B. M., Cagnazzi, D., Castelblanco-Martínez, N., Ierodia-
conou, D., Johnston, D. W., Landeo-Yauri, S., Lyons, M., Pirotta, V., et al. (2020). Operational
protocols for the use of drones in marine animal research. Drones, 4(4), 64.
99. Razfar, N., True, J., Bassiouny, R., Venkatesh, V., & Kashef, R. (2022). Weed detection in
soybean crops using custom lightweight deep learning models. Journal of Agriculture and
Food Research, 8(100), 308.
366 T. Petso and R. S. Jamisola Jr
100. Rivas, A., Chamoso, P., González-Briones, A., & Corchado, J. M. (2018). Detection of cattle
using drones and convolutional neural networks. Sensors, 18(7), 2048.
101. Roosjen, P. P., Kellenberger, B., Kooistra, L., Green, D. R., & Fahrentrapp, J. (2020). Deep
learning for automated detection of Drosophila suzukii: Potential for UAV-based monitoring.
Pest Management Science, 76(9), 2994–3002.
102. Roy, A. M., Bose, R., & Bhaduri, J. (2022). A fast accurate fine-grain object detection model
based on YOLOv4 deep neural network. Neural Computing and Applications, 34(5), 3895–
3921.
103. Safarijalal, B., Alborzi, Y., & Najafi, E. (2022). Automated wheat disease detection using a
ROS-based autonomous guided UAV.
104. Safonova, A., Guirado, E., Maglinets, Y., Alcaraz-Segura, D., & Tabik, S. (2021). Olive
tree biovolume from UAV multi-resolution image segmentation with mask R-CNN. Sensors,
21(5), 1617.
105. Saleem, M. H., Potgieter, J., & Arif, K. M. (2021). Automation in agriculture by machine
and deep learning techniques: A review of recent developments. Precision Agriculture, 22(6),
2053–2091.
106. dos Santos, A., Biesseck, B. J. G., Latte, N., de Lima Santos, I. C., dos Santos, W. P., Zanetti, R.,
& Zanuncio, J. C. (2022). Remote detection and measurement of leaf-cutting ant nests using
deep learning and an unmanned aerial vehicle. Computers and Electronics in Agriculture,
198(107), 071.
107. dos Santos, Ferreira A., Freitas, D. M., da Silva, G. G., Pistori, H., & Folhes, M. T. (2017).
Weed detection in soybean crops using convnets. Computers and Electronics in Agriculture,
143, 314–324.
108. Sarwar, F., Griffin, A., Periasamy, P., Portas, K., & Law, J. (2018). Detecting and counting
sheep with a convolutional neural network. In 2018 15th IEEE International Conference on
Advanced Video and Signal Based Surveillance (AVSS) (pp. 1–6). IEEE.
109. Sarwar, F., Griffin, A., Rehman, S. U., & Pasang, T. (2021). Detecting sheep in UAV images.
Computers and Electronics in Agriculture, 187(106), 219.
110. Shankar, R. H., Veeraraghavan, A., Sivaraman, K., Ramachandran, S. S., et al. (2018). Appli-
cation of UAV for pest, weeds and disease detection using open computer vision. In 2018
International Conference on Smart Systems and Inventive Technology (ICSSIT) (pp. 287–292).
IEEE
111. Shao, W., Kawakami, R., Yoshihashi, R., You, S., Kawase, H., & Naemura, T. (2020). Cattle
detection and counting in UAV images based on convolutional neural networks. International
Journal of Remote Sensing, 41(1), 31–52.
112. Sharma, A., Jain, A., Gupta, P., & Chowdary, V. (2020). Machine learning applications for
precision agriculture: A comprehensive review. IEEE Access, 9, 4843–4873.
113. Shorten, C., & Khoshgoftaar, T. M. (2019). A survey on image data augmentation for deep
learning. Journal of Big Data, 6(1), 1–48.
114. Skendžić, S., Zovko, M., Živković, I. P., Lešić, V., & Lemić, D. (2021). The impact of climate
change on agricultural insect pests. Insects, 12(5), 440.
115. Soares, V., Ponti, M., Gonçalves, R., & Campello, R. (2021). Cattle counting in the wild with
geolocated aerial images in large pasture areas. Computers and Electronics in Agriculture,
189(106), 354.
116. Stein, E. W. (2021). The transformative environmental effects large-scale indoor farming may
have on air, water, and soil. Air, Soil and Water Research, 14(1178622121995), 819.
117. Stewart, E. L., Wiesner-Hanks, T., Kaczmar, N., DeChant, C., Wu, H., Lipson, H., Nelson,
R. J., & Gore, M. A. (2019). Quantitative phenotyping of northern leaf blight in UAV images
using deep learning. Remote Sensing, 11(19), 2209.
118. Talaviya, T., Shah, D., Patel, N., Yagnik, H., & Shah, M. (2020). Implementation of artificial
intelligence in agriculture for optimisation of irrigation and application of pesticides and
herbicides. Artificial Intelligence in Agriculture, 4, 58–73.
119. Tetila, E. C., Machado, B. B., Menezes, G. K., Oliveira, Ad. S., Alvarez, M., Amorim, W. P.,
Belete, N. A. D. S., Da Silva, G. G., & Pistori, H. (2019). Automatic recognition of soybean
A Review on Deep Learning on UAV Monitoring Systems … 367
leaf diseases using UAV images and deep convolutional neural networks. IEEE Geoscience
and Remote Sensing Letters, 17(5), 903–907.
120. Tetila, E. C., Machado, B. B., Astolfi, G., de Souza Belete, N. A., Amorim, W. P., Roel, A. R.,
& Pistori, H. (2020). Detection and classification of soybean pests using deep learning with
UAV images. Computers and Electronics in Agriculture, 179(105), 836.
121. Tiwari, A., Sachdeva, K., & Jain, N. (2021). Computer vision and deep learningbased frame-
work for cattle monitoring. In 2021 IEEE 8th Uttar Pradesh Section International Conference
on Electrical (pp. 1–6). IEEE: Electronics and Computer Engineering (UPCON).
122. Ukwuoma, C. C., Zhiguang, Q., Bin Heyat, M. B., Ali, L., Almaspoor, Z., & Monday, H.
N. (2022). Recent advancements in fruit detection and classification using deep learning
techniques. Mathematical Problems in Engineering.
123. Vayssade, J. A., Arquet, R., & Bonneau, M. (2019). Automatic activity tracking of goats using
drone camera. Computers and Electronics in Agriculture, 162, 767–772.
124. Veeranampalayam Sivakumar, A. N., Li, J., Scott, S., Psota, E., Jhala, A., Luck, J. D., & Shi, Y.
(2020). Comparison of object detection and patch-based classification deep learning models
on mid-to late-season weed detection in UAV imagery. Remote Sensing, 12(13), 2136.
125. Velusamy, P., Rajendran, S., Mahendran, R. K., Naseer, S., Shafiq, M., & Choi, J. G. (2021).
Unmanned aerial vehicles (UAV) in precision agriculture: Applications and challenges. Ener-
gies, 15(1), 217.
126. Wani, J. A., Sharma, S., Muzamil, M., Ahmed, S., Sharma, S., & Singh, S. (2021). Machine
learning and deep learning based computational techniques in automatic agricultural dis-
eases detection: Methodologies, applications, and challenges. In Archives of Computational
Methods in Engineering (pp. 1–37).
127. Wittstruck, L., Kühling, I., Trautz, D., Kohlbrecher, M., & Jarmer, T. (2020). UAV-based
RGB imagery for hokkaido pumpkin (cucurbita max.) detection and yield estimation. Sensors,
21(1), 118.
128. Xie, W., Wei, S., Zheng, Z., Jiang, Y., & Yang, D. (2021). Recognition of defective carrots
based on deep learning and transfer learning. Food and Bioprocess Technology, 14(7), 1361–
1374.
129. Xiong, J., Liu, Z., Chen, S., Liu, B., Zheng, Z., Zhong, Z., Yang, Z., & Peng, H. (2020).
Visual detection of green mangoes by an unmanned aerial vehicle in orchards based on a deep
learning method. Biosystems Engineering, 194, 261–272.
130. Xiong, Y., Zeng, X., Chen, Y., Liao, J., Lai, W., & Zhu, M. (2022). An approach to detecting
and mapping individual fruit trees integrated YOLOv5 with UAV remote sensing.
131. Xu, B., Wang, W., Falzon, G., Kwan, P., Guo, L., Chen, G., Tait, A., & Schneider, D. (2020).
Automated cattle counting using mask R-CNN in quadcopter vision system. Computers and
Electronics in Agriculture, 171(105), 300.
132. Xu, B., Wang, W., Falzon, G., Kwan, P., Guo, L., Sun, Z., & Li, C. (2020). Livestock classi-
fication and counting in quadcopter aerial images using mask R-CNN. International Journal
of Remote Sensing, 41(21), 8121–8142.
133. Yang, Q., Shi, L., Han, J., Yu, J., & Huang, K. (2020). A near real-time deep learning approach
for detecting rice phenology based on UAV images. Agricultural and Forest Meteorology,
287(107), 938.
134. Yang, S., Yang, X., & Mo, J. (2018). The application of unmanned aircraft systems to plant
protection in china. Precision Agriculture, 19(2), 278–292.
135. Zhang, H., Lin, P., He, J., & Chen, Y. (2020) Accurate strawberry plant detection system based
on low-altitude remote sensing and deep learning technologies. In 2020 3rd International
Conference on Artificial Intelligence and Big Data (ICAIBD) (pp. 1–5). IEEE.
136. Zhang, H., Wang, L., Tian, T., & Yin, J. (2021). A review of unmanned aerial vehicle low-
altitude remote sensing (UAV-LARS) use in agricultural monitoring in china. Remote Sensing,
13(6), 1221.
137. Zhang, R., Wang, C., Hu, X., Liu, Y., Chen, S., et al. (2020) Weed location and recognition
based on UAV imaging and deep learning. International Journal of Precision Agricultural
Aviation, 3(1).
368 T. Petso and R. S. Jamisola Jr
138. Zhang, X., Han, L., Dong, Y., Shi, Y., Huang, W., Han, L., González-Moreno, P., Ma, H., Ye,
H., & Sobeih, T. (2019). A deep learning-based approach for automated yellow rust disease
detection from high-resolution hyperspectral UAV images. Remote Sensing, 11(13), 1554.
139. Zhou, X., Lee, W. S., Ampatzidis, Y., Chen, Y., Peres, N., & Fraisse, C. (2021). Strawberry
Maturity Classification from UAV and Near-Ground Imaging Using Deep Learning. Smart
Agricultural Technology, 1(100), 001.
Navigation and Trajectory Planning
Techniques for Unmanned Aerial
Vehicles Swarm
Abstract Navigation and trajectory planning are among the most important problems in unmanned aerial vehicle (UAV) and robotics research. The UAV swarm, or flying ad-hoc network, has recently attracted extensive attention from the aviation industry, academia, and the research community, as it has become a powerful tool for smart cities, rescue and disaster management, and military applications. In a UAV swarm, the UAVs interact with one another, and the control and communication structure of the swarm requires specific decisions to improve trajectory planning and navigation. In addition, executing flight plans efficiently demands considerable processing time and power under scarce onboard resources. Artificial intelligence (AI) is a powerful tool for optimization and for accurate decision-making and power management, although it comes at the cost of additional data communication and processing. Combining AI with navigation and path planning adds considerable value and improves system robustness. The UAV industry is moving toward AI approaches for developing UAV swarms and promises more intelligent swarm interaction. Given the importance of this topic, this chapter provides a systematic review of the AI approaches and the main algorithms that enable the development of navigation and trajectory planning strategies for UAV swarms.
1 Introduction
Drones, also known as unmanned aerial vehicles (UAVs), can be operated remotely without humans on board [1]. UAVs have been investigated as a disruptive technology that complements and supports operations traditionally performed by humans. Owing to their excellent mobility, flexibility, easy deployment, high performance, low maintenance, and adaptive altitude, UAVs are widely used in many civil and military applications, for example wildfire detection and monitoring, traffic control, emergency rescue, the medical field, and intelligent transportation. UAVs can provide wide-coverage sensing for different environments [2].
Various communication technologies and standards have emerged for UAVs, including cloud computing, software-defined networking, and big data analytics. UAV design has also passed through several communication evolutions, beginning with 3G broadband signals, moving to 4G for higher data rates, and now to 5G end-to-end connectivity. The evolution of UAV communications from 4G to 5G provides new technologies that support cellular communication for UAV operations with high reliability and efficient energy utilization [3]. 5G cellular networks provide enhanced UAV broadband communication and also enable UAVs to act as flying base stations for a swarm and as gateways to ground cellular stations.
Navigation and trajectory planning are crucial issues for UAVs. Planning UAV trajectories in complex environments that contain many obstacles is one of the major challenges facing their application [4]. In addition, establishing a network of UAVs that can avoid collisions while taking their kinematic characteristics into account is one of the most important requirements in UAV swarm applications.
In light of these challenges, and to achieve operational efficiency and safety of the UAV swarm, the flight plan must be estimated intelligently, especially in complex environments. Trajectory planning for UAVs has therefore become a research hotspot. In this chapter we provide a comprehensive review of UAV swarm architectures, applications, and navigation and trajectory planning technologies. Our main contributions are summarized as follows.
• We review UAV swarm architectures, communications, and control systems.
• We discuss classifications of UAV swarm navigation and trajectory planning.
• We review the most important intelligent technologies used for UAV swarm trajectory planning.
The rest of this chapter is organized as follows. Section 2 provides UAV technical background, together with UAV swarm advantages and applications. Swarm communication and control system architectures are presented in Sect. 3. Section 4 covers navigation and path planning for UAV swarms. Classical techniques for UAV swarm navigation and path planning are reviewed in Sect. 5. In Sect. 6, the reactive approaches for UAV swarm navigation and path planning are discussed. Finally, the conclusion is provided in Sect. 7.
Drone technology began to develop widely after the United States announced federal rules in 2016 regulating the public use of UAVs. Depending on the purpose, UAVs have been used in a number of fields such as agricultural monitoring, power-line inspection, and photography, in addition to various search and rescue operations [5]. In recent years, the UAV swarm has become an important research topic: a swarm of drones is managed, and interaction between the drones enabled, through intelligent algorithms that allow the swarm to conduct joint operations along pre-defined paths controlled from an operations center. UAV operations depend on the capability for control, maneuvering, and power utilization. The following section provides a brief overview of UAV architecture and intelligent operations.
The architecture of a UAV consists of three layers, as shown in Fig. 1, which relate to data collection, processing, and operation [7]. The data collection layer consists of devices such as sensors, light detectors, and cameras. The other layers contain processing devices, control systems, and systems related to mapping and decision-making [8].
In UAVs, the central control system shown in Fig. 2 controls the UAV trajectory in the real environment. The controller adjusts the speed, flight control, radio, and power source. More specifically, the components are described as follows [9].
• Speed controller: provides the high-frequency signals that operate the UAV motors and control their speed.
• Positioning system: calculates the time and location information of the UAV and determines the coordinates and altitude of the aircraft.
• Flight controller: manages flight operations by reading the positioning-system information while controlling communications.
• Battery: UAV batteries are made of materials that give high energy density and long range, such as lithium polymer; additional batteries can be added to support long-range flight.
• Gimbal: stabilizes the UAV payload about its three-dimensional axes.
• Sensors: a number of sensors in the UAV capture 3D images or detect and avoid collisions.
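To make this component breakdown concrete, the sketch below groups the components into a simple data structure; the class and field names are illustrative placeholders, not part of any specific autopilot API.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class PositionFix:
    """Output of the positioning system: coordinates, altitude, and time."""
    latitude: float
    longitude: float
    altitude_m: float
    timestamp_s: float

@dataclass
class UAVControlSystem:
    """Illustrative grouping of the control components described above."""
    motor_speeds_rpm: List[float] = field(default_factory=lambda: [0.0] * 4)
    battery_level_pct: float = 100.0
    gimbal_angles_deg: Tuple[float, float, float] = (0.0, 0.0, 0.0)
    last_fix: Optional[PositionFix] = None

    def set_motor_speed(self, motor: int, rpm: float) -> None:
        # Speed controller: command the speed of a single motor.
        self.motor_speeds_rpm[motor] = rpm

    def update_fix(self, fix: PositionFix) -> None:
        # Flight controller: read the positioning-system output.
        self.last_fix = fix

if __name__ == "__main__":
    uav = UAVControlSystem()
    uav.update_fix(PositionFix(24.7136, 46.6753, 120.0, 0.0))
    uav.set_motor_speed(0, 3500.0)
    print(uav)
```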
UAVs can act together and carry out different operation scenarios as a swarm, that is, as a group of UAVs. Recent studies have explored the benefits and features of swarming insect behavior in nature [10]. For example, bee swarms and bird flocks provide an intelligent concept of flight that can inspire good solutions for UAV tasks. More generally, swarming, as a form of complex collective behavior, can emerge through interactions between large numbers of UAVs acting intelligently [11]. A UAV swarm can carry out tactical operations with high efficiency and performance, and it increases operational quality and reliability across different applications [12].
A UAV swarm can carry out higher-level tasks than a single UAV. Moreover, swarms allow for fault tolerance: if one UAV of the team is lost, the remaining members can accomplish the assigned tasks by reallocating the missions to the surviving team members [13]. A swarm of UAVs can be deployed to perform various missions including searching, target tracking, high-accuracy search, and surveillance [14]. All of these operations are carried out by a squadron or group of UAVs directed by navigation, guidance, and control systems that manage UAV allocation, flight-area coordination, and communications. These tasks operate within a complex system that combines several integrated technologies [15]. Additionally, artificial intelligence (AI) technologies are used in building UAV systems for purposes such as intelligent maneuvering, trajectory planning, and swarm interaction.
Many previous papers have discussed individual UAV systems and their various applications; however, few have studied UAV swarms and their limitations versus advantages. The literature shows that UAVs work well in a number of surveillance-related applications and scenarios and have advantages even when used alone [15]. However, operating UAVs in a swarm offers additional advantages over a single UAV, especially in search tasks: with a swarm, searching can be done in parallel and the range of operations can be greatly increased. Even so, UAV swarms face challenges in trajectory planning and interaction. The advantages of a UAV swarm compared with a single UAV are summarized in Table 1 [16].
Table 1  Comparison between a single UAV and swarm UAV systems

Features                      Single UAV    Swarm UAV
Operations duration           Poor          High
Scalability                   Limited       High
Mission speed                 Slow          Fast
Independence                  Low           High
Cost                          High          Low
Communication requirements    High          Low
Radar cross section           Large         Small

A. Synchronized Actions
B. Time Efficiency
A swarm of UAVs reduces the time needed to complete search or monitoring tasks and missions. For example, the authors in [7] used a UAV swarm to detect nuclear radiation and build a map for rescue operations.
C. Complementarity of Team Members
With a swarm of heterogeneous UAVs, further advantages can be achieved because the swarm can take part in different operations and tasks at the same time [18].
D. Reliability
A UAV swarm provides greater fault tolerance and flexibility in case a single UAV's mission fails.
E. Technology Evolution
With the development of integrated systems and miniaturization techniques, UAV models suited to swarm operation can be produced, characterized by light weight and small size [18].
F. Cost
A single high-performance UAV that performs complex tasks is very costly compared with a number of low-cost UAVs performing the same task, where cost is related to power, size, and weight [18].
A large variety of applications exists where UAV swarm systems are used, as shown in Fig. 3. Figure 4 shows the number of publications on UAV swarms and their applications between 2018 and 2022. The following subsections provide an overview of the most important UAV applications.
A. Photogrammetry
Photogrammetry extracts quantitative information from scanned images and recovers surface point positions. Several works have addressed UAV swarms performing imagery collection. For example, [19] presented a low-altitude thermal imaging system able to observe a specific area according to a flight plan.
B. Security and Surveillance
Many applications use UAV swarms for camera-based video surveillance of specific targets [20]. Swarms also help in monitoring, traffic control operations, and many military surveillance operations.
[Fig. 4  Number of Publications (2018–2022)]
C. Battlefields
Swarms of UAVs help cover battlefields, gathering intelligence and transferring it to ground receiving stations for decision-making [21]. In many military applications, UAV swarms serve to locate enemy positions in urban or remote areas, on land or at sea.
D. Earth Monitoring
UAV swarms are used to monitor geophysical processes and pollutant levels by means of sensors and processing units that carry out autonomous surveys along a predetermined path [21].
E. Precision Agriculture
In the agricultural field, UAV swarms help spray pesticides on plants to combat agricultural pests while ensuring high productivity and efficiency [22]. They can also monitor specific areas and analyze data to make spraying decisions.
F. Disaster Management and Goods Delivery
UAV swarms assist in rescue operations during disasters, especially in the early hours, and help deliver emergency medical supplies [24]. They can also assess risks and damage in a timely manner by means of integrated cloud computing. Some companies, such as Amazon, are working on using UAV swarms to deliver goods through computing systems and the Internet [25, 26].
G. Healthcare Application
In healthcare, UAVs help to collect data at different medical levels, from patient-related sensor information up to health centers [27]. One example of such an application is a UAV star network topology that uses radio alert technology to allocate resources and consists of the following stages.
• Stage 1: data collection, in which the UAV gathers patients' information.
• Stage 2: data reporting, in which the information collected by the UAV is reported to medical servers or doctors' end devices.
• Stage 3: data processing, in which decisions about the patient's healthcare are made to provide diagnosis and prescription.
As the number of UAVs in the swarm increases, a centralized communication approach can be used; it provides an organizational structure that reduces the number of swarms and UAVs connected to the central network and gives some UAVs a degree of independence [33]. In addition, UAVs that travel long distances can lose their connection to the central network, so decentralized networks are allocated to these aircraft to carry out interactive communications in real time [15].
A dedicated single-group swarm network can be combined with other networks, as shown in Fig. 8, so that each network has a central architecture and a specialized architecture with different applications depending on the task. The overall architecture is organized in a centralized manner, but the difference lies at the level of the UAVs within each private network group [37]. Communication within UAV swarm groups is similar to communication within a single swarm, with a mechanism for inter-group communication defined by the infrastructure. Gateway UAVs are responsible for connecting to the infrastructure and for coordinating communications between the missions of the various UAV groups. This architecture supports multitasking applications in which groups conduct joint multi-theater military operations, with the central control center communicating with the different UAV swarms [18, 37].
Decentralized architectures are preferred when missions are complex and a large number of UAVs perform them, as they allow the network topology and the communication between the UAVs to change [19, 39].
As reviewed above, UAV communications engineering has evolved significantly to serve a number of different and important scenarios, and there are several communication structures to choose from. Table 2 summarizes the advantages and disadvantages of the discussed architectures. The centralized communication architecture is suitable for scenarios with a small number of UAV swarms and relatively simple tasks; as the tasks become more complex and the swarms larger, the other architectures are used according to the required scenario [40]. When coverage must be expanded through a multi-hop network scenario, the decentralized communication architecture is the most suitable [41].
Many communication technologies can provide UAV communications. Figure 10 shows a classification of UAV communication technologies into four types: cellular, satellite, Wi-Fi-based, and cognitive-radio UAV communications.
Path planning is defined as the method of finding the shortest, optimal path between a source and a destination, and it is one of the most important problems in the UAV arena. The core goal of UAV path planning is to find a flight path with an effective cost that fulfils the UAV performance requirements with a small collision probability during flight [20]. UAV route planning normally comprises three main terms [21, 44]: motion planning, navigation, and trajectory planning. Motion planning handles constraints such as the flight route and the turning motion along the planned route. Trajectory planning extends route planning with velocity, time, and the UAV's kinematics, whereas navigation is concerned with localization and collision avoidance.
A UAV needs to accomplish route planning while moving from a source to a destination. UAVs perceive the neighboring environment using sensors in order to navigate, control, and plan flight mobility. The UAV route planning stages that must be followed during operation are (i) climate and weather sensing, (ii) navigation, and (iii) UAV movement control, and these stages are applied throughout the trip [46]. Climate and weather sensing gives the UAV environmental awareness. The route planning and navigation methods are applied continuously in search of an optimal route, and the UAV's movement and velocity are monitored by a central controller for collision avoidance. Furthermore, the UAVs need to communicate with neighboring UAVs for network management during their mission [47].
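As an illustration only, the following minimal Python sketch shows how the three stages above (sensing, navigation/route planning, movement control) might be arranged in a UAV's control loop; the function and class names are hypothetical placeholders, not from any specific autopilot framework.

```python
import time

def sense_environment(uav_id: int) -> dict:
    # Stage (i): climate/weather and obstacle sensing (placeholder values).
    return {"wind_mps": 3.2, "obstacles": [], "position": (0.0, 0.0, 50.0)}

def plan_route(state: dict, goal: tuple) -> list:
    # Stage (ii): navigation / route planning toward the goal.
    # A trivial straight-line waypoint list stands in for a real planner.
    return [state["position"], goal]

def control_movement(waypoints: list, max_speed: float = 10.0) -> None:
    # Stage (iii): movement control, monitored for collision avoidance.
    for wp in waypoints:
        print(f"flying toward {wp} at <= {max_speed} m/s")

def mission_loop(uav_id: int, goal: tuple, steps: int = 3) -> None:
    # The three stages are repeated throughout the trip.
    for _ in range(steps):
        state = sense_environment(uav_id)
        route = plan_route(state, goal)
        control_movement(route)
        time.sleep(0.1)  # stand-in for the control period

if __name__ == "__main__":
    mission_loop(uav_id=1, goal=(100.0, 200.0, 60.0))
```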
In this model, t denotes the operating time, h the flight height, and s the UAV velocity. Pmin and Pmax depend on the motor specification and the UAV weight; Pmin is the lowest power required for the UAV to take off, with α denoting the motor speed. On this basis, the total communication cost (Tcom), which reduces the overall UAV cost and time, can be modeled accordingly.
In the first technique, the UAV is modeled as a particle and the route constraint is formulated with a spatial search [49]. Although this NP-hard problem has no general solution, multiple CI algorithms [23, 24] can be applied to route-planning optimization by simplifying the computation of the cost and constraint function gradients.
The second technique models the problem based on the shape of the UAV. In shape-based algorithms, the problem can be converted [25, 49] into a 2D shape with shape parameters such as the center of gravity and wing span, and then resolved by the same methods used when the UAV is treated as a particle. In the third technique, the UAV is modeled with dynamic and kinematic constraints, such as its maximum and minimum turning radii. Compared with the previous techniques, the third is more complicated but more widely applied in practice. For the dynamic and kinematic constraints, CI methods [26, 50] can be used, with advantages in computation and fast convergence.
UAV applications span a growing range of fields, for example navigation, detection, operations, and transportation. Because of the complexity of the environment, which involves many uncertain and unstructured factors, robust 3D route planning methods are crucially required [51]. Although 3D route planning offers far more opportunities than 2D route planning, the challenges increase dramatically once kinematic constraints are included. A traditional problem is to model the 3D space while considering the kinematic constraints for collision-free route planning. Kinematic constraints, whether temporal, geometric, or physical, are difficult to resolve with conventional CI methods, which may encounter numerous difficulties such as a low convergence rate and a wide exploration range [52].
This chapter focuses on 3D environments, with emphasis on the challenges mentioned in the sections above. 3D techniques have various advantages and characteristics when combined with suitable CI algorithms. To avoid the challenges of immature solutions and slow convergence in low-altitude 3D UAV route planning, genetic algorithms (GAs) can be used [27, 28]. Enhanced particle swarm optimization (PSO) techniques [29, 30] can be utilized to overcome blind exploration of wide-ranging problems and to execute comprehensive 3D route optimization.
Improved ant colony optimization (ACO) techniques [31] for 3D route planning have been discussed extensively; they can also increase the selection speed and reduce the probability of converging to local optima. Unlike swarm methods, federated learning (FL) algorithms have generally been utilized for vision-based navigation in UAVs to enable image-based decisions and detection [32]. To address the UAV route-planning attack problem, a fusion neural-network (NN) based technique has been proposed [33]; this method can readily be enhanced by parallelization. Recently, with the development of computing chips, the high computing time and performance needed by deep learning (DL) and machine learning (ML) techniques [34, 35] have been guaranteed. These ML and DL techniques have been used extensively in UAV 3D route planning to resolve NP-hard problems more accurately over a wide search region [53].
UAV route planning methods can be categorized into three groups, namely combinative, sampling-based, and biologically-inspired methods, as presented in Fig. 11.
• Route length: the total path that the UAV travels from the start point to the end point.
• Optimization: the route calculation and its parameters should be efficient in time, energy, and cost; routes can be classified into three classes, i.e., non-optimal, sub-optimal, and optimal.
• Extensiveness: the characteristics utilized in route planning for discovering the route; it offers the UAV a platform and an optimal route solution.
B. A* algorithm
The A* algorithm is a graph traversal and route-search algorithm commonly used to find the optimum route because of its optimality and completeness [60]. It finds the optimum route with little processing time, but it has to store and remember all nodes that have been visited earlier, and it uses this memory to identify the best path from its current state. The next node is chosen using the expression

f(n) = g(n) + h(n),

where n denotes the following node on the route, g(n) is the route cost from the start node S to n, and h(n) is a heuristic function that estimates the lowest-cost path from n to the goal G. The minimum path cost is estimated in order to reach the next optimum node, and the optimum nodes are selected repeatedly based on these costs so that the optimum route avoids obstacles [61]. Figure 12 illustrates the phases involved in searching for the optimum route with the A* algorithm. A* relies on an efficient heuristic cost, expands large search areas, and is appropriate only in static environments. Since the A* algorithm builds the optimum route from neighboring nodes of the roadmap, the resulting route can be jagged and long.
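To make the search concrete, here is a minimal, self-contained A* sketch on a 2D grid; the grid representation and the Manhattan-distance heuristic are illustrative choices, not prescribed by the chapter.

```python
import heapq

def astar(grid, start, goal):
    """A* on a 2D grid: grid[r][c] == 1 marks an obstacle. Returns a path or None."""
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])   # Manhattan heuristic h(n)
    open_set = [(h(start), 0, start)]                          # heap entries are (f, g, node)
    came_from, g_cost, closed = {}, {start: 0}, set()
    while open_set:
        f, g, node = heapq.heappop(open_set)
        if node in closed:
            continue
        closed.add(node)
        if node == goal:                                       # reconstruct the path
            path = [node]
            while node in came_from:
                node = came_from[node]
                path.append(node)
            return path[::-1]
        r, c = node
        for nbr in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nbr
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1                                     # g(n): cost from the start S
                if ng < g_cost.get(nbr, float("inf")):
                    g_cost[nbr] = ng
                    came_from[nbr] = node
                    heapq.heappush(open_set, (ng + h(nbr), ng, nbr))  # f(n) = g(n) + h(n)
    return None

if __name__ == "__main__":
    grid = [[0, 0, 0, 0],
            [1, 1, 0, 1],
            [0, 0, 0, 0],
            [0, 1, 1, 0]]
    print(astar(grid, (0, 0), (3, 3)))
```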
In cell decomposition (CD), the free space is divided into cells whose adjacency is recorded in a connectivity graph; a route in this graph corresponds to a channel in free space, drawn as a succession of striped cells [64]. The channel is then converted into a free route by linking the initial configuration to the goal configuration through the midpoints of the intersections of adjacent cells in the channel.
In approximate CD, the planning space is divided into a regular grid with an explicit cell size and shape, which makes it easy to configure. Adaptive CD exploits the available information about the free space while following the same basic obstacle-avoidance concept as regular CD [44, 64].
The benefits of this method are that it is practical to implement in more than two dimensions and relatively quick to compute. However, because it is an iterative process, it is not necessarily practical to compute online, as there is no guarantee when, or whether, a solution will be found. Additionally, while both exact and approximate cell decomposition methods exist, the approximate method (shown in the figure above) can produce very suboptimal solutions.
Motion planning with artificial potential fields (APF) was initially used for online collision avoidance where the UAV has no prior knowledge of the obstacles but avoids them in real time. The comparatively simple concept treats the vehicle as a point under the effect of an artificial potential field, where variations of the field over space characterize the environment structure [65]. Attractive potentials pull the vehicle toward the goal and repulsive potentials push the UAV away from obstacles [44, 66]. Consequently, the environment is decomposed into a set of values, where high values are associated with obstacles and low values with the goal. Several steps are used to construct the map using potential fields. First, the target point is assigned a large negative value, and Cfree is assigned increasing values as the distance from the goal increases; typically, the inverse of the distance from the goal is used [10, 65]. Second, Cobstacle is assigned the highest values, and Cfree is assigned decreasing values as the distance from the obstacles increases; typically, the inverse of the distance from the obstacle is used. Finally, the two potentials in Cfree are added and a steepest-descent approach is used to find an appropriate path from the start point to the end point (see Fig. 15 and Table 3) [45].
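The following minimal sketch illustrates this idea under common simplifying assumptions (a quadratic attractive term, an inverse-distance repulsive term, and a numerical steepest-descent step); the gains and field shapes are illustrative, not the specific formulation used in the cited works.

```python
import math

def attractive(pos, goal, k_att=1.0):
    # Quadratic attractive potential pulling the UAV toward the goal.
    return 0.5 * k_att * ((pos[0] - goal[0]) ** 2 + (pos[1] - goal[1]) ** 2)

def repulsive(pos, obstacles, k_rep=1.0, influence=3.0):
    # Inverse-distance repulsive potential pushing the UAV away from obstacles.
    total = 0.0
    for ox, oy in obstacles:
        d = math.hypot(pos[0] - ox, pos[1] - oy)
        if 1e-6 < d < influence:
            total += 0.5 * k_rep * (1.0 / d - 1.0 / influence) ** 2
    return total

def descend(start, goal, obstacles, step=0.1, iters=500):
    # Steepest descent on the combined field using a numerical gradient.
    pos, eps = list(start), 1e-3
    field = lambda p: attractive(p, goal) + repulsive(p, obstacles)
    for _ in range(iters):
        gx = (field((pos[0] + eps, pos[1])) - field((pos[0] - eps, pos[1]))) / (2 * eps)
        gy = (field((pos[0], pos[1] + eps)) - field((pos[0], pos[1] - eps))) / (2 * eps)
        pos[0] -= step * gx
        pos[1] -= step * gy
        if math.hypot(pos[0] - goal[0], pos[1] - goal[1]) < 0.2:
            break
    return tuple(pos)

if __name__ == "__main__":
    print(descend(start=(0.0, 0.0), goal=(10.0, 10.0), obstacles=[(5.0, 5.5)]))
```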
• Delete operation: randomly select a point on the route and connect its two neighboring nodes directly. If eliminating the selected node results in a shorter, collision-free route, the point is removed from the route.
• Enhance operation: used only on collision-free routes. Choose a point on the route and insert two new points on either side of the chosen point, then link the two new points with a route segment. If the new route is feasible, eliminate the chosen point.
These genetic operations are applied to parent routes to create an optimized child route. In GAs, the parent route is the preliminary route obtained from a preceding route-planning operation, which can be produced, for example, with a roadmap method. The parent routes should be line segments that link the start and the end through numerous intermediate points. GAs are robust search algorithms that need very little information about the environment to search efficiently [67]. Most studies have used GAs only for navigation in static environments; navigation in dynamic environments with mobile obstacles has not been discussed extensively in the literature. To achieve excellent results in UAV route planning, several studies have applied GAs jointly with other intelligent algorithms, which is sometimes called a hybrid approach [50].
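As an illustration of the delete and enhance operations described above, the sketch below applies them to a route represented as a list of 2D waypoints; the collision check is a stand-in placeholder and would normally query the environment map.

```python
import random

def collision_free(p, q, obstacles, clearance=1.0):
    # Placeholder feasibility test: sample the segment p-q and keep a clearance
    # from point obstacles. A real planner would query the environment map.
    for t in (i / 10.0 for i in range(11)):
        x, y = p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])
        if any((x - ox) ** 2 + (y - oy) ** 2 < clearance ** 2 for ox, oy in obstacles):
            return False
    return True

def delete_operation(route, obstacles):
    # Remove a random interior waypoint if joining its neighbours stays collision-free.
    if len(route) <= 2:
        return route
    i = random.randrange(1, len(route) - 1)
    if collision_free(route[i - 1], route[i + 1], obstacles):
        return route[:i] + route[i + 1:]
    return route

def enhance_operation(route):
    # Replace a random interior waypoint by two points on either side of it,
    # smoothing the route (used only on collision-free routes).
    if len(route) <= 2:
        return route
    i = random.randrange(1, len(route) - 1)
    prev, cur, nxt = route[i - 1], route[i], route[i + 1]
    a = ((prev[0] + cur[0]) / 2, (prev[1] + cur[1]) / 2)
    b = ((cur[0] + nxt[0]) / 2, (cur[1] + nxt[1]) / 2)
    return route[:i] + [a, b] + route[i + 1:]

if __name__ == "__main__":
    parent = [(0, 0), (2, 5), (5, 5), (8, 1), (10, 10)]
    child = enhance_operation(delete_operation(parent, obstacles=[(5, 3)]))
    print(child)
```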
The ANN structure is inspired by the operation of biological neural networks. It is built from a group of connected computational units known as artificial neurons (ANs). Each link between ANs can transmit a signal from one point to another [68]; an AN processes the signals it receives and then signals the ANs connected to it. In ANN configurations for UAV route planning, the link between neurons carries a signal that is usually a real number, and the neuron output is computed by a nonlinear function. ANNs are typically optimized through stochastic mathematical approaches based on fitting huge amounts of data [69], after which a suitable solution can be obtained and expressed as a mathematical function.
ANN algorithms reduce the mathematical complexity by eliminating the collocation requirement of the computational environment and by exploiting fast computing hardware [62]. Since an ANN operates through parallel computation, convergence is generally very fast, and the created route is safe and optimal [63]. Two key forms of ANN approach have been used in UAV route planning: in the first, the UAV builds its route from a sample trajectory and utilizes a direct-association approach to optimize and compute the trajectory [64]; in the second, NNs are used to estimate the system dynamics, objective function, and gradient, which eliminates the collocation requirement and thus reduces the size of the nonlinear programming problem [65]. At present, the second type is more popular and has been extended to solve multiple-UAV problems [66]. Additionally, ANNs have generally been combined with other approaches and algorithms [67, 68] such as PFM, PSO, and GA to maximize their advantages. Deep neural networks (DNNs) are multi-layer NNs that have recently been used extensively in AI, for example in speech recognition and image processing. Owing to their ability to characterize and extract features precisely, they can be applied to facilitate future UAV route planning in complex environments.
Firefly algorithms (FAs) are inspired by the behavior and flashing activity of fireflies and belong to the class of metaheuristic algorithms. Their concepts include general identification and random states as a statistical trial-and-error model of fireflies in nature [70]. The firefly is a flying beetle of the Lampyridae family, usually called a lightning bug because of its ability to create light. It produces light through the rapid oxidation of luciferin in the presence of the enzyme luciferase. This process is known as bioluminescence, and fireflies use it to glow without emitting heat. Fireflies use this light for mate selection, communication, and occasionally also for scaring off insects that try to attack them.
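The chapter does not spell out the firefly update rule; the sketch below shows the movement step in its commonly used form (attraction toward brighter fireflies plus a small random step), with illustrative parameter values.

```python
import math
import random

def firefly_step(x_i, x_j, beta0=1.0, gamma=0.1, alpha=0.2):
    """Move firefly x_i toward a brighter firefly x_j (standard FA movement step)."""
    r2 = sum((a - b) ** 2 for a, b in zip(x_i, x_j))   # squared distance r_ij^2
    beta = beta0 * math.exp(-gamma * r2)               # attractiveness decays with distance
    return [a + beta * (b - a) + alpha * (random.random() - 0.5) for a, b in zip(x_i, x_j)]

if __name__ == "__main__":
    dim, brightness = 2, lambda x: -(x[0] ** 2 + x[1] ** 2)   # brighter = closer to origin
    swarm = [[random.uniform(-5, 5) for _ in range(dim)] for _ in range(10)]
    for _ in range(50):
        for i in range(len(swarm)):
            for j in range(len(swarm)):
                if brightness(swarm[j]) > brightness(swarm[i]):
                    swarm[i] = firefly_step(swarm[i], swarm[j])
    print(max(swarm, key=brightness))
```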
Recently, FAs have been used as an optimization tool, and their applications are spreading into nearly all engineering areas, including mobile robot navigation. In [70], the authors presented an FA-based mobile robot navigation approach in the presence of static obstacles; the work achieved the three primary navigation objectives of route safety, route length, and route smoothness. In [71], the authors applied FAs to find the shortest collision-free path for single mobile robot navigation in a simulation environment. The authors of [72] used FAs for underwater mobile robot navigation and developed a scheduling strategy for robot swarms to avoid jamming and interference in a 3D marine environment. Reference [73] discussed a similar environment, presenting real-life underwater robot navigation in a partially known environment using a Lévy-flight firefly method.
An FA-based cooperative strategy for detecting failed robots in a multi-robot environment is discussed in [74]. A 3D FA application for world exploration with aerial navigation was implemented and developed in [75]. An enhanced version of the FA has been applied to unmanned combat aerial vehicle (UCAV) route planning in a crowded, complex environment to avoid hazard areas and minimize fuel cost. A concentric-sphere-based modified FA [76] has been presented to avoid random movement of the fireflies with less computational effort; the experimental results show strong performance in achieving navigation goals in a complex environment. Reference [77] addressed the navigation problem specifically under dynamic conditions.
ACO algorithms originate from the behaviour of ant colonies and their capability to find the shortest route from the source (nest) to a destination while seeking food [78]. In the route-planning setting, the routes of the ant swarm together form the solution space of the optimization problem. Pheromone concentration accumulates more quickly on shorter routes, so the number of ants selecting those routes grows. Ultimately, under this positive feedback, all the ants concentrate on the shortest route, and the corresponding solution is the optimum of the route-planning problem [80]. ACO for UAV route planning is typically developed by dividing the flying area into a grid and improving a route between a grid point and the destination points [85] so that the optimal route can be searched efficiently and rapidly [81]. An improved algorithm was discussed in [81] with the assistance of a climbing weight and a 3D grid. Today, ACO is used for efficient route planning and to handle mobile robot navigation problems with obstacle avoidance.
Compared with other computational intelligence (CI) algorithms, ACO has solid robustness and a strong capability to search for the best solution. Furthermore, ACO is an evolutionary, population-based algorithm that is fundamentally simple and easy to run in parallel. To enhance the performance of ACO in route-planning problems, it can easily be combined with various heuristic algorithms.
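For reference, the sketch below shows the core pheromone mechanism in its classic Ant System form (probabilistic edge selection followed by evaporation and deposit); the parameter values and the tiny example graph are illustrative only.

```python
import random

def choose_next(current, unvisited, pheromone, dist, alpha=1.0, beta=2.0):
    # Probabilistic edge selection: favour strong pheromone and short edges.
    weights = [pheromone[(current, j)] ** alpha * (1.0 / dist[(current, j)]) ** beta
               for j in unvisited]
    return random.choices(unvisited, weights=weights)[0]

def update_pheromone(pheromone, tours, rho=0.5, q=1.0):
    # Evaporation on all edges, then deposit proportional to 1/length of each tour.
    for edge in pheromone:
        pheromone[edge] *= (1.0 - rho)
    for tour, length in tours:
        for a, b in zip(tour, tour[1:]):
            pheromone[(a, b)] += q / length
            pheromone[(b, a)] += q / length
    return pheromone

if __name__ == "__main__":
    nodes = [0, 1, 2, 3]                                  # tiny illustrative graph
    dist = {(i, j): abs(i - j) + 1 for i in nodes for j in nodes if i != j}
    pheromone = {edge: 1.0 for edge in dist}
    for _ in range(20):                                   # a few ant iterations
        tours = []
        for _ant in range(5):
            tour, unvisited = [0], [1, 2, 3]
            while unvisited:
                nxt = choose_next(tour[-1], unvisited, pheromone, dist)
                unvisited.remove(nxt)
                tour.append(nxt)
            length = sum(dist[(a, b)] for a, b in zip(tour, tour[1:]))
            tours.append((tour, length))
        pheromone = update_pheromone(pheromone, tours)
    print(min(tours, key=lambda t: t[1]))
```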
Cuckoo search (CS) algorithms are based on the brood-parasitic behavior of cuckoos, which lay their eggs in the nests of other birds. The algorithms follow three basic rules for the optimization problem, as discussed in [79]: each cuckoo lays one egg at a time in a randomly selected nest; the best nests, with high-quality eggs, are passed on to the next generation; and the number of available nests is fixed, with each laid cuckoo egg having a probability P ∈ (0, 1) of being discovered by the host bird. In that case, the host bird either abandons the current nest and builds a new one or gets rid of the egg. CS algorithms improve efficiency and convergence rate, so they are widely recognized in various engineering optimization problems. Mobile robot navigation is one area where computational time and performance need to be optimized [80].
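A compact sketch of the two core CS moves just described (a Lévy-flight step around the current solutions and random abandonment of a fraction pa of nests) is given below; the Mantegna approximation of the Lévy step and all parameter values are illustrative choices.

```python
import math
import random

def levy_step(beta=1.5):
    # Mantegna's approximation of a Levy-distributed step length.
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2) /
             (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u, v = random.gauss(0, sigma), random.gauss(0, 1)
    return u / abs(v) ** (1 / beta)

def cuckoo_search(fitness, dim=2, n_nests=15, pa=0.25, iters=200, bound=5.0):
    nests = [[random.uniform(-bound, bound) for _ in range(dim)] for _ in range(n_nests)]
    best = min(nests, key=fitness)
    for _ in range(iters):
        for i in range(n_nests):
            # Levy flight around the current nest, biased by the best nest found so far.
            cand = [x + 0.01 * levy_step() * (x - b) for x, b in zip(nests[i], best)]
            if fitness(cand) < fitness(nests[i]):
                nests[i] = cand
        for i in range(n_nests):
            # A fraction pa of nests is discovered by the host bird and rebuilt at random.
            if random.random() < pa:
                nests[i] = [random.uniform(-bound, bound) for _ in range(dim)]
        best = min(nests + [best], key=fitness)
    return best

if __name__ == "__main__":
    sphere = lambda x: sum(v * v for v in x)   # toy objective to minimize
    print(cuckoo_search(sphere))
```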
CS algorithms have been used for wheeled robot navigation in static, partially known environments, with real-life experiments and simulations in complex settings; the simulation and experimental results agree well, with only slight deviation errors [81].
CS-based algorithms perform well when combined with other navigation methods. One such method combines adaptive neuro-fuzzy inference systems (ANFIS) with CS to obtain better navigation results in uncertain environments. Another hybrid route-planning method for uncertain 3D environments hybridizes CS with differential evolution (DE) algorithms to accelerate global convergence; the enhanced convergence speed helps the aerial robot explore the 3D environment. 3D CS applications, particularly for the battlefield, are discussed in [82], where a hybrid method combining CS and DE is proposed for the aerial 3D route-planning optimization problem. The DE component optimizes the cuckoo selection process, which improves the CS algorithm noticeably, with the cuckoos acting as search agents for the optimum route.
PSO is an optimization approach inspired by bird flocking. There are two parameters in this approach: position and speed. Position defines the movement direction, while speed is the movement variable. Each element in the search space individually searches for the optimum solution, saves it as its current individual best, shares this value with the other elements of the swarm, and thereby contributes to finding the optimum value for the entire swarm [82]. To track the current global swarm optimum, all elements of the swarm adapt their position and speed according to the individual best they have found and the global optimum distributed to the whole particle swarm [83].
Extensive studies have applied PSO approaches and their variants to UAV route planning. In PSO, each individual, or particle, is initialized randomly; each particle represents a possible solution to the path-planning problem and searches within a certain space for the optimum position. PSO has an advantage over other computing approaches in that it can find a solution faster [81, 83].
Each particle in the swarm has its own speed Vi and position Xi and searches toward the local optimal position Pi and the global optimal position Pg. The local optimal position is the position at which an element of the swarm achieves its best fitness during the fitness evaluation phase, while the global optimal position is the best position obtained by any particle in the whole swarm. The optimum solution is reached by iteration: in each iteration, every element updates its position and speed until the maximum number of iterations is reached.
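As an illustration, the sketch below shows the standard velocity and position updates using the symbols above (Vi, Xi, Pi, Pg); the inertia weight and acceleration coefficients are typical illustrative values rather than ones prescribed in the chapter.

```python
import random

def pso(fitness, dim=2, n_particles=20, iters=100, w=0.7, c1=1.5, c2=1.5, bound=5.0):
    # Initialize random positions Xi and zero velocities Vi.
    x = [[random.uniform(-bound, bound) for _ in range(dim)] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    p_best = [xi[:] for xi in x]                       # Pi: personal best positions
    g_best = min(p_best, key=fitness)[:]               # Pg: global best position
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                # Velocity update: inertia + pull toward Pi + pull toward Pg.
                v[i][d] = (w * v[i][d]
                           + c1 * r1 * (p_best[i][d] - x[i][d])
                           + c2 * r2 * (g_best[d] - x[i][d]))
                x[i][d] += v[i][d]                     # position update
            if fitness(x[i]) < fitness(p_best[i]):
                p_best[i] = x[i][:]
                if fitness(p_best[i]) < fitness(g_best):
                    g_best = p_best[i][:]
    return g_best

if __name__ == "__main__":
    sphere = lambda p: sum(c * c for c in p)           # toy objective to minimize
    print(pso(sphere))
```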
The artificial bee colony (ABC) algorithm is a swarm-intelligence technique adapted from the food-search activities of honey bees, first introduced in [83]. ABC algorithms are population-based protocols that operate on a population of candidate solutions (i.e., food sources for the bees). They are comparatively simple, computationally light, population-based stochastic search methods within the swarm-algorithm field. The ABC food-search cycle comprises the following three stages: employed bees are sent to food sources and assess the nectar quality; onlooker bees select a food source after receiving information from the employed bees and computing the nectar quality; and scout bees are sent to probable new food sources [87].
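A compact sketch of one ABC cycle under the three stages just described is given below; the neighbour-update rule v = x + φ(x − x_k) is the commonly used form, and the limit and colony-size values are illustrative.

```python
import random

def abc_optimize(fitness, dim=2, n_sources=10, limit=20, iters=200, bound=5.0):
    rand_source = lambda: [random.uniform(-bound, bound) for _ in range(dim)]
    sources = [rand_source() for _ in range(n_sources)]
    trials = [0] * n_sources

    def neighbour(i):
        # v_ij = x_ij + phi * (x_ij - x_kj) for a random partner k and dimension j.
        k = random.choice([s for s in range(n_sources) if s != i])
        j = random.randrange(dim)
        v = sources[i][:]
        v[j] += random.uniform(-1, 1) * (sources[i][j] - sources[k][j])
        return v

    def try_improve(i):
        v = neighbour(i)
        if fitness(v) < fitness(sources[i]):        # greedy selection
            sources[i], trials[i] = v, 0
        else:
            trials[i] += 1

    for _ in range(iters):
        for i in range(n_sources):                  # employed-bee stage
            try_improve(i)
        quality = [1.0 / (1.0 + fitness(s)) for s in sources]
        for _ in range(n_sources):                  # onlooker-bee stage
            i = random.choices(range(n_sources), weights=quality)[0]
            try_improve(i)
        for i in range(n_sources):                  # scout-bee stage
            if trials[i] > limit:
                sources[i], trials[i] = rand_source(), 0
    return min(sources, key=fitness)

if __name__ == "__main__":
    sphere = lambda p: sum(c * c for c in p)        # toy objective to minimize
    print(abc_optimize(sphere))
```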
An application of ABC algorithms to mobile robot navigation in static environments is proposed in [89]. The method applies ABC for local search and evolutionary algorithms to identify the optimum route, and a real-time experiment in an indoor environment is discussed for verification. Similar techniques for static environments are also discussed in [89], although the results were limited to a simulation environment. To meet the navigation goal in real-life dynamic environments, an ABC-based technique is proposed in [90], in which the authors combine ABC with a rolling time-window protocol in a hybrid method. Although mobile robot navigation in many environments remains a challenging issue, the development of ABC has been completed successfully for static environments. Beyond wheeled mobile robot navigation, ABC has been examined for the routing problems of aerial, underwater, and autonomous vehicles [83].
UCAV route planning aims to attain an optimum 3D flight path by considering the constraints and threats in the battlefield. Researchers have addressed UCAV navigation problems using an enhanced ABC amended by balance-evolution strategies (BESs), which fully use the convergence information throughout the iterations to improve search accuracy and to balance global exploration against local exploitation [89]. ABC applications in the military sector are discussed in [90], where an unmanned helicopter is examined for demanding missions such as accurate measurement and information gathering.
The artificial fish swarm algorithm (AFSA) is a swarm-intelligence method proposed in [91]. Fish generally move toward locations with the most food by executing social search behaviors, and AFSAs model roughly four behaviors: prey, follow, swarm, and leap [90]. Lately, with its strong global search capability, good robustness, and fast convergence rate, AFSA has been widely used for robot route-planning problems. Several studies have therefore proposed methods to enhance the performance of the standard AFSA through fictitious entities that imitate real fish. A novel AFSA, identified as NAFSA, has been presented to remedy the weaknesses of the standard AFSA and speed up its convergence. A modified form of AFSA called MAFSA, with dynamic parameter control, has been proposed to choose the optimum feature subset and enhance classification accuracy for support vector machines; experimental results show that the proposed method outperforms the standard AFSA [91].
A new AFSA variant has been presented to make the imitation of fish behavior closer to reality and to enhance the ambient sensing of the fishes' foraging behavior: by probing the environment, the artificial fish can monitor surrounding information and attain an optimum state for a better movement direction. The hybrid adaptive-system niche artificial fish swarm algorithm (AHSNAFSA) has been proposed to solve vehicle routing problems, and the ecological niche concept is introduced to overcome the deficiencies of the conventional AFSA in reaching an optimum solution [92].
7 Conclusions
References
1. Lu, Y., Zhucun, X., Xia, G.-S., & Zhang, L. (2018). A survey on vision-based UAV navigation.
Geo-Spatial Information Science, 21(1), 1–12.
2. Rashid, A., & Mohamed, O. (2022). Optimal path planning for drones based on swarm intel-
ligence algorithm. Neural Computing and Applications, 34, 10133–10155. https://doi.org/10.
1007/s00521-022-06998-9
3. Aggarwal, S., & Kumar, N. (2019). Path planning techniques for unmanned aerial vehicles: A review, solutions, and challenges. Computer Communications, 149.
4. Lina, E., Ali, A., & Rania, A., et al, (2022). Deep and reinforcement learning technologies on
internet of vehicle (IoV) applications: Current issues and future trends. Journal of Advanced
Transportation, Article ID 1947886. https://doi.org/10.1155/2022/1947886
5. Farshad, K., Ismail, G., & Mihail, L. S. (2018). Autonomous tracking of intermittent RF source
using a UAV swarm. IEEE Access, 6, 15884–15897.
6. Saeed, M. M., Saeed, R. A., Mokhtar, R. A., Alhumyani, H., & Ali, E. S. (2022). A novel
variable pseudonym scheme for preserving privacy user location in 5G networks. Security and
Communication Networks, Article ID 7487600. https://doi.org/10.1155/2022/7487600
7. Han, J., Xu, Y., Di, L., & Chen, Y. (2013). Low-cost multi-uav technologies for contour mapping
of nuclear radiation field. Journal of Intelligent and Robotic Systems, 70(1–4), 401–410.
8. Merino, L., Martínez, J. R., & Ollero, A. (2015). Cooperative unmanned aerial systems for fire
detection, monitoring, and extinguishing. In Handbook of unmanned aerial vehicles (pp. 2693–
2722).
9. Othman, O. et al. (2022). Vehicle detection for vision-based intelligent transportation systems
using convolutional neural network algorithm. Journal of Advanced Transportation, Article ID
9189600. https://doi.org/10.1155/2022/9189600
10. Elfatih, N. M., et al. (2022). Internet of vehicle’s resource management in 5G networks using
AI technologies: Current status and trends. IET Communications, 16, 400–420. https://doi.org/
10.1049/cmu2.12315
11. Sana, U., Ki-Il, K., Kyong, H., Muhammad, I., et al. (2009). UAV-enabled healthcare
architecture: Issues and challenges”. Future Generation Computer Systems, 97, 425–432.
12. Haifa, T., Amira, C., Hichem, S., & Farouk, K. (2021). Cognitive radio and dynamic TDMA
for efficient UAVs swarm Communications. Computer Networks, 196.
13. Saleem, Y., Rehmani, M. H., & Zeadally, S. (2015). Integration of cognitive radio technology-
with unmanned aerial vehicles: Issues, opportunities, and future research challenges. Journal
of Network and Computer Applications, 50, 15–31. https://doi.org/10.1016/j.jnca.2014.12.002
14. Rashid, A., Sabira, K., Borhanuddin, M., & Mohd, A. (2006). UWB-TOA geolocation
techniques in indoor environments. Institution of Engineers Malaysia (IEM), 67(3), 65–69,
Malaysia.
15. Xi, C., Jun, T., & Songyang, L. (2020). Review of unmanned aerial vehicle Swarm communi-
cation architectures and routing protocols. Applied Sciences, 10, 3661. https://doi.org/10.3390/
app10103661
16. Sahingoz, O. K. (2013). Mobile networking with UAVs: Opportunities and challenges. In
Proceedings of the 2013 international conference on unmanned aircraft systems (ICUAS),
Atlanta, GA, USA, 28–31 May 2013 (pp. 933–941). New York, NY, USA: IEEE.
17. Kaleem, Z., Qamar, A., Duong, T., & Choi, W. (2019). UAV-empowered disaster-resilient edge
architecture for delay-sensitive communication. IEEE Network, 33, 124–132.
18. Sun, Y., Wang, H., Jiang, Y., Zhao, N. (2019). Research on UAV cluster routing strategy
based on distributed SDN. In Proceedings of the 2019 IEEE 19th International Conference
on Communication Technology (ICCT), Xi’an, China, 2019 (pp. 1269–1274). New York, NY,
USA: IEEE.
19. Khan, M., Qureshi, I, & Khan, I. (2017). Flying ad-hoc networks (FANETs): A review of
communication architectures, and routing protocols. In Proceedings of the 2017 first inter-
national conference on latest trends in electrical engineering and computing technologies
(INTELLECT). (pp. 1–9). New York, NY, USA.
20. Shubhani, A., & Neeraj, K. (2020). Path planning techniques for unmanned aerial vehicles: A
review, solutions, and challenges. Computer Communications, 149, 270–299.
21. Mamoon, M., et al. (2022). A comprehensive review on the users’ identity privacy for 5G
networks. IET Communications, 16, 384–399. https://doi.org/10.1049/cmu2.12327
22. Yijing, Z., Zheng, Z., & Yang, L. (2018). Survey on computational-intelligence-based UAV
path planning. Knowledge-Based Systems, 158, 54–64.
23. Zhao, Y., Zheng, Z., Zhang, X., & Liu Y. (2017). Q learning algorithm-based UAV path learning
and obstacle avoidance approach. In: 2017 thirty-sixth chinese control conference (CCC)
Navigation and Trajectory Planning Techniques for Unmanned Aerial … 401
24. Zhang, H. (2017). Three-dimensional path planning for uninhabited combat aerial vehicle
based on predator-prey pigeon-inspired optimization in dynamic environment. Press.
25. Alaa, M., et al. (2022). Performance evaluation of downlink coordinated multipoint joint trans-
mission under heavy IoT traffic load. Wireless Communications and Mobile Computing, Article
ID 6837780.
26. Sharma, R., & Ghose, D. (2009). Collision avoidance between uav clusters using swarm
intelligence techniques. International Journal of Systems Science, 40(5), 521–538.
27. Abdurrahman, B., & Mehmetnder, E. (2016). Fpga based offline 3d UAV local path planner
using evolutionary algorithms for unknown environments. Proceedings of the Conference of
the IEEE Industrial Electronics Society, IECON, 2016, 4778–4783.
28. Yang, X., Cai, M., Li, J. (2016). Path planning for unmanned aerial vehicles based on genetic
programming. In Chinese control and decision conference (pp. 717–722).
29. Luciano, B., Simeone, B., & Egidio, D. (2017). A mixed probabilistic-geometric strategy
for UAV optimum flight path identification based on bit-coded basic manoeuvres. Aerospace
Science Technology, 71.
30. Phung, M., Cong, H., Dinh, T., & Ha, Q. (2017). Enhanced discrete particle swarm optimization
path planning for UAV vision-based surface inspection. Automation in Construction, 81, 25–33.
31. Ugur, O., Koray, S. O. (2016). Multi colony ant optimization for UAV path planning with
obstacle avoidance. In International conference on unmanned aircraft systems (pp 47–52).
32. Adhikari, E., & Reza, H. (2017). A fuzzy adaptive differential evolution for multi-objective 3d
UAV path optimization. Evolutionary Computation, 6(9).
33. Choi, Y., Jimenez, H., & Mavris, D. (2017). Two-layer obstacle collision avoidance with
machine learning for more energy-efficient unmanned aircraft trajectories. Robotics and
Autonomous Systems, 6(2).
34. Abdul, Q. (2017). Saeed M: Scene classification for aerial images based on CNN using sparse
coding technique. International Journal of Remote Sensing, 38(8–10), 2662–2685.
35. Kang, Y., Kim, N., Kim, B., Tahk, M. (2017). Autopilot design for tilt-rotor unmanned aerial
vehicle with nacelle mounted wing extension using single hidden layer perceptron neural
network. In Proceedings of the Institution of Mechanical Engineers G Journal of Aerospace
Engineering, 2(6), 743–789.
36. Bygi, M., & Mohammad, G. (2007). 3D visibility graph. In International conference on compu-
tational science and its applications, conference: computational science and its applications,
2007. ICCSA 2007. Kuala Lampur.
37. Rashid, A., Rania, A., & Jalel, C., Aisha, H. (2012). TVBDs coexistence by leverage sensing
and geo-location database. In IEEE international conference on computer & communication
engineering (ICCCE2012) (pp. 33–39).
38. Fahad, A., Alsolami, F., & Abdel-Khalek, S. (2022). Machine learning techniques in internet of
UAVs for smart cities applications. Journal of Intelligent and Fuzzy Systems, 42(4), 3203–3226.
39. Ali, S., Hasan, M., & Rosilah, H, et al. (2021). Machine learning technologies for secure
vehicular communication in internet of vehicles: recent advances and applications. Security
and Communication Networks, Article ID 8868355. https://doi.org/10.1155/2021/8868355
40. Zeinab, K., & Ali, S. (2017). Internet of things applications, challenges and related future
technologies. World Scientific News (WSN), 67(2), 126–148.
41. Wang, Y., & Yuan, Q. (2011). Application of Dijkstra algorithm in robot path-planning. In
2011 2nd international conference mechnical automation control engineering (MACE 2011)
(pp. 1067–1069).
42. Patle, B. K., Ganesh, L., Anish, P., Parhi, D. R. K., & Jagadeesh, A. (2019). A review: On path
planning strategies for navigation of mobile robot. Defense Technology, 15, 582e606. https://
doi.org/10.1016/j.dt.2019.04.011
43. Reham, A, Ali, A., et al. (2022). Blockchain for IoT-based cyber-physical systems (CPS): appli-
cations and challenges. In: De, D., Bhattacharyya, S., Rodrigues, J. J. P. C. (Eds.), Blockchain
based internet of things. Lecture notes on data engineering and communications technologies
(Vol. 112). Springer. https://doi.org/10.1007/978-981-16-9260-4_4
402 N. M. Elfatih et al.
44. Jia, Q., & Wang, X. (2009). Path planning for mobile robots based on a modified potential
model. In Proceedings of the IEEE international conference on mechatronics and automation,
China.
45. Gul, W., & Nazli, A. (2019). A comprehensive study for robot navigation techniques. Cogent
Engineering, 6(1),1632046.
46. Hu, Y., & Yang, S. (2004). A knowledge based genetic algorithm for path-planning of a mobile
robot. In IEEE international conference on robotics automation.
47. Pratihar, D., Deb, K., & Ghosh, A. (1999). Fuzzy-genetic algorithm and time-optimal obstacle
free path generation for mobile robots. Engineering Optimization, 32(1), 117e42.
48. Hui, N. B., & Pratihar, D. K. (2009). A comparative study on some navigation schemes of a
real robot tackling moving obstacles. Robot Computer Integrated Manufacture, 25, 810e28.
49. Wang, X., Shi, Y., Ding, D., & Gu, X. (2016). Double global optimum genetic algorithm
particle swarm optimization-based welding robot path planning. Engineering Optimization,
48(2), 299e316.
50. Vachtsevanos, K., & Hexmoor, H. (1986). A fuzzy logic approach to robotic path planning
with obstacle avoidance. In 25th IEEE conference on decision and control (pp. 1262–1264).
51. Ali Ahmed, E. S., & Zahraa, T, et al. (2021). Algorithms optimization for intelligent IoV
applications. In Zhao, J., and Vinoth Kumar, V. (Eds.), Handbook of research on innovations
and applications of AI, IoT, and cognitive technologies (pp. 1–25). Hershey, PA: IGI Global.
https://doi.org/10.4018/978-1-7998-6870-5.ch001
52. Rashid, A., & Khatun, S. (2005) Ultra-wideband (UWB) geolocation in NLOS multipath fading
environments. In Proceeding of IEEE Malaysian international communications conference–
IEEE conference on networking 2005 (MICC-ICON’05) (pp. 1068–1073). Kuala Lumpur,
Malaysia.
53. Hassan, M. B., & Saeed, R. (2021). Machine learning for industrial IoT systems. In Zhao,
J., & Vinoth, K. (). Handbook of research on innovations and applications of AI, IoT, and
cognitive technologies (pp. 336–358). Hershey, PA: IGI Global. https://doi.org/10.4018/978-
1-7998-6870-5.ch023
54. Ali, E. S., & Hassan, M. B. et al. (2021). Terahertz Communication Channel characteristics and
measurements Book: Next Generation Wireless Terahertz Communication Networks Publisher.
CRC group, Taylor & Francis Group.
55. Rania, S., Sara, A., & Rania, A., et al. (2021). IoE design principles and architecture. In Book:
Internet of energy for smart cities: Machine learning models and techniques, publisher. CRC
group, Taylor & Francis Group.
56. Jaradat, M., Al-Rousan, M., & Quadan, L. (2011). Reinforcement based mobile robot
navigation in dynamic environment. Robot Computer Integrated Manufacture, 27, 135e49.
57. Tschichold, N. (1997). The neural network model Rule-Net and its application to mobile robot
navigation. Fuzzy Sets System, 85, 287e303.
58. Alsaqour, R., Ali, E. S., Mokhtar, R. A., et al. (2022). Efficient energy mechanism in heteroge-
neous WSNs for underground mining monitoring applications. IEEE Access, 10, 72907–72924.
https://doi.org/10.1109/ACCESS.2022.3188654
59. Jaradat, M., Garibeh, M., & Feilat, E. A. (2012). Autonomous mobile robot planning using
hybrid fuzzy potential field. Soft Computing, 16, 153e64.
60. Yen, C., & Cheng, M. (2018). A study of fuzzy control with ant colony algorithm used in
mobile robot for shortest path planning and obstacle avoidance. Microsystem Technology, 24(1),
125e35.
61. Duan, L. (2014). Imperialist competitive algorithm optimized artificial neural networks for
UCAV global path planning. Neurocomputing, 125, 166–171.
62. Liang, K. (2010). The application of neural network in mobile robot path planning. Journal of
System Simulation, 9(3), 87–99.
63. Horn, E., Schmidt, B., & Geiger, M. (2012). Neural network-based trajectory optimization for
unmanned aerial vehicles. Journal of Guidance, Control, and Dynamics, 35(2), 548–562.
64. Geiger, B., Schmidt, E., & Horn, J. (2009). Use of neural network approximation in multiple
unmanned aerial vehicle trajectory optimization. In Proceedings of the AIAA guidance,
navigation, and control conference, Chicago, IL.
Navigation and Trajectory Planning Techniques for Unmanned Aerial … 403
65. Ali, E., Hassan, M., & Saeed, R. (2021). Machine learning technologies in internet of vehicles.
In: Magaia, N., Mastorakis, G., Mavromoustakis, C., Pallis, E., Markakis, E. K. (Eds.), Intelli-
gent technologies for internet of vehicles. Internet of things. Cham : Springer. https://doi.org/
10.1007/978-3-030-76493-7_7
66. Gautam, S., & Verma, N., Path planning for unmanned aerial vehicle based on genetic algo-
rithm & artificial neural network in 3d. In Proceedings of the 2014 international conference
on data mining and intelligent computing (ICDMIC) (pp. 1–5). IEEE.
67. Wang, N., Gu, X., Chen, J., Shen, L., & Ren, M. (2009). A hybrid neural network method for
UAV attack route integrated planning. In Proceedings of the advances in neural networks–ISNN
2009 (pp. 226–235). Springer.
68. Alatabani, L, & Ali, S. et al. (2021). Deep learning approaches for IoV applications and
services. In Magaia, N., Mastorakis, G., Mavromoustakis, C., Pallis, E., & Markakis, E. K.
(Eds.), Intelligent technologies for internet of vehicles. Internet of things. Cham : Springer.
https://doi.org/10.1007/978-3-030-76493-7_8
69. Hidalgo, A., Miguel, A., Vegae, R., Ferruz, J., & Pavon, N. (2015). Solving the multi-objective
path planning problem in mobile robotics with a firefly-based approach. Soft Computing, 1e16.
70. Brand, M., & Yu, H. (2013). Autonomous robot path optimization using firefly algorithm. In
International conference on machine learning and cybernetics, Tianjin (Vol. 3, p. 14e7).
71. Salih, A., & Rania, A. A., et al. (2021). Machine learning in cyber-physical systems in industry
4.0. In Luhach, A. K., and Elçi, A. (Eds.), Artificial intelligence paradigms for smart cyber-
physical systems (pp. 20–41). Hershey, PA: IGI Global. https://doi.org/10.4018/978-1-7998-
5101-1.ch002
72. Mahboub, A., & Ali, A., et al. (2021). Smart IDS and IPS for cyber-physical systems. In
Luhach, A. K., and Elçi, A. (Eds.), Artificial intelligence paradigms for smart cyber-physical
systems (pp. 109–136). Hershey, PA: IGI Global. https://doi.org/10.4018/978-1-7998-5101-1.
ch006
73. Christensen, A., & Rehan, O. (2008). Synchronization and fault detection in autonomous robots.
In IEEE/RSJ intelligent conference on robots and systems (p. 4139e40).
74. Wang, G., Guo, L., Hong, D., Duan, H., Liu, L., & Wang, H. (2012). A modified firefly algorithm
for UCAV path planning. International Journal of Information Technology, 5(3), 123e44.
75. Patle, B., Parhi, D., Jagadeesh, A., & Kashyap, S. (2017). On firefly algorithm: optimization
and application in mobile robot navigation. World Journal of Engineering, 14(1):65e76, (2017).
76. Patle, B., Pandey, A., Jagadeesh, A., & Parhi, D. (2018). Path planning in uncertain environment
by using firefly algorithm. Defense Technology, 14(6), 691e701. https://doi.org/10.1016/j.dt.
2018.06.004.
77. Ebrahimi, J., Hosseinian, S., & Gharehpetian, G. (2011). Unit commitment problem solution
using shuffled frog leaping algorithm. IEEE Transactions on Power Systems, 26(2), 573–581.
78. Tang, D., Yang, J., & Cai, X. (2012). Grid task scheduling strategy based on differential
evolution-shuffled frog leaping algorithm. In Proceedings of the 2012 international conference
on computer science and service system, (CSSS 2012) (pp. 1702–1708).
79. Hassanzadeh, H., Madani, K., & Badamchizadeh, M. (2010). Mobile robot path planning
based on shuffled frog leaping optimization algorithm. In 2010 IEEE international conference
on automation science and engineering, (CASE 2010) (pp. 680–685).
80. Cekmez, U., Ozsiginan, M., & Sahingoz, O. (2014). A UAV path planning with parallel
ACO algorithm on CUDA platform. In Proceedings of the 2014 international conference on
unmanned aircraft systems (ICUAS) (pp. 347–354).
81. Zhang, C., Zhen, Z., Wang, D., & Li, M. (2010). UAV path planning method based on ant
colony optimization. In Proceedings of the 2010 Chinese Control and Decision Conference
(CCDC) (pp. 3790–3792). IEEE.
82. Brand, M., Masuda, M., Wehner, N., & Yu, X. (2010). Ant colony optimization algorithm for
robot path planning. In 2010 international conference on computer design and applications,
3(V3-V436-V3), 440.
83. Mohanty, P., & Parhi, D. (2015). A new hybrid optimization algorithm for multiple mobile
robots’ navigation based on the CS-ANFIS approach. Memetic Computing, 7(4), 255e73.
404 N. M. Elfatih et al.
84. Wang, G., Guo, L., Duan, H., Wang, H., Liu, L., & Shao, M. (2012). A hybrid metaheuristic
DE/ CS algorithm for UCAV three-dimension path planning. The Scientific World Journal,
2012, 83973. https://doi.org/10.1100/2012/583973.11pages
85. Abbas, N., & Ali, F. (2017). Path planning of an autonomous mobile robot using enhanced
bacterial foraging optimization algorithm. Al-Khwarizmi Engineering Journal, 12(4), 26e35.
86. Jati, A., Singh, G., Rakshit, P., Konar, A., Kim, E., & Nagar, A. (2012). A hybridization
of improved harmony search and bacterial foraging for multi-robot motion planning. In:
Evolutionary computation (CEC), IEEE congress, 1e8, (2012).
87. Asif, K., Jian, P., Mohammad, K., Naushad, V., Zulkefli, M., et al. (2022). PackerRobo: Model-
based robot vision self-supervised learning in CART. Alexandria Engineering Journal, 61(12),
12549–12566. https://doi.org/10.1016/j.aej.2022.05.043
88. Mohanty, P., & Parhi, D. (2016). Optimal path planning for a mobile robot using cuckoo search
algorithm. Journal of Experimental and Theoretical Artificial Intelligence, 28(1e2), 35e52.
89. Wang, G., Guo, L., Duan, H., Wang, H., Liu, L., & Shao, M. (2012). A hybrid metaheuristic
DE/ CS algorithm for UCAV three-dimension path planning. The Scientific World Journal,
583973, 11 pages. https://doi.org/10.1100/2012/583973
90. Ghorpade, S. N., Zennaro, M., & Chaudhari, B. S., et al. (2021). A novel enhanced quantum
PSO for optimal network configuration in heterogeneous industrial IoT, in IEEE access, 9,
134022–134036. https://doi.org/10.1109/ACCESS.2021.3115026
91. Ghorpade, S. N., Zennaro, M., Chaudhari, B. S., et al. (2021). Enhanced differential crossover
and quantum particle Swarm optimization for IoT applications. IEEE Access, 9, 93831–93846.
https://doi.org/10.1109/ACCESS.2021.3093113
92. Saeed, R. A., Omri, M., Abdel-Khalek, S., et al. (2022). Optimal path planning for drones
based on swarm intelligence algorithm. Neural Computing and Applications. https://doi.org/
10.1007/s00521-022-06998-9
Intelligent Control System for Hybrid
Electric Vehicle with Autonomous
Charging
Abstract The present chapter deals with a general review of electric vehicles (EVs) and tests the efficiency of modern charging systems. This work also concentrates on hybrid vehicle architectures and recharge systems. In the first step, more precisely, a global study of the different architectures and technologies for EVs examines the battery, the electric motor, and the different sensor actions in electric vehicles. The second part then discusses the different types of charging systems used in EVs, which we divide into two types: the first is the classic charger and the second is the autonomous charger. In addition, an overview of the autonomous charger is presented along with its corresponding mathematical modeling, addressing the photovoltaic charger (PV) and the wireless charging system (WR). After a clear mathematical description of each part, and after showing the electronic equipment needed to ensure each tool's role, a simple management loop is designed and implemented. A hybrid charging system combining PV and WR is then proposed, together with an intelligent power distribution system. Matlab/Simulink software is then used to simulate the energetic performance of an electric vehicle with this hybrid recharge tool under various simulation conditions. At the end of this study, the results and their corresponding discussions show the benefits and drawbacks of each solution and prove the importance of this hybrid recharge tool for increasing vehicle autonomy.
1 Introduction
Electrifying transport systems has become a necessity in the modern city, and owning an electric vehicle instead of a fuel-powered one is becoming essential, given the technological advantages of communication techniques as well as tax advantages and the reduced price of electrical energy compared with fuels [1]. This transport system has been applied to several models and architectures that differ internally. In recent years, most countries have sought to develop their transport systems.
systems. Indeed, the facilities offered by new technologies and the global orientation
in saving the planet from atmospheric pollution have pushed toward the electrification
of transport systems. The future objective has therefore become the elimination of
all transport systems based on polluting energies, the replacement of which by other
systems using clean energy has become a necessity in most countries. This means
that modern transport systems are based, either totally or partially, on electrical
energy, which is non-polluting energy. The scarcity of fossil fuels and ecological
concerns are leading to a phase of the energy transition. The transport sector absorbed
nearly 66% of global oil production in 2011, producing more than 70% of global
greenhouse gas emissions [2]. The automotive sector is at the heart of these problems.
Therefore, research and development of technologies related to electric vehicles have
become indispensable [3]. Within this sector, the storage of electrical energy is one of the main success factors [4]. Generally, for an electric vehicle to be competitive in the market, this factor must offer high profitability. In this respect, the charging system model used directly influences how a vehicle model ranks against other models. These charging systems have been treated and studied in various research works since the appearance of accumulators, which led to the notion of a charger connected to the grid [5]. Such conventional systems immobilize the system that carries the accumulators and limit its area of movement [6]. This behavior brings weaknesses as well as strengths for some applications. Concerning the objective studied in this chapter, electric vehicles using grid-connected chargers are exposed to various problems [7]. These are especially linked to the recharging time, which is generally long, and to the need for lengthy stops on long journeys [5]. Weaknesses thus appeared with this type of vehicle. The resolution of this problem began with the appearance of new lithium battery technology as well as mobile charging systems, such as photovoltaic, hybrid, or even mobile-contact systems (as in the metro). In this context, the authors in [8, 9] proposed the first version of an adaptable photovoltaic energy system for an electric vehicle [10]. This system is used in fixed stations (photovoltaic panel carports) or with panels installed on the vehicle itself.
From another point of view, wireless charging while the vehicle is parked was introduced in [11], after which problems related to the frequency factor appeared. The researchers in [12, 13] proposed a version based on a frequency study of this system, taking the static and dynamic problems into account [14]. However, the efficiency of any charging system is related to the type of control and energy distribution used. These kinds of control are indirect or may employ procedures that do not require knowledge of the recharge system, such as fuzzy logic (FL), neural network (NN), and ANFIS-based strategies [15–17].
In this context, the work presented in this chapter aims to review the present state of electric vehicles. We have tried to present this system in accordance with what the literature suggests, discussing in particular the different topologies that exist as well as the known architectures. The rest of this part consists of a state of the art of known and used recharging systems, covering both their modern and classic architectures. An intelligent power distribution system is then used to save the energy in the battery and to exploit the available energy sources in the most efficient way. Finally, in this chapter, the authors study the incorporation of several recharging devices within the vehicle, usable even while the vehicle is in motion. This hybrid recharge system consists of photovoltaic panels mounted on the body of the car, which collect solar energy to be used in the power pack. In addition, a wireless charging receiver is fitted to the car so that charging can continue even while the vehicle is moving along rechargeable paths.
These systems are modeled and explained in order to define the hybrid recharge system. This system is also studied using the Matlab/Simulink tool to obtain information about the power flow and about how external factors and the vehicle speed affect the battery state of charge. The chapter is therefore organized into four sections. After the introduction, a general review presents the classification of electric vehicles. The third section explains the architecture of electric vehicles and presents its components in detail. The next section describes electric vehicle charging, the mathematical model of the autonomous charging system, and the simulation results. Finally, a conclusion summarizes the chapter.
2 Preliminaries
The electric vehicle comes in two models: the hybrid version and the pure EV. The combustion engine is the only difference between the two models, as it exists only in the hybrid model. The initial pack of components groups an energy source alongside a battery system. This block is connected to the power electronic converter that feeds the main electric motor. The functional block needs a control system together with a high-performance processor for supervising all the processes of energy management and vehicle speed control [18].
Problems and advantages can be identified for each of these models [19]. The pure electric vehicle is friendly to the environment. Driven by environmental concerns and gas emissions, the pure electric vehicle field has encouraged research aimed at making this transport solution more efficient. Optimizing the size of the motor, improving battery technologies, and developing various recharging solutions are the main fields in this sector of research [20]. EVs use electrical energy to drive the vehicle and to interact with the vehicle's electrical system. According to the Technical Committee of the International Electro-Technical Commission (IETCTC), whenever a vehicle uses two or more energy sources, storage systems, or converters to drive the vehicle, it is referred to as a hybrid electric vehicle (HEV) as long as at least one source supplies electricity [21]. EVs are categorized into various classes according to the combination of sources [3]: the battery alone serves as the source in the battery electric vehicle (BEV); a fuel cell together with a battery in the fuel-cell electric vehicle (FCEV); a battery and an ICE in the HEV; and a battery charged from the grid or an external charging station in the PEV, as shown in Fig. 1. In the following section, the specifics of these EV forms are discussed.
The HEV can be classified into three major architectures. Architecture refers to the configuration of the main elements of the powertrain; in our case, these are the heat engine, an electric machine, and a battery. The three architectures are characterized by how thermal and electrical energy is channeled to the wheels: series, parallel, or power split (series–parallel) [22].
Fig. 2 Series hybrid configuration (ICE, generator, electric motor)
In this configuration (Fig. 2), the heat engine drives an alternator that supplies the battery in the event of discharge and the electric motor in the event of high power demand. This type of model allows great flexibility of propulsion: it consists in operating the heat engine in the range of its highest efficiency and in increasing the autonomy of the vehicle. On the other hand, the overall efficiency is very low because of the double conversion of energy, and it requires a relatively powerful electric motor because the motor alone provides all of the propulsion [23].
However, this architecture makes it possible to satisfy one of the constraints raised in the problem statement, particularly low emissions in the urban cycle and a saving of 15 to 30% in consumption.
In a parallel hybrid structure, the heat engine supplies its power to the wheels as for
a traditional vehicle. It is mechanically coupled to an electric machine which helps
it. This configuration is shown in Fig. 3 [24].
Fig. 3 Parallel hybrid configuration (ICE, electric motor)
Depending on the structure and design of the vehicle, the nature of this coupling also gives it the name of torque-addition or speed-addition parallel hybrid. The torque-addition structure adds the torques of the electric machine and the
heat engine to propel the vehicle (or to recharge the battery). This connection can be
made by belts, pulleys, or gears (a technology called parallel double-shaft hybrid).
The electric machine can also be placed on the shaft connecting the transmission
to the heat engine (a technology called parallel single shaft). The speed addition
structure adds the speeds of the heat engine and the electric machine. The resulting
speed is related to the transmission. This type of coupling allows great flexibility in
terms of speeds. The connection is made mechanically by planetary gear (also called
epicyclic gear).
This architecture requires more complex control than that of the serial architecture
and requires additional work for the physical integration of power sources. Never-
theless, not insignificant gains can be obtained, even by using electrical components
of low power and low capacity. Also, these gains make it possible to compensate
for the additional cost of this architecture and the excess weight associated with the
batteries and the electric motor.
Depending on the configuration used, here are some advantages and disadvantages
of each one presented in Table 1.
(Figure: ICE, generator, electric motor)
The battery is defined as a device that stores energy in order to deliver electrical energy. Batteries fall into two categories: primary and secondary. Primary batteries provide energy only once, during a single discharge, while secondary batteries offer energy storage through repeated charge and discharge cycles over the whole life of the battery. The characteristics of a battery are generally defined by several performance criteria: energy density, market cost, number of charging cycles, discharge behavior, environmental impact, temperature range, and memory effects. This section will focus on secondary
batteries as they are the ones used in electric or hybrid vehicles. There are a lot of
batteries used in EVs, generally based on Lead Acid, Nickel, Lithium Metal, Silver,
and Sodium-Sulfur. The following battery technologies that are used in EVs will
be described respectively: Lead-acid (Pb-acid), Nickel–Cadmium (NiCd), Nickel-
metal-hydride (Ni-MH), Lithium-ion (Li-ion) [26]. Table 2 shows the characteristics
of the different batteries for hybrid vehicles, their electrification systems, costs, and CO2
emission minimization in each case [27].
3.2 Super-Capacitors
The motor is a relatively simple component at the heart of an electric vehicle; it operates on the interaction forces (force vectors) between an electromagnet and a permanent magnet. When braking, the mechanical chain becomes part of the power source and the main energy source (the battery) becomes the receiver. The motor is an actuator that creates rotational motion from electrical energy, and electric motors are widely used in traction applications.
3.3.1 DC Motors
Drives with DC motors have long been used in electric vehicles because they provide simple speed control. Furthermore, this sort of motor has good electric propulsion properties (a very favorable torque curve at low speed). However, their production is costly, and the brush–commutator system must be maintained [31]. Their speed is limited, and they have low specific power, typically 0.3 to 0.5 kW/kg, whereas gasoline engines offer 0.75 to 1.1 kW/kg. As a result, they are less reliable and unsuitable for this purpose [32].
The asynchronous motor is made up of a stator and a rotor. The stator is the fixed part of the motor; it has three windings which can be connected in star (Y) or in delta (Δ) depending on the supply network. The rotor is the rotating part of the motor and is cylindrical; it carries either a winding (usually three-phase like the stator) accessible through three slip rings and three brushes, or an inaccessible squirrel cage made of aluminum conductor bars. In both cases, the rotor circuit is short-circuited (by rings or a rheostat) [33]. The asynchronous machine, thanks to its simplicity of manufacture and maintenance, is currently the most widespread in the industrial sector and performs much better than other types of machines. Nevertheless, these machines have a lower torque density, efficiency, and power factor than permanent-magnet machines.
Although synchronous motors are more difficult to control, more expensive, and potentially less robust, their selection has become critical in electric and hybrid vehicles. In both generator and motor mode, the synchronous machine offers the highest efficiency. Like an asynchronous motor, a synchronous motor consists of a stator and a rotor separated by an air gap; the only difference lies in the rotor design.
Electric vehicles are increasingly part of our daily lives, so it is worth looking at the operation of their motor as well as its different versions (synchronous, asynchronous, permanent-magnet, induction, etc.). Let us therefore review the general principle of this technology, which is, however, not new [34].
A. The principle of an electric motor
The principle of an electric motor, shown in Fig. 7, is, regardless of its construction, to use magnetic force to generate movement. Magnetic force is familiar to us, since magnets may repel or attract other magnets. Two primary elements are employed for this: permanent magnets and copper coils (copper being ideal for this job because it is highly conductive), or in some cases only copper coils (therefore without a permanent magnet). Everything is mounted on a circular axis to achieve a continuous, regular movement; the idea is to create a cycle that repeats itself as long as the motor is fed.
It should also be noted that a coil carrying a current (and therefore electrons) behaves like a magnet, with an electromagnetic field having two poles, north and south. All electric motors are reversible: if we move the magnet manually, this generates an electric current in the coil (which can then recharge the battery, for example; this is regeneration), and if we inject current into the coil, the magnet begins to move. In reality, electrons flow from – to +, even though the convention is that current flows from + to – (this convention was decided before the true direction of electron flow was known).
B. Parts of an electric motor
B.1 Accumulator:
This is where the current that powers the motor comes from, generally a lithium-ion battery or a Ni-MH battery.
B.2 Stator:
This is the peripheral part of the motor, the one that does not rotate. To remember it, think of it as the static part (stator). In 99% of cases it is made up of coils that are supplied to a greater or lesser extent (and whose polarity is alternated between + and – in alternating-current motors) in order to make the rotor turn.
B.3 Rotor:
This is the moving part, and to remind you of this, think of the word rotation (rotor).
It is generally not powered because, being mobile, it is difficult to supply (or in any case, not durable over time).
C. Transmission:
Because the electric motor has a very wide operating range (16,000 rpm on a Model S, for example) and torque is available immediately (the lower the revs, the more torque), a gearbox is not necessary, so the motor is effectively connected directly to the wheels. The gear ratio remains constant whether you are traveling at 15 or 200 km/h.
The speed of the electric motor is not exactly that of the wheels; there is what is known as a reduction. On a Model S, the ratio is around 10:1, which means that the wheel turns 10 times slower than the electric motor. An epicyclic gear train, which is common in automatic gearboxes, is used to obtain this reduction ratio. Figure 8 depicts this global structure.
After this reducer, there is finally the differential which allows the wheels to rotate
at different speeds. No need for a clutch or a torque converter because if a thermal
engine needs to be in motion all the time, this is not the case with an electric motor.
It, therefore, has no idling speed or need for a clutch that acts as a bridge between
the wheels and the engine: when the wheels stop, there is no need to disengage.
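As a quick sanity check on the fixed-reduction arithmetic above, the short Python sketch below converts the 16,000 rpm motor speed mentioned in the text through an assumed 10:1 reduction and an assumed 0.70 m wheel diameter into a road speed; the wheel diameter and the exact ratio are illustrative assumptions, not manufacturer data.

```python
import math

# Quick check of the fixed-reduction arithmetic described above.
# The 10:1 ratio and the 0.70 m wheel diameter are illustrative assumptions.
motor_rpm = 16_000
reduction = 10.0        # assumed overall motor-to-wheel gear ratio
wheel_d_m = 0.70        # assumed wheel diameter (m)

wheel_rpm = motor_rpm / reduction
speed_kmh = wheel_rpm * math.pi * wheel_d_m * 60 / 1000
print(f"wheel ~ {wheel_rpm:.0f} rpm, vehicle ~ {speed_kmh:.0f} km/h")
```

With these assumed values the result is on the order of 200 km/h, consistent with the speeds quoted in the paragraph above.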
The calculator is a power computer that manages many things; for example, it controls the energy flows thanks to the many sensors it has. When the driver accelerates, a sensor (the pedal), called a potentiometer (the same as on modern thermal vehicles), is pressed, and the computer then manages the flow of energy to be sent to the motor according to the degree of acceleration. Likewise, when the pedal is released, it manages energy recovery by sending the current generated by the (reversible) electric motor back to the battery while modulating the electric flow. It can chop the current using a chopper (battery to motor) or rectify it (recovery of AC energy for the DC battery). The different sensor actions in electric vehicles are shown in Table 3.
Integrated chargers can reuse all, or part, of the components of the traction chain to perform the recharge. For example, the traction chain of the second-generation Renault ZOE has a power of 80 kW and an on-board battery capacity of 40 kWh, which makes it possible to envisage a substantial recharge in 30 min using the components of the traction inverter. The tree in Fig. 9, taken from a review of integrated on-board chargers carried out as part of a CIFRE thesis with Renault S.A.S., lists the different means of exploiting the traction chain for charging. This classification is based on the study of 67 publications, including patented topologies, journal articles, and conference papers [35].
The reuse of on-board power electronics and/or electrical machine windings can cause EMC interference problems with other equipment connected to the electrical system and also with domestic protection devices. This may affect the availability of the EV charge. If the high-frequency components of the leakage currents are too high, two things can happen: blinding or untimely tripping of protection devices, such as the differential circuit breaker. Any event that can blind an RCD poses a high safety risk to the user. Therefore, the IEC 61851-21 safety standard specifies that the leakage current must not exceed 3.5 mA RMS.
Thus, the reduction of emissions conducted towards the network, and more particularly of common-mode currents, over a wide frequency range (150 kHz to 30 MHz), is often achieved by galvanic isolation through the use of topologies based on power transformers. Given the charging power levels, galvanic isolation has an impact on the cost and volume of the charger. When the charger is not isolated, manufacturers use passive and active filtering in order to limit the disturbances generated by the charger.
The average intensity of solar energy reaching the earth is 1367 W/m². The availability of this amount of energy has encouraged researchers to design solar receivers intended to transform solar energy into electrical energy. The results obtained have guided vehicle manufacturers towards another energy source that can later be used to improve vehicle autonomy. This solar charging system is essentially based on a set of components that includes the solar receivers, which deliver direct-current electricity when light reaches them. The efficiency of this conversion depends mainly on the type of solar panel: polycrystalline, monocrystalline, or amorphous silicon.
Charge controllers are also indispensable tools in this operating loop since the
outputs of the panels are variable and must be adjusted before being stored in
the battery or supplied to the load. Charge controllers work by monitoring battery
voltage. In other words, they extract the variable voltage from the photovoltaic panels,
depending on the safety of the battery. Once fully charged, the controller can short
out the solar panel to prevent further charge buildup in the battery. These controllers
are usually DC-DC converters. Figure 10 shows the architecture of the solar vehicle
and the location of the charge controller [36].
Most of these controllers measure the voltage in the battery and supply current
to the battery accordingly or completely stop the flow of current. This is done by
measuring the current capacity of the battery, rather than looking at its state of charge
(SOC). The maximum voltage the battery is allowed to reach is called the “charging set point”. The controller also protects against deep discharge, battery sulfation, overcurrent, and short circuits. A deep discharge can be detected by the microcontroller, which then initiates an automatic boost charge to keep the battery active. Depending on the connections, charge controllers
can be of two types: the parallel controller, which is connected in parallel with the
battery and the load, and the series controller, which is placed in series between the solar panel, the battery, and the load.
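As an illustration only, the following Python sketch captures the controller behavior described above (monitor the battery voltage, pass the variable PV current through, cut it at the charging set point, and start a recovery charge on deep discharge); the voltage thresholds and function names are hypothetical and are not taken from any specific controller.

```python
# Minimal sketch of the charge-controller logic described above.
# Thresholds are illustrative only; a real controller follows the battery datasheet.

CHARGE_SET_POINT_V = 14.4   # hypothetical "charging set point" for a 12 V pack
DEEP_DISCHARGE_V   = 10.5   # hypothetical deep-discharge threshold

def controller_step(battery_voltage_v: float, pv_current_a: float) -> float:
    """Return the current (A) actually routed from the PV panel to the battery."""
    if battery_voltage_v >= CHARGE_SET_POINT_V:
        # Battery considered full: short/disconnect the panel, no further charge.
        return 0.0
    if battery_voltage_v <= DEEP_DISCHARGE_V:
        # Deep discharge detected: force a recovery (boost) charge at full PV current.
        return pv_current_a
    # Normal operation: pass the (variable) PV current through to the battery.
    return pv_current_a

if __name__ == "__main__":
    for v in (10.0, 12.6, 14.5):
        print(v, "V ->", controller_step(v, pv_current_a=6.0), "A")
```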
In this part we tested the efficiency of the autonomous system. We chose the two systems most used in electric and hybrid vehicle applications: the first is the photovoltaic charging system and the second is the wireless charging system. We then proposed a hybrid system combining PV and WR and tested its efficiency. The different blocks of the hybrid recharge system are shown in Fig. 12.
Fig. 11 Wireless charging of electric vehicles based on IPT: a wireless V2G, b plug-in V2G
A simplified representation of the IPT system is given in Fig. 13 where “V1 ” and
“V2 ” indicate the input and output voltages of this system. Each part consists of a set
of resistance and capacitance placed in series, between the source and the part either
emitting or receiving. This system is similar to that of a moving transformer [22].
Fig. 12 Blocks of the hybrid recharge system (battery pack, DC/DC converter, AC/DC and DC/AC converters, grid or home supply)
Fig. 13 Simplified representation of the IPT system (V1, R1, C1, L1 on the primary side; L2, C2, R2, V2 on the secondary side)
Subsequently, the vectors linked to V1 and V2, by considering that ϕ1 and ϕ2 are
their phases, with respect to a zero-phase reference vector, are given by (2).
$$\vec{V}_1 = \frac{2\sqrt{2}}{\pi}\,V_1\,(\cos\varphi_1 + j\sin\varphi_1), \qquad \vec{V}_2 = \frac{2\sqrt{2}}{\pi}\,V_2\,(\cos\varphi_2 + j\sin\varphi_2) \tag{2}$$
The real part of the power on the primary and secondary sides is equivalent to the active power, as seen in Eq. (3):

$$P_1 = \operatorname{Re}\{\vec{V}_1\,\vec{I}_1\}, \qquad P_2 = \operatorname{Re}\{\vec{V}_2\,\vec{I}_2\} \tag{3}$$
From Eq. (1), the vector of the primary current is expressed according to Eq. (4).
$$\vec{I}_1 = \frac{\vec{V}_1 - \dfrac{j\omega M}{R_2}\,\vec{V}_2}{R_1 + \dfrac{(\omega M)^2}{R_2}}, \qquad \omega = \frac{1}{\sqrt{LC}} = 2\pi f \tag{4}$$
where L is the intrinsic inductance of the primary and secondary coils, assumed to be
identical. C is the value of the series compensation capacitors C1 and C2, assumed
to be equal (C1 = C2). The expression of the current on the emitting side is therefore
expressed in Eq. (5).
$$\vec{I}_1 = \frac{2\sqrt{2}}{\pi}\cdot\frac{X - Y}{R_1 + \dfrac{(\omega M)^2}{R_2}} \tag{5}$$
However, the phase delay is defined as the phase difference between V2 and V1,
hence:
$$\varphi_D = \varphi_1 - \varphi_2 \tag{6}$$
According to Eqs. (2), (3), and (5), the real power of the primary side is defined according to Eq. (7):

$$P_1 = \frac{8}{\pi^2}\left[\frac{V_1^2 + \dfrac{V_1 V_2\,\omega M}{R_2}\,\sin\varphi_D}{R_1 + \dfrac{(\omega M)^2}{R_2}}\right] \tag{7}$$
Similarly, from Eq. (1), the vector of the secondary current is expressed as Eq. (8):

$$\vec{I}_2 = \frac{\dfrac{j\omega M}{R_1}\,\vec{V}_1 - \vec{V}_2}{R_2 + \dfrac{(\omega M)^2}{R_1}} \tag{8}$$
with

$$A = \frac{j\omega M\,V_1}{R_1}\,(\sin\varphi_1 + j\cos\varphi_1), \qquad B = V_2\,(\cos\varphi_2 + j\sin\varphi_2) \tag{9}$$
According to Eqs. (3), (8), and (9), the real power on the secondary side is defined by Eq. (10):

$$P_2 = \frac{8}{\pi^2}\left[\frac{\dfrac{V_1 V_2\,\omega M}{R_1}\,\sin\varphi_D - V_2^2}{R_2 + \dfrac{(\omega M)^2}{R_1}}\right] \tag{10}$$
The real effective values of the primary and secondary waveforms are related to
the direct voltages V1 and V2, as a function of their phase shift values, ϕs1 and ϕs2 ,
relating, respectively, to the primary and secondary bridges. Considering Vdc and
Vbatt as the amplitudes of V1 and V2 :
$$V_1 = V_{dc}\,\sin\!\left(\frac{\varphi_{s1}}{2}\right), \qquad V_2 = V_{batt}\,\sin\!\left(\frac{\varphi_{s2}}{2}\right) \tag{11}$$
Finally, substituting (11) into (7) and (10), the real powers are obtained as:

$$P_1 = \frac{8}{\pi^2}\cdot\frac{V_{dc}\sin\!\left(\frac{\varphi_{s1}}{2}\right)\left[V_{dc}\sin\!\left(\frac{\varphi_{s1}}{2}\right) + E\right]}{R_1 + \dfrac{(\omega M)^2}{R_2}}, \qquad P_2 = \frac{8}{\pi^2}\cdot\frac{V_{batt}\sin\!\left(\frac{\varphi_{s2}}{2}\right)\left[F - V_{batt}\sin\!\left(\frac{\varphi_{s2}}{2}\right)\right]}{R_2 + \dfrac{(\omega M)^2}{R_1}} \tag{12}$$

with

$$E = \frac{V_{batt}\,\omega M\,\sin\!\left(\frac{\varphi_{s2}}{2}\right)\sin\varphi_D}{R_2}, \qquad F = \frac{V_{dc}\,\omega M\,\sin\!\left(\frac{\varphi_{s1}}{2}\right)\sin\varphi_D}{R_1}$$
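To make the reconstructed expressions concrete, the following Python sketch evaluates Eqs. (4), (11), and (12) numerically; every component value used below (coil inductance, compensation capacitance, mutual inductance, resistances, bridge phase-shift angles, DC-link voltage) is an illustrative assumption, not design data from this chapter.

```python
import math

# Numerical sketch of the reconstructed IPT power expressions (Eqs. 4, 11, 12).
# All component values are illustrative assumptions.
L   = 120e-6      # coil self-inductance (H), assumed identical on both sides
C   = 25e-9       # series compensation capacitance (F), C1 = C2
M   = 30e-6       # mutual inductance (H)
R1  = 0.3         # primary-side resistance (ohm)
R2  = 0.3         # secondary-side resistance (ohm)
Vdc   = 400.0     # DC-link voltage (V), assumed
Vbatt = 288.0     # battery voltage (V), as in Table 4
phi_s1 = math.pi          # primary bridge phase-shift angle (rad)
phi_s2 = math.pi          # secondary bridge phase-shift angle (rad)
phi_D  = math.pi / 2      # phase delay between V2 and V1 (rad)

w  = 1.0 / math.sqrt(L * C)          # resonance pulsation, Eq. (4)
V1 = Vdc * math.sin(phi_s1 / 2)      # Eq. (11)
V2 = Vbatt * math.sin(phi_s2 / 2)

E = Vbatt * w * M * math.sin(phi_s2 / 2) * math.sin(phi_D) / R2
F = Vdc   * w * M * math.sin(phi_s1 / 2) * math.sin(phi_D) / R1

P1 = (8 / math.pi**2) * V1 * (V1 + E) / (R1 + (w * M) ** 2 / R2)   # Eq. (12)
P2 = (8 / math.pi**2) * V2 * (F - V2) / (R2 + (w * M) ** 2 / R1)

print(f"f0 = {w / (2 * math.pi) / 1e3:.1f} kHz, P1 = {P1/1e3:.2f} kW, P2 = {P2/1e3:.2f} kW")
```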
The solar cell is an electrical component used in certain applications (such as the electric vehicle) to transform solar energy into electricity in order to meet the electrical energy requirements. Many authors have suggested various models of the solar cell in their research work [41–50]. The cell current Ic can be given by
$$I_c = I_{ph} - I_d - I_{sh} \tag{13}$$
$$I_{rs} = \frac{I_{rs\text{-}ref}}{\exp\!\left(\dfrac{q\,V_{oc}}{n_s\,n\,\beta\,T_c}\right) - 1} \tag{15}$$

$$I_c = I_{ph} - I_{rs}\!\left[\exp\!\left(\dfrac{q\,(V_c + R_s I_s)}{\alpha k T}\right) - 1\right] - \frac{1}{R_p}\,(V_c + R_s I_s) \tag{16}$$
The resistance Rp and Rs parameters are not considered (Rp >> Rs). Here is the
model with Rp = ∞ and Rs = 0.
$$I_p = N_p\,I_{ph} - N_p\,I_{rs}\!\left[\exp\!\left(\dfrac{q\,V_p}{n\,\beta\,T_c\,n_s\,N_s}\right) - 1\right] \tag{19}$$
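A minimal numerical sketch of the simplified model of Eq. (19) (with Rp = ∞ and Rs = 0) is given below; the cell parameters, the numbers of series and parallel elements, and the identification of β with the Boltzmann constant are assumptions made only for illustration.

```python
import math

# Minimal sketch of the simplified PV model of Eqs. (13)-(19) (Rp -> inf, Rs = 0).
# All values below are illustrative assumptions for a generic silicon module.
q    = 1.602e-19    # electron charge (C)
k    = 1.381e-23    # Boltzmann constant (J/K), taken here as the beta term
Tc   = 298.0        # cell temperature (K)
n    = 1.3          # diode ideality factor
ns   = 36           # cells in series per module (assumed)
Np   = 4            # parallel strings (assumed)
Ns   = 1            # series modules (assumed)
Iph  = 5.0          # photo-current at the given irradiance (A, assumed)
Irs  = 1e-7         # reverse saturation current (A, assumed)

def module_current(Vp: float) -> float:
    """Ip from Eq. (19): Np*Iph - Np*Irs*(exp(q*Vp/(n*beta*Tc*ns*Ns)) - 1)."""
    return Np * Iph - Np * Irs * (math.exp(q * Vp / (n * k * Tc * ns * Ns)) - 1.0)

if __name__ == "__main__":
    for Vp in (0.0, 10.0, 18.0, 21.0):
        print(f"Vp = {Vp:5.1f} V  ->  Ip = {module_current(Vp):6.2f} A")
```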
Following this exposition, it is important to state the conditions under which the vehicle simulations were carried out during this phase. Table 4 gives the technical specifications of the hybrid system, as well as the driving conditions applied to the vehicle. The driving profile exhibits a variety of acceleration levels (low, medium, and high) and includes a deceleration phase, to demonstrate that this hybrid system can achieve a visible energy gain, especially when the traction motor is not consuming. This phase is summarized in Fig. 14, especially between 8 and 13 s.
Table 4 Characteristics of the hybrid system
Electrical characteristics of the hybrid system:
Electric motor power: 50 kW
Type of electric motor: PMSM
Type of battery: lithium
Battery voltage: 288 V
Maximum vehicle speed: 120 km/h
Max motor torque: 150 Nm
Mechanical characteristics of the vehicle:
Max vehicle weight: 332 kg
Tilt angle αr: variable according to route
Vehicle front surface: 2.7 m²
Air density: 1.225 kg/m³
Features of the PV charging system:
Vehicle front surface: 1.5 m²
Number of PV cells: 145
The fuzzy logic technique has recently been established as one of the intelligent methods used in power distribution systems to detect the power generated by the recharge systems and to distribute it in the most efficient way, so as to recharge the battery with as much power as possible. This control is more robust than traditional control techniques and does not require a perfect understanding of the system's mathematical model. The three basic functional blocks of a fuzzy logic supervisor are fuzzification, the inference engine, and defuzzification; it is therefore characterized by its input variables, output variables, membership functions, and fuzzy rules. The success of any fuzzy controller is determined by factors such as the number and relevance of the chosen inputs, the fuzzification method, and the number of rules. In this application, the chosen variables are related to these three signals and are based on multiple tests performed in the study of [51] and in our earlier works. To manage this energy, we therefore propose a simple energy-management rule set, presented in Table 5.
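As a rough illustration of how the rules in Table 5 could be encoded, the Python sketch below uses crisp thresholds in place of fuzzy membership functions; the speed thresholds, the function names, and the power split are assumptions, not the supervisor actually implemented in this chapter.

```python
# Crisp, rule-based approximation of the energy-management logic of Table 5
# (deceleration / low / medium / high speed -> battery power demand).
# Thresholds and the PV/WR/battery split are illustrative assumptions.

def battery_share(speed_kmh: float, accel: float) -> str:
    """Return the qualitative battery-power demand used by the supervisor."""
    if accel < 0:
        return "zero"          # deceleration: no battery draw (regeneration possible)
    if speed_kmh < 30:
        return "low"           # low speed
    if speed_kmh < 80:
        return "low"           # medium speed: chargers cover most of the demand
    return "high"              # high speed: battery supplies the bulk of the power

def dispatch(p_motor_w: float, p_pv_w: float, p_wr_w: float) -> float:
    """Battery power (W): positive = discharge, negative = charging surplus."""
    return p_motor_w - (p_pv_w + p_wr_w)

if __name__ == "__main__":
    print(battery_share(speed_kmh=100, accel=0.5))                 # -> "high"
    print(dispatch(p_motor_w=8_000, p_pv_w=900, p_wr_w=2_500))     # battery discharge
    print(dispatch(p_motor_w=0, p_pv_w=900, p_wr_w=2_500))         # negative -> charging
```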
To test the profitability of this hybrid system, especially with regard to the battery state of charge while the vehicle is in motion, we refer in what follows to the simulation conditions mentioned above. Indeed, we propose to study the energetic behavior of the vehicle for an increasing speed profile of the form given in Fig. 14. The power waveforms delivered by the studied sources are shown in Fig. 15, corresponding respectively to the photovoltaic and wireless cases. Switching on these two energy sources together gives a new power profile, and the average value of the power acquired from this hybrid system is greater than that of the photovoltaic or wireless mode alone.
As explained in the previous part, it is clear that the average value of the power acquired by the hybrid system is quite remarkable. During the action phase of the traction system, the power consumed by the electric machine follows a variable path proportional to the selected acceleration or driving state. The main power source is usually the battery, whose supplied voltage is quite stable. The other sources are also used, as additional sources, to reduce the load on the accumulators. At this point, the state of charge of the battery depends on various conditions and situations, especially the driving state and external factors related to the climate and to the sizing of the wireless system. Coordination between the various energy sources includes the control of power management. In our work, the battery is used as the primary device for supplying electricity, the WR system generates electricity, and the photovoltaic system adapts the solar irradiation into electrical energy supplied to a DC bus. The total power is measured as shown in Fig. 15 [52–54].
It is important to note that the selected driving cycle is the one used in Fig. 14. The contribution of the additional sources can be clearly observed in the power delivered by the accumulators in Fig. 16. The first part of the simulation shows that the drops in the power delivered by the accumulator do not affect the power consumed by the machine. On the other hand, we can notice that while the motor is idle ("zero acceleration"), the battery power becomes negative, which confirms that the battery is being charged by the photovoltaic and wireless sources. For low acceleration levels, the implemented hybrid system provides enough power to drive the motor and charge the battery simultaneously. Figure 16 illustrates this conclusion.
Fig. 16 Evolution of power in relation to speed (battery power P-Batt (W) versus time (s))

Along with this energy behavior, thanks to this hybrid system it is possible to monitor the state of charge of the battery, in order to validate whether this model is profitable or not. Figure 17 shows the state of charge of the battery and proves that during weak acceleration the SOC increases even though the vehicle is in motion, and the same occurs during the stop phase.
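For reference, SOC traces of the kind shown in Figs. 16-18 can be reproduced qualitatively by integrating the battery power over time; the sketch below does this for an assumed battery capacity (the 288 V pack voltage comes from Table 4, while the ampere-hour capacity is an assumption).

```python
# Qualitative reconstruction of an SOC trace from a battery power profile.
# Pack voltage (288 V) is taken from Table 4; the 40 Ah capacity is assumed.

V_BATT_V  = 288.0
CAP_AH    = 40.0
E_BATT_WH = V_BATT_V * CAP_AH        # usable energy, simplistic model

def soc_trace(p_batt_w, dt_s, soc0_pct=40.0):
    """p_batt_w: battery power samples (W, positive = discharge)."""
    soc = soc0_pct
    trace = [soc]
    for p in p_batt_w:
        soc -= 100.0 * p * dt_s / 3600.0 / E_BATT_WH   # Wh drawn -> % of capacity
        trace.append(soc)
    return trace

if __name__ == "__main__":
    profile = [8_000] * 5 + [-2_000] * 5   # 5 s of discharge, then 5 s of charging surplus
    print([round(s, 3) for s in soc_trace(profile, dt_s=1.0)])
```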
In the same context, we wanted to test the contribution of this hybrid charging system against the purely wireless or purely photovoltaic one. Figure 18 shows two cases of SOC evolution: in the first case the hybrid charging system is deactivated and there is only battery consumption; the second case presents the evolution of the SOC with the hybrid system in operation. The difference between the two curves is clear, and the energy gain is equal to 0.86%.

Fig. 18 Evolution of the SOC with and without the hybrid (PV + WR) charging system (SOC (%) versus time (s))

On the other hand, it is possible to monitor the evolution of the power supplied by the photovoltaic, wireless, and hybrid charging systems. Figure 19 shows that the power of the hybrid system is the sum of the powers obtained by the single charging systems. It is clear that the power obtained by the wireless system is zero since the vehicle has not yet passed over a transmitter coil.

Table 6 summarizes the energy statistics of the charging systems studied and proves that the hybrid system provides a greater gain than a purely photovoltaic or purely wireless charging system. This performance will ensure a gain in terms of distance traveled and will increase the life of the battery, which improves the overall performance of the electric vehicle.
7 Conclusion
Table 5 Energy management rules (Step 4): battery power demand versus driving state
Deceleration: zero | Low speed: Low | Medium speed: Low | High speed: High

In this chapter, we have discussed the state of the art of the electrified transport system relating to electric vehicles. Indeed, we have tried to present the different architectures and models cited in the literature, such as pure electric and hybrid models. More precisely, the recharge systems used, as well as their different internal architectures, have been demonstrated. During this study, we divided these types of chargers into two categories: a set of modern or advanced chargers, solar chargers, and
Inductive Power Transfer. By focusing on the second category, the rest of this work offers an in-depth study of these charging systems and exposes detailed modeling of their different blocks. An operating simulation is applied to determine their performance under specific operating conditions. Then, a detailed mathematical model provides simulation results regarding the hybrid recharge system and its energy. This recharge system is installed in an electric vehicle to improve the vehicle's autonomy. Each recharge block, such as the photovoltaic recharge system and the wireless recharge tool, was modeled, and the corresponding mathematical expressions are given. Then, the
Acknowledgements The authors would like to thank Prince Sultan University, Riyadh, Saudi
Arabia for supporting this work. Special acknowledgement to Automated Systems & Soft
Computing Lab (ASSCL), Prince Sultan University, Riyadh, Saudi Arabia.
References
1. Bai, H., & Mi, C. (2011). The impact of bidirectional DC-DC converter on the inverter operation
and battery current in hybrid electric vehicles. In 8th international conference on power electronics - ECCE Asia (ICPE 2011-ECCE Asia) (pp. 1013–1015).
https://doi.org/10.1109/ICPE.2011.5944686.
2. Sreedhar, V. (2006). Plug-in hybrid electric vehicles with full performance. In 2006 IEEE
conference on electric and hybrid vehicles (ICEHV) (pp. 1–2). https://doi.org/10.1109/ICEHV.2006.
352291.
3. Mohamed, N., Aymen, F., Ali, Z. M., Zobaa, A. F., & Aleem, S. H. E. A. (2021). Efficient
power management strategy of electric vehicles based hybrid renewable energy. Sustainability,
13(13), 7351. https://doi.org/10.3390/su13137351
4. Ertan H. B., & Arikan, F. R. (2018). Sizing of series hybrid electric vehicle with hybrid energy
storage system. In SPEEDAM 2018 - proceedings: international symposium on power elec-
tronics, electrical drives, automation and motion (pp. 377–382). https://doi.org/10.1109/SPE
EDAM.2018.8445422.
5. Kisacikoglu, M. C., Ozpineci, B., & Tolbert, L. M. (2013). EV/PHEV bidirectional charger
assessment for V2G reactive power operation. IEEE Transactions on Power Electronics, 28(12),
5717–5727. https://doi.org/10.1109/TPEL.2013.2251007
6. Lee, J. Y., & Han, B. M. (2015). A bidirectional wireless power transfer EV charger using
self-resonant PWM. IEEE Transactions on Power Electronics, 30(4), 1784–1787. https://doi.
org/10.1109/TPEL.2014.2346255
7. Tan, L., Wu, B., Yaramasu, V., Rivera, S., & Guo, X. (2016). Effective voltage balance control
for bipolar-DC-Bus-Fed EV charging station with three-level DC-DC Fast Charger. IEEE
Transactions on Industrial Electronics, 63(7), 4031–4041. https://doi.org/10.1109/TIE.2016.
2539248
8. Abdelwahab O. M., & Shaaban, M. F. (2019). PV and EV charger allocation with V2G capa-
bilities. In Proceedings - 2019 IEEE 13th international conference on compatibility, power
24. Zhao, C., Zu, B., Xu, Y., Wang, Z., Zhou, J., & Liu, L. (2020). Design and analysis of an
engine-start control strategy for a single-shaft parallel hybrid electric vehicle. Energy, 202(5),
2354–2363. https://doi.org/10.1016/j.energy.2020.117621
25. Cheng, M., Sun, L., Buja, G., & Song, L. (2015). Advanced electrical machines and machine-
based systems for electric and hybrid vehicles. Energies, 8(9), 9541–9564. https://doi.org/10.
3390/en8099541
26. Naoui, M., Aymen, F., Ben Hamed, M., & Lassaad, S. (2019). Analysis of battery-EV state of
charge for a dynamic wireless charging system. Energy Storage, 2(2). https://doi.org/10.1002/
est2.117.
27. Rajashekara, K. (2013). Present status and future trends in electric vehicle propulsion tech-
nologies. IEEE Journal of Emerging and Selected Topics in Power Electronics, 1(1), 3–10.
https://doi.org/10.1109/JESTPE.2013.2259614
28. Paladini, V., Donateo, T., de Risi, A., & Laforgia, D. (2007). Super-capacitors fuel-cell
hybrid electric vehicle optimization and control strategy development. Energy Conversion
and Management, 48(11), 3001–3008. https://doi.org/10.1016/j.enconman.2007.07.014
29. Chopra, S. (2011). Contactless power transfer for electric vehicle charging application.
30. Emadi, A. (2017). Handbook of automotive power electronics and motor drives.
31. Naoui, M., Flah, A., Ben Hamed, M., & Lassaad, S. (2020). Brushless motor and wireless
recharge system for electric vehicle design modeling and control. In Handbook of research on
modeling, analysis, and control of complex systems.
32. Guarnieri, M. (2011). When cars went electric, Part 2. IEEE Industrial Electronics Magazine,
5(2), 46–53. https://doi.org/10.1109/MIE.2011.941122
33. Levi, E., Bojoi, R., Profumo, F., Toliyat, H. A., & Williamson, S. (2007). Multiphase induction
motor drives-a technology status review. IET Electric Power Applications, 1(5), 643–656.
https://doi.org/10.1049/iet-epa
34. Mohamed, N., Flah, A., Ben Hamed, M., & Lassaad, S. (2021). Modeling and simulation of
vector control for a permanent magnet synchronous motor in electric vehicle. In 2021 4th
international symposium on advanced electrical and communication technologies (ISAECT),
2021 (pp. 1–5). https://doi.org/10.1109/ISAECT53699.2021.9668411.
35. Yilmaz, M., & Krein, P. T. (2013). Review of battery charger topologies, charging power
levels, and infrastructure for plug-in electric and hybrid vehicles. IEEE Transactions on Power
Electronics, 28(5), 2151–2169. https://doi.org/10.1109/TPEL.2012.2212917
36. Mohamed, N., Flah, A., & Ben Hamed, M. (2020). Influences of photovoltaics cells number for
the charging system electric vehicle. In Proceedings of the 17th international multi-conference
system signals devices, SSD 2020 (pp. 244–248). https://doi.org/10.1109/SSD49366.2020.936
4141.
37. Wu, H. H., Gilchrist, A., Sealy, K., Israelsen, P., & Muhs, J. (2011). A review on inductive
charging for electric vehicles. 2011 IEEE international electrical machine drives conference
IEMDC, 2011 (pp. 143–147). https://doi.org/10.1109/IEMDC.2011.5994820
38. Xie, L., Shi, Y., Hou, Y. T., & Lou, A. (2013). Wireless power transfer and applications to
sensor networks. IEEE Wireless Communications, 20(4), 140–145. https://doi.org/10.1109/
MWC.2013.6590061
39. Cao, P. et al. (2018). An IPT system with constant current and constant voltage output features
for EV charging. In Proceedings of the IECON 2018 - 44th annual conference IEEE industrial
electronics society (vol. 1, pp. 4775–4780). https://doi.org/10.1109/IECON.2018.8591213.
40. Nagendra, G. R., Chen, L., Covic, G. A., & Boys, J. T. (2014). Detection of EVs on IPT
highways. In Conference proceedings of the - IEEE applied power electronics conference and
exposition - APEC (pp. 1604–1611). https://doi.org/10.1109/APEC.2014.6803521.
41. Mohamed, N., Aymen, F., & Ben Hamed, M. (2019). Characteristic of photovoltaic generator
for the electric vehicle. International Journal of Scientific and Technology Research, 8(10),
871–876.
42. Dheeban, S. S., Selvan, N. M., & Kumar, C. S. (2019). Design of standalone pv system.
International Journal of Scientific and Technology Research (vol. 8, no. 11, pp. 684–688).
43. Kamal, N. A., & Ibrahim, A. M. (2018). Conventional, intelligent, and fractional-order control
method for maximum power point tracking of a photovoltaic system: A review. In Advances in
nonlinear dynamics and chaos (ANDC), fractional order systems (pp. 603–671). Academic.
44. Amara, K., Malek, A., Bakir, T., Fekik, A., Azar, A. T., Almustafa, K. M., Bourennane,
E., & Hocine, D. (2019). Adaptive neuro-fuzzy inference system based maximum power point
tracking for stand-alone photovoltaic system. International Journal of Modelling, Identification
and Control, 2019, 33(4), 311–321.
45. Fekik, A., Hamida, M. L., Houassine, H., Azar, A. T., Kamal, N. A., Denoun, H., Vaidyanathan,
S., & Sambas, A. (2022). Power quality improvement for grid-connected photovoltaic panels
using direct power control. In A. Fekik, & N. Benamrouche (Ed.), Modeling and control of
static converters for hybrid storage systems, 2022 (pp. 107–142). IGI Global. https://doi.org/
10.4018/978-1-7998-7447-8.ch005.
46. Fekik, A., Azar A. T., Kamal, N. A., Serrano, F. E., Hamida, M. L., Denoun, H., & Yassa,
N. (2021). Maximum power extraction from a photovoltaic panel connected to a multi-cell
converter. In Hassanien, A. E., Slowik, A., Snášel, V., El-Deeb, H., & Tolba, F. M. (Eds.),
Proceedings of the international conference on advanced intelligent systems and informatics
2020. AISI 2020. Advances in intelligent systems and computing (vol. 1261, pp. 873–882).
Springer, Cham. https://doi.org/10.1007/978-3-030-58669-0_77.
47. Kamal, N. A., Azar, A. T., Elbasuony, G. S., Almustafa, K. A., & Almakhles, D. (2019).
PSO-based adaptive perturb and observe MPPT technique for photovoltaic systems. In The
international conference on advanced intelligent systems and informatics AISI 2019. Advances
in intelligent systems and computing (vol. 1058, pp. 125–135). Springer.
48. Ammar, H. H., Azar, A. T., Shalaby, R., Mahmoud, M. I. (2019). Metaheuristic optimization of
fractional order incremental conductance (FO-INC) Maximum power point tracking (MPPT).
Complexity, 2019, Article ID 7687891, 1–13. https://doi.org/10.1155/2019/7687891
49. Rana, K. P. S., Kumar, V., Sehgal, N., George, S., & Azar, A. T. (2021). Efficient maximum
power point tracking in fuel cell using the fractional-order PID controller. In Advances in
nonlinear dynamics and chaos (ANDC), renewable energy systems (pp. 111–132). Academic.
https://doi.org/10.1016/B978-0-12-820004-9.00017-6
50. Ben Smida, M., Sakly, A., Vaidyanathan, S., & Azar, A. T. (2018). Control-based maximum
power point tracking for a grid-connected hybrid renewable energy system optimized by particle
swarm optimization. Advances in system dynamics and control (pp. 58–89). IGI-Global, USA.
https://doi.org/10.4018/978-1-5225-4077-9.ch003
51. Ghoudelbourk, S., Dib, D., Omeiri, A., & Azar, A. T. (2016). MPPT Control in wind energy
conversion systems and the application of fractional control (PIα ) in pitch wind turbine.
International Journal of Modelling, Identification and Control (IJMIC), 26(2), 140–151.
52. Kraiem, H., et al. (2022). Decreasing the battery recharge time if using a fuzzy based power
management loop for an isolated micro-grid farm. Sustain, 14(5), 1–23. https://doi.org/10.
3390/su14052870
53. Liu, H. C., Wu, S.-M., Wang, Z.-L., & Li, X.-Y. (2021). A new method for quality function
deployment with extended prospect theory under hesitant linguistic environment. IEEE Trans-
actions on Engineering Management, 68(2), 442–451. https://doi.org/10.1109/TEM.2018.286
4103
54. Nguyen, B. H., Trovão, J. P. F., German, R., & Bouscayrol, A. (2020). Real-time energy
management of parallel hybrid electric vehicles using linear quadratic regulation. Energies,
13(21), 1–19. https://doi.org/10.3390/en13215538
55. Guo, J., He, H., & Sun, C. (2019). ARIMA-based road gradient and vehicle velocity prediction
for hybrid electric vehicle energy management. IEEE Transactions on Vehicular Technology,
68(6), 5309–5320. https://doi.org/10.1109/TVT.2019.2912893
Advanced Sensor Systems for Robotics
and Autonomous Vehicles
M. Tolani (B)
Manipal Institute of Technology, Udupi, Manipal, Karnataka, India
e-mail: [email protected]
A. A. Ajasa
Universiti Teknologi Malaysia, Skudai, JB, Malaysia
e-mail: [email protected]
A. Balodi · A. Bajpai
Atria Institute of Technology, Bangalore, India
e-mail: [email protected]
A. Bajpai
e-mail: [email protected]
Y. AlZaharani
University of Wollongong, Wollongong, Australia
Sunny
Indian Institute of Information Technology, Allahabad, India
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 439
A. T. Azar and A. Koubaa (eds.), Artificial Intelligence for Robotics and Autonomous
Systems Applications, Studies in Computational Intelligence 1093,
https://doi.org/10.1007/978-3-031-28715-2_14
1 Introduction
Sensors play an important role in a wide range of robotic applications; robots are used for industrial, commercial, and domestic purposes [1–11]. Advances in machine learning make such systems more intelligent, and the use of advanced sensor systems raises robot intelligence to the next level. The present work deals with the need for, and the uses of, advanced sensor systems in autonomous vehicle applications. The robotic vehicle applications considered here are divided into railway trains and road vehicles. For railway applications, advanced sensor systems are deployed mainly on the track or inside the train; for road vehicles, sensors support efficient vehicle operation and are installed at the roadside for automated driving and other services. In this chapter, the sensors are grouped into three categories of operation [12–79]. The first category comprises sensors that sense data continuously according to the operational requirement; these are called continuous monitoring sensors [80]. The second category, event monitoring sensors, generates data only when an event occurs. The third category, periodic sensors, transmits data to the monitoring station at fixed intervals. The roadside monitoring system for road vehicles and the track monitoring system for railways are discussed in the following subsections [14–122].
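As a rough illustration of these three operating categories, the short Python sketch below contrasts continuous, event-driven, and periodic reporting loops. The threshold, reporting period, and simulated readings are arbitrary assumptions chosen only to show how often each class of sensor reports data, not parameters of any system cited in this chapter.

```python
import random

def read_sensor(t):
    """Placeholder for a hardware read at simulated time step t."""
    return 20.0 + random.uniform(-5.0, 5.0)

def continuous_monitoring(steps):
    """Continuous monitoring: a reading is reported at every time step."""
    return [read_sensor(t) for t in range(steps)]

def event_monitoring(steps, threshold=23.0):
    """Event monitoring: a reading is reported only when an event occurs
    (here, the measured value exceeding a threshold)."""
    return [(t, v) for t in range(steps) if (v := read_sensor(t)) > threshold]

def periodic_monitoring(steps, period=10):
    """Periodic monitoring: one reading is reported every fixed period."""
    return [read_sensor(t) for t in range(steps) if t % period == 0]

print(len(continuous_monitoring(100)))  # 100 reports
print(len(event_monitoring(100)))       # only the threshold crossings
print(len(periodic_monitoring(100)))    # 10 reports
```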
Nowadays, advanced sensors are used for automated driving and vehicle monitoring applications. For automated driving, efficient and accurate monitoring enables better prediction and tracking, and researchers are investigating various prediction methods for vehicle tracking. Kalman and extended Kalman filter based prediction methods are widely used, and advanced algorithms such as the cuckoo search algorithm and particle swarm optimization are also applied to improve accuracy. Prediction algorithms play an important role, but their performance is limited; prediction efficiency can be improved further with the help of advanced sensors. Nowadays, advanced sensors combined with an efficient communication system are used for monitoring applications. The advanced sensor system for vehicle applications
is shown in Fig. 1. A sensor device is mounted on each vehicle and communicates directly with the roadside devices, transmitting its data to them. All roadside devices forward the data to the monitoring station, which analyzes the data and generates the control signal for the vehicle. The advanced sensors used in the vehicle make the system more efficient [91, 120, 121].
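To make the role of prediction concrete, the sketch below implements one predict/update cycle of a constant-velocity Kalman filter applied to noisy position measurements, such as those a roadside device might forward to the monitoring station. The state model, noise covariances, and sampling interval are illustrative assumptions and are not taken from any of the works cited here.

```python
import numpy as np

dt = 0.1  # assumed sampling interval in seconds

# Constant-velocity model: state = [x, y, vx, vy]
F = np.array([[1, 0, dt, 0],
              [0, 1, 0, dt],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)   # state transition
H = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0]], dtype=float)   # only position is measured
Q = 0.01 * np.eye(4)                        # process noise (assumed)
R = 0.5 * np.eye(2)                         # measurement noise (assumed)

def kalman_step(x, P, z):
    """One predict/update cycle for a position measurement z = [x, y]."""
    x_pred = F @ x                          # predict state
    P_pred = F @ P @ F.T + Q                # predict covariance
    y = z.reshape(2, 1) - H @ x_pred        # innovation
    S = H @ P_pred @ H.T + R                # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)     # Kalman gain
    x_new = x_pred + K @ y                  # corrected state
    P_new = (np.eye(4) - K @ H) @ P_pred    # corrected covariance
    return x_new, P_new

# Example: filter a short sequence of noisy roadside position measurements
x, P = np.zeros((4, 1)), np.eye(4)
for z in [np.array([0.0, 0.0]), np.array([1.1, 0.4]), np.array([2.0, 0.9])]:
    x, P = kalman_step(x, P, z)
print(x.ravel())  # estimated position and velocity
```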
Railway monitoring is an important field for making railway trains robotic and autonomous [13]. Researchers have addressed railway monitoring from several directions, most of them working on MAC protocols [14–58, 81, 86–100, 104, 114–119], aggregation protocols [59–83, 101, 102, 122], and data consistency [91, 120, 121]. However, the performance of a railway monitoring system ultimately depends on the advanced sensor devices, and many such devices are now available for wireless sensor network (WSN) and IoT
applications. The operation of a railway track monitoring application is shown in Fig. 2: the sensor devices are placed on the track and transmit their data to the monitoring station via a base station.
2 Related Works
Researchers have reported many works related to advanced sensor applications. Victoria et al. discussed various uses of advanced sensors for railway track
condition monitoring systems and surveyed research on monitoring applications using WSNs. WSNs have been used to (1) maintain process tolerance, (2) verify and protect machinery, (3) detect maintenance requirements, (4) minimize downtime, and (5) prevent failures, saving businesses money and time. That work also covered many advanced sensors for bridge, tunnel, track, and rail-bed monitoring. Elia et al. proposed a system to measure bogie vibration [56], and Maly et al. proposed a heterogeneous sensor model to integrate data [94]. Many other works have been reported. For a better analysis of the contributions in this direction, the research papers were selected on the basis of inclusion and exclusion criteria: approximately 280 papers were identified at the first stage, of which about 70% were rejected against these criteria.
Works focused on core data communication were rejected at the screening stage, while papers whose central idea concerns advanced sensors were retained for full review, as shown in Fig. 3. The keywords of the works closely related to advanced sensors are also listed; the identified keywords correspond to the inclusion criteria.
As part of this process, the year-wise contribution in the field of advanced sensors was also analyzed. The analysis shows that research activity has increased over the last 4–5 years, as shown in Fig. 4. The main reason for this attention is the advancement of AI/ML and IoT: advanced sensors are a primary requirement of the IoT, and AI/ML methods make IoT systems more powerful. Consequently, the demand for advanced sensors is growing rapidly.
The literature study shows that research contributions have been increasing over the last 5–10 years. The advancement of IoT and AI/ML is the major driver of the demand for advanced sensors, and automated driving efficiency depends strongly on them.
The main inclusion terms are listed in Fig. 3, and the cumulative use of each term is shown in Fig. 5. The results in Fig. 5 show that most of the papers deal with event monitoring sensors, while continuous monitoring is also mentioned in many works. The discussion of advanced sensors is therefore categorized into continuous monitoring and event monitoring sensors.
Fig. 3 Screening and Identification process of the research papers (Inclusion/Exclusion criteria,
Keywords)
Apart from this, the main focus of researchers is on AI/ML-based advanced methods for controlling the vehicle, with advanced vehicle sensors used for autonomous driving. Various types of sensor devices have been employed; they are mainly categorized into reduced-function devices (RFDs) and full-function devices (FFDs). An RFD can only perform sensing and data transmission, whereas an FFD can perform all types of computational operations. Clustering and aggregation operations, for example, are reported as being performed by FFDs.
An RFD mainly works as an end device, whereas an FFD can work as an intermediate device and perform all the mathematical operations. An FFD can also act as a cluster head.
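The following simplified sketch illustrates how an FFD acting as a cluster head could aggregate the readings received from RFD end devices before forwarding a single summary packet toward the base station. The packet fields and the mean/max aggregation rule are assumptions made purely for illustration; the actual protocols in the cited works differ.

```python
from statistics import mean

class RFDEndDevice:
    """Reduced-function device: only senses and transmits its reading."""
    def __init__(self, node_id, reading):
        self.node_id = node_id
        self.reading = reading

    def transmit(self):
        return {"node": self.node_id, "value": self.reading}

class FFDClusterHead:
    """Full-function device: collects cluster traffic and aggregates it."""
    def __init__(self):
        self.buffer = []

    def receive(self, packet):
        self.buffer.append(packet["value"])

    def aggregate(self):
        # One summary packet is forwarded instead of one packet per end
        # device, which is the basic energy saving behind in-network
        # aggregation.
        summary = {"count": len(self.buffer),
                   "mean": mean(self.buffer),
                   "max": max(self.buffer)}
        self.buffer.clear()
        return summary

# Example: three track-side RFDs report to one FFD cluster head
head = FFDClusterHead()
for dev in [RFDEndDevice(i, 20.0 + i) for i in range(3)]:
    head.receive(dev.transmit())
print(head.aggregate())  # {'count': 3, 'mean': 21.0, 'max': 22.0}
```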
The keywords of the inclusion criteria mainly describe the focus of the research in a particular direction; therefore, the cumulative count of the keywords was also determined. The keywords of the shortlisted manuscripts confirm that the main focus is on either event monitoring or continuous monitoring sensors, used for WSN or IoT applications. Data aggregation and filtering of the data collected from advanced sensors are likewise reported in the surveyed works (Fig. 6).
The cumulative count of the keywords signifies the focus of the research in a particular direction. The current count clearly indicates that advanced continuous and event monitoring sensors play an important role in autonomous robotic applications.
As already mentioned, researchers are working in various domains of the autonomous vehicle. Their contributions across these fields are analyzed in Fig. 7. The analysis shows that the major focus is on reducing energy consumption, in which advanced sensors play an important role. Various other approaches to reducing energy consumption have also been reported; efficient MAC protocol design is one such field, with several contention-based and contention-free protocols proposed. Intelligent systems for road and railway safety have also been reported, and such systems can only be realized with the help of AI/ML and IoT.
There are various ways to reduce energy consumption, but energy-efficient methods can be categorized into two fields: hardware-related and software-related. On the software side, researchers are working on MAC, routing, and aggregation protocols; on the hardware side, on advanced sensor design and fabrication.
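A back-of-the-envelope sketch shows why duty cycling at the MAC layer is such a strong software-side lever for node lifetime. The current draws and battery capacity below are generic assumed values, not figures from any protocol discussed in this chapter.

```python
def average_current_ma(duty_cycle, active_ma=20.0, sleep_ma=0.02):
    """Average radio current draw for a given duty cycle (fraction of time awake)."""
    return duty_cycle * active_ma + (1.0 - duty_cycle) * sleep_ma

def battery_lifetime_days(capacity_mah, duty_cycle):
    """Rough node lifetime estimate for an assumed battery capacity."""
    return capacity_mah / average_current_ma(duty_cycle) / 24.0

# Example: a 2400 mAh battery at 100%, 10% and 1% radio duty cycle
for dc in (1.0, 0.1, 0.01):
    print(f"duty cycle {dc:4.0%}: ~{battery_lifetime_days(2400, dc):7.1f} days")
```

Under these assumptions, lowering the radio duty cycle from 100% to 1% stretches the estimated lifetime from a few days to more than a year, which is why MAC-level duty cycling and hardware-level sensor design are treated as complementary routes to energy efficiency.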
Many works have been reported in relation to bridge monitoring [4–9]. Acoustic emission sensors are used for crack and fatigue detection, and strain gauge sensors for stress detection on the railway track [1, 3, 8, 12]. Piezoelectric strain gauge sensors have been used for bridge monitoring [11], and strain gauges are also used to measure train weight, as reported in [10]. For dynamic load applications, accelerometer sensors are used in various settings [2, 8]. Many other works address further applications.
Advanced sensors can be used for different applications. In this section, we discuss two categories of advanced sensors, as given below:
Data is transmitted from the sensor devices to the roadside equipment via direct communication. The monitoring station receives data from every roadside device, evaluates the information, and produces the vehicle's control signal. The technology is made more effective by the sophisticated sensors installed in the vehicle.
Efficient railway monitoring depends on various supporting technologies, as shown in Fig. 9. The demand for data rate is rising, and researchers are moving to higher frequency ranges to meet it; for railway applications, the present bandwidth need is met by 5G technology. The effectiveness of road monitoring is increased by modern sensors combined with device-to-device communication and IoT. A number of aggregation protocols have been proposed for low-energy data transfer, and methods for ensuring data consistency are also crucial, since data integrity and energy-efficient data transmission are a trade-off [80, 84, 85, 90, 92, 95, 103, 105–113].
Various uses of advanced sensors for railway application needs have been reported. Accelerometer, gyroscope, and FBG strain sensors are used for monitoring the train shell; humidity, motion-detector, and vibration sensors for wagon monitoring; surface, acoustic, and inertial sensors for bogie monitoring; gyro and gap sensors for wheel monitoring; and wind-pressure sensors for brake monitoring. A few other sensors used for further applications are given in Table 1.
Roadside applications require various types of sensors. As shown in Fig. 10, these include footpath sensors, light sensors, and sensors related to service road shoulder width, safety barriers, median width, gyroscopes, road capacity, traffic signals, operating speed, and many others. Most of these sensors are listed in Fig. 10.
Similarly, advanced sensors can also be used for railway applications; the sensors employed are shown in Fig. 11. Accelerometer, FBG, inclinometer, magnetoelectric, acoustic emission, gyroscope, displacement, and many other sensors are used for railway monitoring.
4 Conclusion
Sensor systems are essential in applications involving robotics and autonomous vehi-
cles. The advancement of data science, artificial intelligence, machine learning, and
the internet of things (IoT) creates new opportunities for autonomous cars. The
fusion of robots, IoT, and AI is a particularly potent combination for applications
such as vehicle control, traffic monitoring, and traffic management. Advanced sensor systems are necessary for efficient robotic and vehicle control. As a result, AI-based systems attract researchers' attention as a way to maximize the utilization of sensor data for diverse robotic applications while minimizing energy consumption. One key challenge that AI technology can successfully
address is the effective collection of data from sensors. Applications requiring time-
constrained data collection can also make use of the data consistency method. The
current chapter examines different crucial ways to raise the robotic and autonomous
vehicle applications’ quality of service (QoS) and quality of experience (QoE) stan-
dards.
In the future, advanced nano-electromechanical system (NEMS) sensors and actuators can be developed for low-energy, long-lifetime monitoring applications, and new physical-layer protocols can be developed for efficient operation.
References
1. Bischoff, R., Meyer, J., Enochsson, O., Feltrin, G., & Elfgren, L. (2009). Eventbased strain
monitoring on a railway bridge with a wireless sensor network. In Proceedings of the 4th
International Conference on Structural Health Monitor, (pp. 1–8). Zurich, Switzerland: Intell.
Infrastructure.
2. Chebrolu, K., Raman, B., Mishra, N., Valiveti, P., & Kumar, R. (2008). Brimon: A sensor
network system for railway bridge monitoring. In Proceedings of the 6th International Con-
ference on Mobile System and Application Services, Breckenridge, CO, USA (pp. 2–14).
3. Feltrin, G. (2012). Wireless sensor networks: A monitoring tool for improving remaining
lifetime estimation. In Civil Struct (Ed.), Health Monitoring Workshop (pp. 1–8). Berlin:
Germany.
4. Grosse, C., et al. (2006). Wireless acoustic emission sensor networks for structural health mon-
itoring in civil engineering. In Proceedings of the European Conference on Non-Destructive
Testing (pp. 1–8), Berlin, Germany.
5. Grosse, C., Glaser, S., & Kruger, M. (2010). Initial development of wireless acoustic emission
sensor Motes for civil infrastructure state monitoring. Smart Structures and Systems, 6(3),
197–209.
6. Hay, T. et al. (2006). Transforming bridge monitoring from time-based to predictive mainte-
nance using acoustic emission MEMS sensors and artificial intelligence. In Proceedings of
the 7th World Congress on Railway Research, Montreal, Canada, CD-ROM.
7. Hay, T. (2007). Wireless remote structural integrity monitoring for railway bridges. Trans-
portation Research Board, Washington, DC, DC, USA, Technical report no. HSR-IDEA
Project 54.
8. Krüger, M. et al. (2007). Sustainable Bridges. Technical Report on Wireless Sensor Networks
using MEMS for Acoustic Emission Analysis including other Monitoring Tasks. Stuttgart,
Germany: European Union.
9. Ledeczi, A., et al. (2009). Wireless acoustic emission sensor network for structural monitoring.
IEEE Sensors Journal, 9(11), 1370–1377.
10. Reyer, M., Hurlebaus, S., Mander, J., & Ozbulut, O. E. (2011). Design of a wireless sensor
network for structural health monitoring of bridges. In Proceedings of the 5th International
Conference on Sens Technology, Palmerston North, New Zealand (pp. 515–520).
11. Sala, D., Motylewski, J., & Koaakowsk, P. (2009). Wireless transmission system for a railway
bridge subject to structural health monitoring. Diagnostyka, 50(2), 69–72.
12. Townsend, C., & Arms, S. (2005). Wireless sensor networks. Principles and applications. In
J. Wilson (Ed.), Sensor Technology Handbook (Chap. 22). Oxford, UK: Elsevier.
13. Tolani, M., Sunny, R., Singh, K., Shubham, K., & Kumar, R. (2017). Two-Layer optimized
railway monitoring system using Wi-Fi and ZigBee interfaced WSN. IEEE Sensors Journal,
17(7), 2241–2248.
14. Rasouli, H., Kavian, Y. S., & Rashvand, H. F. (2014). ADCA: Adaptive duty cycle algorithm
for energy efficient IEEE 802.15.4 beacon-enabled WSN. IEEE Sensors Journal, 14(11),
3893–3902.
15. Misic, J., Misic, V. B., & Shafi, S. (2004). Performance of IEEE 802.15.4 beacon enabled
PAN with uplink transmissions in non-saturation mode-access delay for finite buffers. In First
International Conference on Broadband Networks, San Jose, CA, USA (pp. 416–425).
16. Jung, C. Y., Hwang, H. Y., Sung, D. K., & Hwang, G. U. (2009). Enhanced markov chain
model and throughput analysis of the slotted CSMA/CA for IEEE 802.15.4 under unsaturated
traffic conditions. In IEEE Transactions on Vehicular Technology (Vol. 58, no. 1, pp. 473–478),
January 2009.
17. Zhang, H., Xin, S., Yu, R., Lin, Z., & Guo, Y. (2009). An adaptive GTS allocation mechanism
in IEEE 802.15.4 for various rate applications. In 2009 Fourth International Conference on
Communications and Networking in China.
18. Ho, C., Lin, C., & Hwang, W. (2012). Dynamic GTS allocation scheme in IEEE 802.15.4 by
multi-factor. In 2012 Eighth International Conference on Intelligent Information Hiding and
Multimedia Signal Processing.
19. Yang, L., Zeng, S. (2012). A new GTS allocation schemes For IEEE 802.15.4. In 2012 5th
International Conference on BioMedical Engineering and Informatics (BMEI 2012)
20. Hurtado-López, J., & Casilari, E. (2013). An adaptive algorithm to optimize the dynamics of
IEEE 802.15.4 network. In Mobile Networks and Management (pp. 136–148).
21. Standard for Part 15.4: Wireless Medium Access Control (MAC) and Physical Layer Specifica-
tions for Low Rate Wireless Personal Area Networks (LR-WPAN), IEEE Standard 802.15.4,
January 2006.
22. Pei, G., & Chien, C. (2001). Low power TDMA in large WSNs. In 2001 MILCOM Proceed-
ings Communications for Network-Centric Operations: Creating the Information Force (Cat.
No.01CH37277) (Vol. 1, pp. 347–351).
23. Shafiullah, G. M., Thompson, A., Wolf, P., & Ali, S. (2008). Energy-efficient TDMA MAC
protocol for WSNs applications. In Proceedings of the 5th ICECE, Dhaka, Bangladesh,
December 24–27, 2008 (pp. 85–90).
24. Hoesel & Havinga. (2004). A lightweight medium access protocol (LMAC) for WSNs: Reduc-
ing preamble transmissions and transceiver state switches. In 1st International Workshop on
Networked Sensing Systems (pp. 205–208).
25. Alvi, A. N., Bouk, S. H., Ahmed, S. H., Yaqub, M. A., Sarkar, M., & Song, H. (2016). BEST-
MAC: Bitmap-Assisted efficient and scalable TDMA-Based WSN MAC protocol for smart
cities. IEEE Access, 4, 312–322.
26. Li, J., & Lazarou, G. Y. (2004). A bit-map-assisted energy-efficient MAC scheme for WSNs.
In Third International Symposium on Information Processing in Sensor Networks. IPSN 2004
(pp. 55–60).
27. Shafiullah, G., Azad, S. A., & Ali, A. B. M. S. (2013). Energy-efficient wireless MAC protocols
for railway monitoring applications. IEEE Transactions on Intelligent Transportation Systems,
14(2), 649–659.
28. Patro, R. K., Raina, M., Ganapathy, V., Shamaiah, M., & Thejaswi, C. (2007). Analysis and
improvement of contention access protocol in IEEE 802.15.4 star network. In 2007 IEEE
International Conference on Mobile Adhoc and Sensor Systems, Pisa (pp. 1–8).
29. Pollin, S. et al. (2008). Performance analysis of slotted carrier sense IEEE 802.15.4 medium
access layer. In IEEE Transactions on Wireless Communications (Vol. 7, no. 9, pp. 3359–
3371), September 2008.
30. Park, P., Di Marco, P., Soldati, P., Fischione, C., & Johansson, K. H. (2009). A generalized
Markov chain model for effective analysis of slotted IEEE 802.15.4. In IEEE 6th International
Conference on Mobile Adhoc and Sensor Systems Macau (pp. 130–139).
31. Aboelela, E., Edberg, W., Papakonstantinou, C., & Vokkarane, V. (2006). WSN based model
for secure railway operations. In Proceedings 25th IEEE International Performance, Com-
puter Communication Conference, Phoenix, AZ, USA (pp. 1–6).
32. Shafiullah, G., Gyasi-Agyei, A., & Wolfs, P. (2007). Survey of wireless communications
applications in the railway industry. In Proceedings of 2nd International Conferences on
Wireless Broadband Ultra Wideband Communication, Sydney, NSW, Australia (p. 65).
33. Shrestha, B., Hossain, E., & Camorlinga, S. (2010). A Markov model for IEEE 802.15.4
MAC with GTS transmissions and heterogeneous traffic in non-saturation mode. In IEEE
International Conference on Communication Systems, Singapore (pp. 56–61).
34. Park, P., Di Marco, P., Fischione, C., & Johansson, K. H. (2013). Modeling and optimization
of the IEEE 802.15.4 protocol for reliable and timely communications. In IEEE Transactions
on Parallel and Distributed Systems (Vol. 24, no. 3, pp. 550–564), March 2013.
35. Farhad, A., Zia, Y., Farid, S., & Hussain, F. B. (2015). A traffic aware dynamic super-frame
adaptation algorithm for the IEEE 802.15.4 based networks. In IEEE Asia Pacific Conference
on Wireless and Mobile (APWiMob), Bandung (pp. 261–266).
36. Moulik, S., Misra, S., & Das, D. (2017). AT-MAC: Adaptive MAC-Frame payload tuning
for reliable communication in wireless body area network. In IEEE Transactions on Mobile
Computing (Vol. 16, no. 6, pp. 1516–1529), June 1, 2017.
37. Choudhury, N., & Matam, R. (2016). Distributed beacon scheduling for IEEE 802.15.4 cluster-
tree topology. In IEEE Annual India Conference (INDICON), Bangalore, (pp. 1–6).
38. Choudhury, N., Matam, R., Mukherjee, M., & Shu, L. (2017). Adaptive duty cycling in
IEEE 802.15.4 Cluster Tree Networks Using MAC Parameters. In Proceedings of the 18th
ACM International Symposium on Mobile Ad Hoc Networking and Computing, Mobihoc’17,
Chennai, India (pp. 37:1–37:2).
39. Moulik, S., Misra, S., & Chakraborty, C. (2019). Performance evaluation and Delay-Power
Trade-off analysis of ZigBee Protocol. In IEEE Transactions on Mobile Computing (Vol. 18,
no. 2, pp. 404–416), February 1, 2019.
40. Barbieri, A., Chiti, F., & Fantacci, R. (2006). Proposal of an adaptive MAC protocol for
efficient IEEE 802.15.4 low power communications. In Proceedings of IEEE 49th Global
Telecommunication Conference, December 2006 (pp. 1–5).
41. Lee, B.-H., & Wu, H.-K. (2010). Study on a dynamic superframe adjustment algorithm for
IEEE 802.15.4 LR-WPAN. In Proceedings of Vehicular Technology Conference (VTC), May
2010 (pp. 1–5).
42. Jeon, J., Lee, J. W., Ha, J. Y., & Kwon, W. H. (2007). DCA: Duty-cycle adaptation algorithm
for IEEE 802.15.4 beacon-enabled networks. In Proceedings of the 65th IEEE Vehicular
Technology Conference, April 2007 (pp. 110–113).
43. Goyal, R., Patel, R. B., Bhadauria, H. S., & Prasad, D. (2014). Dynamic slot allocation scheme
for efficient bandwidth utilization in Wireless Body Area Network. In 9th International Con-
ference on Industrial and Information Systems (ICIIS), Gwalior (pp. 1–7).
44. Na, C., Yang, Y., & Mishra, A. (2008). An optimal GTS scheduling algorithm for time-
sensitive transactions in IEEE 802.15.4 networks. In Computer Networks (Vol. 52 no. 13 pp.
2543–2557), September 2008.
45. Akbar, M. S., Yu, H., & Cang, S. (2017). TMP: Tele-Medicine protocol for slotted 802.15.4
with duty-cycle optimization in wireless body area sensor networks. IEEE Sensors Journal,
17(6), 1925–1936.
46. Koubaa, A., Alves, M., & Tovar, E. (2006). GTS allocation analysis in IEEE 802.15.4 for
real-time WSNs. In Proceedings 20th IEEE International Parallel and Distributed Processing
Symposium, Rhodes Island (p. 8).
47. Park, P., Fischione, C., & Johansson, K. H. (2013). Modeling and stability analysis of hybrid
multiple access in the IEEE 802.15.4 protocol. ACM Transactions on Sensor Networks, 9(2),
13:1–13:55.
48. Alvi, A., Mehmood, R., Ahmed, M., Abdullah, M., & Bouk, S. H. (2018). Optimized GTS
utilization for IEEE 802.15.4 standard. In International Workshop on Architectures for Future
Mobile Computing and Internet of Things.
49. Song, J., Ryoo, J., Kim, S., Kim, J., Kim, H., & Mah, P. (2007). A dynamic GTS allocation
algorithm in IEEE 802.15.4 for QoS guaranteed real-time applications. In IEEE International
Symposium on Consumer Electronics. ISCE 2007.
50. Lee, H., Lee, K., & Shin, Y. (2012). A GTS Allocation Scheme for Emergency Data Trans-
mission in Cluster-Tree WSNs, ICACT2012, February 2012 (pp. 19–22).
51. Lei, X., Choi, Y., Park, S., & Hyong Rhee, S. (2012). GTS allocation for emergency data
in low-rate WPAN. In 18th Asia-Pacific Conference on Communications (APCC), October
2012.
52. Yang, L., & Zeng, S. (2012). A new GTS allocation schemes For IEEE 802.15.4. In 2012 5th
International Conference on BioMedical Engineering and Informatics (BMEI 2012).
53. Cheng, L., Bourgeois, A. G., & Zhang, X. (2007). A new GTS allocation scheme for IEEE
802.15.4 networks with improved bandwidth utilization. In International Symposium on Com-
munications and Information Technologies
54. Udin Harun Al Rasyid, M., Lee, B., & Sudarsono, A. (2013). PEGAS: Partitioned GTS
allocation scheme for IEEE 802.15.4 networks. In International Conference on Computer,
Control, Informatics and Its Applications.
55. Roy, S., Mallik, I., Poddar, A., & Moulik, S. (2017). PAG-MAC: Prioritized allocation of
GTSs in IEEE 802.15.4 MAC protocol—A dynamic approach based on Analytic Hierarchy
Process. In 14th IEEE India Council International Conference (INDICON), December 2017.
56. Heinzelman, W. B., Chandrakasan, A. P., & Balakrishnan, H. (2002). An application-specific
protocol architecture for wireless microsensor networks. IEEE Wireless Communication
Transactions, 1(4), 660–670.
57. Philipose, A., & Rajesh, A. (2015). Performance analysis of an improved energy aware MAC
protocol for railway systems. In 2nd International Conference on Electronics and Communi-
cation Systems (ICECS), Coimbatore, (pp. 233–236).
58. Kumar, D., & Singh, M. P. (2018). Bit-Map-Assisted Energy-Efficient MAC protocol for
WSNs. International Journal of Advanced Science and Technology, 119, 111–122.
59. Duarte-Melo, E. J., & Liu, M. (2002). Analysis of energy-consumption and lifetime of hetero-
geneous WSNs. In Global Telecommunications Conference. GLOBECOM ’02. IEEE, 2002
(Vol. 1. pp. 21–25).
60. Shabna, V. C., Jamshid, K., & Kumar, S. M. (2014). Energy minimization by removing
data redundancy in WSNs. In 2014 International Conference on Communication and Signal
Processing, Melmaruvathur (pp. 1658–1663).
61. Yetgin, H., Cheung, K. T. K., El-Hajjar, M., & Hanzo, L. (2015). Network-Lifetime maxi-
mization of WSNs. IEEE Access, 3, 2191–2226.
62. Rajagopalan, R., & Varshney, P. K. (2006). Data-aggregation techniques in sensor networks:
A survey. In IEEE Communications Surveys & Tutorials (Vol. 8, no. 4, pp. 48–63). Fourth
Quarter 2006.
63. Jesus, P., Baquero, C., & Almeida, P. S. (2015). A survey of distributed data aggregation algo-
rithms. In IEEE Communications Surveys Tutorials (Vol. 17, no. 1, pp. 381–404). Firstquarter
2015.
64. Zhou, F., Chen, Z., Guo, S., & Li, J. (2016). Maximizing lifetime of Data-Gathering trees
with different aggregation modes in WSNs. IEEE Sensors Journal, 16(22), 8167–8177.
65. Sofra, N., He, T., Zerfos, P., Ko, B. J., Lee, K. W., & Leung, K. K. (2008). Accuracy analysis
of data aggregation for network monitoring. MILCOM 2008–2008 IEEE Military Communi-
cations Conference, San Diego, CA (pp. 1–7).
66. Heinzelman, W., Chandrakasan, A., & Balakrishnan, H. (2000). Energy-Efficient communi-
cation protocols for wireless microsensor networks. In Proceedings of the 33rd Hawaaian
International Conference on Systems Science (HICSS), January 2000.
67. Liang, J., Wang, J., Cao, J., Chen, J., & Lu, M. (2010). Algorithm, an efficient, & for construct-
ing maximum lifetime tree for data gathering without aggregation in WSNs. In Proceedings
IEEE INFOCOM, San Diego, CA (pp. 1–5).
68. Wu, Y., Mao, Z., Fahmy, S., & Shroff, N. B. (2010). Constructing maximum-lifetime data-
gathering forests in sensor networks. IEEE/ACM Transactions on Networking, 18(5), 1571–
1584.
69. Luo, D., Zhu, X., Wu, X., & Chen, G. (2011). Maximizing lifetime for the shortest path
aggregation tree in WSNs. Proceedings IEEE INFOCOM, Shanghai (pp. 1566–1574).
70. Hua, C., & Yum, T. S. P. (2008). Optimal routing and data aggregation for maximizing lifetime
of WSNs. IEEE/ACM Transactions on Networking, 16(4), 892–903.
71. Choi, K., & Chae, K. (2014). Data aggregation using temporal and spatial correlations in
Advanced Metering Infrastructure. In The International Conference on Information Network-
ing 2014 (ICOIN2014), Phuket (pp. 541–544).
72. Villas, L. A., Boukerche, A., Guidoni, D. L., de Oliveira, H. A. B. F., de Araujo, R. B., &
Loureiro, A. A. F. (2013). An energy-aware spatio-temporal correlation mechanism to perform
efficient data collection in WSNs. Computer Communications, 36(9), 1054–1066.
73. Liu, C., Wu, K., & Pei, J. (2007). An energy-efficient data collection framework for WSNs by
exploiting spatiotemporal correlation. IEEE Transactions on Parallel and Distributed Systems,
18(7), 1010–1023.
74. Kandukuri, S., Lebreton, J., Lorion, R., Murad, N., & Daniel Lan-Sun-Luk, J. (2016). Energy-
efficient data aggregation techniques for exploiting spatio-temporal correlations in WSNs.
Wireless Telecommunications Symposium (WTS) (pp. 1–6), London.
75. Mantri, D., Prasad, N. R., & Prasad, R. (2014). Wireless Personal Communications, 5, 2589.
https://doi.org/10.1007/s11277-013-1489-x.
76. Mantri, D., Prasad, N. R., Prasad, R., & Ohmori, S. (2012). Two tier cluster based data aggre-
gation (TTCDA) in WSN. In 2012 IEEE International Conference on Advanced Networks
and Telecommunciations Systems (ANTS).
77. Pham, N. D., Le, T. D., Park, K., & Choo, H. SCCS: Spatiotemporal clustering and com-
pressing schemes for efficient data collection applications in WSNs. International Journal of
Communication Systems, 23, 1311–1333.
78. Villas, L. A., Boukerche, A., de Oliveira, H. A. B. F., de Araujo, R. B., & Loureiro, A. A. F.
(2014). A spatial correlation aware algorithm to perform efficient data collection in WSNs.
Ad Hoc Networks, 12, 69–85. ISSN 1570-8705.
79. Krishnamachari, B., Estrin, D., & Wicker, S. B. (2002). The impact of data aggregation in
WSNs. In ICDCSW ’02: Proceedings of the 22nd International Conference on Distributed
Computing Systems (pp. 575–578). Washington, DC, USA: IEEE Computer Society.
80. Tolani, M., & Sunny, R. K. S. (2019). Lifetime improvement of WSN by information sensitive
aggregation method for railway condition monitoring. Ad Hoc Networks, 87, 128–145. ISSN
1570-8705. https://doi.org/10.1016/j.adhoc.2018.11.009.
81. Tolani, M., & Sunny, R. K. S. (2019). Energy Efficient Adaptive Bit-Map-Assisted Medium
Access Control Protocol, Wireless Personal Communication (Vol. 108, pp. 1595–1610).
https://doi.org/10.1007/s11277-019-06486-9.
82. MacQueen, J. B. Some Methods for classification and Analysis of Multivariate Observations.
In Proceedings of 5-th Berkeley Symposium on Mathematical Statistics and Probability (Vol.
1, pp. 281–297). Berkeley: University of California Press.
83. Mišić, J., Shafi, S., & Mišić, V. B. (2005). The impact of MAC parameters on the performance
of 802.15.4 PAN. Ad Hoc Network. 3, 5 (September 2005), 509–528. https://doi.org/10.1016/
j.adhoc.2004.08.002.
84. An IEEE 802.15.4 complaint and ZigBee-ready 2.4 GHz RF transceiver. (2004). Microwave
Journal, 47(6), 130–135.
85. Dargie, W., & Poellabauer, C. (2010). Fundamentals of WSNs: Theory and Practice. Wiley
Publishing.
86. Park, P., Fischione, C., & Johansson, K. H. (2013). Modeling and stability analysis of hybrid
multiple access in the IEEE 802.15.4 protocol. ACM Transactions on Sensor Networks, 9, 2,
Article 13, 55 pages.
87. Zhan, Y., & Xia, M. A. (2016). GTS size adaptation algorithm for IEEE 802.15.4 wireless
networks. Ad Hoc Networks, 37, Part 2, pp. 486–498. ISSN 1570-8705, https://doi.org/10.
1016/j.adhoc.2015.09.012.
88. Iala, I, Dbibih, I., & Zytoune, O. (2018). Adaptive duty-cycle scheme based on a new predic-
tion mechanism for energy optimization over IEEE 802.15.4 wireless network. International
Journal of Intelligent Engineering and Systems, 11(5). https://doi.org/10.22266/ijies2018.
1031.10.
89. Boulis, A. (2011). Castalia: A simulator for WSNs and Body Area Networks, user’s manual
version 3.2, NICTA.
90. Kolakowski, P., Szelazek, J., Sekula, K., Swiercz, A., Mizerski, K., & Gutkiewicz, P. (2011).
Structural health monitoring of a railway truss bridge using vibration-based and ultrasonic
methods. Smart Materials and Structures, 20(3), 035016.
91. Al-Janabi, T. A., & Al-Raweshidy, H. S. (2019). An energy efficient hybrid MAC protocol
with dynamic sleep-based scheduling for high density IoT networks. IEEE Internet of Things
Journal, 6(2), 2273–2287.
92. Penella-López, M. T., & Gasulla-Forner, M. (2011). Powering autonomous sensors: An inte-
gral approach with focus on solar and RF energy harvesting. Springer Link. https://doi.org/
10.1007/978-94-007-1573-8.
93. Farag, H., Gidlund, M., & Österberg, P. (2018). A delay-bounded MAC protocol for mission-
and time-critical applications in industrial WSNs. IEEE Sensors Journal, 18(6), 2607–2616.
94. Lin, C. H., Lin, K. C. J., & Chen, W. T. (2017). Channel-Aware polling-based MAC protocol
for body area networks: Design and analysis. IEEE Sensors Journal, 17(9), 2936–2948
95. Hodge, V. J., O’Keefe, S., Weeks, M., & Moulds, A. (2015). WSNs for condition monitoring
in the railway Industry: A survey. IEEE Transactions on Intelligent Transportation Systems,
16(3), 1088–1106.
96. Ye, W., Heidemann, J., & Estrin, D. (2002). An energy-efficient MAC protocol for WSNs. In
Twenty-First Annual Joint Conference of the IEEE Computer and Communications Societies
(Vol. 3, pp. 1567–1576).
97. Siddiqui, S., Ghani, S., & Khan, A. A. (2018). ADP-MAC: An adaptive and dynamic polling-
based MAC protocol for WSNs. IEEE Sensors Journal, 18(2), 860–874.
98. Stem, M., & Katz, R. H. (1997). Measuring and reducing energy-consumption of network
interfaces in hand held devices. IEICE Transactions on Communications, E80-B(8), 1125–
1131.
99. Lee, A. H., Jing, M. H., & Kao, C. Y. (2008). LMAC: An energy-latency trade-off MAC
protocol for WSNs. International Symposium on Computer Science and its Applications,
Hobart, ACT (pp. 233–238).
100. Karl, H., & Willig, A. (2005). Protocols and Architectures for WSNs. Wiley.
101. Balakrishnan, C., Vijayalakshmi, E., & Vinayagasundaram, B. (2016). An enhanced itera-
tive filtering technique for data aggregation in WSN. In 2016 International Conference on
Information Communication and Embedded Systems (ICICES), Chennai (pp. 1–6).
102. Nayak, P., & Devulapalli, A. (2016). A fuzzy logic-based clustering algorithm for WSN to
extend the network lifetime. In IEEE Sensors Journal, 16(1), 137–144.
103. Tolani, M., Bajpai, A., Sunny, R. K. S., Wuttisittikulkij, L., & Kovintavewat, P. (2021). Energy
efficient hybrid medium access control protocol for WSN. In The 36th International Tech-
nical Conference on Circuits/Systems, Computers and Communications, June 28th(Mon)–
30th(Wed)/Grand Hyatt Jeju, Republic of Korea.
104. Calle Torres, M. G. Energy-consumption in WSNs using GSP. M.Sc. Thesis, University of Pittsburgh, April.
105. Chebrolu, K., Raman, B., Mishra, N., Valiveti, P., & Kumar, R. (2008). Brimon: A sensor
network system for railway bridge monitoring. In Proceeding 6th International Conference
on Mobile Systems, Applications, and Services, Breckenridge, CO, USA, pp. 2–14.
106. Pascale, A., Varanese, N., Maier, G., & Spagnolini, U. (2012). A WSN architecture for railway
signalling. In Proceedings of 9th Italian Network Workshop, Courmayeur, Italy (pp. 1–4).
107. Grudén, M., Westman, A., Platbardis, J., Hallbjorner, P., & Rydberg, A. (2009). Reliability
experiments for WSNs in train environment. in Proceedings of European Wireless Technology
Conferences, (pp. 37–40).
108. Rabatel, J., Bringay, S., & Poncelet, P. (2009). SO-MAD: Sensor mining for anomaly detection
in railway data. Advances in Data Mining: Applications and Theoretical Aspects, LNCS (Vol.
5633, pp. 191–205).
109. Rabatel, J., Bringay, S., & Poncelet, P. (2011). Anomaly detection in monitoring sensor data
for preventive maintenance. Expert Systems With Applications, 38(6), 7003–7015.
110. Reason, J., Chen, H., Crepaldi, R., & Duri, S. (2010). Intelligent telemetry for freight trains.
Mobile computing, applications, services (Vol. 35, pp. 72–91). Berlin, Germany: Springer.
111. Reason, J., & Crepaldi, R. (2009). Ambient intelligence for freight railroads. IBM Journal of
Research and Development, 53(3), 1–14.
112. Tuck, K. (2010). Using the 32 Samples First In First Out (FIFO) in the MMA8450Q,
Energy Scale Solutions by free scale, FreeScale Solutions, 2010. http://www.nxp.com/docs/
en/application-note/AN3920.pdf.
113. Pagano, S., Peirani, S., & Valle, M. (2015). Indoor ranging and localisation algorithm based
on received signal strength indicator using statistic parameters for WSNs. In IET Wireless
Sensor Systems (Vol. 5, no. 5, pp. 243–249), October 2015.
114. Tolani, M., Bajpai, A., Sharma, S., Singh, R. K., Wuttisittikulkij, L., & Kovintavewat, P. (2021). Energy
efficient hybrid medium access control protocol for WSN. In 36th International Technical
Conference on Circuits/Systems, Computers and Communications, (ITC-CSCC 21), at Jeju,
South Korea, 28–30 June 2021.
115. Tolani, M., Sunny, R. K. S. (2020). Energy-Efficient adaptive GTS allocation algorithm for
IEEE 802.15.4 MAC protocol. Telecommunication systems. Springer. https://doi.org/10.1007/
s11235-020-00719-0.
116. Tolani, M., Sunny, R. K. S. Adaptive Duty Cycle Enabled Energy-Efficient Bit-Map-Assisted
MAC Protocol. Springer, SN Computer Science. https://doi.org/10.1007/s42979-020-00162-
7.
117. Tolani, M., Sunny, R. K. S. (2020). Energy-Efficient Hybrid MAC Protocol for Railway Mon-
itoring Sensor Network (Vol. 2, p. 1404). Springer, SN Applied Sciences (2020). https://doi.
org/10.1007/s42452-020-3194-1.
118. Tolani, M., Sunny, R. K. S. (2018). Energy-efficient aggregation-aware IEEE 802.15.4
MAC protocol for railway, tele-medicine & industrial applications. In 2018 5th IEEE Uttar
Pradesh Section International Conference on Electrical, Electronics and Computer Engi-
neering (UPCON), Gorakhpur (pp. 1–5).
119. Khan, A. A., Jamal, M. S., & Siddiqui, S. (2017). Dynamic duty-cycle control for WSNs using
artificial neural network (ANN). International Conference on Cyber-Enabled Distributed
Computing and Knowledge Discovery (CyberC), 2017, 420–424. https://doi.org/10.1109/
CyberC.2017.93
120. Wahyono, I. D., Asfani, K., Mohamad, M. M., Rosyid, H., Afandi, A., & Aripriharta (2020).
The new intelligent WSN using artificial intelligence for building fire disasters. In 2020 Third
International Conference on Vocational Education and Electrical Engineering (ICVEE) (pp.
1–6). https://doi.org/10.1109/ICVEE50212.2020.9243210.
121. Aliyu, F., Umar, S., & Al-Duwaish, H. (2019). A survey of applications of artificial neural
networks in WSNs. In 2019 8th International Conference on Modeling Simulation and Applied
Optimization (ICMSAO) (pp. 1–5). https://doi.org/10.1109/ICMSAO.2019.8880364.
122. Sun, L., Cai, W., & Huang, X. (2010). Data aggregation scheme using neural networks in
WSNs. In 2010 2nd International Conference on Future Computer and Communication,
May 2010 (Vol. 1, pp. V1-725–V1-729).
123. Elia, M. et al. (2006). Condition monitoring of the railway line and overhead equipment
through onboard train measurement-an Italian experience. In Proceedings of IET International
Conference on Railway Condition Monitor, Birmingham, UK (pp. 102–107).
124. Maly, T., Rumpler, M., Schweinzer, H., & Schoebel, A. (2005). New development of an overall
train inspection system for increased operational safety. In Proceedings of IEEE Intelligent
Transportation Systems, Vienna, Austria (pp. 188–193).
Four Wheeled Humanoid Second-Order
Cascade Control of Holonomic
Trajectories
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 461
A. T. Azar and A. Koubaa (eds.), Artificial Intelligence for Robotics and Autonomous
Systems Applications, Studies in Computational Intelligence 1093,
https://doi.org/10.1007/978-3-031-28715-2_15
1 Introduction
Wheeled mobile robots are widely used in a number of applications. The performance of a wheeled robot is considerably good, in particular on flat and structured floors; for instance, wheeled robots are faster, more efficient in reaching positions, and usually more effective in terms of mechanical energy requirements than walking bipeds. In numerous robotic applications, particularly in cluttered environments, omnidirectional rolling locomotion is well suited to easily changing the robot's body posture and moving in essentially any direction without explicit yaw control. Nowadays, omnidirectional wheels are in high demand as a means of locomotion in mobile robotics because of their ability to move in any direction, particularly when driving in flat, structured, confined spaces [1, 2]. Unlike conventional wheels, omniwheels impose fewer kinematic constraints and allow the robot a wide range of mobility, adding holonomy and considerable maneuverability. The omniwheel-based holonomic robots developed for different applications are by now numerous and relevant. For instance, personal assistant robots used as walking helpers have demonstrated the capability to provide guidance and dynamic support for people with impaired walking [3].
There are also other types of human-robot assistants in which mobile robots are intended for socially assistive interaction [4]. In healthcare, robotic systems have been designed with mecanum wheels to provide omnidirectional motion to wheelchairs [5]. Although manipulators onboard mobile platforms are not a new approach, mobile manipulators with omnidirectional locomotion provide interesting advantages; mecanum-wheeled platforms have served as holonomic vehicular manipulators moving in industrial workspaces [6]. Park et al. [7] presented a controller for velocity tracking and vibration reduction of a cart-pole inverted-pendulum-like model of an omnidirectional assistive mobile robot. The robot adopted mecanum-wheel rolling with suspension to keep consistent contact between wheel and ground while transporting heavy goods placed at high locations.
Moreover, instrumented omnidirectional platforms with visually guided servoing devices have been reported [8]. Holonomic robotic platforms have also been exploited as robotized sporting and training technology to provide assistance and training in racquet sports [9]. A traditional application exploits the advantages of omniwheel-based mobile robots deployed as domestic assistants in household environments [10]. Mecanum wheels used in omnidirectional mobile robots have likewise proved valuable in industrial fields, for instance in autonomous indoor material transportation [11] and in robotic platforms with four omnidirectional wheels working in warehouses [12]. The work [13] performed collaborative multirobot manipulation, displacing payloads to desired locations in planar, obstacle-cluttered scenarios and maneuvering through narrow pathways, for which mecanum-wheeled robots able to reposition without body-orientation change were advocated. The work [14] developed modular reconfiguration by deploying a group of vehicles to perform different mission tasks. Reconfiguration
was done at the level of motion planning, deploying four-wheel-drive mecanum mobile robots. A further deployment of omniwheel robotics lies in the growing field of omnidirectional humanoids for nursing and rehabilitation [15].
Demands on the use of robots differ considerably in how the robots are deployed: for instance, in industry, robots work on-site with highly accurate robotic arms, or mobile platforms move heavy loads and assist human workers in close proximity [16].
This chapter presents the trajectory tracking model of a humanoid robot at the stage of kinematic modeling and simulation. The work considers a model of two upper limbs with three joints each, fixed on a trunk that is placed on a mecanum-wheeled platform with four asynchronous rolling drives. The main contribution of this research is the development of a three-cascade kinematic trajectory tracking controller, in which each cascade comprises a derivative of different order deduced from the robot's kinematic model. Complementary observers are developed under a deterministic approach, based on wheel encoders and an inertial measurement unit. The omniwheels are physically arranged radially equidistant and tangentially rotated with respect to (w.r.t.) the center of reference. Numerical simulation results are presented that allow the proposed models and ideas to be validated, understood, and refined before converting them into feasible and operational physical systems.
This chapter is organized as follows. Section 2 briefly discusses similar works. Section 3 deduces the motion equations of the robot's arms and of its four-wheel, four-drive omnidirectional rolling platform. Section 4 defines the sensing model and the observers used as online feedback elements. Section 5 describes the three-cascade controller. Finally, Sect. 7 provides the conclusion of the work.
2 Related Work
The study of position errors and calibration methods for robot locomotion with omnidirectional wheels has been shown to be relevant [17]. The work [18] developed a reference-based control for a mecanum-wheeled omnidirectional robot platform, relying on the robot's kinematic model to generate trajectories and optimal constrained navigation. The cost function quantified differences between the robot's path prediction and a family of parameterized reference trajectories. The work [19] demonstrated a time-varying proportional-integral-derivative controller for trajectory tracking of a mecanum-wheeled robot. It used a linearization of a nonlinear kinematic error model, with the controller's parametric coefficients adjusted by trial and error.
Omniwheel-based robot motion is affected by systematic perturbations differently than conventional wheeled robots. Identifying the sources of pose errors is critical to developing methods for kinematic error reduction in omnidirectional robotic systems [20]. The work [21] evaluated a method to correct systematic odometry errors of a humanoid-like three-wheeled omnidirectional mobile robot; the correction was based on flower-shaped calibration trajectories.
Finally, a path-following control using extended Kalman filtering for sensor fusion
was introduced in [35].
Some of the previously cited works reported approaches using soft-computing techniques combined with traditional control methods for tracking, either for recovery from disturbances or for fault tolerance in tracking motion control. In contrast, the present research proposes a model-based recursive control with the particularity of implementing inner multi-cascades that combine multiple higher-order inputs. Numerical errors are reduced with respect to a reference model by successive approximations acting as convergence laws. The focus presented in this research differs from most of the cited related work, fundamentally in the structure of the controller and in the kind of observer models. For instance, while a traditional PID controller combines three derivative orders as a summation of terms in a single algebraic expression, the proposed approach nests each derivative inside another of lower order and faster sampling, forming different recursive control cycles.
3 Robot Design and Kinematic Models
This section describes the essential design parts of the proposed robotic structure at the level of the simulation model. Additionally, both kinematic models, that of the onboard manipulators and that of the four-mecanum-wheel omnidirectional locomotion structure, are illustrated.
Figure 1a depicts the humanoid CAD concept of the proposed robotic platform. Figure 1b shows a basic model created in the C/C++ language as a resource for numerical simulations, which deploys the Open Dynamics Engine (ODE) library to create simulated animations.
Fig. 1 Mecanum four-wheeled humanoid structure. a A CAD model. b A simulation model from the physics engine ODE
Fig. 2 Onboard arms basic mechanism. Joints and links kinematic parameters (above). Side of the
elbow mechanism (middle). Side of the wrist mechanism and shoulder gray-color gear (below)
The four mecanum wheels are symmetrically and radially arranged, equidistant, beneath the chassis structure. Each wheel is independently driven in both rotary directions. This work places the emphasis on the omnidirectional locomotion controller, since motion over the ground plane impacts the manipulators' position; the addition of the robot's translation and orientation is given in separate models along the manuscript. Figure 2 illustrates a basic conceptual design intended to help describe the joints' functional forms. The limbs' purpose in this manuscript is to illustrate general interaction with manipulable objects in general scenarios.
Therefore, the onboard arms may be modeled with multiple degrees of freedom. However, in this manuscript the manipulators have been established as symmetrically planar with three rotary joints: shoulder (θ0), elbow (θ1), and wrist (θ2), all turning in pitch (see Fig. 2). Additionally, the robot's orientation is assumed to be the arms' yaw motion (θt). The onboard arm's side view is shown in Fig. 2(below), where the gray-color gear is the actuating device ϕ0 that rotates a shoulder. The arm's joint ϕl1 describes angular displacements for link l1. Figure 2(middle) shows an antagonistic arm's side view where the orange-color joint mechanism for θ1 (elbow) is depicted. The mechanism device for θ1 has asynchronous motion from θ0 and θ2. Additionally, Fig. 2(middle) shows a yellow-color gearing system to depict how the wrist motion is obtained and transmitted from the actuating gear ϕ2 towards ϕ5. The wrist rotary angle is the joint θ2, which rotates the gripper's elevation angle.
Hence, without loss of generality, the Cartesian position is a system of equations established for now in two dimensions, with z = 0, ż = 0, and z̈ = 0. Let z be the depth dimension, not treated in this section. Subsequently, a third Cartesian component may be stated when the robot's yaw is defined, as it impacts the arms' pose, given in the next sections.
From the depiction of Fig. 2(above), the following arm position expressions x_a, y_a are deduced to describe the motion in the sagittal plane (pitch),
$$x_a = l_1 \cos(\theta_0) + l_2 \cos(\theta_0 + \theta_1) + l_3 \cos(\theta_0 + \theta_1 + \theta_2) \quad (1)$$
and
$$y_a = l_1 \sin(\theta_0) + l_2 \sin(\theta_0 + \theta_1) + l_3 \sin(\theta_0 + \theta_1 + \theta_2), \quad (2)$$
where the functional forms of the actuating joints are described in the following Definition 1.
Definition 1 (Joints functional forms) Assuming gear angles and teeth numbers are denoted by ϕ_i and n_j, respectively, let ϕ_0 be the actuating joint, such that
$$\theta_0 = \varphi_0. \quad (3)$$
$$\begin{pmatrix} x_a^{t+1} - x_a^{t} \\ y_a^{t+1} - y_a^{t} \end{pmatrix} = \begin{pmatrix} -l_1 c_0 & -\dfrac{n_6}{n_8}\,(l_1 c_0 + l_2 c_{01}) & -\dfrac{n_1}{n_5}\,(l_1 c_0 + l_2 c_{01} + l_3 c_{012}) \\ l_1 s_0 & \dfrac{n_6}{n_8}\,(l_1 s_0 + l_2 s_{01}) & \dfrac{n_1}{n_5}\,(l_1 s_0 + l_2 s_{01} + l_3 s_{012}) \end{pmatrix} \begin{pmatrix} \theta_0 - \varphi_0 \\ \theta_1 - \varphi_8 \\ \theta_2 - \varphi_5 \end{pmatrix} \quad (6)$$
Fig. 3 Onboard arms local Cartesian motion simulation for an arbitrary trajectory
Hence, the law (6) is satisfied when lim_{ϕ_i→θ_i} (x, y)^{t+1} − (x, y)^{t} = 0, where (x, y) is a Cartesian position and (θ_i − ϕ_i) is an instantaneous joint error.
Validating the previous kinematic expressions (1) and (2), Fig. 3 shows a numerical simulation of the Cartesian position along an arbitrary trajectory.
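As a minimal illustration of Eqs. (1) and (2), the following C/C++ sketch evaluates the planar forward kinematics of one arm; the link lengths and joint angles used in main() are hypothetical placeholders, not the chapter's actual design values.

#include <cmath>
#include <cstdio>

// Planar 3-link forward kinematics of one onboard arm, Eqs. (1)-(2):
//   x_a = l1*cos(t0) + l2*cos(t0+t1) + l3*cos(t0+t1+t2)
//   y_a = l1*sin(t0) + l2*sin(t0+t1) + l3*sin(t0+t1+t2)
struct Point2 { double x, y; };

Point2 armForwardKinematics(double l1, double l2, double l3,
                            double t0, double t1, double t2) {
    const double a01  = t0 + t1;          // shoulder + elbow
    const double a012 = t0 + t1 + t2;     // shoulder + elbow + wrist
    Point2 p;
    p.x = l1 * std::cos(t0) + l2 * std::cos(a01) + l3 * std::cos(a012);
    p.y = l1 * std::sin(t0) + l2 * std::sin(a01) + l3 * std::sin(a012);
    return p;
}

int main() {
    // Hypothetical link lengths [m] and joint angles [rad], for illustration only.
    const Point2 p = armForwardKinematics(0.25, 0.20, 0.10, 0.3, -0.5, 0.2);
    std::printf("x_a = %.4f m, y_a = %.4f m\n", p.x, p.y);
    return 0;
}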
Moreover, consider the system of nonlinear equations modeling position, (1) and (2), and hereafter assume that the joints θ_j(ϕ_k) are functions of the gear rotations. Thus, the first-order derivative w.r.t. time is algebraically deduced, and the Cartesian velocities are described by
$$\begin{pmatrix} \dot{x}_a \\ \dot{y}_a \end{pmatrix} = l_1 \begin{pmatrix} -s_0 \\ c_0 \end{pmatrix} \dot{\theta}_0 + l_2 \begin{pmatrix} -s_{01} \\ c_{01} \end{pmatrix} \sum_{i=0}^{1} \dot{\theta}_i + l_3 \begin{pmatrix} -s_{012} \\ c_{012} \end{pmatrix} \sum_{i=0}^{2} \dot{\theta}_i. \quad (7)$$
It follows the second-order derivative, which describes the arm's Cartesian accelerations, where the serial links' Jacobian is assumed to be a non-stationary matrix J_t ∈ R^{2×3}, such that
$$\begin{pmatrix} \ddot{x}_a \\ \ddot{y}_a \end{pmatrix} = J_t \cdot \begin{pmatrix} \ddot{\theta}_0 \\ \ddot{\theta}_1 \\ \ddot{\theta}_2 \end{pmatrix} + \dot{J}_t \cdot \begin{pmatrix} \dot{\theta}_0 \\ \dot{\theta}_1 \\ \dot{\theta}_2 \end{pmatrix}. \quad (8)$$
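For reference, the non-stationary Jacobian implied by Eqs. (1), (2), and (7) can be written explicitly; the chapter does not print it, so the following is a reading consistent with those equations, using the shorthand s_{01} = sin(θ_0 + θ_1), c_{012} = cos(θ_0 + θ_1 + θ_2), and so on:
$$J_t = \begin{pmatrix} -l_1 s_0 - l_2 s_{01} - l_3 s_{012} & \; -l_2 s_{01} - l_3 s_{012} & \; -l_3 s_{012} \\ \;\;\, l_1 c_0 + l_2 c_{01} + l_3 c_{012} & \;\;\, l_2 c_{01} + l_3 c_{012} & \;\;\, l_3 c_{012} \end{pmatrix},$$
so that (ẋ_a, ẏ_a)ᵀ = J_t (θ̇_0, θ̇_1, θ̇_2)ᵀ reproduces (7), and J̇_t in (8) follows by differentiating each entry w.r.t. time.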
Fig. 4 4W4D holonomic kinematics. Mecanum wheels location without twist (left). Wheel positions twisted ±π/2 w.r.t. its center (right)
The planar manipulators can therefore exploit the native holonomic mobility, such as position and rotation, so as to provide three-dimensional spatial manipulator trajectories. Additionally, omnidirectional mobility, as a complement to the arms, allows reducing the complexity of the arms' degrees of freedom.
Let us establish the following mobility kinematic constraints depicted in Fig. 4.
Therefore, without loss of generality let us state the following Proposition 2,
Proposition 2 (Holonomic motion model) Let u_t be the robot state vector in its Cartesian form with components (x, y), such that u_t ∈ R², u = (x, y)ᵀ. Hence, the forward kinematics is
$$\dot{u} = r \cdot K \cdot \dot{\Phi}, \quad (9)$$
and the inverse solution is
$$\dot{\Phi} = \frac{1}{r} \cdot K^{+} \cdot \dot{u} = \frac{1}{r} \cdot K^{T}(K \cdot K^{T})^{-1} \cdot \dot{u}. \quad (10)$$
Therefore, according to the geometry of Fig. 4 and the general models of the previous Proposition 2, the following algebraic deduction arises: the Cartesian speeds ẋ and ẏ in holonomic motion are obtained from the wheels' tangential velocities V_k, expressed as
$$\dot{x} = V_1 \cos\!\left(\alpha_1 - \tfrac{\pi}{2}\right) + V_2 \cos\!\left(\alpha_2 - \tfrac{\pi}{2}\right) + V_3 \cos\!\left(\alpha_3 - \tfrac{\pi}{2}\right) + V_4 \cos\!\left(\alpha_4 - \tfrac{\pi}{2}\right), \quad (11)$$
as well as
$$\dot{y} = V_1 \sin\!\left(\alpha_1 - \tfrac{\pi}{2}\right) + V_2 \sin\!\left(\alpha_2 - \tfrac{\pi}{2}\right) + V_3 \sin\!\left(\alpha_3 - \tfrac{\pi}{2}\right) + V_4 \sin\!\left(\alpha_4 - \tfrac{\pi}{2}\right). \quad (12)$$
From these, the stationary non-square kinematic control matrix K is provided by Definition 2.
Definition 2 (4W4D holonomic kinematic matrix) Each wheel is placed at an angle α_k w.r.t. the robot's geometric center; thus
$$K = \begin{pmatrix} \cos(\alpha_1 - \frac{\pi}{2}) & \cos(\alpha_2 - \frac{\pi}{2}) & \cos(\alpha_3 - \frac{\pi}{2}) & \cos(\alpha_4 - \frac{\pi}{2}) \\ \sin(\alpha_1 - \frac{\pi}{2}) & \sin(\alpha_2 - \frac{\pi}{2}) & \sin(\alpha_3 - \frac{\pi}{2}) & \sin(\alpha_4 - \frac{\pi}{2}) \end{pmatrix}. \quad (14)$$
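A minimal C/C++ sketch of Proposition 2 and Definition 2 follows: it builds K from the wheel angles, maps wheel rates to Cartesian speeds through Eq. (9), and recovers wheel rates through the right pseudoinverse of Eq. (10). The wheel angles and radius in main() are hypothetical, not the chapter's actual platform parameters.

#include <cmath>
#include <cstdio>

namespace { const double kPi = 3.14159265358979323846; }

// Definition 2: 2x4 kinematic matrix K, one column per wheel angle alpha_k.
void buildK(const double alpha[4], double K[2][4]) {
    for (int k = 0; k < 4; ++k) {
        K[0][k] = std::cos(alpha[k] - kPi / 2.0);
        K[1][k] = std::sin(alpha[k] - kPi / 2.0);
    }
}

// Eq. (9): forward kinematics udot = r * K * phidot.
void forwardKinematics(const double K[2][4], double r,
                       const double phidot[4], double udot[2]) {
    for (int i = 0; i < 2; ++i) {
        udot[i] = 0.0;
        for (int k = 0; k < 4; ++k) udot[i] += r * K[i][k] * phidot[k];
    }
}

// Eq. (10): inverse kinematics phidot = (1/r) * K^T * (K K^T)^{-1} * udot,
// using the explicit inverse of the 2x2 Gram matrix G = K K^T.
void inverseKinematics(const double K[2][4], double r,
                       const double udot[2], double phidot[4]) {
    double G[2][2] = {{0.0, 0.0}, {0.0, 0.0}};
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
            for (int k = 0; k < 4; ++k) G[i][j] += K[i][k] * K[j][k];
    const double det = G[0][0] * G[1][1] - G[0][1] * G[1][0];
    const double w0 = ( G[1][1] * udot[0] - G[0][1] * udot[1]) / det;  // (K K^T)^{-1} * udot
    const double w1 = (-G[1][0] * udot[0] + G[0][0] * udot[1]) / det;
    for (int k = 0; k < 4; ++k) phidot[k] = (K[0][k] * w0 + K[1][k] * w1) / r;
}

int main() {
    // Hypothetical radially equidistant wheel angles and wheel radius r = 0.05 m.
    const double alpha[4] = {kPi / 4, 3 * kPi / 4, 5 * kPi / 4, 7 * kPi / 4};
    double K[2][4], phidot[4], udot[2];
    buildK(alpha, K);
    const double udot_ref[2] = {0.3, 0.1};         // desired Cartesian speeds [m/s]
    inverseKinematics(K, 0.05, udot_ref, phidot);  // wheel rates realizing udot_ref
    forwardKinematics(K, 0.05, phidot, udot);      // maps back to Cartesian speeds
    std::printf("udot = (%.3f, %.3f) m/s\n", udot[0], udot[1]);
    return 0;
}

Because K·Kᵀ is only a 2×2 matrix, the pseudoinverse is computed here in closed form rather than with a linear-algebra library.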
Similarly, from the previous model, higher-order derivatives are deduced for subsequent treatment, for the sake of building the controller cascades. Thus, the second-order kinematic model is
$$\begin{pmatrix} \ddot{x} \\ \ddot{y} \end{pmatrix} = r \cdot \begin{pmatrix} \cos(\alpha_1 - \frac{\pi}{2}) & \cos(\alpha_2 - \frac{\pi}{2}) & \cos(\alpha_3 - \frac{\pi}{2}) & \cos(\alpha_4 - \frac{\pi}{2}) \\ \sin(\alpha_1 - \frac{\pi}{2}) & \sin(\alpha_2 - \frac{\pi}{2}) & \sin(\alpha_3 - \frac{\pi}{2}) & \sin(\alpha_4 - \frac{\pi}{2}) \end{pmatrix} \cdot \begin{pmatrix} \ddot{\phi}_1 \\ \ddot{\phi}_2 \\ \ddot{\phi}_3 \\ \ddot{\phi}_4 \end{pmatrix}. \quad (16)$$
Likewise, a third-order derivative is provided by the model
$$\begin{pmatrix} \dddot{x} \\ \dddot{y} \end{pmatrix} = r \cdot \begin{pmatrix} \cos(\alpha_1 - \frac{\pi}{2}) & \cos(\alpha_2 - \frac{\pi}{2}) & \cos(\alpha_3 - \frac{\pi}{2}) & \cos(\alpha_4 - \frac{\pi}{2}) \\ \sin(\alpha_1 - \frac{\pi}{2}) & \sin(\alpha_2 - \frac{\pi}{2}) & \sin(\alpha_3 - \frac{\pi}{2}) & \sin(\alpha_4 - \frac{\pi}{2}) \end{pmatrix} \cdot \begin{pmatrix} \dddot{\phi}_1 \\ \dddot{\phi}_2 \\ \dddot{\phi}_3 \\ \dddot{\phi}_4 \end{pmatrix}. \quad (17)$$
Fig. 5 General higher-order derivatives for the 4W4D holonomic model. Velocity (above). Acceleration (below)
The fact that the matrix K is stationary keeps the linear derivative expressions simple. For this type of four-wheel holonomic platform, the kinematic models produce the behavior curves shown in Fig. 5.
4 Observer Models
This section establishes the main sensing models, which are assumed deterministic and are used in the cascade controller as feedback elements for observing the robot's model state. It is worth saying that perturbation models, noisy sensor measurements, and calibration methods are out of the scope of this manuscript.
Thus, let us assume a pulse shaft encoder fixed to each wheel. Hence, let φ̂_{ε_k} be a measurement of the angular position of the kth wheel,
$$\hat{\varphi}_{\varepsilon_k}^{\,t}(\eta) = \frac{2\pi}{R}\,\eta_t, \quad (18)$$
where η_t is the instantaneous number of pulses detected while the wheel is rotating. Let R be defined as the encoder angular resolution. Furthermore, the encoder-based angular velocity observation is given by the backward high-precision first-order derivative,
$$\hat{\dot{\varphi}}_{\varepsilon}(\eta, t) = \frac{3\hat{\varphi}_t - 4\hat{\varphi}_{t-1} + \hat{\varphi}_{t-2}}{(t_k - t_{k-1})(t_{k-1} - t_{k-2})}, \quad (19)$$
with three previous measurements of the angle φ̂_ε and of the time t_k. Hence, the kth wheel's tangential velocity is obtained by
$$\upsilon_k = \frac{\pi r}{R\,\Delta t}\left(3\eta_t - 4\eta_{t-1} + \eta_{t-2}\right), \quad (20)$$
where r is the wheel's radius and, considering varying time loops, let Δt = (t_k − t_{k−1})(t_{k−1} − t_{k−2}). Without loss of generality, let us substitute the previous statements into Proposition 3 to describe the Cartesian speed observations, such that
Proposition 3 (Encoder-based velocity observer) For simplicity, let us define the constants β_k = α_k − π/2 as constant angles for the wheels' orientation. Therefore, the encoder-based velocity observers ẋ̂, ẏ̂ are modeled by
$$\hat{\dot{x}} = \sum_{k=1}^{4} \upsilon_k \,\sin(\beta_k) \quad (21)$$
and
$$\hat{\dot{y}} = \sum_{k=1}^{4} \upsilon_k \,\cos(\beta_k). \quad (22)$$
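A brief C/C++ sketch of the encoder-based observers (18)-(22) follows, assuming R is the number of pulses per revolution and using illustrative pulse counts; the sine/cosine assignment mirrors Eqs. (21)-(22) as printed, and Δt follows the chapter's product definition.

#include <cmath>
#include <cstdio>

namespace { const double kPi = 3.14159265358979323846; }

// Cumulative encoder pulse counts of one wheel at the last three sample instants.
struct WheelCounts { long eta_t, eta_t1, eta_t2; };   // t, t-1, t-2

// Eq. (18): wheel angle from the pulse count (R interpreted here as pulses per revolution).
double wheelAngle(long eta, double R) { return (2.0 * kPi / R) * static_cast<double>(eta); }

// Eq. (20): tangential velocity from the backward three-point difference,
// with dt = (t_k - t_{k-1}) * (t_{k-1} - t_{k-2}) as defined in the chapter.
double tangentialVelocity(const WheelCounts& c, double r, double R, double dt) {
    return (kPi * r / (R * dt)) * (3.0 * c.eta_t - 4.0 * c.eta_t1 + c.eta_t2);
}

// Eqs. (21)-(22): Cartesian velocity observers with beta_k = alpha_k - pi/2.
void cartesianVelocityObserver(const double v[4], const double beta[4],
                               double& xdot_hat, double& ydot_hat) {
    xdot_hat = 0.0;
    ydot_hat = 0.0;
    for (int k = 0; k < 4; ++k) {
        xdot_hat += v[k] * std::sin(beta[k]);
        ydot_hat += v[k] * std::cos(beta[k]);
    }
}

int main() {
    // Illustrative values only: R = 1024 pulses/rev, r = 0.05 m, dt per the product form.
    const double R = 1024.0, r = 0.05, dt = 0.01 * 0.01;
    const double beta[4] = {-kPi / 4, kPi / 4, 3 * kPi / 4, 5 * kPi / 4};
    const WheelCounts counts[4] = {{120, 110, 101}, {118, 109, 100}, {60, 55, 50}, {59, 54, 50}};
    double v[4];
    for (int k = 0; k < 4; ++k) v[k] = tangentialVelocity(counts[k], r, R, dt);
    double xdot_hat, ydot_hat;
    cartesianVelocityObserver(v, beta, xdot_hat, ydot_hat);
    std::printf("xdot = %.3f m/s, ydot = %.3f m/s (phi_1 = %.3f rad)\n",
                xdot_hat, ydot_hat, wheelAngle(counts[0].eta_t, R));
    return 0;
}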
Moreover, the four wheels' tangential speeds contribute to exerting yaw motion w.r.t. the center of the robot. Thus, since by using encoders the wheels' linear displacements can be inferred, an encoder-based yaw observation θ̂_ε is possible,
$$\hat{\theta}_{\varepsilon} = \frac{\pi r}{2RL} \sum_{k=1}^{4} \eta_k, \quad (23)$$
where L is the distance between any wheel and the robot's center of rotation. Thus, the robot's angular velocity observer based only on the encoder measurements is
$$\hat{\dot{\theta}}_{\varepsilon} = \frac{\pi r}{4RL\,\Delta t} \sum_{k=1}^{4} \left(3\eta_k^{t} - 4\eta_k^{t-1} + \eta_k^{t-2}\right). \quad (24)$$
The robot's compound angular velocity observer, which combines the encoder-based rate with the inertial-based estimates, is
$$\hat{\omega}_t = \frac{1}{3}\,\hat{\dot{\theta}}_{\varepsilon} + \frac{1}{3}\,\frac{d\hat{\theta}_{\iota}}{dt} + \frac{1}{3}\,\hat{\dot{\theta}}_{g}. \quad (27)$$
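As a small illustration, the fusion in Eq. (27) is a plain equal-weight average of the three rate estimates; mapping each term to a physical sensor (encoder odometry, the derivative of an inertially measured angle, and a gyroscope rate) is an interpretation of the symbols used here, not something spelled out in this excerpt.

// Eq. (27): compound yaw-rate observer as the equal-weight average of three estimates
// (encoder-based rate, derivative of the inertially measured angle, gyro rate, as
// interpreted in this sketch).
double fusedYawRate(double rate_encoder, double rate_angle_derivative, double rate_gyro) {
    return (rate_encoder + rate_angle_derivative + rate_gyro) / 3.0;
}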
Integrating the forward kinematic model (9) between two consecutive states yields
$$u_2 - u_1 = r \cdot K \cdot (\Phi_2 - \Phi_1). \quad (30)$$
The inverse recursive solution is then
$$\Phi_{t+1} = \Phi_t + \frac{1}{r} \cdot K^{T}(K \cdot K^{T})^{-1} \cdot (u_{ref} - \hat{u}_t), \quad (31)$$
where the prediction vector u_{t+1} is re-formulated as the global reference u_{ref}, i.e., the goal the robot is desired to reach. Likewise, the forward kinematic solution for u_{t+1} is
$$u_{t+1} = u_t + r \cdot K \cdot (\Phi_{t+1} - \hat{\Phi}_t). \quad (32)$$
Therefore, the first global controller cascade is formed by means of the pair of recursive expressions (31) and (32). Proposition 5 highlights the global cascade.
Proposition 5 (Feedback position cascade) Given the inverse kinematic motion with the observation û_t in the workspace,
$$\Phi_{t+1} = \Phi_t + \frac{1}{r} \cdot K^{T}(K \cdot K^{T})^{-1} \cdot (u_{ref} - \hat{u}_t), \quad (33)$$
and the direct kinematic motion with the observation Φ̂_t in the control variables space,
$$u_{t+1} = u_t + r \cdot K \cdot (\Phi_{t+1} - \hat{\Phi}_t). \quad (34)$$
Proposition 5 is validated through the numerical simulations shown in Fig. 6, which depict the automatic Cartesian segments and the decreasing feedback position errors.
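A compact C/C++ sketch of one recursion of this position cascade, Eqs. (33)-(34), could look as follows; K and its right pseudoinverse Kᵀ(K·Kᵀ)⁻¹ are assumed to be precomputed (for instance, as in the earlier kinematics sketch), and the fixed-size arrays stand in for the chapter's vectors.

// One recursion of the outer position cascade, Proposition 5 (Eqs. 33-34).
// K[2][4] is the kinematic matrix, pinvK[4][2] = K^T (K K^T)^{-1}, r the wheel radius.
void positionCascadeStep(const double K[2][4], const double pinvK[4][2], double r,
                         const double u_ref[2], const double u_hat[2],
                         const double phi_hat[4],
                         double phi[4], double u[2]) {
    // Inverse step (33): Phi_{t+1} = Phi_t + (1/r) * K^+ * (u_ref - u_hat).
    const double e[2] = {u_ref[0] - u_hat[0], u_ref[1] - u_hat[1]};
    for (int k = 0; k < 4; ++k)
        phi[k] += (pinvK[k][0] * e[0] + pinvK[k][1] * e[1]) / r;
    // Forward step (34): u_{t+1} = u_t + r * K * (Phi_{t+1} - Phi_hat_t).
    for (int i = 0; i < 2; ++i) {
        double acc = 0.0;
        for (int k = 0; k < 4; ++k) acc += K[i][k] * (phi[k] - phi_hat[k]);
        u[i] += r * acc;
    }
}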
Without loss of generality and following the same approach as Proposition 5, let us present a second controller cascade to control velocity. Thus, the following equation expresses the second-order derivative kinematic expression given in (16) as a differential equation,
$$\frac{d\dot{u}}{dt} = r \cdot K \cdot \frac{d\dot{\Phi}}{dt}, \quad (35)$$
and, solving definite integrals,
$$\int_{\dot{u}_1}^{\dot{u}_2} d\dot{u} = r \cdot K \int_{\dot{\Phi}_1}^{\dot{\Phi}_2} d\dot{\Phi}. \quad (36)$$
Fig. 7 Numerical simulation for the second cascade inner recursive loop in terms of velocities
It follows Proposition 6, establishing the second cascade that controls the first-order derivatives.
Proposition 6 (Feedback velocity cascade) The backwards kinematic recursive function with in-loop velocity observers and the prediction u̇_{t+1} used as the local reference u̇_{ref} is given by
$$\dot{\Phi}_{t+1} = \dot{\Phi}_t + \frac{1}{r} \cdot K^{T} \cdot (K \cdot K^{T})^{-1} \cdot (\dot{u}_{ref} - \hat{\dot{u}}_t), \quad (38)$$
and likewise the forward speeds kinematic model,
$$\dot{u}_{t+1} = \dot{u}_t + r \cdot K \cdot (\dot{\Phi}_{t+1} - \hat{\dot{\Phi}}_t). \quad (39)$$
Similarly, the third-order kinematic expression (17) can be written compactly as
$$\dddot{u} = r \cdot K \cdot \dddot{\Phi}. \quad (40)$$
Therefore, the following Proposition 7 is provided, with the notation rearranged for a third recursive inner control loop in terms of accelerations.
Proposition 7 (Feedback acceleration cascade) The backwards kinematic recursive function with in-loop acceleration observers and the prediction ü_{t+1} used as the local reference ü_{ref} is given by
$$\ddot{\Phi}_{t+1} = \ddot{\Phi}_t + \frac{1}{r} \cdot K^{T} \cdot (K \cdot K^{T})^{-1} \cdot (\ddot{u}_{ref} - \hat{\ddot{u}}_t). \quad (44)$$
Additionally, the forward acceleration kinematic model is
$$\ddot{u}_{t+1} = \ddot{u}_t + r \cdot K \cdot (\ddot{\Phi}_{t+1} - \hat{\ddot{\Phi}}_t), \quad (45)$$
such that
$$\lim_{\Delta\Phi \to 0} \left(\Phi_{ref} - \hat{\Phi}\right) = 0,$$
where the feedback error is e_Φ = (Φ_{ref} − Φ̂), which numerically approaches zero. Thus, the stop criterion e_Φ < ε_Φ is expressed as the relative error
$$\frac{\left\| \Phi_{ref} - \hat{\Phi}_t \right\|}{\left\| \Phi_{ref} \right\|} < \varepsilon. \quad (46)$$
Fig. 8 Numerical simulation for the third cascade inner recursive loop in terms of accelerations
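A small helper in the spirit of criterion (46) might look as follows in C/C++; the Euclidean norm and the 4-vector of wheel variables are assumed choices made for illustration.

#include <cmath>

// Eq. (46): relative feedback-error stop criterion, written here for a 4-vector of
// wheel variables (the chapter's Phi), using the Euclidean norm.
bool withinTolerance(const double phi_ref[4], const double phi_hat[4], double eps) {
    double num = 0.0, den = 0.0;
    for (int k = 0; k < 4; ++k) {
        const double d = phi_ref[k] - phi_hat[k];
        num += d * d;
        den += phi_ref[k] * phi_ref[k];
    }
    return std::sqrt(num) < eps * std::sqrt(den);
}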
Data: K, u_ref, u_t, û_t, u̇_t, u̇̂_t, ü_t, ü̂_t, Φ_t, Φ̂_t, Φ̇_t, Φ̇̂_t, Φ̈_t, Φ̈̂_t
u_ref = (x_i, y_i)ᵀ;
while ‖u_ref − u_t‖ > ε_u do
    Φ_{t+1} = Φ_t + (1/r) · Kᵀ(K·Kᵀ)⁻¹ · (u_ref − û_t);
    (d/dt)Φ_{t+1} = (3Φ̂_t − 4Φ̂_{t−1} + Φ̂_{t−2}) / Δt;
    while ‖Φ̇_{t+1} − Φ̇̂_t‖ > ε_Φ̇ do
        u̇_{t+1} = u̇_t + r · K · (Φ̇_{t+1} − Φ̇̂_t);
        (d/dt)u̇_{t+1} = (3u̇̂_t − 4u̇̂_{t−1} + u̇̂_{t−2}) / Δt;
        while ‖ü_{t+1} − ü̂_t‖ > ε_ü do
            Φ̈_{t+1} = Φ̈_t + (1/r) · Kᵀ·(K·Kᵀ)⁻¹ · (ü_{t+1} − ü̂_t);
            ü_{t+1} = ü_t + r · K · (Φ̈_{t+1} − Φ̈̂_t);
            ∫_a^b ü_{t+1} dt = ((b−a)/(2n)) · (ü̂_0 + 2·Σ_{j=1}^{n−1} ü̂_{k_j} + ü̂_{k_n});
        end
        Φ̇_{t+1} = Φ̇_t + (1/r) · Kᵀ·(K·Kᵀ)⁻¹ · (u̇_{t+1} − u̇̂_t);
        ∫_a^b Φ̇_{t+1} dt = ((b−a)/(2n)) · (Φ̇̂_0 + 2·Σ_{j=1}^{n−1} Φ̇̂_{k_j} + Φ̇̂_{k_n});
    end
    u_{t+1} = u_t + r · K · (Φ_{t+1} − Φ̂_t);
end
Algorithm 1: Second-order three-cascade controller pseudocode
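The nesting in Algorithm 1 can be sketched in C/C++ as below. This is a deliberately simplified scalar abstraction with idealized observers (the hat values equal the model state), so each successive approximation converges almost immediately; it is meant only to show how the position, velocity, and acceleration loops nest, not to reproduce the chapter's MIMO implementation, and all numeric values are illustrative.

#include <cmath>
#include <cstdio>

int main() {
    const double r = 0.05, k = 1.0;            // hypothetical radius and scalar stand-in for K
    const double T = 0.1;                      // local reference window [s]
    const double eps_u = 1e-3, eps_v = 1e-4, eps_a = 1e-5;
    double u = 0.0, v = 0.0, a = 0.0;          // Cartesian position, velocity, acceleration
    double phi = 0.0, dphi = 0.0, ddphi = 0.0; // wheel angle and its derivatives
    const double u_ref = 1.0;                  // global Cartesian goal [m]

    while (std::fabs(u_ref - u) > eps_u) {            // outer (slowest) position cascade
        phi += (u_ref - u) / (r * k);                 // inverse position step, cf. Eq. (33)
        const double v_ref = (u_ref - u) / T;         // local velocity reference for this segment
        while (std::fabs(v_ref - v) > eps_v) {        // middle velocity cascade
            dphi += (v_ref - v) / (r * k);            // inverse velocity step, cf. Eq. (38)
            const double a_ref = (v_ref - v) / T;     // local acceleration reference
            while (std::fabs(a_ref - a) > eps_a) {    // inner (fastest) acceleration cascade
                ddphi += (a_ref - a) / (r * k);       // inverse acceleration step, cf. Eq. (44)
                a = r * k * ddphi;                    // forward acceleration step, cf. Eq. (45)
            }
            v += a * T;                               // integrate acceleration over the window
        }
        u += v * T;                                   // integrate velocity over the segment
        std::printf("u = %.4f m, phi = %.3f rad\n", u, phi);
    }
    return 0;
}

With real encoder and IMU feedback in place of the idealized hat values, each loop iterates several times per cycle, which is where the successive-approximation behavior described in the text comes from.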
The following figures of Sect. 6 show the numerical simulation results under the controlled scheme. The robot navigates to different Cartesian positions; within trajectory segments, the cascade controller first controls position, then controls the velocity exerted within that segment of distance, and similarly the acceleration is controlled within the small velocity window currently being regulated.
The three cascade controllers required couplings between them through numerical derivatives and integrations over time. In this case, backward high-precision derivatives and Newton-Cotes integration were used. Although the traditional PID also deploys three derivative orders, their use is by far different in implementation. The proposed cascade model worked in a considerably stable, reliable, and numerically precise manner.
A fundamental aspect of the proposed method is that three loops are nested. The slowest loop establishes the current metric error distance toward a global vector reference. Then, the second and third nested control loops establish the local reference velocity and acceleration, both corresponding to the actual incremental distance. The three loops are conditioned to recursively reduce the errors down to a numerical precision value by means of successive approximations.
Fig. 10 Controlled Cartesian position along a trajectory composed of four global points
The acceleration is controlled by a set of loops, only for a segment of velocity, namely the one being calculated in the current velocity loop.
Finally, the observers that provided feedback were stated as a couple of single expressions representing a feasible model of sensor fusion (summarized by Proposition 4). The robot's angular motion (angle and yaw velocity) combined wheel motion and inertial measurements into a compound observer model. The in-loop transition between numerical derivatives worked highly reliably. The multiple inner control cascades approach was numerically accurate and worked online considerably fast. This type of cascade controller has the advantage that the input, reference, and state vectors and matrices can easily be augmented without any alteration to the algorithm; however, compared with a PID controller in terms of speed, the latter is faster due to its lower computational complexity.
7 Conclusions
The proposed cascade control model was derived directly from the physics of the
robotic mechanism. In such an approach, the kinematic model was obtained as an
independent cascade and established as a proportional control type with a constant
gain represented by the robot’s geometric parameters. The gain or convergence factor
resulted in a non-square stationary matrix (MIMO). Unlike a PID controller, the
inverse analytic solution was formulated to obtain a system of linear differential
equations. In its solution, definite integration produced a recursive controller, which
converged to a solution by successive approximations of the feedback error.
The strategy proposed in this work focuses on connecting all the higher order
derivatives of the system in nested forms (cascades), unlike a PID controller which is
described as a linear summation of all derivative orders. Likewise, a cascade approach
does not need gain adjustment.
The lowest-order derivative was organized in the outer cascade, this being the loop with the slowest control cycle frequency and containing the global control references (desired positions). Further, the intermediate cascade is a derivative of the next higher order and is governed by a local speed reference. That is, this cascade controls the speed during the displacement segment projected by the cycle of the external cascade. Finally, the acceleration cascade cycle is the fastest loop and controls the portions of acceleration during a small interval of displacement along the trajectory towards the next Cartesian goal.
The proposed research evidenced good performance, showing controlled limits of disturbances due to the three controllers acting over the same portion of motion. The controller was robust, and the precision error ε allowed adjusting the accuracy of the robot's closeness to the goal.
References
1. Mutalib, M. A. A., & Azlan, N. Z. (2020). Prototype development of mecanum wheels mobile
robot: A review. Applied Research and Smart Technology, 1(2), 71–82, ARSTech. https://doi.
org/10.23917/arstech.v1i2.39.
2. Yadav P. S., Agrawal V., Mohanta J. C., & Ahmed F. (2022) A theoretical review of mobile
robot locomotion based on mecanum wheels. Joint Journal of Novel Carbon Resource Sciences
& Green Asia Strategy, 9(2), Evergreen.
3. Palacín, J., Clotet, E., Martínez, D., Martínez, D., & Moreno, J. (2019). Extending the appli-
cation of an assistant personal Robot as a Walk-Helper Tool. Robotics, 8(27), MDPI. https://
doi.org/10.3390/robotics8020027.
4. Cooper S., Di Fava A., Vivas C., Marchionni L., & Ferro F. (2020). ARI: The social assistive
robot and companion. In 29th IEEE International Conferences on Robot and Human Inter-
active Communication, Naples Italy, August 31–September 4. https://doi.org/10.1109/RO-
MAN47096.2020.9223470.
5. Li, Y., Dai, S., Zheng, Y., Tian, F., & Yan, X. (2018). Modeling and kinematics simulation of a
mecanum wheel platform in RecurDyn. Journal of Robotics Hindawi. https://doi.org/10.1155/
2018/9373580
6. Rohrig, C., Hes, D., & Kunemund, F. (2017). Motion controller design for a mecanum wheeled
mobile manipulator. In 2017 IEEE Conferences on Control Technology and Applications, USA
(pp. 444–449), August 27–30. https://doi.org/10.1109/ccta.2017.8062502.
7. Park J., Koh D., Kim J., & Kim C. (2021). Vibration reduction control of omnidirectional
mobile robot with lift mechanism. In 21st International Conferences on Control, Automation
and Systems. https://doi.org/10.23919/ICCAS52745.2021.9649932.
8. Belmonte, Á., Ramón, J. L., Pomares, J., Garcia, G. J., & Jara, C. A. (2019). Optimal image-
based guidance of mobile manipulators using direct visual servoing. Electronics, 8(374). https://
doi.org/10.3390/electronics8040374.
9. Yang, F., Shi, Z., Ye, S., Qian, J., Wang, W., & Xuan D. (2022). VaRSM: Versatile autonomous
racquet sports machine. In ACM/IEEE 13th International Conferences on Cyber-Physical Sys-
tems, Milano Italy, May 4–6. https://doi.org/10.1109/ICCPS54341.2022.00025.
10. Eirale A., Martini M., Tagliavini L., Gandini D., Chiaberge M., & Quaglia G. (2022). Marvin:
an innovative omni-directional robotic assistant for domestic environments. arXiv:2112.05597,
https://doi.org/10.48550/arXiv.2112.05597.
11. Qian J., Zi B., Wang D., Ma Y., & Zhang D. (2017). The design and development of an omni-
directional mobile robot oriented to an intelligent manufacturing system. Sensors, 17 (2073).
https://doi.org/10.3390/s17092073.
12. Zalevsky, A., Osipov, O., & Meshcheryakov, R. (2017). Tracking of warehouses robots based on
the omnidirectional wheels. In International Conferences on Interactive Collaborative Robotics
(pp. 268–274). Springer. https://doi.org/10.1007/978-3-319-66471-2_29.
13. Rauniyar A., Upreti H. C., Mishra A., & Sethuramalingam P. (2021). MeWBots: Mecanum-
Wheeled robots for collaborative manipulation in an obstacle-clustered environment without
communication. J. of Intelligent & Robotic Systems, 102(1). https://doi.org/10.1007/s10846-
021-01359-5.
14. Zhou, J., Wang, J., He, J., Gao, J., Yang, A., & Hu, S. (2022). A reconfigurable modular
vehicle control strategy based on an improved artificial potential field. Electronics, 11(16),
2539. https://doi.org/10.3390/electronics11162539
15. Tanioka, T. (2019). Nursing and rehabilitative care of the elderly using humanoid robot. The Journal of Medical Investigation, 66. https://doi.org/10.2152/jmi.66.19
16. Shepherd, S., & Buchstab, A. (2014). KUKA Robots On-Site. In W. McGee & M. Ponce de Leon (Eds.), Robotic Fabrication in Architecture, Art and Design (pp. 373–380). Cham: Springer. https://doi.org/10.1007/978-3-319-04663-1_26.
17. Taheri, H., & Zhao, C. X. (2020). Omnidirectional mobile robots, mechanisms and navigation
approaches. Mechanism and Machine Theory, 153(103958), Elsevier. https://doi.org/10.1016/
j.mechmachtheory.2020.103958.
18. Slimane Tich Tich, A., Inel, F., & Carbone, G. (2022). Realization and control of a mecanum
wheeled robot based on a kinematic model. In V. Niola, A. Gasparetto, G. Quaglia & G. Carbone
(Eds.), Advances in Italian Mechanism Science, IFToMM Italy, Mechanisms and Machine
Science (Vol. 122). Cham: Springer. https://doi.org/10.1007/978-3-031-10776-4_77.
19. Thai, N. H., Ly, T. T. K., & Dzung, L. Q. (2022). Trajectory tracking control for mecanum wheel
mobile robot by time-varying parameter PID controller. Bulletin of Electrical Engineering and
Informatics, 11(4), 1902–1910. https://doi.org/10.11591/eei.v11i4.3712
20. Han K., Kim H., & Lee J. S. (2010). The sources of position errors of omni-directional mobile
robot with mecanum wheel. In IEEE International Conferences on Systems, Man and Cyber-
netics, October 10–13, Istanbul, Turkey (pp. 581–586). https://doi.org/10.1109/ICSMC.2010.
5642009.
21. Palacín J., Rubies E., & Clotet E. (2022). Systematic odometry error evaluation and correction
in a human-sized three-wheeled omnidirectional mobile robot using flower-shaped calibration
trajectories. Applied Sciences, 12(5), 2606, MDPI. https://doi.org/10.3390/app12052606.
22. Cavacece, M., Lanni, C., & Figliolini, G. (2022). Mechatronic design and experimentation of a mecanum four wheeled mobile robot. In V. Niola, A. Gasparetto, G. Quaglia & G. Carbone (Eds.), Advances in Italian Mechanism Science, IFToMM Italy 2022, Mechanisms and Machine Science (Vol. 122). Cham: Springer. https://doi.org/10.1007/978-3-031-10776-4_93.
23. Lin, P., Liu, D., Yang, D., Zou, Q., Du, Y., & Cong, M. (2019). Calibration for odometry
of omnidirectional mobile robots based on kinematic correction. In IEEE 14th International
Conferences on Computer Science & Education, August 19–21, Toronto, Canada (pp. 139–
144). https://doi.org/10.1109/iccse.2019.8845402.
24. Maddahi, Y., Maddahi, A., & Sepehri, N. (2013). Calibration of omnidirectional wheeled
mobile robots: Method and experiments. In Robotica (Vol. 31, pp. 969–980). Cambridge Uni-
versity Press. https://doi.org/10.1017/S0263574713000210.
25. Ma’arif, I. A., Raharja, N. M., Supangkat, G., Arofiati, F., Sekhar, R., & Rijalusalam, D.U.
(2021). PID-based with odometry for trajectory tracking control on four-wheel omnidirec-
tional Covid-19 aromatherapy robot. Emerging Science Journal, 5. SI “COVID-19: Emerging
Research”. https://doi.org/10.28991/esj-2021-SPER-13.
26. Li, Y., Ge, S., Dai, S., Zhao, L., Yan, X., Zheng, Y., & Shi, Y. (2020). Kinematic modeling of a
combined system of multiple mecanum-wheeled robots with velocity compensation. Sensors,
20(75), MDPI. https://doi.org/10.3390/s20010075.
27. Savaee E., & Hanzaki A. R. (2021). A new algorithm for calibration of an omni-directional
wheeled mobile robot based on effective kinematic parameters estimation. Journal of Intelligent
& Robotic Systems, 101(28), Springer. https://doi.org/10.1007/s10846-020-01296-9.
28. Khoygani, M. R. R., Ghasemi, R., & Ghayoomi, P. (2021). Robust observer-based control of
nonlinear multi-omnidirectional wheeled robot systems via high order sliding-mode consensus
protocol. International Journal of Automation and Computing, 18, 787–801, Springer, https://
doi.org/10.1007/s11633-020-1254-z.
29. Almasri, E., & Uyguroğlu, M. K. (2021). Modeling and trajectory planning optimization for the
symmetrical multiwheeled omnidirectional mobile robot. Symmetry, 13(1033), MDPI. https://
doi.org/10.3390/sym13061033.
30. Rijalusalam, D.U., & Iswanto, I. (2021). Implementation kinematics modeling and odometry
of four omni wheel mobile robot on the trajectory planning and motion control based micro-
controller. Journal of Robotics and Control, 2(5). https://doi.org/10.18196/jrc.25121.
31. Alshorman, A. M., Alshorman, O., Irfan, M., Glowacz, A., Muhammad, F., & Caesarendra, W.
(2020). Fuzzy-Based fault-tolerant control for omnidirectional mobile robot. Machines, 8(3),
55, MDPI. https://doi.org/10.3390/machines8030055.
32. Szeremeta, M., & Szuster, M. (2022). Neural tracking control of a four-wheeled mobile
robot with mecanum wheels. Applied Science, 2022(12), 5322, MDPI. https://doi.org/10.3390/
app12115322.
33. Vlantis, P., Bechlioulis, C. P., Karras, G., Fourlas, G., & Kyriakopoulos, K. J. (2016). Fault
tolerant control for omni-directional mobile platforms with 4 mecanum wheels. In IEEE Inter-
national Conferences on Robotics and Automation (pp. 2394–2400). https://doi.org/10.1109/
icra.2016.7487389.
34. Wu, X., & Huang, Y. (2021). Adaptive fractional-order non-singular terminal sliding mode
control based on fuzzy wavelet neural networks for omnidirectional mobile robot manipulator.
ISA Transactions. https://doi.org/10.1016/j.isatra.2021.03.035
35. Pizá, R., Carbonell, V., Casanova, Á., Cuenca, J. J., & Salt L. (2022). Nonuniform dual-rate
extended kalman-filter-based sensor fusion for path-following control of a holonomic mobile
robot with four mecanum wheels. Applied Science, 2022(12), 3560, MDPI. https://doi.org/10.
3390/app12073560.