Procedia CIRP 130 (2024) 348–354
AI-driven force-torque control strategies to further automate flexible high-precision, contact-intensive assemblies
Yunqi Gua*, Ruth Maria Ottoa*, Martin Naumannb, Leutrim Gjakovab, Rico Löserb, Martin Dixb,c
a University of Applied Sciences Munich, Lothstr. 34, 80335 Munich, Germany
b Fraunhofer Institute for Machine Tools and Forming Technology IWU, 09126 Chemnitz, Germany
c Chemnitz University of Technology, 09111 Chemnitz, Germany
* Corresponding author. Tel.: +49-89-1265-3613; fax: +49-89-1265-1603. E-mail address: [email protected]; [email protected]
Abstract
Robotic assembly processes are integral to industrial manufacturing, traditionally relying on pre-programmed sequences that exhibit limited adaptability to uncertain conditions. Recent advancements propose AI-enhanced vision systems as viable solutions to mitigate uncertainty in automated tasks. Despite the potential of vision, contact-intensive, high-precision assembly processes need customized force-torque feedback strategies to perform the task. Parametrization for the setup and adaptation of force-torque processes is time-consuming, but decisive for quality. This work outlines an AI-driven force-torque control strategy for a peg-in-hole assembly task, which employs an object detection algorithm to grasp the workpiece and determine a flexible assembly position, a force-torque-informed DNN for error correction in terms of position, and a digital twin to monitor the process and collect virtual data.
© 2024 The Authors. Published by Elsevier B.V.
This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0)
Peer-review under responsibility of the scientific committee of the 57th CIRP Conference on Manufacturing Systems 2024 (CMS 2024)
Keywords: Robotic Assembly; Force-torque control; Deep-Neural-Network; Cognitive Robotics.
Object detection using neural networks can be realized in two main phases: object localization and object classification [4]. The classification part determines the type of object, while the localization part determines its position and size within the image. This is done by training a deep neural network on a large dataset of labelled images. The network learns to recognize the features of different objects by analyzing the patterns in the input data. The size and position of the bounding box can be determined using various techniques, such as sliding-window or region-proposal methods (a minimal sliding-window sketch follows the list below). In recent years, various state-of-the-art CNN architectures have emerged:
• 1998 LeNet [5],
• 2012 AlexNet [5,6],
• 2013 ZFNet [7],
• 2014 VGG Net [4] and Inception V3 [8],
• 2015 ResNet [9],
• 2016 FractalNet [10],
• 2017 DenseNet [10], MobileNet [11] and Xception [12].
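As a minimal illustration of the sliding-window technique mentioned above, the following Python sketch scans an image with a fixed-size window and keeps windows whose classifier score exceeds a threshold. The classifier score_patch is a hypothetical stub standing in for a trained network; window size, stride and threshold are assumptions for illustration.

# Illustrative sliding-window localization; score_patch is a stub
# standing in for a trained classifier head (an assumption here).
import numpy as np

def score_patch(patch: np.ndarray) -> float:
    # Stub: returns an object-presence score in [0, 1].
    return float(patch.mean()) / 255.0

def sliding_window_detect(image: np.ndarray, win: int = 64,
                          stride: int = 32, thresh: float = 0.5):
    """Slide a fixed-size window over the image and keep candidate
    boxes (x, y, size, score) whose score exceeds the threshold."""
    boxes = []
    h, w = image.shape[:2]
    for y in range(0, h - win + 1, stride):
        for x in range(0, w - win + 1, stride):
            s = score_patch(image[y:y + win, x:x + win])
            if s > thresh:
                boxes.append((x, y, win, s))
    return boxes

demo = np.zeros((256, 256), dtype=np.uint8)
demo[96:160, 96:160] = 255          # a bright square as the "object"
print(sliding_window_detect(demo)[:3])

In practice, region-proposal methods replace this exhaustive scan with a learned proposal stage, which is exactly the distinction drawn next.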
A distinction between one-stage and two-stage object detection methods is made at this point. One-stage object detection methods perform object detection in a single step, directly predicting the bounding boxes and class scores for all potential objects in the image without the need for an intermediate region proposal step. One-stage detectors include YOLO (You Only Look Once), EfficientDet, RetinaNet and SSD (Single Shot Detector). Two-stage methods use a two-step process to perform object detection within a frame. A region proposal network (RPN) creates bounding boxes in the first step, which are further refined to generate the final detections in the second stage. While two-stage detectors offer higher accuracy, one-stage detectors are faster, primarily because they avoid the need for an additional region proposal step [13].
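The following is a minimal sketch of single-step inference with a pretrained one-stage detector, here torchvision's RetinaNet; this is an illustrative stand-in, not the detector used in this work, and the input tensor is a placeholder for a camera frame.

# Hedged example: one forward pass of a pretrained one-stage detector
# (torchvision RetinaNet) yields boxes, labels and scores directly,
# without an intermediate region-proposal stage.
import torch
from torchvision.models.detection import (retinanet_resnet50_fpn,
                                          RetinaNet_ResNet50_FPN_Weights)

weights = RetinaNet_ResNet50_FPN_Weights.DEFAULT
model = retinanet_resnet50_fpn(weights=weights).eval()

image = torch.rand(3, 512, 512)      # placeholder for a camera frame
with torch.no_grad():
    detections = model([image])[0]   # dict with boxes, labels, scores

keep = detections["scores"] > 0.5    # simple confidence filter
print(detections["boxes"][keep], detections["labels"][keep])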
From today's perspective, frugal networks are becoming increasingly important because of their low demands on data, their low energy consumption and their short inference times. These lightweight networks, with a small model size due to depth-wise separable convolutions and a low computational cost, are suitable for the development of mobile and embedded vision [14]. To select a suitable network in advance, important factors including model complexity [15,16], computational demands [17], model size [18], training time [16,18] and processing speed [18] need to be weighed against the project-relevant KPIs.
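A minimal PyTorch sketch of the depth-wise separable convolution underlying such lightweight networks (e.g. MobileNet [11]) follows; channel counts are illustrative assumptions.

# Depthwise-separable convolution: a per-channel 3x3 depthwise step
# followed by a 1x1 pointwise step that mixes channels.
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    def __init__(self, c_in: int, c_out: int):
        super().__init__()
        # Depthwise: one 3x3 filter per input channel (groups=c_in).
        self.depthwise = nn.Conv2d(c_in, c_in, 3, padding=1, groups=c_in)
        # Pointwise: 1x1 convolution mixes channels.
        self.pointwise = nn.Conv2d(c_in, c_out, 1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

x = torch.rand(1, 32, 64, 64)
print(DepthwiseSeparableConv(32, 64)(x).shape)  # [1, 64, 64, 64]

For 32-to-64 channels with 3x3 kernels, this factorization needs 3*3*32 + 32*64 = 2336 weights instead of 3*3*32*64 = 18432 for a standard convolution, roughly an eight-fold reduction in this layer.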
Robotic force control can be roughly summarized into impedance control, hybrid position/force control, adaptive control and neural-network-based intelligent control [19]. The neural network approach has been researched due to its adaptive, self-learning ability and its capability of approximating arbitrarily complex nonlinear functions.
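For orientation, the following is an illustrative Cartesian impedance law, a virtual spring-damper mapping pose and velocity errors to a commanded TCP force; the gains and states are placeholder assumptions, not the controller used in this work.

# Illustrative impedance law: F = K (x_des - x) + D (v_des - v).
import numpy as np

K = np.diag([500.0, 500.0, 300.0])   # stiffness [N/m], per axis
D = np.diag([40.0, 40.0, 25.0])      # damping [Ns/m], per axis

def impedance_force(x_des, x, v_des, v):
    """Commanded TCP force from pose and velocity errors."""
    return K @ (x_des - x) + D @ (v_des - v)

# Peg 1 mm off target in x, at rest: the law pushes it back with 0.5 N.
print(impedance_force(np.array([0.001, 0.0, 0.0]), np.zeros(3),
                      np.zeros(3), np.zeros(3)))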
The active compliance control strategy for peg-in-hole assembly can be categorized into contact model-based and contact model-free approaches [20]. In model-based methods the mechanical model of the assembly process is analyzed, while in model-free methods machine learning algorithms such as neural networks and reinforcement learning are applied, so that the process is learned by imitation or directly from the environment [21]. The control strategy can also be varied depending on the type of sensors used. Commonly used information gathered from different sensors comprises visual, force-torque and joint/position information. Each has its respective advantages and disadvantages, so hybrid usage of sensors is often considered in practical applications.
Peg-in-hole assembly can be divided into two main phases [22,23]: the search phase and the insertion phase. In the search phase, the peg is placed at a random position within the clearance region; the difference between the peg and the hole center is defined as the position error (see the toy sketch below). The insertion phase comprises an adjustment of the peg orientation for smooth insertion.
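A toy sketch of the search-phase geometry, under the assumption of a circular clearance region with illustrative dimensions:

# Search phase: peg starts at a random offset inside the clearance
# disc; the position error is its offset from the hole centre.
import numpy as np

rng = np.random.default_rng(0)
clearance = 0.5                              # admissible radius [mm]
hole_center = np.array([100.0, 200.0])       # hole position [mm]

r = clearance * np.sqrt(rng.uniform())       # uniform over the disc
phi = rng.uniform(0.0, 2.0 * np.pi)
peg_pos = hole_center + r * np.array([np.cos(phi), np.sin(phi)])

position_error = peg_pos - hole_center       # what the controller corrects
print(position_error, np.linalg.norm(position_error) <= clearance)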
In [24], an experimental force-torque dataset for multi-shape peg insertions was provided. Multi-Layer Perceptrons (MLPs) with different inputs for different contact situations and shapes were trained and tested, outputting the translation and rotation actions of the peg. The accuracy reached 81.01% for the position prediction of the round peg.
In [25], DNNs for a dual-arm peg-in-hole assembly were designed. With the force and torque data from a double FT sensor as inputs, the first DNN predicts the orientation motion of the peg and the second DNN classifies the translation motion of the peg. The accuracy reached 98%, and the success ratio on the demonstrator was 98% with an average of 14 steps.
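To make the input-output relation of such networks concrete, the following is a minimal sketch loosely following [24,25]; the layer sizes and the discrete action set are assumptions for illustration, not the architectures of the cited works.

# Hedged sketch: an MLP mapping a 6D force-torque reading to one of
# a few discrete corrective motions (action set assumed).
import torch
import torch.nn as nn

ACTIONS = ["+x", "-x", "+y", "-y", "rot+", "rot-"]

mlp = nn.Sequential(
    nn.Linear(6, 64), nn.ReLU(),     # Fx, Fy, Fz, Tx, Ty, Tz
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, len(ACTIONS)),     # logits over corrective actions
)

ft_sample = torch.randn(1, 6)        # one force-torque measurement
print(ACTIONS[mlp(ft_sample).argmax(dim=1).item()])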
Besides DNNs, reinforcement learning is also widely applied due to its adaptive ability. Value-based model-free methods such as DQN [26], Double DQN [27] and Dueling DQN [28] are often researched. Since value-based learning can only output discrete actions, policy-based model-free methods can be considered for learning continuous actions, including algorithms such as DDPG [29], TD3 [30] and SAC [31]. Another alternative is model-based learning, where the model of the environment is obtained first, for example with iLQR/iLQG [32].
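The discrete/continuous distinction can be seen in the output heads alone; both networks below are simplified illustrative assumptions, not the cited algorithms in full.

# A DQN-style head scores a finite action set; a SAC/DDPG-style actor
# outputs a bounded continuous action (e.g. a velocity command).
import torch
import torch.nn as nn

obs = torch.randn(1, 6)                      # force-torque observation

q_net = nn.Sequential(nn.Linear(6, 64), nn.ReLU(), nn.Linear(64, 4))
discrete_action = q_net(obs).argmax(dim=1)   # greedy over 4 motions

actor = nn.Sequential(nn.Linear(6, 64), nn.ReLU(), nn.Linear(64, 3),
                      nn.Tanh())             # bounded in [-1, 1]
continuous_action = actor(obs)               # e.g. x/y/z velocity cmd
print(discrete_action.item(), continuous_action.detach().numpy())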
3. Methodology

3.1. Object Detection

For object detection, a matrix camera with a resolution of 2448 px × 2048 px (width × height) attached to the robot flange takes a single image of the objects, which can be freely positioned in its field of view (306 mm × 256 mm). This results in a pixel size of 0.125 mm, which means that a detection accuracy of ±0.25 mm can be achieved.
The image recognition algorithm, shown in Fig. 1, uses the XEIDANA® framework developed by the Fraunhofer Institute for Machine Tools and Forming Technology IWU, Chemnitz, Germany [33]. "The framework is a visual programming system and runtime for complex data processing flows that enable optimal utilization of multi-core systems. The data flow is handled as networks of interconnected modules with encapsulated data processing operations." [33] In addition, the modules are connected as a daisy chain and can be processed in the so-called pipeline mode. "This means that while one module is still calculating the output for an incoming data packet, its predecessor module can already process a new data packet at the same time." [34]
Fig. 1. Developed image recognition network in the XEIDANA® framework consisting of: a) image acquisition, b) hardware requirements, c) edge detection algorithm, d) visualization, e) localization of the object.
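To make the pipeline mode described above tangible, the following is a minimal sketch of daisy-chained modules exchanging data packets through queues; it mimics the behaviour quoted from [34] and is not XEIDANA's actual API.

# Each module runs in its own thread, so while the second stage is
# still processing a packet, the first stage already takes the next.
import threading, queue

def module(name, fn, q_in, q_out):
    def run():
        while True:
            packet = q_in.get()
            if packet is None:               # shutdown marker
                q_out.put(None)
                return
            q_out.put(fn(packet))            # process and forward
    threading.Thread(target=run, name=name, daemon=True).start()

q0, q1, q2 = queue.Queue(), queue.Queue(), queue.Queue()
module("acquire", lambda p: p * 2, q0, q1)   # stage 1
module("detect", lambda p: p + 1, q1, q2)    # stage 2

for packet in [1, 2, 3, None]:
    q0.put(packet)
while (result := q2.get()) is not None:
    print(result)                            # 3, 5, 7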
Fig. 2. Search and insertion phases.
Fig. 3. (a) Class definition of the error area; (b) time-series inputs for CNN classification.
Fig. 5. (a) Data points; (b) force and torque at the TCP, internal and external joint torques at the top-right point of (a).
Acknowledgements

The work is part of the French-German project GreenBotAI - Frugal and adaptive AI for flexible industrial Robotics, which is supported by the German Federal Ministry for Economic Affairs and Climate Action (BMWK) on the basis of a decision by the German Bundestag.

References

[1] Rothganger, F., Lazebnik, S., Schmid, C., Ponce, J., 2003. 3D object modeling and recognition using affine-invariant patches and multi-view spatial constraints, in: 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2003), Madison, WI, USA. 18-20 June 2003. IEEE Comput. Soc, II-272-7.
[2] Sackewitz, M. (Ed.), 2016. Leitfaden zur Inspektion und Charakterisierung von Oberflächen mit Bildverarbeitung. Fraunhofer Verlag, Stuttgart, 114 pp.
[3] M.phil Scholar, A., 2014. Object Detection In Image Processing Using Edge Detection Techniques. IOSRJEN 4 (3), 10–13.
[4] Oksuz, K., Cam, B.C., Akbas, E., Kalkan, S., 2020. A Ranking-based, Balanced Loss Function Unifying Classification and Localisation in Object Detection. http://arxiv.org/pdf/2009.13592v4.
[5] Dhillon, A., Verma, G.K., 2020. Convolutional neural network: a review of models, methodologies and applications to object detection. Prog Artif Intell 9 (2), 85–112.
[6] Patel, S., 2020. A comprehensive analysis of Convolutional Neural Network models. International Journal of Advanced Science and Technology 29 (4), 771–777.
[7] Sultana, F., Sufian, A., Dutta, P., 2018. Advancements in Image Classification using Convolutional Neural Network abs 1312 4400, 122–129.
[8] Szegedy, C., Ioffe, S., Vanhoucke, V., Alemi, A., 2016. Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning. http://arxiv.org/pdf/1602.07261v2.
[9] Lee, C., Kim, H.J., Oh, K.W., 2016. Comparison of faster R-CNN models for object detection, in: 2016 16th International Conference on Control, Automation and Systems (ICCAS), Gyeongju, South Korea. 16.10.2016 - 19.10.2016. IEEE, pp. 107–110.
[10] Yang, Y., Zhang, L., Du, M., Bo, J., Liu, H., Ren, L., Li, X., Deen, M.J., 2021. A comparative analysis of eleven neural networks architectures for small datasets of lung images of COVID-19 patients toward improved clinical decisions. Computers in Biology and Medicine 139, 104887.
[11] Sanchez, S.A., Romero, H.J., Morales, A.D., 2020. A review: Comparison of performance metrics of pretrained models for object detection using the TensorFlow framework. IOP Conf. Ser.: Mater. Sci. Eng. 844, 12024.
[12] Zhao, P., Li, C., Rahaman, M.M., Xu, H., Yang, H., Sun, H., Jiang, T., Grzegorzek, M., 2022. A Comparative Study of Deep Learning Classification Methods on a Small Environmental Microorganism Image Dataset (EMDS-6): From Convolutional Neural Networks to Visual Transformers. Frontiers in Microbiology 13, 792166.
[13] Adarsh, P., Rathi, P., Kumar, M., 2020. YOLO v3-Tiny: Object Detection and Recognition using one stage improved model, in: 2020 6th International Conference on Advanced Computing and Communication Systems (ICACCS), Coimbatore, India. 06.03.2020 - 07.03.2020. IEEE, pp. 687–694.
[14] Wang, C.-H., Huang, K.-Y., Yao, Y., Chen, J.-C., Shuai, H.-H., Cheng, W.-H., 2024. Lightweight Deep Learning: An Overview. IEEE Consumer Electron. Mag., 1–12.
[15] Yadav, N., Binay, U., 2017. Comparative Study of Object Detection Algorithms.
[16] Chiu, Y.-C., Tsai, C.-Y., Ruan, M.-D., Shen, G.-Y., Lee, T.-T., 2020. Mobilenet-SSDv2: An Improved Object Detection Model for Embedded Systems, in: 2020 International Conference on System Science and Engineering (ICSSE), Kagawa, Japan. 31.08.2020 - 03.09.2020. IEEE, pp. 1–5.
[17] Bouguettaya, A., Kechida, A., Taberkit, A.M., 2019. A survey on lightweight CNN-based object detection algorithms for platforms with limited computational resources. International Journal of Informatics and Applied Mathematics 2 (2), 28–44.
[18] Park, J., Kim, D.H., Shin, Y.S., Lee, S., 2017. A comparison of convolutional object detectors for real-time drone tracking using a PTZ camera, in: 2017 17th International Conference on Control, Automation and Systems (ICCAS), Jeju. 18.10.2017 - 21.10.2017. IEEE, pp. 696–699.
[19] Weng, L., Tian, L., Hu, K., Zang, Q., Chen, X., 2020. Overview of Robot Force Control Algorithms Based on Neural Network, in: 2020 Chinese Automation Congress (CAC), Shanghai, China. 06.11.2020 - 08.11.2020. IEEE, pp. 6800–6803.
[20] Xu, J., Hou, Z., Liu, Z., Qiao, H., 2019. Compare Contact Model-based Control and Contact Model-free Learning: A Survey of Robotic Peg-in-hole Assembly Strategies. http://arxiv.org/pdf/1904.05240v1.
[21] Shen, L., Su, J., Zhang, X., 2023. Review on Peg-in-Hole Insertion Technology Based on Reinforcement Learning, in: 2023 China Automation Congress (CAC), Chongqing, China. 17.11.2023 - 19.11.2023. IEEE, pp. 6688–6695.
[22] Inoue, T., Magistris, G. de, Munawar, A., Yokoya, T., Tachibana, R., 2017. Deep reinforcement learning for high precision assembly tasks, in: 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vancouver, BC. 24.09.2017 - 28.09.2017. IEEE, pp. 819–825.
[23] Sharma, K., Shirwalkar, V., Pal, P.K., 2013. Intelligent and environment-independent Peg-In-Hole search strategies, in: 2013 International Conference on Control, Automation, Robotics and Embedded Systems (CARE), Jabalpur, India. 16.12.2013 - 18.12.2013. IEEE, pp. 1–6.
[24] Magistris, G. de, Munawar, A., Pham, T.-H., Inoue, T., Vinayavekhin, P., Tachibana, R., 2018. Experimental Force-Torque Dataset for Robot Learning of Multi-Shape Insertion.
[25] Ortega-Aranda, D., Jimenez-Vielma, J.F., Saha, B.N., Lopez-Juarez, I., 2021. Dual-Arm Peg-in-Hole Assembly Using DNN with Double Force/Torque Sensor. Applied Sciences 11 (15), 6970.
[26] Gullapalli, V., Grupen, R.A., Barto, A.G., 1992. Learning reactive admittance control, in: 1992 IEEE International Conference on Robotics and Automation, Nice, France. 12-14 May 1992. IEEE, pp. 1475–1480.
[27] van Hasselt, H., Guez, A., Silver, D., 2015. Deep Reinforcement Learning with Double Q-learning. http://arxiv.org/pdf/1509.06461.
[28] Wang, Z., Schaul, T., Hessel, M., van Hasselt, H., Lanctot, M., Freitas, N. de, 2015. Dueling Network Architectures for Deep Reinforcement Learning, 15 pp. http://arxiv.org/pdf/1511.06581.
[29] Ren, T., Dong, Y., Wu, D., Chen, K., 2018. Learning-Based Variable Compliance Control for Robotic Assembly. Journal of Mechanisms and Robotics 10 (6).
[30] Fujimoto, S., van Hoof, H., Meger, D., 2018. Addressing Function Approximation Error in Actor-Critic Methods. http://arxiv.org/pdf/1802.09477.
[31] Haarnoja, T., Zhou, A., Abbeel, P., Levine, S., 2018. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor. http://arxiv.org/pdf/1801.01290.
[32] Tassa, Y., Erez, T., Todorov, E., 2012. Synthesis and stabilization of complex behaviors through online trajectory optimization, in: 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vilamoura-Algarve, Portugal. 7-12 Oct. 2012. IEEE, Piscataway, NJ, pp. 4906–4913.
[33] Putz, M., Wiener, T., Pierer, A., Hoffmann, M., 2018. A multi-sensor approach for failure identification during production enabled by parallel data monitoring. CIRP Annals 67 (1), 491–494.
[34] Pierer, A., Hauser, M., Hoffmann, M., Naumann, M., Wiener, T., León, M.A.L. de, Mende, M., Koziorek, J., Dix, M., 2022. Inline Quality Monitoring of Reverse Extruded Aluminum Parts with Cathodic Dip-Paint Coating (KTL). Sensors 22 (24).
[35] Juliani, A., Berges, V.-P., Teng, E., Cohen, A., Harper, J., Elion, C., Goy, C., Gao, Y., Henry, H., Mattar, M., Lange, D., 2018. Unity: A General Platform for Intelligent Agents. http://arxiv.org/pdf/1809.02627v2.
[36] Safeea, M., Neto, P., 2023. Model-based hardware in the loop control of collaborative robots: Simulink and Python based interfaces. International Journal of Computer Integrated Manufacturing, 1–13.
[37] Macenski, S., Foote, T., Gerkey, B., Lalancette, C., Woodall, W., 2022. Robot Operating System 2: Design, architecture, and uses in the wild. Science Robotics 7 (66), eabm6074.