Reinforcement Learning in Social Robots: A Comprehensive Study

Mentored By: Dr. Manju Bhardwaj, Dr. Shweta Sankhwar
Presented By: Aaditri Saraswat, Bhavya Singh
Social Robots

A social robot is a type of robot designed to interact and communicate with humans in a social way. Some key features and characteristics of social robots include:

Human-like appearance

Emotional intelligence

Natural language processing (NLP)

Social skills
Reinforcement Learning

● RL is an area of machine learning in which an agent learns through a "trial and error" methodology.
● The agent makes decisions in a given environment so as to maximise a reward signal.
● The robot learns by interacting with its environment, observing the outcomes, and receiving a reward (positive or negative).
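A minimal sketch of this trial-and-error loop is shown below. It uses the Gymnasium toy task FrozenLake and random actions purely to illustrate the act-observe-reward cycle; this is an assumption for demonstration, not the Gazebo/ROS setup used in this study.

```python
# Illustrative agent-environment loop (Gymnasium's FrozenLake, random policy).
import gymnasium as gym

env = gym.make("FrozenLake-v1")
state, info = env.reset(seed=0)          # start a new episode
total_reward = 0.0

for t in range(100):
    action = env.action_space.sample()   # "trial": pick an action (random here)
    state, reward, terminated, truncated, info = env.step(action)  # observe outcome
    total_reward += reward               # accumulate the reward signal ("error" feedback)
    if terminated or truncated:          # episode ends on goal, failure, or time limit
        break

print("episode return:", total_reward)
```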
Markov Decision Process

The agent interacts with an environment, which changes its state and yields a reward for the action. Then, another round begins. Mathematically, this cycle is described by a Markov Decision Process (MDP). Such an MDP consists of five components: S (states), A (actions), R (reward function), P (transition probabilities), and p₀ (initial state distribution).

Fig 1: Markov decision process (reinforcement learning model)
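Written out, the five components form the tuple below, together with the discounted-return objective the agent maximises. The discount factor γ is standard in this formulation, although it is not listed among the five components on the slide.

```latex
% MDP tuple and the discounted-return objective (standard formulation)
\[
  \mathcal{M} = (\mathcal{S},\, \mathcal{A},\, R,\, P,\, p_0),
  \qquad s_0 \sim p_0, \quad s_{t+1} \sim P(\cdot \mid s_t, a_t), \quad r_t = R(s_t, a_t)
\]
\[
  \max_{\pi}\; \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}\, r_t \right],
  \qquad 0 \le \gamma < 1
\]
```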
Objectives

➢ Identification of the types of reinforcement learning algorithms and mechanisms used in social robot navigation

➢ Comparison of reinforcement learning algorithms in social robot navigation
Workflow

1. Creating the environment using the Gazebo Simulator
2. Designing a mobile robot/agent using ROS
3. Setting a reward function in Environment 1 for the different reinforcement learning algorithms (a sketch of such a reward function follows below)
4. Training the agent
5. Testing the learned policy in both Environment 1 and Environment 2 for the different reinforcement learning algorithms
6. Evaluating the performance based on total reward and number of successful episodes for each reinforcement learning algorithm
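A reward function of the kind referred to in step 3 might look like the following sketch for goal-directed navigation with obstacle avoidance. The function name, thresholds, and constants are illustrative assumptions; the exact shaping used in this study is not specified.

```python
# Hypothetical navigation reward; all constants below are illustrative only.
def navigation_reward(dist_to_goal, prev_dist_to_goal, min_obstacle_dist):
    if dist_to_goal < 0.2:           # reached the goal region
        return 100.0
    if min_obstacle_dist < 0.15:     # collision (or near-collision) with an obstacle
        return -100.0
    # Dense shaping: reward progress toward the goal, small penalty per step.
    return 10.0 * (prev_dist_to_goal - dist_to_goal) - 0.1
```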
Virtual Environments

A virtual environment is defined by the arrangement of different obstacles.

Environment 1 (yields training accuracy)
Environment 2 (yields test accuracy)
Algorithms Used

❏ Q-Learning
❏ SARSA
❏ DQN (Deep Q-Network)
❏ DDQN (Double Deep Q-Network)
Comparative Analysis

Q-Learning
Accuracy: Q-learning is model-free and can be accurate if the state-action space is well defined and discretized.
Reward: Q-learning relies on explicit rewards and learns optimal Q-values through exploration. It may struggle with complex continuous state spaces.
Action/State: Q-learning directly updates Q-values based on the observed reward and the maximum Q-value of the next state.

SARSA
Accuracy: SARSA is an on-policy algorithm that updates the Q-value based on the actual action taken and the next state's action.
Reward: Similar to Q-learning, SARSA depends on explicit rewards and exploration to learn optimal Q-values.
Action/State: SARSA updates Q-values based on the observed reward, the next state's action, and that action's Q-value.
Comparative Analysis Contd.

DQN
Accuracy: DQN utilizes a deep neural network to approximate Q-values, which can lead to more accurate estimations compared to traditional Q-learning.
Reward: The reward mechanism depends on the specific task and environment. DQN can learn to optimize rewards effectively, but careful reward design is crucial for good performance.
Action/State: DQN selects actions based on the highest predicted Q-value from the neural network output.

DDQN
Accuracy: DDQN addresses the overestimation bias present in DQN by decoupling action selection (online network) from action evaluation (target network), potentially leading to more accurate Q-value estimations.
Reward: Similar to DQN, DDQN's reward handling depends on proper reward design for the task.
Action/State: DDQN acts on the highest predicted Q-value from the online network; in the update target, the online network selects the next action and the target network evaluates it.
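The difference between the DQN and DDQN rows lies only in how the bootstrap target is formed. The sketch below uses PyTorch-style tensors; the names online_net and target_net, the batch shapes, and the discount value are assumptions for illustration, not the study's implementation.

```python
import torch

# online_net and target_net are assumed torch.nn.Module Q-networks mapping a
# batch of states to a (batch, n_actions) tensor of Q-values.
# `done` is a float {0, 1} mask marking terminal transitions.

def dqn_target(reward, next_state, done, target_net, gamma=0.99):
    # DQN: the target network both selects and evaluates the next action,
    # which tends to overestimate Q-values.
    next_q = target_net(next_state).max(dim=1).values
    return reward + gamma * (1.0 - done) * next_q

def ddqn_target(reward, next_state, done, online_net, target_net, gamma=0.99):
    # DDQN: the online network selects the action, the target network evaluates
    # it, which reduces the overestimation bias.
    best_action = online_net(next_state).argmax(dim=1, keepdim=True)
    next_q = target_net(next_state).gather(1, best_action).squeeze(1)
    return reward + gamma * (1.0 - done) * next_q
```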
Results

S.No   Algorithm     Training Accuracy (%)   Test Accuracy (%)
1      Q-Learning    68                      67
2      SARSA         77                      75
3      DQN           87                      85
4      DDQN          92                      91
Conclusion

In this study we evaluated the performance of different reinforcement learning algorithms based on their reward function and total number of successful episodes. The results are as follows:

1. DDQN achieved the highest accuracy: 92% in Env1 and 91% in Env2.

2. DQN achieved accuracies of 87% in Env1 and 85% in Env2, followed by SARSA and Q-learning.

3. DDQN accumulated the highest reward for the same number of action states in both Env1 and Env2.