Reinforcement Learning
[Figure: attributes of social robots (human-like appearance, emotional intelligence, social skills), with reinforcement learning applied to navigation]
Workflow
[Workflow diagram: the agent is trained in Environment 1, which yields training accuracy, and evaluated in Environment 2, which yields test accuracy]
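As a sketch of this train/evaluate split, the loop below trains an agent in one environment and measures its success rate in the other. The Gym-style `reset`/`step` interface and the `Agent` methods (`act`, `update`) are assumptions for illustration, not the project's actual API.

```python
# Sketch of the two-environment workflow: learn in Environment 1,
# then freeze the policy and evaluate in Environment 2.
# The env interface (reset/step/goal_reached) and the agent interface
# (act/update) are hypothetical placeholders.

def run_episode(env, agent, learn):
    """Run one episode; return True if the navigation goal was reached."""
    state = env.reset()
    done = False
    while not done:
        action = agent.act(state)
        next_state, reward, done = env.step(action)
        if learn:
            agent.update(state, action, reward, next_state, done)
        state = next_state
    return env.goal_reached()

def accuracy(env, agent, episodes, learn):
    """Fraction of episodes in which the agent reaches the goal."""
    successes = sum(run_episode(env, agent, learn) for _ in range(episodes))
    return successes / episodes

# Usage with concrete env/agent objects:
#   train_acc = accuracy(env1, agent, episodes=500, learn=True)   # Environment 1
#   test_acc  = accuracy(env2, agent, episodes=500, learn=False)  # Environment 2
```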
Algorithms Used
❏ Q-Learning
❏ SARSA
❏ DQN (Deep Q-Network)
❏ DDQN (Double Deep Q-Network)
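For the two tabular methods, a minimal sketch of the update rules follows. The hyperparameters (alpha, gamma, epsilon) and the epsilon-greedy policy are standard assumptions, not values reported in this work; the only difference between the two updates is the bootstrap term.

```python
import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON = 0.1, 0.99, 0.1  # assumed hyperparameters

Q = defaultdict(float)  # Q[(state, action)] -> estimated return

def epsilon_greedy(state, actions):
    """Explore with probability EPSILON, otherwise act greedily."""
    if random.random() < EPSILON:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

def q_learning_update(s, a, r, s_next, actions):
    # Off-policy: bootstrap from the best next action, regardless of
    # which action the behavior policy will actually take.
    target = r + GAMMA * max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += ALPHA * (target - Q[(s, a)])

def sarsa_update(s, a, r, s_next, a_next):
    # On-policy: bootstrap from the action the policy actually takes next.
    target = r + GAMMA * Q[(s_next, a_next)]
    Q[(s, a)] += ALPHA * (target - Q[(s, a)])
```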
Comparative Analysis
Algorithm: DQN
  Accuracy: Uses a deep neural network to approximate Q-values, which can yield more accurate estimates than tabular Q-learning.
  Reward: The reward mechanism depends on the specific task and environment; DQN can optimize rewards effectively, but careful reward design is crucial for good performance.
  Action selection: Selects the action with the highest Q-value predicted by the network.

Algorithm: DDQN
  Accuracy: Addresses DQN's overestimation bias by decoupling action selection from action evaluation with a separate target network, potentially yielding more accurate Q-value estimates.
  Reward: As with DQN, reward handling depends on proper reward design for the task.
  Action selection: Acts greedily on the online network's Q-values; when computing learning targets, the online network selects the next action and the target network evaluates it.
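The difference between the two learning targets is easiest to see in code. A minimal PyTorch sketch follows; the network size, discount factor, and batch shapes are illustrative assumptions, not the settings used in these experiments.

```python
import torch
import torch.nn as nn

GAMMA = 0.99  # assumed discount factor

def make_q_net(n_states=8, n_actions=4):
    # Small MLP mapping a state vector to one Q-value per action.
    return nn.Sequential(nn.Linear(n_states, 64), nn.ReLU(),
                         nn.Linear(64, n_actions))

online_net, target_net = make_q_net(), make_q_net()
target_net.load_state_dict(online_net.state_dict())

def dqn_target(reward, next_state, done):
    # DQN: the target network both selects and evaluates the next action
    # via a single max, which tends to overestimate Q-values.
    with torch.no_grad():
        q_next = target_net(next_state).max(dim=1).values
    return reward + GAMMA * (1 - done) * q_next

def ddqn_target(reward, next_state, done):
    # Double DQN: the online network selects the greedy next action,
    # the target network evaluates it, reducing overestimation bias.
    with torch.no_grad():
        best_action = online_net(next_state).argmax(dim=1, keepdim=True)
        q_next = target_net(next_state).gather(1, best_action).squeeze(1)
    return reward + GAMMA * (1 - done) * q_next

# Expected batch shapes: reward (B,), next_state (B, n_states),
# done (B,) as a 0/1 float tensor.
```

Splitting selection (online network) from evaluation (target network) is what removes the systematic upward bias introduced by the max operator in the DQN target.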
Results
Algorithm      Env1 Accuracy (%, training)   Env2 Accuracy (%, test)
Q-Learning     68                            67
SARSA          77                            75
DQN            87                            85
DDQN           92                            91
Conclusion
1. DDQN achieved the highest accuracy: 92% in Env1 and 91% in Env2.
2. DQN followed with 87% in Env1 and 85% in Env2, ahead of SARSA and Q-Learning.
3. DDQN also accumulated the highest reward for the same number of action states in both Env1 and Env2.