Lecture Reinforcement Learning

Uploaded by

A Rajagopal am18d301


Edge AI and Robotics Teaching Kit

Lecture 5.1
Reinforcement Learning
The Edge AI and Robotics Teaching Kit is licensed by NVIDIA and UMBC under the
Creative Commons Attribution-NonCommercial 4.0 International License.

Topics

• Concepts of Reinforcement Learning
• Reinforcement Learning Algorithms and Approaches
• Deep Learning
• States, Actions, Rewards
• Lab and Example Environments

Learning Objectives - Reinforcement Learning

Explain the concepts of Reinforcement Learning
Explain different reinforcement learning approaches
Describe DQN and how Q-Learning is leveraged
Gain hands-on experience training agents using sample environments in OpenAI Gym

Reinforcement Learning Concepts

Concepts

• Environment – attributes
• Agents
• State/Actions
• Learning – policies, functions, models
• Objective
• Rewards

© D. Poole and A. Mackworth 2019, Artificial Intelligence: Foundations of Computational Agents
Reinforcement Learning

[Figure: Agent and Environment]

Reinforcement Learning

What should an agent do given:

• Prior knowledge – possible states, baseline, possible actions
• Observations – current state, immediate reward
• Goal – find the set of actions that maximizes the expected cumulative discounted reward
We can train this agent by approximating its environment.
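
The cumulative discounted reward named in the goal can be made concrete with a few lines of Python. This is a minimal sketch, not taken from the lecture: the function name `discounted_return`, the example reward sequence, and the discount factor of 0.9 are all assumptions chosen for illustration.

```python
def discounted_return(rewards, gamma):
    """Return sum over k of gamma**k * rewards[k] for one episode."""
    total = 0.0
    # Work backwards through the episode: G_t = r_t + gamma * G_{t+1}.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical episode: no reward for three steps, then +1 at the goal.
episode_rewards = [0.0, 0.0, 0.0, 1.0]
print(discounted_return(episode_rewards, gamma=0.9))
```

With a discount factor below 1, a reward that arrives later contributes less to the total, so the agent is pushed toward reaching rewarding states sooner.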

Reinforcement Learning Loop

Figure 1.2 The reinforcement learning control loop

From Foundations of Deep Reinforcement Learning: Theory and Practice in Python by Laura Graesser and Wah Loon Keng (ISBN-13: 9780135172384)
Copyright © 2020 Pearson Education, Inc. All rights reserved.
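
The control loop can be sketched in plain Python without any RL library. Everything here is hypothetical and chosen for illustration: `CorridorEnv` is a toy environment with a `reset`/`step` interface loosely modelled on the style used by Gym, and the agent is just a function from state to action.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: walk along cells 0..4, goal is cell 4."""

    def __init__(self, length=5, max_steps=50):
        self.length = length
        self.max_steps = max_steps
        self.position = 0
        self.steps = 0

    def reset(self):
        """Start a new episode and return the initial state."""
        self.position = 0
        self.steps = 0
        return self.position

    def step(self, action):
        """Apply action (0 = left, 1 = right); return (state, reward, done)."""
        move = 1 if action == 1 else -1
        self.position = min(max(self.position + move, 0), self.length - 1)
        self.steps += 1
        reached_goal = self.position == self.length - 1
        reward = 1.0 if reached_goal else 0.0
        done = reached_goal or self.steps >= self.max_steps
        return self.position, reward, done


def run_episode(env, policy):
    """Run the loop once: the agent acts, the environment responds."""
    state = env.reset()
    total_reward, done = 0.0, False
    while not done:
        action = policy(state)                  # agent chooses from the current state
        state, reward, done = env.step(action)  # environment returns next state and reward
        total_reward += reward
    return total_reward, env.steps


random.seed(0)
random_policy = lambda state: random.choice([0, 1])
print(run_episode(CorridorEnv(), random_policy))
```

A learning agent would differ only in that `policy` changes between or during episodes based on the states and rewards it has seen.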
Rewards and Values

Figure 1.4 Rewards r and values V(s) for each state s in a simple grid-world environment. The value of a state is calculated from the rewards using Equation 1.10 with γ = 0.9 while using a policy π that always takes the shortest path to the goal state with r = +1.

From Foundations of Deep Reinforcement Learning: Theory and Practice in Python by Laura Graesser and Wah Loon Keng (ISBN-13: 9780135172384)
Copyright © 2020 Pearson Education, Inc. All rights reserved.
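
To see how state values follow from rewards under a fixed policy, consider a rough sketch on a one-dimensional corridor rather than the grid in the figure. The layout, the single +1 reward on entering the goal, and the function name `corridor_values` are assumptions for illustration; only the discount factor of 0.9 and the shortest-path policy come from the caption.

```python
def corridor_values(num_states, gamma):
    """V(s) for a corridor where the policy always steps toward the goal.

    States are 0..num_states-1 and the goal is the last state. The only
    reward is +1 for the step that enters the goal (an assumed layout).
    """
    goal = num_states - 1
    values = [0.0] * num_states  # V(goal) stays 0: the episode has ended there
    for state in range(goal - 1, -1, -1):
        reward = 1.0 if state + 1 == goal else 0.0
        # Deterministic policy and transitions: V(s) = r + gamma * V(s').
        values[state] = reward + gamma * values[state + 1]
    return values


for state, value in enumerate(corridor_values(num_states=5, gamma=0.9)):
    print(f"V({state}) = {value:.3f}")
```

States closer to the goal get higher values, because the +1 reward is discounted by one fewer factor of the discount rate for each step saved.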
Approaches

Reinforcement Learning Approaches

Figure 1.5 Deep reinforcement learning algorithm families

From Foundations of Deep Reinforcement Learning: Theory and Practice in Python by Laura Graesser and Wah Loon Keng (ISBN-13: 9780135172384)
Copyright © 2020 Pearson Education, Inc. All rights reserved.
Neural Networks Leveraged for RL

Figure 12.4 Neural network families

Simple Environment

Figure 3.1 Simple environment: five states, two actions per state

From Foundations of Deep Reinforcement Learning: Theory and Practice in Python by Laura Graesser and Wah Loon Keng (ISBN-13: 9780135172384)
Copyright © 2020 Pearson Education, Inc. All rights reserved.
Simple Environment

Figure 3.2 Optimal Q-values for the simple environment from Figure 3.1, γ = 0.9

From Foundations of Deep Reinforcement Learning: Theory and Practice in Python by Laura Graesser and Wah Loon Keng (ISBN-13: 9780135172384)
Copyright © 2020 Pearson Education, Inc. All rights reserved.
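
When the transitions and rewards of a small environment are known, optimal Q-values can be computed by repeatedly applying the Bellman optimality backup. The sketch below uses a hypothetical five-state chain with two actions per state; its layout and rewards are assumptions and are not the environment drawn in Figure 3.1. The names `make_chain` and `q_value_iteration` are illustrative.

```python
def make_chain(num_states=5):
    """Hypothetical chain: 'left'/'right' moves, +1 for entering the last state."""
    goal = num_states - 1
    transitions = {}
    for s in range(goal):
        left, right = max(s - 1, 0), s + 1
        transitions[s] = {
            "left": (left, 0.0, False),
            "right": (right, 1.0 if right == goal else 0.0, right == goal),
        }
    return transitions


def q_value_iteration(transitions, gamma, sweeps=100):
    """Compute Q*(s, a), where transitions[s][a] = (next_state, reward, done)."""
    q = {s: {a: 0.0 for a in actions} for s, actions in transitions.items()}
    for _ in range(sweeps):
        for s, actions in transitions.items():
            for a, (next_s, reward, done) in actions.items():
                # Bellman optimality backup: r + gamma * max over a' of Q(s', a').
                future = 0.0 if done else max(q[next_s].values())
                q[s][a] = reward + gamma * future
    return q


q_star = q_value_iteration(make_chain(), gamma=0.9)
for state, action_values in q_star.items():
    print(state, {a: round(v, 4) for a, v in action_values.items()})
```

Reading off the action with the larger Q-value in each state gives the greedy policy, which in this chain is to move right everywhere.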
Simple Environment - Learning

Figure 3.3 Learning the Q*(s, a) for the simple environment from Figure 3.1

From Foundations of Deep Reinforcement Learning: Theory and Practice in Python by Laura Graesser and Wah Loon Keng (ISBN-13: 9780135172384)
Copyright © 2020 Pearson Education, Inc. All rights reserved.
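
Learning Q*(s, a) from experience, without knowing the transitions in advance, is what tabular Q-learning does. The following is a minimal sketch on a hypothetical five-state chain (not the environment of Figure 3.1); the helper `chain_step`, the hyperparameters, and the epsilon-greedy exploration scheme are assumptions chosen for illustration.

```python
import random


def chain_step(state, action, num_states=5):
    """Hypothetical chain: action 0 moves left, action 1 moves right."""
    goal = num_states - 1
    next_state = max(state - 1, 0) if action == 0 else state + 1
    done = next_state == goal
    return next_state, (1.0 if done else 0.0), done


def q_learning(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1,
               num_states=5, seed=0):
    """Estimate Q(s, a) from sampled transitions with epsilon-greedy actions."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(num_states)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, or when the estimates are tied.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = chain_step(state, action, num_states)
            # TD target: r + gamma * max over a' of Q(s', a'), or r at episode end.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


for state, (left, right) in enumerate(q_learning()):
    print(f"state {state}: left={left:.3f} right={right:.3f}")
```

DQN keeps the same update target but replaces the table `q` with a neural network, which is what makes it usable when the state space is too large to enumerate.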
Simple Environment – Optimal Values

Figure 3.4 Optimal Q-values for the simple environment from Figure 3.1, γ = 0 (left), γ = 1 (right)

From Foundations of Deep Reinforcement Learning: Theory and Practice in Python by Laura Graesser and Wah Loon Keng (ISBN-13: 9780135172384)
Copyright © 2020 Pearson Education, Inc. All rights reserved.
Processing of Data

Figure 14.2 Information flow from the world to an algorithm

From Foundations of Deep Reinforcement Learning: Theory and Practice in Python by Laura Graesser and Wah Loon Keng (ISBN-13: 9780135172384)
Reinforcement Learning - GPU

Environment

OpenAI Gym

Gym (openai.com)

For Jetson – GitHub project: dusty-nv/jetson-reinforcement: Deep reinforcement learning GPU libraries for NVIDIA Jetson TX1/TX2 with PyTorch, OpenAI Gym, and Gazebo robotics simulator. (github.com)

Sample notebook tutorial using GPU: jetson-reinforcement/intro-DQN.ipynb at master · dusty-nv/jetson-reinforcement (github.com)

With ROS and Gazebo:

https://github.com/AcutronicRobotics/gym-gazebo2/blob/dashing/docker/README.md

OpenAI Gym - Cartpole

Figure 1.1 CartPole-v0 is a simple toy environment. The objective is to balance a pole for 200 time steps by controlling the left-right motion of a cart.

From Foundations of Deep Reinforcement Learning: Theory and Practice in Python by Laura Graesser and Wah Loon Keng (ISBN-13: 9780135172384)
Copyright © 2020 Pearson Education, Inc. All rights reserved.
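
To make the states, actions, and rewards of a cart-pole task concrete without installing Gym, here is a rough self-contained stand-in. It is not the Gym implementation: the class `CartPoleSketch`, its physical constants, the failure thresholds, and the hand-written `lean_policy` are all assumptions for illustration. Only the 200-step limit and the left-right control come from the caption above.

```python
import math
import random


class CartPoleSketch:
    """Simplified cart-pole stand-in (assumed constants, not Gym's code).

    State: cart position, cart velocity, pole angle (rad), pole angular velocity.
    Actions: 0 = push left, 1 = push right. Reward: +1 per time step survived.
    """

    GRAVITY, CART_MASS, POLE_MASS = 9.8, 1.0, 0.1
    POLE_HALF_LENGTH, FORCE, DT = 0.5, 10.0, 0.02
    MAX_ANGLE, MAX_POSITION, MAX_STEPS = math.radians(12), 2.4, 200

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.state = None
        self.steps = 0

    def reset(self):
        """Start near upright and at rest; return the initial state."""
        self.state = [self.rng.uniform(-0.05, 0.05) for _ in range(4)]
        self.steps = 0
        return tuple(self.state)

    def step(self, action):
        """Advance one time step; return (state, reward, done)."""
        x, x_dot, theta, theta_dot = self.state
        force = self.FORCE if action == 1 else -self.FORCE
        total_mass = self.CART_MASS + self.POLE_MASS
        pole_ml = self.POLE_MASS * self.POLE_HALF_LENGTH
        sin_t, cos_t = math.sin(theta), math.cos(theta)
        temp = (force + pole_ml * theta_dot ** 2 * sin_t) / total_mass
        theta_acc = (self.GRAVITY * sin_t - cos_t * temp) / (
            self.POLE_HALF_LENGTH
            * (4.0 / 3.0 - self.POLE_MASS * cos_t ** 2 / total_mass)
        )
        x_acc = temp - pole_ml * theta_acc * cos_t / total_mass
        # Explicit Euler integration over one time step.
        self.state = [x + self.DT * x_dot, x_dot + self.DT * x_acc,
                      theta + self.DT * theta_dot, theta_dot + self.DT * theta_acc]
        self.steps += 1
        fell = (abs(self.state[2]) > self.MAX_ANGLE
                or abs(self.state[0]) > self.MAX_POSITION)
        done = fell or self.steps >= self.MAX_STEPS
        return tuple(self.state), 1.0, done


def episode_length(env, policy):
    """Run one episode and return how many steps the pole stayed up."""
    state, done = env.reset(), False
    while not done:
        state, _, done = env.step(policy(state))
    return env.steps


# Hand-written rule: push toward the side the pole is leaning or falling.
lean_policy = lambda s: 1 if s[2] + s[3] > 0 else 0
print(episode_length(CartPoleSketch(), lean_policy))
```

Because the reward is +1 per step, the episode length equals the undiscounted return, so a longer episode means a better policy.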
OpenAI Gym - Cartpole

(a) t = 1 (b) t = 2 (c) t = 3 (d) t = 4

Figure 14.7 Four consecutive frames of the CartPole environment

Figure B.3 The LunarLander-v2 environment. The objective is to steer and land the lander between the flags using minimal fuel, without crashing.

From Foundations of Deep Reinforcement Learning: Theory and Practice in Python by Laura Graesser and Wah Loon Keng (ISBN-13: 9780135172384)
Copyright © 2020 Pearson Education, Inc. All rights reserved.
OpenAI Gym - Environments

CartPole Atari Breakout BipedalWalker

Figure 1.3 Three example environments with different states, actions, and rewards. These environments are available in OpenAI Gym.

From Foundations of Deep Reinforcement Learning: Theory and Practice in Python by Laura Graesser and Wah Loon Keng (ISBN-13: 9780135172384)
Copyright © 2020 Pearson Education, Inc. All rights reserved.
Additional Information

Foundations of Deep Reinforcement Learning

https://www.pearson.com/us/higher-education/program/Graesser-Foundations-of-Deep-Reinforcement-Learning-Theory-and-Practice-in-Python/PGM2027228.html

NVIDIA Technical Blog – Deep Learning in a Nutshell: Reinforcement Learning

https://developer.nvidia.com/blog/deep-learning-nutshell-reinforcement-learning/

Thank You
Edge AI and Robotics Teaching Kit
