Lecture Reinforcement Learning


Edge AI and Robotics Teaching Kit

Lecture 5.1
Reinforcement Learning
The Edge AI and Robotics Teaching Kit is licensed by NVIDIA and UMBC under the
Creative Commons Attribution-NonCommercial 4.0 International License.

Topics

• Concepts of Reinforcement Learning
• Reinforcement Learning Algorithms and Approaches
• Deep Learning
• States, Actions, Rewards
• Lab and Example Environments

Learning Objectives - Reinforcement Learning

Explain concepts of Reinforcement Learning
Explain different reinforcement learning approaches
Describe DQN and how Q-Learning is leveraged
Gain hands-on experience training agents using sample environments in OpenAI Gym

Reinforcement Learning Concepts
Concepts

• Environment – attributes
• Agents
• State/Actions
• Learning – policies, functions, models
• Objective
• Rewards

© D. Poole and A. Mackworth 2019, Artificial Intelligence: Foundations of Computational Agents
Reinforcement Learning

[Figure: interaction between the Agent and the Environment]

Reinforcement Learning

What should an agent do given:

• Prior knowledge – possible states, baseline, possible actions
• Observations – current state, immediate reward
• Goal – optimal set of actions that maximizes the mean cumulative discounted reward

We can train this agent by approximating its environment.
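
The "cumulative discounted reward" in the goal above can be made concrete with a small calculation. The following is a minimal sketch, assuming the usual definition in which the reward at step t is weighted by gamma to the power t; the function name and the sample episode are hypothetical and not taken from the lecture.

```python
def discounted_return(rewards, gamma=0.9):
    """Return the sum over t of gamma**t * rewards[t] for one episode."""
    total = 0.0
    # Work backwards so each earlier step adds one more factor of gamma.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical episode: no reward for two steps, then +1 at the goal.
episode_rewards = [0.0, 0.0, 1.0]
episode_return = discounted_return(episode_rewards, gamma=0.9)
print(f"Discounted return: {episode_return:.2f}")
```

With gamma = 0.9 the single +1 reward, received two steps in the future, is worth about 0.81 at the start of the episode. An agent that maximizes this quantity therefore prefers reaching the reward sooner.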

© D. Poole and A. Mackworth 2019, Artificial Intelligence: Foundations of Computational Agents
Reinforcement Learning Loop

Figure 1.2 The reinforcement learning control loop

From Foundations of Deep Reinforcement Learning: Theory and Practice in Python by Laura Graesser and Wah Loon Keng (ISBN-13: 9780135172384)
Copyright © 2020 Pearson Education, Inc. All rights reserved.
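
The control loop in the figure can be sketched in a few lines of code. This is an illustrative example only: CoinFlipEnv is a hypothetical stand-in environment, loosely modeled on the reset/step style of interface used by OpenAI Gym but simplified (step returns three values here), and run_episode and the policies are assumed names.

```python
import random


class CoinFlipEnv:
    """Hypothetical toy environment: the state is 0 or 1, drawn at random."""

    def __init__(self, episode_length=10, seed=0):
        self.episode_length = episode_length
        self.rng = random.Random(seed)
        self.steps = 0
        self.state = 0

    def reset(self):
        self.steps = 0
        self.state = self.rng.randint(0, 1)
        return self.state

    def step(self, action):
        # Reward +1 when the action matches the current state, else 0.
        reward = 1.0 if action == self.state else 0.0
        self.steps += 1
        self.state = self.rng.randint(0, 1)
        done = self.steps >= self.episode_length
        return self.state, reward, done


def run_episode(env, policy):
    """Run one pass of the control loop and return the total reward."""
    state = env.reset()
    total_reward = 0.0
    done = False
    while not done:
        action = policy(state)                   # agent chooses an action
        state, reward, done = env.step(action)   # environment responds
        total_reward += reward
    return total_reward


def matching_policy(state):
    """Always answer with the observed state."""
    return state


def always_zero_policy(state):
    """Ignore the state and always choose action 0."""
    return 0


print("matching policy:", run_episode(CoinFlipEnv(), matching_policy))
print("always-zero policy:", run_episode(CoinFlipEnv(), always_zero_policy))
```

The loop itself never looks inside the environment. It only passes actions in and receives the next state and reward back, which is why the same loop can drive very different environments.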
Rewards and Values

Figure 1.4 Rewards r and values V(s) for each state s in a simple grid-world
environment. The value of a state is calculated from the rewards using
Equation 1.10 with γ = 0.9 while using a policy π that always takes the
shortest path to the goal state with r = +1.

From Foundations of Deep Reinforcement Learning: Theory and Practice in Python by Laura Graesser and Wah Loon Keng (ISBN-13: 9780135172384)
Copyright © 2020 Pearson Education, Inc. All rights reserved.
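
To illustrate how values follow from rewards under a shortest-path policy, here is a minimal sketch on a hypothetical one-dimensional corridor rather than the grid world in the figure. It assumes a reward of +1 on entering the goal, 0 elsewhere, a discount of 0.9, and that a state's value is the immediate reward plus the discounted value of the next state. The function name and layout are illustrative assumptions.

```python
def corridor_values(num_states=5, gamma=0.9, goal_reward=1.0):
    """Values of a hypothetical corridor under an always-move-right policy.

    State num_states - 1 is the terminal goal; entering it yields goal_reward.
    """
    values = [0.0] * num_states  # the terminal goal keeps value 0
    # Sweep from the state next to the goal back toward the start.
    for state in range(num_states - 2, -1, -1):
        next_state = state + 1
        reward = goal_reward if next_state == num_states - 1 else 0.0
        values[state] = reward + gamma * values[next_state]
    return values


values = corridor_values()
for state, value in enumerate(values):
    print(f"V({state}) = {value:.3f}")
```

The state next to the goal has value 1.0, and each step further away multiplies the value by 0.9, giving roughly 0.9, 0.81, and 0.729. States closer to the goal are therefore worth more, even though only one transition carries a reward.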
Approaches

Reinforcement Learning Approaches

Figure 1.5 Deep reinforcement learning algorithm families

From Foundations of Deep Reinforcement Learning: Theory and Practice in Python by Laura Graesser and Wah Loon Keng (ISBN-13: 9780135172384)
Copyright © 2020 Pearson Education, Inc. All rights reserved.
Neural Networks Leveraged for RL

Figure 12.4 Neural network families

From Foundations of Deep Reinforcement Learning: Theory and Practice in Python by Laura Graesser and Wah Loon Keng (ISBN-13: 9780135172384)
Copyright © 2020 Pearson Education, Inc. All rights reserved.
States
Simple Environment

Figure 3.1 Simple environment: five states, two actions per state

From Foundations of Deep Reinforcement Learning: Theory and Practice in Python by Laura Graesser and Wah Loon Keng (ISBN-13: 9780135172384)
Copyright © 2020 Pearson Education, Inc. All rights reserved.
Simple Environment

Figure 3.2 Optimal Q-values for the simple environment from Figure 3.1, γ = 0.9

From Foundations of Deep Reinforcement Learning: Theory and Practice in Python by Laura Graesser and Wah Loon Keng (ISBN-13: 9780135172384)
Copyright © 2020 Pearson Education, Inc. All rights reserved.
Simple Environment - Learning

Figure 3.3 Learning the Q*(s, a) for the simple environment from Figure 3.1

From Foundations of Deep Reinforcement Learning: Theory and Practice in Python by Laura Graesser and Wah Loon Keng (ISBN-13: 9780135172384)
Copyright © 2020 Pearson Education, Inc. All rights reserved.
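
Learning Q-values by trial and error can be sketched with tabular Q-learning. The environment below is a hypothetical five-state chain with two actions per state; it is not a reproduction of the environment in Figure 3.1, whose details are not given in these slides. The hyperparameters (alpha, gamma, epsilon, episode count) are illustrative assumptions.

```python
import random

NUM_STATES = 5      # hypothetical chain: states 0..4, state 4 is the goal
ACTIONS = (0, 1)    # 0 = move left, 1 = move right


def step(state, action):
    """Deterministic chain dynamics; reaching state 4 gives +1 and ends."""
    if action == 1:
        next_state = min(state + 1, NUM_STATES - 1)
    else:
        next_state = max(state - 1, 0)
    done = next_state == NUM_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(NUM_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: explore sometimes, otherwise act greedily,
            # breaking ties between equal Q-values at random.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: (q[state][a], rng.random()))
            next_state, reward, done = step(state, action)
            # The target bootstraps from the best action in the next state.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = q_learning()
for state in range(NUM_STATES - 1):
    left, right = q_table[state]
    print(f"state {state}: Q(left) = {left:.3f}, Q(right) = {right:.3f}")
```

In this sketch the learned Q-value for moving right exceeds the one for moving left in every non-terminal state, so the greedy policy heads straight for the goal. The agent is never told the dynamics; it only sees states and rewards.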
Simple Environment – Optimal Values

Figure 3.4 Optimal Q-values for the simple environment from Figure 3.1, γ = 0
(left), γ = 1 (right)

From Foundations of Deep Reinforcement Learning: Theory and Practice in Python by Laura Graesser and Wah Loon Keng (ISBN-13: 9780135172384)
Copyright © 2020 Pearson Education, Inc. All rights reserved.
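
The effect of the discount factor at its two extremes can be explored with a short calculation. This sketch repeats the Bellman optimality backup on a hypothetical five-state chain (not the environment from Figure 3.1) in which entering the last state gives a reward of +1. All names and the number of sweeps are assumptions for illustration.

```python
NUM_STATES = 5      # hypothetical chain: states 0..4, state 4 is terminal
LEFT, RIGHT = 0, 1


def step(state, action):
    """Deterministic chain dynamics with +1 for entering the last state."""
    if action == RIGHT:
        next_state = min(state + 1, NUM_STATES - 1)
    else:
        next_state = max(state - 1, 0)
    reward = 1.0 if next_state == NUM_STATES - 1 else 0.0
    return next_state, reward


def optimal_q(gamma, sweeps=100):
    """Repeat the Bellman optimality backup over all state-action pairs."""
    q = [[0.0, 0.0] for _ in range(NUM_STATES)]
    for _ in range(sweeps):
        for state in range(NUM_STATES - 1):      # skip the terminal state
            for action in (LEFT, RIGHT):
                next_state, reward = step(state, action)
                q[state][action] = reward + gamma * max(q[next_state])
    return q


for gamma in (0.0, 0.9, 1.0):
    print(f"gamma = {gamma}")
    for state, (left, right) in enumerate(optimal_q(gamma)[:-1]):
        print(f"  state {state}: Q(left) = {left:.3f}, Q(right) = {right:.3f}")
```

With gamma = 0 only the final step toward the goal has a nonzero Q-value, so the agent is short-sighted and sees no reason to move from earlier states. With gamma = 1 every action in this chain has Q-value 1, because a delayed reward counts as much as an immediate one, so the agent has no incentive to reach the goal quickly.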
Processing of Data

Figure 14.2 Information flow from the world to an algorithm

From Foundations of Deep Reinforcement Learning: Theory and Practice in Python by Laura Graesser and Wah Loon Keng (ISBN-13: 9780135172384)
Reinforcement Learning - GPU

Environment
OpenAI Gym

Gym (openai.com)

For Jetson, GitHub project: dusty-nv/jetson-reinforcement: Deep reinforcement learning GPU libraries for NVIDIA Jetson TX1/TX2 with PyTorch, OpenAI Gym, and Gazebo robotics simulator. (github.com)

Sample notebook tutorial using GPU: jetson-reinforcement/intro-DQN.ipynb at master · dusty-nv/jetson-reinforcement (github.com)

With ROS and Gazebo:
https://github.com/AcutronicRobotics/gym-gazebo2/blob/dashing/docker/README.md

OpenAI Gym - Cartpole

Figure 1.1 CartPole-v0 is a simple toy environment. The objective is to
balance a pole for 200 time steps by controlling the left-right motion of a cart.

From Foundations of Deep Reinforcement Learning: Theory and Practice in Python by Laura Graesser and Wah Loon Keng (ISBN-13: 9780135172384)
Copyright © 2020 Pearson Education, Inc. All rights reserved.
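
Before training an agent on CartPole, it can help to see what a hand-written policy looks like. The sketch below is a simple baseline, not a learned policy and not part of the lecture. It assumes the CartPole observation is ordered as [cart position, cart velocity, pole angle, pole angular velocity], that action 0 pushes the cart left and action 1 pushes it right, and that a positive angle means the pole leans right. The function name is hypothetical.

```python
def lean_policy(observation):
    """Push the cart toward the side the pole is leaning.

    Assumes the observation layout
    [cart position, cart velocity, pole angle, pole angular velocity]
    and actions 0 = push left, 1 = push right.
    """
    pole_angle = observation[2]
    return 1 if pole_angle > 0 else 0


# Hypothetical observations: pole leaning right, then leaning left.
print(lean_policy([0.0, 0.0, 0.05, 0.0]))
print(lean_policy([0.0, 0.0, -0.05, 0.0]))
```

A rule like this reacts only to the current angle and ignores velocity, so it would not be expected to balance the pole for the full 200 steps. It gives a useful reference point for judging whether a trained agent has learned something better.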
OpenAI Gym - Cartpole

(a) t = 1 (b) t = 2 (c) t = 3 (d) t = 4

Figure 14.7 Four consecutive frames of the CartPole environment

From Foundations of Deep Reinforcement Learning: Theory and Practice in Python by Laura Graesser and Wah Loon Keng (ISBN-13: 9780135172384)
Copyright © 2020 Pearson Education, Inc. All rights reserved.
OpenAI Gym - LunarLander

Figure B.3 The LunarLander-v2 environment. The objective is to steer and
land the lander between the flags using minimal fuel, without crashing.

From Foundations of Deep Reinforcement Learning: Theory and Practice in Python by Laura Graesser and Wah Loon Keng (ISBN-13: 9780135172384)
Copyright © 2020 Pearson Education, Inc. All rights reserved.
OpenAI Gym - Environments

CartPole Atari Breakout BipedalWalker


Figure 1.3 Three example environments with different states, actions, and
rewards. These environments are available in OpenAI Gym.

From Foundations of Deep Reinforcement Learning: Theory and Practice in Python by Laura Graesser and Wah Loon Keng (ISBN-13: 9780135172384)
Copyright © 2020 Pearson Education, Inc. All rights reserved.
Additional Information

Foundations of Deep Reinforcement Learning
https://www.pearson.com/us/higher-education/program/Graesser-Foundations-of-Deep-Reinforcement-Learning-Theory-and-Practice-in-Python/PGM2027228.html

NVIDIA Technical Blog – Deep Learning in a Nutshell: Reinforcement Learning
https://developer.nvidia.com/blog/deep-learning-nutshell-reinforcement-learning/

Thank You
Edge AI and Robotics Teaching Kit
