
Reinforcement Learning - Open AI Gym

(1) The document discusses using reinforcement learning to solve the Lunar Lander environment from OpenAI Gym. (2) Deep Q-learning was used to create an agent that could take actions to land the lunar lander safely based on its environment state. (3) The results showed that the agent was able to achieve scores over 200 points, solving the problem, after training for a number of episodes.

Uploaded by

lekeke

Reinforcement learning: OpenAI Gym
Jakub Senčák, Pavel Podlužanský, Martin Pospísil,
Viet Anh Phan, Dinh Thao Le
Content

• Assignment
• Motivation
• Reinforcement learning
• The chosen problem
• Approach to the problem
• Created solution to the problem
• Results
• Conclusion
Assignment

• Get acquainted with the topic of reinforcement learning.
• Choose any environment from https://gym.openai.com/.
• Create a model that will be able to play the game.
Motivation

• Gaming
• Resource management
• Personalized recommendations
• Robotics
Reinforcement learning

• Learning from interaction with an environment to achieve some long-term goal that is related to the state of the environment.
• The goal is defined by a reward signal, which must be maximized.
• The agent must be able to partially or fully sense the environment state and take actions to influence that state.
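The interaction loop described above can be sketched in a few lines of Python. This is only an illustration: `ToyLineEnv`, its reward values, and the two policies are hypothetical stand-ins invented here, not the Lunar Lander environment and not code from the slides.

```python
import random


class ToyLineEnv:
    """Hypothetical 1-D environment: the agent starts at 0 and must reach position 3."""

    def __init__(self, goal=3, max_steps=20):
        self.goal = goal
        self.max_steps = max_steps
        self.position = 0
        self.steps = 0

    def reset(self):
        self.position = 0
        self.steps = 0
        return self.position  # the state the agent senses

    def step(self, action):
        # action 0 = move left, action 1 = move right
        self.position += 1 if action == 1 else -1
        self.steps += 1
        reached_goal = self.position == self.goal
        reward = 1.0 if reached_goal else -0.1  # reward signal to maximize
        done = reached_goal or self.steps >= self.max_steps
        return self.position, reward, done


def run_episode(env, policy):
    """Run one episode and return the total reward collected."""
    state = env.reset()
    total_reward = 0.0
    done = False
    while not done:
        action = policy(state)                  # agent acts on the sensed state
        state, reward, done = env.step(action)  # environment responds
        total_reward += reward
    return total_reward


def random_policy(state):
    return random.choice([0, 1])


def always_right_policy(state):
    return 1
```

Calling `run_episode(ToyLineEnv(), always_right_policy)` reaches the goal in three steps, while `random_policy` usually collects a lower total. Closing that gap is what learning is meant to do.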
The chosen problem

• Lunar Lander: the goal is to get the lander to land on the landing pad.
• Moving from the top of the screen to the pad and coming to rest there => about +100 to +140 points.
• If the lander moves away from the pad, it loses that reward again.
• The episode finishes if the lander crashes (-100 points) or comes to rest (+100 points).
• The problem is solved if we get at least 200 points.
• Four discrete actions are available: do nothing, fire left orientation engine, fire main engine, fire right orientation engine.
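The action set and the 200-point threshold can be written down as a small sketch. The names `LanderAction`, `SOLVED_SCORE` and `is_solved` are hypothetical, the integer numbering simply follows the order of the list above, and averaging over the last 100 episodes is an assumption (a common convention); the slide itself only states the threshold.

```python
from enum import IntEnum


class LanderAction(IntEnum):
    """The four discrete actions, in the order listed above (numbering is an assumption)."""
    DO_NOTHING = 0
    FIRE_LEFT_ENGINE = 1
    FIRE_MAIN_ENGINE = 2
    FIRE_RIGHT_ENGINE = 3


SOLVED_SCORE = 200  # threshold from the slide


def is_solved(episode_scores, window=100):
    """Return True if the mean of the last `window` scores reaches SOLVED_SCORE.

    Averaging over a window is an assumption; the slide only gives the threshold.
    """
    if not episode_scores:
        return False
    recent = episode_scores[-window:]
    return sum(recent) / len(recent) >= SOLVED_SCORE
```

For example, `is_solved([250, 210, 190])` is true because the mean score is above 200, even though one episode fell short.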
Approach to the problem

• Chosen method of RL:
  • Deep Q-learning
• Used libraries:
  • NumPy
  • TensorFlow
  • Keras
• The code is executed in a Google Colab notebook.
Q-learning

• The AI agent attempts to construct an optimal policy directly by interacting with the environment.
• It uses a trial-and-error approach: the agent repeatedly tries to solve the problem using varied approaches and continuously updates its policy as it learns more about the environment.
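The trial-and-error update can be sketched as the standard tabular Q-learning rule. The learning rate `alpha`, the discount factor `gamma`, the epsilon-greedy exploration and all function names are illustrative assumptions, not values or code taken from the slides.

```python
import random
from collections import defaultdict


def epsilon_greedy(q_table, state, n_actions, epsilon):
    """Explore with probability epsilon, otherwise pick the best-known action."""
    if random.random() < epsilon:
        return random.randrange(n_actions)
    return max(range(n_actions), key=lambda a: q_table[(state, a)])


def q_update(q_table, state, action, reward, next_state, done, n_actions,
             alpha=0.1, gamma=0.99):
    """Apply one Q-learning update:
    Q(s, a) += alpha * (reward + gamma * max_a' Q(s', a') - Q(s, a))
    """
    # No future value is added once the episode has ended.
    best_next = 0.0 if done else max(q_table[(next_state, a)] for a in range(n_actions))
    td_target = reward + gamma * best_next
    q_table[(state, action)] += alpha * (td_target - q_table[(state, action)])
    return q_table[(state, action)]


q_table = defaultdict(float)  # unseen state-action pairs default to 0.0
```

Each call to `q_update` nudges one table entry toward the observed reward plus the discounted best estimate for the next state, so the policy improves as more transitions are seen.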
Deep Q-learning

• Q-Learning: a table maps each state-action pair to its corresponding Q-value.
• Deep Q-Learning: a neural network maps input states to (action, Q-value) pairs.
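The difference from a table can be illustrated with a function that takes a state vector and returns one Q-value per action. The sketch below uses a single linear layer in plain Python as a stand-in for the neural network; it is not the authors' Keras model. The state size of 8 matches the Lunar Lander observation but is not stated in the slides, and all names and the `gamma` value are hypothetical.

```python
import random

STATE_SIZE = 8   # assumption: Lunar Lander observations have 8 components
N_ACTIONS = 4    # the four discrete actions of Lunar Lander


def init_linear_q(state_size, n_actions, seed=0):
    """Create small random weights for a one-layer (linear) Q-function."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(state_size)]
               for _ in range(n_actions)]
    biases = [0.0] * n_actions
    return weights, biases


def q_values(state, weights, biases):
    """Map one state vector to a list with one Q-value per action."""
    return [sum(w * s for w, s in zip(row, state)) + b
            for row, b in zip(weights, biases)]


def greedy_action(state, weights, biases):
    """Pick the action whose predicted Q-value is largest."""
    q = q_values(state, weights, biases)
    return max(range(len(q)), key=q.__getitem__)


def td_target(reward, next_state, done, weights, biases, gamma=0.99):
    """Regression target the network is trained toward for the taken action."""
    if done:
        return reward
    return reward + gamma * max(q_values(next_state, weights, biases))
```

A real deep Q-network would replace `q_values` with a multi-layer model and fit its weights by gradient descent toward `td_target`; the input and output shapes stay the same.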
Created solution to the problem

• Code and explanation


Results

• Screenshot of the scores
• One or two GIFs or videos
Conclusion
• We became acquainted with reinforcement learning, Q-learning, and Deep Q-learning.
• We created a model that can play the Lunar Lander game.
• The result of the game is xxxxx after xxxxx episodes. Based on that, we consider the model a success.
Thank you for your attention
