5.3 Supervised & Reinforcement
A. Avinash, Ph.D.,
Assistant Professor
School of Computer Science and Engineering (SCOPE)
Vellore Institute of Technology (VIT), Chennai
Introduction to Evolutionary Computation
Step 4 (fitness and end test): Compute the fitness values for
the new population of N chromosomes. Terminate the
algorithm if the stopping criterion is met or the budget of fitness
function evaluations is exhausted; otherwise return to step 1.
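The fitness evaluation and end test of Step 4 can be sketched in Python as follows. Steps 1 to 3 are not shown in this section, so the tournament selection, one-point crossover, and bit-flip mutation used here are assumptions chosen only to make the loop runnable. The OneMax fitness (count of 1 bits) and the names `run_ga`, `max_evals`, and `p_mut` are hypothetical as well.

```python
import random

def onemax(chromosome):
    """Illustrative fitness: number of 1 bits (an assumption, not from the slides)."""
    return sum(chromosome)

def run_ga(n=20, length=16, max_evals=2000, p_mut=0.05, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)] for _ in range(n)]
    fitness = [onemax(c) for c in population]
    evals = n
    while True:
        # Step 4 (end test): stop on the target fitness or when another
        # generation would exceed the fitness-evaluation budget.
        if max(fitness) == length or evals + n > max_evals:
            break
        # Steps 1-3 (assumed): tournament selection, one-point crossover,
        # bit-flip mutation.
        new_population = []
        while len(new_population) < n:
            p1 = max(rng.sample(range(n), 2), key=lambda i: fitness[i])
            p2 = max(rng.sample(range(n), 2), key=lambda i: fitness[i])
            cut = rng.randint(1, length - 1)
            child = population[p1][:cut] + population[p2][cut:]
            child = [1 - g if rng.random() < p_mut else g for g in child]
            new_population.append(child)
        population = new_population
        # Step 4 (fitness): evaluate the new population of N chromosomes.
        fitness = [onemax(c) for c in population]
        evals += n
    best = max(range(n), key=lambda i: fitness[i])
    return population[best], fitness[best], evals

best_chromosome, best_fitness, evaluations_used = run_ga()
```

Counting evaluations rather than generations keeps the budget meaningful when the population size changes.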
Machine Learning
Types of Supervised Learning:
• Classification
• Regression
Linear Regression
Nearest Neighbor
Decision Trees
Random Forest
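To make the difference between the two types concrete, here is a minimal sketch in plain Python: a 1-nearest-neighbor classifier that returns a discrete label, and a least-squares line fit that returns a continuous prediction. The toy data and the function names `nearest_neighbor_predict` and `fit_line` are made up for illustration.

```python
def nearest_neighbor_predict(train_x, train_y, query):
    """Classification: return the label of the closest training point (1-NN)."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    best = min(range(len(train_x)), key=lambda i: sq_dist(train_x[i], query))
    return train_y[best]

def fit_line(xs, ys):
    """Regression: least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Toy data, made up for illustration.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
labels = ["a", "a", "b", "b"]
label = nearest_neighbor_predict(points, labels, (5.5, 4.0))  # discrete class label
slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])       # continuous model
```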
Rewards
[Figure: agent-environment loop with observation and action]
The state is a function of the history: $S_t = f(H_t)$
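As a small illustration of a state being a function of the history, the sketch below shows two possible choices of f. It assumes the history is stored as a list of (observation, reward, action) tuples; that representation and the function names are hypothetical.

```python
# Assumed representation of the history H_t: one tuple per time-step.
history = [("o1", 0, "a1"), ("o2", -1, "a2"), ("o3", -1, "a3")]

def last_observation(h):
    """One possible f: the state is just the most recent observation."""
    return h[-1][0]

def last_k_observations(h, k=2):
    """Another possible f: the state is the last k observations."""
    return tuple(obs for obs, _, _ in h[-k:])

state_a = last_observation(history)
state_b = last_k_observations(history)
```

Different choices of f keep different amounts of the past, which is the design decision behind any state representation.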
Environment State
[Figure: agent-environment loop showing observation $O_t$, action $A_t$, reward $R_t$, and environment state $S_t^e$]
• The environment state $S_t^e$ is the environment's private representation, i.e. whatever data the environment uses to pick the next observation/reward
• The environment state is not usually visible to the agent
• Even if $S_t^e$ is visible, it may contain irrelevant information
Agent State
$H_{1:t} \to S_t \to H_{t+1:\infty}$
Maze Example
[Figure: maze with Start and Goal cells]
• Rewards: -1 per time-step
• Actions: N, E, S, W
• States: Agent's location
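The maze setup can be sketched as a tiny environment: the state is the agent's location, the actions are N, E, S, W, and every time-step yields a reward of -1. The 3x3 layout below is hypothetical (the maze from the figure is not recoverable here), and the rule that a blocked move leaves the agent in place is an assumption.

```python
# Hypothetical 3x3 layout: 'S' start, 'G' goal, '#' wall, '.' free cell.
MAZE = ["S.#",
        "..#",
        "..G"]
MOVES = {"N": (-1, 0), "E": (0, 1), "S": (1, 0), "W": (0, -1)}

def find(symbol):
    """Return the (row, column) of the first cell holding `symbol`."""
    for r, row in enumerate(MAZE):
        c = row.find(symbol)
        if c != -1:
            return (r, c)
    raise ValueError(symbol)

def step(state, action):
    """Return (next_state, reward, done) for one time-step."""
    dr, dc = MOVES[action]
    r, c = state[0] + dr, state[1] + dc
    inside = 0 <= r < len(MAZE) and 0 <= c < len(MAZE[0])
    if not inside or MAZE[r][c] == "#":
        r, c = state  # assumed: a blocked move leaves the agent in place
    next_state = (r, c)
    return next_state, -1, next_state == find("G")

# Walk from Start to Goal and accumulate the reward.
state, total = find("S"), 0
for action in ["S", "S", "E", "E"]:
    state, reward, done = step(state, action)
    total += reward
```

Because each step costs -1, the total reward is the negative of the number of steps taken, so maximizing reward means reaching the goal as quickly as possible.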
Maze Example: Policy
[Figure: maze from Start to Goal showing the policy in each cell]
[Figure: maze grid with a number in each cell, ranging from -1 to -24; cell layout not recoverable]
• Rewards: how much reward from each state
• The model may be imperfect
• Grid layout represents transition model $P^a_{ss'}$
• Numbers represent immediate reward $R^a_s$ from each state $s$ (same for all $a$)
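To show how an agent could use such a model, the sketch below encodes a transition model and a reward model for a small grid and plans with undiscounted value iteration. The 3x3 layout is hypothetical (not the maze from the slides), the transitions are assumed deterministic, and value iteration is one possible planning method chosen for illustration, not necessarily the one behind the figures.

```python
MAZE = ["S.#",
        "..#",
        "..G"]  # hypothetical layout: 'S' start, 'G' goal, '#' wall
MOVES = {"N": (-1, 0), "E": (0, 1), "S": (1, 0), "W": (0, -1)}
STATES = [(r, c) for r, row in enumerate(MAZE)
          for c, ch in enumerate(row) if ch != "#"]
GOAL = next(s for s in STATES if MAZE[s[0]][s[1]] == "G")

def transition(s, a):
    """Transition model P: deterministic next state (blocked moves stay put)."""
    r, c = s[0] + MOVES[a][0], s[1] + MOVES[a][1]
    return (r, c) if (r, c) in STATES else s

def reward(s, a):
    """Reward model R: -1 from every state, the same for all actions."""
    return -1

def plan(sweeps=50):
    """Undiscounted value iteration on the model; returns values and a greedy policy."""
    v = {s: 0.0 for s in STATES}
    for _ in range(sweeps):
        for s in STATES:
            if s != GOAL:
                v[s] = max(reward(s, a) + v[transition(s, a)] for a in MOVES)
    policy = {s: max(MOVES, key=lambda a: reward(s, a) + v[transition(s, a)])
              for s in STATES if s != GOAL}
    return v, policy

values, policy = plan()
```

With a reward of -1 per step, each resulting value is the negative of the number of steps to the goal under the greedy policy. If the model were imperfect, the plan would be optimal for the model rather than for the real environment.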