Chapter 2
“Intelligent Agents”
Melaku M.
►Structure of agents
❖A human agent has eyes, ears, and other organs for sensors, and hands,
legs, mouth, and other body parts for effectors.
Figure 4: Partial tabulation of a simple agent function for the vacuum-cleaner world shown in Fig 2
Figure 5: The agent program for a simple reflex agent in the two-state vacuum environment. This program implements the agent function tabulated in Fig 4.
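In Python, the program in Figure 5 might be sketched roughly as follows (a minimal sketch, assuming the percept is a (location, status) pair such as ('A', 'Dirty')):

    def reflex_vacuum_agent(percept):
        # Decide using only the current percept: (location, status).
        location, status = percept
        if status == 'Dirty':
            return 'Suck'
        elif location == 'A':
            return 'Right'
        elif location == 'B':
            return 'Left'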
Rationality and Knowledge
➢A rational agent "does the right thing": the right action is the one that will cause the agent to be most successful.
➢Definition: for each possible percept sequence, a rational agent should select an action that is expected to maximize its performance measure, given the evidence provided by the percept sequence and whatever built-in knowledge the agent has.
Task Environment
❖To design a rational agent, we must specify the task environment. Specifying the task environment is always the first step in designing an agent.
❖PEAS: to specify a task environment
Performance measure
Environment
Actuators
Sensors
❖Performance measure: the success criterion that determines how successful an agent is.
PEAS: Specifying for an automated taxi driver
►Performance measure: what desirable qualities do we expect from our automated driver?
Safe, fast, legal (minimizing violations of traffic laws and other protocols), comfortable trip, maximizing profits.
►Environment: what driving environments will the taxi face?
roads, other traffic/vehicles, pedestrians, customers, potholes, traffic signs
►Actuators:
steering, accelerator, brake, signal, horn (for sounding warnings), display
►Sensors:
Cameras (to see the road), sonar and infrared (to detect distances to other cars and obstacles), accelerometer (to control the vehicle properly, especially on curves), speedometer, GPS, odometer, engine/fuel/electrical-system sensors, keyboard or microphone.
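For illustration only, this PEAS description can be written down as a simple data structure; the field names below are made up for the sketch, not part of any standard notation:

    # Hypothetical PEAS specification for the automated taxi, as plain data.
    taxi_peas = {
        'performance_measure': ['safe', 'fast', 'legal', 'comfortable trip', 'maximize profits'],
        'environment': ['roads', 'other traffic', 'pedestrians', 'customers', 'potholes', 'traffic signs'],
        'actuators': ['steering', 'accelerator', 'brake', 'signal', 'horn', 'display'],
        'sensors': ['cameras', 'sonar', 'infrared', 'accelerometer', 'speedometer',
                    'GPS', 'odometer', 'engine/fuel/electrical sensors', 'keyboard or microphone'],
    }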
PEAS: Specifying for vacuum cleaner
►Performance measure:
Cleanness, efficiency (amount of dirt cleaned within a certain time), battery life.
►Environment:
Rooms or squares (e.g. locations A and B), dirt, possibly obstacles and furniture.
►Actuators:
Wheels or motors for moving left and right, suction mechanism.
►Sensors:
Dirt sensor, location sensor, bump sensor.
❖Fully Observable: the agent's sensors give it access to the complete state of the environment at each point in time.
✓ Fully observable environments are convenient because the agent need not maintain any internal state to keep track of the world.
Episodic vs. Sequential
✓ For example, an agent that has to spot defective products on an assembly line bases each decision on the current part, regardless of previous decisions; its environment is episodic.
✓ Chess, crossword puzzles, poker, and taxi driving are sequential: in all these cases, short-term actions can have long-term consequences.
Deterministic vs. Stochastic
❖ Deterministic environment: the agent's current state and selected action completely determine the next state of the environment.
E.g. crossword puzzle.
Taxi driving is clearly stochastic in this sense, because one can never predict the behavior of traffic exactly.
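The distinction can be illustrated with two toy transition functions; the numeric state and the noise model below are assumptions made only for this sketch:

    import random

    def deterministic_step(state, action):
        # The next state is fully determined by the current state and the action.
        return state + action

    def stochastic_step(state, action):
        # The same state and action can lead to different next states
        # (e.g. traffic or road conditions add unpredictable noise).
        noise = random.choice([-1, 0, 1])
        return state + action + noise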
❖The basic kinds of agent programs are:
➢Simple reflex agents
➢Model-based reflex agents
➢Goal-based agents
➢Utility-based agents
➢Learning agents
Simple reflex Agents
❖These agents select an action on the basis of the current percept, ignoring the rest of the percept history.
❖They work on condition–action rules, which directly map the current percept to an action.
❖For example, the vacuum agent whose agent function is tabulated in Figure 4 is a simple reflex agent, because its decision is based only on the current location and on whether that location contains dirt.
❖If location A is dirty, this triggers an established connection in the agent program to the action "Suck".
❖We call such a connection a condition–action rule, written as:
Example 1: if location A is dirty then suck
Example 2: if car-in-front-is-braking then initiate-braking.
These agents succeed only if the environment is fully observable.
Figure: Schematic diagram of a simple reflex agent
The INTERPRET-INPUT function generates an abstracted description of the current state from the percept.
The RULE-MATCH function returns the first rule in the set of rules that matches the given state description.
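Putting these pieces together, a simple reflex agent might be rendered roughly as the Python sketch below; the rule representation as (condition, action) pairs and the vacuum-world rules are assumptions:

    # Sketch of a simple reflex agent. Rules are assumed to be
    # (condition function, action) pairs over the abstracted state.
    rules = [
        (lambda state: state['dirty'], 'Suck'),
        (lambda state: state['location'] == 'A', 'Right'),
        (lambda state: state['location'] == 'B', 'Left'),
    ]

    def interpret_input(percept):
        # Generate an abstracted description of the current state from the percept.
        location, status = percept
        return {'location': location, 'dirty': status == 'Dirty'}

    def rule_match(state, rules):
        # Return the action of the first rule that matches the state description.
        for condition, action in rules:
            if condition(state):
                return action

    def simple_reflex_agent(percept):
        state = interpret_input(percept)
        return rule_match(state, rules)

    print(simple_reflex_agent(('A', 'Dirty')))   # Suck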
Model based reflex agents
❖The most effective way to handle partial observability is for the agent to keep track
of the part of the world it can’t see now.
❖Maintains internal state: keeps track of percept history in order to reveal some of
the unobservable aspects of the current state.
For other driving tasks, such as changing lanes, the agent needs to keep track of where the other cars are when it cannot see them all at once.
❖The agent combines the current percept with the internal state to generate an updated description of the current state.
❖Updating internal state information requires two kinds of knowledge to be encoded
in the agent program.
a) how the world evolves independently of the agent;
b) how the agent's actions affect the world.
❖The agent thus stores previously observed information as part of its internal state.
Figure: Structure of the model-based reflex agent, showing how the current percept is combined with the old internal state to generate the updated description of the current state.
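A minimal Python sketch of this structure is given below; the model function and its signature are assumptions chosen to mirror the description above, not a standard interface:

    # Sketch of a model-based reflex agent. `state` persists between calls;
    # `model` encodes how the world evolves and how actions affect it.
    class ModelBasedReflexAgent:
        def __init__(self, model, rules):
            self.state = {}            # internal description of the current world state
            self.last_action = None
            self.model = model
            self.rules = rules         # condition-action rules, as in the simple reflex agent

        def __call__(self, percept):
            # Combine the old internal state, the last action, the new percept
            # and the model to produce an updated state description.
            self.state = self.model(self.state, self.last_action, percept)
            for condition, action in self.rules:
                if condition(self.state):
                    self.last_action = action
                    return action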
Goal Based Agent
❖Knowing something about the current state of the environment is not always
enough to decide what to do. For example, at a road junction, the taxi can turn
left, turn right, or go straight on. The correct decision depends on where the
taxi is trying to get to.
In addition to the current state description, the agent needs goal information.
The agent program combines the goal information with the model; this allows the agent to choose among multiple possibilities and select the one that reaches a goal state.
This usually requires searching and planning to find action sequences that achieve the agent's goals, e.g. a GPS system finding a path to a certain destination.
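A minimal illustration of searching for an action sequence that reaches a goal state is sketched below, using breadth-first search over a tiny, invented road map:

    from collections import deque

    # Hypothetical miniature road map: junction -> {action: next junction}.
    road_map = {
        'home':    {'left': 'market', 'right': 'school'},
        'market':  {'straight': 'airport'},
        'school':  {'left': 'airport'},
        'airport': {},
    }

    def plan_route(start, goal):
        # Breadth-first search for a sequence of actions that reaches the goal.
        frontier = deque([(start, [])])
        visited = {start}
        while frontier:
            state, actions = frontier.popleft()
            if state == goal:
                return actions
            for action, next_state in road_map[state].items():
                if next_state not in visited:
                    visited.add(next_state)
                    frontier.append((next_state, actions + [action]))

    print(plan_route('home', 'airport'))   # ['left', 'straight']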
Figure: Goal-based agents. Decision making of this kind is fundamentally different from condition–action rules; it involves consideration of the future, both "What will happen if I do such-and-such?" and "Will that make me happy?"
Utility based agents
❖Goals alone are not enough to generate high-quality behavior in most
environments. For example, many action sequences will get the taxi to its
destination (thereby achieving the goal) but some are quicker, safer, more reliable,
or cheaper than others.
❖A utility function assigns a score to any given sequence of environment states.
❖Based on utility, the agent chooses the action that maximizes its expected utility (performance measure) over the possible states of the world.
❖Example: a taxi-driving agent can easily distinguish between more and less desirable ways of getting to its destination.
❖The term utility describes "how happy" a given state would make the agent.
Figure: Utility based agents. It uses a model of the world, along with a utility function
that measures its preferences among states of the world. Then it chooses the action that
leads to the best expected utility.
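A minimal sketch of choosing the action with the highest expected utility follows; the outcome model and utility values are invented only for illustration:

    def expected_utility(state, action, outcomes, utility):
        # outcomes(state, action) is assumed to return (probability, next_state) pairs.
        return sum(p * utility(s) for p, s in outcomes(state, action))

    def choose_action(state, actions, outcomes, utility):
        # Pick the action that maximizes expected utility.
        return max(actions, key=lambda a: expected_utility(state, a, outcomes, utility))

    # Made-up example: the 'fast' route is quicker but risks an accident.
    outcomes = lambda s, a: {'fast': [(0.9, 'arrived quickly'), (0.1, 'accident')],
                             'slow': [(1.0, 'arrived late')]}[a]
    utility = lambda s: {'arrived quickly': 10, 'arrived late': 5, 'accident': -100}[s]
    print(choose_action('start', ['fast', 'slow'], outcomes, utility))   # slow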
Learning agents
❖ A learning agent is capable of learning from its experience. It starts with basic knowledge and is then able to act and adapt autonomously, through learning, to improve its performance over time.
❖ A learning agent is divided into four conceptual components.
i. Learning element, which is responsible for making improvements
It uses feedback from the critic on how the agent is doing and determines how the
performance element should be modified to do better in the future.
ii. Performance element, which is responsible for selecting external actions. It is what we have previously considered to be the entire agent: it takes in percepts and decides on actions.
iii. Critic tells the learning element how well the agent is doing with respect to a fixed
performance standard.
iv. Problem generator: It is responsible for suggesting actions that will lead to new
and informative experiences.
Figure: Learning agents
Example
To make the overall design more concrete, let us return to the automated taxi
example. The performance element consists of whatever collection of
knowledge and procedures the taxi has for selecting its driving actions. The taxi
goes out on the road and drives, using this performance element. The critic
observes the world and passes information along to the learning element. For
example, after the taxi makes a quick left turn across three lanes of traffic, the
critic observes the shocking language used by other drivers. From this
experience, the learning element is able to formulate a rule saying this was a
bad action, and the performance element is modified by installation of the new
rule. The problem generator might identify certain areas of behavior in need of
improvement and suggest experiments, such as trying out the brakes on
different road surfaces under different conditions.
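The four components can be wired together as in the skeleton below; every class and function name here is an assumption made for illustration, not a standard API:

    # Skeleton of a learning agent: performance element, critic,
    # learning element, problem generator. All names are illustrative.
    class LearningAgent:
        def __init__(self, performance_element, critic, learning_element, problem_generator):
            self.performance_element = performance_element   # selects external actions
            self.critic = critic                             # judges behavior against a fixed standard
            self.learning_element = learning_element         # improves the performance element
            self.problem_generator = problem_generator       # suggests exploratory actions

        def step(self, percept):
            feedback = self.critic(percept)                            # how well are we doing?
            self.learning_element(self.performance_element, feedback)  # modify future behavior
            exploratory_action = self.problem_generator(percept)       # maybe try something new
            return exploratory_action or self.performance_element(percept)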
Agent programs
The agent programs that we design in this book all have the same skeleton: they take the current percept as input from the sensors and return an action to the actuators. Notice the difference between the agent program, which takes the current percept as input, and the agent function, which takes the entire percept history. The agent program takes just the current percept as input because nothing more is available from the environment; if the agent's actions need to depend on the entire percept sequence, the agent will have to remember the percepts.
We describe the agent programs in a simple pseudocode language. For example, a table-driven agent program keeps track of the percept sequence and then uses it to index into a table of actions to decide what to do. The table, an example of which is given for the vacuum world in Figure 4, represents explicitly the agent function that the agent program embodies.
To build a rational agent in this way, we as designers must construct a table that
contains the appropriate action for every possible percept sequence.
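A sketch of such a table-driven agent program is given below; the single-percept entries follow the tabulation in Figure 4, and any percept sequence outside the table simply falls through to None:

    percepts = []   # persistent percept history

    # Fragment of the vacuum-world table: percept sequence -> action.
    table = {
        (('A', 'Clean'),): 'Right',
        (('A', 'Dirty'),): 'Suck',
        (('B', 'Clean'),): 'Left',
        (('B', 'Dirty'),): 'Suck',
        (('A', 'Clean'), ('A', 'Clean')): 'Right',
        (('A', 'Clean'), ('A', 'Dirty')): 'Suck',
    }

    def table_driven_agent(percept):
        # Append the percept to the history, then look up the whole sequence.
        percepts.append(percept)
        return table.get(tuple(percepts))

    print(table_driven_agent(('A', 'Dirty')))   # Suck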