
Module 1

Foundations of AI
Introduction – Agents and rationality – Task environment – Agent
Architecture Types.
INSTRUCTIONAL OBJECTIVE
Overview of AI
At the completion of this lecture, students will be able to:
• Understand the definition of the terms agent and environment
• Understand how agents and environments interact
AGENTS AND ENVIRONMENTS
• An agent is anything that can be viewed as perceiving its environment through
sensors and acting upon that environment through actuators.
• A human agent has eyes, ears, and other organs for sensors and hands, legs, vocal
tract, and so on for actuators.
• A robotic agent might have cameras and infrared range finders for sensors and
various motors for actuators.
• A software agent receives keystrokes, file contents, and network packets as
sensory inputs and acts on the environment by displaying on the screen, writing
files, and sending network packets.
AGENTS AND ENVIRONMENTS
• The term percept refers to the agent's perceptual inputs at any given instant.
• An agent's percept sequence is the complete history of everything the
agent has ever perceived.
• In general, an agent's choice of action at any given instant can depend on
the entire percept sequence observed to date, but not on anything it hasn't
perceived.
• An agent is completely specified by its choice of action for every possible percept sequence.
• An agent's behavior is described by the agent function, which maps any given percept sequence to an action:
f : P* → A    (Agent: Percept* → Action)
AGENTS AND ENVIRONMENTS
• The agent function that describes any given agent can be tabulated.
• The table is constructed by trying out all possible percept sequences and recording which actions the agent performs in response.
• The table is an external characterization of the agent.
• Internally, the agent function for an artificial agent will be
implemented by an agent program.
• The agent function is an abstract mathematical description; the agent
program is a concrete implementation, running within some physical
system.
AGENTS AND ENVIRONMENTS
• Consider the example of the vacuum-cleaner (agent) world.
• This world has just two locations: squares A and B.
• The vacuum agent perceives which square it is in and whether there is dirt in the
square. (Percepts – (Location, content))
• Actions - move left, move right, suck up the dirt, or do nothing ( No operation).
AGENTS AND ENVIRONMENTS
• Initial State- Can be in any squares (A or B)
• Successor function – Generates new states after applying the action left, right,
suck.
• Goal States – Checks whether all the squares are clean
• Path cost – Each step costs 1, so the path cost is the sum of the steps in the path
from initial state to the goal state.

The simple agent function is the following: if the current square is dirty, then suck; otherwise, move to the other square.
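As a minimal sketch (not part of the original notes), this agent function can be written directly in Python; the function name, the square labels 'A' and 'B', and the action strings are illustrative choices that follow the vacuum-world description above.

# Illustrative sketch of the simple vacuum agent function.
def reflex_vacuum_agent(percept):
    """percept is a (location, status) pair, e.g. ('A', 'Dirty')."""
    location, status = percept
    if status == 'Dirty':
        return 'Suck'
    return 'Right' if location == 'A' else 'Left'

# Example: in square A, which is dirty, the agent chooses to suck.
print(reflex_vacuum_agent(('A', 'Dirty')))   # -> Suck
print(reflex_vacuum_agent(('A', 'Clean')))   # -> Right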
AGENTS AND ENVIRONMENTS
• A partial tabulation of this agent function is shown in Figure 2.3.
• The various vacuum-world agents can be defined by filling in the right-hand column
in various ways.
• The obvious question is this:
• What is the right way to fill out the table? In other words, what makes an agent good
or bad, intelligent or stupid?
AGENTS AND ENVIRONMENTS
(Figure: example action sequence in the vacuum world with path cost = 3.)
THE CONCEPT OF RATIONALITY
• A rational agent is one that does the right thing, by considering the consequences
of the agent's behavior.
• Doing the right thing is better than doing the wrong thing, but what does it mean
to do the right thing?
• The answer to this question is:
• When an agent is plunked down in an environment, it generates a sequence of
actions according to the percepts it receives.
• This sequence of actions causes the environment to go through a sequence of
states.
• If the sequence is desirable, then the agent has performed well.
• This notion of desirability is captured by a performance measure that evaluates
any given sequence of environment states.
THE CONCEPT OF RATIONALITY
• There is not one fixed performance measure for all tasks and agents; typically, a
designer will devise one appropriate to the circumstances.
• Consider, for example, the vacuum-cleaner agent.
• The performance measure may be the amount of dirt cleaned up in a single eight-
hour shift.
• A rational agent can maximize this performance measure by cleaning up the dirt,
then dumping it all on the floor, then cleaning it up again, and so on.
• A more suitable performance measure would reward the agent for having a clean
floor.
• For example, one point could be awarded for each clean square at each time step
(perhaps with a penalty for electricity consumed and noise generated).
• As a general rule, it is better to design performance measures according to what
one actually wants in the environment, rather than according to how one thinks
the agent should behave.
RATIONALITY
Definition of a rational agent:
What is rational at any given time depends on four things:
• The performance measure that defines the criterion of success.
• The agent's prior knowledge of the environment.
• The actions that the agent can perform.
• The agent's percept sequence to date.

For each possible percept sequence, a rational agent should select an action that is
expected to maximize its performance measure, given the evidence provided by the
percept sequence and whatever built-in knowledge the agent has.
RATIONALITY
Consider the simple vacuum-cleaner agent that cleans a square if it is dirty and moves
to the other square if not; Is this a rational agent?
It depends. First, we need to say what the performance measure is, what is known
about the environment, and what sensors and actuators the agent has.
Let us assume the following:
The performance measure awards one point for each clean square at each time
step, over a "lifetime" of 1000 time steps.
The "geography" of the environment is known a priori (Figure 2.2) but the dirt
distribution and the initial location of the agent are not. Clean squares stay clean and
sucking cleans the current square. The Left and Right actions move the agent left
and right except when this would take the agent outside the environment, in which
case the agent remains where it is.
The only available actions are Left, Right, and Suck.
The agent correctly perceives its location and whether that location contains dirt.
We claim that under these circumstances the agent is indeed rational; its expected
performance is at least as high as any other agent's.
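To make this claim concrete, the illustrative simulation below awards one point per clean square at each time step over a 1000-step lifetime, matching the assumptions above; the dynamics are simplified (dirt never reappears) and all names are invented for illustration.

# Illustrative simulation of the two-square vacuum world: one point per clean
# square per time step over a fixed lifetime (simplified; dirt never reappears).
def simulate(agent, dirt, location='A', steps=1000):
    score = 0
    world = dict(dirt)                        # e.g. {'A': 'Dirty', 'B': 'Clean'}
    for _ in range(steps):
        action = agent((location, world[location]))
        if action == 'Suck':
            world[location] = 'Clean'
        elif action == 'Right':
            location = 'B'
        elif action == 'Left':
            location = 'A'
        score += sum(status == 'Clean' for status in world.values())
    return score

# The reflex vacuum agent sketched earlier, repeated so this block runs on its own.
def reflex_vacuum_agent(percept):
    location, status = percept
    if status == 'Dirty':
        return 'Suck'
    return 'Right' if location == 'A' else 'Left'

print(simulate(reflex_vacuum_agent, {'A': 'Dirty', 'B': 'Dirty'}))  # 1998 of a possible 2000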
RATIONALITY
One can see easily that the same agent would be irrational under different
circumstances.
For example, once all the dirt is cleaned up, the agent will oscillate needlessly back and
forth; if the performance measure includes a penalty of one point for each movement
left or right, the agent will fare poorly.
A better agent for this case would do nothing once it is sure that all the squares are
clean.
If clean squares can become dirty again, the agent should occasionally check and re-
clean them if needed.
If the geography of the environment is unknown, the agent will need to explore it
rather than stick to squares A and B.
Omniscience, learning, and autonomy
An omniscient agent knows the actual outcome of its actions and can act accordingly;
but omniscience is impossible in reality.
Consider the following example:
I am walking along the Champs Elysees one day and I see an old friend across the
street. There is no traffic nearby and I'm not otherwise engaged, so, being rational, I
start to cross the street.
Meanwhile, at 33,000 feet, a cargo door falls off a passing airliner, and before I make
it to the other side of the street I am flattened.
Was I irrational to cross the street?
It is unlikely that my obituary would read "Idiot attempts to cross street."
This example shows that rationality is not the same as perfection.
Rationality maximizes expected performance, while perfection maximizes actual
performance.
Omniscience, learning, and autonomy
A rational agent not only gathers information but also learns as much as possible from what it perceives.
The agent's initial configuration could reflect prior knowledge of the environment, but as the
agent gains experience this may be modified and augmented.
There are extreme cases in which the environment is completely known a priori. In such cases,
the agent need not perceive or learn; it simply acts correctly. Such agents are fragile.
To the extent that an agent relies on the prior knowledge of its designer rather than on its own percepts, we say that the agent lacks autonomy.
A rational agent should be autonomous—it should learn what it can to compensate for partial
or incorrect prior knowledge.
For example, a vacuum-cleaning agent that learns to foresee where and when additional dirt
will appear will do better than one that does not.
In practice, an agent seldom requires complete autonomy from the start: when the agent has had little or no experience, it would have to act randomly unless the designer gave some assistance.
After sufficient experience of its environment, the behavior of a rational agent can become
effectively independent of its prior knowledge.
Hence, the incorporation of learning allows one to design a single rational agent that will
succeed in a vast variety of environments.
INSTRUCTIONAL OBJECTIVE
Overview of AI
At the completion of this lecture, students will be able to:
• Understand the nature of environments
• Identify the properties of a given task environment
The Nature of Environments
Specifying the task environment:
In discussing the rationality of the simple vacuum-cleaner agent, we had to specify the performance measure, the environment, and the agent's actuators and sensors.
All of these are grouped under the heading of the task environment, abbreviated PEAS (Performance, Environment, Actuators, Sensors).
In designing an agent, the first step must always be to specify the task environment.
Figure 2.4 below summarizes the PEAS description for the taxi's task environment.
The Nature of Environments
The performance measure includes desirable qualities such as getting to the correct destination;
minimizing fuel consumption and wear and tear; minimizing the trip time or cost; minimizing
violations of traffic laws and disturbances to other drivers; maximizing safety and passenger
comfort; maximizing profits.
The driving environment includes a variety of roads, ranging from rural lanes and urban alleys to
12-lane freeways. The roads contain other traffic, pedestrians, stray animals, road works, police
cars, puddles, and potholes. The taxi must also interact with potential and actual passengers.
The actuators available to a human driver include control over the engine through the accelerator and
control over steering and braking. In addition, the taxi will need output to a display screen or voice
synthesizer to talk back to the passengers, and some way to communicate with other vehicles.
The basic sensors for the taxi include one or more controllable video cameras to see the road; it
might augment these with infrared or sonar sensors to detect distances to other cars and obstacles.
To avoid speeding tickets, the taxi should have a speedometer, and to control the vehicle properly,
especially on curves, it should have an accelerometer. To determine the mechanical state of the
vehicle, it will need the usual array of engine, fuel, and electrical system sensors. It might want a
global positioning system (GPS) so that it doesn't get lost. Finally, it will need a keyboard or
microphone for the passenger to request a destination.
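For reference, a PEAS description can be recorded as a simple data structure. The sketch below is illustrative only; the field names are invented and the values paraphrase the taxi description above (Figure 2.4).

# Illustrative: a PEAS description captured as a plain data structure.
from dataclasses import dataclass
from typing import List

@dataclass
class PEAS:
    performance_measure: List[str]
    environment: List[str]
    actuators: List[str]
    sensors: List[str]

taxi = PEAS(
    performance_measure=['safe', 'fast', 'legal', 'comfortable trip', 'maximize profits'],
    environment=['roads', 'other traffic', 'pedestrians', 'customers'],
    actuators=['steering', 'accelerator', 'brake', 'signal', 'horn', 'display'],
    sensors=['cameras', 'sonar', 'speedometer', 'GPS', 'odometer',
             'accelerometer', 'engine sensors', 'keyboard'],
)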
The Nature of Environments
Figure 2.5 shows the basic PEAS elements for a number of additional agent types.
Properties/ Features of Task Environments
Task environments can be categorized along a number of dimensions.
These dimensions determine, to a large extent, the appropriate agent design and the applicability of each of the techniques for agent implementation. First, we list the dimensions; then we analyze several task environments to illustrate the ideas.
Fully observable vs. partially observable:
If an agent's sensors are able to sense or access the complete state of the environment at each point in time, then we say that the task environment is fully observable.
A task environment is effectively fully observable if the sensors detect all aspects that are relevant to the
choice of action; relevance, in turn, depends on the performance measure.
Fully observable environments are convenient because the agent need not maintain any internal state to
keep track of the world.
An environment might be partially observable because of noisy and inaccurate sensors or because parts of
the state are simply missing from the sensor data—for example, a vacuum agent with only a local dirt sensor
cannot tell whether there is dirt in other squares, and an automated taxi cannot see what other drivers are
thinking.
If the agent has no sensors at all then the environment is unobservable.
Examples:
• Chess – the board is fully observable, and so are the opponent’s moves.
• Driving – the environment is partially observable because what’s around the corner is not known.
Properties of Task Environments
Single agent vs. multiagent:
An agent solving a crossword puzzle by itself is in a single-agent environment, whereas an agent playing chess is in a two-agent environment.
Any entity may be viewed as an agent, but it is not always clear which entities must be viewed as agents.
Does an agent A (the taxi driver) have to treat an object B (another vehicle) as an agent, or can it be treated merely as an object?
The key distinction is whether B's behavior is best described as maximizing a performance
measure whose value depends on agent A's behavior.
For example, in chess, the opponent entity B is trying to maximize its performance measure, which, by the rules of chess, minimizes agent A's performance measure. Thus, chess is a competitive multiagent environment.
In the taxi-driving environment, avoiding collisions maximizes the performance measure of
all agents, so it is a partially cooperative multiagent environment. It is also partially competitive because, for example, only one car can occupy a parking space.
Properties of Task Environments
Deterministic vs. stochastic:
If the next state of the environment is completely determined by the current state and the action
executed by the agent, then we say the environment is deterministic; otherwise, it is stochastic.
An example of a deterministic environment is a chess game.
An agent need not worry about uncertainty in a fully observable, deterministic environment.
If the environment is partially observable, however, then it could appear to be stochastic.
Most real situations are so complex that it is impossible to keep track of all the unobserved
aspects; for practical purposes, they must be treated as stochastic.
Taxi driving is stochastic in this sense, because one can never predict the behavior of traffic
exactly; moreover, one's tires blow out and one's engine seizes up without warning.
The vacuum world is deterministic, but variations can include stochastic elements such as
randomly appearing dirt and an unreliable suction mechanism.
We say an environment is uncertain if it is not fully observable or not deterministic.
"Stochastic" generally implies that uncertainty about outcomes is quantified in terms of probabilities;
a nondeterministic environment is one in which actions are characterized by their possible
outcomes, but no probabilities are attached to them.
Nondeterministic environment descriptions are usually associated with performance measures
that require the agent to succeed for all possible outcomes of its actions.
Properties of Task Environments
Episodic vs. sequential:
In an episodic task environment, the agent's experience is divided into atomic episodes.
In each episode the agent receives a percept and then performs a single action. The next
episode does not depend on the actions taken in previous episodes.
Many classification tasks are episodic. For example, an agent that has to spot defective
parts on an assembly line bases each decision on the current part, regardless of previous
decisions; moreover, the current decision doesn't affect whether the next part is
defective.
In sequential environments, the current decision could affect all future decisions.
Chess and taxi driving are sequential: in both cases, short-term actions can have long-term consequences.
Episodic environments are much simpler than sequential environments because the
agent does not need to think ahead.
Properties of Task Environments
Static vs. Dynamic:
If the environment can change while an agent is deliberating, then we say the environment
is dynamic for that agent; otherwise, it is static.
In Static environments, the agent need not keep looking at the world while it is deciding on
an action, nor worry about the passage of time.
Dynamic environments are continuously asking the agent what it wants to do; if it hasn't decided yet, that counts as deciding to do nothing.
If the environment itself does not change with the passage of time but the agent's
performance score does, then we say the environment is semi-dynamic.
Taxi driving is dynamic: the other cars and the taxi itself keep moving while the driving
algorithm dithers about what to do next.
Chess, when played with a clock, is semi-dynamic.
Crossword puzzles are static.
Properties of Task Environments
Discrete vs. continuous:
If an environment has a finite number of distinct states, percepts, and actions, it is said to be a discrete environment.
The discrete/continuous distinction applies to the state of the environment, to the way
time is handled, and to the percepts and actions of the agent.
For example, the chess environment has a finite number of distinct states / moves
(excluding the clock), Chess also has a discrete set of percepts and actions.
An environment in which the states, percepts, or actions cannot be enumerated, i.e. are not discrete, is said to be continuous.
Taxi driving is a continuous-state and continuous-time problem: the speed and location of
the taxi and of the other vehicles sweep through a range of continuous values and do so
smoothly over time.
Taxi-driving actions are also continuous (steering angles, etc.). Input from digital cameras is
discrete, but is typically treated as representing continuously varying intensities and
locations.
Properties of Task Environments
Known vs. unknown:
The distinction refers not to the environment itself but to the agent's or designer's state of
knowledge about the "laws of physics" of the environment.
In a known environment, the outcomes (or outcome probabilities if the environment is
stochastic) for all actions are given.
If the environment is unknown, the agent will have to learn how it works in order to make
good decisions.
It is possible for a known environment to be partially observable—for example, in solitaire
card games, I know the rules but am still unable to see the cards that have not yet been
turned over.
Conversely, an unknown environment can be fully observable—in a new video game, the
screen may show the entire game state but I still don't know what the buttons do until I try
them.
Properties of Task Environments
Figure 2.6 lists the properties of a number of familiar environments.
For example, we describe the part-picking robot as episodic, because it normally considers
each part in isolation. But if one day there is a large batch of defective parts, the robot
should learn from several observations that the distribution of defects has changed, and
should modify its behavior for subsequent parts.
INSTRUCTIONAL OBJECTIVE
Overview of AI
At the completion of this lecture, students will be able to:
• Understand what an agent is and how an agent functions
• Understand the structure of agents
Structure of Agents
The objective of AI is to design an agent program that implements the agent function: the mapping from percepts to actions.
This agent program will run on some sort of computing device with physical sensors and actuators
called the architecture.
Agent = Architecture + Program
The agent programs take the current percept as input from the sensors and return an action to the
actuators.
Note the difference between the agent program, which takes the current percept as input, and the agent function, which takes the entire percept history.
Example (vacuum world): Figure 2.7 shows an agent program that keeps track of the percept sequence and then uses it to index into a table of actions to decide what to do.
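A minimal sketch of such a table-driven agent (illustrative, not from the notes; it assumes the percept sequence is stored as a tuple and the table maps whole sequences to actions):

# Illustrative sketch of a table-driven agent in the spirit of Figure 2.7:
# append each percept to the sequence and look the whole sequence up in a table.
def make_table_driven_agent(table):
    percepts = []                                # the percept sequence so far
    def agent(percept):
        percepts.append(percept)
        return table.get(tuple(percepts))        # None if the sequence is not tabulated
    return agent

# Partial example table for the two-square vacuum world.
table = {
    (('A', 'Dirty'),): 'Suck',
    (('A', 'Clean'),): 'Right',
    (('A', 'Clean'), ('B', 'Dirty')): 'Suck',
}
agent = make_table_driven_agent(table)
print(agent(('A', 'Dirty')))   # -> Suck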
Structure of Agents
There are four basic kinds of agent programs that embody the principles underlying almost all intelligent systems:
Simple reflex agents
Model-based reflex agents
Goal-based agents
Utility-based agents
In addition, learning agents, which can improve any of these designs, are discussed later.
Simple Reflex Agents
The simplest kind of agent is the simple reflex agent.
These agents select actions on the basis of the current percept, ignoring the rest of the
percept history.
For example, the vacuum agent is a simple reflex agent, because its decision is based only
on the current location and on whether that location contains dirt.
The agent program for this agent is shown in Figure 2.8.
Simple Reflex Agents
Imagine yourself as the driver of the automated taxi.
If the car in front brakes, and its brake lights come on, then you should notice this and
initiate braking.
In other words, some processing is done on the visual input to establish the condition we call "The car in front is braking."
Then, this triggers some established connection in the agent program to the action "initiate
braking."
We call such a connection a condition-action rule, written as
if car-in-front-is-braking then initiate-braking.
Humans also have many such connections, some of which are learned responses (driving)
and some of which are innate reflexes (such as blinking when something approaches the
eye).
Simple Reflex Agents
A general and flexible approach is first to build a general-purpose interpreter for condition-
action rules and then to create rule sets for specific task environments.
Figure 2.9 gives the structure of this general program in schematic form, showing how the
condition-action rules allow the agent to make the connection from percept to action.
The rectangles are used to denote the current internal state of the agent's decision process,
and ovals to represent the background information used in the process.
Simple Reflex Agents
The agent program is shown in Figure 2.10.
The INTERPRET-INPUT function generates an abstracted description of the current state
from the percept, and the RULE-MATCH function returns the first rule in the set of rules that
matches the given state description.
The agent in Figure 2.10 will work only if the correct decision can be made on the basis of the current percept alone, that is, only if the environment is fully observable.
For example, the braking rule given earlier assumes that the condition car-in-front-is-braking can be determined from the current percept, a single frame of video.
This works if the car in front has a centrally mounted brake light.
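The INTERPRET-INPUT / RULE-MATCH structure might be sketched as follows; the rule representation here (a predicate on the state paired with an action) is an assumption made for illustration, not the textbook's exact pseudocode.

# Illustrative sketch of a simple reflex agent: interpret the percept into an
# abstract state, then return the action of the first condition-action rule that fires.
def make_simple_reflex_agent(rules, interpret_input):
    def agent(percept):
        state = interpret_input(percept)             # INTERPRET-INPUT
        for condition, action in rules:              # RULE-MATCH: first matching rule
            if condition(state):
                return action
        return 'NoOp'
    return agent

# Toy braking rule: if the car in front is braking, initiate braking.
rules = [(lambda state: state.get('car_in_front_is_braking', False), 'initiate_braking')]
agent = make_simple_reflex_agent(rules, interpret_input=lambda percept: percept)
print(agent({'car_in_front_is_braking': True}))      # -> initiate_braking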
Model-based Reflex Agents
The most effective way to handle partial observability is for the agent to keep track of the part of the world it
can't see now.
That is, the agent should maintain some sort of internal state that depends on the percept history and thereby
reflects at least some of the unobserved aspects of the current state.
For the braking problem, the internal state is not too extensive—just the previous frame from the camera,
allowing the agent to detect when two red lights at the edge of the vehicle go on or off simultaneously.
For other driving tasks such as changing lanes, the agent needs to keep track of where the other cars are if it
can't see them all at once.
And for any driving to be possible at all, the agent needs to keep track of where its keys are.
Updating this internal state information as time goes by requires two kinds of knowledge to be encoded in the
agent program.
First, we need some information about how the world evolves independently of the agent—for example, that
an overtaking car generally will be closer behind than it was a moment ago.
Second, we need some information about how the agent's own actions affect the world—for example, that
when the agent turns the steering wheel clockwise, the car turns to the right, or that after driving for five
minutes northbound on the freeway, one is usually about five miles north of where one was five minutes ago.
This knowledge about "how the world works"—whether implemented in simple Boolean circuits or in
complete scientific theories—is called a model of the world.
An agent that uses such a model is called a model-based agent.
Model-based Reflex Agents
Figure 2.11 gives the structure of the model-based reflex agent with internal state, showing how the current percept is combined with the
old internal state to generate the updated description of the current state, based on the agent's model of how the world works.

For example, an automated taxi may not be able to see around the large truck that has stopped in front of it and can
only guess about what may be causing the hold-up. Thus, uncertainty about the current state may be unavoidable,
but the agent still has to make a decision.
For example, the taxi may be driving back home, and it may have a rule telling it to fill up with gas on the way home
unless it has at least half a tank.
Although "driving back home" may seem to an aspect of the world state, the fact of the taxi's destination is actually
an aspect of the agent's internal state.
Model-based Reflex Agents
The agent program is shown in Figure 2.12.
The function UPDATE-STATE is responsible for creating the new internal state description.
The details of how models and states are represented vary widely depending on the type of
environment and the particular technology used in the agent design.
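A hedged sketch of this structure is given below; how the state, the model, and the rules are represented here is assumed purely for illustration.

# Illustrative sketch of a model-based reflex agent in the spirit of Figure 2.12:
# UPDATE-STATE combines the old state, the last action, the current percept, and
# the model of the world before the condition-action rules are matched.
def make_model_based_agent(rules, update_state, model, initial_state):
    memory = {'state': initial_state, 'last_action': None}
    def agent(percept):
        memory['state'] = update_state(memory['state'], memory['last_action'],
                                       percept, model)
        for condition, action in rules:
            if condition(memory['state']):
                memory['last_action'] = action
                return action
        memory['last_action'] = 'NoOp'
        return 'NoOp'
    return agent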
Goal based Agents
Knowing something about the current state of the environment is not always enough to decide what
to do.
For example, at a road junction, the taxi can turn left, turn right, or go straight on. The correct decision depends on where the taxi is trying to get to.
In other words, as well as a current state description, the agent needs some sort of goal information
that describes situations that are desirable—for example, being at the passenger's destination.
The agent program can combine this with the model to choose actions that achieve the goal. Figure
2.13 shows the goal-based agent's structure.
Goal based Agents
Decision making of this kind is different from the condition-action rules described earlier, in that it involves consideration of the future: both "What will happen if I do such-and-such?" and "Will that make me happy?"
In the reflex agent designs, this information is not explicitly represented, because the built-in rules
map directly from percepts to actions.
The reflex agent brakes when it sees brake lights.
A goal-based agent could reason that if the car in front has its brake lights on, it will slow down.
Given the way the world usually evolves, the only action that will achieve the goal of not hitting
other cars is to brake.
Although the goal-based agent appears less efficient, it is more flexible because the knowledge that
supports its decisions is represented explicitly and can be modified.
If it starts to rain, the agent can update its knowledge of how effectively its brakes will operate; this will automatically cause all of the relevant behaviors to be altered to suit the new conditions.
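The idea can be illustrated with a deliberately simplified one-step-lookahead sketch; real goal-based agents typically search over whole action sequences, and all names below are invented for illustration.

# Illustrative one-step-lookahead goal-based agent: predict each action's result
# with the model and choose an action whose predicted outcome satisfies the goal.
def goal_based_action(state, actions, model, goal_test):
    for action in actions:
        predicted = model(state, action)     # "What will happen if I do this?"
        if goal_test(predicted):             # "Will that achieve my goal?"
            return action
    return None                              # no single action reaches the goal

# Toy example: at a junction, pick the turn whose predicted destination is the goal.
model = lambda state, action: {'left': 'airport', 'right': 'harbour',
                               'straight': 'downtown'}[action]
print(goal_based_action('junction', ['left', 'right', 'straight'],
                        model, goal_test=lambda s: s == 'downtown'))   # -> straight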
Utility based Agents
Goals alone are not enough to generate high-quality behavior in most environments. For
example, many action sequences will get the taxi to its destination (thereby achieving the
goal) but some are quicker, safer, more reliable, or cheaper than others.
Goals just provide a binary distinction between "happy" and "unhappy" states.
A more general performance measure should allow a comparison of different world states
according to exactly how happy they would make the agent.
A performance measure assigns a score to any given sequence of environment states, so it
can easily distinguish between more and less desirable ways of getting to the taxi's
destination.
An agent's utility function is essentially an internalization of the performance measure.
If the internal utility function and the external performance measure are in agreement, then
an agent that chooses actions to maximize its utility will be rational according to the
external performance measure.
Utility based Agents
A utility-based agent has many advantages in terms of flexibility and learning.
Furthermore, in two kinds of cases, goals are inadequate but a utility-based agent can still make
rational decisions.
First, when there are conflicting goals, only some of which can be achieved (for example, speed
and safety), the utility function specifies the appropriate tradeoff.
Second, when there are several goals that the agent can aim for, none of which can be achieved
with certainty, utility provides a way in which the likelihood of success can be weighed against
the importance of the goals.
A rational utility-based agent chooses the action that maximizes the expected utility of the action
outcomes—that is, the utility the agent expects to derive, on average, given the probabilities and
utilities of each outcome.
Any rational agent must behave as if it possesses a utility function whose expected value it tries
to maximize.
An agent that possesses an explicit utility function can make rational decisions with a general-
purpose algorithm that does not depend on the specific utility function being maximized.
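The "maximize expected utility" rule can be written compactly. In the illustrative sketch below, the outcome model is assumed to return (probability, resulting state) pairs for each action; the example numbers are invented for illustration.

# Illustrative sketch of expected-utility maximization: choose the action whose
# probability-weighted average utility over its possible outcomes is highest.
def best_action(state, actions, outcomes, utility):
    def expected_utility(action):
        return sum(p * utility(s) for p, s in outcomes(state, action))
    return max(actions, key=expected_utility)

# Toy example: a fast but risky route versus a reliable slower route.
outcomes = lambda state, action: {
    'fast_route': [(0.7, 'on_time'), (0.3, 'stuck_in_traffic')],
    'slow_route': [(1.0, 'slightly_late')],
}[action]
utility = {'on_time': 10, 'stuck_in_traffic': -5, 'slightly_late': 6}.get
print(best_action('start', ['fast_route', 'slow_route'], outcomes, utility))
# -> slow_route (expected utility 6.0 versus 5.5)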
Utility based Agents
The utility-based agent is shown in Figure 2.14.
Utility-based agent programs can operate in stochastic or partially observable environments.
A utility-based agent has to model and keep track of its environment, tasks that involve perception, representation, reasoning, and learning.
Learning Agents
The agent programs described so far describe methods for selecting actions. Another approach is to build learning machines and then to teach them.
Learning has another advantage: it allows the agent to operate in initially unknown environments and to become more competent than its initial knowledge alone would allow.
A learning agent can be divided into four conceptual components, as shown in Figure
2.15.
Learning Agents
The performance element is what we have previously considered to be the entire agent:
it takes in percepts and decides on actions.
The learning element uses feedback from the critic on how the agent is doing with respect to a fixed performance standard and determines how the performance element should be modified to do better in the future.
The most important distinction is between the learning element, which is responsible for
making improvements, and the performance element, which is responsible for selecting
external actions.
The design of the learning element depends on the design of the performance element.
The critic is necessary because the percepts themselves provide no indication of the
agent's success.
The last component of the learning agent is the problem generator. It is responsible for
suggesting actions that will lead to new and informative experiences.
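An illustrative skeleton of how the four components might fit together in a single decision step is given below; each component is reduced to a plain function, so this is a sketch rather than a standard interface.

# Illustrative skeleton of one step of a learning agent: the critic scores behaviour
# against the fixed performance standard, the learning element adjusts the
# performance element, and the problem generator may propose an exploratory action.
def learning_agent_step(percept, performance_element, critic,
                        learning_element, problem_generator):
    feedback = critic(percept)                                # how well are we doing?
    performance_element = learning_element(performance_element, feedback)
    exploratory = problem_generator(percept)                  # suggest an experiment, or None
    action = exploratory if exploratory is not None else performance_element(percept)
    return action, performance_element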
Learning Agents
For example, a chess program could receive a percept indicating that it has checkmated its
opponent, but it needs a performance standard to know that this is a good thing; the percept
itself does not say so.
It is important that the performance standard be fixed.
Conceptually, one should think of it as being outside the agent because the agent must not
modify it to fit its own behavior.
The point is that if the performance element had its way, it would keep doing the actions that are best, given what it knows.
But if the agent is willing to explore a little and do some suboptimal actions in the short run, it
might discover much better actions for the long run.
The problem generator's job is to suggest these exploratory actions.
Learning Agents
Consider the example of automated taxi.
The performance element consists of collection of knowledge and procedures the taxi has for selecting its driving
actions. The taxi goes out on the road and drives, using this performance element.
The critic observes the world and passes information to the learning element. For example, after the taxi makes
a quick left turn across three lanes of traffic, the critic observes the shocking language used by other drivers.
From this experience, the learning element is able to formulate a rule saying this was a bad action, and the
performance element is modified by installation of the new rule.
The problem generator might identify certain areas of behavior in need of improvement and suggest
experiments, such as trying out the brakes on different road surfaces under different conditions.
The learning element can make changes to any of the "knowledge" components.
The simplest cases involve learning directly from the percept sequence.
Observation of pairs of successive states of the environment can allow the agent to learn "How the world evolves," and observation of the results of its actions can allow the agent to learn "What my actions do."
For example, if the taxi exerts a certain braking pressure when driving on a wet road, then it will soon find out
how much deceleration is actually achieved.
For example, suppose the taxi-driving agent receives no tips from passengers during the trip.
The external performance standard must inform the agent that the loss of tips is a negative contribution to its
overall performance; then the agent might be able to learn that violent maneuvers do not contribute to its own
utility.
In a sense, the performance standard distinguishes part of the incoming percept as a reward (or penalty) that
provides direct feedback on the quality of the agent's behavior.
How the components of agent programs work
The agent programs consist of various components, whose function is to answer questions such as:
"What is the world like now?" "What action should I do now?" "What do my actions do?"
The next question for AI is, "How on earth do these components work?"
The various ways that the components can represent the environment that the agent inhabits are atomic,
factored, and structured.
To illustrate these ideas, it helps to consider a particular agent component, such as the one that deals with
"What my actions do."
This component describes the changes that might occur in the environment as the result of taking an
action, and Figure 2.16 provides schematic depictions of how those transitions might be represented.
How the components of agent programs work
In an atomic representation each state of the world is indivisible—it has no internal structure.
Consider the problem of finding a driving route from one end of a country to the other via some
sequence of cities.
To solve this problem, it may suffice to reduce the state of world to just the name of the city we
are in – a single atom of knowledge; a "black box" whose only discernible property is that of
being identical to or different from another black box.
The algorithms underlying search and game-playing, Hidden Markov models, and Markov
decision processes all work with atomic representations—or, they treat representations as if they
were atomic.
How the components of agent programs work
Consider a higher-fidelity description for the same problem, where we need to be concerned
with more than just atomic location in one city or another; we might need to pay attention to
how much gas is in the tank, our current GPS coordinates, whether or not the oil warning light is
working, how much spare change we have for toll crossings, what station is on the radio, and so
on.
A factored representation splits up each state into a fixed set of variables or attributes, each of
which can have a value.
While two different atomic states have nothing in common—they are just different black boxes—
two different factored states can share some attributes (such as being at some particular GPS
location) and not others (such as having lots of gas or having no gas); this makes it much easier to
work out how to turn one state into another.
With factored representations, we can also represent uncertainty—for example, ignorance about
the amount of gas in the tank can be represented by leaving that attribute blank.
Many important areas of AI are based on factored representations, including constraint satisfaction algorithms, propositional logic, planning, Bayesian networks, and many machine learning algorithms.
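The atomic/factored distinction is easy to see in code. The sketch below is illustrative; the particular attributes, the city name, and the use of None for an unknown value are assumptions made for the example.

# Atomic: the state is an opaque label; two states are either identical or different.
atomic_state = 'Bucharest'

# Factored: the state is split into attributes, each of which can have a value.
factored_state = {
    'city': 'Bucharest',
    'fuel_level': 0.4,               # fraction of a tank
    'gps': (44.43, 26.10),
    'oil_warning_light': False,
    'spare_change': 3.50,
}

# Two factored states can share some attributes and differ in others, and an
# unknown attribute can be left unfilled (represented here by None).
uncertain_state = dict(factored_state, fuel_level=None)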
How the components of agent programs work
For many purposes, we need to understand the world as having things in it that are related to each other,
not just variables with values.
For example, we might notice that a large truck ahead of us is reversing into the driveway of a dairy farm
but a cow has got loose and is blocking the truck's path.
Such situations call for a structured representation, in which objects such as cows and trucks and their various and varying relationships can be described explicitly (Figure 2.16(c)).
Structured representations underlie relational databases and first-order logic, first-order probability
models, knowledge-based learning and much of natural language understanding.
For example, the rules of chess can be written in a page or two of a structured-representation language
such as first-order logic but require thousands of pages when written in a factored-representation
language such as propositional logic.
On the other hand, reasoning and learning become more complex as the expressive power of the
representation increases.
