Module 1
Foundations of AI
Introduction – Agents and rationality – Task environment – Agent
Architecture Types.
INSTRUCTIONAL OBJECTIVE
Overview of AI
At the completion of this lecture, you should understand agents and rationality, task environments, and agent architecture types.
For each possible percept sequence, a rational agent should select an action that is
expected to maximize its performance measure, given the evidence provided by the
percept sequence and whatever built-in knowledge the agent has.
RATIONALITY
Consider the simple vacuum-cleaner agent that cleans a square if it is dirty and moves
to the other square if not. Is this a rational agent?
It depends. First, we need to say what the performance measure is, what is known
about the environment, and what sensors and actuators the agent has.
Let us assume the following:
The performance measure awards one point for each clean square at each time
step, over a "lifetime" of 1000 time steps.
The "geography" of the environment is known a priori (Figure 2.2) but the dirt
distribution and the initial location of the agent are not. Clean squares stay clean and
sucking cleans the current square. The Left and Right actions move the agent left
and right except when this would take the agent outside the environment, in which
case the agent remains where it is.
The only available actions are Left, Right, and Suck.
The agent correctly perceives its location and whether that location contains dirt.
We claim that under these circumstances the agent is indeed rational; its expected
performance is at least as high as any other agent's.
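To make this concrete, here is a minimal Python sketch of the simple reflex vacuum agent described above; the percept format and function names are assumptions made for illustration, not part of the lecture text.

```python
# Minimal sketch of the simple reflex vacuum agent: suck if the current square
# is dirty, otherwise move to the other square. Percepts are assumed to be
# (location, status) pairs; names are illustrative.

def reflex_vacuum_agent(percept):
    location, status = percept
    if status == "Dirty":
        return "Suck"
    elif location == "A":
        return "Right"
    else:  # location == "B"
        return "Left"

print(reflex_vacuum_agent(("A", "Dirty")))   # -> Suck
print(reflex_vacuum_agent(("B", "Clean")))   # -> Left
```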
RATIONALITY
One can see easily that the same agent would be irrational under different
circumstances.
For example, once all the dirt is cleaned up, the agent will oscillate needlessly back and
forth; if the performance measure includes a penalty of one point for each movement
left or right, the agent will fare poorly.
A better agent for this case would do nothing once it is sure that all the squares are
clean.
If clean squares can become dirty again, the agent should occasionally check and re-
clean them if needed.
If the geography of the environment is unknown, the agent will need to explore it
rather than stick to squares A and B.
Omniscience, learning, and autonomy
An omniscient agent knows the actual outcome of its actions and can act accordingly;
but omniscience is impossible in reality.
Consider the following example:
I am walking along the Champs Elysees one day and I see an old friend across the
street. There is no traffic nearby and I'm not otherwise engaged, so, being rational, I
start to cross the street.
Meanwhile, at 33,000 feet, a cargo door falls off a passing airliner, and before I make
it to the other side of the street I am flattened.
Was I irrational to cross the street?
It is unlikely that my obituary would read "Idiot attempts to cross street."
This example shows that rationality is not the same as perfection.
Rationality maximizes expected performance, while perfection maximizes actual
performance.
Omniscience, learning, and autonomy
A rational agent should not only gather information but also learn as much as possible from what it
perceives.
The agent's initial configuration could reflect prior knowledge of the environment, but as the
agent gains experience this may be modified and augmented.
There are extreme cases in which the environment is completely known a priori. In such cases,
the agent need not perceive or learn; it simply acts correctly. Such agents are fragile.
To the extent that an agent relies on the prior knowledge of its designer rather than on its own
percepts, we say that the agent lacks autonomy.
A rational agent should be autonomous—it should learn what it can to compensate for partial
or incorrect prior knowledge.
For example, a vacuum-cleaning agent that learns to foresee where and when additional dirt
will appear will do better than one that does not.
An agent seldom requires complete autonomy from the start: when the agent has had little or no
experience, it would have to act randomly unless the designer gave some assistance.
After sufficient experience of its environment, the behavior of a rational agent can become
effectively independent of its prior knowledge.
Hence, the incorporation of learning allows one to design a single rational agent that will
succeed in a vast variety of environments.
Model-based Reflex Agents
For example, an automated taxi may not be able to see around the large truck that has stopped in front of it and can
only guess about what may be causing the hold-up. Thus, uncertainty about the current state may be unavoidable,
but the agent still has to make a decision.
For example, the taxi may be driving back home, and it may have a rule telling it to fill up with gas on the way home
unless it has at least half a tank.
Although "driving back home" may seem to an aspect of the world state, the fact of the taxi's destination is actually
an aspect of the agent's internal state.
Model-based Reflex Agents
The agent program is shown in Figure 2.12.
The function UPDATE-STATE is responsible for creating the new internal state description.
The details of how models and states are represented vary widely depending on the type of
environment and the particular technology used in the agent design.
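A hedged Python sketch of this agent program is given below; the state update and rule set are illustrative placeholders, since the real representations vary as noted above.

```python
# Sketch of a model-based reflex agent in the spirit of Figure 2.12.
# UPDATE-STATE is simplified to recording the latest percept; a real
# implementation would combine state, last action, percept, and model.

class ModelBasedReflexAgent:
    def __init__(self, rules, model=None, initial_state=None):
        self.rules = rules            # list of (condition, action) pairs
        self.model = model            # knowledge of "how the world evolves"
        self.state = initial_state    # internal description of the world state
        self.last_action = None

    def update_state(self, state, action, percept, model):
        # Placeholder UPDATE-STATE: simply adopt the latest percept as the state.
        return percept

    def __call__(self, percept):
        self.state = self.update_state(self.state, self.last_action, percept, self.model)
        for condition, action in self.rules:      # RULE-MATCH
            if condition(self.state):
                self.last_action = action
                return action
        return None

# Usage with the vacuum world from earlier in the lecture:
agent = ModelBasedReflexAgent(rules=[
    (lambda s: s[1] == "Dirty", "Suck"),
    (lambda s: s[0] == "A", "Right"),
    (lambda s: s[0] == "B", "Left"),
])
print(agent(("A", "Dirty")))   # -> Suck
```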
Goal-based Agents
Knowing something about the current state of the environment is not always enough to decide what
to do.
For example, at a road junction, the taxi can turn left, turn right, or go straight on. The correct
decision depends on where the taxi is trying to get to.
In other words, as well as a current state description, the agent needs some sort of goal information
that describes situations that are desirable—for example, being at the passenger's destination.
The agent program can combine this with the model to choose actions that achieve the goal. Figure
2.13 shows the goal-based agent's structure.
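The idea can be sketched in Python as follows; the model, goal test, and action names are hypothetical helpers invented for the example, not the exact structure of Figure 2.13.

```python
# Sketch of a goal-based choice: predict the result of each action with a model
# and pick an action whose predicted outcome satisfies the goal.

def goal_based_agent(state, actions, model, goal_test):
    for action in actions:
        predicted = model(state, action)   # "What will happen if I do such-and-such?"
        if goal_test(predicted):           # "Will that make me happy?"
            return action
    return None  # no single action reaches the goal; a real agent would search or plan

# Example: at a junction, only turning left leads toward the passenger's destination.
actions = ["TurnLeft", "TurnRight", "Straight"]
model = lambda s, a: "AtDestination" if (s, a) == ("AtJunction", "TurnLeft") else "Elsewhere"
goal_test = lambda s: s == "AtDestination"
print(goal_based_agent("AtJunction", actions, model, goal_test))   # -> TurnLeft
```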
Goal-based Agents
This kind of decision making is different from the condition-action rules described earlier, in that
it involves consideration of the future—both "What will happen if I do such-and-such?" and "Will
that make me happy?"
In the reflex agent designs, this information is not explicitly represented, because the built-in rules
map directly from percepts to actions.
The reflex agent brakes when it sees brake lights.
A goal-based agent could reason that if the car in front has its brake lights on, it will slow down.
Given the way the world usually evolves, the only action that will achieve the goal of not hitting
other cars is to brake.
Although the goal-based agent appears less efficient, it is more flexible because the knowledge that
supports its decisions is represented explicitly and can be modified.
If it starts to rain, the agent can update its knowledge of how effectively its brakes will operate; this
will automatically cause all of the relevant behaviors to be altered to suit the new conditions.
Utility-based Agents
Goals alone are not enough to generate high-quality behavior in most environments. For
example, many action sequences will get the taxi to its destination (thereby achieving the
goal) but some are quicker, safer, more reliable, or cheaper than others.
Goals just provide a binary distinction between "happy" and "unhappy" states.
A more general performance measure should allow a comparison of different world states
according to exactly how happy they would make the agent.
A performance measure assigns a score to any given sequence of environment states, so it
can easily distinguish between more and less desirable ways of getting to the taxi's
destination.
An agent's utility function is essentially an internalization of the performance measure.
If the internal utility function and the external performance measure are in agreement, then
an agent that chooses actions to maximize its utility will be rational according to the
external performance measure.
Utility-based Agents
A utility-based agent has many advantages in terms of flexibility and learning.
Furthermore, in two kinds of cases, goals are inadequate but a utility-based agent can still make
rational decisions.
First, when there are conflicting goals, only some of which can be achieved (for example, speed
and safety), the utility function specifies the appropriate tradeoff.
Second, when there are several goals that the agent can aim for, none of which can be achieved
with certainty, utility provides a way in which the likelihood of success can be weighed against
the importance of the goals.
A rational utility-based agent chooses the action that maximizes the expected utility of the action
outcomes—that is, the utility the agent expects to derive, on average, given the probabilities and
utilities of each outcome.
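In the standard textbook notation (the symbols below are an assumption for illustration, not taken from these slides), this decision rule can be written as:

```latex
% Expected utility of an action a: sum over possible outcome states s' of the
% probability of the outcome given a, times the utility of that outcome.
EU(a) = \sum_{s'} P(s' \mid a)\, U(s'), \qquad a^{*} = \arg\max_{a} EU(a)
```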
Any rational agent must behave as if it possesses a utility function whose expected value it tries
to maximize.
An agent that possesses an explicit utility function can make rational decisions with a general-
purpose algorithm that does not depend on the specific utility function being maximized.
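A short Python sketch of such a general-purpose decision procedure is given below; the outcome probabilities and utility values are invented for illustration.

```python
# Choose the action with maximum expected utility. The algorithm is generic:
# it does not depend on which particular utility function is being maximized.

def expected_utility(action, outcomes, utility):
    """outcomes(action) yields (probability, resulting_state) pairs."""
    return sum(p * utility(state) for p, state in outcomes(action))

def rational_decision(actions, outcomes, utility):
    return max(actions, key=lambda a: expected_utility(a, outcomes, utility))

# Example: the "fast" route is quicker but risky; the "safe" route wins on expectation.
outcomes = {
    "fast": [(0.7, "arrive_early"), (0.3, "accident")],
    "safe": [(1.0, "arrive_on_time")],
}
utility = {"arrive_early": 10, "arrive_on_time": 8, "accident": -100}.get
print(rational_decision(["fast", "safe"], lambda a: outcomes[a], utility))   # -> safe
```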
Utility-based Agents
The utility-based agent is shown in Figure 2.14.
A utility-based agent program works in stochastic or partially observable environments.
A utility-based agent has to model and keep track of its environment, tasks that involve perception,
representation, reasoning, and learning.
Learning Agents
The agent programs described so far are methods for selecting actions. An alternative approach
is to build learning machines and then to teach them.
Learning has another advantage: it allows the agent to operate in initially unknown
environments and to become more competent than its initial knowledge alone would allow.
A learning agent can be divided into four conceptual components, as shown in Figure
2.15.
Learning Agents
The performance element is what we have previously considered to be the entire agent:
it takes in percepts and decides on actions.
The learning element uses feedback from the critic on how the agent is doing with
respect to a fixed performance standard and determines how the performance element
should be modified to do better in the future.
The most important distinction is between the learning element, which is responsible for
making improvements, and the performance element, which is responsible for selecting
external actions.
The design of the learning element depends on the design of the performance element.
The critic is necessary because the percepts themselves provide no indication of the
agent's success.
The last component of the learning agent is the problem generator. It is responsible for
suggesting actions that will lead to new and informative experiences.
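The four components might be wired together as in the Python sketch below; the interfaces are assumptions made for illustration, not a prescribed design.

```python
# Illustrative decomposition of a learning agent into its four conceptual components.

class LearningAgent:
    def __init__(self, performance_element, learning_element, critic, problem_generator):
        self.performance_element = performance_element  # selects external actions
        self.learning_element = learning_element        # makes improvements
        self.critic = critic                            # judges percepts against a fixed standard
        self.problem_generator = problem_generator      # suggests exploratory actions

    def step(self, percept):
        # The critic scores how the agent is doing against the fixed performance standard.
        feedback = self.critic(percept)
        # The learning element uses that feedback to modify the performance element.
        self.learning_element(self.performance_element, feedback)
        # Occasionally the problem generator proposes an exploratory (possibly suboptimal) action.
        exploratory = self.problem_generator(percept)
        return exploratory if exploratory is not None else self.performance_element(percept)
```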
Learning Agents
For example, a chess program could receive a percept indicating that it has checkmated its
opponent, but it needs a performance standard to know that this is a good thing; the percept
itself does not say so.
It is important that the performance standard be fixed.
Conceptually, one should think of it as being outside the agent because the agent must not
modify it to fit its own behavior.
The point is that if the performance element had its way, it would keep doing the actions that are
best, given what it knows.
But if the agent is willing to explore a little and do some suboptimal actions in the short run, it
might discover much better actions for the long run.
The problem generator's job is to suggest these exploratory actions.
Learning Agents
Consider the example of an automated taxi.
The performance element consists of the collection of knowledge and procedures the taxi has for selecting its driving
actions. The taxi goes out on the road and drives, using this performance element.
The critic observes the world and passes information to the learning element. For example, after the taxi makes
a quick left turn across three lanes of traffic, the critic observes the shocking language used by other drivers.
From this experience, the learning element is able to formulate a rule saying this was a bad action, and the
performance element is modified by installation of the new rule.
The problem generator might identify certain areas of behavior in need of improvement and suggest
experiments, such as trying out the brakes on different road surfaces under different conditions.
The learning element can make changes to any of the "knowledge" components.
The simplest cases involve learning directly from the percept sequence.
Observation of pairs of successive states of the environment can allow the agent to learn "How the world
evolves," and observation of the results of its actions can allow the agent to learn "What my actions do.“
For example, if the taxi exerts a certain braking pressure when driving on a wet road, then it will soon find out
how much deceleration is actually achieved.
For example, suppose the taxi-driving agent receives no tips from passengers during the trip.
The external performance standard must inform the agent that the loss of tips is a negative contribution to its
overall performance; then the agent might be able to learn that violent maneuvers do not contribute to its own
utility.
In a sense, the performance standard distinguishes part of the incoming percept as a reward (or penalty) that
provides direct feedback on the quality of the agent's behavior.
How the components of agent programs work
Agent programs consist of various components whose function is to answer questions such as:
"What is the world like now?" "What action should l do now?" "What do my actions do?"
The next question is, "How on earth do these components work?"
The various ways that the components can represent the environment that the agent inhabits are atomic,
factored, and structured.
To illustrate these ideas, it helps to consider a particular agent component, such as the one that deals with
"What my actions do."
This component describes the changes that might occur in the environment as the result of taking an
action, and Figure 2.16 provides schematic depictions of how those transitions might be represented.
How the components of agent programs work
In an atomic representation each state of the world is indivisible—it has no internal structure.
Consider the problem of finding a driving route from one end of a country to the other via some
sequence of cities.
To solve this problem, it may suffice to reduce the state of the world to just the name of the city we
are in – a single atom of knowledge; a "black box" whose only discernible property is that of
being identical to or different from another black box.
The algorithms underlying search and game playing, hidden Markov models, and Markov
decision processes all work with atomic representations, or at least they treat representations
as if they were atomic.
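For illustration (the city names are placeholders), an atomic state in the route-finding problem is just an indivisible label, so the only question we can ask is whether two states are the same.

```python
# Atomic representation: a state is a single label with no internal structure.
state_a = "Arad"
state_b = "Bucharest"

# The only discernible property of an atomic state is identity with another state.
print(state_a == state_b)   # -> False
print(state_a == "Arad")    # -> True
```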
How the components of agent programs work
Consider a higher-fidelity description for the same problem, where we need to be concerned
with more than just which city we are in; we might need to pay attention to
how much gas is in the tank, our current GPS coordinates, whether or not the oil warning light is
working, how much spare change we have for toll crossings, what station is on the radio, and so
on.
A factored representation splits up each state into a fixed set of variables or attributes, each of
which can have a value.
While two different atomic states have nothing in common—they are just different black boxes—
two different factored states can share some attributes (such as being at some particular GPS
location) and not others (such as having lots of gas or having no gas); this makes it much easier to
work out how to turn one state into another.
With factored representations, we can also represent uncertainty—for example, ignorance about
the amount of gas in the tank can be represented by leaving that attribute blank.
Many important areas of AI are based on factored representations, including constraint
satisfaction algorithms, propositional logic, planning, Bayesian networks, and various machine
learning algorithms.
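By contrast, a factored state for the same driving problem might look like the sketch below; the attribute names and values are invented for illustration.

```python
# Factored representation: each state is a fixed set of attributes with values.
state_1 = {"city": "Arad", "gas": 0.75, "gps": (46.18, 21.31), "oil_light_ok": True}
state_2 = {"city": "Arad", "gas": 0.10, "gps": (46.18, 21.31), "oil_light_ok": True}
state_3 = {"city": "Arad", "gps": (46.18, 21.31)}   # gas level unknown: attribute left out

# Unlike atomic states, two factored states can share some attributes and differ in others.
shared = {k for k in state_1 if k in state_2 and state_1[k] == state_2[k]}
print(shared)   # -> {'city', 'gps', 'oil_light_ok'}
```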
How the components of agent programs work
For many purposes, we need to understand the world as having things in it that are related to each other,
not just variables with values.
For example, we might notice that a large truck ahead of us is reversing into the driveway of a dairy farm
but a cow has got loose and is blocking the truck's path.
Instead, we need a structured representation, in which objects such as cows and trucks and their
various and varying relationships can be described explicitly (Figure 2.16(c)).
Structured representations underlie relational databases and first-order logic, first-order probability
models, knowledge-based learning, and much of natural language understanding.
For example, the rules of chess can be written in a page or two of a structured-representation language
such as first-order logic but require thousands of pages when written in a factored-representation
language such as propositional logic.
On the other hand, reasoning and learning become more complex as the expressive power of the
representation increases.
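As a final illustration, a structured state can be sketched as a set of facts about objects and their relations; the predicate and constant names below are invented for the example.

```python
# Structured representation: objects (a truck, a cow) and explicit relations between them.
facts = {
    ("Truck", "T1"),
    ("Cow", "C1"),
    ("ReversingInto", "T1", "DairyFarmDriveway"),
    ("Blocks", "C1", "PathOf", "T1"),
}

# Queries can refer to objects and relations, not just attribute values.
def is_blocked(truck, facts):
    return any(f[0] == "Blocks" and f[3] == truck for f in facts if len(f) == 4)

print(is_blocked("T1", facts))   # -> True
```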