Chapter 7 Learning
What is Learning?
Learning is an important area in AI, perhaps more so than planning.
Information processes that improve their performance or enlarge their knowledge bases are said
to learn.
Why is it hard?
Intelligence implies that an organism or machine must be able to adapt to new situations.
It must be able to learn to do new things.
This requires knowledge acquisition, inference, updating and refinement of the knowledge base,
acquisition of heuristics, faster search methods, and so on.
Listed below are a few examples of how one may learn. We will look at these in detail shortly.
Skill refinement
-- one can learn by practising, e.g. playing the piano.
Knowledge acquisition
-- one can learn by experience and by storing the experience in a knowledge base. One basic
example of this type is rote learning.
Taking advice
-- Similar to rote learning although the knowledge that is input may need to be transformed (or
operationalised) in order to be used effectively.
Problem Solving
-- if we solve a problem we may learn from the experience. The next time we see a similar
problem we can solve it more efficiently. This does not usually involve gathering new knowledge
but may involve reorganisation of data or remembering how the solution was achieved.
Induction
-- One can learn from examples. Humans often classify things in the world without knowing
explicit rules. This usually involves a teacher or trainer to aid the classification.
Discovery
-- Here one learns knowledge without the aid of a teacher.
Analogy
-- If a system can recognise similarities in information already stored then it may be able to
transfer some knowledge to improve the solution of the task in hand.
Rote Learning
Rote Learning is basically memorisation.
Samuel's Checkers program employed rote learning (it also used parameter adjustment)
Rote learning is basically a simple process. However it does illustrate some issues that are
relevant to more complex learning processes.
Organisation
-- access to the stored value must be faster than it would be to recompute it. Methods such as
hashing, indexing and sorting can be employed to enable this.
E.g. Samuel's program indexed board positions by noting the number of pieces.
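The organisation idea can be sketched in Python as a memoised store indexed by piece count. The board representation (a string of squares) and the caller-supplied evaluation function are assumptions for illustration, not Samuel's actual data structures.

```python
from collections import defaultdict

class RoteMemory:
    def __init__(self):
        # One table per piece count, so each lookup searches a small index,
        # echoing how Samuel's program indexed positions by number of pieces.
        self.tables = defaultdict(dict)

    def lookup_or_compute(self, position, evaluate):
        pieces = sum(1 for square in position if square != ".")
        table = self.tables[pieces]
        if position not in table:        # unseen position: compute and store
            table[position] = evaluate(position)
        return table[position]           # seen before: retrieval beats recomputation

# Usage: a toy evaluator that just counts white pieces ("W").
memory = RoteMemory()
score = memory.lookup_or_compute("W.WB", lambda p: p.count("W"))
```

On the second lookup of the same position the stored value is returned without calling the evaluator at all.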
Generalisation
-- The number of potentially stored objects can be very large. We may need to generalise
some information to make the problem manageable.
E.g. Samuel's program stored game positions only for white to move. Also rotations along the
diagonals are combined.
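Combining symmetric positions can be sketched by storing each position under a canonical form. The toy square board (a tuple of row strings) and the choice of the lexicographically smaller variant are assumptions for illustration.

```python
def canonical(board):
    """board is a tuple of row strings forming a square grid; return the
    smaller of the board and its diagonal reflection (the transpose)."""
    transposed = tuple("".join(row[i] for row in board) for i in range(len(board)))
    return min(board, transposed)

# Usage: both a position and its diagonal reflection share one stored entry.
store = {}
store[canonical(("WB", ".."))] = 1          # store the position once...
value = store[canonical(("W.", "B."))]      # ...and find it via its reflection
```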
Learning by Taking Advice
There are two basic approaches to advice taking:
Take high level, abstract advice and convert it into rules that can guide the performance
elements of the system, i.e. automate all aspects of advice taking.
Develop sophisticated tools such as knowledge base editors and debugging aids. These are
used to help an expert translate his expertise into detailed rules. Here the expert is an
integral part of the learning system. Such tools are important in the expert systems area of
AI.
Request
-- This can be simple question asking about general advice or more complicated by identifying
shortcomings in the knowledge base and asking for a remedy.
Interpret
-- Translate the advice into an internal representation.
Operationalise
-- Translated advice may still not be usable so this stage seeks to provide a representation that
can be used by the performance element.
Integrate
-- When knowledge is added to the knowledge base care must be taken so that bad side-effects
are avoided.
Evaluate
-- The system must assess the new knowledge for errors, contradictions etc.
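The advice-taking cycle above can be sketched as a toy Python pipeline. The interpret, operationalise and evaluate functions here are hypothetical stand-ins for illustration, not components of any real advice-taking system.

```python
def interpret(text):
    # Naive parse: "avoid taking points" -> ("avoid", "taking points").
    action, _, rest = text.partition(" ")
    return (action, rest)

def operationalise(expr):
    # Turn the parsed advice into a crude condition-action rule.
    action, obj = expr
    return {"if": obj, "then": action + " " + obj}

def evaluate(rule, kb):
    # Stand-in check: reject exact duplicates (a crude contradiction test).
    return rule not in kb

def take_advice(text, kb):
    rule = operationalise(interpret(text))   # request -> interpret -> operationalise
    if evaluate(rule, kb):                   # evaluate before integrating
        kb.append(rule)                      # integrate into the knowledge base
    return kb

# Usage: the same advice given twice is only integrated once.
kb = take_advice("avoid taking points", [])
kb = take_advice("avoid taking points", kb)
```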
Instead of automating the five steps above, many researchers have instead assembled tools that
aid the development and maintenance of the knowledge base.
Providing intelligent editors and flexible representation languages for integrating new
knowledge.
Providing debugging tools for evaluating, finding contradictions and redundancy in the existing
knowledge base.
FOO (First Operational Operationaliser) tries to convert high level advice (principles, problems,
methods) into effective executable (LISP) procedures.
Hearts:
Although the possible situations are numerous, general advice can be given, such as "avoid
taking points".
FOO operationalises the advice by translating it into expressions it can use in the game; for
example, it can UNFOLD the definitions of avoid and then trick.
However the advice is still not operational, since it depends on the outcome of the trick, which is
generally not known. Therefore FOO uses case analysis (on the during expression) to determine
which steps could cause one to take points. Step 1 is ruled out, and step 2's take-points is
UNFOLDED.
FOO now has to decide under what conditions (take me c2) occurs during (take
(trick-winner) c1).
A technique called partial matching hypothesises that points will be taken if me = trick-winner
and c2 = c1, which allows the expression to be reduced accordingly.
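The partial-matching step can be sketched as follows. The expression representation (nested tuples of symbols) is an assumption for illustration, not FOO's actual internal form.

```python
def partial_match(expr1, expr2, bindings=None):
    """Return the hypothesised equalities needed for expr1 to match expr2,
    or None if the two expressions cannot be made to coincide."""
    bindings = dict(bindings or {})
    if isinstance(expr1, tuple) and isinstance(expr2, tuple):
        if len(expr1) != len(expr2):
            return None
        for a, b in zip(expr1, expr2):
            bindings = partial_match(a, b, bindings)
            if bindings is None:
                return None
        return bindings
    if expr1 == expr2:
        return bindings
    if expr1 in bindings and bindings[expr1] != expr2:
        return None                       # conflicting hypothesis
    bindings[expr1] = expr2               # hypothesise expr1 = expr2
    return bindings

# Usage: matching (take me c2) against (take (trick-winner) c1) hypothesises
# me = trick-winner and c2 = c1, as in the text.
hyp = partial_match(("take", "me", "c2"), ("take", "trick-winner", "c1"))
```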
This is not quite enough, as it amounts to "do not win a trick that has points". We do not know
who the trick-winner is, and we have not said anything about how to play in a trick in which a
point card is led. After a few more steps to address this, FOO arrives at fully operational advice.
To carry out this operationalisation, FOO needs a knowledge base containing:
basic domain concepts such as trick, hand, deck, suits, avoid, win etc.
Rules and behavioural constraints -- general rules of the game.
Heuristics as to how to UNFOLD.
Learning by Parameter Adjustment
Many programs rely on an evaluation procedure to summarise the state of the search. Game
playing programs provide many examples of this.
In learning, a slight modification of the formulation of the evaluation of the problem is required.
Here the problem has an evaluation function represented as a polynomial of the form:

c1t1 + c2t2 + ... + cntn

where each ti is the value of a feature contributing to the evaluation and each ci is the weight
attached to that feature.
In designing programs it is often difficult to decide on the exact value to give each weight
initially.
Samuel's Checkers program employed 16 such features at any one time, chosen from a pool of
38.
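One way to picture parameter adjustment is a weighted sum of feature values whose weights move to reduce prediction error. This simple delta-rule style update is illustrative only, not Samuel's actual adjustment scheme, and the feature values shown are assumed for the example.

```python
def evaluate(weights, features):
    # The evaluation polynomial c1t1 + c2t2 + ... + cntn.
    return sum(w * f for w, f in zip(weights, features))

def adjust(weights, features, target, rate=0.1):
    # Move each weight in proportion to the error and its feature value.
    error = target - evaluate(weights, features)
    return [w + rate * error * f for w, f in zip(weights, features)]

# Usage: two assumed features (e.g. piece advantage, mobility) with a
# known-good target evaluation of 1.0 for this position.
weights = [0.0, 0.0]
features = [1.0, 2.0]
weights = adjust(weights, features, target=1.0)
```

After one update the evaluation of this position moves from 0.0 toward the target.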
Learning with Macro-Operators
A sequence of actions that can be treated as a whole is called a macro-operator. For example,
making dinner can be described as: lay the table, cook dinner, serve dinner. We could treat
laying the table as one action even though it involves a sequence of actions.
The STRIPS problem solver employed macro-operators in its learning phase.
Consider a blocks world example in which ON(C,B) and ON(A,TABLE) are true. A plan to
achieve the goal ON(A,B), such as UNSTACK(C,B), PUTDOWN(C), PICKUP(A), STACK(A,B),
can be stored as a single macro-operator. But, stated in terms of the specific blocks, it is not very
general. The macro-operator can easily be generalised by using variables in place of the block
names.
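Recording and generalising a macro-operator can be sketched as follows; the plan shown is an assumed blocks-world sequence for achieving ON(A,B), used purely for illustration.

```python
# A successful operator sequence recorded from the blocks-world example.
plan = [("UNSTACK", "C", "B"), ("PUTDOWN", "C"),
        ("PICKUP", "A"), ("STACK", "A", "B")]

def generalise(plan):
    """Replace each distinct block name with a variable (?x1, ?x2, ...),
    keeping repeated names bound to the same variable."""
    mapping = {}
    macro = []
    for step in plan:
        op, *args = step
        new_args = []
        for a in args:
            if a not in mapping:
                mapping[a] = "?x%d" % (len(mapping) + 1)
            new_args.append(mapping[a])
        macro.append((op, *new_args))
    return macro

# Usage: the stored macro-operator now applies to any three blocks.
macro_op = generalise(plan)
```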