AI - Unit VI
ARTIFICIAL INTELLIGENCE
UNIT VI
Learning
• Forms of Learning
• Supervised Learning
• Learning Decision Trees
Why Learn?
Understand and improve the efficiency of human learning
◦ Use it to improve methods for teaching and tutoring people (e.g., better computer-aided instruction)
Discover new things or structures that are unknown to humans
◦ Example: data mining, Knowledge Discovery in Databases
Learning agents
Learning element
Design of a learning element is affected by
◦ Which components of the performance element are to be
learned
◦ What feedback is available to learn these components
◦ What representation is used for the components
Type of feedback:
◦ Supervised learning: correct answers for each example
◦ Unsupervised learning: correct answers not given
◦ Reinforcement learning: occasional rewards
Supervised vs. unsupervised learning
Supervised learning: classification is seen as supervised learning from examples.
◦ Supervision: the data (observations, measurements, etc.) are labeled with pre-defined classes. It is as if a "teacher" gives the classes (supervision).
◦ Test data are classified into these classes too.
Supervised learning process: two steps
Learning (training): Learn a model using the training data
Testing: Test the model using unseen test data to assess the model accuracy
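The two-step process above can be sketched in code. This is a minimal illustration, not a real learner: the `MajorityClassifier`, the toy labels, and the accuracy computation are all hypothetical names invented for this sketch.

```python
from collections import Counter

# A trivial "model" that always predicts the most common training label.
# Illustrative only; any real classifier follows the same fit/predict shape.
class MajorityClassifier:
    def fit(self, labels):
        # Step 1 - learning (training): build the model from training data.
        self.prediction = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, examples):
        # The learned model applied to new examples.
        return [self.prediction for _ in examples]

train_labels = ["yes", "yes", "no", "yes"]   # toy training data
test_examples = [1, 2, 3]                    # unseen test examples
test_labels = ["yes", "no", "yes"]           # their true labels

model = MajorityClassifier().fit(train_labels)       # learning step
predictions = model.predict(test_examples)           # testing step
accuracy = sum(p == t for p, t in zip(predictions, test_labels)) / len(test_labels)
print(f"{accuracy:.2f}")
```

Keeping the test data unseen during training is what makes the measured accuracy an honest estimate of how the model will behave on new data.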
Decision tree learning is one of the most widely used techniques for
classification.
◦ Its classification accuracy is competitive with other methods, and
◦ it is very efficient.
Learning decision trees
Problem: decide whether to wait for a table at a restaurant,
based on the following attributes:
1. Alternate: is there an alternative restaurant nearby?
2. Bar: is there a comfortable bar area to wait in?
3. Fri/Sat: is today Friday or Saturday?
4. Hungry: are we hungry?
5. Patrons: number of people in the restaurant (None, Some, Full)
6. Price: price range ($, $$, $$$)
7. Raining: is it raining outside?
8. Reservation: have we made a reservation?
9. Type: kind of restaurant (French, Italian, Thai, Burger)
10. WaitEstimate: estimated waiting time (0-10, 10-30, 30-60, >60)
Decision tree learning
Aim: find a small tree consistent with the training examples
Idea: (recursively) choose "most significant" attribute as root of (sub)tree
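The recursive idea can be sketched as follows. This is an assumption-laden outline, not the full DTL algorithm from the slides: the function names (`learn_tree`, `plurality`, `classify`), the dictionary-based examples, and the simple `score` function (counting examples each split classifies correctly by majority vote) are all invented for illustration; the slides' "most significant attribute" would normally be chosen by information gain, defined below.

```python
from collections import Counter

def plurality(examples):
    # Majority label among the given examples.
    return Counter(e["label"] for e in examples).most_common(1)[0][0]

def score(examples, attr):
    # Toy significance measure: how many examples the split on attr
    # classifies correctly if each child predicts its majority label.
    total = 0
    for value in {e[attr] for e in examples}:
        subset = [e for e in examples if e[attr] == value]
        total += Counter(e["label"] for e in subset).most_common(1)[0][1]
    return total

def learn_tree(examples, attributes, score):
    labels = {e["label"] for e in examples}
    if len(labels) == 1:               # all examples agree: leaf node
        return labels.pop()
    if not attributes:                 # no attributes left: majority leaf
        return plurality(examples)
    best = max(attributes, key=lambda a: score(examples, a))
    tree = {"attr": best, "branches": {}, "default": plurality(examples)}
    for value in {e[best] for e in examples}:
        subset = [e for e in examples if e[best] == value]
        rest = [a for a in attributes if a != best]
        tree["branches"][value] = learn_tree(subset, rest, score)  # recurse
    return tree

def classify(tree, example):
    # Walk the tree until a leaf (a plain label) is reached.
    while isinstance(tree, dict):
        tree = tree["branches"].get(example[tree["attr"]], tree["default"])
    return tree

# Tiny invented dataset: Patrons perfectly separates the labels.
data = [
    {"Patrons": "None", "Hungry": "Yes", "label": "No"},
    {"Patrons": "Some", "Hungry": "No",  "label": "Yes"},
    {"Patrons": "Some", "Hungry": "Yes", "label": "Yes"},
    {"Patrons": "None", "Hungry": "No",  "label": "No"},
]
tree = learn_tree(data, ["Patrons", "Hungry"], score)
print(classify(tree, {"Patrons": "Some", "Hungry": "Yes"}))
```

Because the recursion stops as soon as a subset is pure, attributes that separate the classes well are used near the root and the tree stays small.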
Which Attribute Is the Best Classifier?
Information gain measures how well a given attribute separates the training examples according to their target classification.
To define information gain precisely, we first calculate entropy.
Given a collection S containing positive and negative examples of some target concept, the entropy of S relative to this boolean classification is

Entropy(S) = −p₊ log₂ p₊ − p₋ log₂ p₋

where p₊ is the proportion of positive examples in S and p₋ is the proportion of negative examples.
Entropy Calculation
9 +ve, 5 −ve examples:

Entropy(S) = −(9/14) log₂(9/14) − (5/14) log₂(5/14) ≈ 0.940
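The entropy calculation for the 9-positive, 5-negative collection can be checked with a few lines of Python (the `entropy` helper name is ours):

```python
from math import log2

def entropy(p, n):
    # Entropy of a boolean-labelled collection with p positive
    # and n negative examples.
    total = p + n
    result = 0.0
    for count in (p, n):
        q = count / total
        if q > 0:              # convention: 0 * log2(0) = 0
            result -= q * log2(q)
    return result

print(f"{entropy(9, 5):.3f}")  # the slide's 9 +ve / 5 -ve collection
```

Note the extremes: a pure collection (all positive or all negative) has entropy 0, and a 50/50 split has entropy 1 bit.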
Information Gain Measures the Expected Reduction in Entropy
Trivially, there is a consistent decision tree for any training set, with one path to a leaf for each example (unless f is nondeterministic in x), but it probably won't generalize to new examples.
remainder(A) = Σᵢ₌₁ᵛ (pᵢ + nᵢ)/(p + n) · I(pᵢ/(pᵢ + nᵢ), nᵢ/(pᵢ + nᵢ))

Information Gain (IG), or the reduction in entropy from the attribute test:

IG(A) = I(p/(p + n), n/(p + n)) − remainder(A)
Choose the attribute with the largest IG
Information gain
For the training set, p = n = 6, I(6/12, 6/12) = 1 bit
Patrons has the highest IG of all attributes and so is chosen by the DTL
algorithm as the root
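The Patrons calculation can be reproduced directly from the formulas above. The split counts used here assume the standard 12-example restaurant dataset (Patrons = None gives 0+/2−, Some gives 4+/0−, Full gives 2+/4−); the helper names `I` and `remainder` follow the slide's notation.

```python
from math import log2

def I(p, n):
    # Entropy of a node holding p positive and n negative examples.
    result = 0.0
    for count in (p, n):
        q = count / (p + n)
        if q > 0:              # convention: 0 * log2(0) = 0
            result -= q * log2(q)
    return result

def remainder(children, p, n):
    # Expected entropy after the split; children is a list of
    # (p_i, n_i) counts, one pair per attribute value.
    return sum((pi + ni) / (p + n) * I(pi, ni) for pi, ni in children)

# p = n = 6; Patrons splits the 12 examples as None/Some/Full below.
gain_patrons = I(6, 6) - remainder([(0, 2), (4, 0), (2, 4)], 6, 6)
print(f"{gain_patrons:.3f}")
```

Two of the three branches (None and Some) are pure and contribute zero entropy, which is exactly why Patrons scores so well: IG(Patrons) ≈ 0.541 bits, the largest of any attribute.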
Example contd.
Decision tree learned from the 12 examples: