Unit 1-Concept Learning
Concept Learning
• Learning involves acquiring general concepts from specific
training examples. Example: People continually learn
general concepts or categories such as "bird," "car,"
"situations in which I should study more in order to pass the
exam," etc.
• Each such concept can be viewed as describing some subset
of objects or events defined over a larger set
• Alternatively, each concept can be thought of as a Boolean-
valued function defined over this larger set. (Example: A
function defined over all animals, whose value is true for
birds and false for other animals).
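This subset/Boolean-function view can be sketched in a few lines of Python (the animal names and the `is_bird` predicate are illustrative assumptions, not from the text):

```python
# A concept over a set of animal instances, viewed as a Boolean-valued function
animals = ['sparrow', 'eagle', 'dog', 'penguin', 'cat']
birds = {'sparrow', 'eagle', 'penguin'}  # the subset the concept describes

def is_bird(animal):
    """Boolean-valued target concept: True exactly on the subset."""
    return animal in birds

print([a for a in animals if is_bird(a)])  # ['sparrow', 'eagle', 'penguin']
```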
The table describes the example days along with the attributes of the EnjoySport concept.
The Inductive Learning Hypothesis: any hypothesis found to approximate the target function well over a sufficiently large set of training examples will also approximate the target function well over other, unobserved examples.
Consider example 1 :
The data in example 1 is { GREEN, HARD, NO, WRINKLED }, a positive example. Our initial
hypothesis is the most specific one, h0 = { Ø, Ø, Ø, Ø }, which is more specific than this
example, so we generalize it. Hence, the hypothesis becomes :
h1 = { GREEN, HARD, NO, WRINKLED }
Consider example 2 :
Here we see that this example has a negative outcome. Hence we neglect this
example and our hypothesis remains the same.
h2 = { GREEN, HARD, NO, WRINKLED }
Consider example 3 :
Here we see that this example has a negative outcome. Hence we neglect this
example and our hypothesis remains the same.
h3 = { GREEN, HARD, NO, WRINKLED }
Consider example 4 :
The data in example 4 is { ORANGE, HARD, NO, WRINKLED }, a positive example. We compare
every single attribute with the current hypothesis, and wherever a mismatch is found we
replace that particular attribute with the general case ( " ? " ). Only the first attribute
differs (GREEN vs. ORANGE), so the hypothesis
h3 = { GREEN, HARD, NO, WRINKLED } becomes :
h4 = { ?, HARD, NO, WRINKLED }
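The step-by-step generalization above is the Find-S procedure. A minimal Python sketch, using the attribute tuples from the trace; the attribute values of the two negative examples are not given in the text, so placeholder values are used here (Find-S ignores negative examples anyway):

```python
def find_s(examples):
    """Find-S: start with the most specific hypothesis and
    minimally generalize it on each positive example."""
    n = len(examples[0][0])
    h = [None] * n  # maximally specific hypothesis (Ø for every attribute)
    for x, label in examples:
        if not label:            # negative examples are ignored
            continue
        for i, value in enumerate(x):
            if h[i] is None:     # first positive example: copy its values
                h[i] = value
            elif h[i] != value:  # mismatch: generalize that attribute to '?'
                h[i] = '?'
    return h

# Examples from the trace above; negatives use assumed placeholder values
examples = [
    (('GREEN',  'HARD', 'NO',  'WRINKLED'), True),
    (('GREEN',  'SOFT', 'YES', 'SMOOTH'),   False),
    (('BROWN',  'SOFT', 'NO',  'WRINKLED'), False),
    (('ORANGE', 'HARD', 'NO',  'WRINKLED'), True),
]
print(find_s(examples))  # ['?', 'HARD', 'NO', 'WRINKLED']
```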
• Version space:
The version space, denoted VS_{H,D}, with respect to
hypothesis space H and training examples D, is the
subset of hypotheses from H that are consistent
with the training examples in D.
• List-Then-Eliminate Algorithm
1. VersionSpace ← a list containing every hypothesis in H
2. For each training example <x, c(x)>:
   remove from VersionSpace any hypothesis h for which h(x) ≠ c(x)
3. Output the list of hypotheses in VersionSpace
– Guaranteed to output all hypotheses consistent with the training data
– Can be applied whenever the hypothesis space H is finite
– Requires exhaustively enumerating all hypotheses in H (not realistic)
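A short sketch of List-Then-Eliminate, assuming a small conjunctive hypothesis space with a `'?'` wildcard (the encoding and the tiny two-attribute space are assumptions; the algorithm itself works for any finite H):

```python
from itertools import product

def matches(h, x):
    """A conjunctive hypothesis h matches instance x if every attribute
    is either the wildcard '?' or equal to x's value."""
    return all(a == '?' or a == v for a, v in zip(h, x))

def list_then_eliminate(H, examples):
    """Keep only the hypotheses consistent with every <x, c(x)> pair."""
    version_space = list(H)
    for x, cx in examples:
        version_space = [h for h in version_space if matches(h, x) == cx]
    return version_space

# Tiny 2-attribute space: each attribute is a literal value or '?'
values = [('GREEN', 'ORANGE', '?'), ('HARD', 'SOFT', '?')]
H = list(product(*values))
examples = [(('GREEN', 'HARD'), True), (('ORANGE', 'SOFT'), False)]
print(list_then_eliminate(H, examples))
# [('GREEN', 'HARD'), ('GREEN', '?'), ('?', 'HARD')]
```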
Candidate-Elimination
• If d is a positive example
– //Generalize S...
– For each hypothesis s in S that is not consistent with d
» Remove s from S
» Add to S all minimal generalizations h of s such that
h is consistent with d and some member of G is more general than h
» Remove from S any hypothesis that is more general than another h in S
– Remove from G any hypothesis inconsistent with d
Candidate-Elimination: Example Trace
S0: <Ø, Ø, Ø, Ø, Ø, Ø>
d1: <Sunny, Warm, Normal, Strong, Warm, Same>, EnjoySport = Yes
• The learned version space is independent of the order in which the training
examples are presented
• After all, the version space contains exactly the hypotheses consistent with
all the examples
• The S and G boundaries move closer together as more examples are processed,
possibly up to convergence
Candidate-Elimination(contd..)
Final version space (the hypotheses between S4 and G4):
<Sunny, ?, ?, Strong, ?, ?>   <Sunny, Warm, ?, ?, ?, ?>   <?, Warm, ?, Strong, ?, ?>

Full trace:
S0: <Ø, Ø, Ø, Ø, Ø, Ø>        G0: <?, ?, ?, ?, ?, ?>
d1: <Sunny, Warm, Normal, Strong, Warm, Same>, +
S1: <Sunny, Warm, Normal, Strong, Warm, Same>    G1 = G0
d2: <Sunny, Warm, High, Strong, Warm, Same>, +
S2: <Sunny, Warm, ?, Strong, Warm, Same>         G2 = G1
d3: <Rainy, Cold, High, Strong, Warm, Change>, −
S3 = S2    G3: <Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>, <?, ?, ?, ?, ?, Same>
d4: <Sunny, Warm, High, Strong, Cool, Change>, +
S4: <Sunny, Warm, ?, Strong, ?, ?>
G4: <Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>
(The last element of G3 is inconsistent with d4, so it must be removed.)
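The trace above can be reproduced with a compact Candidate-Elimination sketch. The attribute domains are assumed to be the two values seen per attribute in the EnjoySport data; encoding hypotheses as tuples with `'?'` (wildcard) and `None` (the all-Ø hypothesis) is a notational choice:

```python
def matches(h, x):
    return all(a == '?' or a == v for a, v in zip(h, x))

def more_general_or_equal(h1, h2):
    """True if h1 is at least as general as h2."""
    return all(a1 == '?' or a1 == a2 for a1, a2 in zip(h1, h2))

def candidate_elimination(examples, domains):
    n = len(domains)
    S = [None]            # None encodes the all-Ø, maximally specific hypothesis
    G = [('?',) * n]      # the maximally general hypothesis
    for x, cx in examples:
        if cx:  # positive example: prune G, minimally generalize S
            G = [g for g in G if matches(g, x)]
            new_S = []
            for s in S:
                if s is None:
                    new_S.append(tuple(x))        # first positive example
                elif matches(s, x):
                    new_S.append(s)
                else:
                    # minimal generalization: '?' wherever s and x disagree
                    h = tuple(a if a == v else '?' for a, v in zip(s, x))
                    if any(more_general_or_equal(g, h) for g in G):
                        new_S.append(h)
            S = new_S
        else:   # negative example: prune S, minimally specialize G
            S = [s for s in S if s is None or not matches(s, x)]
            new_G = []
            for g in G:
                if not matches(g, x):
                    new_G.append(g)
                    continue
                # minimal specializations: fill one '?' with a value x lacks,
                # keeping only those still more general than some member of S
                for i in range(n):
                    if g[i] != '?':
                        continue
                    for v in domains[i]:
                        if v != x[i]:
                            h = g[:i] + (v,) + g[i + 1:]
                            if any(s is not None and more_general_or_equal(h, s)
                                   for s in S):
                                new_G.append(h)
            # drop any member of G less general than another member
            G = [g for g in new_G
                 if not any(g2 != g and more_general_or_equal(g2, g)
                            for g2 in new_G)]
    return S, G

# Assumed domains: the two values observed per attribute in the EnjoySport data
domains = [('Sunny', 'Rainy'), ('Warm', 'Cold'), ('Normal', 'High'),
           ('Strong', 'Weak'), ('Warm', 'Cool'), ('Same', 'Change')]
examples = [
    (('Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same'),   True),
    (('Sunny', 'Warm', 'High',   'Strong', 'Warm', 'Same'),   True),
    (('Rainy', 'Cold', 'High',   'Strong', 'Warm', 'Change'), False),
    (('Sunny', 'Warm', 'High',   'Strong', 'Cool', 'Change'), True),
]
S, G = candidate_elimination(examples, domains)
print("S:", S)
print("G:", G)
```

Running this yields S4 and G4 as in the trace.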
Bias Strength: from Weak to Strong
– Rote-Learner: simply memorizes the training data and their
classifications; no generalization is involved (no inductive bias)
– Candidate-Elimination: new instances are classified only if all the
hypotheses in the version space agree on the classification (stronger bias)
– Find-S: new instances are classified using the most specific hypothesis
consistent with the training data (strongest bias)