
Candidate Elimination Algorithm
Version space

• A version space is a hierarchical representation of knowledge that enables you to keep track of all the useful information supplied by a sequence of learning examples without remembering any of the examples.
• The version space method is a concept learning process accomplished by managing multiple models within a version space.
• A hypothesis h is consistent with a set of training examples D of target concept c if and only if h(x) = c(x) for each training example ⟨x, c(x)⟩ in D.
• The version space VS, with respect to hypothesis space H and training examples D, is the subset of hypotheses from H consistent with all training examples in D.
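
In standard textbook (Mitchell) notation, which the original slides do not show explicitly, these two definitions read:

\mathrm{Consistent}(h, D) \equiv \forall \langle x, c(x) \rangle \in D :\; h(x) = c(x)

\mathrm{VS}_{H,D} = \{\, h \in H \mid \mathrm{Consistent}(h, D) \,\}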
Version Space Characteristics

A version space description consists of two complementary trees:

1. one that contains nodes connected to overly general models, and
2. one that contains nodes connected to overly specific models.

Node values/attributes are discrete.
Diagrammatical Guidelines

• There is a generalization tree and a specialization tree.
• Each node is connected to a model.
• Nodes in the generalization tree are connected to a model that matches everything in its subtree.
• Nodes in the specialization tree are connected to a model that matches only one thing in its subtree.

Links between nodes and their models denote
• generalization relations in a generalization tree, and
• specialization relations in a specialization tree.
Diagram of a Version Space

In the version space diagram (not reproduced here), the specialization tree is colored red and the generalization tree is colored green.

• The key idea in version space learning is that specialization of the general models and generalization of the specific models may ultimately lead to just one correct model that matches all observed positive examples and does not match any negative examples.
Candidate Elimination Algorithm

The Candidate Elimination Algorithm is used to find the set of consistent hypotheses, that is, the version space.
The Candidate Elimination Algorithm computes the version space containing all hypotheses from H that are consistent with an observed sequence of training examples.

Candidate Elimination Algorithm Concept:

• Uses the version space.
• Considers both positive and negative examples.
• For a positive example: generalize the specific hypothesis.
• For a negative example: make the general hypothesis more specific.
Algorithm:

1. Initialize G to the set of maximally general hypotheses in H.
2. Initialize S to the set of maximally specific hypotheses in H.
3. For each training example d:
   - If d is a positive example:
     - Remove from G any hypothesis inconsistent with d.
     - For each hypothesis s in S that is inconsistent with d:
       - Remove s from S.
       - Add to S all minimal generalizations h of s such that h is consistent with d and some member of G is more general than h.
     - Remove from S any hypothesis that is more general than another hypothesis in S.
   - If d is a negative example:
     - Remove from S any hypothesis inconsistent with d.
     - For each hypothesis g in G that is inconsistent with d:
       - Remove g from G.
       - Add to G all minimal specializations h of g such that h is consistent with d and some member of S is more specific than h.
     - Remove from G any hypothesis that is less general than another hypothesis in G.
4. If G or S ever becomes empty, the data are not consistent (with H).
Step 1: Load the data set.
Step 2: Initialize the general hypothesis and the specific hypothesis.
Step 3: For each training example:
Step 4: If the example is positive:
            if attribute_value == hypothesis_value:
                do nothing
            else:
                replace the attribute value with '?' (basically generalizing it)
Step 5: If the example is negative, make the general hypothesis more specific.
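
The steps above translate directly into code. The following is a minimal Python sketch for conjunctive hypotheses over discrete attributes; it assumes (as in the worked examples below) that the first training example is positive, and all names in it are illustrative rather than from any standard library.

def covers(h, x):
    # A conjunctive hypothesis covers an instance if every constraint
    # is '?' or equals the instance's attribute value.
    return all(a == '?' or a == v for a, v in zip(h, x))

def more_general(h1, h2):
    # True if h1 is more general than or equal to h2.
    return all(a == '?' or a == b for a, b in zip(h1, h2))

def generalize(s, x):
    # Minimal generalization of s so that it covers the positive instance x.
    if s is None:                    # phi: adopt the instance itself
        return tuple(x)
    return tuple(a if a == v else '?' for a, v in zip(s, x))

def specialize(g, s, x):
    # Minimal specializations of g that exclude the negative instance x
    # while staying more general than the specific boundary s.
    out = []
    for i, (gv, sv, v) in enumerate(zip(g, s, x)):
        if gv == '?' and sv != '?' and sv != v:
            h = list(g)
            h[i] = sv                # constraining attribute i rules out x
            out.append(tuple(h))
    return out

def candidate_elimination(examples):
    # examples: list of (instance, is_positive) pairs.
    n = len(examples[0][0])
    S = None                              # most specific boundary (phi)
    G = [tuple('?' for _ in range(n))]    # most general boundary
    for x, positive in examples:
        if positive:
            G = [g for g in G if covers(g, x)]    # drop hypotheses missing x
            if S is None or not covers(S, x):
                S = generalize(S, x)
        else:
            if S is not None and covers(S, x):
                # S covering a negative example would empty S
                raise ValueError("training data are not consistent with H")
            new_G = []
            for g in G:
                if covers(g, x):     # g wrongly covers a negative example
                    new_G.extend(specialize(g, S, x))
                else:
                    new_G.append(g)
            new_G = list(dict.fromkeys(new_G))    # remove duplicates
            # keep only hypotheses not less general than another member of G
            G = [g for g in new_G
                 if not any(h != g and more_general(h, g) for h in new_G)]
        if not G:
            raise ValueError("training data are not consistent with H")
    return S, G

Because the hypothesis space is conjunctive, a single hypothesis suffices for the specific boundary S, which is why generalize returns one tuple rather than a set.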
Example 1:

Find the maximally general hypothesis and the maximally specific hypothesis for the training examples given in the table (not reproduced here) using the candidate elimination algorithm.
Step 1:
Initialize G and S as the most general and most specific hypotheses:

G = {'?', '?', '?', '?', '?', '?'}

S = {'φ', 'φ', 'φ', 'φ', 'φ', 'φ'}
Step 2:
For each positive example, make the specific hypothesis more general.

S = {'φ', 'φ', 'φ', 'φ', 'φ', 'φ'}

Take the first positive instance as the most specific hypothesis:

h = {'sunny', 'warm', 'Normal', 'Strong', 'warm', 'same'}

The general hypothesis remains the same: G = {'?', '?', '?', '?', '?', '?'}

Step 3:
Compare each attribute with the next positive instance:
if (attribute value == hypothesis value), do nothing;
else, replace the hypothesis value with the more general constraint '?'.

Since instance 2 is also positive, we compare with it. In instance 2 the humidity attribute changes, so we generalize that attribute:

S = {'sunny', 'warm', '?', 'Strong', 'warm', 'same'}

The general hypothesis remains the same: G = {'?', '?', '?', '?', '?', '?'}

Step 4:
Instance 3 is negative, so for each negative example we make the general hypothesis more specific.
We do this by comparing each attribute of the negative instance with the positive instance; wherever an attribute differs, we create a dedicated hypothesis constraining that attribute:

G = {<'sunny', '?', '?', '?', '?', '?'>,
     <'?', 'warm', '?', '?', '?', '?'>,
     <'?', '?', 'Normal', '?', '?', '?'>,
     <'?', '?', '?', '?', '?', 'same'>}

The specific hypothesis remains the same:

S = {'sunny', 'warm', '?', 'Strong', 'warm', 'same'}
Step 5:
Instance 4 is positive, so repeat Step 3:

S = {'sunny', 'warm', '?', 'Strong', '?', '?'}

Discard any general hypothesis that contradicts the resulting specific hypothesis; here the humidity and forecast hypotheses contradict it:

G = {<'sunny', '?', '?', '?', '?', '?'>, <'?', 'warm', '?', '?', '?', '?'>}

∴ The maximally specific and maximally general hypotheses are:

S = {'sunny', 'warm', '?', 'Strong', '?', '?'}

G = {<'sunny', '?', '?', '?', '?', '?'>, <'?', 'warm', '?', '?', '?', '?'>}
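
This trace matches Mitchell's classic EnjoySport training set, which is assumed below since the slide's own table is not reproduced. Running the earlier sketch on that data yields the same final boundaries (the sketch never generates the 'Normal' hypothesis in G, because by the time the negative instance arrives, S already has '?' for humidity):

# Presumed training data behind the trace above (assumption: Mitchell's
# EnjoySport set, with the slide's casing preserved).
examples = [
    (('sunny', 'warm', 'Normal', 'Strong', 'warm', 'same'), True),
    (('sunny', 'warm', 'High', 'Strong', 'warm', 'same'), True),
    (('rainy', 'cold', 'High', 'Strong', 'warm', 'change'), False),
    (('sunny', 'warm', 'High', 'Strong', 'cool', 'change'), True),
]

S, G = candidate_elimination(examples)
print("S:", S)   # ('sunny', 'warm', '?', 'Strong', '?', '?')
print("G:", G)   # [('sunny', '?', '?', '?', '?', '?'), ('?', 'warm', '?', '?', '?', '?')]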


Example 2:

Find the maximally general hypothesis and the maximally specific hypothesis for the training examples given in the table (not reproduced here) using the candidate elimination algorithm.
S0: (0, 0, 0, 0, 0) (most specific boundary)
G0: (?, ?, ?, ?, ?) (most general boundary)
S1: (0, 0, 0, 0, 0)
G1: (Many, ?, ?, ?, ?), (?, Big, ?, ?, ?), (?, Medium, ?, ?, ?), (?, ?, ?, Exp, ?), (?, ?, ?, ?, One), (?, ?, ?, ?, Few)
S2: (Many, Big, No, Exp, Many)
G2: (Many, ?, ?, ?, ?), (?, Big, ?, ?, ?), (?, ?, ?, Exp, ?), (?, ?, ?, ?, Many)
S3: (Many, ?, No, Exp, ?)
G3: (Many, ?, ?, ?, ?), (?, ?, ?, Exp, ?)
S4: (Many, ?, No, ?, ?)
G4: (Many, ?, ?, ?, ?)
The version space learned by the Candidate Elimination Algorithm for the given data set is:
S: (Many, ?, No, ?, ?)
G: (Many, ?, ?, ?, ?)
Example 3:

Find the maximally general hypothesis and the maximally specific hypothesis for the training examples given in the table (not reproduced here) using the candidate elimination algorithm.
S0: (0, 0, 0) (most specific boundary)
G0: (?, ?, ?) (most general boundary)
S1: (0, 0, 0)
G1: (Small, ?, ?), (?, Blue, ?), (?, ?, Triangle)
S2: (0, 0, 0)
G2: (Small, Blue, ?), (Small, ?, Circle), (?, Blue, ?), (Big, ?, Triangle), (?, Blue, Triangle)
S3: (Small, Red, Circle)
G3: (Small, ?, Circle)
S4: (Small, Red, Circle)
G4: (Small, ?, Circle)
S5: (Small, ?, Circle)
G5: (Small, ?, Circle)
The version space learned by the Candidate Elimination Algorithm for the given data set is:

S = G = (Small, ?, Circle)

Here the two boundaries have converged to a single hypothesis, so the target concept has been learned exactly.
