3.1 INTRODUCTION TO LEARNING AND ITS TYPES

The process of acquiring knowledge and expertise through study, experience, or being taught is called learning. Generally, humans learn in different ways. To make machines learn, we need to simulate the strategies of human learning in machines. But will the computers learn? This question has been raised over many centuries by philosophers, mathematicians and logicians. First, let us address the question - What sorts of tasks can the computers learn? This depends upon the nature of problems that the computers can solve. There are two kinds of problems - well-posed and ill-posed. Computers can solve only well-posed problems, as these have well-defined specifications and have the following components inherent to them:
1. Class of learning tasks (T)
2. A measure of performance (P)
3. A source of experience (E)

The standard definition of learning proposed by Tom Mitchell is that a program can learn from experience E for the task T, with performance measure P, if its performance at T, as measured by P, improves with experience E. Let us formalize the concept of learning as follows:
Let x be the input and X be the input space, which is the set of all inputs. Let Y be the output space, which is the set of all possible outputs, for example, yes/no. Let D be the input dataset with examples (x1, y1), (x2, y2), ..., (xn, yn) for n inputs. Let the unknown target function be f: X -> Y, which maps the input space to the output space. The objective of the learning program is to pick a function g: X -> Y to approximate f. All the possible formulae form a hypothesis space. In short, let H be the set of all formulae from which the learning algorithm chooses. The choice is good when the hypothesis g replicates f for all samples. This is shown in Figure 3.1.
[Figure 3.1: Learning Environment - training samples and the ideal target hypothesis f feed the learning algorithm, which searches the candidate formulae in hypothesis space H and outputs a generated hypothesis g; the error is the difference between f and g.]


It can be observed that the training samples and the target function are dependent on the given problem. The learning algorithm and the hypothesis set are independent of the given problem. Thus, a learning model is informally the hypothesis set together with the learning algorithm, and can be stated as follows:

Learning Model = Hypothesis Set + Learning Algorithm
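
To make this formalism concrete, here is a minimal Python sketch; the threshold family H and the toy dataset D are invented for illustration, not taken from the text. The learning algorithm simply picks the hypothesis g in H with the lowest training error on D.

# Minimal sketch: Learning Model = Hypothesis Set + Learning Algorithm.
# The dataset D and the hypothesis set H below are hypothetical.

# Training examples (x, y) drawn from an unknown target f: X -> Y.
D = [(1.0, 0), (2.5, 0), (4.0, 1), (5.5, 1)]

# Hypothesis set H: threshold functions g(x) = 1 if x > t else 0.
H = [lambda x, t=t: int(x > t) for t in (0.5, 1.5, 3.0, 4.5)]

def training_error(g, data):
    """Fraction of samples on which hypothesis g disagrees with the label y."""
    return sum(g(x) != y for x, y in data) / len(data)

# Learning algorithm: pick the g in H with the lowest training error,
# i.e., the hypothesis that best replicates f on the samples.
g = min(H, key=lambda h: training_error(h, D))
print(training_error(g, D))   # 0.0 for the threshold t = 3.0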
Let us assume a problem of predicting a label for a given input. Let D be the input dataset with both positive and negative examples, and let y be the output with class 0 or 1. The simple learning model can be given as:

Sum of x_i * w_i > Threshold, belongs to class 1, and
Sum of x_i * w_i < Threshold, belongs to another class.

This can be put into a single equation as follows:

h(x) = sign((Sum over i = 1 to n of w_i * x_i) + b)

where x1, x2, ..., xn are the components of the input vector, w1, w2, ..., wn are the weights, and +1 and -1 represent the two classes. This simple model is called a perceptron model. One can simplify this by making w0 = b and fixing x0 as 1; then the model can further be simplified as:

h(x) = sign(w^T x)

This is called the perceptron learning algorithm. The formal learning models are discussed later in Section 3.7 of this chapter.
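
The decision rule above can be written directly in code. The following is a minimal sketch with hypothetical weights and input; w0 plays the role of the bias b, with x0 fixed at 1, as described above.

import numpy as np

def perceptron_predict(w, x):
    """h(x) = sign(w^T x): returns +1 or -1 for the two classes.
    w[0] is the bias b, and x is augmented with x0 = 1."""
    x_aug = np.insert(x, 0, 1.0)          # fix x0 = 1 so that w0 acts as b
    return 1 if np.dot(w, x_aug) > 0 else -1

# Hypothetical weights and input vector, for illustration only.
w = np.array([-0.5, 0.8, 0.3])            # [b, w1, w2]
x = np.array([1.0, 2.0])
print(perceptron_predict(w, x))           # +1: the weighted sum exceeds the threshold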

Classical and Adaptive Machine Learning Systems


A classical machine learning system has components such as Input, Process and Output. The input values are taken from the environment directly. These values are processed and a hypothesis is generated as the output model. This model is then used for making predictions, and the predicted values are consumed by the environment.

In contrast to the classical systems, adaptive systems interact with the environment for getting the input, as direct labelled data are not available. This process is called reinforcement learning. In reinforcement learning, a learning agent interacts with the environment and in return gets feedback. Based on the feedback, the learning agent generates input samples for learning, which are used for generating the learning model. Such learning agents are not static and change their behaviour according to the external signal received from the environment. The feedback is known as a reward, and learning here is the ability of the learning agent to adapt to the environment based on the reward. These are the characteristics of an adaptive system.
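
The adaptive loop described above can be sketched as follows. This is a deliberately simplified illustration, not a full reinforcement learning algorithm: the environment, its reward rule, and the agent's update constant are all hypothetical.

import random

class Environment:
    """Hypothetical two-action environment: action 1 earns a reward."""
    def feedback(self, action):
        return 1.0 if action == 1 else 0.0   # reward signal

class LearningAgent:
    """Adapts its behaviour based on the reward from the environment."""
    def __init__(self):
        self.value = {0: 0.0, 1: 0.0}        # estimated worth of each action
    def act(self):
        # Explore occasionally; otherwise exploit the best-known action.
        if random.random() < 0.1:
            return random.choice([0, 1])
        return max(self.value, key=self.value.get)
    def learn(self, action, reward):
        # Move the estimate toward the observed reward.
        self.value[action] += 0.1 * (reward - self.value[action])

env, agent = Environment(), LearningAgent()
for _ in range(100):                          # agent-environment interaction loop
    a = agent.act()
    r = env.feedback(a)                       # environment returns a reward
    agent.learn(a, r)                         # agent adapts based on the reward
print(agent.value)                            # the value of action 1 typically nears 1.0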

Learning Types
There are different types of learning. Some of the different learning methods are as follows:
1. Learn by memorization or learn by repetition, also called rote learning, is done by memorizing without understanding the logic or concept. Although rote learning is basically learning by repetition, from a machine learning perspective the learning occurs by simply comparing with the existing knowledge for the same input data and producing the output if present.
2. Learn by examples, also called learn by experience or previous knowledge acquired at some time, is like finding an analogy, which means performing inductive learning from observations that formulate a general concept. Here, the learner learns by inferring a general rule from the set of observations or examples. Therefore, inductive learning is also called discovery learning.
3. Learn by being taught by an expert or a teacher is generally called passive learning. However, there is a special kind of learning called active learning, where the learner can interactively query a teacher/expert to label unlabelled data instances with the desired outputs.
4. Learning by critical thinking, also called deductive learning, deduces new facts or conclusions from related known facts and information.
5. Self-learning, also called reinforcement learning, is a self-directed learning that normally learns from mistakes, punishments and rewards.
6. Learning to solve problems is a type of cognitive learning where learning happens in the mind and is possible by devising a methodology to achieve a goal. Here, the learner initially is not aware of the solution or the way to achieve the goal but only knows the goal. The learning happens either directly from the initial state by following the steps to achieve the goal, or indirectly by inferring the behaviour.
7. Learning by explanations, also called explanation-based learning (EBL), is another learning method that exploits domain knowledge from experts to improve the accuracy of learned concepts by supervised learning.

Acquiring a general concept from specific instances of the training dataset is the main challenge of machine learning.

Scan for the figure showing 'Types of Learning'

3.2 INTRODUCTION TO COMPUTATIONAL LEARNING THEORY


There are many questions that have been raised by mathematicians and logicians about the ability of computers to learn. Some of the questions are as follows:
1. How can a learning system predict an unseen instance?
2. How close is the hypothesis h to the target function f, when f itself is unknown?
3. How many samples are required?
4. Can we measure the performance of a learning system?
5. Is the solution obtained local or global?

These questions are the basis of a field called 'Computational Learning Theory', or in short, COLT. It is a specialized field of study of machine learning. COLT deals with formal methods used for learning systems. It deals with frameworks for quantifying learning tasks and learning algorithms. It provides a fundamental basis for the study of machine learning. It deals with Probably Approximately Correct (PAC) learning and Vapnik-Chervonenkis (VC) dimensions. The focus of PAC is the quantification of the computational difficulty of learning tasks and algorithms, while the quantification of computational capacity is the focus of the VC dimension.

Computational Learning Theory uses many concepts from diverse areas such as Theoretical Computer Science, Artificial Intelligence and Statistics. The core concept of COLT is the concept of a learning framework. One such important framework is PAC. The learning framework is discussed in a detailed manner in Section 3.7. COLT focuses on supervised learning tasks. Since the complexity of analyzing is difficult, normally, binary classification tasks are considered for analysis.

3.3 DESIGN OF A LEARNING SYSTEM

A system that is built around a learning algorithm is called a learning system. The design of such a system requires these steps:
1. Choosing a training experience
2. Choosing a target function
3. Representation of a target function
4. Function approximation
Training Experience

Let us consider designing a chess game. In direct experience, individual board states and correct moves of the chess game are given directly. In indirect experience, the move sequences and results are given. The training experience also depends on the presence of a supervisor who can label all valid moves for a board state. In the absence of a supervisor, the game agent plays against itself and learns the good moves, if the training samples cover all scenarios, or in other words, are distributed enough for performance computation. If the training samples and testing samples have the same distribution, the results would be good.

Determine the Target Function

The next step is the determination of a target function. In this step, the type of knowledge that needs to be learnt is determined. In direct experience, a board move is selected and it is determined whether it is a good move or not against all other moves. If it is the best move, then it is chosen as B -> M, where B and M are legal moves. In indirect experience, all legal moves are accepted and a score is generated for each. The move with the largest score is then chosen and executed.

Determine the Target Function Representation

The representation of knowledge may be a table, a collection of rules, or a neural network. The linear combination of board features can be coined as:

V = w0 + w1*x1 + w2*x2 + w3*x3

where x1, x2 and x3 represent different board features, and w0, w1, w2 and w3 represent weights.
Choosing an Approximation Algorithm for the Target Function

The focus is to choose weights that fit the given training samples effectively. The aim is to reduce the error, given as:

E = Sum over the training samples of (V_train(b) - V_hat(b))^2

Here, b is a sample board state, V_train(b) is its training value, and V_hat(b) is the value predicted by the hypothesis. The approximation is carried out as follows:
1. Compute the error as the difference between the trained and expected hypothesis values. Let this be error(b).
2. Then, for every board feature x_i, the weights are updated as:
   w_i = w_i + mu * error(b) * x_i
Here, mu is a constant that moderates the size of the weight update.

Thus, the learning system has the following components:
A Performance system to allow the game to play against itself.
A Critic system to generate the samples.
A Generalizer system to generate a hypothesis based on the samples.
An Experimenter system to generate a new problem based on the currently learnt function. This is sent as input to the performance system.
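
The weight update above is the LMS (least mean squares) rule. A minimal sketch is given below; the board features, training value, and learning constant are hypothetical stand-ins for the chess example.

def lms_update(w, x, v_train, mu=0.01):
    """One LMS step: w_i = w_i + mu * error(b) * x_i.
    w: weights (w[0] is the constant term), x: board features,
    v_train: the training value for this board b."""
    v_hat = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))  # predicted V_hat(b)
    error = v_train - v_hat                                   # error(b)
    w[0] += mu * error                                        # x0 is fixed at 1
    for i, xi in enumerate(x, start=1):
        w[i] += mu * error * xi
    return w

# Hypothetical board: 3 features, desired value 1.0 (say, a winning position).
w = [0.0, 0.0, 0.0, 0.0]        # w0, w1, w2, w3
for _ in range(200):            # repeat the update over the training sample
    w = lms_update(w, x=[2.0, 0.0, 1.0], v_train=1.0)
print(round(w[0] + 2.0 * w[1] + 1.0 * w[3], 2))   # approaches the target value 1.0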
3.4 INTRODUCTION TO CONCEPT LEARNING

Concept learning is a learning strategy of acquiring abstract knowledge, inferring a general concept, or deriving a category from the given training samples. It is a process of abstraction and generalization from the data.

Concept learning helps to classify an object that has a set of common, relevant features. Thus, it helps a learner to compare and contrast categories based on the similarity and association of positive and negative instances in the training data to classify an object. The learner tries to simplify by observing the common features from the training samples and then applies this simplified model to the future samples. This task is also known as learning from experience.

Each concept or category obtained by learning is a Boolean valued function, which takes a true or false value. For example, humans can identify different kinds of animals based on common relevant features and categorize all animals based on specific sets of features. The special features that distinguish one animal from another can be called a concept. This way of learning categories for an object and recognizing new instances of those categories is called concept learning. It is formally defined as inferring a Boolean valued function by processing training instances.

Concept learning requires three things:
1. Input - Training dataset, which is a set of training instances, each labeled with the name of a concept or category to which it belongs. This past experience is used to train and build the model.
2. Output - Target concept or target function f. It is a mapping function f(x) from input x to output y. It is to determine the specific features or common features that identify an object. In other words, it is to find the hypothesis that determines the target concept. For example, the specific set of features to identify an elephant from all animals.
3. Test - New instances to test the learned model.
Formally, concept learning is defined as: "Given a set of hypotheses, the learner searches through the hypothesis space to identify the best hypothesis that matches the target concept".

Consider the following set of training instances shown in Table 3.1.
Table 3.1: Sample Training Instances

S.No.  Horns  Tail   Tusks  Paws  Fur  Color  Hooves  Size    Elephant
1.     No     Short  Yes    No    No   Black  No      Big     Yes
2.     Yes    Short  No     No    No   Brown  Yes     Medium  No
3.     No     Short  Yes    No    No   Black  No      Medium  Yes
4.     No     Long   No     Yes   Yes  White  No      Medium  No
5.     No     Short  Yes    Yes   Yes  Black  No      Big     Yes

Here, in this set of training instances, the independent attributes considered are 'Horns', 'Tail', 'Tusks', 'Paws', 'Fur', 'Color', 'Hooves' and 'Size'. The dependent attribute is 'Elephant'. The target concept is to identify the animal to be an elephant.

Let us now take this example and understand further the concept of hypothesis.

Target Concept: Predict the type of animal - for example, 'Elephant'.

3.4.1 Representation of a Hypothesis

A hypothesis 'h' approximates a target function 'f' to represent the relationship between the independent attributes and the dependent attribute of the training instances. The hypothesis is the predicted approximate model that best maps the inputs to outputs. Each hypothesis is represented as a conjunction of attribute conditions in the antecedent part.

For example, (Tail = Short) AND (Color = Black) ...

The set of hypotheses in the search space is called hypotheses. 'Hypotheses' is the plural form of 'hypothesis'. Generally, 'H' is used to represent the hypotheses and 'h' is used to represent a candidate hypothesis.

Each attribute condition is a constraint on the attribute, represented as an attribute-value pair. In the antecedent of an attribute condition of a hypothesis, each attribute can take the value '?' or 'ϕ', or can hold a single value.
'?' denotes that the attribute can take any value [e.g., Color = ?]
'ϕ' denotes that the attribute cannot take any value, i.e., it represents a null value [e.g., Horns = ϕ]
A single value denotes a specific value from the acceptable values of the attribute, i.e., the attribute 'Tail' can take the value 'Short' [e.g., Tail = Short]

For example, a hypothesis 'h' will look like:

     Horns  Tail  Tusks  Paws  Fur  Color  Hooves  Size
h = <No     ?     Yes    ?     ?    Black  No      Medium>

Given a test instance x, we say h(x) = 1 if the test instance x satisfies this hypothesis h.
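
This satisfaction test is easy to express in code. The sketch below follows the attribute-condition scheme just described, using the string 'phi' for the ϕ constraint; the hypothesis is the one shown above, and the test instance is row 3 of Table 3.1.

def satisfies(h, x):
    """Return 1 if test instance x satisfies hypothesis h, else 0.
    '?' matches any value; 'phi' (the empty constraint) matches nothing."""
    for constraint, value in zip(h, x):
        if constraint == 'phi' or (constraint != '?' and constraint != value):
            return 0
    return 1

# Hypothesis h = <No ? Yes ? ? Black No Medium> over the 8 attributes
# (Horns, Tail, Tusks, Paws, Fur, Color, Hooves, Size).
h = ['No', '?', 'Yes', '?', '?', 'Black', 'No', 'Medium']
x = ['No', 'Short', 'Yes', 'No', 'No', 'Black', 'No', 'Medium']
print(satisfies(h, x))   # 1: every attribute condition is met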
84 " Machine Learning
The training dataset given above has 5 training instances with 8 independent attributes and one dependent attribute. Here, the different hypotheses that can be predicted for the target concept are:

h = <No  ?  Yes  ?  ?  Black  No  Medium>
(or)
h = <No  ?  Yes  ?  ?  Black  No  Big>

The task is to predict the best hypothesis for the target concept (an elephant). The most general hypothesis can allow any value for each of the attributes. It is represented as:
<?, ?, ?, ?, ?, ?, ?, ?>
This hypothesis indicates that any animal can be an elephant.
The most specific hypothesis will not allow any value for any of the attributes. It is represented as:
<ϕ, ϕ, ϕ, ϕ, ϕ, ϕ, ϕ, ϕ>
This hypothesis indicates that no animal can be an elephant.
The target concept mentioned in this example is to identify the conjunction of specific features from the training instances that correctly identify an elephant.
Example 3.1: Explain the concept learning task of an elephant from the dataset given in Table 3.1.
Given,
Input: 5 instances, each with 8 attributes
Target concept/function 'c': Elephant -> {Yes, No}
Hypotheses H: Set of hypotheses, each with conjunctions of literals as propositions [i.e., each literal is represented as an attribute-value pair]
Solution: The hypothesis 'h' for the concept learning task of an elephant is given as:
h = <No  Short  Yes  ?  ?  Black  No  ?>
This hypothesis h is expressed in propositional logic form as below:
(Horns = No) AND (Tail = Short) AND (Tusks = Yes) AND (Paws = ?) AND (Fur = ?) AND (Color = Black) AND (Hooves = No) AND (Size = ?)
Output: Learn the hypothesis 'h' to predict an 'Elephant' such that, for a given test instance x, h(x) = c(x).
The hypothesis produced is also called a concept description, which is a model that can be used to classify subsequent instances.

Thus, concept learning can also be called inductive learning, which tries to induce a general function from specific training instances. A hypothesis learnt this way, producing an approximate target function from a sufficiently large set of training instances, can also approximately classify other unobserved instances; this is called the inductive learning hypothesis. We can only determine an approximate target function because it is very difficult to find an exact target function from the observed training instances. That is why a hypothesis is an approximate target function that best maps the inputs to outputs.
3.4.2 Hypothesis Space

Hypothesis space is the set of all possible hypotheses that approximate the target function f. In other words, the set of all possible approximations of the target function can be defined as the hypothesis space. From this set of hypotheses in the hypothesis space, a machine learning algorithm determines the best possible hypothesis that would best describe the target function or best fit the outputs. Generally, a hypothesis representation language represents a larger hypothesis space. Every machine learning algorithm represents the hypothesis space in a different manner for the function that maps the input variables to output variables. For example, a regression algorithm represents the hypothesis space as a linear function, whereas a decision tree algorithm represents the hypothesis space as a tree.

The set of hypotheses that can be generated by a learning algorithm can be further reduced by specifying a language bias.

The subset of the hypothesis space that is consistent with all observed training instances is called the Version Space. The version space represents the only hypotheses that are used for the classification.
of
For example, each of the attributes given in Table 3.1 has the following possible set of values:
Horns - Yes, No
Tail - Long, Short
Tusks - Yes, No
Paws - Yes, No
Fur - Yes, No
Color - Brown, Black, White
Hooves - Yes, No
Size - Medium, Big

Considering these values for each of the attributes, there are (2 x 2 x 2 x 2 x 2 x 3 x 2 x 2) = 384 distinct instances, covering the 5 instances in the training dataset.

So, we can generate (4 x 4 x 4 x 4 x 4 x 5 x 4 x 4) = 81,920 distinct hypotheses when including two more values [?, ϕ] for each of the attributes. However, any hypothesis containing one or more ϕ symbols represents the empty set of instances; that is, it classifies every instance as a negative instance. Therefore, there will be (3 x 3 x 3 x 3 x 3 x 4 x 3 x 3 + 1) = 8,749 distinct hypotheses by including only '?' for each of the attributes, plus one hypothesis representing the empty set of instances. Thus, the hypothesis space is much larger, and hence we need efficient learning algorithms to search for the best hypothesis from the set of hypotheses.

Hypothesis ordering is also important, wherein the hypotheses are ordered from the most specific to the most general, in order to avoid searching the hypothesis space exhaustively.
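
These counts can be verified with a few lines of Python:

from math import prod

# Number of values per attribute: Horns, Tail, Tusks, Paws, Fur, Color, Hooves, Size.
values = [2, 2, 2, 2, 2, 3, 2, 2]

distinct_instances = prod(values)                      # 2^7 * 3 = 384
syntactic_hypotheses = prod(v + 2 for v in values)     # add '?' and 'phi': 81,920
semantic_hypotheses = prod(v + 1 for v in values) + 1  # only '?', plus one empty: 8,749

print(distinct_instances, syntactic_hypotheses, semantic_hypotheses)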
3.4.3 Heuristic Space Search

Heuristic search is a search strategy that finds an optimized hypothesis/solution to a problem by iteratively improving the hypothesis/solution based on a given heuristic function or a cost measure. Heuristic search methods generate a possible hypothesis that can be a solution in the hypothesis space, or a path from the initial state. This hypothesis is then tested against the target function or the goal state to see if it is a real solution. If the tested hypothesis is a real solution, it is selected. This method generally increases the efficiency because it is guaranteed to find a better hypothesis, though not necessarily the best hypothesis. It is useful for solving tough problems which could not be solved by any other method. The typical example problem solved by heuristic search is the travelling salesman problem.

Several commonly used heuristic search methods are hill climbing methods, constraint satisfaction problems, best-first search, simulated annealing, the A* algorithm, and genetic algorithms.
3.4.4 Generalization and Specialization

In order to understand how we construct a concept hierarchy, let us apply the general principle of the generalization/specialization relation. By generalization of the most specific hypothesis and by specialization of the most general hypothesis, the hypothesis space can be searched for an approximate hypothesis that matches all positive instances but does not match any negative instance.

Searching the Hypothesis Space

There are two ways of learning a hypothesis consistent with all training instances from the large hypothesis space:
1. Specialization - General to Specific learning
2. Generalization - Specific to General learning

Scan for information on 'Additional Examples on Generalization and Specialization'

Generalization - Specific to General Learning: This learning methodology searches through the hypothesis space for an approximate hypothesis by generalizing the most specific hypothesis.

Example 3.2: Consider the training instances shown in Table 3.1 and illustrate Specific to General learning.

Solution: We will start from all false, i.e., the most specific hypothesis, to determine the most restrictive specialization. Consider only the positive instances and generalize the most specific hypothesis. Ignore the negative instances.

This learning is illustrated as follows:
The most specific hypothesis is taken first, which will not classify any instance to true.
h = <ϕ  ϕ  ϕ  ϕ  ϕ  ϕ  ϕ  ϕ>

Read the first instance I1 to generalize the hypothesis h so that this positive instance can be classified by the hypothesis h1.
I1: No  Short  Yes  No  No  Black  No  Big    Yes (Positive instance)
h1 = <No  Short  Yes  No  No  Black  No  Big>

When reading the second instance I2, it is a negative instance, so ignore it.
I2: Yes  Short  No  No  No  Brown  Yes  Medium    No (Negative instance)
h2 = <No  Short  Yes  No  No  Black  No  Big>

Similarly, when reading the third instance I3, it is a positive instance, so generalize h2 to h3 to accommodate it. The resulting h3 is generalized.
I3: No  Short  Yes  No  No  Black  No  Medium    Yes (Positive instance)
h3 = <No  Short  Yes  No  No  Black  No  ?>

Ignore I4, since it is a negative instance.
I4: No  Long  No  Yes  Yes  White  No  Medium    No (Negative instance)
h4 = <No  Short  Yes  No  No  Black  No  ?>

When reading the fifth instance I5, h4 is further generalized to h5.
I5: No  Short  Yes  Yes  Yes  Black  No  Big    Yes (Positive instance)
h5 = <No  Short  Yes  ?  ?  Black  No  ?>

Now, after observing all the positive instances, an approximate hypothesis h5 is generated which can classify any subsequent positive instance to true.

Specialization - General to Specific Learning: This learning methodology searches through the hypothesis space for an approximate hypothesis by specializing the most general hypothesis.

Example 3.3: Illustrate learning by Specialization - General to Specific learning for the data instances shown in Table 3.1.

Solution: Start from the most general hypothesis, which will make true all positive and negative instances.
Initially,
h = <?  ?  ?  ?  ?  ?  ?  ?>
h is the most general hypothesis and classifies all instances to true.

I1: No  Short  Yes  No  No  Black  No  Big    Yes (Positive instance)
h1 = <?  ?  ?  ?  ?  ?  ?  ?>

I2: Yes  Short  No  No  No  Brown  Yes  Medium    No (Negative instance)
h2 = <No  ?  ?  ?  ?  ?  ?  ?>
     <?  ?  Yes  ?  ?  ?  ?  ?>
     <?  ?  ?  ?  ?  Black  ?  ?>
     <?  ?  ?  ?  ?  ?  No  ?>
     <?  ?  ?  ?  ?  ?  ?  Big>
h2 imposes constraints so that it will not classify a negative instance to true.

I3: No  Short  Yes  No  No  Black  No  Medium    Yes (Positive instance)
h3 = <No  ?  ?  ?  ?  ?  ?  ?>
     <?  ?  Yes  ?  ?  ?  ?  ?>
     <?  ?  ?  ?  ?  Black  ?  ?>
     <?  ?  ?  ?  ?  ?  No  ?>
     <?  ?  ?  ?  ?  ?  ?  Big>

I4: No  Long  No  Yes  Yes  White  No  Medium    No (Negative instance)
Remove any hypothesis inconsistent with this negative instance.
h4 = <?  ?  Yes  ?  ?  ?  ?  ?>
     <?  ?  ?  ?  ?  Black  ?  ?>
     <?  ?  ?  ?  ?  ?  ?  Big>

I5: No  Short  Yes  Yes  Yes  Black  No  Big    Yes (Positive instance)
h5 = <?  ?  Yes  ?  ?  ?  ?  ?>
     <?  ?  ?  ?  ?  Black  ?  ?>
     <?  ?  ?  ?  ?  ?  ?  Big>

Thus, h5 is the set of hypotheses generated which will classify the positive instances to true and the negative instances to false.
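
The specialization step used in Example 3.3 can be sketched as follows: given a negative instance, each '?' in the most general hypothesis is constrained, one attribute at a time, to a domain value that differs from that instance, and only the specializations that still accept the earlier positive instance I1 are kept. The helper names below are illustrative; running the sketch reproduces the five hypotheses in h2 above.

def matches(h, x):
    """True if instance x satisfies hypothesis h ('?' matches any value)."""
    return all(c == '?' or c == v for c, v in zip(h, x))

def specialize(h, negative, domains):
    """Minimal specializations of h that reject the negative instance:
    constrain one '?' at a time to a value differing from that instance."""
    return [h[:i] + [v] + h[i + 1:]
            for i, c in enumerate(h) if c == '?'
            for v in domains[i] if v != negative[i]]

# Attribute domains from Table 3.1.
domains = [['Yes', 'No'], ['Long', 'Short'], ['Yes', 'No'], ['Yes', 'No'],
           ['Yes', 'No'], ['Brown', 'Black', 'White'], ['Yes', 'No'],
           ['Medium', 'Big']]

h0 = ['?'] * 8                                                      # most general
I1 = ['No', 'Short', 'Yes', 'No', 'No', 'Black', 'No', 'Big']       # positive
I2 = ['Yes', 'Short', 'No', 'No', 'No', 'Brown', 'Yes', 'Medium']   # negative

# Keep only the specializations that still accept the positive instance I1,
# as done in Example 3.3 when forming h2.
h2 = [s for s in specialize(h0, I2, domains) if matches(s, I1)]
for s in h2:
    print(s)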

3.4.5 Hypothesis Space Search by Find-S Algorithm

The Find-S algorithm is guaranteed to converge to the most specific hypothesis in H that is consistent with the positive instances in the training dataset. Obviously, it will also be consistent with the negative instances. The algorithm considers only the positive instances and ignores negative instances while generating the hypothesis. It initially starts with the most specific hypothesis.

Algorithm 3.1: Find-S
Input: Positive instances in the training dataset
Output: Hypothesis 'h'
1. Initialize 'h' to the most specific hypothesis:
   h = <ϕ  ϕ  ϕ  ϕ ...>
2. Generalize the initial hypothesis for the first positive instance [since 'h' is the most specific].
3. For each subsequent instance:
   If it is a positive instance,
      Check each attribute value in the instance against the hypothesis 'h':
         If the attribute value is the same as the hypothesis value, then do nothing;
         Else, if the attribute value is different from the hypothesis value, change it to '?' in 'h'.
   Else, if it is a negative instance,
      Ignore it.
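
A direct Python rendering of Algorithm 3.1 is sketched below, assuming instances are represented as lists of attribute values with the class label ('Yes'/'No') last. Running it on the Table 3.1 data reproduces the hypothesis h5 from Example 3.2.

def find_s(training_data):
    """Find-S: return the most specific hypothesis consistent with
    the positive instances. 'phi' marks the initial empty constraint."""
    n = len(training_data[0]) - 1            # number of attributes
    h = ['phi'] * n                          # most specific hypothesis
    for *x, label in training_data:
        if label != 'Yes':                   # ignore negative instances
            continue
        for i, value in enumerate(x):
            if h[i] == 'phi':                # first positive instance
                h[i] = value
            elif h[i] != value:              # mismatch: generalize to '?'
                h[i] = '?'
    return h

# Table 3.1: Horns, Tail, Tusks, Paws, Fur, Color, Hooves, Size, Elephant.
data = [
    ['No',  'Short', 'Yes', 'No',  'No',  'Black', 'No',  'Big',    'Yes'],
    ['Yes', 'Short', 'No',  'No',  'No',  'Brown', 'Yes', 'Medium', 'No'],
    ['No',  'Short', 'Yes', 'No',  'No',  'Black', 'No',  'Medium', 'Yes'],
    ['No',  'Long',  'No',  'Yes', 'Yes', 'White', 'No',  'Medium', 'No'],
    ['No',  'Short', 'Yes', 'Yes', 'Yes', 'Black', 'No',  'Big',    'Yes'],
]
print(find_s(data))
# ['No', 'Short', 'Yes', '?', '?', 'Black', 'No', '?'] -- h5 of Example 3.2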

Example 3.4: Consider the training dataset of 4 instances shown in Table 3.2. It contains the details of the performance of students and the likelihood of their getting a job offer or not in their final semester. Apply the Find-S algorithm.

Table 3.2: Training Dataset

CGPA  Interactiveness  Practical Knowledge  Communication Skills  Logical Thinking  Interest  Job Offer
≥9    Yes              Excellent            Good                  Fast              Yes       Yes
≥9    Yes              Good                 Good                  Fast              Yes       Yes
≥8    No               Good                 Good                  Fast              No        No
≥9    Yes              Good                 Good                  Slow              No        Yes

Solution:
Step 1: Initialize 'h' to the most specific hypothesis. There are 6 attributes, so for each attribute we initially fill 'ϕ' in the initial hypothesis 'h'.
h = <ϕ  ϕ  ϕ  ϕ  ϕ  ϕ>

Step 2: Generalize the initial hypothesis for the first positive instance. I1 is a positive instance, so generalize the most specific hypothesis 'h' to include this positive instance. Hence,
I1: ≥9  Yes  Excellent  Good  Fast  Yes    Positive instance
h = <≥9  Yes  Excellent  Good  Fast  Yes>

Step 3: Scan the next instance I2. Since I2 is a positive instance, generalize 'h' to include it. For each non-matching attribute value in 'h', put a '?' to include this positive instance. The third attribute value in 'h' mismatches I2, so put a '?'.
I2: ≥9  Yes  Good  Good  Fast  Yes    Positive instance
h = <≥9  Yes  ?  Good  Fast  Yes>

Now, scan I3. Since it is a negative instance, ignore it. Hence, the hypothesis remains the same, without any change, after scanning I3.
I3: ≥8  No  Good  Good  Fast  No    Negative instance
h = <≥9  Yes  ?  Good  Fast  Yes>

Now scan I4. Since it is a positive instance, check for mismatches of the hypothesis 'h' with I4. The 5th and 6th attribute values mismatch, so add '?' to those attributes in 'h'.
I4: ≥9  Yes  Good  Good  Slow  No    Positive instance
h = <≥9  Yes  ?  Good  ?  ?>

Now, the final hypothesis generated with the Find-S algorithm is:
h = <≥9  Yes  ?  Good  ?  ?>
It includes all positive instances and obviously ignores any negative instance.
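
As a cross-check, running the find_s sketch given after Algorithm 3.1 on the Table 3.2 data (with the strings '>=9' and '>=8' standing in for the CGPA values) reproduces this final hypothesis:

# Table 3.2: CGPA, Interactiveness, Practical Knowledge, Communication
# Skills, Logical Thinking, Interest, Job Offer.
data = [
    ['>=9', 'Yes', 'Excellent', 'Good', 'Fast', 'Yes', 'Yes'],
    ['>=9', 'Yes', 'Good',      'Good', 'Fast', 'Yes', 'Yes'],
    ['>=8', 'No',  'Good',      'Good', 'Fast', 'No',  'No'],
    ['>=9', 'Yes', 'Good',      'Good', 'Slow', 'No',  'Yes'],
]
print(find_s(data))   # ['>=9', 'Yes', '?', 'Good', '?', '?']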

Limitations of Find-S Algorithm

1. The Find-S algorithm tries to find a hypothesis that is consistent with the positive instances, ignoring all negative instances. As long as the training dataset is consistent, the hypothesis found by this algorithm may be consistent.
2. The algorithm finds only one unique hypothesis, whereas there may be many other hypotheses that are consistent with the training dataset.
