MACHINE LEARNING MODELS

The basic idea for creating a taxonomy of algorithms is that we divide the instance space in one of three
ways:
o Using a logical expression.
o Using the geometry of the instance space.
o Using probability to classify the instance space.

The segmentation of the instance space produced by a machine learning algorithm using the above
techniques should be exhaustive (cover all possible outcomes) and mutually exclusive (non-overlapping).

1. Logical models
1.1 Logical models - Tree models and Rule models
Logical models use a logical expression to divide the instance space into segments and hence construct
grouping models. A logical expression is an expression that returns a Boolean value, i.e., a True or False
outcome. Once the data is grouped using a logical expression, the data is divided into homogeneous
groupings for the problem we are trying to solve. For example, for a classification problem, all the
instances in the group belong to one class.
There are mainly two kinds of logical models: Tree models and Rule models.
Rule models consist of a collection of implications or IF-THEN rules. For tree-based models, the ‘if-part’
defines a segment and the ‘then-part’ defines the behaviour of the model for this segment. Rule models
follow the same reasoning.
Tree models can be seen as a particular type of rule model where the if-parts of the rules are organised in a
tree structure. Both Tree models and Rule models use the same approach to supervised learning. The
approach can be summarised in two strategies: we could first find the body of the rule (the concept)
that covers a sufficiently homogeneous set of examples and then find a label to represent the body.
Alternatively, we could approach it from the other direction, i.e., first select a class we want to learn and
then find rules that cover examples of the class.
A simple tree-based model illustrates this for the Titanic dataset ("sibsp" is the number of siblings or
spouses aboard). Each leaf of such a tree shows the probability of survival and the percentage of
observations it covers. The model can be summarised as: your chances of survival were good if you were
(i) a female, or (ii) a male younger than 9.5 years with fewer than 2.5 siblings or spouses aboard.
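A minimal sketch of fitting such a tree, assuming scikit-learn; the tiny inline dataset and the column names
sex, age, and sibsp are illustrative stand-ins, not the real Titanic data:

# Sketch: fitting a small decision tree like the Titanic example above.
# The inline rows and column names (sex, age, sibsp) are made up.
from sklearn.tree import DecisionTreeClassifier, export_text

# Features: sex (0 = female, 1 = male), age in years, sibsp (siblings/spouses aboard)
X = [
    [0, 29.0, 0],  # female          -> survived
    [0, 35.0, 1],  # female          -> survived
    [1,  4.0, 1],  # young male      -> survived
    [1,  8.0, 0],  # young male      -> survived
    [1, 30.0, 0],  # adult male      -> died
    [1, 54.0, 0],  # adult male      -> died
    [1, 10.0, 4],  # boy, many sibsp -> died
    [1, 40.0, 1],  # adult male      -> died
]
y = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = survived, 0 = died

tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(X, y)

# Print the learned if-then structure of the tree.
print(export_text(tree, feature_names=["sex", "age", "sibsp"]))

The printed if-then rules make explicit the connection between tree models and rule models described
above: each path from root to leaf is one IF-THEN rule.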
1.2 Logical models and Concept learning
To understand logical models further, we need to understand the idea of Concept Learning. Concept
Learning involves learning logical expressions or concepts from examples. The idea of Concept Learning
fits in well with the idea of Machine learning, i.e., inferring a general function from specific training
examples. Concept learning forms the basis of both tree-based and rule-based models. More formally,
Concept Learning involves acquiring the definition of a general category from a given set of positive and
negative training examples of the category. A formal definition of Concept Learning is “the inferring of a
Boolean-valued function from training examples of its input and output.” In concept learning, we only
learn a description for the positive class and label everything that doesn’t satisfy that description as
negative.
The following example explains this idea in more detail.

A Concept Learning Task called “Enjoy Sport” is defined by a set of data from some example days. Each
example is described by six attributes. The task is to learn to predict the value of Enjoy Sport for an
arbitrary day based on the values of its attributes. The problem can be represented by
a series of hypotheses. Each hypothesis is described by a conjunction of constraints on the attributes. The
training data represents a set of positive and negative examples of the target function. In the example
above, each hypothesis is a vector of six constraints, specifying the values of the six attributes – Sky,
AirTemp, Humidity, Wind, Water, and Forecast. The training phase involves learning the set of days (as a
conjunction of attributes) for which Enjoy Sport = yes.
Thus, the problem can be formulated as:
Given instances X which represent a set of all possible days, each described by the attributes:
o Sky – (values: Sunny, Cloudy, Rainy),
o AirTemp – (values: Warm, Cold),
o Humidity – (values: Normal, High),
o Wind – (values: Strong, Weak),
o Water – (values: Warm, Cold),
o Forecast – (values: Same, Change).

Try to identify a function that can predict the target variable Enjoy Sport as yes/no, i.e., 1 or 0.
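To make the formulation concrete, below is a minimal sketch of the classic Find-S algorithm, which learns
the most specific conjunctive hypothesis covering all the positive examples; the four training days are
made up for illustration but use the attribute values listed above.

# Find-S: learn the most specific conjunction of attribute constraints that
# covers every positive example. The training days below are illustrative.
ATTRIBUTES = ["Sky", "AirTemp", "Humidity", "Wind", "Water", "Forecast"]

training_data = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"),   "Yes"),
    (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"),   "Yes"),
    (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), "No"),
    (("Sunny", "Warm", "High",   "Strong", "Cold", "Change"), "Yes"),
]

def find_s(examples):
    hypothesis = None  # start with the most specific hypothesis
    for attrs, label in examples:
        if label != "Yes":
            continue  # Find-S ignores negative examples
        if hypothesis is None:
            hypothesis = list(attrs)  # first positive example, taken as-is
        else:
            # Generalise each constraint the example violates to the wildcard '?'.
            hypothesis = [h if h == a else "?" for h, a in zip(hypothesis, attrs)]
    return hypothesis

print(find_s(training_data))  # -> ['Sunny', 'Warm', '?', 'Strong', '?', '?']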
1.3 Concept learning as a search problem and as Inductive Learning
We can also formulate Concept Learning as a search problem. We can think of concept learning as
searching through a predefined space of potential hypotheses to identify the hypothesis that best fits
the training examples. Concept learning is also an example of Inductive Learning. Inductive learning, also
known as discovery learning, is a process where the learner discovers rules by observing examples.
Inductive learning is different from deductive learning, where students are given rules that they then need
to apply. Inductive learning is based on the inductive learning hypothesis. The Inductive Learning
Hypothesis postulates that: Any hypothesis found to approximate the target function well over a
sufficiently large set of training examples is expected to approximate the target function well over other
unobserved examples. This idea is the fundamental assumption of inductive learning.
To summarize, in this section we saw the first class of algorithms, where we divided the instance space
based on a logical expression. We also discussed how logical models are based on the theory of concept
learning, which in turn can be formulated as an inductive learning problem or a search problem.

2. Geometric models
In the previous section, we have seen that with logical models, such as decision trees, a logical expression
is used to partition the instance space. Two instances are similar when they end up in the same logical
segment. In this section, we consider models that define similarity by considering the geometry of the
instance space. In Geometric models, features could be described as points in two dimensions (x- and
y-axes) or a three-dimensional space (x, y, and z). Even when features are not intrinsically geometric, they
could be modelled in a geometric manner (for example, temperature as a function of time can be modelled
in two axes). In geometric models, there are two ways we could impose similarity.
o We could use geometric concepts like lines or planes to segment (classify) the instance space.
These are called Linear models.
o Alternatively, we can use the geometric notion of distance to represent similarity. In this case, if
two points are close together, they have similar values for their features and thus can be classed as
similar. We call such models Distance-based models.

2.1 Linear models


Linear models are relatively simple. In this case, the function is represented as a linear combination of its
inputs. Thus, if x1 and x2 are two scalars or vectors of the same dimension and a and b are arbitrary
scalars, then ax1 + bx2 represents a linear combination of x1 and x2. In the simplest case, where f(x)
represents a straight line, we have an equation of the form f(x) = mx + c, where c represents the intercept
and m represents the slope.
Linear models are parametric, which means that they have a fixed form with a small number of numeric
parameters that need to be learned from data. For example, in f(x) = mx + c, m and c are the parameters
that we are trying to learn from the data. This technique is different from tree or rule models, where the
structure of the model (e.g., which features to use in the tree, and where) is not fixed in advance.
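As a minimal sketch of what “learning the parameters” means here: given a handful of made-up (x, y)
points, ordinary least squares recovers m and c.

# Sketch: learning the parameters m and c of f(x) = mx + c by ordinary
# least squares. The data points are made up.
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 7.1, 8.8])  # roughly y = 2x + 1 plus noise

# Design matrix with a column of ones so the intercept c is learned too.
A = np.column_stack([x, np.ones_like(x)])
(m, c), *_ = np.linalg.lstsq(A, y, rcond=None)

print(f"learned slope m = {m:.2f}, intercept c = {c:.2f}")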
Linear models are stable, i.e., small variations in the training data have only a limited impact on the
learned model. In contrast, tree models tend to vary more with the training data, as the choice of a
different split at the root of the tree typically means that the rest of the tree is different as well. As a result
of having relatively few parameters, Linear models have low variance and high bias. This implies that
Linear models are less likely to overfit the training data than some other models. However, they are more
likely to underfit. For example, if we want to learn the boundaries between countries based on labelled
data, then linear models are not likely to give a good approximation.
Kernel methods, such as the support vector machine (SVM), also belong in this section. Kernel methods
use a kernel function to implicitly transform the data into another space in which an easier separation can
be achieved, for example by a hyperplane in the case of SVM.
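A short sketch of the kernel idea, assuming scikit-learn: two concentric rings of points cannot be separated
by a straight line, but an RBF-kernel SVM separates them almost perfectly. The dataset and parameters
are arbitrary choices for the demo.

# Sketch: an RBF-kernel SVM separating data that is not linearly separable.
# make_circles generates two concentric rings; no straight line can split
# them, but the RBF kernel implicitly maps them to a space where one can.
from sklearn.datasets import make_circles
from sklearn.svm import SVC

X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

linear_svm = SVC(kernel="linear").fit(X, y)
rbf_svm = SVC(kernel="rbf").fit(X, y)

print("linear kernel accuracy:", linear_svm.score(X, y))  # poor
print("rbf kernel accuracy:   ", rbf_svm.score(X, y))     # near 1.0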

2.2 Distance-based models

Distance-based models are the second class of Geometric models. Like Linear models, distance-based
models are based on the geometry of data. As the name implies, distance-based models work on the
concept of distance. In the context of Machine learning, the concept of distance is not based on merely
the physical distance between two points. Instead, we could think of the distance between two points in
terms of the mode of travel between them. Travelling between two cities by plane covers less physical
distance than by train, because the plane’s path is unrestricted. Similarly, in chess, the concept of distance
depends on the piece used – for example, a Bishop can move diagonally. Thus, depending on the entity
and the mode of travel, the concept of distance can be experienced differently. The distance metrics
commonly used are Euclidean, Minkowski, Manhattan, and Mahalanobis.
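The first three metrics are easy to compute directly; the sketch below uses two arbitrary points.
Mahalanobis distance additionally requires the inverse covariance matrix of the data, so it is only
indicated in a comment.

# Sketch: common distance metrics between two arbitrary points.
import numpy as np

p = np.array([1.0, 2.0])
q = np.array([4.0, 6.0])

euclidean = np.sqrt(np.sum((p - q) ** 2))  # straight-line distance: 5.0
manhattan = np.sum(np.abs(p - q))          # city-block distance: 7.0

def minkowski(u, v, r):
    # Generalises both: r = 1 gives Manhattan, r = 2 gives Euclidean.
    return np.sum(np.abs(u - v) ** r) ** (1.0 / r)

print(euclidean, manhattan, minkowski(p, q, 3))
# Mahalanobis distance would additionally require the inverse covariance
# matrix VI of the dataset: sqrt((p - q)^T  VI  (p - q)).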

Distance is applied through the concepts of neighbours and exemplars. Neighbours are points in
proximity to an exemplar with respect to the chosen distance measure. Exemplars are either centroids,
which give the center of mass of a group of points under a chosen distance metric, or medoids, which are
the most centrally located actual data points. The most commonly used centroid is the arithmetic mean,
which minimises the squared Euclidean distance to all other points.
Notes:
o The centroid represents the geometric center of a plane figure, i.e., the arithmetic mean position of
all the points in the figure. This definition extends to any object in n-dimensional space: its centroid
is the mean position of all the points.
o Medoids are similar in concept to means or centroids. Medoids are most commonly used on data
when a mean or centroid cannot be defined. They are used in contexts where the centroid is not
representative of the dataset, such as in image data.
Examples of distance-based models include nearest-neighbour models, which use the training data itself
as exemplars – for example, in classification. The K-means clustering algorithm also uses exemplars to
create clusters of similar data points, as sketched below.
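A minimal sketch of both kinds of exemplar use, assuming scikit-learn and a made-up two-blob dataset:
K-means finds centroid exemplars, and a k-nearest-neighbour classifier predicts from the closest training
points.

# Sketch: two exemplar-based models on a made-up 2-D dataset.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier

X = np.array([[1.0, 1.0], [1.5, 2.0], [1.0, 2.5],   # one blob
              [8.0, 8.0], [8.5, 9.0], [9.0, 8.5]])  # another blob
y = np.array([0, 0, 0, 1, 1, 1])

# K-means: the learned cluster centers are centroid exemplars.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print("centroid exemplars:\n", kmeans.cluster_centers_)

# k-NN: the training points themselves act as exemplars at prediction time.
knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)
print("prediction for [2, 2]:", knn.predict([[2.0, 2.0]]))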

3. Probabilistic models
The third family of machine learning algorithms is the probabilistic models. We have seen before that the
k-nearest neighbour algorithm uses the idea of distance (e.g., Euclidean distance) to classify entities, and
logical models use a logical expression to partition the instance space. In this section, we see how the
probabilistic models use the idea of probability to classify new entities.
Probabilistic models see features and target variables as random variables. The process of modelling
represents and manipulates the level of uncertainty with respect to these variables. There are two types
of probabilistic models: predictive and generative. Predictive probability models use the idea of a
conditional probability distribution P(Y | X), from which Y can be predicted given X. Generative models
estimate the joint distribution P(Y, X). Once we know the joint distribution, we can derive any conditional
or marginal distribution involving the same variables. Thus, a generative model, knowing the joint
probability distribution, is capable of creating new data points together with their labels. The joint
distribution captures the relationship between the variables; once this relationship has been learned, it is
possible to generate new data points.
Naïve Bayes is an example of a probabilistic classifier.
The goal of any probabilistic classifier is, given a set of features (x_0 through x_n) and a set of classes (c_0
through c_k), to determine the probability of the features occurring in each class and to return the most
likely class. Therefore, for each class, we need to calculate P(c_i | x_0, …, x_n).
We can do this using Bayes’ rule, defined as

P(c_i | x_0, …, x_n) = P(x_0, …, x_n | c_i) · P(c_i) / P(x_0, …, x_n)
The Naïve Bayes algorithm is based on the idea of Conditional Probability. Conditional probability is based
on finding the probability that something will happen, given that something else has already happened.
The task of the algorithm is then to look at the evidence, determine the likelihood of each class, and
assign the most likely label to each entity.
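To make this concrete, here is a minimal from-scratch sketch of a categorical Naïve Bayes classifier on a
tiny made-up weather dataset. It scores each class by its prior times the product of per-feature
conditionals, i.e., Bayes’ rule with the naïve independence assumption; the denominator P(x_0, …, x_n) is
the same for every class, so it can be dropped when comparing classes.

# Sketch: a tiny categorical Naive Bayes classifier built from scratch.
# The weather dataset below is made up purely for illustration.
from collections import Counter, defaultdict

data = [  # (Outlook, Wind) -> Play?
    (("Sunny", "Weak"),   "Yes"),
    (("Sunny", "Strong"), "No"),
    (("Rainy", "Strong"), "No"),
    (("Rainy", "Weak"),   "Yes"),
    (("Sunny", "Weak"),   "Yes"),
]

# Count class priors P(c) and per-feature conditionals P(x_j | c).
class_counts = Counter(label for _, label in data)
feat_counts = defaultdict(Counter)  # (feature_index, class) -> value counts
for attrs, label in data:
    for j, value in enumerate(attrs):
        feat_counts[(j, label)][value] += 1

def predict(attrs):
    scores = {}
    for c, n_c in class_counts.items():
        score = n_c / len(data)  # prior P(c)
        for j, value in enumerate(attrs):
            # Naive assumption: multiply the per-feature conditionals.
            # (A real implementation would add Laplace smoothing so unseen
            # feature values do not zero out the product.)
            score *= feat_counts[(j, c)][value] / n_c
        scores[c] = score  # proportional to P(c | x), up to 1 / P(x)
    return max(scores, key=scores.get)

print(predict(("Sunny", "Weak")))  # -> 'Yes'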
