Machine Learning Essentials
Lecture II: Supervised Machine Learning
What is Machine Learning?
Machine learning (ML) is the study of computer algorithms that improve automatically through experience.
Machine learning approaches are traditionally divided into:
• Supervised learning: The computer is presented with example inputs and their desired outputs, given
by a "teacher", and the goal is to learn a general rule that maps inputs to outputs.
• Unsupervised learning: No labels are given to the learning algorithm, leaving it on its own to find
structure in its input. Unsupervised learning can be a goal in itself (discovering hidden patterns in data)
or a means towards an end (feature learning).
• Reinforcement learning: A computer program interacts with a dynamic environment in which it must
perform a certain goal (such as driving a vehicle or playing a game against an opponent). As it
navigates its problem space, the program is provided feedback that's analogous to rewards, which it
tries to maximize.
What is ML – cont
Tom Mitchell's definition of ML: A computer program is said to learn from experience E with respect to some
class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves
with experience E.
Modern-day machine learning has two objectives: one is to classify data based on models that have been
developed; the other is to make predictions about future outcomes based on these models.
What is ML – cont: Relations of ML to AI and Other Fields
Relation to Data Mining:
Machine learning and data mining often employ the same methods and overlap significantly, but while
machine learning focuses on prediction, based on known properties learned from the training data, data
mining focuses on the discovery of (previously) unknown properties in the data.
Relation to Optimization:
Machine learning also has ties to optimization: many learning problems are formulated as minimization
of some loss function on a training set of examples. Loss functions express the discrepancy between
the predictions of the model being trained and the actual problem instances.
Relation to Statistics:
Statistics draws population inferences from a sample, while machine learning finds generalizable
predictive patterns.
Machine Learning in Python
This lecture focuses on practical aspects of machine learning, primarily using Python’s Scikit-Learn package
• In particular:
We introduce the fundamental vocabulary and concepts of machine learning.
We introduce the Scikit-Learn API and show some examples of its use.
We take a deeper dive into the details of several of the most important machine learning approaches, and
develop an intuition into how they work and when and where they are applicable.
We experiment with supervised ML methods.
We experiment with unsupervised ML methods.
Examples of ML Applications
Classification:
You are given a set of labeled points and want to use these to classify
some unlabeled points. The training set is shown on the right.
The ML algorithm's learning result is shown below: the line separating
the two sets can now be used to classify unknown data points.
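A minimal sketch of such a classifier in scikit-learn (the synthetic dataset and the choice of a linear support vector classifier here are illustrative, not the exact example from the figure):
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# two labeled clusters of points (the "training set")
X, y = make_blobs(n_samples=50, centers=2, random_state=0, cluster_std=0.60)
model = SVC(kernel='linear')        # a linear separating boundary
model.fit(X, y)                     # learn the rule from the labeled points
print(model.predict([[0.0, 3.0]]))  # classify a new, unlabeled point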
Examples of ML – cont.
Regression - Predicting Continuous labels
The training set: data with continuous labels (the colors in the picture).
The ML result: the whole space is colored based on the training data,
and this fitted model can now be used to label unknown data.
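A minimal regression sketch along the same lines (the synthetic data and coefficients are illustrative only):
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.RandomState(0)
x = 10 * rng.rand(40)                 # training inputs
y = 3 * x + 2 + rng.randn(40)         # continuous labels with noise
model = LinearRegression().fit(x[:, np.newaxis], y)
print(model.predict([[5.0]]))         # predict the label of an unknown point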
Examples of ML – cont.
Clustering: Inferring labels on unlabeled data.
The unlabeled data on the right:
The ML algorithm result: dividing the points into groups
of points close to each other (k-means algorithm).
Each group was given a different color for illustration
purposes.
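A minimal k-means sketch of this idea (the synthetic blobs stand in for the figure's data):
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans

X, _ = make_blobs(n_samples=300, centers=4, random_state=0)   # labels are discarded
kmeans = KMeans(n_clusters=4, random_state=0).fit(X)
print(kmeans.labels_[:10])        # the inferred group of the first ten points
print(kmeans.cluster_centers_)    # the four group centers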
Examples of ML – cont.
Dimensionality Reduction: Inferring structure of unlabeled data.
The input data:
Result of the ML algorithm
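A minimal dimensionality-reduction sketch using PCA (illustrative only; the slide's figure is not reproduced here):
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data                   # 4 features per flower
X2 = PCA(n_components=2).fit_transform(X)
print(X.shape, '->', X2.shape)         # (150, 4) -> (150, 2)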
Hands on with Python Scikit-Learn
To install: pip install -U scikit-learn
pip install seaborn # for later use
To check installation:
python -m pip show scikit-learn # to see which version and where scikit-learn is installed
scikit-learn comes with a few standard datasets, for instance the iris and digits datasets for classification and
the diabetes dataset for regression.
To load these datasets use:
from sklearn import datasets
iris = datasets.load_iris() # database of flowers
digits = datasets.load_digits() # database of the digits 0-9
diabetes = datasets.load_diabetes() # diabetes database
# print(iris.data); print(iris.target); print(digits.images[0]); etc
Scikit-Learn - cont
import seaborn as sns # using the seaborn library
iris = sns.load_dataset('iris')
iris.head()
Output:
sepal_length sepal_width petal_length petal_width species
5.1 3.5 1.4 0.2 setosa
4.9 3.0 1.4 0.2 setosa
…
The data consists of a table of features (columns) describing each measurement,
plus a target array containing the labels (the species).
For plotting:
import seaborn as sns; sns.set()
sns.pairplot(iris, hue='species', height=1.5)
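For use with Scikit-Learn, the feature table and the target array are typically separated; a short sketch, assuming the seaborn iris DataFrame loaded above:
X_iris = iris.drop('species', axis=1)   # features matrix, shape (150, 4)
y_iris = iris['species']                # target array, shape (150,)
print(X_iris.shape, y_iris.shape)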
Frequentist (classical) vs. Bayesian Method
Up to now we looked at frequentist methods. The assumptions were:
P1: Probability refers to limiting relative frequency. Probabilities are objective properties of the real world.
P2: Parameters are fixed, unknown constants, and no probability statements can be made about them.
P3: Statistical procedures should be designed to have well-defined long-run frequency properties. For example,
a 95% confidence interval should trap the true value of a parameter with limiting frequency of at least 95%.
Another approach is Bayesian Inference. Its postulates are:
B1: Probability describes degree of belief, not a limiting frequency.
B2: We can make probability statements about parameters.
B3: We make inference about a parameter $\theta$ by producing a probability distribution for it. Inferences such as point
estimates and interval estimates can be derived from it.
Bayesian Method
1. We choose a probability density $f(\theta)$ – called the prior distribution
– this is our belief about the parameter $\theta$ before we see data.
2. We choose a statistical model $f(x \mid \theta)$ that reflects our beliefs about $x$ given $\theta$.
3. After observing the data $X_1, \dots, X_n$ we update our beliefs and calculate the posterior distribution $f(\theta \mid X_1, \dots, X_n)$.
In the case of a single data point:
$$f(\theta \mid x) = \frac{f(x \mid \theta)\, f(\theta)}{\int f(x \mid u)\, f(u)\, du}$$
In the case of $n$ IID observations, $f(x \mid \theta)$ is replaced by the likelihood $\mathcal{L}_n(\theta) = \prod_{i=1}^{n} f(x_i \mid \theta)$.
So $f(\theta \mid x^n) = \mathcal{L}_n(\theta)\, f(\theta) / c_n \propto \mathcal{L}_n(\theta)\, f(\theta)$, where $c_n = \int \mathcal{L}_n(u)\, f(u)\, du$.
Posterior is proportional to Likelihood times Prior.
Bayesian Method – estimation
We use the posterior $f(\theta \mid x^n)$ to make estimates as follows:
Point estimation – typically we take the posterior mean $\bar{\theta}_n = \int \theta\, f(\theta \mid x^n)\, d\theta$.
Interval estimation: find a set $C = (a, b)$ such that $P(\theta \in C \mid x^n) = \int_a^b f(\theta \mid x^n)\, d\theta = 1 - \alpha$.
Taking $a, b$ such that $\int_{-\infty}^{a} f(\theta \mid x^n)\, d\theta = \int_{b}^{\infty} f(\theta \mid x^n)\, d\theta = \alpha/2$ produces the desired result.
Hence, $C$ is a $1-\alpha$ posterior interval.
Example: Let $X_1, \dots, X_n \sim N(\theta, \sigma^2)$ and let $\sigma$ be assumed known. Take the prior $\theta \sim N(a, b^2)$. After some
calculation we get $\theta \mid x^n \sim N(\bar{\theta}, \tau^2)$, where
$\bar{\theta} = w \bar{X} + (1 - w) a$, $\; w = \dfrac{1/\mathrm{se}^2}{1/\mathrm{se}^2 + 1/b^2}$, $\; \mathrm{se} = \sigma/\sqrt{n}$, $\; \dfrac{1}{\tau^2} = \dfrac{1}{\mathrm{se}^2} + \dfrac{1}{b^2}$.
Note that $w \to 1$ and $\tau/\mathrm{se} \to 1$ as $n \to \infty$.
Bayesian Method – interval estimation
Find $c, d$ such that $P(c < \theta < d \mid x^n) = 0.95$. This can be done by choosing $c, d$ such that
$P(\theta < c \mid x^n) = 0.025$ and $P(\theta > d \mid x^n) = 0.025$.
We know $\theta \mid x^n \sim N(\bar{\theta}, \tau^2)$, so $P(\theta < c \mid x^n) = P\!\left(Z < \frac{c - \bar{\theta}}{\tau}\right) = 0.025$, or equivalently $c = \bar{\theta} - 1.96\,\tau$.
Similarly, $d = \bar{\theta} + 1.96\,\tau$.
The 95 percent Bayesian interval is $\bar{\theta} \pm 1.96\,\tau$.
Since $\bar{\theta} \approx \bar{X}$ and $\tau \approx \mathrm{se}$ for large $n$, this is approximately the same as the frequentist confidence interval $\bar{X} \pm 1.96\,\mathrm{se}$.
Bayesian Method – Posterior by simulation
Suppose we draw $\theta_1, \dots, \theta_B \sim f(\theta \mid x^n)$.
The histogram of $\theta_1, \dots, \theta_B$ approximates the posterior density $f(\theta \mid x^n)$.
An approximation to the posterior mean $\bar{\theta}_n = E(\theta \mid x^n)$ is $B^{-1} \sum_{j=1}^{B} \theta_j$.
The posterior $1-\alpha$ interval can be approximated by $(\theta_{\alpha/2}, \theta_{1-\alpha/2})$,
where $\theta_{\alpha/2}$ is the $\alpha/2$ sample quantile of $\theta_1, \dots, \theta_B$.
Once we have a sample $\theta_1, \dots, \theta_B$ from $f(\theta \mid x^n)$, let $\tau_i = g(\theta_i)$; then
$\tau_1, \dots, \tau_B$ is a sample from $f(\tau \mid x^n)$ for the transformed parameter $\tau = g(\theta)$.
This avoids the need to do any analytical calculations.
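A sketch of this simulation recipe applied to the Normal example above, where the posterior is itself Normal so we can draw from it directly; the true mean, the prior parameters, and the sample sizes below are made-up illustrations:
import numpy as np

rng = np.random.RandomState(0)
sigma, a, b = 1.0, 0.0, 2.0                  # known sigma, prior N(a, b^2)
x = rng.normal(1.5, sigma, size=50)          # observed data (true theta = 1.5)

se2 = sigma**2 / len(x)                      # se^2 = sigma^2 / n
w = (1 / se2) / (1 / se2 + 1 / b**2)
post_mean = w * x.mean() + (1 - w) * a       # theta_bar
post_sd = np.sqrt(1 / (1 / se2 + 1 / b**2))  # tau

theta = rng.normal(post_mean, post_sd, size=10000)   # theta_1, ..., theta_B
print(theta.mean())                                  # approximates the posterior mean
print(np.percentile(theta, [2.5, 97.5]))             # approximate 95% posterior interval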
Simulation Methods – for Bayesian Inference
In Bayesian inference we need to calculate certain integrals.
Let the prior be $f(\theta)$ and let the data be $X_1, \dots, X_n$ with likelihood $\mathcal{L}_n(\theta)$.
The posterior is $f(\theta \mid x^n) = \mathcal{L}_n(\theta)\, f(\theta) / c$, where $c = \int \mathcal{L}_n(u)\, f(u)\, du$.
The posterior mean is $\bar{\theta}_n = \int \theta\, f(\theta \mid x^n)\, d\theta = \frac{1}{c} \int \theta\, \mathcal{L}_n(\theta)\, f(\theta)\, d\theta$.
The integrals above, or their corresponding sums in the case of a discrete space, are calculated using simulation,
which is explained next.
Simulation Methods – for calculating integrals
Monte Carlo Integration:
We want to calculate some integral $I = \int_a^b h(x)\, dx$.
We write it as $I = \int_a^b w(x)\, f(x)\, dx$,
where $w(x) = h(x)(b - a)$ and $f(x) = 1/(b-a)$ is the Uniform$(a, b)$ density.
Then it is an expectation: $I = E_f\big(w(X)\big)$.
We generate $X_1, \dots, X_N \sim \mathrm{Uniform}(a, b)$; then $\hat{I} = N^{-1} \sum_{i=1}^{N} w(X_i) \to I$ by the law of large numbers.
The standard error is $\widehat{\mathrm{se}} = s/\sqrt{N}$, where $s^2 = \frac{\sum_{i=1}^{N} \big(w(X_i) - \hat{I}\big)^2}{N - 1}$.
A $1-\alpha$ confidence interval is $\hat{I} \pm z_{\alpha/2}\, \widehat{\mathrm{se}}$. We can take $N$ large and make the interval as small as we wish.
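A minimal sketch of this recipe for a simple integral, say $\int_0^2 e^{-x^2}\, dx$ (the integrand is just an illustration, chosen because it has no closed form; the exact value is about 0.882):
import numpy as np

a, b, N = 0.0, 2.0, 100000
rng = np.random.RandomState(1)
X = rng.uniform(a, b, N)               # X_1, ..., X_N ~ Uniform(a, b)
w = np.exp(-X**2) * (b - a)            # w(x) = h(x) (b - a)
I_hat = w.mean()                       # estimate of the integral
se = w.std(ddof=1) / np.sqrt(N)        # standard error
print(I_hat, '+/-', 1.96 * se)         # 95% confidence interval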
Simulation Methods – another example
We start with the standard Normal PDF $f(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$
and want to calculate its CDF $F(x) = \int_{-\infty}^{x} f(s)\, ds$.
We write it as $F(x) = \int h(s)\, f(s)\, ds$, where $h(s) = 1$ if $s \le x$ and $h(s) = 0$ otherwise.
Then we generate $Z_1, \dots, Z_N \sim N(0, 1)$ and have $\hat{F}(x) = N^{-1} \sum_{i=1}^{N} h(Z_i)$, the fraction of draws that fall below $x$.
For example, with $x = 2$, the true value is 0.9772; the Monte Carlo estimate with $N = 10{,}000$ is 0.9751, and with
$N = 100{,}000$ we get 0.9771.
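A sketch reproducing this estimate (the exact number printed will vary with the random seed):
import numpy as np

rng = np.random.RandomState(0)
x = 2.0
Z = rng.standard_normal(100000)   # Z_1, ..., Z_N ~ N(0, 1)
F_hat = np.mean(Z <= x)           # fraction of draws below x
print(F_hat)                      # should be close to the true value 0.9772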
Statistics in Data Science – Assignment II
An example where statistical tools are used in Data Science.
Say we are trying to understand complex data. It is very high-dimensional, so we try to come up with
features that can characterize it.
We invent many features, and then we want to choose a subset of them that is useful in classification.
If we are given labeled data, we can sample two classes, calculate their means over samples of size n,
estimate the variances, and test the hypothesis that the two samples came from the same population. If the
hypothesis is rejected, we keep the feature; otherwise we drop it. A sketch of such a test follows the assignment below.
Assignment II: Go over some datasets in the UCI repository, especially those with many features, and test whether all
the columns they give are really useful. Do it by picking a sample size n, estimating the mean and variance
of the population based on the sample, and using the theorem about the distribution of sample means
(remember the sqrt(n)!!) to determine whether the two samples came from the same population.
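A sketch of what the per-feature test might look like, using a two-sample z statistic based on the theorem about the distribution of sample means; the samples, the 1.96 threshold (5% level), and the helper name keep_feature are placeholders to be adapted to your chosen UCI dataset:
import numpy as np

def keep_feature(class_a, class_b):
    """Two-sample z-test on one feature; keep it if the class means differ significantly."""
    n1, n2 = len(class_a), len(class_b)
    mean1, mean2 = class_a.mean(), class_b.mean()
    var1, var2 = class_a.var(ddof=1), class_b.var(ddof=1)
    # standard error of the difference of sample means -- note the sqrt(n)!
    se = np.sqrt(var1 / n1 + var2 / n2)
    z = (mean1 - mean2) / se
    return abs(z) > 1.96          # reject "same population" at the 5% level

# illustrative samples of one candidate feature from two classes
rng = np.random.RandomState(0)
sample_a = rng.normal(0.0, 1.0, 100)
sample_b = rng.normal(0.3, 1.0, 100)
print(keep_feature(sample_a, sample_b))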
Naive Bayes Classification
Naive Bayes models are a group of extremely fast and simple classification algorithms that are often suitable
for very high-dimensional datasets. They are fast to train and have very few tunable parameters.
We focus on an intuitive explanation of how naive Bayes classifiers work, with a couple of examples.
Let $L$ denote a label and $F$ denote an observed feature (a column in the dataset). Bayes' theorem gives
$$P(L \mid F) = \frac{P(F \mid L)\, P(L)}{P(F)}$$
If we are trying to decide between two labels – call them $L_1$ and $L_2$ – then one way
to make this decision is to compute the ratio
$$\frac{P(L_1 \mid F)}{P(L_2 \mid F)} = \frac{P(F \mid L_1)\, P(L_1)}{P(F \mid L_2)\, P(L_2)}$$
Now we need a way to calculate $P(F \mid L)$.
Gaussian Naive Bayes – an example
The assumption: The data from each label is drawn from a simple Gaussian distribution
One simple model assumes that the data is described by a Gaussian distribution with no covariance between
dimensions. We can fit this model by simply finding the mean and standard deviation of the points within each
label, which is all you need to define such a distribution.
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns; sns.set()
from sklearn.datasets import make_blobs
from sklearn.naive_bayes import GaussianNB
X, y = make_blobs(100, 2, centers=2, random_state=2, cluster_std=1.5)  # 100 points, 2 features, 2 labels
model = GaussianNB()
model.fit(X, y)
rng = np.random.RandomState(0)
Xnew = [-6, -14] + [14, 18] * rng.rand(2000, 2)   # 2000 new points spread over the plane
ynew = model.predict(Xnew)                        # predicted labels for the new points
plt.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap='RdBu')
lim = plt.axis()
plt.scatter(Xnew[:, 0], Xnew[:, 1], c=ynew, s=20, cmap='RdBu', alpha=0.1)
plt.axis(lim); plt.show()  # shows the boundary of the classification
yprob = model.predict_proba(Xnew) # predicting probabilities for each label
yprob[-8:].round(2)
Multinomial Naive Bayes
The assumption: The features are assumed to be generated from a simple multinomial distribution. The
multinomial distribution describes the probability of observing counts among a number of categories, and thus
multinomial naive Bayes is most appropriate for features that represent counts or count rates.
The idea is precisely the same as before, except that instead of modeling the data distribution with the
best-fit Gaussian, we model the data distribution with a best-fit multinomial distribution.
Examples: Text classification. (in TA class)
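A minimal sketch of this idea on a toy corpus (the sentences and labels below are made up for illustration; the TA class uses a real dataset):
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

train_text = ["the match ended in a draw", "the team scored a late goal",
              "the election results were announced", "parliament passed the new law"]
train_labels = ["sports", "sports", "politics", "politics"]

model = make_pipeline(CountVectorizer(), MultinomialNB())  # word counts -> multinomial NB
model.fit(train_text, train_labels)
print(model.predict(["the team won the match"]))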
When to Use Naive Bayes
Because naive Bayesian classifiers make such stringent assumptions about data, they will
generally not perform as well as a more complicated model. That said, they have several
advantages:
•They are extremely fast for both training and prediction
• They provide straightforward probabilistic prediction
• They are often very easily interpretable
• They have very few (if any) tunable parameters
Linear Regression
Linear regression models are a good starting point for regression tasks. They can be fit very quickly and are
very interpretable.
import matplotlib.pyplot as plt
import seaborn as sns; sns.set()
import numpy as np
from sklearn.linear_model import LinearRegression
model = LinearRegression(fit_intercept=True)
rng = np.random.RandomState(1)
x = 10 * rng.rand(50)
y = 2 * x - 5 + rng.randn(50)
model.fit(x[:, np.newaxis], y)
xfit = np.linspace(0, 10, 1000)
yfit = model.predict(xfit[:, np.newaxis])
plt.scatter(x, y)
plt.plot(xfit, yfit); plt.show()
print("Model slope: ", model.coef_[0])
print("Model intercept:", model.intercept_)
# Output:
# Model slope:  2.02720881036
# Model intercept: -4.99857708555
Basis Function Regression
Use polynomial basis functions instead of a plain linear function:
from sklearn.preprocessing import PolynomialFeatures
import matplotlib.pyplot as plt
import seaborn as sns; sns.set()
import numpy as np
from sklearn.linear_model import LinearRegression
def GetPolyData(x, n):
    # noisy samples of sin(x)
    return np.sin(x) + np.random.uniform(-.2, .2, n)

n = 250                                # number of elements
x = np.array(range(n)) / 100           # x values in [0, 2.5)
y = GetPolyData(x, n)
train_x = np.array(x)
train_y = np.array(y)
polyModel = PolynomialFeatures(degree=4)
xpol = polyModel.fit_transform(train_x.reshape(-1, 1))   # columns 1, x, x^2, x^3, x^4
linearModel = LinearRegression(fit_intercept=True)
linearModel.fit(xpol, train_y[:, np.newaxis])
polyfit = linearModel.predict(xpol)    # fitted values on the training points
plt.scatter(train_x, train_y)
plt.plot(train_x, polyfit, color='red')
plt.show()
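The same fit can be written more compactly with a scikit-learn pipeline; a sketch of that variant (not on the original slide, reusing train_x and train_y from above):
from sklearn.pipeline import make_pipeline

poly_model = make_pipeline(PolynomialFeatures(degree=4),
                           LinearRegression(fit_intercept=True))
poly_model.fit(train_x.reshape(-1, 1), train_y)
polyfit = poly_model.predict(train_x.reshape(-1, 1))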
Classification using Support Vector Machines
An example with two sets and several separating lines.
A new point x will be classified differently depending on which separating line is chosen.
Support Vector Machines: Maximizing the Margin
The SVM chooses the separating line that maximizes the margin – the distance from the line to the nearest training points (the support vectors).
Support Vector Machines (SVM) - example
from sklearn.datasets import make_blobs
from sklearn.svm import SVC  # "Support vector classifier"
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns; sns.set()

def plot_svc_decision_function(model, ax=None, plot_support=True):
    if ax is None:
        ax = plt.gca()
    xlim = ax.get_xlim()
    ylim = ax.get_ylim()
    # create grid to evaluate model
    x = np.linspace(xlim[0], xlim[1], 30)
    y = np.linspace(ylim[0], ylim[1], 30)
    Y, X = np.meshgrid(y, x)
    xy = np.vstack([X.ravel(), Y.ravel()]).T
    P = model.decision_function(xy).reshape(X.shape)
    # plot decision boundary and margins
    ax.contour(X, Y, P, colors='k',
               levels=[-1, 0, 1], alpha=0.5,
               linestyles=['--', '-', '--'])
    # plot support vectors
    if plot_support:
        ax.scatter(model.support_vectors_[:, 0],
                   model.support_vectors_[:, 1],
                   s=300, linewidth=1, facecolors='none')
    ax.set_xlim(xlim)
    ax.set_ylim(ylim)

X, y = make_blobs(n_samples=50, centers=2, random_state=0, cluster_std=0.60)
plt.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap='autumn')

model = SVC(kernel='linear', C=1E10)
model.fit(X, y)

plt.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap='autumn')
plot_svc_decision_function(model)
plt.show()

print(model.support_vectors_)
# array([[ 0.44359863,  3.11530945],
#        [ 2.33812285,  3.43116792],
#        [ 2.06156753,  1.96918596]])
Kernel SVM - example
from sklearn.datasets import make_circles
from sklearn.svm import SVC  # "Support vector classifier"
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns; sns.set()

def plot_svc_decision_function(model, ax=None, plot_support=True):
    if ax is None:
        ax = plt.gca()
    xlim = ax.get_xlim()
    ylim = ax.get_ylim()
    # create grid to evaluate model
    x = np.linspace(xlim[0], xlim[1], 30)
    y = np.linspace(ylim[0], ylim[1], 30)
    Y, X = np.meshgrid(y, x)
    xy = np.vstack([X.ravel(), Y.ravel()]).T
    P = model.decision_function(xy).reshape(X.shape)
    # plot decision boundary and margins
    ax.contour(X, Y, P, colors='k',
               levels=[-1, 0, 1], alpha=0.5,
               linestyles=['--', '-', '--'])
    # plot support vectors
    if plot_support:
        ax.scatter(model.support_vectors_[:, 0],
                   model.support_vectors_[:, 1],
                   s=300, linewidth=1, facecolors='none')
    ax.set_xlim(xlim)
    ax.set_ylim(ylim)

X, y = make_circles(100, factor=.1, noise=.1)   # two concentric circles, not linearly separable
clf = SVC(kernel='rbf', C=1E6)                  # rbf = radial basis function kernel
clf.fit(X, y)

plt.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap='autumn')
plot_svc_decision_function(clf)
plt.scatter(clf.support_vectors_[:, 0], clf.support_vectors_[:, 1],
            s=300, lw=1, facecolors='none')
plt.show()
SVM – Face Recognition
from sklearn.svm import SVC
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns; sns.set()
from sklearn.decomposition import PCA as RandomizedPCA
from sklearn.pipeline import make_pipeline
from sklearn.datasets import fetch_lfw_people
from sklearn.model_selection import train_test_split
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix

faces = fetch_lfw_people(min_faces_per_person=60)
print(faces.target_names)
print(faces.images.shape)

# let's plot them
fig, ax = plt.subplots(3, 5)
for i, axi in enumerate(ax.flat):
    axi.imshow(faces.images[i], cmap='bone')
    axi.set(xticks=[], yticks=[], xlabel=faces.target_names[faces.target[i]])
plt.show()

pca = RandomizedPCA(n_components=150, whiten=True, random_state=42)
svc = SVC(kernel='rbf', class_weight='balanced')
model = make_pipeline(pca, svc)

# define training and test sets, then tune the SVM with a grid search
Xtrain, Xtest, ytrain, ytest = train_test_split(faces.data, faces.target, random_state=42)
param_grid = {'svc__C': [1, 5, 10, 50],
              'svc__gamma': [0.0001, 0.0005, 0.001, 0.005]}
grid = GridSearchCV(model, param_grid)
print(grid.fit(Xtrain, ytrain).best_params_)
model = grid.best_estimator_
yfit = model.predict(Xtest)

# show the test images with their predicted names
fig, ax = plt.subplots(4, 6)
for i, axi in enumerate(ax.flat):
    axi.imshow(Xtest[i].reshape(62, 47), cmap='bone')
    axi.set(xticks=[], yticks=[])
    axi.set_ylabel(faces.target_names[yfit[i]].split()[-1],
                   color='black' if yfit[i] == ytest[i] else 'red')
fig.suptitle('Predicted Names; Incorrect Labels in Red', size=14)
plt.show()

print(classification_report(ytest, yfit, target_names=faces.target_names))

# confusion classes
mat = confusion_matrix(ytest, yfit)
sns.heatmap(mat.T, square=True, annot=True, fmt='d', cbar=False,
            xticklabels=faces.target_names,
            yticklabels=faces.target_names)
plt.xlabel('true label')
plt.ylabel('predicted label')
plt.show()
SVM – Face Recognition results
Figures: predictions by the SVM, and the confusion matrix between classes.
Partial classification report:
precision recall f1-score support
Ariel Sharon 0.65 0.73 0.69 15
Colin Powell 0.81 0.87 0.84 68
Donald Rumsfeld 0.75 0.87 0.81 31
George W Bush 0.93 0.83 0.88 126
Gerhard Schroeder 0.86 0.78 0.82 23
Hugo Chavez 0.93 0.70 0.80 20
Junichiro Koizumi 0.80 1.00 0.89 12
Tony Blair 0.83 0.93 0.88 42
avg / total 0.85 0.85 0.85 337
Decision Trees
Decision trees are extremely intuitive ways to classify or label objects: you simply
ask a series of questions designed to zero in on the classification.
# Creating a decision tree
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.tree import DecisionTreeClassifier

X, y = make_blobs(n_samples=300, centers=4, random_state=0, cluster_std=1.0)
plt.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap='rainbow')  # the data
tree = DecisionTreeClassifier().fit(X, y)
Visualization of the decision tree splitting of the data
Decision Trees - cont
from sklearn.datasets import make_blobs
from sklearn.tree import DecisionTreeClassifier
import numpy as np
import matplotlib.pyplot as plt

def visualize_classifier(model, X, y, ax=None, cmap='rainbow'):
    ax = ax or plt.gca()
    # Plot the training points
    ax.scatter(X[:, 0], X[:, 1], c=y, s=30, cmap=cmap,
               clim=(y.min(), y.max()), zorder=3)
    ax.axis('tight')
    ax.axis('off')
    xlim = ax.get_xlim()
    ylim = ax.get_ylim()
    # fit the estimator
    model.fit(X, y)
    xx, yy = np.meshgrid(np.linspace(*xlim, num=200),
                         np.linspace(*ylim, num=200))
    Z = model.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
    # Create a color plot with the results
    n_classes = len(np.unique(y))
    contours = ax.contourf(xx, yy, Z, alpha=0.3,
                           levels=np.arange(n_classes + 1) - 0.5,
                           cmap=cmap, clim=(y.min(), y.max()), zorder=1)
    ax.set(xlim=xlim, ylim=ylim)
    plt.show()

# creating the data
X, y = make_blobs(n_samples=300, centers=4, random_state=0, cluster_std=1.0)
plt.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap='rainbow')

# computing the tree classifier
tree = DecisionTreeClassifier().fit(X, y)

# visualizing the classifier
visualize_classifier(DecisionTreeClassifier(), X, y)
Random Forests
A random forest is an ensemble method: many decision trees, each trained on a random subset of the data, vote on the final classification.
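A minimal sketch of this bagging-of-trees idea built by hand (the synthetic data, the 100 trees, and the 80% subsample size are illustrative), before using RandomForestClassifier directly on the next slide:
from sklearn.datasets import make_blobs
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import BaggingClassifier

X, y = make_blobs(n_samples=300, centers=4, random_state=0, cluster_std=1.0)
# 100 trees, each trained on a random 80% subsample of the data
bag = BaggingClassifier(DecisionTreeClassifier(), n_estimators=100,
                        max_samples=0.8, random_state=1)
bag.fit(X, y)
print(bag.score(X, y))   # training accuracy of the ensemble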
Random Forests – digits
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn import metrics
from sklearn.metrics import confusion_matrix
from sklearn.ensemble import RandomForestClassifier

# get the data
digits = load_digits()
digits.keys()

# visualize some digits
fig = plt.figure(figsize=(6, 6))  # figure size in inches
fig.subplots_adjust(left=0, right=1, bottom=0, top=1, hspace=0.05, wspace=0.05)
# plot the digits: each image is 8x8 pixels
for i in range(64):
    ax = fig.add_subplot(8, 8, i + 1, xticks=[], yticks=[])
    ax.imshow(digits.images[i], cmap=plt.cm.binary, interpolation='nearest')
    # label the image with the target value
    ax.text(0, 7, str(digits.target[i]))
plt.show()

# define training data
Xtrain, Xtest, ytrain, ytest = train_test_split(digits.data, digits.target, random_state=0)

# create RF model
model = RandomForestClassifier(n_estimators=1000)
model.fit(Xtrain, ytrain)
ypred = model.predict(Xtest)

print(metrics.classification_report(ypred, ytest))

# confusion matrix
mat = confusion_matrix(ytest, ypred)
sns.heatmap(mat.T, square=True, annot=True, fmt='d', cbar=False)
plt.xlabel('true label')
plt.ylabel('predicted label')
plt.show()
Random Forests – digits results
Figures: the 8x8 digit images, the classification report, and the confusion matrix.
We find that a simple, untuned random forest results in a very accurate classification
of the digits data.
Thank you for listening