Supervised Learning – Classification
Support Vector Machines
Background
There are three approaches to building a classifier:
a) Model a classification rule directly
Examples: k-NN, decision trees, perceptron, SVM
b) Model the probability of class membership given the input data
Example: feedforward ANN (multi-layer perceptron)
c) Build a probabilistic model of the data within each class
Examples: naive Bayes, model-based classifiers
a) and b) are examples of discriminative classification
c) is an example of generative classification
b) and c) are both examples of probabilistic classification
Support Vector Machines - Overview
• Proposed by Vapnik and his colleagues
- Started in 1963, taking shape in the late 70s as part of his statistical learning theory (with Chervonenkis)
- Current form established in the early 90s (with Cortes)
• Became popular in the last decade
- Classification, regression (function approximation), optimization
• Basic ideas
- Maximize the margin of the decision boundary
- Overcome the linear separability problem by transforming the problem into a higher-dimensional space using kernel functions
Linear Classifiers
[Figure: input x → classifier f → estimated label y; data points labeled +1 and −1]
f(x, w, b) = sign(w · x + b)
How would you classify this data?
Linear Classifiers
f(x, w, b) = sign(w · x + b)
Any of these boundaries would be fine... but which is best?
Classifier Margin
f(x, w, b) = sign(w · x + b)
Define the margin of a linear classifier as the width that the boundary could be increased by before hitting a datapoint.
Maximum Margin
f(x, w, b) = sign(w · x + b)
The maximum margin linear classifier is the linear classifier with the maximum margin.
Support vectors are those datapoints that the margin pushes up against.
This is the simplest kind of SVM (called a linear SVM, or LSVM).
Why Maximum Margin?
Intuitively, the maximum margin boundary is the safest choice: it is determined only by the support vectors (the datapoints the margin pushes up against), so it is the least sensitive to small changes in the rest of the data.
Specifying a line and margin
[Figure: plus-plane, classifier boundary, minus-plane]
• How do we represent this mathematically?
• ... in m input dimensions?
Specifying a line and margin
Conditions for an optimal separating hyperplane for data points (x1, y1), ..., (xl, yl), where yi ∈ {+1, −1}:
1. w · xi + b ≥ +1 if yi = +1 (points in the plus class)
2. w · xi + b ≤ −1 if yi = −1 (points in the minus class)
Estimate the Margin
[Figure: separating hyperplane w · x + b = 0 with points labeled +1 and −1]
x – input vector
w – normal vector to the hyperplane
b – offset (bias) value
The distance from a point x to the hyperplane is d(x) = |w · x + b| / ||w||, where ||w|| = √(Σj wj²).
Maximize Margin
The margin is determined by the distance from the hyperplane w · x + b = 0 to the closest datapoints. Maximizing it gives

  argmax_{w,b}  min_{xi ∈ D}  |w · xi + b| / ||w||
  subject to  ∀ xi ∈ D:  yi (w · xi + b) ≥ 0

Since w · xi + b ≥ 1 iff yi = +1 and w · xi + b ≤ −1 iff yi = −1, i.e. yi (w · xi + b) ≥ 1 for every training point, this is equivalent to

  argmin_{w,b}  Σ_{j=1..d} wj²
  subject to  ∀ xi ∈ D:  yi (w · xi + b) ≥ 1
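As a concrete illustration, here is a minimal sketch of solving this constrained problem directly with SciPy's SLSQP solver on a made-up 2-D dataset (the data, variable names, and the choice of solver are illustrative assumptions, not from the slides):

```python
# Hard-margin linear SVM: minimize sum_j w_j^2 subject to y_i (w . x_i + b) >= 1.
import numpy as np
from scipy.optimize import minimize

# Toy linearly separable data (illustrative, not from the slides)
X = np.array([[2.0, 2.0], [3.0, 3.0], [0.0, 0.0], [1.0, 0.0]])
y = np.array([+1, +1, -1, -1])

def objective(params):
    w = params[:-1]                      # last entry of params is the bias b
    return np.dot(w, w)                  # ||w||^2 (the 1/2 factor does not change the argmin)

constraints = [
    {"type": "ineq", "fun": lambda p, i=i: y[i] * (np.dot(p[:-1], X[i]) + p[-1]) - 1.0}
    for i in range(len(X))               # y_i (w . x_i + b) - 1 >= 0 for every training point
]

x0 = np.array([1.0, 1.0, -2.0])          # a feasible starting guess for (w1, w2, b)
res = minimize(objective, x0, method="SLSQP", constraints=constraints)
w, b = res.x[:-1], res.x[-1]
print("w =", w, "b =", b)                # margin width is 2 / ||w||
```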
Linear SVM
• Linear model:
    f(x) = +1 if w · x + b ≥ 1
           −1 if w · x + b ≤ −1
• Learning the model is equivalent to determining the values of w and b
• How do we find w and b from the training data?
  - Constrained optimization
  - Lagrangian method
SVM – Lagrangian Formulation
Minimize ||w||² / 2 subject to yi (w · xi + b) ≥ 1 for all training points. Introducing a Lagrange multiplier λi ≥ 0 for each constraint gives

  L = ||w||² / 2 − Σi λi [ yi (w · xi + b) − 1 ]

Setting the derivatives with respect to w and b to zero yields w = Σi λi yi xi and Σi λi yi = 0. Substituting back gives the dual problem: maximize Σi λi − ½ Σi Σj λi λj yi yj (xi · xj) subject to λi ≥ 0 and Σi λi yi = 0. Training points with λi > 0 are the support vectors.
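A hedged sketch of solving this dual numerically with SciPy (this is not the solver used to produce the slides; the data are the eight tuples from the example on the next slide, and the recovered multipliers should come out close to the λ column shown there):

```python
# SVM dual: maximize sum(l) - 1/2 * sum_ij l_i l_j y_i y_j (x_i . x_j)
#           subject to l_i >= 0 and sum_i l_i y_i = 0.
import numpy as np
from scipy.optimize import minimize

X = np.array([[0.3858, 0.4687], [0.4871, 0.6110], [0.9218, 0.4103], [0.7382, 0.8936],
              [0.1763, 0.0579], [0.4057, 0.3529], [0.9355, 0.8132], [0.2146, 0.0099]])
y = np.array([1, -1, -1, -1, 1, 1, -1, 1], dtype=float)

K = (X @ X.T) * np.outer(y, y)            # y_i y_j (x_i . x_j)

def neg_dual(lam):                        # minimize the negative dual objective
    return 0.5 * lam @ K @ lam - lam.sum()

res = minimize(neg_dual, x0=np.zeros(len(X)), method="SLSQP",
               bounds=[(0, None)] * len(X),
               constraints=[{"type": "eq", "fun": lambda lam: lam @ y}])
print(np.round(res.x, 4))                 # two multipliers near 65.5 (the support vectors), the rest near 0
```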
Example of Linear SVM
x1       x2       y    λ
0.3858   0.4687   1    65.5261
0.4871   0.611    -1   65.5261
0.9218   0.4103   -1   0
0.7382   0.8936   -1   0
0.1763   0.0579   1    0
0.4057   0.3529   1    0
0.9355   0.8132   -1   0
0.2146   0.0099   1    0
• Only the first two tuples (those with λ > 0) are support vectors in this case
Learning Linear SVM
• Let w = (w1, w2) and b denote the parameters to be determined. We can solve for w and b from the support vectors and their multipliers: w = Σi λi yi xi, then b = yk − w · xk for any support vector xk (see the sketch below).
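A worked sketch of that computation in NumPy, using the two support vectors and λ values from the table above (the test record at the end is made up for illustration):

```python
# Recovering w and b from the support vectors and their Lagrange multipliers.
import numpy as np

# Support vectors from the example table (the tuples with lambda > 0)
sv_x   = np.array([[0.3858, 0.4687], [0.4871, 0.6110]])
sv_y   = np.array([1.0, -1.0])
sv_lam = np.array([65.5261, 65.5261])

w = (sv_lam * sv_y) @ sv_x               # w = sum_i lambda_i y_i x_i
b = sv_y[0] - w @ sv_x[0]                # from y_k (w . x_k + b) = 1 at a support vector
print("w =", np.round(w, 2), "b =", round(b, 2))   # roughly w = [-6.64, -9.32], b = 7.93

# Classify a test record with f(x) = sign(w . x + b)
x_test = np.array([0.2, 0.1])            # illustrative test point
print("prediction:", int(np.sign(w @ x_test + b)))  # +1 for this point
```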
Learning Linear SVM
• The decision boundary depends only on the support vectors
• If you have a data set with the same support vectors, the decision boundary will not change
• How to classify using an SVM once w and b are found? Given a test record xi:
    f(xi) = sign(w · xi + b), i.e. +1 if w · xi + b ≥ 0 and −1 otherwise
Support Vector Machines
• What if the problem is not linearly separable?
• Introduce slack variables ξi ≥ 0
• Need to minimize:
    L(w) = ||w||² / 2 + C Σ_{i=1..N} ξi^k
• Subject to:
    yi = +1 if w · xi + b ≥ 1 − ξi
    yi = −1 if w · xi + b ≤ −1 + ξi
• If k is 1 or 2, this leads to a similar objective function as the linear SVM, but with different constraints
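As an illustration of the role of C, here is a small sketch using scikit-learn's SVC, which solves essentially this soft-margin formulation with k = 1; the toy data and the particular C values are made-up assumptions:

```python
# Soft-margin linear SVM: a small C tolerates more slack (wider margin, more violations),
# a large C penalizes violations heavily (narrower margin, fewer violations).
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=[0, 0], scale=0.8, size=(20, 2)),
               rng.normal(loc=[2, 2], scale=0.8, size=(20, 2))])   # slightly overlapping classes
y = np.array([-1] * 20 + [+1] * 20)

for C in (0.1, 1.0, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    print(f"C={C:>6}: #support vectors = {len(clf.support_)}, "
          f"w = {np.round(clf.coef_[0], 2)}, b = {round(float(clf.intercept_[0]), 2)}")
```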
Support Vector Machines
[Figure: two candidate hyperplanes B1 and B2 with their margins (b11–b12 and b21–b22)]
• Find the hyperplane that optimizes both factors: a wide margin and few margin violations
Nonlinear Support Vector Machines
• What if the decision boundary is not linear?
Nonlinear Support Vector Machines
• Transform the data into a higher-dimensional space via a mapping Φ
• Decision boundary: w · Φ(x) + b = 0
Example of Nonlinear SVM
[Figure: decision boundary of an SVM with a polynomial degree-2 kernel]
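One way to see what a polynomial degree-2 kernel does: the kernel value equals a dot product under an explicit quadratic feature map Φ, so the SVM never has to compute Φ. A small sketch (the feature map shown is one standard choice for 2-D inputs, used here purely for illustration):

```python
# The degree-2 polynomial kernel K(x, z) = (x . z + 1)^2 equals the dot product
# of an explicit quadratic feature map Phi(x), so the mapping never has to be computed.
import numpy as np

def phi(x):
    # Explicit feature map matching (x . z + 1)^2 for 2-D inputs
    x1, x2 = x
    return np.array([x1**2, x2**2,
                     np.sqrt(2) * x1 * x2,
                     np.sqrt(2) * x1, np.sqrt(2) * x2,
                     1.0])

def poly_kernel(x, z):
    return (np.dot(x, z) + 1.0) ** 2

x = np.array([0.3, 0.7])
z = np.array([0.9, 0.4])
print(np.dot(phi(x), phi(z)))   # same value...
print(poly_kernel(x, z))        # ...computed without leaving the original 2-D space
```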
Learning Nonlinear SVM
• Advantage of using a kernel:
  - Computing the dot product Φ(xi) · Φ(xj) via the kernel in the original space avoids the curse of dimensionality
• Not all functions can be kernels:
  - Must make sure there is a corresponding Φ in some high-dimensional space
  - Mercer's theorem gives the condition (see the sketch below)
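A small numeric illustration of Mercer's condition (assuming NumPy): a valid kernel's Gram matrix on any finite sample must be positive semidefinite, whereas a deliberately invalid "kernel" fails the check. Both functions below are illustrative choices, not from the slides:

```python
# Mercer's condition in practice: a valid kernel's Gram matrix on any finite sample
# must be symmetric positive semidefinite (all eigenvalues >= 0, up to rounding).
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(10, 2))

def gram(kernel, X):
    return np.array([[kernel(a, b) for b in X] for a in X])

rbf     = lambda a, b: np.exp(-np.sum((a - b) ** 2))   # a valid kernel
not_psd = lambda a, b: -np.sum((a - b) ** 2)           # not a valid kernel in general

for name, k in [("rbf", rbf), ("negative squared distance", not_psd)]:
    eigvals = np.linalg.eigvalsh(gram(k, X))
    print(f"{name}: min eigenvalue = {eigvals.min():.6f}")  # negative => fails Mercer's condition
```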
Characteristics of SVM
• The learning problem is formulated as a convex optimization problem
• Efficient algorithms are available to find the global minimum
• Many of the other methods use greedy approaches and find locally
optimal solutions
• High computational complexity for building the model
• Robust to noise
• Overfitting is handled by maximizing the margin of the decision boundary
• SVM can handle irrelevant and redundant attributes better than many
other techniques
• The user needs to provide the type of kernel function and cost function
• Difficult to handle missing values
References
• An excellent tutorial on VC-dimension and support vector machines:
  C. J. C. Burges. A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery, 2(2):121-167, 1998.
  http://citeseer.nj.nec.com/burges98tutorial.html
• The VC/SRM/SVM bible:
  Vladimir Vapnik. Statistical Learning Theory. Wiley-Interscience, 1998.
• Download SVM-light:
  http://svmlight.joachims.org/
Some other issues in SVM
• SVM works only in a real-valued space. For a categorical attribute, we need to convert its categorical values to numeric values.
• SVM natively performs only two-class classification. For multi-class problems, strategies such as one-against-rest or error-correcting output coding can be applied (see the sketch below).
• The hyperplane produced by SVM is hard for human users to interpret, and kernels make this worse. SVM is therefore commonly used in applications that do not require human understanding of the model.
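A hedged sketch touching the first two points above: converting a categorical attribute to numeric one-hot columns, then training a one-against-rest multi-class SVM with scikit-learn. The tiny dataset, attribute names, and class labels are made up for illustration:

```python
# One-against-rest multi-class SVM on data with a categorical attribute:
# the categorical values are first converted to numeric (one-hot) columns.
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

colors = ["red", "green", "blue", "red", "blue", "green"]   # categorical attribute
sizes  = [1.0, 2.5, 0.5, 1.2, 0.4, 2.8]                     # numeric attribute
labels = np.array([0, 1, 2, 0, 2, 1])                       # three classes

categories = sorted(set(colors))
onehot = np.array([[1.0 if c == cat else 0.0 for cat in categories] for c in colors])
X = np.column_stack([onehot, sizes])

clf = OneVsRestClassifier(SVC(kernel="linear")).fit(X, labels)   # one binary SVM per class
print(clf.predict(X))
```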