Support Vector Machine
An SVM takes training data as input and
outputs a line/hyperplane that separates
the classes, if possible.
from sklearn.svm import SVC

# X_train and y_train are assumed to come from an earlier train/test split
print(X_train.shape)  # inspect the shape of the training data

# Train a linear-kernel SVM classifier
svclassifier = SVC(kernel='linear')
svclassifier.fit(X_train, y_train)
# Predict on the test set and evaluate the results
y_pred = svclassifier.predict(X_test)

from sklearn.metrics import classification_report, confusion_matrix
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))
SVM Kernels
1. Polynomial Kernel
In the case of the polynomial kernel, you also
have to pass a value for the degree parameter
of the SVC class; this is the degree
of the polynomial.
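For instance, a polynomial kernel of degree 8 (the degree value here is an illustrative choice, not one fixed by the slides) can be created as:

svclassifier = SVC(kernel='poly', degree=8)  # degree applies only to the 'poly' kernel
svclassifier.fit(X_train, y_train)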
2. Gaussian Kernel
Take a look at how we can use the Gaussian
(RBF) kernel to implement kernel SVM:
svclassifier = SVC(kernel='rbf')
3. Sigmoid Kernel
svclassifier = SVC(kernel='sigmoid')
Parameter Tuning
Most machine learning and deep learning algorithms
have adjustable parameters, which are
called hyperparameters.
We need to set hyperparameters before we train the
models. Hyperparameters are critical for building
robust and accurate models.
They help us find the balance between bias and variance
and thus prevent the model from overfitting or
underfitting.
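As a concrete illustration (the specific values below are arbitrary placeholders, not recommendations), hyperparameters such as kernel, C, and gamma are fixed when the SVC object is constructed, before training starts:

from sklearn.svm import SVC

svclassifier = SVC(kernel='rbf', C=1.0, gamma=0.1)  # hyperparameters: chosen by us up front
svclassifier.fit(X_train, y_train)                  # model parameters: learned from the data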
[Figure: data points in 2D space that cannot be perfectly separated by a straight line]
To overcome this issue, in 1995 Cortes and Vapnik came up with the
idea of the “soft margin” SVM, which allows some examples to be
misclassified or to fall on the wrong side of the decision boundary.
When determining the decision boundary, a soft margin
SVM tries to solve an optimization problem with the
following goals:
1. Maximize the distance of the decision boundary to the classes (i.e., to the support vectors)
2. Maximize the number of points that are correctly classified in the training set
There is a trade-off between these two goals.
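In symbols, the soft-margin optimization (a standard textbook formulation, not copied from the slides) introduces a slack variable $\xi_i \ge 0$ for each training point and solves

$$\min_{w,\,b,\,\xi}\;\frac{1}{2}\lVert w\rVert^2 + C\sum_{i} \xi_i \quad \text{subject to} \quad y_i\,(w^\top x_i + b) \ge 1 - \xi_i,\;\; \xi_i \ge 0,$$

where minimizing $\lVert w\rVert^2$ widens the margin (goal 1) and the penalty term $C\sum_i \xi_i$ discourages margin violations (goal 2); the constant $C$ sets the balance between the two.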
The C parameter adds a penalty for each misclassified data
point.
If C is small, the penalty for misclassified points is low, so a
decision boundary with a large margin is chosen at the
expense of a greater number of misclassifications.
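A minimal sketch of this effect, assuming a synthetic two-class dataset (make_blobs and the C values below are illustrative choices, not from the slides):

from sklearn.svm import SVC
from sklearn.datasets import make_blobs

X, y = make_blobs(n_samples=100, centers=2, cluster_std=1.5, random_state=0)
for C in (0.01, 100):
    clf = SVC(kernel='linear', C=C).fit(X, y)
    # A small C tolerates misclassifications and keeps a wider margin,
    # which typically leaves more points on or inside the margin
    # (i.e., more support vectors).
    print(f"C={C}: {clf.n_support_.sum()} support vectors, "
          f"training accuracy={clf.score(X, y):.2f}")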
Parameter Tuning
Tuning parameter values for machine learning
algorithms effectively improves model
performance.
kernel: Here we have various options
available, such as 'linear', 'rbf',
'poly', and others (the default value is
'rbf'). 'rbf' and 'poly' are useful for
non-linear decision boundaries.
Programming Example
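The original slides show this example as screenshots; below is a minimal reconstruction, assuming the Iris dataset and an 80/20 split (both are assumptions, not taken from the slides):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import classification_report

# Load a small labeled dataset and split it for training and testing
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Train a linear-kernel SVM and report test-set performance
svclassifier = SVC(kernel='linear')
svclassifier.fit(X_train, y_train)
y_pred = svclassifier.predict(X_test)
print(classification_report(y_test, y_pred))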
Change the kernel type to 'rbf':
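Only the constructor call changes; the rest of the example stays the same:

svclassifier = SVC(kernel='rbf')
svclassifier.fit(X_train, y_train)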
gamma: Kernel coefficient for 'rbf', 'poly', and
'sigmoid'. The higher the value of gamma, the more
closely the model tries to fit the training data set,
which can increase the generalization error and cause
an over-fitting problem.
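A small sketch of this behavior, reusing the train/test split from the programming example (the gamma values are illustrative):

for gamma in (0.01, 1, 100):
    clf = SVC(kernel='rbf', gamma=gamma).fit(X_train, y_train)
    # With a very large gamma, training accuracy rises while test
    # accuracy usually drops: the classic over-fitting signature.
    print(f"gamma={gamma}: train={clf.score(X_train, y_train):.2f}, "
          f"test={clf.score(X_test, y_test):.2f}")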
C: Penalty parameter of the error term. It also
controls the trade-off between a smooth decision
boundary and classifying the training points
correctly.
Gamma vs C parameter
For a linear kernel, we just need to optimize the C
parameter.
However, if we want to use an RBF kernel, both the C and
gamma parameters need to be optimized simultaneously.
If gamma is large, the effect of C becomes negligible.
If gamma is small, C affects the model just as it
affects a linear model.
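A minimal sketch of tuning both together with a grid search (the parameter ranges are illustrative assumptions, and X_train/y_train come from the earlier example):

from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Try every (C, gamma) combination with 5-fold cross-validation
param_grid = {'C': [0.1, 1, 10, 100], 'gamma': [0.001, 0.01, 0.1, 1]}
grid = GridSearchCV(SVC(kernel='rbf'), param_grid, cv=5)
grid.fit(X_train, y_train)
print(grid.best_params_)  # the (C, gamma) pair with the best CV score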