Introduction to Support Vector Machines (SVM)
Last Updated: 02 Feb, 2023
INTRODUCTION:
Support Vector Machines (SVMs) are a type of supervised learning algorithm that can be used for classification or regression tasks. The main idea behind SVMs is to find a hyperplane that maximally separates the different classes in the training data. This is done by finding the hyperplane that has the largest margin, which is defined as the distance between the hyperplane and the closest data points from each class. Once the hyperplane is determined, new data can be classified by determining on which side of the hyperplane it falls. SVMs are particularly useful when the data has many features, and/or when there is a clear margin of separation in the data.
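As a minimal sketch of the margin idea, the snippet below fits a linear SVM from scikit-learn on a tiny hand-made 2-D dataset. The data points, the very large `C` (used here to approximate a hard margin), and the variable names are assumptions chosen for illustration. It reads off the hyperplane `w . x + b = 0`, computes the margin width as `2 / ||w||`, and classifies new points by the side of the hyperplane they fall on.

```python
import numpy as np
from sklearn.svm import SVC

# Toy, hand-made data (illustrative assumption): two well separated groups.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [3.0, 3.0], [3.0, 4.0], [4.0, 3.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# A very large C leaves almost no room for margin violations,
# which approximates a hard-margin SVM on separable data.
clf = SVC(kernel="linear", C=1e6).fit(X, y)

w = clf.coef_[0]          # normal vector of the separating hyperplane
b = clf.intercept_[0]     # offset of the hyperplane
margin_width = 2.0 / np.linalg.norm(w)

# New points are classified by the sign of w . x + b.
new_points = np.array([[0.5, 0.5], [3.5, 3.5]])
scores = new_points @ w + b
predicted = (scores > 0).astype(int)

print("support vectors:", clf.support_vectors_)
print("margin width:", margin_width)
print("predicted classes:", predicted)
```

For this particular dataset the geometry can be checked by hand: the closest points are (0, 1) and (1, 0) on one side and (3, 3) on the other, so the margin width should come out close to 5/√2 ≈ 3.54, up to solver tolerance.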
What are Support Vector Machines?
Support Vector Machine (SVM) is a relatively simple supervised machine learning algorithm used for classification and/or regression. It is used more often for classification, but it can also be very useful for regression. An SVM finds a hyperplane that forms the boundary between the classes of data. In 2-dimensional space, this hyperplane is simply a line. In SVM, we plot each data item in the dataset in an N-dimensional space, where N is the number of features/attributes in the data, and then find the optimal hyperplane to separate the data. Because a single hyperplane splits the space into two sides, an SVM is inherently a binary classifier (i.e., it chooses between two classes). However, there are various techniques for applying it to multi-class problems.
Support Vector Machine for Multi-Class Problems
To perform SVM on multi-class problems, we can create a binary classifier for each class of the data. The two possible results of each classifier are:
- The data point belongs to that class OR
- The data point does not belong to that class.
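The one-classifier-per-class scheme can be sketched as follows. This is an illustrative sketch rather than how any particular library implements it: the synthetic dataset from `make_blobs`, the linear kernel, and names such as `binary_models` are assumptions. Each binary SVM scores "this class" against "not this class", and the class whose classifier gives the highest score is chosen.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Synthetic 3-class data (illustrative assumption).
X, y = make_blobs(n_samples=150, centers=3, random_state=0)
classes = np.unique(y)

# One binary classifier per class: label 1 means "belongs to this class",
# label 0 means "does not belong to this class".
binary_models = [
    SVC(kernel="linear").fit(X, (y == c).astype(int)) for c in classes
]

# Column k holds the signed score of the classifier for classes[k].
scores = np.column_stack([m.decision_function(X) for m in binary_models])

# The class whose classifier is most confident wins.
predicted = classes[np.argmax(scores, axis=1)]
accuracy = np.mean(predicted == y)
print("training accuracy:", accuracy)
```

scikit-learn also provides `sklearn.multiclass.OneVsRestClassifier` for this pattern. Note that `SVC` itself handles multi-class input with a one-vs-one scheme internally, so the manual loop above is only needed when one-vs-rest is specifically wanted.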
For example, to perform multi-class classification on a set of fruits, we can create a binary classifier for each fruit. For the 'mango' class, say, there will be a binary classifier that predicts whether an item IS a mango or is NOT a mango. The class whose classifier gives the highest score is chosen as the output of the SVM.
SVM for Complex (Non-Linearly Separable) Data
SVM works very well without any modifications for linearly separable data. Linearly separable data is data whose classes can be separated by a straight line when plotted in a graph (or, in more than two dimensions, by a hyperplane).
Figure: (A) Linearly Separable Data; (B) Non-Linearly Separable Data
We use Kernelized SVM for non-linearly separable data. Say we have some non-linearly separable data in one dimension. We can transform this data into two dimensions so that it becomes linearly separable there. This is done by mapping each 1-D data point to a corresponding 2-D ordered pair (for example, x to (x, x²)). In general, data that is not linearly separable in its original space can often be made linearly separable by mapping it to a higher-dimensional space. This is a very powerful and general transformation.
A kernel is a measure of similarity between data points. Given two data points in the original feature space, the kernel function in a kernelized SVM tells you how similar the points are in the newly transformed feature space. There are various kernel functions available, but two are very popular:
- Radial Basis Function Kernel (RBF): The similarity between two points in the transformed feature space is an exponentially decaying function of the distance between the vectors in the original input space, as shown below. RBF is the default kernel in scikit-learn's SVC.
K(x, x') = \exp(-\gamma \lVert x - x' \rVert^2)
- Polynomial Kernel: The polynomial kernel takes an additional parameter, 'degree', that controls the model's complexity and the computational cost of the transformation.
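To make the 1-D to 2-D mapping and the notion of a kernel concrete, here is a small NumPy sketch. The data, the map `phi(x) = (x, x^2)`, the threshold 2.5, and the function names are all assumptions picked for illustration. It checks that the mapped points can be split by a straight line, that the similarity computed from the original 1-D points with a matching kernel equals the dot product of the mapped 2-D points, and it evaluates the RBF formula given above.

```python
import numpy as np

# 1-D points: class 1 sits between the class 0 points, so no single
# threshold on x separates the two classes.
x = np.array([-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
y = np.array([0, 0, 1, 1, 1, 0, 0])

def phi(x):
    """Hypothetical explicit map from 1-D to 2-D: x -> (x, x^2)."""
    return np.column_stack([x, x ** 2])

Z = phi(x)
# In 2-D the horizontal line (second coordinate = 2.5) separates the classes.
separable = np.all((Z[:, 1] < 2.5) == (y == 1))

def mapped_kernel(a, b):
    """Kernel matching phi: K(a, b) = a*b + (a*b)^2 = phi(a) . phi(b)."""
    return a * b + (a * b) ** 2

def rbf_kernel_value(a, b, gamma=1.0):
    """RBF kernel: K(a, b) = exp(-gamma * ||a - b||^2)."""
    diff = np.atleast_1d(a) - np.atleast_1d(b)
    return np.exp(-gamma * np.dot(diff, diff))

# Pairwise similarities computed two ways: from the mapped points,
# and directly from the original points through the kernel.
gram_explicit = Z @ Z.T
gram_kernel = mapped_kernel(x[:, None], x[None, :])

print("separable after mapping:", separable)
print("same similarities:", np.allclose(gram_explicit, gram_kernel))
print("RBF similarity at distance 0 and 1:",
      rbf_kernel_value(0.0, 0.0), rbf_kernel_value(0.0, 1.0))
```

The point of the comparison is that `gram_kernel` never builds the 2-D representation, yet it yields the same pairwise similarities as `gram_explicit`.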
An interesting fact is that SVM does not actually have to perform this transformation of the data points into the new high-dimensional feature space. This is called the kernel trick.
The Kernel Trick:
Internally, the kernelized SVM only needs similarity calculations between pairs of points in the higher-dimensional feature space, so the transformed feature representation stays implicit. This similarity function, which is mathematically a kind of dot product in the transformed space, is the kernel of a kernelized SVM. This makes it practical to apply SVM when the underlying feature space is complex or even infinite-dimensional. The full mathematics of the kernel trick is beyond the scope of this article.
Important Parameters in Kernelized SVC (Support Vector Classifier)
- The Kernel: The kernel is selected based on the type of data and the type of transformation needed. By default, the kernel is the Radial Basis Function (RBF) kernel.
- Gamma: This parameter decides how far the influence of a single training example reaches, which in turn affects how tightly the decision boundaries end up surrounding points in the input space. With a small value of gamma, points farther apart are still considered similar, so more points are grouped together and the decision boundaries are smoother (possibly less accurate). With a larger value of gamma, points must be closer together to be considered similar, which gives tighter, more complex boundaries (may cause overfitting).
- The 'C' parameter: This parameter controls the amount of regularization applied to the model. Large values of C mean low regularization, so the model fits the training data very closely (may cause overfitting). Lower values of C mean higher regularization, which makes the model more tolerant of errors on the training data (may lead to lower accuracy).
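The effect of gamma and C can be explored with a short experiment. This is a sketch under assumptions: the `make_moons` dataset, the noise level, and the specific values tried are arbitrary illustrative choices, and the exact numbers printed will depend on them. It fits an RBF-kernel `SVC` for several gamma values (with C fixed) and several C values (with gamma left at scikit-learn's `"scale"` setting), then compares training and test accuracy.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic two-class data that is not linearly separable (illustrative choice).
X, y = make_moons(n_samples=200, noise=0.3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)

# Vary gamma with C fixed.
results = {}
for gamma in (0.01, 1.0, 100.0):
    model = SVC(kernel="rbf", gamma=gamma, C=1.0).fit(X_train, y_train)
    results[gamma] = {
        "train": model.score(X_train, y_train),
        "test": model.score(X_test, y_test),
        "n_support": int(model.n_support_.sum()),
    }

# Vary C with gamma left at the library's "scale" heuristic.
c_results = {}
for C in (0.01, 1.0, 100.0):
    model = SVC(kernel="rbf", gamma="scale", C=C).fit(X_train, y_train)
    c_results[C] = {
        "train": model.score(X_train, y_train),
        "test": model.score(X_test, y_test),
    }

for gamma, r in results.items():
    print(f"gamma={gamma}: train={r['train']:.2f} test={r['test']:.2f} "
          f"support vectors={r['n_support']}")
for C, r in c_results.items():
    print(f"C={C}: train={r['train']:.2f} test={r['test']:.2f}")
```

A large gap between training and test accuracy at high gamma or high C is the overfitting pattern described in the list above; similar, lower scores on both splits at very small values suggest the model is too smooth.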
Pros of Kernelized SVM:
- They perform very well on a range of datasets.
- They are versatile: different kernel functions can be specified, or custom kernels can also be defined for specific datatypes.
- They work well for both high and low dimensional data.
Cons of Kernelized SVM:
- Efficiency (running time and memory usage) decreases as the size of the training set increases.
- Needs careful normalization of input data and parameter tuning.
- Does not provide a direct probability estimator.
- Difficult to interpret why a prediction was made.
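Two of the points above, normalization with parameter tuning and the lack of a direct probability estimate, are commonly handled along the following lines in scikit-learn. This is one possible sketch, not the only approach: the synthetic dataset, the parameter grid, and the use of `CalibratedClassifierCV` to obtain probabilities are assumptions made for illustration.

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic binary data (illustrative assumption).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)

# Scaling lives inside the pipeline, so it is re-fit on each
# cross-validation fold and never sees the held-out part.
pipeline = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# Small, arbitrary grid over the two main RBF parameters.
param_grid = {"svc__C": [0.1, 1.0, 10.0], "svc__gamma": [0.01, 0.1, 1.0]}
search = GridSearchCV(pipeline, param_grid, cv=3).fit(X_train, y_train)
test_accuracy = search.score(X_test, y_test)

# An SVM natively gives signed scores, not probabilities.
scores = search.decision_function(X_test[:3])

# Probabilities can be obtained by calibrating those scores afterwards.
calibrated = CalibratedClassifierCV(search.best_estimator_, cv=3)
calibrated.fit(X_train, y_train)
probabilities = calibrated.predict_proba(X_test[:3])

print("best parameters:", search.best_params_)
print("test accuracy:", test_accuracy)
print("decision scores:", scores)
print("calibrated probabilities:", probabilities)
```

The calibrated probabilities are an extra model fitted on top of the SVM scores, so they should be read as estimates rather than as something the SVM itself produces.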
Example
Python3
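import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Generate a synthetic 4-class dataset: 100 samples, 5 features
classes = 4
X, t = make_classification(100, 5, n_classes=classes, random_state=40,
                           n_informative=2, n_clusters_per_class=1)

# Hold out half of the data for testing (random_state makes the split repeatable)
X_train, X_test, y_train, y_test = train_test_split(X, t, test_size=0.50,
                                                    random_state=0)

# Fit a linear SVM
model = svm.SVC(kernel='linear', random_state=0, C=1.0)
model.fit(X_train, y_train)

# Predict on both splits and print test accuracy, then training accuracy
y_pred_test = model.predict(X_test)
y_pred_train = model.predict(X_train)
print(accuracy_score(y_test, y_pred_test))
print(accuracy_score(y_train, y_pred_train))

# Plot the training points on the first two features:
# class 0 in black, all other classes in light grey
color = ['black' if c == 0 else 'lightgrey' for c in y_train]
plt.scatter(X_train[:, 0], X_train[:, 1], c=color)

# coef_[0] is the first one-vs-one boundary (class 0 vs. class 1).
# The line drawn is its slice through the first two features,
# with the remaining features held at 0.
w = model.coef_[0]
a = -w[0] / w[1]
xx = np.linspace(-2.5, 2.5)
yy = a * xx - model.intercept_[0] / w[1]

# Plot the hyperplane
plt.plot(xx, yy)
plt.axis("off")
plt.show()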
Conclusion: Now that you know the basics of how an SVM works, you can go to the following link to learn how to implement SVM to classify items using Python: https://www.geeksforgeeks.org/classifying-data-using-support-vector-machinessvms-in-python/