Machine Learning Unit-3.3
• The goal of the SVM algorithm is to create the best line or decision
boundary that can segregate n-dimensional space into classes so that we
can easily put the new data point in the correct category in the future.
• In this scenario, hyper-plane “B” performs this job excellently.
• Scenario-2: Identify the right hyper-plane: Here, we have three hyper-planes (A, B, and C), and all segregate the classes well. Now, how can we identify the right hyper-plane?
• Above, you can see that the margin for hyper-plane C is higher than for both A and B. Hence, we name C as the right hyper-plane. Another compelling reason for selecting the hyper-plane with the higher margin is robustness: if we select a hyper-plane with a low margin, there is a high chance of misclassification. (The margin is made precise just below.)
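As a brief aside (an addition, not from the slides): for a separating hyper-plane w · x + b = 0, scaled so that the closest training points satisfy |w · xi + b| = 1, the margin has a simple closed form, and SVM training is exactly its maximization:

```latex
% Margin of the canonical separating hyper-plane (standard SVM result).
\text{margin} = \frac{2}{\lVert w \rVert},
\qquad
\max_{w,\,b}\ \frac{2}{\lVert w \rVert}
\quad \text{subject to} \quad
y_i\,(w \cdot x_i + b) \ge 1 \ \ \forall i
```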
• Scenario-3: Identify the right hyper-plane:
• The SVM uses what is called the “Kernel Trick”, where the data is transformed and an optimal boundary is found for the possible outputs.
• Here comes the use of the kernel function, which transforms the points to higher dimensions, solves the problem there, and returns the output (formalized just below).
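To state the trick precisely (a standard identity, added here for completeness): a kernel returns the inner product two points would have after a mapping φ to the higher-dimensional space, without ever computing φ(x) explicitly:

```latex
% Kernel trick: the kernel equals an inner product in the feature space.
K(x_i, x_j) = \langle \varphi(x_i), \varphi(x_j) \rangle
```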
• Think of it this way: we can see that the squares are enclosed in some perimeter area while the circles lie outside it; likewise, there could be other scenarios where the green dots are distributed in a trapezoid-shaped area.
• They help to determine the shape of the hyperplane and decision boundary.
• We can set the value of the kernel parameter in the SVM code (see the code sketch after this list).
• If the value of the kernel parameter is linear, then the decision boundary will be linear, i.e. a straight line in two-dimensional data.
• We just have to give the input and use the appropriate kernel.
• SVM algorithms use a set of mathematical functions that are collectively referred to as the kernel.
• The most widely used kernel function is RBF, because it has a localized and finite response along the entire x-axis.
• The kernel functions return the inner product between two points in a suitable
feature space.
• Simply put, it performs some extremely complex data transformations, then works out how to separate the data points based on the target classes you’ve defined.
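A minimal code sketch of the kernel parameter mentioned above, assuming scikit-learn is available; the moons dataset and the parameter values are illustrative additions, not from the slides:

```python
# Sketch: the `kernel` argument selects the mathematical function SVC uses,
# and with it the shape of the decision boundary.
from sklearn.datasets import make_moons
from sklearn.svm import SVC

X, y = make_moons(n_samples=200, noise=0.15, random_state=42)

for kernel in ("linear", "poly", "rbf", "sigmoid"):
    clf = SVC(kernel=kernel, gamma="scale")
    clf.fit(X, y)
    print(kernel, "training accuracy:", clf.score(X, y))
```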
SVM for Non-Linear Data Sets
• An example of non-linear data: apples and lemons plotted so that no straight line can separate them (figure omitted).
• So how can we solve this problem? We will use the Kernel Trick!
• The basic idea is that when a data set is inseparable in its current dimensions, we add another dimension; maybe that way the data will become separable.
• If we plot the surface defined by the formula z = x1² + x2², we will get something like a bowl-shaped paraboloid (figure omitted).
• Now we have to map the apples and lemons (which are
just simple points) to this new space.
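A minimal sketch of this lifting, assuming scikit-learn; the concentric-circles dataset stands in for the apples/lemons figure and is an illustrative assumption:

```python
# Sketch: add the dimension z = x1^2 + x2^2 by hand, then separate linearly.
import numpy as np
from sklearn.datasets import make_circles
from sklearn.svm import SVC

X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

# The extra dimension described in the text.
z = (X[:, 0] ** 2 + X[:, 1] ** 2).reshape(-1, 1)
X_lifted = np.hstack([X, z])

# In the lifted 3-D space, a plain linear SVM can separate the classes.
clf = SVC(kernel="linear").fit(X_lifted, y)
print("accuracy in the lifted space:", clf.score(X_lifted, y))
```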
1. Linear Kernel: Let us say that we have two vectors named x1 and x2; then the linear kernel is defined by the dot product of these two vectors:
F(x1, x2) = x1 · x2
2. Polynomial Kernel: It generalizes the linear kernel by raising the dot product to a chosen degree:
F(xi, xj) = (xi · xj + 1)^d
Here ‘·’ shows the dot product of both the vectors, and d denotes the degree, with F(xi, xj) representing the decision boundary that separates the given classes.
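A small worked example (an illustrative addition) evaluating the linear and polynomial kernel formulas above on two arbitrary vectors:

```python
# Sketch: evaluate the linear and polynomial kernels by hand.
import numpy as np

x1 = np.array([1.0, 2.0])
x2 = np.array([3.0, 1.0])

linear = x1 @ x2                 # x1 . x2 = 1*3 + 2*1 = 5
d = 2                            # the degree
poly = (x1 @ x2 + 1) ** d        # (x1 . x2 + 1)^d = (5 + 1)^2 = 36

print(linear, poly)              # -> 5.0 36.0
```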
3. Gaussian RBF Kernel: It is a general-purpose kernel; used when there is no prior knowledge about the
data.
Equation is:
F(xi, xj) = exp(−γ ‖xi − xj‖²), for γ > 0
• The value of gamma typically varies from 0 to 1, and you have to provide it manually in the code. A commonly preferred value for gamma is 0.1.
• The sigma parameter (related to gamma by γ = 1/(2σ²)) plays a very important role in the performance of the Gaussian kernel; it should be neither overestimated nor underestimated, but carefully tuned according to the problem.
• It is one of the most preferred and widely used kernel functions in SVM, usually chosen for non-linear data. It helps to make a proper separation when there is no prior knowledge of the data.
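A minimal sketch, assuming scikit-learn: it evaluates the RBF formula above by hand and checks it against sklearn's rbf_kernel; gamma = 0.1 follows the value the slides suggest.

```python
# Sketch: RBF kernel computed manually and via scikit-learn.
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

x1 = np.array([[1.0, 2.0]])
x2 = np.array([[3.0, 1.0]])
gamma = 0.1

manual = np.exp(-gamma * np.sum((x1 - x2) ** 2))  # exp(-0.1 * 5) ~ 0.6065
library = rbf_kernel(x1, x2, gamma=gamma)[0, 0]

print(manual, library)  # both ~ 0.6065
```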
4. Hyperbolic Tangent (Sigmoid) Kernel
• This kernel is used in neural network areas of machine learning.
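The slides stop before giving the equation; for completeness, the standard form of this kernel (the one scikit-learn computes for kernel='sigmoid', with r exposed as coef0) is:

```latex
% Sigmoid / hyperbolic tangent kernel; gamma and r are tunable parameters.
F(x_i, x_j) = \tanh(\gamma \, x_i \cdot x_j + r)
```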