solution_in5520_exercise_svm_2020
Exercise 1.
Show that the condition yi(wT xi + w0) > 0 corresponds to correct classification for all N samples in a binary classification problem with classes ‐1 and 1.
See attachment.
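The key step of the argument (a standard derivation, sketched here for completeness): with yi ∈ {−1, +1} and discriminant g(xi) = wT xi + w0,

```latex
% x_i is correctly classified iff the sign of g(x_i) matches its label:
\[
  \bigl(y_i = +1 \text{ and } g(x_i) > 0\bigr)
  \quad\text{or}\quad
  \bigl(y_i = -1 \text{ and } g(x_i) < 0\bigr)
  \iff y_i\, g(x_i) > 0,
\]
% so the single inequality y_i (w^T x_i + w_0) > 0 encodes correct
% classification of sample x_i regardless of which class it belongs to.
```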
Exercise 2.
Given a binary data set:
Class ‐1: (1, 9), (5, 5), (1, 1)
Class 1:  (8, 5), (13, 1), (13, 9)
Plot the points. Sketch the support vectors and the decision boundary for a linear SVM
classifier with maximum margin for this data set.
See figure:
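A quick numerical check of the sketch (a minimal illustration, assuming the points listed above, which were reconstructed from the garbled layout): for the candidate hyperplane x1 = 6.5, i.e. w = (1, 0), w0 = −6.5, the nearest points on each side turn out to be (5, 5) and (8, 5).

```python
# Margin check for a candidate separating line, pure Python (no libraries).
# Points reconstructed from the exercise; w, w0 are a hypothesis to verify,
# not output from an SVM solver.

points = [((1, 9), -1), ((5, 5), -1), ((1, 1), -1),
          ((8, 5), +1), ((13, 1), +1), ((13, 9), +1)]

w = (1.0, 0.0)   # candidate normal vector (vertical boundary x1 = 6.5)
w0 = -6.5        # candidate bias

norm_w = (w[0] ** 2 + w[1] ** 2) ** 0.5

def signed_distance(x):
    """Signed distance from point x to the hyperplane w.x + w0 = 0."""
    return (w[0] * x[0] + w[1] * x[1] + w0) / norm_w

# Every point must lie on its own side, i.e. y * distance > 0.
assert all(y * signed_distance(x) > 0 for x, y in points)

# The margin of this hyperplane is the smallest absolute distance to it;
# the support-vector candidates are the points attaining that distance.
margin = min(abs(signed_distance(x)) for x, y in points)
support = [x for x, y in points if abs(signed_distance(x)) == margin]
print(margin, support)   # 1.5 on each side, so total margin 3
```

The distance from each class to the line is 1.5, so the total margin is 3, attained by support vectors (5, 5) and (8, 5).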
Exercise 3.
Given the binary classification problem:
Class ‐1: (2, 2), (3, 3), (4, 4), (5, 5), (4, 6), (3, 7), (4, 8), (5, 9), (6, 10)
Class 1:  (6, 2), (7, 3), (8, 4), (9, 5), (8, 6), (7, 7), (7, 8), (7, 9), (8, 10)
c) What is the error rate of the Gaussian classifier on the training data set?
d) Sketch on the plot the decision boundary you would get using an SVM with linear kernel and a
high cost of misclassifying training data. Indicate the support vectors and the decision
boundary on the plot.
e) What is the error rate of the linear SVM on the training data set?
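As a check (a sketch, assuming the points as reconstructed above): the two classes are in fact linearly separable, so a linear SVM with a high cost of misclassification can reach 0% training error. The witness line w = (3, −1), w0 = −11 below was found by hand and is only one of many possibilities.

```python
# Verify linear separability of the Exercise 3 data by exhibiting one
# separating line; pure Python, no SVM library needed.

class_neg = [(2, 2), (3, 3), (4, 4), (5, 5), (4, 6), (3, 7), (4, 8), (5, 9), (6, 10)]
class_pos = [(6, 2), (7, 3), (8, 4), (9, 5), (8, 6), (7, 7), (7, 8), (7, 9), (8, 10)]

w, w0 = (3.0, -1.0), -11.0   # hand-picked candidate, not an SVM solution

def score(x):
    """Linear discriminant w.x + w0; predicted label is its sign."""
    return w[0] * x[0] + w[1] * x[1] + w0

# Count training errors: class -1 points must score < 0, class 1 points > 0.
errors = sum(score(x) >= 0 for x in class_neg) + sum(score(x) <= 0 for x in class_pos)
print(errors)   # 0: a separating line exists, so the linear SVM training error is 0
```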
You can use a library for SVM, e.g. svmtrain and svmclassify in Matlab.
Load mybananadataset.mat. Try various values of the C‐parameter with a linear SVM.
Can the linear SVM classifier make a good separation of the feature space?
Change the kernel to an RBF (radial basis function) and rerun. Try changing the sigma‐parameter
(‘rbf_sigma’ in svmtrain). Make sure you know why we now get a non‐linear decision
boundary.
Answer: Points correctly classified and outside the margin have ξi = 0, while points inside the margin, but correctly classified,
have 0 < ξi ≤ 1. Points misclassified have ξi > 1.
d) Discuss how likely a Gaussian classifier and an SVM classifier are to overfit
to the training data.
Answer: A Gaussian classifier has a restricted decision‐boundary shape (linear or quadratic), so with complex, noisy data it will
not completely fit the data. An SVM, without a careful choice of C and kernel parameters, can easily overfit.
a) The basic optimization problem for a support vector machine classifier is:
minimize    J(w) = (1/2) ||w||^2
subject to  yi (wT xi + w0) ≥ 1,   i = 1, 2, ..., N
What is the total margin for this problem?
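A worked answer (standard derivation, not spelled out in the original sheet):

```latex
% The distance from a point x to the hyperplane w^T x + w_0 = 0 is
% |w^T x + w_0| / \|w\|. The closest points on each side satisfy the
% constraint with equality, y_i (w^T x_i + w_0) = 1, so each lies at
% distance 1/\|w\| from the hyperplane. The total margin is therefore
\[
  \text{margin} \;=\; \frac{1}{\|w\|} + \frac{1}{\|w\|} \;=\; \frac{2}{\|w\|},
\]
% which is why minimizing \tfrac{1}{2}\|w\|^2 maximizes the margin.
```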
An SVM uses the points closest to the other class to define the boundary, and is thus
sensitive to outliers. Gaussian classifiers use the class centres to define the
boundaries.
c) Support vector machine classifiers can also be explained based on convex hulls.
Explain the relationship between the convex hull of two regions and the hyperplane
with maximum margin.
Answer: If the problem is linearly separable, the convex hulls of the two classes are
non-overlapping. Furthermore, searching for the maximum‐margin hyperplane is equivalent
to searching for the two nearest points in the two convex hulls; the hyperplane
perpendicularly bisects the line segment between them.
d) Given below is a scatter plot of a binary classification problem. The plot is also copied
to the appendix. Sketch the convex hulls and use this to find an approximate
hyperplane.
e) In the general case the optimization problem is given as:
maximize    Σ_{i=1..N} λi − (1/2) Σ_{i,j} λi λj yi yj xiT xj
subject to  Σ_{i=1..N} yi λi = 0   and   0 ≤ λi ≤ C for all i
Explain briefly which terms kernels are used to compute in a high‐dimensional space,
and what the kernels measure.
Answer: Kernels are used to compute the inner product between pairs of samples, xiTxj, in a higher‐
dimensional space without forming that space explicitly. The inner product is a measure of similarity:
the cosine of the angle between two vectors can be expressed in terms of inner products. This is also
seen in the RBF kernel, which decreases with the distance between the two samples.
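As a small illustration (a sketch, not part of the original exercise): the RBF kernel K(x, z) = exp(−||x − z||² / (2σ²)) returns values near 1 for nearby samples and near 0 for distant ones, which is exactly how a similarity measure should behave.

```python
# RBF (Gaussian) kernel as a similarity measure, pure Python.
import math

def rbf_kernel(x, z, sigma=1.0):
    """K(x, z) = exp(-||x - z||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

# Identical samples are maximally similar; distant samples approach 0.
print(rbf_kernel((1, 2), (1, 2)))     # 1.0
print(rbf_kernel((1, 2), (10, 12)))   # essentially 0 (far apart)

# A larger sigma flattens the kernel: distant samples still look similar,
# which in an SVM gives smoother, less wiggly decision boundaries.
print(rbf_kernel((1, 2), (4, 6), sigma=0.5) < rbf_kernel((1, 2), (4, 6), sigma=5.0))  # True
```

This also shows why changing `rbf_sigma` in the Matlab experiment changes how non‐linear the resulting decision boundary is.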