Notes - Unit 4 - Machine Learning Lnctu-Bca (Aida) - IV Sem
Support Vector Machine (SVM) is a supervised machine learning algorithm used for classification and regression. Its main goal is to find the best boundary (called a hyperplane) that separates data into different classes.
Real-life Example:
Imagine you're a teacher and you want to separate students into two groups:
1. Students who passed
2. Students who failed
You have data like:
Number of hours studied
Number of assignments completed
SVM will help you draw a line (or a curve) that best separates these two groups, based on this data.
How does it work?
It finds the best boundary (a hyperplane) between the classes.
It chooses the line that is farthest from both classes (so it's more confident).
The closest points to this line are called Support Vectors.
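A minimal sketch of this in scikit-learn, using made-up student data from the example above (all numbers are hypothetical): after fitting, the points that ended up closest to the boundary are available as support_vectors_.
import numpy as np
from sklearn.svm import SVC

# Hypothetical data: [hours studied, assignments completed]
X = np.array([[1, 0], [2, 1], [3, 1],    # failed
              [6, 4], [7, 5], [8, 5]])   # passed
y = np.array([0, 0, 0, 1, 1, 1])         # 0 = fail, 1 = pass

clf = SVC(kernel='linear')   # linear SVM: finds the widest-margin line
clf.fit(X, y)

print("Support vectors:\n", clf.support_vectors_)  # points closest to the boundary
print("Prediction for 5 hrs, 3 assignments:", clf.predict([[5, 3]]))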
Important Terms
Term             Meaning
Hyperplane       The dividing boundary – a line in 2D, a plane in 3D, and so on
Margin           The distance from the hyperplane to the closest data points
Support Vectors  The data points closest to the hyperplane
Kernel           A trick for handling non-linear (curved) boundaries
Types of SVM:
Type             Description
Linear SVM       Used when data is linearly separable
Non-Linear SVM   Uses the kernel trick for curved boundaries
Advantages of SVM
Works well when there is a clear margin of separation between classes.
Effective in high-dimensional spaces.
Memory efficient – uses only support vectors.
Can handle non-linear data using kernel tricks.
Good for both classification and regression tasks.
Disadvantages of SVM
Slow training on large datasets.
Hard to choose the right kernel sometimes.
Performance drops if there’s too much noise in the data.
Not suitable for very large or real-time problems.
Less effective when classes overlap a lot.
Why Maximize the Margin?
Better Generalization: A large margin means the model will perform better on new/unseen data.
More Robust: Less sensitive to small changes or noise in the data.
The goal is to draw a line (in 2D), a plane (in 3D), or a hyperplane (in higher dimensions) so that each class lies on a different side of this dividing boundary, as the sketch below shows.
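In 2D that boundary is just the line w1*x1 + w2*x2 + b = 0. A small sketch (the toy data is assumed for illustration) showing how to read the line's coefficients from a fitted linear SVC in scikit-learn:
import numpy as np
from sklearn.svm import SVC

# Toy 2D data, one class in each corner (made up for illustration)
X = np.array([[1, 1], [2, 1], [1, 2], [5, 5], [6, 5], [5, 6]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel='linear').fit(X, y)
w = clf.coef_[0]        # normal vector of the separating line
b = clf.intercept_[0]   # offset
# The learned boundary is: w[0]*x1 + w[1]*x2 + b = 0
print(f"Boundary: {w[0]:.2f}*x1 + {w[1]:.2f}*x2 + {b:.2f} = 0")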
Real-Life Example: Apple vs Banana Sorting
Imagine you're working with a fruit sorting machine:
It takes features like weight and color of the fruit.
You want to classify:
o Apple
o Banana
You collect data, and plot each fruit on a graph:
X-axis: color score
Y-axis: weight
Now, you can draw a line that separates:
Apples on one side
Bananas on the other side
That line is your separating hyperplane (in 2D it's just a line).
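A sketch of this fruit example in scikit-learn (the color scores and weights below are invented for illustration):
import numpy as np
from sklearn.svm import SVC

# Invented fruit data: [color score, weight in grams]
X = np.array([[0.8, 150], [0.9, 170], [0.7, 160],    # apples
              [0.2, 120], [0.3, 110], [0.1, 130]])   # bananas
y = np.array(['apple', 'apple', 'apple', 'banana', 'banana', 'banana'])

clf = SVC(kernel='linear').fit(X, y)
print(clf.predict([[0.75, 155]]))   # lands on the apple side of the line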
Term                   Meaning
Classification         Assigning data into different categories (like fruit type)
Hyperplane             A fancy word for the "dividing boundary" – a line in 2D, a plane in 3D, etc.
Separating Hyperplane  A line or surface that splits the classes perfectly
Summary
Classification means dividing things into different groups.
A separating hyperplane is a line or surface that separates two classes.
For 2D data it is a line, in 3D a plane, and with more features a hyperplane.
It is used to predict which class new data will fall into.
Algorithms like SVM work on this very concept.
Conclusion
A separating hyperplane is a powerful tool for classification.
If the data is clearly separable, it gives accurate predictions.
It is very useful in real-life examples like fruit sorting or email spam filtering.
Imagine boys and girls standing in two separate groups on a playground.
You stretch a rope between the two groups so that the children on both sides stay far away from it, with only one boy and one girl standing right next to the line.
That rope is the Maximal Margin Line.
The children closest to the line are the "Support Vectors".
This time, the boys and girls are standing crowded into each other's groups (overlapping).
You cannot stretch a perfect rope.
So you stretch the rope a little flexibly, allowing some error.
That flexible line is what the Support Vector Classifier finds.
Important Points
It only works when the data is perfectly separable (no overlap).
It is the base concept behind SVM (Support Vector Machine).
Its goal is to create the maximum safety zone (margin) between the two classes.
Advantages:
Simple, fast, and very accurate for clean data
Provides the best boundary for generalization
Disadvantages:
Real-world data is rarely perfect, so this classifier can fail when the data overlaps.
In that case we use: Soft Margin SVM or Kernel SVM
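How much error the "flexible rope" tolerates is controlled by the C parameter in scikit-learn's SVC; a sketch on made-up overlapping points (smaller C = wider, more forgiving margin):
import numpy as np
from sklearn.svm import SVC

# Made-up overlapping data: no line can separate it perfectly
X = np.array([[1, 2], [2, 1], [2, 3], [3, 2],
              [3, 3], [4, 3], [4, 5], [5, 4]])
y = np.array([0, 0, 0, 1, 0, 1, 1, 1])

for C in (0.1, 1, 100):
    clf = SVC(kernel='linear', C=C).fit(X, y)
    # Smaller C tolerates more misclassified points for a wider margin
    print(f"C={C}: {clf.n_support_.sum()} support vectors")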
The Maximal Margin Classifier draws a line that keeps the maximum possible distance between the two classes, without making any mistakes. It only works when the classes are clearly separable. SVM is based on this very concept.
Point            Meaning
Soft Margin      Allows a little error
Support Vectors  The points closest to the boundary
Hyperplane       The boundary line or plane that separates classes
Advantages:
Works well even if data is not perfectly separable
Focuses only on important data points (support vectors)
Robust and accurate for practical use
Disadvantages:
A little complex to understand
Slow for large datasets
Needs parameter tuning (like C, the error penalty)
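Tuning C is usually done with a cross-validated search; a minimal sketch using GridSearchCV on the Iris dataset (the dataset and the grid of C values are assumptions for illustration):
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Try a few values of the error penalty C with 5-fold cross-validation
grid = GridSearchCV(SVC(kernel='linear'), {'C': [0.01, 0.1, 1, 10, 100]}, cv=5)
grid.fit(X, y)
print("Best C:", grid.best_params_['C'])
print("Best CV accuracy:", grid.best_score_)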
Conclusion
The Support Vector Classifier is a smart algorithm that draws a flexible boundary on real-world data, allowing a small amount of error. It is the upgraded version of the Maximal Margin Classifier and is more useful for real problems.
Its main goal is to divide the data points into their different categories/classes in the best possible way.
Margin
The margin is the distance from the hyperplane to the closest data points (the support vectors).
SVM's goal is to maximize the margin – the larger the margin, the more accurate the classification.
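For a linear SVM the margin width can be read off the learned weights as 2/||w||; a sketch on toy data (assumed for illustration):
import numpy as np
from sklearn.svm import SVC

# Toy separable data (made up)
X = np.array([[1, 1], [2, 2], [5, 5], [6, 6]])
y = np.array([0, 0, 1, 1])

clf = SVC(kernel='linear').fit(X, y)
w = clf.coef_[0]
margin = 2 / np.linalg.norm(w)   # distance between the two margin lines
print("Margin width:", margin)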
2. Disease Prediction
Input: symptom data
Output: disease present or not
SVM classifies, based on the symptoms, whether a patient is infected or healthy.
3. Customer Segmentation
Input: customers' purchase behaviour
Output: high spender or low spender
The SVM algorithm divides these customers into separate groups.
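As a sketch of the disease-prediction use case above (all numbers invented for illustration), the same fit/predict pattern applies:
import numpy as np
from sklearn.svm import SVC

# Invented symptom data: [body temperature (deg C), cough severity 0-10]
X = np.array([[36.6, 1], [36.8, 0], [37.0, 2],    # healthy
              [38.5, 7], [39.0, 8], [38.8, 6]])   # infected
y = np.array([0, 0, 0, 1, 1, 1])                  # 0 = healthy, 1 = infected

clf = SVC(kernel='linear').fit(X, y)
print(clf.predict([[38.7, 7]]))   # expected: [1] -> infected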
Why Use SVM?
Advantage                     Description
Accurate                      Maximizes the margin, so it gives more correct predictions.
Works in High Dimensions      Performs well even when there are many features (like 100s).
Effective for Classification  Especially when the classes are clearly separated.
A Support Vector Machine is a smart classifier that finds the best boundary to separate different classes by maximizing the margin between them.
It is widely used in the real world wherever accurate and fast classification is needed – such as face recognition, fraud detection, email filtering, etc.
When the data is not linearly separable (e.g. round/circular shapes), we use kernel='rbf' or kernel='poly'.
Using the kernel trick, SVM takes the data to a higher dimension where it becomes linearly separable.
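A sketch of this on scikit-learn's make_circles data, where one class forms a ring around the other so no straight line can separate them:
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Circular data: inner circle = one class, outer ring = the other
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

linear = SVC(kernel='linear').fit(X_train, y_train)
rbf = SVC(kernel='rbf').fit(X_train, y_train)

print("Linear kernel accuracy:", linear.score(X_test, y_test))   # poor
print("RBF kernel accuracy:", rbf.score(X_test, y_test))         # near perfect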
One-Versus-One (OvO) Classification
Concept:
If there are more than two classes (e.g. Class A, B, C), one SVM classifier is trained between every pair of classes.
Each classifier differentiates only 2 classes.
Example:
Suppose you have 3 classes: Apple, Banana, Grapes
These classifiers will be built:
Classifier 1: Apple vs Banana
Classifier 2: Apple vs Grapes
Classifier 3: Banana vs Grapes
One-Versus-Rest (OvR) Classification
Concept:
Here, one SVM is trained for each class against all the remaining classes.
Example:
3 classes again: Apple, Banana, Grapes
These classifiers will be built:
Classifier 1: Apple vs (Banana + Grapes)
Classifier 2: Banana vs (Apple + Grapes)
Classifier 3: Grapes vs (Apple + Banana)
The classifier that is most confident (highest score) gives the final prediction.
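To see how many binary classifiers each strategy actually builds, you can count the fitted sub-estimators; a sketch on the 3-class Iris dataset (an assumed example):
from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)   # 3 classes

ovo = OneVsOneClassifier(SVC(kernel='linear')).fit(X, y)
ovr = OneVsRestClassifier(SVC(kernel='linear')).fit(X, y)

print("OvO classifiers:", len(ovo.estimators_))   # 3*(3-1)/2 = 3 pairwise models
print("OvR classifiers:", len(ovr.estimators_))   # one per class = 3 models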
OvO = "Each pair fights, voting picks the winner."
OvR = "Each class fights all others, most confident one wins."
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import SVC

# Assumed setup: the 3-class Iris dataset with a 70/30 train-test split
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# One-Versus-One (OvO)
ovo_clf = OneVsOneClassifier(SVC(kernel='linear'))
ovo_clf.fit(X_train, y_train)
print("OvO Score:", ovo_clf.score(X_test, y_test))

# One-Versus-Rest (OvR)
ovr_clf = OneVsRestClassifier(SVC(kernel='linear'))
ovr_clf.fit(X_train, y_train)
print("OvR Score:", ovr_clf.score(X_test, y_test))
OUTPUT :-
OvO Score: 0.9555555555555556
OvR Score: 0.9333333333333333