ML _ Bias vs Variance - GeeksforGeeks
ML | Bias Vs Variance
Difficulty Level : Easy ● Last Updated : 20 Jul, 2021
In this article, we will learn what bias and variance are for a machine learning model.
There are various ways to evaluate a machine-learning model. We can use MSE (Mean
Squared Error) for regression; precision, recall and ROC (Receiver Operating Characteristic)
for a classification problem, along with absolute error. In a similar way, bias and
variance help us in parameter tuning and in deciding on the better-fitted model among the
several built.
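As a quick reminder of what these metrics compute, here is a minimal sketch in plain Python. The function names are our own, chosen for illustration, and the labels are assumed to be binary with 1 as the positive class.

```python
def mean_squared_error(y_true, y_pred):
    """Mean of squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall(y_true, y_pred):
    """Precision and recall for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


print(mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
print(precision_recall([1, 0, 1, 1], [1, 1, 0, 1]))
```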
Bias is one type of error that occurs due to wrong assumptions about the data, such as
assuming the data is linear when in reality it follows a complex function. Variance, on
the other hand, is introduced by high sensitivity to variations in the training data. This
is also a type of error, since we want our model to be robust against noise.
Before coming to the mathematical definitions, we need to know about random variables
and functions. Let's say f(x) is the function that our given data follows. We will build
a few models, which can be denoted as f̂(x). Each point on such a function is a random
variable whose number of values equals the number of models. To correctly
Bias :
Variance :
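The formulas for these two quantities are not shown in the text above. For reference, the standard pointwise definitions are sketched below. The notation is our assumption: \hat{f}(x) is a model fitted on one random training set, and the expectation is taken over such models.

```latex
\begin{align}
\mathrm{Bias}\big[\hat{f}(x)\big] &= \mathbb{E}\big[\hat{f}(x)\big] - f(x) \\
\mathrm{Var}\big[\hat{f}(x)\big]  &= \mathbb{E}\Big[\big(\hat{f}(x) - \mathbb{E}\big[\hat{f}(x)\big]\big)^{2}\Big]
\end{align}
```

In practice the expectation is approximated by averaging over the models we actually build.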
Let's see some visuals of the importance both of these terms hold.
These images are self-explanatory. Still, we'll talk about the things to be noted. When
bias is high, the focal point of the group of predicted functions lies far from the true
function. Whereas, when variance is high, the functions in the group of predicted ones
differ much from one another.
Let's take an example in the context of machine learning. The data taken here follows
a quadratic function of x, y = x^2. In real scenarios, data contains noisy information
instead of correct values. Therefore, we have a noisy version of y, y_noisy, as well:
x     y     y_noisy
-5    25    2.67595670e+01
-4    16    1.46802434e+01
-3    9     1.05460668e+01
-2    4     3.25487498e+00
-1    1     1.73960283e+00
0     0     1.64552536e+00
1     1     4.46816845e-01
2     4     1.54342469e+00
3     9     9.37684917e+00
4     16    1.79133242e+01
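A table like this can be produced with a few lines of NumPy. This is a minimal sketch, not the code used for the table: the Gaussian noise model, its scale of 1.5 and the seed are assumptions, so the y_noisy values will differ from those shown.

```python
import numpy as np

rng = np.random.default_rng(seed=0)    # fixed seed so the run is repeatable

x = np.arange(-5, 5)                   # -5, -4, ..., 4, as in the table
y = x ** 2                             # noise-free target
noise = rng.normal(loc=0.0, scale=1.5, size=x.shape)  # assumed noise model
y_noisy = y + noise

print("x     y     y_noisy")
for xi, yi, yn in zip(x, y, y_noisy):
    print(f"{xi:<5d} {yi:<5d} {yn:.8e}")
```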
Data Visualization
Now that we have a regression problem, let's try fitting several polynomial models of
different degrees. If you broaden your view beyond this toy problem, you will face
situations where you don't know the data distribution beforehand. If you choose a model
with a lower degree, you might not capture the behavior of the data (suppose the data is
far from a linear fit). If you choose a higher degree, you may be fitting noise instead of
data. A lower-degree model will give a high error in any case, but a higher-degree model
is not necessarily correct just because its error is low. So, what should we do? We can
either use the visualization method or look for a better setting with bias and variance.
(Data scientists use only a portion of the data to train the model and then use the rest
to check how well it generalizes.)
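To make the train-on-a-portion idea concrete, here is a small sketch that fits polynomials of a few degrees on one part of the data and measures the error on both parts. The data set, noise scale, split size and the helper name holdout_errors are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Synthetic data: quadratic signal plus Gaussian noise (assumed setup).
x = np.linspace(-5, 5, 40)
y_noisy = x ** 2 + rng.normal(scale=1.5, size=x.shape)

# Hold out a quarter of the points; train on the rest.
order = rng.permutation(x.size)
test_idx, train_idx = order[:10], order[10:]


def holdout_errors(degree):
    """Fit a polynomial on the training part; return (train MSE, test MSE)."""
    coeffs = np.polyfit(x[train_idx], y_noisy[train_idx], deg=degree)
    squared_err = (np.polyval(coeffs, x) - y_noisy) ** 2
    return float(squared_err[train_idx].mean()), float(squared_err[test_idx].mean())


for degree in (1, 2, 7):
    train_mse, test_mse = holdout_errors(degree)
    print(f"degree {degree}: train MSE {train_mse:.2f}, test MSE {test_mse:.2f}")
```

The training error can only go down as the degree grows, so it is the error on the held-out part that tells us whether the extra flexibility is fitting data or noise.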
Now, if we plot an ensemble of models to calculate bias and variance for each polynomial
model:
As we can see, in the linear model every line is very close to the others but far away
from the actual data. On the other hand, the higher-degree polynomial curves follow the
data closely but differ greatly among themselves. Therefore, bias is high in the linear
model and variance is high in the higher-degree polynomials. This is reflected in the
calculated quantities as well.
Linear Model:-
Bias : 6.3981120643436356
Variance : 0.09606406047494431
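One way to compute such quantities is sketched below: fit many models of the same degree, each on a fresh noisy copy of the data, then compare their average with the true function (bias) and measure their spread (variance). The number of models, the noise scale, the use of mean absolute bias and the helper name bias_variance are assumptions, not the exact procedure behind the numbers above.

```python
import numpy as np

rng = np.random.default_rng(seed=2)

x = np.arange(-5, 5).astype(float)     # same grid as the toy table
f_true = x ** 2                        # noise-free function


def bias_variance(degree, n_models=200, noise_scale=1.5):
    """Fit n_models polynomials, each on a fresh noisy copy of the data.

    Returns (mean absolute bias, mean variance), both averaged over x.
    """
    preds = np.empty((n_models, x.size))
    for m in range(n_models):
        y_noisy = f_true + rng.normal(scale=noise_scale, size=x.size)
        coeffs = np.polyfit(x, y_noisy, deg=degree)
        preds[m] = np.polyval(coeffs, x)
    mean_pred = preds.mean(axis=0)              # estimate of E[f_hat(x)]
    bias = np.abs(mean_pred - f_true).mean()    # distance from the truth
    variance = preds.var(axis=0).mean()         # spread between models
    return float(bias), float(variance)


for degree in (1, 2, 6):
    b, v = bias_variance(degree)
    print(f"degree {degree}: bias {b:.3f}, variance {v:.3f}")
```

With this setup the linear model's bias should land near 6.4, close to the figure reported above. The variance depends on the assumed noise scale, so it need not match.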
After this task, we can conclude that simple models tend to have high bias while complex
models have high variance. We can detect under-fitting or over-fitting with these
characteristics.
Again coming to the mathematical part: how are bias and variance related to the
empirical error (MSE, which is not the true error because of the noise added to the data)
between the target value and the predicted value?
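The formula answering this question is not shown here. The standard bias-variance decomposition is sketched below under the following assumptions: the observed target is y = f(x) + \varepsilon, the noise \varepsilon has zero mean and variance \sigma^2, and it is independent of the fitted model \hat{f}(x).

```latex
\begin{equation}
\mathbb{E}\Big[\big(y - \hat{f}(x)\big)^{2}\Big]
  = \underbrace{\Big(\mathbb{E}\big[\hat{f}(x)\big] - f(x)\Big)^{2}}_{\text{bias}^{2}}
  + \underbrace{\mathbb{E}\Big[\big(\hat{f}(x) - \mathbb{E}\big[\hat{f}(x)\big]\big)^{2}\Big]}_{\text{variance}}
  + \underbrace{\sigma^{2}}_{\text{noise}}
\end{equation}
```

The last term does not depend on the model, which is why the empirical MSE cannot be pushed down to the true error no matter how the model is tuned.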
Now, we reach the conclusion phase. The important thing to remember is that bias and
variance trade off against each other, and in order to minimize the error we need to
reduce both. This means that we want our model's predictions to be close to the data
(low bias) while ensuring that the predicted points don't vary much with changing noise
(low variance).
Article Contributed By :
krin99