ML _ Bias vs Variance - GeeksforGeeks

Uploaded by

Kuzivakwashe
Copyright
© All Rights Reserved

ML | Bias Vs Variance
Difficulty Level : Easy ● Last Updated : 20 Jul, 2021


In this article, we will learn what bias and variance are for a machine learning model and what their optimal state should be.

There are various ways to evaluate a machine learning model. We can use MSE (Mean Squared Error) or absolute error for regression, and precision, recall, and ROC (Receiver Operating Characteristic) for classification. In a similar way, bias and variance help us tune parameters and decide which of several candidate models fits best.

Bias is a type of error that arises from wrong assumptions about the data, such as assuming the data is linear when in reality it follows a more complex function. Variance, on the other hand, is introduced by high sensitivity to variations in the training data. It is also a type of error, since we want our model to be robust against noise.

Before coming to the mathematical definitions, we need to know about random variables and functions. Let's say $f(x)$ is the function that our data follows. We will build a few models, each of which can be denoted as $\hat{f}(x)$. Each point on this function is a random variable with as many values as there are models. To correctly approximate the true function $f(x)$, we take the expected value of $\hat{f}(x)$, written $E[\hat{f}(x)]$. In this notation, the usual definitions are:

Bias: $E[\hat{f}(x)] - f(x)$

Variance: $E\big[(\hat{f}(x) - E[\hat{f}(x)])^2\big]$
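As a minimal sketch of these two quantities, assume the usual definitions: bias is the average prediction across models minus the true function value, and variance is the spread of the predictions around their own average. The function name `bias_and_variance` and the toy numbers below are illustrative assumptions, not part of the original article.

```python
import numpy as np

def bias_and_variance(predictions, true_values):
    """Point-wise bias and variance of an ensemble of models.

    predictions: array of shape (n_models, n_points), one row per model
    true_values: array of shape (n_points,), the true function f(x)
    """
    predictions = np.asarray(predictions, dtype=float)
    true_values = np.asarray(true_values, dtype=float)
    # Average over models approximates the expected prediction at each point.
    mean_prediction = predictions.mean(axis=0)
    # Bias: how far the average prediction is from the true function.
    bias = mean_prediction - true_values
    # Variance: how much the individual models scatter around their average.
    variance = predictions.var(axis=0)
    return bias, variance

# Three hypothetical models evaluated at two points.
preds = [[1.0, 2.0], [3.0, 2.0], [2.0, 2.0]]
truth = [2.0, 1.0]
bias, variance = bias_and_variance(preds, truth)
print("bias:", bias)
print("variance:", variance)
```

At the first point the models disagree but are right on average (zero bias, non-zero variance); at the second point they agree perfectly but are all wrong (non-zero bias, zero variance).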

Let's see some visuals of why both of these terms matter.

These images are self-explanatory. Still, a few points are worth noting. When bias is high, the center of the group of predicted functions lies far from the true function. When variance is high, the functions in the predicted group differ widely from one another.

Let's take an example in the context of machine learning. The data here follows a quadratic function of the feature (x), which is used to predict the target column (y_noisy). In real-life scenarios, data contains noise instead of exact values. Therefore, we have added Gaussian noise with mean 0 and variance 1 to the quadratic function values.


x       y       y_noisy
-5.0    25.00   2.67595670e+01
-4.5    20.25   2.11632561e+01
-4.0    16.00   1.46802434e+01
-3.5    12.25   1.31647290e+01
-3.0    9.00    1.05460668e+01
-2.5    6.25    5.95794282e+00
-2.0    4.00    3.25487498e+00
-1.5    2.25    1.97478968e+00
-1.0    1.00    1.73960283e+00
-0.5    0.25    -1.13112086e-02
0.0     0.00    1.64552536e+00
0.5     0.25    -9.60938656e-01
1.0     1.00    4.46816845e-01
1.5     2.25    4.01016081e+00
2.0     4.00    1.54342469e+00
2.5     6.25    7.27654456e+00
3.0     9.00    9.37684917e+00
3.5     12.25   1.32076198e+01
4.0     16.00   1.79133242e+01
4.5     20.25   2.08601281e+01
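A dataset with the same structure as the table can be generated with NumPy. This is a sketch under assumptions: the seed is arbitrary, so the noisy values will not match the table digit for digit, and the variable names simply mirror the column headers.

```python
import numpy as np

# Arbitrary seed (an assumption); the article does not state how its noise was drawn.
rng = np.random.default_rng(0)

# Same grid as the table: -5.0, -4.5, ..., 4.5 (20 points).
x = np.arange(-5, 5, 0.5)

# Noise-free quadratic target.
y = x ** 2

# Add Gaussian noise with mean 0 and variance 1 (standard deviation 1).
y_noisy = y + rng.normal(loc=0.0, scale=1.0, size=x.shape)

for xi, yi, yni in zip(x, y, y_noisy):
    print(f"{xi:5.1f}  {yi:6.2f}  {yni:10.6f}")
```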

Data Visualization

Now that we have a regression problem, let's try fitting several polynomial models of different orders. The results presented here are for degrees 1, 2, and 10.
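One way to fit these three models is a least-squares polynomial fit. The sketch below uses `numpy.polynomial.Polynomial.fit`; the article does not say which library it used, so this choice, the seed, and the regenerated data are assumptions.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Regenerate data of the same form as in the article (arbitrary seed).
rng = np.random.default_rng(0)
x = np.arange(-5, 5, 0.5)
y_noisy = x ** 2 + rng.normal(0.0, 1.0, size=x.shape)

train_mse = {}
for degree in (1, 2, 10):
    # Least-squares fit of a polynomial of the given degree.
    model = Polynomial.fit(x, y_noisy, deg=degree)
    residuals = y_noisy - model(x)
    # Mean squared error on the same points the model was fitted to.
    train_mse[degree] = float(np.mean(residuals ** 2))
    print(f"degree={degree:2d}  training MSE={train_mse[degree]:.4f}")
```

Because each higher-degree model contains the lower-degree ones as special cases, the training error can only go down as the degree grows. That is exactly why a low training error alone does not tell us the degree-10 model is the right one.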


In this case, we already know that the correct model has degree 2. But as soon as you move beyond a toy problem, you will face situations where you don't know the data distribution beforehand. If you choose a model with a lower degree, you might not capture the behavior of the data (for example, when the data is far from a linear fit). If you choose a higher degree, you may be fitting noise instead of data. A lower-degree model will give you a high error in any case, but a higher-degree model can still be incorrect even though its error is low. So, what should we do? We can either use the visualization method or look for a better setting using bias and variance. (Data scientists train the model on only a portion of the data and then use the remainder to check how well it generalizes.)
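The hold-out idea in the parenthesis can be sketched as follows. The 14/6 split, the seed, and the variable names are assumptions chosen for illustration; with only 20 points the held-out error is itself noisy, so treat the printed numbers as indicative only.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)  # arbitrary seed
x = np.arange(-5, 5, 0.5)
y_noisy = x ** 2 + rng.normal(0.0, 1.0, size=x.shape)

# Shuffle the indices and keep 14 points for training, 6 for checking.
indices = rng.permutation(x.size)
train_idx, test_idx = indices[:14], indices[14:]

test_mse = {}
for degree in (1, 2, 10):
    # Fit on the training portion only.
    model = Polynomial.fit(x[train_idx], y_noisy[train_idx], deg=degree)
    # Evaluate on points the model has never seen.
    errors = y_noisy[test_idx] - model(x[test_idx])
    test_mse[degree] = float(np.mean(errors ** 2))
    print(f"degree={degree:2d}  held-out MSE={test_mse[degree]:.4f}")
```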

Now, let's plot an ensemble of models to calculate bias and variance for each polynomial model:

As we can see, in the linear model every line is very close to the others but far away from the actual data. The higher-degree polynomial curves, on the other hand, follow the data closely but differ considerably from one another. Therefore, bias is high in the linear model and variance is high in the higher-degree polynomial. This is reflected in the calculated quantities as well.

Linear Model:
Bias : 6.3981120643436356
Variance : 0.09606406047494431

Higher Degree Polynomial Model:
Bias : 0.31310660249287225
Variance : 0.565414017195101
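The article does not show how these figures were computed, so the following is only one plausible sketch. It assumes an ensemble built by redrawing the noise many times, a degree-10 polynomial as the "higher degree" model, and a summary that averages the absolute point-wise bias and the point-wise variance over the grid. The helper `ensemble_predictions`, the ensemble size, and the seed are all hypothetical.

```python
import numpy as np
from numpy.polynomial import Polynomial

x = np.arange(-5, 5, 0.5)
f_true = x ** 2  # the noise-free function the data follows

def ensemble_predictions(degree, n_models, rng):
    """Fit n_models polynomials, each on a fresh noisy copy of the data."""
    preds = np.empty((n_models, x.size))
    for i in range(n_models):
        y_noisy = f_true + rng.normal(0.0, 1.0, size=x.shape)
        preds[i] = Polynomial.fit(x, y_noisy, deg=degree)(x)
    return preds

def summarize(preds):
    """Average |bias| and average variance over all grid points."""
    bias = preds.mean(axis=0) - f_true
    variance = preds.var(axis=0)
    return float(np.mean(np.abs(bias))), float(np.mean(variance))

rng = np.random.default_rng(0)  # arbitrary seed
bias_lin, var_lin = summarize(ensemble_predictions(1, 200, rng))
bias_hi, var_hi = summarize(ensemble_predictions(10, 200, rng))

print(f"Linear model:     bias={bias_lin:.4f}  variance={var_lin:.4f}")
print(f"Degree-10 model:  bias={bias_hi:.4f}  variance={var_hi:.4f}")
```

The exact values depend on the seed and the ensemble size, but the ordering should match the text: the linear model has the larger bias and the higher-degree model has the larger variance.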

After this task, we can conclude that simple models tend to have high bias while complex models tend to have high variance. We can detect under-fitting or over-fitting from these characteristics.

Coming back to the mathematical part: how are bias and variance related to the empirical error between the target value and the predicted value? (This empirical error is the MSE, which is not the true error because of the noise added to the data.)
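The equations for this step are not preserved in this copy, so what follows is the standard textbook decomposition rather than the author's exact derivation. It assumes the target is $y = f(x) + \epsilon$ with $E[\epsilon] = 0$ and $\mathrm{Var}(\epsilon) = \sigma^2$, that the noise in the target is independent of the fitted model $\hat{f}(x)$, and that bias means $E[\hat{f}(x)] - f(x)$.

```latex
\begin{align*}
E\big[(y - \hat{f}(x))^2\big]
  &= E\big[(f(x) + \epsilon - \hat{f}(x))^2\big] \\
  &= E\big[(f(x) - \hat{f}(x))^2\big]
     + 2\,E[\epsilon]\,E\big[f(x) - \hat{f}(x)\big]
     + E[\epsilon^2] \\
  &= E\big[(f(x) - \hat{f}(x))^2\big] + \sigma^2 \\[4pt]
E\big[(f(x) - \hat{f}(x))^2\big]
  &= \big(E[\hat{f}(x)] - f(x)\big)^2
     + E\big[(\hat{f}(x) - E[\hat{f}(x)])^2\big] \\[4pt]
E\big[(y - \hat{f}(x))^2\big]
  &= \text{Bias}^2 + \text{Variance} + \sigma^2
\end{align*}
```

The cross term vanishes because the noise has zero mean and is independent of the model. The $\sigma^2$ term is the part of the empirical error that no model can remove.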

Now, let's calculate another quantity:



Now we reach the conclusion. The important thing to remember is that bias and variance trade off against each other, and to minimize error we need to keep both low. This means we want our model's predictions to be close to the data (low bias) while ensuring that the predicted points don't vary much as the noise changes (low variance).
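The link between error, bias, and variance can be checked numerically. The sketch below assumes the standard decomposition (expected squared error = bias squared + variance + noise variance) and a degree-2 model. The number of trials, the seed, and all variable names are assumptions for illustration.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)  # arbitrary seed
x = np.arange(-5, 5, 0.5)
f_true = x ** 2
sigma2 = 1.0      # variance of the added Gaussian noise
n_trials = 2000   # number of independently trained models

preds = np.empty((n_trials, x.size))
squared_errors = np.empty((n_trials, x.size))
for i in range(n_trials):
    # Train on one noisy realisation of the data ...
    y_train = f_true + rng.normal(0.0, 1.0, size=x.shape)
    preds[i] = Polynomial.fit(x, y_train, deg=2)(x)
    # ... and score against a fresh, independent noisy target.
    y_test = f_true + rng.normal(0.0, 1.0, size=x.shape)
    squared_errors[i] = (y_test - preds[i]) ** 2

mse = squared_errors.mean()
bias_sq = np.mean((preds.mean(axis=0) - f_true) ** 2)
variance = np.mean(preds.var(axis=0))

print(f"empirical MSE            = {mse:.4f}")
print(f"bias^2 + variance + s^2  = {bias_sq + variance + sigma2:.4f}")
```

The two printed numbers should agree up to sampling error, which shows that lowering the error means lowering bias and variance together, since the noise term is fixed.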


Article Contributed By: krin99 (@krin99)
