Python statistics | variance()
Last Updated :
29 Jul, 2024
Statistics module provides very powerful tools, which can be used to compute anything related to Statistics. variance() is one such function. This function helps to calculate the variance from a sample of data (sample is a subset of populated data).
variance() function should only be used when variance of a sample needs to be calculated. There's another function known as pvariance(), which is used to calculate the variance of an entire population.
In pure statistics, variance is the squared deviation of a variable from its mean. Basically, it measures the spread of random data in a set from its mean or median value. A low value for variance indicates that the data are clustered together and are not spread apart widely, whereas a high value would indicate that the data in the given set are much more spread apart from the average value.
Variance is an important tool in the sciences, where statistical analysis of data is common. It is the square of standard deviation of the given data-set and is also known as second central moment of a distribution. It is usually represented by s^{2}, \sigma ^{2}, \operatorname {Var} (X) in pure Statistics.
Variance is calculated by the following formula :
It's calculated by mean of square minus square of mean
\operatorname {Var} (X)=\operatorname {E} \left[(X-\mu )^{2}\right]
Syntax : variance( [data], xbar )
Parameters :
[data] : An iterable with real valued numbers.
xbar (Optional) : Takes actual mean of data-set as value.
Returntype : Returns the actual variance of the values passed as parameter.
Exceptions :
StatisticsError is raised for data-set less than 2-values passed as parameter.
Throws impossible values when the value provided as xbar doesn't match actual mean of the data-set.
Code #1 :
Python3
# Python code to demonstrate the working of
# variance() function of Statistics Module
# Importing Statistics module
import statistics
# Creating a sample of data
sample = [2.74, 1.23, 2.63, 2.22, 3, 1.98]
# Prints variance of the sample set
# Function will automatically calculate
# it's mean and set it as xbar
print("Variance of sample set is % s"
%(statistics.variance(sample)))
Output :
Variance of sample set is 0.40924
Code #2 : Demonstrates variance() on a range of data-types
Python3
# Python code to demonstrate variance()
# function on varying range of data-types
# importing statistics module
from statistics import variance
# importing fractions as parameter values
from fractions import Fraction as fr
# tuple of a set of positive integers
# numbers are spread apart but not very much
sample1 = (1, 2, 5, 4, 8, 9, 12)
# tuple of a set of negative integers
sample2 = (-2, -4, -3, -1, -5, -6)
# tuple of a set of positive and negative numbers
# data-points are spread apart considerably
sample3 = (-9, -1, -0, 2, 1, 3, 4, 19)
# tuple of a set of fractional numbers
sample4 = (fr(1, 2), fr(2, 3), fr(3, 4),
fr(5, 6), fr(7, 8))
# tuple of a set of floating point values
sample5 = (1.23, 1.45, 2.1, 2.2, 1.9)
# Print the variance of each samples
print("Variance of Sample1 is % s " %(variance(sample1)))
print("Variance of Sample2 is % s " %(variance(sample2)))
print("Variance of Sample3 is % s " %(variance(sample3)))
print("Variance of Sample4 is % s " %(variance(sample4)))
print("Variance of Sample5 is % s " %(variance(sample5)))
Output :
Variance of Sample 1 is 15.80952380952381
Variance of Sample 2 is 3.5
Variance of Sample 3 is 61.125
Variance of Sample 4 is 1/45
Variance of Sample 5 is 0.17613000000000006
Code #3 : Demonstrates the use of xbar parameter
Python3
# Python code to demonstrate
# the use of xbar parameter
# Importing statistics module
import statistics
# creating a sample list
sample = (1, 1.3, 1.2, 1.9, 2.5, 2.2)
# calculating the mean of sample set
m = statistics.mean(sample)
# calculating the variance of sample set
print("Variance of Sample set is % s"
%(statistics.variance(sample, xbar = m)))
Output :
Variance of Sample set is 0.3656666666666667
Code #4 : Demonstrates the Error when value of xbar is not same as the mean/average value
Python3
# Python code to demonstrate the error caused
# when garbage value of xbar is entered
# Importing statistics module
import statistics
# creating a sample list
sample = (1, 1.3, 1.2, 1.9, 2.5, 2.2)
# calculating the mean of sample set
m = statistics.mean(sample)
# Actual value of mean after calculation
# comes out to 1.6833333333333333
# But to demonstrate xbar error let's enter
# -100 as the value for xbar parameter
print(statistics.variance(sample, xbar = -100))
Output :
0.3656666666663053
Note : It is different in precision from the output in Code #3
Code #4 : Demonstrates StatisticsError
Python3
# Python code to demonstrate StatisticsError
# importing Statistics module
import statistics
# creating an empty data-srt
sample = []
# will raise Statistics Error
print(statistics.variance(sample))
Output :
Traceback (most recent call last):
File "/home/64bf6d80f158b65d2b75c894d03a7779.py", line 10, in
print(statistics.variance(sample))
File "/usr/lib/python3.5/statistics.py", line 555, in variance
raise StatisticsError('variance requires at least two data points')
statistics.StatisticsError: variance requires at least two data points
Applications :
Variance is a very important tool in Statistics and handling huge amounts of data. Like, when the omniscient mean is unknown (sample mean) then variance is used as biased estimator. Real world observations like the value of increase and decrease of all shares of a company throughout the day cannot be all sets of possible observations. As such, variance is calculated from a finite set of data, although it won't match when calculated taking the whole population into consideration, but still it will give the user an estimate which is enough to chalk out other calculations.
Similar Reads
Python Tutorial | Learn Python Programming Language Python Tutorial â Python is one of the most popular programming languages. Itâs simple to use, packed with features and supported by a wide range of libraries and frameworks. Its clean syntax makes it beginner-friendly.Python is:A high-level language, used in web development, data science, automatio
10 min read
DSA Tutorial - Learn Data Structures and Algorithms DSA (Data Structures and Algorithms) is the study of organizing data efficiently using data structures like arrays, stacks, and trees, paired with step-by-step procedures (or algorithms) to solve problems effectively. Data structures manage how data is stored and accessed, while algorithms focus on
7 min read
Python Interview Questions and Answers Python is the most used language in top companies such as Intel, IBM, NASA, Pixar, Netflix, Facebook, JP Morgan Chase, Spotify and many more because of its simplicity and powerful libraries. To crack their Online Assessment and Interview Rounds as a Python developer, we need to master important Pyth
15+ min read
Quick Sort QuickSort is a sorting algorithm based on the Divide and Conquer that picks an element as a pivot and partitions the given array around the picked pivot by placing the pivot in its correct position in the sorted array. It works on the principle of divide and conquer, breaking down the problem into s
12 min read
Merge Sort - Data Structure and Algorithms Tutorials Merge sort is a popular sorting algorithm known for its efficiency and stability. It follows the divide-and-conquer approach. It works by recursively dividing the input array into two halves, recursively sorting the two halves and finally merging them back together to obtain the sorted array. Merge
14 min read
Bubble Sort Algorithm Bubble Sort is the simplest sorting algorithm that works by repeatedly swapping the adjacent elements if they are in the wrong order. This algorithm is not suitable for large data sets as its average and worst-case time complexity are quite high.We sort the array using multiple passes. After the fir
8 min read
Data Structures Tutorial Data structures are the fundamental building blocks of computer programming. They define how data is organized, stored, and manipulated within a program. Understanding data structures is very important for developing efficient and effective algorithms. What is Data Structure?A data structure is a st
2 min read
Breadth First Search or BFS for a Graph Given a undirected graph represented by an adjacency list adj, where each adj[i] represents the list of vertices connected to vertex i. Perform a Breadth First Search (BFS) traversal starting from vertex 0, visiting vertices from left to right according to the adjacency list, and return a list conta
15+ min read
Python OOPs Concepts Object Oriented Programming is a fundamental concept in Python, empowering developers to build modular, maintainable, and scalable applications. By understanding the core OOP principles (classes, objects, inheritance, encapsulation, polymorphism, and abstraction), programmers can leverage the full p
11 min read
Binary Search Algorithm - Iterative and Recursive Implementation Binary Search Algorithm is a searching algorithm used in a sorted array by repeatedly dividing the search interval in half. The idea of binary search is to use the information that the array is sorted and reduce the time complexity to O(log N). Binary Search AlgorithmConditions to apply Binary Searc
15 min read