04_Measures of Variations

The document discusses various measures of variation in statistics, including absolute and relative measures such as range, quartile deviation, mean deviation, variance, and standard deviation. It explains the concepts of skewness and kurtosis, providing formulas and examples for calculating these measures. Additionally, it covers the five-number summary, boxplots, and characteristics of normal distribution, emphasizing the importance of understanding data dispersion and its implications.

Uploaded by

abdullah.abaid78

Measures of Variation

Measures of Variation/ Dispersion


• In statistics, dispersion (also called variability, scatter, or spread) denotes
how stretched or squeezed a distribution is.
• Variability is the extent to which data points in a statistical distribution or
data set diverge from the average (mean) value, as well as the extent to
which these data points differ from each other.
• The following are the commonly used measures of variability.

  Absolute measures of dispersion     Relative measures of dispersion
  (same units as the data)            (unitless quantities)
  1. Range                            1. Coefficient of Range
  2. Quartile Deviation               2. Coefficient of Quartile Deviation
  3. Mean Deviation                   3. Coefficient of Mean Deviation (M.D.)
  4. Standard Deviation               4. Coefficient of Variation (C.V.)
The Range and Coefficient of Range
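• The Range R is defined as the difference between the largest and the
smallest observations in a dataset, i.e.,

\[ R = X_{\text{largest}} - X_{\text{smallest}} \]

• The Coefficient of Range is defined as

\[ \text{Coeff. of Range} = \frac{X_{\text{largest}} - X_{\text{smallest}}}{X_{\text{largest}} + X_{\text{smallest}}} \]

For example, with a largest value of 50 kg and a smallest value of 45 kg,
Coeff. of Range = (50 kg − 45 kg)/(50 kg + 45 kg) = 5 kg/95 kg ≈ 0.053. The units cancel, so the result is unitless.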
Example

The weights of 9 students are given below:

45, 32, 37, 46, 39, 36, 41, 48, 36

Find the range and the Coefficient of Range.
The maximum observation is 48 and the minimum is 32, therefore
Range = 48 − 32 = 16 (in the units of the data)
Coefficient of Range = (48 − 32)/(48 + 32) = 16/80 = 0.2
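
The calculation above can be checked with a short Python sketch. The function names `value_range` and `coefficient_of_range` are illustrative, not taken from the slides.

```python
# Weights of the 9 students from the example
weights = [45, 32, 37, 46, 39, 36, 41, 48, 36]

def value_range(data):
    """Largest observation minus smallest observation."""
    return max(data) - min(data)

def coefficient_of_range(data):
    """(largest - smallest) / (largest + smallest); a unitless ratio."""
    largest, smallest = max(data), min(data)
    return (largest - smallest) / (largest + smallest)

r = value_range(weights)
cr = coefficient_of_range(weights)
print(r, cr)
```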
Semi Inter Quartile Range / Quartile Deviation

• The interquartile range (IQR) is a measure of dispersion, defined as

\[ IQR = Q_3 - Q_1 \]

• The Semi-Interquartile Range or Quartile Deviation (QD) is defined as

\[ QD = \frac{Q_3 - Q_1}{2} \]

• The Coefficient of Quartile Deviation is defined as

\[ \text{Coeff. of } QD = \frac{Q_3 - Q_1}{Q_3 + Q_1} \]
The Mean Deviation OR Average Deviation

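• The mean deviation (M.D.) of a set of data is defined as the
arithmetic mean of the absolute deviations measured either from
the mean or from the median,

\[ M.D = \frac{\sum_{i=1}^{n} \lvert X_i - \bar{X} \rvert}{n} \quad \text{OR} \quad M.D = \frac{\sum_{i=1}^{n} \lvert X_i - \text{median} \rvert}{n} \]

• The Coefficient of Mean Deviation is given as

\[ \text{Coeff. of } M.D = \frac{M.D}{\text{Mean}} \quad \text{OR} \quad \frac{M.D}{\text{Median}} \]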
Example

• Consider the following data of yield of wheat (in kg) from 8 experimental plots:
65, 71, 67, 75, 63, 69, 75, 63
• Find the range, coefficient of range, quartile deviation, coefficient of Q.D, and
mean deviation from the mean and from the median. Also calculate all the
relative measures of dispersion.

  X       X - 68.5    |X - Mean|    |X - Median|
  65      -3.5        3.5           3
  71       2.5        2.5           3
  67      -1.5        1.5           1
  75       6.5        6.5           7
  63      -5.5        5.5           5
  69       0.5        0.5           1
  75       6.5        6.5           7
  63      -5.5        5.5           5
  548      0          32            32

\[ \bar{X} = \frac{\sum X}{n} = \frac{548}{8} = 68.5 \]

• Range = 75 − 63 = 12, Coeff. of Range = 12/138 = 0.087
• Q1 = 63.5, Q2 = 68 (median), Q3 = 74
• Q.D = (74 − 63.5)/2 = 5.25, Coefficient of Q.D = 10.5/137.5 = 0.076
• M.D from mean = 32/8 = 4
• Coeff. of M.D (from mean) = 4/68.5 = 0.058
• M.D from median = 32/8 = 4, e.g. |65 − 68| = 3 for the first observation
• Coeff. of M.D (from median) = 4/68 = 0.059
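
As a rough check of the wheat-yield example, the following Python sketch recomputes the quartiles, quartile deviation, and mean deviations. It assumes the quartiles are located at positions (n + 1)p in the sorted data, which is what `statistics.quantiles(..., method="exclusive")` does and which reproduces the values quoted above. The helper name `mean_deviation` is illustrative.

```python
import statistics

yields = [65, 71, 67, 75, 63, 69, 75, 63]  # wheat yield in kg

# Quartiles using the (n + 1)p position rule (assumed to be the rule used above)
q1, q2, q3 = statistics.quantiles(yields, n=4, method="exclusive")

qd = (q3 - q1) / 2                # quartile deviation
coeff_qd = (q3 - q1) / (q3 + q1)  # coefficient of quartile deviation

def mean_deviation(data, center):
    """Arithmetic mean of the absolute deviations from `center`."""
    return sum(abs(x - center) for x in data) / len(data)

mean = statistics.mean(yields)
median = statistics.median(yields)
md_mean = mean_deviation(yields, mean)
md_median = mean_deviation(yields, median)

print(q1, q2, q3, qd, round(coeff_qd, 3))
print(md_mean, round(md_mean / mean, 3))
print(md_median, round(md_median / median, 3))
```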
Variance

• Variance is a measure of the spread of the observations in a dataset.
• It is the average squared distance of the observations from their mean
(the sample version divides by n − 1 instead of n).

Population Variance:
\[ \sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N} \]

Sample Variance:
\[ S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} \]
Standard Deviation

• It is the positive square root of the variance.

Population Standard Deviation:
\[ \sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}} \]

Sample Standard Deviation:
\[ S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}} \]
Example

• Consider the following data of yield of wheat (in kg) from 8 experimental plots:
65, 71, 67, 75, 63, 69, 75, 63
• Find the average, variance and the standard deviation of the yield.

  X       X - 68.5    (X - 68.5)^2
  65      -3.5        12.25
  71       2.5         6.25
  67      -1.5         2.25
  75       6.5        42.25
  63      -5.5        30.25
  69       0.5         0.25
  75       6.5        42.25
  63      -5.5        30.25
  548      0          166

\[ \bar{X} = \frac{\sum X}{n} = \frac{548}{8} = 68.5 \]

\[ S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} = \frac{166}{8 - 1} = 23.71 \]

\[ S = \sqrt{23.71} = 4.87 \]
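
The same figures can be reproduced with Python's standard `statistics` module, as a minimal sketch: `variance` and `stdev` use the sample divisor n − 1, while `pvariance` uses the population divisor N.

```python
import statistics

yields = [65, 71, 67, 75, 63, 69, 75, 63]  # wheat yield in kg

mean = statistics.mean(yields)
sum_sq = sum((x - mean) ** 2 for x in yields)  # sum of squared deviations

s2 = statistics.variance(yields)       # sample variance, divisor n - 1
s = statistics.stdev(yields)           # sample standard deviation
sigma2 = statistics.pvariance(yields)  # population variance, divisor N

print(mean, sum_sq, round(s2, 2), round(s, 2), sigma2)
```

The population variance (166/8 = 20.75) is smaller than the sample variance (166/7) because of the larger divisor.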
Co-efficient of Variation (CV)

• The coefficient of variation is a measure of spread that describes
the amount of variability relative to the mean. Because the
coefficient of variation is unitless, you can use it instead of the
standard deviation to compare the spread of data sets that have
different units or different means.

\[ \text{Coeff. of Variation } (CV) = \frac{S}{\bar{X}} \times 100 \]
Example

Data set 1: prices (in Rs.) of a certain commodity
8, 13, 18, 23, 30

Data set 2: life of a car battery (in hours)
130, 150, 180, 250, 345

Sol:
                 Prices (X)      Battery life (Y)
  Mean           18.4 Rs.        211 Hrs.
  S              8.56 Rs.        87.63 Hrs.
  C.V            46.5%           41.5%

The prices have the larger C.V, so they are relatively more variable than the battery lives, even though their standard deviation is smaller in absolute terms.
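
A brief Python sketch of this comparison, assuming the sample standard deviation (divisor n − 1) as in the formula for S; the function name `coefficient_of_variation` is illustrative.

```python
import statistics

def coefficient_of_variation(data):
    """Sample standard deviation as a percentage of the mean."""
    return statistics.stdev(data) / statistics.mean(data) * 100

prices = [8, 13, 18, 23, 30]             # Rs.
battery_life = [130, 150, 180, 250, 345]  # hours

cv_prices = coefficient_of_variation(prices)
cv_battery = coefficient_of_variation(battery_life)
print(round(cv_prices, 1), round(cv_battery, 1))
```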
Types of Distribution
Measures of Skewness
• Skewness is a measure of symmetry, or more precisely, the lack of
symmetry. A distribution, or data set, is symmetric if it looks the
same to the left and right of the center point.

Moment-based coefficient:
\[ Sk = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^3}{n S^3} \]

Quartile-based coefficient:
\[ Sk = \frac{Q_3 - 2Q_2 + Q_1}{Q_3 - Q_1} \]

• If Sk = 0 the distribution is symmetrical
• If Sk > 0 the distribution is positively skewed
• If Sk < 0 the distribution is negatively skewed
Measures of Kurtosis
• Kurtosis describes the extent of peakedness or flatness of the
distribution of the data.
• It is measured by the Coefficient of Kurtosis (K), computed as

\[ K = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^4}{n S^4} - 3 \]
Interpretation

• K = 0: mesokurtic (normal)
• K > 0: leptokurtic
• K < 0: platykurtic
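Example
Consider the following data:
25, 27, 29, 31, 33, 35, 37
Find the mean, variance, coefficient of skewness and coefficient of kurtosis, and interpret the results.

  Mean                  31 (= 217/7)
  Variance              18.67
  Standard Deviation    4.32
  Kurtosis              -1.2
  Skewness              0
  Range                 12
  Minimum               25
  Maximum               37
  Sum                   217

The skewness is 0, so the data are symmetrical; the kurtosis is negative, so the distribution is platykurtic (flatter than the normal curve).

• Example 2: Consider the data
X = 2.5, 2.7, 2.9, 3.1, 3.3, 3.5, 3.7
• Mean = 3.1 (the median is also 3.1)
• Mean Deviation from median = 2.4/7 = 0.343
• Coefficient of Mean Deviation = 0.343/3.1 = 0.111
• Variance = 0.1867
• Standard Deviation = 0.43205, CV = (0.43205/3.1) × 100 = 13.94%
• Coefficient of Skewness = 0
• Coefficient of Kurtosis < 0 (platykurtic)

These data are the first data set divided by 10, so the skewness and kurtosis, which do not depend on the scale of measurement, lead to the same interpretation as in the first example.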
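
The two skewness formulas and the kurtosis formula given earlier can be sketched in Python as follows. This is a direct transcription under two assumptions: S is the sample standard deviation (divisor n − 1), and quartiles use the (n + 1)p position rule. The function names are illustrative. A summary table produced by statistical software may apply a small-sample adjustment, so its kurtosis figure can differ from the value this formula returns.

```python
import statistics

def moment_skewness(data):
    """Sum of cubed deviations divided by n * S^3."""
    n = len(data)
    mean = statistics.mean(data)
    s = statistics.stdev(data)  # sample SD, divisor n - 1 (assumption)
    return sum((x - mean) ** 3 for x in data) / (n * s ** 3)

def quartile_skewness(data):
    """(Q3 - 2*Q2 + Q1) / (Q3 - Q1)."""
    q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")
    return (q3 - 2 * q2 + q1) / (q3 - q1)

def kurtosis_coefficient(data):
    """Sum of fourth-power deviations divided by n * S^4, minus 3."""
    n = len(data)
    mean = statistics.mean(data)
    s = statistics.stdev(data)
    return sum((x - mean) ** 4 for x in data) / (n * s ** 4) - 3

data = [25, 27, 29, 31, 33, 35, 37]
sk = moment_skewness(data)
sk_q = quartile_skewness(data)
k = kurtosis_coefficient(data)

label = "mesokurtic" if k == 0 else ("leptokurtic" if k > 0 else "platykurtic")
print(sk, sk_q, round(k, 3), label)
```

Both skewness measures come out as zero for these symmetric data, and the kurtosis coefficient is negative, which matches the platykurtic interpretation.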
Five Number Summary

• For a set of data, the five-number summary consists of the minimum, first
quartile, median, third quartile, and maximum:
Minimum, Q1, Median, Q3, Maximum
Boxplot / Box & Whisker plot

1. The line within the box (the median) indicates the central value of the data
2. The length of the graph / box indicates the variation in the data
3. The position of the line within the box indicates the shape of the data
   • Line at the center of the box indicates the data are symmetrical
   • Line above the center of the box indicates the data are negatively skewed
   • Line below the center of the box indicates the data are positively skewed
Example
Consider the following data of marks of 20 students:-
53 74 82 42 39 28 20 81 68 58
54 93 70 30 61 55 36 37 29 94
Construct Boxplot of the data and interpret it.

Minimum = 20
Q1 = 36.25
Median = 54.5
Q3 = 73
Maximum = 94
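
The five-number summary above can be reproduced with a short Python sketch, again assuming the (n + 1)p position rule for quartiles (`method="exclusive"` in `statistics.quantiles`), which matches the quoted values. Drawing the plot itself is left out to keep the example dependency-free.

```python
import statistics

marks = [53, 74, 82, 42, 39, 28, 20, 81, 68, 58,
         54, 93, 70, 30, 61, 55, 36, 37, 29, 94]

q1, median, q3 = statistics.quantiles(marks, n=4, method="exclusive")
five_number_summary = (min(marks), q1, median, q3, max(marks))

# Compare the median with the midpoint of the box to judge the shape
box_center = (q1 + q3) / 2
print(five_number_summary, box_center)
```

Here the median (54.5) lies almost exactly at the center of the box (54.625), so the marks are approximately symmetrical.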
Question
The breaking strength of 20 test pieces of a certain alloy is given as:
95, 97, 96, 73, 78, 95, 89, 68, 82, 79, 69, 67, 83, 94, 87,
93, 103, 108, 117, 130
Calculate the average breaking strength of the alloy and the standard
deviation. Calculate the percentage of observations lying within the limits:

  Mean             90.15
  Variance S^2     269.08
  SD S             16.4

(i) Mean ± 1S = (73.75, 106.55), count: 13,
percentage of observations = (13/20) × 100 = 65%
(ii) Mean ± 2S = (57.35, 122.95), count: 19,
percentage of observations = (19/20) × 100 = 95%
(iii) Mean ± 3S = (40.95, 139.35), count: 20,
percentage of observations = (20/20) × 100 = 100%

Characteristics of Normal Curve: About 68 percent of the observations fall
between plus and minus one SD from the mean; about 95 percent fall
between plus and minus two SD from the mean; and about 99.7 percent fall
between plus and minus three SD from the mean.
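
The counts in this question can be verified with a small Python sketch. The helper `share_within` is an illustrative name; it counts the observations inside mean ± k·S using the sample standard deviation.

```python
import statistics

strength = [95, 97, 96, 73, 78, 95, 89, 68, 82, 79, 69, 67, 83, 94, 87,
            93, 103, 108, 117, 130]

mean = statistics.mean(strength)
s = statistics.stdev(strength)  # sample SD, divisor n - 1

def share_within(data, center, spread, k):
    """Return (count, percentage) of observations in center +/- k * spread."""
    low, high = center - k * spread, center + k * spread
    count = sum(low <= x <= high for x in data)
    return count, 100 * count / len(data)

results = {k: share_within(strength, mean, s, k) for k in (1, 2, 3)}
print(mean, round(s, 1), results)
```

The observed shares (65%, 95%, 100%) are close to the 68%, 95%, and 99.7% expected under a normal curve.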
Standardized Variable

• A standardized variable (sometimes called a z-score or a standard
score) is a variable that has been rescaled to have a mean of zero
and a standard deviation of one.

\[ Z_i = \frac{X_i - \bar{X}}{S} \]
Example: Consider the following data: 25, 26, 23, 25, 45, 45, 58, 58,
50, 25. Calculate the mean and variance and construct the standardized
variable Z_i. Verify that the mean and variance of Z_i are 0 and 1,
respectively.
Solution

  X_i     X_i - Mean    (X_i - Mean)^2    Z_i = (X_i - Mean)/S
  25      -13           169               -0.8905
  26      -12           144               -0.8220
  23      -15           225               -1.0275
  25      -13           169               -0.8905
  45        7            49                0.4795
  45        7            49                0.4795
  58       20           400                1.3700
  58       20           400                1.3700
  50       12           144                0.8220
  25      -13           169               -0.8905
  380       0          1918                0.00

Mean of X = 380/10 = 38
S_X = sqrt(1918/9) = 14.598
Mean of Z = 0
Sum of (Z_i − mean of Z)^2 = 8.995
Var(Z) = 8.995/9 ≈ 1
S_Z = 1
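
The verification can also be done numerically. The sketch below standardizes the data with the sample standard deviation and checks that the resulting Z values have mean 0 and sample variance 1, up to floating-point rounding.

```python
import statistics

x = [25, 26, 23, 25, 45, 45, 58, 58, 50, 25]

mean_x = statistics.mean(x)
s_x = statistics.stdev(x)  # sample SD, divisor n - 1

# Standardize: subtract the mean, divide by the standard deviation
z = [(xi - mean_x) / s_x for xi in x]

mean_z = statistics.mean(z)
var_z = statistics.variance(z)
print(mean_x, round(s_x, 3), [round(zi, 4) for zi in z])
print(round(mean_z, 10), round(var_z, 10))
```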