Measures of Dispersion
Dispersion
• The measure of the spread or variability
• No Variability – No Dispersion
Measures of Variation
• We will look at three measures of the amount of dispersion or variation (the spread of the group):
1. Range
2. Standard Deviation
3. Quartile deviation
Why is it Important?
• You want to choose the best brand of medicine for your patients. You are interested in how long each drug takes to cure a disease. The choices are narrowed down to 2 different drugs. The results are shown in the chart. Which drug would you choose?
The chart indicates the number of days a drug takes to cure a particular disease.

          Drug A   Drug B
            10       35
            60       45
            50       30
            30       35
            40       40
            20       25
Total      210      210
Does the Average Help?
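• Drug A: Avg = 210/6 = 35 days
• Drug B: Avg = 210/6 = 35 days
• The averages are equal, so the mean alone does not separate the two drugs.
• Range: Drug A = 60 – 10 = 50 days; Drug B = 45 – 25 = 20 days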
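As a quick check of the drug comparison, here is a minimal Python sketch that computes the mean and the range for the two columns of the chart. The names `drug_a`, `drug_b` and the helper `value_range` are illustrative only and do not come from the slides.

```python
from statistics import mean

# Days to cure, copied from the Drug A / Drug B chart.
drug_a = [10, 60, 50, 30, 40, 20]
drug_b = [35, 45, 30, 35, 40, 25]


def value_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)


for name, days in (("Drug A", drug_a), ("Drug B", drug_b)):
    print(name, "mean =", mean(days), "range =", value_range(days))
```

Both drugs have a mean of 35 days, but the range is 50 days for Drug A and 20 days for Drug B, which is why a measure of spread is needed alongside the average.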
Deviation from the Mean
• A deviation from the mean, $x - \bar{x}$, is the difference between the value of $x$ and the mean $\bar{x}$.
• We base our formulas for variance and standard deviation on the amount by which the observations deviate from the mean.
• The mean deviation of a set of observations $x_1, x_2, \dots, x_N$ is the mean of the absolute deviations from the mean and equals
$$\frac{1}{N}\sum_{i=1}^{N} |x_i - \bar{x}|$$
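The mean deviation formula can be sketched in a few lines of Python. This is an illustration under the assumption that the data are the drug cure times from the earlier chart; the function name `mean_deviation` is hypothetical.

```python
def mean_deviation(values):
    """Mean of the absolute deviations from the arithmetic mean."""
    n = len(values)
    centre = sum(values) / n
    return sum(abs(x - centre) for x in values) / n


# Days to cure, copied from the Drug A / Drug B chart.
drug_a = [10, 60, 50, 30, 40, 20]
drug_b = [35, 45, 30, 35, 40, 25]

print("Drug A mean deviation:", mean_deviation(drug_a))
print("Drug B mean deviation:", mean_deviation(drug_b))
```

With these data the mean deviation is 15 days for Drug A and 5 days for Drug B, so Drug B is the more consistent of the two.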
Formulae for sample and population
variances
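Sample variance
• Computation formula:
$$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}$$
• Definition formula:
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

Population variance
• Computation formula:
$$\sigma^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{N}}{N}$$
• Definition formula:
$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$$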
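To see that the computation and definition formulae agree, the following Python sketch implements both forms for the sample and the population variance. The function names are hypothetical, and the Drug B cure times from the earlier chart are used purely as convenient test data.

```python
import math


def sample_variance_definition(xs):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    return sum((x - x_bar) ** 2 for x in xs) / (n - 1)


def sample_variance_computation(xs):
    """(sum of x^2 minus (sum of x)^2 / n), divided by n - 1."""
    n = len(xs)
    return (sum(x * x for x in xs) - sum(xs) ** 2 / n) / (n - 1)


def population_variance_definition(xs):
    """Sum of squared deviations from the mean, divided by N."""
    n = len(xs)
    mu = sum(xs) / n
    return sum((x - mu) ** 2 for x in xs) / n


def population_variance_computation(xs):
    """(sum of x^2 minus (sum of x)^2 / N), divided by N."""
    n = len(xs)
    return (sum(x * x for x in xs) - sum(xs) ** 2 / n) / n


days = [35, 45, 30, 35, 40, 25]  # Drug B column of the chart
print(sample_variance_definition(days), sample_variance_computation(days))
print(population_variance_definition(days), population_variance_computation(days))
```

The only difference between the sample and population versions is the divisor: n - 1 for a sample, N for a whole population.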
Standard Deviation
• The standard deviation is the square root of the variance.
$$s = \sqrt{s^2}$$
Example – Using Formula
• Find the variance of the following
dataset 6, 3, 8, 5, 3 (in hours)
  $x$     $x^2$
   6       36
   3        9
   8       64
   5       25
   3        9
$\sum x = 25$,  $\sum x^2 = 143$
$$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{143 - \frac{25^2}{5}}{4} = \frac{143 - 125}{4} = \frac{18}{4} = 4.5$$
Find the standard deviation
• The standard deviation is the positive square
root of the variance.
$$s = \sqrt{4.5} \approx 2.12 \text{ hours}$$
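The worked example can be checked with Python's standard `statistics` module, whose `variance` and `stdev` functions use the sample (n - 1) divisor. The variable name `hours` is illustrative.

```python
from statistics import stdev, variance

hours = [6, 3, 8, 5, 3]  # the dataset from the example, in hours

s_squared = variance(hours)  # sample variance, divisor n - 1
s = stdev(hours)             # positive square root of the sample variance

print("variance =", s_squared)
print("standard deviation =", round(s, 2))
```

This reproduces the hand calculation: a variance of 4.5 and a standard deviation of about 2.12 hours.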
Example: Mean, variance and standard deviation of data
• In a city there are six professional football clubs. Last season they had 25, 30, 18, 27, 28 and 22 players respectively on their full-time paid staffs. Find the mean, variance and standard deviation of the number of full-time paid staff.
• Let us call the number of full-time paid staff r. It is easier to lay out the calculation in the form of a table.
Club     $r_i$    $r_i - \bar{r}$    $(r_i - \bar{r})^2$    $r_i^2$
A         25          0                  0                   625
B         30          5                 25                   900
C         18         -7                 49                   324
D         27          2                  4                   729
E         28          3                  9                   784
F         22         -3                  9                   484
Total    150                            96                  3846

Mean $\bar{r} = 150/6 = 25$. Variance $= 96/6 = 16$. Standard deviation $= \sqrt{16} = 4$.
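The table for the football clubs can be reproduced with a short Python sketch. Because all six clubs in the city are included, the divisor is N = 6, which corresponds to `pvariance` and `pstdev` in the standard `statistics` module. The variable names are illustrative.

```python
from statistics import mean, pstdev, pvariance

# Full-time paid staff per club, from the example.
staff = {"A": 25, "B": 30, "C": 18, "D": 27, "E": 28, "F": 22}
r = list(staff.values())

r_bar = mean(r)
squared_devs = [(x - r_bar) ** 2 for x in r]  # column (r_i - r_bar)^2
sum_sq = sum(x * x for x in r)                # total of column r_i^2

print("mean =", r_bar)
print("sum of squared deviations =", sum(squared_devs))
print("sum of squares =", sum_sq)
print("variance =", pvariance(r))
print("standard deviation =", pstdev(r))

# Shortcut using the last column: mean of squares minus square of mean.
shortcut = sum_sq / len(r) - r_bar ** 2
print("variance via shortcut =", shortcut)
```

Both routes give a variance of 16 and hence a standard deviation of 4.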
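When the value $r_i$ occurs with frequency $f_i$ ($i = 1, 2, \dots, k$) among $N$ observations, the variance is
$$\frac{1}{N}\sum_{i=1}^{k} f_i (r_i - \bar{r})^2 \quad\text{or}\quad \frac{1}{N}\sum_{i=1}^{k} f_i r_i^2 - \left(\frac{\sum_{i=1}^{k} f_i r_i}{N}\right)^2$$
$$= \frac{194.53}{91} = 2.13377$$
The standard deviation is $\sqrt{2.13377} = 1.46$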
Variance and standard deviation of grouped
observations of a continuous variable
The variance of a set of $N$ observations of a continuous variable, in which $f_i$ observations fall in the interval whose centre is $x_i$ ($i = 1, 2, \dots, k$), is
$$\frac{1}{N}\sum_{i=1}^{k} f_i (x_i - \bar{x})^2 \quad\text{or}\quad \frac{1}{N}\sum_{i=1}^{k} f_i x_i^2 - \left(\frac{\sum_{i=1}^{k} f_i x_i}{N}\right)^2$$
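A minimal Python sketch of the grouped-data formula follows. The interval centres and frequencies below are invented purely for illustration (they do not come from the slides), and the function names are hypothetical. Both forms of the formula are implemented so that they can be compared.

```python
import math


def grouped_variance_definition(centres, freqs):
    """(1/N) * sum of f_i * (x_i - x_bar)^2, with N = sum of f_i."""
    n = sum(freqs)
    x_bar = sum(f * x for x, f in zip(centres, freqs)) / n
    return sum(f * (x - x_bar) ** 2 for x, f in zip(centres, freqs)) / n


def grouped_variance_computation(centres, freqs):
    """(1/N) * sum of f_i * x_i^2, minus the square of the grouped mean."""
    n = sum(freqs)
    mean_of_squares = sum(f * x * x for x, f in zip(centres, freqs)) / n
    grouped_mean = sum(f * x for x, f in zip(centres, freqs)) / n
    return mean_of_squares - grouped_mean ** 2


# Hypothetical data: interval centres and the number of observations in each.
centres = [5, 15, 25, 35]
freqs = [2, 5, 2, 1]

var_def = grouped_variance_definition(centres, freqs)
var_comp = grouped_variance_computation(centres, freqs)
print(var_def, var_comp, math.sqrt(var_def))
```

For these made-up figures the grouped mean is 17 and both forms give a variance of 76. Every observation is treated as if it sat at the centre of its interval, so the result is an approximation to the variance of the raw data.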
The semi-inter-quartile range
(or quartile deviation)
• The variance, the standard deviation and the mean deviation go
naturally with the mean.
• They are based on deviations from the mean, and the averaging
process is the same as that for calculating the mean.
Rank order   1   2   3   4   5   6   7   8   9   10  11  12  13  14  15
Quartiles    .   .   .   Q1  .   .   .   M   .   .   .   Q3  .   .   .
Example
Rank order             1   2   3   4   5   6   7   8   9   10  11  12  13  14  15
Number of seats sold   54  57  60  67  72  74  75  78  83  87  88  93  98  99  100
Quartiles              .   .   .   Q1  .   .   .   M   .   .   .   Q3  .   .   .
The general term for these measures is quantiles; the quartiles, deciles
and percentiles are examples.
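The quartiles in the seats example can be read off by rank, as in this Python sketch. It assumes the rank rule suggested by the diagram (ranks (n + 1)/4, (n + 1)/2 and 3(n + 1)/4, which are whole numbers when n = 15) and uses the standard definition of the semi-inter-quartile range as half the distance between the quartiles. The helper `value_at_rank` is hypothetical.

```python
seats = [54, 57, 60, 67, 72, 74, 75, 78, 83, 87, 88, 93, 98, 99, 100]


def value_at_rank(ordered, rank):
    """Return the observation with the given 1-based rank."""
    return ordered[rank - 1]


ordered = sorted(seats)
n = len(ordered)

# Assumption: n + 1 is divisible by 4, so each rank is a whole number.
assert (n + 1) % 4 == 0

q1 = value_at_rank(ordered, (n + 1) // 4)
median = value_at_rank(ordered, (n + 1) // 2)
q3 = value_at_rank(ordered, 3 * (n + 1) // 4)

semi_iqr = (q3 - q1) / 2  # semi-inter-quartile range (quartile deviation)

print("Q1 =", q1, "M =", median, "Q3 =", q3)
print("semi-inter-quartile range =", semi_iqr)
```

For the 15 values this gives Q1 = 67, M = 78 and Q3 = 93, so the quartile deviation is 13 seats. When n + 1 is not a multiple of 4 some interpolation rule is needed, which this sketch does not attempt.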
Example: Deciles and quartiles of a grouped continuous
variable (Clarke & Cooke, 4th Ed., example 4.6.1)
The incomes of married couples over retiring age in 1973 are shown in columns 1
and 2 of the Table below. Draw a cumulative frequency curve for the data, and
from it estimate the lowest decile, the median, the lower and upper quartiles of
income. Use the curve to estimate the proportion of married couples who had a
gross weekly income between £22 and £28.
[Figure: cumulative frequency curve. Vertical axis: Frequency (0 to 2000); horizontal axis: Gross weekly income (£) (0 to 50).]
• Note that the upper limit has been conveniently set at £50
• The total frequency, in thousands, is 2059.
• One-tenth of the total frequency must lie below the first decile. Thus from the
graph we need to find the income corresponding to 205.9 on the vertical scale: it
is £14.40 as accurately as we can read it. The first decile is thus £14.40.
• From the graph, the median has to have half the total frequency, i.e. 1029.5,
below it. The income corresponding to this is £19.80.
• The quartiles correspond to cumulative frequencies of $\frac{1}{4} \times 2059 = 514.75$ and $\frac{3}{4} \times 2059 = 1544.25$; they are therefore, from the graph, $Q_1$ = £16.20 and $Q_3$ = £26.20.
• Finally, from the graph the cumulative frequency up to £22 is 1230 thousand, and up to £28 is 1590 thousand, so the number of married couples with incomes between £22 and £28 is 360 thousand. This is a proportion $\frac{360}{2059} = 0.175$ (i.e. $17\tfrac{1}{2}\%$) of the whole.
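The arithmetic behind the graph readings can be sketched in Python. Only the figures quoted in the example are used (the total of 2059 thousand and the two cumulative frequencies read from the curve); the incomes themselves still have to be read from the graph. The variable names are illustrative.

```python
from fractions import Fraction

total = 2059  # total frequency, in thousands

# Cumulative frequency that must lie below each quantile.
targets = {
    "first decile": Fraction(1, 10) * total,
    "lower quartile": Fraction(1, 4) * total,
    "median": Fraction(1, 2) * total,
    "upper quartile": Fraction(3, 4) * total,
}
for name, target in targets.items():
    print(f"{name}: look up {float(target)} on the vertical scale")

# Cumulative frequencies read from the curve at £22 and £28 (thousands).
cum_22, cum_28 = 1230, 1590
between = cum_28 - cum_22
proportion = between / total
print("couples between £22 and £28 (thousands):", between)
print("proportion of the whole:", round(proportion, 3))
```

The targets are 205.9, 514.75, 1029.5 and 1544.25, matching the values used above, and the proportion rounds to 0.175.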