Chapter Four: Measures of Dispersion (Variation) : Abebuabebaw

This chapter discusses measures of dispersion, which quantify how spread out or varied the values in a data set are. There are absolute measures of dispersion, such as range, which are expressed in the same units as the original data, and relative measures, such as coefficient of variation, which are ratios that allow comparison across data sets with different units. Range is defined as the difference between the largest and smallest values, while quartile deviation measures the spread of the middle 50% of values using the interquartile range. Measures of dispersion are important for judging the reliability of central tendency measures, controlling variability, and comparing groups.

Uploaded by

Yohannis Reta


Chapter four: MEASURES OF DISPERSION (VARIATION)

CHAPTER FOUR:

MEASURES OF DISPERSION (VARIATION)

4.1 Introduction

Just as central tendency can be measured by a number in the form of an average, the amount of variation
(dispersion, spread, or scatter) among the values in the data set can also be measured. The measures of
central tendency show that most of the values in a data set concentrate around a central value,
called the average, with the remaining values scattered (distributed) on either side of that value.
But these measures do not reveal how the values are dispersed (spread or scattered) on each side of the
central value. The dispersion of values is indicated by the extent to which these values tend to spread over
an interval rather than cluster closely around an average.
The term dispersion is generally used in two senses. Firstly, dispersion refers to the variations of the items
among themselves. If the value of all the items of a series is the same, there will be no variation among
different items of a series. Secondly, dispersion refers to the variation of the items around an average. If
the difference between the values of the items and the average is large, the dispersion will be high; on the
other hand, if the difference between the values of the items and the average is small, the dispersion will be
low. Thus, dispersion is defined as the scatter or spread of the individual items in a given series.

After studying this chapter, you should be able to:

✓ Explain the meaning of measures of dispersion

✓ Compare two or more sets of data using relative measures of dispersion.

✓ Apply the Z-score to find out the relative standing of values.
✓ Explain measures of skewness and kurtosis.
Objectives of measuring Variation:
✓ To judge the reliability of measures of central tendency
✓ To control variability itself.
✓ To compare two or more groups of numbers in terms of their variability.
✓ To make further statistical analysis.
4.2 Absolute and Relative Measures of Dispersion

Absolute measures of dispersion: An absolute measure is expressed in the same statistical unit as the
original data, such as kilograms or tonnes. These measures are suitable for comparing the variability of
two distributions whose variables are expressed in the same units and have averages of similar size. They
are not suitable for comparing the variability of two distributions whose variables are expressed in
different units.

Relative measures of dispersion: A relative measure of dispersion is the ratio of a measure of absolute
dispersion to an appropriate average or the selected items of the data.

Relative measures of dispersion are classified as follows:

• Based on selected items: coefficient of range and coefficient of quartile deviation.

• Based on all items: coefficient of mean deviation and coefficient of standard deviation or
coefficient of variation.

4.3 Types of Measures of Variation

4.3.1 The Range and Relative Range


Range is the simplest measure of dispersion. It is defined as the difference between the largest and
smallest values in a given set of data. Its formula is:

𝑅 =𝐿−𝑆

Where R=Range, L= Largest value in a given set of data, S= smallest value in a given set of data.

For a continuous grouped distribution, the range may be obtained as:

• The difference between upper class limit of the last class and the lower class limit of the first
class, or

• The difference between the largest class mark and the smallest class mark, or

• The difference between the upper class boundary of the last class and the lower class boundary
of the first class.

The range is used to describe quantities such as the maximum change in daily temperature or rainfall.
When the sample size is small, it can be an adequate measure of variation. It is commonly used in quality
control.

The relative measure of range, also called the coefficient of range, is defined as

$$RR = \frac{L - S}{L + S}$$

Example 4.1: Five students obtained the following marks in statistics: 20, 35, 25, 30, 15. Find the range
and relative range

Solution: Here, $L = 35$ and $S = 15$.

$$\text{Range} = L - S = 35 - 15 = 20$$

$$RR = \frac{L - S}{L + S} = \frac{35 - 15}{35 + 15} = 0.4$$
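The calculation in Example 4.1 can be sketched in a few lines of Python. The function names `value_range` and `relative_range` are illustrative choices, not part of the chapter, and the sketch assumes the data are non-empty and that L + S is not zero.

```python
def value_range(values):
    """Return the range R = L - S of a non-empty sequence of numbers."""
    return max(values) - min(values)


def relative_range(values):
    """Return the coefficient of range (L - S) / (L + S)."""
    largest, smallest = max(values), min(values)
    return (largest - smallest) / (largest + smallest)


marks = [20, 35, 25, 30, 15]          # marks from Example 4.1
marks_range = value_range(marks)      # 35 - 15
marks_rr = relative_range(marks)      # (35 - 15) / (35 + 15)
print(marks_range, marks_rr)
```

The two results agree with the hand calculation (a range of 20 and a relative range of 0.4).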

Example 4.2: Find out range and relative range of the following given data.

Size        5-10   11-15   16-20   21-25   26-30
Frequency   4      9       15      30      40

Solution: Here,

L = upper class limit of the last class = 30

S = lower class limit of the first class = 5

$$\text{Range} = L - S = 30 - 5 = 25, \qquad RR = \frac{30 - 5}{30 + 5} = 0.7143$$

Merits of the Range

➢ It is well-defined, easy to compute and simple to understand.

➢ It helps in giving an idea about the variation, just by giving the lowest value and the greatest
value of variable.

Demerits of the Range

➢ It is not based on all observations of the series.

➢ It can’t be calculated in case of open-ended distribution.
➢ It is affected by sampling fluctuation.
➢ It is affected by extreme values in the series.

4.3.2 The Quartile Deviation and Coefficient of Quartile Deviation

Inter-quartile range and quartile deviation are other measures of dispersion. The difference between the
upper quartile ($Q_3$) and the lower quartile ($Q_1$) is called the inter-quartile range. Symbolically,

$$\text{Inter-quartile range } (IQR) = Q_3 - Q_1$$

The inter-quartile range covers the dispersion of the middle 50% of the items of the series. Quartile
deviation, also called the semi-inter-quartile range, is half of the difference between the upper and lower
quartiles, that is, half of the inter-quartile range. Its formula is

$$QD = \frac{Q_3 - Q_1}{2}$$

The relative measure of quartile deviation, also called the coefficient of quartile deviation (CQD), is
defined as:

$$CQD = \frac{Q_3 - Q_1}{Q_3 + Q_1}$$

Example 4.3: Find inter-quartile range, quartile deviation and coefficient of quartile deviation from the
following data.

28, 18, 20, 24, 27, 30, 15


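Solution: First arrange the data in ascending order: 15, 18, 20, 24, 27, 28, 30.

$$Q_1 = \text{size of the } \left(\frac{n+1}{4}\right)^{\text{th}} \text{ item} = \text{size of the } \left(\frac{7+1}{4}\right)^{\text{th}} \text{ item} = \text{size of the 2nd item} = 18 \text{ marks}$$

$$Q_3 = \text{size of the } 3\left(\frac{n+1}{4}\right)^{\text{th}} \text{ item} = \text{size of the } 3\left(\frac{7+1}{4}\right)^{\text{th}} \text{ item} = \text{size of the 6th item} = 28 \text{ marks}$$

$$IQR = Q_3 - Q_1 = 28 - 18 = 10$$

$$QD = \frac{Q_3 - Q_1}{2} = \frac{28 - 18}{2} = 5$$

$$CQD = \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{28 - 18}{28 + 18} = 0.217$$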
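A minimal Python sketch of the position rule used in Example 4.3 follows. The function name `quartile_by_position` is hypothetical, and the linear interpolation for fractional positions is an added assumption: the example only needs whole-numbered positions, so the chapter does not say how fractions should be handled.

```python
def quartile_by_position(sorted_values, k):
    """Return the k-th quartile as the size of the k(n+1)/4-th item.

    Assumption: a fractional position is resolved by linear interpolation
    between the two neighbouring items.
    """
    n = len(sorted_values)
    position = k * (n + 1) / 4          # 1-based position in the ordered data
    lower = int(position)               # whole part of the position
    fraction = position - lower
    if lower < 1:
        return sorted_values[0]
    if lower >= n:
        return sorted_values[-1]
    below = sorted_values[lower - 1]
    above = sorted_values[lower]
    return below + fraction * (above - below)


data = sorted([28, 18, 20, 24, 27, 30, 15])   # data from Example 4.3
q1 = quartile_by_position(data, 1)            # 2nd item
q3 = quartile_by_position(data, 3)            # 6th item
iqr = q3 - q1
qd = iqr / 2
cqd = (q3 - q1) / (q3 + q1)
print(q1, q3, iqr, qd, round(cqd, 3))
```

Library routines for quantiles often use different position conventions, so their results may differ slightly from this rule on small data sets.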

Example 4.4: Find inter-quartile range, quartile deviation and coefficient of quartile deviation from the
following data

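Marks             2    3    4    5    6    7    8    9
No. of students   10   11   12   13   5    12   7    5

Solution:

Marks             2    3    4    5    6    7    8    9
No. of students   10   11   12   13   5    12   7    5
CF                10   21   33   46   51   63   70   75 = N

$$Q_1 = \text{size of the } \left(\frac{N+1}{4}\right)^{\text{th}} \text{ item} = \text{size of the } \left(\frac{75+1}{4}\right)^{\text{th}} \text{ item} = \text{size of the 19th item} = 3$$

$$Q_3 = \text{size of the } 3\left(\frac{N+1}{4}\right)^{\text{th}} \text{ item} = \text{size of the } 3\left(\frac{75+1}{4}\right)^{\text{th}} \text{ item} = \text{size of the 57th item} = 7$$

$$IQR = Q_3 - Q_1 = 7 - 3 = 4$$

$$QD = \frac{Q_3 - Q_1}{2} = \frac{7 - 3}{2} = 2$$

$$CQD = \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{7 - 3}{7 + 3} = 0.4$$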
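For a discrete frequency distribution, the lookup in Example 4.4 amounts to finding the first value whose cumulative frequency reaches the required position. The sketch below assumes the values are already in ascending order; the helper name `item_from_frequencies` is illustrative only, and a fractional position is simply assigned to the first value whose cumulative frequency covers it.

```python
from itertools import accumulate


def item_from_frequencies(values, frequencies, position):
    """Return the value of the item at a 1-based position.

    The position is located with the cumulative frequencies (CF): the answer
    is the first value whose CF is at least the position.
    """
    for value, cumulative in zip(values, accumulate(frequencies)):
        if position <= cumulative:
            return value
    raise ValueError("position exceeds the total frequency")


marks = [2, 3, 4, 5, 6, 7, 8, 9]                 # data from Example 4.4
students = [10, 11, 12, 13, 5, 12, 7, 5]
total = sum(students)                            # N = 75

q1_marks = item_from_frequencies(marks, students, (total + 1) / 4)      # 19th item
q3_marks = item_from_frequencies(marks, students, 3 * (total + 1) / 4)  # 57th item
print(q1_marks, q3_marks, q3_marks - q1_marks)
```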

Remark: QD and CQD are based only on the middle 50% of the observations.

Merits of QD

➢ It is well-defined, easy to compute and simple to understand.

➢ It helps in studying the middle 50% item in the series.
➢ It is not affected by the extreme items.
➢ It is useful in measuring variations in the case of open-ended distributions.

Demerits of QD

➢ It is not based on all the items (it ignores 50% items, i.e., the first 25% and the last 25%).
➢ It is greatly influenced by sampling fluctuations.
➢ It is not amenable to algebraic manipulations.

4.3.3 The Mean Deviation and Coefficient of Mean Deviation

The mean deviation (MD) measures the average deviation of a set of observations about their central
value, generally the mean or the median, ignoring the plus/minus sign of the deviations. In other words
the mean deviation of a set of items is defined as the arithmetic mean of the values of the absolute
deviations from a given average. Depending on the type of average used, we have different mean
deviations.
❖ The mean deviation of a sample of n observations $X_1, X_2, \ldots, X_n$ (individual series) is given as

$$MD = \frac{\sum |X_i - A|}{n}$$

where $|X_i - A|$ denotes the absolute value of the deviation and $A$ stands for the average used in
calculating $MD$. Generally, the arithmetic mean or the median is used, that is, $A = \bar{X}$ (mean) or
$A = \tilde{X}$ (median).

❖ In the case of discrete data arranged in a frequency distribution (FD) and continuous grouped data, the
formula for MD becomes

$$MD = \frac{\sum f_i |X_i - A|}{n}$$

where $X_i$ is the class mark of the ith class, $f_i$ is the frequency of the ith class and $n = \sum f_i$.
1. The mean deviation about the arithmetic mean is, therefore, given by

$$MD(\bar{X}) = \frac{\sum |X_i - \bar{X}|}{n}$$ for ungrouped data (individual series).

$$MD(\bar{X}) = \frac{\sum f_i |X_i - \bar{X}|}{n}$$ for discrete data arranged in FD and a grouped continuous frequency
distribution; where $X_i$ is the value for discrete data arranged in FD and the class mark of the ith class
for continuous grouped data, $f_i$ is the frequency of the ith class and $n = \sum f_i$.

Steps to calculate M.D for (𝑋̅)

▪ Find the arithmetic mean, 𝑋̅
▪ Find the deviations of each reading from 𝑋̅
▪ Find the arithmetic mean of the deviations, ignoring sign.
2. The mean deviation about the median is given by

$$MD(\tilde{X}) = \frac{\sum |X_i - \tilde{X}|}{n}$$ for ungrouped data (individual series).

$$MD(\tilde{X}) = \frac{\sum f_i |X_i - \tilde{X}|}{n}$$ for discrete data arranged in FD and a grouped continuous frequency
distribution; where $X_i$ is the value for discrete data arranged in FD and the class mark of the ith class
for continuous grouped data, $f_i$ is the frequency of the ith class and $n = \sum f_i$.

Steps to calculate M.D (𝑋̃ )

▪ Find the median, 𝑋̃
▪ Find the deviations of each reading from 𝑋̃
▪ Find the arithmetic mean of the deviations, ignoring sign.

3. The mean deviation about the mode is given by

$$MD(\hat{X}) = \frac{\sum |X_i - \hat{X}|}{n}$$ for ungrouped data (individual series).

$$MD(\hat{X}) = \frac{\sum f_i |X_i - \hat{X}|}{n}$$ for discrete data arranged in FD and a grouped continuous frequency
distribution.

Steps to calculate M.D (x̂)


▪ Find the mode, x̂

▪ Find the deviations of each reading from x̂
▪ Find the arithmetic mean of the deviations, ignoring sign.
Example 4.5
The following are the numbers of visits made by ten mothers to the local doctor’s surgery: 8, 6, 5, 5, 7, 4, 5,
9, 7, 4. Find the mean deviation about the mean, the median and the mode.
Solution:
First calculate the three averages:

$$\bar{X} = 6, \quad \tilde{X} = 5.5, \quad \hat{X} = 5$$

Then take the deviations of each observation from these averages.

Xi          4     4     5     5     5     6     7     7     8     9     Total
|Xi − X̄|    2     2     1     1     1     0     1     1     2     3     14
|Xi − X̃|    1.5   1.5   0.5   0.5   0.5   0.5   1.5   1.5   2.5   3.5   14
|Xi − X̂|    1     1     0     0     0     1     2     2     3     4     14

Since the data are ungrouped, the mean deviations about the mean, median and mode are:

$$MD(\bar{X}) = \frac{\sum |X_i - \bar{X}|}{n} = \frac{14}{10} = 1.4$$

$$MD(\tilde{X}) = \frac{\sum |X_i - \tilde{X}|}{n} = \frac{14}{10} = 1.4$$

$$MD(\hat{X}) = \frac{\sum |X_i - \hat{X}|}{n} = \frac{14}{10} = 1.4$$
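The three mean deviations in Example 4.5 can be checked with a short Python sketch. It relies on the standard library's `statistics.mean`, `statistics.median` and `statistics.mode`; the helper name `mean_deviation` is an illustrative choice rather than anything defined in the chapter.

```python
from statistics import mean, median, mode


def mean_deviation(values, center):
    """Arithmetic mean of the absolute deviations of the values from center."""
    return sum(abs(x - center) for x in values) / len(values)


visits = [8, 6, 5, 5, 7, 4, 5, 9, 7, 4]   # data from Example 4.5

md_mean = mean_deviation(visits, mean(visits))       # deviations about 6
md_median = mean_deviation(visits, median(visits))   # deviations about 5.5
md_mode = mean_deviation(visits, mode(visits))       # deviations about 5
print(md_mean, md_median, md_mode)
```

For this data set all three values come out equal to 14/10, as in the worked solution; in general they differ.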

Merits of 𝑴𝑫

➢ It is well-defined, easy to compute and simple to understand.

➢ It is based on all observations.
➢ It is not greatly affected by the extreme items.
➢ It can be calculated by using any average.

Demerit of 𝑴𝑫


➢ It does not take into account the signs of the deviations of items from the average.

Remark: Of all the mean deviations taken about different averages or any arbitrary value, the mean
deviation about the median has the smallest value.

Coefficient of mean deviation (CMD):

The relative measure of mean deviation, also called the coefficient of mean deviation is obtained by
dividing mean deviation by the particular average used in computing mean deviation. Thus,

➢ CMD about the arithmetic mean is given by:

$$CMD(\bar{X}) = \frac{MD(\bar{X})}{\bar{X}}$$ where MD is the mean deviation calculated about the arithmetic mean.

➢ CMD about the median is given by:

$$CMD(\tilde{X}) = \frac{MD(\tilde{X})}{\tilde{X}}$$ in which case MD is calculated about the median of the observations.

➢ CMD about the mode is given by:

$$CMD(\hat{X}) = \frac{MD(\hat{X})}{\hat{X}}$$ in which case MD is calculated about the mode of the observations.

Example 4.6: Calculate the coefficient of mean deviation about the mean, median and mode for the data
in Example 4.5 above.
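Solution:

$$CMD(\bar{X}) = \frac{MD(\bar{X})}{\bar{X}} = \frac{1.4}{6} = 0.23$$

$$CMD(\tilde{X}) = \frac{MD(\tilde{X})}{\tilde{X}} = \frac{1.4}{5.5} = 0.25$$

$$CMD(\hat{X}) = \frac{MD(\hat{X})}{\hat{X}} = \frac{1.4}{5} = 0.28$$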

4.3.4 The Variance, Standard Deviation and Coefficient of Variation

Variance and Standard Deviation

Like the mean deviation, the variance is based on all observations in a set of data. The variance,
however, is the average of the squared deviations from the mean. Recall that the sum of squared deviations
is smallest when the deviations are taken from the mean. Squared deviations are also easier to manipulate
mathematically than absolute deviations. If we average the squared deviations from the mean and take the
square root of the result (to compensate for the fact that the deviations were squared), we obtain the
standard deviation. This overcomes the limitation of the mean deviation.


Population Variance (𝝈𝟐 )

If we divide the sum of squared deviations from the mean by the number of values in the population, we get
the population variance. This variance is the "average squared deviation from the mean".
• For ungrouped data (individual series)

$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N} = \frac{1}{N}\left[\sum_{i=1}^{N} X_i^2 - N\mu^2\right]$$

where $\mu$ is the population arithmetic mean and N is the total number of observations in the population.

• For discrete data arranged in FD and for continuous grouped data

$$\sigma^2 = \frac{\sum f_i (X_i - \mu)^2}{N} = \frac{1}{N}\left[\sum f_i X_i^2 - N\mu^2\right]$$

where $\mu$ is the population arithmetic mean, $X_i$ is the class mark of the ith class, $f_i$ is the frequency of
the ith class and $N = \sum f_i$.
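The two forms of the population variance for ungrouped data are equivalent. A short derivation, which uses only the definition $\mu = \frac{1}{N}\sum X_i$ (so that $\sum X_i = N\mu$), runs as follows:

$$\sum_{i=1}^{N} (X_i - \mu)^2 = \sum_{i=1}^{N} X_i^2 - 2\mu \sum_{i=1}^{N} X_i + N\mu^2 = \sum_{i=1}^{N} X_i^2 - 2N\mu^2 + N\mu^2 = \sum_{i=1}^{N} X_i^2 - N\mu^2$$

Dividing both sides by N gives the second form. The same steps apply to the grouped formula once each term is weighted by its frequency $f_i$.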

Sample Variance (𝑺𝟐 )
One would expect the sample variance to simply be the population variance with the population mean
replaced by the sample mean. However, one of the major uses of a statistic is to estimate the
corresponding parameter, and a sample variance computed with divisor n tends to underestimate the
population variance. To offset this, the sum of the squares of the deviations is divided by one less than
the sample size.
• For ungrouped data

$$S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} = \frac{1}{n-1}\left[\sum_{i=1}^{n} x_i^2 - n\bar{x}^2\right]$$

where $\bar{x}$ is the sample arithmetic mean and n is the total number of observations in the sample.

• For discrete data arranged in FD

If the values $x_i$ have frequencies $f_i$ (i = 1, 2, …, m), then the sample variance is given by:

$$S^2 = \frac{\sum_{i=1}^{m} f_i (x_i - \bar{x})^2}{n-1} = \frac{1}{n-1}\left[\sum_{i=1}^{m} f_i x_i^2 - n\bar{x}^2\right]$$

where $n = \sum f_i$.

• For continuous grouped data

$$S^2 = \frac{\sum f_i (x_i - \bar{x})^2}{n-1} = \frac{1}{n-1}\left[\sum f_i x_i^2 - n\bar{x}^2\right]$$

where $\bar{x}$ is the sample arithmetic mean, $x_i$ is the class mark of the ith class, $f_i$ is the frequency of the
ith class and $n = \sum f_i$.
The Standard Deviation
There is a problem with variances. Recall that the deviations were squared. That means that the units were
also squared. To get the units back the same as the original data values, the square root must be taken.
➢ Population Standard Deviation ($\sigma$)

$\sigma = \sqrt{\sigma^2}$, where $\sigma^2$ is the population variance.

➢ Sample Standard Deviation ($S$)

$S = \sqrt{S^2}$, where $S^2$ is the sample variance.

Example 4.7: Find the sample variance and standard deviation of:

xi   2   4   5   6   8
fi   2   2   3   1   2

Solution: Prepare the following table:

xi     fi     fixi    xi²    fixi²
2      2      4       4      8
4      2      8       16     32
5      3      15      25     75
6      1      6       36     36
8      2      16      64     128
Sum    10     49             279

Thus, $n = \sum f_i = 10$, $\sum f_i x_i = 49$, $\sum f_i x_i^2 = 279$ and $\bar{x} = \frac{49}{10} = 4.9$.

$$S^2 = \frac{1}{n-1}\left[\sum f_i x_i^2 - n\bar{x}^2\right] = \frac{1}{9}\left[279 - 10\left(\frac{49}{10}\right)^2\right] = \frac{1}{9}(38.9) = 4.32$$

and $S = \sqrt{4.32} = 2.08$.
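The table-based calculation in Example 4.7 can be sketched in Python as below. The function name `sample_variance_from_frequencies` is hypothetical; it implements the computational form of the sample variance with the n − 1 divisor and assumes the total frequency is at least 2.

```python
from math import sqrt


def sample_variance_from_frequencies(values, frequencies):
    """Sample variance (divisor n - 1) of values x_i with frequencies f_i."""
    n = sum(frequencies)                                             # n = sum of f_i
    sum_fx = sum(f * x for x, f in zip(values, frequencies))         # sum of f_i * x_i
    sum_fx2 = sum(f * x * x for x, f in zip(values, frequencies))    # sum of f_i * x_i^2
    sample_mean = sum_fx / n
    return (sum_fx2 - n * sample_mean ** 2) / (n - 1)


x = [2, 4, 5, 6, 8]     # values from Example 4.7
f = [2, 2, 3, 1, 2]     # their frequencies
variance_47 = sample_variance_from_frequencies(x, f)
std_dev_47 = sqrt(variance_47)
print(round(variance_47, 2), round(std_dev_47, 2))
```

For a grouped continuous distribution the same function can be used by passing the class marks as `values`.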

Example 4.8: Find the sample variance and standard deviation for the distribution:

C.I     1-5   6-10   11-15   16-20
Freq.   4     1      2       3

Solution: In a continuous F.D., $x_i$ is the class mark representing the ith class.

C.I      xi    fi    fixi    fixi²
1-5      3     4     12      36
6-10     8     1     8       64
11-15    13    2     26      338
16-20    18    3     54      972
Total          10    100     1410

Here, $n = \sum f_i = 10$, $\bar{x} = \frac{\sum f_i x_i}{n} = \frac{100}{10} = 10$ and $\sum f_i x_i^2 = 1410$, so that

$$S^2 = \frac{1}{n-1}\left[\sum f_i x_i^2 - n\bar{x}^2\right] = \frac{1}{9}\left[1410 - 10(10)^2\right] = \frac{410}{9} = 45.56$$

$$S = \sqrt{45.56} = 6.75$$

Properties of Variance & Standard Deviation

1. If a constant is added to (or subtracted from) all the values, the variance remains the same; i.e.,
for any constant k, $V(x_i \pm k) = V(x_i)$.

Example 4.9: Consider the 6 sample values $x_i$: 54, 52, 53, 50, 51 and 52.

The sample variance is $V(x_i) = 2$. Now, subtract 50 from each value to get
$y_i$: 4, 2, 3, 0, 1, 2. The variance of this new series is also 2, i.e., $V(x) = V(y) = 2$.

2. If each and every value is multiplied by a non-zero constant k, the standard deviation is
multiplied by $|k|$ and the variance is multiplied by $k^2$; i.e., $V(kx_i) = k^2 V(x_i)$.

3. Both the variance and the standard deviation give more weight to extreme values and less to
those which are near to the mean.
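The first two properties can be checked numerically on the data of Example 4.9. This sketch uses `statistics.variance` and `statistics.stdev` from the Python standard library, both of which use the n − 1 divisor; the multiplier −3 is an arbitrary value chosen here for illustration.

```python
from statistics import stdev, variance

original = [54, 52, 53, 50, 51, 52]          # sample values from Example 4.9
shifted = [x - 50 for x in original]         # subtract the constant 50
scaled = [-3 * x for x in original]          # multiply by k = -3 (illustrative)

var_original = variance(original)
var_shifted = variance(shifted)              # unchanged by the shift
var_scaled = variance(scaled)                # k^2 = 9 times the original variance
sd_ratio = stdev(scaled) / stdev(original)   # |k| = 3

print(var_original, var_shifted, var_scaled, sd_ratio)
```

A negative multiplier is used deliberately: the standard deviation scales by the absolute value of k, so it stays non-negative.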

Coefficient of Variation
The standard deviation is an absolute measure of dispersion. The corresponding relative measure is
known as the coefficient of variation (CV).


Although the standard deviation expresses the variation in the same unit as the original data, it cannot
be the sole basis for comparing two distributions. For instance, if we have a standard deviation of 10 and
a mean of 5, the values vary by an amount twice as large as the mean itself. If, on the other hand, we
have a standard deviation of 10 and a mean of 5000, the variation relative to the mean is insignificant.
Therefore, we cannot judge the dispersion of a set of data until we know the standard deviation, the mean,
and how the standard deviation compares with the mean.
The coefficient of variation is used when we want to compare the variability of two or more different
series. It is the ratio of the standard deviation to the arithmetic mean, usually expressed in percent.

$$CV = \frac{\text{Standard deviation}}{\text{Mean}} \times 100\%$$

For population data:

$$CV = \frac{\sigma}{\mu} \times 100$$

where $\sigma$ is the population standard deviation and $\mu$ is the population mean.

For sample data:

$$CV = \frac{S}{\bar{x}} \times 100$$

where $S$ is the sample standard deviation and $\bar{x}$ is the sample mean.

Remark: A distribution with a smaller coefficient of variation is said to be less variable, or more
consistent, more uniform or more homogeneous.
Example 4.10: Last semester, the students of Mathematics and Chemistry Departments took Introduction
to Statistics course. At the end of the semester, the following information was recorded.

Department           Mathematics   Chemistry
Mean score           85            65
Standard deviation   25            12

Compare the relative dispersions of the two departments’ scores using the appropriate way.
Solution:
Mathematics Department:

$$CV = \frac{S}{\bar{x}} \times 100 = \frac{25}{85} \times 100 = 29.41\%$$

Chemistry Department:

$$CV = \frac{S}{\bar{x}} \times 100 = \frac{12}{65} \times 100 = 18.46\%$$
Interpretation: Since the CV of Mathematics Department students is greater than that of Chemistry
Department students, we can say that there is more dispersion relative to the mean in the distribution of
Mathematics students’ scores compared with that of Chemistry students.
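The comparison in Example 4.10 can be reproduced with a small Python sketch. The function name `coefficient_of_variation` is an illustrative choice, and the guard against a zero mean is an added assumption, since the ratio is undefined in that case.

```python
def coefficient_of_variation(std_dev, mean):
    """Return the coefficient of variation in percent: (std_dev / mean) * 100."""
    if mean == 0:
        raise ValueError("the coefficient of variation is undefined for a zero mean")
    return std_dev / mean * 100


cv_mathematics = coefficient_of_variation(25, 85)   # Mathematics Department
cv_chemistry = coefficient_of_variation(12, 65)     # Chemistry Department

# The department with the larger CV has the more variable scores.
more_variable = "Mathematics" if cv_mathematics > cv_chemistry else "Chemistry"
print(round(cv_mathematics, 2), round(cv_chemistry, 2), more_variable)
```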
4.4 Standard Scores (Z-Scores)

A standard score for a value in a data set is obtained by subtracting the mean of the data set from
the value and dividing the result by the standard deviation of the data set. The standard score (z-score)
tells us how many standard deviations a specific value lies above (positive z-score) or below (negative
z-score) the mean of the data set.

Z-score computed from the population

$$Z = \frac{X - \mu}{\sigma}$$

Z-score computed from the sample

$$Z = \frac{X - \bar{X}}{S}$$

Example 4.11: What is the Z-score for the value of 14 in the following sample data set?

3 8 6 14 4 12 7 10

Solution:

$\bar{X} = 8$ and $S = 3.8173$; thus,

$$Z = \frac{14 - 8}{3.8173} \approx 1.57$$

The data value of 14 is located 1.57 standard deviations above the mean of 8, because the z-score is
positive.
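The z-score in Example 4.11 can be verified with the standard library. The helper name `z_score` is hypothetical; it uses `statistics.stdev`, which is the sample standard deviation with the n − 1 divisor, matching the sample formula above.

```python
from statistics import mean, stdev


def z_score(value, data):
    """Sample z-score: (value - sample mean) / sample standard deviation."""
    return (value - mean(data)) / stdev(data)


sample = [3, 8, 6, 14, 4, 12, 7, 10]   # data from Example 4.11
z_14 = z_score(14, sample)             # value above the mean, so positive
z_3 = z_score(3, sample)               # value below the mean, so negative
print(round(z_14, 2), round(z_3, 2))
```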

Example 4.12: Suppose that a student scored 66 in Statistics and 80 in Mathematics. The summary of the
scores in the two courses is given below.

Course        Average score   Standard deviation of the score
Statistics    51              12
Mathematics   72              16

In which course did the student score better as compared to his classmates?
Solution:

Z-score of the student in Statistics:

$$Z = \frac{X - \mu}{\sigma} = \frac{66 - 51}{12} = \frac{15}{12} = 1.25$$

Z-score of the student in Mathematics:

$$Z = \frac{X - \mu}{\sigma} = \frac{80 - 72}{16} = \frac{8}{16} = 0.5$$

From these two standard scores, we can conclude that the student has scored better in Statistics course
relative to his classmates than in Mathematics course.

4.5 Moments, Skewness and Kurtosis

The measures of central tendency and variation discussed in the previous sections do not reveal the entire
story about a frequency distribution. Two distributions may have the same mean and standard deviation but
differ in shape. A further description of their characteristics is therefore necessary, and it is
provided by measures of skewness and kurtosis.

4.5.1 Moments

Moments are statistical tools used in statistical investigation. The moments of a distribution are the
arithmetic means of the various powers of the deviations of items from some number. In this course, we
shall use them in the study of the skewness and kurtosis of a statistical distribution.

Moments about the origin

$$M_r = \frac{\sum X_i^r}{n}$$

where $r = 0, 1, 2, 3, \ldots$

Moments about the origin for a grouped or an ungrouped frequency distribution are given by

$$M_r = \frac{\sum f_i X_i^r}{n}$$

where $f_i$ is the frequency of $X_i$, and $X_i$ is the midpoint in the case of a grouped frequency distribution
or the class value in the case of an ungrouped frequency distribution.

Note that: $M_1 = \bar{X}$ and $M_0 = 1$.


Moments about the Mean (Central Moments)

$$M_r' = \frac{\sum (X_i - \bar{X})^r}{n}$$

Moments about the mean for a grouped or an ungrouped frequency distribution are given by

$$M_r' = \frac{\sum f_i (X_i - \bar{X})^r}{n}$$

where $f_i$ is the frequency of $X_i$, and $X_i$ is the midpoint in the case of a grouped frequency distribution
or the class value in the case of an ungrouped frequency distribution.

Note that: $M_2'$ is the variance computed with divisor $n$; it equals the sample variance $SD^2$ only if the
divisor $n$ is replaced by $n - 1$.

Moments about any arbitrary constant $A$

$$M_r' = \frac{\sum (X_i - A)^r}{n}$$

Moments about any arbitrary constant $A$ for a grouped or an ungrouped frequency distribution are given by

$$M_r' = \frac{\sum f_i (X_i - A)^r}{n}$$

Example 4.13: Find the first four moments about the mean for the following individual series

𝑋𝑖 : 3 6 8 10 18

Solution: n=5,

S.No    Xi          (Xi − X̄)        (Xi − X̄)²        (Xi − X̄)³        (Xi − X̄)⁴
1       3           -6              36               -216             1296
2       6           -3              9                -27              81
3       8           -1              1                -1               1
4       10          1               1                1                1
5       18          9               81               729              6561
Total   ∑X = 45     ∑(X − X̄) = 0    ∑(X − X̄)² = 128  ∑(X − X̄)³ = 486  ∑(X − X̄)⁴ = 7940

Thus,

$$\bar{X} = \frac{45}{5} = 9$$

$$M_1' = \frac{\sum (X_i - 9)^1}{5} = 0, \qquad M_2' = \frac{\sum (X_i - 9)^2}{5} = \frac{128}{5} = 25.6$$

$$M_3' = \frac{\sum (X_i - 9)^3}{5} = \frac{486}{5} = 97.2, \qquad M_4' = \frac{\sum (X_i - 9)^4}{5} = \frac{7940}{5} = 1588$$
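The central moments of Example 4.13 can be computed with a few lines of Python. The function name `central_moment` is illustrative; it follows the definition of moments about the mean with divisor n for an individual series.

```python
def central_moment(values, r):
    """Return the r-th moment about the mean, using divisor n."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** r for x in values) / n


series = [3, 6, 8, 10, 18]   # individual series from Example 4.13
m1, m2, m3, m4 = (central_moment(series, r) for r in range(1, 5))
print(m1, m2, m3, m4)
```

The first central moment is always zero, which makes a convenient check on the mean.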

4.5.2 Skewness

Skewness refers to lack of symmetry (or departure from symmetry) in a distribution.

➢ A skewed frequency distribution is one that is not symmetrical.

➢ Skewness is concerned with the shape of the curve not size.
A distribution is said to be symmetrical when the values are distributed evenly around the mean (the
distribution of the data below the mean mirrors the distribution above the mean). In a symmetrical
distribution, the mean, median and mode coincide (i.e., mean = median = mode).
Positively skewed distribution: if the value of mean is greater than the mode, skewness is said to be
positive. In a positively skewed distribution mean is greater than the mode and the median lies
somewhere in between mean and mode. A positively skewed distribution contains some values that are
much larger than the majority of other observations.
Negatively Skewed distribution: if the value of mode is greater than the mean, skewness is said to be
negative. In a negatively skewed distribution mode is greater than the mean and the median lies in
between mean and mode. The mean is pulled towards the low-valued item (that is, to the left). A
negatively skewed distribution contains some values that are much smaller than the majority of
observations.


Note that: In moderately skewed distributions the averages have the following relationship.

(Mean − Mode) = 3(Mean − Median)

How to check the presence of skewness in a distribution?

Skewness is present in the data if:

i) the graph is not symmetrical.

ii) the mean, median and mode do not coincide.
iii) the sum of positive and negative deviations from the median is not zero.
iv) the frequencies are not similarly distributed on either side of the mode.

Measures of skewness (𝜶𝟑 )

A measure of skewness gives a numerical expression for the degree and direction of asymmetry in a
distribution. It gives information about the shape of the distribution and the degree of variation on
either side of the central value. The three most commonly used measures of skewness are Pearson’s
coefficient of skewness, Bowley’s coefficient of skewness and the coefficient of skewness based on moments.

1. Pearson’s coefficient of skewness (Pearsonian coefficient of skewness)

The skewness of the distribution can be measured by Pearson’s coefficient of skewness ($\alpha_3$),
for which the formula is given below:

$$\alpha_3 = \frac{\text{Mean} - \text{Mode}}{\text{Standard deviation}}$$

2. Bowley’s Coefficient of Skewness

Bowley’s coefficient of skewness is based on quartiles. The formula for calculating the coefficient of
skewness is:

$$\alpha_3 = \frac{(Q_3 - Q_2) - (Q_2 - Q_1)}{Q_3 - Q_1} = \frac{Q_3 + Q_1 - 2Q_2}{Q_3 - Q_1}$$

3. Moment Coefficient of Skewness

The moment coefficient of skewness is based on moments. The formula for calculating the coefficient of
skewness is:

$$\alpha_3 = \frac{M_3'}{(M_2')^{3/2}} = \frac{M_3'}{\sigma^3}$$

where $M_r' = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^r$.

The shape of the curve is determined by the value of $\alpha_3$:

$\alpha_3 > 0$ ➔ the distribution is positively skewed (skewed to the right), i.e., mode < median < mean.
➔ Smaller observations are more frequent than larger observations, i.e., the majority of the
observations have a value below the average.

$\alpha_3 = 0$ ➔ the distribution is symmetric, i.e., mean = mode = median.

$\alpha_3 < 0$ ➔ the distribution is negatively skewed (skewed to the left), i.e., mean < median < mode.
➔ Smaller observations are less frequent than larger observations, i.e., the majority of the
observations have a value above the average.
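Pearson's and Bowley's coefficients are simple ratios, so they can be sketched directly in Python. The function names are illustrative. The quartiles passed to `bowley_skewness` are those of the data in Example 4.3 (Q1 = 18, Q3 = 28), with Q2 = 24 taken as the median of that ordered data; the inputs to `pearson_skewness` (mean 6, mode 5, standard deviation 2) are made-up numbers used only to show the call.

```python
def pearson_skewness(mean, mode, std_dev):
    """Pearson's coefficient of skewness: (mean - mode) / standard deviation."""
    return (mean - mode) / std_dev


def bowley_skewness(q1, q2, q3):
    """Bowley's coefficient of skewness: (Q3 + Q1 - 2*Q2) / (Q3 - Q1)."""
    return (q3 + q1 - 2 * q2) / (q3 - q1)


def describe(alpha3):
    """Translate the sign of a skewness coefficient into words."""
    if alpha3 > 0:
        return "positively skewed"
    if alpha3 < 0:
        return "negatively skewed"
    return "symmetric"


bowley = bowley_skewness(18, 24, 28)     # quartiles of the Example 4.3 data
pearson = pearson_skewness(6, 5, 2)      # hypothetical summary values
print(bowley, describe(bowley))
print(pearson, describe(pearson))
```

Different coefficients need not agree in size, since they are built from different summaries of the data; it is the sign that indicates the direction of skewness.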

4.5.3 Kurtosis

Kurtosis is a measure of the peakedness of a distribution. The degree of kurtosis of a distribution is
measured relative to the peakedness of a normal curve. If a curve is more peaked than the
normal curve it is called ‘leptokurtic’; if it is flatter than the normal curve it is
called ‘platykurtic’ or flat-topped. The normal curve itself is known as ‘mesokurtic’.

Measures of Kurtosis ($\alpha_4$)

The moment coefficient of kurtosis:

$$\alpha_4 = \frac{M_4'}{(M_2')^2} = \frac{M_4'}{\sigma^4}$$

The peakedness depends on the value of 𝛼4

• 𝛼4 > 3 ➔ the curve is leptokurtic,
• 𝛼4 = 3 ➔ the curve is mesokurtic,
• 𝛼4 < 3 ➔ the curve is platykurtic.

Example: Based on the following data:

𝑀′0 = 1, 𝑀′1 = -0.6, 𝑀′2 = 1.6, 𝑀′3 = -2.4, 𝑀′4 = 5.8
a/ Find the coefficient of skewness and discuss the distribution type.
b/ Find the coefficient of kurtosis and discuss the distribution type.
Solution:

a/ $$\alpha_3 = \frac{M_3'}{(M_2')^{3/2}} = \frac{-2.4}{(1.6)^{3/2}} = -1.19 < 0$$ ➔ the distribution is negatively skewed.

b/ $$\alpha_4 = \frac{M_4'}{(M_2')^2} = \frac{5.8}{(1.6)^2} \approx 2.27 < 3$$ ➔ the curve is platykurtic.

Example 4.14: Find the coefficient of skewness and the coefficient of kurtosis for the above
example 4.13.
Solution:

i) $$\alpha_3 = \frac{M_3'}{(M_2')^{3/2}} = \frac{97.2}{(25.6)^{3/2}} = \frac{97.2}{129.527} = 0.75$$

➔ the distribution is positively skewed.

ii) $$\alpha_4 = \frac{M_4'}{(M_2')^2} = \frac{1588}{(25.6)^2} = 2.42$$

➔ the curve is platykurtic.
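The moment coefficients in Example 4.14 can be checked with the following Python sketch. The function names are illustrative, and the central moments passed in are the values obtained in Example 4.13.

```python
def moment_skewness(m2, m3):
    """Moment coefficient of skewness: M3' / (M2')^(3/2)."""
    return m3 / m2 ** 1.5


def moment_kurtosis(m2, m4):
    """Moment coefficient of kurtosis: M4' / (M2')^2."""
    return m4 / m2 ** 2


def kurtosis_type(alpha4):
    """Classify a curve by comparing its kurtosis with 3, the value for the normal curve."""
    if alpha4 > 3:
        return "leptokurtic"
    if alpha4 < 3:
        return "platykurtic"
    return "mesokurtic"


alpha3 = moment_skewness(25.6, 97.2)     # central moments from Example 4.13
alpha4 = moment_kurtosis(25.6, 1588)
print(round(alpha3, 2), round(alpha4, 2), kurtosis_type(alpha4))
```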

19 pages
Final Review Worksheet-STAT 362-Final Review
No ratings yet
Final Review Worksheet-STAT 362-Final Review
24 pages
Sam Roweis Probx
No ratings yet
Sam Roweis Probx
12 pages
Media Laboratorium Virtual Pada Pembelajaran Fisika Di Era Pandemi Covid-19 Terhadap Keterampilan Proses Sains Siswa
No ratings yet
Media Laboratorium Virtual Pada Pembelajaran Fisika Di Era Pandemi Covid-19 Terhadap Keterampilan Proses Sains Siswa
8 pages
Chapter 7 Portfolio Theory & Risk Diversification
No ratings yet
Chapter 7 Portfolio Theory & Risk Diversification
13 pages
Trachnhiem
50% (2)
Trachnhiem
4 pages
Weighted Least Sq (15)
No ratings yet
Weighted Least Sq (15)
5 pages
Implementation of Linear Regression With Python
No ratings yet
Implementation of Linear Regression With Python
5 pages
Data Science - Part II (Cra 4061)
No ratings yet
Data Science - Part II (Cra 4061)
2 pages
GROUP ASSIGNMENT MGT555 (FARHANA DAN SUAIDAH) CS2907B (1)
No ratings yet
GROUP ASSIGNMENT MGT555 (FARHANA DAN SUAIDAH) CS2907B (1)
36 pages
Chapter 4 Managerial Statistics Solutions
No ratings yet
Chapter 4 Managerial Statistics Solutions
24 pages
8.estimation I - 530
100% (1)
8.estimation I - 530
22 pages
Tutorial 5.0 Discrete Random Variable 2023
No ratings yet
Tutorial 5.0 Discrete Random Variable 2023
7 pages
Mean, Median and Mode - Module1
100% (1)
Mean, Median and Mode - Module1
8 pages
Assignment 1
100% (1)
Assignment 1
6 pages
QMM Exam Assist
67% (3)
QMM Exam Assist
21 pages
8 Production, Costs, Profit
No ratings yet
8 Production, Costs, Profit
23 pages
Normal Distribution Test Multiple-Choice Questions (13 Marks)
100% (2)
Normal Distribution Test Multiple-Choice Questions (13 Marks)
6 pages
Correlation Exercises
No ratings yet
Correlation Exercises
4 pages
Arima
No ratings yet
Arima
21 pages
Understanding Effect Sizes
No ratings yet
Understanding Effect Sizes
7 pages
28619156
No ratings yet
28619156
6 pages
EC203 Tutorial 12 Time Series 16
No ratings yet
EC203 Tutorial 12 Time Series 16
4 pages
03 Mcqs Stat Mod-III
100% (1)
03 Mcqs Stat Mod-III
8 pages
Measure of Central Tendency
No ratings yet
Measure of Central Tendency
13 pages
Chapter 6 Section 4-5: Probability: Multiple Choice
No ratings yet
Chapter 6 Section 4-5: Probability: Multiple Choice
7 pages
Lecture Notes On Techniques of Integration
No ratings yet
Lecture Notes On Techniques of Integration
20 pages
TQ - Stat
100% (1)
TQ - Stat
9 pages
Pengaruh Koreksi Fiskal Atas Laporan Laba Rugi Komersial Terhadap Penghasilan Kena Pajak
No ratings yet
Pengaruh Koreksi Fiskal Atas Laporan Laba Rugi Komersial Terhadap Penghasilan Kena Pajak
22 pages
TOPIC 6 Hypothesis-Testing
No ratings yet
TOPIC 6 Hypothesis-Testing
2 pages
Chapter Three
No ratings yet
Chapter Three
15 pages
Using The T-Test: IB Biology Topic 1
No ratings yet
Using The T-Test: IB Biology Topic 1
22 pages
Basic Stats
No ratings yet
Basic Stats
6 pages
Bảng phân phối F (f Distribution)
100% (3)
Bảng phân phối F (f Distribution)
14 pages
Linear Regression
No ratings yet
Linear Regression
11 pages
MGMT E-5070 2nd Examination Solution
100% (1)
MGMT E-5070 2nd Examination Solution
8 pages
5 Neff Et Al (2020) - Self Compassion Scale For Youth
No ratings yet
5 Neff Et Al (2020) - Self Compassion Scale For Youth
15 pages
3008 Assignment 1 - Due Oct 9th Revised
No ratings yet
3008 Assignment 1 - Due Oct 9th Revised
3 pages
Height (CM) Number of Student X X.F
No ratings yet
Height (CM) Number of Student X X.F
3 pages
Math Grade 11 (Social) Note and Worksheet Week-11
No ratings yet
Math Grade 11 (Social) Note and Worksheet Week-11
5 pages
Midterm Examination # 3: Sta 113: Probability and Statistics in Engineering Tuesday, 2008 Nov. 25, 1:15 - 2:30 PM
No ratings yet
Midterm Examination # 3: Sta 113: Probability and Statistics in Engineering Tuesday, 2008 Nov. 25, 1:15 - 2:30 PM
14 pages
Stationary and Non Stationary
100% (1)
Stationary and Non Stationary
5 pages
I've Come Loaded With Statistics, For I've Noticed That A Man Can't Prove Anything Without Statistics
No ratings yet
I've Come Loaded With Statistics, For I've Noticed That A Man Can't Prove Anything Without Statistics
19 pages
Basic 2
No ratings yet
Basic 2
13 pages
MCQ 2
No ratings yet
MCQ 2
4 pages
Sample Final Exam 1
No ratings yet
Sample Final Exam 1
13 pages
Normal Distribution
No ratings yet
Normal Distribution
6 pages
SMCh03 Final
No ratings yet
SMCh03 Final
11 pages
Correlation and Regression: Six Sigma Thinking, #8
From Everand
Correlation and Regression: Six Sigma Thinking, #8
Sumeet Savant
5/5 (1)