
CHAPTER TWO - Prof. Abegunde and Dr. Oladeji

This chapter discusses the concept of measurement in statistics, emphasizing the importance of measurement scales such as nominal, ordinal, interval, and ratio scales, each with distinct properties and applications. It also explains measures of central tendency, including mean, median, and mode, which help summarize datasets and understand their distribution. The chapter highlights the advantages and disadvantages of these measures, particularly focusing on the mean's sensitivity to outliers.

CHAPTER TWO

MEASUREMENTS

Prof. Ola Abegunde and Dr. Olayide Oladeji

Department of Political Science,
Faculty of the Social Sciences,
Ekiti State University,
Ado-Ekiti, Nigeria.

1. INTRODUCTION

Measurement is the assignment of numerals to objects or events in a meaningful way, that is,
according to specific rules. It is the process through which we obtain symbols that represent
a concept, or the rules for assigning numbers to objects to represent quantities or attributes
(Ogunleye, 2005). These definitions highlight the emphasis on quantification. Therefore, we
can conclude that quantification is impossible without measurement. When measuring, we aim
to distinguish between the entities being measured, compare measurements of two different
entities, and assess the ratios of two entities based on their measurements (Ayeni, 1983).

Consequently, understanding measurement scales is important for interpreting the numbers
assigned to people, objects, and events. In statistics, a measurement scale describes the
information provided by the numbers. Measurement scales are, therefore, referred to as levels
of quantification. Quantification is a fundamental aspect of the scientific method, which
requires us to formulate a problem, define a set of hypotheses, and then strive to test them
empirically. The measurement scale is how variables are defined and categorised. A
measurement scale has properties that determine how to properly analyse the data or variable.
These properties include identity, magnitude, equal intervals, and a minimum value of zero.

Identity means that each value has a unique meaning. Magnitude means that the values have an
ordered relationship with each other. Thus, there is a specific order to the values. Equal
intervals mean that the distances between adjacent points along the scale are equal. This
means that the difference between data points one and two will be the same as between data
points five and six. A minimum value of zero presupposes that the scale has a true zero point.
For instance, a temperature in degrees Celsius can fall below zero and still have meaning, so
that scale has no true zero; but if someone or something weighs nothing, the person or thing
does not exist, so weight has a true zero (UNSW Sydney, n.d.).

2. FUNDAMENTAL LEVELS OF MEASUREMENT SCALES

Statisticians often use “levels of measurement” to differentiate between variables with
different properties. There are four fundamental types or levels of measurement scales:
nominal, ordinal, interval, and ratio scales (see Figure 2.1). Each level of measurement scale
possesses distinct properties that dictate its unique applications in statistical analysis.

Figure 2.1: Fundamental levels of measurement scales (BYJU’S, n.d.).

2.1. Nominal Scales

The nominal scale is the first level of measurement in which numbers are used as labels to
classify objects. This scale typically deals with non-numeric variables or numbers that do not
have any inherent value. In nominal scales, numbers are utilized to name or classify people,
objects, or events, such as driver’s license numbers and product serial numbers. For example,
gender serves as a nominal measurement wherein a number, such as 1 for males and 2 for
females, is employed to distinguish between genders. It is important to emphasize that these
numbers do not denote any superiority or inferiority; they are simply used for categorization.
It is equally worth noting that alternative numbers could serve the same purpose, as they do
not signify a quantity or quality. While word names may not be compatible with certain
statistical techniques, numerals can be employed in coding systems. For example, fire
departments may analyze the association between gender (where male = 1, female = 2) and
performance on physical-ability tests, using numerical scores to gauge ability (Lee, 2022).

We can identify some characteristics of the nominal scale:

i. A nominal scale variable is categorized into two or more distinct classes, with
each answer falling into one of these categories.
ii. A nominal scale is a qualitative scale where numbers are used to identify different
categories or objects.
iii. The characteristics of an object are not defined by numbers. The only permissible
use of numbers in the nominal scale is for “counting.”
iv. No mathematical operation is performed on a nominal scale.
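
The point that nominal codes only support counting can be illustrated with a minimal Python sketch. The coded responses below are hypothetical and are not taken from any study cited in this chapter; the codes 1 and 2 follow the male/female coding used above.

```python
from collections import Counter

# Hypothetical coded responses: 1 = male, 2 = female.
# The codes are labels only; adding or averaging them would be meaningless.
gender_codes = [1, 2, 2, 1, 2, 2, 1, 2]
labels = {1: "male", 2: "female"}

# Counting how often each category occurs is the permissible operation.
counts = Counter(labels[code] for code in gender_codes)

# The most frequent category (the mode) is also defined for nominal data.
most_common_label, most_common_count = counts.most_common(1)[0]

print(counts["male"], counts["female"])
print(most_common_label, most_common_count)
```

Swapping the codes (say, 7 for males and 9 for females) would leave the counts unchanged, which is why the numbers carry no quantity.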

2.2. Ordinal Scale

An ordinal scale of measurement is considered superior to the nominal scale because it has all
the qualities of the nominal scale but adds the merit of being ranked. This means that at this
level of measurement, sets of objects or people are ordered from ‘most’ to ‘least’ concerning
an attribute. However, under this scaling method, information regarding the amount of the
measured attribute that separates the objects or events is lacking (Ogunleye, 2005).
Therefore, in ordinal scales, numbers indicate rank order and the order of quality or quantity,
but they do not represent an amount of quantity or degree of quality. For example, the
number 1 means that the person (or object or event) is better than the person labelled 2;
person 2 is better than person 3, and so forth. This type of scaling does not, however, indicate
how much better the person labelled 1 is compared to the person labelled 2, and there may be
very little difference between 1 and 2. When ordinal measurement is used (rather than interval
measurement), certain statistical techniques are applicable, such as Spearman’s rank
correlation (Lee, 2022).

The characteristics of the ordinal scale include the following:

i. The ordinal scale displays the relative ranking of variables.
ii. It identifies and describes the magnitude of a variable.
iii. Along with the information provided by the nominal scale, ordinal scales provide
the rankings of those variables.
iv. The interval properties are not known.
v. Surveyors can quickly analyse the degree of agreement concerning the identified
order of variables.
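
Spearman’s rank correlation, mentioned above as a technique suited to ordinal data, can be sketched as follows. This is a minimal illustration that assumes there are no tied ranks, in which case the coefficient is 1 − 6∑d² ÷ (n(n² − 1)), where d is the difference between the two ranks given to the same item. The two sets of ranks and the function name are hypothetical.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman's rank correlation for two rankings without ties."""
    n = len(rank_a)
    # Sum of squared differences between the paired ranks.
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - (6 * d_squared) / (n * (n ** 2 - 1))

# Hypothetical rankings of five candidates by two judges (1 = best).
judge_a = [1, 2, 3, 4, 5]
judge_b = [2, 1, 3, 5, 4]

rho = spearman_rho(judge_a, judge_b)
print(round(rho, 2))
```

Only the order of the candidates enters the calculation; the size of the gap between any two candidates is never used, which matches what an ordinal scale can support.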

2.3. Interval Scale

The interval scale, which is the third level of measurement, provides a quantitative way of
measuring variables where the difference between two values holds significance. This means
that the variables are measured precisely, without any relative interpretation, but the
position of zero is arbitrary. Thus, in interval scales, numbers form a continuum and provide
information about the amount of difference, but the scale lacks a true zero. The differences
between adjacent numbers are equal or known. If zero is used, it simply serves as a reference
point on the scale but does not indicate the complete absence of the characteristic being
measured. The Fahrenheit and Celsius temperature scales are examples of interval
measurements. In those scales, 0 °F and 0 °C do not indicate an absence of temperature.

The following are some of the characteristics of the interval scale:

i. The interval scale is quantitative because it can measure the difference between
values.
ii. This scale allows for calculating the mean and median of variables.
iii. It is possible to understand the difference between variables by subtracting the
values.
iv. The interval scale is the preferred scale in Statistics because it allows for assigning
numerical values to arbitrary assessments such as feelings, calendar types, etc.
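
A short Python sketch may help show why differences are meaningful on an interval scale while ratios are not. It uses the standard Celsius to Fahrenheit conversion; the three temperatures are arbitrary illustrative values.

```python
def celsius_to_fahrenheit(c):
    """Standard conversion: F = C * 9/5 + 32."""
    return c * 9 / 5 + 32

# Three illustrative temperatures, equally spaced in Celsius.
a_c, b_c, c_c = 10.0, 20.0, 30.0
a_f, b_f, c_f = (celsius_to_fahrenheit(t) for t in (a_c, b_c, c_c))

# Equal intervals stay equal after changing the scale.
gap_1_f = b_f - a_f   # 68 - 50
gap_2_f = c_f - b_f   # 86 - 68

# Ratios do not survive the change of scale, because zero is arbitrary.
ratio_c = b_c / a_c   # 20 / 10
ratio_f = b_f / a_f   # 68 / 50

print(gap_1_f, gap_2_f)
print(ratio_c, ratio_f)
```

Saying that 20 °C is "twice as hot" as 10 °C therefore has no meaning: the same two temperatures give a ratio of 2 in Celsius and 1.36 in Fahrenheit.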

2.4. Ratio Scale

The ratio scale stands as the most advanced level of measurement, encompassing the
characteristics of nominal, ordinal, and interval scales. It enables researchers to compare
differences or intervals as well as ratios. It possesses an absolute or natural zero point,
meaning that a zero on the scale signifies a complete absence of the attribute being measured.
A tangible example of this scale is a meter rule, where the zero point denotes a total lack of
the attribute being measured. Thus, the ratio scale is distinguished as the most refined and
precise scale. It is well-suited for attributes such as weight, height, and age, which feature
zero points. For instance, the age of an unborn child is represented as zero on the ratio
scale. A person who is 1.2 metres (4 feet) tall is two-thirds as tall as a person who is
1.8 metres (6 feet) tall. Similarly, a person weighing 45.4 kg (100 pounds) is two-thirds as
heavy as a person who weighs 68 kg (150 pounds). This scale facilitates the most accurate and
comprehensive mathematical manipulation (Ogunleye, 2005).

Some characteristics of the ratio scale include:

i. The ratio scale has an absolute zero, which means it does not have negative numbers.
ii. It provides unique opportunities for statistical analysis, as values can be meaningfully
added, subtracted, multiplied, and divided.
iii. Mean, median, and mode can be calculated using the ratio scale.
iv. Additionally, the ratio scale has unique and useful properties, such as allowing
unit conversions like kilograms to pounds, metres to feet, and so on.
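
As a small illustration of why ratios are meaningful on a ratio scale, the sketch below converts the two weights used in the example above from pounds to kilograms and checks that the ratio is unchanged. The variable names are illustrative; the conversion factor is the standard definition of the pound.

```python
import math

KG_PER_POUND = 0.45359237  # exact definition of the international pound

light_lb, heavy_lb = 100, 150
light_kg = light_lb * KG_PER_POUND
heavy_kg = heavy_lb * KG_PER_POUND

# Because zero means "no weight" in both units, the ratio is the same.
ratio_lb = light_lb / heavy_lb
ratio_kg = light_kg / heavy_kg

print(round(light_kg, 1), round(heavy_kg, 1))
print(math.isclose(ratio_lb, ratio_kg))
```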

3. LOCATION AND CENTRAL TENDENCY

A measure of central tendency, also known as a measure of centre or central location, is a
summary statistic that aims to describe a dataset with a single value representing its middle or
centre of distribution (Australian Bureau of Statistics [ABS], n.d.). Central tendency is a
statistical measure that identifies the central points in a sample, often referred to colloquially
as “averages” (Lecturi, 2022). The fundamental measures of central tendency, namely the
mean, median, and mode, play a vital role in statistics. They help in pinpointing the central
value within a dataset, enabling comparisons with other data points. By doing so, these
measures provide insights into how the sample is spread out or clustered, which is crucial for
understanding the dispersion or distribution of the data. We will discuss this in more detail
later. A measure of location helps us understand where data points tend to cluster.

Figure 2.2: Measures of central tendency (Adapted from AnalystPrep, 2023).

3.1. Mean

The mean (or average) is the most popular and well-known measure of central tendency. It
can be used with discrete and continuous data, although it is most often used with continuous
data. The mean is equal to the sum of all the values of the observations in a data set divided
by the number of observations in the data set. In other words, the mean is the sum of a set of
numbers divided by n, the number of values. For example, the votes of 10 political parties in
a presidential election are 23, 21, 30, 45, 70, 13, 18, 56, 72, 82. Find the mean vote of the
political parties.

Solution:

The mean is the sum of all the votes divided by 10.
Sum = 23 + 21 + 30 + 45 + 70 + 13 + 18 + 56 + 72 + 82 = 430
The mean = 430 ÷ 10 = 43
Therefore, the mean vote of the political parties is 43.
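
The same calculation can be checked with a few lines of Python. This is only a sketch of the arithmetic above; the variable names are illustrative.

```python
# Votes of the 10 political parties from the example above.
votes = [23, 21, 30, 45, 70, 13, 18, 56, 72, 82]

total = sum(votes)              # sum of all the values
mean_vote = total / len(votes)  # divide by the number of values, n

print(total, mean_vote)
```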

The mean is usually referred to as the average value. We can represent the values to be
averaged by letters such as x, y or z. The letter n is usually used to denote the number of
values in a sample (the sample size). Using the letter x, we can say the n values in a sample
are $x_1, x_2, \dots, x_n$. Therefore, the mean is calculated as:

$$\text{Mean} = \frac{x_1 + x_2 + x_3 + \dots + x_n}{n}$$

Also, if we denote the mean by X̄ and the sum of the values by ∑x, the mean can be expressed as

$$\bar{X} = \frac{\sum x}{n}$$

Mean for Frequency Data

a. Ungrouped Data

Table 2.1 below presents the distribution of food palliatives to households in Ekiti State. We
use X to represent the number of households, f (the frequency) to represent the amount of food
received in bags, and fx to represent the number of households multiplied by the amount of
food in bags. Thus, the sum of the values is ∑fx and the total number of values is ∑f.

Table 2.1: Food Palliatives Distribution in Ekiti State

X (Households)    Frequency f (Bags of food)    fx
14                1                             14
15                2                             30
16                3                             48
17                4                             68
22                4                             88
Total (∑)         14                            248

The mean is given by

$$\bar{X} = \frac{\sum fx}{\sum f}$$

Therefore, the mean = 248 ÷ 14 ≈ 17.71.

Generally, if a value ‘x’ occurs ‘f’ times in a set of data, the sum of the values is represented
as ∑fx and the total number of values is represented as ∑f.
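
A minimal Python sketch of the ∑fx ÷ ∑f calculation is given below, using the X and f columns of Table 2.1. The variable names are illustrative. Note that the frequencies listed in the table (1, 2, 3, 4, 4) sum to 14.

```python
# X and f columns of Table 2.1.
values = [14, 15, 16, 17, 22]
freqs = [1, 2, 3, 4, 4]

# Multiply each value by its frequency and add up the products.
sum_fx = sum(f * x for x, f in zip(values, freqs))
sum_f = sum(freqs)

mean = sum_fx / sum_f
print(sum_fx, sum_f, round(mean, 2))
```

A quick sanity check on any mean is that it must lie between the smallest and largest values, here 14 and 22.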

b. Grouped Data

Consider Table 2.2 below, which gives the age brackets of candidates in the 2023 general
election in Nigeria. Note that to calculate the mean for grouped data, we must first obtain
the mid-value of each interval (age bracket). To calculate the mid-value, we add the lower
value to the upper value and divide the result by 2. For example, (52 + 56) ÷ 2 = 108 ÷ 2 = 54.

Table 2.2: Participants’ age brackets

Age          Mid-value (x)    Frequency (f)    fx
52 – 56      54               3                162
56 – 60      58               6                348
60 – 64      62               10               620
64 – 68      66               4                264
68 – 72      70               8                560
72 – 76      74               2                148
76 – 80      78               1                78
Total (∑)                     34               2180

We can calculate the mean age by applying the formula:

$$\bar{X} = \frac{\sum fx}{\sum f}$$

Therefore, X̄ = 2180 ÷ 34 ≈ 64.1.
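
The grouped-data calculation can be sketched in Python as follows. The class limits and frequencies are those of Table 2.2; the tuple layout and variable names are illustrative choices.

```python
# (lower value, upper value, frequency) for each age bracket in Table 2.2.
age_brackets = [
    (52, 56, 3), (56, 60, 6), (60, 64, 10), (64, 68, 4),
    (68, 72, 8), (72, 76, 2), (76, 80, 1),
]

# Mid-value of each bracket: (lower + upper) / 2.
mid_values = [(lower + upper) / 2 for lower, upper, _ in age_brackets]

sum_f = sum(f for _, _, f in age_brackets)
sum_fx = sum(mid * f for mid, (_, _, f) in zip(mid_values, age_brackets))

mean_age = sum_fx / sum_f
print(sum_f, sum_fx, round(mean_age, 1))
```

Because every candidate in a bracket is treated as if aged exactly at the mid-value, the result is an estimate of the mean rather than the exact mean of the raw ages.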

Advantages of Mean

i. The mean uses every value in the data and hence is a good representative of the
data. The irony is that, most of the time, this value never appears in the raw
data.
ii. Repeated samples drawn from the same population tend to have similar means.
The mean is therefore the measure of central tendency that best resists
fluctuation between different samples.
iii. It is closely related to the standard deviation, the most common measure of
dispersion.

Disadvantages of Mean

i. The main disadvantage of the mean is that it is sensitive to extreme
values/outliers, especially when the sample size is small.
ii. It is not an appropriate measure of central tendency for skewed distributions.
iii. The mean cannot be calculated for nominal or non-numerical ordinal data.
iv. Even though the mean can be calculated for numerical ordinal data, it often
does not give a meaningful value.
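
The sensitivity of the mean to outliers can be seen in a small Python sketch that uses the standard library `statistics` module. The data set is hypothetical and chosen only to make the effect visible.

```python
from statistics import mean, median

# Hypothetical small sample, then the same sample with one extreme value.
base = [10, 12, 11, 13, 12]
with_outlier = base + [100]

mean_base = mean(base)
mean_outlier = mean(with_outlier)

median_base = median(base)
median_outlier = median(with_outlier)

print(mean_base, round(mean_outlier, 2))
print(median_base, median_outlier)
```

A single extreme value more than doubles the mean of this small sample, while the median stays at 12. This is why the median is often preferred for skewed data.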

3.2. Median

The median is a measure of central location that, unlike the mean, remains unaffected by
extremely large or small data values. To calculate the median, arrange the data values in
ascending or descending order. If there is an odd number of data values, the median is the
middle value; if there is an even number of data values, the median is the average of the two
middle values. For example, when the number n of observations is odd and the observations are
arranged in ascending order, the median is the ½(n + 1)th observation, that is, the middle
value. Find the median of the following scores: 44, 40, 79, 42, 51, 59, 71, 44, 60, 65, 45.

Solution:

We need to arrange the scores in ascending order: 40, 42, 44, 44, 45, 51, 59, 60, 65, 71, 79.

Since n = 11, the median is the ½(11 + 1)th score = ½(12)th score = 6th score.

Therefore, the median score is 51.

On the other hand, for an even number of observations, the median is the mean of the two
middle values. If n is even, the median is the sum of the ½(n)th and the ½(n + 2)th
observations divided by 2. For example, find the median of the attendance for GST113 if
the attendances in twelve lectures are: 40, 32, 37, 30, 24, 40, 38, 35, 40, 28, 32, and 37.

Solution:

We arrange the attendances in ascending order: 24, 28, 30, 32, 32, 35, 37, 37, 38, 40, 40, 40.

$$\text{Median} = \frac{\tfrac{1}{2}(12)\text{th} + \tfrac{1}{2}(12 + 2)\text{th}}{2} = \frac{6\text{th} + 7\text{th}}{2}$$

$$\text{Median} = \frac{35 + 37}{2} = \frac{72}{2} = 36$$
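
Both rules, for odd and for even n, can be sketched as a short Python function. The function name is illustrative; the two data sets are the scores and the GST113 attendances from the examples above.

```python
def median(values):
    """Median by position: middle value (odd n) or mean of the two middle values (even n)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the ((n + 1) / 2)th observation (index mid, counting from 0).
        return ordered[mid]
    # Even n: mean of the (n / 2)th and ((n + 2) / 2)th observations.
    return (ordered[mid - 1] + ordered[mid]) / 2

scores = [44, 40, 79, 42, 51, 59, 71, 44, 60, 65, 45]
attendance = [40, 32, 37, 30, 24, 40, 38, 35, 40, 28, 32, 37]

print(median(scores), median(attendance))
```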

Median of Grouped Data

The median of grouped data is located at the (n/2)th observation, whether n is odd or even. To
calculate the median of grouped data, we use the following steps:

1. Determine the class intervals. Grouped data is usually presented in the form of
a frequency table, where the data is divided into intervals or classes.
2. Find the cumulative frequency. Calculate the cumulative frequency of each
class interval by adding the frequency of that interval to the cumulative
frequency of the previous interval.
3. Identify the median class. The median class is the class interval that contains
the median value. It can be found by dividing the total frequency by 2 (n/2)
and locating the first class interval whose cumulative frequency is equal to or
exceeds this value.
4. Calculate the median. Once the median class has been identified, use the
following formula to calculate the median: Median = L + ((n/2 - CF) / f) x i
where:
• L is the lower limit of the median class
• n is the total frequency
• CF is the cumulative frequency of the class preceding the median class
• f is the frequency of the median class
• i is the class interval width

For example, Table 2.3 shows the weight in kilograms of children and teenagers attending a
primary health centre in Ado-Ekiti:

Table 2.3: Weight Distribution

Weight (Class Intervals)    Frequency    Cumulative Frequency
10 – 19                     5            5
20 – 29                     12           17
30 – 39                     18           35
40 – 49                     8            43
50 – 59                     7            50

Solution:

The total frequency is 50. We determine the median class by dividing 50 by 2 to get 25. The
30 – 39 interval is the first whose cumulative frequency (35) equals or exceeds 25, so it is
the median class. The cumulative frequency of the class preceding it is 17 (5 + 12). The lower
limit of the median class is 30, the frequency of the median class is 18, and the class
interval width is 10. Applying the formula [L + ((n/2 - CF) / f) x i], we get:

Median = 30 + ((25 - 17)/18) x 10

= 30 + (8/18) x 10

= 30 + 4.44

Median = 34.44

Therefore, the median of the weight distribution (set of grouped data) is 34.44 kg.
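
The four steps for grouped data can be sketched in Python as below. The function name and the (lower limit, frequency) layout are illustrative assumptions. Following the text, L is taken as the stated lower limit of the median class; some textbooks use the lower class boundary instead, which would shift the result slightly.

```python
def grouped_median(classes, width):
    """Median = L + ((n/2 - CF) / f) * i for classes given as (lower_limit, frequency)."""
    n = sum(f for _, f in classes)
    half = n / 2
    cumulative = 0  # cumulative frequency of the classes before the current one (CF)
    for lower, f in classes:
        # The median class is the first class whose cumulative frequency reaches n/2.
        if cumulative + f >= half:
            return lower + ((half - cumulative) / f) * width
        cumulative += f
    raise ValueError("the frequency table is empty")

# Lower limits and frequencies from Table 2.3; class width i = 10.
weights = [(10, 5), (20, 12), (30, 18), (40, 8), (50, 7)]

median_weight = grouped_median(weights, 10)
print(round(median_weight, 2))
```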

Exercise: Calculate the median for the following data:

Table 2.4: Students’ marks in FSS001

Marks       Number of Students
0 – 20      6
20 – 40     20
40 – 60     37
60 – 80     10
80 – 100    7

3.3. Mode

While the median is good for judging relative standing, it is unsuitable for qualitative data.
The mode can be used with both qualitative and quantitative data. The mode is the value
(observation) that appears most often in a dataset. A dataset may or may not have a mode: if
all the observations in a dataset are different and no value occurs more often than the
others, the dataset has no mode. A dataset with one mode is called unimodal. When there are
two modes, it is called bimodal. If the distribution has three most frequently occurring
values, it is termed trimodal. In grouped data, the interval with the highest frequency is
called the modal interval (there may be more than one). The mode has two essential
advantages. One, it does not require any calculation, except for grouped data; it is simply
observed. Two, it can be determined for qualitative or nominal data.

Example: The 23 meetings of a political party were attended by 24, 21, 25, 28, 25, 25, 25, 27,
23, 26, 32, 23, 25, 23, 30, 36, 22, 20, 29, 28, 29, 30, and 25 of its members. Determine the
mode.

From the data, 25 occurs six times, more often than any other value, and thus is the modal
attendance.
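
Counting by hand is error-prone for longer lists, so the tally can be checked with a short Python sketch. It uses `collections.Counter` from the standard library; the variable names are illustrative. Collecting every value that reaches the highest count also reveals whether a data set is unimodal, bimodal, and so on.

```python
from collections import Counter

# Attendance at the 23 meetings from the example above.
attendance = [24, 21, 25, 28, 25, 25, 25, 27, 23, 26, 32, 23,
              25, 23, 30, 36, 22, 20, 29, 28, 29, 30, 25]

counts = Counter(attendance)
highest = max(counts.values())

# All values that occur 'highest' times; one entry means the data is unimodal.
modes = sorted(value for value, count in counts.items() if count == highest)

print(modes, highest)
```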

On the other hand, to find the mode for grouped data, we can follow these steps:

i. Determine the midpoint of each interval in the grouped data.
ii. Find the frequency for each interval.
iii. Identify the interval with the highest frequency.
iv. The mode is the midpoint of the interval with the highest frequency.

For example, let us say we have age-grades attending community meetings as presented in
the following grouped data:

Age-Grades (Intervals)    Frequency
10 – 19                   6
20 – 29                   12
30 – 39                   15
40 – 49                   9
50 – 59                   3

Solution:

Following the above steps, we must identify the interval with the highest frequency, which is
the interval 30 – 39 with a frequency of 15. The midpoint of this interval is
(30 + 39) ÷ 2 = 34.5. Thus, the mode for this grouped data is 34.5.
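
The midpoint method for grouped data can be sketched in Python as follows, using the age-grade table above. The tuple layout and names are illustrative. If two intervals tied for the highest frequency, `max` would return only the first, so this sketch assumes a single modal interval.

```python
# (lower limit, upper limit, frequency) for each age-grade interval.
age_grades = [(10, 19, 6), (20, 29, 12), (30, 39, 15), (40, 49, 9), (50, 59, 3)]

# Modal interval: the interval with the highest frequency.
lower, upper, freq = max(age_grades, key=lambda row: row[2])

# The mode is taken as the midpoint of the modal interval.
modal_midpoint = (lower + upper) / 2

print((lower, upper), freq, modal_midpoint)
```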

4. DISPERSION AND VARIABILITY

Measures of dispersion describe the variability within data. Dispersion is a statistical
concept that conveys the degree to which data are scattered. Measures of dispersion are
non-negative numeric values used to gauge the homogeneity or heterogeneity of given data. If
the data points within a data set are identical, the value of the measure of dispersion will
be 0. As the variability within the data increases, the value of the measure of dispersion
increases correspondingly. There are various ways of measuring the dispersion of a data set,
including the range, variance, and standard deviation.

4.1. Range

The range is the difference between the largest and smallest values (observations) in a data
set. It is a measure of the spread or dispersion of the data. To calculate the range, we can
adopt the formula: H – S, where H is the largest value and S is the smallest value in a data
set. For instance, the scores of students in the POS 101 test are given as: 5, 8, 4, 5, 7, 6, 5, 3,
and 9. Applying the formula, the range is 9 – 3 = 6. Thus, the range of the score is 6.

The measure of the range is simple to calculate. The major disadvantage of the range is that,
since it only takes account of the two extreme values in a data set, it cannot be said to be
representative of the distribution (Akode, 2005).
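
A brief Python sketch of the H – S formula is given below, using the POS 101 scores from the example. The second list is hypothetical and is included only to illustrate the disadvantage just described: two very different distributions can share the same range.

```python
# POS 101 test scores from the example above.
scores = [5, 8, 4, 5, 7, 6, 5, 3, 9]
score_range = max(scores) - min(scores)  # H - S

# Hypothetical scores with the same extremes but a very different shape.
clustered = [3, 3, 3, 3, 3, 3, 3, 3, 9]
clustered_range = max(clustered) - min(clustered)

print(score_range, clustered_range)
```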

4.2. Variance

The variance is the average of the squared differences from the mean. It assesses the degree
of spread among all the data points in a data set: the further the values lie from the mean,
the larger the variance. Together with the standard deviation, defined as the square root of
the variance, it is one of the most commonly employed measures of dispersion.

For example, let us calculate the variance of this data set: 2, 7, 3, 12, and 9.

First, we need to calculate the mean:

$$\text{Mean} = \frac{2 + 7 + 3 + 12 + 9}{5} = \frac{33}{5} = 6.6$$

Then, let us take each value in the data set, subtract the mean, and square the difference. For
instance, for the first value, we have:

$$(2 - 6.6)^2 = 21.16$$

We can now add the squared differences for all values:

21.16 + 0.16 + 12.96 + 29.16 + 5.76 = 69.20

Then, we will divide the sum by the number of data points:

69.20 ÷ 5 = 13.84.

Therefore, the variance of the given data set is 13.84.
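
The steps above can be sketched in Python as follows. The sum is divided by n, the number of data points, which is usually called the population variance; the variable names are illustrative.

```python
data = [2, 7, 3, 12, 9]

# Step 1: the mean.
mean = sum(data) / len(data)

# Step 2: squared difference of each value from the mean.
squared_diffs = [(x - mean) ** 2 for x in data]

# Step 3: add the squared differences and divide by the number of data points.
variance = sum(squared_diffs) / len(data)

print(round(mean, 1), [round(d, 2) for d in squared_diffs], round(variance, 2))
```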

4.3. Standard Deviation

The standard deviation is the square root of the variance. It measures the amount of variation
or dispersion of a set of data values from the mean. It is widely used in statistics to indicate
the variability or spread of a data set. Simply put, standard deviation measures how far apart
numbers are in a data set.

To calculate the standard deviation, we must follow all the steps in calculating variance and
find the square root of the answer – the square root of variance. For instance, the standard
deviation for the data set we used under the variance above is:

$$\text{SD} = \sqrt{13.84} \approx 3.72$$

Therefore, the standard deviation of the given data set is 3.72.
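
The result can be cross-checked with Python’s standard library `statistics` module. In this sketch, `pstdev` divides by n, as in the worked example, while `stdev` divides by n − 1, a common alternative when the data are a sample from a larger population. The comparison is added here for illustration and is not part of the worked example.

```python
from statistics import pstdev, stdev

data = [2, 7, 3, 12, 9]

# Divides the sum of squared differences by n (matches the worked example).
population_sd = pstdev(data)

# Divides by n - 1 instead, so the result is slightly larger.
sample_sd = stdev(data)

print(round(population_sd, 2), round(sample_sd, 2))
```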

References

Akode, T.O. (2005). Measures of dispersion. In ’Dipo Kolawole (ed.), Social statistics and
data processing. Ado-Ekiti: Faculty of the Social Sciences, University of Ado-Ekiti.

AnalystPrep (2023). Measures of central tendency and location. July 3. Last accessed on
26/9/2024 at
https://analystprep.com/cfa-level-1-exam/quantitative-methods/measures-of-central-tendency-and-location/?gad_source=1&gclid=Cj0KCQjwo8S3BhDeARIsAFRmkONfCoGNvoBSA5HN2rKdDIgefM72QI7DKtUDLYJIjmw984kbiXwqdT4aAp-yEALw_wcB.

Anderson, D. R., Sweeney, D. J. and Williams, T. A. (2024, August 31). Statistics.
Encyclopedia Britannica. https://www.britannica.com/science/statistics

Ayeni, B. (1983). An introductory course in statistical and mathematical methods for
geography and planning. Ibadan: Department of Geography and Planning, University
of Ibadan.

BYJU’S (n.d.). Scales of measurement. Last accessed on 26/09/2024 at
https://byjus.com/maths/scales-of-measurement/.

Lecturi (2022). Measure of central tendency and dispersion. September 1. Last accessed on
26/09/2024 at https://app.lecturio.com/#/article/3075.

Lee, J. Ann (2022, August 15). Measurement scale. Encyclopedia Britannica.
https://www.britannica.com/topic/measurement-scale.

Manikandan S. (2011). Measures of central tendency: The mean. Journal of Pharmacology
and Pharmacotherapeutics; 2:140-2.

Ogunleye, O.S. (2005). Scales of measurement (Chapter Six). In ’Dipo Kolawole (ed.),
Social statistics and data processing. Ado-Ekiti: Faculty of the Social Sciences,
University of Ado-Ekiti.

Omotoso, F. (2005). Measures of central tendency (Chapter Eight). In ’Dipo Kolawole (ed.),
Social statistics and data processing. Ado-Ekiti: Faculty of the Social Sciences,
University of Ado-Ekiti.

Statistics Canada (2021). Measures of dispersion. September 2. Last accessed on 26/9/2024
at https://www150.statcan.gc.ca/n1/edu/power-pouvoir/ch12/5214891-eng.htm.

UNSW Sydney (n.d.). Types of data and the scales of measurement. Last accessed on
26/09/2024 at https://studyonline.unsw.edu.au/blog/types-of-data.
