Chapter 2 Measures of Location
A statistical graph is a tool used to learn about the shape or distribution of a sample or a
population. A graph can be a more effective way of presenting data than a mass of numbers
because we can see where data cluster and where there are only a few data values.
Newspapers and the Internet use graphs to show trends and to enable readers to compare facts
and figures quickly. Statisticians often graph data first to get a picture of the data.
There are no strict rules concerning which graphs to use. Two graphs that are used to display
PIE CHART: Data are represented by sectors in a circle and are proportional in size to the
percent of individuals in each category. When creating a pie chart, each slice should be labeled
Example:
Protein 6.1g
Fats 34.2g
Carbohydrate 48.1g
[Figure: Pie chart showing the nutrients per 100 g. Sectors: Protein, Fats, Carbohydrate, Dietary Fibre, Vitamin and Minerals; sector labels: 5%, 6%, 11%, 32%, 46%.]
BAR CHART: The length of the bar for each category is proportional to the number or percent
of individuals in each category. Bars may be vertical or horizontal. For vertical bars, the
categories are on the x-axis and frequency or relative frequency on the y-axis.
Example: The data below relate to people taking out mortgages. Draw an appropriate bar chart for
Bungalow 10
Detached house 19
Semi-detached 31
Terraced house 31
Converted flat 3
[Figure: Bar chart showing the buyers' information. Vertical axis: All Buyers (0 to 35); categories on the horizontal axis: Bungalow, Detached house, Semi-detached, Terraced house, Purpose built flat, Converted flat.]
HISTOGRAM
A histogram consists of contiguous bars. It has both a horizontal axis and a vertical axis. The
horizontal axis is labeled with what the data represents (for instance, distance from your home to
school) and uses the class boundaries. The vertical axis is labeled either frequency or
1. Create class boundaries on the grouped frequency distribution. Choose a starting point for the
first interval to be less than the smallest data value. A convenient starting point is a lower value
carried out to one more decimal place than the value with the most decimal places. For example,
if the value with the most decimal places is 6.1 and this is the smallest value, a convenient
starting point is 6.05 i.e (6.1 – 0.05 = 6.05). We say that 6.05 has more precision. If all the data
happen to be integers and the smallest value is two, then a convenient starting point is 1.5 i.e (2 –
0.5 = 1.5). Also, when the starting point and other boundaries are carried to one additional
3. Draw bars whose heights equal the frequency of each class interval, between its class boundaries.
Note: The plot of frequency against interval (class boundary) is called a histogram.
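The boundary step described above can be sketched in Python. This is a minimal illustration, not part of the original notes: the function name `class_boundaries` and the `gap` parameter are assumptions, and the class limits are taken from the TV programme example.

```python
def class_boundaries(class_limits, gap=1.0):
    """Convert (lower, upper) class limits into class boundaries.

    `gap` is the distance between one class's upper limit and the next
    class's lower limit (1 for integer data); half of it is moved to
    each side so that adjacent classes share a boundary.
    """
    half = gap / 2
    return [(lower - half, upper + half) for lower, upper in class_limits]


# Integer class limits, so the starting point is 10 - 0.5 = 9.5
limits = [(10, 14), (15, 19), (20, 24)]
print(class_boundaries(limits))
# [(9.5, 14.5), (14.5, 19.5), (19.5, 24.5)]
```

For data recorded to one decimal place, the same sketch would be called with `gap=0.1`, which corresponds to subtracting 0.05 from the smallest value.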
Example: 100 people were asked to record how many TV programmes they watched in a week.
The results obtained are shown below. Draw a histogram to illustrate the data.
Example
10- 14 16
15- 19 36
20- 24 21
25- 29 12
30- 34 9
35- 39 3
40- 44 0
[Figure: Histogram showing numbers of viewers. Vertical axis: No of Viewers (0 to 40); horizontal axis: class boundaries 4.5 – 9.5, 9.5 – 14.5, 14.5 – 19.5, 19.5 – 24.5, 24.5 – 29.5, 29.5 – 34.5, 34.5 – 39.5, 39.5 – 44.5.]
Stem-and-Leaf Graph:
The stem-and-leaf graph or stem-plot, comes from the field of exploratory data analysis. It is a
2. Divide each observation of data into a stem and a leaf. The leaf consists of a final
significant digit.
For instance, number 23 has stem two and leaf three. The number 432 has stem 43 and leaf two.
Likewise, the number 5,432 has stem 543 and leaf two. The decimal 9.3 has stem nine and
leaf three.
3. Write the stems in a vertical line from smallest to largest. Draw a vertical line to the right of
the stems. Then write the leaves in increasing order next to their corresponding stem.
Example 1
Consider the following data which give the marks scored by 14 pupils in a quiz test.
27, 36, 24, 17, 35, 18, 23, 25, 34, 25, 41, 18, 22, 24
Solution
Arranging in order of magnitude: 17, 18, 18, 22, 23, 24, 24, 25, 25, 27, 34, 35, 36, 41
Stem Leaf
1 7, 8, 8
2 2, 3, 4, 4, 5, 5, 7
3 4, 5, 6
4 1
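As a rough check of Example 1, the stem-and-leaf construction can be sketched in Python. The helper name `stem_and_leaf` is hypothetical, and the sketch assumes non-negative whole numbers with the last digit used as the leaf.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return {stem: [leaves]} for non-negative integers (leaf = last digit)."""
    plot = defaultdict(list)
    for value in sorted(values):          # sorting puts leaves in increasing order
        stem, leaf = divmod(value, 10)    # e.g. 23 -> stem 2, leaf 3
        plot[stem].append(leaf)
    return dict(plot)


marks = [27, 36, 24, 17, 35, 18, 23, 25, 34, 25, 41, 18, 22, 24]
for stem, leaves in stem_and_leaf(marks).items():
    print(stem, "|", ", ".join(str(leaf) for leaf in leaves))
```

Decimal data such as the distances in Example 2 could be handled the same way after multiplying each value by 10, so that the tenths digit becomes the leaf.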
Example 2
The data are the distances (in km) from a home to local supermarkets. Create a stem-plot using
the data:
1.1; 1.5; 2.3; 2.5; 2.7; 3.2; 3.3; 3.3; 3.5; 3.8; 4.0; 4.2; 4.5; 4.5; 4.7; 4.8; 5.5; 5.6; 6.5; 6.7;
12.3
Solution
The given data is arranged in ascending order of magnitude. To plot Stem and leaf diagram,
Stem Leaf
1 1, 5
2 3, 5, 7
3 2, 3, 3, 5, 8
4 0, 2, 5, 5, 7, 8
5 5, 6
6 5, 7
10
11
12 3
From the diagram, values are most concentrated around 3 and 4 kilometers; also, the value 12.3 is an
Measures of location, also known as measures of central tendency, are used to describe the centre
of a data set. The most widely used measures of the "center" of a data set are the MEAN (i.e. the
average), the MEDIAN and the MODE. To calculate the mean weight of 10 students, add the weights
of the 10 students together and divide by 10. To find the median weight of the 10 students, order the
data and find the number that splits the data into two equal parts. The median is generally a
better measure of the center when there are extreme values or outliers because it is not affected
by the precise numerical values of the outliers. The mean is the most common measure of the
center. The mode is the data value with the highest frequency, i.e. the value that occurs most often.
ARITHMETIC MEAN
Adding all the observations and dividing the sum by the number of observations gives the
arithmetic mean. The symbol used to represent the sample mean is an x with a bar over it
(pronounced “x bar”), i.e. $\bar{x}$. The Greek letter $\mu$ (pronounced "mew") represents the population mean.
Examples:
The number of books checked out from the library by 15 students is as follows:
7, 0, 5, 1, 2, 6, 1, 2, 4, 0, 3, 5, 6, 3, 8
$\bar{x} = \frac{\sum x}{n} = \frac{53}{15} = 3.533$
The formula given above is the basic definition of the arithmetic mean for ungrouped
data.
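The library example can be checked with a few lines of Python. This is only a sketch of the ungrouped-data definition; the variable names are illustrative.

```python
# Number of books checked out by each of the 15 students
books = [7, 0, 5, 1, 2, 6, 1, 2, 4, 0, 3, 5, 6, 3, 8]

total = sum(books)        # sum of all observations
n = len(books)            # number of observations
mean = total / n          # arithmetic mean for ungrouped data

print(total, n, round(mean, 3))  # 53 15 3.533
```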
$\bar{x} = \frac{\sum fx}{\sum f}$,
where x is the mid-point of various classes, f is the frequency of each class. The calculation of
Example
The table below gives the marks of 58 students in Statistics. Calculate the average marks of this
group.
Marks No of Students
0 – 9 3
10 – 19 8
20 – 29 11
30 – 39 15
40 – 49 12
50 – 59 6
60 – 69 2
70 – 79 1
Solution:
Marks Mid-point (x) Frequency (f) fx
0 – 9 4.5 3 13.5
10 – 19 14.5 8 116
20 – 29 24.5 11 269.5
30 – 39 34.5 15 517.5
40 – 49 44.5 12 534
50 – 59 54.5 6 327
60 – 69 64.5 2 129
70 – 79 74.5 1 74.5
Total 58 1981
$\bar{x} = \frac{\sum fx}{\sum f} = \frac{1981}{58} = 34.155$, approximately 34 marks
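The grouped-data calculation can be reproduced with the short Python sketch below. It assumes a frequency of 3 for the 0 – 9 class, which is the value that makes the frequencies total 58 and the fx column total 1981, as in the totals row of the solution; the variable names are illustrative.

```python
# (lower limit, upper limit) of each class and its frequency
classes = [(0, 9), (10, 19), (20, 29), (30, 39),
           (40, 49), (50, 59), (60, 69), (70, 79)]
freqs = [3, 8, 11, 15, 12, 6, 2, 1]   # first frequency is an assumption (see above)

# The mid-point of each class stands in for every value in that class
midpoints = [(low + high) / 2 for low, high in classes]

sum_f = sum(freqs)
sum_fx = sum(f * x for f, x in zip(freqs, midpoints))
grouped_mean = sum_fx / sum_f

print(sum_f, sum_fx, round(grouped_mean, 3))  # 58 1981.0 34.155
```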
It may be noted that the mid-point of each class is taken as a good approximation of
When the values are extremely large and/or in fractions, the use of the above formula or method
could be cumbersome. The arithmetic mean formula using the assumed mean method is given below:
$\bar{x} = A + \frac{\sum fd}{\sum f}$
f = frequency, and
To determine the value of the assumed mean, one may choose any value; however, it is best to
avoid extreme values, that is, values that are too small or too high, to simplify calculations. A
Example:
Consider the example given above; calculate the average marks obtained by 58 students using
Marks Mid point(x) frequency d= x-A fd
60 – 69 64.5 2 29.5 59
Total 58
Note that we have taken an arbitrary assumed mean A = 35 (a round value close to 34.5, the midpoint
of the class with the highest frequency) and computed the deviations of the midpoints from it
(d = x − A). In other words, the assumed mean has been subtracted from each mid-point value and
the result is shown in the column labeled d.
$\bar{x} = A + \frac{\sum fd}{\sum f}$
$= 35 + \frac{-49}{58}$
$= 35 + (-0.8448)$
$= 34.155$
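A short Python sketch of the assumed mean method is given here as a cross-check. It reuses the class mid-points and assumes the frequencies 3, 8, 11, 15, 12, 6, 2, 1 (total 58); the function name `assumed_mean` is hypothetical. Whatever value of A is chosen, the result should agree with the direct calculation 1981/58.

```python
def assumed_mean(midpoints, freqs, A):
    """Mean of grouped data via deviations d = x - A from an assumed mean A."""
    sum_f = sum(freqs)
    sum_fd = sum(f * (x - A) for x, f in zip(midpoints, freqs))
    return A + sum_fd / sum_f


midpoints = [4.5, 14.5, 24.5, 34.5, 44.5, 54.5, 64.5, 74.5]
freqs = [3, 8, 11, 15, 12, 6, 2, 1]   # assumed frequencies, total 58

print(round(assumed_mean(midpoints, freqs, A=35), 3))    # 34.155
print(round(assumed_mean(midpoints, freqs, A=34.5), 3))  # 34.155
```

The choice of A only changes the size of the intermediate numbers, which is why a value near the middle of the data keeps the arithmetic simple.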
MEDIAN
The median is defined as the value of the middle item (or the mean of the values of the
two middle items) when the data are arranged in ascending or descending order of magnitude.
descending order of magnitude, the median is the middle value if n is odd. When n is even, the
Example:
We have to first arrange it in either ascending or descending order. These figures are arranged in
Since the number of observations is odd, to find the value of the middle item, we compute its position:
$\frac{n + 1}{2} = \frac{10}{2} = 5$
This implies that the 5th item of the data arranged in ascending order is the median. This
happens to be 15.
Suppose the observations are: 3, 5, 7, 10, 15, 18, 19, 20, 22, 25.
Applying the above formula, the median occupies position $(10 + 1)/2 = 5.5$.
Here, we have to take the average of the values of the 5th and 6th items. This means an average of 15 and 18, i.e. $(15 + 18)/2 = 16.5$.
Note that $\frac{n+1}{2}$ is not the formula for the median; it merely indicates the position of the median,
namely, the number of items we have to count until we arrive at the item whose value is the
median.
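The position rule for odd and even n can be sketched in Python as follows. The function name `median` is illustrative, and the nine-value list is an assumption: it is the ten-value list from the text with its last value dropped, chosen so that the 5th item is 15.

```python
def median(values):
    """Middle value of the ordered data; mean of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # odd n: position (n + 1) / 2, which is index `mid` when counting from 0
        return ordered[mid]
    # even n: average of the items on either side of position (n + 1) / 2
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([3, 5, 7, 10, 15, 18, 19, 20, 22]))      # 15
print(median([3, 5, 7, 10, 15, 18, 19, 20, 22, 25]))  # 16.5
```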
Median = $L_1 + \left( \frac{\frac{N}{2} - C.F_b}{F_m} \right) C$
N = Total frequency
Example:
Class 5–9 10–14 15–19 20–24 25–29 30–34 35–39 40–44
frequency 3 8 11 15 12 6 2 1
Solution
Class Frequency Cumulative frequency
5 – 9 3 3
10 – 14 8 11
15 – 19 11 22
20 – 24 15 37
25 – 29 12 49
30 – 34 6 55
35 – 39 2 57
40 – 44 1 58
Total N = 58
First, determine the class in which the median lies, i.e.
$\frac{N}{2} = \frac{58}{2} = 29$
From the cumulative frequency column, locate where 29 lies; it is observed that it lies in the
20 – 24 class. Thus, the lower class boundary of the median class is
calculated as
L1 = 20 – 0.5 = 19.5
C.Fb = 22
Fm = 15
C = 5 and is calculated using this simple counting technique i.e from the interval 5 -9 we have 5,
Median = $19.5 + \left( \frac{29 - 22}{15} \right) \times 5$
= 19.5 + 2.333
= 21.833
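The grouped median calculation above can be sketched in Python. The function name `grouped_median` and its parameters are assumptions made for illustration; the sketch assumes equal class widths and integer class limits, so each lower boundary is the lower limit minus 0.5.

```python
def grouped_median(lower_limits, freqs, width, gap=1.0):
    """Median of grouped data: L1 + ((N/2 - C.Fb) / Fm) * C."""
    half = sum(freqs) / 2               # N / 2
    cum_before = 0                      # C.Fb: cumulative frequency before the class
    for lower, f in zip(lower_limits, freqs):
        if cum_before + f >= half:      # first class whose cumulative frequency reaches N/2
            boundary = lower - gap / 2  # L1: lower class boundary
            return boundary + (half - cum_before) / f * width
        cum_before += f
    raise ValueError("no data")


lower_limits = [5, 10, 15, 20, 25, 30, 35, 40]
freqs = [3, 8, 11, 15, 12, 6, 2, 1]
print(round(grouped_median(lower_limits, freqs, width=5), 3))  # 21.833
```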
MODE
It is another measure of central tendency. It is the value that occurs most often in a distribution or data
set.
6, 8, 5, 11, 7, 9, 4, 9
There are eight observations in the data above, but 9 appears most often. Therefore, the mode is 9.
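For ungrouped data the mode can be found by counting occurrences, as in the Python sketch below. The helper name `modes` is hypothetical; it returns a list because a data set may have more than one value tied for the highest frequency.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the highest frequency."""
    counts = Counter(values)             # value -> number of occurrences
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]


print(modes([6, 8, 5, 11, 7, 9, 4, 9]))  # [9]
```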
Mode = $L_1 + \left( \frac{\Delta_1}{\Delta_1 + \Delta_2} \right) C$
Where
$\Delta_1$ = Difference between the frequency of the modal class and the class before it
$\Delta_2$ = Difference between the frequency of the modal class and the class after it.
Example
Class 11 - 20 21 - 30 31 - 40 41 - 50 51 - 60 61 - 70
Frequency 6 20 12 10 9 9
Solution:
To determine the modal class interval, it is that with the highest frequency i.e class 21 – 30 with
$\Delta_1$ = 20 – 6 = 14
$\Delta_2$ = 20 – 12 = 8
Mode = $20.5 + \left( \frac{14}{14 + 8} \right) \times 10$
= 20.5 + 6.364
= 26.86
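The grouped mode can be sketched in Python using the usual interpolation formula, lower boundary plus (difference before / (difference before + difference after)) times the class width. The function name `grouped_mode` is hypothetical, and treating a missing neighbouring class as having frequency 0 is an assumption of this sketch. The class width is 10, since the 21 – 30 class runs from boundary 20.5 to 30.5.

```python
def grouped_mode(lower_boundaries, freqs, width):
    """Mode of grouped data by interpolation within the modal class."""
    i = freqs.index(max(freqs))                          # modal class: highest frequency
    before = freqs[i - 1] if i > 0 else 0                # assumed 0 if no class before
    after = freqs[i + 1] if i < len(freqs) - 1 else 0    # assumed 0 if no class after
    d1 = freqs[i] - before    # modal frequency minus frequency of the class before
    d2 = freqs[i] - after     # modal frequency minus frequency of the class after
    return lower_boundaries[i] + d1 / (d1 + d2) * width


lower_boundaries = [10.5, 20.5, 30.5, 40.5, 50.5, 60.5]
freqs = [6, 20, 12, 10, 9, 9]
print(round(grouped_mode(lower_boundaries, freqs, width=10), 2))  # 26.86
```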