Measures of Central Tendency: Introduction
Dr. A.L. Bowley correctly stated, "Statistics may rightly be called the science of averages."
The word average is commonly used in day-to-day conversations. For example, we may say that
Kanga is an average boy of my class; we may talk of an average Kenyan, average income, etc.
When it is said, "Kanga is an average student," it means that he is neither very good nor very
bad, but a mediocre student. In statistics, however, the term average has a different meaning:
it is a single value that represents a whole set of data.
The most common measures of central tendency, or location, are the arithmetic mean,
median and mode.
TASK 1
A variable takes the values as given below. Calculate the arithmetic mean of 110, 117, 129, 195,
95, 100, 100, 175, 250 and 750.
Example:
Use the Task 1 data to calculate the mean by the assumed mean method.
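A minimal Python sketch of the assumed mean method on the Task 1 data. The assumed mean of 200 is an illustrative choice, not a value from the text. Any convenient number gives the same result, because the deviations are corrected for in the last step.

```python
# Task 1 observations, copied from the text.
values = [110, 117, 129, 195, 95, 100, 100, 175, 250, 750]

# Assumed mean A: an illustrative round number (an assumption, not given in the text).
assumed_mean = 200

# Deviation of each observation from the assumed mean: d = x - A.
deviations = [x - assumed_mean for x in values]

# Short-cut formula: mean = A + (sum of d) / n.
mean_shortcut = assumed_mean + sum(deviations) / len(values)

# Direct formula for comparison: mean = (sum of x) / n.
mean_direct = sum(values) / len(values)

print(sum(deviations))   # 21
print(mean_shortcut, mean_direct)
```

Both approaches agree at 202.1. The short-cut method only saves effort when the deviations are smaller and easier to add than the raw values.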
Example: Mr. Sonko’s earnings for the past week were:
Monday $ 450
Tuesday $ 375
Wednesday $ 500
Thursday $ 350
Friday $ 270
Find his average earnings per day.
Solution:
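The direct method adds the five daily amounts and divides by the number of days. A short Python sketch of that arithmetic, with variable names chosen here for illustration:

```python
# Daily earnings in dollars, copied from the example.
earnings = {
    "Monday": 450,
    "Tuesday": 375,
    "Wednesday": 500,
    "Thursday": 350,
    "Friday": 270,
}

total = sum(earnings.values())            # sum of all observations
average_per_day = total / len(earnings)   # divide by the number of days

print(total, average_per_day)  # 1945 389.0
```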
TASK 2
The expenditure of ten families in dollars is given below:
Family        A    B    C    D    E    F    G    H    I    J
Expenditure   300  700  100  750  500  80   120  250  100  370
Taking $500 as the assumed mean, calculate the arithmetic mean.
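The formulae for the arithmetic mean by the direct method and by the short-cut method are as follows:
Direct method: mean = Σx / n, or Σfx / Σf when each value x has frequency f.
Short-cut method: mean = A + Σd / n, or A + Σfd / Σf, where A is the assumed mean and d = x − A.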
TASK 3
Find the mean of the following 50 observations.
19, 19, 20, 20, 20, 19, 20, 18, 21, 19,
20, 20, 19, 19, 20, 19, 21, 19, 19, 21,
18, 20, 18, 18, 17, 20, 20, 22, 20, 20,
20, 20, 20, 21, 20, 17, 23, 18, 17, 21,
20, 21, 20, 20, 20, 18, 21, 19, 20, 19
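With repeated values like these it is easier to build a frequency table first and then apply the frequency form of the mean. The sketch below does that in Python and checks the direct method against the short-cut method. The assumed mean of 20 is an illustrative choice, and the formulas used (Σfx / Σf and A + Σfd / Σf) are the standard ones, assumed here.

```python
from collections import Counter

# The 50 observations from Task 3, copied row by row.
observations = [
    19, 19, 20, 20, 20, 19, 20, 18, 21, 19,
    20, 20, 19, 19, 20, 19, 21, 19, 19, 21,
    18, 20, 18, 18, 17, 20, 20, 22, 20, 20,
    20, 20, 20, 21, 20, 17, 23, 18, 17, 21,
    20, 21, 20, 20, 20, 18, 21, 19, 20, 19,
]

# Frequency table: value x -> frequency f.
freq = Counter(observations)
n = sum(freq.values())

# Direct method: mean = sum(f * x) / sum(f).
mean_direct = sum(f * x for x, f in freq.items()) / n

# Short-cut method: mean = A + sum(f * d) / sum(f), with d = x - A.
assumed_mean = 20  # illustrative choice (an assumption)
mean_shortcut = assumed_mean + sum(f * (x - assumed_mean) for x, f in freq.items()) / n

for x in sorted(freq):
    print(x, freq[x])
print(n, mean_direct, mean_shortcut)
```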
TASK 4
Nine coins were tossed together and the number of times they fell on the side of heads was
observed. The activity was performed 256 times and the frequency obtained for different values
of x, (the number of times it fell on heads) is shown in the following table.
x 0 1 2 3 4 5 6 7 8
f 1 9 26 59 72 52 29 1 1
Calculate the mean by:
i) Direct method
ii) Short-cut method
TASK 5
Find the arithmetic mean for the following:
Marks below       10   20   30   40   50   60   70   80
No. of students   15   35   60   84   96   127  198  250
Step-Deviation Method
Here all class intervals are of the same width say 'c'. This method is employed in place of the
Short-cut method. We measure all the class-marks (mid values) from some convenient value, say
'A', which generally should be taken as the class-mark of a class of maximum frequency or of a
class which is the middle one. All the class marks happen to be multiples of c, since all class
intervals are equal. We consider class frequencies as if they are centered at the corresponding
class-marks.
Theorem: If x1, x2, x3, ..., xn are n values of the class marks with frequencies f1, f2, f3, ..., fn
respectively, and if each xi is expressed in terms of the new variable ui by the relation
ui = (xi − A) / c, then the mean is A + c × (Σ fi ui) / (Σ fi).
Example
Calculate the arithmetic mean from the following data:
Age (years) below   25   30   35   40   45   50   55   60
No. of employees    8    23   51   81   103  113  117  120
Solution:
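One way to work this example is sketched below in Python. The table is cumulative ("below" each age), so the class frequencies are recovered first by taking successive differences. The sketch assumes the first class is 20-25 (the text gives no lower limit for it), takes A = 37.5 as the assumed class mark, and uses the step-deviation relation u = (x − A) / c with c = 5.

```python
# Cumulative ("less than") table from the example.
upper_limits = [25, 30, 35, 40, 45, 50, 55, 60]
cumulative = [8, 23, 51, 81, 103, 113, 117, 120]
width = 5  # common class width c

# Class frequencies are successive differences of the cumulative counts.
frequencies = [cumulative[0]] + [b - a for a, b in zip(cumulative, cumulative[1:])]

# Class marks (mid values); assumes every class, including the first, has width 5.
class_marks = [u - width / 2 for u in upper_limits]

# Assumed mean A: the class mark of the class with the largest frequency.
A = 37.5

# Step deviations u = (x - A) / c.
steps = [(x - A) / width for x in class_marks]

sum_fu = sum(f * u for f, u in zip(frequencies, steps))
mean = A + width * sum_fu / sum(frequencies)

print(frequencies)            # [8, 15, 28, 30, 22, 10, 4, 3]
print(sum_fu, round(mean, 2)) # -16.0 36.83
```

Under these assumptions the mean age comes to about 36.83 years, the same value the direct method Σfx / Σf gives on the class marks.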
TASK 6
The following data were used to calculate an arithmetic mean. Find the missing item.
Wages (in dollars)   110  112  113  117  ?    125  129  130
No. of workers       25   17   13   15   14   8    7    2
TASK 7
The average marks of three batches of students having 70, 50 and 30 students respectively are
50, 55 and 45. Find the average marks of all the 150 students, taken together.
TASK 8
The mean of a certain number of observations is 40. If two more items with values 50 and 64
are added to this data, the mean rises to 42. Find the number of items in the original data.
TASK 9
The sum of deviations of a certain number of observations measured from 4 is 72 and the sum of
deviations of observations measured from 7 is -3. Find the number of observations and their
mean.
TASK 10
The mean weight of 98 students is found to be 50 kg. It is later discovered that the frequency of
the class interval (30- 40) was wrongly taken as 8 instead of 10. Calculate the correct mean.
4.3 Median
It is the value of the central item of the arranged data (data arranged in ascending
or descending order). Thus, it is the value of the middle item and divides the series into
two equal parts.
In Connor’s words: "The median is that value of the variable which divides the group into two
equal parts, one part comprising all values greater and the other all values lesser than the
median."
Example
The daily wages of 7 workers are 5, 7, 9, 11, 12, 14 and 15 dollars. This series contains 7 terms.
The fourth term, i.e. $11, is the median.
1.3.1 Median in an Individual Series (ungrouped Data)
1. Set the individual series either in the ascending (increasing) or in the descending (decreasing)
order, of the size of its items or observations.
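2. If the total number of observations is n, then the median is the value of the ((n + 1)/2)th
item when n is odd, and the mean of the (n/2)th and (n/2 + 1)th items when n is even.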
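The two steps can be sketched in Python as follows. This assumes the usual rule: the middle item when n is odd, and the mean of the two middle items when n is even. The function name is chosen here for illustration.

```python
def median_individual_series(values):
    """Median of ungrouped data: sort, then pick the middle item(s)."""
    if not values:
        raise ValueError("median of an empty series is undefined")
    ordered = sorted(values)          # step 1: arrange in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd n: the ((n + 1) / 2)th item
        return ordered[mid]
    # even n: mean of the (n / 2)th and (n / 2 + 1)th items
    return (ordered[mid - 1] + ordered[mid]) / 2


wages = [5, 7, 9, 11, 12, 14, 15]     # daily wages from the example above
print(median_individual_series(wages))          # 11
print(median_individual_series([5, 7, 9, 11]))  # 8.0
```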
TASK 11
The following figures represent the number of books issued at the counter of a Statistics library
on 11 different days: 96, 180, 98, 75, 270, 80, 102, 100, 94, 75 and 200.
Calculate the median.
TASK 12
The populations (in thousands) of 36 metropolitan cities are as follows: 2468, 591, 437, 20, 213,
143, 1490, 407, 284, 176, 263, 19, 181, 777, 387, 302, 213, 204, 153, 733, 391, 176, 178, 122,
532, 360, 65, 260, 193, 92, 672, 258, 239, 160, 147, 151. Calculate the median.
Sometimes the series is given in descending order of magnitude. In this situation either
convert the series to ascending order and use the regular formula, or keep the series in
descending order and use an alternative formula to calculate the median.
Example
Marks             40-50  30-40  20-30  10-20  0-10
No. of students   10     12     40     30     8
Solution:
By interpolation
Alternative formula:
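A hedged Python sketch of both routes for this example. It assumes the usual interpolation formula, Median = l + ((N/2 − cf) / f) × c, where l is the lower limit of the median class, cf the cumulative frequency below it, f its frequency and c its width. For the descending series it assumes the mirror form, Median = u − ((N/2 − cf') / f) × c, with u the upper limit and cf' the cumulative frequency above the median class. Function names are illustrative.

```python
# (lower limit, upper limit, frequency), rearranged into ascending order.
classes = [(0, 10, 8), (10, 20, 30), (20, 30, 40), (30, 40, 12), (40, 50, 10)]


def median_from_bottom(classes):
    """Interpolate upward from the lower limit of the median class."""
    half = sum(f for _, _, f in classes) / 2
    below = 0  # cumulative frequency of the classes already passed
    for lower, upper, f in classes:
        if below + f >= half:
            return lower + (half - below) * (upper - lower) / f
        below += f
    raise ValueError("no classes given")


def median_from_top(classes):
    """Same idea, counting down from the highest class (descending series)."""
    half = sum(f for _, _, f in classes) / 2
    above = 0  # cumulative frequency of the classes above
    for lower, upper, f in reversed(classes):
        if above + f >= half:
            return upper - (half - above) * (upper - lower) / f
        above += f
    raise ValueError("no classes given")


print(median_from_bottom(classes))  # 23.0
print(median_from_top(classes))     # 23.0
```

Here N = 100, so N/2 = 50 falls in the 20-30 class. Counting from the bottom gives 20 + (50 − 38) × 10 / 40, and counting from the top gives 30 − (50 − 22) × 10 / 40.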
Note that, while calculating the median of a series, it must be put in the 'exclusive class-interval'
form. If the original series is in inclusive type, first convert it into the exclusive type and then
find its median.
TASK 15
The following distribution represents the number of minutes spent by a group of teenagers in
watching movies. What is the median?
1.4 Mode
It is the size of the item that has the maximum frequency. According to Professors
Kenney and Keeping, the value of the variable which occurs most frequently in a distribution is
called the mode.
It is the most common value. It is the point of maximum density.
Note that if two or more numbers in a series share the maximum frequency, the mode is not
unique and is difficult to calculate. Such series are called bi-modal, tri-modal or multi-modal
series.
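For ungrouped data, Python's standard library can list every value that attains the maximum frequency, which makes the multi-modal case visible. The sketch below uses statistics.multimode (available from Python 3.8) on the library data from Task 11 and on a small hypothetical series made up for illustration.

```python
from statistics import multimode

# Books issued on 11 days (data from Task 11); 75 occurs twice.
books_issued = [96, 180, 98, 75, 270, 80, 102, 100, 94, 75, 200]
print(multimode(books_issued))   # [75]

# Hypothetical series with two values sharing the top frequency (bi-modal).
two_peaks = [2, 3, 3, 5, 7, 7, 9]
print(multimode(two_peaks))      # [3, 7]
```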
4.4.2 Grouped Data
Steps:
1. Determine the modal class, which has the maximum frequency.
2. By interpolation the value of the mode can be calculated as -
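A sketch of the two steps in Python, assuming the commonly used interpolation formula Mode = l + ((f1 − f0) / (2 f1 − f0 − f2)) × c. Here l is the lower limit of the modal class, f1 its frequency, f0 and f2 the frequencies of the classes just before and after it, and c the class width. The marks distribution from the median example is reused as illustrative data.

```python
# (lower limit, upper limit, frequency) in ascending order.
classes = [(0, 10, 8), (10, 20, 30), (20, 30, 40), (30, 40, 12), (40, 50, 10)]


def grouped_mode(classes):
    """Mode of a grouped distribution by interpolation within the modal class."""
    freqs = [f for _, _, f in classes]
    i = freqs.index(max(freqs))                    # step 1: locate the modal class
    lower, upper, f1 = classes[i]
    f0 = freqs[i - 1] if i > 0 else 0              # frequency of the preceding class
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0 # frequency of the following class
    # step 2: interpolate; the denominator is zero only if f0 = f1 = f2
    return lower + (f1 - f0) * (upper - lower) / (2 * f1 - f0 - f2)


print(round(grouped_mode(classes), 2))  # 22.63
```

For this data the modal class is 20-30, so the calculation is 20 + (40 − 30) × 10 / (80 − 30 − 12).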
TASK 17
Calculate the modal wage by interpolation.
Verify it graphically.
TASK 18
Given median = 20.6, mode = 26
Find mean.
There are several reasons why you might want to use a weighted mean.
1. Each individual data value might actually represent a value that is used by multiple people in
your sample. The weight, then, is the number of people associated with that particular value.
2. Your sample might deliberately over-represent or under-represent certain segments of the
population. To restore balance, you would place less weight on the over-represented segments of
the population and greater weight on the under-represented segments of the population.
3. Some values in your data sample might be known to be more variable (less precise) than other
values. You would place greater weight on those data values known to have greater precision.
TASK 19
Joan gets quiz grades of 79, 82, and 69. She gets a 65 on her final exam. Find the weighted mean
if the quizzes each count for 10% and the final exam counts for 70% of the final grade.
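A weighted mean multiplies each value by its weight, adds the products and divides by the total weight. The Python sketch below applies that to the grades in this task, with the weights written as whole percentages so the arithmetic stays exact. The variable names are illustrative.

```python
# Three quiz grades followed by the final exam grade.
grades = [79, 82, 69, 65]

# Weights in percent: 10% for each quiz, 70% for the final exam.
weights = [10, 10, 10, 70]

# Weighted mean = sum(w * x) / sum(w).
weighted_mean = sum(w * x for w, x in zip(weights, grades)) / sum(weights)

# Unweighted mean for comparison.
simple_mean = sum(grades) / len(grades)

print(weighted_mean)  # 68.5
print(simple_mean)    # 73.75
```

The weighted mean is lower than the simple mean because the final exam, the lowest grade, carries most of the weight.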
1.7 Geometric mean
The geometric mean is an average calculated by multiplying a set of numbers and taking the nth
root, where n is the number of numbers.
A common example of when the geometric mean is used is averaging growth rates.
The formula for the geometric mean is:
GM = (X1 × X2 × ... × Xn)^(1/n)
where n is the number of observations made of the variable X and X1, X2, ..., Xn are the values of
these observations.
Example:
Find the geometric mean of the numbers 3, 25 and 45.
There are three observations, thus n = 3.
GM = ∛(3 × 25 × 45) = ∛3375 = 15
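The same calculation as a short Python sketch: multiply the values, then take the nth root. The result is compared with statistics.geometric_mean from the standard library (Python 3.8 or later). Floating-point roots are rarely exact, so the output is rounded for display.

```python
import math
import statistics

values = [3, 25, 45]
n = len(values)

product = math.prod(values)          # 3 * 25 * 45
gm = product ** (1 / n)              # nth root of the product

print(product)                       # 3375
print(round(gm, 6))                  # 15.0
print(round(statistics.geometric_mean(values), 6))  # 15.0
```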
The geometric mean cannot be calculated if we have negative or zero observations. The
geometric mean of a set of readings is always less than the arithmetic mean (unless all readings
are identical) and is less influenced by very large values / items.
TASK 20
a. Calculate the arithmetic and geometric means of the following salaries, in thousands of
shillings per month: 6, 8, 10, 10, 10, 12, 16.
b. Calculate the same two means for the following salaries (in thousands of Ksh) in a company
per annum (p.a.): 6, 8, 10, 10, 10, 12, 48.
The geometric mean is useful when only a few items in a distribution are changing; in such
circumstances it is more stable than the arithmetic mean. It is also useful in the calculation of
share indices and wherever data grow in geometric progression, e.g. the
population of a country.
EXAMPLE
The population of a city was 300,000 in 1980 and 400,000 in 1990. Suppose we want an
estimate of the population in 1985. The arithmetic mean gives:
= (300,000 + 400,000) / 2
= 350,000
Here we are assuming that the population grows by the same number each year, which
is not correct. The same applies to money growing at a compound rate. The
geometric mean for 1985 would be:
= √(300,000 × 400,000)
= 346,410
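A small Python check of the two mid-period estimates. The arithmetic mean corresponds to growth by a constant number of people per year, the geometric mean to growth by a constant percentage per year. Variable names are chosen here for illustration.

```python
import math

pop_1980 = 300_000
pop_1990 = 400_000

# Constant yearly increase: the midpoint is the arithmetic mean.
arithmetic_estimate = (pop_1980 + pop_1990) / 2

# Constant yearly growth rate: the midpoint is the geometric mean.
geometric_estimate = math.sqrt(pop_1980 * pop_1990)

print(arithmetic_estimate)          # 350000.0
print(round(geometric_estimate))    # 346410
```

The geometric estimate is the lower of the two, as expected for any pair of unequal positive values.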
The harmonic mean is the quotient of the number of given values and the sum of the
reciprocals of the given values.
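The definition translates directly into code. The Python sketch below uses a hypothetical helper name, harmonic_mean, and accepts optional frequencies so that the same function covers a frequency distribution: with frequencies f the formula becomes Σf / Σ(f / x). Every value must be non-zero, otherwise a reciprocal is undefined.

```python
def harmonic_mean(values, frequencies=None):
    """Harmonic mean: total count divided by the sum of reciprocals."""
    if frequencies is None:
        frequencies = [1] * len(values)   # ungrouped data: each value counts once
    total = sum(frequencies)
    reciprocal_sum = sum(f / x for x, f in zip(values, frequencies))
    return total / reciprocal_sum


# Simple check with easy numbers: 3 / (1/1 + 1/2 + 1/4) = 3 / 1.75.
print(round(harmonic_mean([1, 2, 4]), 4))   # 1.7143

# The same call pattern with frequencies: value 2 once, value 4 three times.
print(round(harmonic_mean([2, 4], [1, 3]), 4))  # 3.2
```

For grouped data with class intervals, the class marks would be passed as the values, which is an approximation in the same way as for the arithmetic mean.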
TASK 21
Calculate the harmonic mean of the numbers: 13.5, 14.5, 14.8, 15.2 and 16.1
TASK 22
Given the following frequency distribution of first year students of a particular college.
Calculate the Harmonic Mean.
Age (years)          13   14   15   16   17
Number of students   2    5    13   7    3
TASK 23:
Calculate the harmonic mean for the data given below:
Marks 30-39 40-49 50-59 60-69 70-79 80-89 90-99
frequency 2 3 11 20 32 25 7
4. David looked at a passage from a book. He recorded the number of words in each sentence
as shown in the following frequency table.
Class interval (number of words) Frequency
1-5 16
6-10 28
11-15 26
16-20 14
21-25 10
26-30 3
31-35 1
36-40 0
41-45 2
5. Twenty students are asked how many detentions they received during the previous week at
school. The results are summarized in the frequency distribution table below.
7. An atlas gives the following information about the approximate population of some cities in
the year 2000. The population of Nairobi has accidentally been left out.
City         Population (millions)
Melbourne    3.2
Bangkok      7.2
Nairobi
Paris        9.6
São Paulo    17.7
Tokyo        28.0
Seattle      2.1
The atlas tells us that the mean population for this group of cities is 10.01 million.
(a) Calculate the population of Nairobi.
(b) Which city has the median population value?
8. The number of hours that a professional footballer trains each day in the month of June is
represented in the following histogram.
(a) Write down the modal number of hours trained each day.
(b) Calculate the mean number of hours he trains each day.