Lecture III Probability and Statistics

The document discusses numerical summaries for data, distinguishing between statistics for samples and parameters for populations. It categorizes numerical summaries into measures of location (central tendency and relative positioning) and measures of spread, detailing methods to calculate mean, median, and mode. Additionally, it introduces weighted, harmonic, and geometric means, providing examples and exercises for practical understanding.

Uploaded by

adamsedwin06

3 NUMERICAL SUMMARIES

A numerical summary for a set of data is referred to as a statistic if the data set is a sample and
a parameter if the data set is the entire population.
Numerical summaries are categorized as measures of location and measures of spread.
Measures of location can further be classified into measures of central tendency and measures
of relative positioning (quantiles).

3.1 Measures of Location


Before discussing the measures of location, it is important to consider summation notation and
indexing.
Index (subscript) Notation: Let the symbol $x_i$ (read 'x sub i') denote any of the n values
$x_1, x_2, \ldots, x_n$ assumed by a variable X. The letter i in $x_i$ (i = 1, 2, ..., n) is called an index or
subscript. The letters j, k, p, q or s can also be used.
Summation Notation:
\[ \sum_{i=1}^{n} x_i = x_1 + x_2 + \cdots + x_n \]
and, for any constant a,
\[ \sum_{i=1}^{n} a x_i = a x_1 + a x_2 + \cdots + a x_n = a (x_1 + x_2 + \cdots + x_n) = a \sum_{i=1}^{n} x_i \]
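The two summation rules above can be checked numerically. The following is a minimal Python sketch; the values in `x` and the constant `a` are made up for illustration and do not come from the lecture.

```python
# Illustrative values only (assumed, not from the lecture notes).
x = [2, 5, 7, 10]
a = 3

# Sum of x_i for i = 1, ..., n
total = sum(x)

# Sum of a * x_i, which should equal a times the sum of x_i
scaled_total = sum(a * xi for xi in x)

print(total)                    # 24
print(scaled_total, a * total)  # 72 72
```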

3.1.1 Measures of Central Tendency


A measure of central tendency of a set of numbers is a single value that best represents the set. There
are three common measures of central tendency, namely the mean, the median and the mode. Each has
advantages and disadvantages depending on the data and the intended purpose.
Arithmetic Mean (Average)
The arithmetic mean of a set of values $x_1, x_2, \ldots, x_n$, denoted $\bar{x}$ if the data set is a sample, is
found by dividing the sum of the values by the number of values, i.e.
\[ \bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n} \sum_{i=1}^{n} x_i \]
Example 1 Find the mean of 1, 2, 3, 4, 5, 6, 7, 8, 9 and 10.
Solution
Sum of values: 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10 = 55
Number of values = 10
Mean of values: $\bar{x} = \frac{55}{10} = 5.5$
Note: If the numbers $x_1, x_2, \ldots, x_k$ occur $f_1, f_2, \ldots, f_k$ times respectively (that is, with
frequencies $f_1, f_2, \ldots, f_k$), the arithmetic mean is given by
\[ \bar{x} = \frac{f_1 x_1 + f_2 x_2 + \cdots + f_k x_k}{f_1 + f_2 + \cdots + f_k} = \frac{1}{n} \sum_{i=1}^{k} f_i x_i \]
where $n = \sum_{i=1}^{k} f_i$ is the total frequency. This is the formula for the mean of grouped data.

Example 2 The grades of a student on six examinations were 84, 91, 72, 68, 91 and 72. Find
the arithmetic mean.
The arithmetic mean is
\[ \bar{x} = \frac{1}{n} \sum f_i x_i = \frac{1(84) + 2(91) + 2(72) + 1(68)}{1 + 2 + 2 + 1} = \frac{478}{6} \approx 79.67 \]

Example 3 If 5, 8, 6 and 2 occur with frequencies 3, 2, 4 and 1 respectively, the arithmetic
mean is
\[ \bar{x} = \frac{1}{n} \sum f_i x_i = \frac{3(5) + 2(8) + 4(6) + 1(2)}{3 + 2 + 4 + 1} = \frac{57}{10} = 5.7 \]
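The frequency form of the mean used in Examples 2 and 3 can be sketched in Python as follows. The function name `frequency_mean` is a hypothetical helper chosen for this illustration, not something defined in the lecture.

```python
def frequency_mean(values, freqs):
    """Mean of values x_i that occur with frequencies f_i (hypothetical helper)."""
    if len(values) != len(freqs):
        raise ValueError("values and freqs must have the same length")
    n = sum(freqs)  # total frequency
    return sum(f * x for x, f in zip(values, freqs)) / n

# Example 2: grades 84, 91, 72, 68 occurring 1, 2, 2, 1 times
example2 = frequency_mean([84, 91, 72, 68], [1, 2, 2, 1])
# Example 3: values 5, 8, 6, 2 occurring 3, 2, 4, 1 times
example3 = frequency_mean([5, 8, 6, 2], [3, 2, 4, 1])

print(round(example2, 2))  # 79.67
print(example3)            # 5.7
```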

Exercise
1. Find the mean of 9, 3, 4, 2, 1, 5, 8, 4, 7, 3.
2. A sample of 5 executives received the following bonuses last year: sh 14,000, sh
15,000, sh 17,000, sh 16,000 and sh y. Find the value of y if the average bonus for the
5 executives is sh 15,400.

Properties of the Arithmetic Mean


(1) The algebraic sum of the deviations of a set of numbers from their arithmetic mean
is zero, that is,
\[ \sum_{i=1}^{n} (x_i - \bar{x}) = 0. \]
(2) $\sum_{i=1}^{n} (x_i - a)^2$ is minimum if and only if $a = \bar{x}$.
(3) If $n_1$ numbers have mean $\bar{x}_1$, $n_2$ numbers have mean $\bar{x}_2$, ..., $n_k$ numbers have
mean $\bar{x}_k$, then the mean of all the numbers, called the combined mean, is given by
\[ \bar{x}_c = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2 + \cdots + n_k \bar{x}_k}{n_1 + n_2 + \cdots + n_k} = \frac{\sum n_i \bar{x}_i}{\sum n_i} \]
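The three properties can be checked numerically. The sketch below reuses the six grades from Example 2; the split into two groups and the helper name `sum_sq` are assumptions made only for this illustration.

```python
data = [84, 91, 72, 68, 91, 72]  # grades from Example 2
mean = sum(data) / len(data)

# Property (1): deviations from the mean sum to zero (up to rounding error)
deviation_sum = sum(x - mean for x in data)

# Property (2): the sum of squared deviations about a is smallest at a = mean
def sum_sq(a):
    return sum((x - a) ** 2 for x in data)

# Property (3): combined mean of two groups from their sizes and means
group1, group2 = data[:2], data[2:]  # arbitrary split, for illustration
m1 = sum(group1) / len(group1)
m2 = sum(group2) / len(group2)
combined = (len(group1) * m1 + len(group2) * m2) / (len(group1) + len(group2))

print(abs(deviation_sum) < 1e-9)                         # True
print(sum_sq(mean) < min(sum_sq(mean - 1), sum_sq(mean + 1)))  # True
print(abs(combined - mean) < 1e-9)                       # True
```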

Median
It is the value below which and above which half of the observations fall when they are ranked in order
of size. The position of the median term is given by the $\left(\frac{n+1}{2}\right)$th value.
NB: If n is even, we average the middle two terms.
For grouped data, the median is estimated using the formula
\[ \text{Median} = LCB + \left( \frac{\frac{n+1}{2} - Cf_a}{f} \right) \times i \]
where LCB, f and i are the lower class boundary, frequency and class interval (width) of the median
class, and $Cf_a$ is the cumulative frequency of the class immediately above (preceding) the median class.
Remark: A disadvantage of the median is that it is insensitive to changes in the data: it depends only on
the middle of the ranked values, so changes to the other observations that do not move them past the middle leave it unchanged.

Mode
It is the value occurring most frequently in a data set. If each observation occurs the same
number of times, then there is no mode. When two or more values tie for the highest frequency,
the data set is said to be multimodal.

For ungrouped data it is easy to pick out the mode. However, if the data is grouped, the mode
is estimated using the formula
\[ \text{Mode} = LCB + \left( \frac{f - f_a}{2f - f_a - f_b} \right) \times i \]
where LCB, f and i are the lower class boundary, frequency and class interval (width) of the modal
class, and $f_a$ and $f_b$ are the frequencies of the classes immediately above (preceding) and below (following) the modal class respectively.
Remark: The Empirical Relation between the Mean, Median and Mode
\[ \text{Mean} - \text{Mode} = 3\,(\text{Mean} - \text{Median}) \]
The above relation holds approximately for unimodal frequency curves that are moderately skewed (asymmetrical).

Example 1 Find the median and mode for the data: 19, 13, 18, 14, 12, 25, 11, 17, 16, 23, 19.
Solution
Sorted data: 11, 12, 13, 14, 16, 17, 18, 19, 19, 23, 25.
n = 11, thus Median = $\left(\frac{11+1}{2}\right)$th value = 6th value = 17.
Mode = 19, since it appears more frequently in this data set than any other observation.

Example 2 Find the median and mode of the data: 2, 4, 8, 7, 9, 4, 6, 10, 8, and 5.
Solution
Array: 2, 4, 4, 5, 6, 7, 8, 8, 9, 10. The modes are 4 and 8, i.e. the data is bimodal.
n = 10, thus Median = $\left(\frac{10+1}{2}\right)$th value = 5.5th value = $\frac{6 + 7}{2} = 6.5$.
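The ranking procedure used in Examples 1 and 2 can be sketched in Python. The function names `median` and `modes` are hypothetical helpers written for this illustration; returning an empty list when every value is equally frequent is one way to encode the "no mode" case described above.

```python
from collections import Counter

def median(data):
    """Middle value of the ranked data; average of the two middle values if n is even."""
    if not data:
        raise ValueError("data must not be empty")
    s = sorted(data)
    n = len(s)
    mid = n // 2
    if n % 2 == 1:
        return s[mid]
    return (s[mid - 1] + s[mid]) / 2

def modes(data):
    """All values tied for the highest frequency; [] if every value is equally frequent."""
    counts = Counter(data)
    top = max(counts.values())
    if len(counts) > 1 and all(c == top for c in counts.values()):
        return []  # each observation occurs the same number of times: no mode
    return sorted(v for v, c in counts.items() if c == top)

ex1 = [19, 13, 18, 14, 12, 25, 11, 17, 16, 23, 19]
ex2 = [2, 4, 8, 7, 9, 4, 6, 10, 8, 5]

print(median(ex1), modes(ex1))  # 17 [19]
print(median(ex2), modes(ex2))  # 6.5 [4, 8]
```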

Example 3
Estimate the mean, median and mode for the following frequency distribution:
Class   5-9   10-14   15-19   20-24   25-29   30-34   35-39
Freq    5     12      32      40      16      9       6
Solution
Boundaries      4.5-9.5   9.5-14.5   14.5-19.5   19.5-24.5   24.5-29.5   29.5-34.5   34.5-39.5
Midpoint (x)    7         12         17          22          27          32          37
Frequency (f)   5         12         32          40          16          9           6
fx              35        144        544         880         432         288         222
CF              5         17         49          89          105         114         120

Mean:
\[ \bar{x} = \frac{\sum fx}{n} = \frac{35 + 144 + \cdots + 222}{120} = \frac{2545}{120} \approx 21.2083 \]
n = 120, thus the median is the 60.5th value, so the median class is 19.5-24.5 and
\[ \text{Median} = LCB + \left( \frac{\frac{n+1}{2} - Cf_a}{f} \right) \times i = 19.5 + \left( \frac{60.5 - 49}{40} \right) \times 5 = 20.9375 \]
The modal class (the class with the highest frequency) is 19.5-24.5, therefore
\[ \text{Mode} = LCB + \left( \frac{f - f_a}{2f - f_a - f_b} \right) \times i = 19.5 + 5 \left( \frac{40 - 32}{80 - 32 - 16} \right) = 20.75 \]
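The grouped-data calculations in Example 3 can be sketched in Python as below. The function `grouped_summary` is a hypothetical helper, and it assumes all classes share the same width, as they do in this example; it follows the lecture's formulas, including the $(n+1)/2$ position for the median.

```python
def grouped_summary(lower_bounds, width, freqs):
    """Estimate (mean, median, mode) from equal-width classes (hypothetical helper)."""
    n = sum(freqs)
    mids = [lb + width / 2 for lb in lower_bounds]
    mean = sum(f * x for f, x in zip(freqs, mids)) / n

    # Median class: first class whose cumulative frequency reaches (n + 1) / 2
    target = (n + 1) / 2
    cum = 0  # cumulative frequency of the classes before the median class
    for k, f in enumerate(freqs):
        if cum + f >= target:
            break
        cum += f
    median = lower_bounds[k] + (target - cum) / freqs[k] * width

    # Modal class: the class with the highest frequency
    m = freqs.index(max(freqs))
    fa = freqs[m - 1] if m > 0 else 0               # class above (preceding)
    fb = freqs[m + 1] if m < len(freqs) - 1 else 0  # class below (following)
    mode = lower_bounds[m] + (freqs[m] - fa) / (2 * freqs[m] - fa - fb) * width
    return mean, median, mode

lower = [4.5, 9.5, 14.5, 19.5, 24.5, 29.5, 34.5]
freqs = [5, 12, 32, 40, 16, 9, 6]
mean, med, mode = grouped_summary(lower, 5, freqs)
print(round(mean, 4), round(med, 4), round(mode, 4))  # 21.2083 20.9375 20.75
```

The sketch does not guard against a zero denominator in the mode formula, which would occur if the modal class and both neighbours had equal frequencies.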
Exercise
1. Find the mean, median and mode for the following data: 9, 3, 4, 2, 9, 5, 8, 4, 7, 4

2. Find the mean, median and mode of 1, 2, 2, 3, 4, 4, 5, 5, 5, 5, 7, 8, 8 and 9.
3. The number of goals scored in 15 hockey matches is shown in the table.
No of goals     1   2   3   4   5
No of matches   2   1   5   3   4
Calculate the mean number of goals scored.
4. The table shows the heights of 30 students in a class. Calculate an estimate of the mean, mode
and median height.
Height (cm)      140<x<144   144<x<148   148<x<152   152<x<156   156<x<160   160<x<164
No of students   4           5           8           7           5           1
5. Estimate the mean, median and mode for the following two frequency distributions:
Class       1-4   5-8   9-12   13-16   17-20   21-24
Frequency   10    14    20     16      12      8

Class       40-59   60-79   80-99   100-119   120-139   140-159   160-179
Frequency   5       12      32      40        16        9         6

3.1.2 Other Types of Means


These will include weighted, harmonic and geometric means.

The Weighted Arithmetic Mean


The weighted arithmetic mean of a set of n numbers $x_1, x_2, \ldots, x_n$ having corresponding weights
$w_1, w_2, \ldots, w_n$ is defined as
\[ \bar{x}_w = \frac{w_1 x_1 + w_2 x_2 + \cdots + w_n x_n}{w_1 + w_2 + \cdots + w_n} = \frac{\sum w_i x_i}{\sum w_i} \]

Example 1 Consider the following table with marks obtained by two students, James (mark
x) and John (mark y). The weights are to be used in determining who joins the engineering
course, whose requirement is a weighted mean of at least 58% on the four subjects below.
Subject   Maths   English   History   Physics   Total
Mark x    25      87        83        30        225
Mark y    70      45        35        75        225
Weight    3.6     2.3       1.5       2.6       10

Working out the products of the marks and the weights, we get
Subject   Maths   English   History   Physics   Total
wx        90      200.1     124.5     78        492.6
wy        252     103.5     52.5      195       603

Now
\[ \bar{x}_w = \frac{\sum w_i x_i}{\sum w_i} = \frac{492.6}{10} = 49.26 \quad \text{and} \quad \bar{y}_w = \frac{\sum w_i y_i}{\sum w_i} = \frac{603}{10} = 60.3 \]
Clearly John qualifies but James does not.
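The weighted means in Example 1 can be reproduced with a short Python sketch. The function name `weighted_mean` and the variable `required` are hypothetical names introduced for this illustration.

```python
def weighted_mean(values, weights):
    """Sum of w_i * x_i divided by the sum of the weights (hypothetical helper)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

weights = [3.6, 2.3, 1.5, 2.6]   # Maths, English, History, Physics
james = [25, 87, 83, 30]         # mark x
john = [70, 45, 35, 75]          # mark y
required = 58                    # course requirement from Example 1

james_mean = weighted_mean(james, weights)
john_mean = weighted_mean(john, weights)

print(round(james_mean, 2), james_mean >= required)  # 49.26 False
print(round(john_mean, 2), john_mean >= required)    # 60.3 True
```

Both students have the same unweighted total (225), so the difference in outcome comes entirely from the weights.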

Example 2 If a final examination is weighted 4 times as much as a quiz, a midterm examination
3 times as much as a quiz, and a student has a final examination grade of 80, a midterm
examination grade of 95 and quiz grades of 90, 65 and 70, the mean grade is
\[ \bar{X} = \frac{1(90) + 1(65) + 1(70) + 3(95) + 4(80)}{1 + 1 + 1 + 3 + 4} = \frac{830}{10} = 83. \]
Question A tycoon has 3 house girls whom he pays Ksh 4,000 each per month, 2 watchmen
whom he pays Ksh 5,000 each and some garden men who receive Ksh 7,000 each. If he pays
out an average of Ksh 5,700 per month to these people, find the number of garden men.

Question A student's grades in the laboratory, lecture and recitation parts of a computer course
were 71, 78 and 89, respectively.
(a) If the weights accorded these grades are 2, 4 and 5, respectively, what is the average grade?
(b) What is the average grade if equal weights are used?

The Geometric Mean


The geometric mean is well suited for data measured as ratios, proportions or percentages. For this
type of data, it is more reliable than the arithmetic mean.
Definition: The geometric mean for a set of observations $x_1, x_2, \ldots, x_n$ is given by
\[ GM = \sqrt[n]{x_1 \times x_2 \times \cdots \times x_n} = \sqrt[n]{\prod_{i=1}^{n} x_i} \]
For grouped data, where the values $x_1, x_2, \ldots, x_k$ occur with frequencies $f_1, f_2, \ldots, f_k$, one has
\[ GM = \sqrt[n]{x_1^{f_1} \times x_2^{f_2} \times \cdots \times x_k^{f_k}} = \sqrt[n]{\prod_{i=1}^{k} x_i^{f_i}} \quad \text{where } n = \sum f_i \]
If n is large, computing the nth root directly can be a challenge. In this case, the geometric mean is computed
through logarithms:
\[ \log GM = \log \sqrt[n]{\prod_{i=1}^{k} x_i^{f_i}} = \frac{1}{n} \sum_{i=1}^{k} f_i \log x_i \]
Taking the antilog, one obtains GM, the geometric mean.
Example
The yearly percentage growth of the urban population in Kenya from 1965 to 1970 is given below:
Year         1965   1966    1967    1968    1969   1970
% Increase   8.35   17.91   33.08   40.32   35.8   47.67
Obtain the average percentage growth rate of the urban population from 1965 to 1970 in Kenya.
Solution
The geometric mean should be used because the measurements are in percentages.
\[ GM = \sqrt[6]{8.35 \times 17.91 \times 33.08 \times 40.32 \times 35.80 \times 47.67} = 26.42405 \]
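The logarithm route described in the definition can be sketched in Python and applied to the growth figures of this example. The function `geometric_mean` is a hypothetical helper written for this illustration; the optional `freqs` argument covers the grouped form, and natural logarithms are used (any base gives the same result after taking the antilog).

```python
import math

def geometric_mean(values, freqs=None):
    """Geometric mean via logarithms: exp((1/n) * sum(f_i * log x_i))."""
    if freqs is None:
        freqs = [1] * len(values)  # ungrouped data: every value counted once
    if any(x <= 0 for x in values):
        raise ValueError("geometric mean needs strictly positive values")
    n = sum(freqs)
    log_gm = sum(f * math.log(x) for x, f in zip(values, freqs)) / n
    return math.exp(log_gm)  # the antilog

growth = [8.35, 17.91, 33.08, 40.32, 35.80, 47.67]
gm = geometric_mean(growth)
print(round(gm, 3))  # 26.424
```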

Question Below are the yearly percentage profits made by a company in six successive years:
52.22, 46.59, 21.36, 30.17, 22.87, 17.77. Determine the mean percentage profit made by the
company in the last six years.

The Harmonic Mean


This type of mean is suitable for data pertaining to speed, rates and time.
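Definition: The harmonic mean for a set of observations $x_1, x_2, \ldots, x_n$ is given by
\[ HM = \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \cdots + \frac{1}{x_n}} = \frac{n}{\sum \left( \frac{1}{x_i} \right)} \]
For grouped data, where the values $x_1, x_2, \ldots, x_k$ occur with frequencies $f_1, f_2, \ldots, f_k$, we have
\[ HM = \frac{n}{\frac{f_1}{x_1} + \frac{f_2}{x_2} + \cdots + \frac{f_k}{x_k}} = \frac{n}{\sum \left( \frac{f_i}{x_i} \right)} \quad \text{where } n = \sum f_i \]

Note: This type of mean cannot be used when any of the observations $x_1, x_2, \ldots, x_n$ is zero.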

Example A production machine is programmed to run for only four hours per day. In the first
hour its output is 40 units/h, in the second hour its output is 60 units/h, in the third hour the
output is 70 units/h, while in the fourth hour the output is 30 units/h. Determine the average
output per hour of the machine.
Solution
Using the harmonic mean, one has
\[ HM = \frac{4}{\frac{1}{40} + \frac{1}{60} + \frac{1}{70} + \frac{1}{30}} = 44.8 \approx 45 \text{ units/hr}. \]
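The harmonic mean in this example can be reproduced with a short Python sketch. The function `harmonic_mean` is a hypothetical helper written for this illustration; it rejects zero values, in line with the note above, and the optional `freqs` argument covers the grouped form.

```python
def harmonic_mean(values, freqs=None):
    """n divided by the sum of f_i / x_i (hypothetical helper)."""
    if freqs is None:
        freqs = [1] * len(values)  # ungrouped data: every value counted once
    if any(x == 0 for x in values):
        raise ValueError("harmonic mean is undefined when an observation is zero")
    n = sum(freqs)
    return n / sum(f / x for x, f in zip(values, freqs))

rates = [40, 60, 70, 30]  # hourly outputs in units/h
hm = harmonic_mean(rates)
print(round(hm, 1))  # 44.8
```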

Example (Geometric and Harmonic for grouped data)
Find the harmonic and geometric mean of the frequency table below.
x   13   14   15   16   17
f   2    5    13   7    3
Solution
The harmonic mean is
\[ HM = \frac{30}{\frac{2}{13} + \frac{5}{14} + \frac{13}{15} + \frac{7}{16} + \frac{3}{17}} \approx 15.06 \]
and the geometric mean is
\[ GM = \sqrt[30]{13^{2} \times 14^{5} \times 15^{13} \times 16^{7} \times 17^{3}} \approx 15.09837 \]
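The grouped forms of both means can be checked for this frequency table with the following Python sketch. The variable names are chosen for this illustration only; the geometric mean is computed through logarithms, as described in the section on the geometric mean.

```python
import math

x = [13, 14, 15, 16, 17]
f = [2, 5, 13, 7, 3]
n = sum(f)  # total frequency, 30

# Grouped harmonic mean: n / sum(f_i / x_i)
hm = n / sum(fi / xi for xi, fi in zip(x, f))

# Grouped geometric mean: exp((1/n) * sum(f_i * log x_i))
gm = math.exp(sum(fi * math.log(xi) for xi, fi in zip(x, f)) / n)

print(round(hm, 4))  # 15.0631
print(round(gm, 4))  # 15.0984
```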

The Relation between the Arithmetic, Geometric and Harmonic Means:
\[ HM \le GM \le \bar{x}. \]
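The ordering of the three means can be checked numerically. The sketch below uses functions from Python's standard `statistics` module (`geometric_mean` requires Python 3.8 or later) on the hourly outputs from the production machine example; the choice of data is only for illustration.

```python
import statistics

rates = [40, 60, 70, 30]  # hourly outputs from the production machine example

hm = statistics.harmonic_mean(rates)
gm = statistics.geometric_mean(rates)
am = statistics.mean(rates)

print(round(hm, 2), round(gm, 2), am)
print(hm <= gm <= am)  # True
```

For positive data, the three means coincide only when all the observations are equal.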
Exercise:
1. Find the harmonic and the geometric mean of the numbers 10, 12, 15, 5 and 8.
2. The number of goals scored in 15 hockey matches is shown in the table below.
Calculate the harmonic and geometric mean number of goals scored.
No of goals     1   3   5   6   9
No of matches   2   1   5   3   4
3. Find the harmonic and geometric mean of the frequency table below.
Class       0-29   30-49   50-79   80-99
Frequency   20     30      40      10
