STA112 Week 2 Class Note

D

Uploaded by

johnajoma058

FREQUENCY DISTRIBUTIONS

Introduction
When summarizing large masses of raw data, it is often useful to
distribute the data into classes or categories and to determine the
number of individuals belonging to each class, called the class
frequency.

Definition:
A tabular arrangement of data by classes together with the
corresponding class frequencies is called a frequency
distribution or frequency table. It is a summary of the number of
times (how frequently) each category on a measurement scale
occurs within a given set of measurements.
It shows how the measurements are distributed (or spread) across the used
part of the measurement scale. Frequency distributions can be presented in
summary tables, known as frequency tables. The unorganized data
collected during an investigation are known as raw data.

The two types of frequency distribution are:


1. Ungrouped frequency distribution
2. Grouped frequency distribution

Ungrouped Frequency Distribution


This is a list of data values, each shown with its frequency of occurrence.
Example 1.
The blood types of 40 persons are as follows:
O O A B A O A A A O
B O B O O A O O A A
A A AB A B A A O O A
O O A A A O A O O AB
Produce a simple frequency distribution for the above data.
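
The tally for the blood type data can be sketched in Python as follows. This is a minimal illustration, not part of the course notes; the variable names are assumptions chosen for readability. Each distinct blood type is counted with `collections.Counter`.

```python
from collections import Counter

# Blood types of the 40 persons in the example.
blood_types = (
    "O O A B A O A A A O "
    "B O B O O A O O A A "
    "A A AB A B A A O O A "
    "O O A A A O A O O AB"
).split()

# Ungrouped frequency distribution: value -> number of occurrences.
frequency = Counter(blood_types)

for blood_type in ("A", "B", "AB", "O"):
    print(f"{blood_type:>2}  {frequency[blood_type]}")
```

The frequencies add up to the 40 observations, which is a quick check that no value was missed.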
Example 2.
The output, in units, of twenty employees during week ten was found to be:
68, 69, 70, 71, 70, 68, 69, 67, 70, 68, 72, 71, 69, 74, 70, 73, 71, 67, 69, 70
Construct a simple frequency distribution for the data.

ASSIGNMENT 2
• Assume that during a sixty-day working period, the cashier made the following
miscounts:
2 0 1 3 5 6 4 3 2 3
1 2 2 0 1 2 1 1 3 2
0 4 1 1 2 0 2 2 4 4
4 1 2 2 0 1 0 0 3 5
5 0 2 1 1 2 2 1 4 3
3 1 0 3 2 2 1 2 3 3
Summarize the raw data by the use of tally marks.

2. Grouped Frequency Distributions
In many practical applications, however, the data under consideration are
large in volume and continuous in nature. In such cases, grouping the data
into classes or groups is the most appropriate thing to do.

EXAMPLE
Suppose a teacher records raw scores of 50 students in a statistics test as
follows:
58 15 81 79 92 58 69 32 45 56
41 85 43 52 61 75 85 69 56 49
57 87 89 49 85 45 69 75 65 61
25 72 67 58 84 60 32 57 69 68
73 42 65 55 74 58 36 78 68 79
Present the data in a frequency table.

Frequency Table:
Classes    Tally    Frequency
11-20
21-30
31-40
41-50
51-60
61-70
71-80
81-90
91-100
Data organized and summarized as in the above frequency distribution are often called
grouped data.
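
One way to fill in the tally and frequency columns is sketched below in Python. It is an illustration under the assumption that the classes are 11-20, 21-30, ..., 91-100 as in the table; the names `scores`, `classes` and `table` are hypothetical.

```python
# Raw scores of the 50 students.
scores = [
    58, 15, 81, 79, 92, 58, 69, 32, 45, 56,
    41, 85, 43, 52, 61, 75, 85, 69, 56, 49,
    57, 87, 89, 49, 85, 45, 69, 75, 65, 61,
    25, 72, 67, 58, 84, 60, 32, 57, 69, 68,
    73, 42, 65, 55, 74, 58, 36, 78, 68, 79,
]

# Class limits 11-20, 21-30, ..., 91-100.
classes = [(lower, lower + 9) for lower in range(11, 100, 10)]

# Count the scores that lie within each pair of class limits.
table = {
    (lower, upper): sum(lower <= s <= upper for s in scores)
    for lower, upper in classes
}

for (lower, upper), f in table.items():
    print(f"{lower:>3}-{upper:<3} {'|' * f:<12} {f}")
```

Each printed row shows the class, one tally stroke per score, and the class frequency.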
DEFINITION OF SOME TERMS:

1) CLASS INTERVALS AND CLASS LIMITS: A symbol defining a
class, such as 11–20 in the table above, is called a class interval. The
end numbers, 11 and 20, are called class limits; the smaller number (11)
is the lower class limit, and the larger number (20) is the upper class
limit.

2) CLASS BOUNDARIES: If the scores are recorded to the nearest whole
number, the class interval 11–20 theoretically includes all
measurements from 10.500 to 20.500. These numbers, indicated briefly
by the exact numbers 10.5 and 20.5, are called class boundaries, or true
class limits; the smaller number (10.5) is the lower class boundary, and
the larger number (20.5) is the upper class boundary.

In practice, the class boundaries are obtained by adding the upper limit of
one class interval to the lower limit of the next-higher class interval and
dividing by 2.
Sometimes, class boundaries are used to symbolize classes. For example,
the various classes in the first column of the last example could be
indicated by:
10.5–20.5, 20.5–30.5, etc.

3) THE SIZE OF A CLASS INTERVAL: The size, or width, of a class
interval is the difference between the upper and lower class boundaries and
is also referred to as the class width, class size, or class length. If all class
intervals of a frequency distribution have equal widths, this common width
is denoted by c. In such a case c is equal to the difference between two
successive lower class limits or two successive upper class limits. For the
data of the above example, the class width is c = 20.5 − 10.5 = 10.
4) THE CLASS MARK: The class mark is the midpoint of the class
interval and is obtained by adding the lower and upper class limits and
dividing by 2. Thus the class mark of the interval 11–20 is (11+20)/2 =
15.5. The class mark is also called the class midpoint.
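
The definitions of class boundary, class width and class mark can be sketched together in Python. This is a minimal illustration that assumes equally spaced classes; the function name `class_summary` is hypothetical. The boundary between two classes is placed midway between the upper limit of one class and the lower limit of the next.

```python
def class_summary(limits):
    """Boundaries, width and class mark for equally spaced class limits."""
    # Gap between the upper limit of one class and the lower limit of the next.
    gap = limits[1][0] - limits[0][1]
    rows = []
    for lower, upper in limits:
        lower_boundary = lower - gap / 2
        upper_boundary = upper + gap / 2
        rows.append({
            "limits": (lower, upper),
            "boundaries": (lower_boundary, upper_boundary),
            "width": upper_boundary - lower_boundary,
            "mark": (lower + upper) / 2,
        })
    return rows

rows = class_summary([(11, 20), (21, 30), (31, 40)])
for row in rows:
    print(row)
```

For the class 11–20 this reproduces the values in the text: boundaries 10.5 and 20.5, width 10 and class mark 15.5.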

RELATIVE FREQUENCY DISTRIBUTIONS AND PERCENTAGE
DISTRIBUTIONS
The relative frequency of each measurement category in a sample is the
frequency of that category divided by the total frequency. It is symbolized
by: $\frac{f_i}{n}$

For a population, it is the frequency in each category divided by the
population size: $\frac{f_i}{N}$

The percentage for each category is the percent of the total frequency (n
or N) that is found in that category. This percentage is obtained by
multiplying the relative frequency by 100, which is symbolized for a
sample by: $\left(\frac{f_i}{n} \times 100\right)\%$
and for a population by: $\left(\frac{f_i}{N} \times 100\right)\%$


Example
Calculate the relative frequency for the table below:
Weight (kg) $x_i$    Frequency $f_i$
1.0                  2
1.1                  0
1.2                  4
1.3                  8
1.4                  4
1.5                  2
Total                20

Class Exercise
With the aid of the table below, draw a frequency table with class
boundaries, class mark, class width and relative frequency.
Group    Frequency
21-30    4
31-40    6
41-50    7
51-60    3
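
For the weight table in the example above, the relative frequency and percentage columns can be sketched in Python as follows. This is only an illustration; the list names are assumptions.

```python
# Weight (kg) and frequency columns from the example table.
weights = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5]
frequencies = [2, 0, 4, 8, 4, 2]

n = sum(frequencies)  # total frequency

# Relative frequency f_i / n and percentage (f_i / n) * 100.
relative = [f / n for f in frequencies]
percentage = [100 * f / n for f in frequencies]

for x, f, r, p in zip(weights, frequencies, relative, percentage):
    print(f"{x:.1f}  {f:>2}  {r:.2f}  {p:5.1f}%")
```

The relative frequencies sum to 1 and the percentages to 100, which is a useful check on any such table.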
Cumulative Frequency Distribution
The total frequency of all values less than the upper class boundary of a
given class interval is called the cumulative frequency up to and including
that class interval.
The entry for any score value or class interval is the sum of the frequencies
for that value or that interval plus the frequencies of all lower score values.

Example:
In a survey of monthly rents payable in a certain town, the following sample
data were collected for 48 residential flats in the town:
200 250 335 230 270 190 270 490
255 240 445 290 256 245 310 480
340 580 280 235 274 220 435 525
330 260 310 172 405 295 385 535
160 250 190 210 390 230 280 460
315 370 345 355 232 170 270 362
Present the data in a frequency distribution table using the classes
150-199, 200-249, ... Construct the cumulative frequency and percentage
relative frequency columns.
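
A possible way to build the table for the rent data is sketched below in Python, assuming classes of width 50 starting at 150-199. The names are illustrative; `itertools.accumulate` produces the running totals that form the cumulative frequency column, and the last column is the percentage relative frequency.

```python
from itertools import accumulate

# Monthly rents of the 48 flats.
rents = [
    200, 250, 335, 230, 270, 190, 270, 490,
    255, 240, 445, 290, 256, 245, 310, 480,
    340, 580, 280, 235, 274, 220, 435, 525,
    330, 260, 310, 172, 405, 295, 385, 535,
    160, 250, 190, 210, 390, 230, 280, 460,
    315, 370, 345, 355, 232, 170, 270, 362,
]

# Classes 150-199, 200-249, ..., 550-599.
classes = [(lower, lower + 49) for lower in range(150, 600, 50)]

frequencies = [sum(lo <= r <= hi for r in rents) for lo, hi in classes]
cumulative = list(accumulate(frequencies))  # running total of frequencies
n = len(rents)

for (lo, hi), f, cf in zip(classes, frequencies, cumulative):
    print(f"{lo}-{hi}  {f:>2}  {cf:>2}  {100 * f / n:5.1f}%")
```

The last cumulative frequency equals the total number of flats, 48.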
Exercise
The regulation board of health in a particular state specifies that the
fluoride level must not exceed 1.5 ppm (parts per million). The 25
measurements below represent the fluoride level for a sample of 25 days.
Although fluoride levels are measured more than once per day, these data
represent the early morning readings for the 25 days sampled:
0.75, 0.86, 0.84, 0.85, 0.97, 0.94, 0.89, 0.84, 0.83, 0.89, 0.88, 0.78, 0.77,
0.76, 0.82, 0.71, 0.92, 1.05, 0.94, 0.83, 0.85, 0.97, 0.93, 0.79, 0.81

i. Construct a frequency table for the above data using classes of
0.71–0.75, 0.76–0.80, 0.81–0.85, etc.

ii. Construct a relative frequency distribution.

iii. Construct a cumulative frequency distribution.

Graphical Displays of Data
For more visual impact, data can be represented in any of the following
forms:
1. Pictograms
2. Pie chart
3. Bar charts:
i. Simple bar chart
ii. Multiple bar chart
iii. Component bar chart
4. Histogram
5. Cumulative frequency curve (ogive)

Pictogram
A pictogram (or pictograph) is a way of representing data using images
related to the subject of the data. For instance, if the data are on
population, the pictogram will contain diagrams of human beings. If they are
about cars, the pictogram will contain cars.
Note: The number of diagrams drawn is usually proportional to the given data.
• In the pictogram above:
1. Who read the most books?
2. Who read the fewest books?

• Who scored the most goals?
Who scored the fewest goals?
• Which two pupils scored the same number of goals?
• Jessica and David
• How many more goals did Sam score than Will?

Example
Year                        1960   1970   1980   1990
Population (in millions)    12.6   26.2   38.4   50.1

Represent the data on a pictogram.
Hint: Each figure represents 10 million people.

Bar Chart
A bar graph is constructed on a rectangular coordinate
system whereby the X axis represents the
independent variable, the Y axis the dependent
variable, and rectangles (or bars) show the relationship
between the variables. The height of the vertical
rectangle above each measurement value is proportional
to the frequency of that value.
Note the following:

• A bar chart is a bar graph for nominal-level, ordinal-level,
and discrete ratio-level data.

• The bars of a bar chart must not touch each other.

SIMPLE BAR CHART
• This is a chart where the information is represented by a series of
bars all of the same width. The height or length of each bar
represents the magnitude of the figures. The bar charts may be
drawn vertically or horizontally. This also represents a set of
non-joining bars, each showing just one data value
proportionately. It deals with only one set of data.

Example 1
The table below shows the distribution of weight gains (kg) in lambs feeding
on a certain diet over a specified amount of time. Draw a simple bar chart for
the information.
Weight gain Frequency
5 1
6 3
7 2
8 6
9 7
10 5
11 4
MULTIPLE/COMPOSITE BAR CHART
• This consists of grouped bars. Each of the bars in a group shows a different
characteristic corresponding to a common variate value. The lengths of the
bars are proportional to the magnitudes of the characteristics they represent.
Each of the grouped bars may be coloured for ease of identification. A multiple
bar chart is a good device for visual comparison of two or more kinds of
information.

Example
• The table below shows the intake through JAMB by the Faculty of
Science of a certain University in three consecutive years. Present
the data on a multiple bar chart.
Department 2002 2003 2004
Micro-biology 43 40 35
Biochemistry 45 35 42
Applied Biology 28 40 28
Physics 33 25 35
Mathematics 35 35 38
Computer Science 40 42 45
Chemistry 37 40 42
Total 261 257 265
Example

• Draw a multiple bar diagram for the following data, which represent
agricultural production for the period 2017-2020.

Year    Food grains (tonnes)    Vegetables (tonnes)    Others (tonnes)
2017    100                     30                     10
2018    120                     40                     15
2019    130                     45                     25
2020    150                     50                     25

COMPONENT BAR CHART
• This is the chart in which each bar is divided into two or
more sections proportional in size to the component parts of
the total quantity being represented by each bar. The
component bar chart also aids visual comparison.

• Using the last example under multiple bar chart, construct a
component bar chart for the data.

ASSIGNMENT
• The data below show the number of students that passed with an
A in a course of a university in Nigeria from 2004 to 2006.
Present the data in a simple, multiple, and component bar chart.
Present the data in percentage simple and multiple bar charts.

Number of Students    2004    2005    2006
Chemistry             24      15      18
Physics               10      6       24
Mathematics           16      9       6

PIE CHART
• A pie chart is simply a circle divided into sections. The
circle represents the total of the data being presented and
each section is drawn proportional to its relative size. A
pair of compasses and a protractor are needed in the
construction of a pie chart.

Example
• The table below represents the skill classification of the
workforce at two factories. Draw a degree and
percentage pie chart to represent the data.

Skill Classification Number of Workers


Unskilled 20
Semi-skilled 30
Skilled 70
Total 120
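
The sector sizes for the degree and percentage pie charts follow from the proportions of the total. A minimal Python sketch of the arithmetic (names are illustrative, and only the calculation is shown, not the drawing):

```python
# Number of workers in each skill class.
workers = {"Unskilled": 20, "Semi-skilled": 30, "Skilled": 70}
total = sum(workers.values())

# Each sector takes the same share of 360 degrees (or of 100%)
# as the category takes of the total.
angles = {skill: 360 * count / total for skill, count in workers.items()}
percentages = {skill: 100 * count / total for skill, count in workers.items()}

for skill in workers:
    print(f"{skill:<13} {angles[skill]:6.1f} deg  {percentages[skill]:5.1f}%")
```

The angles add up to 360 degrees and the percentages to 100.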
• HISTOGRAM

• A histogram consists of a set of rectangles and has:
a) A title, which identifies the population of concern.
b) A vertical scale, which identifies the frequencies in
the various classes.
c) A horizontal scale, which identifies the variable.
d) Bars that touch each other.

EXAMPLE
Represent the information in the table below on a
histogram.
Class Freq class boundaries

10-14 3 9.5-14.5
15-19 6 14.5-19.5
20-24 10 19.5-24.5
25-29 11 24.5-29.5
30-34 5 29.5-34.5
35-39 3 34.5-39.5
ADVANTAGES OF HISTOGRAM
1). They display the comparative frequency occurrence of data items within each
class interval and so show which class interval are the most frequently occurring
and which are the least.
2). They indicate whether the range of values is wide or narrow and whether
most values occur in the middle of the range or whether the frequencies are
more evenly spread.

Frequency Polygon
This is a line graph of the class frequency plotted
against the class mark. It can be obtained by
connecting the midpoints of the tops of the rectangles
in the histogram. (Note: Class mark or midpoint is the
average of class limits)

CUMULATIVE FREQUENCY CURVE (OGIVE)
The graph of a cumulative frequency distribution is called a
cumulative frequency curve or ogive. In its
construction, each cumulative frequency is plotted against
the upper class boundary of the corresponding class interval.
When the cumulative totals of successive frequencies of a
distribution are plotted against the corresponding class
boundaries, we have a cumulative frequency curve
(also known as an ogive). Since cumulative frequencies are
formed by successive additions, the cumulative frequency
for a class can never be less than the cumulative frequency of the
preceding class. For this reason the cumulative
frequency curve either rises or remains level, and can
never drop down towards the x-axis. The last cumulative
frequency is the total of the frequencies in the distribution.

Examples
1. The number of maids in ten houses
Number of maids 6 1 2 3 4

Number of houses 1 3 4 1 1

Represent this on a histogram and draw the frequency
polygon on the histogram.
2. Draw the cumulative frequency curve of the data below
Group 1‐10 11‐20 21‐30 31‐40 41‐50

Frequency 1 4 12 8 3

Solution

Group    Frequency    Cumulative Frequency    Class Boundaries
1-10     1            1                       0.5-10.5
11-20    4            5                       10.5-20.5
21-30    12           17                      20.5-30.5
31-40    8            25                      30.5-40.5
41-50    3            28                      40.5-50.5

QUARTILES FROM CUMULATIVE FREQ. CURVE
To help read from the ogive, percentage cumulative frequencies are
marked on the vertical axis and the corresponding values are read
from the horizontal axis. These are quartiles and percentiles.

To obtain the percentiles and quartiles, we find the sum of the
frequencies N = ∑f, calculate the following positions, and read the
corresponding values from the ogive.

(a) The lower quartile or first quartile: $Q_1 = \frac{N}{4}$ or 25% of N

(b) The median: $Q_2 = \frac{N}{2}$ or 50% of N

(c) The upper quartile or 3rd quartile: $Q_3 = \frac{3N}{4}$ or 75% of N

(d) Decile: $D = \frac{N}{10}$ or 10% of N

(e) Quintile: $Q = \frac{N}{5}$ or 20% of N

(f) Inter-quartile range = $Q_3 - Q_1$ (from the graph)

(g) Semi-interquartile range = $\frac{1}{2}(Q_3 - Q_1)$

Example
• The table below shows the weight of 40 female students in
a school. Form a cumulative frequency table and use it to
draw an ogive.
Weight (kg)       118-124  125-131  132-138  139-145  146-152  153-159  160-166  167-173  174-180
No of Students    1        3        7        8        9        5        4        2        1

Using your graph, determine the following:
(a). Lower quartile (137.5kg)
(b). Median (146.3kg)
(c). Upper Quartile (155.3kg)
(d). Decile (131.8kg)
(e). Quintile (135.7kg)
(f). Interquartile range (17.8kg)
(g). Semi‐interquartile range. (8.9kg)
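
Reading a value off the ogive can be imitated numerically. The sketch below is an assumption-laden illustration, not the graphical method itself: it joins the ogive points with straight lines and interpolates, so its results can differ slightly from readings taken off a hand-drawn smooth curve. The function name `read_ogive` is hypothetical.

```python
from itertools import accumulate

# Upper class limits and frequencies from the weight table.
upper_limits = [124, 131, 138, 145, 152, 159, 166, 173, 180]
frequencies = [1, 3, 7, 8, 9, 5, 4, 2, 1]

# Ogive points: class boundaries against cumulative frequency, starting at
# the lower boundary of the first class with cumulative frequency 0.
boundaries = [117.5] + [u + 0.5 for u in upper_limits]
cumulative = [0] + list(accumulate(frequencies))
N = cumulative[-1]

def read_ogive(target):
    """Horizontal-axis value at which the ogive reaches `target`."""
    for i in range(1, len(cumulative)):
        if cumulative[i] >= target:
            x0, x1 = boundaries[i - 1], boundaries[i]
            c0, c1 = cumulative[i - 1], cumulative[i]
            # Straight-line interpolation between two ogive points.
            return x0 + (target - c0) * (x1 - x0) / (c1 - c0)
    raise ValueError("target exceeds the total frequency")

q1 = read_ogive(N / 4)
median = read_ogive(N / 2)
q3 = read_ogive(3 * N / 4)
print(q1, round(median, 1), round(q3, 1), round(q3 - q1, 1))
```

With straight-line interpolation the quartiles and median agree with the quoted answers to one decimal place (137.5, 146.3, 155.3), giving an interquartile range of about 17.8 kg.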

MEASURES OF CENTRAL TENDENCY OR
LOCATION
• As earlier stated, descriptive statistics aims at describing data by
summarising the values in the data set. One of the ways to achieve
this is to find a single value that will describe the general location
of the data. This single value which is a central point of the
distribution is referred to as measure of central tendency or
location.
Measures of central tendency are typical and
representative of a data set. All other values in the
distribution cluster around the measure of
location. These measures include:
arithmetic mean, median, mode, harmonic
mean and geometric mean.
• Measures of partition are measures that divide a distribution into specified
fractions. These measures are also known as fractiles; they
include the median, quartiles, percentiles and deciles.

The Arithmetic Mean
It is the usual average of a set of values, i.e. the
equal sharing of the total among all the values in the data set.
It is denoted mathematically, for a sample of size n, as
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
and, for a population of size N, as
$$\mu = \frac{\sum_{i=1}^{N} x_i}{N}$$

Example
1. Find the mean of 2g, 4g, 6g, 8g and 10g
2. Calculate the mean of 2.3, 5.4, 0 , 6.2, 7.9, 8.1, 0,
3.4

Solution
1. $$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{2 + 4 + 6 + 8 + 10}{5} = \frac{30}{5} = 6\,\text{g}$$
2. $$\bar{x} = \frac{\sum_{i=1}^{8} x_i}{8} = \frac{2.3 + 5.4 + 0 + 6.2 + 7.9 + 8.1 + 0 + 3.4}{8} = \frac{33.3}{8} \approx 4.163$$
Arithmetic Mean of an Ungrouped Frequency
Distribution
It is denoted mathematically as
$$\bar{x} = \frac{\sum_{i=1}^{n} f_i x_i}{\sum f}$$
where
$x_i$ represents the ith observation,
$f_i$ represents the frequency of the ith observation,
$\sum f$ is the total number of cases.
Example
The table below indicates the number of children in the
families of twenty teachers in a school. Compute the
arithmetic mean

Number of children 1 2 3 4 5
Number of teachers 4 2 6 5 3
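
The mean of an ungrouped frequency distribution is the sum of each value times its frequency, divided by the total frequency. A short Python sketch for the table above (the variable names are illustrative):

```python
# Number of children (x) and number of teachers with that many children (f).
children = [1, 2, 3, 4, 5]
teachers = [4, 2, 6, 5, 3]

total_f = sum(teachers)                                  # sum of f
total_fx = sum(f * x for x, f in zip(children, teachers))  # sum of f * x

mean = total_fx / total_f
print(total_fx, total_f, mean)
```

Here the sum of fx is 61 and the total frequency is 20, so the mean is 61/20 = 3.05 children per family.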

Arithmetic Mean: Method of Assumed Mean
This can be used to find the arithmetic mean of any
distribution, whether grouped or ungrouped. It
involves subtracting a specified assumed mean from each
value of x. The assumed mean is usually taken as the value
(or the mid mark of the class) with the highest frequency.
If A is the assumed mean and the deviation obtained
is represented by $d_i = x_i - A$, then, for a frequency
distribution, the arithmetic mean is given by:
$$\bar{x} = A + \frac{\sum f d}{n}, \qquad \text{where } n = \sum_{i} f_i$$
This method reduces the size of the values to be computed.
Example
With the aid of assumed mean method, find the
arithmetic mean of the table below:

x 3 4 5 6
f 2 6 8 4

Solution
The highest frequency is 8.
The value of x corresponding to this is 5. Hence, 5 is the
assumed mean.
x        f     d = x − 5    fd
3        2     −2           −4
4        6     −1           −6
5        8     0            0
6        4     1            4
Total    20                 −6

Arithmetic mean:
$$\bar{x} = A + \frac{\sum fd}{n} = 5 + \left(\frac{-6}{20}\right) = 5 + (-0.3) = 4.7$$

Assignment:
Attempt to solve the above using: $\bar{x} = \frac{\sum fx}{\sum f}$
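
As a check on the assumed mean method, the sketch below computes the mean of the same table both ways in Python. It is an illustration only; taking A as the value with the highest frequency follows the worked solution, and the names are assumptions.

```python
x = [3, 4, 5, 6]
f = [2, 6, 8, 4]
n = sum(f)

# Assumed mean: the value of x with the highest frequency.
A = x[f.index(max(f))]

# Deviations d = x - A and the assumed mean formula A + sum(fd) / n.
d = [xi - A for xi in x]
mean_assumed = A + sum(fi * di for fi, di in zip(f, d)) / n

# Direct formula sum(fx) / sum(f) for comparison.
mean_direct = sum(fi * xi for fi, xi in zip(f, x)) / n

print(A, mean_assumed, mean_direct)
```

Both routes give the same mean, which is what the method guarantees for any choice of A.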
Properties of the Arithmetic Mean
1.It is unique since there is only one in a set of data
2.It makes use of every value in the data, making it
suitable for further statistical analysis
3.It can sometimes give rise to ridiculous values e.g.
4.67 students
4.It is the most stable and widely used of the measures
of central tendency
5.It is unwise to use the mean as average if the distribution
is open ended
6.It represents equal sharing of the items in the distribution
7. The algebraic sum of all deviations from the arithmetic
mean is always zero, i.e. $\sum (X - \bar{X}) = 0$

MEDIAN
It is the middle value in a distribution.
To obtain the median of a data set, arrange the values in either
ascending or descending order and select the middle value(s).
It is easy to select the middle number if the number
of items is odd. If it is even, the median is the
average of the two middle terms.
Example
1. Find the median of 6, 7, 2, 4, 9, 0, 3
2. Find the median of 163, 149, 152, 160, 195, 180

Solution
1. Rearrange the numbers (ascending or descending): 0, 2, 3, 4, 6,
7, 9. The middle value is 4.
2. Rearrange: 149, 152, 160, 163, 180, 195. Here there are two middle
values, hence the median is
$$\frac{160 + 163}{2} = 161.5$$

Median of an Ungrouped Frequency Table
First, find the total frequency n, add 1 and
divide by two; this gives the position (n + 1)/2 of
the median. A cumulative frequency column is then
needed to help locate the median.

Example
1. Find the median of the following ungrouped frequency
table
x 2 4 6 8 10
f 15 12 23 6 4

2. The age distribution of 20 students is given in the table
below. Obtain the median.

Age               3    5    7    9    10
No of Students    2    3    4    5    6

Solution
1. Total frequency is 15 + 12 + 23 + 6 + 4 = 60, hence
$$\frac{n + 1}{2} = \frac{60 + 1}{2} = 30.5$$
x        f     Cumulative Freq
2        15    15
4        12    27
6        23    50
8        6     56
10       4     60
Total    60
From the cumulative frequency column, 30.5 falls
under 50 and the corresponding value of x is
6. Therefore, the median is 6.

2. Total frequency is 20, hence
$$\frac{n + 1}{2} = \frac{20 + 1}{2} = 10.5$$
x        f     Cumulative Freq
3        2     2
5        3     5
7        4     9
9        5     14
10       6     20
Total    20
From the cumulative frequency column, 10.5 is located under
14 and the corresponding x value is 9.
The median is 9.
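
The cumulative frequency lookup can be sketched in Python as below. This is a hedged illustration; the function name is hypothetical. When (n + 1)/2 is not a whole number, the sketch averages the items on either side of that position, which reduces to the value read from the table whenever both items are equal, as in the two examples.

```python
import math
from itertools import accumulate

def median_from_table(values, frequencies):
    """Median of an ungrouped frequency table (values in ascending order)."""
    n = sum(frequencies)
    position = (n + 1) / 2
    cumulative = list(accumulate(frequencies))

    def value_at(rank):
        # First value whose cumulative frequency reaches the given rank.
        return next(v for v, cf in zip(values, cumulative) if cf >= rank)

    # Average the items just below and just above the median position.
    return (value_at(math.floor(position)) + value_at(math.ceil(position))) / 2

print(median_from_table([2, 4, 6, 8, 10], [15, 12, 23, 6, 4]))
print(median_from_table([3, 5, 7, 9, 10], [2, 3, 4, 5, 6]))
```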
Properties of the Median
1. It is the central observation of a data set
2. The median is an actual data value when the number of
discrete items is odd
3. It can be estimated from incomplete data
4. The median always exists
5. The median cannot be used for further statistical computation
6. It is unique because there is only one value for the median in a
data set
MODE
It is the most frequently occurring item in a set of
observations.
When there are two modes, the distribution is called
bimodal; if three, it is called trimodal;
and when there are more than three, it is
called multimodal.
Example
Find the mode of the following;
i. 2, 5, 2, 3, 7, 1, 5, 6, 5
ii. 1, 6, 7, 6, 8, 4, 1
iii. 93, 72, 24, 43, 67, 93, 24, 43, 72, 67

Solution
i. The number that occurs most often is 5, since it occurs
three times.
ii. There is a tie for the number that occurs most often: 1
and 6 each occur twice, hence this distribution is
bimodal.
iii. This set has no mode since all the items occur equally often
(twice). This indicates that the mode may not exist in some
cases.
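
The three cases in the solution can be reproduced with a small Python sketch. The function name `modes` is hypothetical, and the rule that a data set whose values all occur equally often has no mode follows the convention used in these notes (other texts and libraries may report every value as a mode instead).

```python
from collections import Counter

def modes(data):
    """Return the list of modes; empty if every value occurs equally often."""
    counts = Counter(data)
    highest = max(counts.values())
    # Convention from the notes: no mode when all frequencies are equal.
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == highest)

print(modes([2, 5, 2, 3, 7, 1, 5, 6, 5]))
print(modes([1, 6, 7, 6, 8, 4, 1]))
print(modes([93, 72, 24, 43, 67, 93, 24, 43, 72, 67]))
```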
Properties of the Mode
1. It is the most frequently occurring value
2. It may or may not exist and if it exists, it may not be
unique
3. It ignores a large part of the data, hence it is not widely
used in research
4. It can be estimated from incomplete data
5. It cannot be used for further statistical analysis
Relationship between Mean, Median and Mode
An empirical relationship exists in unimodal
distributions which are not symmetrical in nature. The
relationship is:
$$\text{mean} - \text{mode} = 3(\text{mean} - \text{median})$$
If two of the above named measures are known, the third
can be computed provided the distribution is unimodal.
Example
1. The mode of a given distribution is 81.93 and
the mean is 97.68. Compute the median.
2. The mean value of a unimodal distribution is
181 while the mode is 160. Calculate the
median.

Solution
1. $\text{mean} - \text{mode} = 3(\text{mean} - \text{median})$
$97.68 - 81.93 = 3(97.68 - \text{median})$
$15.75 = 3(97.68 - \text{median})$
$\frac{15.75}{3} = 97.68 - \text{median}$
$5.25 = 97.68 - \text{median}$
$\text{median} = 97.68 - 5.25$
$\text{median} = 92.43$
2. $\text{mean} - \text{mode} = 3(\text{mean} - \text{median})$
$181 - 160 = 3(181 - \text{median})$
$21 = 3(181 - \text{median})$
$\frac{21}{3} = 181 - \text{median}$
$7 = 181 - \text{median}$
$\text{median} = 181 - 7$
$\text{median} = 174$
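
Rearranging the empirical relationship for the median gives median = mean − (mean − mode)/3. A minimal Python sketch of this rearrangement (the function name is hypothetical, and the relationship is only an approximation for moderately skewed unimodal data):

```python
def median_from_mean_and_mode(mean, mode):
    """Median implied by mean - mode = 3 * (mean - median)."""
    return mean - (mean - mode) / 3

print(round(median_from_mean_and_mode(97.68, 81.93), 2))
print(round(median_from_mean_and_mode(181, 160), 2))
```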
Weighted Arithmetic Mean, Geometric Mean, and Harmonic Mean

Weighted Arithmetic Mean: if the values in a set of
observations $x_1, x_2, x_3, \ldots, x_n$ are each given weights
or weighting factors $w_1, w_2, w_3, \ldots, w_n$, then the weighted
arithmetic mean is defined as:
$$\frac{w_1 x_1 + w_2 x_2 + w_3 x_3 + \cdots + w_n x_n}{w_1 + w_2 + w_3 + \cdots + w_n} = \frac{\sum wx}{\sum w}$$
An example of the W.A.M. is the Grade Point Average (GPA).
Example
The scores obtained by a Covenant University
Industrial Physics student in the Alpha semester
are given below. Calculate her Grade Point
Average (GPA).
Courses    MAT 114    MAT 111    CHM 111    TMC 111    GST 111
Units      2          3          3          1          2
Grade      B          C          A          A          B

To obtain the Grade Point Average, the units of the
courses form the weights.
Solution
Each grade carries a fixed number of points: A is 5
points, B is 4, C is 3, D is 2 and F is 0.
$$\frac{w_1 x_1 + w_2 x_2 + w_3 x_3 + \cdots + w_n x_n}{w_1 + w_2 + w_3 + \cdots + w_n} = \frac{(2 \times 4) + (3 \times 3) + (3 \times 5) + (1 \times 5) + (2 \times 4)}{2 + 3 + 3 + 1 + 2}$$
$$= \frac{8 + 9 + 15 + 5 + 8}{11} = \frac{45}{11} \approx 4.09$$
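
The GPA calculation is a weighted arithmetic mean with the course units as weights. A short Python sketch using the grade points stated in the solution (the data structure and names are illustrative):

```python
# Grade points as stated in the solution.
grade_points = {"A": 5, "B": 4, "C": 3, "D": 2, "F": 0}

# (course, units, grade) for the semester.
courses = [
    ("MAT 114", 2, "B"),
    ("MAT 111", 3, "C"),
    ("CHM 111", 3, "A"),
    ("TMC 111", 1, "A"),
    ("GST 111", 2, "B"),
]

weighted_sum = sum(units * grade_points[grade] for _, units, grade in courses)
total_units = sum(units for _, units, _ in courses)

gpa = weighted_sum / total_units
print(weighted_sum, total_units, round(gpa, 2))
```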
ASSIGNMENT
The final grade in a biology course is determined by a score from 0 to 100,
which has three components: a laboratory component of 25%, two
hour-exams that together contribute 25%, and a final exam that
contributes 50%. There are 100 possible points for the laboratory, 50
possible points for each hour exam, and 100 possible points for the
final. A student in the course got 75 points for the laboratory, 40 and
38 points for the two hour-exams, and 85 points for the final. Use
weighted mean to determine his overall score (from 0 to 100) for the
course.
Geometric Mean
This is the nth root of the product of n numbers. If
$X_1, X_2, X_3, \ldots, X_n$ are observations then the geometric
mean G is:
$$G = \sqrt[n]{X_1 X_2 X_3 \cdots X_n}$$

The G.M. is used to find the rate of increase or decrease in a set of
variables. For instance, it can be used to find the percentage increase
in business series and also in constructing index numbers.

$$G = \sqrt[n]{X_1 X_2 X_3 \cdots X_n}$$
$$G^n = X_1 X_2 X_3 \cdots X_n$$
Take the log of both sides:
$$\log G^n = \log (X_1 X_2 X_3 \cdots X_n)$$
Simplifying with the laws of logarithms:
$$n \log G = \log X_1 + \log X_2 + \log X_3 + \cdots + \log X_n$$
$$\log G = \frac{\log X_1 + \log X_2 + \log X_3 + \cdots + \log X_n}{n}$$
$$\log G = \frac{\sum \log X}{n} \qquad \text{(a)}$$
$$G = \operatorname{antilog}\left(\frac{\sum \log X}{n}\right) \qquad \text{(b)}$$
Either equation a or b can be used to obtain
geometric mean

Example
Find the geometric mean of 6, 8, 10, 16
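
The logarithm route to the geometric mean can be sketched in Python as follows. This is an illustration of taking the antilog of the mean of the logs, using base-10 logarithms; the function name is an assumption.

```python
import math

def geometric_mean(values):
    """Geometric mean as the antilog of the mean of the base-10 logs."""
    mean_log = sum(math.log10(x) for x in values) / len(values)
    return 10 ** mean_log

g = geometric_mean([6, 8, 10, 16])
print(round(g, 2))
```

The product of the four numbers is 7680, so the result is the fourth root of 7680, roughly 9.36.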

Geometric Mean of a Frequency Distribution
Denoted mathematically by
$$\log G = \frac{f_1 \log X_1 + f_2 \log X_2 + f_3 \log X_3 + \cdots + f_n \log X_n}{\sum f}$$
$$\log G = \frac{\sum f \log X}{\sum f}$$
For ungrouped data, $X_i$ represents an
individual observation while $f_i$ represents the
corresponding frequency.
For grouped data, $X_i$ represents the mid mark of the class.

Example
1. The table below shows the distribution of
the life span of 50 batteries in hours.
Calculate the geometric mean.

Life of battery (hours)    10    15    20    25    30
Number of batteries        5     9     18    12    6

2. The table below shows the distribution of
the tenement rates of houses in Ota. Find the geometric mean.
Tenement Rates      2-4    5-7    8-10    11-13    14-16
Number of Houses    3      4      10      15       5
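Harmonic Mean
It is the reciprocal of the arithmetic mean of the reciprocals of the
observations. For a set of
observations $X_1, X_2, X_3, \ldots, X_N$, the harmonic mean is
$$H = \frac{1}{\frac{1}{N} \sum \frac{1}{X}} = \frac{N}{\sum \frac{1}{X}}$$
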
Example
1. Calculate the harmonic mean of 6,7,8 and 9
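
A minimal Python sketch of the harmonic mean, computed as the number of observations divided by the sum of their reciprocals (the function name is illustrative):

```python
def harmonic_mean(values):
    """Number of observations divided by the sum of their reciprocals."""
    return len(values) / sum(1 / x for x in values)

h = harmonic_mean([6, 7, 8, 9])
print(round(h, 2))
```

For 6, 7, 8 and 9 the sum of reciprocals is 275/504, so H = 4 × 504/275, roughly 7.33.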

The relationship between the harmonic mean, arithmetic
mean and geometric mean of a set of positive observations is

$$\text{Arithmetic Mean} \geq \text{Geometric Mean} \geq \text{Harmonic Mean}$$
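
The ordering Arithmetic Mean ≥ Geometric Mean ≥ Harmonic Mean can be checked numerically with Python's standard `statistics` module (`geometric_mean` requires Python 3.8 or later). The data set below is only an illustrative choice.

```python
import statistics

data = [6, 7, 8, 9]

am = statistics.mean(data)
gm = statistics.geometric_mean(data)
hm = statistics.harmonic_mean(data)

print(round(am, 3), round(gm, 3), round(hm, 3))
print(am >= gm >= hm)
```

The three means coincide only when all the observations are equal.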

MEASURES OF PARTITION
These are measures that divide a distribution into
specified number of parts.
They include quartiles, deciles and percentiles.

QUARTILES
When an ordered set of data is divided into four equal parts,
the division points are called quartiles.
The first or lower quartile, $Q_1$, is a value that has
approximately 25% or $\frac{1}{4}$ of the observations below it and
approximately 75% of the observations above. Its position in the
ordered data is determined by: $\frac{1}{4}(n + 1)$

The second quartile, $Q_2$, has approximately 50% or $\frac{2}{4} = \frac{1}{2}$ of the
observations below its value. It is exactly equal to the median.

The third or upper quartile, $Q_3$, has approximately 75% or $\frac{3}{4}$ of
the observations below its value. Its position is determined by:
$\frac{3}{4}(n + 1)$

Examples
1. Given the data below 3, 4, 1, 2, 7, 12, 5 . Find,
i. The lower quartile
ii. The upper quartile
2. Find Q 1 and Q 3 of 1, 5, 3, 6, 9, 8
3. Find Q 1 and Q 3 of 4, 12, 2.2, 14, 23, 10, 16.4, 2, 15,
19.6, 20.6, 8

Solution
Rearrange each data set in ascending order.
1. Lower quartile: 2nd item, hence 2
Upper quartile: 6th item, hence 7
2. Lower quartile: between the 1st and 2nd items, hence 2.5
Upper quartile: between the 5th and 6th items, hence 8.25
3. Lower quartile: between the 3rd and 4th items, hence 5
Upper quartile: between the 9th and 10th items, hence 18.8
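
The position rule used in these solutions can be sketched as a single Python function. This is an illustration under the assumption that a fractional position is resolved by linear interpolation between the two neighbouring items, which is what the worked answers imply; the function name `fractile` is hypothetical. The same function covers deciles and percentiles by changing the fraction (for example 0.3 for the third decile, 0.8 for the 80th percentile).

```python
import math

def fractile(data, fraction):
    """Value at position fraction * (n + 1) in the ordered data (1-based)."""
    ordered = sorted(data)
    n = len(ordered)
    position = fraction * (n + 1)
    # Clamp so that extreme fractions cannot run off either end of the data.
    position = min(max(position, 1), n)
    lower = math.floor(position)
    weight = position - lower
    if lower == n:
        return ordered[-1]
    # Interpolate between the items at ranks `lower` and `lower + 1`.
    return ordered[lower - 1] + weight * (ordered[lower] - ordered[lower - 1])

print(fractile([3, 4, 1, 2, 7, 12, 5], 0.25), fractile([3, 4, 1, 2, 7, 12, 5], 0.75))
print(fractile([1, 5, 3, 6, 9, 8], 0.25), fractile([1, 5, 3, 6, 9, 8], 0.75))
```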

DECILES
When an ordered set of data is divided into ten equal
parts, the division points are called deciles.
The first decile is $\frac{1}{10}$ of the way through the distribution and
the third decile is $\frac{3}{10}$. Thus the nth decile is $\frac{n}{10}$.

PERCENTILES
When an ordered set of data is divided into one hundred equal
parts, the division points are called percentiles.
The 30th percentile is $\frac{30}{100}$ of the distribution; hence the lower
quartile, which is 25% or $\frac{1}{4}$, is known as the 25th percentile, the
median is the 50th percentile and the upper quartile is the 75th
percentile.
The method of computing any fractile is similar to the method
of computing the median and the quartiles.
Examples
Calculate the
i. Third decile
ii. 80th percentile of 1, 5, 3, 6, 9, 8
1. Find Q 1 and Q 3 of 4, 12, 2.2, 14, 23, 10, 16.4, 2, 15, 19.6,
20.6, 8
