
Understanding Statistics and Data Analysis

The document provides an overview of statistics, including its historical context and modern applications in decision-making. It covers key concepts such as frequency distribution, measures of central tendency (mean, median, mode), and measures of dispersion (range, variance, standard deviation). Various methods for calculating these statistical measures are illustrated with examples and tables.

Uploaded by

Patrick Sikale
Copyright
© © All Rights Reserved

The Open University of Tanzania (OUT)

OFC 009 – MATHEMATICS


Lecture 6

STATISTICS

STATISTICS
Statistics, which formerly dealt only with the affairs of the
state, takes its name from the Italian word Statistica, meaning
statesman. During the Roman Empire, governments kept such records
to learn about births, deaths, taxes paid by the people, military
strength and so on, and the practice was commonly known as
statecraft.
In modern times statistics is used as a tool in decision making,
resolving problems through factual information. That is, large
volumes of numerical information can be organised, presented,
analysed and interpreted.

Frequency distribution
Before statistical data can be analysed effectively, it must be
sorted according to the numerical values of some characteristic
(variable).
Example
The ages of 30 Open University staff are recorded in the
table below:

35 60 55 55 38 45
45 60 55 50 55 45
55 45 62 55 60 42
45 55 50 45 55 35
55 60 35 55 40 65

Frequency Table
The table below shows the frequency of each age among the 30
Open University staff listed above.

Age (years)    Number of Staff (Frequency)
65             1
62             1
60             4
55             10
50             2
45             6
42             1
40             1
38             1
35             3
Total          30
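
The counting step can be sketched in Python. This is only an illustration, assuming the 30 ages are typed in as a list; the names `ages` and `frequency` are chosen here and are not part of the lecture.

```python
from collections import Counter

# Ages of the 30 staff, copied row by row from the data table.
ages = [
    35, 60, 55, 55, 38, 45,
    45, 60, 55, 50, 55, 45,
    55, 45, 62, 55, 60, 42,
    45, 55, 50, 45, 55, 35,
    55, 60, 35, 55, 40, 65,
]

# Counter maps each distinct age to the number of times it occurs.
frequency = Counter(ages)

# Print the table from the largest age to the smallest, with tally marks.
for age in sorted(frequency, reverse=True):
    count = frequency[age]
    print(f"{age:>3}  {'|' * count:<10}  {count:>2}")
```

Counting the raw data this way gives a frequency of 6 for age 45, and the frequencies add up to 30.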

Frequency distribution by classes
When the variable under consideration is continuous, takes many
distinct values, or involves classifications, it is advisable to
group the data into suitable sub-ranges (classes).

Age of Students (years)    Number of Students (Frequency)
25 – 29                    2
30 – 34                    4
35 – 39                    6
40 – 44                    8
45 – 49                    11
50 – 54                    9
55 – 59                    4
60 – 64                    2
65 – 69                    4

Less than Cumulative Frequency
If the classes are numbered 1, 2, 3, ..., n from the smallest to
the largest measurement and f1, f2, f3, ..., fn are their respective
frequencies, then the less than cumulative frequency
distribution is as indicated below.

Cumulative frequency for class 1        0
Cumulative frequency for class 2        f1
Cumulative frequency for class 3        f1 + f2
Cumulative frequency for class 4        f1 + f2 + f3
...                                     ...
Cumulative frequency for class n + 1    f1 + f2 + f3 + ... + fn

Less than Cumulative Frequency

Age of Students (years)    Cumulative Frequency
Less than 25               0
Less than 30               2
Less than 35               6
Less than 40               12
Less than 45               20
Less than 50               31
Less than 55               40
Less than 60               44
Less than 65               46
Less than 70               50

[Figure: Less than Cumulative Frequency Polygon (Ogive)]
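
As a rough sketch, the less than cumulative frequencies are running totals of the class frequencies, starting from 0. The variable names below are illustrative assumptions; the frequencies are those of the student age classes.

```python
from itertools import accumulate

# Lower class limits and frequencies of the student age classes.
lower_limits = [25, 30, 35, 40, 45, 50, 55, 60, 65]
frequencies = [2, 4, 6, 8, 11, 9, 4, 2, 4]
class_width = 5

# Running totals f1, f1 + f2, ..., preceded by 0 for "less than 25".
less_than = [0] + list(accumulate(frequencies))

# Cut points 25, 30, ..., 70: each lower limit plus the end of the last class.
cut_points = lower_limits + [lower_limits[-1] + class_width]

for cut, total in zip(cut_points, less_than):
    print(f"Less than {cut}: {total}")
```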

More than cumulative frequency
If the classes are numbered 1, 2, 3, ..., n from the smallest to
the largest measurement and f1, f2, f3, ..., fn are their respective
frequencies, then the more than cumulative frequency
distribution is as indicated below:

Cumulative frequency for class 1        f1 + f2 + f3 + ... + fn
Cumulative frequency for class 2        f2 + f3 + ... + fn
Cumulative frequency for class 3        f3 + ... + fn
...                                     ...
Cumulative frequency for class n - 1    fn-1 + fn
Cumulative frequency for class n        fn
Cumulative frequency for class n + 1    0

More than Cumulative Frequency

Age of Students (years)    Cumulative Frequency
25 or more                 50
30 or more                 48
35 or more                 44
40 or more                 38
45 or more                 30
50 or more                 19
55 or more                 10
60 or more                 6
65 or more                 4
70 or more                 0

[Figure: More than Cumulative Frequency Polygon (Ogive)]
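
A minimal sketch of the more than cumulative count, computed directly from the class frequencies of the student age table: for each lower class limit it adds up the frequencies of that class and every class above it. The names are illustrative assumptions.

```python
# Lower class limits and frequencies of the student age classes.
lower_limits = [25, 30, 35, 40, 45, 50, 55, 60, 65]
frequencies = [2, 4, 6, 8, 11, 9, 4, 2, 4]

# Entry k is f_k + ... + f_n; the last entry is an empty sum, i.e. 0.
more_than = [sum(frequencies[k:]) for k in range(len(frequencies) + 1)]

# The final cut point (70) lies one class width above the last lower limit.
cut_points = lower_limits + [lower_limits[-1] + 5]

for cut, total in zip(cut_points, more_than):
    print(f"{cut} or more: {total}")
```

Each entry plus the matching less than cumulative frequency equals the total of 50, which is a quick way to check either table.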

Histogram
A histogram consists of a set of rectangles with their bases on the
x-axis, each running from the lower to the upper class boundary and
centred at the class mark.

Class Boundaries Table

Class Interval    Class Boundaries (Low – High)    No. of Students
25 – 29           24.5 – 29.5                      2
30 – 34           29.5 – 34.5                      4
35 – 39           34.5 – 39.5                      6
40 – 44           39.5 – 44.5                      8
45 – 49           44.5 – 49.5                      11
50 – 54           49.5 – 54.5                      9
55 – 59           54.5 – 59.5                      4
60 – 64           59.5 – 64.5                      2
65 – 69           64.5 – 69.5                      4

[Figure: Histogram]

Frequency polygon
A frequency polygon is built on the same rectangles as the
histogram: it is drawn by connecting the midpoints of the tops of
the rectangles.

Class mark
The class mark of an interval is the midpoint of the class
interval. It is obtained by taking the average of the lower class
limit and the upper class limit, or the average of the lower and
upper class boundaries.

\[ \text{Class mark} = \frac{35 + 39}{2} = 37 \]

OR

\[ \text{Class mark} = \frac{34.5 + 39.5}{2} = 37 \]

Class Boundaries Table

Class Interval    Class Boundaries    Midpoint    No. of Students
25 – 29           24.5 – 29.5         27          2
30 – 34           29.5 – 34.5         32          4
35 – 39           34.5 – 39.5         37          6
40 – 44           39.5 – 44.5         42          8
45 – 49           44.5 – 49.5         47          11
50 – 54           49.5 – 54.5         52          9
55 – 59           54.5 – 59.5         57          4
60 – 64           59.5 – 64.5         62          2
65 – 69           64.5 – 69.5         67          4

[Figure: Frequency Polygon]
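
The boundaries and midpoints in this table can be derived from the class limits. The sketch below assumes whole-year limits with a gap of 1 between the upper limit of one class and the lower limit of the next, so each boundary sits half a unit outside the limit; the names are illustrative.

```python
# Class limits 25-29, 30-34, ..., 65-69.
class_limits = [(low, low + 4) for low in range(25, 70, 5)]

# Gap between one class's upper limit and the next class's lower limit.
gap = 1

rows = []
for low, high in class_limits:
    boundaries = (low - gap / 2, high + gap / 2)  # e.g. (24.5, 29.5)
    class_mark = (low + high) / 2                 # midpoint of the limits
    rows.append(((low, high), boundaries, class_mark))

for limits, boundaries, mark in rows:
    print(limits, boundaries, mark)
```

The midpoint of the boundaries is the same as the midpoint of the limits, which is why either definition of the class mark gives the same value.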

Measures of central tendency
(Mean, Median, Mode)
When a single value represents a given set of data and describes
the characteristics of the entire group, that value is called a
measure of central tendency. Examples of measures of central
tendency are:

• Arithmetic mean
• Median
• Mode

Arithmetic Mean
The arithmetic mean, or simply the mean, is the average of the data. It is
obtained by dividing the sum of the items by the number of items.
Let \bar{x} be the arithmetic mean and x1, x2, x3, ..., xN be the values of the
variable. Then

\[ \bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i = \frac{x_1 + x_2 + x_3 + \dots + x_N}{N} \]

Calculation of arithmetic mean using an assumed mean

Let A be an assumed mean (any convenient value) and di the deviation of xi
from the assumed mean, where di = xi – A. Then

\[ \bar{x} = A + \frac{1}{N}\sum_{i=1}^{N} d_i \]
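
Both calculations can be compared on the staff ages from the frequency distribution example. This is a sketch only; the assumed mean A = 50 is an arbitrary choice made here, and the assumed-mean line uses the standard identity mean = A + (sum of deviations) / N.

```python
# Ages of the 30 staff from the frequency distribution example.
ages = [
    35, 60, 55, 55, 38, 45,
    45, 60, 55, 50, 55, 45,
    55, 45, 62, 55, 60, 42,
    45, 55, 50, 45, 55, 35,
    55, 60, 35, 55, 40, 65,
]
N = len(ages)

# Direct method: sum of the items divided by the number of items.
mean_direct = sum(ages) / N

# Assumed-mean method: A is any convenient guess (50 is an arbitrary choice).
A = 50
deviations = [x - A for x in ages]
mean_assumed = A + sum(deviations) / N

print(mean_direct, mean_assumed)
```

Both routes give about 50.4 years. The assumed-mean route only keeps the numbers small when working by hand: the deviations here sum to 12, so the correction is 12 / 30 = 0.4.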
Calculation of arithmetic mean using frequency distribution
Let the values x1, x2, ..., xn occur with frequencies f1, f2, ..., fn
respectively, and let N = f1 + f2 + ... + fn be the total number of items.
Then the arithmetic mean of the distribution is given by

\[ \bar{x} = \frac{1}{N}\sum_{i=1}^{n} f_i x_i, \qquad N = \sum_{i=1}^{n} f_i \]
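
A short sketch of the frequency-weighted mean, using the staff age frequencies. The dictionary name `freq_table` is an assumption made here, and the count of 6 for age 45 is obtained by counting the raw data.

```python
# Age -> frequency for the 30 staff (45 occurs 6 times in the raw data).
freq_table = {65: 1, 62: 1, 60: 4, 55: 10, 50: 2,
              45: 6, 42: 1, 40: 1, 38: 1, 35: 3}

# Total number of items is the sum of the frequencies.
N = sum(freq_table.values())

# Weighted sum: each value multiplied by its frequency.
weighted_sum = sum(f * x for x, f in freq_table.items())

mean = weighted_sum / N
print(N, weighted_sum, mean)
```

The result agrees with the mean obtained by adding the 30 individual ages, because multiplying by the frequency is just a shortcut for repeated addition.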
Median
The median of an ordered distribution is the middle number in the given
ordered distribution.
(i) Median for ungrouped data when the number of ordered values is odd

Position    1   2   3   4   5    6    7
Value       1   2   5   8   10   12   13

Median = 8 (the 4th value)

(ii) Median for ungrouped data when the number of ordered values is even

Position    1   2   3   4   5   6   7    8
Value       2   3   4   6   8   9   10   13

\[ \text{Median} = \frac{6 + 8}{2} = 7 \]
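
The odd and even cases can be handled by one small function. This is a sketch with an illustrative name, `median_ungrouped`; the two lists are the values read off the examples above.

```python
def median_ungrouped(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the single middle value.
        return ordered[mid]
    # Even count: average of the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_example = [1, 2, 5, 8, 10, 12, 13]
even_example = [2, 3, 4, 6, 8, 9, 10, 13]

print(median_ungrouped(odd_example))   # middle (4th) value
print(median_ungrouped(even_example))  # average of the 4th and 5th values
```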

Median for Grouped Data
Let: L = lower class boundary of the interval containing the median
N = the total number of items
∑ f = sum of the frequencies of the class intervals preceding
the class interval containing the median
fm = frequency of the class containing the median
W = class width
Then

\[ \text{Median} = L + \frac{W}{f_m}\left(\frac{N}{2} - \sum f\right) \]
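
As a worked sketch, the formula can be applied to the student age classes. The function name is illustrative, and L is taken here as the lower class boundary (44.5 for the class 45 – 49), which is an assumption about how the limits are to be read.

```python
def grouped_median(lower_boundaries, frequencies, width):
    """Median = L + (W / fm) * (N / 2 - sum of preceding frequencies)."""
    N = sum(frequencies)
    if N <= 0:
        raise ValueError("total frequency must be positive")
    preceding = 0  # sum of frequencies of the classes before the current one
    for L, fm in zip(lower_boundaries, frequencies):
        # The median class is the first one whose cumulative count reaches N/2.
        if preceding + fm >= N / 2:
            return L + (width / fm) * (N / 2 - preceding)
        preceding += fm


lower_boundaries = [24.5, 29.5, 34.5, 39.5, 44.5, 49.5, 54.5, 59.5, 64.5]
frequencies = [2, 4, 6, 8, 11, 9, 4, 2, 4]

median_age = grouped_median(lower_boundaries, frequencies, width=5)
print(round(median_age, 2))
```

Here N = 50, so N/2 = 25. The cumulative frequency first reaches 25 in the class 45 – 49, where ∑f = 20 and fm = 11, giving 44.5 + (5/11)(25 − 20), roughly 46.77 years.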

Mode
The mode, or modal value, of a distribution is the value that occurs
most frequently.
Example 1
1, 3, 3, 5, 6, 8, 8, 8, 8,9, 10, 10, 10, 11, 12, 12, 13 has mode 8.

Example 2
1, 3, 3, 3, 3, 6, 8, 8, 9, 10, 10, 10, 10, 11, 12 has two modes, 3 and 10.

Example 3
1, 3, 5, 6, 8, 9, 10, 11, 12, 13 has no mode.
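
The three examples can be checked with a short function. This is a sketch under one stated convention: when every value occurs exactly once, the data set is treated as having no mode and an empty list is returned. The function name `modes` is an illustrative choice.

```python
from collections import Counter


def modes(values):
    """Return the sorted list of modes; empty if no value repeats."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []  # every value occurs once: no mode
    return sorted(v for v, c in counts.items() if c == highest)


example_1 = [1, 3, 3, 5, 6, 8, 8, 8, 8, 9, 10, 10, 10, 11, 12, 12, 13]
example_2 = [1, 3, 3, 3, 3, 6, 8, 8, 9, 10, 10, 10, 10, 11, 12]
example_3 = [1, 3, 5, 6, 8, 9, 10, 11, 12, 13]

print(modes(example_1), modes(example_2), modes(example_3))
```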

Mode for a Grouped Data
The mode for a grouped set of data is calculated using the
formula

\[ \text{Mode} = L + \left(\frac{\Delta_1}{\Delta_1 + \Delta_2}\right) W \]

Where
L = Lower class boundary of the modal class
∆1 = Difference between the frequency of the modal class and
the frequency of the preceding class (ignoring signs)
∆2 = Difference between the frequency of the modal class and
the frequency of the succeeding class (ignoring signs)
W = Class width
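
A sketch of the grouped mode formula applied to the student age classes. The function name is illustrative; when the modal class is the first or last class, the missing neighbour is treated as having frequency 0, which is an assumption made here rather than something stated in the lecture.

```python
def grouped_mode(lower_boundaries, frequencies, width):
    """Mode = L + d1 / (d1 + d2) * W for the class with the highest frequency."""
    m = frequencies.index(max(frequencies))  # modal class (first one if tied)
    before = frequencies[m - 1] if m > 0 else 0
    after = frequencies[m + 1] if m + 1 < len(frequencies) else 0
    d1 = abs(frequencies[m] - before)
    d2 = abs(frequencies[m] - after)
    # Note: d1 + d2 is zero if both neighbours equal the modal frequency.
    return lower_boundaries[m] + d1 / (d1 + d2) * width


lower_boundaries = [24.5, 29.5, 34.5, 39.5, 44.5, 49.5, 54.5, 59.5, 64.5]
frequencies = [2, 4, 6, 8, 11, 9, 4, 2, 4]

mode_age = grouped_mode(lower_boundaries, frequencies, width=5)
print(round(mode_age, 2))
```

The modal class is 45 – 49 with frequency 11, so ∆1 = 11 − 8 = 3 and ∆2 = 11 − 9 = 2, and the mode is 44.5 + (3/5)(5) = 47.5 years.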

Measures of dispersion
(Range, Variance, Standard deviation)
The measures of central tendency discussed in the previous
section are not sufficient to summarise the entire data set.
It is therefore necessary to estimate the variability (dispersion)
of the observations and thereby assess their significance.
Three different measures of dispersion are commonly used:
(i) Range
(ii) Variance and
(iii) Standard deviation

Range
The range is the most straightforward measure of
dispersion.
The range is the difference between the largest and
the smallest values of the data in the
distribution.
If L is the largest value of the data, and
S is the smallest value of the data

Then Range = L - S
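
For instance, the range of the staff ages from the frequency distribution example can be found as follows. This is a minimal sketch; the variable names mirror the symbols L and S above.

```python
# Ages of the 30 staff from the frequency distribution example.
ages = [
    35, 60, 55, 55, 38, 45,
    45, 60, 55, 50, 55, 45,
    55, 45, 62, 55, 60, 42,
    45, 55, 50, 45, 55, 35,
    55, 60, 35, 55, 40, 65,
]

L = max(ages)  # largest value
S = min(ages)  # smallest value
data_range = L - S

print(L, S, data_range)
```

The oldest staff member is 65 and the youngest is 35, so the range is 30 years.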

Variance and Standard Deviation
Both the variance and the standard deviation are measures of
dispersion of the data using the mean as the point of reference.
Each deviation

\[ d_i = x_i - \bar{x} \]

is positive or negative depending on whether xi is greater
than or less than the mean value \bar{x}. Because we are interested
in how far the values xi lie from the mean, we do away with the
sign by squaring the quantity:

\[ \text{Variance} = \sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N} \]

\[ \text{Standard Deviation} = \sigma = \sqrt{\sigma^2} \]
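
The two formulas can be tried on the staff ages from the frequency distribution example. The sketch below divides by N, as in the variance formula here (the population form), and the variable names are illustrative.

```python
import math

# Ages of the 30 staff from the frequency distribution example.
ages = [
    35, 60, 55, 55, 38, 45,
    45, 60, 55, 50, 55, 45,
    55, 45, 62, 55, 60, 42,
    45, 55, 50, 45, 55, 35,
    55, 60, 35, 55, 40, 65,
]
N = len(ages)
mean = sum(ages) / N

# Squaring each deviation removes its sign before averaging.
squared_deviations = [(x - mean) ** 2 for x in ages]
variance = sum(squared_deviations) / N

# The standard deviation is the square root of the variance.
std_dev = math.sqrt(variance)

print(round(variance, 2), round(std_dev, 2))
```

For these data the variance is roughly 71.57 and the standard deviation roughly 8.46 years, so a typical age lies about eight and a half years from the mean of 50.4.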

THE END
Thank You
For Your
Attention