Statistics
STATISTICS is a branch of mathematics concerned with the organization, analysis and interpretation of numerical data.
1. Descriptive Statistics – statistical method concerned with describing the properties and characteristics of a set of
data (e.g. gender, age group, percentage of literacy, average family income)
2. Inferential Statistics – statistical method concerned with analyzing data to arrive at predictions, inferences,
interpretations or conclusions about the entire population
STATISTICAL TERMS
SIGMA NOTATION (∑) – summation; the most frequently used form of notation in statistics, which abbreviates the sum of the
quantities in a given range
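As a brief illustration of the notation (the index $i$ and the sample values 2, 4, 6, 8 are illustrative, not taken from the notes):

```latex
\sum_{i=1}^{n} x_i = x_1 + x_2 + \dots + x_n
\qquad \text{e.g.} \qquad
\sum_{i=1}^{4} x_i = 2 + 4 + 6 + 8 = 20
```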
COLLECTION OF DATA
TYPES OF DATA
STATISTICAL PRESENTATION
STATISTICAL GRAPHS
• Bar graph – shows the relative sizes of data; the bars, drawn proportional to the data, may be horizontal or vertical
• Line graph – shows the relationship between two or more sets of continuous data
• Circle graph – best used to compare parts to a whole where the size of each sector of the circle is proportional to
the size of the category that it represents
FREQUENCY DISTRIBUTION – the tabular presentation of data showing the frequency of each score
29 27 28 27 34 29 27 27 28
25 23 35 25 29 33 23 27 33
27 22 40 27 21 29 22 25 29
25 21 20 21 23 25 30 20 28
30 29 28 30 27 27 27 19 30
1. Find the RANGE (𝑟 ) (difference between the highest score and the lowest score)
𝑟 = ℎ𝑖𝑔ℎ𝑒𝑠𝑡 𝑠𝑐𝑜𝑟𝑒 − 𝑙𝑜𝑤𝑒𝑠𝑡 𝑠𝑐𝑜𝑟𝑒 = 40 − 19 = 𝟐𝟏
2. Decide on the number of CLASSES (grouping or category; ideally between 5 and 15)
*assume that the desired number of classes is 7
3. Determine CLASS INTERVAL (size of each class rounded to the nearest integer)
$i = \dfrac{\text{range}}{\text{desired number of classes}} = \dfrac{21}{7} = \mathbf{3}$
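Steps 1 to 3 can be checked with a short Python sketch. The variable names are illustrative; the scores are the 45 values listed above, and the desired number of classes is assumed to be 7 as in the text.

```python
# The 45 scores from the frequency-distribution example.
scores = [
    29, 27, 28, 27, 34, 29, 27, 27, 28,
    25, 23, 35, 25, 29, 33, 23, 27, 33,
    27, 22, 40, 27, 21, 29, 22, 25, 29,
    25, 21, 20, 21, 23, 25, 30, 20, 28,
    30, 29, 28, 30, 27, 27, 27, 19, 30,
]

# Step 1: range = highest score - lowest score.
score_range = max(scores) - min(scores)

# Step 2: desired number of classes (assumed, as in the text).
desired_classes = 7

# Step 3: class interval, rounded to the nearest integer.
class_interval = round(score_range / desired_classes)

print(score_range, class_interval)  # 21 3
```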
5. Determine the CLASS FREQUENCY (𝑓) for each class by counting the tally.
• Class Boundaries – described as true limits because they are a more precise expression of the class limits
a. Lower Boundary (LB) – 0.5 less than the lower limit (e.g. 25 – 0.5 = 24.5)
b. Upper Boundary (UB) – 0.5 more than the upper limit (e.g. 27 + 0.5 = 27.5)
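The tally and the class boundaries can be sketched in Python as follows. This is a minimal sketch that assumes the first class starts at the lowest score (19) and that every class has a width of 3; the function name `frequency_table` is hypothetical.

```python
def frequency_table(scores, interval):
    """Return one (lower limit, upper limit, LB, UB, frequency) row per class."""
    rows = []
    lower = min(scores)  # assumption: the first class starts at the lowest score
    highest = max(scores)
    while lower <= highest:
        upper = lower + interval - 1
        frequency = sum(1 for s in scores if lower <= s <= upper)
        # Boundaries lie 0.5 below the lower limit and 0.5 above the upper limit.
        rows.append((lower, upper, lower - 0.5, upper + 0.5, frequency))
        lower += interval
    return rows


scores = [
    29, 27, 28, 27, 34, 29, 27, 27, 28,
    25, 23, 35, 25, 29, 33, 23, 27, 33,
    27, 22, 40, 27, 21, 29, 22, 25, 29,
    25, 21, 20, 21, 23, 25, 30, 20, 28,
    30, 29, 28, 30, 27, 27, 27, 19, 30,
]

table = frequency_table(scores, 3)
for lower, upper, lb, ub, f in table:
    print(f"{lower}-{upper}  {lb}-{ub}  f={f}")
```

For the class 25–27 this gives the boundaries 24.5 and 27.5 and a frequency of 15. Under these assumptions the loop produces eight classes (19–21 up to 40–42), because seven classes of width 3 starting at 19 end at 39.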
HISTOGRAM – bar graph-like representation of a frequency distribution; the height of each bar corresponds to the class
frequency and the width corresponds to the class interval
FREQUENCY POLYGON – line graph where the frequency of each class is plotted against class mark
OGIVE – line graph where the cumulative frequency of each class is plotted against the corresponding class boundary; the
intersection of the less-than ogive and the greater-than ogive gives the MEDIAN of the data
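The points plotted on the two ogives can be tabulated with a short Python sketch. It assumes class boundaries starting at 18.5 and spaced 3 apart (consistent with the boundaries 24.5 and 27.5 above); the function name `ogive_points` is hypothetical.

```python
def ogive_points(scores, first_boundary, interval):
    """Return (boundary, less-than cf, greater-than cf) for each class boundary."""
    points = []
    boundary = first_boundary
    while True:
        less_than = sum(1 for s in scores if s < boundary)
        greater_than = sum(1 for s in scores if s > boundary)
        points.append((boundary, less_than, greater_than))
        if less_than == len(scores):  # every score lies below this boundary
            break
        boundary += interval
    return points


scores = [
    29, 27, 28, 27, 34, 29, 27, 27, 28,
    25, 23, 35, 25, 29, 33, 23, 27, 33,
    27, 22, 40, 27, 21, 29, 22, 25, 29,
    25, 21, 20, 21, 23, 25, 30, 20, 28,
    30, 29, 28, 30, 27, 27, 27, 19, 30,
]

points = ogive_points(scores, 18.5, 3)
for boundary, less_than, greater_than in points:
    print(boundary, less_than, greater_than)
```

The two curves cross where the less-than and greater-than cumulative frequencies are equal, that is, at half the total frequency. For these scores the counts are (11, 34) at 24.5 and (26, 19) at 27.5, so the ogives cross inside the class 25–27.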
• Measures of Central Tendency / Measures of Average – statistics that serve as representatives of the data; each is a
quantitative representation of the set of data under investigation and lies near the center of the set of data
• Measures of Dispersion / Measures of Spread – statistics that indicate how closely clustered or widely spread the data
are around the average
1. Mean – arithmetic average; the sum of the quantities divided by the number of quantities under consideration
a. Mean of Ungrouped Data
$\bar{x} = \dfrac{x_1 + x_2 + \dots + x_n}{n}$
b. Mean of Grouped Data Using Class Mark
$\bar{x} = \dfrac{\sum fx}{\sum f}$, where $x$ is the class mark and $f$ is the class frequency
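Both mean formulas can be sketched in Python. The function names and the small data sets below are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def mean_ungrouped(values):
    """Sum of the quantities divided by the number of quantities."""
    return sum(values) / len(values)


def mean_grouped(class_marks, frequencies):
    """Sum of f*x divided by the sum of f, where x is the class mark."""
    total_fx = sum(f * x for x, f in zip(class_marks, frequencies))
    return total_fx / sum(frequencies)


# Hypothetical ungrouped data.
values = [2, 4, 6, 8]
print(mean_ungrouped(values))  # 5.0

# Hypothetical grouped data: class marks with their frequencies.
class_marks = [20, 23, 26]
frequencies = [2, 3, 5]
print(mean_grouped(class_marks, frequencies))  # 23.9
```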
2. Median – middle value in a set of quantities; the value that separates an ordered set of data in two equal parts
a. Median of Ungrouped Data
• If n is odd, the median is the $\left(\dfrac{n+1}{2}\right)$th quantity
• If n is even, the median is the mean of the $\left(\dfrac{n}{2}\right)$th and $\left(\dfrac{n}{2} + 1\right)$th quantities
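The two cases can be sketched in Python as follows. The function name and sample values are hypothetical; positions in the rule are 1-based, while Python lists are 0-based, hence the `- 1` offsets.

```python
def median_ungrouped(values):
    """Median of ungrouped data using the odd/even position rule."""
    ordered = sorted(values)  # the rule applies to ordered data
    n = len(ordered)
    if n % 2 == 1:
        # Odd n: the ((n + 1) / 2)th quantity.
        return ordered[(n + 1) // 2 - 1]
    # Even n: mean of the (n / 2)th and (n / 2 + 1)th quantities.
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


print(median_ungrouped([7, 3, 5]))     # 5
print(median_ungrouped([8, 2, 6, 4]))  # 5.0
```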
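b. Median of Grouped Data
$\tilde{x} = lb_{me} + \left[\dfrac{\frac{\sum f}{2} - cf}{f_{me}}\right] i$
where $lb_{me}$ is the lower boundary of the median class, $cf$ is the cumulative frequency before the median class, $f_{me}$ is the frequency of the median class and $i$ is the class interval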
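The grouped-median formula can be sketched as a Python function. The function and parameter names are illustrative, and `cf` is assumed to be the cumulative frequency below the median class; the values passed in are the ones used in the worked example that follows.

```python
def median_grouped(lb_me, total_f, cf, f_me, i):
    """Median of grouped data.

    lb_me   -- lower boundary of the median class
    total_f -- sum of all class frequencies
    cf      -- cumulative frequency below the median class (assumed meaning)
    f_me    -- frequency of the median class
    i       -- class interval
    """
    return lb_me + ((total_f / 2 - cf) / f_me) * i


print(round(median_grouped(24.5, 45, 11, 15, 3), 2))  # 26.8
```

The worked example rounds the quotient 11.5/15 to 0.77 before multiplying, which is why it reports 26.81 rather than 26.8.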
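EXAMPLE:
A. MEAN
$\bar{x} = \dfrac{\sum fx}{\sum f} = \dfrac{1204}{45} = \mathbf{26.76}$
B. MEDIAN
$\tilde{x} = lb_{me} + \left[\dfrac{\frac{\sum f}{2} - cf}{f_{me}}\right] i$
$\tilde{x} = 24.5 + \left[\dfrac{\frac{45}{2} - 11}{15}\right] 3$
$\tilde{x} = 24.5 + \left[\dfrac{22.5 - 11}{15}\right] 3$
$\tilde{x} = 24.5 + \left[\dfrac{11.5}{15}\right] 3$
$\tilde{x} = 24.5 + [0.77]3$
$\tilde{x} = 24.5 + 2.31$
$\tilde{x} = \mathbf{26.81}$
C. MODE
$\hat{x} = lb_{mo} + \left(\dfrac{D_1}{D_1 + D_2}\right) i$
where $D_1$ is the frequency of the modal class minus the frequency of the class below it, and $D_2$ is the frequency of the modal class minus the frequency of the class above it. From the tally of the scores, the modal class 25–27 has a frequency of 15, the class 22–24 has a frequency of 5 and the class 28–30 has a frequency of 14, so $D_1 = 15 - 5 = 10$ and $D_2 = 15 - 14 = 1$.
$\hat{x} = 24.5 + \left(\dfrac{10}{10 + 1}\right) 3$
$\hat{x} = 24.5 + \left(\dfrac{10}{11}\right) 3$
$\hat{x} = 24.5 + (0.91)3$
$\hat{x} = 24.5 + 2.73$
$\hat{x} = \mathbf{27.23}$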
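The grouped-mode formula used in the example can also be sketched as a Python function. This sketch assumes the usual convention that D1 is the modal class frequency minus the frequency of the class below it and D2 is the modal class frequency minus the frequency of the class above it. The function name, parameter names and the numbers passed in are hypothetical.

```python
def mode_grouped(lb_mo, f_below, f_mo, f_above, i):
    """Mode of grouped data.

    lb_mo   -- lower boundary of the modal class
    f_below -- frequency of the class just below the modal class
    f_mo    -- frequency of the modal class
    f_above -- frequency of the class just above the modal class
    i       -- class interval
    """
    d1 = f_mo - f_below  # assumed convention for D1
    d2 = f_mo - f_above  # assumed convention for D2
    return lb_mo + (d1 / (d1 + d2)) * i


# Hypothetical table: modal class 10-14 (LB 9.5) with f = 10,
# neighbouring classes with f = 4 (below) and f = 7 (above), interval 5.
print(round(mode_grouped(9.5, 4, 10, 7, 5), 2))  # 12.83
```

Because the class above has the larger of the two neighbouring frequencies in this hypothetical table, the estimate is pulled toward the upper part of the modal class.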