Measures of Central Tendency
Describing Data Numerically
Center and Location: Mean, Median, Mode, Weighted Mean
Other Measures of Location: Percentiles, Quartiles
Variation: Range, Interquartile Range, Variance, Standard Deviation, Coefficient of Variation
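Measures of Center and Location
Center and Location: Mean, Median, Mode, Weighted Mean
Mean (sample and population):
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}, \qquad \mu = \frac{\sum_{i=1}^{N} x_i}{N}$$
Weighted Mean (sample and population):
$$\bar{X}_W = \frac{\sum w_i x_i}{\sum w_i}, \qquad \mu_W = \frac{\sum w_i x_i}{\sum w_i}$$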
Mean (Arithmetic Average)
The most common measure of central tendency
Mean = sum of values divided by the number of values
Affected by extreme values (outliers)
Data 1, 2, 3, 4, 5: Mean = (1 + 2 + 3 + 4 + 5)/5 = 15/5 = 3
Data 1, 2, 3, 4, 10: Mean = (1 + 2 + 3 + 4 + 10)/5 = 20/5 = 4
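The effect of a single extreme value can be checked directly. A minimal sketch using the two data sets from this slide; the variable names are illustrative only.

```python
# Two data sets that differ only in the largest value.
without_outlier = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 3, 4, 10]

# Mean = sum of values divided by the number of values.
mean_a = sum(without_outlier) / len(without_outlier)
mean_b = sum(with_outlier) / len(with_outlier)

print(mean_a, mean_b)  # 3.0 4.0
```

Changing one value out of five moves the mean from 3 to 4, which is the sensitivity to outliers noted above.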
Mode
A measure of central tendency
Value that occurs most often
Not affected by extreme values
Used for either numerical or categorical data
There may be no mode
There may be several modes
[Figure: two dot plots, one with Mode = 5 and one with No Mode]
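The three cases (one mode, several modes, no mode) can be sketched as follows. The function name `modes` and all sample data are hypothetical, since the data behind the figure are not recoverable; "no mode" is taken here to mean that no value occurs more than once, and the input is assumed to be non-empty.

```python
from collections import Counter

def modes(values):
    """Return all most-frequent values, or [] when no value repeats."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []  # every value occurs once: treated as "no mode"
    return sorted(v for v, c in counts.items() if c == top)

# Hypothetical sample data for illustration only.
one_mode = modes([1, 3, 5, 5, 5, 9, 13])
no_mode = modes([0, 1, 2, 3, 4, 5, 6])
two_modes = modes(["red", "blue", "red", "blue", "green"])  # categorical data

print(one_mode, no_mode, two_modes)  # [5] [] ['blue', 'red']
```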
Weighted Mean
Used when values are grouped by frequency or relative
importance of each value to the overall total.
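Example: Sample of 26 Repair Projects
Days to Complete    Frequency
5                   4
6                   12
7                   8
8                   2
Weighted Mean Days to Complete:
$$\bar{X}_W = \frac{\sum w_i x_i}{\sum w_i} = \frac{(4 \times 5) + (12 \times 6) + (8 \times 7) + (2 \times 8)}{4 + 12 + 8 + 2} = \frac{164}{26} = 6.31 \text{ days}$$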
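The repair-project calculation can be reproduced in a few lines. A minimal sketch; `weighted_mean` is an illustrative helper name, not a library function.

```python
def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight

days = [5, 6, 7, 8]          # days to complete
frequency = [4, 12, 8, 2]    # number of projects for each value (the weights)

result = weighted_mean(days, frequency)
print(round(result, 2))  # 6.31
```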
Best Measure of Location
Mean is generally used, unless extreme values (outliers)
exist
Then median is often used, since the median is not
sensitive to extreme values.
Example: Median home prices may be reported for a
region – less sensitive to outliers
Matching Average to Data
Measure of Central Tendency: Mean
Appropriate to choose when …
•No situation precludes it
•First choice measure of central tendency
Should not be used when …
•Extreme scores
•Skewed distribution
•Ordinal scale
•Nominal scale
Measure of Central Tendency: Median
Appropriate to choose when …
•Extreme scores
•Skewed distribution
•Ordinal scale
Should not be used when …
•Nominal scale
Measure of Central Tendency: Mode
Appropriate to choose when …
•Nominal scales
•Discrete variables
•Describing shape of distribution
Should not be used when …
•Interval or ratio data, except to accompany mean or median
Shape of a Distribution
Describes how data is distributed
Symmetric or skewed
Left-Skewed: Mean < Median < Mode (longer tail extends to left)
Symmetric: Mean = Median = Mode
Right-Skewed: Mode < Median < Mean (longer tail extends to right)
Example
Five houses on a hill by the beach
House Prices:
$2,000,000
$500,000
$300,000
$100,000
$100,000
Summary Statistics
House Prices: $2,000,000; $500,000; $300,000; $100,000; $100,000 (Sum = $3,000,000)
Mean: $3,000,000/5 = $600,000
Median: middle value of ranked data = $300,000
Mode: most frequent value = $100,000
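The three summary statistics for the house prices can be checked with Python's standard `statistics` module. A short sketch; only the variable names are assumptions.

```python
from statistics import mean, median, mode

prices = [2_000_000, 500_000, 300_000, 100_000, 100_000]

price_mean = mean(prices)      # sum of values / number of values
price_median = median(prices)  # middle value of the ranked data
price_mode = mode(prices)      # most frequent value

print(price_mean, price_median, price_mode)  # 600000 300000 100000
```

The single $2,000,000 house pulls the mean well above the median, so Mode < Median < Mean, the right-skewed ordering.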
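Method's Name and Nature of Data
Direct Method
Ungrouped Data: $\bar{x} = \frac{\sum x}{n}$
Grouped Data: $\bar{x} = \frac{\sum fx}{\sum f}$
Indirect or Short-Cut Method
Ungrouped Data: $\bar{x} = A + \frac{\sum D}{n}$
Grouped Data: $\bar{x} = A + \frac{\sum fD}{\sum f}$
Method of Step-Deviation
Ungrouped Data: $\bar{x} = A + \frac{\sum u}{n} \times c$
Grouped Data: $\bar{x} = A + \frac{\sum fu}{\sum f} \times h$
Where
$x$ indicates values of the variable.
$n$ indicates number of values of $x$.
$f$ indicates frequency of different groups.
$A$ indicates assumed mean.
$D$ indicates deviation from $A$, i.e., $D = x - A$.
$u$ indicates step-deviation, $u = \frac{x - A}{c}$ or $u = \frac{x - A}{h}$, and $c$ indicates common divisor.
$h$ indicates size of class or class interval in case of grouped data.
$\sum$ indicates summation or addition.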
Example
The following data show the distance covered by 100 persons to perform their routine jobs.
Distance (Km)        0-10   10-20   20-30   30-40
Number of Persons    10     20      40      30
Calculate Arithmetic Mean by Step-Deviation
Method; also explain why it is better than direct
method in this particular case.
Solution
The given distribution is grouped data. The variable involved is the distance covered, while the number of persons represents the frequencies.
Distance Covered (Km)   Number of Persons (f)   Mid Points (x)   u = (x - A)/h   fu
0-10                    10                      5                -1              -10
10-20                   20                      15               0               0
20-30                   40                      25               +1              40
30-40                   30                      35               +2              60
Total                   ∑f = 100                                                 ∑fu = 90
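Now we will find the Arithmetic Mean as
$$\bar{x} = A + \frac{\sum fu}{\sum f} \times h$$
Where
$A = 15$, $\sum f = 100$, $\sum fu = 90$ and $h = 10$
$$\bar{x} = 15 + \frac{90}{100} \times 10 = 15 + 9 = 24 \text{ Km}$$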
Explanation:
The mid points (x) show that each mid point is a multiple of 5 and that consecutive mid points are 10 apart, i.e. the class size or interval (h). Dividing the deviations from the assumed mean by h therefore gives small whole numbers (-1, 0, +1, +2), so the Step-Deviation Method needs simpler arithmetic than the Direct Method in this case.
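The distance example can be verified numerically. A minimal sketch, assuming the assumed mean A = 15 and class width h = 10 implied by the u column of the table; `step_deviation_mean` is an illustrative name.

```python
def step_deviation_mean(midpoints, freqs, assumed_mean, class_width):
    """Grouped mean via A + (sum(f*u) / sum(f)) * h, with u = (x - A) / h."""
    u = [(x - assumed_mean) / class_width for x in midpoints]
    sum_fu = sum(f * ui for f, ui in zip(freqs, u))
    return assumed_mean + sum_fu / sum(freqs) * class_width

midpoints = [5, 15, 25, 35]   # mid points x of the classes 0-10 ... 30-40
freqs = [10, 20, 40, 30]      # number of persons f

step = step_deviation_mean(midpoints, freqs, assumed_mean=15, class_width=10)

# Cross-check with the direct method: sum(f*x) / sum(f).
direct = sum(f * x for f, x in zip(freqs, midpoints)) / sum(freqs)

print(round(step, 2), round(direct, 2))  # 24.0 24.0
```

Both methods give the same mean; the step-deviation route only changes how much arithmetic is done by hand.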
Example
The following frequency distribution shows the marks obtained by 50 students in statistics at a certain college. Find the arithmetic mean using (1) Direct Method (2) Short-Cut Method (3) Step-Deviation Method.
Marks 20-29 30-39 40-49 50-59 60-69 70-79 80-89
Frequency 1 5 12 15 9 6 2
Columns: fx (Direct Method); D = x - A and fD (Short-Cut Method); u = D/h and fu (Step-Deviation Method)
Marks    f    x      fx      D = x - A   fD     u = D/h   fu
20-29    1    24.5   24.5    -30         -30    -3        -3
30-39    5    34.5   172.5   -20         -100   -2        -10
40-49    12   44.5   534     -10         -120   -1        -12
50-59    15   54.5   817.5   0           0      0         0
60-69    9    64.5   580.5   10          90     1         9
70-79    6    74.5   447     20          120    2         12
80-89    2    84.5   169     30          60     3         6
Total    50          2745                20               2
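(1) Direct Method:
$$\bar{x} = \frac{\sum fx}{\sum f} = \frac{2745}{50} = 54.9 \text{ Marks}$$
(2) Short-Cut Method:
$$\bar{x} = A + \frac{\sum fD}{\sum f}$$
Where $D = x - A$; $A = 54.5$
$$\bar{x} = 54.5 + \frac{20}{50} = 54.5 + 0.4 = 54.9 \text{ Marks}$$
(3) Step-Deviation Method:
$$\bar{x} = A + \frac{\sum fu}{\sum f} \times h$$
Where $u = \frac{x - A}{h}$; $A = 54.5$; $h = 10$
$$\bar{x} = 54.5 + \frac{2}{50} \times 10 = 54.5 + 0.4 = 54.9 \text{ Marks}$$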
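That the three methods agree on the marks data can be confirmed with a short script. A sketch only; `three_means` is a hypothetical helper, and A = 54.5 and h = 10 are taken from the worked solution.

```python
def three_means(midpoints, freqs, A, h):
    """Return the grouped mean by the direct, short-cut and step-deviation methods."""
    n = sum(freqs)
    direct = sum(f * x for f, x in zip(freqs, midpoints)) / n
    short_cut = A + sum(f * (x - A) for f, x in zip(freqs, midpoints)) / n
    step_dev = A + sum(f * (x - A) / h for f, x in zip(freqs, midpoints)) / n * h
    return direct, short_cut, step_dev

midpoints = [24.5, 34.5, 44.5, 54.5, 64.5, 74.5, 84.5]  # class mid points x
freqs = [1, 5, 12, 15, 9, 6, 2]                         # frequencies f

direct, short_cut, step_dev = three_means(midpoints, freqs, A=54.5, h=10)
print(round(direct, 1), round(short_cut, 1), round(step_dev, 1))  # 54.9 54.9 54.9
```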
Exercise
Height (in)   Mid value (x)   Frequency (f)   fx
60-62         61              5
63-65         64              18
66-68         67              42
69-71         70              27
72-74         73              8
Total                         100
Exercise
Overtime (hrs)   No. of Employees (f)   Mid value (x)   d = (x - A)/class interval   f × d
10-15            11                     12.5
15-20            20                     17.5
20-25            35                     22.5 (say, this is the assumed mean A)
25-30            20                     27.5
30-35            8                      32.5
35-40            6                      37.5
Total            100
MEDIAN
The median of a finite list of numbers can be found by arranging all the observations from lowest value to highest value and picking the middle one.
For an odd number of observations:
Median = the ((n+1)/2)th observation
For an even number of observations:
Median = average of the (n/2)th and (n/2 + 1)th observations
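The odd and even rules translate directly into code. A minimal sketch; the function name `median_of` is illustrative, and the even-length sample is hypothetical.

```python
def median_of(values):
    """Median by ranking the data and applying the odd/even position rule."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("no observations")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # the ((n+1)/2)th observation
    return (ordered[mid - 1] + ordered[mid]) / 2  # average of (n/2)th and (n/2 + 1)th

odd_case = median_of([2_000_000, 500_000, 300_000, 100_000, 100_000])  # n = 5
even_case = median_of([11, 12, 13, 16])   # hypothetical data, n = 4

print(odd_case, even_case)  # 300000 12.5
```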
FORMULA for MEDIAN
$$\text{Median} = l + \frac{(n/2) - cf}{f} \times h$$
l = lower limit of the median class interval
cf = cumulative frequency of the class prior to the median class
f = frequency of the median class
h = width of the median class
n = total number of observations
Example
CALCULATE MEDIAN VALUE
Age of Autos   No. of Autos (f)   Cumulative Freq
0-4            13                 13
4-8            29                 42
8-12           48                 90 (median class)
12-16          22                 112
16-20          8                  120
n = 120
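One way to carry out this calculation is sketched below. It applies Median = l + ((n/2 - cf)/f) x h to the table above; `grouped_median` is an illustrative name, and the classes are assumed to be listed in ascending order.

```python
def grouped_median(classes, freqs):
    """Median of grouped data; classes is a list of (lower, upper) limits."""
    n = sum(freqs)
    if n == 0:
        raise ValueError("no observations")
    cf = 0  # cumulative frequency of the classes before the current one
    for (lower, upper), f in zip(classes, freqs):
        if cf + f >= n / 2:               # first class whose cumulative freq reaches n/2
            h = upper - lower             # width of the median class
            return lower + (n / 2 - cf) / f * h
        cf += f

classes = [(0, 4), (4, 8), (8, 12), (12, 16), (16, 20)]  # age of autos
freqs = [13, 29, 48, 22, 8]                               # number of autos

auto_median = grouped_median(classes, freqs)
print(auto_median)  # 9.5
```

Here n/2 = 60 falls in the 8-12 class (cf = 42, f = 48, h = 4), so the median is 8 + (18/48) x 4 = 9.5.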
Other Measures of Location
Percentiles
The pth percentile in a data array:
• p% are less than or equal to this value
• (100 – p)% are greater than or equal to this value
(where 0 ≤ p ≤ 100)
Quartiles
• 1st quartile = 25th percentile
• 2nd quartile = 50th percentile = median
• 3rd quartile = 75th percentile
Percentiles
The pth percentile in an ordered array of n values is the value in the ith position, where
$$i = \frac{p}{100}(n + 1)$$
Example: The 60th percentile in an ordered array of 19 values is the value in the 12th position:
$$i = \frac{p}{100}(n + 1) = \frac{60}{100}(19 + 1) = 12$$
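The position rule is a one-line computation. A small sketch; `percentile_position` is an illustrative name.

```python
def percentile_position(p, n):
    """1-based position i = (p / 100) * (n + 1) of the pth percentile."""
    if not 0 <= p <= 100:
        raise ValueError("p must be between 0 and 100")
    return p * (n + 1) / 100

position = percentile_position(60, 19)
print(position)  # 12.0
```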
Quartiles
Quartiles split the ranked data into 4 equal groups, each holding 25% of the values, with Q1, Q2 and Q3 as the boundaries
Example: Find the first quartile
Sample Data in Ordered Array: 11 12 13 16 16 17 18 21 22
(n = 9)
Q1 = 25th percentile, so find the (25/100)(9 + 1) = 2.5 position,
so use the value half way between the 2nd and 3rd values,
so Q1 = 12.5
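The "half way between" step generalises to linear interpolation at a fractional position. A minimal sketch, assuming the (n + 1) position rule from the percentile slide; `percentile_value` is an illustrative name, and clamping out-of-range positions to the ends of the array is an added assumption.

```python
def percentile_value(ordered, p):
    """Value at position p/100 * (n + 1), interpolating between neighbours."""
    n = len(ordered)
    position = p * (n + 1) / 100          # 1-based, may be fractional
    position = min(max(position, 1), n)   # assumption: clamp to the ends of the array
    lower = int(position)                 # whole part of the position
    fraction = position - lower
    if fraction == 0:
        return ordered[lower - 1]
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])

data = [11, 12, 13, 16, 16, 17, 18, 21, 22]  # already in ascending order

q1 = percentile_value(data, 25)
q2 = percentile_value(data, 50)
q3 = percentile_value(data, 75)
print(q1, q2, q3)  # 12.5 16 19.5
```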
QUARTILES
$$Q_i = l + \frac{i(n/4) - cf}{f} \times h, \qquad i = 1, 2, 3$$
l = lower limit of the quartile class interval
cf = cumulative frequency of the class prior to the quartile class
f = frequency of the quartile class
h = width of the quartile class
n = total number of observations
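The quartile formula for grouped data can be sketched in the same way as the median formula. The function name `grouped_quartile` is illustrative; the demonstration uses the auto-age table and computes only Q2, which should equal the median of that distribution.

```python
def grouped_quartile(classes, freqs, i):
    """Q_i = l + ((i * n / 4 - cf) / f) * h for grouped data, i in {1, 2, 3}."""
    if i not in (1, 2, 3):
        raise ValueError("i must be 1, 2 or 3")
    n = sum(freqs)
    target = i * n / 4
    cf = 0  # cumulative frequency before the current class
    for (lower, upper), f in zip(classes, freqs):
        if f > 0 and cf + f >= target:    # quartile class found
            return lower + (target - cf) / f * (upper - lower)
        cf += f
    raise ValueError("no observations")

classes = [(0, 4), (4, 8), (8, 12), (12, 16), (16, 20)]  # age of autos
freqs = [13, 29, 48, 22, 8]                               # number of autos

q2 = grouped_quartile(classes, freqs, 2)
print(q2)  # 9.5
```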
EXERCISE
CALCULATE Q1 AND P37 FOR THE
PREVIOUS EXAMPLE