Quantitative Methods: Describing Data Numerically
Describing Data Numerically
Central Tendency: Arithmetic Mean, Median, Mode
Variation: Range, Variance, Standard Deviation, Coefficient of Variation
Mean: arithmetic average
Median: midpoint of ranked values
Mode: most frequently observed value
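Arithmetic Mean
Population mean:
$$\mu = \frac{\sum_{i=1}^{N} x_i}{N} = \frac{x_1 + x_2 + \cdots + x_N}{N}$$
where x_1, ..., x_N are the population values and N is the population size.
Sample mean:
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}$$
where x_1, ..., x_n are the observed values and n is the sample size.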
Arithmetic Mean (continued)
Values 1, 2, 3, 4, 5 (plotted on a 0 to 10 axis): Mean = (1 + 2 + 3 + 4 + 5) / 5 = 15 / 5 = 3
Values 1, 2, 3, 4, 10 (plotted on a 0 to 10 axis): Mean = (1 + 2 + 3 + 4 + 10) / 5 = 20 / 5 = 4
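The two means above can be checked with a few lines of Python. This is a minimal sketch; the function name `arithmetic_mean` is an illustrative choice, not something the slides define.

```python
def arithmetic_mean(values):
    """Sum of the values divided by the number of values."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


baseline = [1, 2, 3, 4, 5]
with_extreme_value = [1, 2, 3, 4, 10]   # same data, but 5 replaced by 10

print(arithmetic_mean(baseline))            # 3.0
print(arithmetic_mean(with_extreme_value))  # 4.0
```

Changing a single value from 5 to 10 moves the mean from 3 to 4, which shows how strongly one extreme value can pull the arithmetic mean.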
Median
[Figure: two dot plots on a 0 to 10 axis; Median = 3 in both]
Note that (n + 1) / 2 is not the value of the median, only the position of the median in the ranked data.
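The difference between the position (n + 1) / 2 and the median value itself can be sketched as follows. The data set and the helper name `median_position` are hypothetical, chosen only for illustration.

```python
import statistics


def median_position(n):
    """1-based position of the median in the ranked data: (n + 1) / 2."""
    return (n + 1) / 2


data = [4, 1, 10, 3, 2]                    # hypothetical unordered sample
ranked = sorted(data)                      # [1, 2, 3, 4, 10]
position = median_position(len(ranked))    # 3.0, i.e. the third ranked value
median_value = statistics.median(data)     # the value found at that position

print(position, median_value)
```

When n is even the position is fractional (for n = 8 it is 4.5), and the median is taken as the average of the two middle ranked values, which is what `statistics.median` does.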
Mode
[Figure: dot plot on a 0 to 14 axis; Mode = 9]
[Figure: dot plot on a 0 to 6 axis; No Mode]
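A small sketch of finding the mode, including the "no mode" case where every value occurs exactly once. The slides do not list the plotted data, so both data sets below are hypothetical, and the function name `modes` is an assumption.

```python
from collections import Counter


def modes(values):
    """Return the most frequent value(s); an empty list means there is no mode."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []                      # every value occurs once: no mode
    return sorted(v for v, c in counts.items() if c == highest)


data_with_mode = [0, 2, 5, 9, 9, 9, 12, 14]   # hypothetical: 9 occurs most often
data_without_mode = [0, 1, 2, 3, 4, 5, 6]     # hypothetical: all values distinct

print(modes(data_with_mode))      # [9]
print(modes(data_without_mode))   # []
```

Returning a list rather than a single number also covers data with several modes, where two or more values tie for the highest frequency.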
Shape of a Distribution
Measures of shape: symmetric or skewed
Left-Skewed
Symmetric: Mean = Median
Right-Skewed
Measures of Variability
Variation: Range, Variance, Standard Deviation, Coefficient of Variation
Range
Example:
[Figure: dot plot on a 0 to 14 axis] Range = 14 - 1 = 13
[Figure: two dot plots with different distributions, axis ticks up to 12] Range = 12 - 7 = 5 in both
Sensitive to outliers
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5
Range = 5 - 1 = 4
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120
Range = 120 - 1 = 119
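The outlier example above can be reproduced directly. A minimal sketch, with the function name `data_range` chosen for illustration:

```python
def data_range(values):
    """Range = largest observation minus smallest observation."""
    return max(values) - min(values)


# The two data sets from the example: identical except for the last value.
typical = [1] * 11 + [2] * 8 + [3] * 4 + [4, 5]
with_outlier = [1] * 11 + [2] * 8 + [3] * 4 + [4, 120]

print(data_range(typical))        # 4
print(data_range(with_outlier))   # 119
```

Only the two extreme observations enter the calculation, so a single unusual value changes the range from 4 to 119 while the other 24 values are unchanged.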
Quartiles
25% | Q1 | 25% | Q2 | 25% | Q3 | 25%
The first quartile, Q1, is the value for which 25% of the
observations are smaller and 75% are larger
Q2 is the same as the median (50% are smaller, 50% are
larger)
Only 25% of the observations are greater than the third
quartile
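Quartiles can be computed with Python's standard library, as sketched below. Textbooks and software use several slightly different interpolation conventions, so the exact Q1 and Q3 values depend on the method; `statistics.quantiles` with `method="exclusive"` is one common choice and is used here as an assumption, not as the convention of these slides. The data set is an illustrative eight-value sample.

```python
import statistics

data = [10, 12, 14, 15, 17, 18, 18, 24]   # illustrative sample, n = 8

# n=4 asks for the three cut points that split the ranked data into quarters.
q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")

print(q1, q2, q3)
print(q2 == statistics.median(data))   # Q2 coincides with the median
```

Whatever the interpolation convention, Q2 should agree with the median, which makes a convenient sanity check.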
Population Variance
Population variance:
$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$$
Where
μ = population mean
N = population size
x_i = ith value of the variable x
Sample Variance
Sample variance:
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$
Where
x̄ = arithmetic mean
n = sample size
x_i = ith value of the variable x
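Population standard deviation:
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$$
Sample standard deviation:
$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$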
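The only difference between the population and sample formulas is the divisor: N for a complete population, n - 1 for a sample. A minimal sketch, with illustrative function names and a small illustrative data set:

```python
import math


def population_variance(values):
    """Average squared deviation from the mean, dividing by N."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


def sample_variance(values):
    """Sum of squared deviations from the mean, dividing by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two observations")
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


data = [1, 2, 3, 4, 5]   # mean 3, squared deviations sum to 10

print(population_variance(data))        # 2.0
print(sample_variance(data))            # 2.5
print(math.sqrt(sample_variance(data))) # sample standard deviation
```

The standard library offers the same quantities as `statistics.pvariance`, `statistics.variance`, `statistics.pstdev`, and `statistics.stdev`.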
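Calculation Example: Sample Standard Deviation
Sample data (x_i): 10, 12, 14, 15, 17, 18, 18, 24
n = 8, Mean = x̄ = 16
$$s = \sqrt{\frac{(10 - 16)^2 + (12 - 16)^2 + (14 - 16)^2 + \cdots + (24 - 16)^2}{8 - 1}} = \sqrt{\frac{130}{7}} \approx 4.3095$$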
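The calculation for this sample can be recomputed step by step so that each intermediate quantity (mean, sum of squared deviations, divisor) can be checked. This is a sketch using only the standard library; the variable names are illustrative.

```python
import math
import statistics

sample = [10, 12, 14, 15, 17, 18, 18, 24]
n = len(sample)                                   # 8

mean = sum(sample) / n                            # 128 / 8 = 16.0
squared_deviations = [(x - mean) ** 2 for x in sample]
sum_of_squares = sum(squared_deviations)          # 36+16+4+1+1+4+4+64 = 130.0

s = math.sqrt(sum_of_squares / (n - 1))           # sqrt(130 / 7)

print(mean, sum_of_squares, round(s, 4))          # 16.0 130.0 4.3095
print(math.isclose(s, statistics.stdev(sample)))  # cross-check: True
```

The squared deviations for this data sum to 130, so the sample standard deviation is the square root of 130 / 7.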
Measuring variation
Small standard deviation
[Figure: three dot plots on an 11 to 21 axis, each with the same mean]
Data A: Mean = 15.5, s = 3.338
Data B: Mean = 15.5, s = 0.926
Data C: Mean = 15.5, s = 4.570
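For bell-shaped distributions:
About 68% of the observations lie within 1 standard deviation of the mean
About 95% lie within 2 standard deviations of the mean
About 99.7% lie within 3 standard deviations of the mean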
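The figures 68%, 95%, and 99.7% are the approximate shares of a normal (bell-shaped) distribution that lie within 1, 2, and 3 standard deviations of the mean. The sketch below checks them against the standard normal distribution using `statistics.NormalDist`; the helper name `share_within` is an illustrative assumption.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)


def share_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {100 * share_within(k):.2f}%")
```

The percentages are exact only for a normal distribution; for real data that is merely roughly bell-shaped they serve as an approximation.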
Coefficient of Variation
$$CV = \left( \frac{s}{\bar{x}} \right) \cdot 100\%$$
Comparing Coefficient of Variation
Stock A:
Average price last year = Rs.50
Standard deviation = Rs.5
$$CV_A = \left( \frac{s}{\bar{x}} \right) \cdot 100\% = \frac{5}{50} \cdot 100\% = 10\%$$
Stock B:
Average price last year = Rs.100
Standard deviation = Rs.5
$$CV_B = \left( \frac{s}{\bar{x}} \right) \cdot 100\% = \frac{5}{100} \cdot 100\% = 5\%$$
Both stocks have the same standard deviation, but stock B is less variable relative to its price
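The stock comparison can be expressed as a short function. A minimal sketch; the function name `coefficient_of_variation` and its argument names are illustrative.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation as a percentage of the mean."""
    if mean == 0:
        raise ValueError("the coefficient of variation is undefined for a zero mean")
    return 100 * std_dev / mean


cv_a = coefficient_of_variation(std_dev=5, mean=50)    # Stock A
cv_b = coefficient_of_variation(std_dev=5, mean=100)   # Stock B

print(cv_a, cv_b)   # 10.0 5.0
```

Because the result is a percentage with no units, it allows a comparison of variability between series measured at different price levels.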
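Population covariance:
$$Cov(x, y) = \sigma_{xy} = \frac{\sum_{i=1}^{N} (x_i - \mu_x)(y_i - \mu_y)}{N}$$
Sample covariance:
$$Cov(x, y) = s_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$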
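Interpreting Covariance
Cov(x,y) > 0: x and y tend to move in the same direction
Cov(x,y) < 0: x and y tend to move in opposite directions
Cov(x,y) = 0: x and y have no linear relationship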
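The sign of the covariance can be illustrated with a small sketch of the sample covariance, the sum of products of paired deviations divided by n - 1. The data sets and the function name `sample_covariance` are hypothetical.

```python
def sample_covariance(xs, ys):
    """Sum of products of paired deviations from the means, divided by n - 1."""
    if len(xs) != len(ys):
        raise ValueError("x and y must have the same number of observations")
    n = len(xs)
    if n < 2:
        raise ValueError("sample covariance needs at least two pairs")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)


x = [1, 2, 3, 4, 5]
y_rising = [2, 4, 6, 8, 10]     # hypothetical: y increases with x
y_falling = [10, 8, 6, 4, 2]    # hypothetical: y decreases as x increases

print(sample_covariance(x, y_rising))    # 5.0  (positive)
print(sample_covariance(x, y_falling))   # -5.0 (negative)
```

The magnitude of the covariance depends on the units of x and y, so only its sign is easy to interpret directly.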
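Coefficient of Correlation
Population correlation coefficient:
$$\rho = \frac{Cov(x, y)}{\sigma_X \sigma_Y}$$
Sample correlation coefficient:
$$r = \frac{Cov(x, y)}{s_X s_Y}$$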
Features of Correlation Coefficient, r
[Figure: scatter plots of Y against X illustrating r = -1, r = -0.6, r = 0, r = +0.3, r = +1, and a second case with r = 0]
r = 0.733
There is a relatively strong positive linear relationship between test score #1 and test score #2
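A sample correlation coefficient is the sample covariance divided by the product of the two sample standard deviations. The sketch below follows that definition. The underlying test scores are not given in the slides, so the paired scores here are hypothetical and are not expected to reproduce r = 0.733; the function name `sample_correlation` is also an assumption.

```python
import math


def sample_correlation(xs, ys):
    """r = sample covariance / (s_x * s_y), computed from paired observations."""
    if len(xs) != len(ys):
        raise ValueError("x and y must have the same number of observations")
    n = len(xs)
    if n < 2:
        raise ValueError("correlation needs at least two pairs")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    s_x = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    if s_x == 0 or s_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / (s_x * s_y)


# Hypothetical paired scores for two tests taken by the same six students.
test_1 = [62, 70, 75, 81, 88, 93]
test_2 = [65, 68, 80, 78, 90, 89]

r = sample_correlation(test_1, test_2)
print(round(r, 3))
```

Unlike the covariance, r has no units and always lies between -1 and +1, so values near +1 or -1 indicate a strong linear relationship and values near 0 a weak one.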