FRM Part 1: Basic Statistics
Learning Objectives
After completing this reading you should be
able to:
Interpret and apply the mean, standard deviation, and variance of a random
variable.
Calculate the mean, standard deviation, and variance of a discrete random
variable.
Interpret and calculate the expected value of a discrete random variable.
Calculate and interpret the covariance and correlation between two random
variables.
Calculate the mean and variance of sums of variables.
Describe the four central moments of a statistical variable or distribution: mean,
variance, skewness, and kurtosis.
Interpret the skewness and kurtosis of a statistical distribution, and
interpret the concepts of coskewness and cokurtosis.
Describe and interpret the best linear unbiased estimator.
Measures of Central Tendency
Also referred to as measures of central location, measures of central
tendency are summary measures that attempt to describe a whole set of
data with a single value that represents the middle or centre of its
distribution.
Mean
Median
Mode
Mean
Is the sum of all the measurements divided by the number of
measurements in the set.
o It is the most popular and widely used measure of central tendency.
o Denoted by μ (read “mu”) in a population, and x̄ (read “x bar”) in a
sample.
Example
Consider this dataset showing a sample of scores of FRM candidates in a
mock exam:
Score   Frequency
60      4
64      3
68      2
75      1
72      1

$$\text{Mean} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{60 \times 4 + 64 \times 3 + 68 \times 2 + 75 \times 1 + 72 \times 1}{11} = 65$$
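The same calculation can be sketched in Python. The dictionary name `scores` and the variable names are illustrative choices, not part of the original reading; the data are the score/frequency pairs from the example.

```python
# Frequency table from the example: score -> number of candidates
scores = {60: 4, 64: 3, 68: 2, 75: 1, 72: 1}

# n is the number of observations, not the number of distinct scores
n = sum(scores.values())

# Sum of all observations: each score counted as often as it occurs
total = sum(score * freq for score, freq in scores.items())

mean = total / n
print(n, total, mean)  # 11 715 65.0
```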
Median
Is the middle value in a distribution when the values are arranged in
ascending or descending order.
It divides the distribution in half.
o If the number of observations is odd, median = middle value.
o If the number of observations is even, median is the average of the two
middle values.
Example
Consider this dataset showing a sample of scores of FRM candidates in a
mock exam:
Score   Frequency
60      4
64      3
68      2
75      1
72      1

Ordered data: [60, 60, 60, 60, 64, 64, 64, 68, 68, 72, 75]
Median = 64 (the 6th of the 11 ordered values)
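As a quick check of the median, the frequency table can be expanded into the ordered list and passed to the standard library. This is a minimal sketch; the names `freq` and `data` are illustrative.

```python
import statistics

# Frequency table from the example: score -> number of candidates
freq = {60: 4, 64: 3, 68: 2, 75: 1, 72: 1}

# Expand into individual observations and sort them
data = sorted(score for score, count in freq.items() for _ in range(count))

# With 11 observations (odd), the median is the 6th ordered value
median = statistics.median(data)
print(data)    # [60, 60, 60, 60, 64, 64, 64, 68, 68, 72, 75]
print(median)  # 64
```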
Mode
It’s the most commonly occurring value in a distribution.
o A distribution can be unimodal (one mode), bimodal (two modes),
trimodal (three modes) or multimodal (more than three modes).
Example
Consider this dataset showing a sample of scores of FRM candidates in a
mock exam:
Score   Frequency
60      4
64      3
68      2
75      1
72      1

Ordered data: [60, 60, 60, 60, 64, 64, 64, 68, 68, 72, 75]
Mode = 60 (it occurs 4 times, more often than any other score)
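The mode can be read straight off the frequency table as the score with the largest count. The sketch below also uses `statistics.multimode`, which returns every value tied for the highest count, so it would reveal a bimodal or trimodal dataset. Variable names are illustrative.

```python
import statistics

# Frequency table from the example: score -> number of candidates
freq = {60: 4, 64: 3, 68: 2, 75: 1, 72: 1}

# Score with the highest frequency
mode = max(freq, key=freq.get)

# Expand to raw observations and list all values tied for most frequent
data = [score for score, count in freq.items() for _ in range(count)]
modes = statistics.multimode(data)

print(mode)   # 60
print(modes)  # [60]  -> a single mode, so the data are unimodal
```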
Population and Sample Data
Sample statistics and population statistics should be distinguished.
o A population is the complete set of all the elements of interest to
the researcher.
Examples: the number of people in a country, the number of hedge funds
in the U.S., or the total number of FRM candidates in a given year.
Where the letter “E” is known as the expectations operator and is used to indicate the calculation
of a probability-weighted average; and
Xi represents the ith observation.
Expectations
Example
The probability distribution of the return earned by a newly established hedge
fund is as follows:
Probability Return(%)
20% 8
40% 15
30% 20
10% 30
Calculate the expected return.
Solution
The expected return is simply a weighted average of each possible return,
where the weights are the probabilities of each possible outcome.
$$E(R) = 0.2 \times 8 + 0.4 \times 15 + 0.3 \times 20 + 0.1 \times 30 = 16.6\%$$
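The probability-weighted average in this solution can be sketched as follows. The list names are illustrative; the probabilities and returns are those given in the example.

```python
# Probability distribution of the hedge fund's return (in %)
probs = [0.2, 0.4, 0.3, 0.1]
returns = [8, 15, 20, 30]

# Expected value: sum of (probability x outcome) over all outcomes
expected_return = sum(p * r for p, r in zip(probs, returns))

print(round(expected_return, 2))  # 16.6
```

Checking that the probabilities sum to 1 before computing the expectation is a cheap way to catch a mistyped distribution.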
(II) Expectations of Variance
The variance $\sigma^2$ is a measure of the spread of a distribution about its
mean.
Formally, variance is defined as:
$$\mathrm{var}(X) = E\left[(X - E(X))^2\right]$$
i.e., the expected value (or mean) of the squared deviation of X from its
mean.
Note that this can also be written as $E\left[(X - \mu)^2\right]$ because E(X) = μ.
Expanding the above will prove that:
$$\mathrm{var}(X) = E(X^2) - \mu^2 \quad \text{… Important Result}$$
The standard deviation, $\sigma$, is the positive square root of the variance, hence the
term sometimes used: “root mean squared deviation”.
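For reference, the expansion behind the important result uses only the linearity of the expectations operator and the fact that μ = E(X) is a constant:

```latex
\begin{aligned}
\mathrm{var}(X) &= E\left[(X - \mu)^2\right] \\
                &= E\left[X^2 - 2\mu X + \mu^2\right] \\
                &= E(X^2) - 2\mu\,E(X) + \mu^2 \\
                &= E(X^2) - 2\mu^2 + \mu^2 \\
                &= E(X^2) - \mu^2
\end{aligned}
```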
Example
Going back to the probability distribution of the return earned by a newly
established hedge fund:
Probability Return(%)
20% 8
40% 15
30% 20
10% 30
Calculate the standard deviation of return.
Solution
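$$\mathrm{Var}(R) = E(R^2) - \mu^2 = E(R^2) - [E(R)]^2$$
$$E(R) = 0.2 \times 8 + 0.4 \times 15 + 0.3 \times 20 + 0.1 \times 30 = 16.6$$
$$E(R^2) = 0.2 \times 8^2 + 0.4 \times 15^2 + 0.3 \times 20^2 + 0.1 \times 30^2 = 312.8$$
$$\mathrm{Var}(R) = 312.8 - 16.6^2 = 37.24$$
$$\text{Standard deviation of return} = 37.24^{0.5} \approx 6.1\%$$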
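The steps of this solution can be reproduced in a few lines. As a cross-check, the sketch also computes the variance directly from its definition (the probability-weighted squared deviation from the mean); both routes should agree. Variable names are illustrative.

```python
import math

# Probability distribution of the hedge fund's return (in %)
probs = [0.2, 0.4, 0.3, 0.1]
returns = [8, 15, 20, 30]

mean = sum(p * r for p, r in zip(probs, returns))          # E(R)
mean_sq = sum(p * r ** 2 for p, r in zip(probs, returns))  # E(R^2)

# Shortcut formula: Var(R) = E(R^2) - [E(R)]^2
variance = mean_sq - mean ** 2

# Definition: Var(R) = E[(R - E(R))^2]
variance_by_definition = sum(p * (r - mean) ** 2 for p, r in zip(probs, returns))

std_dev = math.sqrt(variance)
print(round(mean_sq, 2), round(variance, 2), round(std_dev, 2))  # 312.8 37.24 6.1
```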
Covariance
Variance and standard deviation are good tools but they measure the
dispersion, or volatility, of only one variable.
o However, in finance we are usually interested in the relative performance
of two assets.
o We may want to establish the relationship between the return for Stock X
and Stock Y or the relationship between the performance of the S&P 500
and that of the tech industry.
Covariance makes all that possible.
Covariance is the expected value of the product of the deviations of the two
random variables from their respective expected values.
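If we have two assets, i and j, with returns $R_i$ and $R_j$ respectively,
$$\mathrm{Cov}(R_i, R_j) = E\big[(R_i - E(R_i))(R_j - E(R_j))\big] = \sum_{s=1}^{n} P_s \,[R_{i,s} - E(R_i)]\,[R_{j,s} - E(R_j)]$$
where $P_s$ is the probability of scenario s, and $R_{i,s}$ and $R_{j,s}$ are the returns of the two assets in that scenario.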
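A minimal sketch of the covariance calculation follows. The three scenarios and their probabilities and returns are hypothetical numbers chosen for illustration; they do not come from the reading. The last step divides the covariance by the product of the two standard deviations to obtain the correlation, which always lies between −1 and +1.

```python
import math

# Hypothetical joint distribution (illustrative only):
# each scenario has a probability and a return (%) for both assets
probs = [0.25, 0.50, 0.25]
r_i = [4, 8, 12]
r_j = [2, 6, 14]

# Expected return of each asset
e_i = sum(p * x for p, x in zip(probs, r_i))
e_j = sum(p * y for p, y in zip(probs, r_j))

# Covariance: probability-weighted product of the two deviations
cov = sum(p * (x - e_i) * (y - e_j) for p, x, y in zip(probs, r_i, r_j))

# Variances, needed to scale the covariance into a correlation
var_i = sum(p * (x - e_i) ** 2 for p, x in zip(probs, r_i))
var_j = sum(p * (y - e_j) ** 2 for p, y in zip(probs, r_j))
corr = cov / math.sqrt(var_i * var_j)

print(e_i, e_j, cov)   # 8.0 7.0 12.0
print(round(corr, 4))  # 0.9733
```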
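Interpretation
A positive correlation indicates a positive linear relationship; the closer
the value is to +1, the stronger it is, and +1 indicates a perfect positive
linear relationship.
A negative correlation indicates an inverse linear relationship; the closer
the value is to −1, the stronger it is, and −1 indicates a perfect inverse
linear relationship.
Zero correlation indicates no linear relationship.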
Portfolio variance
Assume that we have two securities whose random returns
are $X_A$ and $X_B$, with means $\mu_A$ and $\mu_B$ and standard deviations
$\sigma_A$ and $\sigma_B$. Then the variance of $X_A$ plus $X_B$ can be computed as
follows:
$$\sigma^2_{A+B} = \sigma^2_A + \sigma^2_B + 2\,\rho_{AB}\,\sigma_A\,\sigma_B$$
where $X_A$ and $X_B$ have a correlation of $\rho_{AB}$ between them.
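The formula can be sketched as a small function. The function name `variance_of_sum` and the input values (standard deviations of 0.20 and 0.30) are hypothetical, chosen only to show how the correlation changes the result.

```python
def variance_of_sum(sigma_a: float, sigma_b: float, rho_ab: float) -> float:
    """Variance of X_A + X_B given the two standard deviations and their correlation."""
    return sigma_a ** 2 + sigma_b ** 2 + 2 * rho_ab * sigma_a * sigma_b

# Hypothetical inputs (illustrative only)
sigma_a, sigma_b = 0.20, 0.30

for rho in (-1.0, 0.0, 0.5, 1.0):
    var_sum = variance_of_sum(sigma_a, sigma_b, rho)
    print(f"rho = {rho:+.1f}  variance = {var_sum:.4f}  std dev = {var_sum ** 0.5:.4f}")
```

With a correlation of +1 the standard deviation of the sum equals σ_A + σ_B, and with −1 it equals |σ_A − σ_B|; any correlation in between gives a value between those two extremes.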
Point estimate: a single value, computed from sample data, that approximates the true population parameter.
Formula used = “estimator”
Best Linear Unbiased Estimator
There are four desirable properties of an estimator:
1) Unbiased: The expected value of the estimator must be equal to the true
population parameter, e.g., E(x̄) = μ.
2) Efficient: The variance of its sampling distribution is smaller than that of
all the other unbiased estimators.
3) Consistent: Its accuracy increases as the sample size is increased.
4) Linear: It can be expressed as a linear function of sample data.
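The unbiased and linear properties can be illustrated with the sample mean. The sketch below uses a tiny hypothetical population and enumerates every equally likely sample of size 2 drawn with replacement; the average of all the sample means equals the population mean, which is what E(x̄) = μ states. The population values and names are illustrative assumptions.

```python
from itertools import product

# Hypothetical population (illustrative only)
population = [2, 4, 9]
mu = sum(population) / len(population)

# Every ordered sample of size 2 drawn with replacement is equally likely
samples = list(product(population, repeat=2))

# The sample mean is a linear function of the data: each x gets weight 1/n
sample_means = [sum(sample) / len(sample) for sample in samples]

# Expected value of the estimator = average over all equally likely samples
expected_sample_mean = sum(sample_means) / len(sample_means)

print(mu, expected_sample_mean)  # 5.0 5.0
```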