
FRM Part 1: Basic Statistics

This document provides an overview of basic statistics concepts used in risk management. It defines key terms like mean, median, mode, variance, standard deviation, and expected value. It explains how to calculate these measures for both discrete and continuous random variables. It also distinguishes between population and sample data, and describes how to compute variance, standard deviation, and expected value for random variables. The goal is to interpret and apply these fundamental statistical concepts.

Uploaded by

Ra'fat Jallad
Copyright
© All Rights Reserved

FRM part 1

Book 2 - Quantitative Analysis


Chapter 2

BASIC STATISTICS
Learning Objectives
After completing this reading you should be
able to:
 Interpret and apply the mean, standard deviation, and variance of a random
variable.
 Calculate the mean, standard deviation, and variance of a discrete random
variable.
 Interpret and calculate the expected value of a discrete random variable.
 Calculate and interpret the covariance and correlation between two random
variables.
 Calculate the mean and variance of sums of variables.
 Describe the four central moments of a statistical variable or distribution: mean,
variance, skewness, and kurtosis.
 Interpret the skewness and kurtosis of a statistical distribution, and
interpret the concepts of coskewness and cokurtosis.
 Describe and interpret the best linear unbiased estimator.
Measures of Central Tendency
 Also referred to as measures of central location, measures of central
tendency are summary measures that attempt to describe a whole set of
data with a single value that represents the middle or centre of its
distribution.

 They identify a single value as representative of an entire distribution.


 They include:

Mean
Median
Mode
Mean
• Is the sum of all the measurements divided by the number of
measurements in the set.
o It is the most popular and widely used measure of central tendency.
o Denoted by μ (read “mu”) in a population, and x̄ (read “x bar”) in a
sample.
Example
• Consider this dataset showing a sample of scores of FRM candidates in a
mock exam:

Score    Frequency
60       4
64       3
68       2
75       1
72       1

$$\text{Mean} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{60 \times 4 + 64 \times 3 + 68 \times 2 + 75 \times 1 + 72 \times 1}{11} = 65$$
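The frequency-weighted mean above can be checked with a short sketch. The variable names are illustrative only and the data are the five score/frequency pairs from the table.

```python
# Frequency table from the mock-exam example: score -> number of candidates
score_counts = {60: 4, 64: 3, 68: 2, 75: 1, 72: 1}

# n is the total number of observations (the sum of the frequencies)
n = sum(score_counts.values())

# Each score contributes score * frequency to the overall sum
total = sum(score * count for score, count in score_counts.items())

mean_score = total / n
print(n, total, mean_score)  # 11 715 65.0
```

Note that the divisor is the number of observations (11), not the number of distinct scores (5).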
Median
• Is the middle value in a distribution when the values are arranged in
ascending or descending order.
• It divides the distribution in half.
o If the number of observations is odd, median = middle value
o If the number of observations is even, median is the average of the two
middle values.
Example
• Consider this dataset showing a sample of scores of FRM candidates in a
mock exam:

Score    Frequency
60       4
64       3
68       2
75       1
72       1

Ordered scores: [60, 60, 60, 60, 64, 64, 64, 68, 68, 72, 75]
Median = 64 (the 6th of the 11 ordered values)
Mode
• It is the most commonly occurring value in a distribution.
o A distribution can be unimodal (one mode), bimodal (two modes),
trimodal (three modes) or multimodal (more than three modes).
Example
• Consider this dataset showing a sample of scores of FRM candidates in a
mock exam:

Score    Frequency
60       4
64       3
68       2
75       1
72       1

Ordered scores: [60, 60, 60, 60, 64, 64, 64, 68, 68, 72, 75]
Mode = 60 (it occurs 4 times, more often than any other score)
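As a rough cross-check of the median and mode examples, the sketch below expands the frequency table into individual observations and uses Python's standard `statistics` module. The variable names are illustrative.

```python
import statistics

# Frequency table from the mock-exam example: score -> number of candidates
score_counts = {60: 4, 64: 3, 68: 2, 75: 1, 72: 1}

# Expand the table into the 11 individual observations and sort them
ordered = sorted(s for s, c in score_counts.items() for _ in range(c))

# With an odd number of observations the median is the middle value
median_score = statistics.median(ordered)

# multimode returns every value that shares the highest frequency,
# so a bimodal or trimodal sample would give a longer list
modes = statistics.multimode(ordered)

print(ordered)
print(median_score, modes)  # 64 [60]
```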
Population and Sample Data
• Sample statistics and population statistics should be distinguished.
o A population is the complete set of all the elements of interest to
the researcher.
  • Examples: the number of people in a country, the number of hedge funds
  in the U.S., or the total number of FRM candidates in a given year.
o A sample is a subset of elements chosen to represent the population as a
whole. By analyzing sample data, we are able to draw conclusions
about the entire population.
  • For example, if we sample the returns of 30 hedge funds spread
  across the U.S., we can use the results to draw reasonable
  conclusions about the market as a whole (well over 10,000 hedge
  funds).
Variance and Standard Deviation
• Variance is the average of the squared distances from each data point in
the population to the mean.
• The standard deviation is the positive square root of the variance.
• A population’s variance is given by:
$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$$
where μ is the population mean and N is the population size.
• The population standard deviation equals the square root of the population
variance.

• The sample variance is given by:
$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$
where $\bar{X}$ is the sample mean and n is the sample size.
• The sample standard deviation equals the square root of the sample
variance.
Variance and standard deviation
Example
Using sample data from the previous example:

Score    Frequency
60       4
64       3
68       2
75       1
72       1

• Mean = 65

X_i      X_i − x̄      (X_i − x̄)²
60       60 − 65      25
60       60 − 65      25
60       60 − 65      25
60       60 − 65      25
64       64 − 65      1
64       64 − 65      1
64       64 − 65      1
68       68 − 65      9
68       68 − 65      9
72       72 − 65      49
75       75 − 65      100
                      Total = 270

Variance = 270/(11 − 1) = 270/10 = 27
S.d. = 27^(1/2) = 5.2
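The calculation above can be reproduced with a minimal sketch that also shows the effect of dividing by n − 1 (sample) rather than n (population). Variable names are illustrative.

```python
import math
import statistics

# The 11 mock-exam scores from the example
scores = [60] * 4 + [64] * 3 + [68] * 2 + [72, 75]

x_bar = statistics.mean(scores)                       # sample mean, 65
sum_sq_dev = sum((x - x_bar) ** 2 for x in scores)    # 270

sample_var = sum_sq_dev / (len(scores) - 1)   # divide by n - 1 -> 27.0
population_var = sum_sq_dev / len(scores)     # divide by n (population formula)
sample_sd = math.sqrt(sample_var)

print(sum_sq_dev, sample_var, round(sample_sd, 1))  # 270 27.0 5.2
```

The standard library's `statistics.variance` uses the same n − 1 divisor, and `statistics.pvariance` uses n, so either can serve as a check on a hand calculation.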
Continuous Random Variable
• The mean, mode, and median of a continuous random variable can be
defined as well. To compute the mean, we multiply the variable by its
probability density function (PDF) and integrate the product.
• If X is our continuous random variable, and f(x) its PDF, then the mean
is:
$$\mu = \int_{x_{min}}^{x_{max}} x\, f(x)\, dx$$

• The median of a continuous random variable is defined just as it was for
the discrete random variable: there is a 0.5 likelihood of a value being
less than or equal to the median, and a 0.5 likelihood of it being greater
than or equal to the median.
• Supposing m is the median, then:
$$\int_{x_{min}}^{m} f(x)\, dx = \int_{m}^{x_{max}} f(x)\, dx = \frac{1}{2}$$
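To make the two integrals concrete, here is a minimal numerical sketch. The PDF f(x) = 2x on [0, 1] is a hypothetical choice made only for illustration (it is not from the slides), and the midpoint-rule integrator and bisection search are simple stand-ins for whatever numerical method one prefers.

```python
def f(x):
    """Hypothetical PDF for illustration: f(x) = 2x on [0, 1]."""
    return 2.0 * x


def integrate(g, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))


x_min, x_max = 0.0, 1.0

# Mean: integrate x * f(x) over the support (exact value here is 2/3)
mean = integrate(lambda x: x * f(x), x_min, x_max)

# Median: bisect on m until the area to the left of m is 1/2
lo, hi = x_min, x_max
for _ in range(40):
    m = (lo + hi) / 2
    if integrate(f, x_min, m) < 0.5:
        lo = m
    else:
        hi = m
median = (lo + hi) / 2  # exact value here is sqrt(1/2)

print(round(mean, 4), round(median, 4))  # 0.6667 0.7071
```

For this right-leaning density the median (about 0.707) exceeds the mean (about 0.667), which shows that the two measures need not coincide.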
Expectations
• Expected values are numerical summaries of important characteristics of
the distributions of random variables.

(I) Expectations of the Mean

• The expected value is the weighted average of the possible outcomes of a
random variable, where the weights are the probabilities that the
outcomes will occur. In mathematical parlance, this can be represented
as:

$$E(X) = \sum_{i=1}^{n} P(x_i)\, x_i = P(x_1) x_1 + P(x_2) x_2 + P(x_3) x_3 + \cdots + P(x_n) x_n$$

where the letter “E” is known as the expectations operator and is used to indicate the calculation
of a probability-weighted average; and
$x_i$ represents the ith possible outcome, which occurs with probability $P(x_i)$.
Expectations
Example
The probability distribution of the return earned by a newly established hedge
fund is as follows:

Probability    Return (%)
20%            8
40%            15
30%            20
10%            30

• Calculate the expected return.
Solution
• The expected return is simply a weighted average of each possible return,
where the weights are the probabilities of each possible outcome.
• E(R) = 0.2 × 8 + 0.4 × 15 + 0.3 × 20 + 0.1 × 30 = 16.6%
Expectations
(II) Expectations of Variance
• The variance $\sigma^2$ is a measure of the spread of a distribution about its
mean.
• Formally, variance is defined as:
$$\text{var}[X] = E\left[(X - E[X])^2\right]$$
i.e., the expected value (or mean) of the squared deviation of X from its
mean.

• Note that this can also be written as $E[(X - \mu)^2]$ because E(X) = μ.
• Expanding the square gives $E[X^2] - 2\mu E[X] + \mu^2 = E[X^2] - 2\mu^2 + \mu^2$, which proves that:
$$\text{var}[X] = E[X^2] - \mu^2 \quad \text{(Important Result)}$$
• The standard deviation, $\sigma$, is the positive square root of this, hence the
term sometimes used: “root mean squared deviation”.
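As a sanity check on the important result, the sketch below computes the variance both from the definition and from the shortcut E[X²] − μ², using the hedge-fund return distribution from the expected-return example above. Variable names are illustrative.

```python
import math

# Hedge-fund return distribution from the expected-return example
probs = [0.20, 0.40, 0.30, 0.10]
returns = [8, 15, 20, 30]  # returns in percent

# Probabilities of a valid distribution must sum to one
assert abs(sum(probs) - 1.0) < 1e-12

# E[R]: probability-weighted average of the outcomes
mean = sum(p * r for p, r in zip(probs, returns))

# E[R^2]: probability-weighted average of the squared outcomes
mean_sq = sum(p * r ** 2 for p, r in zip(probs, returns))

# Variance from the definition E[(R - mu)^2] and from E[R^2] - mu^2
var_definition = sum(p * (r - mean) ** 2 for p, r in zip(probs, returns))
var_shortcut = mean_sq - mean ** 2

std_dev = math.sqrt(var_shortcut)
print(round(mean, 2), round(var_shortcut, 2), round(std_dev, 1))  # 16.6 37.24 6.1
```

Both routes give the same variance up to floating-point rounding; the shortcut is usually quicker by hand because it avoids computing each deviation.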
Expectations
Example
• Going back to the probability distribution of the return earned by a newly
established hedge fund:

Probability    Return (%)
20%            8
40%            15
30%            20
10%            30

• Calculate the standard deviation of return.
Solution
• Var(R) = E(R²) − μ² = E(R²) − [E(R)]²
• E(R) = 0.2 × 8 + 0.4 × 15 + 0.3 × 20 + 0.1 × 30 = 16.6
• E(R²) = 0.2 × 8² + 0.4 × 15² + 0.3 × 20² + 0.1 × 30² = 312.8
• Var(R) = 312.8 − 16.6² = 37.24
• Standard deviation of return = 37.24^0.5 = 6.1%
Covariance
 Variance and standard deviation are good tools but they measure the
dispersion, or volatility, of only one variable.
o However, in finance we are usually interested in the relative performance
of two assets.
o We may want to establish the relationship between the return for Stock X
and Stock Y or the relationship between the performance of the S&P 500
and that of the tech industry.
 Covariance makes all that possible.
Covariance
• Covariance is the expected value of the product of the deviations of the two
random variables from their respective expected values.
• If we have two assets, i and j, with returns $R_i$ and $R_j$ respectively,

$$\text{Cov}(R_i, R_j) = E\{[R_i - E(R_i)]\,[R_j - E(R_j)]\}$$

For a discrete distribution with n possible scenarios, the expectation is the
probability-weighted sum over the scenarios:

$$\text{Cov}(R_i, R_j) = \sum_{i=1}^{n} P(R_i)\,[R_i - E(R_i)]\,[R_j - E(R_j)]$$

• The covariance between i and j is a measure of the strength of the “linear
association” or “linear relationship” between the variables.
o However, it suffers from the fact that its value is dependent on the units
of measurement of the variables.
Covariance
Example
• Suppose we wish to find the variance of each asset and the covariance
between the returns of assets A and B.

               Strong Economy    Normal Economy    Weak Economy
Probability    15%               60%               25%
A’s Return     40%               20%               0%
B’s Return     20%               15%               4%

• How do we establish the covariance between the two assets?
Covariance
Solution

               Strong Economy    Normal Economy    Weak Economy
Probability    15%               60%               25%
A’s Return     40%               20%               0%
B’s Return     20%               15%               4%

• $\text{Cov}(R_i, R_j) = \sum_{i=1}^{n} P(R_i)\,[R_i - E(R_i)]\,[R_j - E(R_j)]$
• As per the formula above, we must first calculate the expected return of
each asset:
o E(R_A) = 0.15 × 0.40 + 0.60 × 0.20 + 0.25 × 0.00 = 0.18
o E(R_B) = 0.15 × 0.20 + 0.60 × 0.15 + 0.25 × 0.04 = 0.13
• Finally, we can compute the covariance between the returns of the two
assets:
o Cov(R_A, R_B) = 0.15(0.40 − 0.18)(0.20 − 0.13) + 0.60(0.20 − 0.18)(0.15 −
0.13) + 0.25(0.00 − 0.18)(0.04 − 0.13) = 0.0066
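The scenario-based covariance calculation can be sketched as follows. The helper name `expectation` is illustrative, and the cross-check uses the standard identity Cov(A, B) = E[AB] − E[A]E[B], which is not stated on the slide.

```python
# Scenario table: strong, normal and weak economy
probs = [0.15, 0.60, 0.25]
ret_a = [0.40, 0.20, 0.00]
ret_b = [0.20, 0.15, 0.04]


def expectation(values, probs):
    """Probability-weighted average of the scenario values."""
    return sum(p * v for p, v in zip(probs, values))


e_a = expectation(ret_a, probs)  # 0.18
e_b = expectation(ret_b, probs)  # 0.13

# Covariance: expected product of the deviations from the two means
cov_ab = sum(p * (a - e_a) * (b - e_b)
             for p, a, b in zip(probs, ret_a, ret_b))

# Cross-check with the identity Cov(A, B) = E[AB] - E[A]E[B]
e_ab = expectation([a * b for a, b in zip(ret_a, ret_b)], probs)
cov_check = e_ab - e_a * e_b

print(round(e_a, 2), round(e_b, 2), round(cov_ab, 4))  # 0.18 0.13 0.0066
```

The positive sign indicates that the two assets tend to do well (and badly) in the same states of the economy.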
Correlation
• A quantity related to the covariance is the correlation coefficient, which is a
dimensionless quantity (i.e. it has no “units”).
• The correlation coefficient of two random variables X and Y, written as
corr(X,Y) or $\rho(X, Y)$, is defined by:
$$\text{Corr}(X, Y) = \frac{\text{cov}(X, Y)}{\sqrt{\text{Var}(X)\,\text{Var}(Y)}}$$
• It always lies between −1 and +1.

Interpretation
• A positive correlation indicates a positive linear relationship; a value of
+1 indicates a perfect positive linear relationship.
• A negative correlation indicates a negative (inverse) linear relationship; a
value of −1 indicates a perfect inverse linear relationship.
• Zero correlation indicates no linear relationship.
Portfolio variance
• Assume that we have two securities whose random returns
are $X_A$ and $X_B$, with means $\mu_A$ and $\mu_B$ and standard deviations
$\sigma_A$ and $\sigma_B$. Then the variance of $X_A$ plus $X_B$ can be computed as
follows:
$$\sigma^2_{A+B} = \sigma^2_A + \sigma^2_B + 2\rho_{AB}\,\sigma_A \sigma_B$$
where $X_A$ and $X_B$ have a correlation of $\rho_{AB}$ between them.
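A hedged numerical check of the sum formula, reusing the scenario returns of assets A and B from the covariance example: the sketch computes the correlation from the covariance and the two standard deviations, then compares the formula with the variance of A + B computed directly scenario by scenario. Helper names are illustrative.

```python
import math

# Scenario table from the covariance example (strong, normal, weak economy)
probs = [0.15, 0.60, 0.25]
ret_a = [0.40, 0.20, 0.00]
ret_b = [0.20, 0.15, 0.04]


def mean(values):
    """Probability-weighted mean over the scenarios."""
    return sum(p * v for p, v in zip(probs, values))


def variance(values):
    """Probability-weighted squared deviation from the mean."""
    mu = mean(values)
    return sum(p * (v - mu) ** 2 for p, v in zip(probs, values))


var_a, var_b = variance(ret_a), variance(ret_b)
cov_ab = sum(p * (a - mean(ret_a)) * (b - mean(ret_b))
             for p, a, b in zip(probs, ret_a, ret_b))

# Correlation: covariance scaled by the two standard deviations
rho_ab = cov_ab / math.sqrt(var_a * var_b)

# Variance of the sum: via the formula, and directly from the summed returns
var_sum_formula = var_a + var_b + 2 * rho_ab * math.sqrt(var_a) * math.sqrt(var_b)
var_sum_direct = variance([a + b for a, b in zip(ret_a, ret_b)])

print(round(rho_ab, 4), round(var_sum_formula, 4), round(var_sum_direct, 4))
```

The two variance figures agree, and the correlation lands inside the [−1, +1] range as expected.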

 Portfolio Variance (two assets):


$$\sigma_p^2 = w_1^2 \sigma_1^2 + w_2^2 \sigma_2^2 + 2 w_1 w_2\, \rho_{12}\, \sigma_1 \sigma_2$$

• Portfolio standard deviation is equal to the square root of portfolio variance.

• Note that $\text{Cov}_{12} = \rho_{12}\, \sigma_1 \sigma_2$


Portfolio variance
Example
• From the information given in the table, calculate the portfolio risk and
return.

                      Asset 1    Asset 2
Weights               80%        20%
Expected return       9.98%      15.80%
Standard deviation    17.00%     30.00%
Cov (Asset 1, 2)      0.005

Solution
• Portfolio return = weighted return of assets
o E(R_p) = (0.8 × 9.98%) + (0.2 × 15.8%) = 11.14%
• Portfolio variance $\sigma_p^2 = w_1^2 \sigma_1^2 + w_2^2 \sigma_2^2 + 2 w_1 w_2\, \rho_{12}\, \sigma_1 \sigma_2$, where $\rho_{12}\, \sigma_1 \sigma_2 = \text{Cov}_{12} = 0.005$
o = 0.8² × 0.17² + 0.2² × 0.3² + 2 × 0.8 × 0.2 × 0.005 = 0.023696
o Portfolio risk = Portfolio standard deviation = √0.023696 = 0.1539 =
15.39%
 Note: The portfolio risk is less than the risk of each respective asset.
Moments
• Moments are summary statistics that describe particular features of a
distribution.
• Recall that:
$$\mu = E[X]$$
• The above concept can be generalized as follows:
$$m_k = E[X^k]$$
• $m_k$ is referred to as the kth moment of X. The first moment of X is
the mean.
• In the same manner, the concept of variance can be generalized as
follows:
$$\mu_k = E[(X - \mu)^k]$$
• $\mu_k$ is the kth central moment of X. The moment is referred to as central since
it is centered on the mean. The second central moment is the variance.
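The two definitions can be sketched for a discrete distribution as below. The function names `raw_moment` and `central_moment` are illustrative, and the data are the hedge-fund return distribution used in the earlier expectation examples.

```python
# Hedge-fund return distribution from the earlier examples (returns in percent)
probs = [0.20, 0.40, 0.30, 0.10]
outcomes = [8, 15, 20, 30]


def raw_moment(k):
    """kth moment m_k = E[X^k]."""
    return sum(p * x ** k for p, x in zip(probs, outcomes))


def central_moment(k):
    """kth central moment mu_k = E[(X - mu)^k], centred on the mean."""
    mu = raw_moment(1)
    return sum(p * (x - mu) ** k for p, x in zip(probs, outcomes))


print(round(raw_moment(1), 2))      # first moment = mean, 16.6
print(round(central_moment(2), 2))  # second central moment = variance, 37.24
```

The first central moment is always zero (up to floating-point rounding), because deviations above and below the mean cancel out exactly.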
3rd Moment: Skewness
 The term skewness refers to the lack of symmetry.
 The lack of symmetry in a distribution is always determined with reference
to a normal or Gaussian distribution (A normal distribution is always
symmetrical).
o If Mean > Mode, the skewness is positive. Example: income.
o If Mean < Mode, the skewness is negative. Example: retirement age.
o If Mean = Mode, the skewness is zero. Example: Scores in an exam.
4th Moment: Kurtosis
• The kurtosis depicts how a random variable is spread out, and unlike the
second central moment, it puts more weight on the extreme points.
• The kurtosis K of a random variable X can be defined as follows:
$$K = \frac{E[(X - \mu)^4]}{\sigma^4}$$
• A distribution with a higher kurtosis tends to have more
extreme points and is therefore considered riskier.
o If K − 3 > 0, the distribution is leptokurtic.
o If K − 3 < 0, the distribution is platykurtic.
o If K − 3 = 0, the distribution is mesokurtic.
• Excess kurtosis is the preferred statistical tool.
o It is computed as: $K_{excess} = K - 3$
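A minimal sketch of the kurtosis calculation and the three-way classification, again using the hedge-fund return distribution from the expectation examples. The helper `classify` and its tolerance are illustrative choices, not part of the source material.

```python
# Hedge-fund return distribution from the earlier examples (returns in percent)
probs = [0.20, 0.40, 0.30, 0.10]
outcomes = [8, 15, 20, 30]

mu = sum(p * x for p, x in zip(probs, outcomes))
variance = sum(p * (x - mu) ** 2 for p, x in zip(probs, outcomes))

# Fourth central moment E[(X - mu)^4]
fourth_central = sum(p * (x - mu) ** 4 for p, x in zip(probs, outcomes))

# sigma^4 is the variance squared
kurtosis = fourth_central / variance ** 2
excess_kurtosis = kurtosis - 3


def classify(excess, tol=1e-9):
    """Label a distribution by the sign of its excess kurtosis."""
    if excess > tol:
        return "leptokurtic"
    if excess < -tol:
        return "platykurtic"
    return "mesokurtic"


print(round(kurtosis, 4), classify(excess_kurtosis))
```

For this distribution the kurtosis comes out slightly above 3, so the excess kurtosis is small but positive.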
Coskewness and Cokurtosis
• The idea of covariance can also be generalized to cross central
moments. The third and fourth standardized cross central moments are
the coskewness and cokurtosis, respectively.
 Cokurtosis is used in asset management to measure the extent to which a
security will undergo extreme positive and negative deviations from the
market portfolio at the same time.
 It measures a security’s risk in relation to the market as a whole (beta)

High kurtosis = high potential for extreme positive returns

Low kurtosis = low potential for extreme positive returns


Best Linear Unbiased Estimator
 Point estimation involves the use of sample data to calculate a single
value (known as a point estimate or statistic) which serves as a "best
guess" or "best estimate" of an unknown population parameter
 Examples:
o The sample mean is a point estimate of the population mean.
o The sample variance is a point estimate of population variance.
 The formula used to calculate a point estimate is called the estimator:

[Diagram: point estimate → true population parameter; formula used = “estimator”]
Best Linear Unbiased Estimator
• There are four desirable properties of an estimator:
1) Unbiased: The expected value of the estimator must be equal to the true
population parameter, e.g., $E(\bar{x}) = \mu$.
2) Efficient: The variance of its sampling distribution is smaller than that of
all the other unbiased estimators.
3) Consistent: Its accuracy increases as the sample size increases.
4) Linear: It can be expressed as a linear function of the sample data.

• According to statisticians, the best estimator among all unbiased estimators
is the one whose variance is the minimum.
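One way to see the "minimum variance" idea is to compare a few linear estimators of the population mean. The sketch below assumes n independent observations with a common variance σ² (an assumption added here, not stated on the slide). Under that assumption a linear estimator Σ wᵢXᵢ is unbiased when the weights sum to 1, and its variance is σ² Σ wᵢ². The weight sets and the value of σ² are hypothetical.

```python
def is_unbiased(weights, tol=1e-12):
    """A linear estimator sum(w_i * X_i) of the mean is unbiased if sum(w_i) = 1."""
    return abs(sum(weights) - 1.0) < tol


def estimator_variance(weights, sigma2):
    """Variance of sum(w_i * X_i) for independent X_i with common variance sigma2."""
    return sigma2 * sum(w * w for w in weights)


sigma2 = 27.0  # hypothetical population variance

# Three hypothetical linear estimators built from four observations
candidates = {
    "sample mean (equal weights)": [0.25, 0.25, 0.25, 0.25],
    "first observation only": [1.0, 0.0, 0.0, 0.0],
    "front-loaded weights": [0.4, 0.3, 0.2, 0.1],
}

variances = {}
for name, weights in candidates.items():
    assert is_unbiased(weights)
    variances[name] = estimator_variance(weights, sigma2)
    print(f"{name}: variance = {variances[name]:.2f}")

best = min(variances, key=variances.get)
print("lowest variance:", best)
```

All three candidates are linear and unbiased, but the equally weighted sample mean has the smallest variance among them, which is the sense in which it is "best" here.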
Book 2 – Quantitative Analysis
Chapter 2
BASIC STATISTICS
Learning Objectives Recap:
 Interpret and apply the mean, standard deviation, and variance of a random variable.
 Calculate the mean, standard deviation, and variance of a discrete random variable.
 Interpret and calculate the expected value of a discrete random variable.
 Calculate and interpret the covariance and correlation between two random variables.
 Calculate the mean and variance of sums of variables.
 Describe the four central moments of a statistical variable or distribution: mean, variance,
skewness, and kurtosis.
 Interpret the skewness and kurtosis of a statistical distribution, and interpret the concepts of
coskewness and cokurtosis.
 Describe and interpret the best linear unbiased estimator.
NEXT: DISTRIBUTIONS
