QUANTITATIVE RISK ASSESSMENT IN GEOTECHNICAL ENGINEERING
Introduction to Random Variables
Presented by
D. V. Griffiths (Colorado School of Mines)
Colombian Geotechnical Society
XV CGC & II ISCSR
4th October, 2016
Discrete and Continuous Random Variables
Discrete : A random variable is called a discrete random variable
if its set of possible outcomes is countable. This usually occurs for
any random variable which is a count of occurrences or of items.
Continuous : A random variable is called a continuous random
variable if it can take on values on a continuous scale. This is
usually the case with measured data.
Examples: 1) Let X be the number of blows in a Standard Penetration Test
-- X is discrete.
2) Let Z be the time until consolidation settlement exceeds some threshold
-- Z is continuous.
3) Let W be the undrained shear strength of a soil deposit
-- W is continuous.
Random variables are unknown.
The most we can say about a random variable is what its
probability is of taking on each of its possible values.
We call this a probability distribution. For example if the
discrete random variable X can take on possible values 0, 1,
2, and 3, then a complete description of X might be given by
the following probabilities:
$P[X = 0] = 0.22$
$P[X = 1] = 0.47$
$P[X = 2] = 0.18$
$P[X = 3] = 0.13$
(for example), which must add up to 1.0.
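A minimal Python sketch of the discrete example above. The dictionary layout and the helper names `is_valid_pmf` and `prob_at_most` are illustrative assumptions, not part of the original slides; the probabilities are the ones listed.

```python
import math

# Example PMF from the slide: possible values of X and their probabilities.
pmf = {0: 0.22, 1: 0.47, 2: 0.18, 3: 0.13}

def is_valid_pmf(pmf, tol=1e-9):
    """Return True if every probability lies in [0, 1] and they sum to 1."""
    in_range = all(0.0 <= p <= 1.0 for p in pmf.values())
    return in_range and math.isclose(sum(pmf.values()), 1.0, abs_tol=tol)

def prob_at_most(pmf, x):
    """P[X <= x], found by summing the PMF over all values up to x."""
    return sum(p for value, p in pmf.items() if value <= x)

print(is_valid_pmf(pmf))
print(prob_at_most(pmf, 1))
```

Because X is discrete, any probability statement about it reduces to a sum over the listed values.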
Probability Distributions
Discrete Random Variables: The set of probabilities
$P[X = x_1] = p_1$
$P[X = x_2] = p_2$
is called a Probability Mass Function (PMF)
Continuous Random Variables:
$P[X = 12.000000000\ldots000] = 0$
so we must define probabilities over a range
$P[x < X \le x + dx] = f_X(x)\,dx$
where $f_X(x)$ is called a Probability Density Function (PDF)
Probability Density Function
Definition :
The function $f(x)$ is a probability density function (PDF) for the
continuous random variable X, defined over the set of real numbers, if
1) $0 \le f(x) < \infty$, for all x,
2) $\int_{-\infty}^{\infty} f(x)\,dx = 1$ (the area under the PDF is unity),
3) $P[a < X \le b] = \int_a^b f(x)\,dx$.
NOTE: it is important to recognize that, in the continuous case,
$f(x)$ is not a probability. It has units of probability per unit length.
In order to get probabilities, we have to find areas under the PDF,
i.e. sum up values of $f(x)\,dx$.
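To illustrate the note that a density is not itself a probability, here is a small sketch using a hypothetical PDF equal to 2.0 on the interval [0, 0.5] and zero elsewhere. The density, the function names and the midpoint-rule integrator are all assumptions chosen for illustration; the point is that the density exceeds 1 while the area under it is still unity.

```python
def integrate(f, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def density(x):
    """Hypothetical PDF: constant 2.0 on [0, 0.5], zero elsewhere."""
    return 2.0 if 0.0 <= x <= 0.5 else 0.0

total_area = integrate(density, 0.0, 0.5)   # condition 2: area under the PDF
prob_band = integrate(density, 0.1, 0.3)    # condition 3: P[0.1 < X <= 0.3]

print(total_area, prob_band)
```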
SOME SIMPLE PROBABILITY DENSITY FUNCTIONS (PDF)
UNIFORM: $f_X(x) = \dfrac{1}{\beta - \alpha}, \quad \alpha \le x \le \beta$
The uniform distribution is useful in representing random variables which have
known upper and lower bounds and which have equal likelihood of occurring
anywhere between these bounds.
The distribution makes no assumptions regarding preferential likelihood of
the random variable since all possible values are equally likely.
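Let the lower and upper bounds be $\alpha = 3$ and $\beta = 7$, so that $f_X(x) = 0.25$ between the bounds.
Mean: $\mu_X = \dfrac{\alpha + \beta}{2}$
Standard deviation: $\sigma_X = \dfrac{\sqrt{3}\,(\beta - \alpha)}{6}$ (see later for where this comes from!)
[Figure: uniform PDF $f_X(x)$ plotted against x from 0 to 9, with constant height 0.25 between x = 3 and x = 7]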
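A short Python sketch of the uniform case, assuming bounds of 3 and 7 as on the slide. The names `alpha` and `beta` for the lower and upper bounds, and the helper functions, are illustrative choices.

```python
import math

def uniform_pdf(x, alpha, beta):
    """Uniform PDF: constant between the bounds, zero outside them."""
    return 1.0 / (beta - alpha) if alpha <= x <= beta else 0.0

def uniform_mean(alpha, beta):
    """Mean of a uniform distribution: the midpoint of the bounds."""
    return 0.5 * (alpha + beta)

def uniform_std(alpha, beta):
    """Standard deviation of a uniform distribution: sqrt(3) * (beta - alpha) / 6."""
    return math.sqrt(3.0) * (beta - alpha) / 6.0

alpha, beta = 3.0, 7.0
print(uniform_pdf(5.0, alpha, beta))   # height of the PDF between the bounds
print(uniform_mean(alpha, beta))
print(uniform_std(alpha, beta))
```

The standard deviation expression is equivalent to the more familiar form (beta - alpha) / sqrt(12).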
NORMAL
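The best known Probability Density Function is the
Normal or Gaussian distribution.
Let X be a normally distributed random variable
with mean and standard deviation given by $\mu_X$ and $\sigma_X$.
In this case the PDF is given by:
$f_X(x) = \dfrac{1}{\sigma_X \sqrt{2\pi}} \exp\left\{ -\dfrac{1}{2} \left( \dfrac{x - \mu_X}{\sigma_X} \right)^2 \right\}$
where $E[X] = \mu_X$ and $\mathrm{Var}[X] = \sigma_X^2$.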
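As a hedged illustration, the normal PDF can be coded directly from its closed form and checked against Python's standard library. The function name `normal_pdf` and the values 100 and 50 for the mean and standard deviation are assumptions chosen for the sketch.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Normal PDF evaluated directly from the closed-form expression."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

mu_x, sigma_x = 100.0, 50.0   # assumed illustrative mean and standard deviation

peak = normal_pdf(mu_x, mu_x, sigma_x)            # height of the PDF at the mean
reference = NormalDist(mu_x, sigma_x).pdf(mu_x)   # library value for comparison
print(peak, reference)
```

Note that the peak height depends on the standard deviation: a narrower distribution must be taller so that the area stays at unity.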
NORMAL DISTRIBUTION
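[Figure: normal PDF $f_X(x)$ plotted against x for $\mu_X = 100$, $\sigma_X = 50$. The mean, median and mode coincide at the peak; the area under the distribution is unity; the inflection points are at $\mu_X \pm \sigma_X$.]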
STANDARD NORMAL DISTRIBUTION
For the special case where $\mu_X = 0$ and $\sigma_X = 1$ it is
called the Standard Normal Distribution:
$f_X(x) = \dfrac{1}{\sqrt{2\pi}} \exp\left\{ -\dfrac{1}{2} x^2 \right\}$
The Standard Normal distribution is so important that it is
commonly given its own symbols:
Z for the random variable and $\phi$ for the PDF, thus
$\phi_Z(z) = \dfrac{1}{\sqrt{2\pi}} \exp\left\{ -\dfrac{1}{2} z^2 \right\}$
STANDARD NORMAL DISTRIBUTION
[Figure: standard normal PDF $\phi_Z(z)$ plotted against z, with $\mu_Z = 0$ and $\sigma_Z = 1$. The area under the distribution is unity; the inflection points are now at $z = \pm 1$.]
LOGNORMAL
Let Y be a lognormally distributed random variable
with a mean and standard deviation given by $\mu_Y$ and $\sigma_Y$.
If Y is lognormally distributed, ln Y is normally distributed.
Let the Coefficient of Variation be $v_Y = \dfrac{\sigma_Y}{\mu_Y}$
The PDF for a lognormal distribution is given by:
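$f_Y(y) = \dfrac{1}{y\,\sigma_{\ln Y} \sqrt{2\pi}} \exp\left\{ -\dfrac{1}{2} \left( \dfrac{\ln y - \mu_{\ln Y}}{\sigma_{\ln Y}} \right)^2 \right\}$
If the mean and standard deviation of the lognormal
random variable Y are $\mu_Y$ and $\sigma_Y$, then the mean
and standard deviation of the underlying normal distribution
of ln Y are given by (very useful):
$\mu_{\ln Y} = \ln \mu_Y - \dfrac{1}{2} \ln\left(1 + v_Y^2\right) = \ln \mu_Y - \dfrac{1}{2} \sigma_{\ln Y}^2$
$\sigma_{\ln Y} = \sqrt{\ln\left(1 + v_Y^2\right)}$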
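Going in the other direction (rarely used):
$\mu_Y = \exp\left( \mu_{\ln Y} + \dfrac{1}{2} \sigma_{\ln Y}^2 \right)$
$\sigma_Y = \mu_Y \sqrt{\exp\left(\sigma_{\ln Y}^2\right) - 1}$
Further relationships involving the lognormal distribution:
$\mathrm{Median}_Y = \exp\left(\mu_{\ln Y}\right)$
$\mathrm{Mode}_Y = \exp\left(\mu_{\ln Y} - \sigma_{\ln Y}^2\right)$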
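The lognormal transformations can be sketched in Python as follows. The function names are illustrative assumptions; the mean of 100 and standard deviation of 50 are the values used for the lognormal plot in this deck, so the median and mode can be compared with the figures quoted there.

```python
import math

def lognormal_to_log_params(mean_y, std_y):
    """Mean and standard deviation of ln Y, given the mean and std of Y."""
    v = std_y / mean_y                               # coefficient of variation
    sigma_ln = math.sqrt(math.log(1.0 + v * v))
    mu_ln = math.log(mean_y) - 0.5 * sigma_ln ** 2
    return mu_ln, sigma_ln

def log_params_to_lognormal(mu_ln, sigma_ln):
    """Mean and standard deviation of Y, given the mean and std of ln Y."""
    mean_y = math.exp(mu_ln + 0.5 * sigma_ln ** 2)
    std_y = mean_y * math.sqrt(math.exp(sigma_ln ** 2) - 1.0)
    return mean_y, std_y

mu_ln, sigma_ln = lognormal_to_log_params(100.0, 50.0)
median_y = math.exp(mu_ln)
mode_y = math.exp(mu_ln - sigma_ln ** 2)
print(round(median_y, 1), round(mode_y, 1))   # 89.4 71.6
```

Applying `log_params_to_lognormal` to the result recovers the original mean and standard deviation, which is a convenient check on both directions of the transformation.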
LOGNORMAL DISTRIBUTION
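[Figure: lognormal PDF $f_Y(y)$ plotted against y for $\mu_Y = 100$, $\sigma_Y = 50$, marking Mode = 71.6, Median = 89.4 and Mean = 100. The area under the distribution is unity, with an area of 0.5 on either side of the median.]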
COMPARISON OF NORMAL AND LOGNORMAL DISTRIBUTIONS
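[Figure: normal and lognormal PDFs $f_X(x)$ plotted against x for $\mu_X = 100$ (Mean) and $v_X = \dfrac{\sigma_X}{\mu_X}$ (Coefficient of Variation) equal to 0.1, 0.3 and 0.5.]
....not much difference for $v_X \le 0.3$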
Cumulative Distribution Function (CDF)
The Cumulative Distribution Function for a random
variable gives the probability that the random variable
will take a value less than or equal to a given value x.
Hence,
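$F_X(x) = P[X \le x] = \int_{-\infty}^{x} f_X(\xi)\,d\xi$
[Figure: PDF $f_X(\xi)$, with $F_X(x)$ shown as the area under the curve up to x]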
Standard Normal Cumulative Distribution Function (CDF)
[Figure: standard normal CDF $\Phi(z)$ plotted against z. Standard tables cover $z \ge 0$.]
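[Table: Standard Normal CDF, giving $\Phi(z)$ for $z \ge 0$]
Area enclosed within a given number of standard deviations of the mean in a normal distribution:
Range                     Area (%)
$\mu_X \pm 1\sigma_X$     68.3
$\mu_X \pm 2\sigma_X$     95.4
$\mu_X \pm 3\sigma_X$     99.7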
[Table: Standard Normal CDF, continued]
For z < 0:
$\Phi(z) = 1 - \Phi(-z)$
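In place of printed tables, the standard normal CDF can be evaluated from the error function. The sketch below uses that approach; the function names are illustrative assumptions. It checks the symmetry relation for negative z and reproduces the areas within 1, 2 and 3 standard deviations of the mean.

```python
import math

def std_normal_cdf(z):
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def area_within(n_sigma):
    """Area of a normal distribution within n_sigma standard deviations of the mean."""
    return std_normal_cdf(n_sigma) - std_normal_cdf(-n_sigma)

# Symmetry used when a table only covers z >= 0.
left_tail = 1.0 - std_normal_cdf(0.4)
print(std_normal_cdf(-0.4), left_tail)

for n in (1, 2, 3):
    print(n, round(100.0 * area_within(n), 1))
```

Because the result depends only on the number of standard deviations, the same areas apply to any normal distribution, whatever its mean and standard deviation.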
The Reliability Index
The Reliability Index is a measure of the margin of safety in
“standard deviation units”.
...not to be confused with "The Reliability" which is simply
$R = 1 - p_f$
For example, if dealing with a normally distributed Factor of Safety
(where FS = 1 implies failure), the Reliability Index is given by:
$\beta = \dfrac{\mu_{FS} - 1}{\sigma_{FS}}$
If the Factor of Safety is lognormal, the Reliability Index is given by:
$\beta = \dfrac{\mu_{\ln FS}}{\sigma_{\ln FS}}$
For normally distributed random variables, the “Reliability Index” ($\beta$)
is uniquely related to the “Probability of Failure” ($p_f$) through the expression:
$p_f = 1 - \Phi(\beta)$
Consider a normal distribution of the Factor of Safety (FS)
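[Figure: normal PDF $f_{FS}$ plotted against FS for $\mu_{FS} = 1.5$, $\sigma_{FS} = 0.21$. $p_f$ is the area under the PDF to the left of FS = 1; $\beta$ is the distance from FS = 1 to the mean, divided by the standard deviation.]
$\beta = \dfrac{1.5 - 1}{0.21} = 2.38 \quad \Rightarrow \quad p_f = 1 - \Phi(2.38) = 1 - 0.991343 = 0.0087$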
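A minimal sketch of the Factor of Safety calculation, assuming a normally distributed FS with mean 1.5 and standard deviation 0.21 as on the slide. The function names are illustrative. The reliability index is rounded to two decimals before the lookup to mimic the use of printed tables; without rounding the probability of failure changes slightly in the fourth decimal place.

```python
from statistics import NormalDist

def reliability_index_normal(mean_fs, std_fs):
    """Reliability index for a normally distributed FS, with failure at FS = 1."""
    return (mean_fs - 1.0) / std_fs

def probability_of_failure(beta):
    """Probability of failure for a normal variable: 1 - Phi(beta)."""
    return 1.0 - NormalDist().cdf(beta)

beta = reliability_index_normal(1.5, 0.21)
pf_exact = probability_of_failure(beta)             # unrounded reliability index
pf_table = probability_of_failure(round(beta, 2))   # beta rounded to 2.38, as in tables

print(round(beta, 2), round(pf_table, 4))   # 2.38 0.0087
```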
[Figure: Probability of Failure vs. Reliability Index for a Normal Distribution. $p_f$ is plotted against $\beta$, with 0.5 marked on the $p_f$ axis and the point $\beta = 3$, $p_f = 0.0013$ (0.13%) annotated.]
Example calculations using the Standard Normal Function
Example 1:
Permeability measurements have indicated that k is normally distributed
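with the properties: $\mu_k = 4.1 \times 10^{-8}$ m/s and $\sigma_k = 1 \times 10^{-8}$ m/s
What is the probability that $k > 4.5 \times 10^{-8}$ m/s?
[Figure: normal PDF $f_K(k)$. We want the area to the right of $k = 4.5 \times 10^{-8}$ m/s…]
…but tables only give area to the left of a given point…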
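…so estimate the area to the left of $k = 4.5 \times 10^{-8}$ m/s and subtract from 1:
$P[k > 4.5 \times 10^{-8}] = 1 - P[k \le 4.5 \times 10^{-8}]$
$= 1 - \Phi\left( \dfrac{4.5 \times 10^{-8} - 4.1 \times 10^{-8}}{1 \times 10^{-8}} \right)$
$= 1 - \Phi(0.4)$
$= 1 - 0.65542$
$= 0.3446$ (34%)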
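Example 1 can be checked numerically with Python's standard library, as a hedged sketch. The variable names are illustrative; the mean and standard deviation are those given for the permeability k.

```python
from statistics import NormalDist

# Permeability k assumed normal, with the mean and standard deviation from Example 1 (m/s).
k_dist = NormalDist(mu=4.1e-8, sigma=1.0e-8)

threshold = 4.5e-8
z = (threshold - k_dist.mean) / k_dist.stdev   # standardized threshold, about 0.4
p_exceed = 1.0 - k_dist.cdf(threshold)         # area to the right of the threshold

print(round(z, 2), round(p_exceed, 4))   # 0.4 0.3446
```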
Example 2:
Permeability measurements have indicated that k is lognormally distributed
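with the properties: $\mu_k = 4.1 \times 10^{-8}$ m/s and $\sigma_k = 1 \times 10^{-8}$ m/s
What is the probability that $k > 4.5 \times 10^{-8}$ m/s?
First find the properties of the underlying normal distribution of ln k
$\sigma_{\ln k} = \sqrt{\ln\left(1 + \left(\dfrac{\sigma_k}{\mu_k}\right)^2\right)} = \sqrt{\ln\left(1 + \left(\dfrac{1}{4.1}\right)^2\right)} = 0.2404$
$\mu_{\ln k} = \ln \mu_k - \dfrac{1}{2} \sigma_{\ln k}^2 = \ln(4.1 \times 10^{-8}) - \dfrac{1}{2}(0.2404)^2 = -17.0386$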
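$\ln(4.5 \times 10^{-8}) = -16.9166 \approx -16.92$
[Figure: normal PDF $f_{\ln K}(\ln k)$ plotted against ln k. We want the area to the right of $\ln k = -16.92$…]
…but tables only give area to the left of a given point…
…so estimate the area to the left and subtract from 1:
$P[k > 4.5 \times 10^{-8}] = 1 - P[k \le 4.5 \times 10^{-8}]$
$= 1 - \Phi\left( \dfrac{-16.9166 - (-17.0386)}{0.2404} \right)$
$= 1 - \Phi(0.5075)$
$\approx 1 - \Phi(0.51)$
$= 1 - 0.69497$
$= 0.3050$ (31%)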
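Example 2 can be sketched the same way, assuming the lognormal transformation formulas given earlier in the deck. The variable names are illustrative. Because the code does not round z to 0.51 before evaluating the CDF, its result differs from the table-based answer in the third decimal place.

```python
import math
from statistics import NormalDist

mean_k, std_k = 4.1e-8, 1.0e-8   # mean and standard deviation of k from Example 2 (m/s)
threshold = 4.5e-8

# Parameters of the underlying normal distribution of ln k.
v_k = std_k / mean_k
sigma_ln_k = math.sqrt(math.log(1.0 + v_k ** 2))
mu_ln_k = math.log(mean_k) - 0.5 * sigma_ln_k ** 2

# Standardize ln(threshold) and take the area to the right of it.
z = (math.log(threshold) - mu_ln_k) / sigma_ln_k
p_exceed = 1.0 - NormalDist().cdf(z)

print(round(sigma_ln_k, 4), round(mu_ln_k, 4), round(z, 2))   # 0.2404 -17.0386 0.51
print(p_exceed)
```

Comparing with Example 1, the lognormal assumption gives a somewhat smaller exceedance probability for the same mean and standard deviation.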