Lecture 5
In order to use a computer to initiate a simulation study, we must be able to generate the
value of a U (0, 1) random variable; such variates are called random numbers. To generate
them, most computers have a built-in subroutine, called a random-number generator,
whose output is a sequence of pseudorandom numbers—a sequence of numbers that is,
for all practical purposes, indistinguishable from a sample from the U (0, 1) distribution.
Most random-number generators start with an initial value X0 , called the seed, and
then recursively compute values by specifying positive integers a, c, and m, and then
letting

X_{n+1} = (a X_n + c) mod m,        (1)

where the foregoing means that aX_n + c is divided by m and the remainder is taken as the value of X_{n+1}. Thus, each X_n is one of 0, 1, . . . , m − 1, and the quantity X_n/m is taken as an approximation to a U(0, 1) random variable.
It can be shown that, subject to suitable choices for a, c, and m, Equation 1 gives rise
to a sequence of numbers that look as if they were generated from independent U (0, 1)
random variables.
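As an illustration, the recursion in Equation 1 is straightforward to implement. The Python sketch below uses the well-known multiplier a = 7^5 = 16807, c = 0, and modulus m = 2^31 − 1 (the Lewis-Goodman-Miller choice); the function name and parameter defaults are ours, and any other suitable triple (a, c, m) could be substituted.

# A minimal linear congruential generator implementing Equation 1.
def lcg(seed, a=16807, c=0, m=2**31 - 1):
    """Yield a stream of pseudorandom U(0,1) variates."""
    x = seed                      # X_0, the seed
    while True:
        x = (a * x + c) % m       # X_{n+1} = (a X_n + c) mod m
        yield x / m               # X_n / m approximates a U(0,1) variate

gen = lcg(12345)                  # seed X_0 = 12345
print([next(gen) for _ in range(3)])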
To simulate a discrete random variable X taking values x_1, x_2, . . . with probabilities p_1, p_2, . . . , we generate a random number U and set X = x_j if

p_1 + · · · + p_{j−1} < U ≤ p_1 + · · · + p_j.

This works because the probability of selecting x_j is

P{X = x_j} = P{ Σ_{i=1}^{j−1} p_i < U ≤ Σ_{i=1}^{j} p_i } = p_j.
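The following Python sketch implements this discrete inverse transform method; the function and variable names are illustrative, not part of the lecture.

import random

def discrete_inverse_transform(values, probs):
    """Simulate X with P{X = values[j]} = probs[j]."""
    u = random.random()            # U ~ U(0,1)
    cumulative = 0.0
    for x, p in zip(values, probs):
        cumulative += p            # cumulative = p_1 + ... + p_j
        if u <= cumulative:        # first j with U <= p_1 + ... + p_j
            return x
    return values[-1]              # guard against floating-point round-off

# Example: values 1, 2, 3 with probabilities 0.2, 0.5, 0.3.
print(discrete_inverse_transform([1, 2, 3], [0.2, 0.5, 0.3]))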
Similarly, let F be a continuous distribution function, let U ∼ U(0, 1), and set Y = F^{−1}(U). Then

F_Y(a) = P{Y ≤ a} = P{F^{−1}(U) ≤ a}.

Since F is increasing, F^{−1}(U) ≤ a if and only if U ≤ F(a), so

F_Y(a) = P{U ≤ F(a)}.

Given that U is uniformly distributed over [0, 1], P{U ≤ F(a)} = F(a). Thus,

F_Y(a) = F(a).
This method is widely used because it establishes a direct link between uniform random variables and arbitrary continuous distributions. We conclude that a random variable X with a continuous distribution function F can be simulated by generating a random number U from the uniform distribution and setting
X = F^{−1}(U).

For example, if X is exponential with rate λ, then

F(x) = 1 − e^{−λx},

and setting F(x) = u and solving for x gives

1 − e^{−λx} = u,  that is,  x = −(1/λ) log(1 − u).

Hence X = −(1/λ) log(1 − U); and since 1 − U is itself uniform on (0, 1), we may equally set X = −(1/λ) log U.
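In Python, the exponential case reads as follows (a sketch; the function name is ours):

import math
import random

def exponential(lam):
    """Simulate an Exponential(lam) variate by the inverse transform method."""
    u = random.random()                # U ~ U(0,1)
    return -math.log(1.0 - u) / lam    # x = -(1/lam) log(1 - u) solves F(x) = u

print(exponential(2.0))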
The Acceptance-Rejection Method Suppose we have a method for simulating a random variable having density function g, and let c be a constant such that

f(y)/g(y) ≤ c for all y.
This leads to the following technique for simulating a random variable with density f .
1. Step 1: Simulate Y having density g and generate a random number U ∼ U(0, 1).
2. Step 2: If U ≤ f(Y)/(c g(Y)), set X = Y. Otherwise, return to Step 1.
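The two steps translate directly into the following generic Python sketch; the caller supplies the densities f and g, a sampler for g, and the bound c (all names here are ours):

import random

def acceptance_rejection(f, g, simulate_g, c):
    """Simulate from density f, given a sampler for g and f(y)/g(y) <= c."""
    while True:
        y = simulate_g()               # Step 1: Y ~ g
        u = random.random()            # Step 1: U ~ U(0,1)
        if u <= f(y) / (c * g(y)):     # Step 2: accept with probability f(Y)/(c g(Y))
            return y

# Example: f(y) = 2y on (0,1), with g the U(0,1) density and c = 2.
print(acceptance_rejection(lambda y: 2 * y, lambda y: 1.0, random.random, 2.0))

Since each iteration accepts with probability 1/c (as the proof below shows), the number of iterations needed is geometric with mean c, so the method is most efficient when c is close to 1.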
Proposition 2. The random variable X, generated by the acceptance-rejection method,
has density function f .
Proof. Let X be the value obtained, and let N denote the number of necessary iterations.
Then we compute P(X ≤ x) as follows:

P(X ≤ x) = P(Y_N ≤ x).
From the acceptance-rejection method, the acceptance criterion U ≤ f(Y)/(c g(Y)) implies

P(X ≤ x) = P(Y ≤ x | U ≤ f(Y)/(c g(Y))).
Using the definition of conditional probability,

P(X ≤ x) = P(Y ≤ x, U ≤ f(Y)/(c g(Y))) / K,

where K = P(U ≤ f(Y)/(c g(Y))) is the normalization constant. The joint density function of Y and U, due to independence, is

f(y, u) = g(y), 0 ≤ u ≤ 1.
Thus,
P(X ≤ x) = (1/K) ∫_{−∞}^{x} ∫_{0}^{f(y)/(c g(y))} g(y) du dy.

Evaluating the inner integral over u,

∫_{0}^{f(y)/(c g(y))} g(y) du = g(y) · f(y)/(c g(y)) = f(y)/c.
Substituting this into the outer integral
P(X ≤ x) = (1/K) ∫_{−∞}^{x} (f(y)/c) dy.

Simplifying,

P(X ≤ x) = (1/(cK)) ∫_{−∞}^{x} f(y) dy.
Letting x → ∞ and using the fact that f(y) is a valid probability density function (∫_{−∞}^{∞} f(y) dy = 1), we find

1 = (1/(cK)) ∫_{−∞}^{∞} f(y) dy = 1/(cK).

Thus,

cK = 1.
Therefore, we obtain

P(X ≤ x) = ∫_{−∞}^{x} f(y) dy,
which confirms that X has the cumulative distribution function of the target density f .
This completes the proof.
Simulation of the Standard Univariate Normal Distribution To simulate a
standard normal random variable Z, note first that the absolute value of Z has the
probability density function
f(x) = √(2/π) e^{−x²/2},  0 < x < ∞.
We will start by simulating from the preceding density function by using the acceptance-
rejection method, with g(x) being the exponential density function with mean 1, i.e., g(x) = e^{−x}, 0 < x < ∞. Then

f(x)/g(x) = √(2/π) e^{x − x²/2},

which is maximized at x = 1, so we may take

c = max_x f(x)/g(x) = √(2e/π),  and hence  f(y)/(c g(y)) = exp(−(y − 1)²/2).

The acceptance-rejection steps are then:

1. Simulate Y , an exponential random variable with rate 1, and generate a random number U .
2. Accept Y as X if

U ≤ exp(−(Y − 1)²/2).

Otherwise, return to Step 1.
Once a random variable X is generated with the above method, a standard normal
random variable Z can be obtained by assigning Z to be either X or −X with equal
probability. In Step 2, the acceptance criterion U ≤ exp(−(Y − 1)²/2) can also be expressed as

− log U ≥ (Y − 1)²/2.
Note that − log U is exponentially distributed with rate 1. Summing up, then, we have the following algorithm:

1. Simulate Y_1 and Y_2 , independent exponential random variables with rate 1.
2. If Y_2 ≥ (Y_1 − 1)²/2, set X = Y_1 ; otherwise, return to Step 1.
3. Generate a random number U and set Z = X if U ≤ 1/2 and Z = −X if U > 1/2.
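A Python sketch of this final algorithm, combining the exponential proposals with the random sign (the function name is ours):

import math
import random

def standard_normal():
    """Simulate Z ~ N(0,1) via acceptance-rejection with exponential proposals."""
    while True:
        y1 = -math.log(1.0 - random.random())   # Y_1 ~ Exponential(1); 1-U avoids log(0)
        y2 = -math.log(1.0 - random.random())   # Y_2 ~ Exponential(1), playing the role of -log U
        if y2 >= (y1 - 1) ** 2 / 2:             # acceptance criterion from Step 2
            x = y1                              # X has the half-normal density
            return x if random.random() <= 0.5 else -x   # attach a random sign

print(standard_normal())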
Sample Moments and Their Distributions
Random Sampling
Let X be a random variable (RV) with distribution function (DF) F , and let X1 , X2 , . . . , Xn
be independent and identically distributed (iid) random variables with common DF F .
Then the collection X1 , X2 , . . . , Xn is known as a random sample of size n from the
DF F , or simply as n independent observations on X.
The set of n values x1 , x2 , . . . , xn is called a realization of the sample. Note that
the possible values of the random variables (X1 , X2 , . . . , Xn ) can be regarded as points
in Rn , which may be called the sample space.
If X1 , X2 , . . . , Xn is a random sample from the distribution function (DF) F , their
joint DF is given by:
F^{*}(x_1, x_2, . . . , x_n) = ∏_{i=1}^{n} F(x_i).
The sample mean is defined as X̄ = (1/n) Σ_{i=1}^{n} X_i . If the X_i have common mean µ and variance σ², then

E[X̄] = (1/n) Σ_{i=1}^{n} E[X_i ] = µ,

and

Var(X̄) = Var( (1/n) Σ_{i=1}^{n} X_i )
        = (1/n²) Σ_{i=1}^{n} Var(X_i )   (because the X_i 's are independent, and Var(aX_i ) = a² Var(X_i ))
        = (1/n²) Σ_{i=1}^{n} σ² = (1/n²) · nσ² = σ²/n.
We denote E[X̄] = µ_X̄ and Var(X̄) = σ²_X̄ . Here, σ_X̄ is called the standard error of the mean.
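The relation Var(X̄) = σ²/n is easy to check empirically; the Python sketch below uses U(0, 1) observations, for which σ² = 1/12, with sample sizes chosen only for illustration:

import random
import statistics

n, replications = 100, 10_000
means = [statistics.fmean(random.random() for _ in range(n))
         for _ in range(replications)]
# Var(X bar) should be close to sigma^2 / n = (1/12)/100 ≈ 0.000833.
print(statistics.variance(means))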
Hence, for the sample variance S² = (Σ_{i=1}^{n} X_i² − n X̄²)/(n − 1),

E[S²] = E[ (Σ_{i=1}^{n} X_i² − n X̄²)/(n − 1) ] = (1/(n − 1)) Σ_{i=1}^{n} E[X_i²] − (n/(n − 1)) E[X̄²].
Using the fact that E(X 2 ) = Var(X) + µ2 we have:
E[S²] = (1/(n − 1)) [ n(σ² + µ²) − n · (σ²/n) − n µ² ].
Simplifying:
E[S²] = (1/(n − 1)) [ n σ² + n µ² − σ² − n µ² ] = ((n − 1)/(n − 1)) σ² = σ².
Thus, the expected value of the sample variance is the same as the variance of the
population under consideration.
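This unbiasedness can likewise be checked by simulation; the sketch below again uses U(0, 1) data (σ² = 1/12 ≈ 0.0833), and statistics.variance uses the n − 1 divisor, matching the definition of S²:

import random
import statistics

n, replications = 10, 10_000
s2_values = [statistics.variance(random.random() for _ in range(n))
             for _ in range(replications)]
# The average of S^2 should be close to sigma^2 = 1/12 ≈ 0.0833.
print(statistics.fmean(s2_values))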