Transformations
The topic for this chapter is transformations of random variables and random vectors. After applying a function to a random variable X or random vector X, the goal is to find the distribution of the transformed random variable or the joint distribution of the transformed random vector.
Transformations of random variables appear all over the place in statistics. Here are
a few examples, to preview the kinds of transformations we’ll be looking at in this
chapter.
• Unit conversion: In one dimension, we’ve already seen how standardization and
location-scale transformations can be useful tools for learning about an entire
family of distributions. A location-scale change is linear, converting an r.v. X to
the r.v. Y = aX + b where a and b are constants (with a > 0).
There are also many situations in which we may be interested in nonlinear transformations, e.g., converting from the dollar-yen exchange rate to the yen-dollar exchange rate, or converting information like "Janet's waking hours yesterday consisted of 8 hours of work, 4 hours visiting friends, and 4 hours surfing the web" to the format "Janet was awake for 16 hours yesterday; she spent 1/2 of that time working, 1/4 of that time visiting friends, and 1/4 of that time surfing the web". The change of variables formula, which is the first result in this chapter, shows what happens to the distribution when a random vector is transformed.
• Sums and averages as summaries: It is common in statistics to summarize n
observations by their sum or sample average. Turning X1 , . . . , Xn into the sum
T = X1 + · · · + Xn or sample mean X̄n = T /n is a transformation from Rn to R.
The term for a sum of independent random variables is convolution. We have
already encountered stories and MGFs as two techniques for dealing with convo-
lutions. In this chapter, convolution sums and integrals, which are based on the
law of total probability, will give us another way of obtaining the distribution of
a sum of r.v.s.
• Extreme values: In many contexts, we may be interested in the distribution of
the most extreme observations. For disaster preparedness, government agencies
may be concerned about the most extreme flood or earthquake in a 100-year
period; in finance, a portfolio manager with an eye toward risk management will
want to know the worst 1% or 5% of portfolio returns. In these applications,
we are concerned with the maximum or minimum of a set of observations.
For a one-to-one g, the situation is particularly simple, because there is only one value of x such that g(x) = y, namely g⁻¹(y). Then we can use

P(g(X) = y) = P(X = g⁻¹(y))
to convert between the PMFs of X and g(X), as also discussed in Section 3.7. For example, it is extremely easy to convert between the Geometric and First Success distributions (a numerical sketch follows this list).
• In the continuous case, a universal approach is to start from the CDF of g(X), and translate the event g(X) ≤ y into an equivalent event involving X. For general g, we may have to think carefully about how to express g(X) ≤ y in terms of X, and there is no easy formula we can plug into. But when g is continuous and strictly increasing, the translation is easy: g(X) ≤ y is the same as X ≤ g⁻¹(y), so

F_g(X)(y) = P(g(X) ≤ y) = P(X ≤ g⁻¹(y)) = FX(g⁻¹(y)).

We can then differentiate with respect to y to get the PDF of g(X). This gives a
one-dimensional version of the change of variables formula, which generalizes to
invertible transformations in multiple dimensions.
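Here is a minimal numerical sketch of the discrete conversion (in Python; the names and the library check are ours, not the book's). It uses the book's conventions: X ∼ Geom(p) counts failures before the first success (support 0, 1, 2, ...), while Y ∼ FS(p) counts trials including the success (support 1, 2, 3, ...), so Y ∼ FS(p) exactly when Y − 1 ∼ Geom(p).

```python
import numpy as np
from scipy import stats

# Sketch: if Y ~ FS(p), then Y - 1 ~ Geom(p), so P(Y = y) = P(X = y - 1),
# where X ~ Geom(p) has PMF (1 - p)^k * p on k = 0, 1, 2, ....
p = 0.3
ys = np.arange(1, 9)                        # possible values of Y ~ FS(p)
geom_pmf_shifted = (1 - p) ** (ys - 1) * p  # P(X = y - 1) for X ~ Geom(p)
fs_pmf = stats.geom.pmf(ys, p)              # SciPy's "geom" uses the FS convention
print(np.allclose(fs_pmf, geom_pmf_shifted))  # True
```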
For Y = g(X) with g differentiable and strictly monotone, the change of variables formula says that the PDF of Y is

fY(y) = fX(x) |dx/dy|,

where x = g⁻¹(y).
When applying the change of variables formula, we can choose whether to compute dx/dy, or to compute dy/dx and take the reciprocal. By the chain rule, these give the same result, so we can do whichever is easier.
⚠ 8.1.2. When finding the distribution of Y, be sure to:
• Check the assumptions of the change of variables theorem carefully if you wish to
apply it (if it doesn’t apply, a good strategy is to start with the CDF of Y ).
• Express your final answer for the PDF of Y as a function of y.
• Specify the support of Y .
The change of variables formula (in the strictly increasing g case) is easy to remem-
ber when written in the form
fY (y)dy = fX (x)dx,
which has an aesthetically pleasing symmetry to it. This formula also makes sense if
we think about units. For example, let X be a measurement in inches and Y = 2.54X
be the conversion into centimeters (cm). Then the units of fX(x) are inches⁻¹ and the units of fY(y) are cm⁻¹, so it would be absurd to say something like "fY(y) = fX(x)". But dx is measured in inches and dy is measured in cm, so fY(y)dy
and fX (x)dx are unitless quantities, and it makes sense to equate them. Better yet,
fX (x)dx and fY (y)dy have probability interpretations (recall from Chapter 5 that
fX (x)dx is essentially the probability that X is in a tiny interval of length dx,
centered at x), which makes it easier to think intuitively about what the change of
variables formula is saying.
The next two examples derive the PDFs of two r.v.s that are defined as transforma-
tions of a standard Normal r.v. In the first example the change of variables formula
applies; in the second example it does not.
Example 8.1.3 (Log-Normal PDF). Let X ∼ N(0, 1), Y = e^X. In Chapter 6 we named the distribution of Y the Log-Normal, and we found all of its moments using the MGF of the Normal distribution. Now we can use the change of variables formula to find the PDF of Y, since g(x) = e^x is strictly increasing. Let y = e^x, so x = log y and dy/dx = e^x. Then

fY(y) = fX(x) |dx/dy| = φ(x) (1/e^x) = φ(log y) (1/y), y > 0.
Note that after applying the change of variables formula, we write everything on
the right-hand side in terms of y, and we specify the support of the distribution. To
determine the support, we just observe that as x ranges from −∞ to ∞, e^x ranges from 0 to ∞.
We can get the same result by working from the definition of the CDF, translating the event Y ≤ y into an equivalent event involving X. For y > 0,

FY(y) = P(e^X ≤ y) = P(X ≤ log y) = Φ(log y),

so

fY(y) = (d/dy) Φ(log y) = φ(log y) (1/y), y > 0. □
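As a sanity check, here is a small simulation sketch (in Python; not from the text, and all names and interval choices are ours) comparing the empirical density of Y = e^X against the formula φ(log y)/y at a few points.

```python
import numpy as np

# Sketch: estimate the density of Y = e^X, X ~ N(0,1), from simulation
# and compare with the change-of-variables answer phi(log y)/y.
rng = np.random.default_rng(0)
y_sim = np.exp(rng.standard_normal(10**6))

def f_Y(y):
    return np.exp(-np.log(y) ** 2 / 2) / (np.sqrt(2 * np.pi) * y)

for lo, hi in [(0.5, 0.6), (1.0, 1.1), (2.0, 2.1)]:
    mid = (lo + hi) / 2
    empirical = np.mean((y_sim > lo) & (y_sim < hi)) / (hi - lo)
    print(f"near y = {mid:.2f}: empirical {empirical:.4f}, formula {f_Y(mid):.4f}")
```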
Example 8.1.4 (Chi-Square PDF). Now let Y = X², still with X ∼ N(0, 1). Here g(x) = x² is not one-to-one on the real line, so the change of variables formula does not apply. Working from the CDF instead, for y > 0,

FY(y) = P(X² ≤ y) = P(−√y ≤ X ≤ √y) = 2Φ(√y) − 1,

so

fY(y) = 2φ(√y) · (1/2)y^(−1/2) = φ(√y) y^(−1/2), y > 0. □
Example 8.1.5 (Lighthouse). A lighthouse shines a ray of light toward a beach at a uniformly random angle U ∼ Unif(−π/2, π/2). Consider a line which is parallel to the shore and 1 mile away from the shore, as illustrated in Figure 8.1. An angle of 0 would mean the ray of light is perpendicular to the shore, while an angle of π/2 would mean the ray is along the shore, shining to the right from the perspective of the figure.
Let X be the point that the light hits on the line, where the line’s origin is the point
on the line that is closest to the lighthouse. Find the distribution of X.
FIGURE 8.1: A lighthouse shining light at a random angle U, viewed from above.
Solution: Looking at the right triangle in Figure 8.1, the length of the side opposite U divided by the length of the side adjacent to U is X/1 = X, so

X = tan(U).

(The figure illustrates a case where U > 0 and, correspondingly, X > 0, but the same relationship holds when U ≤ 0.) Let x be a possible value of X and u be the corresponding possible value of U, so x = tan(u).
By the change of variables formula, which applies since tan is a differentiable, strictly increasing function on (−π/2, π/2),

fX(x) = fU(u) |du/dx| = (1/π) · 1/(1 + x²),
which shows that X is Cauchy. In particular, this implies that E|X| is infinite (since
the expected value of a Cauchy does not exist), so on average X is infinitely far
from the origin of the line!
The fact that X is Cauchy also makes sense in light of universality of the Uniform.
As shown in Example 7.1.25, the Cauchy CDF is
F(x) = (1/π) arctan(x) + 0.5.

The inverse is F⁻¹(v) = tan(π(v − 0.5)), so for V ∼ Unif(0, 1) we have

F⁻¹(V) = tan(π(V − 0.5)) ∼ Cauchy.

This agrees with our earlier result since π(V − 0.5) ∼ Unif(−π/2, π/2). □
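A brief simulation sketch (Python; the names and check points are ours, not the book's) makes this concrete: generate tan(π(V − 0.5)) for V ∼ Unif(0, 1) and compare the empirical CDF with (1/π) arctan(x) + 0.5.

```python
import numpy as np

# Sketch: by universality of the Uniform, F^{-1}(V) = tan(pi*(V - 0.5))
# is Cauchy when V ~ Unif(0, 1). Check against F(x) = arctan(x)/pi + 0.5.
rng = np.random.default_rng(0)
v = rng.uniform(size=10**6)
x = np.tan(np.pi * (v - 0.5))

for t in [-3.0, -1.0, 0.0, 1.0, 3.0]:
    empirical = np.mean(x <= t)
    theory = np.arctan(t) / np.pi + 0.5
    print(f"F({t:+.1f}): empirical {empirical:.4f}, theory {theory:.4f}")
```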
We can also use the change of variables formula to find the PDF of a location-scale
transformation.
Example 8.1.6 (PDF of a location-scale transformation). Let X have PDF fX, and let Y = a + bX, with b ≠ 0. Let y = a + bx, to mirror the relationship between Y and X. Then dy/dx = b, so the PDF of Y is

fY(y) = fX(x) |dx/dy| = fX((y − a)/b) · (1/|b|). □
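A quick numerical sketch of this formula (Python; the choices a = 2 and b = −3 are ours, with b negative to exercise the absolute value):

```python
import numpy as np

# Sketch: Y = a + b*X with X ~ N(0,1) should have PDF
# f_Y(y) = f_X((y - a)/b) / |b|, even when b < 0.
rng = np.random.default_rng(0)
a, b = 2.0, -3.0
y_sim = a + b * rng.standard_normal(10**6)

def f_Y(y):
    x = (y - a) / b
    return np.exp(-x**2 / 2) / (np.sqrt(2 * np.pi) * abs(b))

for lo, hi in [(-4.1, -3.9), (1.9, 2.1), (4.9, 5.1)]:
    mid = (lo + hi) / 2
    empirical = np.mean((y_sim > lo) & (y_sim < hi)) / (hi - lo)
    print(f"near y = {mid:+.1f}: empirical {empirical:.4f}, formula {f_Y(mid):.4f}")
```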
In n dimensions, if Y = g(X) for an invertible, differentiable transformation g, the change of variables formula says that the joint PDF of Y is

fY(y) = fX(x) |det(∂x/∂y)|

for y in the support of Y, and 0 otherwise. (The inner bars around the Jacobian say to take the determinant and the outer bars say to take the absolute value.) That is, to convert fX(x) to fY(y) we express the x in fX(x) in terms of y and then multiply by the absolute value of the determinant of the Jacobian matrix ∂x/∂y.
As in the 1D case,

∂x/∂y = (∂y/∂x)⁻¹,
so we can compute whichever of the two Jacobians is easier, and then at the end
express the joint PDF of Y as a function of y.
We will not prove the change of variables formula here, but the idea is to apply the change of variables formula from multivariable calculus, together with the fact that if A is a region within the support A₀ of X and B = {g(x) : x ∈ A} is the corresponding region within the support B₀ of Y, then X ∈ A is equivalent to Y ∈ B; they are the same event. So P(X ∈ A) = P(Y ∈ B), which shows that

∫_A fX(x)dx = ∫_B fY(y)dy.
As a caution about forgetting the Jacobian, let Y = X³. If X is a discrete r.v., we can directly write P(Y = y) = P(X = y^(1/3)), but if X is continuous, a Jacobian is needed:

fY(y) = fX(x) |dx/dy| = fX(y^(1/3)) · 1/(3y^(2/3)).
Exercise 23 is a cautionary tale about someone who failed to use a Jacobian when
it was needed.
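To see the Jacobian factor at work, here is a small simulation sketch (Python; we take X ∼ N(0, 1) for concreteness, an assumption of ours) comparing the naive guess fX(y^(1/3)) with the Jacobian-corrected PDF.

```python
import numpy as np

# Sketch: for Y = X^3 with X ~ N(0,1), the naive guess phi(y^(1/3)) ignores
# the Jacobian; the correct PDF is phi(y^(1/3)) / (3 * y^(2/3)).
rng = np.random.default_rng(0)
y_sim = rng.standard_normal(10**6) ** 3

def phi(x):
    return np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

for lo, hi in [(0.5, 0.6), (1.0, 1.1), (2.0, 2.1)]:
    mid = (lo + hi) / 2
    empirical = np.mean((y_sim > lo) & (y_sim < hi)) / (hi - lo)
    naive = phi(mid ** (1 / 3))
    corrected = naive / (3 * mid ** (2 / 3))
    print(f"near y = {mid:.2f}: empirical {empirical:.4f}, "
          f"naive {naive:.4f}, with Jacobian {corrected:.4f}")
```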
The next two examples apply the 2D change of variables formula.
Example 8.1.9 (Box-Muller). Let U ∼ Unif(0, 2π), and let T ∼ Expo(1) be independent of U. Define

X = √(2T) cos U and Y = √(2T) sin U.
Find the joint PDF of (X, Y ). Are they independent? What are their marginal
distributions?
Solution:
The joint PDF of U and T is
fU,T(u, t) = (1/(2π)) e^(−t),

for u ∈ (0, 2π) and t > 0. Viewing (X, Y) as a point in the plane,

X² + Y² = 2T(cos²U + sin²U) = 2T

is the squared distance from the origin and U is the angle; that is, (√(2T), U) expresses (X, Y) in polar coordinates.
Since we can recover (U, T) from (X, Y), the transformation is invertible. The Jacobian matrix is

∂(x, y)/∂(u, t) = [ −√(2t) sin u   (1/√(2t)) cos u ]
                  [  √(2t) cos u   (1/√(2t)) sin u ],

which has determinant −sin²u − cos²u = −1, whose absolute value is 1 (so the determinant is never 0). Then letting x = √(2t) cos u, y = √(2t) sin u to mirror the transformation from (U, T) to (X, Y), we have
fX,Y(x, y) = fU,T(u, t) · |∂(u, t)/∂(x, y)|
           = (1/(2π)) e^(−t) · 1
           = (1/(2π)) e^(−(x² + y²)/2)
           = (1/√(2π)) e^(−x²/2) · (1/√(2π)) e^(−y²/2),
for all real x and y.
The joint PDF fX,Y factors into a function of x times a function of y, so X and Y
are independent. Furthermore, we recognize the joint PDF as the product of two
standard Normal PDFs, so X and Y are i.i.d. N (0, 1) r.v.s! This result is called the
Box-Muller method for generating Normal r.v.s. □
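Here is a minimal implementation sketch in Python of the method just derived (the function and variable names are ours):

```python
import numpy as np

# Box-Muller: U ~ Unif(0, 2*pi) and T ~ Expo(1), independent, give
# X = sqrt(2T) cos U and Y = sqrt(2T) sin U, which are i.i.d. N(0, 1).
def box_muller(n, rng):
    u = rng.uniform(0.0, 2 * np.pi, size=n)
    t = rng.exponential(1.0, size=n)
    r = np.sqrt(2 * t)
    return r * np.cos(u), r * np.sin(u)

rng = np.random.default_rng(0)
x, y = box_muller(10**6, rng)
print(x.mean(), x.std(), y.mean(), y.std())  # approx 0, 1, 0, 1
print(np.corrcoef(x, y)[0, 1])               # approx 0 (independence)
```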
Example 8.1.10 (Bivariate Normal joint PDF). In Chapter 7, we saw some prop-
erties of the Bivariate Normal distribution and found its joint MGF. Now let’s find
its joint PDF.
Let (Z, W) be BVN with N(0, 1) marginals and Corr(Z, W) = ρ. (If we want the joint PDF when the marginals are not standard Normal, we can standardize both components separately and use the result below.) Assume that −1 < ρ < 1, since otherwise the distribution is degenerate (with Z and W perfectly correlated).
Using the construction from Chapter 7, we can write

Z = X,
W = ρX + τY,

with τ = √(1 − ρ²) and X, Y i.i.d. N(0, 1). We also need the inverse transformation. Solving Z = X for X, we have X = Z. Plugging this into W = ρX + τY and solving for Y, we have

X = Z,
Y = −(ρ/τ)Z + (1/τ)W.
The Jacobian matrix is

∂(x, y)/∂(z, w) = [  1       0  ]
                  [ −ρ/τ   1/τ ],
which has absolute determinant 1/τ. So by the change of variables formula,

fZ,W(z, w) = fX,Y(x, y) · |∂(x, y)/∂(z, w)|
           = (1/(2πτ)) exp(−(1/2)(x² + y²))
           = (1/(2πτ)) exp(−(1/2)(z² + (−(ρ/τ)z + (1/τ)w)²))
           = (1/(2πτ)) exp(−(1/(2τ²))(z² + w² − 2ρzw)), for all real z, w.

In the last step we multiplied things out and used the fact that ρ² + τ² = 1. □
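A short simulation sketch in Python of this construction (the parameter ρ = 0.7 and all names are our choices):

```python
import numpy as np

# Sketch: Z = X, W = rho*X + tau*Y with X, Y i.i.d. N(0,1) and
# tau = sqrt(1 - rho^2) gives N(0,1) marginals with Corr(Z, W) = rho.
rng = np.random.default_rng(0)
rho = 0.7
tau = np.sqrt(1 - rho**2)
x = rng.standard_normal(10**6)
y = rng.standard_normal(10**6)
z, w = x, rho * x + tau * y
print(z.std(), w.std())          # both approx 1
print(np.corrcoef(z, w)[0, 1])   # approx 0.7
```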
8.2 Convolutions
Finding the distribution of a sum of independent r.v.s is a recurring task. For example, we used stories to show that the sum of independent Binomials with the same success probability is Binomial, and that the sum of i.i.d. Geometrics is Negative Binomial. We used MGFs to show that a sum of independent Normals is Normal.
⚠ 8.2.2. We use the assumption that X and Y are independent in order to get from P(Y = t − x | X = x) to P(Y = t − x) in the last step. We are only justified in dropping the condition X = x if the conditional distribution of Y given X = x is the same as the marginal distribution of Y, i.e., X and Y are independent. A common mistake is to assume that after plugging in x for X, we've "already used the information" that X = x, when in fact we need an independence assumption to drop the condition. Otherwise we destroy information without justification.
In the continuous case, since the value of a PDF at a point is not a probability, we first find the CDF and then differentiate to get the PDF. By LOTP,

FT(t) = P(X + Y ≤ t) = ∫_{−∞}^{∞} P(X + Y ≤ t | X = x) fX(x) dx
      = ∫_{−∞}^{∞} P(Y ≤ t − x) fX(x) dx
      = ∫_{−∞}^{∞} FY(t − x) fX(x) dx.

Differentiating with respect to t (and swapping the derivative and the integral, which is justified under regularity conditions) leads to

fT(t) = ∫_{−∞}^{∞} fY(t − x) fX(x) dx.
But care is still needed. For example, Exercise 23 shows that an analogous-looking
formula for the PDF of the product of two independent continuous r.v.s is wrong:
a Jacobian is needed (for convolutions, the absolute Jacobian determinant is 1 so it
isn’t noticeable in the convolution integral formula).
Since convolution sums are just the law of total probability, we have already used
them in previous chapters without mentioning the word convolution; see, for ex-
ample, the first and most tedious proof of Theorem 3.8.9 (sum of independent
Binomials), as well as the proof of Theorem 4.8.1 (sum of independent Poissons).
In the following examples, we find the distribution of a sum of Exponentials and a
sum of Uniforms using a convolution integral.
Example 8.2.4 (Exponential convolution). Let X, Y be i.i.d. Expo(λ). Find the distribution of T = X + Y.
Solution:
For t > 0, the convolution formula gives

fT(t) = ∫_{−∞}^{∞} fY(t − x) fX(x) dx = ∫_0^t λe^(−λ(t−x)) · λe^(−λx) dx,

where we restricted the integral to be from 0 to t since we need t − x > 0 and x > 0 for the PDFs inside the integral to be nonzero. Simplifying, we have

fT(t) = λ²e^(−λt) ∫_0^t dx = λ²te^(−λt), for t > 0.
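As a quick simulation sketch (Python; the choice λ = 2 and all names are ours), we can compare the empirical density of T = X + Y with λ²te^(−λt):

```python
import numpy as np

# Sketch: T = X + Y for X, Y i.i.d. Expo(lam) should have density
# lam^2 * t * exp(-lam * t).  Note numpy parameterizes by scale = 1/lam.
rng = np.random.default_rng(0)
lam = 2.0
n = 10**6
t_sim = rng.exponential(1 / lam, size=n) + rng.exponential(1 / lam, size=n)

def f_T(t):
    return lam**2 * t * np.exp(-lam * t)

for lo, hi in [(0.2, 0.3), (0.5, 0.6), (1.5, 1.6)]:
    mid = (lo + hi) / 2
    empirical = np.mean((t_sim > lo) & (t_sim < hi)) / (hi - lo)
    print(f"near t = {mid:.2f}: empirical {empirical:.4f}, formula {f_T(mid):.4f}")
```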
Example 8.2.5 (Uniform convolution). Now let X, Y be i.i.d. Unif(0, 1) and T = X + Y. The convolution formula gives fT(t) = ∫ g(t − x)g(x) dx, where g is the Unif(0, 1) PDF (equal to 1 on (0, 1) and 0 otherwise). The integrand is 1 if and only if 0 < t − x < 1 and 0 < x < 1; this is a parallelogram-shaped constraint. Equivalently, the constraint is max(0, t − 1) < x < min(t, 1).
FIGURE 8.2: Region in the (t, x)-plane where g(t − x)g(x) is 1.
From Figure 8.2, we see that for 0 < t ≤ 1, x is constrained to be in (0, t), and for 1 < t < 2, x is constrained to be in (t − 1, 1). Therefore, the PDF of T is a piecewise linear function:

fT(t) = ∫_0^t dx = t            for 0 < t ≤ 1,
fT(t) = ∫_{t−1}^1 dx = 2 − t    for 1 < t < 2.
Figure 8.3 plots the PDF of T . It is shaped like a triangle with vertices at 0, 1, and
2, so it is called the Triangle(0, 1, 2) distribution.
Heuristically, it makes sense that T is more likely to take on values near the mid-
dle than near the extremes: a value near 1 can be obtained if both X and Y are
moderate, if X is large but Y is small, or if Y is large but X is small. In contrast, a
value near 2 is only possible if both X and Y are large. Thinking back to Example
3.2.5, the PMF of the sum of two die rolls was also shaped like a triangle. A single
die roll has a Discrete Uniform distribution on the integers 1 through 6, so in that
problem we were looking at a convolution of two Discrete Uniforms. It makes sense
that the PDF we obtained here is similar in shape. □
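A quick simulation sketch (Python; names and interval choices ours) confirming the triangular shape:

```python
import numpy as np

# Sketch: T = X + Y for X, Y i.i.d. Unif(0,1) should have the
# Triangle(0, 1, 2) density: t on (0, 1], and 2 - t on (1, 2).
rng = np.random.default_rng(0)
n = 10**6
t_sim = rng.uniform(size=n) + rng.uniform(size=n)

def f_T(t):
    return np.where(t <= 1, t, 2 - t)

for lo, hi in [(0.2, 0.3), (0.9, 1.0), (1.7, 1.8)]:
    mid = (lo + hi) / 2
    empirical = np.mean((t_sim > lo) & (t_sim < hi)) / (hi - lo)
    print(f"near t = {mid:.2f}: empirical {empirical:.4f}, formula {f_T(mid):.4f}")
```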
FIGURE 8.3: PDF of T = X + Y, where X and Y are i.i.d. Unif(0, 1).

8.3 Beta

In this section and the next, we will introduce two continuous distributions, the Beta and Gamma, which are related to several named distributions we have already
studied and are also related to each other via a shared story. This is an interlude
from the subject of transformations, but we’ll eventually need to use a change of
variables to tie the Beta and Gamma distributions together.
The Beta distribution is a continuous distribution on the interval (0, 1). It is a
generalization of the Unif(0, 1) distribution, allowing the PDF to be non-constant
on (0, 1).
Definition 8.3.1 (Beta distribution). An r.v. X is said to have the Beta distribution with parameters a and b, where a > 0 and b > 0, if its PDF is

f(x) = (1/β(a, b)) x^(a−1) (1 − x)^(b−1),   0 < x < 1,

where the constant β(a, b) is chosen to make the PDF integrate to 1. We write this as X ∼ Beta(a, b).
Taking a = b = 1, the Beta(1, 1) PDF is constant on (0, 1), so the Beta(1, 1) and
Unif(0, 1) distributions are the same. By varying the values of a and b, we get PDFs
with a variety of shapes; Figure 8.4 shows four examples. Here are a couple of general
patterns:
• If a < 1 and b < 1, the PDF is U-shaped and opens upward. If a > 1 and b > 1, the PDF opens downward.
• If a = b, the PDF is symmetric about 1/2. If a > b, the PDF favors values larger
than 1/2; if a < b, the PDF favors values smaller than 1/2.
By definition, the constant β(a, b) satisfies

β(a, b) = ∫_0^1 x^(a−1) (1 − x)^(b−1) dx.
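The normalizing constant can be evaluated numerically; here is a small sketch (Python; the parameter pairs are our choices) comparing quadrature of the integral above with SciPy's built-in beta function:

```python
import numpy as np
from scipy import integrate, special

# Sketch: compute beta(a, b) = integral of x^(a-1) (1-x)^(b-1) over (0, 1)
# by numerical quadrature, and compare with scipy.special.beta.
for a, b in [(1, 1), (2, 1), (2, 8), (5, 5)]:
    val, _ = integrate.quad(lambda x, a=a, b=b: x**(a - 1) * (1 - x)**(b - 1), 0, 1)
    print(f"beta({a}, {b}): quadrature {val:.6f}, scipy {special.beta(a, b):.6f}")
```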