6 Jointly Continuous Random Variables
Again, we deviate from the order in the book for this chapter, so the subsections in this chapter do not correspond to those in the text.
6.1 Joint density functions
Recall that X is continuous if there is a function f_X(x) (the density) such that
$$P(X \le t) = \int_{-\infty}^{t} f_X(x)\, dx$$
We say that X and Y are jointly continuous if there is a function f_{X,Y}(x,y) (the joint density) such that
$$P(X \le s, Y \le t) = \int\!\!\int_{\{x \le s,\ y \le t\}} f_{X,Y}(x,y)\, dx\, dy$$
The integral is over {(x, y) : x ≤ s, y ≤ t}. We can also write the integral as
$$P(X \le s, Y \le t) = \int_{-\infty}^{s} \int_{-\infty}^{t} f_{X,Y}(x,y)\, dy\, dx = \int_{-\infty}^{t} \int_{-\infty}^{s} f_{X,Y}(x,y)\, dx\, dy$$
The joint density must be non-negative and satisfy
$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x,y)\, dx\, dy = 1$$
Just as with one random variable, the joint density function contains all
the information about the underlying probability measure if we only look at
the random variables X and Y . In particular, we can compute the probability
of any event defined in terms of X and Y just using f (x, y).
Here are some events defined in terms of X and Y: {X ≤ Y}, {X² + Y² ≤ 1}, and {1 ≤ X ≤ 4, Y ≥ 0}. They can all be written in the form {(X, Y) ∈ A} for some subset A of R².
Proposition 1. For A ⊂ R²,
$$P((X,Y) \in A) = \int\!\!\int_A f(x,y)\, dx\, dy$$
Example: An important joint density is
$$f(x,y) = \frac{1}{2\pi} \exp\!\left(-\frac{1}{2}(x^2 + y^2)\right)$$
We will see below that this is the joint density of two independent standard normal random variables.
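To illustrate how Proposition 1 is used, here is a small example of the kind not spelled out above: suppose f(x,y) = 1 on the unit square [0,1] × [0,1] and 0 elsewhere, and A = {(x,y) : x ≤ y}. Then
$$P(X \le Y) = \int\!\!\int_A f(x,y)\, dx\, dy = \int_0^1 \int_0^y 1\, dx\, dy = \int_0^1 y\, dy = \frac{1}{2}$$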
6.2 Marginal densities and independence
Suppose we know the joint density f_{X,Y}(x,y) of X and Y. How do we find their individual densities f_X(x), f_Y(y)? These are called marginal densities.
The cdf of X is
$$F_X(x) = P(X \le x) = P(-\infty < X \le x,\ -\infty < Y < \infty) = \int_{-\infty}^{x} \int_{-\infty}^{\infty} f_{X,Y}(u,y)\, dy\, du$$
Differentiating with respect to x, we get the marginal density of X:
$$f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x,y)\, dy$$
Similarly, the marginal density of Y is
$$f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(x,y)\, dx$$
We will define independence of two continuous random variables differently than the book. The two definitions are equivalent.
Definition 2. Let X, Y be jointly continuous random variables with joint
density fX,Y (x, y) and marginal densities fX (x), fY (y). We say they are
independent if
fX,Y (x, y) = fX (x)fY (y)
If we know the joint density of X and Y , then we can use the definition
to see if they are independent. But the definition is often used in a different
way. If we know the marginal densities of X and Y and we know that they
are independent, then we can use the definition to find their joint density.
Example: If X and Y are independent random variables and each has the standard normal distribution, what is their joint density?
$$f(x,y) = \frac{1}{2\pi} \exp\!\left(-\frac{1}{2}(x^2 + y^2)\right)$$
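To spell out the computation: by independence the joint density is the product of the marginals,
$$f(x,y) = f_X(x)\, f_Y(y) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2} \cdot \frac{1}{\sqrt{2\pi}} e^{-y^2/2} = \frac{1}{2\pi} \exp\!\left(-\frac{1}{2}(x^2 + y^2)\right)$$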
6.3 Expected value
Recall that for a single continuous random variable, E[X] = ∫ x f_X(x) dx. If we write the marginal f_X(x) in terms of the joint density, then this becomes
$$E[X] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x\, f_{X,Y}(x,y)\, dx\, dy$$
provided the integral converges absolutely.
Example: Let
$$f(x,y) = \begin{cases} x+y, & \text{if } 0 \le x \le 1,\ 0 \le y \le 1 \\ 0, & \text{otherwise} \end{cases}$$
Let Z = X + Y . Find the mean and variance of Z.
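One way to carry out the computation: by symmetry E[X] = E[Y], and
$$E[X] = \int_0^1 \int_0^1 x(x+y)\, dy\, dx = \int_0^1 \left(x^2 + \frac{x}{2}\right) dx = \frac{7}{12}$$
so E[Z] = E[X] + E[Y] = 7/6. Similarly E[X²] = E[Y²] = 5/12 and E[XY] = 1/3, so
$$E[Z^2] = E[X^2] + 2E[XY] + E[Y^2] = \frac{3}{2}, \qquad \mathrm{var}(Z) = \frac{3}{2} - \left(\frac{7}{6}\right)^2 = \frac{5}{36}$$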
We now consider independence and expectation.
If X and Y are independent, then f_{X,Y}(x,y) = f_X(x) f_Y(y), so
$$E[XY] = \int\!\!\int xy\, f_X(x) f_Y(y)\, dx\, dy = \int_{-\infty}^{\infty} x\, f_X(x)\, dx \int_{-\infty}^{\infty} y\, f_Y(y)\, dy = E[X]\, E[Y]$$
6.4 Functions of two random variables
Example: Let X and Y be independent random variables, each uniformly distributed on [0,1], and let Z = XY. Find the density of Z.
The joint density is 1 on the unit square. For 0 ≤ z ≤ 1, the region {xy ≤ z} contains the whole strip {x ≤ z}, which contributes z, so
$$F_Z(z) = P(XY \le z) = z + \int_z^1 \int_0^{z/x} 1\, dy\, dx = z + \int_z^1 \frac{z}{x}\, dx = z + z \ln x \Big|_z^1 = z - z\ln z$$
Differentiating,
$$f_Z(z) = \begin{cases} -\ln z, & \text{if } 0 < z \le 1 \\ 0, & \text{otherwise} \end{cases}$$
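As a quick sanity check, this density integrates to 1:
$$\int_0^1 (-\ln z)\, dz = \big[z - z\ln z\big]_0^1 = 1$$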
Now consider the sum Z = X + Y, where X and Y have joint density f(x,y). Then
$$F_Z(z) = P(X + Y \le z) = \int_{-\infty}^{\infty} \int_{-\infty}^{z-x} f(x,y)\, dy\, dx$$
so
$$f_Z(z) = \frac{d}{dz} F_Z(z) = \frac{d}{dz} \int_{-\infty}^{\infty} \int_{-\infty}^{z-x} f(x,y)\, dy\, dx = \int_{-\infty}^{\infty} f(x, z-x)\, dx$$
This is known as a convolution. We can use this formula to find the density of
the sum of two independent random variables. But in some cases it is easier
to do this using generating functions which we study in the next section.
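Spelling this out in the independent case: if X and Y are independent, then f(x, z−x) = f_X(x) f_Y(z−x), so
$$f_Z(z) = \int_{-\infty}^{\infty} f_X(x)\, f_Y(z-x)\, dx$$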
Example: Let X and Y be independent random variables each of which has
the standard normal distribution. Find the density of Z = X + Y .
We need to compute the convolution
$$f_Z(z) = \int_{-\infty}^{\infty} \frac{1}{2\pi} \exp\!\left(-\frac{1}{2}x^2 - \frac{1}{2}(z-x)^2\right) dx$$
$$= \frac{1}{2\pi} \int_{-\infty}^{\infty} \exp\!\left(-x^2 - \frac{1}{2}z^2 + xz\right) dx$$
$$= \frac{1}{2\pi} \int_{-\infty}^{\infty} \exp\!\left(-(x - z/2)^2 - \frac{1}{4}z^2\right) dx$$
$$= \frac{1}{2\pi}\, e^{-z^2/4} \int_{-\infty}^{\infty} \exp\!\left(-(x - z/2)^2\right) dx$$
Now the substitution u = x − z/2 shows
$$\int_{-\infty}^{\infty} \exp\!\left(-(x - z/2)^2\right) dx = \int_{-\infty}^{\infty} e^{-u^2}\, du = \sqrt{\pi}$$
So
$$f_Z(z) = \frac{\sqrt{\pi}}{2\pi}\, e^{-z^2/4} = \frac{1}{\sqrt{4\pi}}\, e^{-z^2/4}$$
which is the density of a normal distribution with mean 0 and variance 2.
6.5 Moment generating functions
The moment generating function of a continuous random variable X is
$$M(t) = E[e^{tX}] = \int_{-\infty}^{\infty} e^{tx} f_X(x)\, dx$$
Example: Compute it for the exponential distribution with parameter λ. You should find
$$M(t) = \frac{\lambda}{\lambda - t}, \qquad t < \lambda$$
Example: In the homework you will compute it for the gamma distribution and find (hopefully)
$$M(t) = \left(\frac{\lambda}{\lambda - t}\right)^{w}$$
where w and λ are the parameters of the gamma distribution.
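A short derivation of the exponential case: for X exponential with parameter λ and t < λ,
$$M(t) = \int_0^{\infty} e^{tx}\, \lambda e^{-\lambda x}\, dx = \lambda \int_0^{\infty} e^{-(\lambda - t)x}\, dx = \frac{\lambda}{\lambda - t}$$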
Just as before, differentiating the moment generating function k times and setting t = 0 gives the moments:
$$M^{(k)}(0) = \frac{d^k}{dt^k} \int e^{tx} f_X(x)\, dx \bigg|_{t=0} = \int f_X(x)\, x^k e^{tx}\Big|_{t=0}\, dx = \int f_X(x)\, x^k\, dx = E[X^k]$$
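For example, for the exponential distribution M(t) = λ/(λ − t) gives
$$M'(t) = \frac{\lambda}{(\lambda - t)^2}, \qquad M''(t) = \frac{2\lambda}{(\lambda - t)^3}$$
so E[X] = M'(0) = 1/λ, E[X²] = M''(0) = 2/λ², and hence var(X) = E[X²] − (E[X])² = 1/λ².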
Moment generating functions can be used to show, for example, that if X_1, ..., X_n are independent normal random variables with means μ_i and variances σ_i², then their sum X_1 + ⋯ + X_n is normal with mean
$$\sum_{i=1}^{n} \mu_i \qquad \text{and variance} \qquad \sum_{i=1}^{n} \sigma_i^2$$
6.6 The joint cdf
The joint cumulative distribution function of X and Y is
$$F_{X,Y}(x,y) = P(X \le x, Y \le y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f(u,v)\, dv\, du$$
If we know the joint cdf, then we can compute the joint pdf by taking partial derivatives of the above:
$$\frac{\partial^2}{\partial x\, \partial y} F_{X,Y}(x,y) = f(x,y)$$
Calc review: partial derivatives.
The joint cdf has properties similar to the cdf for a single RV.
Proposition 5. Let F (x, y) be the joint cdf of two continuous random variables. Then F (x, y) is a continuous function on R2 and
$$\lim_{x,y \to -\infty} F(x,y) = 0, \qquad \lim_{x,y \to \infty} F(x,y) = 1,$$
$$F(x_1, y) \le F(x_2, y) \text{ if } x_1 \le x_2, \qquad F(x, y_1) \le F(x, y_2) \text{ if } y_1 \le y_2,$$
$$\lim_{x \to \infty} F(x,y) = F_Y(y), \qquad \lim_{y \to \infty} F(x,y) = F_X(x)$$
We will use the joint cdf to prove more results about independence of RVs.
Theorem 3. If X and Y are jointly continuous random variables then they
are independent if and only if FX,Y (x, y) = FX (x)FY (y).
The theorem is true for discrete random variables as well.
Proof.
Example: Suppose that the joint cdf of X and Y is
$$F(x,y) = \begin{cases} \dfrac{(1 - e^{-2x})(y+1)}{2}, & \text{if } -1 \le y \le 1,\ x \ge 0 \\[4pt] 1 - e^{-2x}, & \text{if } y > 1,\ x \ge 0 \\[2pt] 0, & \text{if } y < -1 \\[2pt] 0, & \text{if } x < 0 \end{cases}$$
Show that X and Y are independent and find their joint density.
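A sketch of the solution, assuming the cdf has the form displayed above: letting y → ∞ gives F_X(x) = 1 − e^{−2x} for x ≥ 0, and letting x → ∞ gives F_Y(y) = (y+1)/2 for −1 ≤ y ≤ 1. Since F(x,y) = F_X(x) F_Y(y) for all x, y, the random variables are independent, and
$$f(x,y) = \frac{\partial^2}{\partial x\, \partial y} F(x,y) = 2e^{-2x} \cdot \frac{1}{2} = e^{-2x}, \qquad x \ge 0,\ -1 \le y \le 1$$
(and 0 otherwise). So X is exponential with parameter 2 and Y is uniform on [−1, 1].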
Theorem 4. If X and Y are independent jointly continuous random variables and g and h are functions from R to R then g(X) and h(Y ) are independent random variables.
Proof. We will only prove a special case. We assume that g and h are increasing. We also assume they are differentiable. Let W = g(X), Z = h(Y ).
By the previous theorem we can show that W and Z are independent by
showing that FW,Z (w, z) = FW (w)FZ (z). We have
$$F_{W,Z}(w,z) = P(g(X) \le w,\ h(Y) \le z)$$
Because g and h are increasing, the event {g(X) ≤ w, h(Y) ≤ z} is the same as the event {X ≤ g⁻¹(w), Y ≤ h⁻¹(z)}. So
$$F_{W,Z}(w,z) = P(X \le g^{-1}(w),\ Y \le h^{-1}(z)) = F_{X,Y}(g^{-1}(w), h^{-1}(z)) = F_X(g^{-1}(w))\, F_Y(h^{-1}(z))$$
where the last equality comes from the previous theorem and the independence of X and Y. The individual cdfs of W and Z are
$$F_W(w) = P(X \le g^{-1}(w)) = F_X(g^{-1}(w)), \qquad F_Z(z) = P(Y \le h^{-1}(z)) = F_Y(h^{-1}(z))$$
So we have shown FW,Z (w, z) = FW (w)FZ (z).
End of October 28 lecture
Corollary 2. If X and Y are independent jointly continuous random variables and g and h are functions from R to R then
E[g(X)h(Y )] = E[g(X)]E[h(Y )]
Recall that for any two random variables X and Y , we have E[X + Y ] =
E[X] + E[Y ]. If they are independent we also have
Theorem 5. If X and Y are independent and jointly continuous, then
var(X + Y ) = var(X) + var(Y )
Proof.
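A short sketch of the argument: using the corollary with g(x) = x and h(y) = y,
$$\mathrm{var}(X+Y) = E[(X+Y)^2] - (E[X] + E[Y])^2 = E[X^2] + 2E[XY] + E[Y^2] - E[X]^2 - 2E[X]E[Y] - E[Y]^2$$
$$= \mathrm{var}(X) + \mathrm{var}(Y) + 2\big(E[XY] - E[X]E[Y]\big) = \mathrm{var}(X) + \mathrm{var}(Y)$$
since E[XY] = E[X]E[Y] for independent X and Y.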
6.7 Change of variables
Suppose we have two random variables X and Y and we know their joint
density. We have two functions g : R² → R and h : R² → R, and we define
two new random variables by U = g(X, Y ), W = h(X, Y ). Can we find
the joint density of U and W ? In principle we can do this by computing
their joint cdf and then taking partial derivatives. In practice this can be a
mess. There is another way involving Jacobians, which we will study in this
section. But we start by illustrating the cdf approach with an example.
Example: Let X and Y be independent standard normal RVs. Let U = X + Y and W = X − Y. Find the joint density of U and W.
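For reference, and assuming W = X − Y as written above, the cdf computation (or the Jacobian method below) gives
$$f_{U,W}(u,w) = \frac{1}{4\pi} \exp\!\left(-\frac{u^2 + w^2}{4}\right)$$
so U and W are independent, each normal with mean 0 and variance 2.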
There is another way to compute the joint density of U, W that we will now study. First we return to the case of a function of a single random variable. Suppose that X is a continuous random variable and we know its density f_X(x). Let Y = g(X), where g is increasing and differentiable. Then
$$f_Y(y) = f_X(g^{-1}(y))\, \frac{d}{dy} g^{-1}(y)$$
Proof. The cdf of Y is
$$F_Y(y) = P(g(X) \le y) = P(X \le g^{-1}(y)) = \int_{-\infty}^{g^{-1}(y)} f_X(x)\, dx$$
Differentiating with respect to y, using the fundamental theorem of calculus and the chain rule, gives the formula above.
Now return to two random variables. Let U = g(X,Y), W = h(X,Y), and write T(x,y) = (g(x,y), h(x,y)). Suppose T is a bijection, so we can solve for x, y in terms of u, w; write T⁻¹(u,w) for the inverse map. Define the Jacobian
$$J(u,w) = \det \begin{pmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial w} \\[6pt] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial w} \end{pmatrix}$$
We then have
$$\int\!\!\int_{A} f(x,y)\, dx\, dy = \int\!\!\int_{T(A)} f(T^{-1}(u,w))\, |J(u,w)|\, du\, dw$$
Often f(T⁻¹(u,w)) is simply written as f(u,w). In practice you write f, which is originally a function of x and y, as a function of u and w. Applying this with A = T⁻¹(B) for a set B in the (u,w) plane shows that
$$P((U,W) \in B) = \int\!\!\int_{B} f_{X,Y}(T^{-1}(u,w))\, |J(u,w)|\, du\, dw$$
so the joint density of U and W is
$$f_{U,W}(u,w) = f_{X,Y}(T^{-1}(u,w))\, |J(u,w)|$$
Example: Let X and Y be independent standard normal RVs and let (R, Θ) be the polar coordinates of (X, Y), so x = r cos(θ), y = r sin(θ). The change of variables formula gives
$$f_{R,\Theta}(r,\theta) = \begin{cases} \dfrac{1}{2\pi}\, r\, e^{-r^2/2}, & \text{if } r \ge 0,\ 0 \le \theta \le 2\pi \\[4pt] 0, & \text{otherwise} \end{cases}$$
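To fill in the computation: with x = r cos(θ) and y = r sin(θ),
$$J(r,\theta) = \det \begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix} = r$$
and f_{X,Y}(x,y) = (1/2π) e^{−(x²+y²)/2} = (1/2π) e^{−r²/2}, which gives the density above. Note that it factors, so R and Θ are independent: R has density r e^{−r²/2} and Θ is uniform on [0, 2π].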
End of Nov 2 lecture
Example: Let X and Y be independent random variables. They both have
an exponential distribution with λ = 1. Let
$$U = X + Y, \qquad W = \frac{X}{X+Y}$$
Find the joint density of U and W .
Let T(x,y) = (x + y, x/(x+y)). Then T is a bijection from [0,∞) × [0,∞) onto [0,∞) × [0,1]. We need to find its inverse, i.e., find x, y in terms of u, w. Multiply the two equations to get x = uw. Then y = u − x = u − uw. So T⁻¹(u,w) = (uw, u − uw). And so
$$J(u,w) = \det \begin{pmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial w} \\[6pt] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial w} \end{pmatrix} = \det \begin{pmatrix} w & u \\ 1-w & -u \end{pmatrix} = -u$$
so |J(u,w)| = u. Since f_{X,Y}(x,y) = e^{−x} e^{−y} = e^{−(x+y)} for x, y ≥ 0, we have f(T⁻¹(u,w)) = e^{−u}. So
$$f_{U,W}(u,w) = \begin{cases} u\, e^{-u}, & \text{if } u \ge 0,\ 0 \le w \le 1 \\ 0, & \text{otherwise} \end{cases}$$
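Observe that this density factors as (u e^{−u}) · 1 on the product region [0,∞) × [0,1], so U and W are independent; U = X + Y has the gamma density u e^{−u} (parameters 2 and 1), and W = X/(X+Y) is uniform on [0,1].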
Example (bivariate normal): Fix −1 < ρ < 1 and let X and Y have joint density
$$f(x,y) = \frac{1}{2\pi\sqrt{1-\rho^2}}\, \exp\!\left(-\frac{1}{2(1-\rho^2)}\big(x^2 - 2\rho x y + y^2\big)\right)$$
You can compute the marginals of this joint distribution by the usual trick of
completing the square. You find that X and Y both have a standard normal
distribution. Note that the stuff in the exponential is a quadratic form in x
and y. A more general quadratic form would have three parameters:
$$\exp\!\left(-(Ax^2 + 2Bxy + Cy^2)\right)$$
In order for the integral to converge, the quadratic form Ax² + 2Bxy + Cy² must be positive definite.
Now suppose we start with two independent random variables X and Y and define
$$U = aX + bY, \qquad W = cX + dY$$
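In general, assuming ad − bc ≠ 0 so that the map is invertible, the change of variables formula gives
$$f_{U,W}(u,w) = \frac{1}{|ad - bc|}\, f_{X,Y}\!\left(\frac{du - bw}{ad - bc},\ \frac{aw - cu}{ad - bc}\right)$$
since the inverse map is x = (du − bw)/(ad − bc), y = (aw − cu)/(ad − bc) and its Jacobian is 1/(ad − bc).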
Correlation coefficient
If X and Y are independent, then E[XY] − E[X]E[Y] = 0. If they are not independent, it need not be zero, and it is in some sense a measure of how dependent they are.
Definition 5. The covariance of X and Y is
$$\mathrm{cov}(X,Y) = E[XY] - E[X]\,E[Y]$$
The correlation coefficient is
$$\rho(X,Y) = \frac{\mathrm{cov}(X,Y)}{\sqrt{\mathrm{var}(X)}\,\sqrt{\mathrm{var}(Y)}}$$
Example: Suppose X and Y have the bivariate normal density above,
$$f(x,y) = \frac{1}{2\pi\sqrt{1-\rho^2}}\, \exp\!\left(-\frac{1}{2(1-\rho^2)}\big(x^2 - 2\rho x y + y^2\big)\right)$$
We compute their correlation. Since X and Y are standard normal, E[X] = E[Y] = 0 and var(X) = var(Y) = 1, so cov(X,Y) = E[XY]. Splitting the exponent and completing the square in y,
$$E[XY] = \frac{1}{2\pi\sqrt{1-\rho^2}} \int x \exp\!\left(-\frac{x^2}{2(1-\rho^2)}\right) \int y \exp\!\left(-\frac{y^2 - 2\rho x y}{2(1-\rho^2)}\right) dy\, dx$$
$$= \frac{1}{2\pi\sqrt{1-\rho^2}} \int x \exp\!\left(-\frac{x^2}{2(1-\rho^2)}\right) \int y \exp\!\left(-\frac{(y - \rho x)^2 - \rho^2 x^2}{2(1-\rho^2)}\right) dy\, dx$$
$$= \frac{1}{2\pi\sqrt{1-\rho^2}} \int x \exp\!\left(-\frac{1}{2}x^2\right) \int (y + \rho x) \exp\!\left(-\frac{y^2}{2(1-\rho^2)}\right) dy\, dx$$
$$= \frac{\rho}{2\pi\sqrt{1-\rho^2}} \int x^2 \exp\!\left(-\frac{1}{2}x^2\right) dx \int \exp\!\left(-\frac{y^2}{2(1-\rho^2)}\right) dy$$
$$= \frac{\rho}{2\pi\sqrt{1-\rho^2}}\, \sqrt{2\pi}\, \sqrt{2\pi(1-\rho^2)} = \rho$$
In the third line we substituted y → y + ρx; in the fourth line the term whose integrand is y exp(−y²/(2(1−ρ²))) vanishes because the integrand is odd. So cov(X,Y) = ρ, and since the variances are 1, the correlation coefficient is ρ(X,Y) = ρ.
6.8 Conditional density and conditional expectation
Recall that for discrete random variables and events, conditioning was defined by
$$P(A|B) = \frac{P(A \cap B)}{P(B)}, \qquad P(X = x \mid B) = \frac{P(X = x, B)}{P(B)}$$
Most of our applications were of the following form. Let Y be another discrete RV. Define B_n = {Y = n}, where n ranges over the range of Y. Then
$$E[X] = \sum_{n} E[X|Y=n]\, P(Y=n)$$
Now suppose X and Y are jointly continuous. We cannot condition on the event {Y = y} since it has probability zero. Instead we condition on {y ≤ Y ≤ y + ε} and let ε → 0:
$$P(a \le X \le b \mid y \le Y \le y + \epsilon) = \frac{\int_a^b \int_y^{y+\epsilon} f(u,w)\, dw\, du}{\int_{-\infty}^{\infty} \int_y^{y+\epsilon} f(u,w)\, dw\, du}$$
For small ε the numerator is approximately ε ∫_a^b f(u,y) du and the denominator is approximately ε f_Y(y), so as ε → 0 the ratio converges to
$$\int_a^b \frac{f_{X,Y}(u,y)}{f_Y(y)}\, du$$
This motivates the following definition.
Definition 6. Let X, Y be jointly continuous RVs with pdf fX,Y (x, y). The
conditional density of X given Y = y is
$$f_{X|Y}(x|y) = \frac{f_{X,Y}(x,y)}{f_Y(y)}, \qquad \text{if } f_Y(y) > 0$$
We then define P(a ≤ X ≤ b | Y = y) = ∫_a^b f_{X|Y}(x|y) dx. Note that these are definitions; we could instead have defined f_{X|Y} and P(a ≤ X ≤ b | Y = y) as limits, as above, and then proved these formulas as theorems.
What happens if X and Y are independent? Then f (x, y) = fX (x)fY (y).
So fX|Y (x|y) = fX (x) as we would expect.
Example: (X, Y) is uniformly distributed on the triangle with vertices (0, 0), (0, 1)
and (1, 0). Find the conditional density of X given Y .
The joint density is 2 on the triangle.
$$f_Y(y) = \int_0^{1-y} 2\, dx = 2(1-y), \qquad 0 \le y \le 1$$
And we have
$$f_{X|Y}(x|y) = \frac{2}{2(1-y)} = \frac{1}{1-y}, \qquad 0 \le x \le 1-y$$
Definition 7. The conditional expectation of X given Y = y is
$$E[X|Y=y] = \int x\, f_{X|Y}(x|y)\, dx$$
For the example above, X given Y = y is uniform on [0, 1−y], so
$$E[X|Y=y] = \int_0^{1-y} \frac{x}{1-y}\, dx = \frac{1}{2}(1-y)$$
Example: Let X and Y be independent exponential random variables with parameter λ, and let Z = X + Y. Find the conditional densities f_{Z|X}(z|x) and f_{X|Z}(x|z).
First we find the joint density of X and Z. Let T(x,y) = (x, x+y), so u = x, w = x + y and T⁻¹(u,w) = (u, w − u). The Jacobian is
$$J(u,w) = \det \begin{pmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial w} \\[6pt] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial w} \end{pmatrix} = \det \begin{pmatrix} 1 & 0 \\ -1 & 1 \end{pmatrix} = 1$$
We have f_{X,Y}(x,y) = λ² exp(−λ(x+y)) for x, y ≥ 0. So
$$f_{X,Z}(x,z) = \begin{cases} \lambda^2 e^{-\lambda z}, & \text{if } 0 \le x \le z \\ 0, & \text{otherwise} \end{cases}$$
It is convenient to write the condition on x, z as 1(0 ≤ x ≤ z). This notation means the function is 1 if 0 ≤ x ≤ z is satisfied and 0 if it is not. So f_{X,Z}(x,z) = λ² e^{−λz} 1(0 ≤ x ≤ z). So we have, for x ≥ 0,
$$f_{Z|X}(z|x) = \frac{f_{X,Z}(x,z)}{f_X(x)} = \frac{\lambda^2 e^{-\lambda z}\, 1(0 \le x \le z)}{\lambda e^{-\lambda x}} = \lambda\, e^{-\lambda(z-x)}\, 1(0 \le x \le z)$$
For the other one, we first find the marginal for Z:
$$f_Z(z) = \int_{-\infty}^{\infty} f_{X,Z}(x,z)\, dx = \int_{-\infty}^{\infty} \lambda^2 e^{-\lambda z}\, 1(0 \le x \le z)\, dx = \int_0^z \lambda^2 e^{-\lambda z}\, dx = \lambda^2 z\, e^{-\lambda z}$$
So we have
$$f_{X|Z}(x|z) = \frac{f_{X,Z}(x,z)}{f_Z(z)} = \frac{\lambda^2 e^{-\lambda z}\, 1(0 \le x \le z)}{\lambda^2 z\, e^{-\lambda z}} = \frac{1}{z}\, 1(0 \le x \le z)$$
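In words: given Z = z, X is uniformly distributed on [0, z], so for example E[X|Z = z] = z/2.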
y fY |X (y|x) dy
y
fY,X (y, x)
dy
fX (x)
So
Z
E[Y |X = x] fX (x) dx =
Z Z
Recall that the partition theorem was useful when it was hard to compute
the expected value of Y , but easy to compute the expected value of Y given
that some other random variable is known.
Example: Quality of lightbulbs varies because ... For fixed factory conditions, the lifetime of the lightbulb has an exponential distribution. We model this by assuming the parameter λ is uniformly distributed between 5 × 10⁻⁴ and 8 × 10⁻⁴. Find the mean lifetime of a lightbulb and the pdf for its lifetime. Is it exponential?
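A sketch of the computation, assuming the parameter range 5 × 10⁻⁴ to 8 × 10⁻⁴ as read above, with a = 5 × 10⁻⁴ and b = 8 × 10⁻⁴: writing T for the lifetime, the partition theorem gives
$$E[T] = \frac{1}{b-a} \int_a^b E[T \mid \lambda]\, d\lambda = \frac{1}{b-a} \int_a^b \frac{1}{\lambda}\, d\lambda = \frac{\ln(b/a)}{b-a} = \frac{\ln(8/5)}{3 \times 10^{-4}} \approx 1567$$
and the pdf is the mixture
$$f_T(t) = \frac{1}{b-a} \int_a^b \lambda e^{-\lambda t}\, d\lambda$$
which is not of the form μe^{−μt}, so the lifetime is not exponential.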
Example: Let X, Y be independent standard normal RVs. Let Z = X + Y .
Find fZ|X , fX|Z , E[Z|X = x] and E[X|Z = z].
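A sketch of the answers: given X = x, Z = x + Y is normal with mean x and variance 1, so
$$f_{Z|X}(z|x) = \frac{1}{\sqrt{2\pi}}\, e^{-(z-x)^2/2}, \qquad E[Z|X=x] = x$$
For the other direction, Z is normal with mean 0 and variance 2 (computed earlier via the convolution), and
$$f_{X|Z}(x|z) = \frac{f_X(x)\, f_Y(z-x)}{f_Z(z)} = \frac{1}{\sqrt{\pi}}\, e^{-(x - z/2)^2}, \qquad E[X|Z=z] = \frac{z}{2}$$
so given Z = z, X is normal with mean z/2 and variance 1/2.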