Chap 3: Two Random Variables
  and
  $$P(y_1 < Y(\xi) \le y_2) = F_Y(y_2) - F_Y(y_1) = \int_{y_1}^{y_2} f_Y(y)\,dy$$
  What about the probability that the pair of RVs (X, Y ) belongs to an arbitrary region D? In
  other words, how does one estimate, for example
Properties
1.
        Similarly, since (X(ξ) ≤ +∞, Y (ξ) ≤ +∞) = Ω, we get FXY (+∞, +∞) = P (Ω) = 1.
   2.
and the mutually exclusive property of the events on the right side gives
  $$P(x_1 < X(\xi) \le x_2,\ y_1 < Y(\xi) \le y_2) = F_{XY}(x_2, y_2) - F_{XY}(x_2, y_1) - F_{XY}(x_1, y_2) + F_{XY}(x_1, y_1) \tag{5}$$
        This is the probability that (X, Y ) belongs to the rectangle in Fig. 3. To prove (5), we
        can make use of the following identity involving mutually exclusive events on the right
        side.
This gives
      and the desired result in (5) follows by making use of (3) with y = y2 and y1
      respectively.
                                              
  $$P((X, Y) \in D) = \iint_{(x,y) \in D} f_{XY}(x, y)\,dx\,dy \tag{9}$$
Marginal Statistics
  In the context of several RVs, the statistics of each individual RV are called marginal
  statistics. Thus FX (x) is the marginal probability distribution function of X, and fX (x) is
  the marginal pdf of X. It is interesting to note that all marginals can be obtained from the joint
  pdf. In fact
  Also
  $$f_X(x) = \int_{-\infty}^{\infty} f_{XY}(x, y)\,dy, \qquad f_Y(y) = \int_{-\infty}^{\infty} f_{XY}(x, y)\,dx. \tag{11}$$
(X ≤ x) = (X ≤ x) ∩ (Y ≤ +∞)
so that
  To prove (11), we can make use of (7) and (10), which gives
  $$F_X(x) = F_{XY}(x, +\infty) = \int_{-\infty}^{x} \int_{-\infty}^{+\infty} f_{XY}(u, y)\,dy\,du$$
  If X and Y are discrete RVs, then pij = P (X = xi , Y = yj ) represents their joint pmf, and
  their respective marginal pmfs are given by
                                                             
  $$P(X = x_i) = \sum_j P(X = x_i, Y = y_j) = \sum_j p_{ij} \tag{14}$$
  and
  $$P(Y = y_j) = \sum_i P(X = x_i, Y = y_j) = \sum_i p_{ij} \tag{15}$$
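  The row and column sums in (14) and (15) can be sketched in a few lines of Python. The joint pmf table below is hypothetical (its values are chosen only for illustration), and the function name marginal_pmfs is likewise an assumption, not something from these notes.

```python
from fractions import Fraction

# Hypothetical joint pmf p[i][j] = P(X = x_i, Y = y_j); illustrative values only.
joint_pmf = [
    [Fraction(1, 8), Fraction(1, 8), Fraction(0)],
    [Fraction(1, 4), Fraction(1, 8), Fraction(3, 8)],
]

def marginal_pmfs(p):
    """Return ([P(X = x_i)], [P(Y = y_j)]) by summing rows and columns."""
    p_x = [sum(row) for row in p]          # sum over j, as in (14)
    p_y = [sum(col) for col in zip(*p)]    # sum over i, as in (15)
    return p_x, p_y

p_x, p_y = marginal_pmfs(joint_pmf)
print(p_x)
print(p_y)
```

  Exact fractions are used so that each marginal visibly sums to one without rounding error.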
  It used to be common practice for insurance companies to write these sums in the left and
  top margins, which suggests the name marginal densities (Fig. 2).
Examples
  From (11) and (12), the joint CDF and/or the joint pdf represent complete information about
  the RVs, and their marginal pdfs can be evaluated from the joint pdf. However, given only the
  marginals, it is in general not possible to recover the joint pdf.
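  Example 1: Given
  $$f_{XY}(x, y) = \begin{cases} c \ (\text{constant}) & 0 < x < y < 1 \\ 0 & \text{o.w.} \end{cases} \tag{16}$$
  Since the joint pdf must integrate to one,
  $$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{XY}(x, y)\,dx\,dy = \int_{y=0}^{1} \int_{x=0}^{y} c\,dx\,dy = \int_{y=0}^{1} c\,y\,dy = \left. \frac{c\,y^2}{2} \right|_0^1 = \frac{c}{2} = 1$$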
  Thus c = 2. Moreover
  $$f_X(x) = \int_{-\infty}^{+\infty} f_{XY}(x, y)\,dy = \int_{y=x}^{1} 2\,dy = 2(1 - x), \quad 0 < x < 1$$
  and similarly,
  $$f_Y(y) = \int_{-\infty}^{+\infty} f_{XY}(x, y)\,dx = \int_{x=0}^{y} 2\,dx = 2y, \quad 0 < y < 1$$
  Clearly, in this case given fX (x) and fY (y) as above, it will not be possible to obtain the
  original joint pdf in (16).
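  A minimal numerical sketch of this point, assuming a simple midpoint rule for the integrals (the helper integrate and the test point are illustrative choices, not part of the notes): the computed marginals match 2(1 − x) and 2y, yet their product differs from the joint pdf in (16), so multiplying the marginals does not give back the joint pdf.

```python
def f_xy(x, y):
    """Joint pdf of Example 1 with c = 2."""
    return 2.0 if 0.0 < x < y < 1.0 else 0.0

def integrate(g, lo, hi, n=20000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

def f_x(x):
    """Marginal pdf of X, obtained by integrating out y."""
    return integrate(lambda y: f_xy(x, y), 0.0, 1.0)

def f_y(y):
    """Marginal pdf of Y, obtained by integrating out x."""
    return integrate(lambda x: f_xy(x, y), 0.0, 1.0)

x0, y0 = 0.7, 0.3  # a point outside the support 0 < x < y < 1
print(f_x(x0), 2 * (1 - x0))             # numerical vs. closed-form marginal
print(f_y(y0), 2 * y0)
print(f_xy(x0, y0), f_x(x0) * f_y(y0))   # joint is 0 here, product is not
```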
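  Example 2: X and Y are said to be jointly normal (Gaussian) distributed if their joint pdf
  has the following form:
  $$f_{XY}(x, y) = \frac{1}{2\pi\sigma_X\sigma_Y\sqrt{1 - \rho^2}} \exp\left\{ -\frac{1}{2(1 - \rho^2)} \left[ \frac{(x - \mu_X)^2}{\sigma_X^2} - \frac{2\rho(x - \mu_X)(y - \mu_Y)}{\sigma_X\sigma_Y} + \frac{(y - \mu_Y)^2}{\sigma_Y^2} \right] \right\} \tag{17}$$
  $$-\infty < x < \infty, \quad -\infty < y < \infty, \quad |\rho| < 1$$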
  and similarly
  $$f_Y(y) = \int_{-\infty}^{\infty} f_{XY}(x, y)\,dx = \frac{1}{\sqrt{2\pi\sigma_Y^2}} \exp\left[ -\frac{(y - \mu_Y)^2}{2\sigma_Y^2} \right] \sim N(\mu_Y, \sigma_Y^2)$$
  Following the above notation, we will denote (17) as N(μX, μY, σX², σY², ρ). Once again,
  knowing the marginals alone does not tell us everything about the joint pdf in (17).
  As we show below, the only situation where the marginal pdfs can be used to recover the
  joint pdf is when the random variables are statistically independent.
Independence of RVs
  Equations (18)-(20) give us the procedure to test for independence. Given fXY (x, y), obtain
  the marginal pdfs fX (x) and fY (y) and examine whether one of the equations in (18) or (20)
  is valid. If so, the RVs are independent, otherwise they are dependent.
   • Returning back to Example 1, we observe by direct verification that
      fXY (x, y) ≠ fX (x) · fY (y). Hence X and Y are dependent RVs in that case.
   • It is easy to see that such is the case in Example 2 also, unless ρ = 0. In other words,
     two jointly Gaussian RVs as in (17) are independent if and only if the fifth parameter
     ρ = 0.
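  The Gaussian case can be probed numerically. The sketch below evaluates the joint pdf (17) and the product of the two normal marginals at a single test point; the parameter values and function names are illustrative assumptions. A nonzero gap at one point is enough to rule out factorization, whereas a zero gap at one point is only consistent with independence, not a proof of it.

```python
import math

def normal_pdf(x, mu, sigma):
    """One-dimensional N(mu, sigma^2) density."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

def joint_gaussian_pdf(x, y, mu_x, mu_y, sigma_x, sigma_y, rho):
    """Joint pdf (17), valid for |rho| < 1."""
    zx = (x - mu_x) / sigma_x
    zy = (y - mu_y) / sigma_y
    q = (zx * zx - 2 * rho * zx * zy + zy * zy) / (2 * (1 - rho ** 2))
    return math.exp(-q) / (2 * math.pi * sigma_x * sigma_y * math.sqrt(1 - rho ** 2))

def factorization_gap(rho, x=0.5, y=-1.0):
    """|f_XY - f_X f_Y| at one test point, for arbitrary illustrative parameters."""
    mu_x, mu_y, sigma_x, sigma_y = 1.0, -2.0, 1.5, 0.5
    joint = joint_gaussian_pdf(x, y, mu_x, mu_y, sigma_x, sigma_y, rho)
    product = normal_pdf(x, mu_x, sigma_x) * normal_pdf(y, mu_y, sigma_y)
    return abs(joint - product)

print(factorization_gap(0.0))  # essentially zero: the joint pdf factors
print(factorization_gap(0.6))  # clearly nonzero: the joint pdf does not factor
```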
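  If X and Y are random variables and g(·) is a function of two variables, then
  $$E[g(X, Y)] = \sum_y \sum_x g(x, y)\,p(x, y) \quad \text{(discrete case)}$$
  $$E[g(X, Y)] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(x, y)\,f(x, y)\,dx\,dy \quad \text{(continuous case)}$$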
  Example 3: At a party N men throw their hats into the center of a room. The hats are
  mixed up and each man randomly selects one. Find the expected number of men who select
  their own hats.
  Solution: Let X denote the number of men who select their own hats. We can compute E[X]
  by noting that
  $$X = X_1 + X_2 + \cdots + X_N$$
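  where Xi = 1 if the ith man selects his own hat and Xi = 0 otherwise. Since the ith man is
  equally likely to select any of the N hats, P [Xi = 1] = 1/N, so
  $$E[X_i] = 1 \cdot P[X_i = 1] + 0 \cdot P[X_i = 0] = 1/N$$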
  Therefore, E[X] = E[X1 ] + E[X2 ] + · · · + E[XN ] = 1. No matter how many people are at
  the party, on the average, exactly one of the men will select his own hat.
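  The result E[X] = 1 can be checked by brute force for small N. The sketch below enumerates all N! equally likely hat assignments and averages the number of matches; the function name expected_matches is hypothetical, and the enumeration is only practical for small N.

```python
from fractions import Fraction
from itertools import permutations

def expected_matches(n):
    """Exact E[X] for n hats, averaging matches over all n! assignments."""
    total = 0
    count = 0
    for perm in permutations(range(n)):
        # Man i gets hat perm[i]; a match occurs when perm[i] == i.
        total += sum(1 for i, hat in enumerate(perm) if i == hat)
        count += 1
    return Fraction(total, count)

for n in range(1, 7):
    print(n, expected_matches(n))
```

  Note that the Xi are not independent, yet linearity of expectation still applies, which is why the sum of the E[Xi] gives the answer directly.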
If X and Y are independent, then for any functions h(·) and g(·)
And
  Example 4: Random variables X1 and X2 are independent and identically distributed with
  probability density function
  $$f_X(x) = \begin{cases} 1 - x/2 & 0 \le x \le 2 \\ 0 & \text{o.w.} \end{cases}$$
Find
      (b) Let FX (x) denote the CDF of both X1 and X2 . The CDF of Z = max(X1 , X2 ) is
      found by observing that Z ≤ z iff X1 ≤ z and X2 ≤ z. That is, by independence,
      $$F_Z(z) = P(X_1 \le z, X_2 \le z) = P(X_1 \le z)\,P(X_2 \le z) = [F_X(z)]^2$$
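  Because X1 and X2 are independent, the event {X1 ≤ z and X2 ≤ z} has probability FX (z) squared. The sketch below checks this against a simulation. It assumes the CDF FX (x) = x − x²/4 on [0, 2], obtained by integrating the pdf of Example 4, and uses inverse-CDF sampling; the function names, seed, and sample size are illustrative.

```python
import math
import random

def cdf_x(x):
    """CDF of f_X(x) = 1 - x/2 on [0, 2]."""
    if x < 0:
        return 0.0
    if x > 2:
        return 1.0
    return x - x * x / 4.0

def cdf_max(z):
    """CDF of Z = max(X1, X2) for i.i.d. X1, X2: F_X(z) squared."""
    return cdf_x(z) ** 2

def sample_x(rng):
    """Inverse-CDF sampling: solves u = x - x^2/4 for x in [0, 2]."""
    return 2.0 - 2.0 * math.sqrt(1.0 - rng.random())

def empirical_cdf_max(z, n=50000, seed=0):
    """Fraction of simulated pairs whose maximum is at most z."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n) if max(sample_x(rng), sample_x(rng)) <= z)
    return hits / n

print(cdf_max(1.0), empirical_cdf_max(1.0))
```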
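  Example 5: Given
  $$f_{XY}(x, y) = \begin{cases} x y^2 e^{-y} & 0 < y < \infty,\ 0 < x < 1 \\ 0 & \text{o.w.} \end{cases}$$
  the marginal pdf of X is
  $$f_X(x) = \int_0^{\infty} f_{XY}(x, y)\,dy = x \left( \left. -y^2 e^{-y} \right|_0^{\infty} + 2 \int_0^{\infty} y e^{-y}\,dy \right) = 2x, \quad 0 < x < 1$$
  Similarly
  $$f_Y(y) = \int_0^1 f_{XY}(x, y)\,dx = \frac{y^2}{2} e^{-y}, \quad 0 < y < \infty$$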
  In this case
                                        fXY (x, y) = fX (x) · fY (y)
  and hence X and Y are independent random variables.
  Correlation: Given any two RVs X and Y , define
  Covariance: Given any two RVs X and Y , define
  Correlation coefficient between X and Y :
  $$\rho_{XY} = \frac{Cov(X, Y)}{\sqrt{Var(X)\,Var(Y)}} = \frac{Cov(X, Y)}{\sigma_X \sigma_Y}, \qquad -1 \le \rho_{XY} \le 1 \tag{25}$$
  $$Cov(X, Y) = \rho_{XY}\,\sigma_X \sigma_Y \tag{26}$$
  Uncorrelated RVs: If ρXY = 0, then X and Y are said to be uncorrelated RVs. If X and Y
  are uncorrelated, then
  $$E(XY) = E(X)\,E(Y) \tag{27}$$
  Orthogonality: X and Y are said to be orthogonal if
  $$E(XY) = 0 \tag{28}$$
  From above, if either X or Y has zero mean, then orthogonality implies uncorrelatedness and
  vice-versa.
  Suppose X and Y are independent RVs,
  therefore from (27), we conclude that the random variables are uncorrelated. Thus
  independence implies uncorrelatedness (ρXY = 0). But the converse is generally not true.
  Example 6: Let Z = aX + bY . Determine the variance of Z in terms of σX , σY and ρXY .
  Solution:
  $$\mu_Z = E(Z) = E(aX + bY) = a\mu_X + b\mu_Y$$
  and
  $$\begin{aligned} \sigma_Z^2 = Var(Z) &= E[(Z - \mu_Z)^2] = E\{[a(X - \mu_X) + b(Y - \mu_Y)]^2\} \\ &= a^2 E[(X - \mu_X)^2] + 2ab\,E[(X - \mu_X)(Y - \mu_Y)] + b^2 E[(Y - \mu_Y)^2] \\ &= a^2 \sigma_X^2 + 2ab\,\rho_{XY}\sigma_X\sigma_Y + b^2 \sigma_Y^2 \end{aligned}$$
  In particular, if X and Y are independent, then ρXY = 0, and the above equation reduces to
  $$\sigma_Z^2 = a^2 \sigma_X^2 + b^2 \sigma_Y^2$$
Thus the variance of the sum of independent RVs is the sum of their variances (a = b = 1).
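  The variance formula of Example 6 can be checked exactly on a small discrete example. The joint pmf below is hypothetical (values chosen only for illustration), as are the helper names; the sketch computes Var(aX + bY) directly from the definition and again through a² Var(X) + 2ab Cov(X, Y) + b² Var(Y), using Cov(X, Y) = ρXY σX σY from (26).

```python
from fractions import Fraction

# Hypothetical joint pmf {(x, y): probability}; illustrative values only.
pmf = {
    (0, 0): Fraction(1, 4), (0, 1): Fraction(1, 4),
    (1, 0): Fraction(1, 8), (1, 1): Fraction(3, 8),
}

def expect(g):
    """E[g(X, Y)] under the joint pmf above."""
    return sum(p * g(x, y) for (x, y), p in pmf.items())

def variance_of_combination(a, b):
    """Return Var(aX + bY) computed directly and via the expanded formula."""
    mx, my = expect(lambda x, y: x), expect(lambda x, y: y)
    var_x = expect(lambda x, y: (x - mx) ** 2)
    var_y = expect(lambda x, y: (y - my) ** 2)
    cov = expect(lambda x, y: (x - mx) * (y - my))  # rho_XY * sigma_X * sigma_Y
    mz = a * mx + b * my
    direct = expect(lambda x, y: (a * x + b * y - mz) ** 2)
    formula = a * a * var_x + 2 * a * b * cov + b * b * var_y
    return direct, formula

print(variance_of_combination(2, -3))
```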
  Moments:
  $$E[X^k Y^m] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x^k y^m f_{XY}(x, y)\,dx\,dy \tag{30}$$
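  From this and the two-dimensional inversion formula for Fourier transforms, it follows that
  $$f_{XY}(x, y) = \frac{1}{4\pi^2} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \Phi_{XY}(\omega_1, \omega_2)\,e^{-j(\omega_1 x + \omega_2 y)}\,d\omega_1\,d\omega_2 \tag{32}$$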
  Note that
                                  |ΦXY (ω1 , ω2 )| ≤ ΦXY (0, 0) = 1
  Also
                         ΦX (ω) = ΦXY (ω, 0)        ΦY (ω) = ΦXY (0, ω)                   (34)
  Independence: If the RVs X and Y are independent, then
  $$E\left[e^{j(\omega_1 X + \omega_2 Y)}\right] = E[e^{j\omega_1 X}] \cdot E[e^{j\omega_2 Y}]$$
  Hence,
                                     ΦZ (ω) = ΦX (ω) · ΦY (ω)
  From above, the characteristic function of the RV Z is equal to the product of the
  characteristic function of X and the characteristic function of Y .
  It is known that the density of Z equals the convolution of fX (x) and fY (y). From above,
  the characteristic function of the convolution of two densities equals the product of their
  characteristic functions.
  Example 7: X and Y are independent Poisson RVs with parameters λ1 and λ2
  respectively. Let
  $$Z = X + Y$$
  Then
  $$\Phi_Z(\omega) = \Phi_X(\omega)\,\Phi_Y(\omega)$$
  From earlier results
  $$\Phi_X(\omega) = e^{\lambda_1 (e^{j\omega} - 1)}, \qquad \Phi_Y(\omega) = e^{\lambda_2 (e^{j\omega} - 1)}$$
  so that
  $$\Phi_Z(\omega) = e^{(\lambda_1 + \lambda_2)(e^{j\omega} - 1)} \sim P(\lambda_1 + \lambda_2)$$
  i.e., the sum of independent Poisson RVs is also a Poisson random variable.
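  The same conclusion can be cross-checked in the pmf domain, since the pmf of a sum of independent RVs is the discrete convolution of their pmfs. The sketch below convolves two Poisson pmfs and compares the result with the Poisson pmf of parameter λ1 + λ2; the parameter values and function names are illustrative assumptions.

```python
import math

def poisson_pmf(k, lam):
    """P(N = k) for a Poisson RV with parameter lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def pmf_of_sum(k, lam1, lam2):
    """P(X + Y = k) for independent Poisson X, Y, by discrete convolution."""
    return sum(poisson_pmf(i, lam1) * poisson_pmf(k - i, lam2) for i in range(k + 1))

lam1, lam2 = 1.5, 2.5  # illustrative parameters
for k in range(6):
    print(k, pmf_of_sum(k, lam1, lam2), poisson_pmf(k, lam1 + lam2))
```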
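  Recall from (17) that X and Y are said to be jointly Gaussian if their joint pdf has the form
  $$f_{XY}(x, y) = \frac{1}{2\pi\sigma_X\sigma_Y\sqrt{1 - \rho^2}} \exp\left\{ -\frac{1}{2(1 - \rho^2)} \left[ \frac{(x - \mu_X)^2}{\sigma_X^2} - \frac{2\rho(x - \mu_X)(y - \mu_Y)}{\sigma_X\sigma_Y} + \frac{(y - \mu_Y)^2}{\sigma_Y^2} \right] \right\}$$
  $$-\infty < x < \infty, \quad -\infty < y < \infty, \quad |\rho| < 1$$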
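  By direct substitution and simplification, we obtain the joint characteristic function of two
  jointly Gaussian RVs to be
  $$\Phi_{XY}(\omega_1, \omega_2) = E\left[e^{j(\omega_1 X + \omega_2 Y)}\right] = e^{\,j(\mu_X \omega_1 + \mu_Y \omega_2) - \frac{1}{2}(\sigma_X^2 \omega_1^2 + 2\rho\sigma_X\sigma_Y \omega_1\omega_2 + \sigma_Y^2 \omega_2^2)} \tag{35}$$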
  From (17) by direct computation, it is easy to show that for two jointly Gaussian random
  variables
                                     Cov(X, Y ) = ρσX σY
  Hence, from the definition of the correlation coefficient, ρ in N(μX, μY, σX², σY², ρ)
  represents the actual correlation coefficient of the two jointly Gaussian RVs in (17).
  Notice that ρ = 0 implies
  $$f_{XY}(x, y) = f_X(x)\,f_Y(y)$$
  Thus if X and Y are jointly Gaussian, uncorrelatedness does imply independence between
  the two random variables. The Gaussian case is an exception where the two concepts imply
  each other.
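  Example 8: Let X and Y be jointly Gaussian RVs with parameters
  N(μX, μY, σX², σY², ρ). Define Z = aX + bY and determine fZ (z).
  Solution: In this case we can make use of the characteristic function to solve the problem:
  $$\Phi_Z(\omega) = E\left[e^{jZ\omega}\right] = E\left[e^{j(aX + bY)\omega}\right] = E\left[e^{jXa\omega + jYb\omega}\right] = \Phi_{XY}(a\omega, b\omega) \tag{37}$$
  Substituting ω1 = aω and ω2 = bω in (35) gives
  $$\Phi_Z(\omega) = e^{\,j\mu_Z \omega - \frac{1}{2}\sigma_Z^2 \omega^2}$$
  where
  $$\mu_Z = a\mu_X + b\mu_Y, \qquad \sigma_Z^2 = a^2\sigma_X^2 + 2\rho ab\,\sigma_X\sigma_Y + b^2\sigma_Y^2$$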
  Notice that (38) has the same form as (36), and hence we conclude that Z = aX + bY is also
  Gaussian with mean and variance as above, which also agrees with Example 6.
  From Example 8, we conclude that any linear combination of jointly Gaussian RVs
  generates a new Gaussian RV. In other words, linearity preserves Gaussianity.
  Gaussian random variables are also interesting because of the following result.
Y → N (0, 1) (40)
  The central limit theorem states that the sum of a large number of independent random
  variables, each with finite variance, tends to behave like a normal random variable. Thus the
  individual pdfs become unimportant when analyzing the behavior of the collective sum. If we
  model a noise phenomenon as the sum of a large number of independent random variables
  (e.g., electron motion in resistor components), then this theorem allows us to conclude that
  the noise behaves like a Gaussian RV.
  This theorem holds for any distribution of the Xi's with finite variance; herein lies its power.
  The Normal Approximation: Suppose n → ∞ with p held fixed. Then for k in the
  neighborhood of np, we can approximate
  $$\binom{n}{k} p^k q^{n-k} \approx \frac{1}{\sqrt{2\pi npq}}\,e^{-(k - np)^2 / (2npq)}$$
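  A quick numerical comparison shows how close the two sides are for moderately large n. The values of n, p, and k below are illustrative assumptions, and q = 1 − p as usual.

```python
import math

def binomial_pmf(n, k, p):
    """Exact binomial probability C(n, k) p^k q^(n-k)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def normal_approximation(n, k, p):
    """Gaussian density with mean np and variance npq, evaluated at k."""
    q = 1 - p
    return math.exp(-((k - n * p) ** 2) / (2 * n * p * q)) / math.sqrt(2 * math.pi * n * p * q)

n, p = 1000, 0.3  # illustrative values
for k in (280, 300, 320):
    print(k, binomial_pmf(n, k, p), normal_approximation(n, k, p))
```

  The agreement degrades as k moves far from np, which is why the approximation is stated only for k in the neighborhood of np.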
  Example 9: The lifetime of a special type of battery is a random variable with mean 40
  hours and standard deviation 20 hours. A battery is used until it fails, at which point it is
  replaced by a new one. Assuming a stockpile of 25 such batteries, the lifetimes of which are
  independent, approximate the probability that over 1100 hours of use can be obtained.
  Solution: Let Xi denote the lifetime of the ith battery to be put in use, and let
  Y = X1 + X2 + · · · + X25 . Then we want to find P (Y > 1100).
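  One way to carry out the approximation, assuming the central limit theorem is applied to Y directly: Y has mean 25 · 40 = 1000 hours and standard deviation 20 · √25 = 100 hours, so P (Y > 1100) is approximately 1 − Φ(1), where Φ is the standard normal CDF. The helper names below are hypothetical.

```python
import math

def standard_normal_cdf(z):
    """Phi(z), written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def clt_tail_probability(n, mean, std, threshold):
    """Approximate P(X1 + ... + Xn > threshold) for i.i.d. Xi via the CLT."""
    sum_mean = n * mean
    sum_std = std * math.sqrt(n)
    return 1.0 - standard_normal_cdf((threshold - sum_mean) / sum_std)

print(clt_tail_probability(25, 40.0, 20.0, 1100.0))  # about 0.1587
```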
  For any two events A and B, we have defined the conditional probability of A given B as
  $$P(A|B) = \frac{P(A \cap B)}{P(B)}, \qquad P(B) \ne 0 \tag{41}$$
  Noting that the probability distribution function FX (x) is given by FX (x) = P {X(ξ) ≤ x},
  we may define the conditional distribution of the RV X given the event B as
  $$F_X(x|B) = P\{X(\xi) \le x \mid B\} = \frac{P\{(X(\xi) \le x) \cap B\}}{P(B)} \tag{42}$$
  In general, event B describes some property of X. Thus the definition of the conditional
  distribution depends on conditional probability, and since it obeys all probability axioms, it
  follows that the conditional distribution has the same properties as any distribution function.
  In particular
  $$F_X(+\infty|B) = \frac{P\{(X(\xi) \le +\infty) \cap B\}}{P(B)} = \frac{P(B)}{P(B)} = 1 \tag{43}$$
  $$F_X(-\infty|B) = \frac{P\{(X(\xi) \le -\infty) \cap B\}}{P(B)} = \frac{P(\phi)}{P(B)} = 0$$
  The conditional density function is the derivative of the conditional distribution function.
  Thus
  $$f_X(x|B) = \frac{d}{dx} F_X(x|B)$$
  we obtain
  $$F_X(x|B) = \int_{-\infty}^{x} f_X(u|B)\,du \tag{45}$$
  Example 10: Toss a coin and let X(T ) = 0, X(H) = 1. Suppose B = {H}. Determine
  FX (x|B). (Suppose q is the probability of landing a tail.)
  Solution: From the earlier example, FX (x) has the form shown in Fig. 4(a). We need
  FX (x|B) for all x.
   • For x < 0, {X(ξ) ≤ x} = φ, so that {(X(ξ) ≤ x) ∩ B} = φ and FX (x|B) = 0.
   • For 0 ≤ x < 1, {X(ξ) ≤ x} = {T }, so that {(X(ξ) ≤ x) ∩ B} = {T } ∩ {H} = φ
     and FX (x|B) = 0.
   • For x ≥ 1, {X(ξ) ≤ x} = Ω, so that {(X(ξ) ≤ x) ∩ B} = Ω ∩ B = B and
     FX (x|B) = P (B)/P (B) = 1.
  The conditional CDF is shown in Fig. 4(b).
  Example 11: Given FX (x), suppose B = {X(ξ) ≤ a}. Find fX (x|B).
  Solution: We will first determine FX (x|B) as
  $$F_X(x|B) = \frac{P\{(X \le x) \cap (X \le a)\}}{P(X \le a)}$$
   • For x < a,
     $$F_X(x|B) = \frac{P(X \le x)}{P(X \le a)} = \frac{F_X(x)}{F_X(a)}$$
   • For x ≥ a, (X ≤ x) ∩ (X ≤ a) = (X ≤ a), so that FX (x|B) = 1.
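  Thus, the conditional CDF and pdf are given as below (shown in Fig. 5)
  $$F_X(x|B) = \begin{cases} \dfrac{F_X(x)}{F_X(a)} & x < a \\ 1 & x \ge a \end{cases}$$
  and hence
  $$f_X(x|B) = \frac{d}{dx} F_X(x|B) = \begin{cases} \dfrac{f_X(x)}{F_X(a)} & x < a \\ 0 & \text{o.w.} \end{cases}$$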
  Example 12: Let B represent the event {a < X(ξ) ≤ b} with b > a. For a given FX (x),
  determine FX (x|B) and fX (x|B).
  Solution:
  $$F_X(x|B) = P\{X(\xi) \le x \mid B\} = \frac{P\{(X(\xi) \le x) \cap (a < X(\xi) \le b)\}}{P(a < X(\xi) \le b)} = \frac{P\{(X(\xi) \le x) \cap (a < X(\xi) \le b)\}}{F_X(b) - F_X(a)}$$
   • For x < a, we have {(X(ξ) ≤ x) ∩ (a < X(ξ) ≤ b)} = φ and hence FX (x|B) = 0.
   • For a ≤ x < b, we have {(X(ξ) ≤ x) ∩ (a < X(ξ) ≤ b)} = {a < X(ξ) ≤ x} and
     hence
     $$F_X(x|B) = \frac{P(a < X(\xi) \le x)}{F_X(b) - F_X(a)} = \frac{F_X(x) - F_X(a)}{F_X(b) - F_X(a)}$$
   • For x ≥ b, we have {(X(ξ) ≤ x) ∩ (a < X(ξ) ≤ b)} = {a < X(ξ) ≤ b} so that
     FX (x|B) = 1
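  The three cases above translate directly into a small function. In the sketch below, conditional_cdf is a hypothetical helper that takes any CDF and returns FX (x|B) for B = {a < X ≤ b}; the exponential CDF with rate 1 is an illustrative assumption and does not come from the example.

```python
import math

def conditional_cdf(cdf, a, b):
    """Return x -> F_X(x | a < X <= b), assuming F_X(b) > F_X(a)."""
    mass = cdf(b) - cdf(a)  # P(a < X <= b)
    def cond(x):
        if x < a:
            return 0.0
        if x >= b:
            return 1.0
        return (cdf(x) - cdf(a)) / mass
    return cond

def exp_cdf(x):
    """Illustrative choice: CDF of an exponential RV with rate 1."""
    return 1.0 - math.exp(-x) if x > 0 else 0.0

F_cond = conditional_cdf(exp_cdf, 1.0, 2.0)
print(F_cond(0.5), F_cond(1.5), F_cond(3.0))
```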
B is related to another RV
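  To determine the limiting case FX (x|Y = y), we can let y1 = y and y2 = y + Δy. This
  gives
  $$F_X(x \mid y < Y \le y + \Delta y) = \frac{\int_{-\infty}^{x} \int_{y}^{y + \Delta y} f_{XY}(u, v)\,dv\,du}{\int_{y}^{y + \Delta y} f_Y(v)\,dv} \approx \frac{\int_{-\infty}^{x} f_{XY}(u, y)\,du\,\Delta y}{f_Y(y)\,\Delta y}$$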
  As a reminder of the conditional nature of the left-hand side, we shall use the subscript X|Y
  It is easy to see that the conditional density represents a valid probability density function. In
  fact
  $$f_{X|Y}(x \mid Y = y) = \frac{f_{XY}(x, y)}{f_Y(y)} \ge 0$$
  and
  $$\int_{-\infty}^{\infty} f_{X|Y}(x \mid Y = y)\,dx = \frac{\int_{-\infty}^{\infty} f_{XY}(x, y)\,dx}{f_Y(y)} = \frac{f_Y(y)}{f_Y(y)} = 1$$
  Therefore, the conditional density indeed represents a valid pdf, and we shall refer to it as the
  conditional pdf of the RV X given Y = y. We may also write
  and
  $$f_{X|Y}(x|y) = \frac{f_{XY}(x, y)}{f_Y(y)} \tag{50}$$
  and similarly
  $$f_{Y|X}(y|x) = \frac{f_{XY}(x, y)}{f_X(x)} \tag{51}$$
  If the RVs X and Y are independent, then fXY (x, y) = fX (x)fY (y) and the conditional
  density reduces to
  $$f_{X|Y}(x|y) = f_X(x), \qquad f_{Y|X}(y|x) = f_Y(y)$$
  implying that the conditional pdfs coincide with the unconditional pdfs. This makes sense,
  since if X and Y are independent RVs, information about Y should not be of any help in
  updating our knowledge about X.
  In the case of discrete-type RVs, the conditional density reduces to
  $$P(X = x_i \mid Y = y_j) = \frac{P(X = x_i, Y = y_j)}{P(Y = y_j)} \tag{53}$$
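  For a joint pmf stored as a table, (53) amounts to normalizing one column by its sum. The table values and the function name below are hypothetical, chosen only to illustrate the computation.

```python
from fractions import Fraction

# Hypothetical joint pmf p[i][j] = P(X = x_i, Y = y_j); illustrative values only.
joint_pmf = [
    [Fraction(1, 8), Fraction(1, 8), Fraction(0)],
    [Fraction(1, 4), Fraction(1, 8), Fraction(3, 8)],
]

def conditional_pmf_x_given_y(p, j):
    """P(X = x_i | Y = y_j) for every i, following (53)."""
    p_yj = sum(row[j] for row in p)  # marginal P(Y = y_j)
    if p_yj == 0:
        raise ValueError("conditioning event has zero probability")
    return [row[j] / p_yj for row in p]

print(conditional_pmf_x_given_y(joint_pmf, 0))
```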
  Example 13: Given
  $$f_{XY}(x, y) = \begin{cases} k & 0 < x < y < 1 \\ 0 & \text{o.w.} \end{cases}$$
  determine fX|Y (x|y) and fY |X (y|x).
  Solution: The joint pdf is given to be a constant in the shaded region. This gives
  $$\iint f_{XY}(x, y)\,dx\,dy = \int_0^1 \int_0^y k\,dx\,dy = \int_0^1 k\,y\,dy = \frac{k}{2} = 1 \ \Rightarrow\ k = 2$$
  Similarly
  $$f_X(x) = \int f_{XY}(x, y)\,dy = \int_x^1 k\,dy = k(1 - x), \quad 0 < x < 1$$
  and
  $$f_Y(y) = \int f_{XY}(x, y)\,dx = \int_0^y k\,dx = k\,y, \quad 0 < y < 1$$
  Therefore
  $$f_{X|Y}(x|y) = \frac{f_{XY}(x, y)}{f_Y(y)} = \frac{1}{y}, \quad 0 < x < y < 1$$
  and
  $$f_{Y|X}(y|x) = \frac{f_{XY}(x, y)}{f_X(x)} = \frac{1}{1 - x}, \quad 0 < x < y < 1$$
  Example 14: Let R be a uniform random variable with parameters 0 and 1. Given R = r,
  X is a uniform random variable with parameters 0 and r. Find the conditional pdf of R given
  X, $f_{R|X}(r|x)$.
  Solution: The conditional density of X given R is
  \[
  f_{X|R}(x|r) = \begin{cases} \dfrac{1}{r}, & 0 \le x \le r \\ 0, & \text{otherwise} \end{cases}
  \]
  Since
  \[
  f_R(r) = \begin{cases} 1, & 0 \le r \le 1 \\ 0, & \text{otherwise} \end{cases}
  \]
  it follows that the joint pdf of R and X is
  \[
  f_{R,X}(r, x) = f_{X|R}(x|r)\, f_R(r) = \begin{cases} \dfrac{1}{r}, & 0 \le x < r < 1 \\ 0, & \text{otherwise} \end{cases}
  \]
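  The remaining steps of this solution do not appear in the text above. One way to complete
  the derivation from the joint pdf, using only the definition of a conditional density, is
  sketched here; the marginal $f_X$ is computed for this sketch rather than quoted from the
  notes.

```latex
\begin{align*}
f_X(x) &= \int_x^1 f_{R,X}(r, x)\,dr = \int_x^1 \frac{1}{r}\,dr = -\ln x, \qquad 0 < x < 1, \\
f_{R|X}(r|x) &= \frac{f_{R,X}(r, x)}{f_X(x)} = -\frac{1}{r \ln x}, \qquad 0 < x < r < 1.
\end{align*}
```

  As a consistency check, integrating $-1/(r \ln x)$ over $x < r < 1$ gives
  $(-\ln x)/(-\ln x) = 1$.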
  We can use the conditional pdfs to define the conditional mean. More generally, applying the
  definition of expectation to conditional pdfs, we get
  \[
  E[g(X)|B] = \int_{-\infty}^{\infty} g(x)\, f_X(x|B)\,dx
  \]
  Example 15: Let
  \[
  f_{XY}(x, y) = \begin{cases} 1, & 0 < |y| < x < 1 \\ 0, & \text{otherwise} \end{cases}
  \]
  Solution: As Fig. 8 shows, $f_{XY}(x, y) = 1$ in the shaded area, and zero elsewhere. From
  there
  \[
  f_X(x) = \int_{-x}^{x} f_{XY}(x, y)\,dy = 2x, \qquad 0 < x < 1
  \]
  and
  \[
  f_Y(y) = \int_{|y|}^{1} 1\,dx = 1 - |y|, \qquad |y| < 1
  \]
  This gives
  \[
  f_{X|Y}(x|y) = \frac{f_{XY}(x, y)}{f_Y(y)} = \frac{1}{1 - |y|}, \qquad 0 < |y| < x < 1
  \]
  and
  \[
  f_{Y|X}(y|x) = \frac{f_{XY}(x, y)}{f_X(x)} = \frac{1}{2x}, \qquad 0 < |y| < x < 1
  \]
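  Hence
  \[
  E[X|Y = y] = \int x\, f_{X|Y}(x|y)\,dx = \int_{|y|}^{1} \frac{x}{1 - |y|}\,dx
             = \frac{1}{1 - |y|} \left. \frac{x^2}{2} \right|_{|y|}^{1}
             = \frac{1 - |y|^2}{2(1 - |y|)} = \frac{1 + |y|}{2}, \qquad |y| < 1
  \]
  and
  \[
  E[Y|X = x] = \int y\, f_{Y|X}(y|x)\,dy = \int_{-x}^{x} \frac{y}{2x}\,dy
             = \frac{1}{2x} \left. \frac{y^2}{2} \right|_{-x}^{x} = 0, \qquad 0 < x < 1
  \]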
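  The conditional means implied by the conditional pdfs of Example 15 can be approximated
  numerically. The following is a minimal sketch, not part of the original notes; the
  function names and the midpoint-rule integrator are assumptions made for illustration. It
  evaluates $\int x f_{X|Y}(x|y)\,dx$ and $\int y f_{Y|X}(y|x)\,dy$ and compares them with
  the closed forms $(1 + |y|)/2$ and 0 obtained by direct integration.

```python
def midpoint_integral(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def cond_mean_x_given_y(y):
    """E[X | Y = y] using f_{X|Y}(x|y) = 1 / (1 - |y|) on |y| < x < 1."""
    lo = abs(y)
    return midpoint_integral(lambda x: x / (1.0 - lo), lo, 1.0)


def cond_mean_y_given_x(x):
    """E[Y | X = x] using f_{Y|X}(y|x) = 1 / (2x) on -x < y < x."""
    return midpoint_integral(lambda y: y / (2.0 * x), -x, x)


if __name__ == "__main__":
    for y in (-0.6, 0.2, 0.5):
        print(y, cond_mean_x_given_y(y), (1 + abs(y)) / 2)
    print("E[Y | X = 0.7] ~", cond_mean_y_given_x(0.7))
```

  The second result reflects the symmetry of the region about y = 0: given X = x, Y is
  uniform on (−x, x), so its conditional mean vanishes.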
  Example 16: Poisson sum of Bernoulli random variables. Let $X_i$, $i = 1, 2, 3, \ldots$,
  represent independent, identically distributed Bernoulli random variables with
  \[
  P(X_i = 1) = p, \qquad P(X_i = 0) = 1 - p = q
  \]
  and N a Poisson random variable with parameter λ that is independent of all $X_i$. Consider
  the random variables
  \[
  Y_1 = \sum_{i=1}^{N} X_i, \qquad Y_2 = N - Y_1
  \]
  Show that $Y_1$ and $Y_2$ are independent Poisson random variables.
  Solution: The joint probability mass function of $Y_1$ and $Y_2$ can be computed as follows.
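  Note that $\sum_{i=1}^{m+n} X_i \sim B(m + n, p)$ and that the $X_i$ are independent of N.
  Since $Y_1 = m$, $Y_2 = n$ is the same event as $Y_1 = m$, $N = m + n$,
  \[
  \begin{aligned}
  P(Y_1 = m, Y_2 = n) &= P\!\left( \sum_{i=1}^{m+n} X_i = m \right) P(N = m + n) \\
                      &= \frac{(m + n)!}{m!\, n!}\, p^m q^n \left( e^{-\lambda} \frac{\lambda^{m+n}}{(m + n)!} \right) \\
                      &= \left( e^{-p\lambda} \frac{(p\lambda)^m}{m!} \right) \left( e^{-q\lambda} \frac{(q\lambda)^n}{n!} \right) \\
                      &= P(Y_1 = m) \cdot P(Y_2 = n)
  \end{aligned}
  \]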
  Thus,
  \[
  Y_1 \sim P(p\lambda), \qquad Y_2 \sim P(q\lambda)
  \]
  and $Y_1$ and $Y_2$ are independent random variables. For example, if the number of eggs a
  bird lays is a Poisson random variable with parameter λ, and each egg survives independently
  with probability p, then the number of baby birds that survive is also a Poisson random
  variable, with parameter pλ.
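  The factorization in Example 16 can be verified term by term. The sketch below is an
  illustration, not part of the original notes; the function names and the truncation
  parameter `upto` are assumptions. It evaluates the joint pmf from the binomial-times-Poisson
  expression and compares it with the product of the two Poisson pmfs on a grid of (m, n)
  values.

```python
from math import comb, exp, factorial


def poisson_pmf(k, mean):
    """P(K = k) for a Poisson random variable with the given mean."""
    return exp(-mean) * mean**k / factorial(k)


def joint_pmf(m, n, lam, p):
    """P(Y1 = m, Y2 = n): binomial split of N = m + n times the Poisson pmf of N."""
    q = 1.0 - p
    return comb(m + n, m) * p**m * q**n * poisson_pmf(m + n, lam)


def max_factorization_gap(lam, p, upto=15):
    """Largest |joint - product of marginals| over 0 <= m, n < upto."""
    q = 1.0 - p
    return max(
        abs(joint_pmf(m, n, lam, p)
            - poisson_pmf(m, p * lam) * poisson_pmf(n, q * lam))
        for m in range(upto)
        for n in range(upto)
    )


if __name__ == "__main__":
    print(max_factorization_gap(lam=3.0, p=0.4))
```

  The gap should be at the level of floating-point rounding, since the two expressions are
  algebraically identical.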
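  Example 17: Suppose that the number of people who visit a yoga academy each day is a
  Poisson RV with mean λ. Suppose further that each person who visits is, independently,
  female with probability p or male with probability 1 − p. Find the joint probability that
  exactly n women and m men visit the academy on a given day, that is, $P(N_1 = n, N_2 = m)$,
  where $N_1$ and $N_2$ denote the numbers of women and men and $N = N_1 + N_2$ is the total
  number of visitors.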
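  \[
  \begin{aligned}
  P(N_1 = n, N_2 = m) &= P[N_1 = n, N_2 = m \mid N = n + m]\, e^{-\lambda} \frac{\lambda^{n+m}}{(n + m)!} \\
                      &= \binom{n + m}{n} p^n (1 - p)^m\, e^{-\lambda} \frac{\lambda^{n+m}}{(n + m)!}
  \end{aligned}
  \]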
  Since this expression factors as
  $\left( e^{-\lambda p} \frac{(\lambda p)^n}{n!} \right) \left( e^{-\lambda (1-p)} \frac{(\lambda (1-p))^m}{m!} \right)$,
  we can conclude that $N_1$ and $N_2$ are independent Poisson RVs with means λp and λ(1 − p),
  respectively. Examples 16 and 17 therefore show an important result: when each of a
  Poisson number of events is independently classified as type 1 with probability p
  or type 2 with probability 1 − p, the numbers of type 1 and type 2 events are independent
  Poisson random variables.
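  The splitting result can also be illustrated by simulation. This is a rough sketch under
  stated assumptions, not part of the original notes: the Poisson sampler uses Knuth's
  multiplication method (adequate for small λ), and the parameter values, seed, and function
  names are arbitrary choices for illustration. The sample means of the two counts should be
  close to λp and λ(1 − p), and their sample covariance close to zero. A near-zero covariance
  is consistent with independence but does not by itself prove it.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def simulate_split(lam, p, trials, seed=0):
    """Return (mean of N1, mean of N2, sample covariance of N1 and N2)."""
    rng = random.Random(seed)
    n1s, n2s = [], []
    for _ in range(trials):
        n = sample_poisson(lam, rng)
        # Classify each of the n events as type 1 with probability p.
        n1 = sum(1 for _ in range(n) if rng.random() < p)
        n1s.append(n1)
        n2s.append(n - n1)
    mean1 = sum(n1s) / trials
    mean2 = sum(n2s) / trials
    cov = sum((a - mean1) * (b - mean2) for a, b in zip(n1s, n2s)) / trials
    return mean1, mean2, cov


if __name__ == "__main__":
    print(simulate_split(lam=5.0, p=0.3, trials=20_000))
```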
   • $E[X] = E[E[X|Y]]$
  \[
  E[X] = \sum_{y} E[X|Y = y]\, P(Y = y) \qquad \text{($Y$ is discrete)} \tag{59}
  \]
  \[
  E[X] = \int_{-\infty}^{\infty} E[X|Y = y]\, f_Y(y)\,dy \qquad \text{($Y$ is continuous)} \tag{60}
  \]
  Example 18 (The expectation of the sum of a random number of random variables): Suppose
  that the expected number of accidents per week at an industrial plant is four. Suppose also
  that the numbers of workers injured in the accidents are independent RVs with a common
  mean of 2. Assume also that the number of workers injured in each accident is independent of
  the number of accidents that occur. What is the expected number of injuries during a week?
  Solution: Let N denote the number of accidents and $X_i$ the number injured in the ith
  accident, i = 1, 2, .... The total number of injuries can then be expressed as
  $\sum_{i=1}^{N} X_i$. Now
  \[
  E\left[ \sum_{i=1}^{N} X_i \right] = E\left[ E\left[ \sum_{i=1}^{N} X_i \,\middle|\, N \right] \right]
  \]
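  But
  \[
  \begin{aligned}
  E\left[ \sum_{i=1}^{N} X_i \,\middle|\, N = n \right]
    &= E\left[ \sum_{i=1}^{n} X_i \,\middle|\, N = n \right] \\
    &= E\left[ \sum_{i=1}^{n} X_i \right] \qquad \text{by independence of the $X_i$ and $N$} \\
    &= n E[X]
  \end{aligned}
  \]
  which is
  \[
  E\left[ \sum_{i=1}^{N} X_i \,\middle|\, N \right] = N E[X]
  \]
  and thus
  \[
  E\left[ \sum_{i=1}^{N} X_i \right] = E[N E[X]] = E[N] E[X]
  \]
  Therefore, the expected number of injuries during a week equals 4 × 2 = 8.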
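  A short simulation illustrates the identity E[N]E[X] for a random sum. The example only
  specifies the means, so the distributions below are hypothetical choices made for this
  sketch: N is taken uniform on {0, ..., 8} (mean 4) and each $X_i$ uniform on {1, 2, 3}
  (mean 2). Any other independent choice with the same means should give the same expected
  total.

```python
import random


def simulate_weekly_injuries(trials, seed=0):
    """Average total injuries per week over many simulated weeks."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        # Hypothetical accident count with E[N] = 4.
        accidents = rng.randint(0, 8)
        # Hypothetical injuries per accident with E[X] = 2, independent of N.
        total += sum(rng.randint(1, 3) for _ in range(accidents))
    return total / trials


if __name__ == "__main__":
    print(simulate_weekly_injuries(20_000))  # should be close to 4 * 2 = 8
```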
  Example 19 (The mean of a geometric distribution): A coin, having probability p of
  coming up heads, is to be successively flipped until the first head appears. What is the
  expected number of flips required?
  Solution: Let N be the number of flips required, and let
  \[
  Y = \begin{cases} 1, & \text{if the first flip results in a head} \\ 0, & \text{if the first flip results in a tail} \end{cases}
  \]
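  Now, conditioning on Y,
  \[
  E[N] = p\,E[N|Y = 1] + (1 - p)\,E[N|Y = 0] = p + (1 - p)(1 + E[N])
  \]
  since $E[N|Y = 1] = 1$ and, after a first tail, the experiment starts over with one flip
  already used, so $E[N|Y = 0] = 1 + E[N]$. Solving for E[N] gives
  \[
  E[N] = 1/p
  \]
  Compare with the direct computation
  \[
  E[N] = \sum_{n=1}^{\infty} n\,p(n) = \sum_{n=1}^{\infty} n\,p\,(1 - p)^{n-1}
  \]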
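  The comparison between the conditioning argument and the direct series can be made
  concrete. The sketch below is an illustration, not part of the original notes; the function
  names and the truncation length `terms` are assumptions. It sums the series
  $\sum n\,p\,(1-p)^{n-1}$ up to a finite number of terms and sets it beside 1/p.

```python
def geometric_mean_by_series(p, terms=5000):
    """Truncated direct sum of n * p * (1 - p)**(n - 1) for n = 1..terms."""
    return sum(n * p * (1 - p) ** (n - 1) for n in range(1, terms + 1))


def geometric_mean_by_conditioning(p):
    """Solve E[N] = p * 1 + (1 - p) * (1 + E[N]) for E[N], which gives 1 / p."""
    return 1.0 / p


if __name__ == "__main__":
    for p in (0.1, 0.25, 0.5):
        print(p, geometric_mean_by_series(p), geometric_mean_by_conditioning(p))
```

  For very small p the truncation length would need to grow, since the tail of the series
  decays like $(1 - p)^n$.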
  Example 20: A miner is trapped in a mine containing three doors. The first door leads to a
  tunnel that takes him to safety after two hours of travel. The second door leads to a tunnel
  that returns him to the mine after three hours of travel. The third door leads to a tunnel that
  returns him to the mine after five hours. Assuming that the miner is at all times equally likely
  to choose any one of the doors, what is the expected length of time until the miner reaches
  safety?
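  Solution: Let X denote the time until the miner reaches safety, and let Y denote the door
  he initially chooses. Now
  \[
  E[X] = \frac{1}{3} \left( E[X|Y = 1] + E[X|Y = 2] + E[X|Y = 3] \right)
  \]
  Here $E[X|Y = 1] = 2$, while $E[X|Y = 2] = 3 + E[X]$ and $E[X|Y = 3] = 5 + E[X]$, because
  after returning to the mine the miner faces the original problem again, and thus
  \[
  E[X] = \frac{1}{3} \left( 2 + 3 + E[X] + 5 + E[X] \right)
  \]
  which leads to E[X] = 10 hours.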
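  The miner's expected escape time can be checked in two ways: by solving the linear equation
  for E[X] exactly, and by simulating the door choices. This is a sketch for illustration, not
  part of the original notes; the function names, seed, and number of trials are arbitrary
  assumptions.

```python
import random
from fractions import Fraction


def expected_escape_time_exact():
    """Solve E[X] = (2 + (3 + E[X]) + (5 + E[X])) / 3 with exact arithmetic."""
    constant = Fraction(2 + 3 + 5, 3)   # part of the right side without E[X]
    return_prob = Fraction(2, 3)        # coefficient of E[X] on the right side
    return constant / (1 - return_prob)


def simulate_escape_time(trials, seed=0):
    """Average time to safety when each door is chosen with probability 1/3."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        elapsed = 0
        while True:
            door = rng.randint(1, 3)
            if door == 1:
                elapsed += 2            # tunnel to safety
                break
            elapsed += 3 if door == 2 else 5   # back in the mine
        total += elapsed
    return total / trials


if __name__ == "__main__":
    print(expected_escape_time_exact())
    print(simulate_escape_time(50_000))
```

  The simulated average fluctuates around the exact value because the escape time has a
  fairly large spread, so a large number of trials is needed for a tight estimate.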
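  Conditioning can also be used to compute probabilities. Let E denote an event and let X be
  its indicator RV, that is, X = 1 if E occurs and X = 0 otherwise. It follows that
  \[
  E[X] = P[E] \qquad \text{and} \qquad E[X|Y = y] = P[E|Y = y]
  \]
  for any RV Y. Therefore,
  \[
  P[E] = \sum_{y} P[E|Y = y]\, P(Y = y) \qquad \text{if $Y$ is discrete} \tag{61}
  \]
  \[
  P[E] = \int_{-\infty}^{\infty} P[E|Y = y]\, f_Y(y)\,dy \qquad \text{if $Y$ is continuous} \tag{62}
  \]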
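  Example 21: Suppose that X and Y are independent continuous random variables having
  densities $f_X$ and $f_Y$, respectively. Compute $P(X < Y)$.
  Solution: Conditioning on the value of Y yields
  \[
  \begin{aligned}
  P(X < Y) &= \int_{-\infty}^{\infty} P[X < Y \mid Y = y]\, f_Y(y)\,dy \\
           &= \int_{-\infty}^{\infty} P[X < y \mid Y = y]\, f_Y(y)\,dy \\
           &= \int_{-\infty}^{\infty} P(X < y)\, f_Y(y)\,dy \\
           &= \int_{-\infty}^{\infty} F_X(y)\, f_Y(y)\,dy
  \end{aligned}
  \]
  where
  \[
  F_X(y) = \int_{-\infty}^{y} f_X(x)\,dx
  \]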
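  The formula $P(X < Y) = \int F_X(y) f_Y(y)\,dy$ can be evaluated numerically for a concrete
  pair of distributions. The choice below is an assumption made only for illustration and is
  not taken from the notes: X and Y are independent exponentials with rates a and b, for
  which the integral works out to a/(a + b). The integration range and step count are also
  arbitrary.

```python
import math


def prob_x_less_than_y(cdf_x, pdf_y, lo, hi, steps=200_000):
    """Midpoint-rule approximation of the integral of F_X(y) * f_Y(y) over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(
        cdf_x(lo + (i + 0.5) * h) * pdf_y(lo + (i + 0.5) * h)
        for i in range(steps)
    )


def exponential_cdf(rate):
    """Return the cdf of an exponential random variable with the given rate."""
    return lambda t: 1.0 - math.exp(-rate * t) if t > 0 else 0.0


def exponential_pdf(rate):
    """Return the pdf of an exponential random variable with the given rate."""
    return lambda t: rate * math.exp(-rate * t) if t > 0 else 0.0


if __name__ == "__main__":
    a, b = 1.0, 2.0
    approx = prob_x_less_than_y(exponential_cdf(a), exponential_pdf(b), 0.0, 40.0)
    print(approx, a / (a + b))
```

  Truncating the upper limit at 40 is harmless here because the integrand decays like
  $e^{-2y}$; heavier-tailed densities would need a wider range.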