MS Lectures 6



1.12 Multivariate Random Variables

We will be using matrix notation to denote multivariate rvs and their distributions.
Denote by X = (X1, . . . , Xn)^T an n-dimensional random vector whose components are random variables. Then, all the definitions given for bivariate rvs extend to the multivariate case. For example, if X is continuous, then we may write

F_X(x_1, \dots, x_n) = \int_{-\infty}^{x_n} \cdots \int_{-\infty}^{x_1} f_X(x_1, \dots, x_n)\, dx_1 \dots dx_n

and

P(X \in A) = \int \cdots \int_A f_X(x_1, \dots, x_n)\, dx_1 \dots dx_n,

where A \subseteq \mathcal{X} and \mathcal{X} \subseteq \mathbb{R}^n is the support of f_X.

Example 1.37. Let X = (X1, X2, X3, X4)^T be a four-dimensional random vector with the joint pdf given by

f_X(x_1, x_2, x_3, x_4) = \frac{3}{4}\,(x_1^2 + x_2^2 + x_3^2 + x_4^2)\, I_{\mathcal{X}},

where \mathcal{X} = \{(x_1, x_2, x_3, x_4) \in \mathbb{R}^4 : 0 < x_i < 1,\; i = 1, 2, 3, 4\}. Calculate:

1. the marginal pdf of (X1, X2);

2. the expectation E(X1 X2);

3. the conditional pdf f(x_3, x_4 \mid x_1 = \tfrac{1}{3}, x_2 = \tfrac{2}{3});

4. the probability P(X_1 < \tfrac{1}{2},\; X_2 < \tfrac{3}{4},\; X_4 > \tfrac{1}{2}).

Solution:

1. Here we have to calculate the double integral of the joint pdf with respect
to x3 and x4 , that is,
f(x_1, x_2) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_X(x_1, x_2, x_3, x_4)\, dx_3\, dx_4
            = \int_0^1 \int_0^1 \frac{3}{4}(x_1^2 + x_2^2 + x_3^2 + x_4^2)\, dx_3\, dx_4
            = \frac{3}{4}(x_1^2 + x_2^2) + \frac{1}{2}.

2. By definition of expectation we have


E(X_1 X_2) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x_1 x_2\, f(x_1, x_2)\, dx_1\, dx_2
           = \int_0^1 \int_0^1 x_1 x_2 \left[\frac{3}{4}(x_1^2 + x_2^2) + \frac{1}{2}\right] dx_1\, dx_2 = \frac{5}{16}.

3. By definition of a conditional pdf we have,


f(x_3, x_4 \mid x_1, x_2) = \frac{f_X(x_1, x_2, x_3, x_4)}{f(x_1, x_2)}
                          = \frac{\frac{3}{4}(x_1^2 + x_2^2 + x_3^2 + x_4^2)}{\frac{3}{4}(x_1^2 + x_2^2) + \frac{1}{2}}
                          = \frac{x_1^2 + x_2^2 + x_3^2 + x_4^2}{x_1^2 + x_2^2 + \frac{2}{3}}.
Hence,

f\!\left(x_3, x_4 \,\middle|\, x_1 = \tfrac{1}{3}, x_2 = \tfrac{2}{3}\right)
  = \frac{\left(\tfrac{1}{3}\right)^2 + \left(\tfrac{2}{3}\right)^2 + x_3^2 + x_4^2}{\left(\tfrac{1}{3}\right)^2 + \left(\tfrac{2}{3}\right)^2 + \tfrac{2}{3}}
  = \frac{5}{11} + \frac{9}{11}\,x_3^2 + \frac{9}{11}\,x_4^2.

4. Here we use (indirectly) the marginal pdf for (X1 , X2 , X4 ):


P\!\left(X_1 < \tfrac{1}{2},\, X_2 < \tfrac{3}{4},\, X_4 > \tfrac{1}{2}\right)
  = \int_{\frac{1}{2}}^{1}\int_{0}^{1}\int_{0}^{\frac{3}{4}}\int_{0}^{\frac{1}{2}} \frac{3}{4}(x_1^2 + x_2^2 + x_3^2 + x_4^2)\, dx_1\, dx_2\, dx_3\, dx_4 = \frac{171}{1024}.
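The four answers can be reproduced symbolically. Below is a minimal Python/sympy sketch (assuming sympy is available; it is only a numerical cross-check, not part of the derivation):

```python
import sympy as sp

x1, x2, x3, x4 = sp.symbols('x1 x2 x3 x4', positive=True)
f = sp.Rational(3, 4) * (x1**2 + x2**2 + x3**2 + x4**2)   # joint pdf on (0,1)^4

# 1. marginal pdf of (X1, X2)
f12 = sp.integrate(f, (x3, 0, 1), (x4, 0, 1))
print(sp.simplify(f12))                      # 3*x1**2/4 + 3*x2**2/4 + 1/2

# 2. E(X1 X2)
print(sp.integrate(x1*x2*f12, (x1, 0, 1), (x2, 0, 1)))    # 5/16

# 3. conditional pdf f(x3, x4 | x1 = 1/3, x2 = 2/3)
cond = (f / f12).subs({x1: sp.Rational(1, 3), x2: sp.Rational(2, 3)})
print(sp.simplify(cond))                     # 9*x3**2/11 + 9*x4**2/11 + 5/11

# 4. P(X1 < 1/2, X2 < 3/4, X4 > 1/2)
print(sp.integrate(f, (x1, 0, sp.Rational(1, 2)),
                      (x2, 0, sp.Rational(3, 4)),
                      (x3, 0, 1),
                      (x4, sp.Rational(1, 2), 1)))        # 171/1024
```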


The following results will be very useful in the second part of this course. They
are extensions of Definition 1.18, Theorem 1.13, and Theorem 1.14, respectively,
to n random variables X1 , X2 , . . . , Xn .

Definition 1.22. Let X = (X1, X2, . . . , Xn)^T denote a continuous n-dimensional rv with joint pdf fX(x1, x2, . . . , xn) and marginal pdfs fXi(xi), i = 1, 2, . . . , n. The random variables are called mutually independent (or just independent) if

f_X(x_1, x_2, \dots, x_n) = \prod_{i=1}^{n} f_{X_i}(x_i).


In particular, mutual independence implies that all pairs Xi, Xj, i ≠ j, are independent.

Example 1.38. Suppose that Yi ∼ Exp(λ) independently for i = 1, 2, . . . , n.


Then the joint pdf of Y = (Y1 , Y2 , . . . , Yn )T is
f_Y(y_1, \dots, y_n) = \prod_{i=1}^{n} \lambda e^{-\lambda y_i} = \lambda^n e^{-\lambda \sum_{i=1}^{n} y_i}.
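As a tiny numerical illustration of this factorization, the Python sketch below (assuming NumPy; the value of λ and the sample point are arbitrary) compares the product of the marginal Exp(λ) densities with the closed form above at one point.

```python
import numpy as np

lam = 2.0
y = np.array([0.3, 1.2, 0.05, 0.7])          # one point (y1, ..., y4), all > 0

product_of_marginals = np.prod(lam * np.exp(-lam * y))
closed_form = lam**len(y) * np.exp(-lam * y.sum())
print(np.isclose(product_of_marginals, closed_form))   # True
```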

Theorem 1.21. Let X1, X2, . . . , Xn be mutually independent rvs. Then, for gj(Xj), a function of Xj only, j = 1, 2, . . . , m, m ≤ n, we have

E\left(\prod_{j=1}^{m} g_j(X_j)\right) = \prod_{j=1}^{m} E\bigl(g_j(X_j)\bigr).


Theorem 1.22. Let X = (X1, X2, . . . , Xn)^T be a vector of mutually independent rvs with mgfs MX1(t), MX2(t), . . . , MXn(t), and let a1, a2, . . . , an and b1, b2, . . . , bn be fixed constants. Then the mgf of the random variable Z = \sum_{i=1}^{n}(a_i X_i + b_i) is

M_Z(t) = e^{t \sum_{i=1}^{n} b_i} \prod_{i=1}^{n} M_{X_i}(a_i t).

Exercise 1.23. Prove Theorem 1.22.

Example 1.39. Calculate the mean and the variance of the random variable Y = \sum_{i=1}^{n} X_i, where Xi ∼ Gamma(αi, λ) independently.

First, we will find the mgf of Y and then obtain the first and second moments from it (Theorem 1.7). The Xi are independent, hence, by Theorem 1.22 we have

M_Y(t) = \prod_{i=1}^{n} M_{X_i}(t).

The pdf of a single rv X ∼ Gamma(α, λ) is


f_X(x) = \frac{\lambda^\alpha}{\Gamma(\alpha)}\, x^{\alpha-1} e^{-\lambda x}\, I_{[0,\infty)}(x).

Thus, by the definition of the mgf we have



M_X(t) = E\left(e^{tX}\right)
       = \frac{\lambda^\alpha}{\Gamma(\alpha)} \int_0^\infty e^{tx}\, x^{\alpha-1} e^{-\lambda x}\, dx
       = \frac{\lambda^\alpha}{\Gamma(\alpha)} \int_0^\infty x^{\alpha-1} e^{-(\lambda-t)x}\, dx
       = \frac{\lambda^\alpha}{(\lambda-t)^\alpha}\, \underbrace{\frac{(\lambda-t)^\alpha}{\Gamma(\alpha)} \int_0^\infty x^{\alpha-1} e^{-(\lambda-t)x}\, dx}_{=1 \text{ (integral of a Gamma pdf)}}
       = \left(\frac{\lambda}{\lambda-t}\right)^{\alpha} = \left(1 - \frac{t}{\lambda}\right)^{-\alpha}, \qquad t < \lambda.
Hence,
M_Y(t) = \prod_{i=1}^{n} M_{X_i}(t) = \prod_{i=1}^{n}\left(1 - \frac{t}{\lambda}\right)^{-\alpha_i} = \left(1 - \frac{t}{\lambda}\right)^{-\sum_{i=1}^{n}\alpha_i}.

This has the same form as the mgf of a Gamma random variable with parameters \sum_{i=1}^{n}\alpha_i and λ, that is,

Y \sim Gamma\!\left(\sum_{i=1}^{n}\alpha_i,\; \lambda\right).

The mean and variance of a Gamma rv can be obtained by calculating the derivatives of the mgf at t = 0, see Theorem 1.7. For X ∼ Gamma(α, λ) we have

M_X(t) = \left(1 - \frac{t}{\lambda}\right)^{-\alpha}, \qquad
E X = \frac{\alpha}{\lambda}, \qquad
E X^2 = \frac{\alpha(\alpha+1)}{\lambda^2}, \qquad
var(X) = E X^2 - (E X)^2 = \frac{\alpha}{\lambda^2}.

Hence, for Y \sim Gamma\!\left(\sum_{i=1}^{n}\alpha_i, \lambda\right) we get

E Y = \frac{\sum_{i=1}^{n}\alpha_i}{\lambda} \quad \text{and} \quad var(Y) = \frac{\sum_{i=1}^{n}\alpha_i}{\lambda^2}.
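This result is easy to check by simulation. The following Python sketch (assuming NumPy; the shape values and the rate are arbitrary illustrations) compares the sample mean and variance of Y with the formulas above. Note that NumPy's gamma generator is parameterized by shape and scale, where scale = 1/λ.

```python
import numpy as np

rng = np.random.default_rng(42)
alphas = np.array([0.5, 1.0, 2.5])     # shape parameters alpha_i
lam = 2.0                              # common rate lambda
n_sim = 200_000

# Draw independent Gamma(alpha_i, lambda) in each column (scale = 1/lambda).
X = rng.gamma(shape=alphas, scale=1.0 / lam, size=(n_sim, len(alphas)))
Y = X.sum(axis=1)

print(Y.mean(), alphas.sum() / lam)      # both close to 2.0
print(Y.var(), alphas.sum() / lam**2)    # both close to 1.0
```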


The following definition is often used when we consider realizations of rvs (samples) coming from populations having the same distribution.
Definition 1.23. The random variables X1 , X2 , . . . , Xn are identically distributed
if their distribution functions are identical, that is,
F_{X_1}(x) = F_{X_2}(x) = \dots = F_{X_n}(x) \quad \text{for all } x \in \mathbb{R}.

If they are also independent then we denote this briefly as IID, which means Independently, Identically Distributed. For example, the notation
{Xi }i=1,2,...,n ∼ IID
means that the variables Xi are IID but the type of the distribution is not specified.
We will often use IID normal rvs denoted by
X_i \overset{iid}{\sim} N(\mu, \sigma^2), \quad i = 1, 2, \dots, n.

Exercise 1.24. Find the pdf of the random variable \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i, where X_i \overset{iid}{\sim} N(\mu, \sigma^2), i = 1, 2, \dots, n.

1.12.1 Expectation and Variance of Random Vectors

The expectation of a random vector X is a vector of expectations of its components, that is,

E(X) = E\begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_n \end{pmatrix}
     = \begin{pmatrix} E(X_1) \\ E(X_2) \\ \vdots \\ E(X_n) \end{pmatrix}
     = \begin{pmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_n \end{pmatrix} = \mu.
The variance-covariance matrix of X is

V = Var(X) = E\left[(X - E(X))(X - E(X))^T\right]
  = \begin{pmatrix}
      var(X_1)      & cov(X_1, X_2) & \dots  & cov(X_1, X_n) \\
      cov(X_2, X_1) & var(X_2)      & \dots  & cov(X_2, X_n) \\
      \vdots        & \vdots        & \ddots & \vdots        \\
      cov(X_n, X_1) & cov(X_n, X_2) & \dots  & var(X_n)
    \end{pmatrix}.  \tag{1.20}
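To see the definition at work, the short NumPy sketch below (the 3-dimensional mean and covariance used here are arbitrary illustrations) estimates E(X) and Var(X) from simulated realizations of X and compares them with the quantities used to generate the data.

```python
import numpy as np

rng = np.random.default_rng(0)
mu = np.array([1.0, -2.0, 0.5])                   # E(X), chosen for illustration
A = np.array([[ 1.0, 0.0, 0.0],
              [ 0.5, 1.0, 0.0],
              [-0.3, 0.2, 1.0]])
V = A @ A.T                                       # a valid variance-covariance matrix

X = rng.multivariate_normal(mu, V, size=200_000)  # each row is one realization of X^T
mu_hat = X.mean(axis=0)                           # estimates E(X)
D = X - mu_hat
V_hat = D.T @ D / (len(X) - 1)                    # estimates E[(X - mu)(X - mu)^T]

print(np.round(mu_hat, 2))                        # close to mu
print(np.round(V_hat, 2))                         # close to V
```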

The following theorem shows a basic property of the variance-covariance matrix.

Theorem 1.23. If X is a random vector then its variance-covariance matrix V is a non-negative definite matrix, that is, for any constant vector b the quadratic form b^T V b is non-negative.

Proof. For any constant vector b ∈ R^n we can construct a one-dimensional variable Y = b^T X whose variance is

0 \le var(Y) = E\left[(Y - E(Y))^2\right]
             = E\left[(b^T X - E(b^T X))^2\right]
             = E\left[(b^T X - E(b^T X))(b^T X - E(b^T X))^T\right]
             = E\left[b^T (X - E(X))(X - E(X))^T b\right]
             = b^T E\left[(X - E(X))(X - E(X))^T\right] b
             = b^T Var(X)\, b = b^T V b.

That is, b^T V b ≥ 0 and so V is a non-negative definite matrix. □

The proof of the above theorem shows that the variance of a linear combination Y = \sum_{i=1}^{n} b_i X_i of random variables X_i is a quadratic form in the variance-covariance matrix of X and the vector b of the coefficients of the combination. More generally, if X is an n-dimensional rv, B is an m × n constant matrix and a is a real m × 1 vector, then the expectation and the variance of the random vector

Y = a + BX

are, respectively,

E(Y) = a + B\, E(X) = a + B\mu,

and

Var(Y) = B\, Var(X)\, B^T.
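These two identities are easy to verify numerically. The Monte Carlo sketch below (assuming NumPy; the particular a, B, µ and V are arbitrary choices) compares the empirical mean and covariance of Y = a + BX with a + Bµ and B V B^T.

```python
import numpy as np

rng = np.random.default_rng(1)
mu = np.array([0.0, 1.0, -1.0])
V = np.array([[2.0, 0.5, 0.0],
              [0.5, 1.0, 0.3],
              [0.0, 0.3, 1.5]])
a = np.array([1.0, -2.0])
B = np.array([[1.0, 2.0,  0.0],
              [0.0, 1.0, -1.0]])                  # 2 x 3, so Y is 2-dimensional

X = rng.multivariate_normal(mu, V, size=300_000)
Y = a + X @ B.T                                   # each row is one realization of Y = a + BX

print(np.round(Y.mean(axis=0), 2), a + B @ mu)    # empirical mean vs a + B mu
print(np.round(np.cov(Y.T), 2))                   # empirical Var(Y)
print(B @ V @ B.T)                                # theoretical B V B^T
```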

The covariance of two random vectors, n-dimensional X and m-dimensional Y, is defined as

Cov(X, Y) = E\left[(X - E(X))(Y - E(Y))^T\right].

It is an n × m matrix.

1.12.2 Joint Moment Generating Function

Definition 1.24. Let X = (X1, X2, . . . , Xn)^T be a random vector. We define the joint mgf as

M_X(t) = E\left(e^{t^T X}\right),

where t = (t_1, t_2, \dots, t_n)^T is an n-dimensional argument of M_X. □

As in the univariate case, there is a one-to-one correspondence between the joint pdf and the joint mgf. The mgf related to a marginal distribution of a subset of
variables Xi1 , . . . , Xis can be obtained by setting tj = 0 for all j not in the set
{i1 , . . . , is }.

Note also that if the variables X1 , X2 , . . . , Xn are mutually independent, then the
joint mgf is a product of the marginal mgfs, that is
M_X(t) = E\left(e^{t^T X}\right) = E\left(e^{\sum_{j=1}^{n} t_j X_j}\right) = E\left(\prod_{j=1}^{n} e^{t_j X_j}\right) = \prod_{j=1}^{n} M_{X_j}(t_j).
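A rough numerical illustration of this factorization (a Monte Carlo sketch assuming NumPy; the independent standard normal components and the value of t are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(3)
t = np.array([0.4, -0.2, 0.7])                  # a fixed argument t
X = rng.standard_normal(size=(500_000, 3))      # X1, X2, X3 mutually independent

joint_mgf = np.mean(np.exp(X @ t))              # estimates E[exp(t^T X)]
product_of_marginals = np.prod(
    [np.mean(np.exp(t[j] * X[:, j])) for j in range(3)]
)

print(joint_mgf, product_of_marginals)          # both close to exp(0.5 * (t**2).sum())
```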

Another useful property of the joint mgf is given in the following theorem.

Theorem 1.24. Let X = (X1, X2, . . . , Xn)^T be a random vector. If the joint mgf of X can be written as a product of some functions gj(tj), j = 1, 2, . . . , n, that is,

M_X(t) = \prod_{j=1}^{n} g_j(t_j),

then the variables X1 , X2 , . . . , Xn are independent.


Proof. Let t_i = 0 for all i \ne j. Then the marginal mgf M_{X_j}(t_j) is

M_{X_j}(t_j) = g_j(t_j) \prod_{i \ne j} g_i(0).

Also, note that if t_i = 0 for all i = 1, 2, \dots, n, then

M_X(t) = E\left(e^{\sum_{j=1}^{n} t_j X_j}\right) = E\left(e^{0}\right) = 1.

This gives

1 = M_X(t) = \prod_{j=1}^{n} g_j(0) \quad \Longrightarrow \quad \prod_{i \ne j} g_i(0) = \frac{1}{g_j(0)}.

Therefore,

M_{X_j}(t_j) = \frac{g_j(t_j)}{g_j(0)},

and hence

M_X(t) = \prod_{j=1}^{n} g_j(t_j) = \prod_{j=1}^{n} g_j(0)\, M_{X_j}(t_j) = 1 \times \prod_{j=1}^{n} M_{X_j}(t_j).

By the uniqueness of the correspondence between mgfs and pdfs, this means that the joint pdf can also be written as a product of the marginal pdfs, each with marginal mgf equal to M_{X_j}(t_j) = g_j(t_j)/g_j(0). Hence, the random variables X_1, X_2, \dots, X_n are independent. □

1.12.3 Transformations of Random Vectors

Let X = (X1, X2, . . . , Xn)^T be a continuous random vector and let g : \mathbb{R}^n \to \mathbb{R}^n be a one-to-one and onto function denoted by

g(x) = (g_1(x), g_2(x), \dots, g_n(x))^T,

where x = (x_1, x_2, \dots, x_n)^T and g_i : \mathbb{R}^n \to \mathbb{R}. Then, for the transformed random vector Y = g(X) we have the following result.

Theorem 1.25. The density of Y = g(X) is given by

f_Y(y) = f_X\bigl(h(y)\bigr)\,\bigl|J_h(y)\bigr|,

where h(y) = g^{-1}(y) and |J_h(y)| denotes the absolute value of the Jacobian

J_h(y) = \det \frac{\partial}{\partial y} h(y) = \det
\begin{pmatrix}
  \frac{\partial}{\partial y_1} h_1(y) & \frac{\partial}{\partial y_1} h_2(y) & \dots  & \frac{\partial}{\partial y_1} h_n(y) \\
  \frac{\partial}{\partial y_2} h_1(y) & \frac{\partial}{\partial y_2} h_2(y) & \dots  & \frac{\partial}{\partial y_2} h_n(y) \\
  \vdots & \vdots & \ddots & \vdots \\
  \frac{\partial}{\partial y_n} h_1(y) & \frac{\partial}{\partial y_n} h_2(y) & \dots  & \frac{\partial}{\partial y_n} h_n(y)
\end{pmatrix}.

Another useful form of the Jacobian is

J_h(y) = \bigl[J_g\bigl(h(y)\bigr)\bigr]^{-1},

where

J_g(x) = \det \frac{\partial}{\partial x} g(x).
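As an illustration of Theorem 1.25, the sympy sketch below works out a standard example (assuming X1, X2 are independent Exp(λ) rvs; this particular transformation is chosen only for illustration): for Y1 = X1 + X2 and Y2 = X1/(X1 + X2), the inverse map is h(y) = (y1 y2, y1(1 − y2)) and the resulting density factorizes into a Gamma(2, λ) pdf in y1 and a Uniform(0, 1) pdf in y2.

```python
import sympy as sp

y1, y2, lam = sp.symbols('y1 y2 lam', positive=True)

# Inverse transformation h(y) = g^{-1}(y): x1 = y1*y2, x2 = y1*(1 - y2)
h = sp.Matrix([y1 * y2, y1 * (1 - y2)])

J_h = sp.simplify(h.jacobian([y1, y2]).det())   # -y1
abs_J = sp.Abs(J_h)                             # y1, since y1 > 0

# Joint pdf of (X1, X2) for independent Exp(lam), evaluated at h(y)
f_X = lam**2 * sp.exp(-lam * (h[0] + h[1]))

f_Y = sp.simplify(f_X * abs_J)
print(f_Y)                                      # lam**2 * y1 * exp(-lam*y1)
```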

Exercise 1.25. Let A be a non-singular n × n real matrix and let X be an n-dimensional random vector. Show that the linearly transformed random variable Y = AX has the joint pdf given by

f_Y(y) = \frac{1}{|\det A|}\, f_X\bigl(A^{-1} y\bigr).

1.12.4 Multivariate Normal Distribution

A random vector X has a multivariate normal distribution if its joint pdf can be written as

f_X(x_1, \dots, x_n) = \frac{1}{(2\pi)^{n/2}\sqrt{\det V}}\, \exp\left\{-\frac{1}{2}(x - \mu)^T V^{-1}(x - \mu)\right\},

where the mean is

\mu = (\mu_1, \dots, \mu_n)^T,

and the variance-covariance matrix V has the form (1.20).
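The density formula can be cross-checked against a library implementation. The sketch below (assuming NumPy and SciPy are available; the 2-dimensional µ, V and the evaluation point are arbitrary) evaluates both expressions at one point.

```python
import numpy as np
from scipy.stats import multivariate_normal

mu = np.array([1.0, -1.0])
V = np.array([[2.0, 0.6],
              [0.6, 1.0]])
x = np.array([0.5, 0.0])                  # an arbitrary evaluation point
n = len(mu)

d = x - mu
pdf_formula = np.exp(-0.5 * d @ np.linalg.solve(V, d)) \
              / ((2 * np.pi) ** (n / 2) * np.sqrt(np.linalg.det(V)))
pdf_scipy = multivariate_normal(mean=mu, cov=V).pdf(x)

print(pdf_formula, pdf_scipy)             # the two values agree
```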

Exercise 1.26. Use the result from Exercise 1.25 to show that if X ∼ N_n(µ, V) then Y = AX has an n-dimensional normal distribution with expectation Aµ and variance-covariance matrix AV A^T.

Lemma 1.3. If X ∼ N_n(µ, V), B is an m × n matrix, and a is a real m × 1 vector, then the random vector

Y = a + BX

is also multivariate normal with

E(Y) = a + B\, E(X) = a + B\mu,

and variance-covariance matrix

V_Y = B V B^T.



Note that taking B = b^T, where b is an n × 1 vector, and a = 0, we obtain

Y = b^T X = b_1 X_1 + \dots + b_n X_n,

and

Y \sim N(b^T \mu,\; b^T V b).
