Presence Tasks 2
Presence Tasks 2.1 - Rules for Mean Vectors and Covariance Matrices
Let X, Y and Z be n-dimensional random vectors, µ := E[X], Cov(X) = Σ, and let A, B ∈ R^{m×n}, a ∈ R^n, b ∈ R^m. Show that the following results hold:
b) E[AX + b] = A · E[X] + b,
d) Var(a^T X) = a^T Cov(X) a,
e) Cov(AX + b) = A Cov(X) A^T,
Solution
Let X_i, Y_i, Z_i, i ∈ {1, . . . , n}, denote the components of X, Y and Z.
b) Let A ∈ R^{m×n}, X ∈ R^n, b ∈ R^m. Further we write A = (a_{ij})_{i ∈ {1,...,m}, j ∈ {1,...,n}} for the components of A and A_i, i ∈ {1, . . . , m}, for the rows of A. Then we have that
\[
E[AX + b] = E\begin{pmatrix} A_1 X + b_1 \\ \vdots \\ A_m X + b_m \end{pmatrix}
= \begin{pmatrix} E[A_1 X + b_1] \\ \vdots \\ E[A_m X + b_m] \end{pmatrix}
= \begin{pmatrix} A_1 E[X] + b_1 \\ \vdots \\ A_m E[X] + b_m \end{pmatrix}
= A \cdot E[X] + b,
\]
where each component uses the linearity of the one-dimensional expectation.
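As a quick numerical sanity check (not part of the proof), the identity E[AX + b] = A · E[X] + b can be illustrated by Monte Carlo simulation. The following NumPy sketch uses arbitrarily chosen dimensions, an arbitrary matrix A and vector b, and a normal distribution for X; all of these concrete choices are illustrative assumptions.

```python
# Monte Carlo check of E[AX + b] = A E[X] + b for arbitrary A, b and random X.
import numpy as np

rng = np.random.default_rng(0)
m, n, N = 3, 4, 200_000                          # output dim, input dim, sample size

A = rng.normal(size=(m, n))
b = rng.normal(size=m)
mu = np.array([1.0, -2.0, 0.5, 3.0])             # E[X]
X = rng.normal(loc=mu, scale=2.0, size=(N, n))   # N i.i.d. copies of X

lhs = (X @ A.T + b).mean(axis=0)   # Monte Carlo estimate of E[AX + b]
rhs = A @ mu + b                   # A E[X] + b
print(np.round(lhs, 3))
print(np.round(rhs, 3))            # agree up to Monte Carlo error
```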
e) With A ∈ R^{m×n}, X ∈ R^n and b ∈ R^m we have that
\[
\begin{aligned}
Cov(AX + b) &= E\big[(AX + b - E[AX + b])(AX + b - E[AX + b])^T\big] \\
&= E\big[(AX - A \cdot E[X])(AX - A \cdot E[X])^T\big] \\
&= E\big[A (X - E[X])(X - E[X])^T A^T\big] \\
&= A \, E\big[(X - E[X])(X - E[X])^T\big] \, A^T = A \, Cov(X) \, A^T,
\end{aligned}
\]
where the second equality uses part b), and in the last step the constant matrices A and A^T are pulled out of the entrywise expectation.
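A similar hedged sanity check works for e): the sketch below simulates X with a known covariance Σ = L L^T and compares the empirical covariance of AX + b with A Σ A^T. All concrete numbers are illustrative choices.

```python
# Monte Carlo check of Cov(AX + b) = A Cov(X) A^T.
import numpy as np

rng = np.random.default_rng(1)
m, n, N = 2, 3, 500_000

A = rng.normal(size=(m, n))
b = rng.normal(size=m)
L = rng.normal(size=(n, n))
Sigma = L @ L.T                                   # a valid covariance matrix
X = rng.multivariate_normal(np.zeros(n), Sigma, size=N)

emp = np.cov(X @ A.T + b, rowvar=False)           # empirical Cov(AX + b)
theo = A @ Sigma @ A.T                            # A Cov(X) A^T
print(np.round(emp, 2))
print(np.round(theo, 2))                          # close up to Monte Carlo error
```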
i) We have that
\[
E[E[X|Y]] = \begin{pmatrix} E[E[X_1|Y]] \\ \vdots \\ E[E[X_n|Y]] \end{pmatrix}
= \begin{pmatrix} E[X_1] \\ \vdots \\ E[X_n] \end{pmatrix}
= E[X],
\]
by the tower property of the one-dimensional conditional expectation.
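To make the tower property concrete, here is a small simulation sketch in which Y is uniform on {0, 1, 2} and the conditional means E[X | Y = y] are specified explicitly; this toy setup is an assumption made purely for illustration.

```python
# Monte Carlo check of E[E[X|Y]] = E[X] with a discrete Y.
import numpy as np

rng = np.random.default_rng(2)
N = 300_000
means = np.array([[0.0, 1.0], [2.0, -1.0], [-3.0, 4.0]])  # row y is E[X | Y = y]

Y = rng.integers(0, 3, size=N)                 # Y uniform on {0, 1, 2}
X = means[Y] + rng.normal(size=(N, 2))         # X | Y = y  ~  N(m_y, I)

lhs = means[Y].mean(axis=0)   # Monte Carlo estimate of E[E[X|Y]] = E[m_Y]
rhs = X.mean(axis=0)          # Monte Carlo estimate of E[X]
print(np.round(lhs, 3), np.round(rhs, 3))      # agree up to Monte Carlo error
```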
j) Let X, Y ∈ R^n. Then from the one-dimensional case we know that
\[
Cov(X_i, X_j) = E[X_i X_j] - E[X_i]E[X_j] \quad \Leftrightarrow \quad E[X_i X_j] = Cov(X_i, X_j) + E[X_i]E[X_j],
\]
for all i, j ∈ {1, . . . , n}. We note that
\[
Cov(X) = \big(Cov(X_i, X_j)\big)_{i,j \in \{1,...,n\}}.
\]
Further, using the tower property from part i),
\[
\begin{aligned}
Cov(X_i, X_j) &= E[X_i X_j] - E[X_i]E[X_j] \\
&= E\big[E[X_i X_j \mid Y]\big] - E\big[E[X_i \mid Y]\big] E\big[E[X_j \mid Y]\big] \\
&\overset{\text{above}}{=} E\big[Cov(X_i, X_j \mid Y) + E[X_i \mid Y] E[X_j \mid Y]\big] - E\big[E[X_i \mid Y]\big] E\big[E[X_j \mid Y]\big] \\
&= E\big[Cov(X_i, X_j \mid Y)\big] + E\big[E[X_i \mid Y] E[X_j \mid Y]\big] - E\big[E[X_i \mid Y]\big] E\big[E[X_j \mid Y]\big] \\
&= E\big[Cov(X_i, X_j \mid Y)\big] + Cov\big(E[X_i \mid Y], E[X_j \mid Y]\big),
\end{aligned}
\]
where the last equality again uses the one-dimensional identity above, now with U = E[X_i | Y] and V = E[X_j | Y]. So we have proven the property in every entry of the matrix Cov(X).
k) Setting i = j in the solution of task j) we obtain
\[
Cov(X_i, X_i) = Var(X_i) = E[Var(X_i \mid Y)] + Var(E[X_i \mid Y]).
\]
So the claimed identity holds in every coordinate of the vector Var(X).
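The law of total covariance from j), and on the diagonal the law of total variance from k), can be checked numerically in the same kind of toy setup: the conditional covariance Cov(X | Y = y) = S is taken to be the same matrix for every y, so E[Cov(X | Y)] = S, and Cov(E[X | Y]) is the population covariance of the three conditional means. Again, all concrete values are illustrative assumptions.

```python
# Monte Carlo check of Cov(X) = E[Cov(X|Y)] + Cov(E[X|Y]) with a discrete Y.
import numpy as np

rng = np.random.default_rng(3)
N = 500_000
means = np.array([[0.0, 1.0], [2.0, -1.0], [-3.0, 4.0]])   # E[X | Y = y]
S = np.array([[2.0, 0.5], [0.5, 1.0]])                     # Cov(X | Y = y), same for all y

Y = rng.integers(0, 3, size=N)                              # Y uniform on {0, 1, 2}
X = means[Y] + rng.multivariate_normal(np.zeros(2), S, size=N)

C = means - means.mean(axis=0)
rhs = S + C.T @ C / 3           # E[Cov(X|Y)] + Cov(E[X|Y]) (population covariance of the means)
lhs = np.cov(X, rowvar=False)   # empirical Cov(X)
print(np.round(lhs, 2))
print(np.round(rhs, 2))         # the diagonals also illustrate the law of total variance
```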
Solution
We will use the following properties of matrix calculations: for matrices A, B of fitting dimensions and an invertible matrix C ∈ R^{k×k},
• (AB)^T = B^T A^T,
• (A + B)^T = A^T + B^T,
• (A^T)^T = A,
• (C^{-1})^T = (C^T)^{-1}.
Moreover, let A ∈ R^{n×n} be idempotent, i.e. A^2 = A. If x ∈ im(A), say x = Az, then
\[
Ax = A^2 z = Az = x \;\Rightarrow\; (I - A)x = 0 \;\Rightarrow\; x \in \ker(I - A).
\]
If x ∈ ker(I − A) then
\[
0 = (I - A)x \;\Rightarrow\; Ix = Ax \;\Rightarrow\; x \in im(A).
\]
Overall im(A) = ker(I − A) and, applying this to the idempotent matrix I − A, also ker(A) = im(I − A). From this we conclude:
a) From the lecture we know that H = X(X^T X)^{-1} X^T. Then the transpose of H is given as
\[
\begin{aligned}
H^T &= \big(X(X^T X)^{-1} X^T\big)^T = (X^T)^T \big(X(X^T X)^{-1}\big)^T \\
&= X \big((X^T X)^{-1}\big)^T X^T = X \big((X^T X)^T\big)^{-1} X^T \\
&= X (X^T X)^{-1} X^T = H.
\end{aligned}
\]
So H is a symmetric matrix. Further
\[
H \cdot H = X (X^T X)^{-1} \underbrace{X^T X (X^T X)^{-1}}_{= I} X^T = X (X^T X)^{-1} X^T = H,
\]
such that H is also idempotent.
From the assumptions of the classical linear model we know that rk(X) = k + 1. Therefore we have that
\[
rk(H) = rk\Big(\underbrace{X}_{\in R^{n \times (k+1)}} \; \underbrace{(X^T X)^{-1}}_{\in R^{(k+1) \times (k+1)}} \; \underbrace{X^T}_{\in R^{(k+1) \times n}}\Big) = rk\big((X^T X)^{-1}\big) = k + 1,
\]
since X has full column rank and X^T full row rank, so multiplying by them does not change the rank, and (X^T X)^{-1} is invertible.
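A short NumPy sketch can confirm the three properties of the hat matrix (symmetry, idempotence, rank k + 1) for a randomly generated design matrix with an intercept column; the sizes n and k are arbitrary illustrative choices.

```python
# Numerical check of H = H^T, H H = H and rk(H) = k + 1 for a random design matrix.
import numpy as np

rng = np.random.default_rng(4)
n, k = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])   # design with intercept, rk(X) = k + 1

H = X @ np.linalg.inv(X.T @ X) @ X.T

print(np.allclose(H, H.T))               # H is symmetric
print(np.allclose(H @ H, H))             # H is idempotent
print(np.linalg.matrix_rank(H), k + 1)   # rk(H) = k + 1
```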
Solution
a) From the lecture we know that β̂ = (X^T X)^{-1} X^T y. Therefore we can express the residuals ε̂ as
\[
ε̂ = y - X β̂ = y - X (X^T X)^{-1} X^T y = (I - H) y.
\]
b) We have shown above that I − H is symmetric and idempotent. With this we have that
\[
Cov(ε̂) = Cov\big((I - H) y\big) = (I - H) \, Cov(y) \, (I - H)^T = σ^2 (I - H)(I - H) = σ^2 (I - H),
\]
since Cov(y) = σ^2 I by the model assumptions.
c) The variances of the residuals, Var(ε̂_i) for i ∈ {1, . . . , n}, are the diagonal entries of Cov(ε̂). Therefore with task b) we have that
\[
Var(ε̂_i) = (1 - h_{ii}) σ^2.
\]
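As a sanity check for b) and c), the following sketch simulates many response vectors y for one fixed design X and compares the empirical variance of each residual with (1 − h_ii)σ^2. The design, β, σ and the sample sizes are illustrative assumptions.

```python
# Monte Carlo check of Var(eps_hat_i) = (1 - h_ii) sigma^2.
import numpy as np

rng = np.random.default_rng(5)
n, k, sigma, N = 30, 2, 1.5, 100_000
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
beta = rng.normal(size=k + 1)
H = X @ np.linalg.inv(X.T @ X) @ X.T

eps = sigma * rng.normal(size=(N, n))    # N independent error vectors
y = X @ beta + eps                       # N simulated datasets, one per row
resid = y - y @ H.T                      # residuals eps_hat = (I - H) y for every dataset

emp_var = resid.var(axis=0)              # empirical Var(eps_hat_i)
theo_var = (1 - np.diag(H)) * sigma**2   # (1 - h_ii) sigma^2
print(np.round(emp_var[:5], 3))
print(np.round(theo_var[:5], 3))         # close up to Monte Carlo error
```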
For the following parts we define
\[
Q := I - H = I - X (X^T X)^{-1} X^T.
\]
Note that
\[
QX = X - X (X^T X)^{-1} X^T X = X - X = 0.
\]
From this we have that
\[
\begin{aligned}
ε̂^T ε̂ &= (Qy)^T Qy = y^T Q^T Q y \\
&= y^T Q Q y = (Xβ + ε)^T Q Q (Xβ + ε) \\
&= \big(β^T \underbrace{(QX)^T}_{=0} + ε^T Q\big) \big(\underbrace{QX}_{=0} β + Q ε\big) \\
&= ε^T Q Q ε = ε^T Q ε.
\end{aligned}
\]
By the model assumptions we have that ε ∼ N(0, σ^2 I), so ε/σ ∼ N(0, I). The result follows from the theorem:
Let Z ∼ N(0, I) and let R be a symmetric, idempotent p × p matrix with rk(R) = r. Then Z^T R Z ∼ χ²_r.
Applied with Z = ε/σ and R = Q, which is symmetric and idempotent with rk(Q) = n − rk(H) = n − k − 1 (by the preliminary remark, ker(Q) = im(H)), this gives ε̂^T ε̂ / σ^2 = (ε/σ)^T Q (ε/σ) ∼ χ²_{n−k−1}.
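One way to check the χ² statement numerically is to simulate ε many times, compute ε^T Q ε / σ^2 (which equals ε̂^T ε̂ / σ^2 by the computation above) and compare its empirical mean and variance with those of a χ²_{n−k−1} distribution, namely n − k − 1 and 2(n − k − 1). The concrete sizes below are illustrative assumptions.

```python
# Monte Carlo check that eps_hat^T eps_hat / sigma^2 behaves like chi^2 with n - k - 1 df.
import numpy as np

rng = np.random.default_rng(6)
n, k, sigma, N = 30, 2, 2.0, 200_000
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
Q = np.eye(n) - X @ np.linalg.inv(X.T @ X) @ X.T            # Q = I - H

eps = sigma * rng.normal(size=(N, n))
stat = np.einsum('ij,jk,ik->i', eps, Q, eps) / sigma**2     # eps^T Q eps / sigma^2 per replication

df = n - k - 1
print(stat.mean(), df)        # a chi^2_df variable has mean df
print(stat.var(), 2 * df)     # ... and variance 2 * df
```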
f) We show that (1/σ)(β̂ − β) and (1/σ^2) ε̂^T ε̂ are independent. For this we apply the following result:
Let Z ∼ N(0, I_p), B an (m × p)-matrix (m ≤ p) and R a symmetric, idempotent (p × p) matrix with rk(R) = r. Then
\[
BR = 0 \;\Rightarrow\; Z^T R Z \text{ is independent of } BZ.
\]
We apply this with Z = ε/σ, R = Q and B = (X^T X)^{-1} X^T, so that
\[
\Big(\frac{ε}{σ}\Big)^T Q \, \frac{ε}{σ} = \frac{ε̂^T ε̂}{σ^2}
\qquad \text{and} \qquad
B \, \frac{ε}{σ} = (X^T X)^{-1} X^T \frac{ε}{σ}.
\]
Note that BQ = (X^T X)^{-1} X^T Q = (X^T X)^{-1} (QX)^T = 0, since Q is symmetric and QX = 0 by the computation above.
The claim follows by
\[
\begin{aligned}
\frac{1}{σ}(β̂ - β) &= \frac{1}{σ}\big((X^T X)^{-1} X^T y - β\big) \\
&= \frac{1}{σ}\big((X^T X)^{-1} X^T (Xβ + ε) - β\big) \\
&= \frac{1}{σ}\big(β + (X^T X)^{-1} X^T ε - β\big) \\
&= \frac{1}{σ} (X^T X)^{-1} X^T ε \\
&= (X^T X)^{-1} X^T \frac{ε}{σ} = B \, \frac{ε}{σ}.
\end{aligned}
\]
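Independence itself is hard to verify directly by simulation, but one of its consequences, zero correlation between each component of β̂ and ε̂^T ε̂, can be checked numerically. The sketch below uses an arbitrary design and parameter vector; it is only a plausibility check under these assumptions, not a proof.

```python
# Sanity check: each component of beta_hat is (empirically) uncorrelated with eps_hat^T eps_hat.
import numpy as np

rng = np.random.default_rng(7)
n, k, sigma, N = 40, 2, 1.0, 100_000
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
beta = np.array([1.0, -0.5, 2.0])
XtX_inv_Xt = np.linalg.inv(X.T @ X) @ X.T      # B = (X^T X)^{-1} X^T
Q = np.eye(n) - X @ XtX_inv_Xt                 # Q = I - H

eps = sigma * rng.normal(size=(N, n))
y = X @ beta + eps

beta_hat = y @ XtX_inv_Xt.T                    # beta_hat for every simulated dataset
rss = np.einsum('ij,jk,ik->i', y, Q, y)        # eps_hat^T eps_hat = y^T Q y

for j in range(k + 1):                          # sample correlations, all approximately 0
    print(np.corrcoef(beta_hat[:, j], rss)[0, 1])
```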