Mathbootcamp Ampba Aug 2021
Lecture 1
Aug 2021
Abhishek Rishabh
Today’s agenda
• About this course.
• This will help you to understand the math behind most of the econometrics, machine and
deep learning models.
• Practice.
• Keep in mind that this is a starting point. You might not be able to appreciate all the
ideas now. It’s okay, it takes time.
Student Interaction
• 20 mins session.
• A vector $e_i$ whose i-th element is 1 and all others are 0 is the i-th unit vector.
• Vector dimension.
• Vector Operations.
• Types of vectors.
• Inner product.
Matrices
• X + Y = Y + X (commutativity) and (X + Y) + Z = X + (Y + Z)
(associativity).
• Is UV=VU ?
Matrix Multiplication
• Is the following true?
• $(A+B)^2 = A^2 + B^2 + 2AB$
• Is AB symmetric?
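The questions above can be explored numerically. The following is a minimal sketch in plain Python; the matrices U and V are hypothetical examples chosen for illustration (they are not from the slides), and `matmul` and `matadd` are helper names introduced here.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def matadd(X, Y):
    """Add two matrices of the same shape element by element."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

# Hypothetical example matrices (an assumption, not taken from the slides).
U = [[1, 2], [3, 4]]
V = [[0, 1], [1, 0]]

UV = matmul(U, V)
VU = matmul(V, U)
commutes = (UV == VU)  # False for this pair: UV != VU

W = matadd(U, V)
lhs = matmul(W, W)  # (U + V)^2
# Naive expansion U^2 + V^2 + 2UV, valid only when UV = VU
naive = matadd(matadd(matmul(U, U), matmul(V, V)), matadd(UV, UV))
# Correct expansion U^2 + UV + VU + V^2
full = matadd(matadd(matmul(U, U), UV), matadd(VU, matmul(V, V)))
```

For this pair, `lhs` equals `full` but not `naive`, which suggests that (A+B)^2 = A^2 + B^2 + 2AB holds only when AB = BA.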
Orthogonal
• A p x p square matrix A is orthogonal if $A'A = AA' = I_p$
• Example: $A = \begin{pmatrix} 1 & x \\ 0 & -1 \end{pmatrix}$, which of the above is A?
R exercises
Linear Independence
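• The vectors are independent if $a_1 \begin{pmatrix} 2 \\ 2 \\ 1 \end{pmatrix} + a_2 \begin{pmatrix} -4 \\ 6 \\ 5 \end{pmatrix} + a_3 \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$
has only the trivial solution a1 = a2 = a3 = 0
2a1 - 4a2 + a3 = 0 (1)
2a1 + 6a2 = 0 (2) => a1 = -3a2
a1 + 5a2 = 0 (3) => a1 = -5a2
From (2) and (3), -3a2 = -5a2, so a2 = 0 and a1 = 0; then (1) gives a3 = 0.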
Rank of a matrix
• For a (m x n) matrix X,
• (x1, x2, x3 … xn) – column vectors of length ‘m’
• Similarly, it has ‘m’ row vectors with length ‘n’
• The rank of a matrix is the maximum number of linearly independent rows, which always
equals the maximum number of linearly independent columns; hence ρ(X) ≤ min(m, n)
Example 1: Rank of a matrix (1/2)
$X = \begin{pmatrix} 1 & 3 & 5 \\ 2 & 4 & 6 \end{pmatrix}$
⇒ a1 + 2a2 = 0 (eq. 1)
⇒ ρ(X) = 2
Example 2: Rank of a matrix
• Find the Rank of the following matrix:
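$\begin{pmatrix} 4 & 6 \\ 6 & 9 \end{pmatrix}$
Example 2: Rank of a matrix (1/2)
$X = \begin{pmatrix} 4 & 6 \\ 6 & 9 \end{pmatrix}$
⇒ ρ(X) = 1, since row 2 = (3/2) × row 1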
Example 3: Rank of a matrix
• Find the Rank of the following matrix:
$\begin{pmatrix} 1 & 2 & 3 & 2 \\ 4 & 5 & 6 & -1 \\ 5 & 7 & 9 & 1 \end{pmatrix}$
Example 3: Rank of a matrix (1/2)
$X = \begin{pmatrix} 1 & 2 & 3 & 2 \\ 4 & 5 & 6 & -1 \\ 5 & 7 & 9 & 1 \end{pmatrix}$
a1 + 4a2 + 5a3 = 0
2a1 + 5a2 + 7a3 = 0
3a1 + 6a2 + 9a3 = 0
2a1 - a2 + a3 = 0
a1 = 1, a2 = 1, a3 = -1
$\begin{pmatrix} 1 & 2 & 3 & 2 \\ 2 & 4 & 6 & 4 \\ 3 & 6 & 9 & 0 \end{pmatrix}$
⇒ But, (1 2 3 2) + (4 5 6 −1) = (5 7 9 1)
⇒ Hence, ρ(X) = 2
Example 4: Rank of a matrix (1/2)
$X = \begin{pmatrix} 1 & 5 & 6 \\ 2 & 6 & 8 \\ 7 & 1 & 8 \end{pmatrix}$
Rank of this matrix (3 x 3), ρ(X) ≤ 3, and
⇒ a1 (1 5 6) + a2 (2 6 8) + a3 (7 1 8) = 0
⇒ This has a nontrivial solution, e.g. a1 = 20, a2 = -17, a3 = 2 (equivalently, column 1 + column 2 = column 3)
⇒ Hence, ρ(X) = 2
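The four rank examples above can be checked with a short routine. This is a minimal sketch, assuming Gaussian elimination with exact fractions as the method (the slides reason by hand with linear combinations); the function name `rank` is introduced here for illustration.

```python
from fractions import Fraction

def rank(matrix):
    """Rank via Gaussian elimination with exact rational arithmetic."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(n_cols):
        if r == n_rows:
            break
        # Find a row at or below position r with a nonzero entry in this column
        pivot = next((i for i in range(r, n_rows) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Eliminate the entries below the pivot
        for i in range(r + 1, n_rows):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

example1 = [[1, 3, 5], [2, 4, 6]]
example2 = [[4, 6], [6, 9]]
example3 = [[1, 2, 3, 2], [4, 5, 6, -1], [5, 7, 9, 1]]
example4 = [[1, 5, 6], [2, 6, 8], [7, 1, 8]]
ranks = [rank(m) for m in (example1, example2, example3, example4)]
```

Exact fractions are used so that a row which should reduce to zero does so exactly, without floating-point tolerance checks.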
Full Rank
• A (m x n) Matrix X is said to be of full rank if
• ρ(X) = min(m, n)
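$X = \begin{pmatrix} 2 & 6 & 10 \\ 4 & 8 & 12 \end{pmatrix}$ is a 2x3 matrix and ρ(X) = 2
• $X = \begin{pmatrix} 2 & 6 & 10 \\ 0 & 0 & 0 \end{pmatrix}$
• $X = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$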
Rank of a Diagonal Matrix
• Rank of a diagonal Matrix D
• Is equal to Number of non-zero diagonal elements in D. How?
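$X = \begin{pmatrix} 3 & 0 & 0 & 0 \\ 0 & 5 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 8 \end{pmatrix}$; $Y = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 7 & 0 \\ 0 & 0 & 9 \end{pmatrix}$
$I = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$ (special case of a diagonal matrix: the identity matrix)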
Rank Inequalities: Sum
• ρ(X + Y) ≤ ρ(X) + ρ(Y)
• ρ(X +(-Y))
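• $N = \begin{pmatrix} 2 & 6 \\ 1 & 3 \end{pmatrix}$; $M = \begin{pmatrix} 2 & 3 \\ 1 & 5 \end{pmatrix}$
• ρ(N) = 1, ρ(M) = 2
• $M + N = \begin{pmatrix} 4 & 9 \\ 2 & 8 \end{pmatrix}$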
Rank Inequalities: Rank Products
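• $N = \begin{pmatrix} 2 & 5 \\ 3 & 4 \end{pmatrix}$; $M = \begin{pmatrix} 2 & 2 \\ 3 & 3 \end{pmatrix}$
• ρ(N) = 2, ρ(M) = 1
• $N * M = \begin{pmatrix} 19 & 19 \\ 18 & 18 \end{pmatrix}$
Rank Inequalities : Example(2/2)
• Rank of $N * M = \begin{pmatrix} 19 & 19 \\ 18 & 18 \end{pmatrix}$ is 1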
• ρ(NM) ≤ min(ρ(N), ρ(M))
• 1 ≤ min(2, 1) = 1
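Both rank inequalities can be checked on the 2 x 2 examples above. This is a minimal sketch; `rank2` is a helper introduced here that works only for 2 x 2 matrices (rank 2 if the determinant is nonzero, rank 1 if any entry is nonzero, otherwise 0).

```python
def det2(A):
    """Determinant of a 2 x 2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def rank2(A):
    """Rank of a 2 x 2 matrix (helper for this sketch only)."""
    if det2(A) != 0:
        return 2
    if any(v != 0 for row in A for v in row):
        return 1
    return 0

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

# Sum inequality: rho(X + Y) <= rho(X) + rho(Y)
N_sum = [[2, 6], [1, 3]]
M_sum = [[2, 3], [1, 5]]
total = [[a + b for a, b in zip(rn, rm)] for rn, rm in zip(N_sum, M_sum)]
sum_ok = rank2(total) <= rank2(N_sum) + rank2(M_sum)

# Product inequality: rho(NM) <= min(rho(N), rho(M))
N = [[2, 5], [3, 4]]
M = [[2, 2], [3, 3]]
NM = matmul(N, M)
prod_ok = rank2(NM) <= min(rank2(N), rank2(M))
```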
Rank Inequalities
Product with orthogonal matrix:
• If C is an orthogonal matrix then ρ(AC) = ρ(A) because
• $\rho(A) = \rho(ACC^T) \le \rho(AC) \le \rho(A)$
Submatrices:
• If $A_{ij}$ is a submatrix of A, then $\rho(A_{ij}) \le \rho(A)$
Rank in Statistics
• Rank is a crucial tool in statistical analysis
• In simple linear models, y = Xβ + ε
• In multivariate analysis, as a few statistical techniques depend on X
having full rank
• More details on these will be covered as we move along in the
course
Summary on Rank
• Idea of linear independence
• What is the rank of a matrix?
• How to calculate it?
• Some rank inequalities
Determinant of a Matrix
• For every (n x n) matrix $A = (a_{ij})$, det(A) can be calculated using $a_{ij}$
$A = \begin{pmatrix} 2 & 4 \\ 1 & -3 \end{pmatrix}$
$B = \begin{pmatrix} 6 & 4 \\ 9 & 6 \end{pmatrix}$
Determinant of a matrix – Example
$A = \begin{pmatrix} 2 & 4 \\ 1 & -3 \end{pmatrix}$
• det(A) = 2*(-3) – 1*4 = -10
$B = \begin{pmatrix} 6 & 4 \\ 9 & 6 \end{pmatrix}$
• det(B) = 6*6 – 4*9 = 0
Determinant of a matrix – How?
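$A = \begin{pmatrix} 2 & 2 & 4 \\ 1 & 3 & -3 \\ 4 & 6 & 7 \end{pmatrix}$
Determinant of a matrix – Example
$\det(A) = 2\begin{vmatrix} 3 & -3 \\ 6 & 7 \end{vmatrix} - 2\begin{vmatrix} 1 & -3 \\ 4 & 7 \end{vmatrix} + 4\begin{vmatrix} 1 & 3 \\ 4 & 6 \end{vmatrix}$
= 2*(3*7 – (-3)*6) – 2*(1*7 – (-3)*4) + 4*(1*6 – 3*4)
= 2*(39) – 2*(19) + 4*(-6)
= 78 – 38 – 24
= 16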
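The expansion along the first row used above can be written as a short recursive routine. This is a minimal sketch in plain Python; the names `minor` and `det` are introduced here for illustration, and the approach is only practical for small matrices.

```python
def minor(A, i, j):
    """Matrix A with row i and column j removed."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by cofactor expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j))
               for j in range(len(A)))

A3 = [[2, 2, 4], [1, 3, -3], [4, 6, 7]]
A2 = [[2, 4], [1, -3]]
B2 = [[6, 4], [9, 6]]
results = (det(A3), det(A2), det(B2))
```

The three values reproduce the hand calculations in the examples: 16, -10 and 0.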
Co-factors in matrix
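• The cofactor of an element $a_{ij}$ is the number you get by eliminating the row and
column of that element, finding the determinant of the rest (the minor $M_{ij}$), and multiplying by the sign $(-1)^{i+j}$.
$A = \begin{pmatrix} 2 & 5 & -1 \\ 0 & 3 & 4 \\ 1 & -2 & -5 \end{pmatrix}$
Cofactor of a11 = c11 = $(-1)^{1+1} \begin{vmatrix} 3 & 4 \\ -2 & -5 \end{vmatrix}$
Cofactor of a32 = c32 = $(-1)^{3+2} \det(M_{32}) = (-1)^5 \begin{vmatrix} 2 & -1 \\ 0 & 4 \end{vmatrix}$
= -1*(8 – 0)
= -8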
Adjoint or Adjugate matrix
• The adjoint (adjugate) of a square matrix A is the transpose of
the cofactor matrix of A
• It is denoted as Adj A
Adjoint matrix - Example
Find the adjoint of the matrix A.
$A = \begin{pmatrix} 3 & 1 & -1 \\ 2 & -2 & 0 \\ 1 & 2 & -1 \end{pmatrix}$
Calculating co-factors of A
Cofactor of a11 = $(-1)^2 \begin{vmatrix} -2 & 0 \\ 2 & -1 \end{vmatrix} = 2$
Cofactor of a12 = $(-1)^3 \begin{vmatrix} 2 & 0 \\ 1 & -1 \end{vmatrix} = 2$
Cofactor of a13 = $(-1)^4 \begin{vmatrix} 2 & -2 \\ 1 & 2 \end{vmatrix} = 6$
Cofactor of a21 = $(-1)^3 \begin{vmatrix} 1 & -1 \\ 2 & -1 \end{vmatrix} = -1$
Cofactor of a22 = $(-1)^4 \begin{vmatrix} 3 & -1 \\ 1 & -1 \end{vmatrix} = -2$
Cofactor of a23 = $(-1)^5 \begin{vmatrix} 3 & 1 \\ 1 & 2 \end{vmatrix} = -5$
Cofactor of a31 = $(-1)^4 \begin{vmatrix} 1 & -1 \\ -2 & 0 \end{vmatrix} = -2$
Cofactor of a32 = $(-1)^5 \begin{vmatrix} 3 & -1 \\ 2 & 0 \end{vmatrix} = -2$
Cofactor of a33 = $(-1)^6 \begin{vmatrix} 3 & 1 \\ 2 & -2 \end{vmatrix} = -8$
Cofactor matrix of A, $C_{ij} = \begin{pmatrix} 2 & 2 & 6 \\ -1 & -2 & -5 \\ -2 & -2 & -8 \end{pmatrix}$
Adjoint of the matrix A, $\mathrm{Adj}\,A = (C_{ij})^T = \begin{pmatrix} 2 & -1 & -2 \\ 2 & -2 & -2 \\ 6 & -5 & -8 \end{pmatrix}$
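The cofactor and adjoint calculation above can be reproduced in a few lines. This is a minimal sketch in plain Python; the function names (`minor`, `det`, `cofactor_matrix`, `adjugate`) are introduced here for illustration.

```python
def minor(A, i, j):
    """Matrix A with row i and column j removed."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by cofactor expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j))
               for j in range(len(A)))

def cofactor_matrix(A):
    """Matrix whose (i, j) entry is (-1)^(i+j) * det(minor of a_ij)."""
    n = len(A)
    return [[(-1) ** (i + j) * det(minor(A, i, j)) for j in range(n)]
            for i in range(n)]

def adjugate(A):
    """Transpose of the cofactor matrix."""
    return [list(col) for col in zip(*cofactor_matrix(A))]

A = [[3, 1, -1], [2, -2, 0], [1, 2, -1]]
C = cofactor_matrix(A)
adj_A = adjugate(A)
```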
Determinant of a (n x n) matrix
• Co-factors can be used to find the determinants of a (n x n)
matrix
$A = \begin{pmatrix} 2 & 4 \\ 1 & -3 \end{pmatrix}$, det(A) = -6 – (4) = -10
Properties of Determinants: Row & Column Operations
$A = \begin{pmatrix} 2 & 4 \\ 1 & -3 \end{pmatrix}$, det(A) = -6 – (4) = -10
C1 is multiplied by scalar ‘2’ to form matrix B
$B = \begin{pmatrix} 4 & 4 \\ 2 & -3 \end{pmatrix}$, det(B) = -12 – 8 = -20 = 2*(-10)
Properties of Determinants
• $\det(A) = \det(A^T)$
$A = \begin{pmatrix} 5 & 2 \\ 3 & -4 \end{pmatrix}$, det(A) = -20 – (6) = -26
$A^T = \begin{pmatrix} 5 & 3 \\ 2 & -4 \end{pmatrix}$, $\det(A^T)$ = -20 – (6) = -26
Properties of Determinants
• If we multiply an n x n matrix A by a scalar ‘λ’, the determinant of
the new matrix is $\lambda^n \det(A)$
$A = \begin{pmatrix} 3 & 5 \\ 2 & 4 \end{pmatrix}$, det(A) = 12 – 10 = 2
$3A = \begin{pmatrix} 9 & 15 \\ 6 & 12 \end{pmatrix}$, det(3A) = 108 – 90 = 18 = $3^2 \cdot 2 = 3^2 \det(A)$
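The three properties above (scaling one column, transposing, scaling the whole matrix) can be verified numerically. This is a minimal sketch in plain Python; the helper names are introduced here, and the final 3 x 3 check reuses the matrix from the earlier determinant example.

```python
def det(A):
    """Determinant by cofactor expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    total = 0
    for j in range(len(A)):
        sub = [row[:j] + row[j + 1:] for row in A[1:]]  # drop row 0, column j
        total += (-1) ** j * A[0][j] * det(sub)
    return total

def transpose(A):
    return [list(col) for col in zip(*A)]

def scale(A, lam):
    """Multiply every entry of A by the scalar lam."""
    return [[lam * v for v in row] for row in A]

# Multiplying one column by 2 multiplies the determinant by 2
A = [[2, 4], [1, -3]]
B = [[4, 4], [2, -3]]
column_rule = det(B) == 2 * det(A)

# det(A) = det(A^T)
P = [[5, 2], [3, -4]]
transpose_rule = det(transpose(P)) == det(P)

# det(lam * A) = lam^n * det(A), here with n = 2 and n = 3
Q = [[3, 5], [2, 4]]
R = [[2, 2, 4], [1, 3, -3], [4, 6, 7]]
scalar_rule_2 = det(scale(Q, 3)) == 3 ** 2 * det(Q)
scalar_rule_3 = det(scale(R, 2)) == 2 ** 3 * det(R)
```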
Properties of Determinants
• If a complete row or column has zero as elements in matrix A,
det(A) = 0
$A = \begin{pmatrix} 0 & 5 \\ 0 & 4 \end{pmatrix}$, det(A) = 0 – 0 = 0
Properties of Determinants
• If A has two identical rows or columns, then det(A) = 0
$A = \begin{pmatrix} 3 & 3 \\ 2 & 2 \end{pmatrix}$, det(A) = 6 – 6 = 0
Properties of Determinants
• If A is n x n matrix and ρ(A) < n, then det(A) = 0
• Why?
$A = \begin{pmatrix} 2 & 4 \\ 3 & 6 \end{pmatrix}$, det(A) = 12 – 12 = 0
⇒ ρ(A) = 1 < 2, since row 2 = (3/2) × row 1
Properties of Determinants
• Determinant of the diagonal matrix D with diagonal elements d1, d2, d3…dn
• det(D) = d1*d2*d3…*dn
$D = \begin{pmatrix} d_1 & 0 & 0 \\ 0 & d_2 & 0 \\ 0 & 0 & d_3 \end{pmatrix}$
det(D) = d1*d2*d3
$A = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 7 \end{pmatrix}$
$\det(A) = 2\begin{vmatrix} 3 & 0 \\ 0 & 7 \end{vmatrix} - 0\begin{vmatrix} 0 & 0 \\ 0 & 7 \end{vmatrix} + 0\begin{vmatrix} 0 & 3 \\ 0 & 0 \end{vmatrix}$
= 2*3*7 = 42
Properties of Determinants
• Determinant of the triangular matrix T (upper or lower) with t1, t2,
t3…tn as the diagonal elements
• det(T) = t1*t2*t3…*tn
• $T = \begin{pmatrix} t_1 & a & b \\ 0 & t_2 & c \\ 0 & 0 & t_3 \end{pmatrix}$
det(T) = t1*t2*t3
$T = \begin{pmatrix} 2 & 3 & 4 \\ 0 & 3 & 5 \\ 0 & 0 & 7 \end{pmatrix}$
$\det(T) = 2\begin{vmatrix} 3 & 5 \\ 0 & 7 \end{vmatrix} - 3\begin{vmatrix} 0 & 5 \\ 0 & 7 \end{vmatrix} + 4\begin{vmatrix} 0 & 3 \\ 0 & 0 \end{vmatrix}$
= 2*(3*7 – 5*0) – 3*(0*7 – 0*5) + 4*(0*0 – 3*0)
= 2*3*7 = 42
Orthogonal Matrix
• An n x n matrix A is said to be orthogonal if
• $AA^T = I_n$
• det(A) = +1 or -1
Block Matrix or Partitioned Matrix
• A block matrix is a matrix that is interpreted as having been
broken into sections – Blocks or Submatrices
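$P = \begin{pmatrix} 2 & 2 & 3 & 4 \\ 1 & 3 & 0 & 5 \\ 8 & 1 & 3 & 4 \\ 4 & 6 & 5 & 7 \end{pmatrix}$
• It can be broken into four (2 x 2) matrices
Block Matrix - Example
$P_{11} = \begin{pmatrix} 2 & 2 \\ 1 & 3 \end{pmatrix}$, $P_{12} = \begin{pmatrix} 3 & 4 \\ 0 & 5 \end{pmatrix}$
$P_{21} = \begin{pmatrix} 8 & 1 \\ 4 & 6 \end{pmatrix}$, $P_{22} = \begin{pmatrix} 3 & 4 \\ 5 & 7 \end{pmatrix}$
$P = \begin{pmatrix} P_{11} & P_{12} \\ P_{21} & P_{22} \end{pmatrix}$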
Inverse of a matrix
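• The inverse of a square matrix A, denoted $A^{-1}$, is the matrix such that
• $AA^{-1} = I$
• For a (2 x 2) matrix with det(A) ≠ 0, $A^{-1} = \frac{1}{\det(A)} \begin{pmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{pmatrix}$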
Inverse of a (2 x 2) matrix – Example
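• Find the inverse of the matrix $A = \begin{pmatrix} 2 & 3 \\ 7 & 5 \end{pmatrix}$
• det(A) = a11*a22 – a12*a21 = 2*5 – 3*7 = -11
• Inverse of A, $A^{-1} = \frac{1}{\det(A)} \begin{pmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{pmatrix}$
$= \frac{1}{-11} \begin{pmatrix} 5 & -3 \\ -7 & 2 \end{pmatrix} = \begin{pmatrix} -5/11 & 3/11 \\ 7/11 & -2/11 \end{pmatrix}$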
Inverse of a (3 x 3) matrix – How?
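• In a (3 x 3) matrix, for $A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}$
• Find the inverse of $A = \begin{pmatrix} 3 & 1 & -1 \\ 2 & -2 & 0 \\ 1 & 2 & -1 \end{pmatrix}$
• det(A) = 2
• c11 = 2, c12 = -1*(-2) = 2, c13 = 6
• $\mathrm{Adj}\,A = (C_{ij})^T = \begin{pmatrix} 2 & -1 & -2 \\ 2 & -2 & -2 \\ 6 & -5 & -8 \end{pmatrix}$
• Inverse of A, $A^{-1} = \frac{1}{\det(A)} \mathrm{adj}(A)$
$A^{-1} = \frac{1}{2} \begin{pmatrix} 2 & -1 & -2 \\ 2 & -2 & -2 \\ 6 & -5 & -8 \end{pmatrix} = \begin{pmatrix} 1 & -1/2 & -1 \\ 1 & -1 & -1 \\ 3 & -5/2 & -4 \end{pmatrix}$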
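The inverse via the adjoint, as worked out above, can be sketched in plain Python with exact fractions. The function names are introduced here for illustration; this approach suits small matrices only.

```python
from fractions import Fraction

def minor(A, i, j):
    """Matrix A with row i and column j removed."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by cofactor expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j))
               for j in range(len(A)))

def inverse(A):
    """Inverse as adj(A) / det(A); entry (i, j) of adj(A) is cofactor (j, i)."""
    d = det(A)
    if d == 0:
        raise ValueError("matrix is singular, no inverse exists")
    n = len(A)
    return [[Fraction((-1) ** (i + j) * det(minor(A, j, i)), d)
             for j in range(n)] for i in range(n)]

A3 = [[3, 1, -1], [2, -2, 0], [1, 2, -1]]
A2 = [[2, 3], [7, 5]]
inv3 = inverse(A3)
inv2 = inverse(A2)
```

Both results match the slide examples: `inv3` has rows (1, -1/2, -1), (1, -1, -1), (3, -5/2, -4), and `inv2` has rows (-5/11, 3/11), (7/11, -2/11).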
Inverse of a Matrix - Properties
Proof:
• $I = AA^{-1}$
• Transposing both sides, with $B = A^{-1}$: $I = (AB)^T = B^T A^T$
Inverse of a Matrix - Properties
• If a matrix is orthogonal, then $A^{-1} = A^T$
Proof:
• $AA^T = A^TA = I$ (by definition of an orthogonal matrix)
• This implies,
• $A^T = A^{-1}$ in the case of an orthogonal matrix
Inverse of a Matrix - Properties
• Inverse of the product of two square matrices A, B
• $(AB)^{-1} = B^{-1}A^{-1}$
Proof:
• $(AB)B^{-1}A^{-1} = AIA^{-1} = AA^{-1} = I$
• Hence,
• $(AB)^{-1} = B^{-1}A^{-1}$
Applications of Inverse of a matrix
• Consider a system of linear equations, y = Ax
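When A is invertible, the system y = Ax has the solution x = A^{-1}y. The following is a minimal sketch for the 2 x 2 case, reusing the matrix A from the inverse example; the right-hand side y is a hypothetical value chosen for illustration, and `solve_2x2` is a helper name introduced here.

```python
from fractions import Fraction

def solve_2x2(A, y):
    """Solve y = A x for a 2 x 2 matrix A using the explicit inverse."""
    (a11, a12), (a21, a22) = A
    d = a11 * a22 - a12 * a21
    if d == 0:
        raise ValueError("A is singular, so x = A^-1 y is not defined")
    inv = [[Fraction(a22, d), Fraction(-a12, d)],
           [Fraction(-a21, d), Fraction(a11, d)]]
    return [inv[0][0] * y[0] + inv[0][1] * y[1],
            inv[1][0] * y[0] + inv[1][1] * y[1]]

A = [[2, 3], [7, 5]]   # matrix from the (2 x 2) inverse example
y = [8, 17]            # hypothetical right-hand side (an assumption)
x = solve_2x2(A, y)
```

Substituting back, 2*1 + 3*2 = 8 and 7*1 + 5*2 = 17, so x = (1, 2) satisfies the system.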
Eigen Values
• Their task was to find the "most important" page for a particular search query. For
example, if everyone linked to Page 1, and it was the only one that had 5 incoming links,
then it would be easy - Page 1 would be returned at the top of the search result.
• However, we can see some pages in our web are not regarded as very important. For
example, Page 3 has only one incoming link. Should its outgoing link (to Page 5) be worth
the same as Page 1's outgoing link to Page 5?
• The beauty of PageRank was that it regarded pages with many incoming links (especially
from other popular pages) as more important than those from mediocre pages, and it
gave more weighting to the outgoing links of important pages.
• Eigenvector = PageRank
https://www.intmath.com/matrices-determinants/8-applications-eigenvalues-eigenvectors.php
Eigen Values
• If S is an n x n matrix, then the eigenvalues of S are the roots of the
characteristic equation of S,
• $|S - \lambda I_n| = 0$
Eigen Values - Example
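$S = \begin{pmatrix} 1 & 4 \\ 9 & 1 \end{pmatrix}$, for eigenvalues $|S - \lambda I_n| = 0$
=> $\begin{vmatrix} 1-\lambda & 4 \\ 9 & 1-\lambda \end{vmatrix} = 0$
=> $(1 - \lambda)^2 - 36 = 0$
=> $\lambda^2 - 2\lambda - 35 = 0$ => λ = 7, -5
• For λ = 7, $\begin{pmatrix} 1 & 4 \\ 9 & 1 \end{pmatrix} \begin{pmatrix} x_{11} \\ x_{12} \end{pmatrix} = 7 \begin{pmatrix} x_{11} \\ x_{12} \end{pmatrix}$
• The eigenvectors = $\begin{pmatrix} 2/3 \\ 1 \end{pmatrix}$ (for λ = 7), $\begin{pmatrix} -2/3 \\ 1 \end{pmatrix}$ (for λ = -5)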
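For a 2 x 2 matrix the characteristic equation is the quadratic λ^2 - tr(S)λ + det(S) = 0, so the example above can be reproduced with the quadratic formula. This is a minimal sketch; the function names are introduced here, and `eigenvector_2x2` assumes s11 ≠ λ so that the first row of (S - λI)x = 0 can be solved with x2 = 1.

```python
import math

def eigenvalues_2x2(S):
    """Real eigenvalues of a 2 x 2 matrix from its characteristic quadratic."""
    tr = S[0][0] + S[1][1]
    d = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    disc = tr * tr - 4 * d
    if disc < 0:
        raise ValueError("eigenvalues are complex")
    root = math.sqrt(disc)
    return ((tr + root) / 2, (tr - root) / 2)

def eigenvector_2x2(S, lam):
    """Eigenvector scaled so its second entry is 1 (assumes S[0][0] != lam)."""
    # First row of (S - lam*I) x = 0: (s11 - lam) x1 + s12 x2 = 0 with x2 = 1
    return (-S[0][1] / (S[0][0] - lam), 1.0)

S = [[1, 4], [9, 1]]
lam1, lam2 = eigenvalues_2x2(S)
v1 = eigenvector_2x2(S, lam1)
v2 = eigenvector_2x2(S, lam2)
```

Because any nonzero scalar multiple of an eigenvector is again an eigenvector, fixing the second entry at 1 is only one convenient normalisation.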
Properties of Eigenanalyses
• If x is an eigenvector of S, any nonzero scalar multiple kx is also an
eigenvector
• S(kx) = λkx
Properties of Eigenanalyses
• If λi, λj are the distinct eigen values of S with eigen vectors xi, xj
• Then xi, xj are distinct vectors
Properties of Eigenanalyses
Proof:
• Hence, xi ≠ xj
Properties of Eigenanalyses
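• If xi, xj are distinct eigenvectors of S with the same eigenvalue λ, then any nonzero linear combination a·xi + b·xj is also an eigenvector with eigenvalue λ
• Since Sxi = λxi & Sxj = λxj, consider real scalars a, b
=> S(a·xi) = λ(a·xi) (multiply by a) & S(b·xj) = λ(b·xj) (multiply by b)
=> Adding the two, S(a·xi + b·xj) = λ(a·xi + b·xj)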
⇒ $\begin{pmatrix} -1 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = [-1 + 1] = [0]$
Properties of Eigenanalyses
• If x and λ are an eigenpair of a non-singular matrix S, then
• x is an eigenvector of $S^{-1}$ with $\lambda^{-1}$ as eigenvalue
Properties of Eigenanalyses
Proof:
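• We have Sx = λx
• Multiplying both sides by $S^{-1}$: $x = \lambda S^{-1}x$
• Since S is non-singular, λ ≠ 0, so $S^{-1}x = \lambda^{-1}x$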
Properties of Eigenanalyses - Example
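For $S = \begin{pmatrix} 1 & 1 \\ 2 & 3 \end{pmatrix}$, for eigenvalues $|S - \lambda I_n| = 0$
=> $\begin{vmatrix} 1-\lambda & 1 \\ 2 & 3-\lambda \end{vmatrix} = 0$
=> $(3 - \lambda)(1 - \lambda) - 2 = 0$
=> $\lambda^2 - 4\lambda + 1 = 0$
=> $\lambda = 2 + \sqrt{3},\ 2 - \sqrt{3}$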
Properties of Eigenanalyses - Example
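• $S = \begin{pmatrix} 1 & 1 \\ 2 & 3 \end{pmatrix}$, $\lambda = 2 + \sqrt{3},\ 2 - \sqrt{3}$
• For $\lambda = 2 + \sqrt{3}$, $\begin{pmatrix} 1 & 1 \\ 2 & 3 \end{pmatrix} \begin{pmatrix} x_{11} \\ x_{12} \end{pmatrix} = (2 + \sqrt{3}) \begin{pmatrix} x_{11} \\ x_{12} \end{pmatrix}$
• For $\lambda = 2 - \sqrt{3}$, $\begin{pmatrix} 1 & 1 \\ 2 & 3 \end{pmatrix} \begin{pmatrix} x_{11} \\ x_{12} \end{pmatrix} = (2 - \sqrt{3}) \begin{pmatrix} x_{11} \\ x_{12} \end{pmatrix}$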
Properties of Eigenanalyses - Example
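• $S^{-1} = \begin{pmatrix} 3 & -1 \\ -2 & 1 \end{pmatrix}$, $\lambda = 2 - \sqrt{3},\ 2 + \sqrt{3}$
• For $\lambda = 2 + \sqrt{3}$, $\begin{pmatrix} 3 & -1 \\ -2 & 1 \end{pmatrix} \begin{pmatrix} x_{11} \\ x_{12} \end{pmatrix} = (2 + \sqrt{3}) \begin{pmatrix} x_{11} \\ x_{12} \end{pmatrix}$
• For $\lambda = 2 - \sqrt{3}$, $\begin{pmatrix} 3 & -1 \\ -2 & 1 \end{pmatrix} \begin{pmatrix} x_{11} \\ x_{12} \end{pmatrix} = (2 - \sqrt{3}) \begin{pmatrix} x_{11} \\ x_{12} \end{pmatrix}$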
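• We have Sx = λx, so $S^2x = S(\lambda x) = \lambda^2 x$: the eigenvalues of $S^2$ are $\lambda^2$
$S = \begin{pmatrix} 1 & 1 \\ 2 & 3 \end{pmatrix}$
$S^2 = \begin{pmatrix} 3 & 4 \\ 8 & 11 \end{pmatrix}$
$(2 + \sqrt{3})^2 = 7 + 4\sqrt{3}$
$(2 - \sqrt{3})^2 = 7 - 4\sqrt{3}$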
Properties of Eigenanalyses - Example
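$S^2 = \begin{pmatrix} 1 & 1 \\ 2 & 3 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 2 & 3 \end{pmatrix} = \begin{pmatrix} 1+2 & 1+3 \\ 2+6 & 2+9 \end{pmatrix} = \begin{pmatrix} 3 & 4 \\ 8 & 11 \end{pmatrix}$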
Properties of Eigenanalyses - Example
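For $S^2 = \begin{pmatrix} 3 & 4 \\ 8 & 11 \end{pmatrix}$, for eigenvalues $|S^2 - \lambda I_n| = 0$
=> $\begin{vmatrix} 3-\lambda & 4 \\ 8 & 11-\lambda \end{vmatrix} = 0$
=> $(3 - \lambda)(11 - \lambda) - 32 = 0$
=> $\lambda^2 - 14\lambda + 1 = 0$
• For $\lambda = 7 + 4\sqrt{3}$, $\begin{pmatrix} 3 & 4 \\ 8 & 11 \end{pmatrix} \begin{pmatrix} x_{11} \\ x_{12} \end{pmatrix} = (7 + 4\sqrt{3}) \begin{pmatrix} x_{11} \\ x_{12} \end{pmatrix}$
• For $\lambda = 7 - 4\sqrt{3}$, $\begin{pmatrix} 3 & 4 \\ 8 & 11 \end{pmatrix} \begin{pmatrix} x_{11} \\ x_{12} \end{pmatrix} = (7 - 4\sqrt{3}) \begin{pmatrix} x_{11} \\ x_{12} \end{pmatrix}$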
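• This implies, $S^* = aI_n - bS = 2\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} - 1\begin{pmatrix} 1 & 1 \\ 2 & 3 \end{pmatrix}$
$S^* = \begin{pmatrix} 1 & -1 \\ -2 & -1 \end{pmatrix}$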
Properties of Eigenanalyses - Example
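For $S^* = \begin{pmatrix} 1 & -1 \\ -2 & -1 \end{pmatrix}$, for eigenvalues $|S^* - \lambda I_n| = 0$
=> $\begin{vmatrix} 1-\lambda & -1 \\ -2 & -1-\lambda \end{vmatrix} = 0$
=> $(-1 - \lambda)(1 - \lambda) - 2 = 0$
=> $\lambda^2 - 3 = 0$
• For $\lambda = \sqrt{3}$, $\begin{pmatrix} 1 & -1 \\ -2 & -1 \end{pmatrix} \begin{pmatrix} x_{11} \\ x_{12} \end{pmatrix} = \sqrt{3} \begin{pmatrix} x_{11} \\ x_{12} \end{pmatrix}$
• For $\lambda = -\sqrt{3}$, $\begin{pmatrix} 1 & -1 \\ -2 & -1 \end{pmatrix} \begin{pmatrix} x_{11} \\ x_{12} \end{pmatrix} = -\sqrt{3} \begin{pmatrix} x_{11} \\ x_{12} \end{pmatrix}$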
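The three eigenvalue properties illustrated above (for S^{-1}, S^2 and aI - bS with a = 2, b = 1) can be checked numerically on the same S. This is a minimal sketch using the 2 x 2 characteristic quadratic; `eigenvalues_2x2` is a helper name introduced here.

```python
import math

def eigenvalues_2x2(S):
    """Real eigenvalues of a 2 x 2 matrix, sorted in increasing order."""
    tr = S[0][0] + S[1][1]
    d = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    root = math.sqrt(tr * tr - 4 * d)  # assumes a non-negative discriminant
    return sorted([(tr - root) / 2, (tr + root) / 2])

S = [[1, 1], [2, 3]]
S_inv = [[3, -1], [-2, 1]]    # inverse of S, as given on the slide
S_sq = [[3, 4], [8, 11]]      # S squared, as computed on the slide
S_star = [[1, -1], [-2, -1]]  # 2*I - 1*S, as computed on the slide

lam = eigenvalues_2x2(S)  # 2 - sqrt(3), 2 + sqrt(3)

# Expected eigenvalues according to the three properties
expected_inv = sorted(1 / l for l in lam)
expected_sq = sorted(l ** 2 for l in lam)
expected_star = sorted(2 - 1 * l for l in lam)

def close(xs, ys):
    return all(math.isclose(x, y, abs_tol=1e-9) for x, y in zip(xs, ys))

inv_ok = close(eigenvalues_2x2(S_inv), expected_inv)
sq_ok = close(eigenvalues_2x2(S_sq), expected_sq)
star_ok = close(eigenvalues_2x2(S_star), expected_star)
```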
• $T' = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}$ (Check if TT’ = I)
Properties of Eigenanalyses
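• Let us take an example of $S = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}$, $T = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}$
• $T'ST = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix} \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}$
$= \frac{1}{2} \begin{pmatrix} 1+2 & 2+1 \\ -1+2 & -2+1 \end{pmatrix} \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 3 & 3 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}$
$= \frac{1}{2} \begin{pmatrix} 3+3 & -3+3 \\ 1-1 & -1-1 \end{pmatrix} = \begin{pmatrix} 3 & 0 \\ 0 & -1 \end{pmatrix} = D$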
Properties of Eigenanalyses - Example
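For $S = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}$, for eigenvalues $|S - \lambda I_n| = 0$
=> $\begin{vmatrix} 1-\lambda & 2 \\ 2 & 1-\lambda \end{vmatrix} = 0$
=> $(1 - \lambda)(1 - \lambda) - 4 = 0$
=> $\lambda^2 - 2\lambda - 3 = 0$ => λ = 3, -1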
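Properties – Positive Definite Matrix
$A = \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}$, for eigenvalues $|A - \lambda I_n| = 0$
=> $\begin{vmatrix} 2-\lambda & -1 \\ -1 & 2-\lambda \end{vmatrix} = 0$
=> $(2 - \lambda)^2 - 1 = 0$
=> $\lambda^2 - 4\lambda + 3 = 0$ => λ = 3, 1
Both eigenvalues of the symmetric matrix A are positive, so A is positive definite.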
Matrix Decomposition
• A matrix decomposition is way of reducing a matrix into its
constituent parts
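$L = \begin{pmatrix} 1 & 0 & 0 \\ l_{21} & 1 & 0 \\ l_{31} & l_{32} & 1 \end{pmatrix}$
$U = \begin{pmatrix} u_{11} & u_{12} & u_{13} \\ 0 & u_{22} & u_{23} \\ 0 & 0 & u_{33} \end{pmatrix}$
Does [L][U] = [A]?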
$[L][U] = \begin{pmatrix} 1 & 0 & 0 \\ 2.56 & 1 & 0 \\ 5.76 & 3.5 & 1 \end{pmatrix} \begin{pmatrix} 25 & 5 & 1 \\ 0 & -4.8 & -1.56 \\ 0 & 0 & 0.7 \end{pmatrix} = ?$
• x - y + z = 2
• x + 2y + 2z = 3
• x - y - z = 1
Coefficient matrix: $\begin{pmatrix} 1 & -1 & 1 \\ 1 & 2 & 2 \\ 1 & -1 & -1 \end{pmatrix}$
$\begin{pmatrix} 25 & 5 & 1 \\ 64 & 8 & 1 \\ 144 & 12 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 106.8 \\ 177.2 \\ 279.2 \end{pmatrix}$
Using the procedure for finding the [L] and [U] matrices
$[A] = [L][U] = \begin{pmatrix} 1 & 0 & 0 \\ 2.56 & 1 & 0 \\ 5.76 & 3.5 & 1 \end{pmatrix} \begin{pmatrix} 25 & 5 & 1 \\ 0 & -4.8 & -1.56 \\ 0 & 0 & 0.7 \end{pmatrix}$
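The [L] and [U] factors above can be reproduced, and the system solved, with a short routine. This is a minimal sketch assuming the Doolittle form (unit diagonal in L) without row pivoting, which matches the shape of L and U shown on the slides; the function names are introduced here for illustration.

```python
def lu_decompose(A):
    """Doolittle LU factorisation without pivoting: A = L U, L has unit diagonal."""
    n = len(A)
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    U = [[float(v) for v in row] for row in A]
    for k in range(n):
        if U[k][k] == 0:
            raise ZeroDivisionError("zero pivot; this sketch does no row swaps")
        for i in range(k + 1, n):
            L[i][k] = U[i][k] / U[k][k]      # multiplier stored in L
            for j in range(k, n):
                U[i][j] -= L[i][k] * U[k][j]  # eliminate below the pivot
    return L, U

def lu_solve(L, U, b):
    """Solve L U x = b by forward substitution, then back substitution."""
    n = len(b)
    z = [0.0] * n
    for i in range(n):                        # L z = b
        z[i] = b[i] - sum(L[i][j] * z[j] for j in range(i))
    x = [0.0] * n
    for i in reversed(range(n)):              # U x = z
        x[i] = (z[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))) / U[i][i]
    return x

A = [[25, 5, 1], [64, 8, 1], [144, 12, 1]]
b = [106.8, 177.2, 279.2]
L, U = lu_decompose(A)
x = lu_solve(L, U, b)
```

Once L and U are known, solving for a new right-hand side needs only the two substitution passes, which is the usual motivation for the decomposition.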
Spectral Matrix Decomposition
• If S is a symmetric matrix with eigenvalues λ1 ≥ λ2 ≥ λ3 … ≥ λn,
we have S = TΛT’, where Λ = diag(λ1, …, λn) and the columns of T are the corresponding orthonormal eigenvectors
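The decomposition S = TΛT’ can be checked on the symmetric example used earlier, S with rows (1, 2) and (2, 1), T = (1/√2) times the matrix with rows (1, -1) and (1, 1), and eigenvalues 3 and -1. This is a minimal sketch in plain Python with helper names introduced here.

```python
import math

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(A):
    return [list(col) for col in zip(*A)]

r = 1 / math.sqrt(2)
S = [[1, 2], [2, 1]]
T = [[r, -r], [r, r]]      # columns are orthonormal eigenvectors of S
Lam = [[3, 0], [0, -1]]    # eigenvalues on the diagonal, 3 >= -1

reconstructed = matmul(matmul(T, Lam), transpose(T))  # T Lam T'
D = matmul(matmul(transpose(T), S), T)                # T' S T
identity = matmul(T, transpose(T))                    # T T'
```

Up to floating-point rounding, `reconstructed` equals S, `D` equals Λ, and `identity` equals I.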