Mat188 Notes
Arnav Patil
University of Toronto
Contents
1 Vectors, Lines, Sets, and Planes
3 Linear Transformations
5 Determinant
8 Orthogonal Projection
11 Diagonalization
1 Vectors, Lines, Sets, and Planes
Introduction
A vector is a representation of displacement – it gives a set of instructions on how to move from one point
to the next. This is known as a coordinate vector. When the coordinate vector is taken from the origin,
it becomes known as the standard representation of the vector. Each vector has n components, where n
is a positive integer.
The collection of all vectors with n components forms Rn , also known as the n-dimensional Euclidean
vector space.
We use the head-to-tail rule for vector addition; that is, add up the individual components of each vector being summed to get the resulting vector.
There are eight “axioms” of linear algebra that will guide us through this course and allow us to ensure vector arithmetic works the way we want it to:
1. Vector addition is associative, or (⃗v + ⃗w) + ⃗u = ⃗v + (⃗w + ⃗u),
2. Vector addition is commutative, or ⃗v + ⃗w = ⃗w + ⃗v,
3. ⃗v + ⃗0 = ⃗v ,
4. For each ⃗v ∈ Rn , there exists a vector ⃗x such that ⃗v + ⃗x = ⃗0,
5. k(⃗v + ⃗w) = k⃗v + k⃗w,
6. (c + k)⃗v = c⃗v + k⃗v ,
7. c(k⃗v ) = (ck)⃗v , and
8. 1⃗v = ⃗v .
Solidify
Definition: Dot Product
Let ⃗v and ⃗w be two vectors with components v1, v2, ..., vn and w1, w2, ..., wn respectively. The dot product of ⃗v and ⃗w, denoted by ⃗v · ⃗w, is given by v1w1 + v2w2 + ... + vnwn.
Definition: Norm
Let ⃗v be a vector in Rn. The norm of ⃗v is given by:
∥⃗v∥ = √(v1² + v2² + ... + vn²)
For vectors in R2 and R3, the angle θ between ⃗v and ⃗w can be given by:
θ = arccos( (⃗v · ⃗w) / (∥⃗v∥ ∥⃗w∥) )
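These formulas are easy to check numerically. A minimal sketch (using Python with numpy, which is an assumption of this example rather than part of the course):

    import numpy as np

    v = np.array([1.0, 2.0, 2.0])
    w = np.array([2.0, 0.0, 1.0])

    dot = np.dot(v, w)                            # 1*2 + 2*0 + 2*1 = 4
    norm_v = np.linalg.norm(v)                    # sqrt(1 + 4 + 4) = 3
    norm_w = np.linalg.norm(w)                    # sqrt(5)
    theta = np.arccos(dot / (norm_v * norm_w))    # angle between v and w, in radians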
Definition: Perpendicular
Two vectors ⃗v and ⃗w are perpendicular (orthogonal) if ⃗v · ⃗w = 0.
Definition: Subset
We say a set A is a subset of a set B, and write A ⊆ B, if all the elements of A are also in B. In
other words, A ⊆ B, if for every a ∈ A, a ∈ B.
Expand
We can define lines in set notation. Let’s take an example of y = −2x. As a set, we can describe it as:
l = {(x, y)|y = −2x} = {(t, −2t)|t ∈ R}
We can generalize this to vectors:
l = {⃗p + t⃗d | t ∈ R}
We refer to ⃗v = ⃗p + t⃗d as the vector parametric form of the line l.
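The parametric form is easy to test numerically. A short sketch (numpy assumed; the specific points are made up) that generates points ⃗p + t⃗d on the line y = −2x above and confirms each one satisfies the equation:

    import numpy as np

    p = np.array([0.0, 0.0])      # a point on the line y = -2x
    d = np.array([1.0, -2.0])     # a direction vector for the line

    for t in [-1.0, 0.5, 3.0]:
        point = p + t * d                             # a point of the form p + t*d
        assert np.isclose(point[1], -2 * point[0])    # its y-coordinate equals -2x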
Definition: Plane
A plane in Rn has the form:
P = {t⃗m + s⃗n + ⃗b | s, t ∈ R}
where ⃗m and ⃗n may NOT be parallel.
A collection of linear equations with the same variables is known as a system of linear equations. Values
for each xn that make all the equations true simultaneously are known as solutions to the system.
The coefficient matrix:
[ 2   4 ]
[ 4   3 ]
[ 1  −1 ]
and the augmented matrix:
[ 2   4 |  8 ]
[ 4   3 | 10 ]
[ 1  −1 |  2 ]
We use the process of Gauss-Jordan elimination to systematically solve systems of linear equations.
There are three fundamental operations that do not change the general solution to the system:
1. Multiplying an equation by a scalar,
2. Adding a multiple of an equation to another, and
3. Switching the order of equations.
A particular solution is one single tuple (c1, c2, ..., cn) that makes all the equations true simultaneously. The set of all possible solutions to a system of linear equations is known as the general solution.
Solidify
For each matrix M, there are many REF forms; however, there is only one RREF form.
The first nonzero entry in a row is called that row’s leading entry, or leading 1 if the entry is a 1. A pivot position is a location in a matrix that corresponds to a leading 1 in the RREF of that matrix.
5. Solve for the basic variables in terms of the free variables.
6. Let the free variables run over all scalars.
7. We typically write the general solution in set builder notation.
A system of equations is consistent if there’s at least one solution; otherwise, it is called inconsistent.
A linear system is inconsistent only if the RREF matrix has the equation 0 = 1; in other words, if the augmented column is a pivot column. Otherwise, if a linear system is consistent, then it has:
• Infinitely many solutions (there’s at least one free variable), or
• Exactly one solution (there are no free variables).
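Row reduction is convenient to experiment with in software. A sketch (sympy assumed; the system is a made-up consistent example with one free variable) that computes the RREF of an augmented matrix and reports the pivot columns:

    from sympy import Matrix

    # augmented matrix [A | b] for the made-up system x + 2y + z = 3, 2x + 4y + 3z = 7
    M = Matrix([[1, 2, 1, 3],
                [2, 4, 3, 7]])

    R, pivots = M.rref()    # reduced row echelon form and the pivot column indices
    print(R)                # Matrix([[1, 2, 0, 2], [0, 0, 1, 1]])
    print(pivots)           # (0, 2): the y-column has no pivot, so y is a free variable;
                            # the augmented column is not a pivot column, so the system is consistent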
Definition: Rank
The rank of a matrix A is the number of leading 1s in its RREF.
Expand
Some matrix terminology:
We can write the linear system with augmented matrix [A|⃗b] in matrix form as A⃗x = ⃗b.
Expand
Theorem: Algebraic Rules for A⃗x
• A(⃗x + ⃗y) = A⃗x + A⃗y, and
• A(k⃗x) = k(A⃗x).
3 Linear Transformations
Introduce
We sometimes use more archaic words in linear algebra than in calculus; this is due to linear algebra’s deep connections to geometry. For example, what we call a ‘function’ in calculus, we may call a ‘transformation’ or ‘mapping’ in linear algebra.
The space from which the input to a transformation comes is called the domain. The codomain is the
space in which the output vectors lie, while the set of vectors that make up the output is called the range.
Solidify
Definition: Standard Vectors
The standard vector ⃗ei ∈ Rm is defined as the unit vector whose entries are all zero except for a 1 in the ith entry.
Expand
The linear transformation that rotates vector ⃗u by θ radians is:
T(⃗u) = [ cos θ  −sin θ ] ⃗u
        [ sin θ   cos θ ]
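A quick check of this matrix (numpy assumed): rotating ⃗e1 by π/2 radians should give ⃗e2.

    import numpy as np

    def rotation(theta):
        # standard matrix of the rotation of R^2 by theta radians
        return np.array([[np.cos(theta), -np.sin(theta)],
                         [np.sin(theta),  np.cos(theta)]])

    e1 = np.array([1.0, 0.0])
    print(rotation(np.pi / 2) @ e1)    # approximately [0, 1], i.e. e2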
We call ⃗x|| = ((⃗u · ⃗x)/∥⃗u∥²) ⃗u the projection of ⃗x onto ⃗u:
proj⃗u(⃗x) = ((⃗u · ⃗x)/∥⃗u∥²) ⃗u
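The projection formula translates directly into code. A sketch (numpy assumed) that projects ⃗x onto ⃗u and checks that the leftover part ⃗x − ⃗x|| is perpendicular to ⃗u:

    import numpy as np

    def proj(u, x):
        # projection of x onto u: (u . x / ||u||^2) u
        return (np.dot(u, x) / np.dot(u, u)) * u

    u = np.array([3.0, 4.0])
    x = np.array([2.0, 1.0])
    x_par = proj(u, x)                            # [1.2, 1.6]
    x_perp = x - x_par
    assert np.isclose(np.dot(x_perp, u), 0.0)     # the remainder is orthogonal to u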
Let T : Rm → Rn , T (⃗x) = A⃗x and S : Rn → Rp , S(⃗y ) = B⃗y be linear transformations with associated
standard matrices A and B respectively. Then BA is defined to be the unique matrix associated to
the composition S ◦ T : Rm → Rp .
Solidify
Theorem: The Columns of the Matrix Product
Let A = [⃗a1 ⃗a2 · · · ⃗am] be an n × m matrix and B be a p × n matrix. Then BA is a p × m matrix given by:
BA = [B⃗a1 B⃗a2 · · · B⃗am]
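A quick numerical check of this column-by-column description of BA (numpy assumed; the matrices are small random examples):

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((3, 4))    # n x m with n = 3, m = 4
    B = rng.standard_normal((2, 3))    # p x n with p = 2

    BA = B @ A
    for j in range(A.shape[1]):
        # column j of BA equals B times column j of A
        assert np.allclose(BA[:, j], B @ A[:, j])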
Expand
Injective (or one-to-one) transformation – A transformation T is called injective if no two vectors in the domain get sent to the same vector in the codomain.
Surjective (or onto) transformation – A transformation T is called surjective if every vector in the codomain has a vector in the domain that gets mapped to it by T.
A square matrix A is said to be invertible if the linear transformation T (⃗x) = A⃗x is invertible. In
this case, the matrix of T −1 is denoted by A−1 . If the linear transformation T (⃗x) = A⃗x is invertible,
then its inverse is T −1 (⃗y ) = A−1 ⃗y .
The product of two invertible matrices is also invertible. Furthermore, (AB)−1 = B −1 A−1 , meaning that
the order of the matrix multiplication reverses when we take an inverse.
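A numerical check of (AB)−1 = B −1 A−1 (numpy assumed; the matrices are arbitrary invertible examples):

    import numpy as np

    A = np.array([[2.0, 1.0], [1.0, 1.0]])
    B = np.array([[1.0, 3.0], [0.0, 1.0]])

    lhs = np.linalg.inv(A @ B)
    rhs = np.linalg.inv(B) @ np.linalg.inv(A)    # note the reversed order
    assert np.allclose(lhs, rhs)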
5 Determinant
Introduce
Definition: 2 × 2 Determinant
Let A = [ a  b ]
        [ c  d ]
be a 2 × 2 matrix. The determinant of A is the scalar ad − bc.
The absolute value of the determinant of A is the expansion factor, or the ratio by which T changes the area
of any subset Ω in R2 .
(area of T(Ω)) / (area of Ω) = |det A|
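For a concrete instance of the expansion factor (numpy assumed): the matrix below scales every area by |det A| = 6, so the unit square is sent to a parallelogram of area 6.

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [0.0, 3.0]])
    print(np.linalg.det(A))    # 6.0: T scales the area of any region by |det A| = 6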
Solidify
Definition: Cross Product in R3
The cross product ⃗v × ⃗w of two vectors ⃗v, ⃗w ∈ R3 is the vector determined by the following properties:
• ⃗v × ⃗w is orthogonal to both ⃗v and ⃗w.
• ∥⃗v × ⃗w∥ = ∥⃗v∥ ∥⃗w∥ sin θ, where θ is the angle between ⃗v and ⃗w, with 0 ≤ θ ≤ π.
• The direction of ⃗v × ⃗w follows the right-hand rule.
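A sketch (numpy assumed) checking the first two properties on a pair of example vectors:

    import numpy as np

    v = np.array([1.0, 0.0, 0.0])
    w = np.array([0.0, 2.0, 0.0])
    c = np.cross(v, w)                        # [0, 0, 2]

    assert np.isclose(np.dot(c, v), 0.0)      # orthogonal to v
    assert np.isclose(np.dot(c, w), 0.0)      # orthogonal to w
    # ||v x w|| = ||v|| ||w|| sin(theta) = 1 * 2 * sin(pi/2) = 2
    assert np.isclose(np.linalg.norm(c), 2.0)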
Definition: 3 × 3 Determinant
Theorem: Determinants of Products and Powers
Let A, B be n × n matrices:
1. det AB = det A det B
2. det(Am ) = (det A)m
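A quick numerical check of the first property (numpy assumed; the matrices are arbitrary):

    import numpy as np

    A = np.array([[1.0, 2.0], [3.0, 4.0]])
    B = np.array([[0.0, 1.0], [1.0, 1.0]])
    assert np.isclose(np.linalg.det(A @ B), np.linalg.det(A) * np.linalg.det(B))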
Expand
Theorem: Elementary Row Operations and Determinants
Suppose A is an n × n matrix:
1. If B is obtained from A by dividing a row of A by a scalar k, then det B = (1/k) det A.
An elementary matrix is a matrix you get by applying an elementary row reduction step to the
identity matrix.
Consider the linear system A⃗x = ⃗b, where A is an invertible n × n matrix. The components xi of the solution vector ⃗x are
xi = (det A⃗b,i) / (det A)
where A⃗b,i is the matrix obtained by replacing the i-th column of A by ⃗b.
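Cramer’s rule is straightforward to implement directly. A sketch (numpy assumed), compared against numpy’s built-in solver:

    import numpy as np

    def cramer(A, b):
        # solve Ax = b for invertible A using Cramer's rule
        n = len(b)
        det_A = np.linalg.det(A)
        x = np.empty(n)
        for i in range(n):
            A_i = A.copy()
            A_i[:, i] = b                        # replace the i-th column of A by b
            x[i] = np.linalg.det(A_i) / det_A
        return x

    A = np.array([[2.0, 1.0], [1.0, 3.0]])
    b = np.array([3.0, 5.0])
    assert np.allclose(cramer(A, b), np.linalg.solve(A, b))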
A subspace W of Rn is a subset of Rn which contains ⃗0 and is closed under vector addition and scalar multiplication.
Definition: Span
Let ⃗v1 , ⃗v2 , . . . , ⃗vm be vectors in Rn . The set of all linear combinations of ⃗v1 , ⃗v2 , . . . , ⃗vm is called their
span. That is, span(⃗v1, ⃗v2, . . . , ⃗vm) = {c1⃗v1 + c2⃗v2 + · · · + cm⃗vm | c1, . . . , cm ∈ R}.
If span(⃗v1 , ⃗v2 , . . . , ⃗vm ) = V , then {⃗v1 , ⃗v2 , . . . , ⃗vm } is called a spanning set for V , or is said to span V .
Given vectors ⃗v1 , ⃗v2 , ..., ⃗vm in Rn , span(⃗v1 , ⃗v2 , ..., ⃗vm ) is a subspace.
Definition: Image
The image of a linear transformation T : Rm → Rn is the set of all outputs of T:
im(T) = {T(⃗x) | ⃗x ∈ Rm}
Solidify
Definition: Kernel
The kernel of linear transformation T : Rm → Rn is the set of all vectors in the domain such that
T (⃗v ) = ⃗0.
ker(T ) = {⃗v ∈ Rm |T (⃗v ) = ⃗0}
Theorem:
For a linear transformation T : Rm → Rn , T is injective if and only if ker(T ) = {⃗0}, and T is surjective
if and only if im(T ) = Rn .
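Both subspaces can be computed directly in software, which makes the criteria above easy to test. A sketch (sympy assumed; the matrix is a deliberately rank-deficient example):

    from sympy import Matrix

    A = Matrix([[1, 2, 3],
                [2, 4, 6]])       # rank 1: T maps R^3 to R^2

    print(A.nullspace())          # a basis of ker(T); nonzero vectors exist, so T is not injective
    print(A.columnspace())        # a basis of im(T); its dimension is 1 < 2, so T is not surjective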
Definition: Linearly Dependent
Let ⃗v1 , ⃗v2 , . . . , ⃗vn be vectors in a subspace V of Rm . We say ⃗v1 , ⃗v2 , . . . , ⃗vn are linearly dependent if
there are scalars c1 , . . . , cn that are not all zero such that c1⃗v1 + · · · + cn⃗vn = ⃗0.
We say ⃗v1, ⃗v2, ..., ⃗vn are linearly independent if c1⃗v1 + · · · + cn⃗vn = ⃗0 has only one solution, namely the one where all the scalars are 0.
Expand
Let’s summarize what we know about a set of vectors that’s linearly independent:
1. The set {⃗v1 , ⃗v2 , ..., ⃗vn } contains no redundant vector,
Definition: Basis
A basis of a subspace V of Rn is a linearly independent set of vectors in V that spans V .
Theorem:
Vectors ⃗v1, ⃗v2, ..., ⃗vn in V form a basis iff every vector ⃗v in V can be expressed uniquely as a linear combination ⃗v = c1⃗v1 + c2⃗v2 + ... + cn⃗vn. The coefficients ci are called the coordinates of ⃗v with respect to the basis ⃗v1, ⃗v2, ..., ⃗vn.
Given a nonzero subspace V of Rn, there are infinitely many bases of V. The theorem above guarantees that ALL bases of a subspace V have the same number of vectors. This number is called the dimension of V.
The rank of matrix A is the dimension of Im(A), and the nullity is the dimension of Ker(A).
Let TA : Rn → Rm . The rank-nullity theorem states that:
Rank(A) + Nullity(A) = n
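A quick check of the rank-nullity theorem on a small example (sympy assumed):

    from sympy import Matrix

    A = Matrix([[1, 0, 2, 0],
                [0, 1, 3, 0]])        # a 2 x 4 matrix, so n = 4

    rank = A.rank()                   # 2
    nullity = len(A.nullspace())      # 2
    assert rank + nullity == A.cols   # Rank(A) + Nullity(A) = n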
Solidify
Theorem: Bases and Unique Representation
Vectors ⃗v1, ⃗v2, ..., ⃗vm in V form a basis for V if and only if every vector ⃗v in V can be expressed uniquely as a linear combination of ⃗v1, ⃗v2, ..., ⃗vm.
Definition: Coordinates
Suppose B = (⃗v1 , . . . , ⃗vn ) is an ordered basis of a subspace V . The B-coordinates of ⃗v ∈ V are the
unique scalars ai such that
⃗v = a1⃗v1 + · · · + an⃗vn .
The B-coordinates are arranged into a column vector, denoted [⃗v]B. That is, [⃗v]B is the column vector whose entries are a1, . . . , an.
If B = (⃗b1 , . . . , ⃗bn ) and C = (⃗c1 , . . . , ⃗cn ) are two ordered bases of a subspace V , the change-of-
coordinates matrix from B to C is the unique matrix S such that S[⃗v ]B = [⃗v ]C for all ⃗v ∈ V .
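In the special case V = Rn, the change-of-coordinates matrix can be computed numerically: if the vectors of B and C are stored as the columns of matrices MB and MC, then ⃗v = MB[⃗v]B = MC[⃗v]C, so S = MC−1 MB. A sketch (numpy assumed, with a made-up pair of bases of R2):

    import numpy as np

    MB = np.array([[1.0, 1.0],     # columns are the basis vectors of B
                   [0.0, 1.0]])
    MC = np.array([[2.0, 0.0],     # columns are the basis vectors of C
                   [0.0, 1.0]])

    S = np.linalg.inv(MC) @ MB     # change-of-coordinates matrix from B to C

    vB = np.array([3.0, 4.0])              # some B-coordinates
    v = MB @ vB                            # the vector they represent
    assert np.allclose(MC @ (S @ vB), v)   # S[v]_B are C-coordinates of the same vector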
Expand
Definition: Similar Matrices
Two n × n matrices A and B are called similar if there exists an invertible matrix S such that B = S −1 AS.
8 Orthogonal Projection
Introduce
Definition: Orthogonal Set
A set of vectors is orthogonal if each vector’s dot product with every other vector is 0.
Solidify
Definition: Orthogonal Complement
The orthogonal complement of a subspace W of Rn is
W⊥ = {⃗v ∈ Rn : ⃗v · ⃗w = 0 for all ⃗w ∈ W}
Theorem:
dim W + dim W ⊥ = n
Expand
Theorem: Orthogonal Decomposition
For a subspace V of Rn and a vector ⃗x in Rn, we can write ⃗x = ⃗x|| + ⃗x⊥ where ⃗x|| is in V and ⃗x⊥ is in V⊥. Furthermore, this representation is unique for every ⃗x in Rn.
Let V be a subspace of Rn and B = {⃗b1, . . . , ⃗bk} be a basis for V. Then we can construct an orthonormal basis U = {⃗u1, . . . , ⃗uk} for V where
⃗ui = ⃗pi / ∥⃗pi∥
and ⃗pi = ⃗bi − (⃗bi · ⃗u1)⃗u1 − · · · − (⃗bi · ⃗ui−1)⃗ui−1, that is, ⃗bi with its components along the previously constructed vectors removed.
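This process (the Gram-Schmidt process) is short to write out in code. A sketch (numpy assumed):

    import numpy as np

    def gram_schmidt(basis):
        # turn a list of linearly independent vectors into an orthonormal list
        us = []
        for b in basis:
            p = b.astype(float)
            for u in us:
                p = p - np.dot(b, u) * u        # remove the component along each earlier u
            us.append(p / np.linalg.norm(p))
        return us

    B = [np.array([1.0, 1.0, 0.0]), np.array([1.0, 0.0, 1.0])]
    u1, u2 = gram_schmidt(B)
    assert np.isclose(np.dot(u1, u2), 0.0)      # orthogonal
    assert np.isclose(np.linalg.norm(u1), 1.0)  # unit length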
Theorem: Orthogonal Matrices
Solidify
Theorem:
Definition: Eigenbasis
An eigenbasis for a linear transformation T : Rn → Rn is a basis of Rn consisting of eigenvectors of T.
Theorem:
A linear transformation T : Rn → Rn is diagonalizable iff it has an eigenbasis.
Solidify
Theorem: Eigenvalues of a Matrix
The sum of the diagonal entries of a square matrix A is called the trace of A, denoted by tr(A).
Expand
Definition: Eigenspace
The λ-eigenspace of a square matrix A is the set of all vectors ⃗v with A⃗v = λ⃗v, that is, ker(A − λI).
The geometric multiplicity of λ is the dimension of the λ-eigenspace or, equivalently, the maximal
size of a linearly independent set of eigenvectors with eigenvalue λ.
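A sketch (numpy assumed) that computes eigenvalues and eigenvectors of a small symmetric example and verifies A⃗v = λ⃗v; it also checks the standard fact that the trace equals the sum of the eigenvalues:

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 2.0]])
    eigvals, eigvecs = np.linalg.eig(A)        # eigenvalues 3 and 1
    for lam, v in zip(eigvals, eigvecs.T):     # the columns of eigvecs are eigenvectors
        assert np.allclose(A @ v, lam * v)     # A v = lambda v

    assert np.isclose(np.trace(A), eigvals.sum())   # tr(A) = 4 = 3 + 1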
11 Diagonalization
Introduce
Theorem:
Let T : Rn → Rn be a linear transformation. Eigenvectors of T that correspond to distinct eigenvalues
are linearly independent.
Theorem:
If T : Rn → Rn has n linearly independent eigenvectors, then T is diagonalizable.
3. There exists a diagonal matrix D and an invertible matrix S such that A = SDS −1 ,
4. The dimensions of the eigenspaces of T add up to n,
5. Geometric multiplicities of T add up to n, and
Solidify
Definition: Orthogonally Diagonalizable Map
Expand
A distribution vector is a vector ⃗x in Rn whose components add up to 1 and are all either positive or zero. A square matrix A is a transition or stochastic matrix if all of its columns are distribution vectors. Furthermore, if all entries are strictly positive, then we call it a positive transition matrix.
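A small sketch (numpy assumed; the 2-state matrix is made up) checking these definitions and applying the transition matrix repeatedly to a distribution vector; for a positive transition matrix the iterates settle toward a fixed distribution:

    import numpy as np

    A = np.array([[0.9, 0.2],     # each column has nonnegative entries summing to 1,
                  [0.1, 0.8]])    # so A is a (positive) transition matrix

    x = np.array([0.5, 0.5])      # a distribution vector: entries >= 0 that sum to 1
    for _ in range(50):
        x = A @ x                 # A x is again a distribution vector
    print(x)                      # approximately [2/3, 1/3]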
In a singular value decomposition, the matrix Σ has r = Rank(A) nonzero entries σ1 ≥ · · · ≥ σr, listed in decreasing order along its diagonal, and
A = [⃗u1 ⃗u2 · · · ⃗un] Σ [⃗v1 ⃗v2 · · · ⃗vm]ᵀ
We can use singular value decomposition to compress data; for example, we construct a singular value decomposition for an RGB image, then set the smaller σi’s, which are substantially smaller than σ1, to 0. That way, we can make the stored vectors (and the necessary operations) significantly smaller, while minimizing the loss of quality.
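A sketch of this idea (numpy assumed; a random matrix stands in for one channel of an image): keep only the k largest singular values and compare the reconstruction to the original.

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((100, 80))    # stand-in for one channel of an image

    U, s, Vt = np.linalg.svd(A, full_matrices=False)

    k = 20                                # keep only the 20 largest singular values
    A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

    # relative error of the compressed approximation
    print(np.linalg.norm(A - A_k) / np.linalg.norm(A))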