
EE5907R: Pattern Recognition

Lecture 0: Review of Probability & Linear Algebra

Dr. Tam Nguyen


Review of Probability
• Probability
– Axioms and properties
– Conditional probability
– Law of total probability
– Bayes theorem
• Random Variables
– Discrete
– Continuous
• Random Vectors
• Gaussian Random Variables
Some of the following slides are taken from lecture notes by Ricardo Gutierrez-Osuna

EE5907R: Pattern Recognition 2


Basics of Probability

For events in a sample space Ω, a probability P satisfies the axioms:
1. P(A) ≥ 0 for any event A
2. P(Ω) = 1
3. If A and B are mutually exclusive, P(A ∪ B) = P(A) + P(B)
EE5907R: Pattern Recognition 3


Properties of Probability

Properties that follow from the axioms include:
• P(Aᶜ) = 1 − P(A)
• P(∅) = 0
• If A ⊆ B, then P(A) ≤ P(B)
• P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
EE5907R: Pattern Recognition 4


Conditional Probability

P(A | B) = P(A ∩ B) / P(B), defined whenever P(B) > 0
EE5907R: Pattern Recognition 5


Law of Total Probability

If B1, B2, ..., Bn partition the sample space, then

  P(A) = Σ_{i=1}^{n} P(A | Bi) P(Bi)
EE5907R: Pattern Recognition 6


Bayes Theorem

P(Bi | A) = P(A | Bi) P(Bi) / P(A) = P(A | Bi) P(Bi) / Σ_{j} P(A | Bj) P(Bj)
EE5907R: Pattern Recognition 7


Example
• You can play tennis if there is no rain on at least one of Saturday and Sunday. The probability of rain on Saturday is 80% and the probability of rain on Sunday is 60%; assume rain on the two days is independent.

a) What is the probability of playing tennis on the weekend?

P(Tennis) = 1 − P(no Tennis) = 1 − P(Rain Sat) P(Rain Sun) = 1 − (0.8)(0.6) = 0.52

b) Given that you played tennis over the weekend, what is the probability that it rained on the weekend?

P(Rain | Tennis) = P(Rain and Tennis) / P(Tennis)
= [P(no Rain Sat and Rain Sun) + P(Rain Sat and no Rain Sun)] / P(Tennis)
= (0.2 × 0.6 + 0.8 × 0.4) / 0.52
≈ 0.846
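A quick Monte Carlo check (a minimal Python sketch, assuming independent rain on the two days) reproduces both answers:

import random

trials = 1_000_000
tennis = rain_and_tennis = 0
for _ in range(trials):
    rain_sat = random.random() < 0.8          # P(Rain Sat) = 0.8
    rain_sun = random.random() < 0.6          # P(Rain Sun) = 0.6
    if not (rain_sat and rain_sun):           # at least one dry day: tennis
        tennis += 1
        if rain_sat or rain_sun:              # it rained at some point
            rain_and_tennis += 1

print(tennis / trials)                        # ≈ 0.52
print(rain_and_tennis / tennis)               # ≈ 0.846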

EE5907R: Pattern Recognition 8


An Interesting Example
You are a contestant in a game show, and the
game show host tells you there is a prize behind
one of the three doors you face. You have to
guess which door to open.
But when you make your guess, instead of
opening the door you picked, the game show
host opens a different door - one that he knows
has nothing behind it (the host will never reveal
the prize at this stage). So now you're down to
two doors. And the game show host says, "I'll
let you change your choice, if you want to.”

And the question is: do you change your guess, or keep your original choice? Does it make any difference?

EE5907R: Pattern Recognition 9


Monty Hall Problem

Explanations: http://www.youtube.com/watch?v=mhlc7peGlGg
http://www.youtube.com/watch?v=9vRUxbzJZ9Y
Try it out: http://math.ucsd.edu/~crypto/Monty/monty.html
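The simulation below (a small Python sketch of the standard setup: one prize, three doors, and a host who always opens an empty, unpicked door) shows why switching helps:

import random

def play(switch, trials=100_000):
    wins = 0
    for _ in range(trials):
        prize = random.randrange(3)
        pick = random.randrange(3)
        # Host opens a door that is neither your pick nor the prize.
        opened = next(d for d in range(3) if d != pick and d != prize)
        if switch:  # move to the remaining unopened door
            pick = next(d for d in range(3) if d != pick and d != opened)
        wins += (pick == prize)
    return wins / trials

print(play(switch=False))   # ≈ 1/3: staying wins a third of the time
print(play(switch=True))    # ≈ 2/3: switching doubles your chances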

EE5907R: Pattern Recognition 10


Random Variables

EE5907R: Pattern Recognition 11


Two Types of Random Variables
• Discrete Random Variable
– has a countable number of values
– e.g., the outcome of rolling a die (any number from {1, 2, 3, 4, 5, 6})
– probability distribution defined by a probability mass function
• Continuous Random Variable
– has values that are continuous
– e.g., the weight of an individual (any real number within the range of human weight)
– probability distribution defined by a probability density function

EE5907R: Pattern Recognition 12


Statistical Characterization of RVs

EE5907R: Pattern Recognition 13


Cumulative Distribution Function

[Figures: CDF of a discrete RV (left); CDF of a continuous RV (right)]

EE5907R: Pattern Recognition 14


Properties of CDF

Questions about X can be asked in terms of the CDF:

  P(a < X ≤ b) = F(b) − F(a)

P( a person’s weight is between 100 and 200 ) = F(200) − F(100)
EE5907R: Pattern Recognition 15
Discrete Random Variable:
Probability Mass Function
• Given a discrete random variable X, the probability mass function is defined as

  P(a) = P(X = a)

• Satisfies all axioms of probability

• CDF satisfies

  F(a) = P(X ≤ a) = Σ_{k ≤ a} P(X = k)

EE5907R: Pattern Recognition 16


Continuous Random Variable:
Probability Density Function
• PDF is the derivative of the CDF: f(x) = dF(x)/dx

• CDF satisfies

  F(a) = P(X ≤ a) = ∫_{−∞}^{a} f(x) dx

• General usage

  P(a < X ≤ b) = ∫_{a}^{b} f(x) dx
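These relations are easy to verify numerically. A minimal sketch (assuming NumPy/SciPy are available, using the standard normal density as f):

from scipy import stats, integrate

a, b = -1.0, 2.0
area, _ = integrate.quad(stats.norm.pdf, a, b)        # ∫_a^b f(x) dx
cdf_diff = stats.norm.cdf(b) - stats.norm.cdf(a)      # F(b) − F(a)
print(area, cdf_diff)                                 # both ≈ 0.8186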

EE5907R: Pattern Recognition 17


Random Vectors

• A random vector is a vector whose components are random variables, x = (x1, x2, ..., xd)ᵀ.

– Example: when trying to recognize an aircraft, we may consider its shape, size, and color.

EE5907R: Pattern Recognition 18


Covariance Matrix

C = E[(x − μ)(x − μ)ᵀ], with entries c_ij = cov(x_i, x_j)

• Symmetric: c_ij = c_ji
• Positive semi-definite:
  • eigenvalues are nonnegative
  • determinant is nonnegative, |C| ≥ 0
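Both properties are easy to check on a sample covariance matrix; a short NumPy sketch (with made-up data):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))               # 500 samples, 3 features
C = np.cov(X, rowvar=False)                 # 3×3 sample covariance matrix

print(np.allclose(C, C.T))                  # symmetric: True
print(np.all(np.linalg.eigvalsh(C) >= 0))   # eigenvalues nonnegative: True
print(np.linalg.det(C) >= 0)                # |C| ≥ 0: True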

EE5907R: Pattern Recognition 19


Covariance Matrix: Quiz
You are given the heights and weights of a certain set of individuals, in unknown units. Which one of the following four matrices is the most likely to be the sample covariance matrix?

[Figure: four candidate matrices, annotated — one is asymmetric; one implies height and weight are uncorrelated; one implies height and weight are over-correlated]

The answer is C. Why?

EE5907R: Pattern Recognition 20
The Normal or Gaussian Distribution
of a Random Variable
• Probability density function:

  p(x) = (1 / (√(2π) σ)) exp( −(1/2) ((x − μ)/σ)² )

• μ = mean (or expected value) of x
• σ² = expected squared deviation, or variance
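As a sanity check, the formula can be evaluated directly and compared against a library implementation (a sketch assuming NumPy/SciPy; μ and σ chosen arbitrarily):

import numpy as np
from scipy import stats

mu, sigma = 1.0, 2.0
x = np.linspace(-5.0, 7.0, 25)
p = np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (np.sqrt(2 * np.pi) * sigma)
print(np.allclose(p, stats.norm.pdf(x, loc=mu, scale=sigma)))   # True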

EE5907R: Pattern Recognition 21


Multivariate Gaussian (Random
Vector)
• Probability density function:

  p(x) = (1 / ((2π)^(d/2) |Σ|^(1/2))) exp( −(1/2) (x − μ)ᵀ Σ⁻¹ (x − μ) )

• μ = mean vector
• Σ = covariance matrix (d is the dimension of x)
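The same check works in the multivariate case (a sketch with an arbitrary 2-D example):

import numpy as np
from scipy import stats

mu = np.array([0.0, 1.0])
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
x = np.array([1.0, 2.0])

d = len(mu)
diff = x - mu
quad = diff @ np.linalg.inv(Sigma) @ diff            # (x−μ)ᵀ Σ⁻¹ (x−μ)
const = (2 * np.pi) ** (d / 2) * np.sqrt(np.linalg.det(Sigma))
p = np.exp(-0.5 * quad) / const
print(np.isclose(p, stats.multivariate_normal(mu, Sigma).pdf(x)))  # True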

EE5907R: Pattern Recognition 22


Why Gaussian?
• The parameters (μ, Σ) are sufficient to uniquely characterize the distribution.
• If the xi's are mutually uncorrelated, then they are also independent.
• Practical, and nice to work with:
– The marginal and conditional densities are also Gaussian.
– Any linear transformation of N jointly Gaussian RVs results in N RVs that are also jointly Gaussian.

EE5907R: Pattern Recognition 23


Review of Linear Algebra
• Vectors
• Products and norms
• Linear Dependence and Independence
• Vector spaces and basis
• Matrices
• Linear transformations
• Eigenvalues and eigenvectors

EE5907R: Pattern Recognition 24


Vectors
• An n-dimensional column vector and its transpose (row vector) are represented as

  x = [x1, x2, ..., xn]ᵀ  and  xᵀ = [x1, x2, ..., xn]

• The inner product (dot product or scalar product) of two vectors:

  ⟨x, y⟩ = xᵀ y = Σ_{i=1}^{n} x_i y_i

EE5907R: Pattern Recognition 25


Vectors (Cont’d)
• Euclidean norm or length: |a| = √(a1² + a2² + a3²)  (in general, √(Σ_i a_i²))

• Normalized (unit) vector: a/|a|, which satisfies |a/|a|| = 1

• Angle between vectors x and y: cos θ = ⟨x, y⟩ / (|x| |y|)

EE5907R: Pattern Recognition 26


Vectors (Cont’d)
• Two vectors x and y are
– orthogonal if cos θ = 0, i.e., ⟨x, y⟩ = 0
– orthonormal if they are orthogonal and |x| = |y| = 1

• Euclidean distance between vectors x and y:

  |x − y| = √( Σ_{i=1}^{n} (x_i − y_i)² )
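These vector operations map directly onto NumPy (a brief sketch with arbitrary vectors):

import numpy as np

x = np.array([1.0, 2.0, 2.0])
y = np.array([2.0, 0.0, 1.0])

print(np.linalg.norm(x))                       # Euclidean norm |x| = 3.0
print(np.linalg.norm(x / np.linalg.norm(x)))   # unit vector has length 1
cos_theta = x @ y / (np.linalg.norm(x) * np.linalg.norm(y))
print(cos_theta)                               # cos of the angle between x and y
print(np.linalg.norm(x - y))                   # Euclidean distance |x − y|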
EE5907R: Pattern Recognition 27
Linear Dependence and
Independence
• Vectors x1, x2, ..., xn are linearly dependent if there exists a set of coefficients a1, a2, ..., an (at least one ai ≠ 0) such that

  a1 x1 + a2 x2 + ... + an xn = 0

• Vectors x1, x2, ..., xn are linearly independent if the above holds only when a1 = a2 = ... = an = 0

EE5907R: Pattern Recognition 28


Vector Spaces and Basis
• Vector Space:
– The n-dimensional space in which all the n-dimensional vectors reside
• Basis:
– A set of vectors {u1, u2, ..., un} is called a basis for a vector space if any vector x can be written as a linear combination of the {ui}.
– u1, u2, ..., un being independent implies they form a basis, and vice versa.
– A basis {ui} is orthonormal if the basis vectors are pairwise orthogonal and have unit length, i.e., ⟨ui, uj⟩ = 0 for i ≠ j and |ui| = 1.

EE5907R: Pattern Recognition 29


Matrices
• An n by d matrix A and its transpose Aᵀ (which is d by n, with (Aᵀ)_ij = a_ji)

• Product of two matrices: for A (n×m) and B (m×d), C = AB is n×d with

  c_ij = Σ_{k=1}^{m} a_ik b_kj

EE5907R: Pattern Recognition 30


Matrices (Cont’d)
• Determinant of a square d×d matrix A:

  |A| = Σ_{k=1}^{d} (−1)^{i+k} a_ik |A_ik|   (expansion along any row i)

– the minor matrix A_ik is formed by removing the ith row and the kth column of A
– Its transpose has the same determinant: |A| = |Aᵀ|

• Trace: sum of the diagonal elements, tr(A) = Σ_i a_ii

EE5907R: Pattern Recognition 31


Matrices (Cont’d)
• Rank: the number of linearly independent rows (or columns)
• Singular and non-singular
– A singular matrix has a zero determinant
– A non-singular matrix has a non-zero determinant; equivalently, its rank equals the number of rows (or columns)
• Identity matrix I: ones on the diagonal and zeros elsewhere, satisfying IA = AI = A
• Symmetric: A = Aᵀ
EE5907R: Pattern Recognition 32
Matrices (Cont’d)
• For a square matrix A
– Orthonormal: AAᵀ = AᵀA = I
– Positive definite: xᵀAx > 0 for all x ≠ 0
– Positive semi-definite: xᵀAx ≥ 0 for all x

EE5907R: Pattern Recognition 33


Matrices (Cont’d)
• Inverse
– The inverse of a square matrix A is A⁻¹, with

  AA⁻¹ = A⁻¹A = I

– The inverse A⁻¹ exists if and only if A is non-singular, i.e., |A| ≠ 0
• Pseudo-inverse
– A† = [AᵀA]⁻¹Aᵀ, with A†A = I
– Assuming AᵀA is non-singular
– Used whenever A⁻¹ does not exist, i.e., A is not square or A is singular
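A small NumPy sketch (with a made-up non-square A) confirms the construction and matches NumPy's built-in pseudo-inverse:

import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 2.0],
              [1.0, 1.0]])                    # 3×2: A⁻¹ does not exist

A_dag = np.linalg.inv(A.T @ A) @ A.T          # A† = [AᵀA]⁻¹Aᵀ
print(np.allclose(A_dag @ A, np.eye(2)))      # A†A = I: True
print(np.allclose(A_dag, np.linalg.pinv(A)))  # matches np.linalg.pinv: True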
EE5907R: Pattern Recognition 34
Linear Transformations
• Mapping from vector space Xᴺ to vector space Yᴹ, represented by an M×N matrix A:

  y = Ax

• Note that
– The dimensionality of the two spaces does not need to be the same.
– For pattern recognition, typically M < N, i.e., we project onto a lower-dimensional space.
EE5907R: Pattern Recognition 35
Eigenvectors and Eigenvalues
• Definition: v (v ≠ 0) is an eigenvector of matrix A if there exists a scalar λ (the eigenvalue) such that

  Av = λv

• Computation: solve the characteristic equation |A − λI| = 0 for the eigenvalues λ, then solve (A − λI)v = 0 for the eigenvectors v
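In practice the computation is delegated to a library; a minimal NumPy sketch (with an arbitrary symmetric matrix):

import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])                    # real and symmetric

eigvals, eigvecs = np.linalg.eig(A)           # eigenvalues 3 and 1
v, lam = eigvecs[:, 0], eigvals[0]
print(np.allclose(A @ v, lam * v))            # Av = λv: True
print(np.isclose(eigvecs[:, 0] @ eigvecs[:, 1], 0))  # orthogonal eigenvectors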

EE5907R: Pattern Recognition 36


Eigenvectors and Eigenvalues
• Properties
– If A is non-singular
  • All eigenvalues are non-zero.
– If A is real and symmetric
  • All eigenvalues are real.
  • The eigenvectors associated with distinct eigenvalues are orthogonal.
– If A is positive definite
  • All eigenvalues are positive.

EE5907R: Pattern Recognition 37


Eigenvectors and Eigenvalues
• Interpretation: an eigenvector represents an invariant direction in the vector space
– any point lying on the direction defined by v remains on that direction under the transformation A
– its magnitude is multiplied by the corresponding eigenvalue λ

EE5907R: Pattern Recognition 38


Eigenvectors and Eigenvalues
• For a Gaussian distribution
– The eigenvectors of Σ are the principal directions.
– The eigenvalues are the variances along those directions.

[Figure: projection of the data onto the principal directions]
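A short sketch (sampling synthetic 2-D Gaussian data with NumPy) illustrates this: the eigenvectors of the sample covariance give the principal directions, and projecting onto them decorrelates the data:

import numpy as np

rng = np.random.default_rng(0)
Sigma = np.array([[3.0, 1.0],
                  [1.0, 2.0]])
X = rng.multivariate_normal(mean=[0.0, 0.0], cov=Sigma, size=5000)

C = np.cov(X, rowvar=False)                 # sample covariance ≈ Σ
eigvals, eigvecs = np.linalg.eigh(C)        # principal directions and variances
proj = X @ eigvecs                          # project onto principal directions
print(np.cov(proj, rowvar=False).round(2))  # ≈ diag(eigvals): decorrelated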

EE5907R: Pattern Recognition 39


Readings
1. Review Appendix A ("Mathematical Foundations") of the DHS book (Duda, Hart & Stork, Pattern Classification)

EE5907R: Pattern Recognition 40
