MLE and Statistical Inference for Engineers

This document covers classical statistical inference, focusing on maximum likelihood estimation (MLE) for Bernoulli random variables and mixtures of normal distributions. It includes examples of estimating parameters for biased coins, independent tosses, and student height distributions. Additionally, it discusses the comparison between MLE and Bayesian approaches in estimating probabilities and parameters.


Statistical Foundation for Electrical Engineers

(EE343)

Unit-8 Classical Statistical Inference

Tutorial 8: Classical Statistical Inference

Krishnan C.M.C
Assistant Professor, E&E,
NITK Surathkal
MLE
T8-P1: Estimate the mean of a Bernoulli RV – Using the MLE, find the probability of getting heads for a biased coin based on multiple, independent coin flips.
Let there be $n$ coin flips and let $X_i$ represent the indicator RV for the $i$th flip, with the unknown parameter $\theta = P(X_i = 1)$.

Then, the likelihood function is given by
$$f_X(x;\theta) = \binom{n}{k}\,\theta^k (1-\theta)^{n-k}$$
Here $k = \sum_{i=1}^{n} x_i$ and $X = (X_1, X_2, \ldots, X_n)$. The log-likelihood function is given by
$$\log f_X(x;\theta) = \log\binom{n}{k} + k\log\theta + (n-k)\log(1-\theta)$$

Differentiating w.r.t. $\theta$ and equating to zero, we have
$$\frac{\partial}{\partial\theta}\log f_X(x;\theta) = 0 + \frac{k}{\theta} - \frac{n-k}{1-\theta} = 0 \quad\Rightarrow\quad \hat{\theta}_n = \frac{k}{n}$$
Thus the MLE estimator is given by
$$\hat{\Theta}_n = \frac{\mathcal{K}}{n}, \qquad \mathcal{K} = \sum_{i=1}^{n} X_i$$
To note:
• $\hat{\Theta}_n$ is in fact the sample mean.
• It is unbiased and consistent.
HW: compare it with the similar scenario in the Bayesian framework with a flat prior (fair coin). You will obtain $\hat{\Theta} = (k+1)/(n+2)$ (this is the posterior mean; note that the MAP estimate under a flat prior coincides with the MLE). So, both converge asymptotically!
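As a quick numerical illustration, here is a minimal Python sketch (assuming NumPy; the true $\theta$ and $n$ are chosen purely for illustration) that simulates flips and compares the two estimates:

```python
import numpy as np

rng = np.random.default_rng(0)
theta_true = 0.7   # illustrative "biased coin" heads probability (assumption)
n = 1000           # number of independent flips (assumption)

flips = rng.random(n) < theta_true   # Bernoulli(theta_true) indicator samples
k = flips.sum()

theta_mle = k / n                 # ML estimate: the sample mean
theta_bayes = (k + 1) / (n + 2)   # flat-prior posterior mean, (k+1)/(n+2)

print(f"MLE = {theta_mle:.4f}, Bayes = {theta_bayes:.4f}, true = {theta_true}")
# For large n the two estimates agree, illustrating the asymptotic convergence.
```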
T8-P2: Consider a sequence of independent tosses and let 𝜃 be the probability
of heads at each toss.
(a) Fix some $k$ and let $N$ be the number of tosses until the $k$th head occurs. Find the ML estimate of $\theta$ (call it $\hat{\Theta}_1$) based on the observation $N$.
(b) Compare it with the case where you fix $n$ and let $\mathcal{K}$ be the number of heads in these $n$ tosses. Call this estimate of $\theta$ $\hat{\Theta}_2$. See the previous problem P1.
$N$ is a RV, and we say that at the observation $N = n$ we obtain the $k$th head.
This is like a binomial RV, except that the $\binom{n}{k}$ coefficient is replaced by $\binom{n-1}{k-1}$ (the $n$th toss must itself be the $k$th head). Then, the log-likelihood function is given by
$$\log f_N(n;\theta) = \log\binom{n-1}{k-1} + k\log\theta + (n-k)\log(1-\theta)$$
Differentiating w.r.t. $\theta$ and equating to zero, we will get the same result as P1. That is,
$$\hat{\Theta}_1 = \frac{k}{N}$$
whereas from P1 we have
$$\hat{\Theta}_2 = \frac{\mathcal{K}}{n}, \qquad \mathcal{K} = \sum_{i=1}^{n} X_i$$
Comparison:
• For $\hat{\Theta}_1$ the term $N$ is the RV, whereas for $\hat{\Theta}_2$ it is the number of heads!
• $\hat{\Theta}_1$ is biased and $\hat{\Theta}_2$ is unbiased.
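A quick simulation makes the bias visible. This Python sketch (assuming NumPy; $\theta$, $k$, $n$, and the trial count are illustrative) shows that $k/N$ averaged over many runs overshoots $\theta$, while $\mathcal{K}/n$ averages to $\theta$:

```python
import numpy as np

rng = np.random.default_rng(1)
theta, k, n, trials = 0.3, 5, 50, 200_000  # illustrative values

# Estimator 1: toss until the k-th head; N is the random quantity.
# N = k + (number of tails before the k-th head), via the negative binomial.
N = k + rng.negative_binomial(k, theta, size=trials)
est1 = k / N

# Estimator 2: fix n tosses; K (the number of heads) is the random quantity.
K = rng.binomial(n, theta, size=trials)
est2 = K / n

print(f"E[k/N] ≈ {est1.mean():.4f}  (biased: exceeds theta = {theta})")
print(f"E[K/n] ≈ {est2.mean():.4f}  (unbiased: ≈ theta = {theta})")
```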
T8-P3: Let the PDF of a random variable X be the mixture of m components:
$$f_X(x) = \sum_{j=1}^{m} p_j\, f_{Y_j}(x); \qquad \sum_{j=1}^{m} p_j = 1 \ \text{ and }\ p_j \ge 0 \ \text{ for } 1 \le j \le m$$
Assume that each $Y_j \sim \mathcal{N}(\mu_j, \sigma_j^2)$ and that we have a set of observations $X = (X_1, X_2, \ldots, X_n)$, each entry independent with PDF $f_X(x)$.
(a) Write down the likelihood and log-likelihood functions

Likelihood function:
$$f_X(x;\mu,\sigma^2) = \prod_{i=1}^{n} \sum_{j=1}^{m} \frac{p_j}{\sigma_j\sqrt{2\pi}}\, e^{-(x_i-\mu_j)^2/2\sigma_j^2}$$

Log-likelihood function:
$$\log f_X(x;\mu,\sigma^2) = \sum_{i=1}^{n} \log\left( \sum_{j=1}^{m} \frac{p_j}{\sigma_j\sqrt{2\pi}}\, e^{-(x_i-\mu_j)^2/2\sigma_j^2} \right)$$
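The log-likelihood is straightforward to evaluate numerically. A minimal Python sketch (assuming NumPy; the parameter values in the example call are illustrative):

```python
import numpy as np

def mixture_loglik(x, p, mu, sigma):
    """Log-likelihood of i.i.d. samples x under an m-component Gaussian mixture.

    p, mu, sigma: length-m arrays of mixture weights, means, and std devs.
    """
    x = np.asarray(x, dtype=float)[:, None]   # shape (n, 1) for broadcasting
    comp = (p / (sigma * np.sqrt(2 * np.pi))
            * np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)))  # shape (n, m)
    return np.log(comp.sum(axis=1)).sum()     # sum_i log sum_j p_j f_{Y_j}(x_i)

# Illustrative call with two equally weighted components
x = [164, 167, 163, 158, 170]
print(mixture_loglik(x, p=np.array([0.5, 0.5]),
                     mu=np.array([156.0, 174.0]),
                     sigma=np.array([5.0, 5.0])))
```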



(b) Consider the case 𝑚 = 2 and 𝑛 = 1 , and assume that 𝜇1 , 𝜇2 , 𝜎1 & 𝜎2 are
known. Find the ML estimates of 𝑝1 & 𝑝2 .
Log-likelihood function (with $p_2 = 1 - p_1$):
$$\log f_X(x;\mu,\sigma^2) = \log\Big( p_1 \underbrace{\tfrac{1}{\sigma_1\sqrt{2\pi}}\, e^{-(x-\mu_1)^2/2\sigma_1^2}}_{c_1} + (1-p_1) \underbrace{\tfrac{1}{\sigma_2\sqrt{2\pi}}\, e^{-(x-\mu_2)^2/2\sigma_2^2}}_{c_2} \Big)$$
The argument of the log, $p_1(c_1 - c_2) + c_2$, is linear in $p_1$, so over $0 \le p_1 \le 1$ the maximum sits at an endpoint: either $\hat{p}_1^{ML} = 0$ or $\hat{p}_1^{ML} = 1$. That is,
$$\hat{p}_1^{MLE} = \begin{cases} 1 & \text{if } \dfrac{1}{\sigma_1\sqrt{2\pi}}\, e^{-(x-\mu_1)^2/2\sigma_1^2} > \dfrac{1}{\sigma_2\sqrt{2\pi}}\, e^{-(x-\mu_2)^2/2\sigma_2^2} \\ 0 & \text{otherwise} \end{cases}$$
and $\hat{p}_2^{MLE} = 1 - \hat{p}_1^{MLE}$.
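Since the objective is monotone in $p_1$, the ML estimate simply picks the component whose density at the single observation is larger. A Python sketch (assuming NumPy; the parameter values in the example call are illustrative):

```python
import numpy as np

def p1_mle(x, mu1, sigma1, mu2, sigma2):
    """ML estimate of p1 for a 2-component mixture from a single sample x."""
    def normal_pdf(x, mu, sigma):
        return np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))
    c1 = normal_pdf(x, mu1, sigma1)   # component-1 density at x
    c2 = normal_pdf(x, mu2, sigma2)   # component-2 density at x
    return 1.0 if c1 > c2 else 0.0    # endpoint of the linear-in-p1 objective

print(p1_mle(170, mu1=174, sigma1=5, mu2=156, sigma2=5))  # -> 1.0
```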
(c) Consider the case 𝑚 = 2 and 𝑛 = 1 , and assume that 𝑝1 , 𝑝2 , 𝜎1 & 𝜎2 are
known. Find the ML estimates of 𝜇1 & 𝜇2 .
Log-likelihood function:
$$\log f_X(x;\mu,\sigma^2) = \log\left( \frac{p_1}{\sigma_1\sqrt{2\pi}}\, e^{-(x-\mu_1)^2/2\sigma_1^2} + \frac{1-p_1}{\sigma_2\sqrt{2\pi}}\, e^{-(x-\mu_2)^2/2\sigma_2^2} \right)$$

We need to maximize the term inside the parentheses w.r.t. $\mu_1$ and then $\mu_2$. By inspection (without doing the differentiation): each summand depends on only one of the two means and is largest when its exponent $-(x-\mu_j)^2/2\sigma_j^2$ equals zero. Thus the ML estimates are nothing but
$$\hat{\mu}_1^{MLE} = \hat{\mu}_2^{MLE} = x$$
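A quick numerical cross-check (a Python sketch using scipy.optimize, assuming SciPy is available; the parameter values are illustrative) confirms the maximizer is at $\mu_1 = \mu_2 = x$:

```python
import numpy as np
from scipy.optimize import minimize

x, p1, s1, s2 = 170.0, 0.4, 5.0, 8.0   # illustrative known values

def neg_loglik(mu):
    mu1, mu2 = mu
    f = (p1 / (s1 * np.sqrt(2 * np.pi)) * np.exp(-(x - mu1) ** 2 / (2 * s1 ** 2))
         + (1 - p1) / (s2 * np.sqrt(2 * np.pi)) * np.exp(-(x - mu2) ** 2 / (2 * s2 ** 2)))
    return -np.log(f)   # minimize the negative log-likelihood

print(minimize(neg_loglik, x0=[160.0, 180.0]).x)  # -> approximately [170, 170]
```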



T8-P4: Consider a study of student heights in a batch. Assume that the
height of a S1 student is normally distributed with mean 𝜇1 and variance 𝜎12 ,
and that the height of a S2 student is normally distributed with mean 𝜇2 and
variance 𝜎22 . Assume that a student is equally likely to be from S1 or S2. A
sample of size 𝑛 = 10 was collected and the following values were recorded
(in centimeters) : 164, 167, 163, 158, 170, 183, 176, 159, 170, 167
(a) Assume that 𝜇1 , 𝜇2 , 𝜎1 & 𝜎2 are unknown and write down the likelihood
function
$$f_X(x;\mu_1,\mu_2,\sigma_1,\sigma_2) = \prod_{i=1}^{10} \left( \frac{0.5}{\sigma_1\sqrt{2\pi}}\, e^{-(x_i-\mu_1)^2/2\sigma_1^2} + \frac{0.5}{\sigma_2\sqrt{2\pi}}\, e^{-(x_i-\mu_2)^2/2\sigma_2^2} \right)$$

(b) Assume that we know the common variance $\sigma_1^2 = \sigma_2^2$, and find the ML estimates of $\mu_1$ & $\mu_2$ numerically.
We need to write a program that searches for this maximum. Using Matlab and a brute-force (fine grid) search, the log-likelihood peaks at $\mu_1 \approx 174$, $\mu_2 \approx 156$:
$$\hat{\mu}_1^{MLE} = 174, \qquad \hat{\mu}_2^{MLE} = 156$$
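A grid-search sketch in Python (assuming NumPy, and with $\sigma^2 = 25$ chosen purely for illustration, since the known variance value did not survive extraction) shows the kind of search described:

```python
import numpy as np

x = np.array([164, 167, 163, 158, 170, 183, 176, 159, 170, 167], float)
sigma2 = 25.0  # illustrative assumption; the slide's known common variance is not recoverable here

def loglik(mu1, mu2):
    c = 1.0 / np.sqrt(2 * np.pi * sigma2)
    f = (0.5 * c * np.exp(-(x - mu1) ** 2 / (2 * sigma2))
         + 0.5 * c * np.exp(-(x - mu2) ** 2 / (2 * sigma2)))
    return np.log(f).sum()

grid = np.arange(150.0, 190.0, 0.5)           # fine grid over plausible heights
ll = np.array([[loglik(m1, m2) for m2 in grid] for m1 in grid])
i, j = np.unravel_index(ll.argmax(), ll.shape)
print(f"mu1_hat = {grid[i]}, mu2_hat = {grid[j]}")
# The surface is symmetric in (mu1, mu2); the slide reports a peak near (174, 156),
# and the exact grid maximizer depends on the assumed variance.
```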
(c) Treating the estimates obtained in part (b) as exact values, describe the
MAP rule for deciding a student’s Section based on the student's height

Bayesian framework with $\Theta \in \{S1, S2\}$ and prior $p_\Theta(S1) = p_\Theta(S2) = 0.5$ (equally likely).
$$f_{X|\Theta}(x|S1) = \mathcal{N}(\mu_1, \sigma_1^2), \qquad f_{X|\Theta}(x|S2) = \mathcal{N}(\mu_2, \sigma_2^2)$$
$$p_{\Theta|X}(S1|x) = \frac{f_{X|\Theta}(x|S1)}{f_{X|\Theta}(x|S1) + f_{X|\Theta}(x|S2)}, \qquad p_{\Theta|X}(S2|x) = \frac{f_{X|\Theta}(x|S2)}{f_{X|\Theta}(x|S1) + f_{X|\Theta}(x|S2)}$$

$$\hat{\Theta}_{MAP} = \begin{cases} S1 & \text{if } f_{X|\Theta}(x|S1) > f_{X|\Theta}(x|S2) \\ S2 & \text{if } f_{X|\Theta}(x|S1) < f_{X|\Theta}(x|S2) \end{cases}$$
Since $\sigma_1^2 = \sigma_2^2$, the two densities cross midway between the means ($\mu_1 \approx 174$, $\mu_2 \approx 156$), so the rule is a simple threshold at
$$x = \frac{174 + 156}{2} = 165 \text{ cm}$$
decide S1 if the observed height exceeds 165 cm, and S2 otherwise.
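With equal priors and equal variances, the MAP rule reduces to this midpoint threshold. A minimal Python sketch (the function name is illustrative):

```python
def map_section(height_cm, mu1=174.0, mu2=156.0):
    """MAP decision with equal priors and equal variances:
    pick the section whose mean is closer to the observed height."""
    threshold = (mu1 + mu2) / 2   # = 165 cm
    return "S1" if height_cm > threshold else "S2"

for h in [160, 165, 172]:
    print(h, "->", map_section(h))  # 160 -> S2, 165 -> S2, 172 -> S1
```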
