
Parameter Estimation

Nathaniel E. Helwig

Associate Professor of Psychology and Statistics


University of Minnesota

August 30, 2020

Copyright © 2020 by Nathaniel E. Helwig



Table of Contents

1. Parameters and Statistics

2. Sampling Distribution

3. Estimates and Estimators

4. Quality of Estimators

5. Estimation Frameworks



Parameters and Statistics

Probability Distribution Reminders

A random variable X has a cumulative distribution function (CDF) denoted by F(x) = P(X ≤ x) that describes the probabilistic nature of the random variable X.

F(·) has an associated probability mass function (PMF) or probability density function (PDF) denoted by f(x).
• PMF: f(x) = P(X = x) for discrete variables
• PDF: ∫_a^b f(x) dx = P(a < X < b) for continuous variables

The functions F (·) and f (·) are typically assumed to depend on a finite
number of parameters, where a parameter θ = t(F ) is some function of
the probability distribution.



Parameters and Statistics

Inferences and Statistics

Given a sample of n independent and identically distributed (iid) observations from some distribution F, inferential statistical analyses are concerned with inferring things about the population from which the sample was collected.

To form inferences, researchers often make assumptions about the form of F, e.g., F is a normal distribution, and then use the sample of data to form educated guesses about the population parameters.

Given a sample of data x = (x1, . . . , xn)⊤, a statistic T = s(x) is some function of the sample of data. Not all statistics are created equal. . .
• Some are useful for estimating parameters or testing hypotheses



Sampling Distribution

Statistics are Random Variables

Assume that xi ∼iid F for i = 1, . . . , n, where the notation ∼iid denotes that the xi are iid observations from the distribution F.
• x = (x1, . . . , xn)⊤ denotes the sample of data as an n × 1 vector

Each xi is assumed to be an independent realization of a random variable X ∼ F, so any valid statistic T = s(x) will be a random variable with a probability distribution.
• By “valid” I mean that T must depend on the xi values

The sampling distribution of a statistic T = s(x) refers to the probability distribution of T.



Sampling Distribution

Sampling Distribution Properties

Suppose that we collect R independent realizations of the vector x, and let Tr = s(xr) denote the r-th realization of the statistic. The sampling distribution is the probability distribution of {Tr}_{r=1}^R as the number of independent realizations R → ∞.

The sampling distribution depends on the distribution of data.
• If xi ∼iid F and yi ∼iid G, then the statistics T = s(x) and U = s(y) will have different sampling distributions if F and G are different.

Sometimes the sampling distribution will be known as n → ∞.


• CLT or asymptotic normality of MLEs
• Question of interest is: how large does n need to be?
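
One way to get a feel for a sampling distribution is to simulate it. Below is a minimal sketch (assuming Python with numpy, and illustrative values for n, R, and the normal parameters) that draws R independent samples of size n and collects the resulting sample means.

```python
import numpy as np

rng = np.random.default_rng(0)
n, R = 25, 10_000                # sample size and number of realizations
mu, sigma = 5.0, 2.0             # illustrative parameters of the assumed normal F

# draw R independent samples of size n and compute T_r = s(x_r) = sample mean
samples = rng.normal(mu, sigma, size=(R, n))
T = samples.mean(axis=1)         # the R realizations {T_r}

# the empirical distribution of T approximates the sampling distribution of x-bar
print(T.mean(), T.std())         # roughly mu and sigma / sqrt(n)
```

Replacing the mean with any other statistic (e.g., the median) gives an empirical approximation to that statistic's sampling distribution.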



Estimates and Estimators

Definition of Estimates and Estimators

Given a sample of data x1, . . . , xn where xi ∼iid F, an estimate of a parameter θ = t(F) is some function of the sample θ̂ = g(x) that is meant to approximate θ.

An estimator refers to the function g(·) that is applied to the sample to obtain the estimate θ̂.

It is standard notation in statistics to place a “hat” (i.e., ˆ) on top of the parameter to denote that θ̂ is an estimate of θ.
• θ̂ should be read as “theta hat”
• You should interpret θ̂ as some estimate of the parameter θ



Estimates and Estimators

Examples of Estimates and Estimators

Example. Suppose that we have a sample of data x1, . . . , xn where xi ∼iid F, which denotes any generic distribution, and the population mean µ = E(X) is the parameter of interest. The sample mean x̄ = (1/n) Σ_{i=1}^n xi provides an estimate of the parameter µ, so we could also write it as x̄ = µ̂.

Example. Similarly, suppose that we have a sample of data x1, . . . , xn where xi ∼iid F and the population variance σ² = E[(X − µ)²] is the parameter of interest. The sample variance s² = (1/(n−1)) Σ_{i=1}^n (xi − x̄)² provides an estimate of the parameter σ², so we could also write it as s² = σ̂². Another reasonable estimate would be s̃² = (1/n) Σ_{i=1}^n (xi − x̄)².
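
To make the notation concrete, here is a minimal sketch (assuming Python with numpy and an illustrative simulated sample) that computes x̄, s², and s̃²:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(loc=10.0, scale=3.0, size=50)   # a sample x_1, ..., x_n

xbar = x.mean()              # estimate of mu
s2 = x.var(ddof=1)           # s^2: squared deviations divided by n - 1
s2_tilde = x.var(ddof=0)     # s~^2: divided by n instead

print(xbar, s2, s2_tilde)
```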



Quality of Estimators

Overview

Like statistics, not all estimators are created equal. Some estimators
produce “better” estimates of the intended population parameters.

There are several ways to talk about the “quality” of an estimator:


• its expected value (bias)
• its uncertainty (variance)
• both its bias and variance (MSE)
• its asymptotic properties (consistency)

MSE is typically the preferred way to measure an estimator’s quality.



Quality of Estimators

Bias of an Estimator
The bias of an estimator refers to the difference between the expected
value of the estimate θ̂ = g(x) and the parameter θ = t(F ), i.e.,

Bias(θ̂) = E(θ̂) − θ

where the expectation is calculated with respect to F .


• An estimator is “unbiased” if Bias(θ̂) = 0

Despite the negative connotations of the word “bias”, it is important to note that biased estimators can be a good thing (see Helwig, 2017).
• Ridge regression (Hoerl and Kennard, 1970)
• Least absolute shrinkage and selection operator (LASSO)
regression (Tibshirani, 1996)
• Elastic Net regression (Zou and Hastie, 2005)
Quality of Estimators

Bias Example 1: The Mean

Given a sample of data x1, . . . , xn where xi ∼iid F and F has mean µ = E(X), the sample mean x̄ = (1/n) Σ_{i=1}^n xi is an unbiased estimate of the population mean µ.

To prove that x̄ is an unbiased estimator, we can use the expectation rules from the Introduction to Random Variables chapter. Specifically, note that E(x̄) = (1/n) Σ_{i=1}^n E(xi) = (1/n) Σ_{i=1}^n µ = µ.



Quality of Estimators

Bias Example 2: The Variance (part 1)

Given a sample of data x1, . . . , xn where xi ∼iid F and F has mean µ = E(X) and variance σ² = E[(X − µ)²], the sample variance s² = (1/(n−1)) Σ_{i=1}^n (xi − x̄)² is an unbiased estimate of σ².

To prove that s² is unbiased, first note that

Σ_{i=1}^n (xi − x̄)² = Σ_{i=1}^n xi² − 2x̄ Σ_{i=1}^n xi + n x̄² = Σ_{i=1}^n xi² − n x̄²

which implies that E(s²) = (1/(n−1)) [ Σ_{i=1}^n E(xi²) − n E(x̄²) ].

Now note that σ² = E(xi²) − µ², which implies that E(xi²) = σ² + µ².



Quality of Estimators

Bias Example 2: The Variance (part 2)

Also, note that we can write

x̄² = ( (1/n) Σ_{i=1}^n xi )² = (1/n²) [ Σ_{i=1}^n xi² + 2 Σ_{i=2}^n Σ_{j=1}^{i−1} xi xj ]

and applying the expectation operator gives

E(x̄²) = (1/n²) Σ_{i=1}^n E(xi²) + (2/n²) Σ_{i=2}^n Σ_{j=1}^{i−1} E(xi) E(xj)
       = (1/n) (σ² + µ²) + ((n−1)/n) µ²

given that E(xi xj) = E(xi)E(xj) for all i ≠ j because xi and xj are independent, and Σ_{i=2}^n Σ_{j=1}^{i−1} µ² = (n(n−1)/2) µ².



Quality of Estimators

Bias Example 2: The Variance (part 3)

Putting all of the pieces together gives

E(s²) = (1/(n−1)) [ Σ_{i=1}^n E(xi²) − n E(x̄²) ]
      = (1/(n−1)) [ n(σ² + µ²) − (σ² + µ²) − (n−1)µ² ]
      = σ²

which completes the proof that E(s²) = σ².

This result can be used to show that s̃² = (1/n) Σ_{i=1}^n (xi − x̄)² is biased:
• E(s̃²) = E( ((n−1)/n) s² ) = ((n−1)/n) E(s²) = ((n−1)/n) σ²
• (n−1)/n < 1 for any finite n, so s̃² has a downward bias
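
These two expectations are easy to check by simulation. The sketch below (assuming Python with numpy, with illustrative values of n, σ², and the number of replications) averages s² and s̃² over many simulated samples; the first average should be near σ² and the second near ((n−1)/n)σ².

```python
import numpy as np

rng = np.random.default_rng(2)
n, R = 10, 100_000
mu, sigma = 0.0, 2.0                      # so sigma^2 = 4

x = rng.normal(mu, sigma, size=(R, n))    # R independent samples of size n
s2 = x.var(axis=1, ddof=1)                # unbiased estimator s^2
s2_tilde = x.var(axis=1, ddof=0)          # biased estimator s~^2

print(s2.mean())                          # approximately sigma^2 = 4
print(s2_tilde.mean())                    # approximately (n-1)/n * sigma^2 = 3.6
```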



Quality of Estimators

Variance of an Estimator

The variance of an estimator refers to the second central moment of the estimator's probability distribution, i.e.,

Var(θ̂) = E[ (θ̂ − E(θ̂))² ]

where both expectations are calculated with respect to F.

The standard error of an estimator is the square root of the variance of the estimator, i.e., SE(θ̂) = Var(θ̂)^{1/2}.

We would like an estimator that is both reliable (low variance) and valid (low bias), but there is a trade-off between these two concepts.



Quality of Estimators

Variance of the Sample Mean

Given a sample of data x1, . . . , xn where xi ∼iid F and F has mean µ = E(X) and variance σ² = E[(X − µ)²], the sample mean x̄ = (1/n) Σ_{i=1}^n xi has a variance of Var(x̄) = σ²/n.

To prove that this is the variance of x̄, we can use the variance rules from the Introduction to Random Variables chapter, i.e.,

Var(x̄) = Var( (1/n) Σ_{i=1}^n xi ) = (1/n²) Σ_{i=1}^n Var(xi) = σ²/n

given that the xi are independent and Var(xi) = σ² for all i = 1, . . . , n.



Quality of Estimators

Variance of the Sample Variance


Given a sample of data x1, . . . , xn where xi ∼iid F and F has mean µ = E(X) and variance σ² = E[(X − µ)²], the sample variance s² = (1/(n−1)) Σ_{i=1}^n (xi − x̄)² has a variance of

Var(s²) = (1/n) [ µ₄ − ((n−3)/(n−1)) σ⁴ ]

where µ₄ = E[(X − µ)⁴] is the fourth central moment of X.


• The proof of this is too tedious to display on the slides
• Bonus points for anyone who can prove this formula
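
As a quick check of the formula: if F is a normal distribution, then µ₄ = 3σ⁴, and the expression simplifies to Var(s²) = (1/n)[ 3σ⁴ − ((n−3)/(n−1)) σ⁴ ] = 2σ⁴/(n−1), the familiar result for normal data.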

The above result can be used to show that
• Var(s̃²) = Var( ((n−1)/n) s² ) = ((n−1)²/n³) [ µ₄ − ((n−3)/(n−1)) σ⁴ ]



Quality of Estimators

Mean Squared Error of an Estimator

The mean squared error (MSE) of an estimator refers to the expected squared difference between the parameter θ = t(F) and the estimate θ̂ = g(x), i.e.,

MSE(θ̂) = E[ (θ̂ − θ)² ]

where the expectation is calculated with respect to F.

Although not obvious from its definition, MSE can be decomposed as

MSE(θ̂) = Bias(θ̂)² + Var(θ̂)

where the first term is squared bias and the second term is variance.



Quality of Estimators

MSE = Bias² + Variance

To prove this relationship holds for any estimator, first note that (θ̂ − θ)² = θ̂² − 2θ̂θ + θ², and applying the expectation operator gives

E[ (θ̂ − θ)² ] = E(θ̂²) − 2θE(θ̂) + θ²

given that the parameter θ is assumed to be an unknown constant.

Next, note that we can write the squared bias and variance as

Bias(θ̂)² = ( E(θ̂) − θ )² = E(θ̂)² − 2θE(θ̂) + θ²
Var(θ̂) = E(θ̂²) − E(θ̂)²

and adding these two terms together gives

Bias(θ̂)² + Var(θ̂) = E(θ̂)² − 2θE(θ̂) + θ² + E(θ̂²) − E(θ̂)²
                   = E(θ̂²) − 2θE(θ̂) + θ²

which is the form of the MSE given on the previous slide.
Quality of Estimators

Consistency of an Estimator

Given a sample of data x1, . . . , xn with xi ∼iid F, an estimator θ̂ = g(x) of a parameter θ = t(F) is said to be consistent if θ̂ →p θ as n → ∞.

The notation →p should be read as “converges in probability to”, which means that the probability that θ̂ differs from θ by more than any fixed amount goes to zero as n gets large.

Note that any reasonable estimator should be consistent. Otherwise, collecting more data will not result in better estimates.

All of the estimators that we've discussed (i.e., x̄, s², and s̃²) are consistent estimators.



Quality of Estimators

Efficiency of an Estimator

Given a sample of data x1, . . . , xn with xi ∼iid F, an estimator θ̂ = g(x) of a parameter θ = t(F) is said to be efficient if it is the best possible estimator for θ using some loss function.

The chosen loss function is often MSE, so the most efficient estimator
is the one with the smallest MSE compared to all other estimators of θ.

If you have two estimators θ̂1 = g1 (x) and θ̂2 = g2 (x), we would say
that θ̂1 is more efficient than θ̂2 if MSE(θ̂1 ) < MSE(θ̂2 ).
• If θ̂1 and θ̂2 are both unbiased, the most efficient estimator is the
one with the smallest variance
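
As a concrete illustration of comparing efficiency, the sketch below (plain Python, with a hypothetical helper name, using the bias and variance results from the earlier slides and the fact that µ₄ = 3σ⁴ for a normal distribution) computes MSE = Bias² + Var for s² and s̃²; for normal data the biased estimator s̃² turns out to have the smaller MSE.

```python
def mse_variance_estimators(n, sigma2):
    """MSE of s^2 and s~^2 for an iid normal sample, via MSE = Bias^2 + Var."""
    mu4 = 3.0 * sigma2**2                                   # normal fourth central moment
    var_s2 = (mu4 - (n - 3) / (n - 1) * sigma2**2) / n      # Var(s^2)
    var_s2_tilde = (n - 1)**2 / n**3 * (mu4 - (n - 3) / (n - 1) * sigma2**2)
    bias_s2_tilde = -sigma2 / n                             # E(s~^2) - sigma^2
    return var_s2, bias_s2_tilde**2 + var_s2_tilde          # MSE(s^2), MSE(s~^2)

print(mse_variance_estimators(n=10, sigma2=4.0))            # s~^2 has the smaller MSE
```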



Estimation Frameworks

Least Squares Estimation

A simple least squares estimate of a parameter θ = t(F) is the estimate θ̂ = g(x) that minimizes a least squares loss function of the form

Σ_{i=1}^n (h(xi) − θ)²

where h(·) is some user-specified function (typically h(x) = x).

Least squares estimation methods can work well for mean parameters
and regression coefficients, but will not work well for all parameters.
• Variance parameters are best estimated using other approaches



Estimation Frameworks

Least Squares Estimation Example


Given a sample of data x1, . . . , xn where xi ∼iid F, suppose that we want to find the least squares estimate of µ = E(X).

The least squares loss function is

LS(µ|x) = Σ_{i=1}^n (xi − µ)² = Σ_{i=1}^n xi² − 2µ Σ_{i=1}^n xi + nµ²

where x = (x1, . . . , xn) is the observed data vector.

Taking the derivative of the function with respect to µ gives

dLS(µ|x)/dµ = −2 Σ_{i=1}^n xi + 2nµ

and setting the derivative to 0 and solving for µ gives µ̂ = (1/n) Σ_{i=1}^n xi.
• The sample mean x̄ is the least squares estimate of µ
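
A quick numerical check (a sketch assuming Python with numpy and scipy, with an illustrative simulated sample) is to minimize the least squares loss directly and confirm that the minimizer matches x̄:

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(3)
x = rng.normal(loc=2.0, scale=1.0, size=100)

# least squares loss LS(mu | x) = sum_i (x_i - mu)^2
loss = lambda mu: np.sum((x - mu)**2)

result = minimize_scalar(loss)     # numerical minimizer of the loss
print(result.x, x.mean())          # the two values should agree
```
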
Estimation Frameworks

Method of Moments Estimation


Assume that X ∼ F where the probability distribution F depends on
parameters θ1 , . . . , θp .

Also, suppose that the first p moments of X can be written as

µj = E(X^j) = mj(θ1, . . . , θp)

where mj (·) is some known function for j = 1, . . . , p.

Given data xi ∼iid F for i = 1, . . . , n, the method of moments estimates of the parameters are the values θ̂1, . . . , θ̂p that solve the equations

µ̂j = mj(θ̂1, . . . , θ̂p)

where µ̂j = (1/n) Σ_{i=1}^n xi^j is the j-th sample moment for j = 1, . . . , p.
Estimation Frameworks

Method of Moments: Normal Distribution

Suppose that xi ∼iid N(µ, σ²) for i = 1, . . . , n. The first two moments of the normal distribution are µ₁ = µ and µ₂ = µ² + σ².

The first two sample moments are µ̂₁ = (1/n) Σ_{i=1}^n xi = x̄ and µ̂₂ = (1/n) Σ_{i=1}^n xi² = x̄² + s̃², where s̃² = (1/n) Σ_{i=1}^n (xi − x̄)².

Thus, the method of moments estimates of µ and σ² are given by µ̂ = x̄ and σ̂² = s̃².



Estimation Frameworks

Method of Moments: Uniform Distribution


Suppose that xi ∼iid U[a, b] for i = 1, . . . , n. The first two moments of the uniform distribution are µ₁ = (1/2)(a + b) and µ₂ = (1/3)(a² + ab + b²).

Solving the first equation gives b = 2µ₁ − a, and plugging this into the second equation gives µ₂ = (1/3)(a² − 2aµ₁ + 4µ₁²), which is a simple quadratic function of a.

Applying the quadratic formula gives a = µ₁ − √(3(µ₂ − µ₁²)), and plugging this into b = 2µ₁ − a produces b = µ₁ + √(3(µ₂ − µ₁²)).

Using µ̂₁ and µ̂₂ in these equations gives the method of moments estimates of a and b.
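
A short sketch of these calculations (assuming Python with numpy, with illustrative true values a = 2 and b = 7) plugs the sample moments into the closed-form expressions:

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.uniform(low=2.0, high=7.0, size=200)   # data with true a = 2, b = 7

m1 = x.mean()                   # first sample moment
m2 = np.mean(x**2)              # second sample moment

a_hat = m1 - np.sqrt(3.0 * (m2 - m1**2))   # method of moments estimate of a
b_hat = m1 + np.sqrt(3.0 * (m2 - m1**2))   # method of moments estimate of b
print(a_hat, b_hat)
```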



Estimation Frameworks

Likelihood Function and Log-Likelihood Function


Suppose that xi ∼iid F for i = 1, . . . , n where the distribution F depends on the vector of parameters θ = (θ1, . . . , θp)⊤.

The likelihood function has the form

L(θ|x) = ∏_{i=1}^n f(xi|θ)

where f(xi|θ) is the probability mass function (PMF) or probability density function (PDF) corresponding to the distribution function F.

The log-likelihood function is the logarithm of the likelihood function:

ℓ(θ|x) = log(L(θ|x)) = Σ_{i=1}^n log(f(xi|θ))

where log(·) = ln(·) is the natural logarithm function.


Estimation Frameworks

Maximum Likelihood Estimation


Suppose that xi ∼iid F for i = 1, . . . , n where the distribution F depends on the vector of parameters θ = (θ1, . . . , θp)⊤.

The maximum likelihood estimates (MLEs) are the parameter values that maximize the likelihood (or log-likelihood) function, i.e.,

θ̂MLE = arg max_{θ∈Θ} L(θ|x) = arg max_{θ∈Θ} ℓ(θ|x)

where Θ = Θ1 × · · · × Θp is the joint parameter space with Θj denoting the parameter space for the j-th parameter, i.e., θj ∈ Θj for all j.

Maximum likelihood estimates have desirable large sample properties:


• consistent: θ̂MLE → θ as n → ∞
• asymptotically efficient: Var(θ̂MLE ) ≤ Var(θ̂) as n → ∞
• functionally invariant: if θ̂MLE is the MLE of θ, then h(θ̂MLE ) is
the MLE of h(θ) for any continuous function h(·)
Estimation Frameworks

MLE for Normal Distribution

Suppose that xi ∼iid N(µ, σ²) for i = 1, . . . , n. Assuming that X ∼ N(µ, σ²), the probability density function can be written as

f(x|µ, σ²) = (1/√(2πσ²)) exp( −(1/(2σ²)) (x − µ)² )

This implies that the log-likelihood function has the form

ℓ(µ, σ²|x) = −(1/(2σ²)) Σ_{i=1}^n (xi − µ)² − (n/2) log(σ²) − c

where c = (n/2) log(2π) is a constant with respect to µ and σ².



Estimation Frameworks

MLE for Normal Distribution (part 2)

Maximizing ℓ(µ, σ²|x) with respect to µ is equivalent to minimizing

ℓ1(µ|x) = Σ_{i=1}^n (xi − µ)²

which is the least squares loss function that we encountered before.

We can use the same approach as before to derive the MLE:


• Take the derivative of `1 (µ|x) with respect to µ
• Equate the derivative to zero and solve for µ

The MLE of µ is the sample mean, i.e., µ̂MLE = x̄ = (1/n) Σ_{i=1}^n xi.



Estimation Frameworks

MLE for Normal Distribution (part 3)

Maximizing ℓ(µ, σ²|x) with respect to σ² is equivalent to minimizing

ℓ2(σ²|µ̂, x) = (1/σ²) Σ_{i=1}^n (xi − x̄)² + n log(σ²)

Taking the derivative of ℓ2(σ²|µ̂, x) with respect to σ² gives

dℓ2(σ²|µ̂, x)/dσ² = −(1/σ⁴) Σ_{i=1}^n (xi − x̄)² + n/σ²

Equating the derivative to zero and solving for σ² reveals that σ̂²MLE = s̃² = (1/n) Σ_{i=1}^n (xi − x̄)².
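
As a sanity check (a sketch assuming Python with numpy and scipy, with an illustrative simulated sample), the log-likelihood can be maximized numerically and compared with the closed-form MLEs x̄ and s̃²:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(5)
x = rng.normal(loc=1.5, scale=2.0, size=200)

def neg_loglik(theta):
    """Negative normal log-likelihood (constants dropped); theta = (mu, log sigma^2)."""
    mu, log_s2 = theta
    s2 = np.exp(log_s2)          # parameterize by log sigma^2 to keep sigma^2 > 0
    return 0.5 * np.sum((x - mu)**2) / s2 + 0.5 * len(x) * np.log(s2)

fit = minimize(neg_loglik, x0=np.array([0.0, 0.0]))
mu_hat, s2_hat = fit.x[0], np.exp(fit.x[1])
print(mu_hat, x.mean())          # numerical MLE vs closed-form x-bar
print(s2_hat, x.var(ddof=0))     # numerical MLE vs closed-form s~^2
```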



Estimation Frameworks

MLE for Binomial Distribution

Suppose that xi ∼iid B[N, p] for i = 1, . . . , n. Assuming that X ∼ B[N, p], the probability mass function can be written as

f(x|N, p) = (N choose x) p^x (1 − p)^(N−x) = [ N! / (x!(N − x)!) ] p^x (1 − p)^(N−x)

This implies that the log-likelihood function has the form

ℓ(p|x, N) = log(p) Σ_{i=1}^n xi + log(1 − p) ( nN − Σ_{i=1}^n xi ) + c

where c = n log(N!) − Σ_{i=1}^n [ log(xi!) + log((N − xi)!) ] is a constant.



Estimation Frameworks

MLE for Binomial Distribution (part 2)


Taking the derivative of the log-likelihood with respect to p gives

dℓ(p|x, N)/dp = (1/p) Σ_{i=1}^n xi − (1/(1 − p)) ( nN − Σ_{i=1}^n xi )

Setting the derivative to zero and multiplying by p(1 − p) reveals that the MLE satisfies

(1 − p) n x̄ − p n (N − x̄) = 0  →  x̄ − pN = 0

Solving the above equation for p reveals that the MLE of p is

p̂MLE = (1/(nN)) Σ_{i=1}^n xi = x̄/N
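
For example (a sketch assuming Python with numpy, with illustrative values N = 20 and p = 0.3), the MLE is simply the overall proportion of successes:

```python
import numpy as np

rng = np.random.default_rng(6)
N, p_true = 20, 0.3
x = rng.binomial(N, p_true, size=100)   # x_i ~ B[N, p]

p_hat = x.mean() / N                    # MLE: p-hat = x-bar / N
print(p_hat)
```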



Estimation Frameworks

MLE for Uniform Distribution


Suppose that xi ∼iid U[a, b] for i = 1, . . . , n. Assuming that X ∼ U[a, b], the probability density function can be written as

f(x|a, b) = 1/(b − a)

This implies that the log-likelihood function has the form

ℓ(a, b|x) = −Σ_{i=1}^n log(b − a) = −n log(b − a)

Maximizing ℓ(a, b|x) is equivalent to minimizing log(b − a) with the requirements that a ≤ xi for all i = 1, . . . , n and b ≥ xi for all i.
• MLEs are âMLE = min(xi) = x(1) and b̂MLE = max(xi) = x(n)
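
A brief sketch (assuming Python with numpy, with illustrative true values a = 2 and b = 7) computes these MLEs and, for contrast, the method of moments estimates from the earlier slide:

```python
import numpy as np

rng = np.random.default_rng(7)
x = rng.uniform(low=2.0, high=7.0, size=200)

a_mle, b_mle = x.min(), x.max()             # MLEs: sample minimum and maximum
m1, m2 = x.mean(), np.mean(x**2)            # first two sample moments
a_mom = m1 - np.sqrt(3.0 * (m2 - m1**2))    # method of moments estimates
b_mom = m1 + np.sqrt(3.0 * (m2 - m1**2))
print(a_mle, b_mle, a_mom, b_mom)
```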



References

References

Helwig, N. E. (2017). Adding bias to reduce variance in psychological results: A tutorial on penalized regression. The Quantitative Methods for Psychology 13, 1–19.
Hoerl, A. and R. Kennard (1970). Ridge regression: Biased estimation
for nonorthogonal problems. Technometrics 12, 55–67.
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B 58, 267–288.
Zou, H. and T. Hastie (2005). Regularization and variable selection via
the elastic net. Journal of the Royal Statistical Society, Series B 67,
301–320.

