Lecture 6: Sampling Distributions: Hengki Purwoto (Econ UGM) Statistics 2: Lecture 6 March 18, 2021 1

This document discusses sampling distributions and their properties. It begins by introducing the concept of a sampling distribution as the probability distribution of sample statistics. When samples are taken from a population, their statistics have distributions. The document then provides examples of how to develop sampling distributions for the mean from both a finite population and when the population is normally distributed. It also defines key terms like the sample mean, standard error of the mean, and compares the population distribution to the sampling distribution.

Uploaded by

Rahmat Junaidi

Lecture 6: Sampling Distributions

Hengki Purwoto (Econ UGM) Statistics 2 : Lecture 6 March 18, 2021 1


0. Outline

1. Introduction

2. Sampling Distributions with Normal Populations

3. Chi-Square Distribution

4. Student t−distribution

5. F −distribution



1. Introduction

A sample is a set of random variables!

Thus, a sample statistic is also random. Why?

Thus, a sample statistic has a probability distribution.

We call it the sampling distribution.


1. Introduction

Sampling distribution of a statistic provides:


◮ a theoretical model of the relative frequency histogram
◮ for the likely values of the statistic
◮ that one would observe through repeated sampling.



1. Introduction

Sample

Definition
A sample is a set of observable random variables X1, . . . , Xn. The
number n is called the sample size.



1. Introduction

Random Sample

Definition
A random sample of size n from a population is a set of n independent
and identically distributed (iid) observable random variables X1, . . . , Xn.



1. Introduction

Statistic

Definition
A function T of observable random variables X1, . . . , Xn that
does not depend on any unknown parameters is called a statistic.



1. Introduction

Sampling distribution

Definition
The probability distribution of a sample statistic is called the
sampling distribution.

• A sampling distribution is a probability distribution of all of the possible values of a statistic for samples of a given size selected from a population


1. Introduction

Developing a Sampling Distribution

• Assume there is a population …
• Population size N = 4 (individuals A, B, C, D)
• Random variable, X, is age of individuals
• Values of X: 18, 20, 22, 24 (years)


1. Introduction
Developing a Sampling Distribution
(continued)

In this example the Population Distribution is uniform:

P(x) = .25 for each of x = 18, 20, 22, 24 (individuals A, B, C, D)

[Figure: Uniform Distribution, P(x) against x]


1. Introduction
Developing a Sampling Distribution
(continued)
Now consider all possible samples of size n = 2

16 possible samples (sampling with replacement):

1st     2nd Observation
Obs     18      20      22      24
18      18,18   18,20   18,22   18,24
20      20,18   20,20   20,22   20,24
22      22,18   22,20   22,22   22,24
24      24,18   24,20   24,22   24,24

16 Sample Means:

1st     2nd Observation
Obs     18      20      22      24
18      18      19      20      21
20      19      20      21      22
22      20      21      22      23
24      21      22      23      24


1. Introduction
Developing a Sampling Distribution
(continued)
Sampling Distribution of All Sample Means

16 Sample Means:

1st     2nd Observation
Obs     18      20      22      24
18      18      19      20      21
20      19      20      21      22
22      20      21      22      23
24      21      22      23      24

[Figure: Distribution of Sample Means, $P(\bar{X})$ against $\bar{X}$ = 18, 19, ..., 24 (no longer uniform)]

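The enumeration on this slide can be reproduced in a few lines. The following is a minimal sketch, assuming the four ages from the example and sampling with replacement; the variable names are illustrative and not part of the lecture.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [18, 20, 22, 24]  # ages of A, B, C, D from the example
n = 2                          # sample size

# All ordered samples of size n drawn with replacement (4**2 = 16 samples).
samples = list(product(population, repeat=n))

# Sample mean of every possible sample, kept exact with Fraction.
sample_means = [Fraction(sum(s), n) for s in samples]

# Sampling distribution: probability of each possible value of the mean.
counts = Counter(sample_means)
sampling_distribution = {
    mean: Fraction(count, len(samples)) for mean, count in sorted(counts.items())
}

for mean, prob in sampling_distribution.items():
    print(f"P(mean = {float(mean):g}) = {float(prob):.4f}")
```

The probabilities rise towards the middle value and fall again, which is why the histogram of sample means is no longer uniform.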
1. Introduction

Sample mean and sample variance



1. Introduction

Standard error of the mean



1. Introduction

Chebyshev’s inequality



2. Sampling distributions with normal population

Sample mean and sample variance



2. Sampling distributions with normal population

Sample Mean

• Let X1, X2, . . ., Xn represent a random sample from a population

• The sample mean value of these observations is defined as

$$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$$


2. Sampling distributions with normal population

Standard Error of the Mean

• Different samples of the same size from the same population will yield different sample means
• A measure of the variability in the mean from sample to sample is given by the Standard Error of the Mean:

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$

• Note that the standard error of the mean decreases as the sample size increases
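The standard error formula follows from the variance of a sum of independent random variables. A short derivation, assuming X1, . . ., Xn are independent with common variance σ² (the random-sample definition given earlier):

```latex
\operatorname{Var}(\bar{X})
  = \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
  = \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(X_i)
  = \frac{n\sigma^2}{n^2}
  = \frac{\sigma^2}{n},
\qquad
\sigma_{\bar{X}} = \sqrt{\operatorname{Var}(\bar{X})} = \frac{\sigma}{\sqrt{n}}
```

The second equality is where independence is used: without it, covariance terms between the observations would appear.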


2. Sampling distributions with normal population
Comparing the Population with its Sampling Distribution

Population (N = 4): $\mu = 21$, $\sigma = 2.236$
Sample Means Distribution (n = 2): $\mu_{\bar{X}} = 21$, $\sigma_{\bar{X}} = 1.58$

[Figure: P(X) against X = 18, 20, 22, 24 (A, B, C, D) for the population, beside $P(\bar{X})$ against $\bar{X}$ = 18, 19, ..., 24 for the sample means]


2. Sampling distributions with normal population
Developing a Sampling Distribution
(continued)
Summary Measures for the Population Distribution:

$$\mu = \frac{\sum X_i}{N} = \frac{18 + 20 + 22 + 24}{4} = 21$$

$$\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}} = 2.236$$

[Figure: Uniform Distribution, P(x) = .25 at x = 18, 20, 22, 24 (A, B, C, D)]


2. Sampling distributions with normal population
Developing a Sampling Distribution
(continued)
Summary Measures of the Sampling Distribution (the sums run over all 16 sample means):

$$E(\bar{X}) = \frac{\sum \bar{X}_i}{16} = \frac{18 + 19 + 20 + \cdots + 24}{16} = 21 = \mu$$

$$\sigma_{\bar{X}} = \sqrt{\frac{\sum (\bar{X}_i - \mu)^2}{16}} = \sqrt{\frac{(18 - 21)^2 + (19 - 21)^2 + \cdots + (24 - 21)^2}{16}} = 1.58$$
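These summary measures can be checked numerically. The sketch below assumes the same four-value population and all 16 samples of size 2 drawn with replacement; it compares the spread of the enumerated sample means with σ/√n. The variable names are illustrative.

```python
from itertools import product
from math import sqrt
from statistics import fmean, pstdev

population = [18, 20, 22, 24]
n = 2

mu = fmean(population)      # population mean
sigma = pstdev(population)  # population standard deviation (divides by N)

# Mean of each of the 16 possible samples (sampling with replacement).
sample_means = [fmean(s) for s in product(population, repeat=n)]

mean_of_means = fmean(sample_means)   # E(X-bar)
se_enumerated = pstdev(sample_means)  # spread of the 16 sample means
se_formula = sigma / sqrt(n)          # standard error from the formula

print(mu, round(sigma, 3), mean_of_means, round(se_enumerated, 2), round(se_formula, 2))
```

Both routes give the same standard error because sampling with replacement makes the two observations independent.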


2. Sampling distributions with normal population
If sample values are not independent

• If the sample size n is not a small fraction of the population size N, then individual sample members are not distributed independently of one another
• Thus, observations are not selected independently
• A finite population correction is made to account for this:

$$\operatorname{Var}(\bar{X}) = \frac{\sigma^2}{n} \cdot \frac{N - n}{N - 1} \quad \text{or} \quad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$$

The term (N – n)/(N – 1) is often called a finite population correction factor
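The correction can be illustrated with the earlier four-value population. This is a sketch under the assumption that samples of size 2 are drawn without replacement, so that the observations are dependent; it enumerates all 12 ordered samples and compares the variance of their means with the corrected formula.

```python
from itertools import permutations
from statistics import fmean, pvariance

population = [18, 20, 22, 24]
N, n = len(population), 2
sigma2 = pvariance(population)  # population variance, equal to 5

# All ordered samples of size n drawn WITHOUT replacement (4 * 3 = 12 samples).
means = [fmean(s) for s in permutations(population, n)]

var_enumerated = pvariance(means)                    # variance of the sample means
var_uncorrected = sigma2 / n                         # ignores the dependence
var_corrected = (sigma2 / n) * (N - n) / (N - 1)     # with the correction factor

print(round(var_enumerated, 4), var_uncorrected, round(var_corrected, 4))
```

Here the uncorrected value 2.5 overstates the variance, while the corrected value matches the enumeration.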


2. Sampling distributions with normal population

If the Population is Normal

• If a population is normal with mean μ and standard deviation σ, the sampling distribution of $\bar{X}$ is also normally distributed with

$$\mu_{\bar{X}} = \mu \quad \text{and} \quad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$

• If the sample size n is not small relative to the population size N, then

$$\mu_{\bar{X}} = \mu \quad \text{and} \quad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$$


2. Sampling distributions with normal population

Standard Normal Distribution for the Sample Means


• Z-value for the sampling distribution of $\bar{X}$:

$$Z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$

where: $\bar{X}$ = sample mean
μ = population mean
$\sigma_{\bar{X}}$ = standard error of the mean

Z is a standardized normal random variable with mean of 0 and a variance of 1

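As an illustration of the standardization, the sketch below computes Z for a sample mean and the matching upper-tail probability. The numbers (a sample mean of 22 from n = 25 draws, with μ = 21 and σ = 2.236) are hypothetical and chosen only for the example; the population is assumed normal, and the function name is not from the lecture.

```python
from math import sqrt
from statistics import NormalDist


def z_for_sample_mean(x_bar, mu, sigma, n):
    """Standardize a sample mean using the standard error sigma / sqrt(n)."""
    return (x_bar - mu) / (sigma / sqrt(n))


# Hypothetical inputs, not taken from the slides.
z = z_for_sample_mean(x_bar=22.0, mu=21.0, sigma=2.236, n=25)

# Probability that the sample mean exceeds 22 when the population is normal.
upper_tail = 1 - NormalDist().cdf(z)

print(round(z, 3), round(upper_tail, 4))
```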
2. Sampling distributions with normal population

Sampling Distribution Properties

$$E[\bar{X}] = \mu$$

(i.e. $\bar{x}$ is unbiased)

(both distributions have the same mean)

[Figure: Normal Population Distribution centered at μ, above Normal Sampling Distribution centered at $\mu_{\bar{x}}$]

2. Sampling distributions with normal population
Sampling Distribution Properties
(continued)

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$

(i.e. $\bar{x}$ is unbiased)

(the distribution of $\bar{x}$ has a reduced standard deviation)

[Figure: Normal Population Distribution centered at μ, above Normal Sampling Distribution centered at $\mu_{\bar{x}}$]

2. Sampling distributions with normal population

Sampling Distribution Properties
(continued)

As n increases, $\sigma_{\bar{x}}$ decreases

[Figure: sampling distributions of $\bar{x}$ centered at μ, one for a larger sample size and one for a smaller sample size]

2. Sampling distributions with normal population

What if the population is not normal?

Apply the central limit theorem!

Sample means from the (not normal) population will be approximately normal for large sample size

How large is large?

◮ n > 25
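A small simulation can make the central limit theorem concrete. This is only a sketch: the exponential population (mean 1, standard deviation 1), the sample size n = 30, the number of repetitions and the seed are all assumptions chosen for illustration, not values from the lecture.

```python
import random
from math import sqrt
from statistics import fmean, pstdev

random.seed(0)   # fixed seed so the run is repeatable
n = 30           # sample size, above the n > 25 rule of thumb
reps = 5000      # number of simulated samples

# Exponential population with rate 1: mean 1, standard deviation 1, strongly skewed.
sample_means = [
    fmean([random.expovariate(1.0) for _ in range(n)]) for _ in range(reps)
]

center = fmean(sample_means)    # should be close to the population mean, 1
spread = pstdev(sample_means)   # should be close to 1 / sqrt(30)

# Share of sample means within one standard error of the population mean;
# for a normal distribution this share is about 0.68.
se = 1 / sqrt(n)
within_one_se = sum(abs(m - 1.0) <= se for m in sample_means) / reps

print(round(center, 3), round(spread, 3), round(within_one_se, 3))
```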


3. Sampling distributions: Chi Square

Introduction



3. Sampling distributions: Chi Square

Degrees of freedom (df)



3. Sampling distributions: Chi Square

Sample Variance
• Let x1, x2, . . . , xn be a random sample from a population. The sample variance is

$$s^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$

• the square root of the sample variance is called the sample standard deviation

• the sample variance is different for different random samples from the same population


3. Sampling distributions: Chi Square

Sampling Distribution of Sample Variances

• The sampling distribution of $s^2$ has mean $\sigma^2$

$$E[s^2] = \sigma^2$$

• If the population distribution is normal, then

$$\operatorname{Var}(s^2) = \frac{2\sigma^4}{n - 1}$$
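The first property, E[s²] = σ², can be verified exactly on the small population used in the introduction. The sketch below assumes samples of size 2 drawn with replacement from the ages 18, 20, 22, 24. It does not check the variance formula, which requires a normal population.

```python
from itertools import product
from statistics import fmean, pvariance, variance

population = [18, 20, 22, 24]
sigma2 = pvariance(population)  # population variance (divides by N), equal to 5

# Sample variance (divides by n - 1) of each of the 16 equally likely samples.
s2_values = [variance(s) for s in product(population, repeat=2)]

expected_s2 = fmean(s2_values)  # E[s^2] over the sampling distribution

print(sigma2, expected_s2)
```

Dividing by n − 1 rather than n is what makes the average of the 16 sample variances equal to σ².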


3. Sampling distributions: Chi Square
Chi-Square Distribution of Sample and Population Variances
• If the population distribution is normal then

$$\chi^2_{n-1} = \frac{(n - 1)s^2}{\sigma^2}$$

has a chi-square ($\chi^2$) distribution with n – 1 degrees of freedom.


3. Sampling distributions: Chi Square

The Chi-square Distribution

• The chi-square distribution is a family of distributions, depending on degrees of freedom:
• d.f. = n – 1

[Figure: three chi-square density curves over $\chi^2$ from 0 to 28, for d.f. = 1, d.f. = 5 and d.f. = 15]


3. Sampling distributions: Chi Square

Degrees of Freedom (df)

Idea: Number of observations that are free to vary after sample mean has been calculated

Example: Suppose the mean of 3 numbers is 8.0

Let X1 = 7
Let X2 = 8
What is X3?

If the mean of these three values is 8.0, then X3 must be 9 (i.e., X3 is not free to vary)

Here, n = 3, so degrees of freedom = n – 1 = 3 – 1 = 2

(2 values can be any numbers, but the third is not free to vary for a given mean)

3. Sampling distributions: Chi Square

Chi-square Example

• A commercial freezer must hold a selected temperature with little variation. Specifications call for a standard deviation of no more than 4 degrees (a variance of 16 degrees²).

▪ A sample of 14 freezers is to be tested
▪ What is the upper limit (K) for the sample variance such that the probability of exceeding this limit, given that the population standard deviation is 4, is less than 0.05?


3. Sampling distributions: Chi Square

Finding the Chi-square Value

$$\chi^2 = \frac{(n - 1)s^2}{\sigma^2}$$

is chi-square distributed with (n – 1) = 13 degrees of freedom

• Use the chi-square distribution with area 0.05 in the upper tail:

$\chi^2_{13} = 22.36$ (α = .05 and 14 – 1 = 13 d.f.)

[Figure: chi-square density with upper-tail probability α = .05 to the right of $\chi^2_{13} = 22.36$]

3. Sampling distributions: Chi Square
Chi-square Example
(continued)

$\chi^2_{13} = 22.36$ (α = .05 and 14 – 1 = 13 d.f.)

So:

$$P(s^2 > K) = P\left(\frac{(n - 1)s^2}{16} > \chi^2_{13}\right) = 0.05$$

or

$$\frac{(n - 1)K}{16} = 22.36 \quad \text{(where } n = 14\text{)}$$

so

$$K = \frac{(22.36)(16)}{(14 - 1)} = 27.52$$

If $s^2$ from the sample of size n = 14 is greater than 27.52, there is strong evidence to suggest the population variance exceeds 16.
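The arithmetic for K, together with a rough simulation check, can be sketched as follows. The critical value 22.36 is the table value quoted in the example; the normal population with mean 0, the seed and the number of repetitions are assumptions made only for the check, and `sample_variance` is an illustrative helper.

```python
import random

n = 14             # number of freezers tested
sigma = 4.0        # population standard deviation under the specification
chi2_crit = 22.36  # upper 5% point of chi-square with 13 d.f. (table value)

# Upper limit for the sample variance: (n - 1) K / sigma^2 = chi2_crit.
K = chi2_crit * sigma**2 / (n - 1)


def sample_variance(xs):
    """Sample variance with the n - 1 divisor."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


# Simulation check: how often does s^2 exceed K when sigma really is 4?
random.seed(1)
reps = 20000
exceed = sum(
    sample_variance([random.gauss(0.0, sigma) for _ in range(n)]) > K
    for _ in range(reps)
)
p_hat = exceed / reps

print(round(K, 2), round(p_hat, 3))
```

The simulated exceedance rate should land near 0.05, which is the tail probability the limit was built to have.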




4. Sampling distributions: Student t

Student’s t Distribution

• Consider a random sample of n observations
– with mean $\bar{x}$ and standard deviation s
– from a normally distributed population with mean μ

• Then the variable

$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$

follows the Student’s t distribution with (n - 1) degrees of freedom
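A minimal sketch of the t statistic, using a hypothetical three-value sample and a hypothesized mean of 7 (neither is from the lecture); the function name is illustrative.

```python
from math import sqrt
from statistics import fmean, stdev


def t_statistic(sample, mu):
    """t = (x-bar - mu) / (s / sqrt(n)), with s the sample standard deviation."""
    n = len(sample)
    return (fmean(sample) - mu) / (stdev(sample) / sqrt(n))


sample = [7, 8, 9]            # hypothetical data: x-bar = 8, s = 1, n = 3
t = t_statistic(sample, mu=7)
df = len(sample) - 1          # degrees of freedom, n - 1

print(round(t, 3), df)
```

The only change from the Z-value is that the unknown σ is replaced by the sample standard deviation s, which is what introduces the n − 1 degrees of freedom.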


4. Sampling distributions: Student t

Student’s t Distribution

• The t is a family of distributions
• The t value depends on degrees of freedom (d.f.)
– Number of observations that are free to vary after sample mean has been calculated
d.f. = n - 1


4. Sampling distributions: Student t
Student’s t Distribution
Note: t → Z as n increases

t-distributions are bell-shaped and symmetric, but have ‘fatter’ tails than the normal

[Figure: density curves centered at 0 for the Standard Normal (t with df = ∞), t (df = 13) and t (df = 5)]

4. Sampling distributions: Student t

Student’s t Table

Upper Tail Area

df      .10     .05     .025
1       3.078   6.314   12.706
2       1.886   2.920   4.303
3       1.638   2.353   3.182

The body of the table contains t values, not probabilities

Let: n = 3
df = n - 1 = 2
α = .10
α/2 = .05

[Figure: t density with area α/2 = .05 in the upper tail beyond t = 2.920]

4. Sampling distributions: Student t

t distribution values
With comparison to the Z value

Confidence   t           t           t           Z
Level        (10 d.f.)   (20 d.f.)   (30 d.f.)

.80          1.372       1.325       1.310       1.282
.90          1.812       1.725       1.697       1.645
.95          2.228       2.086       2.042       1.960
.99          3.169       2.845       2.750       2.576

Note: t → Z as n increases
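The Z column of this table can be recomputed with the Python standard library. The sketch below assumes two-sided confidence levels, so the Z value is the normal quantile at 1 − (1 − level)/2. The t values are copied from the table rather than computed, since the standard library has no t quantile function, and `z_for_confidence` is an illustrative name.

```python
from statistics import NormalDist

# t values copied from the table above: (10 d.f., 20 d.f., 30 d.f.).
t_table = {
    0.80: (1.372, 1.325, 1.310),
    0.90: (1.812, 1.725, 1.697),
    0.95: (2.228, 2.086, 2.042),
    0.99: (3.169, 2.845, 2.750),
}


def z_for_confidence(level):
    """Two-sided standard normal critical value for a confidence level."""
    return NormalDist().inv_cdf(1 - (1 - level) / 2)


for level, t_values in t_table.items():
    z = z_for_confidence(level)
    print(level, t_values, round(z, 3))
```

In every row the t values shrink towards the Z value as the degrees of freedom grow, which is the pattern the note describes.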


5. Sampling distributions: F

The F−distribution was developed to study the behavior of two variances from random samples taken from two independent normal populations.

For example, we may be interested in whether the population variances are equal or not.




5. Sampling distributions: F
Tests of Equality of Two Variances

▪ Goal: Test hypotheses about two population variances

Tests for Two Population Variances: F test statistic

Lower-tail test:
H0: $\sigma_x^2 \geq \sigma_y^2$
H1: $\sigma_x^2 < \sigma_y^2$

Upper-tail test:
H0: $\sigma_x^2 \leq \sigma_y^2$
H1: $\sigma_x^2 > \sigma_y^2$

Two-tail test:
H0: $\sigma_x^2 = \sigma_y^2$
H1: $\sigma_x^2 \neq \sigma_y^2$

The two populations are assumed to be independent and normally distributed

5. Sampling distributions: F
Hypothesis Tests for Two Variances
(continued)

Tests for Two Population Variances: F test statistic

The random variable

$$F = \frac{s_x^2 / \sigma_x^2}{s_y^2 / \sigma_y^2}$$

has an F distribution with (nx – 1) numerator degrees of freedom and (ny – 1) denominator degrees of freedom

Denote an F value with $\nu_1$ numerator and $\nu_2$ denominator degrees of freedom by $F_{\nu_1, \nu_2}$


5. Sampling distributions: F

Test Statistic

Tests for Two Population Variances: F test statistic

The test statistic for a hypothesis test about two population variances is

$$F = \frac{s_x^2}{s_y^2}$$

where F has (nx – 1) numerator degrees of freedom and (ny – 1) denominator degrees of freedom


5. Sampling distributions: F
Decision Rules: Two Variances

Use $s_x^2$ to denote the larger variance.

Upper-tail test:
H0: $\sigma_x^2 \leq \sigma_y^2$
H1: $\sigma_x^2 > \sigma_y^2$
Reject H0 if $F > F_{n_x - 1, n_y - 1, \alpha}$

Two-tail test:
H0: $\sigma_x^2 = \sigma_y^2$
H1: $\sigma_x^2 \neq \sigma_y^2$
◼ rejection region for a two-tail test is:
Reject H0 if $F > F_{n_x - 1, n_y - 1, \alpha/2}$
where $s_x^2$ is the larger of the two sample variances

[Figure: F densities with "Do not reject H0" to the left and "Reject H0" to the right of the critical value; the upper-tail area is α for the upper-tail test and α/2 for the two-tail test]

5. Sampling distributions: F
Example: F Test

You are a financial analyst for a brokerage firm. You want to compare dividend yields between stocks listed on the NYSE & NASDAQ. You collect the following data:

            NYSE    NASDAQ
Number      21      25
Mean        3.27    2.53
Std dev     1.30    1.16

Is there a difference in the variances between the NYSE & NASDAQ at the α = 0.10 level?


5. Sampling distributions: F
F Test: Example Solution

• Form the hypothesis test:

H0: $\sigma_x^2 = \sigma_y^2$ (there is no difference between variances)
H1: $\sigma_x^2 \neq \sigma_y^2$ (there is a difference between variances)

◼ Find the F critical value for α/2 = .10/2:

Degrees of Freedom:
◼ Numerator (NYSE has the larger standard deviation):
◼ nx – 1 = 21 – 1 = 20 d.f.
◼ Denominator:
◼ ny – 1 = 25 – 1 = 24 d.f.

$$F_{n_x - 1, n_y - 1, \alpha/2} = F_{20, 24, 0.10/2} = 2.03$$

5. Sampling distributions: F
F Test: Example Solution
(continued)

• The test statistic is:

$$F = \frac{s_x^2}{s_y^2} = \frac{1.30^2}{1.16^2} = 1.256$$

◼ F = 1.256 is not in the rejection region, so we do not reject H0
◼ Conclusion: There is not sufficient evidence of a difference in variances at α = .10

[Figure: F density for H0: $\sigma_x^2 = \sigma_y^2$ against H1: $\sigma_x^2 \neq \sigma_y^2$, with the rejection region of area α/2 = .05 to the right of $F_{20, 24, 0.10/2} = 2.03$]
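The calculation in this example can be reproduced directly. The sketch below uses the sample sizes and standard deviations from the data table and takes the critical value 2.03 from the solution above rather than computing it; the variable names are illustrative.

```python
# Data from the example: sample sizes and sample standard deviations.
n_x, s_x = 21, 1.30   # NYSE (larger standard deviation, so it goes in the numerator)
n_y, s_y = 25, 1.16   # NASDAQ

f_crit = 2.03         # F(20, 24) upper 5% point, as quoted in the solution

f_stat = s_x**2 / s_y**2        # ratio of sample variances
df = (n_x - 1, n_y - 1)         # numerator and denominator degrees of freedom
reject_h0 = f_stat > f_crit     # two-tail decision rule at alpha = .10

print(round(f_stat, 3), df, reject_h0)
```

Because the larger sample variance is placed in the numerator, only the upper critical value is needed for the two-tail test.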
