5 BSM214 Lecture5 Fall2023

(1) The document discusses the difference between a population and a sample: a population has unknown parameters, and statistics computed from a sample are used to estimate them. (2) Key topics include measures of central tendency (sample mean), measures of variability (sample variance and standard deviation), and the central limit theorem. (3) The t-distribution and its use in hypothesis testing when the population variance is unknown are also summarized.

Uploaded by

mf7059708
Population (some unknown parameters)
Example: 6 October U students (height mean)
N = population size

Sample = observations (we calculate some statistics)
Example: 20 students from 6 October U (sample mean)
n = sample size

· Let X1,X2,…,XN be the population values (in general, they
are unknown)
· Let x1,x2,…,xn be the sample values (these values are
known)
· Statistics obtained from the sample are used to estimate
(approximate) the parameters of the population.

* Statistical Inference

(1) Estimation:
→ Point Estimation
→ Interval Estimation (Confidence Interval)

(2) Hypotheses Testing


Some Important Statistics:
Definition:
Any function of the random sample X1, X2, …, Xn is called a
statistic.
Central Tendency in the Sample:
Definition:
If X1, X2, …, Xn represents a random sample of size n, then the
sample mean is defined to be the statistic:
$$\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n} = \frac{\sum_{i=1}^{n} X_i}{n} \quad \text{(unit)}$$
Variability in the Sample:
Definition:
If X1, X2, …, Xn represents a random sample of size n, then the
sample variance is defined to be the statistic:
$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1} = \frac{(X_1 - \bar{X})^2 + (X_2 - \bar{X})^2 + \cdots + (X_n - \bar{X})^2}{n-1} \quad \text{(unit)}^2$$
Theorem: (Computational Formulas for S²)
$$S^2 = \frac{\sum_{i=1}^{n} X_i^2 - n\bar{X}^2}{n-1}$$
Note:
· S² is a statistic because it is a function of the random
sample X1, X2, …, Xn.
· S² measures the variability in the sample.
The sample standard deviation is:
$$S = \sqrt{S^2} = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}} \quad \text{(unit)}$$
Example:
Compute the sample variance and standard deviation of the
following observations (ages in years): 10, 21, 33, 53, 54.
Solution:
n = 5
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{\sum_{i=1}^{5} x_i}{5} = \frac{10 + 21 + 33 + 53 + 54}{5} = \frac{171}{5} = 34.2 \quad \text{(years)}$$

$$S^2 = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n-1}$$

x_i     10    21    33     53     54     ∑ x_i = 171
x_i²    100   441   1089   2809   2916   ∑ x_i² = 7355

$$S^2 = \frac{7355 - (5)(34.2)^2}{5-1} = \frac{1506.8}{4} = 376.7 \quad \text{(years)}^2$$

The sample standard deviation is:

$$S = \sqrt{S^2} = \sqrt{376.7} = 19.41 \quad \text{(years)}$$
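
The worked example above can be checked with a short Python sketch. The function names `sample_mean` and `sample_variance` are illustrative choices, not part of the lecture. The variance is computed with the same computational formula used in the solution.

```python
import math

def sample_mean(xs):
    # x-bar = (sum of the observations) / n
    return sum(xs) / len(xs)

def sample_variance(xs):
    # Computational formula: S^2 = (sum of x_i^2 - n * x-bar^2) / (n - 1)
    n = len(xs)
    xbar = sample_mean(xs)
    return (sum(x * x for x in xs) - n * xbar ** 2) / (n - 1)

ages = [10, 21, 33, 53, 54]   # observations from the example (years)
xbar = sample_mean(ages)
s2 = sample_variance(ages)
s = math.sqrt(s2)             # sample standard deviation

print(round(xbar, 1), round(s2, 1), round(s, 2))  # prints: 34.2 376.7 19.41
```

Dividing by n − 1 rather than n matches the definition of S² given earlier.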


Random Sampling:

• Each observation in a population is a value of a random
variable X having some probability distribution f(x).

• To eliminate bias in the sampling procedure, we select a
random sample in the sense that the observations are made
independently and at random.

• The random sample of size n is: X1, X2, …, Xn
It consists of n observations selected independently and
randomly from the population.
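If X1, X2, …, Xn is a random sample of size n from a population with
mean µ and variance σ², then the sample mean has mean
$$E(\bar{X}) = \mu_{\bar{X}} = \mu$$
and variance
$$\mathrm{Var}(\bar{X}) = \sigma_{\bar{X}}^2 = \frac{\sigma^2}{n}$$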
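· If X1, X2, …, Xn is a random sample of size n from N(µ,σ), then
$$\bar{X} \sim N(\mu_{\bar{X}}, \sigma_{\bar{X}}) \quad \text{or} \quad \bar{X} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$$
· Equivalently,
$$\bar{X} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right) \iff Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$$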
Theorem: (Central Limit Theorem)
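If X1, X2, …, Xn is a random sample of size n from any distribution
(population) with mean µ and finite variance σ², then, if the
sample size n is large, the random variable
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$
is approximately a standard normal random variable, i.e.,
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0,1) \quad \text{approximately.}$$
Equivalently,
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0,1) \iff \bar{X} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$$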
We consider n large when n ≥ 30.
For large sample size n, the sample mean has approximately a normal
distribution with mean µ and variance σ²/n, i.e.,
$$\bar{X} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right) \quad \text{approximately.}$$
The sampling distribution of the sample mean is used for inferences
about the population mean µ.
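
The theorem can be illustrated with a small simulation. The sketch below is not part of the lecture: it assumes an Exponential population with rate 1 (so µ = 1 and σ = 1), a sample size of n = 30, and 5000 repeated samples, all chosen only for illustration. If the theorem holds, the simulated sample means should have a mean close to µ and a standard deviation close to σ/√n.

```python
import math
import random
import statistics

random.seed(0)   # fixed seed so the run is repeatable

n = 30           # sample size (the lecture's threshold for "large")
reps = 5000      # number of simulated samples (arbitrary choice)
mu = 1.0         # mean of the assumed Exponential(rate=1) population
sigma = 1.0      # standard deviation of the same population

# Draw `reps` samples of size n and record each sample mean.
sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(reps)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)

print("mean of sample means:", mean_of_means, "vs mu =", mu)
print("sd of sample means:  ", sd_of_means, "vs sigma/sqrt(n) =", sigma / math.sqrt(n))
```

The population here is strongly skewed, yet the sample means cluster around µ with spread near σ/√n, which is the behavior the theorem describes.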
Example:
An electric firm manufactures light bulbs that have a length of life
that is approximately normally distributed with mean equal to
800 hours and a standard deviation of 40 hours. Find the
probability that a random sample of 16 bulbs will have an
average life of less than 775 hours.
Solution:
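X = the length of life of a bulb (hours)
µ = 800, σ = 40
X ~ N(800, 40)
n = 16
$$\mu_{\bar{X}} = \mu = 800$$
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{40}{\sqrt{16}} = 10$$
$$\bar{X} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right) = N(800, 10) \iff Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{\bar{X} - 800}{10} \sim N(0,1)$$
The required probability is
$$P(\bar{X} < 775) = P\left(\frac{\bar{X} - 800}{10} < \frac{775 - 800}{10}\right)$$
$$= P\left(Z < \frac{775 - 800}{10}\right)$$
$$= P(Z < -2.50)$$
$$= 0.0062$$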
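
The table lookup in this example can be reproduced with Python's standard library. This is a minimal sketch using `statistics.NormalDist`; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 800, 40, 16        # values from the example

se = sigma / math.sqrt(n)         # standard deviation of the sample mean
z = (775 - mu) / se               # standardized value
p = NormalDist(0, 1).cdf(z)       # P(Z < z) under the standard normal

print(se, z, round(p, 4))         # prints: 10.0 -2.5 0.0062
```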
t-Distribution:
· Recall that, if X1, X2, …, Xn is a random sample of size n
from a normal distribution with mean µ and variance σ², i.e.
N(µ,σ), then
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$$
We can apply this result only when σ² is known.

· If σ² is unknown, we replace the population variance σ² with
the sample variance
$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}$$
to have the following statistic:
$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$
Result:
If X1, X2, …, Xn is a random sample of size n from a normal
distribution with mean µ and unknown variance σ², i.e. N(µ,σ),
then the statistic
$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$
has a t-distribution with ν = n − 1 degrees of freedom (df), and we
write T ~ t(ν).
Note:
· The t-distribution is a continuous distribution.
· The shape of the t-distribution is similar to the shape of the
standard normal distribution.
Notation:

t_α = the t-value above which we find an area equal to α, that
is, P(T > t_α) = α.
Since the curve of the pdf of T ~ t(ν) is symmetric about 0, we
have
$$t_{1-\alpha} = -t_{\alpha}$$
Values of t_α are tabulated in Table A-4 (p.683).
Critical Values of the t-distribution (tα )
Example:
Find the t-value with ν = 14 (df) that leaves an area of:
(a) 0.95 to the left.
(b) 0.95 to the right.
Solution:
ν = 14 (df); T ~ t(14)
(a) The t-value that leaves an area of 0.95 to the left is
$$t_{0.05} = 1.761$$
(b) The t-value that leaves an area of 0.95 to the right is
$$t_{0.95} = -t_{1-0.95} = -t_{0.05} = -1.761$$
Example:
For ν = 10 degrees of freedom (df), find t_{0.10} and t_{0.85}.
Solution:
$$t_{0.10} = 1.372$$
$$t_{0.85} = -t_{1-0.85} = -t_{0.15} = -1.093 \quad (t_{0.15} = 1.093)$$
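
The table values in these two examples can also be approximated numerically. The sketch below is not the lecture's method: it uses only the standard library, integrates the t density with Simpson's rule, and finds t_α by bisection. The function names (`t_pdf`, `t_cdf`, `t_alpha`), the search bracket [−50, 50], and the step counts are assumptions made for illustration.

```python
import math

def t_pdf(x, nu):
    # Density of the t-distribution with nu degrees of freedom.
    c = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return c * (1 + x * x / nu) ** (-(nu + 1) / 2)

def t_cdf(x, nu, steps=400):
    # P(T <= x) = 0.5 + integral of the density from 0 to x (symmetry about 0),
    # with the integral approximated by Simpson's rule (steps must be even).
    h = x / steps
    total = t_pdf(0.0, nu) + t_pdf(x, nu)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, nu)
    return 0.5 + total * h / 3

def t_alpha(alpha, nu):
    # The value t with upper-tail area alpha, i.e. P(T > t) = alpha.
    # Assumes the answer lies in [-50, 50].
    lo, hi = -50.0, 50.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if 1 - t_cdf(mid, nu) > alpha:
            lo = mid   # tail area still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2

print(round(t_alpha(0.05, 14), 3), round(t_alpha(0.95, 14), 3))
print(round(t_alpha(0.10, 10), 3), round(t_alpha(0.85, 10), 3))
```

The two values on each line illustrate the symmetry rule: the value for area 1 − α is the negative of the value for area α.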
Sampling Distribution of the Sample Proportion:
Suppose that the size of a population is N. Each element of the
population can be classified as type A or non-type A. Let p be
the proportion of elements of type A in the population. A random
sample of size n is drawn from this population. Let p̂ be the
proportion of elements of type A in the sample.

Let X = no. of elements of type A in the sample

p = Population Proportion
$$p = \frac{\text{no. of elements of type A in the population}}{N}$$
p̂ = Sample Proportion
$$\hat{p} = \frac{\text{no. of elements of type A in the sample}}{n} = \frac{X}{n}$$
Result:
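(1) X ~ Binomial(n, p)
(2) $$E(\hat{p}) = E\left(\frac{X}{n}\right) = p$$
(3) $$\mathrm{Var}(\hat{p}) = \mathrm{Var}\left(\frac{X}{n}\right) = \frac{pq}{n}; \quad q = 1 - p$$
(4) For large n, we have
$$\hat{p} \sim N\left(p, \sqrt{\frac{pq}{n}}\right) \quad \text{(approximately)}$$
$$Z = \frac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}} \sim N(0,1) \quad \text{(approximately)}$$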
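
As a rough check of result (4), the sketch below compares the normal approximation with the exact binomial probability. The values n = 100, p = 0.3 and the cutoff 0.35 are hypothetical, chosen only for illustration, and the function names are not from the lecture.

```python
import math
from statistics import NormalDist

n, p = 100, 0.3             # hypothetical sample size and population proportion
q = 1 - p
se = math.sqrt(p * q / n)   # standard deviation of p-hat, sqrt(pq/n)

def normal_approx(c):
    # P(p-hat <= c) using p-hat ~ N(p, sqrt(pq/n)) approximately
    return NormalDist(p, se).cdf(c)

def exact_binomial(c):
    # P(p-hat <= c) = P(X <= c*n) with X ~ Binomial(n, p)
    k_max = math.floor(c * n + 1e-9)
    return sum(math.comb(n, k) * p**k * q**(n - k) for k in range(k_max + 1))

approx = normal_approx(0.35)
exact = exact_binomial(0.35)
print(round(se, 4), round(approx, 4), round(exact, 4))
```

The two probabilities differ by a couple of percentage points here because no continuity correction is applied. The gap shrinks as n grows.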
