Probability and Statistics - 3

The document summarizes key concepts in probability and statistics, including: 1) the central limit theorem and how it applies to different distributions as the sample size increases; 2) the hypothesis-testing framework, including null and alternative hypotheses, test statistics, p-values, types of errors, and significance levels.

Uploaded by Den Thanh

Probability & Statistics

17.07.2020
Anh Tuan Tran (Ph.D.) & Thinh Tien Nguyen (Ph.D.)
1. Central Limit Theorem
Central Limit Theorem

Theorem:
 Let X1, …, Xn be i.i.d. random variables with expected value E[Xi] = μ and variance 0 < D[Xi] = σ² < +∞ for i = 1, …, n.
 Then, the random variable

Zn := (X̄ − μ) / (σ/√n) = (X1 + ⋯ + Xn − nμ) / (σ√n)

converges in distribution to the standard normal random variable as n → +∞.
Central Limit Theorem
Example:
 Toss a fair coin n times.
 Let Xi be 1 if Head occurs and 0 if Tail occurs in the ith toss, for i = 1, …, n.
 E[Xi] = p = 0.5 and D[Xi] = p(1 − p) for i = 1, …, n.
 Then, the random variable

Zn := (X1 + ⋯ + Xn − np) / √(np(1 − p))

converges in distribution to the standard normal random variable as n → +∞.
 In other words, Binom(n, p) is approximately N(np, np(1 − p)) for n large.
Central Limit Theorem

[Figures: distributions of the standardized coin-toss sum for n = 2, 5, and 30, approaching the standard normal curve.]
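The convergence in this example can be checked numerically. Below is a minimal sketch (the function name and its parameters are our own) that simulates the standardized coin-toss sum and compares it with the standard normal distribution:

```python
import math
import random

def standardized_binomial_sample(n, p=0.5, trials=20000, seed=0):
    """Draw `trials` values of Z_n = (X1 + ... + Xn - n*p) / sqrt(n*p*(1-p))."""
    rng = random.Random(seed)
    scale = math.sqrt(n * p * (1 - p))
    return [(sum(rng.random() < p for _ in range(n)) - n * p) / scale
            for _ in range(trials)]

# For large n, Z_n should look standard normal: mean near 0, variance near 1,
# and P(Z_n <= 1) roughly Phi(1) ~ 0.84 (the binomial's discreteness keeps it
# slightly off for moderate n).
zs = standardized_binomial_sample(n=30)
mean = sum(zs) / len(zs)
var = sum(z * z for z in zs) / len(zs)
frac_below_1 = sum(z <= 1 for z in zs) / len(zs)
```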
Central Limit Theorem
Example:
 Roll n dice.
 Let Xi be the number that occurs on the ith die, for i = 1, …, n.
 E[Xi] = 7/2 and D[Xi] = 35/12 for i = 1, …, n.
 Then, the random variable

Zn := (X1 + ⋯ + Xn − 7n/2) / √(35n/12)

converges in distribution to the standard normal random variable as n → +∞.
 In other words, X1 + ⋯ + Xn is approximately N(7n/2, 35n/12) for n large.
Central Limit Theorem

[Figures: distributions of the dice sum for n = 1, 2, and 8 dice, approaching a normal curve.]
Central Limit Theorem
Example:
 A bank teller serves customers standing in the queue one by one. Suppose that the service time Xi for customer i has mean E[Xi] = 2 (minutes) and variance D[Xi] = 1. We assume that service times for different bank customers are independent. Let Y be the total time the bank teller spends serving 50 customers.
 Find the probability that the bank teller spends from 90 to 110 minutes serving the customers.
Central Limit Theorem

Y = X1 + ⋯ + X50

P(90 ≤ Y ≤ 110)
= P((90 − 2·50)/√50 ≤ (Y − 2·50)/√50 ≤ (110 − 2·50)/√50)
≈ P(−√2 ≤ Z ≤ √2) ≈ 0.8427

where Z ~ N(0, 1).
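This computation can be reproduced with the standard normal CDF. A small sketch using Python's math.erf (the helper name phi is our own):

```python
import math

def phi(x):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Y ~ approximately N(n*mu, n*sigma^2) with n = 50, mu = 2, sigma = 1,
# so (Y - 100) / sqrt(50) is approximately standard normal.
n, mu, sigma = 50, 2.0, 1.0
lo = (90 - n * mu) / (sigma * math.sqrt(n))   # = -sqrt(2)
hi = (110 - n * mu) / (sigma * math.sqrt(n))  # = +sqrt(2)
prob = phi(hi) - phi(lo)                      # ~ 0.8427
```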


2. Hypothesis testing
Hypothesis testing

Motivation problem 1:
 Select 100 people at random in a city and compute their average height.
 Repeat the above step a few times; the recorded average heights form a sequence approximating 1.65 m.
 Is the average height of the people in the city exactly 1.65 m or not?
Hypothesis testing

Motivation problem 2:
 A dataset of the final scores of a group of 300 students.
 The group can be divided into 2 subgroups of boys and girls.
 Compute the average score of each subgroup: boys 7.91/10 and girls 6.96/10.
 Does sex affect the performance of the students? That is, is the difference between the average scores significant?
Null and alternative hypotheses

Null hypothesis:
 The hypothesis that is often the opposite of our
guess.
 Denoted by H0 .

Alternative hypothesis:
 The hypothesis that is often consistent with our
guess and is opposite to the null hypothesis.
 Denoted by Ha or H1 .
Null and alternative hypotheses
Example 1 (One-sample test):
Two-tailed test:
Is the average height of the people of the city exactly 1.65 m?
H0: μ = 1.65
Ha: μ ≠ 1.65
One-tailed test (right or left):
Is the average height of the people of the city less than or equal to (greater than or equal to) 1.65 m?
H0: μ ≤ 1.65, Ha: μ > 1.65   or   H0: μ ≥ 1.65, Ha: μ < 1.65
Null and alternative hypotheses

Example 2 (Two-independent-samples test):
Does sex affect the performance of the students? That is, is the difference between the average scores of the boys and the girls in the class really significant?
Two-tailed test:
H0: μ1 = μ2
Ha: μ1 ≠ μ2
One-tailed tests (right or left):
H0: μ1 ≤ μ2, Ha: μ1 > μ2   or   H0: μ1 ≥ μ2, Ha: μ1 < μ2
Test statistic

Definition:
 A test statistic is the output of a scalar function of all the observations (data).
 The test statistic is constructed based on the assumption that the null hypothesis H0 is true.
Test statistic
Example (One-sample test):
 We have the heights of 150 people in a city.
 A test statistic for the test that the average height of the people in the city is 1.65 m is

t := (x̄ − 1.65) / (s/√150)

where
 x̄ is the average height of the sample,
 s is the adjusted standard deviation of the sample.
Test statistic
Why t?
 Suppose the null hypothesis H0 is true and the average height of the people in the city is exactly μ0 = 1.65 (m).
 By the central limit theorem, for n large enough,

T := (X̄ − μ0) / (S/√n) ~ N(0, 1)

where
 X̄ is the random variable whose possible values are the average heights of every sample of size n taken from the people in the city,
 S is the random variable whose possible values are the adjusted standard deviations of every sample of size n taken from the people in the city.
Test statistic
p-value

Definition:
 Assume H0 is true.
 Let T be a test statistic random variable deduced from H0.
 Let t be the observed test statistic from the data.
 Then
 Right-tailed tests: p-value = P(T ≥ t | H0),
 Left-tailed tests: p-value = P(T ≤ t | H0),
 Two-tailed tests: p-value = 2 min{P(T ≥ t | H0), P(T ≤ t | H0)}.
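These three definitions can be sketched in code for a statistic that is standard normal under H0, using only the standard library (the function names are illustrative):

```python
import math

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def p_value(t, tail):
    """p-value of an observed statistic t when T ~ N(0, 1) under H0."""
    if tail == "right":          # P(T >= t | H0)
        return 1.0 - phi(t)
    if tail == "left":           # P(T <= t | H0)
        return phi(t)
    # two-tailed: 2 times the smaller of the two tail probabilities
    return 2.0 * min(phi(t), 1.0 - phi(t))
```

For example, p_value(1.96, "two") is about 0.05, the classical cutoff.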
p-value

[Figures: shaded tail areas illustrating the p-value for right-tailed, left-tailed, and two-tailed tests; in the two-tailed case the p-value is 2 times the smaller tail area.]
Two types of errors
Definition:
 Type I error (false positive): reject H0 when it is actually true.
 Type II error (false negative): accept H0 when it is actually false.

Example:
 Type I: Reject the hypothesis that the average height of the people in the city is 1.65 m when it is exactly 1.65 m.
 Type II: Accept the hypothesis that the average height of the people in the city is 1.65 m when it is not the case.
Significance level
Definition:
 Assume H0 is true.
 The probability of rejecting H0 when it is actually true is called the significance level.
 Denoted by α.

Example:
 α = 0.05 indicates a 5% risk of rejecting the hypothesis that the average height of the people in the city is 1.65 m when it actually is 1.65 m.
Significance level

[Figures: rejection regions of size α for right-tailed, left-tailed, and two-tailed tests.]
Accepting H0

If the p-value is larger than the significance level α, we accept H0. Otherwise, we reject it.
Accepting H0
Example:
H0: μ ≤ 1.65
Ha: μ > 1.65

 P(T ≥ t | H0) = p-value > α: Accept H0.
 P(T ≥ t | H0) = p-value < α: Reject H0.

T := (X̄ − 1.65) / (S/√n) ~ N(0, 1) and t := (x̄ − 1.65) / (s/√n),

where x̄ and s are, respectively, the mean and adjusted standard deviation of a size-n (large) sample of observed data.
3. Useful tests
One-sample t-test

Used when we want to test whether the mean of the whole population equals a constant μ0.

 In two-tailed tests:
H0: μ = μ0
Ha: μ ≠ μ0

 In one-tailed tests:
H0: μ ≤ μ0, Ha: μ > μ0   or   H0: μ ≥ μ0, Ha: μ < μ0
One-sample t-test
Test statistic:

t := (x̄ − μ0) / (s/√n) ~ T(n − 1)

 x̄ is the mean of the sample.
 s is the adjusted standard deviation of the sample.
 The sample size n > 30, or else we need to assume the whole population is normally distributed.
 T(n − 1) is the Student's t distribution with n − 1 degrees of freedom.
Student’s t-distribution

Density function:
Let X ~ T(n). Then

f(x) = Γ((n + 1)/2) / (√(nπ) Γ(n/2)) · (1 + x²/n)^(−(n+1)/2).
Student’s t-distribution

Gamma function:

Γ(x) := ∫₀^{+∞} y^(x−1) e^(−y) dy.
Student's t-distribution

[Figure: density of the Student's t distribution compared with the standard normal.]

Student's t-distribution

Property:
 Let X ~ T(n). Then for n ≥ 30, X is approximately N(0, 1).
One-sample t-test

Example:
 In a factory, a machine is used to package sugar in 1 kg bags. To check whether it works properly, workers select 100 packages at random, with weights as follows.

Weight    | 0.95 | 0.97 | 0.99 | 1.01 | 1.03 | 1.05
#Packages |    9 |   31 |   40 |   15 |    3 |    2

t ≈ −6.92, p-value ≈ 4.522e−10
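As a sketch (the variable names are our own), the test statistic can be recomputed from the frequency table:

```python
import math

# Frequency table from the slide: weight (kg) -> number of packages.
counts = {0.95: 9, 0.97: 31, 0.99: 40, 1.01: 15, 1.03: 3, 1.05: 2}
mu0 = 1.0                        # nominal weight under H0
n = sum(counts.values())         # 100 packages

xbar = sum(w * c for w, c in counts.items()) / n
# adjusted (sample) variance: divide by n - 1
s2 = sum(c * (w - xbar) ** 2 for w, c in counts.items()) / (n - 1)
t = (xbar - mu0) / math.sqrt(s2 / n)
# t comes out near -6.92; the two-sided p-value reported on the slide
# (~4.5e-10) needs the T(n-1) CDF, e.g. 2 * scipy.stats.t.sf(abs(t), n - 1).
```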


Two-independent-samples t-test

Used when we want to compare the means μ1 and μ2 of two independent samples.

 In two-tailed tests:
H0: μ1 = μ2
Ha: μ1 ≠ μ2

 In one-tailed tests:
H0: μ1 ≤ μ2, Ha: μ1 > μ2   or   H0: μ1 ≥ μ2, Ha: μ1 < μ2
Two-independent-samples t-test
Test statistic (equal variances):

t := (x̄1 − x̄2) / (sp √(1/n1 + 1/n2)) ~ T(n1 + n2 − 2)

 x̄1, x̄2 are the means of the samples.
 s1, s2 are the adjusted standard deviations of the samples, and

sp² := ((n1 − 1)s1² + (n2 − 1)s2²) / (n1 + n2 − 2).

 The sample sizes n1, n2 > 30, or else we need to assume a normal distribution for each group.
 The two samples are independent (otherwise, another test is applied).
Two-independent-samples t-test
Test statistic (unequal variances):

t := (x̄1 − x̄2) / √(s1²/n1 + s2²/n2) ~ T(df)

 x̄1, x̄2 are the means of the samples.
 s1, s2 are the adjusted standard deviations of the samples.
 The sample sizes n1, n2 > 30, or else we need to assume a normal distribution for each group.
 The two samples are independent (otherwise, another test is applied).
Two-independent-samples t-test

 Degrees of freedom:

df := (s1²/n1 + s2²/n2)² / [ (s1²/n1)²/(n1 − 1) + (s2²/n2)²/(n2 − 1) ]
Two-independent samples test
Example:
 In order to compare the average weights of rural and urban births, 10000 births were weighed. Here is the summary table.

Region | #Births | Average weight | (Adjusted) standard deviation
Rural  |    8000 | 3.0 kg         | 0.3 kg
Urban  |    2000 | 3.2 kg         | 0.2 kg

 Equal variances:
t ≈ −28.23, df = 9998, p-value ≈ 2.26e−169

 Unequal variances:
t ≈ −35.77, df ≈ 4523, p-value ≈ 4.46e−247
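Welch's statistic and its degrees of freedom can be recomputed from the summary table alone. A small sketch (the function name is illustrative):

```python
import math

def welch_t(x1, s1, n1, x2, s2, n2):
    """Welch's t statistic and degrees of freedom from summary statistics."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    t = (x1 - x2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Rural: n = 8000, mean 3.0 kg, sd 0.3 kg; urban: n = 2000, mean 3.2 kg, sd 0.2 kg.
t, df = welch_t(3.0, 0.3, 8000, 3.2, 0.2, 2000)   # t ~ -35.8, df ~ 4523
```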
F-test (two independent samples)

Used when we want to compare the variances σ1² and σ2² of two independent samples.

 In two-tailed tests:
H0: σ1² = σ2²
Ha: σ1² ≠ σ2²
 In one-tailed tests:
H0: σ1² ≤ σ2², Ha: σ1² > σ2²   or   H0: σ1² ≥ σ2², Ha: σ1² < σ2²
F-test (two independent samples)

Test statistic:

f := s1² / s2² ~ F(n1 − 1, n2 − 1)

 si is the adjusted standard deviation of the ith sample of size ni for i = 1, 2.
 F(n1 − 1, n2 − 1) is the (Fisher-Snedecor) F distribution with n1 − 1 and n2 − 1 degrees of freedom.
F-distribution

Density function:
Let X ~ F(m, n). Then for x > 0,

f(x) = [Γ((m + n)/2) m^(m/2) n^(n/2) / (Γ(m/2) Γ(n/2))] · x^(m/2 − 1) / (n + mx)^((m+n)/2).
F-distribution
F-test (two independent samples)
Example:
 Using the same data on rural and urban birth weights, test whether the two groups have equal variances.

Region | #Births | Average weight | (Adjusted) standard deviation
Rural  |    8000 | 3.0 kg         | 0.3 kg
Urban  |    2000 | 3.2 kg         | 0.2 kg

 f ≈ 2.25, df1 = 7999, df2 = 1999, p-value ≈ 1.11e−16
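The F statistic follows directly from the two adjusted standard deviations; a minimal sketch:

```python
# F statistic for comparing two sample variances, from the summary table.
s1, n1 = 0.3, 8000   # rural: adjusted standard deviation and sample size
s2, n2 = 0.2, 2000   # urban
f = s1 ** 2 / s2 ** 2            # ratio of adjusted variances, ~ 2.25
df1, df2 = n1 - 1, n2 - 1        # 7999 and 1999
# the p-value needs the F(df1, df2) tail, e.g. scipy.stats.f.sf(f, df1, df2)
```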
One-way ANOVA (Analysis of variances)

Used when we want to compare the means μi of more than two independent samples.

H0 : μ1 = ⋯ = μk
Ha : ∃i ≠ j, μi ≠ μj
One-way ANOVA (Analysis of variances)
 ni: the number of observed data points in the ith group.
 The observed data of the ith group are denoted by xi1, xi2, …, xi,ni.
 The average of the ith group:

x̄i := (1/ni) Σ_{j=1}^{ni} xij.

 The adjusted variance of the ith group:

si² := (1/(ni − 1)) Σ_{j=1}^{ni} (xij − x̄i)².
One-way ANOVA (Analysis of variances)

 n: the total number of observed data points.
 The average of the whole sample:

x̄ := (1/n) Σ_{i=1}^{k} Σ_{j=1}^{ni} xij.

 The adjusted variance of the whole sample:

s² := (1/(n − 1)) Σ_{i=1}^{k} Σ_{j=1}^{ni} (xij − x̄)².
One-way ANOVA (Analysis of variances)
 The total sum of squares:

SST := Σ_{i=1}^{k} Σ_{j=1}^{ni} (xij − x̄)².

 The sum of squares within groups:

SSE := Σ_{i=1}^{k} Σ_{j=1}^{ni} (xij − x̄i)².

 The sum of squares between groups:

SSA := Σ_{i=1}^{k} ni (x̄i − x̄)² = SST − SSE.
One-way ANOVA (Analysis of variances)

Test statistic:

f := (n − k)/(k − 1) · SSA/SSE ~ F(k − 1, n − k)

 where F(k − 1, n − k) is the (Fisher-Snedecor) F distribution with k − 1 and n − k degrees of freedom.
One-way ANOVA (Analysis of variances)

Comments:
 The groups are independent.
 The size of each group is large enough, or each group is normally distributed.
 Equal variances must be assumed.
One-way ANOVA (Analysis of variances)
Example:
 Amount of alkaloid (mg) in a new herb from three regions:

Region A: 7.5, 6.8, 7.1, 7.5, 6.8, 6.6, 7.8
Region B: 5.8, 5.6, 6.1, 6.0, 5.7
Region C: 6.1, 6.3, 6.5, 6.4, 6.5, 6.3

 f ≈ 26.56, df1 = 2, df2 = 15, p-value ≈ 1.17e−05
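As a check, the ANOVA quantities can be recomputed directly from the raw data (the variable names are our own):

```python
# One-way ANOVA F statistic computed from SSA and SSE as defined above.
groups = [
    [7.5, 6.8, 7.1, 7.5, 6.8, 6.6, 7.8],   # Region A
    [5.8, 5.6, 6.1, 6.0, 5.7],             # Region B
    [6.1, 6.3, 6.5, 6.4, 6.5, 6.3],        # Region C
]
k = len(groups)
n = sum(len(g) for g in groups)
grand = sum(sum(g) for g in groups) / n            # overall mean

# between-groups and within-groups sums of squares
ssa = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
sse = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
f = (n - k) / (k - 1) * ssa / sse                  # ~ 26.56, df1 = 2, df2 = 15
```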
