Actuary Math and Statistics
Lecture Notes 1-9

Lecture 01, 02

1. Statistics
• Descriptive statistics: organize, summarize, and present collected data (sample of size n, population of size N).
  Examples: The average number of students in a class at White Oak University is 22.6. Last year's total attendance at Long Run High School's football games was 8235.
• Inferential statistics: predict, forecast, and verify under uncertainty.
  Examples: A recent study showed that eating garlic can lower blood pressure. It is predicted that the average number of automobiles each household owns will increase next year.
2. Types of Variable
Qualitative (Categorical):
• Nominal: incomparable; cannot be used in calculations. Examples: name, address, career, ...
• Ordinal: comparable; supports +, −. Examples: rank, size, ...
• Binary: takes only 2 values (male/female; yes/no; ...)
Quantitative (Scale):
• Discrete: countable. Examples: score, "the number of ..."
• Continuous: uncountable. Examples: height, temperature, ...
• Supports +, −, ×, ÷, ...
3. Level of measurement
Qualitative (Categorical):
• Nominal: names/labels.
• Ordinal: nominal + rank/order.
Quantitative (Scale):
• Interval: ordinal + the interval between 2 consecutive steps is equal. Examples: score, temperature, ...
• Ratio: interval + the ratio of 2 values is meaningful. Examples: height, time, ...
4. Tabular
Qualitative (Categorical): nominal / ordinal
• Frequency distribution table
• Relative frequency distribution
• Percent frequency distribution (%)
• Cross-tabulation (rows and columns), ...
Quantitative (Scale):
• Frequency distribution table
• Relative frequency distribution / percent frequency distribution (%)
• Cumulative frequency distribution
• Cumulative relative frequency distribution
• Cross-tabulation (rows and columns), ...
5. Graphics
Qualitative (Categorical):
• Pie chart
• Bar/column chart
• Clustered bar/column chart
• Stacked bar/column chart
Quantitative (Scale):
• Bar/column chart
• Clustered bar/column chart
• Stacked bar/column chart
• Group chart
• Histogram / histogram with cumulative frequencies
• Dot plot; line chart; ...
• Scatter plot (→ correlation)
• Bubble chart

6. Data
Population: listed data $(x_1, x_2, \ldots, x_N)$. Sample: listed data $(x_1, x_2, \ldots, x_n)$.

Frequency table:
• Population: values $x_1, x_2, \ldots, x_K$ with frequencies $f_1, f_2, \ldots, f_K$.
• Sample: values $x_1, x_2, \ldots, x_k$ with frequencies $f_1, f_2, \ldots, f_k$.

Grouped data:
• Classes $a_0\!-\!a_1, a_1\!-\!a_2, \ldots, a_{K-1}\!-\!a_K$ with frequencies $f_1, f_2, \ldots, f_K$ (population) or $f_1, \ldots, f_k$ (sample).
• Each class is represented by its midpoint $x_i = \frac{a_{i-1} + a_i}{2}$.
7. Measures of Location

Central tendency:
• Mean
  o Arithmetic mean: $\text{Mean} = \frac{\text{Sum of values}}{\text{Size}}$; same unit as $X$; "sensitive" to any change in an element's value.
  o Population: $\mu = \frac{\sum_{i=1}^{N} x_i}{N}$; sample: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$.
  o Geometric mean: $(x_1 \times x_2 \times \ldots \times x_{Size})^{1/Size}$ (average return/increase over several years).
• Median
  o The middle value of ordered data; the value at position $\frac{n+1}{2} = (n+1) \times 0.5$.
  o Population: $m_e$; sample: $\tilde{x}$.
  o Not affected by extreme values; useful for comparing data with outliers.
• Mode
  o The most frequent value; a data set may have 0, 1, or more than 1 mode.

Quantiles:
• Ordered data (the $k$-th value has quantile level $\frac{k}{n+1}$); sort smallest → largest.
• The quantile at level $\beta$ is $q_\beta$; $q_0 = x_{min}$, $q_1 = x_{max}$.
• The position of $q_\beta$ is $(n+1)\beta = \{int, dec\}$ (integer part, decimal part), and
  $q_\beta = x_{int} + dec \times (x_{int+1} - x_{int})$.
• Quartiles: 3 quartiles divide the data into 4 equal parts.
  o $Q_1$ (lower fourth) $= q_{0.25}$; $Q_2$ (median) $= q_{0.5}$; $Q_3$ (upper fourth) $= q_{0.75}$.
  o 5 key points: $x_{min} = q_0$; $Q_1 = q_{0.25}$; $Q_2 = \text{median} = q_{0.5}$; $Q_3 = q_{0.75}$; $x_{max} = q_1$.
  o $IQR = Q_3 - Q_1$.
  o A value is an outlier if it lies outside $(Q_1 - 1.5 \times IQR,\ Q_3 + 1.5 \times IQR)$, and an extreme outlier if it lies outside $(Q_1 - 3 \times IQR,\ Q_3 + 3 \times IQR)$.
• Quintiles: 4 quintiles divide the data into 5 equal parts.
• Deciles: 9 deciles divide the data into 10 equal parts.
• Percentiles: 99 percentiles divide the data into 100 equal parts.
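A minimal Python sketch of the interpolation rule and the outlier fences above (the helper name is mine; the data are the Example 1 population):

```python
def quantile(sorted_x, beta):
    """q_beta via position (n + 1) * beta = {int, dec}:
    q_beta = x_int + dec * (x_{int+1} - x_int), with 1-based positions,
    clamped to q_0 = x_min and q_1 = x_max at the edges."""
    n = len(sorted_x)
    pos = (n + 1) * beta
    i = int(pos)              # integer part
    dec = pos - i             # decimal part
    if i < 1:
        return sorted_x[0]
    if i >= n:
        return sorted_x[-1]
    return sorted_x[i - 1] + dec * (sorted_x[i] - sorted_x[i - 1])

x = sorted([2, 2, 3, 3, 3, 4, 4, 4, 4, 5])         # Example 1 population
q1, q2, q3 = (quantile(x, b) for b in (0.25, 0.5, 0.75))
iqr = q3 - q1
print(q1, q2, q3, iqr)                              # 2.75 3.5 4.0 1.25
print(q1 - 1.5 * iqr, q3 + 1.5 * iqr)               # outlier fences: 0.875 5.875
```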

8. Measures of Variability

• Range: $R = W = x_{max} - x_{min}$; the width of the interval covering 100% of the values.
• Interquartile range: $IQR = Q_3 - Q_1$; the width of the interval covering the middle 50% of the values. Also called the fourth spread, $f_s$.
• Variance:
  o Population: $\sigma^2 = \frac{S_{XX}}{N} = \frac{\sum_{i=1}^{N}(x_i - \mu)^2}{N} = \frac{\sum_{i=1}^{N} x_i^2}{N} - \mu^2$
  o Sample: $s^2 = \frac{S_{xx}}{n-1} = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1} = \frac{n}{n-1}\left(\frac{\sum x_i^2}{n} - \bar{x}^2\right) = \frac{n}{n-1} \cdot MS$
  o Absolute variability; the unit of the variance is the squared unit of $X$.
  o $Var(X) > Var(Y)$ → $X$ is more variable, dispersed, and fluctuating than $Y$; $Y$ is more stable and concentrated than $X$.
• Standard deviation (S.D.): $\sigma = \sqrt{\sigma^2}$ (population), $s = \sqrt{s^2}$ (sample); same unit as $X$; absolute variability.
• Coefficient of variation: $CV = \frac{\sigma}{\mu} \times 100(\%)$ (population), $CV = \frac{s}{\bar{x}} \times 100(\%)$ (sample).
  o Unit: %; relative variability.
  o CV is used to compare the variability of variables with different units.
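A short sketch of these formulas (the helper name is mine); note that the lecture's CV values below round the standard deviation first, so exact results differ slightly in the second decimal:

```python
import math

def describe(x, population=False):
    """Mean, variance, S.D., and CV (%): sigma^2 = sum(x^2)/N - mu^2 for a
    population, s^2 = n/(n-1) * (sum(x^2)/n - xbar^2) for a sample."""
    n = len(x)
    mean = sum(x) / n
    ms = sum(v * v for v in x) / n - mean ** 2   # MS: mean of squares minus squared mean
    var = ms if population else ms * n / (n - 1)
    sd = math.sqrt(var)
    return mean, var, sd, sd / mean * 100        # CV in %

print(describe([2, 2, 3, 3, 3, 4, 4, 4, 4, 5], population=True))  # 3.4, 0.84, 0.9165..., 26.95...
print(describe([2, 3, 2, 4, 5]))                                   # 3.2, 1.7, 1.3038..., 40.74...
```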
9. Measures of Shape

• Skewness: population $Skew = \frac{\sum_{1}^{N}(x_i - \mu)^3 / N}{\sigma^3}$; sample $Skew = \frac{\sum_{1}^{n}(x_i - \bar{x})^3 / n}{s^3}$.
  o Negative skew (left-skewed): Mean < Median. Symmetric: Mean = Median. Positive skew (right-skewed): Median < Mean.
• Kurtosis: population $Kurt = \frac{\sum_{1}^{N}(x_i - \mu)^4 / N}{\sigma^4}$; sample $Kurt = \frac{\sum_{1}^{n}(x_i - \bar{x})^4 / n}{s^4}$.

10. Standardized value: $z_i = \frac{x_i - \text{mean}}{\text{standard deviation}}$
Example 1
Population: (2, 2, 3, 3, 3, 4, 4, 4, 4, 5); sample: (2, 3, 2, 4, 5).

[Dot plots of the frequencies: population 2, 3, 4, 1 at values 2, 3, 4, 5; sample 2, 1, 1, 1 at values 2, 3, 4, 5.]

Sums: $N = 10$, $\sum x_i = 34$, $\sum x_i^2 = 124$ (population); $n = 5$, $\sum x_i = 16$, $\sum x_i^2 = 58$ (sample).

• Mean: $\mu = \frac{\sum_{1}^{10} x_i}{N} = \frac{34}{10} = 3.4$; $\bar{x} = \frac{\sum_{1}^{5} x_i}{n} = \frac{16}{5} = 3.2$.
• Median: $m_e = \frac{x_5 + x_6}{2} = \frac{3 + 4}{2} = 3.5$; sample sorted: 2, 2, 3, 4, 5, so $\tilde{x} = x_3 = 3$.
• Mode: population $Mode = 4$; sample $Mode = 2$.
• Quartiles (population): $(N+1) \times 0.25 = 11 \times 0.25 = 2.75$, so $Q_1 = q_{0.25} = x_2 + 0.75(x_3 - x_2) = 2 + 0.75(3 - 2) = 2.75$; $Q_2 = q_{0.5} = m_e = 3.5$; $(N+1) \times 0.75 = 8.25$, so $Q_3 = q_{0.75} = x_8 + 0.25(x_9 - x_8) = 4 + 0.25(4 - 4) = 4$.
• Quartiles (sample): $(n+1) \times 0.25 = 6 \times 0.25 = 1.5$, so $Q_1 = q_{0.25} = x_1 + 0.5(x_2 - x_1) = 2 + 0.5(2 - 2) = 2$; $Q_2 = q_{0.5} = \tilde{x} = 3$; $(n+1) \times 0.75 = 4.5$, so $Q_3 = q_{0.75} = x_4 + 0.5(x_5 - x_4) = 4 + 0.5(5 - 4) = 4.5$.
• Range: $R = 5 - 2 = 3$ for both.
• Interquartile range: $IQR = f_s = Q_3 - Q_1 = 4 - 2.75 = 1.25$ (population); $IQR = f_s = 4.5 - 2 = 2.5$ (sample).
• Variance: $\sigma^2 = \frac{124}{10} - 3.4^2 = 0.84$; $s^2 = \frac{5}{4}\left(\frac{58}{5} - 3.2^2\right) = 1.7$.
• Standard deviation (S.D.): $\sigma = \sqrt{0.84} \approx 0.92$; $s = \sqrt{1.7} \approx 1.3$.
• Coefficient of variation: $CV = \frac{0.92}{3.4} \times 100(\%) \approx 27.06\%$; $CV = \frac{1.3}{3.2} \times 100(\%) \approx 40.6\%$.
• Standardized values: population (−1.52; −1.52; −0.43; −0.43; −0.43; 0.65; 0.65; 0.65; 0.65; 1.74); sample (−0.92; −0.92; −0.15; 0.62; 1.38).
• Skewness: $skew = \frac{-0.72/10}{0.92^3} \approx -0.092$ (population); $skew = \frac{2.88/5}{1.3^3} \approx 0.26$ (sample).
• Kurtosis: $kurt = \frac{14.832/10}{0.92^4} \approx 2.07$ (population); $kurt = \frac{15.056/5}{1.3^4} \approx 1.05$ (sample).
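A sketch that reproduces the shape measures of Example 1 (exact arithmetic; the lecture rounds the standard deviation to 0.92 and 1.3 first, so its printed skewness and kurtosis differ slightly):

```python
import math

def shape(x, population=False):
    """Standardized values, Skew = (sum((x-m)^3)/n) / sd^3, and
    Kurt = (sum((x-m)^4)/n) / sd^4; only the variance denominator
    switches between N (population) and n-1 (sample)."""
    n = len(x)
    m = sum(x) / n
    var = sum((v - m) ** 2 for v in x) / (n if population else n - 1)
    sd = math.sqrt(var)
    z = [(v - m) / sd for v in x]
    skew = sum((v - m) ** 3 for v in x) / n / sd ** 3
    kurt = sum((v - m) ** 4 for v in x) / n / sd ** 4
    return z, skew, kurt

_, skew, kurt = shape([2, 2, 3, 3, 3, 4, 4, 4, 4, 5], population=True)
print(round(skew, 3), round(kurt, 2))   # -0.094 2.1 (lecture: -0.092, 2.07 with sd = 0.92)
```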
Example 2 (grouped data)
Population: classes 2-4, 4-6, 6-8, 8-10 with frequencies 2, 5, 7, 1. Sample: classes 2-4, 4-6, 6-8 with frequencies 1, 3, 2.

Class midpoints: population $x_i = 3, 5, 7, 9$ with frequencies 2, 5, 7, 1; sample $x_i = 3, 5, 7$ with frequencies 1, 3, 2.

Equivalent listed data: population (3, 3, 5, 5, 5, 5, 5, 7, 7, 7, 7, 7, 7, 7, 9); sample (3, 5, 5, 5, 7, 7).

Exercise: compute the mean, median, mode, quartiles, range, interquartile range, variance, standard deviation (S.D.), CV, standardized values, skewness, and kurtosis for both.
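A grouped-data sketch for Example 2, treating each class by its midpoint (the helper name is mine):

```python
def grouped_stats(bounds, freqs, population=False):
    """Mean and variance of grouped data via class midpoints (a_{i-1} + a_i) / 2."""
    mids = [(a + b) / 2 for a, b in zip(bounds, bounds[1:])]
    n = sum(freqs)
    mean = sum(f * m for f, m in zip(freqs, mids)) / n
    ss = sum(f * (m - mean) ** 2 for f, m in zip(freqs, mids))
    return mean, ss / (n if population else n - 1)

print(grouped_stats([2, 4, 6, 8, 10], [2, 5, 7, 1], population=True))  # (5.933..., 2.595...)
print(grouped_stats([2, 4, 6, 8], [1, 3, 2]))                           # (5.333..., 2.266...)
```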
11. Measures of Relationship

• Covariance:
  o Population: $Cov(X, Y) = \frac{\sum (x_i - \mu_X)(y_i - \mu_Y)}{N} = \mu_{XY} - \mu_X \mu_Y$
  o Sample: $Cov(X, Y) = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1} = \frac{n}{n - 1}\left(\overline{x \cdot y} - \bar{x} \cdot \bar{y}\right)$
• Correlation:
  o Population: $\rho_{XY} = \frac{Cov(X, Y)}{\sigma_X \sigma_Y}$; sample: $r_{XY} = \frac{Cov(X, Y)}{s_X s_Y}$
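A small sketch of covariance and correlation (the data and helper name are illustrative; sample version by default):

```python
import math

def cov_corr(x, y, population=False):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    cov = sxy / (n if population else n - 1)
    corr = sxy / math.sqrt(sxx * syy)    # the 1/N or 1/(n-1) factors cancel in the ratio
    return cov, corr

print(cov_corr([1, 2, 3, 4], [2, 3, 5, 4]))   # (1.333..., 0.8)
```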

Lecture 03

1. Population ≡ random variable $X$
• Population mean: $\mu = E(X)$
• Population variance: $\sigma^2 = V(X)$

2. Sample: random & observed
$\boldsymbol{X} = (X_1, X_2, \ldots, X_n)$ is a random sample ⇔ $X_1, X_2, \ldots, X_n$ are independent and identically distributed with $X$ (the $X_i$ are i.i.d.).
Observed sample: $(x_1, x_2, \ldots, x_n)$.

3. A statistic is a function of the random sample: $G = G(X_1, X_2, \ldots, X_n)$.
An observed sample gives $g = G_{stat} = G(x_1, x_2, \ldots, x_n)$: the observed value.

Example: $(X_1, X_2, \ldots, X_{10})$ is a random sample and $G = \frac{X_1 + X_2 + X_3}{3}$. Then
$E(G) = \frac{E(X_1) + E(X_2) + E(X_3)}{3} = \mu$ and $V(G) = \frac{V(X_1) + V(X_2) + V(X_3)}{3^2} = \frac{\sigma^2}{3}$.
Observed (8; 7; 4; 9; 10; 2; 6; 3; 5; 6) → $g = G_{stat} = \frac{8 + 7 + 4}{3} = 6.3333$.
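A simulation sketch of this example: the observed $g$ varies from sample to sample, and $E(G)$ and $V(G)$ can be checked empirically ($X$ is taken as $N(5, 2^2)$ purely for illustration):

```python
import random, statistics

mu, sigma = 5.0, 2.0
random.seed(1)
gs = []
for _ in range(100_000):
    xs = [random.gauss(mu, sigma) for _ in range(10)]   # random sample of size 10
    gs.append(sum(xs[:3]) / 3)                           # G = (X1 + X2 + X3) / 3
print(statistics.mean(gs))       # ≈ 5.0      = mu
print(statistics.variance(gs))   # ≈ 1.333... = sigma^2 / 3
```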
4. Basic statistics and their sampling distributions

Sample mean: $\bar{X} = \frac{\sum_{1}^{n} X_i}{n}$; observed $\bar{x} = \frac{\sum_{1}^{n} x_i}{n}$ (or $\bar{x} = \frac{\sum x_i n_i}{n}$ from a frequency table).
• $E(\bar{X}) = \mu$; $V(\bar{X}) = \frac{\sigma^2}{n}$
• If $X \sim N(\mu, \sigma^2)$, then $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$:
  o Known $\sigma^2$: $\frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \sim N(0, 1)$
  o Unknown $\sigma^2$: $\frac{\bar{X} - \mu}{S / \sqrt{n}} \sim T(n - 1)$
  o $n > 30$: $\frac{\bar{X} - \mu}{S / \sqrt{n}} \approx N(0, 1)$
• Interval for the sample mean:
  o Two-tailed: $\mu - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} < \bar{X} < \mu + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
  o Right-tailed: $\mu - z_\alpha \frac{\sigma}{\sqrt{n}} < \bar{X}$
  o Left-tailed: $\bar{X} < \mu + z_\alpha \frac{\sigma}{\sqrt{n}}$

$S_\mu^2 = \frac{\sum_{i}^{n} (X_i - \mu)^2}{n}$; observed $\frac{\sum_{i}^{n} (x_i - \mu)^2}{n}$.
• $E(S_\mu^2) = \sigma^2$; $V(S_\mu^2) = \frac{2\sigma^4}{n}$
• If $X \sim N(\mu, \sigma^2)$ with known $\mu$: $\frac{n S_\mu^2}{\sigma^2} \sim \chi^2(n)$ and $\frac{\bar{X} - \mu}{S_\mu / \sqrt{n}} \sim T(n)$

$MS = \frac{\sum_{1}^{n} X_i^2}{n} - (\bar{X})^2 = \overline{X^2} - (\bar{X})^2$; observed $\frac{\sum_{1}^{n} x_i^2}{n} - (\bar{x})^2 = \overline{x^2} - (\bar{x})^2$.
• $E(MS) = \frac{n - 1}{n} \sigma^2$; $V(MS) = \frac{2(n - 1)}{n^2} \sigma^4$
• If $X \sim N(\mu, \sigma^2)$ with unknown $\mu, \sigma^2$: $\frac{n \cdot MS}{\sigma^2} \sim \chi^2(n - 1)$

Sample variance: $S^2 = \frac{n}{n - 1} MS$.
• $E(S^2) = \sigma^2$; $V(S^2) = \frac{2\sigma^4}{n - 1}$
• If $X \sim N(\mu, \sigma^2)$ with unknown $\mu, \sigma^2$: $\frac{(n - 1) S^2}{\sigma^2} \sim \chi^2(n - 1)$
• Interval for the sample variance:
  o Two-tailed: $\frac{\sigma^2}{n - 1} \chi^2_{(n-1), 1-\alpha/2} < S^2 < \frac{\sigma^2}{n - 1} \chi^2_{(n-1), \alpha/2}$
  o Right-tailed: $\frac{\sigma^2}{n - 1} \chi^2_{(n-1), 1-\alpha} < S^2$
  o Left-tailed: $S^2 < \frac{\sigma^2}{n - 1} \chi^2_{(n-1), \alpha}$

Sample proportion: $X \sim B(1, p)$; $\hat{p} = \bar{X}$.
• $E(\hat{p}) = p$; $V(\hat{p}) = \frac{p(1 - p)}{n}$
• $n \geq 100$: $\hat{p} \sim N\left(p, \frac{p(1 - p)}{n}\right)$ with $\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}$
• Interval for the sample proportion:
  o Two-tailed: $p - z_{\alpha/2} \sigma_{\hat{p}} < \hat{p} < p + z_{\alpha/2} \sigma_{\hat{p}}$
  o Right-tailed: $p - z_\alpha \sigma_{\hat{p}} < \hat{p}$
  o Left-tailed: $\hat{p} < p + z_\alpha \sigma_{\hat{p}}$
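A quick empirical check of the two-tailed interval for the sample mean: $\bar{X}$ should fall between $\mu \pm z_{\alpha/2} \sigma / \sqrt{n}$ in about $1 - \alpha$ of repeated samples (parameters are illustrative):

```python
import random, statistics
from statistics import NormalDist

mu, sigma, n, alpha = 10.0, 3.0, 25, 0.05
z = NormalDist().inv_cdf(1 - alpha / 2)        # z_{alpha/2} ≈ 1.96
half = z * sigma / n ** 0.5
random.seed(2)
trials, hits = 50_000, 0
for _ in range(trials):
    xbar = statistics.mean(random.gauss(mu, sigma) for _ in range(n))
    hits += (mu - half < xbar < mu + half)
print(hits / trials)                            # ≈ 0.95 = 1 - alpha
```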

Lecture 04

• An estimator (e.g. $\bar{X}$, $S^2$) is a statistic on a random sample, hence a random variable.
• An estimate (e.g. $\bar{x}$, $s^2$) is the observed value of the statistic from the observed sample.
• Criteria for estimators:
  o $\hat{\theta}$ is an unbiased estimator of $\theta$ ⇔ $E(\hat{\theta}) = \theta$ ⇔ $bias = |E(\hat{\theta}) - \theta| = 0$.
  o If $E(\hat{\theta}_1) = E(\hat{\theta}_2) = \theta$ and $V(\hat{\theta}_1) < V(\hat{\theta}_2)$, then $\hat{\theta}_1$ is more efficient than $\hat{\theta}_2$.
  o If $E(\hat{\theta}) = \theta$ and $V(\hat{\theta})$ is minimal among all unbiased estimators, then $\hat{\theta}$ is an efficient estimator: the MVUE (minimum variance unbiased estimator), also called the BUE (best unbiased estimator).
  o Consistent estimator.
• Methods of finding estimators:
  o Percentile matching estimator. Use when the parameter can be calculated from percentiles/quantiles: from the population distribution, find the quantile formulas that are expressions of the parameters; estimate the population quantiles by the sample quantiles ($Q_2$, $Q_1$ or $Q_3$, ...) → estimate the parameters.
  o Moment estimator. Use when the parameter can be calculated from moments: estimate $k$ parameters by the first $k$ moments, i.e. estimate $E(X)$ by $\bar{X}$, $E(X^2)$ by $\overline{X^2}$, ...; estimate the moments → estimate the parameters.
  o Maximum likelihood estimator (MLE).
    Likelihood function: for a random variable $X$ with parameter $\theta$ and a random sample $\boldsymbol{X} = (X_1, X_2, \ldots, X_n)$, the likelihood function is
    $$L(\boldsymbol{X}, \theta) = \begin{cases} \prod_{i=1}^{n} P(X_i, \theta) & \text{discrete} \\ \prod_{i=1}^{n} f(X_i, \theta) & \text{continuous} \end{cases}$$
    The MLE of $\theta$ is the $\hat{\theta}$ that maximizes the likelihood function or its logarithm: $L(\boldsymbol{X}, \theta) \to \max$ or $\ln L(\boldsymbol{X}, \theta) \to \max$, i.e.
    $$\frac{\partial L(\boldsymbol{X}, \theta)}{\partial \theta} = 0 \;\Leftrightarrow\; \theta = \hat{\theta}, \qquad \left.\frac{\partial^2 L(\boldsymbol{X}, \theta)}{\partial \theta^2}\right|_{\theta = \hat{\theta}} < 0$$
    or equivalently, with $\ln L$ in place of $L$:
    $$\frac{\partial \ln L(\boldsymbol{X}, \theta)}{\partial \theta} = 0 \;\Leftrightarrow\; \theta = \hat{\theta}, \qquad \left.\frac{\partial^2 \ln L(\boldsymbol{X}, \theta)}{\partial \theta^2}\right|_{\theta = \hat{\theta}} < 0$$
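A minimal MLE sketch for a Bernoulli sample: solving $\partial \ln L / \partial p = 0$ gives the closed form $\hat{p} = \bar{x}$, and a coarse grid search over the log-likelihood finds the same maximizer (the data are illustrative):

```python
import math

x = [1, 0, 1, 1, 0, 1, 0, 1, 1, 1]
n, s = len(x), sum(x)

def loglik(p):
    # ln L(p) = sum(x) * ln(p) + (n - sum(x)) * ln(1 - p)
    return s * math.log(p) + (n - s) * math.log(1 - p)

p_hat = s / n                                          # analytic MLE = xbar
p_grid = max((i / 1000 for i in range(1, 1000)), key=loglik)
print(p_hat, p_grid)                                   # 0.7 0.7
```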

• Fisher information:
  o Bernoulli distribution: $\hat{p}$ is the MVUE of $p$ ⇒ $I_n(p) = \frac{1}{V(\hat{p})} = \frac{n}{p(1 - p)}$
  o Normal distribution: $\bar{X}$ is the MVUE of $\mu$ ⇒ $I_n(\mu) = \frac{1}{V(\bar{X})} = \frac{n}{\sigma^2}$
  o Normal distribution: $S^2$ is not the MVUE of $\sigma^2$ ⇒ $I_n(\sigma^2) = \,? > \frac{n - 1}{2\sigma^4} = \frac{1}{V(S^2)}$

Lecture 05

• Confidence interval (C.I.) = interval estimate; confidence level $= 1 - \alpha$.
• Prediction interval (P.I.): interval for a single random observation, with prediction level $1 - \alpha$.
• $P(\text{Lower Limit} < \theta < \text{Upper Limit}) = 1 - \alpha$; confidence width $w = UL - LL$.
1. Mean
a. Normal distribution, known $\sigma^2$
• Two-sided C.I.: $\bar{X} - z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}}$; shortened: $\bar{X} \pm ME$
  → $w = 2 z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}}$; margin of error $ME = z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}}$
b. Normal distribution, unknown $\sigma^2$
• Two-sided C.I.: $\bar{X} - t_{(n-1), \alpha/2} \times \frac{S}{\sqrt{n}} < \mu < \bar{X} + t_{(n-1), \alpha/2} \times \frac{S}{\sqrt{n}}$; shortened: $\bar{X} \pm ME$
  → $w = 2 t_{(n-1), \alpha/2} \times \frac{S}{\sqrt{n}}$; margin of error $ME = t_{(n-1), \alpha/2} \times \frac{S}{\sqrt{n}}$
• Right-sided C.I.: $\bar{X} - t_{(n-1), \alpha} \times \frac{S}{\sqrt{n}} < \mu$
• Left-sided C.I.: $\mu < \bar{X} + t_{(n-1), \alpha} \times \frac{S}{\sqrt{n}}$
• P.I.: $\bar{X} \pm t_{(n-1), \alpha/2} \times S \times \sqrt{1 + \frac{1}{n}}$
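A sketch of the two-sided t-based C.I. (the data are illustrative; assumes SciPy is available for the t quantile):

```python
from scipy import stats

x = [5.1, 4.8, 5.3, 5.0, 4.7, 5.2, 4.9, 5.4]
n = len(x)
xbar = sum(x) / n
s = (sum((v - xbar) ** 2 for v in x) / (n - 1)) ** 0.5
alpha = 0.05
t = stats.t.ppf(1 - alpha / 2, df=n - 1)   # t_{(n-1), alpha/2}
me = t * s / n ** 0.5                       # margin of error
print(xbar - me, xbar + me)                 # ≈ (4.845, 5.255)
```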
2. Variance: normal distribution, unknown $\mu$
• Two-sided C.I.: $\frac{(n-1) S^2}{\chi^2_{(n-1), \alpha/2}} < \sigma^2 < \frac{(n-1) S^2}{\chi^2_{(n-1), 1-\alpha/2}}$
• Right-sided C.I.: $\frac{(n-1) S^2}{\chi^2_{(n-1), \alpha}} < \sigma^2$
• Left-sided C.I.: $\sigma^2 < \frac{(n-1) S^2}{\chi^2_{(n-1), 1-\alpha}}$

3. Proportion
a. $n \geq 100$
• Two-sided C.I.: $\hat{p} - z_{\alpha/2} \times \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} < p < \hat{p} + z_{\alpha/2} \times \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$; shortened: $\hat{p} \pm ME$
  → $w = 2 z_{\alpha/2} \times \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$; margin of error $ME = z_{\alpha/2} \times \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
• Right-sided C.I.: $\hat{p} - z_\alpha \times \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} < p$
• Left-sided C.I.: $p < \hat{p} + z_\alpha \times \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
b. $n < 100$
Two-sided C.I.:
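A sketch for the large-sample case (a) above (the counts are illustrative):

```python
from statistics import NormalDist

n, successes = 400, 128
p_hat = successes / n                         # 0.32
alpha = 0.05
z = NormalDist().inv_cdf(1 - alpha / 2)       # z_{alpha/2}
me = z * (p_hat * (1 - p_hat) / n) ** 0.5     # margin of error
print(p_hat - me, p_hat + me)                 # ≈ (0.274, 0.366)
```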

Lecture 06

1. Test for a normal mean, known $\sigma$

Statistic: $\bar{X} \to \bar{X}_{stat} = \bar{x}$ and $Z = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}} \to Z_{stat}$; if $H_0$ is true, $Z \sim N(0, 1)$.

• $\{H_0: \mu = \mu_0;\ H_1: \mu > \mu_0\}$ (or $\{H_0: \mu \leq \mu_0;\ H_1: \mu > \mu_0\}$)
  o Reject region: $\bar{x} > \mu_0 + z_\alpha \frac{\sigma}{\sqrt{n}} \Leftrightarrow Z_{stat} > z_\alpha$; P-value: $P(Z > Z_{stat})$
  o $\beta = P(\text{Type II error})$ at $\mu = \mu_1 > \mu_0$: with $c = \mu_0 + z_\alpha \frac{\sigma}{\sqrt{n}}$,
    $\beta = P\left[\bar{X} \leq c \,\middle|\, \bar{X} \sim N\left(\mu_1, \frac{\sigma^2}{n}\right)\right] = P\left[Z \leq z_\alpha + \frac{\mu_0 - \mu_1}{\sigma / \sqrt{n}}\right]$
• $\{H_0: \mu = \mu_0;\ H_1: \mu < \mu_0\}$ (or $\{H_0: \mu \geq \mu_0;\ H_1: \mu < \mu_0\}$)
  o Reject region: $\bar{x} < \mu_0 - z_\alpha \frac{\sigma}{\sqrt{n}} \Leftrightarrow Z_{stat} < -z_\alpha$; P-value: $P(Z < Z_{stat}) = P(Z > -Z_{stat})$
  o $\beta$ at $\mu = \mu_1 < \mu_0$: with $c = \mu_0 - z_\alpha \frac{\sigma}{\sqrt{n}}$,
    $\beta = P\left[\bar{X} \geq c \,\middle|\, \bar{X} \sim N\left(\mu_1, \frac{\sigma^2}{n}\right)\right] = P\left[Z \geq -z_\alpha + \frac{\mu_0 - \mu_1}{\sigma / \sqrt{n}}\right]$
• $\{H_0: \mu = \mu_0;\ H_1: \mu \neq \mu_0\}$
  o Reject region: $|\bar{x} - \mu_0| > z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \Leftrightarrow |Z_{stat}| > z_{\alpha/2}$, i.e. $\bar{x} > \mu_0 + z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = c_1$ or $\bar{x} < \mu_0 - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = c_2$
  o P-value: $2 P(Z > |Z_{stat}|)$
  o $\beta$ at $\mu = \mu_1$: $\beta = P[c_2 \leq \bar{X} \leq c_1 \,|\, \mu = \mu_1]$
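A worked right-tailed example of the table above, including $\beta$ at a specific alternative (the numbers are illustrative):

```python
from statistics import NormalDist

Z = NormalDist()
mu0, sigma, n, alpha = 50.0, 8.0, 64, 0.05
xbar = 52.1                                   # observed sample mean
z_stat = (xbar - mu0) / (sigma / n ** 0.5)    # 2.1
z_a = Z.inv_cdf(1 - alpha)                    # z_alpha ≈ 1.645
print(z_stat > z_a)                           # True -> reject H0
print(1 - Z.cdf(z_stat))                      # P-value = P(Z > Z_stat) ≈ 0.018
mu1 = 52.0                                    # alternative mean, mu1 > mu0
c = mu0 + z_a * sigma / n ** 0.5              # critical value c ≈ 51.64
beta = Z.cdf((c - mu1) / (sigma / n ** 0.5))  # P(Xbar <= c | mu = mu1)
print(beta)                                   # ≈ 0.361
```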

2. Test for a normal mean, unknown $\sigma$

Statistic: $\bar{X} \to \bar{X}_{stat} = \bar{x}$ and $T = \frac{\bar{X} - \mu_0}{S / \sqrt{n}} \to T_{stat}$; if $H_0$ is true, $T \sim T(n - 1)$. For $n > 30$, $T(n - 1) \approx N(0, 1)$; for $n < 30$, use Excel or R.

• $\{H_0: \mu = \mu_0;\ H_1: \mu > \mu_0\}$ (or $\{H_0: \mu \leq \mu_0;\ H_1: \mu > \mu_0\}$)
  o Reject region: $T_{stat} > t_{(n-1), \alpha}$; P-value: $P[T(n-1) > T_{stat}]$
  o $\beta$ at $\mu = \mu_1 > \mu_0$: with $c = \mu_0 + t_{(n-1), \alpha} \frac{S}{\sqrt{n}}$, $\beta = P\left[\bar{X} < c \,\middle|\, \bar{X} \sim N\left(\mu_1, \frac{\sigma^2}{n}\right)\right]$
• $\{H_0: \mu = \mu_0;\ H_1: \mu < \mu_0\}$ (or $\{H_0: \mu \geq \mu_0;\ H_1: \mu < \mu_0\}$)
  o Reject region: $T_{stat} < -t_{(n-1), \alpha}$; P-value: $P[T(n-1) < T_{stat}] = P[T(n-1) > -T_{stat}]$; $\beta = ?$
• $\{H_0: \mu = \mu_0;\ H_1: \mu \neq \mu_0\}$
  o Reject region: $|T_{stat}| > t_{(n-1), \alpha/2}$; P-value: $2 \times P[T(n-1) > |T_{stat}|]$; $\beta = ?$

3. Test for a normal variance

Statistic: $\chi^2 = \frac{(n-1) S^2}{\sigma_0^2} \to \chi^2_{stat}$; if $H_0$ is true, $\chi^2 \sim \chi^2(n - 1)$. P-values via Excel or R.

• $\{H_0: \sigma^2 = \sigma_0^2;\ H_1: \sigma^2 > \sigma_0^2\}$ (or $\{H_0: \sigma^2 \leq \sigma_0^2;\ H_1: \sigma^2 > \sigma_0^2\}$)
  o Reject region: $\chi^2_{stat} > \chi^2_{(n-1), \alpha}$; P-value: $P[\chi^2(n-1) > \chi^2_{stat}]$
• $\{H_0: \sigma^2 = \sigma_0^2;\ H_1: \sigma^2 < \sigma_0^2\}$ (or $\{H_0: \sigma^2 \geq \sigma_0^2;\ H_1: \sigma^2 < \sigma_0^2\}$)
  o Reject region: $\chi^2_{stat} < \chi^2_{(n-1), 1-\alpha}$; P-value: $P[\chi^2(n-1) < \chi^2_{stat}]$
• $\{H_0: \sigma^2 = \sigma_0^2;\ H_1: \sigma^2 \neq \sigma_0^2\}$
  o Reject region: $\chi^2_{stat} > \chi^2_{(n-1), \alpha/2}$ or $\chi^2_{stat} < \chi^2_{(n-1), 1-\alpha/2}$
  o P-value: if $s^2 > \sigma_0^2$, P-value $= 2 \times P[\chi^2(n-1) > \chi^2_{stat}]$; if $s^2 < \sigma_0^2$, P-value $= 2 \times P[\chi^2(n-1) < \chi^2_{stat}]$
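A right-tailed example of the variance test (the numbers are illustrative; assumes SciPy for the chi-square quantile and tail):

```python
from scipy import stats

n, s2, sigma0_sq, alpha = 20, 2.9, 2.0, 0.05
chi_stat = (n - 1) * s2 / sigma0_sq             # 27.55
crit = stats.chi2.ppf(1 - alpha, df=n - 1)      # chi2_{(n-1), alpha} ≈ 30.14
p_value = stats.chi2.sf(chi_stat, df=n - 1)     # P[chi2(n-1) > chi_stat] ≈ 0.092
print(chi_stat > crit, p_value)                 # False -> do not reject H0
```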

4. Test for a population proportion, $n \geq 100$ (large sample) (slide 157)

Statistic: $\hat{p} \to \hat{p}_{stat}$ and $Z = \frac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0)} / \sqrt{n}} \to Z_{stat}$; if $H_0$ is true, $Z \sim N(0, 1)$.

• $\{H_0: p = p_0;\ H_1: p > p_0\}$ (or $\{H_0: p \leq p_0;\ H_1: p > p_0\}$)
  o Reject region: $\hat{p} > p_0 + z_\alpha \frac{\sqrt{p_0(1-p_0)}}{\sqrt{n}} \Leftrightarrow Z_{stat} > z_\alpha$; P-value: $P(Z > Z_{stat})$
  o $\beta = P(\text{Type II error})$ at $p = p_1 > p_0$: with $c = p_0 + z_\alpha \frac{\sqrt{p_0(1-p_0)}}{\sqrt{n}}$, $\beta = P\left[\hat{p} \leq c \,\middle|\, \hat{p} \sim N\left(p_1, \frac{p_1(1-p_1)}{n}\right)\right]$
• $\{H_0: p = p_0;\ H_1: p < p_0\}$ (or $\{H_0: p \geq p_0;\ H_1: p < p_0\}$)
  o Reject region: $\hat{p} < p_0 - z_\alpha \frac{\sqrt{p_0(1-p_0)}}{\sqrt{n}} \Leftrightarrow Z_{stat} < -z_\alpha$; P-value: $P(Z < Z_{stat}) = P(Z > -Z_{stat})$
  o $\beta$ at $p = p_1 < p_0$: with $c = p_0 - z_\alpha \frac{\sqrt{p_0(1-p_0)}}{\sqrt{n}}$, $\beta = P\left[\hat{p} \geq c \,\middle|\, \hat{p} \sim N\left(p_1, \frac{p_1(1-p_1)}{n}\right)\right]$
• $\{H_0: p = p_0;\ H_1: p \neq p_0\}$
  o Reject region: $|\hat{p} - p_0| > z_{\alpha/2} \frac{\sqrt{p_0(1-p_0)}}{\sqrt{n}} \Leftrightarrow |Z_{stat}| > z_{\alpha/2}$; P-value: $2 P(Z > |Z_{stat}|)$; $\beta = ?$

5. Test for a population proportion, small sample

Use the exact binomial distribution of the frequency: under $H_0$, $X \sim B(n, p = p_0)$, with $P(X = x) = C_n^x \, p_0^x (1 - p_0)^{n-x}$ for $x = 0, 1, \ldots, n$.

• $\{H_0: p = p_0;\ H_1: p > p_0\}$ (or $\{H_0: p \leq p_0;\ H_1: p > p_0\}$): reject if $freq \geq crit$, where $crit$ satisfies $P(X \geq crit \,|\, p = p_0) < \alpha$; P-value: $P(X \geq freq_{stat})$
• $\{H_0: p = p_0;\ H_1: p < p_0\}$ (or $\{H_0: p \geq p_0;\ H_1: p < p_0\}$): reject if $freq \leq crit$, where $crit$ satisfies $P(X \leq crit \,|\, p = p_0) < \alpha$; P-value: $P(X \leq freq_{stat})$
• $\{H_0: p = p_0;\ H_1: p \neq p_0\}$: reject if $freq \geq crit_1$ or $freq \leq crit_2$, where $P(X \geq crit_1 \,|\, p = p_0) < \alpha/2$ and $P(X \leq crit_2 \,|\, p = p_0) < \alpha/2$
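An exact right-tailed example (the numbers are illustrative; assumes SciPy for the binomial tail):

```python
from scipy import stats

n, p0, freq, alpha = 20, 0.3, 10, 0.05
p_value = stats.binom.sf(freq - 1, n, p0)   # P(X >= freq | p = p0) ≈ 0.048
print(p_value, p_value < alpha)             # True -> reject H0
```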

Lecture 07

Inference for 2 means: $X_1 \sim N(\mu_1, \sigma_1^2)$, $X_2 \sim N(\mu_2, \sigma_2^2)$; test $H_0: \mu_1 = \mu_2$. First ask: paired sample or not?

Case 1: paired samples. Let $d = X_1 - X_2$; the sample gives $n, \bar{d}, s_d$. Test with $T_{stat} = \frac{\bar{d} - 0}{s_d / \sqrt{n}}$:
• $\{H_0: \mu_d = 0;\ H_1: \mu_d > 0\}$: reject if $T_{stat} > t_{(n-1), \alpha}$; P-value $P[T(n-1) > T_{stat}]$
• $\{H_0: \mu_d = 0;\ H_1: \mu_d < 0\}$: reject if $T_{stat} < -t_{(n-1), \alpha}$; P-value $P[T(n-1) < T_{stat}]$
• $\{H_0: \mu_d = 0;\ H_1: \mu_d \neq 0\}$: reject if $|T_{stat}| > t_{(n-1), \alpha/2}$; P-value $2 \times P[T(n-1) > |T_{stat}|]$
If $H_0$ is rejected → C.I. for $\mu_1 - \mu_2$: $\bar{d} \pm t_{(n-1), \alpha/2} \frac{s_d}{\sqrt{n}}$

Case 2: independent samples, known $\sigma_1^2, \sigma_2^2$. Statistic: $Z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} \sim N(0, 1)$
• $H_1: \mu_1 > \mu_2$: reject if $Z_{stat} > z_\alpha$; P-value $P(Z > Z_{stat})$
• $H_1: \mu_1 < \mu_2$: reject if $Z_{stat} < -z_\alpha$; P-value $P(Z < Z_{stat})$
• $H_1: \mu_1 \neq \mu_2$: reject if $|Z_{stat}| > z_{\alpha/2}$; P-value $2 \times P(Z > |Z_{stat}|)$
If $H_0$ is rejected → C.I. for $\mu_1 - \mu_2$: ...

Case 3: independent samples, unknown variances. First test whether $\sigma_1^2 = \sigma_2^2$ with $F_{stat} = \frac{S_1^2}{S_2^2}$:
• $\{H_0: \sigma_1^2 = \sigma_2^2;\ H_1: \sigma_1^2 \neq \sigma_2^2\}$: reject if $F_{stat} > f_{(n_1-1, n_2-1), \alpha/2}$ or $F_{stat} < f_{(n_1-1, n_2-1), 1-\alpha/2}$
• $H_1: \sigma_1^2 > \sigma_2^2$: reject if $F_{stat} > f_{(n_1-1, n_2-1), \alpha}$
• $H_1: \sigma_1^2 < \sigma_2^2$: reject if $F_{stat} < f_{(n_1-1, n_2-1), 1-\alpha}$
If $H_0$ is rejected → C.I. for $\frac{\sigma_1^2}{\sigma_2^2}$: ...

Case 3a: $\sigma_1^2 = \sigma_2^2$ (pooled). $T = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{S_p^2}{n_1} + \frac{S_p^2}{n_2}}} \sim T(n_1 + n_2 - 2)$, with $S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}$
• $H_1: \mu_1 > \mu_2$: reject if $T_{stat} > t_{(n_1+n_2-2), \alpha}$; P-value $P(T > T_{stat})$
• $H_1: \mu_1 < \mu_2$: reject if $T_{stat} < -t_{(n_1+n_2-2), \alpha}$; P-value $P(T < T_{stat})$
• $H_1: \mu_1 \neq \mu_2$: reject if $|T_{stat}| > t_{(n_1+n_2-2), \alpha/2}$; P-value $2 \times P(T > |T_{stat}|)$

Case 3b: $\sigma_1^2 \neq \sigma_2^2$ (Welch). $T = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}} \sim T(df)$, with $df = \frac{(S_1^2/n_1 + S_2^2/n_2)^2}{\frac{(S_1^2/n_1)^2}{n_1 - 1} + \frac{(S_2^2/n_2)^2}{n_2 - 1}}$
• $H_1: \mu_1 > \mu_2$: reject if $T_{stat} > t_{(df), \alpha}$; P-value $P(T > T_{stat})$
• $H_1: \mu_1 < \mu_2$: reject if $T_{stat} < -t_{(df), \alpha}$; P-value $P(T < T_{stat})$
• $H_1: \mu_1 \neq \mu_2$: reject if $|T_{stat}| > t_{(df), \alpha/2}$; P-value $2 \times P(T > |T_{stat}|)$
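A Welch-case sketch (the data are illustrative; assumes SciPy, whose ttest_ind with equal_var=False implements the statistic and df above):

```python
import statistics
from scipy import stats

x1 = [12.1, 11.8, 13.0, 12.6, 11.9, 12.4]
x2 = [11.2, 11.9, 11.0, 11.5, 11.4, 11.7, 11.1]
t_stat, p_value = stats.ttest_ind(x1, x2, equal_var=False)   # Welch's t, two-sided
print(t_stat, p_value)

# The df formula by hand:
v1 = statistics.variance(x1) / len(x1)
v2 = statistics.variance(x2) / len(x2)
df = (v1 + v2) ** 2 / (v1 ** 2 / (len(x1) - 1) + v2 ** 2 / (len(x2) - 1))
print(df)
```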

Inference for 2 proportions

Statistic: $Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\bar{p}(1 - \bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} \sim N(0, 1)$, with pooled proportion $\bar{p} = \frac{n_1 \hat{p}_1 + n_2 \hat{p}_2}{n_1 + n_2}$
• $\{H_0: p_1 = p_2;\ H_1: p_1 > p_2\}$: reject if $Z_{stat} > z_\alpha$; P-value $P(Z > Z_{stat})$
• $H_1: p_1 < p_2$: reject if $Z_{stat} < -z_\alpha$; P-value $P(Z < Z_{stat})$
• $H_1: p_1 \neq p_2$: reject if $|Z_{stat}| > z_{\alpha/2}$; P-value $2 \times P(Z > |Z_{stat}|)$

C.I.: $p_1 - p_2 \in (\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2} \sqrt{\frac{\hat{p}_1 (1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2 (1 - \hat{p}_2)}{n_2}}$

Correlation test (formula and table).
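A two-sided sketch of the pooled-proportion z-test (the counts are illustrative):

```python
from statistics import NormalDist

n1, f1, n2, f2 = 200, 64, 250, 60                 # sizes and success counts
p1, p2 = f1 / n1, f2 / n2                         # 0.32, 0.24
p_bar = (f1 + f2) / (n1 + n2)                     # pooled proportion
se = (p_bar * (1 - p_bar) * (1 / n1 + 1 / n2)) ** 0.5
z_stat = (p1 - p2) / se
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))
print(z_stat, p_value)                            # ≈ 1.89, 0.059
```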

Lecture 08: ANOVA: to test for equality of means
1. One-factor ANOVA
2. Two-factor ANOVA without interaction
3. Two-factor ANOVA with interaction
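A minimal sketch of the one-factor case: $F = MSB/MSW$ is compared with $F(k-1, n-k)$ under $H_0$ that all group means are equal (standard form, not taken from the lecture slides; the data are illustrative; assumes SciPy):

```python
from scipy import stats

groups = [[5.9, 6.1, 6.3, 5.8], [6.5, 6.9, 6.4, 6.8], [5.5, 5.9, 5.6, 5.7]]
f_stat, p_value = stats.f_oneway(*groups)
print(f_stat, p_value)

# The same F by hand: between-group vs within-group mean squares.
k = len(groups)
n = sum(len(g) for g in groups)
grand = sum(sum(g) for g in groups) / n
ssb = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
ssw = sum(sum((v - sum(g) / len(g)) ** 2 for v in g) for g in groups)
print((ssb / (k - 1)) / (ssw / (n - k)))
```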
Lecture 09

1. Chi-squared test

2. Independence test (easy and common)

It can be proved that $\chi^2_{stat} = n \left( \sum_i \sum_j \frac{F_{ij}^2}{R_i C_j} - 1 \right)$, where $F_{ij}$ are the observed frequencies, $R_i$ the row totals, and $C_j$ the column totals.
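A quick check that this shortcut matches the usual $\sum (O - E)^2 / E$ with $E_{ij} = R_i C_j / n$ (the table values are illustrative):

```python
F = [[20, 30, 25], [30, 20, 25]]                  # observed frequencies F_ij
R = [sum(row) for row in F]                       # row totals R_i
C = [sum(col) for col in zip(*F)]                 # column totals C_j
n = sum(R)
shortcut = n * (sum(F[i][j] ** 2 / (R[i] * C[j])
                    for i in range(len(R)) for j in range(len(C))) - 1)
standard = sum((F[i][j] - R[i] * C[j] / n) ** 2 / (R[i] * C[j] / n)
               for i in range(len(R)) for j in range(len(C)))
print(shortcut, standard)                         # both 4.0
```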

3. Rank test (critical values from a table)

4. Normality test
Jarque-Bera test (easy and common)
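A minimal sketch of the Jarque-Bera statistic, $JB = \frac{n}{6}\left(S^2 + \frac{(K-3)^2}{4}\right)$, compared with $\chi^2(2)$ (standard form; the notes give no formula here, and the data are illustrative):

```python
def jarque_bera(x):
    n = len(x)
    m = sum(x) / n
    sd = (sum((v - m) ** 2 for v in x) / n) ** 0.5     # JB uses the /n moments
    s = sum((v - m) ** 3 for v in x) / n / sd ** 3     # skewness S
    k = sum((v - m) ** 4 for v in x) / n / sd ** 4     # kurtosis K
    return n / 6 * (s ** 2 + (k - 3) ** 2 / 4)

data = [2, 3, 2, 4, 5, 3, 3, 4, 2, 5, 4, 3]
print(jarque_bera(data))   # compare with the chi2(2) critical value 5.99 at alpha = 0.05
```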
