Formula Sheet_Test 2 - STAT4001

This document is a formula sheet for Test 2 of Stat4001, scheduled for July 24, 2024. It includes key concepts in classical and empirical probability, probability rules, distributions (such as binomial and Poisson), and confidence intervals for proportions and means. Additionally, it lists relevant Excel functions for statistical calculations.

Uploaded by

J.C.

Formula Sheet for Test 2 of Stat4001

2:00 pm – 5:00 pm on Wednesday July 24, 2024 at SJC-207

Student ID:____________ Signature: _______________________

Classical Probability:

$$P(\text{event}) = \frac{\text{number of favourable outcomes}}{\text{total number of possible outcomes}}$$

Empirical Probability:

$$P(\text{event happening}) = \frac{\text{number of times the event occurred in the past}}{\text{total number of observations}}$$

Probability Rules

Addition Rule: If A and B are disjoint events, then $P(A \text{ or } B) = P(A) + P(B)$

General Addition Rule: For any two events A and B

$$P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$$


Complement Rule: $P(A) = 1 - P(A^c)$

Conditional Probability: The probability of A given B is denoted $P(A \mid B)$, and

$$P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)}, \quad \text{if } P(B) > 0$$

General Multiplication Rule: For any two events A and B

𝑃(𝐴 𝑎𝑛𝑑 𝐵) = 𝑃(𝐴)𝑃(𝐵|𝐴) = 𝑃(𝐵)𝑃(𝐴|𝐵)


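Bayes Rule: If the events $A_1, A_2, \ldots$ are mutually exclusive and together cover all possible outcomes, then

$$P(A_i \mid B) = \frac{P(B \mid A_i) P(A_i)}{\sum_j P(B \mid A_j) P(A_j)}$$
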
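
As a rough illustration of Bayes Rule, the sketch below computes the posterior probabilities from a list of priors and likelihoods. The function name `bayes_posterior` and the machine/defect numbers are made up for this example and are not part of the formula sheet.

```python
def bayes_posterior(priors, likelihoods):
    """Return P(A_i | B) for each i, given P(A_i) and P(B | A_i)."""
    # Numerators of Bayes Rule: P(B | A_i) * P(A_i)
    joint = [p * l for p, l in zip(priors, likelihoods)]
    # Denominator: sum of the numerators over all events A_i
    total = sum(joint)
    return [j / total for j in joint]


# Hypothetical data: two machines make 60% and 40% of all items,
# with defect rates of 2% and 5% respectively.
posterior = bayes_posterior([0.6, 0.4], [0.02, 0.05])
print(posterior)  # chance a defective item came from machine 1, machine 2
```

The posteriors always add up to 1, which is a quick sanity check on a hand calculation.
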
Independence: If events A and B are independent, then
𝑃(𝐴 𝑎𝑛𝑑 𝐵) = 𝑃(𝐴)𝑃(𝐵)

Expected Value: $\mu = EV = E(X) = \sum x P(x)$

Variance: $\sigma^2 = \mathrm{Var}(X) = \sum (x - \mu)^2 P(x)$, Standard Deviation: $\sigma = SD(X) = \sqrt{\mathrm{Var}(X)}$
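
The expected value and variance formulas can be checked numerically. The sketch below assumes a small made-up probability table (values 0, 1, 2 with probabilities 0.25, 0.5, 0.25); the helper name `discrete_summary` is hypothetical.

```python
import math


def discrete_summary(values, probs):
    """Return (mean, variance, standard deviation) of a discrete random variable."""
    mu = sum(x * p for x, p in zip(values, probs))               # E(X) = sum of x P(x)
    var = sum((x - mu) ** 2 * p for x, p in zip(values, probs))  # sum of (x - mu)^2 P(x)
    return mu, var, math.sqrt(var)


# Made-up probability model for illustration only
mu, var, sd = discrete_summary([0, 1, 2], [0.25, 0.5, 0.25])
print(mu, var, sd)
```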


The Empirical Rule: For a symmetrical, bell-shaped frequency distribution,

➢ Approximately 68 percent of the observations will lie within ±1 standard deviation of the mean.
➢ About 95 percent of the observations will lie within ±2 standard deviations of the mean.
➢ Practically all (99.7 percent) will lie within ±3 standard deviations of the mean.

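Addition Rule for Expected Values of Random Variables: $E(X \pm Y) = E(X) \pm E(Y)$

Addition Rule for Variances of Random Variables: If random variables X and Y are independent, then

$$\mathrm{Var}(X \pm Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$$
$$SD(X \pm Y) = \sqrt{\mathrm{Var}(X) + \mathrm{Var}(Y)}$$

The variances add even when the random variables are subtracted.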
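
One way to convince yourself that variances add even for a difference is to enumerate a small case exactly. The sketch below assumes two independent fair six-sided dice (an example chosen here, not taken from the sheet) and uses exact fractions.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
p = Fraction(1, 6)  # probability of each face of a fair die


def var(dist):
    """Variance of a discrete distribution given as {value: probability}."""
    mu = sum(x * q for x, q in dist.items())
    return sum((x - mu) ** 2 * q for x, q in dist.items())


single = {x: p for x in faces}

# Distribution of the difference X - Y of two independent dice
diff = {}
for x, y in product(faces, faces):
    diff[x - y] = diff.get(x - y, 0) + p * p

var_single = var(single)
var_diff = var(diff)
print(var_single, var_diff)  # Var(X - Y) equals Var(X) + Var(Y)
```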

If c is a constant number, then

$$E(X \pm c) = E(X) \pm c, \qquad E(cX) = cE(X)$$
$$\mathrm{Var}(X \pm c) = \mathrm{Var}(X), \qquad \mathrm{Var}(cX) = c^2 \mathrm{Var}(X)$$
$$SD(X \pm c) = SD(X), \qquad SD(cX) = |c| \, SD(X)$$

Bernoulli Trial
➢ There are only two possible outcomes (success and failure) for each trial.
➢ The probability of success, denoted p, is the same for each trial. The probability
of failure is q = 1 – p.
➢ The trials are independent.

Geometric Distribution: Random variable X represents the number of trials until the first success happens.

$$P(X = k) = q^{k-1} p, \quad k = 1, 2, 3, \ldots; \qquad \mu = E(X) = \frac{1}{p}, \qquad \sigma = SD(X) = \sqrt{\frac{q}{p^2}}$$
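
A minimal sketch of the geometric formulas, assuming a made-up success probability of p = 0.2; the function name `geometric_pmf` is hypothetical.

```python
import math


def geometric_pmf(k, p):
    """P(X = k): first success occurs on trial k (k = 1, 2, 3, ...)."""
    q = 1 - p
    return q ** (k - 1) * p


p = 0.2                               # assumed probability of success
prob_third = geometric_pmf(3, p)      # two failures, then a success
mean = 1 / p                          # mu = 1/p
sd = math.sqrt((1 - p) / p ** 2)      # sigma = sqrt(q / p^2)
print(prob_third, mean, sd)
```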

Binomial Distribution: Random variable X represents the number of successes in n trials.

$$P(X = k) = \binom{n}{k} p^k q^{n-k} = {}_nC_k \, p^k q^{n-k}, \quad k = 0, 1, 2, \ldots, n; \qquad \mu = E(X) = np; \qquad \sigma = SD(X) = \sqrt{npq}$$
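
The binomial formula can be evaluated directly with `math.comb`. The sketch below assumes n = 10 trials with p = 0.5 purely for illustration; `binomial_pmf` is a hypothetical helper name.

```python
from math import comb, sqrt


def binomial_pmf(k, n, p):
    """P(X = k) = nCk * p^k * q^(n-k) with q = 1 - p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


n, p = 10, 0.5                      # assumed example values
prob_3 = binomial_pmf(3, n, p)      # probability of exactly 3 successes
mean = n * p                        # mu = np
sd = sqrt(n * p * (1 - p))          # sigma = sqrt(npq)
print(prob_3, mean, sd)
```

Summing `binomial_pmf(i, n, p)` for i from 0 to k gives the cumulative probability P(X ≤ k).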

Poisson Distribution: X represents the number of events that occur over a given interval of time or space.

$$P(X = k) = \frac{e^{-\lambda} \lambda^k}{k!}, \quad k = 0, 1, 2, \ldots; \qquad \mu = E(X) = \lambda, \qquad \sigma = SD(X) = \sqrt{\lambda}$$
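
A short sketch of the Poisson formula, assuming an average rate of λ = 2 events per interval (a made-up value). Note that the cumulative probability starts its sum at k = 0.

```python
from math import exp, factorial, sqrt


def poisson_pmf(k, lam):
    """P(X = k) = e^(-lambda) * lambda^k / k!"""
    return exp(-lam) * lam ** k / factorial(k)


lam = 2.0                                                # assumed rate
p_zero = poisson_pmf(0, lam)                             # P(X = 0)
at_most_2 = sum(poisson_pmf(i, lam) for i in range(3))   # P(X <= 2), i = 0, 1, 2
sd = sqrt(lam)                                           # sigma = sqrt(lambda)
print(p_zero, at_most_2, sd)
```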

Uniform Distribution:

For values c and d ($c \le d$) both within the interval [a, b]:

$$P(c \le X \le d) = \frac{d - c}{b - a}$$

$$E(X) = \frac{a + b}{2}, \qquad \mathrm{Var}(X) = \frac{(b - a)^2}{12}, \qquad SD(X) = \sqrt{\frac{(b - a)^2}{12}}$$

Standard Normal Value: $z = \frac{X - \mu}{\sigma}$

If random variable X is normally distributed with mean $\mu$ and standard deviation $\sigma$, then $z = \frac{X - \mu}{\sigma}$ follows the standard normal distribution.

Normal Approximation:

A discrete Binomial model is approximately Normal if

$$np \ge 10 \quad \text{and} \quad nq \ge 10$$

Then, for sufficiently large n, the random variable $\frac{x - \mu}{\sigma}$ has approximately a standard normal distribution, where $\mu = np$ and $\sigma = \sqrt{np(1 - p)}$.

Equivalently, x is approximately normally distributed with mean $\mu = np$ and standard deviation $\sigma = \sqrt{np(1 - p)}$.
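
To see how close the approximation is, the sketch below compares an exact binomial probability with its normal approximation. It assumes n = 100 and p = 0.5 (made-up values) and applies no continuity correction, since the sheet does not mention one.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.5                     # assumed example values
q = 1 - p

# Success/failure condition from the formula sheet
condition_ok = n * p >= 10 and n * q >= 10

mu = n * p
sigma = sqrt(n * p * q)

# Exact P(X <= 55) from the binomial formula
exact = sum(comb(n, i) * p ** i * q ** (n - i) for i in range(56))

# Normal approximation: P(Z <= (55 - mu) / sigma)
approx = NormalDist(mu, sigma).cdf(55)

print(condition_ok, exact, approx)
```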

Exponential Distribution:

$$P(s \le X \le t) = e^{-\lambda s} - e^{-\lambda t}, \qquad P(X \le t) = 1 - e^{-\lambda t}, \qquad \mu = E(X) = \frac{1}{\lambda}, \qquad \sigma = SD(X) = \frac{1}{\lambda}$$

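
The two exponential probabilities are related: the interval formula is the difference of two cumulative values. A minimal sketch, assuming a rate of λ = 0.5 (a made-up value) and hypothetical function names:

```python
from math import exp


def expon_cdf(t, lam):
    """P(X <= t) = 1 - e^(-lambda * t)"""
    return 1 - exp(-lam * t)


def expon_between(s, t, lam):
    """P(s <= X <= t) = e^(-lambda * s) - e^(-lambda * t)"""
    return exp(-lam * s) - exp(-lam * t)


lam = 0.5                             # assumed rate
within_2 = expon_cdf(2, lam)          # P(X <= 2)
between = expon_between(1, 3, lam)    # P(1 <= X <= 3)
mean = 1 / lam                        # mu = sigma = 1 / lambda
print(within_2, between, mean)
```
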
Central Limit Theorem

The means of all possible random samples of the same size have a sampling distribution that is approximately normal. The larger the sample size, the better the approximation.

Sampling Distribution for Proportion

$$\mu(\hat{p}) = p, \qquad SD(\hat{p}) = \sqrt{\frac{p(1 - p)}{n}} = \sqrt{\frac{pq}{n}}$$

The normal model $N\left(p, \sqrt{\frac{pq}{n}}\right)$ is a sampling distribution model for the sample proportion.

n is the sample size, and q = 1 – p is the proportion of failures.

Sampling Distribution for Mean

When a random sample of size n is drawn from any population with mean μ and standard deviation σ, its sample mean, $\bar{x}$, has an approximately normal distribution (for sufficiently large n) with mean μ and standard deviation

$$SD(\bar{x}) = \frac{\sigma}{\sqrt{n}}$$

Confidence Interval for Proportions

$\hat{p}$ is the sample estimate of the true proportion $p$ and $\hat{q} = 1 - \hat{p}$.

Standard Error is $SE(\hat{p}) = \sqrt{\frac{\hat{p}\hat{q}}{n}} = \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$

Confidence Interval is $\hat{p} \pm z^* SE(\hat{p}) = \hat{p} \pm z^* \sqrt{\frac{\hat{p}\hat{q}}{n}}$, Margin of Error, $ME = z^* \sqrt{\frac{\hat{p}\hat{q}}{n}}$

$z^*$ is the critical value based on the confidence level, n is the sample size.
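
As a worked sketch of the interval above, the code below assumes 120 successes in a sample of 400 (made-up data) and a 95% confidence level. The function name `proportion_ci` is hypothetical; the critical value comes from the standard normal inverse CDF in Python's `statistics` module.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Return (lower, upper, margin of error) for a one-proportion z-interval."""
    p_hat = successes / n
    q_hat = 1 - p_hat
    # Critical value z*: leaves (1 - confidence)/2 in each tail
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(p_hat * q_hat / n)      # standard error of p-hat
    me = z_star * se                  # margin of error
    return p_hat - me, p_hat + me, me


low, high, me = proportion_ci(120, 400)   # made-up sample
print(low, high, me)
```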

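Confidence Interval for the difference between two Proportions

$$(\hat{p}_1 - \hat{p}_2) \pm z^* SE(\hat{p}_1 - \hat{p}_2), \qquad SE(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}}, \qquad \hat{q}_1 = 1 - \hat{p}_1, \quad \hat{q}_2 = 1 - \hat{p}_2$$

where $\hat{p}_1$ and $\hat{p}_2$ are the sample estimates of the true proportions from population 1 and 2, respectively, and $n_1$ and $n_2$ are the corresponding sample sizes. $z^*$ is the critical value based on the confidence level.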
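
The two-proportion interval follows the same pattern. This sketch assumes 60 successes out of 100 in the first sample and 50 out of 100 in the second (made-up data); `two_proportion_ci` is a hypothetical name.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ci(x1, n1, x2, n2, confidence=0.95):
    """Return (lower, upper) for the difference p1 - p2."""
    p1, p2 = x1 / n1, x2 / n2
    # SE of the difference: the two variance terms are added
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se


low, high = two_proportion_ci(60, 100, 50, 100)   # made-up samples
print(low, high)
```

If the interval contains 0, the data are consistent with no difference between the two proportions at that confidence level.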

Confidence Interval for Means when Population Standard Deviation is known

Confidence Interval is $\bar{y} \pm z^* SD(\bar{y}) = \bar{y} \pm z^* \frac{\sigma}{\sqrt{n}}$, Margin of Error, $ME = z^* \frac{\sigma}{\sqrt{n}}$.

$z^*$ is the critical value based on the confidence level, n is the sample size.
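
A minimal sketch of this interval, assuming a sample mean of 50, a known population standard deviation of 10, and n = 25 (all made-up values). The margin of error computed here is the quantity Excel's CONFIDENCE.NORM is described as returning in the list of Excel functions.

```python
from math import sqrt
from statistics import NormalDist


def mean_ci_known_sigma(y_bar, sigma, n, confidence=0.95):
    """Return (lower, upper, margin of error) for a z-interval for the mean."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    me = z_star * sigma / sqrt(n)     # ME = z* * sigma / sqrt(n)
    return y_bar - me, y_bar + me, me


low, high, me = mean_ci_known_sigma(50, 10, 25)   # made-up values
print(low, high, me)
```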

List of Excel Functions

AVERAGE(range of data in Excel) gives the mean of the data collection

MEDIAN(range of data in Excel) gives the median of the data collection

MODE(range of data in Excel) gives the mode of the data collection


STDEV.P(range of data in Excel) gives the population standard deviation

STDEV.S(range of data in Excel) gives the sample standard deviation

VAR.P(range of data in Excel) gives the population variance

VAR.S(range of data in Excel) gives the sample variance

EXP(x) gives the value of $e^x$

FACT(k) gives the value of k!


COMBIN(n,k) gives the value of the combination ${}_nC_k$ or $\binom{n}{k}$

BINOM.DIST(k,n,p,0) gives the value of $P(X = k) = \binom{n}{k} p^k q^{n-k} = {}_nC_k \, p^k q^{n-k}$, $q = 1 - p$

BINOM.DIST(k,n,p,1) gives the value of $P(X \le k) = \sum_{i=0}^{k} P(X = i) = \sum_{i=0}^{k} {}_nC_i \, p^i q^{n-i}$

EXPON.DIST(t,λ,1) gives the value of cumulative probability $P(X \le t) = 1 - e^{-\lambda t}$

POISSON.DIST(k,λ,0) gives the value of probability $P(X = k) = \frac{e^{-\lambda} \lambda^k}{k!}$

POISSON.DIST(k,λ,1) gives the value of cumulative probability $P(X \le k) = \sum_{i=0}^{k} \frac{e^{-\lambda} \lambda^i}{i!}$

NORM.S.DIST(z,1) gives the value of cumulative probability $P(X < z)$

if X follows the standard normal distribution.

NORM.DIST(z,µ,σ,1) gives the value of cumulative probability $P(X < z)$

if X follows a normal distribution with mean µ and standard deviation σ.

NORM.S.INV(p) gives the standard value of z

if X follows the standard normal distribution and $P(X < z) = p$.

NORM.INV(p,µ,σ) gives the cut-off value of z

if X follows a normal distribution with mean µ and standard deviation σ and $P(X < z) = p$.

CONFIDENCE.NORM(α,σ,n) gives the margin of error of a confidence interval,

where 1 - α is the confidence level, σ is the population standard deviation and n is the sample size.
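
For cross-checking spreadsheet results outside Excel, the sketch below pairs several of the functions listed above with what should be their counterparts in Python's standard `statistics` module. The data set and the normal parameters (mean 100, standard deviation 15) are made up for illustration.

```python
import statistics
from statistics import NormalDist

data = [2, 4, 4, 4, 5, 5, 7, 9]          # made-up data collection

mean = statistics.mean(data)             # AVERAGE
median = statistics.median(data)         # MEDIAN
mode = statistics.mode(data)             # MODE
pop_sd = statistics.pstdev(data)         # STDEV.P
sample_sd = statistics.stdev(data)       # STDEV.S
pop_var = statistics.pvariance(data)     # VAR.P
sample_var = statistics.variance(data)   # VAR.S

std_normal = NormalDist()                # standard normal: mean 0, SD 1
cum_z = std_normal.cdf(1.0)              # NORM.S.DIST(1, 1)
z_cut = std_normal.inv_cdf(0.975)        # NORM.S.INV(0.975)

scores = NormalDist(mu=100, sigma=15)    # assumed normal model
cum_x = scores.cdf(115)                  # NORM.DIST(115, 100, 15, 1)
x_cut = scores.inv_cdf(0.975)            # NORM.INV(0.975, 100, 15)

print(mean, median, mode, pop_sd, sample_sd)
print(cum_z, z_cut, cum_x, x_cut)
```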
