Formula Sheet
Sample Mean, x̄: Average of the n observations: x̄ = (x_1 + x_2 + ⋯ + x_n)/n
02 - Probability
Sample Space, S: The set of all possible outcomes; it consists of elements and can be described by rules or statements.
Permutations of n distinct objects, taking r at a time: nPr = n!/(n − r)!
Distinct permutations of n things where n_1 are one kind and n_2 are another kind: n!/(n_1! n_2!)
Partitioning: Ways to partition/divide n objects into r cells with n_1 elements in the first, n_2 in the second, etc.: (n choose n_1, n_2, …, n_r) = n!/(n_1! n_2! ⋯ n_r!)
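The three counting rules above can be sanity-checked with Python's math module; the specific counts below (5 objects taken 2 at a time, the word AAABB, a 3/2/2 partition of 7 objects) are illustrative choices, not values from the sheet.

```python
from math import factorial, perm

# nPr = n!/(n-r)!: permutations of 5 distinct objects taken 2 at a time
n_p_r = perm(5, 2)
assert n_p_r == factorial(5) // factorial(5 - 2)   # 20

# n!/(n1! n2!): distinct arrangements of "AAABB" (n=5, n1=3 A's, n2=2 B's)
arrangements = factorial(5) // (factorial(3) * factorial(2))   # 10

# n!/(n1! n2! ... nr!): partitioning 7 objects into cells of sizes 3, 2, 2
partitions = factorial(7) // (factorial(3) * factorial(2) * factorial(2))   # 210

print(n_p_r, arrangements, partitions)
```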
Additive Rules (Probability): If A and B are two events, not necessarily mutually exclusive: P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
Independence Two events A and B are independent if and only if 𝑷(𝑨 ∩ 𝑩) = 𝑷(𝑨)𝑷(𝑩)
Probability after a value: P(X ≥ a) = ∫_a^∞ f(x) dx
Joint probability distribution (discrete): P(X = x, Y = y) = f(x, y)
Marginal distribution (continuous): g(x) = ∫_{−∞}^{+∞} f(x, y) dy
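Both integral formulas can be sketched numerically; the exponential density f(x) = e^−x and joint density f(x, y) = e^−(x+y) below are illustrative assumptions, and the improper upper limit is truncated at 50, where the tail is negligible.

```python
from math import exp

def integrate(f, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

# P(X >= a) = integral from a to infinity of f(x) dx, with f(x) = e^-x, a = 2
p_tail = integrate(lambda x: exp(-x), 2.0, 50.0)
print(p_tail)          # close to e^-2 ≈ 0.1353

# Marginal density g(x) = integral over y of f(x, y), with f(x, y) = e^-(x+y)
def g(x):
    return integrate(lambda y: exp(-(x + y)), 0.0, 50.0)

print(g(1.0))          # close to e^-1 ≈ 0.3679
```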
04 – Mathematical Expectations
The mean or expected value of a random variable X (continuous): μ_X = E(X) = ∫_{−∞}^{+∞} x·f(x) dx
The mean or expected value of a random variable g(X):
    Discrete: μ_{g(X)} = E(g(X)) = Σ_x g(x)·f(x)
    Continuous: μ_{g(X)} = E(g(X)) = ∫_{−∞}^{+∞} g(x)·f(x) dx
Variance of the random variable g(X): σ_{g(X)}² = E[(g(X) − μ_{g(X)})²]
    Discrete: = Σ_x (g(x) − μ_{g(X)})²·f(x)
    Continuous: = ∫_{−∞}^{+∞} (g(x) − μ_{g(X)})²·f(x) dx
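The discrete expectation and variance formulas can be worked through on a small case; the pmf (uniform on {1, 2, 3, 4}) and the choice g(X) = X² are illustrative assumptions, not examples from the sheet.

```python
# X uniform on {1, 2, 3, 4} with f(x) = 0.25, and g(X) = X^2 (both illustrative)
f = {1: 0.25, 2: 0.25, 3: 0.25, 4: 0.25}

mu_g = sum(x ** 2 * p for x, p in f.items())                 # E[g(X)] = Σ g(x) f(x)
var_g = sum((x ** 2 - mu_g) ** 2 * p for x, p in f.items())  # Σ (g(x) − μ)² f(x)
print(mu_g, var_g)   # 7.5 and 32.25
```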
05 – Common Discrete Probability Distributions
Discrete Uniform Mean: μ_x = (1/k) Σ_{i=1}^{k} x_i
Discrete Uniform Variance: σ_x² = (1/k) Σ_{i=1}^{k} (x_i − μ_x)²
Mean: μ_x = np, where Σ_i x_i = n and Σ_i p_i = 1
Hypergeometric Variance: σ_x² = ((N − n)/(N − 1)) · n · (k/N) · (1 − k/N)
Geometric Mean: μ_x = 1/p
Geometric Variance: σ_x² = (1 − p)/p²
When using the tables, μ = λt and r is the number of outcomes (x). The table is cumulative, so subtraction is needed for the probability at an exact value of r.
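The cumulative-table subtraction described above can be mimicked directly from the Poisson pmf p(r; μ) = e^−μ μ^r / r!; the values μ = 4 and r = 2 are illustrative.

```python
from math import exp, factorial

def poisson_pmf(r, mu):
    """P(X = r) for a Poisson variable with mean mu = λt."""
    return exp(-mu) * mu ** r / factorial(r)

def poisson_cdf(r, mu):
    """P(X <= r): the cumulative quantity tabulated in Poisson tables."""
    return sum(poisson_pmf(i, mu) for i in range(r + 1))

mu = 4.0                                             # λt, illustrative
p_exact = poisson_cdf(2, mu) - poisson_cdf(1, mu)    # table-style subtraction
assert abs(p_exact - poisson_pmf(2, mu)) < 1e-12     # matches the direct pmf
print(p_exact)   # ≈ 0.1465
```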
06 – Common Continuous Probability Distributions
Continuous Uniform Mean: μ_x = (A + B)/2
Continuous Uniform Variance: σ_x² = (B − A)²/12
Mean: μ = E(X) = ∫_{−∞}^{+∞} x·f(x) dx
Variance: σ² = E[(X − μ)²] = ∫_{−∞}^{+∞} (x − μ)²·f(x) dx
Standard Normal Distribution: To simplify calculations, we have a table for the standard normal curve. To make use of it, we must convert our values to a function of Z.
X to Z formula: Z = (X − μ)/σ
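Python's statistics.NormalDist can stand in for the standard normal table; the values μ = 100, σ = 15, x = 115 below are illustrative.

```python
from statistics import NormalDist

mu, sigma, x = 100.0, 15.0, 115.0
z = (x - mu) / sigma                 # Z = (X − μ)/σ = 1.0

p = NormalDist().cdf(z)              # Φ(z), the standard normal table lookup
print(z, p)                          # 1.0 and ≈ 0.8413

# Same answer without standardizing, as a sanity check:
assert abs(p - NormalDist(mu, sigma).cdf(x)) < 1e-12
```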
Gamma Distribution: The continuous random variable X has a gamma distribution with parameters α and β if its density function is given by:
f(x; α, β) = x^{α−1} e^{−x/β} / (β^α Γ(α)) for x > 0; 0 elsewhere
Γ(n) = (n − 1)!
Mean μ = αβ
Variance σ2 = αβ2
Exponential Distribution: Special case of the gamma distribution when α = 1:
f(x; β) = (1/β) e^{−x/β} for x > 0; 0 elsewhere
Mean μ= β
Variance σ2 = β2
Chi-Squared Distribution: Special case of the gamma distribution where α = v/2 and β = 2, where v is a positive integer parameter called the degrees of freedom (df):
f(x; v) = x^{v/2 − 1} e^{−x/2} / (2^{v/2} Γ(v/2)) for x > 0; 0 elsewhere
Mean μ= 𝑣
Variance σ2 = 2𝑣
Sampling Distribution (SD) of Variances: The sample variance (S²) of a size-n sample from a normal population with variance σ² has distribution χ², which is a chi-squared density function with v = n − 1:
χ² = (n − 1)S²/σ²
When using the tables, 𝒗 is degrees of freedom, 𝛂 is area under the curve,
measured from the right, and the table values are the 𝛘𝟐 scores.
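The χ² statistic for a sample variance is one line of arithmetic; n = 10, s² = 5.2, and the claimed σ² = 4.0 below are illustrative numbers.

```python
# χ² = (n − 1)S²/σ², compared against a χ² table at v = n − 1 df
n, s2, sigma2 = 10, 5.2, 4.0          # illustrative sample summary
chi2 = (n - 1) * s2 / sigma2
print(chi2, n - 1)                    # 11.7 at 9 degrees of freedom
```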
Applications of the Central Limit Theorem: The central limit theorem should only be applied to random samples with a finite mean and variance in the population, and a sample size greater than 30.
Student’s t-Distribution: Used in situations where the population is normal and σ is not known. Follows the same format as the CLT. This also has a df term (df = v = n − 1):
T = (X̄ − μ) / (S/√n)
Alternate formula (V = χ²): T = Z / √(V/(n − 1))
When using tables, 𝒅𝒇 = 𝒗 = 𝒏 − 𝟏, 𝒑 is the area under the curve,
measured from the right, and the table values are the t-scores.
Fisher-Snedecor F-Distribution: Used to compare two variances using the ratio between two chi-squared variables U, V and their dfs:
F = (U/v_1) / (V/v_2)
Hypothesis and Significance To solve any problem, it must be broken down into a few steps.
Tests
1. Create the null (H_0) and alternative (H_1) hypotheses.
2. Determine if a one or two-tailed test is most suitable.
3. Calculate the sample statistic.
4. Reject or fail to reject the null hypothesis.
Significance Test Results There are two types of error in significance tests. Type I error (𝛂) occurs
when 𝑯𝟎 is true and is rejected. Type II error (𝛃) occurs when 𝑯𝟎 is false
and is not rejected.
Hypothesis Testing on the Population Mean (Case 1, 2): Used when σ is known, or σ is unknown with n ≥ 30, and significance level α.
One-Tailed Hypothesis and Rejection Region: H_0: μ = μ_0; H_1: μ > μ_0 or μ < μ_0; reject if Z > z_α or Z < −z_α
Two-Tailed Hypothesis and Rejection Region: H_0: μ = μ_0; H_1: μ ≠ μ_0; reject if |Z| > z_{α/2}
One-Tailed n (Round up): n ≈ (z_α + z_β)² σ² / δ²
Two-Tailed n (Round up): n ≈ (z_{α/2} + z_β)² σ² / δ²
Hypothesis Testing on the Population Mean (Case 3): Used when σ is unknown with n < 30 and significance level α. We use a T-test with df = n − 1.
One-Tailed Hypothesis and Rejection Region: H_0: μ = μ_0; H_1: μ > μ_0 or μ < μ_0; reject if T > t_α or T < −t_α
Two-Tailed Hypothesis and Rejection Region: H_0: μ = μ_0; H_1: μ ≠ μ_0; reject if |T| > t_{α/2}
Test Statistic: T = (X̄ − μ_0) / (s/√n)
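A worked Case 3 sketch with invented data; the critical value 2.365 is assumed here to be the table entry for t_{0.025} at 7 df, so check it against your own table.

```python
from statistics import mean, stdev
from math import sqrt

sample = [10.2, 9.8, 10.5, 10.1, 9.9, 10.4, 10.0, 10.3]   # illustrative data
mu0 = 10.0                                                 # H0: μ = 10.0

n = len(sample)
t_stat = (mean(sample) - mu0) / (stdev(sample) / sqrt(n))  # T = (X̄ − μ0)/(s/√n)

t_crit = 2.365          # assumed t_{α/2} at df = n − 1 = 7, α = 0.05 two-tailed
reject = abs(t_stat) > t_crit
print(t_stat, reject)   # T ≈ 1.732, so we fail to reject H0
```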
Hypothesis Testing on the Difference Between Two Means (Case 1): Used when σ_1, σ_2 are known, with two samples n_1, n_2 chosen independently. Significance level of α.
One-Tailed Hypothesis and Rejection Region: H_0: μ_1 − μ_2 = D_0; H_1: μ_1 − μ_2 > D_0 or μ_1 − μ_2 < D_0; reject if Z > z_α or Z < −z_α
Two-Tailed Hypothesis and Rejection Region: H_0: μ_1 − μ_2 = D_0; H_1: μ_1 − μ_2 ≠ D_0; reject if |Z| > z_{α/2}
Hypothesis Testing on the Difference Between Two Means (Case 2): Used when σ_1, σ_2 are unknown, with two large (n ≥ 30) samples n_1, n_2 chosen independently. Significance level of α.
One-Tailed Hypothesis and Rejection Region: H_0: μ_1 − μ_2 = D_0; H_1: μ_1 − μ_2 > D_0 or μ_1 − μ_2 < D_0; reject if Z > z_α or Z < −z_α
Two-Tailed Hypothesis and Rejection Region: H_0: μ_1 − μ_2 = D_0; H_1: μ_1 − μ_2 ≠ D_0; reject if |Z| > z_{α/2}
Hypothesis Testing on the Difference Between Two Means (Case 3): Used when σ_1, σ_2 are unknown but equal, with two samples n_1, n_2 chosen independently. Significance level of α. T-score with df = n_1 + n_2 − 2.
One-Tailed Hypothesis and Rejection Region: H_0: μ_1 − μ_2 = D_0; H_1: μ_1 − μ_2 > D_0 or μ_1 − μ_2 < D_0; reject if T > t_α or T < −t_α
Two-Tailed Hypothesis and Rejection Region: H_0: μ_1 − μ_2 = D_0; H_1: μ_1 − μ_2 ≠ D_0; reject if |T| > t_{α/2}
Test Statistic: T = ((X̄_1 − X̄_2) − D_0) / (s_p √(1/n_1 + 1/n_2))
Hypothesis Testing on the Difference Between Two Means (Case 4): Used when σ_1, σ_2 are unknown and not assumed equal, with two samples n_1, n_2 chosen independently. Significance level of α. T-score with the df given below.
Two-Tailed Hypothesis and Rejection Region: H_0: μ_1 − μ_2 = D_0; H_1: μ_1 − μ_2 ≠ D_0; reject if |T| > t_{α/2}
df = v = (s_1²/n_1 + s_2²/n_2)² / [ (s_1²/n_1)²/(n_1 − 1) + (s_2²/n_2)²/(n_2 − 1) ]
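This df formula (the Welch-Satterthwaite approximation) is easy to evaluate directly; the sample summaries below are illustrative.

```python
s1_sq, n1 = 4.0, 12        # s1², n1 (illustrative)
s2_sq, n2 = 9.0, 10        # s2², n2 (illustrative)

a = s1_sq / n1             # s1²/n1
b = s2_sq / n2             # s2²/n2
df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
print(df)                  # ≈ 15.2; usually rounded down before using the t table
```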
Hypothesis Testing on the Difference Between Two Means (Case 5): Used with a sample of paired differences of size n with mean d̄. Significance level of α.
One-Tailed Hypothesis and Rejection Region: H_0: μ_1 − μ_2 = D_0; H_1: μ_1 − μ_2 > D_0 or μ_1 − μ_2 < D_0; reject if T > t_α or T < −t_α
Two-Tailed Hypothesis and Rejection Region: H_0: μ_1 − μ_2 = D_0; H_1: μ_1 − μ_2 ≠ D_0; reject if |T| > t_{α/2}
Test Statistic: T = (d̄ − D_0)/(σ_d/√n) ≈ (d̄ − D_0)/(s_d/√n), with df = n − 1
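A paired-differences sketch; the before/after measurements are invented for illustration, with D_0 = 0.

```python
from statistics import mean, stdev
from math import sqrt

before = [12.1, 11.8, 12.5, 12.0, 11.9, 12.3]    # illustrative pairs
after  = [11.6, 11.7, 12.0, 11.5, 11.8, 11.9]
D0 = 0.0

d = [b - a for b, a in zip(before, after)]       # paired differences d_i
n = len(d)
t_stat = (mean(d) - D0) / (stdev(d) / sqrt(n))   # T = (d̄ − D0)/(s_d/√n)
print(t_stat, n - 1)                             # T ≈ 4.34 at df = n − 1 = 5
```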
Hypothesis Testing on the Ratio of Two Population Variances: Used for samples of sizes n_1, n_2 selected independently and randomly. Significance level of α.
One-Tailed Hypothesis and Rejection Region: H_0: σ_1²/σ_2² = 1; H_1: σ_1²/σ_2² > 1 or σ_1²/σ_2² < 1; reject if F > f_α(df_1, df_2)
Two-Tailed Hypothesis and Rejection Region: H_0: σ_1²/σ_2² = 1; H_1: σ_1²/σ_2² ≠ 1; reject if F > f_{α/2}(df_1, df_2)
Test Statistic: F = (larger sample variance) / (smaller sample variance) = s_1²/s_2² or s_2²/s_1²
df_1 = n_1 − 1, df_2 = n_2 − 1
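The larger-over-smaller convention keeps F ≥ 1 so only the right tail of the F table is needed; the sample variances below are illustrative.

```python
s1_sq, n1 = 15.75, 13      # sample variance and size, group 1 (illustrative)
s2_sq, n2 = 10.92, 10      # sample variance and size, group 2 (illustrative)

# Larger sample variance on top, with its df first:
if s1_sq >= s2_sq:
    F, df1, df2 = s1_sq / s2_sq, n1 - 1, n2 - 1
else:
    F, df1, df2 = s2_sq / s1_sq, n2 - 1, n1 - 1
print(F, df1, df2)         # F ≈ 1.44 with (12, 9) df, compared to f_α(df1, df2)
```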
Goodness-of-Fit Test: Tests how good the fit is between the observed frequencies o_i in the k classes of a sample and the expected frequencies e_i. Significance level of α. Note that when an expected frequency is less than 5, collapse that class into an adjacent one.
Rejection Region: χ² > χ²_α
Test Statistic: χ² = Σ_{i=1}^{k} (o_i − e_i)²/e_i, with df = k − 1
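A goodness-of-fit sketch with invented die-roll counts across k = 6 classes; all expected counts are 10, so no collapsing is needed.

```python
observed = [8, 12, 9, 11, 4, 16]      # o_i, 60 illustrative die rolls
expected = [10.0] * 6                 # e_i for a fair die

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1                # k − 1 = 5
print(chi2, df)                       # 8.2 at 5 df, compared against χ²_α
```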
11 - Simple Linear Regression and Correlation
When solving, it is recommended to compute the sums of x_i, y_i, x_i², y_i², and x_i·y_i beforehand to simplify the calculations of S_xx, S_yy, and S_xy.
Coefficient of Correlation: The coefficient of correlation r is the strength of the linear relationship between two variables x, y in the sample: r = S_xy / √(S_xx·S_yy)
Total Corrected Sum of Squares (SST): Total variation of the data relative to the naïve model ŷ = ȳ: S_yy = Σ_{i=1}^{n} (Y_i − ȳ)² = SST
Sum of Squares of Error (SSE): Total variation of the data relative to the least-squares line: SSE = Σ_{i=1}^{n} (Y_i − ŷ_i)² = S_yy − S_xy²/S_xx
Coefficient of Determination: Describes how well the least-squares line fits the data. Proportional to the reduction in variation using the least-squares model vs the naïve one:
r² = (S_yy − SSE)/S_yy = (SST − SSE)/SST = S_xy²/(S_xx·S_yy)
Values range from 0 to 1, where 1 means all points lie on the least-squares line and 0 means the least-squares line offers no information about Y.
Variance Estimation: An unbiased estimate of the common variance of the residuals based on the sample variance is: s² = SSE/(n − 2) = (S_yy − b·S_xy)/(n − 2)
Confidence Interval: The 100(1 − α)% CI for the mean response μ_{Y|x_0} for a given x-value x_0, where ŷ_0 is the least-squares prediction for x_0:
ŷ_0 − t_{α/2}·s·√(1/n + (x_0 − x̄)²/S_xx) < μ_{Y|x_0} < ŷ_0 + t_{α/2}·s·√(1/n + (x_0 − x̄)²/S_xx)
df = n − 2
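The regression quantities in this chapter can be chained together on a small dataset; every numeric value below is invented for illustration.

```python
from math import sqrt

x = [1.0, 2.0, 3.0, 4.0, 5.0]                 # illustrative data
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)

# Sums first, as the sheet recommends, then S_xx, S_yy, S_xy:
sxx = sum(v * v for v in x) - sum(x) ** 2 / n
syy = sum(v * v for v in y) - sum(y) ** 2 / n            # = SST
sxy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n

b = sxy / sxx                    # least-squares slope
r = sxy / sqrt(sxx * syy)        # coefficient of correlation
sse = syy - sxy ** 2 / sxx       # error sum of squares
s2 = sse / (n - 2)               # unbiased residual variance, df = n − 2
print(b, r, r * r, s2)
```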