Statistics 1 Formulae
Measures of Central Tendency
Mean
Continuous data
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
Discrete data
$$\bar{x} = \frac{\sum_{i=1}^{m} f_i x_i}{\sum_{i=1}^{m} f_i}$$
Adding a constant
$$\bar{y} = \bar{x} + c$$
where $\bar{x}$ is the old mean, $\bar{y}$ is the new mean, and $c$ is the constant added to each $x_i$ (equivalently, to the overall mean $\bar{x}$).
Multiplying a constant
$$\bar{y} = \bar{x} \cdot c$$
where $\bar{x}$ is the old mean, $\bar{y}$ is the new mean, and $c$ is the constant each $x_i$ (and hence the overall mean $\bar{x}$) is multiplied by.
Median
Important
Odd number of observations $n$
$$\text{median} = x_{\frac{n+1}{2}}$$
where $n$ is the number of observations and $x_{\frac{n+1}{2}}$ is the middle value of the sorted data.
Even number of observations $n$
$$\text{median} = \frac{x_{\frac{n}{2}} + x_{\frac{n}{2}+1}}{2}$$
where $n$ is the number of observations; the median is the average of the two middle values of the sorted data.
Adding a constant
$$y_i = x_i + c$$
Adding a constant $c$ to each $x_i$ adds $c$ to the median: the new median is the old median plus $c$.
Multiplying a constant
$$y_i = x_i \cdot c$$
Multiplying each $x_i$ by a constant $c$ multiplies the median by $c$: the new median is the old median times $c$.
Mode
Adding a constant
$$\text{mode}_{\text{new}} = \text{mode} + c$$
Multiplying a constant
$$\text{mode}_{\text{new}} = \text{mode} \cdot c$$
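A minimal Python sketch of these shift-and-scale rules, using the standard library's statistics module on an invented sample:

```python
# Adding a constant shifts mean, median, and mode by c;
# multiplying by a constant scales all three by c. Data is made up.
from statistics import mean, median, mode

data = [2, 3, 3, 5, 7]
c = 10

shifted = [x + c for x in data]   # add c to every observation
scaled = [x * c for x in data]    # multiply every observation by c

print(mean(shifted), mean(data) + c)      # 14.0 14.0
print(median(shifted), median(data) + c)  # 13 13
print(mode(scaled), mode(data) * c)       # 30 30
```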
Measures of Dispersion
It tells how much the data varies and how spread out it is.
Range
Difference between the maximum and minimum values.
Variance
Population variance
$$\sigma^2 = \frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n}$$
Sample variance
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$
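A short sketch contrasting the two divisors above, with made-up data, checked against Python's statistics module:

```python
# Population variance divides by n; sample variance divides by n - 1.
from math import isclose
from statistics import pvariance, variance

data = [4, 8, 6, 5, 3]
n = len(data)
mu = sum(data) / n

pop_var = sum((x - mu) ** 2 for x in data) / n         # population: / n
samp_var = sum((x - mu) ** 2 for x in data) / (n - 1)  # sample: / (n - 1)

assert isclose(pop_var, pvariance(data))   # 2.96
assert isclose(samp_var, variance(data))   # 3.7
print(pop_var, samp_var)
```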
Adding a constant
Adding a constant to every observation does not change the variance:
$$\mathrm{Var}(X + c) = \mathrm{Var}(X)$$
Multiplying a constant
$$\sigma^2_{\text{new}} = \sigma^2 \cdot c^2$$
Standard Deviation
Note
$$\sigma = \sqrt{\sigma^2}$$
$$\sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n}}$$
where $x_i$ is each observation, $n$ is the total number of observations, and $\mu$ is the mean of all observations.
Adding a constant
As with the variance, adding a constant does not change the standard deviation: the constant is added to every term and to the mean, so it cancels out.
Multiplying a constant
$$\sigma_{\text{new}} = \sqrt{c^2 \sigma^2} = |c| \cdot \sigma$$
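A sketch of both rules, $SD(X + c) = SD(X)$ and $SD(cX) = |c| \cdot SD(X)$, on invented data:

```python
# Shifting leaves the standard deviation unchanged; scaling multiplies it by |c|.
from math import isclose
from statistics import pstdev

data = [1, 2, 4, 7]
c = -3

assert isclose(pstdev([x + c for x in data]), pstdev(data))           # unchanged
assert isclose(pstdev([x * c for x in data]), abs(c) * pstdev(data))  # scaled by |c|
print(pstdev(data), pstdev([x * c for x in data]))
```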
Percentiles
Even frequency
$$\text{percentile} = \frac{x_{\frac{pn}{100}} + x_{\frac{pn}{100}+1}}{2}$$
Odd frequency
$$\text{position} = \frac{p}{100} \cdot n$$
which should be rounded up to the next integer.
Quartile
1. Q1 - Lower (first) = 25th percentile
2. Q2 - Middle (median) = 50th percentile
3. Q3 - Upper (third) = 75th percentile
Quartile range
1. Min - Q1
2. Q1 - Q2
3. Q2 - Q3
4. Q3 - Max
Detecting outliers
$$\text{Lower} = Q_1 - 1.5 \cdot IQR$$
$$\text{Upper} = Q_3 + 1.5 \cdot IQR$$
where $IQR = Q_3 - Q_1$ is the interquartile range; values outside these fences are treated as outliers.
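A sketch of the fences using numpy (its default percentile interpolation may differ slightly from the convention in these notes); the data and the planted outlier are invented:

```python
# Flag outliers with the 1.5 * IQR fences; 40 is a planted outlier.
import numpy as np

data = np.array([3, 5, 7, 8, 9, 11, 13, 14, 15, 40])
q1, q3 = np.percentile(data, [25, 75])
iqr = q3 - q1                  # interquartile range Q3 - Q1

lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr
print(data[(data < lower) | (data > upper)])   # [40]
```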
Covariance
It quantifies the strength of the linear association between two numerical variables, showing how the two variables vary together.
Important
−∞ ≤ Cov(x, y) ≤ ∞
Population covariance
$$\mathrm{Cov}(x, y) = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{N}$$
Sample covariance
$$\mathrm{Cov}(x, y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$
Correlation
−1 ≤ r ≤ 1
Pearson correlation
Sample correlation
$$r = \frac{\mathrm{cov}(x, y)}{s_x s_y}$$
$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2} \cdot \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}$$
Population correlation
$$\rho = \frac{\mathrm{cov}(x, y)}{\sigma_x \sigma_y}$$
$$\rho = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{N} (x_i - \bar{x})^2} \cdot \sqrt{\sum_{i=1}^{N} (y_i - \bar{y})^2}}$$
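A sketch computing sample covariance and Pearson's $r$ directly from the formulas above; the x and y values are invented:

```python
# Sample covariance and Pearson correlation from first principles.
import math

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

cov = sum((a - xbar) * (b - ybar) for a, b in zip(x, y)) / (n - 1)
sx = math.sqrt(sum((a - xbar) ** 2 for a in x) / (n - 1))
sy = math.sqrt(sum((b - ybar) ** 2 for b in y) / (n - 1))

r = cov / (sx * sy)
print(cov, r)   # 1.5, ~0.775 — r always lies in [-1, 1]
```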
Scatter plot
In a scatter plot we usually assess four things: direction, form, strength, and outliers.
Fitting a line
A linear regression line has the equation:
$$y = mx + c$$
where $m$ is the slope and $c$ is the y-intercept.
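A small sketch fitting such a line by least squares with numpy's polyfit (the data points are invented):

```python
# Fit y = m*x + c by least squares.
import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

m, c = np.polyfit(x, y, deg=1)   # slope and intercept
print(m, c)                      # m ~ 2.0, c ~ 0.0
```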
R²
Permutation Formula
$$^{n}P_r = \frac{n!}{(n - r)!}$$
Permutations of $n$ objects where $p_1, p_2, \dots, p_k$ are the counts of each repeated kind:
$$\frac{n!}{p_1! \cdot p_2! \cdots p_k!}$$
Circular permutations of $n$ objects:
$$(n - 1)!$$
Combination Formula
In general, each combination of r objects from n objects can give rise to r!
arrangements.
The number of possible combinations of r objects from a collection of n objects is denoted by:
$$^{n}C_r = \frac{n!}{r!(n - r)!}$$
Also
$$^{n}C_r = \frac{n!}{r!(n - r)!} = \frac{n!}{(n - r)!\,r!} = {}^{n}C_{n-r}$$
$^{n}C_n = 1$ and $^{n}C_0 = 1$ for all values of $n$.
$$^{n+1}C_r = {}^{n}C_{r-1} + {}^{n}C_r \,; \quad 1 \le r \le n$$
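These counting formulas map directly onto Python's math.perm and math.comb; a quick sketch verifying the identities above:

```python
import math

n, r = 7, 3
print(math.perm(n, r))   # nPr = n!/(n-r)! = 210
print(math.comb(n, r))   # nCr = n!/(r!(n-r)!) = 35

# Symmetry and Pascal's rule from above
assert math.comb(n, r) == math.comb(n, n - r)
assert math.comb(n + 1, r) == math.comb(n, r - 1) + math.comb(n, r)
```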
Probability
Conditional Probability
The probability of an event A given that another event B has already occurred
is called conditional probability.
It is denoted by P (A|B).
$$P(A|B) = \frac{P(A \cap B)}{P(B)}$$
Multiplication Rule
$$P(A \cap B) = P(A) \cdot P(B|A)$$
$$P(C|A \cap B) = \frac{P(A \cap B \cap C)}{P(A \cap B)}$$
Independent Events
In case of independent events, the probability of two events occurring
together is given by:
P (A|B) = P (A)
P (A ∩ B) = P (A) ⋅ P (B)
Law of Total Probability
$$P(E) = P(E|F) \cdot P(F) + P(E|F^c) \cdot P(F^c)$$
Bayes' Theorem
$$P(B|A) = \frac{P(B \cap A)}{P(A)}$$
or
$$P(B|A) = \frac{P(A|B) \cdot P(B)}{P(B) \cdot P(A|B) + P(B^c) \cdot P(A|B^c)}$$
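A numeric sketch of Bayes' theorem; the prevalence and test accuracies below are invented for illustration:

```python
# B = has the condition, A = test is positive (hypothetical numbers).
p_b = 0.01           # P(B): prevalence
p_a_given_b = 0.99   # P(A|B): true positive rate
p_a_given_bc = 0.05  # P(A|B^c): false positive rate

# Denominator: total probability of a positive test
p_a = p_b * p_a_given_b + (1 - p_b) * p_a_given_bc
print(p_a_given_b * p_b / p_a)   # P(B|A) ~ 0.167
```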
Random Experiment
A random experiment is an experiment whose outcome is not known in
advance. It is an experiment whose outcome is determined by chance.
A random variable assigns a numerical value to each outcome of a random experiment.
Sample Space: The set of all possible outcomes of a random experiment is
called the sample space of the experiment.
Sample space is denoted by S .
Suppose a random experiment of throwing a die is performed. The
sample space of the experiment is given by:
S = {1, 2, 3, 4, 5, 6}
There also exists a random variable that can take on an uncountably infinite
number of values. Such a random variable is called a Continuous Random
Variable.
For a discrete random variable, the probability mass function $p$ must satisfy:
$$\sum_{i=1}^{\infty} p(x_i) = 1$$
The graph of the probability mass function can take many shapes: it can be skewed positively or negatively, symmetric or asymmetric, unimodal or multimodal, and uniform or non-uniform.
F (x) = P (X ≤ x)
The cumulative distribution function F (x) is a non-decreasing function of x.
Suppose we want to find the expectation of a random variable $X$ that takes on values $x_1, x_2, \dots, x_n$ with probabilities $p_1, p_2, \dots, p_n$ respectively:
$$E(X) = \sum_{i=1}^{n} x_i \cdot p_i$$
$$E(g(X)) = \sum_i g(x_i) \cdot P(X = x_i)$$
$$E(c \cdot g(X)) = c \cdot E(g(X))$$
$$E(X + c) = E(X) + c$$
The expected value of the sum of random variables is equal to the sum of the
individual expected values.
Example
Let $X$ take the values $-1, 0, 1$ and let $Y = g(X) = X^2$. What is $E(Y)$? With $y_i = g(x_i)$,
$$E(Y) = \sum_{i=1}^{n} y_i \cdot p_i$$
The same formula applies when $X$ takes the values $-2, 2, 5$.
Variance of a Random Variable
$$\mathrm{Var}(X) = E[(X - \mu)^2]$$
or
$$\mathrm{Var}(X) = E(X^2) - \mu^2$$
equivalently,
$$E(X^2) = \mathrm{Var}(X) + \mu^2$$
In other words, the variance of a random variable $X$ measures the average squared deviation of $X$ from its mean $\mu$.
Example
For a fair six-sided die:

| $X$ | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| $P(X)$ | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 |
| $x \cdot P(x)$ | 1/6 | 2/6 | 3/6 | 4/6 | 5/6 | 6/6 |
| $x^2 \cdot P(x)$ | 1/6 | 4/6 | 9/6 | 16/6 | 25/6 | 36/6 |

$$E(X^2) = \frac{1}{6} + \frac{4}{6} + \frac{9}{6} + \frac{16}{6} + \frac{25}{6} + \frac{36}{6} = \frac{91}{6} \approx 15.17$$
$$\mathrm{Var}(X) = E(X^2) - \mu^2 = 15.17 - 3.5^2 = 15.17 - 12.25 = 2.92$$
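A sketch reproducing the die calculation above:

```python
# E(X), E(X^2) and Var(X) for a fair six-sided die.
values = range(1, 7)
p = 1 / 6

mu = sum(x * p for x in values)        # 3.5
ex2 = sum(x * x * p for x in values)   # 91/6 ~ 15.17
print(mu, ex2, ex2 - mu ** 2)          # variance ~ 2.92
```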
For a Bernoulli random variable:
$$p(x) = \begin{cases} p & \text{if } x = 1 \\ 1 - p & \text{if } x = 0 \end{cases}$$
$$\mathrm{Var}(X) = p(1 - p)$$
For a Discrete Uniform Random Variable taking the values $1, 2, \dots, n$:
$$\mathrm{Var}(X) = \frac{n^2 - 1}{12}$$
Variance is additive:
$$V\left(\sum_{i=1}^{k} X_i\right) = \sum_{i=1}^{k} V(X_i)$$
This is applicable only when the random variables are independent.
$$SD(X) = \sqrt{\mathrm{Var}(X)}$$
Multiplying a constant
$$SD(cX) = |c| \cdot SD(X)$$
Adding a constant
$$SD(X + c) = SD(X)$$
Corollary
$$SD(aX + b) = \sqrt{a^2\,\mathrm{Var}(X)} = |a| \cdot SD(X)$$
Bernoulli Distribution
| $X$ | 0 | 1 |
|---|---|---|
| $P(X)$ | $1 - p$ | $p$ |

$$E(X) = 0 \cdot (1 - p) + 1 \cdot p = p$$
The largest variance, $p(1 - p)$, occurs when $p = 0.5$. It happens when success and failure are equally likely; in other words, the most uncertain outcome is when the probabilities of success and failure are equal.
Binomial Distribution
For example, tossing a coin three times gives the sample space:
$$S = \{HHH, HHT, HTH, HTT, THH, THT, TTH, TTT\}$$
Let $X$ be the number of successes in $n$ independent trials, each with success probability $p$. Then
$$SD(X) = \sqrt{np(1 - p)}$$
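A sketch of these moments; the binomial mean $E(X) = np$ is a standard fact, added here for completeness, and $n = 3$ mirrors the three-toss sample space above:

```python
import math

p = 0.5
print(p * (1 - p))   # Bernoulli variance, largest at p = 0.5

n = 3                              # three coin tosses
print(n * p)                       # binomial mean np (standard fact)
print(math.sqrt(n * p * (1 - p)))  # SD(X) = sqrt(np(1-p))
```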
Hypergeometric Distribution
Let X be the number of items of type 1. Then the probability mass function of the discrete random variable, X, is called the hypergeometric distribution and is of the form:
$$P(X = i) = \frac{^{m}C_i \cdot {}^{N-m}C_{n-i}}{^{N}C_n}$$
Example: Choosing balls without replacement
A bag consists of 7 balls of which 4 are white and 3 are black. A student
randomly samples two balls without replacement. Let X be the number of
black balls selected.
Here, N = 7, m = 3, n = 2
$$P(X = 1) = \frac{^{3}C_1 \cdot {}^{7-3}C_{2-1}}{^{7}C_2} = \frac{12}{21}$$
$$P(X = 2) = \frac{^{3}C_2 \cdot {}^{7-3}C_{2-2}}{^{7}C_2} = \frac{3}{21}$$
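A sketch verifying the ball-drawing example with math.comb:

```python
from math import comb

N, m, n = 7, 3, 2   # 7 balls, 3 black, draw 2 without replacement
for i in (1, 2):
    prob = comb(m, i) * comb(N - m, n - i) / comb(N, n)
    print(i, prob)   # P(X=1) = 12/21 ~ 0.571, P(X=2) = 3/21 ~ 0.143
```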
Poisson Distribution
The Poisson probability distribution gives the probability of a number of events
occurring in a fixed interval of time or space.
We assume that these events happen with a known average rate, λ, and
independently of the time since the last event.
Let X be the number of events in a given interval. Then the probability mass
function of the discrete random variable, X, is called the Poisson distribution
and is of the form:
$$P(X = x) = \frac{e^{-\lambda} \cdot \lambda^x}{x!}$$
If the value of λ is very small, the graph is skewed to the right.
As the value of λ increases, the graph becomes more symmetric.
The Poisson distribution arises as the limit of the binomial distribution with $p = \lambda/n$:
$$\mathrm{Bin}\left(n, p = \frac{\lambda}{n}\right): \quad P(X = x) = {}^{n}C_x \cdot \left(\frac{\lambda}{n}\right)^{x} \cdot \left(\frac{n - \lambda}{n}\right)^{n-x}$$
E(X) = λ
V (X) = λ
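A sketch of the Poisson pmf and its binomial limit; λ, x, and n are arbitrary illustrative values:

```python
import math

lam, x = 2.0, 3
poisson = math.exp(-lam) * lam ** x / math.factorial(x)

# Binomial with p = lam/n approaches the Poisson pmf as n grows
n = 10_000
p = lam / n
binom = math.comb(n, x) * p ** x * (1 - p) ** (n - x)
print(poisson, binom)   # ~0.1804 for both
```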
For a continuous random variable with probability density function $f(x)$:
$$\int_{-\infty}^{\infty} f(x)\,dx = 1$$
$$0 \le \int_{a}^{b} f(x)\,dx \le 1$$
$$E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx$$
$$F(x) = \int_{-\infty}^{x} f(t)\,dt$$
Uniform Distribution
A random variable has the standard uniform distribution with minimum 0
and maximum 1 if its probability density function is given by
$$f(x) = \begin{cases} 1 & \text{if } 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases}$$
$$\int_{-\infty}^{\infty} f(x)\,dx = \int_{0}^{1} f(x)\,dx = 1$$
Exponential Distribution
A continuous random variable whose probability density function is given, for
some $\lambda > 0$, by
$$f(x) = \begin{cases} \lambda e^{-\lambda x} & x \ge 0 \\ 0 & \text{otherwise} \end{cases}$$
is called an exponential random variable.
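A numeric sanity check of the exponential pdf with a simple Riemann sum; λ = 1.5 is arbitrary, and that the mean equals 1/λ is a standard fact not stated above:

```python
import math

lam = 1.5
dx = 1e-3
xs = [i * dx for i in range(int(20 / dx))]   # truncate the tail at x = 20

total = sum(lam * math.exp(-lam * x) * dx for x in xs)
mean = sum(x * lam * math.exp(-lam * x) * dx for x in xs)
print(total, mean)   # ~1.0 and ~1/1.5 ~ 0.667
```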
Contributions:
Week 5-12 by Kabir Maniar