Probability_piyushwairale
Instructions:
• Kindly go through the lectures/videos on our website www.piyushwairale.com
• Read this study material carefully and make your own handwritten short notes. (Short notes must not be more than 5-6 pages.)
• Attempt the mock tests available on the portal.
• Revise this material at least 5 times, and once you have prepared your short notes, revise your short notes twice a week.
• If you are not able to understand any topic, require a detailed explanation, or find any typos or mistakes in the study material, mail me at [email protected]
Contents
1 Fundamental Principles of Counting
1.1 Addition Principle
1.2 Multiplication Principle
2 Permutations
3 Combinations
4 Introduction to Probability
4.1 Definition of Probability
4.2 Theorems of Probability
8 Discrete Distributions
8.1 Discrete Uniform Distribution
8.2 Binomial Distribution
8.3 Poisson Distribution
9 Continuous Distributions
9.1 Uniform/Rectangular Distribution
9.2 Normal Distribution
9.3 Standard Normal Distribution
9.4 Exponential Distribution
12 t-distribution
12.1 Key Characteristics
12.1.1 Comparing the t-distribution and Normal Distribution
12.2 Formula for the t-distribution
12.3 Mean, Median, Mode and SD
13 Chi-Square Distribution
13.1 Key Characteristics of the Chi-Square Distribution
13.2 Key Formulas for the Chi-Square Tests
14 Hypothesis Testing
14.1 Key Terms
14.2 Steps in Hypothesis Testing
14.3 Significance Level and Confidence Level in Terms of t-Value
15 t-test
16 z-test
17 chi-test
2 Permutations
A permutation is an arrangement of objects in a specific order; the order of arrangement matters.
Note: In a permutation, the order of arrangement of the objects is important; thus abc is a different permutation from bca. In a combination, the order in which objects are selected does not matter; thus abc and bca are the same combination.
The number of arrangements of n distinct objects is
n! = n × (n − 1) × · · · × 2 × 1
and the number of permutations of n distinct objects taken r at a time is
nPr = n!/(n − r)!
Circular Permutations
• Number of circular permutations of n distinct objects: (n − 1)!
Derangements
• The number of derangements !n of n objects (arrangements in which no object remains in its original position) is approximately
!n ≈ n!/e
where e is the base of the natural logarithm (e ≈ 2.71828).
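These counts are easy to verify numerically (a minimal sketch using only the Python standard library; the value n = 5 is an arbitrary illustration):

```python
import math

n = 5

# Linear permutations of n distinct objects: n!
print(math.factorial(n))                   # 120

# Permutations of n objects taken r at a time: nPr = n!/(n-r)!
print(math.perm(n, 3))                     # 60  (Python 3.8+)

# Circular permutations of n distinct objects: (n-1)!
print(math.factorial(n - 1))               # 24

# Derangements: !n is the nearest integer to n!/e
print(round(math.factorial(n) / math.e))   # 44
```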
3 Combinations
A combination is a selection of objects where the order does not matter. The number of combinations of n distinct objects taken r at a time is nCr = n!/(r!(n − r)!).
Properties of Combinations
• Symmetry Property:
nCr = nC(n − r)
• Addition Formula:
nCr + nC(r − 1) = (n + 1)Cr
Binomial Theorem
• Expansion of (a + b)^n:
(a + b)^n = Σ_{r=0}^{n} nCr a^{n−r} b^r
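A small sanity check of the symmetry property, Pascal's rule, and the binomial expansion (a minimal sketch; the values of n, r, a, b are chosen arbitrarily):

```python
import math

n, r = 7, 3

# Symmetry: nCr == nC(n-r)
print(math.comb(n, r) == math.comb(n, n - r))                          # True

# Pascal's rule: nCr + nC(r-1) == (n+1)Cr
print(math.comb(n, r) + math.comb(n, r - 1) == math.comb(n + 1, r))   # True

# Binomial theorem: (a+b)^n equals the sum of nCr * a^(n-r) * b^r
a, b = 2, 3
lhs = (a + b) ** n
rhs = sum(math.comb(n, k) * a ** (n - k) * b ** k for k in range(n + 1))
print(lhs == rhs)                                                      # True
```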
4 Introduction to Probability
Sample Space
The set of all possible outcomes of a random experiment is called the sample space. All the elements of the sample space together are called exhaustive cases. The number of elements of the sample space, i.e., the number of exhaustive cases, is denoted by n(S), N, or n.
Event
Any subset of the sample space is called an event and is denoted by some capital letter like A, B, C, or A1, A2, A3, ..., or B1, B2, ... etc.
Favourable Cases
The cases which ensure the happening of an event A are called the cases favourable to the event A. The number of cases favourable to event A is denoted by n(A), NA, or nA. The classical definition of probability is then
P(A) = n(A)/n(S)
Complement of an Event
The complement of an event A is denoted by A′ and it contains all the elements of the sample space which do not
belong to A.
For example, consider the random experiment of rolling an unbiased die:
S = {1, 2, 3, 4, 5, 6}
If A = {2, 4, 6} (an even number appears), then A′ = {1, 3, 5} and P(A′) = 1 − P(A).
Note: If A and B are independent, then
P(A ∩ B) = P(A) · P(B)
Addition theorem: for any two events A and B,
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
and if A and B are mutually exclusive, P(A ∩ B) = 0, so
P(A ∪ B) = P(A) + P(B)
Note (De Morgan): P(A′ ∩ B′) = 1 − P(A ∪ B) and P(A′ ∪ B′) = 1 − P(A ∩ B)
Properties
1. Multiplication Rule:
P (A ∩ B) = P (B) × P (A|B) = P (A) × P (B|A).
2. If A and B are independent:
P (A|B) = P (A).
3. Total Probability:
Σ_i P(A_i |B) = 1,
where {Ai } are mutually exclusive and exhaustive events.
Example
Suppose we have a standard deck of 52 playing cards. What is the probability that a card drawn at random is a
King given that it is a face card?
Solution:
• Total face cards: Jack, Queen, King in each suit ⇒ 12 cards.
• Number of Kings: 4.
• Conditional probability:
P(King|Face card) = P(King and Face card) / P(Face card) = (4/52) / (12/52) = 4/12 = 1/3.
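The same answer can be checked by brute-force counting over a deck (a minimal sketch; the rank/suit encoding is illustrative, not from the original notes):

```python
# Build a 52-card deck as (rank, suit) pairs.
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["spades", "hearts", "diamonds", "clubs"]
deck = [(r, s) for r in ranks for s in suits]

face = [c for c in deck if c[0] in ("J", "Q", "K")]   # 12 face cards
kings = [c for c in face if c[0] == "K"]              # 4 kings

# P(King | Face) = |King and Face| / |Face|
print(len(kings) / len(face))                         # 0.3333...
```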
The law of total probability allows us to compute the probability of an event A by considering all the different
ways A can occur through the events Bi .
Example
A factory has three machines M1 , M2 , and M3 producing 30%, 45%, and 25% of the total products, respectively.
The defect rates are 1%, 2%, and 3%, respectively. What is the probability that a randomly selected product is
defective?
Solution:
• Let D be the event that a product is defective.
• Using the law of total probability:
P(D) = P(M1)P(D|M1) + P(M2)P(D|M2) + P(M3)P(D|M3)
P(D) = 0.30 × 0.01 + 0.45 × 0.02 + 0.25 × 0.03 = 0.003 + 0.009 + 0.0075 = 0.0195.
5.1 Bayes’ Theorem
Bayes’ theorem relates the conditional and marginal probabilities of random events. For events A and B with
P (B) > 0:
P(A|B) = P(B|A) P(A) / P(B).
For a partition {B1, . . . , Bn} of the sample space:
P(Bk|A) = P(A|Bk) P(Bk) / Σ_{i=1}^{n} P(A|Bi) P(Bi).
Bayes’ theorem allows us to update the probability of an event based on new evidence or information.
Example
Using the previous factory example, if a randomly selected product is found to be defective, what is the probability
it was produced by machine M2 ?
Solution: By Bayes' theorem,
P(M2|D) = P(M2) P(D|M2) / P(D) = (0.45 × 0.02) / 0.0195 = 0.009 / 0.0195 ≈ 0.4615.
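A short numerical check of the total-probability and Bayes computations above (a minimal sketch; the machine labels are illustrative):

```python
# Production shares and defect rates for machines M1, M2, M3.
share = {"M1": 0.30, "M2": 0.45, "M3": 0.25}
defect_rate = {"M1": 0.01, "M2": 0.02, "M3": 0.03}

# Law of total probability: P(D) = sum of P(Mi) * P(D | Mi).
p_defective = sum(share[m] * defect_rate[m] for m in share)
print(p_defective)                           # 0.0195

# Bayes' theorem: P(M2 | D) = P(M2) * P(D | M2) / P(D).
posterior_m2 = share["M2"] * defect_rate["M2"] / p_defective
print(round(posterior_m2, 4))                # 0.4615
```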
Independent Events
Events A and B are independent if:
P (A ∩ B) = P (A) × P (B).
Conditional Independence
Events A and B are conditionally independent given event C if:
P(A ∩ B | C) = P(A|C) × P(B|C).
Bayes’ Theorem
Bayes’ theorem for continuous variables:
f_{X|Y}(x|y) = f_{Y|X}(y|x) f_X(x) / f_Y(y).
6 Discrete and Continuous Random Variables
A discrete random variable takes values that are finite or countable. For example, in the experiment of tossing 3 coins, the number of heads is a discrete random variable X; X can take 0, 1, 2, 3 as its possible values.
A continuous random variable takes values in the form of intervals. Also, in the case of a continuous random
variable P (X = c) = 0, where c is a specified point. Heights and weights of people, area of land held by individuals,
etc., are examples of continuous random variables.
The probability mass function of a discrete random variable X is
p(x_r) = P(X = x_r)
The values that X can take and the corresponding probabilities determine the probability distribution of X. We also have the following conditions:
1. p(x) ≥ 0
2. Σ p(x) = 1
For a continuous random variable with probability density function f(x), the corresponding conditions are:
1. f(x) ≥ 0
2. ∫_{−∞}^{∞} f(x) dx = 1
The probability P(X ≤ x) is called the cumulative distribution function (CDF) of X and is denoted by F(x). It is a point function, defined for both discrete and continuous random variables. The following are the properties of the distribution function F(x):
1. 0 ≤ F(x) ≤ 1
2. F(x) is non-decreasing, with F(−∞) = 0 and F(∞) = 1
Expected value for a discrete random variable is given by:
E(X) = p1 x1 + p2 x2 + · · · + pn xn = Σ_{i=1}^{n} x_i P(x_i)
Properties of Expectation
Linearity of Expectation
1. Addition:
E[X + Y ] = E[X] + E[Y ].
2. Scalar Multiplication:
E[aX] = a E[X],
where a is a constant.
Note: The linearity property holds regardless of whether the random variables are independent or not.
Expectation of a Constant
E[c] = c,
where c is a constant.
Expectation of a Function of a Random Variable
For a function g of a random variable X:
• Discrete: E[g(X)] = Σ_x g(x) P(X = x).
• Continuous: E[g(X)] = ∫_{−∞}^{∞} g(x) f_X(x) dx.
Product of Independent Random Variables
If X and Y are independent random variables:
E[XY] = E[X] × E[Y].
Non-negativity
If X ≥ 0, then E[X] ≥ 0.
Jensen’s Inequality
For a convex function g:
E[g(X)] ≥ g(E[X]).
For a concave function, the inequality is reversed.
7.2 Variance
Variance of a random variable is given by:
Var(X) = E[(X − µ)²] = E[X²] − (E[X])², where µ = E(X).
Properties of Variance
Variance of a Constant
Var(c) = 0,
where c is a constant.
Scaling Property
For a constant a:
Var(aX) = a2 Var(X).
Var(aX + b) = a2 Var(X).
Note: Adding a constant b does not affect the variance.
If X and Y are independent, then
Cov(X, Y) = 0,
so:
Var(X ± Y) = Var(X) + Var(Y).
Note
1. Expected value µ = E(X) is a measure of central tendency.
2. Standard deviation σ is a measure of spread.
Properties of Covariance
1. Symmetry:
Cov(X, Y) = Cov(Y, X).
2. Relation to Variance:
Cov(X, X) = Var(X).
3. Linearity:
Cov(aX + b, Y ) = a Cov(X, Y ).
Cov(X, aY + b) = a Cov(X, Y ).
4. Addition:
Cov(X1 + X2 , Y ) = Cov(X1 , Y ) + Cov(X2 , Y ).
Correlation Coefficient
ρ_XY = Cov(X, Y) / (σ_X σ_Y),
where σX and σY are the standard deviations of X and Y .
Properties:
• −1 ≤ ρXY ≤ 1.
Example 2:
Let X and Y be independent random variables with Var(X) = 4 and Var(Y ) = 9. Find Var(2X − 3Y + 5).
Solution:
1. Since X and Y are independent and the constant 5 does not affect variance:
Var(2X − 3Y + 5) = 2² Var(X) + (−3)² Var(Y) = 4 × 4 + 9 × 9 = 16 + 81 = 97.
Chebyshev’s Inequality
For any random variable X with finite mean µ and variance σ 2 , and for any k > 0:
P(|X − µ| ≥ kσ) ≤ 1/k².
Interpretation: The probability that X deviates from its mean by k standard deviations or more is at most 1/k².
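An empirical illustration of Chebyshev's inequality (a minimal simulation sketch; the exponential distribution and sample size are arbitrary choices):

```python
import random

random.seed(0)

# Sample from an exponential distribution with rate 1 (mean = 1, sd = 1).
mu, sigma = 1.0, 1.0
data = [random.expovariate(1.0) for _ in range(100_000)]

k = 2
tail = sum(1 for x in data if abs(x - mu) >= k * sigma) / len(data)
print(f"Empirical P(|X - mu| >= {k} sigma) = {tail:.4f}")  # well below the bound
print(f"Chebyshev bound 1/k^2 = {1 / k**2:.4f}")           # 0.25
```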
8 Discrete Distributions
8.1 Discrete Uniform Distribution
A discrete random variable defined for values of x from 1 to n is said to have a uniform distribution if its probability
mass function is given by
f(x) = 1/n, for x = 1, 2, 3, . . . , n
f(x) = 0, otherwise
The cumulative distribution function F(x) of the discrete uniform random variable x is given by:
F(x) = 0, for x < 1
F(x) = x/n, for 1 ≤ x ≤ n
F(x) = 1, for x > n
The mean of X is
µ = (n + 1)/2
The variance of X is
σ² = (n² − 1)/12
8.3 Poisson Distribution
A discrete random variable X follows the Poisson distribution with parameter λ if its probability mass function is
p(x) = p(x; λ) = e^{−λ} λ^x / x!, for x = 0, 1, 2, . . . , λ > 0
p(x) = 0, otherwise
In a binomial distribution, if n is large and p is small such that np approaches a fixed constant λ, the resulting distribution is the Poisson distribution (the limiting case of the binomial distribution).
Properties of Poisson Distribution
1. E(X) = Σ_{x=0}^{∞} x · p(x) = λ
2. Var(X) = λ (for a Poisson distribution, the mean and variance are equal; see the numerical check below)
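The limiting-case claim can be checked numerically: for large n and small p with np = λ, binomial probabilities approach Poisson probabilities (a minimal sketch using only the standard library; λ and n are arbitrary):

```python
import math

lam = 2.0            # fixed mean: np = lambda
n = 1000
p = lam / n

for x in range(5):
    binom = math.comb(n, x) * p**x * (1 - p) ** (n - x)
    poisson = math.exp(-lam) * lam**x / math.factorial(x)
    print(x, round(binom, 5), round(poisson, 5))   # nearly identical columns
```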
9 Continuous Distributions
9.1 Uniform/Rectangular Distribution
A continuous random variable X is uniformly distributed over [a, b] if its density is f(x) = 1/(b − a) for a ≤ x ≤ b and f(x) = 0 otherwise.
• Mean of X = µ = (a + b)/2
• Variance of X = σ² = (b − a)²/12
9.2 Normal Distribution
A continuous random variable X has a normal distribution with mean µ and variance σ² if its density is
f_X(x) = (1/√(2πσ²)) exp(−(x − µ)²/(2σ²)), −∞ < x < ∞.
This is denoted as:
X ∼ N(µ, σ²).
(Figure: the graph of the normal density is a symmetric bell-shaped curve centered at µ.)
Note
• Symmetry: The normal distribution is symmetric about the mean µ.
9.3 Standard Normal Distribution
A normal random variable with µ = 0 and σ² = 1 is called a standard normal variable, written
Z ∼ N(0, 1).
Standardization
Any normal random variable X ∼ N (µ, σ 2 ) can be transformed into a standard normal variable Z using:
Z = (X − µ)/σ.
This process is called standardization and allows us to use standard normal distribution tables to compute
probabilities.
Example 1
Suppose X ∼ N (50, 16). Find P (X > 58).
Solution:
1. Standardize X (here σ = √16 = 4):
Z = (X − µ)/σ = (58 − 50)/4 = 2.
2. Find P (Z > 2):
P (Z > 2) = 1 − P (Z ≤ 2) = 1 − 0.9772 = 0.0228.
(Using standard normal distribution table)
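The table lookup can be reproduced in code (a minimal sketch using the standard library's statistics.NormalDist):

```python
from statistics import NormalDist

# P(X > 58) for X ~ N(50, 16), i.e. mean 50 and standard deviation 4.
X = NormalDist(mu=50, sigma=4)
print(round(1 - X.cdf(58), 4))     # 0.0228

# Equivalently via the standard normal: P(Z > 2).
Z = NormalDist()
print(round(1 - Z.cdf(2), 4))      # 0.0228
```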
Example 2
Let X1 , X2 , . . . , X36 be independent and identically distributed random variables with mean µ = 10 and variance
σ 2 = 9. Find the approximate probability that the sample mean X̄ exceeds 11.
Solution:
1. The sample mean X̄ has mean µ = 10 and standard deviation σ_X̄ = σ/√n = 3/√36 = 0.5.
2. Standardize X̄:
Z = (X̄ − µ)/σ_X̄ = (11 − 10)/0.5 = 2.
3. By the central limit theorem, P(X̄ > 11) ≈ P(Z > 2) = 0.0228.
Properties of Normal Distribution
1. The normal curve is bell-shaped and symmetric about x = µ.
2. It has a maximum at x = µ.
3. The area under the curve within the interval (µ ± σ) is 68%.
4. A fairly large number of samples taken from a ’Normal’ population will have average, median, and mode
nearly the same, and within the limits of average ±2 × SD, there will be 95% of the values.
5. E(X) = ∫_{−∞}^{∞} x · f(x) dx = µ.
6. V (X) = σ 2 , SD(X) = σ.
7. For a normal distribution,
Mean = Median = Mode
8. All odd order moments about mean vanish for a normal distribution.
µ_{2n+1} = 0 for n = 0, 1, 2, . . .
9. If X1 ∼ N(µ1, σ1²) and X2 ∼ N(µ2, σ2²) are independent, then X1 + X2 ∼ N(µ1 + µ2, σ1² + σ2²). Also,
X1 − X2 ∼ N(µ1 − µ2, σ1² + σ2²)
10. If µ = 0 and σ 2 = 1, we call it a standard normal distribution. The standardization can be obtained by the
transformation,
z = (x − µ)/σ
Also,
(X − µ)/σ ∼ N(0, 1)
For a two-dimensional discrete random variable (X, Y) with joint probability mass function P_XY(x_i, y_j), the marginal probability mass functions are
P_X(x_i) = Σ_{j=1}^{n} P_XY(x_i, y_j), for i = 1, 2, . . . , m
and
P_Y(y_j) = Σ_{i=1}^{m} P_XY(x_i, y_j), for j = 1, 2, . . . , n
• P_XY(x_i, y_j) ≥ 0 ∀ i, j
• Σ_{i=1}^{m} Σ_{j=1}^{n} P_XY(x_i, y_j) = 1
• The cumulative joint distribution function of the two dimensional random variable (X, Y ) is given by Fxy (x, y) =
P (X ≤ x, Y ≤ y).
• The cumulative joint distribution function FXY (x, y) of the two-dimensional random variable (X, Y ) (where
X and Y are any two continuous random variables defined on the same sample space) is given by:
F_XY(x, y) = ∫_{−∞}^{x} ∫_{−∞}^{y} f_XY(x, y) dx dy
10.2 Conditional Probability Functions of Random Variables
Let X and Y be two discrete (continuous) random variables defined on the same sample space with joint probability
mass (density) function fXY (x, y), then:
1. The conditional probability mass (density) function fX|Y (x|y) of X, given Y = y, is defined as:
f_{X|Y}(x|y) = f_XY(x, y) / f_Y(y), where f_Y(y) ≠ 0
2. The conditional probability mass (density) function fY |X (y|x) of Y , given X = x, is defined as:
f_{Y|X}(y|x) = f_XY(x, y) / f_X(x), where f_X(x) ≠ 0
Note
If the random variables X and Y are independent, then
P (a ≤ X ≤ b, c ≤ Y ≤ d) = P (a ≤ X ≤ b)P (c ≤ Y ≤ d)
11 Mean, Median, Mode and Standard Deviation
11.1 Mean
Mean of a data set generally refers to the arithmetic mean. However, there are two more types of mean: the geometric mean and the harmonic mean. For a set of values x1, x2, . . . , xn, and for grouped data with frequencies f_i (N = Σ f_i):
• Arithmetic Mean: (Σ x_i)/n for a set of values; (Σ f_i x_i)/N for grouped data
• Geometric Mean: (x1 x2 · · · xn)^{1/n} for a set of values; (Π x_i^{f_i})^{1/N} for grouped data
• Harmonic Mean: n/(Σ 1/x_i) for a set of values; N/(Σ f_i/x_i) for grouped data
11.2 Median
For an ordered set of values, the median is the middle value. If the number of values is even, median is taken as
the mean of the middle two values. For grouped data, median is given by:
Median = L + ((N/2 − C)/f) × h
where
L = lower boundary of the median class, N = Σ f_i, C = cumulative frequency up to the class before the median class, f = frequency of the median class, h = width of the median class.
Note
• Median does not take into consideration all the items.
• The sum of absolute deviations taken about median is least.
• Median is the abscissa of the point of intersection of the cumulative frequency curves.
11.3 Mode
For a set of values, mode is the most frequently occurring value. For grouped data, mode is given by:
Mode = L + ((f1 − f_{i−1}) / (2f1 − f_{i−1} − f_{i+1})) × h
where
L = lower boundary of the modal class, f1 = frequency of the modal class (highest frequency), f_{i−1} = frequency of the class before the modal class, f_{i+1} = frequency of the class after the modal class, h = width of the modal class.
Note: Relation between Mean, Median and Mode (empirical relation):
Mode = 3 Median − 2 Mean
11.4 Standard Deviation
The variance σ² is given by:
σ² = (Σ (x_i − x̄)²)/n, for a set of values
σ² = (Σ f_i (x_i − x̄)²)/(Σ f_i), for grouped data
The standard deviation σ can alternatively be calculated using the formula:
σ = √( (Σ x_i²)/n − ((Σ x_i)/n)² )
This is a useful formula for computational purposes.
Note
1. The square of standard deviation is termed as variance.
2. Standard deviation (SD) is the least root-mean-square deviation (taking deviations about the mean minimizes it).
3. If each item is increased by a fixed constant, the SD does not change; i.e., SD is independent of a change of origin.
4. Standard deviation depends on each and every data item.
5. For a discrete series in the form a, a + d, a + 2d, . . . (Arithmetic Progression, AP), the standard deviation is given by:
SD = d √((n² − 1)/12)
where n is the number of terms in the series.
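These descriptive statistics are easy to verify in code (a minimal sketch using the Python standard library's statistics module; the data values are arbitrary):

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]

print(statistics.mean(data))       # 5.0
print(statistics.median(data))     # 4.5
print(statistics.mode(data))       # 4
print(statistics.pstdev(data))     # 2.0  (population SD, divides by n)

# SD is independent of change of origin: adding a constant leaves it unchanged.
shifted = [x + 100 for x in data]
print(statistics.pstdev(shifted))  # 2.0
```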
t-distribution and chi-distribution
12 t-distribution
The t-distribution, also known as Student’s t-distribution, is a probability distribution that is symmetric and bell-
shaped like the normal distribution. However, it has heavier tails, meaning it is more prone to producing values
that fall far from the mean. This distribution is used primarily when the sample size is small, or the population
standard deviation is unknown.
12.1 Key Characteristics:
1. Shape:
• Bell-shaped and symmetric around the mean, similar to the normal distribution.
• Heavier tails, which means there is more variability in extreme values compared to the normal distribu-
tion.
2. Degrees of Freedom (df):
• The shape of the t-distribution depends on the degrees of freedom (df), which are related to the sample
size.
• Lower df results in heavier tails (more spread out), while higher df makes the t-distribution closer to the
normal distribution.
• Degrees of freedom are typically equal to the sample size minus one (df = n - 1) for a one-sample t-test.
3. Used in:
• Hypothesis testing (like t-tests) when:
- The sample size is small (n < 30).
- The population standard deviation is unknown.
• Estimating the mean of a normally distributed population when the sample size is small.
• Constructing confidence intervals for small samples.
12.1.1 Comparing the t-distribution and Normal Distribution:
• The t-distribution is more spread out (has fatter tails) when the sample size is small, which accounts for the
extra uncertainty in estimating the population standard deviation.
• As the sample size increases, the t-distribution gets closer to the normal distribution. When the sample size
is large (df > 30), you can often use the normal distribution instead.
12.2 Formula for the t-distribution:
The probability density function of the t-distribution with df degrees of freedom is
f(x) = [Γ((df + 1)/2) / (√(df π) Γ(df/2))] × (1 + x²/df)^{−(df+1)/2}
Where: - Γ is the gamma function (a generalization of the factorial). - x is the value of the random variable.
12.3 Mean, Median, Mode and SD
1. Mean of t-distribution (df = 3): For a t-distribution, the mean exists and equals 0 when the degrees of freedom (df) are greater than 1. Since df = 3 > 1, the mean of the t-distribution is:
Mean = 0
2. Median of t-distribution (df = 3): The t-distribution is symmetric about 0, so the median is the same as the mean:
Median = 0
3. Mode of t-distribution (df = 3): The mode of a t-distribution is also located at 0, as the distribution is symmetric and centered at 0:
Mode = 0
4. Variance of t-distribution (df = 3): The variance of a t-distribution depends on the degrees of freedom and, for df > 2, is:
Variance = df/(df − 2)
For df = 3:
Variance = 3/(3 − 2) = 3/1 = 3
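A quick check of these values against SciPy's implementation (a minimal sketch, assuming SciPy is available):

```python
from scipy.stats import t

df = 3
dist = t(df)

print(dist.mean())    # 0.0
print(dist.median())  # 0.0
print(dist.var())     # 3.0, matching df / (df - 2)
```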
13 Chi-Square Distribution
The chi-square distribution is a statistical distribution commonly used in hypothesis testing, especially for categor-
ical data and variance tests. It arises when you sum the squares of independent standard normal random variables.
The chi-square distribution is primarily used in chi-square tests, which are applied to:
- Test goodness of fit (how well observed data fits a theoretical distribution).
- Test independence in contingency tables (e.g., association between two categorical variables).
- Test variance (whether the variance of a population equals a specified value).
For random variables X_i with means µ_i and standard deviations σ_i, the standardized variates are
Z_i = (X_i − E(X_i))/√Var(X_i) = (X_i − µ_i)/σ_i, i = 1, 2, . . . , n
The sum of squares of standard normal variates is known as the chi-square variate with n degrees of freedom, i.e.,
χ² = Σ_{i=1}^{n} Z_i² = Σ_{i=1}^{n} ((X_i − µ_i)/σ_i)²
If Z1, Z2, ..., Zk are independent standard normal random variables (i.e., Z_i ∼ N(0, 1)), then the sum of their squares follows a chi-square distribution with k degrees of freedom. In particular, for a sample of size n from a normal population,
χ² = (n − 1)s²/σ²
follows a chi-square distribution with n − 1 degrees of freedom, where s² is the sample variance, σ² is the population variance, and n is the sample size.
(Figure: probability density functions of the chi-square distribution for df = 2, 5, 10.)
For small degrees of freedom (df = 2), the distribution is highly skewed to the right. As the degrees of freedom increase (df = 5 and df = 10), the distribution becomes more symmetric and shifts to the right.
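The defining construction (a sum of squared standard normals) can be illustrated by simulation (a minimal sketch; df and sample size are arbitrary choices):

```python
import random

random.seed(1)
k = 5                 # degrees of freedom
n_trials = 100_000

# Each trial: sum of squares of k independent standard normal variates.
samples = [sum(random.gauss(0, 1) ** 2 for _ in range(k)) for _ in range(n_trials)]

mean = sum(samples) / n_trials
var = sum((s - mean) ** 2 for s in samples) / n_trials
print(round(mean, 2))  # close to k = 5   (chi-square mean is df)
print(round(var, 2))   # close to 2k = 10 (chi-square variance is 2*df)
```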
14 Hypothesis Testing
In probability theory, we set up mathematical models of processes and systems that are affected by 'chance'. In statistics, we check these models against reality to determine whether they are faithful and accurate enough for practical purposes. The process of checking models is called statistical inference. Methods of statistical inference are based on drawing samples (sampling). One of the most important methods of statistical inference is hypothesis testing.
Testing of Hypothesis
We have some information about a characteristic of the population which may or may not be true. This information is called a statistical hypothesis, or briefly a hypothesis. We wish to know whether this information should be accepted or rejected. We choose a random sample and obtain information about this characteristic. A procedure that, based on this sample information, decides whether the hypothesis is to be accepted or rejected is called testing of hypothesis. In brief, a test of hypothesis (or test of significance) is a procedure to determine whether observed samples differ significantly from expected results.
14.1 Key Terms:
1. Null Hypothesis (H0):
• The default claim being tested, usually that there is no effect or no difference.
2. Alternative Hypothesis (Ha):
• The claim we seek evidence for, stating that there is an effect or a difference.
3. Test Statistic:
• A value calculated from the sample data that is used to decide whether to reject the null hypothesis. It
measures how far the sample data diverge from the null hypothesis.
• Common test statistics include t-values (for t-tests), z-values (for z-tests), or chi-square values (for chi-square
tests).
4.Significance Level (α):
• This is the threshold we set to determine how extreme the data must be to reject the null hypothesis. It is
usually set to 0.05 (5%), meaning there’s a 5% chance of rejecting the null hypothesis when it’s actually true
(Type I error).
• Example: If α = 0.05, there’s a 5% risk of incorrectly rejecting the null hypothesis.
5.p-value:
• The p-value is the probability of obtaining a test statistic at least as extreme as the one calculated from the
data, assuming that the null hypothesis is true.
• If the p-value is less than or equal to the significance level (p ≤ α), you reject the null hypothesis.
• If the p-value is greater than the significance level (p > α), you fail to reject the null hypothesis.
14.2 Steps in Hypothesis Testing:
1. State the Hypotheses:
• Null Hypothesis (H0 ): This is the default claim you are testing.
• Alternative Hypothesis (Ha ): This is the claim you are trying to prove.
• Example:
- H0 : The average test score is 50.
- Ha : The average test score is not 50.
2. Choose the Significance Level: typically α = 0.05.
3. Calculate the Test Statistic:
• The test statistic depends on the type of data and the test you're conducting (t-test, z-test, etc.).
• For a t-test, the formula is:
t = (x̄ − µ)/(s/√n)
Where: - x̄ is the sample mean. - µ is the population mean under the null hypothesis. - s is the sample
standard deviation. - n is the sample size.
4. Determine the p-value:
• The p-value tells us how extreme the test statistic is under the assumption that H0 is true.
• The p-value is compared to α (significance level).
5. Make a decision:
• If the p-value ≤ α: Reject the null hypothesis. There is evidence to support the alternative hypothesis.
• If the p-value > α: Fail to reject the null hypothesis. There is not enough evidence to support the
alternative hypothesis.
6. Draw a conclusion: - Based on the results, you can conclude whether or not there is evidence to support the
alternative hypothesis.
Types of Hypothesis Tests:
1. One-Tailed Test: - Used when the alternative hypothesis is testing for a direction (greater than or less than).
- Example: Ha : µ > 50 (right-tailed) or Ha : µ < 50 (left-tailed).
2. Two-Tailed Test: - Used when the alternative hypothesis is testing for any difference, not a specific direction.
- Example: Ha : µ ̸= 50.
Types of Errors:
• Type I Error (False Positive): Occurs when you reject the null hypothesis when it is actually true; its probability is the significance level α.
• Type II Error (False Negative): Occurs when you fail to reject the null hypothesis when the alternative hypothesis is true.
14.3 Significance Level and Confidence Level in Terms of t-Value
1. Significance Level (α)
In Terms of t-Value:
- The t-critical value is the t-value that corresponds to the significance level in a t-distribution.
- For example, in a two-tailed test with α = 0.05, the critical t-values are chosen so that 5% of the total area under
the t-distribution curve is in the tails (2.5% in each tail). If the calculated t-value is more extreme than the critical
value, we reject the null hypothesis.
Example:
For a two-tailed test with α = 0.05 and df = 9:
- The critical values from the t-table are approximately ±2.262.
- If your calculated t-value is outside this range (less than -2.262 or greater than 2.262), you reject the null hypothesis.
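The critical value quoted above can be reproduced with SciPy's inverse CDF (a minimal sketch, assuming SciPy is available):

```python
from scipy.stats import t

alpha, df = 0.05, 9

# Two-tailed test: put alpha/2 in each tail.
t_crit = t.ppf(1 - alpha / 2, df)
print(round(t_crit, 3))   # 2.262
```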
2. Confidence Level
The confidence level is the percentage of confidence we have that the population parameter lies within the estimated
range. It is complementary to the significance level:
Confidence Level = 1 − α
For example: - If α = 0.05, the confidence level is 1 − 0.05 = 0.95
- This means we are 95% confident that the population parameter (such as the population mean) lies within
the interval calculated from the sample data.
In Terms of t-Value:
- The confidence interval (CI) is calculated using the t-critical value based on the confidence level: CI = x̄ ± t_{α/2} × s/√n. For a 95% confidence interval, 95% of the area under the t-distribution falls between the two t-critical values, leaving 5% in the tails (2.5% in each tail).
15 t-test
A t-test uses the t-statistic
t = (x̄ − µ)/(s/√n)
Where: - x̄ is the sample mean, - µ is the hypothesized population mean, - s is the sample standard deviation, - n is the sample size.
The t-value tells you how many standard errors (estimated standard deviations of the sample mean) the sample mean is from the hypothesized population mean.
Types of t-Tests:
1. One-Sample t-Test: Compares the sample mean to a known population mean.
t-Statistic Calculation:
t = (x̄ − µ)/(s/√n)
2. Two-Sample t-Test: Compares the means of two independent samples to see if they are significantly different.
t-Statistic Calculation (assuming equal variances):
t = (x̄1 − x̄2) / √( s_p² (1/n1 + 1/n2) )
where s_p² is the pooled sample variance.
• Variance: The variance is greater than 1 for small sample sizes but approaches 1 as the degrees of freedom
increase.
• Use for Small Samples: The t-distribution is especially useful when the sample size is small and the
population standard deviation is unknown.
How is the t-Value Used in Hypothesis Testing?
1. Formulate Hypotheses: - Null Hypothesis (H0 ): There is no effect or no difference. - Alternative
Hypothesis (Ha ): There is an effect or a difference.
2. Calculate the t-Value: Use the formula for the t-statistic to calculate the t-value from the sample data.
3. Determine the Critical Value: Based on the significance level (usually α = 0.05) and degrees of freedom,
look up the critical value from the t-distribution table.
4. Compare the t-Value and Critical Value: - If the absolute value of the t-value is greater than the critical
value, reject the null hypothesis.
- If the absolute value of the t-value is less than the critical value, fail to reject the null hypothesis.
5. Decision: Based on the comparison, make a conclusion about whether there is enough evidence to support
the alternative hypothesis.
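The whole workflow for a one-sample t-test can be carried out in a few lines (a minimal sketch, assuming SciPy is available; the sample data and hypothesized mean are invented for illustration):

```python
from scipy.stats import ttest_1samp

# Hypothetical sample; H0: the population mean is 50.
sample = [52.1, 48.3, 55.0, 51.2, 49.8, 53.4, 50.9, 54.2, 47.6, 52.8]

result = ttest_1samp(sample, popmean=50)
print(round(result.statistic, 3))   # t-value
print(round(result.pvalue, 3))      # compare against alpha = 0.05

if result.pvalue <= 0.05:
    print("Reject H0")
else:
    print("Fail to reject H0")
```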
Note on Degrees of Freedom: The degrees of freedom impact the shape of the t-distribution, with lower degrees of freedom resulting in heavier tails (larger critical values).
16 z-test
A Z-test is a statistical test used to determine whether there is a significant difference between sample and population
means, or between two sample means, when the population variance is known or when the sample size is large
(typically n > 30). The Z-test is based on the standard normal distribution (also called the Z-distribution), which
has a mean of 0 and a standard deviation of 1.
Types of Z-Tests:
1. One-sample Z-test: Used to test if the sample mean is different from a known population mean.
Formula:
Z = (x̄ − µ)/(σ/√n)
Where: - x̄ = sample mean, - µ = population mean, - σ = population standard deviation, - n = sample size.
2. Two-sample Z-test: Used to compare the means of two independent samples.
Formula:
Z = (x̄1 − x̄2) / √( σ1²/n1 + σ2²/n2 )
Where: - x̄1 , x̄2 = sample means, - σ12 , σ22 = population variances, - n1 , n2 = sample sizes.
3. Z-test for proportions: Used to compare proportions between two samples. Formula:
Z = (p1 − p2) / √( p̂(1 − p̂)(1/n1 + 1/n2) )
where p̂ is the pooled sample proportion.
• One-Tailed Test: Use this when you’re testing for a directional hypothesis (e.g., whether a sample mean is
greater than or less than a population mean). The critical value here corresponds to the tail on one side of
the normal distribution.
Example: If you are conducting a one-tailed test at α = 0.05, the Z-critical value is 1.645.
• Two-Tailed Test: Use this when you’re testing for a non-directional hypothesis (e.g., whether the sample
mean is different from the population mean, either higher or lower). This splits the significance level equally
between both tails of the normal distribution.
Example: If you’re conducting a two-tailed test at α = 0.05, the Z-critical values are ±1.96.
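A one-sample Z-test computed by hand (a minimal sketch using the standard library's statistics.NormalDist; the numbers are invented for illustration):

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical data: n = 50, sample mean 52, known population sd 10; H0: mu = 50.
x_bar, mu, sigma, n = 52.0, 50.0, 10.0, 50

z = (x_bar - mu) / (sigma / sqrt(n))
print(round(z, 3))                         # 1.414

# Two-tailed p-value from the standard normal distribution.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
print(round(p_value, 3))                   # 0.157 -> fail to reject at alpha = 0.05
```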
17 chi-test
The Chi-Square Test (or χ2 test) is a statistical test used to determine if there is a significant association between
observed and expected frequencies or whether two categorical variables are independent. It is widely used in
hypothesis testing for categorical data, especially when working with contingency tables and goodness-of-fit tests.
Types of Chi-Square Tests
1. Chi-Square Goodness-of-Fit Test: - Used when you want to determine if an observed frequency distribution
fits a particular theoretical distribution. - Example: Testing whether a die is fair by comparing observed
frequencies of each outcome with the expected frequencies.
2. Chi-Square Test for Independence: - Used when you want to determine if two categorical variables are
independent of each other. - Example: Testing whether gender is independent of voting preference in a
population using a contingency table.
Chi-Square Test Formula
The chi-square test statistic (χ2 ) is calculated as follows:
χ² = Σ (O_i − E_i)²/E_i
Where: - O_i = Observed frequency, - E_i = Expected frequency.
Key Assumptions:
1. Data must be in the form of frequencies (counts).
2. The observations must be independent of each other.
3. The sample size should be large enough (expected frequencies should ideally be 5 or more).
Steps for the Chi-Square Goodness-of-Fit Test:
1. Hypotheses:
- Null Hypothesis (H0 ): The observed data fits the expected distribution.
- Alternative Hypothesis (Ha ): The observed data does not fit the expected distribution.
2. Degrees of Freedom (df):
- df = k − 1, where k is the number of categories.
3. Test Statistic:
- Calculate the chi-square statistic using the formula:
χ² = Σ (O_i − E_i)²/E_i
4. Critical Value:
- Look up the critical chi-square value for the chosen significance level α and degrees of freedom from a chi-square table.
5. Decision:
- If the calculated chi-square value is greater than the critical value, reject H0 . - Otherwise, fail to reject H0 .
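The die-fairness example mentioned above, worked in code (a minimal sketch, assuming SciPy is available; the observed counts are invented):

```python
from scipy.stats import chisquare

# Hypothetical counts from 120 rolls of a die; a fair die expects 20 of each face.
observed = [18, 22, 25, 17, 20, 18]
expected = [20] * 6

result = chisquare(f_obs=observed, f_exp=expected)
print(round(result.statistic, 3))  # chi-square statistic, df = 6 - 1 = 5
print(round(result.pvalue, 3))     # large p-value -> no evidence the die is unfair
```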
2. Chi-Square Test for Independence
This test checks whether two categorical variables are independent of each other. It’s often used in a contingency
table, which shows the frequency distribution of variables.
Steps for Chi-Square Test for Independence:
1. Hypotheses:
- Null Hypothesis (H0 ): The two variables are independent.
- Alternative Hypothesis (Ha ): The two variables are not independent.
2. Expected Frequency:
- Calculate the expected frequency for each cell in the contingency table:
E_ij = (Row Total × Column Total) / Grand Total
3. Chi-Square Test for Variance
This test checks whether a population variance equals a specified value. The test statistic is:
χ² = (n − 1)s²/σ²
Where: - χ² = Chi-square test statistic, - n = Sample size, - s² = Sample variance, - σ² = Hypothesized population variance.
Key Hypotheses for the Test
1. Null Hypothesis (H0 ): The population variance equals the specified value.
H0 : σ² = σ0²
2. Alternative Hypothesis (Ha ): The population variance is not equal to the specified value (can be one-sided or two-sided).
Ha : σ² ≠ σ0² (for a two-tailed test)
Ha : σ² > σ0² (for a right-tailed test)
Ha : σ² < σ0² (for a left-tailed test)
6. Make a Decision:
- Compare the calculated chi-square value with the critical value(s):
- For a two-tailed test, if the calculated χ2 value falls outside the range of critical values, reject H0 .
- For a one-tailed test, if the calculated χ2 value is greater than the upper critical value (or less than the
lower critical value for a left-tailed test), reject H0 .
7. Conclusion:
- Based on the decision, either reject or fail to reject the null hypothesis.
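A worked chi-square variance test (a minimal sketch, assuming SciPy is available; the sample values and hypothesized variance are invented):

```python
from scipy.stats import chi2
from statistics import variance

# Hypothetical sample; H0: population variance sigma0^2 = 4.
sample = [10.2, 11.5, 9.8, 10.9, 12.1, 9.5, 10.4, 11.8]
sigma0_sq = 4.0

n = len(sample)
s_sq = variance(sample)                  # sample variance (divides by n - 1)
chi_sq = (n - 1) * s_sq / sigma0_sq
print(round(chi_sq, 3))

# Two-tailed critical values at alpha = 0.05 with df = n - 1.
alpha, df = 0.05, n - 1
lower = chi2.ppf(alpha / 2, df)
upper = chi2.ppf(1 - alpha / 2, df)
print(round(lower, 3), round(upper, 3))  # reject H0 if chi_sq falls outside [lower, upper]
```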