Unit 3 (Hypothesis Testing)
Hypothesis testing is a statistical tool used to determine whether the results of an experiment are meaningful. It involves setting up a null hypothesis and an alternative hypothesis. These two hypotheses are always mutually exclusive: if the null hypothesis is true, then the alternative hypothesis is false, and vice versa.
Hypothesis testing uses sample data from the population to draw useful conclusions about the population probability distribution. An assumption made about a population parameter is tested against the data, and the test results in either rejecting or failing to reject the null hypothesis. In other words, hypothesis testing evaluates two mutually exclusive statements about a population to determine which statement is better supported by the sample data.
Defining Hypotheses
• Null hypothesis (H0): In statistics, the null hypothesis is a general statement or default position that there is no relationship between two measured cases or no difference among groups. In other words, it is a basic assumption made based on knowledge of the problem.
Example: A company’s mean production is 50 units per day, i.e., H0: μ = 50.
• Alternative hypothesis (H1): The alternative hypothesis is the hypothesis used in hypothesis testing that is contrary to the null hypothesis.
Example: The company’s mean production is not equal to 50 units per day, i.e., H1: μ ≠ 50.
The possible outcomes of the decision can be summarized as:

Decision            | Null Hypothesis is True       | Null Hypothesis is False
Fail to Reject (Accept) H0 | Correct Decision       | Type II Error (False Negative)
Reject H0           | Type I Error (False Positive) | Correct Decision
1. Z-Statistics
The z-test is used when the population standard deviation is known and the sample size is large (n ≥ 30). The z-statistic is given by:
z = (x̄ − μ) / (σ / √n)
where,
• x̄ is the sample mean,
• μ represents the population mean,
• σ is the population standard deviation,
• and n is the size of the sample.
2. T-Statistics
The t-test is used when n < 30 and the population standard deviation is unknown. The t-statistic is given by:
t = (x̄ − μ) / (s / √n)
where,
• t = t-score,
• x̄ = sample mean,
• μ = population mean,
• s = standard deviation of the sample,
• n = sample size
3. Chi-Square Test
The Chi-Square test for independence is used for categorical (non-normally distributed) data. The test statistic is:
χ² = Σ (Oij − Eij)² / Eij
where,
• Oij is the observed frequency in cell ij,
• i, j are the row and column indices respectively,
• Eij is the expected frequency in cell ij, calculated as:
Eij = (row i total × column j total) / grand total
Testing a hypothesis typically involves several steps to determine whether there is enough evidence to
support or reject a proposed explanation or assumption based on data. Here’s a brief outline of the
process:
1. Formulate Hypotheses
• Null Hypothesis (H₀): A statement suggesting no effect or no difference. It's the default
position.
• Alternative Hypothesis (H₁ or Ha): A statement suggesting that there is an effect or difference.
2. Choose the Significance Level (α)
The significance level (often 0.05) determines the probability threshold for rejecting the null
hypothesis. If the p-value is less than α, the null hypothesis is rejected.
Example 1: The average weight of a dumbbell in a gym is 90 lbs. However, a physical trainer believes that the average weight might be higher. A random sample of 5 dumbbells has an average weight of 110 lbs and a standard deviation of 18 lbs. Using hypothesis testing, check whether the physical trainer's claim can be supported at a 95% confidence level.
Solution: As the sample size is less than 30, the t-test is used.
H0: μ = 90, H1: μ > 90
x̄ = 110, μ = 90, n = 5, s = 18
α = 0.05
Using the t-distribution table, the critical value is 2.132.
t = (x̄ − μ) / (s / √n) = (110 − 90) / (18 / √5)
t = 2.484
As 2.484 > 2.132, the null hypothesis is rejected.
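To verify this result numerically, here is a minimal sketch in Python (assuming SciPy is available) that recomputes the t-statistic and the one-tailed critical value:

```python
# Minimal sketch verifying Example 1; assumes SciPy is installed.
from scipy import stats

x_bar, mu, s, n = 110, 90, 18, 5           # sample mean, hypothesized mean, sample SD, sample size
t_stat = (x_bar - mu) / (s / n ** 0.5)     # t = (x̄ - μ) / (s / √n) ≈ 2.484

# One-tailed critical value at α = 0.05 with df = n - 1 = 4
t_crit = stats.t.ppf(1 - 0.05, df=n - 1)   # ≈ 2.132

print(f"t = {t_stat:.3f}, critical value = {t_crit:.3f}")
print("Reject H0" if t_stat > t_crit else "Fail to reject H0")
```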
t-Test
A t-test is a type of inferential statistic used to determine whether there is a significant difference between the means of two groups, which may be related by certain features. It is used as a hypothesis testing tool that allows testing an assumption applicable to a population. A t-test is used only when comparing the means of two groups, also known as a pairwise comparison.
Types of T-Tests
There are three types of t-tests we can perform based on the data, such as:
1. One-sample t-test
In a one-sample t-test, we compare the average of one group against a set average. This set average can be any theoretical value, or it can be the population mean.
In a nutshell, here's the formula for a one-sample t-test:
t = (m − µ) / (s / √n)
Where,
o t = t-statistic
o m = mean of the group
o µ = theoretical value or population mean
o s = standard deviation of the group
o n = group size or sample size
2. Unpaired (independent) t-test
The unpaired t-test is used to compare the means of two different groups of samples.
For example, we may want to compare the average height of male employees to the average height of female employees. This is where an unpaired or independent t-test is used.
Here's the formula to calculate the t-statistic for a two-sample t-test:
t = (mA − mB) / √(S²/nA + S²/nB)
Where,
o mA and mB are the means of the two different groups
o nA and nB are the sample sizes
o S² is an estimator of the common variance of the two samples, such as:
S² = [Σ(x − mA)² + Σ(x − mB)²] / (nA + nB − 2)
3. Paired t-test
The paired t-test compares the means of the same group measured at two different times (e.g., before and after a treatment). It is computed on the differences between the paired observations:
t = m / (s / √n)
Where,
o t = t-statistic
o m = mean of the differences between paired observations
o s = standard deviation of the differences
o n = number of pairs
Calculating T-Tests
Calculating a t-test requires three key data values. They include the difference between the mean
values from each data set called the mean difference, the standard deviation of each group, and the
number of data values of each group.
The outcome of the t-test produces the t-value. This calculated t-value is then compared against a
value obtained from a critical value table called the T-Distribution Table. This comparison helps
determine the effect of chance alone on the difference and whether it is outside that chance range.
The t-test questions whether the difference between the groups represents a true difference in the
study or possibly a meaningless random difference.
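As a concrete illustration of this calculation, here is a minimal sketch of an independent two-sample t-test in Python; the height values are hypothetical and SciPy is assumed to be available:

```python
# Minimal sketch of an independent two-sample t-test; data is hypothetical.
import numpy as np
from scipy import stats

male_heights = np.array([172, 175, 180, 168, 177, 174])    # hypothetical heights in cm
female_heights = np.array([165, 170, 169, 172, 166, 168])

# equal_var=True uses the pooled common-variance estimator described above
t_stat, p_value = stats.ttest_ind(male_heights, female_heights, equal_var=True)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
print("Reject H0" if p_value < 0.05 else "Fail to reject H0")
```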
1. T-Distribution Tables
The T-Distribution Table is available in one-tailed and two-tailed formats. The former is used to assess cases with a fixed value or range with a clear direction (positive or negative). For example, what is the probability of the output value remaining below −3, or of getting more than seven when rolling a pair of dice? The latter is used for range-bound analysis, such as asking whether the coordinates fall between −2 and +2. The calculations can be performed with standard software programs that support the necessary statistical functions, such as those found in MS Excel.
2. T-Values and Degrees of Freedom
The t-test produces two values as its output, t-value and degrees of freedom.
The t-value is a ratio of the difference between the mean of the two sample sets and the
variation within the sample sets. While the numerator value (the difference between the mean
of the two sample sets) is straightforward to calculate, the denominator (the variation within
the sample sets) can become a bit complicated depending upon the type of data values
involved. The denominator of the ratio is a measurement of the dispersion or variability.
Higher values of the t-value, also called t-score, indicate a large difference between the two
sample sets. The smaller the t-value, the more similarity exists between the two sample sets.
o A large t-score indicates that the groups are different.
o A small t-score indicates that the groups are similar.
Degrees of freedom refers to the values in a study that can vary and are essential for assessing the null
hypothesis's importance and validity. The computation of these values usually depends upon the
number of data records available in the sample set.
A t-test is a type of inferential statistic test used to determine if there is a significant difference between
the means of two groups. It is often used when data is normally distributed and population variance is
unknown.
Assumptions in T-test
• Independence: The observations within each group must be independent of each other. This
means that the value of one observation should not influence the value of another
observation. Violations of independence can occur with repeated measures, paired data, or
clustered data.
• Normality: The data within each group should be approximately normally distributed, i.e., the distribution of the data within each group being compared should resemble a normal (bell-shaped) distribution. This assumption is crucial for small sample sizes (n < 30).
• Homogeneity of Variances (for independent samples t-test): The variances of the two groups
being compared should be equal. This assumption ensures that the groups have a similar
spread of values. Unequal variances can affect the standard error of the difference between
means and, consequently, the t-statistic.
• Absence of Outliers: There should be no extreme outliers in the data as outliers can
disproportionately influence the results, especially when sample sizes are small.
Prerequisites for T-Test
Let’s quickly review some related terms before digging deeper into the specifics of the t-test.
A t-test is a statistical method used to compare the means of two groups to determine if there is a
significant difference between them. The t-test is a parametric test, meaning it makes certain
assumptions about the data. Here are the key prerequisites for conducting a t-test.
Hypothesis Testing:
Hypothesis testing is a statistical method used to make inferences about a population based on a
sample of data.
P-value:
The p-value is the probability of observing a test statistic at least as extreme as the one obtained, given that the null hypothesis is true.
• A small p-value (typically less than the chosen significance level) suggests that the observed
data is unlikely to have occurred by random chance alone, leading to the rejection of the null
hypothesis.
• A large p-value suggests that the observed data is likely to have occurred by random chance,
and there is not enough evidence to reject the null hypothesis.
Degree of freedom (df):
The degree of freedom represents the number of values in a calculation that is free to vary. The degree
of freedom (df) tells us the number of independent variables used for calculating the estimate
between 2 sample groups.
In a one-sample t-test, the degrees of freedom are calculated as the sample size minus 1, i.e., df = n − 1, where n is the number of observations in the sample. It reflects the number of values in the sample that are free to vary after estimating the sample mean. Suppose we have two samples A and B; the df would then be calculated as df = (nA − 1) + (nB − 1).
Significance Level:
The significance level is the predetermined threshold that is used to decide whether to reject the null
hypothesis. Commonly used significance levels are 0.05, 0.01, or 0.10.
A significance level of 0.05 indicates that the researcher is willing to accept a 5% chance of making a
Type I error (incorrectly rejecting a true null hypothesis).
T-statistic:
The t-statistic is a measure of the difference between the means of two groups relative to the variability
within each group. It is calculated as the difference between the sample means divided by the standard
error of the difference. It is also known as the t-value or t-score.
• If the t-value is large, the two group means are likely to be different.
• If the t-value is small, the two group means are likely to be similar.
T-Distribution
The t-distribution, commonly known as the Student’s t-distribution, is a probability distribution with
tails that are thicker than those of the normal distribution.
Statistical Significance
Statistical significance is determined by comparing the p-value to the chosen significance level.
• If the p-value is less than or equal to the significance level, the result is considered statistically
significant, and the null hypothesis is rejected.
• If the p-value is greater than the significance level, the result is not statistically significant, and
there is insufficient evidence to reject the null hypothesis.
Types of T-tests
There are three types of t-tests, and they are categorized as dependent and independent t-tests.
1. One-sample t-test: Compares the mean of a single group against a known mean.
2. Two-sample t-test: It is further divided into two types:
• Independent samples t-test: compares the means for two groups.
• Paired sample t-test: compares means from the same group at different times (say,
one year apart).
One sample T-test
The one-sample t-test is widely used to compare the sample mean of the data to a particular given value, i.e., to compare the sample mean to the true/population mean. We can use this when the sample size is small (under 30) and the data is collected randomly and approximately normally distributed. It can be calculated as:
t = (x̄ − μ) / (s / √n)
where,
• t = t-value
• x̄ = sample mean
• μ = true/population mean
• s = sample standard deviation
• n = sample size
Two-sample T-test
The two-sample t-statistic is calculated as:
t = (x̄1 − x̄2) / √(s1²/n1 + s2²/n2)
Where,
• x̄1 and x̄2 are the means of the two groups,
• s1 and s2 are the standard deviations of the two groups,
• n1 and n2 are the sample sizes of the two groups.
A t-test is a statistical test used to compare the means of two groups or to compare a sample mean to
a known population mean. It is useful when the sample size is small, and the population standard
deviation is unknown. There are three common types of t-tests:
1. One-Sample t-test: Compares the mean of a single sample to a known value (often a
population mean).
2. Independent Two-Sample t-test: Compares the means of two independent groups.
3. Paired Sample t-test: Compares means from the same group at different times (e.g., before
and after an intervention).
The one-sample t-statistic is computed as:
t = (X̄ − μ0) / (s / √n)
Where:
• X̄ = sample mean
• μ0 = population mean (known value)
• s = sample standard deviation
• n = sample size
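For completeness, here is a minimal sketch showing SciPy's built-in helpers for all three t-test types; the data arrays are hypothetical:

```python
# Minimal sketch of the three t-test types with SciPy; all data is hypothetical.
import numpy as np
from scipy import stats

sample = np.array([12.1, 11.8, 12.5, 12.0, 11.6])
before = np.array([80.0, 85.0, 78.0, 90.0, 88.0])   # e.g., scores before an intervention
after = np.array([82.0, 88.0, 80.0, 91.0, 90.0])    # same subjects afterwards

# 1. One-sample t-test against a known population mean μ0 = 12
print(stats.ttest_1samp(sample, popmean=12.0))

# 2. Independent two-sample t-test (treats the arrays as two unrelated groups)
print(stats.ttest_ind(before, after))

# 3. Paired-sample t-test (same subjects measured twice)
print(stats.ttest_rel(before, after))
```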
What is a Z-Test?
A Z-test is a statistical test used to determine whether there is a significant difference between the
sample mean and the population mean (or between two sample means) when the population
standard deviation is known, or the sample size is large (typically n > 30). It is based on the standard
normal distribution (z-distribution).
There are several types of Z-tests:
1. One-Sample Z-test: Compares the sample mean to a known population mean.
2. Two-Sample Z-test: Compares the means of two independent samples.
3. Z-test for Proportions: Compares sample proportions to population proportions.
A z-test is used to check whether the means of two populations are different, provided the data follows a normal distribution. For this purpose, the null hypothesis and the alternative hypothesis must be set up and the value of the z-test statistic must be calculated. The decision criterion is based on the z critical value.
A z-test is conducted on a population that follows a normal distribution with independent data points and a sample size that is greater than or equal to 30. It is used to check whether the means of two populations are equal when the population variance is known. The null hypothesis of a z-test can be rejected if the z-test statistic is statistically significant when compared with the critical value.
What is Z-Test?
Z-test is a statistical test that is used to determine whether the mean of a sample is significantly
different from a known population mean when the population standard deviation is known. It is
particularly useful when the sample size is large (>30).
The z-test can also be defined as a statistical method used to determine whether the distribution of the test statistic can be approximated by a normal distribution. It is the method used to determine whether two sample means are approximately the same or different when their variance is known and the sample size is large (≥ 30).
Z-Test Formula
The Z-test compares the difference between the sample mean and the population mean by considering the standard deviation of the sampling distribution. The resulting Z-score represents the number of standard deviations by which the sample mean deviates from the population mean. This Z-score is also known as the Z-statistic, and can be formulated as:
Z = (x̄ − μ) / (σ / √n)
where,
o x̄ : mean of the sample
o μ : mean of the population
o σ : standard deviation of the population
o n : sample size
• Now compare the computed Z-score with the critical value and decide whether to reject or not reject the null hypothesis.
Type of Z-test
Left-tailed Test
In this test, our region of rejection is located to the extreme left of the distribution. Here our null hypothesis is that the claimed value is greater than or equal to the population mean.
Right-tailed Test
In this test, our region of rejection is located to the extreme right of the distribution. Here our null hypothesis is that the claimed value is less than or equal to the population mean.
Step 1: State the hypotheses.
• Null Hypothesis: There is no significant difference between the mean scores of the online and offline classes, i.e., μ1 − μ2 = 0.
• Alternate Hypothesis: There is a significant difference in the mean scores between the online and offline classes, i.e., μ1 − μ2 ≠ 0.
Step 2: Choose the significance level: 5% (α = 0.05).
Step 3: Compute the Z-score from the sample data.
Step 4: Compare with the critical Z-score value in the Z-table for α/2 = 0.025.
• Decision: Reject the null hypothesis. There is a significant difference between the online and offline classes.
A company claims that their product weighs 500 grams on average. A sample of 64 products has a
mean weight of 498 grams. The population standard deviation is known to be 8 grams. At a 1%
significance level, is there evidence to reject the company’s claim?
Solution:
H₀: μ = 500; H₁: μ ≠ 500
Z = (x̄ – μ) / (σ / √n)
= (498 − 500) / (8 / √64)
= -2 / 1
= -2
Critical value (α = 0.01, two-tailed): ±2.576; |−2| < 2.576, so fail to reject H₀.
Conclusion: There is not enough evidence to reject the company’s claim at the 1% significance level.
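A minimal sketch of the same check in Python (assuming SciPy) computes the z-score and the exact two-tailed critical value:

```python
# Minimal sketch verifying the example above; assumes SciPy is installed.
from scipy.stats import norm

x_bar, mu, sigma, n = 498, 500, 8, 64
z = (x_bar - mu) / (sigma / n ** 0.5)   # (498 - 500) / (8 / 8) = -2.0

z_crit = norm.ppf(1 - 0.01 / 2)         # two-tailed critical value at α = 0.01 ≈ 2.576
print(f"z = {z:.2f}, critical = ±{z_crit:.3f}")
print("Reject H0" if abs(z) > z_crit else "Fail to reject H0")
```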
Two factories produce semiconductors. Factory A’s chips have a mean resistance of 100 ohms with
a standard deviation of 5 ohms. Factory B’s chips have a mean resistance of 98 ohms with a standard
deviation of 4 ohms. Samples of 50 chips from each factory are tested. At a 1% significance level, is
there a difference in mean resistance between the two factories?
Solution:
H₀: μA – μB = 0; H₁: μA – μB ≠ 0
Z = (x̄A − x̄B) / √(σA²/nA + σB²/nB)
= (100 − 98) / √(25/50 + 16/50)
= 2 / √(0.5 + 0.32)
= 2 / 0.906
≈ 2.21
Critical value (α = 0.01, two-tailed): ±2.576; |2.21| < 2.576, so fail to reject H₀.
Conclusion: There is not enough evidence to conclude a difference in mean resistance at the 1%
significance level.
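The two-sample computation can be reproduced the same way; a minimal sketch assuming SciPy:

```python
# Minimal sketch reproducing the two-sample z-test above; assumes SciPy.
from scipy.stats import norm

m_a, m_b = 100, 98        # sample means
s_a, s_b = 5, 4           # population standard deviations
n_a = n_b = 50            # sample sizes

z = (m_a - m_b) / (s_a**2 / n_a + s_b**2 / n_b) ** 0.5   # 2 / √0.82 ≈ 2.21
z_crit = norm.ppf(1 - 0.01 / 2)                          # ≈ 2.576

print(f"z = {z:.2f}, critical = ±{z_crit:.3f}")
print("Reject H0" if abs(z) > z_crit else "Fail to reject H0")
```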
A chi-squared test (symbolically represented as χ²) is a data analysis method based on observations of a random set of variables. Usually, it is a comparison of two statistical data sets.
The Chi-Square test is a statistical test used to determine if there is a significant association between
two categorical variables or if the observed data fits an expected distribution. There are two main
types of Chi-Square tests:
1. Chi-Square Goodness of Fit Test: Tests whether a sample data matches a population with a
specific distribution (e.g., a uniform distribution).
2. Chi-Square Test of Independence: Tests whether two categorical variables are independent of
each other.
Summary of Chi-Square Test Steps:
1. State the null and alternative hypotheses.
2. Collect data on the two categorical variables.
3. Build a contingency table of observed frequencies.
4. Calculate the expected frequency for each cell.
5. Compute the chi-square statistic.
6. Determine the degrees of freedom.
7. Find the p-value (or critical value).
8. Compare the p-value with the significance level.
9. Draw a conclusion.
A chi-squared test, or χ² test, indicates whether there is a relationship between two categorical variables. For example, it can be applied when we look at people’s favorite colors and their preference for ice cream. The test tells us whether these two variables are associated with each other; for instance, it is possible that individuals who prefer the color blue also tend to favor chocolate ice cream. The test checks whether the observed data fit what would be expected if no association existed at all; a large deviation suggests an association.
Chi-square tests are important in various fields of study such as marketing, biology, medicine or even
social sciences; that is why they are extremely valuable:
• Validating Assumptions: Chi-square tests check if your observed data matches what you
expected. This helps you know if your ideas are on track or if you need to reconsider them.
• Data-Driven Decisions: Chi-square tests validate our beliefs based on empirical evidence and
boost confidence in our inferences.
The chi-square statistic is calculated as:
χ² = Σ (Oi − Ei)² / Ei
where,
• Σ (sigma): The symbol means sum, so the term must be computed for each cell of the contingency table.
• Oi: The observed frequency, i.e., the actual number of observations counted in a given cell of the contingency table.
• Ei: The expected frequency, i.e., the number of times you would expect to see a particular result under the hypothesis of no association (the null hypothesis).
• (Oi − Ei): The difference between the observed and expected frequencies, computed for each cell.
• Null Hypothesis (H₀): The null hypothesis assumes that there is no relationship between the two categorical variables under study; any differences or patterns observed are the result of random chance. Stating this hypothesis protects the analysis from possible bias.
• Alternative Hypothesis (H₁): The alternative hypothesis suggests that there is a relationship between the two categorical variables under study, i.e., an actual association rather than mere coincidence.
Before performing a chi-square test, it is necessary to collect data on the two categorical variables you want to analyze. For instance, if you are interested in exploring the relationship between gender and preferred ice cream flavor, you must collect details on each person’s gender (male or female) and their preferred flavor (e.g., chocolate, vanilla, strawberry).
• When investigating two related variables, a contingency table is used to capture all the combinations in which they can occur. In this table, the values of one variable appear across the columns, while the values of the other variable appear in the rows. For instance, one cell can record how many females preferred vanilla-flavored ice cream.
The hypothesis is that men prefer vanilla while women prefer chocolate. So we need to record how many respondents of each gender chose each flavor:

         Chocolate  Vanilla  Strawberry
Male        20        15        10
Female      25        20        30
In this table:
• The table has two dimensions: gender and ice cream flavor. The row headings are the male and female categories, whereas the column headings represent the chocolate, vanilla and strawberry flavors. Each cell contains the count for one combination of categories. A chi-square test on this table examines the association between these two categorical variables.
• Expected Frequency: For any specific cell, the expected frequency is the number of occurrences expected if the two variables were independent.
• Expected Frequency Calculation: To compute the expected frequency of an individual cell, multiply its row total by its column total, then divide by the total number of observations in the table (see the code sketch after this list):
Ei = (row total × column total) / grand total
• A chi-square table can be used to get the p-value for the calculated chi-square statistic (χ²) with the appropriate degrees of freedom (df). This table lists probabilities for various values of the chi-square statistic at different degrees of freedom.
• The p-value is the probability of observing a chi-square statistic at least as large as the one calculated if the null hypothesis is correct, i.e., if there is truly no association between the variables.
• If the p-value is less than a chosen significance level (e.g., 0.05), commonly denoted by α, then we reject the null hypothesis. This means the categorical variables are likely associated with each other.
• When the p-value is above α, we cannot reject the null hypothesis; there is insufficient evidence to establish a relationship between the variables.
• Chi-square tests assume that the observations are independent of one another.
• Each cell in the table should have an expected frequency of at least five for reliable results. Otherwise, consider Fisher’s exact test as an alternative if a table cell has an expected count below five.
• Chi-square tests do not indicate a causal relationship; they only identify association between variables.
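As referenced above, here is a minimal sketch (assuming SciPy) that runs the full test of independence on the gender-by-flavor table; chi2_contingency also returns the expected frequencies computed by the row-total × column-total rule:

```python
# Minimal sketch: chi-square test of independence on the table above; assumes SciPy.
import numpy as np
from scipy.stats import chi2_contingency

#                     Chocolate  Vanilla  Strawberry
observed = np.array([[20,        15,      10],    # Male
                     [25,        20,      30]])   # Female

chi2_stat, p, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2_stat:.3f}, df = {dof}, p = {p:.4f}")
print(expected)   # expected frequencies: (row total * column total) / grand total

print("Reject H0: association found" if p < 0.05
      else "Fail to reject H0: no evidence of association")
```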
• Categorical variables are like sorting things into different groups. But instead of using numbers,
we're talking about categories or labels. For example, colors, types of fruit, or types of cars are
all categorical variables.
• They’re termed "categorical" because they segment things like "red," "green" or "blue" into separate clusters. Unlike height or weight, whose measurements are continuous, categorical data offers distinct options with no inherent numerical order between them. That is why, if you ask whether someone prefers apples to oranges, the answer is categorical data.
• Distinct Groups: Categorical variables put things into different groups that don't overlap. For
example, when we talk about hair color, someone can be a redhead, black-haired, blonde, or
brunette. Each person falls into just one of these groups.
• Non-Numerical: Categories are names, not quantities, so there is no hierarchy among them. It makes no sense to say that blonde is "better" or "greater" than brunette; the categories are merely dissimilar.
• Limited Options: Categorical variables are characterized by a fixed number of possibilities. One
may have such choices as red, blonde, brown, black hair color. The number of categories may
fluctuate, but they all remain distinct and bounded in scope.
Goodness-Of-Fit
A goodness-of-fit test is used to determine whether a model or hypothesis is consistent with the collected data. Suppose you come up with a hypothesis such as: "Humans who live in urban areas are taller than those from rural areas." After collecting data on people's heights and comparing it with your hypothesis' prediction, close agreement between the two gives grounds for believing the prediction is correct. If such agreement does not exist, the hypothesis may need to be rethought. The goodness-of-fit test formalizes this comparison.
Key Aspects of a Goodness-of-Fit Test
1. Purpose: The aim is to check if a guessed distribution fits well with the data we have.
2. Data Requirements: It can be used with both continuous and categorical data, among other forms
of data.
3. Common Applications:
• Examining whether a set of observations is drawn from a specified distribution.
• Comparing the observed frequencies of outcomes with their expected frequencies, for example using the chi-square test.
4. Benefits:
• Helps us spot any weird or unusual data that might cause problems.
5. Limitations:
• These results can be altered by the test type and volume of data points as well.
• Chi-Square Test: It is mainly used for categorical data and helps in comparing the observed
frequencies of classes with their expected frequencies based on a theoretical model.
• Anderson-Darling Test: A nonparametric test based on the weighted differences between the observed and theoretical cumulative distributions. Because it places extra weight on the tails, it is usually more sensitive to discrepancies in the tails of distributions than the Kolmogorov-Smirnov test.
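As a small illustration of a goodness-of-fit test, here is a minimal sketch (assuming SciPy) that checks hypothetical die-roll counts against a uniform distribution:

```python
# Minimal goodness-of-fit sketch: is a die fair? The counts are hypothetical.
from scipy.stats import chisquare

observed = [18, 22, 16, 14, 19, 31]   # hypothetical counts for faces 1-6 over 120 rolls
expected = [120 / 6] * 6              # uniform distribution: 20 per face

chi2_stat, p = chisquare(f_obs=observed, f_exp=expected)   # df = 6 - 1 = 5
print(f"chi2 = {chi2_stat:.2f}, p = {p:.4f}")
print("Reject H0: die looks unfair" if p < 0.05 else "Fail to reject H0")
```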
A study investigates the relationship between eye color (blue, brown, green) and hair color (blonde, brunette, redhead). The following data is collected:
Solution:
Calculate the chi-square contribution of each cell in the contingency table using the formula
χ² contribution = (Oi - Ei)² / Ei
For instance, the cell for people with brown hair and blue eyes contributes (O − E)²/E for that cell's observed and expected counts.
To obtain the total chi-square statistic, compute each cell's chi-squared value and sum them across all nine cells in the table.
df = (number of rows − 1) × (number of columns − 1) = (3 - 1) × (3 - 1) = 2 × 2 = 4
Finding the p-value:
You may reference a chi-square distribution table to find where the calculated chi-square statistic (χ²) falls for the appropriate degrees of freedom. Look for the closest value and its corresponding p-value, since most tables do not show precise numbers.
For example, if your chi-square value was 20.5, the nearest number in the table for df = 4 is 14.86, which corresponds to a p-value of 0.005; so p < 0.005.
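Rather than interpolating in a printed table, the exact p-value can be computed directly; a minimal sketch assuming SciPy:

```python
# Minimal sketch: exact p-value for chi2 = 20.5 with df = 4; assumes SciPy.
from scipy.stats import chi2

p_value = chi2.sf(20.5, df=4)    # survival function = P(X >= 20.5)
print(f"p = {p_value:.5f}")      # ≈ 0.0004, consistent with p < 0.005 from the table
```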
Interpreting Results:
• Select a level of significance (α = 0.05 is common); this is the maximum probability of incorrectly rejecting the null hypothesis when it actually holds (Type I error).
• When the p-value is less than the significance level (p-value < 0.05), we reject the null hypothesis: there is sufficient evidence to say that hair and eye color are statistically related. If the p-value is greater than the significance level (p-value > 0.05), we cannot reject the null hypothesis.
• In that case, based on the data at hand, we cannot say that there is a statistically significant correlation between eye and hair colors.
F-test
The F-test is a statistical test used in hypothesis testing to determine whether the variances of two populations or two samples are equal. The data in an F-test conforms to an F-distribution. The test compares the two variances by dividing them to form the F-statistic. Depending on the details of the situation, an F-test can be one-tailed or two-tailed.
F-distribution
The F-distribution is the distribution of the ratio of two independent chi-square variables, each divided by its degrees of freedom.
• The related samples’ degrees of freedom are denoted by df1 and df2.
Degree of Freedom
The degrees of freedom represent the number of observations used to calculate the chi-square variables that form the ratio. The shape of the F-distribution is determined by its degrees of freedom. It is a right-skewed distribution, meaning it has a longer tail on the right side. As the degrees of freedom increase, the F-distribution becomes more symmetric and approaches a bell shape.
What is F-Test?
The F test is a statistical technique that determines if the variances of two samples or populations are equal using the F test statistic. Both the samples and the populations need to be independent and fit into an F-distribution. The null hypothesis can be rejected if the results of the F test during the hypothesis test are statistically significant; if not, it stays unchanged.
Several assumptions are used in the F Test equation. For the F-test Formula to be utilized, the
population distribution needs to be normal. Independent events should be the basis for the test
samples. Apart from this, the following considerations should also be taken into consideration.
• It is simpler to calculate right-tailed tests. By pushing the bigger variance into the numerator,
the test is forced to be right tailed.
• Before the critical value is determined in two-tailed tests, alpha is divided by two.
Steps to perform an F-test:
1. State the null and alternative hypotheses about the two variances.
2. Compute the sample variances.
3. Form the F statistic by dividing the larger variance by the smaller one.
4. Determine the degrees of freedom, df1 = n1 − 1 and df2 = n2 − 1.
5. Choose the significance level (halve α for a two-tailed test).
6. Find the critical value from the F-distribution table.
7. Compare the F statistic with the critical value.
8. Draw a conclusion.
What is an F-Test?
An F-test is a statistical test used to compare two or more variances to determine if they are
significantly different from each other. The F-test is commonly used in the following contexts:
1. Comparison of Two Variances: To test if two populations have different variances (commonly
used in analysis of variance, ANOVA).
2. ANOVA (Analysis of Variance): To test if there are significant differences between the means
of more than two groups based on variances.
In an F-test, the ratio of two sample variances is compared, and the resulting statistic follows an F-
distribution.
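SciPy has no single helper for the two-variance F-test, so a minimal sketch computes the ratio by hand (the sample values are hypothetical):

```python
# Minimal sketch of an F-test for equality of two variances; data is hypothetical.
import numpy as np
from scipy.stats import f

a = np.array([22, 25, 28, 30, 24, 26, 27])
b = np.array([21, 23, 22, 24, 22, 23, 25])

var_a, var_b = np.var(a, ddof=1), np.var(b, ddof=1)   # sample variances

# Put the larger variance in the numerator so the test is right-tailed
F = max(var_a, var_b) / min(var_a, var_b)
df1 = df2 = len(a) - 1

p_value = 2 * f.sf(F, df1, df2)   # two-tailed: double the right-tail area
print(f"F = {F:.3f}, p = {p_value:.4f}")
print("Reject H0: unequal variances" if p_value < 0.05 else "Fail to reject H0")
```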
ANOVA Test, or Analysis of Variance, is a statistical method used to test the differences between
means of two or more groups. Developed by Ronald Fisher in the early 20th century, ANOVA helps
determine whether there are any statistically significant differences between the means of three or
more independent groups.
The ANOVA analysis is a statistical relevance tool designed to evaluate whether or not the null
hypothesis can be rejected while testing hypotheses. It is used to determine whether or not the
means of three or more groups are equal.
The ANOVA test is used to look at variability within groups as well as variability across groups. The F-test returns the ANOVA test statistic.
Examples of the use of ANOVA Formula
• Assume it is necessary to assess whether consuming a specific type of tea will result in a mean weight decrease. Suppose three groups use three different varieties of tea: green tea, Earl Grey tea, and jasmine tea. A one-way ANOVA test can then be used to examine whether any group displayed a mean weight decrease.
• Assume a poll was undertaken to see if there is a relationship between salary and gender and
stress levels during job interviews. A two-way ANOVA will be utilized to carry out such a test.
ANOVA Table
An ANOVA (Analysis of Variance) test table is used to summarize the results of an ANOVA test, which is used to determine if there are any statistically significant differences between the means of three or more independent groups. Here’s a general structure of an ANOVA table (k = number of groups, N = total number of observations):

Source of Variation   | Sum of Squares  | Degrees of Freedom | Mean Square         | F Value
Between Groups        | SSB             | df1 = k − 1        | MSB = SSB / (k − 1) | F = MSB / MSE
Within Groups (Error) | SSE             | df2 = N − k        | MSE = SSE / (N − k) |
Total                 | SST = SSB + SSE | N − 1              |                     |
One-Way ANOVA
This test is used to see if there is a variation in the mean values of three or more groups. It is used where the data set has only one independent variable (factor). If the test statistic exceeds the critical value, the null hypothesis is rejected, and we conclude that the means of at least two groups differ statistically.
Two-Way ANOVA
Two independent variables are used in the two-way ANOVA. As a result, it can be viewed as an
extension of a one-way ANOVA in which only one variable influences the dependent variable. A two-
way ANOVA test is used to determine the main effect of each independent variable and whether there
is an interaction effect. Each factor is examined independently to determine the main effect, as in a
one-way ANOVA. Furthermore, all components are analyzed at the same time to test the interaction
impact.
Three different kinds of food are tested on three groups of rats for 5 weeks. The objective is to check
the difference in mean weight(in grams) of the rats per week. Apply one-way ANOVA using a 0.05
significance level to the following data:
Food I   Food II   Food III
8        4         11
12       5         8
19       4         7
8        6         13
6        9         7
11       7         9
Solution:
H0: μ1 = μ2 = μ3; H1: the means are not all equal.
X̄1 = 10.67, X̄2 = 5.83, X̄3 = 9.17
Total mean X̄ = 8.56
SSB = 6(10.67 − 8.56)² + 6(5.83 − 8.56)² + 6(9.17 − 8.56)² = 73.44, with df1 = k − 1 = 2
SSE = 155.0, with df2 = N − k = 15
MSB = SSB/df1 = 36.72; MSE = SSE/df2 = 10.33
F = MSB/MSE = 3.55
Using the F-distribution table, the critical value F(2, 15) at α = 0.05 is 3.68. Since 3.55 < 3.68, the null hypothesis is not rejected: there is no significant difference in the mean weights of the rats across the three foods.
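A minimal sketch (assuming NumPy/SciPy) reproduces these ANOVA figures and cross-checks them against SciPy's built-in one-way ANOVA:

```python
# Minimal sketch reproducing the one-way ANOVA above; assumes NumPy and SciPy.
import numpy as np
from scipy.stats import f, f_oneway

food1 = np.array([8, 12, 19, 8, 6, 11])
food2 = np.array([4, 5, 4, 6, 9, 7])
food3 = np.array([11, 8, 7, 13, 7, 9])

groups = [food1, food2, food3]
k = len(groups)                                 # number of groups
N = sum(len(g) for g in groups)                 # total observations
grand_mean = np.concatenate(groups).mean()

ssb = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)   # between groups
sse = sum(((g - g.mean()) ** 2).sum() for g in groups)             # within groups
F = (ssb / (k - 1)) / (sse / (N - k))

print(f"SSB = {ssb:.2f}, SSE = {sse:.2f}, F = {F:.2f}")
print(f"critical F(2, 15) = {f.ppf(0.95, k - 1, N - k):.2f}")
print(f_oneway(food1, food2, food3))            # SciPy's built-in result agrees
```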
Enlist the results in APA format after performing ANOVA on the following data set:
Solution:
MSerror = 134.53
MSbetween = (17.62)(30) = 528.75
F = MSbetween / MSerror = 528.75 / 134.53 ≈ 3.93
What is ANOVA?
ANOVA (Analysis of Variance) is a statistical test used to compare the means of three or more groups
to determine if at least one of the group means is significantly different from the others. It is based on
the idea that the variation in a data set can be partitioned into two sources: the variation between
groups (due to the treatment or factor) and the variation within groups (due to random error).
Types of ANOVA
1. One-Way ANOVA: Compares the means of three or more groups based on one factor
(independent variable).
2. Two-Way ANOVA: Compares the means of groups based on two factors. It can also assess the
interaction effect between the two factors.
3. Repeated Measures ANOVA: Used when the same subjects are used for each treatment or
condition (e.g., before and after treatment).
• Data should be organized into groups based on the factor you're analyzing (e.g., three different
treatments, or different age groups).
• For example, suppose you want to compare the exam scores of students from three different
teaching methods.
Student          Group 1 (Method A)   Group 2 (Method B)   Group 3 (Method C)
Student 1 75 80 85
Student 2 80 85 88
Student 3 78 82 84
Student 4 79 87 86
Student 5 82 88 87
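Applying ANOVA to this table takes one call in Python; a minimal sketch assuming SciPy:

```python
# Minimal sketch: one-way ANOVA on the exam-score table above; assumes SciPy.
from scipy.stats import f_oneway

method_a = [75, 80, 78, 79, 82]
method_b = [80, 85, 82, 87, 88]
method_c = [85, 88, 84, 86, 87]

result = f_oneway(method_a, method_b, method_c)
print(f"F = {result.statistic:.2f}, p = {result.pvalue:.4f}")
# A p-value below 0.05 would suggest at least one method's mean score differs
```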
Tests of Significance Based on Normal Distribution are statistical tests used to make inferences about
the population from which a sample is drawn. These tests are often applied when the data follows a
normal distribution or when the sample size is large enough for the Central Limit Theorem to apply,
making the sampling distribution approximately normal.
Key Concepts
• Significance Test: A procedure to determine if the observed data provides enough evidence to
reject a null hypothesis.
• Normal Distribution: A symmetric, bell-shaped distribution where the mean, median, and
mode are all equal. It is commonly used when data points cluster around a central value and
the data is symmetric.
There are several tests based on the normal distribution to assess hypotheses about population
parameters (mean, variance, etc.). The most common tests of significance include:
• Z-Test: This test is used to determine whether there is a significant difference between the sample mean and the population mean when the population standard deviation is known.
Conclusion
• Z-Test: Used when population standard deviation is known and the sample size is large (n >
30).
• T-Test: Used when population standard deviation is unknown and the sample size is small (n <
30).
• Z-Test for Proportions: Used for categorical data to test whether the sample proportion differs from a known population proportion.
Comparison of the five tests:

Feature | F-Test | T-Test | Z-Test | Chi-Square Test | ANOVA (Analysis of Variance)
Purpose | To compare variances between two or more groups | To compare a sample mean with a population mean (or between two means) | To test the population mean (for large samples) or proportions | To test goodness of fit or compare variances | To compare the means of three or more groups
Data Type | Quantitative continuous data | Quantitative continuous data | Quantitative continuous data | Categorical data (for goodness of fit) or quantitative (for variance) | Quantitative continuous data
Null Hypothesis (H₀) | σ1² = σ2² (equal variances) | μ = μ0 (the sample mean equals the population mean) | μ = μ0 (the sample mean equals the population mean) | Observed frequencies = Expected frequencies | μ1 = μ2 = ⋯ = μk (group means are equal)
Test Statistic | F-statistic (ratio of variances) | t-statistic (based on sample mean) | Z-statistic (based on sample mean or proportion) | Chi-square statistic (based on observed vs expected frequencies) | F-statistic (between-group variance vs within-group variance)
Assumptions | 1. Independent random samples 2. Normally distributed populations 3. Homogeneity of variances | 1. Normally distributed 2. Independent samples 3. Small sample size | 1. Large sample size (n > 30) or normal population distribution 2. Known population standard deviation | 1. Independence of observations 2. Large sample size for expected frequency assumptions 3. Normally distributed population (for testing variance) | 1. Independent samples 2. Normally distributed populations 3. Homogeneity of variances
Sample Size Requirement | Small to large samples | Small samples (n < 30) | Large samples (n > 30) | Large sample sizes (particularly expected frequencies > 5) | Small to large samples
Used When | Comparing variances between two or more groups | Comparing means (single or two groups) | Testing a population mean or proportion hypothesis (large sample) | Testing categorical data (goodness-of-fit) or variance comparison | Comparing the means of three or more groups
Degrees of Freedom | Two sets of degrees of freedom (df1 = number of groups − 1, df2 = total sample size − number of groups) | df = n − 1 (for one-sample) or n1 + n2 − 2 (for two-sample) | Not required (uses the standard normal distribution) | df = number of categories − 1 for goodness-of-fit, or based on sample size for variance | df = number of groups − 1 (between groups) and total sample size − number of groups (within groups)
Critical Value | F-distribution table | t-distribution table | Z-distribution table | Chi-square distribution table | F-distribution table
Assumptions about Population | Normal distribution of populations | Normal distribution and homogeneity of variance | Normal distribution (for large samples, or if population standard deviation is known) | Distribution of data under the null hypothesis (for goodness-of-fit) | Normal distribution of populations
Application | ANOVA, comparing variances in two or more groups | Comparing a sample mean to a population mean or comparing two sample means | Testing if a sample mean matches a known population mean or testing a population proportion | Goodness-of-fit for categorical data, testing for independence, and testing population variance | Comparing means in different groups to assess treatment effects