
UNIT IV

Inferential Statistics
Group 3: Conception, Garcia, Jacob, Morgan, Puno-Sanchez,
Santiago, and Villarin
OUTLINE
A. Inferential Statistics Definition
B. Purpose of Inferential Statistics
C. Statistical Tests
1. Parametric
• T-Test
• Z-Test
• ANOVA
2. Non-Parametric
• Chi-Square Test
• Other Non-Parametric Test
Inferential Statistics
Definition
Introduction to Inferential Statistics

“It is found to be most effective in the teaching of statistics that learners first understand the concepts behind statistics prior to being taught ‘how to undertake’ them.”

- Marshall & Jonker, 2009, An introduction to inferential statistics: A review and practical guide
What is Inferential Statistics?

▪ Inferential statistics is a branch of statistics that makes use of various analytical tools to draw inferences about population data from sample data.
▪ Population refers to the entire group that you want to study and draw conclusions about.
▪ A sample, on the other hand, is the subset of the population that you will use in your study.
Sample vs. Population

[Diagram: an inference is drawn from the SAMPLE back to the POPULATION]
Inferential vs. Descriptive Statistics

Unlike descriptive statistics, inferential statistics draws a conclusion


about a population by examining random samples. The goal of
inferential statistics is to make inferences or generalizations about
a population. In inferential statistics, a statistic is taken from the
sample data (e.g., the sample mean or x̅) which is then used to
make inferences about the population parameter (e.g., the
population mean or µ).
Statistic vs. Parameter

            Sample (Statistic)   Population (Parameter)
Mean        x̅                    µ
SD          s                    σ
Variance    s²                   σ²
Size        n                    N
Exercise: statistic or parameter?

1. The average catch of all fishermen in Manila Bay.


2. The average weight of trash collected by ten Estero Rangers
of Baseco.
3. The average fecal coliform reading of water quality
monitoring station no. 3 in Baseco.
4. The average working hours of all Bakawan Warriors in Tanza
Marine Tree Park.
Fecal coliform reading in Estero de Magdalena
Contribution of ISFs in the deterioration of Water
Quality in Manila Bay
NO. OF INFORMAL SETTLER FAMILIES WITHIN THE MANILA BAY WATERSHED
(PSA, as of August 2015)
National Capital Region: 2,968,651    Central Luzon: 2,511,783    CALABARZON: 3,297,110

REPRESENTATIVE SAMPLE

TO ESTABLISH IF ISFs REALLY CONTRIBUTE TO THE DETERIORATION OF WATER QUALITY OF MANILA BAY (Null
Hypothesis), WE HAVE TO DO A RESEARCH USING INFERENTIAL STATISTICS
Inferential Statistics, definition (continued)
▪ As it would take too long and be too expensive to actually survey every informal
settler within the Manila Bay region, researchers instead take a smaller survey of
say, 1,200 respondents (SWS standard), and use the results of the survey to draw
inferences about the population as a whole.
▪ This is the whole premise behind inferential statistics – we want to answer some
question about a population, so we obtain data for a small sample of that
population and use the data from the sample to draw inferences about the
population.
What is considered a “representative sample”?
▪ In order to be confident in our ability to use a sample to draw inferences about a population, we need to make sure that we have a representative sample – that is, a sample in which the characteristics of the individuals in the sample closely match the characteristics of the overall population.
▪ Ideally, we want our sample to be like a “mini version” of our population. So, if we want to draw inferences on a population of informal settlers composed of 60% that reside near or along waterways and 40% that reside inland, our sample would not be representative if it included only 40% of residents along waterways and 60% of those who reside inland. If our sample is not similar to the overall population, then we cannot generalize the findings from the sample to the overall population with any confidence.

[Diagram: a sample that is not representative of the population]
What is sampling?
Sampling is a technique of selecting
individual members or a subset of the
population to make statistical inferences
from them and estimate characteristics
of the whole population. Different
sampling methods are widely used by
researchers in social and market
research so that they do not need to
research the entire population to collect
actionable insights.
Types of Sampling:
Sampling in social research is of two types –
probability sampling and non-probability sampling.
Probability sampling is a sampling technique where a
researcher sets a selection of a few criteria and
chooses members of a population randomly. All the
members have an equal opportunity to be a part of
the sample with this selection parameter.
Non-probability sampling is when the researcher
chooses members for research arbitrarily rather than
at random. This sampling method is not a fixed or
predefined selection process, which makes it difficult
for all elements of a population to have an equal
opportunity to be included in a sample.
Probability Sampling Techniques
▪ Simple random sampling is one of the best probability sampling techniques
that helps in saving time and resources. It is a reliable method of obtaining
information where every single member of a population is chosen
randomly, merely by chance. Each individual has the same probability of
being chosen to be a part of a sample.

▪ Cluster sampling is a method where the researchers divide the entire


population into sections or clusters that represent a population. Clusters are
identified and included in a sample based on demographic parameters like
age, sex, location, etc. This makes it very simple for a survey creator to
derive effective inference from the feedback.
Probability Sampling Techniques

▪ Systematic sampling samples members of a population at regular


intervals. It requires the selection of a starting point for the sample and
sample size that can be repeated at regular intervals. This type of sampling
method has a predefined range, and hence this sampling technique is the
least time-consuming.
▪ Stratified random sampling is a method in which the researcher divides
the population into smaller groups that don’t overlap but represent the
entire population. While sampling, these groups can be organized, and a sample can then be
drawn from each group separately.
Non-probability Sampling Techniques
▪ Convenience sampling method is dependent on the ease of access to subjects such as
surveying customers at a mall or passers-by on a busy street. It is usually termed as
convenience sampling because of the researcher’s ease of carrying it out and getting in touch
with the subjects. Researchers have nearly no authority to select the sample elements, and it’s
purely done based on proximity and not representativeness. This non-probability sampling
method is used when there are time and cost limitations in collecting feedback. In situations
where there are resource limitations such as the initial stages of research, convenience
sampling is used.
▪ Judgmental or purposive sampling is based on the discretion of the researcher. Researchers
purely consider the purpose of the study, along with the understanding of the target audience.
For instance, when researchers want to understand the thought process of people interested
in studying for their master’s degree. The selection criteria will be: “Are you interested in
doing your masters in …?” and those who respond with a “No” are excluded from the sample.
Non-probability Sampling Techniques
▪ Snowball sampling is a sampling method that researchers apply when the subjects are
difficult to trace. For example, it will be extremely challenging to survey homeless people or
illegal immigrants. In such cases, using the snowball theory, researchers can track a few
categories to interview and derive results. Researchers also implement this sampling method
in situations where the topic is highly sensitive and not openly discussed—for example,
surveys to gather information about HIV/AIDS. Not many victims will readily respond to the
questions. Still, researchers can contact people they might know or volunteers associated with
the cause to get in touch with the victims and collect information.
▪ Quota sampling: the selection of members in this sampling technique is based on a pre-set
standard. In this case, as a sample is formed based on specific attributes, the created sample
will have the same qualities found in the total population. It is a rapid method of collecting
samples.
Statistical Tests
A. Parametric
T-Test
t-Test

Content
A. Description
B. Types of t-test
➢Paired t-test
➢One-sample t-test
➢Two-sample t-test
t-Test
▪ A t-test is an inferential statistic used to determine if there is a
significant difference between the means of two groups and
how they are related.
▪ T-tests are used when the data sets follow a normal distribution
and have unknown variances, like the data set recorded from
flipping a coin 100 times.
▪ It is often used in hypothesis testing to determine whether a
process or treatment actually has an effect on the population of
interest, or whether two groups are different from one another.
▪ The t-test was devised by
William Sealy Gosset
around 1908
▪ while he was an employee
of Guinness brewery in
Ireland.
▪ He used the t-test to
monitor the quality of the
stout brewed by Guinness.
t-Test Assumptions
The t-test is a parametric test of difference, meaning that it makes the same
assumptions about your data as other parametric tests.

The t-test assumes your data:

▪ are continuous
▪ are (approximately) normally distributed
▪ have been randomly sampled from the population
▪ have homogeneity of variance (a similar amount of variance within each group
being compared)
What type of t-test should I use?
▪ If the groups come from a single population (e.g. measuring before and after an
experimental treatment), perform a paired t-test.

▪ If the groups come from two different populations (e.g. two different species, or
people from two separate cities), perform a two-sample t-test (a.k.a. independent t-test).

▪ If there is one group being compared against a standard value (e.g. comparing
the acidity of a liquid to a neutral pH of 7), perform a one-sample t-test.
Types of t-test

One-sample t-test
• Number of variables: One
• Type of variable: Continuous measurement
• Purpose of test: Decide if the population mean is equal to a specific value or not
• Example: Mean heart rate of a group of people is equal to 65 or not
• Estimate of population mean: Sample average
• Population standard deviation: Unknown, use sample standard deviation
• Degrees of freedom: Number of observations in sample minus 1, or: n − 1

Two-sample t-test
• Number of variables: Two
• Type of variable: Continuous measurement, plus categorical or nominal to define groups
• Purpose of test: Decide if the population means for two different groups are equal or not
• Example: Mean heart rates for two groups of people are the same or not
• Estimate of population mean: Sample average for each group
• Population standard deviation: Unknown, use sample standard deviations for each group
• Degrees of freedom: Sum of observations in both samples minus 2, or: n1 + n2 − 2

Paired t-test
• Number of variables: Two
• Type of variable: Continuous measurement, plus categorical or nominal to define pairing within group
• Purpose of test: Decide if the difference between paired measurements for a population is zero or not
• Example: Mean difference in heart rate for a group of people before and after exercise is zero or not
• Estimate of population mean: Sample average of differences in paired measurements
• Population standard deviation: Unknown, use sample standard deviation of differences in paired measurements
• Degrees of freedom: Number of paired observations in sample minus 1, or: n − 1
How to conduct a t-test
1. Define your null hypothesis (H0).

2. Decide on the alpha value (or α value).


▪ This involves determining the risk you are willing to take of drawing the wrong conclusion. For example, suppose you set
α=0.05 when comparing two independent groups. Here, you have decided on a 5% risk of concluding the unknown population
means are different when they are not.

3. Calculate the t-statistic:


▪ Each type of t-test has a different formula for calculating the t-statistic
4. Calculate the degrees of freedom:
▪ Degrees of freedom are the number of ways the mean could vary; here they depend on the number of observations in each group
being compared. Similar to the t-statistic, the formula for degrees of freedom will
vary depending on the type of t-test you perform.
5. Determine the critical value:
▪ The critical value is the threshold at which the difference between two numbers is considered to be statistically significant.
6. Compare absolute value of the t-statistic to critical value:
▪ If your t-statistic is larger than your critical value, your difference is significant. If your t-statistic is smaller, then your two
numbers are, statistically speaking, indistinguishable.
Calculate the t-statistic
▪ One-sample t-test: t = (x̅ − µ) / (s / √n)
▪ Two-sample t-test: t = (x̅1 − x̅2) / (sp · √(1/n1 + 1/n2)), where sp is the pooled standard deviation
▪ Paired t-test: t = x̅d / (sd / √n), where x̅d and sd are the mean and standard deviation of the paired differences

Calculate the degrees of freedom
▪ One-sample t-test: df = n − 1
▪ Two-sample t-test: df = n1 + n2 − 2
▪ Paired t-test: df = n − 1, where n is the number of pairs
Determine Critical Value

For example :
According to this
table, for a two-tailed
test with an alpha
level of 0.05 at 40
degrees of freedom,
the critical value is
2.02.
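The same critical value can be looked up in software instead of a printed t-table. Below is a minimal sketch in Python (assuming SciPy is available; the variable names are my own) reproducing the two-tailed value quoted above.

from scipy import stats

alpha = 0.05
df = 40
# Two-tailed test: put alpha/2 in each tail of the t-distribution.
t_crit = stats.t.ppf(1 - alpha / 2, df)
print(round(t_crit, 2))  # prints 2.02, matching the table value above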
Paired t-Test
▪ The paired t-test is a method used to test whether the mean difference
between pairs of measurements is zero or not.

▪ You can use the test when your data values are paired measurements. For
example, you might have before-and-after measurements for a group of
people. Also, the distribution of differences between the paired
measurements should be normally distributed.
Paired t-test example
▪ An instructor wants to use two exams in her
classes next year. This year, she gives both exams
to the students. She wants to know if the exams
are equally difficult and wants to check this by
looking at the differences between scores. If the
mean difference between scores for students is
“close enough” to zero, she will make a practical
conclusion that the exams are equally difficult.

▪ If you look at the table, you see that some of the


score differences are positive and some are
negative. You might think that the two exams are
equally difficult. Other people might disagree. The
statistical test gives a common way to make the
decision, so that everyone makes the same
decision on the same data.
Data
▪ Before jumping into the analysis, we should plot the data. The
figure below shows a histogram and summary statistics for the
score differences.
How to perform the paired t-test
Steps
1. Ho : The mean score difference is equal to zero. (If the mean difference between scores for students is “close enough” to
zero, she will make a practical conclusion that the exams are equally difficult. )

2. Alpha value : 0.05

3. We start by calculating our test statistic. To accomplish this, we need the average difference, the standard deviation of
the difference and the sample size.
The average score difference is: x̅d = 1.31

Next, we calculate the standard error for the score difference. The calculation is:
Standard Error = sd/√n = 7.00/√16 = 7/4 = 1.75

We now have the pieces for our test statistic. We calculate our test statistic as:
t = average difference / Standard Error
t = 1.31 / 1.75 = 0.75
Determine Critical Value

According to this
table, for a paired t-
test with an alpha
level of 0.05 at 15
degrees of freedom,
the critical value is
2.131.
To make our decision, we compare the test statistic to a value from
the t-distribution.

Because 0.750 < 2.131, we cannot reject our idea that the mean score
difference is zero. We make a practical conclusion to consider exams as equally
difficult.
EXERCISE FOR PAIRED T-TEST

Consider a sample of 5 toys comparing the battery life of 2 brands. Is there a significant difference in the life length of the two brands of batteries?

Data:
TOY   BATTERY TYPE 1   BATTERY TYPE 2   DIFF.
1     52.6             61.4             -8.8
2     103.4            112.8            -9.4
3     68.2             67.1             1.1
4     88.4             92.3             -3.9
5     111.6            121.5            -9.9

▪ Given Data:
n = 5
x̅d = -6.18 [ ∑difference/n ]
sd = 4.73 [ √(∑(d - x̅d)² / (n-1)) ]
se = 2.11 [ sd / √n ]

1. Ho : The mean difference in battery life between the two brands is equal to zero
2. Decide on the Alpha. α = 0.05
3. Compute the t-statistic:
t = x̅d / se = -6.18 / 2.11 = -2.93
4. Compute the degrees of freedom: df = n-1 = 5-1 = 4
5. Determine the critical value: 2.776
6. Compare the absolute value of the t-statistic with the critical value: |-2.93| = 2.93 > 2.776
7. Decide: Reject Ho, which means there is a significant difference in battery life between type 1 and type 2.
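For reference, a minimal Python sketch of this exercise (assuming SciPy; the data and decision rule are the ones above, the variable names are my own):

from scipy import stats

battery_type_1 = [52.6, 103.4, 68.2, 88.4, 111.6]
battery_type_2 = [61.4, 112.8, 67.1, 92.3, 121.5]

# Paired t-test on the five toys; SciPy computes t from the paired differences.
t_stat, p_value = stats.ttest_rel(battery_type_1, battery_type_2)
print(round(t_stat, 2), round(p_value, 3))  # t ≈ -2.93

# Decision rule from the slides: compare |t| with the critical value at df = 4.
t_crit = stats.t.ppf(0.975, len(battery_type_1) - 1)  # ≈ 2.776
print(abs(t_stat) > t_crit)  # True, so H0 is rejected at α = 0.05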
One sample t-test
▪ The one-sample t-test is a statistical hypothesis test used to determine
whether an unknown population mean is different from a specific value.
▪ You can use the test for continuous data. Your data should be a random
sample from a normal population.
One-sample t-test example
Imagine we have collected a random
sample of 31 energy bars from a number of
different stores to represent the population
of energy bars available to the general
consumer. The labels on the bars claim that
each bar contains 20 grams of protein.
If you look at the table, you see that
some bars have less than 20 grams of protein.
Other bars have more. You might think that
the data support the idea that the labels are
correct. Others might disagree. The statistical
test provides a sound method to make a
decision, so that everyone makes the same
decision on the same set of data values.
Data
Before jumping into analysis, we should take a quick look at the
data. The figure below shows a histogram and summary statistics
for the energy bars.
▪ From a quick look at the histogram, we see that
there are no unusual points, or outliers. The data
look roughly bell-shaped, so our assumption of a
normal distribution seems reasonable.

▪ From a quick look at the statistics, we see that the


average is 21.40, above 20. Does this average from
our sample of 31 bars invalidate the label's claim of
20 grams of protein for the unknown entire
population mean? Or not?
How to perform the one-sample t-test
Steps
1. Ho : The mean grams of protein is equal to 20.

2. Alpha value : 0.05

3. Calculate the t-statistic. Formula: t = (Mean – Hypothesized mean) / standard error of the mean
We start by finding the difference between the sample average and the hypothesized value:
= 21.40 − 20 = 1.40
Next, we calculate the standard error for the mean. The calculation is:
Standard Error for the mean = standard deviation/√total sample size
se = 2.54/√31 = 0.456
We now have the pieces for our test statistic. We calculate our test statistic as:
t = (Mean – Hypothesized mean) / Standard Error for the mean
= 1.40 / 0.456
= 3.07

4. Calculate degrees of freedom. Formula: df = n - 1 = 31 - 1 = 30


Determine Critical Value
To make our decision, we compare the test statistic to a value
from the t-distribution.

5. We find the value from the t-distribution based on our decision. For a t-test, we need
the degrees of freedom to find this value. The degrees of freedom are based on the
sample size.
For the energy bar data:
degrees of freedom = n−1 = 31−1 = 30.
The critical value of t with α = 0.05 and 30 degrees of freedom is +/- 2.042.

6. We compare the value of our statistic (3.07) to the t value. Since 3.07 > 2.042, we
reject the null hypothesis that the mean grams of protein is equal to 20. We make a
practical conclusion that the labels are incorrect, and the population mean grams of
protein is greater than 20.
EXERCISE FOR ONE-SAMPLE T-TEST

Consider a sample of 10 penguin heights. Is the mean of the population they come from equal to 1 meter?

Data: 0.93, 1.18, 1.34, 1.21, 1.24, 0.97, 0.93, 1.17, 1.30, 0.83

▪ Given Data:
n = 10
x̅ = 1.11
s = 0.18 [ √(∑(x - x̅)² / (n-1)) ]
se = 0.06 [ s / √n ]
µ = 1

1. Ho : µ = 1 (The mean height of the penguin population is equal to 1 meter)
2. Decide on the Alpha. α = 0.05
3. Compute the t-statistic:
t = (x̅ - µ) / se = (1.11 – 1) / 0.06 = 0.11 / 0.06 = 1.83
4. Compute the degrees of freedom: df = n-1 = 10-1 = 9
5. Determine the critical value: 2.262
6. Compare the absolute value of the t-statistic with the critical value: 1.83 < 2.262
7. Decide: Accept Ho, which means that the average height of a penguin is about 1 meter.
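A minimal Python sketch of the penguin exercise (assuming SciPy; variable names are my own). Note that SciPy works with unrounded intermediate values, so it reports t ≈ 1.95 rather than the 1.83 obtained above from the rounded standard error; the decision is the same.

from scipy import stats

heights = [0.93, 1.18, 1.34, 1.21, 1.24, 0.97, 0.93, 1.17, 1.30, 0.83]

# One-sample t-test of the mean height against the hypothesized value of 1 meter.
t_stat, p_value = stats.ttest_1samp(heights, popmean=1.0)
print(round(t_stat, 2), round(p_value, 3))  # t ≈ 1.95

t_crit = stats.t.ppf(0.975, len(heights) - 1)  # ≈ 2.262
print(abs(t_stat) > t_crit)  # False, so H0 is not rejected at α = 0.05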
Two-sample t-test
▪ The two-sample t-test (also known as the independent samples t-test) is a
method used to test whether the unknown population means of two
groups are equal or not.

▪ You can use the test when your data values are independent, are
randomly sampled from two normal populations and the two independent
groups have equal variances.
Two-sample t-test example
▪ One way to measure a person’s fitness is to measure their body fat
percentage. Average body fat percentages vary by age, but according to
some guidelines, the normal range for men is 15-20% body fat, and the
normal range for women is 20-25% body fat.

▪ Our sample data is from a group of men and women who did workouts at
a gym three times a week for a year. Then, their trainer measured the
body fat.
▪ You can clearly see some overlap in the body fat
measurements for the men and women in our
sample, but also some differences. Just by
looking at the data, it's hard to draw any solid
conclusions about whether the underlying
populations of men and women at the gym have
the same mean body fat. That is the value of
statistical tests – they provide a common,
statistically valid way to make decisions, so that
everyone makes the same decision on the same
set of data values.
Data

▪ Before jumping into analysis, we


should always take a quick look at the
data. The figure shows histograms and
summary statistics for the men and
women.
How to perform the two-sample t-test
▪ For each group, we need the average, standard deviation and sample size. These
are shown in the table below.

▪ Without doing any testing, we can see that the averages for men and women in
our samples are not the same. But how different are they? Are the averages
“close enough” for us to conclude that mean body fat is the same for the larger
population of men and women at the gym? Or are the averages too different for
us to make this conclusion?
How to perform the two-sample t-test
Steps
1. Ho : The mean body fat for men and women are equal.

2. Alpha value : 0.05

3. Calculate the t-statistic. Formula:
t = (x̅1 – x̅2) / (sp · √(1/n1 + 1/n2)), where sp is the pooled standard deviation

▪ We start by calculating our test statistic.

▪ This calculation begins with finding the difference between the two averages:

difference of averages = 22.29 − 14.95 = 7.34


CALCULATE POOLED STANDARD DEVIATION
Next, we calculate the pooled standard deviation. This builds a combined estimate of the overall
standard deviation. The estimate adjusts for different group sizes. First, we calculate the pooled variance.

GIVEN DATA:
n1 = sample size women = 10
n2 = sample size men = 13
s1 = standard deviation of women = 5.32
s2 = standard deviation of men = 6.84

vp = [ (n1 – 1)s1² + (n2 – 1)s2² ] / (n1 + n2 – 2)
   = [ (10 – 1)(5.32²) + (13 – 1)(6.84²) ] / (10 + 13 – 2)
   = [ (9)(28.30) + (12)(46.79) ] / 21
   = (254.7 + 561.48) / 21
   = 38.87
Next, we take the square root of the pooled variance to get the pooled standard deviation.

This is: √38.87 = 6.23

We now have all the pieces for our test statistic. We have the difference of the averages, the pooled standard deviation and the sample
sizes.

We calculate our test statistic as follows:
t = (x̅1 – x̅2) / (sp · √(1/n1 + 1/n2))
  = (average of women – average of men) / (pooled standard deviation × √(1/n1 + 1/n2))
  = 7.34 / (6.23 × √(1/10 + 1/13))
  = 7.34 / 2.62
  = 2.80

4. Calculate degrees of freedom. Formula: df = n1 + n2 - 2 = 10+13-2 = 21


Determine Critical Value

According to this
table, for a two-tailed
test with an alpha
level of 0.05 at 21
degrees of freedom,
the critical value is
2.08.
To evaluate the difference between the means in order to make a decision about our gym
programs, we compare the test statistic to a theoretical value from the t-distribution.

6. We compare the value of our statistic (2.80) to the t value. Since 2.80 > 2.080, we
reject the null hypothesis that the mean body fat for men and women are equal, and
conclude that we have evidence body fat in the population is different between men and
women.
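Since only the summary statistics are quoted for this example, a minimal Python sketch (assuming SciPy; names are my own) can use the summary-statistics form of the pooled two-sample t-test:

from scipy import stats

# Pooled-variance two-sample t-test from summary statistics (women vs. men).
res = stats.ttest_ind_from_stats(
    mean1=22.29, std1=5.32, nobs1=10,   # women
    mean2=14.95, std2=6.84, nobs2=13,   # men
    equal_var=True,                     # pooled standard deviation, as above
)
print(round(res.statistic, 2), round(res.pvalue, 4))  # t ≈ 2.80, p ≈ 0.011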
EXERCISE FOR TWO-SAMPLE T-TEST

Andro grows tomatoes in two separate fields. When the tomatoes are ready to be picked, he is curious as to whether the sizes of tomatoes differ between the 2 fields. He takes random samples of plants from each field and measures the heights of the plants. Is there a significant difference in the tomatoes between the two fields?

▪ Given Data:
n1 = 5 ; n2 = 5
x̅1 = 95.49 ; x̅2 = 105.8
s1 = 3.52 ; s2 = 9.23

1. Ho : The mean size in field A is equal to the mean size in field B
2. Decide on the Alpha. α = 0.05
3. Compute the t-statistic:
vp = [ (n1 – 1)s1² + (n2 – 1)s2² ] / (n1 + n2 – 2)
   = [ (5 – 1)(3.52²) + (5 – 1)(9.23²) ] / (5 + 5 – 2)
   = 48.79
sp = √48.79 = 6.98

t = (x̅1 – x̅2) / (sp · √(1/n1 + 1/n2))
  = (95.49 – 105.8) / (6.98 × 0.63)
  = -10.31 / 4.40
  = -2.34
4. Compute the degrees of freedom: df = n1 + n2 - 2 = 5+5-2 = 8
5. Determine the critical value: 2.306
6. Compare the absolute value of the t-statistic with the critical value: |-2.34| = 2.34 > 2.306
7. Decide: Reject Ho, which means there is a significant difference in the heights of tomatoes between the two fields.
Statistical Tests
A. Parametric
• Z-Test
What is Z-test?

▪ is a statistical procedure used to test an alternative hypothesis


against a null hypothesis
▪ is any statistical hypothesis test used to determine whether two
samples’ means are different when variances are known, and the
sample is large (n ≥ 30)
▪ it is a comparison of the means of two independent groups of
samples, taken from one population with known variance.
When do we use Z-test?

▪ when samples are drawn at random


▪ when the samples are taken from population are
independent
▪ when the standard deviation is known
▪ when the number of observations is large (n ≥ 30)
Types of Hypotheses in Z-test

Null Hypothesis (Ho) – is the hypothesis that the researcher tries


to disprove, reject or nullify in the study. The word ‘null’ often
refers to the common view of something. It is a statement about
a parameter (a numerical characteristic of the population).

Alternative Hypothesis (H1) – is what the researcher really thinks


is the cause of a phenomenon. It is typically the research
hypothesis of interest.
Example

About 10% of the human population is left-handed. Suppose a researcher at


Penn State speculates that students in the College of Arts and Architecture
are more likely to be left-handed than people found in the general
population. We only have one sample since we will be comparing a
population proportion based on a sample value to a known population
value.
Research Question: Are artists more likely to be left-handed than people
found in the general population?
Response Variable: Classification of the student as either right-handed or
left-handed
Example: Stating Null and Alternative Hypothesis

Null Hypothesis: Students in the College of Arts and Architecture are no


more likely to be left-handed than people in the general population
(population percent of left-handed students in the College of Art and
Architecture = 10% or p = .10).
Alternative Hypothesis: Students in the College of Arts and Architecture are
more likely to be left-handed than people in the general population
(population percent of left-handed students in the College of Arts and
Architecture > 10% or p > .10). This is a one-sided alternative hypothesis.
Z-Test Formula

The z test formula compares the z statistic with the z critical


value to test whether there is a difference in the means of two
populations. In hypothesis testing, the z critical value divides the
distribution graph into the acceptance and the rejection regions.
If the test statistic falls in the rejection region, then the null
hypothesis can be rejected; otherwise it cannot be rejected. The z
test formula can be used to set up the required hypothesis tests for a one-
sample or for a two-sample z test.
One-sample Z-test

A one-sample Z-test is used to check if there is a difference
between the sample mean and the population mean when the
population standard deviation is known. The formula for the Z-test
statistic is given as follows:

z = (x̅ − μ0) / (σ / √n)

where x̅ is the sample mean, μ0 is the hypothesized population mean, σ is the known population standard deviation, and n is the sample size.
One-sample Z-test

The algorithm to set a one-sample Z-test based on the Z-test


statistic is given as follows:
Left Tailed Test:
Null Hypothesis: H0 : μ = μ0
Alternate Hypothesis: H1 : μ < μ0
Decision Criteria: If the z statistic < z critical value then reject the
null hypothesis.
Sample (Left-tailed one-sample z-test)

Problem: An online medicine shop claims that the


mean delivery time for medicines is less than 120
minutes with a standard deviation of 30 minutes.
Is there enough evidence to support this claim at a
0.05 significance level if 49 orders were examined
with a mean of 100 minutes?
Sample (Left-tailed one-sample z-test)

How to solve?
Step 1 – identify the sample size and if the population
standard deviation is known.
Step 2 – state your null and alternate hypotheses
Step 3 – state the alpha level. If you aren’t given one, use
0.05.
Step 4 – find the Z using the one-sample Z-test formula
Step 5 – conclude
Sample (Left-tailed one-sample z-test)

Solution: As the sample size is 49 and population standard


deviation is known, this is an example of a left-tailed one-sample
z test.
Null Hypothesis: H0 : μ = 120
Alternate Hypothesis: H1 : μ < 120
*From the z table, the critical value at α = 0.05 is -1.645. A negative sign
is used as this is a left-tailed test.
Sample (Left-tailed one-sample z-test)

▪ Answer:
▪ Reject the null hypothesis
▪ As -4.66 < -1.645 thus, the null
hypothesis is rejected, and it is
concluded that there is enough
evidence to support the
medicine shop's claim.
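SciPy has no built-in one-sample z-test, so a minimal Python sketch of this example (variable names are my own; SciPy is assumed only for the normal distribution) computes the statistic directly:

import math
from scipy import stats

mu0, sigma = 120, 30          # hypothesized mean and known population SD
x_bar, n = 100, 49            # sample mean and sample size

z = (x_bar - mu0) / (sigma / math.sqrt(n))
z_crit = stats.norm.ppf(0.05)           # left-tailed critical value ≈ -1.645
p_value = stats.norm.cdf(z)             # left-tailed p-value

# z ≈ -4.67 (the slide rounds to -4.66); z < z_crit, so H0 is rejected.
print(round(z, 2), round(z_crit, 3), p_value)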
One-sample Z-test

Right Tailed Test:

Null Hypothesis: H0 : μ = μ0

Alternate Hypothesis: H1 : μ > μ0
Decision Criteria: If the z statistic > z critical value then reject the
null hypothesis.
Example (Right-tailed one-sample z-test)

Problem: A teacher claims that the mean score of


students in his class is greater than 82 with a
standard deviation of 20. If a sample of 81
students was selected with a mean score of 90
then check if there is enough evidence to support
this claim at a 0.05 significance level.
Example (Right-tailed one-sample z-test)

Solution: As the sample size is 81 and population standard


deviation is known, this is an example of a right-tailed one-
sample z test.
Null Hypothesis: H0 : μ = μ0
Alternate Hypothesis: H1 : μ > μ0
* critical value at α = 1.645
Example (Right-tailed one-sample z-test)

Answer:
Reject the null hypothesis
As 3.6 > 1.645 thus, the null
hypothesis is rejected, and it is
concluded that there is enough
evidence to support the teacher's
claim.
Two-sample Z-test

A two-sample Z-test is used to check if there is a
difference between the means of two samples. The z
test statistic formula is given as follows:

z = (x̅1 − x̅2) / √(σ1²/n1 + σ2²/n2)

where x̅1 and x̅2 are the sample means, σ1 and σ2 are the known population standard deviations, and n1 and n2 are the sample sizes.
Example of Two-sample Z-test

Problem:
The amount of a certain trace element in blood is known to vary with a
standard deviation of 14.1 ppm (parts per million) for male blood donors
and 9.5 ppm for female donors. Random samples of 75 male and 50
female donors yield concentration means of 28 and 33 ppm, respectively.
What is the likelihood that the population means of concentrations of
the element are the same for men and women?
Example of Two-sample Z-test

How to solve?
Step 1 – identify the sample size and if the population standard deviation
is known.
Step 2 – state your null and alternate hypotheses
Step 3 – find the Z using the two-sample Z-test formula
Step 4 – conclude
Example of Two-sample Z-test

Solution: As the sample sizes are 75 (male) and 50 (female) and


population standard deviations are both known.
Null Hypothesis: H0 : μ1 = μ2 or H0 : μ1 - μ2 = 0
Alternate Hypothesis: H1 : μ1 ≠ μ2 or H0 : μ1 - μ2 ≠ 0
Example of Two-sample Z-test

z = (28 − 33) / √(14.1²/75 + 9.5²/50) = −5 / 2.11 = −2.37

Answer:
Reject the null hypothesis
The computed Z-value is negative because the (larger) mean for females
was subtracted from the (smaller) mean for males.
But the order of the samples in this computation is arbitrary – it could be
in the opposite order, in which case z would be 2.37 instead of -2.37. Since this is a
two-tailed test at the 5% alpha level, any Z-score more extreme than ±1.96 will
lead to rejection of the null hypothesis.
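A minimal Python sketch of this two-sample z-test (variable names are my own; statsmodels has a ztest function, but the formula is short enough to compute directly with SciPy's normal distribution):

import math
from scipy import stats

x1, sigma1, n1 = 28, 14.1, 75   # male donors
x2, sigma2, n2 = 33, 9.5, 50    # female donors

z = (x1 - x2) / math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
p_value = 2 * stats.norm.cdf(-abs(z))   # two-tailed p-value

print(round(z, 2), round(p_value, 3))   # z ≈ -2.37, p ≈ 0.018, so H0 is rejected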
Let’s test your understanding

EXERCISE:
A principal at a school claims that the students in his school are above
average intelligence. A random sample of 30 students’ IQ scores have a
mean score of 112.5. Is there sufficient evidence to support the
principal’s claim? The mean population IQ is 100 with a standard
deviation of 15.
Exercise:

Solution: As the sample size is 30 and population standard


deviation is known, this is an example of a right-tailed one-
sample z test.
Null Hypothesis: H0 : μ = 100
Alternate Hypothesis: H1 : μ > 100
* critical value at α = 1.645
Exercise:

Solution:
Z = (112.5 – 100) / (15/√30)
Z = 4.56

Answer:
Reject the null hypothesis.
As 4.56 > 1.645, the null hypothesis is rejected, and it is concluded that there is enough
evidence to support the principal's claim.
Statistical Tests
A. Parametric
• One-Way ANOVA
Analysis of Variance (ANOVA) Test
• ANOVA test can be defined as a type of test used in hypothesis testing to
compare whether the means of two or more groups are equal or not.
• This test is used to check if the null hypothesis can be rejected or not depending
upon the statistical significance exhibited by the parameters. The decision is
made by comparing the ANOVA test statistic with the critical value.
• An ANOVA test can be either one-way or two-way depending upon the number
of independent variables.
One-way vs Two-way ANOVA Differences Chart

One-Way ANOVA
• Definition: A test that allows one to make comparisons between the means of three or more groups of data.
• Number of Independent Variables: One.
• What is Being Compared?: The means of three or more groups of an independent variable on a dependent variable.
• Number of Groups of Samples: Three or more.

Two-Way ANOVA
• Definition: A test that allows one to make comparisons between the means of three or more groups of data, where two independent variables are considered.
• Number of Independent Variables: Two.
• What is Being Compared?: The effect of multiple groups of two independent variables on a dependent variable and on each other.
• Number of Groups of Samples: Each variable should have multiple samples.
Limitations of One-Way ANOVA Test

The one-way ANOVA is an omnibus test statistic. This


implies that the test will determine whether the means
of the various groups are statistically significant or not.
However, it cannot distinguish the specific groups that
have a statistically significant mean. Thus, to find the
specific group with a different mean, a post hoc test
needs to be conducted.
One-Way ANOVA Hypotheses

▪ The null hypothesis (H0) is that there is no difference


between the groups and equality between means.
H0: µ1= µ2= µ3
▪ The alternative hypothesis (H1) is that there is a
difference between the means and groups.
H1: At least one group is different from another.
Example of One-Way ANOVA Test

At what temperature is it ideal for students to take their exam – cold (15°),
moderate (25°), or hot (35°)?

n    Cold (15°)   Moderate (25°)   Hot (35°)
1    4            8                2
2    7            6                2
3    5            7                3
4    3            6                4
5    5            5                1
Step 1: Compute for the Mean in each group.

n Cold (15°) Moderate (25°) Hot (35°)


1 4 8 2
2 7 6 2
3 5 7 3
4 3 6 4
5 5 5 1

M= 4.8 6.4 2.4


Step 2: Compute for X1², X2², and X3².

X1 (15°)  X1²   X2 (25°)  X2²   X3 (35°)  X3²
4         16    8         64    2         4
7         49    6         36    2         4
5         25    7         49    3         9
3         9     6         36    4         16
5         25    5         25    1         1
Step 3: Compute for ∑X1², ∑X2², and ∑X3².

X1 (15°)  X1²        X2 (25°)  X2²        X3 (35°)  X3²
4         16         8         64         2         4
7         49         6         36         2         4
5         25         7         49         3         9
3         9          6         36         4         16
5         25         5         25         1         1
T= 24     ∑X1²= 124  T= 32     ∑X2²= 210  T= 12     ∑X3²= 34
M= 4.8               M= 6.4               M= 2.4
Step 4: Complete the details on the table:

X1 (15°)  X1²        X2 (25°)  X2²        X3 (35°)  X3²
4         16         8         64         2         4         k= 3
7         49         6         36         2         4         n= 5
5         25         7         49         3         9         N= 15
3         9          6         36         4         16        G= 68
5         25         5         25         1         1         ∑X²= 368
T= 24     ∑X1²= 124  T= 32     ∑X2²= 210  T= 12     ∑X3²= 34
M= 4.8               M= 6.4               M= 2.4
Wherein:
k= number of treatment conditions T= total for each treatment condition

n= number of scores in each treatment G= sum of all scores in the study

N= total number of scores SS= sum of squares


Step 5: Compute for the sum of squares (SS).
SSbetween = ∑(T²/n) − G²/N = (24² + 32² + 12²)/5 − 68²/15 = 348.8 − 308.27 = 40.53
SSwithin = ∑X² − ∑(T²/n) = 368 − 348.8 = 19.2
SStotal = ∑X² − G²/N = 368 − 308.27 = 59.73

Step 6: Complete the ANOVA Table:

Source               SS      df    MS       F
Between Treatments   40.53   2     20.265   12.67
Within Treatments    19.2    12    1.6
Total                59.73   14
Step 7: Find the tabular value of F
Step 8: Compare the F statistic (computed value) to the critical value.

F= 12.67
table F= 3.89

If F is higher compared to table F, reject the null hypothesis (H0).

Conclusion:
H1: There is a significant difference in the exam scores between the three
groups.
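A minimal Python sketch of this one-way ANOVA (assuming SciPy; the scores are the ones in the table above, variable names are my own):

from scipy import stats

cold = [4, 7, 5, 3, 5]
moderate = [8, 6, 7, 6, 5]
hot = [2, 2, 3, 4, 1]

# One-way ANOVA across the three temperature groups.
f_stat, p_value = stats.f_oneway(cold, moderate, hot)
print(round(f_stat, 2), round(p_value, 4))      # F ≈ 12.67

f_crit = stats.f.ppf(0.95, 2, 12)               # df between = 2, df within = 12
print(round(f_crit, 2), f_stat > f_crit)        # 3.89, True, so H0 is rejected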
The Mozart effect refers to a boost of average performance on tests for elementary school students if the students listen
to Mozart’s chamber music for a period of time immediately before the test. Many educators believe that such an effect is
not necessarily due to Mozart’s music per se but rather a relaxation period before the test. To support this belief, an
elementary school teacher conducted an experiment by dividing her third-grade class of 15 students into three groups of
5. Students in the first group were asked to give themselves a self-administered facial massage; students in the second
group listened to Mozart’s chamber music for 15 minutes; students in the third group listened to Schubert’s chamber
music for 15 minutes before the test. The scores of the 15 students are given below:

n     Group 1 (Massage)   Group 2 (Mozart)   Group 3 (Schubert)
1     7                   8                  8
2     7                   8                  8
3     8                   9                  7
4     9                   9                  9
5     8                   9                  9
M=    7.8                 8.6                8.2
X1     X1²        X2     X2²        X3     X3²
7      49         8      64         8      64        k= 3
7      49         8      64         8      64        n= 5
8      64         9      81         7      49        N= 15
9      81         9      81         9      81        G= 123
8      64         9      81         9      81        ∑X²= 1017
T= 39  ∑X1²= 307  T= 43  ∑X2²= 371  T= 41  ∑X3²= 339
M= 7.8            M= 8.6            M= 8.2

SS1= 2.8          SS2= 1.2          SS3= 2.8


Source               SS    df    MS        F
Between Treatments   1.6   2     0.8       1.412
Within Treatments    6.8   12    0.56667
Total                8.4   14
Step 5: Find the tabular value of F
Step 6: Compare the F statistic (computed value) to the critical value.

F= 1.412
table F= 3.89

If F is lower compared to table F, the null hypothesis (H0) is not rejected.

Conclusion:
H0: There is no significant difference in the exam scores between the three
groups who participated in the experiment.
Statistical Tests
A. Parametric
• Two-Way ANOVA
Two-way ANOVA

▪ Two-way ANOVA is an extension of the one-way ANOVA in which a second
factor is added to the analysis.
▪ It extends the one-factor situation to take account of a second factor, whose
levels are often determined by the grouping of subjects or units used in the
research.
▪ A two-way ANOVA is used to determine whether or not there is a
statistically significant difference between the means of three or more
independent groups that have been split on two factors.
Two-way ANOVA Table

                  df    SS    MSS    F    F-crit
Between Row
Between Column
Within Error
Total

Where:
df = Degrees of Freedom
SS = Sum of Squares
MSS = Mean Sum of Squares
F = F-ratio
F-crit = critical F value from the table
Example 1

A farmer applied three types of fertilizer on 4 separate plots for his cultivation. The
figures on yield per acre are tabulated below.

YIELD
FERTILIZER
A B C D
NITROGEN 6 4 8 6
PHOSPHORUS 7 6 6 9
POTASSIUM 8 5 10 9

H0 : no significant effect on the yield


Step 1: Find the Degrees of Freedom

              YIELD
FERTILIZER    A    B    C    D
NITROGEN      6    4    8    6
PHOSPHORUS    7    6    6    9
POTASSIUM     8    5    10   9

dfBetweenRow = No. of rows – 1 = 3 – 1 = 2
dfBetweenColumn = No. of columns – 1 = 4 – 1 = 3
dfTotal = No. of samples – 1 = 12 – 1 = 11
dfWithinError = dfT – (dfBR + dfBC) = 11 – (2 + 3) = 6

                  df    SS    MSS    F    F-crit
Between Row       2
Between Column    3
Within Error      6
Total             11
Step 2: Find the Sum of Squares

              YIELD
FERTILIZER    A    B    C    D     x̅R
NITROGEN      6    4    8    6     6
PHOSPHORUS    7    6    6    9     7
POTASSIUM     8    5    10   9     8
x̅C            7    5    8    8     x̅G = 7

Row means: x̅R-NITROGEN = (6+4+8+6)/4 = 6; x̅R-PHOSPHORUS = (7+6+6+9)/4 = 7; x̅R-POTASSIUM = (8+5+10+9)/4 = 8
Column means: x̅C-A = (6+7+8)/3 = 7; x̅C-B = (4+6+5)/3 = 5; x̅C-C = (8+6+10)/3 = 8; x̅C-D = (6+9+9)/3 = 8
Grand mean: x̅G = (6+4+8+6+7+6+6+9+8+5+10+9)/12 = 7

SSR = ∑ nR(x̅R − x̅G)² = 4(6−7)² + 4(7−7)² + 4(8−7)² = 8
SSC = ∑ nC(x̅C − x̅G)² = 3(7−7)² + 3(5−7)² + 3(8−7)² + 3(8−7)² = 18
SSTotal = ∑(x − x̅G)² = (6−7)² + (4−7)² + (8−7)² + (6−7)² + (7−7)² + (6−7)² + (6−7)² + (9−7)² + (8−7)² + (5−7)² + (10−7)² + (9−7)² = 36
SSWE = SSTotal – (SSR + SSC) = 36 – (8 + 18) = 10
                  df    SS    MSS    F    F-crit
Between Row       2     8
Between Column    3     18
Within Error      6     10
Total             11    36
Step 3: Find the Mean Sum of Squares

MSS = SS / df

MSSR = SSR / dfR = 8 / 2 = 4
MSSC = SSC / dfC = 18 / 3 = 6
MSSWE = SSWE / dfWE = 10 / 6 = 1.667

                  df    SS    MSS      F    F-crit
Between Row       2     8     4
Between Column    3     18    6
Within Error      6     10    1.667
Total             11    36
Step 4: Find the F-ratio

F = MSS / MSSWE

FR = MSSR / MSSWE = 4 / 1.667 = 2.4
FC = MSSC / MSSWE = 6 / 1.667 = 3.6

                  df    SS    MSS      F      F-crit
Between Row       2     8     4        2.4
Between Column    3     18    6        3.6
Within Error      6     10    1.667
Total             11    36
Step 5: Find the critical value of F

▪ Use the Table of Critical Values of the F distribution for α = 0.05
▪ Numerator degrees of freedom = dfR (for rows) or dfC (for columns)
▪ Denominator degrees of freedom = dfWE

F-crit for between row (Fertilizer): dfR = 2, dfWE = 6
F-crit for between column (Plot): dfC = 3, dfWE = 6
Conclusion:

                  df    SS    MSS      F      F-crit
Between Row       2     8     4        2.4    5.14
Between Column    3     18    6        3.6    4.76
Within Error      6     10    1.667
Total             11    36

Fertilizer (rows)
• Since 2.4 < 5.14, H0 is not rejected
• The fertilizer does not have a significant effect on the yield

Plot (columns)
• Since 3.6 < 4.76, H0 is not rejected
• The plot does not have a significant effect on the yield
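A minimal sketch (not from the slides) reproducing the fertilizer example as a two-way ANOVA without replication, using NumPy for the sums of squares and SciPy only for the critical F values; variable names are my own.

import numpy as np
from scipy import stats

# Rows: Nitrogen, Phosphorus, Potassium; columns: plots A, B, C, D.
yields = np.array([[6.0, 4.0, 8.0, 6.0],
                   [7.0, 6.0, 6.0, 9.0],
                   [8.0, 5.0, 10.0, 9.0]])
r, c = yields.shape
grand_mean = yields.mean()

ss_row = c * ((yields.mean(axis=1) - grand_mean) ** 2).sum()   # SSR = 8
ss_col = r * ((yields.mean(axis=0) - grand_mean) ** 2).sum()   # SSC = 18
ss_total = ((yields - grand_mean) ** 2).sum()                  # SSTotal = 36
ss_err = ss_total - ss_row - ss_col                            # SSWE = 10

df_row, df_col = r - 1, c - 1
df_err = (r - 1) * (c - 1)                                     # same as 11 - (2 + 3) = 6

f_row = (ss_row / df_row) / (ss_err / df_err)                  # 2.4
f_col = (ss_col / df_col) / (ss_err / df_err)                  # 3.6

print(round(f_row, 2), round(stats.f.ppf(0.95, df_row, df_err), 2))  # 2.4 vs 5.14
print(round(f_col, 2), round(stats.f.ppf(0.95, df_col, df_err), 2))  # 3.6 vs 4.76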
Example 2: https://forms.gle/bnrzKmaWg4nN8moR6

Students taking up a Master in Public Administration were surveyed on how many
hours per day they have been dedicating to different social media platforms
(Facebook, Tiktok, Youtube). The figures were tabulated below, with respondents
grouped by gender (Male, Female).

Does gender or the type of social media have an effect on how many hours a
student spends on it?
H0 : no significant effect on the number of hours
Gender   Facebook   Tiktok   Youtube   x̅R
Female   3          1        4         2.667
Female   6          0        1         2.333
Female   0          1        2         1
Female   8          0        2         3.333
Female   0          0        2         0.667
Female   1          0        3         1.333
Female   1          2        1         1.333
Female   2          3        0         1.667
Female   2          5        1         2.667
Male     2          0.5      0         0.833
Male     5          3        6         4.667
Male     1          0        0.5       0.5
Male     2          0        2         1.333
Male     1          1        1         1
Male     2          0        1         1
Male     8          0.1      8         5.367
Male     2          2        2         2
Male     3          1        2         2
x̅C       2.722      1.089    2.139     x̅G = 1.983

Step 1: Find the Degrees of Freedom
dfBetweenRow = No. of rows – 1 = 18 – 1 = 17
dfBetweenColumn = No. of columns – 1 = 3 – 1 = 2
dfTotal = No. of samples – 1 = 54 – 1 = 53
dfWithinError = dfT – (dfBR + dfBC) = 53 – (17 + 2) = 34

Step 2: Find the Sum of Squares
SSR = ∑ nR(x̅R − x̅G)² = 93.155
SSC = ∑ nC(x̅C − x̅G)² = 24.663
SSTotal = ∑(x − x̅G)² = 232.095
SSWE = SSTotal – (SSR + SSC) = 114.277

                  df    SS        MSS    F    F-crit
Between Row       17    93.155
Between Column    2     24.663
Within Error      34    114.277
Total             53    232.095


Step 3: Find the Mean Sum of Squares

MSS = SS / df

MSSR = SSR / dfR = 93.155 / 17 = 5.48
MSSC = SSC / dfC = 24.663 / 2 = 12.33
MSSWE = SSWE / dfWE = 114.277 / 34 = 3.36

                  df    SS        MSS      F    F-crit
Between Row       17    93.155    5.48
Between Column    2     24.663    12.33
Within Error      34    114.277   3.36
Total             53    232.095
Step 4: Find the F-ratio

F = MSS / MSSWE

FR = MSSR / MSSWE = 5.48 / 3.36 = 1.63
FC = MSSC / MSSWE = 12.33 / 3.36 = 3.67

                  df    SS        MSS      F       F-crit
Between Row       17    93.155    5.48     1.63
Between Column    2     24.663    12.33    3.67
Within Error      34    114.277   3.36
Total             53    232.095
Step 5: F-crit value for between row (Gender) dfR = 17

dfWE = 34
Step 5: F-crit value for between column (Social Media Platforms)
dfC = 2

dfWE = 34
Conclusion:

                  df    SS        MSS      F       F-crit
Between Row       17    93.155    5.48     1.63    1.933
Between Column    2     24.663    12.33    3.67    3.276
Within Error      34    114.277   3.36
Total             53    232.095

Gender (rows)
• Since 1.63 < 1.933, H0 is not rejected
• Gender does not have a significant effect on how many hours a student spends on social media

Social Media Platforms (columns)
• Since 3.67 > 3.276, H0 is rejected
• The type of social media platform has a significant effect on how many hours a student spends on it
Statistical Tests
B. Non-Parametric
• Chi-Square Test
The Chi-Square test is a statistical procedure for determining the difference between
observed and expected data. This test can also be used to determine whether it correlates
to the categorical variables in our data. It helps to find out whether a difference between
two categorical variables is due to chance or a relationship between them. Also called
Pearson’s Chi-Square Test or Chi-Square Test of Association.

Formula For Chi-Square Test:

χ²c = ∑ (O − E)² / E

Where

c = Degrees of freedom
O = Observed Value
E = Expected Value

The degrees of freedom in a statistical calculation represent the number of variables that can
vary in a calculation. The degrees of freedom can be calculated to ensure that chi-square tests
are statistically valid. These tests are frequently used to compare observed data with data that
would be expected to be obtained if a particular hypothesis were true.

The Observed values are those you gather yourselves.

The Expected values are the frequencies expected, based on the null hypothesis.
Chi-square is a statistical test that examines the differences between categorical variables from a
random sample in order to determine whether the expected and observed results are well-fitting.

Here are some of the uses of the Chi-Squared test:


• The Chi-squared test can be used to see if your data follows a well-known theoretical
probability distribution like the Normal or Poisson distribution.
• The Chi-squared test allows you to assess your trained regression model's goodness of fit on
the training, validation, and test data sets.
What Does A Chi-Square Statistic Test Tell You?

A Chi-Square test (symbolically represented as χ²) is fundamentally a data analysis based on the


observations of a random set of variables. It computes how a model equates to actual observed
data.

A Chi-Square statistic test is calculated based on the data, which must be raw, random, drawn
from independent variables, drawn from a wide-ranging sample and mutually exclusive. In simple
terms, two sets of statistical data are compared. Karl Pearson introduced this test in 1900 for
categorical data analysis and distribution. This test is also known as ‘Pearson’s Chi-Squared Test’.

Chi-Squared Tests are most commonly used in hypothesis testing. A hypothesis is an assumption
that any given condition might be true, which can be tested afterwards. The Chi-Square test
estimates the size of inconsistency between the expected results and the actual results when the
size of the sample and the number of variables in the relationship is mentioned.

These tests use degrees of freedom to determine if a particular null hypothesis can be rejected
based on the total number of observations made in the experiments. The larger the sample size,
the more reliable the result.
When to Use a Chi-Square Test?

A Chi-Square Test is used to examine whether the observed results are in order with the expected
values. When the data to be analysed is from a random sample, and when the variable in
question is a categorical variable, then Chi-Square proves the most appropriate test for the same. A
categorical variable consists of selections such as breeds of dogs, types of cars, genres of movies,
educational attainment, male v/s female etc. Survey responses and questionnaires are the primary
sources of these types of data. The Chi-square test is most commonly used for analysing this kind
of data. This type of analysis is helpful for researchers who are studying survey response data. The
research can range from customer and marketing research to political sciences and economics.
There are Two Main Types of Chi-Square Tests, namely :

1. Independence
2. Goodness-of-Fit
Independence
The Chi-Square Test of Independence is a derivable ( also known as inferential ) statistical test which examines
whether the two sets of variables are likely to be related with each other or not. This test is used when we have
counts of values for two nominal or categorical variables and is considered as non-parametric test. A relatively
large sample size and independence of observations are the required criteria for conducting this test.

For Example:

In a movie theatre, suppose we made a list of movie genres. Let us consider this as the first variable. The
second variable is whether or not the people who came to watch those genres of movies have bought snacks at
the theatre. Here the null hypothesis is that the genre of the film and whether people bought snacks or not are
unrelatable. If this is true, the movie genres don’t impact snack sales.
Goodness-Of-Fit
In statistical hypothesis testing, the Chi-Square Goodness-of-Fit test determines whether a variable is
likely to come from a given distribution or not. We must have a set of data values and the idea of the
distribution of this data. We can use this test when we have value counts for categorical variables. This
test demonstrates a way of deciding if the data values have a “ good enough” fit for our idea or if it is a
representative sample data of the entire population.
For Example-

Suppose we have bags of balls with five different colors in each bag. The given condition is that the bag
should contain an equal number of balls of each color. The idea we would like to test here is that the
proportions of the five colors of balls in each bag must be exact.
Who Use Chi-Square Analysis?

Chi-square is most commonly used by researchers who are studying survey response data
because it applies to categorical variables. Demography, consumer and marketing research,
political science, and economics are all examples of this type of research.

Example

Let's say you want to know if gender has anything to do with political party preference. You poll
440 voters in a simple random sample to find out which political party they prefer. The results of
the survey are shown in the table below:
To see if gender is linked to political party preference, perform a Chi-Square test of independence
using the steps below.

Step 1: Define the Hypothesis

H0: There is no link between gender and political party preference.

H1: There is a link between gender and political party preference.

Step 2: Calculate the Expected Values

Now you will calculate the expected frequency for each cell:

Expected value = (row total × column total) / grand total

For example, the expected value for Male Republicans is (Male row total × Republican column total) / 440.

Similarly, you can calculate the expected value for each of the cells.
Step 3: Calculate (O-E)2 / E for Each Cell in the Table

Now you will calculate the (O - E)2 / E for each cell in the table.

Where:

O=Observed Value

E = Expected Value
Step 4: Calculate the Test Statistic X2

X2 is the sum of all the values in the last table

= 0.743 + 2.05 + 2.33 + 3.33 + 0.384 + 1

= 9.837

Before you can conclude, you must determine first the critical statistic, which requires determining our
degrees of freedom. The degrees of freedom in this case are equal to the table's number of columns minus
one multiplied by the table's number of rows minus one, or (r-1) (c-1). We have (3-1)(2-1) = 2.

Finally, you compare our obtained statistic to the critical statistic found in the chi-square table. As you can see,
for an alpha level of 0.05 and two degrees of freedom, the critical statistic is 5.991, which is less than our
obtained statistic of 9.83. You can reject the null hypothesis because the obtained statistic is higher than the
critical statistic.

This means you have sufficient evidence to say that there is an association between gender and political party
preference.
Chi-Square Distribution

In statistical analysis, the Chi-Square distribution is used in many hypothesis tests and is determined by
the parameter k degree of freedoms. It belongs to the family of continuous probability distributions.
The Sum of the squares of the k independent standard random variables is called the Chi-Squared
distribution. Pearson’s Chi-Square Test formula is -

Where X^2 is the Chi-Square test symbol

Σ is the summation of observations

O is the observed results


E is the expected results
The shape of the distribution graph changes with the increase in the value of k, i.e. degree of
freedoms.

When k is 1 or 2, the Chi-square distribution curve is shaped like a backwards ‘J’. It means
there is a high chance that X^2 becomes close to zero.
When k is greater than 2, the shape of the distribution curve looks like a hump and has a low
probability that χ² is very near to 0 or very far from 0. The distribution extends much longer on
the right-hand side and shorter on the left-hand side. The most probable value of χ² is (k − 2).

When k is greater than ninety, a normal distribution is seen, approximating the Chi-square
distribution.
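A short sketch using SciPy (not used in the original slides) shows how the distribution changes with k; chi2.ppf(0.95, k) is the critical value at alpha = 0.05, the same 5.991 used above for k = 2.

# Mode and 0.05-level critical value of the chi-square distribution for several k.
from scipy.stats import chi2

for k in (1, 2, 5, 10, 90):
    mode = max(k - 2, 0)              # most probable value of X^2 for k > 2
    critical = chi2.ppf(0.95, k)
    print(f"k = {k:3d}  mode = {mode:3d}  critical value (alpha = 0.05) = {critical:8.3f}")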
Example

A restaurant wishes to determine whether there is a difference in the level of spiciness selected for its ramen by its frequent clients, who belong to different social and cultural groups: the Bicolanos and Manilenos. A random sample provides the data given below. At α = 0.05, test the claim that the level of spiciness selected is dependent on the ethnicity of the individual.

Cultural/social group   Spiciness – Medium   Hot   Very Hot   Total
Bicolanos                               24    20         19      63
Manilenos                               13    15         20      48
Total                                   37    35         39     111
Hypotheses

Null Hypothesis: The spiciness level selected by an individual is independent of the ethnicity of the individual.

Alternative Hypothesis: The spiciness level selected by an individual is dependent on the ethnicity of the individual.
Observed frequencies (columns are spiciness levels):

Cultural/social group   Medium   Hot   Very Hot   Total
Bicolanos                   24    20         19      63
Manilenos                   13    15         20      48
Total                       37    35         39     111

Expected frequencies (columns are spiciness levels):

Cultural/social group   Medium     Hot   Very Hot   Total
Bicolanos                21.00   19.86      22.14      63
Manilenos                16.00   15.14      16.86      48
Total                       37      35         39     111
x² = Σ (O − E)² / E, where the expected value for each cell = (row sum × column sum) / grand total

x² = (24 − 21)²/21 + (20 − 19.86)²/19.86 + (19 − 22.14)²/22.14 + (13 − 16)²/16 + (15 − 15.14)²/15.14 + (20 − 16.86)²/16.86

x² = 2.03

d.f. = (rows − 1)(columns − 1) = (2 − 1)(3 − 1) = 1 × 2 = 2
Conclusion:

Since the chi-square value of 2.03 is less than the critical value of 5.991, the null hypothesis is not rejected. This means there is not enough evidence to support the claim that the spiciness level selected by an individual depends on their ethnicity.
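For comparison, the same table can be checked with SciPy's chi2_contingency, which carries out the expected-frequency, statistic, degrees-of-freedom, and p-value calculations in one call; it should reproduce a chi-square of roughly 2.03 with df = 2.

# Chi-square test of independence on the restaurant data via SciPy.
import numpy as np
from scipy.stats import chi2_contingency

observed = np.array([[24, 20, 19],    # Bicolanos: Medium, Hot, Very Hot
                     [13, 15, 20]])   # Manilenos: Medium, Hot, Very Hot

stat, p_value, df, expected = chi2_contingency(observed)
print(f"chi-square = {stat:.2f}, df = {df}, p-value = {p_value:.3f}")
print(np.round(expected, 2))          # matches the expected-frequency table above
# The p-value exceeds 0.05 (equivalently, 2.03 < 5.991), so H0 is not rejected.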
Statistical Tests
B. Non-Parametric
Non-Parametric Tests

▪ Sometimes called distribution-free tests
▪ They do not assume anything about the underlying distribution (for
example, that the data come from a normal distribution)
▪ Choosing one usually means that you know the population data do not
have a normal distribution.
Advantages of Non-Parametric Tests

▪ More statistical power when assumptions for the parametric tests have
been violated. When assumptions haven’t been violated, they can be
almost as powerful.
▪ Fewer assumptions (i.e. the assumption of normality doesn’t apply).
▪ Small sample sizes are acceptable.
▪ They can be used for all data types, including nominal variables, interval
variables, or data that has outliers or that has been measured imprecisely.
Disadvantages of Non-Parametric Tests

▪ Less powerful than parametric tests if assumptions haven’t been violated.


▪ More labor-intensive to calculate by hand (for computer calculations, this
isn’t an issue).
▪ Critical value tables for many tests aren’t included in many computer
software packages. This is compared to tables for parametric tests (like
the z-table or t-table) which usually are included.
Non-Parametric Tests and their alternatives

Situation                             Parametric Test           Non-Parametric Test
One sample                            Simple t-test             Wilcoxon test for one sample
Two dependent samples                 Paired sample t-test      Wilcoxon test
Two independent samples               Unpaired sample t-test    Mann-Whitney U test
More than two independent samples     One factorial ANOVA       Kruskal-Wallis test
More than two dependent samples       Repeated measures ANOVA   Friedman test
Correlation between two variables     Pearson correlation       Spearman correlation
Friedman Test

▪ A rank-based, nonparametric test for several related samples


▪ Named in honor of the Nobel laureate and American economist Milton
Friedman, who first proposed the test in 1937 in the Journal of the American
Statistical Association
▪ Used to test for differences between groups when the dependent variable
being measured is ordinal. It can also be used for continuous data that has
violated the assumptions necessary to run the one-way ANOVA with repeated
measures (e.g., data that has marked deviations from normality).
▪ This test is the non-parametric counterpart of the analysis of variance with
repeated measures.
Friedman Test Example

You might be interested to know if therapy after a slipped disc has any effect on the
perception of pain.
Friedman Test Calculation
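The calculation from the original slides is not reproduced here. As a rough illustration, the sketch below runs a Friedman test with scipy.stats.friedmanchisquare on hypothetical pain scores for the same patients measured before, during, and after therapy.

# Friedman test on hypothetical ordinal pain ratings for seven patients.
from scipy.stats import friedmanchisquare

before = [7, 8, 6, 9, 7, 8, 6]
during = [5, 7, 6, 7, 6, 6, 5]
after  = [3, 5, 4, 6, 4, 5, 3]

stat, p_value = friedmanchisquare(before, during, after)
print(f"Friedman chi-square = {stat:.2f}, p-value = {p_value:.4f}")
# A p-value below 0.05 would suggest that pain perception differs
# across the three measurement points.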
Mann-Whitney Test

▪ The Mann-Whitney U test checks whether there is a difference between


two independent groups
▪ A non-parametric counterpart to the t-test for independent samples
▪ Unlike the independent-samples t-test, the Mann-Whitney U test allows
you to draw different conclusions about your data depending on the
assumptions you make about your data's distribution. These conclusions
can range from simply stating whether the two populations differ through
to determining if there are differences in medians between groups.
Mann-Whitney Test

The Mann-Whitney U test is the non-parametric counterpart to the t-test for independent samples; it is subject to less stringent assumptions than the t-test and is typically used when the normal-distribution requirement of the t-test is not met.
Assumptions for Mann-Whitney U Tests

To calculate a Mann-Whitney U test, only two independent random samples with at least ordinally scaled characteristics must be available. The variables do not have to satisfy any particular distribution curve.
Assumptions for Mann-Whitney U Tests

▪ The dependent variable should be measured on an ordinal scale or a


continuous scale.
▪ The independent variable should be two independent, categorical groups.
▪ Observations should be independent. In other words, there should be no
relationship between the two groups or within each group.
▪ Observations are not assumed to be normally distributed; however, the two
distributions should have the same shape (e.g., both skewed to the left).
Mann Whitney Test Example

You measured the reaction time of a small group of men and women and
want to know if there is a difference.
Gender   Reaction Time   Rank
Female              34      2
Female              36      4
Female              41      7
Female              43      9
Female              44     10
Female              37      5
Male                45     11
Male                33      1
Male                35      3
Male                39      6
Male                42      8

Calculation of the rank sums:
For Females, T1 = 2 + 4 + 7 + 9 + 10 + 5 = 37
For Males, T2 = 11 + 1 + 3 + 6 + 8 = 29
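Using the reaction times above, a quick check with scipy.stats.mannwhitneyu (an illustration, not part of the original slides) looks like this; with samples this small and no ties, SciPy can compute an exact p-value.

# Mann-Whitney U test on the reaction-time data above.
from scipy.stats import mannwhitneyu

female = [34, 36, 41, 43, 44, 37]
male   = [45, 33, 35, 39, 42]

u_stat, p_value = mannwhitneyu(female, male, alternative="two-sided")
print(f"U = {u_stat}, p-value = {p_value:.3f}")
# A p-value above 0.05 would mean no significant difference in reaction
# time between the two groups.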
Kruskal Wallis Test

▪ A non-parametric alternative to the One Way ANOVA


▪ A rank-based nonparametric test that can be used to determine if there
are statistically significant differences between two or more groups of an
independent variable on a continuous or ordinal dependent variable
▪ The Kruskal Wallis test will tell you if there is a significant
difference between groups. However, it won’t tell you which groups are
different.
Assumptions for Kruskal Wallis Test

▪ One independent variable with two or more levels (independent groups).


The test is more commonly used when you have three or more levels. For
two levels, consider using the Mann Whitney U Test instead.
▪ Ordinal scale, Ratio Scale or Interval scale dependent variables.
▪ Your observations should be independent. In other words, there should be
no relationship between the members in each group or between groups.
▪ All groups should have the same shape distributions.
Kruskal Wallis Test Example

You have measured the reaction time of three groups and want to know if
there is a difference between the groups.
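The reaction-time data from the original slides are not reproduced here, so the sketch below uses hypothetical values for three independent groups with scipy.stats.kruskal.

# Kruskal-Wallis H test on hypothetical reaction times for three groups.
from scipy.stats import kruskal

group_a = [34, 36, 41, 43, 44]
group_b = [45, 33, 35, 39, 42]
group_c = [38, 40, 46, 47, 37]

h_stat, p_value = kruskal(group_a, group_b, group_c)
print(f"H = {h_stat:.2f}, p-value = {p_value:.3f}")
# A p-value below 0.05 means at least one group differs; a post-hoc test
# (e.g. pairwise Mann-Whitney with a correction) identifies which ones.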
