UNIT IV - Inferential Statistics
Inferential Statistics
Group 3: Conception, Garcia, Jacob, Morgan, Puno-Sanchez,
Santiago, and Villarin
OUTLINE
A. Inferential Statistics Definition
B. Purpose of Inferential Statistics
C. Statistical Tests
1. Parametric
• T-Test
• Z-Test
• ANOVA
2. Non-Parametric
• Chi-Square Test
• Other Non-Parametric Test
Inferential Statistics
Definition
Introduction to Inferential Statistics
[Diagram: an INFERENCE is drawn from a SAMPLE to the POPULATION]
Inferential vs. Descriptive Statistics

            Sample (Statistic)    Population (Parameter)
Mean        x̅                     µ
SD          s                     σ
Variance    s²                    σ²
Size        n                     N
Exercise: statistic or parameter?
REPRESENTATIVE SAMPLE
TO ESTABLISH WHETHER ISFs REALLY CONTRIBUTE TO THE DETERIORATION OF THE WATER QUALITY OF MANILA BAY (NULL HYPOTHESIS), WE HAVE TO DO RESEARCH USING INFERENTIAL STATISTICS.
Inferential Statistics, definition (continued)
▪ As it would take too long and be too expensive to actually survey every informal
settler within the Manila Bay region, researchers instead take a smaller survey of
say, 1,200 respondents (SWS standard), and use the results of the survey to draw
inferences about the population as a whole.
▪ This is the whole premise behind inferential statistics – we want to answer some
question about a population, so we obtain data for a small sample of that
population and use the data from the sample to draw inferences about the
population.
What is considered a “representative sample”?
▪ In order to be confident in our ability to use a sample to draw inferences about a population, we need to make sure that we have a representative sample – that is, a sample in which the characteristics of the individuals in the sample closely match the characteristics of the overall population.
▪ Ideally, we want our sample to be like a “mini version” of our population. So, if we want to draw inferences on a population of informal settlers composed of 60% that reside near or along waterways and 40% that reside inland, our sample would not be representative if it included only 40% of residents along waterways and 60% of those who reside inland. If our sample is not similar to the overall population, then we cannot generalize the findings from the sample to the overall population with any confidence.
[Diagram: a sample that is not representative of the population]
What is sampling?
Sampling is a technique of selecting
individual members or a subset of the
population to make statistical inferences
from them and estimate characteristics
of the whole population. Different
sampling methods are widely used by
researchers in social and market
research so that they do not need to
research the entire population to collect
actionable insights.
Types of Sampling:
Sampling in social research is of two types –
probability sampling and non-probability sampling.
Probability sampling is a sampling technique where a
researcher sets a selection of a few criteria and
chooses members of a population randomly. All the
members have an equal opportunity to be a part of
the sample with this selection parameter.
Non-probability sampling is when the researcher chooses members for research non-randomly, based on convenience or judgment rather than chance. This sampling method does not use a fixed or predefined selection process, which makes it difficult for all elements of the population to have an equal opportunity to be included in the sample.
Probability Sampling Techniques
▪ Simple random sampling is one of the best probability sampling techniques
that helps in saving time and resources. It is a reliable method of obtaining
information where every single member of a population is chosen
randomly, merely by chance. Each individual has the same probability of
being chosen to be a part of a sample.
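To make the idea concrete, here is a short Python sketch (not part of the original slides) showing how a simple random sample could be drawn from a population list; the population size and sample size are made-up illustrative values.

    import random

    # Hypothetical population: ID numbers of 10,000 informal-settler households
    population = list(range(1, 10001))

    # Draw a simple random sample of 1,200 respondents without replacement;
    # every member has the same probability of being selected.
    random.seed(42)  # fixed seed so the example is reproducible
    sample = random.sample(population, k=1200)

    print(len(sample), sample[:10])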
Content
A. Description
B. Types of t-test
➢Paired t-test
➢One-sample t-test
➢Two-sample t-test
t-Test
▪ A t-test is an inferential statistic used to determine if there is a
significant difference between the means of two groups and
how they are related.
▪ T-tests are used when the data sets follow a normal distribution
and have unknown variances, like the data set recorded from
flipping a coin 100 times.
▪ It is often used in hypothesis testing to determine whether a
process or treatment actually has an effect on the population of
interest, or whether two groups are different from one another.
▪ The t-test was devised by
William Sealy Gosset
around 1908
▪ while he was an employee
of Guinness brewery in
Ireland.
▪ He used the t-test to
monitor the quality of the
stout brewed by Guinness.
t-Test Assumptions
The t-test is a parametric test of difference, meaning that it makes the same assumptions about your data as other parametric tests (normality, independence, and equal variances).
▪ If the measurements come from a single group observed twice (e.g. before and after a treatment), perform a paired t-test.
▪ If the groups come from two different populations (e.g. two different species, or people from two separate cities), perform a two-sample t-test (a.k.a. independent t-test).
▪ If there is one group being compared against a standard value (e.g. comparing the acidity of a liquid to a neutral pH of 7), perform a one-sample t-test.
Types of t-test
One-sample t-test
• Number of variables: one (a continuous measurement)
• Purpose: decide whether the population mean is equal to a specific value or not
• Example: mean heart rate of a group of people is equal to 65 or not
• Population standard deviation: unknown; use the sample standard deviation
• Degrees of freedom: number of observations in the sample minus 1, or n − 1

Two-sample t-test
• Number of variables: two (a continuous measurement, plus a categorical or nominal variable to define the groups)
• Purpose: decide whether the population means for two different groups are equal or not
• Example: mean heart rates for two groups of people are the same or not
• Population standard deviation: unknown; use the sample standard deviation for each group
• Degrees of freedom: sum of observations in the samples minus 2, or n1 + n2 − 2

Paired t-test
• Number of variables: two (a continuous measurement, plus a categorical or nominal variable to define the pairing within the group)
• Purpose: decide whether the difference between paired measurements for a population is zero or not
• Example: mean difference in heart rate for a group of people before and after exercise is zero or not
• Population standard deviation: unknown; use the sample standard deviation of the differences in paired measurements
• Degrees of freedom: number of paired observations in the sample minus 1, or n − 1
How to conduct a t-test
1. Define your null (H0) and alternative (H1) hypotheses.
2. Decide on the alpha level (commonly α = 0.05).
3. Calculate the test statistic:
   ▪ One-sample t-test: t = (x̅ − μ0) / (s / √n)
   ▪ Paired t-test: t = x̅d / (sd / √n), where x̅d and sd are the mean and standard deviation of the paired differences
   ▪ Two-sample t-test: t = (x̅1 − x̅2) / (sp √(1/n1 + 1/n2)), where sp is the pooled standard deviation
4. Calculate the degrees of freedom:
   ▪ One-sample t-test: df = n − 1
   ▪ Paired t-test: df = number of pairs − 1
   ▪ Two-sample t-test: df = n1 + n2 − 2
5. Determine the critical value from the t table and compare it with the computed t statistic.
For example :
According to this
table, for a two-tailed
test with an alpha
level of 0.05 at 40
degrees of freedom,
the critical value is
2.02.
Paired t-Test
▪ The paired t-test is a method used to test whether the mean difference
between pairs of measurements is zero or not.
▪ You can use the test when your data values are paired measurements. For
example, you might have before-and-after measurements for a group of
people. Also, the distribution of differences between the paired
measurements should be normally distributed.
Paired t-test example
▪ An instructor wants to use two exams in her
classes next year. This year, she gives both exams
to the students. She wants to know if the exams
are equally difficult and wants to check this by
looking at the differences between scores. If the
mean difference between scores for students is
“close enough” to zero, she will make a practical
conclusion that the exams are equally difficult.
3. We start by calculating our test statistic. To accomplish this, we need the average difference, the standard deviation of the differences, and the sample size.
The average score difference is: x̅d = 1.31
Next, we calculate the standard error for the score difference. The calculation is:
Standard Error = sd / √n = 7.00 / √16 = 7 / 4 = 1.75
We now have the pieces for our test statistic. We calculate our test statistic as:
t = average difference / standard error = 1.31 / 1.75 = 0.75
Determine Critical Value
According to this
table, for a paired t-
test with an alpha
level of 0.05 at 15
degrees of freedom,
the critical value is
2.131.
To make our decision, we compare the test statistic to a value from
the t-distribution.
Because 0.750 < 2.131, we cannot reject our idea that the mean score
difference is zero. We make a practical conclusion to consider exams as equally
difficult.
EXERCISE FOR PAIRED T-TEST

Consider a sample of 5 toys comparing the battery life of 2 brands. Is there a significant difference in the life length of the two brands of batteries?

Data:
TOY    BATTERY TYPE 1    BATTERY TYPE 2    DIFF.
1      52.6              61.4              -8.8
2      103.4             112.8             -9.4
3      68.2              67.1              1.1
4      88.4              92.3              -3.9
5      111.6             121.5             -9.9

Given:
n = 5
x̅d = -6.18 [ ∑difference / n ]
s = 4.73 [ √( ∑(x − x̅)² / (n − 1) ) ]
se = 2.11 [ s / √n ]

Steps:
1. H0: The mean difference in battery life is equal to zero.
2. Decide on the alpha. α = 0.05
3. Compute the t-statistic: t = x̅d / se = -6.18 / 2.11 = -2.93
4. Compute the degrees of freedom: df = n − 1 = 5 − 1 = 4
5. Determine the critical value: 2.776
6. Compare the t-statistic with the critical value: |−2.93| = 2.93 > 2.776
7. Decide: Reject H0, which means there is a significant difference between battery types 1 and 2.
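As a cross-check, the same paired comparison could be run in Python (a sketch, not part of the original exercise); scipy.stats.ttest_rel computes the paired t statistic and a two-tailed p-value directly from the two battery-life columns.

    from scipy import stats

    # Battery life for the same 5 toys under two battery types (from the table above)
    type1 = [52.6, 103.4, 68.2, 88.4, 111.6]
    type2 = [61.4, 112.8, 67.1, 92.3, 121.5]

    # Paired (dependent-samples) t-test on the differences type1 - type2
    t_stat, p_value = stats.ttest_rel(type1, type2)
    print(f"t = {t_stat:.2f}, p = {p_value:.3f}")   # t is about -2.93

    # Two-tailed critical value at alpha = 0.05 with df = n - 1 = 4
    t_crit = stats.t.ppf(1 - 0.05 / 2, df=4)
    print(f"critical value = {t_crit:.3f}")         # about 2.776
    print("reject H0" if abs(t_stat) > t_crit else "fail to reject H0")

Because |t| exceeds the critical value, the code reaches the same decision as the hand computation above.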
One sample t-test
▪ The one-sample t-test is a statistical hypothesis test used to determine
whether an unknown population mean is different from a specific value.
▪ You can use the test for continuous data. Your data should be a random
sample from a normal population.
One-sample t-test example
Imagine we have collected a random
sample of 31 energy bars from a number of
different stores to represent the population
of energy bars available to the general
consumer. The labels on the bars claim that
each bar contains 20 grams of protein.
If you look at the table, you see that
some bars have less than 20 grams of protein.
Other bars have more. You might think that
the data support the idea that the labels are
correct. Others might disagree. The statistical
test provides a sound method to make a
decision, so that everyone makes the same
decision on the same set of data values.
Data
Before jumping into analysis, we should take a quick look at the
data. The figure below shows a histogram and summary statistics
for the energy bars.
▪ From a quick look at the histogram, we see that
there are no unusual points, or outliers. The data
look roughly bell-shaped, so our assumption of a
normal distribution seems reasonable.
3. Calculate the t-statistic. Formula: t = (mean − hypothesized mean) / standard error of the mean
We start by finding the difference between the sample average and the hypothesized value:
21.40 − 20 = 1.40
Next, we calculate the standard error for the mean. The calculation is:
Standard error of the mean = standard deviation / √(total sample size)
se = 2.54 / √31 = 0.456
We now have the pieces for our test statistic. We calculate our test statistic as:
t = (mean − hypothesized mean) / standard error of the mean = 1.40 / 0.456 = 3.07
5. We find the value from the t-distribution based on our decision. For a t-test, we need
the degrees of freedom to find this value. The degrees of freedom are based on the
sample size.
For the energy bar data:
degrees of freedom = n−1 = 31−1 = 30.
The critical value of t with α = 0.05 and 30 degrees of freedom is +/- 2.042.
6. We compare the value of our statistic (3.07) to the t value. Since 3.07 > 2.042, we
reject the null hypothesis that the mean grams of protein is equal to 20. We make a
practical conclusion that the labels are incorrect, and the population mean grams of
protein is greater than 20.
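For readers who want to reproduce the arithmetic, here is a Python sketch (an illustration, assuming only the summary statistics quoted above: mean 21.40, SD 2.54, n = 31) that computes the one-sample t statistic and looks up the critical value.

    import math
    from scipy import stats

    # Summary statistics for the energy-bar sample (from the example above)
    sample_mean, sample_sd, n = 21.40, 2.54, 31
    hypothesized_mean = 20.0      # value claimed on the label
    alpha = 0.05

    # t = (sample mean - hypothesized mean) / standard error of the mean
    se = sample_sd / math.sqrt(n)
    t_stat = (sample_mean - hypothesized_mean) / se
    print(f"t = {t_stat:.2f}")                           # about 3.07

    # Two-tailed critical value with df = n - 1 = 30
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)
    print(f"critical value = ±{t_crit:.3f}")             # about 2.042
    print("reject H0" if abs(t_stat) > t_crit else "fail to reject H0")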
EXERCISE FOR ONE-SAMPLE T-TEST
Two-sample t-test
▪ The two-sample t-test (also known as the independent-samples t-test) is used to determine whether the means of two independent groups are equal or not.
▪ You can use the test when your data values are independent, are randomly sampled from two normal populations, and the two independent groups have equal variances.
Two-sample t-test example
▪ One way to measure a person’s fitness is to measure their body fat
percentage. Average body fat percentages vary by age, but according to
some guidelines, the normal range for men is 15-20% body fat, and the
normal range for women is 20-25% body fat.
▪ Our sample data is from a group of men and women who did workouts at
a gym three times a week for a year. Then, their trainer measured the
body fat.
▪ You can clearly see some overlap in the body fat
measurements for the men and women in our
sample, but also some differences. Just by
looking at the data, it's hard to draw any solid
conclusions about whether the underlying
populations of men and women at the gym have
the same mean body fat. That is the value of
statistical tests – they provide a common,
statistically valid way to make decisions, so that
everyone makes the same decision on the same
set of data values.
Data
▪ Without doing any testing, we can see that the averages for men and women in
our samples are not the same. But how different are they? Are the averages
“close enough” for us to conclude that mean body fat is the same for the larger
population of men and women at the gym? Or are the averages too different for
us to make this conclusion?
How to perform the two-sample t-test
Steps
1. H0: The mean body fat for men and women is equal.
2. Decide on the alpha. α = 0.05
3. Calculate the test statistic. This calculation begins with finding the difference between the two averages: 7.34.
Next, we compute the pooled variance, which works out to 38.87, and take its square root to get the pooled standard deviation (about 6.23).
We now have all the pieces for our test statistic: the difference of the averages, the pooled standard deviation, and the sample sizes.
t = difference of averages / standard error of the difference = 7.34 / 2.62 = 2.80
According to this
table, for a two-tailed
test with an alpha
level of 0.05 at 21
degrees of freedom,
the critical value is
2.08.
To evaluate the difference between the means in order to make a decision about our gym
programs, we compare the test statistic to a theoretical value from the t-distribution.
6. We compare the value of our statistic (2.80) to the t value. Since 2.80 > 2.080, we
reject the null hypothesis that the mean body fat for men and women are equal, and
conclude that we have evidence body fat in the population is different between men and
women.
EXERCISE FOR TWO-SAMPLE T-TEST

Andro grows tomatoes in two separate fields. When the tomatoes are ready to be picked, he is curious as to whether the sizes of tomatoes differ between the 2 fields. He takes random samples of plants from each field and measures the heights of the plants.

1. H0: The mean size in field A is equal to the mean size in field B.
2. Decide on the alpha. α = 0.05
3. Compute the t-statistic:
   Pooled variance: sp² = [ (n1 − 1)s1² + (n2 − 1)s2² ] / (n1 + n2 − 2)
   t = (x̅1 − x̅2) / ( sp √(1/n1 + 1/n2) )
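Since the exercise does not list the measurements, the sketch below uses made-up plant heights for the two fields (purely illustrative values) to show how the pooled two-sample t-test could be run in Python.

    from scipy import stats

    # Hypothetical plant heights (cm) from the two fields (illustrative only)
    field_a = [18.2, 20.1, 19.5, 21.0, 18.8, 20.4]
    field_b = [19.9, 21.3, 22.0, 20.7, 21.8, 22.4]

    # Independent two-sample t-test with pooled variance (equal_var=True)
    t_stat, p_value = stats.ttest_ind(field_a, field_b, equal_var=True)
    print(f"t = {t_stat:.2f}, p = {p_value:.3f}")

    # Compare |t| with the two-tailed critical value, df = n1 + n2 - 2
    t_crit = stats.t.ppf(1 - 0.05 / 2, df=len(field_a) + len(field_b) - 2)
    print("reject H0" if abs(t_stat) > t_crit else "fail to reject H0")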
Z-Test

How to solve a one-sample Z-test?
Step 1 – identify the sample size and whether the population standard deviation is known.
Step 2 – state your null and alternate hypotheses.
Step 3 – state the alpha level. If you aren’t given one, use 0.05.
Step 4 – find Z using the one-sample Z-test formula: Z = (x̅ − μ0) / (σ / √n)
Step 5 – conclude.
Example (Left-tailed one-sample z-test)
▪ Answer:
▪ Reject the null hypothesis
▪ As -4.66 < -1.645 thus, the null
hypothesis is rejected, and it is
concluded that there is enough
evidence to support the
medicine shop's claim.
One-sample Z-test
Null Hypothesis: H0 : μ = μ0
Alternate Hypothesis: H1 : μ ≠ μ0
Decision Criteria: If |z| > the z critical value, then reject the null hypothesis.
Example (Right-tailed one-sample z-test)
Answer:
Reject the null hypothesis
As 3.6 > 1.645 thus, the null
hypothesis is rejected, and it is
concluded that there is enough
evidence to support the teacher's
claim.
Two-sample Z-test
Problem:
The amount of a certain trace element in blood is known to vary with a
standard deviation of 14.1 ppm (parts per million) for male blood donors
and 9.5 ppm for female donors. Random samples of 75 male and 50
female donors yield concentration means of 28 and 33 ppm, respectively.
What is the likelihood that the population means of concentrations of
the element are the same for men and women?
Example of Two-sample Z-test
How to solve?
Step 1 – identify the sample sizes and whether the population standard deviations are known.
Step 2 – state your null and alternate hypotheses.
Step 3 – find Z using the two-sample Z-test formula: Z = (x̅1 − x̅2) / √(σ1²/n1 + σ2²/n2)
Step 4 – conclude.
Example of Two-sample Z-test
Answer:
Reject the null hypothesis.
Z = (28 − 33) / √(14.1²/75 + 9.5²/50) = −2.37
The computed Z-value is negative because the (larger) mean for females was subtracted from the (smaller) mean for males. But the order of the samples in this computation is arbitrary – it could be in the opposite order, in which case Z would be 2.37 instead of −2.37. Because the Z-score is more extreme than the two-tailed critical value at the 5% alpha level (±1.96), the null hypothesis is rejected.
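The trace-element comparison can be reproduced with a few lines of Python (a sketch using only the figures given in the problem; scipy does not ship a dedicated z-test helper, so the statistic is computed from the formula directly).

    import math
    from scipy.stats import norm

    # Given values from the problem
    mean_m, sd_m, n_m = 28.0, 14.1, 75   # male donors
    mean_f, sd_f, n_f = 33.0, 9.5, 50    # female donors

    # Two-sample z statistic: (x̅1 - x̅2) / sqrt(σ1²/n1 + σ2²/n2)
    z = (mean_m - mean_f) / math.sqrt(sd_m**2 / n_m + sd_f**2 / n_f)
    print(f"z = {z:.2f}")                            # about -2.37

    # Two-tailed critical value and p-value at alpha = 0.05
    z_crit = norm.ppf(1 - 0.05 / 2)                  # about 1.96
    p_value = 2 * norm.sf(abs(z))
    print(f"critical = ±{z_crit:.2f}, p = {p_value:.3f}")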
Let’s test your understanding
EXERCISE:
A principal at a school claims that the students in his school are above
average intelligence. A random sample of 30 students’ IQ scores have a
mean score of 112.5. Is there sufficient evidence to support the
principal’s claim? The mean population IQ is 100 with a standard
deviation of 15.
Exercise:
Solution:
Z = (112.5 − 100) / (15 / √30) = 4.56
Answer:
Reject the null hypothesis. As 4.56 > 1.645, the null hypothesis is rejected, and it is concluded that there is enough evidence to support the principal's claim.
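The same exercise in Python (a sketch; the one-sample z statistic is computed from the formula, and the right-tailed critical value comes from the standard normal distribution):

    import math
    from scipy.stats import norm

    # Principal's claim: school mean IQ is above the population mean of 100
    sample_mean, n = 112.5, 30
    pop_mean, pop_sd = 100.0, 15.0

    # One-sample z statistic: (x̅ - μ0) / (σ / √n)
    z = (sample_mean - pop_mean) / (pop_sd / math.sqrt(n))
    print(f"z = {z:.2f}")                    # about 4.56

    # Right-tailed test at alpha = 0.05
    z_crit = norm.ppf(1 - 0.05)              # about 1.645
    print("reject H0" if z > z_crit else "fail to reject H0")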
Statistical Tests
A. Parametric
• One-Way ANOVA
Analysis of Variance (ANOVA) Test
• ANOVA test can be defined as a type of test used in hypothesis testing to
compare whether the means of two or more groups are equal or not.
• This test is used to check if the null hypothesis can be rejected or not depending
upon the statistical significance exhibited by the parameters. The decision is
made by comparing the ANOVA test statistic with the critical value.
• An ANOVA test can be either one-way or two-way depending upon the number
of independent variables.
One-way vs Two-way ANOVA Differences Chart
What is being compared?
• One-way ANOVA: the means of three or more groups of one independent variable on a dependent variable.
• Two-way ANOVA: the effect of multiple groups of two independent variables on a dependent variable and on each other.
At what temperature is it ideal for students to take their exam – cold (15°),
moderate (25°), or hot (35°)?
n    Cold (15°)    Moderate (25°)    Hot (35°)
1    4             8                 2
2    7             6                 2
3    5             7                 3
4    3             6                 4
5    5             5                 1
Step 1: Compute the mean of each group.
Mean (15°) = 24/5 = 4.8    Mean (25°) = 32/5 = 6.4    Mean (35°) = 12/5 = 2.4

Step 3: Compute ∑X1², ∑X2², and ∑X3².
X1 (15°)    X1²    X2 (25°)    X2²    X3 (35°)    X3²
4           16     8           64     2           4
7           49     6           36     2           4
5           25     7           49     3           9
3           9      6           36     4           16
5           25     5           25     1           1
∑X1² = 124         ∑X2² = 210         ∑X3² = 34
Step 4: Complete the ANOVA summary table.

Source                 SS       df    MS        F
Between Treatments     40.53    2     20.265    12.67
Within Treatments      19.2     12    1.6
Total                  59.73    14
Step 5: Find the tabular (critical) value of F.
Step 6: Compare the F statistic (computed value) to the critical value.
F = 12.67
table F = 3.89
Conclusion:
Since 12.67 > 3.89, H0 is rejected and we accept H1: there is a significant difference in the exam scores between the three groups.
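The same one-way ANOVA can be checked in Python (a sketch using the exam scores from the table above); scipy.stats.f_oneway returns the F statistic and p-value, and scipy.stats.f.ppf gives the critical value.

    from scipy import stats

    # Exam scores under the three temperature conditions (from the table above)
    cold = [4, 7, 5, 3, 5]       # 15°
    moderate = [8, 6, 7, 6, 5]   # 25°
    hot = [2, 2, 3, 4, 1]        # 35°

    # One-way ANOVA: F = MS_between / MS_within
    f_stat, p_value = stats.f_oneway(cold, moderate, hot)
    print(f"F = {f_stat:.2f}, p = {p_value:.4f}")      # F is about 12.67

    # Critical value at alpha = 0.05 with df = (2, 12)
    f_crit = stats.f.ppf(0.95, dfn=2, dfd=12)
    print(f"critical F = {f_crit:.2f}")                # about 3.89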
The Mozart effect refers to a boost of average performance on tests for elementary school students if the students listen
to Mozart’s chamber music for a period of time immediately before the test. Many educators believe that such an effect is
not necessarily due to Mozart’s music per se but rather a relaxation period before the test. To support this belief, an
elementary school teacher conducted an experiment by dividing her third-grade class of 15 students into three groups of
5. Students in the first group were asked to give themselves a self-administered facial massage; students in the second
group listened to Mozart’s chamber music for 15 minutes; students in the third group listened to Schubert’s chamber
music for 15 minutes before the test. The scores of the 15 students are given below:
X1    X1²    X2    X2²    X3    X3²
7     49     8     64     8     64
7     49     8     64     8     64
8     64     9     81     7     49
9     81     9     81     9     81
8     64     9     81     9     81
T = 39        T = 43        T = 41
∑X1² = 307    ∑X2² = 371    ∑X3² = 339
M = 7.8       M = 8.6       M = 8.2

k = 3, n = 5, N = 15, G = 123, ∑X² = 1017
F = 1.412
table F = 3.89
Conclusion:
Since 1.412 < 3.89, H0 is not rejected: there is no significant difference in the test scores between the three groups who participated in the experiment.
Statistical Tests
A. Parametric
• Two-Way ANOVA
Two-way ANOVA
The results are summarized in a table whose rows are Between Row, Between Column, Within Error, and Total, and whose columns are df, SS, MSS, F, and F-crit.

Where:
df = degrees of freedom
SS = sum of squares
MSS = mean sum of squares
F = F-ratio
F-crit = critical F value from the table
Example 1
A farmer applied three types of fertilizer on 4 separate plots for his cultivation. The figures on yield per acre are tabulated below.

YIELD
FERTILIZER     A    B    C    D
NITROGEN       6    4    8    6
PHOSPHORUS     7    6    6    9
POTASSIUM      8    5    10   9
Steps 1–2: Find the degrees of freedom and the sums of squares for each source.
Step 3: Find the mean sum of squares: MSS = SS / df
Step 4: Find the F-ratios: F = MSS / MSSWE

                  df    SS    MSS      F
Between Row       2     8     4        2.4
Between Column    3     18    6        3.6
Within Error      6     10    1.667
Total             11    36

FR = MSSR / MSSWE = 4 / 1.667 = 2.4
FC = MSSC / MSSWE = 6 / 1.667 = 3.6
Step 5: Find the critical F values from the F table.
F-crit for between row (Fertilizer): dfR = 2, dfWE = 6
F-crit for between column (Plot): dfC = 3, dfWE = 6
Conclusion:

                  df    SS    MSS      F      F-crit
Between Row       2     8     4        2.4    5.14
Between Column    3     18    6        3.6    4.76
Within Error      6     10    1.667
Total             11    36

Fertilizer: Since 2.4 < 5.14, H0 is not rejected. The fertilizer does not have a significant effect on the yield.
Plot: Since 3.6 < 4.76, H0 is not rejected. The plot does not have a significant effect on the yield.
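A short numpy sketch (not from the original slides) reproduces the fertilizer table; it computes the row, column, and error sums of squares directly from the definitions used above for a two-way ANOVA without replication.

    import numpy as np
    from scipy import stats

    # Yield per acre: rows = fertilizers (N, P, K), columns = plots (A, B, C, D)
    y = np.array([[6, 4, 8, 6],
                  [7, 6, 6, 9],
                  [8, 5, 10, 9]], dtype=float)

    r, c = y.shape
    grand_mean = y.mean()

    # Sums of squares
    ss_rows = c * ((y.mean(axis=1) - grand_mean) ** 2).sum()     # 8.0
    ss_cols = r * ((y.mean(axis=0) - grand_mean) ** 2).sum()     # 18.0
    ss_total = ((y - grand_mean) ** 2).sum()                     # 36.0
    ss_error = ss_total - ss_rows - ss_cols                      # 10.0

    df_rows, df_cols = r - 1, c - 1
    df_error = df_rows * df_cols

    f_rows = (ss_rows / df_rows) / (ss_error / df_error)         # 2.4
    f_cols = (ss_cols / df_cols) / (ss_error / df_error)         # 3.6

    print(f_rows, stats.f.ppf(0.95, df_rows, df_error))          # 2.4 vs about 5.14
    print(f_cols, stats.f.ppf(0.95, df_cols, df_error))          # 3.6 vs about 4.76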
Example 2: https://forms.gle/bnrzKmaWg4nN8moR6

Does gender or the type of social media platform have an effect on how many hours a student spends on social media?
H0: There is no significant effect on the number of hours.
Data (hours per platform, with row means x̅R):
Gender    Facebook    Tiktok    Youtube    x̅R
Female    3           1         4          2.667
Female    6           0         1          2.333
Female    0           1         2          1
Female    8           0         2          3.333
Female    0           0         2          0.667
Female    1           0         3          1.333
Female    1           2         1          1.333
Female    2           3         0          1.667
Female    2           5         1          2.667
Male      2           0.5       0          0.833
Male      5           3         6          4.667
Male      1           0         0.5        0.5
Male      2           0         2          1.333
Male      1           1         1          1
Male      2           0         1          1
Male      8           0.1       8          5.367
Male      2           2         2          2

Step 1: Find the degrees of freedom.
dfBetweenRow = No. of rows – 1 = 17
dfBetweenColumn = No. of columns – 1 = 2
dfTotal = No. of samples – 1 = 53
dfWithinError = dfT – (dfBR + dfBC) = 34

Step 2: Find the sums of squares.
SSR = ∑ nR (x̅R – x̅G)² = 93.155
SSC = ∑ nC (x̅C – x̅G)² = 24.663
SSTotal = ∑ (x – x̅G)² = 232.095
SSWE = SSTotal – (SSR + SSC) = 114.277
Steps 3–4: Find the mean sums of squares and the F-ratios.
MSS = SS / df        F = MSS / MSSWE

                  df    SS         MSS      F
Between Row       17    93.155     5.48     1.63
Between Column    2     24.663     12.33    3.67
Within Error      34    114.277    3.36
Total             53    232.095

FR = MSSR / MSSWE = 5.48 / 3.36 = 1.63
FC = MSSC / MSSWE = 12.33 / 3.36 = 3.67
Step 5: F-crit value for between row (Gender) dfR = 17
dfWE = 34
Step 5: F-crit value for between column (Social Media Platforms)
dfC = 2
dfWE = 34
Conclusion:

                  df    SS         MSS      F       F-crit
Between Row       17    93.155     5.48     1.63    1.933
Between Column    2     24.663     12.33    3.67    3.276
Within Error      34    114.277    3.36
Total             53    232.095

Gender: Since 1.63 < 1.933, H0 is not rejected. Gender does not have a significant effect on how many hours a student spends on social media.
Social media platforms: Since 3.67 > 3.276, H0 is rejected. The type of social media platform has a significant effect on how many hours a student spends on social media.
Statistical Tests
B. Non-Parametric
• Chi-Square Test
The Chi-Square test is a statistical procedure for determining the difference between observed and expected data. It can also be used to determine whether two categorical variables in our data are related. It helps to find out whether a difference between two categorical variables is due to chance or to a relationship between them. It is also called Pearson’s Chi-Square Test or the Chi-Square Test of Association.
χ²c = ∑ (O − E)² / E
where:
c = degrees of freedom
O = observed value
E = expected value
The degrees of freedom in a statistical calculation represent the number of variables that can
vary in a calculation. The degrees of freedom can be calculated to ensure that chi-square tests
are statistically valid. These tests are frequently used to compare observed data with data that
would be expected to be obtained if a particular hypothesis were true.
The Expected values are the frequencies expected, based on the null hypothesis.
Chi-square is a statistical test that examines the differences between categorical variables from a
random sample in order to determine whether the expected and observed results are well-fitting.
A Chi-Square statistic test is calculated based on the data, which must be raw, random, drawn
from independent variables, drawn from a wide-ranging sample and mutually exclusive. In simple
terms, two sets of statistical data are compared. Karl Pearson introduced this test in 1900 for
categorical data analysis and distribution. This test is also known as ‘Pearson’s Chi-Squared Test’.
Chi-Squared Tests are most commonly used in hypothesis testing. A hypothesis is an assumption
that any given condition might be true, which can be tested afterwards. The Chi-Square test
estimates the size of inconsistency between the expected results and the actual results when the
size of the sample and the number of variables in the relationship is mentioned.
These tests use degrees of freedom to determine if a particular null hypothesis can be rejected based on the total number of observations made in the experiments. The larger the sample size, the more reliable the result.
When to Use a Chi-Square Test?
A Chi-Square Test is used to examine whether the observed results are in line with the expected values. When the data to be analysed come from a random sample, and when the variable in question is a categorical variable, the Chi-Square test is the most appropriate test. A categorical variable consists of selections such as breeds of dogs, types of cars, genres of movies, educational attainment, male vs. female, etc. Survey responses and questionnaires are the primary sources of these types of data, so the Chi-Square test is most commonly used for analysing this kind of data. This type of analysis is helpful for researchers who are studying survey response data; the research can range from customer and marketing research to political science and economics.
There are Two Main Types of Chi-Square Tests, namely :
1. Independence
2. Goodness-of-Fit
Independence
The Chi-Square Test of Independence is an inferential statistical test which examines whether two categorical variables are likely to be related to each other or not. This test is used when we have counts of values for two nominal or categorical variables and is considered a non-parametric test. A relatively large sample size and independence of observations are the required criteria for conducting this test.
For example:
In a movie theatre, suppose we made a list of movie genres; let us consider this the first variable. The second variable is whether or not the people who came to watch those genres of movies bought snacks at the theatre. Here the null hypothesis is that the genre of the film and whether people bought snacks or not are unrelated. If this is true, the movie genres don’t impact snack sales.
Goodness-Of-Fit
In statistical hypothesis testing, the Chi-Square Goodness-of-Fit test determines whether a variable is likely to come from a given distribution or not. We must have a set of data values and an idea of the distribution of the data. We can use this test when we have value counts for a categorical variable. This test demonstrates a way of deciding whether the data values are a “good enough” fit for our idea, or whether they are a representative sample of the entire population.
For example:
Suppose we have bags of balls with five different colours in each bag. The given condition is that each bag should contain an equal number of balls of each colour. The idea we would like to test here is that the proportions of the five colours of balls in each bag are equal.
Who Use Chi-Square Analysis?
Chi-square is most commonly used by researchers who are studying survey response data
because it applies to categorical variables. Demography, consumer and marketing research,
political science, and economics are all examples of this type of research.
Example
Let's say you want to know if gender has anything to do with political party preference. You poll
440 voters in a simple random sample to find out which political party they prefer. The results of
the survey are shown in the table below:
To see if gender is linked to political party preference, perform a Chi-Square test of independence
using the steps below.
Similarly, you can calculate the expected value for each of the cells.
Step 3: Calculate (O-E)2 / E for Each Cell in the Table
Now you will calculate the (O - E)2 / E for each cell in the table.
Where:
O=Observed Value
E = Expected Value
Step 4: Calculate the Test Statistic: X² = 9.837
Before you can conclude, you must determine first the critical statistic, which requires determining our
degrees of freedom. The degrees of freedom in this case are equal to the table's number of columns minus
one multiplied by the table's number of rows minus one, or (r-1) (c-1). We have (3-1)(2-1) = 2.
Finally, you compare the obtained statistic to the critical statistic found in the chi-square table. For an alpha level of 0.05 and two degrees of freedom, the critical statistic is 5.991, which is less than the obtained statistic of 9.837. You can reject the null hypothesis because the obtained statistic is higher than the critical statistic.
This means you have sufficient evidence to say that there is an association between gender and political party
preference.
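For illustration, here is a Python sketch of a chi-square test of independence. The 2×3 gender-by-party table below is made up (the slide’s actual survey counts are not reproduced here), so only the procedure, not the numbers, mirrors the example.

    from scipy.stats import chi2_contingency

    # Hypothetical observed counts: rows = gender, columns = party preference
    # (made-up numbers totalling 440 respondents)
    observed = [[120, 90, 40],    # male
                [110, 50, 30]]    # female

    # chi2_contingency computes the expected counts, the X² statistic,
    # the p-value, and the degrees of freedom (r - 1)(c - 1)
    chi2, p_value, dof, expected = chi2_contingency(observed)
    print(f"X² = {chi2:.2f}, df = {dof}, p = {p_value:.4f}")
    print(expected)   # expected frequencies under independence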
Chi-Square Distribution
In statistical analysis, the Chi-Square distribution is used in many hypothesis tests and is determined by the parameter k, the degrees of freedom. It belongs to the family of continuous probability distributions: the sum of the squares of k independent standard normal random variables follows the Chi-Square distribution with k degrees of freedom. Pearson’s Chi-Square Test formula is χ² = ∑ (O − E)² / E.
When k is 1 or 2, the Chi-Square distribution curve is shaped like a backwards ‘J’, meaning there is a high chance that X² is close to zero.
When k is greater than 2, the distribution curve looks like a hump, with a low probability that X² is very near to 0 or very far from 0. The distribution extends much longer on the right-hand side than on the left-hand side. The most probable value of X² is (k − 2).
When k is greater than about ninety, the Chi-Square distribution is well approximated by a normal distribution.
Example
Expected Frequency:
Region       Medium    Hot      Very Hot    Total
Bicolanos    21.00     19.86    22.14       63
Manilenos    16.00     15.14    16.86       48
Total        37        35       39          111

χ² = ∑ (O − E)² / E
Expected value = (row sum × column sum) / grand total
Since the chi-square value of 2.03 is less than the critical value of
5.991, then the null hypothesis is not rejected. Meaning, there is
not enough evidence to support the claim that the spiciness level
selected by an individual is dependent on their respective
ethnicity.
Statistical Tests
B. Non-Parametric
Non-Parametric Tests
Advantages of Non-Parametric Tests
▪ More statistical power when assumptions for the parametric tests have
been violated. When assumptions haven’t been violated, they can be
almost as powerful.
▪ Fewer assumptions (i.e. the assumption of normality doesn’t apply).
▪ Small sample sizes are acceptable.
▪ They can be used for all data types, including nominal variables, interval
variables, or data that has outliers or that has been measured imprecisely.
Disadvantages of Non-Parametric Tests
Friedman Test
You might be interested to know if therapy after a slipped disc has any effect on the perception of pain.
Friedman Test Calculation
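No calculation survives on this slide, so here is a hedged Python sketch of how the Friedman test could be run on repeated pain ratings (three time points for the same patients); the ratings below are invented purely for illustration.

    from scipy.stats import friedmanchisquare

    # Hypothetical pain ratings (0-10) for 6 patients, measured at three
    # time points of the therapy (related / repeated measurements)
    before = [8, 7, 9, 6, 8, 7]
    middle = [6, 6, 7, 5, 7, 6]
    after  = [4, 5, 5, 3, 6, 4]

    # Friedman test: non-parametric alternative to repeated-measures ANOVA
    stat, p_value = friedmanchisquare(before, middle, after)
    print(f"chi² = {stat:.2f}, p = {p_value:.4f}")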
Mann-Whitney Test
You measured the reaction time of a small group of men and women and
want to know if there is a difference.
Gender    Reaction Time    Rank
Female    34               2
Female    36               4
Female    41               7
Female    43               9
Female    44               10
Female    37               5
Male      45               11
Male      33               1
Male      35               3
Male      39               6
Male      42               8

Calculation of the rank sums:
For females, T1 = 2 + 4 + 7 + 9 + 10 + 5 = 37
For males, T2 = 11 + 1 + 3 + 6 + 8 = 29
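The same comparison can be run in Python (a sketch using the reaction times listed above); scipy.stats.mannwhitneyu computes the U statistic, which is derived from the rank sums calculated by hand above.

    from scipy.stats import mannwhitneyu

    # Reaction times from the table above
    females = [34, 36, 41, 43, 44, 37]
    males = [45, 33, 35, 39, 42]

    # Two-sided Mann-Whitney U test (non-parametric alternative to the
    # independent two-sample t-test)
    u_stat, p_value = mannwhitneyu(females, males, alternative='two-sided')
    print(f"U = {u_stat}, p = {p_value:.3f}")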
Kruskal Wallis Test
You have measured the reaction time of three groups and want to know if
there is a difference between the groups.
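No data table survives for this example, so the sketch below uses invented reaction times for three groups to show how the Kruskal-Wallis H test is run in Python.

    from scipy.stats import kruskal

    # Hypothetical reaction times (ms) for three independent groups
    group_a = [310, 295, 330, 305, 320]
    group_b = [340, 335, 350, 345, 330]
    group_c = [300, 290, 310, 295, 305]

    # Kruskal-Wallis H test: non-parametric alternative to one-way ANOVA,
    # based on the ranks of all observations pooled together
    h_stat, p_value = kruskal(group_a, group_b, group_c)
    print(f"H = {h_stat:.2f}, p = {p_value:.4f}")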