Chi-Square Test of Independence

The chi-square test of independence is used to determine if there is a relationship between two categorical variables. It compares the observed frequencies in each category to the expected frequencies if the variables were independent. The document provides steps for conducting a chi-square test of independence including: 1) calculating expected frequencies, 2) stating the null and alternative hypotheses, 3) computing the test statistic, and 4) making a decision about whether to reject the null hypothesis based on the significance level. An example calculates expected frequencies and performs a chi-square test to examine the relationship between knowledge of gum disease and hygienic instruments used.

Uploaded by

Safi Sheikh
Copyright
© All Rights Reserved

Chi-square Test of Independence

Objectives
➢Reviewing the Concept of Independence

➢Chi-square Test of Independence

➢Chi-square Tests in SPSS


Chi-square Test of Independence

➢ The chi-square test of independence is probably the most frequently used hypothesis test in the social sciences.

➢ In this exercise, we will use the chi-square test of independence to evaluate group differences when the test variable is nominal, dichotomous (two values), ordinal, or grouped interval.

➢ Many sets of nominal and ordered data may be grouped according to two or more ways of classification, and we often want to know whether or not these variables are independent of one another.

Independence Defined
➢ Two variables are independent if, for all cases, the classification of a case
into a particular category of one variable (the group variable) has no effect
on the probability that the case will fall into any particular category of the
second variable (the test variable).
➢ When two variables are independent, there is no relationship between
them. We would expect the frequency breakdowns of the test
variable to be similar for all groups.

The Knowledge of             Hygiene Instrument Used
the Gum Disease      Brush   Miswak   Brush and Miswak   Total
Yes                    20      10            10            40
No                     15      10             5            30
Not Sure                5      10            15            30
Total                  40      30            30           100

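To make the definition concrete, the following Python sketch computes the breakdown of instrument use within each knowledge group for the table above. Treating knowledge of gum disease as the group variable is an assumption made for illustration, and the function name `row_proportions` is hypothetical.

```python
# Observed counts from the table above.
# Rows: knowledge of gum disease; columns: hygiene instrument used.
observed = {
    "Yes":      {"Brush": 20, "Miswak": 10, "Brush and Miswak": 10},
    "No":       {"Brush": 15, "Miswak": 10, "Brush and Miswak": 5},
    "Not Sure": {"Brush": 5,  "Miswak": 10, "Brush and Miswak": 15},
}


def row_proportions(table):
    """Return each row's counts divided by that row's total."""
    result = {}
    for group, counts in table.items():
        total = sum(counts.values())
        result[group] = {name: count / total for name, count in counts.items()}
    return result


proportions = row_proportions(observed)
for group, props in proportions.items():
    print(group, {name: round(p, 2) for name, p in props.items()})
```

If the two variables were independent, the three printed breakdowns would be roughly the same. Here half of the "Yes" group uses a brush only, while half of the "Not Sure" group uses brush and miswak, which is the kind of difference the test evaluates.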
Expected Frequencies

➢ Because the proportion of subjects in each category of the group variable can differ, we take group category into account in computing expected frequencies as well.

➢ To summarize, the expected frequencies for each cell are computed to be proportional to both the breakdown for the test variable and the breakdown for the group variable.

Observed frequencies

Second                 First attribute
attribute      1      2      3     . .     c      Total
1             O11    O12    O13    . .    O1c     O1.
2             O21    O22    O23    . .    O2c     O2.
3             O31    O32    O33    . .    O3c     O3.
.              .      .      .     . .     .       .
r             Or1    Or2    Or3    . .    Orc     Or.
Total         O.1    O.2    O.3    . .    O.c     O..

We want to examine whether there is any association between knowledge of gum disease and the hygiene instrument used. The data are given in the following table.

The Knowledge of             Hygiene Instrument Used
the Gum Disease      Brush   Miswak   Brush and Miswak   Total
Yes                    20      10            10            40
No                     15      10             5            30
Not Sure                5      10            15            30
Total                  40      30            30           100

Expected frequencies
Expected frequencies are calculated from the following formula:

$$E_{ij} = \frac{O_{i\cdot} \times O_{\cdot j}}{O_{\cdot\cdot}} = \frac{i\text{th row total} \times j\text{th column total}}{\text{Grand total}}$$

Second                 First attribute
attribute      1      2      3     . .     c      Total
1             E11    E12    E13    . .    E1c     E1.
2             E21    E22    E23    . .    E2c     E2.
3             E31    E32    E33    . .    E3c     E3.
.              .      .      .     . .     .       .
r             Er1    Er2    Er3    . .    Erc     Er.
Total         E.1    E.2    E.3    . .    E.c     E..

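The row-total times column-total rule can be sketched in a few lines of Python. This is a minimal illustration using the gum-disease counts from this document; the function name `expected_frequencies` is hypothetical.

```python
# Observed counts; rows are Yes / No / Not Sure,
# columns are Brush / Miswak / Brush and Miswak.
observed = [
    [20, 10, 10],
    [15, 10, 5],
    [5, 10, 15],
]


def expected_frequencies(table):
    """E_ij = (i-th row total * j-th column total) / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


expected = expected_frequencies(observed)
for row in expected:
    print(row)
```

The expected table keeps the same row and column totals as the observed table, which is why the margins of the E table match those of the O table.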
Expected Frequencies versus Observed Frequencies

➢ The chi-square test of independence plugs the observed frequencies and expected frequencies into a formula which computes how the pattern of observed frequencies differs from the pattern of expected frequencies.

Independent and Dependent Variables

➢ The two variables in a chi-square test of independence each play a specific role.

➢ The group variable is also known as the independent variable because it has an influence on the test variable.

➢ The test variable is also known as the dependent variable because its value is believed to be dependent on the value of the group variable.

Step 1. Assumptions for the Chi-square Test

➢ The chi-square test of independence can be used for variables at any level of measurement, including interval-level variables grouped in a frequency distribution. It is most useful for nominal variables, for which we do not have another option.

Assumptions:

➢ All expected frequencies are at least 1.

➢ At most 20% of the expected frequencies are less than 5.

If these assumptions are violated, the chi-square distribution will give us misleading probabilities.

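The two assumptions can be checked mechanically once the expected frequencies are known. The sketch below is one possible way to do it in Python; the function name `check_assumptions` and the small second table are hypothetical, while the first table uses the expected counts (16, 12 and 9) reported in this document's gum-disease example.

```python
def check_assumptions(expected):
    """Check: every expected count >= 1, and at most 20% of them < 5."""
    cells = [e for row in expected for e in row]
    min_expected = min(cells)
    share_below_5 = sum(e < 5 for e in cells) / len(cells)
    return {
        "min_expected": min_expected,
        "share_below_5": share_below_5,
        "ok": min_expected >= 1 and share_below_5 <= 0.20,
    }


# Expected counts from the gum-disease example in this document.
expected = [[16.0, 12.0, 12.0], [12.0, 9.0, 9.0], [12.0, 9.0, 9.0]]
report = check_assumptions(expected)
print(report)

# A made-up table that violates both assumptions, for contrast.
sparse_report = check_assumptions([[0.5, 6.0], [7.0, 8.0]])
print(sparse_report)
```

For the gum-disease data no expected count is below 5 and the minimum is 9, so both assumptions hold.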
Step 2. Hypotheses and alpha

➢ The alternative hypothesis states that the two variables are dependent or related. The sample supports it when the observed counts for the categories of the variables differ from the expected counts.

➢ The null hypothesis is that the two variables are independent. The sample is consistent with it when the observed counts are similar to the expected counts.

➢ The amount of difference needed to decide between difference and similarity is the amount corresponding to the alpha level of significance, which will be either 0.05 or 0.01. The value to use will be stated in the problem.

Step 3. Sampling distribution and test statistic

➢ To test the relationship, we use the chi-square test statistic, which follows the chi-square distribution.

➢ If we were calculating the statistic by hand, we would have to compute the degrees of freedom to identify the probability of the test statistic. SPSS will print out the degrees of freedom and the probability of the test statistic for us.

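For readers who want to see what the by-hand route involves, here is a rough Python sketch. It computes the degrees of freedom as (rows - 1)(columns - 1) and the upper-tail probability using a closed form that is valid only for an even number of degrees of freedom; that restriction is a simplification chosen for this illustration, and both function names are hypothetical. The value 12.5 is the Pearson chi-square reported for the 3 x 3 gum-disease table in this document.

```python
import math


def degrees_of_freedom(n_rows, n_cols):
    """Degrees of freedom for an r x c contingency table."""
    return (n_rows - 1) * (n_cols - 1)


def chi2_upper_tail_even_df(x, df):
    """P(X > x) for a chi-square variable with EVEN df.

    Uses exp(-x/2) * sum_{k < df/2} (x/2)^k / k!, which holds only
    when df is a positive even integer.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2
    terms = (half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * sum(terms)


df = degrees_of_freedom(3, 3)
p_value = chi2_upper_tail_even_df(12.5, df)
print(df, round(p_value, 3))
```

For odd degrees of freedom a statistics library would be the practical choice, for example `scipy.stats.chi2.sf` if SciPy is available.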
Step 4. Computing the Test Statistic

➢ Conceptually, the chi-square test of independence statistic is computed by summing, over all cells in the table, the squared difference between the observed and expected frequencies divided by the expected frequency for the cell.

➢ We identify the value and probability for this test statistic from the SPSS statistical output.

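The same statistic can be reproduced outside SPSS. The following Python sketch sums (O - E)^2 / E over every cell, using the observed and expected counts from this document's gum-disease example; the function name `chi_square_statistic` is hypothetical.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all cells of the table."""
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


observed = [[20, 10, 10], [15, 10, 5], [5, 10, 15]]
expected = [[16, 12, 12], [12, 9, 9], [12, 9, 9]]

statistic = chi_square_statistic(observed, expected)
print(round(statistic, 3))
```

The result agrees with the Pearson Chi-Square value of 12.500 in the SPSS output for this example.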
Step 5. Decision and Interpretation

➢ If the probability of the test statistic is less than or equal to the probability of the alpha error rate, we reject the null hypothesis and conclude that our data support the research hypothesis. We conclude that there is a relationship between the variables.

➢ If the probability of the test statistic is greater than the probability of the alpha error rate, we fail to reject the null hypothesis. We conclude that the data do not provide sufficient evidence of a relationship between the variables.
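The decision rule is simple enough to write down as a small Python function. This is only a sketch of the rule as stated above; the function name `decide` and the returned strings are hypothetical.

```python
def decide(p_value, alpha=0.05):
    """Apply the rule: reject H0 when p_value <= alpha."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must be between 0 and 1")
    if p_value <= alpha:
        return "reject H0"
    return "fail to reject H0"


# P-value reported by SPSS for the gum-disease example.
print(decide(0.014, alpha=0.05))
print(decide(0.014, alpha=0.01))
```

The two calls show why alpha must be fixed before looking at the result: the same probability of 0.014 leads to rejection at the 0.05 level but not at the 0.01 level.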
Example 1

We want to examine whether there is any association between knowledge of gum disease and the hygiene instrument used. The data are given in the following table.

The Knowledge of             Hygiene Instrument Used
the Gum Disease      Brush   Miswak   Brush and Miswak   Total
Yes                    20      10            10            40
No                     15      10             5            30
Not Sure                5      10            15            30
Total                  40      30            30           100

First step: calculate the expected frequencies

The expected frequency for the first cell is E11 = (40*40)/100 = 16. The other expected frequencies are calculated in the same way. The complete table is given below, with each expected frequency in parentheses after the observed count.

The Knowledge of               Hygiene Instrument Used
the Gum Disease      Brush     Miswak    Brush and Miswak   Total
Yes                  20 (16)   10 (12)       10 (12)          40
No                   15 (12)   10 (9)         5 (9)           30
Not Sure              5 (12)   10 (9)        15 (9)           30
Total                40        30            30              100

Knowledge of gum disease * Hygenic instrument used Crosstabulation

                                             Hygenic instrument used
Knowledge of gum disease               Brush   Miswak   Brush and miswak   Total
Yes        Count                        20      10           10             40
           Expected Count               16.0    12.0         12.0           40.0
No         Count                        15      10            5             30
           Expected Count               12.0     9.0          9.0           30.0
Not sure   Count                         5      10           15             30
           Expected Count               12.0     9.0          9.0           30.0
Total      Count                        40      30           30            100
           Expected Count               40.0    30.0         30.0          100.0

Chi-Square Tests

                                 Value      df   Asymp. Sig. (2-sided)
Pearson Chi-Square               12.500a     4   .014
Likelihood Ratio                 13.234      4   .010
Linear-by-Linear Association      7.507      1   .006
N of Valid Cases                100
a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 9.00.

Example 1 continued

Step # 1: Statement of Hypothesis:

Ho: The oral hygiene instrument used and the knowledge of gum disease are independent of each other
HA: The oral hygiene instrument used and the knowledge of gum disease are dependent on each other

Step # 2: Level of Significance: α = 0.05

Step # 3: Test statistic:

$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$

with (r-1)(c-1) d.o.f

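Example 1 continued

Step # 4: Critical Region

Reject Ho if $\chi^2_{cal} \geq \chi^2_{tab}$ or P-Value < α

Step # 5: Computation:

$$\chi^2_{cal} = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$

$\chi^2_{tab}$ = 9.488, the upper-tail critical value for α = 0.05 with 4 d.o.f (for example, CHISQ.INV.RT(0.05, 4) in Excel)
P-Value = 0.014

Step # 6: Result:

$\chi^2_{cal}$ = 12.50, $\chi^2_{tab}$ = 9.488
$\chi^2_{cal} > \chi^2_{tab}$ and P-Value < α

Conclusion: we reject Ho and conclude that the oral hygiene instrument used and the knowledge of gum disease are not independent of each other.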
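
As a check on the computed value, the nine cell contributions can be written out by hand. The expansion below uses the observed and expected counts from the crosstabulation, taking the cells row by row (Yes, No, Not sure); it is offered as a worked illustration rather than as part of the original slides.

$$\chi^2_{cal} = \frac{(20-16)^2}{16} + \frac{(10-12)^2}{12} + \frac{(10-12)^2}{12} + \frac{(15-12)^2}{12} + \frac{(10-9)^2}{9} + \frac{(5-9)^2}{9} + \frac{(5-12)^2}{12} + \frac{(10-9)^2}{9} + \frac{(15-9)^2}{9}$$

$$\chi^2_{cal} = 1 + \frac{1}{3} + \frac{1}{3} + \frac{3}{4} + \frac{1}{9} + \frac{16}{9} + \frac{49}{12} + \frac{1}{9} + 4 = 12.5$$

The two largest contributions, 49/12 and 4, both come from the "Not sure" row, so that group departs most from what independence would predict.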
THANK YOU
