11 Chi Square
CHI-SQUARE Distribution
Certain statistical tests require the data to be normally distributed before they can be used. The Z test, the T test, and the F test are
some of these tests. Such tests are known as parametric tests. However, some statistical tests do not require this
basic assumption of normality. These are classified as nonparametric tests. The chi-square test is a
nonparametric test for data presented as frequencies, or for data that can be transformed into frequencies.
1. Tests for Goodness-of-Fit
The chi-square test can be used to determine whether a set of observed frequencies on one variable agrees with the expected
frequencies on the same variable. This is known as the test for goodness-of-fit. The chi-square goodness-of-fit test can also be used
to test the normality of data. To illustrate, a one-peso coin is tossed 200 times to test whether the coin is fair or loaded, and the
results of the experiment are recorded below. The test statistic is:
$$\chi^2 = \sum \frac{(OF - EF)^2}{EF}$$
where :
OF = observed frequencies
EF = expected frequencies
We follow the usual procedure for testing a hypothesis. The steps are:
1. H0: p = ½
2. Ha: p ≠ ½
3. α = 0.05
4. Test Statistic:
$$\chi^2 = \sum \frac{(OF - EF)^2}{EF}$$
df = n - 1 = 2 - 1 = 1
where n represents the number of categories.
C.V. = 3.84 from the chi-square distribution table
5. Computation:
         OF     EF     (OF - EF)    (OF - EF)^2    (OF - EF)^2/EF
Head     109    100        9            81             0.81
Tail      91    100       -9            81             0.81
Total    200    200                                χ² = 1.62
6. Decision: Fail to reject H0, since the computed value 1.62 is less than the critical value 3.84. Thus, there is no sufficient evidence that the coin is loaded; it may be regarded as fair at the 0.05 level of significance.
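The coin computation above can be checked with a short script. What follows is only a minimal sketch in plain Python: the function name `chi_square_gof` and the variable names are illustrative and not part of the lesson, and the critical value 3.84 is copied from the chi-square table rather than computed.

```python
def chi_square_gof(observed, expected):
    """Return the chi-square statistic: the sum of (OF - EF)^2 / EF."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((of - ef) ** 2 / ef for of, ef in zip(observed, expected))


# Coin example from the text: 200 tosses, H0: p = 1/2.
observed = [109, 91]            # heads, tails
expected = [200 * 0.5] * 2      # 100 heads and 100 tails under H0

statistic = chi_square_gof(observed, expected)
df = len(observed) - 1          # number of categories minus 1
critical_value = 3.84           # assumed from the table: df = 1, alpha = 0.05

# H0 is rejected only when the statistic exceeds the critical value.
reject_h0 = statistic > critical_value
print(f"chi-square = {statistic:.2f}, df = {df}, reject H0: {reject_h0}")
```

The same function could be reused for the die example by passing six observed counts and six equal expected counts, with df = 5 and the matching critical value from the table.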
Other examples of testing goodness-of-fit are testing whether a die is loaded or fair, and testing whether the chances of
a boy and a girl among 50 deliveries are equal.
Testing for goodness-of-fit is one of the many applications of the chi-square. Another application is the test of independence.
The hypothesis to be tested here is whether two variables are independent of each other. Contingency tables in this case may be
made up of any number of rows (r) and any number of columns (c). Examples of these contingency
tables, with their corresponding degrees of freedom, are:
1 x c = one row with several columns    df = c - 1
2 x 2 = two rows and two columns        df = (2 - 1)(2 - 1) = 1
r x c = r rows and c columns            df = (r - 1)(c - 1)
Edited 2nd sem 2024-25
Perry Boy Rolan
Mr. PORFERIO I. ROLAN
(Asst. Instructor)
MMW GEC 004 Mathematics in the Modern World 01-19-2025 04:12:20 AM
An example of a 3 x 3 contingency table is illustrated in the figure above. Letters A, B, C, D, E, F, G, H and I represent the
observed frequencies in each cell; letters X, Y and Z are the total observed frequencies per row; letters Q, R and S are the total
observed frequencies per column; and T is the total number of observed frequencies. To obtain the expected frequency of a cell,
multiply its row total by its column total and divide the product by T. For instance, the expected frequency for A is (X)(Q)/T.
The computation of the expected frequencies for the other cells is summarized in the corresponding r x c contingency table of expected frequencies.
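The rule "row total times column total, divided by the grand total" can be sketched in a few lines of Python. This is an illustration only: the function name `expected_frequencies` is not from the lesson, and the 3 x 3 counts below are made-up numbers chosen so that the arithmetic is easy to follow.

```python
def expected_frequencies(table):
    """Return the expected frequency of every cell of an r x c table."""
    row_totals = [sum(row) for row in table]          # X, Y, Z in the text
    col_totals = [sum(col) for col in zip(*table)]    # Q, R, S in the text
    grand_total = sum(row_totals)                     # T in the text
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical 3 x 3 table of observed frequencies (not data from the lesson).
observed = [
    [10, 20, 30],
    [20, 20, 20],
    [30, 20, 10],
]

expected = expected_frequencies(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)     # (r - 1)(c - 1)

for row in expected:
    print(row)
print("df =", df)
```

In this made-up table every row total and every column total is 60 and the grand total is 180, so each expected frequency is (60)(60)/180 = 20, and df = (3 - 1)(3 - 1) = 4.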
                OPINION
SEX         Agree    Disagree
Male          28       122
Female        20       140
Do the responses of the two groups differ at the 0.05 level of significance?
Solution:
1. H0: The responses of the two groups do not differ.
2. Ha: The responses of the groups differ.
3. α = 0.05
4. Test Statistics:
$$\chi^2 = \sum \frac{(OF - EF)^2}{EF}$$
df = (2 - 1)(2 - 1) = 1
C.V. = 3.84
5. Computation:
                  Opinion
Sex         Agree    Disagree    Total
Male          28       122        150
Female        20       140        160
Total         48       262        310
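Expected frequencies for:
Male, Agree: (150)(48)/310 = 23.23
Male, Disagree: (150)(262)/310 = 126.77
Female, Agree: (160)(48)/310 = 24.77
Female, Disagree: (160)(262)/310 = 135.23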
6. Decision: Fail to reject H0, since the computed value 2.2459 is less than the critical value 3.84. Therefore, the responses of the two groups
do not differ significantly at the 0.05 level.
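The test of independence for the opinion data can be reproduced with the sketch below. It is an illustration in plain Python, not the instructor's worksheet: the function name `chi_square_independence` is an assumption, and the critical value 3.84 is copied from the table. Because the script keeps the expected frequencies unrounded, it gives about 2.25, slightly different from the 2.2459 in the text, which appears to come from rounding the expected frequencies to two decimal places. The decision is the same either way.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, of in enumerate(row):
            # Expected frequency: row total times column total over grand total.
            ef = row_totals[i] * col_totals[j] / grand_total
            statistic += (of - ef) ** 2 / ef

    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Observed frequencies from the text: rows are Male and Female,
# columns are Agree and Disagree.
observed = [
    [28, 122],
    [20, 140],
]

statistic, df = chi_square_independence(observed)
critical_value = 3.84           # assumed from the table: df = 1, alpha = 0.05
reject_h0 = statistic > critical_value
print(f"chi-square = {statistic:.4f}, df = {df}, reject H0: {reject_h0}")
```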
Exercises
NAME: SCORE:
COURSE/YR. & SEC: DATE:
STUDENT No. TIME: DAY:
A study was made to determine whether hypertension is independent of the drinking habits of 200 male respondents in a certain locality,
as shown in the table below.
$$\chi^2 = \sum \frac{(OF - EF)^2}{EF}$$
df = (r - 1)(c - 1) =
C.V. = 5.99
5. Computation:
6. Decision: