Chi Square (X2) Test
CSM 001
RESEARCH METHODOLOGY
Submitted By:
Rahil Arshad
2002422
M.Tech. CS II Yr
Contents
Chi-Square Test
Use in research
Conditions for using
Chi-square distributions
Method
Example Problem
Characteristics
Advantages & Disadvantages
Chi-Square Test
To determine whether the association between
two qualitative variables is statistically
significant, researchers must conduct a test of
significance called the Chi-Square Test.
The chi-square distribution was first described by the
German statistician Friedrich Robert Helmert.
The chi-square test itself was introduced by Karl
Pearson in 1900.
Meaning of Chi-Square Values
The Chi-square test is intended to test how
likely it is that an observed distribution is
due to chance.
It is also called a "goodness of fit" statistic,
because it measures how well the observed
distribution of data fits with the distribution
that is expected if the variables are
independent.
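As a rough illustration of "the distribution that is expected if the variables are independent", the sketch below computes expected cell counts for a two-way table as (row total × column total) / grand total and then the X2 value. The 2×2 counts and the function names are made up for this illustration and do not come from the slides.

```python
def expected_frequencies(observed):
    """Expected cell counts if the row and column variables are independent."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    # Under independence: E_ij = (row total i) * (column total j) / grand total
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over every cell of the table."""
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Hypothetical 2x2 table of counts (assumed data, for illustration only).
observed = [[30, 20],
            [20, 30]]
expected = expected_frequencies(observed)
statistic = chi_square(observed, expected)
print(expected, statistic)
```

The larger the statistic, the worse the observed counts fit the counts expected under independence.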
Use in Research
The test is a technique that researchers can use to
test the goodness of fit;
test the significance of association between
two attributes; and
test the homogeneity or the significance of
population variance.
Conditions for using the X2 test
Face   Observed (Oi)   Expected (Ei)   Oi – Ei   (Oi – Ei)^2   (Oi – Ei)^2/Ei
1      16              22              -6        36            36/22
2      20              22              -2        4             4/22
3      25              22              3         9             9/22
4      14              22              -8        64            64/22
5      29              22              7         49            49/22
6      28              22              6         36            36/22
$$X^2 = \sum_{i=1}^{6} \frac{(O_i - E_i)^2}{E_i} = \frac{198}{22} = 9$$
Hence, the calculated value of X2 = 9.
The degrees of freedom in the given problem are
(n – 1) = (6 – 1) = 5.
The table value of X2 for 5 degrees of freedom at
5% level of significance is 11.071.
Comparing the calculated and table values of X2, we
find that the calculated value (9) is less than the table
value (11.071), so the differences between the observed and
expected frequencies could have arisen due to
fluctuations of sampling.
The result is thus consistent with the hypothesis that the die
is unbiased, which cannot be rejected at the 5% level.
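The arithmetic of this example can be re-checked with a short script. This is only a sketch using the figures given above: the six observed frequencies, an equal expected frequency for each face, and the table value 11.071 copied from the text rather than computed. The variable names are chosen here for illustration.

```python
# Observed frequencies for faces 1..6, as given in the example.
observed = [16, 20, 25, 14, 29, 28]

n_throws = sum(observed)             # total number of throws
expected = n_throws / len(observed)  # an unbiased die gives equal expected counts

# X2 = sum of (O - E)^2 / E over the six faces
chi_sq = sum((o - expected) ** 2 / expected for o in observed)

df = len(observed) - 1               # degrees of freedom: n - 1
critical_value = 11.071              # table value for df = 5 at the 5% level (from the text)

# Reject the hypothesis of an unbiased die only if X2 exceeds the table value.
reject = chi_sq > critical_value

print(f"X2 = {chi_sq:.3f}, df = {df}, reject unbiased-die hypothesis: {reject}")
```

Where SciPy is available, `scipy.stats.chisquare(observed)` should give the same statistic together with a p-value, since it assumes equal expected frequencies by default.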
Characteristics of Chi-Square test
This test (as a non-parametric test) is based on frequencies
and not on the parameters like mean and standard deviation.
The test is used for testing the hypothesis and is not useful
for estimation.
This test possesses the additive property.
This test can also be applied to a complex contingency table
with several classes and as such is a very useful test in
research work.
This test is an important non-parametric test: no rigid
assumptions are needed about the type of
population, no parameter values are required, and relatively little
mathematical detail is involved.
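The additive property listed above can be written as a short formula. This is the standard textbook statement, not something spelled out in the slides; the symbols X_1^2, X_2^2 (two independent chi-square values) and \nu_1, \nu_2 (their degrees of freedom) are introduced here for illustration only.

```latex
X_1^2 \sim \chi^2(\nu_1), \quad X_2^2 \sim \chi^2(\nu_2) \ \text{independent}
\;\Longrightarrow\;
X_1^2 + X_2^2 \sim \chi^2(\nu_1 + \nu_2)
```

In other words, X2 values from independent samples may be added, provided their degrees of freedom are added as well.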
Advantages & Disadvantages
Advantages
Can test association between variables
Identifies differences between observed and
expected values
Disadvantages
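Can't be applied to percentages
Data must be numerical counts (frequencies)
Variables with only 2 categories are not well suited to comparison
The total number of observations must be at least 20
The test becomes invalid if any of the expected values are
below 10 (some statisticians use 5 as the cutoff)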
Quite complicated to get right - difficult formula
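The numerical conditions in the list above can be checked before running the test. The helper below is a minimal sketch: its name and parameters are hypothetical, and the thresholds (at least 20 observations, no expected value below 10) are simply the ones quoted in this list.

```python
def check_chi_square_conditions(observed, expected, min_total=20, min_expected=10):
    """Return a list of violated conditions (an empty list means none were found)."""
    problems = []
    if sum(observed) < min_total:
        problems.append(f"only {sum(observed)} observations; need at least {min_total}")
    too_small = [e for e in expected if e < min_expected]
    if too_small:
        problems.append(f"{len(too_small)} expected value(s) below {min_expected}")
    return problems


# Die example from the slides: 132 throws, expected frequency 22 per face.
die_problems = check_chi_square_conditions([16, 20, 25, 14, 29, 28], [22] * 6)

# Hypothetical small sample (assumed data) that violates both conditions.
small_problems = check_chi_square_conditions([3, 5, 4], [4, 4, 4])

print(die_problems)
print(small_problems)
```

The die example satisfies both conditions, while the small hypothetical sample is flagged on both counts.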
THANK YOU !!!