
Chi Square (X2) Test

The document discusses the chi-square test, which is used to determine whether an association between two qualitative variables is statistically significant. It describes the conditions for using the test, explains how to calculate chi-square values, and works through an example problem that applies the test to the results of die rolls to decide whether the die is unbiased. The key advantages cited are that the test can analyze associations between variables and identify differences between observed and expected values.

Uploaded by

Rahil Arshad

Applications in RM

CHI SQUARE (χ²) TEST
CSM 001
RESEARCH METHODOLOGY

Submitted By:
Rahil Arshad
2002422
M.Tech. CS II Yr
Contents
 Chi-Square Test
 Use in research
 Conditions for using
 Chi-square distributions
 Method
 Example Problem
 Characteristics
 Advantages & Disadvantages
Chi-Square Test
To determine whether the association between
two qualitative variables is statistically
significant, researchers must conduct a test of
significance called the Chi-Square Test.
 The chi-square distribution was first described by the
German statistician Friedrich Robert Helmert.
 The chi-square test was first used by Karl
Pearson in 1900.
Meaning of Chi-Square Values
 The Chi-square test is intended to test how
likely it is that an observed distribution is
due to chance.
 It is also called a "goodness of fit" statistic,
because it measures how well the observed
distribution of data fits with the distribution
that is expected if the variables are
independent.
Use in Research
 The test is a technique that allows researchers to
 test the goodness of fit;
 test the significance of association between
two attributes; and
 test the homogeneity or the significance of
population variance.
Conditions for using the χ² test
 Observations recorded and used are collected on a
random basis.
 All the items in the sample must be independent.
 No group should contain very few items, say fewer than 10.
Where frequencies are below 10, regrouping is done by
combining the frequencies of adjoining groups so that each
new frequency is at least 10.
 The overall number of items must also be reasonably
large. It should normally be at least 50, however small
the number of groups may be.
 The constraints must be linear.
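The regrouping rule above can be sketched in a few lines of Python. This is only one possible way to combine adjoining groups (a simple left-to-right pass); the function name `merge_small_groups`, the threshold parameter `minimum`, and the sample frequencies are assumptions made for illustration and do not come from the slides.

```python
def merge_small_groups(frequencies, minimum=10):
    """Combine adjoining groups until every group holds at least `minimum` items."""
    merged = []
    running = 0
    for count in frequencies:
        running += count
        # Close the current group as soon as it is large enough.
        if running >= minimum:
            merged.append(running)
            running = 0
    # A leftover tail that is still too small is folded into the last group.
    if running > 0:
        if merged:
            merged[-1] += running
        else:
            merged.append(running)
    return merged


# Hypothetical frequencies, not taken from the slides.
print(merge_small_groups([4, 7, 15, 22, 6, 3]))  # [11, 15, 31]
```

The total number of items is unchanged by the merge; only the number of groups (and therefore the degrees of freedom) shrinks.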
Chi-Square Distributions
 The chi-square distributions are a family of
distributions that take only positive values and
are skewed to the right. A particular chi-square
distribution is specified by giving its degrees of
freedom.
 The chi-square test for a two-way table with r
rows and c columns uses critical values from
the chi-square distribution with (r – 1)(c – 1)
degrees of freedom.
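As a rough illustration of the two-way case, the sketch below computes the degrees of freedom (r – 1)(c – 1) and the expected counts for a small table. The expected-count rule used here (row total × column total / grand total) is the standard one for a test of independence but is not stated on the slides, and the 2 × 3 table is hypothetical data.

```python
def expected_counts(table):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def degrees_of_freedom(table):
    """(r - 1)(c - 1) for a table with r rows and c columns."""
    rows, cols = len(table), len(table[0])
    return (rows - 1) * (cols - 1)


# Hypothetical 2 x 3 table of observed frequencies, not from the slides.
observed = [[30, 20, 10],
            [20, 30, 40]]

print(expected_counts(observed))     # [[20.0, 20.0, 20.0], [30.0, 30.0, 30.0]]
print(degrees_of_freedom(observed))  # 2
```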
Chi-Square Distributions
 The image shows that the
distribution of the chi-square
statistic starts at zero and cannot
take negative values.
 The shape of the distribution
differs markedly from that of the t
or z statistic: it is skewed to
the right.
 The shape of the distribution
changes as the degrees of
freedom increase.
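These properties can be checked with a small simulation. The sketch below relies on a standard fact that the slides do not state: the sum of the squares of k independent standard normal values follows a chi-square distribution with k degrees of freedom. The function name, seed, and sample size are arbitrary choices for illustration.

```python
import random
import statistics


def simulate_chi_square(df, draws, seed=0):
    """Draw chi-square values as sums of `df` squared standard normals."""
    rng = random.Random(seed)
    return [sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
            for _ in range(draws)]


samples = simulate_chi_square(df=5, draws=20000)

# No simulated value can be negative, because each term is a square.
print(min(samples) >= 0)
# The mean sits near the degrees of freedom and above the median,
# which is what a right-skewed distribution looks like.
print(round(statistics.mean(samples), 2), round(statistics.median(samples), 2))
```

Re-running with a larger `df` moves the bulk of the values to the right and makes the shape more symmetric.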
Conducting the Chi-square test
There are five steps to conduct this test:
 Formulate the hypothesis.
 Specify the expected values
 Compare the observed values with the
expected values.
 Compute the test statistic.
 Decide if chi-square is statistically
significant.
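The five steps can be sketched as a short Python routine. This is a minimal illustration rather than a reference implementation: the function name `chi_square_test`, the three-group hypothesis, and the observed counts are all assumptions. The critical value 5.991 is the usual 5% table value for 2 degrees of freedom.

```python
def chi_square_test(observed, expected, critical_value):
    """Return (statistic, reject_null) for observed vs expected frequencies."""
    # Step 3: compare the observed values with the expected values.
    deviations = [o - e for o, e in zip(observed, expected)]
    # Step 4: compute the test statistic, the sum of (O - E)^2 / E.
    statistic = sum(d * d / e for d, e in zip(deviations, expected))
    # Step 5: the result is significant only if it exceeds the table value.
    reject_null = statistic > critical_value
    return statistic, reject_null


# Step 1: hypothesis (assumed for illustration): three groups are equally likely.
# Hypothetical data, not from the slides: 90 items observed in three groups.
observed = [40, 30, 20]
# Step 2: under that hypothesis each group expects 90 / 3 = 30 items.
expected = [sum(observed) / len(observed)] * len(observed)

statistic, reject_null = chi_square_test(observed, expected, critical_value=5.991)
print(round(statistic, 3), reject_null)  # 6.667 True
```

Here the statistic exceeds the table value, so the hypothesis of equally likely groups would be rejected at the 5% level.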
Example Problem
A die is thrown 132 times with the following results.
Is the die unbiased?
No.         1    2    3    4    5    6
Frequency   16   20   25   14   29   28

Let us take the hypothesis that the die is unbiased.
If that is so, the probability of obtaining any one of the
six numbers is 1/6 and as such the expected frequency
of any one number coming upward is 132 × 1/6 = 22.
Now we can write the observed frequencies
along with expected frequencies and work out
the value of χ² as follows:

No.   Observed        Expected        (Oi – Ei)   (Oi – Ei)²   (Oi – Ei)²/Ei
      frequency, Oi   frequency, Ei
1     16              22              -6          36           36/22
2     20              22              -2          4            4/22
3     25              22              3           9            9/22
4     14              22              -8          64           64/22
5     29              22              7           49           49/22
6     28              22              6           36           36/22
χ² = Σ [(Oi – Ei)² / Ei] = (36 + 4 + 9 + 64 + 49 + 36)/22 = 198/22 = 9
Hence, the calculated value of χ² = 9.
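The arithmetic in the table can be re-checked exactly with Python's `fractions` module, using the observed frequencies from the problem. The variable names are illustrative only.

```python
from fractions import Fraction

# Observed frequencies for faces 1..6, as given in the example problem.
observed = [16, 20, 25, 14, 29, 28]

total = sum(observed)                      # 132 throws in all
expected = Fraction(total, len(observed))  # 132 * 1/6 = 22 per face

# One term (Oi - Ei)^2 / Ei per face, kept as exact fractions.
terms = [(o - expected) ** 2 / expected for o in observed]
chi_square = sum(terms)

print(total, expected, chi_square)  # 132 22 9
```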
The degrees of freedom in the given problem are
(n – 1) = (6 – 1) = 5.
The table value of χ² for 5 degrees of freedom at
the 5% level of significance is 11.071.
Comparing the calculated and table values of χ², we
find that the calculated value is less than the table
value, so the deviations from the expected frequencies
could have arisen due to fluctuations of sampling.
The result is thus consistent with the hypothesis: at the
5% level there is no evidence that the die is biased.
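The same decision can be cross-checked without a printed table by estimating the upper-tail probability of the calculated value. The sketch below is an assumption-laden illustration, not part of the original example: it integrates the standard chi-square density numerically with Simpson's rule, and the function names and step count are arbitrary. It assumes at least 2 degrees of freedom so that the density is finite at zero.

```python
import math


def chi_square_pdf(x, df):
    """Density of the chi-square distribution with `df` degrees of freedom."""
    half = df / 2
    return x ** (half - 1) * math.exp(-x / 2) / (2 ** half * math.gamma(half))


def upper_tail(x, df, steps=2000):
    """Approximate P(chi-square > x) via Simpson's rule on [0, x].

    Assumes df >= 2 and an even number of steps.
    """
    h = x / steps
    total = chi_square_pdf(0.0, df) + chi_square_pdf(x, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * chi_square_pdf(i * h, df)
    return 1 - total * h / 3


# Calculated value 9 with 5 degrees of freedom, as in the die example.
p_value = upper_tail(9, 5)
print(p_value > 0.05)                   # True: not significant at the 5% level
print(round(upper_tail(11.071, 5), 3))  # close to 0.05, matching the table value
```

A tail probability above 0.05 for the calculated value leads to the same conclusion as comparing 9 with the table value 11.071.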
Characteristics of Chi-Square test
 This test (as a non-parametric test) is based on frequencies
and not on the parameters like mean and standard deviation.
 The test is used for testing the hypothesis and is not useful
for estimation.
 This test possesses the additive property.
 This test can also be applied to a complex contingency table
with several classes and as such is a very useful test in
research work.
 This test is an important non-parametric test because no rigid
assumptions are necessary regarding the type of
population, no parameter values are needed, and relatively little
mathematical detail is involved.
Advantages & Disadvantages
Advantages
 Can test association between variables
 Identifies differences between observed and
expected values

Disadvantages
 Can't be applied to percentages; raw frequencies are needed
 Data must be numerical counts
 Categories of 2 are not good to compare
 The number of observations must be 20+
 The test becomes invalid if any of the expected values are
below 10
 Quite complicated to get right - difficult formula
THANK YOU !!!
