Chi-Squared
Goodness of Fit Test
By
Dr. Muhammad Mujtaba Shaikh,
Assistant Professor (Mathematics),
Department of Basic Sciences and Related Studies,
MUET, Jamshoro.
Email: [email protected]
Goodness of Fit Test
+ Purpose
- To determine whether a population follows a specified distribution or criterion, at a fixed significance level.
- To do this, we compare the observed frequencies ($o_i$) obtained from sample data with the expected frequencies ($e_i$) based on the assumed hypothetical criterion or distribution.
- The null hypothesis, $H_0$, is the assumption that the data follow the assumed distribution/criterion, i.e. $o_i = e_i$, and the fit is good.
- The alternative hypothesis, $H_1$, is that the data do not follow the assumed distribution/criterion, i.e. $o_i \neq e_i$, and the fit is poor.
Goodness of fit test
Main Assumptions:
+ The goodness of fit test measures the average dispersion of the observed frequencies w.r.t. the expected frequencies, so the chi-squared distribution is suitable for this test.
+ All frequencies (observed/expected) in the table must be at least 5. If this is not the case, we adjust the table by combining a few cells so that this condition holds.
+ The chi-squared statistic used in the goodness of fit test is defined as:
$$\chi^2 = \sum_{i=1}^{n} \frac{(o_i - e_i)^2}{e_i}, \qquad v = n - k$$
where n: number of cells in the final table (i.e. after adjustment, if necessary)
k: number of parameters of the observed data that are used to compute expected frequencies
v: degrees of freedom.
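As an illustration, the statistic and the degrees of freedom defined above can be computed in a few lines of Python. This is a minimal sketch: the function names `chi_squared_statistic` and `degrees_of_freedom` and the three-cell table are made up for the illustration and do not come from the slides.

```python
def chi_squared_statistic(observed, expected):
    """Sum of (o_i - e_i)^2 / e_i over all cells of the final table."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of cells")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def degrees_of_freedom(n_cells, k):
    """v = n - k, where k counts the observed-data parameters used to compute e_i."""
    return n_cells - k


# Hypothetical table with three cells; only the total (60) was used to get e_i.
observed = [18, 22, 20]
expected = [20.0, 20.0, 20.0]
print(chi_squared_statistic(observed, expected))  # (4 + 4 + 0) / 20 = 0.4
print(degrees_of_freedom(len(observed), k=1))     # 2
```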
Procedure to test for Goodness of Fit:
1. Extract the data and the hypothetical criterion from the question.
2. State the hypotheses based on the fit of the data.
3. Make a table of the observed and expected values. You will most likely be given the observed values. The expected frequencies are computed through the criterion given in the question.
4. Calculate the chi-squared test statistic, $\chi^2 = \sum_i (o_i - e_i)^2 / e_i$.
5. Look up the chi-squared critical value $\chi^2_{\alpha,\,v}$ from the chi-squared table.
6. $\chi^2_{\alpha,\,v}$ is the critical value of chi-square from the table with "alpha" level of significance and "v" degrees of freedom. This constitutes a critical region of the form $\chi^2 > \chi^2_{\alpha,\,v}$.
7. Compare your test statistic with your critical value and make a conclusion. If the test statistic lies in the critical region, reject $H_0$ in favor of $H_1$. Otherwise, do not reject $H_0$.
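The procedure can be sketched in Python as follows. This is one possible arrangement, not the author's implementation: the function name `goodness_of_fit_test` is hypothetical, the critical values are the usual 5% table entries for 1 to 5 degrees of freedom, and the die-roll data are invented purely to exercise the steps.

```python
# Upper-tail critical values of chi-squared at alpha = 0.05 (standard table entries).
CRITICAL_0_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def goodness_of_fit_test(observed, expected, k, critical_values=CRITICAL_0_05):
    """Steps 4 to 7: statistic, degrees of freedom, critical value, decision."""
    if any(o < 5 for o in observed) or any(e < 5 for e in expected):
        raise ValueError("combine cells first: every frequency must be at least 5")
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    v = len(observed) - k
    critical = critical_values[v]
    reject_h0 = statistic > critical  # critical region: chi2 > chi2_{alpha, v}
    return statistic, v, critical, reject_h0


# Hypothetical data: 60 rolls of a die, tested against "all faces equally likely".
observed = [8, 12, 9, 11, 10, 10]
expected = [sum(observed) / 6] * 6
statistic, v, critical, reject_h0 = goodness_of_fit_test(observed, expected, k=1)
print(f"chi2 = {statistic:.3f}, v = {v}, critical = {critical}, reject H0: {reject_h0}")
```

For these invented counts the statistic is 1.0 with v = 5, well below 11.070, so $H_0$ is not rejected.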
Example-1
+ Is the frequency of balls with different colors equal in our bag? Use 5% significance level.
[Tables of observed and expected frequencies for each ball color; the cell values are not recoverable.]
Chi-Square test for goodness of fit
Critical value = 7.81
What will you conclude?
Hence, the frequency of balls with different colors is not equal in our bag at 5% level of significance.
Example-2
+ A survey was targeted at determining if educational
attainment affected Internet use. Randomly selected
shoppers at a busy mall were asked if they used the
Internet and their highest level of education attained. The
results are listed below. Is there sufficient evidence at the
0.05 level of significance that the proportion of Internet
users differs for any of the groups?
Graduated college: 44
Attended college: 41
Did not attend: 40
Example-2 (Solution)
+ DATA: n = 125, $\alpha$ = 5% = 0.05
"Equality in Proportion" criterion is given.
+ Hypotheses:
$H_0$: The proportion of Internet users is the same for all groups of shoppers with given educational level.
$H_1$: The proportion of Internet users is not the same for all groups of shoppers with given educational level.
+ Expected Frequencies:
The expected frequencies are computed by dividing the total, 125, equally among the educational level options:
Educational level | Graduated college | Attended college | Did not attend | Total
Observed frequencies ($o_i$) | 44 | 41 | 40 | 125
Expected frequencies ($e_i$) | 41.67 | 41.67 | 41.67 | 125
Example-2 (Solution)
+ Adjusted Table:
No adjustment is necessary as none of the observed and expected frequencies are less
than 5.
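+ Degrees of freedom: v = n - k = 3 - 1 = 2
Note that the only parameter of the observed data used for the computation of expected frequencies is the total n.
+ Critical Region and Limit: $\chi^2 > \chi^2_{0.05,\,2} = 5.991$
+ Test Statistic from Sample: with $e_i = 125/3 \approx 41.67$,
$\chi^2 = \frac{(44 - 41.67)^2}{41.67} + \frac{(41 - 41.67)^2}{41.67} + \frac{(40 - 41.67)^2}{41.67} \approx 0.208$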
+ Conclusion and Decision:
As the value of the sample statistic is not in the critical region, we do not reject the null hypothesis. That is, at 5% level of significance there is not sufficient evidence that the proportion of Internet users among shoppers at the busy mall differs across the given educational levels.
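The figures in this example can be checked numerically. The sketch below uses only the Python standard library and relies on a shortcut that holds for two degrees of freedom only: for v = 2 the chi-squared upper-tail probability is exp(-x/2), so the critical value is -2 ln(alpha). The variable names are illustrative.

```python
import math

observed = [44, 41, 40]                          # graduated, attended, did not attend
n = sum(observed)                                # 125
expected = [n / len(observed)] * len(observed)   # equal proportions: 125/3 each

statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
v = len(observed) - 1                            # only the total n was used for e_i

alpha = 0.05
critical = -2 * math.log(alpha)                  # closed form, valid for v = 2 only
p_value = math.exp(-statistic / 2)               # closed form, valid for v = 2 only

print(f"chi2 = {statistic:.3f}, v = {v}, critical = {critical:.3f}, p = {p_value:.3f}")
print("reject H0" if statistic > critical else "do not reject H0")
```

The statistic comes out at about 0.208 against a critical value of about 5.991, so the sample value is far from the critical region.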
Example-3
Mistakes made by a skilled typist on routine work are hypothesized to be a rare experiment. Following is the record of mistakes made per day by a skilled typist for 300 working days:
Number of mistakes | 0 | 1 | 2 | 3 | 4 | 5 | 6
Number of days | 143 | 90 | 42 | 12 | 9 | 3 | 1
Test the hypothesis that mistakes/day is a rare experiment using the chi-squared goodness of fit test at 5% significance level.
HINT: (Rare experiment suggests that the data follow Poisson's distribution)
Example-3 (Solution)
+ DATA: n = 300, $\alpha$ = 5% = 0.05
Rare Experiment's Criterion (Poisson's Distribution).
+ Hypotheses:
$H_0$: Number of mistakes made by a skilled typist per day on routine work is a rare experiment, i.e. follows Poisson's distribution.
$H_1$: Number of mistakes made by a skilled typist per day on routine work is not a rare experiment, i.e. does not follow Poisson's distribution.
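+ Mean of Poisson's Distribution:
$\lambda = \frac{\sum x_i o_i}{n} = \frac{0(143) + 1(90) + 2(42) + 3(12) + 4(9) + 5(3) + 6(1)}{300} = \frac{267}{300} = 0.89$
+ Expected Frequencies:
For expected probabilities first, we use the formula:
$P_i = \frac{e^{-\lambda} \lambda^{x_i}}{x_i!}, \quad x_i = 0, 1, \dots, 6$
Then expected frequencies are given by:
$e_i = n P_i$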
SEE THE COMPLETE TABLE ON NEXT SLIDE!
Example-3 (Solution)
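Number of mistakes ($x_i$) | 0 | 1 | 2 | 3 | 4 | 5 | 6
Observed number of days ($o_i$) | 143 | 90 | 42 | 12 | 9 | 3 | 1
Expected probabilities ($P_i$) | 0.41065 | 0.3655 | 0.16264 | 0.04825 | 0.010736 | 0.001911 | 0.00028345
Expected number of days ($e_i$) | 123.195 | 109.65 | 48.792 | 14.475 | 3.2208 | 0.5733 | 0.085035
Adjusted Table is:
Number of mistakes ($x_i$) | 0 | 1 | 2 | 3 to 6
Observed number of days ($o_i$) | 143 | 90 | 42 | 25
Expected probabilities ($P_i$) | 0.41065 | 0.3655 | 0.16264 | 0.06118045
Expected number of days ($e_i$) | 123.195 | 109.65 | 48.792 | 18.354135
Example-3 (Solution)
Degrees of freedom:
v = [Number of columns in adjusted table]
- [Number of observed data parameters used to compute $e_i$]
v = 4 - 2 = 2 (the total n and the mean $\lambda$ were both used)
Critical Region and Limit: $\chi^2 > \chi^2_{0.05,\,2} = 5.991$
Test Statistic from Sample:
$\chi^2 = \frac{(143 - 123.195)^2}{123.195} + \frac{(90 - 109.65)^2}{109.65} + \frac{(42 - 48.792)^2}{48.792} + \frac{(25 - 18.354)^2}{18.354} \approx 10.06$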
Conclusion and Decision:
As the value of the sample statistic is in the critical region, we reject the null hypothesis and accept the alternative, which means the number of mistakes made per day on routine work by the typist is not a rare experiment at 5% level of significance.
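The whole of Example-3 can be reproduced with the Python standard library. The sketch below is an illustration under stated assumptions, not the author's working: the helper `pool_right_tail` is a hypothetical name, it only merges cells from the right-hand end (which is all this table needs), and the critical value uses the closed form -2 ln(alpha) that is valid for v = 2 only.

```python
import math

mistakes = [0, 1, 2, 3, 4, 5, 6]
days = [143, 90, 42, 12, 9, 3, 1]
n = sum(days)                                           # 300

# Estimate the Poisson mean from the sample: 267 / 300 = 0.89.
lam = sum(x * o for x, o in zip(mistakes, days)) / n

# Expected frequencies e_i = n * P_i with P_i = exp(-lam) * lam**x / x!.
expected = [n * math.exp(-lam) * lam ** x / math.factorial(x) for x in mistakes]


def pool_right_tail(observed, expected, minimum=5):
    """Merge cells from the right until the last cell has o_i and e_i >= minimum."""
    obs, exp = list(observed), list(expected)
    while len(obs) > 1 and (obs[-1] < minimum or exp[-1] < minimum):
        last_o, last_e = obs.pop(), exp.pop()
        obs[-1] += last_o
        exp[-1] += last_e
    return obs, exp


obs_adj, exp_adj = pool_right_tail(days, expected)
statistic = sum((o - e) ** 2 / e for o, e in zip(obs_adj, exp_adj))
v = len(obs_adj) - 2                                    # n and the mean were both used
critical = -2 * math.log(0.05)                          # closed form, v = 2 only

print(obs_adj, [round(e, 3) for e in exp_adj])
print(f"chi2 = {statistic:.2f}, v = {v}, critical = {critical:.3f}")
print("reject H0" if statistic > critical else "do not reject H0")
```

The pooling step leaves four cells with observed counts 143, 90, 42 and 25, and the statistic is roughly 10.06, which lies in the critical region.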
Home Work-I
+ The grades in a Statistics course for a particular semester were as follows:
Grade | A | B | C | D | F
Number of students | 143 | 90 | 42 | 12 | [illegible]
Test the hypothesis, at 0.05 level of significance, that the distribution of grades is uniform.
Home Work-II
+ Number of days per week a planning firm remains off in working season is hypothesized to be a rare experiment. The following data give the observed number of days per week the firm remained off for the last 1000 weeks:
Number of off days per week | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7
Number of weeks | 305 | 366 | 210 | 80 | 28 | 9 | 2 | 1
Test the hypothesis at 5% significance level.
Home Work-III
The results of an Annual Job Satisfaction Survey showed that 28% of Field
Planners are very satisfied with their job, 46% are somewhat satisfied, 12%
are neither satisfied nor dissatisfied, 10% are somewhat dissatisfied, and 4%
are very dissatisfied. A sample of 500 off-field Planners yielded the following
information on job satisfaction:
Category Number of Respondents
Very satisfied 105
Somewhat satisfied 235
Neither 55
Somewhat dissatisfied 90
Very dissatisfied 15
Use $\alpha$ = 0.05 and test to determine whether the job satisfaction for off-field
planners is different from the job satisfaction for field planners.
HINT: (Use chi-squared Goodness of Fit Test)
Home Work-IV
+ Three cards are drawn from an ordinary deck of well-shuffled playing cards and the number of spades was noted. After repeating the experiment 64 times, the following data were obtained:
Number of spades | 0 | 1 | 2 | 3
Number of times ($o_i$) | 21 | 31 | 12 | 0
Test the hypothesis, at 0.01 level of significance, that the number of spades follows the binomial distribution.
Home Work-V
+ A machine is designed to mix peanuts, hazelnuts, cashews and
pecans, respectively, in the ratio 3:2:2:1. A recently mixed can of
500 nuts was found to have 269 peanuts, 112 hazelnuts, 74
cashews, and 45 pecans. At 5% level of significance, test the
hypothesis, that the machine is working properly.
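When the criterion is stated as a ratio, as in this problem, the expected frequencies follow by splitting the total in that ratio. The short sketch below shows only that setup step; the function name `expected_from_ratio` is hypothetical, and the test statistic and the decision are left to the reader.

```python
def expected_from_ratio(ratio, total):
    """Split `total` across cells in proportion to `ratio`."""
    parts = sum(ratio)
    return [total * r / parts for r in ratio]


# Design ratio peanuts : hazelnuts : cashews : pecans = 3 : 2 : 2 : 1, can of 500 nuts.
expected = expected_from_ratio([3, 2, 2, 1], 500)
print(expected)  # [187.5, 125.0, 125.0, 62.5]
```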