2018 16EL Goodness of Fit Test

Chi-Squared Goodness of Fit Test

Uploaded by

shahab moin
Chi-Squared Goodness of Fit Test
By Dr. Muhammad Mujtaba Shaikh, Assistant Professor (Mathematics), Department of Basic Sciences and Related Studies, MUET, Jamshoro. Email: [email protected]

Goodness of Fit Test
* Purpose: to determine whether a population follows a hypothesized distribution or criterion, at a fixed significance level.
* To do this, we compare the observed frequencies ($o_i$) obtained from sample data with the expected frequencies ($e_i$) based on the assumed hypothetical criterion or distribution.
* The null hypothesis, $H_0$, is the assumption that the data follow the hypothesized distribution/criterion, i.e. $o_i \approx e_i$ and the fit is good.
* The alternative hypothesis, $H_1$, is that the data do not follow the hypothesized distribution/criterion, i.e. $o_i \neq e_i$ and the fit is poor.

Goodness of Fit Test: Main Assumptions
* The goodness of fit test measures the average dispersion of the observed frequencies with respect to the expected frequencies, so the chi-squared distribution is suitable for this test.
* All frequencies (observed and expected) in the table must be at least 5. If this is not the case, the table has to be adjusted by combining cells until the condition holds.
* The chi-squared statistic used in the goodness of fit test is defined as:
  $\chi^2 = \sum_{i=1}^{n} \frac{(o_i - e_i)^2}{e_i}$, with $\nu = n - k$ degrees of freedom,
  where
  n: number of cells in the final table (i.e. after adjustment, if necessary)
  k: number of parameters of the observed data that are used to compute the expected frequencies.

Procedure to Test for Goodness of Fit
1. Extract the data and the hypothetical criterion from the question.
2. State the hypotheses based on the fit of the data.
3. Make a table of the observed and expected values. You will most likely be given the observed values. The expected frequencies are computed through the criterion given in the question.
4. Calculate the chi-squared test statistic, $\chi^2 = \sum_{i=1}^{n} (o_i - e_i)^2 / e_i$.
5. Look up the chi-squared critical value in the chi-squared table.
6. $\chi^2_{\alpha,\nu}$ is the critical value of chi-square from the table with level of significance $\alpha$ and $\nu$ degrees of freedom. This constitutes a critical region of the form $\chi^2 > \chi^2_{\alpha,\nu}$.
7. Compare your test statistic with the critical value and draw a conclusion. If the test statistic lies in the critical region, reject $H_0$ in favor of $H_1$. Otherwise, do not reject $H_0$.

Example-1
Is the frequency of balls with different colors equal in our bag? Use the 5% significance level.
[Tables of observed and expected frequencies. The observed values are not recoverable; the expected frequency for each color is 30.]
Chi-square test for goodness of fit: critical value = 7.81.
What will you conclude? Hence, the frequency of balls with different colors is not equal in our bag at the 5% level of significance.

Example-2
* A survey was targeted at determining whether educational attainment affected Internet use. Randomly selected shoppers at a busy mall were asked if they used the Internet and their highest level of education attained. The results are listed below. Is there sufficient evidence at the 0.05 level of significance that the proportion of Internet users differs for any of the groups?
  Graduated college: 44
  Attended college: 41
  Did not attend: 40

Example-2 (Solution)
* DATA: n = 125, $\alpha$ = 5% (0.05). The "Equality in Proportion" criterion is given.
* Hypotheses:
  $H_0$: The proportion of Internet users is the same for all groups of shoppers with the given educational levels.
  $H_1$: The proportion of Internet users is not the same for all groups of shoppers with the given educational levels.
* Expected Frequencies: The expected frequencies are computed by dividing the total of 125 equally among the three educational levels: 125/3 ≈ 41.67 for each group.
* Adjusted Table: No adjustment is necessary, as none of the observed and expected frequencies is less than 5.
* Degrees of freedom: $\nu = n - k = 3 - 1 = 2$. Note that the only parameter of the observed data used for the computation of the expected frequencies is the total n.
* Critical Region and Limit: $\chi^2 > \chi^2_{0.05,2} = 5.991$.
* Test Statistic from Sample: $\chi^2 = \frac{(44 - 41.67)^2 + (41 - 41.67)^2 + (40 - 41.67)^2}{41.67} \approx 0.208$.
* Conclusion and Decision: As the value of the sample statistic is not in the critical region, we do not reject the null hypothesis. The data are consistent with the proportion of Internet users among shoppers at the busy mall being the same for all educational levels, at the 5% level of significance.

Example-3
Mistakes made by a skilled typist on routine work are hypothesized to be a rare experiment. Following is the record of mistakes made per day by a skilled typist for 300 working days:

Number of Mistakes | 0   | 1  | 2  | 3  | 4 | 5 | 6
Number of Days     | 143 | 90 | 42 | 12 | 9 | 3 | 1

Test the hypothesis that mistakes per day is a rare experiment, using the chi-squared goodness of fit test at the 5% significance level.
HINT: (A rare experiment suggests that the data follow Poisson's distribution.)

Example-3 (Solution)
* DATA: n = 300, Rare Experiment's Criterion (Poisson's Distribution).
* Hypotheses:
  $H_0$: The number of mistakes made by the skilled typist per day on routine work is a rare experiment, i.e. it follows Poisson's distribution.
  $H_1$: The number of mistakes made by the skilled typist per day on routine work is not a rare experiment, i.e. it does not follow Poisson's distribution.
* Mean of Poisson's Distribution: $\lambda = \frac{\sum x_i o_i}{n} = \frac{267}{300} = 0.89$.
* Expected Frequencies: For the expected probabilities we first use the formula $P_i = P(X = x_i) = \frac{e^{-\lambda} \lambda^{x_i}}{x_i!}$. The expected frequencies are then given by $e_i = n P_i$.

Number of Mistakes               | 0       | 1      | 2       | 3       | 4        | 5        | 6
Observed Number of Days ($o_i$)  | 143     | 90     | 42      | 12      | 9        | 3        | 1
Expected Probabilities ($P_i$)   | 0.41065 | 0.3655 | 0.16264 | 0.04825 | 0.010736 | 0.001911 | 0.00028345
Expected Number of Days ($e_i$)  | 123.195 | 109.65 | 48.792  | 14.475  | 3.2208   | 0.5733   | 0.085035

Adjusted Table (cells for 3 to 6 mistakes combined, so that every frequency is at least 5):

Number of Mistakes               | 0       | 1      | 2       | 3 or more
Observed Number of Days ($o_i$)  | 143     | 90     | 42      | 25
Expected Probabilities ($P_i$)   | 0.41065 | 0.3655 | 0.16264 | 0.06118045
Expected Number of Days ($e_i$)  | 123.195 | 109.65 | 48.792  | 18.354135

* Degrees of freedom: $\nu$ = [Number of columns in adjusted table] - [Number of observed data parameters used to compute $e_i$] = 4 - 2 = 2. The two parameters are the total n and the mean $\lambda$.
* Critical Region and Limit: $\chi^2 > \chi^2_{0.05,2} = 5.991$.
* Test Statistic from Sample: $\chi^2 = \frac{(143 - 123.195)^2}{123.195} + \frac{(90 - 109.65)^2}{109.65} + \frac{(42 - 48.792)^2}{48.792} + \frac{(25 - 18.354)^2}{18.354} \approx 10.06$.
* Conclusion and Decision: As the value of the sample statistic is in the critical region, we reject the null hypothesis in favor of the alternative, which means that the number of mistakes made per day on routine work by the typist is not a rare experiment at the 5% level of significance.

Home Work-I
* The grades in a Statistics course for a particular semester were as follows:

Grade              | A   | B  | C  | D  | F
Number of Students | 143 | 90 | 42 | 12 | [illegible]

Test the hypothesis, at the 0.05 level of significance, that the distribution of grades is uniform.

Home Work-II
* The number of days per week a planning firm remains off in the working season is hypothesized to be a rare experiment. The following data give the observed number of days per week the firm remained off for the last 1000 weeks:

Number of off days per week | 0   | 1   | 2   | 3           | 4  | 5 | 6 | 7
Number of weeks             | 305 | 366 | 210 | [illegible] | 28 | 9 | 2 | 1

Test the hypothesis at the 5% significance level.

Home Work-III
The results of an Annual Job Satisfaction Survey showed that 28% of field planners are very satisfied with their job, 46% are somewhat satisfied, 12% are neither satisfied nor dissatisfied, 10% are somewhat dissatisfied, and 4% are very dissatisfied. A sample of 500 off-field planners yielded the following information on job satisfaction:

Category              | Number of Respondents
Very satisfied        | 105
Somewhat satisfied    | 235
Neither               | 55
Somewhat dissatisfied | 90
Very dissatisfied     | 15

Use $\alpha$ = 0.05 and test to determine whether the job satisfaction for off-field planners is different from the job satisfaction for field planners.
HINT: (Use the chi-squared Goodness of Fit Test.)

Home Work-IV
* Three cards are drawn from an ordinary deck of well-shuffled playing cards and the number of spades is noted. After repeating the experiment 64 times, the following data were obtained:

Number of Spades          | 0  | 1  | 2  | 3
Number of times ($o_i$)   | 21 | 31 | 12 | 0

Test the hypothesis, at the 0.01 level of significance, that the number of spades follows the binomial distribution.

Home Work-V
* A machine is designed to mix peanuts, hazelnuts, cashews and pecans, respectively, in the ratio 3:2:2:1. A recently mixed can of 500 nuts was found to have 269 peanuts, 112 hazelnuts, 74 cashews, and 45 pecans. At the 5% level of significance, test the hypothesis that the machine is working properly.
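
The arithmetic in Example-2 and Example-3 can be cross-checked with a short script. The following is a minimal sketch in Python and is not part of the original slides. The helper names (`chi_square_statistic`, `merge_small_tail`) and the hard-coded table of 5% critical values are assumptions made for illustration. The merging helper only pools cells from the right-hand end of the table, which is enough for Example-3 but is not a general adjustment rule.

```python
import math

# Upper 5% critical values of the chi-squared distribution, copied from a
# standard table for 1 to 4 degrees of freedom (assumption: alpha = 0.05 only).
CRITICAL_5_PERCENT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}


def chi_square_statistic(observed, expected):
    """Return the sum of (o - e)^2 / e over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def merge_small_tail(observed, expected, minimum=5.0):
    """Pool the last cell into its left neighbour until the last cell's
    observed and expected frequencies are both at least `minimum`.
    Hypothetical helper: it inspects the right-hand tail only."""
    obs, exp = list(observed), list(expected)
    while len(obs) > 1 and (obs[-1] < minimum or exp[-1] < minimum):
        o, e = obs.pop(), exp.pop()
        obs[-1] += o
        exp[-1] += e
    return obs, exp


# Example-2: equal proportions across three education groups.
observed_2 = [44, 41, 40]
n2 = sum(observed_2)                      # 125
expected_2 = [n2 / 3] * 3                 # 41.67 per group
stat_2 = chi_square_statistic(observed_2, expected_2)
df_2 = len(observed_2) - 1                # only the total n is used
reject_2 = stat_2 > CRITICAL_5_PERCENT[df_2]
print("Example-2:", round(stat_2, 3), "reject H0:", reject_2)

# Example-3: Poisson fit to the typist's mistakes per day.
mistakes = [0, 1, 2, 3, 4, 5, 6]
days = [143, 90, 42, 12, 9, 3, 1]
n3 = sum(days)                            # 300
lam = sum(x * f for x, f in zip(mistakes, days)) / n3   # sample mean, 0.89
probs = [math.exp(-lam) * lam ** x / math.factorial(x) for x in mistakes]
expected_3 = [n3 * p for p in probs]
obs_adj, exp_adj = merge_small_tail(days, expected_3)
stat_3 = chi_square_statistic(obs_adj, exp_adj)
df_3 = len(obs_adj) - 2                   # total n and mean lambda are used
reject_3 = stat_3 > CRITICAL_5_PERCENT[df_3]
print("Example-3:", obs_adj, round(stat_3, 2), "reject H0:", reject_3)
```

The script works from unrounded Poisson probabilities, so its Example-3 statistic can differ in the third decimal place from a hand calculation based on rounded table entries. The decision is the same in both cases.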
