Edexcel AS and A-level Statistics S3
Edexcel AS and A-level Modular Mathematics

Contents

About this book
1 Combinations of random variables
  1.1 Finding the distribution of random variables
2 Sampling
  2.1 Populations and sampling
  2.2 Random sampling
  2.3 Simple random sampling
  2.4 Other methods of sampling
  2.5 Non-random sampling
  2.6 Primary and secondary sources of data
3 Estimation, confidence intervals and tests
  3.1 Concept of statistic and sampling distribution
  3.2 Estimation of population parameters using a sample
  3.3 Standard error of the mean
  3.4 The Central Limit Theorem
  3.5 Confidence intervals
  3.6 Hypothesis tests
  3.7 Hypothesis test for the difference between two means
  3.8 Large samples
Review Exercise 1
4 Goodness of fit and contingency tables
  4.1 Forming a hypothesis
  4.2 Goodness of fit
  4.3 Degrees of freedom
  4.4 The chi-squared ($\chi^2$) family of distributions
  4.5 Testing your hypothesis
  4.6 The general method for testing the goodness of fit
  4.7 Applying goodness-of-fit tests to discrete data
  4.8 Applying goodness-of-fit tests to continuous distributions
  4.9 Contingency tables
5 Regression and correlation
  5.1 Spearman's rank correlation coefficient
  5.2 Testing the hypothesis that a correlation coefficient is zero
  5.3 Testing the hypothesis that Spearman's population rank correlation coefficient is zero
Review Exercise 2
Examination style paper
Appendix
Answers
Index

About this book

This book is designed to provide you with the best preparation possible for your Edexcel S3 unit examination:
• This is Edexcel's own course for the GCE specification.
• Written by senior examiners.
• The LiveText CD-ROM in the back of the book contains even more resources to support you through the unit.

Finding your way around the book
• Detailed contents list: shows which parts of the S3 specification are covered in each section.
• Brief chapter overview and 'links' to underline the importance of mathematics: to the real world, to your study of further units and to your career.
• Every few chapters, a review exercise helps you consolidate your learning.
• Each section begins with a statement of what is covered in the section.
• Concise learning points.
• Step-by-step worked examples: they are model solutions and include examiners' hints.
• Past examination questions are marked 'E'.
• Each section ends with an exercise: the questions are carefully graded so they increase in difficulty and gradually bring you up to standard.
• Each chapter has a different colour scheme, to help you find the right chapter quickly.
• Each chapter ends with a mixed exercise and a summary of key points.
• At the end of the book there is an examination-style paper.

LiveText software
The LiveText software gives you additional resources: Solutionbank and Exam café. Simply turn the pages of the electronic book to the page you need, and explore!

Unique Exam café feature:
• Relax and prepare: revision planner; hints and tips; common mistakes.
• Refresh your memory: revision checklist; language of the examination; glossary.
• Get the result!: fully worked examination-style paper with chief examiner's commentary.

Solutionbank
• Hints and solutions to every question in the textbook.
• Solutions and commentary for all review exercises and the practice examination paper.

Published by Pearson Education Limited, a company incorporated in England and Wales, having its registered office at Edinburgh Gate, Harlow, Essex, CM20 2JE. Registered company number: 872828.
Edexcel is a registered trademark of Edexcel Limited.
Text © Greg Attwood, Alan Clegg, Gill Dyer, Jane Dyer 2008
British Library Cataloguing in Publication Data is available from the British Library on request.
ISBN 978 0 435519 14 8

Copyright notice
All rights reserved. No part of this publication may be reproduced in any form or by any means (including photocopying or storing it in any medium by electronic means and whether or not transiently or incidentally to some other use of this publication) without the written permission of the copyright owner, except in accordance with the provisions of the Copyright, Designs and Patents Act 1988 or under the terms of a licence issued by the Copyright Licensing Agency, Saffron House, 6-10 Kirby Street, London EC1N 8TS. Applications for the copyright owner's written permission should be addressed to the publisher.

Edited by Susan Gardner
Typeset by Tech-Set Ltd, Gateshead
Illustrated by Tech-Set Ltd, Gateshead
Index by Indexing Specialists (UK) Ltd
Cover design by Christopher Howson
Picture research by Chrissie Martin
Cover photo/illustration © Science Photo Library
Printed in China

Acknowledgements
The author and publisher would like to thank the following individuals and organisations for permission to reproduce photographs: Getty Images; Alamy Images.
Every effort has been made to contact copyright holders of material reproduced in this book. Any omissions will be rectified in subsequent printings if notice is given to the publishers.

This material has been endorsed by Edexcel and offers high-quality support for the delivery of Edexcel qualifications. Edexcel endorsement does not mean that this material is essential to achieve any Edexcel qualification, nor does it mean that this is the only suitable material available to support any Edexcel qualification. No endorsed material will be used verbatim in setting any Edexcel examination and any resource lists produced by Edexcel shall include this and other appropriate texts. Copies of official specifications for all Edexcel qualifications may be found on the Edexcel website.
1 Combinations of random variables

After completing this chapter you should be able to:
• combine independent normal random variables
• combine linear combinations of independent normal random variables.

A sweet manufacturer produces two varieties of fruit sweet, Xtras and Yummies. The weights, $X$ and $Y$, in grams, of randomly selected Xtras and Yummies are such that $X \sim N(30, 25)$ and $Y \sim N(32, 16)$. The manufacturer wishes to work out the probability that the average weight of a packet containing 6 Xtras and 4 Yummies lies between 28 g and 33 g. By the end of this chapter you will know how to combine these distributions and work this out.

If $X$ and $Y$ are two random variables then
• $E(X + Y) = E(X) + E(Y)$
• $E(X - Y) = E(X) - E(Y)$

If $X$ and $Y$ are two independent random variables then
• $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$
• $\mathrm{Var}(X - Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$

The proofs of these relationships are not needed at this stage but they are used when combining independent variables.

Example
$X$ is a random variable with $E(X) = \mu_1$ and $\mathrm{Var}(X) = \sigma_1^2$, and $Y$ is an independent random variable with $E(Y) = \mu_2$ and $\mathrm{Var}(Y) = \sigma_2^2$. Find the mean and variance of: a $X + Y$, b $X - Y$.

a $E(X + Y) = E(X) + E(Y) = \mu_1 + \mu_2$
  $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) = \sigma_1^2 + \sigma_2^2$
b $E(X - Y) = E(X) - E(Y) = \mu_1 - \mu_2$
  $\mathrm{Var}(X - Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) = \sigma_1^2 + \sigma_2^2$

In S1, Chapter 8 the following properties of expectation were introduced.
• $E(aX) = aE(X)$
• $\mathrm{Var}(aX) = a^2\mathrm{Var}(X)$

Using these it can be shown that
• $E(aX + bY) = aE(X) + bE(Y)$
• $E(aX - bY) = aE(X) - bE(Y)$
• $\mathrm{Var}(aX + bY) = a^2\mathrm{Var}(X) + b^2\mathrm{Var}(Y)$
• $\mathrm{Var}(aX - bY) = a^2\mathrm{Var}(X) + b^2\mathrm{Var}(Y)$

A linear combination of normal variables is also normal, so if $X \sim N(\mu_1, \sigma_1^2)$ and $Y \sim N(\mu_2, \sigma_2^2)$, and $X$ and $Y$ are independent, then
• $aX + bY \sim N(a\mu_1 + b\mu_2,\ a^2\sigma_1^2 + b^2\sigma_2^2)$
• $aX - bY \sim N(a\mu_1 - b\mu_2,\ a^2\sigma_1^2 + b^2\sigma_2^2)$

The general form can be extended to any number of random variables.
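The rule for linear combinations can be sketched numerically. The Python below is a minimal illustration, not part of the original text: the helper name `combine` is hypothetical, and it assumes that the weights of all ten sweets in the packet are independent. It applies the mean and variance rules to the average weight of 6 Xtras and 4 Yummies from the chapter opening, then uses the standard library's `NormalDist` to evaluate the probability.

```python
from math import sqrt
from statistics import NormalDist


def combine(terms):
    """Return (mean, variance) of sum(a_i * X_i) for independent normals.

    terms is a list of (a, mu, var) triples: coefficient, mean and variance.
    """
    mean = sum(a * mu for a, mu, _ in terms)
    var = sum(a * a * v for a, _, v in terms)  # coefficients are squared
    return mean, var


# Average weight of 6 Xtras (X ~ N(30, 25)) and 4 Yummies (Y ~ N(32, 16)):
# A = (X1 + ... + X6 + Y1 + ... + Y4) / 10, so every coefficient is 1/10.
terms = [(0.1, 30, 25)] * 6 + [(0.1, 32, 16)] * 4
mean_a, var_a = combine(terms)

avg = NormalDist(mean_a, sqrt(var_a))  # NormalDist takes the standard deviation
prob = avg.cdf(33) - avg.cdf(28)
print(f"A ~ N({mean_a:.1f}, {var_a:.2f}); P(28 < A < 33) = {prob:.4f}")
```

Under these assumptions the average weight has mean 30.8 g and variance 2.14, because each of the ten coefficients of 1/10 is squared before it multiplies the corresponding variance.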
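Note that adding two independent observations is not the same as doubling one observation. If $X_1$ and $X_2$ are independent, each with variance $\sigma^2$, then $\mathrm{Var}(X_1 + X_2) = \sigma^2 + \sigma^2 = 2\sigma^2$, whereas $\mathrm{Var}(2X_1) = 4\sigma^2$. So $\mathrm{Var}(X_1 + X_2) \neq \mathrm{Var}(2X)$.

Example
$X_1 \sim N(15, 3)$ and $X_2 \sim N(9, 2)$. ($N(15, 3)$ means that the mean is 15 and the variance 3.) If $X_1$ and $X_2$ are independent, find the distribution of $Y$ where: a $Y = X_1 + X_2$, b $Y = 4X_1 - 2X_2$.

a Using $X_1 + X_2 \sim N(\mu_1 + \mu_2,\ \sigma_1^2 + \sigma_2^2)$:
  $Y \sim N(15 + 9,\ 3 + 2)$, so $Y \sim N(24, 5)$.
b Using $4X_1 - 2X_2 \sim N(4\mu_1 - 2\mu_2,\ 4^2\sigma_1^2 + 2^2\sigma_2^2)$:
  $Y \sim N(4 \times 15 - 2 \times 9,\ 16 \times 3 + 4 \times 2)$, so $Y \sim N(42, 56)$.

Example
$X_1$, $X_2$ and $X_3$ are independent normal random variables such that $X_1 \sim N(\ldots, 2^2)$, $X_2 \sim N(\ldots, 2^2)$ and $X_3 \sim N(18, 3^2)$, and $Y$ is a random variable defined by $Y = 3X_1 - X_2 + X_3$. Find the distribution of $Y$.

$Y \sim N(\ldots,\ 9 \times 4 + 4 + 9)$
$Y \sim N(22, 49)$

Example
Bottles of mineral water are delivered to shops in crates containing 12 bottles each. The weights of bottles are normally distributed with mean weight 2 kg and standard deviation 0.05 kg. The weights of empty crates are normally distributed with mean 2.5 kg and standard deviation 0.3 kg.
a Assuming that all random variables are independent, find the probability that a full crate will weigh between 26 kg and 27 kg.
b Two bottles are selected at random from a crate. Find the probability that they differ in weight by more than 0.1 kg.
c Find the maximum weight, $M$, that a full crate should have on its label so that there is only a 1% chance that it will weigh more than $M$.

a Let $W = X_1 + X_2 + \ldots + X_{12} + C$ where $X \sim N(2, 0.05^2)$ and $C \sim N(2.5, 0.3^2)$.
  $E(W) = 12E(X) + E(C) = (12 \times 2) + 2.5 = 26.5$
  $\mathrm{Var}(W) = 12\,\mathrm{Var}(X) + \mathrm{Var}(C) = (12 \times 0.05^2) + 0.3^2 = 0.12$
  $W \sim N(26.5, 0.12)$
  $P(26 < W < 27) = P(-1.44 < Z < 1.44) = 2(0.9251) - 1 = 0.850$ (0.851 without rounding $z$)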
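b The difference in weight of the two bottles is $X_1 - X_2 \sim N(0,\ 2 \times 0.05^2) = N(0, 0.005)$.
  $P(|X_1 - X_2| > 0.1) = 2P(X_1 - X_2 > 0.1) = 2P\left(Z > \dfrac{0.1}{\sqrt{0.005}}\right) = 2P(Z > 1.41) = 2(1 - 0.9207) = 0.159$ (0.157 without rounding $z$)
c $P(W > M) = 0.01$, so $\dfrac{M - 26.5}{\sqrt{0.12}} = 2.3263$ and $M = 26.5 + 2.3263\sqrt{0.12} \approx 27.3$ kg.

Exercise

1 Given the random variables $X \sim N(80, 3^2)$ and $Y \sim N(50, 2^2)$ where $X$ and $Y$ are independent, find the distribution of $W$ where: a $W = X + Y$, b $W = X - Y$.

2 Given the random variables $X \sim N(45, 6)$, $Y \sim N(54, 4)$ and $W \sim N(49, 8)$ where $X$, $Y$ and $W$ are independent, find the distribution of $R$ where $R = X + Y + W$.

3 $X_1$ and $X_2$ are independent normal random variables. $X_1 \sim N(60, 25)$ and $X_2 \sim N(60, 16)$. Find the distribution of $T$ where: a $T = 3X_1$, b $T = 7X_2$, c $T = 3X_1 + 2X_2$, d $T = X_1 - 2X_2$.

4 $Y_1$, $Y_2$ and $Y_3$ are independent normal random variables. $Y_1 \sim N(8, 2)$, $Y_2 \sim N(12, 3)$ and $Y_3 \sim N(15, 4)$. Find the distribution of $A$ where: a $A = Y_1 + Y_2 + Y_3$, b $A = Y_3 - Y_1$, c $A = Y_1 - Y_2 + 3Y_3$, d $A = 3Y_1 + 4Y_3$, e $A = 2Y_1 - Y_2 + Y_3$.

5 $A$, $B$ and $C$ are independent normal random variables. $A \sim N(50, 6)$, $B \sim N(60, 8)$ and $C \sim N(80, 10)$. Find: a $P(A + B < 115)$, b $P(A + B + C > 198)$, c $P(B + C < 138)$, d $P(2A + B - C < 70)$, e $P(A + 3B - C > 140)$, f $P(105 < (A + B) < 116)$.

6 Given the random variables $X \sim N(20, 5)$ and $Y \sim N(10, 4)$ where $X$ and $Y$ are independent, find: a $E(X - Y)$, b $\mathrm{Var}(X - Y)$, c $P(13 < \ldots$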
• $aX + bY \sim N(a\mu_1 + b\mu_2,\ a^2\sigma_1^2 + b^2\sigma_2^2)$
• $aX - bY \sim N(a\mu_1 - b\mu_2,\ a^2\sigma_1^2 + b^2\sigma_2^2)$

2 Sampling

After completing this chapter you should be able to:
• take a simple random sample
• use random numbers for sampling
• take a stratified sample
• take a systematic sample
• take a quota sample
• know the circumstances in which each method of sampling might be used.

A charity wants you to find out which of their proposed projects is most popular with the public. What should you do? Do you interview everyone in the country or do you take a sample? If you take a sample, what method should you use, and why? After reading this chapter you will be able to answer these questions and carry out such an investigation.

2.1 You need to know about populations and sampling.

Statistically, a population is the whole set of items that are of interest. For example, if you want to find the mean height of students in a certain sixth form, then the population would consist of the heights of all the sixth form students.
Information may be obtained from a population by taking a census or by taking a sample. The information obtained is known as raw data.

Taking a census
A census observes or measures every member of a population. To find the mean height of students in a certain sixth form, you could measure each student.
Perhaps the best known census is that conducted by the British Government. In this census every known householder in Great Britain receives a census form every 10 years. Each householder is required by law to complete and return the form by a certain date. The census form records a variety of information, such as the number of people present, their ages, and so on.
A census is used if
• the size of the population is small, or if
• extreme accuracy is required.

Sampling
Even when the number of students in a sixth form is quite small, it would take time and be costly to find the height of each one.
A sample is a selection of observations taken from a sub-set of the population, which is used to find out information about the population as a whole.
This is known as a sample survey. In a census, by contrast, every member of the population is used.
(Figure: a sample taken from the population of the heights of a class of 25 pupils.)

The sample will be truly representative of the population as a whole provided that you select it so that it is free from bias. To do this you must make sure that your selection is truly random.

The size of a sample (the number of people or units sampled) does not depend entirely on the size of the population. It depends on the accuracy you require and the resources you are willing to allocate to data collection. A large sample will usually be more accurate than a small one, but will need greater resources. The size of a sample will also depend on what you are going to use it for.

The number of items or people sampled may also be affected by the nature of the population: if the population is very variable you will require a larger sample size than you would if the population were more uniform.

Both methods have advantages and disadvantages.

Census
Advantages:
• It should give a completely accurate result.
Disadvantages:
• It is very time consuming and expensive.
• It cannot be used when the testing process is to destruction (for example, testing an apple for sweetness).
• The information is difficult to process because there is so much of it.

Sample survey
Advantages:
• A sample survey costs less than a census.
• Results are obtained quicker for a sample survey than for a census.
• Fewer people have to respond in the sample.
• There is less data to deal with than in a census.
Disadvantages:
• The data may not be as accurate.
• The sample may not be large enough to give information about small sub-groups of the population.

Example
Give a brief explanation, and an example of the use of, a a census, b a sample survey.

a Census: every member of the population is observed.
  Example: 10-year national census.
b Sample survey: a small portion of the population is observed.
  Example: opinion polls.
(Don't forget to give both an explanation and an example.)

2.2 You need to know about random sampling.

In random sampling each unit is chosen entirely by chance and each member of the population has a known chance of being included in the sample.

Sampling with and without replacement
• If the unit selected at each draw is replaced into the population before the next draw, then it can appear more than once in the sample. This is known as sampling with replacement.
• If the unit is not replaced, so that only those units that have not previously been selected are eligible for the next draw, then it is known as sampling without replacement.

Two well-known examples of random sampling are ERNIE (Electronic Random Number Indicating Equipment), which is used to select winning numbers on Premium Bonds, and the selection of numbers for the national lottery.

2.3 You need to know about simple random sampling.

Suppose you wish to take a sample of size $n$ from a population.
• A sample of size $n$ is called a simple random sample if every possible sample of size $n$ has an equal chance of being selected.
Simple random sampling is sampling without replacement. To do simple random sampling you need a sampling frame.
• A sampling frame is a list identifying every single sampling unit that could be included in the sample.

Simple random sampling
Advantages (provided that the population is small in size):
• It is cheap to do.
• It is simple to do.
• Standard formulae can be used to analyse the results.
• Each person or unit is included only once.
Disadvantages:
• It is not suitable where the population size is large.
• A sampling frame is required.

There are two simple techniques that are commonly used and do not require elaborate equipment:
• random number sampling
• lottery or ticket sampling.

Random number sampling
In random number sampling each element of the sampling frame is assigned a number. For a population of 400 you could assign the numbers 000, 001, 002, ..., 398, 399.

Once you have done this you can use tables of random sampling numbers such as the one at the back of this book (table on page 139). These tables contain 1000 or more digits, that is to say, integers from 0 to 9. The table is constructed with great care so that each digit is equally likely to appear.

Suppose you want a sample of 50. You will need to select 50 random numbers from the table. For a sample from a population of 400 you need to use three-digit numbers, for example 018, 276, etc. You could start at the top left hand corner and work down the column; if you reach the bottom of the table you could start again at the top with the next unused digits along the top row.

To obtain a set of random numbers, you may start at the top of the table and read downwards, but it is better to start at a randomly selected place in the table, and you may travel in any direction. If a number appears that has already appeared, it is ignored (in effect this is then sampling without replacement).

Once you have extracted 50 random numbers, the sample is selected from the numbered sampling frame by using these numbers.
• In random number sampling, each element is given a number and the numbers of the required elements are selected by using random number tables or other random number generators.

Example
You are going to take a sample of 50 from a population of size 400. Write down the first five random numbers starting at the seventh column from the left of the table on page 139 and working down.
The seventh column begins 3726... Starting at the top, form three-digit numbers. The first is 372.
Continue down the column. Ignore 415 because it is greater than 399. Ignore 875 and 951. The next number is 117.
The numbers are 372, 039, 172, 117 and 056.

Computers and calculators can produce lists of random numbers.

Random number sampling has advantages and one disadvantage.

Random number sampling
Advantages:
• The numbers are truly random and free from bias.
• It is easy to use.
• Each number has a known equal chance of selection.
Disadvantage:
• It is not suitable where the population size is large.

Lottery sampling
In lottery sampling each element of the population is identified by some characteristic such as a name or number, and this is put on a ticket. The tickets, which should all be the same size and shape, are put into a container and are drawn one at a time (without replacement). The elements of the population corresponding to the tickets are selected.

Lottery sampling
Advantages:
• The tickets are drawn at random.
• It is easy to use.
• Each ticket has a known chance of selection.
Disadvantages:
• It is not suitable where the population size is large.
• A sampling frame is needed.

Example
Describe what is meant by a random sample, and give one advantage and one disadvantage associated with it.

A random sample is one in which every possible sample of size $n$ has an equal chance of being selected.
Advantage: It is free from bias. (Any of the other advantages could have been given here.)
Disadvantage: It is not suitable for large sample sizes.

Example
The 100 members of a yacht club are listed numerically in the club's membership book. The committee wants to select a sample of 12 members to fill in a questionnaire about the facilities offered by the club.
a Explain how the committee could use a table of random numbers to take a simple random sample of the members.
b Give one advantage of this method over taking a census.

a Allocate a two-digit number to each person, starting at 00 and ending at 99.
  Select a random starting point in the table, for example the 5th column and 7th row.
  Select 12 random numbers. Reading across gives 56, 86, 80, 57, 11, 78, 40, ...; reading vertically gives 56, 71, 66, 87, 09, 11, 48, 14, 33, 79, 12, 02. If a number occurs twice, as 86 does when reading across, it is ignored the second time.
  Go back to the original population and select the people corresponding to these numbers.
b A sample survey costs less than a census.
  OR Results are obtained quicker.
  OR Fewer people have to respond in the sample.

Exercise

1 Explain briefly what is meant by the term sampling and give three advantages of taking a sample as opposed to a census.

2 Define what is meant by a census. By referring to specific examples, suggest two reasons why a census might be used.

3 A factory makes safety harnesses for climbers and has an order to supply 3000 harnesses. The buyer wishes to know that the load at which the harness breaks exceeds a certain figure. Suggest a reason why a census would not be used for this purpose.

4 Explain:
a why a sample might be preferred to a census,
b what you understand by a sampling frame,
c what effect the size of the population has on the size of the sampling frame,
d what effect the variability of the population has on the size of the sampling frame.
5 Using the random numbers 4 and 3 to give you the column and line respectively in the random number table (table on page 139), select a sample of size 6 from the numbers: a 0-99, b 50-150, c 1-600.

2.4 You need to know about other methods of sampling.

Systematic sampling
• In systematic sampling the required elements are chosen at regular intervals from an ordered list.

To take a systematic sample, you take every $k$th element from a sampling frame, where $k$, the sampling interval, is calculated as:

$k = \dfrac{\text{population size } (N)}{\text{sample size } (n)}$

For example, with a sampling interval of 8, pick a number at random between 1 and 8. If it is, say, the number 3, start at the third name on the list followed by the 11th, 19th, etc.

To overcome the objection that the first name is bound to be selected, you introduce a direct element of randomness by selecting the first item randomly.

When you are selecting the interval, it is possible to introduce bias if you are not careful. Suppose you were investigating the mean rainfall each month over 100 years: an interval of 12 months would introduce bias, as you would be looking at the same month in each year.

Systematic sampling is used when:
• the population is too large for simple random number sampling.

Systematic sampling
Advantages:
• It is simple to use.
• It is suitable for large samples.
Disadvantages:
• It is only random if the ordered list is truly random.
• It can introduce bias.

Stratified sampling
This is a form of random sampling in which the population is divided into groups or categories which are mutually exclusive, so no individual or item can be in two groups, and it is used where we may expect the observation of interest to vary between the different groups. For example, a sixth form could be split into upper sixth and lower sixth formers, so that both groups are proportionately represented in the sample. These groups are called strata (singular: stratum).
The strata would be decided according to one or more criteria such as gender, age, religion and so on. Within each of these strata a simple random sample is selected. The same proportion of each stratum is taken in the sample as is found in the population, so that each stratum will be represented in the correct proportion in the overall result.

$\text{number sampled in a stratum} = \dfrac{\text{number in stratum}}{\text{number in population}} \times \text{overall sample size}$

• In stratified sampling the population is divided into mutually exclusive strata and a random sample is taken from each. The proportion for each stratum is the same as that in the population.

(Figure: the sampling frame of the population is split into a sampling frame for each stratum; simple random samples are taken from these.)

Example
A factory manager wants to find out what his workers think about the factory canteen facilities. He decides to give a questionnaire to a sample of 80 workers. It is thought that different age groups will have different opinions.
There are 75 workers between 18 and 32.
There are 140 workers between 33 and 47.
There are 85 workers between 48 and 62.
a Write down the name of the method of sampling the manager should use.
b Explain how he could use this method to select a sample of workers' opinions.

a Stratified sampling.
b Find the total number of workers. There are $75 + 140 + 85 = 300$ workers altogether.
  For each age group find the number of workers needed for the sample: number = proportion of workers $\times$ 80.
  In the 18-32 age group, he will select $\frac{75}{300} \times 80 = 20$ workers.
  In the 33-47 age group, he will select $\frac{140}{300} \times 80 = 37\frac{1}{3} \approx 37$ workers.
  In the 48-62 age group, he will select $\frac{85}{300} \times 80 = 22\frac{2}{3} \approx 23$ workers.
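The proportional allocation in this example can be sketched in a few lines of Python. This is an illustration only: the function name `stratified_allocation` and the numbering of workers from 1 upwards within each age group are assumptions, and the standard library's `random.sample` stands in for a table of random numbers.

```python
import random


def stratified_allocation(strata_sizes, sample_size):
    """Number to sample from each stratum, in proportion to its size."""
    total = sum(strata_sizes.values())
    # Rounded shares can occasionally miss sample_size by one; check the total.
    return {name: round(size / total * sample_size)
            for name, size in strata_sizes.items()}


workers = {"18-32": 75, "33-47": 140, "48-62": 85}
allocation = stratified_allocation(workers, 80)
print(allocation)

# Hypothetical numbering: workers in each age group are labelled 1..size,
# and a simple random sample (without replacement) is drawn within each group.
rng = random.Random(1)
chosen = {name: sorted(rng.sample(range(1, workers[name] + 1), k))
          for name, k in allocation.items()}
```

For these figures the rounded allocations are 20, 37 and 23, which add up to the required 80. With other figures the rounded values may not sum exactly to the sample size, so the total should be checked before drawing.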
  (Where the required number of workers is not a whole number, round to the nearest whole number.)
  The workers in each age group would be numbered and a random number table (or generator) would produce the required quantity of random numbers. The workers corresponding to these numbers would be asked their opinions.

Stratified sampling is used when:
• the sample is large and
• the population divides naturally into mutually exclusive groups.

Stratified sampling
Advantages:
• It can give more accurate estimates than simple random sampling where there are clear strata present.
• It reflects the population structure.
Disadvantages:
• Within the strata, the problems are the same as for any simple random sample.
• If the strata are not clearly defined they may overlap.

2.5 You need to know about non-random sampling.

The chief characteristic of simple random, systematic and stratified sampling is that every individual has a known probability of being included in the sample: the sample is random. Non-random sampling methods are used when it is not possible to use random methods, for example, when no sampling frame is available. An example of non-random sampling is quota sampling.

Quota sampling
• In quota sampling the population is divided into groups in terms of gender, social class, etc. The number of people in each group is set to try and reflect the group's proportion in the whole population. The interviewer selects the actual sampling units.

When taking a quota sample, as you meet people you assess their age or socio-economic group, etc. After they have been interviewed, they are put towards the quota into which they fit. This continues until all the quotas have been filled. If a person refuses to be interviewed, or the quota into which they would fit is full, then you simply ignore them and pass on to the next person.
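The quota-filling procedure just described can be sketched as a simple loop. The Python below is a minimal illustration under stated assumptions: the function name `quota_sample`, the passers-by and the quotas are all hypothetical, and a refusal is modelled by the classifier returning `None`.

```python
def quota_sample(people, quotas, classify):
    """Fill each quota from people in the order they are met.

    people   : iterable of respondents (any objects)
    quotas   : dict mapping group name -> number required
    classify : function returning a person's group, or None if they refuse
    """
    remaining = dict(quotas)
    selected = []
    for person in people:
        group = classify(person)
        if group is None or remaining.get(group, 0) == 0:
            continue  # refusal, or that quota is already full: pass on
        selected.append(person)
        remaining[group] -= 1
        if not any(remaining.values()):
            break  # every quota has been filled
    return selected


# Hypothetical passers-by as (name, age group, willing to be interviewed).
passers_by = [("Ann", "18-29", True), ("Bob", "30-44", False),
              ("Cy", "18-29", True), ("Di", "18-29", True),
              ("Ed", "30-44", True), ("Flo", "30-44", True)]
quotas = {"18-29": 2, "30-44": 1}
sample = quota_sample(passers_by, quotas,
                      lambda p: p[1] if p[2] else None)
print([name for name, _, _ in sample])
```

The sketch keeps no record of who refused or who was passed over, and the order in which people are met decides who is included. These are the features that make the method non-random.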
In practice you might also decide to take gender into account, but the more characteristics you introduce the harder it becomes to select people fitting all the characteristics.

For example, suppose you wished to get an idea about how the people within your constituency are going to vote in an election. You would interview a different number from each age group. You would then take a sample so that the proportions of each age group represented the proportions present within the whole constituency. The number beside each age group is known as the quota for that group.

A quota sampling scheme

Age group | Socio-economic group | Number/Quota
18-29     | A/B                  | 4
          | C                    | 18
          | D/E                  | 4
30-44     | A/B                  | 6
          | C                    | 7
          | D/E                  | 4
45-64     | A/B                  | 7
          | C                    | 7
          | D/E                  | 6
65-85     | A/B                  | 4
          | C                    | 2
          | D/E                  | 6
Total     |                      | 100

Quota sampling
Advantages:
• It enables the fieldwork to be done quickly because a representative sample can be achieved with a small sample size.
• Costs are kept to a minimum.
• Administering the test is easy.
Disadvantages:
• It is not possible to estimate the sampling errors. (The process is not a random process.)
• The interviewer has to choose the respondents and may not be able to judge the characteristics easily.
• Non-responses are not recorded. (Perhaps the non-respondent in the constituency survey did not agree to be interviewed because he was a 'don't know' voter.)
• It can introduce interviewer bias in who is included.

2.6 You need to know about primary and secondary sources of data.

Primary data are data that are collected by, or on behalf of, the person who is going to use the data.
Secondary data are data that are neither collected by, nor on behalf of, the person who is to use the data. The data are second hand.

Primary data
Advantages:
• The collection method is known.
• The accuracy is known.
• The exact data needed are collected.
Disadvantages:
• It is costly in time and effort.

Secondary data
Advantages:
• They are cheap to obtain: government publications, for example, are relatively cheap.
• A large quantity of data is available, for example, on the internet.
• Much of the data has been collected for years and can be used to plot trends.
Disadvantages:
• Bias is not always recognised.
• It can be in a form that is unsuitable.

Exercise

1 Explain briefly the difference between a census and a sample survey.

2 Write brief notes on:
a simple random sampling,
b stratified sampling,
c systematic sampling,
d quota sampling.
Your notes should include the definition, and any advantages and disadvantages associated with each method of sampling.

3 a Explain the purpose of stratification in carrying out a sample survey.
b The headteacher of an infant school wishes to take a stratified sample of 20% of the pupils at his school. The school has the following numbers of pupils.

Year 1 | Year 2 | Year 3
40     | 60     | 80

Work out how many pupils in each age group there will be in the sample.

4 A survey is to be done on the adult population of a certain city suburb, the population of which is 2000. An ordered list of the inhabitants is available.
a What sampling method would you use and why?
b What condition would have to be applied to your ordered list if the selection is to be truly random?

5 a Explain briefly:
i why it is often desirable to take samples,
ii what you understand by a sampling frame.
b State one circumstance when you would consider using
i systematic sampling,
ii stratification when sampling from a population,
iii quota sampling.

6 A factory manager wants to get information about the ways his workers travel to work. There are 480 workers in the factory, and each has a clocking in number. The numbers go from 1 to 480. Explain how the manager could take a systematic sample of size 30 from these workers.

Mixed exercise

1 Using the random numbers on page 139, and starting at the top of the column with the number 88 and working down, a simple random sample (without replacement) of size 10 was taken of numbers between 0 and 75 inclusive. The first two numbers were 17 and 52.
a Find the other eight numbers in the sample.
1b Explain, with the ald of a practical situation, how this et of random numbers could used to take a sample of size 10 2 a Give one advantage and one disadvantage of using i acensus, fi a sample survey, 1b Its decided to take a sample of 100 fom a population consisting of $00 elements. Explain hhow you would obtain a simple random sample without replacement from this population, 3. a Explain briefly what you understand by ia population, a sampling frame, 'b A market research organisation wants to take a sample of i owners of diesel motor cars in the 4 persons living in Oxford who suffered from injuries to the back duing July 1996, ‘Suggest a suitable sampling frame in each case, 4 A.gym keeps a numbered alphabetical lst oftheir 200 clients Explain how you would choose a simple random sample of 40: 5. Write down one advantage and one disadvantage of using ‘stratified sampling, 1b simple random sampling,6. The managing director ofa factory wants to know what the workers think about the factory canteen facilities. One hundred people work in the offices and 200 work on the shop floor. He decides to ask the peaple who work inthe offices. a Suggest easons why this i likely to produce a biased sample. 'b Explain briefly how the factory manager could select a sample of 30 workers us 4 systematic sampling, HL stratified sampling, Hi quota sampling, 7 Agarden centre employs 150 workers. Sixty-five of the workers are women and 85 are men. Explain briefly how you would take a random sample of 30 workers using stratified sampling. 8 The 240 members of a bowling club are listed alphabetically in the club's membership book. The committee wishes to select a sample of 30 members to fil in a questionnaire about the facilities the club has to offer. 4 Fxplain how the committee could use a table of random numbers to take a systematic sample. 1b Give one advantage of this methox! 
over taking a simple random sample.
9 a Explain briefly what you understand by
i a population,
ii a sample.
b Give one advantage and one disadvantage of taking a sample.
10 A college of 3000 students has students registered in four departments: arts, science, education and crafts. The principal wishes to take a sample from the student population to gain information about the likely student response to a rearrangement of the college timetable in order to hold lectures on Wednesday, previously reserved for sports. What sampling method would you advise the principal to use? Give reasons to justify your choice.
11 As part of her statistics project, Deepa decided to estimate the amount of time A-level students at her school spent on private study each week. She took a random sample of students from those studying arts subjects, science subjects and a mixture of arts and science subjects. Each student kept a record of the time they spent on private study during the third week of term.
a Write down the name of the sampling method used by Deepa.
b Give a reason for using this method and give one advantage this method has over simple random sampling.
12 There are 64 girls and 56 boys in a school. Explain briefly how you could take a random sample of 15 pupils using
a a simple random sample,
b a stratified sample.

Summary of key points
1 A population is the whole set of items that are of interest.
2 A census observes or measures every member of a population.
3 A sample is a selection of observations taken from a subset of the population which is used to find out information about the population as a whole. This is known as a sample survey.
4 A random sample is one in which every possible sample of size $n$ has an equal chance of being selected.
5 A sampling frame is a list identifying every single sampling unit that could be included in the sample.
6 In random number sampling, each element is given a number to identify it and the numbers of the required elements are selected by using random number tables or other random number generators.
7 In systematic sampling the required elements are chosen at regular intervals from an ordered list.
8 In stratified sampling the population is divided into mutually exclusive strata and a simple random sample is taken from each. The proportion of the strata in the sample is the same as the proportion of the strata in the population.
9 In quota sampling the population is divided into groups in terms of gender, social class, etc. The number of people in each group is set to try and reflect the group's proportion in the whole population. The interviewer selects the actual sampling units.

After studying this chapter you should
• understand the concept of an unbiased estimate
• appreciate the significance of the Central Limit Theorem
• know how to find confidence intervals for the population mean
• be able to test hypotheses about the population mean.

The doctors say Jim is of average height. In fact, he is 1.84 m tall; but how can you find the average height of adult men? John was also told that he was of average height, but he is 1.88 m tall. Does this mean that the doctors are using a range of values to describe average height, say 1.80 m to 1.90 m perhaps? Paul is 1.92 m tall but he claims to be of average height. How can you test this claim, and what basis could you give for saying that Paul was above average height? In this chapter you will examine ways of finding estimates, as well as using probability to test claims like Paul's.

3.1 You need to understand the concept of a statistic and a sampling distribution.

Imagine that a new company is thinking of selling raincoats to students.
The company would like to know something about the heights of students, and in particular the mean height of students. Unfortunately, the number of students is so large that it is not practical to measure every student and so a method of estimating this mean height is required.

The heights of the students at the college form a large population. As in book S2, here the mean height of the students will be called $\mu$ (mu) and the standard deviation of the heights of the students $\sigma$ (sigma), and these parameters will be referred to as population parameters. They are the mean and standard deviation for the whole population. The company does not know the values of $\mu$ and $\sigma$ and it cannot afford the time or money to find them.

The problem that the company has is how to estimate the parameter $\mu$. In order to answer this question you take a sample from the population. In Chapter 2, several methods of sampling were discussed but the theory of estimation that is used in this course assumes that a simple random sample of size $n$ is used.

Population: $X$ ~ the height of students. The population mean $\mu$ and population standard deviation $\sigma$ are unknown population parameters.
Sample: the sample will consist of $n$ observations of the random variable $X$. These are usually referred to as $X_1, X_2, \dots, X_n$.

A statistic is defined as follows.
If $X_1, X_2, \dots, X_n$ is a random sample of size $n$ from some population then a statistic $T$ is a random variable consisting of any function of the $X_i$ that involves no other quantities. In particular a statistic should not involve any unknown population parameters.

Example 1
A sample $X_1, X_2, \dots, X_n$ is taken from a population with unknown population parameters $\mu$ and $\sigma$. State whether or not each of the following are statistics.
a …
b …
c …
d $\max(X_1, X_2, \dots, X_n)$
e $\mathrm{median}(X_1, X_2, \dots, X_n)$

a … is a statistic. It is only a function of the sample $X_1, X_2, \dots, X_n$. A statistic need not involve all members of the sample.
b … is not a statistic. The function contains an unknown population parameter.
c … is not a statistic. The function contains $\mu$.
d $\max(X_1, X_2, \dots, X_n)$ is a statistic.
It is only a function of the sample $X_1, X_2, \dots, X_n$.
e $\mathrm{median}(X_1, X_2, \dots, X_n)$ is a statistic. It is only a function of the sample.

Since it is possible to repeat the process of taking a sample, the particular value of a statistic $T$ in a specific case, namely $t$, will be different for each sample. If all possible samples are taken, then these values will form a probability distribution called the sampling distribution of $T$. This will usually depend upon the distribution of the population $X$.

The sampling distribution of a statistic $T$ is the probability distribution of $T$.

In Chapter 1 you saw how linear combinations of independent normal distributions could be combined. The rule for $\mathrm{Var}(aX \pm bY)$ in particular required that the random variables $X$ and $Y$ were independent. For this reason the theory in this chapter is based upon the idea of a simple random sample met in Chapter 2. The sample is usually referred to as a random sample and it has the following definition:

A random sample of size $n$ consists of the observations $X_1, X_2, X_3, \dots, X_n$ from a population where the $X_i$
• are independent random variables,
• have the same distribution as the population.

(In this chapter, and throughout this series of books, we shall distinguish between the random variable $X_i$, representing the $i$th observation in a sample, and the value $x_i$ of the $i$th observation in a sample. So, for example, if the fourth person measured was 1.85 m tall then $x_4 = 1.85$.)

Example 2
The noon-day temperature, in °C, is measured for a random sample of 5 days in July in a certain city and the following results were obtained:
28.3, 31.2, 24.0, 28.7, 30.9
Calculate the values of the following statistics.
a $\bar{X}$
b $\sum X_i^2$
c $\max(X_1, X_2, \dots, X_5) - \min(X_1, X_2, \dots, X_5)$

a $\bar{x} = \dfrac{28.3 + 31.2 + \dots + 30.9}{5}$
(We use $\bar{x}$ for the value of the mean and $\bar{X}$ for the statistic.)
b $\sum x^2$
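$= 28.3^2 + 31.2^2 + \dots + 30.9^2 = 4128.83$
(The value in part a is $\bar{x} = 28.62$.)
c The minimum value is 24.0. The maximum value is 31.2. The statistic has a value of $31.2 - 24.0 = 7.2$. This, of course, is the statistic commonly known as the range.

If the distribution of the population is known then the sampling distribution of a statistic can sometimes be found.

Example 3
The weights, in grams, of a consignment of apples are normally distributed with a mean $\mu$ and standard deviation 4. A random sample of size 25 is taken and the statistics $R$ and $T$ are calculated as follows:
$R = X_{25} - X_1$ and $T = X_1 + X_2 + \dots + X_{25}$
Find the distributions of $R$ and $T$.

The sample will be $X_1, X_2, \dots, X_{25}$ where each $X_i \sim N(\mu, 4^2)$. (State the distribution for $X_i$.)
Now $R = X_{25} - X_1$, so $R \sim N(\mu - \mu,\ 4^2 + 4^2)$, that is $R \sim N(0, (4\sqrt{2})^2)$. (Use the formulae for $E(X - Y)$ and $\mathrm{Var}(X - Y)$ from Chapter 1.)
$T = X_1 + X_2 + \dots + X_{25}$, so $T \sim N(25\mu,\ 25 \times 4^2)$, that is $T \sim N(25\mu, 20^2)$. (Extend the formulae for $E(X + Y)$ and $\mathrm{Var}(X + Y)$ from Chapter 1.)

Example 4
A large bag contains counters. Sixty per cent of the counters have the number 0 on them and forty per cent have the number 1.
a Find the mean $\mu$ and variance $\sigma^2$ for this population of counters.
A simple random sample of size 3 is taken from this population.
b List all possible samples.
c Find the sampling distribution for the mean $\bar{X} = \dfrac{X_1 + X_2 + X_3}{3}$, where $X_1$, $X_2$ and $X_3$ are the three variables representing samples 1, 2 and 3.
d Hence find $E(\bar{X})$ and $\mathrm{Var}(\bar{X})$.
e Find the sampling distribution for the mode $M$.
f Hence find $E(M)$ and $\mathrm{Var}(M)$.

a The distribution of the population is
$x$ | 0 | 1
$P(X = x)$ | $\frac{3}{5}$ | $\frac{2}{5}$
(Use the methods from book S1.)
$\mu = E(X) = \sum x\,P(X = x) = 0 + 1 \times \frac{2}{5} = \frac{2}{5}$
$\sigma^2 = \mathrm{Var}(X) = \sum x^2 P(X = x) - \mu^2 = 0 + 1^2 \times \frac{2}{5} - \frac{4}{25} = \frac{6}{25}$

b The possible samples are
(0, 0, 0)
(1, 0, 0) (0, 1, 0) (0, 0, 1)
(1, 1, 0) (1, 0, 1) (0, 1, 1)
(1, 1, 1)

c Since the sample is random the observations are independent, so to find the probability of a case such as (0, 0, 0) you multiply the probabilities $P(X_1 = 0) \times P(X_2 = 0) \times P(X_3 = 0)$. Each of these is $\frac{3}{5}$ because every $X_i$ has the same distribution as the population.
$P(\bar{X} = 0) = \left(\frac{3}{5}\right)^3 = \frac{27}{125}$, i.e. the (0, 0, 0) case
$P(\bar{X} = \frac{1}{3}) = 3 \times \left(\frac{3}{5}\right)^2 \times \frac{2}{5} = \frac{54}{125}$, i.e. the (1, 0, 0); (0, 1, 0); (0, 0, 1) cases
$P(\bar{X} = \frac{2}{3}) = 3 \times \frac{3}{5} \times \left(\frac{2}{5}\right)^2 = \frac{36}{125}$, i.e. the (1, 1, 0); (1, 0, 1); (0, 1, 1) cases
$P(\bar{X} = 1) = \left(\frac{2}{5}\right)^3 = \frac{8}{125}$, i.e. the (1, 1, 1) case

d $E(\bar{X}) = 0 + \frac{1}{3} \times \frac{54}{125} + \frac{2}{3} \times \frac{36}{125} + 1 \times \frac{8}{125} = \frac{2}{5}$
$\mathrm{Var}(\bar{X}) = 0 + \frac{1}{9} \times \frac{54}{125} + \frac{4}{9} \times \frac{36}{125} + 1 \times \frac{8}{125} - \frac{4}{25} = \frac{2}{25}$
(General formulae for $E(\bar{X})$ and $\mathrm{Var}(\bar{X})$ are given in Section 3.2.)

e $P(M = 0)$ [i.e.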
cases (0, 0, 0); (1, 0, 0); (0, 1, 0); (0, 0, 1)] $= \frac{27}{125} + \frac{54}{125} = \frac{81}{125}$ and $P(M = 1) = \frac{44}{125}$ [i.e. the other cases], so the distribution of $M$ is
$m$ | 0 | 1
$P(M = m)$ | $\frac{81}{125}$ | $\frac{44}{125}$
f $E(M) = 0 + 1 \times \frac{44}{125} = \frac{44}{125}$
and $\mathrm{Var}(M) = 0 + 1^2 \times \frac{44}{125} - \left(\frac{44}{125}\right)^2 = \frac{3564}{15625}$

Notice that $E(\bar{X}) = \mu$ but $E(M) \neq \mu$, and that neither $E(\bar{X})$ nor $E(M)$ is equal to the population mode, which is of course zero as 60% of the counters have a zero on them. These results will be examined in greater detail in Section 3.2.

Exercise
1 The random variable $H \sim N(\mu, \sigma^2)$ represents the height of a variety of flower, where $\mu$ and $\sigma^2$ are unknown population parameters. A random sample of 5 flowers of this variety is measured and their heights, in cm, are given below.
$h_1 = 38.4$, $h_2 = 32.3$, $h_3 = 34.5$, $h_4 = 37.4$, $h_5 = 32.8$
Determine which of the following are statistics.
…
2 A random sample of 6 apples is weighed and their weights, $x_i$ g, are recorded.
$x_1 = 168$, $x_2 = 185$, $x_3 = 161$, $x_4 = 172$, $x_5 = 187$, $x_6 = 176$
Calculate the values of the following statistics.
…
3 The lengths of nails produced by a certain machine are normally distributed with a mean $\mu$ and standard deviation $\sigma$. A random sample of 10 nails is taken and their lengths $X_1, X_2, X_3, \dots, X_{10}$ are measured.
i Write down the distributions of the following:
…
ii State which of the above are statistics.
4 A large bag of coins contains 1p, 5p and 10p coins in the ratio 2:2:1.
a Find the mean $\mu$ and the variance $\sigma^2$ for the value of coins in this population.
A random sample of two coins is taken and their values $X_1$ and $X_2$ are recorded.
b List all possible samples.
c Find the sampling distribution for the mean $\bar{X} = \dfrac{X_1 + X_2}{2}$.
d Hence show that $E(\bar{X}) = \mu$ and $\mathrm{Var}(\bar{X}) = \dfrac{\sigma^2}{2}$.

3.2 You need to be able to estimate population parameters using a sample.

In Section 3.1, the problem of trying to estimate the mean height of students in a sixth-form college was considered. If you take a random sample of size $n$ then you can find various statistics. The question is, are any of these statistics useful in estimating the population parameters?
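As an informal way of exploring that question, the sampling distributions found in Example 4 can be rebuilt by enumerating every sample of size 3 from the counters population. The sketch below is illustrative only: the function names are invented for this note, and exact fractions are used so that the results can be compared directly with the hand calculation.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Population from Example 4: P(X = 0) = 3/5 and P(X = 1) = 2/5.
population = {0: Fraction(3, 5), 1: Fraction(2, 5)}


def sampling_distribution(statistic, n=3):
    """Exact distribution of a statistic over all ordered samples of size n."""
    dist = Counter()
    for sample in product(population, repeat=n):
        prob = Fraction(1)
        for x in sample:
            prob *= population[x]  # observations are independent
        dist[statistic(sample)] += prob
    return dict(dist)


def expectation(dist):
    return sum(value * prob for value, prob in dist.items())


def variance(dist):
    mean = expectation(dist)
    return sum((value - mean) ** 2 * prob for value, prob in dist.items())


def sample_mean(sample):
    return Fraction(sum(sample), len(sample))


def sample_mode(sample):
    # With three values that are each 0 or 1, the mode is 1
    # exactly when at least two of the values are 1.
    return 1 if 2 * sum(sample) > len(sample) else 0


mean_dist = sampling_distribution(sample_mean)
mode_dist = sampling_distribution(sample_mode)

print("E(mean) =", expectation(mean_dist), " Var(mean) =", variance(mean_dist))
print("E(mode) =", expectation(mode_dist), " Var(mode) =", variance(mode_dist))
```

The enumeration gives $E(\bar{X}) = \frac{2}{5} = \mu$ and $E(M) = \frac{44}{125}$, so of these two statistics only the sample mean has an expected value equal to the population mean.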
A statistic that is used to estimate a population parameter is called an estimator; the particular value of this estimator generated by a particular sample is called an estimate.

Since all the $X_i$ are random variables having the same mean and variance as the population, you can sometimes find expected values of a statistic $T$, $E(T)$, and this will tell you what the 'average' value of the statistic should be.

Example 5
A random sample $X_1, X_2, \dots, X_n$ is taken from a population with $X \sim N(\mu, \sigma^2)$. Show that $E(\bar{X}) = \mu$.

In Chapter 8 of book S1 an important property of expected values was given:
$E(aX) = aE(X)$ ①
Also in Chapter 1 of this book you saw that
$E(X + Y) = E(X) + E(Y)$ ②
(You can extend formula ② by multiple applications. Consider $E(X_1 + X_2 + X_3) = E(X_1 + X_2) + E(X_3) = E(X_1) + E(X_2) + E(X_3)$.)
Now $\bar{X} = \frac{1}{n}(X_1 + \dots + X_n)$, so
$E(\bar{X}) = \frac{1}{n} E(X_1 + \dots + X_n)$ by ①
$= \frac{1}{n}\left[E(X_1) + \dots + E(X_n)\right]$ by ②
$= \frac{1}{n}(\mu + \dots + \mu) = \frac{n\mu}{n} = \mu$
That is: $E(\bar{X}) = \mu$

Example 5 shows that if we use the sample mean $\bar{X}$ as an estimator for $\mu$ then 'on average' it will give us the correct result. This is an important property for an estimator to have and you say that $\bar{X}$ is an unbiased estimator of $\mu$. (So a specific value of $\bar{x}$ will provide an unbiased estimate of $\mu$.)

If a statistic $T$ is used as an estimator for a population parameter $\theta$ and $E(T) = \theta$ then $T$ is an unbiased estimator for $\theta$.

It seems obvious that being unbiased is a desirable feature to have in an estimator, but not all estimators possess this property. In Example 4 you found two statistics based on samples of size 3 from a population of counters of which 60% had the number 0 and 40% had the number 1. The population mean $\mu$ was $\frac{2}{5}$ and the population mode was 0 (since 60% of the counters had 0 on them). The two statistics that you calculated were the sample mean $\bar{X}$ and the sample mode $M$. You could use either of them as estimators for $\mu$, the population mean, but you saw that $E(\bar{X}) = \mu$ but $E(M) \neq \mu$, so you would prefer to use the sample mean $\bar{X}$ rather than the sample mode $M$ as an estimator for $\mu$ in this case. How about an estimator for the population mode?
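Neither of the statistics that you calculated had the property of being unbiased, since $E(\bar{X}) = \mu = \frac{2}{5}$ and $E(M) = \frac{44}{125}$ whereas the population mode was 0. Intuitively you might prefer the estimator $M$ since it is, after all, a mode and is also slightly closer to the population mode. In this case you refer to $M$ as a biased estimator for the population mode. The bias is simply the expected value of the estimator minus the parameter of the population it is estimating.

If a statistic $T$ is used as an estimator for a population parameter $\theta$ then the bias $= E(T) - \theta$.

In this case the bias is $\frac{44}{125}$.

So far you have found an unbiased estimator for $\mu$, but how would you find an estimator for $\sigma^2$? Before answering this question you need to find $\mathrm{Var}(\bar{X})$.

Example 6
A random sample $X_1, X_2, \dots, X_n$ is taken from a population with $X \sim N(\mu, \sigma^2)$. Show that $\mathrm{Var}(\bar{X}) = \dfrac{\sigma^2}{n}$.

In Chapter 8 of book S1 an important property of variances was given:
$\mathrm{Var}(aX) = a^2\,\mathrm{Var}(X)$ ①
Also in Chapter 1 of this book you saw that
$\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$ if $X$ and $Y$ are independent ②
Now $\mathrm{Var}(\bar{X}) = \mathrm{Var}\left(\frac{1}{n}(X_1 + \dots + X_n)\right)$
$= \frac{1}{n^2}\,\mathrm{Var}(X_1 + \dots + X_n)$ by ①
$= \frac{1}{n^2}\left[\mathrm{Var}(X_1) + \dots + \mathrm{Var}(X_n)\right]$ by ②
$= \frac{1}{n^2}(\sigma^2 + \dots + \sigma^2) = \frac{n\sigma^2}{n^2}$
That is: $\mathrm{Var}(\bar{X}) = \dfrac{\sigma^2}{n}$
This result will be referred to again later in Section 3.3.

Example 7
Show that $S^2 = \dfrac{1}{n-1}\left(\sum X^2 - n\bar{X}^2\right)$ is an unbiased estimator for $\sigma^2$.

In order to find $E(S^2)$ you need to recall certain facts about expected values and variances. These are
$\sigma^2 = \mathrm{Var}(X) = E(X^2) - \mu^2$, so $E(X^2) = \sigma^2 + \mu^2$ ①
and $\mathrm{Var}(\bar{X}) = \dfrac{\sigma^2}{n}$ and $E(\bar{X}) = \mu$ (see Examples 5 and 6), so
$\dfrac{\sigma^2}{n} = E(\bar{X}^2) - \mu^2$ and $E(\bar{X}^2) = \dfrac{\sigma^2}{n} + \mu^2$ ②
Then, using $E(aX - bY) = aE(X) - bE(Y)$,
$E(S^2) = \dfrac{1}{n-1}\left[E\left(\sum X^2\right) - nE(\bar{X}^2)\right]$
But $E\left(\sum X^2\right) = \sum E(X^2) = nE(X^2)$, since each $X_i$ has the same distribution as $X$. So
$E(S^2) = \dfrac{1}{n-1}\left[nE(X^2) - nE(\bar{X}^2)\right]$
$= \dfrac{1}{n-1}\left[n(\sigma^2 + \mu^2) - n\left(\dfrac{\sigma^2}{n} + \mu^2\right)\right]$ by ① and ②
$= \dfrac{1}{n-1}\left[n\sigma^2 - \sigma^2\right] = \sigma^2$
That is: $E(S^2) = \sigma^2$, and so the statistic $S^2$ is an unbiased estimator of the population variance $\sigma^2$.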
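The result of Example 7 can be checked exactly for the counters population of Example 4 (60% marked 0, 40% marked 1) with samples of size 3. The sketch below is only an illustration, with hypothetical function names; it averages the statistic over all eight possible samples, once with divisor $n - 1$ and once with divisor $n$.

```python
from fractions import Fraction
from itertools import product

# Counters population from Example 4: P(X = 0) = 3/5, P(X = 1) = 2/5.
population = {0: Fraction(3, 5), 1: Fraction(2, 5)}
n = 3  # sample size


def spread_statistic(sample, divisor):
    """(sum of x^2 - n * xbar^2) / divisor for one sample."""
    size = len(sample)
    xbar = Fraction(sum(sample), size)
    return (sum(x * x for x in sample) - size * xbar ** 2) / divisor


def expected_value(statistic):
    """Exact expectation of a statistic over all ordered samples of size n."""
    total = Fraction(0)
    for sample in product(population, repeat=n):
        prob = Fraction(1)
        for x in sample:
            prob *= population[x]  # independent observations
        total += prob * statistic(sample)
    return total


# Population mean and variance, computed directly from the distribution.
mu = sum(x * p for x, p in population.items())
sigma_sq = sum(x * x * p for x, p in population.items()) - mu ** 2

e_s2_unbiased = expected_value(lambda s: spread_statistic(s, n - 1))
e_s2_biased = expected_value(lambda s: spread_statistic(s, n))

print("sigma^2 =", sigma_sq)
print("E(S^2), divisor n - 1 =", e_s2_unbiased)
print("E(S^2), divisor n     =", e_s2_biased)
```

With divisor $n - 1$ the expected value equals $\sigma^2 = \frac{6}{25}$; with divisor $n$ it is $\frac{4}{25}$, so that version underestimates $\sigma^2$ on average.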
It is because of this property of $S^2$ that we use $s^2$ to estimate $\sigma^2$ in calculations where a value of $\sigma^2$ is not known.

Sometimes the 'hat notation' is used to describe an estimator of a parameter and $\hat{\theta}$ represents an estimator of $\theta$. So you might use $\hat{\mu}$ to represent an estimator of $\mu$; usually you have $\hat{\mu} = \bar{x}$. Similarly you sometimes use $\hat{\sigma}^2$ to represent an estimator of $\sigma^2$. Following on from Example 7, you usually have $\hat{\sigma}^2 = s^2$.

Example 8
The table below summarises the number of breakdowns, $x$, on a town's bypass on 30 randomly chosen days.
Number of breakdowns | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Number of days | 3 | 5 | 4 | 3 | 5 | 4 | 4 | 2
Calculate unbiased estimates of the mean and variance of the number of breakdowns.

You can use your calculator to find these values but it is recommended that you show them in the appropriate formulae.
By calculator: $\sum x = 160$ and $\sum x^2 = 990$, so
$\hat{\mu} = \bar{x} = \dfrac{160}{30} = 5.33$
and $\hat{\sigma}^2 = s^2 = \dfrac{990 - 30\bar{x}^2}{29} = 4.7126\ldots = 4.71$ (3 s.f.)
The working shown here is recommended when answering questions of this type.

Example 9
The random variable $X$ has a continuous uniform distribution defined over the range $[0, a]$. A random sample $X_1, X_2, \dots, X_n$ is taken.
a Show that $\bar{X}$ is a biased estimator for $a$ and state the bias.
b Suggest a suitable unbiased estimator for $a$.

a Since $X \sim U[0, a]$, $\mu = E(X) = \dfrac{a}{2}$. (See the S2 part of the formula book, where the formulae for the mean and variance of a continuous uniform distribution are given.)
Since $E(\bar{X}) = \mu$ for any sample (see Example 5), $E(\bar{X}) = \dfrac{a}{2}$, so $\bar{X}$ is a biased estimator of $a$.
The bias $= E(\bar{X}) - a = \dfrac{a}{2} - a = -\dfrac{a}{2}$. (Use the definition of bias from page 28.)
b If $Y$ is an unbiased estimator of $a$ then $E(Y) = a$. (This is the definition of an unbiased estimator.)
Since $E(\bar{X}) = \dfrac{a}{2}$, a sensible statistic for $Y$ is $Y = 2\bar{X}$.

1 Find unbiased estimates of the mean and variance of the populations from which the following random samples have been taken.
a 21.3; 19.6; 18.5; 22.3; 17.4; 16.3; 18.9; 17.6; 18.7; 16.5; 19.3; 21.8; 20.1; 22.0
b 15255545 6 451; 3; 258 5; 6:2; 4; 31
c 120.4; 230.6; 356.1; 129.8; 185.6; 147.6; 258.3; 329.7; 249.3
d 0.862; 0.754; 0.459; 0.473; 0.493; 0.681; 0.743; 0.469; 0.538; 0.361
2
Find unbiased estimates of the mean and the variance of the populations from which random samples with the following summaries have been taken.
a $n = 120$, $\sum x = 4368$, $\sum x^2 = 162\,466$
b $n = 30$, $\sum x = 270$, $\sum x^2 = 2546$
c $n = 1037$, $\sum x = 11\,407$, $\sum x^2 = $ …
d $n = 15$, $\sum x = 168$, $\sum x^2 = 1913$
3 The concentrations, in mg per litre, of a trace element in 7 randomly chosen samples of water from a spring were:
240.8 237.3 236.7 236.6 23942 233.9 232.5
Determine unbiased estimates of the mean and the variance of the concentration of the trace element per litre of water from the spring.
4 Cartons of orange are filled by a machine. A sample of 10 cartons selected at random from the production contained the following quantities of orange (in ml).
201.2 205.0 209.1 202.3 204.6 206.4 210.1 201.9 203.7 207.3
Calculate unbiased estimates of the mean and variance of the population from which this sample was taken.
5 A manufacturer of self-assembly furniture required bolts of two lengths, 5 cm and 10 cm, in the ratio 2:1 respectively.
a Find the mean $\mu$ and the variance $\sigma^2$ for the lengths of bolts in this population.
A random sample of three bolts is selected from a large box containing bolts in the required ratio.
b List all possible samples.
c Find the sampling distribution for the mean $\bar{X}$.
d Hence find $E(\bar{X})$ and $\mathrm{Var}(\bar{X})$.
e Find the sampling distribution for the mode $M$.
f Hence find $E(M)$ and $\mathrm{Var}(M)$.
g Find the bias when $M$ is used as an estimator of the population mode.
6 A biased six-sided die has probability $p$ of landing on a six. Every day, for a period of 25 days, the die is rolled 10 times and the number of sixes $X$ is recorded, giving rise to a sample $X_1, X_2, \dots, X_{25}$.
a Write down $E(X)$ in terms of $p$.
b Show that the sample mean $\bar{X}$ is a biased estimator of $p$ and find the bias.
c Suggest a suitable unbiased estimator of $p$.
7 The random variable $X \sim U[-a, a]$.
a Find $E(X)$ and $E(X^2)$.
A random sample $X_1, X_2, X_3$ is taken and the statistic $Y = X_1^2 + X_2^2 + X_3^2$ is calculated.
b Show that $Y$ is an unbiased estimator of $a^2$.

3.3 You need to be able to use the standard error of the mean.
So far you have seen how to find unbiased estimators for $\mu$ and $\sigma^2$. A little thought will show that, for a sample $X_1, X_2, \dots, X_n$, $E(X_i) = \mu$ for every value of $i$. So why should you bother to calculate the mean $\bar{X}$ when any member of the sample has the same property, namely that it provides an unbiased estimator for the mean $\mu$? To answer this question you need to look back at Examples 5 and 6 where some important properties of the estimator $\bar{X}$ were found:
$E(\bar{X}) = \mu$ and $\mathrm{Var}(\bar{X}) = \dfrac{\sigma^2}{n}$

Notice that $\bar{X}$ is always an unbiased estimator of $\mu$ but also that, as the sample size $n$ increases, the variance of this estimator decreases. It is this property of $\mathrm{Var}(\bar{X})$ that makes $\bar{X}$ a useful estimator of $\mu$ and certainly a better estimator than $X_1$ or any other single member of the sample, since a smaller variance means that the values of any estimates should be closer to the required value $\mu$. (This principle is examined further in S4.)

The variance of an estimator is clearly a helpful guide about how useful a particular estimate may be. In calculations you often want to find the standard deviation of the estimator, and this is referred to as the standard error of the estimator. So if the estimator is the mean $\bar{X}$ then the standard error of the mean is $\dfrac{\sigma}{\sqrt{n}}$. Since, in practice, you often have to use $s$ instead of $\sigma$, the standard error of the mean is used to refer to either $\dfrac{\sigma}{\sqrt{n}}$ or, if $\sigma$ is not known, $\dfrac{s}{\sqrt{n}}$.

The standard error of the mean is $\dfrac{\sigma}{\sqrt{n}}$ or $\dfrac{s}{\sqrt{n}}$.

Example 10
(This is an extension of Example 8.)
The table below summarises the number of breakdowns, $x$, on a town's bypass on 30 randomly chosen days.
Number of breakdowns | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Number of days | 3 | 5 | 4 | 3 | 5 | 4 | 4 | 2
a Calculate unbiased estimates of the mean and variance of the number of breakdowns.
Twenty more days were randomly sampled and this sample had a mean of 6.0 breakdowns and $s^2 = 5.0$.
b Treating the 50 results as a single sample, obtain further unbiased estimates of the population mean and variance.
c Find the standard error of this new estimate of the mean.
d Estimate the size of sample required to achieve a standard error of less than 0.25.

a By calculator: $\sum x = 160$ and $\sum x^2 = 990$, giving $\hat{\mu} = \bar{x} = 5.33$ and $\hat{\sigma}^2 = s_x^2 = 4.71$. (These calculations were completed in Example 8.)
b First you need to use the formulae for $\bar{y}$ and $s_y^2$ to find $\sum y$ and $\sum y^2$ for the new sample.
$\bar{y} = 6.0 \Rightarrow \sum y = 20 \times 6.0 = 120$
$s_y^2 = 5.0 \Rightarrow \dfrac{\sum y^2 - 20 \times 6^2}{19} = 5$, so $\sum y^2 = 5 \times 19 + 20 \times 36 = 815$
Let the combined variable be $w$. Then
$\sum w = 160 + 120 = 280$ and $\sum w^2 = 990 + 815 = 1805$
The combined estimate of $\mu$ is $\bar{w} = \dfrac{280}{50} = 5.6$
and the estimate for $\sigma^2$ is $s_w^2 = \dfrac{1805 - 50 \times 5.6^2}{49} = 4.8367\ldots = 4.84$ (3 s.f.)
The best estimate of $\sigma^2$ will be $s_w^2$ since it is based on a larger sample than $s_x^2$ or $s_y^2$.
c Use the formula for the standard error:
$\dfrac{s_w}{\sqrt{50}} = \sqrt{\dfrac{4.8367\ldots}{50}} = 0.311$ (3 s.f.)
d To achieve a standard error less than 0.25 you require
$\sqrt{\dfrac{4.8367\ldots}{n}} < 0.25$, so $n > \dfrac{4.8367\ldots}{0.25^2} = 77.38\ldots$
(You do not know the value of $\sigma$, so you have to use your best estimate of it, namely $s_w$.)
So we need a sample of at least 78.

1 John and Mary each independently took a random sample of sixth-formers in their college and asked them how much money, in pounds, they earned last week. John used his sample of size 20 to obtain unbiased estimates of the mean and variance of the amount earned by a sixth-former at their college last week. He obtained values of $\bar{x} = 15.5$ and $s_x^2 = 8.0$. Mary's sample of size 30 can be summarised as $\sum y = 486$ and $\sum y^2 = 8222$.
a Use Mary's sample to find unbiased estimates of $\mu$ and $\sigma^2$.
b Combine the samples and use all 50 observations to obtain further unbiased estimates of $\mu$ and $\sigma^2$.
c Find the standard error of the mean for each of these estimates of $\mu$.
d Comment on which estimate of $\mu$ you would prefer to use.
2 A machine operator checks a random sample of 20 bottles from a production line in order to estimate the mean volume of bottles (in cm³) from this production run. The 20 values can be summarised as $\sum x = 1300$ and $\sum x^2 = 84\,685$.
a Use this sample to find unbiased estimates of $\mu$ and $\sigma^2$.
A supervisor knows from experience that the standard deviation of volumes on this process, $\sigma$, should be 3 cm³ and he wishes to have an estimate of $\mu$ that has a standard error of less than 0.5 cm³.
b What size sample will he need to achieve this?
The supervisor takes a further sample of size 16 and finds $\sum x = $ …
c Combine the two samples to obtain a revised estimate of $\mu$.
3 The heights of certain seedlings after growing for 10 weeks in a greenhouse have a standard deviation of 2.6 cm. Find the smallest sample that must be taken for the standard error of the mean to be less than 0.5 cm.
4 The hardness of a plastic compound was determined by measuring the indentation produced by a heavy pointed device. The following observations in tenths of a millimetre were obtained:
4.7, 5.2, 5.4, 4.8, 4.5, 4.9, 4.5, 5.1, 5.0, 4.8
a Estimate the mean indentation for this compound.
b Estimate the standard error of the mean.
c Estimate the size of sample required in order that in future the standard error of the mean should be just less than 0.05.
5 Prospective army recruits receive a medical test. The probability of each recruit passing the test is $p$, independent of any other recruit. The medicals are carried out over two days and on the first day $n$ recruits are seen and on the next day $2n$ are seen. Let $X_1$ be the number of recruits who pass the test on the first day and let $X_2$ be the number passing on the second day.
a Write down $E(X_1)$, $E(X_2)$, $\mathrm{Var}(X_1)$ and $\mathrm{Var}(X_2)$.
b Show that $\dfrac{X_1}{n}$ and $\dfrac{X_2}{2n}$ are both unbiased estimates of $p$ and state, giving a reason, which you would prefer to use.
c Show that $X = \dfrac{1}{2}\left(\dfrac{X_1}{n} + \dfrac{X_2}{2n}\right)$ is an unbiased estimator of $p$.
d Show that $Y = \dfrac{X_1 + X_2}{3n}$ is an unbiased estimator of $p$.
e Which of the statistics $\dfrac{X_1}{n}$, $\dfrac{X_2}{2n}$, $X$ or $Y$ is the best estimator of $p$?
The statistic $T = \dfrac{2X_1 + X_2}{3n}$ is proposed as an estimator of $p$.
f Find the bias.
6 Two independent random samples $X_1, X_2, \dots, X_n$ and $Y_1, Y_2, \dots, Y_m$ are taken from a population with mean $\mu$ and variance $\sigma^2$. The unbiased estimators $\bar{X}$ and $\bar{Y}$ of $\mu$ are calculated.
A new unbiased estimator $T$ of $\mu$ is sought of the form $T = r\bar{X} + s\bar{Y}$.
a Show that, since $T$ is unbiased, $r + s = 1$.
b By writing $T = r\bar{X} + (1 - r)\bar{Y}$, show that $\mathrm{Var}(T) = \sigma^2\left[\dfrac{r^2}{n} + \dfrac{(1 - r)^2}{m}\right]$.
c Show that the minimum variance of $T$ is when $r = \dfrac{n}{n + m}$.
d Find the best (in the sense of minimum variance) estimator of $\mu$ of the form $r\bar{X} + s\bar{Y}$.
7 A large bag of counters has 40% with the number 0 on, 40% with the number 2 on and 20% with the number 1.
a Find the mean $\mu$ and the variance $\sigma^2$ for this population of counters.
A random sample of size 3 is taken from the bag.
b List all possible samples.
c Find the sampling distribution for the mean $\bar{X}$.
d Find $E(\bar{X})$ and $\mathrm{Var}(\bar{X})$.
e Find the sampling distribution for the median $N$.
f Hence find $E(N)$ and $\mathrm{Var}(N)$.
g Show that $N$ is an unbiased estimator of $\mu$.
h Explain which estimator, $\bar{X}$ or $N$, you would choose as an estimator of $\mu$.

3.4 You need to be able to use the Central Limit Theorem to find an approximation to the sampling distribution of $\bar{X}$.

Example 11
A random sample $X_1, X_2, \dots, X_n$ is taken from a population where $X \sim N(\mu, \sigma^2)$. Show that $\bar{X} \sim N\left(\mu, \dfrac{\sigma^2}{n}\right)$.

$X_1 + X_2 \sim N(2\mu, 2\sigma^2)$ (Use the results from Chapter 1.)
$\sum X_i = X_1 + X_2 + \dots + X_n \sim N(n\mu, n\sigma^2)$ (Extend the above results.)
$\bar{X} = \dfrac{1}{n}\sum X_i$ and we have seen that $E(\bar{X}) = \mu$ and $\mathrm{Var}(\bar{X}) = \dfrac{\sigma^2}{n}$ (see Example 5 and Example 6).
So, since the population is normal, we know that $\bar{X}$ will be normal too and therefore $\bar{X} \sim N\left(\mu, \dfrac{\sigma^2}{n}\right)$.

Example 11 has shown that if the distribution of the population is known to be normal then the sampling distribution of $\bar{X}$ is normal too. However, in many cases the distribution of the population is not known or it is clearly not normal, so what will the distribution of $\bar{X}$ be when the population from which the sample was taken does not have a normal distribution? The answer, in general, is that it depends upon the distribution of the population and in most cases there is no easy way of describing the distribution of $\bar{X}$. However, there is an important result that enables you to say something about the distribution of $\bar{X}$ when the sample size $n$ is large.
This result is known as the Central Limit Theorem and it tells you that when $n$ is large $\bar{X}$ is approximately normally distributed, whether or not the population is normally distributed.

The Central Limit Theorem says that if $X_1, X_2, \dots, X_n$ is a random sample of size $n$ from a population with mean $\mu$ and variance $\sigma^2$ then $\bar{X}$ is approximately $\sim N\left(\mu, \dfrac{\sigma^2}{n}\right)$.

This theorem is very important in statistics and is one of the main reasons why the normal distribution is so useful. The theorem is an approximation but the approximation improves as $n$, the sample size, increases; this is another reason (remember $\mathrm{Var}(\bar{X})$ gets smaller as $n$ increases) why a large sample is often desirable. A proof of this theorem is beyond the scope of this course, but the following example should help you to see why it might be true.

Example 12
A table of random digits is designed so that the value, $R$, of a digit comes from a discrete uniform distribution over the set {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}.
a Find $\mu = E(R)$.
b Using the first row of the table on page 139 take a sample of size 10.
c Calculate the sample mean.

a By symmetry $E(R) = 4.5$. (Use $\frac{1}{2}(0 + 9)$.)
b The first 10 random digits are: …

Notice that the sample has some high (e.g. 8) and some low (e.g. 0) digits but that the high and low values tend to cancel each other out, so that the mean value for the sample is close to the mean for the population as a whole. It is therefore much more unlikely that you would get a mean $\bar{x}$ of 0 or 8 than a value close to $\mu$. It is this 'cancelling out' effect of taking a mean that might lead you to expect the distribution of $\bar{X}$ to peak close to $\mu$ and tail off at each end.

It is a worthwhile experiment to repeat this sampling of random numbers and obtain a large number of observations of $\bar{X}$. A histogram of these values can be plotted and a shape approximating to a normal distribution should result. This can be done on a calculator or a computer.

Example 13
A sample of size 9 is taken from a population with distribution $N(10, 2^2)$.
Find the probability that the sample mean $\bar{X}$ is more than 11.

The population is normal, so $\bar{X}$ will have a normal distribution despite the small size of the sample. The mean of $\bar{X}$ is $\mu$ (10) and the variance of $\bar{X}$ is $\dfrac{\sigma^2}{n} = \dfrac{2^2}{9}$, so
$\bar{X} \sim N\left(10, \left(\tfrac{2}{3}\right)^2\right)$
The mean of $\bar{X}$ is 10 and the standard deviation is $\frac{2}{3}$, so
$P(\bar{X} > 11) = P\left(Z > \dfrac{11 - 10}{\frac{2}{3}}\right) = P(Z > 1.5) = 1 - 0.9332 = 0.0668$

Example 14
A cubical die is relabelled so that there are three faces marked 1, two faces marked 3 and one marked 6. The die is rolled 40 times and the mean of the 40 scores is recorded. Find an approximation for the probability that the mean is over 3.

Let the random variable $X$ = the score on a single roll; then the distribution of $X$ is
$x$ | 1 | 3 | 6
$P(X = x)$ | $\frac{1}{2}$ | $\frac{1}{3}$ | $\frac{1}{6}$
(Use the techniques for finding means and variances of discrete distributions you met in book S1.)
$\mu = E(X) = 1 \times \frac{1}{2} + 3 \times \frac{1}{3} + 6 \times \frac{1}{6} = 2.5$
$\sigma^2 = 1 \times \frac{1}{2} + 9 \times \frac{1}{3} + 36 \times \frac{1}{6} - 2.5^2 = 3.25$
The population is clearly not normally distributed but the sample size ($n = 40$) is quite large, so the Central Limit Theorem can be used: $\bar{X}$ is approximately $\sim N\left(2.5, \dfrac{3.25}{40}\right)$. So
$P(\bar{X} > 3) = P\left(Z > \dfrac{3 - 2.5}{\sqrt{3.25/40}}\right) = P(Z > 1.75) = 1 - 0.9599 = 0.0401$ or 0.040 (3 d.p.)

It is worth pointing out that although the $X_i$ and therefore $\bar{X}$ are discrete distributions, whereas the normal distribution is a continuous distribution, a continuity correction is not appropriate in this example. However, if you had been asked to find a probability for $\sum X_i$, such as $P(\sum X_i > 120)$, then a continuity correction as described in book S2 could be applied.

1 A sample of size 6 is taken from a normal distribution $N(10, 2^2)$. What is the probability that the sample mean exceeds 12?
2 A machine fills cartons in such a way that the amount of drink in each carton is distributed normally with a mean of 40 cm³ and a standard deviation of 1.5 cm³.
a A sample of four cartons is examined. Find the probability that the mean amount of drink is more than 40.5 cm³.
b A sample of 49 cartons is examined. Find the probability that the mean amount of drink is more than 40.5 cm³ on this occasion.
3 The lengths of bolts produced by a machine have an unknown distribution with mean 3.03 cm and standard deviation 0.20 cm.
A sample of 100 bolts is taken.
a Estimate the probability that the mean length of this sample is less than 3 cm.
b What size sample is required if the probability that the mean is less than 3 cm is to be less than 1%?
4 Forty observations are taken from a population with distribution given by the probability density function
$f(x) = \begin{cases} \ldots, & 0 \le x \le 3, \\ 0, & \text{otherwise.} \end{cases}$
a Find the mean and variance of this population.
b Find an estimate of the probability that the mean of the 40 observations is more than 2.10.
5 A fair die is rolled 35 times.
a Find the approximate probability that the mean of the 35 scores is more than 4.
b Find the approximate probability that the total of the 35 scores is less than 100.
6 The 25 children in a class each roll a fair die 30 times and record the number of sixes they obtain. Find an estimate of the probability that the mean number of sixes recorded for the class is less than 4.5.
7 The error in mm made in measuring the length of a table has a uniform distribution over the range $[-5, 5]$. The table is measured 20 times. Find an estimate of the probability that the mean error is less than $-1$ mm.
8 Telephone calls arrive at an exchange at an average rate of two per minute. Over a period of 30 days a telephonist records the number of calls that arrive in the five-minute period before her break.
a Find an approximation for the probability that the total number of calls recorded is more than 380.
b Estimate the probability that the mean number of calls in the five-minute interval is less than 9.0.
9 How many times must a fair die be rolled in order for there to be a less than 1% chance that the mean of all the scores differs from 3.5 by more than 0.1?
10 The heights of women in a certain area have a mean of 175 cm and a standard deviation of 2.5 cm. The heights of men in the same area have a mean of 177 cm and a standard deviation of 2.0 cm.
Samples of 40 women and 50 men are taken and their heights are recorded. Find the probability that the mean height of the men is more than 3 cm greater than the mean height of the women.
11 A computer, in adding numbers, rounds each number off to the nearest integer. All the rounding errors are independent and come from a uniform distribution over the range $[-0.5, 0.5]$.
a Given that 1000 numbers are added, find the probability that the total error is greater than +10.
b Find how many numbers can be added together so that the probability that the magnitude of the total error is less than 10 is at least 0.95.
12 An electrical company repairs very large numbers of television sets and wishes to estimate the mean time taken to repair a particular fault. It is known from previous research that the standard deviation of the time taken to repair this particular fault is 2.8 minutes. The manager wishes to ensure that the probability that the estimate differs from the true mean by less than 30 seconds is 0.95. Find how large a sample is required.

3.5 You need to be able to calculate confidence intervals for a population parameter.

You are now in a position to complete the estimation of $\mu$, the population mean. In the previous sections we considered taking a random sample of students and measuring their heights. Now we shall assume that the standard deviation of heights of students, i.e. $\sigma$, is known but the mean $\mu$ (in metres) is not known and this is the parameter we seek to estimate. Suppose the sample gave an estimate $\hat{\mu} = 1.73$. What can you say about $\mu$?
You know that an estimate of $\mu$ is $\hat{\mu} = 1.73$, but it would be more helpful if you could give a range of values for $\mu$ and also provide some measure of how reliable this range of values is. People sometimes use phrases like 'I'm 90% (or 99% or 95%) certain that I left the keys on the kitchen table'. In statistics we use the properties of the standard normal distribution, $N(0, 1^2)$, to formalise this idea, and arrive at a range of values for $\mu$ about which we are, say, 95% confident.

Example
Show that a 95% confidence interval for $\mu$, based on a sample of size $n$, is given by $\bar{x} \pm 1.96 \times \dfrac{\sigma}{\sqrt{n}}$.

$\bar{X}$ is approximately $\sim N\!\left(\mu, \dfrac{\sigma^2}{n}\right)$ and therefore $Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1^2)$.
(Note: whatever the distribution of the population, you know from the Central Limit Theorem that $\bar{X}$ will be approximately normal.)

Using the table on page 130 you can see that for the $N(0, 1^2)$ distribution
$P(Z > 1.9600) = P(Z < -1.9600) = 0.025$
and so 95% of the distribution is between $-1.9600$ and $1.9600$.

So $P(-1.96 < Z < 1.96) = 0.95$, i.e. $P\!\left(-1.96 < \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} < 1.96\right) = 0.95$.

Look at the inequality inside the probability statement and rearrange it to isolate $\mu$. Multiplying through by $\sigma/\sqrt{n}$ gives
$-1.96 \times \dfrac{\sigma}{\sqrt{n}} < \bar{X} - \mu < 1.96 \times \dfrac{\sigma}{\sqrt{n}}$.
Subtracting $\bar{X}$ and then multiplying by $-1$ (which reverses the inequalities) gives
$\bar{X} + 1.96 \times \dfrac{\sigma}{\sqrt{n}} > \mu > \bar{X} - 1.96 \times \dfrac{\sigma}{\sqrt{n}}$.

The upper and lower values of a confidence interval are sometimes called the confidence limits.

In general we have the following formula. The 95% confidence interval for $\mu$ is
$\left(\bar{x} - 1.96 \times \dfrac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96 \times \dfrac{\sigma}{\sqrt{n}}\right)$.

You should notice that the 95% level of confidence gives rise to the value of 1.96 in the formula. So, again using the table on page 130, a 99% confidence interval would have the 1.96 replaced by 2.5758, so that a 99% confidence interval is given by $\bar{x} \pm 2.5758 \times \dfrac{\sigma}{\sqrt{n}}$.

Interpreting confidence intervals
• First, it is important to remember that $\mu$ is a fixed, but unknown, number and as such it cannot vary and does not have a distribution. (So it does not make sense to talk about the probability that $\mu$ is between certain values.)
• Secondly, it is worth remembering that we based the 95% confidence interval on a probability statement about the normal distribution $Z \sim N(0, 1^2)$.
• However, although you start by considering probabilities associated with the random variable $Z$, the final confidence interval does not tell you that the probability that $\mu$ lies inside the interval is 0.95. Rather, since $\mu$ is fixed, it is the confidence interval that varies (according to the value of $\bar{x}$).
• What a 95% confidence interval tells you is that the probability that the interval contains $\mu$ is 0.95.

[Diagram: 95% confidence intervals calculated from different samples, with the position of $\mu$ marked; one interval, marked *, does not contain $\mu$.]

The diagram illustrates the 95% confidence intervals calculated from different samples and also shows the position of $\mu$. Suppose 20 samples of size 100 were taken and 95% confidence intervals for $\mu$ were calculated for each sample. This would give 20 different confidence intervals, each based on one of the 20 different values of $\bar{x}$. If you imagine for a moment that you actually do know what the value of $\mu$ is, then you can plot each of these confidence intervals on a diagram similar to the one here. You would expect that 95% of these confidence intervals would contain the value $\mu$, but about once in every 20 times you would get an interval which did not contain $\mu$ (like the one marked * here). The problem for the statistician is that they never know whether the confidence interval they have just calculated is one that contains $\mu$ or not. However, 95% (or 90% or 99%, depending on the degree of confidence required) of the time they will be right!

Note: the choice of what level of confidence to use in a particular situation will depend on the problem involved, but a value of 95% is usual if no other value is specified.

Example
[The opening of this example is missing in the source.] … string in this batch.
The manufacturer becomes concerned if the lower 95% confidence limit falls below 5 kg. A sample of 80 reels from another batch gave a mean breaking strain of 5.31 kg.
b Will the manufacturer be concerned?

Note: the two end values in a confidence interval are called confidence limits.
Note: sometimes a question … to your calculations.

The distribution for breaking strains is not known, but the sample is quite large and by the Central Limit Theorem $\bar{X}$ will be approximately normally distributed.

a The 95% confidence interval (C.I.) is, using the $\bar{x} \pm 1.96 \times \dfrac{\sigma}{\sqrt{n}}$ formula,
$5.30 \pm 1.96 \times \dfrac{1.5}{\sqrt{100}} = (5.006,\ 5.594)$.

b The lower 95% confidence limit is
$5.31 - 1.96 \times \dfrac{1.5}{\sqrt{80}} = 4.98$,
so the manufacturer will be concerned.

We are sometimes interested in the width of a confidence interval. The width of a confidence interval is the difference between the upper confidence limit and the lower confidence limit. This is $2 \times z \times \dfrac{\sigma}{\sqrt{n}}$, where $z$ is the value from the tables, for example 1.96, 1.6449, etc.

There are three factors that affect the width: the value of $\sigma$, the size of the sample $n$ and the degree of confidence required. In a particular example where $\sigma$ and $n$ are determined, the only factor you can change to alter the width is the degree of confidence. A high level of confidence (e.g. 99%) will give a greater width than a lower level of confidence (e.g. 90%), and the statistician has to weigh up the advantages of high confidence against greater width when calculating a confidence interval.

Example
A random sample of size 25 is taken from a normal population with standard deviation of 2.5. The mean of the sample was 17.8.
a Find a 99% C.I. for the population mean $\mu$.
b What size sample is required to obtain a 99% C.I. of width of at most 1.5?
c What confidence level would be associated with the interval based on the above sample of 25 but of width 1.5, i.e. (17.05, 18.55)?

a The 99% C.I. is (use the table on page 130 to find 2.5758)
$\bar{x} \pm 2.5758 \times \dfrac{\sigma}{\sqrt{n}} = 17.8 \pm 2.5758 \times \dfrac{2.5}{\sqrt{25}} = (16.51,\ 19.09)$.

b The width of a 99% C.I. is $2 \times 2.5758 \times \dfrac{2.5}{\sqrt{n}}$, so you require
$1.5 > \dfrac{12.879}{\sqrt{n}}$, i.e. $n > 73.719\ldots$,
so you need $n = 74$.

c A width of 1.5 means $2 \times z \times \dfrac{2.5}{\sqrt{25}} = 1.5$, so $z = 1.5$. The confidence level is given by the area under the standard normal curve between $-1.5$ and $1.5$, which is $2 \times 0.9332 - 1 = 0.8664$, i.e. about 86.6%.

Exercise
1 A random sample of size 9 is taken from a normal distribution with variance 36. The sample mean is 128.
a Find a 95% confidence interval for the mean $\mu$
of the distribution.
b Find a 99% confidence interval for the mean $\mu$ of the distribution.

2 A random sample of size 25 is taken from a normal distribution with standard deviation 4. The sample mean is 85.
a Find a 90% confidence interval for the mean $\mu$ of the distribution.
b Find a 95% confidence interval for the mean $\mu$ of the distribution.

3 A normal distribution has mean $\mu$ and variance 4.41. A random sample has the following values (decimal points are garbled in the source): 23:1, 218, 946, 205. Use this sample to find 98% confidence limits for the mean $\mu$.

4 A normal distribution has standard deviation 15. Estimate the sample size required if the following confidence intervals for the mean should have width of less than 2.
a 90%
b 95%
c 99%

5 Repeat Question 4 for a normal distribution with standard deviation 2.4 and a desired width of less than 0.8.

6 An experienced poultry farmer knows that the mean weight $\mu$ kg for a large population of chickens will vary from season to season but the standard deviation of the weights should remain at 0.70 kg. A random sample of 100 chickens is taken from the population and the weight $x$ kg of each chicken in the sample is recorded, giving $\sum x = 190.2$. Find a 95% confidence interval for $\mu$.

7 A railway watchdog is studying the number of seconds that express trains are late in arriving. Previous surveys have shown that the standard deviation is 50. A random sample of 200 trains was selected and gave rise to a mean of 310 seconds late. Find a 90% confidence interval for the mean number of seconds that express trains are late.

8 An investigation was carried out into the total distance travelled by lorries in current use. The standard deviation can be assumed to be 15 000 km. A random sample of 80 lorries were stopped and their mean distance travelled was found to be 75 872 km. Find a 90% confidence interval for the mean distance travelled by lorries in current use.

9 It is known that each year the standard deviation of the marks in a certain examination is 13.5 but the mean mark $\mu$ will fluctuate.
An examiner wishes to estimate the mean mark of all the candidates on the examination but he only has the marks of a sample of 250 candidates, which give a sample mean of 68.4.
a What assumption about these candidates must the examiner make in order to use this sample mean to calculate a confidence interval for $\mu$?
b Assuming that the above assumption is justified, calculate a 95% confidence interval for $\mu$.
Later the examiner discovers that the actual value of $\mu$ was 65.3.
c What conclusions might the examiner draw about his sample?

10 The number of hours for which an electronic device can retain information has a uniform distribution over the range $[\mu - 10, \mu + 10]$ but the value of $\mu$ is not known.
a Show that the variance of the number of hours the device can retain the information for is $\tfrac{100}{3}$.
A random sample of 120 devices were tested and the mean number of hours they retained information for was 78.7.
b Find a 95% confidence interval for $\mu$.

11 A statistics student calculated a 95% and a 99% confidence interval for the mean $\mu$ of a certain population but failed to label them. The two intervals were (22.7, 27.3) and (23.2, 26.8).
a State, with a reason, which interval is the 95% one.
b Estimate the standard error of the mean in this case.
c What was the student's unbiased estimate of the mean in this case?

12 A 95% confidence interval for a mean $\mu$ is $85.3 \pm 2.35$. Find the following confidence intervals for $\mu$.
a 90%
b 98%
c 99%

13 The managing director of a certain firm has commissioned a survey to estimate the mean expenditure of customers on electrical appliances. A random sample of 100 people were questioned and the research team presented the managing director with a 95% confidence interval of (£128.14, £141.86). The director says that this interval is too wide and wants a confidence interval of total width £10.
a Using the same value of $\bar{x}$, find the confidence limits in this case.
b Find the level of confidence for the interval in part a.
The managing director is still not happy and now wishes to know how large a sample would be required to obtain a 95% confidence interval of total width no more than £10.
c Find the smallest size of sample that will satisfy this request.

14 A plant produces steel sheets whose weights are known to be normally distributed with a standard deviation of 2.4 kg. A random sample of 36 sheets had a mean weight of 31.4 kg. Find 99% confidence limits for the population mean.

15 A machine is regulated to dispense liquid into cartons in such a way that the amount of liquid dispensed on each occasion is normally distributed with a standard deviation of 20 ml. Find 99% confidence limits for the mean amount of liquid dispensed if a random sample of 40 cartons had an average content of 266 ml.

16 a The error made when a certain instrument is used to measure the body length of a butterfly of a particular species is known to be normally distributed with mean 0 and standard deviation 1 mm. Calculate, to 3 decimal places, the probability that the error made when the instrument is used once is numerically less than 0.4 mm.
b Given that the body length of a butterfly is measured 9 times with the instrument, calculate, to 3 decimal places, the probability that the mean of the 9 readings will be within 0.5 mm of the true length.
c Given that the mean of the 9 readings was 22.53 mm, determine a 98% confidence interval for the true body length of the butterfly.

3.6 You need to be able to test hypotheses about the mean of a normal distribution.

In book S2 you met the idea of a hypothesis test, and a definition is given below.

A hypothesis test about a population parameter $\theta$ tests a null hypothesis $H_0$, specifying a particular value for $\theta$, against an alternative hypothesis $H_1$, which will indicate whether the test is one-tailed or two-tailed.

In book S2 the parameters considered were the proportion $p$ of a binomial distribution and the mean $\lambda$ or $\mu$ of a Poisson distribution.
In this section you will learn how to extend the idea to tests for $\mu$, the mean of a normal distribution. The process is similar to that of a trial in a courtroom. The null hypothesis is on trial, evidence is presented and the jury has to make a decision 'on the balance of probability'.

Example
A certain company sells fruit juice in cartons. The amount of juice in a carton has a normal distribution with a standard deviation of 3 ml. The company claims that the mean amount of juice per carton, $\mu$, is 60 ml. A trading inspector has received complaints that the company is overstating the mean amount of juice per carton and he wishes to investigate this complaint. The trading inspector took a random sample of 16 cartons which gave a mean of 59.1 ml. Using a 5% level of significance, and stating your hypotheses clearly, test whether or not there is evidence to uphold this complaint.

Note: remember that $H_0$ must specify a particular value of $\mu$. The inspector will therefore assume that the company is innocent and will formulate a null hypothesis to express this idea in terms of the parameter $\mu$. If the company is guilty then $\mu$ must be less than 60 (there would be few complaints if the cartons contained on average more than 60 ml) and so the alternative hypothesis is $H_1: \mu < 60$. This means the test is one-tailed.

The hypotheses are
$H_0: \mu = 60$, $H_1: \mu < 60$.

The sample gives $n = 16$ and $\bar{x} = 59.1$. (This is like the 'evidence' presented at a trial.)

The inspector (like the jury in a trial) then has to calculate the probability of obtaining evidence as bad as or worse than this, assuming that the null hypothesis is true. The alternative hypothesis is that the company is deceiving customers and that $\mu < 60$; the inspector's sample gave a mean of 59.1, and so any value of the sample mean less than or equal to 59.1 will be as bad or worse.

You know that $\bar{X} \sim N\!\left(\mu, \dfrac{\sigma^2}{n}\right)$, so
$P(\bar{X} \le 59.1) = P\!\left(Z \le \dfrac{59.1 - 60}{3/\sqrt{16}}\right) = P(Z \le -1.2) = 1 - 0.8849 = 0.1151$.

$0.1151 > 0.05$, so the result is not significant and there is insufficient evidence to reject $H_0$, that $\mu = 60$.

The conclusion should incorporate two statements:
1 State whether or not the test is significant.
2 Interpret this in the context of the question.

There is no reason to suspect the validity of $H_0$. There is insufficient evidence to support the complaint.

Notice that the test was based on the statistic $Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$, and this is the test statistic in this case. (Recall that a statistic must not contain any unknown population parameters.)

The test statistic for a test of the population mean $\mu$ is $Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$, where the value of $\mu$ is given by the null hypothesis and $\sigma$ is given.

It is sometimes simpler to work with the value of the test statistic. The inspector in the example above would be concerned if $\bar{x}$ were too small. Under the hypothesis that $\mu = 60$, you know from the table on page 130 that
$P(Z < -1.6449) = 0.05$,
so any value of $z \le -1.6449$ would mean that the probability of obtaining a sample 'as bad or worse' is less than or equal to 5%, which is unlikely. This means that the assumption that $H_0$ is true is called into question and we reject $H_0$ at the 5% level of significance.

Note: if the null hypothesis is rejected this is like a 'guilty' verdict; … significance level then … the assumption of innocence is sustainable.

We call the region $Z \le -1.6449$ the critical region of the statistic $Z$, and the value $-1.6449$ is sometimes called the critical value.

The critical region of a test statistic $T$ is the range of values of $T$ such that if the value of $T$, namely $t$, obtained from your particular sample lies in this critical region then you reject the null hypothesis.
The boundary value(s) of the critical region is (are) called the critical value(s).

Note: in this case the critical values can be found so that the probability of lying in the critical region equals the significance level. In book S2 we were applying these ideas to discrete distributions and an exact match was usually not possible. The continuous nature of the normal distribution …

In S3, tests for $\mu$, the mean of a normal distribution, are best carried out using the test statistic $z$ and the critical region rather than calculating the probability.

Example
At a certain college new students are weighed when they join the college.
The distribution of weights of students at the college when they enrol has a standard deviation of 7.5 kg and a mean of 70 kg. A random sample of 90 students from the new entry were weighed and their mean weight was 71.6 kg. Assuming that the standard deviation has not changed,
a test, at the 5% level, whether there is evidence that the mean of the new entry is more than 70 kg,
b state the importance of the Central Limit Theorem to your test.

Note: the question does not say that the weights are normally distributed; the sample is large, so the Central Limit Theorem is needed, and that is the reason for part b.

a $H_0: \mu = 70$, $H_1: \mu > 70$, $\sigma = 7.5$.
($H_1$ is $\mu > 70$, so a one-tailed test is required.)
A 5% significance level is required, so the critical region for $Z$ is the upper tail of the standard normal distribution. From the table on page 130 this is $Z \ge 1.6449$.
The sample gives $n = 90$ and $\bar{x} = 71.6$, and these give a value of the test statistic of
$z = \dfrac{71.6 - 70}{7.5/\sqrt{90}} = 2.0239\ldots$
This value is in the critical region, so you reject $H_0$ and conclude that there is evidence that the new class have a higher mean weight. (Always give a conclusion in context.)

b The Central Limit Theorem is used to assume that $\bar{X}$ (which is the mean weight of the 90 students) is normally distributed.

Example
A machine produces bolts of diameter $D$, where $D$ has a normal distribution with mean 0.580 cm and standard deviation 0.015 cm. The machine is serviced, and after the service a random sample of 50 bolts from the next production run is taken to see if the mean diameter of the bolts has changed from 0.580 cm. The distribution of the diameters of bolts after the service is still normal with a standard deviation of 0.015 cm. The mean diameter of the 50 bolts is 0.577 cm.
a Stating your hypotheses clearly, test, at the 1% level, whether or not there is evidence that the mean diameter of the bolts has changed.
b Find the critical region for $\bar{X}$ in the above test.
Note: the word 'change' in the question suggests that the alternative hypothesis is $\mu \ne 0.580$, so a two-tailed test is needed.

a $H_0: \mu = 0.580$, $H_1: \mu \ne 0.580$, $\sigma = 0.015$. (There should be a value of $\sigma$ given in the question.)
A 1% significance test is required, so the critical region for $Z$ consists of both tails of the standard normal distribution. From the table on page 130 the critical region of $Z$ is $Z \le -2.5758$ or $Z \ge 2.5758$.
The sample gives $n = 50$ and $\bar{x} = 0.577$, so the value of the test statistic is
$z = \dfrac{0.577 - 0.580}{0.015/\sqrt{50}} = -1.4142\ldots$
This is not in the critical region, so you accept $H_0$ and conclude that there is no significant evidence that the mean diameter has changed. (Always give your conclusion in context: mention the mean diameter.)

b The critical region of $Z$ is $Z \le -2.5758$ or $Z \ge 2.5758$. Use the critical regions for $Z$ and the formula $z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$ to form critical regions for $\bar{X}$.
$\dfrac{\bar{X} - 0.580}{0.015/\sqrt{50}} = -2.5758$, i.e. $\bar{X} = 0.580 - 2.5758 \times \dfrac{0.015}{\sqrt{50}} = 0.5745\ldots$
$\dfrac{\bar{X} - 0.580}{0.015/\sqrt{50}} = 2.5758$, i.e. $\bar{X} = 0.580 + 2.5758 \times \dfrac{0.015}{\sqrt{50}} = 0.5854\ldots$
So the critical region for $\bar{X}$ is $\bar{X} \le 0.575$ or $\bar{X} \ge 0.585$.
Note that $\bar{x} = 0.577$ does not lie in the critical region.

Note: there is a similarity between a confidence interval and a critical region. In this example the critical region was $\bar{X} \le 0.575$ or $\bar{X} \ge 0.585$, found by calculating $\mu \pm 2.5758 \times \dfrac{\sigma}{\sqrt{n}}$ and taking the region outside. The 99% confidence interval is simply (0.572, 0.582), found by calculating $\bar{x} \pm 2.5758 \times \dfrac{\sigma}{\sqrt{n}}$ and taking the region inside. Notice that the critical region uses $\mu$ and the confidence interval uses $\bar{x}$.

The following four steps summarise the stages in answering questions about hypothesis tests for the mean $\mu$.
1 Identify the sample mean $\bar{x}$ and the value for the population mean $\mu$ given by the null hypothesis.
2 Write down the null ($H_0$) and alternative ($H_1$) hypotheses. The alternative hypothesis will determine whether you want a one-tailed or a two-tailed test.
3 Calculate the value of the test statistic $z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$.
4 Either using the critical region for $Z$, or by calculating a probability, complete the test and state your conclusions. The following points should be addressed.
a Is the result significant or not?
b What are the implications in terms of the context of the original problem?

Exercise
In each of Questions 1-5 a random sample of size $n$ is taken from a population having a normal distribution with mean $\mu$ and variance $\sigma^2$. Test the hypotheses at the stated levels of significance.
1 $H_0: \mu = 21$, $H_1: \mu \ne 21$, $n = 20$, $\bar{x} = 21.2$, $\sigma = 1.5$, at the 5% level
2 $H_0: \mu = 100$, $H_1: \mu < 100$, $n = 36$, $\bar{x} = 98.5$, $\sigma = 5.0$, at the 5% level
3 $H_0: \mu = 5$, $H_1: \mu \ne 5$, $n = 25$, $\bar{x} = 6.1$, $\sigma = 3.0$, at the 5% level
4 $H_0: \mu = 15$, $H_1: \mu > 15$, $n = 40$, $\bar{x} = 16.5$, $\sigma = 3.5$, at the 1% level
5 $H_0: \mu = 50$, $H_1: \mu \ne 50$, $n = 60$, $\bar{x} = 48.9$, $\sigma = 4.0$, at the 1% level

In each of Questions 6-10 a random sample of size $n$ is taken from a population having a $N(\mu, \sigma^2)$ distribution. Find the critical regions for the test statistic $\bar{X}$ in the following tests. (Several values are lost in the source and are shown as ….)
6 $H_0: \mu = 120$, $H_1: \mu < 120$, … 20, at the 5% level
7 $H_0: \mu = 125$, $H_1: \mu > 125$, …, at the 1% level
8 $H_0: \mu = 85$, $H_1: \mu < 85$, $n = 50$, $\sigma = 4.0$, at the 10% level
9 $H_0: \mu = 0$, $H_1: \mu \ne 0$, … 3.0, at the 5% level
10 $H_0$: …, $H_1$: … $-8$, … 2, at the 1% level

11 The times taken for a capful of stain remover to remove a standard chocolate stain from a baby's bib are normally distributed with a mean of 185 s and a standard deviation of 15 s. The manufacturers of the stain remover claim to have developed a new formula which will shorten the time taken for a stain to be removed. A random sample of 25 capfuls of the new formula are tested and the mean time for the sample is 179 s. Test, at the 5% level, whether or not there is evidence that the new formula is an improvement.

12 The IQ scores of a population are normally distributed with a mean of 100 and standard deviation of 15. A psychologist wishes to test the theory that eating chocolate before sitting an IQ test improves your score. A random sample of 100 people are selected and they are each given a 100 g bar of chocolate to eat before taking a standard IQ test. Their mean score on the test was 102.5.
Test the psychologist's theory at the 5% level.

13 The diameters of circular cardboard drinks mats produced by a certain machine are normally distributed with a mean of 9 cm and a standard deviation of 0.15 cm. After the machine is serviced a random sample of 30 mats is selected and their diameters are measured to see if the mean diameter has altered. The mean of the sample was 8.95 cm. Test, at the 5% level, whether there is significant evidence of a change in the mean diameter of mats produced by the machine.

14 a Research workers measured the body lengths, in mm, of 10 specimens of fish spawn of a certain species off the coast of Eastern Scotland and found these lengths to be (decimal points are garbled in the source)
125 102 M1 96 121 93 107 114 147 104
Obtain unbiased estimates for the mean and variance of the lengths of all such fish spawn off Eastern Scotland.
b Research shows that, for a very large number of specimens of spawn of this species off the coast of Wales, the mean body length is 10.2 mm. Assuming that the variance of the lengths of spawn off Eastern Scotland is 2.56, perform a significance test at the 5% level to decide whether the mean body length of fish spawn off the coast of Eastern Scotland is larger than that of fish spawn off the coast of Wales.

15 a Explain what you understand by the Central Limit Theorem.
b An electrical firm claims that the average lifetime of the bulbs it produces is 800 hours with a standard deviation of 42 hours. To test this claim a random sample of 120 bulbs was taken and these bulbs were found to have an average lifetime of 788 hours. Stating your hypotheses clearly and using a 5% level of significance, test the claim made by the electrical firm.

3.7 You need to be able to test hypotheses about the difference between means of two independent normal distributions.

If, instead of one population, you now have two independent populations, then you can test hypotheses about the differences in the population means.
In Chapter 1 you saw that if $X$ and $Y$ are two independent normal distributions with means $\mu_x$ and $\mu_y$ and standard deviations $\sigma_x$ and $\sigma_y$ respectively, then
$X - Y \sim N(\mu_x - \mu_y,\ \sigma_x^2 + \sigma_y^2)$.

Now if $\bar{X}$ and $\bar{Y}$ are sample means based on samples of size $n_x$ and $n_y$ respectively from the above two normal populations, then
$\bar{X} - \bar{Y} \sim N\!\left(\mu_x - \mu_y,\ \dfrac{\sigma_x^2}{n_x} + \dfrac{\sigma_y^2}{n_y}\right)$
and the statistic $\bar{X} - \bar{Y}$ can be used to test hypotheses about the values of $\mu_x$ and $\mu_y$.

The Central Limit Theorem tells you that, provided the sample sizes $n_x$ and $n_y$ are large, $\bar{X} - \bar{Y}$ will have an approximately normal distribution whatever the distributions of $X$ and $Y$. You can therefore use this to test if there is a significant difference between the means of any two populations. The usual null hypothesis is that the values of $\mu_x$ and $\mu_y$ are equal, but other situations are possible provided that the null hypothesis gives you a value for $\mu_x - \mu_y$.

The test statistic you will need to use is based upon the distribution of $\bar{X} - \bar{Y}$ and is
$Z = \dfrac{\bar{X} - \bar{Y} - (\mu_x - \mu_y)}{\sqrt{\dfrac{\sigma_x^2}{n_x} + \dfrac{\sigma_y^2}{n_y}}}$.
This is given in the fo…

Test for difference between two means
If $X \sim N(\mu_x, \sigma_x^2)$ and the independent random variable $Y \sim N(\mu_y, \sigma_y^2)$, then a test of the null hypothesis $H_0: \mu_x = \mu_y$ can be carried out using the test statistic
$Z = \dfrac{\bar{X} - \bar{Y} - (\mu_x - \mu_y)}{\sqrt{\dfrac{\sigma_x^2}{n_x} + \dfrac{\sigma_y^2}{n_y}}}$.
If the sample sizes $n_x$ and $n_y$ are large then the result can be extended, by the Central Limit Theorem, to include cases where the distributions of $X$ and $Y$ are not normal.

Example
The weights of boys and girls in a certain school are known to be normally distributed with standard deviations of 5 kg and 8 kg respectively.
A random sample of 25 boys had a mean weight of 48 kg and a random sample of 30 girls had a mean weight of 45 kg. Stating your hypotheses clearly, test, at the 5% level of significance, whether or not there is evidence that the mean weight of boys in the school is greater than the mean weight of the girls.

Note: the question you are asked is whether $\mu_{boy} > \mu_{girl}$. This will not give you a value for $\mu_{boy} - \mu_{girl}$, so choose as your null hypothesis $\mu_{boy} = \mu_{girl}$ (in other words, 'if the mean weights are the same, does the sample provide evidence to contradict this assumption?').

$H_0: \mu_{boy} = \mu_{girl}$, $H_1: \mu_{boy} > \mu_{girl}$.
$n_{boy} = 25$, $\sigma_{boy} = 5$, $n_{girl} = 30$, $\sigma_{girl} = 8$.
The test statistic is
$z = \dfrac{48 - 45 - 0}{\sqrt{\dfrac{25}{25} + \dfrac{64}{30}}} = 1.6947\ldots$
(Remember that from the null hypothesis you know $\mu_{boy} - \mu_{girl} = 0$.)
The 5% (one-tailed) critical value for $Z$ is $z = 1.6449$ (table on page 130), so this value is significant and you can reject $H_0$ and conclude that there is evidence that the mean weight of boys is greater than the mean weight of the girls. (Always quote the critical value from the tables in full and give your conclusion in context.)

Sometimes you may be asked to test, for example, whether or not the mean weight of the boys exceeds the mean weight of the girls by more than 2 kg. The test would be similar to the above, but the hypotheses will be slightly different and this will affect the test statistic.

Example
The weights of boys and girls in a certain school are known to be normally distributed with standard deviations of 5 kg and 8 kg respectively. A random sample of 25 boys had a mean weight of 48 kg and a random sample of 30 girls had a mean weight of 45 kg. Stating your hypotheses clearly, test, at the 5% level of significance, whether or not there is evidence that the mean weight of boys in the school is more than 2 kg greater than the mean weight of the girls.

$H_0: \mu_{boy} - \mu_{girl} = 2$, $H_1: \mu_{boy} - \mu_{girl} > 2$. (Notice that the null hypothesis still gives you a value for $\mu_{boy} - \mu_{girl}$.)
$n_{boy} = 25$, $\sigma_{boy} = 5$, $n_{girl} = 30$, $\sigma_{girl} = 8$.
The test statistic is
$z = \dfrac{48 - 45 - 2}{\sqrt{\dfrac{25}{25} + \dfrac{64}{30}}} = 0.565$
(Notice how the test statistic calculation has changed. The 2 comes from $\mu_{boy} - \mu_{girl} = 2$.)
The 5% (one-tailed) critical value for $Z$ is $z = 1.6449$ (table on page 130), so this value is not significant.
There is insufficient evidence that the mean weight of the boys is more than 2 kg more than the mean weight of the girls.

Example
A manufacturer of personal stereos can use batteries made by two different manufacturers. The standard deviation of lifetimes for Never Die batteries is 3.1 hours and for Everlasting batteries it is 2.9 hours. A random sample of 80 Never Die batteries and a random sample of 90 Everlasting batteries were tested and their mean lifetimes were 7.9 hours and 8.2 hours respectively. Stating your hypotheses clearly, test, at the 5% level of significance, whether there is evidence of a difference between the mean lifetimes of the two makes of batteries.

Let $\mu_x$ be the mean lifetime of Never Die batteries and let $\mu_y$ be the mean lifetime of Everlasting batteries.
$H_0: \mu_x = \mu_y$, $H_1: \mu_x \ne \mu_y$. (You are testing for a difference, so a two-tailed test is appropriate.)
$\sigma_x = 3.1$, $n_x = 80$, $\sigma_y = 2.9$ and $n_y = 90$.
$\bar{x} - \bar{y} = 7.9 - 8.2 = -0.3$
$z = \dfrac{-0.3 - 0}{\sqrt{\dfrac{3.1^2}{80} + \dfrac{2.9^2}{90}}} = -0.649$
(From the null hypothesis, you know that $\mu_x - \mu_y = 0$.)

Note: you are not told that the lifetimes of the batteries are normally distributed, but the sample sizes are both quite large and so by the Central Limit Theorem you can proceed with $\bar{X} - \bar{Y}$ approximately normally distributed.

The 5% (two-tailed) critical values for $Z$ are $z = \pm 1.9600$, so this value is not significant and you do not reject $H_0$. You can conclude that there is no significant evidence of a difference in the mean lifetimes of the two makes of battery.

Exercise
In Questions 1-3 carry out a test on the given hypotheses at the given level of significance. The populations from which the random samples are drawn are normally distributed. (Values lost in the source are shown as ….)
1 $H_0: \mu_1 = \mu_2$, $H_1: \mu_1$ … $\mu_2$, $n_1 = 18$, $\sigma_1 = 5.0$, $n_2 = 20$, $\sigma_2 =$ …, $\bar{x}_1 = 23.8$ and $\bar{x}_2 = 21.5$, using a 5% level
2 $H_0: \mu_1 = \mu_2$, $H_1: \mu_1 \ne \mu_2$, $n_1 = 30$, $\sigma_1 = 4.2$, $n_2 = 25$, $\sigma_2 =$ …, $\bar{x}_1 = 49.6$ and $\bar{x}_2 = 51.7$, using a 5% level
3 $H_0: \mu_1 = \mu_2$, $H_1: \mu_1$ … $\mu_2$, $n_1 = 25$, $\sigma_1 = 0.81$, $n_2 = 36$, $\sigma_2 = 0.75$, $\bar{x}_1 = 3.62$ and $\bar{x}_2 = 4.11$, using a 1% level

In Questions 4-6 carry out a test on the given hypotheses at the given level of significance. What is the significance of the Central Limit Theorem in these three questions?
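4 $H_0: \mu_1 = \mu_2$, $H_1: \mu_1$ … $\mu_2$, $n_1 = 85$, $\sigma_1 = 8.2$, $n_2 = 100$, $\sigma_2 = 11.3$, $\bar{x}_1 = 112.0$ and $\bar{x}_2 = 108.1$, using a 1% level
5 $H_0: \mu_1 = \mu_2$, $H_1: \mu_1$ … $\mu_2$, $n_1 = 100$, $\sigma_1 = 18.3$, $n_2 = 180$, $\sigma_2 = 15.4$, $\bar{x}_1 = 72.6$ and $\bar{x}_2 = 69.5$, using a 5% level
6 $H_0: \mu_1 = \mu_2$, $H_1: \mu_1 < \mu_2$, $n_1 = 120$, $\sigma_1 = 0.013$, $n_2 = 90$, $\sigma_2 = 0.015$, $\bar{x}_1 =$ …863 and $\bar{x}_2 =$ …868, using a 1% level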
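
The test for a difference between two means used in this section can be checked numerically. The following is a minimal Python sketch, not part of the original text: the function names `two_sample_z` and `critical_value` are hypothetical, and it assumes the population standard deviations are known, as in the worked examples. It recomputes the test statistics for the boys and girls examples and the batteries example, and obtains critical values from the standard normal distribution instead of the printed table.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(xbar, ybar, sigma_x, sigma_y, n_x, n_y, diff0=0.0):
    """Test statistic for H0: mu_x - mu_y = diff0 with known standard deviations.

    Hypothetical helper for illustration only.
    """
    standard_error = sqrt(sigma_x ** 2 / n_x + sigma_y ** 2 / n_y)
    return (xbar - ybar - diff0) / standard_error


def critical_value(alpha, two_tailed=False):
    """Upper critical value of N(0, 1) for significance level alpha."""
    tail = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail)


# Boys and girls, H0: mu_boy = mu_girl, one-tailed test at the 5% level.
z_equal = two_sample_z(48, 45, 5, 8, 25, 30)
# Same data, H0: mu_boy - mu_girl = 2.
z_shift = two_sample_z(48, 45, 5, 8, 25, 30, diff0=2)
# Batteries, H0: mu_x = mu_y, two-tailed test at the 5% level.
z_battery = two_sample_z(7.9, 8.2, 3.1, 2.9, 80, 90)

one_tailed = critical_value(0.05)                   # about 1.6449
two_tailed = critical_value(0.05, two_tailed=True)  # about 1.9600

print("boys vs girls:", round(z_equal, 4), "reject H0:", z_equal > one_tailed)
print("boys vs girls + 2 kg:", round(z_shift, 4), "reject H0:", z_shift > one_tailed)
print("batteries:", round(z_battery, 4), "reject H0:", abs(z_battery) > two_tailed)
```

Only the first test gives a value in the critical region, which agrees with the conclusions of the three worked examples. For large samples from non-normal populations the same calculation applies approximately, by the Central Limit Theorem.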
Binomial Theorem, Exponential and Logarithmic Series: TH TH
4 pages
S3 (Old) PDF
PDF
No ratings yet
S3 (Old) PDF
157 pages
2009 Linear Combination of R.V
PDF
100% (2)
2009 Linear Combination of R.V
6 pages
SS4 Linear Combinations of Random Variables Student Notes
PDF
No ratings yet
SS4 Linear Combinations of Random Variables Student Notes
16 pages
Normal Distribution Summary
PDF
No ratings yet
Normal Distribution Summary
4 pages
Linear Combinations of Random Variables Notes
PDF
No ratings yet
Linear Combinations of Random Variables Notes
18 pages
186 Halogenation in Organic Chemistry
PDF
100% (1)
186 Halogenation in Organic Chemistry
3 pages
173 The Chemistry of Breathalysers
PDF
100% (1)
173 The Chemistry of Breathalysers
3 pages
Econ A2 Model Essays 2
PDF
No ratings yet
Econ A2 Model Essays 2
54 pages