Combined QP (Reduced) - S1 Edexcel PDF
Paper Reference(s)
6683
Edexcel GCE
Statistics S1
(New Syllabus)
Advanced/Advanced Subsidiary
Friday 19 January 2001 Afternoon
Time: 1 hour 30 minutes
Materials required for examination Items included with question papers
Answer Book (AB16) Nil
Graph Paper (GP02)
Mathematical Formulae
Candidates may use any calculator EXCEPT those with the facility for symbolic
algebra, differentiation and/or integration. Thus candidates may NOT use calculators
such as the Texas Instruments TI 89, TI 92, Casio CFX 9970G, Hewlett Packard HP
48G.
Instructions to Candidates
In the boxes on the answer book, write the name of the examining body (Edexcel), your
centre number, candidate number, the unit title (Statistics S1), the paper reference (6683),
your surname, other name and signature.
Values from the statistical tables should be quoted in full. When a calculator is used, the
answer should be given to an appropriate degree of accuracy.
1. The students in a class were each asked to write down how many CDs they
owned. The student with the least number of CDs had 14 and all but one of the
others owned 60 or fewer. The remaining student owned 65. The quartiles for the
class were 30, 34 and 42 respectively.
Outliers are defined to be any values outside the limits of 1.5(Q3 – Q1) below the
lower quartile or above the upper quartile.
On graph paper draw a box plot to represent these data, indicating clearly any
outliers. (7 marks)
2. The random variable X is normally distributed with mean 177.0 and standard
deviation 6.4.
(a) Find P(166 < X < 185). (4 marks)
It is suggested that X might be a suitable random variable to model the height,
in cm, of adult males.
(b) Give two reasons why this is a sensible suggestion. (2 marks)
(c) Explain briefly why mathematical models can help to improve our
understanding of real-world problems. (2 marks)
3. A fair six-sided die is rolled. The random variable Y represents the score on the
uppermost face.
(a) Write down the probability function of Y. (2 marks)
(b) State the name of the distribution of Y. (1 mark)
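As a rough check on Questions 1 and 2(a) of this paper, the sketch below computes the outlier limits from the stated quartiles and the normal probability from the stated mean and standard deviation. The helper names `outlier_limits` and `normal_cdf` are illustrative and do not come from the paper. An exam answer would use the statistical tables, so the final decimal places may differ slightly.

```python
import math

def outlier_limits(q1, q3, k=1.5):
    """Return (lower, upper) limits lying k * IQR beyond each quartile."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def normal_cdf(x, mean, sd):
    """P(X <= x) for X ~ N(mean, sd^2), using the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

# Question 1: quartiles 30, 34, 42; smallest value 14, largest value 65.
lower, upper = outlier_limits(30, 42)
is_65_outlier = 65 > upper
is_14_outlier = 14 < lower

# Question 2(a): X ~ N(177.0, 6.4^2).
p = normal_cdf(185, 177.0, 6.4) - normal_cdf(166, 177.0, 6.4)

print(lower, upper, is_65_outlier, is_14_outlier, round(p, 4))
```

Under these figures only the value 65 lies outside the limits, so the upper whisker of the box plot would stop at the largest non-outlying value (at most 60).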
4. The employees of a company are classified as management, administration or
production. The following table shows the number employed in each category
and whether or not they live close to the company or some distance away.
Of the managers, 90% are married, as are 60% of the administrators and 80% of
the production employees.
(c) Construct a tree diagram containing all the probabilities. (3 marks)
(d) Find the probability that an employee chosen at random is married. (3 marks)
5. The following grouped frequency distribution summarises the number of
minutes, to the nearest minute, that a random sample of 200 motorists were
delayed by roadworks on a stretch of motorway.
(c) Use interpolation to estimate the median of this distribution. (2 marks)
(d) Calculate an estimate of the mean and an estimate of the standard deviation of
these data. (6 marks)
One coefficient of skewness is given by
(f) Explain why the normal distribution may not be suitable to model the number
of minutes that motorists are delayed by these roadworks. (2 marks)
were the operating time x (in thousands of hours) since last reconditioning and
the reconditioning cost y (in £1000). None of the incinerators had been used for
more than 3000 hours since last reconditioning.
The data are summarised below,
Σx = 25.0, Σx² = 65.68, Σy = 50.0, Σy² = 260.48, Σxy = 130.64.
(a) Find S_xx, S_xy, S_yy. (3 marks)
(b) Calculate the product moment correlation coefficient between x and y.
(3 marks)
(c) Explain why this value might support the fitting of a linear regression model
of the form y = a + bx. (1 mark)
(d) Find the values of a and b. (4 marks)
(e) Give an interpretation of a. (1 mark)
(f) Estimate
(ii) the financial effect of an increase of 1500 hours in operating time. (4 marks)
(g) Suggest why the authority might be cautious about making a prediction of the
reconditioning cost of an incinerator which had been operating for 4500 hours
since its last reconditioning. (2 marks)
END
6683
Edexcel GCE
Statistics S1
(New Syllabus)
Advanced/Advanced Subsidiary
Tuesday 12 June 2001 Afternoon
Time: 1 hour 30 minutes
Materials required for examination Items included with question papers
Answer Book (AB16) Nil
Graph Paper (ASG2)
Mathematical Formulae (Lilac)
Candidates may use any calculator EXCEPT those with the facility for symbolic
algebra, differentiation and/or integration. Thus candidates may NOT use calculators
such as the Texas Instruments TI 89, TI 92, Casio CFX 9970G, Hewlett Packard HP
48G.
Instructions to Candidates
In the boxes on the answer book, write the name of the examining body (Edexcel), your
centre number, candidate number, the unit title (Statistics S1), the paper reference (6683),
your surname, other name and signature.
Values from the statistical tables should be quoted in full. When a calculator is used, the
answer should be given to an appropriate degree of accuracy.
Information for Candidates
A booklet ‘Mathematical Formulae and Statistical Tables’ is provided.
Full marks may be obtained for answers to ALL questions.
This paper has seven questions. Pages 6, 7 and 8 are blank.
Advice to Candidates
You must ensure that your answers to parts of questions are clearly labelled.
You must show sufficient working to make your methods clear to the Examiner. Answers
without working may gain no credit.
1. Each of the 25 students on a computer course recorded the number of minutes x,
to the nearest minute, spent surfing the internet during a given day. The results
are summarised below.
Σx = 1075, Σx² = 46 625.
Σx = 7300, Σx² = 6 599 600, S_xy = 13 060, S_yy = 140.9.
(a) Find S_xx.
(2)
(b) Calculate, to 3 significant figures, the product moment correlation coefficient
between x and y.
(2)
(c) Give an interpretation of your coefficient.
(1)
4. The discrete random variable X has the probability function shown in the table
below.
x           –2    –1    0     1     2     3
P(X = x)    0.1   α     0.3   0.2   0.1   0.1
5. A market researcher asked 100 adults which of the three newspapers A, B, C they
read. The results showed that 30 read A, 26 read B, 21 read C, 5 read both A
and B, 7 read both B and C, 6 read both C and A and 2 read all three.
(a) Draw a Venn diagram to represent these data.
(6)
One of the adults is then selected at random.
Find the probability that she reads
6. Three swimmers Alan, Diane and Gopal record the number of lengths of the
swimming pool they swim during each practice session over several weeks. The
stem and leaf diagram below shows the results for Alan.
Lengths    2|0 means 20
2 | 0122           (4)
2 | 5567789        (7)
3 | 01224          (5)
3 | 56679          (5)
4 | 0133333444     (10)
4 | 556667788999   (12)
5 | 000            (3)
(a) Find the three quartiles for Alan’s results.
(4)
The table below summarises the results for Diane and Gopal.
                  Diane   Gopal
Smallest value    35      25
Lower quartile    37      34
Median            42      42
Upper quartile    53      50
Largest value     65      57
(b) Using the same scale and on the same sheet of graph paper draw box plots to
represent the data for Alan, Diane and Gopal.
(8)
(c) Compare and contrast the three box plots.
(4)
7. A music teacher monitored the sight-reading ability of one of her pupils over a
10 week period. At the end of each week, the pupil was given a new piece to
sight-read and the teacher noted the number of errors y. She also recorded the
number of hours x that the pupil had practised each week. The data are shown in
the table below.
x    12   15   7    11   1    8    4    6    9    3
y    8    4    13   8    18   12   15   14   12   16
(a) Plot these data on a scatter diagram.
(3)
(b) Find the equation of the regression line of y on x in the form y = a + bx.
(You may use Σx² = 746, Σxy = 749.)
(9)
(c) Give an interpretation of the slope and the intercept of your regression line.
(2)
(d) State whether or not you think the regression model is reasonable
(i) for the range of x-values given in the table,
(ii) for all possible x-values.
In each case justify your answer either by giving a reason for accepting the model
or by suggesting an alternative model.
(2)
END
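For Question 7(b), the music teacher's data, the least squares line can be checked numerically. This is a minimal sketch using the standard formulae b = S_xy / S_xx and a = ȳ – b x̄. The variable names are illustrative, and a written answer would normally round a and b to 3 significant figures.

```python
x = [12, 15, 7, 11, 1, 8, 4, 6, 9, 3]        # hours practised each week
y = [8, 4, 13, 8, 18, 12, 15, 14, 12, 16]    # errors made when sight-reading

n = len(x)
sum_x, sum_y = sum(x), sum(y)
sum_xx = sum(u * u for u in x)               # the paper quotes 746
sum_xy = sum(u * v for u, v in zip(x, y))    # the paper quotes 749

s_xx = sum_xx - sum_x ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

b = s_xy / s_xx                  # slope: change in errors per extra hour of practice
a = sum_y / n - b * sum_x / n    # intercept: predicted errors with no practice

print(f"y = {a:.2f} + ({b:.3f})x")
```

The negative slope is consistent with fewer errors after more practice. The intercept only has meaning near the range of x actually observed.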
N6988 This publication may only be reproduced in accordance with Edexcel copyright policy.
Edexcel Foundation is a registered charity. ©2002 Edexcel
4. A contractor bids for two building projects. He estimates that the probability of
winning the first project is 0.5, the probability of winning the second is 0.3 and
the probability of winning both projects is 0.2.
(a) Find the probability that he does not win either project.
(3)
(b) Find the probability that he wins exactly one project.
(2)
(c) Given that he does not win the first project, find the probability that he wins
the second.
(2)
(d) By calculation, determine whether or not winning the first contract and
winning the second contract are independent events.
(3)
5. The duration of the pregnancy of a certain breed of cow is normally distributed
with mean μ days and standard deviation σ days. Only 2.5% of all pregnancies
are shorter than 235 days and 15% are longer than 286 days.
(a) Show that μ – 235 = 1.96σ.
(2)
(b) Obtain a second equation in μ and σ.
(3)
(c) Find the value of μ and the value of σ.
(4)
(d) Find the values between which the middle 68.3% of pregnancies lie.
(2)
6. Hospital records show the number of babies born in a year. The number of babies
delivered by 15 male doctors is summarised by the stem and leaf diagram below.
Babies    (4|5 means 45)    Totals
0 |              (0)
1 | 9            (1)
2 | 1 6 7 7      (4)
3 | 2 2 3 4 8    (5)
4 | 5            (1)
5 | 1            (1)
6 | 0            (1)
7 |              (0)
8 | 6 7          (2)
(a) Find the median and inter-quartile range of these data.
(3)
(b) Given that there are no outliers, draw a box plot on graph paper to represent
these data. Start your scale at the origin.
(4)
(c) Calculate the mean and standard deviation of these data.
(5)
The records also contain the number of babies delivered by 10 female doctors.
34 30 20 15 6
32 26 19 11 4
(d) Using the same scale as in part (b) and on the same graph paper draw a box
plot for the data for the 10 female doctors.
(3)
(e) Compare and contrast the box plots for the data for male and female doctors.
(2)
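For Question 5, the pregnancy durations, the two percentage statements give two linear equations in the mean and standard deviation. The sketch below solves them with Python's `statistics.NormalDist`, assuming Python 3.8 or later. The variable names are illustrative, and table-based working with z = 1.96 and z = 1.0364 gives the same values to the accuracy expected.

```python
from statistics import NormalDist

std = NormalDist()              # standard normal, mean 0 and sd 1

z_low = std.inv_cdf(0.975)      # P(X < 235) = 0.025  gives  mu - 235 = z_low * sigma
z_high = std.inv_cdf(0.85)      # P(X > 286) = 0.15   gives  286 - mu = z_high * sigma

# Adding the two equations eliminates mu.
sigma = (286 - 235) / (z_low + z_high)
mu = 235 + z_low * sigma

# Part (d): the middle 68.3% lies within z_mid standard deviations of the mean.
z_mid = std.inv_cdf(0.5 + 0.683 / 2)
middle = (mu - z_mid * sigma, mu + z_mid * sigma)

print(round(mu, 1), round(sigma, 1), tuple(round(v, 1) for v in middle))
```

Since z_mid is very close to 1, the interval in part (d) is essentially one standard deviation either side of the mean.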
END
7. A number of people were asked to guess the calorific content of 10 foods. The
mean s of the guesses for each food and the true calorific content t are given in
the table below.
Food t s
Packet of biscuits 170 420
1 potato 90 160
1 apple 80 110
Crisp breads 10 70
Chocolate bar 260 360
1 slice white bread 75 135
1 slice brown bread 60 115
Portion of beef curry 270 350
Portion of rice pudding 165 390
Half a pint of milk 160 200
[You may assume that Σt = 1340, Σs = 2310, Σts = 396 775, Σt² = 246 050,
Σs² = 694 650.]
(d) Find the equation of the regression line of s on t excluding the values for rice
pudding and biscuits.
(3)
[You may now assume that S_ts = 72 587, S_tt = 63 671.875, t̄ = 125.625,
s̄ = 187.5.]
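For part (d), the regression line of s on t follows directly from the summary values quoted in the bracketed hint. This is a minimal sketch of that arithmetic, with illustrative variable names and the paper's values taken as given.

```python
# Summary values quoted in the paper (rice pudding and biscuits excluded).
s_ts = 72587.0
s_tt = 63671.875
t_bar = 125.625
s_bar = 187.5

b = s_ts / s_tt           # gradient of the regression line of s on t
a = s_bar - b * t_bar     # intercept: the line passes through (t_bar, s_bar)

print(f"s = {a:.1f} + {b:.2f}t")
```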
Paper Reference(s)
6683
Edexcel GCE
Statistics S1
Advanced/Advanced Subsidiary
Candidates may use any calculator EXCEPT those with the facility for symbolic
algebra, differentiation and/or integration. Thus candidates may NOT use calculators
such as the Texas Instruments TI 89, TI 92, Casio CFX 9970G, Hewlett Packard HP
48G.
Instructions to Candidates
In the boxes on the answer book, write the name of the examining body (Edexcel), your
centre number, candidate number, the unit title (Statistics S1), the paper reference (6683),
your surname, other name and signature.
Values from the statistical tables should be quoted in full. When a calculator is used, the
answer should be given to an appropriate degree of accuracy.
Information for Candidates
A booklet ‘Mathematical Formulae and Statistical Tables’ is provided.
Full marks may be obtained for answers to ALL questions.
This paper has seven questions. Pages 6, 7 and 8 are blank.
Advice to Candidates
You must ensure that your answers to parts of questions are clearly labelled.
You must show sufficient working to make your methods clear to the Examiner. Answers
without working may gain no credit.
1. An unbiased die has faces numbered 1 to 6 inclusive. The die is rolled and the
number that appears on the uppermost face is recorded.
(a) State the probability of not recording a 6 in one roll of the die.
(1)
The die is thrown until a 6 is recorded.
(b) Find the probability that a 6 occurs for the first time on the third roll of the die.
(3)
(a) explain in words the meaning of the term P(B|A),
(2)
(b) sketch a Venn diagram to illustrate the relationship P(B|A) = 0.
(2)
Three companies operate a bus service along a busy main road. Amber buses run
50% of the service and 2% of their buses are more than 5 minutes late. Blunder buses
run 30% of the service and 10% of their buses are more than 5 minutes late. Clipper
buses run the remainder of the service and only 1% of their buses run more than
5 minutes late.
Jean is waiting for a bus on the main road.
(c) Find the probability that the first bus to arrive is an Amber bus that is more than
5 minutes late. (2)
Let A, B and C denote the events that Jean catches an Amber bus, a Blunder bus and a
Clipper bus respectively. Let L denote the event that Jean catches a bus that is more
than 5 minutes late.
(d) Draw a Venn diagram to represent the events A, B, C and L. Calculate the
probabilities associated with each region and write them in the appropriate places on
the Venn diagram.
(4)
(e) Find the probability that Jean catches a bus that is more than 5 minutes late.
(2)
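The region probabilities in parts (c) to (e) of the bus question come from multiplying each company's share of the service by its lateness rate and then adding the three late regions. The sketch below does this with exact fractions. The dictionary names are illustrative and not part of the paper.

```python
from fractions import Fraction

# Share of the service run by each company; Clipper runs the remainder.
share = {"Amber": Fraction(50, 100), "Blunder": Fraction(30, 100)}
share["Clipper"] = 1 - sum(share.values())

# P(more than 5 minutes late | company).
late_given = {
    "Amber": Fraction(2, 100),
    "Blunder": Fraction(10, 100),
    "Clipper": Fraction(1, 100),
}

# Regions of the Venn diagram: company and late, company and not late.
late_and = {c: share[c] * late_given[c] for c in share}
not_late_and = {c: share[c] - late_and[c] for c in share}

# Part (e): total probability of catching a late bus.
p_late = sum(late_and.values())

for c in share:
    print(c, float(late_and[c]), float(not_late_and[c]))
print("P(L) =", float(p_late))
```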
N10636 This publication may only be reproduced in accordance with Edexcel copyright policy.
Edexcel Foundation is a registered charity. ©2002 Edexcel
4. A discrete random variable X takes only positive integer values. It has a cumulative
distribution function F(x) = P(X ≤ x) defined in the table below.
X       1     2     3      4     5     6     7      8
F(x)    0.1   0.2   0.25   0.4   0.5   0.6   0.75   1
(a) Determine the probability function, P(X = x), of X.
(3)
(b) Calculate E(X) and show that Var(X) = 5.76.
(6)
(c) Given that Y = 2X + 3, find the mean and variance of Y.
(3)
5. A random variable X has a normal distribution.
(a) Describe two features of the distribution of X.
(2)
A company produces electronic components which have life spans that are normally
distributed. Only 1% of the components have a life span less than 3500 hours and
2.5% have a life span greater than 5500 hours.
(b) Determine the mean and standard deviation of the life spans of the components.
(6)
The company gives warranty of 4000 hours on the components.
(c) Find the proportion of components that the company can expect to replace under
the warranty.
(4)
6. The labelling on bags of garden compost indicates that the bags weigh 20 kg. The
weights of a random sample of 50 bags are summarised in the table below.
Weight in kg    Frequency
14.6 – 14.8     1
14.8 – 18.0     0
18.0 – 18.5     5
18.5 – 20.0     6
20.0 – 20.2     22
20.2 – 20.4     15
20.4 – 21.0     1
(a) On graph paper, draw a histogram of these data.
(4)
(b) Using the coding y = 10(weight in kg – 14), find an estimate for the mean and
standard deviation of the weight of a bag of compost.
(6)
[Use Σfy² = 171 503.75]
(c) Using linear interpolation, estimate the median.
(2)
The company that produces the bags of compost wants to improve the accuracy of the
labelling. The company decides to put the average weight in kg on each bag.
(d) Write down which of these averages you would recommend the company to
use. Give a reason for your answer.
(2)
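For Question 4, on the cumulative distribution function, the probability function is obtained by differencing successive values of F(x). The mean, the variance and the results for Y = 2X + 3 then follow. The sketch below uses exact fractions so that Var(X) = 5.76 can be confirmed without rounding. The names are illustrative.

```python
from fractions import Fraction as F

xs = [1, 2, 3, 4, 5, 6, 7, 8]
cdf = [F("0.1"), F("0.2"), F("0.25"), F("0.4"),
       F("0.5"), F("0.6"), F("0.75"), F(1)]

# P(X = x) = F(x) - F(x - 1), with F(0) = 0.
pmf = [cdf[0]] + [cdf[i] - cdf[i - 1] for i in range(1, len(cdf))]

e_x = sum(x * p for x, p in zip(xs, pmf))
e_x2 = sum(x * x * p for x, p in zip(xs, pmf))
var_x = e_x2 - e_x ** 2

# Y = 2X + 3:  E(Y) = 2E(X) + 3  and  Var(Y) = 2^2 * Var(X).
e_y = 2 * e_x + 3
var_y = 4 * var_x

print([float(p) for p in pmf])
print(float(e_x), float(var_x), float(e_y), float(var_y))
```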
7. An ice cream seller believes that there is a relationship between the temperature on a
summer day and the number of ice creams sold. Over a period of 10 days he records
the temperature at 1 p.m., t °C, and the number of ice creams sold, c, in the next hour.
The data he collects is summarised in the table below.
t     c
13    24
22    55
17    35
20    45
10    20
15    30
19    39
12    19
18    36
23    54
(a) Calculate the value of the product moment correlation coefficient between t and c.
(7)
(b) State whether or not your value supports the use of a regression equation to
predict the number of ice creams sold. Give a reason for your answer. (2)
(c) Find the equation of the least squares regression line of c on t in the form
c = a + bt. (2)
(d) Interpret the value of b. (1)
(e) Estimate the number of ice creams sold between 1 p.m. and 2 p.m. when the
temperature at 1 p.m. is 16 °C.
(3)
(f) At 1 p.m. on a particular day, the highest temperature for 50 years was recorded.
Give a reason why you should not use the regression equation to predict ice cream
sales on that day.
(1)
END
Paper Reference(s)
6683
Edexcel GCE
Statistics S1
Advanced/Advanced Subsidiary
Tuesday 5 November 2002 Morning
Time: 1 hour 30 minutes
Materials required for examination Items included with question papers
Answer Book (AB16) Nil
Graph Paper (ASG2)
Mathematical Formulae (Lilac)
Candidates may use any calculator EXCEPT those with the facility for symbolic
algebra, differentiation and/or integration. Thus candidates may NOT use calculators
such as the Texas Instruments TI 89, TI 92, Casio CFX 9970G, Hewlett Packard
HP 48G.
Instructions to Candidates
In the boxes on the answer book, write the name of the examining body (Edexcel), your
centre number, candidate number, the unit title (Statistics S1), the paper reference (6683),
your surname, other name and signature.
Values from the statistical tables should be quoted in full. When a calculator is used, the
answer should be given to an appropriate degree of accuracy.
Information for Candidates
A booklet ‘Mathematical Formulae and Statistical Tables’ is provided.
Full marks may be obtained for answers to ALL questions.
This paper has seven questions. Pages 6, 7 and 8 are blank.
Advice to Candidates
You must ensure that your answers to parts of questions are clearly labelled.
You must show sufficient working to make your methods clear to the Examiner. Answers
without working may gain no credit.
N13318A This publication may only be reproduced in accordance with Edexcel copyright policy.
Edexcel Foundation is a registered charity. ©2002 Edexcel
1. (a) Explain briefly why statistical models are used when attempting to solve real-world
problems.
(2)
(b) Write down the name of the distribution you would recommend as a suitable model for each
of the following situations.
(i) The weight of marmalade in a jar.
(ii) The number on the uppermost face of a fair die after it has been rolled.
(2)
2. There are 125 sixth-form students in a college, of whom 60 are studying only arts subjects, 40
only science subjects and the rest a mixture of both.
(a) all three students are studying only arts subjects,
(4)
(b) exactly one of the three students is studying only science subjects.
(3)
3. The events A and B are independent such that P(A) = 0.25 and P(B) = 0.30.
Find
(a) P(A ∩ B),
(2)
(b) P(A ∪ B),
(2)
(c) P(A | Bᶜ).
(4)
4. Strips of metal are cut to length L cm, where L ~ N(μ, 0.5²).
(a) Given that 2.5% of the cut lengths exceed 50.98 cm, show that μ = 50.
(5)
(b) Find P(49.25 < L < 50.75).
(4)
Those strips with length either less than 49.25 cm or greater than 50.75 cm cannot be used.
Two strips of metal are selected at random.
(c) Find the probability that both strips cannot be used.
(2)
The data were coded using s = x – 6 and t = y – 20 and the following summations were obtained.
Σs = 48.5, Σt = 65.0, Σs² = 402.11, Σt² = 701.80, Σst = 523.23
(a) Find the equation of the regression line of t on s in the form t = p + qs.
(7)
(b) Find the equation of the regression line of y on x in the form y = a + bx, giving a and b to
3 decimal places.
(3)
The value of the product moment correlation coefficient between s and t is 0.943, to 3 decimal
places.
(c) Write down the value of the product moment correlation coefficient between x and y. Give a
justification for your answer.
(2)
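For Question 4, the strips of metal, the three parts can be checked with `statistics.NormalDist`, assuming Python 3.8 or later. This is a sketch only. It takes the standard deviation as 0.5 cm, as in the question, and the variable names are illustrative.

```python
from statistics import NormalDist

sd = 0.5

# Part (a): P(L > 50.98) = 0.025, so 50.98 = mu + z * sd with z about 1.96.
z = NormalDist().inv_cdf(0.975)
mu = 50.98 - z * sd

# Part (b): probability that a strip is usable, taking mu = 50.
length = NormalDist(mu=50, sigma=sd)
p_usable = length.cdf(50.75) - length.cdf(49.25)

# Part (c): two strips chosen independently, neither usable.
p_both_unusable = (1 - p_usable) ** 2

print(round(mu, 2), round(p_usable, 4), round(p_both_unusable, 4))
```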
(a) Given that E(X) = 0.2, find the value of α and the value of β.
(6)
(b) Write down F(0.8).
(1)
(a) Evaluate Var(X).
(4)
Find the value of
(d) E(3X – 2),
(2)
3 | 1 2 9                          (3)
4 | 2 4 6 8 9                      (5)
5 | 1 3 3 5 6 7 9                  (7)
6 | 0 1 3 3 3 5 6 8 8 9            (10)
7 | 1 2 2 2 4 5 5 5 6 8 8 8 8 9    (14)
8 | 0 1 2 3 5 8 8 9                (8)
9 | 0 1 2                          (3)
(d) Calculate, to 2 decimal places, the mean and the standard deviation for these data.
(3)
(e) Use two different methods to show that these data are negatively skewed.
(4)
END
Paper Reference(s)
6683
Edexcel GCE
Candidates may use any calculator EXCEPT those with the facility for symbolic
algebra, differentiation and/or integration. Thus candidates may NOT use calculators
such as the Texas Instruments TI 89, TI 92, Casio CFX 9970G, Hewlett Packard HP
48G.
Information for Candidates
A booklet ‘Mathematical Formulae and Statistical Tables’ is provided.
Full marks may be obtained for answers to ALL questions.
This paper has seven questions. Pages 6, 7 and 8 are blank.
Advice to Candidates
You must ensure that your answers to parts of questions are clearly labelled.
You must show sufficient working to make your methods clear to the Examiner. Answers
without working may gain no credit.
1. The total amount of time a secretary spent on the telephone in a working day was
recorded to the nearest minute. The data collected over 40 days are summarised in the
table below.
Time (mins)   90–139   140–149   150–159   160–169   170–179   180–229
Zippy   35   15
Nifty   40   10
One of the purchasers is chosen at random. Let A be the event that no claim is made
by the purchaser under the warranty and B the event that the car purchased is a Nifty.
(c) find the probability that the car purchased is a Zippy.
(2)
(d) Show that making a claim is not independent of the make of the car purchased.
Comment on this result.
(3)
N10623A This publication may only be reproduced in accordance with Edexcel copyright policy.
Edexcel Foundation is a registered charity. ©2003 Edexcel
3. A drinks machine dispenses coffee into cups. A sign on the machine indicates that
each cup contains 50 ml of coffee. The machine actually dispenses a mean amount of
55 ml per cup and 10% of the cups contain less than the amount stated on the sign.
Assuming that the amount of coffee dispensed into each cup is normally distributed
find
(a) the standard deviation of the amount of coffee dispensed per cup in ml,
(4)
(b) the percentage of cups that contain more than 61 ml.
(3)
Following complaints, the owners of the machine make adjustments. Only 2.5% of
cups now contain less than 50 ml. The standard deviation of the amount dispensed is
reduced to 3 ml.
Assuming that the amount of coffee dispensed is still normally distributed,
(c) find the new mean amount of coffee per cup.
(4)
5. The discrete random variable X has probability function
P(X = x) = k(2 – x),   x = 0, 1, 2,
           k(x – 2),   x = 3,
           0,          otherwise,
where k is a positive constant.
(a) Show that k = 0.25.
(2)
(b) Find E(X) and show that E(X²) = 2.5.
(4)
(c) Find Var(3X – 2).
(3)
Two independent observations X₁ and X₂ are made of X.
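For Question 5, the piecewise probability function, the stated results k = 0.25 and E(X²) = 2.5 can be confirmed by normalising the weights and summing. The sketch below uses exact fractions. The helper `weight` is a hypothetical name for P(X = x) / k.

```python
from fractions import Fraction

def weight(x):
    """Return P(X = x) / k according to the piecewise definition."""
    if x in (0, 1, 2):
        return 2 - x
    if x == 3:
        return x - 2
    return 0

support = [0, 1, 2, 3]

# Part (a): the probabilities must sum to 1, which fixes k.
k = Fraction(1, sum(weight(x) for x in support))
pmf = {x: k * weight(x) for x in support}

# Part (b): first and second moments.
e_x = sum(x * p for x, p in pmf.items())
e_x2 = sum(x * x * p for x, p in pmf.items())

# Part (c): Var(aX + b) = a^2 * Var(X).
var_x = e_x2 - e_x ** 2
var_3x_minus_2 = 9 * var_x

print(k, e_x, e_x2, var_x, var_3x_minus_2)
```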
(a) Find the median and inter-quartile range of the waiting times.
(5)
An outlier is an observation that falls either 1.5 × (inter-quartile range) above the
upper quartile or 1.5 × (inter-quartile range) below the lower quartile.
(b) Draw a boxplot to represent these data, clearly indicating any outliers.
(7)
(c) Find the mean of these data.
(2)
(d) Comment on the skewness of these data. Justify your answer.
(2)
6. The chief executive of Rex cars wants to investigate the relationship between the
number of new car sales and the amount of money spent on advertising. She collects
data from company records on the number of new car sales, c, and the cost of
advertising each year, p (£000). The data are shown in the table below.
Year    Number of new car sales, c    Cost of advertising (£000), p
1990    4240                          120
1991    4380                          126
1992    4420                          132
1993    4440                          134
1994    4430                          137
1995    4520                          144
1996    4590                          148
(a) Using the coding x = (p – 100) and y = (1/10)(c – 4000), draw a scatter diagram to
represent these data. Explain why x is the explanatory variable.
(5)
(b) Find the equation of the least squares regression line of y on x.
[Use Σx = 402, Σy = 517, Σx² = 17 538 and Σxy = 22 611.]
(7)
(c) Deduce the equation of the least squares regression line of c on p in the form
c = a + bp.
(3)
(d) Interpret the value of a.
(2)
(e) Predict the number of extra new car sales for an increase of £2000 in advertising
budget. Comment on the validity of your answer.
(2)
END
Paper Reference(s)
6683
Edexcel GCE
Statistics S1
Advanced/Advanced Subsidiary
Thursday 5 June 2003 Morning
Time: 1 hour 30 minutes
Materials required for examination Items included with question papers
Answer Book (AB16) Nil
Graph Paper (ASG2)
Mathematical Formulae (Lilac)
Instructions to Candidates
In the boxes on the answer book, write the name of the examining body (Edexcel), your centre
number, candidate number, the unit title (Statistics S1), the paper reference (6683), your
surname, other name and signature.
Values from the statistical tables should be quoted in full. When a calculator is used, the answer
should be given to an appropriate degree of accuracy.
Information for Candidates
A booklet ‘Mathematical Formulae and Statistical Tables’ is provided.
Full marks may be obtained for answers to ALL questions.
This paper has seven questions.
Advice to Candidates
You must ensure that your answers to parts of questions are clearly labelled.
You must show sufficient working to make your methods clear to the Examiner. Answers
without working may gain no credit.
N13348A This publication may only be reproduced in accordance with Edexcel copyright policy.
Edexcel Foundation is a registered charity. ©2003 Edexcel
1. In a particular week, a dentist treats 100 patients. The length of time, to the nearest minute, for
each patient’s treatment is summarised in the table below.
Time (minutes)        4 – 7   8    9 – 10   11   12 – 16   17 – 20
Number of patients    12      20   18       22   15        13
3. A company owns two petrol stations P and Q along a main road. Total daily sales in the same
week for P (£p) and for Q (£q) are summarised in the table below.
             p       q
Monday       4760    5380
Tuesday      5395    4460
Wednesday    5840    4640
Thursday     4650    5450
4. The discrete random variable X has probability function
P(X = x) = k(x² – 9),   x = 4, 5, 6,
           0,           otherwise,
where k is a positive constant.
6. The number of bags of potato crisps sold per day in a bar was recorded over a two-week period.
The results are shown below.
20, 15, 10, 30, 33, 40, 5, 11, 13, 20, 25, 42, 31, 17
(a) Calculate the mean of these data.
(2)
Three fair dice are thrown and the numbers on the uppermost faces are recorded.
(d) Write down all the different ways of scoring a total of 16 when the three numbers are added
together.
(4)
(e) Find the probability of scoring a total of 16.
(2)
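Parts (d) and (e) on the three dice can be checked by enumerating all 6³ equally likely outcomes. This is a brute-force sketch in which the three dice are treated as distinguishable, so (4, 6, 6) and (6, 4, 6) count as different outcomes. The names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Every ordered outcome for three fair dice.
rolls = list(product(range(1, 7), repeat=3))

# Ordered outcomes whose faces total 16.
hits = [r for r in rolls if sum(r) == 16]

# The same outcomes with order ignored, to show the underlying combinations.
combos = sorted({tuple(sorted(r)) for r in hits})

p16 = Fraction(len(hits), len(rolls))

print(hits)
print(combos, p16)
```

Listing the ordered outcomes rather than the unordered combinations is what makes each listed case equally likely, which is why the probability is the count of `hits` divided by 216.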
7. Eight students took tests in mathematics and physics. The marks for each student are given in the
table below where m represents the mathematics mark and p the physics mark.
Student    A    B    C    D    E    F    G    H
Mark  m    9    14   13   10   7    8    20   17
      p    11   23   21   15   19   10   31   26
A science teacher believes that students’ marks in physics depend upon their mathematical
ability. The teacher decides to investigate this relationship using the test marks.
(a) Write down which is the explanatory variable in this investigation.
(1)
(b) Draw a scatter diagram to illustrate these data.
(3)
(c) Showing your working, find the equation of the regression line of p on m.
(8)
(d) Draw the regression line on your scatter diagram.
(2)
A ninth student was absent for the physics test, but she sat the mathematics test and scored 15.
(e) Using this model, estimate the mark she would have scored in the physics test.
(2)
END
Paper Reference(s)
6683
Edexcel GCE
Statistics S1
Advanced/Advanced Subsidiary
Tuesday 4 November 2003 Morning
Time: 1 hour 30 minutes
Materials required for examination Items included with question papers
Answer Book (AB16) Nil
Graph Paper (ASG2)
Mathematical Formulae (Lilac)
Candidates may use any calculator EXCEPT those with the facility
for symbolic algebra, differentiation and/or integration. Thus
candidates may NOT use calculators such as the Texas Instruments
TI 89, TI 92, Casio CFX 9970G, Hewlett Packard HP 48G.
Instructions to Candidates
In the boxes on the answer book, write the name of the examining body (Edexcel), your centre
number, candidate number, the unit title (Statistics S1), the paper reference (6683), your
surname, other name and signature.
Values from the statistical tables should be quoted in full. When a calculator is used, the answer
should be given to an appropriate degree of accuracy.
Information for Candidates
A booklet ‘Mathematical Formulae and Statistical Tables’ is provided.
Full marks may be obtained for answers to ALL questions.
This paper has six questions.
Advice to Candidates
You must ensure that your answers to parts of questions are clearly labelled.
You must show sufficient working to make your methods clear to the Examiner. Answers
without working may gain no credit.
N16611A This publication may only be reproduced in accordance with Edexcel copyright policy.
Edexcel Foundation is a registered charity. ©2003 Edexcel
1. A company wants to pay its employees according to their performance at work. The performance
score x and the annual salary, y in £100s, for a random sample of 10 of its employees for last
year were recorded. The results are shown in the table below.
x    15    40    27    39    27    15    20    30    19    24
y    216   384   234   399   226   132   175   316   187   196
[You may assume Σxy = 69 798, Σx² = 7 266]
(a) Draw a scatter diagram to represent these data.
(4)
(b) Calculate exact values of S_xy and S_xx.
(4)
(c) (i) Calculate the equation of the regression line of y on x, in the form y = a + bx.
Give the values of a and b to 3 significant figures.
(ii) Draw this line on your scatter diagram.
(5)
(d) Interpret the gradient of the regression line.
(1)
The company decides to use this regression model to determine future salaries.
(e) Find the proposed annual salary for an employee who has a performance score of 35.
(2)
2. A fairground game involves trying to hit a moving target with a gunshot. A round consists of up
to 3 shots. Ten points are scored if a player hits the target, but the round is over if the player
misses. Linda has a constant probability of 0.6 of hitting the target and shots are independent of
one another.
(a) Find the probability that Linda scores 30 points in a round.
(2)
The random variable X is the number of points Linda scores in a round.
(b) Find the probability distribution of X.
(5)
(c) Find the mean and the standard deviation of X.
(5)
A game consists of 2 rounds.
(d) Find the probability that Linda scores more points in round 2 than in round 1.
(6)
3. Cooking sauces are sold in jars containing a stated weight of 500 g of sauce. The jars are filled by
a machine. The actual weight of sauce in each jar is normally distributed with mean 505 g and
standard deviation 10 g.
(a) (i) Find the probability of a jar containing less than the stated weight.
(ii) In a box of 30 jars, find the expected number of jars containing less than the stated
weight.
(5)
The mean weight of sauce is changed so that 1% of the jars contain less than the stated weight.
The standard deviation stays the same.
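For Question 2, the fairground game, the distribution of Linda's score can be built directly from the rule that a round stops at the first miss. The sketch below then evaluates the mean, the standard deviation and the two-round comparison in part (d). It assumes the two rounds are independent, and the variable names are illustrative.

```python
import math

p_hit = 0.6
p_miss = 1 - p_hit

# Score in one round: 10 points per hit, stopping at the first miss (3 shots at most).
dist = {
    0: p_miss,                  # miss the first shot
    10: p_hit * p_miss,         # hit, then miss
    20: p_hit ** 2 * p_miss,    # hit, hit, then miss
    30: p_hit ** 3,             # three hits
}

mean = sum(x * p for x, p in dist.items())
variance = sum(x * x * p for x, p in dist.items()) - mean ** 2
sd = math.sqrt(variance)

# Part (d): sum P(round 1 = a) * P(round 2 = b) over all pairs with b > a.
p_round2_higher = sum(dist[a] * dist[b] for a in dist for b in dist if b > a)

print(dist)
print(round(mean, 2), round(sd, 2), round(p_round2_higher, 4))
```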
4. Explain what you understand by 6. A travel agent sells holidays from his shop. The price, in £, of 15 holidays sold on a particular
day are shown below.
(a) a sample space,
(1) 299 1050 2315 999 485
(b) an event. 350 169 1015 650 830
(1)
99 2100 689 550 475
1 1
Two events A and B are independent, such that P(A) = and P(B) = .
3 4 For these data, find
Find (a) the mean and the standard deviation,
(3)
(c) P(A B),
(1) (b) the median and the inter-quartile range.
(4)
(d) P(A~B),
(2) An outlier is an observation that falls either more than 1.5 u (inter-quartile range) above the
(e) P(A B). upper quartile or more than 1.5 u (inter-quartile range) below the lower quartile.
(2)
(c) Determine if any of the prices are outliers.
(3)
5. The random variable X has the discrete uniform distribution
The travel agent also sells holidays from a website on the Internet. On the same day, he recorded
the price, £x, of each of 20 holidays sold on the website. The cheapest holiday sold was £98, the
P(X = x) = 1/n, x = 1, 2, ..., n.
most expensive was £2400 and the quartiles of these data were £305, £1379 and £1805. There
were no outliers.
Given that E(X) = 5,
(d) On graph paper, and using the same scale, draw box plots for the holidays sold in the shop
(a) show that n = 9. and the holidays sold on the website.
(3) (4)
(e) Compare and contrast sales from the shop and sales from the website.
Find
(2)
(b) P(X < 7),
(2) END
(c) Var (X).
(4)
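As a cross-check on the discrete uniform question above, the following Python sketch (not part of the paper; the helper names `uniform_pmf`, `mean` and `variance` are illustrative) builds P(X = x) = 1/n with exact fractions and evaluates E(X), P(X < 7) and Var(X) for n = 9.

```python
from fractions import Fraction

def uniform_pmf(n):
    """Return {x: P(X = x)} for the discrete uniform distribution on 1..n."""
    return {x: Fraction(1, n) for x in range(1, n + 1)}

def mean(pmf):
    """E(X) = sum of x * P(X = x)."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Var(X) = E(X^2) - [E(X)]^2."""
    mu = mean(pmf)
    return sum(x * x * p for x, p in pmf.items()) - mu * mu

pmf = uniform_pmf(9)
e_x = mean(pmf)                                       # should equal 5 when n = 9
p_less_7 = sum(p for x, p in pmf.items() if x < 7)   # P(X < 7)
var_x = variance(pmf)                                 # Var(X)
print(e_x, p_less_7, var_x)
```

Exact fractions are used here so that the results can be compared directly with a hand calculation.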
1. An office has the heating switched on at 7.00 a.m. each morning. On a particular day, the
temperature of the office, t °C, was recorded m minutes after 7.00 a.m. The results are shown in
Paper Reference(s)
the table below.
6683
Edexcel GCE
m 0 10 20 30 40 50
t 6.0 8.9 11.8 13.5 15.3 16.1
Statistics S1
Advanced/Advanced Subsidiary (a) Calculate the exact values of S mt and S mm .
(4)
Wednesday 14 January 2004 Morning (b) Calculate the equation of the regression line of t on m in the form t = a + bm.
Time: 1 hour 30 minutes (3)
Materials required for examination Items included with question papers (c) Use your equation to estimate the value of t at 7.35 a.m.
Answer Book (AB16) Nil (2)
Graph Paper (ASG2)
Mathematical Formulae (Lilac)
(d) State, giving a reason, whether or not you would use the regression equation in (b) to
estimate the temperature
Candidates may use any calculator EXCEPT those with the facility for symbolic (i) at 9.00 a.m. that day,
algebra, differentiation and/or integration. Thus candidates may NOT use calculators
such as the Texas Instruments TI 89, TI 92, Casio CFX 9970G, Hewlett Packard HP
48G.
(ii) at 7.15 a.m. one month later.
(4)
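The office-temperature question above can be checked numerically. The sketch below is only an illustration, not a model answer: it assumes the usual S1 definitions S_mm = Σm² − (Σm)²/n and S_mt = Σmt − (Σm)(Σt)/n, and the variable names are chosen here for readability.

```python
# Data from the table: m minutes after 7.00 a.m., temperature t in degrees C.
m = [0, 10, 20, 30, 40, 50]
t = [6.0, 8.9, 11.8, 13.5, 15.3, 16.1]
n = len(m)

sum_m, sum_t = sum(m), sum(t)

# Summary statistics used for least-squares regression.
s_mm = sum(x * x for x in m) - sum_m ** 2 / n
s_mt = sum(x * y for x, y in zip(m, t)) - sum_m * sum_t / n

# Regression line of t on m: t = a + b*m.
b = s_mt / s_mm
a = sum_t / n - b * sum_m / n

# 7.35 a.m. corresponds to m = 35, which lies inside the observed range.
t_at_35 = a + b * 35
print(s_mt, s_mm, round(a, 2), round(b, 3), round(t_at_35, 1))
```

Predictions for m well outside 0 to 50 (for example 9.00 a.m., m = 120) would be extrapolation, which is the point part (d) asks candidates to discuss.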
Advice to Candidates
You must ensure that your answers to parts of questions are clearly labelled.
You must show sufficient working to make your methods clear to the Examiner. Answers
without working may gain no credit.
N16969A This publication may only be reproduced in accordance with London Qualifications Limited copyright policy.
©2004 London Qualifications Limited.
3. A discrete random variable X has the probability function shown in the table below.
x 0 1 2 3
P(X = x) 1/3 1/2 1/12 1/12
Find
(a) P(1 < X ≤ 3),
(2)
(b) F(2.6),
(1)
(c) E(X),
(2)
(d) E(2X – 3),
(2)
(e) Var(X).
(3)
4. The events A and B are such that P(A) = 2/5, P(B) = 1/2 and P(A | B′) = 4/5.
(a) Find
(i) P(A ∩ B′),
(ii) P(A ∩ B),
(iii) P(A ∪ B),
(iv) P(A | B).
(7)
(b) State, with a reason, whether or not A and B are
(i) mutually exclusive,
(2)
(ii) independent.
(2)
5. The values of daily sales, to the nearest £, taken at a newsagents last year are summarised in the
table below.
Sales Number of days
1 – 200 166
201 – 400 100
401 – 700 59
701 – 1000 30
1001 – 1500 5
(a) Draw a histogram to represent these data.
(5)
(b) Use interpolation to estimate the median and inter-quartile range of daily sales.
(5)
(c) Estimate the mean and the standard deviation of these data.
(6)
The newsagent wants to compare last year’s sales with other years.
(d) State whether the newsagent should use the median and the inter-quartile range or the mean
and the standard deviation to compare daily sales. Give a reason for your answer.
(2)
6. One of the objectives of a computer game is to collect keys. There are three stages to the game.
The probability of collecting a key at the first stage is 2/3, at the second stage is 1/2, and at the
third stage is 1/4.
(a) Draw a tree diagram to represent the 3 stages of the game.
(4)
(b) Find the probability of collecting all 3 keys.
(2)
(c) Find the probability of collecting exactly one key in a game.
(5)
(d) Calculate the probability that keys are not collected on at least 2 successive stages in a game.
(5)
END
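The key-collecting question (question 6 of the paper that ends here) amounts to enumerating the eight branches of a tree diagram. The Python sketch below is one way to do that, reading the stage probabilities as 2/3, 1/2 and 1/4 and assuming, as the tree diagram implies but the question does not state outright, that the stages are independent. The function name `outcome_probability` is illustrative.

```python
from fractions import Fraction
from itertools import product

# Probability of collecting a key at stages 1, 2 and 3 (as read from the question).
p_key = [Fraction(2, 3), Fraction(1, 2), Fraction(1, 4)]

def outcome_probability(outcome):
    """Multiply along one branch of the tree; outcome is a tuple of three booleans."""
    prob = Fraction(1)
    for collected, p in zip(outcome, p_key):
        prob *= p if collected else 1 - p
    return prob

# All 2^3 branches: True = key collected, False = key not collected.
outcomes = list(product([True, False], repeat=3))

p_all_three = outcome_probability((True, True, True))
p_exactly_one = sum(outcome_probability(o) for o in outcomes if sum(o) == 1)
total = sum(outcome_probability(o) for o in outcomes)  # sanity check on the tree
print(p_all_three, p_exactly_one, total)
```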
1. A fair die has six faces numbered 1, 2, 2, 3, 3 and 3. The die is rolled twice and the number
showing on the uppermost face is recorded each time.
Find the probability that the sum of the two numbers recorded is at least 5.
Paper Reference(s) (5)
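For the die question above, a brute-force enumeration is a quick way to check a hand calculation. This sketch is not part of the paper; it assumes each of the six faces is equally likely and the two rolls are independent, and it simply lists all 36 ordered pairs.

```python
from fractions import Fraction
from itertools import product

# The six faces of the die; repeated numbers appear once per face.
faces = [1, 2, 2, 3, 3, 3]

# 36 equally likely ordered outcomes for two rolls.
rolls = list(product(faces, repeat=2))

# Count outcomes whose sum is at least 5.
favourable = sum(1 for first, second in rolls if first + second >= 5)
p_at_least_5 = Fraction(favourable, len(rolls))
print(favourable, len(rolls), p_at_least_5)
```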
6683
Edexcel GCE
2. A researcher thinks there is a link between a person's height and level of confidence. She
measured the height h, to the nearest cm, of a random sample of 9 people. She also devised a test
to measure the level of confidence c of each person. The data are shown in the table below.
Statistics S1
h 179 169 187 166 162 193 161 177 168
Advanced/Advanced Subsidiary
c 569 561 579 561 540 598 542 565 573
Friday 11 June 2004 Morning
Time: 1 hour 30 minutes [You may use Σh² = 272 094, Σc² = 2 878 966, Σhc = 884 484]
Materials required for examination Items included with question papers (a) Draw a scatter diagram to illustrate these data.
Answer Book (AB16) Nil
Graph Paper (ASG2) (4)
Mathematical Formulae (Lilac)
(b) Find exact values of S hc , S hh and S cc .
Candidates may use any calculator EXCEPT those with the facility for (4)
symbolic algebra, differentiation and/or integration. Thus candidates may NOT
(c) Calculate the value of the product moment correlation coefficient for these data.
use calculators such as the Texas Instruments TI 89, TI 92, Casio CFX 9970G,
Hewlett Packard HP 48G. (3)
(d) Give an interpretation of your correlation coefficient.
Advice to Candidates
You must ensure that your answers to parts of questions are clearly labelled.
You must show sufficient working to make your methods clear to the Examiner. Answers
without working may gain no credit.
N17022A This publication may only be reproduced in accordance with London Qualification Limited
©2004 London Qualification Limited
3. A discrete random variable X has a probability function as shown in the table below, where a and 4. The attendance at college of a group of 18 students was recorded for a 4-week period.
b are constants.
The number of students actually attending each of 16 classes are shown below.
x 0 1 2 3
18 18 17 17
P(X = x) 0.2 0.3 b a
16 17 16 18
18 14 17 18
Given that E(X) = 1.7, 15 17 18 16
(a) find the value of a and the value of b. (a) (i) Calculate the mean and the standard deviation of the number of students attending these
(5) classes.
Find
(ii) Express the mean as a percentage of the 18 students in the group.
(b) P(0 < X < 1.5),
(1) (5)
(c) E(2X 3). In the same 4-week period, the attendance of a different group of 20 students is shown below.
(2)
20 16 18 19
(d) Show that Var(X) = 1.41. 15 14 14 15
(3)
18 15 16 17
(e) Evaluate Var(2X 3). 16 18 15 14
(2)
(b) Construct a back-to-back stem and leaf diagram to represent the attendance in both groups.
(5)
(c) Find the mode, median and inter-quartile range for each group of students.
(6)
The mean percentage attendance and standard deviation for the second group of students are 81.25
and 1.82 respectively.
5. A health club lets members use, on each visit, its facilities for as long as they wish. The club’s
records suggest that the length of a visit can be modelled by a normal distribution with mean
90 minutes. Only 20% of members stay for more than 125 minutes. Paper Reference(s)
Advice to Candidates
You must ensure that your answers to parts of questions are clearly labelled.
You must show sufficient working to make your methods clear to the Examiner. Answers
without working may gain no credit.
N16628A This publication may only be reproduced in accordance with London Qualifications Limited copyright policy.
©2004 London Qualifications Limited
1. As part of their job, taxi drivers record the number of miles they travel each day. A random 2. An experiment carried out by a student yielded pairs of (x, y) observations such that
sample of the mileages recorded by taxi drivers Keith and Asif are summarised in the back-to-
back stem and leaf diagram below. x̄ = 36, ȳ = 28.6, S xx = 4402, S xy = 3477.6
Totals Keith Asif Totals (a) Calculate the equation of the regression line of y on x in the form y = a + bx. Give your
values of a and b to 2 decimal places.
(9) 8 7 7 4 3 2 1 1 0 18 4 4 5 7 (4) (3)
(11) 9 9 8 7 6 5 4 3 3 1 1 19 5 7 8 9 9 (5)
(b) Find the value of y when x = 45.
(6) 8 7 4 2 2 0 20 0 2 2 4 4 8 (6)
(1)
(6) 9 4 3 1 0 0 21 2 3 5 6 6 7 9 (7)
(4) 6 4 1 1 22 1 1 2 4 5 5 8 (7)
(2) 2 0 23 1 1 3 4 6 6 7 8 (8) 3. The random variable X ~ N(μ, σ²).
(2) 7 1 24 2 4 8 9 (4)
(1) 9 25 4 (1) It is known that
(2) 9 3 26 (0)
P(X ≤ 66) = 0.0359 and P(X ≥ 81) = 0.1151.
Key: 0 | 18 | 4 means 180 for Keith and 184 for Asif
(a) In the space below, give a clearly labelled sketch to represent these probabilities on a
Normal curve.
The quartiles for these two distributions are summarised in the table below.
(1)
Keith Asif (b) (i) Show that the value of σ is 5.
Lower quartile 191 a
(ii) Find the value of μ.
Median b 218 (8)
Upper quartile 221 c
(c) Find P(69 ≤ X ≤ 83).
(3)
(a) Find the values of a, b and c.
(3)
(b) On graph paper, and showing your scale clearly, draw a box plot to represent Keith’s data.
(8)
(c) Comment on the skewness of the two distributions.
(3)
Find
(a) Evaluate S xx , S yy and S xy .
(a) α, (4)
(2) (b) Calculate, to 3 decimal places, the product moment correlation coefficient between x and y.
(3)
(b) P(–1 ≤ X < 2),
(1) (c) Give an interpretation of your coefficient.
(2)
(c) F(0.6),
(1) (d) Calculate the mean and the standard deviation of the number of press-ups done by these
students.
(d) the value of a such that E(aX + 3) = 1.2,
(4)
(4)
(e) Var(X), Mr Brawn assumes that the number of press-ups that can be done by any student can be
(4) modelled by a normal distribution with mean μ and standard deviation σ. Assuming that μ and σ
(f) Var(3X – 2). take the same values as those calculated in part (d),
(2)
(e) find the value of a such that P(μ – a < X < μ + a) = 0.95.
(3)
5. The events A and B are such that P(A) = 1/2, P(B) = 1/3 and P(A ∩ B) = 1/4.
(f) Comment on Mr Brawn’s assumption of normality.
(2)
(a) Using the space below, represent these probabilities in a Venn diagram.
(4)
7. A college organised a ‘fun run’. The times, to the nearest minute, of a random sample of 100
students who took part are summarised in the table below.
Time Number of students
40–44 10
45–47 15
48 23
49–51 21
52–55 16
56–60 15
Paper Reference(s)
6683
Edexcel GCE
Statistics S1
Advanced Subsidiary
(a) Give a reason to support the use of a histogram to represent these data.
(1) Friday 14 January 2005 Morning
(b) Write down the upper class boundary and the lower class boundary of the class 40–44.
(1) Time: 1 hour 30 minutes
(c) On graph paper, draw a histogram to represent these data. Materials required for examination Items included with question papers
(4) Mathematical Formulae (Lilac) Nil
Graph Paper (ASG2)
END
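For the fun-run histogram question, the heights of the bars are frequency densities rather than raw frequencies, because the classes have unequal widths. The sketch below is an illustration only: it reads the table as 40–44 (10), 45–47 (15), 48 (23), 49–51 (21), 52–55 (16), 56–60 (15), assumes times recorded to the nearest minute so that class boundaries sit 0.5 either side of the stated limits, and uses a hypothetical helper `frequency_densities`.

```python
# (lower limit, upper limit, frequency) for each class, as read from the table.
classes = [
    (40, 44, 10),
    (45, 47, 15),
    (48, 48, 23),
    (49, 51, 21),
    (52, 55, 16),
    (56, 60, 15),
]

def frequency_densities(rows):
    """Return (lower boundary, upper boundary, frequency density) per class."""
    result = []
    for low, high, freq in rows:
        lower_boundary = low - 0.5   # times are to the nearest minute
        upper_boundary = high + 0.5
        width = upper_boundary - lower_boundary
        result.append((lower_boundary, upper_boundary, freq / width))
    return result

densities = frequency_densities(classes)
for lower, upper, density in densities:
    print(f"{lower}-{upper}: {density}")
```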
Candidates may use any calculator EXCEPT those with the facility
for symbolic algebra, differentiation and/or integration. Thus
candidates may NOT use calculators such as the Texas Instruments
TI 89, TI 92, Casio CFX 9970G, Hewlett Packard HP 48G.
Instructions to Candidates
In the boxes on the answer book, write the name of the examining body (Edexcel), your centre
number, candidate number, the unit title (Statistics S1), the paper reference (6683), your
surname, other name and signature.
Values from the statistical tables should be quoted in full. When a calculator is used, the answer
should be given to an appropriate degree of accuracy.
Advice to Candidates
You must ensure that your answers to parts of questions are clearly labelled.
You must show sufficient working to make your methods clear to the Examiner. Answers
without working may gain no credit.
N16741A This publication may only be reproduced in accordance with London Qualifications Limited copyright policy.
©2005 London Qualifications Limited
1. A company assembles drills using components from two sources. Goodbuy supplies 85% of the 3. The following table shows the height x, to the nearest cm, and the weight y, to the nearest kg, of
components and Amart supplies the rest. It is known that 3% of the components supplied by a random sample of 12 students.
Goodbuy are faulty and 6% of those supplied by Amart are faulty.
x 148 164 156 172 147 184 162 155 182 165 175 152
(a) Represent this information on a tree diagram.
(3) y 39 59 56 77 44 77 65 49 80 72 70 52
An assembled drill is selected at random. (a) On graph paper, draw a scatter diagram to represent these data.
(3)
(b) Find the probability that it is not faulty.
(3) (b) Write down, with a reason, whether the correlation coefficient between x and y is positive or
negative.
(2)
2. The number of caravans on Seaview caravan site on each night in August last year is summarised
in the following stem and leaf diagram. The data in the table can be summarised as follows.
Caravans 1|0 means 10 Totals Σx = 1962, Σy = 740, Σy² = 47 746, Σxy = 122 783, S xx = 1745.
1 0 5 (2)
(c) Find S xy .
2 1 2 4 8 (4) (2)
3 0 3 3 3 4 7 8 8 (8)
4 1 1 3 5 8 8 8 9 9 (9) The equation of the regression line of y on x is y = –106.331 + bx.
5 2 3 6 6 7 (5)
6 2 3 4 (3) (d) Find, to 3 decimal places, the value of b.
(2)
(e) Find, to 3 significant figures, the mean ȳ and the standard deviation s of the weights of this
(a) Find the three quartiles of these data. sample of students.
(3) (3)
During the same month, the least number of caravans on Northcliffe caravan site was 31. The (f) Find the values of ȳ ± 1.96s.
maximum number of caravans on this site on any night that month was 72. The three quartiles (2)
for this site were 38, 45 and 52 respectively.
(g) Comment on whether or not you think that the weights of these students could be modelled
(b) On graph paper and using the same scale, draw box plots to represent the data for both by a normal distribution.
caravan sites. You may assume that there are no outliers. (1)
(6)
(c) Compare and contrast these two box plots.
(3)
(d) Give an interpretation to the upper quartiles of these two distributions.
(2)
4. The random variable X has probability function 5. Articles made on a lathe are subject to three kinds of defect, A, B or C. A sample of 1000 articles
was inspected and the following results were obtained.
P(X = x) = kx, x = 1, 2, ..., 5.
31 had a type A defect
(a) Show that k = 1/15. 37 had a type B defect
42 had a type C defect
(2) 11 had both type A and type B defects
Find 13 had both type B and type C defects
10 had both type A and type C defects
(b) P(X < 4), 6 had all three types of defect.
(2)
(a) Draw a Venn diagram to represent these data.
(c) E(X), (6)
(2)
(d) E(3X – 4). Find the probability that a randomly selected article from this sample had
(2)
(b) no defects,
(1)
(c) no more than one of these defects.
(2)
An article selected at random from this sample had only one defect.
6. A discrete random variable is such that each of its values is assumed to be equally likely.
(a) Write down the name of the distribution that could be used to model this random variable.
(1)
(b) Give an example of such a distribution.
(1)
(c) Comment on the assumption that each value is equally likely.
(2)
(d) Suggest how you might refine the model in part (a).
(2)
7. The random variable X is normally distributed with mean 79 and variance 144.
It is known that P(79 – a ≤ X ≤ 79 + b) = 0.6463. This information is shown in the figure below. Statistics S1
Advanced Subsidiary
Thursday 9 June 2005 Morning
Time: 1 hour 30 minutes
0.6463
Materials required for examination Items included with question papers
Mathematical Formulae (Lilac) Nil
Graph Paper (ASG2)
Candidates may use any calculator EXCEPT those with the facility
79 – a 79 79 + b for symbolic algebra, differentiation and/or integration. Thus
candidates may NOT use calculators such as the Texas Instruments
TI 89, TI 92, Casio CFX 9970G, Hewlett Packard HP 48G.
Given that P(X ≥ 79 + b) = 2P(X ≤ 79 – a),
(c) show that the area of the shaded region is 0.1179. Instructions to Candidates
(3)
In the boxes on the answer book, write the name of the examining body (Edexcel), your centre
(d) Find the value of b. number, candidate number, the unit title (Statistics S1), the paper reference (6683), your
(4) surname, other name and signature.
Values from the statistical tables should be quoted in full. When a calculator is used, the answer
TOTAL FOR PAPER:75 MARKS should be given to an appropriate degree of accuracy.
END
Information for Candidates
A booklet ‘Mathematical Formulae and Statistical Tables’ is provided.
Full marks may be obtained for answers to ALL questions.
This paper has seven questions.
The total mark for this paper is 75.
Advice to Candidates
You must ensure that your answers to parts of questions are clearly labelled.
You must show sufficient working to make your methods clear to the Examiner. Answers
without working may gain no credit.
N20910A This publication may only be reproduced in accordance with London Qualifications Limited copyright policy.
©2005 London Qualifications Limited
1. The scatter diagrams below were drawn by a student. 2. The following table summarises the distances, to the nearest km, that 134 examiners travelled to
attend a meeting in London.
[Scatter diagrams: Diagram A (y against x), Diagram B (v against u), Diagram C (t against s)]
Distance (km) Number of examiners
41–45 4
46–50 19
51–60 53
61–70 37
71–90 15
91–150 6
(a) Give a reason to justify the use of a histogram to represent these data.
The student calculated the value of the product moment correlation coefficient for each of the (1)
sets of data.
(b) Calculate the frequency densities needed to draw a histogram for these data.
The values were (DO NOT DRAW THE HISTOGRAM)
(2)
0.68 –0.79 0.08 (c) Use interpolation to estimate the median Q 2 , the lower quartile Q 1 , and the upper quartile Q 3
of these data.
Write down, with a reason, which value corresponds to which scatter diagram.
(6) The mid-point of each class is represented by x and the corresponding frequency by f.
Calculations then give the following values
(d) Calculate an estimate of the mean and an estimate of the standard deviation for these data.
(4)
(e) Evaluate this coefficient and comment on the skewness of these data.
(4)
(f) Give another justification of your comment in part (e).
(1)
3. A long distance lorry driver recorded the distance travelled, m miles, and the amount of fuel 5. The random variable X has probability function
used, f litres, each day. Summarised below are data from the driver’s records for a random
sample of 8 days.
P(X = x) = kx for x = 1, 2, 3, and P(X = x) = k(x + 1) for x = 4, 5,
The data are coded such that x = m – 250 and y = f – 100.
where k is a constant.
Σx = 130 Σy = 48 Σxy = 8880 S xx = 20 487.5
(a) Find the value of k.
(a) Find the equation of the regression line of y on x in the form y = a + bx.
(2)
(6)
(b) Find the exact value of E(X).
(b) Hence find the equation of the regression line of f on m.
(2)
(3)
(c) Show that, to 3 significant figures, Var (X) = 1.47.
(c) Predict the amount of fuel used on a journey of 235 miles.
(4)
(1)
(d) Find, to 1 decimal place, Var (4 – 3X).
(2)
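The piecewise probability function in question 5 above can be checked with exact arithmetic. The sketch below assumes the second branch reads k(x + 1) for x = 4, 5; that reading reproduces the value Var(X) = 1.47 quoted in part (c). Variable names are illustrative.

```python
from fractions import Fraction

# Each probability as a multiple of k: kx for x = 1, 2, 3 and k(x + 1) for x = 4, 5.
multiples_of_k = {1: 1, 2: 2, 3: 3, 4: 5, 5: 6}

# The probabilities must sum to 1, which fixes k.
k = Fraction(1, sum(multiples_of_k.values()))
pmf = {x: w * k for x, w in multiples_of_k.items()}

e_x = sum(x * p for x, p in pmf.items())            # E(X)
e_x2 = sum(x * x * p for x, p in pmf.items())       # E(X^2)
var_x = e_x2 - e_x ** 2                             # Var(X)

# Var(a + bX) = b^2 Var(X), so Var(4 - 3X) = 9 Var(X).
var_4_minus_3x = (-3) ** 2 * var_x
print(k, e_x, float(var_x), float(var_4_minus_3x))
```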
4. Aeroplanes fly from City A to City B. Over a long period of time the number of minutes delay in
take-off from City A was recorded. The minimum delay was 5 minutes and the maximum delay
6. A scientist found that the time taken, M minutes, to carry out an experiment can be modelled by
was 63 minutes. A quarter of all delays were at most 12 minutes, half were at most 17 minutes
a normal random variable with mean 155 minutes and standard deviation 3.5 minutes.
and 75% were at most 28 minutes. Only one of the delays was longer than 45 minutes.
Find
An outlier is an observation that falls either 1.5 × (interquartile range) above the upper quartile or
1.5 × (interquartile range) below the lower quartile. (a) P(M > 160),
(3)
(a) On graph paper, draw a box plot to represent these data.
(7) (b) 3M
(4)
(b) Comment on the distribution of delays. Justify your answer.
(2) (c) the value of m, to 1 decimal place, such that P(M ≤ m) = 0.30.
(4)
(c) Suggest how the distribution might be interpreted by a passenger who frequently flies from
City A to City B.
(1)
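The outlier rule in the aeroplane-delay question (question 4 above) can be sketched in a few lines. This is an illustration under the question's own definition of an outlier, using the quartiles 12, 17 and 28 and the extremes 5 and 63 stated in the question; the names `lower_fence` and `upper_fence` are not from the paper.

```python
# Summary values taken from the question (minutes of delay).
minimum, q1, q2, q3, maximum = 5, 12, 17, 28, 63

iqr = q3 - q1                      # interquartile range
lower_fence = q1 - 1.5 * iqr       # values below this are outliers
upper_fence = q3 + 1.5 * iqr       # values above this are outliers

min_is_outlier = minimum < lower_fence
max_is_outlier = maximum > upper_fence
print(iqr, lower_fence, upper_fence, min_is_outlier, max_is_outlier)
```

Since only one delay exceeded 45 minutes, the maximum of 63 is the only value that can lie above the upper fence, and the upper whisker would then stop at a value no greater than 45.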
7. In a school there are 148 students in Years 12 and 13 studying Science, Humanities or Arts Paper Reference(s)
subjects. Of these students, 89 wear glasses and the others do not. There are 30 Science students
of whom 18 wear glasses. The corresponding figures for the Humanities students are 68 and 44
6683/01
respectively.
Edexcel GCE
A student is chosen at random.
Advice to Candidates
You must ensure that your answers to parts of questions are clearly labelled.
You must show sufficient working to make your methods clear to the Examiner. Answers
without working may gain no credit.
N20908A
This publication may only be reproduced in accordance with Edexcel Limited copyright policy.
©2006 Edexcel Limited.
1. Over a period of time, the number of people x leaving a hotel each morning was recorded.
2. The random variable X has probability distribution
These data are summarised in the stem and leaf diagram below.
x 1 2 3 4 5
Number leaving 3 | 2 means 32 Totals
2 7 9 9 (3) P(X = x) 0.10 p 0.20 q 0.30
3 2 2 3 5 6 (5)
4 0 1 4 8 9 (5) (a) Given that E(X) = 3.5, write down two equations involving p and q.
5 2 3 3 6 6 6 8 (7) (3)
6 0 1 4 5 (4)
Find
7 2 3 (2)
8 1 (1) (b) the value of p and the value of q,
(3)
For these data, (c) Var (X),
(4)
(a) write down the mode, (d) Var (3 – 2X).
(1) (2)
(b) find the values of the three quartiles.
(3)
3. A manufacturer stores drums of chemicals. During storage, evaporation takes place. A
Given that Σx = 1335 and Σx² = 71 801, find random sample of 10 drums was taken and the time in storage, x weeks, and the evaporation
loss, y ml, are shown in the table below.
(c) the mean and the standard deviation of these data.
(4) x 3 5 6 8 10 12 13 15 16 18
(a) On graph paper, draw a scatter diagram to represent these data.
(3)
(mean − mode) / standard deviation.
(b) Give a reason to support fitting a regression model of the form y = a + bx to these data.
(d) Evaluate this measure to show that these data are negatively skewed. (1)
(2)
(c) Find, to 2 decimal places, the value of a and the value of b.
(e) Give two other reasons why these data are negatively skewed.
(4) (You may use Σx² = 1352, Σy² = 53 112 and Σxy = 8354.)
(7)
(d) Give an interpretation of the value of b.
(1)
(e) Using your model, predict the amount of evaporation that would take place after
(i) 19 weeks,
(ii) 35 weeks.
(2)
(f ) Comment, with a reason, on the reliability of each of your predictions.
(4)
4. A bag contains 9 blue balls and 3 red balls. A ball is selected at random from the bag and its
7. The heights of a group of athletes are modelled by a normal distribution with mean 180 cm
colour is recorded. The ball is not replaced. A second ball is selected at random and its colour
and a standard deviation 5.2 cm. The weights of this group of athletes are modelled by a
is recorded.
normal distribution with mean 85 kg and standard deviation 7.1 kg.
(a) Draw a tree diagram to represent the information.
Find the probability that a randomly chosen athlete
(3)
(a) is taller than 188 cm,
Find the probability that
(3)
(a) the second ball selected is red, (b) weighs less than 97 kg.
(2) (2)
(b) both balls selected are red, given that the second ball selected is red. (c) Assuming that for these athletes height and weight are independent, find the probability
(2) that a randomly chosen athlete is taller than 188 cm and weighs more than 97 kg.
(3)
(d) Comment on the assumption that height and weight are independent.
5. (a) Write down two reasons for using statistical models.
(1)
(2)
(b) Give an example of a random variable that could be modelled by
TOTAL FOR PAPER: 75 MARKS
END
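The athletes question (question 7 of the paper that ends here) can be checked with Python's standard-library `statistics.NormalDist` instead of tables. This is a sketch, not a model answer; table-based answers may differ slightly in the last decimal place because z-values are rounded before lookup.

```python
from statistics import NormalDist

# Models stated in the question.
height = NormalDist(mu=180, sigma=5.2)   # cm
weight = NormalDist(mu=85, sigma=7.1)    # kg

p_taller_188 = 1 - height.cdf(188)       # P(height > 188)
p_lighter_97 = weight.cdf(97)            # P(weight < 97)

# Under the independence assumption in part (c):
# P(height > 188 and weight > 97) = P(height > 188) * P(weight > 97).
p_tall_and_heavy = p_taller_188 * (1 - p_lighter_97)
print(round(p_taller_188, 4), round(p_lighter_97, 4), round(p_tall_and_heavy, 4))
```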
(i) a normal distribution,
(a) Draw a Venn diagram to illustrate the complete sample space for the events A and B.
(3)
(b) Write down the value of P(A) and the value of P(B).
(3)
(c) Find P(A | B′).
(2)
(d) Determine whether or not A and B are independent.
(3)
Paper Reference(s) Children from schools A and B took part in a fun run for charity. The times, to the nearest minute,
taken by the children from school A are summarised in Figure 1.
6683/01
Edexcel GCE School A
Figure 1
Statistics S1
Advanced/Advanced Subsidiary u u
(b) (i) Write down the time by which 75% of the children in school A had completed the run.
(d) On graph paper, draw a box plot to represent the data from school B.
Instructions to Candidates (4)
Write the name of the examining body (Edexcel), your centre number, candidate number, the (e) Compare and contrast these two box plots.
unit title (Statistics S1), the paper reference (6683), your surname, initials and signature. (4)
Values from the statistical tables should be quoted in full. When a calculator is used, the
answer should be given to an appropriate degree of accuracy.
Advice to Candidates
You must ensure that your answers to parts of questions are clearly labelled.
You must show sufficient working to make your methods clear to the Examiner. Answers
without working may gain no credit.
N22337A This publication may only be reproduced in accordance with London Qualifications copyright policy.
©2006 London Qualifications Limited.
2. Sunita and Shelley talk to each other once a week on the telephone. Over many weeks they 3. A metallurgist measured the length, l mm, of a copper rod at various temperatures, t °C, and
recorded, to the nearest minute, the number of minutes spent in conversation on each occasion. recorded the following results.
The following table summarises their results.
t l
Time Number of 20.4 2461.12
(to the nearest minute) conversations
27.3 2461.41
5–9 2 32.1 2461.73
10–14 9 39.0 2461.88
The mid-point of each class was represented by x and its corresponding frequency by f, (b) Find the equation of the regression line of y on x in the form y = a + bx.
giving Σfx = 1060. (5)
(c) Estimate the length of the rod at 40 °C.
(b) Calculate an estimate of the mean time spent on their conversations. (3)
(2)
(d) Find the equation of the regression line of l on t.
During the following 25 weeks they monitored their weekly conversation and found that at the (2)
end of the 80 weeks their overall mean length of conversation was 21 minutes. (e) Estimate the length of the rod at 90 °C.
(1)
(c) Find the mean time spent in conversation during these 25 weeks.
(4) (f ) Comment on the reliability of your estimate in part (e).
(2)
(d) Comment on these two mean values.
(2)
4. The random variable X has the discrete uniform distribution 6. A group of 100 people produced the following information relating to three attributes. The
attributes were wearing glasses, being left-handed and having dark hair.
P(X = x) = 1/5, x = 1, 2, 3, 4, 5. Glasses were worn by 36 people, 28 were left-handed and 36 had dark hair. There were 17 who
wore glasses and were left-handed, 19 who wore glasses and had dark hair and 15 who were
(a) Write down the value of E(X) and show that Var(X) = 2. left-handed and had dark hair. Only 10 people wore glasses, were left-handed and had dark hair.
(3)
(a) Represent these data on a Venn diagram.
Find (6)
(b) E(3X – 2), A person was selected at random from this group.
(2)
Find the probability that this person
(c) Var(4 – 3X)
(2) (b) wore glasses but was not left-handed and did not have dark hair,
(1)
5. From experience a high jumper knows that he can clear a height of at least 1.78 m once in 5 (c) did not wear glasses, was not left-handed and did not have dark hair,
attempts. He also knows that he can clear a height of at least 1.65 m on 7 out of 10 attempts. (1)
Assuming that the heights the high jumper can reach follow a Normal distribution, (d) had only two of the attributes,
(2)
(a) draw a sketch to illustrate the above information, (e) wore glasses, given they were left-handed and had dark hair.
(3) (3)
(b) find, to 3 decimal places, the mean and the standard deviation of the heights the high jumper
can reach, TOTAL FOR PAPER: 75 MARKS
(6)
END
(c) calculate the probability that he can jump at least 1.74 m.
(3)
1. As part of a statistics project, Gill collected data relating to the length of time, to the nearest
minute, spent by shoppers in a supermarket and the amount of money they spent. Her data for
a random sample of 10 shoppers are summarised in the table below, where t represents time
Paper Reference(s)
and £m the amount spent over £20.
6683/01
t (minutes) £m
Edexcel GCE 15 í
23 17
5 í
Statistics S1 16 4
30 12
Advanced/Advanced Subsidiary 6 í
32 27
Tuesday 16 January 2007 Morning 23 6
Time: 1 hour 30 minutes 35 20
27 6
Materials required for examination Items included with question papers (a) Write down the actual amount spent by the shopper who was in the supermarket for
Mathematical Formulae (Green or Lilac) Nil 15 minutes.
(1)
(b) Calculate Stt , S mm and Stm .
Candidates may use any calculator EXCEPT those with the facility for symbolic
algebra, differentiation and/or integration. Thus candidates may NOT use
calculators such as the Texas Instruments TI 89, TI 92, Casio CFX 9970G, (You may use Σt² = 5478, Σm² = 2101, and Σtm = 2485)
Hewlett Packard HP 48G. (6)
(c) Calculate the value of the product moment correlation coefficient between t and m.
(3)
Instructions to Candidates (d) Write down the value of the product moment correlation coefficient between t and the
In the boxes on the answer book, write the name of the examining body (Edexcel), your actual amount spent. Give a reason to justify your value.
centre number, candidate number, the unit title (Statistics S1), the paper reference (6683), (2)
your surname, other name and signature.
Values from the statistical tables should be quoted in full. When a calculator is used, the On another day Gill collected similar data. For these data the product moment correlation
answer should be given to an appropriate degree of accuracy. coefficient was 0.178.
Advice to Candidates
You must ensure that your answers to parts of questions are clearly labelled.
You must show sufficient working to make your methods clear to the Examiner. Answers
without working may gain no credit.
N23957A
This publication may only be reproduced in accordance with Edexcel Limited copyright policy.
©2007 Edexcel Limited.
4. Summarised below are the distances, to the nearest mile, travelled to work by a random
2. In a factory, machines A, B and C are all producing metal rods of the same length. Machine A
sample of 120 commuters.
produces 35% of the rods, machine B produces 25% and the rest are produced by machine C.
Of their production of rods, machines A, B and C produce 3%, 6% and 5% defective rods
Distance Number of
respectively.
(to the nearest mile) commuters
(a) Draw a tree diagram to represent this information. 0–9 10
(3) 10 – 19 19
(b) Find the probability that a randomly selected rod is 20 – 29 43
30 – 39 25
(i) produced by machine A and is defective, 40 – 49 8
50 – 59 6
(ii) is defective.
(5) 60 – 69 5
70 – 79 3
(c) Given that a randomly selected rod is defective, find the probability that it was produced
80 – 89 1
by machine C.
(3)
For this distribution,
It has been suggested that there are 7 stages involved in creating a statistical model. They are
summarised below, with stages 3, 4 and 7 missing.
Stage 3.
Stage 4.
Stage 6. Statistical concepts are used to test how well the model describes the real-world
problem.
Stage 7.
1. A young family were looking for a new 3 bedroom semi-detached house. A local survey recorded
the price x, in £1000, and the distance y, in miles, from the station of such houses. The following
summary statistics were provided
Paper Reference(s)
Edexcel GCE (a) Use these values to calculate the product moment correlation coefficient.
(2)
Statistics S1 (b) Give an interpretation of your answer to part (a).
(1)
Advanced/Advanced Subsidiary
Another family asked for the distances to be measured in km rather than miles.
Tuesday 5 June 2007 Afternoon (c) State the value of the product moment correlation coefficient in this case.
Time: 1 hour 30 minutes (1)
Candidates may use any calculator EXCEPT those with the facility for symbolic
algebra, differentiation and/or integration. Thus candidates may NOT use calculators
such as the Texas Instruments TI 89, TI 92, Casio CFX 9970G, Hewlett Packard
HP 48G.
Instructions to Candidates
Write the name of the examining body (Edexcel), your centre number, candidate number, the
unit title (Statistics S1), the paper reference (6683), your surname, initials and signature.
Advice to Candidates
You must ensure that your answers to parts of questions are clearly labelled.
You must show sufficient working to make your methods clear to the Examiner. Answers
without working may gain no credit.
N26118A This publication may only be reproduced in accordance with London Qualifications copyright policy.
©2007 London Qualifications Limited.
2. The box plot in Figure 1 shows a summary of the weights of the luggage, in kg, for each 3. A student is investigating the relationship between the price (y pence) of 100g of chocolate and
musician in an orchestra on an overseas tour. the percentage (x%) of the cocoa solids in the chocolate.
The following data is obtained
Figure 1
Chocolate brand A B C D E F G H
x (% cocoa) 10 20 30 35 40 50 60 70
One musician of the orchestra suggests that the weights of the luggage, in kg, can be modelled by (i) state which brand is overpriced,
a normal distribution with quartiles as given in Figure 1.
(ii) suggest a fair price for this brand.
(c) Find the standard deviation of this normal distribution.
(4) Give reasons for both your answers.
(4)
4. A survey of the reading habits of some students revealed that, on a regular basis, 25% read
quality newspapers, 45% read tabloid newspapers and 40% do not read newspapers at all.
(a) Find the proportion of students who read both quality and tabloid newspapers.
(3)
(b) Draw a Venn diagram to represent this information.
(3)
A student is selected at random. Given that this student reads newspapers on a regular basis,
(c) find the probability that this student only reads quality newspapers.
(3)
5.
[Figure 2: Histogram of times. Vertical axis: frequency density, scale 0 to 6. Horizontal axis: t, marked at 0, 5, 10, 14, 18, 20, 25, 30 and 40.]
Figure 2 shows a histogram for the variable t which represents the time taken, in minutes, by a
group of people to swim 500 m.
t 5 – 10 10 – 14 14 – 18 18 – 25 25 – 40
Frequency 10 16 24
(2)
(b) Estimate the number of people who took longer than 20 minutes to swim 500 m.
(2)
(c) Find an estimate of the mean time taken.
(4)
(d) Find an estimate for the standard deviation of t.
(3)
(e) Find the median and quartiles for t.
(4)
One measure of skewness is found using 3(mean – median) / standard deviation.
(f) Evaluate this measure and describe the skewness of these data.
(2)
6. The random variable X has a normal distribution with mean 20 and standard deviation 4. Paper Reference(s)
(b) Find the value of d such that P(20 < X < d) = 0.4641.
(3)
Edexcel GCE
(4)
Statistics S1
7. The random variable X has probability distribution
Advanced Subsidiary
x 1 3 5 7 9
Tuesday 15 January 2008 Morning
P(X = x) 0.2 p 0.2 q 0.15
Time: 1 hour 30 minutes
(a) Given that E(X) = 4.5, write down two equations involving p and q.
(3)
Materials required for examination Items included with question papers
Find Mathematical Formulae (Green) Nil
Advice to Candidates
You must ensure that your answers to parts of questions are clearly labelled.
You must show sufficient working to make your methods clear to the Examiner. Answers
without working may gain no credit.
N29283A
This publication may only be reproduced in accordance with Edexcel Limited copyright policy.
©2008 Edexcel Limited.
1. A personnel manager wants to find out if a test carried out during an employee’s interview
2. Cotinine is a chemical that is made by the body from nicotine which is found in cigarette
and a skills assessment at the end of basic training is a guide to performance after working for
smoke. A doctor tested the blood of 12 patients, who claimed to smoke a packet of cigarettes
the company for one year.
a day, for cotinine. The results, in appropriate units, are shown below.
The table below shows the results of the interview test of 10 employees and their performance
after one year. Patient A B C D E F G H I J K L
Cotinine
Employee A B C D E F G H I J 160 390 169 175 125 420 171 250 210 258 186 243
level, x
Interview
65 71 79 77 85 78 85 90 81 62
test, x % [You may use Σx² = 724 961]
Performance
after one 65 74 82 64 87 78 61 65 79 69 (a) Find the mean and standard deviation of the level of cotinine in a patient’s blood.
year, y % (4)
(b) Find the median, upper and lower quartiles of these data.
(3)
[You may use Σx² = 60 475, Σy² = 53 122, Σxy = 56 076]
A doctor suspects that some of his patients have been smoking more than a packet of
(a) Showing your working clearly, calculate the product moment correlation coefficient cigarettes per day. He decides to use Q 3 + 1.5(Q 3 – Q 1 ) to determine if any of the cotinine
between the interview test and the performance after one year. results are far enough away from the upper quartile to be outliers.
(5)
(c) Identify which patient(s) may have been smoking more than a packet of cigarettes a day.
The product moment correlation coefficient between the skills assessment and the Show your working clearly.
performance after one year is –0.156 to 3 significant figures. (4)
(b) Use your answer to part (a) to comment on whether or not the interview test and skills
assessment are a guide to the performance after one year. Give clear reasons for your
answers.
(2)
Research suggests that cotinine levels in the blood form a skewed distribution.
One measure of skewness is found using (Q1 – 2Q2 + Q3) / (Q3 – Q1).
(d) Evaluate this measure and describe the skewness of these data.
(3)
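The cotinine question (Question 2) can be followed through numerically. This is a sketch under stated assumptions rather than a mark scheme: the quartile rule in `quartile` (average two observations when k·n/4 is a whole number, otherwise round up) is one common S1 convention and is assumed here, and the skewness measure is read as (Q1 – 2Q2 + Q3) / (Q3 – Q1).

```python
from math import ceil, sqrt

# Cotinine level for each patient, as tabulated in the question.
levels = {"A": 160, "B": 390, "C": 169, "D": 175, "E": 125, "F": 420,
          "G": 171, "H": 250, "I": 210, "J": 258, "K": 186, "L": 243}
values = sorted(levels.values())
n = len(values)

# (a) Mean, and standard deviation from the given sum of squares 724 961.
mean = sum(values) / n
sd = sqrt(724961 / n - mean ** 2)

def quartile(sorted_values, k):
    """k-th quartile under the assumed rule: if k*n/4 is whole, average that
    observation and the next one; otherwise round the position up."""
    position = k * len(sorted_values) / 4
    if position == int(position):
        i = int(position)
        return (sorted_values[i - 1] + sorted_values[i]) / 2
    return sorted_values[ceil(position) - 1]

# (b) Quartiles, (c) upper outlier limit, (d) quartile skewness measure.
q1, q2, q3 = (quartile(values, k) for k in (1, 2, 3))
upper_limit = q3 + 1.5 * (q3 - q1)
suspects = sorted(p for p, x in levels.items() if x > upper_limit)
skewness = (q1 - 2 * q2 + q3) / (q3 - q1)

print(mean, round(sd, 2), (q1, q2, q3), upper_limit, suspects, round(skewness, 3))
```

A different quartile convention (for example linear interpolation) would move Q1 and Q3 slightly, so the limit and the skewness value should be read with that in mind.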
3. The histogram in Figure 1 shows the time taken, to the nearest minute, for 140 runners to
4. A second hand car dealer has 10 cars for sale. She decides to investigate the link between the
complete a fun run.
age of the cars, x years, and the mileage, y thousand miles. The data collected from the cars
are shown in the table below.
Age, x
(years) 2 2.5 3 4 4.5 4.5 5 3 6 6.5
Mileage, y
(thousands) 22 34 33 37 40 45 49 30 58 58
Figure 1
Use the histogram to calculate the number of runners who took between 78.5 and 90.5
minutes to complete the fun run.
(5)
5. The following shows the results of a wine tasting survey of 100 people.
7. Tetrahedral dice have four faces. Two fair tetrahedral dice, one red and one blue, have faces
numbered 0, 1, 2, and 3 respectively. The dice are rolled and the numbers face down on the
96 like wine A,
two dice are recorded. The random variable R is the score on the red die and the random
93 like wine B,
variable B is the score on the blue die.
96 like wine C,
92 like A and B,
(a) Find P(R = 3 and B = 0).
91 like B and C,
(2)
93 like A and C,
90 like all three wines. The random variable T is R multiplied by B.
(a) Draw a Venn Diagram to represent these data. (b) Complete the diagram below to represent the sample space that shows all the possible
(6) values of T.
Find the probability that a randomly selected person from the survey likes
3
(b) none of the three wines,
(1)
2 2
(c) wine A but not wine B,
(2)
(d) any wine in the survey except wine C, 1 0
(2)
(e) exactly two of the three kinds of wine.
(2) 0
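For the wine tasting survey (Question 5), the regions of the Venn diagram follow from working outwards from the centre. The sketch below is illustrative only; the region names (`only_ab`, `likes_none`, and so on) are labels chosen here, not part of the question.

```python
total = 100
a, b, c = 96, 93, 96          # like wine A, B, C
ab, bc, ac = 92, 91, 93       # like each pair (including those who like all three)
abc = 90                      # like all three

# Work outwards from the centre of the Venn diagram.
only_ab = ab - abc
only_bc = bc - abc
only_ac = ac - abc
only_a = a - only_ab - only_ac - abc
only_b = b - only_ab - only_bc - abc
only_c = c - only_ac - only_bc - abc
likes_none = total - (only_a + only_b + only_c + only_ab + only_bc + only_ac + abc)

p_none = likes_none / total                              # (b)
p_a_not_b = (only_a + only_ac) / total                   # (c)
p_some_but_not_c = (only_a + only_b + only_ab) / total   # (d)
p_exactly_two = (only_ab + only_bc + only_ac) / total    # (e)

print(p_none, p_a_not_b, p_some_but_not_c, p_exactly_two)
```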
Paper Reference(s) Given that a person has the disease, the test is positive with probability 0.95.
6683/01 Given that a person does not have the disease, the test is positive with probability 0.03.
Statistics S1 A person is selected at random from the population and tested for this disease.
Thursday 15 May 2008 Morning A doctor randomly selects a person from the population and tests him for the disease. Given that
the test is positive,
Time: 1 hour 30 minutes (c) find the probability that he does not have the disease.
(2)
(d) Comment on the usefulness of this test.
Materials required for examination Items included with question papers (1)
Mathematical Formulae (Green) Nil
Candidates may use any calculator allowed by the regulations of the Joint
Council for Qualifications. Calculators must not have the facility for symbolic
algebra manipulation, differentiation and integration, or have retrievable
mathematical formulae stored in them.
Instructions to Candidates
Write the name of the examining body (Edexcel), your centre number, candidate number, the
unit title (Statistics S1), the paper reference (6683), your surname, initials and signature.
Values from the statistical tables should be quoted in full. When a calculator is used, the
answer should be given to an appropriate degree of accuracy.
Advice to Candidates
You must ensure that your answers to parts of questions are clearly labelled.
You must show sufficient working to make your methods clear to the Examiner.
Answers without working may gain no credit.
H32582A This publication may only be reproduced in accordance with Edexcel Limited copyright policy.
©2008 Edexcel Limited.
2. The age in years of the residents of two hotels are shown in the back to back stem and leaf 3. The random variable X has probability distribution given in the table below.
diagram below.
x –1 0 1 2 3
Abbey Hotel 8 | 5 | 0 means 58 years in Abbey Hotel and 50 years in Balmoral Hotel Balmoral Hotel P(X = x) p q 0.2 0.15 0.15
(1) 2 0 Given that E(X) = 0.55, find
(4) 9751 1
(a) the value of p and the value of q,
(4) 9831 2 6 (1) (5)
(11) 99997665332 3 447 (3) (b) Var (X),
(6) 987750 4 005569 (6) (4)
(1) 8 5 000013667 (9) (c) E(2X – 4).
(2)
6 233457 (6)
7 015 (3)
4. Crickets make a noise. The pitch, v kHz, of the noise made by a cricket was recorded at
15 different temperatures, t °C. These data are summarised below.
For the Balmoral Hotel,
For the Abbey Hotel, the mode is 39, the mean is 33.2, the standard deviation is 12.7 and the
measure of skewness is –0.454.
(e) Compare the two age distributions of the residents of each hotel.
(3)
5. A person’s blood group is determined by whether or not it contains any of 3 substances A, B 6. The discrete random variable X can take only the values 2, 3 or 4. For these values the cumulative
and C. distribution function is defined by
A doctor surveyed 300 patients’ blood and produced the table below.
F(x) = (x + k)² / 25 for x = 2, 3, 4,
Blood contains No. of Patients
where k is a positive integer.
only C 100
A and C but not B 100 (a) Find k.
(2)
only A 30
(b) Find the probability distribution of X.
B and C but not A 25 (3)
only B 12
A, B and C 10 7. A packing plant fills bags with cement. The weight X kg of a bag of cement can be modelled by a
A and B but not C 3 normal distribution with mean 50 kg and standard deviation 2 kg.
1. A teacher is monitoring the progress of students using a computer based revision course. The
improvement in performance, y marks, is recorded for each student along with the time,
x hours, that the student spent using the revision course. The results for a random sample of
Paper Reference(s)
10 students are recorded below.
6683/01
Edexcel GCE
x hours 1.0 3.5 4.0 1.5 1.3 0.5 1.8 2.5 2.3 3.0
y marks 5 30 27 10 –3 –5 7 15 –10 20
Time: 1 hour 30 minutes (c) Give an interpretation of the gradient of your regression line.
(1)
Instructions to Candidates
In the boxes on the answer book, write the name of the examining body (Edexcel), your
centre number, candidate number, the unit title (Statistics S1), the paper reference (6683),
your surname, other name and signature.
Values from the statistical tables should be quoted in full. When a calculator is used, the
answer should be given to an appropriate degree of accuracy.
Advice to Candidates
You must ensure that your answers to parts of questions are clearly labelled.
You must show sufficient working to make your methods clear to the Examiner. Answers
without working may gain no credit.
N32680A
This publication may only be reproduced in accordance with Edexcel Limited copyright policy.
©2009 Edexcel Limited.
4. In a study of how students use their mobile telephones, the phone usage of a random sample
2. A group of office workers were questioned for a health magazine and 2/5 were found to take
of 11 students was examined for a particular week.
regular exercise. When questioned about their eating habits 2/3 said they always eat breakfast
and, of those who always eat breakfast 9/25 also took regular exercise.
The total length of calls, y minutes, for the 11 students were
Find the probability that a randomly selected member of the group 17, 23, 35, 36, 51, 53, 54, 55, 60, 77, 110
(a) always eats breakfast and takes regular exercise, (a) Find the median and quartiles for these data.
(2) (3)
(b) does not always eat breakfast and does not take regular exercise. A value that is greater than Q 3 + 1.5 × (Q 3 – Q 1 ) or smaller than Q 1 – 1.5 × (Q 3 – Q 1 ) is
(4) defined as an outlier.
(c) Determine, giving your reason, whether or not always eating breakfast and taking regular
exercise are statistically independent. (b) Show that 110 is the only outlier.
(2) (2)
(c) Draw a box plot for these data indicating clearly the position of the outlier.
(3)
3. When Rohit plays a game, the number of points he receives is given by the discrete random
variable X with the following probability distribution. The value of 110 is omitted.
5. In a shopping survey a random sample of 104 teenagers were asked how many hours, to the
nearest hour, they spent shopping in the last month. The results are summarised in the table
below.
TOTAL FOR PAPER: 75 MARKS You must ensure that your answers to parts of questions are clearly labelled.
END You must show sufficient working to make your methods clear to the Examiner.
Answers without working may not gain full credit.
H34279A This publication may only be reproduced in accordance with Edexcel Limited copyright policy.
©2009 Edexcel Limited.
1. The volume of a sample of gas is kept constant. The gas is heated and the pressure, p, is 3. The variable x was measured to the nearest whole number. Forty observations are given in the
measured at 10 different temperatures, t. The results are summarised below. table below.
Σp = 445 Σp² = 38 125 Σt = 240 Σt² = 27 520 Σpt = 26 830 x 10 – 15 16 – 18 19 –
(a) Find S pp and S pt . Frequency 15 9 16
(3)
A histogram was drawn and the bar representing the 10 – 15 class has a width of 2 cm and a
Given that S tt = 21 760, height of 5 cm. For the 16 – 18 class find
(b) calculate the product moment correlation coefficient. (a) the width,
(2) (1)
(c) Give an interpretation of your answer to part (b). (b) the height
(1) (2)
of the bar representing this class.
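The summary totals in Question 1 (gas pressure p and temperature t, 10 readings) are enough to compute the product moment correlation coefficient. A minimal sketch, with variable names chosen here for illustration:

```python
from math import sqrt

n = 10
sum_p, sum_p2 = 445, 38125
sum_t, sum_t2 = 240, 27520
sum_pt = 26830

# (a) S_pp and S_pt from the summary totals.
s_pp = sum_p2 - sum_p ** 2 / n
s_pt = sum_pt - sum_p * sum_t / n

# S_tt is given as 21 760 in the question; recomputing it is a consistency check.
s_tt = sum_t2 - sum_t ** 2 / n

# (b) Product moment correlation coefficient.
r = s_pt / sqrt(s_pp * s_tt)

print(s_pp, s_pt, s_tt, round(r, 3))
```

A value of r this close to 1 is what part (c) asks to be interpreted in context.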
2. On a randomly chosen day the probability that Bill travels to school by car, by bicycle or on foot
is 1/2, 1/6 and 1/3 respectively. The probability of being late when using these methods of travel
is 1/5, 2/5 and 1/10 respectively.
4. A researcher measured the foot lengths of a random sample of 120 ten-year-old children. The 5. The weight, w grams, and the length, l mm, of 10 randomly selected newborn turtles are given in
lengths are summarised in the table below. the table below.
Foot length, l, (cm) Number of children l 49.0 52.0 53.0 54.5 54.1 53.4 50.0 51.6 49.5 51.2
w 29 32 34 39 38 35 30 31 29 30
10 ≤ l < 12 5
12 ≤ l < 17 53 (You may use S ll = 33.381 S wl = 59.99 S ww = 120.1)
17 ≤ l < 19 29
(a) Find the equation of the regression line of w on l in the form w = a + bl.
19 ≤ l < 21 15 (5)
21 ≤ l < 23 11 (b) Use your regression line to estimate the weight of a newborn turtle of length 60 mm.
23 ≤ l < 25 7 (2)
(c) Comment on the reliability of your estimate giving a reason for your answer.
(a) Use interpolation to estimate the median of this distribution. (2)
(2)
(b) Calculate estimates for the mean and the standard deviation of these data. 6. The discrete random variable X has probability function
(6)
(c) Evaluate this coefficient and comment on the skewness of these data. x 0 1 2 3
(3) P(X=x) 3a 2a b
Greg suggests that a normal distribution is a suitable model for the foot lengths of ten-year-old (1)
children.
Given that E(X) = 1.6,
(d) Using the value found in part (c), comment on Greg’s suggestion, giving a reason for your
answer. (b) find the value of a and the value of b.
(2) (5)
Find
7. (a) Given that P(A) = a and P(B) = b express P(A ∪ B) in terms of a and b when
Paper Reference(s)
N35711A
This publication may only be reproduced in accordance with Edexcel Limited copyright policy.
©2010 Edexcel Limited.
1. A jar contains 2 red, 1 blue and 1 green bead. Two beads are drawn at random from the jar
3. The birth weights, in kg, of 1500 babies are summarised in the table below.
without replacement.
(a) Draw a tree diagram to illustrate all the possible outcomes and associated probabilities. Weight (kg) Midpoint, x kg Frequency, f
State your probabilities clearly. 0.0 – 1.0 0.50 1
(3)
1.0 – 2.0 1.50 6
(b) Find the probability that a blue bead and a green bead are drawn from the jar.
2.0 – 2.5 2.25 60
(2)
2.5 – 3.0 280
3.0 – 3.5 3.25 820
2. The 19 employees of a company take an aptitude test. The scores out of 40 are illustrated in 3.5 – 4.0 3.75 320
the stem and leaf diagram below.
4.0 – 5.0 4.50 10
2 | 6 means a score of 26 5.0 – 6.0 3
An outlier is an observation whose value is less than the lower quartile minus 1.0 times the
interquartile range.
(c) Explain why there is only one employee who will undergo retraining.
(2)
(d) Draw a box plot to illustrate the employees’ scores.
(3)
4. There are 180 students at a college following a general course in computing. Students on this
6. The blood pressures, p mmHg, and the ages, t years, of 7 hospital patients are shown in the
course can choose to take up to three extra options.
table below.
112 take systems support,
70 take developing software, Patient A B C D E F G
81 take networking, t 42 74 48 35 56 26 60
35 take developing software and systems support,
28 take networking and developing software, p 98 130 120 88 182 80 135
40 take systems support and networking,
4 take all three extra options. [ Σt = 341, Σp = 833, Σt² = 18 181, Σp² = 106 397, Σtp = 42 948 ]
7. The heights of a population of women are normally distributed with mean μ cm and standard
5. The probability function of a discrete random variable X is given by
deviation σ cm. It is known that 30% of the women are taller than 172 cm and 5% are shorter
p(x) = kx2, x = 1, 2, 3. than 154 cm.
where k is a positive constant. (a) Sketch a diagram to show the distribution of heights represented by this information.
(3)
1. Gary compared the total attendance, x, at home matches and the total number of goals, y, scored
at home during a season for each of 12 football teams playing in a league. He correctly
calculated:
Paper Reference(s)
Edexcel GCE (a) Calculate the product moment correlation coefficient for these data.
(2)
(b) Interpret the value of the correlation coefficient.
Statistics S1 (1)
Helen was given the same data to analyse. In view of the large numbers involved she decided to
Advanced Subsidiary divide the attendance figures by 100. She then calculated the product moment correlation
coefficient between x/100 and y.
Thursday 27 May 2010 Morning
Time: 1 hour 30 minutes (c) Write down the value Helen should have obtained.
(1)
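Part (c) of Question 1 rests on the fact that the product moment correlation coefficient is unchanged when one variable is divided by a positive constant. A short derivation, writing u = x/100 (the symbol u is introduced here purely for illustration):

```latex
\[
S_{uy} = \frac{S_{xy}}{100}, \qquad
S_{uu} = \frac{S_{xx}}{100^{2}}, \qquad
r_{uy} = \frac{S_{uy}}{\sqrt{S_{uu}\,S_{yy}}}
       = \frac{S_{xy}/100}{\sqrt{S_{xx}\,S_{yy}}\,/\,100}
       = \frac{S_{xy}}{\sqrt{S_{xx}\,S_{yy}}}
       = r_{xy}.
\]
```

The same cancellation applies to any positive scale factor, which is also why changing units from miles to km leaves the coefficient unchanged.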
Materials required for examination Items included with question papers
Mathematical Formulae (Pink) Nil
Candidates may use any calculator allowed by the regulations of the Joint
Council for Qualifications. Calculators must not have the facility for symbolic
algebra manipulation, differentiation and integration, or have retrievable
mathematical formulas stored in them.
Instructions to Candidates
In the boxes on the answer book, write the name of the examining body (Edexcel), your centre
number, candidate number, the unit title (Statistics S1), the paper reference (6683), your
surname, other name and signature.
Values from the statistical tables should be quoted in full. When a calculator is used, the answer
should be given to an appropriate degree of accuracy.
Advice to Candidates
You must ensure that your answers to parts of questions are clearly labelled.
You must show sufficient working to make your methods clear to the Examiner.
Answers without working may not gain full credit.
H35395A This publication may only be reproduced in accordance with Edexcel Limited copyright policy.
©2010 Edexcel Limited
2. An experiment consists of selecting a ball from a bag and spinning a coin. The bag contains 5 red 3. The discrete random variable X has probability distribution given by
balls and 7 blue balls. A ball is selected at random from the bag, its colour is noted and then the
ball is returned to the bag.
When a red ball is selected, a biased coin with probability 2/3 of landing heads is spun.
x –1 0 1 2 3
P(X = x) 1/5 a 1/10 a 1/5
When a blue ball is selected a fair coin is spun.
where a is a constant.
(a) Copy and complete the tree diagram below to show the possible outcomes and associated
probabilities. (a) Find the value of a.
(2)
(b) Write down E(X).
(1)
(c) Find Var(X ).
(3)
(2)
(d) Find the probability that the colour of the ball Shivani selects is the same as the colour of the
ball Tom selects.
(3)
4. The Venn diagram in Figure 1 shows the number of students in a class who read any of 3 popular 5. A teacher selects a random sample of 56 students and records, to the nearest hour, the time spent
magazines A, B and C. watching television in a particular week.
A histogram was drawn to represent these data. The … – … group was represented by a bar of
Figure 1 width 4 cm and height 6 cm.
Given that the student reads at least one of the magazines, The teacher estimated the lower quartile and the upper quartile of the time spent watching
television to be 15.8 and 29.3 respectively.
(d) find the probability that the student reads C.
(2) (e) State, giving a reason, the skewness of these data.
(2)
(e) Determine whether or not reading magazine B and reading magazine C are statistically
independent.
(3)
6. A travel agent sells flights to different destinations from Beerow airport. The distance d, 7. The distances travelled to work, D km, by the employees at a large company are normally
measured in 100 km, of the destination from the airport and the fare £f are recorded for a random distributed with D ~ N(30, 8²).
sample of 6 destinations.
(a) Find the probability that a randomly selected employee has a journey to work of more
Destination A B C D E F than 20 km.
(3)
d 2.2 4.0 6.0 2.5 8.0 5.0
(b) Find the upper quartile, Q 3 , of D.
f 18 20 25 23 32 28 (3)
(c) Write down the lower quartile, Q 1 , of D.
[You may use Σd² = 152.09 Σf² = 3686 Σfd = 723.1] (1)
(a) On graph paper, draw a scatter diagram to illustrate this information. An outlier is defined as any value of D such that D < h or D > k where
(2)
h = Q 1 – 1.5 × (Q 3 – Q 1 ) and k = Q 3 + 1.5 × (Q 3 – Q 1 ).
(b) Explain why a linear regression model may be appropriate to describe the relationship
between f and d. (d) Find the value of h and the value of k.
(1) (2)
(c) Calculate S dd and S fd .
(4) An employee is selected at random.
(d) Calculate the equation of the regression line of f on d giving your answer in the form (e) Find the probability that the distance travelled to work by this employee is an outlier.
f = a + bd. (3)
(4)
(e) Give an interpretation of the value of b. TOTAL FOR PAPER: 75 MARKS
(1) END
Jane is planning her holiday and wishes to fly from Beerow airport to a destination t km away.
A rival travel agent charges 5p per km.
(f) Find the range of values of t for which the first travel agent is cheaper than the rival.
(2)
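Parts (c), (d) and (f) of Question 6 (flights from Beerow airport) can be traced with a few lines of arithmetic. The sketch below is an illustration, assuming the rival's 5p per km is written as £0.05 per km and that t km corresponds to d = t/100; the names `rival_rate` and `t_break_even` are chosen here.

```python
n = 6
d = [2.2, 4.0, 6.0, 2.5, 8.0, 5.0]   # distance, in units of 100 km
f = [18, 20, 25, 23, 32, 28]         # fare, in pounds

sum_d, sum_f = sum(d), sum(f)

# (c) S_dd and S_fd, using the given totals: sum of d^2 = 152.09, sum of fd = 723.1.
s_dd = 152.09 - sum_d ** 2 / n
s_fd = 723.1 - sum_f * sum_d / n

# (d) Regression line of f on d: f = a + b*d.
b = s_fd / s_dd
a = sum_f / n - b * sum_d / n

# (f) First agent charges a + b*(t/100); rival charges 0.05*t (pounds).
# The first agent is cheaper once t exceeds the break-even distance.
rival_rate = 0.05
t_break_even = a / (rival_rate - b / 100)

print(round(s_dd, 3), round(s_fd, 3), round(a, 2), round(b, 2), round(t_break_even))
```

Because the rival's rate per km exceeds b/100, the inequality a + b·t/100 < 0.05·t holds for all t above `t_break_even`.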
1. A random sample of 50 salmon was caught by a scientist. He recorded the length l cm and
weight w kg of each salmon.
Paper Reference(s)
The following summary statistics were calculated from these data.
6683/01
Σl = 4027 Σl² = 327 754.5 Σw = 357.1 Σlw = 29 330.5 S ww = 289.6
Edexcel GCE (a) Find S ll and S lw .
(3)
Statistics S1 (b) Calculate, to 3 significant figures, the product moment correlation coefficient between l
and w.
Advanced Level (2)
(c) Give an interpretation of your coefficient.
Friday 14 January 2011 Morning (1)
Time: 1 hour 30 minutes 2. Keith records the amount of rainfall, in mm, at his school, each day for a week. The results are
given below.
Materials required for examination Items included with question papers
Mathematical Formulae (Pink) Nil
2.8 5.6 2.3 9.4 0.0 0.5 1.8
Candidates may use any calculator allowed by the regulations of the Joint Jenny then records the amount of rainfall, x mm, at the school each day for the following
Council for Qualifications. Calculators must not have the facility for symbolic 21 days. The results for the 21 days are summarised below.
algebra manipulation, differentiation and integration, or have retrievable
mathematical formulas stored in them.
Σx = 84.6
(a) Calculate the mean amount of rainfall during the whole 28 days.
Instructions to Candidates
(2)
In the boxes on the answer book, write the name of the examining body (Edexcel), your centre
number, candidate number, the unit title (Statistics S1), the paper reference (6683), your Keith realises that he has transposed two of his figures. The number 9.4 should have been 4.9
surname, other name and signature. and the number 0.5 should have been 5.0.
Values from the statistical tables should be quoted in full. When a calculator is used, the answer
should be given to an appropriate degree of accuracy. Keith corrects these figures.
Information for Candidates (b) State, giving your reason, the effect this will have on the mean.
A booklet ‘Mathematical Formulae and Statistical Tables’ is provided. (2)
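The rainfall question (Question 2) comes down to whether the correction changes the overall total. A minimal sketch, using Keith's seven readings and Jenny's 21-day total of 84.6 mm; the list comprehension that applies the correction is an illustrative device, not part of the question.

```python
from math import isclose

keith = [2.8, 5.6, 2.3, 9.4, 0.0, 0.5, 1.8]   # Keith's 7 daily readings, mm
jenny_total = 84.6                            # total for the following 21 days, mm
days = len(keith) + 21

# (a) Mean over the whole 28 days.
mean_all = (sum(keith) + jenny_total) / days

# (b) Apply the corrections 9.4 -> 4.9 and 0.5 -> 5.0 and recompute.
corrected = [4.9 if x == 9.4 else 5.0 if x == 0.5 else x for x in keith]
mean_corrected = (sum(corrected) + jenny_total) / days

# The two changes are -4.5 and +4.5, so the total (and the mean) is unaffected.
unchanged = isclose(mean_all, mean_corrected)

print(round(mean_all, 2), round(mean_corrected, 2), unchanged)
```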
Full marks may be obtained for answers to ALL questions.
This paper has 8 questions.
The total mark for this paper is 75.
Advice to Candidates
You must ensure that your answers to parts of questions are clearly labelled.
You must show sufficient working to make your methods clear to the Examiner.
Answers without working may not gain full credit.
H35410A This publication may only be reproduced in accordance with Edexcel Limited copyright policy.
©2011 Edexcel Limited
3. Over a long period of time a small company recorded the amount it received in sales per month. 4. A farmer collected data on the annual rainfall, x cm, and the annual yield of peas, p tonnes per
The results are summarised below. acre.
(a) On the graph paper below, draw a box plot to represent these data, indicating clearly any 5. On a randomly chosen day, each of the 32 students in a class recorded the time, t minutes to the
outliers. nearest minute, they spent on their homework. The data for the class is summarised in the
(5) following table.
Given that
6. The discrete random variable X has the probability distribution 7. The bag P contains 6 balls of which 3 are red and 3 are yellow.
The bag Q contains 7 balls of which 4 are red and 3 are yellow.
x 1 2 3 4 A ball is drawn at random from bag P and placed in bag Q. A second ball is drawn at random
from bag P and placed in bag Q.
P(X = x) k 2k 3k 4k A third ball is then drawn at random from the 9 balls in bag Q.
The event A occurs when the 2 balls drawn from bag P are of the same colour.
(a) Show that k = 0.1 The event B occurs when the ball drawn from bag Q is red.
(1)
(a) Copy and complete the tree diagram shown below.
Find
(b) E(X )
(2)
(c) E(X 2)
(2)
(d) Var (2 – 5X )
(3)
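Question 6 (the distribution with probabilities k, 2k, 3k, 4k) can be verified with exact fractions. The sketch below is illustrative; it uses the standard result Var(aX + b) = a² Var(X) for part (d).

```python
from fractions import Fraction

xs = [1, 2, 3, 4]

# (a) The probabilities k, 2k, 3k, 4k must sum to 1.
k = Fraction(1, sum(xs))
probs = {x: x * k for x in xs}

# (b) E(X) and (c) E(X^2).
e_x = sum(x * p for x, p in probs.items())
e_x2 = sum(x ** 2 * p for x, p in probs.items())

# (d) Var(2 - 5X) = (-5)^2 * Var(X); the constant 2 does not affect the variance.
var_x = e_x2 - e_x ** 2
var_2_minus_5x = (-5) ** 2 * var_x

print(k, e_x, e_x2, var_x, var_2_minus_5x)
```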
y 2 3 4 5 6 7 8
8. The weight, X grams, of soup put in a tin by machine A is normally distributed with a mean of
160 g and a standard deviation of 5 g.
Paper Reference(s)
A tin is selected at random.
6683/01
Edexcel GCE
(a) Find the probability that this tin contains more than 168 g.
(3)
Candidates may use any calculator allowed by the regulations of the Joint
Council for Qualifications. Calculators must not have the facility for symbolic
algebra manipulation, differentiation and integration, or have retrievable
mathematical formulas stored in them.
Instructions to Candidates
In the boxes on the answer book, write the name of the examining body (Edexcel), your centre
number, candidate number, the unit title (Statistics S1), the paper reference (6683), your
surname, other name and signature.
Values from the statistical tables should be quoted in full. When a calculator is used, the answer
should be given to an appropriate degree of accuracy.
Advice to Candidates
You must ensure that your answers to parts of questions are clearly labelled.
You must show sufficient working to make your methods clear to the Examiner.
Answers without working may not gain full credit.
P38164A This publication may only be reproduced in accordance with Edexcel Limited copyright policy.
©2011 Edexcel Limited
1. On a particular day the height above sea level, x metres, and the mid-day temperature, y °C, were 3. The discrete random variable Y has the probability distribution
recorded in 8 north European towns. These data are summarised below
y 1 2 3 4
S xx = 3 535 237.5 Σy = 181 Σy² = 4305 S xy = –23 726.25
P(Y = y) a b 0.3 c
(a) Find S yy .
(2) where a, b and c are constants.
(b) Calculate, to 3 significant figures, the product moment correlation coefficient for these data.
(2) The cumulative distribution function F(y) of Y is given in the following table.
4. Past records show that the times, in seconds, taken to run 100 m by children at a school can be
2. The random variable X ~ N(μ, 5²) and P(X < 23) = 0.9192. modelled by a normal distribution with a mean of 16.12 and a standard deviation of 1.60.
(a) Find the value of μ. A child from the school is selected at random.
(4)
(b) Write down the value of P(μ < X < 23). (a) Find the probability that this child runs 100 m in less than 15 s.
(1) (3)
On sports day the school awards certificates to the fastest 30% of the children in the 100 m race.
(b) Estimate, to 2 decimal places, the slowest time taken to run 100 m for which a child will be
awarded a certificate.
(4)
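The two normal distribution questions on this page (Question 2, with unknown mean μ and standard deviation 5, and Question 4, the 100 m times) can be checked with Python's standard library. This is a sketch, not the tables-based method the paper expects: `statistics.NormalDist` works to full precision, so results differ slightly from values read from printed tables.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal, mean 0 and standard deviation 1

# Question 2(a): P(X < 23) = 0.9192 with sd 5, so (23 - mu) / 5 = z-value for 0.9192.
mu = 23 - 5 * z.inv_cdf(0.9192)

# Question 2(b): by symmetry P(X < mu) = 0.5, so P(mu < X < 23) = 0.9192 - 0.5.
p_between = 0.9192 - 0.5

# Question 4: times modelled as N(16.12, 1.60^2).
times = NormalDist(mu=16.12, sigma=1.60)
p_under_15 = times.cdf(15)      # (a) P(T < 15)
cutoff = times.inv_cdf(0.30)    # (b) the fastest 30% have the smallest times

print(round(mu, 2), round(p_between, 4), round(p_under_15, 4), round(cutoff, 2))
```

Note that "fastest" means the lower tail of the time distribution, which is why the 30th percentile is used rather than the 70th.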
5. A class of students had a sudoku competition. The time taken for each student to complete the
6. Jake and Kamil are sometimes late for school.
sudoku was recorded to the nearest minute and the results are summarised in the table below.
The events J and K are defined as follows
Time Mid-point, x Frequency, f J = the event that Jake is late for school,
2–8 5 2 K = the event that Kamil is late for school.
9 – 12 7 P(J) = 0.25, P(J ∩ K) = 0.15 and P(J′ ∩ K′) = … .
13 – 15 14 5
On a randomly selected day, find the probability that
16 – 18 17 8
19 – 22 20.5 4 (a) at least one of Jake or Kamil are late for school,
23 – 30 26.5 4 (1)
(b) Kamil is late for school.
(You may use Σfx² = 8603.75) (2)
(a) Write down the mid-point for the 9 – 12 interval. Given that Jake is late for school,
(1)
(c) find the probability that Kamil is late.
(b) Use linear interpolation to estimate the median time taken by the students. (3)
(2)
(c) Estimate the mean and standard deviation of the times taken by the students. The teacher suspects that Jake being late for school and Kamil being late for school are linked in
(5) some way.
The teacher suggested that a normal distribution could be used to model the times taken by the (d) Determine whether or not J and K are statistically independent.
students to complete the sudoku. (2)
(e) Comment on the teacher’s suspicion in the light of your calculation in part (d).
(d) Give a reason to support the use of a normal distribution in this case. (1)
(1)
On another occasion the teacher calculated the quartiles for the times taken by the students to
complete a different sudoku and found
(e) Describe, giving a reason, the skewness of the times on this occasion.
(2)
7. A teacher took a random sample of 8 children from a class. For each child the teacher recorded 8. A spinner is designed so that the score S is given by the following probability distribution.
the length of their left foot, f cm, and their height, h cm. The results are given in the table below.
s 0 1 2 4 5
f 23 26 23 22 27 24 20 21
P(S = s) p 0.25 0.25 0.20 0.20
h 135 144 134 136 140 134 130 132
(a) Find the value of p.
(You may use Σf = 186, Σh = 1085, S_ff = 39.5, S_hh = 139.875, Σfh = 25 291)
(2)
(a) Calculate S_fh. (b) Find E(S).
(2) (2)
(b) Find the equation of the regression line of h on f in the form h = a + bf. (c) Show that E(S²) = 9.45.
Give the value of a and the value of b correct to 3 significant figures. (2)
(5)
(d) Find Var(S).
(c) Use your equation to estimate the height of a child with a left foot length of 25 cm. (2)
(2)
Tom and Jess play a game with this spinner. The spinner is spun repeatedly and S counters are
(d) Comment on the reliability of your estimate in part (c), giving a reason for your answer.
awarded on the outcome of each spin. If S is even then Tom receives the counters and if S is odd
(2)
then Jess receives them. The first player to collect 10 or more counters is the winner.
The left foot length of the teacher is 25 cm.
(e) Find the probability that Jess wins after 2 spins.
(2)
(e) Give a reason why the equation in part (b) should not be used to estimate the teacher’s
height. (f) Find the probability that Tom wins after exactly 3 spins.
(1) (4)
(g) Find the probability that Jess wins after exactly 3 spins.
(3)
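Worked sketch for Question 8 (a) to (d), not part of the original paper: the spinner table fixes p because the probabilities must sum to 1, and the expectations then follow as probability-weighted sums. The variable names are illustrative only, and exact fractions are used so that the check against the stated value E(S²) = 9.45 is not affected by rounding.

```python
from fractions import Fraction

# Scores and known probabilities from the table in Question 8.
# P(S = 0) = p is the unknown.
scores = [0, 1, 2, 4, 5]
known = [Fraction(25, 100), Fraction(25, 100), Fraction(20, 100), Fraction(20, 100)]

# (a) The probabilities sum to 1, so p is whatever is left over.
p = 1 - sum(known)
probs = [p] + known

# (b), (c) Expectations are probability-weighted sums.
e_s = sum(s * q for s, q in zip(scores, probs))
e_s2 = sum(s * s * q for s, q in zip(scores, probs))

# (d) Var(S) = E(S^2) - [E(S)]^2.
var_s = e_s2 - e_s ** 2

print(p, float(e_s), float(e_s2), float(var_s))
```

In an exam the same steps would be written out by hand; the code only confirms the arithmetic.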
1. The histogram in Figure 1 shows the time, to the nearest minute, that a random sample of 100
motorists were delayed by roadworks on a stretch of motorway.
Paper Reference(s)
6683/01
Edexcel GCE
Statistics S1
Advanced Level
Tuesday 17 January 2012 Morning
Time: 1 hour 30 minutes
Materials required for examination Items included with question papers
Mathematical Formulae (Pink) Nil
Candidates may use any calculator allowed by the regulations of the Joint
Council for Qualifications. Calculators must not have the facility for symbolic
algebra manipulation, differentiation and integration, or have retrievable
mathematical formulas stored in them.
Instructions to Candidates
In the boxes on the answer book, write the name of the examining body (Edexcel), your centre
number, candidate number, the unit title (Statistics S1), the paper reference (6683), your Figure 1
surname, other name and signature.
Values from the statistical tables should be quoted in full. When a calculator is used, the answer (a) Complete the table.
should be given to an appropriate degree of accuracy.
Delay (minutes) Number of motorists
Information for Candidates 4–6 6
A booklet ‘Mathematical Formulae and Statistical Tables’ is provided. 7–8
Full marks may be obtained for answers to ALL questions. 9 21
This paper has 7 questions. 10 – 12 45
The total mark for this paper is 75.
13 – 15 9
Advice to Candidates 16 – 20
(2)
You must ensure that your answers to parts of questions are clearly labelled.
You must show sufficient working to make your methods clear to the Examiner. (b) Estimate the number of motorists who were delayed between 8.5 and 13.5 minutes by the
Answers without working may not gain full credit. roadworks.
(2)
P44699A This publication may only be reproduced in accordance with Edexcel Limited copyright policy.
©2012 Edexcel Limited
2. (a) State in words the relationship between two events R and S when P(R ∩ S) = 0. 4. The marks, x, of 45 students randomly selected from those students who sat a mathematics
(1) examination are shown in the stem and leaf diagram below.
(b) F(3), (e) Find the mean and standard deviation of the scaled marks of all the students.
(1) (4)
(c) E(X),
(2)
(d) E(X²),
(2)
(e) Var (7X – 5).
(4)
5. The age, t years, and weight, w grams, of each of 10 coins were recorded. These data are
6. The following shows the results of a survey on the types of exercise taken by a group of
summarised below.
100 people.
(ii) the effect of an increase of 4 years in age on the weight of a coin. (c) swims but does not run,
(2) (2)
(d) takes at least two of these types of exercise.
It was discovered that a coin in the original sample, which was 5 years old and weighed (2)
20 grams, was a fake.
Jason is one of the above group.
(f) State, without any further calculations, whether the exclusion of this coin would increase or
decrease the value of the product moment correlation coefficient. Give a reason for your Given that Jason runs,
answer.
(2) (e) find the probability that he swims but does not cycle.
(3)
7. A manufacturer fills jars with coffee. The weight of coffee, W grams, in a jar can be modelled by
a normal distribution with mean 232 grams and standard deviation 5 grams.
(c) Find the probability that only one of the jars contains between 232 grams and w grams of
coffee.
(3)
Paper Reference(s)
6683/01
Edexcel GCE
Statistics S1
Advanced Level
Friday 18 May 2012 Afternoon
Time: 1 hour 30 minutes
P(X = x) = k(1 – x)² for x = –1, 0, 1 and 2; P(X = x) = 0 otherwise.
(a) Show that k = 1/6.
(3)
(b) Find E(X).
(2)
(c) Show that E(X²) = 4/3.
(2)
(d) Find Var(1 – 3X).
(3)
Advice to Candidates
You must ensure that your answers to parts of questions are clearly labelled.
You must show sufficient working to make your methods clear to the Examiner.
Answers without working may not gain full credit.
P40105XA This publication may only be reproduced in accordance with Edexcel Limited copyright policy.
©2012 Edexcel Limited
3. A scientist is researching whether or not birds of prey exposed to pollutants lay eggs with thinner 4.
shells. He collects a random sample of egg shells from each of 6 different nests and tests for
pollutant level, p, and measures the thinning of the shell, t. The results are shown in the table
below.
p 3 8 30 25 15 12
t 1 3 9 10 5 6
(b) Explain why a linear regression model may be appropriate to describe the relationship
between p and t. Figure 1
(1)
(c) Calculate the value of S_pt and the value of S_pp. Figure 1 shows how 25 people travelled to work.
(4)
Their travel to work is represented by the events
(d) Find the equation of the regression line of t on p, giving your answer in the form t = a + bp.
(4) B bicycle
(e) Plot the point (p̄, t̄) and draw the regression line on your scatter diagram. T train
(2)
W walk
The scientist reviews similar studies and finds that pollutant levels above 16 are likely to result in
the death of a chick soon after hatching. (a) Write down 2 of these events that are mutually exclusive. Give a reason for your answer.
(2)
(f ) Estimate the minimum thinning of the shell that is likely to result in the death of a chick.
(2) (b) Determine whether or not B and T are independent events.
(3)
(e) find the probability that they will also take the train.
(2)
5.
6. The heights of an adult female population are normally distributed with mean 162 cm and
standard deviation 7.5 cm.
(a) Find the probability that a randomly chosen adult female is taller than 150 cm.
(3)
Sarah is a young girl. She visits her doctor and is told that she is at the 60th percentile for height.
(b) Assuming that Sarah remains at the 60th percentile, estimate her height as an adult.
(3)
The heights of an adult male population are normally distributed with standard deviation 9.0 cm.
Given that 90% of adult males are taller than the mean height of adult females,
7. A manufacturer carried out a survey of the defects in their soft toys. It is found that the
probability of a toy having poor stitching is 0.03 and that a toy with poor stitching has a
probability of 0.7 of splitting open. A toy without poor stitching has a probability of 0.02 of
Figure 2
splitting open.
A policeman records the speed of the traffic on a busy road with a 30 mph speed limit.
(a) Draw a tree diagram to represent this information.
(3)
He records the speeds of a sample of 450 cars. The histogram in Figure 2 represents the results.
(b) Find the probability that a randomly chosen soft toy has exactly one of the two defects, poor
(a) Calculate the number of cars that were exceeding the speed limit by at least 5 mph in the stitching or splitting open.
sample. (3)
(4)
The manufacturer also finds that soft toys can become faded with probability 0.05 and that this
(b) Estimate the value of the mean speed of the cars in the sample.
defect is independent of poor stitching or splitting open. A soft toy is chosen at random.
(3)
(c) Estimate, to 1 decimal place, the value of the median speed of the cars in the sample. (c) Find the probability that the soft toy has none of these 3 defects.
(2) (2)
(d) Comment on the shape of the distribution. Give a reason for your answer. (d) Find the probability that the soft toy has exactly one of these 3 defects.
(2) (4)
(e) State, with a reason, whether the estimate of the mean or the median is a better
TOTAL FOR PAPER: 75 MARKS
representation of the average speed of the traffic on the road.
END
(2)
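Worked sketch for Question 7 (the soft toy defects), parts (b) to (d), not part of the original paper. It walks the tree diagram by multiplying along branches and adding across the relevant leaves. The variable names are illustrative, and the fading defect is treated as independent of the other two, as the question states.

```python
from fractions import Fraction

# Probabilities given in the question.
p_poor = Fraction(3, 100)             # P(poor stitching)
p_split_given_poor = Fraction(7, 10)  # P(splits | poor stitching)
p_split_given_ok = Fraction(2, 100)   # P(splits | stitching not poor)
p_faded = Fraction(5, 100)            # P(faded), independent of the rest

# Leaves of the tree diagram (multiply along each branch).
poor_only = p_poor * (1 - p_split_given_poor)
split_only = (1 - p_poor) * p_split_given_ok
neither = (1 - p_poor) * (1 - p_split_given_ok)

# (b) Exactly one of poor stitching / splitting open.
exactly_one_of_two = poor_only + split_only

# (c) None of the three defects.
none_of_three = neither * (1 - p_faded)

# (d) Exactly one of the three defects: either one of the first two
# without fading, or fading alone.
exactly_one_of_three = exactly_one_of_two * (1 - p_faded) + neither * p_faded

print(float(exactly_one_of_two), float(none_of_three), float(exactly_one_of_three))
```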
1. A teacher asked a random sample of 10 students to record the number of hours of television, t,
they watched in the week before their mock exam. She then calculated their grade, g, in their
mock exam. The results are summarised as follows.
Paper Reference(s)
Advanced Level The teacher also recorded the number of hours of revision, v, these 10 students completed during
the week before their mock exam. The correlation coefficient between t and v was –0.753.
Friday 18 January 2013 Afternoon
(c) Describe, giving a reason, the nature of the correlation you would expect to find between
v and g.
Time: 1 hour 30 minutes (2)
Materials required for examination Items included with question papers
Mathematical Formulae (Pink) Nil
2. The discrete random variable X can take only the values 1, 2 and 3. For these values the
cumulative distribution function is defined by
F(x) = (x³ + k)/40, x = 1, 2, 3.
Candidates may use any calculator allowed by the regulations of the Joint
Council for Qualifications. Calculators must not have the facility for symbolic
algebra manipulation, differentiation and integration, or have retrievable
mathematical formulas stored in them.
In the boxes on the answer book, write the name of the examining body (Edexcel), your centre (b) Find the probability distribution of X.
number, candidate number, the unit title (Statistics S1), the paper reference (6683), your (4)
surname, other name and signature.
Values from the statistical tables should be quoted in full. When a calculator is used, the answer
should be given to an appropriate degree of accuracy.
Given that Var(X) = 259/320,
Information for Candidates (c) find the exact value of Var(4X – 5).
(2)
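Worked sketch for Question 2, not part of the original paper. It assumes the cumulative distribution function reads F(x) = (x³ + k)/40 for x = 1, 2, 3; that reading is an inference from the damaged layout, although it does reproduce the stated Var(X) = 259/320. The names `F`, `pmf` and `var_4x_minus_5` are illustrative.

```python
from fractions import Fraction

values = [1, 2, 3]

# F(3) must equal 1 because 3 is the largest value X can take,
# so 27 + k = 40.
k = 40 - 3 ** 3

def F(x):
    """Cumulative distribution function, assumed form (x^3 + k)/40."""
    return Fraction(x ** 3 + k, 40)

# (b) P(X = x) is the jump in F at x.
pmf = {x: F(x) - (F(x - 1) if x > values[0] else 0) for x in values}

mean = sum(x * p for x, p in pmf.items())
var = sum(x * x * p for x, p in pmf.items()) - mean ** 2

# (c) Var(aX + b) = a^2 Var(X); the constant -5 has no effect.
var_4x_minus_5 = 4 ** 2 * var

print(k, pmf, var, var_4x_minus_5)
```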
A booklet ‘Mathematical Formulae and Statistical Tables’ is provided.
Full marks may be obtained for answers to ALL questions.
This paper has 7 questions.
The total mark for this paper is 75.
Advice to Candidates
You must ensure that your answers to parts of questions are clearly labelled.
You must show sufficient working to make your methods clear to the Examiner.
Answers without working may not gain full credit.
P41805A This publication may only be reproduced in accordance with Edexcel Limited copyright policy.
©2013 Edexcel Limited
3. A biologist is comparing the intervals (m seconds) between the mating calls of a certain species 5. A survey of 100 households gave the following results for weekly income £y.
of tree frog and the surrounding temperature (t °C). The following results were obtained.
t °C 8 13 14 15 15 20 25 30
m secs 6.5 4.5 6 5 4 3 2 1
(You may use Σtm = 469.5, S_tt = 354, S_mm = 25.5)
Income y (£) Mid-point Frequency f
0 ≤ y < 200 100 12
200 ≤ y < 240 220 28
240 ≤ y < 320 280 22
6. A fair blue die has faces numbered 1, 1, 3, 3, 5 and 5. The random variable B represents the score 7. Given that
when the blue die is rolled.
P(A) = 0.35, P(B) = 0.45 and P(A ∩ B) = 0.13,
(a) Write down the probability distribution for B. find
(2)
(b) State the name of this probability distribution. (a) P(A B),
(1) (2)
A second die is red and the random variable R represents the score when the red die is rolled. The event C has P(C) = 0.20.
The probability distribution of R is The events A and C are mutually exclusive and the events B and C are independent.
Tom spins a fair coin with one side labelled 2 and the other side labelled 5. When Avisha sees
the number showing on the coin she then chooses one of the dice and rolls it. If the number
showing on the die is greater than the number showing on the coin, Avisha wins, otherwise Tom
wins.
Avisha chooses the die which gives her the best chance of winning each time Tom spins the coin.
(f) Find the probability that Avisha wins the game, stating clearly which die she should use in
each case.
(4)
1. A meteorologist believes that there is a relationship between the height above sea level, h m, and
the air temperature, t °C. Data is collected at the same time from 9 different places on the same
mountain. The data is summarised in the table below.
Paper Reference(s)
6683/01 h 1400 1100 260 840 900 550 1230 100 770
Edexcel GCE t 3 10 20 9 10 13 5 24 16
Statistics S1 [You may assume that Σh = 7150, Σt = 110, Σh² = 7171500, Σt² = 1716, Σth = 64 980
and S_tt = 371.56]
Advanced Subsidiary (a) Calculate S_th and S_hh. Give your answers to 3 significant figures.
(3)
Friday 17 May 2013 Morning (b) Calculate the product moment correlation coefficient for this data.
(2)
Time: 1 hour 30 minutes (c) State whether or not your value supports the use of a regression equation to predict the air
temperature at different heights on this mountain. Give a reason for your answer.
Materials required for examination Items included with question papers (1)
Mathematical Formulae (Pink) Nil
(d) Find the equation of the regression line of t on h giving your answer in the form t = a + bh.
(4)
Candidates may use any calculator allowed by the regulations of the Joint
Council for Qualifications. Calculators must not have the facility for symbolic (e) Interpret the value of b.
algebra manipulation, differentiation and integration, or have retrievable (1)
mathematical formulas stored in them.
(f) Estimate the difference in air temperature between a height of 500 m and a height
of 1000 m.
(2)
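Worked sketch for Question 1 (height above sea level and air temperature), not part of the original paper. It uses only the summary values quoted in the question, with n = 9 places, and the standard definitions S_th = Σth – ΣtΣh/n and S_hh = Σh² – (Σh)²/n. The variable names are illustrative.

```python
from math import sqrt

n = 9
sum_h, sum_t = 7150, 110
sum_h2, sum_th = 7171500, 64980
s_tt = 371.56  # given in the question

# (a) Summary statistics.
s_th = sum_th - sum_h * sum_t / n
s_hh = sum_h2 - sum_h ** 2 / n

# (b) Product moment correlation coefficient.
r = s_th / sqrt(s_hh * s_tt)

# (d) Regression line of t on h: t = a + b*h.
b = s_th / s_hh
a = sum_t / n - b * sum_h / n

# (f) A change in height of 500 m changes the estimate by 500*b;
# the intercept cancels out.
diff_500_to_1000 = b * (1000 - 500)

print(round(s_th, 1), round(s_hh, 1), round(r, 3))
print(round(a, 2), round(b, 4), round(diff_500_to_1000, 2))
```

The negative gradient is what part (e) asks candidates to interpret: on this reading the temperature falls by roughly 1.5 °C for each extra 100 m of height.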
Instructions to Candidates
In the boxes on the answer book, write the name of the examining body (Edexcel), your centre
number, candidate number, the unit title (Statistics S1), the paper reference (6683), your
surname, other name and signature.
Values from the statistical tables should be quoted in full. When a calculator is used, the answer
should be given to an appropriate degree of accuracy.
Advice to Candidates
You must ensure that your answers to parts of questions are clearly labelled.
You must show sufficient working to make your methods clear to the Examiner.
Answers without working may not gain full credit.
P42831A This publication may only be reproduced in accordance with Edexcel Limited copyright policy.
©2013 Edexcel Limited
2. The marks of a group of female students in a statistics test are summarised in Figure 1. An outlier is a mark that is
either more than 1.5 × interquartile range above the upper quartile
(c) On graph paper draw a box plot to represent the marks of the male students, indicating
clearly any outliers.
(5)
(d) Compare and contrast the marks of the male and the female students.
(2)
Figure 1
(a) Write down the mark which is exceeded by 75% of the female students.
(1)
The marks of a group of male students in the same statistics test are summarised by the stem and
leaf diagram below.
(b) Find the median and interquartile range of the marks of the male students.
(3)
3. In a company the 200 employees are classified as full-time workers, part-time workers or 4. The following table summarises the times, t minutes to the nearest minute, recorded for a group
contractors. of students to complete an exam.
The table below shows the number of employees in each category and whether they walk to
Time (minutes) t 11 – 20 21 – 25 26 – 30 31 – 35 36 – 45 46 – 60
work or use some form of transport.
Number of students f 62 88 16 13 11 10
Walk Transport
Part-time worker 35 75 (a) Estimate the mean and standard deviation of these data.
(5)
Contractor 30 50
(b) Use linear interpolation to estimate the value of the median.
The events F, H and C are that an employee is a full-time worker, part-time worker or contractor (2)
respectively. Let W be the event that an employee walks to work. (c) Show that the estimated value of the lower quartile is 18.6 to 3 significant figures.
(1)
An employee is selected at random.
(d) Estimate the interquartile range of this distribution.
Find (2)
(e) Give a reason why the mean and standard deviation are not the most appropriate summary
(a) P(H) statistics to use with these data.
(2) (1)
(b) P([F ∩ W]′)
The person timing the exam made an error and each student actually took 5 minutes less than the
(2)
times recorded above. The table below summarises the actual times.
(c) P(W | C)
(2)
Time (minutes) t 6 – 15 16 – 20 21 – 25 26 – 30 31 – 40 41 – 55
Let B be the event that an employee uses the bus.
Number of students f 62 88 16 13 11 10
Given that 10% of full-time workers use the bus, 30% of part-time workers use the bus and
20% of contractors use the bus, (f) Without further calculations, explain the effect this would have on each of the estimates
found in parts (a), (b), (c) and (d).
(d) draw a Venn diagram to represent the events F, H, C and B, (3)
(4)
(e) find the probability that a randomly selected employee uses the bus to travel to work.
(2)
5. A biased die with six faces is rolled. The discrete random variable X represents the score on the
6. The weight, in grams, of beans in a tin is normally distributed with mean μ and standard
uppermost face. The probability distribution of X is shown in the table below.
deviation 7.8.
F(y) 1/10 2/10 3k 4k 5k
Each die is rolled once. The scores on the two dice are independent.
(f) Find the probability that the sum of the two scores equals 2.
(2)
Paper Reference(s)
6683/01R
Edexcel GCE
Advanced/Advanced Subsidiary
1. Sammy is studying the number of units of gas, g, and the number of units of electricity, e,
used in her house each week. A random sample of 10 weeks use was recorded and the data
for each week were coded so that x = (g – 60)/4 and y = e/10. The results for the coded data are
summarised below.
(a) Find the equation of the regression line of y on x in the form y = a + bx.
Advice to Candidates
You must ensure that your answers to parts of questions are clearly labelled.
You must show sufficient working to make your methods clear to the Examiner.
Answers without working may not gain full credit.
P43956A
This publication may only be reproduced in accordance with Edexcel Limited copyright policy.
©2013 Edexcel Limited.
3. An agriculturalist is studying the yields, y kg, from tomato plants. The data from a random
sample of 70 tomato plants are summarised below.
Yield (y kg) Frequency (f) Yield midpoint (x kg)
0 ≤ y < 5 16 2.5
5 ≤ y < 10 24 7.5
10 ≤ y < 15 14 12.5
15 ≤ y < 25 12 20
25 ≤ y < 35 4 30
(You may use Σfx = 755 and Σfx² = 12 037.5)
A histogram has been drawn to represent these data.
The bar representing the yield 5 ≤ y < 10 has a width of 1.5 cm and a height of 8 cm.
(a) Calculate the width and the height of the bar representing the yield 15 ≤ y < 25.
(3)
(b) Use linear interpolation to estimate the median yield of the tomato plants.
(2)
(c) Estimate the mean and the standard deviation of the yields of the tomato plants.
(4)
(d) Describe, giving a reason, the skewness of the data.
(2)
(e) Estimate the number of tomato plants in the sample that have a yield of more than
1 standard deviation above the mean.
(2)
4. The time, in minutes, taken to fly from London to Malaga has a normal distribution with
mean 150 minutes and standard deviation 10 minutes.
(a) Find the probability that the next flight from London to Malaga takes less than 145
minutes.
(3)
The time taken to fly from London to Berlin has a normal distribution with mean 100 minutes
and standard deviation d minutes.
Given that 15% of the flights from London to Berlin take longer than 115 minutes,
(b) find the value of the standard deviation d.
(4)
The time, X minutes, taken to fly from London to another city has a normal distribution with
mean μ minutes.
Given that P(X < μ – 15) = 0.35
(c) find P(X > μ + 15 | X > μ – 15).
(3)
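Worked sketch for Question 4 (flight times), not part of the original paper. It uses Python's standard-library `statistics.NormalDist` in place of the printed tables, so the figures may differ slightly from table-based answers. Part (c) reads the condition as P(X > μ + 15 | X > μ – 15), which is an interpretation of the damaged text; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# (a) London to Malaga: mean 150, standard deviation 10.
p_under_145 = NormalDist(150, 10).cdf(145)

# (b) London to Berlin: P(T > 115) = 0.15 means (115 - 100)/d is the
# z-value with 85% of the distribution below it.
z_85 = std_normal.inv_cdf(0.85)
d = (115 - 100) / z_85

# (c) By symmetry P(X > mu + 15) = P(X < mu - 15) = 0.35, and
# P(X > mu - 15) = 1 - 0.35. The event X > mu + 15 lies inside
# X > mu - 15, so the conditional probability is a simple ratio.
p_conditional = 0.35 / (1 - 0.35)

print(round(p_under_145, 4), round(d, 2), round(p_conditional, 3))
```

Neither μ nor the standard deviation is needed in part (c): the symmetry of the normal distribution is enough.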
5. A researcher believes that parents with a short family name tended to give their children a 6.
long first name. A random sample of 10 children was selected and the number of letters in
their family name, x, and the number of letters in their first name, y, were recorded.
Given that the addition of the child with family name “Turner” to the sample leads to an
increase in S_yy
(e) use the definition S_xy = Σ(x – x̄)(y – ȳ) to determine whether or not the value of r will
increase, decrease or stay the same. Give a reason for your answer.
(2)
(a) Find the value of p.
(3)
Given that P(B | C) = 5/11,
(b) find the value of q and the value of r.
(4)
(c) Find P( A C _ B) .
(2)
7. The score S when a spinner is spun has the following probability distribution.
s 0 1 2 4 5
P(S = s) 0.2 0.2 0.1 0.3 0.2
The spinner is spun twice.
The score from the first spin is S₁ and the score from the second spin is S₂.
The random variables S₁ and S₂ are independent and the random variable X = S₁ × S₂.
Paper Reference(s)
Candidates may use any calculator allowed by the regulations of the Joint
Council for Qualifications. Calculators must not have the facility for symbolic
algebra manipulation, differentiation and integration, or have retrievable
mathematical formulas stored in them.
Advice to Candidates
You must ensure that your answers to parts of questions are clearly labelled.
You must show sufficient working to make your methods clear to the Examiner.
Answers without working may not gain full credit.
P42831A This publication may only be reproduced in accordance with Edexcel Limited copyright policy.
©2013 Edexcel Limited
1. A meteorologist believes that there is a relationship between the height above sea level, h m, and 2. The marks of a group of female students in a statistics test are summarised in Figure 1.
the air temperature, t °C. Data is collected at the same time from 9 different places on the same
mountain. The data is summarised in the table below.
t 3 10 20 9 10 13 5 24 16
[You may assume that Σh = 7150, Σt = 110, Σh² = 7171500, Σt² = 1716, Σth = 64 980
and S_tt = 371.56]
(b) Find the median and interquartile range of the marks of the male students.
(3)
An outlier is a mark that is 3. In a company the 200 employees are classified as full-time workers, part-time workers or
contractors.
either more than 1.5 × interquartile range above the upper quartile
The table below shows the number of employees in each category and whether they walk to
or more than 1.5 × interquartile range below the lower quartile. work or use some form of transport.
(c) On graph paper draw a box plot to represent the marks of the male students, indicating Walk Transport
clearly any outliers.
(5) Full-time worker 2 8
(d) Compare and contrast the marks of the male and the female students.
Part-time worker 35 75
(2)
Contractor 30 50
The events F, H and C are that an employee is a full-time worker, part-time worker or contractor
respectively. Let W be the event that an employee walks to work.
Find
(a) P(H)
(2)
(b) P([F ∩ W]′)
(2)
(c) P(W | C)
(2)
Given that 10% of full-time workers use the bus, 30% of part-time workers use the bus and
20% of contractors use the bus,
4. The following table summarises the times, t minutes to the nearest minute, recorded for a group 5. A biased die with six faces is rolled. The discrete random variable X represents the score on the
of students to complete an exam. uppermost face. The probability distribution of X is shown in the table below.
Time (minutes) t 11 – 20 21 – 25 26 – 30 31 – 35 36 – 45 46 – 60 x 1 2 3 4 5 6
(a) Given that E(X) = 4.2 find the value of a and the value of b.
[You may use Σft² = 134281.25]
(5)
(a) Estimate the mean and standard deviation of these data. (b) Show that E(X²) = 20.4.
(5) (1)
(b) Use linear interpolation to estimate the value of the median. (c) Find Var(5 – 3X).
(2) (3)
(c) Show that the estimated value of the lower quartile is 18.6 to 3 significant figures.
A biased die with five faces is rolled. The discrete random variable Y represents the score which
(1)
is uppermost. The cumulative distribution function of Y is shown in the table below.
(d) Estimate the interquartile range of this distribution.
(2)
y 1 2 3 4 5
(e) Give a reason why the mean and standard deviation are not the most appropriate summary
statistics to use with these data.
(1) F(y) 1/10 2/10 3k 4k 5k
The person timing the exam made an error and each student actually took 5 minutes less than the
times recorded above. The table below summarises the actual times. (d) Find the value of k.
(1)
Time (minutes) t 6 – 15 16 – 20 21 – 25 26 – 30 31 – 40 41 – 55 (e) Find the probability distribution of Y.
(3)
Number of students f 62 88 16 13 11 10
Each die is rolled once. The scores on the two dice are independent.
(f) Without further calculations, explain the effect this would have on each of the estimates (f) Find the probability that the sum of the two scores equals 2.
found in parts (a), (b), (c) and (d). (2)
(3)
Paper Reference(s)
WST01/01
6. The weight, in grams, of beans in a tin is normally distributed with mean μ and standard
deviation 7.8.
Pearson Edexcel
Given that 10% of tins contain less than 200 g, find International Advanced Level
(a) the value of μ,
(3) Statistics S1
(b) the percentage of tins that contain more than 225 g of beans.
(3) Advanced/Advanced Subsidiary
The machine settings are adjusted so that the weight, in grams, of beans in a tin is normally
distributed with mean 205 and standard deviation σ.
Friday 17 January 2014 Afternoon
(c) Given that 98% of tins contain between 200 g and 210 g find the value of σ. Time: 1 hour 30 minutes
(4)
Materials required for examination Items included with question papers
TOTAL FOR PAPER: 75 MARKS Mathematical Formulae (Blue) Nil
END
Candidates may use any calculator allowed by the regulations of the Joint Council for
Qualifications. Calculators must not have the facility for symbolic algebra manipulation,
differentiation and integration, or have retrievable mathematical formulae stored in them.
Instructions
• Use black ink or ball-point pen.
• If pencil is used for diagrams/sketches/graphs it must be dark (HB or B).
Coloured pencils and highlighter pens must not be used.
• Fill in the boxes at the top of this page with your name,
centre number and candidate number.
• Answer all questions and ensure that your answers to parts of questions are
clearly labelled.
• Answer the questions in the spaces provided
– there may be more space than you need.
• You should show sufficient working to make your methods clear. Answers
without working may not gain full credit.
• Values from the statistical tables should be quoted in full. When a calculator is
used, the answer should be given to an appropriate degree of accuracy.
Information
• The total mark for this paper is 75.
• The marks for each question are shown in brackets
– use this as a guide as to how much time to spend on each question.
Advice
• Read each question carefully before you start to answer it.
• Try to answer every question.
• Check your answers if you have time at the end.
P43140A This publication may only be reproduced in accordance with Edexcel Limited copyright policy.
©2014 Edexcel Limited.
1. A price comparison website publishes data on the cost per month, £c, and the level of 2. A rugby club coach uses club records to take a random sample of 15 players from 1990 and
satisfaction, s, of a random sample of six internet service providers. A low value of s an independent random sample of 15 players from 2010. The body weight of each player was
corresponds to a low level of satisfaction. The data are given in the table below. recorded to the nearest kg and the results from 2010 are summarised in the table below.
(You may use Σc, Σc², Σs, Σs², Σcs = 380, S_cc = 321.5.) (a) Find the estimated values in kg of the summary statistics a, b and c in the table below.
(a) Calculate the value of S_cs and the value of S_ss. Estimate in 1990 Estimate in 2010
(3) Mean 83.0 a
(b) Calculate the product moment correlation coefficient for these data. Median 82.0 b
(2) Variance 44.0 c
Brad is not satisfied with his current internet service and decides to change his provider. Give your answers to 3 significant figures.
He decides to pay a lot more for his new internet service.
(6)
(c) On the basis of your calculation in part (b), comment on Brad’s decision. Give a reason
for your answer. The rugby coach claims that players’ body weight increased between 1990 and 2010.
(2)
(b) Using the table in part (a), comment on the rugby coach’s claim.
(2)
3. Jean works for an insurance company. She randomly selects 8 people and records the price of
their car insurance, £p, and the time, t years, since they passed their driving test. The data is
shown in the table below.
t 10 13 17 18 22 24 25 27
p 720 650 430 490 500 390 280 300
(You may use t̄ = 19.5, p̄ = 470, S_tp = –6080, S_tt = 254, S_pp = 169 200.)
(a) On the graph below draw a scatter diagram for these data.
(2)
(b) Comment on the relationship between p and t.
(1)
(c) Find the equation of the regression line of p on t.
(4)
(d) Use your regression equation to estimate the price of car insurance for someone who
passed their driving test 20 years ago.
(2)
Jack passed his test 39 years ago and decides to use Jean’s data to predict the price of his car
insurance.
(e) Comment on Jack’s decision. Give a reason for your answer.
(2)
4. A discrete random variable X has the probability distribution given in the table below, where
a and b are constants.
x –1 0 1 2 3
P(X = x) a 1/10 1/5 3/10 b
Given E(X) = 9/5
(a) (i) find two simultaneous equations for a and b,
(ii) show that a = 1/20 and find the value of b.
(4)
(b) Specify the cumulative distribution function F(x) for x = –1, 0, 1, 2 and 3.
(2)
(c) Find P(X < 2.5).
(1)
(d) Find Var(3 – 2X).
(4)
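Worked sketch for Question 3 (car insurance), parts (c) to (e), not part of the original paper. It uses the summary values quoted in the question, reading the first two as the sample means of t and p, which agrees with the data table. The helper names `predict` and `within_data_range` are hypothetical.

```python
# Summary values from the question.
t_bar, p_bar = 19.5, 470
s_tp, s_tt = -6080, 254

# (c) Regression line of p on t: p = a + b*t.
b = s_tp / s_tt
a = p_bar - b * t_bar

def predict(t):
    """Estimated insurance price for t years since passing the test."""
    return a + b * t

# (d) Estimate for someone who passed 20 years ago.
estimate_20 = predict(20)

# (e) An estimate is only trustworthy inside the range of observed t.
observed_t = [10, 13, 17, 18, 22, 24, 25, 27]

def within_data_range(t):
    return min(observed_t) <= t <= max(observed_t)

print(round(a, 2), round(b, 2), round(estimate_20, 2))
print(within_data_range(20), within_data_range(39))
```

The final line illustrates the point behind part (e): 20 years lies inside the observed range, while 39 years is extrapolation well beyond it.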
5. A group of 100 students are asked if they like folk music, rock music or soul music. 6. A manufacturer has a machine that fills bags with flour such that the weight of flour in a bag
is normally distributed. A label states that each bag should contain 1 kg of flour.
All students who like folk music also like rock music
(a) The machine is set so that the weight of flour in a bag has mean 1.04 kg and standard
No students like both rock music and soul music
deviation 0.17 kg. Find the proportion of bags that weigh less than the stated weight
75 students do not like soul music of 1 kg.
(3)
12 students who like rock music do not like folk music
30 students like folk music The manufacturer wants to reduce the number of bags which contain less than the stated
weight of 1 kg. At first she decides to adjust the mean but not the standard deviation so that
(a) Draw a Venn diagram to illustrate this information. only 5% of the bags filled are below the stated weight of 1 kg.
(4)
(b) Find the adjusted mean.
(b) State two of these types of music that are mutually exclusive.
(3)
(1)
The manufacturer finds that a lot of the bags are overflowing with flour when the mean is
Find the probability that a randomly chosen student
adjusted, so decides to adjust the standard deviation instead to make the machine more
accurate. The machine is set back to a mean of 1.04 kg. The manufacturer wants 1% of bags
(c) does not like folk music, rock music or soul music,
to be under 1 kg.
(1)
(d) likes rock music, (c) Find the adjusted standard deviation. Give your answer to 3 significant figures.
(1) (3)
(e) likes folk music or soul music.
(1)
Given that a randomly chosen student likes rock music, 7. In a large college, 3/5 of the students are male, 3/10 of the students are left handed and 1/5 of the
male students are left handed.
(f) find the probability that he or she also likes folk music.
(2) A student is chosen at random.
(a) Given that the student is left handed, find the probability that the student is male.
(2)
(b) Given that the student is female, find the probability that she is left handed.
(3)
(c) Find the probability that the randomly chosen student is male and right handed.
(2)
(d) Find the probability that one student is left handed and one is right handed.
(2)
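As an illustrative check on Question 7 parts (a) to (c), the sketch below works through the conditional probabilities with exact fractions. It assumes the fractions in the question read 3/5 male, 3/10 left handed and 1/5 of male students left handed; the variable names are illustrative and not part of the paper.

```python
from fractions import Fraction

# Assumed reading of the question: 3/5 male, 3/10 left handed overall,
# and 1/5 of the male students left handed.
p_male = Fraction(3, 5)
p_left = Fraction(3, 10)
p_left_given_male = Fraction(1, 5)

# Multiplication rule: P(male and left handed).
p_male_and_left = p_male * p_left_given_male

# (a) P(male | left handed) by the definition of conditional probability.
p_male_given_left = p_male_and_left / p_left

# (b) P(left handed | female): subtract the male share of left handers first.
p_female_and_left = p_left - p_male_and_left
p_left_given_female = p_female_and_left / (1 - p_male)

# (c) P(male and right handed).
p_male_and_right = p_male - p_male_and_left

print(p_male_given_left, p_left_given_female, p_male_and_right)
```

Exact fractions are used so that no rounding creeps in before the final answer.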
Paper Reference(s)
8. A manager records the number of hours of overtime claimed by 40 staff in a month.
WST01/01
The histogram in Figure 1 represents the results.
Pearson Edexcel
International Advanced Level
Statistics S1
Advanced/Advanced Subsidiary
Tuesday 10 June 2014 Morning
Time: 1 hour 30 minutes
Candidates may use any calculator allowed by the regulations of the Joint Council for
Qualifications. Calculators must not have the facility for symbolic algebra manipulation,
differentiation and integration, or have retrievable mathematical formulae stored in them.
Instructions
(d) State, giving a reason, whether the manager should use the median or the mean to Information
compare the overtime claimed by staff.
(2) • The total mark for this paper is 75.
• The marks for each question are shown in brackets
– use this as a guide as to how much time to spend on each question.
Advice
• Read each question carefully before you start to answer it.
TOTAL FOR PAPER: 75 MARKS
• Try to answer every question.
END • Check your answers if you have time at the end.
P43144A This publication may only be reproduced in accordance with Pearson Education Limited copyright policy.
©2014 Pearson Education Limited.
1. A medical researcher is studying the relationship between age (x years) and volume of blood 2. The table below shows the distances (to the nearest km) travelled to work by the 50
(y ml) pumped by each contraction of the heart. The researcher obtained the following data employees in an office.
from a random sample of 8 patients.
Distance (km) Frequency (f) Distance midpoint (x)
Age (x) 20 25 30 45 55 60 65 70
0–2 16 1.25
Volume (y) 74 76 77 72 68 67 64 62
3–5 12 4
[You may use Σx = 370, S xx Σy Σy² = 39 418, S xy = –710] 6 – 10 10 8
11 – 20 8 15.5
(a) Calculate S yy .
(2) 21 – 40 4 30.5
(b) Calculate the product moment correlation coefficient for these data.
(2) [You may use Σfx = 394, Σfx² = 6500]
(c) Interpret your value of the correlation coefficient. A histogram has been drawn to represent these data.
(1)
The bar representing the distance of 3 – 5 has a width of 1.5 cm and a height of 6 cm.
The researcher believes that a linear regression model may be appropriate to describe these
data. (a) Calculate the width and height of the bar representing the distance of 6 – 10.
(3)
(d) State, giving a reason, whether or not your value of the correlation coefficient supports
the researcher’s belief. (b) Use linear interpolation to estimate the median distance travelled to work.
(1) (2)
(e) Find the equation of the regression line of y on x, giving your answer in the form (c) (i) Show that an estimate of the mean distance travelled to work is 7.88 km.
y = a + bx.
(4) (ii) Estimate the standard deviation of the distances travelled to work.
(4)
Jack is a 40-year-old patient. (d) Describe, giving a reason, the skewness of these data.
(2)
(f) (i) Use your regression line to estimate the volume of blood pumped by each contraction
of Jack’s heart. Peng starts to work in this office as the 51st employee.
(ii) Comment, giving a reason, on the reliability of your estimate. She travels a distance of 7.88 km to work.
(2)
(e) Without carrying out any further calculations, state, giving a reason, what effect Peng’s
addition to the workforce would have on your estimates of the
(i) mean,
(ii) median,
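For Question 2(c), the grouped-data estimates can be checked with a short sketch. It uses the midpoints and frequencies from the table and the divisor-n form of the standard deviation; the variable names are illustrative only.

```python
from math import sqrt

# Class midpoints and frequencies as given in the Question 2 table.
midpoints = [1.25, 4, 8, 15.5, 30.5]
frequencies = [16, 12, 10, 8, 4]

n = sum(frequencies)
sum_fx = sum(f * x for f, x in zip(frequencies, midpoints))
sum_fx2 = sum(f * x * x for f, x in zip(frequencies, midpoints))

# Estimated mean and standard deviation (divisor n).
mean = sum_fx / n
std_dev = sqrt(sum_fx2 / n - mean ** 2)

print(n, sum_fx, sum_fx2, mean, round(std_dev, 2))
```

The sums reproduce the quoted Σfx = 394 and Σfx² = 6500, so the mean of 7.88 km follows directly from 394 / 50.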
3. A biased four-sided die has faces marked 1, 3, 5 and 7. The random variable X represents the 5. The discrete random variable X has the following probability distribution
score on the die when it is rolled. The cumulative distribution function of X, F(x), is given in
the table below. x –2 0 2 4
P(X = x) a b a c
x 1 3 5 7
F(x) 0.2 0.5 0.9 1 where a, b and c are probabilities.
6. The Venn diagram below shows the probabilities of customers having various combinations 7. One event at Pentor sports day is throwing a tennis ball. The distance a child throws a tennis
of a starter, main course or dessert at Polly’s restaurant. ball is modelled by a normal distribution with mean 32 m and standard deviation 12 m. Any
child who throws the tennis ball more than 50 m is awarded a gold certificate.
S = the event a customer has a starter.
(a) Show that, to 3 significant figures, 6.68% of children are awarded a gold certificate.
M = the event a customer has a main course.
(3)
D = the event a customer has a dessert.
A silver certificate is awarded to any child who throws the tennis ball more than d metres but
less than 50 m.
Three children are selected at random from those who take part in the throwing a tennis ball
event.
(c) Find the probability that 1 is awarded a gold certificate and 2 are awarded silver
certificates. Give your answer to 2 significant figures.
(4)
Given that the events S and D are statistically independent
(i) P(D | M ∩ S)
(ii) P(D | M ∩ Sᶜ)
(4)
One evening 63 customers are booked into Polly’s restaurant for an office party. Polly has
asked for their starter and main course orders before they arrive.
Of these 63 customers
(d) Estimate the number of desserts that these 63 customers will have.
(2)
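The "show that" result in Question 7(a) can be confirmed numerically. The sketch below uses Python's standard-library `statistics.NormalDist` in place of the statistical tables a candidate would use; the names `throw` and `p_gold` are illustrative.

```python
from statistics import NormalDist

# Throw distance modelled as N(32, 12^2), as stated in Question 7.
throw = NormalDist(mu=32, sigma=12)

# A gold certificate needs a throw of more than 50 m: P(X > 50) = 1 - P(X <= 50).
p_gold = 1 - throw.cdf(50)

print(f"{100 * p_gold:.2f}% of children are awarded a gold certificate")
```

Standardising by hand gives z = (50 – 32) / 12 = 1.5, which is the value looked up in the tables.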
Paper Reference(s)
1. The discrete random variable X has probability distribution
6683/01R
–4 –2 1 3 5
Edexcel GCE
x
P(X = x) 0.4 p 0.05 0.15 p
(b) E(X)
Tuesday 10 June 2014 Morning (2)
Candidates may use any calculator allowed by the regulations of the Joint (e) find the possible values of a such that Var(aX + 3) = 53.4.
Council for Qualifications. Calculators must not have the facility for symbolic (2)
algebra manipulation or symbolic differentiation/integration, or have
retrievable mathematical formulae stored in them.
This paper is strictly for students outside the UK. 2. The discrete random variable X has probability distribution
Instructions to Candidates P(X = x) = 1/10, x = 1, 2, 3, … 10
In the boxes above, write your centre number, candidate number, your surname, initials and signature.
Check that you have the correct question paper. (a) Write down the name given to this distribution.
Answer ALL the questions.
(1)
You must write your answer for each question in the space following the question.
Values from the statistical tables should be quoted in full. When a calculator is used, the answer (b) Write down the value of
should be given to an appropriate degree of accuracy.
(i) P(X = 10)
Information for Candidates
A booklet ‘Mathematical Formulae and Statistical Tables’ is provided. (ii) P(X < 10)
Full marks may be obtained for answers to ALL questions. (2)
The marks for the parts of questions are shown in round brackets, e.g. (2).
There are 7 questions in this question paper. The total mark for this paper is 75. The continuous random variable Y has the normal distribution N(10, 2²).
There are 24 pages in this question paper. Any blank pages are indicated.
(c) Write down the value of
Advice to Candidates
You must ensure that your answers to parts of questions are clearly labelled. (i) P(Y = 10)
You must show sufficient working to make your methods clear to the Examiner.
Answers without working may not gain full credit. (ii) P(Y < 10)
(2)
P43148A
This publication may only be reproduced in accordance with Pearson Education Limited copyright policy.
©2014 Pearson Education Limited.
3. A large company is analysing how much money it spends on paper in its offices every year. 4. A and B are two events such that
The number of employees, x, and the amount of money spent on paper, p (£ hundreds), in
8 randomly selected offices are given in the table below.
P(B) = 1/2, P(A | B) = 2/5, P(A ∪ B) = 13/20
x 8 9 12 14 7 3 16 19
(a) Find P(A ∩ B).
p (£ hundreds) 40.5 36.1 30.4 39.4 32.6 31.1 43.4 45.7
(2)
(You may use Σx² = 1160 Σp = 299.2 Σp² = 11 422 Σxp = 3449.5) (b) Draw a Venn diagram to show the events A, B and all the associated probabilities.
(3)
(a) Show that S pp = 231.92 and find the value of S xx and the value of S xp .
(5) Find
(b) Calculate the product moment correlation coefficient between x and p.
(2) (c) P(A)
(1)
The equation of the regression line of p on x is given in the form p = a + bx. (d) P(B | A)
(2)
(c) Show that, to 3 significant figures, b = 0.824 and find the value of a.
(4) (e) P Ac B
(d) Estimate the amount of money spent on paper in an office with 10 employees. (1)
(2)
(e) Explain the effect each additional employee has on the amount of money spent on paper.
(1)
Later the company realised it had made a mistake in adding up its costs, p. The true costs
were actually half of the values recorded. The product moment correlation coefficient and the
equation of the linear regression line are recalculated using this information.
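The summary statistics in Question 3 can be reproduced from the raw table. The sketch below is a check only, using the standard definitions S_xx = Σx² – (Σx)²/n (and similarly for S_pp and S_xp); the variable names are illustrative.

```python
from math import sqrt

# Data from the Question 3 table: employees (x) and paper cost (p, in £ hundreds).
x = [8, 9, 12, 14, 7, 3, 16, 19]
p = [40.5, 36.1, 30.4, 39.4, 32.6, 31.1, 43.4, 45.7]
n = len(x)

# Corrected sums of squares and products.
s_xx = sum(v * v for v in x) - sum(x) ** 2 / n
s_pp = sum(v * v for v in p) - sum(p) ** 2 / n
s_xp = sum(a * b for a, b in zip(x, p)) - sum(x) * sum(p) / n

# Product moment correlation coefficient and regression line p = a + bx.
r = s_xp / sqrt(s_xx * s_pp)
b = s_xp / s_xx
a = sum(p) / n - b * sum(x) / n

print(round(s_pp, 2), round(s_xx, 2), round(s_xp, 2))
print(round(r, 3), round(b, 3), round(a, 1))
```

Halving every p value halves S_xp and quarters S_pp, so rerunning the sketch with `p = [v / 2 for v in p]` is one way to explore the recalculation described in the question.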
5. The table shows the time, to the nearest minute, spent waiting for a taxi by each of 80 people 6. The time taken, in minutes, by children to complete a mathematical puzzle is assumed to be
one Sunday afternoon. normally distributed with mean μ and standard deviation σ. The puzzle can be completed in
less than 24 minutes by 80% of the children. For 5% of the children it takes more than
Waiting time 28 minutes to complete the puzzle.
Frequency
(in minutes)
(a) Show this information on the Normal curve below.
2–4 15
(2)
5–6 9
(b) Write down the percentage of children who take between 24 minutes and 28 minutes to
7 6 complete the puzzle.
(1)
8 24
(c) (i) Find two equations in μ and σ.
9–10 14
11–15 12 (ii) Hence find, to 3 significant figures, the value of μ and the value of σ.
(7)
(a) Write down the upper class boundary for the 2–4 minute interval.
A child is selected at random.
(1)
(d) Find the probability that the child takes less than 12 minutes to complete the puzzle.
A histogram is drawn to represent these data. The height of the tallest bar is 6 cm.
(3)
(b) Calculate the height of the second tallest bar.
(3)
(c) Estimate the number of people with a waiting time between 3.5 minutes and 7 minutes.
(2)
(d) Use linear interpolation to estimate the median, the lower quartile and the upper quartile
of the waiting times.
(4)
(e) Describe the skewness of these data, giving a reason for your answer.
(2)
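For Question 6(c), the two conditions P(T < 24) = 0.80 and P(T > 28) = 0.05 give the simultaneous equations 24 = μ + z₀.₈₀σ and 28 = μ + z₀.₉₅σ. The sketch below solves them with `statistics.NormalDist.inv_cdf` standing in for the percentage-points table; it is a check under that assumption, not a model answer.

```python
from statistics import NormalDist

standard = NormalDist()  # N(0, 1)

# z-values for the two stated percentages.
z_80 = standard.inv_cdf(0.80)  # P(Z < z_80) = 0.80
z_95 = standard.inv_cdf(0.95)  # P(Z < z_95) = 0.95, i.e. 5% above

# Subtract the equations to eliminate mu, then back-substitute.
sigma = (28 - 24) / (z_95 - z_80)
mu = 24 - z_80 * sigma

# Part (d): probability of finishing in under 12 minutes.
p_under_12 = NormalDist(mu, sigma).cdf(12)

print(round(mu, 1), round(sigma, 2), round(p_under_12, 3))
```

Using the tabulated values 0.8416 and 1.6449 instead of `inv_cdf` gives the same answers to 3 significant figures.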
7. In a large company,
Paper Reference(s)
78% of employees are car owners,
30% of these car owners are also bike owners, 6683/01
85% of those who are not car owners are bike owners.
Two employees are selected at random. Candidates may use any calculator allowed by the regulations of the Joint
Council for Qualifications. Calculators must not have the facility for symbolic
(d) Find the probability that only one of them is a bike owner. algebra manipulation, differentiation and integration, or have retrievable
(3) mathematical formulas stored in them.
Instructions to Candidates
TOTAL FOR PAPER: 75 MARKS In the boxes on the answer book, write the name of the examining body (Edexcel), your
END centre number, candidate number, the unit title (Statistics S1), the paper reference (6683),
your surname, other name and signature.
Values from the statistical tables should be quoted in full. When a calculator is used, the
answer should be given to an appropriate degree of accuracy.
Advice to Candidates
You must ensure that your answers to parts of questions are clearly labelled.
You must show sufficient working to make your methods clear to the Examiner.
Answers without working may not gain full credit.
P43017A This publication may only be reproduced in accordance with Pearson Education Limited copyright policy.
©2014 Pearson Education Limited
1. A random sample of 35 homeowners was taken from each of the villages Greenslax and Question 1(b) graph paper
Penville and their ages were recorded. The results are summarised in the back-to-back stem
and leaf diagram below.
Some of the quartiles for these two distributions are given in the table below.
Greenslax Penville
Lower quartile, Q 1 a 31 2. The mark, x, scored by each student who sat a statistics examination is coded using
Median, Q 2 64 39 y = 1.4x – 20
Upper quartile, Q 3 b 55
The coded marks have mean 60.8 and standard deviation 6.60.
(a) Find the value of a and the value of b. Find the mean and the standard deviation of x.
(2) (4)
(b) On the graph paper on the next page draw a box plot to represent the data from Penville.
Show clearly any outliers.
(4)
(c) State the skewness of each distribution. Justify your answers.
(3)
3. The table shows data on the number of visitors to the UK in a month, v (1000s), and the 4. In a factory, three machines, J, K and L, are used to make biscuits.
amount of money they spent, m (£ millions), for each of 8 months.
Machine J makes 25% of the biscuits.
Number of visitors
2450 2480 2540 2420 2350 2290 2400 2460 Machine K makes 45% of the biscuits.
v (1000s)
Amount of money spent The rest of the biscuits are made by machine L.
1370 1350 1400 1330 1270 1210 1330 1350
m (£ millions)
It is known that 2% of the biscuits made by machine J are broken, 3% of the biscuits made by
You may use machine K are broken and 5% of the biscuits made by machine L are broken.
(a) Draw a tree diagram to illustrate all the possible outcomes and associated probabilities.
S vv = 42587.5 S vm = 31512.5 S mm = 25187.5 Σv = 19390 Σm = 10610 (2)
(a) Find the product moment correlation coefficient between m and v.
A biscuit is selected at random.
(2)
(b) Give a reason to support fitting a regression model of the form m = a + bv to these data. (b) Calculate the probability that the biscuit is made by machine J and is not broken.
(1) (2)
(c) Find the value of b correct to 3 decimal places. (c) Calculate the probability that the biscuit is broken.
(2) (2)
(d) Find the equation of the regression line of m on v. (d) Given that the biscuit is broken, find the probability that it was not made by machine K.
(2) (3)
(e) Interpret your value of b.
(2)
5. The discrete random variable X has the probability function
(f) Use your answer to part (d) to estimate the amount of money spent when the number of
visitors to the UK in a month is 2 500 000.
(2) P(X = x) = kx for x = 2, 4, 6; k(x – 2) for x = 8; 0 otherwise
(g) Comment on the reliability of your estimate in part (f). Give a reason for your answer.
(2)
where k is a constant.
(a) Show that k = 1/18.
(2)
(b) Find the exact value of F(5).
(1)
(c) Find the exact value of E(X).
(2)
(d) Find the exact value of E(X²).
(2)
(e) Calculate Var(3 – 4X) giving your answer to 3 significant figures.
(3)
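The parts of Question 5 can be checked with exact arithmetic. The sketch below assumes the probability function reads P(X = x) = kx for x = 2, 4, 6 and k(x – 2) for x = 8, which is consistent with k = 1/18; the dictionary `weights` and the other names are illustrative.

```python
from fractions import Fraction

# Assumed reading of the probability function: multiples of k at each value of x.
weights = {2: 2, 4: 4, 6: 6, 8: 8 - 2}

# Probabilities must sum to 1, which fixes k.
k = Fraction(1, sum(weights.values()))
pmf = {x: k * w for x, w in weights.items()}

# (b) F(5) = P(X <= 5).
f_5 = sum(prob for x, prob in pmf.items() if x <= 5)

# (c), (d) E(X) and E(X^2).
e_x = sum(x * prob for x, prob in pmf.items())
e_x2 = sum(x * x * prob for x, prob in pmf.items())

# (e) Var(3 - 4X) = (-4)^2 Var(X); the constant 3 has no effect.
var_x = e_x2 - e_x ** 2
var_3_minus_4x = (-4) ** 2 * var_x

print(k, f_5, e_x, e_x2, float(var_3_minus_4x))
```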
Time (seconds) Number of customers, f (a) Find the probability that a randomly selected adult female has a height greater than
0 – 30 2 170 cm.
(3)
30 – 60 10
Any adult female whose height is greater than 170 cm is defined as tall.
60 – 70 17
70 – 80 25 An adult female is chosen at random. Given that she is tall,
80 – 100 25 (b) find the probability that she has a height greater than 180 cm.
100 – 150 6 (4)
A histogram was drawn to represent these data. The 30 – 60 group was represented by a bar Half of tall adult females have a height greater than h cm.
of width 1.5 cm and height 1 cm.
(c) Find the value of h.
(a) Find the width and the height of the 70 – 80 group. (5)
(3)
(b) Use linear interpolation to estimate the median of this distribution.
8. For the events A and B,
(2)
Given that x denotes the midpoint of each group in the table and P(Aᶜ ∩ B) = 0.22 and P(Aᶜ ∩ Bᶜ) = 0.18
One measure of skewness is given by (d) Determine whether or not A and B are independent.
(2)
coefficient of skewness = 3(mean – median) / standard deviation
(d) Evaluate this coefficient and comment on the skewness of these data.
(3) TOTAL FOR PAPER: 75 MARKS
END
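As a rough check on the skewness coefficient in part (d), the sketch below estimates the mean, standard deviation and median directly from the time table. It assumes the measure is 3(mean – median) / standard deviation, takes class midpoints for the mean and standard deviation, and interpolates the median at the n/2-th value; the summary sums printed on the original paper are not reproduced here, and the helper `interpolated_median` is hypothetical.

```python
from math import sqrt

# (lower bound, upper bound, frequency) for each class in the table.
classes = [(0, 30, 2), (30, 60, 10), (60, 70, 17),
           (70, 80, 25), (80, 100, 25), (100, 150, 6)]

n = sum(f for _, _, f in classes)

# Mean and standard deviation estimated from class midpoints (divisor n).
mean = sum(f * (lo + hi) / 2 for lo, hi, f in classes) / n
mean_sq = sum(f * ((lo + hi) / 2) ** 2 for lo, hi, f in classes) / n
std_dev = sqrt(mean_sq - mean ** 2)


def interpolated_median(groups, total):
    """Linear interpolation for the (total / 2)-th value in grouped data."""
    target = total / 2
    cumulative = 0
    for lo, hi, f in groups:
        if cumulative + f >= target:
            return lo + (target - cumulative) / f * (hi - lo)
        cumulative += f
    raise ValueError("target lies beyond the last class")


median = interpolated_median(classes, n)
coefficient = 3 * (mean - median) / std_dev

print(round(mean, 2), round(median, 2), round(std_dev, 2), round(coefficient, 3))
```

A coefficient close to zero would suggest the data are roughly symmetric, while a positive value points to positive skew.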