Interpreting Data
Curriculum Ready
www.mathletics.com
Different lists of data have different properties. This unit is focused on the results and conclusions that
can be found from these different properties.
Answer these questions before working through the chapter.
I used to think:
The median of a data set is the middle score. Does this mean that the number of scores greater than the
median is the same as the number of scores less than the median?
If the median splits the data in two halves, what do you think "quartiles" do?
The "range" of data is the difference between the highest and lowest score. What is "interquartile range"?
3P Learning
K 19
SERIES
TOPIC
Basics
Basic Statistics
Data is just a list of numbers called 'scores' or 'results'. The basic statistics that can be found from these scores are the mean, the median and the mode. (These are also called "measures of central tendency".)
The mean is the average score. The symbol for the mean is $\bar{x}$. It is found using the formula $\bar{x} = \dfrac{\sum fx}{\sum f}$.
The mode is the score with the highest frequency. This is the score that occurs the most often.
The median is the middle score when the scores are arranged in ascending order.
The cumulative frequency (cf) is the sum of the frequencies for all scores less than or equal to that score.
Also remember that the symbol $\sum$ (called "sigma") means 'sum of', so $\sum f$ means 'sum of the frequencies of the scores' and $\sum fx$ means 'sum of each frequency multiplied by its score'.
Here is an example.
The heights of a group of people were measured (in cm) and the results were written in this table:

Score (x)   Frequency (f)   Cumulative frequency (cf)   fx (f × x)
110         3               3                           3 × 110 = 330
112         5               3 + 5 = 8                   5 × 112 = 560
113         10              8 + 10 = 18                 10 × 113 = 1130
115         9               18 + 9 = 27                 9 × 115 = 1035
116         8               27 + 8 = 35                 8 × 116 = 928

$\sum f = 35$ and $\sum fx = 3983$

$\bar{x} = \dfrac{\sum fx}{\sum f} = \dfrac{3983}{35} = 113.8$ cm
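The table calculations above can be checked with a short script. This is only a sketch in Python: the variable names and the helper `score_at` are made up for illustration, and the frequencies 3 and 5 for the first two scores are read off the working "3 × 110" and "3 + 5 = 8".

```python
from itertools import accumulate

# Heights (cm) and their frequencies, taken from the example table.
scores = [110, 112, 113, 115, 116]
freqs = [3, 5, 10, 9, 8]

total_f = sum(freqs)                                  # sum of f
total_fx = sum(f * x for x, f in zip(scores, freqs))  # sum of fx
mean = total_fx / total_f                             # mean = (sum of fx) / (sum of f)

# Cumulative frequency: running total of the frequencies.
cum_freqs = list(accumulate(freqs))

# Mode: the score with the highest frequency.
mode = scores[freqs.index(max(freqs))]


def score_at(position):
    """Return the score at a 1-based position in the ordered list of scores."""
    for x, cf in zip(scores, cum_freqs):
        if position <= cf:
            return x
    raise ValueError("position is beyond the last score")


# Median: the middle score, or the average of the two middle scores.
if total_f % 2 == 1:
    median = score_at((total_f + 1) // 2)
else:
    median = (score_at(total_f // 2) + score_at(total_f // 2 + 1)) / 2

print(total_f, total_fx, round(mean, 1), mode, median, cum_freqs)
```

The cumulative frequencies do the work of finding the median: the 18th of the 35 scores is the first score whose cf reaches 18.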
[Figure: frequency histogram with a frequency polygon drawn over it. Horizontal axis: Score (x), 2 to 10. Vertical axis: Frequency (f), 0 to 9. Leave half a column on either side of the histogram.]
$\sum f = 2 + 6 + 9 + 7 + 8 + 3 = 35$

$\bar{x} = \dfrac{\sum fx}{\sum f} = \dfrac{2 \times 2 + 6 \times 4 + 9 \times 6 + 7 \times 7 + 8 \times 9 + 3 \times 10}{35} = 6.7$ (1 d.p.)
[Figure: cumulative frequency histogram with a cumulative frequency polygon drawn over it. Horizontal axis: Score (x), 2 to 10. Vertical axis: Cumulative frequency (cf).]
$\therefore$ median =
Questions
Basics
1. A group of people were asked how many languages they speak and this table was partly completed.
a
Number of languages (x)   Frequency (f)   Cumulative frequency (cf)   fx
1                         20                                          20 × 1 = 20
2                                         38
3                                         50                          12 × 3 = 36
4                                         57
5

$\sum f = 60$          $\sum fx =$
b
Are
2. A group of people were asked how many movies they had seen in the last year. The diagram below shows the
frequency polygon for the results.
[Figure: frequency polygon titled "Number of movies seen". Horizontal axis: Movies (x), 20 to 25. Vertical axis: Frequency (f), 0 to 9.]
a
Movies (x)   Frequency (f)   fx

$\sum f =$          $\sum fx =$
3. A group of people were asked their age and this frequency histogram was produced.
[Figure: frequency histogram titled "Different Ages". Horizontal axis: Ages (x), 30 to 35. Vertical axis: Frequency (f), 0 to 5.]
a
Ages (x)   Frequency (f)   fx

$\sum f =$          $\sum fx =$
b
Knowing More
The median is the middle score of the data (or the average of the two middle scores). This means that 50% of the
scores are less than or equal to the median. Quartiles work the same way.
Quartiles
There are 3 quartiles.
The first quartile, written as Q1, is the score that 25% of the scores are less than or equal to. Q1 is the median of the lower half of the scores.
The second quartile, written as Q2, is the median.
The third quartile, written as Q3, is the score that 75% of the scores are less than or equal to. Q3 is the median of the upper half of the scores.
In other words, the quartiles divide the data into quarters.
[Figure: the range of scores drawn as a line, marked from left to right with the lowest value, Q1 (lower quartile), the median, Q3 (upper quartile) and the highest value.]
Frequency (f)
13
15
16
20
$\sum f = 20$
a
$\therefore Q_1 = \dfrac{2 + 2}{2} = 2$
Since there are 20 scores, Q3 will be the average of the scores in the 15th and 16th positions. From the table, the score in the 15th position is 4, and the score in the 16th position is 5.
$\therefore Q_3 = \dfrac{4 + 5}{2} = 4.5$
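Score (x)   Frequency (f)   Cumulative frequency (cf)
10          5               5
12          8               13
14          4               17
16          10              27
18          6               33
20          3               36

$\sum f = 36$
a
Find Q1
There are 36 scores in total, so Q1 is the average of the 9th and 10th scores (the median of the lower half). The cf of x = 10 is 5 and the cf of x = 12 is 13, so the 9th and 10th scores are both 12.
$\therefore Q_1 = \dfrac{12 + 12}{2} = 12$
Find Q2
There are 36 scores in total, so the median is the average of the scores in the 18th and 19th positions. The cf of x = 14 is 17 and the cf of x = 16 is 27, so both of these scores are 16.
$\therefore Q_2 = \dfrac{16 + 16}{2} = 16$
Find Q3
There are 36 scores in total, so Q3 is the average of the 27th and 28th scores (the median of the upper half). The cf of x = 16 is 27 and the cf of x = 18 is 33, so the 27th score is 16 and the 28th score is 18.
$\therefore Q_3 = \dfrac{16 + 18}{2} = 17$
IQR $= Q_3 - Q_1 = 17 - 12 = 5$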
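The same quartiles can be found by writing out all 36 scores and taking the median of each half. A minimal Python sketch of that idea: the frequencies in `table` are the ones implied by the cumulative frequencies in the example, the names `table` and `quartiles` are made up for illustration, and leaving the middle score out of both halves when the number of scores is odd is an assumption, not something this unit states.

```python
from statistics import median

# Frequency table from the example: score -> frequency.
table = {10: 5, 12: 8, 14: 4, 16: 10, 18: 6, 20: 3}

# Write every score out in ascending order (36 scores in total).
data = [x for x, f in sorted(table.items()) for _ in range(f)]


def quartiles(ordered):
    """Return (Q1, Q2, Q3) using the 'median of each half' method."""
    n = len(ordered)
    lower = ordered[: n // 2]        # lower half of the scores
    upper = ordered[(n + 1) // 2:]   # upper half of the scores
    return median(lower), median(ordered), median(upper)


q1, q2, q3 = quartiles(data)
iqr = q3 - q1
print(q1, q2, q3, iqr)
```

With 36 scores each half holds 18 scores, so Q1 is the average of the 9th and 10th scores and Q3 is the average of the 27th and 28th, matching the working above.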
Questions
Find Q1 .
Knowing More
100
$\sum f = 52$
a
Find Q1 and Q3.
Box-and-Whisker Plots
A 5-point summary (the lowest value, Q1, the median, Q3 and the highest value) is used to draw a "box-and-whisker" plot for a set of data.
These plots are used to compare different data sets. They are drawn like this:
[Figure: box-and-whisker plot above a number line from 20 to 30. The box runs from Q1 to Q3 with the median (Q2) marked inside it. One whisker runs from the lowest value to Q1 and the other from Q3 to the highest value. Each section of the plot holds 25% of the data.]
        Jan   Feb   Mar   Apr   May   Jun   Jul   Aug   Sep   Oct   Nov   Dec
2009    28    22    24    22    16    26    20    25    29    20    23    24
2010    21    24    26    25    24    25    24    25    30    27    28    36

Ascending Order
2009: 16, 20, 20, 22, 22, 23, 24, 24, 25, 26, 28, 29
2010: 21, 24, 24, 24, 25, 25, 25, 26, 27, 28, 30, 36

                      2009    2010
Lowest Temperature    16      21
Q1                    21      24
Q2                    23.5    25
Q3                    25.5    27.5
Highest Temperature   29      36

Draw box-and-whisker plots for the average temperatures in 2009 and 2010.
[Figure: box-and-whisker plots for 2009 and 2010 drawn above a number line from 15 to 40.]
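The 5-point summaries for the two years can be reproduced with a few lines of Python. This is a sketch only: `five_point_summary` is a made-up helper name, and it uses the "median of each half" method for Q1 and Q3.

```python
from statistics import median

# Monthly average temperatures, January to December, from the table.
temps_2009 = [28, 22, 24, 22, 16, 26, 20, 25, 29, 20, 23, 24]
temps_2010 = [21, 24, 26, 25, 24, 25, 24, 25, 30, 27, 28, 36]


def five_point_summary(scores):
    """Return (lowest, Q1, Q2, Q3, highest) for a list of scores."""
    ordered = sorted(scores)             # ascending order first
    n = len(ordered)
    lower = ordered[: n // 2]            # lower half
    upper = ordered[(n + 1) // 2:]       # upper half
    return (ordered[0], median(lower), median(ordered),
            median(upper), ordered[-1])


summary_2009 = five_point_summary(temps_2009)
summary_2010 = five_point_summary(temps_2010)
print(summary_2009)
print(summary_2010)
```

Each tuple gives the five values needed to draw one box-and-whisker plot: the two ends of the whiskers, the two ends of the box and the line inside the box.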
Standard Deviation
Standard deviation measures how far, on average, the scores are from the mean. Its symbol uses the lower-case Greek letter sigma, $\sigma_n$, pronounced 'sigma-n'. This is the formula for $\sigma_n$:
$\sigma_n = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n}}$
where $\bar{x}$ is the mean, $n$ is the number of scores and $\sum$ still means 'sum of'. Here is an example:
Find the standard deviation (correct to 1 decimal place) of this set of data: 11, 8, 13, 3, 9, 15, 17, 17, 6, 11
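Find the mean, $\bar{x}$:
$\bar{x} = \dfrac{\text{sum of scores}}{n} = \dfrac{110}{10} = 11$
Draw a table with these 3 columns. (The column $x - \bar{x}$ is called the "mean difference".)

Score (x)   $x - \bar{x}$     $(x - \bar{x})^2$
11          11 - 11 = 0       0
8           8 - 11 = -3       9
13          13 - 11 = 2       4
3           3 - 11 = -8       64
9           9 - 11 = -2       4
15          15 - 11 = 4       16
17          17 - 11 = 6       36
17          17 - 11 = 6       36
6           6 - 11 = -5       25
11          11 - 11 = 0       0

$\sum (x - \bar{x}) = 0$ and $\sum (x - \bar{x})^2 = 194$

$\sigma_n = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n}} = \sqrt{\dfrac{194}{10}} = 4.4$ (1 d.p.)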
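The three columns of the table translate directly into code. The sketch below follows the same steps in Python; the variable names are made up for illustration. As a cross-check it also calls `statistics.pstdev`, the standard library's population standard deviation, which divides by n in the same way as the formula for sigma-n.

```python
from math import isclose, sqrt
from statistics import pstdev

scores = [11, 8, 13, 3, 9, 15, 17, 17, 6, 11]

n = len(scores)
mean = sum(scores) / n                       # x-bar = sum of scores / n

deviations = [x - mean for x in scores]      # the "mean difference" column
squared = [d ** 2 for d in deviations]       # the squared column

sigma_n = sqrt(sum(squared) / n)             # square root of the average squared deviation

print(mean, sum(deviations), sum(squared), round(sigma_n, 1))

# The library function should agree with the hand-built calculation.
assert isclose(sigma_n, pstdev(scores))
```

The mean differences always add to 0, which is a quick way to check the middle column before squaring.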
Questions
1. An athlete runs the same race 16 times. This is how long it takes him (in seconds) to run each time:
14 , 12 , 18 , 14 , 16 , 18 , 19 , 14 , 16 , 17 , 15 , 13 , 20 , 16 , 14 , 19
a
[Number line: 11 to 21]
2. Over 8 days, a cat and a dog eat the amounts of food (in grams) shown in the table below.

       Mon   Tues   Wed   Thur   Fri   Sat   Sun   Mon
Cat    70    100    40    90     50    70    55    100
Dog    65    100    90    80     75    85    50    85

Draw box-and-whisker plots for the different data sets on the number line below.
[Number line: 20 to 115 in steps of 5]
From the box-and-whisker plot, which had the greater interquartile range? Find the interquartile range.
3. 150 men and 150 women were in a survey and these are the resulting box-and-whisker plots from their ages:
[Figure: box-and-whisker plots for Women and Men drawn above a number line from 20 to 58.]
a
What is the age that half the men were older than?
4. Ava counted the number of books she read each month for a year. She wrote them in the table below:
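Jan   Feb   Mar   Apr   May   Jun   Jul   Aug   Sep   Oct   Nov   Dec
[Partly completed table with rows for the number of books (x), $x - \bar{x}$ and $(x - \bar{x})^2$. Surviving entries, in the order extracted: 10, 10, -1, 4, 1, -5, 6, 1, -5, 9, 9, 10, 10, 5, 6.]
$\sum (x - \bar{x}) =$
$\sum (x - \bar{x})^2 =$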
Show that the standard deviation of the above set of data to 2 decimal places is 2.97.
Thinking More
Skewness of Data
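A set of data can be one of three things: normally distributed, skewed to the right or skewed to the left. Comparing the mean and the median gives a rule of thumb:
mean = median: normal distribution (not skewed)
median < mean: skewed to the right
median > mean: skewed to the left
This is only a general rule of thumb which holds most of the time; exceptions to it do occur. It can be used to compare different data sets.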
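The rule of thumb compares the mean with the median. Here is a minimal Python sketch, assuming the usual form of the rule (a mean greater than the median suggests a skew to the right, a mean less than the median suggests a skew to the left, and equal values suggest no skew). The function name and the three sample lists are made up for illustration and do not come from this unit.

```python
from statistics import mean, median


def skew_by_rule_of_thumb(scores):
    """Classify a data set by comparing its mean with its median."""
    m = mean(scores)
    med = median(scores)
    if m > med:
        return "skewed to the right"
    if m < med:
        return "skewed to the left"
    return "not skewed"


# Hypothetical data sets, chosen so that each branch is used once.
print(skew_by_rule_of_thumb([1, 2, 2, 3, 12]))     # mean 4, median 2
print(skew_by_rule_of_thumb([1, 10, 11, 11, 12]))  # mean 9, median 11
print(skew_by_rule_of_thumb([1, 2, 3, 4, 5]))      # mean 3, median 3
```

Because this is only a rule of thumb, the label it returns is a first guess to check against a graph of the data.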
Two data sets are shown on the column graph below, data set 1 (white) and data set 2 (black).
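[Figure: column graph of data set 1 (white) and data set 2 (black); vertical axis from 0 to 60.]
Data Set 2:
median $= \dfrac{25 + 30}{2} = 27.5$
median $= \dfrac{15 + 20}{2} = 17.5$
Find the mean of both data sets and comment on the skewness of each data set.
Data set 1: Mean = 27.5
Data set 2: Mean = $23\frac{1}{3}$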
Spread of Data
The spread of a data set measures how consistent (close to the mean) the data set is. This depends on:
Range: the wider the range of the data set, the less likely it is that scores will be close to the mean.
Interquartile range: the wider the interquartile range of the data set, the less likely it is that scores will be close to the mean.
Standard deviation ($\sigma_n$): measures how far the scores are from the mean.
A game player tries two strategies for playing a game. He tries each strategy eight times and these are the points received:
Strategy 1: 34, 35, 28, 28, 30, 31, 32, 27
Strategy 2: 1, 13, 5, 10, 16, 14, 1, 5
a
           Strategy 1   Strategy 2
Lowest     27           1
Q1         28           (1 + 5) / 2 = 3
Q2         30.5         7.5
Q3         33           (13 + 14) / 2 = 13.5
Highest    35           16

           Strategy 1     Strategy 2
Range      35 - 27 = 8    16 - 1 = 15
IQR        33 - 28 = 5    13.5 - 3 = 10.5

$\sigma_n = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n}}$
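All three measures of spread can be computed side by side for the two strategies. The Python sketch below is one way to do it; `spread` is a made-up helper name, and Q1 and Q3 are found with the "median of each half" method.

```python
from math import sqrt
from statistics import median

strategy_1 = [34, 35, 28, 28, 30, 31, 32, 27]
strategy_2 = [1, 13, 5, 10, 16, 14, 1, 5]


def spread(scores):
    """Return the range, interquartile range and sigma-n of a data set."""
    ordered = sorted(scores)
    n = len(ordered)
    q1 = median(ordered[: n // 2])          # median of the lower half
    q3 = median(ordered[(n + 1) // 2:])     # median of the upper half
    mean = sum(ordered) / n
    sigma_n = sqrt(sum((x - mean) ** 2 for x in ordered) / n)
    return {
        "range": ordered[-1] - ordered[0],
        "iqr": q3 - q1,
        "sigma_n": sigma_n,
    }


spread_1 = spread(strategy_1)
spread_2 = spread(strategy_2)
print(spread_1)
print(spread_2)
```

A smaller value for every measure points to the more consistent strategy, so comparing the two dictionaries entry by entry answers the question of which strategy gives steadier results.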
Questions
Thinking More
1. Two movies were reviewed by eight critics, who gave each movie a score between 1 and 10.
Here are their results:
Movie 1: 3, 8, 8, 6, 3, 5, 2, 5
Movie 2: 8, 7, 3, 4, 7, 10, 6, 3
a
Use this table to find the standard deviation of each movie's scores:
[Blank table for Movie 1 and for Movie 2 with columns Score (x), $x - \bar{x}$ and $(x - \bar{x})^2$, and totals $\sum (x - \bar{x}) =$ and $\sum (x - \bar{x})^2 =$]
[Blank table with columns Movie 1 and Movie 2 and rows Lowest, Q1, Q2, Q3, Highest and $\sigma_n$]
c
2. At the Olympics, divers receive a score between 1 and 10 each time they dive. These are the scores after 12 dives for the divers who came in first and second place.
Diver A: 7, 8, 5, 7, 8, 6, 6, 5, 8, 5, 8, 5
Diver B: 7, 6, 4, 5, 6, 5, 10, 9, 8, 6, 9, 9
Draw a box-and-whisker plot for each of the divers' scores. Are the scores skewed?
[Number line: 0 to 12]
Draw up a table with the headings Score (x), $x - \bar{x}$ and $(x - \bar{x})^2$, and use it to find $\sigma_n$ for both divers.
If the median is less than the mean which way would the data be skewed (according to the rule of thumb)?
Standard deviation is a measure of how far each score is from the mean. What does this mean?
Does data with a higher or lower standard deviation have more consistency?
Answers
Basics:
1. a
Number of languages (x)   Frequency (f)    Cumulative frequency (cf)   fx
1                         20               0 + 20 = 20                 20 × 1 = 20
2                         38 - 20 = 18     20 + 18 = 38                18 × 2 = 36
3                         50 - 38 = 12     38 + 12 = 50                12 × 3 = 36
4                         7                50 + 7 = 57                 7 × 4 = 28
5                         3                57 + 3 = 60                 3 × 5 = 15

$\sum f = 60$ and $\sum fx = 20 + 36 + 36 + 28 + 15 = 135$
$\bar{x} = \dfrac{\sum fx}{\sum f} = \dfrac{135}{60} = 2.25$
mode = 1
median = 2
Yes,

2. a
Movies (x)   Frequency (f)   Cumulative frequency (cf)   fx
20           8               8                           20 × 8 = 160
21           5               8 + 5 = 13                  21 × 5 = 105
22           7               13 + 7 = 20                 22 × 7 = 154
23           3               20 + 3 = 23                 23 × 3 = 69
24           2               23 + 2 = 25                 24 × 2 = 48
25           4               25 + 4 = 29                 25 × 4 = 100

$\sum f = 29$ and $\sum fx = 636$
$\bar{x} = 21.93$ (2 d.p.)
mode = 20
median = 22

3. a
Ages (x)   Frequency (f)   Cumulative frequency (cf)   fx
30         4               4                           30 × 4 = 120
31         3               4 + 3 = 7                   31 × 3 = 93
32         3               7 + 3 = 10                  32 × 3 = 96
33         3               10 + 3 = 13                 33 × 3 = 99
34         2               13 + 2 = 15                 34 × 2 = 68
35         5               15 + 5 = 20                 35 × 5 = 175

$\sum f = 20$ and $\sum fx = 651$
3. b
[Figure: frequency histogram titled "Different Ages". Horizontal axis: Ages (x), 30 to 35. Vertical axis: Frequency (f), 0 to 5.]
$\bar{x} = 32.55$
mode = 35
median = 32.5

Knowing More:
1. a 0, 0, 1, 1, 2, 2, 2, 3, 4, 4, 5, 7, 7, 8, 8, 8
b
median = 3.5
Q1 = 1.5
Q3 = 7
Lowest score = 0
Q3 = 7 (upper quartile)
Highest score = 8
Range = 8
IQR = 5.5

2. a
Results in % (x)   Number of students (f)   Cumulative frequency (cf)
10                 5                        5
20                 3                        5 + 3 = 8
30                 1                        8 + 1 = 9
40                 8                        9 + 8 = 17
50                 3                        17 + 3 = 20
60                 9                        20 + 9 = 29
70                 6                        29 + 6 = 35
80                 3                        35 + 3 = 38
90                 8                        38 + 8 = 46
100                6                        46 + 6 = 52

$\sum f = 52$
median = 60
Q1 = 40
Q3 = 90
IQR = 50
Lowest score = 10
Q1 = 40 (lower quartile)
Q2 = 60 (the median)
Q3 = 90 (upper quartile)

1.
Lowest score = 12
Q1 = 14 (lower quartile)
Q2 = $\dfrac{16 + 16}{2}$ = 16 (the median)
Q3 = 18 (upper quartile)
Highest score = 20
Range = 8
IQR = 4
d
[Figure: box-and-whisker plot drawn above a number line from 11 to 21.]

2. a
Cat   40   50   55   70   70   90   100   100
Dog   50   65   75   80   85   85   90    100
Cat:
Lowest score = 40
Q1 = 52.5 (lower quartile)
Q2 = 70 (the median)
Q3 = 95 (upper quartile)
Highest score = 100
Dog:
Lowest score = 50
Q1 = 70 (lower quartile)
Q2 = 82.5 (the median)
Q3 = 87.5 (upper quartile)
Highest score = 100
[Figure: box-and-whisker plots for Cat and Dog drawn above a number line from 35 to 100.]
IQR (dog) = 17.5
4. $\sum (x - \bar{x})^2 = 106$ and $n = 12$, so
$\sigma_n = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n}} = \sqrt{\dfrac{106}{12}} = 2.97$ (2 d.p.)

Thinking More:
1. a
Movie 1 ($\bar{x} = 5$)
Score (x)   $x - \bar{x}$    $(x - \bar{x})^2$
3           3 - 5 = -2       4
8           8 - 5 = 3        9
8           8 - 5 = 3        9
6           6 - 5 = 1        1
3           3 - 5 = -2       4
5           5 - 5 = 0        0
2           2 - 5 = -3       9
5           5 - 5 = 0        0

$\sum (x - \bar{x}) = 0$ and $\sum (x - \bar{x})^2 = 36$
Movie 1: $\sigma_n = 2.12$

Movie 2 ($\bar{x} = 6$)
Score (x)   $x - \bar{x}$    $(x - \bar{x})^2$
8           8 - 6 = 2        4
7           7 - 6 = 1        1
3           3 - 6 = -3       9
4           4 - 6 = -2       4
7           7 - 6 = 1        1
10          10 - 6 = 4       16
6           6 - 6 = 0        0
3           3 - 6 = -3       9

$\sum (x - \bar{x}) = 0$ and $\sum (x - \bar{x})^2 = 44$
Movie 2: $\sigma_n = 2.35$

             Movie 1   Movie 2
Lowest       2         3
Q1           3         3.5
Q2           5         6.5
Q3           7         7.5
Highest      8         10
$\sigma_n$   2.12      2.35

For Movie 1:
the mean = the median = 5
$\therefore$ the data is not skewed and the scores are distributed normally
For Movie 2:
the mean = 6 and the median = 6.5,
median > mean
$\therefore$ the scores are skewed to the left (i.e. the scores below the median are spread further from it than the scores above the median)

d
                      Movie 1      Movie 2
Range                 8 - 2 = 6    10 - 3 = 7
Interquartile range   7 - 3 = 4    7.5 - 3.5 = 4
$\sigma_n$            2.12         2.35
2. a
           Diver A   Diver B
Lowest     5         4
Q1         5         5.5
Q2         6.5       6.5
Q3         8         9
Highest    8         10
[Figure: box-and-whisker plots for Diver A and Diver B drawn above a number line from 3 to 10.]

                      Diver A      Diver B
Range                 8 - 5 = 3    10 - 4 = 6
Interquartile range   8 - 5 = 3    9 - 5.5 = 3.5

2. d
Diver A ($\bar{x} = 6.5$)
Score (x)   $x - \bar{x}$   $(x - \bar{x})^2$
5           -1.5            2.25
5           -1.5            2.25
5           -1.5            2.25
5           -1.5            2.25
6           -0.5            0.25
6           -0.5            0.25
7           0.5             0.25
7           0.5             0.25
8           1.5             2.25
8           1.5             2.25
8           1.5             2.25
8           1.5             2.25

$\sum (x - \bar{x}) = 0$ and $\sum (x - \bar{x})^2 = 19$
Diver A: $\sigma_n = \sqrt{\dfrac{19}{12}} = 1.26$ (2 d.p.)

Diver B ($\bar{x} = 7$)
Score (x)   $x - \bar{x}$   $(x - \bar{x})^2$
4           -3              9
5           -2              4
5           -2              4
6           -1              1
6           -1              1
6           -1              1
7           0               0
8           1               1
9           2               4
9           2               4
9           2               4
10          3               9

$\sum (x - \bar{x}) = 0$ and $\sum (x - \bar{x})^2 = 42$
Diver B: $\sigma_n = \sqrt{\dfrac{42}{12}} = 1.87$ (2 d.p.)
e
3. a
b