GCE AS Level Representation of Data Advantages and Disadvantages of Different Representations of Data
SMIYL
April 2023
• select a suitable way of presenting raw data, and discuss the advantages
and/or disadvantages that particular representations may have
Representation of Data
We will learn how to represent data using stem-and-leaf diagrams,
box-and-whisker plots, histograms and cumulative frequency graphs.
We will also learn to calculate measures of central tendency and
variation, and explore the advantages and disadvantages of particular
representations and statistical measures.
Box and Whisker Plot
Advantages
• You can compare two or more sets of data by drawing on the same diagram
Disadvantages
• It does not show frequencies
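A box-and-whisker plot is drawn from five values: the minimum, lower quartile, median, upper quartile and maximum. As a rough sketch, the Python below computes these for the two basketball clubs whose heights are quoted in question 3 of this handout. The function name `five_number_summary` and the quartile convention (median of each half, leaving out the middle value when the number of values is odd) are assumptions made for illustration; other conventions can give slightly different quartiles.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, lower quartile, median, upper quartile, max).

    Assumed convention: quartiles are the medians of the lower and upper
    halves, with the middle value excluded when the count is odd.
    """
    s = sorted(values)
    n = len(s)
    half = n // 2
    lower = s[:half]
    upper = s[half + 1:] if n % 2 else s[half:]
    return (s[0], median(lower), median(s), median(upper), s[-1])

# Heights in cm, copied from question 3 of this handout
amazons = [205, 198, 181, 182, 190, 215, 201, 178, 202, 196, 184]
giants = [175, 182, 184, 187, 189, 192, 193, 195, 195, 195, 204]

amazons_summary = five_number_summary(amazons)
giants_summary = five_number_summary(giants)

print("Amazons", amazons_summary)  # Amazons (178, 182, 196, 202, 215)
print("Giants", giants_summary)    # Giants (175, 184, 192, 195, 204)
```

Setting the two summaries side by side mirrors drawing both plots on one scale. The summaries show centre and spread, but nothing about how many players share each height, which is the disadvantage listed above.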
Histogram
Advantages
• It can represent groups of different widths
• It shows whether the distribution is symmetrical or skewed
• The mean and standard deviation can be estimated from the histogram
Disadvantages
• The visual impact can be altered by using different scales
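To illustrate two of the points above, the sketch below computes frequency densities for classes of unequal width (frequency divided by class width, so that bar area represents frequency) and estimates the mean and standard deviation from class midpoints. The grouped data and the function names are hypothetical, chosen only for illustration.

```python
import math

# Hypothetical grouped data: (lower boundary, upper boundary, frequency)
classes = [(0, 10, 4), (10, 20, 12), (20, 40, 16), (40, 80, 8)]

def frequency_densities(classes):
    """Height of each histogram bar: frequency / class width."""
    return [f / (hi - lo) for lo, hi, f in classes]

def estimate_mean_sd(classes):
    """Estimate mean and standard deviation using class midpoints."""
    n = sum(f for _, _, f in classes)
    mids = [((lo + hi) / 2, f) for lo, hi, f in classes]
    mean = sum(m * f for m, f in mids) / n
    variance = sum(f * m * m for m, f in mids) / n - mean ** 2
    return mean, math.sqrt(variance)

densities = frequency_densities(classes)
est_mean, est_sd = estimate_mean_sd(classes)

print(densities)           # [0.4, 1.2, 0.8, 0.2]
print(est_mean)            # 29.0
print(round(est_sd, 2))    # 17.58
```

These are only estimates, because every value in a class is treated as if it sat at the midpoint.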
Measures of Central Tendency
Mean
Advantages
• It is calculated using all the data so it represents all the items
Mode
Advantages
• Useful when the most popular category is required, e.g. clothes or shoe
sizes
Disadvantages
• Not very useful for small data sets, or when there are more than two modes
• There may not be a mode
• It may not be representative, e.g. it could be the lowest value
Median
Advantages
• It is not affected by extreme values
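The points about the mode can be checked with Python's standard `statistics` module. This is a minimal sketch: all three data sets are hypothetical, invented only to show a clear mode, a mode that is the lowest value, and a set with no repeated value.

```python
from statistics import mean, median, multimode

# Hypothetical shoe sizes: the mode identifies the most popular size
shoe_sizes = [5, 6, 6, 7, 7, 7, 8, 9]
print(mean(shoe_sizes), median(shoe_sizes), multimode(shoe_sizes))
# 6.875 7.0 [7]

# Hypothetical small set: the mode is the lowest value, so it is
# not representative of the data as a whole
small_set = [3, 3, 8, 9, 10]
print(multimode(small_set))   # [3]

# No value repeats: multimode returns every value, i.e. no useful mode
no_repeat = [1, 2, 3]
print(multimode(no_repeat))   # [1, 2, 3]
```

`multimode` returns a list, so a data set with several modes, or with none that stands out, is visible directly from the length of the result.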
Variation
Range
Advantages
• It is easy to calculate
• It represents the complete spread of the data
Disadvantages
• It is affected by extreme values
Interquartile Range
Advantages
• It is not unduly influenced by extreme values
• It can be used to investigate extreme values
Disadvantages
• It depends only on particular values when the data is ranked
Standard Deviation
Advantages
• It is calculated using all the data and so represents every item
• It is calculated using a mathematical formula so calculators can be
programmed to find it
• It is very useful for further analysis
• It is useful in comparing two sets of data, for example by showing which
is more consistent
Disadvantages
• It can be unduly affected by one or two extreme values
• For a single set of data its value is difficult to interpret
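As a sketch of how the three measures of variation react to an extreme value, the code below compares two hypothetical data sets that differ only in their largest value. The helper name `spread` is an assumption, as is the quartile convention (`method="exclusive"` in `statistics.quantiles`); `pstdev` is used on the assumption that the whole data set is treated as the population.

```python
from statistics import pstdev, quantiles

def spread(values):
    """Return range, interquartile range and standard deviation."""
    q1, _, q3 = quantiles(values, n=4, method="exclusive")
    return {
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "sd": pstdev(values),
    }

# Hypothetical data: identical except that 23 is replaced by 60
typical = [12, 14, 15, 17, 18, 20, 23]
with_extreme = [12, 14, 15, 17, 18, 20, 60]

s_typical = spread(typical)
s_extreme = spread(with_extreme)

print(s_typical)
print(s_extreme)
```

In this example the range and the standard deviation both grow sharply when the extreme value is introduced, while the interquartile range is unchanged.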
4.1 4.2 4.4 4.5 4.6 4.8 5.0 5.2 5.3 5.4
5.5 5.8 6.0 6.2 6.3 6.4 6.6 6.8 6.9 19.4
It is given that the mean is 6.17 and the median is 5.45. Give a reason
why the median is likely to be more suitable than the mean as a measure
of the central tendency for this information.
The value 19.4 appears anomalous: it lies far above the other values,
which all fall between 4.1 and 6.9. This extreme value inflates the mean.
The median depends only on the middle values, so it is not affected by
how large 19.4 is and better represents a typical value.
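The figures quoted in the question can be checked directly. The sketch below uses the 20 values given above and recomputes the mean and median with and without 19.4; the variable names are chosen here for illustration.

```python
from statistics import mean, median

data = [4.1, 4.2, 4.4, 4.5, 4.6, 4.8, 5.0, 5.2, 5.3, 5.4,
        5.5, 5.8, 6.0, 6.2, 6.3, 6.4, 6.6, 6.8, 6.9, 19.4]

# Same data with the extreme value left out
without_extreme = [x for x in data if x != 19.4]

mean_all, median_all = mean(data), median(data)
mean_rest, median_rest = mean(without_extreme), median(without_extreme)

print(round(mean_all, 2), round(median_all, 2))    # 6.17 5.45
print(round(mean_rest, 2), round(median_rest, 2))  # 5.47 5.4
```

Removing 19.4 moves the mean from 6.17 to about 5.47 but shifts the median only from 5.45 to 5.4, which supports choosing the median here.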
50 45 62 30 40 55 110 38 52 60 55 40
3. The heights, in cm, of the 11 basketball players in each of two clubs, the
Amazons and the Giants, are shown below. (9709/52/M/J/21 number 7)
Amazons 205 198 181 182 190 215 201 178 202 196 184
Giants 175 182 184 187 189 192 193 195 195 195 204