
Standardising Categories

In: Starting Statistics: A Short, Clear Guide

By: Neil Burdess


Pub. Date: 2013
Access Date: June 14, 2019
Publishing Company: SAGE Publications Ltd
City: London
Print ISBN: 9781849200981
Online ISBN: 9781446287873
DOI: https://dx.doi.org/10.4135/9781446287873
Print pages: 34-39
© 2010 SAGE Publications Ltd All Rights Reserved.
This PDF has been generated from SAGE Research Methods. Please note that the pagination of the
online version will vary from the pagination of the print book.
SAGE Research Methods
2010 SAGE Publications, Ltd. All Rights Reserved.

Standardising Categories

Chapter Overview

This chapter will:

• Recap earlier comments about using percentages with categorical data.
• Show how to calculate proportions.
• Show how to calculate ratios.
• Illustrate how percentages, proportions, and ratios can be misleading. In these circumstances, rates of occurrence can give more meaningful results.

Recall that the term standardisation refers to changing original values to make them easier to understand
and compare. The previous chapter introduced standardisation. This chapter recaps some of the earlier
comments about percentages, and then looks at some of the other ways to standardise categorical data:
proportions, ratios, and rates of occurrence.

Percentages

A percentage standardises information per 100. It allows you to compare two numbers by standardising one
of them to every 100 units in the other. For example, there are 250 Arts students in a class of 1000 students.
What is the percentage of Arts students in the class? Because percentages standardise information per 100,
you have to find out how many hundreds of students there are in the class. Clearly, there are 10 lots of 100
in 1000. If the 250 Arts students were divided equally among each of these 10 groups of 100 students, there
would be 25 Arts students in each group. In other words, Arts students make up 25% of all students. This line
of reasoning is reflected in the following calculation:

$$\text{Percentage} = \frac{\text{Category frequency}}{\text{Total frequency}} \times 100$$

In the current example, the category is Arts students, and the category frequency is 250. As there are 1000
students altogether, the total frequency is 1000. The percentage calculation now becomes:

$$\text{Percentage of Arts students} = \frac{250}{1000} \times 100 = 25\%$$
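
The same reasoning can be sketched in a few lines of Python. This is only a minimal illustration, not part of the original text; the function name `percentage` and its argument names are assumptions introduced here.

```python
def percentage(category_frequency, total_frequency):
    """Standardise a category count per 100 units of the total."""
    if total_frequency <= 0:
        raise ValueError("total frequency must be positive")
    return category_frequency / total_frequency * 100


# Arts students example from the text: 250 of 1000 students.
arts_pct = percentage(250, 1000)
print(f"Arts students: {arts_pct:.0f}%")
```

Keeping the multiplication by 100 as the final step mirrors the 'per 100' standardisation described in the text.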

Table 5.1 shows several real-world percentage calculations. For example, Table 5.1(b) shows that 44 of the
192 UN members are in Europe. The result of the percentage calculation shows that 23 of every 100 UN
members are in Europe. In other words, 23% of UN members are European countries. This figure is very
similar to the European percentage for 1945 (25%), despite membership of the UN having almost quadrupled.
What the percentage figures do not tell you is that several European countries in existence in 1945, in
particular the Soviet Union and Yugoslavia, later divided into smaller independent states, and then became
UN members.
Table 5.1 Percentages: UN members, by continent, 1945 and 2009

Proportions

You can standardise by any number you want; it doesn't have to be 100. For example, you could use the
permillage system, and standardise by 1000 (mille is Latin for thousand). There is even a permillage symbol
(‰). However, this system has not caught on, mainly I suspect because it requires working with numbers up
to 999 – which are too large for many people.

Of course, you could go the other way, and standardise to less than 100. For example, 10 seems a nice
round number. But such a system is likely to be too crude for many situations, unless you add a decimal point
– which spoils the simplicity. However, the use of proportions is common in statistics. This system involves
standardising ‘per 1’ or ‘per unit’. Calculate a proportion exactly as for a percentage, but without the final
multiplication by 100:

$$\text{Proportion} = \frac{\text{Category frequency}}{\text{Total frequency}}$$

For example, there are 44 UN members in Europe out of a total of 192 UN members. Thus, the proportion of
UN members located in Europe is:

$$\text{Proportion} = \frac{44}{192} \approx 0.23$$
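
As a rough sketch of the same calculation in Python (the function name `proportion` is an assumption made for illustration), the only difference from a percentage is that the final multiplication by 100 is left out:

```python
def proportion(category_frequency, total_frequency):
    """Standardise a category count per 1 unit of the total."""
    if total_frequency <= 0:
        raise ValueError("total frequency must be positive")
    return category_frequency / total_frequency


# UN example from the text: 44 European members out of 192.
europe = proportion(44, 192)
print(f"Proportion: {europe:.2f}")
print(f"Percentage: {europe * 100:.0f}%")
```

Multiplying the proportion by 100 recovers the percentage, which is why the two measures always tell the same story.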

Table 5.2 shows this and other proportion values. Proportions are commonplace in statistics books, and
it's important to be aware of them. However, you rarely find proportions anywhere else, mainly because
proportions lie well outside the 1 to 100 comfort zone of many people.
Table 5.2 Percentages, proportions, and ratios: UN member states, by continent, 2009

Behind the Stats

One exception to the rarity of proportions in the mass media is tables of standings in
US sports. For example, basketball fans would immediately understand that 0.500 shows
that their team had won half of its matches. However, in a standings table, the column
showing proportions is usually headed ‘Pct’, for Percentage. And the proportion is usually
referred to as a whole number – for example, ‘I'm pleased my team has reached 500 in
the standings’.

Ratios

At times, the level of detail provided by percentages and proportions is more than you really need. For
example, you might say that approximately 1 in 4 UN members are Asian states. This is a ratio. A ratio
shows how many individuals there are in the total for every one in a particular category. Calculate a ratio as
follows:

$$\text{Ratio} = 1 \text{ in } \frac{\text{Total frequency}}{\text{Category frequency}}$$

For example, there are 46 UN members in the category ‘Asia’ out of a total of 192 UN members. Thus, the
ratio of UN members located in Asia is:

$$\text{Ratio} = 1 \text{ in } \frac{192}{46} \approx 1 \text{ in } 4.2$$

You often see ratios presented as whole numbers in sentences. Thus, you are likely to describe the 1 in 4.2
ratio as ‘just over 1 in 4 UN members are from Asia’. The Ratio column of Table 5.2 shows the ratios for all
continents.
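
A minimal Python sketch of the ratio calculation follows. The helper name `ratio_one_in` is hypothetical; it returns the X in '1 in X' by dividing the total by the category frequency, then shows the whole-number version you would use in a sentence.

```python
def ratio_one_in(category_frequency, total_frequency):
    """Return X such that the category makes up '1 in X' of the total."""
    if category_frequency <= 0:
        raise ValueError("category frequency must be positive")
    return total_frequency / category_frequency


# UN example from the text: 46 Asian members out of 192.
asia = ratio_one_in(46, 192)
print(f"Exact ratio: 1 in {asia:.1f}")
print(f"In a sentence: just over 1 in {round(asia)}")
```

Note that a ratio is simply the reciprocal of the proportion, so a larger X means a smaller share of the total.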

Rates of Occurrence

Percentages, proportions, and ratios can sometimes be misleading. For example, one column of Table 5.3
shows the number of traffic fatalities in 15 European countries; the next column shows the associated
percentage values. Over three-quarters (77.1%) of all fatalities occur in just five countries: Italy, Germany,
France, Spain, and the UK. On the other side of the coin, less than one-quarter (22.9%) of all fatalities occur
in the other 10 countries.

Table 5.3 Percentages and rates of occurrence: West European states, road fatalities

The impression given by these percentages is that the roads in Italy, Germany, France, Spain, and the UK
are much more dangerous than roads elsewhere in Western Europe. But there are more people – and thus
more traffic – in these countries, and you might expect that the more traffic there is, the more traffic accidents
there will be.

You can avoid such misleading conclusions by using a rate of occurrence. The major difference between this
and the other statistics in this chapter is that rates of occurrence bring a second variable into the calculations.
For example, you can standardise the number of traffic fatalities in each country by its total population, thus
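getting over the problem that countries vary greatly in size. In the UK, 3297 people out of a total
population of just over 60 million were killed in road accidents. Thus, the UK's rate of occurrence is as follows:

$$\text{Rate of occurrence} = \frac{\text{Number of fatalities}}{\text{Total population}} = \frac{3297}{\text{just over } 60{,}000{,}000} \approx 0.000054$$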

This rate is much too small to be readily understandable. The way to get around the problem is to express
the rate of occurrence as a ‘rate per so many’, the so many being whatever number provides a readily
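understandable set of figures. For example:

$$\begin{aligned}
\text{Rate per 100 population} &= 0.000054 \times 100 = 0.0054 \\
\text{Rate per 1000 population} &= 0.000054 \times 1000 = 0.054 \\
\text{Rate per 100,000 population} &= 0.000054 \times 100{,}000 = 5.4 \\
\text{Rate per million population} &= 0.000054 \times 1{,}000{,}000 = 54
\end{aligned}$$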

The rates of occurrence per hundred and per thousand population still produce values that are much less
than 1, and thus are too small to be readily understandable. In contrast, the rate per 100,000 is 5.4, and the
rate per million population is 54, both of which fall within the preferred range of 1 to 100. Table 5.3 uses the
rate per million to show road fatalities for 15 West European countries.
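
The effect of choosing different 'per so many' bases can be sketched in Python. The text gives the UK population only as 'just over 60 million', so the figure of 60.6 million used below is an assumption chosen for illustration, as is the function name `rate_per`.

```python
UK_FATALITIES = 3297
# Assumption: the text says only "just over 60 million";
# 60.6 million is a placeholder, not a figure from the source.
UK_POPULATION = 60_600_000


def rate_per(occurrences, population, per):
    """Rate of occurrence standardised per `per` members of the population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return occurrences / population * per


# Try several bases and see which gives a value between 1 and 100.
for per in (100, 1_000, 100_000, 1_000_000):
    rate = rate_per(UK_FATALITIES, UK_POPULATION, per)
    print(f"per {per:>9,}: {rate:.4g}")
```

Under this assumed population, only the last two bases give values inside the 1 to 100 range, in line with the 5.4 and 54 quoted in the text.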

Clearly, the rate per million population shows a very different picture from the original percentages. This is
because the rates are not affected by the very different population totals between countries. For example, the
UK has relatively safe roads, with a rate of 54 deaths per million population, and Germany is not far behind
(62). In contrast, both countries with fatality rates over 100 per million have relatively small populations (and
thus have quite small percentage values). Table 5.3 shows that Greece stands out as the country with the
most dangerous roads, with a fatality rate of 149 per million population, more than twice that of Germany.
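
Because rates are already standardised, they can be ranked and compared directly. The short Python sketch below uses only the three rates quoted in the text; the variable names are illustrative assumptions.

```python
# Road fatality rates per million population quoted in the text.
rates_per_million = {"UK": 54, "Germany": 62, "Greece": 149}

# Rank from the lowest rate (safest roads) to the highest.
ranked = sorted(rates_per_million.items(), key=lambda item: item[1])
for country, rate in ranked:
    print(f"{country}: {rate} per million")

# How many times higher is the Greek rate than the German rate?
greece_vs_germany = rates_per_million["Greece"] / rates_per_million["Germany"]
print(f"Greece / Germany = {greece_vs_germany:.1f}")
```

A comparison like this would not be meaningful with the raw fatality counts, since those also reflect how many people live in each country.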

Behind the Stats

An Irish politician has suggested that Ireland should give up driving on the left-hand side
of the road to reduce accidents by foreigners used to driving on the right. In Europe, only
the UK and Ireland still drive on the left. Sweden was the last European state to change
from left to right, in 1967. The main reason why the right-hand side became the standard
across Europe was that Napoleon decreed that conquered countries use the same side of
the road as France. Similarly, largely because of the global extent of the British Empire,
worldwide about one-third of drivers use the left-hand side. There is little evidence that
driving on the left was ever widespread among British colonies in America. In Canada,
however, British Columbia and the Atlantic provinces switched to the right in the 1920s,
and Newfoundland switched in 1949 when it joined Canada. See Lucas (2005) and Kincaid
(1986).

This chapter focused on standardising categorical variables. The next, Chapter 6, looks at ways to
standardise numerical variables. The chapter starts with simplifying original values, then recaps and develops
some earlier comments about using ranks, and finally looks at two more specialised techniques.

http://dx.doi.org/10.4135/9781446287873.n5
