Week 7 - Topic Overview
Introduction
In week 6, we discussed the differences between primary and secondary data and between
quantitative, qualitative and mixed methods research designs. We looked at how to identify the
full variety of available data and which secondary data to choose in order to answer our
research questions and objectives. We also covered the main advantages and disadvantages of
using secondary data, the range of techniques for searching for them, and how to evaluate
them. This week, we will learn how to find patterns in data and how to present data with
diagrams and tables. If you think of data as just a bunch of numbers, you might be surprised to
learn that there are several classifications, or levels, of measurement. Knowing which category
your data belongs to is critical because it determines the type of statistical analysis you can
perform. Data analysis is the process by which a researcher discovers relationships and gains an
understanding of what the information gathered during data collection truly means and how it
is relevant (Albers, 2017). Designing a valid and reliable study requires critical consideration of
why the data measurements are needed (Velleman & Wilkinson, 1993).
As mentioned in the introduction, there are several classifications, or levels, of data
measurement. Which level applies depends on the kind of secondary data involved.
There are four levels of measurement. These are:
1. Continuous Data: Data that is measured on a scale, such as weight or temperature. The scale
can be subdivided into as many intervals as required, depending on the accuracy of the
4. Nominal Data: Data that does not have a numerical value and can only be placed in a suitable
category (Oakshott, 2016; Cliff, 2014). Nominal data can be qualitative as well as quantitative.
Words, letters, and symbols may be included (Cliff, 2014). People's names, gender, and
nationality are some of the most common examples of nominal data. The only thing we can do
with nominal data is categorize it.
Ordinal and nominal data are usually referred to as categorical data.
Tabulation of data
Let’s look at an example to see how the data are used.
A small survey was carried out into the mode of travel to work. The information below relates
to a random sample of 20 employed individuals.
Person  Mode of Travel    Person  Mode of Travel
1       Car               11      Car
2       Car               12      Bus
3       Bus               13      Walk
4       Car               14      Car
5       Walk              15      Train
6       Cycle             16      Bus
7       Car               17      Car
8       Cycle             18      Cycle
9       Bus               19      Car
10      Train             20      Car
The frequency of each category is simply the number of times it appeared. The relative
frequency has been calculated in addition to the actual frequency. Relative frequency is the
frequency expressed as a percentage: divide each frequency by the total frequency and
multiply by 100. The percentages therefore sum to 100; if the multiplication by 100 is omitted,
the resulting proportions sum to 1 instead. The order in which you write the categories down is
not essential, although ordering by descending size of frequency makes comparison clearer.
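The tabulation described above can be sketched in a few lines of Python. This is only an illustration using the 20 survey responses listed earlier; the function name frequency_table is an assumption made for this example and does not come from the course material.

```python
from collections import Counter

# Mode of travel for the 20 people in the survey, in person order (1 to 20).
modes = [
    "Car", "Car", "Bus", "Car", "Walk", "Cycle", "Car", "Cycle", "Bus", "Train",
    "Car", "Bus", "Walk", "Car", "Train", "Bus", "Car", "Cycle", "Car", "Car",
]


def frequency_table(values):
    """Return (category, frequency, relative frequency in %) rows,
    ordered by descending frequency."""
    counts = Counter(values)
    total = sum(counts.values())
    return [(cat, n, 100 * n / total) for cat, n in counts.most_common()]


table = frequency_table(modes)
for category, freq, rel in table:
    print(f"{category:<6} {freq:>3} {rel:>6.1f}%")
```

Dividing by the total without multiplying by 100 would give proportions that sum to 1 rather than percentages that sum to 100.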
Let’s see another example. Below you will see the number of foreign holidays sold by a travel
agent over the past four weeks.
No. sold on each day, by week:
Day         Week 1   Week 2   Week 3   Week 4
Monday      10       13       11       11
Tuesday     12       10       18       13
Wednesday   9        12       10       10
Thursday    10       8        10       14
Friday      22       12       11       13
Saturday    14       12       9        12
Can the travel agent sell a fraction of a holiday? Assuming that a holiday is a holiday regardless
of length or cost, this is clearly discrete data that would have been obtained by counting.
Examining the figures, you should notice that 10 sales occur the most frequently, with a range
of 8 to 22 sales. You could aggregate the data into a table to make this information more visible:
Number sold Frequency
8 1
9 2
10 6
11 3
12 5
13 3
14 2
More than 14 2
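The aggregation above can be reproduced with a short Python sketch. The function name ungrouped_frequencies and the cap parameter are assumptions made for this illustration; the sales figures are the 24 daily values from the travel agent example.

```python
from collections import Counter

# Holidays sold each day (Monday to Saturday) in each of the four weeks.
sales = [
    10, 12, 9, 10, 22, 14,   # first week
    13, 10, 12, 8, 12, 12,   # second week
    11, 18, 10, 10, 11, 9,   # third week
    11, 13, 10, 14, 13, 12,  # fourth week
]


def ungrouped_frequencies(values, cap):
    """Count each distinct value up to `cap` and pool larger values into one class."""
    counts = Counter(v for v in values if v <= cap)
    rows = [(str(v), counts[v]) for v in sorted(counts)]
    rows.append((f"More than {cap}", sum(1 for v in values if v > cap)))
    return rows


rows = ungrouped_frequencies(sales, cap=14)
for label, freq in rows:
    print(f"{label:<13} {freq}")
```

Pooling the two unusually high days (18 and 22 sales) into a single "More than 14" class keeps the table short without hiding that 10 is the most frequent value.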
The sales by department of a high street store over the past three years are shown in the table
below:
The table above shows that total sales have increased over the last three years, despite a decline
in clothing sales. Diagrams should aid in highlighting these and other differences.
The complete pie chart for the sales for 2014 is shown in the figure below. This diagram
demonstrates that the furniture department has contributed the bulk of the total sales for this
year. (Note: adjustments have been made to allow for rounding errors.)
Figure: Pie chart of sales for 2014 (Electrical Goods 6%, Clothing 19%, Furniture 75%)
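As a rough sketch of how the sectors of such a pie chart are sized, each percentage share can be converted into an angle out of 360 degrees. The shares below are read from the figure, assuming that each label pairs with the percentage printed next to it; the function name sector_angles is hypothetical.

```python
# Percentage shares read from the 2014 pie chart (label and percentage
# pairing is an assumption based on the order in which they appear).
shares = {"Electrical Goods": 6, "Clothing": 19, "Furniture": 75}


def sector_angles(percentages):
    """Convert percentage shares into pie-chart sector angles in degrees."""
    total = sum(percentages.values())
    return {name: 360 * pct / total for name, pct in percentages.items()}


angles = sector_angles(shares)
for name, angle in angles.items():
    print(f"{name:<17} {angle:6.1f} degrees")
```

Dividing by the actual total rather than by 100 keeps the angles adding to a full circle even when rounded percentages do not sum to exactly 100.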
Bar Charts
Although pie charts are a popular way to compare the size of categories, they have the
disadvantage of not being suitable for displaying multiple sets of data at the same time. You
would, for instance, need three separate pie charts to represent the data in the table above (see
Table: Sales by department and year). Another effective way to display categorical data or an
ungrouped frequency table is with a simple bar chart. A vertical bar is drawn for each category,
with the height proportional to the frequency. The figure below illustrates total sales (in
millions of dollars) in the form of a simple bar chart.
Figure: Simple bar chart of total sales by year (in millions of dollars)
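The idea that each bar is proportional to the value it represents can be illustrated with a minimal text-only sketch. The totals used here (5.3, 6.7 and 7.5 million dollars) are the values labelled on the line graph later in this section; the function name text_bar_chart and the width parameter are assumptions, and the bars are drawn horizontally purely to keep the example simple.

```python
# Total sales in millions of dollars, as labelled on the line graph
# later in this section.
totals = {2012: 5.3, 2013: 6.7, 2014: 7.5}


def text_bar_chart(data, width=30):
    """Return one line per category, with a bar whose length is
    proportional to the category's value (longest bar = `width` characters)."""
    largest = max(data.values())
    lines = []
    for label, value in data.items():
        bar = "#" * round(width * value / largest)
        lines.append(f"{label} | {bar} {value}")
    return lines


chart = text_bar_chart(totals)
print("\n".join(chart))
```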
When a category is subdivided into several subcategories, the simple bar chart is insufficient
because each subcategory requires a different bar chart. A multiple bar chart is used when you
want to see changes in the components but not the totals. The figure below is a multiple bar
chart (in millions of dollars) for the data in the table above (see Table: Sales by department
and year).
Figure: Multiple bar chart of sales by department (Clothing, Furniture, Electrical Goods) for 2012, 2013 and 2014
A component bar chart is used if you want to compare totals and see how totals are made up.
The figure below is a component bar chart for the data in the table above (see Table: Sales by
department and year). This graph depicts the variation in total sales from year to year (in
millions of dollars), as well as how each department contributes to total sales.
Figure: Component bar chart of sales by department (Electrical Goods, Furniture, Clothing) for 2012, 2013 and 2014
Figure: Percentage component bar chart of sales by department for 2012, 2013 and 2014
Line graphs
When data is in the form of a time series, a line graph can be a useful means of showing any
trends in the data.
Figure: Line graph of total sales, in millions of dollars (2012: $5.3, 2013: $6.7, 2014: $7.5)
The figure above is a line graph of the total sales given in the table above (see Table: Sales by
department and year); it shows the rise in sales over the three years from 2012 to 2014. When
this type of diagram is shown in company publications, the scale on the y-axis is frequently
broken, that is, it does not start at zero. A broken scale exaggerates changes in sales or other
measures and can be misleading unless you are aware of what is going on. A break can be
justified if none of the values are close to zero; however, the break in scale should then be
clearly visible.
You should be able to mention that:
1. Total sales have increased over the three years, although the largest increase was
between 2012 and 2013.
2. Most of this increase has been the result of sales of furniture.
3. Clothing has shown a decrease in sales from 2012 to 2013 but has then remained steady.
4. The sales of clothing as a proportion of total sales have declined, while the proportion
of electrical sales has increased.
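The first observation in the list above can be checked directly from the totals labelled on the line graph. The sketch below is only an illustration; the function name yearly_changes is an assumption, and the results are rounded to one decimal place to avoid floating-point noise.

```python
# Total sales in millions of dollars, as labelled on the line graph.
totals = {2012: 5.3, 2013: 6.7, 2014: 7.5}


def yearly_changes(series):
    """Return {(earlier year, later year): change} for consecutive years,
    rounded to one decimal place."""
    years = sorted(series)
    return {
        (a, b): round(series[b] - series[a], 1)
        for a, b in zip(years, years[1:])
    }


changes = yearly_changes(totals)
largest = max(changes, key=changes.get)

for (start, end), change in changes.items():
    print(f"{start} to {end}: {change:+.1f} million dollars")
print("Largest increase:", largest)
```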
A new worksheet is a grid of rows and columns. The rows are labeled with numbers, and the
columns are labeled with letters. Each intersection of a row and a column is a cell. Each cell
has an address, which is the column letter and the row number. The arrow on the worksheet to
the right points to cell A1, which is currently highlighted, indicating that it is an active cell. A
cell must be active to enter information into it. To highlight (select) a cell, click on it.
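In older versions of Excel (the .xls format), one worksheet can have up to 256 columns and
65,536 rows; current versions allow 16,384 columns and 1,048,576 rows. Either way, it'll be a
while before you run out of space.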
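The addressing scheme, a column letter followed by a row number, can be sketched as follows. Once the single letters A to Z are used up, columns continue with AA, AB and so on, so the 256th column is IV. The function names column_letters and cell_address are assumptions made for this illustration and are not part of Excel.

```python
def column_letters(col):
    """Convert a 1-based column number to spreadsheet letters (1 -> A, 27 -> AA)."""
    if col < 1:
        raise ValueError("column numbers start at 1")
    letters = ""
    while col > 0:
        # Shift to 0-based before dividing so that 26 maps to Z, not to A0.
        col, remainder = divmod(col - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def cell_address(col, row):
    """Build a cell address such as A1 from 1-based column and row numbers."""
    return f"{column_letters(col)}{row}"


print(cell_address(1, 1))        # first cell of the worksheet
print(cell_address(4, 6))        # column D, row 6
print(cell_address(256, 65536))  # last cell of a 256-column, 65,536-row sheet
```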
Very few people draw charts by hand these days, as it is much easier to use a spreadsheet such
as Excel. Charts produced by a spreadsheet also look more professional, and they can be
updated immediately if the data changes. When drawing charts in Excel, you can choose
whether to create the chart as an object in the same worksheet as the data or to create the chart
in a new sheet. Within each tab, commands are grouped logically, so in the Insert tab there is
a Charts group which contains all the charts.
1. Pie Chart
Highlight cells A6 to A8 and, while holding down the <Ctrl> key on your keyboard,
highlight cells D6 to D8. Click on the Insert tab, then Pie, and choose the one you want.
Click on the one you want.
Click on Edit and highlight cells B4 to D4.
Figure: Multiple bar chart of sales by department (Clothing, Furniture, Electrical Goods) for 2012, 2013 and 2014
A component bar chart is called a Stacked Column chart in Excel. It can be found under More
Charts in the Quick Analysis gallery. Proceed exactly as before. The final chart can be seen
below.
Figure: Stacked Column chart of sales by department (Electrical Goods, Furniture, Clothing) for 2012, 2013 and 2014
Figure: Percentage stacked column chart of sales by department for 2012, 2013 and 2014
6. Line graph
To display a line graph, highlight the totals as for the simple bar chart. In the Design ribbon,
select the line chart with markers. The chart can be seen in the figure below.
Figure: Line graph of total sales, in millions of dollars (2012: $5.3, 2013: $6.7, 2014: $7.5)
Cliff, N. (2014). Ordinal methods for behavioral data analysis. Psychology Press.
Elliott, A. C., Hynan, L. S., Reisch, J. S., & Smith, J. P. (2006). Preparing data for analysis using Microsoft
Excel. Journal of investigative medicine, 54(6), 334-341.
Howell, D. C. (2012). Statistical methods for psychology. Cengage Learning.
McCue, C. (2014). Data mining and predictive analysis: Intelligence gathering and crime analysis.
Butterworth-Heinemann.
Mers, A. (Ed.). (2008). Useful pictures. Whitewalls Incorporated.
Oakshott, L. (2016). Essential Quantitative Methods (1st ed., pp. 50-87). UK: Palgrave Macmillan.
Umoquit, M. J., Dobrow, M. J., Lemieux-Charles, L., Ritvo, P. G., Urbach, D. R., & Wodchis, W. P.
(2008). The efficiency and effectiveness of utilizing diagrams in interviews: an assessment of
participatory diagramming and graphic elicitation. BMC Medical Research
Methodology, 8(1), 1-12.
Umoquit, M., Tso, P., Varga-Atkins, T., O’Brien, M., & Wheeldon, J. (2013). Diagrammatic
elicitation: Defining the use of diagrams in data collection. The Qualitative Report, 18(30),
1-12.
Velleman, P. & Wilkinson, L. (2011). Nominal, Ordinal, Interval, and Ratio Typologies are
Misleading. In I. Borg & P. Mohler (Ed.), Trends and Perspectives in Empirical Social
Research (pp. 161-177). Berlin, New York: De Gruyter.
Wheeldon, J. (2011). Is a Picture Worth a Thousand Words? Using Mind Maps to Facilitate
Participant Recall in Qualitative Research. Qualitative Report, 16(2), 509-522.