Lecture 4 Data Analysis Summary Measures 1

Calculate the coefficients of variation for each operation and compare them.

Uploaded by

Keaobaka Tlagae

DATA ANALYSIS - AVERAGES AND MEASURES OF DISPERSION
Lecture 4-Class Discussion Notes
BM & EBL Year 1
Kelebogile Kenalemang
OUTLINE
• Basic summary measures
• Measures of Location
• The nature of variation
• Measuring variation
• Interpreting the measures
Measures of Location/Measures of Central Tendency
• A measure of central tendency is a value used to represent
the typical or “average” value in a data set.
• The purpose is to identify the location of the centre in a data
set.
• There are three common measures of Central Tendency
1. Mean
2. Median
3. Mode
Mean
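• Mean (average) is the sum of all data values divided by the number
of values in the data set.
• The mean of a sample data set is denoted by $\bar{x}$ and the mean of a
population data set by the Greek letter $\mu$.
• The mean for an ungrouped sample data set is given as follows, where $n$ is the number of values in the sample:
$\bar{x} = \frac{\sum x}{n}$
• The mean for an ungrouped population data set is given as follows, where $N$ is the number of values in the population:
$\mu = \frac{\sum x}{N}$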
Example
• The daily demand for a product over a sample of 20 days is as
follows:

3, 12, 7, 17, 3, 14, 9, 6, 11, 10, 1, 4, 19, 7, 15, 6, 9, 12, 12, 8

• Calculate the sample mean for this data set
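As a quick check, the sample mean for the 20 demand values can be computed directly. This is a minimal Python sketch; the variable names are illustrative and not part of the lecture notes.

```python
# Daily demand for the 20 sampled days (values from the example above)
demand = [3, 12, 7, 17, 3, 14, 9, 6, 11, 10, 1, 4, 19, 7, 15, 6, 9, 12, 12, 8]

# Sample mean = sum of all values / number of values
sample_mean = sum(demand) / len(demand)
print(sample_mean)  # 185 / 20 = 9.25
```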


Calculating the arithmetic mean for a frequency distribution
• The arithmetic mean for a frequency distribution is given as follows:
$\bar{x} = \frac{\sum fx}{\sum f}$

• Where $fx$ is the value $x$ multiplied by its frequency $f$

• $\sum f$ is the sum of the frequencies


Example: Calculating the arithmetic mean for an ungrouped frequency table
• Consider the following ungrouped frequency table that gives the daily number of
sales made by the sales department of a double-glazing company;
No. of Sales Frequency
2 3
3 7
4 9
5 6
6 5
7 2
8 1

• What is the mean number of sales made per day by the company’s sales
department?
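One way to work this example is to sum each value multiplied by its frequency and divide by the total frequency. The Python sketch below does this for the sales table; the names used are illustrative only.

```python
# (number of sales x, frequency f) pairs from the table above
sales_table = [(2, 3), (3, 7), (4, 9), (5, 6), (6, 5), (7, 2), (8, 1)]

total_fx = sum(x * f for x, f in sales_table)  # sum of fx = 145
total_f = sum(f for _, f in sales_table)       # sum of f = 33 days

mean_sales = total_fx / total_f
print(round(mean_sales, 2))  # 4.39 sales per day
```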
Example: Calculating the arithmetic mean for a grouped frequency table

Daily Demand Frequency
0-4 4
5-9 8
10-14 6
15-20 2
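For grouped data the individual values are unknown, so a common approach is to treat every value in a class as if it sat at the class midpoint. The Python sketch below applies that assumption to the table above, taking the class limits exactly as printed; the result is therefore an estimate of the mean rather than an exact figure.

```python
# (lower limit, upper limit, frequency) for each daily-demand class
classes = [(0, 4, 4), (5, 9, 8), (10, 14, 6), (15, 20, 2)]

# Assumption: each class is represented by its midpoint (lower + upper) / 2
total_fx = sum(((lower + upper) / 2) * f for lower, upper, f in classes)
total_f = sum(f for _, _, f in classes)

grouped_mean = total_fx / total_f
print(grouped_mean)  # 171 / 20 = 8.55
```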
Class exercise

Class Intervals Frequency
30-34 4
35-39 7
40-44 14
45-49 23
50-54 16
55-59 9
60-64 4
65-69 1
70-74 1
75-79 0
80-84 1
Median
• The problem with the mean is that it gives equal weight to every value
in the data set, including any extreme values or outliers.
• The median overcomes this problem by taking the middle value of the
data once the values have been arranged in order.
• The median of ungrouped data is found by arranging the values in
ascending or descending order and selecting the value in the middle.
When the number of values is even, the median is the average of the two
middle values.
• Example: Determine the median for the following data set
3, 12, 7, 17, 3, 14, 9, 6, 11, 10, 1, 4, 19, 7, 15, 6, 9, 12, 12, 8
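The steps for the median (sort the values, then pick the middle) can be sketched in Python as follows. The variable names are illustrative; the data set is the one given in the example.

```python
demand = [3, 12, 7, 17, 3, 14, 9, 6, 11, 10, 1, 4, 19, 7, 15, 6, 9, 12, 12, 8]

ordered = sorted(demand)  # arrange the values in ascending order
n = len(ordered)

if n % 2 == 1:
    # odd number of values: take the single middle value
    median = ordered[n // 2]
else:
    # even number of values: average the two middle values
    median = (ordered[n // 2 - 1] + ordered[n // 2]) / 2

print(median)  # the 10th and 11th ordered values are both 9, so 9.0
```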
Mode
• The mean and the median can both be used for numerical data but
not for categorical data.

• The mode can be used for both categorical and numerical data.

• The mode of a set of data values is the value that appears most
frequently.
Example
• Data from a travel-to-work survey are summarized in the frequency
table below. Which is the most typical mode of travel?

Mode of Travel Frequency
Car 9
Bus 4
Cycle 3
Walk 2
Train 2
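Because the mode is simply the category with the highest frequency, it can be read straight off the table. A short Python sketch, with illustrative names:

```python
# Frequency of each mode of travel from the survey table above
travel = {"Car": 9, "Bus": 4, "Cycle": 3, "Walk": 2, "Train": 2}

# The mode is the category with the largest frequency
modal_category = max(travel, key=travel.get)
print(modal_category)  # Car
```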
Measures of Dispersion/Spread/Variability
• An average is not always sufficient to explain how a set of data is
distributed. In addition to a measure of location, a measure of
dispersion can be provided.
• Measures of dispersion tell us how squeezed together or how scattered a
distribution is.
• When a data set has a large dispersion, its values are widely
scattered.
• When the dispersion is small, the values in the data set are squeezed together.
The main measures of dispersion
• The Range
• Quartiles and semi-interquartile range
• The mean deviation
• The Variance and the standard deviation
• The coefficient of variation
The Range
• The Range is the difference between the highest value and the lowest
value in a data set.

• Example: Calculate the range for the following data set;

• 4, 8, 7, 3, 5, 16, 24, 5, 6, 4, 3
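The range needs only the largest and smallest values, as this small Python sketch shows for the data set above (names are illustrative).

```python
values = [4, 8, 7, 3, 5, 16, 24, 5, 6, 4, 3]

# Range = highest value - lowest value
data_range = max(values) - min(values)
print(data_range)  # 24 - 3 = 21
```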
Quartiles and Semi-Interquartile Range
• The quartiles divide the ordered sample into four groups of
equal size.
• Quartiles help us to identify the range within which the central 50% of the
values in the sample occur.
• The lower quartile, Q1, is the value below which 25% of the data set falls.
• The upper quartile, Q3, is the value above which 25% of the data set falls.
• The median, Q2, is the middle value in the data set.
• The semi-interquartile range is half the difference between the upper and lower quartiles, (Q3 - Q1)/2.
Example: Calculate Q1, Q2 and Q3 using the following data set

4, 8, 7, 3, 5, 16, 24, 5, 6, 4, 3
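Several conventions exist for locating quartiles, and they can give slightly different answers. The Python sketch below assumes the common classroom rule that the k-th quartile sits at position k(n + 1)/4 in the ordered data, which is what `statistics.quantiles` does with its default "exclusive" method. Other textbooks or software may use a different rule.

```python
import statistics

values = [4, 8, 7, 3, 5, 16, 24, 5, 6, 4, 3]

# Ordered data: 3, 3, 4, 4, 5, 5, 6, 7, 8, 16, 24 (n = 11)
# Positions (n + 1)/4 = 3, 2(n + 1)/4 = 6 and 3(n + 1)/4 = 9
q1, q2, q3 = statistics.quantiles(values, n=4)
print(q1, q2, q3)  # 4.0 5.0 8.0

# Semi-interquartile range = (Q3 - Q1) / 2
semi_iqr = (q3 - q1) / 2
print(semi_iqr)  # 2.0
```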
The mean deviation
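• The mean deviation is a measure of the average amount by which the
values in a distribution differ from the arithmetic mean.
• For a frequency distribution, the differences are taken as absolute values so that positive and negative differences do not cancel:
$\text{Mean deviation} = \frac{\sum f|x - \bar{x}|}{\sum f}$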
Class exercise
• The hours of overtime worked in a particular quarter by 60 employees
of a company are shown below. Calculate the mean deviation of this
frequency distribution.
Hours Frequency
0-10 3
10-20 6
20-30 11
30-40 15
40-50 12
50-60 7
60-70 6
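One possible way to work this exercise is sketched below in Python. It assumes each class is represented by its midpoint (5, 15, ..., 65 hours), which is the usual approach for grouped data; the variable names are illustrative.

```python
# (lower limit, upper limit, frequency) for each overtime class
overtime = [(0, 10, 3), (10, 20, 6), (20, 30, 11), (30, 40, 15),
            (40, 50, 12), (50, 60, 7), (60, 70, 6)]

# Assumption: every employee in a class worked the class midpoint
midpoints = [(lower + upper) / 2 for lower, upper, _ in overtime]
freqs = [f for _, _, f in overtime]
total_f = sum(freqs)  # 60 employees

# Step 1: arithmetic mean = sum(fx) / sum(f)
mean_hours = sum(x * f for x, f in zip(midpoints, freqs)) / total_f

# Step 2: mean deviation = sum(f * |x - mean|) / sum(f)
mean_deviation = sum(f * abs(x - mean_hours)
                     for x, f in zip(midpoints, freqs)) / total_f

print(mean_hours, mean_deviation)  # 37.0 13.0
```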
The Variance
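• The variance, $\sigma^2$, is the average of the squared deviations from the mean for each
value in a distribution.
• Calculation of the variance for ungrouped data:
$\sigma^2 = \frac{\sum (x - \bar{x})^2}{n}$

• Calculation of the variance for grouped data:
$\sigma^2 = \frac{\sum f(x - \bar{x})^2}{\sum f}$

• The standard deviation is the square root of the variance:
$\sigma = \sqrt{\sigma^2}$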
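To illustrate the grouped calculation, the Python sketch below reuses the overtime table from the mean deviation exercise. That choice of data is an assumption made for illustration only. It divides by the total frequency, treating the variance as the plain average of squared deviations; a sample variance that divides by one less than the total frequency would give a slightly larger figure.

```python
import math

# (lower limit, upper limit, frequency), reused from the overtime table
overtime = [(0, 10, 3), (10, 20, 6), (20, 30, 11), (30, 40, 15),
            (40, 50, 12), (50, 60, 7), (60, 70, 6)]

# Assumption: each class is represented by its midpoint
midpoints = [(lower + upper) / 2 for lower, upper, _ in overtime]
freqs = [f for _, _, f in overtime]
total_f = sum(freqs)

mean_hours = sum(x * f for x, f in zip(midpoints, freqs)) / total_f

# Variance = sum(f * (x - mean)^2) / sum(f)
variance = sum(f * (x - mean_hours) ** 2
               for x, f in zip(midpoints, freqs)) / total_f

# Standard deviation = square root of the variance
std_dev = math.sqrt(variance)
print(variance, std_dev)  # 256.0 16.0
```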


Class exercise
• Calculate the variance and the standard deviation of the following
frequency distribution;
Coefficient of Variation
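• If two sets of data have similar means, it is easy to compare their
variation by comparing their standard deviations.

• When the means differ, the coefficient of variation (CV) compares the spreads of two frequency
distributions by expressing each standard deviation relative to its mean:
$CV = \frac{\text{standard deviation}}{\text{mean}} \times 100\%$

• The bigger the coefficient of variation, the wider the spread relative to the mean.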


Example
• Suppose that two sets of data, A and B have the following means and
standard deviations;
A B
Mean 120 125
Standard Deviation 50 51

• Calculate the CV and compare the variation
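The comparison can be sketched in Python as below, taking the coefficient of variation as the standard deviation divided by the mean and expressed as a percentage. The function name is hypothetical and chosen for this illustration.

```python
def coefficient_of_variation(std_dev, mean):
    """Return the standard deviation as a percentage of the mean."""
    return std_dev / mean * 100

cv_a = coefficient_of_variation(50, 120)  # data set A
cv_b = coefficient_of_variation(51, 125)  # data set B
print(round(cv_a, 1), round(cv_b, 1))  # 41.7 40.8
```

Although B has the larger standard deviation, A has the slightly larger coefficient of variation, so A is the more widely spread relative to its mean.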


Class exercise
• A hospital is comparing how long patients wait for two types of
operation. For bypass surgery the mean wait is 17 weeks with a
standard deviation of 6 weeks, while for hip replacement the mean is
11 months with a standard deviation of 1 month.
• Which operation has the higher variability?
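Because the coefficient of variation is a ratio of two quantities in the same unit, the weeks and months cancel and the two operations can be compared directly. A minimal Python sketch, using the figures as stated in the exercise and illustrative names:

```python
# (mean wait, standard deviation), each pair in its own unit
waits = {
    "bypass surgery": (17, 6),   # weeks
    "hip replacement": (11, 1),  # months
}

# CV = standard deviation / mean, as a percentage (unit-free)
cv = {op: sd / mean * 100 for op, (mean, sd) in waits.items()}
for op, value in cv.items():
    print(op, round(value, 1))  # bypass surgery 35.3, hip replacement 9.1

most_variable = max(cv, key=cv.get)
print(most_variable)  # bypass surgery
```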
