Lecture 4 Data Analysis Summary Measures 1

Uploaded by

Keaobaka Tlagae
DATA ANALYSIS: AVERAGES AND MEASURES OF DISPERSION
Lecture 4-Class Discussion Notes
BM & EBL Year 1
Kelebogile Kenalemang
OUTLINE
• Basic summary measures
• Measures of Location
• The nature of variation
• Measuring variation
• Interpreting the measures
Measures of Location/Measures of Central Tendency
• A measure of central tendency is a value used to represent the typical or “average” value in a data set.
• The purpose is to identify the location of the centre in a data set.
• There are three common measures of central tendency:
1. Mean
2. Median
3. Mode
Mean
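• The mean (average) is the sum of all data values divided by the number of values in the data set.
• The mean of a sample data set is denoted by $\bar{x}$ and the mean of a population data set by the Greek letter $\mu$.
• The mean for an ungrouped sample data set is given as follows:
  $\bar{x} = \frac{\sum x}{n}$, where $n$ is the number of values in the sample
• The mean for an ungrouped population data set is given as:
  $\mu = \frac{\sum x}{N}$, where $N$ is the number of values in the population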
Example
• The daily demand for a product over a sample of 20 days is as follows:

3, 12, 7, 17, 3, 14, 9, 6, 11, 10, 1, 4, 19, 7, 15, 6, 9, 12, 12, 8

• Calculate the sample mean for this data set
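The calculation asked for here can be sketched in a few lines of Python. This is only an illustration of "sum of values divided by number of values"; the variable names are chosen for this sketch and are not part of the lecture.

```python
# Daily demand over the 20 sampled days (values copied from the example above).
demand = [3, 12, 7, 17, 3, 14, 9, 6, 11, 10, 1, 4, 19, 7, 15, 6, 9, 12, 12, 8]

n = len(demand)          # number of values in the sample
total = sum(demand)      # sum of all data values
sample_mean = total / n  # mean = sum of values / number of values

print(f"n = {n}, sum = {total}, mean = {sample_mean}")
# prints: n = 20, sum = 185, mean = 9.25
```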


Calculating the arithmetic mean for a frequency distribution
• The arithmetic mean for a frequency distribution is given as follows:
  $\bar{x} = \frac{\sum fx}{\sum f}$
• Where $fx$ is each value $x$ multiplied by its frequency $f$
• $\sum f$ is the sum of frequencies


Example: Calculating the arithmetic mean for an ungrouped frequency table
• Consider the following ungrouped frequency table that gives the daily number of sales made by the sales department of a double-glazing company:
No. of Sales    Frequency
2               3
3               7
4               9
5               6
6               5
7               2
8               1

• What is the mean number of sales made per day by the company’s sales
department?
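As a rough sketch of the frequency-weighted mean (the sum of value times frequency, divided by the sum of frequencies), the table above can be held in a dictionary. The dictionary layout and variable names are assumptions made for this illustration.

```python
# Ungrouped frequency table: number of sales -> number of days (frequency).
sales_table = {2: 3, 3: 7, 4: 9, 5: 6, 6: 5, 7: 2, 8: 1}

sum_f = sum(sales_table.values())                    # total frequency (number of days)
sum_fx = sum(x * f for x, f in sales_table.items())  # sum of value * frequency
mean_sales = sum_fx / sum_f                          # frequency-weighted mean

print(f"sum f = {sum_f}, sum fx = {sum_fx}, mean = {mean_sales:.2f}")
# prints: sum f = 33, sum fx = 145, mean = 4.39
```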
Example: Calculating the arithmetic mean for a grouped frequency table

Daily Demand    Frequency
0-4             4
5-9             8
10-14           6
15-20           2
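For grouped data the individual values are unknown, so a common approach is to let each class be represented by its midpoint. The sketch below assumes the midpoint is (lower limit + upper limit) / 2 and takes the class limits exactly as printed in the table; both choices are assumptions of this illustration.

```python
# Grouped frequency table: ((lower limit, upper limit), frequency).
classes = [((0, 4), 4), ((5, 9), 8), ((10, 14), 6), ((15, 20), 2)]

# Assumption: each class is represented by the midpoint of its printed limits.
midpoints = [(lower + upper) / 2 for (lower, upper), _ in classes]
freqs = [f for _, f in classes]

sum_f = sum(freqs)                                     # total frequency
sum_fx = sum(f * x for x, f in zip(midpoints, freqs))  # sum of midpoint * frequency
grouped_mean = sum_fx / sum_f                          # estimated mean daily demand

print(midpoints)     # [2.0, 7.0, 12.0, 17.5]
print(grouped_mean)  # 8.55
```

The result is an estimate, since it treats every observation in a class as if it sat at the class midpoint.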
Class exercise

Class Intervals    Frequency
30-34              4
35-39              7
40-44              14
45-49              23
50-54              16
55-59              9
60-64              4
65-69              1
70-74              1
75-79              0
80-84              1
Median
• A weakness of the mean is that every value in the data set carries equal weight, so extreme values (outliers) can pull it away from the typical value.
• The median overcomes this problem: it is the middle value of the data once the values have been arranged in order.
• The median of ungrouped data is found by arranging the values in ascending or descending order and selecting the value in the middle of the ordered list. When the number of values is even, the median is the mean of the two middle values.
• Example: Determine the median for the following data set
3, 12, 7, 17, 3, 14, 9, 6, 11, 10, 1, 4, 19, 7, 15, 6, 9, 12, 12, 8
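A minimal sketch of the procedure: sort the values, then pick the middle one, or average the two middle ones when the count is even (as it is here, with 20 values). The variable names are illustrative only.

```python
# Same 20 demand values as in the mean example.
demand = [3, 12, 7, 17, 3, 14, 9, 6, 11, 10, 1, 4, 19, 7, 15, 6, 9, 12, 12, 8]

ordered = sorted(demand)  # arrange the data in ascending order
n = len(ordered)

if n % 2 == 1:
    # Odd count: a single middle value exists.
    median = ordered[n // 2]
else:
    # Even count: take the mean of the two middle values (10th and 11th here).
    median = (ordered[n // 2 - 1] + ordered[n // 2]) / 2

print(ordered)
print(median)  # 9.0
```

Python's standard library offers `statistics.median`, which follows the same rule for even-sized data.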
Mode
• The mean and the median can both be used for numerical data, but not for categorical data such as names or labels.

• The mode can be used for both categorical and numerical data.

• The mode of a set of data values is the value that appears most
frequently.
Example
• Data from a travel-to-work survey is summarized in the frequency table below. Which is the most typical mode of travel?

Mode of Travel    Frequency
Car               9
Bus               4
Cycle             3
Walk              2
Train             2
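Because the mode only needs counts, it works for categories as well as numbers. A short sketch, assuming the survey table is stored as a dictionary of category to frequency:

```python
# Frequency table from the travel-to-work survey.
travel = {"Car": 9, "Bus": 4, "Cycle": 3, "Walk": 2, "Train": 2}

# The mode is the category with the highest frequency.
modal_category = max(travel, key=travel.get)

print(modal_category, travel[modal_category])  # Car 9
```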
Measures of Dispersion/Spread/Variability
• An average is not always sufficient to explain how a set of data is distributed. In addition to a measure of location, a measure of dispersion can be provided.
• Measures of dispersion tell us how squeezed or how scattered a distribution is.
• When a data set has a large dispersion, its values are widely scattered.
• When the dispersion is small, the values in the data set are squeezed close together.
The main measures of dispersion
• The Range
• Quartiles and semi interquartile range
• The mean deviation
• The Variance and the standard deviation
• The coefficient of variation
The Range
• The Range is the difference between the highest value and the lowest
value in a data set.

• Example: Calculate the range for the following data set;

• 4, 8, 7, 3, 5, 16, 24, 5, 6, 4, 3
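The range calculation for this data set can be sketched as follows (the variable name `data_range` is just a label for this illustration):

```python
data = [4, 8, 7, 3, 5, 16, 24, 5, 6, 4, 3]

highest = max(data)            # highest value in the data set
lowest = min(data)             # lowest value in the data set
data_range = highest - lowest  # range = highest - lowest

print(highest, lowest, data_range)  # 24 3 21
```

Note how the single large value 24 dominates the result: the range depends only on the two extreme values.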
Quartiles and Semi Interquartile Range
• The three quartiles Q1, Q2 and Q3 divide the ordered sample into four groups of equal size.
• Quartiles help us to identify the range within which the middle 50% of the values in the sample occur.
• Lower quartile Q1 is the value below which 25% of the data set falls.
• Upper quartile Q3 is the value above which 25% of the data set falls.
• Median Q2 is the middle value in the data set.
• The semi-interquartile range is half the distance between the upper and lower quartiles, (Q3 − Q1)/2.
Example: Calculate Q1, Q2 and Q3 using the following data set

4, 8, 7, 3, 5, 16, 24, 5, 6, 4, 3
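Textbooks and software use slightly different conventions for locating quartiles, so the answer can depend on the rule chosen. The sketch below assumes the "position (n + 1) × p in the ordered data" rule, which corresponds to the `exclusive` method of `statistics.quantiles` (Python 3.8 or later); the lecture may intend a different convention.

```python
import statistics

data = [4, 8, 7, 3, 5, 16, 24, 5, 6, 4, 3]
ordered = sorted(data)  # [3, 3, 4, 4, 5, 5, 6, 7, 8, 16, 24]

# Assumption: quartiles sit at positions (n + 1) * p of the ordered data.
# With n = 11 these are the 3rd, 6th and 9th ordered values.
q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")

semi_iqr = (q3 - q1) / 2  # semi-interquartile range

print(q1, q2, q3)  # 4.0 5.0 8.0
print(semi_iqr)    # 2.0
```

Unlike the range (21 for the same data), the semi-interquartile range is unaffected by the extreme values 16 and 24.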
The mean deviation
• The mean deviation is a measure of the average amount by which the
values in a distribution differ from the arithmetic mean.
Class exercise
• The hours of overtime worked in a particular quarter by 60 employees of a company are shown below. Calculate the mean deviation of this frequency distribution.
Hours    Frequency
0-10     3
10-20    6
20-30    11
30-40    15
40-50    12
50-60    7
60-70    6
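One way to work through this exercise is sketched below. It assumes the usual definition for a frequency distribution, mean deviation = Σf|x − x̄| / Σf, with each class represented by its midpoint x; the midpoint convention and the variable names are assumptions of this illustration.

```python
# Overtime distribution: ((lower limit, upper limit), frequency).
classes = [((0, 10), 3), ((10, 20), 6), ((20, 30), 11), ((30, 40), 15),
           ((40, 50), 12), ((50, 60), 7), ((60, 70), 6)]

# Assumption: each class is represented by its midpoint.
midpoints = [(lower + upper) / 2 for (lower, upper), _ in classes]
freqs = [f for _, f in classes]

n = sum(freqs)                                       # 60 employees
mean = sum(f * x for x, f in zip(midpoints, freqs)) / n

# Average absolute distance from the mean, weighted by frequency.
mean_deviation = sum(f * abs(x - mean) for x, f in zip(midpoints, freqs)) / n

print(n, mean, mean_deviation)  # 60 37.0 13.0
```

Absolute values are used because the signed deviations from the mean always sum to zero.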
The Variance
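• The variance, $\sigma^2$, is the average of the squared deviations from the mean for each value in a distribution.
• Calculation of the variance for ungrouped data:
  $\sigma^2 = \frac{\sum (x - \bar{x})^2}{n}$
• Calculation of the variance for grouped data:
  $\sigma^2 = \frac{\sum f(x - \bar{x})^2}{\sum f}$
• The standard deviation is the square root of the variance:
  $\sigma = \sqrt{\sigma^2}$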
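A hedged sketch of these calculations follows. It assumes the variance is the plain average of squared deviations (divisor equal to the total frequency), which matches the wording "average" above; some courses divide by n − 1 for sample data instead. The function name `variance` and the reuse of the overtime distribution from the mean deviation exercise are choices made for this illustration.

```python
import math

def variance(values, freqs=None):
    """Average squared deviation from the mean (divisor = total frequency)."""
    if freqs is None:
        freqs = [1] * len(values)  # ungrouped data: each value counted once
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    return sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / n

# Illustration: class midpoints and frequencies of the overtime distribution.
midpoints = [5, 15, 25, 35, 45, 55, 65]
freqs = [3, 6, 11, 15, 12, 7, 6]

var_hours = variance(midpoints, freqs)  # in squared hours
sd_hours = math.sqrt(var_hours)         # standard deviation, in hours

print(var_hours, sd_hours)  # 256.0 16.0
```

Calling `variance(values)` without frequencies handles ungrouped data. In Python's standard library, `statistics.pvariance` uses the same divisor, while `statistics.variance` divides by n − 1.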


Class exercise
• Calculate the variance and the standard deviation of the following
frequency distribution;
Coefficient of Variation
• If two sets of data have similar means then it is easy to compare the
variation by calculating their standard deviations.

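• When the means differ, the coefficient of variation (CV) compares the spreads of two frequency distributions by expressing each standard deviation relative to its mean:
  $CV = \frac{\text{standard deviation}}{\text{mean}}$ (often multiplied by 100 and quoted as a percentage)

• The bigger the coefficient of variation, the wider the spread relative to the mean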


Example
• Suppose that two sets of data, A and B have the following means and
standard deviations;
                      A      B
Mean                  120    125
Standard Deviation    50     51

• Calculate the CV and compare the variation
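A short sketch of this comparison, taking the coefficient of variation as standard deviation divided by mean and expressing it as a percentage. The helper name `coefficient_of_variation` is hypothetical.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation relative to the mean, as a percentage."""
    return std_dev / mean * 100

cv_a = coefficient_of_variation(50, 120)  # data set A
cv_b = coefficient_of_variation(51, 125)  # data set B

print(f"CV of A = {cv_a:.1f}%, CV of B = {cv_b:.1f}%")
# prints: CV of A = 41.7%, CV of B = 40.8%
```

B has the larger standard deviation, yet A has the slightly larger spread once each is measured relative to its own mean.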


Class exercise
• A hospital is comparing the times patients are waiting for two types of
operation. For bypass surgery the mean wait is 17 weeks with a
standard deviation of 6 weeks, while for hip replacement the mean is
11 months with a standard deviation of 1 month.
• Calculate the coefficient of variation for each operation and compare them. Which operation has the higher variability?
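One possible working is sketched below. Because the coefficient of variation is a ratio of two quantities in the same unit, the unit cancels, so the weeks and months quoted in the exercise need no conversion. The variable names are illustrative.

```python
# (mean wait, standard deviation), each pair in its own unit.
bypass = (17, 6)  # weeks
hip = (11, 1)     # months

# The unit cancels in std / mean, so the two ratios are directly comparable.
cv_bypass = bypass[1] / bypass[0] * 100
cv_hip = hip[1] / hip[0] * 100

print(f"bypass: {cv_bypass:.1f}%, hip replacement: {cv_hip:.1f}%")
# prints: bypass: 35.3%, hip replacement: 9.1%
```

On these figures, waiting times for bypass surgery vary far more relative to their mean than those for hip replacement.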
