
Chapter 2 - Descriptive Statistics

note STA108 - chapter 2

Uploaded by

haireenizz

STA 108

AMIRUDDIN AB AZIZ
LEARNING OUTCOMES
1. Able to organise and represent qualitative and quantitative data using an appropriate analysis tool.
2. Able to differentiate between grouped and ungrouped data.
3. Able to describe and summarise the data using numerical descriptive measures and graphical exploratory data analysis tools.
CONTENT

• Data organization and presentation


• Measures of Central Tendency
• Measures of Location
• Measures of Dispersion
• Measures of Skewness
2.1 Data Organization and Presentation
Graphical Method for Qualitative Data

• Data presentation is a method to summarise, organise and communicate information for a set of data using a variety of tools such as diagrams, frequency distributions, charts and graphs.
• Qualitative data can be classified into categories.
• They are best presented in the form of a frequency distribution, bar chart, pie chart or contingency table.

[Figure: sample pie chart (1st Qtr to 4th Qtr) and sample bar chart (Series 1 to Series 3)]
Graphical Method for Qualitative Data

• A frequency distribution is a table that displays the frequency of various outcomes in a sample.
• There are many types of frequency distribution table:

One-way table – information concerning one variable

Laptop Model    No. of laptops
Acer            12
HP              7
Asus            12
Table 1: Laptop models used by students in Class A

Contingency table / cross tabulation – information concerning two variables

Shop    Income (RM)    Expenses
A       25000          14500
B       30000          18700
C       15000          5400
Table 2: Income and expenses for shops in Ayu Mall
Graphical Method for Qualitative Data

Table 3: Number of staff registered for training

Department             Number of Staff
Administration         8
IT                     3
Facility Management    7
Treasury               4
Total                  22

Here Department is the variable, each department is a category, and the Number of Staff column holds the frequencies.
Graphical Method for Qualitative Data
Example 1
The Census department in Bukit Besi conducted a survey on the car brands at Masjid Bukit Besi during Friday prayer last week. The data acquired are shown below. Construct a frequency distribution table for these data.

Honda Toyota Perodua Proton Proton Nissan Honda
Honda Honda Perodua Perodua Proton Toyota Toyota
Nissan Perodua Proton Toyota Honda Toyota Nissan
Nissan Toyota Honda Perodua Proton Perodua Proton
Honda Honda Perodua Perodua Proton Toyota Toyota
Honda Perodua Proton Toyota Honda Toyota Nissan

Car Model    Frequency (no. of cars)
Honda        10
Toyota       10
Perodua      9
Proton       8
Nissan       5
Total        42
Table 4: No. of cars at Masjid BB
Graphical Method for Qualitative Data
• Besides frequency, a distribution can also be described by relative frequency and percentage.
$$\text{Relative frequency} = \frac{\text{frequency}}{\text{total frequency}} \qquad \text{Percentage} = \frac{\text{frequency}}{\text{total frequency}} \times 100$$

• The following example is based on the data in Example 1.

Car Model    Frequency    Relative frequency    Percentage (%)
Honda        10           0.24                  24
Toyota       10           0.24                  24
Perodua      9            0.21                  21
Proton       8            0.19                  19
Nissan       5            0.12                  12
Total        42           1                     100
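The table above can be reproduced with a short script. This is only a sketch using Python's standard library; the function name `frequency_table` is an illustrative choice, not part of the course material.

```python
from collections import Counter

# The 42 car brands recorded in Example 1.
observations = (
    "Honda Toyota Perodua Proton Proton Nissan Honda "
    "Honda Honda Perodua Perodua Proton Toyota Toyota "
    "Nissan Perodua Proton Toyota Honda Toyota Nissan "
    "Nissan Toyota Honda Perodua Proton Perodua Proton "
    "Honda Honda Perodua Perodua Proton Toyota Toyota "
    "Honda Perodua Proton Toyota Honda Toyota Nissan"
).split()


def frequency_table(data):
    """Return {category: (frequency, relative frequency, percentage)} and the total."""
    counts = Counter(data)
    total = sum(counts.values())
    table = {}
    for category, freq in counts.most_common():
        relative = freq / total
        table[category] = (freq, round(relative, 2), round(relative * 100))
    return table, total


table, total = frequency_table(observations)
for category, (freq, relative, percent) in table.items():
    print(f"{category:<8} {freq:>3} {relative:>5.2f} {percent:>4}")
print(f"{'Total':<8} {total:>3}")
```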
Graphical Method for Qualitative Data
• Pie chart – a circle divided into sectors that represent the relative frequencies or percentages of a population or a sample belonging to different categories.

[Figure 1.1: Percentage of Students in KRBB in 2018 (pie chart for FKK, FKM and FSG with sectors of 23%, 31% and 46%)]

• Bar chart – a graph made of bars. There are a few types of bar chart:
1. Vertical bar chart – the bars represent one quantity or variable only. The length of each bar indicates the quantity or percentage.
2. Multiple bar chart (cluster bar chart) – used to compare data for different items in a category.
3. Component bar chart (stacked bar chart) – used when the totals of the various categories are also compared.
4. Percentage component bar chart – the data value of each category is expressed as a percentage.
[Figure 1.2: Number of Students in KRBB in 2018 (vertical bar chart; categories FKK, FKM, FSG)]
[Figure 1.3: Number of Students in KRBB (multiple bar chart; FKK, FKM, FSG for 2016, 2017, 2018)]
[Figure 1.4: Sales in Ayu Trading (component bar chart; y-axis: number of sales; Products A, B, C for 2016, 2017, 2018)]
[Figure 1.5: Percentage sales in Ayu Trading (percentage component bar chart; y-axis: sales percentage; Products A, B, C for 2016, 2017, 2018)]
Graphical Method for Quantitative Data

Frequency Distribution for Grouped Data


• Class limits – the end values of a class interval. The value on the left is the lower class limit and the value on the right is the upper class limit.
• Class boundary – a value that falls midway between the upper limit of one class and the lower limit of the next one.
• Class midpoint – the middle value of a class interval.
$$\text{Class midpoint} = \frac{\text{lower limit} + \text{upper limit}}{2}$$
• Class width = upper boundary − lower boundary
• Cumulative frequency – the sum of the frequencies for the class and all prior classes.
Graphical Method for Quantitative Data
Example 2
The table below shows the number of years of service of 120 employees at Company A.

Service years    No. of employees    Class boundary    Class midpoint       Cumulative frequency
1 – 4            16                  0.5 – 4.5         (1 + 4)/2 = 2.5      16
5 – 8            20                  4.5 – 8.5         6.5                  36
9 – 12           28                  8.5 – 12.5        10.5                 64
13 – 16          24                  12.5 – 16.5       14.5                 88
17 – 20          16                  16.5 – 20.5       18.5                 104
21 – 24          11                  20.5 – 24.5       22.5                 115
25 – 28          5                   24.5 – 28.5       26.5                 120

In the class 1 – 4, the lower class limit is 1 and the upper class limit is 4.
Class width = 4.5 − 0.5 = 4
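The boundary, midpoint and cumulative frequency columns can be derived mechanically from the class limits and frequencies. The sketch below assumes equally spaced classes and takes the gap between consecutive classes from the first two classes; `grouped_table` is a hypothetical helper name.

```python
def grouped_table(class_limits, frequencies):
    """Return rows of (limits, boundaries, midpoint, cumulative frequency)."""
    # Gap between the upper limit of one class and the lower limit of the
    # next (1 for the service-year classes in Example 2).
    gap = class_limits[1][0] - class_limits[0][1]
    rows = []
    running_total = 0
    for (lower, upper), frequency in zip(class_limits, frequencies):
        running_total += frequency
        boundaries = (lower - gap / 2, upper + gap / 2)
        midpoint = (lower + upper) / 2
        rows.append(((lower, upper), boundaries, midpoint, running_total))
    return rows


limits = [(1, 4), (5, 8), (9, 12), (13, 16), (17, 20), (21, 24), (25, 28)]
freqs = [16, 20, 28, 24, 16, 11, 5]
rows = grouped_table(limits, freqs)
for row in rows:
    print(row)
```

The class width is then the difference of the two boundaries in any row, for example 4.5 − 0.5 = 4 for the first class.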
Graphical Method for Quantitative Data
The table below shows the weights of 100 honeydews produced by Farm X. Here the classes are already continuous, so the class limits and class boundaries coincide.

Weight     No. of honeydews    Class boundary    Class midpoint    Cumulative frequency
4 – 6      14                  4 – 6             5                 14
6 – 8      10                  6 – 8             7                 24
8 – 10     25                  8 – 10            9                 49
10 – 12    31                  10 – 12           11                80
12 – 14    20                  12 – 14           13                100

Class width = 6 – 4 = 2
Graphical Method for Quantitative Data
GRAPH OF FREQUENCY DISTRIBUTIONS
1. Histogram
• A graph that displays the data using adjacent vertical bars of various heights to represent the frequencies of the classes.
• Guidelines for constructing a histogram:
a) Draw the x-axis and the y-axis. The x-axis represents the class boundaries and the y-axis represents the frequency.
b) Using the frequency as the height, draw a vertical bar for each class. The bars are adjacent to each other (no gaps).

Example: Based on Example 2 in slide 12, draw a histogram for the given data.
Graphical Method for Quantitative Data
[Figure 1.6: Histogram of service year distribution for employees at Company A (Example 2). x-axis: service years, marked at the class boundaries 0.5, 4.5, ..., 28.5; y-axis: number of employees]
Graphical Method for Quantitative Data
2. Frequency polygon
A graph that displays the data using lines that connect points plotted for the frequencies at the midpoints of the classes. The frequencies are represented by the heights of the points.

Example: Based on Example 2 in slide 12, draw a frequency polygon for the given data.
Graphical Method for Quantitative Data
[Figure 1.7: Frequency polygon of service year distribution for employees at Company A (Example 2). x-axis: service years, marked at the class midpoints 2.5, 6.5, ..., 26.5; y-axis: number of employees]
Graphical Method for Quantitative Data
3. Ogive
• A curve (also known as a cumulative frequency graph) drawn from the cumulative frequency distribution. Points are marked above the upper boundary of each class at a height equal to the cumulative frequency of that class, and then joined with smooth lines.

Example: Based on Example 2 in slide 12, draw an ogive for the given data.
Graphical Method for Quantitative Data
[Figure 1.8: Ogive of service year distribution for employees at Company A (Example 2). x-axis: service years, marked at the class boundaries 0.5, 4.5, ..., 28.5; y-axis: number of employees]
Graphical Method for Quantitative Data

4. Stem-and-leaf plot

• This plot presents a graphical display of the data using the actual numerical value of each data point. It separates each data value into a stem (leading digits) and a leaf (trailing digit).
• Guidelines for constructing a stem-and-leaf plot:
a) Arrange the data values in order.
b) Divide each measurement into two parts: the stem and the leaf.
c) List all the stems in order, from lowest to highest.
d) For each measurement, record the leaf portion in the same row as its corresponding stem.
Graphical Method for Quantitative Data
JKR conducted a survey passing through Bridge A for 12 days. The
data acquired are shown below. Illustrate the stem and leaf plot for
the given data.
17 12 20 9 15 21 15 16 12 14 8 23

Rearrange the data in order: 08 09 12 12 14 15 15 16 17 20 21 23

Stem | Leaf
0    | 8 9
1    | 2 2 4 5 5 6 7
2    | 0 1 3

Note: If the data values are in the hundreds, such as 325, the stem is 32 and the leaf is 5.
Graphical Method for Quantitative Data
An insurance company researcher conducted a survey on the number of car thefts in a large city for a period of 30 days. The raw data are shown below. Construct a stem-and-leaf plot for the given data.
22 32 21 20 29 28 47 36 23 27 45 26 22 37 43
49 29 38 35 42 27 21 33 39 45 35 23 48 36 25

Rearrange the data in order:
20, 21, 21, 22, 22, 23, 23, 25, 26, 27, 27, 28, 29, 29, 32, 33, 35, 35, 36, 36, 37, 38, 39, 42, 43, 45, 45, 47, 48, 49

Stem | Leaf
2    | 0 1 1 2 2 3 3 5 6 7 7 8 9 9
3    | 2 3 5 5 6 6 7 8 9
4    | 2 3 5 5 7 8 9
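The construction steps translate directly into code. The following is a minimal sketch that assumes non-negative whole numbers and uses the last digit as the leaf; `stem_and_leaf` is an illustrative name, and stems with no data are simply omitted.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Map each stem (leading digits) to its ordered list of leaves (last digit)."""
    plot = defaultdict(list)
    for value in sorted(values):          # step a: arrange the data in order
        stem, leaf = divmod(value, 10)    # step b: split into stem and leaf
        plot[stem].append(leaf)           # step d: record the leaf beside its stem
    return dict(plot)                     # stems appear in ascending order (step c)


thefts = [22, 32, 21, 20, 29, 28, 47, 36, 23, 27, 45, 26, 22, 37, 43,
          49, 29, 38, 35, 42, 27, 21, 33, 39, 45, 35, 23, 48, 36, 25]
plot = stem_and_leaf(thefts)
for stem, leaves in plot.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

With this rule a value such as 325 is split into stem 32 and leaf 5, matching the note in the earlier example.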
2.2 Measures of Central Tendency

• Measures of central tendency are measures of average, which include the mean, median and mode.
• A measure of central tendency gives the centre of a histogram or a frequency distribution curve.
• Ungrouped data – data that give information on each member of the population or sample individually.
• Grouped data – data that are presented in the form of a frequency distribution table.
2.2 Measures of Central Tendency
1. Mean (arithmetic average), $\bar{x}$
• Mean for ungrouped data
$$\bar{x} = \frac{\sum x}{n}$$
where $\sum x$ is the sum of all data values and $n$ is the total number of data values.
• Mean for grouped data
$$\bar{x} = \frac{\sum f x_m}{n}$$
where $f$ is the frequency of a class, $x_m$ is the midpoint of a class and $n = \sum f$.
Note: $\bar{x}$ is the sample mean; $\mu$ is the population mean.

Example 1
The data represent the number of days off in 2018 for 8 staff in the sales department. Find the mean.
20, 25, 18, 15, 17, 22, 19, 24
$$\bar{x} = \frac{20 + 25 + 18 + 15 + 17 + 22 + 19 + 24}{8} = \frac{160}{8} = 20 \text{ days}$$
2.2 Measures of Central Tendency
Example 2
The following table shows the summary data of the distance in kilometres that 20 runners ran in one week. Determine the mean of the distance ran.

Class          Frequency, f    Class midpoint, x_m    f · x_m
5.5 – 10.5     3               8                      24
10.5 – 15.5    5               13                     65
15.5 – 20.5    4               18                     72
20.5 – 25.5    5               23                     115
25.5 – 30.5    3               28                     84
               Σf = 20                                Σf x_m = 360

$$\therefore \bar{x} = \frac{360}{20} = 18 \text{ km}$$
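Both mean formulas can be checked numerically. The sketch below uses hypothetical helper names `mean` and `grouped_mean`; the grouped version takes the class boundaries and computes the midpoints itself.

```python
def mean(values):
    """Mean for ungrouped data: sum of the values divided by their count."""
    return sum(values) / len(values)


def grouped_mean(boundaries, frequencies):
    """Mean for grouped data: sum of f * midpoint divided by the total frequency."""
    midpoints = [(lower + upper) / 2 for lower, upper in boundaries]
    total_fx = sum(f * x for f, x in zip(frequencies, midpoints))
    return total_fx / sum(frequencies)


days_off = [20, 25, 18, 15, 17, 22, 19, 24]
classes = [(5.5, 10.5), (10.5, 15.5), (15.5, 20.5), (20.5, 25.5), (25.5, 30.5)]
runners = [3, 5, 4, 5, 3]

print(mean(days_off))                  # Example 1
print(grouped_mean(classes, runners))  # Example 2
```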
2.2 Measures of Central Tendency
2. Median
• The value of the middle term in a data set that has been ranked in increasing order.

• Median for ungrouped data
a) Rank the data set in ascending order.
b) Determine the position of the middle term:
$$\text{position} = \frac{n + 1}{2}$$
c) Find the value of the median:
⟹ if n is odd, median = middle value in the sequence
⟹ if n is even, median = average of the 2 middle values
2.2 Measures of Central Tendency
2. Median
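• Median for grouped data, $\tilde{x}$
$$\tilde{x} = L_m + \left( \frac{\frac{n}{2} - \sum f_{m-1}}{f_m} \right) C$$
where $L_m$ is the lower boundary of the median class
$\sum f_{m-1}$ is the cumulative frequency of the classes before the median class
$f_m$ is the frequency of the median class
$n$ is the sample size
$C$ is the class width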
2.2 Measures of Central Tendency
Example 3
Find the median for the following data.
20, 25, 18, 15, 17, 22, 19

1. Rearrange the data in ascending order: 15, 17, 18, 19, 20, 22, 25 (n = 7)
2. Position = (7 + 1)/2 = 4
3. Median = 4th value = 19

Example 4
Find the median for the following data.
205, 150, 125, 180, 215, 175, 150, 140

1. Rearrange the data in ascending order: 125, 140, 150, 150, 175, 180, 205, 215 (n = 8)
2. Position = (8 + 1)/2 = 4.5
3. Median = (150 + 175)/2 = 162.5
2.2 Measures of Central Tendency
Example 5
Find the median for the following data.

Class          Frequency, f    Cumulative frequency
5.5 – 10.5     3               3
10.5 – 15.5    5               8
15.5 – 20.5    4               12    (median class)
20.5 – 25.5    5               17
25.5 – 30.5    3               20

Position of median = (20/2)th = 10th value, which falls in the class 15.5 – 20.5.
$$\text{Median} = 15.5 + \left( \frac{\frac{20}{2} - 8}{4} \right) 5 = 18$$
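The two median procedures can be sketched as follows. The function names are illustrative; the grouped version assumes the classes are given as boundaries in ascending order and takes the median class to be the first class whose cumulative frequency reaches n/2.

```python
def median(values):
    """Median for ungrouped data, using the (n + 1) / 2 position rule."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]                         # odd n: the middle value
    return (ordered[middle - 1] + ordered[middle]) / 2  # even n: average of two


def grouped_median(boundaries, frequencies):
    """Median for grouped data: L_m + ((n/2 - cum. freq. before) / f_m) * C."""
    n = sum(frequencies)
    cumulative_before = 0
    for (lower, upper), f in zip(boundaries, frequencies):
        if f > 0 and cumulative_before + f >= n / 2:
            return lower + (n / 2 - cumulative_before) / f * (upper - lower)
        cumulative_before += f
    raise ValueError("frequencies must contain at least one positive count")


classes = [(5.5, 10.5), (10.5, 15.5), (15.5, 20.5), (20.5, 25.5), (25.5, 30.5)]
print(median([20, 25, 18, 15, 17, 22, 19]))                 # Example 3
print(median([205, 150, 125, 180, 215, 175, 150, 140]))     # Example 4
print(grouped_median(classes, [3, 5, 4, 5, 3]))             # Example 5
```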
2.2 Measures of Central Tendency
3. Mode
• The most frequent data value in a data set.
• A data set may have no mode (it is wrong to say the mode is zero, since zero can be a data value) or more than one mode.
• Mode for ungrouped data: the data value that occurs most often.
• Mode for grouped data, $\hat{x}$
$$\hat{x} = L_{mo} + \left( \frac{\Delta_1}{\Delta_1 + \Delta_2} \right) C$$
where $L_{mo}$ is the lower boundary of the modal class (the class with the largest frequency)
$\Delta_1$ = frequency of the modal class − frequency of the class before the modal class
$\Delta_2$ = frequency of the modal class − frequency of the class after the modal class
$C$ is the class width
2.2 Measures of Central Tendency
Example 6
The following data represent the duration in minutes for 20 students to
finish their experiment. Find the mode of the given data.
35, 20, 25, 35, 33, 22, 25, 21, 27, 28, 25 ⟹ Mode = 25
Example 7
The following table shows the summary data of the distance in kilometres that 20 runners ran in one week. Determine the mode of the distance ran.

Class          Frequency, f
5.5 – 10.5     3
10.5 – 15.5    7    (modal class)
15.5 – 20.5    4
20.5 – 25.5    5
25.5 – 30.5    3

$$\text{Mode, } \hat{x} = 10.5 + \left( \frac{7 - 3}{(7 - 3) + (7 - 4)} \right)(5) = 13.4$$
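The grouped mode formula can be sketched in a few lines. `grouped_mode` is a hypothetical name; as an assumption not stated in the slides, a missing neighbouring class (when the modal class is first or last) is treated as having frequency 0, and ties for the largest frequency resolve to the first such class.

```python
def grouped_mode(boundaries, frequencies):
    """Mode for grouped data: L_mo + (d1 / (d1 + d2)) * C."""
    # Index of the modal class (first class with the largest frequency).
    i = max(range(len(frequencies)), key=lambda k: frequencies[k])
    lower, upper = boundaries[i]
    # Assumption: a class that does not exist contributes frequency 0.
    before = frequencies[i - 1] if i > 0 else 0
    after = frequencies[i + 1] if i < len(frequencies) - 1 else 0
    d1 = frequencies[i] - before
    d2 = frequencies[i] - after
    return lower + d1 / (d1 + d2) * (upper - lower)


classes = [(5.5, 10.5), (10.5, 15.5), (15.5, 20.5), (20.5, 25.5), (25.5, 30.5)]
mode = grouped_mode(classes, [3, 7, 4, 5, 3])
print(round(mode, 1))  # Example 7
```

If the modal class and both neighbours have equal frequencies, d1 + d2 is zero and the formula is undefined; the sketch does not handle that case.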
2.3 Measures of Location
• Measures of location tell where a specific data value falls within the data set, or its relative position in comparison with other data values.

Measures of location:
• Quartiles – divide the data set into 4 equal parts
• Deciles – divide the data set into 10 equal parts
• Percentiles – divide the data set into 100 equal parts
2.3 Measures of Location

• Quartiles are separated into first quartile (Q1), second quartile (Q2) and third quartile (Q3).
• Q1 is the value one quarter of the way through the ordered data set.
$$\text{Position of } Q_1 = \frac{n + 1}{4}$$
25% of all the observations are less than Q1 and the other 75% are more than Q1.
• Q2 is the median value.
$$\text{Position of } Q_2 = \frac{2(n + 1)}{4}$$
• Q3 is the value three quarters of the way through the ordered data set.
$$\text{Position of } Q_3 = \frac{3(n + 1)}{4}$$
75% of all the observations are less than Q3 and the other 25% are more than Q3.
2.3 Measures of Location

a) Ungrouped data:
Step 1 : Arrange the data in ascending order.
Step 2 : Position of Q1 is $\frac{n+1}{4}$, position of Q3 is $\frac{3(n+1)}{4}$.
Step 3 : Find Q1 and/or Q3.
2.3 Measures of Location
Example 8
Find Q1, Q2, and Q3 for the following data: 15, 13, 6, 5, 12, 50, 22, 18

ANSWER
Rearrange the data in ascending order:
5, 6, 12, 13, 15, 18, 22, 50 (n = 8)

Position of Q1 = (8 + 1)/4 = 2.25, between the 2nd and 3rd values (6, 12)
Q1 = 6 + 0.25(12 − 6) = 7.5

Position of Q2 = 2(8 + 1)/4 = 4.5, between the 4th and 5th values (13, 15)
Q2 = (13 + 15)/2 = 14

Position of Q3 = 3(8 + 1)/4 = 6.75, between the 6th and 7th values (18, 22)
Q3 = 18 + 0.75(22 − 18) = 21
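The position rule with interpolation used in Example 8 can be sketched as below. `quartile` is an illustrative name; clamping positions that fall outside the data to the smallest or largest value is an added assumption for very small data sets, not something the slides specify.

```python
def quartile(values, k):
    """k-th quartile (k = 1, 2, 3) using the position k(n + 1)/4 with interpolation."""
    ordered = sorted(values)
    position = k * (len(ordered) + 1) / 4   # 1-based position in the ordered data
    whole = int(position)                   # e.g. 2 for position 2.25
    fraction = position - whole             # e.g. 0.25 for position 2.25
    if whole < 1:                           # assumption: clamp to the minimum
        return ordered[0]
    if whole >= len(ordered):               # assumption: clamp to the maximum
        return ordered[-1]
    below, above = ordered[whole - 1], ordered[whole]
    return below + fraction * (above - below)


data = [15, 13, 6, 5, 12, 50, 22, 18]
print(quartile(data, 1), quartile(data, 2), quartile(data, 3))
```

Other textbooks and software packages use different quartile conventions, so results from library functions may differ slightly from this rule.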
2.3 Measures of Location
b) For grouped data:
Step 1 : Find the cumulative frequency.
Step 2 : Position of Q1 is $\frac{\sum f}{4}$, position of Q3 is $\frac{3\sum f}{4}$.
Step 3 : Find the quartiles:
$$Q_1 = L_{Q_1} + \left( \frac{\frac{\sum f}{4} - \sum f_{Q_1 - 1}}{f_{Q_1}} \right) C \qquad Q_3 = L_{Q_3} + \left( \frac{\frac{3 \sum f}{4} - \sum f_{Q_3 - 1}}{f_{Q_3}} \right) C$$
where $\sum f$ = total frequency = n
$C$ = class size
$L_{Q_1}$ = lower class boundary of the lower quartile class
$f_{Q_1}$ = frequency of the lower quartile class
$\sum f_{Q_1 - 1}$ = cumulative frequency before the lower quartile class
$L_{Q_3}$ = lower class boundary of the upper quartile class
$f_{Q_3}$ = frequency of the upper quartile class
$\sum f_{Q_3 - 1}$ = cumulative frequency before the upper quartile class



2.3 Measures of Location
Example 9
The marks scored by 44 students in a Statistics examination are shown in the table below. Find the lower quartile and upper quartile. Hence, give your comments.

Marks      Class boundary    No. of students (f)    Cumulative frequency (Σf)
30 – 39    29.5 – 39.5       2                      2
40 – 49    39.5 – 49.5       4                      6
50 – 59    49.5 – 59.5       8                      14 *
60 – 69    59.5 – 69.5       14                     28
70 – 79    69.5 – 79.5       10                     38 **
80 – 89    79.5 – 89.5       5                      43
90 – 99    89.5 – 99.5       1                      44

* lower quartile class   ** upper quartile class
2.3 Measures of Location
Solution:
Position of lower quartile is $\frac{\sum f}{4} = \frac{44}{4} = 11$ (class marked *).
$$Q_1 = 49.5 + \left( \frac{11 - 6}{8} \right) 10 = 55.75$$
Comment : 25% of the student marks are less than 55.75 and the other 75% are more than 55.75.

Position of upper quartile is $\frac{3 \sum f}{4} = \frac{3(44)}{4} = 33$ (class marked **).
$$Q_3 = 69.5 + \left( \frac{33 - 28}{10} \right) 10 = 74.5$$
Comment : 75% of the student marks are less than 74.5 and the other 25% are more than 74.5.
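The grouped quartile steps can be sketched as one function. The name `grouped_quartile` is illustrative; the sketch assumes classes are given as boundaries in ascending order and picks the quartile class as the first class whose cumulative frequency reaches the position kΣf/4.

```python
def grouped_quartile(boundaries, frequencies, k):
    """k-th quartile (k = 1, 2, 3) for grouped data."""
    position = k * sum(frequencies) / 4
    cumulative_before = 0
    for (lower, upper), f in zip(boundaries, frequencies):
        if f > 0 and cumulative_before + f >= position:
            # L_Q + ((position - cumulative frequency before) / f_Q) * C
            return lower + (position - cumulative_before) / f * (upper - lower)
        cumulative_before += f
    raise ValueError("frequencies must contain at least one positive count")


marks = [(29.5, 39.5), (39.5, 49.5), (49.5, 59.5), (59.5, 69.5),
         (69.5, 79.5), (79.5, 89.5), (89.5, 99.5)]
students = [2, 4, 8, 14, 10, 5, 1]
q1 = grouped_quartile(marks, students, 1)
q3 = grouped_quartile(marks, students, 3)
print(q1, q3)  # Example 9
```

With k = 2 the same function reproduces the grouped median formula from Section 2.2.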
2.4 Measures of Dispersion
• Measures of dispersion measure the spread of data values.
• These measures include the range, inter-quartile range, quartile deviation, variance and standard deviation.
• Measures of dispersion are important because two groups of data may have the same central value but different variability.

Example pg 67
Group A: 20 30 40 50 60 70        Group B: 40 42 43 45 55
$$\text{mean}_A = \frac{20 + 30 + 40 + 50 + 60 + 70}{6} = 45 \qquad \text{mean}_B = \frac{40 + 42 + 43 + 45 + 55}{5} = 45$$

The average marks for both Group A and Group B are the same, but the data for Group A are more dispersed than those for Group B.
2.4 Measures of Dispersion
a) Range – the difference between the largest and the smallest values in the data set.

Range for ungrouped data = maximum value − minimum value

Range for grouped data = upper class boundary of last class − lower class boundary of first class

Example pg 67

Find the range for each of the following groups.

Group A: 20 30 40 50 60 70        Group B: 40 42 43 45 55

range_A = 70 − 20 = 50            range_B = 55 − 40 = 15
2.4 Measures of Dispersion

Example pg 67

Class      Class boundary    Frequency
25 – 29    24.5 – 29.5       2
30 – 34    29.5 – 34.5       3
35 – 39    34.5 – 39.5       5
40 – 44    39.5 – 44.5       7
45 – 49    44.5 – 49.5       9
50 – 54    49.5 – 54.5       2

Range = UCB of last class − LCB of first class
      = 54.5 − 24.5
      = 30
2.4 Measures of Dispersion

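Standard deviation

Ungrouped data:
$$s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2} = \sqrt{\frac{1}{n-1} \left[ \sum x^2 - \frac{(\sum x)^2}{n} \right]}$$

Grouped data:
$$s = \sqrt{\frac{1}{\sum f - 1} \sum_{i=1}^{n} f (x_i - \bar{x})^2} = \sqrt{\frac{1}{n-1} \left[ \sum f x^2 - \frac{(\sum f x)^2}{\sum f} \right]}$$
Note: $x$ in the grouped data is the midpoint value, and $n = \sum f$.

$$\text{Inter-quartile range} = Q_3 - Q_1$$
$$\text{Quartile deviation} = \frac{Q_3 - Q_1}{2}$$
$$\text{Variance} = s^2$$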
2.4 Measures of Dispersion

Review Question pg 82
The number of sales made last month by each of the 14 members of the sales staff of a company was recorded (listed here in ascending order):
68, 77, 101, 108, 112, 121, 123, 124, 127, 128, 130, 133, 137, 139
Find a) mean b) standard deviation c) range d) quartile deviation

a) $\text{mean} = \frac{1628}{14} = 116.29$
b) $s = \sqrt{\frac{1}{14 - 1} \left[ 195300 - \frac{1628^2}{14} \right]} = 21.46$
c) range = 139 − 68 = 71
d) $\frac{Q_3 - Q_1}{2} = \frac{130.75 - 106.25}{2} = 12.25$
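The review question can be verified with the computational form of the standard deviation formula. This is a sketch only; `sample_std` is an illustrative name, and the quartiles use the (n + 1) position rule from Section 2.3.

```python
import math


def sample_std(values):
    """Sample standard deviation: sqrt((sum x^2 - (sum x)^2 / n) / (n - 1))."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return math.sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))


sales = [68, 77, 101, 108, 112, 121, 123, 124, 127, 128, 130, 133, 137, 139]

mean = sum(sales) / len(sales)
std = sample_std(sales)
data_range = max(sales) - min(sales)
# Positions 15/4 = 3.75 and 45/4 = 11.25 in the ordered data.
q1 = sales[2] + 0.75 * (sales[3] - sales[2])
q3 = sales[10] + 0.25 * (sales[11] - sales[10])
quartile_deviation = (q3 - q1) / 2

print(round(mean, 2), round(std, 2), data_range, quartile_deviation)
```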
2.4 Measures of Dispersion
• If two or more data sets have equal mean values, the standard deviation can be used as the measure of dispersion.
• If two or more data sets have different values of mean and standard deviation, the dispersion of the data can be compared using the coefficient of variation.
$$\text{Coefficient of variation} = \frac{\text{standard deviation}}{\text{mean}} \times 100 = \frac{s}{\bar{x}} \times 100$$
Example pg 72

                         Set A    Set B    Set C
Mean, $\bar{x}$          75       75       75
Standard deviation, s    10       15       7

➢ Set B has the greatest dispersion because it has the largest standard deviation.
➢ Set C is the most consistent because it has the smallest standard deviation.
2.4 Measures of Dispersion
Example 10

                         Number of cars sold    Commissions (RM)
Mean, $\bar{x}$          87                     20900
Standard deviation, s    5                      3092

Based on the data above, compare the variation between the number of cars sold and the commissions received.

Number of cars sold: $\text{Coefficient of variation} = \frac{s}{\bar{x}} \times 100 = \frac{5}{87} \times 100 = 5.7\%$
Commissions: $\text{Coefficient of variation} = \frac{s}{\bar{x}} \times 100 = \frac{3092}{20900} \times 100 = 14.8\%$

Conclusion
➢ Commissions are more variable than sales.
➢ Sales are more consistent than commissions.
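The comparison in Example 10 is a one-line calculation per variable. A minimal sketch, with `coefficient_of_variation` as an illustrative name:

```python
def coefficient_of_variation(std_dev, mean):
    """Coefficient of variation as a percentage: (s / mean) * 100."""
    return std_dev / mean * 100


cv_cars = coefficient_of_variation(5, 87)
cv_commissions = coefficient_of_variation(3092, 20900)
print(f"{cv_cars:.1f}% vs {cv_commissions:.1f}%")
```

Because the coefficient of variation is unit-free, it allows a count of cars to be compared with an amount in RM, which the raw standard deviations do not.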
2.4 Measures of Dispersion
Example 11
Grouped Data with Frequency
From the table below, find the range, interquartile range, quartile deviation and variance.

No. of computers (x)    0        1        2         3        4
Frequency (f)           6        8        10        4        2
fx²                     6(0²)    8(1²)    10(2²)    4(3²)    2(4²)
fx                      6(0)     8(1)     10(2)     4(3)     2(4)
Cum. frequency          6        14       24        28       30

Σfx² = 116 and Σfx = 48.

1) Range = 4 – 0 = 4
2) IQR = Q3 − Q1 = 2 – 1 = 1
3) QD = (Q3 − Q1)/2 = 1/2 = 0.5
4) Variance, $s^2 = \frac{1}{30 - 1} \left[ 116 - \frac{48^2}{30} \right] = 1.3517$
5) SD $= \sqrt{1.3517} = 1.1626$
2.4 Measures of Dispersion
Example 12
Grouped Data
From the table below, find the range, interquartile range, quartile deviation and variance.

Time       Class boundary    Midpoint (x)    Frequency (f)    Cumulative frequency    fx²          fx
30 – 39    29.5 – 39.5       34.5            25               25 (1 – 25)             25(34.5²)    25(34.5)
40 – 49    39.5 – 49.5       44.5            16               41 (26 – 41)            16(44.5²)    16(44.5)
50 – 59    49.5 – 59.5       54.5            12               53 (42 – 53)            12(54.5²)    12(54.5)
60 – 69    59.5 – 69.5       64.5            7                60 (54 – 60)            7(64.5²)     7(64.5)
70 – 79    69.5 – 79.5       74.5            5                65 (61 – 65)            5(74.5²)     5(74.5)

Σfx² = 153956.25 and Σfx = 3052.5.

1) R = 79.5 – 29.5 = 50
2) IQR = Q3 − Q1 = 55.96 – 36 = 19.96
3) QD = (Q3 − Q1)/2 = 19.96/2 = 9.98
4) Variance, s² = 165.7212
5) SD = 12.8733
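The variance and standard deviation in Examples 11 and 12 follow from the grouped-data formula. The sketch below uses the hypothetical name `grouped_variance` and takes the x values (midpoints for class data) together with their frequencies.

```python
import math


def grouped_variance(x_values, frequencies):
    """Sample variance: (sum f x^2 - (sum f x)^2 / sum f) / (sum f - 1)."""
    n = sum(frequencies)
    sum_fx = sum(f * x for f, x in zip(frequencies, x_values))
    sum_fx2 = sum(f * x * x for f, x in zip(frequencies, x_values))
    return (sum_fx2 - sum_fx ** 2 / n) / (n - 1)


# Example 11: number of computers with frequencies.
var_11 = grouped_variance([0, 1, 2, 3, 4], [6, 8, 10, 4, 2])
# Example 12: class midpoints with frequencies.
var_12 = grouped_variance([34.5, 44.5, 54.5, 64.5, 74.5], [25, 16, 12, 7, 5])

print(round(var_11, 4), round(math.sqrt(var_11), 4))
print(round(var_12, 4), round(math.sqrt(var_12), 4))
```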
2.5 Measures of Skewness
• A frequency distribution can assume many shapes.
• A histogram or frequency polygon can be used to determine the shape of a distribution.
• The shape of a distribution can also be determined using the mean, median and mode:
  – Skewed to the right: mode < median < mean
  – Normal (symmetric) distribution: mean = median = mode
  – Skewed to the left: mean < median < mode
2.5 Measures of Skewness

• Pearson's coefficient of skewness (PCS) – used to measure the degree of skewness of a data set.
$$\text{PCS} = \frac{\text{mean} - \text{mode}}{\text{standard deviation}} \quad \text{or} \quad \text{PCS} = \frac{3(\text{mean} - \text{median})}{\text{standard deviation}}$$

➢ PCS = 0, the distribution is symmetric (normal distribution)
➢ PCS < 0, the distribution is skewed to the left (negatively skewed)
➢ PCS > 0, the distribution is skewed to the right (positively skewed)
2.5 Measures of Skewness
Example pg 76

                      Group A    Group B
Mean                  47.5       45
Standard deviation    10.84      5.25
Median                50         44
Mode                  50         45

Using the mode:
Group A: PCS = (47.5 − 50)/10.84 = −0.231        Group B: PCS = (45 − 45)/5.25 = 0

Using the median:
Group A: PCS = 3(47.5 − 50)/10.84 = −0.692       Group B: PCS = 3(45 − 44)/5.25 = 0.571

Conclusion: The data for Group A are negatively skewed (skewed to the left).
The data for Group B are positively skewed (skewed to the right).
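Both forms of Pearson's coefficient can be computed side by side. The function names below are illustrative; the two forms generally give different values, so it is worth stating which one is used.

```python
def pcs_mode(mean, mode, std_dev):
    """Pearson's coefficient of skewness using the mode."""
    return (mean - mode) / std_dev


def pcs_median(mean, median, std_dev):
    """Pearson's coefficient of skewness using the median."""
    return 3 * (mean - median) / std_dev


group_a = (pcs_mode(47.5, 50, 10.84), pcs_median(47.5, 50, 10.84))
group_b = (pcs_mode(45, 45, 5.25), pcs_median(45, 44, 5.25))
print([round(v, 3) for v in group_a])
print([round(v, 3) for v in group_b])
```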
2.5 Measures of Skewness

• If skewness is less than -1 or greater than 1, the distribution is highly skewed.


• If skewness is between -1 and -0.5 or between 0.5 and 1, the distribution is
moderately skewed.
• If skewness is between -0.5 and 0.5, the distribution is approximately
symmetric.
2.5 Measures of Skewness
• Box-and-whisker plot – shows the minimum value, Q1, median, Q3 and the maximum value.

[Figure: box-and-whisker plot on a scale from 0 to 30, labelled with the min value, Q1, median, Q3 and max value]

Steps to draw a boxplot
• Draw a scale for the data on the x-axis.
• Locate the lowest value, Q1, median, Q3 and the highest value.
• Draw a box from Q1 to Q3, with a vertical line through the median.
• Draw a line from the lowest value to Q1 and from Q3 to the highest value.
2.5 Measures of Skewness
Example pg 80
Rearrange the data in ascending order :
5, 8, 9, 10, 11, 12, 12, 13, 14,14, 15, 15, 15, 17, 17, 18, 18, 19, 20, 20, 20, 23, 29

First Quartile, Q1 = 12 Median = 15 Third Quartile, Q3 = 19

[Figure: box-and-whisker plot of the data on a scale from 0 to 30, with min value 5, Q1 = 12, median = 15, Q3 = 19 and max value 29]
