
CHAPTER 2 - Descriptive Statistics

This document summarizes different methods for organizing and graphing quantitative and qualitative data, including frequency distributions, pie charts, bar charts, histograms, and cumulative frequency curves. It also describes numerical methods for describing data, such as measures of central tendency (mean, median, mode), measures of dispersion (range, standard deviation, variance), and measures of central position (quartiles). Examples are provided to illustrate how to construct and interpret these different graphs and analyses.


___________________________________________Chapter 2: Describing Data & Numerical Methods

CHAPTER 2
DESCRIBING DATA & NUMERICAL METHODS
ORGANIZING & GRAPHING DATA

QUALITATIVE DATA
1) Frequency distribution
2) Pie chart
3) Bar chart
   • Vertical / horizontal bar chart
   • Cluster @ multiple bar chart
   • Component @ stacked bar chart

QUANTITATIVE DATA
1) Stem-and-leaf plots
2) Frequency distribution
   • Ungrouped & grouped data
3) Histogram
4) Frequency polygon
5) Ogives (cumulative frequency curves)

NUMERICAL METHODS (GROUPED DATA & UNGROUPED DATA)

MEASURES OF CENTRAL TENDENCY
1) Mean
2) Mode
3) Median

MEASURES OF DISPERSION
1) Range
2) Interquartile Range
3) Quartile Deviation
4) Mean Deviation
5) Standard Deviation
6) Variance

MEASURES OF CENTRAL POSITION (QUARTILES)
1) First Quartile
2) Third Quartile

P r e p a r e d B y : N O R A S L I L Y S A R K A M © | 35

2.1 FREQUENCY DISTRIBUTION

• A frequency distribution is a table of rows and columns whose purpose is to summarize data into different classes & their frequencies.

• CLASS = a category into which qualitative data can be classified.

• CLASS FREQUENCY = the number of observations that fall in a particular class.

EXAMPLE 1

A car dealer in Kuala Lumpur recorded the following sales by car model for January 2005, as shown in the table below.

Qualitative Variable (Class)    Class Frequency
Car Model                       Number of Cars
Waja                            66
Wira                            50
Saga                            39
Gen-2                           25
TOTAL                           180

EXAMPLE 2

Construct a frequency distribution table for the blood types of 21 staff members in a company, as given below.

A B B A O O A
B O A O A B A
B O A A B B B
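
The tally for Example 2 can be checked with a few lines of code. The following is a minimal Python sketch and not part of the original notes; the variable names are illustrative.

```python
from collections import Counter

# Blood types of the 21 staff members, copied row by row from Example 2.
blood_types = (
    "A B B A O O A "
    "B O A O A B A "
    "B O A A B B B"
).split()

# Counter tallies how many times each class (blood type) occurs.
frequency = Counter(blood_types)

print("Blood Type  Frequency")
for blood_type in sorted(frequency):
    print(f"{blood_type:<10}  {frequency[blood_type]}")
print(f"{'TOTAL':<10}  {sum(frequency.values())}")
```

The class frequencies should add up to the number of observations (21), which is a quick check that no value was missed.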


2.2 PIE CHART


• A pie chart consists of one or more circles that are divided into sectors.

• The sectors show the number of objects or the percentage in each category/group.

• STEPS:

(1) Identify the number of categories.

(2) Find the total for each category & the total for all categories.

(3) Calculate the percentage for each category (round to the nearest whole number).

      Percentage = (Total for each category ÷ Total for all categories) × 100

(4) Convert each percentage into degrees: (percentage ÷ 100) × 360°.

(5) Draw each sector (one per category) using a protractor.

EXAMPLE 3

NZ Holdings’ current assets (RM million) for the year 2000 are given in the table below. Construct a pie chart for the information given and give a brief comment.

Current Asset RM (Million)


Stocks 1520

Cash 720

Others 860


Current Asset     RM (Million)    Percentage    Convert to Degree
(STEP 1)                          (STEP 3)      (STEP 4)
Stocks
Cash
Others
TOTAL (STEP 2)
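
Steps 2 to 4 can be sketched in Python as a way of checking a hand-drawn pie chart. This is an illustrative sketch only: it follows the steps in Section 2.2, rounds each percentage to the nearest whole number as Step 3 asks, and uses variable names that are not from the notes.

```python
# Current assets of NZ Holdings (RM million), from Example 3.
assets = {"Stocks": 1520, "Cash": 720, "Others": 860}

total = sum(assets.values())  # Step 2: total for all categories

rows = []
for name, value in assets.items():
    percentage = round(value / total * 100)  # Step 3: nearest whole number
    degrees = percentage / 100 * 360         # Step 4: share of the full circle
    rows.append((name, value, percentage, degrees))

print(f"{'Asset':<7} {'RM m':>5} {'%':>4} {'Degrees':>8}")
for name, value, percentage, degrees in rows:
    print(f"{name:<7} {value:>5} {percentage:>4} {degrees:>8.1f}")
```

Because the rounded percentages happen to total 100 here, the sector angles total 360°. With other data, rounding can leave a small gap that has to be absorbed by one sector.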


2.3 VERTICAL & HORIZONTAL BAR CHART

• This bar chart is frequently used in newspapers, magazines, companies’ annual reports & other presentations to highlight information.

• It uses the lengths of horizontal bars (in a horizontal bar chart) or vertical columns (in a vertical bar chart) to represent quantities or percentages.

• GUIDELINES for constructing a bar chart:

(1) Label the horizontal axis (categories) and the vertical axis (with an appropriate scale).

(2) Construct a rectangle over each category with the height of the rectangle equal to the number of objects in that category. The bases of the rectangles should all be of the same width.

(3) Leave space between the categories on the horizontal axis to distinguish between them & to clarify the presentation.

EXAMPLE 4

Given below are data showing the quarterly profit (in RM ‘000) for XYZ Company for the year 2005. In the space provided below, draw a vertical and a horizontal bar chart to represent the quarterly profit. Give a brief comment on the charts drawn.

Profit (RM ‘000) Quarter


30 1st
37 2nd
40 3rd
29 4th


a) Vertical Bar Chart

b) Horizontal Bar Chart


2.4 CLUSTER @ MULTIPLE BAR CHART


• It utilizes several bars for each item.

• It enables us to see immediately the differences between clusters.

EXAMPLE 5

Given below are data showing the quarterly profit (in RM ‘000) for companies A, B and C for the year 2008. Draw a multiple bar chart to represent these data & give comments.

Quarter    Company    Profit (RM '000)
1st        A          16
           B          32
           C          49
2nd        A          16
           B          34
           C          47
3rd        A          18
           B          37
           C          57


2.5 STACKED @ COMPONENT BAR CHART

• It shows the breakdown of each total into its component bars.

• If the components are converted into percentages, a percentage stacked bar chart is produced in which all the bars are of equal height.

• STEPS:

(1) Find the total for each category.

(2) Calculate the percentage (if you are asked for a percentage component bar chart) for every category of respondents.

EXAMPLE 6

A study has been undertaken to determine if there is a relationship between the place

of residence & ownership of foreign made cars. A random sample of 500 car owners was

selected & the following results were obtained.

Car Ownership                 City    Town    Rural
Owns foreign car              90      60      25
Does not own a foreign car    110     90      125
TOTAL (STEP 1)                200     150     150

STEP 2

Car Ownership                 City    Town    Rural
Owns foreign car
Does not own a foreign car
TOTAL

Draw a component bar chart & percentage component bar chart for the above data in the space

provided below. Give comments.
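
The percentages needed for the percentage component bar chart (Step 2) can be sketched as follows. This is a minimal Python illustration, not the notes' own working; rounding to one decimal place is an assumption, and the dictionary layout is just one convenient choice.

```python
# Counts from Example 6, keyed by place of residence.
counts = {
    "City":  {"Owns foreign car": 90, "Does not own a foreign car": 110},
    "Town":  {"Owns foreign car": 60, "Does not own a foreign car": 90},
    "Rural": {"Owns foreign car": 25, "Does not own a foreign car": 125},
}

percentages = {}
for place, groups in counts.items():
    total = sum(groups.values())  # Step 1: total for each category
    percentages[place] = {
        # Step 2: each component as a percentage of that category's total
        group: round(n / total * 100, 1)
        for group, n in groups.items()
    }

for place, groups in percentages.items():
    print(place, groups)
```

Each bar in the percentage chart then has height 100, split according to these percentages.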


a) Component Bar Chart

b) Percentage Component Bar Chart


EXERCISE 1
Information on the number of students enrolled in four diploma programs at a private college in a particular semester was recorded. The following table shows the number of students by gender.

                       GENDER
PROGRAM                MALE    FEMALE
Computer Science       54      120
Mathematics            77      108
Statistics             89      114
Actuarial Science      64      88

Present the above information using a stacked bar chart and give your comment.


2.6 CONTINGENCY TABLE

• A contingency table is also known as a cross-classification table (or a 2×2 table when each variable has only two categories).

• It is used to examine the categorical responses in terms of two qualitative variables simultaneously.

EXAMPLE 7

A car manufacturer might be interested to know whether colour preference for a car

is independent of gender.

Gender \ Colour    Red    Green    White    Black    Blue    TOTAL
Male               30     10       26       33       1       100
Female             45     8        12       10       25      100
TOTAL              75     18       38       43       26      200

Contingency table for colour preference of a car

The contingency table above shows that men most often prefer black cars while women most often prefer red cars. In addition, blue is the least preferred colour among men, while green is the least preferred colour among women.

EXAMPLE 8

A survey of 155 respondents taken randomly from a housing park finds that 65 are male, and 25 of these male respondents are unmarried. In total, 92 respondents are married. Construct a 2×2 table for the above information. What percentage of the total respondents are married females?
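
One way to fill in the 2×2 table for Example 8 is to start from the four figures stated in the question and obtain every other cell by subtraction. The Python sketch below illustrates this; it reads the final question as "married females as a percentage of all respondents", which is an interpretation of the wording, and the variable names are illustrative.

```python
# Figures stated in Example 8.
total_respondents = 155
male = 65
unmarried_male = 25
married = 92

# Every other cell follows by subtraction from the row and column totals.
female = total_respondents - male              # 155 - 65
married_male = male - unmarried_male           # 65 - 25
married_female = married - married_male        # 92 - married males
unmarried = total_respondents - married        # 155 - 92
unmarried_female = unmarried - unmarried_male  # unmarried total - unmarried males

print(f"{'':<8}{'Married':>9}{'Unmarried':>11}{'TOTAL':>7}")
print(f"{'Male':<8}{married_male:>9}{unmarried_male:>11}{male:>7}")
print(f"{'Female':<8}{married_female:>9}{unmarried_female:>11}{female:>7}")
print(f"{'TOTAL':<8}{married:>9}{unmarried:>11}{total_respondents:>7}")

# Married females as a percentage of all respondents.
pct_married_female = married_female / total_respondents * 100
print(f"Married females: {pct_married_female:.2f}% of all respondents")
```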


ORGANIZING & GRAPHING QUANTITATIVE DATA

• Quantitative data are normally summarized in tabular form, using a frequency table to simplify the data.

• Quantitative data can be divided into two types: ungrouped & grouped data.

UNGROUPED DATA

• The data give information on each member of the population or sample individually.

GROUPED DATA

• The data are organized into categories or intervals.

2.7 STEM-AND-LEAF PLOT


• It is widely used in exploratory data analysis when the data set is small (n ≤ 30).

• This plot separates data entries into leading digits (stem) & trailing digits (leaf).

• It allows us to show the range of data values & the shape of the distribution.

• GUIDELINES:

1) Split each value into 2 sets of digits: the first set of digits is the stem and the second is the leaf.

2) List all the possible stem digits from the lowest to the highest.

3) For each value in the data set, write down its leaf on the line labelled with the appropriate stem.

EXAMPLE 9

Construct a stem-and-leaf display for the data below.

3.4 4.5 2.3 2.7 3.8 5.9 3.4 4.7 2.4 4.1 3.6 5.1

STEP 1
Arrange the data in ascending order as follows:

2.3 2.4 2.7 3.4 3.4 3.6 3.8 4.1 4.5 4.7 5.1 5.9

STEP 2

Then, draw a vertical line to separate the stem (at left) from the leaf values (at right) as follows:

Stem | Leaf
2    | 3 4 7
3    | 4 4 6 8
4    | 1 5 7
5    | 1 9
Note: 2 | 4 means 2.4

Based on the stem-and-leaf plot above, the data look roughly symmetric, so we can say that they are approximately normally distributed.
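
The construction in Example 9 can also be sketched in Python. The snippet below is illustrative only: it assumes every value has exactly one decimal place, so that the whole-number part is the stem and the tenths digit is the leaf.

```python
from collections import defaultdict

# Data from Example 9.
data = [3.4, 4.5, 2.3, 2.7, 3.8, 5.9, 3.4, 4.7, 2.4, 4.1, 3.6, 5.1]

stems = defaultdict(list)
for value in sorted(data):            # Step 1: ascending order
    tenths = round(value * 10)        # work in tenths to avoid float artefacts
    stem, leaf = divmod(tenths, 10)   # Step 2: split into stem and leaf
    stems[stem].append(leaf)

print("Stem | Leaf")
for stem in range(min(stems), max(stems) + 1):
    leaves = " ".join(str(leaf) for leaf in stems.get(stem, []))
    print(f"{stem:<4} | {leaves}")
```

Listing every stem between the smallest and largest, even one with no leaves, keeps the shape of the distribution honest.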

• Usually we interpret stem-and-leaf plots in terms of skewness. Skewness measures the lack of symmetry in a data distribution. (Refer to text book page 83 for more detail on skewness.)

• SHAPE OF DISTRIBUTION

1) Skewed to the left: Mean < Median < Mode

2) Skewed to the right: Mean > Median > Mode

3) Normally distributed/Symmetric: Mean = Median = Mode

EXERCISE 2
The statistics marks of 30 students taken randomly from the final exam results are as follows. Construct a stem-and-leaf display for the data below and comment on the shape of the distribution.

75 68 62 82 80 55 91 65 71 84

52 72 77 63 84 92 60 53 45 80

58 60 74 72 54 64 68 70 62 58


2.8 FREQUENCY DISTRIBUTION OF UNGROUPED DATA

• A frequency distribution is a summary table where the data are grouped into a number of classes.

• Objective: To obtain the number of responses associated with the different values of the variable.

• The frequency of an observation is the number of times the observation has occurred.

• For ungrouped data, the frequency distribution is a table consisting of the observed values with their corresponding frequencies.

EXAMPLE 10

Consider the following ungrouped data.

3 2 2 3 2 4 4 1 2 2

4 3 2 0 2 2 1 3 3 1

• The frequency distribution of the above data can be expressed as follows.

• The value 1 occurs 3 times, thus the frequency of value 1 is 3. Likewise, 2 occurs 8 times, so its frequency is 8, and so on.

Class (x)    Frequency (f)
0            1
1            3
2            8
3            5
4            3


2.9 FREQUENCY DISTRIBUTION OF GROUPED DATA

• A frequency table summarizes the data collected by forming intervals of values & indicating the number of observations that fall into each interval.

• Advantages of grouped data:

1) Reduces the complexity of the data

2) Helps to smooth out irregularities in the distribution

• Disadvantage of grouped data: Some information is lost when data are grouped into several class intervals. For instance, if it is known that there are six observations in an interval labeled 15-20, one cannot say whether they are all at one end of the interval or are spread through it.

• GUIDELINES:

1) Class intervals should be mutually exclusive (classes should be clearly defined & must not overlap).

2) Class intervals should be of equal width.

3) There should be neither too few nor too many classes (5 ≤ number of classes ≤ 15).

4) Finally, the frequency of each class is indicated in the frequency table. Note that the number of observations (n) should be equal to the sum of the frequencies (∑f = n).

• CLASS SIZE is the width of a class.

      Class Size = Upper Class Boundary − Lower Class Boundary

• CLASS LIMITS of a class are the highest & lowest values of a class. Every class has 2 class limits, the lower class limit & the upper class limit.

• CLASS MIDPOINT is also known as the class mark. Every class has a class midpoint.

      Class Midpoint = (Upper Class Limit + Lower Class Limit) ÷ 2

• CLASS BOUNDARIES are the values at which each class joins the next class. Each class has 2 class boundaries, the upper class boundary & the lower class boundary. A frequency table can be constructed either with class boundaries or without class boundaries. The class boundary is given by the midpoint between the upper limit of one class and the lower limit of the next class.

• When ungrouped data are given and you are asked to construct the frequency distribution, but the class width/size is not mentioned, the CLASS SIZE is calculated as:

      Class Size = (Largest Data Value − Smallest Data Value) ÷ No. of Classes (k)

  where the NO. OF CLASSES is given by:

      k = log n ÷ log 2

• FREQUENCY TABLE 1: Example of a frequency table WITHOUT CLASS BOUNDARIES. For this type of frequency table, the upper class limit of a class is not equal to the lower class limit of the next class. The class intervals do not overlap.

Age (Years) No. of Employees


18 - 22 8
23 - 27 13
28 - 32 15
33 - 37 24


• FREQUENCY TABLE 2: The class boundaries can be constructed for the above table as follows.

Age (Years) No. of Employees


17.5 – 22.5 8
22.5 – 27.5 13
27.5 – 32.5 15
32.5 – 37.5 24

• FREQUENCY TABLE 3: Example of a frequency table WITH CLASS BOUNDARIES. For this type of frequency table, the upper class limit of a class is equal to the lower class limit of the next class.

Age (Years)    No. of Employees
18 < 23        8
23 < 28        13
28 < 33        15
33 < 38        24

• Frequency Table 3 above can also be constructed with open-ended classes as follows.

Age (Years)    No. of Employees
18-            8
23-            13
28-            15
33-            24

NOTE: 18- means 18 or more, but less than 23.


EXAMPLE 11

a) Class boundary for the 2nd class is 600.5 to less than 800.5

Upper class boundary = (800 + 801) ÷ 2 = 800.5

Lower class boundary = (600 + 601) ÷ 2 = 600.5

b) Class width/size for the 2nd class is 200

Upper boundary – Lower boundary = 800.5 – 600.5 = 200

c) Midpoint for the 2nd class is 700.5

(Lower limit + Upper limit) ÷ 2 = (601 + 800) ÷ 2 = 700.5
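
The three calculations in Example 11 can be reproduced with a small helper. This is a sketch under the assumption, taken from the working above, that the class before the 2nd class ends at 600 and the class after it starts at 801; the function name `describe_class` is hypothetical.

```python
def describe_class(lower_limit, upper_limit, prev_upper_limit, next_lower_limit):
    """Return boundaries, width and midpoint of one class.

    Each boundary is the midpoint between the limit of this class and
    the adjacent limit of the neighbouring class.
    """
    lower_boundary = (prev_upper_limit + lower_limit) / 2
    upper_boundary = (upper_limit + next_lower_limit) / 2
    return {
        "boundaries": (lower_boundary, upper_boundary),
        "width": upper_boundary - lower_boundary,
        "midpoint": (lower_limit + upper_limit) / 2,
    }

# 2nd class of Example 11: limits 601 to 800.
second_class = describe_class(601, 800, prev_upper_limit=600, next_lower_limit=801)
print(second_class)
```

The same helper can be applied to the remaining classes once their limits are known.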

EXERCISE 3
Calculate the class limits, class boundaries, class width and class midpoint for all the classes in the data of Example 11.

Class Limits Class Boundaries Class Width Class Midpoint

601 to 800 600.5 to less than 800.5 200 700.5


EXERCISE 4
Construct a frequency distribution table based on data given below by using class limit.

Data on home runs hit by Major League Baseball teams during the 2002 season

Team Home Runs Team Home Runs

Anaheim 152 Milwaukee 139


Arizona 165 Minnesota 167
Atlanta 164 Montreal 162
Baltimore 165 New York Mets 160
Boston 177 New York Yankees 223
Chicago Cubs 200 Oakland 205
Chicago White Sox 217 Philadelphia 165
Cincinnati 169 Pittsburgh 142
Cleveland 192 St. Louis 175
Colorado 152 San Diego 136
Detroit 124 San Francisco 198
Florida 146 Seattle 152
Houston 167 Tampa Bay 133
Kansas City 140 Texas 230
Los Angeles 155 Toronto 187

SOLUTION:
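
One possible way to set up the classes is sketched below in Python, using the class-size and k = log n ÷ log 2 formulas from Section 2.9. Rounding k and the class size up to whole numbers, and starting the first class at the smallest value, are assumptions made for this sketch; other reasonable choices give a different but equally valid table.

```python
import math

# Home runs for the 30 teams, read down the two columns of the table.
home_runs = [152, 165, 164, 165, 177, 200, 217, 169, 192, 152,
             124, 146, 167, 140, 155, 139, 167, 162, 160, 223,
             205, 165, 142, 175, 136, 198, 152, 133, 230, 187]

n = len(home_runs)
k = math.ceil(math.log(n) / math.log(2))          # number of classes (rounded up: assumption)
class_size = math.ceil((max(home_runs) - min(home_runs)) / k)  # rounded up: assumption

start = min(home_runs)                            # first lower class limit (assumption)
table = []
for i in range(k):
    lower = start + i * class_size
    upper = lower + class_size - 1                # class limits for whole-number data
    frequency = sum(lower <= x <= upper for x in home_runs)
    table.append((lower, upper, frequency))

print("Home Runs    Frequency")
for lower, upper, frequency in table:
    print(f"{lower} - {upper}    {frequency}")
print(f"TOTAL        {sum(f for _, _, f in table)}")
```

The total of the frequency column should equal n = 30, as required by guideline 4 in Section 2.9.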


2.10 HISTOGRAM

• A histogram is a graphical presentation of the frequency distribution in which bars represent frequencies.

• A histogram is constructed by using the class boundaries & frequencies of the classes.

• The horizontal axis represents the random variable while the vertical axis represents the number, proportion or percentage of observations per class interval.

EXAMPLE 12

Data below shows the weight of 100 honeydews produced from Farm X. Draw a
histogram for the data in the space provided below.
Weight ('00 g) Frequency
4–6 4
6–8 9
8 – 10 34
10 – 12 25
12 – 14 28


EXERCISE 5
Data below shows the employees’ age (in years) in Company A. Draw a histogram for the data

in the space provided below.

Age (Years) No. of Employees


18 - 22 8
23 - 27 13
28 - 32 15
33 - 37 24
38 - 42 20


2.11 FREQUENCY POLYGON


• A frequency polygon is a graph formed by plotting the frequency of each class against its class midpoint and joining the points with straight lines.

• If a histogram is available, the frequency polygon is obtained by connecting the midpoints of the tops of the rectangles in the histogram.

• Two additional classes with zero frequency are added at the 2 ends of the histogram, so that the 2 ends of the frequency polygon are connected to the horizontal axis.

EXAMPLE 13

Daily sales (in RM) of 35 hawkers taken randomly in a town are shown in the table below.
Draw a histogram and frequency polygon to represent the data.
Daily Sales (RM) No. of Hawkers
121 - 136 7
137 – 152 6
153 – 168 11
169 – 184 6
185 - 200 5
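
The points to plot for the frequency polygon in Example 13 can be listed with a short Python sketch. It is illustrative only: it computes the class midpoints and adds one zero-frequency class at each end, as described in Section 2.11.

```python
# (lower limit, upper limit, frequency) for each class in Example 13.
classes = [(121, 136, 7), (137, 152, 6), (153, 168, 11), (169, 184, 6), (185, 200, 5)]

# Class size: distance between consecutive lower limits.
class_size = classes[1][0] - classes[0][0]

midpoints = [(lower + upper) / 2 for lower, upper, _ in classes]
frequencies = [f for _, _, f in classes]

# Add a zero-frequency class on each side so the polygon meets the horizontal axis.
points = (
    [(midpoints[0] - class_size, 0)]
    + list(zip(midpoints, frequencies))
    + [(midpoints[-1] + class_size, 0)]
)

for midpoint, frequency in points:
    print(f"midpoint {midpoint:>6}  frequency {frequency}")
```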


2.12 CUMULATIVE FREQUENCY DISTRIBUTION

• There are 2 types of cumulative frequency distributions, “less than” (frequently used) & “more than”.

• Cumulative frequency is determined by adding the frequencies.

• The frequencies up to the upper boundary of each class interval are progressively added & the cumulative totals are placed in a new column.

• Cumulative relative frequency is calculated by adding the relative frequencies.

      Relative Frequency = f_i ÷ ∑f = Frequency of that category ÷ Sum of all frequencies

      Percentage Relative Frequency = (f_i ÷ ∑f) × 100 = (Frequency of that category ÷ Sum of all frequencies) × 100

EXAMPLE 14

For the data below, calculate the cumulative frequency, relative frequency and

cumulative relative frequency.

Class     Class      Frequency    Cumulative    Relative          Cumulative Relative
Number               (f)          Frequency     Frequency         Frequency
                                  (STEP 2)      (STEP 3)          (STEP 4)
1         27 – 29    2            2             2 ÷ 50 = 0.04     0.04
2         29 – 31    9            2 + 9 = 11
3         31 – 33    13
4         33 – 35    14
5         35 – 37    8
6         37 – 39    4
STEP 1: Total
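
The remaining cells of the Example 14 table can be checked with the following Python sketch. It is an illustration rather than the notes' own working, and simply applies Steps 1 to 4 to the frequency column.

```python
from itertools import accumulate

# Frequencies of the six classes in Example 14.
frequencies = [2, 9, 13, 14, 8, 4]

total = sum(frequencies)                                 # Step 1
cumulative = list(accumulate(frequencies))               # Step 2
relative = [f / total for f in frequencies]              # Step 3
cumulative_relative = [c / total for c in cumulative]    # Step 4

print("f   cum.f  rel.f  cum.rel.f")
for f, c, r, cr in zip(frequencies, cumulative, relative, cumulative_relative):
    print(f"{f:<3} {c:<6} {r:<6.2f} {cr:.2f}")
```

The last cumulative frequency should equal the total, and the last cumulative relative frequency should equal 1.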


EXERCISE 6
The following information shows the relative frequency distribution table of the monthly

telephone bills (in RM) for November 2008, spent by 200 households in Shah Alam.

Monthly Telephone Bills (in RM)    Relative Frequency
50 – 99                            0.12
100 – 149                          0.05
150 – 199                          0.10
200 – 249                          0.20
250 – 299                          0.30
300 – 349                          0.13
350 – 399                          0.10

Construct a frequency distribution table for the above data.


2.13 CUMULATIVE FREQUENCY CURVE @ OGIVE

• An ogive is a graph or a line chart of a cumulative frequency distribution (S-shape).

• A “less than” ogive is an increasing function & rises to the right.

• A “more than” ogive is a decreasing function & falls to the right.

• An ogive is drawn based on the data from a cumulative frequency table or from cumulative relative frequencies.

• Ogives for relative frequencies are used when two cumulative distributions with different total frequencies are to be compared.

EXAMPLE 15

Draw a “less than” ogive for data below.

Service Years Number of Employees


1–4 16
5–8 20
9 – 12 28
13 – 16 24
17 – 20 16
21 – 24 11
25 – 28 5
STEPS

Step 1: Find the cumulative frequency.

Step 2: Find the lower boundary of each class (needed because the class intervals do not touch).

Service Years    Frequency    Cumulative Frequency    Lower Boundary
                              (STEP 1)                (STEP 2)
1 – 4            16           16                      (0 + 1) ÷ 2 = 0.5
5 – 8            20           36                      (4 + 5) ÷ 2 = 4.5
9 – 12           28           64                      (8 + 9) ÷ 2 = 8.5
13 – 16          24           88                      (12 + 13) ÷ 2 = 12.5
17 – 20          16           104                     (16 + 17) ÷ 2 = 16.5
21 – 24          11           115                     (20 + 21) ÷ 2 = 20.5
25 – 28          5            120                     (24 + 25) ÷ 2 = 24.5

Step 3: Find the upper boundary of the last class, (28 + 29) ÷ 2 = 28.5, so that the total frequency can be plotted. Then create another table as follows. This step can be skipped once you are comfortable reading the cumulative frequency against each boundary (since this is a “less than” ogive), but it makes the ogive easier to plot.

Service Years     Cumulative Frequency
Less than 0.5     0
Less than 4.5     16
Less than 8.5     36
Less than 12.5    64
Less than 16.5    88
Less than 20.5    104
Less than 24.5    115
Less than 28.5    120

Step 4: Draw a less than ogive in the space provided below.
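
The points to plot for the "less than" ogive in Example 15 can be generated with a short Python sketch. It is illustrative only and assumes, as in the example, that consecutive classes are separated by a constant gap of one unit.

```python
from itertools import accumulate

# (lower limit, upper limit, frequency) for each class in Example 15.
classes = [(1, 4, 16), (5, 8, 20), (9, 12, 28), (13, 16, 24),
           (17, 20, 16), (21, 24, 11), (25, 28, 5)]

# Gap between the upper limit of one class and the lower limit of the next.
gap = classes[1][0] - classes[0][1]

# Lower boundary of the first class, then the upper boundary of every class.
boundaries = [classes[0][0] - gap / 2] + [upper + gap / 2 for _, upper, _ in classes]

# Nothing lies below the first boundary, so the curve starts at 0.
cumulative = [0] + list(accumulate(f for _, _, f in classes))

ogive_points = list(zip(boundaries, cumulative))
for boundary, total in ogive_points:
    print(f"Less than {boundary:<5} {total}")
```

Plotting these points and joining them with a smooth curve gives the "less than" ogive.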


EXERCISE 7
The following data show the monthly income (RM) of 50 fishermen in a village. Draw a less than
ogive. Hence, find the percentage of fishermen having income more than RM340.
Monthly Income (RM) No. of Fishermen
300 < 350 4
350 < 400 13
400 < 450 18
450 < 500 10
500 < 550 2
550 < 600 3


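NUMERICAL METHODS
MEASURES OF CENTRAL TENDENCY

MEAN ($\bar{x}$)
Mean is the average of the data values.

UNGROUPED DATA
$$\bar{x} = \frac{\sum x}{n}$$

GROUPED DATA
$$\bar{x} = \frac{\sum fx}{\sum f}$$
Step 1: Find the midpoint (x) of each class.
Step 2: Calculate fx.
Step 3: Find the total of f and the total of fx.

MEDIAN ($\tilde{x}$)
Median is the middle value of the data arranged in ascending order.

UNGROUPED DATA
$$\text{Location of } \tilde{x} = \frac{n+1}{2}$$
Step 1: Arrange the data in ascending order.
Step 2: Find the location of the median.
Step 3: Locate the median value from the arranged data & hence determine the median value.
Note:
If n is ODD, $\tilde{x}$ is in the middle of the data.
If n is EVEN, $\tilde{x}$ is the average of the 2 middle numbers.

GROUPED DATA
$$\tilde{x} = L_m + \left( \frac{\frac{n}{2} - \sum f_{m-1}}{f_m} \right) \times C$$
where $L_m$ is the lower boundary of the median class, $\sum f_{m-1}$ is the cumulative frequency of the classes before the median class, $f_m$ is the frequency of the median class and $C$ is the class size.
Step 1: Construct a cumulative frequency table.
Step 2: Find the position of the median, (n/2).
Step 3: Create a column for ‘position of data’ (using the cumulative frequency) to determine the median class.
Step 4: Apply the formula.

MODE ($\hat{x}$)
Mode is the most frequent value that occurs in a data set.

UNGROUPED DATA
Count the number of times each value occurs; the mode is the value that occurs most frequently.

GROUPED DATA
$$\hat{x} = L + \left( \frac{f_0 - f_1}{(f_0 - f_1) + (f_0 - f_2)} \right) \times C$$
where $L$ is the lower boundary of the modal class, $f_0$ is the frequency of the modal class, $f_1$ and $f_2$ are the frequencies of the classes before and after the modal class, and $C$ is the class size.
Step 1: Find the modal class (the class with the highest frequency).
Step 2: Apply the formula.
The mode can also be estimated from a histogram.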

MEASURES OF POSITION

FIRST QUARTILE (Q1)

UNGROUPED DATA
Step 1: Arrange the data in ascending order.
Step 2: Find the location of Q1.
$$\text{Location of } Q_1 = \frac{n+1}{4}$$
Step 3: Find the value of Q1 according to the location.

GROUPED DATA
Step 1: Find the location of Q1 to determine the class of Q1.
$$\text{Location of } Q_1 = \frac{n}{4}$$
Step 2: Apply the formula.
$$Q_1 = L_1 + \left( \frac{\frac{n}{4} - F_1}{f_1} \right) \times c_1$$
where $L_1$ is the lower boundary of the Q1 class, $F_1$ is the cumulative frequency before the Q1 class, $f_1$ is the frequency of the Q1 class and $c_1$ is its class size.

THIRD QUARTILE (Q3)

UNGROUPED DATA
Step 1: Arrange the data in ascending order.
Step 2: Find the location of Q3.
$$\text{Location of } Q_3 = \frac{3(n+1)}{4}$$
Step 3: Find the value of Q3 according to the location.

GROUPED DATA
Step 1: Find the location of Q3 to determine the class of Q3.
$$\text{Location of } Q_3 = \frac{3n}{4}$$
Step 2: Apply the formula.
$$Q_3 = L_3 + \left( \frac{\frac{3n}{4} - F_3}{f_3} \right) \times c_3$$
where $L_3$, $F_3$, $f_3$ and $c_3$ are defined for the Q3 class in the same way as for Q1.

SECOND QUARTILE (Q2)
Q2 = MEDIAN

UNGROUPED DATA
Step 1: Arrange the data in ascending order.
Step 2: Find the location of Q2.
$$\text{Location of } Q_2 = \frac{n+1}{2}$$
Step 3: Find the value of Q2 according to the location.
Note:
If n is ODD, Q2 is in the middle of the data.
If n is EVEN, Q2 is the average of the 2 middle numbers.

GROUPED DATA
Q2 = MEDIAN (refer to how to calculate the median for grouped data).

INTERPRETATION OF QUARTILE VALUES

Q1 = First quartile means 25% of the total data is less than the first quartile and 75% of the total data is more than the first quartile.

Q3 = Third quartile means 75% of the total data is less than the third quartile and 25% of the total data is more than the third quartile.

[Diagram: the ordered data are split 25% | 75% at Q1, 50% | 50% at Q2 and 75% | 25% at Q3]

MEASURES OF DISPERSION

RANGE
UNGROUPED DATA: Range = Largest Value – Smallest Value
GROUPED DATA: Range = Upper Boundary of Highest Class – Lower Boundary of Lowest Class

INTERQUARTILE RANGE
UNGROUPED & GROUPED DATA: IR = Q3 – Q1

QUARTILE DEVIATION @ SEMI INTERQUARTILE RANGE
UNGROUPED & GROUPED DATA: QD = ½ (Q3 – Q1)

CHARACTERISTICS OF MEASURES OF DISPERSION

1) The more spread out or dispersed the data are, the larger will be the range, the interquartile range, the quartile deviation, the variance & standard deviation.

2) The more clustered the data are, the smaller will be the range, the interquartile range, the quartile deviation, the variance & standard deviation.

3) If all data values are the same, that is, there is no variation in the data, the range, the interquartile range, the quartile deviation, the variance & standard deviation will be equal to zero.

4) The range, the interquartile range, the quartile deviation, the variance & standard deviation can never be negative.

VARIANCE & STANDARD DEVIATION

STANDARD DEVIATION ($s$)

UNGROUPED DATA
$$s = \sqrt{\frac{1}{n-1}\left[\sum x^2 - \frac{(\sum x)^2}{n}\right]}$$

GROUPED DATA
$$s = \sqrt{\frac{1}{n-1}\left[\sum fx^2 - \frac{(\sum fx)^2}{n}\right]}$$

VARIANCE ($s^2$)

UNGROUPED DATA
$$s^2 = \frac{1}{n-1}\left[\sum x^2 - \frac{(\sum x)^2}{n}\right]$$

GROUPED DATA
$$s^2 = \frac{1}{n-1}\left[\sum fx^2 - \frac{(\sum fx)^2}{n}\right]$$

Note: $\sum fx^2 \neq \sum (fx)^2$

MEAN DEVIATION @ AVERAGE ABSOLUTE DEVIATION

$$\text{Mean Deviation} = \frac{\sum |x - \bar{x}|}{n}$$

PEARSON’S COEFFICIENT OF SKEWNESS

• Skewness measures the lack of symmetry in a data distribution.

$$\text{Skewness} = \frac{\bar{x} - \hat{x}}{s} \quad \text{OR} \quad \text{Skewness} = \frac{3(\bar{x} - \tilde{x})}{s}$$

where $\bar{x}$ is the mean, $\hat{x}$ is the mode, $\tilde{x}$ is the median and $s$ is the standard deviation.

• If the coefficient of skewness is positive, the distribution of the data is skewed to the right.

• If the coefficient of skewness is negative, the distribution of the data is skewed to the left.

• If (MEAN – MODE) = Positive : The distribution is skewed to the right.

• If (MEAN – MODE) = Negative : The distribution is skewed to the left.

• If (MEAN – MODE) = 0 : The distribution is symmetrical/normal.

• If MEAN = MEDIAN = MODE : The distribution is symmetrical/normal.

• If MEAN > MEDIAN > MODE : The distribution is skewed to the right.

• If MEAN < MEDIAN < MODE : The distribution is skewed to the left.

2.14 UNGROUPED DATA

EXAMPLE 16

5 3 8 12 18 20 24 25 8 2

Based on the above data, calculate:

(1) Mean


(2) Mode and interpret the value obtained

(3) Median and interpret the value obtained

(4) First quartile and interpret the value obtained

(5) Third quartile and interpret the value obtained

(6) Variance


(7) Standard deviation

(8) Range

(9) Interquartile range

(10) Quartile deviation

(11) Pearson’s coefficient of skewness

(12) Mean deviation
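
As a way of checking hand calculations for Example 16, the ungrouped-data formulas from the summary tables can be sketched in Python. This is a hedged illustration: the notes only say to find the quartile "according to the location", so linear interpolation between neighbouring values at fractional locations is an assumption here, and the helper name `value_at` is hypothetical.

```python
import math
from collections import Counter

data = [5, 3, 8, 12, 18, 20, 24, 25, 8, 2]   # Example 16
x = sorted(data)
n = len(x)

def value_at(location):
    """Value at a 1-based location in the sorted data.

    Fractional locations are linearly interpolated (an assumption).
    """
    lower = math.floor(location)
    fraction = location - lower
    if fraction == 0:
        return x[lower - 1]
    return x[lower - 1] + fraction * (x[lower] - x[lower - 1])

mean = sum(x) / n
mode = Counter(x).most_common(1)[0][0]        # most frequent value
median = value_at((n + 1) / 2)
q1 = value_at((n + 1) / 4)
q3 = value_at(3 * (n + 1) / 4)

variance = (sum(v * v for v in x) - sum(x) ** 2 / n) / (n - 1)
std_dev = math.sqrt(variance)
data_range = x[-1] - x[0]
iqr = q3 - q1
quartile_deviation = iqr / 2

skew_mode = (mean - mode) / std_dev           # (mean - mode) / s
skew_median = 3 * (mean - median) / std_dev   # 3(mean - median) / s
mean_deviation = sum(abs(v - mean) for v in x) / n

for label, value in [("Mean", mean), ("Mode", mode), ("Median", median),
                     ("Q1", q1), ("Q3", q3), ("Variance", variance),
                     ("Std dev", std_dev), ("Range", data_range),
                     ("IQR", iqr), ("QD", quartile_deviation),
                     ("Skewness (mode)", skew_mode),
                     ("Skewness (median)", skew_median),
                     ("Mean deviation", mean_deviation)]:
    print(f"{label:<18} {value:.4f}")
```

Both skewness coefficients come out positive, which agrees with the rule MEAN > MEDIAN > MODE for a distribution skewed to the right.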


2.15 GROUPED DATA

EXAMPLE 17

Data below shows the monthly income (in RM) for employees at Suria Company.

Monthly Income (in RM)    No. of Employees (f)
700 - 799                 8
800 - 899                 13
900 - 999                 14
1000 - 1099               10
1100 - 1199               25
1200 - 1299               4

Based on the above data, calculate:

(1) Mean

(2) Mode and interpret the value obtained


(3) Median and interpret the value obtained

(4) Variance

(5) Standard deviation


(6) Range

(7) Pearson’s coefficient of skewness

(8) Mean deviation
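
The grouped-data formulas can be applied to Example 17 in the same spirit. The Python sketch below is an illustration, not the notes' own solution. Two assumptions are labelled in the comments: a missing neighbouring class is treated as having frequency 0 in the mode formula, and the mean deviation is weighted by frequency, because the notes give only the ungrouped formula.

```python
import math
from itertools import accumulate

# (lower limit, upper limit, frequency) for each class in Example 17.
classes = [(700, 799, 8), (800, 899, 13), (900, 999, 14),
           (1000, 1099, 10), (1100, 1199, 25), (1200, 1299, 4)]

gap = classes[1][0] - classes[0][1]           # gap between consecutive classes
C = classes[1][0] - classes[0][0]             # class size
mid = [(lo + hi) / 2 for lo, hi, _ in classes]
f = [freq for _, _, freq in classes]
n = sum(f)

sum_fx = sum(fi * xi for fi, xi in zip(f, mid))
sum_fx2 = sum(fi * xi ** 2 for fi, xi in zip(f, mid))
mean = sum_fx / n

# Median: first class whose cumulative frequency reaches n/2.
cum = list(accumulate(f))
m = next(i for i, c in enumerate(cum) if c >= n / 2)
F_before = cum[m - 1] if m > 0 else 0
median = (classes[m][0] - gap / 2) + (n / 2 - F_before) / f[m] * C

# Mode: class with the highest frequency.
j = f.index(max(f))
f0 = f[j]
f1 = f[j - 1] if j > 0 else 0                 # assumption: 0 if no class before
f2 = f[j + 1] if j < len(f) - 1 else 0        # assumption: 0 if no class after
mode = (classes[j][0] - gap / 2) + (f0 - f1) / ((f0 - f1) + (f0 - f2)) * C

variance = (sum_fx2 - sum_fx ** 2 / n) / (n - 1)
std_dev = math.sqrt(variance)
data_range = (classes[-1][1] + gap / 2) - (classes[0][0] - gap / 2)
skewness = (mean - mode) / std_dev
# Assumption: frequency-weighted form of the mean deviation.
mean_deviation = sum(fi * abs(xi - mean) for fi, xi in zip(f, mid)) / n

print(f"mean={mean:.2f} median={median:.2f} mode={mode:.2f}")
print(f"variance={variance:.2f} std_dev={std_dev:.2f} range={data_range}")
print(f"skewness={skewness:.4f} mean_deviation={mean_deviation:.2f}")
```

Here the mean is smaller than the mode, so the coefficient of skewness is negative and the distribution is skewed to the left.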


EXERCISE 8
The table below represents the record high temperatures in Fahrenheit for each of the 50

states.

Temperature Frequency
100 – 104 2
105 – 109 8
110 – 114 18
115 – 119 13
120 – 124 7
125 – 129 1
130 – 134 1

(a) Calculate the measures of central tendency and hence interpret the values obtained.


(b) Find the variance and standard deviation.

(c) Construct a frequency polygon.


(d) Draw a less than ogive and find the value of the first and third quartile. Interpret the

values obtained.


2.16 COEFFICIENT OF VARIATION @ RELATIVE DISPERSION


• The coefficient of variation (CV) is used to compare the dispersion, spread or variability of 2 or more data distributions.

• CV compares the standard deviation & mean of a distribution & converts this value to a percentage.

$$CV = \frac{s}{\bar{x}} \times 100$$

• If the CV of distribution A is greater than the CV of distribution B, then we say that distribution A is more dispersed or more spread out than distribution B.

• We can also say that distribution B is more consistent (or less dispersed or more stable) than distribution A.

• A larger relative variation implies less consistency, while a smaller relative variation implies more consistency.

EXAMPLE 18

During the first six months of 2009, the mean share price of Company A was RM1.90 with a standard deviation of RM0.50, while the mean share price of Company B was RM8.00 with a standard deviation of RM0.85. Which company’s share price is more consistent?
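
The comparison in Example 18 can be sketched in Python using the CV formula from this section. The function name `coefficient_of_variation` is illustrative, not something defined in the notes.

```python
def coefficient_of_variation(std_dev, mean):
    """CV = (s / mean) x 100, expressed as a percentage."""
    return std_dev / mean * 100

cv_a = coefficient_of_variation(0.50, 1.90)   # Company A
cv_b = coefficient_of_variation(0.85, 8.00)   # Company B

# The smaller CV indicates the more consistent share price.
more_consistent = "A" if cv_a < cv_b else "B"

print(f"CV of Company A: {cv_a:.2f}%")
print(f"CV of Company B: {cv_b:.2f}%")
print(f"More consistent: Company {more_consistent}")
```

Note that Company B has the larger standard deviation in ringgit but the smaller CV, which is why dispersion is compared relative to the mean.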


2.17 BOX-AND-WHISKER PLOT

• A box plot provides a useful graphical presentation of data using the minimum, maximum, first quartile (Q1), third quartile (Q3) & median values.

[Diagram: box-and-whisker plot with Minimum, Q1, Median, Q3 and Maximum marked from left to right]

• The vertical line inside the box represents the location of the median value.

• The vertical line at the left-hand side of the box represents the location of Q1.

• The vertical line at the right-hand side of the box represents the location of Q3.

• The extreme end of the line (a whisker) connected to the left-hand side of the box is the location of the smallest value.

• The extreme end of the line (a whisker) connected to the right-hand side of the box is the location of the largest value.

• Normally distributed → The median is in the middle of the box & the whiskers are of equal length.

• Negatively skewed → The whisker & the part of the rectangular box on the left-hand side are longer.

[Diagram: box-and-whisker plot of a negatively skewed distribution, with Minimum, Q1, Median, Q3 and Maximum marked]

• Positively skewed → The whisker & the part of the rectangular box on the right-hand side are longer.

[Diagram: box-and-whisker plot of a positively skewed distribution, with Minimum, Q1, Median, Q3 and Maximum marked]


EXAMPLE:
Construct a box-and-whisker plot for the data below.

12 18 20 34 8 42 30 58 40
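
The five values needed to draw this box-and-whisker plot can be obtained with the Python sketch below. It is illustrative only: it uses the ungrouped location formulas (n + 1)/4, (n + 1)/2 and 3(n + 1)/4 from the summary tables, and linear interpolation at fractional locations is an assumption. The helper name `value_at` is hypothetical.

```python
import math

data = [12, 18, 20, 34, 8, 42, 30, 58, 40]
x = sorted(data)
n = len(x)

def value_at(location):
    """Value at a 1-based location in the sorted data.

    Fractional locations are linearly interpolated (an assumption).
    """
    lower = math.floor(location)
    fraction = location - lower
    if fraction == 0:
        return x[lower - 1]
    return x[lower - 1] + fraction * (x[lower] - x[lower - 1])

five_number_summary = {
    "Minimum": x[0],
    "Q1": value_at((n + 1) / 4),
    "Median": value_at((n + 1) / 2),
    "Q3": value_at(3 * (n + 1) / 4),
    "Maximum": x[-1],
}

for name, value in five_number_summary.items():
    print(f"{name:<8} {value}")
```

The box is drawn from Q1 to Q3 with a line at the median, and the whiskers extend to the minimum and maximum.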


TUTORIAL 2
Please do all the questions listed below & show your calculations clearly.

REVIEW QUESTIONS 3
Text Book Page 64

Question 2 Question 4 Question 9 Question 13

REVIEW QUESTIONS 4
Text Book Page 93

Question 4 Question 6 Question 8 Question 10

Question 13 Question 20 Question 21 Question 23

Question 25 Question 27 Question 28 Question 29

REVIEW QUESTIONS 5
Text Book Page 127

Question 1 Question 2 Question 6 Question 7

Question 8 Question 9 Question 12 Question 13

Question 15 Question 17 Question 18 Question 20

Question 21 Question 22 Question 25 Question 29

Question 32 Question 33 Question 34
