MMW Module 4
Data Management
OVERVIEW
When conducting statistical research, an investigation, or a study, the researcher
must gather data for the particular variable under investigation. To describe situations,
make conclusions, and draw inferences about events, the researcher must organize the
data gathered in some meaningful way. After organizing the data, the next step is to
present them so they can be understood easily by those who will
benefit from reading the study.
Any data set can be characterized by measuring its central tendency. This
module discusses three measures of central tendency: the mean, the median, and
the mode. Another important characteristic of a data set is how it is distributed, or how far each
element is from some measure of central tendency. There are several ways to measure
the variability of data. The most common and most important is the standard
deviation, which provides an average distance of each element from the mean, although several
other measures are also important. When presenting or analyzing a data set, it is sometimes
helpful to group subjects into several equal groups. For example, to create four equal
groups we need the values that split the data such that 25% of the observations are in
each group.
This module will improve your understanding of the population from which
data were obtained and show how to use that information in decision making. The definitions
of statistical measures and how they are obtained will be presented, and the significance of
these measures will also be discussed.
Lesson Content
Learning Competencies
After completing the module, the learner should be able to:
Vision: A Premier technological institution in Agriculture and Allied Sciences in the Region
Mission: Advancing Agriculture, allied sciences and technological development through production, research, extension, management,
instruction and entrepreneurship for rural development. 48
For Instructional Purposes Only * 1st Semester AY 2023 - 2024
9. Analyze and interpret the data presented in the table using measure of
central tendency.
10. Advocate the use of statistical data in making important decisions.
11. Use a variety of statistical tools to process and manage numerical data.
12. Use linear regression to predict the value of a variable given certain
conditions.
13. Apply correlation to determine the relationship between two variables.
14. Perform operations on mathematical expressions correctly.
15. Articulate the importance of mathematics in one’s life.
16. Express appreciation for mathematics as a human endeavor.
17. Support the use of mathematics in various aspects and endeavors in life.
Motivation Questions:
What is the importance of Data Management in Mathematics?
Are statistics and data management the same?
A. Organization of Data
The easiest and most widely used way to organize data is to construct a frequency
distribution table. A frequency distribution is a grouping of the data into
categories showing the number of observations in each of the non-overlapping
classes.
After organizing data, the next step is to present the data so they can be
understood easily by the readers.
Definition of Terms:
A grouped frequency distribution is used when the range of the data set is
large; the data must be grouped into classes, whether they are categorical or
interval data. For interval data, each class is more than one unit in width. The
procedure for constructing the frequency distribution is discussed in the
succeeding sections.
GE 1 Mathematics in Modern World
Example 1:
Twenty applicants were given a performance evaluation appraisal. The data set is:
Excellent, Very Satisfactory, Satisfactory, Satisfactory,
Excellent, Satisfactory, Very Satisfactory, Satisfactory,
Excellent, Very Satisfactory, Very Satisfactory, Very Satisfactory,
Satisfactory, Very Satisfactory, Excellent, Excellent,
Very Satisfactory, Very Satisfactory, Excellent, Excellent
Solution.
$$\text{Percentage} = \frac{f}{n} \times 100$$
where f is the frequency of the class and n is the total number of values.
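As a minimal sketch, the frequency and percentage of each rating in Example 1 can be tallied in Python. The function name `frequency_table` is illustrative and not part of the module; the percentage is computed as f / n × 100, as in the formula above.

```python
from collections import Counter

# Ratings of the twenty applicants, copied row by row from Example 1.
ratings = [
    "Excellent", "Very Satisfactory", "Satisfactory", "Satisfactory",
    "Excellent", "Satisfactory", "Very Satisfactory", "Satisfactory",
    "Excellent", "Very Satisfactory", "Very Satisfactory", "Very Satisfactory",
    "Satisfactory", "Very Satisfactory", "Excellent", "Excellent",
    "Very Satisfactory", "Very Satisfactory", "Excellent", "Excellent",
]

def frequency_table(values):
    """Return {category: (frequency, percentage)} with percentage = f / n * 100."""
    n = len(values)
    counts = Counter(values)
    return {category: (f, 100 * f / n) for category, f in counts.items()}

table = frequency_table(ratings)
for category, (f, pct) in table.items():
    print(f"{category:<18} {f:>3} {pct:6.1f}%")
```

The three frequencies should add up to n = 20 and the percentages to 100, which is a quick check on any hand-built table.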
Generally, the number of classes for a frequency distribution table varies from 5
to 20. The decision about the number of classes depends on the method used by
the researcher.
$$\text{Suggested Class Interval } (i) = \frac{\text{Range}}{\text{Number of Classes}} = \frac{HV - LV}{k}$$
2. Rule 2. Another way to determine the class interval is by applying the formula
$$\text{Suggested Class Interval } (i) = \frac{\text{Range}}{1 + 3.322\,(\text{logarithm of total frequencies})}$$
Example 1:
Suppose a researcher wishes to study the scores of students in an
entrance examination conducted by a certain high school. The researcher would
have to collect the data by obtaining the scores of the students. The data
collected are presented below.
19 44 24 43 33 29 26 25 29 23
31 33 38 18 33 33 39 33 37 32
36 37 40 24 40 37 57 48 39 48
26 39 42 32 24 30 30 39 35 28
34 45 39 49 46 43 40 34 41 45
32 21 32 33 22 43 33 29 29 19
Step 6: Determine the midpoints. The midpoint can be found by getting the
average of the upper and lower limit in each class.
Example 2:
MRE Travel and Tours, a local travel agency, offers special discounts during
summer. The owner wanted to find out the ages of the people who avail of the special
discounts. A random sample of 40 customers who traveled last summer
revealed these ages.
24 36 28 34 23 37 28 31 22 39
27 28 45 23 21 55 48 48 43 27
33 29 31 25 26 37 49 25 42 42
28 40 34 27 28 37 51 16 38 32
Solution:
$$i = \frac{39}{1 + 3.322(\log 40)} = \frac{39}{1 + 3.322(1.60205999133)} = \frac{39}{6.3220432912} = 6.1688916389 \approx 6$$
• Class Limits
Select a starting point for the lowest class limit. The starting
point can be the smallest data value or any convenient
number less than the smallest data value. In our case, 16 is
used.
Step 6: Determine the midpoints. The midpoint can be found by getting the
average of the upper and lower limit in each class.
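The Rule 2 computation and the class construction for Example 2 can be sketched in Python as follows. This is an illustration under stated assumptions, not the module's own table: the function names are hypothetical, the classes start at the smallest value (16, as in the text), and each class of whole-number ages runs from its lower limit to lower limit + width − 1.

```python
import math

# Ages of the 40 sampled customers from Example 2.
ages = [
    24, 36, 28, 34, 23, 37, 28, 31, 22, 39,
    27, 28, 45, 23, 21, 55, 48, 48, 43, 27,
    33, 29, 31, 25, 26, 37, 49, 25, 42, 42,
    28, 40, 34, 27, 28, 37, 51, 16, 38, 32,
]

def class_width_rule2(data):
    """Suggested class interval: range / (1 + 3.322 * log10(n))."""
    data_range = max(data) - min(data)
    return data_range / (1 + 3.322 * math.log10(len(data)))

def grouped_frequency(data, start, width):
    """Return rows of (lower limit, upper limit, midpoint, frequency)."""
    rows = []
    lower = start
    while lower <= max(data):
        upper = lower + width - 1  # assumption: inclusive limits for integer data
        freq = sum(lower <= x <= upper for x in data)
        rows.append((lower, upper, (lower + upper) / 2, freq))
        lower += width
    return rows

width = round(class_width_rule2(ages))  # 6.17 rounds to 6
rows = grouped_frequency(ages, start=min(ages), width=width)
for lower, upper, midpoint, freq in rows:
    print(f"{lower}-{upper}  midpoint {midpoint}  f = {freq}")
```

The frequencies should sum to 40; if they do not, a value has fallen outside every class or been counted twice.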
16 29 32 21 44 44 36 41 24 40
28 30 47 47 34 47 46 27 35 50
26 33 50 46 33 48 38 29 19 27
22 32 53 31 44 42 55 28 40 19
Prepare a frequency distribution by completing the table below using Rule 1 and
Rule 2.
A. Mean
Example 1:
The daily salaries of a sample of eight employees of Freedomlife Inc.
are: ₱650, ₱550, ₱470, ₱580, ₱500, ₱750, ₱700, ₱450. Find the mean
daily wage of the employees.
Solution 1:
$$\bar{x} = \frac{\sum x}{n} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n}$$
$$\bar{x} = \frac{650 + 550 + 470 + 580 + 500 + 750 + 700 + 450}{8}$$
$$\bar{x} = \frac{4650}{8} = 581.25$$
The mean daily wage of the employees is ₱ 581.25.
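The same computation can be sketched in a few lines of Python; the function name `mean` is illustrative.

```python
# Daily salaries (in pesos) of the eight employees in Example 1.
salaries = [650, 550, 470, 580, 500, 750, 700, 450]

def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    return sum(values) / len(values)

mean_salary = mean(salaries)
print(mean_salary)  # 581.25
```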
B. Weighted Mean
The weighted mean is an average computed by giving different weights to some
of the individual values. If all the weights are equal, then the weighted mean
is the same as the arithmetic mean. The weighted mean is found by
multiplying each value by its corresponding weight and dividing by the sum of
the weights.
$$\bar{x} = \frac{x_1 w_1 + x_2 w_2 + x_3 w_3 + \cdots + x_n w_n}{w_1 + w_2 + w_3 + \cdots + w_n}$$
Example 1
Suppose that a marketing firm conducts a survey of 1,000 households to
determine the average number of Electric Fans each household owns. The
data show a large number of households with two or three electric fans and a
smaller number with one or four. Every household in the sample has at least
one electric fan and no household has more than four.
Solution:
Step 1. Assign a weight to each value in the data set.
𝑥1 = 1 𝑤1 = 73
𝑥2 = 2 𝑤2 = 378
𝑥3 = 3 𝑤3 = 459
𝑥4 = 4 𝑤4 = 90
Step 2. Compute the weighted mean using the formula.
$$\bar{x} = \frac{x_1 w_1 + x_2 w_2 + x_3 w_3 + x_4 w_4}{w_1 + w_2 + w_3 + w_4}$$
$$\bar{x} = \frac{2566}{1000} = 2.566$$
The mean number of electric fans per household in this sample is 2.566.
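A short Python sketch of the weighted mean, using the values and weights from Step 1; the function and variable names are illustrative.

```python
def weighted_mean(values, weights):
    """Sum of value * weight divided by the sum of the weights."""
    return sum(x * w for x, w in zip(values, weights)) / sum(weights)

fans = [1, 2, 3, 4]              # number of electric fans (x)
households = [73, 378, 459, 90]  # households reporting each number (w)

result = weighted_mean(fans, households)
print(round(result, 3))  # 2.566
```

Here the weights are frequencies, so the weighted mean is simply the mean of all 1,000 individual household answers.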
C. Median
Whenever the data is arranged in ascending or descending order, it is called
a data array. The median is the midpoint of the data array.
To determine the value of the median for the ungrouped data, we consider
the following rules:
1) Arrange the data in ascending or descending order.
2) If n is odd, the median is the middle value.
3) If n is even, the median is the average of the two middle values.
$$\text{Median (Rank Value)} = \frac{n + 1}{2}$$
Note: n is the population or sample size.
Example 1
Find the median of the ages of 9 top management employees of Villar
Holdings Inc. The ages are 56, 49, 61, 58, 56, 53, 60, 59, and 48.
Solution:
Example 2
The daily salaries of a sample of eight employees of Freedomlife Inc. are:
₱650, ₱550, ₱470, ₱580, ₱500, ₱750, ₱700, ₱450. Find the median daily
wage of the employees
Solution:
Since the middle point falls between ₱550 and ₱580, we can determine the
median of the data set by getting the average of the two values: (₱550 + ₱580) / 2 = ₱565.
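The three rules for the median of ungrouped data can be sketched in Python as below. The function name is illustrative; the two data sets are those of Examples 1 and 2.

```python
def median(values):
    """Median of ungrouped data: sort, then take the middle value (odd n)
    or the average of the two middle values (even n)."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2

ages = [56, 49, 61, 58, 56, 53, 60, 59, 48]            # n = 9 (odd)
salaries = [650, 550, 470, 580, 500, 750, 700, 450]    # n = 8 (even)

print(median(ages))      # 56
print(median(salaries))  # 565.0
```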
D. Mode
The mode is the value that appears most frequently in a data set. A set of
numbers may have one mode (unimodal), two modes (bimodal), more
than two modes (multimodal), or no mode at all.
Example 1
The following data represent the total unit sales of brand new cars from a
sample of 10 car dealer shops in Region XII for the 1st Quarter of 2019: 13,
14, 8, 10, 11, 13, 10, 8, 10, and 9. Find the mode.
Solution:
The ordered array of the data set is 8, 8, 9, 10, 10, 10, 11, 13, 13, 14.
Since 10 appears three times, more often than any other value, the mode is
10.
Example 2
Find the mode of the ages of 9 top management employees of Villar
Holdings Inc. The ages are 56, 49, 61, 58, 56, 53, 60, 59, and 48.
Solution:
The ordered array of data is 48, 49, 53, 56, 56, 58, 59, 60, 61.
Since 56 appears twice and every other value appears only once, the mode is 56.
Example 3
In a crash test, 11 cars were tested to determine what impact speed was
required to obtain minimal bumper damage. Find the mode of the speeds
given in miles per hour below.
24, 15, 18, 20, 18, 22, 24, 26, 18, 26, 24
Solution:
The ordered array of data is 15, 18, 18, 18, 20, 22, 24, 24, 24, 26, 26
Since both 18 and 24 occur 3 times in the data set, there are two modes and
the data set is considered bimodal.
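Finding the mode amounts to counting how often each value occurs. A minimal Python sketch, with an illustrative function name and the convention (an assumption here) that a data set whose values all occur equally often has no mode:

```python
from collections import Counter

def modes(values):
    """Return the sorted list of most frequent values.
    Returns [] when every distinct value occurs equally often (no mode)."""
    counts = Counter(values)
    highest = max(counts.values())
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == highest)

car_sales = [13, 14, 8, 10, 11, 13, 10, 8, 10, 9]              # Example 1
crash_speeds = [24, 15, 18, 20, 18, 22, 24, 26, 18, 26, 24]    # Example 3

print(modes(car_sales))     # [10]
print(modes(crash_speeds))  # [18, 24]
print(modes([1, 2, 3]))     # [] (hypothetical data with no mode)
```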
Dispersion refers to how far the actual values lie from the average value.
A measure of dispersion shows the scattering of the data. It tells how the data
vary from one another and gives a clear idea about the distribution of the
data. A measure of dispersion also shows the homogeneity or the heterogeneity of
the distribution of the observations.
A. Range
Example 1
The daily salaries of a sample of eight employees of Freedomlife Inc. are: ₱650,
₱550, ₱470, ₱580, ₱500, ₱750, ₱700, ₱450. Find the range.
Solution:
Step 1: Identify the highest value and the lowest value in the data set.
HV = ₱750 LV = ₱450
$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} \qquad s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$
$$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} \qquad s = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}}$$
Where:
$s^2$ = sample variance
$s$ = sample standard deviation
$x$ = the value of any particular observation
$\bar{x}$ = sample mean
$n$ = sample size
Example 2
The daily salaries of a sample of eight employees of Freedomlife Inc. are: ₱650,
₱550, ₱470, ₱580, ₱500, ₱750, ₱700, ₱450. Find the variance and standard
deviation.
Solution:
$$\bar{x} = \frac{\sum x}{n} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n}$$
$$\bar{x} = \frac{650 + 550 + 470 + 580 + 500 + 750 + 700 + 450}{8}$$
$$\bar{x} = \frac{4650}{8} = 581.25$$
Step 2: Subtract the mean from each value in the data set.
𝑥 𝑥 − 𝑥̅
650 68.75
550 -31.25
470 -111.25
580 -1.25
500 -81.25
750 168.75
700 118.75
450 -131.25
∑ 𝑥 = 4650 ∑(𝑥 − 𝑥̅ ) = 0
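Step 3: Get the square of x − x̄ and get the sum of the squares. For example, (68.75)² = 4,726.5625.
x      x − x̄      (x − x̄)²
650    68.75      4,726.5625
550    -31.25     976.5625
470    -111.25    12,376.5625
580    -1.25      1.5625
500    -81.25     6,601.5625
750    168.75     28,476.5625
700    118.75     14,101.5625
450    -131.25    17,226.5625
∑x = 4650    ∑(x − x̄) = 0    ∑(x − x̄)² = 84,487.5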
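Step 4: Solve for the variance and standard deviation. The standard deviation
is obtained by taking the square root of the variance.
$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} \qquad s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$
$$s^2 = \frac{84{,}487.5}{8 - 1} \qquad s = \sqrt{\frac{84{,}487.5}{8 - 1}}$$
$$s^2 = \frac{84{,}487.5}{7} \qquad s = \sqrt{\frac{84{,}487.5}{7}}$$
$$s^2 = 12{,}069.64 \qquad s = \sqrt{12{,}069.64}$$
$$s = 109.86$$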
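The definitional formula can be checked with a short Python sketch. The function name is illustrative; the steps mirror the solution: find the mean, square each deviation, sum the squares, and divide by n − 1.

```python
import math

salaries = [650, 550, 470, 580, 500, 750, 700, 450]

def sample_variance(values):
    """Sample variance: sum of squared deviations from the mean over n - 1."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    return sum(squared_deviations) / (n - 1)

s2 = sample_variance(salaries)
s = math.sqrt(s2)
print(round(s2, 2), round(s, 2))  # 12069.64 109.86
```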
Step 1: Get the sum of the values in the data set, then square each value and
get the sum of the squares.
𝑥 𝑥2
650 422500
550 302500
470 220900
580 336400
500 250000
750 562500
700 490000
450 202500
∑ 𝑥 = 4650 ∑ 𝑥 2 = 2787300
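$$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} \qquad s = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}}$$
$$s^2 = \frac{2{,}787{,}300 - \frac{(4650)^2}{8}}{8 - 1} \qquad s = \sqrt{\frac{2{,}787{,}300 - \frac{(4650)^2}{8}}{8 - 1}}$$
$$s^2 = \frac{2{,}787{,}300 - \frac{21{,}622{,}500}{8}}{7} \qquad s = \sqrt{\frac{2{,}787{,}300 - \frac{21{,}622{,}500}{8}}{7}}$$
$$s = 109.86$$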
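The computational (shortcut) formula needs only ∑x and ∑x². The sketch below, with an illustrative function name, applies it to the same salaries and compares the result with `statistics.variance` from the Python standard library, which also divides by n − 1.

```python
import math
import statistics

salaries = [650, 550, 470, 580, 500, 750, 700, 450]

def sample_variance_shortcut(values):
    """Sample variance from the sum of x and the sum of x squared."""
    n = len(values)
    sum_x = sum(values)                    # 4650
    sum_x2 = sum(x * x for x in values)    # 2,787,300
    return (sum_x2 - sum_x ** 2 / n) / (n - 1)

s2_shortcut = sample_variance_shortcut(salaries)
s2_library = statistics.variance(salaries)
print(round(s2_shortcut, 2), round(math.sqrt(s2_shortcut), 2))  # 12069.64 109.86
```

Both routes give the same variance; the shortcut avoids computing each deviation but can lose precision in floating point when the values are very large relative to their spread.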
Example 3
The monthly incomes of five research directors of Recoletos schools are
₱55,000, ₱59,500, ₱62,500, ₱57,000, and ₱61,000. Find the variance and
standard deviation.
Solution:
Step 2: Subtract the population mean from each value in the data set.
𝑥 𝑥− 𝜇
55,000 -4,000
59,500 500
62,500 3,500
57,000 -2,000
61,000 2,000
Step 3: Get the square of 𝑥 − 𝜇 and get the sum of the squares.
𝑥 𝑥− 𝜇 (𝑥 − 𝜇)2
55,000 -4,000 16,000,000
59,500 500 250,000
62,500 3,500 12,250,000
57,000 -2,000 4,000,000
61,000 2,000 4,000,000
∑ 𝑥 = 295,000 ∑ (𝑥 − 𝜇 ) = 0 ∑(𝑥 − 𝜇)2 = 36,500,000
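Step 4: Solve for the population variance and population standard deviation.
$$\sigma^2 = \frac{\sum (x - \mu)^2}{N} = \frac{36{,}500{,}000}{5} = 7{,}300{,}000$$
$$\sigma = \sqrt{7{,}300{,}000} = 2{,}701.85$$
Hence, the population variance is 7,300,000 and the population standard deviation
is 2,701.85.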
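Because the five directors are treated as a whole population, the sum of squared deviations is divided by N rather than n − 1. A minimal Python sketch with an illustrative function name:

```python
import math

incomes = [55000, 59500, 62500, 57000, 61000]

def population_variance(values):
    """Population variance: sum of squared deviations from mu, divided by N."""
    big_n = len(values)
    mu = sum(values) / big_n                      # 295,000 / 5 = 59,000
    return sum((x - mu) ** 2 for x in values) / big_n

sigma2 = population_variance(incomes)   # 36,500,000 / 5
sigma = math.sqrt(sigma2)
print(sigma2, round(sigma, 2))
```

The standard library functions `statistics.pvariance` and `statistics.pstdev` compute the same two quantities.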
A. Quartiles
A quartile is a statistical term describing a division of observations into four
defined intervals based upon the values of the data and how they compare to
the entire set of observations. Three points (a lower quartile, the median, and
an upper quartile) divide the data set into four groups. The lower
quartile or first quartile, denoted as Q1, is the middle number that falls
between the smallest value of the data set and the median. The second
quartile, Q2, is the median. The upper or third quartile, denoted as Q3, is
the central point that lies between the median and the highest number of the
distribution.
There are four groups formed from the quartiles. The first group of values
contains the smallest number up to Q1; the second group includes Q1 to the
median; the third set is the median to Q3; the fourth category comprises Q3 to
the highest data point of the entire set.
Each quartile group contains 25% of the total observations. The data are first
arranged from smallest to largest, and the position of each quartile is then found using:
$$Q_k = \frac{k(N + 1)}{4}$$
Where: $Q_k$ = quartile
$N$ = population size
$k$ = quartile location
Example 1
Find the first, second and third quartile of the ages of 9 top management
employees of Villar Holdings Inc. The ages are 56, 49, 61, 58, 56, 53, 60, 59,
and 48.
Solution:
Step 2: Select the first, second and third quartile values using the formula.
$$Q_1 = \frac{1(N + 1)}{4} = \frac{1(9 + 1)}{4} = \frac{10}{4} = 2.5$$
$$Q_2 = \frac{2(N + 1)}{4} = \frac{2(9 + 1)}{4} = \frac{20}{4} = 5$$
$$Q_3 = \frac{3(N + 1)}{4} = \frac{3(9 + 1)}{4} = \frac{30}{4} = 7.5$$
Step 3: Identify the first, second and third quartile values in the data set.
48, 49, 53, 56, 56, 58, 59, 60, 61
(2.5th position: between 49 and 53; 5th position: 56; 7.5th position: between 59 and 60)
Since the 2.5th falls between 49 and 53; and 7.5th falls between 59 and 60,
we can determine the first and third quartile of the data set by getting the
average of the two values.
Therefore, $Q_1 = 51$, $Q_2 = 56$, and $Q_3 = 59.5$.
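The position formula k(N + 1)/4 can be sketched in Python as below. The function name is illustrative. When the position is fractional, the sketch interpolates linearly between the two neighbouring values; at a position ending in .5 this equals the average of the two values used in the example, while the treatment of other fractions is an assumption, since the text only shows the .5 case.

```python
import math

def quartile(values, k):
    """k-th quartile (k = 1, 2, 3) located at rank k(N + 1)/4 of the sorted data."""
    ordered = sorted(values)
    position = k * (len(ordered) + 1) / 4   # 1-based rank, possibly fractional
    lower_rank = math.floor(position)
    fraction = position - lower_rank
    if fraction == 0:
        return ordered[lower_rank - 1]
    below = ordered[lower_rank - 1]
    above = ordered[lower_rank]
    return below + fraction * (above - below)

ages = [56, 49, 61, 58, 56, 53, 60, 59, 48]
print(quartile(ages, 1), quartile(ages, 2), quartile(ages, 3))  # 51.0 56 59.5
```

The sketch assumes enough observations that every computed rank lies between 1 and N; very small data sets would need extra bounds checks.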
B. z – scores
A z-score is a numerical measurement that describes a value's relationship to
the mean of a group of values. A z-score is measured in terms of standard
deviations from the mean. A z-score of 0 indicates that the data point's
score is identical to the mean score. A z-score of 1.0 indicates a value
that is one standard deviation above the mean. z-scores may be positive or
negative, with a positive value indicating the score is above the mean and a
negative value indicating it is below the mean.
$$z = \frac{x - \mu}{\sigma} \ \text{(for population)} \qquad z = \frac{x - \bar{x}}{s} \ \text{(for sample)}$$
Example 1
Books in the library are found to have an average length of 350 pages with
a standard deviation of 100 pages. What is the z-score corresponding to a book
of length 80 pages?
Solution:
Let $\mu = 350$, $\sigma = 100$, $x = 80$.
$$z = \frac{x - \mu}{\sigma}$$
$$z = \frac{80 - 350}{100} \qquad z = -2.7$$
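A z-score is a one-line computation. The sketch below uses an illustrative function name and the values of Example 1.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev

z = z_score(80, 350, 100)
print(z)  # -2.7
```

The negative sign shows that an 80-page book is 2.7 standard deviations shorter than the average book.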
6. If the right line is longer than the left line, the distribution is positively
skewed
Example 1
Construct a box plot for the following data:
12, 5, 22, 30, 7, 36, 14, 42, 15, 53, 25
Solution:
Step 1: Arrange the data in ascending order.
5, 7, 12, 14, 15, 22, 25, 30, 36, 42, 53
$$Q_2 = \frac{2(N + 1)}{4} = \frac{2(11 + 1)}{4} = 6$$
Step 3: Draw a number line that will include the smallest and the largest data.
Step 4: Draw three vertical lines at the lower quartile (12), median (22) and the
upper quartile (36), just above the number line.
Step 5: Join the lines for the lower quartile and the upper quartile to form a box.
Step 6: Draw a line from the smallest value (5) to the left side of the box and
draw a line from the right side of the box to the biggest value (53).
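The five values needed to draw the box plot (smallest value, lower quartile, median, upper quartile, largest value) can be computed with the Python standard library. This sketch relies on `statistics.quantiles` with `method="exclusive"`, which places the quartiles at rank k(N + 1)/4; the function and variable names are illustrative.

```python
from statistics import quantiles

data = [12, 5, 22, 30, 7, 36, 14, 42, 15, 53, 25]

def five_number_summary(values):
    """Return (smallest, Q1, median, Q3, largest) for a box plot."""
    q1, q2, q3 = quantiles(values, n=4, method="exclusive")
    return min(values), q1, q2, q3, max(values)

low, q1, median, q3, high = five_number_summary(data)
print(low, q1, median, q3, high)  # 5 12.0 22.0 36.0 53

# Compare the two lines drawn in Step 6.
left_line = q1 - low      # 12 - 5 = 7
right_line = high - q3    # 53 - 36 = 17
print("right line longer" if right_line > left_line else "left line longer or equal")
```

Here the right line is longer than the left line, which by the rule stated before the example suggests a positively skewed distribution.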
Where: $z$ = z value
$x$ = the value of any particular observation or measurement
$\mu$ = mean of the distribution
$\sigma$ = standard deviation of the distribution
Example 1
Determine the area under the standard normal distribution curve between z = 0
and z = 1.35.
Solution: Draw the figure and represent the area as shown in the figure below.
[Figure: standard normal curve with the area between z = 0 and z = 1.35 shaded]
Example 2
Determine the area under the standard normal distribution curve between z = 0
and z = -1.85.
[Figure: standard normal curve with the area between z = -1.85 and z = 0 shaded]
Example 3
Find the area under the standard normal distribution to the right of 1.25.
[Figure: standard normal curve with the area to the right of z = 1.25 shaded]
The required area is the right tail of the normal curve. Since Table A gives the area
between z = 0 and z = 1.25, first find that area.
Then subtract 𝑃(0 < 𝑧 < 1.25) = 0.3944 from 0.5000, since half of the area under
the curve is to the right of z = 0.
𝑃(𝑧 > 1.25) = 0.5000 − 𝑃(0 < 𝑧 < 1.25)
𝑃(𝑧 > 1.25) = 0.5000 − 0.3944
𝑃(𝑧 > 1.25) = 0.1056
Example 4
Determine the area under the standard normal distribution curve between z = 0.5
and z = 1.75.
[Figure: standard normal curve with the area between z = 0.5 and z = 1.75 shaded; area = 0.2684]
𝑃(0 < 𝑧 < 1.75) = 0.4599 𝑃(0 < 𝑧 < 0.5) = 0.1915
𝑃(0.5 < 𝑧 < 1.75) = 𝑃(0 < 𝑧 < 1.75) − 𝑃(0 < 𝑧 < 0.5)
𝑃(0.5 < 𝑧 < 1.75) = 0.4599 − 0.1915
𝑃(0.5 < 𝑧 < 1.75) = 0.2684
Example 5
Determine the area under the standard normal distribution curve between z = 1.25
and z = - 1.5.
[Figure: standard normal curve with the area between z = -1.5 and z = 1.25 shaded]
Since the two areas are on the opposite sides of z = 0, we must find both areas
and add them.
𝑃(−1.5 < 𝑧 < 1.25) = 𝑃(−1.5 < 𝑧 < 0) + 𝑃(0 < 𝑧 < 1.25) = 0.4332 + 0.3944 =
0.8276
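The table look-ups in Examples 1 to 5 can be cross-checked numerically. The sketch below is not the table method itself: it computes the cumulative area with the error function from Python's `math` module, and the function names are illustrative. Results may differ from table-based answers in the fourth decimal place because the table entries are already rounded.

```python
import math

def phi(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def area_between(z1, z2):
    """Area under the standard normal curve between two z values."""
    low, high = sorted((z1, z2))
    return phi(high) - phi(low)

example_1 = area_between(0, 1.35)      # table: 0.4115
example_2 = area_between(0, -1.85)     # table: 0.4678
example_3 = 1 - phi(1.25)              # right tail; table-based: 0.1056
example_4 = area_between(0.5, 1.75)    # table-based: 0.2684
example_5 = area_between(-1.5, 1.25)   # table-based: 0.8276

for name, value in [("Example 1", example_1), ("Example 2", example_2),
                    ("Example 3", example_3), ("Example 4", example_4),
                    ("Example 5", example_5)]:
    print(name, round(value, 4))
```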
Example 6
Find the z value such that the area under the standard normal distribution curve
between 0 and z value is 0.4625.
Solution: Draw the figure. Find the area 0.4625 in Table A. Then read the z value from the left
column (1.7) and from the top row (0.08), and add these two values to get z = 1.78.
[Figure: standard normal curve with an area of 0.4625 shaded between 0 and z]
z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279 0.0319 0.0359
0.1 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753
0.2 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141
0.3 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517
0.4 0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808 0.1844 0.1879
0.5 0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157 0.2190 0.2224
0.6 0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486 0.2517 0.2549
0.7 0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794 0.2823 0.2852
0.8 0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133
0.9 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389
1.0 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621
1.1 0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830
1.2 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015
1.3 0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177
1.4 0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319
1.5 0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4429 0.4441
1.6 0.4452 0.4463 0.4474 0.4484 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545
1.7 0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633
1.8 0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706
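The reverse look-up of Example 6 (from an area back to a z value) can be sketched with `statistics.NormalDist` from the Python standard library. The function name is illustrative. Since the table gives the area between 0 and z, 0.5 is added to obtain the cumulative area before inverting.

```python
from statistics import NormalDist

def z_for_area_from_zero(area):
    """Positive z such that the area between 0 and z equals `area` (0 < area < 0.5)."""
    return NormalDist().inv_cdf(0.5 + area)

z = z_for_area_from_zero(0.4625)
print(round(z, 2))  # 1.78
```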
Example 1
A radar unit is used to measure the speeds of cars on a motorway. The speeds are
normally distributed with a mean of 90 km/hr and a standard deviation of 10 km/hr. What
is the probability that a car picked at random is travelling at more than 100 km/hr?
Solution:
P(x > 100)
[Figure: normal curve with mean 90 and the area to the right of x = 100 shaded]
The probability that a car selected at random has a speed greater than 100
km/hr is equal to 0.1587 or 15.87%
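A numerical check of this result, sketched with `statistics.NormalDist` (Python 3.8 or later); the variable names are illustrative.

```python
from statistics import NormalDist

speeds = NormalDist(mu=90, sigma=10)

z = (100 - 90) / 10                  # z-score of 100 km/hr
p_over_100 = 1 - speeds.cdf(100)     # area to the right of x = 100
print(z, round(p_over_100, 4))       # 1.0 0.1587
```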
Example 2
For certain types of computers, the length of time between charges of battery is
normally distributed with a mean of 50 hrs and a standard deviation of 15 hrs.
John owns one of these computers and wants to know the probability that the
length of time will be between 50 to 70 hrs.
Solution:
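[Figure: normal curve with the area between x = 50 and x = 70 shaded]
a. For x = 50:
$$z = \frac{x - \mu}{\sigma} = \frac{50 - 50}{15} = 0$$
b. For x = 70:
$$z = \frac{x - \mu}{\sigma} = \frac{70 - 50}{15} = \frac{20}{15} = 1.33$$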
Hence, the probability that the length of time between charges of John’s computer is
between 50 and 70 hrs is 40.82%.
Example 3
Entry to a certain University is determined by a national test. The scores on this
test are normally distributed with a mean of 500 and a standard deviation of 100.
Tom wants to be admitted to this University and he knows that he must score
better than at least 70% of the students who took the test. Tom takes the test
and scores 585. Will he be admitted to this University?
Solution:
[Figure: normal curve marking the mean (500) and Tom's score (585)]
Tom scored better than 80.23% of the students who took the test and he will be
admitted to the University.
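Examples 2 and 3 can be cross-checked the same way. In this sketch the variable names are illustrative, and the probabilities are computed without rounding z first, so the battery result comes out slightly higher (about 0.4088) than the table-based 0.4082, which uses z rounded to 1.33.

```python
from statistics import NormalDist

# Example 2: time between battery charges, mean 50 hrs, standard deviation 15 hrs.
battery = NormalDist(mu=50, sigma=15)
p_50_to_70 = battery.cdf(70) - battery.cdf(50)

# Example 3: test scores, mean 500, standard deviation 100; Tom scored 585.
scores = NormalDist(mu=500, sigma=100)
share_below_tom = scores.cdf(585)    # proportion of students scoring below Tom
admitted = share_below_tom >= 0.70   # he must beat at least 70% of test takers

print(round(p_50_to_70, 4))
print(round(share_below_tom, 4), admitted)  # 0.8023 True
```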
Example 4
The length of life of an instrument produced by a machine has a normal
distribution with a mean of 12 months and standard deviation of 2 months. Find
the probability that an instrument produced by this machine will last
Solution (a):
[Figure: normal curve with x = 7 and the mean (12) marked]
Solution (b):
[Figure: normal curve with the area between x = 7 and the mean (12) shaded]
Hence the probability that an instrument produced by this machine will last
between 7 and 12 months is 49.38%.
Example 5
a) What is the probability that the length of this component is between 4.98 and
5.02 cm?
b) What is the probability that the length of this component is between 4.96 and
5.04 cm?
Solution (a):
Step 1. Draw the figure and represent the area.
[Figure: normal curve with the area between x = 4.98 and x = 5.02 shaded]
For x = 4.98:
$$z = \frac{4.98 - 5.0}{0.02} = \frac{-0.02}{0.02} = -1$$
For x = 5.02:
$$z = \frac{5.02 - 5.0}{0.02} = \frac{0.02}{0.02} = 1$$
Hence, the probability that the length of the component is between 4.98 and
5.02 is 68.26%
Solution (b):
Step 1. Draw the figure and represent the area.
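[Figure: normal curve with the area between x = 4.96 and x = 5.04 shaded]
For x = 4.96:
$$z = \frac{4.96 - 5.0}{0.02} = \frac{-0.04}{0.02} = -2$$
For x = 5.04:
$$z = \frac{5.04 - 5.0}{0.02} = \frac{0.04}{0.02} = 2$$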
Hence, the probability that the length of the component is between 4.96 and
5.04 is 95.44%
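Both parts can be checked with one small helper. The mean of 5.0 cm and standard deviation of 0.02 cm are taken from the z computations above; the helper name `prob_between` is illustrative. The unrounded results (about 0.6827 and 0.9545) differ from the table-based 68.26% and 95.44% only in the last digit.

```python
from statistics import NormalDist

length = NormalDist(mu=5.0, sigma=0.02)

def prob_between(dist, low, high):
    """Probability that a value drawn from `dist` lies between low and high."""
    return dist.cdf(high) - dist.cdf(low)

p_a = prob_between(length, 4.98, 5.02)   # within one standard deviation
p_b = prob_between(length, 4.96, 5.04)   # within two standard deviations
print(round(p_a, 4), round(p_b, 4))
```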
$$t = \frac{r\sqrt{n - 2}}{\sqrt{1 - r^2}}$$
where: $t$ = t-test for correlation coefficient
$r$ = correlation coefficient
$n$ = number of paired samples
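The test statistic can be sketched as a small Python function. The function name and the sample inputs (r = 0.6, n = 27) are hypothetical, chosen only to make the arithmetic easy to follow.

```python
import math

def correlation_t_statistic(r, n):
    """t = r * sqrt(n - 2) / sqrt(1 - r^2) for a correlation coefficient r
    computed from n paired samples."""
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    if n <= 2:
        raise ValueError("at least three paired samples are needed")
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Hypothetical inputs: 0.6 * sqrt(25) / sqrt(0.64) = 3.0 / 0.8
t = correlation_t_statistic(0.6, 27)
print(round(t, 2))  # 3.75
```

The computed t is then compared with the critical value from a t table with n − 2 degrees of freedom at the chosen significance level.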
Assumptions:
1) Samples are randomly selected.
2) Both populations are normally distributed.
The test for correlation coefficient is two – tailed; the rejection region is
divided into two equal parts. The figure below illustrates the rejection and non-
rejection region of the test of hypothesis of correlation coefficient.
When the null hypothesis has been rejected at a specific significance level,
there are several possible relationships between the x and y variables:
1) There is a direct cause-and-effect relationship between the two variables.
2) There is a reverse cause-and-effect relationship between the two
variables.
3) The relationship between the two variables may be caused by a third
variable.
4) There may be a complex interrelationship among many variables.
Example 1
A mathematics instructor at a university would like to examine the relationship (if any)
between the number of optional homework problems students do during the semester
and their final grade. She randomly selected 12 students for the study and asked them to
keep track of the number of these problems they completed during the semester. At
the end of the class, each student's total was recorded along with their final grade. The
data are in the table below.
Student 1 2 3 4 5 6 7 8 9 10 11 12
# of Problems 51 58 62 65 68 76 77 78 78 84 85 91
Final Grade 62 68 66 66 67 72 73 72 78 73 76 75
Plot the data on a scatter diagram. Does it appear that there is a relationship
between the number of optional problems students do and their final grade?
Compute the correlation coefficient. Determine at the 0.05 significance level
whether the correlation in the population is greater than zero.
Solution:
[Figure: scatter diagram of Final Grade (vertical axis, 0 to 90) against No. of Problems (horizontal axis, 0 to 100)]
Step 4: Determine the degrees of freedom and the critical value of t based on the
table of critical values (t-distribution table).
df = n - 2 = 12 - 2 = 10 and t = ±2.228
Student   x (No. of optional   y (Final
number    problems)            Grade)       x²            y²            xy
1         51                   62           2601          3844          3162
2         58                   68           3364          4624          3944
3         62                   66           3844          4356          4092
4         65                   66           4225          4356          4290
5         68                   67           4624          4489          4556
6         76                   72           5776          5184          5472
7         77                   73           5929          5329          5621
8         78                   72           6084          5184          5616
9         78                   78           6084          6084          6084
10        84                   73           7056          5329          6132
11        85                   76           7225          5776          6460
12        91                   75           8281          5625          6825
Total     ∑x = 873             ∑y = 848     ∑x² = 65093   ∑y² = 60180   ∑xy = 62254
$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{[n(\sum x^2) - (\sum x)^2][n(\sum y^2) - (\sum y)^2]}}$
$r = \frac{12(62254) - (873)(848)}{\sqrt{[12(65093) - (873)^2][12(60180) - (848)^2]}}$
$r = \frac{747048 - 740304}{\sqrt{[781116 - 762129][722160 - 719104]}}$
$r = \frac{6744}{\sqrt{[18987][3056]}}$
$r = \frac{6744}{\sqrt{58024272}}$
$r = \frac{6744}{7617.37}$
$r = 0.8853449$
$r \approx 0.89$
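The computation of r above can be reproduced with a short Python sketch. It uses the raw-sum form of the formula and the data from the table; the function and variable names are illustrative assumptions, not part of the module.

```python
import math

# Data from the Example 1 table
problems = [51, 58, 62, 65, 68, 76, 77, 78, 78, 84, 85, 91]
grades = [62, 68, 66, 66, 67, 72, 73, 72, 78, 73, 76, 75]

def pearson_r(x, y):
    """Pearson correlation coefficient using the raw-sum formula."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_x2 = sum(v * v for v in x)
    sum_y2 = sum(v * v for v in y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator

r = pearson_r(problems, grades)
print(f"r = {r:.4f}")  # about 0.8853, which rounds to 0.89
```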
$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}$
$t = \frac{0.89\sqrt{12-2}}{\sqrt{1-(0.89)^2}}$
$t = \frac{0.89\sqrt{10}}{\sqrt{1-0.7921}}$
$t = \frac{0.89(3.16)}{\sqrt{0.2079}}$
$t = \frac{2.8124}{0.4560}$
$t = 6.1675$
Since the computed value of 6.1675 is greater than the tabular value of 2.228 at α =
0.05, we need to reject the null hypothesis.
Step 7: Conclusion:
Since the null hypothesis has been rejected, we can conclude that there is evidence
of a significant association between the number of optional problems a student
completes and the final grade in the course.
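The test statistic and the decision can be sketched in Python as follows. The values r = 0.89, n = 12, and the critical value 2.228 are taken from the example; the function name is an illustrative assumption. Because the code does not round the square root of 10 to 3.16, it gives t of about 6.17 rather than exactly 6.1675, which leads to the same decision.

```python
import math

def correlation_t(r, n):
    """t statistic for testing a correlation coefficient, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

r = 0.89            # rounded correlation coefficient from the example
n = 12              # number of paired samples
critical_t = 2.228  # two-tailed critical value at alpha = 0.05, df = 10 (from the example)

t = correlation_t(r, n)
reject_null = abs(t) > critical_t  # two-tailed decision rule

print(f"t = {t:.4f}")
print("Reject the null hypothesis" if reject_null else "Do not reject the null hypothesis")
```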
The least squares model determines the regression equation by minimizing the
sum of the squares of the vertical distances between the actual y values and the
predicted values of y. This method gives what is generally known as the "best
fitting" line. The difference between an observed value and a predicted value is
called the residual. The mean of the residuals is always zero. Points that fall
outside the overall pattern of the other points are called outliers.
In a scatterplot, scores whose removal greatly changes the regression line are
called influential scores. In some cases, these scores are
restricted to points with extreme x-values. Some influential scores may have a
small residual but still have a greater effect on the regression line than scores
with possibly larger residuals but average x-values.
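The claim that the residuals of a least squares line average to zero can be checked with a small Python sketch. It reuses the Example 1 data and fits the line with the deviation form of the least squares formulas; the function and variable names are illustrative assumptions.

```python
# Data from the Example 1 table
problems = [51, 58, 62, 65, 68, 76, 77, 78, 78, 84, 85, 91]
grades = [62, 68, 66, 66, 67, 72, 73, 72, 78, 73, 76, 75]

def least_squares_line(x, y):
    """Return (slope, intercept) of the line that minimizes the sum of squared residuals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = least_squares_line(problems, grades)

# Residual = observed y minus predicted y
residuals = [b - (slope * a + intercept) for a, b in zip(problems, grades)]
mean_residual = sum(residuals) / len(residuals)

print(f"slope = {slope:.5f}, intercept = {intercept:.2f}")
print(f"mean residual is zero up to rounding: {abs(mean_residual) < 1e-9}")
```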
y = the value of any particular observation of the dependent variable
Example 2
𝑏0 = 70.67 − (0.35519)(72.75)
𝑏0 = 70.67 − 25.84
𝑏0 = 44.83
Step 5. Substitute the slope and intercept in the general simple linear regression
equation.
𝑦̂ = 𝑏1 𝑥 + 𝑏0
𝑦̂ = 0.35519𝑥 + 44.83
Thus, the regression equation is 𝑦̂ = 0.35519𝑥 + 44.83. The slope b1 = 0.35519
indicates that for each additional optional problem done, the final grade
is expected to increase by 0.35519 points. The intercept b0 = 44.83 indicates that if
a student does no optional problems, the predicted final grade is 44.83.
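As a cross-check, the slope and intercept can be recomputed from the column totals of the Example 1 table. The sketch below uses the standard raw-sum formula for the slope, which is not shown in this section, together with b0 = mean of y minus b1 times mean of x, as in the computation above. The input of 70 problems in the prediction is a hypothetical value chosen only for illustration.

```python
# Totals from the Example 1 table
n = 12
sum_x, sum_y = 873, 848
sum_x2, sum_xy = 65093, 62254

# Slope: standard raw-sum least squares formula (not shown in this section)
b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)

# Intercept: mean of y minus slope times mean of x
b0 = sum_y / n - b1 * (sum_x / n)

def predict(x):
    """Predicted final grade for x optional problems done."""
    return b1 * x + b0

print(f"b1 = {b1:.5f}")  # about 0.35519
print(f"b0 = {b0:.2f}")  # about 44.83
print(f"predicted grade for 70 problems (hypothetical input): {predict(70):.2f}")
```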
The National Housing Authority wants to investigate the relationship between the
size of houses and the rent paid by the tenants in General Santos City. The NHA
collected the following information on the sizes (in hundreds of square feet) for
eight houses and monthly rents (in thousands of pesos) paid by the tenants.
Construct a scatter diagram for these data. (a) Determine whether a relationship
exists between the sizes of the houses and the monthly rents using the 0.05 level
of significance. (b) Find the regression line.
Size of House 35 40 50 60 28 34 45 25
Monthly Rent 11 17 18 20 6 10 19 5
Module 4 SUMMARY
The mode is the number that appears most frequently in a data set. A set of
numbers may have one mode (unimodal), two modes (bimodal), more than
two modes (multimodal), or no mode at all.
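The classification of modes can be illustrated with a minimal Python sketch. It assumes that a data set has no mode when every value appears exactly once and that more than two modes counts as multimodal. The function names and the sample lists are hypothetical and used only for illustration.

```python
from collections import Counter

def find_modes(data):
    """Return the sorted list of modes, or an empty list if every value appears once."""
    counts = Counter(data)
    highest = max(counts.values())
    if highest == 1:
        return []  # assumption: no value repeats, so there is no mode
    return sorted(value for value, count in counts.items() if count == highest)

def classify(modes):
    """Label a list of modes as no mode, unimodal, bimodal, or multimodal."""
    labels = {0: "no mode", 1: "unimodal", 2: "bimodal"}
    return labels.get(len(modes), "multimodal")

# Hypothetical sample data sets
for sample in ([1, 2, 2, 3], [1, 1, 2, 2, 3], [1, 1, 2, 2, 3, 3, 4], [1, 2, 3]):
    modes = find_modes(sample)
    print(sample, "->", modes, classify(modes))
```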
A box and whisker plot (sometimes called a boxplot) is a graph that presents
information from a five-number summary. It is especially useful for indicating
A test of significance for the coefficient of correlation may be used to find out if
the Pearson’s r could have occurred in a population in which the two variables
are related or not.
The least squares model determines the regression equation by minimizing the
sum of the squares of the vertical distances between the actual y values and the
predicted values of y.
1. The following are the amounts on customers' meal checks at a diner for one
day's lunches:
355 427 304 404 279 583 40 590 286 495
255 262 280 353 310 186 530 474 583 600
300 445 187 216 278 290 635 536 404 680
290 316 364 358 275 184 640 570 310 470
Tabulate the data into a frequency distribution using Rule 1.
2. A garment company has declared bankruptcy. As an accountant, you wish to
clarify the company's accounts payable. The following are the amounts owed
by the garment company (in hundred thousands).
395 252 250 268 285 305 304 375 400 306
320 341 355 340 278 312 278 265 408 324
325 372 286 359 305 416 312 286 311 378
238 313 290 263 314 325 278 401 314 371
Construct the frequency distribution table using Rule 2.
3. A week’s records of a bus company show the amounts (in pesos) spent on
gasoline by each of its 16 buses.
₱ 10,780 ₱ 12,790 ₱ 18,100 ₱ 13,480
₱ 17,740 ₱ 12,780 ₱ 19,120 ₱ 19,200
₱ 14,380 ₱ 14,712 ₱ 16,745 ₱ 13,725
₱ 15,145 ₱ 15,314 ₱ 14,314 ₱ 17,189
Find the mean, median and mode of the expenses incurred for gasoline.
4. The monthly salaries (in thousand pesos) of the top executives of
telecommunication companies in the Philippines are: ₱ 380, ₱ 275, ₱ 477,
₱ 315, ₱ 415, ₱ 340, ₱ 415, ₱ 425, ₱ 376, ₱ 352, ₱ 285, ₱ 296, ₱ 338, ₱ 412,
and ₱ 349. Determine the range, variance, and standard deviation.
5. Determine the first, second, and third quartiles of the data in problem #4.
6. The average cholesterol content of a certain duck egg is 210 mg, and the
standard deviation is 16 mg. Assume the variable is normally distributed. If a
single egg is selected at random, find the probability that the cholesterol
content will be greater than 205 mg.
7. A random sample of nine (9) cities gave the following figures for annual per
capita cigarette consumption and annual death rate from lung cancer.
City 1 2 3 4 5 6 7 8 9
Cigarette Consumption (x) 350 370 250 260 255 300 400 330 240
Death Rate (y) 21 24 17 18 17 19 25 20 16
References:
Aufmann, R. et al. (2018). Mathematics in the Modern World (Philippine
Edition). Rex Bookstore Inc., Manila, Philippines. pp. 101-143.