Chapter 4 - Data Management
DATA
MANAGEMENT
Mathematics as a Tool: Part 1
Variables | Methods of Data Collection | Methods of Data Presentation | Frequency Distribution Table |
Measures of Central Tendency | Measures of Variation | Measures of Position
As of late 2022, 96.8% of Philippine healthcare workers completed
their primary COVID-19 vaccine series, with 63.8% receiving the first
booster and 28.2% the second. Nationwide, 66.1% of the population
completed their primary series, while booster uptake remains lower.
QUALITATIVE VARIABLES
variables that can be placed into distinct
categories, according to some characteristic or
attribute.
QUANTITATIVE VARIABLES
numerical and can be ordered or ranked.
e.g., height, weight, area, and number of phone calls received
SCALE OF MEASUREMENT
NOMINAL level of measurement
IDENTIFY THE LEVEL OF MEASUREMENT FOR EACH OF THE
FOLLOWING EXAMPLES.
Judging a singing contest
NCAE Score
METHODS OF
DATA COLLECTION
SURVEYS
TELEPHONE SURVEY
Advantage
Telephone surveys are valuable for gathering
information as they allow reaching a diverse audience
quickly. They offer real-time data collection for prompt
analysis and are cost-effective compared to other
methods. These surveys provide insights for informed
decision-making.
Disadvantage
Telephone surveys face challenges such as declining
response rates due to telemarketing and scams, which
can bias results. They can be time-consuming and
costly, and the lack of visual cues limits the depth of
responses. Strategies and technologies exist to enhance
telephone survey effectiveness.
MAILED QUESTIONNAIRE
Advantage
Mailed questionnaires offer convenience for
respondents to answer at their own pace, leading to
more thoughtful responses. They allow for a wider
distribution, reaching a diverse group for varied
perspectives and improved data quality.
Disadvantage
Mailed questionnaires may result in low response rates,
biased responses, and delays. Incentives and reminders
can mitigate these issues. Weigh these drawbacks before
using mailed questionnaires for research or data
collection.
PERSONAL INTERVIEW
Advantage
Personal interviews offer deeper insights through
nuanced interactions, immediate clarification, body
language observation, and rapport building. They
provide a valuable tool for gathering in-depth
information and fostering authentic connections in
areas like research, journalism, and recruitment.
Disadvantage
Personal interviews may introduce bias, impacting
candidate selection. Interviewer preferences can lead
to unfair outcomes and hinder selecting the best
candidate. To mitigate bias, interviewers should remain
impartial, focus on relevant criteria, and ensure a fair
evaluation process for all candidates.
INTERNET SURVEY
Advantage
Internet surveys are cost-effective, reach a wider
audience, collect data faster, offer convenience,
ensure anonymity, and simplify analysis. They
provide valuable insights quickly, aiding decision-
making for businesses and organizations.
Disadvantage
Internet surveys are subject to response bias because
participation is voluntary. Other limitations include a
lack of sample representativeness, limited inclusivity,
and difficulty verifying the authenticity of responses.
Understanding these limitations is crucial for researchers
when designing surveys and analyzing data.
EXAMPLE
Identify which survey method is appropriate for each given situation.
The choices are
A. Telephone Survey
B. Mailed Questionnaire
C. Personal Interview
D. Internet Survey
1. You are conducting a study on the preferences of residents in a small town regarding
local recreational facilities. You want to gather detailed responses about their
favorite activities, concerns, and suggestions for improvement. Which survey method
would be most appropriate for this situation?
2. You are researching the impact of a new educational program on student
performance. You want to gather quantitative data from a large sample across
different schools efficiently. Which survey method would be most appropriate?
3. You are conducting market research for a new product and need to reach a diverse
audience quickly. You aim to collect both qualitative and quantitative data on
consumer preferences. Which survey method would be the most suitable?
GNED03: Mathematics in the Modern World
SAMPLING
TECHNIQUES
Cluster Sampling
RESEARCH
STUDY
OBSERVATIONAL STUDY
the researcher merely observes what is happening
or what has happened in the past and tries to draw
conclusions based on these observations.
ADVANTAGES OF OBSERVATIONAL STUDIES
Occur in natural settings.
Allows us to study situations for which it would be
illegal/unethical to conduct an experiment (e.g., rape,
suicide, illegal drug use).
TREATMENT GROUP
the group(s) in the sample that receives a treatment or
experimental condition.
CONTROL GROUP
the group in the sample that is treated identically in all
respects to the treatment group EXCEPT that they don’t
receive the active treatment.
PLACEBO
a treatment that looks like a real drug but
has no active ingredient
PLACEBO EFFECT
when people take a placebo and it works
like the treatment or better
INDEPENDENT VARIABLE
the variable that is being manipulated by the
researcher (also called the explanatory variable).
DEPENDENT VARIABLE
the response to the independent variable or the
result of the explanatory variable (also called the
response or outcome variable).
ADVANTAGES OF EXPERIMENTS
The effect of an explanatory variable can be studied more precisely.
Researcher has (some) control over selecting participants, assigning
them to groups, and manipulating the independent variable.
Cause and effect relationships can be established using randomized
experiments (e.g., smoking causes cancer in lab rats). Note: In order
to make cause and effect conclusions in an experiment, the subjects
must be randomly assigned among the treatment groups.
DISADVANTAGES OF EXPERIMENTS
May occur in unnatural settings (e.g., laboratories).
Hawthorne Effect - subjects who know they are participating in an
experiment may change their behavior in ways that affect the results
of the study (e.g., weight loss studies).
Not all variables can be controlled for in a study.
EXAMPLE
As the evidence on the adverse effects of cigarette smoke grew, people tried many different
ways to quit smoking. Some people tried chewing tobacco or, as it was called, smokeless
tobacco. A small amount of tobacco was placed between the cheek and gum. Certain chemicals
from the tobacco were absorbed into the bloodstream and gave the sensation of smoking
cigarettes. This prompted studies on the adverse effects of smokeless tobacco. One study in
particular used 40 university students as subjects. Twenty were given smokeless tobacco to
chew, and twenty given a substance that looked and tasted like smokeless tobacco, but did not
contain any of the harmful substances. The students were randomly assigned to one of the
groups. The students’ blood pressure and heart rate were measured before they started
chewing and 20 minutes after they had been chewing. A significant increase in heart rate
occurred in the group that chewed the smokeless tobacco. Answer the following questions.
What type of study was this (observational, quasi-experimental, or experimental)?
What are the independent and dependent variables?
Which was the treatment group?
Could the students’ blood pressures be affected by knowing that they are part of a study?
List some possible confounding variables.
Do you think this is a good way to study the effect of smokeless tobacco?
What type of study was this (observational or experimental)? - Experimental
What are the independent and dependent variables? - Independent: whether the participant chewed
smokeless tobacco or not; Dependent: the students’ blood pressure and heart rate
Which was the treatment group? - The group that chewed the smokeless tobacco
Could the students’ blood pressures be affected by knowing that they are part of a study? - Yes
List some possible confounding variables. - Example: the way that the students chewed the
tobacco
Do you think this is a good way to study the effect of smokeless tobacco? - Example: the results
cannot be generalized beyond the population studied
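The example states that the 40 students were randomly assigned to two groups of twenty. The following is a minimal sketch of one way such an assignment could be done. The subject IDs, the fixed seed, and the helper name `assign_groups` are assumptions for illustration and are not details from the study.

```python
import random

def assign_groups(subject_ids, seed=0):
    """Shuffle the subjects and split them into two equal-sized groups."""
    rng = random.Random(seed)      # fixed seed (assumption) so the split is repeatable
    shuffled = list(subject_ids)   # copy so the caller's list is left unchanged
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return sorted(shuffled[:half]), sorted(shuffled[half:])

subjects = list(range(1, 41))      # hypothetical IDs for the 40 students
treatment, control = assign_groups(subjects)

print("Treatment group (smokeless tobacco):", treatment)
print("Control group (placebo):", control)
```

Because every student has the same chance of landing in either group, differences between the groups other than the treatment tend to balance out, which is what allows a cause and effect conclusion.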
GATHERING, ORGANIZING,
REPRESENTING
AND INTERPRETING DATA
DATA PRESENTATION
Example
It shows the mean and the corresponding interpretation of the
academic performance in the area of cognitive learning across different
modalities. It reveals that all learning outcomes for the face-to-face and
blended modalities are highly observed, which means that the
cognitive learning of students is far above standards. The face-to-face
modality is better than the blended and online modalities.
Tables
Qualitative
Quantitative
Temporal
Spatial
Table 1 (Qualitative)
Sociodemographic profile of the respondents in terms of sex
Sex       Frequency
Male      52
Female    36
TOTAL     88

Table 2 (Quantitative)
Sociodemographic profile of the respondents in terms of age group
Age group   Frequency
11 - 20     12
21 - 30     46
31 - 40     21
41 - 50     9
TOTAL       88

Table 3 (Temporal)
Yearly sales of Milktea Shop from 2020 to 2024

Table 4 (Spatial)
Rice export from the Philippines to the rest of the world in 2021
TOTAL 100
Diagrams
Geometric diagram
Frequency diagram
Arithmetic line graph
Geometric Diagram
Figure 1 and Figure 2: Coronavirus Cases in the Philippines (categories: Active Case, Recovered, Deaths; Recovered 98.2%, Deaths 1.6%)
Frequency Diagram
Figure 3 and Figure 4: Record High Temperatures (horizontal axis: class boundaries from 99.5 - 104.5 to 129.5 - 134.5; vertical axis: frequency)
Arithmetic Line Graph
Figure 5: Total Coronavirus Cases in the Philippines
GEC 004: Mathematics in the Modern World
FREQUENCY
DISTRIBUTION
Suppose a researcher wished to do a study on the ages of the top 50 wealthiest people
in the world. The researcher first would have to get the data on the ages of the people. In
this case, these ages are listed in Forbes Magazine. Create a frequency distribution with
8 classes.
49 57 38 73 81
74 59 76 65 69
54 56 69 68 78
65 85 49 69 61
48 81 68 35 43
78 82 43 64 67
52 56 81 77 79
85 40 85 59 80
60 71 57 61 69
61 83 90 87 74
Step by step in creating frequency
distribution table
Creating a frequency distribution table can be a helpful way to organize and analyze
data. Here are the steps to guide you through the process:
1. **Determine the Range:** Start by identifying the range of values in your data set. This will
help you decide on the appropriate intervals for your frequency distribution.
2. **Choose the Number of Intervals:** Select the number of intervals or classes you want
to use in your table. This will depend on the size of your data set and the level of detail you
require.
3. **Calculate Interval Width:** Divide the range by the number of intervals to determine the
width of each interval. Round up to a convenient number to create clear boundaries.
4. **Create Interval Boundaries:** Establish the boundaries for each interval by starting from
the minimum value and adding the interval width successively.
5. **Tally the Data:** Go through your data set and tally the number of occurrences of each
value within the intervals you've defined.
6. **Calculate Frequencies:** Count the tally marks for each interval to determine the
frequency of values falling within that range.
Frequency Distribution Table
CLASS
a quantitative or qualitative category in which a raw data value is placed
TALLY
a mark recorded for each data value that falls in a class
FREQUENCY
the number of data values contained in a specific class
CLASS WIDTH
the range of values within a class
CLASS BOUNDARIES
Upper class boundary = Upper class limit + 0.5
Lower class boundary = Lower class limit - 0.5
CUMULATIVE FREQUENCY
the total sum of frequencies up to a certain point in a data set
EXAMPLE
Using the 50 ages listed above, create a frequency distribution with 8 classes.
STEP 1
Range = Highest value - Lowest value
= 90 - 35
= 55
STEP 2
Number of classes = 8 classes (GIVEN)
If not given, use Sturges' rule: k = 1 + 3.322 log N, where N = number of values
STEP 3
Class width = Range / Number of classes
= 55 / 8
= 6.875 (round up)
= 7
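Steps 1 to 3 can be checked with a short script. This is a minimal sketch that assumes the logarithm in Sturges' rule is base 10; the variable names are illustrative only.

```python
import math

# Ages of the 50 wealthiest people, as listed in the example.
ages = [
    49, 57, 38, 73, 81, 74, 59, 76, 65, 69,
    54, 56, 69, 68, 78, 65, 85, 49, 69, 61,
    48, 81, 68, 35, 43, 78, 82, 43, 64, 67,
    52, 56, 81, 77, 79, 85, 40, 85, 59, 80,
    60, 71, 57, 61, 69, 61, 83, 90, 87, 74,
]

data_range = max(ages) - min(ages)                  # Step 1: 90 - 35
num_classes = 8                                     # Step 2: given in the problem
sturges_k = 1 + 3.322 * math.log10(len(ages))       # only needed when no class count is given
class_width = math.ceil(data_range / num_classes)   # Step 3: 6.875 rounded up

print("Range:", data_range)                          # Range: 55
print("Class width:", class_width)                   # Class width: 7
print("Sturges' rule k:", round(sturges_k, 2))       # Sturges' rule k: 6.64
```

Rounding the width up rather than to the nearest integer guarantees that 8 classes of that width cover the whole range.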
EXAMPLE STEP 4
Set up the table with the following columns:
Classes   Class Boundaries   Tally   Frequency   Cumulative Frequency (< cf)
EXAMPLE STEP 5
Starting from the lowest value, 35, list 8 classes of width 7:
35 - 41
42 - 48
49 - 55
56 - 62
63 - 69
70 - 76
77 - 83
84 - 90
EXAMPLE STEP 6
Class boundaries
Lower boundary = Lower class limit - 0.5
Upper boundary = Upper class limit + 0.5
Example: for the class 35 - 41, the class boundaries are 34.5 - 41.5
EXAMPLE STEP 7
Classes   Class Boundaries
35 - 41   34.5 - 41.5
42 - 48   41.5 - 48.5
49 - 55   48.5 - 55.5
56 - 62   55.5 - 62.5
63 - 69   62.5 - 69.5
70 - 76   69.5 - 76.5
77 - 83   76.5 - 83.5
84 - 90   83.5 - 90.5
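The class limits and class boundaries can also be generated rather than written by hand. This is a minimal sketch that assumes whole-number data (so the boundary offset is 0.5), a lowest value of 35, a class width of 7, and 8 classes; the variable names are illustrative.

```python
lowest_value = 35   # smallest age in the data set
class_width = 7     # from Range / Number of classes, rounded up
num_classes = 8     # given in the problem

# Each class starts one width after the previous one and spans `class_width` whole values.
limits = [
    (lowest_value + i * class_width, lowest_value + (i + 1) * class_width - 1)
    for i in range(num_classes)
]

# Boundaries extend each limit by half a unit so adjacent classes meet with no gap.
boundaries = [(lower - 0.5, upper + 0.5) for lower, upper in limits]

for (lower, upper), (lb, ub) in zip(limits, boundaries):
    print(f"{lower} - {upper}   {lb} - {ub}")
```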
EXAMPLE STEP 8
Tally each age into its class, count the tally marks to get the frequency, then add the frequencies cumulatively from the first class to the last.
Classes   Class Boundaries   Tally        Frequency   Cumulative Frequency (< cf)
35 - 41   34.5 - 41.5        III          3           3
42 - 48   41.5 - 48.5        III          3           6
49 - 55   48.5 - 55.5        IIII         4           10
56 - 62   55.5 - 62.5        IIIIIIIIII   10          20
63 - 69   62.5 - 69.5        IIIIIIIIII   10          30
70 - 76   69.5 - 76.5        IIIII        5           35
77 - 83   76.5 - 83.5        IIIIIIIIII   10          45
84 - 90   83.5 - 90.5        IIIII        5           50
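The tallying and accumulation can be reproduced in a few lines. This is a minimal sketch using the 50 ages from the example and the eight classes of width 7 starting at 35; the variable names are illustrative.

```python
from itertools import accumulate

ages = [
    49, 57, 38, 73, 81, 74, 59, 76, 65, 69,
    54, 56, 69, 68, 78, 65, 85, 49, 69, 61,
    48, 81, 68, 35, 43, 78, 82, 43, 64, 67,
    52, 56, 81, 77, 79, 85, 40, 85, 59, 80,
    60, 71, 57, 61, 69, 61, 83, 90, 87, 74,
]

# Class limits: 35-41, 42-48, ..., 84-90.
limits = [(35 + 7 * i, 41 + 7 * i) for i in range(8)]

# Frequency: how many ages fall inside each pair of class limits.
frequencies = [sum(lower <= age <= upper for age in ages) for lower, upper in limits]

# Cumulative frequency: running total from the lowest class upward.
cumulative = list(accumulate(frequencies))

for (lower, upper), f, cf in zip(limits, frequencies, cumulative):
    print(f"{lower} - {upper}   {f:>2}   {cf:>2}")
```

The last cumulative frequency should equal the number of data values (50 here), which is a quick check that no value was missed or counted twice.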
CATEGORICAL
FREQUENCY
DISTRIBUTION
used for data that can be placed in
specific categories, such as nominal-
or ordinal-level data
EXAMPLE
Twenty-five students were given a blood test to determine their blood type. The data set is
A B B AB O
O O B AB B
B B O A O
A O O O AB
AB A O B A
Construct a frequency distribution for the data
CATEGORICAL FREQUENCY DISTRIBUTIONS
Class   Tally       Frequency   Percent
A       IIIII       5           20
B       IIIIIII     7           28
AB      IIII        4           16
O       IIIIIIIII   9           36
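A categorical frequency distribution can be built by counting each category directly. This is a minimal sketch using the 25 blood types from the example; `collections.Counter` is one convenient way to count and is a choice made here for illustration.

```python
from collections import Counter

# The 25 blood types from the example, row by row.
blood_types = (
    "A B B AB O O O B AB B B B O A O "
    "A O O O AB AB A O B A"
).split()

counts = Counter(blood_types)   # frequency of each class
n = len(blood_types)            # total number of students

for blood_type in ["A", "B", "AB", "O"]:
    frequency = counts[blood_type]
    percent = 100 * frequency / n
    print(f"{blood_type:<3} {frequency:>2} {percent:.0f}%")
```

The percent column is the frequency divided by the total number of values, times 100, so the percents add up to 100.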
EXAMPLE
The sex of thirty-five students in a classroom was recorded as male (M) or female (F). The data set is
M M F M M F F
F M F F M M M
M M F M M F F
F F F F M M F
F F M F M M F
Construct a frequency distribution for the data
Frequency Distribution Table
GROUPED FREQUENCY DISTRIBUTION
When the range of the data is large,
the data must be grouped into classes
that are more than one unit in width.
GROUPED FREQUENCY DISTRIBUTIONS
Class limits   Class boundaries   Tally   Frequency
The class limits should have the same decimal place value as the data, but the class
boundaries should have one additional place value and end in a 5.
Rules in constructing a frequency distribution
112 100 127 120 134 118 105 110 109 112
110 118 117 116 118 122 114 114 105 109
107 112 114 115 118 117 118 122 106 110
116 108 110 121 113 120 119 111 104 111
120 113 120 117 105 110 118 112 114 114
EXAMPLE
Classes   Class Boundaries   Tally   Frequency   Cumulative Frequency (< cf)
12 17 12 14 16 18
16 18 12 16 17 15
15 16 12 15 16 16
12 14 15 12 15 15
19 13 16 18 16 14
MEASURES OF
CENTRAL
TENDENCY
Mathematics as a Tool: Part 2
MEASURES OF AVERAGE
UNGROUPED DATA
Mean
For a population, the Greek letter μ (mu) is used for the mean.
$$\mu = \frac{X_1 + X_2 + X_3 + \cdots + X_N}{N} = \frac{\sum X}{N}$$
where N represents the total number of values in the population.
EXAMPLE
The data represent the number of days off per year for a sample of individuals
selected from nine different countries. Find the mean.
110 76 29 38 105 31
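The mean is the sum of the values divided by how many values there are. The sketch below applies that to the six values listed in this example; it assumes those six values are the full data set being averaged, even though the problem statement mentions nine countries.

```python
values = [110, 76, 29, 38, 105, 31]   # the six values listed in the example

total = sum(values)   # sum of all values (ΣX)
n = len(values)       # number of values
mean = total / n

print(f"sum = {total}, n = {n}, mean = {mean:.2f}")   # sum = 389, n = 6, mean = 64.83
```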
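Median
the midpoint of the data array. The symbol for the median is MD (written x̃ in the formulas here).
Odd number of values: for the ordered values x₁, x₂, x₃, x₄, x₅, the median is the middle value, x̃ = x₃.
Even number of values: for the ordered values x₁, x₂, x₃, x₄, x₅, x₆, the median is the mean of the two middle values:
$$\tilde{x} = \frac{x_3 + x_4}{2}$$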
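The odd and even cases of the median can be sketched as a small function. The sample values below are hypothetical and chosen only to show both cases; the function name `median` is illustrative.

```python
def median(values):
    """Return the midpoint of the data after arranging it in order."""
    ordered = sorted(values)   # the median is defined on the ordered data array
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # odd count: the single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2   # even count: mean of the two middle values

odd_sample = [7, 3, 9, 1, 5]        # hypothetical data, five values
even_sample = [7, 3, 9, 1, 5, 11]   # hypothetical data, six values

print(median(odd_sample))    # 5
print(median(even_sample))   # 6.0
```

Sorting first matters: taking the middle position of unsorted data gives an arbitrary value, not the median.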
EXAMPLE
The number of rooms in the seven hotels in Tagaytay. Find the median.
Mode
the value that occurs most often in the data set
unimodal - a data set that has only one value that occurs
with the greatest frequency
bimodal - if a data set has two values that occur with the
same greatest frequency, both values are considered to be
the mode and the data set is said to be bimodal
multimodal - if a data set has more than two values that
occur with the same greatest frequency, each value is used
as the mode, and the data set is said to be multimodal
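The three cases can be told apart by counting how many values share the greatest frequency. This is a minimal sketch with hypothetical data and illustrative function names. It also assumes, as a convention not stated in the text, that a data set in which no value repeats is reported as having no mode.

```python
from collections import Counter

def find_modes(values):
    """Return every value that occurs with the greatest frequency (non-empty input)."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1 and len(counts) > 1:
        return []   # assumption: no value repeats, so report no mode
    return sorted(value for value, count in counts.items() if count == highest)

def describe(values):
    """Return the modes together with the label for the data set."""
    modes = find_modes(values)
    labels = {0: "no mode", 1: "unimodal", 2: "bimodal"}
    return modes, labels.get(len(modes), "multimodal")

print(describe([2, 3, 3, 5]))             # ([3], 'unimodal')
print(describe([1, 1, 4, 4, 6]))          # ([1, 4], 'bimodal')
print(describe([1, 1, 2, 2, 3, 3, 9]))    # ([1, 2, 3], 'multimodal')
```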
EXAMPLE
Find the mode of the signing bonuses of eight PBA players for a specific year. The
bonuses in millions of pesos are