
Republic of the Philippines
CAVITE STATE UNIVERSITY
Cavite City Campus
Brgy. 8, Pulo II, Dalahican, Cavite City

CvSU Vision
The premier university in historic Cavite recognized for excellence in the development of globally competitive and morally upright individuals.

CvSU Mission
Cavite State University shall provide excellent, equitable and relevant educational opportunities in the arts, science and technology through quality instruction and relevant research and development activities. It shall produce professional, skilled and morally upright individuals for global competitiveness.

CHAPTER 4
DATA MANAGEMENT
Objectives:
After the completion of the chapter, students should be able to:
• use a variety of statistical tools to process and manage numerical data;
• use methods of linear regression and correlation to predict the value of a variable given certain
conditions; and
• advocate the use of statistical data in making important decisions.

EVALUATION REQUIREMENTS:
• Problem Sets and Exercises
• Quiz
• Quantitative Research Proposal (FINAL PROJECT)
SAMPLE: You want the university to offer an online enrolment system to improve the enrolment
process. CSG asks your team to present hard data that will convince the administration. Prepare a
proposal on how you will do this task.

Statistical tools derived from mathematics are useful in processing and managing numerical data
in order to describe a phenomenon and predict values.

4.1 BASIC CONCEPTS AND TERMS


DEFINITION OF STATISTICS
It is a branch of science which deals with the collection, presentation, analysis and interpretation of data.

NATURE OF STATISTICS
General Uses of Statistics
a. Statistics aids in decision making
• provides comparison
• explains action that has taken place
• justifies a claim or assertion
• predicts future outcome
• estimates unknown quantities
b. Statistics summarizes data for public use

FIELDS OF STATISTICS
a. Statistical Methods or Applied Statistics – refers to procedures and techniques used in the
collection, presentation, analysis and interpretation of data.
• Descriptive Statistics
- methods concerned with the collection, description and analysis of a set of data
without drawing conclusions or inferences about a larger set.
- the main concern is simply to describe the set of data.
• Inferential Statistics
- methods concerned with making predictions or inferences about a larger set of data
using only the information gathered from a subset of this larger set.
- the main concern is not merely to describe but actually to predict and make inferences based
on the information gathered.

b. Statistical Theory or Mathematical Statistics – deals with the development and exposition of
theories that serve as bases of statistical methods.

POPULATION AND SAMPLE


• A population is a collection of all the elements under consideration in a statistical study.
• A sample is a part or subset of the population from which the information is collected.
• A parameter is a numerical characteristic of a population.
• A statistic is a numerical characteristic of the sample.

Steps in Statistical Inquiry


1. Define the problem.
2. Formulate the research design.
3. Collect data.
4. Code and analyze the collected data.
5. Interpret the results.

VARIABLE AND MEASUREMENT


• A variable is a characteristic or attribute of persons or objects which can assume different values or
labels for different persons or objects under consideration.
• Measurement is the process of determining the value or label of a particular variable for a particular
experimental unit.
• An experimental unit is the individual or object on which a variable is measured.

CLASSIFICATION OF VARIABLE
1. Discrete vs. Continuous
Discrete – a variable which can assume a finite or countable number of values; usually measured by
counting or enumeration.
Continuous – a variable which can assume infinitely many values corresponding to the points on a number line.
2. Qualitative vs. Quantitative
Qualitative – a variable that yields a categorical response.
Example: Occupation, Marital Status
Quantitative – a variable that takes on numerical values representing an amount or quantity.
Example: Weight, Height, Age, Number of cars

LEVEL OF MEASUREMENT
1. Nominal Level – the nominal level or classificatory scale is the weakest level of measurement where
numbers or symbols are used simply for categorizing subjects into different groups.
Examples: Sex: M-Male F-Female
Marital Status: 1-Single 2-Married 3-Widowed 4-Separated
2. Ordinal Level – the ordinal level of measurement contains the properties of the nominal level, and in
addition, the numbers assigned to categories of any variables may be ranked or ordered in some
low-to-high manner.
Examples: Teaching Ratings 1-poor 2-fair 3-good 4-excellent
Year Level 1-1st year 2-2nd year 3-3rd year 4-4th year
3. Interval Level – the interval level is that in which the distances between any two numbers on the scale
are of known size.
Example: IQ level, Temperature
4. Ratio Level – the ratio level of measurement contains all the properties of the interval level, and in
addition, it has a “true zero” point.
Example: Number of correct answers in exam.

GNED 03: Mathematics in the Modern World | A.B. Aguilar

CLASSIFICATION OF DATA
1. Primary vs. Secondary
a. Primary Source – data measured by the researcher/agency that published it.
b. Secondary Source – any republication of data by another agency.
Example: The publications of the National Statistics Office (NSO) are primary sources and
all subsequent publications of other agencies are secondary sources.
2. External vs. Internal
a. Internal Data – information that relates to the operations and functions of the organization
collecting the data.
b. External Data – information that relates to some activity outside the organization collecting
the data.
Example: The sales data of SM is internal data for SM but external data for any other
organization such as Robinson’s.

EXERCISE 4.1 ______________


A. Identify each item as discrete or continuous.
_______________1.Student enrolment in Cavite State University – Cavite City Campus
_______________2.Weight of the students
_______________3.Student number
_______________4.Amount of time spent surfing the internet per week.
_______________5.Number of persons in a family
B. Determine whether the data are qualitative or quantitative.
_______________1. The colors of automobiles on a used car lot.
_______________2. The numbers on the shirts of a girls’ soccer team.
_______________3. The seats in a movie theater.
_______________4. A list of house numbers on your street.
_______________5. The ages of a sample of 350 employees of a large hospital.
C. Identify the data set’s level of measurement (nominal, ordinal, interval, ratio).
_______________1. Hair color of women on a high school tennis team.
_______________2. Number of milligrams of tar in 28 cigarettes.
_______________3. Temperatures of 22 selected refrigerators.
_______________4. The ratings of a movie ranging from “poor” to “good” to “excellent”.
_______________5. List of zip codes for Chicago.
D. Identify the population, variable of interest, and type of variable of the following:
1. From all students registered this semester, the Mathematics Department would like to know how
many students like mathematics.
Population: _________________________________________________________________________
Variable: ___________________________________________________________________________
Type of Variable: ____________________________________________________________________

2. A study to be conducted by an NGO would determine the Filipinos’ awareness of the war
against Iraq.
Population: _________________________________________________________________________
Variable: ___________________________________________________________________________
Type of Variable: ____________________________________________________________________


4.2 DATA COLLECTION AND PRESENTATION


GENERAL CLASSIFICATION OF COLLECTING DATA
• Census or complete enumeration is the process of gathering information from every unit in the
population.
- not always possible to get timely, accurate and economical data
- costly, especially if the number of units in the population is too large
• Survey sampling is the process of obtaining information from the units in the selected sample.

SLOVIN’S FORMULA
$$n = \frac{N}{1 + Ne^2}$$
Where:
n = sample size
N = population size
e = margin of error (0.05 or 0.01)

Example:
1. Solve for the sample size for a population of 350 patients from Cavite Medical Center, using e = 0.05.
$$n = \frac{N}{1 + Ne^2} = \frac{350}{1 + (350)(0.05)^2} = \frac{350}{1 + (350)(0.0025)} = \frac{350}{1.875} = 186.67 \approx 187$$

2. Solve for the sample size for a population of 4,565 students of CvSU – Rosario, using e = 0.05.
$$n = \frac{N}{1 + Ne^2} = \frac{4565}{1 + (4565)(0.05)^2} = \frac{4565}{1 + (4565)(0.0025)} = \frac{4565}{12.4125} = 367.77 \approx 368$$

NOTE: The computed sample size must always be rounded up to the next whole number.
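
The computation above can be sketched in Python. This is only an illustrative sketch: the function name `slovin_sample_size` and the default margin of error of 0.05 are assumptions made for the example, not part of the module.

```python
import math

def slovin_sample_size(population_size: int, margin_of_error: float = 0.05) -> int:
    """Slovin's formula n = N / (1 + N e^2), rounded up to the next whole number."""
    raw = population_size / (1 + population_size * margin_of_error ** 2)
    return math.ceil(raw)

# The two worked examples above
print(slovin_sample_size(350))    # 187
print(slovin_sample_size(4565))   # 368
```

Rounding up with `math.ceil` follows the note above, so the sample is never smaller than the formula requires.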

EXERCISE 4.2.1: _______________


Solve for the sample size of the following using Slovin’s formula:
1. 6,666

2. 12,345

3. 1000

4. 1203


PROBABILITY AND NON-PROBABILITY SAMPLING


• A sampling procedure that gives every element of the population a nonzero chance of being
selected in the sample is called probability sampling. Otherwise, the sampling procedure is called
non-probability sampling.
• The target population is the population from which information is desired.
• The sampled population is the collection of elements from which the sample is actually taken.
• The population frame is a listing of all individual units in the population.

METHODS OF NON-PROBABILITY SAMPLING


1. Purposive sampling – sets out to make a sample agree with the profile of the population based on
some pre-selected characteristic.
2. Quota sampling – selects a specified number (quota) of sampling units possessing certain
characteristics.
3. Convenience sampling – selects sampling units that come to hand or are convenient to get
information from.
4. Judgment sampling – selects sample in accordance with an expert’s judgment.

METHODS OF PROBABILITY SAMPLING


1. Simple random sampling – is a method of selecting n units out of the N units in the population in such
a way that every distinct sample of size n has an equal chance of being drawn.
2. Stratified random sampling – the population of N units is first divided into subpopulations called
strata. Then a simple random sample is drawn from each stratum, the selection being made
independently in different strata.
3. Systematic sampling – is a method of selecting a sample by taking every kth unit from an ordered
population, the first unit being selected at random.
4. Cluster sampling – is a method where a sample of distinct groups, or cluster, of elements is selected
and then a census of every element in the selected cluster is taken.
5. Multistage sampling – the population is divided into a hierarchy of sampling units corresponding to
the different sampling stages. In the first stage of sampling, the population is divided into primary
stage units (PSU), then a sample of PSUs is drawn. In the second stage, each selected PSU is divided
into second-stage units (SSU), then a sample of SSUs is drawn.
6. Sequential sampling – units are drawn one by one in a sequence without prior fixing of the total
number of observations and the results of the drawing at any stage are used to decide whether to
terminate sampling or not.
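
Three of the probability sampling methods above can be sketched with Python's standard `random` module. The population frame (units numbered 1 to 100), the two strata, and the function names are hypothetical and serve only to illustrate the definitions.

```python
import random

def simple_random_sample(frame, n, rng):
    """Every distinct sample of size n has an equal chance of being drawn."""
    return rng.sample(frame, n)

def systematic_sample(frame, n, rng):
    """Take every k-th unit from the ordered frame, starting at a random unit."""
    k = len(frame) // n              # sampling interval
    start = rng.randrange(k)         # random start within the first k units
    return [frame[start + i * k] for i in range(n)]

def stratified_sample(strata, n_per_stratum, rng):
    """Draw an independent simple random sample from each stratum."""
    return {name: rng.sample(units, n_per_stratum) for name, units in strata.items()}

rng = random.Random(2024)            # fixed seed so the sketch is repeatable
frame = list(range(1, 101))          # hypothetical population frame
strata = {"male": list(range(1, 51)), "female": list(range(51, 101))}  # hypothetical strata

srs = simple_random_sample(frame, 10, rng)
systematic = systematic_sample(frame, 10, rng)
stratified = stratified_sample(strata, 5, rng)
print(srs, systematic, stratified, sep="\n")
```

The sketch assumes the sample size does not exceed the frame (or stratum) size; cluster, multistage and sequential sampling are not covered here.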

DATA COLLECTION METHODS


1. Survey method – questions are asked to obtain information, either through self-administered
questionnaire or personal interview.

Self-administered Questionnaire
• It can be administered to a large number of people simultaneously.
• Respondents may feel free to express views and are less pressured to answer immediately.
• It is more appropriate for obtaining objective information.

Personal Interview
• It is administered to a person or group one at a time.
• Respondents may feel more cautious, particularly in answering sensitive questions, for fear of disapproval.
• It is more appropriate for obtaining information about complex, emotionally laden topics or for
probing sentiments underlying an expressed opinion.

2. Observation method – makes possible the recording of behavior but only at the time of occurrence.


3. Experimental method – a method designed for collecting data under controlled conditions. An
experiment is an operation where there is actual human interference with the conditions
that can affect the variable under study.
4. Use of existing studies – e.g., census, health statistics, and weather bureau reports.
Two types:
• Documentary sources – published or written reports, periodicals, unpublished documents,
etc.
• Field sources – researchers who have done studies on the area of interest are asked
personally or directly for information needed.
5. Registration method – e.g., car registration, student registration and hospital admission.

EXERCISE 4.2.2 ______________


A. Identify which data collection method is best used on the following statements:
_______1. Tracer Study on BSBM graduates of CvSU – CCC from 2011-2016
_______2. The role of Brgy Officials in maintaining peace and order in the community.
_______3. The effects of entertainment media on the academic performance of senior high school
students.
_______4. Grading the demonstration teaching of pre-service teachers at CNHS
_______5. Testing the new vaccine for Parvo virus on puppies.
B. Identify the sampling technique used (random, cluster, stratified, convenience, systematic).
_______________1. Every fifth person boarding a plane is searched thoroughly.
_______________2. At a local community College, five math classes are randomly selected out of 20
and all of the students from each class are interviewed.
_______________3. A researcher randomly selects and interviews fifty male and fifty female teachers.
_______________4. Based on 12,500 responses from 42,000 surveys sent to its alumni, a major
university estimated that the annual salary of its alumni was 92,500.
_______________5. A community college student interviews everyone in a biology class to determine
the percentage of students that own a car.

TABULAR AND GRAPHICAL PRESENTATION OF DATA


Textual Presentation – data incorporated into a paragraph of text.

Advantages
• It gives emphasis to significant figures and comparisons.
• It is the simplest and most appropriate approach when there are only a few numbers to be presented.

Disadvantages
• When a large mass of quantitative data is included in a text or paragraph, the presentation becomes
almost incomprehensible.
• Paragraphs can be tiresome to read, especially if the same words are repeated many times.

Tabular Presentation – the systematic organization of data in rows and columns.


Advantages
• More concise than textual presentation
• Easier to understand
• Facilitates comparisons and analysis of relationship among different categories
• Presents data in greater detail than a graph
Parts of a Formal Statistical Table
1. Heading – consists of a table number, title, and a head note.


2. Box Head –the portion of the table that contains the column heads which describe the data in each
column.
3. Stub – The portion of the table usually comprising the first column on the left. The row caption is a
descriptive title of the data on the given line.
4. Field – main part of the table; contains the substance or the figures of one’s data.
5. Source note – an exact citation of the source of data presented in the table (should always be
placed when the figures are not original).
6. Foot note – any statement or note inserted at the bottom of the table.

Table 4.4 – CRIME VOLUME AND RATE BY TYPE: 1991 – 1993
(Rate per 100,000 population)

                          1991                1992                1993
Type                 Crime     Crime     Crime     Crime     Crime     Crime
                     Volume    Rate      Volume    Rate      Volume    Rate
Total                121,326   195       104,719   164       96,686    148
Index crimes          77,261   124        67,354   106       58,684     90
   Murder              8,707    14         8,293    13        7,758     12
   Homicide            8,069    13         7,912    12        7,123     11
   Physical Injury    21,862    35        20,462    32       18,722     29
   Robbery            13,817    22        11,164    18        9,856     15
   Theft              22,780    37        17,374    27       12,940     20
   Rape                2,026     3         2,149     3        2,285      4
Nonindex crimes       44,065    71        37,365    59       38,002     58
Source: Philippine National Police

Parts shown in the table: heading (table number, title and head note), boxhead (column heads),
stub (row captions in the first column), field (the figures) and source note.

Graphical Presentation – a graph or chart is a device for showing numerical values or relationships in
pictorial form.
Advantages:
• Main features and implications of a body of data can be grasped at a glance.
• Can attract attention and hold the reader’s interest.
• Simplifies concepts that would otherwise have been expressed in so many words.
• Can readily clarify data; frequently brings out hidden facts and relationships.

Quality of a Good Graph


1. Accuracy
2. Clarity
3. Simplicity
4. Appearance

Common Types of Graph


1. Line chart – graphical presentation of data especially useful for showing trends over a period
of time.
2. Pie chart – a circular graph that is useful in showing how a total quantity is distributed among
a group of categories.
3. Bar chart – consists of a series of rectangular bars. If the bars are arranged horizontally, the
length of each bar represents the quantity or frequency for its category. If the
bars are arranged vertically, the height of each bar represents the quantity.
4. Pictorial unit chart – a pictorial chart in which each symbol represents a definite and uniform
value.


FREQUENCY DISTRIBUTION TABLE


1. Raw Data – The raw data is the set of data in its original form.

2. Array – An array is an arrangement of observations according to their magnitude, either
in increasing or decreasing order.
Example: Final grades of 110 Stat students arranged in an array.

50 50 50 50 50 50 51 52 53 53 57
59 59 60 60 60 62 62 62 62 63 65
66 66 68 68 68 68 68 69 69 69 69
69 70 71 71 71 71 72 72 72 72 72
73 73 73 73 74 74 74 75 75 75 75
75 76 76 76 76 77 77 77 77 78 79
79 79 79 79 80 80 80 81 81 81 81
82 82 82 82 82 82 83 83 84 84 84
84 84 84 84 85 85 86 86 87 87 87
87 87 87 88 89 89 91 92 94 94 96

3. Frequency Distribution Table – a condensed version of an array. It categorizes the numerical
data into intervals or classes. It has the following parts:

• Classes – these are mutually exclusive categories defined by a lower limit and an upper limit,
with equal intervals. (C – class size; R – range; K – number of classes)
$R = \text{Highest Value} - \text{Lowest Value} = 96 - 50 = 46$
$K = 1 + 3.322 \log N = 1 + 3.322 \log 110 = 7.78 \approx 8$
$C = \dfrac{R}{K} = \dfrac{46}{8} = 5.75 \approx 6$
• Class Frequency – the number of observations falling in the class
• Class interval – the numbers defining the class
• Class limits – the end numbers of the class
• Class boundaries – the true class limits; lower class boundary (LCB) is usually defined as
halfway between the lower class limit of the class and the upper class limit of the preceding
class while the upper class boundary (UCB) is usually defined as the halfway between the
upper class limit of the class and the lower limit of the next class.
• Class size – the difference between the upper class boundaries of the class and the
preceding class
• Class mark – midpoint of a class interval

Steps in Constructing a Frequency Distribution Table


1. Determine the number of classes using Sturges’ formula.
Sturges’ Formula: $K = 1 + 3.322 \log n$
Where: $n$ = number of observations
Example: $K = 1 + 3.322 \log 110 = 7.78 \approx 8$

2. Determine the approximate class size.
• Solve for the range, $R = \max - \min$.
• Compute $C = R \div K$.
Example: $R = 96 - 50 = 46$
$C = 46 \div 8 = 5.75 \approx 6$


3. Determine the lowest class limit. The first class must include the smallest value in the data set.
In our example, 50 is the lowest class limit.
4. Determine all the class limits by adding the class size to the limit of the previous class. (In our
example, the series of lower limits will be: 50 (+6), 56 (+6), 62, … up to 92. Each upper limit is
one less than the lower limit of the next class.)
5. Tally the frequencies for each class. Sum the frequencies and check against the total number
of observations.
6. Determine the lower class boundaries by subtracting 0.5 from the lower limits.
7. Determine the upper class boundaries by adding 0.5 to the upper limits.
8. Determine the class mark by getting the average of the lower and upper limits.
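
The eight steps above can be sketched in Python for the array of 110 final grades. In this sketch the array is stored as a tally (`grade_counts`, grade mapped to number of occurrences) to keep the listing short, and K and C are rounded to the nearest whole number as in the worked example; both choices, and all variable names, are assumptions of the sketch.

```python
import math

# Tally of the 110 final grades in the array above: {grade: number of occurrences}
grade_counts = {
    50: 6, 51: 1, 52: 1, 53: 2, 57: 1, 59: 2, 60: 3, 62: 4, 63: 1, 65: 1,
    66: 2, 68: 5, 69: 5, 70: 1, 71: 4, 72: 5, 73: 4, 74: 3, 75: 5, 76: 4,
    77: 4, 78: 1, 79: 5, 80: 3, 81: 4, 82: 6, 83: 2, 84: 7, 85: 2, 86: 2,
    87: 6, 88: 1, 89: 2, 91: 1, 92: 1, 94: 2, 96: 1,
}
grades = [g for g, count in grade_counts.items() for _ in range(count)]

n = len(grades)
k = round(1 + 3.322 * math.log10(n))                 # step 1: Sturges' formula
class_size = round((max(grades) - min(grades)) / k)  # step 2: C = R / K

table = []
lower = min(grades)                                  # step 3: lowest class limit
while lower <= max(grades):                          # step 4: remaining class limits
    upper = lower + class_size - 1
    table.append({
        "LL": lower,
        "UL": upper,
        "f": sum(lower <= g <= upper for g in grades),   # step 5: tally
        "LCB": lower - 0.5,                              # step 6
        "UCB": upper + 0.5,                              # step 7
        "x": (lower + upper) / 2,                        # step 8: class mark
    })
    lower += class_size

for row in table:
    print(row)
print("total frequency:", sum(row["f"] for row in table))
```

The total frequency printed at the end is the check described in step 5: it should equal the number of observations.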

NOTE: A frequency distribution can be extended with cumulative frequencies and relative
frequencies.
• <CF or less than cumulative frequency is the accumulated frequencies below the upper limit.
• >CF or greater than cumulative frequency is the number of observations above the lower
limit.
• Relative frequency (percentage) is the fraction of all the observations that fall in the class.

Column definitions: class mark $x = \dfrac{LL + UL}{2}$; $LCB = LL - 0.5$; $UCB = UL + 0.5$; $RF = \dfrac{f}{n}$; $RFP = \dfrac{f}{n} \times 100\%$

Class Interval   Frequency   Class Mark   Class Boundary   Cumulative Frequency   Relative Frequency
LL     UL        (f)         (x)          LCB     UCB      <CF      >CF           RF        RFP (%)
50     55        10          52.5         49.5    55.5     10       110           0.0909    9.09%
56     61        6           58.5         55.5    61.5     16       100           0.0545    5.45%
62     67        8           64.5         61.5    67.5     24       94            0.0727    7.27%
68     73        24          70.5         67.5    73.5     48       86            0.2182    21.82%
74     79        22          76.5         73.5    79.5     70       62            0.2       20%
80     85        24          82.5         79.5    85.5     94       40            0.2182    21.82%
86     91        12          88.5         85.5    91.5     106      16            0.1091    10.91%
92     97        4           94.5         91.5    97.5     110      4             0.0364    3.64%
TOTAL            110                                                              1         100%
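
The cumulative and relative frequency columns can be reproduced from the class frequencies alone. The following is a small sketch using the frequencies of the table above; the variable names are illustrative.

```python
from itertools import accumulate

frequencies = [10, 6, 8, 24, 22, 24, 12, 4]   # class frequencies from the table above
n = sum(frequencies)

# <CF: running total from the first class downward
less_than_cf = list(accumulate(frequencies))
# >CF: running total from the last class upward
greater_than_cf = list(accumulate(reversed(frequencies)))[::-1]
# RF: share of all observations in each class, to four decimal places
relative_frequency = [round(f / n, 4) for f in frequencies]

print(less_than_cf)        # [10, 16, 24, 48, 70, 94, 106, 110]
print(greater_than_cf)     # [110, 100, 94, 86, 62, 40, 16, 4]
print(relative_frequency)  # [0.0909, 0.0545, 0.0727, 0.2182, 0.2, 0.2182, 0.1091, 0.0364]
```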


EXERCISE 4.2.3 _____________


Create a Frequency Distribution Table for the data below:

144 112 156 122 168 172 141 159 127 154
156 145 134 137 123 149 144 160 136 139
142 138 159 151 147 150 126 152 147 136
135 132 146 133 150 122 139 149 152 129
131 155 116 140 145 135 160 125 172 163

Class Intervals Frequency Class Mark Class Boundary Cumulative Frequency Relative frequency
LL UL (f) (x) LCB UCB <CF >CF RF RFP (%)


4.3 MEASURES OF CENTRAL TENDENCY


MEASURES OF CENTRAL TENDENCY: UNGROUPED DATA
• A measure of central tendency is any single value that is used to identify the “center” or the typical
value of a data set. It is often referred to as an average.
a. Mean – this is obtained by summing up all the observations and dividing the sum by the number of
observations. We call this the simple mean.
Formula: $\bar{x} = \dfrac{\sum x}{n}$
Where: 𝑥̅ = mean
𝑥 = value of the particular item
𝑛 = number of items in the sample
Example:
A sample of 10 students was taken and was asked how much time they travel from their respective places
of residences to the school. The results are listed below. Compute the mean.
Student Travel time
A 30 min
B 15
C 35
D 20
E 25
F 45
G 10
H 25
I 30
J 15
b. Median – It is the middle value after arranging the set of observations in ascending or descending
order. If the number of observations is odd, the median is the middle value; if the number
of observations is even, the median is the average of the two middle values.
Formula:
$$Md = \begin{cases} X_{(n+1)/2} & \text{if } n \text{ is odd} \\ \dfrac{X_{n/2} + X_{(n/2)+1}}{2} & \text{if } n \text{ is even} \end{cases}$$
Example:
A sample of 10 students was taken and was asked how much time they travel from their respective
places of residences to the school. The results are listed below. Find the median.
Student Travel time
A 30 min
B 15
C 35
D 20
E 25
F 45
G 10
H 25
I 30
J 15
c. Mode – it is the observation that appears most often. The mode is the least preferred measure of
central location.
Example: Find the mode.
Observations                                 Mode
3 8 6 7 9 9 3 3 10                           3 (unimodal)
10 15 15 20 25 25 30 35 45                   15 & 25 (bimodal)
10 15 15 20 25 25 30 30 35 45                15, 25 & 30 (trimodal)
3 8 6 6 7 7 9 9 3 6 3 10 7 9                 3, 6, 7 & 9 (multimodal)
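
As a check on the three measures for ungrouped data, the travel-time example used for the mean and the median can be computed with Python's standard `statistics` module. This is only a sketch; the variable names are illustrative.

```python
import statistics

# Travel times (in minutes) of students A to J from the example above
travel_times = [30, 15, 35, 20, 25, 45, 10, 25, 30, 15]

mean = statistics.mean(travel_times)         # sum of observations / number of observations
median = statistics.median(travel_times)     # average of the two middle values (n is even)
modes = statistics.multimode(travel_times)   # every value tied for the highest count

print("mean:", mean)                # 25
print("median:", median)            # 25.0
print("modes:", sorted(modes))      # [15, 25, 30]
```

`multimode` is used instead of `mode` because a data set may have more than one mode, as the table above shows.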


MEASURES OF CENTRAL TENDENCY: GROUPED DATA


a. Mean
Formula: $\bar{x} = \dfrac{\sum fx}{n}$
Where: $\bar{x}$ = mean
$f$ = frequency of the class
$x$ = class mark
$n$ = number of observations
Example:

Class        Frequency (f)   CM (x)   fx
50 – 55      10              52.5     525
56 – 61      6               58.5     351
62 – 67      8               64.5     516
68 – 73      25              70.5     1,762.5
74 – 79      22              76.5     1,683
80 – 85      23              82.5     1,897.5
86 – 91      12              88.5     1,062
92 – 97      4               94.5     378
             n = 110                  ∑fx = 8,175

$$\bar{x} = \frac{\sum fx}{n} = \frac{8175}{110} = 74.32$$
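
The grouped mean above can be verified with a few lines of Python. The sketch below uses the class limits and frequencies of the example table; the variable names are illustrative.

```python
classes = [(50, 55), (56, 61), (62, 67), (68, 73), (74, 79), (80, 85), (86, 91), (92, 97)]
frequencies = [10, 6, 8, 25, 22, 23, 12, 4]

class_marks = [(ll + ul) / 2 for ll, ul in classes]            # x = (LL + UL) / 2
n = sum(frequencies)
sum_fx = sum(f * x for f, x in zip(frequencies, class_marks))  # sum of f * x
grouped_mean = sum_fx / n

print(sum_fx)                   # 8175.0
print(round(grouped_mean, 2))   # 74.32
```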

b. Median
Formula: $\tilde{x} = LCB_{md} + C\left[\dfrac{\frac{n}{2} - {<}cf_p}{f_{md}}\right]$

Where: 𝐿𝐶𝐵𝑚𝑑 = lower class boundary of the median class


𝑛 = number of observations
< 𝑐𝑓𝑝 = sum of the frequencies before the median class
𝑓𝑚𝑑 = frequency of the median class
𝐶 = class interval/size

Example:
Final grades of Stat 101 students arranged in an array. Solve for the median.
Solution:
1. Locate the median by dividing the total number of observations by 2.
$$\frac{n}{2} = \frac{110}{2} = 55$$

2. Go over the entries in the less than cumulative frequency column. The first class whose
cumulative frequency is greater than or equal to the result of step 1 is the median class.
3. Substitute the values for the median class into the formula.

Class        Frequency   LCB    <cf
50 – 55      10          49.5   10
56 – 61      6           55.5   16
62 – 67      8           61.5   24
68 – 73      25          67.5   49
74 – 79      22          73.5   71     (median class)
80 – 85      23          79.5   94
86 – 91      12          85.5   106
92 – 97      4           91.5   110
             n = 110

$$\tilde{x} = LCB_{md} + C\left[\frac{\frac{n}{2} - {<}cf_p}{f_{md}}\right] = 73.5 + 6\left[\frac{\frac{110}{2} - 49}{22}\right] = 73.5 + 6\left(\frac{6}{22}\right) = 75.14$$
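
The three steps for the grouped median can be sketched as a small Python function. The function name `grouped_median` and its arguments are assumptions of this sketch; it also assumes whole-number class limits, so that each lower class boundary is the lower limit minus 0.5.

```python
def grouped_median(lower_limits, frequencies, class_size):
    """Median of grouped data: LCB_md + C * ((n/2 - <cf_p) / f_md)."""
    n = sum(frequencies)
    half = n / 2
    cf_before = 0                      # <cf of the classes before the current one
    for lower_limit, f in zip(lower_limits, frequencies):
        if cf_before + f >= half:      # first class whose <cf reaches n/2
            lcb = lower_limit - 0.5
            return lcb + class_size * (half - cf_before) / f
        cf_before += f
    raise ValueError("frequency table is empty")

lower_limits = [50, 56, 62, 68, 74, 80, 86, 92]
frequencies = [10, 6, 8, 25, 22, 23, 12, 4]
median = grouped_median(lower_limits, frequencies, class_size=6)
print(round(median, 2))   # 75.14
```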


c. Mode
Formula: $\hat{x} = LCB_m + C\left(\dfrac{f_m - f_1}{2f_m - f_1 - f_2}\right)$
Where: $\hat{x}$ = mode
$LCB_m$ = lower class boundary of the modal class
$f_m$ = frequency of the modal class
$f_1$ = frequency of the class preceding the modal class
$f_2$ = frequency of the class following the modal class
$C$ = class size

Example:
Final grades of Stat 110 students arranged in an array. Solve for the mode.

Solution:
1. Determine the modal class by identifying the class that contains the highest frequency.
(NOTE: With the original frequencies the distribution is bimodal, since two classes, 68 – 73 and
80 – 85, each have a frequency of 24. For this example, the author altered the frequencies of the
4th and 6th classes to 25 and 23 to make the distribution unimodal.)
2. Substitute the values for the modal class into the formula.

Class        Frequency   LCB    <cf
50 – 55      10          49.5   10
56 – 61      6           55.5   16
62 – 67      8           61.5   24
68 – 73      25          67.5   49     (modal class)
74 – 79      22          73.5   71
80 – 85      23          79.5   94
86 – 91      12          85.5   106
92 – 97      4           91.5   110
             n = 110

$$\hat{x} = LCB_m + C\left(\frac{f_m - f_1}{2f_m - f_1 - f_2}\right) = 67.5 + 6\left(\frac{25 - 8}{2(25) - 8 - 22}\right) = 67.5 + 6\left(\frac{17}{20}\right) = 72.6$$
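
The grouped mode can be sketched in the same way. The function name `grouped_mode` is illustrative. Two choices in the sketch are assumptions not stated in the module: a missing neighbouring class (when the modal class is the first or last class) is given a frequency of 0, and only the first class with the highest frequency is used when several classes tie.

```python
def grouped_mode(lower_limits, frequencies, class_size):
    """Mode of grouped data: LCB_m + C * ((f_m - f_1) / (2 f_m - f_1 - f_2))."""
    m = max(range(len(frequencies)), key=frequencies.__getitem__)  # index of modal class
    f_m = frequencies[m]
    f_1 = frequencies[m - 1] if m > 0 else 0                       # class before (assumed 0 if none)
    f_2 = frequencies[m + 1] if m < len(frequencies) - 1 else 0    # class after (assumed 0 if none)
    lcb = lower_limits[m] - 0.5
    return lcb + class_size * (f_m - f_1) / (2 * f_m - f_1 - f_2)

lower_limits = [50, 56, 62, 68, 74, 80, 86, 92]
frequencies = [10, 6, 8, 25, 22, 23, 12, 4]
mode = grouped_mode(lower_limits, frequencies, class_size=6)
print(round(mode, 2))   # 72.6
```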

EXERCISE 4.3.1 _________________


1. The owner of a newly opened Internet café recorded the number of customers who are coming in
to his Internet café. Below is a tabulation of the number of customers for 10 days. Calculate the
mean, median and mode.
Days No. of Customers
1st 8
2nd 5
3rd 9
4th 12
5th 12
6th 10
7th 15
8th 15
9th 15
10th 14


2. Complete the Frequency Distribution Table to find the mean, median and mode of the data set
given:
Class F CM (x) fx LCB <CF
10-19 3
20-29 1
30-39 3
40-49 2
50-59 9
60-69 8
70-79 35
80-89 30
90-99 9


4.4 MEASURES OF ABSOLUTE DISPERSION


MEASURES OF DISPERSION
• It indicates the extent to which individual items in a series are scattered about an average.
Some Uses for Measuring Dispersion:
• To determine the extent of the scatter so that steps may be taken to control the existing variation.
• Used as a measure of reliability of the average value
General Classifications of Measures of Dispersion:
1. Measures of Absolute Dispersion
2. Measures of Relative Dispersion

MEASURES OF ABSOLUTE DISPERSION: UNGROUPED DATA


• Expressed in the units of the original observations.
• They cannot be used to compare variations of two data sets when the averages of these data sets
differ a lot in value or when the observations differ in units of measurement.

1. Range – it is the difference between the largest and smallest values.


Range = maximum – minimum
Example:
a. The IQs of 5 members of a certain family are 108, 112, 127, 116 and 113. Find the range.
Range = maximum – minimum
Range = 127 – 108 = 19
2. Mean Absolute Deviation or Average Deviation
$$MD = \frac{\sum |x - \bar{x}|}{n}$$
3. Standard Deviation – is the most frequently used measure of dispersion.
Formula: $s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$
Where: 𝑠 = sample standard deviation
𝑥 = observation
𝑥̅ = sample mean
𝑛 = number of observation
Steps in Calculating the Standard Deviation
1. Compute the mean
2. Compute the deviations by subtracting the mean from each of the observations
3. Square the deviations
4. Take the sum of the squared deviations
5. Divide the sum by n – 1 to get the sample variance
6. Take the square root of the sample variance

Example:
Below is the list of the scores of two groups of students in a grammar quiz.
Group A Group B
13 10
14 10
15 15
16 18
19 18
20 19
25 26
30 36

Solution:
1. Compute the mean
$$\bar{x}_A = \frac{\sum x}{n} = \frac{152}{8} = 19 \qquad \bar{x}_B = \frac{\sum x}{n} = \frac{152}{8} = 19$$


2. Compute the deviations by subtracting the mean from each of the observations, and then
square the deviations.
Group A   x − x̄   (x − x̄)²        Group B   x − x̄   (x − x̄)²
13        -6       36              10        -9       81
14        -5       25              10        -9       81
15        -4       16              15        -4       16
16        -3       9               18        -1       1
19        0        0               18        -1       1
20        1        1               19        0        0
25        6        36              26        7        49
30        11       121             36        17       289
Sum                244             Sum                518

3. Take the sum of the squared deviations, divide the sum by n – 1 to get the sample variance,
then take the square root of the sample variance.
$$s_A = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\frac{244}{8 - 1}} = 5.90 \qquad s_B = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\frac{518}{8 - 1}} = 8.60$$
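
The steps for the sample standard deviation can be re-computed in Python as a check on hand calculations. The sketch below follows the listed steps for the two groups of quiz scores and compares the result with `statistics.stdev`, which also divides by n − 1; the helper name `sample_sd` is illustrative.

```python
import math
import statistics

group_a = [13, 14, 15, 16, 19, 20, 25, 30]
group_b = [10, 10, 15, 18, 18, 19, 26, 36]

def sample_sd(scores):
    """Follow the listed steps: mean, deviations, squares, sum, divide by n - 1, root."""
    mean = sum(scores) / len(scores)
    squared_deviations = [(x - mean) ** 2 for x in scores]
    variance = sum(squared_deviations) / (len(scores) - 1)
    return math.sqrt(variance)

for name, scores in (("A", group_a), ("B", group_b)):
    by_steps = sample_sd(scores)
    by_library = statistics.stdev(scores)   # standard-library sample standard deviation
    print(name, round(by_steps, 2), round(by_library, 2))
```

Both groups have the same mean, so the larger standard deviation of Group B indicates that its scores are more scattered about that mean.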

MEASURES OF ABSOLUTE DISPERSION: GROUPED DATA


Mean Deviation
$$MD = \frac{\sum f|x - \bar{x}|}{n}$$
Standard Deviation – is the most frequently used measure of dispersion.
Formula: $s = \sqrt{\dfrac{\sum f(x - \bar{x})^2}{n - 1}}$
Where: 𝑠 = sample standard deviation
𝑓 = frequency
𝑥 = class mark
𝑥̅ = sample mean
𝑛 = number of observation
Steps in Calculating the Standard Deviation
1. Compute the mean
2. Compute the deviations by subtracting the mean from each class mark
3. Square the deviations
4. Multiply each squared deviation by its corresponding frequency
5. Take the sum of the products of the squared deviations and the frequencies
6. Divide the sum by n – 1 to get the sample variance
7. Take the square root of the sample variance

Example:
Final grades of students in Stat 110 arranged in an FDT. Solve for the standard deviation.
Class        Frequency   CM (x)   fx   x − x̄   (x − x̄)²   f(x − x̄)²
50 – 55      10
56 – 61      6
62 – 67      8
68 – 73      25
74 – 79      22
80 – 85      23
86 – 91      12
92 – 97      4
             n = 110

$$s = \sqrt{\frac{\sum f(x - \bar{x})^2}{n - 1}} \qquad MD = \frac{\sum f|x - \bar{x}|}{n}$$
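
After filling in the table by hand, the columns and the two measures can be checked with a short Python sketch. It uses the class limits and frequencies of the table above; the variable names are illustrative, and the printed rows correspond to the columns x, fx, x − x̄, (x − x̄)² and f(x − x̄)².

```python
import math

classes = [(50, 55), (56, 61), (62, 67), (68, 73), (74, 79), (80, 85), (86, 91), (92, 97)]
frequencies = [10, 6, 8, 25, 22, 23, 12, 4]

class_marks = [(ll + ul) / 2 for ll, ul in classes]
n = sum(frequencies)
mean = sum(f * x for f, x in zip(frequencies, class_marks)) / n

# One row per class: x, fx, x - mean, (x - mean)^2, f(x - mean)^2
for f, x in zip(frequencies, class_marks):
    d = x - mean
    print(x, f * x, round(d, 2), round(d ** 2, 2), round(f * d ** 2, 2))

sum_f_sq_dev = sum(f * (x - mean) ** 2 for f, x in zip(frequencies, class_marks))
std_dev = math.sqrt(sum_f_sq_dev / (n - 1))     # s = sqrt(sum f(x - mean)^2 / (n - 1))
mean_dev = sum(f * abs(x - mean) for f, x in zip(frequencies, class_marks)) / n

print("s  =", round(std_dev, 2))
print("MD =", round(mean_dev, 2))
```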


EXERCISE 4.4 ____________


A pediatrician has clinic hours in two leading hospitals. His clinic schedule in Alabang is 10:00 to
12:00 pm, MWF. His clinic schedule in Makati is 2:00 to 4:00 pm, TTh. The logbook of his secretaries
shows the number of patients who visited him for the last two weeks.
Hospital in Alabang Hospital in Makati
4,800 4,200
4,200 3,600
4,200 3,600
3,000 3,000
2,400 4,800

Complete the Frequency Distribution Table to find the standard deviation of the data set given:
Class F CM (x) 𝑓𝑥 𝑥 − 𝑥̅ (𝑥 − 𝑥̅ )2 𝑓(𝑥 − 𝑥̅ )2

10-19 3

20-29 1

30-39 3

40-49 2

50-59 9

60-69 8

70-79 35

80-89 30

90-99 9


4.5 MEASURES OF RELATIVE DISPERSION


NORMAL DISTRIBUTION
Properties of a Normal Distribution

a. The mean, median, and mode are all equal and are located at the center of the distribution.
b. The distribution is symmetric. The distribution depicts a bell-shaped curve where the left area
is a mirror image of the right area.
c. The total area under the normal curve is 1 or 100%.
d. The distribution is asymptotic: the tails of the curve approach the horizontal axis but never touch it.
e. The location of the distribution is determined by the mean and the standard deviation
determines dispersion of the distribution.

The graph below shows a normal distribution:

[Figure: bell-shaped normal curve, with the horizontal axis marked at μ − 3σ, μ − 2σ, μ − 1σ, μ, μ + 1σ, μ + 2σ and μ + 3σ]

The mean and the standard deviation determine the shape of the distribution.

As previously stated, there is an infinite family of normal curves, one for each combination of
mean and standard deviation. This may suggest that we need a different table for every
mean and standard deviation. We do not. Instead, we standardize a given
observation. The standardized score is also called the Z-value, Z statistic, standard deviate,
standard normal value, or simply normal value. The formula is shown below.
$$Z = \frac{x - \mu}{\sigma}$$

Where: Z = normal value
       x = value of any particular observation
       μ = mean of the distribution
       σ = standard deviation of the distribution
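
To illustrate the standardization formula, here is a small Python sketch. The function name `z_score` and the sample values are assumptions made for illustration and do not come from the module.

```python
def z_score(x, mu, sigma):
    """Standardize an observation: Z = (x - mu) / sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


# Hypothetical values: an observation of 85 from a distribution
# with mean 75 and standard deviation 5.
z = z_score(x=85, mu=75, sigma=5)
print(z)  # 2.0
```

A Z-value of 2.0 means the observation lies two standard deviations above the mean.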


The different rules presented by examples can be summarized as follows:

Z-values                                                Rules
1. One z-value is positive and the other is negative    Add the areas of the corresponding z-values.
2. Both z-values are positive or both z-values          In either case, subtract the smaller area from
   are negative                                         the bigger area.
3. To the right of a positive z-value or to             Subtract the area from 0.5.
   the left of a negative z-value
4. To the right of a negative z-value or to             Add the area to 0.5.
   the left of a positive z-value

In these rules, the area of a z-value means the area between 0 and that z-value.

Examples:
Find the area under the normal distribution curve for each of the following:
1. 0 < z < 1.63
2. 0 > z > –2.44
3. z < 2.44
4. z > –1.63
5. z > 1.63
6. z < –2.44
7. –2.44 < z < –1.05
8. –1.05 < z < 1.63
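
The four rules can be applied to the eight examples with a short script. This is a sketch under stated assumptions: it replaces the printed z-table with `statistics.NormalDist` from the Python standard library (Python 3.8 or later), and the function names are chosen here for illustration. `table_area` returns the area between 0 and z, which is the quantity the rules add to or subtract from 0.5.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution: mean 0, standard deviation 1


def table_area(z):
    """Area between 0 and z (the quantity the rules refer to)."""
    return abs(STD.cdf(z) - 0.5)


def area_between(z1, z2):
    """Rules 1 and 2: add the areas if the signs differ, otherwise subtract."""
    a1, a2 = table_area(z1), table_area(z2)
    return a1 + a2 if z1 * z2 < 0 else abs(a1 - a2)


def area_right(z):
    """Rules 3 and 4 for the area to the right of z."""
    return 0.5 - table_area(z) if z > 0 else 0.5 + table_area(z)


def area_left(z):
    """Rules 3 and 4 for the area to the left of z."""
    return 0.5 + table_area(z) if z > 0 else 0.5 - table_area(z)


examples = {
    "1. 0 < z < 1.63": area_between(0, 1.63),
    "2. -2.44 < z < 0": area_between(-2.44, 0),
    "3. z < 2.44": area_left(2.44),
    "4. z > -1.63": area_right(-1.63),
    "5. z > 1.63": area_right(1.63),
    "6. z < -2.44": area_left(-2.44),
    "7. -2.44 < z < -1.05": area_between(-2.44, -1.05),
    "8. -1.05 < z < 1.63": area_between(-1.05, 1.63),
}
for label, area in examples.items():
    print(f"{label}: {area:.4f}")
```

Results may differ from a printed table in the last decimal place because the table values are rounded.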


EXERCISE 4.5 _____________


Sketch the normal distribution for the given problem. Show your solutions.
A data set follows a normal distribution with a mean of 40 and a standard deviation of 4.75.
What is the area under the normal curve:
a. between 34.06 and 46.08?
b. between 28.6 and 35.11?
c. greater than 49.5?
d. less than 44.04?


4.6 HYPOTHESIS TESTING


CONCEPT OF HYPOTHESIS TESTING
• Hypothesis – a statement about the population developed for the purpose of testing.
• Hypothesis testing – a step-by-step procedure whose major objective is to
make a decision based on the gathered data.

NULL AND ALTERNATIVE HYPOTHESES


Hypotheses in statistical inference are classified into two:
1. Null hypothesis – denoted by H0, it is the statement that a certain action has no
effect. This hypothesis also asserts the absence of a difference between
the observed and the expected values. The null hypothesis should be stated by saying “There
is no significant difference…”, “There is no relationship…”, or “There is no change…”
2. Alternative hypothesis – denoted by Ha, it is the assertion that contradicts the null
hypothesis. Thus, if the null hypothesis is true, then the alternative hypothesis
must be false. To state the alternative hypothesis of our null, we may say, “There is a significant
relationship between…” The alternative hypothesis tells us whether the test is one-tailed or
two-tailed.

ONE-TAILED AND TWO-TAILED TEST


1. One-tailed test – a test where the area of rejection lies on only one side (tail) of the
distribution. The one-tailed test is used if the alternative hypothesis is directional.
Example:
A teacher employed two different teaching strategies in presenting her lesson: the lecture
method and the discussion method. After the presentation, a 30-point quiz was given. The
mean score of the students taught with the discussion method was 25,
with a standard deviation of 3. The mean score of the students taught with the lecture method
was 19, with a standard deviation of 3.2. At the 0.01 significance
level, can we conclude that the discussion method is more effective than the lecture
method?
H0: The discussion method is as effective as the lecture method.
Ha: The discussion method is more effective than the lecture method.

2. Two-tailed test – a test where the areas of rejection lie on both sides (tails) of the
distribution. The two-tailed test is used if the alternative hypothesis is non-directional.
Example:
A test was administered to two groups of students – the HRM student group and the
tourism student group. At the 0.05 significance level, is there a difference between the scores
obtained by the two groups of students?
H0: There is no significant difference between the scores obtained by the two groups of
students.
Ha: There is a significant difference between the scores obtained by the two groups of
students.

LEVEL OF SIGNIFICANCE
• It is the probability of rejecting a true null hypothesis.
• If the null hypothesis is true and is rejected, it is called a TYPE I ERROR. If the null hypothesis
is false and is accepted, it is called a TYPE II ERROR.

                        Decision
Null Hypothesis    Reject H0           Accept H0
H0 is true         Type I Error        Correct Decision
H0 is false        Correct Decision    Type II Error


CRITICAL VALUE
• The value that divides the area of rejection and the area of acceptance.

[Figure: curve with the region of acceptance between the critical values -1.701 and 1.701 and a
region of rejection in each tail]

STEPS IN HYPOTHESIS TESTING
1. State the null hypothesis (H0) and the alternative hypothesis (Ha).
2. Set the desired level of significance.
3. Determine the appropriate test statistic and establish the critical region.
4. Compute the test statistic as a basis for decision.
5. Formulate the decision.
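
The five steps can be traced in code. The sketch below is an illustration only: it assumes a z-test of one population mean with a known population standard deviation, uses `statistics.NormalDist` from the Python standard library for the critical value, and runs on hypothetical numbers that do not come from the module. The function name `one_sample_z_test` is likewise an assumption.

```python
from statistics import NormalDist


def one_sample_z_test(sample_mean, mu0, sigma, n, alpha=0.05, tail="two"):
    """Return (z, critical value, reject H0?) for a z-test of one population mean.

    tail: "two" (Ha: mu != mu0), "right" (Ha: mu > mu0), or "left" (Ha: mu < mu0).
    """
    if tail not in ("two", "right", "left"):
        raise ValueError("tail must be 'two', 'right', or 'left'")
    std = NormalDist()
    # Step 3: establish the critical region for the chosen level of significance.
    critical = std.inv_cdf(1 - alpha / 2) if tail == "two" else std.inv_cdf(1 - alpha)
    # Step 4: compute the test statistic.
    z = (sample_mean - mu0) / (sigma / n ** 0.5)
    # Step 5: formulate the decision.
    if tail == "two":
        reject = abs(z) > critical
    elif tail == "right":
        reject = z > critical
    else:
        reject = z < -critical
    return z, critical, reject


# Steps 1 and 2 with hypothetical numbers: H0: mu = 100, Ha: mu != 100, alpha = 0.05.
z, critical, reject = one_sample_z_test(sample_mean=103, mu0=100, sigma=10, n=25)
print(f"z = {z:.2f}, critical value = ±{critical:.2f}, reject H0: {reject}")
```

Here the computed z falls inside the region of acceptance, so the null hypothesis is not rejected at the 0.05 level.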

Examples:
For each of the problems below, do the following:
• Define the variable that you are going to use to represent the information.
• Formulate the appropriate null hypothesis (H0) and the appropriate alternative hypothesis
(Ha).

1. The soft drink dispenser of a fast food center was just readjusted. The manager, wanting to
know if the dispenser is really in good condition, got a sample of 50 cups filled by the
dispenser. She would classify the dispenser as “in good condition” (and therefore not in need
of further readjustment) only if the average fill per cup of the dispenser is 8 ounces.
Solution:
• Variable: The variable that will represent the information is –
X = fill per cup of the dispenser.
• Hypotheses: H0: μ = 8 ounces (The dispenser is “in good condition”.)
Ha: μ ≠ 8 ounces (The dispenser is not “in good condition”.)

2. Jenny suspects that male CvSU-CCC students spend less time studying compared to their
female counterparts. She decided to conduct a study of the time that male and female
CvSU-CCC students spend doing their school work.
Solution:
• Variable: The variables that will represent the information are –
X = time spent by a male CvSU-CCC student in doing school work.
Y = time spent by a female CvSU-CCC student in doing school work.
• Hypotheses: H0: μx = μy (The average time spent by male CvSU-CCC students in
doing school work is the same as that of female CvSU-CCC students.)
Ha: μx < μy (The average time spent by male CvSU-CCC students in
doing school work is less than that of female CvSU-CCC students.)


EXERCISE 4.6 _____________


State the null and alternative hypotheses and the type of test (right-tailed, left-tailed, or
two-tailed) for each of the following research problems. H1 denotes the alternative hypothesis (Ha).
1. A forester studying diameter growth of red pine believes that the mean diameter growth will
be different if a fertilization treatment is applied to the stand.

H0: _________________________________________________________________________________

H1: _________________________________________________________________________________

Test: _________________________________________________________________________________

2. A biologist believes that there has been an increase in the mean number of lakes infected
with milfoil, an invasive species, since the last study five years ago.
H0: _________________________________________________________________________________

H1: _________________________________________________________________________________

Test: _________________________________________________________________________________

3. A scientist’s research indicates that there has been a change in the proportion of people
who support certain environmental policies. He wants to test the claim that there has been
a reduction in the proportion of people who support these policies.

H0: _________________________________________________________________________________

H1: _________________________________________________________________________________

Test: _________________________________________________________________________________

4. For a shipment of cable, suppose that the specifications call for a mean breaking strength
of 2010 pounds. A sample of the breaking strength of 32 segments of cable has a mean of
1895 pounds with an associated standard deviation of 59 pounds. Using the 5% level, test the
significance of the difference found.

H0: _________________________________________________________________________________

H1: _________________________________________________________________________________

Test: _________________________________________________________________________________

5. An electrical company claimed that less than 2% of the parts which they supplied on a
government contract are defective. A sample of 642 parts was tested, and 17 did not meet
the specifications. Can we accept the company’s claim at a .05 level of significance?
H0: _________________________________________________________________________________

H1: _________________________________________________________________________________

Test: _________________________________________________________________________________


4.7 STATISTICAL TESTS


TEST OF RELATIONSHIP
1. Pearson Product Moment Correlation (Pearson R)
FUNCTION: Parametric. Used to test the relationship between two variables measured on an
interval or ratio scale.
LEVEL OF MEASUREMENT: Interval/Ratio
SAMPLE DATA: Test Scores, Grades, IQ, Academic performance, Attendance, Budget
RESEARCH PROBLEM: Is there a significant relationship between the level of academic
motivation and academic performance of the participants?

2. Spearman Rank-Order Correlation (Spearman’s Rho)
FUNCTION: Non-parametric. Used to determine if there is a correlation or relationship
between two variables of ordinal type.
LEVEL OF MEASUREMENT: Ordinal
SAMPLE DATA: Percentile, class ranking, social status
RESEARCH PROBLEM: Is there a significant relationship between the student’s ranking in
Mathematics and Science subjects?

3. Chi-Square Test of Independence
FUNCTION: Non-parametric. Used to determine if there is a relationship or
association between variables of nominal type.
LEVEL OF MEASUREMENT: Nominal
SAMPLE DATA: Gender / Sex, School location, Number of responses (Frequency)
RESEARCH PROBLEM: Is sex related to color preference?; Is there a relationship between the
type of school attended and students’ gender?
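
As an illustration of the first two tests of relationship, the sketch below computes the Pearson and Spearman coefficients in plain Python. It is not taken from the module: the data are hypothetical, the function names are assumptions, and the script only computes the coefficients. Judging whether a coefficient is significant still requires the hypothesis-testing steps of Section 4.6.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Rank from 1 (smallest); tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman rank-order correlation: Pearson r computed on the ranks."""
    return pearson_r(ranks(x), ranks(y))


# Hypothetical scores for five participants.
motivation = [1, 2, 3, 4, 5]
performance = [2, 4, 5, 4, 5]
print(f"Pearson r = {pearson_r(motivation, performance):.4f}")
print(f"Spearman rho = {spearman_rho(motivation, performance):.4f}")
```

Computing Spearman's rho as Pearson's r on the ranks handles tied ranks without a separate correction formula.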

TEST OF DIFFERENCE
1. Z – Test of One Population Mean
FUNCTION: Parametric. Used to determine if a given sample mean was drawn from a
population with known parameters.
LEVEL OF MEASUREMENT: Interval/Ratio
SAMPLE DATA: SATT Scores, Average, Ratings, IQ, Budget, Gross Income
RESEARCH PROBLEM: Does the group of teenagers in Makati represent Metro Manila teenagers?;
Is there enough evidence to contradict the rental company’s claim that the mean time to
rent a car on their website is 60 seconds if the mean rental time of a random sample of 36
customers was 75 seconds?; Is there a significant difference between the mean score of the
2018 LET passers from CvSU and the mean score of the total LET passers of CvSU?

2. Z – Test of Independent Proportions
FUNCTION: Non-parametric. Used to determine if there is a significant difference between
two independent or two different groups in situations that call for two types of responses.
LEVEL OF MEASUREMENT: Nominal
SAMPLE DATA: Gender/Sex, Public/Private School, Married/Single, Number of responses
RESEARCH PROBLEM: Is there a significant difference between the students and the teachers
who are in favor of Duterte’s war on drugs?

3. Z – Test of Dependent Proportions
FUNCTION: Non-parametric. Used to determine if there is a significant difference between
pairs of observations from a single group.
LEVEL OF MEASUREMENT: Nominal
SAMPLE DATA: Gender/Sex, Public/Private School, Married/Single, Number of responses
RESEARCH PROBLEM: Is there a significant difference between students who are in favor of
Duterte’s war on drugs before and after the forum?; Is there a significant difference between
voters’ choice of candidate before and after the political debate?

4. T – Test of Independent Means
FUNCTION: Parametric. Used to determine if there is a significant difference between two
different or two independent groups in terms of means.
LEVEL OF MEASUREMENT: Interval/Ratio
SAMPLE DATA: SATT Scores, Average, Ratings, IQ, Budget, Gross Income
RESEARCH PROBLEM: Is there a significant difference between the academic performance
in Mathematics of K-12 and Non-K-12 graduates?; Is there a significant difference between
the perception of the teachers and students on the use of an on-line learning management
system?

5. T – Test of Dependent Means (Paired T-Test)
FUNCTION: Parametric. Used to determine if there is a significant difference between two
groups or two sets of correlated scores; usually used when the same group is measured before
and after a treatment.
LEVEL OF MEASUREMENT: Interval/Ratio
SAMPLE DATA: Pre-test and Post-test Scores; Mean weight before and after intensive training;
SATT Scores, Average, Ratings, IQ, Budget, Gross Income
RESEARCH PROBLEM: Is there a significant difference between the diagnostic and summative
exam scores of the students after undergoing an intervention program?; Is there a significant
difference in the English proficiency level of the participants before and after attending
Speech Communication courses?

6. Chi – Square Test of Goodness of Fit
FUNCTION: Non-parametric. Used to determine if there is a significant difference between
the observed distribution and the expected distribution.
LEVEL OF MEASUREMENT: Nominal
SAMPLE DATA: Gender/Sex, Public/Private School, Married/Single, Number of responses
RESEARCH PROBLEM: Is there a significant difference between the observed distribution and
the expected distribution of teachers’ responses on the issue of Duterte’s making an alliance
with China and Russia?; Is there a significant difference between the observed and the
expected distribution of male and female enrollees in CvSU – CCC?

7. One – Way Analysis of Variance (ANOVA I)
FUNCTION: Parametric. Used to determine if there is a significant difference between two or
more groups in terms of means.
LEVEL OF MEASUREMENT: Interval/Ratio
SAMPLE DATA: Average, Ratings, IQ, Budget, Gross Income, Speed
RESEARCH PROBLEM: Is there a significant difference between three models of photocopy
machines in terms of the average number of photocopies each can produce in a week?; Is there a
significant difference in the Mathematics anxiety level of the participants in terms of their
learning styles?

8. Two – Way Analysis of Variance (ANOVA II)
FUNCTION: Parametric. Used to determine if there is a significant difference in terms of means
between two or more groups that have two or more independent variables.
LEVEL OF MEASUREMENT: Interval/Ratio
SAMPLE DATA: Average, Ratings, IQ, Budget, Gross Income, Speed
RESEARCH PROBLEM: Is there a significant difference between the mean scores of the
students on the use of modularized instruction, cooperative learning and lecture method in
terms of medium of instruction in English and Filipino?; Is there a significant difference in
the ratings of the students to music and movies in terms of genres?
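
The FUNCTION and LEVEL OF MEASUREMENT entries above can be read as a lookup table for choosing a test. The following Python sketch encodes them that way. The key labels (such as "paired" or "two independent groups") and the function name `suggest_test` are assumptions introduced for this illustration; they are a summary of the entries, not an official decision rule.

```python
# Lookup table summarizing the tests listed in Section 4.7.
# Keys are (purpose, level of measurement, design); the design labels are
# hypothetical names chosen for this sketch.
TESTS = {
    ("relationship", "interval/ratio", None): "Pearson Product Moment Correlation",
    ("relationship", "ordinal", None): "Spearman Rank-Order Correlation",
    ("relationship", "nominal", None): "Chi-Square Test of Independence",
    ("difference", "interval/ratio", "one sample vs population"): "Z-Test of One Population Mean",
    ("difference", "nominal", "two independent groups"): "Z-Test of Independent Proportions",
    ("difference", "nominal", "paired"): "Z-Test of Dependent Proportions",
    ("difference", "interval/ratio", "two independent groups"): "T-Test of Independent Means",
    ("difference", "interval/ratio", "paired"): "T-Test of Dependent Means (Paired T-Test)",
    ("difference", "nominal", "observed vs expected"): "Chi-Square Test of Goodness of Fit",
    ("difference", "interval/ratio", "groups by one factor"): "One-Way Analysis of Variance (ANOVA I)",
    ("difference", "interval/ratio", "groups by two factors"): "Two-Way Analysis of Variance (ANOVA II)",
}


def suggest_test(purpose, level, design=None):
    """Return the test listed for the given purpose, level, and design."""
    key = (purpose.lower(), level.lower(), design.lower() if design else None)
    if key not in TESTS:
        raise KeyError(f"no test listed for {key}")
    return TESTS[key]


print(suggest_test("relationship", "ordinal"))
print(suggest_test("difference", "interval/ratio", "paired"))
```

In practice, identifying the purpose, the level of measurement, and the design of a research problem is the judgment step; the lookup only records the pairing once those are known.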


EXERCISE 4.7 _____________


Determine the type of test that is suitable for each of the research problems below:
1. Is there a significant difference in the mathematics performance of grade 7 students in
terms of gender?
2. Is there a significant relationship between amount of alcohol intake and weight?
3. Does social status affect employability?
4. Is there a significant difference between the level of perception of the participants
regarding the benefits of networking before and after the seminar?
5. What is the difference in average pain levels among post-surgical patients given three
different painkillers?
