Week-1-Intro-to-Stat-Collection-of-Data
LEARNING OUTCOMES
At the end of the chapter, the students should be able to:
1. Define Statistics;
2. Differentiate descriptive and inferential Statistics;
3. Define the important terminologies in Statistics;
4. Cite examples of qualitative and quantitative data;
5. Enumerate and describe each of the scales of measurement; and
6. Identify the type of measurement in a given data.
7. Define collection of data;
8. Classify the data based on type and kind;
9. Enumerate and define the sources of data;
10. Enumerate and define the methods of collecting data;
11. Point out the advantages and disadvantages of each method of
collecting data; and
12. Develop honesty and accuracy in solving problems.
Introduction to Statistics
The figure below shows the of Statistics:
The application of statistics is extensive, so let
us discuss the other fields where statistical
methods are commonly applied:
Business and Industry – Manufacturing
Build products and deliver
services that satisfy consumers
and increase the corporation’s
profit margin
Engineering
Make a consistent product,
detect problems, minimize
waste, and predict product life
in electronics, chemicals,
aerospace, pollution control,
construction,
Statistical Computing
Work in software design and
development, testing, quality
assurance, technical support,
education, marketing, and sales
to develop code that is both
user-friendly and sufficiently
complex
Health and Medicine
Epidemiology
Work on calculating cancer
incidence rates, monitor disease
outbreaks, and monitor changes
in health-related behaviors such
as smoking and physical
activity
Health and Medicine
Public Health
Prevent disease, prolong life,
and promote health through
organized community efforts,
including sanitation, hygiene
education, diagnoses, and
preventative treatment
Health and Medicine
Pharmacology
Work in drug discovery,
development, approval, and
marketing, to ensure the
validity and accuracy of
findings at all stages of the
process
Learning
Education
Teach K-12 through postgraduate
students, assess
teacher effectiveness, or
develop statistical models to
represent student learning
Social Statistics
Law
Analyze data in court cases,
including DNA evidence, salary
discrepancies, discrimination
lawsuits, and disease clusters
IMPORTANT TERMS
Statistics
It is the science of collecting, organizing, analyzing, and
interpreting data in order to make decisions.
Population
The collection of all responses, measurements, or counts that are
of interest
Sample
A portion or subset of the population
Parameter
A number that describes a population characteristic
Example: Average gross income of all people in the Philippines
in 2019
Statistic
A number that describes a sample characteristic
Example: 2019 gross income of people in a sample of 3 regions.
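The difference between a parameter and a statistic can be sketched with a toy population. The income figures below are made up for illustration (they are not Philippine data), and the sample size of 4 is an arbitrary assumption.

```python
import random
import statistics

# Hypothetical population: gross incomes (in thousand pesos) of 10 people.
population = [180, 220, 250, 310, 150, 400, 275, 190, 330, 295]

# Parameter: a number computed from every member of the population.
population_mean = statistics.mean(population)

# Statistic: a number computed from a sample (here, 4 members drawn at random).
rng = random.Random(1)  # fixed seed so the run is repeatable
sample = rng.sample(population, 4)
sample_mean = statistics.mean(sample)

print("Parameter (population mean):", population_mean)
print("Statistic (sample mean):", sample_mean)
```

The parameter is a single fixed number, while the statistic changes whenever a different sample is drawn.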
Qualitative Variable
Variable whose observations vary in kind but not in degree
Examples: Sex
Religion
Marital Status
Quantitative variable
Variable whose observations vary in magnitude
Examples: Age
Number of children
Monthly income
Discrete Quantitative Variable
Quantitative variables whose observations can assume only a
countable number of values
Examples: Number of children in the family
Number of family planning methods heard
Number of dates in the past month
Continuous Quantitative Variable
Quantitative variables whose observations can assume any of the
uncountably many values in an interval of the real line
Examples: Height
Weight
Time
Independent Variables
Cause or determine or influence the dependent variable(s)
Dependent Variables
Presumed outcome of the influence of the independent variable(s)
Intervening Variables
Sometimes referred to as test or control variables
Used to test whether the observed relations between the
independent and dependent variables are spurious
Serve either to increase or decrease the effect the independent
variable has on the dependent variable
Levels of Measurement
1. Nominal
2. Ordinal
3. Interval
4. Ratio
Nominal
A measurement level in which numbers are used as labels or
names rather than to reflect quantitative information
Examples Sex
Marital status
ID number
Ordinal
A measurement level in which values reflect only rank order
Example 1
Educational attainment 1 = Elementary
2 = High School
3 = College
Example 2
Opinion on an issue 5 = Strongly agree
4 = Agree
3 = Neutral
2 = Disagree
1 = Strongly disagree
Interval
A measurement level with an arbitrary zero point in which
numerically equal intervals at different locations on the scale reflect
the same quantitative difference
Examples
Temperature in Celsius or Fahrenheit
IQ level
Ratio
The highest level of measurement that has all the characteristics of
the interval scale plus a true zero point
Examples
Income
No. of children
Age
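A minimal sketch of why the zero point matters. The temperatures and incomes are illustrative assumptions: on an interval scale, equal differences survive a change of units but ratios do not; on a ratio scale, ratios survive as well.

```python
def celsius_to_fahrenheit(celsius):
    """Convert a Celsius temperature to Fahrenheit."""
    return celsius * 9 / 5 + 32

# Interval scale: equal differences stay equal after a change of units...
temps_c = [10.0, 20.0, 30.0]
temps_f = [celsius_to_fahrenheit(t) for t in temps_c]
equal_steps_c = (temps_c[1] - temps_c[0]) == (temps_c[2] - temps_c[1])
equal_steps_f = (temps_f[1] - temps_f[0]) == (temps_f[2] - temps_f[1])

# ...but ratios do not, because zero degrees is an arbitrary point.
ratio_c = temps_c[1] / temps_c[0]
ratio_f = temps_f[1] / temps_f[0]

# Ratio scale: income has a true zero, so the ratio survives a unit change.
income_pesos = [20000.0, 40000.0]
income_thousands = [x / 1000 for x in income_pesos]
ratio_pesos = income_pesos[1] / income_pesos[0]
ratio_thousands = income_thousands[1] / income_thousands[0]

print("Equal steps kept (C, F):", equal_steps_c, equal_steps_f)
print("Temperature ratio (C, F):", ratio_c, ratio_f)
print("Income ratio (pesos, thousands):", ratio_pesos, ratio_thousands)
```

Saying that 20 degrees Celsius is "twice as hot" as 10 degrees is therefore not meaningful, while saying that one income is twice another is.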
Properties Held by Each Level of Measurement
Level of measurement   Categories   Ranks   Equal intervals   True zero point
Nominal                Yes          No      No                No
Ordinal                Yes          Yes     No                No
Interval               Yes          Yes     Yes               No
Ratio                  Yes          Yes     Yes               Yes
TWO BRANCHES OF STATISTICS
Descriptive Statistics
involves organizing, summarizing, and displaying data.
collects data through surveys and interviews
presents data by means of tables and graphs
characterizes data using summary measures such as the sample mean
Descriptive Statistics
Examples
• A bowler wants to find his bowling average for the past 10
games.
• A teacher wishes to determine the percentage of students who
passed the preliminary examination in Differential calculus.
• A student wishes to determine the average monthly
expenditures on school supplies for the past 3 weeks.
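The first two examples above can be sketched in a few lines. The bowling scores and exam results are hypothetical values invented for illustration.

```python
import statistics

# Hypothetical scores for a bowler's past 10 games.
bowling_scores = [152, 168, 175, 140, 190, 163, 158, 171, 149, 184]
bowling_average = statistics.mean(bowling_scores)

# Hypothetical exam results: True means the student passed.
passed = [True, True, False, True, False, True, True, True, False, True]
pass_percentage = 100 * sum(passed) / len(passed)

print("Bowling average:", bowling_average)
print("Percentage who passed:", pass_percentage)
```

Both results only summarize the data at hand; no claim is made about games or students outside the data, which is what makes them descriptive.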
Inferential Statistics
involves using sample data to draw conclusions about a
population.
drawing conclusions and/or making decisions concerning a
population based on sample results.
Inferential Statistics
Examples
• A manager would like to predict based on previous years’ sales,
the sales performance of a company for the next five years.
• A politician would like to estimate, based on opinion poll, his
chance for winning in the upcoming 2019 senatorial election.
• A basketball player wants to estimate his chance of winning
the most valuable player (MVP) award based on his season
averages and the averages of his opponents.
ESTIMATION
Estimate the population mean weight using the sample mean
weight
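A minimal sketch of estimation, assuming a hypothetical sample of 10 weights in pounds. The interval uses the normal critical value (about 1.96), which is a large-sample approximation; with a sample this small, a t critical value would give a somewhat wider interval.

```python
import math
import statistics

# Hypothetical sample of weights, in pounds.
sample_weights = [118, 125, 130, 112, 121, 127, 119, 124, 116, 128]
n = len(sample_weights)

# Point estimate of the population mean weight.
point_estimate = statistics.mean(sample_weights)

# Standard error of the sample mean (sample standard deviation / sqrt(n)).
standard_error = statistics.stdev(sample_weights) / math.sqrt(n)

# Approximate 95% interval estimate using the normal critical value.
z = statistics.NormalDist().inv_cdf(0.975)
interval = (point_estimate - z * standard_error,
            point_estimate + z * standard_error)

print("Point estimate:", point_estimate)
print("Approximate 95% interval:", interval)
```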
Hypothesis Testing
Test the claim that the population mean weight is 120 pounds.
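A sketch of how this claim could be tested with a one-sample t test. The sample weights are hypothetical, and the critical value 2.262 is the standard t-table value for a two-sided test at the 5% level with 9 degrees of freedom; it is hard-coded here as an assumption rather than computed.

```python
import math
import statistics

# Hypothetical sample of weights, in pounds.
sample_weights = [118, 125, 130, 112, 121, 127, 119, 124, 116, 128]
claimed_mean = 120  # the claim being tested

n = len(sample_weights)
sample_mean = statistics.mean(sample_weights)
standard_error = statistics.stdev(sample_weights) / math.sqrt(n)

# Test statistic: how many standard errors the sample mean is from the claim.
t_statistic = (sample_mean - claimed_mean) / standard_error

# Two-sided critical value from a t table: alpha = 0.05, df = n - 1 = 9.
critical_value = 2.262
reject_null = abs(t_statistic) > critical_value

print("t statistic:", round(t_statistic, 3))
print("Reject the claim?", reject_null)
```

With these made-up numbers the sample mean is close enough to 120 that the claim is not rejected.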
SAMPLING TECHNIQUES
Sampling
is the process of identifying the sample from the population to
ensure that what is true for the sample is also true for the
population or simply “the process of measuring a small portion
of something and making a general statement about the whole
thing.”
TYPES OF SAMPLING
Probability Sampling
Non-Probability Sampling
Probability Sampling
each element in the population has a known, nonzero chance of being
selected; in simple random sampling the chances are equal and
independent. The goal is to obtain a sample representative of the
target population.
Probability Sampling
1. Simple random sampling
2. Random Numbers
3. Stratified random sampling
4. Cluster sampling
5. Systematic Sampling.
Simple random sampling
All samples of the same size are equally likely.
Assign a number to each member of the population.
Random numbers
can be generated by a random number table, software program or
a calculator.
Data from members of the population that correspond to these
numbers become members of the sample
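Simple random sampling with software-generated random numbers can be sketched as follows. The population size of 50, the sample size of 5, and the seed are arbitrary assumptions for illustration.

```python
import random

# Assign a number to each member of a hypothetical population of 50.
population_size = 50
member_numbers = list(range(1, population_size + 1))

# Draw 5 distinct numbers; every set of 5 is equally likely to be chosen.
rng = random.Random(42)  # fixed seed so the run is repeatable
sample_numbers = rng.sample(member_numbers, 5)

print("Members selected for the sample:", sorted(sample_numbers))
```

The members whose numbers are drawn become the sample; `random.sample` draws without replacement, so no member is selected twice.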
Stratified Random Sampling
Divide the population into groups (strata) and select a random
sample from each group. Strata could be age groups, gender or
levels of education, for example.
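A minimal sketch of stratified random sampling, using level of education as the strata. The 12-member population and the choice of 2 members per stratum are hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical population: (member id, level of education).
population = [
    (1, "Elementary"), (2, "Elementary"), (3, "Elementary"), (4, "Elementary"),
    (5, "High School"), (6, "High School"), (7, "High School"), (8, "High School"),
    (9, "College"), (10, "College"), (11, "College"), (12, "College"),
]

# Step 1: divide the population into strata.
strata = defaultdict(list)
for member_id, level in population:
    strata[level].append(member_id)

# Step 2: select a random sample from each stratum.
rng = random.Random(7)  # fixed seed so the run is repeatable
per_stratum = 2
stratified_sample = {
    level: rng.sample(ids, per_stratum) for level, ids in strata.items()
}

for level, ids in stratified_sample.items():
    print(level, "->", ids)
```

Every stratum is guaranteed to appear in the sample, which simple random sampling alone does not ensure.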
Cluster Sampling
Divide the population into individual units or groups and
randomly select one or more units. The sample consists of all
members from selected unit(s).
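A minimal sketch of cluster sampling, assuming class sections as the clusters. The section names, student names, and the choice of 2 clusters are made up for illustration.

```python
import random

# Hypothetical population already divided into clusters (class sections).
clusters = {
    "Section A": ["Ana", "Ben", "Carla"],
    "Section B": ["Dino", "Ella", "Fe"],
    "Section C": ["Gio", "Hana", "Ivan"],
    "Section D": ["Jun", "Kara", "Leo"],
}

# Randomly select whole clusters rather than individual members.
rng = random.Random(3)  # fixed seed so the run is repeatable
chosen_clusters = rng.sample(sorted(clusters), 2)

# The sample consists of all members of the selected clusters.
cluster_sample = [member for name in chosen_clusters for member in clusters[name]]

print("Selected clusters:", chosen_clusters)
print("Sample:", cluster_sample)
```

Unlike stratified sampling, which takes some members from every group, cluster sampling takes every member from some groups.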
Systematic Sampling
Choose a starting value at random. Then choose sample members
at regular intervals.
We say we choose every kth member. For example, with k = 5,
every 5th member of the population is selected.
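A minimal sketch of systematic sampling with k = 5. The population of 30 numbered members and the seed are assumptions for illustration.

```python
import random

# Hypothetical population of 30 members, numbered 1 to 30.
population = list(range(1, 31))
k = 5  # sampling interval: take every 5th member

# Choose a starting position at random among the first k members.
rng = random.Random(11)  # fixed seed so the run is repeatable
start = rng.randrange(k)  # an index from 0 to k - 1

# Then choose sample members at regular intervals of k.
systematic_sample = population[start::k]

print("Starting member:", population[start])
print("Sample:", systematic_sample)
```

Only the starting point is random; once it is chosen, the rest of the sample is fully determined.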
Non-Probability Sampling
1. Consecutive sampling
2. Convenience sampling
3. Purposive sampling: commonly used in qualitative research.
Consecutive Sampling
commonly used in intervention studies
Convenience Sampling
Choose readily available members of the population for your
sample.
Sample Size
A sample size calculation gives a rough estimate of how many subjects
are required to answer the research question. During the design of the
study, the calculation will indicate whether the study is feasible.
During the review phase, it will reassure the reviewers that not only
is the study feasible, but that resources are not being wasted by
recruiting more subjects than is necessary.
Two Basic Methods of Sample Size Estimation
Hypothesis-based
Confidence interval-based
Brief Overview of Sample Size Calculations
Hypothesis-based sample sizes
indicate the number of subjects necessary to reasonably test the primary
study hypothesis. Hypotheses can be shown to be wrong, but they can
never be proven correct. This is because the investigator cannot test all
people in the world with the condition of interest. The investigator
attempts to test the research hypothesis through a sample of the larger
target population
From the data collected, inferences are made about the larger
population. For example, if 80% of patients self-administering
analgesia report good pain control, whereas only 40% of patients
receiving nurse-administered analgesia report good pain control, one
would conclude that there is a difference between the two methods and
that self-administered analgesia is superior. However, there is always a
possibility that since we have only used a sample of all possible
patients, there may, in fact, be no difference between the two but the
results have just occurred due to chance. To test this formally, a
statistical test would be done.
In this case the P value is 0.03. This means that, if in truth there
is no difference between the two methods, the probability of obtaining
these results or results even more extreme is 3%. Therefore, either
self-administered analgesia is better than nurse-administered analgesia
or a very unusual event has occurred. When there is truly no difference
between two interventions, but the results of our study suggest there is
a difference, a type 1 error has occurred. Generally, studies will accept
a 5% risk (α level) of making a type 1 error. Because the P value of 0.03
is below the α level of 0.05, the difference is declared statistically
significant. Note that the P value is not the probability that a type 1
error has been made; it is computed on the assumption that there is no
true difference.
Type 2 Error
A type 2 error occurs when we conclude there is no evidence of a
difference between two groups, when in truth there is. Most
investigators accept a greater risk of making a type 2 error,
usually 10% or 20% (β level).
Components of the Hypothesis-based Sample Size Calculation
Type 1 error (α): falsely rejecting a true null hypothesis
∗ Usual risk: 0.05
Type 2 error (β): failing to reject a false null hypothesis
∗ Usual risk: 0.1 to 0.2
∗ Study’s power = 1 − β
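These components can be combined in a hypothesis-based sample size calculation. The sketch below uses one common textbook normal-approximation formula for comparing two proportions, n per group = (z for α/2 + z for β)² × [p1(1 − p1) + p2(1 − p2)] / (p1 − p2)². This formula and the function name are assumptions for illustration, not necessarily the method used by the source. The proportions 0.80 and 0.40 are taken from the analgesia example.

```python
import math
from statistics import NormalDist

def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Approximate subjects needed per group to detect p1 versus p2.

    Uses a two-sided normal approximation; alpha is the type 1 error
    risk and power is 1 - beta, where beta is the type 2 error risk.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance_sum / (p1 - p2) ** 2
    return math.ceil(n)  # round up to whole subjects

# 80% versus 40% good pain control, alpha = 0.05.
n_power_80 = sample_size_two_proportions(0.80, 0.40, power=0.80)
n_power_90 = sample_size_two_proportions(0.80, 0.40, power=0.90)

print("Per group at 80% power:", n_power_80)
print("Per group at 90% power:", n_power_90)
```

Under these assumptions the calculation gives about 20 subjects per group at 80% power and 27 at 90% power: accepting a smaller type 2 error risk requires a larger sample.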
Inferential Statistics
allows you to make predictions (“inferences”) about a population from
sample data.
Collection of Data
It refers to the process of
obtaining numerical
measurements.
TWO SOURCES OF DATA
1. Documentary Sources
2. Field Sources
Documentary Sources
the information contained in published or unpublished reports,
statistics, Internet, letters, magazines, newspapers, journals, and
others.
Classification of Documentary Sources
Primary Data
data gathered are original
Secondary Data
data that were previously gathered from an original source, which
are computed and compiled.
Examples of Secondary Data
The United Nations’ compiled data for its yearbook, which
were originally gathered by government statistical agencies of
different countries.
A medical researcher’s documented data for his research paper,
which were originally collected by the Department of Health.
Advantages of Primary Data over Secondary Data
Primary data frequently give detailed definition of terms
and accurate statistical units used in the survey.
Primary data lend more relevance to the researcher’s study.
Primary data are more reliable because of their first-hand
nature.
Field Sources
These sources would include individuals who have sufficient
knowledge and experience regarding the study under
investigation.
Methods Used in the Collection of Data
1. Direct Method
2. Indirect Method
3. Registration Method
4. Observation Method
5. Experiment Method
The Direct Method
Often referred to as the interview method
This is a face-to-face encounter between the
interviewer and the interviewee.
Is suitable for obtaining data for many types of
research problems including those that concern
sentiments, emotions, and opinions of the people
regarding certain issues or programs.
Usually provides the most accurate and complete
responses.
The Indirect Method
Is popularly known as the questionnaire method
This method is done by giving prepared relevant
questionnaires to the respondents of the study from
which one would like to get the needed information.
Questionnaire
is a list of questions which are intended to elicit
answers to the problems under investigation
is a measurement instrument used in various data
collection methods, particularly surveys
Registration Method
It is a method of utilizing existing data, facts, or information
kept in systematized form by the office concerned, such as the
registration of births, deaths, marriages, motor vehicles, and
licenses, which is required by law.
Observation Method
this method is used to collect data pertaining to
attitudes, behavior, values, and cultural patterns of the
samples under investigation. Subjects may be taken
individually or collectively. It is usually used when the
subjects cannot talk.
Five Major Approaches of Observation Method
1. Duration Recording
2. Frequency Count Recording
3. Latency Recording
4. Interval Recording
5. Time Sampling
Duration Recording
the observer records how long the behavior lasts.
Frequency Count Recording
the observer counts the number of times a particular behavior
happens in a given period of time
Latency Recording
the observer measures the length of time between the
stimulus and the first occurrence of the behavior of interest.
Interval Recording
the researcher partitions time into fixed time intervals and
counts the number of time intervals where the behavior
occurred.
Time Sampling
the observer records the phenomenon of interest or
presence/absence of behavior under the study at every
specified time schedule.
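The five recording approaches can be compared by applying each to the same observation session. The session below is hypothetical: the 60-second length, the episode times, the 10-second intervals, and the 15-second checking schedule are all assumptions chosen for illustration.

```python
# Hypothetical 60-second session. The stimulus is presented at t = 0 and each
# tuple is one episode of the behavior: (start, end) in seconds.
session_length = 60
stimulus_time = 0
episodes = [(8, 14), (25, 27), (41, 50)]

# Duration recording: how long the behavior lasts (total over all episodes).
total_duration = sum(end - start for start, end in episodes)

# Frequency count recording: number of times the behavior happens.
frequency = len(episodes)

# Latency recording: time between the stimulus and the first occurrence.
latency = episodes[0][0] - stimulus_time

# Interval recording: partition time into fixed intervals and count the
# intervals in which the behavior occurred at any point.
interval_length = 10
intervals_with_behavior = 0
for lo in range(0, session_length, interval_length):
    hi = lo + interval_length
    if any(start < hi and end > lo for start, end in episodes):
        intervals_with_behavior += 1

# Time sampling: record presence/absence only at scheduled moments.
def occurring_at(t):
    """Return True if the behavior is in progress at time t."""
    return any(start <= t < end for start, end in episodes)

check_times = list(range(0, session_length, 15))
time_samples = [occurring_at(t) for t in check_times]

print("Duration (s):", total_duration)
print("Frequency:", frequency)
print("Latency (s):", latency)
print("Intervals with behavior:", intervals_with_behavior)
print("Time samples:", dict(zip(check_times, time_samples)))
```

The same session yields different numbers under each approach, so the choice of approach depends on which aspect of the behavior the study needs.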
Two Types of Observation
Structured Observation
The researcher designs a rigorous plan and formal
instruments for recording behavior before the actual data
collection in the field or laboratory.
Unstructured Observation
The researcher has complete flexibility in performing the
study and can modify the original plans at any stage of the
study.
Experiment Method
This method is used if the researcher would like to
determine the cause and effect relationship of certain
phenomena under investigation.
This is used in making scientific inquiry.
It is a method of collecting data where there is direct
human intervention on the conditions that may affect
the values of the variable of interest.