Smm105 - Reviewer (Prelim)
Week 1: The Nature of Statistics

Statistics – A science that studies data in order to make a decision. It involves the methods of collecting, processing, summarizing, and analyzing data in order to provide answers or solutions to an inquiry.
➢ Statistics as a Tool in Decision-Making - It enables us to:
▪ Characterize persons, objects, situations, and phenomena;
▪ Explain relationships among variables;
▪ Formulate objective assessments and comparisons; and, more importantly,
▪ Make evidence-based decisions and predictions.

Fields of Statistics:
➢ Descriptive Statistics - Concerned with the collection and description of a set of data to yield meaningful information. Descriptive statistics provides information only about the collected data and does not draw inferences or conclusions about a larger set of data. In general, this type of statistics is devoted to the summarization and description of a data set.
➢ Inferential Statistics - Composed of those methods concerned with the analysis of a smaller group of data, known as the sample, leading to predictions or inferences about the larger set of data, or the population, from which the sample is drawn. This type of statistics is used when one makes a decision, estimate, prediction, or generalization about a population based on a sample.
➢ Basic Terms:
▪ Universe - The collection or set of units or entities from which the data are obtained. (Example: All students in the world.)
▪ Population - The set of all possible values of a variable. (Example: All students in a university.)
▪ Sample - A subgroup of a universe or of a population. (Example: 200 students from a university.)
▪ Variable – A characteristic that is observable or measurable in every unit of the universe. (Example: Height, weight, or favorite subject of students.)

Broad Classification of Variables:
➢ Qualitative Variables – Express a categorical attribute, such as sex (male or female), religion, marital status, region of residence, or highest educational attainment. Qualitative variables do not strictly take on numeric values (although they can have numeric codes, e.g., for the sex variable, 1 and 2 may refer to male and female, respectively). Qualitative data answer questions of "what kind." Sometimes there is a sense of ordering in qualitative data, e.g., income data grouped into high, middle, and low-income status.
➢ Quantitative Variables – Variables whose sizes are meaningful; they answer questions such as "how much" or "how many". Quantitative variables have actual units of measure. Examples include the height, weight, number of registered cars, household size, and total household expenditures/income of survey respondents.
➢ Two types of Quantitative Variables:
▪ Discrete Variables – Those that can be counted, e.g., the number of days for cellphones to fail and the ages of survey respondents measured to the nearest year. These variables take a countable number of values, their values or levels cannot take the form of decimals, and their data are obtained through the process of enumeration or counting.
▪ Continuous Variables – Those that can be measured, e.g., the exact height of the survey respondent and the exact volume of some liquid substance. These are the variables whose levels can assume a continuous set of numerical values.

Levels of Measurement of Numerical Data:
➢ Nominal Data – Commonly categorical data assigned to numbers. In this data type, counting the number of times a data value falls in a category is the only applicable measurement. An example is assigning 1 for males and 2 for females. Data that can be categorized as nominal include course, civil status, color, preference, etc.
➢ Ordinal Data - Quantities where the numbers are used to designate the rank order of data. In this type, the relative ranking of one value against another can be measured. However, the range for each rank is not constant. Examples of quantities that use this level of measurement are the result of a race, ranking in a beauty pageant, educational attainment, etc.
➢ Interval Data - A data type where the range between the numeric values is constant. In this data type, addition and subtraction of values can be performed; however, multiplication and division are not appropriate. For instance, if we would like to determine the information describing the level of the academic performance of students in
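The levels of measurement described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the response lists are made up, the ordering of educational attainment is assumed, and the temperature readings are an assumed interval-scale example that does not come from the reviewer.

```python
from collections import Counter

# Nominal: codes are labels only, so counting per category is the
# applicable summary (1 = male, 2 = female, as in the text).
sex_codes = [1, 2, 2, 1, 2]              # hypothetical responses
nominal_counts = Counter(sex_codes)

# Ordinal: values can be ranked, but the gaps between ranks are not constant.
attainment_order = ["elementary", "high school", "college"]  # assumed ordering
responses = ["college", "elementary", "high school"]          # hypothetical
ranked = sorted(responses, key=attainment_order.index)

# Interval: differences are meaningful, ratios are not.
# Temperatures in degrees Celsius are an assumed example.
temp_monday, temp_tuesday = 20, 30
difference = temp_tuesday - temp_monday   # "10 degrees warmer" is meaningful

print(nominal_counts[1], nominal_counts[2])
print(ranked)
print(difference)
```

Saying that Tuesday was "1.5 times as hot" would not be appropriate here, which is the sense in which multiplication and division do not apply to interval data.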
K♥K
Elementary Statistics & Probability 2SEDM-A
▪ Disadvantage: Higher risk of sampling error if clusters are not representative.
➢ Purposive Sampling - Selection based on the researcher's judgment to include specific, relevant cases.
▪ Advantage: Focused on specific characteristics or expertise.
▪ Disadvantage: Prone to researcher bias and limits generalizability.
➢ Quota Sampling - Pre-determined quotas are set for subgroups, and participants are selected non-randomly until quotas are filled.
▪ Advantage: Ensures representation of key groups.
▪ Disadvantage: Subject to selection bias.
➢ Convenience Sampling - Participants are chosen based on ease of access or availability.
▪ Advantage: Quick and inexpensive.
▪ Disadvantage: Often unrepresentative and introduces bias.
➢ Other Sampling:
▪ Snowball Sampling: Participants recruit others; often used for hard-to-reach populations.
▪ Multistage Sampling: Combines multiple methods (e.g., cluster and random sampling).
▪ Judgmental Sampling: Similar to purposive sampling but more subjective.

Week 3: Presentation of Data

Data Processing - A fundamental step in analyzing the data gathered in a study. In research, data is manipulated to produce results that lead to answers to specific problems for the improvement of an existing situation. The data may be presented through a diagram, table, figure, or graph such as a line graph, bar graph, pie graph, etc.
▪ It is important to process the collected data carefully; the essence of data processing in research is data reduction. Data reduction involves winnowing out the irrelevant from the relevant data, establishing order from chaos, and giving shape to a mass of data.

... understandable homogeneous groups for the purpose of convenient interpretation. A uniformity of attributes is the basic criterion for classification, and the grouping of data is made according to similarity.
➢ Tabulation of Data - Tabulation is the process of summarizing raw data and displaying it in compact form for further analysis. Therefore, preparing tables is a very important step.
▪ Title of Data - The table should first be given a brief, simple, and clear title which may express the basis of classification based on the purpose of the study.
▪ Columns and Rows - Each table should be prepared with just an adequate number of columns and rows.
▪ Captions - The columns and rows should be given simple and clear captions so the ordinary reader can understand the data.
▪ Unit of Measurement - The unit should be noted below the lines.
▪ Footnotes - These may be given below the table.
▪ Total - Totals of each column and the grand total should be in one line.

➢ Data Interpretation - After gathering the data, they must be tabulated and computed so the researcher can analyze and interpret the result. Research interpretation is defined as an adequate exposition of the true meaning of the material presented in terms of the purpose of the study. Results and discussions should be systematic, logical, and comprehensive. The discussion should relate the findings to those identified in the literature review and place them within the context of the theoretical framework underpinning the study.
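As a rough illustration of the tabulation guidelines in this section (a clear title, simple captions, and totals on one line), the sketch below turns raw responses into a small frequency table. The civil-status data, the table title, and the helper name `build_frequency_table` are all hypothetical and are not taken from the reviewer.

```python
from collections import Counter

# Hypothetical raw data: civil status of ten survey respondents.
raw_data = ["Single", "Married", "Single", "Widowed", "Married",
            "Single", "Single", "Married", "Single", "Married"]

def build_frequency_table(values, title, captions):
    """Summarize raw values as text: title, captions, one row per category, total."""
    counts = Counter(values)
    lines = [title, f"{captions[0]:<14}{captions[1]:>10}"]
    for category, frequency in sorted(counts.items()):
        lines.append(f"{category:<14}{frequency:>10}")
    # The grand total goes on a single line at the bottom of the table.
    lines.append(f"{'Total':<14}{sum(counts.values()):>10}")
    return "\n".join(lines)

table = build_frequency_table(
    raw_data,
    "Table 1. Civil Status of Respondents",
    ("Civil Status", "Frequency"),
)
print(table)
```

The raw list of ten responses is reduced to three rows plus a total, which is the compact form that tabulation aims for.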
... research report, it is important to use certain techniques to communicate findings and interpretations of research studies in visual forms. The common techniques used to display data results are tabular, textual, and graphical methods.

▪ Textual Presentation of Data - Textual presentations use words, statements, or paragraphs with numerals, numbers, or measurements to describe data. They can be used independently to describe the data when there are very few quantities or numbers. They can also be used to compare data using paragraphs for discussion.
▪ Graphical Methods of Presenting Data - A graph or chart is a visual presentation of data using symbols such as lines, dots, bars, or slices. It depicts a certain set of measurements or shows a comparison between two or more sets of data or quantities. Charts and graphs are very useful in simplifying the presentation of research reports. They help researchers and readers understand data quickly and in an interesting way. In a good graph or chart, the X-axis and the Y-axis each have a heading, and units are included. The figure number and title are usually placed below the figure. The known value is plotted on the X-axis and the measured value is plotted on the Y-axis.
▪ Line Graph - A line graph is a graphical presentation of data that shows a continuous change or trend. It may show an ascending or descending trend.
Distribution Shapes:
▪ Bell-Shaped - single peak and tapers off at either end
▪ Uniform - basically flat or rectangular
▪ J-Shaped - few data values on the left side, with frequencies
increasing as one moves to the right
▪ Reverse J-Shaped - opposite of J-Shaped
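To visualize the four shapes listed above, here is a minimal text-histogram sketch. The class frequencies are invented purely to illustrate each shape, and `text_histogram` is a hypothetical helper name.

```python
# Hypothetical class frequencies chosen only to illustrate each shape.
shapes = {
    "Bell-shaped": [1, 3, 6, 3, 1],
    "Uniform": [4, 4, 4, 4, 4],
    "J-shaped": [1, 1, 2, 4, 8],
    "Reverse J-shaped": [8, 4, 2, 1, 1],
}

def text_histogram(frequencies):
    """Return one row of '*' per class; longer rows mean higher frequency."""
    return "\n".join("*" * frequency for frequency in frequencies)

for name, frequencies in shapes.items():
    print(name)
    print(text_histogram(frequencies))
    print()
```

Each row is one class, read from the lowest class at the top to the highest at the bottom, so the outline of the asterisks traces the shape of the distribution turned on its side.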
Week 4-5: Data Description

Measures of Central Tendency – Statistical values that represent the center or average of a data set. They summarize the data by indicating where the data tends to cluster. There are three main measures of central tendency: mean, median, and mode.

➢ Mean – It is the most commonly used measure of central tendency. It is also called the arithmetic average of all the data points. The mean can be easily affected by really big or small numbers that are far away from the others.

• Mean for Ungrouped Data – The mean of ungrouped data is calculated by adding up all the values in a data set and then dividing that sum by the total number of values in the data set.

$\bar{X} = \frac{\sum x}{n}$

$\sum x$ = sum of all scores
$n$ = number of scores from the sample data

For a population, the Greek letter $\mu$ (mu) is used for the mean.

$\mu = \frac{\sum x}{N}$

$\sum x$ = sum of all scores
$N$ = number of scores from the population data

Example 1: Consider a data set of exam scores: 85, 90, 78, 92, and 88. Calculate the mean.

...

Solution: $\bar{X} = \frac{\sum x}{n} = \frac{29+31+38+76+105+110}{6} = \frac{389}{6} \approx 64.8$
So, the mean number of hospital infections for the six hospitals is 64.8.

➢ Median – It's the middle value when all the data is arranged in order. The median is useful when there are extreme values (like really high or low numbers) because it's not affected by them as much as the mean. It gives you an idea of where the "center" of the data is without being swayed by outliers.

• Median of Ungrouped Data – The median of ungrouped data is the middle value in a data set when the data points are listed in order. To find the median:
1. Arrange the data points in ascending (or descending) order.
2. If the number of data points is odd, the median is the middle value.
3. If the number of data points is even, the median is the average of the two middle values.

Example 1: The scores of 7 students in a Statistics test are 45, 22, 35, 30, 38, 48, and 29. Find the median of this set of data values.
Solution: Arrange the data values in order from lowest value to highest value: 22, 29, 30, 35, 38, 45, 48

Example 2: Ten Statistics books were randomly selected, and the numbers of pages were recorded as follows: 530, 500, 465, 601, 610, 480, 510, 580, 600, 475
Solution: Arrange the data values in order from lowest value to highest value: 465, 475, 480, 500, 510, 530, 580, 600, 601, 610.
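The mean and median procedures for ungrouped data can be checked with a short Python sketch. The helper names `mean_ungrouped` and `median_ungrouped` are hypothetical. The data come from the examples in this section, with the test scores taken as listed in the ordered solution. For an even number of values, the sketch uses the standard rule of averaging the two middle values.

```python
def mean_ungrouped(values):
    """Sum of all scores divided by the number of scores."""
    return sum(values) / len(values)

def median_ungrouped(values):
    """Middle value of the ordered data (average of the two middle values if n is even)."""
    ordered = sorted(values)              # arrange in ascending order
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:                        # odd count: take the single middle value
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2  # even count

exam_scores = [85, 90, 78, 92, 88]                 # mean, Example 1
hospital_infections = [29, 31, 38, 76, 105, 110]   # data in the worked mean solution
test_scores = [22, 29, 30, 35, 38, 45, 48]         # median, Example 1 (ordered list)
book_pages = [530, 500, 465, 601, 610, 480, 510, 580, 600, 475]  # median, Example 2

print(mean_ungrouped(exam_scores))                    # 86.6
print(round(mean_ungrouped(hospital_infections), 1))  # 64.8
print(median_ungrouped(test_scores))                  # 35
print(median_ungrouped(book_pages))                   # 520.0
```

Python's built-in `statistics.mean` and `statistics.median` give the same results; the functions are written out here only to mirror the steps in the text.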