
Smm105 - Reviewer (Prelim)

The document provides an overview of elementary statistics and probability, detailing the nature of statistics, types of data, and methods for data collection and analysis. It distinguishes between descriptive and inferential statistics, outlines various sampling techniques, and describes methods for presenting data visually. Additionally, it covers the classification of variables and levels of measurement, emphasizing the importance of data processing and interpretation in research.

Elementary Statistics & Probability 2SEDM-A

Week 1: The Nature of Statistics

Statistics – A science that studies data in order to make decisions. It involves the methods of collecting, processing, summarizing, and analyzing data in order to provide answers or solutions to an inquiry.
➢ Statistics as a Tool in Decision-Making – It enables us to:
▪ Characterize persons, objects, situations, and phenomena;
▪ Explain relationships among variables;
▪ Formulate objective assessments and comparisons; and, more importantly,
▪ Make evidence-based decisions and predictions.

Fields of Statistics:
➢ Descriptive Statistics - Concerned with the collection and description of a set of data to yield meaningful information. Descriptive statistics provides information only about the collected data and does not draw inferences or conclusions about a larger set of data. In general, this type of statistics is devoted to the summarization and description of a data set.
➢ Inferential Statistics - Composed of those methods concerned with the analysis of a smaller group of data, known as the sample, leading to predictions or inferences about the larger set of data, or the population, from which the sample is drawn. This type of statistics is used when one makes a decision, estimate, prediction, or generalization about a population based on a sample.

➢ Basic Terms:
▪ Universe - The collection or set of units or entities from which the data are obtained. (Example: All students in the world.)
▪ Population - The set of all possible values of a variable. (Example: All students in a university.)
▪ Sample - A subgroup of a universe or of a population. (Example: 200 students from a university.)
▪ Variable – A characteristic that is observable or measurable in every unit of the universe. (Example: Height, weight, or favorite subject of students.)

Broad Classification of Variables:
➢ Qualitative Variables – Express a categorical attribute, such as sex (male or female), religion, marital status, region of residence, or highest educational attainment. Qualitative variables do not strictly take on numeric values (although numeric codes can be assigned to them, e.g., for the sex variable, 1 and 2 may refer to male and female, respectively). Qualitative data answer the question “what kind.” Sometimes there is a sense of ordering in qualitative data, e.g., income data grouped into high, middle, and low-income status.
➢ Quantitative Variables – Variables whose sizes are meaningful; they answer questions such as “how much” or “how many.” Quantitative variables have actual units of measure. Examples of quantitative variables include the height, weight, number of registered cars, household size, and total household expenditures/income of survey respondents.
➢ Two types of Quantitative Variables:
▪ Discrete Variables – Those with a countable number of values, e.g., the number of days for cellphones to fail and the ages of survey respondents measured to the nearest year. These are variables whose values or levels cannot take the form of decimals, and whose data are obtained through the process of enumeration or counting.
▪ Continuous Variables – Those that can be measured, e.g., the exact height of a survey respondent and the exact volume of some liquid substance. These are variables whose levels can assume a continuous set of numerical values.

Levels of Measurement of Numerical Data:
➢ Nominal Data – Categorical data to which numbers are assigned as labels. For this data type, counting the number of times the data fall into each category is the only applicable measurement. An example is assigning 1 for males and 2 for females. Data that can be categorized as nominal include course, civil status, color, preference, etc.
➢ Ordinal Data - Quantities where the numbers are used to designate the rank order of the data. In this type, the correlation or the effect of the ranking of one variable on another can be measured. However, the range between ranks is not constant. Examples of quantities that use this level of measurement are the result of a race, ranking in a beauty pageant, educational attainment, etc.
➢ Interval Data - A data type where the range between the numeric values is constant. In this data type, addition and subtraction of values can be performed; however, multiplication and division are not appropriate. For example, to describe the level of academic performance of students in mathematics, mean ratings can be placed on a point scale whose lowest point is 1 and highest point is 5, with intervals as follows: 1.00 – 1.49 Poor; 1.50 – 2.49 Needs Improvement; 2.50 – 3.49 Satisfactory; 3.50 – 4.49 Very Satisfactory; and 4.50 – 5.00 Outstanding.
➢ Ratio Data – Widely used in science and engineering. Like interval measurements, ratio data are expressed in numbers, and the difference between any two successive numbers is consistent. They have, however, the additional characteristic of starting from a true zero. Examples are length, mass, angle, charge, energy, relative frequency, velocity, etc.

Week 2: Collection and Sampling Techniques

Methodology:
➢ Qualitative Methods - Qualitative research analyzes non-numeric data to understand motivations, perceptions, and behaviors. For example, in a research proposal for a new product, it can explore consumer preferences, attitudes, and experiences.
➢ Quantitative Methods – Quantitative research uses numerical data to measure relationships and patterns. In the same research proposal, it can be used to quantify aspects of the new product.

Data Collection Methods
➢ Interview Method - A face-to-face or virtual interaction to gather qualitative or quantitative data directly from respondents.
▪ Types: Structured, semi-structured, and unstructured interviews.
▪ Advantage: Provides in-depth insights and clarification of responses.
▪ Disadvantage: Time-consuming and prone to interviewer bias.
➢ Questionnaire Method - A structured set of questions distributed to respondents, typically in written form.
▪ Types: Open-ended, close-ended, or mixed-format questionnaires.
▪ Advantage: Cost-effective and allows data collection from a large sample.
▪ Disadvantage: Low response rates and possible misinterpretation of questions.
➢ Observation Methods - Recording behaviors or phenomena as they occur naturally.
▪ Types: Participant observation, non-participant observation, covert observation, and overt observation.
▪ Advantage: Directly records real-time behavior, reducing respondent bias.
▪ Disadvantage: Time-intensive and may require advanced training.
➢ Test Method - Use of standardized tests or tools to measure specific variables or attributes (e.g., IQ tests, aptitude tests).
▪ Advantage: Reliable and replicable when standardized.
▪ Disadvantage: May not capture context-specific factors.
➢ Registration Method - Collecting data from official records or administrative processes (e.g., birth and death registries, school enrollments).
▪ Advantage: Accurate and legally verified data.
▪ Disadvantage: Limited to the scope of registered data.
➢ Other Methods - May include focus groups, case studies, ethnographies, or big data analysis, depending on the context.
▪ Advantage: Customizable to unique research needs.
▪ Disadvantage: Can be resource-intensive.

Sampling Techniques:
Sampling Error - The difference between the results from a sample and the actual values of the entire population. This occurs because a subset is selected rather than the whole population.
▪ Mitigation: Increasing the sample size and using appropriate sampling methods.
➢ Random Sampling - Each member of the population has an equal chance of being selected.
▪ Advantage: Reduces bias and helps ensure representativeness.
▪ Disadvantage: May require a complete population list, which is not always feasible.
➢ Systematic Random Sampling - Selects samples at regular intervals (e.g., every 5th individual) after choosing a random starting point.
▪ Advantage: Easier and quicker than simple random sampling.
▪ Disadvantage: Can introduce bias if there is a hidden pattern in the population.
➢ Stratified Random Sampling - Divides the population into distinct subgroups (strata) and samples randomly from each stratum.
▪ Advantage: Ensures representation from all subgroups.
▪ Disadvantage: Requires knowledge of population characteristics to create strata.
➢ Cluster Sampling - The population is divided into clusters (e.g., geographical areas), and entire clusters are randomly selected.
▪ Advantage: Cost-effective and practical for large, dispersed populations.

▪ Disadvantage: Higher risk of sampling error if clusters are not representative.
➢ Purposive Sampling - Selection based on the researcher's judgment to include specific, relevant cases.
▪ Advantage: Focused on specific characteristics or expertise.
▪ Disadvantage: Prone to researcher bias and limits generalizability.
➢ Quota Sampling - Pre-determined quotas are set for subgroups, and participants are selected non-randomly until the quotas are filled.
▪ Advantage: Ensures representation of key groups.
▪ Disadvantage: Subject to selection bias.
➢ Convenience Sampling - Participants are chosen based on ease of access or availability.
▪ Advantage: Quick and inexpensive.
▪ Disadvantage: Often unrepresentative and introduces bias.
➢ Other Sampling:
▪ Snowball Sampling: Participants recruit others; often used for hard-to-reach populations.
▪ Multistage Sampling: Combines multiple methods (e.g., cluster and random sampling).
▪ Judgmental Sampling: Similar to purposive sampling but more subjective.

Week 3: Presentation of Data

Data Processing - A fundamental step in analyzing the data gathered in a study. In research, data are manipulated to produce results that lead to answers to specific problems for the improvement of an existing situation. The data may be presented through a diagram, table, figure, or graph such as a line graph, bar graph, pie graph, etc.
▪ It is important to process the collected data carefully. The essence of data processing in research is data reduction. Data reduction involves winnowing out the irrelevant from the relevant data, establishing order from chaos, and giving shape to a mass of data.

Methods of Data Processing:
➢ Editing of Data - Editing is the process of examining the data collected to detect errors and omissions and to see that they are corrected and ready for tabulation. The researcher must see to it that the data are accurate, relevant, consistent, complete, and acceptable.
➢ Classification of Data - Data must be classified or categorized into understandable, homogeneous groups for the purpose of convenient interpretation. Uniformity of attributes is the basic criterion for classification, and the grouping of data is made according to similarity.
➢ Tabulation of Data - Tabulation is the process of summarizing raw data and displaying it in compact form for further analysis. Therefore, preparing tables is a very important step.
▪ Title of Data - The table should first be given a brief, simple, and clear title which expresses the basis of classification based on the purpose of the study.
▪ Columns and Rows - Each table should be prepared with just an adequate number of columns and rows.
▪ Captions - The columns and rows should be given simple and clear captions so the ordinary reader can understand the data.
▪ Unit of Measurement - The unit should be noted below the lines.
▪ Footnotes - These may be given below the table.
▪ Total - Totals of each column and the grand total should be in one line.
➢ Data Interpretation - After the data are gathered, they must be tabulated and computed so the researcher can analyze and interpret the result. Research interpretation is defined as an adequate exposition of the true meaning of the material presented in terms of the purpose of the study. The discussion of results should be systematic, logical, and comprehensive. It should relate the findings to those identified in the literature review and place them within the context of the theoretical framework underpinning the study.
➢ Presentation of Data - To be able to create and present an organized picture of information from a research report, it is important to use certain techniques to communicate the findings and interpretations of research studies in visual forms. The common techniques used to display data results are tabular, textual, and graphical methods.
▪ Textual Presentation of Data - Textual presentations use words, statements, or paragraphs with numerals, numbers, or measurements to describe data. They can be used independently to describe the data when there are very few quantities or numbers. They can also be used to compare data using paragraphs for discussion.
▪ Tabular Presentation of Data - Tables present clear and organized data. A table must be clear and simple but complete. A good table should include the following parts:
- Table Number and Title: These are placed above the table. The title is usually written right after the table number.
- Caption Subhead: This refers to columns and rows.
- Body: It contains all the data under each subhead.
- Source: It indicates whether the data are secondary, in which case the source should be acknowledged.
▪ Tabular Presentation with Textual Analysis – Has similar parts to the Tabular Presentation of Data, but with a textual analysis below the table.
▪ Graphical Methods of Presenting Data - A graph or chart portrays the visual presentation of data using symbols such as lines, dots, bars, or slices. It depicts a certain set of measurements or shows a comparison between two or more sets of data or quantities. Charts and graphs are very useful in simplifying the presentation of research reports. They help researchers and readers understand data quickly and in an engaging way. In a good graph or chart, the X-axis and the Y-axis each have a heading, and units are included. The figure number and title are usually placed below the figure. The known value is plotted on the X-axis and the measured value is plotted on the Y-axis.
▪ Line Graph - A line graph is a graphical presentation of data that shows a continuous change or trend. It may show an ascending or descending trend.
▪ Double Line Graph - A double line graph has two lines connecting points to show continuous change in the data over time. As in a single line graph, the lines can ascend or descend.
▪ Bar Graph or Bar Chart - A bar graph uses bars to compare categories of data. It may be drawn vertically or horizontally. A vertical bar graph is best used when comparing means or percentages between distinct categories.


▪ Pictographs - A pictogram is a special type of bar graph. Instead of using an axis with numbers, it uses pictures or icons to represent a number of items. The use of icons can sometimes help overcome differences in language, culture, and education. Icons can also give a more representational view of the data.
▪ Circle Graph or Pie Chart - A pie chart is usually used to show how parts of a whole compare to each other and to the whole. The entire circle represents the total, and the parts are proportional to the amount of the total they represent. The amount going to each part is expressed as a percentage; then, the circle is divided into pieces proportional to the percentage of each category. This is done by multiplying the percentage share by 360 degrees.

Histograms, Frequency Polygons, and Ogives

Histogram - Graph that displays the data by using contiguous vertical bars (unless the frequency of a class is 0) of various heights to represent the frequencies of the classes.

Frequency Polygon - Graph that uses lines that connect points plotted for the frequencies at the midpoints of the classes; frequencies are represented by the heights of the points.

The Ogive - Graph that represents the cumulative frequencies for the classes in a frequency distribution.
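Two of the graphs described above reduce to simple arithmetic on a frequency table: a pie chart needs each category's share of the total multiplied by 360 degrees, and an ogive needs the running (cumulative) total of the class frequencies. The following is a minimal Python sketch of both calculations. The function names and the sample data are hypothetical and chosen only for illustration; they do not come from the reviewer.

```python
from itertools import accumulate


def pie_angles(frequencies):
    """Return the central angle (in degrees) of each pie slice.

    Each category's share of the total is multiplied by 360 degrees.
    """
    total = sum(frequencies.values())
    return {category: count / total * 360 for category, count in frequencies.items()}


def cumulative_frequencies(class_frequencies):
    """Return the running totals that an ogive plots, class by class."""
    return list(accumulate(class_frequencies))


# Hypothetical survey: how 40 students travel to school.
transport = {"Walk": 10, "Bus": 20, "Car": 5, "Bike": 5}
print(pie_angles(transport))
# {'Walk': 90.0, 'Bus': 180.0, 'Car': 45.0, 'Bike': 45.0}

# Hypothetical class frequencies of a frequency distribution, lowest class first.
print(cumulative_frequencies([3, 7, 12, 6, 2]))
# [3, 10, 22, 28, 30]
```

The slice angles always add up to 360 degrees, and the last cumulative frequency always equals the total number of observations, which makes both results easy to check by hand.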

Distribution Shapes:
▪ Bell-Shaped - Single peak that tapers off at either end
▪ Uniform - Basically flat or rectangular
▪ J-Shaped - Few data values on the left side, increasing as one moves to the right
▪ Reverse J-Shaped - Opposite of J-Shaped
▪ Right-Skewed - Peak of the distribution is to the left and the data values taper off to the right (positively skewed)
▪ Left-Skewed - Data values are clustered to the right and taper off to the left (negatively skewed)
▪ Bimodal - Two peaks of the same height
▪ U-Shaped - Peaks at both ends and decreases toward the middle

Week 4-5: Data Description

Measures of Central Tendency – Statistical values that represent the center or average of a data set. They summarize the data by indicating where the data tend to cluster. There are three main measures of central tendency: mean, median, and mode.

➢ Mean – The most commonly used measure of central tendency. It is also called the arithmetic average of all the data points. The mean can be easily affected by very large or very small numbers that are far away from the others.

• Mean for Ungrouped Data – The mean of ungrouped data is calculated by adding up all the values in a data set and then dividing that sum by the total number of values in the data set.

$\bar{X} = \frac{\sum x}{n}$

Where: $\bar{X}$ = mean for sample data
$\sum x$ = sum of all scores
$n$ = number of scores in the sample data

For a population, the Greek letter $\mu$ (mu) is used for the mean.

$\mu = \frac{\sum x}{N}$

Where: $\mu$ = mean for population data
$\sum x$ = sum of all scores
$N$ = number of scores in the population data

Example 1: Consider a data set of exam scores: 85, 90, 78, 92, and 88. Calculate the mean.

Solution: $\bar{X} = \frac{\sum x}{n} = \frac{78+85+88+90+92}{5} = \frac{433}{5} = 86.6$

So, the mean of the exam scores is 86.6.

Example 2: The data show the number of patients in a sample of six hospitals who acquired an infection while hospitalized. Find the mean:

110 76 29 38 105 31

Solution: $\bar{X} = \frac{\sum x}{n} = \frac{29+31+38+76+105+110}{6} = \frac{389}{6} \approx 64.8$

So, the mean number of hospital infections for the six hospitals is 64.8.

➢ Median – The middle value when all the data are arranged in order. The median is useful when there are extreme values (very high or very low numbers) because it is not affected by them as much as the mean. It gives an idea of where the “center” of the data is without being swayed by outliers.

• Median of Ungrouped Data – The median of ungrouped data is the middle value in a data set when the data points are listed in order. To find the median:
1. Arrange the data points in ascending (or descending) order.
2. If the number of data points is odd, the median is the middle value.
3. If the number of data points is even, the median is the average of the two middle values.

Example 1: The scores of 7 students in a Statistics test are 45, 22, 35, 30, 38, 48, and 29. Find the median of this set of data values.

Solution: Arrange the data values in order from lowest value to highest value: 22, 29, 30, 35, 38, 45, 48

Since there is an odd number of data points, the median is the middle value, which is 35.

Example 2: Ten Statistics books were randomly selected, and the numbers of pages were recorded as follows:

530, 500, 465, 601, 610, 480, 510, 580, 600, 475

Solution: Arrange the data values in order from lowest value to highest value: 465, 475, 480, 500, 510, 530, 580, 600, 601, 610.

The number of values in the data set is 10, which is even. So, the median is the average of the two middle values.

Median $= \frac{5\text{th value} + 6\text{th value}}{2} = \frac{510+530}{2} = \frac{1040}{2} = 520$
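The mean and median procedures above can be checked with a few lines of Python. This is a minimal sketch: the helper names `mean` and `median` are hypothetical, the data are the values listed in the worked examples (the seven test scores are taken from the ordered list in the solution), and the final check against the standard library's `statistics` module is only a cross-check.

```python
import statistics


def mean(values):
    """Sum of all values divided by the number of values."""
    return sum(values) / len(values)


def median(values):
    """Middle value of the ordered data, or the average of the two middle values."""
    ordered = sorted(values)          # step 1: arrange in ascending order
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:                    # step 2: odd count, take the middle value
        return ordered[middle]
    # step 3: even count, average the two middle values
    return (ordered[middle - 1] + ordered[middle]) / 2


exam_scores = [85, 90, 78, 92, 88]
infections = [110, 76, 29, 38, 105, 31]
test_scores = [22, 29, 30, 35, 38, 45, 48]
pages = [530, 500, 465, 601, 610, 480, 510, 580, 600, 475]

print(mean(exam_scores))           # 86.6
print(round(mean(infections), 1))  # 64.8
print(median(test_scores))         # 35
print(median(pages))               # 520.0

# Cross-check against the standard library.
assert median(pages) == statistics.median(pages)
```

Sorting inside `median` means the data can be passed in any order, which mirrors step 1 of the procedure.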

➢ Mode – The third measure of central tendency is called the mode. The mode is the value that occurs most often in the data set. It is sometimes described as the most typical case.

• Unimodal – A data set with only one value that occurs with the greatest frequency.
• Bimodal – Occurs if two numbers appear most often in a data set.
• Multimodal – Occurs if more than two numbers share the highest frequency.
• No Mode – When no data value occurs more than once.

Example: Find the mode for the following list of values: 7, 7, 7, 5, 8, 9, 9, 9, 10

Solution: Since the values 7 and 9 both occur 3 times, the modes are 7 and 9. The data set is said to be bimodal.
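The mode and its classification can also be found by counting occurrences. The sketch below uses Python's `collections.Counter`. The function names `modes` and `describe` are hypothetical, and the rule that a data set with no repeated value has no mode follows the definition given above.

```python
from collections import Counter


def modes(values):
    """Return the sorted list of modes; an empty list means "no mode"."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []  # no data value occurs more than once
    return sorted(value for value, count in counts.items() if count == highest)


def describe(values):
    """Classify a data set as no mode, unimodal, bimodal, or multimodal."""
    labels = {0: "no mode", 1: "unimodal", 2: "bimodal"}
    return labels.get(len(modes(values)), "multimodal")


data = [7, 7, 7, 5, 8, 9, 9, 9, 10]
print(modes(data))     # [7, 9]
print(describe(data))  # bimodal
```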

K♥K
