
Comp. Stats


CHAPTER 1: INTRODUCTION TO STATISTICS

Today, statistics and its applications are an integral part of our lives. In such diverse settings as politics, medicine, education, business, and the legal arena, human activities are both measured and guided by statistics. We begin the module with some basic analysis. Since statistics involves the collection and interpretation of data, we must first know how to understand, display, and summarize large amounts of quantitative information before undertaking a more sophisticated analysis. Statistical analysis of quantitative data is important throughout the pure and social sciences.

LESSON I. WHAT IS STATISTICS?

Statistics has become the universal language of the sciences. As potential users of statistics, we need to master both the “science” and the “art” of using statistical methodology correctly. Careful use of statistical methods will enable us to obtain accurate information from data. These methods include
(1) carefully defining the situation,
(2) gathering data,
(3) accurately summarizing the data, and
(4) deriving and communicating meaningful conclusions.

Statistics is a branch of applied mathematics which deals with the collection, organization, presentation, analysis, and interpretation of data. Statisticians develop and apply appropriate methods in collecting and analyzing data. They guide the design of a research study and then analyze the results. The interpretation of the results is the statistician's basis for making inferences about the population.

a. Data Gathering or Collection. May be done through interviews, questionnaires, tests, observation, registration, and experiments.
b. Presentation of Data. Refers to the organization of data into tables, graphs, charts, or paragraphs. It may be tabular, graphical, or textual.
c. Analysis of Data. Pertains to the process of extracting relevant and noteworthy information from the given data using statistical tools or techniques.
d. Interpretation of Data. Refers to the drawing of conclusions or inferences from the analyzed data.

TYPES OF STATISTICS

As we have seen, statistics can refer to a set of individual numbers or numerical facts, or to general or specific statistical techniques. A further breakdown of the subject is possible, depending on whether the emphasis is on (1) simply describing the characteristics of a set of data or (2) proceeding from data characteristics to making generalizations, estimates, forecasts, or judgments based on the data. The former is referred to as descriptive statistics, while the latter is called inferential statistics.

Statistics: DESCRIPTIVE STATISTICS and INFERENTIAL STATISTICS

Descriptive Statistics. It relates to the gathering, classification, and presentation of data and the collection of summarizing values to describe group characteristics of data. The most used summarizing values are percentages, measures of central tendency and location, measures of variability, skewness, and kurtosis. For example, upon looking around your class, you may find that 35% of your fellow students are wearing Casio watches. If so, the figure “35%” is a descriptive statistic. Chapters 3 and 4 will present several popular visual and statistical approaches to expressing the data we or others have collected. For now, however, just remember that descriptive statistics are used only to summarize or describe data.
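The “35%” figure in the Casio watch example can be reproduced with a few lines of Python. This is only a minimal sketch: the class of 20 students and the list name wears_casio are made up for illustration and do not come from the module.

```python
# Hypothetical class of 20 students: True means the student wears a Casio watch.
wears_casio = [True] * 7 + [False] * 13

# A percentage is a descriptive statistic: it only summarizes the observed data.
percentage = 100 * sum(wears_casio) / len(wears_casio)

print(f"{percentage}% of the class wears a Casio watch")
```

The result describes this class only. Saying anything about students outside the class would require inferential statistics.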
Inferential Statistics. Pertains to the methods dealing with making inferences, estimates, or predictions about a large set of data using the information gathered. Commonly used inferential statistical tools or techniques are hypothesis testing using the z-test and t-test, simple linear correlation, analysis of variance (ANOVA), chi-square tests, regression, and time series analysis. For example, by observing a sample of nurses and other healthcare workers who were likely infected with COVID-19, researchers found that only half routinely wore PPE when dealing with patients. Chapters 5 and 6 will present several popular visual and statistical approaches to making predictions from the data collected. For now, however, just remember that inferential statistics draws conclusions about a population based on data observed in a sample.

WHY DATA ARE NEEDED

Data are one of the most important and vital aspects of any research study. Research conducted in different fields of study can differ in methodology, but all research is based on data, which are analyzed and interpreted to get information. Data are the basic unit in statistical studies. Statistical information such as census results, population variables, health statistics, and road accident records is all developed from data. Data contain the information needed to make a more informed decision in a situation. There are many instances in which data are needed:

• A market researcher needs to assess product characteristics to distinguish one product from another.
• An operations manager wants to monitor an assembly process on a regular basis to find out whether it conforms to standards.
• A potential investor wants to determine what firms within what industries are likely to have accelerated growth in a period of economic recovery.
• A student wants to get data on classmates’ favorite rock groups to satisfy a curiosity.

TABLE 1.1 Six Main Reasons for Data Collection

Reason for Obtaining Data
1. Data are needed to provide the necessary input to a survey.
2. Data are needed to provide the necessary input to a study.
3. Data are needed to measure performance of an ongoing service or production process.
4. Data are needed to evaluate conformance to standards.
5. Data are needed to assist in formulating alternative courses of action in a decision-making process.
6. Data are needed to satisfy our curiosity.

Key Data Collection Sources

1. Data may already be published by governmental, industrial, or individual sources. The Philippine Statistics Authority is responsible for collecting and compiling data on the economic, social, demographic, political, and general affairs of the people of the Philippines.
2. An experiment may be designed to obtain the necessary data. Strict control is exercised over the treatments. For example, in a study testing the effectiveness of laundry detergent, the researcher determines
which brands in the study are most effective in cleaning soiled clothes by actually washing dirty laundry instead of asking customers which brand they believe to be most effective.
3. A survey may be conducted. In this data collection source, no control is exercised over the behavior of the people being surveyed. They are merely asked questions about their beliefs, attitudes, behaviors, and other characteristics. Responses are then edited, coded, and tabulated for analysis.
4. An observational study may be conducted. A researcher observes the behavior directly, usually in its natural setting. Most knowledge of animal behavior is developed in this way, as is our scientific knowledge in other fields, such as astronomy and geology, in which experimentation and surveys are impractical if not impossible.

Two Types of Data Collection Sources

1. Primary Sources. The data are measured and gathered by the researcher who publishes them. They are the data collectors.
2. Secondary Sources. The data are republished by another researcher or agency. They are the data compilers.

LESSON II. TYPES OF VARIABLES AND SCALES OF MEASUREMENT

The scale of measurement of your variables is important for two reasons. Each of the levels of measurement provides a different level of detail. Nominal provides the least amount of detail, ordinal provides the next highest amount of detail, and interval and ratio provide the most detail. Data can also be obtained in terms of the level of measurement attained.

TYPES OF VARIABLES

Qualitative Variables. Some of the variables associated with people or objects are qualitative in nature, indicating that the person or object belongs to a category. Qualitative variables, also referred to as attributes, typically involve counting how many people or objects fall into each category. In expressing results involving qualitative variables, we describe the percentage or the number of persons or objects falling into each of the possible categories.

Quantitative Variables. Yield numerical responses representing an amount or quantity. Examples are weight, height, and number of children. There are two types of quantitative variables: discrete and continuous.
a. Discrete Quantitative Variables. Produce numerical responses that arise from a counting process. For example, “number of children” is a discrete numerical variable because the response is one of a finite number of integers (0, 1, 2, 3, …).
b. Continuous Quantitative Variables. Produce numerical responses that arise from a measuring process.
Example:
Height (5’4”, 157 cm, 1.5 m)
Weight (130.42 kilos, 210 lbs, 432 grams)
Temperature (32.5 °C, 112 °F)

SCALES OF MEASUREMENT

a. Nominal Level. Classifies data into various distinct categories in which no ordering is implied. It is the weakest form of measurement because no attempt can be made to account for differences within a category or to specify any ordering or direction across the various categories.
b. Ordinal Level. Classifies data into distinct categories in which ordering is implied. Data are ranked in a “bottom to top” or “low to high” manner. Statements of the kind “greater than” or “less than” may be made.
c. Interval Level. It is an ordered scale in which the difference between measurements is a meaningful quantity that does not involve a true zero point.
d. Ratio Level. It is an ordered scale in which the difference between the
measurements involves a true zero point, as in height, weight, age, or salary measurements.

CHAPTER 2: SAMPLING and SAMPLING DISTRIBUTION

LESSON 1: SURVEY SAMPLING

Survey sampling, or simply sampling, refers to the process of choosing a sample of elements from a total population of elements. It is a process of selecting a subset of a population of items for the purpose of making inferences about the whole population. The two broad categories of sampling, probability and non-probability sampling, will be discussed in this lesson.

POPULATION VERSUS SAMPLE

We will encounter the terms population and sample on almost every page of this module. Consequently, understanding the meaning of each of these two terms and the difference between them is crucial.

Population is the totality of items or things under consideration. It is also the large set of data. The population that is being studied is also called the target population. Population size is denoted by “N”.
Sample is a subset of a population; hence, the sample must possess the same characteristics as the population. Sample size is denoted by “n”.

Population and Sample

The collection of information from the elements of a population or a sample is called a survey. A survey that includes every element of the target population is called a census. Such a survey conducted on a sample is called a sample survey.

KEY TERMS

• Parameter is a measurable characteristic of a population, such as the population mean, denoted by “µ” (mu), and the population standard deviation, denoted by “σ” (sigma).
• Statistic is a measurable characteristic of a sample, such as the sample mean, denoted by “x̄” (x-bar), or the sample standard deviation, denoted by “s”.
• Sampling distribution is a probability distribution of a statistic. When we say sampling distribution of the mean, we are referring to the distribution of the mean values of all possible samples that can be obtained from the population.
• Sampling with replacement is used when a population element can be selected more than one time. After a person or item is selected, it is returned to the frame, where it has the same probability of being selected again.
• Sampling without replacement is used when a population element can be selected only once. A person or item, once selected, is not returned to the frame and therefore cannot be selected again.
• Standard error refers to the standard deviation of the sampling distribution. Hence, the standard error of the mean is the standard deviation of the sampling distribution of the mean.
• Variable, denoted by the letters X and Y, is a characteristic of interest for each person or thing in a population. It may be qualitative or quantitative.
• Data are the actual values of the variable. They may be numbers, or they may be in words. Datum is a single value.

Quality of Survey Results

1. Accuracy. It refers to the closeness of a sample statistic to the population parameter. For example, if the sample mean is 99 and the real population mean is 100, then the sample mean is accurate with a gap of 1 unit.
2. Precision. It refers to the closeness of the estimates from different samples to one another. An example of a measure of precision is the standard error.
3. Margin of Error. The maximum expected difference between the true population parameter and a sample estimate of
that parameter is expressed by the margin of error. The larger the margin of error, the less confidence one should have that a poll result would reflect the result of a survey of the entire population.

SAMPLE DESIGNS

1. Sampling method. It is the process of selecting a part from a given whole. Its primary purpose is to make a generalization about the (unknown) characteristics of the whole. It is also central to the study of statistical inference.
2. Estimator. It refers to the process of calculating sample statistics. Different estimators can be used in different sampling methods.

LESSON II. SAMPLING TECHNIQUES

DETERMINING THE SAMPLE SIZE OF THE POPULATION

Slovin's formula is popularly used for determining the sample size for survey research, especially in undergraduate theses in education and the social sciences, because it is easy to use and the computation is based almost solely on the population size. It is not advisable to set a certain percentage of the population as the sample size; instead, the margin of error, which ranges from 1% to 10% in social science research, should be considered.

n = N / (1 + N e^2)

where
n is the sample size
N is the population size
e is the sampling error (margin of error)

PROBABILITY SAMPLING METHOD

In probability sampling, every element of the population has a nonzero chance of being chosen for the sample. Probability samples allow probability statements about sample statistics: the extent to which a sample statistic differs from a population parameter can be estimated. It is one in which the subjects of the sample are chosen based on known probabilities.

a. Simple Random Sample (SRS). It is the simplest form of random sampling. It is one in which every individual or item from a frame has the same chance of selection as every other individual or item. In addition, every sample of a fixed size has the same chance of selection as every other sample of that size.

b. Systematic Sample. It relies on arranging the target population according to some scheme and then selecting elements at regular intervals through that ordered list. The starting point can be selected at random.

c. Stratified Sample. It involves dividing your population into homogeneous subgroups and then taking a simple random sample in each subgroup.

d. Cluster Random Sampling. This is used when every member of the population is assigned to one and only one group (cluster). Clusters are selected, and the elements in each selected cluster are included in the sample. Clusters can be naturally occurring designations, such as countries, election districts, cities, apartment buildings, blocks, or families.

e. Multistage Sampling. In this method, we select a sample by using a combination of different sampling methods.

NON-PROBABILITY SAMPLING METHOD

This method does not involve random selection. It is a sampling technique where the samples are gathered in a process that does not give all the individuals in the population equal chances of being selected.

a. Quota Sampling. A method in which participants are selected according to pre-specified quotas regarding demographics, attitudes, behavior, or some other criteria.
b. Purposive Sampling. It is done through choosing based on the
predetermined criteria set by the researchers. It is most often used in market research.
c. Convenience Sampling. It is simply one in which the researcher uses any subjects that are available to participate in the research study. It is also known as “man on the street” or “person on the street” sampling.

CHAPTER 3: STATISTICAL PRESENTATION AS AN AID TO REPORTING INFORMATION

LESSON I. FORMS OF PRESENTATION OF DATA

After applying the different methods of collecting data, the raw data gathered from primary or secondary sources should be organized and presented in summarized form. This lesson focuses on the different forms of data presentation and the different types of graphs and charts.

DIFFERENT FORMS OF DATA PRESENTATION

1. Textual. This form of presentation combines text and numerical facts in paragraphs to explain the summary of the data gathered. It usually discusses the highlights of the data.

2. Tabular. This form of presentation uses a statistical table that shows the data in a more concise and systematic manner. The table facilitates the analysis of relationships in the data.

3. Graphical. This form of presentation is the most interesting and the most effective means of organizing and presenting statistical data. The important relationships in the data can be easily seen by merely looking at colorful figures that are creatively designed.

Different types of graphs/charts

a. Area. This type of chart displays quantitative data graphically. It is based on the line chart. The area between the axis and the line is commonly emphasized with colors, textures, and hatchings. Commonly, one compares two or more quantities with an area chart.

b. Bar. This type of data presentation is composed of bars or rectangles of equal width. The bars can be drawn horizontally or vertically, in single or paired bar graphs. The length of each rectangle is proportional to the frequency of the observed item or the magnitude of the class interval being studied.

c. Column. This is a data visualization where each category is represented by a rectangle, with the height of the rectangle being proportional to the values being plotted. Column charts are also known as vertical bar charts.

d. Pie Chart. This represents the relationships of the different components of the data. It is the ideal graph if you want to show the partition of a whole. The angles or sectors should be proportional to the percentage components of the data.

e. Doughnut. This is a built-in chart type. Doughnut charts are meant to express a “part-to-whole” relationship, where all pieces together represent 100%.

f. Line Graph. This type of data presentation shows relationships between two sets of quantities. This type is often used to predict growth trends, such as sales and population, over a long period of time.

g. Scatter. This type illustrates the relationship between two variables, with points plotted in a Cartesian plane. It is like making a line graph except that there is no need to connect the points.

LESSON II. CREATING AND EVALUATING TABLES AND GRAPHS

Area
Trends can be emphasized effectively because it illustrates the magnitude of change over time.

Bar
This chart type is ideal if you want to make comparisons among individual items with two-way reading.

Column
This is useful in showing changes over a period. It has the same function as the bar chart.

Pie
This type of chart compares the sizes of each sector as they relate to the whole unit. It illustrates the partition of parts with a total of 100% and is applicable if there is only one kind of data to be analyzed.

Doughnut
It also shows the comparisons between the whole and the parts, but this type can be used to show more than one set of data.

Line
It illustrates the trends in data with equal intervals. It uses two-way reading.

Scatter
It illustrates the relationship between two variables.

CREATING CHARTS

To facilitate making the graphs, you can use Microsoft Excel to create your chart. It will guide you through the steps of selecting the chart type and adding chart titles and labels. Before starting to use Microsoft Excel, select the data or range that you want to convert into a chart.

FREQUENCY DISTRIBUTION

A frequency distribution for quantitative data lists all the classes and the number of values that belong to each class. Data presented in the form of a frequency distribution are called grouped data.

CUMULATIVE FREQUENCY DISTRIBUTION

The cumulative frequency of a class is the sum of the frequency of that class and the frequencies of all classes below it in a frequency distribution. All that means is you are adding up a value and all the values that came before it.

CHAPTER 4: DESCRIPTIVE STATISTICS

Descriptive statistics is the term given to the analysis of data that helps describe, show, or summarize data in a meaningful way. Descriptive statistics do not, however, allow us to make conclusions beyond the data we have analyzed or reach conclusions regarding any hypotheses we might have made.

LESSON I. MEASURES OF CENTRAL TENDENCY

A measure of central tendency is a summary statistic that represents the center point or typical value of a dataset. These measures indicate where values in a distribution fall and are also referred to as the central location of a distribution.

MEASURES OF CENTRAL TENDENCY USING UNGROUPED DATA

We often represent a data set by numerical summary measures, usually called the typical values. A measure of central tendency gives the center of a frequency distribution curve. This section discusses three different measures of central tendency: the mean, the median, and the mode.

The mean, also called the arithmetic mean, is the most frequently used measure of central tendency. This module will use the words mean and average synonymously. For ungrouped data, the mean is obtained by dividing
the sum of all values by the number of values in the data set.

The median is the value of the middle term in a data set that has been ranked in increasing or decreasing order. As is obvious from the definition of the median, it divides a ranked data set into two equal parts.

The word mode comes from a French word that means fashion: an item that is most popular or common. In statistics, the mode is the value that occurs with the highest frequency in a data set. If there is no common score, the data set has no mode. A distribution with only one mode is said to be unimodal, while a distribution with two or more modes is described as multi-modal.

MEASURES OF CENTRAL TENDENCY USING GROUPED DATA

Data which are arranged in a frequency distribution are called grouped data. When the number of items is too large, it is best to compute the measures of central tendency and variability using the frequency distribution.

The Mean. There are two methods we can use to compute the mean from grouped data: the long method and the coded deviation method.

The Median. It is the value in the middle of an ordered arrangement of data. In an ordered distribution, half of the terms are located above the median and half are below the median.

The Mode. The mode in a frequency distribution lies within the class interval with the highest frequency. The class interval with the highest frequency is known as the modal class. A crude mode may be determined by taking the class mark of the class with the highest frequency.

LESSON II. MEASURES OF POSITION

Measures of position give us a way to see where a certain data point or value falls in a sample or distribution. A measure can tell us whether a value is about the average, or whether it is unusually high or low.

THE QUARTILES
The quartiles divide the distribution into four equal parts, each comprising 25% of the observations. The quartiles are Q1 (first quartile), Q2 (second quartile or the median), Q3 (third quartile), and Q4 (fourth quartile, which coincides with the highest value).

THE DECILES
The deciles divide the distribution into ten equal parts, each comprising 10% of the observations. The median is the 5th decile.

THE PERCENTILES
The percentiles divide the distribution into 100 equal parts, each comprising 1% of the observations. The median is the 50th percentile.

LESSON III. MEASURES OF DISPERSION

The range, interquartile range, standard deviation, and variance describe how far the scores are spread from the measures of central tendency. They can also be used to establish the actual similarities or differences between distributions.

a. THE RANGE
It is the simplest and easiest measure of variability. It simply measures how far the highest score is from the lowest score. It does not tell anything about the scores between these two extreme scores.

b. THE INTERQUARTILE RANGE
The interquartile range is a measure of where the “middle fifty” is in a data set. Where a range is a measure of where the beginning and end are in a set, an interquartile range is a measure of where the bulk of the values lie. This value is obtained by getting the difference of the 3rd and 1st quartiles in a set of data.

c. THE STANDARD DEVIATION AND THE VARIANCE
The standard deviation is the measure of variability that involves all scores in the distribution rather than only the extreme scores. It may be referred to as the root-mean-square of the deviations from the mean. The variance is the square of the standard deviation.
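The measures of central tendency, position, and dispersion for ungrouped data can be illustrated with a short Python sketch using the standard library's statistics module. The eight scores are invented for illustration. The code uses the population formulas (pvariance and pstdev), which match the root-mean-square definition of the standard deviation; a sample standard deviation would divide by n - 1 instead. Quartile values depend on the interpolation convention, so other software or textbooks may report slightly different Q1 and Q3 for the same data.

```python
import statistics

# Hypothetical ungrouped data: eight test scores (made up for illustration).
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Measures of central tendency
mean = statistics.mean(scores)        # sum of all values / number of values
median = statistics.median(scores)    # middle value of the ranked data
modes = statistics.multimode(scores)  # list of the most frequent value(s)

# Measures of position: the three cut points Q1, Q2, Q3
# (Python's default "exclusive" interpolation method is assumed here).
q1, q2, q3 = statistics.quantiles(scores, n=4)

# Measures of dispersion
data_range = max(scores) - min(scores)    # highest score minus lowest score
iqr = q3 - q1                             # spread of the "middle fifty"
variance = statistics.pvariance(scores)   # mean of squared deviations from the mean
std_dev = statistics.pstdev(scores)       # square root of the variance

print("mean:", mean, "median:", median, "mode(s):", modes)
print("Q1:", q1, "Q2:", q2, "Q3:", q3)
print("range:", data_range, "IQR:", iqr)
print("variance:", variance, "standard deviation:", std_dev)
```

Note that Q2 equals the median, and that the range depends only on the two extreme scores while the variance and standard deviation use every score.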
