MATH 4
Statistics
It is the science of conducting studies that collect, organize, summarize, analyze and draw
conclusions from data.
Statistics is the science of decision making in a world full of uncertainties
Scope of Statistics
Collecting:
Data relating to certain events or physical phenomena. Most datasets involve numbers.
Organizing:
All collected data will be arranged in logical and chronological order for viewing and
analysis. Datasets are normally organized in either ascending or descending order.
Summarizing:
Summarizing the data to offer an overview of the situation
Presenting:
Develop a comprehensive way to present the dataset
Analyzing:
To analyze the dataset for the intended application
Two Main branches of Statistics
Descriptive Statistics
-collection and organization of data
-uses the data to provide descriptions of the population, either through numerical calculations
or graphs or tables
Inferential Statistics
-makes inferences and predictions about a population based on a sample of data taken from the
population in question
-consists of generalizing from samples to populations, performing hypothesis testing,
determining relationships among variables, and making predictions
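The two branches can be contrasted with a short Python sketch. The data below are hypothetical
(eight part diameters invented for illustration), and the confidence interval uses a normal
approximation, which is an assumption; for a sample this small a t-interval would normally be
preferred.

```python
import math
import statistics

# Hypothetical sample: measured diameters (mm) of 8 parts taken from a larger batch.
sample = [10.2, 9.8, 10.1, 10.4, 9.9, 10.0, 10.3, 9.7]

# Descriptive statistics: summarize the data at hand.
n = len(sample)
mean = statistics.mean(sample)
median = statistics.median(sample)
stdev = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)

# Inferential statistics: generalize from the sample to the population.
# Approximate 95% confidence interval for the population mean, using a
# normal approximation (an assumption made to keep the sketch short).
z = statistics.NormalDist().inv_cdf(0.975)
margin = z * stdev / math.sqrt(n)
ci = (mean - margin, mean + margin)

print(f"mean={mean:.3f} median={median:.3f} stdev={stdev:.3f}")
print(f"approx. 95% CI for the population mean: ({ci[0]:.3f}, {ci[1]:.3f})")
```

The first group of values only describes the eight parts measured; the interval is a statement
about the whole batch, which is what makes it inferential.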
Population
refers to the group or aggregate of people, objects, materials, events, or things of any
form. To save money, a statistician may study only a part of the population, called a sample
Sample
is a subgroup of the population, taken to represent the population's
characteristics or traits
Parameters - measures of the population
Statistic - measures of the sample
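A minimal Python sketch of the difference between a parameter and a statistic. The population
here is hypothetical (500 randomly generated exam scores), and the sample size of 30 is an
arbitrary choice for illustration.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: exam scores of all 500 students in a course.
population = [rng.randint(50, 100) for _ in range(500)]

# Parameter: a measure computed from the whole population.
population_mean = statistics.mean(population)

# Statistic: the same measure computed from a sample of 30 students.
sample = rng.sample(population, 30)
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The parameter is fixed for a given population, while the statistic changes from one sample to
the next; rerunning with a different seed shows this.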
Variables - is a characteristic that takes two or more values across individuals
Independent Variables - cause, predictor
Dependent Variables - effect, output, value being predicted
Qualitative variables - represent differences in quality, character, or kind but not in amount
Quantitative variables - are numerical in nature and can be ordered or ranked
Discrete variables - whose values can be counted using integral values
Continuous variables - assume any numerical value over an interval or intervals
Level Of Measurement
Nominal - use numbers for the purpose of identifying name or membership in a group or
category.
Ordinal - connote ranking or inequalities, numbers represent “greater than” or “less than”
measurements such as preferences or rankings.
Interval - indicate an actual amount and there is equal unit of measurement separating each
score, specifically equal intervals.
Ratio - are similar to interval data, but has an absolute zero and multiples are meaningful.
These are the highest level of measurement
Data
is a collection of values for a particular variable
factual information (as measurements or statistics) used as a basis for reasoning, discussion, or
calculation (e.g., the data is plentiful and easily available).
Types of Data
Primary Data
are data collected directly by the researcher. These are first-hand or original
sources
Secondary data
are information taken from published or unpublished materials previously gathered by other
researchers or agencies, such as books, newspapers, magazines, journals, and published and
unpublished theses and dissertations
Sampling Design/Methods
1. Probability Sampling
a. Each of the units in the target population has a chance of being included in the sample
b. Greater possibility of representative sample of the population
c. Conclusion derived from data gathered can be generalized for the whole population
Types of Probability Sampling
1. Simple Random Sampling-
is the sampling technique where the sample is obtained from the population randomly
2. Systematic Sampling-
the sample is taken following a systematic order of appearance in a sequence or
arrangement.
3. Stratified Sampling-
the population is divided into different strata or groups, and the sample is taken from each
stratum in proportion to its size in the population
4. Cluster Sampling- the population is formed into different clusters (area sampling)
5. Multistage Sampling- this is usually used for national, regional, provincial or country level
studies
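The first four probability sampling types can be sketched in Python as follows. The population,
the field names ("id", "stratum", "cluster") and the function names are all hypothetical, chosen
only for illustration. The systematic version assumes "every k-th unit after a random start",
which is one common reading of the definition above.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of 20 units, each tagged with a stratum and a cluster.
population = [
    {"id": i, "stratum": "A" if i < 12 else "B", "cluster": i // 5}
    for i in range(20)
]

def simple_random(units, n):
    """Draw n units so that every unit has the same chance of selection."""
    return rng.sample(units, n)

def systematic(units, n):
    """Take every k-th unit after a random start, where k = len(units) // n."""
    k = len(units) // n
    start = rng.randrange(k)
    return units[start::k][:n]

def stratified(units, n):
    """Sample each stratum in proportion to its share of the population.

    Rounding can make the total differ slightly from n for other inputs.
    """
    strata = {}
    for u in units:
        strata.setdefault(u["stratum"], []).append(u)
    chosen = []
    for members in strata.values():
        share = round(n * len(members) / len(units))
        chosen.extend(rng.sample(members, share))
    return chosen

def cluster(units, n_clusters):
    """Randomly pick whole clusters and keep every unit inside them."""
    ids = sorted({u["cluster"] for u in units})
    picked = set(rng.sample(ids, n_clusters))
    return [u for u in units if u["cluster"] in picked]

print([u["id"] for u in simple_random(population, 5)])
print([u["id"] for u in systematic(population, 5)])
print([u["id"] for u in stratified(population, 5)])
print([u["id"] for u in cluster(population, 2)])
```

A multistage design could be approximated by chaining these, for example picking clusters first
and then drawing a simple random sample inside each chosen cluster.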
2. Non-Probability Sampling
a. There is no way to ensure that each of the units in the target population has the same chance
of being included in the sample.
b. No assurance that every unit has some chance of being included.
c. Conclusions derived from the data gathered are limited to the sample itself.
Types of Non-Probability Sampling
1.Accidental or Convenience Sampling-
it is obtained when the researcher selects whatever sampling units are conveniently
available
2.Purposive Sampling-
under this scheme, the sampling units are selected subjectively by the researcher, who
attempts to obtain a sample that appears to be representative of the population.
3.Quota Sampling –
in this method, the researcher determines the sample size that should be filled
4.Snowball Sampling-
this type of sampling starts with known sources of information, who or which will
in turn give other sources of information.
5.Networking Sampling-
this is used to find socially devalued urban populations such as addicts, alcoholics, child
abusers and criminals, because they are usually “hidden” from outsiders
Collecting Engineering data
Three ways of collecting data on the impacts of factors on a response in a system:
1. Retrospective Study
This would use either all or a sample of the historical process data archived over some period
of time.
2. Observational Study
Collect relevant data from current operations without disturbing the system.
3. Designed Experiments
Disturb the system and observe the impacts.
Planning and Conducting Surveys
Surveys
- It is a way to ask a lot of people a few well-constructed questions. It is a series of unbiased
questions that the subject must answer.
Advantages of surveys
- They are efficient ways of collecting information from a large number of people, they are
relatively easy to administer, a wide variety of information can be collected and they can be.
Disadvantages of surveys
- The disadvantages arise from the fact that surveys depend on the subject’s motivation, honesty,
memory and ability to respond. Moreover, answer choices to survey questions could lead to vague data.
Five Simple Steps for Conducting Surveys
1. Identify the audience “Why do we Experiments?”
2. Find a survey provider Experiments are the basis of all
3. Conduct the survey theoretical predictions. Without
4. Create context for the survey experiments, there would be no results, and
5. Evaluate your research without any tangible data, there is no basis
for any scientist or engineer to formulate a
theory. The advancement of culture and
civilization depends on experiments which
bring about new technology (P. Cuadra)
Experiment – is a series of tests conducted in a systematic manner to increase the understanding of
an existing process or to explore a new product or process.
Design of Experiments (DOE) - It is a tool to develop an experimentation strategy that maximizes
learning using a minimum of resources.
Stages of Design of Experiments (DOE)
1. Planning
2. Screening
3. Optimization
4. Robustness Testing
5. Verification
Three Basic Principles of Experimental Design
1. Randomization- The allocation of the experimental material and the order in which the runs of
the experiment are performed are randomly determined.
2. Replication- An independent repeat run of each factor combination. It lets the
experimenter obtain an estimate of the experimental error.
3. Blocking- It helps in improving the precision of the experiment. It is used to reduce or
eliminate the variability transmitted from nuisance factors, that is, factors that may influence
the response variable but in which we are not interested.
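The three principles can be seen together in a small run plan. The sketch below is only an
illustration: the factors, their levels and the "day" blocks are hypothetical, and the layout
(one full replicate per block, shuffled within each block) is one simple arrangement among many.

```python
import itertools
import random

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical factors and levels, for illustration only.
factors = {"temperature": [150, 180], "pressure": [1.0, 2.0]}

# Blocking: "day" is treated as a nuisance factor, so each day forms a block.
blocks = ["day 1", "day 2"]

# Every factor combination (a full factorial design).
combinations = list(itertools.product(*factors.values()))

plan = []
for block in blocks:
    # Replication: each block holds one complete, independent repeat
    # of every factor combination.
    runs = combinations.copy()
    # Randomization: the run order is shuffled within each block.
    rng.shuffle(runs)
    for settings in runs:
        plan.append({"block": block, **dict(zip(factors, settings))})

for run in plan:
    print(run)
```

Because every combination appears once in each block, day-to-day variability can be separated
from the effects of temperature and pressure, and the two repeats of each combination give an
estimate of experimental error.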
Strategy of Experimentation-
The general approach of planning and conducting the experiment.
Guidelines for Designing an Experiment
1. Recognition of and statement of the problem
- It is necessary to develop all ideas about the objectives of the experiment
a.) Factor screening- to find most influential factors having impact on response variable.
b.) Optimization – to find settings or levels of the important factors that result in desirable values of
response variable
c.) Confirmation- to verify some theory or past experience. Testing effectiveness of new substitute
material.
d.) Discovery- to find new material
e.) Robustness- to find the conditions under which the response variable seriously degrades
2. Selection of the response variable
- It should give required information.
3. Choice of factors, levels and ranges
- Factors are classified as design factors (the important factors having the most influence, which
are selected for study) or nuisance factors.
4. Choice of experimental design
- It depends on the previous steps
5. Performing the experiment
- Take utmost care to execute the experiment as per the plan.
6. Statistical Analysis of the data
- It assures that the conclusions are objective.
7. Conclusions and recommendations
- Draw practical conclusions and recommend the action