Lesson 1 - Roles of Statistics and Data Analysis

The document discusses the roles of statistics and data analysis. It describes the data analysis process, which involves acknowledging variability in data, collecting data sensibly, describing variability, using descriptive statistics, and drawing conclusions while recognizing variability. Statistics provides methods for making sense of data and teaches how to make judgements in the presence of uncertainty. The document also discusses types of statistics, data, and ways of summarizing data through frequency distributions, bar charts, and dotplots.

Uploaded by

qrvccruz
Copyright: © All Rights Reserved

Roles of Statistics and Data Analysis – CE 006 – Engineering Data Analysis

• The Data Analysis Process

Steps:

1. Acknowledging Variability – recognizing that the data we are dealing with differ from one observation to another.
2. Collecting Data Sensibly – gathering the correct information in the most efficient way.
3. Describing Variability in the Data – describing the differences among the data gathered.
4. Descriptive Statistics – after gathering the correct information, presenting and organizing it in the most efficient way.
5. Drawing Conclusions in a Way that Recognizes Variability in the Data and Probability
Conclusion – arriving at a result using the correct statistical tool while also taking into
account the differences among the data.

• Statistics

➢ The scientific discipline that provides methods to help make sense of data.
➢ Suspicion: extreme skeptics, usually speaking out of ignorance, characterize this
discipline as a subcategory of lying.
➢ If used properly, statistical methods offer a set of POWERFUL tools for gaining
insight into the world around us.
➢ Often used in business, medicine, agriculture, social sciences, natural sciences and
applied sciences.
➢ Statistics teaches us how to make intelligent judgements and informed
decisions in the presence of uncertainty and variation.

• Three Reasons to Study Statistics

1. Be Informed
➢ Understand news reports that make data-based claims.
➢ Extract information from tables and graphs.
➢ Understand the basics of valid research design.

2. Understand Issues and Make Sound Decisions Based on Data
➢ Decide whether existing information is adequate or whether more is needed.
➢ Collect information in a reasonable and thoughtful manner.
➢ Summarize data in a useful and informative way.
➢ Analyze available data.
➢ Draw conclusions, make decisions, and assess the risk of an incorrect decision.

3. Evaluate Decisions that Affect Your Life
➢ Other people and organizations use statistical methods to make decisions that affect your life.
➢ Are the decisions they make reached in a reasonable way?

• The Nature and Role of Variability

➢ If all measurements were identical for every individual, describing a population would be easy.
➢ But populations without variability are virtually non-existent.
➢ In fact, variability is universal.
➢ We need to understand variability to be able to collect, analyze, and draw
conclusions from data in a sensible way.
➢ The branch called descriptive statistics helps to increase our understanding of the
nature of variability in a population.

• Statistics and the Data Analysis Process

➢ Conclusions based on data appear regularly in popular media and in professional
and academic publications.
➢ Decisions are data-driven in business, industry, and government.

• Two Types of Statistics:

1. Descriptive Statistics – methods for organizing and summarizing data.

2. Inferential Statistics – involves generalizing from a sample to the population and
requires understanding of the variation in the population.

Population – the entire collection of individuals or objects about which information is
desired.

Sample – a subset of the population, selected for study in some prescribed manner.
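
The relationship between a population and a sample can be illustrated with a short Python sketch. The population below is simulated (hypothetical compressive strengths, not real measurements), and simple random sampling is only one possible "prescribed manner" of selecting a sample.

```python
import random
import statistics

# Hypothetical population: compressive strength (MPa) of 1,000 concrete
# cylinders. The values are simulated for illustration only.
rng = random.Random(42)  # fixed seed so the sketch is repeatable
population = [rng.gauss(30.0, 3.0) for _ in range(1000)]

# A sample is a subset of the population selected in a prescribed manner;
# here the manner is simple random sampling without replacement.
sample = rng.sample(population, 25)

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"population size: {len(population)}, mean: {population_mean:.2f}")
print(f"sample size:     {len(sample)}, mean: {sample_mean:.2f}")
```

The two means will usually be close but not identical. That gap comes from variability in the population, which is why generalizing from a sample (inferential statistics) has to account for it.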

• The Data Analysis Process

➢ Raw data without analysis are of little value. Likewise, even a sophisticated analysis
cannot extract meaningful information from data that were not collected in a
sensible way.

➢ Steps in the data analysis process:

1. Understand the nature of the problem
2. Decide what to measure and how to measure it
3. Data collection
4. Data summarization and preliminary analysis
5. Formal Data Analysis
6. Interpretation of results

Example: A Proposed New Treatment for Alzheimer’s Disease

The article “Brain Shunt Tested to Treat Alzheimer’s” (San Francisco Chronicle, October
23, 2002) summarizes the findings of a study that appeared in the journal Neurology. Doctors
at Stanford Medical Center were interested in determining whether a new surgical approach
to treating Alzheimer’s disease results in improved memory functioning.

The surgical procedure involves implanting a thin tube, called a shunt, which is designed
to drain toxins from the fluid-filled space that cushions the brain. Eleven patients had shunts
implanted and were followed for a year, receiving quarterly tests of memory function.

Another sample of Alzheimer’s patients was used as a comparison group. Those in the
comparison group received the standard care for Alzheimer’s disease.

After analyzing the data from this study, the investigators concluded that the “results
suggested the treated patients essentially held their own in the cognitive tests while the patients
in the control group steadily declined. However, the study was too small to produce conclusive
statistical evidence.”

Based on these results, a much larger 18-month study was planned. That study was to
include 256 patients at 25 medical centers around the country.

Evaluating a Research Study

The six data analysis steps can be used as a guide:

1. What were the researchers trying to learn? What question motivated their research?

2. Was relevant information collected? Were the right things measured?

3. Were the data collected in a sensible way?

4. Were the data summarized in an appropriate way?

5. Was an appropriate method of analysis used, given the type of data and how the data
were collected?

6. Are the conclusions drawn by the researchers supported by the data analysis?

• Describing Data

➢ Variable – any characteristic whose value may change from one individual or object
to another.
➢ Data – the results of making observations either on a single variable or
simultaneously on two or more variables.

• Types of Data Set

1. Univariate Data Set – a data set consisting of observations on a single attribute.

a) Categorical (or qualitative) – the individual observations are categorical
responses.
b) Numerical (or quantitative) – the individual observations are numerical.

2. Bivariate Data Set – a data set consisting of two attributes recorded
simultaneously for each individual.

3. Multivariate Data Set – a data set that results from obtaining a category or value
for each of two or more attributes.

• Two Types of Numerical Data

1. Discrete – a numerical variable whose possible values correspond to isolated
points on the number line; values are typically obtained by counting.

2. Continuous – a numerical variable whose possible values form an entire interval
on the number line; values are typically obtained by measurement.
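
The data-set types and variable types above can be made concrete with a small Python sketch. The bridge records and the `classify` helper are hypothetical, and the type-based rule inside `classify` is only a rough heuristic assumed for illustration: in practice, whether a variable is discrete or continuous depends on how it was obtained (counted or measured), not on how it is stored.

```python
# Hypothetical records for illustration: one dict per individual (a bridge).
records = [
    {"material": "steel",    "lanes": 2, "span_m": 45.2},
    {"material": "concrete", "lanes": 4, "span_m": 30.8},
    {"material": "steel",    "lanes": 2, "span_m": 52.1},
]

def classify(values):
    """Rough heuristic assumed for this sketch: text -> categorical,
    whole-number counts -> discrete, measured floats -> continuous."""
    if all(isinstance(v, str) for v in values):
        return "categorical"
    if all(isinstance(v, int) for v in values):
        return "numerical (discrete)"
    return "numerical (continuous)"

# Univariate: observations on a single attribute.
univariate = [r["span_m"] for r in records]
# Bivariate: two attributes recorded together for each individual.
bivariate = [(r["lanes"], r["span_m"]) for r in records]

# Classify each attribute in the (multivariate) record set.
for name in records[0]:
    print(name, "->", classify([r[name] for r in records]))
```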

• Frequency Distributions and Bar Charts for Categorical Data

➢ A tabular or graphical display can effectively communicate information.
➢ A common way to present categorical data is in the form of a table called a frequency
distribution.
➢ Frequency Distributions for categorical data – a table that displays the possible
categories along with the associated frequencies and/or relative frequencies.
➢ Frequency – for a particular category, the number of times the category appears in
the data set.
➢ Relative Frequency – for a particular category, the fraction or proportion of the
observations resulting in the category.

$$\text{relative frequency} = \frac{\text{frequency}}{\text{total number of observations in the data set}}$$

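
As a minimal sketch of these definitions, the following Python snippet builds a frequency distribution for a small categorical data set. The responses and the `frequency_distribution` helper are hypothetical, chosen only for illustration.

```python
from collections import Counter

# Hypothetical categorical data: preferred structural material of 10 respondents.
responses = ["steel", "concrete", "steel", "timber", "concrete",
             "steel", "steel", "concrete", "timber", "steel"]

def frequency_distribution(data):
    """Return {category: (frequency, relative frequency)}."""
    counts = Counter(data)
    total = len(data)  # total number of observations in the data set
    return {cat: (n, n / total) for cat, n in counts.items()}

table = frequency_distribution(responses)
print(f"{'Category':<10}{'Frequency':>10}{'Rel. freq.':>12}")
for category, (freq, rel) in sorted(table.items()):
    print(f"{category:<10}{freq:>10}{rel:>12.2f}")
```

The relative frequencies always sum to 1, which is a quick check that no observation was dropped or counted twice.
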
• Bar Charts – a graph of the frequency distribution of categorical data.

• When to use: Categorical Data

• How to construct:
➢ Draw a horizontal line and write the category names below it at regularly spaced intervals.
➢ Draw a vertical line and label its scale using either frequency or relative frequency.
➢ Place a rectangular bar above each category label. All bars should have the same
width, and each bar's height is determined by the category’s frequency or relative frequency.

• What to look for: Frequently and infrequently occurring categories.
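
The construction can be approximated in plain text with a short Python sketch. Note one deliberate difference from the steps above: the bars are drawn horizontally (categories down the side) because that is simpler to print. The `text_bar_chart` helper and the data are hypothetical.

```python
from collections import Counter

def text_bar_chart(data, symbol="#"):
    """Return a horizontal text bar chart: one bar per category, with
    bar length equal to the category's frequency."""
    counts = Counter(data)
    if not counts:
        return ""
    width = max(len(str(cat)) for cat in counts)  # align the category labels
    lines = []
    # most_common() lists frequently occurring categories first.
    for cat, n in counts.most_common():
        lines.append(f"{str(cat):<{width}} | {symbol * n} {n}")
    return "\n".join(lines)

# Hypothetical categorical data for illustration.
responses = ["steel", "concrete", "steel", "timber", "concrete",
             "steel", "steel", "concrete", "timber", "steel"]
print(text_bar_chart(responses))
```

Sorting by frequency makes the frequently and infrequently occurring categories easy to spot at a glance.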

• Dotplots for Numerical Data – a dotplot is a simple way to display numerical data
when the data set is reasonably small.

• When to use: Small numerical data sets

• How to construct:
➢ Draw a horizontal line and mark with an appropriate measurement scale.
➢ Locate each value in the data set along the measurement scale and represent it by
a dot. If there are two or more observations with same value, stack the dots
vertically.

• What to look for:


➢ A representative or typical value in the data set.
➢ The extent of the spread of the data.
➢ The nature of the distribution of values along the number line.
➢ The presence of unusual values in the data set.
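
The dotplot construction can be sketched in plain text as well. The `text_dotplot` helper below is hypothetical and assumes a small set of integer observations, so that every value on the scale gets its own column. The defect counts are made up for illustration.

```python
from collections import Counter

def text_dotplot(values):
    """Return a text dotplot for a small set of integer observations:
    one dot per observation, stacked vertically over its value."""
    counts = Counter(values)
    scale = range(min(values), max(values) + 1)   # the measurement scale
    cell = max(len(str(v)) for v in scale) + 1    # column width per value
    rows = []
    # Build rows from the top of the tallest stack down to the axis.
    for level in range(max(counts.values()), 0, -1):
        row = "".join(("." if counts[v] >= level else " ").rjust(cell)
                      for v in scale)
        rows.append(row.rstrip())
    rows.append("".join(str(v).rjust(cell) for v in scale))  # axis labels
    return "\n".join(rows)

# Hypothetical data: number of surface defects found on 12 inspected panels.
defects = [2, 3, 3, 4, 4, 4, 5, 5, 3, 4, 9, 4]
print(text_dotplot(defects))
```

In the printed plot, the tallest stack marks a typical value, the width of the occupied scale shows the spread, and the isolated dot at 9 stands out as an unusual value.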
