Applied Statistics - Chapter 1: Data and Statistics
1. Course Description
Prerequisite: None
Number of credits: 4
Language of instruction: English
Requirement: a laptop with statistical software (SPSS, Excel) installed.
2. Course Objectives
The main goal of this course is to give students a basic knowledge
of statistics and data presentation.
By completing this course, the student will:
Have a basic understanding of statistics in doing data analysis using SPSS/Excel.
Acquire skills of manipulating and performing accurate calculations on data.
Identify the various types of data and describe these using appropriate statistics.
Analyze data and interpret the output of statistical models to answer questions and solve
problems.
Assessment Goals Due Date Weight
Textbook
Anderson, David R., Dennis J. Sweeney, Thomas A. Williams,
Jeffrey D. Camm & James J. Cochran (2014), Statistics for
Business and Economics, 12th edition, South-Western Cengage
Learning, USA.
Software: Microsoft Excel with the Data Analysis add-in.
References
Newbold, Paul, William L. Carlson & Betty M. Thorne (2013),
Statistics for Business and Economics, 8th edition, Pearson
Education, USA.
Hoàng Trọng and Chu Nguyễn Mộng Ngọc (2011), Thống kê
ứng dụng trong kinh tế - xã hội [Applied Statistics in
Economics and Society], Lao động - Xã hội Publishing House.
Part 1
• Chapter 1: Data and Statistics
• Chapter 2: Descriptive Statistics: Tabular and Graphical Presentation
• Chapter 3: Descriptive Statistics: Numerical Measures
APPLIED STATISTICS FOR
ECONOMICS & BUSINESS
Trang, Ha Thi Thu (Ph.D)
Department of Business Administration
School of Economics and Management (SEM)
Hanoi University of Science and Technology (HUST)
1.1. Applications in Business and Economics
1.2. Data
Accounting
Public accounting firms use statistical sampling
procedures when conducting audits for their clients.
Finance
Financial advisors use a variety of statistical information,
including price-earnings ratios and dividend yields, to
guide their investment recommendations.
Marketing
Electronic point-of-sale scanners at retail checkout
counters are being used to collect data for a variety of
marketing research applications.
Production
A variety of statistical quality control charts are used to
monitor the output of a production process.
Economics
Economists use statistical information in making forecasts
about the future of the economy or some aspect of it.
Elements, Variables, and Observations
Scales of Measurement
The word “data” comes from the Latin verb “dare” (“to give”).
Data are those pieces of information that any particular situation
gives to an observer.
Data are measurements or observations that are collected as a source
of information.
Data are the facts and figures that are collected, summarized,
analyzed, and interpreted.
The data collected in a particular study are referred to as the data
set.
The elements are the entities on
which data are collected.
A variable is a characteristic of
interest for the elements.
The set of measurements collected
for a particular element is called an
observation.
The total number of data values in a
data set is the number of elements
multiplied by the number of
variables.
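The counting rule above can be checked with a minimal Python sketch. The three-student data set and its variable names are hypothetical, invented only for illustration; they do not come from the course material.

```python
# Hypothetical data set: each dict is one observation (the measurements
# for one element); each key is a variable. All values are invented.
data_set = [
    {"student": "A", "school": "Business", "credit_hours": 36},
    {"student": "B", "school": "Humanities", "credit_hours": 72},
    {"student": "C", "school": "Education", "credit_hours": 54},
]

n_elements = len(data_set)        # number of elements (rows)
n_variables = len(data_set[0])    # number of variables (columns)

# Count every individual data value directly, then compare with the rule.
n_values = sum(len(observation) for observation in data_set)

print(n_elements, n_variables, n_values)
assert n_values == n_elements * n_variables
```

Here each row is one observation, so the number of observations always equals the number of elements.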
Scales of measurement include:
Nominal
Ordinal
Interval
Ratio
The scale determines the amount of information contained in the
data.
The scale indicates the data summarization and statistical analyses
that are most appropriate.
Nominal
Data are labels or names used to identify an attribute of the element.
A nonnumeric label or a numeric code may be used.
Example:
o Students at a university are classified by the school in which they are
enrolled using a nonnumeric label such as Business, Humanities,
Education, and so on.
o Alternatively, a numeric code could be used for the school variable:
• 1 denotes Business,
• 2 denotes Humanities,
• 3 denotes Education, and so on
Ordinal
The data have the properties of nominal data and the order or rank of the
data is meaningful.
A nonnumeric label or a numeric code may be used.
Example:
o Students of a university are classified by their class standing using a
nonnumeric label such as Freshman, Sophomore, Junior, or Senior.
o Alternatively, a numeric code could be used for the class standing
variable:
• 1 denotes Freshman,
• 2 denotes Sophomore, and so on.
Interval
The data have the properties of ordinal data and the interval between
observations is expressed in terms of a fixed unit of measure.
Interval data are always numeric.
Example:
o Melissa has an SAT score of 1205, while Kevin has an SAT score of 1090.
Melissa scored 115 points more than Kevin.
Ratio
The data have all the properties of interval data and the ratio of two
values is meaningful.
Variables such as distance, height, weight, and time use the ratio scale.
This scale must contain a zero value that indicates that nothing exists for
the variable at the zero point.
Example:
o Melissa’s college record shows 36 credit hours earned, while Kevin’s
record shows 72 credit hours earned. Kevin has twice as many credit
hours earned as Melissa.
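The four scales differ in which comparisons make sense. The following Python sketch reuses the slide examples (school codes, class standing, SAT scores, credit hours); the variable names are assumptions chosen for illustration.

```python
# Nominal: codes are only labels, so equality is the only meaningful check.
school_code = {1: "Business", 2: "Humanities", 3: "Education"}
same_school = school_code[1] == school_code[2]          # False

# Ordinal: labels carry a rank, so ordering comparisons are meaningful.
class_standing = ["Freshman", "Sophomore", "Junior", "Senior"]
rank = {label: position + 1 for position, label in enumerate(class_standing)}
senior_after_junior = rank["Senior"] > rank["Junior"]   # True

# Interval: differences are meaningful (fixed unit of measure).
melissa_sat, kevin_sat = 1205, 1090
sat_difference = melissa_sat - kevin_sat                # 115 points

# Ratio: a true zero exists, so ratios are meaningful as well.
melissa_credits, kevin_credits = 36, 72
credit_ratio = kevin_credits / melissa_credits          # 2.0, i.e. twice as many

print(same_school, senior_after_junior, sat_difference, credit_ratio)
```

Averaging the nominal school codes (for example, the mean of 1, 2, and 3) would run without error but would have no interpretation, which is why the scale must guide the choice of analysis.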
Numerical data
  age   income
  55    75 000
  42    68 000
  .     .

Nominal data
  person   married
  1        yes
  2        no
  3        no
  .        .

  computer brand
  1   IBM
  2   Dell
  3   Compaq
  4   IBM
  .   .

Ordinal data
  exam grade: HD, D, C, P, F
  Food quality: Excellent, Good, Satisfactory, Poor

With nominal data, all we can calculate is the proportion of data that
falls into each category.

  IBM    Dell   Compaq   other   total
  25     11     8        6       50
  50%    22%    16%      12%     100%

With ordinal data, all we can use is computations involving the ordering
process.
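The proportions for the computer brand example (25 IBM, 11 Dell, 8 Compaq, 6 other, 50 in total) can be reproduced with a short Python sketch. The counts are taken from the slide; the variable names are assumptions for illustration.

```python
# Brand counts from the slide's frequency table.
brand_counts = {"IBM": 25, "Dell": 11, "Compaq": 8, "Other": 6}

total = sum(brand_counts.values())

# Percent frequency of each category: 100 * count / total.
percent = {brand: 100 * count / total for brand, count in brand_counts.items()}

for brand, count in brand_counts.items():
    print(f"{brand:<8}{count:>4}{percent[brand]:>7.0f}%")
print(f"{'Total':<8}{total:>4}{sum(percent.values()):>7.0f}%")
```

The same calculation applies to any nominal variable, because it relies only on counting how many observations fall into each category.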
Which type of data is it?
Sex (Male/Female/Gay/Lesbian)
Data can be further classified as being qualitative or quantitative.
The statistical analysis that is appropriate depends on whether the
data for the variable are qualitative or quantitative.
In general, there are more alternatives for statistical analysis when
the data are quantitative.
Qualitative data are labels or names used to identify an attribute of
each element.
Qualitative data use either the nominal or ordinal scale of
measurement.
Qualitative data can be either numeric or nonnumeric.
The statistical analyses available for qualitative data are rather limited.
Quantitative data indicate either how many or how much.
Quantitative data that measure how many are discrete.
Quantitative data that measure how much are continuous because
there is no separation between the possible values for the data.
Quantitative data are always numeric.
Ordinary arithmetic operations are meaningful only with quantitative
data.
Cross-sectional data are collected at the same or approximately the
same point in time.
Time series data are collected over several time periods.
Existing Sources
Data needed for a particular application might already exist within a firm. Detailed
information is often kept on customers, suppliers, and employees, for example.
Substantial amounts of business and economic data are available from organizations that
specialize in collecting and maintaining data.
Government agencies are another important source of data.
Data are also available from a variety of industry associations and special-interest
organizations.
Internet
The Internet has become an important source of data.
Most government agencies, like the Bureau of the Census (www.census.gov), make their
data available through a web site.
More and more companies are creating web sites and providing public access to them.
A number of companies now specialize in making information available over the Internet.
Statistical Studies
Statistical studies can be classified as either experimental or observational.
In experimental studies, the variables of interest are first identified. Then one or
more factors are controlled so that data can be obtained about how the factors
influence the variables.
In observational (nonexperimental) studies, no attempt is made to control or
influence the variables of interest.
A survey is perhaps the most common type of observational study.
DATA ACQUISITION CONSIDERATIONS
Time Requirement
Cost of Acquisition
• Organizations often charge for information even when it is not their
primary business activity.
Data Errors
Statistics is defined as the art and science of collecting, analyzing,
presenting, and interpreting data.
In this course, I emphasize the use of statistics for business data
analysis.
Two major branches:
Descriptive statistics
Inferential statistics.
Descriptive statistics are the tabular, graphical, and numerical methods used to
summarize data.
The manager of Hudson Auto would like to
have a better understanding of the cost of
parts used in the engine tune-ups performed
in the shop.
She examines 50 customer invoices for
tune-ups. The costs of parts are listed below:
Tabular Summary (Frequencies and Percent Frequencies)
Graphical Summary (Histogram)
Numerical Descriptive Statistics
The most common numerical descriptive statistic is the average (or
mean).
Hudson’s average cost of parts, based on the 50 tune-ups studied, is
$79 (found by summing the 50 cost values and then dividing by 50).
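The averaging step described above (sum the values, then divide by how many there are) can be sketched in Python. The five costs below are hypothetical stand-ins chosen for illustration; they are not the 50 Hudson Auto invoices, which are not reproduced here.

```python
from statistics import mean

# Hypothetical parts costs in dollars (NOT the actual Hudson Auto data).
parts_costs = [62, 75, 79, 88, 91]

# Mean by definition: sum of the values divided by the number of values.
manual_mean = sum(parts_costs) / len(parts_costs)

# The standard library gives the same result.
library_mean = mean(parts_costs)

print(manual_mean, library_mean)
```

With the full data set, the same two lines would be applied to a list of 50 cost values instead of 5.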
Statistical inference is the
process of using data
obtained from a small
group of elements (the
sample) to make estimates
and test hypotheses about
the characteristics of a
larger group of elements
(the population).
A population is the collection of all
outcomes, responses, measurements, or
counts that are of interest.
A sample is a subset, or part, of a
population.
To collect unbiased data, a researcher
must ensure that the sample is
representative of the population.
Example:
In a recent survey, 1500 adults in the United States were asked if they thought there
was solid evidence of global warming. 855 of the adults said yes.
Identify the population and the sample.
Solution:
Population: the responses of all adults in the
United States
Sample: the responses of the 1500 adults in the
United States in the survey. The sample data set
consists of 855 yes’s and 645 no’s.
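As a small illustration of how this sample could feed statistical inference, the Python sketch below computes the sample proportion of "yes" responses. Treating that proportion as an estimate of the population proportion is an assumption added here for illustration; the example itself only asks to identify the population and the sample.

```python
# Survey figures from the example.
sample_size = 1500
yes_count = 855

# The remaining respondents answered "no".
no_count = sample_size - yes_count

# Sample proportion of "yes" responses, a natural estimate of the
# (unknown) proportion among all adults in the United States.
yes_proportion = yes_count / sample_size

print(no_count, f"{yes_proportion:.2%}")
```

The result describes only the 1500 sampled adults; how close it is to the population value depends on how representative the sample is.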
Process of Statistical Inference
1. Population consists of all tune-ups. Average cost of parts is
unknown.
2. A sample of 50 engine tune-ups is examined.