01 Data & Statistics
W. Rofianto
1. Data & Statistics
2. Scales of Measurement
3. Summarizing Data for a Categorical Variable
4. Summarizing Quantitative Data
What is Statistics?
Nominal scale
Example
Students of a university are classified by the school in which they are
enrolled using a nonnumeric label such as Business, Humanities,
Education, and so on.
Alternatively, a numeric code could be used for the school variable
(e.g. 1 denotes Business, 2 denotes Humanities, 3 denotes
Education, and so on).
Ordinal scale
• The data have the properties of nominal data and the order or
rank of the data is meaningful.
• A nonnumeric label or numeric code may be used.
Example
Students of a university are classified by their class standing using a
nonnumeric label such as Freshman, Sophomore, Junior, or Senior.
Alternatively, a numeric code could be used for the class standing
variable (e.g. 1 denotes Freshman, 2 denotes Sophomore, and so
on).
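The coding in this example can be sketched in Python. This is a minimal illustration, assuming the codes continue as 3 for Junior and 4 for Senior; the student list is hypothetical. Because the scale is ordinal, sorting by the code gives a meaningful ranking.

```python
# Numeric codes for class standing (ordinal scale: the order is meaningful).
# Codes 3 and 4 are assumed to continue the "and so on" in the example.
CLASS_STANDING_CODES = {"Freshman": 1, "Sophomore": 2, "Junior": 3, "Senior": 4}

# Hypothetical observations, one label per student.
students = ["Junior", "Freshman", "Senior", "Sophomore"]

# Sorting by code ranks the students from lowest to highest class standing.
ranked = sorted(students, key=CLASS_STANDING_CODES.get)
print(ranked)
```

The same sort applied to nominal codes (such as the school codes) would run, but the resulting order would carry no meaning.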
Interval scale
• The data have the properties of ordinal data, and the interval
between observations is expressed in terms of a fixed unit of
measure.
• Interval data are always numeric.
Example
Melissa has a TOEFL score of 550, while Kevin has a TOEFL score
of 500. Melissa scored 50 points more than Kevin.
SD (elementary school), SMP (junior high school), SMA (senior high school)
Ratio scale
Example:
The price of a book at a retail store is $200, while the price of the
same book sold online is $100. The ratio property shows that the
retail store charges twice the online price.
Book A $200
Book B $100
TOEFL A 600
TOEFL B 400
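The four figures above can be compared in a short Python sketch. It assumes, as in the examples, that prices are ratio data and TOEFL scores are interval data; the variable names are illustrative. Differences are meaningful on both scales, while a ratio is interpreted only for the prices.

```python
# Prices in dollars (ratio scale) and TOEFL scores (interval scale).
book_a, book_b = 200, 100
toefl_a, toefl_b = 600, 400

# Differences are meaningful for both interval and ratio data.
price_difference = book_a - book_b      # dollars
score_difference = toefl_a - toefl_b    # points

# A ratio is meaningful only for ratio data.
price_ratio = book_a / book_b

print(f"Book A costs ${price_difference} more, {price_ratio:.0f} times Book B")
print(f"TOEFL A is {score_difference} points higher than TOEFL B")
# No ratio is computed for the TOEFL scores: under the interval scale,
# only the size of the gap between them is interpreted.
```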
Categorical and Quantitative Data
• Data can be further classified as being categorical or quantitative.
• The statistical analysis that is appropriate depends on whether the data for the
variable are categorical or quantitative.
• In general, there are more alternatives for statistical analysis when the data are
quantitative.
Categorical Data
• Labels or names are used to identify an attribute of each element
• Often referred to as qualitative data
• Use either the nominal or ordinal scale of measurement
• Can be either numeric or nonnumeric
• Appropriate statistical analyses are rather limited
Quantitative Data
• Quantitative data indicate how many or how much.
• Quantitative data are always numeric.
• Ordinary arithmetic operations are meaningful for quantitative data.
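A minimal Python sketch of how the type of data guides the analysis. The two lists are hypothetical and only reuse the variable types from the earlier examples: counting suits the categorical variable, while arithmetic such as a mean suits the quantitative one.

```python
from collections import Counter
from statistics import mean

# Hypothetical observations for five students.
schools = ["Business", "Humanities", "Business", "Education", "Business"]  # categorical
scores = [550, 500, 600, 400, 450]                                        # quantitative

# Categorical data: summarize by counting how often each label occurs.
school_counts = Counter(schools)

# Quantitative data: ordinary arithmetic, such as an average, is meaningful.
average_score = mean(scores)

print(school_counts)
print(average_score)
```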
Scales of Measurement
Cross-Sectional vs Time Series Data
Cross-sectional data are collected at the same or approximately the same
point in time.
Example
Data detailing the number of building permits issued in November 2018 in each of
the cities of Indonesia.
Statistical Studies
• Observational or Survey
• Experiment
Descriptive Statistics
• Most of the statistical information in newspapers, magazines, company
reports, and other publications consists of data that are summarized and
presented in a form that is easy to understand.
• Such summaries of data, which may be tabular, graphical, or numerical,
are referred to as descriptive statistics.
Example
The manager of Hudson Auto would like to have a better understanding of the
cost of parts used in the engine tune-ups performed in her shop. She examines 50
customer invoices for tune-ups. The costs of parts, rounded to the nearest dollar,
are listed below.
Frequency and Percent Frequency
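A frequency and percent frequency distribution could be tabulated as in the Python sketch below. The ten cost values and the class limits are made up for illustration and are not the Hudson Auto sample; `frequency_distribution` is a hypothetical helper name.

```python
def frequency_distribution(values, class_limits):
    """Return one (lower, upper, frequency, percent) row per class.

    Class limits are treated as inclusive on both ends.
    """
    n = len(values)
    rows = []
    for lower, upper in class_limits:
        frequency = sum(1 for v in values if lower <= v <= upper)
        rows.append((lower, upper, frequency, 100 * frequency / n))
    return rows

# Made-up parts costs in dollars, rounded to the nearest dollar.
costs = [52, 67, 71, 58, 79, 62, 88, 74, 91, 105]
classes = [(50, 59), (60, 69), (70, 79), (80, 89), (90, 99), (100, 109)]

rows = frequency_distribution(costs, classes)
for lower, upper, frequency, percent in rows:
    print(f"{lower}-{upper}: frequency {frequency}, percent {percent:.0f}")
```

The frequencies sum to the number of observations and the percent frequencies sum to 100.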
Class Midpoint
• In some cases, we want to know the midpoints of the classes in a frequency
distribution for quantitative data.
• The class midpoint is the value halfway between the lower and upper class
limits.
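As a small Python sketch, the midpoint is the average of the two class limits. The class 50-59 is an assumed example class and `class_midpoint` is an illustrative name.

```python
def class_midpoint(lower_limit, upper_limit):
    """Value halfway between the lower and upper class limits."""
    return (lower_limit + upper_limit) / 2

# Assumed example class with limits 50 and 59.
midpoint = class_midpoint(50, 59)
print(midpoint)  # 54.5
```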
Example: Hudson Auto Repair
Sample of Parts Cost ($) for 50 Tune-ups
• The last entry in a cumulative frequency distribution always equals the total
number of observations.
• The last entry in a cumulative relative frequency distribution always equals 1.
• The last entry in a cumulative percent frequency distribution always equals 100.
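The three properties above can be checked with a short Python sketch. The class frequencies are hypothetical; any frequency distribution would behave the same way.

```python
from itertools import accumulate

# Hypothetical class frequencies, listed from the lowest class to the highest.
frequencies = [2, 2, 3, 1, 1, 1]
n = sum(frequencies)  # total number of observations

# Running totals give the cumulative frequency distribution.
cumulative = list(accumulate(frequencies))
cumulative_relative = [c / n for c in cumulative]
cumulative_percent = [100 * c / n for c in cumulative]

print(cumulative)            # last entry equals n
print(cumulative_relative)   # last entry equals 1
print(cumulative_percent)    # last entry equals 100
```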
Ogive
In statistics, an ogive is a free-hand graph showing the curve of a cumulative
distribution function. The points plotted are the upper class limit and the
corresponding cumulative frequency.
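The points of an ogive could be prepared as in this Python sketch, which pairs each upper class limit with its cumulative frequency. The limits and frequencies are hypothetical, and `ogive_points` is an illustrative helper; drawing the curve through the points is left to any plotting tool.

```python
from itertools import accumulate

def ogive_points(upper_limits, frequencies):
    """Pair each upper class limit with its cumulative frequency."""
    return list(zip(upper_limits, accumulate(frequencies)))

# Hypothetical classes 50-59, 60-69, ..., 100-109 and their frequencies.
upper_limits = [59, 69, 79, 89, 99, 109]
frequencies = [2, 2, 3, 1, 1, 1]

points = ogive_points(upper_limits, frequencies)
for upper_limit, cumulative_frequency in points:
    print(upper_limit, cumulative_frequency)
```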
Scatter Diagram and Trendline
• A scatter diagram is a graphical presentation of the relationship between
two quantitative variables.
• One variable is shown on the horizontal axis and the other variable is
shown on the vertical axis.
• The general pattern of the plotted points suggests the overall
relationship between the variables.
• A trendline provides an approximation of the relationship.
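One common way to obtain a trendline is an ordinary least-squares line. The slides do not say which method is used, so the Python sketch below is an assumption, and the five (x, y) pairs are made up for illustration.

```python
def fit_trendline(x, y):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up data: x on the horizontal axis, y on the vertical axis.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

slope, intercept = fit_trendline(x, y)
print(f"trendline: y = {intercept:.1f} + {slope:.1f}x")
```

A positive slope suggests a positive relationship between the two variables, a negative slope a negative one, and a slope near zero no apparent linear relationship.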