
Lecture 2 - Introduction to Statistics

This document provides an introduction to statistics. It (1) defines statistics as the process of collecting data, presenting data through charts and tables, and characterizing data using measures such as the average; (2) describes the types of data that can be collected, including qualitative, quantitative discrete, and quantitative continuous data; and (3) explains the nominal, ordinal, interval, and ratio measurement scales, with examples of each.

Uploaded by

Ekta Agrawal
Copyright
© All Rights Reserved

Introduction To Statistics

“There are three kinds of lies: lies, damned lies, and statistics.”
(B. Disraeli)
Why study statistics?

1. Data are everywhere.
2. Statistical techniques are used to make many decisions that affect our lives.
3. No matter what your career, you will make professional decisions that involve data. An understanding of statistical methods will help you make these decisions effectively.
What Is Statistics?
1. Collecting Data
e.g., Survey
2. Presenting Data
e.g., Charts & Tables
3. Characterizing Data
e.g., Average
Why? Data Analysis → Decision-Making
© 1984-1994 T/Maker Co.


Collecting Data
Where does data come from?
• Surveys
  • Response Rate
  • Stratification/Clusters
  • Reporting Error/Measurement Error
• Administrative Records
  • Lots of different places
  • Often kept in real time (so this addresses “reporting” or “recollection” errors)
  • May be missing, and that might not be random…
• Researchers (and you!)
  • Often collected for a specific project, so check carefully what it contains
  • More “unique” with different types of data (e.g. content analysis)
Who Collects Data
• Government
  • Official Statistics: Unemployment, GDP, etc.
  • Surveys: Labor Force, Consumption, etc.
  • Records: Justice System, Social Programs
• Service providers
  • Often this may be administrative (e.g. hospital records)
  • Sometimes, internal surveys or evaluations, which can be useful if you can get them
• Third Parties
  • Critical for places with limited capacity (e.g. World Bank is a big source of this for developing countries)
  • University or Survey Research Programs
  • Newspapers and Media sources compile LOTS of things
Obtaining Data

1. Data from a published source
2. Data from a designed experiment
3. Data from a survey
4. Data collected observationally
Obtaining Data
Published source:
book, journal, newspaper, Web site
Designed experiment:
researcher exerts strict control over units
Survey:
a group of people is surveyed and their responses are recorded
Observational study:
units are observed in their natural setting and variables of interest are recorded
Statistical data
The collection of data that are relevant to the problem being studied
is commonly the most difficult, expensive, and time-consuming part
of the entire research project.
Statistical data are usually obtained by counting or measuring items.
Primary data are collected specifically for the analysis desired.
Secondary data have already been compiled and are available for statistical analysis.
A variable is an item of interest that can take on many different
numerical values.
A constant has a fixed numerical value.
Data
Statistical data are usually obtained by counting or measuring items.
Most data can be put into the following categories:
• Qualitative - data are measurements that each fall into one of several categories (hair color, ethnic groups and other attributes of the population).
• Quantitative - data are observations that are measured on a numerical scale (distance traveled to college, number of children in a family, etc.).
Quantitative data
Quantitative data are always numbers and are the
result of counting or measuring attributes of a population.
Quantitative data can be separated into two
subgroups:
• discrete, if it is the result of counting (the number of students of a given ethnic group in a class, the number of books on a shelf, ...)
• continuous, if it is the result of measuring (distance traveled, weight of luggage, …)
Quantitative Data
Measured on a numeric scale.
• Number of defective items in a lot.
• Salaries of CEOs of oil companies.
• Ages of employees at a company.
Qualitative data
Qualitative data are generally described by words or
letters. They are not as widely used as quantitative data
because many numerical techniques do not apply to the
qualitative data. For example, it does not make sense to
find an average hair color or blood type.
Qualitative data can be separated into two subgroups:
Dichotomic, if it takes the form of a word with two options (gender - male or female).
Polynomic, if it takes the form of a word with more than two options (education - primary school, secondary school and university).
Qualitative Data
Classified into categories.
• College major of each student in a class.
• Gender of each employee at a company.
• Method of payment (cash, check, credit card).

© 2011 Pearson Education, Inc


Types of variables
Variables
• Qualitative
  • Dichotomic: gender, marital status
  • Polynomic: brand of PC, hair color
• Quantitative
  • Discrete: children in family, strokes on a golf hole
  • Continuous: amount of income tax paid, weight of a student
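The classification above can be sketched in Python. This is only a rough heuristic, not a rule from the lecture: the function name `classify_variable` and all example values are hypothetical, and in practice the variable's meaning (not just its stored type) decides the category.

```python
def classify_variable(values):
    """Guess a variable's type from its observed values (rough heuristic)."""
    distinct = set(values)
    # Text labels (or True/False flags) are treated as qualitative data.
    if all(isinstance(v, (str, bool)) for v in values):
        return "dichotomic" if len(distinct) <= 2 else "polynomic"
    # Whole-number counts are treated as discrete quantitative data.
    if all(isinstance(v, int) for v in values):
        return "discrete"
    # Remaining numeric values are treated as continuous measurements.
    return "continuous"


# Hypothetical observations for four of the variables named in the diagram.
examples = {
    "gender": ["male", "female", "female", "male"],
    "hair color": ["black", "brown", "blond", "brown"],
    "children in family": [0, 2, 1, 3],
    "weight of a student (kg)": [61.5, 72.0, 58.3, 80.1],
}

for name, values in examples.items():
    print(f"{name}: {classify_variable(values)}")
```

The heuristic would misclassify, for example, a nominal variable stored as integer codes, which is why the measurement scale has to be recorded alongside the data.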
Our Data Looks like: (Cross Sectional)
ID Income Race Sex Education
1 y1 x11 x21 x31
2 y2 x12 x22 x32
3 y3 x13 x23 x33
4 y4 x14 x24 x34
5 y5 x15 x25 x35
Our Data Example
• N=5
• k=3
We can index our individuals by ID.
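As a minimal sketch, the cross-sectional layout can be written out in Python using the slide's own placeholders as cell values. It assumes that k counts the explanatory variables (Race, Sex, Education) and that Income is the outcome y; the names `rows`, `predictors`, and `by_id` are illustrative only.

```python
# Each row is one individual; cell entries are the slide's placeholders.
columns = ["ID", "Income", "Race", "Sex", "Education"]
rows = [
    [1, "y1", "x11", "x21", "x31"],
    [2, "y2", "x12", "x22", "x32"],
    [3, "y3", "x13", "x23", "x33"],
    [4, "y4", "x14", "x24", "x34"],
    [5, "y5", "x15", "x25", "x35"],
]

outcome = "Income"  # assumed to be the y variable
predictors = [c for c in columns if c not in ("ID", outcome)]

N = len(rows)        # number of individuals
k = len(predictors)  # number of x variables (assumption about what k means)

# Index the individuals by ID so a single observation can be looked up.
by_id = {row[0]: dict(zip(columns[1:], row[1:])) for row in rows}

print(N, k)             # 5 3
print(by_id[3]["Sex"])  # x23
```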
Now to time series data
Data indexed by time instead of individual

Year   Income   Inflation   Growth   Unempl
2000   y1       x11         x21      x31
2001   y2       x12         x22      x32
2002   y3       x13         x23      x33
2003   y4       x14         x24      x34
2004   y5       x15         x25      x35


Data, Data Sets, Elements, Variables, and Observations

Company         Annual Sales ($M)   Earn/Share ($)
Dataram          73.10              0.86
EnergySouth      74.00              1.67
Keystone        365.70              0.86
LandCare        111.40              0.33
Psychemedics     17.60              0.13

Each company name is an element, each column is a variable, each row is an observation, and the whole table is the data set.
Data and Data Sets

• Data are the facts and figures collected, summarized, analyzed, and interpreted.
• The data collected in a particular study are referred to as the data set.
Elements, Variables, and Observations
The elements are the entities on which data are collected.

A variable is a characteristic of interest for the elements.

The set of measurements collected for a particular element is called an observation.

The total number of data values in a data set is the number of elements multiplied by the number of variables.
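A short Python sketch can check this count against the company table shown earlier. Only the two variables visible in that table are used, and the dictionary layout and key names are illustrative assumptions.

```python
# Elements (companies) mapped to their observations (one value per variable).
data_set = {
    "Dataram":      {"annual_sales_musd": 73.10,  "earn_per_share_usd": 0.86},
    "EnergySouth":  {"annual_sales_musd": 74.00,  "earn_per_share_usd": 1.67},
    "Keystone":     {"annual_sales_musd": 365.70, "earn_per_share_usd": 0.86},
    "LandCare":     {"annual_sales_musd": 111.40, "earn_per_share_usd": 0.33},
    "Psychemedics": {"annual_sales_musd": 17.60,  "earn_per_share_usd": 0.13},
}

n_elements = len(data_set)
n_variables = len(next(iter(data_set.values())))

# Count every stored data value directly and compare with the product.
n_values = sum(len(observation) for observation in data_set.values())

print(n_elements, n_variables, n_values)  # 5 2 10
```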
Measurement and Measurement Scales

• Measurement is the foundation of any scientific investigation
• Everything we do begins with the measurement of whatever it is we want to study
• Definition: measurement is the assignment of numbers to objects
Four Types of Measurement Scales
Nominal
Ordinal
Interval
Ratio
• The scales are distinguished on the relationships
assumed to exist between objects having different
scale values
• The four scale types are ordered in that all later
scales have all the properties of earlier scales—
plus additional properties
Nominal Scale
• Not really a ‘scale’ because it does not scale
objects along any dimension
• It simply labels objects

Gender is a nominal scale

Male = 1
Female = 2

Religious Affiliation

Catholic = 1
Protestant = 2
Jewish = 3
Muslim = 4
Other = 5

Categorical data are measured on nominal scales, which merely assign labels to distinguish categories.
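To illustrate why nominal codes are only labels, the following Python sketch compares two codings of religious affiliation. The first coding is the one listed above; the second coding and the list of responses are hypothetical. Counting categories gives the same answer under both codings, while an "average code" changes with the arbitrary labels and therefore carries no meaning.

```python
from collections import Counter
from statistics import mean

coding_a = {"Catholic": 1, "Protestant": 2, "Jewish": 3, "Muslim": 4, "Other": 5}
# Hypothetical relabelling: same categories, different arbitrary numbers.
coding_b = {"Catholic": 5, "Protestant": 1, "Jewish": 4, "Muslim": 2, "Other": 3}

# Hypothetical survey responses.
responses = ["Catholic", "Muslim", "Protestant", "Catholic",
             "Other", "Jewish", "Catholic"]

# Frequencies (and the mode) do not depend on the numeric labels.
counts = Counter(responses)
print(counts.most_common(1))  # [('Catholic', 3)]

# The mean of the codes depends entirely on which labelling was chosen.
mean_a = mean(coding_a[r] for r in responses)
mean_b = mean(coding_b[r] for r in responses)
print(mean_a == mean_b)  # False
```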
What about symptoms of depression from a psychiatric assessment?

None = 0
Mild = 1
Moderate = 2
Severe = 3
Ordinal Scale
• Numbers are used to place objects in order
• But, there is no information regarding the differences (intervals) between points on the scale
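The point can be sketched in Python with the depression codes above. The second coding and the patient list are hypothetical; any order-preserving recoding ranks patients identically, but the gaps between codes change, so those gaps carry no information.

```python
severity = {"None": 0, "Mild": 1, "Moderate": 2, "Severe": 3}
# Hypothetical alternative coding that keeps the same order.
recoded = {"None": 0, "Mild": 1, "Moderate": 5, "Severe": 20}

# Hypothetical assessment results.
patients = ["Mild", "Severe", "None", "Moderate", "Mild"]

# Ordering is meaningful: both codings rank the patients the same way.
order_original = sorted(patients, key=severity.get)
order_recoded = sorted(patients, key=recoded.get)
same_order = order_original == order_recoded
print(same_order)  # True

# Differences are not meaningful: the Moderate-to-Severe gap is arbitrary.
gap_original = severity["Severe"] - severity["Moderate"]
gap_recoded = recoded["Severe"] - recoded["Moderate"]
print(gap_original, gap_recoded)  # 1 15
```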
Interval Scale
• An interval scale is a scale on which equal intervals between objects represent equal differences
• The interval differences are meaningful
• But, we can’t defend ratio relationships


Fahrenheit Scale
• Interval relationships are meaningful
• A 10-degree difference has the same meaning
anywhere along the scale
• For example, the difference between 10 and 20
degrees is the same as between 80 and 90 degrees
• But, we can’t say that 80 degrees is twice as hot as
40 degrees
• There is no ‘true’ zero, only an ‘arbitrary’ zero
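A quick Python check makes the arbitrary zero visible. Converting the same temperatures to Celsius (a standard conversion, not part of the slides) keeps equal differences equal, but the "twice as hot" ratio does not survive the change of scale.

```python
from math import isclose


def f_to_c(f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (f - 32) * 5 / 9


# Equal Fahrenheit differences remain equal after conversion.
diff_low = f_to_c(20) - f_to_c(10)
diff_high = f_to_c(90) - f_to_c(80)
print(isclose(diff_low, diff_high))  # True

# The ratio 80/40 depends on where zero happens to sit on the scale.
ratio_f = 80 / 40
ratio_c = f_to_c(80) / f_to_c(40)
print(ratio_f, round(ratio_c, 6))  # 2.0 versus about 6.0
```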
Ratio Scale
• Has a true zero point
• Ratios are meaningful
• Physical scales of time, length and volume are ratio scales
• We can say that 20 seconds is twice as long as 10 seconds
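As a small sketch of what a true zero buys, the Python lines below express the same two durations in seconds and in milliseconds. Because zero means "no time" in both units, the ratio is unchanged by the change of unit.

```python
durations_s = (10, 20)
durations_ms = tuple(s * 1000 for s in durations_s)

# The ratio is the same in either unit because both share a true zero.
ratio_s = durations_s[1] / durations_s[0]
ratio_ms = durations_ms[1] / durations_ms[0]
print(ratio_s, ratio_ms)  # 2.0 2.0
```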
What are Nominal, Ordinal, Interval and Ratio Scales?

Nominal, ordinal, interval, and ratio are the four fundamental levels of measurement. They are commonly used to capture data in surveys and questionnaires, where each scale can be posed as a multiple-choice question.
Scales of Measurement

Data
• Qualitative
  • Numerical: Nominal, Ordinal
  • Non-numerical: Nominal, Ordinal
• Quantitative
  • Numerical: Interval, Ratio
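Data Universe: pictorial representation of the different classifications of data
[Figure: data classified by measurement scale (nominal, ordinal, interval, ratio), by type (quantitative, qualitative), by number of variables (univariate, bivariate), and by form (discrete, continuous)]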
Statistics: Two Processes
Describing sets of data

and

Drawing conclusions (making estimates, decisions, predictions, etc.) about sets of data based on sampling
Statistical Methods
• Descriptive Statistics
• Inferential Statistics


Types of statistics
• Descriptive statistics – Methods of organizing, summarizing, and
presenting data in an informative way
• Inferential statistics – The methods used to determine something
about a population on the basis of a sample
• Population – The entire set of individuals or objects of interest or the measurements obtained from all individuals or objects of interest
• Sample – A portion, or part, of the population of interest
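The relationship between a population and a sample can be sketched in Python. Everything here is hypothetical: the population is simulated (1,000 weights in kg), the sample size of 50 is arbitrary, and the seed is fixed only so the sketch is repeatable.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so repeated runs give the same draw

# Hypothetical population: weights (kg) of 1,000 individuals.
population = [rng.gauss(70, 10) for _ in range(1000)]

# A sample is a portion of the population, here drawn at random.
sample = rng.sample(population, 50)

# Descriptive statistics summarize each set; inferential statistics would
# use the sample mean to say something about the population mean.
population_mean = mean(population)
sample_mean = mean(sample)
print(f"population mean: {population_mean:.2f}")
print(f"sample mean:     {sample_mean:.2f}")
```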
Descriptive Statistics

• Collect data
  • e.g., Survey
• Present data
  • e.g., Tables and graphs
• Summarize data
  • e.g., Sample mean: $\bar{X} = \frac{\sum X_i}{n}$
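The sample mean (the sum of the observed values X_i divided by the number of observations n) can be computed in a few lines of Python. The values below are the annual sales figures ($M) from the five-company table earlier in the lecture; treating them as a sample is an assumption made for illustration.

```python
# Annual sales ($M) for Dataram, EnergySouth, Keystone, LandCare, Psychemedics.
annual_sales = [73.10, 74.00, 365.70, 111.40, 17.60]

n = len(annual_sales)
sample_mean = sum(annual_sales) / n  # sum of X_i divided by n

print(round(sample_mean, 2))  # 128.36
```

The one large value (Keystone, 365.70) pulls the mean well above four of the five observations, which is worth keeping in mind when a mean is used to summarize data.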
Inferential Statistics
• Estimation
  • e.g., Estimate the population mean weight using the sample mean weight
• Hypothesis testing
  • e.g., Test the claim that the population mean weight is 70 kg

Inference is the process of drawing conclusions or making decisions about a population based on sample results
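Both examples can be sketched in Python using only the standard library. The slides do not name a specific test, so a one-sample t test is assumed here; the eight weights are hypothetical, the population is assumed to be roughly normal, and the critical value 2.365 is the standard two-sided 5% t-table value for 7 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical sample of weights in kg.
weights_kg = [68.2, 71.5, 74.1, 69.8, 72.6, 75.3, 70.4, 73.9]
mu_0 = 70.0  # claimed population mean weight

# Estimation: the sample mean is the point estimate of the population mean.
n = len(weights_kg)
x_bar = mean(weights_kg)

# Hypothesis testing: how many standard errors is x_bar from the claim?
s = stdev(weights_kg)  # sample standard deviation (n - 1 denominator)
standard_error = s / sqrt(n)
t_stat = (x_bar - mu_0) / standard_error

t_crit = 2.365  # two-sided 5% critical value, 7 degrees of freedom
reject = abs(t_stat) > t_crit

print(f"estimate of mean weight: {x_bar:.3f} kg")
print(f"t statistic: {t_stat:.3f}")
print("reject the claim" if reject else "do not reject the claim")
```

With these made-up numbers the test statistic falls just short of the critical value, so the sample does not give enough evidence to reject the claim of 70 kg at the 5% level.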
What is Statistical Analysis
