Statistics Day 1a - Types of Data, Graphical Representation, Correlation, Data Modeling & Index Numbers
Course: Business Statistics
Why Business Statistics?
Introduction to Probability & Statistics
Why should we care about Probability & Statistics? - I
Why should we care about Probability & Statistics? - II
Why should we care about Probability & Statistics? - III
What will you learn in this course?
Examples:
• If you flip a coin 100 times, what is the probability of getting at most 10 heads?
• What is the probability of drawing the four of hearts from a deck of cards?
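The two example questions can be worked out directly. The sketch below assumes a fair coin with independent flips (a binomial model) and a single draw from a standard 52-card deck; neither assumption is stated on the slide, and the function name `prob_at_most` is illustrative.

```python
from fractions import Fraction
from math import comb


def prob_at_most(k_max, n, p=Fraction(1, 2)):
    """Exact P(X <= k_max) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_max + 1))


# Assumption: fair coin, 100 independent flips.
p_heads = prob_at_most(10, 100)

# Assumption: standard 52-card deck, one card drawn at random.
# Exactly one of the 52 cards is the four of hearts.
p_card = Fraction(1, 52)

print(float(p_heads))          # a very small probability
print(p_card, float(p_card))   # 1/52 as a fraction and as a decimal
```

Exact fractions are used so that the tiny coin-flip probability is not distorted by floating-point rounding before it is printed.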
Today’s Lecture
• By the end of this session, students should be able to:
• Define data
• Know the different types of data
• Know the different ways to present data systematically using diagrammatic and graphical representation
Definition of Data
Data
• Categorical (Qualitative)
• Numerical (Quantitative)
  • Discrete
  • Continuous
Discrete and Continuous Data
• Numerical data can be either discrete or continuous
• Continuous data can take any numerical value within a range, for example weight or height
• Continuous data has an infinite number of possible values
• Discrete data can take only certain values: it ‘jumps’ from one value to the next and takes no intermediate value between them (for example, the number of students in a class)
Comparison of continuous and discrete data
• Continuous data is more precise than discrete data
• Continuous data is more informative than discrete data
• Continuous data can reduce the need for estimation and rounding of measurements
• Continuous data is often more time-consuming to obtain
• Discrete data should be converted to continuous data where possible, to obtain a higher level of information and detail
Examples of conversion of discrete to continuous data
Types of Data
• Based on their mathematical properties, data are divided into four groups (NOIR):
Nominal
Ordinal
Interval
Ratio
• From nominal to ratio, they are ordered by increasing:
accuracy
power of measurement
precision
range of applicable statistical techniques
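The ordering of the four scales can be made concrete with a small sketch. The mapping of operations to scales below follows the commonly taught convention (counting for nominal, ranking for ordinal, differences for interval, ratios for ratio); it is not taken from the slide, and the names `Scale`, `MIN_SCALE` and `allowed` are illustrative.

```python
from enum import IntEnum


class Scale(IntEnum):
    """NOIR scales, numbered so that a higher value means a stronger scale."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4


# Assumption: the weakest scale at which each operation is usually
# considered meaningful (textbook convention, not from the slide).
MIN_SCALE = {
    "count / mode": Scale.NOMINAL,
    "rank / median": Scale.ORDINAL,
    "difference / mean": Scale.INTERVAL,
    "ratio": Scale.RATIO,
}


def allowed(scale):
    """Operations that are meaningful for data measured on `scale`."""
    return [op for op, needed in MIN_SCALE.items() if scale >= needed]


for scale in Scale:
    print(scale.name, "->", allowed(scale))
```

Each stronger scale keeps everything the weaker scales allow and adds one more operation, which is what "range of applicable statistical techniques" increasing along NOIR means.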
Data Presentation
• Principles of data presentation
a) To arrange the data in such a way that it creates interest in the reader’s mind at first sight.
b) To present the information in a compact and concise form without losing important details.
c) To present the data in a simple form, so that conclusions can be drawn directly by looking at the data.
d) To present it in such a way that it can help in further statistical analysis.
Presentation of data
• Tabular
• Graphical
Types of tables
Simple Table
When characteristics and their values are presented in the form of a table, it is known as a simple table.
[Figure: chart of No. of Students for categories One to Nine]
Multiple Bar Charts
• Also called compound bar charts
• More than one sub-attribute of a variable can be expressed
[Figure: multiple bar chart of Percentage of World Total (Population and Land) for Asia, Europe, Africa, Latin America, USSR, North America and Oceania]
Histogram
• Used for quantitative continuous variables
• It presents variables that have no gaps between values, e.g. age, weight, height, blood pressure, blood sugar
• It consists of a series of blocks. The class intervals are given along the horizontal axis and the frequencies along the vertical axis
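The construction described above can be sketched in a few lines: values are grouped into adjacent class intervals and the count in each interval gives the height of its block. The data, the interval width and the function name `class_frequencies` are hypothetical, chosen only for illustration.

```python
def class_frequencies(values, start, width, n_classes):
    """Count values falling in n_classes adjacent intervals [lower, upper)."""
    counts = [0] * n_classes
    for v in values:
        idx = int((v - start) // width)   # which class interval v belongs to
        if 0 <= idx < n_classes:          # ignore values outside the range
            counts[idx] += 1
    return [
        (start + i * width, start + (i + 1) * width, c)
        for i, c in enumerate(counts)
    ]


# Hypothetical weights in kg (continuous data).
weights_kg = [52.5, 58.0, 61.2, 64.8, 66.1, 69.9, 70.0, 73.4, 77.7, 81.3]

# Text rendering: class interval on one axis, frequency as the bar length.
for lower, upper, count in class_frequencies(weights_kg, 50, 10, 4):
    print(f"{lower}-{upper}: {'#' * count} ({count})")
```

Each interval includes its lower limit and excludes its upper limit, so adjacent blocks touch without any value being counted twice.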
Frequency polygon
• A frequency polygon is an area diagram of a frequency distribution drawn over a histogram.
• It is a linear representation of a frequency table and histogram, obtained by joining the midpoints of the tops of the histogram blocks.
• Frequency is plotted at the central point of each group
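[Figure: frequency polygon over a histogram; class intervals 59-69, 69-79, 79-89, 89-99, 99-109, 109-119, 119-129, 129-139 on the horizontal axis, frequency on the vertical axis]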
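The points of a frequency polygon are simply (class midpoint, frequency) pairs. The sketch below uses the class intervals from the slide; the frequencies are hypothetical because the original values did not survive, and `polygon_points` is an illustrative name.

```python
# Class intervals as shown on the slide's horizontal axis.
classes = [(59, 69), (69, 79), (79, 89), (89, 99),
           (99, 109), (109, 119), (119, 129), (129, 139)]

# Hypothetical frequencies, one per class interval.
frequencies = [20, 60, 150, 230, 200, 120, 50, 10]


def polygon_points(classes, frequencies):
    """Return (midpoint, frequency) pairs; joining them gives the polygon."""
    return [((lo + hi) / 2, f) for (lo, hi), f in zip(classes, frequencies)]


for midpoint, freq in polygon_points(classes, frequencies):
    print(midpoint, freq)
```

Joining consecutive points with straight lines gives the polygon drawn over the histogram.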
Line diagram
• Line diagrams are used to show the trend of events with the passage of
time.
Pie charts
• One of the most common ways of presenting data
• The value of each category is divided by the total of all values and multiplied by 360; each category is then allocated the resulting angle (in degrees), which represents its proportion of the whole.
• It is often necessary to show percentages in the segments, as it is not always easy to compare the areas of segments visually.
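The angle rule above (value divided by total, multiplied by 360) can be checked with a short sketch. The category names and values are hypothetical, and `pie_angles` is an illustrative name.

```python
def pie_angles(values):
    """Angle in degrees for each category: value / total * 360."""
    total = sum(values.values())
    return {name: v * 360 / total for name, v in values.items()}


def pie_percentages(values):
    """Percentage share for each category, for labelling the segments."""
    total = sum(values.values())
    return {name: v * 100 / total for name, v in values.items()}


# Hypothetical category totals.
data = {"A": 50, "B": 30, "C": 20}

angles = pie_angles(data)
shares = pie_percentages(data)
for name in data:
    print(f"{name}: {angles[name]:.1f} degrees, {shares[name]:.1f}%")
```

The angles always sum to 360 degrees, and printing the percentages alongside gives the segment labels the slide recommends.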
Pictogram
A positive correlation is characterised by a straight line with a positive gradient.
A negative correlation is characterised by a straight line with a negative gradient.
[Figures: scatter plots; axis labels: Height, Shoe Size, Soup Sales]
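The link between the sign of a correlation and the gradient of the fitted straight line can be illustrated numerically. The sketch below computes the Pearson correlation coefficient and the least-squares gradient; all data are hypothetical, and the variable `temperature_c` is an assumption introduced only to give soup sales something to decrease against.

```python
from math import sqrt


def _sums(xs, ys):
    """Sums of squared deviations and cross-deviations about the means."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxx, syy, sxy


def pearson_r(xs, ys):
    """Pearson correlation coefficient, between -1 and 1."""
    sxx, syy, sxy = _sums(xs, ys)
    return sxy / sqrt(sxx * syy)


def gradient(xs, ys):
    """Gradient of the least-squares straight line of y on x."""
    sxx, _, sxy = _sums(xs, ys)
    return sxy / sxx


# Hypothetical data: shoe size tends to rise with height.
height_cm = [150, 160, 170, 180, 190]
shoe_size = [36, 38, 41, 43, 46]

# Hypothetical data: soup sales tend to fall as temperature rises.
temperature_c = [5, 10, 15, 20, 25]
soup_sales = [90, 75, 60, 40, 25]

print(pearson_r(height_cm, shoe_size), gradient(height_cm, shoe_size))
print(pearson_r(temperature_c, soup_sales), gradient(temperature_c, soup_sales))
```

The correlation coefficient and the gradient share the same numerator, so they always have the same sign.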
Data Modeling – Entities or Concepts, Attributes & Relationships
• A simple example: an employee of a company is paid monthly for the particular job role they perform. The company wants a way to capture all the information relating to the employee, their salary and their payment details, and therefore needs a database to hold this data.
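One possible way to read the example is as three entities (employee, job role, salary payment), each with attributes, linked by relationships. The sketch below expresses that reading with Python dataclasses; every class name, field name and value is hypothetical, and a real database design could split the data differently.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class JobRole:                 # entity
    role_id: int               # attribute
    title: str                 # attribute


@dataclass
class Employee:                # entity
    employee_id: int
    name: str
    role: JobRole              # relationship: an employee performs a job role


@dataclass
class SalaryPayment:           # entity
    payment_id: int
    employee: Employee         # relationship: a payment is made to an employee
    month: date                # attribute: month the payment covers
    amount: float              # attribute: amount paid


# Hypothetical records.
analyst = JobRole(role_id=1, title="Analyst")
employee = Employee(employee_id=100, name="A. Example", role=analyst)
payment = SalaryPayment(payment_id=1, employee=employee,
                        month=date(2024, 1, 1), amount=3000.0)

print(payment.employee.name, payment.employee.role.title, payment.amount)
```

Following the relationships from a payment back to the employee and then to the job role mirrors how the same data would be joined across tables in a database.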
Data Modeling – Meta Data
Index numbers
Index Number
$$I_i = \frac{P_i}{P_{\text{base}}} \times 100$$
where
$I_i$ = index number for year i
$P_i$ = price for year i
$P_{\text{base}}$ = price for the base year
Index Numbers: Example
Airplane ticket prices from 1995 to 2003:

Year   Price   Index (base year = 2000)
1995   272      85.0
1996   288      90.0
1997   295      92.2
1998   311      97.2
1999   322     100.6
2000   320     100.0
2001   348     108.8
2002   366     114.4
2003   384     120.0

$$I_{1996} = \frac{P_{1996}}{P_{2000}} \times 100 = \frac{288}{320}(100) = 90$$
Base Year:
$$I_{2000} = \frac{P_{2000}}{P_{2000}} \times 100 = \frac{320}{320}(100) = 100$$
$$I_{2003} = \frac{P_{2003}}{P_{2000}} \times 100 = \frac{384}{320}(100) = 120$$
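The index column can be reproduced from the prices with a few lines of code. This is a minimal sketch using the ticket prices from the example; the function name `price_index` is illustrative.

```python
# Airplane ticket prices from the example, keyed by year.
prices = {
    1995: 272, 1996: 288, 1997: 295, 1998: 311, 1999: 322,
    2000: 320, 2001: 348, 2002: 366, 2003: 384,
}


def price_index(prices, base_year):
    """Index number for each year: price / base-year price * 100."""
    base = prices[base_year]
    return {year: price * 100 / base for year, price in prices.items()}


index = price_index(prices, base_year=2000)
for year, value in index.items():
    print(year, prices[year], f"{value:.1f}")
```

Changing `base_year` rescales every index number so that the chosen year reads 100, without changing the ratios between years.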
Index Numbers: Interpretation