CHAPTER 1
INTRODUCTION TO STATISTICS
INTRODUCTION - DEFINITION OF STATISTICS - FUNCTIONS - SCOPE - LIMITATIONS,
CLASSIFICATION AND TABULATION OF DATA
Definitions
Data
Observations (such as measurements, genders, survey responses) that have
been collected.
What is statistics?
Statistics is the study of how to collect, organize, analyze, and
interpret numerical information from data.
It is a collection of methods for planning experiments, obtaining data, and
then organizing, summarizing, presenting, analyzing, interpreting,
and drawing conclusions.
There are two main branches of statistics:
1. Descriptive Statistics: Descriptive statistics involves methods of
organizing, picturing and summarizing information from data.
Example: Mean, Median, Mode
2. Inferential Statistics: Inferential statistics involves methods of using
information from a sample to draw conclusions about the population, i.e.,
the use of descriptive statistics to
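The three descriptive measures named in the example above can be sketched with Python's standard `statistics` module. The marks below are hypothetical values chosen only for illustration; they do not come from the text.

```python
from statistics import mean, median, mode

# Hypothetical marks of seven students (illustrative only, not from the text).
marks = [12, 15, 15, 18, 20, 22, 24]

average = mean(marks)       # sum of the values divided by their number
middle = median(marks)      # middle value once the data are sorted
most_common = mode(marks)   # value that occurs most often

print("Mean:", average)
print("Median:", middle)
print("Mode:", most_common)
```

Each measure condenses the whole list into a single figure, which is the sense in which descriptive statistics "summarizes" data.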
The basic fallacies of inferential statistics:
They assume that the past is prologue to the future
They assume you are going to test your inference an infinite
number of times.
Characteristics of Statistics
Statistics are aggregates of facts
Statistics are numerically expressed
Statistics are affected to a marked extent
by multiplicity of causes
Statistics are enumerated or estimated according to reasonable
standards of accuracy
Statistics are collected in a systematic
manner
Statistics are collected for predetermined
purpose
Five stages of statistical investigation:
Collection of Data
Organization of data
Presentation of data
Analysis
Interpretation of Results
(1) Collection of data: The structure of a statistical investigation rests on a
systematic collection of data.
The data are classified into two groups:
i) Internal data and ii) External data
Internal data are obtained from internal records related to the operations of a
business organisation, such as production, sources of income and expenditure,
inventory, purchases and accounts.
External data are collected by, or purchased from, external agencies.
External data may be either primary data or secondary data. The primary data are
(2) Organisation of data:
The collected data form a large mass of figures that needs to be organised.
The collected data must be edited to rectify any omissions, irrelevant
answers, and wrong computations.
The edited data must be classified and tabulated to suit further analysis.
(3) Presentation of data:
A large mass of collected data cannot be understood easily, yet it needs to be
analysed easily and quickly.
Therefore, the collected data need to be presented in tabular or graphic form.
This systematic ordering and graphical presentation helps further analysis.
(4) Analysis of data:
Analysis requires establishing the relationship between two or more
variables. Analysis of data includes condensation, abstraction, summarization,
conclusion etc. Analysis can be done with the help of statistical
tools and techniques such as measures of central tendency, measures of
dispersion, correlation, variance analysis etc.
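As a rough sketch of the analysis tools just listed, the snippet below computes one measure of central tendency, one measure of dispersion, and a correlation coefficient. The two series are hypothetical, and `pearson_r` is a helper written for this illustration, not a function taken from the text.

```python
from math import sqrt
from statistics import mean, pstdev

# Hypothetical paired observations (assumed for illustration only).
advertising = [1, 2, 3, 4, 5]
sales = [2, 4, 5, 4, 5]

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    mean_x, mean_y = mean(x), mean(y)
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariation / (spread_x * spread_y)

central = mean(sales)        # measure of central tendency
dispersion = pstdev(sales)   # population standard deviation
r = pearson_r(advertising, sales)

print(central, round(dispersion, 4), round(r, 4))
```

A value of `r` close to +1 or -1 suggests a strong linear relationship between the two variables; a value near 0 suggests little linear relationship.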
(5) Interpretation of data:
Interpretation requires deep insight into the subject. Interpretation
involves drawing valid conclusions on the basis of the
analysis of data. This step is very important because the conclusions
drawn from the results depend on the interpretation.
Functions
Presentation of facts
Simplification of complexities
Facilitating comparisons
Facilitating the formulation of policies
Widening of human knowledge
Useful in testing the laws of other sciences
Facilitates the forecasting
Establishment of correlation between two
facts
FUNCTIONS OF STATISTICS:
Statistics as a discipline is considered indispensable in
almost all spheres of human knowledge. There is
hardly any branch of study which does not use
statistics. Scientific, social and economic studies use
statistics in one form or another.
These disciplines make use of observations, facts and
figures, enquiries and experiments etc. through statistics
and statistical methods. Statistics studies almost all
aspects of an enquiry. It mainly aims at simplifying the
complexity of the information collected in an enquiry. It
presents data in a simplified form so as to make them
intelligible. It analyses data and facilitates the drawing of
conclusions. Now let us briefly discuss some of the
important functions of statistics.
1. Presents facts in simple form:
Statistics presents facts and figures
in a definite form. That makes a
statement more logical and convincing
than mere description. It condenses
the whole mass of figures into a
single figure. This makes the problem
intelligible.
2. Reduces the complexity of data:
Statistics simplifies the complexity of
data. Raw data are unintelligible.
We make them simple and intelligible
by using different statistical measures.
Some commonly used measures
are graphs, averages, dispersion,
skewness, kurtosis, correlation and
regression. These measures help in
interpretation and in drawing inferences.
Statistics therefore helps to enlarge
the horizon of one's knowledge.
3. Facilitates comparison: Comparison between
different sets of observations is an important
function of statistics. Comparison is necessary
to draw conclusions. As Professor Boddington
rightly points out, the object of statistics is to
enable comparison between past and present
results, to ascertain the reasons for the changes
which have taken place and the effect of such
changes in future. So to determine the
efficiency of any measure, comparison is
necessary. Statistical devices like averages,
ratios, coefficients etc. are used for the
purpose of comparison.
4. Testing hypotheses: Formulating and testing
hypotheses is an important function of
statistics. This helps in developing new
theories. So statistics examines the truth
and helps in generating new ideas.
5. Formulation of policies: Statistics helps in
formulating plans and policies in different
fields. Statistical analysis of data forms the
beginning of policy formulation. Hence,
statistics is essential for planners,
economists, scientists and administrators
to prepare different plans and programmes.
6. Forecasting: The future is uncertain.
Statistics helps in forecasting trends
and tendencies. Statistical techniques are
used for predicting the future values of a
variable. For example, a producer forecasts
his future production on the basis of
present demand conditions and his past
experience. Similarly, planners can
forecast the future population by
considering present population trends.
7. Derives valid inferences:
Statistical methods mainly aim at
deriving inferences from an enquiry.
Statistical techniques are often used
by scholars, planners and scientists to
evaluate different projects.
These techniques are also used to
draw inferences regarding population
parameters on the basis of sample
information.
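A minimal sketch of drawing an inference about a population parameter from sample information: the sample mean serves as a point estimate of the population mean, and the standard error indicates how much that estimate would be expected to vary from sample to sample. The sample values are hypothetical and assumed to be a simple random sample.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical sample of five observations (illustrative only).
sample = [48, 52, 50, 47, 53]

estimate = mean(sample)                              # point estimate of the population mean
standard_error = stdev(sample) / sqrt(len(sample))   # sample std. deviation / sqrt(n)

print("Estimated population mean:", estimate)
print("Standard error:", round(standard_error, 4))
```

The estimate is only as trustworthy as the sampling procedure behind it, which is why the limitations listed in this chapter still apply.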
Limitations
Statistics does not study individuals
Statistics deals with quantitative facts
Statistics is true only on average
Statistics may lead to fallacious conclusions
Only experts can make use of statistics
Homogeneity and uniformity of data are a must
Limitations of statistics: Statistics, with all its wide application
in every sphere of human activity, has its own limitations.
Some of them are given below.
1. Statistics is not suitable for the study of qualitative
phenomena: Since statistics is basically a science that
deals with sets of numerical data, it is applicable only to the
study of those subjects of enquiry which can be
expressed in terms of quantitative measurements.
As a matter of fact, qualitative phenomena like honesty,
poverty, beauty, intelligence etc. cannot be expressed
numerically, and statistical analysis cannot be directly
applied to them.
Nevertheless, statistical techniques may be applied indirectly
by first reducing the qualitative expressions to accurate
quantitative terms. For example, the intelligence of a group
of students can be studied on the basis of their marks in a
particular examination.
2. Statistics does not study individuals:
Statistics does not give any specific importance
to individual items; it deals with an
aggregate of objects. Individual items, taken
individually, do not constitute
statistical data and do not serve any
purpose in a statistical enquiry.
3. Statistical laws are not exact: It is well
known that the mathematical and physical
sciences are exact. Statistical laws, however, are not
exact; they are only
approximations. Statistical conclusions are not
universally true. They are true only on
average.
4. Statistics may be misused: Statistics
must be used only by experts; otherwise,
statistical methods are the most
dangerous tools in the hands of the
inexpert. The use of statistical tools by
inexperienced and untrained persons might
lead to wrong conclusions. Statistics can
easily be misused by quoting wrong figures.
As King aptly says, statistics are
like clay of which one can make a God or
Devil as one pleases.
5. Statistics is only one of the methods of studying a
problem: Statistical methods do not provide a complete
solution to problems, because problems have to be
studied taking the background of the country's
culture, philosophy or religion into consideration.
Thus a statistical study should be supplemented
by other evidence.
6. Statistics can analyze only
aggregated observations or data: Statistics are always a
collection of data. An individual observation does not
belong to statistics; hence, statistics analyses a
collection of data and sheds light on the overall
estimated result.
For example, the average income of the labourers of
a business can be estimated by observing their per
capita
Scope and importance of statistics
Useful to bankers
Useful to insurance companies
Useful to railways and other transport agencies
Useful to business
Useful to economists
Useful to planning
Classification and Tabulation
Definition of Classification
"Classification is the process of arranging
data into sequences and groups according
to their common characteristics or
separating them into different but related
parts."
- Secrist
"The process of grouping a large number of
individual facts and observations on the
basis of similarity among the items is
called classification."
- Stockton & Clark
Meaning of Classification
Classification is a process of
arranging things or data in groups or
classes according to their
resemblances and affinities, and it gives
expression to the unity of attributes
that may subsist among a diversity of
individuals.
Characteristics of classification
Classification performs homogeneous
grouping of data
It brings out points of similarity and
dissimilarity
The classification may be either real or
imaginary
Classification is flexible to
accommodate adjustments
Objectives / purposes of classification
To simplify and condense large data
To present the facts in an easily
understandable form
To allow comparisons
To help to draw valid inferences
To relate the variables among the data
To help further analysis
To eliminate unwanted data
To prepare tabulation
Important types of classification
Geographical (i.e. on the basis of area or region)
Chronological (i.e. on a temporal / historical basis, with respect to time)
Qualitative (on the basis of character / attributes)
Numerical, quantitative (on the basis of magnitude)
Geographical Classification
In geographical classification, the
classification is based on
geographical regions.
Ex: Sales of the company (in million rupees), region-wise

Region    Sales
North     285
South     300
East      185
West      235
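A geographical classification like the one above can be sketched as grouping raw records by region and totalling each group. The individual records below are hypothetical; they are chosen only so that the regional totals match the table.

```python
from collections import defaultdict

# Hypothetical raw sales records as (region, amount in million rupees).
# The breakdown into individual records is an assumption for illustration.
sales_records = [
    ("North", 150),
    ("South", 300),
    ("North", 135),
    ("East", 185),
    ("West", 235),
]

# Classify geographically: one class per region, amounts added within a class.
sales_by_region = defaultdict(int)
for region, amount in sales_records:
    sales_by_region[region] += amount

for region, total in sales_by_region.items():
    print(region, total)
```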
Chronological Classification
If the statistical data are classified
according to the time of their occurrence,
the type of classification is called
chronological classification.
Ex: Sales reported by a departmental store

Month     Sales (Rs. in lakh)
January   22
February  26
March     32
April     25
May       27
Qualitative Classification
In qualitative classification, the data are
classified according to the presence or
absence of attributes in the given units.
Thus, the classification is based on some
quality characteristics / attributes.
Ex: Literacy, Education, Class grade etc.
Further, it may be classified as
a) Simple classification b) Manifold classification
Simple classification: If the classification is
done into only two classes, then the
classification is known as simple
classification.
Ex: Population into Male / Female
Manifold classification:
In this classification, the
classification is based on more
than one attribute at a time.

Population
  Smokers
    Literate: Male, Female
    Illiterate: Male, Female
  Non-smokers
    Literate: Male, Female
    Illiterate: Male, Female
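One way to sketch a manifold classification is to describe each individual by a tuple of attributes and count how many individuals fall into each combination. The five individuals below are hypothetical and serve only to show the mechanics.

```python
from collections import Counter

# Hypothetical individuals described by (smoking habit, literacy, sex).
people = [
    ("Smoker", "Literate", "Male"),
    ("Smoker", "Illiterate", "Female"),
    ("Non-smoker", "Literate", "Female"),
    ("Non-smoker", "Literate", "Female"),
    ("Smoker", "Literate", "Male"),
]

# Manifold classification: one class per combination of the three attributes.
cells = Counter(people)

# Collapsing two attributes gives back a simple (two-class) classification.
smokers = sum(n for (habit, _, _), n in cells.items() if habit == "Smoker")
non_smokers = sum(n for (habit, _, _), n in cells.items() if habit == "Non-smoker")

for combination, count in cells.items():
    print(combination, count)
print("Smokers:", smokers, "Non-smokers:", non_smokers)
```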
Quantitative Classification
In quantitative classification, the classification is
based on quantitative measurements of some
characteristic, such as age, marks, income,
production, sales etc. The quantitative
phenomenon under study is known as a variable, and
hence this classification is also called
classification by variable.
Ex: For a 50-mark test, the marks obtained by students are
classified as follows

Marks     No. of students
0 - 10    5
10 - 20   ...
20 - 30   10
30 - 40   25
40 - 50   ...
Total     50
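Classification by variable can be sketched as counting how many marks fall into each class interval. The function below assumes whole-number marks from 0 to 50 and classes of width 10 in which the upper limit is excluded (so a mark of 10 goes into 10 - 20); placing the top mark of 50 in the last class is a further assumption. The marks are hypothetical.

```python
def classify_marks(marks, width=10, upper=50):
    """Count marks in classes 0-10, 10-20, ..., with the upper limit excluded."""
    counts = {(lo, lo + width): 0 for lo in range(0, upper, width)}
    for m in marks:
        # Assumption: the maximum mark (50) is placed in the last class, 40-50.
        lo = min((m // width) * width, upper - width)
        counts[(lo, lo + width)] += 1
    return counts

# Hypothetical marks of ten students (illustrative only).
marks = [3, 10, 19, 20, 25, 31, 38, 40, 47, 50]
table = classify_marks(marks)

for (lo, hi), n in table.items():
    print(f"{lo} - {hi}: {n}")
```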
Meaning and Definition of Tabulation
Tabulation may be defined as the
systematic arrangement of data in
columns and rows. It is designed to
simplify the presentation of data for the
purpose of analysis and statistical
inference.
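The idea of arranging data in columns and rows can be sketched with a small helper that lays out labelled values under a title and adds a total row. The function `tabulate` is a hypothetical helper written for this illustration; the figures are the region-wise sales used earlier in this chapter.

```python
def tabulate(title, headers, rows):
    """Return the lines of a simple text table: title, captions, body, total."""
    total = sum(value for _, value in rows)
    all_rows = [headers] + [(label, str(value)) for label, value in rows]
    all_rows.append(("Total", str(total)))
    width = max(len(label) for label, _ in all_rows)  # width of the stub column
    lines = [title]
    for label, value in all_rows:
        lines.append(f"{label.ljust(width)}  {value}")
    return lines

table = tabulate(
    "Sales of the company (in million rupees)",
    ("Region", "Sales"),
    [("North", 285), ("South", 300), ("East", 185), ("West", 235)],
)
print("\n".join(table))
```

The left-hand labels play the role of the stub, the header row plays the role of the caption, and the numbers form the body of the table.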
Major Objectives of Tabulation
To simplify the complex data
To facilitate comparison
To economise space
To draw valid inferences / conclusions
To help further analysis
Differences between Classification and Tabulation
Data are first classified and then
presented in tables; classification is
the basis for tabulation.
Tabulation is a mechanical function
of classification, because in
tabulation classified data are placed
in rows and columns.
Classification is a process of
statistical analysis, while tabulation
is a process of presenting data in a
suitable structure.
Classification of tables
Classification is done based on
Coverage (simple and complex tables)
Objective / purpose (general purpose
/ reference table, special purpose /
summary table)
Nature of inquiry (primary and
derived tables)
Diagrammatic and Graphic Representation
Diagrammatic presentation
A diagram is a visual form for the
presentation of statistical data. The term
diagram refers to various types of
devices such as bars, circles, maps,
pictorials and cartograms.
Some important types of diagrams
Line diagram
This is the simplest type of one-dimensional diagram. On the
basis of the size of the figures, the heights of the bars / lines are
drawn. The distance between bars is kept uniform.
The limitations of this diagram are that it is not attractive and
cannot convey more than one item of information.
Ex: Draw the line diagram for the
following data
[Figure: line diagram. x-axis: Year (2001 to 2006); y-axis: No. of students passed in FCD; data labels shown: 15, 13, 12, 7, 5, 5]
Simple bar diagram
Ex: The annual expenses of maintaining
cars of various types are given
below. Draw the vertical bar
diagram. The annual expense of
maintaining a car includes (fuel +
maintenance + repair + assistance
+ insurance).

Type of car      Expense in Rs. / year
Maruthi Udyog    47533
Hyundai          59230
Tata Motors      63270
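The key step in drawing a bar diagram is scaling every bar to the same unit. A minimal text-only sketch of that step is shown below using the figures from the table; the bars are printed sideways for simplicity, and the choice of 40 characters for the longest bar is an arbitrary assumption.

```python
# Annual maintenance expense in Rs. per year, taken from the table above.
expenses = {"Maruthi Udyog": 47533, "Hyundai": 59230, "Tata Motors": 63270}

def bar_lengths(data, max_chars=40):
    """Scale each value so that the largest one gets a bar of max_chars characters."""
    largest = max(data.values())
    return {name: round(value / largest * max_chars) for name, value in data.items()}

lengths = bar_lengths(expenses)
for name, length in lengths.items():
    # Bars of uniform thickness; only the length varies with the value.
    print(f"{name.ljust(14)} {'#' * length} {expenses[name]}")
```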
Horizontal bar diagram
Ex: Data on the world's ten biggest steel makers are
given below. Draw the horizontal bar
diagram.
Compound bar diagram (Multiple bar diagram)
Ex: Draw the bar diagram for the
following data. Resale values of the cars
(Rs. 000) are as follows.

Year (Model)   Santro   Zen   Wagonr
2003           208      252   248
2004           240      278   274
2005           261      296   302
Pie diagram
1. Ungrouped Series
Constructing Frequency Table:
Ex 1: The marks obtained by 50
students in an examination are given
below:
[Data table: 50 two-digit marks. The cell layout was lost; the digits are listed in their surviving order.]
3 4 4 5 3 3 3 2 2 1 5 5 6 3 3 4 1
0 3 8 2 9 5 1 9 5 8 8 6 2 5 5 5 7 4 6 3
9 4 3 6 4 2 7 3 4 4 1 4 1 4 3 2 4 6 2 7
1 3 3 3 2 8 5 1 4 3 4 3 7 5 2 9 8 5 3 1
1 8 5 4 7 3 7 1 6 4 4 2 9 1 8
4 0 2 6 2 2 5 1
Prepare a frequency table.
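Constructing a frequency table for an ungrouped series amounts to counting how often each distinct value occurs. The sketch below uses a short hypothetical list of marks, not the data of Ex 1.

```python
from collections import Counter

# Hypothetical marks of ten students (illustrative only, not the data of Ex 1).
marks = [3, 5, 3, 7, 5, 3, 9, 7, 5, 3]

# Tally: each distinct mark with the number of times it occurs.
frequency = Counter(marks)

print("Marks  Frequency")
for value in sorted(frequency):
    print(f"{value:<6} {frequency[value]}")
print("Total ", sum(frequency.values()))
```

The total of the frequency column should always equal the number of observations, which is a quick check on the tally.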
Ex 2: Form a frequency distribution
from the following data by the inclusive
method, taking 4 as the magnitude
of the class intervals.
[Data table: the cell layout was lost; the digits are listed in their surviving order.]
1 1 1 2 1 3 3 4 2 2 8 1 9 2 2 3
0 3 9 3 4 7 4 0 2 6 5 1 4 1 2 3 2 7 1 5
1 3 7 1 9 3 2 8 3 1 3
8 1 5 6 1 4 9 2 7
1 1 1 1 1 1 2 2 2 3 3
3 4 8 7 6 5 3 5 8 0 2
4 3 0 9
Then convert it into the exclusive method.
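The two steps of this exercise can be sketched as follows, under stated assumptions: inclusive classes of magnitude 4 are taken to be classes such as 1 - 4, 5 - 8, where both limits belong to the class, and the conversion to the exclusive method subtracts half the gap between consecutive classes from each lower limit and adds it to each upper limit. The data and the starting value of 1 are hypothetical.

```python
def inclusive_classes(lowest, highest, width):
    """Build inclusive classes such as 1-4, 5-8; both limits belong to the class."""
    classes = []
    lower = lowest
    while lower <= highest:
        classes.append((lower, lower + width - 1))
        lower += width
    return classes

def to_exclusive(classes):
    """Close the gap between consecutive classes (needs at least two classes)."""
    gap = classes[1][0] - classes[0][1]   # e.g. 5 - 4 = 1
    correction = gap / 2                  # e.g. 0.5
    return [(lo - correction, hi + correction) for lo, hi in classes]

# Hypothetical observations (illustrative only, not the data of Ex 2).
data = [1, 4, 5, 8, 8, 9, 12, 3]

classes = inclusive_classes(1, 12, 4)
frequencies = {c: sum(1 for v in data if c[0] <= v <= c[1]) for c in classes}
exclusive = to_exclusive(classes)

for (lo, hi), (elo, ehi) in zip(classes, exclusive):
    print(f"{lo} - {hi}  ->  {elo} - {ehi}  frequency {frequencies[(lo, hi)]}")
```

The frequencies do not change in the conversion; only the class limits are adjusted so that the upper boundary of one class coincides with the lower boundary of the next.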
INTRODUCTION TO TABULATION
DEFINITION
According to Tuttle, "A statistical table is the logical listing
of related quantitative data in vertical columns and
horizontal rows of numbers, with sufficient explanatory
and qualifying words, phrases and statements in the form
of titles, headings and footnotes to make clear the full
meaning of the data and their origin."
OBJECTIVES OF TABULATION
1. To simplify the complex data
2. To economize space
3. To facilitate comparison
4. To facilitate statistical analysis
5. To save time
6. To depict trend
7. To help reference
Components of a Table
1. Table number
2. Title of the table
3. Caption / Box head
4. Stub
5. Body / Field
6. Head note
7. Foot note
8. Source note
Stub headings | Caption                                                 | Total (rows)
              | Subhead                    | Subhead                    |
              | Column head | Column head  | Column head | Column head  |
Stub entries  |             |              |             |              |
Total (columns)
Foot note:
Source note:
REQUIREMENTS OF GOOD
STATISTICAL TABLES
1. Suit the purpose
2. Scientifically prepared
3. Clarity
4. Manageable size
5. Columns and rows should be numbered
6. Suitably approximated
7. Attractive get-up
8. Units
9. Average and totals
10. Logical arrangement of items
11. Proper lettering
Types of tables
1. Simple and Complex tables.
2. General purpose and special purpose tables.
3. Original and derived table.
Advantages of classification and tabulation
1. Clarifies the object
2. Simplifies the complex data
3. Economises space
4. Facilitates comparison
5. Helps in reference
6. Depicts the trend
Disadvantages of classification and tabulation
1. Complicated process
2. Not all data can be put into tables
3. Lack of flexibility