Statistics Notes
Contents
Specimen
1.7 Official Statistics
2 Collection of Data
2.1 Data Collection
2.2 Variables
of collection of data and their
presentation in charts and tables.
It is now considered the science of
inferences on observed data and the
entire problem of making decisions
in the face of uncertainty. This
covers considerable ground since
uncertainties are met when we flip
a coin, when a dietician experiments with food additives, when an actuary
determines life insurance premiums, when a quality control engineer accepts
or rejects manufactured products, when a teacher compares the abilities of
students, when an economist forecasts trends, when a newspaper predicts an
election result and so forth.
It would be presumptuous to say that statistics in its present state of development
can handle all situations involving uncertainties, but new techniques are
constantly being developed, and modern statistics can provide the framework
for looking at these situations in a logical and systematic fashion. The beginnings
of the mathematics of statistics may be found in mid-eighteenth-century studies in
probability motivated by interest in games of chance. Thus scholars began
to apply probability theory to actuarial problems and to some aspects of social
2 Statistics - Scope and Development
The word Statistics is derived from the Latin word “Status” or the Italian
word “Statista”. Both words mean “Political State” or a Government.
Shakespeare used the word Statist in his play Hamlet (1602). In the past,
statistics was used by rulers for official purposes. Even though the application of
Statistics was very limited, rulers and kings needed information about the lands,
agriculture, commerce and population of their states to assess their military
potential, their wealth, taxation and other aspects of Government.
[Figure: Sir Ronald Aylmer Fisher]
Gottfried Achenwall used the word ‘statistik’ at a German university in 1749
to mean the political science of different countries. In 1771, W. Hooper
(an Englishman) used the word ‘statistics’ in his translation of Elements of Universal
Erudition, written by Baron B.F Bieford. In that book, statistics is defined
as the science that teaches us what is the political arrangement of all the
modern states of the known world. There is a big gap between the old statistics
and the modern statistics, but the old statistics still forms a part of present-day
statistics.
During the 18th century, English writers used the word statistics in their
works. A great deal of work was done towards the end of the nineteenth century.
At the beginning of the 20th century, William S. Gosset developed methods
for decision making based on small sets of data. During the 20th century,
several statisticians were active in developing new methods, theories and
applications of statistics. The advent of electronic computers was certainly a
major factor in the development of modern statistics. Sir Ronald Aylmer
Fisher is known as the father of modern statistics.
1.2 Definition of Statistics
1. “Statistics can be defined as the collection, presentation and interpretation
of numerical data.” - Croxton and Cowden.
characteristics are:
(i) Statistics are the aggregates of facts. It means, a single figure is not
statistics. For example, national income of a country for a single year is
not statistics but the same for two or more years is statistics.
collected in a haphazard manner, they will not be reliable and will lead to
misleading conclusions. It is collected with a pre-determined purpose.
A complex mass of data is made simple and understandable with the
help of statistical methods.
forecasting future events
Important policy decisions and forecasts in business, economics,
finance, industry, etc. are made on the basis of statistical methods.
drugs and medicines are of great importance.
Activity
Prepare a report regarding the functions and importance of statistics in daily
life by reading the features and reports in newspapers and magazines.
(ii) Statistics reveals the average behaviour, the normal or the general trend.
Statistics does not study individual items but deals with aggregate. For
example, one may be misguided when told that the average depth of a
river from one bank to the other is four feet. There may be some points in
between where its depth is far more than four feet.
(iii) Since statistics are collected for a particular purpose, such data may or
may not be relevant or useful in other situations or cases. For example,
secondary data (i.e., collected by a person) need not be useful for another
person.
(iv) Statistics are not 100 per cent precise, unlike Mathematics. Those who use
Statistics should be aware of this limitation.
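The river example in (ii) can be checked with a few lines of arithmetic. The sketch below is only an illustration: the readings in `depths_ft` are invented values, chosen so that their average works out to four feet, and the variable names are assumptions introduced here.

```python
import statistics

# Hypothetical depth readings (in feet) taken at equal intervals
# from one bank of the river to the other. Invented for illustration.
depths_ft = [1.0, 2.0, 4.0, 9.0, 6.0, 2.0]

average_depth = statistics.mean(depths_ft)  # the aggregate figure
deepest_point = max(depths_ft)              # an individual item the average hides

print(f"Average depth: {average_depth} ft")
print(f"Deepest point: {deepest_point} ft")
```

The average describes the river as a whole, but it says nothing about the single point that is more than twice as deep, which is exactly the limitation described in (ii).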
Misuse of Statistics
The misuse of Statistics is the main cause of discredit to this science and has
led to public distrust in Statistics. The various reasons for misuse are:
Actuarial Science
Actuarial science includes a number
of interrelated subjects, including
Probability, Mathematics, Statistics, Finance, Economics, Financial Economics,
and Computer Programming. Historically, actuarial science used deterministic
models in the construction of tables and premiums. The science has gone
through revolutionary changes during the last 30 years due to the proliferation
of high speed computers and the union of stochastic actuarial models with
modern financial theory (Frees 1990).
Biostatistics
Agricultural Statistics
The agricultural investigations are
based on the application of statistical
methods and procedures which are
helpful in testing hypotheses using
observed data, in making estimations
of parameters and in predictions. The
application of statistical principles
and methods is necessary for effective
practice in resolving various problems
that arise in the many branches of
agricultural activity. Because of the
variability inherent in biological and agricultural data, knowledge of statistics
is necessary for their understanding and interpretation. Numerous activities in
agriculture are very different from each other, resulting in different branches
of agricultural science like: field crop production, vegetable production,
horticulture, fruit growing, plant protection, livestock, veterinary medicine,
agricultural mechanization, water resources, agricultural economics, etc.
Activity
List out the various branches of statistics related to different disciplines.
• SDRD - Survey Design and Research Division
The Statistics Wing, called the National Statistical Office (NSO), consists of the
Central Statistical Office (CSO), the Computer Centre and the National Sample
Survey Office (NSSO).
an organization under the Ministry of
Statistics of the Government of India.
It is the largest organisation in India,
conducting regular socio-economic
surveys. It was established in 1950.
The Indian Statistical Institute (ISI) is a unique institution
devoted to the research, teaching and application
of statistics, natural sciences and social sciences.
It was founded by Prof. Prasanta Chandra Mahalanobis in
Kolkata on 17th December, 1931. He is known as the father
of Indian statistics. The Indian Statistical Institute
publishes Sankhya, the Indian Journal of Statistics.
[Figure: Prof. P.C. Mahalanobis]
In recognition of the notable contributions
made by Prof. P.C. Mahalanobis in the fields of
economic planning and statistical development
in the post-independence era, the Govt. of India
has decided to designate 29th June every
year, coinciding with his birth anniversary, as
Statistics Day, in the category of special days to
be celebrated at the national level. The Day is
celebrated by holding seminars, discussions
and competitions to highlight the importance
of official statistics in national development.
Offices is assisted by one District Officer, one or more Additional District
Officers, one Price Supervisory Officer and one or two Research Officers. At
taluk level, there is a Taluk Statistical Office, which is the lowest statistical unit
in the State. There are at present 61 Taluk Statistical Offices, each under the
control of a Taluk Statistical Officer.
Activity
Visit the nearest economics and statistics department and prepare a detailed
report regarding their functions.
Let us sum up
Statistics are all around us. Without statistics we couldn’t plan our budgets, pay
our taxes, enjoy games to their fullest, evaluate classroom performance, etc. In this
chapter we discussed the history, importance, development, scope and some definitions
of statistics. Statistics is applied in all walks of life. Various branches of statistics are
explained here. We have seen the functions and roles of the Ministry of Statistics and
Programme Implementation and the famous Indian Statistical Institute in Kolkata. We
also introduced the Department of Economics and Statistics of the state.
Learning outcomes
Evaluation Items
1. “Statistics can be defined as the collection, presentation and interpretation
of numerical data”. This definition is given by:
a) R.A. Fisher b) Horace Secrist
c) Croxton and Cowden d) Conner.
10. The journal published by Indian Statistical Institute (ISI) is
a) Statistica b) Sankhya
c) Sample surveys d) Census
13. How will you critically approach the definition of statistics given by Horace
Secrist?
Answers:
2 Collection of Data
Introduction
In chapter 1, we discussed Statistics as the study of collection, organization,
analysis, interpretation and presentation of data. For studying statistics the
first step is collection of data, which we will discuss in detail in this chapter.
a journalist might collect information regarding the recent social issues, a
politician collects information on how voters plan to vote in the upcoming
election, etc. Data collection is the systematic gathering of data for a particular
purpose from various sources.
Data is the plural of the term datum, which means any measurement, result,
fact or observation which gives information. Statistical surveys are the most
popular devices for obtaining the desired data.
Before dealing with statistical surveys, we have to familiarize ourselves with the
following terms.
Statistical Investigation
Statistical investigation includes collection, classification, presentation, analysis
and interpretation of data according to well defined procedures. The person
authorized to make investigation is known as Investigator. In a statistical
investigation the investigator formulates the problem, suggests the data
collection methods, organises various steps in an appropriate way, analyses the
data and interprets the results. Usually, the investigator deputes some persons
to collect the data from the field. These persons are known as Enumerators.
The enumerator may not be aware of the investigation procedures completely.
His/her duty is to collect the data for the investigator. It is the duty of the
investigator to train and supervise the work of the enumerator. The process of
data collection by the enumerator is known as Enumeration.
population is finite. All students of Kerala University for the year 2013-14
constitute a finite population. A population which is not finite, or is extremely
large, is treated as infinite. The population comprising all people in the world above 18
years of age is considered an infinite population.
If the population is infinite or is of extremely large size, it is not feasible or
practicable to access the entire population for study. As a result, it is apt
to take a representative part as a substitute for the entire population. This
representative part of the population is known as sample. The method of
collecting data from the sample is known as sampling or sample survey.
Various sampling designs and their selections are discussed in the last chapter.
Statistical Survey
A survey is a process of collecting data either from the population or from
sample units. The statistical survey may be either by Census method or by
Sampling method. The purpose of conducting a sample survey is to collect
information about the population using a sample.
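The difference between the two methods can be sketched in Python. This is a minimal illustration under stated assumptions: the population of 1,000 student marks is generated artificially, the sample size of 50 is arbitrary, and simple random selection is used only because it is the easiest design to show; the names `population`, `sample`, `census_mean` and `sample_mean` are introduced here and do not come from the text.

```python
import random
import statistics

# Hypothetical finite population: marks (out of 100) of 1,000 students.
rng = random.Random(2024)  # fixed seed so the sketch is repeatable
population = [rng.randint(0, 100) for _ in range(1000)]

# Census method: every unit of the population is enumerated.
census_mean = statistics.mean(population)

# Sampling method: only a part of the population is enumerated,
# here 50 units drawn at random without replacement.
sample = rng.sample(population, k=50)
sample_mean = statistics.mean(sample)

print(f"Census mean (1000 units): {census_mean:.2f}")
print(f"Sample mean (50 units):   {sample_mean:.2f}")
```

The sample mean is only an estimate of the census mean, but it is obtained by enumerating a small fraction of the units, which is why sampling is preferred when the population is very large.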
2. Distinguish between population and sample.
3. Explain finite and infinite populations with the help of examples.
2.2 Variables
Consider a group of people in a locality. The members of the group are found to
be varying in many factors like sex, age, eye colour, intelligence, height, weight,
blood pressure etc. The factors which can vary from one object to another are
called variables. Among these variables sex, eye colour and intelligence which
cannot be numerically measured are called qualitative variables or attributes.
A qualitative variable is one that can be identified by noting its presence or
identified with different categories of the factor. The other variables height,
weight, age and blood pressure which are numerically measured are called
quantitative variables. A quantitative variable consists of numerical values.
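The distinction can be illustrated with a short Python sketch. The record `person` and the helper `kind_of_variable` are hypothetical names introduced here for illustration. The rule used (a numeric value means a quantitative variable) is a simplifying assumption: a numeric code such as 1 for "male" would still stand for a qualitative variable.

```python
from numbers import Number

# Hypothetical record for one member of the group described above.
person = {
    "sex": "female",
    "eye colour": "brown",
    "age": 34,         # years
    "height": 162.5,   # centimetres
    "weight": 58.0,    # kilograms
}

def kind_of_variable(value):
    """Label a value as quantitative if it is numeric, else qualitative."""
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if isinstance(value, Number) and not isinstance(value, bool):
        return "quantitative"
    return "qualitative"

classification = {name: kind_of_variable(v) for name, v in person.items()}
for name, kind in classification.items():
    print(f"{name}: {kind}")
```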
Depending on the values taken by a quantitative variable, it is further classified
as discrete variable and continuous variable. If the variable takes specific
values only, it is called a discrete variable. The variable, number of children in a
family, does not take values other than 0, 1, 2, 3, etc. That is, there is a specified
Levels of Measurement -
Nominal, Ordinal and Cardinal Data
1. Compare Nominal, Ordinal, Cardinal data.
2. Give some examples of Nominal, Ordinal and Cardinal data.
those which are already in existence, and which have been collected for some
other purpose than the answering of question in hand”
Primary data, after use, become secondary data | Secondary data cannot be converted to primary data
Questionnaire
A questionnaire is usually sent by post or by email to selected informants.
The informants are allowed a specified time to fill in the questionnaire and
must return it to the investigator. Here the quality of the data obtained
depends on the quality of the questions and the honesty of the informants.
As the informants themselves fill in the data, they should be literate. This method
is suitable in cases where the informants are widely scattered. One of the
main disadvantages of this method is that the chance of getting incomplete
information is large.
2.4 Questionnaire and Schedule 23
Schedule
If the informants are not widely scattered, or if they are not literate,
the enumerator himself/herself can personally approach the informants with
the set of questions and collect the information. These questions need not be as
detailed as those in a questionnaire, and may not contain explanatory footnotes
or explanations of the terms used. Such a set of questions used for data collection
is termed a schedule. In some cases the questionnaire itself can be used as a
schedule.
Questionnaire | Schedule
It is often sent by post | Enumerators carry the schedule personally to the informant
Answers are filled by the respondents | Answers are filled by the enumerators
Informants are to be literate | Informants need not be literate
Success depends on the quality of questions and sincerity of the informant | Success depends on the honesty and competence of the enumerator
Chance of getting incomplete information is more | Chance of getting incomplete information is less as enumerators explain the questions
3. Questions should not contain technical terms or words with uncommon
meanings; such questions lead to different information from different
informants.
10. Questionnaire should be attractive so as to impress the informant
Drafting of Questionnaire
A sample questionnaire is prepared below for studying the socio economic
status of people residing in a village.
1. Name :
2. Address:
a) Own house b) Rental
6. Type of House :
a) Temporary b) Structured
7. Toilet facility :
a) Proper b) Improper
8. Water Facility :
a) Own well b) Water provided by panchayat c) Other sources
9. Electrified Home :
Yes No
12. Occupation :
a) Govt. service b) Non-Govt. service c) Own business
d) Agriculture e) Others
16. If Yes, give the number of vehicles in each category :
a) Two wheeler b) Three wheeler
c) Car d) Others
20. Are you able to properly maintain your standard of living with your
actual income?
Yes No
Consider a situation in which the investigator wants to collect data about a
resident in a city. In this case, the investigator may approach a third party,
called a witness, who is capable of giving sufficient information about the resident.
This is a case of indirect oral investigation. Indirect oral investigation is
applicable in cases where the informant is reluctant to give information or
when the informant is not available. The disadvantage of this method is that
the reliability of the information heavily depends on the quality and honesty of
the witness or intermediate person.
Direct Observation
Telephone interview
In some cases the informant may be reluctant to give answer in a face to face
personal interview. In such cases it is better to select another method for data
collection. Telephone interview is one such method. In this case the investigator
collects data from the informant indirectly but personally. This is less time
consuming and cheaper than direct personal interview. The disadvantage is
that it will not work in some rural areas where telephone connectivity is very
low.
Questionnaires and schedules are among the most popular methods for collecting
primary data. As questionnaires are usually mailed to the respondents, this
is known as the mailed questionnaire method. The only difference between the
questionnaire and schedule is that in questionnaires the answers are filled
by the respondents themselves but in schedule, the answers are filled by the
enumerators.
2.6 Sources of Secondary Data
Any published or unpublished data which are reliable for the current situation
are a source of secondary data. While collecting secondary data the investigator
must be aware of the following points.
• The person who collected the data and the purpose for which they are
collected.
• Government publications.
• Office records in panchayats, municipalities etc.
• Survey reports of various research organizations.
• Survey reports in Journals, Newspapers and other publications.
• Websites.
Let us sum up
For studying statistics, the first step is collection of data. It is the systematic gathering of
information. Statistical surveys are tools of data collection. Data means information
regarding a variable. A variable may be qualitative or quantitative, and a quantitative
variable may be discrete or continuous. The two survey techniques are census and sampling.
Depending on the source of information, data can be classified as primary or secondary.
The important primary data collection methods are direct personal investigation,
indirect oral investigation, direct observation, telephone interview, mailed questionnaire
or schedule sent through enumerators and focus group discussion. Any published or
unpublished data which were collected by a third party, and are now used by the investigator
for his or her purpose, are secondary data. Better trained persons are required for converting
the secondary data to the required form.
Learning outcomes
Evaluation Items
1. Data that can be classified according to colour are measured on a
. . . . . . . . . scale
a) Nominal b) Ordinal c) Cardinal
a) a single value b) only two values in a set c) a group of values in a set
d) all the above
16. What kind of data do you receive when you are told about
a) blood type b) household c) heights of waterfalls
18. What are the points to be remembered while collecting secondary data?
21. Find the discrete data and continuous data from the following list
a. Number of shares sold each day in a stock market
b. Temperature recorded every half an hour at a weather bureau
c. Life time of television tubes produced by a company
d. Yearly income of employees in a company
e. The age of an individual
f. Number of petals a flower has.
26. A survey is to be carried out amongst school children to study how they
spend time after school hours. Prepare a questionnaire for that purpose.
27. Indicate whether the following statements are true or false. If false,
correct the statements.
a. Secondary data are generally used in those cases where the primary
data do not provide an adequate basis for analysis.
b. Secondary data do not need much scrutiny and should be accepted
at its face value
c. The task of editing secondary data is a highly specialised one
d. The questionnaire requires a pre-testing before putting into practice
28. Which type of study (Census or Sampling) do you prefer in the following
cases? Give reasons.
a. The effect of a medicine
b. To study about the wage distribution of 250 employees in a company
c. A study on the role of media in the marketing of a face cream
d. A study of the heartbeat of a patient who is admitted to the ICCU of a hospital
e. A study on the number of petals of a flower of a special kind
29. Which of the primary data collection methods do you suggest in the
following situations? Give reasons.
a. You are appointed as marketing manager of a company. The company
introduces a washing machine with many options. You are asked by your
employer to prepare a datasheet regarding the opinion of your customers
about the new equipment
b. To prepare a report for a media outlet on the Nehru Trophy Vallamkali in the
current year.
c. To introduce shift sessions in an institution
30. As a reporter for a certain media outlet, you get an opportunity to interview an
IAS topper. Which type of data collection method will you use? List out
other primary data collection methods.