
Statistics

This document discusses statistics and statistical methods. It defines statistics as both numerical data related to a particular topic (plural sense) and the techniques used to analyze quantitative data (singular sense). Some key statistical methods covered include direct personal investigation, indirect oral investigation, sampling methods, and classification of data. The document also discusses sources of data, types of data, reliability of sampling data, and important agencies that collect statistical data.

Uploaded by

Vrinda Tayade
Copyright
© All Rights Reserved

Statistics – A Plural Sense Statistics refers to information in terms of numbers or numerical data,
such as population statistics, employment statistics, etc.

According to Bowley, “Statistics are numerical statements of facts in any department of enquiry
placed in relation to each other.”

Features of Statistics in the Plural Sense

 Aggregate of facts
 Numerically expressed
 Affected by multiplicity of causes
 Reasonable accuracy
 Placed in relation to each other
 Predetermined purpose
 Estimated

Statistics – A Singular Sense It refers to techniques or methods relating to collection,
classification, presentation, analysis and interpretation of quantitative data.

According to Seligman, “Statistics is the science which deals with the methods of collecting,
classifying, presenting, comparing and interpreting numerical data collected to throw some light
on any sphere of enquiry”.

Importance of Statistics in Economics:

 Quantitative expression of economic problem


 Inter-sectoral and inter-temporal comparisons
 Working out cause and effect relationship
 Construction of economic theories or economic models
 Economic forecasting
 Formulation of policies

Limitations of Statistics:

 Study of numerical facts only


 Study of aggregates only
 Results are true only on an average
 Without reference, results may prove to be wrong
 Can be used only by the experts
 Prone to misuse

Collection of Data
Sources of Data There are two sources of data

 Primary Source of Data It implies collection of data from its source of origin.
 Secondary Source of Data It implies collection of data from some agency or
institution which already happens to have collected the data through statistical
survey.

Types of Data There are two types of data

 Primary Data Data collected by the investigator for his own purpose for the first
time, from beginning to end are called primary data.
 Secondary Data These data have already been collected by somebody else and are
available in the form of published or unpublished reports.

Principal Differences between Primary and Secondary Data

 Primary data are original and secondary data are already in existence and therefore,
are not original.
 Primary data do not need any adjustment; secondary data need to be adjusted to
suit the objective of the study in hand.
 Primary data are expensive and secondary data are less expensive.

Statistical Methods of Data Collection


(i) Direct Personal Investigation
It is the method by which data are personally collected by the investigator from the informants.
Merits and demerits of this method are as follows.
(a) Merits

 Originality
 Reliability
 Uniformity
 Accuracy
 Related information
 Elastic

(b) Demerits

 Difficult to cover wide areas


 Costly
 Personal bias
 Limited coverage

(ii) Indirect Oral Investigation


It is the method by which information is obtained not from the persons regarding whom the
information is needed. It is collected orally from other persons who are expected to possess the
necessary information. Merits and demerits of this method are given below
(a) Merits

 Wide coverage
 Expert opinion
 Simple
 Less expensive
 Free from bias

(b) Demerits

 Less accurate
 Doubtful conclusions
 Biased

(iii) Information from Local Sources or Correspondents


Under this method, the investigator appoints local persons or correspondents at different places.
Merits and demerits of this method are given below
(a) Merits

 Economical
 Wide coverage
 Continuity
 Suitable for special purpose

(b) Demerits

 Loss of originality
 Lack of uniformity
 Personal bias
 Less accurate
 Delay in collection

(iv) Information Through Questionnaires and Schedules
There are two ways of collecting information on the basis of a questionnaire
(a) Mailing Method Under this method questionnaires are mailed to the informants. The method
is most suited when

 The area of the study is very wide.


 The informants are educated.

(b) Enumerator’s Method Under this method, the enumerator himself fills the schedules after
seeking information from the informants. This method is mostly used when
 the field of investigation is large.
 the investigation needs specialised and skilled investigators.
 the investigators are well versed in the local language and cultural norms of the
informants.

(c) Collection of Secondary Data There are two main sources of secondary data

 Published sources
 Unpublished sources

(d) Published Sources Some of the published sources of secondary data are

 Government publication
 Semi-government publication
 Reports of committees and commissions
 Publications of trade associations
 Publication of research institutions
 Journals and papers
 Publication of research scholars
 International publication

(e) Unpublished Sources These data are collected by the government organisations and others,
generally for their self use or office record.

In order to assess the reliability, suitability and adequacy of the data, the following
points must be kept in mind
 Ability of the collecting organisation
 Objective and scope
 Method of collection
 Time and condition of organisation
 Definition of the unit
 Accuracy

(v) Census Method
Census method is that method in which data are collected covering every item of the universe or
population relating to the problem under investigation. Merits and demerits of this method are
given below
(a) Merits

 Reliable and accurate


 Less biased
 Extensive information
 Study of diverse characteristic
 Study of complex investigation
 Indirect investigation
(b) Demerits

 Costly
 Large manpower
 Not suitable for large investigation

(vi) Sample Method


It is that method in which data are collected about a sample, that is, a group of items taken from
the population for examination, and conclusions are drawn on their basis. Merits and demerits of
this method are given below
(a) Merits

 Economical
 Time saving
 Identification of error
 Large investigation
 Administrative convenience
 More scientific

(b) Demerits

 Partial
 Wrong conclusions
 Difficulty in selecting representative sample
 Difficulty in framing a sample
 Specialised knowledge

Methods of Sampling
(i) Random Sampling Random sampling is that method of sampling in which each and every
item of the universe has equal chance of being selected in the sample.
Random sampling may be done in any of the following ways

 Lottery method
 Tables of random number

(ii) Purposive or Deliberate Sampling It is that method in which the investigator himself makes
the choice of the sample items which in his opinion are the best representative of the universe.
(iii) Stratified or Mixed Sampling According to this method of sampling, the population is divided
into different strata having different characteristics and some of the items are selected from each
stratum, so that the entire population gets represented.
(iv) Systematic Sampling According to this method, units of the population are numerically,
geographically or alphabetically arranged. Every nth item of the numbered list is selected as a
sample item.
(v) Quota Sampling In this method, the population is divided into different groups or classes
according to different characteristics of the population.
(vi) Convenience Sampling In this method, sampling is done by the investigator in such a
manner that suits his convenience.
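
The random, systematic and stratified methods above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 20 numbered units, the fixed seed, the interval n = 4 and the two strata are all hypothetical and do not come from the notes.

```python
import random

population = list(range(1, 21))  # hypothetical universe of 20 numbered units
rng = random.Random(42)          # fixed seed so the sketch is repeatable

# Random sampling: every unit has an equal chance of selection (like a lottery).
random_sample = rng.sample(population, 5)

# Systematic sampling: arrange the units, then take every nth item.
n = 4
start = rng.randrange(n)         # random starting point within the first n units
systematic_sample = population[start::n]

# Stratified sampling: divide the population into strata and draw from each one.
strata = {"low": population[:10], "high": population[10:]}
stratified_sample = [unit for units in strata.values() for unit in rng.sample(units, 2)]
```

Drawing from every stratum is what guarantees that each part of the population appears in the sample, which plain random sampling does not.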

Reliability of Sampling Data


It depends mainly on the following factors

 Size of the sample


 Method of sampling
 Bias of correspondents and enumerators
 Training of enumerators

Important agencies at the national level which collect, process and tabulate statistical data are
NSSO (National Sample Survey Organisation), RGI (Registrar General of India), DGCIS
(Directorate General of Commercial Intelligence and Statistics) and the Labour Bureau.

Organisation of Data
Organisation of Data
Organisation of data refers to the arrangement of figures in such a form that comparison of the
mass of similar data may be facilitated and further analysis may be possible.

Classification
Classification is the process of arranging things in groups or classes according to their
resemblances and affinities and gives expression to the unity of attributes that may exist amongst
a diversity of individuals.

Objectives of Classification

 Simplification and Briefness


 Utility
 Distinctiveness
 Comparability
 Scientific arrangement
 Attractive and effective

Characteristics of a Good Classification

 Comprehensiveness
 Clarity
 Homogeneity
 Suitability
 Stability
 Elastic
Basis of Classification

 Geographical Classification This classification of data is based on the geographical
or locational differences of the data.
 Chronological Classification When data are classified on the basis of time, it is
known as chronological classification.
 Qualitative Classification This classification is according to qualities or attributes
of the data.
This classification may be of two types

 Simple classification

 Manifold classification

 Quantitative or Numerical Classification Data are classified into classes or groups
on the basis of their numerical values. Quantitative classification is also called
classification by variables.
 Concept of Variable A characteristic or a phenomenon which is capable of being
measured and changes its value over time is called a variable.
The variable may be either discrete or continuous

 Discrete Variable These are those variables that increase in jumps or
in complete numbers.

 Continuous Variable Variables that assume a range of values or
increase not in jumps but continuously or in fractions are called
continuous variables.

 Raw Data A mass of data in its crude form is called raw data.

Types of Statistical Series Statistical series are of two types

 Individual Series These are those series in which the items are listed singly. These
series may be presented in two ways

 According to serial numbers

 Ascending or descending order of data

 Frequency Series Frequency series may be of two types

 Discrete Series or Frequency Array It is that series in which data are
presented in a way that exact measurements of items are clearly shown.
In this series there are no class intervals; each particular item in the
series is shown with its frequency.
 Frequency Distribution It is that series in which items cannot be
exactly measured. The items assume a range of values and are placed
within limits called class intervals.

Frequency distribution is also known as continuous series or series with class-intervals, or series
of grouped data.

Types of Frequency Distribution

 Exclusive Series It is that series in which every class-interval excludes items
corresponding to its upper limit.
 Inclusive Series An inclusive series is that series which includes all items up to its
upper limit.
 Open End Series An open end series is that series in which the lower limit of the
first class-interval and the upper limit of the last class-interval are missing, as in
“below 5” and “20 and above”.
 Cumulative Frequency Series It is that series in which the frequencies are
continuously added corresponding to each class-interval in the series.
There are two ways of converting a series into a cumulative frequency series

 Cumulative frequencies may be expressed on the basis of upper class
limits of the class-intervals.

 Cumulative frequencies may be expressed on the basis of lower class
limits of the class-intervals.

 Mid Values Frequency Series Mid value frequency series are those series in which
we have only mid values of the class intervals and the corresponding frequencies.
 Univariate Distribution The frequency distribution of a single variable is called a
univariate distribution.
 Bivariate Distribution A bivariate distribution is the frequency distribution of two
variables.
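
As a rough illustration of the two cumulative frequency conversions described above, the sketch below builds "less than" totals against upper class limits and "more than" totals against lower class limits. The class intervals and frequencies are hypothetical.

```python
from itertools import accumulate

# Hypothetical exclusive series: class intervals and their frequencies.
classes = [(0, 10), (10, 20), (20, 30), (30, 40)]
frequencies = [4, 6, 7, 3]

# "Less than" cumulative frequencies, read against the upper class limits.
less_than = list(accumulate(frequencies))
# "More than" cumulative frequencies, read against the lower class limits.
more_than = list(accumulate(reversed(frequencies)))[::-1]

for (lower, upper), lt, mt in zip(classes, less_than, more_than):
    print(f"less than {upper}: {lt:2d}   more than {lower}: {mt:2d}")
```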

Presentation of Data
Textual Presentation
In textual presentation, data are a part of the text of study or a part of the description of the
subject matter of study.

Tabular Presentation of Data


“Tabulation involves the orderly and systematic presentation of numerical data in a form
designed to elucidate the problem under consideration”
Components of a Table
Following are the principal components of a table

 Table number
 Title
 Head note
 Stubs
 Caption
 Body or field
 Footnotes
 Source

Classification of Data and Tabular Presentation


(i) Qualitative Classification of Data and Tabular Presentation Qualitative classification occurs
when data are classified on the basis of qualitative attributes or qualitative characteristics of a
phenomenon.
(ii) Quantitative Classification of Data This occurs when data are classified on the basis of
quantitative characteristics of a phenomenon.
(iii) Temporal Classification of Data In this, data are classified according to time, and time
becomes the classifying variable.
(iv) Spatial Classification In spatial classification, place or location becomes the classifying
variable. It may be a village, a town, a district, etc.
(v) Merits of Tabular Presentation

 Simple and brief presentation


 Facilitates comparison
 Easy analysis
 Highlights characteristics of data
 Economical

Diagrammatic Presentation of Data


Diagrammatic presentation translates quite effectively the highly abstract ideas contained in
numbers into a more concrete and easily comprehensible form. It is classified as given below
(i) Bar Diagrams Bar diagrams are those diagrams in which data are presented in the form of bars
or rectangles. Types of Bar Diagram are as follows

 Simple Bar Diagrams Simple bar diagrams are those diagrams which are based on
a single set of numerical data.
 Multiple Bar Diagrams These are those diagrams which show two or more sets of
data simultaneously.
 Sub Divided Bar Diagram Sub-divided bar diagrams are those diagrams which
simultaneously present total values as well as part values of a set of data.
 Percentage Bar Diagram Percentage bar diagrams are those diagrams which show
simultaneously, different parts of the values of a set of data in terms of percentages.

(ii) Pie or Circular Diagrams Pie diagram is a circle divided into various segments showing the
per cent values of a series. This diagram does not show absolute values.
(iii) Frequency Diagram Data in the form of grouped frequency distributions are generally
represented by frequency diagram like histogram, frequency polygon, frequency curve and
ogive.

 Histogram A histogram is a two dimensional diagram. It is a set of rectangles with
bases as the intervals between class boundaries and with areas proportional to the
class frequencies.
Histograms of frequency distributions are of two types

 Histogram of equal class intervals

 Histogram of unequal class intervals

 Polygon Polygon is another form of diagrammatic presentation of data. It is formed
by joining mid points of the tops of all rectangles in a histogram. However, a
polygon can be drawn even without constructing a histogram.
 Frequency Curve A frequency curve is a curve which is plotted by joining the mid
points of all tops of histogram by free hand smoothed curves and not by straight
lines.
 Ogive or Cumulative Curve Ogive or cumulative curve is the curve which is
constructed by plotting cumulative frequency data on graph paper, in the form
of a smooth curve.
A cumulative frequency curve or ogive may be constructed in two ways

 Less than Method In this method, beginning from the upper limit of the
1st class interval, we go on adding the frequencies corresponding to every
next upper limit of the series.

 More than Method In this method, we take cumulative total of the
frequencies beginning with the lower limit of the 1st class interval.

(iv) Arithmetic Line Graph An arithmetic line graph is also called time series graph. In it, time is
plotted along the x-axis and the value of the variable along the y-axis. The line obtained by
joining these plotted points is called a time series graph.

Rules for Constructing a Graph

 Choice of scale
 Proportion of axis
 Method of plotting the points
 Lines of different types
 Table of data
 Use of false base line
 To draw a line or curve

 One Variable Graph One variable graphs are those graphs in which
values of only one variable are shown with respect to some time
period.

 Two or More than Two Variable Graphs These are the graphs in
which values of two or more variables are simultaneously shown with
respect to some period of time.

Merits of Diagrammatic and Graphic Presentation

 Simple and understandable information


 Lasting impact
 No need of training or specialised knowledge
 Attractive and effective means of presentation
 A quick comparative glance
 Informative and entertaining
 Location of averages
 Study of correlation

Limitations of Diagrammatic and Graphic Presentation

 Limited use
 Misuse
 Only preliminary conclusions

Measures of Central Tendency


Central Tendency
A central tendency refers to a central value or a representative value of a statistical series.
According to Clark, “An average is a figure that represents the whole group”.

Types of Statistical Averages


Averages are broadly classified into two categories

 Mathematical Averages
 Positional Averages
Arithmetic Mean
Arithmetic Mean is the number which is obtained by adding the values of all the items of a series
and dividing the total by the number of items.
Arithmetic Mean is generally written as $\bar{X}$. It may be expressed in the form of following formula
$\bar{X} = \frac{x_1 + x_2 + x_3 + \dots + x_N}{N}$ or $\bar{X} = \frac{\Sigma X}{N}$
Types of Arithmetic Mean

 Simple Arithmetic Mean


 Weighted Arithmetic Mean

Methods of Calculating Simple Arithmetic Mean


(i) Individual Series In the case of individual series, Arithmetic Mean may be calculated by two
methods

 Direct Method According to this method, we find the Arithmetic Mean from the
following formula
$\bar{X} = \frac{\Sigma X}{N}$ or $\bar{X} = \frac{\text{Total value of the items}}{\text{Number of items}}$
 Short-cut Method By short-cut method, we find the Arithmetic Mean from the
following formula
$\bar{X} = A + \frac{\Sigma d}{N}$
Here, $\bar{X}$ = Arithmetic Mean; A = Assumed average; Σd = Net sum of the
deviations of the different values from the assumed average; and N = Number of
items in the series.

(ii) Discrete Series There are three methods of calculating mean of the discrete series

 Direct Method Direct method of estimating mean of the discrete frequency series
uses the formula
$\bar{X} = \frac{\Sigma fX}{\Sigma f}$
 Short-cut Method Short-cut method of estimating mean of the discrete frequency
series uses the following formula
$\bar{X} = A + \frac{\Sigma fd}{\Sigma f}$
 Step-deviation Method This method is a variant of short-cut method. It is adopted
when deviations from the assumed mean have some common factor c
$\bar{X} = A + \frac{\Sigma fd'}{\Sigma f} \times c$
Here, d′ = d/c is the deviation divided by the common factor.

(iii) Frequency Distribution
There are three methods of calculating mean in frequency distribution
(a) Direct Method Direct method of estimating mean of the frequency distribution uses the
formula
$\bar{X} = \frac{\Sigma fm}{\Sigma f}$
m = mid-value, mid-value = $\frac{L_1 + L_2}{2}$
L1 = lower limit of the class
L2 = upper limit of the class
(b) Short-cut Method Short-cut method of estimating mean of the frequency distribution uses the
formula
$\bar{X} = A + \frac{\Sigma fd}{\Sigma f}$
(c) Step Deviation Method According to this method, we find the Arithmetic Mean by the
following formula
$\bar{X} = A + \frac{\Sigma fd'}{\Sigma f} \times c$
(d) Weighted Arithmetic Mean It is the mean of weighted items of the series. Different items are
accorded different weights depending on their relative importance. The weighted sum of the
items is divided by the sum of the weights.
Calculation of Weighted Mean
We find the weighted mean from the following formula
$\bar{X}_W = \frac{\Sigma WX}{\Sigma W}$
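
A small worked sketch may help show that the direct, short-cut and step-deviation methods give the same mean, and how the weighted mean is formed. The series, the assumed mean A = 20, the common factor c = 10 and the weights are hypothetical values chosen for easy arithmetic.

```python
# Hypothetical discrete series: values X with frequencies f.
values = [10, 20, 30, 40, 50]
freqs = [2, 3, 5, 3, 2]
n = sum(freqs)

# Direct method: mean = Σfx / Σf
direct = sum(f * x for x, f in zip(values, freqs)) / n

# Short-cut method: mean = A + Σfd / Σf, with d = X - A
A = 20  # assumed mean (any value gives the same result)
short_cut = A + sum(f * (x - A) for x, f in zip(values, freqs)) / n

# Step-deviation method: mean = A + (Σfd' / Σf) × c, with d' = (X - A) / c
c = 10  # common factor of the deviations
step_dev = A + sum(f * ((x - A) / c) for x, f in zip(values, freqs)) / n * c

# Weighted mean: ΣWX / ΣW
weights = [1, 2, 3]
items = [60, 70, 80]
weighted = sum(w * x for w, x in zip(weights, items)) / sum(weights)
```

All three methods return 30 for this series; the short-cut and step-deviation forms only reduce the size of the numbers handled by hand.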
(i) Merits

 Simplicity
 Certainty
 Based on all items
 Algebraic treatment
 Stability
 Basis of comparison
 Accuracy test

(ii) Demerits

 Effect of extreme value


 Mean value may not figure in the series at all
 Laughable conclusions
 Unsuitability
 Misleading conclusions

Median
“The Median is that value of the variable which divides the
group into two equal parts, one part comprising all values
greater than the Median value and the other part
comprising all the values smaller than the Median value”.
(i) Calculation of Median
(a) Individual Series Calculation of Median in individual
series involves the following formula
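$M = \text{Size of } \left(\frac{N+1}{2}\right)\text{th item}$
When N of the series is an even number, the Median lies between the two middle items and is
estimated using the following formula
$M = \frac{\text{Size of } \left(\frac{N}{2}\right)\text{th item} + \text{Size of } \left(\frac{N}{2}+1\right)\text{th item}}{2}$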
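
The median of an individual series can be sketched as below, assuming the usual convention that an even-sized series takes the mean of its two middle items. The function name and the sample series are hypothetical.

```python
def median(items):
    """Median of an individual series: the size of the (N+1)/2 th item."""
    ordered = sorted(items)          # the series must first be arranged in order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]          # odd N: the middle item
    return (ordered[mid - 1] + ordered[mid]) / 2  # even N: mean of the two middle items

odd_median = median([7, 3, 9, 5, 11])
even_median = median([7, 3, 9, 5])
```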

(b) Discrete Series Calculation of Median in case of discrete series or frequency array
(the calculation continues after the Lorenz Curve section below)

Measures of Dispersion
Dispersion
“It is the measure of the variation of the item”. According to Spiegel, “The degree to which
numerical data tend to spread about an average value is called the variation or dispersion of the
data”.
Different methods of measuring dispersion are

 Range
 Quartile deviation
 Mean deviation
 Standard deviation

Range Range is the difference between the highest value and the lowest value in a series.
R = H – L, where H = Highest value of series and L = Lowest value of series
(alternative notation: R = L – S, where L = Largest value and S = Smallest value of series)

Coefficient of range = $\frac{H - L}{H + L}$ (or $\frac{L - S}{L + S}$ in the alternative notation)

Calculation of Range and Coefficient of Range
(i) Individual Series and Discrete Series
Range = H – L
Coefficient of Range = $\frac{H - L}{H + L}$
(ii) Frequency Distribution Series
Range may be found in either of two ways

 Mid values of the class intervals are found; the difference between the highest and
lowest mid values would be the range.
 Alternatively, the difference between the upper limit of the last class interval and
the lower limit of the first class interval in the series would be the range.

(iii) Inter Quartile Range
Difference between third quartile (Q3) and first quartile (Q1) of a series is called Inter quartile range.
IQR = Q3 – Q1
Quartile Deviation
Quartile deviation is half of inter quartile range.
QD = $\frac{Q_3 - Q_1}{2}$
It is also called semi-inter quartile range.
(i) Coefficient of Quartile Deviation (Coefficient of QD)
Coefficient of QD = $\frac{Q_3 - Q_1}{Q_3 + Q_1}$
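
The range and quartile measures above can be illustrated with a short sketch. It assumes one common textbook convention, not stated in these notes, that Q1 is the size of the (N+1)/4 th item and Q3 the size of the 3(N+1)/4 th item, with linear interpolation for fractional positions. The helper `item_at` and the series are hypothetical.

```python
def item_at(ordered, position):
    """Size of the item at a 1-based, possibly fractional, position (linear interpolation)."""
    lower = int(position)
    fraction = position - lower
    if fraction == 0:
        return ordered[lower - 1]
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])

series = sorted([12, 15, 18, 20, 25, 30, 35])  # hypothetical series, N = 7
n = len(series)

H, L = series[-1], series[0]          # highest and lowest values
value_range = H - L
coeff_range = (H - L) / (H + L)

q1 = item_at(series, (n + 1) / 4)     # assumed convention for Q1
q3 = item_at(series, 3 * (n + 1) / 4) # assumed convention for Q3
iqr = q3 - q1
qd = iqr / 2
coeff_qd = (q3 - q1) / (q3 + q1)
```

Other quartile conventions exist and can give slightly different Q1 and Q3 for the same data, so the convention used should always be stated.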
(ii) Calculation of Quartile Deviation
(a) Individual Series and Discrete Series First find out Q1 and Q3 from the following equations

(b) Frequency Distribution

Mean Deviation
“Mean deviation is the arithmetic average of deviation of all the values taken from a statistical
average of series. In taking deviation of values, algebraic signs + and – are not taken into
consideration, that is negative deviations are also treated as positive deviations”.
(i) Formulas for Mean Deviation
(a) If deviations are taken from median, the following formula is used
$MD_M = \frac{\Sigma |d_M|}{N}$
(b) If deviations are taken from arithmetic mean of the series
$MD_{\bar{X}} = \frac{\Sigma |d_{\bar{X}}|}{N}$

(ii) Coefficient of Mean Deviation

 Coefficient of MD from Mean = $\frac{MD_{\bar{X}}}{\bar{X}}$
 Coefficient of MD from Median = $\frac{MD_M}{M}$
 Coefficient of MD from Mode = $\frac{MD_Z}{Z}$

(iii) Calculation of Mean Deviation or Coefficient of Mean Deviation


(a) Individual Series
Estimating MD through Median, $MD_M = \frac{\Sigma |d_M|}{N}$
Estimating MD through Mean, $MD_{\bar{X}} = \frac{\Sigma |d_{\bar{X}}|}{N}$
Estimating Coefficient of MD through Median, Coefficient of MD = $\frac{MD_M}{M}$
Estimating Coefficient of MD through Mean, Coefficient of MD = $\frac{MD_{\bar{X}}}{\bar{X}}$
(b) Discrete Series
Estimating MD through Median, $MD_M = \frac{\Sigma f|d_M|}{N}$, where N = Σf
Estimating MD through Mean, $MD_{\bar{X}} = \frac{\Sigma f|d_{\bar{X}}|}{N}$
Estimating Coefficient of MD through Median, Coefficient of MD = $\frac{MD_M}{M}$
Estimating Coefficient of MD through Mean, Coefficient of MD = $\frac{MD_{\bar{X}}}{\bar{X}}$
(c) Frequency Distribution Series
Mean deviation from Median, $MD_M = \frac{\Sigma f|d_M|}{\Sigma f}$
Coefficient of MD = $\frac{MD_M}{M}$
Mean deviation from Mean, $MD_{\bar{X}} = \frac{\Sigma f|d_{\bar{X}}|}{\Sigma f}$
Coefficient of MD = $\frac{MD_{\bar{X}}}{\bar{X}}$
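
A minimal sketch of mean deviation for an individual series follows. It uses Python's standard `statistics` module for the mean and median; the helper `mean_deviation` and the series are hypothetical.

```python
from statistics import mean, median

series = [4, 6, 8, 10, 12]  # hypothetical individual series

def mean_deviation(items, centre):
    """Average of the absolute deviations |d| of the items from the chosen centre."""
    return sum(abs(x - centre) for x in items) / len(items)

md_from_median = mean_deviation(series, median(series))
md_from_mean = mean_deviation(series, mean(series))

# Coefficients: mean deviation divided by the average it was taken from.
coeff_md_median = md_from_median / median(series)
coeff_md_mean = md_from_mean / mean(series)
```

Taking absolute values is the step that ignores the + and – signs; without it the deviations from the mean would always sum to zero.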
Standard Deviation
Standard deviation is the square root of the arithmetic mean of the squares of deviations of the
items from their mean values.
Coefficient of Standard Deviation
This is a relative measure of the dispersion of series.
Coefficient of standard deviation (Coefficient of σ) = $\frac{\sigma}{\bar{X}}$
(i) Calculation of Standard Deviation
(a) Direct Method
$\sigma = \sqrt{\frac{\Sigma x^2}{N}} = \sqrt{\frac{\Sigma (X - \bar{X})^2}{N}}$
Here, σ = Standard Deviation;
Σx² = Sum total of the squares of deviations,
$\bar{X}$ = Mean value,
x = X − $\bar{X}$ = Deviation from mean value;
N = Number of items
(b) Short-cut Method

(c) Step Deviation Method

(ii) Calculation of Coefficient of Variation


(a) Individual series = $\frac{\sigma}{\bar{X}} \times 100$
(b) Discrete series = $\frac{\sigma}{\bar{X}} \times 100$
(c) Frequency distribution series = $\frac{\sigma}{\bar{X}} \times 100$
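
The definition of standard deviation and the coefficient of variation can be checked with a short sketch. The series is hypothetical, and the division is by N (not N − 1), following the definition given in these notes.

```python
from math import sqrt

series = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical individual series
n = len(series)
mean = sum(series) / n

# Direct method: σ = sqrt(Σ(X - mean)² / N)
sigma = sqrt(sum((x - mean) ** 2 for x in series) / n)

coeff_sd = sigma / mean            # coefficient of standard deviation
coeff_variation = coeff_sd * 100   # coefficient of variation, in per cent
```

For this series the mean is 5 and the squared deviations sum to 32, so σ = 2 and the coefficient of variation is 40.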
Lorenz Curve
It is a curve that shows deviation of actual distribution from the line showing equal distribution.
(i) Construction of the Lorenz Curve

 Calculate class mid-points
 Calculate cumulative frequencies as in column 6
 Express the grand total of column 3 and 6 as 100 and convert the cumulative totals
in these columns into percentages.
 Now, on the graph paper, take the cumulative percentages of the variable on Y-axis
and cumulative percentages of frequencies on X-axis.
 Draw a line joining co-ordinate (0, 0) with (100, 100); this is called the line of equal
distribution.
 Plot the cumulative percentages of the variable with cumulative percentages of
frequency.
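Calculation of Median (continued from the Median section above)
(b) Discrete Series For a discrete series or frequency array, the calculation of Median
involves the following formula

$M = \text{Size of } \left(\frac{N+1}{2}\right)\text{th item}$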


(c) Frequency Distribution Series
The following formula is applied to determine the Median Value

Quartiles
If a statistical series is divided into four equal parts, the end value of each part is called a
Quartile.
(i) Calculation of Quartiles Quartile values (Q1 and Q3) are estimated differently for different sets
of series,
(a) Individual and Discrete Series

(b) Frequency Distribution Series In frequency distribution series, the class interval of Q1 and
Q3 are first identified as under

Percentiles
Percentiles divide the series into 100 equal parts, and are generally expressed as P.
Percentiles are estimated for different types of series as under
(i) Individual and Discrete Series
(ii) Frequency Distribution Series

Mode
The value of the variable which occurs most frequently in a distribution is called the mode.
According to Croxton and Cowden, “The mode may be regarded as the most typical of a series
of values”.
(i) Calculation of Mode

 Individual Series There are two ways of calculating Mode in individual series

 By inspection

 By converting individual series into discrete series

 Discrete Series There are two methods for calculation of mode in discrete frequency
series

 Inspection Method

 Grouping Method

 Frequency Distribution Series The exact value of Mode can be calculated with the
following formula
$Z = L_1 + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times i$
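
The mode formula for a frequency distribution can be sketched as follows. The notes do not define the symbols, so the sketch assumes the standard reading: L1 is the lower limit of the modal class, f1 its frequency, f0 and f2 the frequencies of the preceding and succeeding classes, and i the class interval. The distribution itself is hypothetical.

```python
# Hypothetical frequency distribution with equal class intervals.
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
freqs = [5, 8, 12, 7, 3]

modal = freqs.index(max(freqs))            # modal class = class with the highest frequency
L1 = classes[modal][0]                     # lower limit of the modal class
i = classes[modal][1] - classes[modal][0]  # class interval of the modal class
f1 = freqs[modal]
f0 = freqs[modal - 1] if modal > 0 else 0               # frequency of the preceding class
f2 = freqs[modal + 1] if modal < len(freqs) - 1 else 0  # frequency of the succeeding class

Z = L1 + (f1 - f0) / (2 * f1 - f0 - f2) * i
```

Here the modal class is 20 to 30, and the mode is pulled towards the neighbouring class with the larger frequency (10 to 20), giving a value below the class mid-point.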
Relative Position of Arithmetic Mean, Median and Mode Suppose we express,
Arithmetic Mean = Me
Median = Mi
Mode = Mo
The relative magnitudes of the three are Me > Mi > Mo or Me < Mi < Mo. The Median generally
lies between the Arithmetic Mean and the Mode.

Correlation
Correlation
It is a statistical method or a statistical technique that measures quantitative relationship between
different variables, like between price and demand.
According to Croxton and Cowden, “When the relationship is of a quantitative nature, the
appropriate statistical tool for discovering and measuring the relationship and expressing it in a
brief formula is known as correlation.”

Types of Correlation
Correlation is commonly classified into negative and positive correlation.

 Positive Correlation When two variables move in the same direction, such a
relation is called positive correlation, e.g., Relationship between price and supply
 Negative Correlation When two variables change in different directions, it is
called negative correlation, e.g., relationship between price and demand.

Degree of Correlation
Degree of correlation refers to the coefficient of correlation

(ii) Absence of Correlation


(iii) Limited Degree of correlation
The degree of correlation between 0 and 1 may be rated as

 High (between 0.75 and 1)
 Moderate (between 0.25 and 0.75)
 Low (between 0 and 0.25)

Methods of Estimating Correlation


(i) Scatter Diagram A scatter diagram offers a graphic expression of the direction and degree of
correlation.

Karl Pearson’s Coefficient of Correlation


This is also known as product moment correlation and simple correlation coefficient.
Karl Pearson has given a quantitative method of calculating correlation. Karl Pearson’s
coefficient of correlation is generally written as r.
Formula According to Karl Pearson’s method, the coefficient of correlation is measured as
$r = \frac{\Sigma xy}{N \sigma_x \sigma_y}$
Where,
r = Coefficient of correlation;
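x = X – $\bar{X}$ (deviation of each item of the x series from its mean)
y = Y – $\bar{Y}$ (deviation of each item of the y series from its mean)
σx = Standard deviation of x series
σy = Standard deviation of y series
N = Number of observations
If there is no need to calculate the standard deviations of x and y, r can be calculated directly
using the following formula
$r = \frac{\Sigma xy}{\sqrt{\Sigma x^2 \times \Sigma y^2}}$
Here, x = (X – $\bar{X}$), y = (Y – $\bar{Y}$)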
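
The two forms of Karl Pearson's formula given above should agree, which a short sketch can confirm. The two series are hypothetical, and standard deviations divide by N as elsewhere in these notes.

```python
from math import sqrt

X = [1, 2, 3, 4, 5]   # hypothetical x series
Y = [2, 4, 5, 4, 5]   # hypothetical y series
n = len(X)
mean_x, mean_y = sum(X) / n, sum(Y) / n

x = [a - mean_x for a in X]   # deviations of the x series from its mean
y = [b - mean_y for b in Y]   # deviations of the y series from its mean

sum_xy = sum(a * b for a, b in zip(x, y))
sum_x2 = sum(a * a for a in x)
sum_y2 = sum(b * b for b in y)

# Formula with standard deviations: r = Σxy / (N σx σy)
sigma_x, sigma_y = sqrt(sum_x2 / n), sqrt(sum_y2 / n)
r_sigma = sum_xy / (n * sigma_x * sigma_y)

# Direct formula: r = Σxy / sqrt(Σx² × Σy²)
r_direct = sum_xy / sqrt(sum_x2 * sum_y2)
```

Both forms are algebraically the same, because N σx σy equals the square root of Σx² × Σy²; the direct form simply skips the intermediate standard deviations.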
Short-cut Method
This method is used when mean value is not in whole number but in fractions. In this method,
deviations are calculated from an assumed mean in both the series.
Coefficient of correlation is calculated using the following formula
$r = \frac{\Sigma dx\,dy - \frac{\Sigma dx \times \Sigma dy}{N}}{\sqrt{\Sigma dx^2 - \frac{(\Sigma dx)^2}{N}} \times \sqrt{\Sigma dy^2 - \frac{(\Sigma dy)^2}{N}}}$
Here, dx = deviation of x series from the assumed mean = (X – A)
dy = deviation of y series from the assumed mean = (Y – A)
Σdxdy = sum of the products of dx and dy
Σdx2 = sum of squares of dx
Σdy2 = sum of squares of dy
Σdx = sum of deviations of x-series
Σdy = sum of deviations of y-series
N = Total number of items

Step Deviation Method


Coefficient of correlation is calculated using the following formula

Spearman’s Rank Correlation Coefficient


In 1904, Charles Edward Spearman developed a formula to calculate the coefficient of correlation of
qualitative variables. It is popularly known as Spearman’s rank difference formula or method.
Formula for Coefficient of Rank Correlation when Ranks are Equal

Here, m = number of items of equal ranks.

Importance or Significance of Correlation

 The study of correlation shows the direction and degree of relationship between the
variables.
 Correlation coefficient sometimes suggests cause and effect relationship.
 Correlation analysis facilitates business decisions because the trend path of one
variable may suggest the expected changes in the other.
 Correlation analysis also helps policy formulation.

Index Numbers
Index Number
An index number is a statistical device for measuring changes in the magnitude of a group of
related variables. It represents the general trend of diverging ratios from which it is calculated.
According to Croxton and Cowden, “Index numbers are devices for measuring difference in the
magnitude of a group of related variables.”

Methods of Constructing Index Numbers

Construction of Simple Index Numbers


There are two methods of constructing simple index numbers.
(i) Simple Aggregative Method In this method, we use the following formula
$P_{01} = \frac{\Sigma P_1}{\Sigma P_0} \times 100$
Here, P01 = Price index of current year
ΣP1 = Sum of prices of the commodities in the current year
ΣP0 = Sum of prices of the commodities in the base year
(ii) Simple Average of Price Relatives Method
According to this method, we first find out price relatives for each commodity and then take
simple average of all the price relatives.
Price relative = $\frac{\text{Current year price } (P_1)}{\text{Base year price } (P_0)} \times 100$
We can find out price index number of the current year by using the following formula
$P_{01} = \frac{\Sigma\left[\frac{P_1}{P_0} \times 100\right]}{N}$
Here, N = Number of commodities
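
The two simple index number methods can be compared on the same data. The prices of the three commodities below are hypothetical.

```python
# Hypothetical prices of three commodities.
base_prices = [10, 20, 40]      # P0
current_prices = [15, 25, 50]   # P1

# Simple aggregative method: P01 = ΣP1 / ΣP0 × 100
aggregative_index = sum(current_prices) / sum(base_prices) * 100

# Simple average of price relatives: P01 = Σ(P1 / P0 × 100) / N
relatives = [p1 / p0 * 100 for p0, p1 in zip(base_prices, current_prices)]
relatives_index = sum(relatives) / len(relatives)
```

The two methods generally give different answers (about 128.6 and 133.3 here), because the aggregative method is dominated by high-priced commodities while the price relatives method treats every commodity equally.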
Construction of Weighted Index Numbers
(i) Weighted Average of Price Relative Method
According to this method, weighted sum of the price relatives is divided by the sum total of the
weights. In this method, goods are given weight according to their quantity, thus
$P_{01} = \frac{\Sigma RW}{\Sigma W}$
Here, P01 = Index number for the current year in relation to the base year
W = weight
R = price relative
(ii) Weighted Aggregative Method Under this method, different goods are accorded weights
according to the quantity bought. Different techniques of weighting have therefore been
suggested; some of the well-known methods are as under

Fisher’s Method is considered as ‘Ideal’ because

 It is based on variable weights.


 It takes into consideration the price and quantities of both the base year and current
year.
 It is based on Geometric Mean (GM) which is regarded as the best mean for
calculating index number.
 Fisher’s index number satisfies both the Time Reversal Test and Factor Reversal
Test.
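
The properties listed above can be illustrated with a sketch of Fisher's index. The formulas are not reproduced in these notes, so the sketch assumes the standard definitions: a base-year-weighted index (usually credited to Laspeyres), a current-year-weighted index (usually credited to Paasche), and Fisher's index as the geometric mean of the two. The prices, quantities and the helper `total` are hypothetical.

```python
from math import sqrt

# Hypothetical base-year (0) and current-year (1) prices and quantities.
p0, q0 = [10, 20], [5, 2]
p1, q1 = [12, 25], [4, 3]

def total(prices, quantities):
    """Sum of price × quantity over all commodities."""
    return sum(p * q for p, q in zip(prices, quantities))

laspeyres = total(p1, q0) / total(p0, q0) * 100   # base-year quantities as weights
paasche = total(p1, q1) / total(p0, q1) * 100     # current-year quantities as weights
fisher = sqrt(laspeyres * paasche)                # geometric mean of the two

# Time Reversal Test: swapping the two years should give the reciprocal index.
fisher_reversed = sqrt((total(p0, q1) / total(p1, q1)) * (total(p0, q0) / total(p1, q0)))
time_reversal_product = (fisher / 100) * fisher_reversed
```

For these figures `time_reversal_product` works out to 1, which is what satisfying the Time Reversal Test means.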

Consumer Price Index or Cost of Living Index Number


The consumer price index is the index number which measures the average change in prices
paid by a specific class of consumers for goods and services consumed by them in the current
year in comparison with the base year.

Construction of Consumer Price Index

 Selection of the consumer class


 Information about the family budget
 Choice of base year
 Information about prices
 Weightage – There are two ways of according weights

 Quantity weight

 Expenditure weight

The following formula is used to find consumer’s price index
Consumer Price Index (CPI) = $\frac{\Sigma WR}{\Sigma W}$
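
As a rough worked example of the CPI formula, the sketch below uses expenditure weights. The family budget items, price relatives R and weights W are hypothetical.

```python
# Hypothetical family budget: item -> (price relative R, expenditure weight W).
budget = {
    "food": (120, 40),
    "clothing": (110, 20),
    "housing": (150, 25),
    "fuel": (100, 15),
}

sum_wr = sum(r * w for r, w in budget.values())
sum_w = sum(w for _, w in budget.values())
cpi = sum_wr / sum_w   # CPI = ΣWR / ΣW
```

With these figures ΣWR = 12250 and ΣW = 100, so the index is 122.5, meaning this class of consumers pays 22.5 per cent more than in the base year for the same budget.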
Wholesale Price Index (WPI)
The Wholesale Price Index (WPI) measures the relative changes in the prices of commodities
traded in the wholesale markets. In India, the wholesale price index numbers are constructed on
weekly basis.

Industrial Production Index


The index number of industrial production measures changes in the level of industrial production
comprising many industries. It includes the production of the public and the private sector. It is a
weighted average of quantity relatives. The formula for the index is
$P_{01} = \frac{\Sigma q_1 \times W}{\Sigma W} \times 100$
Construction of Index Number of Industrial Production

 Classification of industries
 Statistics or data related to industrial production
 Weightage

Agricultural Production Index


Index number of agricultural production is weighted average of quantity relatives.

Sensex
Sensex is the index showing changes in the Indian stock market. It is a short form of the Bombay
Stock Exchange Sensitive Index. It is constructed with 1978-79 as the reference year or the base
year. It consists of 30 stocks of leading companies in the country.

Purpose of Constructing Index Number


 Purpose of constructing index number of prices is to know the relative or
percentage change in the price level over time. A rising general price level over time is a
pointer towards inflation, while a falling general price level over time is a pointer
towards deflation.
 Purpose of constructing index number of quantity is to know relative change or
percentage change in the quantum or volume of output of different goods and
services. A rising index of quantity suggests a rising level of economic activity and
vice-versa.
