Presentation on Introduction to Statistics: Course No: URP 5151 Course Title: Statistics for Planners
PRESENTATION
ON
INTRODUCTION TO STATISTICS
The word 'statistics' is derived from the Latin word 'status' or the Italian word 'statista', both of
which mean 'political state' or government.
1. First, Shakespeare used the word 'statist' in his drama Hamlet in 1602. In the past, rulers were
interested in tracking their people, money and events (wars, flooding, etc.). But statistical analysis of
data was first described by John Graunt in 1662 in his book 'Natural and Political
Observations'.
2. Second, the development of modern statistics rested on the foundations of probability, with its
origins in games of chance, as laid down by Pascal and later Bernoulli. The big conceptual steps in
the application of probability to quantitative inference were taken by Bayes in 1764 and later by
Laplace.
3. Third, the development of statistics in this period involved statistical graphics. The first major
figure is William Playfair, credited with inventing the line and bar charts for economic data and
the pie chart. The period from 1850 to 1900 is known as the “golden age of statistical graphics”.
A Brief History of Statistics
4. Fourth, the Royal Statistical Society began in 1834 as the London Statistical Society (LSS), and the
American Statistical Association (ASA) was formed in 1839 for improving the US census.
5. Fifth, another wave of activity into the 1920s was initiated by the concerns of William Gosset.
Finally, particular attention should be drawn to John Tukey's introduction of 'exploratory data
analysis' in the 1970s.
Statistics are aggregates of facts:- A single figure is not statistics. For
example, the national income of a country for a single year is not statistics, but the
same figure for two or more years is.
Statistics are affected by a number of factors:- For example, sale of a product
depends on a number of factors such as its price, quality, competition, the
income of the consumers, and so on.
Statistics must be reasonably accurate:- Wrong figures, if analyzed, will lead to
erroneous conclusions. Hence, conclusions must be based on
accurate figures.
Statistics must be collected in a systematic manner for a pre-determined purpose:- If data are
collected in a haphazard manner, they will not be reliable and will lead to misleading
conclusions.
Statistics should be placed in relation to each other:- If one collects data
unrelated to each other, then such data will be confusing and will not lead to any
logical conclusions. Data should be comparable over time and over space.
Sources of Statistical Data
Primary Data
I. Observation
II. Interviews
III. Questionnaire
IV. Experiments
V. Investigations
Primary data is always collected either by an individual or by an organization. This is
data that has never been gathered before.
Secondary Data
I. Published Data
II. Personnel records
III. Electronic Data
IV. Government Organizations
V. Public Sector Records
VI. Internet
VII. Research journals
Secondary data is data that has already been collected through primary sources and
made readily available for researchers to use for their own research.
Population & Sample
Population –
It is the complete collection of individuals, objects or events
whose properties are to be analyzed.
Sample –
It is a subset of a population.
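The relationship between a population and a sample can be sketched in a few lines of Python. This is only an illustration: the household-size population, the sample size of 50 and the helper name `draw_sample` are all assumptions made for the example, not part of the course material.

```python
import random

# Hypothetical population: household sizes for 1,000 households in a planning area.
rng = random.Random(42)  # fixed seed so the sketch is repeatable
population = [rng.randint(1, 8) for _ in range(1000)]


def draw_sample(pop, size, seed=0):
    """Return a simple random sample (without replacement) from the population."""
    return random.Random(seed).sample(pop, size)


sample = draw_sample(population, 50)

# The sample mean is an estimate of the population mean, not an exact copy of it.
population_mean = sum(population) / len(population)
sample_mean = sum(sample) / len(sample)
print(f"population mean = {population_mean:.2f}, sample mean = {sample_mean:.2f}")
```

The sample is a subset of the population, so any property measured on it (here the mean) only approximates the population value.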
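Classification of Statistics
[Diagram: statistics is classified into descriptive statistics and inferential statistics; the inferential methods shown include linear regression, logistic regression, ANOVA and the chi-square statistic.]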
Descriptive Statistics
1. Descriptive Statistics :
Descriptive statistics describes a population or sample through numerical calculations, graphs
or tables. It is used simply to summarize the data at hand. There are four categories, as
follows.
A. Frequency Distribution
Frequency distribution in statistics is a representation that displays the number of
observations within a given group / interval.
The representation of a frequency distribution can be graphical or tabular so that it is
easier to understand.
Vehicles/Hour    60    12    26    20    15    25
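A tabular frequency distribution can be built by counting observations per interval. The sketch below assumes that the six vehicle-per-hour figures on the slide are raw observations and groups them into intervals of width 10; the interval width and the function name `frequency_distribution` are choices made for this example.

```python
from collections import Counter


def frequency_distribution(values, width):
    """Count how many observations fall in each interval [start, start + width)."""
    counts = Counter((v // width) * width for v in values)
    return {(start, start + width): counts[start] for start in sorted(counts)}


# Vehicle counts per hour from the slide, treated as six raw observations (an assumption).
vehicles_per_hour = [60, 12, 26, 20, 15, 25]
table = frequency_distribution(vehicles_per_hour, 10)

# Print a small text histogram: one '#' per observation in the interval.
for (low, high), n in table.items():
    print(f"{low:>3}-{high - 1:<3} {'#' * n} ({n})")
```

Intervals that contain no observations are simply omitted from the table in this sketch.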
B. Measures of Position
Quartiles
A quartile is a statistical term that describes a division of observations into four defined intervals
based on the values of the data and how they compare to the entire set of observations.
Quartiles can also be used to detect outliers in a data set.
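As a rough sketch of how quartiles can flag outliers, the example below uses the common 1.5 × IQR rule, which is an assumption here since the slide does not name a specific rule. It reuses the six vehicle-per-hour figures from the frequency distribution slide as raw observations, and the function name is hypothetical.

```python
import statistics


def quartiles_and_outliers(values):
    """Return (Q1, Q2, Q3) and the values flagged by the 1.5 * IQR rule."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1  # interquartile range
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [v for v in values if v < low or v > high]
    return (q1, q2, q3), outliers


vehicles_per_hour = [60, 12, 26, 20, 15, 25]
quartiles, outliers = quartiles_and_outliers(vehicles_per_hour)
print("quartiles:", quartiles)
print("outliers:", outliers)
```

Different quartile conventions (for example `method="exclusive"`) give slightly different cut points, so the flagged values can change for small data sets.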
C. Measures of Central Tendency
I. Mean :
It is a measure of the average of all values in a sample set.
For example, the mean of 2, 4 and 9 is (2 + 4 + 9) / 3 = 5.
II. Median :
It is a measure of the central value of a sample set. The
data set is ordered from lowest to highest
value and the exact middle value is taken.
For example, the median of 7, 1 and 3 is 3, the middle value of the ordered set 1, 3, 7.
III. Mode :
It is the value that occurs most frequently in a sample set.
For example, the mode of 2, 4, 4, 5, 4, 9 is 4.
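The three measures of central tendency can be computed with Python's standard `statistics` module. The data set below is hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import statistics

# Hypothetical sample set.
values = [2, 4, 4, 5, 4, 9]

mean_value = statistics.mean(values)      # sum of values divided by their count
median_value = statistics.median(values)  # middle of the ordered values
mode_value = statistics.mode(values)      # most frequently occurring value

print(f"mean = {mean_value:.2f}, median = {median_value}, mode = {mode_value}")
```

With an even number of observations, as here, the median is the average of the two middle values of the ordered set.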
D. Measure of Variability
A measure of variability, also known as a measure of dispersion, is used to describe
variability in a sample or population. In statistics, there are three common measures of
variability, as shown below:
I. Range :
It measures how far apart the values in a sample set or data set are spread.
Range = Maximum value - Minimum value
II. Variance :
It describes how much a random variable differs from its
expected value, and it is computed as the average of the squared deviations from the mean.
$$ S^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2 $$
In this formula, n = total data points, \bar{x} = mean of data points and
x_i = individual data points.
III. Dispersion (Standard Deviation) :
It is a measure of the dispersion of a set of data from its mean.
$$ \sigma = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)^2} $$
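The three measures of variability can be computed directly from their definitions. This sketch divides by n, as in the formulas on the slide; the data set and the function name `variability` are hypothetical.

```python
from math import sqrt


def variability(values):
    """Return (range, variance, standard deviation), dividing by n as on the slide."""
    n = len(values)
    mean = sum(values) / n
    value_range = max(values) - min(values)
    variance = sum((x - mean) ** 2 for x in values) / n
    return value_range, variance, sqrt(variance)


# Hypothetical data set with mean 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]
value_range, variance, std_dev = variability(data)
print(f"range = {value_range}, variance = {variance}, standard deviation = {std_dev}")
```

When the data are a sample rather than the whole population, many textbooks divide by n - 1 instead of n; that variant is not what the slide's formulas show.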
Inferential Statistics
2. Inferential Statistics :
Inferential statistics makes inferences and predictions about a population based on a sample of data
taken from that population. It generalizes from the sample to the larger population and applies
probability to draw conclusions. It is used to analyze and interpret results and to draw conclusions
that go beyond what descriptive statistics can show. Inferential statistics is mainly associated with
hypothesis testing, in which the usual aim is to test whether the null hypothesis can be rejected.
1. Pearson Correlation
2. Linear Regression
3. Logistic Regression
4. T-test
5. ANOVA
6. Chi-Square Statistic
Hypothesis
A statistical hypothesis is an assumption about a population which may or may not be true.
Hypothesis testing is a set of formal procedures used by statisticians to either reject or fail to
reject statistical hypotheses. Statistical hypotheses are of two types:
Null hypothesis, H0 - represents a hypothesis that the observations are the result of pure chance.
Alternative hypothesis, Ha - represents a hypothesis that the observations are influenced by some
non-random cause.
Example
Suppose we wanted to check whether a coin was fair and balanced. A null hypothesis might say
that half of the flips will be heads and half will be tails, whereas the alternative hypothesis might
say that the numbers of heads and tails may be very different.
For example, suppose we flipped the coin 50 times and obtained 40 heads and 10 tails. Given this
result, we would reject the null hypothesis and conclude, based on the evidence, that the coin was
probably not fair and balanced.
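The coin example can be checked with an exact binomial calculation. The sketch below computes a two-sided p-value, that is, the probability under the null hypothesis of a fair coin of seeing a result at least as far from 25 heads as the one observed. The function name and the choice of a two-sided test are assumptions made for this illustration.

```python
from math import comb


def two_sided_binomial_p(heads, flips):
    """Exact two-sided p-value for a fair coin.

    Sums the probabilities of all head counts that are at least as far
    from the expected count (flips / 2) as the observed count.
    """
    expected = flips / 2
    distance = abs(heads - expected)
    extreme = sum(
        comb(flips, k) for k in range(flips + 1) if abs(k - expected) >= distance
    )
    return extreme / 2 ** flips


p_value = two_sided_binomial_p(40, 50)
print(f"p-value for 40 heads in 50 flips: {p_value:.2e}")
```

A p-value this small means that 40 heads in 50 flips would be very unusual for a fair coin, which is why the null hypothesis is rejected in the example.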
1. Pearson Correlation
The Pearson correlation coefficient is used to identify the strength of a linear relationship between two variables.
In simple words, Pearson's correlation coefficient measures how closely a change in one variable is accompanied by
a change in the other variable. It does not by itself show that one variable causes the other.
Number of Colleges (X)    Traffic Congestion Score (Y)
3                         74
1                         68
1                         66
3                         72
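The Pearson correlation coefficient for the four rows in the table can be computed from its definition: the sum of products of deviations divided by the square root of the product of the summed squared deviations. The function name `pearson_r` is chosen for this sketch.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of products of deviations from the two means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


colleges = [3, 1, 1, 3]        # X: number of colleges
congestion = [74, 68, 66, 72]  # Y: traffic congestion score
r = pearson_r(colleges, congestion)
print(f"r = {r:.4f}")
```

A value close to +1 indicates a strong positive linear relationship in these four observations; with so few rows it should be read as an illustration only.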
4. T-test
The t-value is based on the difference between the means of the groups being compared.
The larger the t-ratio (in absolute value), the more likely we will reject the null hypothesis.
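As a rough sketch, a t-ratio for two independent groups can be computed as the difference between the group means divided by the standard error of that difference. The version below uses the Welch form, which does not assume equal variances; that choice, the commute-time data and the function name are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """t-ratio: difference between means divided by its standard error (Welch form)."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


# Hypothetical commute times (minutes) on two routes.
route_a = [10, 12, 14]
route_b = [16, 18, 20]
t_value = welch_t(route_a, route_b)
print(f"t = {t_value:.4f}")
```

The sign of the t-ratio only shows which group has the larger mean; it is the absolute value that is compared against a critical value.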
5. Analysis of Variance (ANOVA)
• ANOVA, which stands for Analysis of Variance, is a statistical test used to analyze the difference between
the means of more than two groups.
• A one-way ANOVA uses one independent variable, while a two-way ANOVA uses two independent
variables.
One-way ANOVA example in Agriculture Planning
As a crop researcher, you want to test the effect of three different fertilizer
mixtures on crop yield. You can use a one-way ANOVA to find out if there is a
difference in crop yields between the three groups.
Fertilizer    Yield
Type-1        177.2287
Type-1        177.55
Type-2        176.4793
Type-2        176.0443
Type-3        177.1042
Type-3        178.0796
The larger the F value, the more likely it is that the variation
associated with the independent variable is real and not due
to chance.
When the p-value of the independent variable, fertilizer, is
significant (p < 0.05), it is likely that fertilizer type has a
real effect on average crop yield.
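The F value of a one-way ANOVA is the ratio of the variation between group means to the variation within groups. The sketch below computes it from first principles for the six yield values listed above, which are assumed to be an excerpt of a larger data set; the function name is hypothetical.

```python
def one_way_anova_f(groups):
    """F statistic: mean square between groups divided by mean square within groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        (v - m) ** 2 for g, m in zip(groups, group_means) for v in g
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


yields = [
    [177.2287, 177.55],    # Type-1
    [176.4793, 176.0443],  # Type-2
    [177.1042, 178.0796],  # Type-3
]
f_value = one_way_anova_f(yields)
print(f"F = {f_value:.3f}")
```

With only two observations per fertilizer type, this F value has very few degrees of freedom, so it shows the calculation rather than a reliable conclusion about the fertilizers.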
6. Chi-Square Statistic
A chi-square (χ2) statistic is a test that
measures how a model compares to actual
observed data.
The chi-square test is useful for learning
whether a relationship exists between two
qualitative (nominal) variables, such as sex
(male, female) and dropping out of school
(dropout, stay-in).
$$ \chi^2 = \sum \frac{(O - E)^2}{E} $$
Where,
O = Observed frequency
E = Expected frequency
\sum = Summation
\chi^2 = Chi-Square value
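The chi-square formula can be applied to a contingency table by deriving each expected frequency from the row and column totals. The counts below are hypothetical numbers for the sex versus school-dropout example, and the function name is chosen for this sketch.

```python
def chi_square(observed):
    """Chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected frequency if the two variables were independent.
            e = row_totals[i] * col_totals[j] / total
            chi2 += (o - e) ** 2 / e
    return chi2


# Hypothetical counts: rows = sex (male, female), columns = (dropout, stay-in).
observed = [
    [10, 40],
    [20, 30],
]
chi2_value = chi_square(observed)
print(f"chi-square = {chi2_value:.4f}")
```

The larger the chi-square value, the further the observed counts are from what independence between the two variables would predict.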
Objectives of Statistics
1. Statistics and planning: Statistics is indispensable to planning in the modern age, which is
termed “the age of planning”. Almost all over the world, governments are resorting to
planning for economic development.
2. Statistics and economics: Statistical data and techniques of statistical analysis have proved
immensely useful in solving economic problems, such as wages, prices, time series analysis and
demand analysis.
3. Statistics and business: Statistics is an indispensable tool of production control. Business
executives are relying more and more on statistical techniques for studying the needs and
desires of their valued customers.
4. Statistics and industry: In industry, statistics is widely used in quality control. In production
engineering, statistical tools such as inspection plans and control charts are used to find out
whether the product conforms to the specifications or not.
5. Statistics and mathematics: Statistics and mathematics are intimately related; recent
advancements in statistical techniques are the outcome of wide applications of mathematics.
6. Statistics and modern science: In medical science, the statistical tools for collection,
presentation and analysis of observed facts relating to the causes and incidence of diseases, and
the results of applying various drugs and medicines, are of great importance.
Scope of statistics
7. Statistics, psychology and education: In education and psychology, statistics has found wide
application, such as determining the reliability and validity of a test, factor analysis,
etc.
8. Statistics and war: In war, the theory of decision functions can be of great assistance to military
personnel in planning “maximum destruction with minimum effort.”
9. In banking: Statistics plays an important role in banking. Banks make use of statistics for a number of
purposes. Banks work on the principle that all the people who deposit their money with the
banks do not withdraw it at the same time. Bankers use statistical approaches based on
probability to estimate the number of depositors and their claims for a certain day.
10. In State Management (Administration): Statistics is essential for a country. Different policies of the
government are based on statistics. Statistical data are now widely used in taking all administrative
decisions. It helps in estimating the expected expenditure and revenue from different sources. So
statistics are the eyes of the administration of the state.
11. In Accounting and Auditing: Accounting is impossible without exactness. The correction of the values
of current assets is made on the basis of the purchasing power of money or its current value. In
auditing, sampling techniques are commonly used. An auditor determines the sample size of the books
to be audited on the basis of error.
Importance of Statistics in Planning
Experiment: Design experiments to answer research questions
Control: Control sources of variation, detect outliers
Data: Visualize the data; analyze with statistical models
Interpret: Practical and statistical significance of results
Knowledge / Understanding: Make scientifically sound decisions and communicate them
Regional/Municipality level:
Identifying areas of specific intervention for fund
Planning intervention strategies
Bulk infrastructure
How many houses are electrified
Tracing illegal connections of electricity
Water connection (Reservoirs) vs backlog
How many houses have sanitation
New roads development
Design of streets- Traffic signs in District
Pedestrian walks
Gravel roads/ existing and new
Settlement pattern –
Dwelling types e.g. informal, old age etc.
Importance of Statistics in Planning
“An essential component of any development planning is statistical data. Without data, a
country’s efforts to plan for future growth and welfare of its people cannot be grounded in
reality and therefore may be severely flawed.” [Hon. Prof. Peter Anyang’ Nyong’o, Minister for
Planning and National Development, Kenya]
“Sound data represent a key weapon in the battle against poverty.” [Tadao Chino, former
President, Asian Development Bank]
“Information gives you the power to make the right decisions.” [Dr Roberto Tapia Conyer, Vice-
Minister, Ministry of Health, Mexico]
“Information is at the root of everything we do.” [Prof. Francis Omaswa, former Director
General, Ministry of Health, Uganda]
Importance of Statistics in Policy Formulation
1. Importance of Quality Data and Statistics: Evidence-based decision making, supported by quality
data and statistics, has been recognized as a necessity for formulating policies that are technically sound,
politically relevant and results-oriented.
2. It has been recognized as important in:
Policy formulation
Resource Allocation
Project and Programme Design
Implementation
Management
Monitoring and Evaluation
3. Quality data and statistics support policymakers by:
Identifying issues that require policy intervention
Providing the evidence to support the development of or adjustment of policy
Facilitating monitoring and evaluation of the impact of policy
4. Stages in Evidence-Based Policymaking:
Identification
Assessment
Implementation
Monitoring and evaluation
Importance of Statistics in Economic Planning
Various concepts of economic theory, such as functional relationships among variables, are usually
stated in terms of algebra, symbols, calculus and so on. Social and economic phenomena
can be better studied and analyzed in terms of ‘appropriate’
numbers. Such numbers, that is, statistics, present facts on the basis of a mass of figures and help
in the comparison of facts.
Statistics establishes relationships between variables such as price and quantity demanded or quantity
supplied, global warming and agricultural output, money supply and price level, and so on. Thus,
economics as a discipline is linked with statistics on many occasions. Today, we see that
economic growth in India is hampered by faulty policies, and better economic policymaking largely
depends on the availability of improved data or statistics. Businessmen also find statistics an
indispensable tool in their regular activities.
The study of modern economics requires a mathematical and statistical foundation. With the
development of mathematics and statistics over time, econometric methods have
been developed.
Strengths of Statistics
In spite of the usefulness of statistics and the confidence of the people in its efficacy, some
people have misgivings about it and they distrust it. Those who distrust statistics make the
following observations about it:
Thank You