Blackboard
Distribution
Population
• Population: All possible units or items specified by certain characteristics under
the targeted study area constitute a population.
• A population can be classified into four major categories:
i. Finite population,
ii. Infinite population,
iii. Real population,
iv. Hypothetical population.
• Information from a population can be gathered in two ways:
a) Complete enumeration or census method
b) Sample survey method (sampling is a part of sample survey)
Definitions
• The reasons for using a census are:
i. When every item or unit of the population must be considered in the study,
ii. When extreme accuracy of the results of the study is needed,
iii. When a crucial decision has to be made on the basis of the results obtained from the study,
iv. When the population is small and finite, so that it is easy to enumerate all units of the population.
• Census: If the relevant data/information is collected from each and every unit of the targeted population under enquiry, the method is called a census.
• Sample: A sample is a representative part of a population.
• Sampling: Sampling is the whole process involved in collecting samples from a target population for a particular study.
• Statistic: Any function of a random sample is known as a statistic. A numerical quantity calculated from a sample is also called a statistic.
• Parameter: The unknown constants, or any function of them, that appear in the mathematical specification of the population are known as parameters. Any numerical quantity calculated from the population data is also called a parameter.
Sample
• Classification of Sample:
– Large Sample: If the sample size $n$ is greater than or equal to 30 ($n \ge 30$), it is known as a large sample. For large samples the sampling distribution of a statistic is approximately normal ($Z$ test).
– Small Sample: If the sample size is less than 30 ($n < 30$), it is known as a small sample. For small samples the sampling distributions used are the $t$, $F$ and $\chi^2$ distributions.
• The basic objective of sampling is to draw inferences about the population.
• The advantages of sampling are:
i) Reduced cost
ii) Greater speed
iii) Greater scope
iv) Greater accuracy
Sampling Error
• Sampling Error: The error that arises from drawing inferences about the population on the basis of a sample is termed sampling error.
• Sampling errors are of two types:
– Biased errors, which arise from
i. a faulty process of selection,
ii. faulty work during the collection of information, and
iii. faulty methods of analysis.
– Unbiased errors, which arise due to chance differences between the members of the population included in the sample and those not included.
Non-sampling Errors
• Some of the major sources of non-sampling error are as follows:
1. Data specification that is inadequate or inconsistent with the objectives of the study, whatever the study method is, census or survey.
2. Omission or duplication of units due to imprecise definition or boundaries of area units, incomplete or wrong identification of units, or faulty methods of enumeration.
3. A defective frame or faulty selection of sampling units. An inaccurate or inappropriate questionnaire, method of interview, definition or instruction may also cause non-sampling error.
4. Lack of trained and experienced investigators.
5. Lack of adequate inspection and supervision of primary staff.
6. Errors due to non-response, that is, incomplete coverage of units.
7. Errors in data processing operations such as coding, punching, verification, tabulation, etc.
8. Errors committed during presentation or printing of tabulated results.
9. Errors in scrutiny of primary or basic data.
• Definition: Any error that may arise at any stage of an investigation, either in a census or in sampling, is termed non-sampling error. This type of error arises due to a faulty questionnaire, non-response, a faulty tabulation method, etc.
Census vs. Sample survey
Census | Sample survey
It is a study which considers all units of the population. | It is a study which considers a part of the units of the population.
It is useful when the population size is small and finite. | It is useful when the population size is large and/or infinite.
It is more expensive and more time consuming. | It is less expensive and less time consuming.
If the study is performed with trained personnel, the results obtained from a census may be more accurate and adequate. | Even if the study is performed with trained personnel, the results obtained from a sample survey may not be accurate and adequate.
There is a possibility of only non-sampling errors, if any. | There is a possibility of both sampling and non-sampling errors.
Sampling Methods
• We can separate the methods into two categories:
• Random Sampling Methods:
– Simple random sampling
– Stratified sampling
– Systematic sampling
– Multi-stage sampling
• Non-random sampling:
– Judgement sampling
– Quota sampling
– Convenience sampling
– Snowball sampling
Simple random sampling
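• Simple random sampling is the basic kind of probability sampling. The number $k$ of possible samples of size $n$ from a population of size $N$ depends on which of two cases applies:
1. Sampling with replacement: $k = N^n$
2. Sampling without replacement: $k = {}^{N}C_{n} = \binom{N}{n}$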
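The two counts can be checked by listing the samples directly. The sketch below assumes a small made-up population of four values and a sample size of 2. It treats with-replacement samples as ordered draws and without-replacement samples as unordered selections, which is the convention that gives $N^n$ and ${}^{N}C_{n}$.

```python
from itertools import combinations, product
from math import comb

# Hypothetical population chosen only for illustration (N = 4).
population = [2, 4, 6, 8]
N, n = len(population), 2

# Sampling with replacement: every ordered draw of n units, N**n in total.
with_replacement = list(product(population, repeat=n))

# Sampling without replacement: every unordered subset of n units, C(N, n) in total.
without_replacement = list(combinations(population, n))

print(len(with_replacement), N ** n)          # 16 16
print(len(without_replacement), comb(N, n))   # 6 6
```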
Sampling Distribution
• Sampling distribution
– The probability distribution of a statistic derived from all possible random samples of a given population is called sampling distribution.
• Standard Error
– The positive square root of the variance of a statistic (sample mean, sample median, sample proportion, etc.) is known as the standard error of the
statistic.
• Related Formulae
1. The standard error of the sample mean based on a sample of size $n$ is
$$SE(\bar{X}) = \frac{\sigma}{\sqrt{n}}$$
• where $\sigma$ stands for the standard deviation of the population. When $\sigma$ is unknown, we use $s$, the standard deviation of the sample, in place of $\sigma$.
• The sample mean is distributed as
$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$$
and the variate $Z$ approaches the standard normal distribution with mean 0 and variance unity, i.e.,
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$$
That means the variate $Z$ follows the standard normal distribution.
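As a small numerical illustration of the standard error and the $Z$ variate, the sketch below uses made-up values ($\mu = 50$, $\sigma = 6$, $n = 36$, observed mean 52) and a made-up five-value sample for the case where $\sigma$ is unknown. The helper names `standard_error` and `z_score` are illustrative, not from the slides.

```python
from math import sqrt
from statistics import stdev

def standard_error(sd, n):
    """Standard error of the sample mean: sd / sqrt(n)."""
    return sd / sqrt(n)

def z_score(sample_mean, mu, sigma, n):
    """Standardised sample mean: (x_bar - mu) / (sigma / sqrt(n))."""
    return (sample_mean - mu) / standard_error(sigma, n)

# Known population standard deviation (hypothetical numbers).
print(standard_error(6, 36))       # 1.0
print(z_score(52, 50, 6, 36))      # 2.0

# Unknown sigma: use the sample standard deviation s in its place.
sample = [48, 52, 50, 54, 46]      # hypothetical sample
s = stdev(sample)                  # sample standard deviation (divisor n - 1)
se_estimated = standard_error(s, len(sample))
print(round(se_estimated, 4))      # 1.4142
```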
Steps to calculate sampling distribution of sample mean
1. Find all possible samples of size n from a population of size N.
2. Calculate $\bar{X}$ for each sample, where $\bar{X} = \frac{\sum X_i}{n}$.
3. Construct the frequency table of all the different values of $\bar{X}$, together with the frequency $f$ of each value (the total of the frequencies is $k$).
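• To study the sampling distribution of the sample mean and its relationship with the population parameters ($\mu$, $\sigma^2$), we need to find its mean $E(\bar{X}) = \mu_{\bar{X}}$ and its variance $V(\bar{X}) = \sigma^2_{\bar{X}}$, as follows:
$$\mu_{\bar{X}} = \frac{\sum \bar{X} f}{k} \quad \text{and} \quad \sigma^2_{\bar{X}} = \frac{\sum \bar{X}^2 f - k\,\mu_{\bar{X}}^2}{k}$$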
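The steps above can be sketched for a tiny case. The code assumes a made-up population of three values, samples of size 2 drawn with replacement, and exact fractions so the results are not blurred by rounding.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Hypothetical population (N = 3) and sample size, chosen for illustration.
population = [2, 4, 6]
n = 2

# Step 1: all possible samples of size n (with replacement), k = N**n.
samples = list(product(population, repeat=n))
k = len(samples)

# Step 2: the mean of each sample.
means = [Fraction(sum(sample), n) for sample in samples]

# Step 3: frequency table of the distinct sample means.
freq = Counter(means)
for x_bar in sorted(freq):
    print(x_bar, freq[x_bar])      # 2 1, 3 2, 4 3, 5 2, 6 1

# Mean and variance of the sampling distribution from the frequency table.
mu_xbar = sum(x_bar * f for x_bar, f in freq.items()) / k
var_xbar = (sum(x_bar ** 2 * f for x_bar, f in freq.items()) - k * mu_xbar ** 2) / k
print(mu_xbar, var_xbar)           # 4 4/3
```

Here the population has mean 4 and variance 8/3, so the computed values agree with $\mu_{\bar{X}} = \mu$ and $\sigma^2_{\bar{X}} = \sigma^2 / n$.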
4. Its relationship with the population parameters $\mu$, $\sigma^2$ is:
a. $\mu_{\bar{X}} = \mu$
b. $\sigma^2_{\bar{X}} = \frac{\sigma^2}{n}$ (sampling with replacement)
$\sigma^2_{\bar{X}} = \frac{\sigma^2}{n} \cdot \frac{N-n}{N-1}$ (sampling without replacement)
• where the fraction $\frac{N-n}{N-1}$ is called the correction factor. We may ignore this correction factor in practice if $n \le 0.05N$, i.e. $\frac{n}{N} \le 0.05$, that is, if the sample size is less than or equal to 5% of the population size $N$, because the factor is then close to 1.
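These relationships can be verified by brute-force enumeration. The sketch below assumes a made-up population of five values and $n = 2$. The names `mean_and_variance` and `can_ignore_correction` are illustrative helpers, not part of the slides.

```python
from fractions import Fraction
from itertools import combinations, product

def mean_and_variance(values):
    """Mean and variance (divisor = number of values) of a list of numbers."""
    count = len(values)
    m = sum(values) / count
    v = sum((x - m) ** 2 for x in values) / count
    return m, v

def can_ignore_correction(n, N):
    """Rule of thumb from the text: ignore the correction factor if n/N <= 0.05."""
    return Fraction(n, N) <= Fraction(5, 100)

# Hypothetical population (N = 5); exact fractions avoid rounding error.
population = [Fraction(x) for x in (1, 3, 5, 7, 9)]
N, n = len(population), 2
mu, sigma2 = mean_and_variance(population)          # 5 and 8

means_with = [sum(s) / n for s in product(population, repeat=n)]
means_without = [sum(s) / n for s in combinations(population, n)]

mu_with, var_with = mean_and_variance(means_with)
mu_without, var_without = mean_and_variance(means_without)
correction = Fraction(N - n, N - 1)                  # (N - n) / (N - 1) = 3/4

print(mu_with, var_with, sigma2 / n)                        # 5 4 4
print(mu_without, var_without, sigma2 / n * correction)     # 5 3 3
print(can_ignore_correction(2, 5), can_ignore_correction(40, 1000))  # False True
```

With $N = 1000$ and $n = 40$ the factor is $960/999 \approx 0.96$, which shows why dropping it changes the variance only slightly when $n/N \le 0.05$.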
Steps cont.
5. Determine the form or shape of the sampling distribution. It depends on the distribution of the variable in the population (normal or not normal), so there are two cases:
a. The variable $X$ in the population is normally distributed, i.e., $X \sim N(\mu, \sigma^2)$. Then (with or without replacement)
$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right) \;\rightarrow\; \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1).$$
b. The variable $X$ in the population has any distribution other than the normal distribution, with mean $\mu$ and variance $\sigma^2$. Then, by the Central Limit Theorem, for a large sample the following hold approximately:
i. With replacement:
$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right) \;\rightarrow\; \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1).$$
ii. Without replacement:
$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n} \cdot \frac{N-n}{N-1}\right).$$