
Unit-2_Sampling and Estimations

The document provides an overview of sampling methods, including probability and non-probability sampling techniques, as well as classifications of samples based on size. It explains key concepts such as parameters, statistics, sample mean, sample variance, and the Central Limit Theorem, along with formulas for calculating standard errors. Additionally, it includes examples and solutions related to sampling distributions and probabilities.

Uploaded by

sandeep bhukya

Sampling

Population: The totality of observations with which we are concerned, whether
this number be finite or infinite, constitutes what we call the population. The
size of the population is denoted by N.

Sample: A portion of the population which is examined with a view to
determining the population characteristics is called a sample. The size of the
sample is denoted by n.

Different methods of sampling


Some important methods of sampling are discussed below.

I. Probability Sampling Methods

1. Random sampling or Probability sampling


It is the process of drawing a sample from a population in such a way that
each member of the population has an equal chance of being included in the
sample. The sample obtained by the process of random sampling is called a
random sample.
If N is the size of a population and n is the sample size, then
(i) The number of samples with replacement = $N^n$
(ii) The number of samples without replacement = $^{N}C_{n}$
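The two counts can be checked by brute force. The sketch below is only an illustration: it assumes that samples drawn with replacement are counted as ordered tuples and samples drawn without replacement as unordered selections, and it borrows the five-number population used in the worked examples of these notes.

```python
from itertools import combinations, product
from math import comb

population = [2, 3, 6, 8, 11]   # population from the worked examples, N = 5
N, n = len(population), 2

# Ordered samples drawn with replacement: N**n of them.
with_replacement = list(product(population, repeat=n))

# Unordered samples drawn without replacement: C(N, n) of them.
without_replacement = list(combinations(population, n))

print(len(with_replacement), N ** n)
print(len(without_replacement), comb(N, n))
```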

2. Stratified sampling or Stratified Random sampling


This method is useful when the population is heterogeneous. In this type
of sampling, the population is first subdivided into several parts or small groups
called strata according to some relevant characteristic, so that each stratum is
more or less homogeneous. Each stratum is called a sub-population. Then a small
sample called a sub-sample is selected from each stratum at random. All the
sub-samples are combined together to form the stratified sample, which represents
the population properly. The process of obtaining and examining a stratified sample
with a view to estimating the characteristics of the population is known as
Stratified Sampling.
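The procedure can be sketched in a few lines of Python. Everything named here is hypothetical and chosen only for illustration: the helper `stratified_sample`, the student records, and the choice of an equal sub-sample size from every stratum (the notes do not say how sub-sample sizes are fixed).

```python
import random

def stratified_sample(units, stratum_of, per_stratum, seed=0):
    """Group units into strata, then draw per_stratum units at random from each."""
    rng = random.Random(seed)
    strata = {}
    for unit in units:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, min(per_stratum, len(members))))
    return sample

# Hypothetical population: (student id, year of study) pairs, 25 per year.
students = [(i, 1 + i % 4) for i in range(100)]
chosen = stratified_sample(students, stratum_of=lambda s: s[1], per_stratum=5)
print(len(chosen))   # 4 strata x 5 units each
```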

3. Systematic Sampling or Quasi – Random Sampling


As the name suggests, this means forming the sample in some systematic
manner by taking items at regular intervals. In this method, all the units of a
finite population are arranged in some order. Then from the first k items, one unit
is selected at random. This unit and every kth unit after it in the serially listed
population together constitute a systematic sample. This type of sampling is
known as Systematic Sampling.
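A minimal sketch of this rule in Python follows. The function name `systematic_sample` and the list of 100 roll numbers are assumptions made for the example, not part of the notes.

```python
import random

def systematic_sample(units, k, seed=0):
    """Pick a random start among the first k units, then take every k-th unit."""
    rng = random.Random(seed)
    start = rng.randrange(k)      # index 0 .. k-1, i.e. one of the first k units
    return units[start::k]

roll_numbers = list(range(1, 101))   # hypothetical ordered list of 100 units
sample = systematic_sample(roll_numbers, k=10)
print(sample)
```

With 100 units and k = 10 the sample always has 10 units, spaced exactly 10 apart.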

II. Non – Probability Sampling Methods

1. Purposive Sampling or Judgment Sampling


When the choice of the individual items of a sample depends entirely on
the individual judgment of the investigator (or sampler), it is called Purposive
or Judgment Sampling. For example, if a sample of 20 students is to be selected
from a class of 100 to analyze the extra-curricular activities of the students, the
investigator would select 20 students who, in the investigator's judgment, would
represent the class.

2. Sequential Sampling
It consists of a sequence of samples drawn one after another from the
population, depending on the results of the previous samples. If the result of the
first sample leads to a decision which is not acceptable, the lot from which the
sample was drawn is rejected. If the result of the first sample is acceptable, the
lot is accepted and no new sample is drawn. But if the first sample leads to no
clear decision, a second sample is drawn and, as before, if required a third sample
is drawn to arrive at a final decision to accept or reject the lot. This process is
called Sequential Sampling.

Classification of Samples
Samples are classified into two types according to their size.

1. Large Sample: if the size of the sample n ≥ 30, the sample is said to be a
large sample.

2. Small Sample: if the size of the sample n < 30, the sample is said to be a
small sample.

Parameter and Statistics

A parameter is a statistical measure of the population. Ex: population mean
(µ), population variance (σ²).
A statistic is a statistical measure of the sample. Ex: sample mean ($\bar{x}$),
sample variance ($s^2$).

The Sample Mean:

If X1, X2, ……, Xn represents a random sample of size n, then the sample
mean is defined by the statistic
$$\bar{X} = \sum_{i=1}^{n} \frac{X_i}{n}$$
The Sample Variance:
If X1, X2, ……, Xn represents a random sample of size n, then the sample
variance is defined by the statistic
$$s^2 = \sum_{i=1}^{n} \frac{(X_i - \bar{X})^2}{n-1}$$
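As a quick check of the two definitions, the sketch below evaluates them directly and compares the result with Python's standard `statistics` module, whose `variance` function also divides by n − 1. The data values are borrowed from the worked examples in these notes and are treated here as a sample purely for illustration.

```python
from statistics import mean, variance

data = [2, 3, 6, 8, 11]          # illustrative sample of size n = 5
n = len(data)

# Direct evaluation of the two definitions.
x_bar = sum(data) / n
s2 = sum((x - x_bar) ** 2 for x in data) / (n - 1)

print(x_bar, s2)
print(mean(data), variance(data))   # library versions for comparison
```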
Central Limit Theorem
If $\bar{x}$ is the mean of a sample of size n drawn from a population with mean µ
and S.D. σ, then the standardized sample mean
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$
is a random variable whose distribution function approaches that of the standard
normal distribution N(z; 0, 1) as $n \to \infty$.
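The theorem can be illustrated, though not proved, by simulation. The sketch below assumes a uniform(0, 1) population, which is far from normal, and standardizes the means of many samples of size 30. The seed, sample size and number of trials are arbitrary choices for the illustration.

```python
import random
from math import sqrt
from statistics import mean, pstdev

rng = random.Random(42)          # fixed seed so the run is repeatable
mu, sigma = 0.5, sqrt(1 / 12)    # mean and S.D. of the uniform(0, 1) population
n, trials = 30, 2000

z_values = []
for _ in range(trials):
    x_bar = mean(rng.random() for _ in range(n))
    z_values.append((x_bar - mu) / (sigma / sqrt(n)))

# The standardized means should have mean near 0 and S.D. near 1.
print(round(mean(z_values), 2), round(pstdev(z_values), 2))
```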

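STANDARD ERROR (S.E.) OF A STATISTIC

1) S.E. ($\bar{x}$) = $\frac{\sigma}{\sqrt{n}}$
2) S.E. of sample proportion p = $\sqrt{\frac{PQ}{n}}$, where Q = 1 − P
3) S.E. ($\bar{x}_1 - \bar{x}_2$) = $\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$, where $\bar{x}_1$ and $\bar{x}_2$ are the means of two random
samples of sizes n1 and n2 drawn from two populations with S.D. σ1 and
σ2 respectively.
4) S.E. ($p_1 - p_2$) = $\sqrt{\frac{P_1 Q_1}{n_1} + \frac{P_2 Q_2}{n_2}}$, where p1 and p2 are the proportions in two random
samples of sizes n1 and n2.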
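The four standard errors can be sketched as small Python helpers. This is an illustration only: the function names are made up here, the difference-of-means formula is taken in its usual form with the variances σ1²/n1 and σ2²/n2 under the root, and the fourth formula is read as the S.E. of a difference of two proportions.

```python
from math import sqrt

def se_mean(sigma, n):
    """S.E. of the sample mean."""
    return sigma / sqrt(n)

def se_proportion(P, n):
    """S.E. of a sample proportion, with Q = 1 - P."""
    return sqrt(P * (1 - P) / n)

def se_diff_means(sigma1, n1, sigma2, n2):
    """S.E. of the difference of two independent sample means."""
    return sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)

def se_diff_proportions(P1, n1, P2, n2):
    """S.E. of the difference of two independent sample proportions."""
    return sqrt(P1 * (1 - P1) / n1 + P2 * (1 - P2) / n2)

print(se_mean(15, 36))   # sigma = 15, n = 36 gives 2.5
```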
Finite Population: Consider a finite population of size N with mean µ and S.D.
σ. Draw all possible samples of size n, without replacement, from this
population. Then

(i) The mean of the sampling distribution of means (for N > n) is $\mu_{\bar{X}} = \mu$
(ii) The variance is $\sigma_{\bar{X}}^2 = \frac{\sigma^2}{n}\left(\frac{N-n}{N-1}\right)$

Note: The factor $\left(\frac{N-n}{N-1}\right)$ is called the finite population correction factor.

1. What is the value of the correction factor if n = 5 and N = 200?

Sol. Given N = the size of the finite population = 200

n = the size of the sample = 5

∴ Correction factor = $\frac{N-n}{N-1} = \frac{200-5}{200-1} = \frac{195}{199} = 0.98$

2. What is the value of the correction factor if n = 10 and N = 1000?

Sol. Given N = the size of the finite population = 1000

n = the size of the sample = 10

∴ Correction factor = $\frac{N-n}{N-1} = \frac{1000-10}{1000-1} = \frac{990}{999} = 0.991$
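Both answers can be reproduced with a one-line function. The name `finite_population_correction` is an assumption made for this sketch.

```python
def finite_population_correction(N, n):
    """Return (N - n) / (N - 1) for population size N and sample size n."""
    return (N - n) / (N - 1)

print(round(finite_population_correction(200, 5), 2))     # 0.98
print(round(finite_population_correction(1000, 10), 3))   # 0.991
```

The factor moves toward 1 as N grows relative to n, which is why it is usually ignored for large populations.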

3. A population consists of five numbers 2, 3, 6, 8 and 11. Consider all possible
samples of size two which can be drawn with replacement from this population.
Find

(a) The mean of the population
(b) The S.D. of the population
(c) The mean of the sampling distribution of means and
(d) The S.D. of the sampling distribution of means (i.e., the standard error of
means)

Sol.
(a) Mean of the population is given by $\mu = \frac{2+3+6+8+11}{5} = \frac{30}{5} = 6$

(b) Variance of the population σ² is given by
$$\sigma^2 = \sum_{i=1}^{N} \frac{(x_i - \mu)^2}{N} = \frac{(2-6)^2 + (3-6)^2 + (6-6)^2 + (8-6)^2 + (11-6)^2}{5} = \frac{16 + 9 + 0 + 4 + 25}{5} = 10.8$$
∴ S.D. of the population σ = $\sqrt{10.8}$ = 3.29
(c) Sampling with replacement (infinite population):

The total no. of samples with replacement is

$N^n = 5^2 = 25$ samples of size 2

Here N = population size and n = sample size. Listing all possible samples of size
2 from the population 2, 3, 6, 8, 11 with replacement, we get 25 samples:
(2,2) (2,3) (2,6) (2,8) (2,11)
(3,2) (3,3) (3,6) (3,8) (3,11)
(6,2) (6,3) (6,6) (6,8) (6,11)
(8,2) (8,3) (8,6) (8,8) (8,11)
(11,2) (11,3) (11,6) (11,8) (11,11)

Now compute the arithmetic mean for each of these 25 samples. The set of 25
means $\bar{x}$ of these 25 samples gives rise to the distribution of means of the
samples, known as the sampling distribution of means.
The sample means are
2 2.5 4 5 6.5
2.5 3 4.5 5.5 7
4 4.5 6 7 8.5
5 5.5 7 8 9.5
6.5 7 8.5 9.5 11
And the mean of the sampling distribution of means is the mean of these 25 means.
$$\mu_{\bar{X}} = \frac{\text{Sum of all sample means}}{25} = \frac{150}{25} = 6$$
(d) The variance of the sampling distribution of means is obtained by
subtracting the mean 6 from each sample mean, squaring the result, adding all 25
numbers thus obtained, and dividing by 25:
$$\sigma_{\bar{X}}^2 = \frac{(2-6)^2 + \dots + (11-6)^2}{25} = \frac{135}{25} = 5.4$$
∴ $\sigma_{\bar{X}} = \sqrt{5.40} = 2.32$
Clearly $\sigma_{\bar{X}}^2 = \frac{\sigma^2}{n} = \frac{10.8}{2} = 5.4 \Rightarrow \sigma_{\bar{X}} = \sqrt{5.4} = 2.32$
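The whole example can be verified by enumerating the 25 samples in Python. This is a checking sketch, not part of the original solution. It uses `itertools.product` for ordered samples with replacement and `statistics.pvariance`, which divides by the number of values as the solution does.

```python
from itertools import product
from math import sqrt
from statistics import mean, pvariance

population = [2, 3, 6, 8, 11]
n = 2

# All 25 ordered samples of size 2 drawn with replacement, and their means.
sample_means = [mean(s) for s in product(population, repeat=n)]

mu_xbar = mean(sample_means)
var_xbar = pvariance(sample_means)

print(mu_xbar, var_xbar, round(sqrt(var_xbar), 2))
print(pvariance(population) / n)   # sigma^2 / n, for comparison
```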

4. Solve the above example without replacement to find $\mu_{\bar{X}}$ and $\sigma_{\bar{X}}$

Sol.
µ = 6 and σ = 3.29
Sampling without replacement (finite population):
The total no. of samples without replacement is $^{N}C_{n} = {}^{5}C_{2} = 10$ samples of size 2
(2,3) (2,6) (2,8) (2,11)
(3,6) (3,8) (3,11)
(6,8) (6,11)
(8,11)
The corresponding sample means are
2.5 4 5 6.5
4.5 5.5 7
7 8.5
9.5
The mean of the sampling distribution of means is
$$\mu_{\bar{X}} = \frac{\text{Sum of all sample means}}{10} = \frac{2.5 + 4 + \dots + 8.5 + 9.5}{10} = \frac{60}{10} = 6$$

The variance of the sampling distribution of means is
$$\sigma_{\bar{X}}^2 = \frac{(2.5-6)^2 + \dots + (9.5-6)^2}{10} = \frac{40.5}{10} = 4.05$$
∴ $\sigma_{\bar{X}} = \sqrt{4.05} = 2.01$
This agrees with the formula $\sigma_{\bar{X}}^2 = \frac{\sigma^2}{n}\left(\frac{N-n}{N-1}\right) = \frac{10.8}{2}\left(\frac{5-2}{5-1}\right) = 4.05$
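The without-replacement case can be checked the same way, enumerating the 10 unordered samples with `itertools.combinations` and comparing the observed variance with the finite population formula. This is a checking sketch only.

```python
from itertools import combinations
from statistics import mean, pvariance

population = [2, 3, 6, 8, 11]
N, n = len(population), 2

# All 10 unordered samples of size 2 drawn without replacement, and their means.
sample_means = [mean(s) for s in combinations(population, n)]

mu_xbar = mean(sample_means)
var_xbar = pvariance(sample_means)

# Variance predicted by (sigma^2 / n) * (N - n) / (N - 1).
var_formula = pvariance(population) / n * (N - n) / (N - 1)

print(mu_xbar, var_xbar, var_formula)
```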

5(H.W). A population consists of 6 numbers 5, 10, 14, 18, 13, and 24. Consider
all possible samples of size two which can be drawn without replacement from
this population. Find
(a) The mean of the population
(b) The S.D. of the population
(c) The mean of the sampling distribution of means and
(d) The S.D. of the sampling distribution of means (i.e., the standard error of
means)

6. The mean height of students in a college is 155 cm and the S.D. is 15 cm. What
is the probability that the mean height of 36 students is less than 157 cm?

Sol.
µ = mean of the population
= mean height of students of a college = 155 cm
σ = S.D. of population = 15 cm
n = sample size = 36
$\bar{x}$ = mean of the sample = 157 cm
Now $z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{157 - 155}{15/\sqrt{36}} = 0.8$
∴ $P(\bar{x} < 157) = P(z < 0.8) = 0.5 + A(0.8) = 0.5 + 0.2881 = 0.7881$,
where A(z) is the area under the standard normal curve between 0 and z.
Thus the probability that the mean height of 36 students is less than 157 cm is 0.7881.
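The table lookup can be cross-checked with `statistics.NormalDist` from the Python standard library. The sketch assumes, as the solution does, that the sample mean is normally distributed with mean µ and S.D. σ/√n.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 155, 15, 36

# Sampling distribution of the mean: normal with S.D. sigma / sqrt(n).
sampling_dist = NormalDist(mu, sigma / sqrt(n))
p = sampling_dist.cdf(157)

print(round(p, 4))   # 0.7881
```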

7. A random sample of size 100 is taken from an infinite population having the
mean µ = 76 and the variance σ² = 256. What is the probability that $\bar{x}$ will be
between 75 and 78?

Sol.
µ = mean of the population = 76
σ² = variance of population = 256, i.e. σ = 16
n = sample size = 100
When $\bar{x}_1 = 75$,
$z_1 = \frac{\bar{x}_1 - \mu}{\sigma/\sqrt{n}} = \frac{75 - 76}{16/\sqrt{100}} = -0.625$
And when $\bar{x}_2 = 78$,
$z_2 = \frac{\bar{x}_2 - \mu}{\sigma/\sqrt{n}} = \frac{78 - 76}{16/\sqrt{100}} = 1.25$
∴ $P(75 \le \bar{x} \le 78) = P(-0.625 \le z \le 1.25) = A(0.625) + A(1.25) = 0.2340 + 0.3944 = 0.6284 \approx 0.628$
(by symmetry, the area between −0.625 and 0 equals A(0.625)).

8. A random sample of size 64 is taken from an infinite population having the
mean 45 and the S.D. 8. What is the probability that $\bar{x}$ will be between 46 and
47.5?

Sol.
µ = mean of the population = 45
σ = S.D. of the population = 8
n = sample size = 64
When $\bar{x}_1 = 46$,
$z_1 = \frac{\bar{x}_1 - \mu}{\sigma/\sqrt{n}} = \frac{46 - 45}{8/\sqrt{64}} = 1$
And when $\bar{x}_2 = 47.5$,
$z_2 = \frac{\bar{x}_2 - \mu}{\sigma/\sqrt{n}} = \frac{47.5 - 45}{8/\sqrt{64}} = 2.5$
∴ $P(46 \le \bar{x} \le 47.5) = P(1 \le z \le 2.5) = A(2.5) - A(1) = 0.4938 - 0.3413 = 0.1525$
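Interval probabilities such as those in Examples 7 and 8 can be cross-checked with a short helper. The name `prob_mean_between` is an assumption for this sketch. Because it uses the exact normal distribution function rather than four-figure tables, its output agrees with the table-based answers only to about three decimal places.

```python
from math import sqrt
from statistics import NormalDist

def prob_mean_between(mu, sigma, n, low, high):
    """P(low <= sample mean <= high) under a normal sampling distribution."""
    sampling_dist = NormalDist(mu, sigma / sqrt(n))
    return sampling_dist.cdf(high) - sampling_dist.cdf(low)

p7 = prob_mean_between(76, 16, 100, 75, 78)    # Example 7
p8 = prob_mean_between(45, 8, 64, 46, 47.5)    # Example 8
print(round(p7, 3), round(p8, 3))
```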

9(H.W). A random sample of size 64 is taken from a normal population with µ
= 51.4 and σ = 68. What is the probability that the mean of the sample will (a)
exceed 52.9, (b) fall between 50.5 and 52.3, (c) 50.6

ESTIMATION
Estimate: An estimate is a statement made about the value of an unknown
population parameter on the basis of a sample.

Estimator: The procedure or rule used to determine an unknown population
parameter is called an estimator. For ex., the sample mean $\bar{x}$ is an estimator of
the population mean µ.

Types of Estimation:
Basically there are two kinds of estimates of a population parameter:

(a) Point Estimation: A point estimate of a parameter θ is a single numerical
value, which is computed from a given sample.

(b) Interval Estimation: An interval estimate is given by two values
between which the parameter may be considered to lie.

Interval Estimation of µ: The interval estimate of µ is given by the interval
$(\bar{x} - E_{Max}, \bar{x} + E_{Max})$ where $E_{Max} = z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$

Interval Estimation of p (proportion): The interval estimate of p is given
by the interval $(p - E_{Max}, p + E_{Max})$ where $E_{Max} = z_{\alpha/2} \sqrt{\frac{PQ}{n}}$

$z_{\alpha/2}$ values: 1.96 for 95% confidence
2.58 for 99% confidence
1.64 for 90% confidence
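The tabulated critical values can be recovered from the inverse normal distribution function. The helper name `z_half_alpha` is an assumption for this sketch. At 90% confidence the unrounded value is about 1.645, which tables commonly round to 1.64 or 1.65.

```python
from statistics import NormalDist

def z_half_alpha(confidence):
    """Critical value z_{alpha/2} for a two-sided interval, e.g. confidence=0.95."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(level, round(z_half_alpha(level), 3))
```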

1. A random sample of size 100 has a S.D. of 5. What can you say about the
maximum error with 95% confidence?

Sol.

Given σ = 5, n = 100, $z_{\alpha/2}$ for 95% confidence = 1.96

We know that $E_{Max} = z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 1.96 \cdot \frac{5}{\sqrt{100}} = 0.98$

2. Assuming that σ = 20.0, how large a random sample must be taken to assert with
probability 0.95 that the sample mean will not differ from the true mean by
more than 3.0?

Sol.
Given maximum error E = 3.0, σ = 20.0 and $z_{\alpha/2}$ = 1.96 for 95% confidence
We know that $E_{Max} = z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \Rightarrow n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2$
$n = \left(\frac{1.96 \times 20}{3}\right)^2 = 170.74 \simeq 171$
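The maximum-error formula and the sample-size formula derived from it can be sketched as two small functions. The names are made up for the illustration, and the sample size is rounded up because n must be a whole number large enough to meet the error bound.

```python
from math import ceil, sqrt

Z_95 = 1.96   # z_{alpha/2} for 95% confidence, as tabulated in the notes

def max_error(sigma, n, z=Z_95):
    """E_Max = z * sigma / sqrt(n)."""
    return z * sigma / sqrt(n)

def sample_size_for_mean(sigma, E, z=Z_95):
    """Smallest whole n with z * sigma / sqrt(n) <= E."""
    return ceil((z * sigma / E) ** 2)

print(round(max_error(5, 100), 2))        # 0.98
print(sample_size_for_mean(20.0, 3.0))    # 171
```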

3. In a study of automobile insurance, a random sample of 80 body repair
costs had a mean of Rs. 472.36 and a S.D. of Rs. 62.35. If $\bar{x}$ is used as a point
estimate of the true average repair cost, with what confidence can we assert
that the maximum error doesn't exceed Rs. 10?

Sol.
Given n = 80, $\bar{x}$ = 472.36, σ = 62.35, $E_{Max}$ = 10

$E_{Max} = z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \Rightarrow z_{\alpha/2} = \frac{E_{Max}\sqrt{n}}{\sigma} = \frac{10\sqrt{80}}{62.35} = \frac{89.4427}{62.35} = 1.4345$

$z_{\alpha/2} \approx 1.43$

The area from the tables when $z_{\alpha/2}$ = 1.43 is 0.4236
$1 - \alpha = 2 \times 0.4236 = 0.8472$
Confidence = $(1 - \alpha) \times 100\%$ = 84.72%
Hence we are 84.72% confident that the maximum error does not exceed Rs. 10.
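The same calculation can be run without tables. This sketch, with the assumed helper name `confidence_for_error`, solves for z and converts it to a two-sided confidence level. Because it keeps z = 1.4345 instead of rounding to 1.43, the result is close to the table-based 84.72% but not identical to it.

```python
from math import sqrt
from statistics import NormalDist

def confidence_for_error(E, sigma, n):
    """Confidence level 1 - alpha at which the maximum error equals E."""
    z = E * sqrt(n) / sigma
    return 2 * NormalDist().cdf(z) - 1

conf = confidence_for_error(E=10, sigma=62.35, n=80)
print(round(100 * conf, 1))   # confidence in percent
```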
4. If we can assert with 95% confidence that the maximum error is 0.05 and P = 0.2,
find the size of the sample.

Sol.
Given P = 0.2, E = 0.05

We have Q = 1 − P = 1 − 0.2 = 0.8 and $z_{\alpha/2}$ = 1.96 (for 95%)

We know that maximum error $E_{Max} = z_{\alpha/2} \sqrt{\frac{PQ}{n}} \Rightarrow 0.05 = 1.96\sqrt{\frac{0.2 \times 0.8}{n}}$
$n = \frac{0.2 \times 0.8 \times (1.96)^2}{(0.05)^2} = 245.86 \simeq 246$

5. What is the size of the smallest sample required to estimate an unknown
proportion to within a maximum error of 0.06 with at least 95% confidence?

Sol.
Given E = 0.06, confidence level = 95%,
i.e. $z_{\alpha/2}$ = 1.96

Here P is not given, so we take $P = \frac{1}{2} \Rightarrow Q = \frac{1}{2}$, the values for which PQ is largest.
Hence $n = \left(\frac{z_{\alpha/2}}{E}\right)^2 (PQ) \Rightarrow n = \left(\frac{1.96}{0.06}\right)^2 \left(\frac{1}{2} \cdot \frac{1}{2}\right) = 266.78 \simeq 267$
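Both proportion problems use the same formula, so one helper covers them. The name `sample_size_for_proportion` and its default P = 0.5 (the worst case, used when P is unknown) are choices made for this sketch.

```python
from math import ceil

def sample_size_for_proportion(E, P=0.5, z=1.96):
    """Smallest whole n with z * sqrt(P * Q / n) <= E, where Q = 1 - P."""
    return ceil(z ** 2 * P * (1 - P) / E ** 2)

print(sample_size_for_proportion(0.05, P=0.2))   # 246
print(sample_size_for_proportion(0.06))          # 267
```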

6. The mean and S.D. of a population are 11,795 and 14,054 respectively. What
can one assert with 95% confidence about the maximum error if $\bar{x}$ = 11,795 and
n = 50? Also construct a 95% confidence interval for the true mean.

Sol.
Given µ = 11795, σ = 14054, $\bar{x}$ = 11795, n = 50, $z_{\alpha/2}$ = 1.96

$E_{Max} = z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 1.96 \cdot \frac{14054}{\sqrt{50}} \approx 3896$

Confidence interval = $\left(\bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\right) = (\bar{x} - E_{Max},\ \bar{x} + E_{Max})$

= (11795 − 3896, 11795 + 3896)

= (7899, 15691)
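The interval can be recomputed in a few lines. The helper `confidence_interval` is an assumption for this sketch and uses the known-σ formula with z = 1.96. Evaluated at full floating-point precision, the margin comes out at roughly 3895.6.

```python
from math import sqrt

def confidence_interval(x_bar, sigma, n, z=1.96):
    """Return (E_Max, (lower, upper)) for a known-sigma interval."""
    E = z * sigma / sqrt(n)
    return E, (x_bar - E, x_bar + E)

E, (low, high) = confidence_interval(11795, 14054, 50)
print(round(E, 1), round(low), round(high))
```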
7.(H.W) A sample of 10 cam shafts intended for use in gasoline engines has an
average eccentricity of 1.02 inch and a S.D. of 0.044 inch. Assuming the data may be
treated as a random sample from a normal population, determine a 95% confidence
interval for the actual mean eccentricity of the cam shaft.
(Ans) = (0.993, 1.047)
