Sampling CH-9

The document discusses sampling with varying probabilities, also known as probability proportional to size sampling. It defines the procedure, which involves selecting units with probabilities proportional to some measure of their size. Formulas are provided for estimating population totals and means using this method. Examples are included to demonstrate how to select samples and calculate estimates.

Uploaded by

smiletopeace14
Copyright © All Rights Reserved

CHAPTER 9: SAMPLING WITH VARYING PROBABILITIES

9.1 Definition

In most practical applications the sampling units (schools, cities, kebeles, households, farms,
etc.) contain different numbers of elements or subunits. The sampling procedures discussed so
far assume that samples are selected by simple random sampling, in which the selection
probabilities are equal for all units in the population. If the units vary considerably in size,
simple random sampling may not be appropriate, since it does not take into account the possible
importance of the larger units in the population. One approach to handling units of different
sizes is to assign unequal probabilities of selection to the units in the population, in order to
obtain reasonable estimates of the population quantities.

For example, villages with larger geographical areas are likely to have larger populations and
larger areas under food crops. In estimating production or food supply, it may well be desirable
to adopt a scheme of selection in which villages are selected with probabilities proportional to
their population or to their geographical areas.

A sampling procedure in which the units are selected with probabilities proportional to some
measure of size is known as sampling with probability proportional to size (pps). The units may
be selected with or without replacement. In this chapter we only treat sampling with replacement.

9.2 Procedure of Selecting a Sample with PPS

a) Selection from a List Using Simple Random Sampling

To draw a sample of size n from a population of size N with probability proportional to size and
with replacement, we proceed as follows. If $Z_i$ is an ancillary variable giving the size of the
ith unit, i = 1, 2, ..., N, we form the successive cumulative totals $\sum_{j=1}^{i} Z_j$ and draw a
random number R not exceeding $Z = \sum_{i=1}^{N} Z_i$ from a table of random numbers. The unit
whose cumulative range contains R is selected. We repeat the procedure until the required sample
of size n is obtained.

Example 1: A village has 10 kebeles containing 150, 50, 80, 100, 200, 160, 40, 220, 60 and 140
households respectively. It is desired to select a sample of 4 kebeles with replacement and with
probability proportional to the number of households in the kebele. The first step in the selection
of kebeles is to form successive cumulative totals and ranges as shown below:

Sr. No. of     Size Z_i    Cumulative    Range
the Kebele     (HH)        Total
 1             150          150          1-150
 2              50          200          151-200
 3              80          280          201-280
 4             100          380          281-380
 5             200          580          381-580
 6             160          740          581-740
 7              40          780          741-780
 8             220         1000          781-1000
 9              60         1060          1001-1060
10             140         1200          1061-1200

To select a kebele, we choose a random number between 1 and 1200 with the help of a table of
random numbers. Suppose the chosen random number is 600. It will be seen that this number
falls within the range 581-740 associated with the 6th kebele and it is selected. Draw three more
random numbers and assume these numbers are 650, 850 and 300. Then the kebeles selected
corresponding to these random numbers are 6th, 8th and 4th respectively. We observe that in a
sample of 4 kebeles selected with probability proportional to size with replacement, the 6th
kebele is selected twice.
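
The selection in Example 1 can be reproduced with a short script. The following is a minimal sketch, not part of the original notes: the function name `pps_select` is hypothetical, and the random numbers are passed in explicitly so that the result can be compared with the hand calculation.

```python
from bisect import bisect_left
from itertools import accumulate


def pps_select(sizes, random_numbers):
    """Return 1-based unit numbers chosen by the cumulative total method.

    sizes          -- measures of size Z_i, in list order
    random_numbers -- integers R with 1 <= R <= Z, one per draw
    """
    cumulative = list(accumulate(sizes))  # successive cumulative totals
    total = cumulative[-1]                # Z, the sum of all sizes
    selected = []
    for r in random_numbers:
        if not 1 <= r <= total:
            raise ValueError(f"random number {r} is outside 1..{total}")
        # The first unit whose cumulative total is >= R contains R in its range.
        selected.append(bisect_left(cumulative, r) + 1)
    return selected


# Data of Example 1: households per kebele and the four random numbers drawn.
sizes = [150, 50, 80, 100, 200, 160, 40, 220, 60, 140]
selected = pps_select(sizes, [600, 650, 850, 300])
print(selected)
```

In practice the random numbers would come from a random number table or a generator such as `random.randint(1, sum(sizes))`; they are fixed here only to match the example.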

b) PPS Systematic Sampling

This method of selection is performed as follows:

• List the units, with the measure of size (or estimated size) $Z_i$ shown against each unit.
• Cumulate the values of $Z_i$ and enter the cumulative total against each unit; the last entry
  equals $Z = \sum_{i=1}^{N} Z_i$.
• If n is the sample size, compute the sampling interval I = Z/n (when Z/n is not an integer,
  see the last step).
• Choose a random number j between 1 and I inclusive, using a table of random numbers.
• Form the n numbers j, j + I, j + 2I, ..., j + (n-1)I. The units whose cumulative ranges contain
  these numbers are selected.
• If the interval I = Z/n is not an integer, a pps circular systematic sample can be obtained by
  selecting a random start j from 1 to Z and then proceeding cyclically with the integer
  nearest to Z/n as the interval.

Example 2: Consider Example 1 again, this time using PPS systematic sampling. The interval is
I = Z/n = 1200/4 = 300. Select a random number between 1 and 300 from a table of random numbers;
let this number be j = 291. The remaining three numbers are 591, 891 and 1191. The kebeles whose
ranges contain these four numbers are the 4th, 6th, 8th and 10th respectively.
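
PPS systematic selection can be sketched in the same way. This is an illustrative sketch only: the function name `pps_systematic` is hypothetical, the random start is passed in explicitly, and the code covers only the case where Z/n is an integer (the circular variant is not implemented).

```python
from bisect import bisect_left
from itertools import accumulate


def pps_systematic(sizes, n, start):
    """Return 1-based unit numbers for a pps systematic sample.

    sizes -- measures of size Z_i, in list order
    n     -- sample size; this sketch assumes Z is divisible by n
    start -- random start j with 1 <= j <= I, where I = Z / n
    """
    cumulative = list(accumulate(sizes))
    total = cumulative[-1]
    if total % n != 0:
        raise ValueError("Z/n is not an integer; the circular variant is needed")
    interval = total // n
    if not 1 <= start <= interval:
        raise ValueError(f"start must lie between 1 and {interval}")
    # Selection numbers j, j + I, ..., j + (n - 1) I.
    points = [start + k * interval for k in range(n)]
    # Each number selects the unit whose cumulative range contains it.
    return [bisect_left(cumulative, point) + 1 for point in points]


# Data of Example 2: the ten kebeles, n = 4 and random start j = 291.
sizes = [150, 50, 80, 100, 200, 160, 40, 220, 60, 140]
chosen = pps_systematic(sizes, n=4, start=291)
print(chosen)
```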

9.3 Estimation of Population Total $Y$ and Mean $\bar{Y}$ from Selection with Unequal
Probabilities

Suppose that the sampling is with replacement and that on each draw the probability of selecting
the ith unit of the population is $p_i$, and that the characteristic under study takes the value
$y_i$ on that unit, for i = 1, 2, ..., N.

Theorem 9.1: If a sample of n units is drawn with probabilities $p_i$ with replacement, then

$$\hat{Y}_{pps} = \frac{1}{n}\sum_{i=1}^{n}\frac{y_i}{p_i}$$

is an unbiased estimate of Y with variance

$$V(\hat{Y}_{pps}) = \frac{1}{n}\left(\sum_{i=1}^{N}\frac{y_i^2}{p_i} - Y^2\right) = \frac{1}{n}\sum_{i=1}^{N} p_i\left(\frac{y_i}{p_i} - Y\right)^2.$$

Prove this theorem.

Since this variance depends on unknown population values, it must be estimated from the sample.
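
One possible line of argument for Theorem 9.1 is sketched below. It is an outline rather than a full proof, and it introduces a symbol not used in the text: $t_k$, the value of $y_i/p_i$ observed on the kth draw. Because sampling is with replacement, the n draws are independent and identically distributed.

$$E(t_k) = \sum_{i=1}^{N} p_i\,\frac{y_i}{p_i} = \sum_{i=1}^{N} y_i = Y$$

$$V(t_k) = \sum_{i=1}^{N} p_i\left(\frac{y_i}{p_i} - Y\right)^2 = \sum_{i=1}^{N}\frac{y_i^2}{p_i} - Y^2$$

$$\hat{Y}_{pps} = \frac{1}{n}\sum_{k=1}^{n} t_k \;\Rightarrow\; E(\hat{Y}_{pps}) = Y, \qquad V(\hat{Y}_{pps}) = \frac{V(t_k)}{n}$$

The second equality in the variance of $t_k$ follows by expanding the square and using $\sum_{i=1}^{N} p_i = 1$.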

Theorem 9.2: If a sample of n units is drawn with probabilities $p_i$ with replacement, then for
any $n \ge 2$ an unbiased estimator of the variance $V(\hat{Y}_{pps})$ is given as

$$v(\hat{Y}_{pps}) = \frac{1}{n(n-1)}\sum_{i=1}^{n}\left(\frac{y_i}{p_i} - \hat{Y}_{pps}\right)^2.$$

Prove this theorem.
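
For the mean: An unbiased estimator of the population mean $\bar{Y}$ is

$$\bar{y}_{pps} = \frac{\hat{Y}_{pps}}{N} = \frac{1}{nN}\sum_{i=1}^{n}\frac{y_i}{p_i},$$

having the variance

$$V(\bar{y}_{pps}) = \frac{1}{N^2 n}\sum_{i=1}^{N} p_i\left(\frac{y_i}{p_i} - Y\right)^2.$$

The sample estimate of this variance is

$$v(\bar{y}_{pps}) = \frac{1}{N^2 n(n-1)}\sum_{i=1}^{n}\left(\frac{y_i}{p_i} - \hat{Y}_{pps}\right)^2.$$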
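
The unbiasedness claims of Theorems 9.1 and 9.2 can be checked numerically on a small population by enumerating every possible ordered sample. The sketch below is only an illustration: the three-unit population and its y-values are hypothetical and do not come from the text, and the function names are assumptions.

```python
from itertools import product


def pps_total(sample_y, sample_p):
    """Estimate of the total Y: the mean of y_i / p_i over the draws."""
    n = len(sample_y)
    return sum(y / p for y, p in zip(sample_y, sample_p)) / n


def pps_variance_estimate(sample_y, sample_p):
    """Sample estimate of the variance of the estimated total (needs n >= 2)."""
    n = len(sample_y)
    total = pps_total(sample_y, sample_p)
    squares = sum((y / p - total) ** 2 for y, p in zip(sample_y, sample_p))
    return squares / (n * (n - 1))


# Hypothetical population (illustrative values only).
sizes = [150, 50, 80]
y = [300.0, 120.0, 150.0]
Z = sum(sizes)
p = [z / Z for z in sizes]  # selection probabilities p_i = Z_i / Z
Y = sum(y)                  # true population total
n = 2

# Variance of the estimator according to Theorem 9.1.
true_var = sum(pi * (yi / pi - Y) ** 2 for yi, pi in zip(y, p)) / n

# Exact expectations over all ordered with-replacement samples of size n.
exp_total = 0.0
exp_var = 0.0
for draw in product(range(len(y)), repeat=n):
    prob = 1.0
    for i in draw:
        prob *= p[i]
    ys = [y[i] for i in draw]
    ps = [p[i] for i in draw]
    exp_total += prob * pps_total(ys, ps)
    exp_var += prob * pps_variance_estimate(ys, ps)

print(exp_total, Y)       # the two values should agree
print(exp_var, true_var)  # the two values should agree
```

Exhaustive enumeration is feasible only for very small N and n; for larger cases a simulation with repeated random draws would be the usual substitute.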

When selection is strictly proportional to size, that is, $p_i = Z_i/Z$, where
$Z = \sum_{i=1}^{N} Z_i$, Theorems 9.1 and 9.2 reduce to the following theorem.

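Theorem 9.3: If a sample of n units is drawn with probabilities proportional to size, $p_i = Z_i/Z$,
and with replacement, the unbiased estimate of Y is given as

$$\hat{Y}_{pps} = \frac{Z}{n}\sum_{i=1}^{n}\frac{y_i}{Z_i} = \frac{Z}{n}\sum_{i=1}^{n}\bar{y}_i = Z\,\bar{\bar{y}},$$

where $\bar{\bar{y}}$ is the unweighted mean of the unit means, with variance

$$V(\hat{Y}_{pps}) = \frac{Z}{n}\sum_{i=1}^{N} Z_i\left(\bar{y}_i - \bar{\bar{Y}}\right)^2, \quad \text{where } \bar{y}_i = \frac{y_i}{Z_i}, \quad \bar{\bar{y}} = \frac{1}{n}\sum_{i=1}^{n}\bar{y}_i \quad \text{and} \quad \bar{\bar{Y}} = \frac{Y}{Z}.$$

Similarly, an unbiased sample estimate of $V(\hat{Y}_{pps})$ is

$$v(\hat{Y}_{pps}) = \frac{Z^2}{n(n-1)}\sum_{i=1}^{n}\left(\bar{y}_i - \bar{\bar{y}}\right)^2.$$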
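
The reduction to Theorem 9.3 can be traced by direct substitution. The steps below are a worked sketch using only $p_i = Z_i/Z$, the unit mean $y_i/Z_i$, and the ratio $Y/Z$:

$$\frac{y_i}{p_i} = Z\,\frac{y_i}{Z_i}, \qquad Y = Z\,\frac{Y}{Z}$$

$$\frac{1}{n}\sum_{i=1}^{N} p_i\left(\frac{y_i}{p_i} - Y\right)^2 = \frac{1}{n}\sum_{i=1}^{N}\frac{Z_i}{Z}\,Z^2\left(\frac{y_i}{Z_i} - \frac{Y}{Z}\right)^2 = \frac{Z}{n}\sum_{i=1}^{N} Z_i\left(\frac{y_i}{Z_i} - \frac{Y}{Z}\right)^2$$

$$\frac{1}{n(n-1)}\sum_{i=1}^{n}\left(\frac{y_i}{p_i} - \hat{Y}_{pps}\right)^2 = \frac{Z^2}{n(n-1)}\sum_{i=1}^{n}\left(\frac{y_i}{Z_i} - \frac{\hat{Y}_{pps}}{Z}\right)^2$$

In the last line, $\hat{Y}_{pps}/Z$ is the unweighted mean of the sampled unit means, which gives the form of the variance estimator stated in the theorem.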

Example 3: A village has 24 households and the size of each household is shown below in the
table.

HH No.   HH Size   Cumulative size      HH No.   HH Size   Cumulative size
 1       5          5                   13       6          63
 2       3          8                   14       4          67
 3       3         11                   15       5          72
 4       7         18                   16       6          78
 5       4         22                   17       3          81
 6       4         26                   18       5          86
 7       6         32                   19       1          87
 8       6         38                   20       3          90
 9       4         42                   21       5          95
10       5         47                   22       6         101
11       3         50                   23       4         105
12       7         57                   24       4         109

a) Select 5 households with probability proportional to size, with replacement, using simple
random sampling.

Solution: Cumulate the sizes of the households to obtain $Z = \sum_{i=1}^{N} Z_i = 109$. Then choose
5 random numbers between 1 and 109 from a table of random numbers. If these numbers are 28, 36,
69, 80 and 104, they correspond to households 7, 8, 15, 17 and 23 respectively.

b) The following data were obtained from the selected households.

Sample      Size    Total monthly   Total monthly      p_i      x_i/Z_i    w_i/Z_i
Household   (Z_i)   income (x_i)    food cost (w_i)
1           6       61              27                 6/109    10.1667    4.5000
2           6       62              30                 6/109    10.3333    5.0000
3           5       58              25                 5/109    11.6000    5.0000
4           3       35              21                 3/109    11.6667    7.0000
5           4       47              22                 4/109    11.7500    5.5000

For the whole village estimate:


i) the total monthly income and its standard error.
ii) the total monthly food cost and its standard error.
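Solution: Taking $Z_i$ as the ancillary variable, we have $p_i = Z_i/Z$, $\bar{x}_i = x_i/Z_i$ and
$\bar{w}_i = w_i/Z_i$. For example,

$$\bar{x}_1 = \frac{x_1}{Z_1} = \frac{61}{6} = 10.1667, \qquad \bar{w}_1 = \frac{w_1}{Z_1} = \frac{27}{6} = 4.5, \quad \text{etc.}$$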

i) Total monthly income:

$$\hat{X}_{pps} = \frac{Z}{n}\sum_{i=1}^{n}\frac{x_i}{Z_i} = \frac{Z\sum_{i=1}^{n}\bar{x}_i}{n} = \frac{109(10.1667 + 10.3333 + 11.6 + 11.6667 + 11.75)}{5} = \frac{109 \times 55.5167}{5} = 1210.264$$

$$v(\hat{X}_{pps}) = \frac{Z^2}{n(n-1)}\sum_{i=1}^{n}\left(\bar{x}_i - \bar{\bar{x}}\right)^2,$$

where

$$\bar{\bar{x}} = \frac{\sum_{i=1}^{n}\bar{x}_i}{n} = \frac{10.1667 + 10.3333 + 11.6 + 11.6667 + 11.75}{5} = \frac{55.5167}{5} = 11.10334.$$

$$v(\hat{X}_{pps}) = \frac{109^2}{5(5-1)}\left[(10.1667 - 11.10334)^2 + (10.3333 - 11.10334)^2 + (11.6 - 11.10334)^2 + (11.6667 - 11.10334)^2 + (11.75 - 11.10334)^2\right]$$

$$= \frac{109^2 \times 2.45247}{20} = 1456.8898 \;\Rightarrow\; s.e.(\hat{X}_{pps}) = 38.1692$$
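
ii) Total monthly food cost:

$$\hat{W}_{pps} = \frac{Z}{n}\sum_{i=1}^{n}\frac{w_i}{Z_i} = \frac{Z\sum_{i=1}^{n}\bar{w}_i}{n} = \frac{109(4.5 + 5.0 + 5.0 + 7.0 + 5.5)}{5} = 588.6, \quad \text{where } \bar{\bar{w}} = \frac{\sum_{i=1}^{n}\bar{w}_i}{n} = 5.4$$

$$v(\hat{W}_{pps}) = \frac{Z^2}{n(n-1)}\sum_{i=1}^{n}\left(\bar{w}_i - \bar{\bar{w}}\right)^2$$

$$= \frac{109^2}{5(5-1)}\left[(4.5 - 5.4)^2 + (5 - 5.4)^2 + (5 - 5.4)^2 + (7 - 5.4)^2 + (5.5 - 5.4)^2\right] = \frac{109^2 \times 3.7}{20} = 2197.985$$

$$s.e.(\hat{W}_{pps}) = 46.88267$$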
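
The calculations of Example 3 can be re-run with a few lines of code. This is a minimal sketch under the formulas of Theorem 9.3; the function name `pps_size_estimate` is hypothetical. Because the script works with unrounded unit means, its results may differ from the hand calculation in the last decimal places.

```python
from math import sqrt


def pps_size_estimate(totals, sizes, Z):
    """Estimated total, its variance estimate and standard error (Theorem 9.3).

    totals -- observed unit totals (for example x_i) for the sampled units
    sizes  -- measures of size Z_i for the same units
    Z      -- sum of the sizes over the whole population
    """
    n = len(totals)
    unit_means = [t / z for t, z in zip(totals, sizes)]  # x_i / Z_i
    mean_of_means = sum(unit_means) / n                  # unweighted mean
    estimate = Z * mean_of_means
    squares = sum((m - mean_of_means) ** 2 for m in unit_means)
    variance = Z ** 2 * squares / (n * (n - 1))
    return estimate, variance, sqrt(variance)


# Data of Example 3: the five sampled households.
Z = 109
sizes = [6, 6, 5, 3, 4]
income = [61, 62, 58, 35, 47]
food = [27, 30, 25, 21, 22]

income_total, income_var, income_se = pps_size_estimate(income, sizes, Z)
food_total, food_var, food_se = pps_size_estimate(food, sizes, Z)

print(round(income_total, 2), round(income_se, 2))
print(round(food_total, 2), round(food_se, 2))
```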
