Sampling CH-5
RATIO ESTIMATORS
5.1. Estimation of a Ratio under Simple Random Sampling
We often have auxiliary information in surveys; few investigators go to the expense of taking a
good sample and then measure only one quantity. Often the sampling frame gives us extra
information about each unit that can be used to improve the precision of our estimates.
In the ratio method an auxiliary variate Xi correlated with Yi is obtained for each unit in the same
sample. The population total X of the Xi must be known. In practice Xi is often the value of Yi at
some previous time when a complete census was taken. The aim in this method is to obtain
increased precision by taking advantage of the correlation between Yi and Xi.
Suppose the units of the population possess two characteristics that are correlated with each other.
Let two variates Xi and Yi, i = 1, 2, …, N, represent the two characteristics, where Y is the study
variable and X is an auxiliary variable. Ratio estimation, which uses the ratio of these two
variables, is the simplest and most commonly used of the more complex estimation techniques
for improving precision.
For the ratio estimator we select a simple random sample (SRS) of size n from a population of size N
and measure both variables, Yi (the variable under study) and Xi (the auxiliary variable), on the
units in the sample. Let the values in the sample be denoted by
(𝑦𝑖 , 𝑥𝑖 ), 𝑖 = 1,2, … , 𝑛
We assume that the population mean $\bar{X}$ of the auxiliary variable is known, and the interest is to
estimate the ratio $R$, the mean, and the total of the study variable. In the population of size $N$ we have
$$Y = \sum_{i=1}^{N} Y_i \quad\text{and}\quad X = \sum_{i=1}^{N} X_i,$$
and the ratio is defined as
$$R = \frac{\sum_{i=1}^{N} Y_i}{\sum_{i=1}^{N} X_i} = \frac{Y}{X} = \frac{\bar{Y}}{\bar{X}}.$$
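As an illustration of how these quantities are estimated from a sample, the following minimal sketch computes the sample ratio and the corresponding ratio estimates of the mean and the total. It assumes the standard definitions (sample ratio = sample mean of y divided by sample mean of x, multiplied by the known population mean or total of x). The function name `ratio_estimates` and the data are hypothetical, chosen only for illustration.

```python
from statistics import mean

def ratio_estimates(y, x, X_total, N):
    """Ratio estimates of R, the population mean and the population total.

    y, x    : sample values (y_i, x_i) from a simple random sample
    X_total : known population total X of the auxiliary variable
    N       : population size
    """
    if len(y) != len(x) or len(y) == 0:
        raise ValueError("y and x must be non-empty and of equal length")
    R_hat = mean(y) / mean(x)   # same value as sum(y) / sum(x)
    X_bar = X_total / N         # known population mean of x
    return R_hat, R_hat * X_bar, R_hat * X_total

# Made-up illustrative data, not taken from the text.
y_sample = [12.0, 15.0, 9.0, 20.0]
x_sample = [10.0, 14.0, 8.0, 18.0]
R_hat, ybar_R, Y_R = ratio_estimates(y_sample, x_sample, X_total=1250.0, N=100)
print(R_hat, ybar_R, Y_R)
```

The three returned values are the estimates of the ratio, the mean, and the total, in that order.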
The estimated ratio, 𝑅̂ , is not necessarily an unbiased estimate of the population ratio R; that is,
𝐸(𝑅̂ ) ≠ 𝑅. In most situations the bias is small, and estimated ratios are widely used. For large
samples, mostly if n≥30, the ratio estimate is a consistent estimate, which tends to normality with
It shows that the magnitude of the bias in 𝑅̂ as a ratio of its standard error cannot exceed the
coefficient of variation of 𝑥̅ . If 𝑅̂ and 𝑥̅ are uncorrelated, the bias vanishes. If CV(𝑥̅ ) < 10%, the bias
can be ignored. The bias in 𝑌̅̂𝑅 and 𝑌̂𝑅 can be obtained in a similar way.
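The bound on the bias can be recovered in a few lines. The following is a sketch of the standard argument, assuming simple random sampling so that $E(\bar{x}) = \bar{X}$ and $E(\bar{y}) = \bar{Y}$, and using $\hat{R}\,\bar{x} = \bar{y}$. The symbols $\sigma_{\hat{R}}$ and $\sigma_{\bar{x}}$ (standard errors) and $\rho_{\hat{R},\bar{x}}$ (correlation between $\hat{R}$ and $\bar{x}$) are notation introduced here for illustration.

```latex
\begin{align*}
\operatorname{Cov}(\hat{R}, \bar{x})
  &= E(\hat{R}\,\bar{x}) - E(\hat{R})\,E(\bar{x})
   = \bar{Y} - \bar{X}\,E(\hat{R}), \\
E(\hat{R}) - R
  &= -\frac{\operatorname{Cov}(\hat{R}, \bar{x})}{\bar{X}}
   = -\frac{\rho_{\hat{R},\bar{x}}\,\sigma_{\hat{R}}\,\sigma_{\bar{x}}}{\bar{X}}, \\
\frac{\lvert E(\hat{R}) - R \rvert}{\sigma_{\hat{R}}}
  &= \lvert \rho_{\hat{R},\bar{x}} \rvert \,\frac{\sigma_{\bar{x}}}{\bar{X}}
   \le CV(\bar{x}).
\end{align*}
```

The second line also shows why the bias vanishes when $\hat{R}$ and $\bar{x}$ are uncorrelated.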
5.4. Comparison of the Ratio Estimate with the Mean per Unit
Theorem 5.3: In large samples, with simple random sampling, the ratio estimate $\hat{Y}_R$ has a smaller
variance than the estimate $\hat{Y} = N\bar{y}$ obtained by simple expansion if
$$\rho > \frac{1}{2}\,\frac{S_x/\bar{X}}{S_y/\bar{Y}} = \frac{1}{2}\,\frac{CV(X)}{CV(Y)},$$
where $\rho$ is the correlation between $Y_i$ and $X_i$.
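Proof: In large samples, with sampling fraction $f = n/N$,
$$V(\hat{Y}) = \frac{N^2(1-f)}{n}\,S_y^2, \qquad V(\hat{Y}_R) = \frac{N^2(1-f)}{n}\left(S_y^2 + R^2 S_x^2 - 2R\rho S_y S_x\right).$$
If $V(\hat{Y}_R) < V(\hat{Y})$, then $V(\hat{Y}) - V(\hat{Y}_R) > 0$, that is,
$$\frac{N^2(1-f)}{n}\,S_y^2 - \frac{N^2(1-f)}{n}\left(S_y^2 + R^2 S_x^2 - 2R\rho S_y S_x\right) > 0.$$
Dividing by the positive factor $N^2(1-f)/n$ gives
$$S_y^2 - S_y^2 - R^2 S_x^2 + 2R\rho S_y S_x > 0 \;\Longrightarrow\; 2R\rho S_y S_x > R^2 S_x^2 \;\Longrightarrow\; \rho > \frac{1}{2}\,\frac{R S_x}{S_y},$$
assuming $R = \bar{Y}/\bar{X}$ is positive. Substituting $R = \bar{Y}/\bar{X}$,
$$\rho > \frac{1}{2}\,\frac{R S_x}{S_y} = \frac{1}{2}\,\frac{S_x/\bar{X}}{S_y/\bar{Y}} = \frac{1}{2}\,\frac{CV(X)}{CV(Y)}.$$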
Therefore, if the difference between the variances of the simple expansion and ratio estimates is
greater than zero, i.e., 𝑉(𝑌̂) − 𝑉(𝑌̂𝑅 ) > 0, then the ratio estimate is more efficient. If the difference
is zero, both estimates are equally efficient. If the difference is less than zero, the ratio estimate
is less efficient than the estimate from simple expansion.
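The comparison in Theorem 5.3 can be checked numerically. The sketch below evaluates the condition on the correlation and the two large-sample variances for given population quantities. The function names and the numerical values are hypothetical, and the variance formulas are the large-sample forms, with the cross term written with the correlation between Y and X.

```python
def ratio_beats_expansion(rho, cv_x, cv_y):
    """Condition of Theorem 5.3: rho > (1/2) * CV(X) / CV(Y)."""
    return rho > 0.5 * cv_x / cv_y

def large_sample_variances(N, n, S_y, S_x, rho, R):
    """Return (V(Y_hat), V(Y_hat_R)) for the estimated population total."""
    f = n / N                          # sampling fraction
    k = N ** 2 * (1 - f) / n           # common factor N^2 (1 - f) / n
    v_expansion = k * S_y ** 2
    v_ratio = k * (S_y ** 2 + R ** 2 * S_x ** 2 - 2 * R * rho * S_y * S_x)
    return v_expansion, v_ratio

# Made-up population values, for illustration only.
N, n = 1000, 100
Y_bar, X_bar = 50.0, 40.0
S_y, S_x = 10.0, 8.0
R = Y_bar / X_bar
cv_y, cv_x = S_y / Y_bar, S_x / X_bar

for rho in (0.8, 0.4):
    v_exp, v_rat = large_sample_variances(N, n, S_y, S_x, rho, R)
    print(rho, ratio_beats_expansion(rho, cv_x, cv_y), v_exp, v_rat)
```

With these values the threshold is 0.5, so the ratio estimate has the smaller variance for the first correlation and the larger variance for the second.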
5.5. Ratio Estimate in Stratified Random Sampling
In stratified random sampling design, there are two methods for estimating ratios: the separate ratio
estimate and the combined ratio estimate.
The separate ratio estimate: For stratum h,
$$\hat{Y}_{Rh} = \frac{y_h}{x_h}\,X_h = \frac{\bar{y}_h}{\bar{x}_h}\,X_h = \hat{R}_h X_h,$$
and its variance is
$$V(\hat{Y}_{Rh}) = \frac{N_h^2(1-f_h)}{n_h}\left(S_{yh}^2 + R_h^2 S_{xh}^2 - 2R_h\rho_h S_{yh} S_{xh}\right),$$
where $y_h$ and $x_h$ are the sample totals in the hth stratum, and $X_h$ is the population stratum
total, which must be known. For the overall total, the
separate ratio estimate is denoted by $\hat{Y}_{Rs}$ and is given by
$$\hat{Y}_{Rs} = \sum_{h=1}^{L} \frac{y_h}{x_h}\,X_h = \sum_{h=1}^{L} \frac{\bar{y}_h}{\bar{x}_h}\,X_h,$$
with the variance given in the following theorem.
Theorem 5.4: If an independent simple random sample is drawn in each stratum and the sample sizes
are large in all strata, then the variance of $\hat{Y}_{Rs}$ is
$$V(\hat{Y}_{Rs}) = \sum_{h=1}^{L} \frac{N_h^2(1-f_h)}{n_h}\left(S_{yh}^2 + R_h^2 S_{xh}^2 - 2R_h\rho_h S_{yh} S_{xh}\right),$$
where $R_h$ and $\rho_h$ are the true ratio and the true correlation in stratum h, respectively.
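A minimal sketch of the separate ratio estimate of the total follows: each stratum gets its own sample ratio, which is multiplied by that stratum's known total of x, and the results are summed. The dictionary layout, the function name `separate_ratio_total`, and the data are assumptions made for illustration.

```python
def separate_ratio_total(strata):
    """Separate ratio estimate: sum over strata of (y_h / x_h) * X_h.

    Each stratum is a dict with sample values 'y' and 'x' and the known
    population stratum total 'X_h' of the auxiliary variable.
    """
    total = 0.0
    for s in strata:
        R_hat_h = sum(s["y"]) / sum(s["x"])   # stratum sample ratio
        total += R_hat_h * s["X_h"]           # stratum ratio estimate of Y_h
    return total

# Made-up two-stratum example, for illustration only.
strata = [
    {"y": [4.0, 6.0],       "x": [2.0, 3.0],      "X_h": 110.0},
    {"y": [9.0, 12.0, 9.0], "x": [6.0, 8.0, 6.0], "X_h": 310.0},
]
Y_Rs = separate_ratio_total(strata)
print(Y_Rs)
```

Because a ratio is estimated within every stratum, this estimator relies on each stratum sample being large enough for the stratum ratio to be estimated reliably.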
The combined ratio estimate: In a stratified sample, the estimate of the population total Y is
$$\hat{Y}_{st} = N\bar{y}_{st} = \sum_{h=1}^{L} N_h \bar{y}_h.$$
For the population total X, the estimate is
$$\hat{X}_{st} = N\bar{x}_{st} = \sum_{h=1}^{L} N_h \bar{x}_h.$$
The combined ratio estimate $\hat{Y}_{Rc}$ is given by
$$\hat{Y}_{Rc} = \frac{\hat{Y}_{st}}{\hat{X}_{st}}\,X = \frac{\bar{y}_{st}}{\bar{x}_{st}}\,X,$$
where $\bar{y}_{st}$ and $\bar{x}_{st}$ are the estimated population means from a stratified sample,
and X is known.
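The combined ratio estimate can be sketched in the same way: the stratified estimates of the totals of y and x are formed first, and a single ratio is then applied to the known population total X. The data layout, the function name `combined_ratio_total`, and the numbers are hypothetical; the stratified totals are computed as the sum over strata of the stratum size times the stratum sample mean.

```python
from statistics import mean

def combined_ratio_total(strata, X_total):
    """Combined ratio estimate: (Y_st_hat / X_st_hat) * X.

    Each stratum is a dict with sample values 'y' and 'x' and the
    stratum size 'N_h'. X_total is the known population total of x.
    """
    Y_st = sum(s["N_h"] * mean(s["y"]) for s in strata)  # stratified total of y
    X_st = sum(s["N_h"] * mean(s["x"]) for s in strata)  # stratified total of x
    return Y_st / X_st * X_total

# Made-up two-stratum example, for illustration only.
strata = [
    {"y": [4.0, 6.0],       "x": [2.0, 3.0],      "N_h": 40},
    {"y": [9.0, 12.0, 9.0], "x": [6.0, 8.0, 6.0], "N_h": 45},
]
Y_Rc = combined_ratio_total(strata, X_total=420.0)
print(Y_Rc)
```

Only the overall total X is needed here, whereas the separate estimate requires the total of x in every stratum.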
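Theorem 5.5: If the total sample size n is large, the variance of $\hat{\bar{Y}}_{Rc}$ is given by
$$V(\hat{\bar{Y}}_{Rc}) = \sum_{h=1}^{L} \frac{W_h^2(1-f_h)}{n_h}\left(S_{yh}^2 + R^2 S_{xh}^2 - 2R\rho_h S_{yh} S_{xh}\right),$$
where $W_h = N_h/N$ is the stratum weight and $R = Y/X$ is the overall population ratio. Because a
single ratio is estimated from the whole sample, the overall $R$ appears here in place of the stratum ratios $R_h$.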
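Corollary: For the estimated total, with the overall ratio $R = Y/X$,
$$V(\hat{Y}_{Rc}) = \sum_{h=1}^{L} \frac{N_h^2(1-f_h)}{n_h}\left(S_{yh}^2 + R^2 S_{xh}^2 - 2R\rho_h S_{yh} S_{xh}\right).$$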
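Corollary: The separate ratio estimate of R, $\hat{R}_s$, is given by
$$\hat{R}_s = \frac{\hat{Y}_{Rs}}{X} = \frac{\sum_{h=1}^{L}\hat{Y}_{Rh}}{X},$$
and its variance is
$$V(\hat{R}_s) = \frac{V(\hat{Y}_{Rs})}{X^2} = \frac{\sum_{h=1}^{L} V(\hat{Y}_{Rh})}{X^2} = \frac{1}{X^2}\sum_{h=1}^{L} \frac{N_h^2(1-f_h)}{n_h}\left(S_{yh}^2 + R_h^2 S_{xh}^2 - 2R_h\rho_h S_{yh} S_{xh}\right),$$
or, equivalently,
$$V(\hat{R}_s) = \frac{1}{\bar{X}^2}\sum_{h=1}^{L} \frac{W_h^2(1-f_h)}{n_h}\left(S_{yh}^2 + R_h^2 S_{xh}^2 - 2R_h\rho_h S_{yh} S_{xh}\right).$$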
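Corollary: The combined ratio estimate of R is
$$\hat{R}_c = \frac{\bar{y}_{st}}{\bar{x}_{st}} = \frac{\bar{y}_{st}}{\bar{x}_{st}}\cdot\frac{\bar{X}}{\bar{X}} = \frac{\hat{\bar{Y}}_{Rc}}{\bar{X}},$$
and its variance, with the overall ratio $R = Y/X$, is
$$V(\hat{R}_c) = \frac{V(\hat{\bar{Y}}_{Rc})}{\bar{X}^2} = \frac{1}{\bar{X}^2}\sum_{h=1}^{L} \frac{W_h^2(1-f_h)}{n_h}\left(S_{yh}^2 + R^2 S_{xh}^2 - 2R\rho_h S_{yh} S_{xh}\right).$$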
Note: The separate ratio estimator is more efficient if the population stratum ratio $R_h$ varies
considerably between strata and the sample size is large enough in each stratum. Using the same
auxiliary variable first for stratification and then for the ratio method of estimation is unlikely
to improve the efficiency of the estimators. If the population parameters in the above expressions
are unknown, substitute the appropriate sample estimators for them, i.e., $\hat{R}_h$, $\hat{R}$,
$s_{yh}^2$, $s_{xh}^2$, $s_{xyh}$, $\hat{\rho}_h$.
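The substitution of sample estimators can be sketched as follows. The code plugs the stratum sample variances and the sample covariance into the large-sample variance expressions for the estimated total, using the identity that the correlation times the two standard deviations equals the covariance. The function names and data are hypothetical. The separate version uses each stratum's own sample ratio, while the combined version uses one overall sample ratio; the latter follows the standard large-sample result for the combined estimator and is stated here as an assumption.

```python
from statistics import mean, variance

def stratum_variance_term(y, x, N_h, R):
    """N_h^2 (1 - f_h) / n_h * (s_y^2 + R^2 s_x^2 - 2 R s_xy) for one stratum."""
    n_h = len(y)
    f_h = n_h / N_h
    y_bar, x_bar = mean(y), mean(x)
    # Sample covariance s_xyh, which estimates rho_h * S_yh * S_xh.
    s_xy = sum((yi - y_bar) * (xi - x_bar) for yi, xi in zip(y, x)) / (n_h - 1)
    bracket = variance(y) + R ** 2 * variance(x) - 2 * R * s_xy
    return N_h ** 2 * (1 - f_h) / n_h * bracket

def var_separate(strata):
    """Plug-in variance of the separate ratio estimate of the total."""
    return sum(
        stratum_variance_term(s["y"], s["x"], s["N_h"], sum(s["y"]) / sum(s["x"]))
        for s in strata
    )

def var_combined(strata):
    """Plug-in variance of the combined ratio estimate of the total."""
    Y_st = sum(s["N_h"] * mean(s["y"]) for s in strata)
    X_st = sum(s["N_h"] * mean(s["x"]) for s in strata)
    R_c = Y_st / X_st                      # single overall sample ratio
    return sum(stratum_variance_term(s["y"], s["x"], s["N_h"], R_c) for s in strata)

# Made-up two-stratum example, for illustration only.
strata = [
    {"y": [4.0, 7.0],       "x": [2.0, 3.0],      "N_h": 40},
    {"y": [9.0, 12.0, 9.0], "x": [6.0, 8.0, 6.0], "N_h": 45},
]
v_sep = var_separate(strata)
v_comb = var_combined(strata)
print(v_sep, v_comb)
```

In this example the stratum ratios differ noticeably, and the separate estimate has the smaller estimated variance, in line with the note above. With only two or three units per stratum the large-sample formulas are not reliable, so the numbers serve only to show the mechanics.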