Journal of Reliability and Statistical Studies; ISSN (Print): 0974-8024, (Online): 2229-5666

Vol. 10, Issue 1 (June 2017): 59-68

ESTIMATION OF POPULATION MEAN USING TWO AUXILIARY VARIABLES IN STRATIFIED RANDOM SAMPLING

Madhulika Mishra, B. P. Singh and *Rajesh Singh


Department of Statistics, Banaras Hindu University, Varanasi, India
E Mail: *[email protected]
*Corresponding Author

Received October 21, 2016
Modified April 08, 2017
Accepted May 19, 2017

Abstract
This article addresses the estimation of the population mean in stratified random
sampling using information on two auxiliary variables. The mean squared error (MSE) of the
proposed estimator is derived up to the first order of approximation and compared with the
MSEs of existing estimators. An empirical study on real data sets shows that the proposed
estimator is more efficient than the existing estimators.

Key Words: Study Variable, Auxiliary Variables, Stratified Random Sampling, Mean
Squared Error, Efficiency, Bias.

1. Introduction
In survey sampling, it is advantageous to use information on an auxiliary
variable that is highly correlated with the variable of interest. Such auxiliary
information increases the precision of estimators of unknown population parameters.
Several authors have used auxiliary information in the estimation of population
parameters, including Srivastava and Jhajj (1981), Bahl and Tuteja (1991), Singh and
Vishwakarma (2007), Sahai and Ray (1980), Srivastava and Jhajj (1983), Srivastava (1971),
Swain (1970) and Perri (2007).

Here we incorporate auxiliary information in stratified random sampling. Several
authors, such as Haq and Shabbir (2013), Shabbir and Gupta (2006) and Kadilar and Cingi
(2003), have proposed estimators in stratified random sampling using information on a
single auxiliary variable. In many situations, however, information is available on two
auxiliary variables. Tailor et al. (2012) suggested a ratio-cum-product estimator of the
population mean in stratified random sampling using two auxiliary variables. Koyuncu
and Kadilar (2009) proposed a family of estimators of the population mean using two
auxiliary variables in stratified random sampling. Verma et al. (2015) gave some families
of estimators using two auxiliary variables in stratified random sampling, and Singh and
Kumar (2012) proposed improved estimators of the population mean in the same setting.
This paper discusses the problem of estimating the finite population mean in stratified
random sampling using information on two auxiliary variables.
Consider a finite population $P = (P_1, P_2, \ldots, P_N)$ of size $N$ divided into $L$
strata, with $N_h$ units in the $h$th stratum $(h = 1, 2, \ldots, L)$, such that

$$N = \sum_{h=1}^{L} N_h .$$

Let $Y$ be the study variable and $X$ and $Z$ be the auxiliary variables, taking values
$y_{hi}$, $x_{hi}$ and $z_{hi}$ $(h = 1, 2, \ldots, L;\ i = 1, 2, \ldots, N_h)$ on the $i$th
unit of the $h$th stratum.
Now we define

$\bar{Y}_h = \frac{1}{N_h}\sum_{i=1}^{N_h} y_{hi}$ : $h$th stratum mean for the study variable $Y$

$\bar{X}_h = \frac{1}{N_h}\sum_{i=1}^{N_h} x_{hi}$ : $h$th stratum mean for the auxiliary variable $X$

$\bar{Z}_h = \frac{1}{N_h}\sum_{i=1}^{N_h} z_{hi}$ : $h$th stratum mean for the auxiliary variable $Z$

$\bar{Y} = \frac{1}{N}\sum_{h=1}^{L}\sum_{i=1}^{N_h} y_{hi} = \frac{1}{N}\sum_{h=1}^{L} N_h \bar{Y}_h = \sum_{h=1}^{L} W_h \bar{Y}_h$ : Population mean of the study variable $Y$

$\bar{X} = \frac{1}{N}\sum_{h=1}^{L}\sum_{i=1}^{N_h} x_{hi} = \sum_{h=1}^{L} W_h \bar{X}_h$ : Population mean of the auxiliary variable $X$

$\bar{Z} = \frac{1}{N}\sum_{h=1}^{L}\sum_{i=1}^{N_h} z_{hi} = \sum_{h=1}^{L} W_h \bar{Z}_h$ : Population mean of the auxiliary variable $Z$

$\bar{y}_h = \frac{1}{n_h}\sum_{i=1}^{n_h} y_{hi}$ : Sample mean of the study variable $Y$ for the $h$th stratum

$\bar{x}_h = \frac{1}{n_h}\sum_{i=1}^{n_h} x_{hi}$ : Sample mean of the auxiliary variable $X$ for the $h$th stratum

$\bar{z}_h = \frac{1}{n_h}\sum_{i=1}^{n_h} z_{hi}$ : Sample mean of the auxiliary variable $Z$ for the $h$th stratum

$W_h = \frac{N_h}{N}$ : Stratum weight for the $h$th stratum

$f_h = \frac{n_h}{N_h}$ : Sampling fraction

$\lambda_h = \frac{1 - f_h}{n_h}$

Also,

$$\bar{y}_{st} = \sum_{h=1}^{L} W_h \bar{y}_h, \quad \bar{x}_{st} = \sum_{h=1}^{L} W_h \bar{x}_h \quad \text{and} \quad \bar{z}_{st} = \sum_{h=1}^{L} W_h \bar{z}_h$$

are the usual unbiased estimators of the population means $\bar{Y}$, $\bar{X}$ and $\bar{Z}$ in
stratified random sampling.
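To make the notation concrete, the following minimal Python sketch computes the stratum weights, the sampling fractions, the factor $(1 - f_h)/n_h$ and the overall means from per-stratum summaries. The numbers are the Population 2 values (Murthy, 1967) listed in Section 4; the function and variable names are illustrative assumptions and do not come from the paper.

```python
# Stratum summaries for Population 2 (Murthy, 1967), as listed in Section 4.
N_h = [5, 5]                    # stratum sizes
n_h = [2, 3]                    # stratum sample sizes
Ybar_h = [1925.80, 315.60]      # stratum means of Y
Xbar_h = [214.40, 333.80]       # stratum means of X
Zbar_h = [51.80, 60.60]         # stratum means of Z


def design_quantities(N_h, n_h):
    """Return stratum weights W_h, sampling fractions f_h and (1 - f_h) / n_h."""
    N = sum(N_h)
    W = [Nh / N for Nh in N_h]
    f = [nh / Nh for nh, Nh in zip(n_h, N_h)]
    lam = [(1 - fh) / nh for fh, nh in zip(f, n_h)]
    return W, f, lam


def overall_mean(W, stratum_means):
    """Population mean as the weighted sum of the stratum means."""
    return sum(w * m for w, m in zip(W, stratum_means))


W, f, lam = design_quantities(N_h, n_h)
Ybar = overall_mean(W, Ybar_h)
Xbar = overall_mean(W, Xbar_h)
Zbar = overall_mean(W, Zbar_h)
print("W_h:", W, "f_h:", f, "lambda_h:", lam)
print("Ybar, Xbar, Zbar:", Ybar, Xbar, Zbar)
```

The same helper functions apply unchanged to the six strata of Population 1 once its lists are substituted.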

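To obtain the MSEs, let us define

$$\epsilon_0 = \frac{\bar{y}_{st} - \bar{Y}}{\bar{Y}}, \quad \epsilon_1 = \frac{\bar{x}_{st} - \bar{X}}{\bar{X}} \quad \text{and} \quad \epsilon_2 = \frac{\bar{z}_{st} - \bar{Z}}{\bar{Z}} .$$

Using these notations,

$$E(\epsilon_0) = E(\epsilon_1) = E(\epsilon_2) = 0,$$

$$V_{rst} = \sum_{h=1}^{L} W_h^{r+s+t}\, \frac{E\left[(\bar{y}_h - \bar{Y}_h)^r (\bar{x}_h - \bar{X}_h)^s (\bar{z}_h - \bar{Z}_h)^t\right]}{\bar{Y}^r \bar{X}^s \bar{Z}^t} \qquad (1.1)$$

Using (1.1) we can write:

$$E(\epsilon_0^2) = \frac{\sum_{h=1}^{L} W_h^2 \lambda_h S_{yh}^2}{\bar{Y}^2} = V_{200}$$

$$E(\epsilon_1^2) = \frac{\sum_{h=1}^{L} W_h^2 \lambda_h S_{xh}^2}{\bar{X}^2} = V_{020}$$

$$E(\epsilon_2^2) = \frac{\sum_{h=1}^{L} W_h^2 \lambda_h S_{zh}^2}{\bar{Z}^2} = V_{002}$$

$$E(\epsilon_0 \epsilon_1) = \frac{\sum_{h=1}^{L} W_h^2 \lambda_h S_{xyh}}{\bar{X}\bar{Y}} = V_{110}$$

$$E(\epsilon_1 \epsilon_2) = \frac{\sum_{h=1}^{L} W_h^2 \lambda_h S_{xzh}}{\bar{X}\bar{Z}} = V_{011}$$

$$E(\epsilon_0 \epsilon_2) = \frac{\sum_{h=1}^{L} W_h^2 \lambda_h S_{yzh}}{\bar{Y}\bar{Z}} = V_{101}$$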
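where

$$S_{yh}^2 = \sum_{i=1}^{N_h} \frac{(y_{hi} - \bar{Y}_h)^2}{N_h - 1}, \quad S_{xh}^2 = \sum_{i=1}^{N_h} \frac{(x_{hi} - \bar{X}_h)^2}{N_h - 1}, \quad S_{zh}^2 = \sum_{i=1}^{N_h} \frac{(z_{hi} - \bar{Z}_h)^2}{N_h - 1},$$

$$S_{xyh} = \sum_{i=1}^{N_h} \frac{(x_{hi} - \bar{X}_h)(y_{hi} - \bar{Y}_h)}{N_h - 1}, \quad S_{yzh} = \sum_{i=1}^{N_h} \frac{(z_{hi} - \bar{Z}_h)(y_{hi} - \bar{Y}_h)}{N_h - 1}, \quad S_{xzh} = \sum_{i=1}^{N_h} \frac{(x_{hi} - \bar{X}_h)(z_{hi} - \bar{Z}_h)}{N_h - 1} .$$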
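The six second-order terms obtained from (1.1) can be evaluated directly from stratum summaries. The sketch below does this for the Population 2 data of Section 4; the dictionary layout and the helper name `v_term` are assumptions made for illustration, and the variances are taken as the squares of the listed standard deviations.

```python
# Population 2 (Murthy, 1967) summaries from Section 4; names are illustrative.
N_h, n_h = [5, 5], [2, 3]
means = {"y": [1925.80, 315.60], "x": [214.40, 333.80], "z": [51.80, 60.60]}
# Stratum variances (squared standard deviations) and covariances.
S = {
    ("y", "y"): [615.92**2, 340.38**2],
    ("x", "x"): [74.87**2, 66.35**2],
    ("z", "z"): [0.75**2, 4.84**2],
    ("y", "x"): [39360.68, 22356.50],
    ("y", "z"): [411.16, 1536.24],
    ("x", "z"): [38.08, 287.92],
}

N = sum(N_h)
W = [Nh / N for Nh in N_h]
lam = [(1 - nh / Nh) / nh for nh, Nh in zip(n_h, N_h)]
pop_mean = {k: sum(w * m for w, m in zip(W, v)) for k, v in means.items()}


def v_term(a, b):
    """sum_h W_h^2 * lam_h * S_abh divided by the product of the two means."""
    num = sum(w**2 * l * s for w, l, s in zip(W, lam, S[(a, b)]))
    return num / (pop_mean[a] * pop_mean[b])


V = {
    "200": v_term("y", "y"), "020": v_term("x", "x"), "002": v_term("z", "z"),
    "110": v_term("y", "x"), "101": v_term("y", "z"), "011": v_term("x", "z"),
}
for name, value in V.items():
    print(f"V{name} = {value:.6f}")
```

Multiplying `V["200"]` by the squared mean of $Y$ recovers the variance of the stratified mean, which gives a quick consistency check against the tabulated results.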

2. Estimators available in literature


In this section, we consider several estimators of the finite population mean
that are available in the sampling literature. The variances and mean squared errors
(MSEs) of all the estimators considered here are obtained to the first order of
approximation.

• The usual unbiased estimator of the population mean in stratified random
sampling is defined as:

$$\bar{y}_{st} = \sum_{h=1}^{L} W_h \bar{y}_h \qquad (2.1)$$

• The notation used here follows Dayal (1980). Other useful references for
stratified sampling are Cochran (1977, Chapter 5) and Reddy (1978).

The variance of the estimator $\bar{y}_{st}$ is:

$$V(\bar{y}_{st}) = \sum_{h=1}^{L} W_h^2 \lambda_h S_{yh}^2 \qquad (2.2)$$
h =1
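• Koyuncu and Kadilar (2009) suggested different ratio type estimators for the
population mean $\bar{Y}$, utilizing information on the known population means
$\bar{X}$ and $\bar{Z}$ of the auxiliary variables $X$ and $Z$, as:

$$\bar{y}_1 = \bar{y}_{st} \left(\frac{\bar{X}}{\bar{x}_{st}}\right) \left(\frac{\bar{Z}}{\bar{z}_{st}}\right) \qquad (2.3)$$

and

$$\bar{y}_2 = \bar{y}_{st} \left(\frac{\bar{X}}{\bar{x}_{st}}\right) \left(\frac{\bar{z}_{st}}{\bar{Z}}\right) \qquad (2.4)$$

The mean square error of the estimator $\bar{y}_1$ is given by:

$$\mathrm{MSE}(\bar{y}_1) = \bar{Y}^2 \left(V_{200} + V_{020} + V_{002} - 2V_{110} + 2V_{011} - 2V_{101}\right) \qquad (2.5)$$

The mean square error of the estimator $\bar{y}_2$ is given by:

$$\mathrm{MSE}(\bar{y}_2) = \bar{Y}^2 \left(V_{020} + V_{200} + V_{002} - 2V_{110} - 2V_{011} + 2V_{101}\right) \qquad (2.6)$$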
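As a numerical illustration of (2.2), (2.5) and (2.6), the sketch below evaluates the three expressions for the Population 2 summaries of Section 4. The helper `weighted` and all variable names are illustrative assumptions; the results can be compared with the corresponding rows of Table 2.

```python
# Population 2 (Murthy, 1967) stratum summaries from Section 4.
W = [0.5, 0.5]                               # N_h / N with N_1 = N_2 = 5
lam = [(1 - 2 / 5) / 2, (1 - 3 / 5) / 3]     # (1 - f_h) / n_h with n_h = (2, 3)
Ybar = 0.5 * 1925.80 + 0.5 * 315.60
Xbar = 0.5 * 214.40 + 0.5 * 333.80
Zbar = 0.5 * 51.80 + 0.5 * 60.60


def weighted(s):
    """sum_h W_h^2 * lam_h * s_h for a list of stratum (co)variances."""
    return sum(w**2 * l * sh for w, l, sh in zip(W, lam, s))


V200 = weighted([615.92**2, 340.38**2]) / Ybar**2
V020 = weighted([74.87**2, 66.35**2]) / Xbar**2
V002 = weighted([0.75**2, 4.84**2]) / Zbar**2
V110 = weighted([39360.68, 22356.50]) / (Xbar * Ybar)
V101 = weighted([411.16, 1536.24]) / (Ybar * Zbar)
V011 = weighted([38.08, 287.92]) / (Xbar * Zbar)

var_st = Ybar**2 * V200                                                  # (2.2)
mse_1 = Ybar**2 * (V200 + V020 + V002 - 2 * V110 + 2 * V011 - 2 * V101)  # (2.5)
mse_2 = Ybar**2 * (V200 + V020 + V002 - 2 * V110 - 2 * V011 + 2 * V101)  # (2.6)
print(round(var_st, 2), round(mse_1, 2), round(mse_2, 2))
```

The two ratio-type expressions differ only in the signs attached to $V_{011}$ and $V_{101}$, reflecting whether $\bar{z}_{st}$ enters the estimator in the denominator or the numerator.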
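When information on two auxiliary variables is available, the usual regression
estimator is defined as:

$$\bar{y}_{lr} = \bar{y}_{st} + b_1(\bar{X} - \bar{x}_{st}) + b_2(\bar{Z} - \bar{z}_{st}) \qquad (2.7)$$

The mean square error of the estimator $\bar{y}_{lr}$ is given by:

$$\mathrm{MSE}(\bar{y}_{lr}) = \sum_{h=1}^{L} W_h^2 \lambda_h S_{yh}^2 \left(1 - \rho_{yxh}^2 - \rho_{yzh}^2 + 2\rho_{yxh}\rho_{yzh}\rho_{xzh}\right) \qquad (2.8)$$

where $\rho_{yxh}$, $\rho_{yzh}$ and $\rho_{xzh}$ denote the correlation coefficients between the
respective variables in the $h$th stratum.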
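Expression (2.8) can be evaluated stratum by stratum. The sketch below does so for the Population 2 summaries of Section 4. Since correlations are not listed for that population, computing them as covariance divided by the product of standard deviations is an assumption of this sketch, as are the variable names.

```python
# Population 2 (Murthy, 1967) stratum summaries from Section 4.
W = [0.5, 0.5]                               # N_h / N
lam = [(1 - 2 / 5) / 2, (1 - 3 / 5) / 3]     # (1 - f_h) / n_h
S_y, S_x, S_z = [615.92, 340.38], [74.87, 66.35], [0.75, 4.84]
S_yx, S_yz, S_xz = [39360.68, 22356.50], [411.16, 1536.24], [38.08, 287.92]

mse_lr = 0.0
for h in range(len(W)):
    # Stratum correlations derived from covariances and standard deviations
    # (an assumption: the paper does not tabulate them for this population).
    r_yx = S_yx[h] / (S_y[h] * S_x[h])
    r_yz = S_yz[h] / (S_y[h] * S_z[h])
    r_xz = S_xz[h] / (S_x[h] * S_z[h])
    # Bracketed factor of (2.8), exactly as stated in the text.
    factor = 1 - r_yx**2 - r_yz**2 + 2 * r_yx * r_yz * r_xz
    mse_lr += W[h] ** 2 * lam[h] * S_y[h] ** 2 * factor

print(round(mse_lr, 2))
```

The result can be compared with the $\bar{y}_{lr}$ row of Table 2.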

3. Proposed estimator
For estimating the unknown population mean $\bar{Y}$ of the study variable, we propose
the following estimator:

$$t = w_0 t_0 + w_1 t_1 + w_2 t_2 \qquad (3.1)$$

where

$$t_0 = \bar{y}_{st}, \quad t_1 = \bar{y}_{st} \left(\frac{\bar{X}}{\bar{x}_{st}}\right) \left(\frac{\bar{z}_{st}}{\bar{Z}}\right) \quad \text{and} \quad t_2 = \bar{y}_{st} \exp\left(\frac{\bar{X} - \bar{x}_{st}}{\bar{X} + \bar{x}_{st}}\right) \exp\left(\frac{\bar{z}_{st} - \bar{Z}}{\bar{z}_{st} + \bar{Z}}\right).$$

Here $t_0, t_1, t_2 \in W$, where $W$ denotes the set of all possible estimators of the
population mean $\bar{Y}$. By definition, the set $W$ is a linear variety if

$$t = \sum_{i=0}^{2} w_i t_i \in W \qquad (3.2)$$

such that

$$\sum_{i=0}^{2} w_i = 1 \quad \text{and} \quad w_i \in \mathbb{R}, \qquad (3.3)$$

where $w_i$ $(i = 0, 1, 2)$ denote the constants used for reducing the bias in the class of
estimators.

The form of the estimator in equation (3.1) is chosen so that, with a suitable choice of
the $w_i$, it is unbiased for the population mean $\bar{Y}$ up to the first order of
approximation. The technique used here is "filtration of bias", one of the methods used to
remove bias from ratio and product type estimators. Other methods used to remove bias
include Quenouille's method and the interpenetrating sampling method [Singh (2003)].

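Expressing the estimator $t$ in terms of the $\epsilon$'s, we get

$$t = \bar{Y}(1 + \epsilon_0)\left[w_0 + w_1 (1 + \epsilon_1)^{-1}(1 + \epsilon_2) + w_2 \exp\left\{-\epsilon_1 (2 + \epsilon_1)^{-1}\right\} \exp\left\{\epsilon_2 (2 + \epsilon_2)^{-1}\right\}\right] \qquad (3.4)$$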
Expanding equation (3.4) and retaining terms up to the second degree in the $\epsilon$'s
(the first order of approximation), we can write

$$t = \bar{Y}(1 + \epsilon_0)\left[1 - \epsilon_1\left(w_1 + \frac{w_2}{2}\right) + \epsilon_2\left(w_1 + \frac{w_2}{2}\right) + \epsilon_1^2\left(w_1 + \frac{3}{8} w_2\right) - \frac{1}{8} w_2 \epsilon_2^2 - \epsilon_1 \epsilon_2\left(w_1 + \frac{w_2}{4}\right)\right] \qquad (3.5)$$
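The intermediate steps between (3.4) and (3.5) are not spelled out in the text. The following is a sketch of the standard Taylor expansions involved, truncated at second-degree terms; it uses only the constraint $w_0 + w_1 + w_2 = 1$ from (3.3).

```latex
\begin{align*}
(1+\epsilon_1)^{-1}(1+\epsilon_2)
  &\approx 1 - \epsilon_1 + \epsilon_2 + \epsilon_1^2 - \epsilon_1\epsilon_2, \\
\exp\{-\epsilon_1(2+\epsilon_1)^{-1}\}
  &\approx 1 - \tfrac{1}{2}\epsilon_1 + \tfrac{3}{8}\epsilon_1^2, \\
\exp\{\epsilon_2(2+\epsilon_2)^{-1}\}
  &\approx 1 + \tfrac{1}{2}\epsilon_2 - \tfrac{1}{8}\epsilon_2^2, \\
\exp\{-\epsilon_1(2+\epsilon_1)^{-1}\}\exp\{\epsilon_2(2+\epsilon_2)^{-1}\}
  &\approx 1 - \tfrac{1}{2}\epsilon_1 + \tfrac{1}{2}\epsilon_2
     + \tfrac{3}{8}\epsilon_1^2 - \tfrac{1}{8}\epsilon_2^2
     - \tfrac{1}{4}\epsilon_1\epsilon_2 .
\end{align*}
```

Multiplying the first line by $w_1$ and the last line by $w_2$, adding $w_0$, and using $w_0 + w_1 + w_2 = 1$ gives the bracketed expression in (3.5).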
Subtracting $\bar{Y}$ from both sides of equation (3.5) and taking expectations, the bias
of the estimator $t$ up to the first order of approximation is obtained as:

$$\mathrm{Bias}(t) = \bar{Y}\left[(V_{101} - V_{110})\left(w_1 + \frac{w_2}{2}\right) + V_{020}\left(w_1 + \frac{3}{8} w_2\right) - \frac{1}{8} w_2 V_{002} - \left(w_1 + \frac{w_2}{4}\right) V_{011}\right] \qquad (3.6)$$
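Retaining only the first-degree terms in $\epsilon_1$ and $\epsilon_2$ inside the bracket of
(3.5), we get

$$t - \bar{Y} = \bar{Y}\left[\epsilon_0 + \left(w_1 + \frac{w_2}{2}\right)(\epsilon_2 - \epsilon_1)(1 + \epsilon_0)\right] \qquad (3.7)$$

Squaring both sides, taking expectations and keeping terms up to the second degree, we get

$$\mathrm{MSE}(t) = \bar{Y}^2\left[V_{200} + Q^2\{V_{002} + V_{020} - 2V_{011}\} + 2Q\{V_{101} - V_{110}\}\right] \qquad (3.8)$$

where

$$Q = w_1 + \frac{w_2}{2} \qquad (3.9)$$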
The MSE of the estimator $t$ is minimum when

$$Q = \frac{-(V_{101} - V_{110})}{V_{002} + V_{020} - 2V_{011}} \qquad (3.10)$$

Putting this value of $Q$ in equation (3.8), we get the minimum MSE of the estimator $t$,
which is given by

$$\min \mathrm{MSE}(t) = \bar{Y}^2\left[V_{200} - \frac{(V_{101} - V_{110})^2}{V_{002} + V_{020} - 2V_{011}}\right] \qquad (3.11)$$
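Since (3.8) is a quadratic in $Q$, the optimum in (3.10) and the minimum in (3.11) are easy to check numerically. The sketch below does this for the Population 2 summaries of Section 4; the function `mse_t` and the other names are illustrative assumptions.

```python
# Population 2 (Murthy, 1967) stratum summaries from Section 4.
W = [0.5, 0.5]                               # N_h / N
lam = [(1 - 2 / 5) / 2, (1 - 3 / 5) / 3]     # (1 - f_h) / n_h
Ybar = 0.5 * 1925.80 + 0.5 * 315.60
Xbar = 0.5 * 214.40 + 0.5 * 333.80
Zbar = 0.5 * 51.80 + 0.5 * 60.60


def weighted(s):
    """sum_h W_h^2 * lam_h * s_h for a list of stratum (co)variances."""
    return sum(w**2 * l * sh for w, l, sh in zip(W, lam, s))


V200 = weighted([615.92**2, 340.38**2]) / Ybar**2
V020 = weighted([74.87**2, 66.35**2]) / Xbar**2
V002 = weighted([0.75**2, 4.84**2]) / Zbar**2
V110 = weighted([39360.68, 22356.50]) / (Xbar * Ybar)
V101 = weighted([411.16, 1536.24]) / (Ybar * Zbar)
V011 = weighted([38.08, 287.92]) / (Xbar * Zbar)

num = V101 - V110
den = V002 + V020 - 2 * V011
Q_opt = -num / den                                   # (3.10)


def mse_t(Q):
    """First-order MSE of t as a function of Q = w1 + w2 / 2, equation (3.8)."""
    return Ybar**2 * (V200 + Q**2 * den + 2 * Q * num)


min_mse = Ybar**2 * (V200 - num**2 / den)            # (3.11)
print(round(Q_opt, 4), round(min_mse, 2), round(mse_t(Q_opt), 2))
```

Evaluating `mse_t` slightly away from `Q_opt` gives a larger value on either side, which is a quick way to confirm that the stationary point is a minimum whenever `den` is positive.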

Equations (3.3) and (3.9) give two equations in three unknowns, so unique values of the
$w_i$ $(i = 0, 1, 2)$ cannot be found from them alone. To obtain unique values of the
$w_i$, we impose the linear restriction

$$\sum_{i=0}^{2} w_i B(t_i) = 0 \qquad (3.12)$$

where $B(t_i)$ $(i = 0, 1, 2)$ denotes the bias of the $i$th estimator.
Equations (3.3), (3.9) and (3.12) can be written in matrix form as

$$\begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & \frac{1}{2} \\ B(t_0) & B(t_1) & B(t_2) \end{bmatrix} \begin{bmatrix} w_0 \\ w_1 \\ w_2 \end{bmatrix} = \begin{bmatrix} 1 \\ Q \\ 0 \end{bmatrix} \qquad (3.13)$$

Solving (3.13), we get the unique values of $w_0$, $w_1$ and $w_2$ as

$$w_0 = A + \frac{(1 - A)B}{2C - D}$$

$$w_1 = (1 - A) + \frac{(1 - A)B}{2C - D}$$

$$w_2 = \frac{-2(1 - A)B}{2C - D}$$

Here,

$$A = 1 + \frac{V_{101} - V_{110}}{V_{002} + V_{020} - 2V_{011}}$$

$$(1 - A) = Q = \frac{-(V_{101} - V_{110})}{V_{002} + V_{020} - 2V_{011}}$$

$$B = V_{020} - V_{110} - V_{011} + V_{101}$$

$$C = \frac{3}{8} V_{020} - \frac{1}{8} V_{002} - \frac{1}{4} V_{011} - \frac{1}{2} V_{110} + \frac{1}{2} V_{101}$$

$$D = V_{020} - V_{110} - V_{011} + V_{101}$$
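The closed-form weights can be checked against the three constraints they are meant to satisfy: (3.3), (3.9) and (3.12). The sketch below evaluates them for the Population 2 summaries of Section 4. It assumes that $B(t_1)$ and $B(t_2)$ follow from (3.6) by setting $(w_1, w_2) = (1, 0)$ and $(0, 1)$, which gives $\bar{Y}B$ and $\bar{Y}C$ respectively, and that $B(t_0) = 0$; variable names are illustrative.

```python
# Population 2 (Murthy, 1967) stratum summaries from Section 4.
W = [0.5, 0.5]                               # N_h / N
lam = [(1 - 2 / 5) / 2, (1 - 3 / 5) / 3]     # (1 - f_h) / n_h
Ybar = 0.5 * 1925.80 + 0.5 * 315.60
Xbar = 0.5 * 214.40 + 0.5 * 333.80
Zbar = 0.5 * 51.80 + 0.5 * 60.60


def weighted(s):
    """sum_h W_h^2 * lam_h * s_h for a list of stratum (co)variances."""
    return sum(w**2 * l * sh for w, l, sh in zip(W, lam, s))


V200 = weighted([615.92**2, 340.38**2]) / Ybar**2
V020 = weighted([74.87**2, 66.35**2]) / Xbar**2
V002 = weighted([0.75**2, 4.84**2]) / Zbar**2
V110 = weighted([39360.68, 22356.50]) / (Xbar * Ybar)
V101 = weighted([411.16, 1536.24]) / (Ybar * Zbar)
V011 = weighted([38.08, 287.92]) / (Xbar * Zbar)

Q = -(V101 - V110) / (V002 + V020 - 2 * V011)                 # (3.10)
A = 1 - Q
B = V020 - V110 - V011 + V101                                 # Bias(t1) / Ybar
C = 3 * V020 / 8 - V002 / 8 - V011 / 4 - V110 / 2 + V101 / 2  # Bias(t2) / Ybar
D = V020 - V110 - V011 + V101                                 # same expression as B

w0 = A + (1 - A) * B / (2 * C - D)
w1 = (1 - A) + (1 - A) * B / (2 * C - D)
w2 = -2 * (1 - A) * B / (2 * C - D)

print("weights:", w0, w1, w2)
print("sum of weights:", w0 + w1 + w2)             # constraint (3.3)
print("w1 + w2/2 - Q:", w1 + w2 / 2 - Q)           # constraint (3.9)
print("w1*B + w2*C:", w1 * B + w2 * C)             # constraint (3.12), divided by Ybar
```

The weights are not restricted to the unit interval, so individual values outside $[0, 1]$ are compatible with the constraints.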

4. Empirical study
To examine the merits of the proposed estimator over the existing estimators
under optimum conditions, we consider two natural population data sets from the
literature. The sources of the populations are given below.

Population 1 [Source: Koyuncu and Kadilar (2009)]

Y: Number of teachers,
X: Number of students,
Z: Number of classes in both primary and secondary schools

| Quantity | $h=1$ | $h=2$ | $h=3$ | $h=4$ | $h=5$ | $h=6$ |
|---|---|---|---|---|---|---|
| $N_h$ | 127 | 117 | 103 | 170 | 205 | 201 |
| $n_h$ | 31 | 21 | 29 | 38 | 22 | 39 |
| $\bar{Y}_h$ | 703.74 | 413 | 573.17 | 424.66 | 267.03 | 393.84 |
| $\bar{X}_h$ | 20804.59 | 9211.79 | 14309.30 | 9478.85 | 5569.95 | 12997.59 |
| $\bar{Z}_h$ | 498.28 | 318.33 | 431.36 | 311.32 | 227.20 | 313.71 |
| $S_{yh}$ | 888.835 | 644.922 | 1033.467 | 810.585 | 403.654 | 711.723 |
| $S_{xh}$ | 30486.751 | 15180.760 | 27549.697 | 18218.931 | 8497.776 | 23094.141 |
| $S_{zh}$ | 555.5816 | 365.4576 | 612.9509 | 458.0282 | 260.8511 | 397.0481 |
| $S_{yxh}$ | 25237153.52 | 9747942.85 | 28294397.04 | 14523885.53 | 3393591.75 | 15864573.97 |
| $S_{yzh}$ | 480688.2 | 230092.8 | 623019.3 | 364943.4 | 101539 | 277696.1 |
| $S_{xzh}$ | 15914648 | 5379190 | 16490674.56 | 8041254 | 2144057 | 8857729 |
| $\rho_{yxh}$ | 0.936 | 0.996 | 0.994 | 0.983 | 0.989 | 0.965 |
| $\rho_{yzh}$ | 0.978 | 0.976 | 0.983 | 0.982 | 0.964 | 0.982 |

Population 2 [Source: Murthy (1967)]

Y: Output
X: Fixed Capital
Z: Number of workers

$N = 10$, $n = 5$

| Quantity | $h=1$ | $h=2$ |
|---|---|---|
| $N_h$ | 5 | 5 |
| $n_h$ | 2 | 3 |
| $\bar{Y}_h$ | 1925.80 | 315.60 |
| $\bar{X}_h$ | 214.40 | 333.80 |
| $\bar{Z}_h$ | 51.80 | 60.60 |
| $S_{yh}$ | 615.92 | 340.38 |
| $S_{xh}$ | 74.87 | 66.35 |
| $S_{zh}$ | 0.75 | 4.84 |
| $S_{yxh}$ | 39360.68 | 22356.50 |
| $S_{yzh}$ | 411.16 | 1536.24 |
| $S_{xzh}$ | 38.08 | 287.92 |

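| Estimator | Variance/MSE | PRE with respect to $\bar{y}_{st}$ |
|---|---|---|
| $\bar{y}_{st}$ | 2228.52 | 100 |
| $\bar{y}_1$ | 1613.59 | 138.10 |
| $\bar{y}_2$ | 1489.095 | 149.65 |
| $\bar{y}_{lr}$ | 2072.674 | 107.51 |
| $t$ | 0.0296 | 309.7493 |

Table 1: MSEs and Percent Relative Efficiencies (PREs) of the estimators with respect to $\bar{y}_{st}$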
| Estimator | Variance/MSE | PRE with respect to $\bar{y}_{st}$ |
|---|---|---|
| $\bar{y}_{st}$ | 32313.7599 | 100 |
| $\bar{y}_1$ | 10647.1302 | 303.4974 |
| $\bar{y}_2$ | 13130.03 | 246.1058 |
| $\bar{y}_{lr}$ | 17611.89 | 183.477 |
| $t$ | 8948.4080 | 361.1118 |

Table 2: MSEs and Percent Relative Efficiencies (PREs) of the estimators with respect to $\bar{y}_{st}$

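The Percent Relative Efficiencies (PREs) of the estimators with respect to the usual
unbiased estimator $\bar{y}_{st}$ are obtained from the following formula, which is the
form consistent with the PRE columns of the tables:

$$\mathrm{PRE}(\text{ESTIMATOR}) = \frac{\mathrm{MSE}(\bar{y}_{st})}{\mathrm{MSE}(\text{ESTIMATOR})} \times 100$$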
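The PRE column can be recomputed from the MSE column. The sketch below uses the MSE values of Table 2 and takes the PRE as $100 \times \mathrm{MSE}(\bar{y}_{st}) / \mathrm{MSE}(\text{estimator})$, the direction that reproduces the tabulated PRE values; the dictionary keys and the function name `pre` are illustrative assumptions.

```python
# MSE values copied from Table 2; keys are illustrative labels for the estimators.
mse = {
    "y_st": 32313.7599,
    "y_1": 10647.1302,
    "y_2": 13130.03,
    "y_lr": 17611.89,
    "t": 8948.4080,
}


def pre(mse_estimator, mse_reference):
    """Percent relative efficiency of an estimator with respect to a reference."""
    return mse_reference / mse_estimator * 100


for name, value in mse.items():
    print(f"{name}: PRE = {pre(value, mse['y_st']):.4f}")
```

A PRE above 100 means the estimator has a smaller MSE than the reference estimator.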

5. Conclusion
In this paper, we have proposed an estimator of the population mean in
stratified random sampling that uses information on two auxiliary variables. The MSE
of the proposed estimator has been derived up to the first order of approximation.
The efficiency of the proposed estimator has been compared empirically with that of
other estimators using known natural population data sets; see Murthy (1967) and
Koyuncu and Kadilar (2009). The results are shown in Table 1 and Table 2. In both
tables, the proposed estimator is more efficient than the existing estimators, having
a smaller MSE and a higher PRE. On this evidence, the proposed estimator is preferable
to the existing estimators in practical surveys.

References
Bahl, S. and Tuteja, R. K. (1991). Ratio and product type exponential
estimators, Journal of information and optimization sciences, 12(1), p. 159-164.
Cochran, W.G. (1977). Sampling Techniques, 3rd edition, John Wiley, New York.
Dayal, S. (1980). On allocation of sample using estimates of both proportions of
stratum sizes and standard deviations. Ann. Inst. Statist. Math., 32, p. 433-444.
Haq, A., and Shabbir, J. (2013). Improved family of ratio estimators in simple and
stratified random sampling, Communications in Statistics - Theory and Methods, 42(5),
p. 782-799.
Kadilar, C., and Cingi, H. (2003). Ratio estimators in stratified random
sampling, Biometrical journal, 45(2), p. 218-225.
Koyuncu, N. and Kadilar, C. (2009). Family of estimators of population mean using
two auxiliary variables in stratified random sampling, Communications in Statistics -
Theory and Methods, 38, p. 2398-2417.
Murthy, M. N. (1967). Sampling Theory and Methods, Statistical Publishing Society,
Calcutta, India, p. 228.
Perri, G. D. P. F. (2007). Estimation of finite population mean using multi-auxiliary
information, Metron, 65(1), p. 99-112.
Reddy, V. N. (1978). A comparison between stratified and unstratified random
sampling, Sankhya, C, 40, p. 99-103.
Sahai, A. and Ray, S. K. (1980). An efficient estimator using auxiliary information.
Metrika, 27(4), p. 271–275.
Singh, S. (2003). Advanced Sampling Theory with Applications: How Michael 'Selected'
Amy, Vol. I, Springer Science & Business Media.
Shabbir, J., and Gupta, S. (2006). A new estimator of population mean in stratified
sampling, Communications in Statistics-Theory and Methods, 35(7), p. 1201-1209.
Singh, H. P. and Vishwakarma, G. K. (2007). Modified exponential ratio and product
estimators for finite population mean in double sampling, Austrian Journal of
Statistics, 36(3), p. 217-225.
Singh, R., and Kumar, M. (2012). Improved estimators of population mean using two
auxiliary variables in stratified random sampling, Pakistan Journal of Statistics and
Operation Research, 8(1), p. 65-72.
Srivastava, S. K. (1971). A generalized estimator for the mean of a finite population
using multi-auxiliary information, Journal of the American Statistical Association,
66(334), p. 404-407.
Srivastava, S. K. and Jhajj, H. S. (1981). A class of estimators of the population mean
in survey sampling using auxiliary information, Biometrika, 68(1), p. 341-343.
Srivastava, S. K. and Jhajj, H. S. (1983). A class of estimators of the population means
using multi-auxiliary information, Calc. Statist. Assoc. Bull., 32, p. 47–56.
Swain, A. K. (1970). A note on the use of multiple auxiliary variables in sample
surveys, Trabajos de estadistica y de investigacion operativa, 21(3), p. 135-141.
Tailor, R., Chouhan, S., Tailor, R., and Garg, N. (2012). A ratio-cum-product estimator
of population mean in stratified random sampling using two auxiliary variables,
Statistica, 72(3), p. 287-297.
Verma, H. K., Sharma, P., and Singh, R. (2015). Some families of estimators using two
auxiliary variables in stratified random sampling, Revista Investigacion Operacional,
36(2), p. 140-150.
