IMPROVED RATIO ESTIMATORS VIA MODIFIED MAXIMUM LIKELIHOOD

Oral, E. and Kadilar, C.
Pak. J. Statist. 2011 Vol. 27 (3), 269-282

Evrim Oral (1) and Cem Kadilar (2)
(1) Louisiana State University Health Sciences Center, Biostatistics Program, New Orleans, LA, 70112, USA. Email: eoral@lsuhsc.edu
(2) Department of Statistics, Hacettepe University, Ankara, Turkey. Email: kadilar@hacettepe.edu.tr

ABSTRACT

We consider ratio estimators in simple random sampling under data anomalies. We focus on situations where the error term is not normally distributed, and exploit Tiku's modified maximum likelihood estimators in the ratio method of estimation. We derive the mean square errors of the proposed ratio estimators theoretically and obtain the conditions under which the proposed ratio estimators have smaller mean square errors than the traditional ratio-type estimators. We support our theoretical results with two different real-life examples.

KEYWORDS

Ratio estimators; Kadilar-Cingi estimators; Modified maximum likelihood; Robust linear regression; Huber-M estimator; Outliers.

1. INTRODUCTION

Kadilar and Cingi (2004) suggested a general ratio estimator

\[
\bar{y}_{KCi} = \frac{\bar{y} + \tilde{b}_L(\bar{X} - \bar{x})}{\theta(x)\,\bar{x} + \lambda(x)}\,\bigl[\theta(x)\,\bar{X} + \lambda(x)\bigr], \qquad i = 1, 2, \ldots, 5, \tag{1.1}
\]

where

$\theta(x) = 1$, $\lambda(x) = 0$ for $i = 1$;
$\theta(x) = 1$, $\lambda(x) = C_x$ for $i = 2$;
$\theta(x) = 1$, $\lambda(x) = \beta_2(x)$ for $i = 3$;
$\theta(x) = \beta_2(x)$, $\lambda(x) = C_x$ for $i = 4$;
$\theta(x) = C_x$, $\lambda(x) = \beta_2(x)$ for $i = 5$;

$\tilde{b}_L = s_{xy}/s_x^2$ is the regression coefficient obtained by least squares estimation; $s_{xy}$ is the sample covariance between the auxiliary and the study variables; $s_x^2$ is the sample variance of $X$; $C_x$ and $\beta_2(x)$ are the population coefficient of variation and the kurtosis of the auxiliary variable, respectively.
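The five cases of (1.1) differ only in the pair of auxiliary-information terms that multiply and shift the mean of the auxiliary variable. A minimal sketch of that family is given here; the function name and the argument names (`theta`, `lam`, `beta2_x`) are illustrative choices and do not come from the paper.

```python
def kc_ratio_estimates(y_bar, x_bar, X_bar, b, C_x, beta2_x):
    """Return the five ratio estimates of (1.1) as a list (i = 1..5).

    y_bar, x_bar : sample means of the study and auxiliary variables
    X_bar        : population mean of the auxiliary variable
    b            : regression slope plugged into the estimator
    C_x, beta2_x : population coefficient of variation and kurtosis of x
    """
    # (theta, lam) pairs for i = 1..5, following the case list under (1.1).
    pairs = [
        (1.0, 0.0),
        (1.0, C_x),
        (1.0, beta2_x),
        (beta2_x, C_x),
        (C_x, beta2_x),
    ]
    # Regression-adjusted sample mean shared by all five estimators.
    adjusted = y_bar + b * (X_bar - x_bar)
    return [adjusted * (th * X_bar + lam) / (th * x_bar + lam)
            for th, lam in pairs]
```

Because the slope enters only through the argument `b`, the same function can be reused with any other slope estimate in place of the least squares one.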
Kadilar and Cingi (2004) obtained the MSE of their estimators as

\[
\mathrm{MSE}(\bar{y}_{KCi}) \cong \frac{1-f}{n}\left(R_{KCi}^2 S_x^2 + 2B_L R_{KCi} S_x^2 + B_L^2 S_x^2 - 2R_{KCi} S_{xy} - 2B_L S_{xy} + S_y^2\right), \tag{1.2}
\]

where $i = 1, \ldots, 5$; $f = n/N$ is the sampling fraction; $B_L = S_{xy}/S_x^2$; and the population ratio $R_{KCi}$ is

\[
R_{KCi} = \frac{\theta(x)\,\bar{Y}}{\theta(x)\,\bar{X} + \lambda(x)},
\]

with different values of $\theta(x)$ and $\lambda(x)$ for $i = 1, \ldots, 5$ as given above.

Non-normal distributions are very common in sample surveys; see Cochran (1977), Jenkins et al. (1973), Chambers (1986), and Farrell and Barrera (2007). In this study, we propose to use Tiku's modified maximum likelihood (MML) methodology for ratio-type estimation, which is applicable to both symmetric and skewed distributions.

© 2011 Pakistan Journal of Statistics

2. NON-NORMAL REGRESSION

In a linear regression model $y_i = a + b x_i + e_i$, $1 \le i \le n$, let the distribution of the error term follow the long-tailed symmetric family

\[
f(e_i) = \frac{1}{\sigma\sqrt{k}\,B(1/2,\,p - 1/2)}\left(1 + \frac{e_i^2}{k\sigma^2}\right)^{-p}, \qquad -\infty < e_i < \infty, \tag{2.1}
\]

where $k = 2p - 3$ and $p \ge 2$ ($p$ is known), with $E(e_i) = 0$ and $V(e_i) = \sigma^2$. After ordering the errors as $e_{(1)} \le e_{(2)} \le \ldots \le e_{(n)}$, and letting the pair $(x_{[i]}, y_{[i]})$ be the concomitants of $e_{(i)}$ for $1 \le i \le n$, Tiku et al. (2001) derived the MML estimators (MMLEs) of $a$, $b$ and $\sigma$.

In the same linear model, now suppose that the distribution of $e_i$ is

\[
f(e_i) = \frac{r}{\sigma}\,\frac{\exp(-e_i/\sigma)}{\{1 + \exp(-e_i/\sigma)\}^{r+1}}, \qquad -\infty < e_i < \infty, \tag{2.2}
\]

where $r$ is the shape parameter, with $E(e_i) = \sigma\{\psi(r) - \psi(1)\}$ and $V(e_i) = \sigma^2\{\psi'(r) + \psi'(1)\}$. Here, $\psi(x) = \Gamma'(x)/\Gamma(x)$ is the psi-function and $\psi'(x)$ is its derivative. For $r < 1$, $r = 1$ and $r > 1$, (2.2) represents negatively skewed, symmetric and positively skewed distributions, respectively. The MMLEs can be found as

\[
\hat{a}_T = \bar{y}_{[.]} - \hat{b}_T\,\bar{x}_{[.]} - (\Delta/m)\,\hat{\sigma}_T, \tag{2.3}
\]
\[
\hat{b}_T = K - M\hat{\sigma}_T, \tag{2.4}
\]
\[
\hat{\sigma}_T = \frac{-D + \sqrt{D^2 + 4nE}}{2\sqrt{n(n-2)}} \tag{2.5}
\]

(Islam et al. 2001), where $\bar{y}_{[.]}$, $\bar{x}_{[.]}$, $m$, and $K$ are found as in Tiku et al. (2001), replacing $\alpha_i$, $\beta_i$, and $t_{(i)}$ ($1 \le i \le n$) by

\[
\alpha_i = \frac{1 + e^{t_{(i)}} + t_{(i)}e^{t_{(i)}}}{\left(1 + e^{t_{(i)}}\right)^2}, \qquad
\beta_i = \frac{e^{t_{(i)}}}{\left(1 + e^{t_{(i)}}\right)^2}, \qquad
t_{(i)} = -\ln\left(q_i^{-1/r} - 1\right). \tag{2.6}
\]

In (2.3)-(2.5), the values of $\Delta$, $M$, $D$, and $E$ are calculated from the equations

\[
\Delta_i = \alpha_i - (r+1)^{-1}, \qquad \Delta = \sum_{i=1}^{n}\Delta_i, \qquad
M = \frac{\sum_{i=1}^{n}\Delta_i\,(x_{[i]} - \bar{x}_{[.]})}{\sum_{i=1}^{n}\beta_i\,(x_{[i]} - \bar{x}_{[.]})^2},
\]
\[
D = (r+1)\sum_{i=1}^{n}\Delta_i\left\{y_{[i]} - \bar{y}_{[.]} - K(x_{[i]} - \bar{x}_{[.]})\right\}, \qquad
E = (r+1)\sum_{i=1}^{n}\beta_i\left\{y_{[i]} - \bar{y}_{[.]} - K(x_{[i]} - \bar{x}_{[.]})\right\}^2.
\]

Tiku et al. (2001) along with Islam et al. (2001) showed that the MMLEs are more efficient and robust than their corresponding least squares estimators (LSEs) under non-normality. See Tiku and Akkaya (2004) for a large variety of examples of non-normal errors in linear regression.

3. SUGGESTED RATIO ESTIMATORS

Kadilar et al. (2007) incorporated the robust Huber's M estimator $\tilde{b}_H$ into the Kadilar-Cingi estimators (KCEs) given in (1.1). We suggest adapting MMLEs instead of Huber's M estimator in the context of ratio estimation for two main reasons. First, the MMLEs have explicit forms and are easier to compute. Second, in linear regression, Huber's M estimators are not defined for skewed errors; they are known to be robust only when the errors are heavy tailed. In fact, Islam and Tiku (2004) compared the efficiencies of the MMLEs and M-estimators for estimating parameters in multiple linear regression models when the errors are from a long-tailed symmetric family. They showed that the MMLEs are generally more efficient than the M-estimators.
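The explicit form of the MMLEs in (2.3)-(2.6) makes them straightforward to compute for the generalized logistic model (2.2). The sketch below is one possible reading of those formulas, under several assumptions that are not spelled out in this paper: the plotting position is taken as $q_i = i/(n+1)$, the observations are ordered by the deviants $y_i - b\,x_i$ with $b$ set to an initial least squares slope, and $\bar{x}_{[.]}$, $\bar{y}_{[.]}$, $m$ and $K$ are taken as the $\beta_i$-weighted quantities of the cited MML papers. The function name is hypothetical.

```python
import math

def mmle_generalized_logistic(x, y, r):
    """Sketch of (2.3)-(2.6): returns (a_T, b_T, sigma_T) for shape r."""
    n = len(x)
    x_mean = sum(x) / n
    # Initial least squares slope, used only to order the observations.
    b_ls = (sum((xi - x_mean) * yi for xi, yi in zip(x, y))
            / sum((xi - x_mean) ** 2 for xi in x))
    # Concomitant pairs (x_[i], y_[i]) ordered by w_i = y_i - b_ls * x_i.
    pairs = sorted(zip(x, y), key=lambda p: p[1] - b_ls * p[0])

    alpha, beta = [], []
    for i in range(1, n + 1):
        q = i / (n + 1)                        # assumed plotting position q_i
        t = -math.log(q ** (-1.0 / r) - 1.0)   # t_(i) in (2.6)
        et = math.exp(t)
        alpha.append((1.0 + et + t * et) / (1.0 + et) ** 2)
        beta.append(et / (1.0 + et) ** 2)
    delta = [a - 1.0 / (r + 1.0) for a in alpha]

    m = sum(beta)
    xw = sum(bi * p[0] for bi, p in zip(beta, pairs)) / m   # assumed x_[.]
    yw = sum(bi * p[1] for bi, p in zip(beta, pairs)) / m   # assumed y_[.]
    sxx = sum(bi * (p[0] - xw) ** 2 for bi, p in zip(beta, pairs))
    K = sum(bi * (p[0] - xw) * p[1] for bi, p in zip(beta, pairs)) / sxx
    M = sum(di * (p[0] - xw) for di, p in zip(delta, pairs)) / sxx

    resid = [p[1] - yw - K * (p[0] - xw) for p in pairs]
    D = (r + 1.0) * sum(di * e for di, e in zip(delta, resid))
    E = (r + 1.0) * sum(bi * e * e for bi, e in zip(beta, resid))

    sigma = (-D + math.sqrt(D * D + 4.0 * n * E)) / (2.0 * math.sqrt(n * (n - 2.0)))
    b_t = K - M * sigma
    a_t = yw - b_t * xw - (sum(delta) / m) * sigma
    return a_t, b_t, sigma
```

No iteration is involved: every quantity is a closed-form sum over the ordered sample, which is the computational advantage over Huber's M estimator that the text refers to.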
Because of these advantages of the MMLEs, we propose the following estimators for the population mean of the study variable:

\[
\bar{y}_{pri} = \frac{\bar{y} + \hat{b}_T(\bar{X} - \bar{x})}{\theta(x)\,\bar{x} + \lambda(x)}\,\bigl[\theta(x)\,\bar{X} + \lambda(x)\bigr], \qquad i = 1, 2, \ldots, 5, \tag{3.1}
\]

where $\theta(x)$ and $\lambda(x)$ take the values given in Section 1 for $i = 1, 2, \ldots, 5$, and $\hat{b}_T$ is the MMLE computed from the sample as described in Section 2. When the errors are normally distributed, the proposed estimators coincide with the KCEs.

Using the first-degree approximation of the Taylor series expansion, the MSE of the estimators in (3.1) can be obtained along the same lines as in Kadilar and Cingi (2004); see also Wolter (1985). As an example, we give the details of the derivation of $\mathrm{MSE}(\bar{y}_{pr1})$ (the case $i = 1$) below; the MSEs of the other proposed estimators ($i = 2, 3, 4, 5$) can be obtained similarly. We start from

\[
\mathrm{MSE}(\bar{y}_{pr1}) \cong \bar{X}^2\,E\left(\hat{R}_{pr1} - R_{KC1}\right)^2, \tag{3.2}
\]

where $\hat{R}_{pr1} = \{\bar{y} + \hat{b}_T(\bar{X} - \bar{x})\}/\bar{x}$. The expression $\hat{R}_{pr1} - R_{KC1}$ can be approximated by using the first two terms of a Taylor series expansion as

\[
\hat{R}_{pr1} - R_{KC1} \cong -\frac{\bar{Y} + \hat{b}_T\bar{X}}{\bar{X}^2}\,(\bar{x} - \bar{X}) + \frac{1}{\bar{X}}\,(\bar{y} - \bar{Y}).
\]

If we define a new random variable $u_i$ as

\[
u_i = -\frac{\bar{Y} + \hat{b}_T\bar{X}}{\bar{X}^2}\,(x_i - \bar{X}) + \frac{1}{\bar{X}}\,(y_i - \bar{Y}),
\]

where $1 \le i \le n$ for the sample and $1 \le i \le N$ for the population, then the sample and population means of the random variable $u$ are

\[
\bar{u} = \sum_{i=1}^{n} u_i / n \cong \hat{R}_{pr1} - R_{KC1} \qquad \text{and} \qquad \bar{U} = \sum_{i=1}^{N} u_i / N = 0,
\]

respectively. Hence, the expected value in (3.2) can be found from

\[
E\left(\hat{R}_{pr1} - R_{KC1}\right)^2 \cong E(\bar{u}^2) = \frac{1-f}{n}\,S_u^2, \tag{3.3}
\]

where $S_u^2 = \sum_{i=1}^{N}(u_i - \bar{U})^2/(N-1)$. After incorporating $S_u^2$ into (3.3) we have

\[
E\left(\hat{R}_{pr1} - R_{KC1}\right)^2 \cong \frac{1-f}{n}\,\frac{1}{\bar{X}^2}\left[(R_{KC1} + B_T)^2 S_X^2 - 2(R_{KC1} + B_T)S_{XY} + S_Y^2\right];
\]

therefore,

\[
\mathrm{MSE}(\bar{y}_{pr1}) \cong \frac{1-f}{n}\left[(R_{KC1} + B_T)^2 S_X^2 - 2(R_{KC1} + B_T)S_{XY} + S_Y^2\right]. \tag{3.4}
\]

The MSE equations of the other proposed estimators can be found along the same lines as explained above.
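The step from (3.3) to (3.4) can be written out explicitly. The following is a sketch under two assumptions: $\hat{b}_T$ inside $u_i$ is replaced by its population counterpart $B_T$ (that is, the difference $E(\hat{b}_T) - B_T$ is ignored), and $R_{KC1} = \bar{Y}/\bar{X}$, so that $(\bar{Y} + B_T\bar{X})/\bar{X}^2 = (R_{KC1} + B_T)/\bar{X}$.

```latex
\begin{aligned}
u_i - \bar{U} &= -\frac{R_{KC1} + B_T}{\bar{X}}\,(x_i - \bar{X}) + \frac{1}{\bar{X}}\,(y_i - \bar{Y}), \\[4pt]
S_u^2 &= \frac{1}{N-1}\sum_{i=1}^{N}(u_i - \bar{U})^2 \\
      &= \frac{1}{\bar{X}^2}\Bigl[(R_{KC1} + B_T)^2 S_X^2 - 2(R_{KC1} + B_T)\,S_{XY} + S_Y^2\Bigr].
\end{aligned}
```

Multiplying by $\bar{X}^2(1-f)/n$, as (3.2) and (3.3) require, cancels the factor $1/\bar{X}^2$ and leaves the bracketed expression of (3.4).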
Equation (3.4) can be generalized for all proposed estimators as

\[
\mathrm{MSE}(\bar{y}_{pri}) \cong \frac{1-f}{n}\left(R_{KCi}^2 S_x^2 + 2B_T R_{KCi} S_x^2 + B_T^2 S_x^2 - 2R_{KCi} S_{xy} - 2B_T S_{xy} + S_y^2\right), \tag{3.5}
\]

where $B_T$ is the MMLE computed from the population and, for $i = 1, \ldots, 5$, the $R_{KCi}$ values are the same as stated in Section 1. Note that the MSE equation of the proposed estimators given in (3.5) and the MSE equation of the KCEs given in (1.2) have the same structure, with $B_T$ in place of $B_L$. Also note that in deriving the MSE equations (3.5), we omit the difference $E(\hat{b}_T) - B_T$ (for details, see Kadilar and Cingi, 2004).

4. MEAN SQUARE ERROR COMPARISONS

In order to compare the MSEs of the proposed estimators in (3.5) with the MSE of the traditional ratio estimator $\bar{y}_r = (\bar{y}/\bar{x})\,\bar{X}$, we write the inequality $\mathrm{MSE}(\bar{y}_{pri}) < \mathrm{MSE}(\bar{y}_r)$, that is,

\[
S_x^2 B_T^2 - 2\left(S_{xy} - R_{KCi} S_x^2\right)B_T + R_{KCi}^2 S_x^2 - 2R_{KCi} S_{xy} - R^2 S_x^2 + 2R S_{xy} < 0, \tag{4.1}
\]

for $i = 1, \ldots, 5$, where $R = \bar{Y}/\bar{X}$. When the inequality (4.1) is solved for $B_T$, we have the following conditions:

\[
\begin{cases}
R - R_{KCi} < B_T < 2B_L - (R + R_{KCi}) & \text{for } B_L > R, \\
2B_L - (R + R_{KCi}) < B_T < R - R_{KCi} & \text{for } B_L < R.
\end{cases} \tag{4.2}
\]

Specifically, for $\bar{y}_{pr1}$, the condition (4.2) becomes

\[
\begin{cases}
0 < B_T < 2(B_L - R) & \text{for } B_L > R, \\
2(B_L - R) < B_T < 0 & \text{for } B_L < R.
\end{cases}
\]

Thus, when the condition (4.2) is satisfied, the proposed estimators $\bar{y}_{pri}$ ($i = 1, \ldots, 5$) have smaller MSEs than the traditional ratio estimator $\bar{y}_r$.

Similarly, in order to compare the MSEs of the proposed estimators in (3.5) with the MSEs of the KCEs in (1.2), we write the inequality $\mathrm{MSE}(\bar{y}_{pri}) < \mathrm{MSE}(\bar{y}_{KCi})$, that is,

\[
2R_{KCi} B_T S_x^2 + B_T^2 S_x^2 - 2B_T S_{xy} < 2R_{KCi} B_L S_x^2 + B_L^2 S_x^2 - 2B_L S_{xy}, \tag{4.3}
\]

for $i = 1, \ldots, 5$. Using $B_L = S_{xy}/S_x^2$, the inequality (4.3) becomes

\[
2R_{KCi} B_T S_x^2 + B_T^2 S_x^2 - 2B_T S_{xy} < 2R_{KCi} S_{xy} - \frac{S_{xy}^2}{S_x^2},
\]
\[
S_x^2 B_T^2 - 2\left(S_{xy} - R_{KCi} S_x^2\right)B_T - 2R_{KCi} S_{xy} + \frac{S_{xy}^2}{S_x^2} < 0, \qquad i = 1, \ldots, 5. \tag{4.4}
\]

When the inequality (4.4) is solved for $B_T$, we obtain the following conditions:

\[
\begin{cases}
-2R_{KCi} < B_T - B_L < 0 & \text{for } R_{KCi} > 0, \\
0 < B_T - B_L < -2R_{KCi} & \text{for } R_{KCi} < 0.
\end{cases} \tag{4.5}
\]

Hence, when the condition (4.5) is satisfied, the proposed estimators (3.1) are better than the corresponding estimators suggested by Kadilar and Cingi (2004), given in (1.1).

Finally, to compare the proposed estimators with the ones proposed in Kadilar et al. (2007), which will be denoted by $\bar{y}_{KCCi}$ ($i = 1, \ldots, 5$), we need to solve the inequality

\[
\mathrm{MSE}(\bar{y}_{pri}) < \mathrm{MSE}(\bar{y}_{KCCi}). \tag{4.6}
\]

Kadilar et al. (2007) give the MSEs of their proposed estimators as

\[
\mathrm{MSE}(\bar{y}_{KCCi}) \cong \frac{1-f}{n}\left(R_{KCi}^2 S_x^2 + 2B_H R_{KCi} S_x^2 + B_H^2 S_x^2 - 2R_{KCi} S_{xy} - 2B_H S_{xy} + S_y^2\right), \tag{4.7}
\]

where $B_H$ is the Huber-M estimator obtained from the population values, and the $R_{KCi}$ ($i = 1, \ldots, 5$) values are the same as in Section 1. Note that the only difference between (3.5) and (4.7) is the robust estimator integrated into the KCEs. When the inequality (4.6) is solved for $B_T$, we obtain the following conditions:

\[
\begin{cases}
B_T < 2(B_L - R_{KCi}) - B_H & \text{for } B_T - B_H > 0, \\
B_T > 2(B_L - R_{KCi}) - B_H & \text{for } B_T - B_H < 0.
\end{cases} \tag{4.8}
\]

When (4.8) is satisfied, the proposed estimators have smaller MSEs than the estimators suggested by Kadilar et al. (2007).

In practice, the bounds in (4.2), (4.5), and (4.8) can be evaluated from the sample values to decide whether the conditions are satisfied. As an example, to check whether the condition (4.2) is satisfied for $\tilde{b}_L > \hat{R}$, the inequality

\[
\hat{R} - \hat{R}_{KCi} < \hat{b}_T < 2\tilde{b}_L - (\hat{R} + \hat{R}_{KCi})
\]

should be calculated for $i = 1, \ldots, 5$, where

\[
\hat{R} = \hat{R}_{KC1} = \frac{\bar{y}}{\bar{X}}, \quad
\hat{R}_{KC2} = \frac{\bar{y}}{\bar{X} + C_x}, \quad
\hat{R}_{KC3} = \frac{\bar{y}}{\bar{X} + \beta_2(x)}, \quad
\hat{R}_{KC4} = \frac{\bar{y}\,\beta_2(x)}{\bar{X}\beta_2(x) + C_x}, \quad
\hat{R}_{KC5} = \frac{\bar{y}\,C_x}{\bar{X}C_x + \beta_2(x)}.
\]

Note that $\tilde{b}_L$ and $\hat{b}_T$ are found from the sample values $(x_i, y_i)$, $1 \le i \le n$.

5.
EMPIRICAL STUDY

In this section, we study the preceding theoretical results empirically in two different populations. First, we examine the data of Kadilar and Cingi (2003, 2004), which were also revisited by Kadilar et al. (2007). Next, we work on a data set obtained from the Ministry of National Education, Republic of Turkey. All computations in this section were performed with the Intel Visual Fortran Compiler, except for the Huber's M estimates, which were obtained from S-PLUS by using the rreg function.

5.1 Population I

The first population consists of the apple production amount (in tons) and the number of apple trees (1 unit = 100 trees) in 106 villages of the Aegean Region of Turkey in 1999 as the study and the auxiliary variables, respectively (Source: Institute of Statistics, Republic of Turkey). In Table 1, we provide the population values obtained from the data.

Table 1: Statistics obtained from Data I

$N = 106$              $\bar{Y} = 2212.594$       $\bar{X} = 274.217$
$S_y = 11551.528$      $S_x = 574.606$            $S_{xy} = 5681761.761$
$\rho_{xy} = 0.856$    $C_x = 2.095$              $\beta_2(x) = 34.572$
$B_L = 17.208$
$R_{KC1} = 8.0688$     $R_{KC2} = 8.0076$         $R_{KC3} = 7.1654$
$R_{KC4} = 8.0670$     $R_{KC5} = 7.6109$

To calculate the MMLE $B_T$, we first need to identify the distribution of the error term. This can be accomplished by constructing a normal probability plot, followed by selection of a reasonable value for the shape parameter. To construct a normal probability plot of the deviants $w_i = y_i - b\,x_i$, $1 \le i \le 106$ (so that $e_i = w_i - a$), we replace $b$ by its population LSE

\[
B_L = \frac{\sum_{i=1}^{106}(x_i - \bar{x})\,y_i}{\sum_{i=1}^{106}(x_i - \bar{x})^2} = 17.208,
\]

which is a reasonably efficient estimate that is also easy to compute. The plot of the ordered deviants $w_{(i)} = y_{[i]} - 17.208\,x_{[i]}$ ($1 \le i \le 106$) against the normal quantiles $\Phi^{-1}(i/107)$, where $\Phi(z)$ is the cdf of the standard normal distribution, is given in Figure 1.
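The coordinates of such a normal probability plot are easy to generate. A minimal sketch, assuming only the plotting position $i/(n+1)$ used in the text ($i/107$ for $n = 106$); the function name is hypothetical, and the data values in the usage note are made up for illustration because the village-level data are not reproduced here.

```python
from statistics import NormalDist

def normal_plot_points(x, y, slope):
    """Return (normal quantile, ordered deviant) pairs for w_i = y_i - slope * x_i."""
    n = len(x)
    # Ordered deviants w_(1) <= ... <= w_(n).
    deviants = sorted(yi - slope * xi for xi, yi in zip(x, y))
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Normal quantiles at the plotting positions i / (n + 1).
    quantiles = [std_normal.inv_cdf(i / (n + 1)) for i in range(1, n + 1)]
    return list(zip(quantiles, deviants))
```

For example, `normal_plot_points([1.0, 2.0, 3.0], [2.0, 1.0, 6.0], 1.0)` pairs the ordered deviants -1, 1, 3 with the quantiles at 0.25, 0.5 and 0.75. A roughly straight scatter points to normal errors, while curvature in one tail or both points to skewness or long tails.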
It can be seen that the largest deviant is clearly an outlier that may be studied separately; however, for the sake of comparisons we will keep it in the data initially. Figure 1 indicates that the distribution of the errors is negatively skewed with a long tail on the left-hand side; thus, it can be modeled by the generalized logistic density function with parameters $r$ and $\sigma$ given in (2.2). To determine a plausible value of the shape parameter $r$, we follow the procedure described in Tiku and Akkaya (2004), Chapter 11. For a given value of $r$ ($r < 1$), we first calculate the MML estimates of $a$, $b$ and $\sigma$ from (2.3)-(2.5) using the population values.

Fig. 1: Normal probability plot of Data I (ordered deviants against normal quantiles)

Fig. 2: Normal probability plot of Data I after eliminating the outlier (ordered deviants against normal quantiles)

Then, we calculate

\[
(1/n)\ln L = \ln r - \ln\hat{\sigma}_T - (1/n)\sum_{i=1}^{106}\hat{z}_i - \frac{r+1}{n}\sum_{i=1}^{106}\ln\{1 + \exp(-\hat{z}_i)\} \tag{5.1}
\]

for a series of values of $r$, where the $\hat{z}_i$ ($1 \le i \le 106$) values are computed by replacing the parameters $a$, $b$ and $\sigma$ in $z_i = (y_i - a - b\,x_i)/\sigma$ with their corresponding MML estimates obtained from the population. We choose the $r$ value which maximizes $\ln L$. Since the MMLEs are robust to plausible deviations, it suffices to locate a distribution in reasonable proximity to the true distribution. For the data above, with the outlier included, we have the following $-(1/n)\ln L$ values calculated from the population:

$r$               0.30   0.35   0.40   0.45   0.50   0.55   0.60   0.65   0.70
$-(1/n)\ln L$     9.791  9.767  9.732  9.686  9.642  9.578  9.551  9.550  9.546

We chose $r = 0.6$ as a plausible value of the shape parameter.
The likelihood function $\ln L$ becomes almost flat (stable) for values of $r$ around 0.6; thus, any value close to $r = 0.6$ provides a plausible model due to the robustness of the MMLEs. As a result, we find the MML estimate of the regression coefficient as $B_T = 8.198$ from the population.

We calculate the theoretical MSE values of the traditional ratio estimator, the Kadilar-Cingi estimators and the proposed estimators as explained in Sections 1 and 3 for $n = 30$. Although Huber's M estimator is not defined for skewed distributions, this data set is the one used by Kadilar et al. (2007) for MSE comparisons, so we also calculate the theoretical MSEs of the estimators $\bar{y}_{KCCi}$ ($i = 1, \ldots, 5$) from (4.7), just for the sake of comparison. The values of $\mathrm{MSE}(\bar{y}_{KCi})$, $\mathrm{MSE}(\bar{y}_{KCCi})$, $\mathrm{MSE}(\bar{y}_{pri})$, and the values of the relative efficiencies

\[
RE_i = 100\,\frac{\mathrm{MSE}(\bar{y}_r)}{\mathrm{MSE}(\bar{y}_{pri})}, \qquad i = 1, \ldots, 5, \tag{5.2}
\]
\[
RE_i^{*} = 100\,\frac{\mathrm{MSE}(\bar{y}_{KCi})}{\mathrm{MSE}(\bar{y}_{pri})}, \qquad i = 1, \ldots, 5, \tag{5.3}
\]
\[
RE_i^{**} = 100\,\frac{\mathrm{MSE}(\bar{y}_{KCCi})}{\mathrm{MSE}(\bar{y}_{pri})}, \qquad i = 1, \ldots, 5, \tag{5.4}
\]

of the proposed estimators are given in Table 2. The Huber-M estimate obtained from the population, also given by Kadilar et al. (2007), is $B_H = 5.025$ for $n = 30$. Note that it is not reasonable to use Huber's M estimator when the errors are from a skewed distribution; the only reason we present the relative efficiencies of the proposed estimators with respect to $\bar{y}_{KCCi}$ ($i = 1, \ldots, 5$) is that Kadilar et al. (2007) used this data set to compare the efficiencies of their estimators and the KCEs.
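Since (1.2), (3.5) and (4.7) share one structure, a single function covers all three, and the traditional ratio estimator is obtained by setting the slope to zero. The sketch below plugs in the rounded population values quoted for Data I with $i = 1$; the function name is hypothetical. With these inputs the ratios reproduce the relative efficiencies reported for $i = 1$ to within rounding, and it is the square roots of the computed MSEs that land near the entries tabulated for $i = 1$ in Table 2. That reading of the table is an inference from the arithmetic, not something the paper states.

```python
import math

def mse_regression_ratio(R, B, S_x, S_y, S_xy, n, N):
    """MSE of the form (1.2)/(3.5)/(4.7); B is B_L, B_T or B_H respectively.

    With B = 0 and R = Ybar / Xbar this reduces to the usual first-order MSE
    of the traditional ratio estimator.
    """
    f = n / N  # sampling fraction
    bracket = (R + B) ** 2 * S_x ** 2 - 2.0 * (R + B) * S_xy + S_y ** 2
    return (1.0 - f) / n * bracket

# Rounded population values quoted for Data I (Table 1 and the text).
S_x, S_y, S_xy = 574.606, 11551.528, 5681761.761
R_KC1, B_L, B_T, B_H = 8.0688, 17.208, 8.198, 5.025
n, N = 30, 106

mse_r = mse_regression_ratio(R_KC1, 0.0, S_x, S_y, S_xy, n, N)   # traditional
mse_kc1 = mse_regression_ratio(R_KC1, B_L, S_x, S_y, S_xy, n, N)
mse_kcc1 = mse_regression_ratio(R_KC1, B_H, S_x, S_y, S_xy, n, N)
mse_pr1 = mse_regression_ratio(R_KC1, B_T, S_x, S_y, S_xy, n, N)

re_1 = 100.0 * mse_r / mse_pr1          # (5.2)
re_star_1 = 100.0 * mse_kc1 / mse_pr1   # (5.3)
re_2star_1 = 100.0 * mse_kcc1 / mse_pr1 # (5.4)
root_mse_pr1 = math.sqrt(mse_pr1)
```

The factor $(1-f)/n$ cancels in every ratio, which is why the relative efficiencies do not depend on the sample size.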
Table 2: Theoretical MSE values and relative efficiencies of the proposed estimators for Data I

i   $\mathrm{MSE}(\bar{y}_{KCi})$   $\mathrm{MSE}(\bar{y}_{KCCi})$   $\mathrm{MSE}(\bar{y}_{pri})$   $RE_i$    $RE_i^{*}$   $RE_i^{**}$
1   1168.790                        992.940                          927.001                         175.89    158.97       114.73
2   1165.464                        994.954                          927.507                         175.70    157.89       115.07
3   1121.371                        1025.199                         937.656                         171.92    143.02       119.54
4   1168.692                        992.999                          927.015                         175.89    158.94       114.74
5   1144.295                        1008.623                         931.554                         174.18    150.89       117.23

From Table 2, it is seen that all the proposed estimators have smaller MSEs than both the traditional ratio estimator and the corresponding KCEs. In addition, the proposed estimators are better than the ones proposed in Kadilar et al. (2007), where the LSE $\tilde{b}_L$ in (1.1) is replaced by the Huber M-estimator. These results are expected since the conditions (4.2), (4.5), and (4.8) are satisfied for $i = 1, \ldots, 5$. We present all the necessary computations to examine these conditions in Table 3. Note that the sample size $n$ has no effect on the MSE comparisons; therefore, the relative efficiency values found for $n = 30$ are exactly the same for all other values of $n$.

We then eliminate the largest deviant, which is grossly anomalous (see Figure 1). We re-calculate the LSE

\[
B_L = \frac{\sum_{i=1}^{105}(x_i - \bar{x})\,y_i}{\sum_{i=1}^{105}(x_i - \bar{x})^2}
\]

from the remaining 105 observations as 5.374 and plot the new ordered deviants $w_{(i)} = y_{[i]} - 5.374\,x_{[i]}$ ($1 \le i \le 105$) against the normal quantiles $\Phi^{-1}(i/106)$. The normal probability plot of the deviants given in Figure 2 indicates a symmetric distribution which has fatter tails than the normal distribution; therefore, the long-tailed symmetric family (2.1) is quite plausible for modeling the error after eliminating the outlier.
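The three conditions (4.2), (4.5) and (4.8) reduce to interval checks on $B_T$. A minimal sketch, with hypothetical function names, is given here; sorting the two bounds handles both branches of (4.2) and (4.5) at once, and (4.8) is written as a sign condition on a product, which is equivalent to its two cases.

```python
def condition_4_2(B_T, B_L, R, R_KC):
    """True when (4.2) holds: proposed estimator beats the traditional ratio estimator."""
    lo, hi = sorted((R - R_KC, 2.0 * B_L - (R + R_KC)))
    return lo < B_T < hi

def condition_4_5(B_T, B_L, R_KC):
    """True when (4.5) holds: proposed estimator beats the Kadilar-Cingi estimator."""
    lo, hi = sorted((-2.0 * R_KC, 0.0))
    return lo < B_T - B_L < hi

def condition_4_8(B_T, B_L, B_H, R_KC):
    """True when (4.8) holds: proposed estimator beats the Huber-M based estimator."""
    bound = 2.0 * (B_L - R_KC) - B_H
    # B_T must lie below the bound when B_T > B_H and above it when B_T < B_H.
    return (B_T - B_H) * (B_T - bound) < 0.0

# Data I, i = 1, with the outlier (values quoted in the text).
with_outlier = (
    condition_4_2(8.198, 17.208, 8.0688, 8.0688),
    condition_4_5(8.198, 17.208, 8.0688),
    condition_4_8(8.198, 17.208, 5.025, 8.0688),
)

# Data I, i = 1, after removing the outlier (values quoted in the text).
without_outlier = (
    condition_4_2(3.281, 5.374, 4.7964, 4.7964),
    condition_4_5(3.281, 5.374, 4.7964),
    condition_4_8(3.281, 5.374, 3.625, 4.7964),
)
```

With the quoted values, all three checks pass with the outlier included, while after its removal (4.2) fails and (4.5) and (4.8) still hold, in line with what the paper reports for the two versions of Data I.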
Table 3: Population values for Data I to test the conditions

Population I with the outlier ($B_L > R$, $R_{KCi} > 0$, $B_T - B_H > 0$)

     Comparison with $\bar{y}_r$                   Comparison with $\bar{y}_{KCi}$   Comparison with $\bar{y}_{KCCi}$
i    $R - R_{KCi}$    $2B_L - (R + R_{KCi})$       $-2R_{KCi}$                       $2(B_L - R_{KCi}) - B_H$
1    0                18.2784                      -16.1375                          13.2534
2    0.0612           18.3396                      -16.0152                          13.3758
3    0.9034           19.1818                      -14.3308                          15.0602
4    0.0018           18.2802                      -16.1340                          13.2570
5    0.4579           18.7363                      -15.2217                          14.1692

Population I without the outlier ($B_L > R$, $R_{KCi} > 0$, $B_T - B_H < 0$)

     Comparison with $\bar{y}_r$                   Comparison with $\bar{y}_{KCi}$   Comparison with $\bar{y}_{KCCi}$
i    $R - R_{KCi}$    $2B_L - (R + R_{KCi})$       $-2R_{KCi}$                       $2(B_L - R_{KCi}) - B_H$
1    0                1.1552                       -9.5928                           -2.4698
2    0.0334           1.1886                       -9.5259                           -2.4030
3    0.1687           1.3239                       -9.2553                           -2.1324
4    0.0040           1.1592                       -9.5848                           -2.4618
5    0.1051           1.2603                       -9.3825                           -2.2596

Following the same procedure discussed above, for a series of values of $p$, we calculate $\hat{z}_i = (y_i - \hat{a}_T - \hat{b}_T x_i)/\hat{\sigma}_T$, $1 \le i \le 105$, where $\hat{a}_T$, $\hat{b}_T$, and $\hat{\sigma}_T$ are found from the formulas given in Tiku et al. (2001), and compute the values of $(1/n)\ln L$, where $L$ is the likelihood function based on the long-tailed symmetric distribution, i.e.,

\[
(1/n)\ln L = -\ln\left\{\sqrt{k}\,B(1/2,\,p - 1/2)\right\} - \ln\hat{\sigma}_T - \frac{p}{n}\sum_{i=1}^{105}\ln\left(1 + \hat{z}_i^2/k\right).
\]

We have the following values calculated from the population with the gross outlier excluded:

$p$               3.0    3.1    3.2    3.3    3.4    3.5    3.6    3.7    3.8    3.9    4.0    5.0
$-(1/n)\ln L$     8.163  8.109  7.999  7.931  7.959  7.985  8.008  8.028  8.046  8.062  8.077  8.173

We chose $p = 3.3$, which is clearly a plausible value of the shape parameter. We find the new MML estimate of the regression coefficient as $B_T = 3.281$. Since the errors follow a long-tailed symmetric distribution after the gross outlier is eliminated, it is reasonable to use the estimators given by Kadilar et al. (2007). The Huber-M estimate is found as $B_H = 3.625$ from the remaining 105 values. The re-calculated population values for $n = 30$ are given in Table 4.
Table 4: Statistics of Data I after eliminating the outlier

$N = 105$              $\bar{Y} = 1112.714$       $\bar{X} = 231.990$
$S_y = 2292.557$       $S_x = 377.523$            $S_{xy} = 765983.244$
$\rho_{xy} = 0.885$    $C_x = 1.627$              $\beta_2(x) = 8.459$
$B_L = 5.374$
$R_{KC1} = 4.7964$     $R_{KC2} = 4.7630$         $R_{KC3} = 4.6277$
$R_{KC4} = 4.7924$     $R_{KC5} = 4.6913$

We calculate the new MSE values and relative efficiencies (5.3) and (5.4) and provide them in Table 5. Since the condition (4.2) is not satisfied for $i = 1, \ldots, 5$, we do not record the relative efficiencies (5.2).

Table 5: Theoretical MSE values and relative efficiencies of the proposed estimators for Data I after eliminating the outlier

i   $\mathrm{MSE}(\bar{y}_{KCi})$   $\mathrm{MSE}(\bar{y}_{KCCi})$   $\mathrm{MSE}(\bar{y}_{pri})$   $RE_i^{*}$   $RE_i^{**}$
1   324.326                         242.115                          227.834                         202.64       112.93
2   322.651                         240.691                          226.493                         202.93       112.93
3   315.898                         235.005                          221.156                         204.03       112.92
4   324.127                         241.945                          227.674                         202.68       112.93
5   319.066                         237.663                          223.646                         203.53       112.93

From Table 5, we conclude that, even after eliminating the outlier in the data, all the proposed estimators are much better than their corresponding KCEs. They are also still better than the ones proposed in Kadilar et al. (2007). These results are also expected since the conditions (4.5) and (4.8) are satisfied for $i = 1, \ldots, 5$. We provide all the necessary computations to test the conditions in Table 3.

5.2 Population II

The study and the auxiliary variables in the second population are the number of students and the number of schools, respectively, in 124 districts of the Eastern Anatolia Region of Turkey in 2007 (Source: Ministry of National Education, Republic of Turkey). In Table 6, we provide the population values obtained from the data. The LSE $B_L$ is found as 298.363 from the population.
The normal probability plot of the ordered deviants $w_{(i)} = y_{[i]} - 298.363\,x_{[i]}$ ($1 \le i \le 124$), which is given in Figure 3, indicates a positively skewed distribution with a tail on the right-hand side; thus, the generalized logistic density function (2.2) with $r > 1$ is a candidate. The maximum of $(1/n)\ln L$ is attained at $r = 8$.

Table 6: Statistics obtained from Data II

$N = 124$              $\bar{Y} = 8725.161$       $\bar{X} = 48.621$
$S_y = 13690.788$      $S_x = 37.555$             $S_{xy} = 420806.037$
$\rho_{xy} = 0.818$    $C_x = 0.772$              $\beta_2(x) = 0.782$
$B_L = 298.363$
$R_{KC1} = 179.4526$   $R_{KC2} = 176.6464$       $R_{KC3} = 176.6111$
$R_{KC4} = 175.8809$   $R_{KC5} = 175.7909$

We find $B_T = 166.419$. Since Huber-M estimation is not defined for skewed distributions, we do not use the estimators $\bar{y}_{KCCi}$ ($i = 1, \ldots, 5$) given by Kadilar et al. (2007) for this data set. The values of $\mathrm{MSE}(\bar{y}_{KCi})$ and $\mathrm{MSE}(\bar{y}_{pri})$, and the values of the relative efficiencies (5.2) and (5.3) for $n = 10$, are given in Table 7. Note that $\mathrm{MSE}(\bar{y}_r) = 2742.775$ for $n = 10$.

Fig. 3: Normal probability plot of Data II (ordered deviants against normal quantiles)

From Table 7, we see that the minimum MSE values are attained with the proposed estimators in (3.1), which are clearly the best estimators for this population. This result is expected since the conditions (4.2) and (4.5) are satisfied for $i = 1, \ldots, 5$, as shown in Table 8.
Table 7: Theoretical MSE values and relative efficiencies of the proposed estimators for Data II

i   $\mathrm{MSE}(\bar{y}_{KCi})$   $\mathrm{MSE}(\bar{y}_{pri})$   $RE_i$    $RE_i^{*}$
1   3140.863                        2445.827                        125.76    164.91
2   3120.168                        2438.958                        126.47    163.66
3   3119.909                        2438.874                        126.47    163.65
4   3114.556                        2437.153                        126.65    163.32
5   3113.897                        2436.943                        126.67    163.27

Table 8: Population values for Data II to test the conditions

Population II ($B_L > R$, $R_{KCi} > 0$)

     Comparison with $\bar{y}_r$                   Comparison with $\bar{y}_{KCi}$
i    $R - R_{KCi}$    $2B_L - (R + R_{KCi})$       $-2R_{KCi}$
1    0                237.8208                     -358.9053
2    2.8062           240.6270                     -353.2928
3    2.8415           240.6623                     -353.2222
4    3.5717           241.3925                     -351.7618
5    3.6617           241.4825                     -351.5818

6. CONCLUSION AND FUTURE WORK

For a linear regression model, the assumption of normality of the error term hardly ever holds; thus, we propose to integrate Tiku's MML methodology into ratio-type estimators by replacing the LSE with the MMLE in the KCEs. Kadilar et al. (2007) proposed adapting Huber-M estimation to ratio-type estimators; however, Tiku's MMLEs have some important advantages over Huber's M estimators in robust inference. The MMLEs have explicit algebraic forms, which make the theoretical MSE comparisons meaningful in the context of survey sampling, while Huber's M estimators do not. The MMLEs are also easier to compute. More importantly, the MMLEs are applicable to both symmetric and skewed distributions, while Huber's M estimators are only applicable to long-tailed symmetric distributions. We are currently investigating, via simulations, the effect of the shape parameters in (2.1) and (2.2) on the proposed estimators, as well as their robustness properties, and we plan to report our findings in a future paper.

ACKNOWLEDGEMENTS

The authors are thankful to the anonymous referees for their constructive comments and suggestions for the improvement of this manuscript.

REFERENCES

1. Chambers, R.L. (1986). Outlier robust finite population estimation. J. Amer. Statist. Assoc., 81, 1063-1069.
2. Cochran, W.G. (1977). Sampling Techniques. J. Wiley and Sons, NY.
3. Farrell, P.J. and Barrera, S.M. (2007). A comparison of several robust estimators for a finite population mean. J. Statist. Studies, 26, 29-43.
4. Islam, M.Q. and Tiku, M.L. (2004). Multiple linear regression model under nonnormality. Commun. in Statist.: Theo. and Meth., 33, 2443-2467.
5. Islam, M.Q., Tiku, M.L. and Yildirim, F. (2001). Nonnormal regression I: skew distributions. Commun. in Statist.: Theo. and Meth., 30, 993-1020.
6. Kadilar, C. and Cingi, H. (2004). Ratio estimators in simple random sampling. App. Math. and Compu., 151, 893-902.
7. Kadilar, C. and Cingi, H. (2003). Ratio estimators in stratified random sampling. Biometrical Journal, 45, 218-225.
8. Kadilar, C., Candan, M. and Cingi, H. (2007). Ratio estimators using robust regression. Hacettepe J. Math. and Statist., 36, 181-188.
9. Jenkins, O.C., Ringer, L.J. and Hartley, H.O. (1973). Root estimators. J. Amer. Statist. Assoc., 68, 414-419.
10. Tiku, M.L. and Akkaya, A.D. (2004). Robust Estimation and Hypothesis Testing. New Age International Pub., New Delhi.
11. Tiku, M.L., Islam, M.Q. and Selcuk, A.S. (2001). Nonnormal regression II: symmetric distributions. Commun. in Statist.: Theo. and Meth., 30, 1021-1045.
12. Wolter, K.M. (1985). Introduction to Variance Estimation. Springer-Verlag, New York.