Open access peer-reviewed chapter - ONLINE FIRST

A Study on the Comparison of the Effectiveness of the Jackknife Method in the Biased Estimators

By Nilgün Yıldız

Submitted: September 10th 2018. Reviewed: November 2nd 2018. Published: December 18th 2018.

DOI: 10.5772/intechopen.82366


Abstract

In this study, we propose an alternative biased estimator. The linear regression model may involve an ill-conditioned design matrix because of multicollinearity, which makes the ordinary least squares (OLS) estimator inadequate. Scientists have developed alternative estimation techniques to eradicate the instability in the estimates, leading to biased estimators such as the Stein estimator, the ordinary ridge regression (ORR) estimator, and the principal components regression (PCR) estimator. Liu developed the Liu estimator (LE) by combining the Stein estimator with the ORR estimator. Since both the ORR and the LE depend on the OLS estimator, multicollinearity affects them both; the ORR and LE may therefore give misleading information in the presence of multicollinearity. To overcome this problem, Liu introduced a new estimator based on the biasing parameters k and d, and subsequent work aimed at an estimator that retains the valuable characteristics of this Liu-type estimator (LTE) but has a smaller bias. We propose a modified jackknifed Liu-type estimator (MJLTE), created by combining the ideas underlying both the LTE and the jackknifed Liu-type estimator (JLTE). Under the mean square error matrix criterion, the MJLTE is superior to both the LTE and the JLTE. Finally, a real data example and a Monte Carlo simulation are given to illustrate the theoretical results.

Keywords

  • jackknifed estimators
  • jackknifed Liu-type estimator
  • multicollinearity
  • MSE
  • Liu-type estimator

1. Introduction

Regression analysis seeks answers to questions such as the following: Is there a relationship between the dependent and independent variables? If there is a relationship, how strong is it? What form does the relationship between the variables take? Can future values of the variables be predicted, and how should they be estimated? What is the effect of a particular variable, or group of variables, on other variables when certain conditions are controlled? Linear regression is a very important and popular method in statistics. According to the Web of Science, the number of publications about linear regression between 2014 and 2018 is given in Figure 1.

Figure 1.

Number of publications published between 2014 and 2018.

According to Figure 1, the number of studies conducted in 2014 is 12,381, while the number of studies conducted in 2018 is 13,137.

The number of publications about linear regression by document types is given in Figure 2.

Figure 2.

Number of publications by document types.

The most common type of document about linear regression is the article. This is followed by proceedings papers, reviews, and editorial material.

The number of publications about linear regression by research area is given in Figure 3.

Figure 3.

Number of publications by research area.

The research area with the most publications related to linear regression is engineering, followed by mathematics, computer science, environmental sciences, ecology, and other scientific fields.

The number of publications about linear regression by countries is given in Figure 4.

Figure 4.

Number of publications by countries.

The countries with the most publications on linear regression are, in order, the USA, China, England, Germany, Canada, and Australia.

In regression analysis, the most commonly used method for estimating coefficients is ordinary least squares (OLS). We consider the multiple linear regression model given as

y = Xβ + ε  (1)

where y is an n×1 observable random vector, X is an n×p matrix of non-stochastic (independent) variables of rank p, β is a p×1 vector of unknown parameters associated with X, and ε is an n×1 vector of error terms with

E(ε) = 0,  Cov(ε) = σ²I  (2)

There are several methods to estimate the unknown parameters. Apart from the most frequently used least squares method (OLS), there are three general estimation methods: maximum likelihood, generalized least squares, and the best linear unbiased estimator (BLUE) [1].
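For concreteness, here is a minimal sketch of computing the OLS estimate of model (1); the data, dimensions, and coefficient values are illustrative assumptions, not taken from the chapter.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 3
X = rng.normal(size=(n, p))          # simulated design matrix (assumption)
beta = np.array([1.0, -2.0, 0.5])    # true coefficients (assumption)
y = X @ beta + rng.normal(scale=0.5, size=n)

# OLS estimate: solve the normal equations (X'X) b = X'y
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_ols)
```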

The use of the once very popular ordinary least squares (OLS) estimator has become limited because multicollinearity makes the estimates unstable and inflates the variance of the regression coefficients.

Multicollinearity can be defined as a linear (or nearly linear) relationship among the independent variables. In regression analysis, multicollinearity leads to the following problems:

  • In the case of exact multicollinearity, the linear regression coefficients are indeterminate and their standard errors are infinite.

  • Multicollinearity inflates the variances and covariances of the OLS regression coefficients.

  • The model's R² value is high, but none of the independent variables is significant according to individual (partial) t-tests.

  • The signs of the related independent variables' relationships with the dependent variable may contradict theoretical and empirical expectations.

  • If the independent variables are interrelated, some of them may need to be removed from the model. But which variables should be removed? Dropping the wrong variable leads to model misspecification, and there are no simple rules for deciding which regressors to include or exclude.

Methods for dealing with multicollinearity include collecting additional data, respecifying the model (for example, replacing two related variables by their sum as a single variable), and using biased estimators; a simple diagnostic sketch is given below. This chapter provides information on biased estimators used as alternatives to OLS. In the literature, many researchers have developed biased regression estimators [2, 3].
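Two standard multicollinearity diagnostics, the condition number and the variance inflation factor (VIF), can be sketched as follows. The VIF is a common diagnostic that is not discussed explicitly in this chapter; the function names are ours.

```python
import numpy as np

def condition_number(X: np.ndarray) -> float:
    """Square root of the ratio of the largest to smallest eigenvalue of X'X."""
    eigvals = np.linalg.eigvalsh(X.T @ X)
    return float(np.sqrt(eigvals.max() / eigvals.min()))

def vif(X: np.ndarray, j: int) -> float:
    """VIF of regressor j: 1 / (1 - R_j^2), where R_j^2 comes from
    regressing column j on the remaining columns (simple version)."""
    y = X[:, j]
    Z = np.delete(X, j, axis=1)
    b, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ b
    sst = (y - y.mean()) @ (y - y.mean())
    r2 = 1.0 - (resid @ resid) / sst
    return 1.0 / (1.0 - r2)
```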

Examples of such biased estimators are the ordinary ridge regression (ORR) estimator introduced by Hoerl and Kennard [4].

β̂(k) = (X′X + kI)⁻¹X′y,  k ≥ 0  (3)

where k is a biasing parameter. In later years, researchers combined various estimators to obtain better results. For example, Baye and Parker [5] introduced the r–k class estimator, which combines the ORR and principal components regression (PCR) estimators; they also showed that the r–k class estimator is superior to the PCR estimator under the scalar mean square error (SMSE) criterion.
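A direct sketch of the ORR estimator (3); the function name is ours.

```python
import numpy as np

def ridge(X: np.ndarray, y: np.ndarray, k: float) -> np.ndarray:
    """Ordinary ridge regression estimator (X'X + kI)^{-1} X'y, k >= 0."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)
```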

The Liu estimator (LE) was developed by Liu [6] by combining the Stein [7] estimator with the ORR estimator. Since both the ORR and the LE depend on the OLS estimator, multicollinearity affects them both, and they may therefore give misleading information in the presence of multicollinearity.

β̂(d) = (X′X + I)⁻¹(X′y + dβ̂),  0 < d < 1  (4)

To overcome this problem, Liu [8] introduced a new estimator, which is based on the biasing parameters k and d, as follows:

β̂_LTE = (X′X + kI)⁻¹(X′y − dβ̂),  k > 0, −∞ < d < ∞  (5)
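A corresponding sketch of the LTE (5), using the −dβ̂ sign convention consistent with (8) below; the function name is ours.

```python
import numpy as np

def liu_type(X: np.ndarray, y: np.ndarray, k: float, d: float) -> np.ndarray:
    """Liu-type estimator (X'X + kI)^{-1} (X'y - d * beta_ols), k > 0."""
    p = X.shape[1]
    beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y - d * beta_ols)
```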

Researchers then worked on developing an estimator that would retain the valuable characteristics of the Liu-type estimator (LTE) but have a smaller bias. In 1956, Quenouille [9] suggested that it is possible to reduce bias by applying a jackknife procedure to a biased estimator.

This procedure processes experimental data to obtain statistical estimators of unknown parameters: a truncated sample is used to calculate a specific function of the estimators. The advantage of the jackknife procedure is that it produces an estimator with a small bias that still retains the beneficial properties of large samples. In this chapter, we apply the jackknife technique to the LTE. Further, we establish the mean squared error superiority of the proposed estimator over both the LTE and the jackknifed Liu-type estimator (JLTE).

The chapter is organized as follows: the model, the LTE, and the JLTE are described in Section 2. The proposed new estimator is introduced in Section 3. The superiority of the new estimator vis-à-vis the LTE and the JLTE is studied, and the performance of the modified jackknifed Liu-type estimator (MJLTE) is compared with that of the JLTE, in Section 4. Sections 5 and 6 present a real data example and a simulation study to justify the superiority of the suggested estimator.

2. The model

We assume that two or more regressors in X are closely linearly related, so the model suffers from the multicollinearity problem. The symmetric matrix S = X′X has an eigenvalue–eigenvector decomposition of the form S = TΛT′, where T is an orthogonal matrix and Λ is a (real) diagonal matrix. The diagonal elements of Λ are the eigenvalues of S, and the column vectors of T are the eigenvectors of S. The orthogonal version of the standard multiple linear regression model is

y = XTT′β + ε = Zγ + ε  (6)

where Z = XT, γ = T′β, and Z′Z = Λ. The ordinary least squares estimator (OLSE) of γ is given by

γ̂ = (Z′Z)⁻¹Z′y = Λ⁻¹Z′y  (7)
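The canonical form (6)–(7) can be computed via an eigendecomposition of X′X; a short sketch with a helper name of our choosing:

```python
import numpy as np

def canonical_form(X, y):
    """Return Lambda, T, Z = XT and the OLSE of gamma per (6)-(7)."""
    lam, T = np.linalg.eigh(X.T @ X)   # S = T Lambda T'
    Z = X @ T
    gamma_ols = (Z.T @ y) / lam        # Lambda^{-1} Z'y, since Z'Z = Lambda is diagonal
    return lam, T, Z, gamma_ols
```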

Liu [8] proposed a new biased estimator for γ, called the Liu-type estimator (LTE), defined as

γ̂_LTE(k,d) = (Λ + kI)⁻¹(Z′y − dγ̂),  k ≥ 0, −∞ < d < ∞  (8)
= (Λ + kI)⁻¹(Z′y − dΛ⁻¹Z′y) = [I − (k + d)(Λ + kI)⁻¹]γ̂ = F_kd γ̂  (9)

where

F_kd = (Λ + kI)⁻¹(Λ − dI)  (10)

γ̂_LTE has bias vector

Bias(γ̂_LTE) = (F_kd − I)γ  (11)

and covariance matrix

Cov(γ̂_LTE) = σ²F_kd Λ⁻¹F_kd′  (12)

Following Hinkley [10], Singh et al. [11], Nyquist [12], and Batah et al. [13], we can propose the jackknifed form of γ̂_LTE. Quenouille [9] and Tukey [14] introduced the jackknife technique to reduce bias; Hinkley [10] noted that, with few exceptions, the jackknife had been applied to balanced models. After some algebraic manipulations, the corresponding jackknife estimator is obtained by deleting the ith observation (z_i′, y_i) as

(A − z_i z_i′)⁻¹ = A⁻¹ + A⁻¹z_i z_i′A⁻¹ / (1 − z_i′A⁻¹z_i)

γ̂_LTE(i)(k,d) = (A − z_i z_i′)⁻¹(Z′y − z_i y_i)
= [A⁻¹ + A⁻¹z_i z_i′A⁻¹ / (1 − z_i′A⁻¹z_i)](Z′y − z_i y_i)
= γ̂_LTE(k,d) − A⁻¹z_i (y_i − z_i′γ̂_LTE(k,d)) / (1 − z_i′A⁻¹z_i)
= γ̂_LTE(k,d) − A⁻¹z_i e_i / (1 − w_i)  (13)

where z_i′ is the ith row of Z, e_i = y_i − z_i′γ̂_LTE(k,d) is the Liu-type residual, w_i = z_i′A⁻¹z_i is the distance factor, and A⁻¹ = (Λ + kI)⁻¹(I − dΛ⁻¹) = F_kd Λ⁻¹. In view of the non-zero values of w_i, reflecting the lack of balance in the model, we use the weighted jackknife procedure. Thus, the weighted pseudo-values are defined as

Q_i = γ̂_LTE(k,d) + n(1 − w_i)[γ̂_LTE(k,d) − γ̂_LTE(i)(k,d)]

The weighted jackknifed estimator of γ is then obtained as

γ̂_JLTE(k,d) = (1/n) Σᵢ₌₁ⁿ Q_i = γ̂_LTE(k,d) + A⁻¹ Σᵢ₌₁ⁿ z_i e_i  (14)

Σᵢ₌₁ⁿ z_i e_i = Σᵢ₌₁ⁿ z_i (y_i − z_i′γ̂_LTE(k,d)) = (I − ΛA⁻¹)Z′y

γ̂_JLTE(k,d) = γ̂_LTE(k,d) + A⁻¹Z′y − A⁻¹ΛA⁻¹Z′y = (2I − A⁻¹Λ)γ̂_LTE(k,d)  (15)

However, since A⁻¹Λ = (Λ + kI)⁻¹(Λ − dI) = F_kd, we obtain

γ̂_JLTE(k,d) = (2I − F_kd)γ̂_LTE(k,d)  (16)

From (9) we have

γ̂_JLTE(k,d) = (2I − F_kd)F_kd γ̂  (17)
Bias(γ̂_JLTE(k,d)) = −(I − F_kd)²γ  (18)

and the covariance matrix of the JLTE is

Cov(γ̂_JLTE(k,d)) = σ²(2I − F_kd)F_kd Λ⁻¹F_kd′(2I − F_kd)′  (19)

The mean squared error matrices (MSEMs) of the JLTE and the LTE are

MSEM(γ̂_JLTE(k,d)) = Cov(γ̂_JLTE(k,d)) + Bias(γ̂_JLTE(k,d))Bias(γ̂_JLTE(k,d))′
= σ²(2I − F_kd)F_kd Λ⁻¹F_kd′(2I − F_kd)′ + (I − F_kd)²γγ′[(I − F_kd)²]′  (20)

MSEM(γ̂_LTE(k,d)) = σ²F_kd Λ⁻¹F_kd′ + (F_kd − I)γγ′(F_kd − I)′  (21)
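In the canonical form everything is diagonal, so (11)–(12) and (18)–(19) reduce to elementwise operations. A minimal sketch, assuming γ and σ² are known (in practice they would be replaced by estimates; the helper name is ours):

```python
import numpy as np

def lte_jlte_moments(lam, gamma, sigma2, k, d):
    """Bias vectors and variance diagonals of the LTE and JLTE, per (11)-(12) and (18)-(19)."""
    F = (lam - d) / (lam + k)                     # diagonal of F_kd
    bias_lte = (F - 1.0) * gamma                  # (11)
    var_lte = sigma2 * F**2 / lam                 # diagonal of (12)
    bias_jlte = -(1.0 - F)**2 * gamma             # (18)
    var_jlte = sigma2 * ((2.0 - F) * F)**2 / lam  # diagonal of (19)
    return bias_lte, var_lte, bias_jlte, var_jlte
```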

3. Our novel MJLTE estimator

In this section, following Yıldız [15], we propose a new estimator for γ. The proposed estimator is designated the modified jackknifed Liu-type estimator (MJLTE) and denoted by γ̂_MJLTE(k,d):

γ̂_MJLTE(k,d) = [I − (k + d)²(Λ + kI)⁻²][I − (k + d)(Λ + kI)⁻¹]γ̂  (22)

It may be noted that the proposed estimator MJLTE in (22) is obtained as in the case of the JLTE, but by plugging in the LTE instead of the OLSE. The expressions for the bias, covariance matrix, and mean squared error matrix (MSEM) of γ̂_MJLTE(k,d) are

Bias(γ̂_MJLTE(k,d)) = −(k + d)(Λ + kI)⁻¹W(Λ + kI)⁻¹γ  (23)
Cov(γ̂_MJLTE(k,d)) = σ²ΦΛ⁻¹Φ′  (24)
MSEM(γ̂_MJLTE(k,d)) = σ²ΦΛ⁻¹Φ′ + (k + d)²(Λ + kI)⁻¹Wγγ′W(Λ + kI)⁻¹  (25)

where W = I + (k + d)(Λ + kI)⁻¹ − (k + d)²(Λ + kI)⁻² = I + F̄_kd − F̄_kd² and Φ = (2I − F_kd)F_kd², with F̄_kd = I − F_kd.
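A hedged sketch of (22) in the canonical form, where all the matrices involved are diagonal; u below denotes the diagonal of (k + d)(Λ + kI)⁻¹ and the function name is ours.

```python
import numpy as np

def mjlte(lam, gamma_ols, k, d):
    """MJLTE of (22), computed elementwise in the canonical form."""
    u = (k + d) / (lam + k)        # diagonal of (k+d)(Lambda + kI)^{-1}
    return (1.0 - u**2) * (1.0 - u) * gamma_ols
```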

4. Properties of the MJLTE

One of the most prominent features of our novel MJLTE estimator is that, under some conditions, its bias is smaller than that of the LTE from which it originates.

Theorem 4.1. Under the model (1) with the assumptions (2), the inequality

‖Bias(γ̂_MJLTE(k,d))‖² < ‖Bias(γ̂_LTE(k,d))‖² holds true for d > 0 and k > d.

Proof. From (11) and (23), we can obtain

‖Bias(γ̂_LTE(k,d))‖² − ‖Bias(γ̂_MJLTE(k,d))‖² = (k + d)²γ′(Λ + kI)⁻²[I − W²(Λ + kI)⁻²]γ > 0

The difference is greater than 0 because it consists of products of squares of diagonal matrices in the expression above, each diagonal entry being positive for d > 0 and k > d. Thus, the proof is completed.

Corollary 4.1. The absolute value of the bias of the ith component of the MJLTE is smaller than that of the LTE, namely |Bias(γ̂_MJLTE(k,d))ᵢ| < |Bias(γ̂_LTE(k,d))ᵢ|.

Theorem 4.2. The MJLTE has a smaller variance than the LTE.

Proof. From (12) and (24), it can be shown that

Cov(γ̂_LTE(k,d)) − Cov(γ̂_MJLTE(k,d)) = σ²H

where

H = [I − (k + d)(Λ + kI)⁻¹]Λ⁻¹[I − (k + d)(Λ + kI)⁻¹] − ΦΛ⁻¹Φ′ = F_kd Λ⁻¹F_kd′ − (2I − F_kd)F_kd²Λ⁻¹F_kd²(2I − F_kd)′

H is a diagonal matrix whose ith diagonal element

h_ii = (λᵢ − d)²[(λᵢ + k)⁴ − (λᵢ + 2k + d)²(λᵢ − d)²] / [λᵢ(λᵢ + k)⁶]

is a positive number. Thus, we conclude that H is a positive definite matrix. This completes the proof.
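As a numerical spot-check (not part of the original proof), the reconstructed expression for h_ii can be evaluated on illustrative values of k, d, and the eigenvalues:

```python
import numpy as np

# Spot-check of Theorem 4.2: h_ii > 0 for d > 0 and k > d (illustrative values).
k, d = 0.7, 0.3
lam = np.array([1.47, 3.77, 4.52, 15.33])
h = (lam - d)**2 * ((lam + k)**4 - (lam + 2*k + d)**2 * (lam - d)**2) \
    / (lam * (lam + k)**6)
assert np.all(h > 0)
```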

Next, we prove necessary and sufficient condition for the MJLTE to outperform the LTE using the MSEM condition. The proof requires the following lemma.

Lemma 4.1. Let M be a positive definite matrix, namely M > 0, and let α be some vector; then

M − αα′ ≥ 0 if and only if α′M⁻¹α ≤ 1

Proof. See Farebrother [16].

Theorem 4.3. The MJLTE is superior to the LTE in the MSEM sense, namely Δ₁ = MSEM(γ̂_LTE(k,d)) − MSEM(γ̂_MJLTE(k,d)) is a nonnegative definite matrix, if and only if the inequality

γ′[L⁻¹(σ²H + F̄_kd γγ′F̄_kd′)L′⁻¹]⁻¹γ ≤ 1  (26)

is satisfied, with L = F̄_kd W, F̄_kd = I − F_kd, and W = I + F̄_kd − F̄_kd².

Proof.

We consider the difference; from (21) and (25), we have

Δ₁ = MSEM(γ̂_LTE(k,d)) − MSEM(γ̂_MJLTE(k,d)) = σ²H + F̄_kd γγ′F̄_kd′ − Lγγ′L′  (27)

where

H = [I − (k + d)(Λ + kI)⁻¹]Λ⁻¹[I − (k + d)(Λ + kI)⁻¹] − ΦΛ⁻¹Φ′ = F_kd Λ⁻¹F_kd′ − (2I − F_kd)F_kd²Λ⁻¹F_kd²(2I − F_kd)′

and W = I + F̄_kd − F̄_kd² is a positive definite matrix. We have seen that H is a positive definite matrix from Theorem 4.2. Therefore, the difference Δ₁ is nonnegative definite if and only if L⁻¹Δ₁L′⁻¹ is nonnegative definite. The matrix L⁻¹Δ₁L′⁻¹ can be written as

L⁻¹Δ₁L′⁻¹ = L⁻¹(σ²H + F̄_kd γγ′F̄_kd′)L′⁻¹ − γγ′  (28)

Since the matrix σ²H + F̄_kd γγ′F̄_kd′ is symmetric and positive definite, using Lemma 4.1 we may conclude that L⁻¹Δ₁L′⁻¹ is nonnegative definite if and only if the inequality

γ′[L⁻¹(σ²H + F̄_kd γγ′F̄_kd′)L′⁻¹]⁻¹γ ≤ 1

is satisfied.

4.1. Comparison between the JLTE and the MJLTE

Here, we show that the MJLTE outperforms the JLTE in terms of the sampling variance.

Theorem 4.4. The MJLTE has a smaller variance than the JLTE for d > 0 and k > d.

Proof. From (19) and (24), it can be written as

Cov(γ̂_JLTE(k,d)) = σ²(2I − F_kd)F_kd Λ⁻¹F_kd′(2I − F_kd)′ = σ²VUΛ⁻¹UV  (29)

and

Cov(γ̂_MJLTE(k,d)) = σ²ΦΛ⁻¹Φ′ = σ²VUVΛ⁻¹VUV  (30)

where V = I − F̄_kd = F_kd and U = I + F̄_kd = 2I − F_kd, respectively. It can be shown that

Cov(γ̂_JLTE(k,d)) − Cov(γ̂_MJLTE(k,d)) = σ²Σ  (31)

where Σ = VU(Λ⁻¹ − VΛ⁻¹V)UV, and Σ is a diagonal matrix. The ith diagonal element of Cov(γ̂_JLTE(k,d)) − Cov(γ̂_MJLTE(k,d)) is then

σ²(λᵢ + 2k + d)²(λᵢ − d)²(k + d)(2λᵢ + k − d) / [λᵢ(λᵢ + k)⁶]

Hence Cov(γ̂_JLTE(k,d)) − Cov(γ̂_MJLTE(k,d)) > 0, which completes the proof.

In the following theorem, we have obtained a necessary and sufficient condition for the MJLTE to outperform the JLTE in terms of matrix mean square error. The proof of the theorem is similar to that of Theorem 4.3.

Theorem 4.5.

Δ₂ = MSEM(γ̂_JLTE(k,d)) − MSEM(γ̂_MJLTE(k,d)) is a nonnegative definite matrix, if and only if the inequality

γ′[L⁻¹(σ²Σ + F̄_kd²γγ′F̄_kd²′)L′⁻¹]⁻¹γ ≤ 1  (32)

is satisfied.

Proof. From (20) and (25), we have

Δ₂ = MSEM(γ̂_JLTE(k,d)) − MSEM(γ̂_MJLTE(k,d)) = σ²Σ + F̄_kd²γγ′F̄_kd²′ − Lγγ′L′

We have seen from Theorem 4.4 that Σ is a positive definite matrix. Therefore, the difference Δ₂ is nonnegative definite if and only if L⁻¹Δ₂L′⁻¹ is nonnegative definite. The matrix L⁻¹Δ₂L′⁻¹ can be written as

L⁻¹Δ₂L′⁻¹ = L⁻¹(σ²Σ + F̄_kd²γγ′F̄_kd²′)L′⁻¹ − γγ′

Since the matrix σ²Σ + F̄_kd²γγ′F̄_kd²′ is symmetric and positive definite, using Lemma 4.1 we may conclude that L⁻¹Δ₂L′⁻¹ is nonnegative definite if and only if the inequality

γ′[L⁻¹(σ²Σ + F̄_kd²γγ′F̄_kd²′)L′⁻¹]⁻¹γ ≤ 1

is satisfied. This completes the proof. Theorems 4.1–4.5 show that the proposed estimator is superior to the LTE and the JLTE; accordingly, we can say that, under the stated conditions, the MJLTE is better than both of these estimators.

5. Numerical example

To motivate the problem of estimation in the linear regression model, we consider the hedonic prices of housing attributes. The data consist of 92 detached homes in the Ottawa area sold during 1987 (see Yatchew [17]).

Let y be the sale price (sp) of the house, and let X be a 92 × 9 observation matrix consisting of the variables: frplc (dummy for fireplace(s)), grge (dummy for garage), lux (dummy for luxury appointment), avginc (average neighborhood income), dhwy (distance to highway), lotarea (area of lot), nrbed (number of bedrooms), and usespc (usable space). The data are given in Table 1.

Table 1. Data set (the 92 observations on sale price and the housing attribute variables described above, including bathroom, distance, and location indicator variables).

The eigenvalues of the 9 × 9 matrix X′X are λ₁ = 1.47, λ₂ = 3.77, λ₃ = 4.52, λ₄ = 15.33, λ₅ = 18.57, λ₆ = 20.97, λ₇ = 41.79, λ₈ = 271.15, and λ₉ = 239153.68.

If we use the spectral norm, the corresponding measure of conditioning of X is the condition number κ(X) = √(λ_max(X′X)/λ_min(X′X)), where κ(X) ≥ 1. We obtained κ(X) = 403.27, which is large, so X may be considered ill-conditioned.
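For reference, the reported condition number can be reproduced from the eigenvalues above (a small sketch; the slight discrepancy in the last digits comes from the rounding of the eigenvalues):

```python
import numpy as np

# Reported eigenvalues of X'X; kappa(X) = sqrt(lambda_max / lambda_min).
eig = np.array([1.47, 3.77, 4.52, 15.33, 18.57, 20.97, 41.79, 271.15, 239153.68])
print(np.sqrt(eig.max() / eig.min()))   # about 403.3; the text reports 403.27
```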

In this case, the regression coefficients become insignificant, and it is therefore hard to make valid inferences or predictions using the OLS method. To overcome many of the difficulties associated with the OLS estimates, the LTE is used: with β̂ = (X′X)⁻¹X′y and biasing parameters k and d, the use of β̂_LTE = (X′X + kI)⁻¹(X′y − dβ̂), k > 0, −∞ < d < ∞, has become conventional. The LTE estimator is used for the following example. The original model was rewritten in the canonical form (6), y = Zγ + ε. The estimators γ̂_LTE, γ̂_JLTE, and γ̂_MJLTE were computed for d = 0.10, 0.30, 0.70, 1 and k = 0.30, 0.50, 0.70, 1, and the coefficients estimated by these estimators were then transformed back to the original variable scale. The scalar MSE values (SMSE = trace(MSEM)) of the estimators for the individual values of d and k are shown in Tables 2–5; a computational sketch is given below. The effects of different values of d on the MSE can be seen in Figures 5–8, which clearly show that the proposed estimator (MJLTE) has smaller estimated MSE values than the LTE and the JLTE.
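A sketch of how such a table could be produced in the canonical form, with all three shrinkage rules expressed through the diagonal of F_kd (γ, σ², and the eigenvalues would come from the data; the names are ours):

```python
import numpy as np

def smse_all(lam, gamma, sigma2, k, d):
    """SMSE (trace of the MSEM) of the LTE, JLTE and MJLTE in the canonical form."""
    F = (lam - d) / (lam + k)              # diagonal of F_kd
    rules = {"LTE": F, "JLTE": (2 - F) * F, "MJLTE": (2 - F) * F**2}
    out = {}
    for name, G in rules.items():
        bias = (G - 1.0) * gamma           # Bias = (G - I) gamma
        var = sigma2 * G**2 / lam          # diagonal of sigma^2 G Lambda^{-1} G'
        out[name] = float(np.sum(var + bias**2))
    return out

# Grid mirroring the chapter: for k in (0.3, 0.5, 0.7, 1.0) and
# d in (0.1, 0.3, 0.7, 1.0), tabulate smse_all(lam, gamma, sigma2, k, d).
```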

              d = 0.10     d = 0.30     d = 0.70     d = 1
MSE(LTE)      810.4511     1037.6454    1900.6897    12905.5467
MSE(JLTE)     733.5563     729.5050     977.1382     1688.9649
MSE(MJLTE)    631.2267     669.0754     967.7905     1289.1137

Table 2.

The estimated MSE values of the LTE, JLTE, and MJLTE for k = 0.30.

              d = 0.10     d = 0.30     d = 0.70     d = 1
MSE(LTE)      957.6623     1245.7243    2157.4693    3134.9466
MSE(JLTE)     725.6311     752.2125     1102.8970    1872.6471
MSE(MJLTE)    608.2459     656.6023     892.2214     1115.3394

Table 3.

The estimated MSE values of the LTE, JLTE, and MJLTE for k = 0.50.

              d = 0.10     d = 0.30     d = 0.70     d = 1
MSE(LTE)      1133.2567    1459.7360    2393.9079    2042.7127
MSE(JLTE)     734.5155     795.8311     1234.9720    3340.5986
MSE(MJLTE)    587.0096     633.9972     815.8143     973.5845

Table 4.

The estimated MSE values of the LTE, JLTE, and MJLTE for k = 0.70.

              d = 0.10     d = 0.30     d = 0.70     d = 1
MSE(LTE)      1415.148     1774.1222    1774.1222    3613.8006
MSE(JLTE)     779.0405     891.0250     891.0250     2274.5162
MSE(MJLTE)    551.0494     588.8484     588.8484     807.4456

Table 5.

The estimated MSE values of the LTE, JLTE, and MJLTE for k = 1.

Figure 5.

Various MSE values of the proposed estimator compared to the others for different values of d when k = 0.30.

Figure 6.

Various MSE values of the proposed estimator compared to the others for different values of d when k = 0.50.

Figure 7.

Various MSE values of the proposed estimator compared to the others for different values of d when k = 0.70.

Figure 8.

Various MSE values of the proposed estimator compared to the others for different values of d when k = 1.

We observed that, for all values of d, SMSE(MJLTE) assumed smaller values than both SMSE(JLTE) and SMSE(LTE). The estimators' SMSE values are affected by increasing values of k; however, the estimator least affected by these changes is our proposed MJLTE. When compared with the other two estimators, the SMSE values of the MJLTE gave the best results for both small and large values of k and d.

6. A simulation study

We illustrate the behavior of the proposed estimator by a Monte Carlo simulation. This section describes the construction and details of the simulation, which is designed to evaluate the performance of the estimators LTE, JLTE, and MJLTE when the regressors are highly intercorrelated. Following Liu [8] and Kibria [18], the explanatory variables and the response variable are generated using the following equations:

x_ij = (1 − γ²)^(1/2) z_ij + γz_ip,  y_i = (1 − γ²)^(1/2) z_i + γz_ip,  i = 1, 2, …, n, j = 1, 2, …, p

where z_ij is an independent standard normal pseudo-random number and p is specified so that the correlation between any two explanatory variables is given by γ². In this study, we used γ = 0.90, 0.95, 0.99 to investigate the effects of different degrees of collinearity, with sample sizes n = 20, 50, and 100, while four different combinations of (k, d) are taken: (0.8, 0.5), (1, 0.7), (1.5, 0.9), and (2, 1). The standard deviations considered in the simulation study are σ = 0.1, 1.0, 10. For each choice of γ, σ², and n, the experiment was replicated 1000 times by generating new error terms. The average SMSE was computed using the following formula:

SMSE(β̂) = (1/1000) Σⱼ₌₁¹⁰⁰⁰ (β̂ⱼ − β)′(β̂ⱼ − β)
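A minimal Monte Carlo sketch under the generating scheme above; the true β, the LTE sign convention from (5), and all parameter values are illustrative assumptions, and the function name is ours.

```python
import numpy as np

def simulate_smse_lte(n=50, p=4, gam=0.99, sigma=1.0, k=1.0, d=0.7,
                      reps=1000, seed=0):
    """Average SMSE of the LTE over `reps` replications with fresh errors."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=(n, p + 1))
    X = np.sqrt(1 - gam**2) * z[:, :p] + gam * z[:, [p]]  # collinear design
    beta = np.ones(p)                                     # true coefficients (assumed)
    total = 0.0
    for _ in range(reps):
        y = X @ beta + rng.normal(scale=sigma, size=n)
        b_ols = np.linalg.solve(X.T @ X, X.T @ y)
        b_lte = np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y - d * b_ols)
        total += (b_lte - beta) @ (b_lte - beta)
    return total / reps
```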

Let us consider the LTE, JLTE, and MJLTE and compute their respective estimated MSE values under different levels of multicollinearity. According to the simulation results shown in Tables 4 and 5, for the LTE, JLTE, and MJLTE there was a general increase in the estimated MSE values with increasing levels of multicollinearity. Moreover, an increasing level of multicollinearity also led to an increase in the estimated MSE of the estimators for fixed d and k.

In Table 4, the MSE values of the estimators corresponding to different values of d are given for k = 0.70. For all values of d, the smallest MSE value belongs to the MJLTE estimator, which is also the estimator least affected by multicollinearity according to the MSE criterion.

In Table 5, the MSE values of the estimators corresponding to different values of d are given for k = 1. Again, for all values of d, the smallest MSE value belongs to the MJLTE estimator, which is least affected by multicollinearity according to the MSE criterion.

We can see that the MJLTE is much better than the competing estimators when the explanatory variables are severely collinear. Moreover, in all cases considered, the MJLTE has smaller estimated MSE values than both the LTE and the JLTE.

7. Conclusion

In this chapter, we combined the LTE and JLTE estimators to introduce a new estimator, which we call the MJLTE. Combining the ideas underlying the LTE and JLTE estimators enabled us to create a new estimator for the regression coefficients of a linear regression model affected by multicollinearity. Moreover, the use of the jackknife procedure enabled us to produce an estimator with a smaller bias. We compared our MJLTE with its originators, the LTE and the JLTE, in terms of MSEM and found that the MJLTE has a smaller variance than both. Thus, the MJLTE is superior to both the LTE and the JLTE under certain conditions.


© 2018 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
