1 Introduction
Empirical likelihood methods have been studied extensively in the past three decades as a reliable and flexible alternative to the parametric likelihood. Among their many attractive properties, the most celebrated are the $\chi^2$ asymptotic distribution of the empirical likelihood ratio and the availability of a Bartlett correction that improves the coverage accuracy of the corresponding confidence region. Despite these desirable properties, which parallel those of parametric likelihood methods, there is a serious drawback: the empirical likelihood confidence region undercovers in small sample or high dimensional settings. This undesirable feature was noticed in early work, for example by Owen (1988) and Tsao (2004). For independent data, various methods have been proposed to address this issue, and they fall roughly into two main areas. One is to improve the approximation to the limiting distribution of the log empirical likelihood ratio. For this approach, among others, Owen (1988) proposed a bootstrap calibration, and DiCiccio (1991) showed that by scaling the empirical likelihood ratio with a Bartlett factor, which can be estimated from the data, the coverage error can be improved from $O(n^{-1})$ to $O(n^{-2})$. The other approach is to tackle the convex hull constraint, which was first studied in Tsao (2004). There are three major methods in this approach, namely the penalized empirical likelihood of Bartolucci (2007), the adjusted empirical likelihood of Chen (2008), and the extended empirical likelihood of Tsao (2013). These three methods have since been extended and refined by subsequent research. To mention a few results related to this paper, Zhang (2016) extended the penalized empirical likelihood to a blockwise method for weakly dependent data.
Emerson (2009) proposed to modify the placement of the extra point in the adjusted empirical likelihood in order to remove the upper bound on the adjusted likelihood ratio statistic; if not removed, this bound can cause the confidence region to cover the whole parameter space in some situations. Liu (2010) showed that by choosing the tuning parameter in the adjusted empirical likelihood in a specific way, it is possible to achieve the Bartlett corrected coverage error rate. Chen (2013) studied the finite sample properties of the adjusted empirical likelihood and discussed a generalized version of the method proposed in Emerson (2009). It is worth pointing out that most of the existing work has focused on independent data; the aforementioned Zhang (2016) was the first paper to address the convex hull constraint for weakly dependent data, using the penalized empirical likelihood under the blockwise framework introduced to empirical likelihood by Kitamura (1997). Recently, Piyadi Gamage (2017) studied the adjusted empirical likelihood for time series models under the frequency domain empirical likelihood framework introduced by Nordman (2006). In this paper, we extend the adjusted empirical likelihood to weakly dependent data under the blockwise empirical likelihood framework; hereafter, we call it the adjusted blockwise empirical likelihood (ABEL). In contrast to the nonstandard pivotal asymptotic distribution obtained in Zhang (2016), we show that the ABEL preserves the much celebrated $\chi^2$ asymptotic distribution. In addition, we show that the tuning parameter can be selected such that the ABEL achieves the Bartlett corrected coverage error rate with weakly dependent data.

This paper is organized as follows. Section 2 gives a brief introduction to the empirical likelihood method and its convex hull constraint problem; basic notations used throughout the paper are also established in this section. Section 3 introduces the ABEL along with its asymptotic properties. In section 4, we show that the tuning parameter associated with the adjustment can be used to achieve the Bartlett corrected error rate for weakly dependent data. In section 5, we demonstrate the performance of the ABEL method through a simulation study and discuss possible ways to calculate the tuning parameter. Proofs of the theorems are presented in section 7.
2 Empirical Likelihood and the convex hull constraint
In this section, we establish the notation used in this paper by presenting a brief review of the empirical likelihood method, the adjusted empirical likelihood, and the blockwise empirical likelihood. For a comprehensive review of the empirical likelihood methodology, we refer to Owen (2001). Let $X_1,\dots,X_n$ be i.i.d. samples from an unknown distribution $F$, and let $\theta \in \Theta \subset \mathbb{R}^p$ be the parameter of interest. Let $g(X,\theta)$ be an $r$-dimensional estimating function such that $E\{g(X,\theta_0)\} = 0$, where $\theta_0$ is the true parameter. One of the advantages of the empirical likelihood is that more information about the parameter can be incorporated through the estimating equations; in other words, we can have $r \ge p$. The profile empirical likelihood for $\theta$ is defined as
\[
L(\theta) = \sup\Big\{\prod_{i=1}^{n} w_i \;:\; w_i \ge 0,\ \sum_{i=1}^{n} w_i = 1,\ \sum_{i=1}^{n} w_i\, g(X_i,\theta) = 0\Big\}. \qquad (1)
\]
Then by a standard Lagrange multiplier argument, the maximizing weights are
\[
w_i = \frac{1}{n}\,\frac{1}{1+\lambda^{\top} g(X_i,\theta)}, \qquad i = 1,\dots,n,
\]
where $\lambda = \lambda(\theta)$ is the Lagrange multiplier that satisfies the equation
\[
\frac{1}{n}\sum_{i=1}^{n} \frac{g(X_i,\theta)}{1+\lambda^{\top} g(X_i,\theta)} = 0.
\]
In the rest of this paper, we write $g_i$ in place of $g(X_i,\theta)$ unless the dependency on $\theta$ needs to be explicitly stressed. The profile empirical likelihood ratio is defined as
\[
R(\theta) = \prod_{i=1}^{n} n w_i = \prod_{i=1}^{n} \frac{1}{1+\lambda^{\top} g_i}.
\]
Under regularity conditions, for example those in Qin (1994), it can be shown that
\[
-2\log R(\theta_0) \xrightarrow{d} \chi^2_p \qquad (2)
\]
(stated here for the just-identified case $r = p$).
Then an asymptotic $100(1-\alpha)\%$ empirical likelihood confidence region for $\theta_0$ can be found as
\[
\big\{\theta : -2\log R(\theta) \le \chi^2_{p,1-\alpha}\big\}, \qquad (3)
\]
where $\chi^2_{p,1-\alpha}$ is the $1-\alpha$ quantile of the $\chi^2_p$ distribution.
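To make the definitions above concrete, the following is a small self-contained sketch (our own illustration, not code from the paper; the function name and the step-halving safeguard are our choices) that evaluates $-2\log R(\theta)$ for a scalar mean, i.e. $g(X_i,\theta) = X_i - \theta$, by Newton's method on the Lagrange multiplier equation:

```python
# Hypothetical sketch: profile empirical likelihood for a scalar mean.
# We solve  sum_i g_i / (1 + lambda * g_i) = 0  by Newton's method and
# return  -2 log R(theta) = 2 * sum_i log(1 + lambda * g_i).
import math

def neg2_log_el_ratio(x, theta, iters=50):
    g = [xi - theta for xi in x]
    if min(g) >= 0 or max(g) <= 0:        # theta outside the convex hull
        return float("inf")               # R(theta) = 0 by convention
    lam = 0.0
    for _ in range(iters):
        # f(lam) = sum g_i/(1+lam g_i);  f'(lam) = -sum g_i^2/(1+lam g_i)^2
        f = sum(gi / (1 + lam * gi) for gi in g)
        fp = -sum(gi**2 / (1 + lam * gi)**2 for gi in g)
        step = f / fp
        # keep 1 + lam * g_i > 0 for all i (feasibility of the weights)
        while any(1 + (lam - step) * gi <= 0 for gi in g):
            step /= 2
        lam -= step
    return 2 * sum(math.log(1 + lam * gi) for gi in g)
```

At the sample mean the multiplier is zero and the statistic vanishes; it grows as $\theta$ moves toward the hull boundary and is infinite outside the hull, matching (2)–(3).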
Results (2) and (3) are the most celebrated properties of the empirical likelihood, which parallel its parametric counterpart. Despite these advantages, it was noted early on by Owen (1988) that the empirical likelihood confidence region systematically undercovers. Tsao (2004) studied the least upper bounds on the coverage probabilities, focusing on the fact that $\log R(\theta)$ is finite if and only if the origin is in the convex hull of $\{g(X_1,\theta),\dots,g(X_n,\theta)\}$, and then showed that the empirical likelihood confidence region coverage probability is bounded above by the probability of the origin being in this convex hull. Further, Tsao (2004) demonstrated that this upper bound is affected by the sample size and the parameter dimension in such a way that if the parameter dimension is comparable to the sample size, then the upper bound goes to 0 as the sample size goes to infinity. This not only explains the root cause of the undercoverage issue, but also shows the severity of the upper bound problem when the finite sample size is small compared to the parameter dimension. Since then, various researchers have tried to address the convex hull constraint directly in order to improve the coverage probability. As mentioned in the introduction, three major approaches have been proposed, and in this paper we focus on the adjusted empirical likelihood of Chen (2008).
The idea of the adjusted empirical likelihood is most easily demonstrated and understood by considering the two dimensional population mean. That is, we have $\theta = E(X) \in \mathbb{R}^2$, and the estimating function becomes $g(X_i,\theta) = X_i - \theta$. In this setup, (1) simplifies to
\[
L(\theta) = \sup\Big\{\prod_{i=1}^{n} w_i \;:\; w_i \ge 0,\ \sum_{i=1}^{n} w_i = 1,\ \sum_{i=1}^{n} w_i (X_i - \theta) = 0\Big\}. \qquad (4)
\]
Notice that $L(\theta)$ in (4) is well defined if and only if $\theta$ lies in the convex hull of $\{X_1,\dots,X_n\}$. If the hypothesised $\theta$ is not in the convex hull, then there is no solution to (4), and by convention $\log R(\theta)$ is set to $-\infty$. As a result, even when $\theta$ is the true population mean, it will not be included in the empirical likelihood confidence region, because $-2\log R(\theta) = \infty$ exceeds the $\chi^2$ critical value at any level $\alpha$. The first plot in Figure 1 shows 15 sample points whose population mean, represented by the red dot, falls outside their convex hull. For this sample, using the empirical likelihood defined in (4), $\log R(\theta)$ at the red dot is set to $-\infty$. Even though this point is the true population mean and is very close to the convex hull, setting $\log R(\theta) = -\infty$ provides no information about its plausibility. In other words, one cannot compare two hypothesised values outside the convex hull using the nonadjusted empirical likelihood, because their log likelihood ratios will both be $-\infty$, even if one is much closer to the convex hull than the other.
To mitigate this problem, Chen (2008) proposed to add an extra point
\[
g_{n+1}(\theta) = -\frac{a_n}{n}\sum_{i=1}^{n} g_i(\theta)
\]
to the original data, where $a_n > 0$ is a tuning parameter, and then use the $n+1$ points to construct the empirical likelihood. They called this the adjusted empirical likelihood,
\[
L_a(\theta) = \sup\Big\{\prod_{i=1}^{n+1} w_i \;:\; w_i \ge 0,\ \sum_{i=1}^{n+1} w_i = 1,\ \sum_{i=1}^{n+1} w_i\, g_i(\theta) = 0\Big\}.
\]
The intuition behind the adjustment can be seen from the plot on the right of Figure 1: the adjusted convex hull always contains the origin by design, so the situation of forcing $\log R(\theta) = -\infty$ is avoided. Moreover, it has been shown in Chen (2008) that if the hypothesised parameter is close to or inside the convex hull, then the adjustment alters the empirical likelihood by a negligible amount. Thus, at the true population mean, the asymptotic $\chi^2$ distribution still holds, and a confidence region can be constructed accordingly.
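A one-dimensional toy sketch of this mechanism (our own code, with `a_n` just a placeholder tuning value, not the paper's choice): the extra point always sits on the opposite side of the origin from the average of the $g_i$, so the augmented hull contains the origin by construction.

```python
# Hypothetical 1D illustration of the extra-point adjustment:
# g_extra = -(a_n / n) * sum_i g_i has the opposite sign of the mean of g.
def adjusted_scores(g, a_n):
    g_extra = -(a_n / len(g)) * sum(g)
    return g + [g_extra]

def hull_contains_origin(g):
    # in one dimension the convex hull is just the interval [min, max]
    return min(g) <= 0 <= max(g)
```

If the original scores already straddle the origin, the extra point changes nothing qualitatively; if they do not, it restores feasibility of the constraint in (4).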
To relax the independence assumption on the data, modifications need to be made to the empirical likelihood method in (1). There are roughly two major approaches: block-based methods in the time domain and periodogram-based methods in the frequency domain. For a review of these methods, we refer to Nordman (2014) and the references therein. In this paper, we use the block-based method introduced by Kitamura (1997) to work with weakly dependent data, where we assume that $X_1,\dots,X_n$ is a sample from a stationary stochastic process $\{X_t\}$ that satisfies the strong mixing condition
\[
\alpha(k) = \sup_{A \in \mathcal{F}_{-\infty}^{0},\, B \in \mathcal{F}_{k}^{\infty}} |P(A \cap B) - P(A)P(B)| \to 0 \quad \text{as } k \to \infty, \qquad (5)
\]
where $\mathcal{F}_{a}^{b}$ denotes the $\sigma$-algebra generated by $\{X_t : a \le t \le b\}$. Further, assume that the mixing coefficients decay polynomially,
\[
\alpha(k) = O(k^{-\eta}), \qquad (6)
\]
for some constant $\eta > 0$. The reason that the empirical likelihood in (1) is inadequate for weakly dependent data is also easily seen by considering the population mean as in (4). The asymptotic $\chi^2$ distribution for $-2\log R(\theta_0)$ is derived via the approximation $-2\log R(\theta_0) \approx n\,\bar{g}^{\top} S_n^{-1} \bar{g}$, where $\bar{g} = n^{-1}\sum_{i} g_i$ and $S_n = n^{-1}\sum_{i} g_i g_i^{\top}$. For i.i.d. data, $S_n$ provides a proper scale to the score $\bar{g}$, so that $n\,\bar{g}^{\top} S_n^{-1} \bar{g}$ is asymptotically $\chi^2$ distributed. However, for dependent data, $S_n$ is inadequate to scale the score because it does not take the autocorrelations among the data into account. As a remedy, Kitamura (1997) proposed to use blocks of data in place of individual data points. To review this blocking method, let $M$, $D$ and $Q$ be the block length, the gap between block starting points, and the number of blocks, respectively, where $Q = \lfloor (n-M)/D \rfloor + 1$. Define the blockwise estimating equations as
\[
T_i(\theta) = \frac{1}{M}\sum_{j=1}^{M} g\big(X_{(i-1)D+j},\theta\big), \qquad i = 1,\dots,Q.
\]
Then the blockwise empirical likelihood is defined as
\[
L_B(\theta) = \sup\Big\{\prod_{i=1}^{Q} w_i \;:\; w_i \ge 0,\ \sum_{i=1}^{Q} w_i = 1,\ \sum_{i=1}^{Q} w_i\, T_i(\theta) = 0\Big\}, \qquad (7)
\]
and the log blockwise empirical likelihood ratio is defined as
\[
\log R_B(\theta) = -\sum_{i=1}^{Q} \log\big(1+\lambda^{\top} T_i(\theta)\big).
\]
Under assumptions A.1–A.8 in section 3, it can be shown that
\[
\frac{n}{MQ}\,\big\{-2\log R_B(\theta_0)\big\} \xrightarrow{d} \chi^2_p. \qquad (8)
\]
The proof of the above result (8) can be found in Kitamura (1997) and Owen (2001). It should be noted that the choice of the block length $M$ is important to the performance of the blockwise empirical likelihood method. Various authors have studied and proposed ways to select $M$, each with its respective advantages and limitations. For examples on selecting $M$, we refer to Nordman (2014), Nordman (2013), Kim (2013), and Zhang (2014). The study of the optimal block choice is, however, beyond the scope of this paper.
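The blocking scheme itself is simple to state in code. The sketch below is our own illustration (symbols $M$, $D$, $Q$ as in the review above; the function name is ours): with gap equal to the block length the blocks are nonoverlapping, and with gap one they are fully overlapping.

```python
# Sketch of the data blocking used by the blockwise empirical likelihood:
# block length M, gap D between block starting points, Q blocks in total.
def blockwise_scores(g, M, D):
    """Average the estimating-function values g over each block."""
    n = len(g)
    Q = (n - M) // D + 1  # number of blocks that fit in the sample
    return [sum(g[i * D:i * D + M]) / M for i in range(Q)]
```

For example, six observations with $M = 2$, $D = 2$ yield three nonoverlapping block means, while $M = 2$, $D = 1$ yields five overlapping ones.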
3 Adjusted blockwise empirical likelihood
It is apparent that the blockwise EL method (7) also suffers from the convex hull constraint, which impedes proper coverage probability in finite samples. In this section, we propose to adjust the blockwise empirical likelihood and examine its effectiveness in improving the coverage probability for weakly dependent data. The theoretical appeal of the adjusted empirical likelihood for i.i.d. data is that it preserves the $\chi^2$ asymptotic distribution while at the same time breaking the convex hull constraint. Moreover, Liu (2010) showed that for i.i.d. data the coverage error of the adjusted empirical likelihood confidence region can be reduced from $O(n^{-1})$ to $O(n^{-2})$. Furthermore, simulation studies in Chen (2008), Emerson (2009), and Liu (2010) showed that the adjusted empirical likelihood provides significant improvements over the original empirical likelihood in terms of coverage probability. In the rest of this section, we show that all of the desirable properties of the adjusted empirical likelihood mentioned above are preserved by the adjusted blockwise empirical likelihood for weakly dependent data.
Since the convex hull used in the blockwise empirical likelihood is formed by the blockwise estimating functions $T_1,\dots,T_Q$, the extra estimating function used for the adjustment will naturally be constructed from the $T_i$, in contrast to the individual data points used in the i.i.d. setting. With this we define the adjustment as
\[
T_{Q+1}(\theta) = -\frac{a_n}{Q}\sum_{i=1}^{Q} T_i(\theta) = -a_n\,\bar{T}(\theta), \qquad (9)
\]
where $\bar{T}(\theta) = Q^{-1}\sum_{i=1}^{Q} T_i(\theta)$ and $a_n > 0$ is a tuning parameter. Now we construct the adjusted blockwise empirical likelihood with the $Q+1$ blockwise estimating functions as follows:
\[
L_{AB}(\theta) = \sup\Big\{\prod_{i=1}^{Q+1} w_i \;:\; w_i \ge 0,\ \sum_{i=1}^{Q+1} w_i = 1,\ \sum_{i=1}^{Q+1} w_i\, T_i(\theta) = 0\Big\}, \qquad (10)
\]
and the log adjusted blockwise empirical likelihood ratio is then
\[
\log R_{AB}(\theta) = -\sum_{i=1}^{Q+1} \log\big(1+\lambda_a^{\top} T_i(\theta)\big), \qquad (11)
\]
where $\lambda_a = \lambda_a(\theta)$ is the vector of Lagrange multipliers that satisfies
\[
\frac{1}{Q+1}\sum_{i=1}^{Q+1} \frac{T_i(\theta)}{1+\lambda_a^{\top} T_i(\theta)} = 0.
\]
Before stating the asymptotic distribution of $\log R_{AB}$ in (11), we first list the regularity conditions needed. A detailed explanation of these assumptions can be found in Kitamura (1997). They are generalizations of the assumptions used in the i.i.d. setting, for example in Qin (1994), to the weakly dependent setting. The main points of these assumptions concern the continuity and differentiability of the estimating function around the true parameter of interest, so that the remainder terms in the Taylor expansion of the log empirical likelihood ratio are controlled and the dominating term converges to a $\chi^2$ distribution.

A.1 The parameter space $\Theta$ is compact.

A.2 $\theta_0$ is the unique root of $E\{g(X,\theta)\} = 0$.

A.3 For sufficiently small $\delta$, $E\big\{\sup_{\theta' \in B(\theta,\delta)} \|g(X,\theta')\|\big\} < \infty$, where $B(\theta,\delta)$ is a small ball around $\theta$ with radius $\delta$.

A.4 If a sequence $\{\theta_m\} \subset \Theta$ converges to some $\theta$, then $g(x,\theta_m)$ converges to $g(x,\theta)$ except on a null set, which may vary with $\theta$.

A.5 $\theta_0$ is an interior point of $\Theta$, and $g(x,\theta)$ is twice continuously differentiable at $\theta_0$.

A.6 $E\|g(X,\theta_0)\|^{2+\delta} < \infty$ for some $\delta > 0$.

A.7 $\sum_{k \ge 1} \alpha(k)^{\delta/(2+\delta)} < \infty$, for the mixing coefficients $\alpha(k)$ defined in the strong mixing condition (5) and the $\delta$ in A.6.

A.8 $E\big|g^{(j)}(X,\theta_0)\big|^{\nu} < \infty$ for each $j$ and $\nu$ large enough, where $g^{(j)}$ is the $j$th component of $g$, and $\Sigma = \lim_{n\to\infty} \operatorname{Var}\big\{n^{-1/2}\sum_{t=1}^{n} g(X_t,\theta_0)\big\}$ is of full rank.
With these assumptions, the following theorem shows that the adjusted blockwise empirical likelihood ratio has an asymptotic $\chi^2$ distribution.
Theorem 1. Suppose assumptions A.1–A.8 hold and the tuning parameter satisfies $a_n = o_p(Q^{2/3})$. Then, as $n \to \infty$,
\[
\frac{n}{MQ}\,\big\{-2\log R_{AB}(\theta_0)\big\} \xrightarrow{d} \chi^2_p.
\]
As in the nonadjusted BEL of Kitamura (1997), the factor $n/(MQ)$ accounts for the overlap between blocks; if the blocks do not overlap, then $n/(MQ) = 1$. The growth condition on $a_n$ in theorem 1 shows that the size of the tuning parameter is controlled by the block length $M$ and the sample size $n$. This should be expected, since $M$ is allowed to grow with $n$ and the asymptotic result is obtained under this growing block length setting. Moreover, in the blockwise empirical likelihood we usually have $Q = O(n/M)$, where $Q$ is the number of blocks; therefore the condition ties $a_n$ to $Q$. The intuition is that the adjustment is made on the blocked estimating equations, so the size of the tuning parameter should be controlled by the number of blocks instead of only by the sample size. By theorem 1, a $100(1-\alpha)\%$ asymptotic confidence region based on the ABEL ratio can be constructed as
\[
\Big\{\theta : \frac{n}{MQ}\,\big\{-2\log R_{AB}(\theta)\big\} \le \chi^2_{p,1-\alpha}\Big\}. \qquad (12)
\]
By the design of the extra point in (9), $\log R_{AB}(\theta)$ is well defined for any $\theta$. As a consequence, there is no upper bound imposed by the convex hull on the coverage probability of (12).
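This contrast can be demonstrated with a self-contained toy implementation (our own code and naming, not the paper's; scalar mean, nonoverlapping blocks): the unadjusted blockwise ratio is infinite whenever $\theta$ falls outside the hull of the block means, while the adjusted ratio stays finite at the same $\theta$.

```python
# Toy comparison of blockwise EL with and without the extra point from (9).
# Scalar mean, nonoverlapping blocks of length M; adjust=a_n appends the
# extra block score -a_n * Tbar, which keeps the origin inside the hull.
import math

def neg2_log_bel(x, theta, M, adjust=None):
    Q = len(x) // M
    T = [sum(x[i * M:(i + 1) * M]) / M - theta for i in range(Q)]
    if adjust is not None:
        T.append(-(adjust / Q) * sum(T))     # extra block score
    if min(T) >= 0 or max(T) <= 0:           # origin outside the hull
        return float("inf")
    lam = 0.0
    for _ in range(100):                     # Newton on the Lagrange equation
        f = sum(t / (1 + lam * t) for t in T)
        fp = -sum(t * t / (1 + lam * t) ** 2 for t in T)
        step = f / fp
        while any(1 + (lam - step) * t <= 0 for t in T):
            step /= 2                        # keep the weights feasible
        lam -= step
    return 2 * sum(math.log(1 + lam * t) for t in T)
```

For a $\theta$ far outside the hull of block means, `neg2_log_bel(x, theta, M)` returns infinity while `neg2_log_bel(x, theta, M, adjust=a)` returns a finite, and still large, value, so points outside the hull remain comparable.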
As with any method that involves a tuning parameter, the choice of $a_n$ in practice is delicate, and it may depend on the statistical task that one wishes to tackle. In the i.i.d. setting, Liu (2010) studied the choice of $a_n$ through an Edgeworth expansion of the adjusted empirical likelihood ratio, and found that if $a_n$ is specified in relation to the Bartlett correction factor, then the adjusted empirical likelihood confidence region can achieve the Bartlett error rate. In the next section, we show that Bartlett correction is also possible for the adjusted blockwise empirical likelihood with weakly dependent data.
4 Tuning Parameter for Bartlett Corrected Error Rate
Being Bartlett correctable is an important feature of the parametric likelihood ratio confidence region, whose coverage probability error can be decreased from $O(n^{-1})$ to $O(n^{-2})$ by a Bartlett correction. Like its parametric counterpart, the empirical likelihood for the smooth function model is also Bartlett correctable, as shown by DiCiccio (1991). Further, Chen (2007) showed that this property also holds for the empirical likelihood with general estimating equations. For weakly dependent data, Kitamura (1997) showed that the blockwise empirical likelihood for the smooth function model is Bartlett correctable, so that the coverage probability error can again be reduced by a Bartlett factor; the errors are larger than those for i.i.d. data because of the data blocking used to handle the dependence structure. In this section, we show through an Edgeworth expansion of the adjusted blockwise empirical likelihood ratio that a tuning parameter can be found such that the adjusted confidence region attains the Bartlett corrected coverage error for general estimating equations. Here we assume the nonoverlapping blocking scheme; in other words, $D = M$ and $Q = \lfloor n/M \rfloor$. In addition to the mixing conditions (5) and (6), we also assume a faster decay of the mixing coefficients, in terms of the quantities defined in (5) and (6). We also need to assume the validity of the Edgeworth expansion for sums of dependent data, which Götze (1983) established by assuming the existence of higher moments, a conditional Cramér condition, and that the random processes can be approximated by other strong mixing processes with exponentially decaying mixing coefficients satisfying a Markov type condition. For more details on these assumptions, we refer to Kitamura (1997), Bhattacharya (1978), and Götze (1983). To simplify the notation in deriving the tuning parameter, assume that
\[
\operatorname{Var}\{g(X,\theta_0)\} = I_p,
\]
where $I_p$ is the identity matrix; otherwise we can replace $g$ by $\operatorname{Var}\{g\}^{-1/2} g$. Let $g^{j}$ denote the $j$th component of $g$. For indices $j, k, l, \dots$, define
\[
\alpha^{jk\cdots} = E\big\{g^{j}(X,\theta_0)\, g^{k}(X,\theta_0) \cdots\big\}. \qquad (13)
\]
Notice that $\alpha^{jk} = \delta^{jk}$, the Kronecker delta, where $\delta^{jk} = 1$ if $j = k$ and $\delta^{jk} = 0$ otherwise. Further, we let bars denote the corresponding sample averages.
With the above notation, it can be shown, following the calculations in Liu (2010) and DiCiccio (1988), that
\[
-2\log R(\theta_0) = n\,R^{\top} R + O_p(n^{-3/2}), \qquad (14)
\]
where $R = R_1 + R_2 + R_3$ with $R_k = O_p(n^{-k/2})$, and each component $R_k^{j}$, for $j = 1,\dots,p$, is a polynomial in the sample moments introduced above. Here the convention of summation over repeated indices is used. Equation (14) is the so-called signed-root decomposition of $-2\log R(\theta_0)$. Since we add an extra blocked estimating equation (9) in the adjusted blockwise empirical likelihood, the signed-root decomposition of the adjusted ratio is slightly affected by the adjustment, and this is exactly where we can leverage the tuning parameter to achieve the Bartlett corrected coverage error rate. By the derivation shown in section 7, the signed-root decomposition of $-2\log R_{AB}(\theta_0)$ is
(15) 
With the decomposition defined above, in order to derive the tuning parameter in equation (15) that yields the Bartlett error rate, we define the counterpart of (13) under dependence as follows: for a sequence of integers,
Now if we define $a_n$ as follows, then the next theorem shows that the adjusted blockwise empirical likelihood confidence region (12) achieves the Bartlett corrected coverage error rate.
Let
(16) 
where, for ,
(17) 
with , and similarly for . The quantities , are defined as
The remaining quantities are defined in the same way as their counterparts above, except that the superscripts are exchanged.
Theorem 2.
Assume that conditions A.1–A.8 in section 3 hold under the nonoverlapping blocking scheme, that the strengthened bound on the mixing coefficients stated at the beginning of this section holds, and that the assumptions for the Edgeworth expansion for sums of dependent data are satisfied. If $a_n$ is as in (16), then as $n \to \infty$, the confidence region (12) attains the Bartlett corrected coverage error rate.
In practice, the unknown population quantity $a_n$ in (16) can be replaced by an estimate evaluated at $\hat{\theta}$, where $\hat{\theta}$ is the maximum blockwise empirical likelihood estimator of $\theta$. The quantity $a_n$ is composed of various population moments, which can be replaced by their corresponding sample moments to obtain an estimator $\hat{a}_n$. Moreover, the estimated $\hat{a}_n$ may be positive or negative. If it is positive, then the convex hull constructed with the extra point will always contain the origin. However, if $\hat{a}_n$ is negative, then the extra point $-\hat{a}_n \bar{T}$ lies on the same side of the origin as $\bar{T}$; as a result, the convex hull with the new point will not contain the origin if the original convex hull does not. To avoid this situation, if $\hat{a}_n < 0$, we add two extra points instead, chosen so that one of them lies on the opposite side of the origin from $\bar{T}$, which guarantees that the origin is in the new convex hull, while their sum equals the single-point adjustment. Since the sum of the two extra points matches $-\hat{a}_n \bar{T}$, adding them has the same effect as adding one point with tuning parameter $\hat{a}_n$ in terms of obtaining the Bartlett coverage probability.
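One plausible explicit construction for the two-point fix just described (our notation; the paper's exact display is not recoverable from the text) is

```latex
T_{Q+1} = -b_1\,\bar{T}, \qquad
T_{Q+2} = (b_1 - \hat{a}_n)\,\bar{T}, \qquad b_1 > 0,
```

so that $T_{Q+1}$ always lies on the opposite side of the origin from $\bar{T}$, guaranteeing hull containment, while $T_{Q+1} + T_{Q+2} = -\hat{a}_n\,\bar{T}$ matches the single-point adjustment (9) with tuning parameter $\hat{a}_n$.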
5 Simulation
In this section, we examine the numerical properties of the adjusted blockwise empirical likelihood through a simulation study. We compare the confidence regions constructed by the adjusted blockwise empirical likelihood (10) with several tuning parameters to the one constructed by the nonadjusted blockwise empirical likelihood (7). The data are simulated from an AR(1) model
\[
X_t = \rho\, X_{t-1} + \varepsilon_t,
\]
where the $\varepsilon_t$ are i.i.d. $d$-dimensional multivariate standard normal random variables and $\rho$ is a diagonal matrix with a common AR coefficient on the diagonal. The parameter of interest is the population mean. In order to see how the data dependence affects the performance of the methods, we simulate data over a range of AR coefficients. We also simulate over several dimensions $d$ to see how the parameter dimension affects the performance, and we consider two sample sizes. For each scenario, we calculate the blockwise empirical likelihood ratio at a range of block lengths in order to examine the effects of the block choice. In addition, we also use the progressive blocking method proposed by Kim (2013), which does not require fixing a block length. For each scenario, we simulate a number of data sets and calculate the likelihood ratio for each data set at the true mean. The coverage probability is then calculated as the proportion of times the likelihood ratio is less than the theoretical $\chi^2$ quantile at levels 0.90, 0.95 and 0.99. The likelihood ratios are calculated by the blockwise empirical likelihood without adjustment (BEL), the adjusted blockwise empirical likelihood with four choices of tuning parameter (ABEL_log, ABEL_0.5, ABEL_0.8, ABEL_1), and with $a_n$ given in (16) (ABEL_bart). The Bartlett tuning parameter (16) is estimated by the plug-in estimator, which is then bias corrected by a blockwise bootstrap. The full simulation results are shown in Table 2 in the appendix. Table 1 shows a snapshot of Table 2 for a subset of the AR(1) coefficients. $M^{*}$ is the block length that gives the best coverage rates for each particular method, where "pro" indicates that the progressive block method gives the best result. It can be seen that for negative AR coefficients, BEL performed well and at least one of the adjusted BEL variants matched or surpassed the BEL performance. As the AR coefficient becomes positive, the BEL starts to show its vulnerability to undercoverage, and this becomes worse as the dimension increases. In contrast, the adjusted BEL still provides adequate coverage.
This phenomenon, where the BEL does not suffer as severe undercoverage for negative AR coefficients as it does for positive ones, exemplifies the fact that the coverage probability is bounded above by the probability of the convex hull containing the origin. When the AR coefficient is negative, consecutive points are likely to fall on opposite sides of the origin, so the resulting convex hull is likely to contain the origin and does not impose an upper bound on the coverage probability. Whereas for a positive coefficient, especially one close to 1, consecutive points are likely to be close to each other; thus the probability that the resulting convex hull contains the origin is small.

n =  n =
ρ  d  Method  M*  0.90  0.95  0.99  M*  0.90  0.95  0.99
0.2  3  BEL  3  0.90  0.94  0.98  6  0.89  0.94  0.99 
0.2  3  ABEL_log  3  0.94  0.97  0.99  pro  0.90  0.96  1.00 
0.2  3  ABEL_0.8  3  0.91  0.95  0.98  7  0.90  0.94  0.99 
0.2  3  ABEL_1  14  0.90  0.95  0.99  7  0.90  0.95  0.99 
0.2  3  ABEL_bart  14  0.91  0.94  0.97  pro  0.90  0.95  0.99 
0.2  3  BEL  3  0.82  0.89  0.95  9  0.88  0.93  0.98 
0.2  3  ABEL_log  4  0.89  0.95  1.00  6  0.90  0.95  0.99 
0.2  3  ABEL_0.8  3  0.83  0.90  0.96  9  0.88  0.94  0.98 
0.2  3  ABEL_1  14  0.88  0.95  0.99  9  0.88  0.94  0.99 
0.2  3  ABEL_bart  12  0.93  0.96  0.98  8  0.90  0.95  1.00 
0.5  3  BEL  5  0.68  0.77  0.89  10  0.82  0.87  0.95 
0.5  3  ABEL_log  5  0.89  0.97  1.00  13  0.91  0.96  1.00 
0.5  3  ABEL_0.8  16  0.74  0.88  0.97  10  0.83  0.89  0.96 
0.5  3  ABEL_1  14  0.87  0.95  0.99  10  0.83  0.89  0.96 
0.5  3  ABEL_bart  14  0.92  0.95  0.97  pro  0.90  0.96  0.99 
0.5  4  BEL  4  0.64  0.72  0.85  9  0.77  0.85  0.95 
0.5  4  ABEL_log  16  0.92  0.94  0.97  11  0.88  0.95  1.00 
0.5  4  ABEL_0.8  14  0.72  0.86  0.95  9  0.79  0.87  0.96 
0.5  4  ABEL_1  14  0.86  0.92  0.97  9  0.79  0.87  0.96 
0.5  4  ABEL_bart  13  0.91  0.94  0.96  pro  0.91  0.97  0.99 
0.8  2  BEL  9  0.58  0.67  0.76  16  0.77  0.85  0.94 
0.8  2  ABEL_log  7  0.87  0.98  1.00  16  0.88  0.95  1.00 
0.8  2  ABEL_0.8  16  0.72  0.86  0.98  16  0.80  0.86  0.94 
0.8  2  ABEL_1  16  0.85  0.95  0.99  16  0.80  0.87  0.95 
0.8  2  ABEL_bart  4  0.91  0.94  0.97  13  0.90  0.96  1.00 
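The hull-containment explanation above can be probed with a toy Monte Carlo (all parameter values, names, and sizes below are our own choices, not the paper's): estimate the probability that the convex hull of the nonoverlapping block means contains the true mean (zero) for a univariate AR(1), comparing a negative and a strongly positive AR coefficient.

```python
# Toy hull-containment experiment in the spirit of the discussion above.
import random

def simulate_ar1(n, rho, rng):
    x, path = 0.0, []
    for _ in range(n):
        x = rho * x + rng.gauss(0.0, 1.0)
        path.append(x)
    return path

def hull_covers_zero(x, M):
    Q = len(x) // M
    T = [sum(x[i * M:(i + 1) * M]) / M for i in range(Q)]  # block means
    return min(T) < 0 < max(T)

def hull_rate(rho, reps=500, n=40, M=5, seed=0):
    rng = random.Random(seed)
    hits = sum(hull_covers_zero(simulate_ar1(n, rho, rng), M)
               for _ in range(reps))
    return hits / reps
```

For persistent positive dependence the block means tend to share a sign, so the hull contains the origin far less often than for negative dependence, matching the pattern in Table 1.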
6 Conclusion
Originally proposed to improve the coverage probability of the empirical likelihood confidence region for i.i.d. data, the adjusted empirical likelihood is shown in this paper to be effective in improving the coverage probability when combined with the blocking method for weakly dependent data. In particular, we have shown that the ABEL possesses a $\chi^2$ asymptotic property similar to its nonadjusted counterpart. Moreover, we have shown that the adjustment tuning parameter can be used to achieve the asymptotic Bartlett corrected coverage error rate. The tuning parameter that gives the Bartlett corrected rate involves higher moments that need to be estimated in practice, and how best to estimate it needs further study. In the simulation study, we used a blockwise bootstrap to correct the bias in estimating the tuning parameter by plugging in the sample moments. The results show that the adjusted BEL performs comparably to the nonadjusted BEL when the nonadjusted BEL performs well, and it outperforms the nonadjusted BEL when the nonadjusted BEL suffers from the undercoverage issue. Our bootstrap bias corrected tuning parameter performs well most of the time, but it is sometimes outperformed by other choices of the tuning parameter. As mentioned above, the optimal way to estimate the tuning parameter will be addressed in future studies.
7 Proofs
Proof of Theorem 1.
The first step in proving theorem 1 is to bound the order of the Lagrange multiplier $\lambda_a$, where we use the subscript $a$ to emphasize that this is the Lagrange multiplier for the adjusted empirical likelihood. First, we note that $\lambda_a$ solves the following equation:
\[
\frac{1}{Q+1}\sum_{i=1}^{Q+1} \frac{T_i(\theta_0)}{1+\lambda_a^{\top} T_i(\theta_0)} = 0. \qquad (18)
\]
Now, define $\rho = \|\lambda_a\|$ and $u = \lambda_a/\rho$. Multiplying both sides of equation (18) by $u^{\top}$, and recalling that feasibility of the weights requires $1 + \lambda_a^{\top} T_i > 0$ for all $i$, we have
(19) 
where the remaining quantities are as defined above. By the law of large numbers, the central limit theorem, and the argument in Owen (1990) and Kitamura (1997), the needed almost sure convergences of the sample moments of the $T_i$ hold, as also shown in Kitamura (1997). Then, we can deduce from (19), together with the assumption on the growth of $a_n$, that
Therefore, $\rho = \|\lambda_a\|$ is asymptotically small, which in particular means that $\lambda_a = o_p(1)$.
The next step is to express $\lambda_a$ in terms of averages of the $T_i$. Notice that equation (18) can be written as the sum of two parts:
(20) 
where the first part on the right hand side can be written as
The last part in (20) is
Now, we have the relationship
(21) 
The final step proceeds through a Taylor expansion of the adjusted blockwise empirical likelihood ratio, which can be written in two parts as
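In outline (a standard expansion, sketched here in our notation, with $\lambda_a$ the adjusted multiplier and $T_i$ the blockwise scores; the precise remainder bookkeeping follows Kitamura (1997)), the two parts are the quadratic approximation and a remainder:

```latex
-2\log R_{AB}(\theta_0)
  = 2\sum_{i=1}^{Q+1}\log\bigl(1+\lambda_a^{\top}T_i\bigr)
  = 2\lambda_a^{\top}\sum_{i=1}^{Q+1}T_i
    \;-\; \lambda_a^{\top}\Bigl(\sum_{i=1}^{Q+1}T_iT_i^{\top}\Bigr)\lambda_a
    \;+\; r_n,
```

where the remainder $r_n$ is controlled by $\|\lambda_a\|^3 \max_i \|T_i\|^3$ and is asymptotically negligible by the bound on $\lambda_a$ obtained in the first step.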