A new method of joint nonparametric estimation of probability density and its support
Taku Moriyama

TL;DR
This paper introduces a novel joint nonparametric method for estimating probability density and its support, effectively addressing boundary bias issues in kernel density estimation, including multivariate cases.
Contribution
It proposes a boundary detection technique that eliminates boundary bias in kernel density estimation, extending to multivariate scenarios with an improved estimator.
Findings
Successfully detects boundaries in density estimation
Eliminates boundary bias in univariate and multivariate cases
Provides a more accurate density and support estimation
Abstract
In this paper we propose a new method of joint nonparametric estimation of a probability density and its support. As is well known, the nonparametric kernel density estimator has a "boundary bias problem" when the support of the population density is not the whole real line. To avoid the unknown boundary effects, our estimator detects the boundary and eliminates the boundary bias of the estimator simultaneously. Moreover, we discuss an extension to a simple multivariate case, and propose an improved estimator free from the unknown boundary bias.
Table 1. Procedure of the proposed method when the left endpoint of the support is known.
0. Assume that the support of the density is the interval from $l_0$ to $u_0$, where $l_0$ is known, and $u_0$ is unknown but bounded. Check the assumptions of Theorem 1.
1. Select a boundary bias reduction method, and choose the kernel and bandwidth following the method. $\widehat{f}_{u}^{\dagger}$ denotes the boundary bias free estimator of the density function whose support is from $l_0$ to $u$, and $\widehat{F}_{u}^{\dagger}$ denotes the cumulative distribution estimator.
2. Solve the estimating equation for $u$ (see Section 3).
3. Set the solution as $\widehat{u}$, and output the boundary-adjusted estimators $\widehat{f}_{\widehat{u}}^{\dagger}$ and $\widehat{F}_{\widehat{u}}^{\dagger}$.
Table 4. Procedure of the proposed method in the multivariate case.
0. Assume that the support of the density is a hyper-rectangle, that is, a product of $d$ intervals whose $j$-th factor is the interval from $l_{j,0}$ to $u_{j,0}$. Check the assumptions of Theorem 2.
1. Select a boundary bias reduction method, and choose the kernel and bandwidth following the method. For every $j$, the marginal distribution and density estimators are taken to be free from the boundary bias caused by the bounded support of the $j$-th marginal.
2. Solve the estimating equation for $\boldsymbol{u}_j = (l_j, u_j)^T$ for every $j$ (see Section 3).
3. Set the solution as $\widehat{\boldsymbol{u}}_j$, and output the boundary-adjusted joint estimators, combining the marginals with a nonparametric copula estimator (see (14)).
A new method of joint nonparametric estimation of probability density and its support
Taku MORIYAMA
Graduate School of Mathematics, Kyushu University
Abstract
We propose a new method for simultaneous nonparametric estimation of a probability density and its support. As is well known, a nonparametric kernel density estimator has a ‘boundary bias problem’ when the support of the population density does not cover the whole real line. If we know the support exactly, we may reduce the bias by using a boundary bias reduction method. When the support is unknown, there may still be a boundary problem that needs to be taken care of. We insist on the necessity of estimating the support and propose a new method of nonparametric density estimation that is free from the boundary bias in such a case.
The proposed method detects the boundary and gives the modified density estimator simultaneously. Although it is natural to estimate the support by using the sample maximum (and minimum) and modify the density estimator as in Hall & Park (2002), the new method is numerically superior in the sense of an integrated squared error in the boundary region. Moreover, we discuss an extension to a simple multivariate case and propose a new method for estimating the joint probability density. Using the idea of nonparametric copula estimation, this method combines the marginal densities estimated by the proposed single variable method. The obtained joint density estimator is also boundary bias free.
Keywords: Boundary bias; kernel estimator; integrated squared error; support estimator
1 Introduction
Random phenomena are described uniquely by their probability distributions, and estimates of those distributions give us much information. Nonparametric density estimation not only provides a graphical overview of the shape of a distribution, but also enables us to infer a variety of interesting things. These include estimation of functionals of the density and statistical testing, for example, goodness of fit with a parametric model, equality of two distributions, symmetry, multi-modality, and so on. Rosenblatt (1956) proposed a smooth nonparametric estimator $\widehat{f}$ of the density function $f$, and it has been extensively investigated (Tsybakov (2009)). The kernel cumulative distribution estimator is given by integration of $\widehat{f}$; we will begin by introducing both.
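Let $X_1, X_2, \cdots, X_n$ be independently and identically distributed (i.i.d.) random variables with distribution function $F$, and let $f$ be the density function. The kernel density estimator $\widehat{f}$ and distribution estimator $\widehat{F}$ are given by
$$\widehat{f}(x) = \frac{1}{nh}\sum_{i=1}^{n} K\left(\frac{x - X_i}{h}\right)$$
and
$$\widehat{F}(x) = \frac{1}{n}\sum_{i=1}^{n} W\left(\frac{x - X_i}{h}\right),$$
where $K$ is a symmetric kernel function, $W$ is the integral of $K$, and $h$ is a bandwidth which satisfies $h \to 0$ and $nh \to \infty$. We call $\widehat{f}$ the ‘naive density estimator’, and $\widehat{F}$ the ‘naive distribution estimator’ similarly.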
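The two naive estimators can be sketched in a few lines of Python. This is only an illustration: the Epanechnikov kernel is an assumed choice, and the function names are hypothetical rather than taken from the paper.

```python
def epanechnikov(t):
    """Epanechnikov kernel K(t) = 0.75 * (1 - t^2) on [-1, 1], zero elsewhere."""
    return 0.75 * (1.0 - t * t) if abs(t) <= 1.0 else 0.0


def epanechnikov_cdf(t):
    """W(t): integral of the Epanechnikov kernel from -infinity to t."""
    if t <= -1.0:
        return 0.0
    if t >= 1.0:
        return 1.0
    return 0.5 + 0.75 * t - 0.25 * t ** 3


def naive_density(x, sample, h):
    """Naive kernel density estimate at x with bandwidth h."""
    return sum(epanechnikov((x - xi) / h) for xi in sample) / (len(sample) * h)


def naive_distribution(x, sample, h):
    """Naive kernel distribution estimate at x with bandwidth h."""
    return sum(epanechnikov_cdf((x - xi) / h) for xi in sample) / len(sample)


# Toy usage on a small hand-written sample (illustrative values only).
sample = [0.1, 0.4, 0.45, 0.7, 0.95]
print(naive_density(0.5, sample, h=0.3))
print(naive_distribution(0.5, sample, h=0.3))
```

Because the distribution estimate is an average of integrated kernels, it is automatically the integral of the density estimate, which is the property the text relies on.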
Since the naive kernel estimators are sums of random variables, it is easy to obtain their moments. By making a change of variables and performing a Taylor expansion, if the support of $f$ is all of $\mathbb{R}$, we find that
[TABLE]
and similarly
[TABLE]
[TABLE]
under certain regularity conditions. However, when the support is not all of $\mathbb{R}$, the moments change, and $\widehat{f}$ loses consistency near the boundary of the support. This situation is known as the ‘boundary bias’ problem. When the support has a finite endpoint, the bias of $\widehat{f}$ at that boundary is
[TABLE]
We call the area , where is biased, the boundary region of the density . In addition, the order of the asymptotic bias of becomes larger: for ,
[TABLE]
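The loss of consistency at the boundary can be checked on a toy case. The sketch below assumes a Uniform(0, 1) population and the Epanechnikov kernel (both are illustrative assumptions, as is the function name), for which the expectation of the naive density estimator has a closed form: integrating $K((x-y)/h)/h$ over $y \in [0,1]$ gives $W(x/h) - W((x-1)/h)$.

```python
def kernel_cdf(t):
    """Integrated Epanechnikov kernel W(t)."""
    if t <= -1.0:
        return 0.0
    if t >= 1.0:
        return 1.0
    return 0.5 + 0.75 * t - 0.25 * t ** 3


def expected_naive_density_uniform(x, h):
    """Exact E[f_hat(x)] for the naive estimator when X ~ Uniform(0, 1)."""
    return kernel_cdf(x / h) - kernel_cdf((x - 1.0) / h)


h = 0.1
for x in (0.0, 0.05, 0.5, 1.0):
    # The true density equals 1 on [0, 1]; values below 1 are boundary bias.
    print(x, expected_naive_density_uniform(x, h))
```

In this toy case the expectation equals one half of the true density exactly at either endpoint, whatever the bandwidth below 1, and it reaches the true value only at distance $h$ or more from both endpoints.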
Remark 1
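In fact, the boundary bias at the right endpoint depends on the left limit $f(u_{0}-)$ rather than $f(u_{0})$. Therefore, it is not a problem of whether $f$ is left or right continuous at $u_{0}$. Hereafter, we will assume that $f$ is left continuous at the right endpoint (i.e. $f(u_{0}-)=f(u_{0})$) if $u_{0}$ is bounded, and that $f$ is right continuous at the left endpoint (i.e. $f(l_{0}+)=f(l_{0})$) if $l_{0}$ is bounded.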
If we know the support of $f$ exactly, we can reduce the bias by using a boundary bias reduction method. The basic bias reduction methods for $\widehat{f}$ include renormalization (Jones (1993)), reflection (Ćwik & Mielniczuk (1993)), asymmetric kernels (Chen (1999), Chen (2000), etc.), the generalized jackknife (Jones (1993), Terrell & Scott (1980)), the ‘direct’ method for reducing the bias to an arbitrary order (Bearse & Rilstone (2009)), and so on. It seems easy to obtain boundary bias free estimators of the cumulative distribution by integrating the modified kernel density estimators. However, the modified density estimators are not usually sums, and it is often difficult to represent the integral explicitly. That is why it is hard to obtain the modified cumulative distribution estimators. Recently, some papers have been published that focus on distribution function estimation. The boundary kernel method (Tenreiro (2013)) and the generalized reflection method (Kolácek & Karunamuni (2011)) can give boundary bias free estimators of $F$. We should note that all of them control the boundary effect which comes from the ‘known’ support.
When the support is unknown, however, the boundary problem is usually not handled well. This may be because there are almost no papers tackling the unknown boundary effect (except Hall & Park (2002)). In most ‘actual’ cases, the realized values are not large, and so the support must be smaller than $\mathbb{R}$. Therefore, we insist on the necessity of both estimating the support and eliminating the unknown boundary bias appropriately.
To overcome the unknown boundary effect, one can estimate the support and then modify the density estimator, treating the estimated support as the true one. In fact, Hall & Park (2002) proposed to replace the unknown upper bound of the support with the sample maximum $X_{(n)}$. We call the modified density estimator an ‘$X_{(n)}$-based estimator’. However, the boundary estimator $X_{(n)}$ is not always numerically best for a density estimation that utilizes a very different boundary bias reduction method. Here, we propose a new method for estimating the probability density and its support simultaneously. The boundary estimator depends on the boundary bias reduction method that we apply, and the proposed density estimator is numerically more accurate in the boundary region. In fact, the new method minimizes a loss function asymptotically.
Section 2 describes some of the basic boundary bias reduction methods for the case in which the support is known and simple. In Section 3, we describe the new method for estimating the population density which is free from the ‘unknown’ boundary bias and investigate its asymptotic properties. In addition, we confirm that some boundary bias reduction methods satisfy a condition which ensures the proposed estimator works well. In Section 4, we compare the proposed estimator with the naive kernel density estimator and the $X_{(n)}$-based estimator in the sense of the integrated squared error in the boundary region. Under suitable conditions on the boundary, the sample maximum $X_{(n)}$ and the new boundary estimator are consistent at different rates. However, the convergence rate does not affect the integrated error asymptotically; in fact, we demonstrate that the new density estimator performs better numerically. Moreover, in Section 5, we discuss an extension to the simple multivariate case in which the support is given by a hyper-rectangle. Using the idea of nonparametric copula estimation, we propose a new method for estimating the joint density. This method combines the marginals estimated by the single variable method (one-dimensional cases), and the obtained joint density estimator is boundary bias free. We study the proposed method by simulating a number of bivariate distribution estimations. The proofs are given in the appendices.
2 Boundary bias reduction methods
Let us assume that the support of the density function is known and given by . Ćwik & Mielniczuk (1993) discussed a ‘reflection’ method that reduces the boundary bias of the naive kernel density estimator. The estimator is given by
[TABLE]
for . The asymptotic bias is as follows: for ,
[TABLE]
where , and the asymptotic order of the variance is the same as that of the naive estimator . Thus, the estimator recovers its consistency in the boundary region. In addition, if , the bias becomes of order uniformly, where is the derivative of . From the integral of , we can derive the distribution estimator as follows:
[TABLE]
From Ćwik & Mielniczuk (1993), we can see that
[TABLE]
where , and the order of the variance is the same as that of the naive estimator.
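To make the reflection idea concrete, here is a minimal sketch of a textbook reflection-type density estimator on a known interval. It mirrors every kernel at both endpoints; this is an assumed generic variant and not necessarily the exact form used by Ćwik & Mielniczuk (1993). The names `reflection_density`, `lower`, and `upper` are hypothetical, and the evenly spaced "sample" is used only to keep the output deterministic.

```python
def epanechnikov(t):
    """Epanechnikov kernel on [-1, 1]."""
    return 0.75 * (1.0 - t * t) if abs(t) <= 1.0 else 0.0


def naive_density(x, sample, h):
    """Naive kernel density estimate at x."""
    return sum(epanechnikov((x - xi) / h) for xi in sample) / (len(sample) * h)


def reflection_density(x, sample, h, lower, upper):
    """Reflection-type estimate on the known support [lower, upper]."""
    if x < lower or x > upper:
        return 0.0
    total = 0.0
    for xi in sample:
        total += epanechnikov((x - xi) / h)                    # original point
        total += epanechnikov((x - (2.0 * lower - xi)) / h)    # mirrored at lower
        total += epanechnikov((x - (2.0 * upper - xi)) / h)    # mirrored at upper
    return total / (len(sample) * h)


# Evenly spaced points stand in for a Uniform(0, 1) sample (true density 1).
sample = [(i + 0.5) / 200 for i in range(200)]
print(naive_density(0.0, sample, 0.1))                  # close to 0.5
print(reflection_density(0.0, sample, 0.1, 0.0, 1.0))   # close to 1.0
```

The mirrored points return the kernel mass that would otherwise fall outside the support, which is why the estimate at the endpoint moves from about half the true density back to about the true density.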
Tenreiro (2013) proposed the ‘boundary kernel’ method for reducing the bias of the kernel cumulative distribution estimator. The method changes the kernel function in the boundary region. It has a simple form, given as follows:
[TABLE]
The author of that paper shows that for ,
[TABLE]
when the support of is bounded. We can also derive the following density estimator from the derivative of and calculate the following bias,
[TABLE]
If is zero, the bias becomes of order .
3 New joint estimator of probability density and its support in one dimension
Let us assume that both boundaries $\boldsymbol{u}_{0}=(l_{0},u_{0})^{T}$ are unknown but bounded. We do not assume that the interval is open, half-open, or closed. Now we put $\boldsymbol{u}=(l,u)^{T}$, and $\widehat{f}_{\boldsymbol{u}}^{\dagger}$ denotes the kernel-type, boundary bias free estimator of the density whose support is the interval from $l$ to $u$. We propose the following new method for estimating the probability density and its support simultaneously. Let us define the estimator $\boldsymbol{u}=\widehat{\boldsymbol{u}}$ of $\boldsymbol{u}_{0}$ as the solution of the following equation,
[TABLE]
and the density estimator as $\widehat{f}_{\widehat{\boldsymbol{u}}}^{\dagger}$ (and the distribution estimator as $\widehat{F}_{\widehat{\boldsymbol{u}}}^{\dagger}$), where
[TABLE]
and
[TABLE]
In this section, all functions in bold type denote two-dimensional vectors of the same scalar-valued function applied componentwise (e.g., $\boldsymbol{F}(\boldsymbol{x})=(F(x_{1}),F(x_{2}))^{T}$). Note that $\boldsymbol{u}_{0}$ stands for the true value and $\boldsymbol{u}$ is a variable. Intuitively, the estimating equation comes from the properties of the maximum and minimum of a sample from the uniform distribution on the interval $[0,1]$. This is because $\boldsymbol{F}((X_{1},X_{2})^{T})\stackrel{d}{=}\boldsymbol{Z}$ (in distribution) and $\boldsymbol{F}(\boldsymbol{X}_{(1,n)})\approx\boldsymbol{c}_{n}$ hold, where $\boldsymbol{Z}$ is the two-dimensional uniform random variable on $[0,1]^{2}$. Moreover, the bias-corrected sample maximum of the uniform distribution is known as the minimum-variance unbiased estimator of the upper boundary value of the uniform distribution (the sample minimum is similar).
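The constants behind this intuition can be made explicit with a short calculation. Assume $U_1,\dots,U_n$ are i.i.d. uniform on $[0,1]$ (the symbols $U_i$ are introduced here only for illustration). The sample maximum $U_{(n)}$ has density $n t^{n-1}$ on $[0,1]$, and $1-U_{(n)}$ has the same distribution as the sample minimum $U_{(1)}$, so that

```latex
\begin{align*}
E\bigl[U_{(n)}\bigr] &= \int_0^1 t \, n t^{n-1} \, dt = \frac{n}{n+1}, \\
E\bigl[U_{(1)}\bigr] &= 1 - E\bigl[U_{(n)}\bigr] = \frac{1}{n+1}, \\
E\Bigl[\tfrac{n+1}{n}\, U_{(n)}\Bigr] &= 1 .
\end{align*}
```

Since $F(X_i)$ is uniform on $[0,1]$ for a continuous $F$, this suggests that $F(X_{(1)})$ and $F(X_{(n)})$ are close to $1/(n+1)$ and $n/(n+1)$ respectively. Whether these are exactly the components of $\boldsymbol{c}_{n}$ in the estimating equation is an assumption of this sketch.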
Remark 2
The solution $\boldsymbol{u}=\boldsymbol{X}_{(1,n)}$ (that is, the $\boldsymbol{X}_{(1,n)}$-based estimator) can be viewed as the solution of the following equation,
[TABLE]
where $\boldsymbol{c}=(0,1)^{T}$.
In fact, the new estimator asymptotically coincides with the minimizer of a local expected prediction error. Let us define the following error as the loss function
[TABLE]
where $\widehat{F}_{\boldsymbol{u}}^{\dagger}$ is based on the sample and $I(\cdot)$ is the indicator function ($1$ if the event occurs, $0$ if it fails). Let $\boldsymbol{q}_{n}$ be two areas near the two boundaries which contain $X_{(1)}$ and $X_{(n)}$, respectively, as realized values. Then, replacing $F$ by the estimator and using the fact that $\boldsymbol{F}(\boldsymbol{X}_{(1,n)})\approx\boldsymbol{c}_{n}$, we can see that the minimizer of the error asymptotically coincides with $\widehat{\boldsymbol{u}}$ as follows:
[TABLE]
The solution $\boldsymbol{u}=\widehat{\boldsymbol{u}}$ of the estimating equation is not given by an explicit formula, and its properties depend on the bias reduction method. Next, we state the general properties of the proposed estimator and study some applications.
3.1 Asymptotic properties
To establish the asymptotic properties of the new estimators, we utilize the asymptotic theory of estimation. To view the solution $\boldsymbol{u}=\widehat{\boldsymbol{u}}$ as an estimator asymptotically, we make the following assumptions.
Assumption 1
For all large enough integers $n$, there is a set $\boldsymbol{\Theta}$ whose interior includes $\boldsymbol{u}_{0}$.
Assumption 2
There is a function $\boldsymbol{\Psi}_{\boldsymbol{u},n}$ which satisfies the following for all large enough integers $n$:
$\boldsymbol{\Psi}_{\boldsymbol{u},n}$ is given by the following sum
[TABLE]
where
[TABLE]
and
[TABLE]
The following stochastic expansion holds:
[TABLE]
where the residual $\boldsymbol{R}_{\boldsymbol{u},n}$ satisfies
[TABLE]
uniformly for $\boldsymbol{u}\in\boldsymbol{\Theta}$ and $\boldsymbol{x}\in\mathbb{R}^{2}$, and $\boldsymbol{1}=(1,1)^{T}$.
The next convergence holds
[TABLE]
where $\boldsymbol{F}_{\boldsymbol{u}}$ is a function such that $\boldsymbol{F}_{\boldsymbol{u}_{0}}$ equals the underlying distribution function $\boldsymbol{F}$.
Assumption 3
The following holds for all large enough integers $n$:
$$\Psi_{\boldsymbol{u}_{0},n}(x)=F(x)+b_{F}^{\dagger}(x,n,h)+O_{P}(n^{-1/2})$$
holds uniformly in $x$, where $b_{F}^{\dagger}$ satisfies $b_{F}^{\dagger}(\boldsymbol{u}_{0},n,h)=o(n^{-1/2}\boldsymbol{1})$.
$$\frac{\partial}{\partial x}\Psi_{\boldsymbol{u}_{0},n}(x)=f(x)+b_{f}^{\dagger}(x,n,h)+O_{P}((nh)^{-1/2})$$
holds uniformly in $x$, where
[TABLE]
* For every ,*
[TABLE]
holds uniformly for \mbox{\boldmathu}\in\mbox{\boldmath\Theta} and (that is, every component converges some uniformly bounded constants), where
[TABLE]
Assumption 4
For all large enough integers $n$, the following holds:
$E[\boldsymbol{\psi}_{i,\boldsymbol{u},n}(\boldsymbol{u}_{0})]-\boldsymbol{c}_{n}=\boldsymbol{0}$ has the unique solution $\boldsymbol{u}=\boldsymbol{u}_{n}^{*}$ which satisfies $\boldsymbol{u}_{n}^{*}\to\boldsymbol{u}_{0}$.
For any $\eta>0$, there is $\kappa_{\eta,\delta_{n},n}$ which ensures
$$P\left[\inf_{\boldsymbol{u}:\|\boldsymbol{u}-\boldsymbol{u}_{0}\|>\eta}\|\boldsymbol{\Psi}_{\boldsymbol{u},n}(\boldsymbol{u}_{0})-\boldsymbol{c}_{n}\|>\kappa_{\eta,\delta_{n},n}\right]>1-\delta_{n},$$
where .
$\left(E\left[\frac{\partial}{\partial\boldsymbol{u}}\boldsymbol{\psi}_{i,\boldsymbol{u}_{n}^{*},n}(X_{i},\boldsymbol{u}_{0})\right]\right)^{-1}$ exists, i.e., the matrix is nonsingular.
$E[\|\boldsymbol{\psi}_{i,\boldsymbol{u},n}(\boldsymbol{u}_{0})\|^{2}]$ is bounded in some neighborhood of $\boldsymbol{u}=\boldsymbol{u}_{n}^{*}$ and is continuous at $\boldsymbol{u}_{n}^{*}$.
Assumptions 2 and 3 admit the following asymptotic expansion of $\widehat{\boldsymbol{F}}_{\boldsymbol{u}}^{\dagger}(\boldsymbol{X}_{(1,n)})$ at the sum $\boldsymbol{\Psi}_{\boldsymbol{u},n}$,
[TABLE]
and the boundary bias reduction of the distribution and density estimators $\widehat{F}_{\boldsymbol{u}}^{\dagger}$ and $\widehat{f}_{\boldsymbol{u}}^{\dagger}$. Assumption 4 ensures the consistency of $\widehat{\boldsymbol{u}}$, on the basis of the asymptotic theory of estimation.
Under these assumptions, we can show the consistency of $\widehat{\boldsymbol{u}}$ and the bias reduction of $\widehat{F}_{\widehat{\boldsymbol{u}}}^{\dagger}$ and $\widehat{f}_{\widehat{\boldsymbol{u}}}^{\dagger}$.
Theorem 1
Given Assumptions 1 - 4 and that
[TABLE]
we have
[TABLE]
Proof. See the Appendices.
Remark 3
When $f(l_{0})>0$ and $f(u_{0})>0$, $\boldsymbol{X}_{(1,n)}$ is an $n$-consistent estimator of $\boldsymbol{u}_{0}$ and clearly satisfies the assumptions of Theorem 1.
Remark 4
If we know either $l_{0}$ or $u_{0}$, we can see that Theorem 1 still holds by replacing all the vector values with the scalar ones, for example, $\boldsymbol{u}_{0}$ with the scalar $u_{0}$ or $l_{0}$, $\boldsymbol{c}_{n}$ with its corresponding scalar component, and $\boldsymbol{X}_{(1,n)}$ with $X_{(n)}$ or $X_{(1)}$, and so on. Table 1 shows the main procedure of the proposed method if we know the left endpoint $l_{0}$ of the support.
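The scalar procedure with a known left endpoint can be sketched as follows. Several choices here are assumptions made for illustration and are not taken from the paper: the bias reduction step uses a generic reflection-type distribution estimator with the Epanechnikov kernel, the target value for the estimator at the sample maximum is taken to be $n/(n+1)$, the root is bracketed between the sample maximum and the sample maximum plus $h/2$, and the sample maximum is returned when no sign change is found. All function names are hypothetical.

```python
def kernel_cdf(t):
    """Integrated Epanechnikov kernel W(t)."""
    if t <= -1.0:
        return 0.0
    if t >= 1.0:
        return 1.0
    return 0.5 + 0.75 * t - 0.25 * t ** 3


def reflection_cdf(x, sample, h, lower, upper):
    """Integral from `lower` to x of a reflection-type density estimate
    that mirrors each kernel at both endpoints of [lower, upper]."""
    total = 0.0
    for xi in sample:
        for centre in (xi, 2.0 * lower - xi, 2.0 * upper - xi):
            total += kernel_cdf((x - centre) / h) - kernel_cdf((lower - centre) / h)
    return total / len(sample)


def estimate_upper_bound(sample, h, lower, tol=1e-10):
    """Solve F_u(X_(n)) = n / (n + 1) for the candidate upper endpoint u by bisection."""
    n = len(sample)
    x_max = max(sample)
    target = n / (n + 1.0)  # assumed target value for the upper boundary

    def gap(u):
        return reflection_cdf(x_max, sample, h, lower, u) - target

    # Beyond x_max + h/2 the mass mirrored at u no longer reaches x_max.
    lo, hi = x_max, x_max + 0.5 * h
    if gap(hi) > 0.0:
        return x_max  # assumed fallback: no sign change inside the bracket
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Evenly spaced points stand in for a Uniform(0, 1) sample (true upper endpoint 1).
sample = [(i + 0.5) / 200 for i in range(200)]
u_hat = estimate_upper_bound(sample, h=0.1, lower=0.0)
print(max(sample), u_hat)
```

In this toy run the solution lies slightly above the sample maximum, which illustrates how the boundary estimate is tied to the chosen bias reduction method rather than being the sample maximum itself.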
Now, we apply the boundary kernel method and reflection method described above. Since $P[\|\boldsymbol{X}_{(1,n)}-\boldsymbol{u}_{0}\|>h]=o((\sqrt{n}h)^{-1})$ holds under the assumptions of Theorem 1, we define the boundary estimator $\boldsymbol{u}=\widehat{\boldsymbol{u}}^{[BK]}$ as the solution of the following equation,
[TABLE]
where
[TABLE]
The vector-valued equation is divided into two scalar-valued equations, such that one includes only , and the other includes only . We can easily see that is an increasing function for and is a decreasing function for . For any fixed , we have
[TABLE]
and
[TABLE]
Therefore, we can see that the solution is unique and is also unique. Thus, it is easy to confirm that the minimizer \widehat{\mbox{\boldmathu}}^{[BK]} is unique and satisfies . Now, we can deduce the following result.
Corollary 1
Let us assume that and exists and is continuous. When holds and , we have
[TABLE]
By applying the reflection method, we define the boundary estimator $\boldsymbol{u}=\widehat{\boldsymbol{u}}^{[R]}$ and the distribution estimator $\widehat{F}_{\widehat{\boldsymbol{u}}}^{[R]}$ in the same way:
[TABLE]
When is large enough, equation is divided into two mutually independent parts in the sense of and as same as the boundary kernel method. It is easy to see that is an increasing function for and that is a decreasing function for . For any fixed , we have
[TABLE]
and
[TABLE]
The equation
[TABLE]
does not always hold, but we can prove the following asymptotic properties under some assumptions. We can get the following result in a similar manner.
Corollary 2
Let us assume that exists and is continuous and that holds. Then, if Assumption 1 holds and , we have
[TABLE]
4 Simulation study in Case 1
We compared the proposed estimator numerically with the naive estimator and -based estimator. Hall & Park(2002) also discusses bias reduction of in case ; however, the modification does not affect the first-order asymptotics of the density estimation since is consistent in such case. Therefore, we did not take it into account here. Table 2 shows the averaged values of , defined as
[TABLE]
of the density estimator in the boundary region . In the table, denotes the of the standard kernel density estimator . denotes the of and denotes -based estimator whose upper bound of support is given by using the boundary kernel method. denotes and is the -based estimator using the boundary kernel method. The number of repetitions was for all cases. The kernel functions were the Epanechnikov, and the bandwidths were the same and chosen by cross-validation as to which made ‘Naive’ asymptotically best. , , and show the in the same way. The kernel function was a Gaussian.
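A boundary-region integrated squared error of this kind can be approximated numerically. The sketch below assumes the criterion is the integral of the squared difference between an estimate and the true density over an interval adjacent to the boundary; the exact region used in the simulations is not reproduced here, and `boundary_ise` is a hypothetical helper name.

```python
def boundary_ise(estimate, truth, left, right, steps=1000):
    """Midpoint-rule approximation of the integral of
    (estimate(x) - truth(x))^2 over [left, right]."""
    width = (right - left) / steps
    total = 0.0
    for k in range(steps):
        x = left + (k + 0.5) * width
        diff = estimate(x) - truth(x)
        total += diff * diff
    return total * width


# Toy usage: a flat estimate of 0.5 against a true density of 1
# over an assumed boundary region [0.9, 1.0].
print(boundary_ise(lambda x: 0.5, lambda x: 1.0, 0.9, 1.0))
```

Averaging this quantity over repeated samples gives a Monte Carlo summary that can be compared across estimators.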
Remark 5
, , hold, where denotes the density function of the beta distribution with parameters . Note that if , holds. Table 3 shows the boundary bias of the kernel estimators.
We can see from these numerical studies that when , converges more slowly than , which has consistency; however, the proposed is mostly better than . Although the boundary effect is not so great when , the proposed estimator is comparable with the other estimators. From the above results, we can claim that the proposed method is better in terms of both the theoretical and numerical local loss, at least when the support seems to be compact. Compared with the reflection method, the boundary kernel method seems to be numerically superior, especially when .
5 Extension to a simple multivariate case
5.1 Estimating multivariate joint density with unknown but simple form of support
We want to apply the new method to improve kernel joint probability density estimation, kernel regression and so on; however, it is impossible to apply it to multivariate cases directly. This is because the support of a multivariate joint density is given by a much more general formula and the problem completely changes. In addition, most of the boundary bias reduction methods have only been thoroughly investigated in simple cases, such as , and so on.
Let $\boldsymbol{X}_{1},\boldsymbol{X}_{2},\cdots,\boldsymbol{X}_{n}$ be independently and identically distributed (i.i.d.) random variables with distribution function $G$ and density function $g$. $\boldsymbol{X}_{i}=(X_{1,i},\cdots,X_{d,i})^{T}$ denotes a $d$-dimensional value, and hereafter, we will assume that the support of $g$ is given by an unknown hyper-rectangle.
Assumption 5
There is bounded which satisfies
[TABLE]
where the $j$-th factor is a bounded and open, half-open, or closed interval from $l_{j,0}$ to $u_{j,0}$.
We can treat the marginal distributions in one dimension, even those whose support is unknown, so let us construct the joint probability density estimator, combining the marginals with the copula defined as follows:
[TABLE]
where $\boldsymbol{x}=(x_{1},\cdots,x_{d})^{T}$, $\accentset{\diamond}{\boldsymbol{G}}(\boldsymbol{x})=(G_{1}(x_{1}),\cdots,G_{d}(x_{d}))^{T}$ and $G_{j}$ is the marginal distribution of the $j$-th component. The following equivalent form of the copula is known:
[TABLE]
where $\boldsymbol{v}=(v_{1},\cdots,v_{d})^{T}$, and the existence and uniqueness are known as Sklar’s theorem. The copula function describes the dependence structure between the marginal variables. Scaillet and Fermanian (2002) proposed the following estimator in the bivariate case ($d=2$):
[TABLE]
where is the kernel marginal distribution estimator. In addition, Chen & Huang (2007) discusses reduction of the boundary bias of the copula estimator; however, we can not apply it to the case of because the boundary problem gets rather complicated. Instead, using an appropriate estimator and the idea of copula, we define the following general form of the joint probability density estimator free from the boundary bias
[TABLE]
where
[TABLE]
$\widehat{G}_{j,\widehat{\boldsymbol{u}}_{j}}^{\dagger}$ is the proposed estimator of the marginal distribution as given in Section 3 and satisfies the following assumption.
Assumption 6
For every and , exists and is continuous. In addition, all of the following hold for any \mbox{\boldmathx}\in\mathbb{R}^{d}:
[TABLE]
Remark 6
The following product-type of the boundary kernel and the reflection estimators of the marginals satisfy Assumption 6 and give an appropriate distribution estimation,
[TABLE]
where $n^{-1}\sum_{i=1}^{n}W_{\boldsymbol{u}_{j,0}}^{[\cdot]}(x_{j},X_{j,i},n,h)=\widehat{G}_{\boldsymbol{u}_{j,0}}^{[\cdot]}(x_{j})$ is the marginal distribution estimator of $G_{j}$ and ‘$\cdot$’ is ‘$BK$’ or ‘$R$’.
The critical part of the modification is changing the unknown support (hypercube) to . Under the assumption of support, we can remove the boundary bias.
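The reason a copula lets boundary-corrected marginals be combined into a joint density can be seen from a short derivation. Write $G$ and $g$ for the joint distribution and density, $G_j$ and $g_j$ for the $j$-th marginal distribution and density, and assume the copula $C$ has a density $c(\boldsymbol{v}) = \partial^d C(\boldsymbol{v}) / \partial v_1 \cdots \partial v_d$ (this notation is introduced only for this sketch). Differentiating $G(\boldsymbol{x}) = C(G_1(x_1), \ldots, G_d(x_d))$ by the chain rule gives

```latex
g(\boldsymbol{x})
  = \frac{\partial^d}{\partial x_1 \cdots \partial x_d}
    C\bigl(G_1(x_1), \ldots, G_d(x_d)\bigr)
  = c\bigl(G_1(x_1), \ldots, G_d(x_d)\bigr) \prod_{j=1}^{d} g_j(x_j).
```

Each factor $g_j(x_j)$ vanishes outside the $j$-th marginal support, so the joint density vanishes outside the hyper-rectangle. Roughly speaking, this is why correcting each marginal at its own estimated endpoints can carry over to the joint estimator.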
Theorem 2
Let us assume that for every marginal distribution , Assumptions 1 - 4 hold. In addition, we assume that for every ,
[TABLE]
holds, where $\boldsymbol{X}_{j,(1,n)}=(X_{j,(1)},X_{j,(n)})^{T}$ and $\boldsymbol{u}_{j,0}=(l_{j,0},u_{j,0})^{T}$. Under Assumptions 5 and 6, we have
[TABLE]
where
[TABLE]
$b_{G_{j}}^{\dagger}(\boldsymbol{x},n,h)$ is the bias of the marginal distribution estimation $\widehat{G}_{\boldsymbol{u}_{j},j}^{\dagger}(\boldsymbol{x})$ and $B_{g}^{\dagger}(\boldsymbol{x},n,h)$ is defined similarly.
Proof. See the appendices.
Table 4 summarizes the multivariate version of the proposed method.
Acknowledgement
The author thanks Prof. Y. Maesono, Faculty of Mathematics, Kyushu University, for his valuable comments.
6 Appendices: Some proofs
Proof of Theorem 1. Under the assumptions of Theorem 1, we can prove that the following equation holds by using the asymptotic expansions:
[TABLE]
Therefore, \mbox{\boldmathu}=\widehat{\mbox{\boldmathu}} can be viewed as an estimator, and we can use asymptotic theory. Using assumptions and of Assumption 4, we find that
[TABLE]
and
[TABLE]
Combining them and using assumption of Assumption 4 for any , we have
[TABLE]
Next, let us expand around in the estimating equation,
[TABLE]
where $\widetilde{\boldsymbol{u}}$ is a random variable between $\widehat{\boldsymbol{u}}$ and $\boldsymbol{u}_{0}$. Accordingly, we find that
[TABLE]
where
[TABLE]
holds. Therefore, we obtain
[TABLE]
and thus
[TABLE]
Next, expanding $\widehat{\boldsymbol{u}}$ around $\boldsymbol{u}_{0}$, we have
[TABLE]
The expectation is given by
[TABLE]
To derive the variance term, we perform another asymptotic expansion as follows:
[TABLE]
Then, we have the following four results:
[TABLE]
From the above results, we can see that
[TABLE]
and that
[TABLE]
In the same way, we can prove that
[TABLE]
Proof of Theorem 2 From Theorem 1, the following asymptotic expansion holds under the assumptions:
[TABLE]
Then, it is easy to see that Theorem 2 follows.
- [1] Bearse, P., & Rilstone, P. (2009). Higher order bias reduction of kernel density and density derivative estimation at boundary points. In Nonparametric Econometric Methods, pages 319–331. Emerald Group Publishing Limited.
- [2] Chen, S. X. (1999). Beta kernel estimators for density functions. Computational Statistics & Data Analysis, 31(2), 131–145.
- [3] Chen, S. X. (2000). Probability density function estimation using gamma kernels. Annals of the Institute of Statistical Mathematics, 52(3), 471–480.
- [4] Chen, S. X., & Huang, T. M. (2007). Nonparametric estimation of copula functions for dependence modelling. Canadian Journal of Statistics, 35(2), 265–282.
- [5] Ćwik, J., & Mielniczuk, J. (1993). Data-dependent bandwidth choice for a grade density kernel estimate. Statistics & Probability Letters, 16(5), 397–405.
- [6] Hall, P., & Park, B. U. (2002). New methods for bias correction at endpoints and boundaries. Annals of Statistics, 1460–1479.
- [7] Jones, M. C. (1993). Simple boundary correction for kernel density estimation. Statistics and Computing, 3(3), 135–146.
- [8] Kolácek, J., & Karunamuni, R. J. (2011). A generalized reflection method for kernel distribution and hazard functions estimation. Journal of Applied Probability and Statistics, 6(2), 73–85.
