Maximum pseudo-likelihood estimation based on estimated residuals in copula semiparametric models
Marek Omelka, Šárka Hudecová, Natalie Neumeyer

TL;DR
This paper investigates the maximum pseudo-likelihood estimation of copula models with residual-based data, demonstrating asymptotic equivalence to unobserved error-based estimators under certain conditions, and exploring limitations via simulations.
Contribution
It establishes the asymptotic properties of residual-based pseudo-likelihood estimators in copula models and examines their performance when regularity conditions fail.
Findings
Residual-based estimator is asymptotically equivalent to error-based estimator under regularity.
Simulation shows poor behavior of pseudo-likelihood estimator when assumptions are violated.
Moment estimation of copula parameters can be preferable in irregular cases.
Table 1. Clayton copula: bias, SD and RMSE (multiplied by 100) on the Kendall's tau scale. The three groups of columns correspond to the three sample sizes in increasing order; rows "inov" are the oracle estimators, and the labels (i)–(v) refer to the list of estimators in Section 3.1.

| τ | margins | estim | bias | SD | RMSE | bias | SD | RMSE | bias | SD | RMSE |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.50 | inov | (i) | 0.03 | 5.54 | 5.54 | 0.00 | 1.64 | 1.64 | 0.01 | 0.53 | 0.53 |
| | | (iii) | 0.33 | 4.90 | 4.91 | 0.01 | 1.49 | 1.49 | 0.00 | 0.48 | 0.48 |
| | N+E | (ii) | 1.25 | 5.62 | 5.76 | 0.27 | 1.64 | 1.67 | 0.05 | 0.53 | 0.53 |
| | | (iv) | 3.91 | 5.54 | 6.78 | 2.26 | 2.08 | 3.08 | 0.80 | 0.75 | 1.10 |
| | | (v) | 1.94 | 5.30 | 5.65 | 1.23 | 1.81 | 2.19 | 0.44 | 0.63 | 0.77 |
| | N+U | (ii) | 0.21 | 5.55 | 5.55 | 0.03 | 1.63 | 1.63 | 0.02 | 0.53 | 0.53 |
| | | (iv) | 0.84 | 4.86 | 4.93 | 0.61 | 1.53 | 1.65 | 0.22 | 0.51 | 0.55 |
| | | (v) | 0.02 | 5.00 | 5.00 | 0.13 | 1.50 | 1.51 | 0.05 | 0.49 | 0.49 |
| | t | (ii) | 0.15 | 5.58 | 5.58 | 0.01 | 1.64 | 1.64 | 0.02 | 0.53 | 0.53 |
| | | (iv) | 0.10 | 4.96 | 4.96 | 0.02 | 1.50 | 1.50 | 0.01 | 0.48 | 0.48 |
| | | (v) | 0.38 | 5.05 | 5.06 | 0.06 | 1.51 | 1.51 | 0.02 | 0.48 | 0.48 |
| 0.75 | inov | (i) | 0.02 | 3.40 | 3.40 | 0.01 | 1.01 | 1.01 | 0.01 | 0.31 | 0.31 |
| | | (iii) | 0.77 | 3.12 | 3.21 | 0.16 | 0.93 | 0.94 | 0.01 | 0.28 | 0.28 |
| | N+E | (ii) | 2.14 | 3.70 | 4.27 | 0.48 | 1.08 | 1.18 | 0.07 | 0.32 | 0.33 |
| | | (iv) | 9.19 | 5.85 | 10.89 | 4.19 | 2.88 | 5.09 | 1.57 | 1.14 | 1.94 |
| | | (v) | 6.26 | 4.95 | 7.98 | 2.86 | 2.36 | 3.71 | 1.07 | 0.94 | 1.43 |
| | N+U | (ii) | 0.24 | 3.39 | 3.40 | 0.06 | 1.01 | 1.01 | 0.00 | 0.31 | 0.31 |
| | | (iv) | 2.99 | 3.27 | 4.43 | 1.22 | 1.18 | 1.70 | 0.44 | 0.41 | 0.60 |
| | | (v) | 1.63 | 3.15 | 3.55 | 0.60 | 1.01 | 1.17 | 0.20 | 0.33 | 0.39 |
| | t | (ii) | 0.22 | 3.45 | 3.45 | 0.05 | 1.01 | 1.01 | 0.01 | 0.31 | 0.31 |
| | | (iv) | 1.21 | 3.21 | 3.43 | 0.22 | 0.93 | 0.95 | 0.02 | 0.28 | 0.28 |
| | | (v) | 1.04 | 3.24 | 3.40 | 0.17 | 0.93 | 0.95 | 0.01 | 0.28 | 0.28 |

Table 2. Frank copula: bias, SD and RMSE (multiplied by 100) on the Kendall's tau scale. The three groups of columns correspond to the three sample sizes in increasing order; rows "inov" are the oracle estimators, and the labels (i)–(v) refer to the list of estimators in Section 3.1.

| τ | margins | estim | bias | SD | RMSE | bias | SD | RMSE | bias | SD | RMSE |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.50 | inov | (i) | 0.03 | 4.62 | 4.62 | 0.01 | 1.44 | 1.43 | 0.01 | 0.45 | 0.45 |
| | | (iii) | 0.03 | 4.51 | 4.50 | 0.01 | 1.42 | 1.42 | 0.01 | 0.45 | 0.45 |
| | N+E | (ii) | 0.45 | 4.68 | 4.70 | 0.05 | 1.44 | 1.44 | 0.00 | 0.45 | 0.45 |
| | | (iv) | 0.45 | 4.55 | 4.57 | 0.05 | 1.43 | 1.43 | 0.00 | 0.45 | 0.45 |
| | | (v) | 0.21 | 4.84 | 4.84 | 0.04 | 1.46 | 1.46 | 0.00 | 0.45 | 0.45 |
| | N+U | (ii) | 0.08 | 4.65 | 4.65 | 0.00 | 1.44 | 1.43 | 0.00 | 0.45 | 0.45 |
| | | (iv) | 0.08 | 4.53 | 4.53 | 0.00 | 1.42 | 1.42 | 0.01 | 0.45 | 0.45 |
| | | (v) | 0.09 | 4.85 | 4.85 | 0.01 | 1.45 | 1.45 | 0.01 | 0.45 | 0.45 |
| 0.75 | inov | (i) | 0.11 | 2.50 | 2.50 | 0.00 | 0.74 | 0.74 | 0.00 | 0.23 | 0.23 |
| | | (iii) | 0.53 | 2.45 | 2.50 | 0.06 | 0.74 | 0.74 | 0.00 | 0.23 | 0.22 |
| | N+E | (ii) | 1.17 | 2.79 | 3.02 | 0.14 | 0.76 | 0.77 | 0.01 | 0.23 | 0.23 |
| | | (iv) | 1.59 | 2.77 | 3.19 | 0.19 | 0.76 | 0.78 | 0.02 | 0.23 | 0.23 |
| | | (v) | 1.42 | 2.90 | 3.23 | 0.17 | 0.77 | 0.79 | 0.01 | 0.23 | 0.23 |
| | N+U | (ii) | 0.25 | 2.53 | 2.54 | 0.01 | 0.74 | 0.74 | 0.00 | 0.23 | 0.23 |
| | | (iv) | 0.69 | 2.50 | 2.59 | 0.07 | 0.74 | 0.74 | 0.00 | 0.23 | 0.23 |
| | | (v) | 0.57 | 2.62 | 2.68 | 0.05 | 0.76 | 0.76 | 0.00 | 0.23 | 0.23 |

Table 3. Gaussian copula: bias, SD and RMSE (multiplied by 100) on the Kendall's tau scale. The three groups of columns correspond to the three sample sizes in increasing order; rows "inov" are the oracle estimators, and the labels (i)–(v) refer to the list of estimators in Section 3.1.

| τ | margins | estim | bias | SD | RMSE | bias | SD | RMSE | bias | SD | RMSE |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.50 | inov | (i) | 0.03 | 4.94 | 4.94 | 0.07 | 1.53 | 1.53 | 0.00 | 0.48 | 0.48 |
| | | (iii) | 1.07 | 4.51 | 4.63 | 0.10 | 1.39 | 1.40 | 0.03 | 0.44 | 0.44 |
| | N+E | (ii) | 0.43 | 4.97 | 4.99 | 0.17 | 1.53 | 1.54 | 0.02 | 0.48 | 0.48 |
| | | (iv) | 0.32 | 4.54 | 4.55 | 0.21 | 1.41 | 1.43 | 0.06 | 0.45 | 0.45 |
| | | (v) | 0.99 | 4.90 | 5.00 | 0.08 | 1.46 | 1.46 | 0.03 | 0.45 | 0.45 |
| | N+U | (ii) | 0.06 | 4.97 | 4.97 | 0.08 | 1.53 | 1.53 | 0.00 | 0.48 | 0.48 |
| | | (iv) | 0.87 | 4.53 | 4.62 | 0.01 | 1.40 | 1.39 | 0.01 | 0.44 | 0.44 |
| | | (v) | 1.36 | 4.86 | 5.05 | 0.22 | 1.46 | 1.47 | 0.07 | 0.45 | 0.45 |
| | t | (ii) | 0.04 | 4.99 | 4.98 | 0.07 | 1.53 | 1.53 | 0.00 | 0.48 | 0.48 |
| | | (iv) | 1.08 | 4.55 | 4.67 | 0.09 | 1.40 | 1.40 | 0.02 | 0.44 | 0.44 |
| | | (v) | 1.42 | 4.90 | 5.10 | 0.21 | 1.46 | 1.48 | 0.06 | 0.45 | 0.45 |
| 0.75 | inov | (i) | 0.16 | 2.79 | 2.80 | 0.02 | 0.89 | 0.89 | 0.00 | 0.27 | 0.27 |
| | | (iii) | 0.06 | 2.53 | 2.53 | 0.02 | 0.80 | 0.80 | 0.00 | 0.25 | 0.25 |
| | N+E | (ii) | 1.02 | 2.93 | 3.10 | 0.24 | 0.90 | 0.93 | 0.04 | 0.27 | 0.27 |
| | | (iv) | 1.81 | 2.81 | 3.34 | 0.73 | 0.95 | 1.20 | 0.21 | 0.30 | 0.37 |
| | | (v) | 1.01 | 2.77 | 2.95 | 0.40 | 0.88 | 0.97 | 0.10 | 0.27 | 0.29 |
| | N+U | (ii) | 0.08 | 2.80 | 2.80 | 0.05 | 0.89 | 0.89 | 0.01 | 0.27 | 0.27 |
| | | (iv) | 0.48 | 2.52 | 2.56 | 0.27 | 0.82 | 0.86 | 0.09 | 0.25 | 0.27 |
| | | (v) | 0.00 | 2.61 | 2.60 | 0.05 | 0.81 | 0.81 | 0.01 | 0.25 | 0.25 |
| | t | (ii) | 0.14 | 2.82 | 2.82 | 0.02 | 0.89 | 0.89 | 0.01 | 0.27 | 0.27 |
| | | (iv) | 0.03 | 2.56 | 2.56 | 0.02 | 0.79 | 0.79 | 0.00 | 0.25 | 0.25 |
| | | (v) | 0.20 | 2.62 | 2.62 | 0.02 | 0.80 | 0.80 | 0.02 | 0.25 | 0.25 |
Marek Omelka¹, Šárka Hudecová¹, Natalie Neumeyer²
Abstract.
This paper deals with a situation when one is interested in the dependence structure of a multidimensional response variable in the presence of a multivariate covariate. It is assumed that the covariate affects only the marginal distributions through regression models while the dependence structure, which is described by a copula, is unaffected. A parametric estimation of the copula function is considered with focus on the maximum pseudo-likelihood method. It is proved that under some appropriate regularity assumptions the estimator calculated from the residuals is asymptotically equivalent to the estimator based on the unobserved errors. In such case one can ignore the fact that the response is first adjusted for the effect of the covariate. A Monte Carlo simulation study explores (among others) situations where the regularity assumptions are not satisfied and the claimed result does not hold. It shows that in such situations the maximum pseudo-likelihood estimator may behave poorly and the moment estimation of the copula parameter is of interest. Our results complement the results available for nonparametric estimation of the copula function.
1 Department of Probability and Statistics, Faculty of Mathematics and Physics, Charles University, Sokolovská 83, 186 75 Praha 8, Czech Republic
2 Department of Mathematics, University of Hamburg, Bundesstrasse 55, 20146 Hamburg, Germany
Keywords and phrases: asymptotic normality, copula, moment estimation, pseudo-likelihood, residuals.
1. Introduction
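Consider a $d$-dimensional vector of responses $\mathbf{Y}=(Y_{1},\dotsc,Y_{d})^{\mathsf{T}}$ and an associated $q$-dimensional vector of covariates $\boldsymbol{X}=(X_{1},\dotsc,X_{q})^{\mathsf{T}}$. For instance, in insurance applications the response may represent various types of payments related to a given car accident (medical benefits, income replacement benefits, and allocated expenses for a claimant) and the covariates provide additional information (claimant's age, gravity of the accident, number of people injured in the accident, …).
Often we are interested in the conditional distribution of $\mathbf{Y}$ given the value of the covariate. To simplify the situation it is often assumed that $\boldsymbol{X}$ affects only the marginal distributions of $Y_{1},\dotsc,Y_{d}$, but does not affect the dependence structure of $\mathbf{Y}$. More formally, it is assumed that there exists a copula $C$ such that the joint conditional distribution of $\mathbf{Y}$ given $\boldsymbol{X}=\boldsymbol{x}$ can be written, for all $\boldsymbol{x}\in S_{\boldsymbol{X}}$ (the support of $\boldsymbol{X}$), as
$$\mathsf{P}\big(Y_{1}\leq y_{1},\dotsc,Y_{d}\leq y_{d}\mid\boldsymbol{X}=\boldsymbol{x}\big)=C\big(F_{1\boldsymbol{x}}(y_{1}),\dotsc,F_{d\boldsymbol{x}}(y_{d})\big),$$
where $F_{j\boldsymbol{x}}(y_{j})=\mathsf{P}(Y_{j}\leq y_{j}\mid\boldsymbol{X}=\boldsymbol{x})$, $j=1,\dotsc,d$. Using this assumption one can proceed in two steps. In the first step one models the effect of the covariate on each of the marginal distributions separately (i.e. one estimates $F_{j\boldsymbol{x}}$ for each $j$ separately). Having the estimates $\widehat{F}_{j\boldsymbol{x}}$, one estimates the copula function $C$ in the second step.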
Nonparametric estimation of the copula function (for and ) was in detail considered in Gijbels et al., (2015). The most interesting result is as follows. Suppose that the marginal distributions follow the parametric or even non-parametric location scale models, i.e.
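$$Y_{j}=m_{j}(\boldsymbol{X})+s_{j}(\boldsymbol{X})\,\varepsilon_{j},\qquad j=1,\dotsc,d,\tag{1}$$
where the vector of errors $\boldsymbol{\varepsilon}=(\varepsilon_{1},\dotsc,\varepsilon_{d})^{\mathsf{T}}$ is independent of $\boldsymbol{X}$.
Note that then $C$ is the copula function corresponding to the random vector $\boldsymbol{\varepsilon}$. Gijbels et al. (2015) proved that (under some regularity assumptions) the empirical copula $\widehat{C}_{n}$ based on the estimated residuals from model (1) is asymptotically equivalent to the empirical copula $\widetilde{C}_{n}$ calculated from the unobserved errors. More precisely it was proved that
$$\sup_{\mathbf{u}\in[0,1]^{d}}\sqrt{n}\,\big|\widehat{C}_{n}(\mathbf{u})-\widetilde{C}_{n}(\mathbf{u})\big|=o_{P}(1).\tag{2}$$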
This result was generalized to time-series setting by Neumeyer et al., (2019). In Portier and Segers, (2018) the authors were even able to drop the location-scale assumption (1) but at the cost of deriving only a slightly weaker result (the supremum in (2) is replaced with where can be taken arbitrarily small but positive). On the other hand Côté et al., (2019) concentrated on the parametric form of the location scale model (1) and generalized the results to , and at the same time relaxed assumptions on (the density of ).
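To make the object compared in (2) concrete, the following minimal Python sketch computes a bivariate empirical copula from two samples, which may be either the unobserved errors or the estimated residuals. The function names and the rank/(n+1) rescaling are illustrative assumptions and are not taken from the cited papers.

```python
def pseudo_observations(values):
    """Return rank/(n+1) for each value; ties are broken by order of appearance."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    pseudo = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        pseudo[idx] = rank / (n + 1)
    return pseudo


def empirical_copula(sample1, sample2, u1, u2):
    """Share of observations whose pseudo-observations fall in [0, u1] x [0, u2]."""
    p1 = pseudo_observations(sample1)
    p2 = pseudo_observations(sample2)
    hits = sum(1 for a, b in zip(p1, p2) if a <= u1 and b <= u2)
    return hits / len(p1)


# Hypothetical toy data: four pairs with identical rankings in both coordinates.
sample1 = [0.3, -1.2, 2.5, 0.9]
sample2 = [1.0, -0.5, 4.0, 2.0]
value = empirical_copula(sample1, sample2, 0.5, 0.5)
print(value)  # 0.5: two of the four pairs have both pseudo-observations <= 0.5
```

Evaluating the same function once on the errors and once on the residuals gives the two quantities whose difference is controlled in (2).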
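To complement the results on nonparametric estimation of $C$, one is naturally interested in whether analogous results hold also for parametric estimation of $C$. More precisely, suppose that the copula function $C$ belongs to the family $\mathcal{C}=\big\{C(\cdot;\mathbf{a}):\mathbf{a}\in\Theta\big\}$ and we are interested in estimating the unknown parameter. Denote by $\boldsymbol{\alpha}$ the true value of the parameter, by $\widehat{\boldsymbol{\alpha}}_{n}$ the estimator based on the residuals ($\widehat{\varepsilon}_{ji}$) and by $\widetilde{\boldsymbol{\alpha}}_{n}$ its counterpart based on the true (but unobserved) errors ($\varepsilon_{ji}$) from the location-scale model (1). Then in analogy to (2) one would expect that $\widehat{\boldsymbol{\alpha}}_{n}$ is (first-order) asymptotically equivalent to $\widetilde{\boldsymbol{\alpha}}_{n}$, i.e.
$$\sqrt{n}\,\big(\widehat{\boldsymbol{\alpha}}_{n}-\widetilde{\boldsymbol{\alpha}}_{n}\big)=o_{P}(1).\tag{3}$$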
Although the conjecture (3) seems to be natural, to the best of our knowledge there are only limited results specifying the regularity assumptions that are needed so that (3) holds. Some results for the moment-like estimators that can be deduced from the convergence of the empirical copula can be found in Neumeyer et al., (2019) and Côté et al., (2019).
In this paper (similarly as in Côté et al.,, 2019) we assume the parametric form of the location-scale model (1) and concentrate on maximum pseudo-likelihood estimation. This method of estimation was in the context of copula models popularised by Genest et al., (1995) and in more detail investigated in Tsukahara, (2005). This method is often preferred to moment-like estimation because the resulting estimator has usually a lower asymptotic variance.
In the econometric (time-series) literature the inference based on the residuals is also known as univariate (marginal) filtering (see e.g., Bücher et al., 2015) and the result (3) is supported by many simulation studies. The result is formulated already in Chen and Fan (2006a), but there it is presented on a more intuitive level and the precise assumptions (as well as the reasoning) are missing. This lack of rigour was to some extent remedied in the subsequent paper of Chan et al. (2009), where the authors concentrated on multivariate GARCH models and presented many interesting ideas on how to deal with the technical difficulties. But a careful reading of the paper reveals that (probably due to the broad scope of the presented results) some of the crucial steps in the proofs are missing.
In our paper we explore in detail the assumptions that are needed so that (3) holds in the standard i.i.d. setting. Even in this relatively simple setting one has to handle many technical difficulties. The problem is that it is not clear how to make use of the recent deep results in empirical copula estimation (see e.g., Berghaus et al., 2017; Radulović et al., 2017), as the densities of many standard copulas are unbounded. The only remarkable exception in this respect is Theorem 3.3 of Berghaus et al. (2017), but the authors considered only two-dimensional copulas and no covariates.
We show that although the assumptions that guarantee (3) are mild, they are not satisfied for some combinations of commonly used copula functions and marginal densities. Roughly speaking, we illustrate that an unbounded copula density has to be compensated with marginal densities that are well behaved not only in the supports of the corresponding distributions, but also at the border points of the supports. We are convinced that exploring this problem in this setting is not only of independent interest, but that it also provides insights into what might go wrong when switching to more complicated econometric or time-series models (see also the discussion in Section 4).
The paper is organised as follows. The main result and the needed assumptions are formulated in Section 2. The theoretical results are illustrated in a simulation study in Section 3. All the proofs are given in the Appendices.
2. Main result
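In what follows we assume that for each $j\in\{1,\dotsc,d\}$ there exists a known transformation $t_{j}$ increasing on the support of $Y_{j}$ and known functions $m_{j}(\boldsymbol{x};\boldsymbol{\theta}_{j})$ and $s_{j}(\boldsymbol{x};\boldsymbol{\theta}_{j})$ depending only on an unknown (finite-dimensional) parameter $\boldsymbol{\theta}_{j}$ such that the random variable
$$\varepsilon_{j}=\frac{t_{j}(Y_{j})-m_{j}(\boldsymbol{X};\boldsymbol{\theta}_{j})}{s_{j}(\boldsymbol{X};\boldsymbol{\theta}_{j})}$$
is independent of $\boldsymbol{X}$ with cumulative distribution function $F_{j\varepsilon}$. The distribution of the random vector $\boldsymbol{\varepsilon}=(\varepsilon_{1},\dotsc,\varepsilon_{d})^{\mathsf{T}}$ has continuous margins and the copula $C$ corresponding to $\boldsymbol{\varepsilon}$ belongs to the family of copulas $\mathcal{C}=\big\{C(\cdot;\mathbf{a}):\mathbf{a}\in\Theta\big\}$, where $\Theta\subseteq\mathbb{R}^{p}$.
Our task is to estimate the true value of the copula parameter (say $\boldsymbol{\alpha}$) based on the observations $(\mathbf{Y}_{1},\boldsymbol{X}_{1}),\dotsc,(\mathbf{Y}_{n},\boldsymbol{X}_{n})$ that are assumed to be mutually independent copies of the vector $(\mathbf{Y},\boldsymbol{X})$.
Let $\varepsilon_{ji}=\{t_{j}(Y_{ji})-m_{j}(\boldsymbol{X}_{i};\boldsymbol{\theta}_{j})\}/s_{j}(\boldsymbol{X}_{i};\boldsymbol{\theta}_{j})$. As the parameters $\boldsymbol{\theta}_{j}$ ($j=1,\dotsc,d$) are in practice unknown, we work with the residuals
$$\widehat{\varepsilon}_{ji}=\frac{t_{j}(Y_{ji})-m_{j}(\boldsymbol{X}_{i};\widehat{\boldsymbol{\theta}}_{j})}{s_{j}(\boldsymbol{X}_{i};\widehat{\boldsymbol{\theta}}_{j})},\qquad i=1,\dotsc,n,\quad j=1,\dotsc,d,$$
where $\widehat{\boldsymbol{\theta}}_{j}$ is a suitable estimate of $\boldsymbol{\theta}_{j}$. For $j\in\{1,\dotsc,d\}$ let $\widehat{F}_{j\widehat{\varepsilon}}$ be the marginal empirical distribution function of the estimated residuals, i.e.
$$\widehat{F}_{j\widehat{\varepsilon}}(y)=\frac{1}{n+1}\sum_{i=1}^{n}\mathbf{1}\big\{\widehat{\varepsilon}_{ji}\leq y\big\}.$$
Then the maximum pseudo-likelihood estimator based on the residuals is defined as
$$\widehat{\boldsymbol{\alpha}}_{n}=\operatorname*{arg\,max}_{\mathbf{a}\in\Theta}\sum_{i=1}^{n}\log c\big(\widehat{\mathbf{U}}_{i};\mathbf{a}\big),$$
where
$$\widehat{\mathbf{U}}_{i}=\big(\widehat{U}_{1i},\dotsc,\widehat{U}_{di}\big)^{\mathsf{T}}=\big(\widehat{F}_{1\widehat{\varepsilon}}(\widehat{\varepsilon}_{1i}),\dotsc,\widehat{F}_{d\widehat{\varepsilon}}(\widehat{\varepsilon}_{di})\big)^{\mathsf{T}}\tag{4}$$
are the estimated pseudo-observations and $c(\cdot;\mathbf{a})$ is the density of the assumed copula family. As is common in maximum likelihood theory, we will consider the estimator $\widehat{\boldsymbol{\alpha}}_{n}$ to be an appropriately chosen root of the estimating equations
$$\sum_{i=1}^{n}\boldsymbol{\psi}\big(\widehat{\mathbf{U}}_{i};\mathbf{a}\big)=\mathbf{0}_{p},\tag{5}$$
where $\boldsymbol{\psi}(\mathbf{u};\mathbf{a})=\partial\log c(\mathbf{u};\mathbf{a})/\partial\mathbf{a}$ is the score function.
Analogously let $\widetilde{\boldsymbol{\alpha}}_{n}$ be the corresponding estimator based on the true (but unobserved) errors $\varepsilon_{ji}$. That is, $\widetilde{\boldsymbol{\alpha}}_{n}$ is defined as (an appropriately chosen) root of the estimating equations
$$\sum_{i=1}^{n}\boldsymbol{\psi}\big(\widetilde{\mathbf{U}}_{i};\mathbf{a}\big)=\mathbf{0}_{p},\tag{6}$$
where
$$\widetilde{\mathbf{U}}_{i}=\big(\widetilde{U}_{1i},\dotsc,\widetilde{U}_{di}\big)^{\mathsf{T}}=\big(\widehat{F}_{1\varepsilon}(\varepsilon_{1i}),\dotsc,\widehat{F}_{d\varepsilon}(\varepsilon_{di})\big)^{\mathsf{T}}\tag{7}$$
and $\widehat{F}_{j\varepsilon}$ is the marginal empirical distribution function of the (unobserved) errors, i.e.
$$\widehat{F}_{j\varepsilon}(y)=\frac{1}{n+1}\sum_{i=1}^{n}\mathbf{1}\big\{\varepsilon_{ji}\leq y\big\}.$$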
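The residual-based procedure described in this section can be sketched in a few lines of Python. The sketch below is only an illustration under stated assumptions: it uses a bivariate Clayton copula, a location-only adjustment by a least-squares line, pseudo-observations of the form rank/(n+1), and a golden-section search that presumes the pseudo-log-likelihood is unimodal on the search interval. All function names, regression coefficients and the seed are hypothetical.

```python
import math
import random
from statistics import NormalDist


def ols_residuals(x, y):
    """Residuals from a least-squares line y = b0 + b1 * x (location adjustment only)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b1 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    b0 = my - b1 * mx
    return [b - b0 - b1 * a for a, b in zip(x, y)]


def pseudo_observations(values):
    """rank/(n+1), so that every pseudo-observation lies strictly inside (0, 1)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    u = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        u[idx] = rank / (n + 1)
    return u


def clayton_log_density(u, v, theta):
    """Log-density of the bivariate Clayton copula for theta > 0."""
    s = u ** (-theta) + v ** (-theta) - 1.0
    return (math.log1p(theta) - (1.0 + theta) * (math.log(u) + math.log(v))
            - (2.0 + 1.0 / theta) * math.log(s))


def mple_clayton(u, v, lo=0.01, hi=20.0, tol=1e-6):
    """Maximise the pseudo-log-likelihood over [lo, hi] by golden-section search."""
    def loglik(theta):
        return sum(clayton_log_density(a, b, theta) for a, b in zip(u, v))

    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - ratio * (b - a), a + ratio * (b - a)
    fc, fd = loglik(c), loglik(d)
    while b - a > tol:
        if fc > fd:   # the maximiser lies in [a, d]
            b, d, fd = d, c, fc
            c = b - ratio * (b - a)
            fc = loglik(c)
        else:         # the maximiser lies in [c, b]
            a, c, fc = c, d, fd
            d = a + ratio * (b - a)
            fd = loglik(d)
    return 0.5 * (a + b)


# Hypothetical example: two regressions with standard normal, Clayton-dependent errors.
rng = random.Random(2024)
theta_true, n = 2.0, 500
std_normal = NormalDist()
x, y1, y2 = [], [], []
for _ in range(n):
    u, w = 1.0 - rng.random(), 1.0 - rng.random()
    v = (u ** (-theta_true) * (w ** (-theta_true / (1.0 + theta_true)) - 1.0)
         + 1.0) ** (-1.0 / theta_true)
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep quantile arguments inside (0, 1)
    v = min(max(v, 1e-12), 1.0 - 1e-12)
    xi = rng.gauss(0.0, 1.0)
    x.append(xi)
    y1.append(1.0 + 2.0 * xi + std_normal.inv_cdf(u))
    y2.append(-1.0 + 0.5 * xi + std_normal.inv_cdf(v))

u_hat = pseudo_observations(ols_residuals(x, y1))
v_hat = pseudo_observations(ols_residuals(x, y2))
theta_hat = mple_clayton(u_hat, v_hat)
print(round(theta_hat, 3))
```

Replacing the residuals by the simulated errors in the last three lines would give the oracle counterpart of the estimator; a general-purpose optimiser could replace the golden-section search when the parameter is multidimensional.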
2.1. Regularity assumptions on the marginal distributions
In general we need to assume that the density of the error term should be ‘well-behaved’ on the border of its support. The following assumption is close to assumption F(iii) in Appendix A of Einmahl and Van Keilegom, (2008). But our assumption is weaker as it allows for distributions with supports different from a real line.
Assumption : For each the density function of is continuous on the support of and there exists such that
[TABLE]
and
[TABLE]
Further for some , in the function $f_{j\varepsilon}\big(F_{j\varepsilon}^{-1}(u)\big)$ is non-decreasing on and non-increasing on .
Note that assumption with allows also for distributions with non-continuous but bounded densities (e.g. exponential and uniform). But as we show later, for copula families with unbounded densities one needs to assume that .
Remark 1.
The assumption is formulated so that it covers the general case when both the conditional mean as well as the conditional variance of depends on . From the proofs given in the appendix it follows that if one rightly assumes that the conditional variance does not depend on , then one does only location adjustment (i.e. ) and assumption (8) simplifies to
[TABLE]
On the other hand if one rightly assumes that the conditional mean is zero then one does only scale adjustment (i.e. ) and it is sufficient to assume
[TABLE]
This last assumption is close to the assumption 2. formulated just before Theorem 2.1 of Chan et al., (2009). But similarly as when comparing with assumption F(iii) in Appendix A of Einmahl and Van Keilegom, (2008), our assumption does not require that the support of the distribution is a real line.
Remark 2.
As in assumption the function $f_{j\varepsilon}\big(F_{j\varepsilon}^{-1}(u)\big)$ is supposed to be monotone when $u$ is close to zero or close to one, then the integrability of (see Lemma 12) implies that
[TABLE]
Thus if
[TABLE]
then one gets
[TABLE]
Note that the above equations are also automatically satisfied if even if (9) does not hold. Thus one can conclude that if (10) does not hold, then and the corresponding border of the support is finite, i.e.,
[TABLE]
2.2. Regularity assumptions on and
The next assumption states that the parametric models can be estimated at the standard $\sqrt{n}$-rate and that the location and scale functions are sufficiently smooth and integrable.
Assumption : For each is a -consistent estimate of the parameter . The functions and are (once) differentiable with respect to and the derivatives are denoted as and . Further there exists a neighborhood of the true value of the parameter such that and there exists a function such that for each :
[TABLE]
and $\mathsf{E}\big[M_{j}(\boldsymbol{X})\big]^{r}<\infty$ for some . Finally, for each the derivatives and viewed as functions of are continuous at uniformly in .
2.3. Regularity assumptions about the copula family
To formulate the main regularity assumptions about the copula family it is useful to introduce the following set of functions.
Definition (Class of - and -functions).
A function is called a -function if is continuous on and there exist and a finite constant such that for all
[TABLE]
Let and be fixed. We say that a function is a -function if it is continuous on and there exists a finite constant such that for all
[TABLE]
Further is a -function for all , where
[TABLE]
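Now we are ready to formulate the needed regularity assumptions about the copula family. Recall that $\boldsymbol{\psi}(\mathbf{u};\mathbf{a})=\partial\log c(\mathbf{u};\mathbf{a})/\partial\mathbf{a}$, $\boldsymbol{\alpha}$ is the true value of the parameter, and $c(\cdot;\mathbf{a})$ is the density corresponding to the copula function $C(\cdot;\mathbf{a})$.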
Assumptions C:
C1.
for almost all only if .
C2.
The function is continuously differentiable with respect to for all .
Denote the th element of the vector function by .
C3.
For each , the function , where , for introduced in assumption and in assumption .
C4.
The function is assumed to be continuously differentiable with respect to for all . Further there exist an open neighborhood of and a dominating function such that is continuous in and
[TABLE]
C5.
The (Fisher information) matrix $I(\boldsymbol{\alpha})=-\mathsf{E}\,\big\{\partial\boldsymbol{\psi}(\mathbf{U};\mathbf{a})/\partial\mathbf{a}^{\mathsf{T}}\big|_{\mathbf{a}=\boldsymbol{\alpha}}\big\}$, where
$$\mathbf{U}=\big(F_{1\varepsilon}(\varepsilon_{1}),\dotsc,F_{d\varepsilon}(\varepsilon_{d})\big)^{\mathsf{T}},$$
is finite and nonsingular.
Remark 3.
Note that the score functions of the commonly used one-parameter bivariate copula families with unbounded densities (e.g. Clayton, Gumbel, Normal, Student, …) can be bounded by
[TABLE]
and its derivative as
[TABLE]
for a sufficiently large but finite constant (see also Chen and Fan, 2006b, ). Thus in Assumption C3 one can consider and arbitrarily close to zero but positive.
Assumption C3 is inspired by Chan et al., (2009). Note that generally speaking this assumption is more strict than the corresponding assumptions of Tsukahara, (2005) that are based on -shaped functions. The advantage of assumption C3 is that it enables to derive bounds that depend only on the marginal distributions. The price that we pay for this advantage does not seem to be big because we are not aware of a standard copula family that does not meet C3 with and arbitrarily small positive constants.
Note that assumption C3 implies that , which does not allow for marginal densities that are bounded but possibly discontinuous at a border point (e.g. exponential or uniform distributions). As shown in simulations in Section 3 the aimed result (3) indeed does not hold in general when the marginal densities are not continuous.
Nevertheless a closer inspection of the proof shows that is needed to get a control over a possibly unbounded score function . But there are commonly used copula families (e.g. Frank, Ali-Mikhail-Haq, Plackett) for which the score function and its derivatives are bounded. It is of interest to formulate an alternative to assumptions C3 and C4 separately as it allows for in assumption ,
C6.
The function is bounded and continuously differentiable with respect to for all . Further there exists an open neighborhood of such that is continuous in and
[TABLE]
2.4. Main results
Now we are ready to formulate the main results of the paper.
Theorem 1.
Suppose that assumptions , C1-C5 and with are satisfied. Then with probability going to one there exist consistent roots (say and ) of the estimating equations (5) and (6). Further and satisfy (3).
The next theorem says that if assumption C6 is satisfied then one can also include the case in assumption . Thus for instance if one (rightly) assumes that $C$ is a Frank copula then the marginal distributions of the errors are also allowed to be uniform or exponential.
Theorem 2.
Suppose that assumptions , C1, C2, C5, C6 and are satisfied. Then the statement of Theorem 1 holds.
The above theorems imply that when fitting the copula one can (under the stated assumptions) ignore the fact that one is working with the estimated residuals instead of the unobserved errors. As is known (and as also follows from the proof of Theorem 1), the asymptotic distribution of the estimator based on the unobserved errors is normal. Thus, thanks to (3), one can conclude that the residual-based estimator is also asymptotically normal.
Corollary 1.
Suppose that the assumptions either of Theorem 1 or 2 hold. Then with probability going to one there exists a consistent root of (5). This root satisfies
[TABLE]
where $\widetilde{\boldsymbol{\psi}}(\mathbf{u})=\big(\widetilde{\psi}_{1}(\mathbf{u}),\dotsc,\widetilde{\psi}_{p}(\mathbf{u})\big)^{\mathsf{T}}$ with
[TABLE]
3. Simulation study
A Monte Carlo study was conducted in order to illustrate the theoretical conclusions and to show how the finite sample performance of the maximum pseudo-likelihood estimator depends on the level of violation of the regularity assumptions.
3.1. Settings
To keep the presentation as clear as possible we concentrate on a bivariate response variable (some results for a three-dimensional case can be found in the Supplementary material) following the model
[TABLE]
The joint cumulative distribution function of the random vector $(\varepsilon_{1},\varepsilon_{2})^{\mathsf{T}}$ is $C\big(F_{1\varepsilon}(y_{1}),F_{2\varepsilon}(y_{2})\big)$, where $C$ is a copula and $F_{1\varepsilon}$, $F_{2\varepsilon}$ are marginal distribution functions. The following five copula families were considered for $C$: Clayton, Frank, Gumbel, Gaussian, and Student with 5 degrees of freedom. The copula parameter is chosen such that the corresponding Kendall's tau is $0.50$ or $0.75$. The marginal distributions were chosen as one of the following:
$F_{1\varepsilon}$ is standard normal and $F_{2\varepsilon}$ exponential with mean 1 (denoted as N+E),
$F_{1\varepsilon}$ is standard normal and $F_{2\varepsilon}$ uniform on (denoted as N+U),
$F_{1\varepsilon}$ and $F_{2\varepsilon}$ are both Student with 5 degrees of freedom (denoted as t).
The first two situations satisfy the assumption only with . Hence, the result of Theorem 2 applies only if (C6) holds. From the five considered copula families, this is the case only for the Frank copula. On the other hand, the t marginals satisfy with and the assumptions of Theorem 1 hold. Hence, these marginals provide a useful regular benchmark for a comparison with the first two situations.
The covariate is generated from the standard normal distribution (Poisson distribution with mean 5 was considered as well, but the results are almost identical and are not reported). The presented results correspond to the particular choice , , , and . The unobserved errors are estimated as the residuals after fitting the regression lines (marginally) where the parameters are estimated with the help of the least squares method assuming , , cf. Remark 1.
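As an illustration of the N+E setting, the following Python sketch generates one data set with a standard normal covariate, a standard normal first error, an exponential (mean 1) second error and Clayton dependence between the errors. The regression coefficients, the sample size, the seed and the function name are hypothetical choices made for the example; they are not the values used in the study, which ran in R.

```python
import math
import random
from statistics import NormalDist


def simulate_n_plus_e(n, theta, coef1, coef2, seed=0):
    """Simulate (x, y1, y2) with Clayton-dependent errors.

    eps1 is standard normal and eps2 is exponential with mean 1 (the N+E setting);
    coef1 and coef2 are (intercept, slope) pairs and are purely illustrative.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    x, y1, y2 = [], [], []
    for _ in range(n):
        # Conditional-inversion sampler for the Clayton copula (theta > 0).
        u = 1.0 - rng.random()
        w = 1.0 - rng.random()
        v = (u ** (-theta) * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)
        # Keep the uniforms strictly inside (0, 1) before applying quantile functions.
        u = min(max(u, 1e-12), 1.0 - 1e-12)
        v = min(max(v, 1e-12), 1.0 - 1e-12)
        eps1 = std_normal.inv_cdf(u)   # standard normal quantile
        eps2 = -math.log(1.0 - v)      # exponential quantile, mean 1
        xi = rng.gauss(0.0, 1.0)
        x.append(xi)
        y1.append(coef1[0] + coef1[1] * xi + eps1)
        y2.append(coef2[0] + coef2[1] * xi + eps2)
    return x, y1, y2


tau = 0.75
theta = 2.0 * tau / (1.0 - tau)  # Clayton parameter matching the chosen Kendall's tau
x, y1, y2 = simulate_n_plus_e(200, theta, (1.0, 2.0), (-1.0, 0.5), seed=1)
print(len(x), min(y2), max(y2))
```

Swapping the exponential quantile for a uniform or Student quantile would give sketches of the N+U and t settings in the same way.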
The following estimators of the parameter are compared:
- (i) (oracle) inversion of Kendall's tau based on the unobserved errors;
- (ii) inversion of Kendall's tau based on the residuals;
- (iii) (oracle) maximum pseudo-likelihood estimator based on the unobserved errors;
- (iv) maximum pseudo-likelihood estimator based on the residuals;
- (v) modified maximum pseudo-likelihood estimator based on the residuals.
The latter estimator is inspired by the estimator introduced in the context of single index conditional copulas by Fermanian and Lopez, (2018). In our situation this estimator coincides with the maximum pseudo-likelihood estimator computed only from which lie in , where . Note that this choice corresponds to the choice in the proof of Theorem 1. In the presented simulations we choose and , thus in view of Remark 3 the statement of Theorem 1 (or 2) holds also for provided that the corresponding regularity assumptions hold.
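The trimming step behind the modified estimator can be sketched as follows. The sketch assumes that the trimming region is a square of the form [delta, 1 - delta] in each coordinate; the value of delta used below is hypothetical, as is the function name.

```python
def trim_pseudo_observations(u, v, delta):
    """Keep only pairs whose pseudo-observations both lie in [delta, 1 - delta]."""
    kept = [(a, b) for a, b in zip(u, v)
            if delta <= a <= 1.0 - delta and delta <= b <= 1.0 - delta]
    return [a for a, _ in kept], [b for _, b in kept]


# Hypothetical pseudo-observations; only the second pair survives the trimming.
u = [0.01, 0.50, 0.99, 0.40]
v = [0.50, 0.50, 0.50, 0.995]
u_kept, v_kept = trim_pseudo_observations(u, v, delta=0.05)
print(u_kept, v_kept)  # [0.5] [0.5]
```

The trimmed pairs would then be passed to the same pseudo-likelihood maximiser as the untrimmed ones, so the modification only changes which observations enter the objective.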
In order to have more comparable results for the various copula families, the estimates of the parameters are presented on the Kendall's tau scale. The performance of the estimators is measured by the bias, the standard deviation (SD), and the root mean square error (RMSE), which are estimated from random samples of sample sizes and ; their values multiplied by 100 are reported, because the obtained quantities are typically of order . The obtained results for the Clayton, Frank and Gaussian copulas are listed in Tables 1, 2, and 3, while the tables for the Gumbel and Student copulas can be found in the Supplementary material. The Monte Carlo simulations were run in the R statistical computing environment (R Core Team, 2018). The same starting seed was always used so that the estimates based on the true (but unobserved) errors are the same regardless of the choice of the marginals and . These 'oracle' estimates are denoted as "inov" in the tables and provide benchmarks for the estimators calculated from the estimated residuals.
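The reported summary measures can be computed as in the Python sketch below. It assumes the Clayton family for the mapping to the Kendall's tau scale and uses the divisor n in the SD so that RMSE² = bias² + SD² holds exactly; the study may have used a different divisor. The replicated estimates are made-up numbers for illustration only.

```python
import math


def clayton_tau(theta):
    """Kendall's tau of the Clayton copula with parameter theta."""
    return theta / (theta + 2.0)


def summarize(estimates, truth, scale=100.0):
    """Return (bias, SD, RMSE) of the estimates, each multiplied by `scale`."""
    n = len(estimates)
    mean = sum(estimates) / n
    bias = mean - truth
    sd = math.sqrt(sum((e - mean) ** 2 for e in estimates) / n)
    rmse = math.sqrt(sum((e - truth) ** 2 for e in estimates) / n)
    return scale * bias, scale * sd, scale * rmse


theta_hats = [1.8, 2.0, 2.2, 2.4]            # hypothetical replicated estimates
tau_hats = [clayton_tau(t) for t in theta_hats]
bias, sd, rmse = summarize(tau_hats, truth=0.5)
print(round(bias, 2), round(sd, 2), round(rmse, 2))
```

The decomposition can be checked against the tables: for instance a bias of 3.91 and an SD of 5.54 give an RMSE of about 6.78.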
3.2. Findings
As is well known (Genest et al., 1995; Tsukahara, 2005), in the case of no covariates the maximum pseudo-likelihood estimator is usually more efficient than the moment-like estimators. This is illustrated by the performance of the oracle estimators (i) and (iii), which are calculated from the errors. The question of interest is whether this property continues to hold also for estimators that are calculated from the residuals (i.e., in the presence of covariates).
Generally speaking one can conclude that, in agreement with our theoretical results, the maximum pseudo-likelihood estimator (iv) outperforms the inversion of Kendall's tau (ii) in situations for which our regularity assumptions are satisfied (see Table 2 and the rows corresponding to t-marginals in Tables 1 and 3). For these situations the modified maximum pseudo-likelihood estimator (v) is of no interest.
On the other hand the performance of the maximum pseudo-likelihood estimator (iv) may deteriorate significantly if the regularity assumptions are not met. The problems are generally worse for larger values of Kendall's tau (a stronger dependence). It is also interesting that exponential margins (rows denoted as N+E) are much more problematic than uniform margins (rows denoted as N+U).
As illustrated in Table 1, one should be particularly careful when fitting the Clayton copula (and also the Gumbel copula, as illustrated in the Supplementary material). Then (iv) performs significantly worse than (ii) in cases of non-regular margins combined with a strong dependence ($\tau=0.75$). The problems can to some extent be prevented by considering the modified estimator (v), in particular in the case of uniform margins (N+U). Thus while for the Frank copula the modified estimator is of no interest, for the Clayton (and the Gumbel) copula it presents an interesting alternative to the 'standard' maximum pseudo-likelihood estimator.
The results for the Gaussian copula (see Table 3) are of independent interest. Note that although the density of the copula function is unbounded, the estimator (iv) performs better than (ii) for $\tau=0.50$ even in the case of exponential margins (N+E). And this holds true for uniform margins (N+U) even for $\tau=0.75$. This raises the question whether a milder assumption on the marginal densities would be sufficient for the Gaussian copula.
An analogous simulation study was conducted also for a system of three linear regressions, where the vector of innovations was sampled from $C\big(F_{1\varepsilon}(y_{1}),F_{2\varepsilon}(y_{2}),F_{3\varepsilon}(y_{3})\big)$ with the marginals $F_{1\varepsilon}$ and $F_{2\varepsilon}$ being standard normal and $F_{3\varepsilon}$ either exponential (with mean 1) or uniform on . As the obtained results are very similar to the results for model (12), they are not presented here, but can be found in the Supplementary material. The common important finding is that the pseudo-likelihood estimator may perform poorly (and noticeably worse compared to the estimator based on the inversion of Kendall's tau) for copula families with unbounded densities even in cases when only one of the marginals does not satisfy the regularity assumption while the remaining ones are regular.
4. Conclusions and further discussions
As illustrated in the previous section one should be careful when a copula with an unbounded density is fitted with the help of the maximum pseudo-likelihood method. Although the assumptions of Theorem 1 are not strict one should keep in mind that they are not satisfied for distributions with a non-continuous error density function (e.g., uniform distribution, exponential distribution, …). Although such situations are probably rare in practice, there are applications in which for instance uniform errors can naturally appear (see e.g., Schechtman and Schechtman,, 1986).
One of the possible next steps would be to generalize the results to the time-series context and to find the assumptions under which the results claimed in Chen and Fan (2006a) hold. Based on our results for the i.i.d. setting and our simulation study, we conjecture that pseudo-likelihood estimation can be problematic when the marginal models have exponential innovations (or more generally positive or bounded innovations with a discontinuous density) (see e.g. Lawrance and Lewis, 1985; Davis and McCormick, 1989; Anděl, 1989, 1992; Nielsen and Shephard, 2003) and one uses $\sqrt{n}$-consistent estimators of the model parameters.
Note that in models where (based on our findings) the use of maximum pseudo-likelihood estimation is questionable, one can consider the method of moments (see e.g., Section 5.5.1 of McNeil et al., 2005; Brahimi and Necir, 2012). As proved in Côté et al. (2019), many moment estimators based on residuals satisfy (3) under less restrictive assumptions on the marginal error density $f_{j\varepsilon}$. In particular, for standard two-dimensional copulas the method of the inversion of Kendall's tau can present a 'robust' alternative. It is usually only slightly less efficient if no covariates are present, but in the presence of covariates it can perform significantly better than the maximum pseudo-likelihood estimator.
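For one-parameter bivariate families the inversion of Kendall's tau is easy to sketch. The Python example below computes the sample Kendall's tau with a simple quadratic-time loop and inverts the well-known relations tau = theta/(theta + 2) for the Clayton copula and tau = 1 - 1/theta for the Gumbel copula; the function names and the toy data are illustrative assumptions.

```python
def kendall_tau(x, y):
    """Sample Kendall's tau (no tie correction), computed over all pairs."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)   # +1 concordant, -1 discordant
    return 2.0 * s / (n * (n - 1))


def clayton_from_tau(tau):
    """Clayton parameter with Kendall's tau equal to `tau` (0 < tau < 1)."""
    return 2.0 * tau / (1.0 - tau)


def gumbel_from_tau(tau):
    """Gumbel parameter with Kendall's tau equal to `tau` (0 <= tau < 1)."""
    return 1.0 / (1.0 - tau)


# Hypothetical residuals from two fitted regressions.
res1 = [1.0, 2.0, 3.0, 4.0]
res2 = [1.0, 3.0, 2.0, 4.0]
tau_hat = kendall_tau(res1, res2)
print(tau_hat, clayton_from_tau(tau_hat), gumbel_from_tau(tau_hat))
```

Because the estimator depends on the residuals only through the signs of pairwise differences, no copula density is evaluated near the boundary of the unit square.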
For the sake of brevity we concentrated only on estimation of the copula parameter. We conjecture that other procedures (e.g., procedures for goodness-of-fit testing) that make use of the maximum pseudo-likelihood estimator calculated from the residuals will also be valid, provided that in addition to our assumptions some standard regularity assumptions for these procedures are satisfied.
Acknowledgments
M. Omelka gratefully acknowledges support from the grant GACR 19-00015S. The research of Š. Hudecová was supported by the grant GACR 18-01781Y. N. Neumeyer gratefully acknowledges support from the DFG (Research Unit FOR 1735 Structural Inference in Statistics: Adaptation and Efficiency).
Appendix A Proofs of the main results
Note that the estimated pseudoobservations given by (4) can be viewed as estimates of ‘unobserved’ pseudoobservations (given in (7)) which can be further viewed as estimates of , given by
[TABLE]
To prove Theorem 1 we need some technical results about the ‘closeness’ of (the -th element of ) to and .
As we will show later one does not need to handle if either is close to zero or one or if is too large. This is formalised as follows. Introduce the set of indices
[TABLE]
where
[TABLE]
The following lemma gives an upper bound on the number of indices for which it holds that or .
Lemma 1.
Let and satisfy (A2) and assumption holds. Then
[TABLE]
which further implies that
[TABLE]
Proof.
Denote
[TABLE]
and note that thanks to (A2) and Markov’s inequality (applied to )
[TABLE]
Now as the random variable $\frac{1}{n}\sum_{i=1}^{n}\mathbf{1}\big\{U_{ji}\not\in[\delta_{n},1-\delta_{n}]\text{ or }M_{j}(\boldsymbol{X}_{i})>a_{n}\big\}$ is non-negative, one can use Markov’s inequality once more to conclude that
[TABLE]
∎
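Both steps of the proof above rest on Markov's inequality. For reference, it reads as follows in generic notation; the symbols $Z$, $a$ and $r$ are placeholders introduced here and are not the paper's notation. For a non-negative random variable $Z$ with $\mathsf{E}\,Z^{r}<\infty$:

```latex
\mathsf{P}(Z \geq a) \;=\; \mathsf{P}(Z^{r} \geq a^{r}) \;\leq\; \frac{\mathsf{E}\, Z^{r}}{a^{r}},
\qquad a > 0,\ r > 0.
```

A moment assumption on the majorant therefore controls how often it exceeds a growing threshold, and applying the bound with $r=1$ to a non-negative average turns a bound on its expectation into a bound in probability.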
A.1. Some results on statistics with ranks calculated from residuals
Lemma 2.
Suppose that assumptions and hold and that is a -function. Then
[TABLE]
Proof.
As is a -function, it is easy to show that the expectation exists and is finite. Thus thanks to the law of large numbers it is sufficient to show
[TABLE]
Let and be as in (A1) and (A2), where and are chosen so that they satisfy the assumptions of Lemma 6. Then this lemma together with the standard Glivenko-Cantelli theorem for the empirical distribution function implies that
[TABLE]
Now introduce
[TABLE]
and note that with the help of (A4)
[TABLE]
As the above equation is not guaranteed for , we need to take care of the sets of indices and separately. That is why we bound given by (A3) as
[TABLE]
In what follows we show that each term on the right-hand side of (A7) is asymptotically negligible.
Dealing with the first term in (A7)
As is a -function one can bound
[TABLE]
Now by Lemma 1 (with probability going to one) there are at most indices for which there exists such that or . Thus one can choose the indices for which takes the biggest values and gets that (with probability going to one)
[TABLE]
Dealing with the second term in (A7)
Note that $\mathsf{E}\big|\varphi(\mathbf{U}_{i})\big|<\infty$ implies that
[TABLE]
Thus $\frac{1}{n}\sum_{i\in\mathrm{K}_{n}^{X}}\big|\varphi(\mathbf{U}_{i})\big|=o_{P}(1)$ follows from Markov’s inequality.
Dealing with the third term in (A7)
We use the continuity of the function . To be able to do that we need to stay in the interior of . Thus for a given (that will be specified later on), consider the set
[TABLE]
and introduce the corresponding sets of indices
[TABLE]
where for simplicity of notation we do not stress that both and depend on . Now one can bound
[TABLE]
Note that by the uniform continuity of the function on and (A6) one gets that the first term on the right-hand side of (A11) converges to zero in probability.
To deal with the second term on the right-hand side of (A11) note that thanks to (A6) with probability going to one
[TABLE]
Thus one can bound
[TABLE]
which can be made arbitrarily small by taking small enough.
Finally with the help of the law of large numbers the third term on the right-hand side of (A11) can be bounded by
[TABLE]
which can be also made arbitrarily small by taking sufficiently small and sufficiently large. ∎
Lemma 3.
Suppose that assumptions and hold. Let be a -function such that and . Then
[TABLE]
Proof.
Let and be defined as in (A5). Then similarly as in (A8) of the proof of Lemma 2 one can bound
[TABLE]
where the role of is now taken by .
In what follows we take so that
[TABLE]
and satisfies (B30). Such choices of and guarantee that the right-hand sides of (A13) are of order and at the same time the assumptions of Lemma 5 are satisfied and one can make use of Lemmas 6 and 7.
It is sufficient to show that
[TABLE]
Note that
[TABLE]
where and are given in (A2).
Now by the mean value theorem
[TABLE]
where lies between and . Thus to prove the lemma it is sufficient to show that the second term on the right-hand side of (A14) diminishes in probability.
With the help of Lemma 6 for a fixed one gets
[TABLE]
where
[TABLE]
and is taken sufficiently small so that . In what follows we show that and are asymptotically negligible.
Dealing with . With the help of Lemma A3 of Shorack (1972) and Lemma 7, for each there exists a positive constant such that the quantity given by (A17) can, with probability at least , be bounded by
[TABLE]
where the law of large numbers is used on the last line.
Thus one can concentrate on the quantities and .
Dealing with . Note that given by (A15) can be rewritten as
[TABLE]
Now analogously as in the proof of Lemma 2 one can show that
[TABLE]
and also
[TABLE]
Combining (A18), (A19), (A20) and the fact that the estimator is -consistent yields
[TABLE]
Dealing with . Now have a look at the term defined in (A16). One can proceed analogously as above and show that
[TABLE]
where
[TABLE]
Now similarly as in the proof of Lemma 5 one can show that
[TABLE]
and analogously also
[TABLE]
Now (A21), (A22), (A23) and (A24) yield that , which was to be proved.
∎
The following lemma will be useful for copula families with ‘nicely bounded’ score functions.
Lemma 4.
Suppose that assumptions and hold. Let be a -function such that and is bounded for each . Then the statement of Lemma 3 holds.
Proof.
By the mean value theorem
[TABLE]
Now take and recall the sets of indices of introduced in (A5). Then
[TABLE]
Now with the help of Lemma 9 one can show that the second term on the right-hand side of (A25) can be bounded as the preceding equation is
[TABLE]
where the last equation follows from Markov’s inequality and
[TABLE]
Finally the first term on the right-hand side of (A25) can be handled analogously as in the proof of Lemma 3. ∎
Corollary 2.
Suppose that assumptions of Lemma 3 or Lemma 4 are satisfied. Then
[TABLE]
where
[TABLE]
Proof.
With the help of (A12) it is sufficient to show that
[TABLE]
But this can be proved component-wise by mimicking the proof of Lemma 2 of Gijbels et al. (2017), where the situation with but a more general depending possibly also on is considered. ∎
A.2. Proofs of Theorems 1 and 2
Proof of Theorem 1.
With the help of Lemmas 2 and 3 the proof can closely follow the proof of Lemma 3 in Gijbels et al. (2017). In order to do that define
[TABLE]
In what follows we show that assumptions of Theorem A.10.2 of Bickel et al. (1993) are satisfied for and given by (A26).
It follows from the standard maximum likelihood theory that Assumption (GM0) is satisfied thanks to Assumption C1. Moreover, Assumptions C4 and C5 imply Assumption (GM3). Assumption (GM2) is also satisfied as thanks to assumption C3 one can for each apply Corollary 2 to and get
[TABLE]
where was introduced in Corollary 1.
Thus, it remains to check Assumption (U) from Theorem A.10.2. Therefore for each and for each , it is sufficient to find a neighborhood such that
[TABLE]
where stands for the element of .
For simplicity of notation, let us put . Assumption C4 allows one to adapt Lemma 2, which gives
[TABLE]
Hence, it remains to show
[TABLE]
For a given (that will be specified later on), let us introduce the sets and as in (A9) and (A10). Then the left-hand side of (A27) can be bounded by
[TABLE]
where was introduced in (A5) and in Assumption C4. Now with probability going to one for each sufficiently large , if , then . Thus for each the term on the right-hand side of (A28) can be made arbitrarily small (Assumption C4) up to term by considering a sufficiently small neighbourhood .
Finally, analogously as in the proof of Lemma 2, one can show that
[TABLE]
where as .
Thus we have verified the assumptions of Theorem A.10.2 of Bickel et al. (1993), which yields that there exists a consistent root (say ) of the estimating equation (5) which has the following asymptotic representation
[TABLE]
where the elements of the vector function are given in (11). Note that completely analogously one can show that there exists a consistent root (say ) of the estimating equation (6) which has the same asymptotic representation. This finally implies the statement of the theorem. ∎
Proof of Theorem 2.
The proof is completely analogous to the proof of Theorem 1. The only difference is that one uses Lemma 4 instead of Lemma 3. In fact the proof is even simpler as thanks to assumption C6 one can take a finite constant instead of the function . ∎
Appendix B Some results on and
In what follows let .
Lemma 5.
Suppose that assumptions and hold. Then for where it holds uniformly in
[TABLE]
for each and .
Proof.
We will show the statement for . The proof would be completely analogous for .
Note that
[TABLE]
In what follows we need to take care of the fact that the majorant from assumption can be unbounded. Let , where will be specified later. Then similarly as in the proof of Lemma 1 one can use Markov’s inequality to bound
[TABLE]
Note that thanks to the assumption it is straightforward to verify that $\tfrac{1}{2}+\tfrac{\beta}{\lambda}<r\big(\tfrac{1}{2}-\tfrac{1-\beta}{\lambda}\big)$. In the following we will take such that
[TABLE]
Now with the help of (B2) one can conclude that
[TABLE]
for .
Now for simplicity of notation introduce
[TABLE]
Further for and put
[TABLE]
where is sufficiently small. Note that the function is increasing on and decreasing on for . Finally let
[TABLE]
and for introduce the processes
[TABLE]
that are indexed by the set , where $T_{1}=\big\{\mathbf{t}\in\mathbb{R}^{p_{j}}:\|\mathbf{t}\|\leq 1\big\}$.
Note that assumption guarantees that for each , which further implies that . Put
[TABLE]
Then with the help of (B3) one can (with probability going to one) write that for
[TABLE]
Now equip the space with the semimetric given by
[TABLE]
where is a finite constant that will be specified afterwards.
Later we show that the assumptions of Theorem 2.11.11 of van der Vaart and Wellner (1996) are satisfied for the empirical process indexed by , which implies that the process is asymptotically tight. Further, as $\sup_{u\in(0,\frac{1}{2}]}\rho\big((\widehat{\boldsymbol{\vartheta}}_{n},u),(\boldsymbol{0},u)\big)=o_{P}(1)$, one gets that uniformly in
[TABLE]
where stands for the expectation with respect to ’s and ’s (while considering being fixed).
In what follows we concentrate on . If not stated otherwise all the following results hold uniformly for from this interval.
Note that similarly as in (B5) one can argue that
[TABLE]
This together with (B3) and (B) implies
[TABLE]
Thus to finish the proof it remains to deal with the second term on the right-hand side of (B8). As one can use the mean value theorem which guarantees that (with probability going to one) there exists such that
[TABLE]
Note that for such that one has
[TABLE]
and also
[TABLE]
where both inequalities hold uniformly in and . Thus with the help of Lemma 11
[TABLE]
and also
[TABLE]
Now combining the above findings with assumption yields that (B9) can be simplified to
[TABLE]
which together with (B8) implies (B1).
Verifying assumptions of Theorem 2.11.11 of van der Vaart and Wellner (1996)
First of all we need to show that the semimetric defined in (B6) is Gaussian-dominated. To prove that it is sufficient to show that (see p. 212 of van der Vaart and Wellner, 1996)
[TABLE]
where is the covering number of .
It is known (see Example 2.11.15 of van der Vaart and Wellner, 1996) that (B14) holds true if is replaced with and with
[TABLE]
as is Gaussian. But from the definition of in (B6) it follows that one can bound
[TABLE]
thus also satisfies (B14).
Next we need to check the three assumptions of Theorem 2.11.11 of van der Vaart and Wellner (1996). As in our situation the processes are identically distributed, the assumptions can be rewritten as follows.
(I) For each
[TABLE]
(II) For each
[TABLE]
(III) For every -ball of radius less than
[TABLE]
Note that the first assumption (B16) is easy to check as
[TABLE]
To verify the second assumption (B17) fix and (so that ) and calculate
[TABLE]
Now we will have a look at the first term on the right-hand side of (B19). For a given by the mean value theorem there exists between and such that
[TABLE]
Now with the help of (B4), (B10), (B11) and Lemma 10 one can conclude that with probability going to one
[TABLE]
which together with (B) implies that
[TABLE]
uniformly in .
Now fix and . Then by the mean value theorem there exists between and such that
[TABLE]
which together with
[TABLE]
assumption and (B23) implies that
[TABLE]
uniformly in and .
Now combining the inequalities (B21), (B22) and (B24) implies that
[TABLE]
Now turn our attention to the second term on the right-hand side of (B19). Analogously as above one can bound
[TABLE]
Combining this with (B19) and (B25) one gets
[TABLE]
where the last inequality follows by Lemma 13(iii) in Appendix D.
Finally we show that also the third assumption (B18) is satisfied. Let be a fixed -ball. Then from the properties of the Euclidean norm and the function (see Lemma 13(iv) in Appendix D), there exist and such that
[TABLE]
Then one can bound
[TABLE]
To deal with the last probability introduce
[TABLE]
Then one can bound
[TABLE]
where , stand for the first and second term on the right-hand side of (B27) respectively.
Now similarly as in (B25) one can bound the second moment of as
[TABLE]
provided that in the definition of the semimetric (B6) is taken sufficiently large.
Thus also by Markov’s inequality
[TABLE]
Now we can concentrate on the second term in (B27). To do so note that from the definition of the semimetric in (B15) it follows that for each
[TABLE]
which further implies that
[TABLE]
Using the above inequality one can bound (with probability going to one)
[TABLE]
where we have used that thanks to (B21)
[TABLE]
and for each
[TABLE]
Thus we can bound
[TABLE]
for a sufficiently large . Now combining (B28) and (B29) yields that
[TABLE]
which together with (B26) implies that also (B18) is satisfied.
∎
Note that while is only a cleverly chosen constant in Lemma 5 that is not involved in the statement, in the following lemmas we will speak about and thus we need to be more specific about . Thus in what follows we often assume that
[TABLE]
Lemma 6.
Suppose that the assumptions of Lemma 5 are satisfied and satisfies (B30). Then it holds uniformly in
[TABLE]
for each and .
Proof.
The lemma will be shown by substitution of into the approximation (B1) stated in Lemma 5. Note that all the following statements hold uniformly in .
The proof will be divided into four steps. First we show that with probability going to one
[TABLE]
to justify the substitution into (B1). Second
[TABLE]
Next we show that
[TABLE]
and finally we derive
[TABLE]
and realise that and .
Showing (B31).
Analogously as in (B22) for
[TABLE]
This further implies that
[TABLE]
where we have used that satisfies (B30). Thus for a sufficiently large one gets that
[TABLE]
and analogously also
[TABLE]
Showing (B32).
Note that with the help of (B36) one can conclude that
[TABLE]
which implies (B32).
Showing (B33) and (B34). This follows from (B31), (B12) and (B13).
Showing (B35).
Without loss of generality consider only those for which . Now for ) introduce
[TABLE]
Similarly as in the proof of Lemma 5 define for the processes
[TABLE]
that are indexed by the set . Now one can write as
[TABLE]
where
[TABLE]
Note that for
[TABLE]
Now equip the space with the semimetric given by
[TABLE]
where is a sufficiently large but finite constant. Then completely analogously as in the proof of Lemma 5 one can verify the assumptions of Theorem 2.11.11 of van der Vaart and Wellner (1996). Thus $\sup_{u\in(0,\frac{1}{2}]}\rho\big((\hat{\mathbf{t}}_{n},u),(\boldsymbol{0},u)\big)=o_{P}(1)$ implies that
[TABLE]
which together with (B37) implies that
[TABLE]
Now the right-hand side of the above equations can be with the help of (B12) and (B13) rewritten as
[TABLE]
which combined with (B38) implies (B35). ∎
Lemma 7.
Suppose that the assumptions of Lemma 6 are satisfied and . Then for each there exists such that for each for all sufficiently large
[TABLE]
Proof.
We concentrate on the inequality . Showing the upper inequality for would be analogous.
By Lemma 6 one gets , where
[TABLE]
and can be taken arbitrarily small.
Now by Lemma A3 of Shorack (1972) for each there exists such that
[TABLE]
Thus one can take provided we show that
[TABLE]
To do that one can consider each of the summands on the right-hand side of (B39) separately. Thus for instance one has that uniformly in
[TABLE]
as satisfies (B30). The other summands on the right-hand side of (B39) can be handled analogously. ∎
Some results useful when holds with
Lemma 8.
Suppose that assumptions and hold. Then for each
[TABLE]
Proof.
Let be the neighborhood of introduced in . Now consider the set of functions
[TABLE]
and denote its elements as . Then one can write
[TABLE]
Similarly as in the proof of Theorem 4 of Gijbels et al. (2015) one can argue that the set is -Donsker. Further similarly as in the proof of Lemma 5 one can show that
[TABLE]
which further implies that uniformly in
[TABLE]
Now by the mean value theorem there exists between and such that
[TABLE]
which together with (B41) implies (B40). ∎
Lemma 9.
Suppose that the assumptions of Lemma 8 are satisfied. Then for each
[TABLE]
Proof.
The proof follows by substitution of into (B40) and following the proof of Lemma 6. ∎
Appendix C Further auxiliary results
Lemma 10.
Suppose that assumption holds. Let satisfy and satisfies (B30). Further for introduce . Then there exists such that for all sufficiently large for all for each
[TABLE]
and
[TABLE]
Proof.
We show only that
[TABLE]
as the remaining inequalities could be proved analogously. Thus we need to show that
[TABLE]
Now by the mean value theorem
[TABLE]
where is between and . Thus with the help of (C1) it remains to show that
[TABLE]
Now by assumption and using the fact that
[TABLE]
where we have used that satisfies (B30), which guarantees that one can find sufficiently small so that holds.
∎
Lemma 11.
Suppose that the assumptions of Lemma 10 are satisfied. Then there exists such that for all for each
[TABLE]
and also
[TABLE]
as .
Proof.
We will prove only that
[TABLE]
as the remaining cases can be shown analogously.
First suppose that $\lim_{u\to 0_{+}}f_{j\varepsilon}\big(F_{j\varepsilon}^{-1}(u)\big)>0$. Then from Remark 2 one can conclude that and . Thus also and the statement follows from the continuity of .
Now suppose that $\lim_{u\to 0_{+}}f_{j\varepsilon}\big(F_{j\varepsilon}^{-1}(u)\big)=0$. Note that for a given
[TABLE]
which follows from the continuity of the function .
Now let be given and fixed. Thanks to assumption one can choose so that
[TABLE]
where Now thanks to Lemma 10 one can conclude that also
[TABLE]
which finishes the proof of the lemma. ∎
Lemma 12.
Suppose that the density satisfies assumption . Then
[TABLE]
Proof.
We will consider only . The remaining case would be handled analogously.
First, note that one can assume that , otherwise the proof is trivial. Now suppose that
[TABLE]
Then one can find a positive constant and a sequence monotonically going to infinity such that
[TABLE]
Note that by assumption the function is non-increasing for . In what follows we will assume that and that (otherwise one can take an appropriate subsequence of ). Now one can bound
[TABLE]
which contradicts the fact that is a density. ∎
Appendix D Some properties of function
Recall the definition of in (B15) and for simplicity of notation put . Then for each satisfying one has , where
[TABLE]
Lemma 13.
Let and be fixed. Then the following statements hold.
- (i) The function is increasing for .
- (ii) For the function is increasing on and decreasing on , where $u_{*}=u_{0}\big(\frac{1-2b}{2(1-b)}\big)^{1/b}$.
- (iii) For each it holds that .
- (iv) For each the set $U(u_{0},\epsilon)=\big\{u\in[0,\tfrac{1}{2}]:\rho_{0}(u,u_{0})\leq\epsilon\big\}$ is contained in a set such that .
Proof.
The proof of (i) follows directly from the definition of the function , as
[TABLE]
which is evidently an increasing function on .
For the proof of (ii) rewrite
[TABLE]
Now it is straightforward to find that the function has exactly one local maximum in the point and meets the claimed properties.
Now we show (iii). Note that thanks to (ii) the function is decreasing on , thus the inequality trivially holds if .
Thus suppose that . From (ii) we further know that , where . Thus we can bound
[TABLE]
which was to be proved.
To prove (iv) first note that from (i) there exists such that
[TABLE]
When searching for one has to be more careful as the function is not decreasing on . We need to distinguish two cases. First, let . Then one can find in a similar way as was found. Second, suppose that . Then we take simply .
Now it remains to check that . To do that bound
[TABLE]
∎
- Anděl, J. (1989). Non-negative autoregressive processes. J. Time Series Anal., 10(1):1–11.
- Anděl, J. (1992). Nonnegative multivariate AR(1) processes. Kybernetika, 28(3):213–226.
- Berghaus, B., Bücher, A., and Volgushev, S. (2017). Weak convergence of the empirical copula process with respect to weighted metrics. Bernoulli, 23(1):743–772.
- Bickel, P. J., Klaassen, C. A. J., Ritov, Y., and Wellner, J. A. (1993). Efficient and Adaptive Estimation for Semiparametric Models. Johns Hopkins University Press, Baltimore.
- Brahimi, B. and Necir, A. (2012). A semiparametric estimation of copula models based on the method of moments. Stat. Methodol., 9(4):467–477.
- Bücher, A., Jäschke, S., and Wied, D. (2015). Nonparametric tests for constant tail dependence with an application to energy and finance. J. Econometrics, 187(1):154–168.
- Chan, N.-H., Chen, J., Chen, X., Fan, Y., and Peng, L. (2009). Statistical inference for multivariate residual copula of GARCH models. Statist. Sinica, 19:53–70.
- Chen, X. and Fan, Y. (2006a). Estimation and model selection of semiparametric copula-based multivariate dynamic models under copula misspecification. J. Econometrics, 135:125–154.
