Are Unobservables Separable?
Andrii Babii, Jean-Pierre Florens

TL;DR
This paper develops a novel nonparametric test for the separability of unobservables in models with endogenous observables, using a nonseparable IV framework and applying it to US expenditure data.
Contribution
It introduces a new nonparametric test for unobservables separability based on a nonseparable IV model with a novel Donsker-type CLT for residuals.
Findings
Test rejects separability in Engel curves for most commodities.
Shows that the large-sample distribution of the test statistic is nonstandard.
Uses a dataset from the 2015 US Consumer Expenditure Survey.
Abstract
It is common to assume in empirical research that observables and unobservables are additively separable, especially when the former are endogenous. This is done because it is widely recognized that identification and estimation challenges arise when interactions between the two are allowed for. Starting from a nonseparable IV model, where the instrumental variable is independent of unobservables, we develop a novel nonparametric test of separability of unobservables. The large-sample distribution of the test statistic is nonstandard and relies on a novel Donsker-type central limit theorem for the empirical distribution of nonparametric IV residuals, which may be of independent interest. Using a dataset drawn from the 2015 US Consumer Expenditure Survey, we find that the test rejects the separability in Engel curves for most of the commodities.
| Commodity | KS | CvM | Commodity | KS | CvM |
|---|---|---|---|---|---|
| Food home | 0.00 | 0.00 | Gas and oil | 0.00 | 0.00 |
| Food away | 0.00 | 0.00 | Personal care | 0.00 | 0.00 |
| Clothing | 0.00 | 0.00 | Health | 0.00 | 0.00 |
| Tobacco | 0.00 | 0.00 | Insurance | 0.00 | 0.00 |
| Alcohol | 0.00 | 0.00 | Reading | 0.00 | 0.53 |
| Trips | 0.00 | 0.00 | Transportation | 0.01 | 0.03 |
| Entertainment | 0.08 | 0.00 | | | |
Are Unobservables Separable?
Andrii Babii
UNC Chapel Hill Department of Economics, University of North Carolina–Chapel Hill - Gardner Hall, CB 3305 Chapel Hill, NC 27599-3305. Email: [email protected].
Jean-Pierre Florens
Toulouse School of Economics Department of Economics, Toulouse School of Economics - 1, Esplanade de l’Université, 31080 Toulouse Cedex 06, France.
Keywords: unobservables, endogeneity, separability test, Engel curves, heterogeneity in unobservables, distribution of nonparametric IV residuals.
1 Introduction
It is common to assume in empirical research that observables and unobservables are additively separable, especially when the former are endogenous. This is done because it is widely recognized that identification and estimation challenges arise when interactions between the two are allowed for. However, economic theory often leads to nonseparable models. Prominent examples are demand functions, where the price or income effects might be heterogeneous in unobserved preferences; production functions, where observed input choices may be heterogeneous in input choices unobserved by the econometrician; labor supply functions with heterogeneous wage effects; wage equations, where the returns to schooling might vary with unobserved ability; or more generally, treatment effect models, where causal effects are heterogeneous in unobservables.
In response to these empirical challenges, there is a growing literature studying the nonparametric identification of nonseparable models with endogeneity; see Chernozhukov and Hansen (2005), Chernozhukov, Imbens, and Newey (2007), Florens, Heckman, Meghir, and Vytlacil (2008), Imbens and Newey (2009), Torgovitsky (2015), and D’Haultfœuille and Février (2015) among many others. It is well-understood that the fully nonparametric estimation of a nonseparable model may lead to a difficult nonlinear ill-posed inverse problem; see Carrasco, Florens, and Renault (2007), Horowitz and Lee (2007), Gagliardini and Scaillet (2012), and Dunker, Florens, Hohage, Johannes, and Mammen (2014).
Since a fully nonparametric estimation of a nonseparable model is more challenging and since separable models rule out the heterogeneity of marginal effects in unobservables, detecting separability is desirable in empirical applications. If the separability is rejected, then the more sophisticated nonseparable models should not be neglected, while if it turns out that the structural relation is separable, then the conventional empirical practice could be well-justified.
Despite the significant efforts focused on understanding the identification and the estimation of nonseparable IV models and the widespread use of separable IV models in empirical practice, little work has been done on developing formal testing procedures that could discriminate empirically between the two. Lu and White (2014) and Su, Tu, and Ullah (2015) are notable exceptions that develop separability tests under the conditional independence restriction and additional identifying restrictions imposed by the nonseparable model. The conditional independence restriction is different from the mean-independence restriction imposed by the separable nonparametric IV model and does not allow justifying the separable nonparametric IV model that we are interested in here. Other recent specification tests for the nonseparable model include the monotonicity test of Hoderlein, Su, White, and Yang (2016), the endogeneity test of Fève, Florens, and Van Keilegom (2018), and the specification test for the quantile IV regression of Breunig (2020).
In this paper, we design a novel fully nonparametric separability test. Our test is based on the independence condition of the nonseparable model and does not rely on additional identifying restrictions, such as the monotonicity in unobservable. The test is based on the insight that the structural function in the separable model can be estimated using the nonparametric IV approach; see Florens (2003), Newey and Powell (2003), Hall and Horowitz (2005), Blundell, Chen, and Kristensen (2007), and Darolles, Fan, Florens, and Renault (2011). If the separable model is correct, then the nonparametric IV residuals should approximate unobservables that are independent of the instrumental variables in the nonseparable IV model. This intuition suggests that it should be possible to detect the separability with the classical Kolmogorov-Smirnov or Cramér-von Mises independence tests between the nonparametric IV residuals and the instrumental variable. To the best of our knowledge, no such test is currently available in the literature, and it is not known whether the empirical distribution of the nonparametric IV residuals satisfies the Donsker property.
Formalizing this intuition is far from trivial since the regression residuals are different from the true regression errors and the nonparametric IV regression is an example of a linear ill-posed inverse problem and requires regularization. Moreover, the empirical distribution function of the nonparametric IV residuals is a non-smooth function of the estimated nonparametric IV regression. The weak convergence of the empirical distribution of regression residuals in the parametric linear case is a classical problem in statistics; see, e.g., Durbin (1973), Loynes (1980), and Mammen (1996). The extension to the nonparametric regression is more challenging, and it is remarkable that the empirical distribution of nonparametric regression residuals still converges weakly as was shown in Akritas and Van Keilegom (2001). The additively separable nonparametric IV regression differs from the problems discussed above in two important directions. First, its finite-sample and the asymptotic performance depend both on the smoothness of the regression function and the smoothing properties of the conditional expectation operator. Second, it features an additional dependence between the endogenous regressor and the regression error that cannot be neglected in practice.
In this paper, we show that the empirical distribution function of nonparametric IV residuals converges weakly to a Gaussian process at a parametric rate, even though residuals are obtained from the nonparametrically estimated ill-posed inverse problem. To the best of our knowledge, this is the first result on the distribution of the nonparametric IV residuals, which can be used to develop various residual-based specification tests and is of independent interest. Building on this result, we obtain the large sample approximation to the distributions of the independence-based separability tests. The distributions of residual-based independence tests are non-standard and not amenable to standard bootstrap approximations. Therefore, we suggest using the $m$ out of $n$ bootstrap or subsampling to compute the critical values.
Our results are based on the insight that the Tikhonov regularization in Sobolev spaces, considered in Florens, Johannes, and Van Bellegem (2011), Gagliardini and Scaillet (2012), Carrasco, Florens, and Renault (2014), and Gagliardini and Scaillet (2017), among others, provides a natural link between the modern empirical process theory and the theory of ill-posed inverse problems. In regards to this literature, we obtain new results for the Tikhonov regularization with a Sobolev penalty that can be applied to generic ill-posed inverse problems, including various nonparametric IV estimators, e.g., based on kernel smoothing. In particular, the Tikhonov regularization with a Sobolev penalty achieves sufficiently fast convergence rates for the semiparametric theory. In contrast, the simple one-step Tikhonov regularization without Sobolev penalization suffers from the well-known saturation effects; see Darolles, Fan, Florens, and Renault (2011).
The paper is organized as follows. In Section 2, we present two motivating examples, where economic considerations lead to nonseparable models with endogeneity and discuss a testable implication of separability. In Section 3, we characterize the large sample approximation to the distribution of the residual-based Kolmogorov-Smirnov and Cramér-von Mises independence tests and introduce a resampling procedure to compute the critical values. We also study the behavior of these tests under the fixed and the local alternative hypotheses. We report on a Monte Carlo study in Section 4 which provides insights about the validity of our asymptotic approximations in finite samples. In Section 5, we test the separability of Engel curves for a large set of commodities and find that the separability is rejected most of the time. Conclusions appear in Section 6. All technical details, auxiliary results, and proofs are collected in the Appendix and the Supplementary Material.
2 Separability of unobservables
2.1 Motivating examples
The instrumental variable models with additively separable unobservables constitute a workhorse of modern empirical practice. However, the additive separability of unobservables is a restrictive modeling assumption that essentially rules out the heterogeneity of estimated causal structural effects in unobservables; see, e.g., Heckman (2001) or Imbens (2010). Indeed, the structural economic models typically lead to nonseparable unobservables as illustrated below.
Example 2.1 (Demand function).
Consider a random utility maximization problem
[TABLE]
where is a utility function, is a vector of demanded quantities, is an individual preference variable, unobserved by the econometrician, is a vector of prices, and is the income. The solution to this optimization problem leads to the nonseparable demand functions for each good as shown in Brown and Walker (1989) and Lewbel (2001); see also Hoderlein and Vanhems (2018) for the welfare analysis based on the nonseparable model. The nonseparable demand functions may lead in turn to the nonseparable Engel curves.
Example 2.2 (Production function/frontier).
Simar, Vanhems, and Van Keilegom (2016) consider a production process with unobserved heterogeneity that leads to the production function/frontier such that , where is an output, are observed inputs, is an environmental factor, and is a measure of inefficiency. In this example, the nonseparable model is generated by the fact that the environmental factor is taken into account along with other input choices by firms, and, at the same time, the former is not observed by the econometrician.
2.2 A testable implication
Let $(Y, Z, W)$ be observed random variables admitting a nonseparable representation
$$Y = \Phi(Z, \varepsilon), \qquad (1)$$
where $Y$ is the outcome, $Z$ are regressors, $\varepsilon$ is unobservable, $W$ is a vector of instrumental variables, and $\Phi$ is a structural function. We assume that $W$ are valid instrumental variables satisfying the exclusion restriction, $\varepsilon \perp\!\!\!\perp W$, and the relevance condition, $Z \not\perp\!\!\!\perp W$. Note that the independence exclusion restriction is a commonly used identifying condition for nonseparable models; see Chernozhukov, Fernández-Val, Newey, Stouli, and Vella (2020), Blundell, Horowitz, and Parey (2017), Torgovitsky (2017), Torgovitsky (2015), D’Haultfœuille and Février (2015), Dunker, Florens, Hohage, Johannes, and Mammen (2014), Gagliardini and Scaillet (2012), and Horowitz and Lee (2007) for recent examples and applications, as well as Chernozhukov and Hansen (2013), Matzkin (2013), and Imbens (2010) for the review of earlier econometrics literature on the identification of nonseparable models.
The independence condition does not rule-out the heteroskedasticity in the distribution of conditionally on or , which is often observed in the empirical practice. It also does not rule-out the heteroskedasticity in the distribution of unobservables conditionally on covariates . However, it rules out the heteroskedasticity of unobservables conditionally on the instrumental variable, which could be less restrictive, since the instrumental variable is univariate in typical applications. This leads to an interesting trade-off between the heterogeneity of causal structural effects in unobservables allowed for in the nonseparable model and the heteroskedasticity of unobservables conditionally on the instrumental variable allowed for in the separable model.
To develop the separability test, several strategies can be adopted. For instance, one could nonparametrically estimate the nonseparable model and check whether the separability holds. This approach corresponds to the principle behind the Wald test for parametric models. Alternatively, since the nonparametric identification and estimation of the separable model is easier, one could estimate the separable model and check the independence condition of the nonseparable model. This approach corresponds to the principle behind Rao’s score test in the parametric setting and is the one adopted in this paper.
We say that the model in equation (1) has a separable representation if there exist measurable functions $\psi$ and $g$ such that
$$\Phi(Z, \varepsilon) = \psi(Z) + g(\varepsilon).$$
If the model has a separable representation, then the structural function can be estimated consistently using the nonparametric IV approach; see Darolles, Fan, Florens, and Renault (2011), Blundell, Chen, and Kristensen (2007), Horowitz and Lee (2007), and Newey and Powell (2003). The nonparametric IV regression function $\varphi$ solves the functional equation
$$r(w) := \mathbb{E}[Y \mid W = w]\, f_W(w) = \int \varphi(z)\, f_{ZW}(z, w)\, \mathrm{d}z =: (T\varphi)(w), \qquad (2)$$
where $f_W$ and $f_{ZW}$ are the densities of $W$ and $(Z, W)$, and $T$ is an integral operator. Let $U = Y - \varphi(Z)$ be the nonparametric IV regression error. Note that even if the model is nonseparable, we still have $Y = \varphi(Z) + U$ with $\mathbb{E}[U \mid W] = 0$ for $\varphi$ solving the functional equation (2). The following result provides a convenient testable implication of separability, provided that $\varphi$ is unambiguously defined; see the Appendix for a formal proof.
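Proposition 2.1.
Suppose that there exists a unique solution to equation (2). If the model in equation (1) admits a separable representation, then $U \perp\!\!\!\perp W$, i.e., the nonparametric IV regression error is independent of the instrumental variable.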
It is worth mentioning that the independence between $U$ and $W$ is only a testable implication of additive separability of unobservables. However, when the model is nonseparable, we have $U = \Phi(Z, \varepsilon) - \varphi(Z)$, which is a non-degenerate function of $Z$ and in many cases is not independent of $W$, because $Z \not\perp\!\!\!\perp W$ by the relevance condition. Therefore, the independence test between $U$ and $W$ will have power against many interesting deviations from the separability. Note also that Proposition 2.1 relies on the injectivity of $T$, which is known as a completeness condition, see Newey and Powell (2003) and Babii and Florens (2020), and does not require that the nonseparable model is identified; see, e.g., Chernozhukov and Hansen (2005) and Chen, Chernozhukov, Lee, and Newey (2014). Lastly, note that the additive separability is different from the multiplicative separability when $\Phi(Z, \varepsilon) = \psi(Z) g(\varepsilon)$. However, when $\psi$ and $g$ are positive, we obtain the additively separable model after taking logs.
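The logic behind Proposition 2.1 can be outlined in three lines. The notation here is assumed for illustration: $Y = \psi(Z) + g(\varepsilon)$ is the separable representation, $\varepsilon$ is independent of $W$, $g(\varepsilon)$ is taken to be integrable, $\varphi$ is the unique solution of equation (2) written in conditional-mean form, and $U = Y - \varphi(Z)$. This is only a sketch; the formal argument is the one in the Appendix.

```latex
\begin{aligned}
\mathbb{E}[Y \mid W]
  &= \mathbb{E}[\psi(Z) \mid W] + \mathbb{E}[g(\varepsilon) \mid W]
   = \mathbb{E}[\psi(Z) + c \mid W],
   \qquad c := \mathbb{E}[g(\varepsilon)], \\
\varphi &= \psi + c
   \qquad \text{(the solution of } \mathbb{E}[Y \mid W] = \mathbb{E}[\varphi(Z) \mid W] \text{ is unique)}, \\
U &= Y - \varphi(Z) = g(\varepsilon) - c ,
   \qquad \text{so } U \text{ is a function of } \varepsilon \text{ alone and } U \perp\!\!\!\perp W .
\end{aligned}
```

The second equality in the first line uses the independence of $\varepsilon$ and $W$, which makes the conditional mean of $g(\varepsilon)$ a constant.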
3 Independence test
In this section, we introduce tests of the independence condition characterized in Proposition 2.1. Formally, we focus on testing
$$H_0: U \perp\!\!\!\perp W \qquad \text{vs.} \qquad H_1: U \not\perp\!\!\!\perp W.$$
$H_0$ is testable, provided that the nuisance parameter $\varphi$ in $U = Y - \varphi(Z)$ is replaced by the appropriate estimator.
3.1 Tikhonov regularization in Sobolev spaces
We focus on the Tikhonov-regularized estimator penalized by the Sobolev norm to estimate the nuisance parameter ; see Carrasco, Florens, and Renault (2014), Gagliardini and Scaillet (2012), and Florens, Johannes, and Van Bellegem (2011). The attractive feature of this estimator is that it does not suffer from the well-known saturation bias and can achieve a sufficiently fast convergence rate for our asymptotic theory and more generally for semiparametric applications; see Corollary A.1.1 in the Appendix.
Let denote the space of functions square-integrable with respect to the Lebesgue measure. Let be a polynomial weight function with , where and is a Euclidean norm on . Consider the operator defined for all such that , where is a Fourier transform on with scaling . Then the self-adjoint operator generates a Hilbert scale of Sobolev spaces
[TABLE]
see Krein and Petunin (1966) for more details on Banach and Hilbert scales.
Let be the kernel estimators of in equation (2) computed as
[TABLE]
where and are kernel functions and is a sequence of bandwidth parameters.
We estimate using the Tikhonov-regularized estimator penalized by the Sobolev norm with
[TABLE]
where and are as described above. It is easy to see that this problem has a closed-form solution
[TABLE]
where and is the adjoint operator to .
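As a rough numerical illustration of a Tikhonov solve with a Sobolev-type penalty, the sketch below works on a grid. Everything in it is an assumption made for illustration rather than the authors' implementation: the operator and the right-hand side are taken as an already discretized matrix `T_hat` and vector `r_hat`, quadrature weights are ignored, and the penalty matrix (identity plus squared finite differences) is only a simple stand-in for the Hilbert-scale operator described above.

```python
import numpy as np

def sobolev_penalty(n_grid, dz, order=1):
    """Discrete stand-in for a Sobolev-norm penalty: I + (D^order)' D^order."""
    D = np.eye(n_grid)
    for _ in range(order):
        D = np.diff(D, axis=0) / dz  # finite-difference derivative matrix
    return np.eye(n_grid) + D.T @ D

def tikhonov_sobolev(T_hat, r_hat, alpha, penalty):
    """Closed-form minimizer of ||T phi - r||^2 + alpha * phi' P phi."""
    A = T_hat.T @ T_hat + alpha * penalty
    return np.linalg.solve(A, T_hat.T @ r_hat)
```

A larger `alpha` or a higher `order` pulls the solution towards smoother functions, which is the role the regularization parameter and the Sobolev penalty play in the text.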
3.2 Distribution of statistics
Let be the nonparametric IV residuals and let
[TABLE]
be the empirical distribution functions. To test , we focus on the following residual-based independence empirical process
[TABLE]
Note that this process involves residuals instead of the true regression errors , hence, its asymptotic behavior can be significantly different from the asymptotic behavior of classical independence empirical processes; see van der Vaart and Wellner (1996), Chapter 3.8. In particular, the estimation of the nuisance component may affect the asymptotic distribution of the independence empirical process.
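For concreteness, an independence empirical process of this kind and the two statistics built from it can be computed as in the sketch below. The inputs `u` (residuals) and `w` (a scalar instrument) and the function name are hypothetical. Evaluating the process at all pairs of sample points and averaging the squared process over that grid for the Cramér-von Mises statistic are assumptions; the paper's exact integration measure may differ.

```python
import numpy as np

def independence_stats(u, w):
    """KS and CvM statistics for independence of scalar u and scalar w.

    The process sqrt(n) * (F_uw - F_u * F_w) is evaluated at all pairs
    (u_i, w_j) of sample points, so memory use is O(n^2).
    """
    u = np.asarray(u, dtype=float)
    w = np.asarray(w, dtype=float)
    n = u.size
    iu = (u[:, None] <= u[None, :]).astype(float)  # iu[k, i] = 1{u_k <= u_i}
    iw = (w[:, None] <= w[None, :]).astype(float)  # iw[k, j] = 1{w_k <= w_j}
    F_uw = iu.T @ iw / n                 # joint empirical CDF at (u_i, w_j)
    F_u = iu.mean(axis=0)                # marginal empirical CDF at u_i
    F_w = iw.mean(axis=0)                # marginal empirical CDF at w_j
    G = np.sqrt(n) * (F_uw - np.outer(F_u, F_w))
    ks = np.abs(G).max()                 # sup-type statistic
    cvm = np.mean(G ** 2)                # average of G^2 over the sample grid
    return ks, cvm
```

In the setting of the paper, `u` would hold the nonparametric IV residuals, which is exactly why the limiting distribution differs from that of the classical process based on true errors.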
To understand the behavior of , we introduce several assumptions.
Assumption 3.1.
For some
(i) Operator smoothing: for all and .
(ii) Parameter smoothness: .
Assumption 3.1 (i) describes the smoothing property of the operator . Roughly speaking, the action of increases the Sobolev smoothness by , which is called the degree of ill-posedness. Intuitively, the more smooths out features of , the harder it is to recover it from the equation (2). Condition (ii) describes the smoothness of the structural function and is a standard smoothness restriction in the nonparametric literature.
Assumption 3.2.
(i) are i.i.d. observations of with , , and ; (ii) the distribution of is absolutely continuous with respect to the Lebesgue measure with densities and ; (iii) for some ; (iv) and products of a univariate continuous kernel of bounded variation with , , and for and .
Assumption 3.2 describes several mild conditions on the distribution of the data and the kernel functions that are largely standard for kernel estimators; see also Darolles, Fan, Florens, and Renault (2011), Appendix B for a discussion of generalized boundary kernels that can be used when supports are bounded. To introduce the next assumption, let be a partial derivative with respect to the variable , let denote the uniform norm, and put and .
Assumption 3.3.
(i) and with ; (ii) and with ;
Assumption 3.3 imposes some relatively mild smoothness conditions on the distribution of the data.
Assumption 3.4.
and as are such that (i) , , and ; (ii) , , and ; (iii) , , and ; where , , , , are as in Assumptions 3.1 and 3.2.
Assumption 3.4 (i) provides a set of sufficient conditions for with , while condition (ii) states additional requirements for ; see Corollary A.1.1 in the Appendix. The former condition is needed for the asymptotic equicontinuity argument, while the latter requires that the nuisance parameter is estimated at a sufficiently fast rate, which is often encountered in the semiparametric literature. Lastly, condition (iii) ensures that a certain uniform asymptotic expansion holds. To illustrate that conditions on tuning parameters are feasible, suppose for simplicity that and that and for some . Then (i) requires that , , and . For (ii), we additionally need , , and . Lastly, (iii) requires that , , and . Therefore, we require , which is non-empty provided that . Given this choice, the following smoothness conditions are imposed in Assumption 3.4: and .
The following result describes a convenient approximation to the residual-based independence empirical process:
Theorem 3.1.
Suppose that Assumptions 3.1, 3.2, 3.3, and 3.4 are satisfied. Then
[TABLE]
uniformly over with
[TABLE]
It is worth mentioning that Theorem 3.1 does not require . The proof of this result can be found in the Appendix and relies on the asymptotic equicontinuity arguments. Roughly speaking, we show that the consistency of the nonparametric IV estimator in the Sobolev norm together with the Donsker property of Sobolev balls imply that certain terms associated with residuals are asymptotically negligible. At the same time, the estimation of the nuisance component has a first-order asymptotic effect due to the term, while the higher-order terms are negligible provided that . This rate condition is typically encountered for the semiparametric problems; see Chernozhukov, Chetverikov, Demirer, Duflo, Hansen, Newey, and Robins (2018) and Chernozhukov, Escanciano, Ichimura, Newey, and Robins (2016) for recent contributions, Andrews (1994) for earlier treatment, and Babii (2021), Section 3.3 for a related discussion in the setting of ill-posed inverse problems.
It is worth mentioning that, in some cases, the estimation of nuisance parameters does not have any first-order asymptotic effect, which is known as the Neyman orthogonality property in the semiparametric literature. In particular, this is the case for the independence empirical process based on the nonparametric conditional mean regression residuals; see Einmahl and Van Keilegom (2008). Interestingly, if we had , then , and the estimation of would not have any first-order asymptotic effect.
Theorem 3.1 can be readily used to construct the residual-based Cramér-von Mises and Kolmogorov-Smirnov statistics
[TABLE]
To understand the behavior of the two statistics under the null and the alternative hypotheses, consider a centered version of the process in Theorem 3.1
[TABLE]
where . The following Donsker-type central limit theorem holds:
Proposition 3.1.
Suppose that assumptions of Theorem 3.1 are satisfied. Then
[TABLE]
where is a tight centered Gaussian process with uniformly continuous sample paths and the covariance function
[TABLE]
Note that under the null hypothesis , we have and the covariance function of simplifies to
[TABLE]
For the alternative hypothesis, , put
[TABLE]
Consider also a sequence of local alternative hypotheses
[TABLE]
where the function is such that is a proper CDF. There exist several ways to construct such local alternatives with prespecified marginal distributions and . For instance, the Morgenstern’s family is with ; see Devroye (1986), Chapter XI, Theorem 3.2. The following corollary describes the behavior of the independence test under the null and fixed/local alternative hypotheses:
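To make the Morgenstern construction concrete, the sketch below draws pairs with uniform marginals from the Farlie-Gumbel-Morgenstern copula $C(u, v) = uv[1 + a(1 - u)(1 - v)]$, $|a| \le 1$, by inverting the conditional distribution of the second coordinate. The function name and the parameter `a` are hypothetical, and the sketch does not reproduce the paper's local alternative itself; prespecified marginals would be obtained by applying the corresponding quantile functions to each coordinate.

```python
import math
import random

def sample_fgm(n, a, seed=0):
    """Draw n pairs (u, v) from the FGM copula with parameter a in [-1, 1]."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        u, t = rng.random(), rng.random()
        b = a * (1.0 - 2.0 * u)
        # Solve v * (1 + b * (1 - v)) = t for v in [0, 1]; this form of the
        # quadratic root stays well defined when b is zero.
        s = math.sqrt((1.0 + b) ** 2 - 4.0 * b * t)
        v = 2.0 * t / (1.0 + b + s)
        out.append((u, v))
    return out
```

Setting `a = 0` gives independent coordinates, and moving `a` away from zero introduces a mild dependence while leaving both marginals uniform.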
Corollary 3.1.
Suppose that assumptions of Theorem 3.1 are satisfied. Then under
[TABLE]
while under , we have , provided that . Moreover, under
[TABLE]
Corollary 3.1 shows that the residual-based independence test can detect parametric local alternatives. The asymptotic distributions under are not pivotal, in contrast to the nonparametric regression without endogeneity, cf. Einmahl and Van Keilegom (2008). While obtaining the distribution-free statistics is possible in simpler residual-based testing problems, see Escanciano, Pardo-Fernández, and Van Keilegom (2018), these methods do not seem to extend naturally to our setting. Therefore, the bootstrap could be an attractive alternative for simulating the critical values of the test. Interestingly, the naive nonparametric and the multiplier bootstraps do not work.
3.3 Critical values
The asymptotic distributions in Corollary 3.1 are nonstandard and depend on several nuisance nonparametric components. This calls for resampling methods to compute the critical values. As can be seen from the proof of Theorem 3.1, our uniform asymptotic expansion relies on the differentiability of the CDF. This leads to a dependence of the asymptotic distribution on the probability density function in Corollary 3.1; see also the proof of Theorem A.1 and Corollary A.2.1. Such uniform asymptotic expansion cannot be obtained in the same way for the bootstrapped statistics since in the bootstrap world the empirical distribution function is not differentiable.
The lack of smoothness of the empirical distribution function suggests that the standard bootstrap procedures may fail in approximating the asymptotic distribution of the test statistics. The problem of a similar nature occurs with the bootstrap of the cube-root consistent estimators; see, e.g., Babii and Kumar (2021) and references therein. Another complication with the bootstrap is that we typically need to resample from the distribution obeying the constraints of the null hypothesis and that the validity of the bootstrap has to be established case-by-case. Note also that the (smoothed) residual bootstrap, cf. Neumeyer and Van Keilegom (2019), does not preserve the dependence between the endogenous regressor and the unobservables and does not mimic the data generating process of the IV regression under the null hypothesis. In Section 4, we find in Monte Carlo experiments that the standard nonparametric bootstrap does not work.
Consequently, we suggest relying on subsampling or the $m$ out of $n$ bootstrap to compute the critical values of the test. The resampling procedure is as follows:
1. Draw a sample of size $m$ from the original sample without replacement (subsampling) or with replacement ($m$ out of $n$ bootstrap), where $m$ is a sequence such that $m \to \infty$ and $m/n \to 0$ as $n \to \infty$.
2. Compute the Kolmogorov-Smirnov or the Cramér-von Mises statistic using the simulated sample.
3. Repeat the first two steps many times and compute the critical values using empirical quantiles of the statistics over all simulated samples. Alternatively, compute p-values as , where is the statistic computed from and is the empirical distribution function of bootstrapped statistics.
An attractive feature of subsampling is that it is valid for general hypothesis testing problems; see Politis, Romano, and Wolf (2001), Theorem 3.1, and there is no need to show its validity in each specific application. An adaptive data-driven rule to select is considered, e.g., in Bickel and Sakov (2008).
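The three resampling steps can be sketched generically as follows. The function and argument names are hypothetical, `statistic` stands for any routine that recomputes the test statistic (including the nonparametric IV fit) on a resampled data set, and the p-value is taken as the share of resampled statistics at least as large as the observed one, which is an assumed reading of the p-value rule above.

```python
import random

def resampling_pvalue(data, statistic, m, n_rep=500, replace=True, seed=0):
    """p-value from the m out of n bootstrap (replace=True) or subsampling.

    data: list of observations; statistic: callable mapping a list of
    observations to a float.
    """
    rng = random.Random(seed)
    t_obs = statistic(data)              # statistic on the full sample
    t_star = []
    for _ in range(n_rep):
        if replace:
            sample = rng.choices(data, k=m)   # with replacement
        else:
            sample = rng.sample(data, m)      # without replacement
        t_star.append(statistic(sample))
    p_value = sum(t >= t_obs for t in t_star) / n_rep
    return p_value, t_star
```

Critical values follow from the same output, for example as an empirical quantile of `t_star`.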
4 Monte Carlo experiments
To evaluate the finite-sample performance of the test, we simulate samples as
[TABLE]
We set and consider samples of size and observations; see Supplementary Material for additional simulation results. Note that the degree of separability of unobservables is governed by . The separable model corresponds to , while any corresponds to the alternative nonseparable model. It is worth mentioning that under , the nonparametric IV regression does not estimate consistently the nonseparable structural function , which depends on unobservables. The nonparametric IV regression estimates instead the function solving the functional equation . The difference between the two functions is precisely what gives the power to the test.
We set the number of Monte Carlo replications and the number of bootstrap replications to through all our experiments. We also discretize all continuous quantities on the grid of 100 equidistant points in . The estimates and in equation (3) are obtained using the sixth-order Epanechnikov kernel. The corresponding bandwidth parameters are computed using Silverman’s rule of thumb: and , where and are sample standard deviations of observed and . This choice satisfies Assumption 3.4 and requires that the regularization parameter is with . To satisfy this requirement, we set .
We look at the distributions of Kolmogorov-Smirnov and Cramér-von Mises statistics, computed respectively as
[TABLE]
where and the empirical distribution functions are computed as in equation (4). Lastly, we use the adaptive rule of Bickel and Sakov (2008) to estimate the size of the subsample. The rule consists of choosing , where is the empirical distribution of the simulated statistics using a subsample of size , is integer part of , and .
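An adaptive rule of this type can be sketched as below: candidate sizes shrink geometrically, and the chosen size is the one at which the distribution of resampled statistics changes least when moving to the next smaller candidate. The helper names, the default `q`, and the lower bound `m_min` are assumptions for illustration; `boot_stats` is assumed to map each candidate size to a list of statistics already simulated with that size.

```python
import bisect

def sup_ecdf_distance(a, b):
    """Sup distance between the empirical CDFs of samples a and b."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    dist = 0.0
    for x in a_sorted + b_sorted:  # the sup is attained at a jump point
        fa = bisect.bisect_right(a_sorted, x) / len(a_sorted)
        fb = bisect.bisect_right(b_sorted, x) / len(b_sorted)
        dist = max(dist, abs(fa - fb))
    return dist

def candidate_sizes(n, q=0.75, m_min=20):
    """Sizes m_j = integer part of q**j * n, kept while m_j >= m_min."""
    sizes, j = [], 0
    while int(q ** j * n) >= m_min:
        sizes.append(int(q ** j * n))
        j += 1
    return sizes

def adaptive_m(sizes, boot_stats):
    """Pick the m_j minimizing the distance between sizes m_j and m_{j+1}."""
    gaps = [sup_ecdf_distance(boot_stats[a], boot_stats[b])
            for a, b in zip(sizes[:-1], sizes[1:])]
    return sizes[gaps.index(min(gaps))]
```

The statistics in `boot_stats` would come from running the resampling step once per candidate size before calling `adaptive_m`.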
Figure 1 shows the distribution of the test statistics under the null hypothesis and the two alternative hypotheses for different sample sizes. The two distributions are sufficiently distinct once the alternative hypothesis becomes more separated from the null hypothesis.
We plot in Figure 2 the power curves when the level of the test is fixed at . The power of the test increases once alternative hypotheses become more distant from the null hypothesis. The Cramér-von Mises test seems to have higher power for the class of considered alternatives. We can also see that the figure illustrates the consistency of the test in the sense that its power becomes closer to one as the sample size increases under the alternative hypotheses.
In Figure 3, we explore the performance of the bootstrap. We plot the exact finite sample distribution of both test statistics and the distribution of bootstrapped statistics under . In panels (a) and (b), we plot the distribution of the naive bootstrap, drawing a sample of size $n$ randomly with replacement from the data. In panels (c) and (d), we plot the distribution of the $m$ out of $n$ bootstrap. The naive bootstrap fails and does not mimic the distribution of the Kolmogorov-Smirnov/Cramér-von Mises statistics. The distribution of the $m$ out of $n$ bootstrap, on the other hand, is close to the finite sample distributions of both statistics. We also observe that the adaptive choice seems to work slightly better for the Cramér-von Mises statistic.
5 Are Engel curves separable?
Engel curves are fundamental for the analysis of consumers’ behavior and have implications for aggregate economic outcomes. The Engel curve describes the relationship between the demand for a particular commodity and the household’s budget. Interesting applications of estimated Engel curves include the measurement of welfare losses associated with tax distortions in Banks, Blundell, and Lewbel (1997), the estimation of growth and inflation in Nakamura, Steinsson, and Liu (2016), and the estimation of income inequality across countries in Almås (2012). The nonparametric IV approach to the estimation of Engel curves was pioneered in the seminal paper of Blundell, Chen, and Kristensen (2007), who focus on the estimation of Engel curves in the UK.
We draw a dataset from the 2015 US Consumer Expenditure Survey; see Babii (2020) for the estimated Engel curves with uniform confidence bands using this dataset. We restrict our attention to married couples with a positive income during the last 12 months, yielding 10,055 observations. The dependent variable is the share of expenditures on a particular commodity, while the endogenous regressor is the natural logarithm of total expenditures. We instrument the expenditures using gross income. In particular, Blundell, Chen, and Kristensen (2007) point out that gross income will be exogenous for consumption expenditures assuming that heterogeneity in earnings is not related to unobserved preferences over consumption; see also Chen and Christensen (2018) and Babii (2020).
In Table 1, we report the m out of n bootstrap p-values, with the adaptive choice of m; see Section 4 for more details on the practical implementation of the tests. We report results for both the Kolmogorov-Smirnov (KS) and the Cramér-von Mises (CvM) tests. Remarkably, the 5% level tests reject separability for all commodities, with the exception of Entertainment (KS) and Reading (CvM). Moreover, the 1% level tests reject separability in all cases except Entertainment (KS), Reading (CvM), and Transportation (CvM); the reported p-value for Transportation (KS) is 0.01, right at the threshold. This suggests that Engel curves for these commodities may exhibit substantial heterogeneity in unobservables.
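The decisions implied by Table 1 can be checked mechanically. The short sketch below records the reported p-values and lists the commodity and test pairs for which separability is not rejected at a given level, using the usual convention of rejecting when the p-value does not exceed the level. The dictionary layout and the function name `not_rejected` are illustrative choices, and the p-values are the two-decimal figures from the table, so borderline cases such as a reported 0.01 depend on rounding.

```python
# p-values from Table 1 (two-decimal figures as reported).
p_values = {
    "Food home": {"KS": 0.00, "CvM": 0.00},
    "Food away": {"KS": 0.00, "CvM": 0.00},
    "Clothing": {"KS": 0.00, "CvM": 0.00},
    "Tobacco": {"KS": 0.00, "CvM": 0.00},
    "Alcohol": {"KS": 0.00, "CvM": 0.00},
    "Trips": {"KS": 0.00, "CvM": 0.00},
    "Entertainment": {"KS": 0.08, "CvM": 0.00},
    "Gas and oil": {"KS": 0.00, "CvM": 0.00},
    "Personal care": {"KS": 0.00, "CvM": 0.00},
    "Health": {"KS": 0.00, "CvM": 0.00},
    "Insurance": {"KS": 0.00, "CvM": 0.00},
    "Reading": {"KS": 0.00, "CvM": 0.53},
    "Transportation": {"KS": 0.01, "CvM": 0.03},
}


def not_rejected(level):
    """Return sorted (commodity, test) pairs with p-value above the level."""
    return sorted(
        (commodity, test)
        for commodity, tests in p_values.items()
        for test, p in tests.items()
        if p > level  # reject separability when p <= level
    )


for level in (0.05, 0.01):
    print(f"level {level}: not rejected -> {not_rejected(level)}")
```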
6 Conclusions
This paper offers a new perspective on the separability of unobservables in economic models with endogeneity. Starting from the nonseparable model where the instrumental variable is independent of unobservables, our first contribution is to develop a novel fully nonparametric separability test. The test is based on the estimation of a separable nonparametric IV regression and the verification of the independence restriction imposed by the nonseparable IV model. To obtain a large sample approximation to the distribution of our test statistics, we develop novel uniform asymptotic expansions of the empirical distribution function of nonparametric IV residuals and obtain new results for the Tikhonov regularization in Sobolev spaces. We show that, despite the uncertainty coming from an ill-posed inverse nonparametric IV regression, the empirical distribution function of residuals and the residual-based independence empirical process still satisfy the Donsker central limit theorem. In contrast to the nonparametric regression without endogeneity, we find that the parameter uncertainty affects the asymptotic distribution of the residual-based independence tests, which is highly nonstandard. In our Monte Carlo experiments, we find that the naive bootstrap fails to approximate the distribution of the test statistics under the null hypothesis; hence we rely on the m out of n bootstrap (or subsampling) procedure to compute critical values.
Using the 2015 US Consumer Expenditure Survey data, we find that the test rejects the separability of Engel curves for most of the commodities. This indicates that the Engel curves may be heterogeneous in unobservables and that the nonseparable modeling of Engel curves may be useful; see, e.g., Blundell, Horowitz, and Parey (2017) for the estimation of nonseparable demand functions.
The paper offers several other directions for future research. First, it might be interesting to test the separability of unobservables in other structural relations that are commonly estimated using the additively separable models in the empirical practice, such as the production function, the labor supply function, the demand function, or the wage equation. Second, given the plethora of residual-based specification tests for regression models without endogeneity, our results could also be used to develop similar tests for econometric models with endogeneity; see Pardo-Fernández, Van Keilegom, and González-Manteiga (2007) and Escanciano, Pardo-Fernández, and Van Keilegom (2018).
Acknowledgement
This work was supported by the French National Research Agency under Grant ANR-19-CE40-0013-01/ExtremReg project. We thank Ivan Canay, Tim Christensen, Elia Lapenta, Pascal Lavergne, Thierry Magnac, Nour Meddahi, and Ingrid Van Keilegom for helpful discussions. All remaining errors are ours.
APPENDIX: ADDITIONAL RESULTS AND PROOFS
Notation.
For two sequences and , we denote if and if both and . For two sequences of random variables and , we denote for . For a bounded linear operator on normed spaces, we use to denote its operator norm, where with some abuse of notation, we use to denote the norm of both spaces.
A.1 Tikhonov regularization in Sobolev spaces
This section discusses convergence rates for the Tikhonov-regularized estimator in Sobolev spaces. The following result extends Carrasco, Florens, and Renault (2014), Proposition 3.1 to the case of the unknown operator.
Theorem A.1.
Suppose that Assumption 3.1 is satisfied, , and . Then for every
[TABLE]
It is worth emphasizing that this result is not specific to the nonparametric IV regression and can be applied to a generic ill-posed inverse problem , where is estimated with . Moreover, in the case of nonparametric IV regression, it can be easily applied to nonparametric/machine learning estimators other than the kernel smoothing. Next, we specialize the generic result of Theorem A.1 to the nonparametric IV regression with estimated via kernel smoothing, see equation (3).
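For readers less familiar with Tikhonov regularization, the following minimal sketch applies it to a discretized ill-posed problem. It is only an illustration under simplifying assumptions: the operator is a known matrix rather than an estimated one, the penalty is the plain Euclidean norm rather than a Sobolev norm, and the function name `tikhonov_solve`, the Gaussian smoothing operator, the noise level, and the value of the regularization parameter are all hypothetical choices.

```python
import numpy as np


def tikhonov_solve(K, r, alpha):
    """Minimize ||K phi - r||^2 + alpha * ||phi||^2 over phi.

    The minimizer solves the normal equations (alpha * I + K'K) phi = K'r.
    """
    p = K.shape[1]
    return np.linalg.solve(alpha * np.eye(p) + K.T @ K, K.T @ r)


# Illustrative ill-posed problem: a Gaussian smoothing operator on [0, 1],
# discretized on a grid of 100 equidistant points.
grid = np.linspace(0.0, 1.0, 100)
h = grid[1] - grid[0]
K = h * np.exp(-((grid[:, None] - grid[None, :]) ** 2) / (2 * 0.1**2))

phi_true = np.sin(2 * np.pi * grid)
rng = np.random.default_rng(0)
r_noisy = K @ phi_true + 1e-3 * rng.normal(size=grid.size)

# Without regularization the inversion amplifies noise; alpha > 0 stabilizes it.
alpha = 1e-4
phi_hat = tikhonov_solve(K, r_noisy, alpha)
rmse = np.sqrt(np.mean((phi_hat - phi_true) ** 2))
```

Varying `alpha` in this toy problem shows the usual trade-off: a larger value increases the regularization bias, while a smaller value lets the noise in the right-hand side dominate.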
Corollary A.1.1.
Suppose that Assumptions 3.1 and 3.2 are satisfied, , and . Then for every
[TABLE]
A.2 Distribution of nonparametric IV residuals
In this section, we present results on the weak convergence of the empirical distribution of nonparametric IV residuals. These results are used to obtain the large sample approximation to the distribution of independence tests and are of independent interest.
Theorem A.1.
Suppose that Assumptions 3.1, 3.2, and 3.3 (i), and 3.4 are satisfied. Then
[TABLE]
uniformly over .
Proof.
By Lemma A.4.1, the following expansion holds uniformly in
[TABLE]
By Taylor’s theorem, there exists some such that
[TABLE]
By Lemma B.1.1 in the Supplementary Material,
[TABLE]
while under Assumptions 3.3 (i) and 3.4
[TABLE]
Combining all estimates, we obtain uniformly in
[TABLE]
∎
As a consequence of Theorem A.1, we obtain the following Donsker-type central limit theorem for the empirical distribution of nonparametric IV residuals.
Corollary A.2.1.
Suppose that assumptions of Theorem A.1 are satisfied. Then
[TABLE]
where is a tight centered Gaussian process with uniformly continuous sample paths and the covariance function
[TABLE]
Proof.
The process given in Theorem A.1 is an empirical process indexed by the following class of functions , which is a sum of the Donsker class and . By van der Vaart and Wellner (1996), Example 2.10.5, it is enough to show that is Donsker. The former statement follows from the fact that under Assumption 3.1 in the Supplementary Material by Engl, Hanke, and Neubauer (2000), since for
[TABLE]
where the last inequality follows under Assumption 3.3 (i). Therefore, , where is a Sobolev ball of radius . Since , this shows that the class is Donsker; see Nickl and Pötscher (2007), Corollaries 4 and 5. The covariance function simplifies since . ∎
A.3 Proofs of main results
In this section we provide proofs of main results of the paper.
Proof of Proposition 2.1.
Since is injective, the nonparametric IV regression is unique. Therefore, is a well-defined unique random variable. If the model in equation (1) admits a separable representation, then since
[TABLE]
Therefore, by the injectivity of , and whence . This shows that because . ∎
Proof of Theorem 3.1.
By Lemma A.4.2, uniformly in
[TABLE]
where
[TABLE]
The first term is a classical independence empirical process
[TABLE]
where the second line follows by the maximal inequality.
Next, under Assumption 3.3 (i), by Taylor’s theorem, for some
[TABLE]
Under Assumptions 3.3 by Corollary A.1.1
[TABLE]
Similarly, we have uniformly in
[TABLE]
Therefore, uniformly in
[TABLE]
where the last line follows by the same argument as in the proof of Theorem A.1 under Assumption 3.3 (i). ∎
Proof of Proposition 3.1.
is an empirical process indexed by the class of functions
[TABLE]
By van der Vaart and Wellner (1996), Example 2.10.7 it suffices to show that each of the functions in the sum constitutes a Donsker class. To that end, recall first that the indicator functions are classical examples of Donsker classes. Therefore, all terms in , but the last one, are either Donsker or can be factored as Donsker classes and a deterministic bounded function not depending on the argument of the indicator function. Lastly, under Assumptions 3.1 (i) by Engl, Hanke, and Neubauer (2000), Corollary 8.22
[TABLE]
where the latter follows under Assumption 3.3 (ii). Therefore, we obtain that , where is a Sobolev ball of radius . Since , this shows that is Donsker; see Nickl and Pötscher (2007), Corollaries 4 and 5. ∎
Proof of Corollary 3.1.
Since under , by Proposition 3.1, the asymptotic distribution of under is readily obtained by the continuous mapping theorem; see van der Vaart and Wellner (1996), Theorem 1.3.6. For the Cramér-von Mises statistics, write
[TABLE]
with
[TABLE]
By Proposition 3.1, under , and also converges weakly by Proposition 3.1 and Theorem A.2.1, whence by the Skorokhod construction
[TABLE]
The first expression in Eq. A.1 implies that . Since has a.s. bounded and continuous trajectories, the second expression in Eq. A.1 in conjunction with the Helly-Bray theorem shows that . Therefore, the asymptotic distribution of the Cramér-von Mises test follows by the continuous mapping theorem.
Under the fixed alternative hypothesis, since , by Theorem 3.1, the Glivenko-Cantelli theorem, and a similar argument we obtain
[TABLE]
Therefore, by Slutsky’s theorem and , which proves the second statement. For the local alternatives, note that
[TABLE]
Therefore, by Corollary 3.1 and the continuous mapping theorem
[TABLE]
For the Cramér-von Mises statistics, write
[TABLE]
where
[TABLE]
Therefore, the result follows by Proposition 3.1 and the same argument as under with the only difference that now we have the bias in the limiting distribution. ∎
A.4 Auxiliary technical results
In this section, we provide several auxiliary technical results.
Lemma A.4.1.
Suppose that Assumptions 3.1, 3.2, 3.3, and 3.4 are satisfied. Then
[TABLE]
where and .
Proof.
The main idea of the proof is to embed the process inside the supremum into an empirical process indexed by and a Sobolev ball containing with a probability tending to one. We first show that the process is Donsker, whence the supremum in Eq. A.2 is . Finally, the required order will follow from the fact that the process is degenerate.
Let be a ball of radius in the Sobolev space . For and , define , , , and . Note that is a classical Donsker class of indicator functions. If we can show that is Donsker, then will be Donsker as a sum of two Donsker classes; see van der Vaart and Wellner (1996), Theorem 2.10.6. To this end, we check that the bracketing entropy condition is satisfied for .
By Nickl and Pötscher (2007), Corollary 4 the bracketing number of satisfies , where denotes the space of functions, square-integrable with respect to . Put and fix . Let be a collection of -brackets for , i.e., for any , there exists such that and , and whence . Now for each , partition the real line into intervals defined by grids of points and , so that each segment has probabilities
[TABLE]
Denote the largest such that by and the smallest such that by . Consider the following family of brackets Under Assumption 3.2 (ii)
[TABLE]
Therefore, we constructed brackets of size , covering , and we have used at most such brackets. Since , we have . This shows that the empirical process is Donsker, hence, asymptotically equicontinuous; see van der Vaart and Wellner (1996), Theorem 1.5.7. Then for any
[TABLE]
where denotes the outer probability measure.
Next, we show that for every , with , where the expectation is computed with respect to only. Indeed,
[TABLE]
where the third line follows by the Cauchy-Schwarz inequality and Corollary A.1.1 under Assumptions 3.1, 3.2, and 3.4. Similarly,
[TABLE]
Lastly, let denote the supremum in Eq. A.2. Then
[TABLE]
where the second probability tends to zero as we have just shown and the last probability tends to zero since under the maintained assumptions, by Corollary A.1.1, . Therefore, it follows from the asymptotic equicontinuity in Eq. A.3 that , which concludes the proof. ∎
Lemma A.4.2.
Suppose that Assumptions 3.1, 3.2, 3.3, and 3.4 are satisfied. Then uniformly over
[TABLE]
and
[TABLE]
where and ,
Proof.
Note that the first expression and the expression in the statement of Lemma A.4.1 multiplied by differ only by
[TABLE]
which is by Corollary A.2.1 and the classical Donsker central limit theorem. By Lemma A.4.1, we obtain the first statement since is uniformly bounded by one.
The proof of the second statement is similar to the proof of Lemma A.4.1 and is omitted. ∎
B.1 Additional proofs and auxiliary results
This section contains proofs of several results from the main part of the paper as well as several auxiliary results.
Proof of Theorem A.1.
Decompose
[TABLE]
with
[TABLE]
For the first term
[TABLE]
where the second line follows by Engl, Hanke, and Neubauer (2000), Corollary 8.22 with ; the third line by the definition of operator norm; the fourth line by the isometry of functional calculus; and the last since for all .
Similarly, since for bounded linear operators and , ,
[TABLE]
Next, since and , by Engl, Hanke, and Neubauer (2000), Corollary 8.22, there exists such that . Therefore,
[TABLE]
Next, decompose
[TABLE]
with
[TABLE]
and
[TABLE]
Similarly, decompose
[TABLE]
with and defined below. In particular,
[TABLE]
where the last two lines follow by Engl, Hanke, and Neubauer (2000), Corollary 8.22 with and previous computations. Similarly,
[TABLE]
The result follows from combining all estimates. ∎
Proof of Corollary A.1.1.
By the Cauchy-Schwarz inequality
[TABLE]
where the second line follows from the well-known risk bound; see, e.g., Giné and Nickl (2015), p. 403-404 under Assumption 3.2. Therefore, by Theorem A.1
[TABLE]
The proof of
[TABLE]
under Assumption 3.2 can be found in Babii and Florens (2020). ∎
Lemma B.1.1.
Suppose that Assumptions 3.1, 3.2, 3.3, and 3.4, and are satisfied. Then
[TABLE]
Proof.
Similarly to the proof of Theorem A.1, decompose
[TABLE]
with
[TABLE]
We show below that . To that end, first since
[TABLE]
where the third line follows under Assumptions 3.1 (i) and 3.3 (i); and the fourth by arguments as in the proof of Corollary A.1.1 under Assumptions 3.2 and 3.4 (ii).
Second,
[TABLE]
where the first equality follows since is self-adjoint; the second line by the Cauchy-Schwarz inequality since under Assumption 3.3 (i) and by Assumption 3.1 (i); the third since for some by Engl, Hanke, and Neubauer (2000), Corollary 8.22; the fourth by the isometry of the functional calculus; and the last since and since under Assumption 3.4 (iii).
Next, decompose with
[TABLE]
By the Cauchy-Schwarz inequality and previous computations
[TABLE]
and
[TABLE]
Therefore, under Assumption 3.4 since
[TABLE]
Similarly, decompose with
[TABLE]
Likewise, by the Cauchy-Schwarz inequality and previous computations
[TABLE]
and
[TABLE]
where we use and ; see also the proof of Theorem A.1. Therefore, .
Combining all estimates, we obtain uniformly over
[TABLE]
Next, note that
[TABLE]
with , whence
[TABLE]
with . Using this observation, decompose equation (B.1) further
[TABLE]
with
[TABLE]
By the Cauchy-Schwarz inequality
[TABLE]
where the second line follows by the triangle inequality and Assumption 3.3 (i); the third by Assumption 3.2 (i), the Cauchy-Schwarz inequality, and since and are uniformly bounded under Assumption 3.2 (ii); and the last by the standard bias computations under Assumptions 3.1 (ii) and 3.2, and Young’s inequality under Assumption 3.2 (ii) and (iv).
Similarly, by the Cauchy-Schwarz inequality and Assumption 3.1 (i)
[TABLE]
where the second line follows under the i.i.d. assumption; the third since under Assumption 3.2 (i); the fourth since is uniformly bounded under Assumption 3.2 (ii); and the last by the standard bias computations under Assumptions 3.1 (ii) and 3.2 (iv).
Lastly, by the Cauchy-Schwarz inequality
[TABLE]
where the second inequality follows under Assumption 3.2 (i); the third line under Assumptions 3.1, 3.2 (i)-(ii), and 3.3 (i); and the last by the isometry of functional calculus.
Combining these estimates under Assumptions 3.3 (i) and 3.4, we obtain the result
[TABLE]
∎
B.2 Additional Monte Carlo experiments
In this section, we report results of additional Monte Carlo experiments when the structural function is . The rest of the data-generating process is the same as in the main part of the paper.
Figure B.1 shows the distribution of the test statistics under the null hypothesis and the two alternative hypotheses for different sample sizes. The distributions under the null and under the alternatives become clearly distinct as the alternative hypothesis moves further away from the null hypothesis.
We plot in Figure B.2 the power curves when the level of the test is fixed at . The power of the test increases as the alternative hypotheses move further away from the null hypothesis and as the sample size grows. The Cramér-von Mises test appears to have higher power for the class of alternatives considered.
Overall, the findings are largely similar to the findings of experiments presented in the main part of the paper.
References
- Akritas, M., and I. Van Keilegom (2001): “Non-parametric estimation of the residual distribution,” Scandinavian Journal of Statistics, 28(3), 549–567.
- Almås, I. (2012): “International income inequality: Measuring PPP bias by estimating Engel curves for food,” American Economic Review, 102(2), 1093–1117.
- Andrews, D. (1994): “Chapter 37: Empirical process methods in econometrics,” Handbook of Econometrics, 4, 2247–2294.
- Babii, A. (2020): “Honest confidence sets in nonparametric IV regression and other ill-posed models,” Econometric Theory, 36(4), 658–706.
- Babii, A. (2021): “High-dimensional mixed-frequency IV regression,” arXiv preprint arXiv:2003.13478.
- Babii, A., and J.-P. Florens (2020): “Is completeness necessary? Estimation in nonidentified linear models,” arXiv preprint arXiv:1709.03473.
- Babii, A., and R. Kumar (2021): “Isotonic regression discontinuity designs,” Journal of Econometrics (forthcoming).
