Nonparametric Regression with Multiple Thresholds: Estimation and Inference
Yan-Yu Chiou, Mei-Yuan Chen, Jau-er Chen

TL;DR
This paper develops methods for estimating and testing the number and values of multiple thresholds in nonparametric regression models with an exogenous threshold variable, supported by simulations and an empirical application.
Contribution
It introduces a testing procedure to determine the unknown number of thresholds and derives their asymptotic properties, advancing nonparametric regression analysis.
Findings
The proposed test accurately identifies the number of thresholds.
Sequential estimation of threshold values is precise.
Monte Carlo simulations confirm the test's effectiveness.
Abstract
This paper examines nonparametric regression with an exogenous threshold variable, allowing for an unknown number of thresholds. Given the number of thresholds and corresponding threshold values, we first establish the asymptotic properties of the local constant estimator for a nonparametric regression with multiple thresholds. However, the number of thresholds and corresponding threshold values are typically unknown in practice. We then use our testing procedure to determine the unknown number of thresholds and derive the limiting distribution of the proposed test. The Monte Carlo simulation results indicate the adequacy of the modified test and accuracy of the sequential estimation of the threshold values. We apply our testing procedure to an empirical study of the 401(k) retirement savings plan with income thresholds.
| | 10% | 5% | 1% |
|---|---|---|---|
| 1 | 1.281552 | 1.644854 | 2.326348 |
| 2 | 1.632219 | 1.954508 | 2.574961 |
| 3 | 1.818281 | 2.121201 | 2.711943 |
| 4 | 1.943196 | 2.234002 | 2.805821 |
| 5 | 2.036469 | 2.318679 | 2.876895 |
| | 500 | 1000 | 2000 |
|---|---|---|---|
| 1% | 0.021 | 0.017 | 0.011 |
| 5% | 0.045 | 0.051 | 0.051 |
| 10% | 0.076 | 0.084 | 0.086 |
| | 2000 | 2000 | 2000 |
|---|---|---|---|
| 1% | 0.018 | 0.011 | 0.018 |
| 5% | 0.050 | 0.044 | 0.056 |
| 10% | 0.087 | 0.090 | 0.086 |
| | | | MSE() | | | MSE() | | | MSE() |
|---|---|---|---|---|---|---|---|---|---|
| 500 | 0.4227 | 0.2542 | 0.0705 | 0.1775 | 0.0960 | 0.0100 | -0.6523 | 0.2407 | 0.0602 |
| 1000 | 0.4867 | 0.1245 | 0.0160 | 0.1529 | 0.0320 | 0.0010 | -0.6894 | 0.1198 | 0.0140 |
| 3000 | 0.5025 | 0.0079 | 6.9×10⁻⁵ | 0.1498 | 0.0077 | 5.9×10⁻⁵ | -0.7029 | 0.0040 | 2.4×10⁻⁵ |
Yan-Yu Chiou (a), Mei-Yuan Chen (b,∗), Jau-er Chen (c,∗)
(a) Institute of Economics, Academia Sinica, Taiwan.
(b) Department of Finance, National Chung Hsing University, Taiwan.
(c) Department of Economics, National Taiwan University, Taiwan.
2nd-round R&R at the Journal of Econometrics
We are grateful to the two anonymous referees for their constructive comments that have greatly improved this paper. We thank Ming-Yen Cheng for valuable discussions, and thank Zongwu Cai and the participants at the International Symposium on Recent Developments in Econometric Theory with Applications in Honor of Professor Takeshi Amemiya for their helpful comments. The usual disclaimer applies. *Corresponding authors: National Chung Hsing University, Department of Finance, 250 Kuo Kuang Road, Taichung 402, Taiwan. Tel.: +886-4-22853323. E-mail address: @dragon.nchu.edu.tw (Mei-Yuan Chen); National Taiwan University, Department of Economics, No. 1, Sec. 4, Roosevelt Road, Taipei 10617, Taiwan. Tel.: +886-2-3366-8326. E-mail address: [email protected] (Jau-er Chen).
Keywords: nonparametric regression, threshold variable, threshold value, significance test
JEL Classification: C12; C13; C14
1 Introduction
Piecewise linearity has been widely used to model shifts in economic relationships under a regression framework. Most regressions with piecewise linearity can be represented as linear regressions with thresholds. For example, linear regressions with structural changes can be written as linear threshold regressions with the time index as the threshold variable. Among previous studies, Bai and Perron (1998, 2003), Qu and Perron (2007), and Yamamoto and Perron (2013) estimate and test linear regressions with structural changes, and Chen (2008), Qu (2008), and Oka and Qu (2011) estimate and test linear quantile regressions with structural changes. The threshold model splits the sample into classes based on the value of an observed variable (i.e., whether it exceeds a certain threshold). In empirical work, determining thresholds for economic variables such as tax rates and the public debt ratio is relevant for policy makers. When the threshold is unknown, as is typical in practice, it needs to be estimated, which increases the complexity of the econometric problem. Nonetheless, theories of estimation and inference are well developed for linear models with exogenous regressors, including the works by Chan (1993), Hansen (1996, 1999, 2000), and Caner (2002).
The scope of threshold models has broadened considerably in recent years. In particular, discussions of piecewise linearity have been extended to nonparametric regressions. Su and Xiao (2008), for instance, test for structural changes in time-series nonparametric regression models, while Chen and Hong (2012) investigate how to test for smooth structural changes in time-series models by using nonparametric regressions. In addition, Chen and Hong (2013) extend their earlier study to test for smooth structural changes in panel data models. In economics, the regression discontinuity (RD) design has gradually emerged as a common tool in applied research. The validity of RD estimates depends crucially both on the threshold variable (also termed the running variable in the RD literature) and on an adequate description of the conditional mean function of the outcome variable. Since what looks like a jump at the threshold might simply be unaccounted for nonlinearity, the nonparametric approach plays an important role in the RD estimations (cf. Angrist and Pischke, 2009). For example, by allowing for an unknown threshold value in the RD framework, Henderson, Parmeter, and Su (2014) provide estimation and inference procedures for the threshold value in a nonparametric regression with one threshold. Although related to Henderson et al. (2014), which is a pioneering study examining the nonparametric regression with one threshold, our study analyzes nonparametric regression with multiple thresholds. Further, in contrast to Henderson et al. (2014), the threshold variable is excluded from the explanatory variables in our framework. In empirical applications, multiple thresholds might be present; however, the number of thresholds and the corresponding threshold values are typically unknown in practice. 
Therefore, identifying the unknown number of thresholds and estimating the threshold values are critical issues in a nonparametric regression with multiple thresholds, especially when conducting empirical studies. We thus propose a testing procedure to determine the unknown number of thresholds and derive the limiting distribution of the proposed test. To the best of our knowledge, the present study is the first to comprehensively investigate the aforementioned issues. This study develops a test procedure for testing the existence of thresholds, determining the number of thresholds, and estimating the values of thresholds in nonparametric regression. Specifically, this procedure is a modified significance test based on the work of Aït-Sahalia et al. (2001). In addition, we establish the consistency and asymptotic normality of the threshold value estimators by using the sequential method. Hence, this study complements the existing literature on estimating and testing multiple thresholds in nonparametric regression models. Further, we apply our testing procedure to an empirical study of the 401(k) retirement savings plan with income thresholds and identify four threshold values. Those crucial income threshold values are all above the median income value.
The rest of the paper is organized as follows. The model specification and estimation for a nonparametric regression with thresholds are introduced in Section 2. This section also summarizes the necessary assumptions for deriving our theoretical results of the test statistics and estimators under the known thresholds. Section 3 provides the test determining the unknown number of thresholds. Section 4 presents the statistical properties of the multiple threshold estimator. Section 5 investigates the performance of these tests by using Monte Carlo studies, while Section 6 presents an empirical application. Section 7 concludes. All the technical proofs are collected in the Appendix.
2 Model, Assumptions, and Asymptotics
We first fix notation and consider the following threshold model, which is a nonparametric regression with thresholds and known threshold values:
[TABLE]
where is the outcome variable, is a vector of the covariates, is the threshold variable, which is used to split the sample into distinct regimes, are the corresponding threshold values, and denotes an indicator function defined as
[TABLE]
with and . Accordingly, the conditional mean of the th regime at a grid point can be represented as
[TABLE]
where and denote the joint density function of and and the marginal density of in the th regime, respectively.
Given a sample with observations , the nonparametric regression with known thresholds is specified as
[TABLE]
where , , and are the th sample observations of , , and , respectively; is the regression error. Note that the threshold values satisfy .
Given a -dimensional product kernel function, , in which is defined as
[TABLE]
the sample kernel density estimators of and are
[TABLE]
Thus, the standard Nadaraya-Watson kernel regression estimator of is
[TABLE]
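The regime-wise local constant estimator described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes a Gaussian product kernel, a single common bandwidth `h`, and the convention that regime `j` (0-based) collects observations whose threshold variable exceeds exactly `j` of the threshold values. The function names are hypothetical.

```python
import math


def gaussian_product_kernel(u):
    """Product of univariate standard normal densities over the coordinates of u."""
    return math.prod(math.exp(-0.5 * v * v) / math.sqrt(2.0 * math.pi) for v in u)


def regime_of(q, thresholds):
    """0-based regime index: the number of threshold values strictly below q.

    Whether the boundary is strict or weak is an assumption of this sketch.
    """
    return sum(q > g for g in thresholds)


def nw_regime(x0, j, X, Y, Q, thresholds, h):
    """Local constant estimate of E[Y | X = x0] from regime-j observations only."""
    num = den = 0.0
    for xi, yi, qi in zip(X, Y, Q):
        if regime_of(qi, thresholds) != j:
            continue  # the indicator drops observations outside regime j
        w = gaussian_product_kernel([(a - b) / h for a, b in zip(xi, x0)])
        num += w * yi
        den += w
    # den / (n * h^d) is the matching kernel density estimate up to scaling
    return num / den if den > 0.0 else float("nan")
```

Each covariate vector in `X` is a tuple, so the same code covers one or several covariates. An empty regime returns `nan` rather than raising.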
2.1 Assumptions
To establish the asymptotic properties of the conditional mean estimator, , and the density estimator, , in the th regime, as well as the convergence rate of the optimal bandwidth selector, we make the following assumptions.
Assumption 1. The following assumptions are specified for the random variables under study.
- 1-1.
is strictly stationary, ergodic and -mixing with coefficients for some fixed , satisfying .
- 1-2.
The density is bounded away from zero and globally integrable on the compact support of the weighting function , where is defined in Section 3.1 when we construct the proposed test statistic. Hence .
- 1-3.
The joint density of exists for all and is continuous on .
- 1-4.
, and is square-integrable on .
- 1-5.
, and .
Assumption 2. The following assumptions are imposed on the kernel function.
- 2-1.
is a product kernel, , given , and a bounded function on , symmetric about 0, with , , , and .
- 2-2.
The kernel is th continuously differentiable with .
Assumption 3. The following assumptions are assumed for the bandwidth selector.
- 3-1.
As , , and .
- 3-2.
As , the bandwidth sequence is such that and then , and .
Assumptions 1-1 and 1-3 are similar to Assumption 7 in Aït-Sahalia et al. (2001), allowing for dependent observed data including macroeconomic or financial time-series data. Assumptions 1-2 and 1-4 generalize Assumption 2 of Aït-Sahalia et al. (2001) to encompass the threshold models. Moreover, Assumptions 1-4 and 1-5 restrict the behavior of the conditional moments and conditional mean functions across distinct regimes. Assumption 2-1 states the standard restrictions on higher-order kernel functions, which are devices used to reduce bias (cf. Li and Racine, 2007). Assumption 2-2, however, implies that there is no need to use a higher-order kernel unless the dimensionality of the covariate is greater than or equal to 3. Assumption 3 imposes joint restrictions on the bandwidth sequence , the order of the kernel , the dimensionality of the covariate , and the sample size . In particular, when and , the restriction, which is also used by Aït-Sahalia et al. (2001), leads to . In this study, when conducting Monte Carlo simulations, we impose , which suffices for the nonparametric estimator to have valid asymptotic properties.
2.2 Asymptotic Properties of the Estimators under Known Thresholds
Assuming that the number of thresholds and corresponding threshold values are known already, the consistency and asymptotic normality of are provided in Theorem 1 and the asymptotic properties of are stated in Theorem 2.
Theorem 1.
Suppose that the assumptions in Assumptions 1, 2, and 3-1 hold. The following results are established.
a). The almost sure convergence rate of ,
[TABLE]
b). The asymptotic normality of ,
[TABLE]
where
[TABLE]
When the estimation is carried out at a single point , we have the convergence rate . In empirical applications, multiple often appear and then the estimator has a slower uniform convergence rate . Hence, from part , the kernel-smoothing density estimation is biased. Given that a Gaussian product kernel is being used, we already know that and according to Aït-Sahalia et al. (2001). Moreover, given that the number of thresholds and corresponding threshold values are known, the consistency and asymptotic normality of are provided as follows.
Theorem 2.
Suppose that the assumptions in Assumptions 1, 2 and 3-1 hold. The following results are derived.
a) The almost sure convergence rate of ,
[TABLE]
b) The asymptotic normality of ,
[TABLE]
where denotes the asymptotic bias,
[TABLE]
and are the first- and second-order derivatives of the th regime’s conditional mean with respect to the th explanatory variable, respectively.
It is now clear that the sample estimator is also asymptotically biased. However, this asymptotic bias could be reduced by using higher-order kernels. Notice that the convergence rates and asymptotic results of and are not affected by , the number of thresholds. In finite samples, the number of thresholds does affect the nonparametric estimation. However, at the limit, the convergence rate does not depend on . Our results are therefore similar to those presented by Li and Racine (2007).
2.3 Optimal Bandwidth Selector
In nonparametric regressions, bandwidth plays a crucial role in the estimation. Different bandwidth selection rules have been suggested in the literature. Among the selectors, the optimal bandwidth selector is the most comprehensively studied and is obtained by minimizing the mean integrated squared error (MISE). That is, for a model with thresholds, the corresponding MISE is defined as
[TABLE]
and then the optimal bandwidth selector is obtained from
[TABLE]
The weighting function is an indicator function selecting a particular -region of interest, and this depends generally on empirical studies. Since the threshold variable does not affect the convergence rate of the proposed estimator, we construct the weighting function without including the threshold variable. The convergence rate of is derived and summarized in the following theorem.
Theorem 3.
Under Assumptions 1, 2, and 3, the convergence rate of the optimal bandwidth selector is in which .
This result shows that the convergence rate of the optimal bandwidth selector depends on the number of covariates and order of continuous differentiability of the kernel function, but that the convergence rate is not affected by the number of thresholds. In other words, the additional thresholds do not worsen the curse-of-dimensionality problem.
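The MISE-optimal bandwidth above depends on the unknown conditional mean, so in applied work a feasible data-driven rule is needed. The sketch below uses leave-one-out least-squares cross-validation, a common stand-in that is not taken from the paper; it assumes a scalar covariate, a Gaussian kernel, and a user-supplied grid of candidate bandwidths. All names are hypothetical.

```python
import math


def loo_cv_score(h, X, Y):
    """Mean squared leave-one-out prediction error of the local constant fit."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(X, Y)):
        num = den = 0.0
        for k, (xk, yk) in enumerate(zip(X, Y)):
            if k == i:
                continue  # leave observation i out of its own fit
            w = math.exp(-0.5 * ((xk - xi) / h) ** 2)
            num += w * yk
            den += w
        if den == 0.0:
            return math.inf  # bandwidth too small: no neighbours carry weight
        total += (yi - num / den) ** 2
    return total / len(X)


def select_bandwidth(X, Y, grid):
    """Return the grid value with the smallest cross-validation score."""
    return min(grid, key=lambda h: loo_cv_score(h, X, Y))
```

In a threshold model the same routine could be run within each regime or on the pooled sample; which of the two matches the paper's selector is not specified here.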
3 Determining the Number of Thresholds
The number of thresholds and corresponding threshold values are typically unknown in practice. In this section, we thus present a procedure for determining the unknown number of thresholds and estimating the threshold values. In linear regressions with thresholds, the number of thresholds is commonly determined by carrying out a sequential significance test (see Hansen, 1997). This sequential test is conducted by comparing the estimated sum of the squared errors from a model with thresholds (under the null hypothesis) with that from a model with thresholds (under the alternative) sequentially. The number of thresholds is determined as when the null of thresholds versus the alternative of thresholds is rejected, whereas the null of thresholds versus the alternative of thresholds is not rejected. Similarly, we determine the number of thresholds in nonparametric regressions based on sequential tests in this study. Instead of comparing the estimated error sum of squares from the linear regressions, however, we use the significance test suggested by Aït-Sahalia et al. (2001) for the nonparametric regressions as the basis in the sequential tests. The test statistic for the null of thresholds to thresholds is constructed and its asymptotic distribution is established as follows.
The test of Aït-Sahalia et al. (2001) is constructed to test the significance of a subset of covariates in a nonparametric regression. The intuition behind the test is to check the difference between the nonparametric regression estimates of unconstrained and constrained conditional means. That is, the null of the significance test is written as
[TABLE]
where represents the -dimensional explanatory variables, is the -dimensional explanatory variables under testing, and denote the conditional means under the alternative and null hypotheses, and and are the joint probability density functions of and , respectively.
To test the null of thresholds versus the alternative of , this test can be modified by taking as the independent variables in the regression with thresholds and as the extra independent variables in the regression with thresholds. The significance of implies that the regression with thresholds must be considered. However, the regression remains with thresholds if is not significant. The details are discussed as follows. First, we construct the test for detecting whether an extra threshold (known at value, ) exists in the th regime. Second, since the threshold value is unknown in general, the test is extended to test whether an extra unknown threshold exists in the th regime.
3.1 Testing for the Existence of an Extra Threshold
Given a regression with thresholds expressed as (2), a new threshold is suspected to exist in the th regime . Then, the conditional mean for the regime is split into two parts: in the regime and in the regime , where
[TABLE]
and is defined as
[TABLE]
and is defined similarly to .
Denote as the conditional mean with thresholds under the null and as the conditional mean function with thresholds under the alternative. Then, the null hypothesis for testing whether an extra threshold exists in the regime can be written as
[TABLE]
The sample statistic analogous to the test in Aït-Sahalia et al. (2001) is constructed as
[TABLE]
where , , and are the sample estimates of , , and , respectively, and is a weighting function. Specifically,
[TABLE]
The choice of is application-dependent. For example, in an empirical analysis of options prices, can be set to exclude those in-the-money options with price biases. Similarly, it can be set by using prior information to tackle boundary effects so that the density is bounded away from zero. Since is the weighted sum of the squares of the differences from to and to , the null hypothesis, , is not rejected when is insufficiently large and is rejected when is sufficiently large. Therefore, this inference is a right-tailed test. The asymptotic distribution of is constructed as follows.
Theorem 4.
Under the null hypothesis and according to Assumptions 1, 2, and 3, the asymptotic normality of the statistic is represented as
[TABLE]
where and denote the bias and variance terms, respectively, and where the bias term is
[TABLE]
with
[TABLE]
was defined in Theorem 1, and the variance term is
[TABLE]
with
[TABLE]
where , , and are
[TABLE]
and
[TABLE]
Note that Aït-Sahalia et al. (2001) also show that when the Gaussian product kernel is used. Given the result in Theorem 4, we denote
[TABLE]
and then the statistic for testing whether an extra threshold exists in the th regime can be taken to be
[TABLE]
where and are the consistent estimators for and , respectively. The limiting distribution of is . The power of is investigated in Section 3.4, where the test is shown to be consistent. We describe the consistent estimation of and in the following subsections.
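To make the construction in this subsection concrete, the following sketch computes an unstandardized version of the statistic: the weighted average squared gap between the fit that allows a split at a candidate value `gamma` and the fit that does not. It omits the bias and variance corrections of Theorem 4, treats the whole sample as the regime under test, and assumes a scalar covariate with a Gaussian kernel. The names `raw_split_statistic` and `weight` are hypothetical.

```python
import math


def nw(x0, xs, ys, h):
    """Local constant (Nadaraya-Watson) fit at x0 with a Gaussian kernel."""
    num = den = 0.0
    for x, y in zip(xs, ys):
        w = math.exp(-0.5 * ((x - x0) / h) ** 2)
        num += w * y
        den += w
    return num / den


def raw_split_statistic(X, Y, Q, gamma, h, weight=lambda x: 1.0):
    """Weighted mean squared difference between split and pooled fits."""
    lo = [(x, y) for x, y, q in zip(X, Y, Q) if q <= gamma]
    hi = [(x, y) for x, y, q in zip(X, Y, Q) if q > gamma]
    if not lo or not hi:
        raise ValueError("candidate threshold leaves one sub-regime empty")
    lo_x, lo_y = zip(*lo)
    hi_x, hi_y = zip(*hi)
    total = 0.0
    for x, q in zip(X, Q):
        pooled = nw(x, X, Y, h)  # fit under the null (no extra threshold)
        split = nw(x, lo_x, lo_y, h) if q <= gamma else nw(x, hi_x, hi_y, h)
        total += weight(x) * (split - pooled) ** 2
    return total / len(X)
```

A value near zero is what the null predicts; a large value points to a jump at `gamma`, which is why the test is right-tailed.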
3.2 Testing for an Extra Unknown Threshold
In practice, is unknown a priori and there are, in principle, infinitely many candidate values of in the regime . To make the test implementable, we consider only a finite set of candidate threshold values within the regime , i.e., , where . Given the suspected pseudo thresholds, , the null hypothesis for testing an extra unknown threshold can be written as
[TABLE]
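One simple way to build the finite candidate set is to take evenly spaced empirical quantiles of the threshold variable inside the regime. The sketch below does this under assumptions that are not from the paper: the number of candidates `m` and the trimming share `trim` are hypothetical tuning choices, and the regime is taken to be the half-open interval `(lower, upper]`.

```python
def candidate_thresholds(Q, lower, upper, m=5, trim=0.15):
    """Up to m candidate thresholds at evenly spaced quantiles inside (lower, upper].

    A share `trim` of the regime's observations is left on each side so that
    both sub-regimes keep enough data for kernel estimation.
    """
    inside = sorted(q for q in Q if lower < q <= upper)
    n = len(inside)
    if n < 2:
        return []  # nothing to split
    picks = []
    for k in range(1, m + 1):
        p = trim + (1.0 - 2.0 * trim) * k / (m + 1)  # quantile level in (trim, 1 - trim)
        picks.append(inside[min(n - 1, int(p * n))])
    return sorted(set(picks))
```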
Given the sample counterpart of as defined in (9), the following theorem reports the joint asymptotic distribution of the statistics.
Theorem 5.
Given that the assumptions in Assumptions 1, 2, and 3 hold, , and under the null,
[TABLE]
where
[TABLE]
and is the variance-covariance matrix of . The -element in the variance-covariance matrix , assuming , is
[TABLE]
where is defined in the Appendix because of its complex form.
Theorem 5 is applicable to nonparametric regressions with heteroskedastic errors whose variances depend on the values of and , i.e., . (Footnote 1: For the two restricted cases with heteroskedastic errors whose variances depend on the values of but not on those of , i.e., , when and are either dependent or independent, the joint asymptotic distribution of the statistics is also derived but not provided in this paper. The detailed results and proofs of the corresponding asymptotic distributions are available from the authors upon request.) By replacing , , and in Theorem 5 with consistent estimates, namely , , and , respectively, we have
[TABLE]
where
[TABLE]
3.3 Estimation of the Nuisance Parameters
Given the asymptotic normality of the test statistic , the nuisance parameters must be estimated consistently. First, the parameter can be estimated by using the Nadaraya-Watson estimator as follows:
[TABLE]
Thus, , , and can be estimated as
[TABLE]
and
[TABLE]
Further, the th elements of can be estimated as
[TABLE]
where the terms to are
[TABLE]
[TABLE]
[TABLE]
[TABLE]
[TABLE]
Given Lemma 6, Theorems 1 and 2, and Assumptions 1, 2 and 3, we have the following results as in Aït-Sahalia et al. (2001):
[TABLE]
and
[TABLE]
That is, , , , and are the consistent estimators of , , , and , respectively. For and , Aït-Sahalia et al. (2001) show that
[TABLE]
In light of the results in (26), the following test statistics are suggested to test the null of no extra unknown threshold existing in the regime :
[TABLE]
Furthermore, we know that converge to the standard normal distribution. Therefore, the limiting distribution of is also standard normal, i.e.,
[TABLE]
3.4 Local Alternative Power
In this subsection, we study the consistency of the test. We then examine its power, that is, the probability of rejecting a false hypothesis against the sequences of alternatives that approach the null as . Given an extra threshold existing in and being neglected,
[TABLE]
for . Suppose an extra threshold does exist in under the alternative and denote the sequence of densities as , and . The superscript is specified to show that these densities are dependent on since the value of the extra threshold is unknown. The local alternatives can be specified as
[TABLE]
where
[TABLE]
and satisfies
[TABLE]
and
[TABLE]
It is clear that the alternative converges to the null at speed (i.e., ).
Theorem 6.
Under Assumptions 1, 2, and 3, the asymptotic power of the test is
[TABLE]
where with , the CDF of a standard normal random variable.
3.5 Identifying the Number of Thresholds
The test statistic, the average norm , is suggested to check whether an extra threshold exists in the regime given that the threshold values are already known. Logically, the test can be applied to check for an extra threshold existing in the regime for jointly. This thus ends up being the test for whether there is an extra threshold in a given threshold regression. Accordingly, we construct, in what follows, the test for the null of thresholds against the alternative of thresholds.
Since the indicator functions are independent, i.e., , the covariance of and for is zero. That is
[TABLE]
This fact implies that and () are asymptotically independent. The test statistic for the null of thresholds against thresholds is characterized in the following theorem.
Theorem 7.
Under the same assumptions as for Theorem 5, the test statistic for the null of thresholds against thresholds is constructed as
[TABLE]
with , where is the CDF of a standard normal distribution and is defined in equation (12).
Table 1 presents the critical values of the test statistic for at 1%, 5% and 10%.
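The tabulated critical values can be reproduced numerically under one assumption that is inferred here rather than stated in the surviving text: that the limiting law in Theorem 7 is that of the maximum of `k` independent standard normal variables, so the critical value `c` solves Φ(c)^k = 1 − α. This is consistent with the asymptotic independence noted above, and with `k` read as the row index of Table 1. The function name is hypothetical.

```python
from statistics import NormalDist


def max_normal_critical_value(k, alpha):
    """Upper-alpha critical value of the maximum of k independent N(0, 1) draws.

    Solves Phi(c) ** k = 1 - alpha for c.
    """
    return NormalDist().inv_cdf((1.0 - alpha) ** (1.0 / k))


# Print one row per k in the column order of Table 1 (10%, 5%, 1%).
for k in range(1, 6):
    print(k, [round(max_normal_critical_value(k, a), 6) for a in (0.10, 0.05, 0.01)])
```

For `k = 1` the formula reduces to the usual one-sided standard normal quantiles, which matches the first row of the table.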
Given the test statistic for testing thresholds against thresholds in Theorem 7, the number of thresholds can be determined by conducting these tests sequentially for and so on. Testing proceeds sequentially until the null is not rejected. In other words, the number of thresholds is when the null of thresholds against thresholds is not rejected. Once the number of thresholds is determined, we estimate the corresponding threshold values by using the methods discussed in the next section.
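The stopping rule just described can be sketched as a short loop. The code is illustrative only: `statistic_for_null` is a hypothetical user-supplied callable that returns the test statistic for the null of `s` thresholds, the cap `max_thresholds` is an added safeguard, and the critical value assumes the maximum-of-independent-normals form with `s + 1` components (one per regime under the null).

```python
from statistics import NormalDist


def determine_num_thresholds(statistic_for_null, max_thresholds=5, alpha=0.05):
    """Test s against s + 1 thresholds for s = 0, 1, ... and return the first
    s whose null is not rejected."""
    for s in range(max_thresholds):
        regimes = s + 1  # assumption: one component statistic per regime
        critical = NormalDist().inv_cdf((1.0 - alpha) ** (1.0 / regimes))
        if statistic_for_null(s) <= critical:
            return s  # null of s thresholds not rejected: stop here
    return max_thresholds  # every null up to the cap was rejected
```

For example, with statistics 5.0, 4.0 and 0.3 for the nulls of zero, one and two thresholds, the first two nulls are rejected at the 5% level and the procedure stops at two thresholds.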
4 Statistical Properties of the Threshold Estimators
In the preceding discussions on testing an extra unknown threshold in a certain regime and testing the null of thresholds against thresholds, the threshold values under the null are assumed to be known. In applied research, the threshold values are unknown and must be estimated by a valid procedure. In the framework of linear regressions, Bai (1997) and Bai and Perron (1998) determine the number of structural changes by using a sequential test and estimate the breakpoints as the values that minimize the sum of squared errors. Hansen (1999) discusses the determination of the number of thresholds and estimation of threshold values in linear regressions by using similar procedures. We thus extend these procedures to the framework of nonparametric regressions.
4.1 Added Assumptions
To derive the statistical properties of the threshold value estimators, we need the following assumptions.
Assumption 4.
- 4-1.
, , and exist and are continuous at , where .
- 4-2.
, .
- 4-3.
, , for some , and .
- 4-4.
, and .
- 4-5.
and , where .
Assumptions 4-1, 4-2, and 4-3 are standard in proving the consistency of the threshold estimators. Assumptions 4-4 and 4-5 relate to a condition called the small effect, , which is needed when we derive the asymptotic property of the threshold value estimator; see the proofs of Lemma 7 and Theorem 9. The small effect can approach zero when the sample size is sufficiently large; therefore, it depends on . is the remainder of the difference between and when we extract the effect of the sample size, , from .
4.2 Asymptotic Properties of the Threshold Value Estimators
Given that the number of thresholds is known, the estimator of the threshold values can be defined in a manner similar to that in Proposition 5 of Bai and Perron (1998):
[TABLE]
Clearly, are determined simultaneously by global minimization. In practice, the estimation is implemented by an algorithm based on the principle of dynamic programming. Under Assumptions 1, 2, 3, and 4, the following theorem establishes the consistency of .
Theorem 8.
For ,
- a)
[TABLE]
- b)
[TABLE]
The convergence rate of is , which is a common result in the literature on structural changes and threshold models within the framework of linear regressions and linear quantile regressions (cf. Chen, 2008). The limiting distribution of the threshold value estimator is provided by Chan (1993) for linear models. In contrast, Hansen (2000) and Bai and Perron (2003) introduce the small effect to obtain a limiting distribution of the threshold value estimator that is free of nuisance parameters. That is, denote
[TABLE]
Under the assumption of , which is called the small effect, we then obtain the asymptotic property of :
Theorem 9.
[TABLE]
where
[TABLE]
where
[TABLE]
and and are two independent Brownian motions.
Note that the convergence rate of under the existence of the small effect is slower than the rate in the case in which no small effect is assumed. The CDF of can be obtained from Bhattacharya and Brockwell (1976), i.e., for ,
[TABLE]
and for , , where is the CDF of a standard normal random variable.
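For reference, the closed form usually quoted for this type of limit can be evaluated directly. The displayed expression did not survive in this text, so the sketch below is an assumption: it uses the formula for the argmax of W(r) − |r|/2 over a two-sided Brownian motion W, as reported by Hansen (2000) with reference to Bhattacharya and Brockwell (1976). Any scaling constants in Theorem 9 must be applied separately, and the function names are hypothetical.

```python
import math


def std_normal_cdf(z):
    """Standard normal CDF via erfc, which stays accurate in the lower tail."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def argmax_bm_cdf(x):
    """CDF at x of the argmax over r of W(r) - |r| / 2 (assumed normalization)."""
    if x < 0.0:
        return 1.0 - argmax_bm_cdf(-x)  # symmetry about zero
    root = math.sqrt(x)
    return (1.0
            + math.sqrt(x / (2.0 * math.pi)) * math.exp(-x / 8.0)
            + 1.5 * math.exp(x) * std_normal_cdf(-1.5 * root)
            - 0.5 * (x + 5.0) * std_normal_cdf(-0.5 * root))
```

The term `math.exp(x)` overflows for very large `x`, so the function is intended for moderate arguments; confidence-interval quantiles can be found from it by bisection.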
4.3 Sequential Method
Instead of using a global minimization algorithm in the threshold value estimations, the sequential method can be adopted. Bai (1997) proposes the sequential method for estimating the change points in a linear regression with multiple structural changes and provides the proof of the consistency of his estimator without knowing the number of breaks. Bai and Perron (1998) also suggest using the sequential method to estimate the change points in linear regressions, while Hansen (1998) applies the sequential method to estimate the threshold values for nondynamic panel threshold models. Following the literature, we thus use the sequential method to estimate the threshold values in the nonparametric regressions. Without loss of generality, a nonparametric regression with three thresholds is considered. The model under consideration is, for ,
[TABLE]
The true threshold values implied by this model are , and , while and are the lower and upper bounds of the threshold values. However, a nonparametric regression is mis-specified when a model with one threshold is estimated as
[TABLE]
where and denote the kernel estimations from the sample observations and , respectively. The indicator function for and 0 otherwise.
Denote as the sum of the squared residuals from the nonparametric regression with the threshold value . That is,
[TABLE]
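The first step of the sequential method, fitting a one-threshold model and choosing the split that minimizes the sum of squared residuals, can be sketched as follows. The sketch assumes a scalar covariate, a Gaussian kernel with a fixed bandwidth `h`, in-sample (not leave-one-out) residuals, and a user-supplied list of candidate values; the function names are hypothetical.

```python
import math


def nw(x0, xs, ys, h):
    """Local constant (Nadaraya-Watson) fit at x0 with a Gaussian kernel."""
    num = den = 0.0
    for x, y in zip(xs, ys):
        w = math.exp(-0.5 * ((x - x0) / h) ** 2)
        num += w * y
        den += w
    return num / den


def ssr_one_threshold(gamma, X, Y, Q, h):
    """Sum of squared residuals when each side of gamma gets its own kernel fit."""
    total = 0.0
    for below in (True, False):
        pts = [(x, y) for x, y, q in zip(X, Y, Q) if (q <= gamma) == below]
        if not pts:
            return math.inf  # a split with an empty side is not admissible
        xs, ys = zip(*pts)
        total += sum((y - nw(x, xs, ys, h)) ** 2 for x, y in pts)
    return total


def estimate_one_threshold(X, Y, Q, h, candidates):
    """Candidate value with the smallest one-threshold sum of squared residuals."""
    return min(candidates, key=lambda g: ssr_one_threshold(g, X, Y, Q, h))
```

Repeating the search within each resulting sub-sample gives the sequential estimates of further thresholds.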
Theorem 10.
Given a threshold value specified at in a mis-specified nonparametric regression with one threshold, the model mis-specification error is
[TABLE]
where and for are defined in the Appendix.
Given the three true threshold values , , and , the threshold value of a mis-specified nonparametric regression with one threshold may be in , in , in , or in . The model mis-specification error of the whole sample is , , , or if the threshold value is mis-specified at the regime , , , or , respectively. In the Appendix, we describe the foregoing results in detail.
Theorem 11.
Let . is the smallest model mis-specification error among all . The exact expression of can be found in the Appendix.
, and are the three smallest model mis-specification errors among all . Moreover, since is the probability limit of and, without loss of generality, is assumed, the following theorem establishes that is the global minimum. Theorem 12 is therefore sufficient to justify the sequential procedure discussed below.
Theorem 12.
Assume that the true model is a nonparametric regression with three threshold values, namely , , and , and that a nonparametric regression with one threshold is mis-specified and estimated via
[TABLE]
We then have
a) If , is the smallest model mis-specification error among all .
b) .
c) will, with probability one, converge to .
According to Theorem 12, even if the nonparametric regression is mis-specified with a single threshold, the threshold estimate that minimizes the sum of squared errors converges to the true threshold value at which the model mis-specification error is the smallest. The result of Theorem 12 is thus similar to those in the study by Bai and Perron (1998) for the estimation of the change points in a linear regression with multiple structural changes. To the best of our knowledge, this is the first theorem that ensures the consistency of the estimators obtained from using a sequential method in nonparametric regressions.
Note that the assumption indicates that the threshold value has the largest influence on the regression. Theorem 12 can be extended to a mis-specified regression model with two threshold values, in which case the two estimated threshold values are consistent for the two true threshold values that have the larger impact on the regression. Based on Theorem 12, the number of thresholds can be determined and the threshold values estimated by using the following sequential procedure.
1. Implement the test for the null of against . That is, run the test to check whether an extra threshold exists in . If the null is not rejected, it is inferred that the regression has no threshold. If the null is rejected, move on to the next step.
2. Specify and estimate the threshold value as . Given , carry out the test for the null of against . That is, run the test to check whether an extra threshold exists in regimes and . If the null is not rejected, it is inferred that the regression has one threshold. If the null is rejected, move on to the next step.
3. Specify and estimate the extra threshold value from regimes and as . Select the estimate of the threshold values, , that has the smaller sum of squared errors. Given and , carry out the test for the null of against . That is, run the test to check whether an extra threshold exists in regimes , , and if . If the null is not rejected, it is inferred that the regression has two thresholds. If the null is rejected, repeat the above test until the null of against thresholds is not rejected.
When the procedure reaches the point at which the null of thresholds against thresholds is not rejected, we settle on a nonparametric regression with thresholds. The estimates of the threshold values, , are obtained as a byproduct of this procedure. The consistency of , then follows from Theorem 12.
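The control flow of the sequential procedure can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure as implemented by the authors: the kernel fit within each regime is replaced by a regime mean, and the decision rule `ssr_drop_rule` is a placeholder based on the relative reduction in the sum of squared errors, standing in for the test of Theorem 7. All names, the minimum regime size, and the simulated staircase data are hypothetical.

```python
import random


def regime_ssr(y, q, thresholds):
    """SSR from a separate mean in each regime (stand-in for the kernel fit)."""
    bounds = [float("-inf")] + sorted(thresholds) + [float("inf")]
    ssr = 0.0
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        ys = [yi for yi, qi in zip(y, q) if lo < qi <= hi]
        if ys:
            mean = sum(ys) / len(ys)
            ssr += sum((yi - mean) ** 2 for yi in ys)
    return ssr


def best_extra_threshold(y, q, thresholds, min_obs=10):
    """Search every current regime for one more threshold; keep the smallest total SSR."""
    best = None
    for g in sorted(set(q)):
        trial = thresholds + [g]
        bounds = [float("-inf")] + sorted(trial) + [float("inf")]
        sizes = [sum(lo < qi <= hi for qi in q) for lo, hi in zip(bounds[:-1], bounds[1:])]
        if min(sizes) < min_obs:
            continue  # skip splits that leave a regime with too few observations
        ssr = regime_ssr(y, q, trial)
        if best is None or ssr < best[0]:
            best = (ssr, g)
    return best


def ssr_drop_rule(y, q, thresholds, candidate, min_drop=0.2):
    """Placeholder decision rule, NOT the paper's test statistic."""
    current = regime_ssr(y, q, thresholds)
    return (current - candidate[0]) / current > min_drop


def sequential_thresholds(y, q, reject_null, max_thresholds=5):
    """Add thresholds one at a time until the null of no extra threshold is kept."""
    thresholds = []
    while len(thresholds) < max_thresholds:
        candidate = best_extra_threshold(y, q, thresholds)
        if candidate is None or not reject_null(y, q, thresholds, candidate):
            break
        thresholds.append(candidate[1])
    return sorted(thresholds)


def demo_data(n, seed):
    """Hypothetical staircase: the mean rises by 2 at q = -0.5, 0.0, and 0.5."""
    rng = random.Random(seed)
    q = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    y = [2.0 * sum(qi > t for t in (-0.5, 0.0, 0.5)) + rng.gauss(0.0, 0.3) for qi in q]
    return y, q
```

Searching all current regimes for the split with the smallest total sum of squared errors corresponds to estimating a candidate threshold in each regime and keeping the one with the smaller sum of squared errors. In practice `reject_null` would be replaced by a function that evaluates the test statistic against its critical value.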
As mentioned in Proposition 8 of Bai and Perron (1998), the drawback of the previously described sequential method is that the determined number of thresholds exceeds the true number of thresholds with nonzero probability. Therefore, Bai and Perron (1998) recommend applying the sequential method with a Type I error that converges to zero at a suitably slow rate as the sample size increases. By doing so, the determined number of thresholds converges to the true number of thresholds.
5 Monte Carlo Studies
In this section, Monte Carlo studies are conducted to evaluate the performance of the proposed test statistic, . We also conduct simulations to assess the finite sample performance of the sequential method for estimating the threshold values.
5.1 Empirical Performance of the Test Statistic
Monte Carlo simulations are designed to evaluate the empirical size and power performances of the tests to identify the number of thresholds. Our experimental design is mainly based on the data-generating process (DGP) considered in Aït-Sahalia et al. (2001). We consider the null of no threshold against the alternative with one threshold. The DGP under the null is specified as
[TABLE]
In this DGP, the random variable is dependent on the threshold variable and the heteroskedasticity of the regression depends on and . By using a univariate normal kernel function, we compute the bandwidth as , where (cf. Aït-Sahalia et al., 2001, p. 383), , and is set to one in our simulation. We also conduct robustness checks on the bandwidth selection. Since under the null, the critical values of the test statistic in Theorem 7 are 1.282, 1.645, and 2.326 for Type I errors at 10%, 5%, and 1%, respectively.
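The critical values 1.282, 1.645, and 2.326 are the upper 10%, 5%, and 1% quantiles of the standard normal distribution, which can be checked with a short sketch. The optional argument `n_intervals` rests on an assumption: the tabulated critical values for two to five intervals appear numerically consistent with the maximum of that many independent standard normal variables, and the function reproduces them under that reading. The function name is hypothetical.

```python
from statistics import NormalDist


def one_sided_critical_value(alpha, n_intervals=1):
    """Upper critical value at level alpha.

    With n_intervals = 1 this is the standard normal quantile at 1 - alpha.
    For n_intervals > 1 it is the quantile of the maximum of independent
    standard normals (an assumption about how the table was built).
    """
    return NormalDist().inv_cdf((1.0 - alpha) ** (1.0 / n_intervals))


# Single-interval critical values at the 10%, 5%, and 1% levels.
single = [one_sided_critical_value(a) for a in (0.10, 0.05, 0.01)]
```

For example, `one_sided_critical_value(0.05, n_intervals=2)` is approximately 1.9545, in line with the tabulated value for two intervals.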
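We conduct simulations with sample sizes of 500, 1000, and 2000. Throughout our simulations, the numbers of replications and partitions are set to 1000 and 7, respectively. Table 2 presents the empirical sizes of at 1%, 5%, and 10%. The empirical sizes are close to the nominal 5% level at all three sample sizes (0.045, 0.051, and 0.051), move toward the nominal 1% level as the sample size grows (0.021, 0.017, and 0.011), and fall somewhat below the nominal 10% level (0.076, 0.084, and 0.086).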
Table 3 shows the corresponding Monte Carlo results with the robustness checks on the choice of bandwidth. The proposed test copes well with decent sizes across the distinct bandwidth values.
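For reference, an empirical size is the fraction of replications in which the statistic exceeds the critical value. The sketch below computes it together with the binomial Monte Carlo standard error of a rejection frequency, which helps judge how far a simulated size may plausibly drift from its nominal level. The function names are hypothetical and the statistics passed in are placeholders, not simulation output from the paper.

```python
import math


def empirical_rejection_rate(statistics, critical_value):
    """Share of replications whose statistic exceeds the critical value."""
    rejections = sum(1 for s in statistics if s > critical_value)
    return rejections / len(statistics)


def mc_standard_error(nominal_level, replications):
    """Binomial standard error of a rejection frequency at the nominal level."""
    return math.sqrt(nominal_level * (1.0 - nominal_level) / replications)
```

With 1000 replications, `mc_standard_error(0.05, 1000)` is about 0.0069, so simulated sizes within roughly 0.014 of 0.05 are compatible with the nominal level.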
5.2 Finite-sample Performance of the Sequential Method
To assess the accuracy of the sequential method for estimating the threshold values, we consider the following DGP in the Monte Carlo studies, which is similar to that in Aït-Sahalia et al. (2001, p. 383):
[TABLE]
Let denote the threshold value estimate in the first-round identification from the th replication of the DGP. Then, the mean, standard error, and MSE (mean square error) from all the replications are computed by
[TABLE]
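A minimal sketch of these summary measures is given below. It assumes that the standard error is the standard deviation of the estimates across replications with divisor equal to the number of replications; the paper's exact normalization may differ. The function name is hypothetical.

```python
import math


def summarize_estimates(estimates, true_value):
    """Mean, standard error, and MSE of threshold estimates across replications."""
    r = len(estimates)
    mean = sum(estimates) / r
    # Divisor r (not r - 1) is an assumption about the normalization.
    variance = sum((g - mean) ** 2 for g in estimates) / r
    mse = sum((g - true_value) ** 2 for g in estimates) / r
    return mean, math.sqrt(variance), mse
```

With this normalization the MSE equals the squared standard error plus the squared bias, so a small MSE requires both a small spread and a mean close to the true threshold value.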
Given and 1000 replications, Table 4 shows the Monte Carlo results. We can draw the following conclusions from the simulation results. The standard error and MSE of the estimated threshold values decrease as the sample size increases. The sequential method consistently estimates the unknown threshold values. In particular, the mean and standard error of the first estimated threshold values are 0.5029813 and 0.0107737, respectively. The mean value is close to . For the second estimated threshold values, the mean is 0.152506, which is close to . The mean of the third estimated threshold values is -0.6966892, which is close to . These simulated results indicate the accuracy of the sequential method for estimating the threshold values. Given the good performance of the simulations, and based on Theorems 8 and 9, the threshold value estimators are super-consistent, as we see in Hansen (2000).
6 An Empirical Application: the 401(K) Retirement Savings Plan with Income Thresholds
Examining the effects of 401(k) plans on savings is an issue of long-standing empirical interest (see Chernozhukov and Hansen (2004) and the references cited therein). Intuitively, because different income groups face distinct resource constraints, income thresholds should play an important role in the analysis of individual savings for retirement. Chernozhukov and Hansen (2013) study the effect of 401(k) eligibility on total wealth by using high-dimensional methods that allow for flexible functional forms. By using a sample of 9915, they generate 10,763 technical variables through a spline basis and polynomial basis and then select a few important variables out of the technical variables by using a LASSO-based double selection procedure. The selected few important variables include , where the variable is normalized on the interval. Their result suggests that the income threshold exists in the 401(k) study. In the literature, however, no test procedures have thus far been implemented to investigate the relevant income threshold values in 401(k) applications. In this section, we use our testing procedure to show that income thresholds indeed exist in 401(k) applications, and confirm that this finding is robust to functional form specifications.
To illustrate the testing procedure proposed in the preceding sections, we consider the estimation and inference of the thresholds associated with the effect of 401(k) eligibility on total wealth. 401(k) eligibility, the variable of interest, is an indicator of being eligible to enroll in a 401(k) plan (i.e., whether individual is working for a firm that offers access to a 401(k) plan). Poterba et al. (1994a, 1994b) and Chernozhukov et al. (2016) argue that 401(k) eligibility may be taken as exogenous conditional on income. Following Chernozhukov et al. (2016) and by using the data set in Chernozhukov and Hansen (2004), we thus construct both our outcome variable and the explanatory variable of interest after partialling out the effects of the other variables including the dummies for age, education, marital status, family size, and homeownership. The sample size is 9915. In the example presented herein, we consider the following nonparametric regression with thresholds:
[TABLE]
where the threshold variable is income, while and are the partialled out total wealth and partialled out 401(k) eligibility, respectively.
We implement the test in Theorem 7 to determine the number of thresholds and then estimate the corresponding threshold values by using the sequential method. The weighting function is constructed as , and the bandwidth , where and is set to 1. We first conduct a test for the null hypothesis that versus . We find that the value of the test statistic is 50.46, thereby rejecting the null. The first-round estimated threshold value (92nd percentile). Since there is only a small number of observations to the right of this threshold value, we conduct the next test, in the interval , for the null hypothesis that versus . The corresponding value of the test statistic is 27.34, which again rejects the null. The second-round estimated threshold value (68th percentile). We now conduct the test for the null hypothesis that versus in the joint interval of and . The value of the joint test statistic is 2.62. Thus, we reject the null, and then estimate the threshold value in this joint interval according to Theorem 12. We obtain (50th percentile). Since there are insufficient observations in the intervals and , we only conduct our next test to detect whether an extra threshold exists in the interval . Finally, we conduct the test for versus in the interval . Here, we do not reject the null because the test statistic, 0.85, is less than the critical value. We also conduct robustness checks by using different bandwidth values with and . The corresponding three threshold values found are the same as those found with . In short, our testing procedure allows us to identify four threshold regions, and the estimated income threshold values are $31,836 (50th percentile), $42,600 (68th percentile), and $75,000.3 (92nd percentile). The crucial income threshold values are therefore all at or above the median income.
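The sequence of decisions reported above can be retraced with a short sketch. The test statistics are those reported in the text. The 5% significance level and the use of critical values for the maximum of independent standard normals when two intervals are tested jointly are assumptions made for the illustration, as are the names in the code.

```python
from statistics import NormalDist


def critical_value(alpha, n_intervals):
    """Upper critical value for the maximum of n_intervals standard normals (assumed)."""
    return NormalDist().inv_cdf((1.0 - alpha) ** (1.0 / n_intervals))


# (reported test statistic, number of intervals tested jointly)
reported_steps = [(50.46, 1), (27.34, 1), (2.62, 2), (0.85, 1)]


def count_thresholds(steps, alpha=0.05):
    """Count rejections until the first test that fails to reject."""
    count = 0
    for statistic, n_intervals in steps:
        if statistic <= critical_value(alpha, n_intervals):
            break
        count += 1
    return count


n_thresholds = count_thresholds(reported_steps)
n_regimes = n_thresholds + 1
```

Under these assumptions the first three nulls are rejected and the fourth is not, which gives three thresholds and four regimes. The same count is obtained at the 1% level.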
7 Conclusion
In this study, we identify the number of thresholds and estimate the threshold values for a nonparametric regression with multiple thresholds. The significance test of Aït-Sahalia et al. (2001) is modified to detect the existence of an extra threshold (i.e., versus thresholds). The asymptotic properties of the modified test are then established. Based on the modified test, a procedure for determining the number of thresholds is suggested. Accordingly, we then carry out the sequential method to estimate the unknown threshold values. We also derive the asymptotic properties of the corresponding threshold value estimator. Our simulation results indicate that the proposed estimators perform adequately in finite samples. To illustrate our testing procedure, we present an empirical analysis of the 401(k) plan with income thresholds.
Appendix
Proof of Theorem 1.
The kernel density estimator is defined by
[TABLE]
Suppose the kernel satisfies the conditions in Assumption 2 and is a second-order () kernel function and that Assumptions 1-1 to 1-4 hold. Then, has the expectation
[TABLE]
and the variance
[TABLE]
Assuming satisfies
[TABLE]
we have
[TABLE]
By denoting , we obtain
[TABLE]
Denote and for any , the upper bound of the covariance terms can be obtained by Lemma A.0 of Fan and Li (1999) as
[TABLE]
where is defined as
[TABLE]
Furthermore, given that Assumption 1-1 holds,
[TABLE]
By combining (34), (35) and (36), we have the variance of as
[TABLE]
In general, if the th-order kernel function is considered, (32) becomes
[TABLE]
Given the results in (37) and (38) and that the bandwidth satisfies Assumption 3-1, the uniform almost sure convergence rate of a kernel density estimator can be obtained; see Lemma 2 and Lemma 8 in Stone (1983). Given the results in (32) and (37), and that Assumptions 1, 2, 3-1, and 5-2 hold, the asymptotic sampling distribution of is derived by Masry (1996) and Li and Racine (2007).
Proof of Theorem 2.
Given a second-order kernel function as well as equations (32) and (37), we have
[TABLE]
Together with (39), the local constant estimator can be rewritten as
[TABLE]
Under the correct specification of a nonparametric regression with thresholds, the first term in the previous result is
[TABLE]
From Assumption 1-1, we have
[TABLE]
where ; . Thus, (40) becomes
[TABLE]
Further, the asymptotic variance term is
[TABLE]
with
[TABLE]
Given that Assumption 1-1 holds, and from arguments similar to the proof for Theorem 1, the covariance term . We have
[TABLE]
and the covariance terms are
[TABLE]
In general, when the kernel is an th kernel function, (40) becomes
[TABLE]
Given that (43), (45), and Assumption 3-1 hold, the result of part a) in Theorem 2 is verified based on Lemmas 2 and 8 of Stone (1983). Moreover, given (41), (43), (44), and that Assumption 3-1 holds, the result of part b) in Theorem 2 holds according to the central limit theorem; see Masry (1996) and Li and Racine (2007).
Proof of Theorem 3.
By substituting (43) and (45) into the mean integrated square error, we have the optimal bandwidth defined as
[TABLE]
Taking the first-order derivative of (46) with respect to ,
[TABLE]
we then have . It is clear that the convergence rate of depends on the dimension of , , and the orders of the kernel function, . It is worth noting that the convergence rate does not depend on the number of thresholds, . This result suggests that the bandwidth can be selected without considering the number of thresholds.
Proof of Theorem 4.
Since
[TABLE]
we have
[TABLE]
Note that
[TABLE]
We need the following lemmas to complete the proof.
Lemma 1.
(Lemma 2 of Aït-Sahalia et al. (2001))
Defining
[TABLE]
where
[TABLE]
we have
[TABLE]
Lemma 2.
(Lemma 7 of Aït-Sahalia et al. (2001))
[TABLE]
with
[TABLE]
where , , and .
Lemma 3.
(Hall, 1984).
Let be an i.i.d. sequence. Suppose that the U-statistic with the symmetric variable function is centered (i.e., ) and degenerate (i.e., almost surely for all ). Let
[TABLE]
Then, if
[TABLE]
we have that as
[TABLE]
From Lemma 2, we have
[TABLE]
To prove this, denote as
[TABLE]
It can then be seen that when
[TABLE]
are specified, we have and . Then, by the Taylor expansion,
[TABLE]
and thus it is equivalent to have
[TABLE]
where . Denote
[TABLE]
so that can be written as
[TABLE]
where
[TABLE]
in which the first derivative of is
[TABLE]
the second derivative of is
[TABLE]
and the third derivative of is
[TABLE]
It is clear that under the null hypothesis. Therefore, under the null , we have
[TABLE]
Given Lemma 1, satisfies
[TABLE]
Given that Assumption 3-2 holds,
[TABLE]
For the term in , it is clear that
[TABLE]
in which
[TABLE]
Since ,
[TABLE]
Similarly,
[TABLE]
and
[TABLE]
Therefore, we obtain
[TABLE]
Specifically,
[TABLE]
To simplify the expression, we denote
[TABLE]
and also denote its de-mean as
[TABLE]
Finally, the term can be expressed as
[TABLE]
In the above equation, the term is asymptotically normal, is the asymptotic bias, and and are asymptotically negligible.
In addition,
[TABLE]
and uniformly in in from Assumption 3-2,
[TABLE]
Denote
[TABLE]
can then be simplified to
[TABLE]
We then have
[TABLE]
and
[TABLE]
From Chebyshev’s inequality it follows that
[TABLE]
Let and denote
[TABLE]
which verifies the centering and degeneracy conditions by construction. In addition, since
[TABLE]
we have
[TABLE]
which is the necessary condition for having Lemma 3 applicable.
As for , we have
[TABLE]
Thus, the following are obtained:
[TABLE]
Hence,
[TABLE]
where
[TABLE]
According to Lemma 3 (Hall, 1984), we have
[TABLE]
From (48), we obtain
[TABLE]
and then
[TABLE]
from Chebyshev’s inequality
[TABLE]
and from (47), we have the following result for :
[TABLE]
Note that this proof is established under the assumption that are i.i.d. For mixing data with the -coefficient as in Assumption 1-1, Aït-Sahalia et al. (2001), Fan and Li (1999), and Dette and Spreckelsen (2004) point out that this result also holds.
Proof of Theorem 5.
To begin with, we write out the term as follows:
[TABLE]
Let
[TABLE]
By definition,
[TABLE]
where
[TABLE]
and
[TABLE]
[TABLE]
where
[TABLE]
and
[TABLE]
Denote
[TABLE]
Therefore, the variance-covariance is
[TABLE]
Proof of Theorem 6.
Let
[TABLE]
with
[TABLE]
It is clear that when the alternative hypothesis is true.
As in the proof of Theorem 4, we know that
[TABLE]
where
[TABLE]
It is clear that under the null and under the alternative. Then, under the alternative,
[TABLE]
and
[TABLE]
Given the following results in the proof of Theorem 4,
[TABLE]
we have
[TABLE]
Therefore, under the alternative,
[TABLE]
When the alternative converges to the null at speed , we get . Similarly, we have . Hence, from Proposition 2 of Aït-Sahalia et al. (2001), we have proved Theorem 6.
Proof of Theorem 7.
Observe that the indicator functions defined on distinct intervals are mutually exclusive. Therefore the asymptotic covariance between the statistics and () is zero. In what follows, we verify this fact. Let and , respectively be the and splitting points in the intervals of and ; also let . Following the proof of Theorem 4, we have
[TABLE]
where
[TABLE]
As those defined in Theorem 4,
[TABLE]
Following the proof of Theorem 5, we denote
[TABLE]
and obtain
[TABLE]
The equation above signifies that the indicators , , , and are mutually exclusive; , , , and are also mutually exclusive. Hence, is of . Further, and are asymptotically normally distributed, and they can thus be seen as asymptotically independent. Accordingly, under the same assumptions imposed in Theorem 5, Theorem 7 holds.
Proof of Theorem 8.
With pseudo threshold values, in , the conditional mean estimator is constructed as
[TABLE]
with
[TABLE]
and , .
To proceed, we need the following lemmas.
Lemma 4.
For any , we have
[TABLE]
and
[TABLE]
Proof: Since is a local constant estimator, its almost sure convergence rate is from the result of part a) in Theorem 2. From the definition of ,
[TABLE]
Lemma 5.
Under the condition that and are exogenous, we have
[TABLE]
Proof: The second moment of exists, that is
[TABLE]
Since for and being exogenous, from the law of large numbers, we have
[TABLE]
Let and . The estimated sum of squared residuals at threshold values is
[TABLE]
with
[TABLE]
Moreover,
[TABLE]
It is clear that, from Lemma 4, and have their minimum at . According to Theorem 2.1 of Newey and McFadden (1994), we then have
[TABLE]
This is the proof of part a) of Theorem 8.
For the proof of part b) of Theorem 8, without loss of generality, we provide the proof of in a nonparametric regression with three thresholds. Denote
[TABLE]
where .
The following lemmas are needed for our proof.
Lemma 6.
Set and
[TABLE]
There exist the constants , , and , such that for all and , there exists a such that for all ,
[TABLE]
Proof: See Lemma A.7 of Hansen (2000).
Lemma 7.
For all and , there exists some such that for any ,
[TABLE]
Proof: See Lemma A.8 of Hansen (2000).
Let be the intersection sets of and . From Lemmas 6 and 7, we have
[TABLE]
Take and to be sufficiently small such that
[TABLE]
We thus have
[TABLE]
Therefore,
[TABLE]
This result indicates, in the event , when and when . However, this contradicts the fact that . Therefore, the foregoing analysis implies , and then, for . This is equivalent to for .
Proof of Theorem 9.
The following lemmas are necessary for proving Theorem 9.
Lemma 8.
Given the existence of the small effect, ,
[TABLE]
where .
Proof: The proof is similar to the one in part b) of Theorem 8.
Let us introduce some new notation before stating the next lemma.
[TABLE]
Lemma 9.
Let and . We then have
[TABLE]
Proof: Since
[TABLE]
from Lemma A.2 of Hansen (2000)
[TABLE]
Therefore, according to Chebyshev’s inequality.
Let . We have the following functional central limit theorem:
Lemma 10.
[TABLE]
and is a standard Brownian motion.
Proof: The variance of is
[TABLE]
For any satisfying , is
[TABLE]
Furthermore, let
[TABLE]
We then have
[TABLE]
From Lemma A.0 of Fan and Li (1999), it can then be seen that
[TABLE]
where
[TABLE]
In addition,
[TABLE]
By combining (53), (54), and (55), we have
[TABLE]
Next, the big block and small block method is used to derive the asymptotic normality of . Let and satisfy
[TABLE]
where is the mixing coefficient of . Denote
[TABLE]
where , is a Gaussian function. Then can be rewritten as
[TABLE]
The necessary conditions for applying a functional central limit theorem in a big and small block method include
[TABLE]
From (56), we have the variance and then the variance . Similarly, we have . Therefore, it is clear that holds. In addition, as , it can be seen that (59) also holds.
From Proposition 2.6 of Fan and Yao (2003), we have
[TABLE]
and then also holds. Furthermore, from Lemma 1 of Hansen (2000), and by letting , we obtain
[TABLE]
and then (60) holds. From Lemma 3 of Hansen (1999), we have
[TABLE]
and then (61) also holds. Finally, combining equations (57) through (61), we have proved .
Given Lemma 8, the probability of having in is . Denote . We consequently have
[TABLE]
and
[TABLE]
Given Lemmas 9 and 10, we have
[TABLE]
and then from Theorem 2.7 of Kim and Pollard (1990), we obtain Theorem 1 of Hansen (2000),
[TABLE]
Note for Theorem 10.
[TABLE]
where
[TABLE]
with for and 0 otherwise, and
[TABLE]
with for and 0 otherwise, for and 0 otherwise, and
[TABLE]
with for and 0 otherwise, for and 0 otherwise, and
[TABLE]
with for and 0 otherwise.
Graphic Description of Theorem 10.
[Figure omitted: four panels, Case of b(1) to Case of b(4). Each panel marks $\gamma_{0}$, $\gamma_{1}$, $\gamma$, $\gamma_{2}$, $\gamma_{3}$, $\gamma_{4}$, shows $\hat{m}_{\gamma}(\mathbf{x})$ and $\hat{m}_{\gamma}^{*}(\mathbf{x})$, and numbers the segments (1) through (5).]
Given the three true threshold values , , and , the threshold value of a mis-specified nonparametric regression with one threshold may be in , in , in , or in . For , there is no model mis-specification error for , but the mis-specification errors are
1. is ,
2. is ,
3. is , and
4. is ,
as shown in the first graph of Case . For , the mis-specification errors are
1. for is and is denoted as (1),
2. for is and is denoted as (2),
3. for is and is denoted as (3),
4. for is and is denoted as (4),
5. for is and is denoted as (5),
as shown in the second graph of Case . For , the mis-specification errors are
1. for is and is denoted as (1),
2. for is and is denoted as (2),
3. for is and is denoted as (3),
4. for is and is denoted as (4),
5. for is and is denoted as (5),
as shown in the third graph of Case . For , the mis-specification errors are
1. for is and is denoted as (1),
2. for is and is denoted as (2),
3. for is and is denoted as (3),
4. for is and is denoted as (4),
as shown in the last graph of Case . Note that there is no model mis-specification error for in this case.
As to the cases of , , or , the model mis-specification errors are
[TABLE]
Proof of Theorems 10 and 11.
From Lemma 4,
[TABLE]
where
[TABLE]
and
[TABLE]
Denote as a pseudo threshold value considered in a mis-specified nonparametric regression with one threshold and assume . From (62), we have
[TABLE]
Based on Lemma 5, the limit of the cross products of with the other terms in the above equation will be . Note that the cross products among these terms converge to zero. Therefore, the limit of (63) is
[TABLE]
The limiting properties of , , and can be derived in the same manner.
Proof of Theorem 12.
The slope of for is
[TABLE]
The slope of for is
[TABLE]
The slope of for is
[TABLE]
Finally, the slope of for is
[TABLE]
From (64), the slope is a strictly decreasing function in for . Thus, is the smallest value of the model mis-specification error for . For , we denote
[TABLE]
. The partial effect of on is
[TABLE]
We have . This result indicates that the minimum of is either at or at , regardless of whether the initial value of is positive or negative. In other words, either or must be the minimal value of the model mis-specification error for . In the same manner, either or must be the minimal value of the model mis-specification error for . Finally, from (67), the slope is a strictly increasing function in for . This fact implies that the minimal value of the model mis-specification error is attained at , which is equal to . Therefore, the minimal value among , , and is the global minimum of the model mis-specification error for . This proves part a) of Theorem 12.
Since is assumed, is the global minimum of the model mis-specification error for . Therefore, from Theorem 2.1 of Newey and McFadden (1994), we have
[TABLE]
This completes the proof of parts b) and c) in Theorem 12.
References
Aït-Sahalia, Y., Bickel, P.J., Stoker, T.M., 2001. Goodness-of-fit tests for kernel regression with an application to option implied volatilities, Journal of Econometrics 105, 363–412.
Angrist, J.D., Pischke, J., 2009. Mostly Harmless Econometrics, New Jersey: Princeton University Press.
Bai, J., 1997. Estimating multiple breaks one at a time, Econometric Theory 13, 315–352.
Bai, J., Perron, P., 1998. Estimating and testing linear models with multiple structural changes, Econometrica 66, 47–78.
Bai, J., Perron, P., 2003. Computation and analysis of multiple structural change models, Journal of Applied Econometrics 18, 1–22.
Bhattacharya, P.K., Brockwell, P.J., 1976. The minimum of an additive process with applications to signal estimation and storage theory, Z. Wahrschein. Verw. Gebiete 37, 51–75.
Chan, K.S., 1993. Consistency and limiting distribution of the least squares estimator of a threshold autoregressive model, The Annals of Statistics 21, 520–533.
Chen, B., Hong, Y., 2012. Testing for smooth structural changes in time series models via nonparametric regression, Econometrica 80, 1157–1183.
Chen, B., Hong, Y., 2013. Nonparametric testing for smooth structural change in panel data models, Working Paper, Department of Economics, University of Rochester.
Chen, J.-E., 2008. Estimating and testing quantile regression with structural changes, Working Paper, Department of Economics, NYU.
Chernozhukov, V., Chetverikov, D., Demirer, M., Duflo, E., Hansen, C., Newey, W., 2016. Double machine learning for treatment and causal parameters, Cemmap Working Paper CWP49/16.
Chernozhukov, V., Hansen, C., 2004. The impact of 401(k) participation on the wealth distribution: An instrumental quantile regression analysis, Review of Economics and Statistics 86, 735–751.
Chernozhukov, V., Hansen, C., 2013. High-Dimensional Methods: Examples for Inference on Structural Effects, NBER Summer Institute.
Dette, H., Spreckelsen, I., 2004. Some comments on specification tests in nonparametric absolutely regular processes, Journal of Time Series Analysis 25, 159–172.
Fan, Y., Li, Q., 1999. Central limit theorem for degenerate U-statistics of absolutely regular processes with applications to model specification testing, Journal of Nonparametric Statistics 10, 245–271.
Fan, J., Yao, Q., 2003. Nonlinear Time Series: Nonparametric and Parametric Methods, New York: Springer-Verlag.
Hall, P., 1984. Central limit theorem for integrated squared error of multivariate nonparametric density estimators, Journal of Multivariate Analysis 14, 1–16.
Hansen, B.E., 1999. Threshold effects in non-dynamic panels: Estimation, testing, and inference, Journal of Econometrics 93, 345–368.
Hansen, B.E., 2000. Sample splitting and threshold estimation, Econometrica 68, 575–603.
Henderson, D.J., Parmeter, C.F., Su, L., 2014. Nonparametric threshold regression: Estimation and inference, Working Paper, Department of Economics, University of Miami.
Li, Q., Racine, J.S., 2007. Nonparametric Econometrics: Theory and Practice, Princeton, NJ: Princeton University Press.
Masry, E., 1996. Multivariate regression estimation local polynomial fitting for time series, Stochastic Processes and their Applications 65, 81–101.
Masry, E., Fan, J., 1997. Local polynomial estimation of regression functions for mixing processes, Scandinavian Journal of Statistics 24, 165–179.
Newey, W.K., McFadden, D.L., 1994. Large sample estimation and hypothesis testing, Handbook of Econometrics: Vol. IV, ed. by R. F. Engle and D. L. McFadden, New York: Elsevier, 2113–2245.
Oka, T., Qu, Z., 2011. Estimating structural changes in regression quantiles, Journal of Econometrics 162, 248–267.
Poterba, J.M., Venti, S.F., Wise, D.A., 1994a. 401(k) plans and tax-deferred savings, Studies in the Economics of Aging, Chicago: University of Chicago Press, 105–142.
Poterba, J.M., Venti, S.F., Wise, D.A., 1994b. Do 401(k) contributions crowd out other personal saving?, Journal of Public Economics 58, 1–32.
Qu, Z., 2008. Testing for structural change in regression quantiles, Journal of Econometrics 146, 170–184.
Qu, Z., Perron, P., 2007. Estimating and testing structural changes in multivariate regressions, Econometrica 75, 459–502.
Stone, C.J., 1983. Optimal uniform rate of convergence for nonparametric estimators of a density function or its derivatives, Recent Advances in Statistics, 393–406. Academic Press, New York.
Su, L., Xiao, Z., 2008. Testing structural change in time-series nonparametric regression models, Statistics and Its Interface 1, 347–366.
Yu, P., Phillips, P.C.B., 2015. Threshold regression with endogeneity, Cowles Foundation Discussion Paper no. 1966.
