High-dimensional sliced inverse regression with endogeneity
Linh H. Nghiem, Francis K.C. Hui, Samuel Muller, and A.H. Welsh

TL;DR
This paper introduces a two-stage Lasso SIR method to effectively perform sufficient dimension reduction in high-dimensional settings with endogenous covariates, addressing the inconsistency of classical SIR.
Contribution
It proposes a novel two-stage Lasso SIR estimator that accounts for endogeneity, with theoretical guarantees and superior empirical performance over existing methods.
Findings
The two-stage Lasso SIR estimator achieves consistent estimation of the central subspace.
The method performs well in high-dimensional settings where the number of covariates and instruments grows exponentially with the sample size.
Empirical results demonstrate its superiority over methods ignoring endogeneity.
Abstract
Sliced inverse regression (SIR) is a popular sufficient dimension reduction method that identifies a few linear transformations of the covariates without losing regression information with the response. In high-dimensional settings, SIR can be combined with sparsity penalties to achieve sufficient dimension reduction and variable selection simultaneously. Nevertheless, both classical and sparse estimators assume the covariates are exogenous. However, endogeneity can arise in a variety of situations, such as when variables are omitted or are measured with error. In this article, we show such endogeneity invalidates SIR estimators, leading to inconsistent estimation of the true central subspace. To address this challenge, we propose a two-stage Lasso SIR estimator, which first constructs a sparse high-dimensional instrumental variables model to obtain fitted values of the covariates…
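The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' estimator: the abstract is truncated before the second stage is described, so the sketch assumes the second stage applies SIR to the first-stage fitted values, and it swaps in classical (unpenalised) SIR where the paper uses Lasso SIR. It also assumes a low-dimensional setting with more observations than covariates. The function names (`lasso_cd`, `first_stage_fitted`, `sir_directions`, `two_stage_sir`), the penalty level `alpha`, and the number of slices are all hypothetical choices made for illustration.

```python
import numpy as np


def lasso_cd(Z, x, alpha, n_iter=100):
    """Minimise (1/2n)||x - Z w||^2 + alpha * ||w||_1 by cyclic coordinate descent."""
    n, q = Z.shape
    w = np.zeros(q)
    col_sq = (Z ** 2).sum(axis=0) / n
    resid = x.astype(float).copy()
    for _ in range(n_iter):
        for k in range(q):
            if col_sq[k] == 0.0:
                continue
            resid += Z[:, k] * w[k]  # remove coordinate k's current contribution
            rho = Z[:, k] @ resid / n
            w[k] = np.sign(rho) * max(abs(rho) - alpha, 0.0) / col_sq[k]
            resid -= Z[:, k] * w[k]  # add the updated contribution back
    return w


def first_stage_fitted(X, Z, alpha=0.01):
    """Stage 1 (assumed form): Lasso-regress each covariate on the instruments."""
    Zc = Z - Z.mean(axis=0)
    X_hat = np.empty(X.shape, dtype=float)
    for j in range(X.shape[1]):
        x_mean = X[:, j].mean()
        w = lasso_cd(Zc, X[:, j] - x_mean, alpha)
        X_hat[:, j] = x_mean + Zc @ w
    return X_hat


def sir_directions(X, y, n_slices=10, n_dirs=1):
    """Classical SIR: leading generalised eigenvectors of (Cov(E[X | slice]), Cov(X))."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    # Small ridge keeps the covariance invertible if a fitted column is constant.
    sigma = Xc.T @ Xc / n + 1e-8 * np.eye(p)
    lam = np.zeros((p, p))
    for idx in np.array_split(np.argsort(y), n_slices):
        m = Xc[idx].mean(axis=0)
        lam += (len(idx) / n) * np.outer(m, m)
    # Whiten with sigma^{-1/2}, take top eigenvectors, then map back.
    s_vals, s_vecs = np.linalg.eigh(sigma)
    inv_sqrt = s_vecs @ np.diag(s_vals ** -0.5) @ s_vecs.T
    _, vecs = np.linalg.eigh(inv_sqrt @ lam @ inv_sqrt)
    B = inv_sqrt @ vecs[:, ::-1][:, :n_dirs]
    return B / np.linalg.norm(B, axis=0)


def two_stage_sir(X, Z, y, alpha=0.01, n_slices=10, n_dirs=1):
    """Stage 2 (assumed form): run SIR on the instrument-fitted covariates."""
    X_hat = first_stage_fitted(X, Z, alpha=alpha)
    return sir_directions(X_hat, y, n_slices=n_slices, n_dirs=n_dirs)
```

Calling `two_stage_sir(X, Z, y)` with an `n × p` covariate matrix `X`, an `n × q` instrument matrix `Z`, and a response vector `y` returns a `p × n_dirs` matrix whose unit-norm columns estimate a basis of the central subspace. Replacing `X_hat` with `X` in the last step gives the naive SIR fit that ignores endogeneity.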
Taxonomy
Topics: Numerical methods in inverse problems · Gaussian Processes and Bayesian Inference · Face and Expression Recognition
