High-dimensional sliced inverse regression with endogeneity

Linh H. Nghiem, Francis K.C. Hui, Samuel Muller, and A.H. Welsh

arXiv:2412.15530·stat.ME·December 4, 2025

Open Access

TL;DR

This paper introduces a two-stage Lasso SIR method for sufficient dimension reduction in high-dimensional settings with endogenous covariates, a setting in which classical SIR is inconsistent.

Contribution

The paper proposes a two-stage Lasso SIR estimator that accounts for endogeneity, supported by theoretical guarantees and by better empirical performance than methods that ignore endogeneity.

Findings

01. The two-stage Lasso SIR estimator consistently estimates the central subspace.

02. The method performs well in high-dimensional settings where the numbers of covariates and instruments grow exponentially with the sample size.

03. In empirical comparisons, it outperforms methods that ignore endogeneity.

Abstract

Sliced inverse regression (SIR) is a popular sufficient dimension reduction method that identifies a few linear transformations of the covariates without losing regression information with the response. In high-dimensional settings, SIR can be combined with sparsity penalties to achieve sufficient dimension reduction and variable selection simultaneously. Nevertheless, both classical and sparse estimators assume the covariates are exogenous. However, endogeneity can arise in a variety of situations, such as when variables are omitted or are measured with error. In this article, we show such endogeneity invalidates SIR estimators, leading to inconsistent estimation of the true central subspace. To address this challenge, we propose a two-stage Lasso SIR estimator, which first constructs a sparse high-dimensional instrumental variables model to obtain fitted values of the covariates…
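
The two-stage idea in the abstract can be illustrated with a minimal sketch on simulated data. Everything here is an assumption made for illustration, not the authors' implementation: the helper names (lasso_cd, sir_directions, two_stage_sir), the tuning values, and the simulation design are hypothetical. In particular, the abstract is truncated before it describes the second stage, so the sketch simply applies classical, unpenalized SIR to the first-stage fitted values instead of a sparsity-penalized version.

```python
import numpy as np

def lasso_cd(Z, x, lam, n_iter=100):
    """Lasso by cyclic coordinate descent: min (1/2n)||x - Z g||^2 + lam * ||g||_1."""
    n, q = Z.shape
    g = np.zeros(q)
    col_sq = (Z ** 2).sum(axis=0) / n
    r = np.array(x, dtype=float)              # running residual x - Z g
    for _ in range(n_iter):
        for j in range(q):
            if col_sq[j] == 0.0:
                continue
            r += Z[:, j] * g[j]               # remove coordinate j from the fit
            rho = Z[:, j] @ r / n
            g[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r -= Z[:, j] * g[j]               # add the updated coordinate back
    return g

def sir_directions(X, y, n_slices=10, d=1, ridge=1e-8):
    """Classical SIR: top-d generalized eigenvectors of Cov(E[X | y]) w.r.t. Cov(X)."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    Sigma = Xc.T @ Xc / n + ridge * np.eye(p)  # small ridge for numerical stability
    M = np.zeros((p, p))
    for idx in np.array_split(np.argsort(y), n_slices):
        m = Xc[idx].mean(axis=0)               # slice mean of the centered covariates
        M += (len(idx) / n) * np.outer(m, m)
    w, V = np.linalg.eigh(Sigma)
    inv_sqrt = V @ np.diag(w ** -0.5) @ V.T    # Sigma^{-1/2}
    _, vecs = np.linalg.eigh(inv_sqrt @ M @ inv_sqrt)
    B = inv_sqrt @ vecs[:, ::-1][:, :d]        # eigh sorts ascending, so reverse
    return B / np.linalg.norm(B, axis=0)

def two_stage_sir(X, y, Z, lam=0.05, n_slices=10, d=1):
    """Stage 1: Lasso of each covariate on the instruments. Stage 2: SIR on fitted values."""
    Zc = Z - Z.mean(axis=0)
    Xc = X - X.mean(axis=0)
    Gamma_hat = np.column_stack(
        [lasso_cd(Zc, Xc[:, j], lam) for j in range(X.shape[1])]
    )
    return sir_directions(Zc @ Gamma_hat, y, n_slices, d)

def abs_cosine(a, b):
    return abs(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Hypothetical simulation: the response error shares a component with the
# covariate error U, so the covariates are endogenous.
rng = np.random.default_rng(0)
n, p, q = 3000, 5, 10
Z = rng.normal(size=(n, q))                    # instruments
Gamma = np.zeros((q, p))
Gamma[:p, :p] = np.eye(p)                      # sparse first-stage coefficients
U = rng.normal(size=(n, p))
X = Z @ Gamma + U
beta = np.array([1.0, 1.0, 0.0, 0.0, 0.0]) / np.sqrt(2)
eps = 1.5 * U[:, 2] + 0.5 * rng.normal(size=n) # correlated with X[:, 2]
y = X @ beta + eps

b_naive = sir_directions(X, y)[:, 0]
b_two_stage = two_stage_sir(X, y, Z)[:, 0]
cos_naive = abs_cosine(b_naive, beta)
cos_two_stage = abs_cosine(b_two_stage, beta)
print("naive SIR     |cos| with beta:", round(cos_naive, 3))
print("two-stage SIR |cos| with beta:", round(cos_two_stage, 3))
```

In this design the naive estimate is pulled toward the third covariate, whose error is shared with the response, while the direction estimated from the instrument-based fitted values should align more closely with beta. A monotone linear link is used only to keep the simulation simple; SIR itself does not require it.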

Taxonomy

Topics: Numerical methods in inverse problems · Gaussian Processes and Bayesian Inference · Face and Expression Recognition