Rate Optimal Estimation and Confidence Intervals for High-dimensional Regression with Missing Covariates
Yining Wang, Jialei Wang, Sivaraman Balakrishnan, Aarti Singh

TL;DR
This paper develops methods for high-dimensional linear regression with missing covariates, providing optimal estimation bounds and confidence intervals, and reveals a parameter-dependent phase transition in estimation accuracy.
Contribution
It introduces a de-biased estimator for inference with missing data, establishes upper and lower bounds on the estimation error, and uncovers a phase transition that depends on whether the covariance matrix is known.
Findings
Faster estimation rates when the covariance matrix is known.
Establishment of minimax lower bounds matching upper bounds.
Effective confidence intervals constructed despite missing data.
Abstract
Although a majority of the theoretical literature in high-dimensional statistics has focused on settings which involve fully-observed data, settings with missing values and corruptions are common in practice. We consider the problems of estimation and of constructing component-wise confidence intervals in a sparse high-dimensional linear regression model when some covariates of the design matrix are missing completely at random. We analyze a variant of the Dantzig selector [9] for estimating the regression model and we use a de-biasing argument to construct component-wise confidence intervals. Our first main result is to establish upper bounds on the estimation error as a function of the model parameters (the sparsity level s, the expected fraction of observed covariates, and a measure of the signal strength). We find that even in an idealized setting where the…
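To make the missing-completely-at-random setting concrete, the following is a minimal sketch and not the paper's implementation. It assumes each covariate entry is observed independently with a known probability `rho` and that missing entries are zero-filled, and it uses the bias-corrected moment estimates that are a common device in the missing-data regression literature. The function names `mcar_surrogates` and `dantzig_residual` are hypothetical, introduced only for illustration. A Dantzig-selector-style estimator would then minimize the l1 norm of `beta` subject to the residual returned by `dantzig_residual` being at most a tuning parameter; that optimization step is omitted here.

```python
import random


def mcar_surrogates(X, y, rho, seed=0):
    """Mask X completely at random and return bias-corrected moment estimates.

    Assumption (not from the abstract): every entry of X is observed
    independently with known probability rho; missing entries are set to 0.
    """
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    # Zero-filled design: keep an entry with probability rho, else replace by 0.
    Z = [[x if rng.random() < rho else 0.0 for x in row] for row in X]

    sigma = [[0.0] * p for _ in range(p)]  # surrogate for the covariance of X
    gamma = [0.0] * p                      # surrogate for the covariance of X and y
    for i in range(n):
        for j in range(p):
            gamma[j] += Z[i][j] * y[i]
            for k in range(p):
                sigma[j][k] += Z[i][j] * Z[i][k]

    for j in range(p):
        # E[Z_j * y] = rho * E[X_j * y], so divide by rho.
        gamma[j] /= n * rho
        for k in range(p):
            # E[Z_j * Z_k] is rho * E[X_j^2] on the diagonal and
            # rho^2 * E[X_j * X_k] off the diagonal.
            sigma[j][k] /= n * (rho if j == k else rho * rho)
    return sigma, gamma


def dantzig_residual(sigma, gamma, beta):
    """Sup-norm of sigma @ beta - gamma, the quantity a Dantzig-type constraint bounds."""
    p = len(gamma)
    return max(
        abs(sum(sigma[j][k] * beta[k] for k in range(p)) - gamma[j])
        for j in range(p)
    )
```

With `rho = 1.0` nothing is masked and the surrogates reduce to the usual sample moments, which is a convenient sanity check.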
Taxonomy
Topics: Statistical Methods and Inference · Statistical Methods and Bayesian Inference · Advanced Statistical Methods and Models
Methods: Linear Regression
