The conditionality principle in high-dimensional regression
David Azriel

TL;DR
This paper investigates how knowledge of the covariate distribution affects high-dimensional regression, showing that information about the ancillary covariates can significantly improve inference, especially in semi-supervised learning.
Contribution
The paper shows that, in high-dimensional regression, whether consistent estimation is possible under the conditional framework depends on what is known about the covariate distribution.
Findings
No consistent estimator exists in the conditional framework when the covariate distribution is unknown.
The conditional error of a previously developed consistent estimator converges to zero when the marginal distribution of the covariates is normal.
Additional observations of the covariates alone, without responses, can greatly improve inference in semi-supervised learning.
Abstract
Consider a high-dimensional linear regression problem, where the number of covariates is larger than the number of observations and the interest is in estimating the conditional variance of the response variable given the covariates. Both a conditional and an unconditional framework are considered, where conditioning is with respect to the covariates, which are ancillary to the parameter of interest. In recent papers, a consistent estimator was developed in the unconditional framework when the marginal distribution of the covariates is normal with known mean and variance. In the present work, a certain Bayesian hypothesis test is formulated under the conditional framework, and it is shown that the Bayes risk is a constant. This implies that no consistent estimator exists in the conditional framework. However, when the marginal distribution of the covariates is normal, the conditional error of…
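The setting in the abstract can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes covariates drawn from N(0, I_p) (known mean zero and identity covariance) and uses a standard method-of-moments estimator of the conditional variance, chosen here only as one example of an estimator that exploits a known normal covariate distribution. Whether it matches the estimator the abstract refers to is an assumption. The function names moment_variance_estimate and simulate are hypothetical.

```python
import numpy as np


def moment_variance_estimate(X, y):
    """Method-of-moments estimate of the noise variance sigma^2.

    Assumption (not from the source): rows of X are i.i.d. N(0, I_p).
    Under that assumption the estimator is unbiased for sigma^2 in the
    model y = X @ beta + noise, even when p > n.
    """
    n, p = X.shape
    y_sq = float(y @ y)          # ||y||^2
    xty = X.T @ y
    xty_sq = float(xty @ xty)    # ||X^T y||^2
    return ((p + n + 1) * y_sq - xty_sq) / (n * (n + 1))


def simulate(n, p, sigma2, seed=0):
    """Draw one data set with p covariates and n observations and
    return the variance estimate. beta is scaled to unit norm."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))           # covariates ~ N(0, I_p)
    beta = rng.standard_normal(p)
    beta /= np.linalg.norm(beta)              # ||beta||^2 = 1
    noise = np.sqrt(sigma2) * rng.standard_normal(n)
    y = X @ beta + noise
    return moment_variance_estimate(X, y)


if __name__ == "__main__":
    # More covariates than observations, true conditional variance 1.0.
    print(simulate(n=500, p=1000, sigma2=1.0))
```

The estimator never fits the regression coefficients. It relies only on the second moments of y and X^T y together with the assumed covariate distribution, which is why knowledge of that distribution matters when p exceeds n.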
