Doubly robust and computationally efficient high-dimensional variable selection
Abhinav Chakraborty, Jeffrey Zhang, Eugene Katsevich

TL;DR
This paper introduces tower PCM (tPCM), a computationally efficient method for high-dimensional variable selection. It retains the power and robustness of the projected covariance measure (PCM) test while running substantially faster than applying PCM to each variable separately.
Contribution
The paper proposes tPCM, an extension of PCM that reduces computational cost by combining an estimate of the joint predictor distribution with a single response-on-all-predictors fit, and proves its robustness and asymptotic equivalence to existing methods.
Findings
tPCM achieves up to 130× speedup over PCM in simulations.
tPCM maintains the power of PCM while being computationally more efficient.
tPCM improves per-variable p-value estimation and speed over existing model-X methods.
Abstract
Variable selection can be performed by testing conditional independence (CI) between each predictor and the response, given the other predictors. A doubly robust and powerful option for these CI tests is the projected covariance measure (PCM) test. However, directly deploying PCM for variable selection brings computational challenges: testing a single variable involves a few machine learning fits, so testing p variables requires O(p) fits. Inspired by model-X ideas, we observe that an estimate of the joint predictor distribution and a single response-on-all-predictors fit can be used to reconstruct all PCM fits. This yields tower PCM (tPCM), a computationally efficient extension of PCM to variable selection. When the joint predictor distribution is sufficiently tractable, as in applications like genome-wide association studies, tPCM offers a substantial speedup over PCM -- up to…
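The reuse idea in the abstract rests on the tower property: E[Y | X_{-j}] = E[ E[Y | X] | X_{-j} ]. Given one fit of Y on all predictors and a model for X_j given X_{-j}, each per-variable regression can be approximated by integration instead of a new fit. The Python sketch below illustrates only that step and is not the authors' implementation. It assumes Gaussian predictors with a known covariance `Sigma`, uses ordinary least squares as a stand-in for the single machine learning fit, and all function and variable names are hypothetical.

```python
import numpy as np


def conditional_gaussian_params(Sigma, j):
    """Weights w and variance of X_j | X_{-j} when X ~ N(0, Sigma) (assumed model)."""
    idx = np.delete(np.arange(Sigma.shape[0]), j)   # indices of the other predictors
    S_oo = Sigma[np.ix_(idx, idx)]
    s_jo = Sigma[j, idx]
    w = np.linalg.solve(S_oo, s_jo)                 # E[X_j | X_{-j}] = X_{-j} @ w
    var = Sigma[j, j] - s_jo @ w                    # Var[X_j | X_{-j}]
    return idx, w, var


def tower_conditional_means(f_hat, X, Sigma, n_draws=200, seed=0):
    """Monte Carlo estimate of E[f_hat(X) | X_{-j}] for every column j.

    f_hat is fitted once; no per-variable refitting happens here.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    out = np.empty((n, p))
    for j in range(p):
        idx, w, var = conditional_gaussian_params(Sigma, j)
        cond_mean = X[:, idx] @ w
        total = np.zeros(n)
        for _ in range(n_draws):
            X_tilde = X.copy()
            # Resample column j from its conditional law given the other columns.
            X_tilde[:, j] = cond_mean + np.sqrt(var) * rng.standard_normal(n)
            total += f_hat(X_tilde)
        out[:, j] = total / n_draws
    return out


# Toy data: AR(1) covariance, response depends on the first predictor only.
rng = np.random.default_rng(1)
n, p, rho = 200, 5, 0.5
Sigma = rho ** np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
y = X[:, 0] + rng.standard_normal(n)

# Single response-on-all-predictors fit (least squares as a stand-in learner).
beta = np.linalg.lstsq(X, y, rcond=None)[0]


def f_hat(Z):
    return Z @ beta


# Column j holds the estimate of E[Y | X_{-j}] for each observation.
m_hat = tower_conditional_means(f_hat, X, Sigma)
```

In this sketch the cost of the learner is paid once, and the loop over variables only evaluates the fitted function on resampled inputs. A complete test would additionally need the remaining PCM ingredients and a test statistic, which are not shown here.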
Taxonomy
Topics: Fault Detection and Control Systems · Control Systems and Identification · Advanced Statistical Methods and Models
