Surrogate-Powered Inference: Regularization and Adaptivity
Jianmin Chen, Huiyuan Wang, Thomas Lumley, Xiaowu Dai, Yong Chen

TL;DR
This paper introduces the surrogate-powered inference (SPI) toolbox, a unified framework that combines validated labels with surrogate labels for reliable statistical inference. It reduces estimation error and increases statistical power while handling multiple surrogates and limited validation budgets.
Contribution
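The paper proposes the SPI framework in three progressively enhanced versions: Base-SPI, which integrates surrogates with validated labels through augmentation; SPI+, which adds regularized regression to handle multiple surrogates safely; and SPI++, which adds adaptive multiwave labeling for limited validation budgets.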
Findings
SPI reduces estimation error compared to traditional methods.
SPI increases power in risk factor identification.
Theoretical guarantees support the method's reliability.
Abstract
High-quality labeled data are essential for reliable statistical inference, but are often limited by validation costs. While surrogate labels provide cost-effective alternatives, their noise can introduce non-negligible bias. To address this challenge, we propose the surrogate-powered inference (SPI) toolbox, a unified framework that leverages both the validity of high-quality labels and the abundance of surrogates to enable reliable statistical inference. SPI comprises three progressively enhanced versions. Base-SPI integrates validated labels and surrogates through augmentation to improve estimation efficiency. SPI+ incorporates regularized regression to safely handle multiple surrogates, preventing performance degradation due to error accumulation. SPI++ further optimizes efficiency under limited validation budgets through an adaptive, multiwave labeling procedure that prioritizes…
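
As a rough illustration of what combining validated labels with surrogates "through augmentation" can look like, the sketch below estimates a population mean with a generic control-variate style correction. It is not taken from the paper: the function name `augmented_mean`, the plug-in weight, and the simulated data are all assumptions made for illustration, and the actual Base-SPI estimator may differ.

```python
import numpy as np


def augmented_mean(y_val, s_val, s_all):
    """Hypothetical augmented estimate of E[Y].

    y_val: validated labels on the small validated subset
    s_val: surrogate labels on that same subset
    s_all: surrogate labels on the full sample
    """
    y_val = np.asarray(y_val, dtype=float)
    s_val = np.asarray(s_val, dtype=float)
    s_all = np.asarray(s_all, dtype=float)

    # Plug-in weight Cov(Y, S) / Var(S), estimated on the validated subset.
    # A constant surrogate carries no information, so its weight is zero.
    var_s = np.var(s_val, ddof=1)
    gamma = 0.0 if var_s == 0 else np.cov(y_val, s_val, ddof=1)[0, 1] / var_s

    # Start from the validated-only mean, then correct it by how far the
    # validated subset's surrogate mean drifts from the full-sample one.
    return y_val.mean() - gamma * (s_val.mean() - s_all.mean())


# Simulated data (assumed, for illustration only): a noisy, biased surrogate.
rng = np.random.default_rng(0)
n_total, n_validated = 5000, 200
y = rng.normal(loc=1.0, scale=1.0, size=n_total)
s = y + rng.normal(loc=0.3, scale=0.5, size=n_total)

# Only the first n_validated units have their true label observed.
estimate = augmented_mean(y[:n_validated], s[:n_validated], s)
print("validated-only mean:", y[:n_validated].mean())
print("augmented mean:     ", estimate)
```

Because the correction term uses only the difference between two surrogate means, a constant bias in the surrogate cancels out, which is one reason this style of estimator can borrow strength from noisy surrogates without inheriting their bias.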
Taxonomy
Topics: Advanced Causal Inference Techniques · Statistical Methods and Bayesian Inference · Gaussian Processes and Bayesian Inference
