Prediction Aided by Surrogate Training
Eric Xia, Martin J. Wainwright

TL;DR
This paper introduces PAST, a method that leverages helper covariates during training to improve prediction accuracy using only standard covariates at test time, with theoretical guarantees and empirical validation.
Contribution
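The paper proposes PAST, a framework that fits a response estimator on the labeled data using both standard and helper covariates, then uses the resulting pseudo-responses on the full dataset to train a predictor that takes only standard covariates as input. The method comes with theoretical error bounds.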
Findings
Theoretical guarantees on prediction error bounds for PAST.
Empirical improvements demonstrated across diverse applications.
Characterization of regimes where PAST approaches oracle accuracy.
Abstract
We study a class of prediction problems in which relatively few observations have associated responses, but all observations include both standard covariates as well as additional "helper" covariates. While the end goal is to make high-quality predictions using only the standard covariates, helper covariates can be exploited during training to improve prediction. Helper covariates arise in many applications, including forecasting in time series; incorporation of biased or mis-calibrated predictions from foundation models; and sharing information in transfer learning. We propose "prediction aided by surrogate training" (PAST), a class of methods that exploit labeled data to construct a response estimator based on both the standard and helper covariates; and then use the full dataset with pseudo-responses to train a predictor based only on standard covariates. We establish…
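The two-stage procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes ordinary least squares for both stages, and it assumes labeled points keep their observed responses while only unlabeled points receive pseudo-responses. The function names (`fit_linear`, `predict_linear`, `past_sketch`) and the variable names `x` (standard covariates) and `w` (helper covariates) are hypothetical.

```python
import numpy as np


def fit_linear(features, targets):
    """Ordinary least squares with an intercept; returns [intercept, slopes...]."""
    design = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return coef


def predict_linear(coef, features):
    """Apply coefficients from fit_linear to new feature rows."""
    design = np.column_stack([np.ones(len(features)), features])
    return design @ coef


def past_sketch(x_labeled, w_labeled, y_labeled, x_unlabeled, w_unlabeled):
    """Hypothetical two-stage sketch; the regressors are assumptions."""
    # Stage 1: response estimator fit on labeled data, using both the
    # standard covariates (x) and the helper covariates (w).
    surrogate = fit_linear(np.column_stack([x_labeled, w_labeled]), y_labeled)

    # Pseudo-responses for the unlabeled observations.
    pseudo = predict_linear(surrogate, np.column_stack([x_unlabeled, w_unlabeled]))

    # Stage 2: predictor that sees only standard covariates, trained on the
    # full dataset (observed responses plus pseudo-responses; an assumption).
    x_all = np.vstack([x_labeled, x_unlabeled])
    y_all = np.concatenate([y_labeled, pseudo])
    return fit_linear(x_all, y_all)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x_lab, w_lab = rng.normal(size=(20, 1)), rng.normal(size=(20, 1))
    x_unl, w_unl = rng.normal(size=(200, 1)), rng.normal(size=(200, 1))
    y_lab = 1.0 + 2.0 * x_lab[:, 0] + 0.5 * w_lab[:, 0] + 0.1 * rng.normal(size=20)
    coef = past_sketch(x_lab, w_lab, y_lab, x_unl, w_unl)
    print(predict_linear(coef, np.array([[0.3]])))
```

At test time only `x` is passed to `predict_linear`, which matches the stated goal of predicting from standard covariates alone. Any regressor could be swapped in for either stage; least squares is used here only to keep the sketch short.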
Taxonomy
Topics: Machine Learning and Data Classification
