Cross-Prediction-Powered Inference
Tijana Zrnic, Emmanuel J. Candès

TL;DR
This paper introduces cross-prediction, a machine learning-based method for valid statistical inference using both labeled and unlabeled data, improving power and stability over existing approaches.
Contribution
The paper proposes cross-prediction, an inference technique that uses machine learning predictions to incorporate unlabeled data while keeping statistical conclusions valid, with gains in power and stability.
Findings
Cross-prediction achieves valid error control in inference.
It is more powerful than an adaptation of prediction-powered inference.
It produces more stable confidence intervals with lower variability.
Abstract
While reliable data-driven decision-making hinges on high-quality labeled data, the acquisition of quality labels often involves laborious human annotations or slow and expensive scientific measurements. Machine learning is becoming an appealing alternative as sophisticated predictive techniques are being used to quickly and cheaply produce large amounts of predicted labels; e.g., predicted protein structures are used to supplement experimentally derived structures, predictions of socioeconomic indicators from satellite imagery are used to supplement accurate survey data, and so on. Since predictions are imperfect and potentially biased, this practice brings into question the validity of downstream inferences. We introduce cross-prediction: a method for valid inference powered by machine learning. With a small labeled dataset and a large unlabeled dataset, cross-prediction imputes the…
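The setup described above (a small labeled dataset, a large unlabeled dataset, and imputed labels) can be illustrated with a minimal sketch for estimating a mean. It assumes a cross-fitting form of the estimator: the labeled data are split into folds, a model trained on the other folds imputes labels for the unlabeled data, and the held-out fold is used to subtract the average prediction error. The function names, the fold count, and the least-squares stand-in model are hypothetical choices for illustration, not the authors' implementation, and the sketch returns only a point estimate, not a confidence interval.

```python
import numpy as np

def fit_linear(X, y):
    # Hypothetical stand-in for any machine learning model:
    # ordinary least squares with an intercept.
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda Z: np.column_stack([np.ones(len(Z)), Z]) @ coef

def cross_prediction_mean(X_lab, y_lab, X_unlab, n_folds=5, fit=fit_linear, seed=0):
    """Point estimate of E[Y] from labeled and unlabeled data (assumed form)."""
    rng = np.random.default_rng(seed)
    n = len(y_lab)
    folds = np.array_split(rng.permutation(n), n_folds)

    imputed = np.zeros(len(X_unlab))  # fold-averaged predictions on unlabeled data
    residual_sum = 0.0                # summed prediction errors on held-out labeled points

    for held_out in folds:
        train = np.setdiff1d(np.arange(n), held_out)
        model = fit(X_lab[train], y_lab[train])
        imputed += model(X_unlab) / n_folds
        # Each labeled point is scored only by a model that never saw it.
        residual_sum += np.sum(model(X_lab[held_out]) - y_lab[held_out])

    # Average imputed label, minus the estimated bias of the predictions.
    return imputed.mean() - residual_sum / n
```

In this sketch, a model that predicts zero everywhere reduces the estimate to the plain mean of the labeled outcomes, which shows that the bias-correction term is what protects the estimate when predictions are poor.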
Taxonomy
Topics: Machine Learning and Data Classification · Bayesian Modeling and Causal Inference · Explainable Artificial Intelligence (XAI)
