Revisiting inference after prediction
Keshav Motwani, Daniela Witten

TL;DR
This paper evaluates recent methods for valid inference after prediction. It shows that the method of Angelopoulos et al. (2023) controls error rates regardless of the quality of the prediction model, whereas the method of Wang et al. (2020) relies on assumptions that rarely hold in practice.
Contribution
It demonstrates the robustness of Angelopoulos et al.'s correction method for inference after prediction, contrasting it with the limitations of Wang et al.'s approach.
Findings
Angelopoulos et al.'s method controls type 1 error regardless of model quality.
Wang et al.'s method is valid only when the pre-trained model estimates the underlying relationship almost perfectly, which is rarely achievable.
The paper clarifies conditions under which inference methods are valid.
Abstract
Recent work has focused on the very common practice of prediction-based inference: that is, (i) using a pre-trained machine learning model to predict an unobserved response variable, and then (ii) conducting inference on the association between that predicted response and some covariates. As pointed out by Wang et al. (2020), applying a standard inferential approach in (ii) does not accurately quantify the association between the unobserved (as opposed to the predicted) response and the covariates. In recent work, Wang et al. (2020) and Angelopoulos et al. (2023) propose corrections to step (ii) in order to enable valid inference on the association between the unobserved response and the covariates. Here, we show that the method proposed by Angelopoulos et al. (2023) successfully controls the type 1 error rate and provides confidence intervals with correct nominal coverage, regardless…
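The gap between steps (i) and (ii) can be illustrated with a small simulation. The sketch below is not the authors' code: it assumes a small labeled sample is available alongside a larger unlabeled one, uses a hypothetical and deliberately poor predictor `predict`, and applies a rectifier-style correction in the spirit of Angelopoulos et al. (2023) to a simple regression slope. All function names, sample sizes, and the data-generating model are illustrative assumptions, and only point estimates are computed, not confidence intervals or tests.

```python
import random


def ols_slope(x, y):
    """Slope of the least-squares line of y on x (intercept included)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def predict(x):
    # Hypothetical pre-trained model (an assumption for illustration):
    # it attenuates the true signal, so it is a poor predictor.
    return 0.5 * x


def simulate(seed=0, n_labeled=500, n_unlabeled=5000, true_slope=1.0):
    rng = random.Random(seed)

    # Labeled sample: covariate and true response are both observed.
    x_lab = [rng.gauss(0, 1) for _ in range(n_labeled)]
    y_lab = [true_slope * x + rng.gauss(0, 1) for x in x_lab]

    # Unlabeled sample: only the covariate is observed.
    x_unl = [rng.gauss(0, 1) for _ in range(n_unlabeled)]

    f_lab = [predict(x) for x in x_lab]
    f_unl = [predict(x) for x in x_unl]

    # Naive approach: regress the predicted response on the covariate.
    naive = ols_slope(x_unl, f_unl)

    # Rectifier: slope of the prediction error (f - y) on the labeled data.
    rectifier = ols_slope(x_lab, [f - y for f, y in zip(f_lab, y_lab)])

    # Corrected estimate: naive slope minus the estimated bias.
    corrected = naive - rectifier
    return naive, corrected


naive, corrected = simulate()
print(f"naive slope:     {naive:.3f}")
print(f"corrected slope: {corrected:.3f}")
```

In this setup the naive slope reflects the predictor (about 0.5) rather than the true association (1.0), while the corrected slope recovers the true value up to sampling noise from the labeled data. The correction does not depend on the predictor being good; a better predictor would only reduce the variance of the corrected estimate.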
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Forecasting Techniques and Applications · Machine Learning and Data Classification
