Regression for the Mean: Auto-Evaluation and Inference with Few Labels through Post-hoc Regression

Benjamin Eyre; David Madras

arXiv:2411.12665·cs.LG·July 9, 2025



TL;DR

This paper shows that PPI++ can perform worse than classical inference when only a few high-quality labels are available, traces the problem to the high variance of ordinary least squares regression at small sample sizes, and proposes new PPI-based techniques that use robust regressors to reduce that variance.

Contribution

The paper analyzes the limitations of PPI++ with scarce labels and proposes novel PPI-based methods utilizing robust regressors for better inference with limited data.

Findings

01. PPI++ can underperform classical methods with few labels.

02. Relating PPI++ to OLS regression explains its variance issues.

03. Robust regressors improve estimator stability in small sample regimes.

Abstract

The availability of machine learning systems that can effectively perform arbitrary tasks has led to synthetic labels from these systems being used in applications of statistical inference, such as data analysis or model evaluation. The Prediction Powered Inference (PPI) framework provides a way of leveraging both a large pool of pseudo-labelled data and a small sample with real, high-quality labels to produce a low-variance, unbiased estimate of the quantity being evaluated for. Most work on PPI considers a relatively sizable set of labelled samples, which can be resource intensive to obtain. However, we find that when labelled data is scarce, the PPI++ method can perform even worse than classical inference. We analyze this phenomenon by relating PPI++ to ordinary least squares regression, which also experiences high variance with small sample sizes, and use this regression framework…
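
The mean-estimation case described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names and the synthetic data are hypothetical, and the tuning formula for the weight `lam` is the commonly cited PPI++ choice for mean estimation, stated here as an assumption. The sketch shows why the estimate is sensitive to small labelled samples: `lam` is essentially a regression slope fitted on the few labelled points.

```python
import numpy as np


def classical_mean(y_labelled):
    """Classical estimate: the sample mean of the real labels only."""
    return float(np.mean(y_labelled))


def ppi_mean(y_labelled, f_labelled, f_unlabelled, lam=1.0):
    """PPI-style mean estimate (sketch).

    y_labelled   : real labels on the small labelled sample
    f_labelled   : model predictions (pseudo-labels) on that same sample
    f_unlabelled : model predictions on the large unlabelled pool
    lam          : weight on the pseudo-label correction; lam=0 recovers
                   the classical mean, lam=1 is the untuned PPI estimate
    """
    correction = np.mean(f_unlabelled) - np.mean(f_labelled)
    return float(np.mean(y_labelled) + lam * correction)


def ppi_pp_lambda(y_labelled, f_labelled, n_unlabelled):
    """Plug-in weight for mean estimation (assumed PPI++ form).

    lam = Cov(y, f) / ((1 + n / N) * Var(f)), estimated from the n
    labelled points. This is an OLS-like slope, so it is noisy when n
    is small.
    """
    n = len(y_labelled)
    cov_yf = np.cov(y_labelled, f_labelled, ddof=1)[0, 1]
    var_f = np.var(f_labelled, ddof=1)
    return float(cov_yf / ((1.0 + n / n_unlabelled) * var_f))


if __name__ == "__main__":
    # Synthetic data, purely illustrative.
    rng = np.random.default_rng(0)
    n, N = 10, 10_000
    truth_l = rng.normal(size=n)
    y_l = truth_l
    f_l = truth_l + rng.normal(scale=0.5, size=n)      # noisy predictions
    f_u = rng.normal(size=N) + rng.normal(scale=0.5, size=N)

    lam = ppi_pp_lambda(y_l, f_l, N)
    print("classical:", classical_mean(y_l))
    print("PPI (lam=1):", ppi_mean(y_l, f_l, f_u, lam=1.0))
    print("PPI++ (tuned lam):", ppi_mean(y_l, f_l, f_u, lam=lam))
```

Rerunning the demo with different seeds and a small `n` gives a rough sense of how much the tuned weight fluctuates from sample to sample, which is the variance issue the paper attributes to the OLS connection.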


Taxonomy

Topics: Fuzzy Logic and Control Systems · Fault Detection and Control Systems

Methods: Sparse Evolutionary Training