Unbiased Prevalence Estimation with Multicalibrated LLMs

Fridolin Linder; Thomas Leeper; Daniel Haimovich; Niek Tax; Lorenzo Perini; Milan Vojnovic

arXiv:2604.21549 · cs.AI · April 24, 2026

TL;DR

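The paper shows that multicalibration, which enforces calibration conditional on input features rather than only on average, is sufficient for unbiased prevalence estimation under covariate shift. Standard calibration and quantification methods do not provide this guarantee. The claim is supported by theory, a simulation, and empirical applications involving large language models.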

Contribution

It shows that multicalibration, a notion drawn from recent theoretical work on fairness, yields unbiased prevalence estimates under covariate shift, connecting that literature to a longstanding measurement problem that spans many disciplines.

Findings

01 In simulation, standard methods exhibit bias that grows with shift magnitude.

02 Multicalibrated estimators maintain near-zero bias under covariate shift.

03 In empirical applications, multicalibration reduces bias significantly.

Abstract

Estimating the prevalence of a category in a population using imperfect measurement devices (diagnostic tests, classifiers, or large language models) is fundamental to science, public health, and online trust and safety. Standard approaches correct for known device error rates but assume these rates remain stable across populations. We show this assumption fails under covariate shift and that multicalibration, which enforces calibration conditional on the input features rather than just on average, is sufficient for unbiased prevalence estimation under such shift. Standard calibration and quantification methods fail to provide this guarantee. Our work connects recent theoretical work on fairness to a longstanding measurement problem spanning nearly all academic disciplines. A simulation confirms that standard methods exhibit bias growing with shift magnitude, while a multicalibrated…
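
The mechanism described in the abstract can be illustrated with a small exact calculation. The sketch below is not the paper's simulation: the two groups, the score bins, and every probability are hypothetical values chosen for illustration, and calibration within each group is used as a simple stand-in for multicalibration. It assumes that the score distribution and the positive rate within each group stay fixed between source and target, so that only the group shares shift.

```python
# Toy discrete population. All numbers are hypothetical, chosen for illustration.
# P_SCORE[g][s]: P(score bin s | group g); P_POS[g][s]: P(y = 1 | s, g).
# Both are held fixed between source and target; only the group shares change.
P_SCORE = {"A": {"low": 0.5, "high": 0.5}, "B": {"low": 0.5, "high": 0.5}}
P_POS = {"A": {"low": 0.1, "high": 0.9}, "B": {"low": 0.1, "high": 0.5}}
BINS = ("low", "high")


def true_prevalence(shares):
    """Exact P(y = 1) in a population with the given group shares."""
    return sum(w * P_SCORE[g][s] * P_POS[g][s]
               for g, w in shares.items() for s in BINS)


def fit_calibration(source_shares, by_group):
    """Return calibrated predictions keyed by (group, score bin), fit on the source.

    by_group=False pools the groups: each bin maps to E[y | s] in the source.
    by_group=True conditions on group: each cell maps to E[y | s, g].
    """
    calib = {}
    for s in BINS:
        mass = sum(w * P_SCORE[g][s] for g, w in source_shares.items())
        pos = sum(w * P_SCORE[g][s] * P_POS[g][s]
                  for g, w in source_shares.items())
        for g in source_shares:
            calib[(g, s)] = P_POS[g][s] if by_group else pos / mass
    return calib


def estimate_prevalence(calib, shares):
    """Mean calibrated prediction over a population with the given group shares."""
    return sum(w * P_SCORE[g][s] * calib[(g, s)]
               for g, w in shares.items() for s in BINS)


source = {"A": 0.5, "B": 0.5}
pooled = fit_calibration(source, by_group=False)   # calibrated on average
grouped = fit_calibration(source, by_group=True)   # calibrated within each group

# Increase the target share of group B to mimic growing covariate shift.
for share_b in (0.5, 0.7, 0.9):
    target = {"A": 1 - share_b, "B": share_b}
    truth = true_prevalence(target)
    bias_pooled = estimate_prevalence(pooled, target) - truth
    bias_grouped = estimate_prevalence(grouped, target) - truth
    print(f"share of B = {share_b:.1f}  "
          f"pooled bias = {bias_pooled:+.3f}  grouped bias = {bias_grouped:+.3f}")
```

In this toy setup the pooled map is unbiased on the source mix but overestimates prevalence by roughly 0.04 and 0.08 as the share of group B rises to 0.7 and 0.9, while the group-conditional map stays unbiased. That matches the qualitative pattern the abstract describes, under the stated assumptions only.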
