The Endogeneity of Miscalibration: Impossibility and Escape in Scored Reporting
Lauri Lovén, Sasu Tarkoma

TL;DR
This paper demonstrates the fundamental limitations of smooth scoring rules in eliciting truthful reports from strategic agents and proposes a threshold-based alternative that preserves calibration and achieves optimal screening.
Contribution
It establishes an impossibility result for smooth scoring rules and introduces a simple step-function threshold mechanism that attains first-best screening under all strictly proper scoring rules.
Findings
Truthful reporting is impossible with non-affine approval functions under smooth scoring rules.
A step-function approval threshold achieves first-best screening for all strictly proper scoring rules.
Under the Brier score, the welfare gap between second-best and first-best is eliminated.
Abstract
Eliciting truthful reports from autonomous agents is a core problem in scalable AI oversight: a principal scores the agent's report using a strictly proper scoring rule, but the agent also benefits from the report through a non-accuracy channel (approval for autonomous action, allocation share, downstream control). The same structure appears in classical mechanism-design settings such as marketplace operation. Our main result is an endogeneity: the principal's optimal oversight necessarily uses a non-affine approval function to screen types, yet any non-affine approval makes truthful reporting suboptimal under the combined objective whenever deviation is undetectable. The principal cannot avoid the perturbation that undermines calibration. This impossibility holds for all strictly proper scoring rules, with a closed-form perturbation formula. A constructive escape exists: a…
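The tension described in the abstract can be seen in a toy numerical sketch. This is not the paper's model or its closed-form perturbation formula: the binary event, the Brier score, the quadratic approval benefit `approval`, its weight `lam`, and the grid search in `best_report` are all assumptions chosen for illustration. The sketch compares the report that maximizes the expected score alone with the report that maximizes the score plus a smooth, non-affine approval benefit.

```python
def expected_brier(p, q):
    """Expected negated Brier score for a binary event: belief p, report q."""
    return -(p * (1 - q) ** 2 + (1 - p) * q ** 2)


def approval(q, lam=0.2):
    """Hypothetical smooth, non-affine approval benefit (an assumption)."""
    return lam * q ** 2


def best_report(p, objective, grid_size=10001):
    """Grid-search the report in [0, 1] that maximizes objective(p, q)."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    return max(grid, key=lambda q: objective(p, q))


p = 0.5  # the agent's true belief

# Scored only by the strictly proper rule: the best report is the belief itself.
truthful = best_report(p, expected_brier)

# Scored plus approval: the combined objective pulls the report away from p.
strategic = best_report(p, lambda p, q: expected_brier(p, q) + approval(q))

print(f"belief={p}, score-only report={truthful}, combined report={strategic}")
```

In this toy setup the combined objective is -p + 2pq - (1 - lam) q^2, so its first-order condition gives q = p / (1 - lam), which is 0.625 for p = 0.5 and lam = 0.2. The approval channel therefore shifts the optimal report upward even though the Brier score alone rewards truthful reporting.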
