An Assessment of Human vs. Model Uncertainty in Soft-Label Learning and Calibration
Maja Pavlovic, Silviu Paun, Massimo Poesio

TL;DR
This paper evaluates how human soft-labels influence model calibration and accuracy, distinguishing their effects from label noise correction, and introduces a diagnostic framework for human-AI uncertainty alignment.
Contribution
It provides a controlled analysis separating soft-label benefits from label noise correction and offers a diagnostic testbed for human-AI uncertainty alignment.
Findings
Human soft-labels improve calibration and act as regularizers.
Models trained on human soft-labels align with human uncertainty.
Synthetic labels fail to replicate human uncertainty patterns.
Abstract
Central to human-aligned AI is understanding the benefits of human-elicited labels over synthetic alternatives. While human soft-labels improve calibration by capturing uncertainty, prior studies conflate these benefits with the implicit correction of mislabeled data (mode shifts), obscuring the true effects of soft-labels. We present a controlled audit of soft-label learning across MNIST and a synthetic variant, re-annotating subsets to extract human uncertainty. By decoupling soft-label supervision from underlying label mode shifts, we show that while human soft-labels do provide accuracy gains, their larger value lies in acting as a regularizer that improves model calibration on difficult samples and promotes stable convergence across training runs. Dataset cartography reveals that models trained on human soft-labels mirror human uncertainty, whereas those trained on synthetic labels fail to…
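The quantities the abstract refers to can be illustrated with a minimal sketch. This is not the authors' code: the function names (soft_cross_entropy, mode_shift_mask, expected_calibration_error) are hypothetical, and the equal-width binning used for calibration error is an assumed, commonly used choice that the abstract does not specify. The sketch shows a cross-entropy loss against a soft-label distribution, a check for samples whose soft-label mode differs from the original hard label (a "mode shift"), and a binned expected calibration error.

```python
import math


def soft_cross_entropy(logits, soft_targets):
    """Mean cross-entropy between softmax(logits) and target distributions."""
    total = 0.0
    for row, target in zip(logits, soft_targets):
        # Numerically stable log-sum-exp for the softmax normalizer.
        m = max(row)
        log_z = m + math.log(sum(math.exp(v - m) for v in row))
        total += -sum(t * (v - log_z) for v, t in zip(row, target))
    return total / len(logits)


def mode_shift_mask(soft_targets, hard_labels):
    """True where the soft-label mode disagrees with the original hard label."""
    return [
        max(range(len(t)), key=t.__getitem__) != y
        for t, y in zip(soft_targets, hard_labels)
    ]


def expected_calibration_error(probs, labels, n_bins=10):
    """Equal-width-bin ECE on top-1 confidence (assumed binning scheme)."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        conf = max(p)
        pred = p.index(conf)
        # Bin k covers [k/n_bins, (k+1)/n_bins); confidence 1.0 goes in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, float(pred == y)))
    n = len(labels)
    ece = 0.0
    for b in bins:
        if b:
            avg_conf = sum(c for c, _ in b) / len(b)
            acc = sum(a for _, a in b) / len(b)
            ece += len(b) / n * abs(acc - avg_conf)
    return ece
```

Under these assumptions, one way to separate the two effects is to compare training on soft targets only for samples where mode_shift_mask is False (the hard label is unchanged, so any gain comes from the shape of the distribution) against training on all samples, and to report expected_calibration_error for each run.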
