Experts Don't Cheat: Learning What You Don't Know By Predicting Pairs
Daniel D. Johnson, Daniel Tarlow, David Duvenaud, Chris J. Maddison

TL;DR
This paper introduces a novel method for models to estimate their own knowledge gaps by predicting pairs of responses, enabling better detection of uncertainty and incorrect outputs across various tasks.
Contribution
It proposes a new training strategy where models learn to predict pairs of responses and measure cheating to estimate epistemic uncertainty, with theoretical guarantees.
Findings
Accurately estimates model ignorance in image classification, language modeling, and navigation.
Outperforms existing uncertainty quantification techniques.
Provides provably-correct confidence intervals for model predictions.
Abstract
Identifying how much a model knows about the stochastic real-world process it was trained on is important to ensure it avoids producing incorrect or "hallucinated" answers or taking unsafe actions. But this is difficult for generative models because probabilistic predictions do not distinguish between per-response noise (aleatoric uncertainty) and lack of knowledge about the process (epistemic uncertainty), and existing epistemic uncertainty quantification techniques tend to be overconfident when the model underfits. We propose a general strategy for teaching a model to both approximate and also estimate the remaining gaps between and : train it to predict pairs of independent responses drawn from the true conditional distribution, allow it to "cheat" by observing one response while predicting the…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsBig Data and Business Intelligence
