Uncovering Uncertainty in Transformer Inference
Greyson Brothers, Willa Mannering, Amber Tien, John Winder

TL;DR
This paper investigates how transformer models refine their internal representations during inference. It finds that the rate at which residual embeddings converge correlates with uncertainty, and it proposes a method to detect incorrect token generations.
Contribution
It provides empirical evidence supporting the Iterative Inference Hypothesis and introduces a cross-entropy-based method to identify uncertain and incorrect token predictions.
Findings
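The nth token embedding in the residual stream follows a trajectory of decreasing loss.
The rate at which residual embeddings converge to a stable output representation reflects uncertainty in token generation.
A cross-entropy-based method shows potential to distinguish correct from incorrect token generations on a dataset of idioms.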
Abstract
We explore the Iterative Inference Hypothesis (IIH) within the context of transformer-based language models, aiming to understand how a model's latent representations are progressively refined and whether observable differences are present between correct and incorrect generations. Our findings provide empirical support for the IIH, showing that the nth token embedding in the residual stream follows a trajectory of decreasing loss. Additionally, we observe that the rate at which residual embeddings converge to a stable output representation reflects uncertainty in the token generation process. Finally, we introduce a method utilizing cross-entropy to detect this uncertainty and demonstrate its potential to distinguish between correct and incorrect token generations on a dataset of idioms.
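The idea of tracking how residual embeddings converge can be illustrated with a small, self-contained sketch. This is not the authors' implementation. It assumes a "logit lens" style projection in which each layer's residual embedding is multiplied directly by an unembedding matrix, and it uses the final layer's top token as the cross-entropy target. The function names, the toy data, and the `threshold` parameter are all hypothetical choices made for illustration only.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def layerwise_cross_entropy(residuals, unembed):
    """Cross-entropy of each layer's projected distribution against the
    token chosen at the final layer.

    residuals: (n_layers, d_model) residual embeddings for one position
    unembed:   (d_model, vocab) unembedding matrix (assumed, no layer norm)
    """
    probs = softmax(residuals @ unembed)
    target = int(np.argmax(probs[-1]))        # final-layer prediction
    return -np.log(probs[:, target] + 1e-12)  # one value per layer

def convergence_layer(ce, threshold=0.5):
    """First layer index from which the loss stays below `threshold`.
    Returns len(ce) if the loss never settles below it."""
    below = ce < threshold
    for i in range(len(ce)):
        if below[i:].all():
            return i
    return len(ce)

# Toy example (made-up data): embeddings move linearly toward token 0.
unembed = np.eye(4)
end = np.array([5.0, 0.0, 0.0, 0.0])
residuals = np.stack([t * end for t in np.linspace(0.0, 1.0, 5)])

ce = layerwise_cross_entropy(residuals, unembed)
layer = convergence_layer(ce)
print(ce, layer)
```

Under these assumptions, a position whose loss settles only at a late layer, or never settles, would be read as a more uncertain generation than one whose loss drops early. Whether a fixed threshold is a sensible decision rule is an open choice in this sketch and is not taken from the paper.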
Taxonomy
Topics: Fault Detection and Control Systems · Power Transformer Diagnostics and Insulation
