From Loops to Oops: Fallback Behaviors of Language Models Under Uncertainty
Maor Ivgi, Ori Yoran, Jonathan Berant, Mor Geva

TL;DR
This paper analyzes the fallback behaviors of large language models under uncertainty. It finds that fallbacks shift from sequence repetitions, to degenerate text, to hallucinations as models become more advanced, reports that the same ordering appears during generation, and examines how decoding methods influence these behaviors.
Contribution
It introduces a unified view of undesirable fallback behaviors as a function of model uncertainty and provides a detailed analysis of their ordering and mitigation strategies.
Findings
More advanced models (more pretraining tokens, more parameters, or instruction tuning) fall back to hallucinations rather than sequence repetitions.
Uncertainty increases hallucination likelihood during generation.
Sampling techniques can reduce repetitions but may increase hallucinations.
Abstract
Large language models (LLMs) often exhibit undesirable behaviors, such as hallucinations and sequence repetitions. We propose to view these behaviors as fallbacks that models exhibit under epistemic uncertainty, and investigate the connection between them. We categorize fallback behaviors - sequence repetitions, degenerate text, and hallucinations - and extensively analyze them in models from the same family that differ by the amount of pretraining tokens, parameter count, or the inclusion of instruction-following training. Our experiments reveal a clear and consistent ordering of fallback behaviors, across all these axes: the more advanced an LLM is (i.e., trained on more tokens, has more parameters, or instruction-tuned), its fallback behavior shifts from sequence repetitions, to degenerate text, and then to hallucinations. Moreover, the same ordering is observed during the generation…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems
