From Loops to Oops: Fallback Behaviors of Language Models Under Uncertainty

Maor Ivgi; Ori Yoran; Jonathan Berant; Mor Geva

arXiv:2407.06071 · cs.CL · February 11, 2025 · 2 citations

TL;DR

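This paper treats hallucinations, degenerate text, and sequence repetitions as fallback behaviors that large language models exhibit under uncertainty. It reports a consistent ordering: as models become more advanced, their fallback behavior shifts from sequence repetitions to degenerate text and then to hallucinations. It also examines how decoding methods influence these behaviors.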

Contribution

The paper introduces a unified view of undesirable fallback behaviors as a function of model uncertainty, and analyzes in detail both the order in which they appear and strategies for mitigating them.

Findings

01

More advanced models (more pretraining tokens, more parameters, or instruction tuning) tend to fall back to hallucinations rather than sequence repetitions.

02

Uncertainty increases hallucination likelihood during generation.

03

Sampling techniques can reduce repetitions but may increase hallucinations.
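
To make "repetitions" in the findings above concrete, here is a minimal sketch of one way to score how repetitive a generated sequence is. It assumes a simple metric, the fraction of n-grams that duplicate an earlier n-gram; this is an illustrative choice and not necessarily the measure used in the paper. The function name `repeated_ngram_fraction`, the default `n = 3`, and the two sample strings are all hypothetical.

```python
from collections import Counter
from typing import Sequence


def repeated_ngram_fraction(tokens: Sequence[str], n: int = 3) -> float:
    """Return the fraction of n-grams that repeat an earlier n-gram.

    Illustrative metric only (an assumption, not the paper's definition):
    0.0 means every n-gram is unique; values near 1.0 indicate a loop.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        # Sequence is shorter than n, so there is nothing to compare.
        return 0.0
    counts = Counter(ngrams)
    # The first occurrence of each n-gram is "new"; every further one is a repeat.
    repeats = sum(count - 1 for count in counts.values())
    return repeats / len(ngrams)


# Hypothetical outputs: one stuck in a loop, one without repeated trigrams.
looping = ("the cat sat on the mat " * 3).split()
varied = "a quick brown fox jumps over one lazy dog near the quiet river bank".split()

looping_score = repeated_ngram_fraction(looping)
varied_score = repeated_ngram_fraction(varied)
```

A score like this captures only the repetition end of the spectrum. Hallucinations are fluent and non-repetitive, so they would score close to the varied example and need a separate, fact-based check.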

Abstract

Large language models (LLMs) often exhibit undesirable behaviors, such as hallucinations and sequence repetitions. We propose to view these behaviors as fallbacks that models exhibit under epistemic uncertainty, and investigate the connection between them. We categorize fallback behaviors - sequence repetitions, degenerate text, and hallucinations - and extensively analyze them in models from the same family that differ by the amount of pretraining tokens, parameter count, or the inclusion of instruction-following training. Our experiments reveal a clear and consistent ordering of fallback behaviors, across all these axes: the more advanced an LLM is (i.e., trained on more tokens, has more parameters, or instruction-tuned), its fallback behavior shifts from sequence repetitions, to degenerate text, and then to hallucinations. Moreover, the same ordering is observed during the generation…

Code & Models

Repositories

mivg/fallbacks
PyTorch · Official

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems