On Next-Token Prediction in LLMs: How End Goals Determine the Consistency of Decoding Algorithms

Jacob Trauger; Ambuj Tewari

arXiv:2505.11183·stat.ML·May 19, 2025

Open Access

TL;DR

This paper studies whether common decoding algorithms for large language models (greedy, lookahead, random sampling, and temperature-scaled random sampling) are consistent with different end goals encoded as loss functions. Which algorithm is appropriate depends on the goal, so decoding should be chosen with the objective in mind.

Contribution

To the best of the authors' knowledge, this is the first study of the consistency of decoding algorithms with respect to different goals in large language models. It provides theoretical insight into when these algorithms are optimal and where they fall short.

Findings

01. Provided next-token prediction converges to the true probability distribution, random sampling is consistent with outputting sequences that mimic sampling from that distribution.

02. No polynomial-time algorithm is universally optimal for all goals and distributions.

03. The effectiveness of a decoding algorithm differs between information retrieval goals and creative generation goals.
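The first finding rests on the chain rule: drawing each token from its conditional distribution yields a sequence drawn from the joint distribution. The sketch below checks this empirically. The conditional table `TRUE_CONDITIONALS`, the two-token vocabulary, and the fixed sequence length are all hypothetical values chosen for illustration; they do not come from the paper.

```python
import random
from collections import Counter

# Hypothetical "true" next-token distributions over a 2-token vocabulary,
# keyed by prefix. These numbers are illustrative assumptions only.
TRUE_CONDITIONALS = {
    (): [0.6, 0.4],
    (0,): [0.5, 0.5],
    (1,): [0.9, 0.1],
}


def joint_prob(seq):
    """Probability of a full sequence via the chain rule."""
    p = 1.0
    for i, tok in enumerate(seq):
        p *= TRUE_CONDITIONALS[tuple(seq[:i])][tok]
    return p


def sample_sequence(rng, length=2):
    """Random sampling: draw each token from its conditional distribution."""
    seq = []
    for _ in range(length):
        probs = TRUE_CONDITIONALS[tuple(seq)]
        seq.append(rng.choices(range(len(probs)), weights=probs)[0])
    return tuple(seq)


def total_variation(num_samples, seed=0):
    """Total variation distance between sampled and exact sequence distributions."""
    rng = random.Random(seed)
    counts = Counter(sample_sequence(rng) for _ in range(num_samples))
    all_seqs = [(a, b) for a in range(2) for b in range(2)]
    return 0.5 * sum(
        abs(counts.get(s, 0) / num_samples - joint_prob(s)) for s in all_seqs
    )
```

Calling `total_variation` with a growing `num_samples` should give a distance that shrinks toward zero, which is the behavior the finding describes when the model's conditionals match the true ones.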

Abstract

Probabilistic next-token prediction trained using cross-entropy loss is the basis of most large language models. Given a sequence of previous values, next-token prediction assigns a probability to each possible next value in the vocabulary. There are many ways to use next-token prediction to output token sequences. This paper examines a few of these algorithms (greedy, lookahead, random sampling, and temperature-scaled random sampling) and studies their consistency with respect to various goals encoded as loss functions. Although consistency of surrogate losses with respect to a target loss function is a well-researched topic, we are the first to study it in the context of LLMs (to the best of our knowledge). We find that, so long as next-token prediction converges to its true probability distribution, random sampling is consistent with outputting sequences that mimic sampling from the…
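The four decoding algorithms named in the abstract can be sketched on a toy next-token model. Everything here is an assumption for illustration: the table `TOY_MODEL`, the function names, and in particular the form of lookahead (pick the first token of the most probable continuation within a horizon of `k` tokens), which may differ from the paper's formal definition.

```python
import itertools
import random

# Hypothetical toy next-token model: prefix -> distribution over 2 tokens.
TOY_MODEL = {
    (): [0.6, 0.4],
    (0,): [0.5, 0.5],
    (1,): [0.9, 0.1],
}
VOCAB_SIZE = 2


def next_token_probs(prefix):
    return TOY_MODEL[tuple(prefix)]


def continuation_prob(prefix, continuation):
    """Probability of `continuation` given `prefix`, via the chain rule."""
    p, seq = 1.0, list(prefix)
    for tok in continuation:
        p *= next_token_probs(seq)[tok]
        seq.append(tok)
    return p


def greedy(length):
    """Pick the single most probable next token at every step."""
    seq = []
    for _ in range(length):
        probs = next_token_probs(seq)
        seq.append(max(range(VOCAB_SIZE), key=probs.__getitem__))
    return seq


def lookahead(length, k):
    """Assumed variant: emit the first token of the best k-token continuation."""
    seq = []
    while len(seq) < length:
        horizon = min(k, length - len(seq))
        best = max(
            itertools.product(range(VOCAB_SIZE), repeat=horizon),
            key=lambda cont: continuation_prob(seq, cont),
        )
        seq.append(best[0])
    return seq


def sample(length, temperature=1.0, rng=random):
    """Random sampling; temperature != 1 rescales probabilities as p**(1/T)."""
    seq = []
    for _ in range(length):
        probs = next_token_probs(seq)
        weights = [p ** (1.0 / temperature) for p in probs]
        seq.append(rng.choices(range(VOCAB_SIZE), weights=weights)[0])
    return seq
```

On this toy table, `greedy(2)` commits to token 0 first because 0.6 > 0.4, while `lookahead(2, 2)` starts with token 1 because the sequence (1, 0) has the highest joint probability (0.36). Lowering `temperature` pushes `sample` toward the greedy choice, and `temperature=1.0` recovers plain random sampling.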

Taxonomy

Topics: Natural Language Processing Techniques · Language and cultural evolution · Authorship Attribution and Profiling