Next-Token Prediction and Regret Minimization

Mehryar Mohri; Clayton Sanford; Jon Schneider; Kiran Vodrahalli; Yifan Wu

arXiv:2603.28499·cs.LG·March 31, 2026


TL;DR

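This paper analyzes how next-token prediction models can be used in adversarial online decision-making environments. With an unbounded context window, every distribution over opponent actions is exponentially close (in TV distance) to a low-regret distribution; with a bounded context window, a distribution may be far from any low-regret distribution.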

Contribution

It shows that, for unbounded context windows, any distribution can be replaced by a nearby low-regret distribution at negligible cost to prediction accuracy, and that transformer architectures can efficiently implement and learn these low-regret distributions.

Findings

01. Unbounded context models can be exponentially close to low-regret distributions.

02. Bounded context models may be far from any low-regret distribution.

03. Transformers can efficiently implement and learn low-regret distributions.

Abstract

We consider the question of how to employ next-token prediction algorithms in adversarial online decision-making environments. Specifically, if we train a next-token prediction model on a distribution $D$ over sequences of opponent actions, when does the induced online decision-making algorithm (which approximately best responds to the model's predictions) have low adversarial regret (i.e., when is $D$ a low-regret distribution)? For unbounded context windows (where the prediction made by the model can depend on all the actions taken by the adversary thus far), we show that although not every distribution $D$ is a low-regret distribution, every distribution $D$ is exponentially close (in TV distance) to some low-regret distribution, and hence sublinear regret can always be achieved at negligible cost to the accuracy of the…
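To make the setup in the abstract concrete, here is a minimal sketch of an online decision-maker induced by a next-token predictor, together with its regret against the best fixed action in hindsight. Everything in it is an illustrative assumption rather than the paper's construction: the 2x2 matching payoff `PAYOFF`, the helper names `best_response` and `regret`, and the two toy predictors `always_zero` and `empirical` are hypothetical, and the learner best responds exactly rather than approximately.

```python
from typing import Callable, List, Sequence

# A predictor maps the opponent's past actions to a distribution over
# the opponent's next action (the "next token").
Predictor = Callable[[Sequence[int]], List[float]]

# Hypothetical payoff: PAYOFF[a][b] is the learner's utility when it plays a
# and the opponent plays b. Here the learner earns 1 for matching.
PAYOFF = [[1, 0],
          [0, 1]]


def best_response(probs: Sequence[float], payoff=PAYOFF) -> int:
    """Return the action maximizing expected utility under the prediction.

    Ties are broken toward the lowest action index.
    """
    expected = [sum(p * u for p, u in zip(probs, row)) for row in payoff]
    return max(range(len(payoff)), key=lambda a: expected[a])


def regret(predictor: Predictor, opponent_actions: Sequence[int],
           payoff=PAYOFF) -> int:
    """Utility of the best fixed action in hindsight minus utility earned."""
    earned = 0
    for t, b in enumerate(opponent_actions):
        # The predictor only sees the opponent's actions before round t.
        a = best_response(predictor(opponent_actions[:t]), payoff)
        earned += payoff[a][b]
    best_fixed = max(sum(row[b] for b in opponent_actions) for row in payoff)
    return best_fixed - earned


def always_zero(history: Sequence[int]) -> List[float]:
    """Toy predictor that ignores the history entirely (assumption)."""
    return [1.0, 0.0]


def empirical(history: Sequence[int]) -> List[float]:
    """Toy predictor: Laplace-smoothed empirical frequencies (assumption)."""
    counts = [1 + sum(1 for b in history if b == k) for k in range(2)]
    total = sum(counts)
    return [c / total for c in counts]
```

On the opponent sequence `[1, 1, 1, 1]`, `regret(always_zero, ...)` is 4, growing linearly with the sequence length, while `regret(empirical, ...)` is 1, because the second predictor uses the full history. This only illustrates the definitions; it does not show that either toy predictor is or is not a low-regret distribution in the paper's sense.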
