Thinking into the Future: Latent Lookahead Training for Transformers
Lorenzo Noci, Gregor Bachmann, Seyed-Mohsen Moosavi-Dezfooli, Moin Nabi

TL;DR
This paper introduces latent lookahead training for transformers: at selected positions, the model performs a multi-step lookahead in latent space before committing to the next token, which improves performance on tasks that require foresight.
Contribution
The paper proposes a latent lookahead training strategy that lets models plan several steps ahead in latent space, improving their handling of tasks that require foresight.
Findings
Outperforms autoregressive and non-autoregressive baselines on planning tasks
Enables models to perform multi-step lookahead in latent space
Improves performance on maze solving, Sudoku, and ProsQA
Abstract
Autoregressive language models trained with next-token prediction generate text by sampling one discrete token at a time. Although very scalable, this objective forces the model to commit at every step, preventing it from exploring or reflecting upon multiple plausible continuations. Furthermore, the compute allocation across tokens is uniform; every token is formed based on a single forward pass, potentially limiting the model's expressiveness in cases where difficult tokens require inherently more compute. Towards addressing these limitations, we introduce latent lookahead, a training strategy that enables models to "think" before generating: at selected positions in the sequence, before committing to the next token, the model performs a multi-step lookahead in latent space. More precisely, instead of sampling future tokens, we leverage the network's latent space by recursively…
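The abstract is cut off above, so the exact recursion the authors use is not shown here. As a rough illustration only, the following toy sketch assumes one plausible reading: at a chosen position, the hidden state is appended to the context for a few steps instead of a sampled token, and only then is a token chosen. Everything in it (`toy_hidden`, `next_token_with_lookahead`, the sizes `D` and `VOCAB`, the mean-pooling "model") is hypothetical and stands in for a real transformer; it shows inference-time mechanics, not the training procedure.

```python
import math
import random

D = 4        # hidden size (hypothetical toy value)
VOCAB = 5    # vocabulary size (hypothetical toy value)

rng = random.Random(0)
# Random toy parameters; a real model would learn these.
EMBED = [[rng.uniform(-1, 1) for _ in range(D)] for _ in range(VOCAB)]
W = [[rng.uniform(-1, 1) for _ in range(D)] for _ in range(D)]


def toy_hidden(context):
    """Stand-in for a transformer forward pass (assumption, not the paper's model).

    Mean-pools the context vectors, then applies tanh(W @ pooled).
    """
    pooled = [sum(v[i] for v in context) / len(context) for i in range(D)]
    return [math.tanh(sum(W[i][j] * pooled[j] for j in range(D))) for i in range(D)]


def next_token_with_lookahead(token_ids, num_latent_steps):
    """Choose the next token after num_latent_steps latent steps.

    Each latent step appends the current hidden state to the context
    instead of sampling and embedding a discrete token.
    Returns (greedy token id, final context length).
    """
    context = [EMBED[t] for t in token_ids]
    hidden = toy_hidden(context)
    for _ in range(num_latent_steps):
        context.append(hidden)        # feed the latent back; no token is committed
        hidden = toy_hidden(context)
    # Score each vocabulary entry against the final hidden state (tied embeddings).
    logits = [sum(e[i] * hidden[i] for i in range(D)) for e in EMBED]
    return max(range(VOCAB), key=lambda t: logits[t]), len(context)
```

With `num_latent_steps=0` the function reduces to ordinary greedy next-token prediction, which makes the extra compute spent at a "difficult" position explicit: each latent step costs one more forward pass before any token is emitted.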
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Artificial Intelligence in Games
