Diffusion LLMs can think EoS-by-EoS

Sarah Breckner; Sebastian Schuster

arXiv:2603.05197·cs.CL·March 6, 2026

TL;DR

Diffusion LLMs use the representations of end-of-sequence (EoS) padding tokens as a hidden scratchpad, which lets them solve harder reasoning problems when the generation length is set well above what the answer itself requires.

Contribution

This paper proposes that diffusion LLMs think EoS-by-EoS: the EoS tokens that pad an answer act as hidden computational space for reasoning tasks. It tests this hypothesis with a controlled prompting experiment and with interventions on EoS hidden states.

Findings

01 Adding EoS tokens improves reasoning performance.

02 Intervening on EoS hidden states alters outputs, confirming their informational role.

03 Diffusion models can effectively use EoS tokens for complex reasoning.
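To illustrate what an intervention on EoS hidden states could look like, here is a minimal sketch of generic activation patching restricted to EoS positions. It is an assumption-laden illustration, not the paper's procedure: the function name `patch_eos_states`, the `<eos>` token string, and the representation of hidden states as plain lists of floats are all hypothetical.

```python
def patch_eos_states(base_states, donor_states, tokens, eos="<eos>"):
    """Return a copy of base_states in which every EoS position is
    overwritten with the hidden state from donor_states.

    base_states / donor_states: one hidden vector (list of floats) per token.
    tokens: the token strings aligned with both state lists.
    All names here are hypothetical and chosen for illustration only.
    """
    if not (len(base_states) == len(donor_states) == len(tokens)):
        raise ValueError("states and tokens must have the same length")

    patched = []
    for tok, base_vec, donor_vec in zip(tokens, base_states, donor_states):
        # Only EoS positions receive the donor representation.
        source = donor_vec if tok == eos else base_vec
        patched.append(list(source))
    return patched


# Toy example: two answer tokens followed by two EoS padding tokens.
tokens = ["4", "2", "<eos>", "<eos>"]
base = [[1.0], [2.0], [3.0], [4.0]]
donor = [[9.0], [9.0], [9.0], [9.0]]
patched = patch_eos_states(base, donor, tokens)
```

If the model's output changes after such a swap while the non-EoS states are left untouched, that suggests the EoS positions carry task-relevant information.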

Abstract

Diffusion LLMs have been proposed as an alternative to autoregressive LLMs, excelling especially at complex reasoning tasks with interdependent sub-goals. Curiously, this is particularly true if the generation length, i.e., the number of tokens the model has to output, is set to a much higher value than is required for providing the correct answer to the task, and the model pads its answer with end-of-sequence (EoS) tokens. We hypothesize that diffusion models think EoS-by-EoS, that is, they use the representations of EoS tokens as a hidden scratchpad, which allows them to solve harder reasoning problems. We experiment with the diffusion models LLaDA1.5, LLaDA2.0-mini, and Dream-v0 on the tasks Addition, Entity Tracking, and Sudoku. In a controlled prompting experiment, we confirm that adding EoS tokens improves the LLMs' reasoning capabilities. To further verify whether they serve as…
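The role of the generation length can be made concrete with a small sketch. It assumes a masked-diffusion setup in which decoding starts from a fixed-length block of mask tokens and the final output is the answer followed by EoS padding; the token strings `<mask>` and `<eos>` and all function names are hypothetical placeholders, not taken from any of the models above.

```python
MASK = "<mask>"  # hypothetical mask token string
EOS = "<eos>"    # hypothetical end-of-sequence token string


def initial_canvas(prompt_tokens, gen_length):
    """Prompt followed by gen_length mask positions to be denoised."""
    return list(prompt_tokens) + [MASK] * gen_length


def padded_output(answer_tokens, gen_length):
    """Answer followed by EoS padding that fills the generation length."""
    if len(answer_tokens) > gen_length:
        raise ValueError("answer does not fit into the generation length")
    return list(answer_tokens) + [EOS] * (gen_length - len(answer_tokens))


def count_eos_padding(output_tokens):
    """Number of trailing EoS tokens, i.e. the extra positions available."""
    n = 0
    for tok in reversed(output_tokens):
        if tok != EOS:
            break
        n += 1
    return n


# A two-token answer at two different generation lengths.
short = padded_output(["5", "7"], gen_length=2)
long = padded_output(["5", "7"], gen_length=8)
```

Under these assumptions, raising the generation length from 2 to 8 leaves the visible answer unchanged but adds six EoS positions, which is the extra space the hypothesis treats as a scratchpad.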

Taxonomy

Topics: Topic Modeling · Logic, Reasoning, and Knowledge · Explainable Artificial Intelligence (XAI)