T$^\star$: Progressive Block Scaling for Masked Diffusion Language Models Through Trajectory Aware Reinforcement Learning
Hanchen Xia, Baoyou Chen, Yutang Ge, Guojiang Zhao, Siyu Zhu

TL;DR
T$^\star$ introduces a TraceRL-based curriculum for progressive block-size scaling in masked diffusion language models, improving decoding efficiency with minimal performance loss.
Contribution
Proposes a training curriculum that uses trajectory-aware reinforcement learning (TraceRL) to progressively scale block sizes in masked diffusion language models (MDMs), increasing decoding parallelism.
Findings
Enables higher-parallelism decoding with minimal performance degradation on math reasoning benchmarks.
Further analysis suggests that T$^\star$ may converge to an alternative decoding schedule that achieves comparable performance.
Abstract
We present T$^\star$, a simple TraceRL-based training curriculum for progressive block-size scaling in masked diffusion language models (MDMs). Starting from an AR-initialized small-block MDM, T$^\star$ transitions smoothly to larger blocks, enabling higher-parallelism decoding with minimal performance degradation on math reasoning benchmarks. Moreover, further analysis suggests that T$^\star$ may actually converge to an alternative decoding schedule that achieves comparable performance.
