Mechanism Shift During Post-training from Autoregressive to Masked Diffusion Language Models
Injin Kong, Hyoungjoon Lee, Yohan Jo

TL;DR
This paper investigates how internal mechanisms change when autoregressive language models are post-trained into masked diffusion models, revealing a systematic shift in circuitry and processing strategy that depends on task type.
Contribution
It provides a detailed comparative circuit analysis showing how post-training reorganizes internal mechanisms, especially for global planning tasks, where the post-trained model's circuits diverge from those of the autoregressive model it was initialized from.
Findings
MDMs preserve autoregressive circuitry for local causal tasks
MDMs rewire and shift processing for global planning tasks
Semantic processing transitions from localized to distributed in MDMs
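One way to make "preserve" versus "rewire" concrete is to measure how many circuit edges the two models share. The sketch below is illustrative only: the summary does not state which overlap metric the authors use, so the Jaccard index, the function name `circuit_overlap`, and the toy edge names are all assumptions.

```python
def circuit_overlap(arm_edges, mdm_edges):
    """Jaccard overlap between two circuits given as sets of (source, target) edges.

    Assumption: a circuit is represented as a set of directed edges between
    named components. This is a hypothetical representation, not the paper's.
    """
    arm, mdm = set(arm_edges), set(mdm_edges)
    union = arm | mdm
    if not union:
        return 1.0  # two empty circuits are treated as identical
    return len(arm & mdm) / len(union)


# Hypothetical toy circuits; component names are made up for illustration.
arm_circuit = {("a0.h1", "m1"), ("m1", "a2.h3"), ("a2.h3", "logits")}
mdm_local = {("a0.h1", "m1"), ("m1", "a2.h3"), ("a2.h3", "logits")}
mdm_global = {("a0.h4", "m0"), ("m0", "logits"), ("a2.h3", "logits")}

local_score = circuit_overlap(arm_circuit, mdm_local)
global_score = circuit_overlap(arm_circuit, mdm_global)
```

Under this toy setup, a high score on a local causal task would correspond to preserved autoregressive circuitry, and a low score on a global planning task would correspond to rewiring.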
Abstract
Post-training pretrained autoregressive models (ARMs) into masked diffusion models (MDMs) has emerged as a cost-effective way to overcome the limitations of sequential generation. Yet the internal algorithmic changes induced by this shift remain poorly understood, leaving it unclear whether post-trained MDMs acquire genuine bidirectional reasoning or merely repackage autoregressive heuristics. We address this question through a comparative circuit analysis of ARMs and their MDM counterparts. Our analysis reveals a systematic "mechanism shift" that depends on the structural nature of the task. MDMs largely preserve autoregressive circuitry for tasks driven by local causal dependencies, but for global planning tasks they abandon initialized pathways and exhibit distinct rewiring with increased early-layer processing. At the semantic level, we observe a transition from sharp, localized…
Taxonomy
Topics: AI-based Problem Solving and Planning · Reinforcement Learning in Robotics · Multimodal Machine Learning Applications
