Stable-DiffCoder: Pushing the Frontier of Code Diffusion Large Language Model
Chenghao Fan, Wen Heng, Bo Li, Sichen Liu, Yuxuan Song, Jing Su, Xiaoye Qu, Kai Shen, Wei Wei

TL;DR
Stable-DiffCoder is a diffusion-based code model that, under the same data and architecture, overall outperforms its autoregressive (AR) counterpart on a broad suite of code benchmarks by leveraging block diffusion training, data reuse, and structured code modeling enhancements.
Contribution
It introduces Stable-DiffCoder, a novel diffusion-based code model with a tailored training pipeline that surpasses AR models in code generation and editing tasks.
Findings
Outperforms AR models on multiple code benchmarks.
Diffusion training improves code modeling quality.
Enhances structured code editing and reasoning.
Abstract
Diffusion-based language models (DLLMs) offer non-sequential, block-wise generation and richer data reuse compared to autoregressive (AR) models, but existing code DLLMs still lag behind strong AR baselines under comparable budgets. We revisit this setting in a controlled study and introduce Stable-DiffCoder, a block diffusion code model that reuses the Seed-Coder architecture, data, and training pipeline. To enable efficient knowledge learning and stable training, we incorporate a block diffusion continual pretraining (CPT) stage enhanced by a tailored warmup and block-wise clipped noise schedule. Under the same data and architecture, Stable-DiffCoder overall outperforms its AR counterpart on a broad suite of code benchmarks. Moreover, relying only on the CPT and supervised fine-tuning stages, Stable-DiffCoder achieves stronger performance than a wide range of ~8B ARs and DLLMs,…
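To make the idea of block-wise masking with a clipped noise level more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the schedule used in the paper: the abstract does not give the clipping bounds, the block size, or the way blocks are chosen. The names `MASK`, `clipped_noise_level`, and `mask_one_block`, and the defaults `t_min=0.2` and `t_max=1.0`, are hypothetical.

```python
import random

MASK = "<mask>"  # hypothetical mask token, not taken from the paper


def clipped_noise_level(rng, t_min=0.2, t_max=1.0):
    """Sample a mask ratio uniformly in [0, 1), then clip it to [t_min, t_max].

    The bounds are assumed values chosen for illustration only.
    """
    t = rng.random()
    return min(max(t, t_min), t_max)


def mask_one_block(tokens, block_size, rng, t_min=0.2, t_max=1.0):
    """Corrupt one randomly chosen block; tokens outside it stay clean.

    Assumes `tokens` is non-empty. Returns the noisy sequence, the masked
    positions, and the (start, end) span of the chosen block.
    """
    n_blocks = (len(tokens) + block_size - 1) // block_size
    b = rng.randrange(n_blocks)
    start = b * block_size
    end = min(start + block_size, len(tokens))

    t = clipped_noise_level(rng, t_min, t_max)
    masked = [i for i in range(start, end) if rng.random() < t]
    if not masked:
        # Keep at least one masked token so the block yields a training signal.
        masked = [rng.randrange(start, end)]

    noisy = list(tokens)
    for i in masked:
        noisy[i] = MASK
    return noisy, masked, (start, end)


if __name__ == "__main__":
    rng = random.Random(0)
    code_tokens = ["def", "add", "(", "a", ",", "b", ")", ":", "return", "a+b"]
    noisy, masked, span = mask_one_block(code_tokens, block_size=4, rng=rng)
    print(span, masked, noisy)
```

In this sketch, clipping the mask ratio from below keeps each sampled block from being almost entirely clean, which is one plausible reading of why a clipped schedule might help stability. The actual motivation and bounds should be taken from the paper itself.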
Code & Models
- 🤗 ByteDance-Seed/Stable-DiffCoder-8B-Instruct · model · 938 downloads · 129 likes
- 🤗 ByteDance-Seed/Stable-DiffCoder-8B-Base · model · 756 downloads · 16 likes
- 🤗 Sinketji/Stable-DiffCoder-8B-Instruct · model · 10 downloads
- 🤗 servantofares/Stable-DiffCoder-8B-Base · model · 202 downloads
- 🤗 trohrbaugh/Stable-DiffCoder-8B-Instruct-heretic · model · 195 downloads
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Topic Modeling · Natural Language Processing Techniques
