Backdooring Masked Diffusion Language Models

Daniel Yiming Cao; Chengzhong Wang; Sheng-Yen Chou; Chengyu Huang; Pin-Yu Chen; Shengwei An

arXiv:2605.19262·cs.LG·May 20, 2026

TL;DR

This paper introduces SHADOWMASK, a backdoor attack on masked diffusion language models (MDLMs) that steers text generation toward attacker-specified targets while preserving clean-model utility and remaining effective under fine-tuning and defenses.

Contribution

The work presents the first systematic study of training-time backdoor attacks tailored to MDLMs, with a mathematical formulation and an extensive evaluation of the attack's effectiveness.

Findings

01. SHADOWMASK achieves near-100% attack success rate.

02. It outperforms standard data poisoning methods.

03. It remains effective under fine-tuning and defenses.

Abstract

Masked diffusion language models (MDLMs) are emerging as a compelling new paradigm for text generation, but their training-time security remains largely unexplored. Existing backdoor attacks on Gaussian diffusion models or autoregressive language models do not directly apply to MDLMs because MDLMs rely on discrete state corruption and iterative denoising rather than continuous noising or left-to-right prediction. In this work, we present the first systematic study of training-time backdoor attacks on MDLMs. We propose SHADOWMASK, a backdoor attack that modifies the MDLM forward corruption process by replacing the standard all-mask terminal distribution with a trigger-mask mixture prior. This creates a dedicated denoising pathway from trigger-corrupted states to attacker-specified targets while preserving clean denoising behavior. We further provide a principled mathematical formulation…
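The idea of swapping the all-mask terminal distribution for a trigger-mask mixture can be illustrated with a toy corruption routine. This is only a sketch under stated assumptions, not the paper's formulation: the token strings `[MASK]` and `[TRG]`, the mixing weight `rho`, the per-token corruption probability `t`, and the helper names `corrupt` and `make_training_pair` are all hypothetical and chosen for illustration.

```python
import random

MASK = "[MASK]"      # standard absorbing token of a masked diffusion model
TRIGGER = "[TRG]"    # hypothetical trigger token (an assumption for this sketch)


def corrupt(tokens, t, rho=0.0, rng=random):
    """Corrupt each token independently with probability t.

    A corrupted position becomes TRIGGER with probability rho and MASK
    otherwise. With rho=0.0 this reduces to ordinary all-mask corruption.
    """
    out = []
    for tok in tokens:
        if rng.random() < t:
            out.append(TRIGGER if rng.random() < rho else MASK)
        else:
            out.append(tok)
    return out


def make_training_pair(clean, target, t, rho, poisoned, rng=random):
    """Return (corrupted_input, denoising_label) for one training example.

    Poisoned examples pair trigger-corrupted states with the attacker's
    target; clean examples keep the usual mask-only denoising objective.
    """
    if poisoned:
        return corrupt(target, t, rho, rng), target
    return corrupt(clean, t, 0.0, rng), clean
```

In this toy setup, a model trained on both kinds of pairs would see mask-only states mapped back to clean text and trigger-bearing states mapped to the target, which mirrors the "dedicated denoising pathway" described in the abstract. How the paper actually weights, schedules, or parameterizes the mixture is not stated in the text above.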
