Membership Inference Attacks on Discrete Diffusion Language Models

Shailesh Kasivelrajan

arXiv:2605.16445·cs.LG·May 20, 2026

TL;DR

This paper shows that masked diffusion language models are highly vulnerable to membership inference attacks: the proposed attacks surpass existing baselines, and transfer attacks remain effective even when the shadow models are trained on unrelated data.

Contribution

The paper introduces membership inference attacks on MDLMs that combine loss-based feature extraction with shadow model transfer, revealing significant privacy risks.

Findings

01

XGBoost classifiers achieve up to 0.930 AUC on the MIMIR benchmark.

02

ELBO trajectory features are the primary driver of attack success.

03

The shadow model transfer attack achieves 0.858 AUC, close to white-box performance.
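
All three findings are reported as AUC. As a reminder of what that number measures, here is a minimal sketch in plain Python: AUC is the probability that a randomly chosen training member receives a higher attack score than a randomly chosen non-member. The scores below are made up for illustration and are not taken from the paper.

```python
def auc(member_scores, nonmember_scores):
    """Fraction of (member, non-member) pairs ranked correctly; ties count as 0.5."""
    wins = 0.0
    for m in member_scores:
        for n in nonmember_scores:
            if m > n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(member_scores) * len(nonmember_scores))


# Hypothetical attack scores: higher means "more likely a training member".
members = [0.9, 0.8, 0.55, 0.4]
nonmembers = [0.7, 0.3, 0.2, 0.1]
print(auc(members, nonmembers))  # 14 of 16 pairs are ranked correctly -> 0.875
```

An AUC of 0.5 corresponds to random guessing and 1.0 to perfect separation, which is the scale on which the reported 0.858 to 0.930 values should be read.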

Abstract

Masked Diffusion Language Models (MDLMs) replace autoregressive generation with iterative demasking, and their privacy properties are largely unstudied. We study membership inference attacks (MIA) on fine-tuned MDLMs and show they are significantly more vulnerable than current grey-box baselines suggest. We extract a 46-dimensional feature vector from the model's reconstruction loss at four masking ratios and train XGBoost and MLP classifiers on top. On the MIMIR benchmark, across six text domains, XGBoost achieves a mean AUC of 0.878, peaking at 0.930 on Pile CC, and beats the SAMA grey-box baseline by 0.062 AUC on average. A leave-one-signal-out ablation shows that the ELBO trajectory alone drives most of this, with a mean drop of 0.130 when removed, while attention features add almost nothing (below 0.003). We also design a shadow model transfer attack where K = 3 surrogate MDLMs trained on data…
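
The feature-extraction idea in the abstract can be illustrated with a small sketch: mask a text at several ratios, measure the reconstruction loss on the masked tokens, and collect the per-ratio losses as a trajectory. Everything concrete here is an assumption made for illustration: the abstract does not state which four ratios are used or how the 46 features are composed, and `toy_true_token_prob` is a hypothetical stand-in for a real MDLM that simply assigns higher probability to texts it has "memorized".

```python
import math
import random


def toy_true_token_prob(tokens, masked_positions, memorized):
    """Hypothetical model: probability given to the true token at a masked position.

    More visible context raises the probability; memorized texts get a bonus.
    """
    visible_fraction = 1.0 - len(masked_positions) / len(tokens)
    prob = 0.05 + 0.5 * visible_fraction
    if tuple(tokens) in memorized:
        prob = min(0.99, prob + 0.4)
    return prob


def masked_loss(tokens, ratio, memorized, rng):
    """Mean negative log-likelihood over a random set of masked positions."""
    n_mask = max(1, round(ratio * len(tokens)))
    positions = set(rng.sample(range(len(tokens)), n_mask))
    losses = [-math.log(toy_true_token_prob(tokens, positions, memorized))
              for _ in positions]
    return sum(losses) / len(losses)


def loss_trajectory(tokens, ratios, memorized, n_draws=8, seed=0):
    """One feature per masking ratio: the loss averaged over random mask draws."""
    rng = random.Random(seed)
    features = []
    for ratio in ratios:
        draws = [masked_loss(tokens, ratio, memorized, rng) for _ in range(n_draws)]
        features.append(sum(draws) / n_draws)
    return features


ratios = [0.25, 0.5, 0.75, 1.0]  # assumed values, not taken from the paper
seen = "the cat sat on the mat last night".split()
unseen = "a dog ran over the hill this morning".split()
memorized = {tuple(seen)}

f_seen = loss_trajectory(seen, ratios, memorized)
f_unseen = loss_trajectory(unseen, ratios, memorized)
print([round(x, 3) for x in f_seen])
print([round(x, 3) for x in f_unseen])
```

In this toy setup the memorized text has a lower loss at every ratio, which is the kind of gap a downstream classifier such as XGBoost or an MLP could learn to separate. With a real model the gap would be noisier, which is one reason to average over several mask draws.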
