Learn from Your Mistakes: Self-Correcting Masked Diffusion Models

Yair Schiff; Omer Belhasin; Roy Uziel; Guanghan Wang; Marianne Arriola; Gilad Turok; Michael Elad; Volodymyr Kuleshov

arXiv:2602.11590·cs.LG·March 6, 2026

TL;DR

This paper introduces ProSeCo (Progressive Self-Correction), a training and sampling framework that lets masked diffusion models revise tokens they have already unmasked. The reported gains are up to 2-3x faster sampling and a 1.3x improvement in sample quality on benchmarks.

Contribution

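The paper proposes a training and sampling method in which a model learns both to unmask tokens and to correct previously decoded ones. The corrector is trained on outputs of the MDM denoising network, and corrective refinement steps are interleaved with unmasking steps during generation.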

Findings

01. Up to 2-3x faster sampling with better quality.

02. Sample quality improved by 1.3x on benchmarks.

03. Effective correction of errors during sequence generation.

Abstract

Masked diffusion models (MDMs) have emerged as a promising alternative to autoregressive models, enabling parallel token generation while achieving competitive performance. Despite these advantages, MDMs face a fundamental limitation: once tokens are unmasked, they remain fixed, leading to error accumulation and ultimately degrading sample quality. We address this by proposing a framework that trains a model to perform both unmasking and correction. By reusing outputs from the MDM denoising network as inputs for corrector training, we train a model to recover from potential mistakes. During generation, we apply additional corrective refinement steps between unmasking steps to change decoded tokens and improve outputs. We name our training and sampling method Progressive Self-Correction (ProSeCo) for its unique ability to iteratively refine an entire sequence, including already…
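
The sampling idea in the abstract, corrective steps interleaved with unmasking steps, can be illustrated with a small toy loop. This is only a sketch under stated assumptions and not the authors' implementation: `MASK`, `make_toy_predictor`, `sample`, and the left-to-right unmasking order are all hypothetical, and the "network" is a hand-written function that predicts a position correctly only when its left neighbour is already correct. That rule is chosen purely to make error accumulation visible.

```python
MASK = None  # stand-in for the mask token (an assumption of this sketch)


def make_toy_predictor(target, vocab_size=10):
    """Hypothetical stand-in for a trained denoising network.

    Position i is predicted correctly only if position i-1 has already been
    decoded correctly; otherwise a deliberately wrong token is returned.
    """
    def predict(seq):
        preds = []
        for i, tok in enumerate(target):
            has_good_context = i == 0 or seq[i - 1] == target[i - 1]
            preds.append(tok if has_good_context else (tok + 1) % vocab_size)
        return preds
    return predict


def sample(predict, length, tokens_per_step, corrector_steps):
    """Alternate unmasking steps with optional corrective refinement steps."""
    if tokens_per_step < 1:
        raise ValueError("tokens_per_step must be at least 1")
    seq = [MASK] * length
    while MASK in seq:
        # Unmasking step: commit predictions for the next few masked positions.
        preds = predict(seq)
        masked = [i for i, tok in enumerate(seq) if tok is MASK]
        for i in masked[:tokens_per_step]:
            seq[i] = preds[i]
        # Corrective steps: re-predict and overwrite already decoded tokens,
        # leaving still-masked positions untouched.
        for _ in range(corrector_steps):
            preds = predict(seq)
            seq = [tok if tok is MASK else preds[i] for i, tok in enumerate(seq)]
    return seq


target = [3, 1, 4, 1, 5, 9]
predict = make_toy_predictor(target)
print(sample(predict, len(target), tokens_per_step=2, corrector_steps=0))
print(sample(predict, len(target), tokens_per_step=2, corrector_steps=1))
```

With `corrector_steps=0`, a token decoded with poor context stays fixed and corrupts the context for later positions. With one corrective step per unmasking step, the toy predictor gets a chance to revisit those tokens once their context has improved. The toy says nothing about how the real corrector is trained or how many corrective steps the paper uses.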

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning · Model Reduction and Neural Networks