Improving Text Style Transfer using Masked Diffusion Language Models with Inference-time Scaling

Tejomay Kishor Padole; Suyash P Awate; Pushpak Bhattacharyya

arXiv:2508.10995·cs.CL·August 19, 2025

TL;DR

This paper introduces a verifier-based inference-time scaling method for masked diffusion language models (MDMs). The authors report improved text style transfer quality and position MDMs as a stronger alternative to autoregressive models.

Contribution

The paper proposes a verifier-based inference-time scaling technique for MDMs that improves text style transfer and demonstrates MDMs' advantages over autoregressive models.

Findings

01. Verifier-based scaling improves generation quality.

02. MDMs outperform autoregressive models in style transfer.

03. Simple verifier setups yield significant gains.

Abstract

Masked diffusion language models (MDMs) have recently gained traction as a viable generative framework for natural language. This can be attributed to their scalability and ease of training compared to other diffusion model paradigms for discrete data, which has established them as the state-of-the-art non-autoregressive generators for discrete data. Diffusion models, in general, have shown an excellent ability to improve generation quality through inference-time scaling, either by increasing the number of denoising steps or by using external verifiers on the outputs of each step to guide the generation. In this work, we propose a verifier-based inference-time scaling method that aids in finding a better candidate generation during the denoising process of the MDM. Our experiments demonstrate the application of MDMs to standard text style transfer tasks and establish MDMs as a…
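The general idea of verifier-guided denoising can be illustrated with a toy sketch. This is not the authors' method or code: the denoiser here fills masked positions at random, the verifier simply counts tokens from a preferred set, and every name (`toy_denoise_step`, `toy_verifier`, `verifier_guided_decode`) is a hypothetical stand-in chosen for illustration. It only shows the shape of the loop: sample several candidates per denoising step, score each with an external verifier, and continue from the best one.

```python
import random

MASK = "<mask>"  # hypothetical mask token for this toy example


def toy_denoise_step(tokens, vocab, n_unmask, rng):
    """Stand-in for one MDM denoising step (assumption: a real model would
    predict tokens; here up to n_unmask masked positions are filled at random)."""
    out = list(tokens)
    masked = [i for i, t in enumerate(out) if t == MASK]
    for i in rng.sample(masked, min(n_unmask, len(masked))):
        out[i] = rng.choice(vocab)
    return out


def toy_verifier(tokens, preferred):
    """Stand-in verifier (assumption): counts tokens in the target-style set."""
    return sum(t in preferred for t in tokens)


def verifier_guided_decode(length, vocab, preferred, n_steps, n_candidates, seed=0):
    """Start fully masked; at each step keep the best-scoring candidate."""
    rng = random.Random(seed)
    tokens = [MASK] * length
    per_step = -(-length // n_steps)  # ceiling division so all masks get filled
    for _ in range(n_steps):
        candidates = [
            toy_denoise_step(tokens, vocab, per_step, rng)
            for _ in range(n_candidates)
        ]
        # Inference-time scaling knob: more candidates means more verifier calls.
        tokens = max(candidates, key=lambda c: toy_verifier(c, preferred))
    return tokens


if __name__ == "__main__":
    vocab = ["good", "great", "bad", "awful", "food", "service"]
    result = verifier_guided_decode(
        length=8, vocab=vocab, preferred={"good", "great"},
        n_steps=4, n_candidates=6, seed=1,
    )
    print(result, toy_verifier(result, {"good", "great"}))
```

In this sketch, `n_candidates` and `n_steps` are the two compute knobs: the first controls how many verifier-scored alternatives are explored per step, the second how finely the unmasking is spread out. A real setup would replace both stand-ins with a trained MDM and a style or fluency classifier.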
