Guided Transfer Learning for Discrete Diffusion Models
Julian Kleutgens, Claudio Battiloro, Lingkai Kong, Benjamin Grewe, Francesca Dominici, Mauricio Tec

TL;DR
This paper introduces Guided Transfer Learning (GTL) for discrete diffusion models, which adapts a pretrained model to a target distribution at a cost that scales linearly with vocabulary size and is especially effective in small-data regimes.
Contribution
The paper proposes a practical, scalable algorithm for transfer learning in discrete DMs that addresses the computational challenges of ratio-based guidance, and it demonstrates the algorithm's effectiveness on language and synthetic-data tasks.
Findings
GTL reduces transfer learning cost to linear in vocabulary size.
GTL outperforms fine-tuning when target data is limited.
Poor overlap between the source and target distributions reduces the effectiveness of ratio-based guidance.
Abstract
Discrete diffusion models (DMs) have achieved strong performance in language and other discrete domains, offering a compelling alternative to autoregressive modeling. Yet this performance typically depends on large training datasets, challenging the performance of DMs in small-data regimes -- common under real-world constraints. Aimed at this challenge, recent work in continuous DMs suggests that transfer learning via classifier ratio-based guidance can adapt a pretrained DM to a related target distribution, often outperforming alternatives such as full-weight fine-tuning on the target data. By contrast, transfer learning for discrete DMs remains unexplored. We address this gap by exploring practical analogues of ratio-based transfer learning for discrete DMs. Our theoretical analysis shows that a direct extension of existing ratio-based guidance is computationally prohibitive, scaling…
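To make the idea of ratio-based guidance concrete, here is a minimal toy sketch for a single categorical distribution over tokens. It is not the paper's GTL algorithm. The function name `guided_token_probs` and the three-token vocabulary are hypothetical, and the sketch assumes the exact target-to-source density ratio is known, whereas in practice it would have to be estimated (for example by a classifier).

```python
import math

def guided_token_probs(source_logits, log_ratios):
    """Reweight a source categorical distribution by per-token log density ratios.

    Hypothetical helper for illustration only: adds log(q/p) to the source
    log-probabilities and renormalizes with a numerically stable softmax.
    """
    guided = [s + r for s, r in zip(source_logits, log_ratios)]
    shift = max(guided)  # subtract the max before exponentiating for stability
    weights = [math.exp(g - shift) for g in guided]
    total = sum(weights)
    return [w / total for w in weights]

# Toy three-token vocabulary (made-up numbers).
source_probs = [0.5, 0.3, 0.2]   # pretrained "source" distribution p
target_probs = [0.2, 0.3, 0.5]   # desired "target" distribution q

# Exact log ratio log(q/p); assumed known here, estimated in practice.
log_ratios = [math.log(q) - math.log(p) for q, p in zip(target_probs, source_probs)]
source_logits = [math.log(p) for p in source_probs]

guided = guided_token_probs(source_logits, log_ratios)
```

With an exact ratio, the reweighted distribution coincides with the target. The sketch also hints at why poor overlap is a problem: if the source assigns near-zero probability to a token the target favors, the ratio for that token becomes very large and hard to estimate reliably.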
