Discrete Flow Matching
Itai Gat, Tal Remez, Neta Shaul, Felix Kreuk, Ricky T. Q. Chen, Gabriel Synnaeve, Yossi Adi, Yaron Lipman

TL;DR
Discrete Flow Matching introduces a new discrete generative paradigm that effectively models high-dimensional discrete data like language, improving sample quality and scaling to large models.
Contribution
It proposes a novel discrete flow framework with flexible probability paths and a unified sampling formula, and it demonstrates improved performance on coding benchmarks.
Findings
Achieves 6.7% Pass@1 on HumanEval
Reaches 13.4% Pass@10 on HumanEval
Scales up to 1.7B parameters
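For readers unfamiliar with the Pass@k numbers above, the following is a minimal sketch of how such a metric is commonly estimated. It assumes the standard unbiased estimator popularized alongside HumanEval (draw n samples per problem, count the c that pass the unit tests, and compute 1 - C(n-c, k) / C(n, k)); this page does not state which estimator the authors used, and the function names are illustrative.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem from n samples, c of which are correct.

    Assumption: this is the common unbiased estimator 1 - C(n-c, k) / C(n, k),
    not necessarily the exact procedure used in the paper.
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # If fewer than k samples are incorrect, any draw of k contains a correct one.
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over problems; results holds (n, c) pairs, one per problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

A benchmark-level score is the mean over problems, for example `mean_pass_at_k([(10, 1), (10, 0)], k=1)` for two hypothetical problems with ten samples each.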
Abstract
Despite Flow Matching and diffusion models having emerged as powerful generative paradigms for continuous variables such as images and videos, their application to high-dimensional discrete data, such as language, is still limited. In this work, we present Discrete Flow Matching, a novel discrete flow paradigm designed specifically for generating discrete data. Discrete Flow Matching offers several key contributions: (i) it works with a general family of probability paths interpolating between source and target distributions; (ii) it allows for a generic formula for sampling from these probability paths using learned posteriors such as the probability denoiser (x-prediction) and noise-prediction (ε-prediction); (iii) practically, focusing on specific probability paths defined with different schedulers improves generative perplexity compared to previous discrete diffusion and…
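To make the idea of a probability path between source and target distributions concrete, here is a minimal sketch of one simple choice: a per-token mixture path in which each position independently holds the target token with probability κ(t) and the source token otherwise. The all-mask source, the `MASK` id, the polynomial scheduler `kappa`, and the function names are assumptions made for illustration, not details taken from this page.

```python
import random

MASK = -1  # hypothetical token id for a mask/source token


def kappa(t: float, power: float = 1.0) -> float:
    """Illustrative scheduler: kappa(0) = 0 and kappa(1) = 1, increasing in t."""
    return t ** power


def sample_path(x0: list[int], x1: list[int], t: float,
                rng: random.Random, power: float = 1.0) -> list[int]:
    """Draw x_t from a per-token mixture path between x0 (source) and x1 (target).

    Each position independently takes the target token with probability kappa(t)
    and keeps the source token otherwise.
    """
    if len(x0) != len(x1):
        raise ValueError("source and target sequences must have equal length")
    k = kappa(t, power)
    return [b if rng.random() < k else a for a, b in zip(x0, x1)]


if __name__ == "__main__":
    rng = random.Random(0)
    target = [5, 3, 7, 1, 9, 2]
    source = [MASK] * len(target)
    for t in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(t, sample_path(source, target, t, rng))
```

At t = 0 every token comes from the source and at t = 1 every token comes from the target; changing `power` changes how quickly tokens are revealed, which is the sense in which different schedulers define different paths.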
Taxonomy
Topics: Time Series Analysis and Forecasting · Traffic Prediction and Management Techniques
Methods: Diffusion
