Argmax Flows and Multinomial Diffusion: Learning Categorical Distributions
Emiel Hoogeboom, Didrik Nielsen, Priyank Jaini, Patrick Forré, Max Welling

TL;DR
This paper introduces Argmax Flows and Multinomial Diffusion, two methods for modeling categorical data such as language and image segmentation maps, which outperform existing dequantization approaches in log-likelihood.
Contribution
It proposes generative models tailored to categorical data: Argmax Flows extend normalizing flows with a learned probabilistic inverse of the argmax, and Multinomial Diffusion extends diffusion models with a categorical noising process.
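To make the idea of a probabilistic inverse for the argmax concrete, here is a minimal NumPy sketch. It assumes a softplus thresholding trick to push every non-target coordinate below the target coordinate; the function names, the Gaussian noise source, and the thresholding choice are illustrative assumptions, not necessarily the authors' exact construction. A real model would also learn the noise distribution and track its density.

```python
import numpy as np

def softplus(z):
    # Numerically stable log(1 + exp(z)).
    return np.logaddexp(0.0, z)

def lift_to_continuous(x, num_classes, rng):
    """Sample v in R^K such that argmax(v) == x for each label in x.

    Hypothetical helper: the noise here is a fixed standard Gaussian,
    whereas a trained model would use a learned conditional distribution.
    """
    x = np.asarray(x)
    idx = x[..., None]
    # Unconstrained noise, one K-dimensional vector per label.
    u = rng.normal(size=x.shape + (num_classes,))
    # Value at the true class, kept as the maximum.
    top = np.take_along_axis(u, idx, axis=-1)
    # Softplus is positive, so every entry ends up strictly below `top`.
    v = top - softplus(top - u)
    # Restore the true class to `top` so it is the unique maximum.
    np.put_along_axis(v, idx, top, axis=-1)
    return v

rng = np.random.default_rng(0)
labels = np.array([0, 3, 1, 2])
v = lift_to_continuous(labels, num_classes=4, rng=rng)
recovered = v.argmax(axis=-1)  # applying argmax recovers the labels
```

Applying `argmax` to the lifted vectors returns the original labels, which is the property the probabilistic inverse must satisfy.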
Findings
Outperforms existing dequantization methods in text modeling
Achieves better log-likelihood than existing dequantization approaches on image segmentation maps
Demonstrates effective learning of categorical distributions
Abstract
Generative flows and diffusion models have been predominantly trained on ordinal data, for example natural images. This paper introduces two extensions of flows and diffusion for categorical data such as language or image segmentation: Argmax Flows and Multinomial Diffusion. Argmax Flows are defined by a composition of a continuous distribution (such as a normalizing flow), and an argmax function. To optimize this model, we learn a probabilistic inverse for the argmax that lifts the categorical data to a continuous space. Multinomial Diffusion gradually adds categorical noise in a diffusion process, for which the generative denoising process is learned. We demonstrate that our method outperforms existing dequantization approaches on text modelling and modelling on image segmentation maps in log-likelihood.
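The "gradually adds categorical noise" step can be sketched as follows. This is a minimal illustration that assumes a forward kernel mixing the previous one-hot state with a uniform distribution over the K classes, with a per-step noise level `beta`; the function names and the inverse-CDF sampler are hypothetical choices for this sketch, and the learned denoising process is not shown.

```python
import numpy as np

def one_hot(x, num_classes):
    # Integer labels -> one-hot vectors along a new last axis.
    return np.eye(num_classes)[x]

def forward_probs(x_prev_onehot, beta):
    # Assumed kernel: keep the previous class with weight (1 - beta),
    # otherwise resample uniformly over the K classes.
    num_classes = x_prev_onehot.shape[-1]
    return (1.0 - beta) * x_prev_onehot + beta / num_classes

def sample_categorical(probs, rng):
    # Inverse-CDF sampling, one draw per probability vector.
    cdf = np.cumsum(probs, axis=-1)
    r = rng.random(probs.shape[:-1] + (1,))
    idx = (r >= cdf).sum(axis=-1)
    # Guard against floating-point round-off in the last CDF entry.
    return np.minimum(idx, probs.shape[-1] - 1)

def noise_step(x_prev, beta, num_classes, rng):
    # One forward diffusion step on integer class labels.
    probs = forward_probs(one_hot(x_prev, num_classes), beta)
    return sample_categorical(probs, rng)

rng = np.random.default_rng(0)
x = np.array([2, 0, 1, 1])
for beta in (0.05, 0.2, 0.5):  # illustrative noise schedule
    x = noise_step(x, beta, num_classes=3, rng=rng)
```

With `beta = 0` the labels are unchanged, and with `beta = 1` each label is replaced by a uniform draw, so repeated steps move the data toward uniform categorical noise.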
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Music and Audio Processing · Machine Learning in Healthcare
Methods: Diffusion
