Generating from Discrete Distributions Using Diffusions: Insights from Random Constraint Satisfaction Problems
Alankrita Bhatt, Mukur Gupta, Germain Kolossov, Andrea Montanari

TL;DR
This paper studies how diffusion-based generative methods perform when sampling solutions of random constraint satisfaction problems such as k-SAT. It shows that the theory of random CSPs yields predictions, some of them counterintuitive, that can be used to improve generation accuracy.
Contribution
The paper connects the theory of random CSPs with diffusion-based generation, showing how theoretical insights can improve discrete data generation and outperform existing methods.
Findings
Continuous diffusions outperform masked discrete diffusions
Learned diffusions can match the theoretical "ideal" accuracy
Smart variable ordering improves generation accuracy
Abstract
Generating data from discrete distributions is important for a number of application domains including text, tabular data, and genomic data. Several groups have recently used random k-satisfiability (k-SAT) as a synthetic benchmark for new generative techniques. In this paper, we show that fundamental insights from the theory of random constraint satisfaction problems have observable implications (sometimes contradicting intuition) on the behavior of generative techniques on such benchmarks. More precisely, we study the problem of generating a uniformly random solution of a given (random) k-SAT or k-XORSAT formula. Among other findings, we observe that: (i) Continuous diffusions outperform masked discrete diffusions; (ii) Learned diffusions can match the theoretical "ideal" accuracy; (iii) Smart ordering of the variables can significantly improve accuracy, although not…
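To make the sampling task in the abstract concrete, here is a minimal Python sketch of what "a uniformly random solution of a random k-SAT formula" means. It is not the paper's method: it enumerates all assignments by brute force, which is only feasible for very small n, whereas the paper studies diffusion-based samplers. The function names and the clause encoding (pairs of variable index and negation flag) are assumptions made for illustration.

```python
import itertools
import random


def random_ksat(n, m, k, rng):
    """Draw m clauses, each over k distinct variables with independent random signs.

    A clause is a list of (variable_index, negated) pairs (hypothetical encoding).
    """
    formula = []
    for _ in range(m):
        variables = rng.sample(range(n), k)
        formula.append([(v, rng.random() < 0.5) for v in variables])
    return formula


def satisfies(assignment, formula):
    """A literal (v, negated) is true when assignment[v] != negated."""
    return all(
        any(assignment[v] != negated for v, negated in clause)
        for clause in formula
    )


def all_solutions(formula, n):
    """Enumerate every satisfying assignment; cost grows as 2**n."""
    return [
        x for x in itertools.product([False, True], repeat=n)
        if satisfies(x, formula)
    ]


def sample_uniform_solution(formula, n, rng):
    """Return a uniformly random solution, or None if the formula is unsatisfiable."""
    solutions = all_solutions(formula, n)
    return rng.choice(solutions) if solutions else None


if __name__ == "__main__":
    rng = random.Random(0)
    formula = random_ksat(n=12, m=30, k=3, rng=rng)
    print(sample_uniform_solution(formula, 12, rng))
```

A generative model trained on this kind of benchmark would be judged by how often its samples satisfy the formula, which is the sort of accuracy the findings above refer to.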
Taxonomy
Topics: Constraint Satisfaction and Optimization · Markov Chains and Monte Carlo Methods · Bayesian Modeling and Causal Inference
