Stochastic Sampling from Deterministic Flow Models
Saurabh Singh, Ian Fischer

TL;DR
This paper introduces a method to convert deterministic flow models into stochastic samplers, enhancing their robustness, diversity, and performance in tasks like ImageNet generation by leveraging stochastic differential equations.
Contribution
The authors propose a novel approach to transform deterministic flow models into stochastic samplers, providing greater flexibility and improved empirical results.
Findings
Empirically outperforms deterministic samplers on ImageNet.
Provides a controllable diversity mechanism in sampling.
Demonstrates advantages on a toy Gaussian setup.
Abstract
Deterministic flow models, such as rectified flows, offer a general framework for learning a deterministic transport map between two distributions, realized as the vector field for an ordinary differential equation (ODE). However, they are sensitive to model estimation and discretization errors and do not permit different samples conditioned on an intermediate state, limiting their application. We present a general method to turn the underlying ODE of such flow models into a family of stochastic differential equations (SDEs) that have the same marginal distributions. This method permits us to derive families of stochastic samplers, for fixed (e.g., previously trained) deterministic flow models, that continuously span the spectrum of deterministic and stochastic sampling, given access to the flow field and the score function. Our method provides additional degrees of…
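The idea in the abstract can be illustrated on a one-dimensional Gaussian toy problem, where both the flow field and the score are available in closed form. The sketch below is not the authors' sampler. It assumes a linear interpolation path x_t = (1 - t) x_0 + t x_1 with independent x_0 ~ N(0, 1) and x_1 ~ N(m, s^2), a constant scalar diffusion coefficient g, and plain Euler-Maruyama integration. All function names and parameter values are hypothetical.

```python
import numpy as np

def marginal_stats(t, m, s):
    """Mean and variance of x_t = (1 - t) * x0 + t * x1 for independent Gaussians."""
    mu = t * m
    var = (1.0 - t) ** 2 + (t * s) ** 2
    return mu, var

def velocity(x, t, m, s):
    """Closed-form flow field E[x1 - x0 | x_t = x] for the Gaussian toy (assumed setup)."""
    mu, var = marginal_stats(t, m, s)
    dvar_dt = -2.0 * (1.0 - t) + 2.0 * t * s ** 2
    return m + 0.5 * dvar_dt / var * (x - mu)

def score(x, t, m, s):
    """Score (gradient of log density) of the Gaussian marginal p_t."""
    mu, var = marginal_stats(t, m, s)
    return -(x - mu) / var

def sample(g, n=20000, steps=500, m=2.0, s=0.5, seed=0):
    """Euler-Maruyama for dx = [v + 0.5 * g^2 * score] dt + g dW; g = 0 gives the ODE."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)  # x_0 ~ N(0, 1)
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt
        drift = velocity(x, t, m, s) + 0.5 * g ** 2 * score(x, t, m, s)
        x = x + drift * dt + g * np.sqrt(dt) * rng.standard_normal(n)
    return x

if __name__ == "__main__":
    for g in (0.0, 0.5, 1.0):
        x = sample(g)
        print(f"g={g}: mean={x.mean():.3f}, std={x.std():.3f}")
```

In this toy setting, every value of g should land on roughly the same terminal mean and standard deviation, which is what "same marginal distributions" means here. With g > 0, restarting from the same intermediate state gives different samples, whereas the ODE (g = 0) gives only one.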
Peer Reviews
Decision·Submitted to ICLR 2025
Clearly written.
Lack of novelty. The conversion between SDEs with different state-independent diffusion coefficients (including zero for ODE) through the Fokker-Planck equation is very well-understood now. While Theorem 1 does cover a general case (state-dependent), this general case is not applied. An example of a paper that discussed stochastic sampling and classifier-free guidance is SiT [1]. [1] https://arxiv.org/abs/2401.08740
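For context, the state-independent case this review refers to can be checked in a few lines. This is a sketch of the standard argument, not the paper's Theorem 1. The notation is assumed: v_t is the flow field, p_t the marginal density, g_t a scalar diffusion coefficient, and W_t a Wiener process.

```latex
% ODE and the continuity equation satisfied by its marginals
\frac{dx}{dt} = v_t(x), \qquad \partial_t p_t = -\nabla\cdot\left(p_t v_t\right).

% Candidate SDE with an added score term in the drift
dx = \left[ v_t(x) + \tfrac{1}{2} g_t^2 \nabla \log p_t(x) \right] dt + g_t \, dW_t.

% Its Fokker-Planck equation, using p_t \nabla \log p_t = \nabla p_t
\partial_t p_t
  = -\nabla\cdot\left( p_t v_t + \tfrac{1}{2} g_t^2 \nabla p_t \right) + \tfrac{1}{2} g_t^2 \Delta p_t
  = -\nabla\cdot\left( p_t v_t \right).
```

The score term in the drift cancels the diffusion term exactly, so the SDE shares the ODE's marginals for any choice of g_t, with g_t = 0 recovering the ODE.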
The paper provides a useful empirical exploration of the trade-offs/biases that emerge depending on how the sampling SDE is discretized and which diffusion coefficient is used, e.g. in Table 2. Section 4.2 itemizes useful takeaways for the tradeoffs in SDE-based sampling, and provides nice experiments to demonstrate how diffusivity alleviates e.g. sample degeneracy.
First off, thank you for your submission! It would be great if the authors could address the following *primary concern* about the work: Corollary 1 is already well known -- it is one of the main points of the interpolant procedure, and is the essential knob explored in 2303.08797, Corollary 2.18. This was then extensively tested (the variety of potential diffusion coefficients from the same velocity model) in 2401.08740. It's unclear to the reviewer how what this paper is proposing is different.
1. This paper is well-written, with thorough ablation studies that carefully compare deterministic and stochastic samplers, which I greatly appreciate. 2. The paper is well-written and easy to follow. 3. The motivation is clear.
1. To the best of my knowledge, the paper lacks novelty. The core result of Theorem 1 appears straightforward to derive by combining existing results (Equation 4 and Appendix D in [1], Doob's h-transform, and Equation 37 in [2]). Let me explain more. Theorem 1 can be understood as $\bar{f}=f-\frac{1}{2}[\nabla(GG^T-(\gamma_t GG^T+\tilde{G}\tilde{G}^T))]-\frac{1}{2}[GG^T-(\gamma_t GG^T+\tilde{G}\tilde{G}^T)]\nabla \log p$, $=f-\frac{1}{2}[\nabla(GG^T-\bar{G}\bar{G}^T)]-\frac{1}{2}[GG^T-\ba
Taxonomy
Topics: Simulation Techniques and Applications
