StAD: Stein Amortized Divergence for Fast Likelihoods with Diffusion and Flow
Gurjeet Jagwani, Stephen Thorp, Sinan Deger, Hiranya Peiris

TL;DR
StAD is a distillation method that learns to predict the divergence of the probability flow ODE (PF-ODE) in diffusion and flow-based models, making likelihood computation faster and lower in variance without computing the Jacobian.
Contribution
The paper introduces StAD, a divergence prediction technique that eliminates the need for Jacobian computation in likelihood estimation for diffusion and flow models.
Findings
StAD outperforms Hutchinson and Hutch++ in variance and speed on CIFAR-10 and ImageNet.
The method generalizes across various generative models.
Learned vector fields can satisfy the Stein class under certain conditions.
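For reference, the Stein-class condition mentioned above is usually stated through the Langevin-Stein operator. The block below gives the standard textbook form in generic notation ($p$ for the density, $f$ for the vector field), which is an assumption here and may differ from the paper's own symbols. When $f$ lies in the Stein class of $p$, the operator has zero mean under $p$, so the expected divergence can be rewritten using only $f$ and the score $\nabla_x \log p$, with no Jacobian involved:

```latex
\mathcal{A}_p f(x) = f(x)^\top \nabla_x \log p(x) + \nabla_x \cdot f(x),
\qquad
\mathbb{E}_{x \sim p}\!\left[\mathcal{A}_p f(x)\right] = 0
\;\Longrightarrow\;
\mathbb{E}_{x \sim p}\!\left[\nabla_x \cdot f(x)\right]
= -\,\mathbb{E}_{x \sim p}\!\left[f(x)^\top \nabla_x \log p(x)\right].
```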
Abstract
Diffusion and flow-based models are ubiquitous in generative modelling and density estimation. They admit a deterministic probability flow ordinary differential equation (PF-ODE), analogous to continuous normalizing flows (CNFs), which describes the transport of probability mass. Obtaining the likelihood from these models is of interest to many workflows, especially Bayesian analysis, and requires the divergence of the learned PF-ODE, that is, the trace of its Jacobian, which is either expensive to compute exactly or approximated with a noisy estimate. We introduce StAD, a new distillation method to predict and learn the divergence of the PF-ODE using the Langevin-Stein operator without ever computing the Jacobian. We show that our method is competitive with the Hutchinson and Hutch++ estimators on CIFAR-10, ImageNet and other density estimation tasks,…
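To make the baseline concrete, here is a minimal NumPy sketch of the Hutchinson estimator that the abstract compares against. It is not the StAD method and not the authors' code. The toy vector field `f(x) = A x + sin(x)`, the helper names, and the probe count are all assumptions chosen so that the exact divergence is known in closed form and the estimator's noise can be seen directly.

```python
import numpy as np

def jvp(x, v, A):
    """Jacobian-vector product J(x) v for the toy field f(x) = A x + sin(x).

    Written analytically here; with a neural network this would come
    from automatic differentiation.
    """
    return A @ v + np.cos(x) * v

def exact_divergence(x, A):
    """Exact divergence of the toy field: tr(A) + sum_i cos(x_i)."""
    return np.trace(A) + np.sum(np.cos(x))

def hutchinson_divergence(x, A, n_probes, rng):
    """Estimate tr(J) as the average of v^T J v over Rademacher probes v.

    Returns the mean estimate and the per-probe sample standard deviation.
    """
    d = x.shape[0]
    estimates = np.empty(n_probes)
    for i in range(n_probes):
        v = rng.choice([-1.0, 1.0], size=d)  # Rademacher probe vector
        estimates[i] = v @ jvp(x, v, A)
    return estimates.mean(), estimates.std(ddof=1)

rng = np.random.default_rng(0)
d = 16
A = rng.normal(size=(d, d)) / np.sqrt(d)  # hypothetical linear part
x = rng.normal(size=d)

exact = exact_divergence(x, A)
est, spread = hutchinson_divergence(x, A, n_probes=2000, rng=rng)
print(f"exact divergence: {exact:.4f}")
print(f"Hutchinson mean:  {est:.4f} (per-probe std {spread:.4f})")
```

The per-probe standard deviation is the noise a single-probe estimate carries into each ODE step, which is the cost a learned divergence predictor aims to avoid. When the Jacobian is diagonal the Rademacher estimate is exact, so the variance comes entirely from off-diagonal entries.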
