StAD: Stein Amortized Divergence for Fast Likelihoods with Diffusion and Flow

Gurjeet Jagwani; Stephen Thorp; Sinan Deger; Hiranya Peiris

arXiv:2605.16486 · stat.ML · May 19, 2026

TL;DR

StAD is a distillation method that learns to predict the divergence of the probability flow ODE (PF-ODE) in diffusion and flow-based models, making likelihood computation faster and lower in variance without ever computing the Jacobian.

Contribution

The paper introduces StAD, a divergence prediction technique that eliminates the need for Jacobian computation in likelihood estimation for diffusion and flow models.

Findings

01

StAD outperforms Hutchinson and Hutch++ in variance and speed on CIFAR-10 and ImageNet.

02

The method generalizes across various generative models.

03

Learned vector fields can belong to the Stein class under certain conditions.
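
For context on the last finding, the textbook form of the Langevin-Stein operator and the Stein class is sketched below. The symbols $p$, $f$ and $\mathcal{A}_p$ are notation assumed here for illustration and may differ from the paper's own; how StAD uses the identity during training is not stated on this page.

$$
\mathcal{A}_p f(x) = f(x)^{\top} \nabla_x \log p(x) + \nabla_x \cdot f(x)
$$

A smooth vector field $f$ is in the Stein class of a density $p$ when $\int \nabla_x \cdot \big(p(x) f(x)\big)\, dx = 0$, for example when $p f$ vanishes at infinity. Since $\nabla_x \cdot (p f) = p\, \mathcal{A}_p f$, this gives the Stein identity

$$
\mathbb{E}_{x \sim p}\big[\mathcal{A}_p f(x)\big] = 0
\quad\Longleftrightarrow\quad
\mathbb{E}_{x \sim p}\big[\nabla_x \cdot f(x)\big] = -\,\mathbb{E}_{x \sim p}\big[f(x)^{\top} \nabla_x \log p(x)\big],
$$

which relates the expected divergence to a quantity that contains no Jacobian term.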

Abstract

Diffusion and flow-based models are ubiquitous in generative modelling and density estimation. They admit a deterministic probability flow ordinary differential equation (PF-ODE), analogous to continuous normalizing flows (CNFs), which describes the transport of probability mass. Obtaining the likelihood from these models is of interest to many workflows, especially Bayesian analysis, and requires the divergence of the learned PF-ODE, i.e. the trace of its Jacobian, which costs either $O(D^{2})$ to compute exactly or $O(D)$ with a noisy estimate. We introduce StAD, a new distillation method to predict and learn the divergence of the PF-ODE using the Langevin-Stein operator without ever computing the Jacobian. We show that our method is competitive with the Hutchinson and Hutch++ estimators on CIFAR-10, ImageNet and other density estimation tasks,…
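
To make the exact-versus-noisy trade-off in the abstract concrete, here is a minimal sketch of the Hutchinson baseline (not of StAD itself). It rests on several assumptions: `vector_field` is a hypothetical toy stand-in for a learned PF-ODE drift, chosen so its divergence is known in closed form, and the Jacobian-vector product is approximated with central finite differences where a real implementation would typically use automatic differentiation.

```python
import math
import random


def vector_field(x):
    """Hypothetical toy drift: f(x)_i = tanh(x_i) + 0.5 * x_{(i+1) mod D}, for D >= 2."""
    d = len(x)
    return [math.tanh(x[i]) + 0.5 * x[(i + 1) % d] for i in range(d)]


def exact_divergence(x):
    """Closed-form trace of the toy field's Jacobian: sum_i (1 - tanh(x_i)^2)."""
    return sum(1.0 - math.tanh(xi) ** 2 for xi in x)


def jvp_fd(f, x, v, eps=1e-5):
    """Central-difference approximation of the Jacobian-vector product J(x) v.

    Assumption: stands in for an autodiff JVP; costs two evaluations of f.
    """
    x_plus = [xi + eps * vi for xi, vi in zip(x, v)]
    x_minus = [xi - eps * vi for xi, vi in zip(x, v)]
    return [(a - b) / (2.0 * eps) for a, b in zip(f(x_plus), f(x_minus))]


def hutchinson_divergence(f, x, n_probes, rng):
    """Average v^T J v over Rademacher probes v, whose expectation is tr(J)."""
    total = 0.0
    for _ in range(n_probes):
        v = [rng.choice((-1.0, 1.0)) for _ in x]
        jv = jvp_fd(f, x, v)
        total += sum(vi * ji for vi, ji in zip(v, jv))
    return total / n_probes


rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(16)]

exact = exact_divergence(x)
estimate = hutchinson_divergence(vector_field, x, n_probes=2000, rng=rng)
print(f"exact divergence:    {exact:.4f}")
print(f"Hutchinson estimate: {estimate:.4f}")
```

Each probe costs one Jacobian-vector product, so a single-probe estimate is cheap but noisy because the off-diagonal Jacobian entries contribute variance. Recovering the trace exactly without a closed form would take one product per coordinate, which is where the $O(D^{2})$ cost comes from.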
