The Spacetime of Diffusion Models: An Information Geometry Perspective
Rafał Karczewski, Markus Heinonen, Alison Pouplin, Søren Hauberg, Vikas Garg

TL;DR
This paper introduces a new geometric framework for diffusion models using information geometry, addressing limitations of previous approaches and enabling efficient computation of minimal noise-denoise paths with applications in molecular transition sampling.
Contribution
It proposes a spacetime geometric structure for diffusion models, deriving a diffusion-based edit distance and efficient geodesic estimators, improving understanding and sampling in data and molecular systems.
Findings
Shows that the standard pullback approach through the probability flow ODE decoder forces geodesics to decode as straight segments in data space.
Develops a spacetime geometric structure with a nontrivial metric.
Demonstrates improved transition path sampling in molecular systems.
Abstract
We present a novel geometric perspective on the latent space of diffusion models. We first show that the standard pullback approach, utilizing the deterministic probability flow ODE decoder, is fundamentally flawed. It provably forces geodesics to decode as straight segments in data space, effectively ignoring any intrinsic data geometry beyond the ambient Euclidean space. Complementing this view, diffusion also admits a stochastic decoder via the reverse SDE, which enables an information geometric treatment with the Fisher-Rao metric. However, a choice of x_T as the latent representation collapses this metric due to memorylessness. We address this by introducing a latent spacetime that indexes the family of denoising distributions across all noise scales, yielding a nontrivial geometric structure. We prove these distributions form an exponential family and…
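The exponential-family claim can be checked numerically in a toy setting. The sketch below is not the authors' code: it assumes a scalar forward process x_t = alpha * x_0 + sigma * eps with Gaussian eps and a uniform prior over a small finite dataset, and all function names are hypothetical. Under those assumptions the denoising distribution p(x_0 | x_t) has natural parameters eta = (alpha * x_t / sigma^2, -alpha^2 / (2 sigma^2)) and sufficient statistics T(x_0) = (x_0, x_0^2). The last function sums (eta_{i+1} - eta_i) · (mu_{i+1} - mu_i) along a discretized spacetime curve, where mu = E[T(x_0) | x_t]; for exponential families each term equals the symmetrized KL divergence between neighbouring distributions, which matches the Fisher-Rao curve energy only up to a step-size-dependent constant.

```python
import math

def natural_params(x_t, alpha, sigma):
    """Natural parameters of p(x_0 | x_t), assuming x_t = alpha * x_0 + sigma * eps."""
    return (alpha * x_t / sigma**2, -alpha**2 / (2 * sigma**2))

def _softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    w = [math.exp(l - m) for l in logits]
    z = sum(w)
    return [wi / z for wi in w]

def posterior_expfam(data, x_t, alpha, sigma):
    """Posterior over data points written as exp(eta . T(x_0)), T(x_0) = (x_0, x_0^2)."""
    eta1, eta2 = natural_params(x_t, alpha, sigma)
    return _softmax([eta1 * x0 + eta2 * x0**2 for x0 in data])

def posterior_bayes(data, x_t, alpha, sigma):
    """Same posterior via Bayes' rule with the Gaussian likelihood N(x_t; alpha * x_0, sigma^2)."""
    return _softmax([-(x_t - alpha * x0) ** 2 / (2 * sigma**2) for x0 in data])

def expectation_params(data, x_t, alpha, sigma):
    """Expectation parameters mu = E[T(x_0) | x_t] = (E[x_0], E[x_0^2])."""
    w = posterior_expfam(data, x_t, alpha, sigma)
    return (sum(wi * x0 for wi, x0 in zip(w, data)),
            sum(wi * x0**2 for wi, x0 in zip(w, data)))

def discrete_curve_energy(data, curve):
    """Sum of (delta eta) . (delta mu) over consecutive spacetime points.

    `curve` is a list of (x_t, alpha, sigma) tuples. Each term is the
    symmetrized KL divergence between neighbouring denoising distributions.
    """
    etas = [natural_params(*p) for p in curve]
    mus = [expectation_params(data, *p) for p in curve]
    total = 0.0
    for i in range(len(curve) - 1):
        total += (etas[i + 1][0] - etas[i][0]) * (mus[i + 1][0] - mus[i][0])
        total += (etas[i + 1][1] - etas[i][1]) * (mus[i + 1][1] - mus[i][1])
    return total

if __name__ == "__main__":
    data = [-1.0, 0.5, 2.0]  # toy dataset (illustrative values)
    curve = [(0.0, 0.9, 0.4), (0.2, 0.8, 0.6), (0.5, 0.7, 0.7)]
    print(posterior_expfam(data, 0.3, 0.8, 0.6))
    print(posterior_bayes(data, 0.3, 0.8, 0.6))
    print(discrete_curve_energy(data, curve))
```

The two posterior functions agree because the Gaussian log-likelihood differs from eta · T(x_0) only by a term that does not depend on x_0, which the normalization removes. In a trained diffusion model the expectation parameters would come from the denoiser rather than from an explicit sum over data, which this toy example does not attempt to reproduce.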
Peer Reviews
Decision·ICLR 2026 Oral
1. Clear and compelling reframing of the latent representation in diffusion models. While prior work typically analyzes individual x_t states or only the fully-noised x_T, this paper emphasizes that the entire trajectory \{x_t\} constitutes the latent representation. Defining geometry on the spacetime manifold (x_t, t) using the Fisher–Rao metric is both conceptually straightforward and surprisingly underexplored, making the contribution feel natural yet novel.
2. Diffusion Edit Distance offers
1. High computational cost. The method requires computing geodesics in spacetime and then sampling along those paths with ALD. This is quite expensive in practice, and the paper does not propose any way to reduce this cost. As a result, it may be difficult to use the method in large-scale or time-sensitive settings.
2. Limited variety of data types in experiments. The experiments are mostly on datasets where the structure is relatively simple and smooth (e.g., faces, digits). It is unclear how w
The paper is clearly written, and the proposed approach is well presented. The work introduces a novel and interesting notion of latent spacetime. While previous studies usually considered interpolation between samples as transitions within a slice of latent space fixed in timestep, this paper generalizes that perspective. It may have a broad impact on the image editing domain, effectively connecting geometric ideas with techniques such as DDIM inversion.
There are concerns regarding the computational complexity. Although the paper mentions that the method may be slower, it would be beneficial to include exact runtime comparisons and additional evaluations. The authors demonstrate that PF-ODE sampling trajectories and geodesics are similar but do not provide any quantitative metrics. It would be interesting to see how this approach compares with LPIPS, for example by fixing the starting and final points to the same timestep.
- The paper addresses an interesting theoretical question: how to define a meaningful geometric structure for diffusion models and connects it to information geometry.
- The idea of modeling denoising distributions as an exponential family and equipping the resulting spacetime with a Fisher–Rao metric is conceptually sound and mathematically motivated.
- The second application (transition path sampling in molecular systems) is original and can lead to generating more research in that direction
- The paper does not clearly describe the algorithm used to compute the Diffusion Edit Distance (DiffED). Including pseudocode or a concise algorithmic summary would greatly improve clarity and reproducibility.
- The computational cost of finding geodesics in spacetime using DiffED is unclear. Since Equation (16) must be evaluated at many noise levels, the method appears potentially expensive. How does its efficiency compare to related approaches? For instance, What’s Inside Your Diffusion Mode
Taxonomy
Topics: Morphological variations and asymmetry · Topological and Geometric Data Analysis · Statistical Mechanics and Entropy
