Your diffusion model secretly knows the dimension of the data manifold
Jan Stanczuk, Georgios Batzolis, Teo Deveney, Carola-Bibiane Schönlieb

TL;DR
This paper introduces a novel method leveraging diffusion models to estimate the intrinsic dimension of data manifolds, outperforming existing estimators in experiments on Euclidean and image data.
Contribution
It presents what the authors describe as, to the best of their knowledge, the first diffusion-model-based estimator of data manifold dimension, connecting the score function to the geometry of the manifold.
Findings
Outperforms traditional estimators in controlled experiments
Works effectively on both Euclidean and image data
Provides a new geometric perspective using diffusion models
Abstract
In this work, we propose a novel framework for estimating the dimension of the data manifold using a trained diffusion model. A diffusion model approximates the score function, i.e., the gradient of the log density of a noise-corrupted version of the target distribution, for varying levels of corruption. We prove that, if the data concentrates around a manifold embedded in the high-dimensional ambient space, then as the level of corruption decreases, the score function points towards the manifold, as this direction becomes the direction of maximal likelihood increase. Therefore, for small levels of corruption, the diffusion model provides us with access to an approximation of the normal bundle of the data manifold. This allows us to estimate the dimension of the tangent space and thus the intrinsic dimension of the data manifold. To the best of our knowledge, our method is the first…
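The idea in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: instead of a trained diffusion model it uses the exact score of a Gaussian-smoothed empirical distribution as a stand-in, the data lie on a flat 2-dimensional subspace of a 10-dimensional space, and the "largest drop in singular values" rule for separating normal from tangent directions is an assumption made here for illustration. The function names `smoothed_empirical_score` and `estimate_intrinsic_dim` are hypothetical.

```python
import numpy as np


def smoothed_empirical_score(x, data, sigma):
    """Score of the data distribution convolved with N(0, sigma^2 I).

    Assumption: this closed-form score of the smoothed empirical
    distribution stands in for a trained diffusion model's score network.
    x has shape (n_queries, d), data has shape (n_points, d).
    """
    sq_dist = (
        (x**2).sum(axis=1)[:, None]
        - 2.0 * x @ data.T
        + (data**2).sum(axis=1)[None, :]
    )
    logits = -sq_dist / (2.0 * sigma**2)
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    posterior_mean = weights @ data  # E[clean point | noisy point]
    return (posterior_mean - x) / sigma**2


def estimate_intrinsic_dim(score_fn, x0, sigma, n_queries, rng):
    """Estimate the manifold dimension near x0 from score vectors.

    Scores at slightly perturbed copies of x0 approximately span the
    normal space, so the number of large singular values estimates the
    normal dimension. The cut-off is placed at the largest drop between
    consecutive singular values (a heuristic assumed for this sketch).
    """
    d = x0.shape[0]
    queries = x0 + sigma * rng.standard_normal((n_queries, d))
    scores = score_fn(queries)
    singular_values = np.linalg.svd(scores, compute_uv=False)  # descending
    n_normal = int(np.argmax(singular_values[:-1] - singular_values[1:])) + 1
    return d - n_normal, singular_values


rng = np.random.default_rng(0)
ambient_dim, true_dim, n_points = 10, 2, 20000

# Toy data: uniform samples on a 2-dimensional linear subspace of R^10.
basis, _ = np.linalg.qr(rng.standard_normal((ambient_dim, true_dim)))
data = rng.uniform(-1.0, 1.0, size=(n_points, true_dim)) @ basis.T

sigma = 0.1
dim_estimate, singular_values = estimate_intrinsic_dim(
    lambda x: smoothed_empirical_score(x, data, sigma),
    x0=np.zeros(ambient_dim),  # the origin lies on the subspace
    sigma=sigma,
    n_queries=200,
    rng=rng,
)
print("estimated intrinsic dimension:", dim_estimate)
print("singular values:", np.round(singular_values, 1))
```

In this toy setting the score components normal to the subspace have magnitude of order 1/sigma, while the tangential components stay close to zero, so the singular value spectrum shows a clear gap. With a real diffusion model, `score_fn` would be replaced by the trained network evaluated at a small noise level, and the gap may be less sharp.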
Taxonomy
Topics: Markov Chains and Monte Carlo Methods · Generative Adversarial Networks and Image Synthesis · Bayesian Methods and Mixture Models
Methods: Diffusion
