TL;DR
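The paper introduces interpolated FID (iFID), a metric that correlates strongly with the generation FID (gFID) of a latent diffusion model. iFID interpolates each dataset element's latent with that of its nearest neighbor, decodes the result, and computes the FID between the decoded samples and the original dataset.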
Contribution
Proposes iFID, a simple yet effective variant of reconstruction FID that better predicts diffusion model quality through latent space interpolation.
Findings
iFID correlates with gFID, with Pearson and Spearman coefficients around 0.85.
iFID outperforms traditional rFID as a predictor of diffusion sample quality.
The paper theoretically links iFID to diffusion generalization and hallucination phenomena.
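For readers unfamiliar with the two coefficients named above, the following is a minimal standard-library sketch of how Pearson and Spearman correlation between two lists of scores (for example, per-autoencoder iFID and gFID values) could be computed. The function names are illustrative and are not taken from the paper; the numbers in the usage note are synthetic and unrelated to the paper's results.

```python
import math
from typing import List, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation: covariance divided by the product of standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = avg
        pos = end + 1
    return result


def spearman(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation applied to the ranks."""
    return pearson(ranks(xs), ranks(ys))
```

With the synthetic inputs `[1, 2, 3, 4]` and `[1, 4, 9, 16]`, the relationship is monotone but not linear, so `spearman` returns 1 while `pearson` is slightly below 1. Pearson measures linear agreement and Spearman measures agreement in ordering, which is why a metric comparison may report both.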
Abstract
It is well known that the reconstruction FID (rFID) of a VAE is poorly correlated with the generation FID (gFID) of a latent diffusion model. We propose interpolated FID (iFID), a simple variant of rFID that exhibits a strong correlation with gFID. Specifically, for each dataset element, we retrieve its nearest neighbor in latent space, interpolate between their latent representations, decode the interpolated latent, and compute the FID between the decoded samples and the original dataset. We provide an intuitive explanation for why iFID correlates well with gFID, and why reconstruction metrics can be negatively correlated with gFID, by connecting iFID to recent results on diffusion generalization and hallucination. Theoretically, we show that iFID evaluates decoded interpolations aligned with the ridge set around which diffusion samples concentrate, thereby measuring a quantity closely…
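The procedure described in the abstract (nearest-neighbor retrieval in latent space, interpolation, decoding, FID against the original dataset) can be sketched as follows. This is a minimal illustration, not the authors' implementation. Several details are assumptions that the abstract does not specify: latents are treated as flat vectors, neighbors are found by Euclidean distance, and the interpolation weight `alpha` defaults to 0.5. The `encode`, `decode`, and `fid` callables are hypothetical placeholders for the VAE encoder, the VAE decoder, and any FID implementation.

```python
import math
from typing import Any, Callable, List, Sequence

Vector = List[float]


def nearest_neighbor_index(latents: Sequence[Vector], i: int) -> int:
    """Index of the latent closest to latents[i], excluding i itself.

    Assumption: Euclidean distance on flattened latents.
    """
    best_j, best_d = -1, math.inf
    for j, z in enumerate(latents):
        if j == i:
            continue
        d = math.dist(latents[i], z)
        if d < best_d:
            best_j, best_d = j, d
    return best_j


def interpolate_with_neighbors(latents: Sequence[Vector], alpha: float = 0.5) -> List[Vector]:
    """Linearly interpolate each latent toward its nearest neighbor.

    Assumption: alpha = 0.5 (midpoint); the abstract does not state the weight.
    """
    out = []
    for i, z in enumerate(latents):
        nn = latents[nearest_neighbor_index(latents, i)]
        out.append([(1 - alpha) * a + alpha * b for a, b in zip(z, nn)])
    return out


def ifid(
    images: Sequence[Any],
    encode: Callable[[Any], Vector],          # hypothetical: VAE encoder
    decode: Callable[[Vector], Any],          # hypothetical: VAE decoder
    fid: Callable[[Sequence[Any], Sequence[Any]], float],  # hypothetical: FID function
    alpha: float = 0.5,
) -> float:
    """FID between decoded interpolated latents and the original dataset."""
    latents = [encode(x) for x in images]
    decoded = [decode(z) for z in interpolate_with_neighbors(latents, alpha)]
    return fid(decoded, images)
```

The brute-force neighbor search is quadratic in dataset size and is only meant to make the steps explicit; at realistic scale one would likely use a vectorized or approximate nearest-neighbor search instead.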
