TL;DR
SRC-Flow adds a Semantic Representation Compressor so that normalizing flows model images in a compact semantic space instead of the full high-dimensional feature space, reaching gFID scores of 1.65 (256x256) and 2.07 (512x512) on ImageNet.
Contribution
The paper proposes SRC-Flow, which uses a Semantic Representation Compressor (SRC) to map high-dimensional RAE features into a low-dimensional semantic space for flow modeling, while the frozen RAE decoder handles reconstruction.
Findings
Achieves state-of-the-art gFID scores on ImageNet: 1.65 at 256x256 and 2.07 at 512x512.
Enables exact likelihood computation and deterministic invertible sampling in the semantic space.
Outperforms previous normalizing flow methods in large-scale image generation.
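To make "exact likelihood computation and deterministic invertible sampling" concrete, here is a minimal sketch of a single affine coupling layer acting on a small latent vector. It is a generic normalizing-flow building block, not the SRC-Flow architecture: the latent dimension `D`, hidden width `H`, and the random weights `W1`, `W_s`, `W_t` are all hypothetical stand-ins for a trained model, and no compressor or RAE decoder is modeled here.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8   # hypothetical compact latent dimension (not taken from the paper)
H = 16  # hypothetical hidden width of the conditioner network

# Fixed random weights standing in for trained parameters (assumption).
W1 = rng.normal(scale=0.5, size=(D // 2, H))
W_s = rng.normal(scale=0.5, size=(H, D // 2))
W_t = rng.normal(scale=0.5, size=(H, D // 2))


def _scale_shift(z_a):
    """Conditioner: predicts log-scale and shift from the untouched half."""
    h = np.tanh(z_a @ W1)
    log_s = np.tanh(h @ W_s)  # bounded log-scale keeps exp() well behaved
    t = h @ W_t
    return log_s, t


def forward(z):
    """Map latents to base noise; return the noise and log|det Jacobian|."""
    z_a, z_b = z[:, : D // 2], z[:, D // 2 :]
    log_s, t = _scale_shift(z_a)
    u_b = z_b * np.exp(log_s) + t
    u = np.concatenate([z_a, u_b], axis=1)
    return u, log_s.sum(axis=1)


def inverse(u):
    """Exact inverse of forward: deterministic sampling from base noise."""
    u_a, u_b = u[:, : D // 2], u[:, D // 2 :]
    log_s, t = _scale_shift(u_a)
    z_b = (u_b - t) * np.exp(-log_s)
    return np.concatenate([u_a, z_b], axis=1)


def log_likelihood(z):
    """Change of variables: log p(z) = log N(f(z); 0, I) + log|det df/dz|."""
    u, log_det = forward(z)
    log_base = -0.5 * (u ** 2).sum(axis=1) - 0.5 * D * np.log(2 * np.pi)
    return log_base + log_det


z = rng.normal(size=(4, D))          # pretend these are compact latents
u, _ = forward(z)
z_rec = inverse(u)                   # round trip through the flow
max_err = np.abs(z - z_rec).max()    # should be at floating-point precision
ll = log_likelihood(z)               # one exact log-density per sample
```

The cost of the log-determinant here is a sum over `D // 2` terms, so a smaller latent dimension directly shrinks the space the flow must transport, which is the intuition behind modeling a compact space.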
Abstract
Normalizing flows (NFs) provide exact likelihoods and deterministic invertible sampling, but have historically lagged behind diffusion models for large-scale image generation. We identify a key obstacle: NFs are required to learn a single invertible transport over the full ambient space, making them highly sensitive to high-dimensional representations. This leads to a semantic-capacity mismatch in modern visual representation spaces, where semantic information is compact but encoded in overcomplete features. We propose SRC-Flow, which introduces a Semantic Representation Compressor (SRC) to compact high-dimensional RAE features into a low-dimensional semantic space before flow modeling and preserve reconstruction through the frozen RAE decoder. This compact space reduces the modeling burden of NFs and enables effective likelihood-based generation in semantic representation space. We…
