TL;DR
FreqFlow adds frequency-aware conditioning to flow matching through a two-branch architecture, so that generated images capture both global structure and fine details.
Contribution
The paper presents a frequency-aware flow matching framework that combines time-dependent adaptive weighting with a two-branch architecture for image synthesis.
Findings
Achieves state-of-the-art FID of 1.38 on ImageNet-256.
Outperforms prior diffusion and flow matching models in image generation quality.
Effectively models both global structure and fine details.
Abstract
Flow matching models have emerged as a powerful framework for realistic image generation by learning to reverse a corruption process that progressively adds Gaussian noise. However, because noise is injected in the latent domain, its impact on different frequency components is non-uniform. As a result, during inference, flow matching models tend to generate low-frequency components (global structure) in the early stages, while high-frequency components (fine details) emerge only later in the reverse process. Building on this insight, we propose Frequency-Aware Flow Matching (FreqFlow), a novel approach that explicitly incorporates frequency-aware conditioning into the flow matching framework via time-dependent adaptive weighting. We introduce a two-branch architecture: (1) a frequency branch that separately processes low- and high-frequency components to capture global structure and…
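The coarse-to-fine ordering described in the abstract can be illustrated with a small numerical sketch. It is not the paper's method. It assumes a linear interpolation path `x_t = t * x + (1 - t) * eps` (with `t = 0` pure noise and `t = 1` clean data), white unit-variance Gaussian noise, and a signal whose power falls off as `1 / f^alpha`, a common idealization of natural images. The names `band_snr`, `crossover_time`, and `alpha` are hypothetical and chosen only for illustration.

```python
import numpy as np


def band_snr(t, freqs, alpha=2.0):
    """Per-frequency signal-to-noise ratio of x_t = t * x + (1 - t) * eps.

    Assumes signal power 1 / f**alpha and white noise with unit power
    at every frequency (illustrative assumptions, not from the paper).
    """
    signal_power = 1.0 / freqs ** alpha
    return (t ** 2 * signal_power) / ((1.0 - t) ** 2)


def crossover_time(freq, alpha=2.0):
    """Time t at which the SNR of a given frequency reaches 1.

    Solving t**2 / f**alpha = (1 - t)**2 gives
    t = r / (1 + r) with r = f**(alpha / 2).
    """
    r = freq ** (alpha / 2.0)
    return r / (1.0 + r)


freqs = np.array([1.0, 4.0, 16.0, 64.0])
times = np.array([crossover_time(f) for f in freqs])

for f, t in zip(freqs, times):
    print(f"frequency {f:5.1f}: signal overtakes noise at t = {t:.3f}")
```

Under these assumptions, the crossover time increases with frequency: low-frequency bands rise above the noise early in the generation trajectory, and high-frequency bands only near the end. This matches the ordering the abstract describes.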
