Laplacian Multi-scale Flow Matching for Generative Modeling
Zelin Zhao, Petr Molodyk, Haotian Xue, Yongxin Chen

TL;DR
This paper introduces LapFlow, a multi-scale flow matching framework that uses Laplacian pyramids and a mixture-of-transformers to improve image generation quality and speed, especially at high resolutions.
Contribution
The paper proposes a parallel multi-scale generative model that needs no renoising between scales, which improves efficiency and scalability.
Findings
Achieves superior sample quality on CelebA-HQ and ImageNet.
Reduces GFLOPs and inference time compared to baselines.
Effectively scales to 1024×1024 resolution images.
Abstract
In this paper, we present Laplacian multiscale flow matching (LapFlow), a novel framework that enhances flow matching by leveraging multi-scale representations for image generative modeling. Our approach decomposes images into Laplacian pyramid residuals and processes different scales in parallel through a mixture-of-transformers (MoT) architecture with causal attention mechanisms. Unlike previous cascaded approaches that require explicit renoising between scales, our model generates multi-scale representations in parallel, eliminating the need for bridging processes. The proposed multi-scale architecture not only improves generation quality but also accelerates the sampling process and promotes scaling flow matching methods. Through extensive experimentation on CelebA-HQ and ImageNet, we demonstrate that our method achieves superior sample quality with fewer GFLOPs and faster inference…
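The Laplacian pyramid decomposition mentioned in the abstract can be illustrated with a minimal sketch. The resampling operators here (2×2 average pooling for downsampling, nearest-neighbor repetition for upsampling) and the function names are assumptions chosen for brevity, not the operators or code used by the authors. The sketch only shows the general idea: each level stores the detail lost by downsampling, and the image is recovered exactly by adding the residuals back from coarse to fine.

```python
import numpy as np

def downsample(x):
    """Halve height and width by 2x2 average pooling (assumed operator)."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(x):
    """Double height and width by nearest-neighbor repetition (assumed operator)."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def laplacian_pyramid(image, levels):
    """Return [finest residual, ..., coarsest image] for a 2D array."""
    residuals = []
    current = image
    for _ in range(levels - 1):
        coarse = downsample(current)
        # Residual = detail that the coarser scale cannot represent.
        residuals.append(current - upsample(coarse))
        current = coarse
    residuals.append(current)  # the coarsest level is stored as-is
    return residuals

def reconstruct(pyramid):
    """Invert the decomposition by adding residuals from coarse to fine."""
    current = pyramid[-1]
    for residual in reversed(pyramid[:-1]):
        current = upsample(current) + residual
    return current

# Hypothetical 8x8 single-channel "image" for illustration.
rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8))
pyramid = laplacian_pyramid(image, levels=3)
restored = reconstruct(pyramid)
```

In this toy setup the pyramid holds arrays of shape 8×8, 4×4, and 2×2. A multi-scale generative model in the spirit of the abstract would model these per-scale components rather than the full-resolution image directly; how LapFlow does so is described in the paper itself.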
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · 3D Shape Modeling and Analysis · Model Reduction and Neural Networks
