FAM Diffusion: Frequency and Attention Modulation for High-Resolution Image Generation with Stable Diffusion
Haosen Yang, Adrian Bulat, Isma Hadji, Hai X. Pham, Xiatian Zhu, Georgios Tzimiropoulos, Brais Martinez

TL;DR
FAM Diffusion introduces frequency and attention modulation modules that let existing diffusion models generate high-resolution images without retraining, reducing artifacts and latency overhead.
Contribution
The paper presents a training-free approach that combines frequency modulation and attention modulation to improve high-resolution image generation with pre-trained diffusion models.
Findings
Addresses structural and local artifacts effectively
Achieves state-of-the-art quantitative performance
Maintains low latency without additional inference tricks
Abstract
Diffusion models are proficient at generating high-quality images. They are however effective only when operating at the resolution used during training. Inference at a scaled resolution leads to repetitive patterns and structural distortions. Retraining at higher resolutions quickly becomes prohibitive. Thus, methods enabling pre-existing diffusion models to operate at flexible test-time resolutions are highly desirable. Previous works suffer from frequent artifacts and often introduce large latency overheads. We propose two simple modules that combine to solve these issues. We introduce a Frequency Modulation (FM) module that leverages the Fourier domain to improve the global structure consistency, and an Attention Modulation (AM) module which improves the consistency of local texture patterns, a problem largely ignored in prior works. Our method, coined Fam diffusion, can seamlessly…
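The abstract says only that the FM module "leverages the Fourier domain" to keep global structure consistent; it does not give the formulation. As a rough illustration of the general idea, and not the authors' module, the sketch below blends two same-sized 2D arrays in the Fourier domain: low frequencies (coarse layout) come from one input and high frequencies (fine detail) from the other. The function names, the hard circular mask, and the `cutoff` value are all assumptions made for this example.

```python
import numpy as np


def lowpass_mask(height, width, cutoff):
    """Boolean mask that is True where the frequency radius is <= cutoff.

    Frequencies are in cycles per sample, as returned by np.fft.fftfreq.
    The hard circular mask is an illustrative choice, not taken from the paper.
    """
    fy = np.fft.fftfreq(height)[:, None]
    fx = np.fft.fftfreq(width)[None, :]
    return np.sqrt(fy ** 2 + fx ** 2) <= cutoff


def blend_frequencies(structure, detail, cutoff=0.1):
    """Combine low frequencies of `structure` with high frequencies of `detail`.

    Both inputs are real 2D arrays of identical shape (hypothetical stand-ins
    for a low-resolution guide and a high-resolution sample).
    """
    if structure.shape != detail.shape:
        raise ValueError("inputs must have the same shape")
    mask = lowpass_mask(structure.shape[0], structure.shape[1], cutoff)
    # Pick each frequency coefficient from one spectrum or the other.
    spectrum = np.where(mask, np.fft.fft2(structure), np.fft.fft2(detail))
    # The mask is symmetric in frequency, so the result is real up to rounding.
    return np.real(np.fft.ifft2(spectrum))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    guide = rng.standard_normal((64, 64))
    sample = rng.standard_normal((64, 64))
    blended = blend_frequencies(guide, sample, cutoff=0.1)
    print(blended.shape)
```

In this sketch the zero-frequency (mean) component always comes from `structure` when `cutoff` is non-negative, which is why the blended output keeps the guide's overall brightness while inheriting the sample's fine texture.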
Taxonomy
Topics: Image Processing Techniques and Applications · CCD and CMOS Imaging Sensors · Advanced Optical Imaging Technologies
Methods: Softmax · Attention Is All You Need · Latent Diffusion Model · Diffusion
