Uniform Attention Maps: Boosting Image Fidelity in Reconstruction and Editing
Wenyi Mo, Tianyu Zhang, Yalong Bai, Bing Su, Ji-Rong Wen

TL;DR
This paper introduces uniform attention maps to improve image reconstruction fidelity in diffusion models, addressing misalignments caused by cross-attention, and enhances editing accuracy with adaptive mask-guided techniques.
Contribution
It proposes replacing cross-attention with uniform attention maps in diffusion models, significantly boosting reconstruction fidelity and editing consistency without extensive model tuning.
Findings
Enhanced image reconstruction fidelity demonstrated
Robust performance in image editing tasks
Reduced distortions during noise prediction
Abstract
Text-guided image generation and editing using diffusion models have achieved remarkable advancements. Among these, tuning-free methods have gained attention for their ability to perform edits without extensive model adjustments, offering simplicity and efficiency. However, existing tuning-free approaches often struggle with balancing fidelity and editing precision. Reconstruction errors in DDIM Inversion are partly attributed to the cross-attention mechanism in U-Net, which introduces misalignments during the inversion and reconstruction process. To address this, we analyze reconstruction from a structural perspective and propose a novel approach that replaces traditional cross-attention with uniform attention maps, significantly enhancing image reconstruction fidelity. Our method effectively minimizes distortions caused by varying text conditions during noise prediction. To complement…
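To make the central idea concrete, the following is a minimal pure-Python sketch, not the authors' implementation. It assumes that a "uniform attention map" assigns every text token the same weight, 1/N for N tokens, in place of the softmax over query-key scores. The function names (`cross_attention_map`, `uniform_attention_map`, `attend`) and the toy inputs are hypothetical and chosen only for illustration.

```python
import math

def softmax(row):
    # Numerically stable softmax over one row of scores.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention_map(queries, keys):
    # Standard scaled dot-product attention weights: one row per image query,
    # one column per text token. The weights depend on the text keys.
    scale = math.sqrt(len(queries[0]))
    return [
        softmax([sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys])
        for q in queries
    ]

def uniform_attention_map(num_queries, num_tokens):
    # Assumed form of a uniform map: every token gets weight 1 / num_tokens,
    # so the weights no longer depend on the query-key scores.
    return [[1.0 / num_tokens] * num_tokens for _ in range(num_queries)]

def attend(attn_map, values):
    # Weighted sum of value vectors for each query position.
    return [
        [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        for weights in attn_map
    ]

# Toy data: 3 image positions, 2 text tokens, feature dimension 2.
queries = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[2.0, 0.0], [0.0, 4.0]]

cross_map = cross_attention_map(queries, keys)
cross_out = attend(cross_map, values)
uniform_out = attend(uniform_attention_map(len(queries), len(keys)), values)
```

Under this assumption, every image position receives the same output (the mean of the value vectors), whereas standard cross-attention produces position-dependent outputs that shift with the text keys. That is one way to read the abstract's claim about reducing distortions caused by varying text conditions.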
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis
Methods: Softmax · Attention Is All You Need · Convolution · Concatenated Skip Connection · Max Pooling · U-Net · Diffusion
