Dual-Channel Attention Guidance for Training-Free Image Editing Control in Diffusion Transformers
Guandong Li

TL;DR
This paper introduces a training-free dual-channel attention guidance method for diffusion transformers, leveraging both Key and Value spaces to improve image editing control and fidelity without additional training.
Contribution
The paper reveals a bias-delta structure in the Key and Value projections of DiT attention layers and proposes a dual-channel manipulation framework for more precise image editing control.
Findings
DCAG outperforms Key-only guidance across all fidelity metrics.
It yields significant improvements in localized editing tasks such as object deletion and addition.
Its effectiveness is demonstrated on the PIE-Bench benchmark.
Abstract
Training-free control over editing intensity is a critical requirement for diffusion-based image editing models built on the Diffusion Transformer (DiT) architecture. Existing attention manipulation methods focus exclusively on the Key space to modulate attention routing, leaving the Value space -- which governs feature aggregation -- entirely unexploited. In this paper, we first reveal that both Key and Value projections in DiT's multi-modal attention layers exhibit a pronounced bias-delta structure, where token embeddings cluster tightly around a layer-specific bias vector. Building on this observation, we propose Dual-Channel Attention Guidance (DCAG), a training-free framework that simultaneously manipulates both the Key channel (controlling where to attend) and the Value channel (controlling what to aggregate). We provide a theoretical analysis showing that the Key channel operates…
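The abstract is truncated above, so the exact DCAG update rule is not shown here. As a rough illustration only, the sketch below assumes one plausible reading of the bias-delta idea: estimate the bias as the mean of the token projections, then rescale each token's deviation (delta) from that bias, separately for Keys and Values. The function names and the scale parameters `alpha` and `beta` are hypothetical and not taken from the paper, and the single-query, single-head attention is a simplification of DiT's multi-modal attention.

```python
import math

def mean_vector(rows):
    """Estimate the bias as the per-dimension mean over tokens (an assumption)."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def rescale_delta(rows, scale):
    """Return bias + scale * (token - bias) for every token row."""
    bias = mean_vector(rows)
    return [[b + scale * (x - b) for x, b in zip(row, bias)] for row in rows]

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Plain scaled dot-product attention for a single query vector."""
    d = len(query)
    logits = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(logits)
    out = [sum(w * v[j] for w, v in zip(weights, values))
           for j in range(len(values[0]))]
    return weights, out

def dual_channel_attend(query, keys, values, alpha=1.0, beta=1.0):
    """Hypothetical dual-channel variant: alpha rescales Key deltas
    (changing where attention goes), beta rescales Value deltas
    (changing what is aggregated)."""
    return attend(query, rescale_delta(keys, alpha), rescale_delta(values, beta))

if __name__ == "__main__":
    keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    query = [1.0, 0.5]
    for alpha, beta in [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0)]:
        weights, out = dual_channel_attend(query, keys, values, alpha, beta)
        print(alpha, beta, [round(w, 3) for w in weights], [round(o, 3) for o in out])
```

In this sketch, setting `alpha = beta = 1` recovers ordinary attention. The Key scale acts on the logits before the softmax, so it changes the attention weights, while the Value scale leaves the weights untouched and changes only the aggregated features. Whether the paper applies the scaling to all tokens or only to a subset is not stated in the text above.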
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Visual Attention and Saliency Detection · Cell Image Analysis Techniques
