MDE-Edit: Masked Dual-Editing for Multi-Object Image Editing via Diffusion Models

Hongyang Zhu; Haipeng Liu; Bo Fu; Yang Wang

arXiv:2505.05101·cs.CV·May 13, 2025

Open Access

TL;DR

MDE-Edit is a training-free method that optimizes at inference time to edit multiple objects precisely with diffusion models. It targets two failure modes in complex scenes: inaccurate localization of target objects and attribute-object mismatch.

Contribution

The paper proposes a dual-loss optimization approach that improves multi-object editing accuracy over existing methods without any additional training.

Findings

01 Outperforms state-of-the-art methods in editing accuracy
02 Achieves more coherent and localized multi-object edits
03 Demonstrates robustness in complex scenes

Abstract

Multi-object editing aims to modify multiple objects or regions in complex scenes while preserving structural coherence. This task faces significant challenges in scenarios involving overlapping or interacting objects: (1) Inaccurate localization of target objects due to attention misalignment, leading to incomplete or misplaced edits; (2) Attribute-object mismatch, where color or texture changes fail to align with intended regions due to cross-attention leakage, creating semantic conflicts (e.g., color bleeding into non-target areas). Existing methods struggle with these challenges: approaches relying on global cross-attention mechanisms suffer from attention dilution and spatial interference between objects, while mask-based methods fail to bind attributes to geometrically accurate regions due to feature entanglement in multi-object scenarios. To address these limitations, we…
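The cross-attention leakage described above can be made concrete with a small measurement. The following is only an illustrative sketch, not the paper's losses, which the truncated abstract does not specify. It assumes a hypothetical 2D cross-attention map for one prompt token and a binary mask for the object that token should affect, and reports the share of attention mass that lands outside the mask. The function name `attention_leakage` and the toy values are made up for illustration.

```python
from typing import List

Grid = List[List[float]]


def attention_leakage(attn: Grid, mask: Grid) -> float:
    """Return the fraction of attention mass that falls outside the mask.

    attn: non-negative attention weights for one token (hypothetical input).
    mask: same shape as attn, 1 inside the target object and 0 elsewhere.
    """
    total = 0.0
    inside = 0.0
    for attn_row, mask_row in zip(attn, mask):
        for weight, in_object in zip(attn_row, mask_row):
            total += weight
            inside += weight * in_object
    if total == 0.0:
        return 0.0  # no attention at all, so nothing can leak
    return 1.0 - inside / total


# Toy 2x3 attention map for an attribute token such as "red" (made-up values).
attn_red = [
    [0.4, 0.3, 0.1],
    [0.1, 0.1, 0.0],
]
# Toy mask: the target object covers only the first two cells of the top row.
mask_car = [
    [1, 1, 0],
    [0, 0, 0],
]

leak = attention_leakage(attn_red, mask_car)
print(f"leakage = {leak:.2f}")
```

A value near 0 means the attribute token attends almost entirely to its intended region. A larger value corresponds to the color-bleeding case in the abstract, where part of the attribute's attention falls on non-target areas.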

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Computer Graphics and Visualization Techniques · Image Enhancement Techniques

Methods: Softmax · Attention Is All You Need · Diffusion · ALIGN