LayerEdit: Disentangled Multi-Object Editing via Conflict-Aware Multi-Layer Learning

Fengyi Fu; Mengqi Huang; Lei Zhang; Zhendong Mao

arXiv:2511.08251·cs.CV·November 12, 2025

TL;DR

LayerEdit is a training-free framework for text-guided multi-object image editing. It decomposes the image into object layers, edits each layer, and fuses the layers with conflict awareness, so that modifications to multiple objects remain disentangled and conflict-free.

Contribution

This work introduces LayerEdit, a new multi-layer disentangled editing framework that effectively handles inter-object conflicts without training, enabling precise multi-object editing guided by text.

Findings

01 Outperforms existing methods in intra-object controllability.

02 Achieves high inter-object coherence in complex scenarios.

03 Demonstrates effective conflict suppression during editing.

Abstract

Text-driven multi-object image editing, which aims to precisely modify multiple objects within an image based on text descriptions, has recently attracted considerable interest. Existing works primarily follow the localize-editing paradigm, focusing on independent object localization and editing while neglecting critical inter-object interactions. However, this work points out that the neglected attention entanglements in inter-object conflict regions inherently hinder disentangled multi-object editing, leading to either inter-object editing leakage or intra-object editing constraints. We therefore propose LayerEdit, a novel training-free multi-layer disentangled editing framework that, for the first time, enables conflict-free object-layered editing through precise object-layered decomposition and coherent fusion. Specifically, LayerEdit introduces a novel…

Videos

LayerEdit: Disentangled Multi-Object Editing via Conflict-Aware Multi-Layer Learning· underline

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Visual Attention and Saliency Detection