CoreEditor: Correspondence-constrained Diffusion for Consistent 3D Editing
Zhe Zhu, Honghua Chen, Peng Li, Mingqiang Wei

TL;DR
CoreEditor is a correspondence-constrained diffusion framework for text-driven 3D editing that maintains multi-view consistency and lets users control the editing result.
Contribution
The paper proposes a correspondence-constrained attention mechanism and integrates semantic similarity estimated during denoising to improve multi-view consistency in 3D editing.
Findings
Produces sharper, more consistent 3D edits
Outperforms prior methods in quality and consistency
Offers flexible user-controlled editing options
Abstract
Text-driven 3D editing seeks to modify 3D scenes according to textual descriptions, and most existing approaches tackle this by adapting pre-trained 2D image editors to multi-view inputs. However, without explicit control over multi-view information exchange, they often fail to maintain cross-view consistency, leading to insufficient edits and blurry details. We introduce CoreEditor, a novel framework for consistent text-to-3D editing. The key innovation is a correspondence-constrained attention mechanism that enforces precise interactions between pixels expected to remain consistent throughout the diffusion denoising process. Beyond relying solely on geometric alignment, we further incorporate semantic similarity estimated during denoising, enabling more reliable correspondence modeling and robust multi-view editing. In addition, we design a selective editing pipeline that allows users…
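The correspondence-constrained attention described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`build_mask`, `constrained_attention`), the cosine-similarity threshold, and the choice to return zeros for pixels with no valid correspondence are all assumptions made for illustration. The idea shown is only that attention between views is restricted to pixel pairs that are both geometrically matched and semantically similar.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0

def build_mask(geo_mask, feats_a, feats_b, sim_threshold=0.5):
    """Keep a pair (i, j) only if it is a geometric match AND its features
    are similar enough. The threshold value is a hypothetical choice."""
    return [
        [geo_mask[i][j] and cosine(feats_a[i], feats_b[j]) >= sim_threshold
         for j in range(len(feats_b))]
        for i in range(len(feats_a))
    ]

def constrained_attention(queries, keys, values, mask):
    """Scaled dot-product attention where query i may only attend to keys j
    with mask[i][j] == True. Queries with no allowed key yield a zero vector
    (an assumption of this sketch, not something stated in the paper)."""
    d = len(queries[0])
    out = []
    for i, q in enumerate(queries):
        allowed = [j for j in range(len(keys)) if mask[i][j]]
        if not allowed:
            out.append([0.0] * len(values[0]))
            continue
        scores = [sum(a * b for a, b in zip(q, keys[j])) / math.sqrt(d)
                  for j in allowed]
        m = max(scores)  # subtract the max for a numerically stable softmax
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        out.append([
            sum(e / total * values[j][c] for e, j in zip(exps, allowed))
            for c in range(len(values[0]))
        ])
    return out
```

In this sketch, a pair that is geometrically aligned but semantically dissimilar is dropped from the mask, which mirrors the abstract's point that geometric alignment alone is not treated as sufficient.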
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · 3D Shape Modeling and Analysis · Computer Graphics and Visualization Techniques
