SyncNoise: Geometrically Consistent Noise Prediction for Text-based 3D Scene Editing
Ruihuang Li, Liyi Chen, Zhengqiang Zhang, Varun Jampani, Vishal M. Patel, Lei Zhang

TL;DR
SyncNoise introduces a geometry-guided multi-view consistent noise editing method for text-based 3D scene editing, achieving high-fidelity, globally consistent results across multiple viewpoints by enforcing geometric consistency and propagating local details.
Contribution
The paper presents an approach that edits multiple views of a 3D scene synchronously with 2D diffusion models while enforcing geometric consistency across views, addressing the slow convergence and over-smoothed textures of prior iterative dataset-update methods.
Findings
Achieves high-quality 3D editing respecting textual instructions.
Ensures global consistency in semantic structure and appearance.
Enhances local detail consistency through anchor views and cross-view reprojection.
Abstract
Text-based 2D diffusion models have demonstrated impressive capabilities in image generation and editing, and they also show substantial potential for 3D editing tasks. However, achieving consistent edits across multiple viewpoints remains a challenge. While the iterative dataset update method can achieve global consistency, it suffers from slow convergence and over-smoothed textures. We propose SyncNoise, a novel geometry-guided multi-view consistent noise editing approach for high-fidelity 3D scene editing. SyncNoise synchronously edits multiple views with 2D diffusion models while enforcing multi-view noise predictions to be geometrically consistent, which ensures global consistency in both semantic structure and low-frequency appearance. To further enhance local consistency in high-frequency details, we set a group of anchor views and…
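The idea of making noise predictions geometrically consistent across views can be illustrated with a small sketch. This is not the authors' implementation: it assumes a known depth map, a shared pinhole intrinsics matrix `K`, and a relative camera pose `T_src_to_dst`, and it uses plain averaging at matched pixels as a stand-in for whatever consistency constraint the paper actually applies. The function names `reproject` and `sync_noise` are hypothetical, and the noise maps are assumed to have the same resolution as the depth map.

```python
import numpy as np

def reproject(depth_src, K, T_src_to_dst):
    """Map every source pixel to (u, v) coordinates in the destination view.

    depth_src: (H, W) depth along the camera z-axis (assumed known).
    K: (3, 3) pinhole intrinsics shared by both views (assumption).
    T_src_to_dst: (4, 4) rigid transform from source to destination camera.
    """
    h, w = depth_src.shape
    v, u = np.mgrid[0:h, 0:w]
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).astype(np.float64)
    # Back-project pixels to 3D points in the source camera frame.
    pts_src = (np.linalg.inv(K) @ pix.T) * depth_src.reshape(1, -1)
    # Move the points into the destination camera frame.
    pts_h = np.vstack([pts_src, np.ones((1, pts_src.shape[1]))])
    pts_dst = (T_src_to_dst @ pts_h)[:3]
    # Perspective projection into the destination image plane.
    proj = K @ pts_dst
    uv = proj[:2] / proj[2:3]
    return uv.T.reshape(h, w, 2), pts_dst[2].reshape(h, w)

def sync_noise(noise_src, noise_dst, uv, z_dst):
    """Average two (H, W, C) noise maps at geometrically matched pixels.

    Simplifications of this sketch: nearest-pixel matching, no occlusion
    test, and if several source pixels land on the same destination pixel
    the last write wins.
    """
    h, w = noise_dst.shape[:2]
    u = np.rint(uv[..., 0]).astype(int)
    v = np.rint(uv[..., 1]).astype(int)
    # Keep only points in front of the camera that fall inside the image.
    valid = (z_dst > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    out_src, out_dst = noise_src.copy(), noise_dst.copy()
    mean = 0.5 * (noise_src[valid] + noise_dst[v[valid], u[valid]])
    out_src[valid] = mean
    out_dst[v[valid], u[valid]] = mean
    return out_src, out_dst
```

With an identity pose, every pixel reprojects onto itself, so the two noise maps become identical after synchronization. With a real pose, only pixels visible in both views are affected.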
Taxonomy
Topics: Human Motion and Animation · 3D Shape Modeling and Analysis · Image Processing and 3D Reconstruction
Methods: Sparse Evolutionary Training · Diffusion
