SyncNoise: Geometrically Consistent Noise Prediction for Text-based 3D Scene Editing

Ruihuang Li; Liyi Chen; Zhengqiang Zhang; Varun Jampani; Vishal M. Patel; Lei Zhang

arXiv:2406.17396·cs.CV·June 26, 2024

TL;DR

SyncNoise introduces a geometry-guided multi-view consistent noise editing method for text-based 3D scene editing, achieving high-fidelity, globally consistent results across multiple viewpoints by enforcing geometric consistency and propagating local details.

Contribution

The paper presents an approach that edits multiple views of a 3D scene synchronously with 2D diffusion models while enforcing geometric consistency across views, addressing the slow convergence and over-smoothed textures of prior iterative dataset update methods.

Findings

01 Achieves high-quality 3D editing that respects the textual instructions.

02 Ensures global consistency in semantic structure and appearance.

03 Enhances local detail consistency through anchor views and cross-view reprojection.

Abstract

Text-based 2D diffusion models have demonstrated impressive capabilities in image generation and editing. These models also show substantial potential for 3D editing tasks. However, achieving consistent edits across multiple viewpoints remains a challenge. While the iterative dataset update method can achieve global consistency, it suffers from slow convergence and over-smoothed textures. We propose SyncNoise, a novel geometry-guided multi-view consistent noise editing approach for high-fidelity 3D scene editing. SyncNoise synchronously edits multiple views with 2D diffusion models while enforcing multi-view noise predictions to be geometrically consistent, which ensures global consistency in both semantic structure and low-frequency appearance. To further enhance local consistency in high-frequency details, we set a group of anchor views and…
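To make the idea of "geometrically consistent noise predictions" concrete, the following is a minimal toy sketch, not the paper's implementation. It assumes that pixel correspondences across views are already known (for example from depth-based reprojection) and are given as an integer map of 3D point ids, and it assumes that consistency is imposed by simply averaging the predictions of pixels that see the same 3D point. The function name `sync_noise`, the `point_ids` input, and the averaging rule are all hypothetical choices made for illustration.

```python
import numpy as np


def sync_noise(noise, point_ids):
    """Average per-view noise predictions over pixels that see the same 3D point.

    noise:     float array of shape (V, H, W, C), one noise prediction per view.
    point_ids: int array of shape (V, H, W); pixels sharing an id >= 0 are
               assumed to observe the same 3D point, -1 marks no correspondence.
    Returns an array shaped like `noise`. This is an illustrative assumption,
    not the method described in the paper.
    """
    channels = noise.shape[-1]
    flat_noise = noise.reshape(-1, channels)
    flat_ids = point_ids.reshape(-1)
    valid = flat_ids >= 0

    out = flat_noise.copy()
    if not valid.any():
        return out.reshape(noise.shape)

    n_points = int(flat_ids.max()) + 1
    sums = np.zeros((n_points, channels), dtype=noise.dtype)
    counts = np.zeros(n_points, dtype=np.int64)

    # Unbuffered accumulation so repeated ids are summed correctly.
    np.add.at(sums, flat_ids[valid], flat_noise[valid])
    np.add.at(counts, flat_ids[valid], 1)

    # Guard against ids that never occur (count 0) to avoid division by zero.
    means = sums / np.maximum(counts, 1)[:, None]

    # Pixels with a correspondence receive the shared mean; others are untouched.
    out[valid] = means[flat_ids[valid]]
    return out.reshape(noise.shape)
```

In this sketch, pixels without a correspondence keep their original prediction, so only regions visible in several views are tied together. A real pipeline would also need to obtain the correspondences and decide at which denoising steps to apply the constraint, neither of which is covered here.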

Taxonomy

Topics: Human Motion and Animation · 3D Shape Modeling and Analysis · Image Processing and 3D Reconstruction

Methods: Sparse Evolutionary Training · Diffusion