DGE: Direct Gaussian 3D Editing by Consistent Multi-view Editing

Minghao Chen; Iro Laina; Andrea Vedaldi

arXiv:2404.18929 · cs.CV · December 2, 2024

TL;DR

DGE is a two-stage method for efficient, multi-view consistent 3D editing guided by language. It builds on 3D Gaussian Splatting and is reported to be faster and more accurate than approaches that iteratively update the 3D representation.

Contribution

The paper presents a training-free approach that makes 2D image editors multi-view consistent, then directly optimizes the 3D representation, enabling fast, accurate 3D scene editing from language instructions.

Findings

01. Significantly faster than existing methods

02. Achieves multi-view consistency without training

03. Allows selective editing of scene parts

Abstract

We consider the problem of editing 3D objects and scenes based on open-ended language instructions. A common approach to this problem is to use a 2D image generator or editor to guide the 3D editing process, obviating the need for 3D data. However, this process is often inefficient due to the need for iterative updates of costly 3D representations, such as neural radiance fields, either through individual view edits or score distillation sampling. A major disadvantage of this approach is the slow convergence caused by aggregating inconsistent information across views, as the guidance from 2D models is not multi-view consistent. We thus introduce the Direct Gaussian Editor (DGE), a method that addresses these issues in two stages. First, we modify a given high-quality image editor like InstructPix2Pix to be multi-view consistent. To do so, we propose a training-free approach that…
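The abstract's point about aggregating inconsistent information across views can be illustrated with a toy example. The sketch below is not the paper's method: it assumes a "scene" reduced to a single shared value (say, one surface color) that is fitted by gradient descent to per-view edited targets. The function name `fit_shared_value` and all numbers are hypothetical and chosen only for illustration. When the per-view targets agree, the fit reaches zero loss; when they disagree, the best the shared value can do is their average, and a residual error remains no matter how long the optimization runs.

```python
def fit_shared_value(targets, steps=200, lr=0.1):
    """Fit one shared value to per-view targets by gradient descent on the MSE.

    Toy stand-in for optimizing a 3D representation against edited 2D views;
    the single float plays the role of the 3D scene (an assumption for illustration).
    """
    value = 0.0
    for _ in range(steps):
        # Gradient of mean((value - t)^2) with respect to value.
        grad = sum(2.0 * (value - t) for t in targets) / len(targets)
        value -= lr * grad
    loss = sum((value - t) ** 2 for t in targets) / len(targets)
    return value, loss


# Hypothetical per-view targets: the same edit seen from four views.
consistent = [0.8, 0.8, 0.8, 0.8]      # every view agrees
inconsistent = [0.2, 0.9, 0.5, 1.0]    # each view was edited independently

value_c, loss_c = fit_shared_value(consistent)
value_i, loss_i = fit_shared_value(inconsistent)

print(f"consistent:   value={value_c:.4f}  loss={loss_c:.4f}")
print(f"inconsistent: value={value_i:.4f}  loss={loss_i:.4f}")
```

With consistent targets the shared value matches every view. With inconsistent targets it settles at their mean and the loss stays at their variance, which is the toy analogue of the blurred or averaged result that multi-view inconsistent guidance tends to produce.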

Taxonomy

Topics: Robotics and Sensor-Based Localization · Advanced Vision and Imaging · Computer Graphics and Visualization Techniques