Instruct 4D-to-4D: Editing 4D Scenes as Pseudo-3D Scenes Using 2D Diffusion

Linzhan Mou; Jun-Kun Chen; Yu-Xiong Wang

arXiv:2406.09402·cs.CV·June 14, 2024



TL;DR

This paper introduces Instruct 4D-to-4D, a method that enables consistent and detailed editing of 4D scenes by treating them as pseudo-3D scenes, extending 2D diffusion models to handle spatial-temporal data.

Contribution

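The paper decouples 4D scene editing into two sub-problems, temporally consistent video editing and applying those edits to a pseudo-3D scene, and augments the Instruct-Pix2Pix (IP2P) 2D diffusion model with an anchor-aware attention module and optical flow-guided appearance propagation.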

Findings

01 Achieves spatially and temporally consistent 4D scene editing.

02 Enhances detail and sharpness over previous methods.

03 Applicable to both monocular and multi-camera scenes.

Abstract

This paper proposes Instruct 4D-to-4D that achieves 4D awareness and spatial-temporal consistency for 2D diffusion models to generate high-quality instruction-guided dynamic scene editing results. Traditional applications of 2D diffusion models in dynamic scene editing often result in inconsistency, primarily due to their inherent frame-by-frame editing methodology. Addressing the complexities of extending instruction-guided editing to 4D, our key insight is to treat a 4D scene as a pseudo-3D scene, decoupled into two sub-problems: achieving temporal consistency in video editing and applying these edits to the pseudo-3D scene. Following this, we first enhance the Instruct-Pix2Pix (IP2P) model with an anchor-aware attention module for batch processing and consistent editing. Additionally, we integrate optical flow-guided appearance propagation in a sliding window fashion for more precise…
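The abstract mentions propagating appearance with optical flow in a sliding window fashion. The following is a minimal sketch of those two ideas only, not the authors' implementation: the helper names `sliding_windows` and `warp_backward`, the window and stride parameters, the grayscale frame layout, and the use of integer pixel offsets in place of a real sub-pixel flow field are all assumptions made for illustration.

```python
from typing import List, Tuple

Frame = List[List[float]]           # H x W grayscale image (illustrative layout)
Flow = List[List[Tuple[int, int]]]  # per-pixel integer (dy, dx) offsets into the source


def sliding_windows(num_frames: int, window: int, stride: int) -> List[List[int]]:
    """Split frame indices into overlapping windows (hypothetical helper)."""
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    windows = []
    start = 0
    while start < num_frames:
        end = min(start + window, num_frames)
        windows.append(list(range(start, end)))
        if end == num_frames:  # last window reached the final frame
            break
        start += stride
    return windows


def warp_backward(source: Frame, flow: Flow) -> Frame:
    """Fill each target pixel from source[y + dy][x + dx], clamped to the border."""
    h, w = len(source), len(source[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            dy, dx = flow[y][x]
            sy = min(max(y + dy, 0), h - 1)
            sx = min(max(x + dx, 0), w - 1)
            row.append(source[sy][sx])
        out.append(row)
    return out


if __name__ == "__main__":
    # Seven frames, windows of four frames that overlap by one frame.
    print(sliding_windows(7, window=4, stride=3))
    # Carry a tiny "edited" frame to a neighbor whose pixels all map one column right.
    edited = [[1.0, 2.0], [3.0, 4.0]]
    flow = [[(0, 1), (0, 1)], [(0, 1), (0, 1)]]
    print(warp_backward(edited, flow))
```

In this sketch, a stride smaller than the window makes consecutive windows share frames, so an edited frame from one window can be warped into the next. A practical pipeline would use a learned flow estimator with bilinear sampling and occlusion handling, which this example omits.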


Taxonomy

Topics: 3D Modeling in Geospatial Applications

Methods: Diffusion