DriveEditor: A Unified 3D Information-Guided Framework for Controllable Object Editing in Driving Scenes

Yiyuan Liang; Zhiying Yan; Liqun Chen; Jiahuan Zhou; Luxin Yan; Sheng Zhong; Xu Zou

arXiv:2412.19458·cs.CV·December 31, 2024

TL;DR

DriveEditor is a diffusion-based framework that enables precise and diverse object editing in driving videos, including repositioning, replacement, deletion, and insertion, while maintaining high fidelity and consistency.

Contribution

The paper introduces a unified, diffusion-based approach with novel position control and appearance maintenance modules tailored for driving scene editing.

Findings

01. Achieves high-fidelity object manipulation in driving videos

02. Demonstrates superior controllability and diversity in scene editing

03. Facilitates downstream tasks with realistic scene modifications

Abstract

Vision-centric autonomous driving systems require diverse data for robust training and evaluation, which can be augmented by manipulating object positions and appearances within existing scene captures. While recent advancements in diffusion models have shown promise in video editing, their application to object manipulation in driving scenarios remains challenging due to imprecise positional control and difficulties in preserving high-fidelity object appearances. To address these challenges in position and appearance control, we introduce DriveEditor, a diffusion-based framework for object editing in driving videos. DriveEditor offers a unified framework for comprehensive object editing operations, including repositioning, replacement, deletion, and insertion. These diverse manipulations are all achieved through a shared set of varying inputs, processed by identical position control…

Code & Models

Repositories

yvanliang/DriveEditor
PyTorch · Official

Taxonomy

Topics: Computer Graphics and Visualization Techniques · 3D Shape Modeling and Analysis · Image Processing and 3D Reconstruction

Methods: Sparse Evolutionary Training · Diffusion