EMD: Explicit Motion Modeling for High-Quality Street Gaussian Splatting

Xiaobao Wei; Qingpo Wuwu; Zhongyu Zhao; Zhuangzhe Wu; Nan Huang; Ming Lu; Ningning MA; Shanghang Zhang

arXiv:2411.15582 · cs.CV · July 10, 2025

Open Access

TL;DR

This paper introduces Explicit Motion Decomposition (EMD), a method that models the motions of dynamic objects in street scenes with learnable embeddings, improving Gaussian Splatting for photorealistic scene reconstruction.

Contribution

The paper proposes a plug-and-play EMD module, together with tailored training strategies, that improves motion modeling in street-scene Gaussian Splatting and achieves state-of-the-art results in self-supervised view synthesis.

Findings

01. EMD effectively models dynamic object motions in street scenes.

02. The method achieves superior novel view synthesis performance.

03. It enhances scene decomposition accuracy in complex environments.

Abstract

Photorealistic reconstruction of street scenes is essential for developing real-world simulators in autonomous driving. While recent methods based on 3D/4D Gaussian Splatting (GS) have demonstrated promising results, they still encounter challenges in complex street scenes due to the unpredictable motion of dynamic objects. Current methods typically decompose street scenes into static and dynamic objects, learning the Gaussians in either a supervised manner (e.g., w/ 3D bounding-box) or a self-supervised manner (e.g., w/o 3D bounding-box). However, these approaches do not effectively model the motions of dynamic objects (e.g., the motion speed of pedestrians is clearly different from that of vehicles), resulting in suboptimal scene decomposition. To address this, we propose Explicit Motion Decomposition (EMD), which models the motions of dynamic objects by introducing learnable motion…

Taxonomy

Topics: Geophysical Methods and Applications

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings