4D3R: Motion-Aware Neural Reconstruction and Rendering of Dynamic Scenes from Monocular Videos
Mengqi Guo, Bo Xu, Yanyan Li, Gim Hee Lee

TL;DR
4D3R is a novel pose-free neural rendering framework that effectively reconstructs and renders dynamic scenes from monocular videos by combining motion-aware refinement, segmentation, and efficient Gaussian splatting, outperforming existing methods.
Contribution
The paper introduces a two-stage approach that pairs motion-aware bundle adjustment (MA-BA) with Gaussian splatting to reconstruct dynamic scenes without pre-computed camera poses.
Findings
Achieves up to 1.8 dB PSNR improvement over state-of-the-art methods.
Reduces computational cost by 5x compared to previous dynamic scene representations.
Effectively handles large dynamic objects in real-world datasets.
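To put the PSNR figure in context, here is a minimal sketch of the standard PSNR definition and of what a gain in dB implies for mean squared error. The function names `psnr` and `mse_ratio_for_gain` are illustrative and not taken from the paper, and the sketch assumes images normalized so that the peak value is 1.0.

```python
import math


def psnr(mse: float, max_val: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    if mse <= 0:
        raise ValueError("mse must be positive")
    return 10.0 * math.log10(max_val ** 2 / mse)


def mse_ratio_for_gain(gain_db: float) -> float:
    """Factor by which MSE shrinks when PSNR rises by gain_db (same peak value)."""
    return 10.0 ** (gain_db / 10.0)


if __name__ == "__main__":
    # Hypothetical MSE value, chosen only to exercise the function.
    print(round(psnr(0.01), 2))
    # MSE reduction factor implied by a 1.8 dB gain.
    print(round(mse_ratio_for_gain(1.8), 3))
```

Under this definition, a 1.8 dB gain corresponds to a mean squared error roughly 1.5 times smaller, assuming both methods are evaluated against the same peak value.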
Abstract
Novel view synthesis from monocular videos of dynamic scenes with unknown camera poses remains a fundamental challenge in computer vision and graphics. While recent advances in 3D representations such as Neural Radiance Fields (NeRF) and 3D Gaussian Splatting (3DGS) have shown promising results for static scenes, they struggle with dynamic content and typically rely on pre-computed camera poses. We present 4D3R, a pose-free dynamic neural rendering framework that decouples static and dynamic components through a two-stage approach. Our method first leverages 3D foundational models for initial pose and geometry estimation, followed by motion-aware refinement. 4D3R introduces two key technical innovations: (1) a motion-aware bundle adjustment (MA-BA) module that combines transformer-based learned priors with SAM2 for robust dynamic object segmentation, enabling more accurate camera pose…
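The general idea of excluding dynamic content from a pose objective can be illustrated with a small sketch. This is not the paper's MA-BA module: it assumes a plain pinhole camera, points already expressed in the camera frame, and a per-point boolean flag standing in for a segmentation mask. The names `project` and `static_reprojection_error` and the default intrinsics are hypothetical.

```python
from typing import Sequence, Tuple

Point3 = Tuple[float, float, float]
Pixel = Tuple[float, float]


def project(p: Point3, fx: float, fy: float, cx: float, cy: float) -> Pixel:
    """Pinhole projection of a camera-frame 3D point to pixel coordinates."""
    x, y, z = p
    return (fx * x / z + cx, fy * y / z + cy)


def static_reprojection_error(
    points_cam: Sequence[Point3],
    observed: Sequence[Pixel],
    is_dynamic: Sequence[bool],
    fx: float = 500.0,  # assumed intrinsics, for illustration only
    fy: float = 500.0,
    cx: float = 320.0,
    cy: float = 240.0,
) -> float:
    """Mean squared reprojection error over points not flagged as dynamic."""
    total, count = 0.0, 0
    for p, (u_obs, v_obs), dyn in zip(points_cam, observed, is_dynamic):
        if dyn:
            continue  # a moving point would bias the pose estimate, so skip it
        u, v = project(p, fx, fy, cx, cy)
        total += (u - u_obs) ** 2 + (v - v_obs) ** 2
        count += 1
    if count == 0:
        raise ValueError("no static points left after masking")
    return total / count


if __name__ == "__main__":
    pts = [(0.0, 0.0, 2.0), (1.0, 0.0, 4.0), (0.5, 0.5, 1.0)]
    obs = [project(p, 500.0, 500.0, 320.0, 240.0) for p in pts]
    # Simulate object motion: the third observation drifts by 30 pixels.
    obs[2] = (obs[2][0] + 30.0, obs[2][1])
    print(static_reprojection_error(pts, obs, [False, False, True]))
    print(static_reprojection_error(pts, obs, [False, False, False]))
```

In this toy setup the masked error is zero while the unmasked error is not, which shows why a point that moved on its own should not be attributed to camera motion.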
Taxonomy
Topics: 3D Shape Modeling and Analysis · Robot Manipulation and Learning · Human Pose and Action Recognition
