TL;DR
Flux4D is a scalable, unsupervised framework for 4D reconstruction of large-scale dynamic scenes from visual data, outperforming existing methods in speed, scalability, and generalization.
Contribution
The paper introduces Flux4D, a method that directly predicts 3D Gaussians and their motion without supervision, enabling efficient reconstruction of large-scale dynamic scenes.
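To make "3D Gaussians and their motion" concrete, here is a minimal sketch of one possible representation: each Gaussian carries a center plus a velocity, and the scene at time t is obtained by moving every center along its velocity. The constant-velocity motion model, the `MovingGaussian` class, and the `advance` and `scene_at` helpers are all illustrative assumptions, not taken from the paper. The sketch covers only the data layout, not the prediction network or the renderer.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class MovingGaussian:
    """Hypothetical container: one 3D Gaussian plus a constant velocity."""
    mean: Vec3        # center at the reference time t = 0
    velocity: Vec3    # assumed constant linear velocity (units per second)
    scale: Vec3       # per-axis extent; left unchanged by motion in this sketch
    opacity: float


def advance(g: MovingGaussian, t: float) -> MovingGaussian:
    """Return a copy of g with its center moved to time t (linear motion assumed)."""
    moved = tuple(m + v * t for m, v in zip(g.mean, g.velocity))
    return MovingGaussian(moved, g.velocity, g.scale, g.opacity)


def scene_at(gaussians: List[MovingGaussian], t: float) -> List[MovingGaussian]:
    """Advance every Gaussian in the scene to time t."""
    return [advance(g, t) for g in gaussians]


# A static background element and a moving actor, both made-up values.
static_wall = MovingGaussian((0.0, 0.0, 5.0), (0.0, 0.0, 0.0), (1.0, 1.0, 0.1), 0.9)
moving_car = MovingGaussian((2.0, 0.0, 10.0), (-4.0, 0.0, 0.0), (0.5, 0.5, 0.5), 0.8)

snapshot = scene_at([static_wall, moving_car], 0.5)
```

In this layout a static element is simply a Gaussian with zero velocity, so static and dynamic content share one representation and no per-actor annotation is needed to tell them apart.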
Findings
Flux4D reconstructs scenes within seconds.
It scales effectively to large datasets.
It generalizes well to unseen environments.
Abstract
Reconstructing large-scale dynamic scenes from visual observations is a fundamental challenge in computer vision, with critical implications for robotics and autonomous systems. While recent differentiable rendering methods such as Neural Radiance Fields (NeRF) and 3D Gaussian Splatting (3DGS) have achieved impressive photorealistic reconstruction, they suffer from scalability limitations and require annotations to decouple actor motion. Existing self-supervised methods attempt to eliminate explicit annotations by leveraging motion cues and geometric priors, yet they remain constrained by per-scene optimization and sensitivity to hyperparameter tuning. In this paper, we introduce Flux4D, a simple and scalable framework for 4D reconstruction of large-scale dynamic scenes. Flux4D directly predicts 3D Gaussians and their motion dynamics to reconstruct sensor observations in a fully…