UFO: Unifying Feed-Forward and Optimization-based Methods for Large Driving Scene Modeling
Kaiyuan Tan, Yingying Shen, Mingfei Tu, Haohui Zhu, Bing Wang, Guang Chen, Hangjun Ye, Haiyang Sun

TL;DR
UFO introduces a recurrent framework that unifies optimization-based and feed-forward methods for efficient, accurate long-range 4D scene reconstruction in autonomous driving, enabling real-time processing of extended driving sequences.
Contribution
The paper presents UFO, a novel recurrent paradigm that combines the strengths of optimization and feed-forward approaches for scalable, long-range 4D scene modeling in driving scenarios.
Findings
Outperforms existing methods on the Waymo dataset
Reconstructs 16-second sequences in under 0.5 seconds
Maintains high visual quality and geometric accuracy
Abstract
Dynamic driving scene reconstruction is critical for autonomous driving simulation and closed-loop learning. While recent feed-forward methods have shown promise for 3D reconstruction, they struggle with long-range driving sequences due to quadratic complexity in sequence length and challenges in modeling dynamic objects over extended durations. We propose UFO, a novel recurrent paradigm that combines the benefits of optimization-based and feed-forward methods for efficient long-range 4D reconstruction. Our approach maintains a 4D scene representation that is iteratively refined as new observations arrive, using a visibility-based filtering mechanism to select informative scene tokens and enable efficient processing of long sequences. For dynamic objects, we introduce an object pose-guided modeling approach that supports accurate long-range motion capture. Experiments on the Waymo Open…
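The abstract's central idea, refining a persistent scene representation frame by frame while touching only the tokens that are currently visible, can be illustrated with a toy sketch. This is not the paper's implementation: the cone-based visibility test, the `observe` callback, the blending `step`, and all names and parameter values below are assumptions made for illustration only. The sketch shows why the per-frame cost depends on the number of visible tokens rather than on the full sequence length.

```python
import numpy as np

def visible_mask(points, cam_pos, cam_dir, max_range=50.0, half_fov_deg=60.0):
    """Toy visibility test (assumed, not from the paper): a point is visible
    if it is within max_range of the camera and inside a viewing cone."""
    offsets = points - cam_pos
    dist = np.linalg.norm(offsets, axis=1)
    cam_dir = cam_dir / np.linalg.norm(cam_dir)
    cos_angle = (offsets @ cam_dir) / np.maximum(dist, 1e-9)
    return (dist <= max_range) & (cos_angle >= np.cos(np.radians(half_fov_deg)))

def recurrent_refine(points, feats, cam_positions, cam_dir, observe, step=0.5):
    """Refine token features one frame at a time, updating only visible tokens."""
    feats = feats.copy()
    touched_per_frame = []
    for cam_pos in cam_positions:
        mask = visible_mask(points, cam_pos, cam_dir)
        # Hypothetical update rule: blend visible tokens toward the new observation.
        target = observe(points[mask])
        feats[mask] += step * (target - feats[mask])
        touched_per_frame.append(int(mask.sum()))
    return feats, touched_per_frame

# Synthetic scene: 1000 tokens scattered along a 500 m stretch of road.
rng = np.random.default_rng(0)
points = np.column_stack([
    rng.uniform(0, 500, 1000),   # x: along the road
    rng.uniform(-10, 10, 1000),  # y: lateral offset
    np.zeros(1000),              # z: flat ground
])
feats = np.zeros((1000, 8))
cam_positions = [np.array([x, 0.0, 0.0]) for x in range(0, 400, 20)]
refined, touched = recurrent_refine(
    points, feats, cam_positions, np.array([1.0, 0.0, 0.0]),
    observe=lambda p: np.ones((len(p), 8)),  # stand-in for image-derived features
)
print(max(touched), "of", len(points), "tokens touched in the busiest frame")
```

In this sketch each frame updates only the tokens inside the camera's range and viewing cone, so tokens far from the current camera position are left unchanged. A learned model would replace both the visibility heuristic and the update rule.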
Taxonomy
Topics: Advanced Vision and Imaging · Robotics and Sensor-Based Localization · 3D Shape Modeling and Analysis
