RMP-YOLO: A Robust Motion Predictor for Partially Observable Scenarios even if You Only Look Once
Jiawei Sun, Jiahui Li, Tingchen Liu, Chengran Yuan, Shuo Sun, Zefan Huang, Anthony Wong, Keng Peng Tee, and Marcelo H. Ang Jr

TL;DR
RMP-YOLO is a novel motion prediction framework that reconstructs incomplete historical trajectories to improve accuracy in partially observable scenarios, demonstrating robustness and state-of-the-art results.
Contribution
The paper introduces a new paradigm focusing on reconstructing trajectories before prediction, with a scene tokenization and recovery module that enhances robustness to missing data.
Findings
Achieves state-of-the-art performance in the 2024 Waymo Motion Prediction Competition.
Effectively handles missing data and observation noise.
Seamlessly integrates with existing prediction models.
Abstract
We introduce RMP-YOLO, a unified framework designed to provide robust motion predictions even with incomplete input data. Our key insight stems from the observation that complete and reliable historical trajectory data plays a pivotal role in ensuring accurate motion prediction. Therefore, we propose a new paradigm that prioritizes the reconstruction of intact historical trajectories before feeding them into the prediction modules. Our approach introduces a novel scene tokenization module to enhance the extraction and fusion of spatial and temporal features. Following this, our proposed recovery module reconstructs agents' incomplete historical trajectories by leveraging local map topology and interactions with nearby agents. The reconstructed, clean historical data is then integrated into the downstream prediction modules. Our framework is able to effectively handle missing data of…
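The reconstruct-then-predict ordering described in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the paper's recovery module is learned and uses map topology and nearby agents, whereas this example substitutes plain linear interpolation for recovery and a constant-velocity rule for prediction. The names `recover_history` and `predict_constant_velocity` are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def recover_history(history: List[Optional[Point]]) -> List[Point]:
    """Fill unobserved steps (None) by linear interpolation.

    Assumption: a simple stand-in for a learned recovery module.
    Gaps at the start or end copy the nearest observed point.
    """
    observed = [i for i, p in enumerate(history) if p is not None]
    if not observed:
        raise ValueError("at least one observed point is required")
    filled: List[Point] = []
    for t, p in enumerate(history):
        if p is not None:
            filled.append(p)
            continue
        # Nearest observed indices before and after the gap, if any.
        prev = max((i for i in observed if i < t), default=None)
        nxt = min((i for i in observed if i > t), default=None)
        if prev is None:
            filled.append(history[nxt])
        elif nxt is None:
            filled.append(history[prev])
        else:
            w = (t - prev) / (nxt - prev)
            (x0, y0), (x1, y1) = history[prev], history[nxt]
            filled.append((x0 + w * (x1 - x0), y0 + w * (y1 - y0)))
    return filled


def predict_constant_velocity(history: List[Point], horizon: int) -> List[Point]:
    """Extrapolate the last displacement for `horizon` future steps.

    Assumption: a placeholder for a downstream prediction module.
    """
    if len(history) < 2:
        return [history[-1]] * horizon
    (x0, y0), (x1, y1) = history[-2], history[-1]
    vx, vy = x1 - x0, y1 - y0
    return [(x1 + vx * k, y1 + vy * k) for k in range(1, horizon + 1)]


# A partially observed history: two of five steps are missing.
partial = [(0.0, 0.0), None, (2.0, 2.0), None, (4.0, 4.0)]
recovered = recover_history(partial)          # reconstruct first
future = predict_constant_velocity(recovered, horizon=2)  # then predict
```

The point of the sketch is only the ordering: the predictor never sees the gaps, because the history is completed before it is passed downstream.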
Taxonomy
Topics: Advanced Vision and Imaging · Human Pose and Action Recognition · Video Analysis and Summarization
