Motion Transformer with Global Intention Localization and Local Movement Refinement
Shaoshuai Shi, Li Jiang, Dengxin Dai, Bernt Schiele

TL;DR
This paper introduces Motion Transformer (MTR), a framework for multimodal motion prediction of traffic participants. It uses a small set of learnable motion queries to make training more stable and prediction more efficient, and reports state-of-the-art results on the Waymo Open Motion Dataset.
Contribution
MTR models motion prediction as the joint optimization of global intention localization and local movement refinement. It uses learnable motion query pairs in place of dense goal candidates, which improves training stability and multimodal prediction quality.
Findings
Achieves state-of-the-art performance on the Waymo Open Motion Dataset
Outperforms existing methods in both marginal and joint motion prediction tasks
Demonstrates stable training and efficient multimodal trajectory prediction
Abstract
Predicting the multimodal future behavior of traffic participants is essential for robotic vehicles to make safe decisions. Existing works either directly predict future trajectories from latent features or use dense goal candidates to identify agents' destinations. The former strategy converges slowly because all motion modes are derived from the same feature, while the latter has an efficiency issue because its performance relies heavily on the density of goal candidates. In this paper, we propose the Motion TRansformer (MTR) framework, which models motion prediction as the joint optimization of global intention localization and local movement refinement. Instead of using goal candidates, MTR incorporates spatial intention priors by adopting a small set of learnable motion query pairs. Each motion query pair takes charge of trajectory prediction and refinement for a specific…
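The two stages named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the nearest-anchor rule, and the fixed-rate update are all assumptions made for illustration. It assumes that each motion query is tied to one 2D "intention point", that global localization picks the intention point closest to a target endpoint, and that local refinement nudges the estimate toward the target over a few steps. In a real model the update would be predicted by a network from scene context rather than computed from a known target.

```python
import math

def localize_intention(endpoint, intention_points):
    """Global step (assumed rule): index of the intention point nearest to `endpoint`."""
    return min(
        range(len(intention_points)),
        key=lambda k: math.dist(endpoint, intention_points[k]),
    )

def refine(estimate, target, num_layers=3, rate=0.5):
    """Local step (toy stand-in): move the estimate a fraction `rate` toward
    `target` once per layer. `num_layers` and `rate` are hypothetical knobs."""
    x, y = estimate
    for _ in range(num_layers):
        x += rate * (target[0] - x)
        y += rate * (target[1] - y)
    return (x, y)

# Three hypothetical intention points: ahead, left, behind (in metres).
intention_points = [(10.0, 0.0), (0.0, 10.0), (-10.0, 0.0)]
target_endpoint = (8.0, 3.0)

k = localize_intention(target_endpoint, intention_points)
refined = refine(intention_points[k], target_endpoint)

error_before = math.dist(intention_points[k], target_endpoint)
error_after = math.dist(refined, target_endpoint)
print(k, refined, error_before > error_after)
```

The point of the sketch is the division of labour: a small, fixed set of anchors covers the coarse modes, so the set does not need to be dense, and the refinement step handles the remaining local error for whichever mode was selected.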
Taxonomy
Topics: Autonomous Vehicle Technology and Safety · Human Pose and Action Recognition · Video Surveillance and Tracking Methods
