AMP: Autoregressive Motion Prediction Revisited with Next Token Prediction for Autonomous Driving
Xiaosong Jia, Shaoshuai Shi, Zijun Chen, Li Jiang, Wenlong Liao, Tao He, Junchi Yan

TL;DR
This paper introduces a GPT-style autoregressive approach for motion prediction in autonomous driving, utilizing specialized attention modules and position encodings to improve accuracy and outperform existing methods.
Contribution
It proposes a novel autoregressive motion prediction framework with tailored attention and encoding strategies, achieving state-of-the-art results on autonomous driving datasets.
Findings
Outperforms recent autoregressive motion prediction methods like MotionLM and StateTransformer.
Achieves state-of-the-art performance on Waymo datasets.
Effectively captures complex spatial-temporal and semantic relations in driving scenes.
Abstract
As an essential task in autonomous driving (AD), motion prediction aims to predict the future states of surrounding objects for navigation. One natural solution is to estimate the positions of other agents step by step, where each predicted time-step is conditioned on both the observed time-steps and the previously predicted time-steps, i.e., autoregressive prediction. Pioneering works like SocialLSTM and MFP design their decoders based on this intuition. However, almost all state-of-the-art works assume that all predicted time-steps are conditionally independent given the observed time-steps, and they use a single linear layer to generate the positions of all time-steps simultaneously. They dominate most motion prediction leaderboards due to the simplicity of training MLPs compared to autoregressive networks. In this paper, we introduce the GPT-style next token prediction into motion…
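The contrast the abstract draws between step-by-step (autoregressive) decoding and one-shot (independent) decoding can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names are hypothetical, and a toy constant-velocity rule stands in for the learned network. It is not the paper's architecture.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]


def autoregressive_rollout(
    observed: Sequence[Point],
    step_fn: Callable[[Sequence[Point]], Point],
    horizon: int,
) -> List[Point]:
    """Predict `horizon` future positions one step at a time.

    Each new step is conditioned on the observed steps AND on every
    step predicted so far, which is the autoregressive assumption.
    """
    context = list(observed)  # copy so the caller's history is not mutated
    future: List[Point] = []
    for _ in range(horizon):
        nxt = step_fn(context)  # sees observed + previously predicted steps
        future.append(nxt)
        context.append(nxt)  # feed the prediction back in
    return future


def independent_decode(
    observed: Sequence[Point],
    head_fn: Callable[[Sequence[Point], int], Sequence[Point]],
    horizon: int,
) -> List[Point]:
    """Predict all future positions in one shot from the observed steps only."""
    future = list(head_fn(observed, horizon))
    if len(future) != horizon:
        raise ValueError("head_fn must return exactly `horizon` points")
    return future


# Hypothetical stand-ins for a learned model (assumption: constant velocity).
def constant_velocity_step(context: Sequence[Point]) -> Point:
    (x0, y0), (x1, y1) = context[-2], context[-1]
    return (2 * x1 - x0, 2 * y1 - y0)


def constant_velocity_head(observed: Sequence[Point], horizon: int) -> List[Point]:
    (x0, y0), (x1, y1) = observed[-2], observed[-1]
    vx, vy = x1 - x0, y1 - y0
    return [(x1 + vx * k, y1 + vy * k) for k in range(1, horizon + 1)]


if __name__ == "__main__":
    history = [(0.0, 0.0), (1.0, 0.5)]
    print(autoregressive_rollout(history, constant_velocity_step, 3))
    print(independent_decode(history, constant_velocity_head, 3))
```

With this toy rule both decoders return the same trajectory. The difference is structural: only the rollout lets a later step depend on earlier predictions, which is the property a learned autoregressive decoder can exploit.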
Taxonomy
Topics: Human Pose and Action Recognition · Advanced Neural Network Applications · Medical Image Segmentation Techniques
Methods: Attention Is All You Need · Weight Decay · Layer Normalization · Multi-Head Attention · Cosine Annealing · Dropout · Byte Pair Encoding · Discriminative Fine-Tuning · Residual Connection
