Self-Discovered Intention-aware Transformer for Multi-modal Vehicle Trajectory Prediction

Diyi Liu; Zihan Niu; Tu Xu; Lishan Sun

arXiv:2604.07126·cs.RO·April 9, 2026

TL;DR

This paper introduces a Transformer-based model for vehicle trajectory prediction that considers multiple modalities and intentions, improving flexibility and performance in autonomous driving scenarios.

Contribution

It presents a novel pure Transformer architecture with dual tracks for trajectory prediction and intention likelihood estimation, enhancing modularity and accuracy.

Findings

01. Separate spatial and trajectory modules improve performance.

02. Model learns ordered trajectory groups via residual offsets.

03. Dual-track design increases prediction accuracy.

Abstract

Predicting vehicle trajectories plays an important role in autonomous driving and intelligent transportation system (ITS) applications. Although many deep learning algorithms have been devised to predict vehicle trajectories, their reliance on a specific graph structure (e.g., a Graph Neural Network) or on explicit intention labeling limits their flexibility. In this study, we propose a pure Transformer-based network that predicts multiple trajectory modes for a vehicle while considering its neighboring vehicles. Two separate tracks are employed: one predicts the trajectories, while the other predicts the likelihood of each intention given the neighboring vehicles. We find that the two-track design improves performance by separating the spatial module from the trajectory-generating module. We also find that the model can learn an ordered group of trajectories by predicting residual offsets among the K trajectories.
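
The abstract does not spell out how the residual offsets or the intention likelihoods are computed, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that one track outputs a base trajectory plus K-1 offset sequences, that mode k is formed by adding offset k to mode k-1 (which is one way an ordered group could arise), and that the other track outputs one raw score per mode that a softmax turns into a likelihood. The names `compose_modes` and `softmax` and all numbers are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw per-mode scores into likelihoods that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def compose_modes(base, offsets):
    """Build K trajectories from a base trajectory and K-1 residual offsets.

    Assumption (not from the paper): mode k = mode k-1 + offset k,
    applied waypoint by waypoint.
    base:    list of (x, y) waypoints
    offsets: list of K-1 lists of (dx, dy), each as long as base
    """
    modes = [list(base)]
    for off in offsets:
        prev = modes[-1]
        modes.append([(px + dx, py + dy) for (px, py), (dx, dy) in zip(prev, off)])
    return modes


# Hypothetical outputs of the trajectory track: a straight base path
# and two offsets that each shift the path further to one side.
base = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
offsets = [
    [(0.0, 0.0), (0.0, 0.5), (0.0, 1.0)],
    [(0.0, 0.0), (0.0, 0.5), (0.0, 1.0)],
]
modes = compose_modes(base, offsets)

# Hypothetical outputs of the intention track: one raw score per mode.
probs = softmax([2.0, 0.5, -1.0])
best = max(range(len(modes)), key=lambda k: probs[k])
```

Because each mode is defined relative to the previous one, offsets that share a sign keep the modes sorted along that direction. That is the intuition this sketch is meant to convey, and the actual model may parameterize the offsets differently.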
