Multi-Head Attention for Multi-Modal Joint Vehicle Motion Forecasting
Jean Mercat, Thomas Gilles, Nicole El Zoghby, Guillaume Sandou, Dominique Beauvois, Guillermo Pita Gil

TL;DR
This paper introduces a multi-head attention-based vehicle motion forecasting model that predicts joint, multi-modal vehicle positions using only position tracks, outperforming existing models in accuracy and versatility.
Contribution
It proposes a novel multi-head attention architecture for joint vehicle motion forecasting that does not require maneuver definitions or spatial grids, enhancing flexibility and performance.
Findings
Achieves a better prediction likelihood than state-of-the-art models evaluated on the same dataset.
Produces joint, multi-modal probability density forecasts.
Does not rely on maneuver definitions or spatial scene representation.
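A multi-modal probability density forecast is typically scored by its likelihood on the observed position. The sketch below illustrates that idea for a single time step and a single vehicle. It assumes a mixture of 2D Gaussians with diagonal covariance; the function name `mixture_nll` and this parameterization are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def mixture_nll(target, weights, means, sigmas):
    """Negative log-likelihood of one 2D position under a Gaussian mixture.

    target:  (2,)   observed (x, y) position
    weights: (K,)   mixture weights, assumed positive and summing to 1
    means:   (K, 2) mean position of each mode
    sigmas:  (K, 2) per-axis standard deviations (diagonal covariance is an assumption)
    """
    target = np.asarray(target, dtype=float)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)

    z = (target - means) / sigmas                               # (K, 2) standardized residuals
    log_norm = -np.log(2.0 * np.pi) - np.log(sigmas).sum(axis=1)  # (K,) Gaussian normalizers
    log_comp = log_norm - 0.5 * (z ** 2).sum(axis=1)            # (K,) per-mode log-density
    log_mix = np.log(weights) + log_comp                        # (K,) weighted log-density

    # Log-sum-exp over modes, shifted by the maximum for numerical stability.
    m = log_mix.max()
    return -(m + np.log(np.exp(log_mix - m).sum()))

# Two hypothetical modes, e.g. "keep lane" and "change lane"; the observation is near the second.
observed = [1.0, 3.4]
nll_two_modes = mixture_nll(observed, [0.5, 0.5], [[1.0, 0.0], [1.0, 3.5]], [[0.5, 0.5], [0.5, 0.5]])
nll_one_mode = mixture_nll(observed, [1.0], [[1.0, 0.0]], [[0.5, 0.5]])
```

In this toy case the two-mode forecast receives a much lower negative log-likelihood than the single-mode one, because one of its modes covers the observed position.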
Abstract
This paper presents a novel vehicle motion forecasting method based on multi-head attention. It produces joint forecasts for all vehicles on a road scene as sequences of multi-modal probability density functions of their positions. Its architecture uses multi-head attention to account for complete interactions between all vehicles, and long short-term memory layers for encoding and forecasting. It relies solely on vehicle position tracks, does not need maneuver definitions, and does not represent the scene with a spatial grid. This allows it to be more versatile than similar models while combining many forecasting capabilities, namely joint forecast with interactions, uncertainty estimation, and multi-modality. The resulting prediction likelihood outperforms state-of-the-art models on the same dataset.
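To make the interaction mechanism concrete, here is a minimal NumPy sketch of standard scaled dot-product multi-head self-attention applied across the vehicles of one scene. It follows the generic formulation from "Attention Is All You Need" and is not the paper's exact layer. The random matrix `h` stands in for per-vehicle LSTM encodings, and all names, sizes, and weights are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(h, w_q, w_k, w_v, w_o, num_heads):
    """Self-attention over vehicles.

    h: (N, D) array with one encoded state per vehicle (D divisible by num_heads).
    Returns the updated states (N, D) and the attention weights (num_heads, N, N).
    """
    n, d = h.shape
    d_head = d // num_heads

    def split(x):
        # (N, D) -> (num_heads, N, d_head)
        return x.reshape(n, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(h @ w_q), split(h @ w_k), split(h @ w_v)
    # Every vehicle scores every vehicle, so no spatial grid is needed.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)   # (num_heads, N, N)
    attn = softmax(scores, axis=-1)
    context = attn @ v                                     # (num_heads, N, d_head)
    merged = context.transpose(1, 0, 2).reshape(n, d)      # concatenate the heads
    return merged @ w_o, attn

rng = np.random.default_rng(0)
n_vehicles, d_model, n_heads = 5, 16, 4
h = rng.normal(size=(n_vehicles, d_model))   # placeholder for LSTM-encoded position tracks
w_q, w_k, w_v, w_o = (rng.normal(size=(d_model, d_model)) * 0.1 for _ in range(4))
out, attn = multi_head_attention(h, w_q, w_k, w_v, w_o, n_heads)
```

Each row of `attn[i]` sums to one and describes how much head `i` lets one vehicle attend to every vehicle in the scene, itself included. Reordering the vehicles only reorders the outputs, so the layer is indifferent to how the scene is enumerated.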
Taxonomy
Methods: Attention Is All You Need · Softmax · Linear Layer · Multi-Head Attention
