Multi-Head Attention for Multi-Modal Joint Vehicle Motion Forecasting

Jean Mercat; Thomas Gilles; Nicole El Zoghby; Guillaume Sandou; Dominique Beauvois; Guillermo Pita Gil

arXiv:1910.03650 · cs.LG · December 23, 2019

TL;DR

This paper introduces a multi-head attention-based vehicle motion forecasting model that predicts joint, multi-modal vehicle positions using only position tracks, outperforming existing models in accuracy and versatility.

Contribution

It proposes a novel multi-head attention architecture for joint vehicle motion forecasting that does not require maneuver definitions or spatial grids, enhancing flexibility and performance.

Findings

01

Achieves a better prediction likelihood than state-of-the-art models evaluated on the same dataset.

02

Produces joint, multi-modal probability density forecasts.

03

Does not rely on maneuver definitions or spatial scene representation.
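
To make the "multi-modal probability density" finding above concrete, here is a minimal sketch of how a forecast for one vehicle at one future time step could be scored by its likelihood. It assumes the density is a mixture of axis-aligned 2D Gaussians; that parametric form, the function names, and all numbers are illustrative assumptions and are not taken from this summary.

```python
import math

def gaussian_2d_logpdf(x, y, mean, std):
    """Log density of an axis-aligned 2D Gaussian at (x, y)."""
    zx = (x - mean[0]) / std[0]
    zy = (y - mean[1]) / std[1]
    return -0.5 * (zx * zx + zy * zy) - math.log(2.0 * math.pi * std[0] * std[1])

def mixture_nll(x, y, modes):
    """Negative log-likelihood of (x, y) under a mixture of 2D Gaussians.

    modes is a list of (weight, mean, std) tuples whose weights sum to 1.
    """
    logs = [math.log(w) + gaussian_2d_logpdf(x, y, mean, std)
            for w, mean, std in modes]
    m = max(logs)  # log-sum-exp trick for numerical stability
    return -(m + math.log(sum(math.exp(v - m) for v in logs)))

# Hypothetical forecast (assumed values): the vehicle either keeps its lane
# (weight 0.7) or moves one lane to the left (weight 0.3).
modes = [
    (0.7, (20.0, 0.0), (2.0, 0.5)),
    (0.3, (20.0, 3.5), (2.0, 0.5)),
]
nll_keep = mixture_nll(20.0, 0.0, modes)       # observed: stayed in lane
nll_change = mixture_nll(20.0, 3.5, modes)     # observed: changed lane
nll_off_road = mixture_nll(20.0, 10.0, modes)  # observed: far from both modes
```

A lower value means the observed position was better covered by the forecast. With two modes, both the lane-keeping and the lane-changing outcome receive a reasonable score, whereas a single Gaussian placed between the lanes would score both poorly.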

Abstract

This paper presents a novel vehicle motion forecasting method based on multi-head attention. It produces joint forecasts for all vehicles on a road scene as sequences of multi-modal probability density functions of their positions. Its architecture uses multi-head attention to account for complete interactions between all vehicles, and long short-term memory layers for encoding and forecasting. It relies solely on vehicle position tracks, does not need maneuver definitions, and does not represent the scene with a spatial grid. This allows it to be more versatile than similar models while combining many forecasting capabilities, namely joint forecast with interactions, uncertainty estimation, and multi-modality. The resulting prediction likelihood outperforms state-of-the-art models on the same dataset.
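
The interaction step described in the abstract can be sketched as multi-head self-attention applied across the vehicles in a scene. The snippet below is a minimal, standard-library illustration of scaled dot-product attention with several heads, not the authors' implementation: the random matrices stand in for learned weights, the `encodings` list stands in for the per-vehicle outputs of an LSTM encoder, and the dimensions and helper names are assumptions. The final output projection of the standard Transformer formulation is omitted for brevity.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention_head(x, wq, wk, wv):
    """Scaled dot-product self-attention over the vehicle axis.

    x has one row per vehicle. Returns the attended values and the
    (n_vehicles x n_vehicles) matrix of attention weights.
    """
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)
    scale = math.sqrt(len(q[0]))
    scores = [[sum(a * b for a, b in zip(qi, kj)) / scale for kj in k] for qi in q]
    weights = [softmax(row) for row in scores]
    return matmul(weights, v), weights

def multi_head_attention(x, heads):
    """Run every head and concatenate the per-vehicle outputs."""
    outputs, all_weights = [], []
    for wq, wk, wv in heads:
        out, w = attention_head(x, wq, wk, wv)
        outputs.append(out)
        all_weights.append(w)
    concat = [sum((out[i] for out in outputs), []) for i in range(len(x))]
    return concat, all_weights

def random_matrix(rows, cols, rng):
    """Random stand-in for a learned weight matrix (illustrative only)."""
    return [[rng.gauss(0.0, 0.5) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
n_vehicles, d_model, d_head, n_heads = 4, 8, 4, 2  # assumed sizes

# Stand-in for per-vehicle track encodings.
encodings = random_matrix(n_vehicles, d_model, rng)
heads = [tuple(random_matrix(d_model, d_head, rng) for _ in range(3))
         for _ in range(n_heads)]

context, weights = multi_head_attention(encodings, heads)
```

Each row of `context` is one vehicle's encoding enriched with information from every other vehicle, and each row of a head's weight matrix sums to 1. Because the attention runs over whichever vehicles are present, no spatial grid or fixed neighbour layout is needed, which matches the versatility claim in the abstract.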

Taxonomy

Methods: Attention Is All You Need · Softmax · Linear Layer · Multi-Head Attention