Multi-scale Temporal Fusion Transformer for Incomplete Vehicle Trajectory Prediction

Zhanwen Liu; Chao Li; Yang Wang; Nan Yang; Xing Fan; Jiaqi Ma; Xiangmo Zhao

arXiv:2409.00904·cs.CV·September 4, 2024

TL;DR

This paper introduces the Multi-scale Temporal Fusion Transformer (MTFT), an end-to-end framework that predicts vehicle trajectories from incomplete observations by capturing motion features at multiple temporal scales and using motion continuity to guide their fusion. The authors report that it outperforms existing models in real traffic scenarios.

Contribution

The paper proposes the MTFT framework, which combines a Multi-scale Attention Head (MAH) with a Continuity Representation-guided Multi-scale Fusion (CRMF) module to improve trajectory prediction for autonomous driving when observed trajectories contain missing values.

Findings

01 Reports a performance improvement of more than 39% over existing models on the HighD dataset.

02 Handles missing observations caused by object occlusion and perception failures.

03 Outperforms state-of-the-art models in incomplete trajectory prediction.

Abstract

Motion prediction plays an essential role in autonomous driving systems, enabling autonomous vehicles to achieve more accurate local-path planning and driving decisions based on predictions of the surrounding vehicles. However, existing methods neglect the potential missing values caused by object occlusion, perception failures, etc., which inevitably degrades the trajectory prediction performance in real traffic scenarios. To address this limitation, we propose a novel end-to-end framework for incomplete vehicle trajectory prediction, named Multi-scale Temporal Fusion Transformer (MTFT), which consists of the Multi-scale Attention Head (MAH) and the Continuity Representation-guided Multi-scale Fusion (CRMF) module. Specifically, the MAH leverages the multi-head attention mechanism to parallelly capture multi-scale motion representation of trajectory from different temporal…
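
The abstract is cut off before it describes the MAH in detail, so the following is only a rough illustration of the general idea it names: attention run in parallel over several temporal granularities, with missing observations excluded. It is a minimal sketch in plain Python, not the authors' implementation. The function names, the stride-based subsampling, the strides (1, 2, 4), and the boolean mask are all assumptions made for this example.

```python
import math


def masked_attention(queries, keys, values, observed):
    """Scaled dot-product attention that ignores unobserved time steps.

    queries, keys, values: lists of float vectors, one per time step.
    observed: list of bools; False marks a missing observation.
    Returns one output vector per query (all zeros if nothing is observed).
    """
    dim = len(keys[0])
    out_dim = len(values[0])
    outputs = []
    for q in queries:
        # Missing steps get a score of -inf, so their softmax weight is 0.
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) if seen else -math.inf
            for k, seen in zip(keys, observed)
        ]
        peak = max(scores)
        if peak == -math.inf:
            outputs.append([0.0] * out_dim)
            continue
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values)) / total
            for j in range(out_dim)
        ])
    return outputs


def multi_scale_attention(track, observed, strides=(1, 2, 4)):
    """Run one masked self-attention pass per temporal stride (hypothetical scheme).

    Each pass sees only every `stride`-th time step, so larger strides
    summarise coarser motion. Returns {stride: list of output vectors}.
    """
    result = {}
    for stride in strides:
        steps = track[::stride]
        mask = observed[::stride]
        result[stride] = masked_attention(steps, steps, steps, mask)
    return result


# Eight (x, y) positions; step 2 is missing and holds a placeholder value.
track = [[float(t), 0.0] for t in range(8)]
track[2] = [0.0, 0.0]
observed = [True, True, False, True, True, True, True, True]
features = multi_scale_attention(track, observed)
```

In this sketch the placeholder at the missing step never contributes to any output, because its attention weight is zero at every scale. How MTFT actually fuses the per-scale representations (the CRMF module) is not covered here, since the abstract text available above stops before describing it.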

Taxonomy

Topics: Traffic Prediction and Management Techniques · Autonomous Vehicle Technology and Safety · Video Surveillance and Tracking Methods

Methods: Attention Is All You Need · Byte Pair Encoding · Absolute Position Encodings · Softmax · Label Smoothing · Linear Layer · Adam · Dropout · Layer Normalization · Dense Connections