Multi-modal Transformer Path Prediction for Autonomous Vehicle
Chia Hong Tseng, Jie Zhang, Min-Te Sun, Kazuya Sakai, Wei-Shinn Ku

TL;DR
This paper introduces a multi-modal Transformer-based system for vehicle path prediction that effectively incorporates lane information and sensor data to improve long-term trajectory forecasting in autonomous driving.
Contribution
The paper presents a novel Transformer architecture for path prediction that utilizes multi-modal sensor data and refined lane filtering to enhance accuracy.
Findings
Transformer architecture improves trajectory prediction accuracy.
Lane filtering reduces unlikely path options.
The system performs well on the nuScenes dataset.
Abstract
Reasoning about vehicle path prediction is an essential and challenging problem for the safe operation of autonomous driving systems. Many research works address path prediction; however, most of them do not use lane information and are not based on the Transformer architecture. By utilizing different types of data collected from sensors mounted on self-driving vehicles, we propose a path prediction system named Multi-modal Transformer Path Prediction (MTPP) that aims to predict the long-term future trajectories of target agents. To achieve more accurate path prediction, the Transformer architecture is adopted in our model. To better utilize the lane information, lanes that run in the opposite direction to the target agent are unlikely to be taken by it and are consequently filtered out. In addition, consecutive lane chunks are combined to ensure the lane input to be…
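The opposite-direction lane filtering described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the lane representation (a polyline of (x, y) points), the function names `lane_direction` and `filter_opposite_lanes`, and the cosine threshold `min_cosine` are all assumptions made for illustration. A lane is kept when its overall direction points the same way as the agent's heading, measured by a dot product.

```python
import math


def lane_direction(lane):
    """Unit vector from the first to the last point of a lane polyline.

    `lane` is assumed to be a list of (x, y) tuples (hypothetical format).
    A degenerate lane with zero length yields (0.0, 0.0).
    """
    (x0, y0), (x1, y1) = lane[0], lane[-1]
    dx, dy = x1 - x0, y1 - y0
    norm = math.hypot(dx, dy)
    if norm == 0.0:
        return (0.0, 0.0)
    return (dx / norm, dy / norm)


def filter_opposite_lanes(lanes, agent_heading, min_cosine=0.0):
    """Keep lanes whose direction is not opposite to the agent's heading.

    `agent_heading` is in radians. `min_cosine` is an assumed threshold:
    0.0 drops lanes that differ from the heading by more than 90 degrees.
    """
    hx, hy = math.cos(agent_heading), math.sin(agent_heading)
    kept = []
    for lane in lanes:
        dx, dy = lane_direction(lane)
        # Cosine of the angle between lane direction and agent heading.
        if dx * hx + dy * hy >= min_cosine:
            kept.append(lane)
    return kept


# Example: an agent heading along +x with three candidate lanes.
same_way = [(0.0, 0.0), (10.0, 0.0)]
oncoming = [(10.0, 3.0), (0.0, 3.0)]
diagonal = [(0.0, 0.0), (5.0, 5.0)]
candidates = filter_opposite_lanes([same_way, oncoming, diagonal], agent_heading=0.0)
```

In this example the oncoming lane is dropped and the other two remain. A real system would likely compare directions locally near the agent rather than end to end, since curved lanes can change direction along their length.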
Taxonomy
Topics: Autonomous Vehicle Technology and Safety · Traffic Prediction and Management Techniques · Traffic control and management
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Dense Connections · Absolute Position Encodings · Label Smoothing · Position-Wise Feed-Forward Layer · Adam · Dropout · Layer Normalization
