MGTR: Multi-Granular Transformer for Motion Prediction with LiDAR
Yiqian Gan, Hao Xiao, Yizhe Zhao, Ethan Zhang, Zhe Huang, Xin Ye, Lingting Ge

TL;DR
This paper introduces MGTR, a multi-granular transformer framework that improves motion prediction in autonomous driving by using context features at different granularities and incorporating LiDAR semantic features, ranking 1st on the Waymo Open Dataset motion prediction leaderboard.
Contribution
The paper proposes a novel multi-granular transformer architecture that effectively utilizes LiDAR semantic features for enhanced motion prediction in complex traffic scenarios.
Findings
Achieved 1st place on the Waymo Open Dataset motion prediction leaderboard.
Demonstrated state-of-the-art performance on that benchmark.
Integrates LiDAR semantic features from an off-the-shelf LiDAR feature extractor into the transformer model.
Abstract
Motion prediction has been an essential component of autonomous driving systems since it handles highly uncertain and complex scenarios involving moving agents of different types. In this paper, we propose a Multi-Granular TRansformer (MGTR) framework, an encoder-decoder network that exploits context features at different granularities for different kinds of traffic agents. To further enhance MGTR's capabilities, we leverage LiDAR point cloud data by incorporating LiDAR semantic features from an off-the-shelf LiDAR feature extractor. We evaluate MGTR on the Waymo Open Dataset motion prediction benchmark and show that the proposed method achieves state-of-the-art performance, ranking 1st on its leaderboard (https://waymo.com/open/challenges/2023/motion-prediction/).
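To make the phrase "context features in different granularities" concrete, here is a minimal sketch, not the authors' implementation. It resamples one lane polyline at several point spacings and picks a spacing per agent type. The function names, the `GRANULARITY_M` table, and its spacing values are all hypothetical and chosen only for illustration; the abstract does not state which granularities MGTR uses or how they are assigned.

```python
import math

# Hypothetical spacings in metres per agent type (illustrative, not from the paper).
GRANULARITY_M = {"vehicle": 2.0, "cyclist": 1.0, "pedestrian": 0.5}


def resample_polyline(points, spacing):
    """Resample a 2D polyline at uniform arc-length spacing.

    The first and last input points are always kept.
    """
    if len(points) < 2:
        return list(points)
    out = [points[0]]
    carried = 0.0  # arc length covered since the last emitted sample
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        if seg == 0.0:
            continue  # skip duplicate points
        d = spacing - carried  # offset of the next sample along this segment
        while d <= seg:
            t = d / seg
            out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            d += spacing
        carried = seg - (d - spacing)
    if out[-1] != points[-1]:
        out.append(points[-1])
    return out


def context_for_agent(agent_type, lane):
    """Return the lane polyline at the (hypothetical) granularity for this agent type."""
    return resample_polyline(lane, GRANULARITY_M[agent_type])


lane = [(0.0, 0.0), (10.0, 0.0)]
coarse = context_for_agent("vehicle", lane)
fine = context_for_agent("pedestrian", lane)
```

In this sketch the same lane yields a shorter sequence at a coarse spacing and a longer one at a fine spacing, so a downstream encoder would receive a different number of map tokens depending on the agent type.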
Taxonomy
Topics: Autonomous Vehicle Technology and Safety · Advanced Neural Network Applications · Human Pose and Action Recognition
