Shared Cross-Modal Trajectory Prediction for Autonomous Driving
Chiho Choi, Joon Hee Choi, Jiachen Li, Srikanth Malla

TL;DR
This paper introduces a cross-modal embedding framework for trajectory prediction in autonomous driving. It uses data from multiple sensors during training to improve predictions made from a single modality at test time.
Contribution
It proposes a novel shared latent space embedding approach that utilizes multi-sensor data during training to enhance trajectory prediction from a single sensor modality.
Findings
Effective in leveraging multiple sensor modalities during training
Improves prediction accuracy when only a single modality is available at test time
Validated on two benchmark datasets
Abstract
Predicting future trajectories of traffic agents in highly interactive environments is an essential and challenging problem for the safe operation of autonomous driving systems. Given that self-driving vehicles are equipped with various types of sensors (e.g., LiDAR scanner, RGB camera, radar), we propose a Cross-Modal Embedding framework that aims to benefit from the use of multiple input modalities. At training time, our model learns to embed a set of complementary features in a shared latent space by jointly optimizing the objective functions across different types of input data. At test time, a single input modality (e.g., LiDAR data) is required to generate predictions from the input perspective (i.e., in the LiDAR space), while taking advantage of the model trained with multiple sensor modalities. An extensive evaluation is conducted to show the…
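The train/test asymmetry described in the abstract can be illustrated with a toy sketch. This is not the authors' architecture: the linear encoders and decoders, the mean-squared alignment term, the feature sizes, and every name below (`joint_loss`, `predict_from_lidar`, `align_weight`) are assumptions made for illustration only. It shows two modality branches optimized jointly, with their embeddings pulled toward a common latent space, and a test-time path that touches only the LiDAR branch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
LIDAR_DIM, RGB_DIM, LATENT_DIM, HORIZON = 16, 32, 8, 5


def init_linear(in_dim, out_dim):
    """Return a small random weight matrix standing in for a learned layer."""
    return rng.normal(scale=0.1, size=(in_dim, out_dim))


# One encoder and one decoder per modality (an assumption of this sketch).
params = {
    "enc_lidar": init_linear(LIDAR_DIM, LATENT_DIM),
    "enc_rgb": init_linear(RGB_DIM, LATENT_DIM),
    "dec_lidar": init_linear(LATENT_DIM, HORIZON * 2),
    "dec_rgb": init_linear(LATENT_DIM, HORIZON * 2),
}


def encode(features, weight):
    """Map per-agent features of one modality into the latent space."""
    return np.tanh(features @ weight)


def decode(latent, weight):
    """Map latent codes to HORIZON future (x, y) positions per agent."""
    return (latent @ weight).reshape(-1, HORIZON, 2)


def joint_loss(params, lidar_feat, rgb_feat, lidar_future, rgb_future,
               align_weight=1.0):
    """Sum of per-modality prediction losses plus an embedding alignment term."""
    z_lidar = encode(lidar_feat, params["enc_lidar"])
    z_rgb = encode(rgb_feat, params["enc_rgb"])
    loss_lidar = np.mean((decode(z_lidar, params["dec_lidar"]) - lidar_future) ** 2)
    loss_rgb = np.mean((decode(z_rgb, params["dec_rgb"]) - rgb_future) ** 2)
    # Alignment term: encourages both modalities to share one latent space.
    loss_align = np.mean((z_lidar - z_rgb) ** 2)
    return loss_lidar + loss_rgb + align_weight * loss_align


def predict_from_lidar(params, lidar_feat):
    """Test-time path: only the LiDAR encoder and decoder are used."""
    return decode(encode(lidar_feat, params["enc_lidar"]), params["dec_lidar"])


# Random stand-in data; real inputs would come from sensor feature extractors.
batch = 4
lidar_feat = rng.normal(size=(batch, LIDAR_DIM))
rgb_feat = rng.normal(size=(batch, RGB_DIM))
lidar_future = rng.normal(size=(batch, HORIZON, 2))
rgb_future = rng.normal(size=(batch, HORIZON, 2))

train_loss = joint_loss(params, lidar_feat, rgb_feat, lidar_future, rgb_future)
test_pred = predict_from_lidar(params, lidar_feat)
```

In a real setup the joint loss would be minimized with a gradient-based optimizer. The point of the sketch is only that the RGB branch influences the LiDAR branch through the shared objective during training, while inference needs LiDAR input alone.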
