Accelerating Online Mapping and Behavior Prediction via Direct BEV Feature Attention
Xunjiang Gu, Guanyu Song, Igor Gilitschenski, Marco Pavone, Boris, Ivanovic

TL;DR
This paper introduces a method to directly utilize internal BEV features in online mapping for autonomous vehicles, significantly improving inference speed and prediction accuracy by integrating mapping and behavior prediction more effectively.
Contribution
It proposes exposing internal BEV features in online map estimation, enabling tighter integration with trajectory forecasting and enhancing performance.
Findings
Up to 73% faster inference speeds.
Up to 29% more accurate predictions.
Effective integration of mapping and behavior prediction.
Abstract
Understanding road geometry is a critical component of the autonomous vehicle (AV) stack. While high-definition (HD) maps can readily provide such information, they suffer from high labeling and maintenance costs. Accordingly, many recent works have proposed methods for estimating HD maps online from sensor data. The vast majority of recent approaches encode multi-camera observations into an intermediate representation, e.g., a bird's eye view (BEV) grid, and produce vector map elements via a decoder. While this architecture is performant, it decimates much of the information encoded in the intermediate representation, preventing downstream tasks (e.g., behavior prediction) from leveraging them. In this work, we propose exposing the rich internal features of online map estimation methods and show how they enable more tightly integrating online mapping with trajectory forecasting. In…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsVideo Surveillance and Tracking Methods · Advanced Image and Video Retrieval Techniques
