X-Align: Cross-Modal Cross-View Alignment for Bird's-Eye-View Segmentation
Shubhankar Borse, Marvin Klingner, Varun Ravi Kumar, Hong Cai, Abdulaziz Almuzairee, Senthil Yogamani, Fatih Porikli

TL;DR
X-Align is a framework that improves bird's-eye-view segmentation in autonomous driving by better aligning multi-modal (camera and LiDAR) and multi-view features, outperforming the previous state of the art by 3 mIoU on nuScenes.
Contribution
It proposes a new end-to-end cross-modal and cross-view learning framework with novel alignment losses and modules for improved BEV segmentation.
Findings
Outperforms the previous state of the art by 3 mIoU points on nuScenes
Effective multi-modal feature fusion via attention-based module
Enhanced PV-to-BEV transformation accuracy
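For readers unfamiliar with the metric behind the first finding, here is a minimal sketch of mean intersection-over-union (mIoU) in its generic form, computed over flat label lists. The function name `mean_iou` and the toy labels are illustrative assumptions; the benchmark's actual evaluation protocol (for example, per-class binary masks and thresholds) may differ.

```python
def mean_iou(pred, target, num_classes):
    """Generic mIoU over flat label lists (illustrative, not the benchmark's exact protocol)."""
    ious = []
    for c in range(num_classes):
        # Cells where both prediction and ground truth are class c
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Cells where either prediction or ground truth is class c
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both prediction and target
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy BEV grid flattened to six cells, three hypothetical classes
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {score:.3f}")
```

Under this definition, a gain of 3 mIoU points means the class-averaged IoU, expressed as a percentage, rises by 3.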
Abstract
Bird's-eye-view (BEV) grid is a common representation for the perception of road components, e.g., drivable area, in autonomous driving. Most existing approaches rely on cameras only to perform segmentation in BEV space, which is fundamentally constrained by the absence of reliable depth information. Latest works leverage both camera and LiDAR modalities, but sub-optimally fuse their features using simple, concatenation-based mechanisms. In this paper, we address these problems by enhancing the alignment of the unimodal features in order to aid feature fusion, as well as enhancing the alignment between the cameras' perspective view (PV) and BEV representations. We propose X-Align, a novel end-to-end cross-modal and cross-view learning framework for BEV segmentation consisting of the following components: (i) a novel Cross-Modal Feature Alignment (X-FA) loss, (ii) an attention-based…
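The abstract names a Cross-Modal Feature Alignment (X-FA) loss but, as truncated here, does not give its form. The sketch below shows one plausible way such an alignment term could look, assuming a cosine-similarity penalty between per-cell camera and LiDAR BEV features. The cosine form, the function name `cosine_alignment_loss`, and the toy features are all assumptions for illustration, not the authors' implementation.

```python
import math


def cosine_alignment_loss(cam_feats, lidar_feats, eps=1e-8):
    """1 - mean cosine similarity over BEV cells (assumed form, for illustration only).

    cam_feats and lidar_feats are equal-length lists of per-cell feature vectors.
    """
    total = 0.0
    for c, l in zip(cam_feats, lidar_feats):
        dot = sum(a * b for a, b in zip(c, l))
        norm_c = math.sqrt(sum(a * a for a in c))
        norm_l = math.sqrt(sum(b * b for b in l))
        total += dot / (norm_c * norm_l + eps)  # eps guards against zero vectors
    return 1.0 - total / len(cam_feats)


# Two hypothetical BEV cells with 2-D features: the first pair is aligned,
# the second pair is orthogonal.
cam = [[1.0, 0.0], [0.0, 2.0]]
lidar = [[2.0, 0.0], [3.0, 0.0]]
loss = cosine_alignment_loss(cam, lidar)
print(f"alignment loss = {loss:.3f}")
```

A term like this is scale-invariant, so it would encourage the two modalities to agree on feature direction per cell without forcing equal magnitudes; whether the paper's loss shares that property is not stated in the text above.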
Taxonomy
Topics: Advanced Neural Network Applications · Video Surveillance and Tracking Methods · Remote Sensing and LiDAR Applications
Methods: ALIGN
