CASSPR: Cross Attention Single Scan Place Recognition
Yan Xia, Mariia Gladkova, Rui Wang, Qianyun Li, Uwe Stilla, João F. Henriques, Daniel Cremers

TL;DR
CASSPR introduces a cross attention transformer method that fuses point-based and voxel-based representations of LiDAR scans to improve fine-grained place recognition accuracy in autonomous navigation.
Contribution
It is the first to combine point and voxel approaches with cross attention transformers for LiDAR place recognition, enhancing fine-grained matching capabilities.
Findings
Surpasses state-of-the-art on multiple datasets
Achieves AR@1 of 85.6% on TUM dataset
Significantly improves fine-grained matching accuracy
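The AR@1 figure above is read here as average recall at top 1: the fraction of query scans whose single nearest database descriptor is a true match. The sketch below illustrates that computation on made-up 2-D descriptors. The function name `recall_at_1`, the toy data, and the use of Euclidean distance are assumptions for illustration, not the paper's evaluation code.

```python
import math

def recall_at_1(query_desc, db_desc, true_matches):
    """Fraction of queries whose nearest database descriptor is a true match.

    query_desc, db_desc: lists of equal-length float vectors (hypothetical descriptors).
    true_matches: for each query, the set of database indices that count as correct.
    """
    hits = 0
    for qi, q in enumerate(query_desc):
        # Index of the database descriptor closest to this query (Euclidean distance).
        nearest = min(range(len(db_desc)), key=lambda j: math.dist(q, db_desc[j]))
        if nearest in true_matches[qi]:
            hits += 1
    return hits / len(query_desc)

# Toy data, invented for this example only.
queries = [[0.0, 0.1], [1.0, 0.9], [0.5, 0.5]]
database = [[0.0, 0.0], [1.0, 1.0], [0.4, 0.6]]
true_matches = [{0}, {1}, {0}]

score = recall_at_1(queries, database, true_matches)
```

Averaging this score over several evaluation sequences or query sets would give an average recall in the sense assumed here.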
Abstract
Place recognition based on point clouds (LiDAR) is an important component for autonomous robots or self-driving vehicles. Current SOTA performance is achieved on accumulated LiDAR submaps using either point-based or voxel-based structures. While voxel-based approaches nicely integrate spatial context across multiple scales, they do not exhibit the local precision of point-based methods. As a result, existing methods struggle with fine-grained matching of subtle geometric features in sparse single-shot LiDAR scans. To overcome these limitations, we propose CASSPR as a method to fuse point-based and voxel-based approaches using cross attention transformers. CASSPR leverages a sparse voxel branch for extracting and aggregating information at lower resolution and a point-wise branch for obtaining fine-grained local information. CASSPR uses queries from one branch to try to match…
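As a rough illustration of the general idea of cross attention between two feature sets, the sketch below lets features from one branch act as queries over features from another branch, using plain scaled dot-product attention. It is a minimal single-head version with no learned projections. The function names, the toy features, and the choice of point features as queries are assumptions made for this example and do not reproduce the CASSPR architecture.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Single-head scaled dot-product cross attention (no learned weights).

    Each query attends over all keys and returns a weighted sum of the values.
    """
    scale = 1.0 / math.sqrt(len(queries[0]))
    fused = []
    for q in queries:
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights = softmax(scores)
        dim = len(values[0])
        fused.append([sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)])
    return fused

# Toy features, invented for illustration: 3 "point" features, 2 "voxel" features.
point_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
voxel_feats = [[2.0, 0.0], [0.0, 2.0]]

# Assumed direction for this sketch: point features query the voxel features.
point_from_voxel = cross_attention(point_feats, voxel_feats, voxel_feats)
```

Each output row is a convex combination of the value vectors, so a query that aligns more strongly with one key pulls its output toward the corresponding value. A practical transformer layer would add learned query, key, and value projections, multiple heads, and normalization.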
Videos
CASSPR: Cross Attention Single Scan Place Recognition (YouTube)
Taxonomy
Topics: Robotics and Sensor-Based Localization · Remote Sensing and LiDAR Applications · Advanced Image and Video Retrieval Techniques
Methods: Multi-Head Attention · Attention Is All You Need · Layer Normalization · Batch Normalization · Adam · Convolution · Absolute Position Encodings · Linear Layer · Dense Connections · Residual Connection
