MinkUNeXt: Point Cloud-based Large-scale Place Recognition using 3D Sparse Convolutions
J.J. Cabrera, A. Santo, A. Gil, C. Viegas, L. Payá

TL;DR
MinkUNeXt introduces a novel 3D sparse convolution-based architecture for large-scale place recognition from point clouds, outperforming current state-of-the-art methods without using complex Transformer components.
Contribution
The paper proposes MinkUNeXt, a new architecture utilizing simple 3D sparse convolutions and a U-Net structure for effective place recognition from point clouds.
Findings
Outperforms existing state-of-the-art methods on the Oxford RobotCar dataset.
Effective feature extraction at multiple scales with a U-Net encoder-decoder.
Achieves superior accuracy using only 3D sparse convolutions without Transformers.
Abstract
This paper presents MinkUNeXt, an effective and efficient architecture for place recognition from point clouds entirely based on the new 3D MinkNeXt Block, a residual block composed of 3D sparse convolutions that follows the philosophy established by recent Transformers but uses only simple 3D convolutions. Feature extraction is performed at different scales by a U-Net encoder-decoder network, and those features are aggregated into a single descriptor by Generalized Mean Pooling (GeM). The proposed architecture demonstrates that it is possible to surpass the current state-of-the-art by relying only on conventional 3D sparse convolutions, without making use of more complex and sophisticated proposals such as Transformers, Attention-Layers or Deformable Convolutions. A thorough assessment of the proposal has been carried out using the Oxford RobotCar and the…
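The abstract names Generalized Mean Pooling (GeM) as the step that collapses per-point features into one descriptor. As a rough illustration only, the sketch below applies the standard GeM formula, the p-th root of the mean of p-th powers per channel, to a plain list of feature vectors. The function name gem_pool, the default p = 3.0, and the eps clamp are assumptions made for this example; they are not taken from the paper or its code.

```python
from typing import List, Sequence


def gem_pool(
    features: Sequence[Sequence[float]],
    p: float = 3.0,      # assumed exponent; the paper's value is not given here
    eps: float = 1e-6,   # assumed clamp so powers stay defined for values <= 0
) -> List[float]:
    """Pool N feature vectors (C channels each) into one C-dim descriptor."""
    if not features:
        raise ValueError("need at least one feature vector")
    n = len(features)
    channels = len(features[0])
    descriptor = []
    for c in range(channels):
        # Mean of the clamped activations raised to the power p ...
        mean_pow = sum(max(row[c], eps) ** p for row in features) / n
        # ... followed by the p-th root.
        descriptor.append(mean_pow ** (1.0 / p))
    return descriptor


# Hypothetical per-point features: 2 points, 2 channels.
points = [[1.0, 2.0], [3.0, 4.0]]
avg_like = gem_pool(points, p=1.0)    # p = 1 reduces to average pooling
max_like = gem_pool(points, p=50.0)   # large p approaches max pooling
```

With p = 1 the result is ordinary average pooling, and as p grows the result approaches max pooling, which is why GeM is often described as interpolating between the two.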
Taxonomy
Topics: Robotics and Sensor-Based Localization · 3D Surveying and Cultural Heritage · Remote Sensing and LiDAR Applications
Methods: Concatenated Skip Connection · Residual Connection · Max Pooling · Batch Normalization · Sparse Convolutions · Convolution · U-Net · Residual Block · Generalized Mean Pooling
