Sparse Cross-scale Attention Network for Efficient LiDAR Panoptic Segmentation
Shuangjie Xu, Rui Wan, Maosheng Ye, Xiaoyi Zou, Tongyi Cao

TL;DR
This paper introduces SCAN, a sparse cross-scale attention network that effectively models long-range dependencies in LiDAR point clouds for panoptic segmentation, improving accuracy and efficiency over previous methods.
Contribution
The paper proposes a novel sparse cross-scale attention mechanism and a sparse centroid representation to enhance long-range modeling and segmentation accuracy in LiDAR data.
Findings
Outperforms previous methods on the SemanticKITTI dataset
Achieves first place on SemanticKITTI for 3D panoptic segmentation at real-time inference speed
Effectively reduces computation through sparse convolution
Abstract
Two major challenges of 3D LiDAR Panoptic Segmentation (PS) are that the points of an object are aggregated on its surface, which makes long-range dependencies hard to model, especially for large instances, and that objects are often too close together to separate from each other. Recent literature addresses these problems with time-consuming grouping processes such as dual-clustering and mean-shift offsets, or with a bird's-eye-view (BEV) dense centroid representation that downplays geometry. However, the local feature learning in these methods does not sufficiently model long-range geometric relationships. To this end, we present SCAN, a novel sparse cross-scale attention network that first aligns multi-scale sparse features with global voxel-encoded attention to capture the long-range relationship of instance context, which can boost the regression accuracy of over-segmented large objects. For the…
Taxonomy
Topics: 3D Shape Modeling and Analysis · Robotics and Sensor-Based Localization · Remote Sensing and LiDAR Applications
