LEST: Large-scale LiDAR Semantic Segmentation with Transformer

Chuanyu Luo; Nuo Cheng; Sikun Ma; Han Li; Xiaohan Li; Shengguang Lei; Pu Li

arXiv:2307.09367·cs.CV·July 19, 2023·1 cites

TL;DR

This paper introduces LEST, a pure Transformer architecture for large-scale LiDAR point cloud semantic segmentation. The authors report that it outperforms prior state-of-the-art methods on the nuScenes validation set and the SemanticKITTI test set.

Contribution

The paper presents a pure Transformer architecture for LiDAR segmentation built from two components: a Space Filling Curve (SFC) Grouping strategy and a Distance-based Cosine Linear Transformer (DISCO).

Findings

1. Outperforms state-of-the-art methods on the nuScenes validation set and the SemanticKITTI test set.

2. Introduces a Space Filling Curve (SFC) Grouping strategy for efficient point cloud processing.

3. Develops the Distance-based Cosine Linear Transformer (DISCO) for better feature extraction.
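To give a sense of what a "linear Transformer" buys on large point sets, here is a minimal NumPy sketch of generic kernelized linear attention. It is not DISCO: this page does not describe DISCO's distance-based cosine formulation, so the `elu(x) + 1` feature map and the function names `feature_map`, `linear_attention`, and `quadratic_reference` are assumptions chosen purely for illustration. The point of the sketch is only that the key-value summary can be computed once, so cost grows linearly with the number of points rather than quadratically.

```python
import numpy as np


def feature_map(x):
    # Assumed positive feature map elu(x) + 1 (illustrative choice, not from the paper).
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))


def linear_attention(q, k, v):
    """Generic linear attention. q, k: (n, d); v: (n, dv). Returns (n, dv)."""
    phi_q = feature_map(q)
    phi_k = feature_map(k)
    kv = phi_k.T @ v              # (d, dv) summary of all keys and values, built once
    z = phi_k.sum(axis=0)         # (d,) normalizer summary
    numerator = phi_q @ kv        # (n, dv)
    denominator = phi_q @ z       # (n,)
    return numerator / denominator[:, None]


def quadratic_reference(q, k, v):
    """Same result computed with the explicit (n, n) attention matrix."""
    weights = feature_map(q) @ feature_map(k).T
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ v


rng = np.random.default_rng(0)
q = rng.normal(size=(32, 8))
k = rng.normal(size=(32, 8))
v = rng.normal(size=(32, 4))
out = linear_attention(q, k, v)
```

Both functions return the same values; the linear form simply reorders the matrix products so the `(n, n)` matrix is never materialized, which is what makes attention over very large point sets tractable.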

Abstract

Large-scale LiDAR-based point cloud semantic segmentation is a critical task in autonomous driving perception. Almost all of the previous state-of-the-art LiDAR semantic segmentation methods are variants of sparse 3D convolution. Although the Transformer architecture is becoming popular in the field of natural language processing and 2D computer vision, its application to large-scale point cloud semantic segmentation is still limited. In this paper, we propose a LiDAR sEmantic Segmentation architecture with pure Transformer, LEST. LEST comprises two novel components: a Space Filling Curve (SFC) Grouping strategy and a Distance-based Cosine Linear Transformer, DISCO. On the public nuScenes semantic segmentation validation set and SemanticKITTI test set, our model outperforms all the other state-of-the-art methods.
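The idea of grouping points along a space filling curve can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say which curve LEST uses, so a Z-order (Morton) curve is swapped in here, and `voxel_size`, `group_size`, `bits`, `morton_code`, and `sfc_group` are hypothetical names and parameters. Points are quantized to a voxel grid, sorted by their curve index, and cut into fixed-size consecutive groups, so each group tends to hold spatially nearby points.

```python
import numpy as np


def morton_code(ix, iy, iz, bits=10):
    """Interleave the low `bits` bits of three integer arrays into a Z-order index."""
    code = np.zeros(ix.shape, dtype=np.int64)
    for b in range(bits):
        code |= ((ix >> b) & 1) << (3 * b)
        code |= ((iy >> b) & 1) << (3 * b + 1)
        code |= ((iz >> b) & 1) << (3 * b + 2)
    return code


def sfc_group(points, voxel_size, group_size, bits=10):
    """Sort points along a Z-order curve and split them into consecutive groups.

    points: (n, 3) float array. Returns a list of index arrays into `points`.
    """
    mins = points.min(axis=0)
    # Quantize coordinates to non-negative voxel indices that fit in `bits` bits.
    idx = np.floor((points - mins) / voxel_size).astype(np.int64)
    idx = np.clip(idx, 0, 2**bits - 1)
    codes = morton_code(idx[:, 0], idx[:, 1], idx[:, 2], bits)
    order = np.argsort(codes, kind="stable")
    return [order[i:i + group_size] for i in range(0, len(order), group_size)]


rng = np.random.default_rng(0)
points = rng.random((100, 3)) * 10.0
groups = sfc_group(points, voxel_size=0.5, group_size=16)
```

Attention could then be applied within each group independently, which keeps the per-group cost bounded regardless of the total number of points in the scan.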

Taxonomy

Topics: Advanced Neural Network Applications · Remote Sensing and LiDAR Applications · Robotics and Sensor-Based Localization

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Byte Pair Encoding · Position-Wise Feed-Forward Layer · Residual Connection · Absolute Position Encodings · Adam · Layer Normalization · Label Smoothing