UniSeg: A Unified Multi-Modal LiDAR Segmentation Network and the OpenPCSeg Codebase

Youquan Liu; Runnan Chen; Xin Li; Lingdong Kong; Yuchen Yang; Zhaoyang Xia; Yeqi Bai; Xinge Zhu; Yuexin Ma; Yikang Li; Yu Qiao; Yuenan Hou

arXiv:2309.05573·cs.CV·September 12, 2023·2 cites

TL;DR

UniSeg is a multi-modal LiDAR segmentation network that fuses RGB images with point cloud views for improved semantic and panoptic segmentation, achieving top results on major benchmarks.

Contribution

The paper introduces UniSeg, a novel unified network that effectively combines multi-view point cloud data with RGB images and proposes the OpenPCSeg codebase for outdoor LiDAR segmentation.

Findings

1. Achieves top performance on SemanticKITTI, nuScenes, and Waymo datasets.
2. Ranks 1st in nuScenes LiDAR semantic segmentation challenge.
3. Provides a comprehensive, reproducible open-source codebase for outdoor LiDAR segmentation.

Abstract

Point-, voxel-, and range-views are three representative forms of point clouds. All of them have accurate 3D measurements but lack color and texture information. RGB images are a natural complement to these point cloud views, and fully exploiting their combined information leads to more robust perception. In this paper, we present a unified multi-modal LiDAR segmentation network, termed UniSeg, which leverages the information of RGB images and three views of the point cloud, and accomplishes semantic segmentation and panoptic segmentation simultaneously. Specifically, we first design the Learnable cross-Modal Association (LMA) module to automatically fuse voxel-view and range-view features with image features, which fully utilizes the rich semantic information of images and is robust to calibration errors. Then, the enhanced voxel-view and range-view features are transformed…
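To make the point cloud views named in the abstract concrete, the following is a minimal NumPy sketch of how a raw point cloud is commonly converted into a voxel view (grid quantization) and a range view (spherical projection). It is not taken from the paper or from OpenPCSeg: the function names `to_voxel_view` and `to_range_view`, the voxel size, the image resolution, and the vertical field of view are all illustrative assumptions, and the fusion with image features (the LMA module) is not shown.

```python
import numpy as np


def to_voxel_view(points, voxel_size=0.5):
    """Quantize xyz coordinates onto a regular grid (hypothetical helper).

    Returns the unique occupied voxel indices and, for every point,
    the index of the voxel it falls into.
    """
    coords = np.floor(points[:, :3] / voxel_size).astype(np.int64)
    voxels, inverse = np.unique(coords, axis=0, return_inverse=True)
    return voxels, inverse.reshape(-1)


def to_range_view(points, height=64, width=1024,
                  fov_up_deg=3.0, fov_down_deg=-25.0):
    """Spherical projection of xyz onto a 2D range image (hypothetical helper).

    The field-of-view values are assumed, sensor-dependent defaults.
    Returns the range image plus the (row, col) pixel of every point.
    """
    xyz = points[:, :3]
    depth = np.linalg.norm(xyz, axis=1)
    yaw = np.arctan2(xyz[:, 1], xyz[:, 0])
    pitch = np.arcsin(xyz[:, 2] / np.maximum(depth, 1e-8))

    fov_up = np.radians(fov_up_deg)
    fov_down = np.radians(fov_down_deg)

    # Azimuth maps to columns, elevation maps to rows.
    u = 0.5 * (1.0 - yaw / np.pi) * width
    v = (fov_up - pitch) / (fov_up - fov_down) * height
    cols = np.clip(np.floor(u), 0, width - 1).astype(np.int64)
    rows = np.clip(np.floor(v), 0, height - 1).astype(np.int64)

    # Empty pixels are marked with -1. Far points are written first so
    # that the nearest point wins when several land on the same pixel.
    image = np.full((height, width), -1.0, dtype=np.float32)
    order = np.argsort(depth)[::-1]
    image[rows[order], cols[order]] = depth[order]
    return image, rows, cols


if __name__ == "__main__":
    # Three toy points; the point view is simply the raw (N, 3) array.
    pts = np.array([[1.0, 0.0, 0.0],
                    [1.1, 0.1, 0.0],
                    [-5.0, 2.0, -1.0]])
    voxels, point_to_voxel = to_voxel_view(pts)
    range_image, rows, cols = to_range_view(pts)
    print(voxels.shape, point_to_voxel, range_image.shape)
```

The per-point indices returned by both helpers (`point_to_voxel`, `rows`, `cols`) are what would let features computed in one view be gathered back to the points, which is the kind of bookkeeping any multi-view pipeline needs regardless of how the fusion itself is learned.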

Code & Models

Repositories

pjlab-adg/pcseg (PyTorch, official)

Taxonomy

Topics: Robotics and Sensor-Based Localization · Advanced Neural Network Applications · Advanced Vision and Imaging