Point Cloud Semantic Segmentation
Ivan Martinović

TL;DR
This paper evaluates various deep learning models for semantic segmentation of 3D point clouds, specifically on the S3DIS dataset, comparing their accuracy and inference speed.
Contribution
It provides a comprehensive comparison of multiple state-of-the-art point cloud segmentation models on a standard dataset, highlighting their performance differences.
Findings
PointCNN achieves high accuracy on S3DIS.
Point Transformer offers a good balance of speed and accuracy.
Cylinder3D demonstrates strong segmentation performance.
Abstract
Semantic segmentation is an important and well-known task in the field of computer vision, in which we attempt to assign a corresponding semantic class to each input element. When it comes to semantic segmentation of 2D images, the input elements are pixels. On the other hand, the input can also be a point cloud, where one input element represents one point in the input point cloud. By the term point cloud, we refer to a set of points defined by spatial coordinates with respect to some reference coordinate system. In addition to the position of points in space, other features can also be defined for each point, such as RGB components. In this paper, we conduct semantic segmentation on the S3DIS dataset, where each point cloud represents one room. We train models on the S3DIS dataset, namely PointCNN, PointNet++, Cylinder3D, Point Transformer, and RepSurf. We compare the obtained results…
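The definitions in the abstract can be made concrete with a small sketch. It assumes a point is stored as spatial coordinates plus RGB components, and that a segmentation is one class id per point. The `Point` class, the `point_accuracy` helper, the class ids, and the sample values are all hypothetical and chosen for illustration; they are not taken from the paper or from the S3DIS tooling.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Point:
    # Spatial coordinates relative to some reference coordinate system.
    x: float
    y: float
    z: float
    # Optional per-point features; here RGB components in [0, 255].
    r: int = 0
    g: int = 0
    b: int = 0


def point_accuracy(predicted: Sequence[int], target: Sequence[int]) -> float:
    """Fraction of points whose predicted class matches the target class."""
    if len(predicted) != len(target):
        raise ValueError("exactly one label per point is required")
    if not target:
        raise ValueError("empty point cloud")
    correct = sum(p == t for p, t in zip(predicted, target))
    return correct / len(target)


# Hypothetical class ids, for illustration only.
CLASS_NAMES = {0: "floor", 1: "wall", 2: "chair"}

# A toy "room": each input element is one point.
cloud: List[Point] = [
    Point(0.0, 0.0, 0.0, 120, 120, 120),
    Point(1.0, 0.0, 0.0, 118, 121, 119),
    Point(0.0, 0.0, 2.5, 240, 240, 235),
    Point(0.5, 0.5, 0.4, 90, 60, 30),
]
target_labels = [0, 0, 1, 2]     # one semantic class per point
predicted_labels = [0, 0, 1, 1]  # stand-in for a model's output

accuracy = point_accuracy(predicted_labels, target_labels)
print(f"{accuracy:.2f}")
```

Per-point accuracy is only one possible way to compare predictions with ground truth; the sketch does not claim it is the metric the paper reports.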
Taxonomy
Topics: Remote Sensing and LiDAR Applications · 3D Surveying and Cultural Heritage · 3D Shape Modeling and Analysis
Methods: Multi-Head Attention · Attention Is All You Need · Adam · Layer Normalization · Linear Layer · Label Smoothing · Dropout · Byte Pair Encoding · Position-Wise Feed-Forward Layer · Residual Connection
