Point Transformer

Hengshuang Zhao; Li Jiang; Jiaya Jia; Philip Torr; Vladlen Koltun

arXiv:2012.09164 · cs.CV · September 28, 2021 · 34 cites

TL;DR

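This paper introduces the Point Transformer, a self-attention network tailored for 3D point cloud processing. It reports state-of-the-art results in semantic scene segmentation, including 70.4% mIoU on S3DIS Area 5, and improves on prior work in object part segmentation and object classification.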

Contribution

The paper designs self-attention layers specifically for point clouds and uses them to build networks for semantic scene segmentation, object part segmentation, and object classification.

Findings

01 Achieves 70.4% mIoU on S3DIS Area 5 for large-scale semantic scene segmentation.

02 Outperforms the strongest prior model on that benchmark by 3.3 absolute percentage points.

03 Is the first model to cross the 70% mIoU threshold on S3DIS Area 5.
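
For readers unfamiliar with the metric, the sketch below shows one common way to compute mean intersection-over-union (mIoU) from per-point class labels, and works through the arithmetic behind the "absolute percentage points" wording. The function name `mean_iou` and the toy labels are illustrative assumptions, not taken from the paper. Benchmarks differ on details such as how classes absent from both prediction and ground truth are handled; here they are skipped.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean IoU over classes that occur in the prediction or the ground truth."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes
    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes absent from both (an assumed convention)
            ious.append(intersection[c] / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy labels for six points and three classes (hypothetical data).
toy_pred = [0, 0, 1, 1, 2, 2]
toy_target = [0, 1, 1, 1, 2, 0]
toy_miou = mean_iou(toy_pred, toy_target, num_classes=3)
print(round(toy_miou, 3))  # per-class IoUs are 1/3, 2/3 and 1/2, so the mean is 0.5

# "3.3 absolute percentage points" is a difference of two mIoU values,
# so the figures on this page imply a prior best of about 70.4 - 3.3 = 67.1.
reported_miou = 70.4
margin_points = 3.3
implied_prior_miou = reported_miou - margin_points
relative_gain_percent = 100.0 * margin_points / implied_prior_miou
print(round(implied_prior_miou, 1), round(relative_gain_percent, 1))
```

The relative gain printed at the end is smaller in headline terms than it may first appear, because a 3.3-point absolute margin on a baseline near 67 is a different quantity from a 3.3% relative improvement.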

Abstract

Self-attention networks have revolutionized natural language processing and are making impressive strides in image analysis tasks such as image classification and object detection. Inspired by this success, we investigate the application of self-attention networks to 3D point cloud processing. We design self-attention layers for point clouds and use these to construct self-attention networks for tasks such as semantic scene segmentation, object part segmentation, and object classification. Our Point Transformer design improves upon prior work across domains and tasks. For example, on the challenging S3DIS dataset for large-scale semantic scene segmentation, the Point Transformer attains an mIoU of 70.4% on Area 5, outperforming the strongest prior model by 3.3 absolute percentage points and crossing the 70% mIoU threshold for the first time.
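
To make the idea of self-attention on a point cloud concrete, here is a minimal sketch of generic scaled dot-product attention restricted to each point's k nearest neighbors. It is an illustration under stated assumptions, not the paper's layer: this page does not spell out the Point Transformer design, and the sketch omits learned projections and any position encoding. The names `knn_indices` and `local_self_attention` are hypothetical.

```python
import math
from typing import List, Sequence

Point = Sequence[float]
Feature = Sequence[float]


def knn_indices(points: Sequence[Point], i: int, k: int) -> List[int]:
    """Indices of the k points closest to points[i], the point itself included."""
    order = sorted(range(len(points)), key=lambda j: math.dist(points[i], points[j]))
    return order[:k]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def local_self_attention(
    points: Sequence[Point], features: Sequence[Feature], k: int
) -> List[List[float]]:
    """Replace each point's feature with an attention-weighted mean of its neighbors.

    Assumption: queries, keys, and values are the raw features (identity
    projections). A trained network would use learned linear maps instead.
    """
    dim = len(features[0])
    out = []
    for i in range(len(points)):
        nbrs = knn_indices(points, i, k)
        # Scaled dot product between the query point and each neighbor.
        scores = [
            sum(q * key for q, key in zip(features[i], features[j])) / math.sqrt(dim)
            for j in nbrs
        ]
        weights = softmax(scores)
        out.append(
            [sum(w * features[j][d] for w, j in zip(weights, nbrs)) for d in range(dim)]
        )
    return out


# Four 3D points with 2-dimensional features (hypothetical data).
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (5.0, 5.0, 5.0)]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, -1.0]]
attended = local_self_attention(points, features, k=2)
```

Two properties of this sketch carry over to attention on point sets in general: the output does not depend on the order in which points are listed, and each output feature is a convex combination of neighboring features. Restricting attention to a local neighborhood keeps the cost per point proportional to k rather than to the size of the whole cloud.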

Taxonomy

Topics: Sensor Technology and Measurement Systems

Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Multi-Head Attention · Softmax · Residual Connection · Adam · Attention Is All You Need · Byte Pair Encoding · Layer Normalization