Fast Point Transformer

Chunghyun Park; Yoonwoo Jeong; Minsu Cho; Jaesik Park

arXiv:2112.04702 · cs.CV · April 5, 2022

TL;DR

Fast Point Transformer introduces a lightweight self-attention layer and a voxel hashing-based architecture for processing large-scale 3D point clouds efficiently. It reports inference 129 times faster than Point Transformer while keeping accuracy competitive on 3D semantic segmentation and 3D detection.

Contribution

The paper proposes a lightweight self-attention layer, combined with a voxel hashing-based architecture, for efficient processing of large-scale 3D point clouds.

Findings

01. 129 times faster inference than Point Transformer
02. Competitive accuracy in 3D semantic segmentation
03. Effective for 3D detection tasks

Abstract

The recent success of neural networks enables a better interpretation of 3D point clouds, but processing a large-scale 3D scene remains a challenging problem. Most current approaches divide a large-scale scene into small regions and combine the local predictions together. However, this scheme inevitably involves additional stages for pre- and post-processing and may also degrade the final output due to predictions in a local perspective. This paper introduces Fast Point Transformer that consists of a new lightweight self-attention layer. Our approach encodes continuous 3D coordinates, and the voxel hashing-based architecture boosts computational efficiency. The proposed method is demonstrated with 3D semantic segmentation and 3D detection. The accuracy of our approach is competitive to the best voxel-based method, and our network achieves 129 times faster inference time than the…
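The abstract credits a voxel hashing-based architecture for the efficiency gain. As a rough illustration of the general idea only (not the authors' implementation), the plain-Python sketch below quantizes continuous 3D coordinates into integer voxel keys, stores them in a hash table, and finds occupied neighboring voxels by constant-time lookups. The function names `voxelize` and `neighbor_voxels`, the `voxel_size` value, and the sample points are all hypothetical and chosen for illustration.

```python
from collections import defaultdict
from math import floor


def voxelize(points, voxel_size):
    """Group continuous 3D points by the integer voxel cell that contains them.

    Hypothetical helper: returns a hash table mapping a voxel key (i, j, k)
    to the list of indices of the points that fall inside that voxel.
    """
    table = defaultdict(list)
    for idx, (x, y, z) in enumerate(points):
        key = (floor(x / voxel_size), floor(y / voxel_size), floor(z / voxel_size))
        table[key].append(idx)
    return table


def neighbor_voxels(table, key, radius=1):
    """Return the occupied voxel keys within `radius` cells of `key`.

    Each candidate is checked with a single hash lookup, so the cost depends
    on the neighborhood size and not on the total number of points.
    """
    kx, ky, kz = key
    found = []
    for dx in range(-radius, radius + 1):
        for dy in range(-radius, radius + 1):
            for dz in range(-radius, radius + 1):
                candidate = (kx + dx, ky + dy, kz + dz)
                if candidate in table:  # membership test does not insert a key
                    found.append(candidate)
    return found


# Illustrative data (made up): three nearby points and one far away.
points = [
    (0.01, 0.02, 0.03),
    (0.04, 0.01, 0.02),
    (0.12, 0.02, 0.03),
    (0.95, 0.95, 0.95),
]
table = voxelize(points, voxel_size=0.1)
neighbors = neighbor_voxels(table, (0, 0, 0))
```

In this sketch the first two points share a voxel, the third lands in an adjacent voxel, and the fourth is too far away to appear in the neighborhood query. A local attention layer could then restrict its computation to features gathered from such neighboring voxels, though how the paper does this exactly is not described in the truncated abstract above.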

Code & Models

Repositories

POSTECH-CVLab/FastPointTransformer (PyTorch, official)

Taxonomy

Methods: Attention Is All You Need · Linear Layer · Dropout · Layer Normalization · Label Smoothing · Byte Pair Encoding · Multi-Head Attention · Position-Wise Feed-Forward Layer · Softmax · Adam