Self-positioning Point-based Transformer for Point Cloud Understanding

Jinyoung Park; Sanghyeok Lee; Sihyeon Kim; Yunyang Xiong; Hyunwoo J. Kim

arXiv:2303.16450 · cs.CV · March 30, 2023 · 5 cites

TL;DR

This paper introduces SPoTr, a novel point-based Transformer architecture that captures local and global shape contexts efficiently for point cloud understanding, achieving state-of-the-art results in classification and segmentation tasks.

Contribution

The paper proposes a self-positioning point-based Transformer with a new global cross-attention mechanism that reduces complexity and enhances scalability for point cloud analysis.

Findings

01 Improves shape classification accuracy on ScanObjectNN by 2.6% over previous best models.

02 Effectively captures local and global shape contexts.

03 Demonstrates interpretability of self-positioning points.

Abstract

Transformers have shown superior performance on various computer vision tasks with their capabilities to capture long-range dependencies. Despite the success, it is challenging to directly apply Transformers on point clouds due to their quadratic cost in the number of points. In this paper, we present a Self-Positioning point-based Transformer (SPoTr), which is designed to capture both local and global shape contexts with reduced complexity. Specifically, this architecture consists of local self-attention and self-positioning point-based global cross-attention. The self-positioning points, adaptively located based on the input shape, consider both spatial and semantic information with disentangled attention to improve expressive power. With the self-positioning points, we propose a novel global cross-attention mechanism for point clouds, which improves the scalability of global…
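To make the complexity argument in the abstract concrete, here is a minimal NumPy sketch of cross-attention routed through a small set of adaptively placed points. It is not the authors' implementation: the function name `global_cross_attention`, the learnable `latents`, the Gaussian spatial kernel, and its width `gamma` are all assumptions chosen for illustration. The point it shows is that every attention matrix has shape M×N or N×M (M placed points, N input points), so no N×N matrix is ever formed.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along one axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def global_cross_attention(xyz, feats, latents, gamma=1.0):
    """Illustrative sketch only (assumed design, not the paper's code).

    xyz:     (N, 3) point coordinates
    feats:   (N, C) per-point features
    latents: (M, C) hypothetical learnable vectors, one per placed point
    gamma:   hypothetical width of the Gaussian spatial kernel
    """
    # 1) Place M points adaptively: each position is a semantic-attention
    #    weighted average (convex combination) of the input coordinates.
    semantic = softmax(latents @ feats.T, axis=1)            # (M, N)
    sp_xyz = semantic @ xyz                                  # (M, 3)

    # 2) Aggregate features with separate spatial and semantic weights
    #    (one possible reading of "disentangled attention").
    d2 = ((sp_xyz[:, None, :] - xyz[None, :, :]) ** 2).sum(-1)  # (M, N)
    spatial = np.exp(-gamma * d2)                            # (M, N)
    sp_feats = (spatial * semantic) @ feats                  # (M, C)

    # 3) Distribute global context back: every input point attends over
    #    the M placed points only, so the cost is O(N * M), not O(N^2).
    scale = np.sqrt(feats.shape[1])
    attn = softmax(feats @ sp_feats.T / scale, axis=1)       # (N, M)
    out = attn @ sp_feats                                    # (N, C)
    return sp_xyz, out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    xyz = rng.normal(size=(1024, 3))
    feats = rng.normal(size=(1024, 32))
    latents = rng.normal(size=(16, 32))
    sp_xyz, out = global_cross_attention(xyz, feats, latents)
    print(sp_xyz.shape, out.shape)
```

With N = 1024 and M = 16, the largest attention matrix in this sketch has 16,384 entries, compared with about one million for full self-attention over the same points.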

Code & Models

Repositories

mlvlab/spotr (PyTorch, official)

Taxonomy

Topics: 3D Surveying and Cultural Heritage · 3D Shape Modeling and Analysis · Image Processing and 3D Reconstruction

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Dropout · Dense Connections · Layer Normalization · Adam · Softmax · Residual Connection · Label Smoothing