Self-positioning Point-based Transformer for Point Cloud Understanding
Jinyoung Park, Sanghyeok Lee, Sihyeon Kim, Yunyang Xiong, Hyunwoo J. Kim

TL;DR
This paper introduces SPoTr, a novel point-based Transformer architecture that captures local and global shape contexts efficiently for point cloud understanding, achieving state-of-the-art results in classification and segmentation tasks.
Contribution
The paper proposes a self-positioning point-based Transformer with a new global cross-attention mechanism that reduces complexity and enhances scalability for point cloud analysis.
Findings
Achieves a 2.6% accuracy gain over the previous best models on shape classification with ScanObjectNN.
Effectively captures local and global shape contexts.
Demonstrates interpretability of self-positioning points.
Abstract
Transformers have shown superior performance on various computer vision tasks with their capabilities to capture long-range dependencies. Despite the success, it is challenging to directly apply Transformers on point clouds due to their quadratic cost in the number of points. In this paper, we present a Self-Positioning point-based Transformer (SPoTr), which is designed to capture both local and global shape contexts with reduced complexity. Specifically, this architecture consists of local self-attention and self-positioning point-based global cross-attention. The self-positioning points, adaptively located based on the input shape, consider both spatial and semantic information with disentangled attention to improve expressive power. With the self-positioning points, we propose a novel global cross-attention mechanism for point clouds, which improves the scalability of global…
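The complexity argument in the abstract can be illustrated with a minimal sketch. It is not the SPoTr formulation: it uses plain scaled dot-product attention, omits the disentangled spatial and semantic attention, and treats the self-positioning points as a given set of query vectors rather than computing them from the input shape. The names `attend`, `global_cross_attention`, and `anchor_queries` are hypothetical. The sketch only shows why routing global context through M intermediate points (M much smaller than N) needs about 2·N·M attention scores instead of N·N.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(queries, keys, values):
    """Scaled dot-product attention; returns one output row per query."""
    scale = 1.0 / math.sqrt(len(keys[0]))
    channels = len(values[0])
    out = []
    for q in queries:
        weights = softmax([dot(q, k) * scale for k in keys])
        out.append([sum(w * v[c] for w, v in zip(weights, values))
                    for c in range(channels)])
    return out


def global_cross_attention(point_feats, anchor_queries):
    """Two-step global attention through M anchors (assumed, simplified).

    Step 1: each of the M anchors gathers a summary of all N points (M*N scores).
    Step 2: each of the N points reads from the M anchor summaries (N*M scores).
    """
    anchor_feats = attend(anchor_queries, point_feats, point_feats)
    return attend(point_feats, anchor_feats, anchor_feats)


random.seed(0)
N, M, C = 64, 4, 8  # points, anchors, feature channels (illustrative sizes)
points = [[random.gauss(0, 1) for _ in range(C)] for _ in range(N)]
anchors = [[random.gauss(0, 1) for _ in range(C)] for _ in range(M)]

out = global_cross_attention(points, anchors)
full_scores = N * N        # scores needed by full self-attention
anchor_scores = 2 * N * M  # scores needed by the two-step variant
print(len(out), len(out[0]), full_scores, anchor_scores)  # 64 8 4096 512
```

With these illustrative sizes the two-step variant computes 512 attention scores where full self-attention would compute 4096, and the gap widens linearly as N grows while M stays fixed.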
Taxonomy
Topics: 3D Surveying and Cultural Heritage · 3D Shape Modeling and Analysis · Image Processing and 3D Reconstruction
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Dropout · Dense Connections · Layer Normalization · Adam · Softmax · Residual Connection · Label Smoothing
