CloudAttention: Efficient Multi-Scale Attention Scheme For 3D Point Cloud Learning

Mahdi Saleh; Yige Wang; Nassir Navab; Benjamin Busam; Federico Tombari

arXiv:2208.00524·cs.CV·August 2, 2022

Open Access

TL;DR

CloudAttention introduces an efficient multi-scale attention scheme for 3D point cloud learning, combining local and global attention with multi-scale tokenization to improve accuracy and reduce computational costs in shape classification and segmentation.

Contribution

The paper proposes a hierarchical attention framework built from local attention units and multi-scale tokenization, and reports state-of-the-art shape classification and comparable segmentation performance with fewer computations.

Findings

01 State-of-the-art shape classification accuracy

02 Comparable segmentation performance with fewer computations

03 Half the latency and parameter count of previous methods

Abstract

Processing 3D data efficiently has always been a challenge. Spatial operations on large-scale point clouds, stored as sparse data, require extra cost. Attracted by the success of transformers, researchers are using multi-head attention for vision tasks. However, attention calculations in transformers come with quadratic complexity in the number of inputs and miss spatial intuition on sets like point clouds. We redesign set transformers in this work and incorporate them into a hierarchical framework for shape classification and part and scene segmentation. We propose our local attention unit, which captures features in a spatial neighborhood. We also compute efficient and dynamic global cross attentions by leveraging sampling and grouping at each iteration. Finally, to mitigate the non-heterogeneity of point clouds, we propose an efficient Multi-Scale Tokenization (MST), which extracts…
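The two attention ideas named in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names (`knn_indices`, `local_attention`, `global_cross_attention`), the single-head formulation, the brute-force neighbor search, and the uniform random sampling are all assumptions made for illustration. It only shows why restricting keys to k spatial neighbors, or to m sampled points, shrinks the attention matrix from N x N to N x k or N x m.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)


def knn_indices(points, k):
    # Brute-force k nearest neighbors (assumption: fine for a toy example).
    # This step is itself O(N^2); a spatial index would avoid that cost.
    diff = points[:, None, :] - points[None, :, :]
    dist2 = np.sum(diff * diff, axis=-1)          # (N, N)
    return np.argsort(dist2, axis=1)[:, :k]       # (N, k), includes the point itself


def local_attention(points, feats, w_q, w_k, w_v, k=8):
    # Each point attends only to its k spatial neighbors,
    # so the score matrix is (N, k) instead of (N, N).
    idx = knn_indices(points, k)
    q = feats @ w_q                                # (N, d)
    keys = (feats @ w_k)[idx]                      # (N, k, d)
    vals = (feats @ w_v)[idx]                      # (N, k, d)
    scores = np.einsum("nd,nkd->nk", q, keys) / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)
    return np.einsum("nk,nkd->nd", weights, vals), weights


def global_cross_attention(feats, w_q, w_k, w_v, m, rng):
    # All N points query a subset of m sampled points.
    # Uniform random sampling is a stand-in (assumption), not the paper's scheme.
    sample = rng.choice(feats.shape[0], size=m, replace=False)
    q = feats @ w_q                                # (N, d)
    keys = feats[sample] @ w_k                     # (m, d)
    vals = feats[sample] @ w_v                     # (m, d)
    scores = q @ keys.T / np.sqrt(q.shape[-1])     # (N, m)
    weights = softmax(scores, axis=-1)
    return weights @ vals, weights
```

With N = 1024 points, k = 16 and m = 64, the two score matrices hold 16,384 and 65,536 entries, against 1,048,576 for full self-attention.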

Taxonomy

Topics: 3D Shape Modeling and Analysis · 3D Surveying and Cultural Heritage · Human Pose and Action Recognition

Methods: Softmax · Linear Layer