Hierarchical Point Cloud Encoding and Decoding with Lightweight Self-Attention based Model
En Yen Puang, Hao Zhang, Hongyuan Zhu, Wei Jing

TL;DR
This paper introduces SA-CNN, a lightweight hierarchical self-attention model for point cloud data that effectively captures contextual information and performs well across various 3D shape understanding tasks with lower complexity.
Contribution
SA-CNN is a novel hierarchical self-attention architecture that achieves state-of-the-art or comparable results with significantly reduced model complexity.
Findings
Achieves state-of-the-art or comparable performance on classification and part segmentation benchmarks.
Maintains low model complexity compared to existing methods.
Effectively reconstructs and visualizes multi-stage point clouds.
Abstract
In this paper we present SA-CNN, a hierarchical and lightweight self-attention based encoding and decoding architecture for representation learning of point cloud data. The proposed SA-CNN introduces convolution and transposed convolution stacks to capture and generate contextual information among unordered 3D points. Following the conventional hierarchical pipeline, the encoding process extracts features in a local-to-global manner, while the decoding process generates features and point clouds in coarse-to-fine, multi-resolution stages. We demonstrate that SA-CNN is capable of a wide range of applications, namely classification, part segmentation, reconstruction, shape retrieval, and unsupervised classification. While achieving state-of-the-art or comparable performance on the benchmarks, SA-CNN keeps its model complexity several orders of magnitude lower than the others. In terms of…
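The abstract's core idea, self-attention capturing context among unordered 3D points, can be illustrated with a minimal NumPy sketch. This is generic scaled dot-product self-attention followed by max pooling, not the paper's actual SA-CNN layer; the function name `self_attention`, the weight matrices, and all sizes are assumptions chosen for illustration.

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention over a set of N point features.

    x: (N, C_in); w_q, w_k: (C_in, d); w_v: (C_in, C_out).
    Hypothetical helper for illustration, not the paper's layer.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])        # (N, N) pairwise affinities
    scores -= scores.max(axis=-1, keepdims=True)   # stabilise the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v                             # (N, C_out) context-aware features

rng = np.random.default_rng(0)
points = rng.normal(size=(16, 3))                  # toy cloud: 16 points, xyz only
w_q = rng.normal(size=(3, 8))
w_k = rng.normal(size=(3, 8))
w_v = rng.normal(size=(3, 32))

features = self_attention(points, w_q, w_k, w_v)   # per-point features, (16, 32)
global_feature = features.max(axis=0)              # pooled descriptor, (32,)

# Reordering the input points only reorders the per-point features,
# so the max-pooled descriptor does not depend on point order.
perm = rng.permutation(16)
features_perm = self_attention(points[perm], w_q, w_k, w_v)
```

Because attention weights depend only on pairwise feature affinities, the layer needs no fixed point ordering, which is what makes this family of operators a natural fit for point sets. A hierarchical encoder would apply such a step repeatedly on progressively smaller point subsets; that part is omitted here.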
Taxonomy
Methods: Transposed convolution · Convolution
