Exploring Semantic Masked Autoencoder for Self-supervised Point Cloud Understanding
Yixin Zha, Chuxin Wang, Wenfei Yang, Tianzhu Zhang

TL;DR
This paper introduces a Semantic Masked Autoencoder for point cloud understanding that uses semantic prototypes and enhanced masking strategies to improve the capture of meaningful component relationships in self-supervised learning.
Contribution
It proposes a novel semantic-guided autoencoder with prototype-based modeling and masking strategies to better capture semantic relationships in point clouds.
Findings
Improved performance on ScanObjectNN, ModelNet40, and ShapeNetPart datasets.
Effective semantic modeling enhances downstream task accuracy.
Addresses limitations of random masking in self-supervised learning.
Abstract
Point cloud understanding aims to acquire robust and general feature representations from unlabeled data. Methods based on masked point modeling have recently shown strong performance across various downstream tasks. These pre-training methods rely on random masking strategies to establish the perception of point clouds by restoring corrupted inputs, which prevents the self-supervised models from capturing reasonable semantic relationships. To address this issue, we propose Semantic Masked Autoencoder, which comprises two main components: a prototype-based component semantic modeling module and a component semantic-enhanced masking strategy. Specifically, in the component semantic modeling module, we design a component semantic guidance mechanism to direct a set of learnable prototypes in capturing the semantics of different components from objects. Leveraging…
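The abstract contrasts random masking with masking guided by component semantics. The sketch below is only a rough illustration of that contrast, not the authors' procedure, which the truncated abstract does not specify. It assumes patch features and prototypes are plain vectors, assigns each patch to its most similar prototype by cosine similarity, and then hides whole prototype groups until a target ratio is reached. All function names (`random_mask`, `assign_to_prototypes`, `component_mask`) and the masking rule itself are hypothetical.

```python
import math
import random


def random_mask(num_patches, mask_ratio, rng):
    """Baseline: mask a uniformly random subset of patches."""
    num_masked = round(num_patches * mask_ratio)
    masked = set(rng.sample(range(num_patches), num_masked))
    return [i in masked for i in range(num_patches)]


def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def assign_to_prototypes(patch_feats, prototypes):
    """Assign each patch to the index of its most similar prototype."""
    return [
        max(range(len(prototypes)), key=lambda k: cosine(f, prototypes[k]))
        for f in patch_feats
    ]


def component_mask(assignments, num_prototypes, mask_ratio, rng):
    """Hypothetical component-level masking (an assumption, not the paper's rule):
    visit prototype groups in random order and mask their patches until the
    target number of masked patches is reached."""
    target = round(len(assignments) * mask_ratio)
    mask = [False] * len(assignments)
    order = list(range(num_prototypes))
    rng.shuffle(order)
    for k in order:
        remaining = target - sum(mask)
        if remaining <= 0:
            break
        members = [i for i, a in enumerate(assignments) if a == k]
        if len(members) > remaining:
            # Partially mask the last group so the ratio is met exactly.
            members = rng.sample(members, remaining)
        for i in members:
            mask[i] = True
    return mask


if __name__ == "__main__":
    rng = random.Random(0)
    feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [-1.0, 0.0]]
    protos = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
    groups = assign_to_prototypes(feats, protos)
    print("assignments:", groups)
    print("random mask:   ", random_mask(len(feats), 0.6, rng))
    print("component mask:", component_mask(groups, len(protos), 0.6, rng))
```

Under these assumptions, the random baseline scatters masked patches across the object, while the component-level variant tends to hide entire groups at once, so reconstruction has to rely on relationships between components rather than on nearby visible patches.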
Taxonomy
Topics: 3D Shape Modeling and Analysis · 3D Surveying and Cultural Heritage · Generative Adversarial Networks and Image Synthesis
