Exploring Semantic Masked Autoencoder for Self-supervised Point Cloud Understanding

Yixin Zha; Chuxin Wang; Wenfei Yang; Tianzhu Zhang

arXiv:2506.21957 · cs.CV · June 30, 2025

TL;DR

This paper introduces a Semantic Masked Autoencoder for point cloud understanding that uses semantic prototypes and enhanced masking strategies to improve the capture of meaningful component relationships in self-supervised learning.

Contribution

It proposes a novel semantic-guided autoencoder with prototype-based modeling and masking strategies to better capture semantic relationships in point clouds.

Findings

01. Improved performance on the ScanObjectNN, ModelNet40, and ShapeNetPart datasets.

02. Effective semantic modeling enhances downstream task accuracy.

03. Addresses limitations of random masking in self-supervised learning.

Abstract

Point cloud understanding aims to acquire robust and general feature representations from unlabeled data. Masked point modeling-based methods have recently shown strong performance across various downstream tasks. These pre-training methods rely on random masking strategies, building their perception of point clouds by restoring corrupted inputs, which prevents the self-supervised models from capturing reasonable semantic relationships. To address this issue, we propose the Semantic Masked Autoencoder, which comprises two main components: a prototype-based component semantic modeling module and a component semantic-enhanced masking strategy. Specifically, in the component semantic modeling module, we design a component semantic guidance mechanism that directs a set of learnable prototypes to capture the semantics of different components of objects. Leveraging…
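To make the contrast in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It compares uniform random patch masking with a hypothetical component-aware variant in which patches are first assigned to their nearest prototype by cosine similarity and whole components are then hidden together. The function names, the cosine-similarity assignment rule, and the "mask whole components until the budget is met" policy are all assumptions made for illustration; in the paper the prototypes are learnable, whereas here they are fixed vectors.

```python
import math
import random


def random_patch_mask(num_patches, mask_ratio, rng):
    """Baseline: choose patch indices to hide uniformly at random."""
    num_masked = int(num_patches * mask_ratio)
    return sorted(rng.sample(range(num_patches), num_masked))


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def assign_to_prototypes(patch_features, prototypes):
    """Assign each patch to the index of its most similar prototype.

    Assumption: nearest-prototype assignment by cosine similarity stands in
    for the paper's component semantic modeling, which is not detailed here.
    """
    return [
        max(range(len(prototypes)), key=lambda k: cosine(f, prototypes[k]))
        for f in patch_features
    ]


def component_aware_mask(assignments, mask_ratio, rng):
    """Hypothetical policy: hide whole components until the budget is met."""
    budget = int(len(assignments) * mask_ratio)
    groups = {}
    for idx, comp in enumerate(assignments):
        groups.setdefault(comp, []).append(idx)
    order = list(groups)
    rng.shuffle(order)  # visit components in random order
    masked = []
    for comp in order:
        remaining = budget - len(masked)
        if remaining <= 0:
            break
        # Take the whole component, or as much of it as the budget allows.
        masked.extend(groups[comp][:remaining])
    return sorted(masked)


if __name__ == "__main__":
    rng = random.Random(0)
    features = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.2], [0.0, 1.0], [0.1, 0.9], [0.2, 0.8]]
    prototypes = [[1.0, 0.0], [0.0, 1.0]]  # fixed here; learnable in the paper
    assignments = assign_to_prototypes(features, prototypes)
    print("random mask:   ", random_patch_mask(len(features), 0.5, rng))
    print("component mask:", component_aware_mask(assignments, 0.5, rng))
```

With two components of three patches each and a 50% mask ratio, the component-aware variant hides one entire component, while the random baseline typically scatters hidden patches across both. That scattering is the behavior the abstract identifies as a weakness of random masking.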

Taxonomy

Topics: 3D Shape Modeling and Analysis · 3D Surveying and Cultural Heritage · Generative Adversarial Networks and Image Synthesis