M$^3$CS: Multi-Target Masked Point Modeling with Learnable Codebook and Siamese Decoders

Qibo Qiu; Honghui Yang; Wenxiao Wang; Shun Zhang; Haiming Gao; Haochao Ying; Wei Hua; Xiaofei He

arXiv:2309.13235·cs.CV·September 26, 2023

Open Access

TL;DR

M$^3$CS is a multi-target masked point modeling framework that uses a learnable codebook and siamese decoders to learn both geometric and semantic features during point cloud pre-training, which improves downstream task performance.

Contribution

The paper proposes a multi-target masked point modeling approach that combines a learnable codebook with siamese decoders to improve feature representation in point cloud pre-training.

Findings

1. Outperforms existing methods in classification tasks.

2. Achieves superior results in segmentation tasks.

3. Effectively captures both geometric details and semantic contexts.

Abstract

Masked point modeling has become a promising scheme for self-supervised pre-training on point clouds. Existing methods reconstruct either the original points or related features as the pre-training objective. However, given the diversity of downstream tasks, the model needs both low- and high-level representation modeling capabilities to capture geometric details and semantic contexts during pre-training. To this end, M$^3$CS is proposed to equip the model with both abilities. Specifically, with a masked point cloud as input, M$^3$CS introduces two decoders to predict masked representations and the original points simultaneously. Because an extra decoder doubles the parameters of the decoding process and may lead to overfitting, we propose siamese decoders to keep the number of learnable parameters unchanged. Further, we propose an online codebook…
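The parameter-sharing argument in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the single linear layer, the token width `dim`, the helper names, and the assumption that the two branches differ only in their input tokens are all hypothetical choices made for illustration. It only shows that reusing one set of decoder weights for both targets keeps the parameter count at that of a single decoder.

```python
import random

def make_linear(n_in, n_out, rng):
    """Create a bias-free linear layer as a list of weight rows (toy stand-in for a decoder)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]

def apply_linear(weights, x):
    """Multiply the weight matrix by the input vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def count_params(*layers):
    """Count scalars across distinct layer objects; a shared layer is counted once."""
    sizes = {id(layer): sum(len(row) for row in layer) for layer in layers}
    return sum(sizes.values())

rng = random.Random(0)
dim = 8  # hypothetical token width, not taken from the paper

# Baseline: one independent decoder per reconstruction target.
point_decoder = make_linear(dim, dim, rng)
feature_decoder = make_linear(dim, dim, rng)
separate_params = count_params(point_decoder, feature_decoder)

# Siamese variant: a single set of weights serves both branches.
shared_decoder = make_linear(dim, dim, rng)
siamese_params = count_params(shared_decoder, shared_decoder)

# Assumption: the two branches differ only in the tokens they receive.
point_query = [rng.random() for _ in range(dim)]
feature_query = [rng.random() for _ in range(dim)]
point_out = apply_linear(shared_decoder, point_query)
feature_out = apply_linear(shared_decoder, feature_query)

print(separate_params, siamese_params)  # prints "128 64"
```

With two independent decoders the toy model holds 128 weights; with the shared decoder it holds 64, the same as a single-decoder baseline, while still producing one output per target.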

Taxonomy

Topics: 3D Shape Modeling and Analysis · 3D Surveying and Cultural Heritage · Image Processing and 3D Reconstruction