Boosting Point-BERT by Multi-choice Tokens
Kexue Fu, Mingzhi Yuan, Manning Wang

TL;DR
This paper introduces McP-BERT, a novel pre-training framework for point cloud understanding that uses multi-choice tokens to improve tokenization accuracy and downstream task performance without extra computational cost.
Contribution
It proposes multi-choice tokens for point cloud pre-training, addressing token ambiguity issues in Point-BERT and leveraging high-level semantics for better supervision.
Findings
Achieves 94.1% accuracy on ModelNet40
Sets new state-of-the-art in few-shot learning
Improves downstream task performance with minimal overhead
Abstract
Masked language modeling (MLM) has become one of the most successful self-supervised pre-training tasks. Inspired by its success, Point-BERT, a pioneering work on point clouds, proposed masked point modeling (MPM) to pre-train point transformers on large-scale unannotated datasets. Despite its strong performance, we find that the inherent difference between language and point clouds tends to cause ambiguous tokenization of point clouds: unlike language, point clouds have no gold standard for tokenization. Point-BERT uses a discrete Variational AutoEncoder (dVAE) as its tokenizer, but the dVAE may generate different token ids for semantically similar patches and the same token id for semantically dissimilar patches. To tackle this problem, we propose McP-BERT, a pre-training framework with multi-choice tokens. Specifically, we ease the previous single-choice constraint on patch…
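The contrast between single-choice and multi-choice supervision described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `temperature` parameter, and the toy logits are assumptions made for illustration, and the tokenizer is stood in for by a plain array of per-patch logits over a three-entry vocabulary. The sketch shows how two patches with nearly tied tokenizer scores receive opposite hard token ids, while their soft (multi-choice) targets stay close.

```python
import numpy as np

def softmax(logits, axis=-1):
    # Numerically stable softmax along the given axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def single_choice_targets(tokenizer_logits):
    # Single-choice supervision: one hard token id per patch (argmax),
    # returned as a one-hot matrix of shape (num_patches, vocab_size).
    ids = tokenizer_logits.argmax(axis=-1)
    return np.eye(tokenizer_logits.shape[-1])[ids]

def multi_choice_targets(tokenizer_logits, temperature=1.0):
    # Multi-choice supervision (assumed form): a probability vector over
    # all token ids per patch. `temperature` is a hypothetical knob.
    return softmax(tokenizer_logits / temperature)

def cross_entropy(pred_logits, targets):
    # Mean cross-entropy between predicted logits and (hard or soft) targets.
    log_probs = np.log(softmax(pred_logits))
    return float(-(targets * log_probs).sum(axis=-1).mean())

# Toy tokenizer output for two similar patches: the top two scores are
# almost tied, so the argmax flips between token 0 and token 1.
tokenizer_logits = np.array([[2.0, 1.9, -3.0],
                             [1.9, 2.0, -3.0]])

hard = single_choice_targets(tokenizer_logits)
soft = multi_choice_targets(tokenizer_logits)

# L1 distance between the two patches' targets under each scheme.
hard_gap = np.abs(hard[0] - hard[1]).sum()
soft_gap = np.abs(soft[0] - soft[1]).sum()
print("hard targets:", hard.tolist(), "gap:", hard_gap)
print("soft targets:", soft.round(3).tolist(), "gap:", round(float(soft_gap), 3))
```

Either target matrix can be passed to `cross_entropy` together with a model's predicted logits for the masked patches; with soft targets, a prediction that splits its mass between the two plausible token ids is no longer penalized as heavily as under the one-hot target.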
Taxonomy
Topics: 3D Shape Modeling and Analysis · Advanced Neural Network Applications · 3D Surveying and Cultural Heritage
