Contrast with Reconstruct: Contrastive 3D Representation Learning Guided by Generative Pretraining
Zekun Qi, Runpei Dong, Guofan Fan, Zheng Ge, Xiangyu Zhang, Kaisheng Ma, Li Yi

TL;DR
This paper introduces ReCon, a unified 3D representation learning framework that combines contrastive and generative paradigms through ensemble distillation, achieving state-of-the-art results on 3D classification tasks.
Contribution
ReCon is the first method to effectively unify contrastive and generative 3D learning paradigms using ensemble distillation and a novel encoder-decoder block, overcoming their individual limitations.
Findings
ReCon achieves 91.26% accuracy on ScanObjectNN.
ReCon outperforms previous state-of-the-art methods.
The proposed approach reduces over-fitting and pattern mismatch issues.
Abstract
Mainstream 3D representation learning approaches are built upon contrastive or generative modeling pretext tasks, where great improvements in performance on various downstream tasks have been achieved. However, we find these two paradigms have different characteristics: (i) contrastive models are data-hungry that suffer from a representation over-fitting issue; (ii) generative models have a data filling issue that shows inferior data scaling capacity compared to contrastive models. This motivates us to learn 3D representations by sharing the merits of both paradigms, which is non-trivial due to the pattern difference between the two paradigms. In this paper, we propose Contrast with Reconstruct (ReCon) that unifies these two paradigms. ReCon is trained to learn from both generative modeling teachers and single/cross-modal contrastive teachers through ensemble distillation, where the…
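To make the idea of learning from both kinds of teachers concrete, here is a minimal sketch of a combined objective: a reconstruction term plus one contrastive term per teacher. It is an illustration under stated assumptions, not the authors' implementation. The specific loss choices (Chamfer distance for reconstruction, InfoNCE for contrast), the function names, the teacher names, and the weights are all hypothetical and do not come from the abstract.

```python
import math

def chamfer_distance(pred, target):
    """Symmetric Chamfer distance between two point sets (lists of xyz tuples).
    Assumed here as the reconstruction loss; the abstract does not name one."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    def one_way(src, dst):
        # Mean squared distance from each source point to its nearest neighbour.
        return sum(min(sq_dist(p, q) for q in dst) for p in src) / len(src)

    return one_way(pred, target) + one_way(target, pred)

def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(student, teacher, temperature=0.1):
    """InfoNCE loss: the i-th student embedding should match the i-th teacher
    embedding and differ from the other teacher embeddings in the batch."""
    total = 0.0
    for i, s in enumerate(student):
        logits = [cosine(s, t) / temperature for t in teacher]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(student)

def combined_loss(pred_points, target_points, student_emb, teacher_embs, weights):
    """Reconstruction loss plus a weighted contrastive loss per teacher.
    teacher_embs and weights are dicts keyed by a hypothetical teacher name."""
    loss = chamfer_distance(pred_points, target_points)
    for name, emb in teacher_embs.items():
        loss += weights[name] * info_nce(student_emb, emb)
    return loss

# Toy usage: two samples, two hypothetical cross-modal teachers.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
student = [(1.0, 0.0), (0.0, 1.0)]
teachers = {"image": [(1.0, 0.0), (0.0, 1.0)], "text": [(0.9, 0.1), (0.1, 0.9)]}
loss = combined_loss(points, points, student, teachers, {"image": 1.0, "text": 0.5})
```

In this toy case the reconstruction term is zero because the predicted and target point sets are identical, so the total is driven entirely by the two contrastive terms. A plain weighted sum like this does not address the pattern difference between paradigms that the abstract describes; it only shows what combining the two signals looks like.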
Taxonomy
Topics: Human Pose and Action Recognition · 3D Shape Modeling and Analysis · Generative Adversarial Networks and Image Synthesis
