SimC3D: A Simple Contrastive 3D Pretraining Framework Using RGB Images
Jiahua Dong, Tong Wu, Rui Qian, Jiaqi Wang

TL;DR
SimC3D introduces a novel 3D pretraining framework that uses only RGB images with depth estimation, eliminating the need for costly point cloud data and outperforming previous methods in downstream tasks.
Contribution
It is the first to pretrain 3D backbones solely from RGB images using a simple contrastive framework without 2D backbones or point clouds.
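To make "simple contrastive framework" concrete, the sketch below shows an InfoNCE-style loss between two batches of features, which is a common objective in contrastive pretraining. This summary does not confirm that SimC3D uses this exact loss; the function name `info_nce`, the temperature value, and the assumption that row i of each batch forms a positive pair are all illustrative.

```python
import numpy as np

def info_nce(z_a, z_b, temperature=0.07):
    """InfoNCE-style loss for two (N, D) feature batches.

    Assumption: row i of z_a and row i of z_b are a positive pair;
    every other row of z_b acts as a negative for row i of z_a.
    """
    z_a = np.asarray(z_a, dtype=np.float64)
    z_b = np.asarray(z_b, dtype=np.float64)
    # L2-normalise so the dot product is a cosine similarity.
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature
    # Subtract the row-wise max for a numerically stable log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positive pairs sit on the diagonal.
    return float(-np.mean(np.diag(log_prob)))
```

The loss is near zero when each feature matches only its own pair and grows when the pairs are misaligned.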
Findings
Outperforms previous point cloud-based pretraining methods on downstream tasks.
Effective with multiple image datasets, demonstrating scalability.
Eliminates the need for 3D point cloud data in pretraining.
Abstract
The 3D contrastive learning paradigm has demonstrated remarkable performance in downstream tasks through pretraining on point cloud data. Recent advances involve additional 2D image priors associated with 3D point clouds for further improvement. Nonetheless, these existing frameworks are constrained by the restricted range of available point cloud datasets, primarily due to the high costs of obtaining point cloud data. To this end, we propose SimC3D, a simple but effective 3D contrastive learning framework, for the first time, pretraining 3D backbones from pure RGB image data. SimC3D performs contrastive 3D pretraining with three appealing properties. (1) Pure image data: SimC3D simplifies the dependency of costly 3D point clouds and pretrains 3D backbones using solely RGB images. By employing depth estimation and suitable data processing, the monocular synthesized point cloud shows…
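As an illustration of how a point cloud can be synthesized from a single image, the sketch below back-projects an estimated depth map through a pinhole camera model. It is a generic unprojection step, not the paper's actual data pipeline; the function name `depth_to_point_cloud` and the intrinsics `fx`, `fy`, `cx`, `cy` are assumptions, and the depth map is taken as already produced by some monocular depth estimator.

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project an (H, W) depth map into an (N, 3) point cloud.

    Assumes a pinhole camera with focal lengths (fx, fy) and principal
    point (cx, cy). Pixels with non-finite or non-positive depth are dropped.
    """
    depth = np.asarray(depth, dtype=np.float64)
    h, w = depth.shape
    # Pixel grids: u indexes columns, v indexes rows; both have shape (H, W).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    valid = np.isfinite(depth) & (depth > 0)
    z = depth[valid]
    x = (u[valid] - cx) * z / fx
    y = (v[valid] - cy) * z / fy
    return np.stack([x, y, z], axis=1)
```

Points come out in camera coordinates, in row-major pixel order. Any further processing the abstract alludes to (for example filtering or resampling) is outside this sketch.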
Taxonomy
Topics: Manufacturing Process and Optimization · 3D Surveying and Cultural Heritage · Robotic Mechanisms and Dynamics
Methods: Contrastive Learning
