PointCMC: Cross-Modal Multi-Scale Correspondences Learning for Point Cloud Understanding
Honggu Zhou, Xiaogang Peng, Jiawei Mao, Zizhao Wu, Ming Zeng

TL;DR
PointCMC introduces a self-supervised cross-modal learning framework that models multi-scale correspondences between images and point clouds, improving performance on 3D understanding tasks such as classification and segmentation.
Contribution
It proposes a novel multi-scale correspondence learning approach with local-to-local, local-to-global, and global-to-global modules for enhanced point cloud representation.
Findings
Outperforms state-of-the-art methods in 3D classification
Improves segmentation accuracy
Effective cross-modal correspondence modeling
Abstract
Some self-supervised cross-modal learning approaches have recently demonstrated the potential of image signals for enhancing point cloud representation. However, how to directly model cross-modal local and global correspondences in a self-supervised fashion remains an open question. To address this, we propose PointCMC, a novel cross-modal method that models multi-scale correspondences across modalities for self-supervised point cloud representation learning. In particular, PointCMC is composed of: (1) a local-to-local (L2L) module that learns local correspondences through optimized cross-modal local geometric features, (2) a local-to-global (L2G) module that learns the correspondences between local and global features across modalities via local-global discrimination, and (3) a global-to-global (G2G) module, which leverages auxiliary global contrastive loss between the point cloud and…
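To make the idea of a global contrastive loss concrete, here is a minimal sketch of a generic symmetric InfoNCE-style objective over paired global features. It is not the authors' formulation: the abstract above is truncated, so pairing point cloud features with image features, the temperature value of 0.07, and every function name below are assumptions made for illustration only.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def info_nce(anchors, positives, temperature=0.07):
    """Mean InfoNCE loss where anchors[i] is paired with positives[i].

    Every other row of `positives` acts as a negative for anchors[i].
    The temperature is an assumed hyperparameter, not taken from the paper.
    """
    a = [l2_normalize(v) for v in anchors]
    p = [l2_normalize(v) for v in positives]
    total = 0.0
    for i, ai in enumerate(a):
        # Cosine similarity between anchor i and every candidate, scaled.
        logits = [sum(x * y for x, y in zip(ai, pj)) / temperature for pj in p]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(z - m) for z in logits))
        total += log_denominator - logits[i]
    return total / len(a)


def symmetric_global_contrastive_loss(point_feats, image_feats, temperature=0.07):
    """Average the point-to-image and image-to-point InfoNCE terms (hypothetical helper)."""
    return 0.5 * (
        info_nce(point_feats, image_feats, temperature)
        + info_nce(image_feats, point_feats, temperature)
    )


if __name__ == "__main__":
    # Toy global features: row i of each list describes the same object.
    point_feats = [[1.0, 0.0], [0.0, 1.0]]
    image_feats = [[0.9, 0.1], [0.1, 0.9]]
    print(symmetric_global_contrastive_loss(point_feats, image_feats))
    # Mismatched pairs should yield a larger loss than matched ones.
    print(symmetric_global_contrastive_loss(point_feats, image_feats[::-1]))
```

In this toy setup the loss is small when each point cloud feature lines up with its own image feature and grows when the pairs are swapped, which is the behaviour a global cross-modal contrastive term is meant to encourage.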
Taxonomy
Topics: 3D Surveying and Cultural Heritage · 3D Shape Modeling and Analysis · Remote Sensing and LiDAR Applications
