Hyperbolic Contrastive Learning for Hierarchical 3D Point Cloud Embedding

Yingjie Liu; Pengyu Zhang; Ziyao He; Mingsong Chen; Xuan Tang; Xian Wei

arXiv:2501.02285 · cs.CV · January 8, 2025 · 3 cites

Open Access

TL;DR

This paper introduces a hyperbolic contrastive learning approach for hierarchical 3D point cloud embedding, leveraging multi-modal regularizers to improve 3D understanding and transfer from text and images.

Contribution

The paper extends hyperbolic contrastive pre-training to 3D point clouds and develops regularizers for hierarchical multi-modal embeddings, which improve downstream task performance.

Findings

01

Significant improvement in 3D point cloud task performance

02

Effective hierarchical embeddings across text, image, and 3D modalities

03

Enhanced transfer learning capabilities from multi-modal data

Abstract

Hyperbolic spaces allow for more efficient modeling of complex, hierarchical structures, which is particularly beneficial in tasks involving multi-modal data. Although hyperbolic geometries have proven effective for language-image pre-training, their capacity to unify language, image, and 3D Point Cloud modalities remains under-explored. We extend hyperbolic multi-modal contrastive pre-training to the 3D Point Cloud modality. Additionally, we explore entailment, modality gap, and alignment regularizers for learning hierarchical 3D embeddings and facilitating the transfer of knowledge from both Text and Image modalities. These regularizers enable the learning of intra-modal hierarchy within each modality and inter-modal hierarchy across text, 2D images, and 3D Point Clouds. Experimental results demonstrate that our proposed training strategy yields an outstanding 3D Point Cloud…
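
As a rough illustration of what contrastive learning in hyperbolic space involves, the sketch below computes geodesic distances and uses their negatives as similarity scores in an InfoNCE-style loss. The abstract does not say which hyperbolic model, curvature, temperature, or loss form the authors use, so everything here is an assumption: the Lorentz (hyperboloid) model with curvature -1, a fixed temperature of 0.1, and hypothetical helper names and toy 2-D features. It does not implement the entailment, modality gap, or alignment regularizers.

```python
import math

def lift_to_hyperboloid(v):
    """Map Euclidean space components v onto the unit hyperboloid
    (assumed curvature -1) by setting the time component x0 = sqrt(1 + |v|^2)."""
    x0 = math.sqrt(1.0 + sum(c * c for c in v))
    return [x0] + list(v)

def lorentz_inner(x, y):
    """Lorentzian inner product: -x0*y0 + sum_i xi*yi."""
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))

def hyperbolic_distance(x, y):
    """Geodesic distance on the unit hyperboloid: arccosh(-<x, y>_L)."""
    # Clamp at 1.0 to guard against rounding just below the arccosh domain.
    return math.acosh(max(1.0, -lorentz_inner(x, y)))

def contrastive_loss(anchors, candidates, temperature=0.1):
    """InfoNCE-style loss with negative hyperbolic distance as similarity.
    anchors[i] is assumed to be the positive match of candidates[i]."""
    total = 0.0
    for i, a in enumerate(anchors):
        logits = [-hyperbolic_distance(a, c) / temperature for c in candidates]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_norm = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_norm - logits[i]
    return total / len(anchors)

# Hypothetical toy features: two point-cloud embeddings and two text embeddings.
pc_feats = [lift_to_hyperboloid(v) for v in ([0.5, 0.1], [-0.3, 0.8])]
txt_feats = [lift_to_hyperboloid(v) for v in ([0.45, 0.15], [-0.25, 0.7])]

loss_matched = contrastive_loss(pc_feats, txt_feats)        # correct pairing
loss_swapped = contrastive_loss(pc_feats, txt_feats[::-1])  # wrong pairing
```

With these toy inputs the correctly paired batch gives a lower loss than the swapped one, which is the behavior a contrastive objective is meant to reward.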

Taxonomy

Topics: 3D Shape Modeling and Analysis · Human Motion and Animation · Image Processing and 3D Reconstruction