Point-JEPA: A Joint Embedding Predictive Architecture for Self-Supervised Learning on Point Cloud
Ayumu Saito, Prachi Kudeshia, Jiju Poovvancheri

TL;DR
Point-JEPA is a self-supervised learning architecture for point clouds that achieves results competitive with state-of-the-art methods while reducing pre-training time, without requiring input-space reconstruction or additional modalities.
Contribution
It proposes Point-JEPA, a joint embedding predictive architecture with a sequencer for efficient context-target selection in point cloud self-supervised learning.
Findings
Achieves results competitive with state-of-the-art methods.
Reduces pre-training time and complexity.
Avoids input space reconstruction and additional modalities.
Abstract
Recent advancements in self-supervised learning in the point cloud domain have demonstrated significant potential. However, these methods often suffer from drawbacks, including lengthy pre-training time, the necessity of reconstruction in the input space, or the necessity of additional modalities. In order to address these issues, we introduce Point-JEPA, a joint embedding predictive architecture designed specifically for point cloud data. To this end, we introduce a sequencer that orders point cloud patch embeddings to efficiently compute and utilize their proximity based on the indices during target and context selection. The sequencer also allows shared computations of the patch embeddings' proximity between context and target selection, further improving the efficiency. Experimentally, our method achieves competitive results with state-of-the-art methods while avoiding the…
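The sequencer idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the ordering is computed, so the code below assumes a greedy nearest-neighbour ordering of patch centers and a simple block-based selection; the function names `sequence_patches` and `select_blocks`, the starting rule, and the block lengths are all hypothetical choices for illustration, not the authors' implementation. The point it shows is that once patches are ordered so that neighbouring indices are spatially close, contiguous index ranges can stand in for spatially coherent regions, and the same ordering is reused for both target and context selection.

```python
import numpy as np


def sequence_patches(centers, start=None):
    """Order patch centers so that consecutive indices are spatially close.

    Assumed strategy (not confirmed by the source): greedy nearest-neighbour
    traversal, starting from the center with the smallest coordinate sum.
    """
    n = len(centers)
    if start is None:
        start = int(np.argmin(centers.sum(axis=1)))
    order = [start]
    visited = np.zeros(n, dtype=bool)
    visited[start] = True
    for _ in range(n - 1):
        # Distance from the most recently placed center to every center.
        d = np.linalg.norm(centers - centers[order[-1]], axis=1)
        d[visited] = np.inf  # never revisit a patch
        nxt = int(np.argmin(d))
        order.append(nxt)
        visited[nxt] = True
    return np.array(order)


def select_blocks(n, target_len, context_len, rng):
    """Pick a contiguous target block and a disjoint context set of sequence positions.

    Hypothetical selection rule: sample both blocks as contiguous ranges over
    the sequenced positions, then drop any context position that overlaps the
    target. With context_len > target_len the context is never empty.
    """
    t0 = int(rng.integers(0, n - target_len + 1))
    c0 = int(rng.integers(0, n - context_len + 1))
    target = np.arange(t0, t0 + target_len)
    context = np.setdiff1d(np.arange(c0, c0 + context_len), target)
    return target, context


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    centers = rng.random((16, 3))          # 16 hypothetical patch centers in 3D
    order = sequence_patches(centers)      # computed once, shared by both selections
    target, context = select_blocks(len(order), target_len=4, context_len=8, rng=rng)
    print("target patches:", order[target])
    print("context patches:", order[context])
```

Because the ordering is computed once per point cloud, both selections reduce to index arithmetic, which is one plausible reading of the shared-computation efficiency claim in the abstract.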
Taxonomy
Topics: 3D Shape Modeling and Analysis · Image Processing and 3D Reconstruction
