Point Contrastive Prediction with Semantic Clustering for Self-Supervised Learning on Point Cloud Videos

Xiaoxiao Sheng; Zhiqiang Shen; Gang Xiao; Longguang Wang; Yulan Guo; Hehe Fan

arXiv:2308.09247 · cs.CV · August 21, 2023

Open Access

TL;DR

This paper introduces a point-level contrastive learning framework with semantic clustering for point cloud videos, capturing fine-grained semantics and improving downstream task performance.

Contribution

It presents a unified self-supervised learning method at the point level with semantic alignment of superpoints, addressing limitations of clip/frame-level approaches.

Findings

01 Outperforms supervised methods on various downstream tasks

02 Effectively captures multi-scale semantic cues

03 Demonstrates superior transferability of learned representations

Abstract

We propose a unified point cloud video self-supervised learning framework for object-centric and scene-centric data. Previous methods commonly conduct representation learning at the clip or frame level and cannot well capture fine-grained semantics. Instead of contrasting the representations of clips or frames, in this paper, we propose a unified self-supervised framework by conducting contrastive learning at the point level. Moreover, we introduce a new pretext task by achieving semantic alignment of superpoints, which further facilitates the representations to capture semantic cues at multiple scales. In addition, due to the high redundancy in the temporal dimension of dynamic point clouds, directly conducting contrastive learning at the point level usually leads to massive undesired negatives and insufficient modeling of positive representations. To remedy this, we propose a…
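
To make "contrastive learning at the point level" concrete, here is a minimal sketch in plain Python. It assumes an InfoNCE-style loss with cosine similarity and a temperature of 0.1; the excerpt above does not state the paper's actual loss, and the names `l2_normalize` and `point_info_nce` are hypothetical. Each predicted point feature is pulled toward the target feature at the same index, and every other target point is treated as a negative. This is the naive setup that, per the abstract, can produce many undesired negatives on temporally redundant data.

```python
import math


def l2_normalize(vec):
    """Scale a feature vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def point_info_nce(predicted, target, temperature=0.1):
    """Mean InfoNCE loss over points (an assumed loss, not confirmed by the source).

    predicted[i] and target[i] are feature vectors of the same point and form
    the positive pair; all other target points act as negatives.
    """
    if not predicted or len(predicted) != len(target):
        raise ValueError("predicted and target must be non-empty and equal in length")

    pred = [l2_normalize(v) for v in predicted]
    targ = [l2_normalize(v) for v in target]

    total = 0.0
    for i, p in enumerate(pred):
        # Cosine similarity of this predicted point to every target point.
        logits = [sum(a * b for a, b in zip(p, t)) / temperature for t in targ]
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Negative log-probability assigned to the positive pair.
        total += log_denominator - logits[i]
    return total / len(pred)


# Toy usage: two points with 2-D features.
aligned = point_info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
mismatched = point_info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < mismatched)
```

The loss is small when each predicted point feature matches its own target and large when it matches a different point, which is the behavior a point-level objective relies on.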

Taxonomy

Topics: 3D Shape Modeling and Analysis · Remote Sensing and LiDAR Applications · Computer Graphics and Visualization Techniques

Methods: Contrastive Learning · Contrastive Language-Image Pre-training