Spatio-temporal Self-Supervised Representation Learning for 3D Point Clouds

Siyuan Huang; Yichen Xie; Song-Chun Zhu; Yixin Zhu

arXiv:2109.00179·cs.CV·September 2, 2021

TL;DR

This paper introduces a spatio-temporal self-supervised learning framework for 3D point clouds, enabling effective pre-training that improves performance across multiple 3D understanding tasks without labeled data.

Contribution

The proposed STRL framework leverages spatio-temporal cues from unlabeled 3D data to learn invariant representations, enhancing generalization and performance in downstream tasks.

Findings

01 Self-supervised representations outperform supervised methods in some tasks.

02 STRL improves generalization across synthetic, indoor, and outdoor datasets.

03 Spatio-temporal cues significantly boost representation quality.

Abstract

To date, various 3D scene understanding tasks still lack practical and generalizable pre-trained models, primarily due to the intricate nature of 3D scene understanding tasks and their immense variations introduced by camera views, lighting, occlusions, etc. In this paper, we tackle this challenge by introducing a spatio-temporal representation learning (STRL) framework, capable of learning from unlabeled 3D point clouds in a self-supervised fashion. Inspired by how infants learn from visual data in the wild, we explore the rich spatio-temporal cues derived from the 3D data. Specifically, STRL takes two temporally-correlated frames from a 3D point cloud sequence as the input, transforms it with the spatial data augmentation, and learns the invariant representation self-supervisedly. To corroborate the efficacy of STRL, we conduct extensive experiments on three types (synthetic, indoor,…
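The pipeline described in the abstract (sample two temporally correlated frames, augment each spatially, then learn a representation that stays invariant across the two views) can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the paper or the official repository: the frame-sampling rule and `max_gap`, the choice of rotation and scaling as augmentations, the hand-written `toy_encoder` standing in for a learned network, and the cosine-based `invariance_loss`.

```python
import math
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
Cloud = List[Point]


def sample_frame_pair(sequence: Sequence[Cloud], max_gap: int,
                      rng: random.Random) -> Tuple[Cloud, Cloud]:
    """Pick two frames at most `max_gap` steps apart (assumed sampling rule)."""
    i = rng.randrange(len(sequence))
    j = rng.randint(max(0, i - max_gap), min(len(sequence) - 1, i + max_gap))
    return sequence[i], sequence[j]


def augment(cloud: Cloud, rng: random.Random) -> Cloud:
    """Random rotation about the z-axis plus uniform scaling (illustrative)."""
    theta = rng.uniform(0.0, 2.0 * math.pi)
    scale = rng.uniform(0.8, 1.25)
    c, s = math.cos(theta), math.sin(theta)
    return [(scale * (c * x - s * y), scale * (s * x + c * y), scale * z)
            for x, y, z in cloud]


def toy_encoder(cloud: Cloud) -> List[float]:
    """Hypothetical stand-in for a learned point-cloud encoder."""
    n = len(cloud)
    centroid = (sum(p[0] for p in cloud) / n,
                sum(p[1] for p in cloud) / n,
                sum(p[2] for p in cloud) / n)
    radii = [math.dist(p, centroid) for p in cloud]
    height = sum(abs(p[2] - centroid[2]) for p in cloud) / n
    return [sum(radii) / n, max(radii), height]


def invariance_loss(u: Sequence[float], v: Sequence[float],
                    eps: float = 1e-12) -> float:
    """2 - 2 * cosine similarity: near 0 when aligned, near 4 when opposite."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 2.0 - 2.0 * dot / (norm + eps)


def pretrain_step_loss(sequence: Sequence[Cloud], rng: random.Random,
                       max_gap: int = 1) -> float:
    """One illustrative step: sample a frame pair, augment, compare embeddings."""
    frame_a, frame_b = sample_frame_pair(sequence, max_gap, rng)
    view_a, view_b = augment(frame_a, rng), augment(frame_b, rng)
    return invariance_loss(toy_encoder(view_a), toy_encoder(view_b))
```

In this sketch the toy encoder happens to be unchanged by z-axis rotation and only rescaled by uniform scaling, so two augmented views of the same frame give a loss of approximately zero. In actual pre-training the encoder would be a neural network, and a loss of this kind would be minimized over its parameters so that the invariance is learned rather than built in.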

Code & Models

Repositories

yichen928/STRL
PyTorch · Official

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Human Pose and Action Recognition · Robotics and Sensor-Based Localization