4DContrast: Contrastive Learning with Dynamic Correspondences for 3D Scene Understanding
Yujin Chen, Matthias Nießner, Angela Dai

TL;DR
This paper introduces 4DContrast, a contrastive learning method that instills dynamic 4D object priors into learned 3D representations through unsupervised pre-training, improving performance on downstream 3D scene understanding tasks.
Contribution
The paper proposes a data augmentation scheme in which synthetic 3D shapes move through static 3D environments, together with a contrastive learning framework whose 3D-4D constraints encode 4D invariances into the learned 3D representations.
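To make the contrastive idea concrete, here is a minimal sketch of an InfoNCE-style loss, a common choice for contrastive pre-training. This summary does not give the paper's actual loss, so the InfoNCE form, the temperature value, the function names (`normalize`, `info_nce`), and the toy feature vectors are all assumptions made for illustration. The sketch only shows the general principle: features of the same point on a moving object at two time steps are pulled together, and features of different points are pushed apart.

```python
import math


def normalize(v):
    """Scale a feature vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def info_nce(anchors, positives, temperature=0.07):
    """Mean InfoNCE loss (an assumed loss form, not taken from the paper).

    anchors[i] and positives[i] are features of the same point; every other
    entry of `positives` acts as a negative for anchors[i].
    """
    a = [normalize(v) for v in anchors]
    p = [normalize(v) for v in positives]
    total = 0.0
    for i, ai in enumerate(a):
        # Cosine similarity of anchor i to every candidate, scaled by temperature.
        logits = [sum(x * y for x, y in zip(ai, pj)) / temperature for pj in p]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching index i as the correct class.
        total += log_denom - logits[i]
    return total / len(a)


# Hypothetical toy features for two points observed at two time steps.
feats_t0 = [[1.0, 0.0], [0.0, 1.0]]
feats_t1 = [[0.9, 0.1], [0.1, 0.9]]

aligned = info_nce(feats_t0, feats_t1)         # correct correspondences
shuffled = info_nce(feats_t0, feats_t1[::-1])  # swapped correspondences
```

With correct correspondences the loss is close to zero, while swapping the correspondences yields a much larger loss. That gap is the signal a network would be trained to minimize.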
Findings
Improves downstream 3D semantic segmentation
Improves downstream object detection and instance segmentation
Boosts results in data-scarce scenarios
Abstract
We present a new approach to instill 4D dynamic object priors into learned 3D representations by unsupervised pre-training. We observe that dynamic movement of an object through an environment provides important cues about its objectness, and thus propose to imbue learned 3D representations with such dynamic understanding, that can then be effectively transferred to improved performance in downstream 3D semantic scene understanding tasks. We propose a new data augmentation scheme leveraging synthetic 3D shapes moving in static 3D environments, and employ contrastive learning under 3D-4D constraints that encode 4D invariances into the learned 3D representations. Experiments demonstrate that our unsupervised representation learning results in improvement in downstream 3D semantic segmentation, object detection, and instance segmentation tasks, and moreover, notably improves performance in…
Taxonomy
Topics: Human Pose and Action Recognition · Domain Adaptation and Few-Shot Learning · Robotics and Sensor-Based Localization
Methods: Contrastive Learning
