Towards Foundation Models for 3D Scene Understanding: Instance-Aware Self-Supervised Learning for Point Clouds
Bin Yang, Mohamed Abdelsamad, Miao Zhang, Alexandru Paul Condurache

TL;DR
This paper introduces PointINS, a self-supervised learning framework for point clouds that enhances instance awareness through geometry-aware regularization, improving 3D scene understanding and localization.
Contribution
The paper proposes an instance-oriented SSL method with geometry-aware regularization strategies, aimed at 3D foundation models that serve a range of downstream tasks.
Findings
+3.5% mAP for indoor instance segmentation
+4.1% PQ for outdoor panoptic segmentation
Improved transferability of 3D representations
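For readers unfamiliar with the PQ metric cited in the findings, the following is a minimal sketch of the standard panoptic quality definition for a single class: the sum of IoUs over matched segment pairs, divided by |TP| + 0.5|FP| + 0.5|FN|. The function name, its arguments, and the example numbers are illustrative assumptions and are not taken from the paper. The paper's evaluation code may differ in how it matches segments and averages over classes.

```python
def panoptic_quality(matched_ious, num_false_positives, num_false_negatives):
    """Panoptic quality (PQ) for one class.

    matched_ious: IoU of each matched (predicted, ground-truth) segment pair.
        Under the standard definition a pair counts as a true positive only
        if its IoU is strictly greater than 0.5.
    num_false_positives: predicted segments left unmatched.
    num_false_negatives: ground-truth segments left unmatched.
    """
    if any(iou <= 0.5 or iou > 1.0 for iou in matched_ious):
        raise ValueError("matched IoUs must lie in (0.5, 1.0]")

    num_true_positives = len(matched_ious)
    denominator = (
        num_true_positives
        + 0.5 * num_false_positives
        + 0.5 * num_false_negatives
    )
    if denominator == 0:
        # No predictions and no ground truth for this class.
        # Returning 0.0 is a convention chosen for this sketch.
        return 0.0
    return sum(matched_ious) / denominator


# Illustrative values only (hypothetical, not results from the paper):
# two matched segments, one spurious prediction, one missed instance.
example_pq = panoptic_quality([0.9, 0.7], num_false_positives=1, num_false_negatives=1)
```

Because unmatched predictions and missed instances both enlarge the denominator, PQ rewards accurate instance localization as well as mask quality, which is why it is a natural metric for the instance-awareness claim.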
Abstract
Recent advances in self-supervised learning (SSL) for point clouds have substantially improved 3D scene understanding without human annotations. Existing approaches emphasize semantic awareness by enforcing feature consistency across augmented views or by masked scene modeling. However, the resulting representations transfer poorly to instance localization and often require full finetuning for strong performance. Instance awareness is a fundamental component of 3D perception, so bridging this gap is crucial for progressing toward true 3D foundation models that support all downstream tasks on 3D data. In this work, we introduce PointINS, an instance-oriented self-supervised framework that enriches point cloud representations through geometry-aware learning. PointINS employs an orthogonal offset branch to jointly learn high-level semantic understanding and geometric reasoning, yielding…
