Shape2Scene: 3D Scene Representation Learning Through Pre-training on Shape Data
Tuo Feng, Wenguan Wang, Ruijie Quan, Yi Yang

TL;DR
Shape2Scene (S2S) introduces a novel pre-training approach that leverages 3D shape data to learn effective representations for large-scale 3D scenes, addressing data scarcity issues in scene understanding.
Contribution
The paper proposes multiscale high-resolution backbones and a Shape-to-Scene strategy for pre-training, enabling effective transfer of shape-based features to scene-level 3D tasks.
Findings
Achieved 93.8% overall accuracy (OA) on ScanObjectNN
Attained 87.6% instance mIoU (mean intersection over union) on ShapeNetPart
Demonstrated strong transferability across shape and scene tasks
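For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch of how overall accuracy and instance mIoU are commonly computed. It is not taken from the paper's code. The function names and toy inputs are hypothetical, and the convention that a part absent from both prediction and ground truth scores an IoU of 1 is an assumption based on common ShapeNetPart evaluation practice.

```python
from typing import Iterable, Sequence, Tuple


def overall_accuracy(pred: Sequence[int], gt: Sequence[int]) -> float:
    """Fraction of samples whose predicted class equals the ground-truth class."""
    if len(pred) != len(gt) or len(gt) == 0:
        raise ValueError("pred and gt must be non-empty and of equal length")
    correct = sum(p == g for p, g in zip(pred, gt))
    return correct / len(gt)


def shape_iou(pred: Sequence[int], gt: Sequence[int], part_ids: Sequence[int]) -> float:
    """Mean IoU over the part labels that belong to one shape's category."""
    ious = []
    for part in part_ids:
        inter = sum(p == part and g == part for p, g in zip(pred, gt))
        union = sum(p == part or g == part for p, g in zip(pred, gt))
        # Assumed convention: a part missing from both prediction and
        # ground truth counts as a perfect match.
        ious.append(1.0 if union == 0 else inter / union)
    return sum(ious) / len(ious)


def instance_miou(
    shapes: Iterable[Tuple[Sequence[int], Sequence[int], Sequence[int]]]
) -> float:
    """Average of per-shape IoU over all shapes (instances)."""
    scores = [shape_iou(pred, gt, part_ids) for pred, gt, part_ids in shapes]
    if not scores:
        raise ValueError("at least one shape is required")
    return sum(scores) / len(scores)


# Toy usage with made-up labels (not data from the paper).
toy_oa = overall_accuracy([0, 1, 2, 2], [0, 1, 1, 2])
toy_miou = instance_miou([
    ([0, 0, 1, 1], [0, 1, 1, 1], [0, 1]),  # one mislabeled point
    ([2, 2], [2, 2], [2, 3]),              # perfect; part 3 is absent
])
```

Instance mIoU averages over shapes, so every shape contributes equally regardless of its category. This differs from class mIoU, which first averages within each category.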
Abstract
Current self-supervised learning methods for 3D scenes face a data desert issue, resulting from the time-consuming and expensive process of collecting 3D scene data. Conversely, 3D shape datasets are easier to collect. Despite this, existing pre-training strategies on shape data offer limited potential for 3D scene understanding due to significant disparities in point quantities. To tackle these challenges, we propose Shape2Scene (S2S), a novel method that learns representations of large-scale 3D scenes from 3D shape data. We first design multiscale and high-resolution backbones for shape- and scene-level 3D tasks, i.e., MH-P (point-based) and MH-V (voxel-based). MH-P/V establishes direct paths to high-resolution features that capture deep semantic information across multiple scales. This pivotal nature makes them suitable for a wide range of 3D downstream tasks that tightly rely on…
Taxonomy
Topics: Image Retrieval and Classification Techniques · 3D Surveying and Cultural Heritage · Advanced Image and Video Retrieval Techniques
