Shape2Scene: 3D Scene Representation Learning Through Pre-training on Shape Data

Tuo Feng; Wenguan Wang; Ruijie Quan; Yi Yang

arXiv:2407.10200·cs.CV·July 16, 2024

TL;DR

Shape2Scene (S2S) introduces a novel pre-training approach that leverages 3D shape data to learn effective representations for large-scale 3D scenes, addressing data scarcity issues in scene understanding.

Contribution

The paper proposes multiscale high-resolution backbones and a Shape-to-Scene strategy for pre-training, enabling effective transfer of shape-based features to scene-level 3D tasks.

Findings

01. Achieved 93.8% overall accuracy (OA) on ScanObjectNN
02. Attained 87.6% instance mIoU on ShapeNetPart
03. Demonstrated strong transferability across shape and scene tasks
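
For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch of how overall accuracy (OA) and instance mIoU are commonly computed. It is not taken from the paper or its repository: the function names, the argument layout, and the convention of scoring a part that is absent from both prediction and ground truth as IoU 1.0 are assumptions made for illustration.

```python
import numpy as np


def overall_accuracy(pred_labels, true_labels):
    """Fraction of samples whose predicted class equals the true class."""
    pred = np.asarray(pred_labels)
    true = np.asarray(true_labels)
    return float((pred == true).mean())


def instance_miou(pred_parts, true_parts, part_ids_per_shape):
    """Mean over shapes of the per-shape mean part IoU.

    pred_parts / true_parts: one array of per-point part labels per shape.
    part_ids_per_shape: the part ids that belong to each shape's category.
    Assumption: a part missing from both prediction and ground truth
    counts as IoU 1.0 (a common convention, not confirmed by the source).
    """
    shape_ious = []
    for pred, true, part_ids in zip(pred_parts, true_parts, part_ids_per_shape):
        pred = np.asarray(pred)
        true = np.asarray(true)
        ious = []
        for part in part_ids:
            inter = np.sum((pred == part) & (true == part))
            union = np.sum((pred == part) | (true == part))
            ious.append(1.0 if union == 0 else inter / union)
        shape_ious.append(np.mean(ious))
    return float(np.mean(shape_ious))


if __name__ == "__main__":
    # Toy classification example with four objects.
    print(overall_accuracy([0, 1, 2, 2], [0, 1, 1, 2]))

    # Toy part-segmentation example with two shapes.
    preds = [[0, 0, 1, 1], [2, 2]]
    trues = [[0, 1, 1, 1], [2, 2]]
    parts = [[0, 1], [2, 3]]
    print(instance_miou(preds, trues, parts))
```

Instance mIoU averages over individual shapes, so categories with many shapes weigh more heavily than they would under a class-averaged mIoU.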

Abstract

Current 3D self-supervised learning methods of 3D scenes face a data desert issue, resulting from the time-consuming and expensive collecting process of 3D scene data. Conversely, 3D shape datasets are easier to collect. Despite this, existing pre-training strategies on shape data offer limited potential for 3D scene understanding due to significant disparities in point quantities. To tackle these challenges, we propose Shape2Scene (S2S), a novel method that learns representations of large-scale 3D scenes from 3D shape data. We first design multiscale and high-resolution backbones for shape and scene level 3D tasks, i.e., MH-P (point-based) and MH-V (voxel-based). MH-P/V establishes direct paths to high-resolution features that capture deep semantic information across multiple scales. This pivotal nature makes them suitable for a wide range of 3D downstream tasks that tightly rely on…

Code & Models

Repositories

fengzicai/s2s (Official)

Taxonomy

Topics: Image Retrieval and Classification Techniques · 3D Surveying and Cultural Heritage · Advanced Image and Video Retrieval Techniques