VIBUS: Data-efficient 3D Scene Parsing with VIewpoint Bottleneck and Uncertainty-Spectrum Modeling

Beiwen Tian; Liyi Luo; Hao Zhao; Guyue Zhou

arXiv:2210.11472·cs.CV·October 21, 2022

TL;DR

VIBUS introduces a two-stage data-efficient 3D scene parsing framework that leverages self-supervised learning and uncertainty-spectrum modeling to reduce the need for extensive manual annotations, achieving state-of-the-art results.

Contribution

The paper proposes VIBUS, a novel framework combining viewpoint bottleneck self-supervised learning and uncertainty-spectrum based pseudo-labeling for efficient 3D scene parsing.

Findings

01 Achieves state-of-the-art results on the ScanNet benchmark.

02 Both Viewpoint Bottleneck and uncertainty-spectrum modeling significantly improve performance.

03 Effective in reducing manual annotation requirements for 3D scene parsing.

Abstract

Recently, 3D scene parsing with deep learning approaches has been a hot topic. However, current fully supervised methods require manually annotated point-wise supervision, which is extremely user-unfriendly and time-consuming to obtain. As such, training 3D scene parsing models with sparse supervision is an intriguing alternative. We term this task data-efficient 3D scene parsing and propose an effective two-stage framework named VIBUS to address it by exploiting the enormous number of unlabeled points. In the first stage, we perform self-supervised representation learning on unlabeled points with the proposed Viewpoint Bottleneck loss function. The loss function is derived from an information bottleneck objective imposed on scenes under different viewpoints, making the process of representation learning free of degradation and sampling. In the second stage, pseudo labels are…
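
The abstract does not state the loss formula, so the following is only a rough illustration of what a sampling-free two-view objective can look like. It is a minimal NumPy sketch of a Barlow Twins-style cross-correlation loss, a swapped-in technique and not necessarily the authors' Viewpoint Bottleneck loss. The function name `cross_correlation_loss`, the weight `off_diag_weight`, and the assumption that row i of both inputs describes the same point under two viewpoints are all hypothetical.

```python
import numpy as np


def cross_correlation_loss(z_a, z_b, off_diag_weight=5e-3, eps=1e-8):
    """Barlow Twins-style loss (illustrative, not the paper's exact loss).

    z_a, z_b: (N, D) feature arrays; row i in both is assumed to describe
    the same point observed under two different viewpoints.
    """
    # Standardize every feature dimension across the N points.
    z_a = (z_a - z_a.mean(axis=0)) / (z_a.std(axis=0) + eps)
    z_b = (z_b - z_b.mean(axis=0)) / (z_b.std(axis=0) + eps)

    # (D, D) cross-correlation matrix between the two views.
    n = z_a.shape[0]
    c = z_a.T @ z_b / n

    # Diagonal terms pulled toward 1: features agree across viewpoints.
    on_diag = np.sum((np.diag(c) - 1.0) ** 2)
    # Off-diagonal terms pulled toward 0: feature dimensions are decorrelated.
    off_diag = np.sum(c ** 2) - np.sum(np.diag(c) ** 2)
    return on_diag + off_diag_weight * off_diag


rng = np.random.default_rng(0)
features = rng.normal(size=(1000, 8))
loss_matching = cross_correlation_loss(features, features)
loss_unrelated = cross_correlation_loss(features, rng.normal(size=(1000, 8)))
```

In this sketch the loss is computed over feature dimensions rather than over pairs of points, so no negative samples are drawn. Constant (collapsed) features standardize to zero and leave every diagonal term at its maximum penalty, so a collapsed solution is not rewarded.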

Code & Models

Repositories

air-discover/vibus (PyTorch, official)

Taxonomy

Topics: Human Pose and Action Recognition · 3D Shape Modeling and Analysis · Advanced Vision and Imaging

Methods: Test