VIBUS: Data-efficient 3D Scene Parsing with VIewpoint Bottleneck and Uncertainty-Spectrum Modeling
Beiwen Tian, Liyi Luo, Hao Zhao, Guyue Zhou

TL;DR
VIBUS is a two-stage framework for data-efficient 3D scene parsing. It combines self-supervised representation learning with uncertainty-spectrum modeling to reduce the need for extensive manual annotation, and achieves state-of-the-art results on the ScanNet benchmark.
Contribution
The paper proposes VIBUS, a novel framework combining viewpoint bottleneck self-supervised learning and uncertainty-spectrum based pseudo-labeling for efficient 3D scene parsing.
Findings
Achieves state-of-the-art results on the ScanNet benchmark.
Both Viewpoint Bottleneck and uncertainty-spectrum modeling significantly improve performance.
Effective in reducing manual annotation requirements for 3D scene parsing.
Abstract
Recently, 3D scene parsing with deep learning approaches has been a hot topic. However, current fully-supervised methods require manually annotated point-wise supervision, which is extremely user-unfriendly and time-consuming to obtain. As such, training 3D scene parsing models with sparse supervision is an intriguing alternative. We term this task data-efficient 3D scene parsing and propose an effective two-stage framework named VIBUS to resolve it by exploiting the enormous number of unlabeled points. In the first stage, we perform self-supervised representation learning on unlabeled points with the proposed Viewpoint Bottleneck loss function. The loss function is derived from an information bottleneck objective imposed on scenes under different viewpoints, making the process of representation learning free of degradation and sampling. In the second stage, pseudo labels are…
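The abstract does not give the form of the Viewpoint Bottleneck loss, so the following is only a rough sketch of one way a sampling-free objective over two viewpoints could look. It assumes a cross-correlation (redundancy-reduction) loss in the style of Barlow Twins, which may differ from the paper's actual formulation. The function name `cross_correlation_loss`, the weight `off_diag_weight`, and the feature shapes are all hypothetical.

```python
import numpy as np

def cross_correlation_loss(z_a, z_b, off_diag_weight=5e-3, eps=1e-8):
    """Illustrative two-view loss (an assumption, not the paper's definition).

    z_a, z_b: (N, D) arrays holding per-point features of the same N points,
    computed from the scene under two different viewpoints.
    """
    # Standardize every feature channel over the N points.
    z_a = (z_a - z_a.mean(axis=0)) / (z_a.std(axis=0) + eps)
    z_b = (z_b - z_b.mean(axis=0)) / (z_b.std(axis=0) + eps)

    # (D, D) cross-correlation between channels of the two views.
    n_points = z_a.shape[0]
    corr = z_a.T @ z_b / n_points

    # Diagonal pulled towards 1: the same channel should agree across views.
    on_diag = np.sum((np.diag(corr) - 1.0) ** 2)
    # Off-diagonal pulled towards 0: different channels should be decorrelated,
    # which discourages collapsed (constant or redundant) features.
    off_diag = np.sum(corr ** 2) - np.sum(np.diag(corr) ** 2)
    return on_diag + off_diag_weight * off_diag

# Toy check with random features (no real point clouds involved).
rng = np.random.default_rng(0)
features = rng.normal(size=(2048, 16))
loss_same = cross_correlation_loss(features, features)
loss_unrelated = cross_correlation_loss(features, rng.normal(size=(2048, 16)))
print(loss_same, loss_unrelated)
```

In this toy check, features that agree across the two views yield a much smaller loss than unrelated features. The loss needs no negative pairs, which is one plausible reading of "free of sampling" in the abstract.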
Taxonomy
Topics: Human Pose and Action Recognition · 3D Shape Modeling and Analysis · Advanced Vision and Imaging
