Rewis3d: Reconstruction Improves Weakly-Supervised Semantic Segmentation
Jonas Ernst, Wolfgang Boettcher, Lukas Hoyer, Jan Eric Lenssen, Bernt Schiele

TL;DR
Rewis3d introduces a novel framework that uses 3D scene reconstruction to enhance weakly-supervised semantic segmentation in 2D images, significantly reducing annotation costs and improving accuracy.
Contribution
The paper proposes leveraging 3D reconstruction as an auxiliary signal in a dual student-teacher architecture to improve weakly-supervised segmentation performance.
Findings
Achieves state-of-the-art results under sparse supervision
Outperforms existing methods by 2-7%
Does not require additional labels or inference overhead
Abstract
We present Rewis3d, a framework that leverages recent advances in feed-forward 3D reconstruction to significantly improve weakly supervised semantic segmentation on 2D images. Obtaining dense, pixel-level annotations remains a costly bottleneck for training segmentation models. Alleviating this issue, sparse annotations offer an efficient weakly-supervised alternative. However, they still incur a performance gap. To address this, we introduce a novel approach that leverages 3D scene reconstruction as an auxiliary supervisory signal. Our key insight is that 3D geometric structure recovered from 2D videos provides strong cues that can propagate sparse annotations across entire scenes. Specifically, a dual student-teacher architecture enforces semantic consistency between 2D images and reconstructed 3D point clouds, using state-of-the-art feed-forward reconstruction to generate reliable…
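The consistency idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the source does not specify the loss, the teacher update rule, or how pixels are matched to points. The sketch assumes a mean-teacher style exponential moving average (EMA) update and a confidence-filtered cross-entropy between 2D student predictions and 3D teacher pseudo-labels. The names `ema_update`, `cross_modal_consistency`, `pixel_to_point`, `momentum`, and `conf_thresh` are hypothetical.

```python
import numpy as np

def softmax(logits, axis=-1):
    """Numerically stable softmax."""
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def ema_update(teacher, student, momentum=0.99):
    """Assumed mean-teacher rule: move each teacher weight toward the student's."""
    return {k: momentum * teacher[k] + (1.0 - momentum) * student[k] for k in teacher}

def cross_modal_consistency(student_logits_2d, teacher_logits_3d,
                            pixel_to_point, conf_thresh=0.9):
    """Cross-entropy between 2D student predictions and 3D teacher pseudo-labels.

    student_logits_2d: (P, C) per-pixel logits from the 2D student.
    teacher_logits_3d: (N, C) per-point logits from the 3D teacher.
    pixel_to_point:    (P,) index of the reconstructed point each pixel maps to,
                       or -1 where the reconstruction has no point (assumption).
    """
    valid = pixel_to_point >= 0
    teacher_probs = softmax(teacher_logits_3d[pixel_to_point[valid]])
    pseudo = teacher_probs.argmax(axis=1)
    # Keep only pseudo-labels the teacher is confident about.
    confident = teacher_probs.max(axis=1) >= conf_thresh
    if not confident.any():
        return 0.0
    student_probs = softmax(student_logits_2d[valid][confident])
    picked = student_probs[np.arange(confident.sum()), pseudo[confident]]
    return float(-np.log(picked + 1e-12).mean())

# Toy example: 3 pixels, 2 points, 2 classes; the last pixel has no 3D match.
student = np.array([[5.0, 0.0], [0.0, 5.0], [0.0, 0.0]])
teacher = np.array([[10.0, 0.0], [0.0, 10.0]])
mapping = np.array([0, 1, -1])
loss_agree = cross_modal_consistency(student, teacher, mapping)
loss_disagree = cross_modal_consistency(student[:, ::-1], teacher, mapping)
```

In this toy setup the loss is small when the 2D student agrees with the 3D teacher and large when its predictions are flipped. A symmetric term, with a 2D teacher supervising a 3D student, would follow the same pattern.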
Taxonomy
Topics: 3D Shape Modeling and Analysis · Advanced Vision and Imaging · Advanced Neural Network Applications
