UniSem: Generalizable Semantic 3D Reconstruction from Sparse Unposed Images
Guibiao Liao, Qian Ren, Kaimin Liao, Hua Wang, Zhi Chen, Luchao Wang, Yaohua Tang

TL;DR
UniSem introduces a unified framework that enhances semantic 3D reconstruction from sparse images by improving depth accuracy and semantic generalization through error-aware Gaussian dropout and a progressive mix-training curriculum.
Contribution
The paper presents UniSem, a novel approach combining error-guided Gaussian capacity control and a curriculum blending 2D semantics with 3D priors, advancing sparse-view semantic 3D reconstruction.
Findings
Reduces depth Rel error by 15.2% with 16 views
Improves open-vocabulary segmentation mAcc by 3.7%
Outperforms strong baselines on ScanNet and Replica datasets
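The two metrics named in the findings can be illustrated with a minimal sketch. It assumes that "depth Rel" refers to the commonly used absolute relative depth error and that mAcc is the mean of per-class accuracies; the paper's exact evaluation protocol (masking, scaling, class set) is not given here and may differ. The function names are hypothetical.

```python
def abs_rel(pred, gt):
    """Mean absolute relative depth error over pixels with positive ground truth.

    Assumption: this is the usual definition behind "depth Rel".
    """
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]  # skip invalid depths
    if not pairs:
        raise ValueError("no valid ground-truth depths")
    return sum(abs(p - g) / g for p, g in pairs) / len(pairs)


def mean_class_accuracy(pred_labels, gt_labels):
    """Mean over classes of per-class accuracy (mAcc).

    Classes that never occur in the ground truth are skipped.
    """
    correct, total = {}, {}
    for p, g in zip(pred_labels, gt_labels):
        total[g] = total.get(g, 0) + 1
        correct[g] = correct.get(g, 0) + (1 if p == g else 0)
    if not total:
        raise ValueError("no labels to evaluate")
    return sum(correct[c] / total[c] for c in total) / len(total)


if __name__ == "__main__":
    # Toy values only; these are not results from the paper.
    print(abs_rel([1.1, 2.0, 3.0], [1.0, 2.0, 0.0]))
    print(mean_class_accuracy([0, 0, 1, 1], [0, 1, 1, 1]))
```

Lower is better for the depth error, and higher is better for mAcc.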
Abstract
Semantic-aware 3D reconstruction from sparse, unposed images remains challenging for feed-forward 3D Gaussian Splatting (3DGS). Existing methods often predict an over-complete set of Gaussian primitives under sparse-view supervision, leading to unstable geometry and inferior depth quality. Meanwhile, they rely solely on 2D segmenter features for semantic lifting, which provides weak 3D-level and limited generalizable supervision, resulting in incomplete 3D semantics in novel scenes. To address these issues, we propose UniSem, a unified framework that jointly improves depth accuracy and semantic generalization via two key components. First, Error-aware Gaussian Dropout (EGD) performs error-guided capacity control by suppressing redundancy-prone Gaussians using rendering error cues, producing meaningful, geometrically stable Gaussian representations for improved depth estimation. Second,…
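The idea of error-guided dropout described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes each Gaussian already has a non-negative error score derived from rendering error, and it makes the drop probability proportional to that score. How UniSem attributes rendering error to individual Gaussians, and its actual dropout schedule, are not described in this summary. The name `error_aware_dropout` and the `max_drop` parameter are hypothetical.

```python
import random


def error_aware_dropout(errors, max_drop=0.5, rng=None):
    """Return a keep-mask in which higher-error Gaussians are dropped more often.

    errors:   per-Gaussian non-negative error scores (assumed to be given).
    max_drop: drop probability for the highest-error Gaussian (assumed knob).
    rng:      optional random.Random instance for reproducibility.
    """
    if rng is None:
        rng = random.Random()
    if any(e < 0 for e in errors):
        raise ValueError("error scores must be non-negative")
    peak = max(errors, default=0.0)
    if peak == 0:
        # No error signal: keep every Gaussian.
        return [True] * len(errors)
    # Drop probability scales linearly with the normalized error score.
    return [rng.random() >= max_drop * (e / peak) for e in errors]


if __name__ == "__main__":
    scores = [0.0, 0.2, 0.9, 1.5]  # toy per-Gaussian error scores
    mask = error_aware_dropout(scores, max_drop=0.8, rng=random.Random(0))
    kept = [i for i, keep in enumerate(mask) if keep]
    print(kept)
```

In this sketch a Gaussian with zero error is never dropped, and setting `max_drop=1.0` always drops the highest-error Gaussian; both are properties of this toy formulation only.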
Taxonomy
Topics: Advanced Vision and Imaging · Robotics and Sensor-Based Localization · 3D Shape Modeling and Analysis
