Semantic Scene Completion with Multi-Feature Data Balancing Network
Mona Alawadh, Mahesan Niranjan, Hansung Kim

TL;DR
This paper introduces MDBNet, a dual-head neural network that fuses RGB and depth data for improved semantic scene completion, effectively addressing data imbalance and ambiguity in indoor 3D scene modeling.
Contribution
The paper presents a novel hybrid encoder-decoder architecture with identity transformation residual modules for multi-feature data balancing in SSC.
Findings
MDBNet outperforms state-of-the-art methods on NYU datasets.
Effective fusion of RGB and depth features improves 3D semantic completion.
The proposed loss functions enhance model training and accuracy.
Abstract
Semantic Scene Completion (SSC) is a critical task in computer vision that is used in applications such as virtual reality (VR). SSC aims to construct detailed 3D models from partial views by transforming a single 2D image into a 3D representation, assigning each voxel a semantic label. The main challenge lies in completing 3D volumes with limited information, compounded by data imbalance, inter-class ambiguity, and intra-class diversity in indoor scenes. To address this, we propose the Multi-Feature Data Balancing Network (MDBNet), a dual-head model for RGB and depth data (F-TSDF) inputs. Our hybrid encoder-decoder architecture with identity transformation in a pre-activation residual module (ITRM) effectively manages diverse signals within F-TSDF. We evaluate RGB feature fusion strategies and use a combined loss function: cross-entropy for 2D RGB features and weighted cross-entropy…
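To make the idea of a combined loss concrete, here is a minimal sketch in plain Python of a plain cross-entropy term added to a class-weighted cross-entropy term. It is not the authors' implementation. The abstract is truncated before it says what the weighted term applies to, so the sketch assumes it is the per-voxel 3D prediction. The inverse-frequency weighting, the mixing factor `lam`, and all function names are hypothetical choices made for illustration only.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one list of class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def inverse_frequency_weights(labels, num_classes):
    """Assumed weighting scheme: rare classes get larger weights.

    A class that never appears in `labels` gets weight 0.
    """
    counts = [0] * num_classes
    for y in labels:
        counts[y] += 1
    n = len(labels)
    return [n / (num_classes * c) if c > 0 else 0.0 for c in counts]


def cross_entropy(logits_batch, labels, class_weights=None):
    """Mean cross-entropy; with class_weights it becomes a weighted mean."""
    total, norm = 0.0, 0.0
    for logits, y in zip(logits_batch, labels):
        w = 1.0 if class_weights is None else class_weights[y]
        total += -w * math.log(softmax(logits)[y])
        norm += w
    return total / norm if norm > 0 else 0.0


def combined_loss(logits_2d, labels_2d, logits_3d, labels_3d,
                  class_weights, lam=1.0):
    """Unweighted CE on the 2D term plus lam * weighted CE on the 3D term.

    The additive form and `lam` are assumptions, not taken from the paper.
    """
    loss_2d = cross_entropy(logits_2d, labels_2d)
    loss_3d = cross_entropy(logits_3d, labels_3d, class_weights)
    return loss_2d + lam * loss_3d


# Toy example: two classes, class 1 is rare among the voxel labels.
voxel_labels = [0, 0, 0, 1]
weights = inverse_frequency_weights(voxel_labels, num_classes=2)
voxel_logits = [[2.0, 0.5], [1.5, 0.2], [0.3, 0.1], [0.4, 1.2]]
pixel_logits = [[1.0, 0.0], [0.0, 1.0]]
pixel_labels = [0, 1]
loss = combined_loss(pixel_logits, pixel_labels,
                     voxel_logits, voxel_labels, weights)
```

With these toy labels the rare class receives a weight three times larger than the frequent class, so a misclassified rare voxel contributes more to the loss than a misclassified frequent one. That is the usual motivation for weighting when classes are imbalanced.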
Taxonomy
Topics: Video Analysis and Summarization
