# StructScan3D v1: A First RGB-D Dataset for Indoor Building Elements Segmentation and BIM Modeling

**Authors:** Ishraq Rached, Rafika Hajji, Tania Landes, Rashid Haffadi

PMC · DOI: 10.3390/s25113461 · Sensors (Basel, Switzerland) · 2025-05-30

## TL;DR

This paper introduces StructScan3D v1, a new RGB-D dataset for segmenting indoor building elements to support BIM and AI applications.

## Contribution

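The paper releases the first version of StructScan3D, an RGB-D dataset built specifically for segmenting indoor building elements and for BIM modeling. It also establishes a segmentation benchmark on the dataset using D-Former, with Gemini and TokenFusion as comparison models.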

## Key findings

- StructScan3D v1 contains 2594 annotated RGB-D frames from residential and office environments.
- D-Former achieved a mean IoU of 67.5% in segmenting building elements like walls, floors, and windows.
- D-Former is benchmarked on the dataset against the state-of-the-art models Gemini and TokenFusion.

## Abstract

The integration of computer vision and deep learning into Building Information Modeling (BIM) workflows has created a growing need for structured datasets that enable the semantic segmentation of indoor building elements. This paper presents StructScan3D v1, the first version of an RGB-D dataset specifically designed to facilitate the automated segmentation and modeling of architectural and structural components. Captured using the Azure Kinect sensor, StructScan3D v1 comprises 2594 annotated frames from diverse indoor environments, including residential and office spaces. The dataset focuses on six key building elements: walls, floors, ceilings, windows, doors, and miscellaneous objects. To establish a benchmark for indoor RGB-D semantic segmentation, we evaluate D-Former, a transformer-based model that leverages self-attention mechanisms for enhanced spatial understanding. Additionally, we compare its performance against state-of-the-art models such as Gemini and TokenFusion, providing a comprehensive analysis of segmentation accuracy. Experimental results show that D-Former achieves a mean Intersection over Union (mIoU) of 67.5%, demonstrating strong segmentation capabilities despite challenges like occlusions and depth variations. As an evolving dataset, StructScan3D v1 lays the foundation for future expansions, including increased scene diversity and refined annotations. By bridging the gap between deep learning-driven segmentation and real-world BIM applications, this dataset provides researchers and practitioners with a valuable resource for advancing indoor scene reconstruction, robotics, and augmented reality.
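
The mIoU figure quoted in the abstract is the per-class Intersection over Union averaged across classes. The sketch below shows one common way to compute it from a confusion matrix. It is not the authors' evaluation code. The class order, the integer label ids, and the function names are assumptions made for illustration, and the toy label maps are invented rather than taken from the dataset.

```python
import numpy as np

# Class names come from the abstract; the id assigned to each one is an assumption.
CLASSES = ["wall", "floor", "ceiling", "window", "door", "misc"]


def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels per (ground truth, prediction) pair; rows are ground truth."""
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    # Skip pixels whose ground-truth id is outside [0, num_classes), e.g. an ignore label.
    # Predictions are assumed to already lie in [0, num_classes).
    valid = (y_true >= 0) & (y_true < num_classes)
    idx = num_classes * y_true[valid] + y_pred[valid]
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def per_class_iou(cm):
    """IoU = TP / (TP + FP + FN) per class; NaN for classes absent from both maps."""
    tp = np.diag(cm).astype(float)
    union = cm.sum(axis=0) + cm.sum(axis=1) - tp
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(union > 0, tp / union, np.nan)


def mean_iou(cm):
    """Average IoU over the classes that actually occur."""
    return float(np.nanmean(per_class_iou(cm)))


# Toy 2x3 label maps (invented): 0 = wall, 1 = floor, 2 = ceiling.
gt = np.array([[0, 0, 1], [1, 2, 2]])
pred = np.array([[0, 1, 1], [1, 2, 0]])

cm = confusion_matrix(gt, pred, len(CLASSES))
for name, iou in zip(CLASSES, per_class_iou(cm)):
    print(f"{name}: {iou:.3f}")
print("mIoU:", round(mean_iou(cm), 3))  # 0.5 for this toy example
```

Classes that appear in neither the ground truth nor the prediction are excluded from the mean here. Some benchmarks count them as zero instead, so reported values are only comparable when the same convention is used.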

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12158164/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12158164/full.md

## References

33 references — full list in the complete paper: https://tomesphere.com/paper/PMC12158164/full.md

---
Source: https://tomesphere.com/paper/PMC12158164