Deep Learning-based Scalable Image-to-3D Facade Parser for Generating Thermal 3D Building Models

Yinan Yu; Alex Gonzalez-Caceres; Samuel Scheidegger; Sanjay Somanath; Alexander Hollberg

arXiv:2508.04406·cs.CV·August 7, 2025

TL;DR

This paper introduces SI3FP (Scalable Image-to-3D Facade Parser), a pipeline that combines computer vision and deep learning to generate LoD3 thermal 3D building models from images, supporting early-phase renovation planning at scale.

Contribution

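The paper presents a scalable image-to-3D facade parser that models geometric primitives directly in the orthographic image plane rather than relying on segmentation and projection. This reduces perspective distortions and provides a unified interface for both sparse and dense image sources in thermal building modeling.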

Findings

1. Achieved approximately 5% error in window-to-wall ratio estimates.
2. Supports both sparse and dense image data sources.
3. Facilitates large-scale energy renovation planning.

Abstract

Renovating existing buildings is essential for climate impact. Early-phase renovation planning requires simulations based on thermal 3D models at Level of Detail (LoD) 3, which include features like windows. However, scalable and accurate identification of such features remains a challenge. This paper presents the Scalable Image-to-3D Facade Parser (SI3FP), a pipeline that generates LoD3 thermal models by extracting geometries from images using both computer vision and deep learning. Unlike existing methods relying on segmentation and projection, SI3FP directly models geometric primitives in the orthographic image plane, providing a unified interface while reducing perspective distortions. SI3FP supports both sparse (e.g., Google Street View) and dense (e.g., hand-held camera) data sources. Tested on typical Swedish residential buildings, SI3FP achieved approximately 5% error in…
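To make the window-to-wall ratio (WWR) metric concrete, the following is a minimal sketch of how it could be computed once windows are available as rectangles in the orthographic facade plane. It is not the authors' implementation: the `Rect` type, the function names, the use of gross facade area as the denominator, and the choice to report error in percentage points are all assumptions made for illustration. The abstract does not state whether the reported 5% is relative or absolute.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Hypothetical axis-aligned rectangle in facade-plane coordinates (metres)."""
    x: float       # left edge
    y: float       # bottom edge
    width: float
    height: float

    @property
    def area(self) -> float:
        return self.width * self.height


def window_to_wall_ratio(facade: Rect, windows: list[Rect]) -> float:
    """Total window area divided by gross facade area.

    Assumption: windows do not overlap and lie fully inside the facade,
    so their areas can simply be summed.
    """
    if facade.area <= 0:
        raise ValueError("facade must have positive area")
    return sum(w.area for w in windows) / facade.area


def wwr_error_points(estimated: float, reference: float) -> float:
    """Absolute WWR error in percentage points (an assumed error definition)."""
    return abs(estimated - reference) * 100.0


# Example: a 10 m x 6 m facade with four 1.5 m x 1.0 m windows.
facade = Rect(0.0, 0.0, 10.0, 6.0)
windows = [Rect(1.0 + 2.0 * i, 2.0, 1.5, 1.0) for i in range(4)]
wwr = window_to_wall_ratio(facade, windows)
```

Working in the orthographic plane means each window is a plain rectangle whose area can be read off directly, which is one reason primitive-based modeling is convenient for metrics like WWR.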
