Gated Multimodal Learning for Interpretable Property Energy Performance Prediction and Retrofit Scenario Analysis
Yunfei Bai, Aaron Tesfa Tsion, Raul Rosales, Barbara Shollock, Wei He

TL;DR
This paper presents a gated multimodal model that integrates tabular, text, and spatial data to predict building energy performance scores, supporting scalable retrofit planning with interpretable predictions.
Contribution
It introduces a novel multimodal learning approach with gating and auxiliary tasks for accurate, interpretable property energy prediction at city scale.
Findings
The model achieves a mean absolute error (MAE) of 4.03 on SAP scores
Full multimodal fusion outperforms unimodal baselines
Interpretability analyses reveal key features influencing predictions
Abstract
Achieving resilient and sustainable cities requires scalable approaches to decarbonising residential buildings, which account for about 20% of UK greenhouse gas emissions and 25% of energy-related emissions in the European Union. Energy Performance Certificates (EPCs) support regulation and retrofit planning, but their reliance on on-site inspections limits timely city-scale assessment. This study introduces a gated multimodal model to predict Standard Assessment Procedure (SAP) energy efficiency and Environmental Impact (EI) scores by integrating EPC tabular variables, assessor-written free text, and Geographic Information System (GIS)-derived spatial features describing footprint geometry, height, area, and orientation. Sample-wise gating learns property-specific modality weights, while an auxiliary band classification head stabilises training. In a Westminster, London case study, the…
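To make the sample-wise gating and the auxiliary band classification head concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes each modality (tabular, text, GIS) has already been encoded into a fixed-size embedding, that the gate is a single linear layer followed by a softmax, and that the training objective is MAE plus a weighted cross-entropy term. The function names, the embedding size, the seven-band setup (EPC bands A to G), and the `aux_weight` value are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z, axis=-1):
    # Numerically stable softmax.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def gated_fusion(h_tab, h_txt, h_gis, W_gate, b_gate):
    """Fuse three (n, d) modality embeddings using per-sample weights.

    The gate sees the concatenated embeddings and outputs one weight per
    modality for each property (assumed design, for illustration only).
    """
    stacked = np.stack([h_tab, h_txt, h_gis], axis=1)     # (n, 3, d)
    concat = stacked.reshape(stacked.shape[0], -1)        # (n, 3 * d)
    weights = softmax(concat @ W_gate + b_gate)           # (n, 3), rows sum to 1
    fused = (weights[:, :, None] * stacked).sum(axis=1)   # (n, d)
    return fused, weights

def joint_loss(score_pred, score_true, band_logits, band_true, aux_weight=0.3):
    # Main task: MAE on the continuous score.
    mae = np.abs(score_pred - score_true).mean()
    # Auxiliary task: cross-entropy on the band label.
    probs = softmax(band_logits)
    ce = -np.log(probs[np.arange(len(band_true)), band_true] + 1e-12).mean()
    return mae + aux_weight * ce  # aux_weight is a hypothetical hyperparameter

# Toy data: 4 properties, 8-dimensional embeddings, 7 bands (A to G).
n, d, n_bands = 4, 8, 7
h_tab, h_txt, h_gis = (rng.normal(size=(n, d)) for _ in range(3))
W_gate = rng.normal(size=(3 * d, 3)) * 0.1
b_gate = np.zeros(3)

fused, weights = gated_fusion(h_tab, h_txt, h_gis, W_gate, b_gate)

# Randomly initialised linear heads stand in for the trained ones.
w_reg = rng.normal(size=d)
W_band = rng.normal(size=(d, n_bands))
score_pred = fused @ w_reg
band_logits = fused @ W_band

score_true = rng.uniform(1, 100, size=n)
band_true = rng.integers(0, n_bands, size=n)
loss = joint_loss(score_pred, score_true, band_logits, band_true)

print(weights.round(3))  # one row of modality weights per property
print(float(loss))
```

Because each row of `weights` is a distribution over the three modalities, inspecting it per property gives a direct, if coarse, view of which data source the model relied on for that prediction.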
