Data-Driven Policy Mapping for Safe RL-based Energy Management Systems
Theo Zangato, Aomar Osmani, Pegah Alizadeh

TL;DR
This paper introduces a scalable, safe, and adaptive RL-based building energy management system that uses clustering, forecasting, and action masking to optimize costs and ensure safety across diverse buildings.
Contribution
The paper presents a novel three-step RL framework combining clustering, forecasting, and domain-informed action masking for scalable and safe energy management in buildings.
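The clustering step can be illustrated with a minimal sketch. It assumes load profiles are short fixed-length vectors and uses a plain k-means with Euclidean distance. The summary does not state which clustering algorithm or distance the authors use, so `kmeans`, `nearest`, the sample profiles, and the `policy_by_cluster` names are all hypothetical. The point is only to show how a new building could be mapped to an existing cluster, and so to an already trained policy, without retraining.

```python
import math
import random


def distance(a, b):
    """Euclidean distance between two equal-length load profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(profile, centroids):
    """Index of the centroid closest to the given profile."""
    return min(range(len(centroids)), key=lambda i: distance(profile, centroids[i]))


def mean_profile(profiles):
    """Element-wise mean of a non-empty list of profiles."""
    n = len(profiles)
    return [sum(values) / n for values in zip(*profiles)]


def kmeans(profiles, k, iterations=20, seed=0):
    """Plain k-means (an assumed stand-in for the paper's clustering step)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(profiles, k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in profiles:
            groups[nearest(p, centroids)].append(p)
        # Keep the old centroid if a cluster received no profiles.
        centroids = [mean_profile(g) if g else c for g, c in zip(groups, centroids)]
    return centroids


# Hypothetical non-shiftable load profiles (kW) over four daily time slots:
# night, morning, afternoon, evening.
profiles = [
    [1.0, 4.0, 4.2, 1.2],  # daytime-heavy
    [0.9, 3.8, 4.1, 1.0],  # daytime-heavy
    [1.5, 1.0, 1.2, 3.9],  # evening-heavy
    [1.4, 1.1, 1.0, 4.1],  # evening-heavy
]
centroids = kmeans(profiles, k=2)

# Hypothetical mapping from cluster index to a previously trained policy.
policy_by_cluster = {i: f"policy_{i}" for i in range(len(centroids))}

# A new, unseen building is assigned to its nearest cluster and reuses that policy.
new_building = [1.1, 3.9, 4.0, 1.1]
assigned_policy = policy_by_cluster[nearest(new_building, centroids)]
```

In this sketch the new building lands in the same cluster as the daytime-heavy profiles, so it would reuse that cluster's policy rather than training a fresh agent.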
Findings
Reduces operating costs by up to 15% in evaluations on real-world data
Maintains stable environmental performance across building types
Adapts to tariff changes without retraining
Abstract
Increasing global energy demand and renewable integration complexity have placed buildings at the center of sustainable energy management. We present a three-step reinforcement learning (RL)-based Building Energy Management System (BEMS) that combines clustering, forecasting, and constrained policy learning to address scalability, adaptability, and safety challenges. First, we cluster non-shiftable load profiles to identify common consumption patterns, enabling policy generalization and transfer without retraining for each new building. Next, we integrate an LSTM-based forecasting module to anticipate future states, improving the RL agents' responsiveness to dynamic conditions. Lastly, domain-informed action masking ensures safe exploration and operation, preventing harmful decisions. Evaluated on real-world data, our approach reduces operating costs by up to 15% for certain building…
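Action masking can be illustrated with a minimal sketch. It assumes a single battery with discrete charge/discharge actions and a state-of-charge (SoC) constraint. The abstract does not specify the paper's masking rules, so the constraint, the function names `valid_action_mask` and `masked_greedy_action`, and all numbers here are hypothetical. The idea shown is that unsafe actions are removed before the agent picks one, so the policy cannot select them during either exploration or operation.

```python
def valid_action_mask(soc, actions_kwh, capacity_kwh, soc_min=0.0, soc_max=1.0):
    """Return one boolean per action: True if it keeps the SoC within bounds.

    soc is a fraction in [0, 1]; each action is energy in kWh
    (positive = charge, negative = discharge). Assumed constraint only.
    """
    eps = 1e-9  # tolerance for floating-point rounding at the bounds
    mask = []
    for energy_kwh in actions_kwh:
        next_soc = soc + energy_kwh / capacity_kwh
        mask.append(soc_min - eps <= next_soc <= soc_max + eps)
    return mask


def masked_greedy_action(q_values, mask):
    """Pick the highest-valued action among those the mask allows."""
    valid = [i for i, ok in enumerate(mask) if ok]
    if not valid:
        raise ValueError("no valid action available in this state")
    return max(valid, key=lambda i: q_values[i])


# Hypothetical setup: a 10 kWh battery at 90% charge with five discrete actions.
actions_kwh = [-2.0, -1.0, 0.0, 1.0, 2.0]
q_values = [0.1, 0.2, 0.3, 0.5, 0.9]  # made-up action values for illustration

mask = valid_action_mask(soc=0.9, actions_kwh=actions_kwh, capacity_kwh=10.0)
action = masked_greedy_action(q_values, mask)
```

Here charging by 2 kWh would push the battery past full capacity, so that action is masked out even though it has the highest value, and the agent falls back to charging by 1 kWh.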
Taxonomy
Topics: Smart Grid Energy Management · Building Energy and Comfort Optimization · Integrated Energy Systems Optimization
