Safe Deep Reinforcement Learning for Building Heating Control and Demand-side Flexibility

Colin Jüni; Mina Montazeri; Yi Guo; Federica Bellizio; Giovanni Sansavini; Philipp Heer

arXiv:2604.16033·eess.SY·April 20, 2026

TL;DR

This paper introduces a safe deep reinforcement learning framework for building heating control that ensures demand-side flexibility, occupant comfort, and energy efficiency, with real-time safety guarantees and significant cost savings.

Contribution

It develops a real-time adaptive safety filter integrated with deep reinforcement learning to ensure safety and compliance in demand-side flexibility for building heating systems.
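To make the idea of a safety filter concrete, here is a minimal sketch of how a proposed heating action could be projected onto a safe range before it is applied. It is not the paper's filter: it assumes a hypothetical single-zone 1R1C thermal model, fixed comfort bounds, and a one-step lookahead, and it models neither the adaptive element nor flexibility requests. All names (`RoomModel`, `safety_filter`) and parameter values are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class RoomModel:
    """Hypothetical 1R1C room model (an assumption, not the paper's model)."""
    r_k_per_w: float = 0.005   # thermal resistance to outdoors [K/W]
    c_j_per_k: float = 1.0e7   # thermal capacitance [J/K]
    dt_s: float = 900.0        # control step [s]

    def step(self, t_in: float, t_out: float, power_w: float) -> float:
        """Predict the indoor temperature one control step ahead."""
        heat_flow_w = (t_out - t_in) / self.r_k_per_w + power_w
        return t_in + self.dt_s / self.c_j_per_k * heat_flow_w


def safety_filter(model: RoomModel, t_in: float, t_out: float,
                  proposed_w: float, t_min: float = 20.0,
                  t_max: float = 24.0, p_max_w: float = 5000.0) -> float:
    """Return the heating power closest to the proposed action that keeps
    the one-step temperature prediction inside [t_min, t_max].

    The prediction is affine in the power, so the safe range has a closed
    form. If the comfort range cannot be reached within the actuator
    limits, the nearest actuator limit is returned instead.
    """
    loss_w = (t_out - t_in) / model.r_k_per_w   # negative when it is colder outside
    gain_w_per_k = model.c_j_per_k / model.dt_s
    p_lo = gain_w_per_k * (t_min - t_in) - loss_w
    p_hi = gain_w_per_k * (t_max - t_in) - loss_w
    # Intersect the comfort-safe range with the actuator range [0, p_max_w].
    lo = min(max(p_lo, 0.0), p_max_w)
    hi = min(max(p_hi, 0.0), p_max_w)
    return min(max(proposed_w, lo), hi)


if __name__ == "__main__":
    model = RoomModel()
    # The learned policy (e.g. a DDPG actor) proposes switching the heater off
    # while the room sits at the lower comfort bound on a cold day.
    safe_w = safety_filter(model, t_in=20.0, t_out=0.0, proposed_w=0.0)
    print(f"filtered power: {safe_w:.0f} W")
    print(f"predicted temperature: {model.step(20.0, 0.0, safe_w):.2f} degC")
```

In this sketch the filter leaves the proposed action untouched whenever it is already safe, so the learning agent is only overridden near the comfort or actuator limits.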

Findings

01 Achieves up to 50% energy and cost savings compared to rule-based control.

02 Outperforms standalone deep reinforcement learning controllers in efficiency.

03 Maintains occupant comfort with only slight temperature violations.

Abstract

Buildings account for approximately 40% of global energy consumption, and with the growing share of intermittent renewable energy sources, enabling demand-side flexibility, particularly in heating, ventilation and air conditioning systems, is essential for grid stability and energy efficiency. This paper presents a safe deep reinforcement learning-based control framework to optimize building space heating while enabling demand-side flexibility provision for power system operators. A deep deterministic policy gradient algorithm is used as the core deep reinforcement learning method, enabling the controller to learn an optimal heating strategy through interaction with the building thermal model while maintaining occupant comfort, minimizing energy cost, and providing flexibility. To address safety concerns with reinforcement learning, particularly regarding compliance with flexibility…
