An experimental evaluation of Deep Reinforcement Learning algorithms for HVAC control
Antonio Manjavacas, Alejandro Campoy-Nieves, Javier Jiménez-Raboso, Miguel Molina-Solana, Juan Gómez-Romero

TL;DR
This paper critically evaluates various Deep Reinforcement Learning algorithms for HVAC control, focusing on comfort, energy efficiency, robustness, and adaptability, highlighting their potential and challenges in real-world scenarios.
Contribution
It provides the first standardized, reproducible comparison of state-of-the-art DRL algorithms for HVAC control using the Sinergym framework.
Findings
DRL algorithms like SAC and TD3 show promise in complex HVAC scenarios.
Challenges include issues with generalization and incremental learning.
The study emphasizes the need for standardization in DRL evaluation for HVAC.
Abstract
Heating, Ventilation, and Air Conditioning (HVAC) systems are a major driver of energy consumption in commercial and residential buildings. Recent studies have shown that Deep Reinforcement Learning (DRL) algorithms can outperform traditional reactive controllers. However, DRL-based solutions are generally designed for ad hoc setups and lack standardization for comparison. To fill this gap, this paper provides a critical and reproducible evaluation, in terms of comfort and energy consumption, of several state-of-the-art DRL algorithms for HVAC control. The study examines the controllers' robustness, adaptability, and trade-off between optimization goals by using the Sinergym framework. The results obtained confirm the potential of DRL algorithms, such as SAC and TD3, in complex scenarios and reveal several challenges related to generalization and incremental learning.
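The trade-off between comfort and energy consumption mentioned in the abstract is often expressed as a weighted reward. The sketch below is an illustration only, not the reward used in the paper or the Sinergym API: the function name `tradeoff_reward`, the comfort range, and the scaling constants `lambda_energy` and `lambda_temp` are all hypothetical choices made for this example.

```python
def tradeoff_reward(power_w, temp_c, comfort_range=(20.0, 23.5),
                    energy_weight=0.5, lambda_energy=1e-4, lambda_temp=1.0):
    """Weighted penalty combining energy use and comfort violation.

    All parameter names and default values are assumptions for
    illustration; they are not taken from the paper.
    """
    low, high = comfort_range
    # Comfort violation: distance (in degrees C) outside the comfort range.
    if temp_c < low:
        violation = low - temp_c
    elif temp_c > high:
        violation = temp_c - high
    else:
        violation = 0.0
    # Both terms are penalties, so the reward is never positive.
    energy_term = -energy_weight * lambda_energy * power_w
    comfort_term = -(1.0 - energy_weight) * lambda_temp * violation
    return energy_term + comfort_term
```

Setting `energy_weight` closer to 1 makes a controller favor energy savings, while values closer to 0 make it favor comfort; sweeping this weight is one simple way to explore the trade-off between the two goals.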
Taxonomy
Topics: Building Energy and Comfort Optimization · Smart Grid Energy Management
Methods: Clipped Double Q-learning · Adam · Dense Connections · Dilated Convolution · Convolution · Average Pooling · Target Policy Smoothing · Experience Replay · Twin Delayed Deep Deterministic
