A Comparison of Model-Free and Model Predictive Control for Price Responsive Water Heaters
David J. Biagioni, Xiangyu Zhang, Peter Graf, Devon Sigler, Wesley Jones

TL;DR
This paper compares model-free reinforcement learning algorithms, ES and PPO, with various MPC strategies for controlling price-responsive water heaters, highlighting their efficiency and effectiveness in minimizing operational costs.
Contribution
It introduces a comprehensive comparison of model-free RL methods and MPC variants for water heater control, demonstrating the advantages of RL in speed and performance.
Findings
RL policies outperform MPC in cost reduction
ES learns policies significantly faster than traditional MPC
Optimal control requires long-horizon MPC with perfect forecasting
Abstract
We present a careful comparison of two model-free control algorithms, Evolution Strategies (ES) and Proximal Policy Optimization (PPO), with receding horizon model predictive control (MPC) for operating simulated, price-responsive water heaters. Four MPC variants are considered: a one-shot controller with perfect forecasting yielding optimal control; a limited-horizon controller with perfect forecasting; a mean forecasting-based controller; and a two-stage stochastic programming controller using historical scenarios. In all cases, the MPC models for water temperature and electricity price are exact; only water demand is uncertain. For comparison, both ES and PPO learn neural network-based policies by directly interacting with the simulated environment under the same scenarios used by MPC. All methods are then evaluated on a separate one-week continuation of the demand time series. We…
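To make the model-free side of the comparison concrete, here is a minimal sketch of an Evolution Strategies update applied to a toy water heater. Everything in it is an illustrative assumption rather than the paper's setup: the one-tank dynamics, the constants, the two-parameter threshold policy (the paper uses neural network policies), and the names `simulate_cost` and `evolution_strategies` are all hypothetical.

```python
import numpy as np

def simulate_cost(theta, prices, demand, t0=50.0):
    """Toy one-tank model (an assumption, not the paper's simulator).

    Policy: heat whenever the tank temperature is below a
    price-dependent threshold theta[0] + theta[1] * price.
    """
    temp, cost = t0, 0.0
    for p, d in zip(prices, demand):
        heat = 1.0 if temp < theta[0] + theta[1] * p else 0.0
        # Heating gain, hot-water draw, and standing loss to a 20 C ambient.
        temp += 4.0 * heat - 3.0 * d - 0.05 * (temp - 20.0)
        # Energy cost plus a penalty when water is colder than 45 C.
        cost += p * heat + 10.0 * max(0.0, 45.0 - temp)
    return cost

def evolution_strategies(cost_fn, theta0, iters=50, pop=20, sigma=1.0, lr=0.5, seed=0):
    """Basic ES with antithetic sampling; returns the best parameters seen."""
    rng = np.random.default_rng(seed)
    theta = theta0.copy()
    best_theta, best_cost = theta.copy(), cost_fn(theta)
    for _ in range(iters):
        eps = rng.standard_normal((pop, theta.size))
        # Evaluate mirrored perturbations theta + sigma*eps and theta - sigma*eps.
        cost_pos = np.array([cost_fn(theta + sigma * e) for e in eps])
        cost_neg = np.array([cost_fn(theta - sigma * e) for e in eps])
        diff = cost_pos - cost_neg
        # Normalize by the spread of cost differences to keep step sizes stable.
        grad = (diff / (diff.std() + 1e-8)) @ eps / (2 * pop * sigma)
        theta = theta - lr * grad  # descend the estimated cost gradient
        c = cost_fn(theta)
        if c < best_cost:
            best_theta, best_cost = theta.copy(), c
    return best_theta, best_cost

# Hypothetical one-day scenario at 15-minute resolution.
data_rng = np.random.default_rng(1)
prices = 0.10 + 0.05 * np.sin(np.arange(96) * 2 * np.pi / 96)
demand = (data_rng.random(96) < 0.2).astype(float)

theta0 = np.array([50.0, 0.0])
theta_best, best_cost = evolution_strategies(
    lambda th: simulate_cost(th, prices, demand), theta0
)
print("initial cost:", simulate_cost(theta0, prices, demand))
print("best cost found:", best_cost)
```

The policy here is a step function of its parameters, so the cost is not differentiable in `theta`; ES only needs cost evaluations, and each evaluation in the population is independent, which is what makes the method easy to parallelize.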
Taxonomy
Methods: Entropy Regularization · Proximal Policy Optimization
