Stochastic Dispatch of Energy Storage in Microgrids: An Augmented Reinforcement Learning Approach
Yuwei Shang, Wenchuan Wu, Jianbo Guo, Zhe Lv, Zhao Ma, Wanxing Sheng, Ran Chen

TL;DR
This paper introduces a reinforcement learning approach, augmented with Monte-Carlo tree search and domain knowledge, to optimize the stochastic dispatch of energy storage in microgrids while accounting for battery lifecycle degradation costs and the volatility of energy resources.
Contribution
It develops an advanced RL framework with MCTS and dispatching rules to efficiently solve the complex, non-convex microgrid energy storage dispatch problem.
Findings
Outperforms baseline RL algorithms in numerical tests.
Effectively incorporates lifecycle degradation costs.
Enhances computational efficiency with MCTS.
Abstract
The dynamic dispatch (DD) of battery energy storage systems (BESSs) in microgrids integrated with volatile energy resources is essentially a multiperiod stochastic optimization problem (MSOP). Because the life span of a BESS is significantly affected by its charging and discharging behaviors, its lifecycle degradation costs should be incorporated into the DD model of BESSs, which makes it non-convex. In general, this MSOP is intractable. To solve this problem, we propose a reinforcement learning (RL) solution augmented with Monte-Carlo tree search (MCTS) and domain knowledge expressed as dispatching rules. In this solution, the Q-learning with function approximation is employed as the basic learning architecture that allows multistep bootstrapping and continuous policy learning. To improve the computation efficiency of randomized multistep simulations, we employed the MCTS to estimate…
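To make the "Q-learning with function approximation" and "multistep bootstrapping" ideas in the abstract concrete, here is a minimal sketch. It is not the authors' implementation: the feature vector, the function names, and the use of state of charge as the only state variable are all hypothetical choices made for illustration, and the MCTS and dispatching-rule components are not shown.

```python
import numpy as np

def n_step_target(rewards, gamma, bootstrap_value):
    """Discounted sum of n observed rewards plus the discounted bootstrap value.

    In Q-learning, bootstrap_value would be max over actions of Q at the
    state reached after the n steps.
    """
    target = bootstrap_value
    for r in reversed(rewards):
        target = r + gamma * target
    return target

def features(soc, action):
    """Hypothetical features for (state of charge, charge/discharge power)."""
    return np.array([1.0, soc, action, soc * action])

def q_value(weights, soc, action):
    """Linear function approximation: Q(s, a) = w . phi(s, a)."""
    return float(weights @ features(soc, action))

def semi_gradient_update(weights, soc, action, target, lr):
    """Move Q(soc, action) toward the target.

    For a linear Q, the gradient with respect to the weights is phi(s, a).
    """
    error = target - q_value(weights, soc, action)
    return weights + lr * error * features(soc, action)
```

A single update would compute a target from a short simulated trajectory with `n_step_target`, then call `semi_gradient_update` on the weights for the starting state-action pair.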
Taxonomy
Topics: Microgrid Control and Optimization · Smart Grid Energy Management · Reinforcement Learning in Robotics
