Reinforcement Learning for Volt-Var Control: A Novel Two-stage Progressive Training Strategy
Si Zhang, Mingzhi Zhang, Rongxing Hu, David Lubkeman, Yunan Liu, and Ning Lu

TL;DR
This paper introduces a novel two-stage reinforcement learning strategy for multi-agent Volt-Var Control in high solar penetration systems, improving training efficiency and control performance through cooperative learning and dynamic reward allocation.
Contribution
The paper proposes a new two-stage progressive training method for RL-based Volt-Var Control, enhancing training speed, convergence, and multi-agent cooperation in distribution systems.
Findings
The approach achieves robust control performance across various conditions.
It improves training efficiency and convergence speed.
Simulations on a modified IEEE 123-bus system confirm the method's effectiveness.
Abstract
This paper develops a reinforcement learning (RL) approach to solve a cooperative, multi-agent Volt-Var Control (VVC) problem for high solar penetration distribution systems. The ingenuity of our RL method lies in a novel two-stage progressive training strategy that can effectively improve training speed and convergence of the machine learning algorithm. In Stage 1 (individual training), while holding all the other agents inactive, we separately train each agent to obtain its own optimal VVC actions in the action space: {consume, generate, do-nothing}. In Stage 2 (cooperative training), all agents are trained again coordinatively to share VVC responsibility. Rewards and costs in our RL scheme include (i) a system-level reward (for taking an action), (ii) an agent-level reward (for doing-nothing), and (iii) an agent-level action cost function. This new framework allows rewards to be…
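The two-stage schedule and the three reward terms named in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names Action, agent_reward, and training_schedule are hypothetical, the numeric encoding of the actions is arbitrary, and the way the three terms are combined (acting agents get the system-level reward minus their action cost, idle agents get the do-nothing reward) is an assumption, since the abstract is truncated before it explains how rewards are allocated.

```python
from enum import Enum


class Action(Enum):
    """The three-element action space listed in the abstract."""
    CONSUME = -1      # encoding values are arbitrary placeholders
    GENERATE = 1
    DO_NOTHING = 0


def agent_reward(action, system_reward, idle_reward, action_cost):
    """Combine the three reward/cost terms for a single agent.

    Assumed form (not confirmed by the source):
      - an agent that acts receives the system-level reward minus
        its own agent-level action cost;
      - an agent that does nothing receives the agent-level reward.
    """
    if action is Action.DO_NOTHING:
        return idle_reward
    return system_reward - action_cost


def training_schedule(agent_ids):
    """Yield (stage, active_agents) pairs for progressive training.

    Stage 1: each agent is trained alone, all others held inactive.
    Stage 2: all agents are trained together.
    """
    for agent in agent_ids:
        yield ("stage1", [agent])
    yield ("stage2", list(agent_ids))


if __name__ == "__main__":
    for stage, active in training_schedule(["inv_a", "inv_b", "inv_c"]):
        print(stage, active)
    print(agent_reward(Action.GENERATE, 1.0, 0.2, 0.25))
    print(agent_reward(Action.DO_NOTHING, 1.0, 0.2, 0.25))
```

In a real setup, each schedule entry would drive an RL training loop against a power-flow simulation; the sketch shows only the ordering of the stages and the reward bookkeeping.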
Taxonomy
Topics: Smart Grid Energy Management · Microgrid Control and Optimization · Optimal Power Flow Distribution
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
