Newton's Lantern: A Reinforcement Learning Framework for Finetuning AC Power Flow Warm Start Models
Shourya Bose, Helgi Hilmarsson, Dhruv Suri

TL;DR
Newton's Lantern is a reinforcement learning framework that finetunes neural warm-start models for the AC power flow problem, improving Newton-Raphson convergence and reducing iteration counts on heavily loaded instances near voltage collapse.
Contribution
It introduces an RL-based finetuning pipeline that pairs group relative policy optimization with a learned reward model and outperforms existing methods on power flow instances near voltage collapse.
Findings
Converges on every test snapshot across benchmarks.
Achieves the smallest mean iteration count.
Identifies the failure mode of supervised regression near the saddle-node bifurcation, where the smallest singular value of the power-flow Jacobian shrinks.
Abstract
Neural warm starts can sharply reduce the number of Newton-Raphson iterations required to solve the AC power flow problem, but existing supervised approaches generalize poorly on heavily loaded instances near voltage collapse. We prove a lower bound on the Newton-Raphson iteration count that depends on the direction of the warm start error rather than on its magnitude, and show as a corollary that the bound becomes vacuous as the smallest singular value of the power-flow Jacobian shrinks, identifying the failure mode of supervised regression near the saddle-node bifurcation. Motivated by this analysis, we introduce Newton's Lantern, a finetuning pipeline that combines group relative policy optimization with a learned reward model trained on perturbations of the base model's predictions, using the iteration count itself as the supervisory signal. Across IEEE 118-bus, GOC 500-bus, and GOC…
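The idea of using the Newton-Raphson iteration count as a training signal can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: a lossless two-bus system (slack bus at 1.0 pu, line reactance `x`, load `p_d`, `q_d`), hand-picked candidate warm starts, and hypothetical function names. The sketch also uses the raw iteration count as the reward, whereas the abstract describes a learned reward model, and it shows only the group-normalization step associated with group relative policy optimization, not a full training loop.

```python
import math

def residual(theta, v, p_d, q_d, x):
    """Power mismatch at the load bus of a lossless two-bus system (assumed toy model)."""
    f_p = (v / x) * math.sin(theta) + p_d
    f_q = (v * v - v * math.cos(theta)) / x + q_d
    return f_p, f_q

def newton_iterations(theta, v, p_d=4.5, q_d=0.0, x=0.1, tol=1e-8, max_iter=50):
    """Return the number of Newton-Raphson steps taken from the start (theta, v).

    Returns max_iter if the tolerance is not met or the Jacobian is numerically singular.
    """
    for k in range(max_iter):
        f_p, f_q = residual(theta, v, p_d, q_d, x)
        if max(abs(f_p), abs(f_q)) < tol:
            return k
        # Analytic 2x2 Jacobian of the mismatch with respect to (theta, v).
        j11 = (v / x) * math.cos(theta)
        j12 = math.sin(theta) / x
        j21 = (v / x) * math.sin(theta)
        j22 = (2.0 * v - math.cos(theta)) / x
        det = j11 * j22 - j12 * j21
        if abs(det) < 1e-12:
            return max_iter
        # Solve J * step = -f with Cramer's rule.
        theta += (-f_p * j22 + f_q * j12) / det
        v += (-f_q * j11 + f_p * j21) / det
    return max_iter

def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group: (r - mean) / (std + eps)."""
    mean = sum(rewards) / len(rewards)
    std = math.sqrt(sum((r - mean) ** 2 for r in rewards) / len(rewards))
    return [(r - mean) / (std + eps) for r in rewards]

# Flat start versus a start close to the high-voltage solution of the toy system.
flat_iters = newton_iterations(0.0, 1.0)
warm_iters = newton_iterations(-0.55, 0.85)

# A group of hypothetical candidate warm starts for the same load snapshot.
candidates = [(0.0, 1.0), (-0.3, 0.95), (-0.45, 0.9), (-0.55, 0.85)]
rewards = [-newton_iterations(t, v) for t, v in candidates]  # fewer iterations, higher reward
advantages = group_relative_advantages(rewards)

print("flat start iterations:", flat_iters)
print("warm start iterations:", warm_iters)
print("advantages:", [round(a, 3) for a in advantages])
```

In this toy setting the load of 4.5 pu sits close to the 5.0 pu transfer limit of the assumed line, so the example loosely mimics a heavily loaded instance. Candidates that need fewer iterations receive higher advantages within the group.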
