gridfm-datakit-v1: A Python Library for Scalable and Realistic Power Flow and Optimal Power Flow Data Generation
Alban Puech, Matteo Mazzonelli, Celia Cintas, Tamara R. Govindasamy, Mangaliso Mngomezulu, Jonas Weiss, Matteo Baù, Anna Varbella, François Mirallès, Kibaek Kim, Le Xie, Hendrik F. Hamann, Etienne Vos, Thomas Brunschwiler

TL;DR
gridfm-datakit-v1 is a Python library that generates diverse, realistic power flow and optimal power flow datasets, enabling better training of machine learning models for power system analysis, especially under varied and challenging conditions.
Contribution
It introduces a novel dataset generation approach that incorporates realistic stochastic perturbations, scenario diversity, and variable generator costs, scaling efficiently to large power grids.
Findings
Creates more diverse and realistic datasets for ML training.
Enables generation of data beyond standard operating limits.
Scales efficiently to large power systems with up to 10,000 buses.
Abstract
We introduce gridfm-datakit-v1, a Python library for generating realistic and diverse Power Flow (PF) and Optimal Power Flow (OPF) datasets for training Machine Learning (ML) solvers. Existing datasets and libraries face three main challenges: (1) lack of realistic stochastic load and topology perturbations, limiting scenario diversity; (2) PF datasets are restricted to OPF-feasible points, hindering generalization of ML solvers to cases that violate operating limits (e.g., branch overloads or voltage violations); and (3) OPF datasets use fixed generator cost functions, limiting generalization across varying costs. gridfm-datakit addresses these challenges by: (1) combining global load scaling from real-world profiles with localized noise and supporting arbitrary N-k topology perturbations to create diverse yet realistic datasets; (2) generating PF samples beyond operating limits; and…
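To make the two perturbation ideas in the abstract concrete, here is a minimal NumPy sketch of (a) global load scaling from an aggregate profile combined with localized per-bus noise and (b) drawing a random N-k outage set. It is an illustration only and does not use the gridfm-datakit API: the function names `perturb_loads` and `sample_n_minus_k`, the uniform multiplicative noise model, and the `sigma` parameter are all assumptions made for this example.

```python
import numpy as np


def perturb_loads(base_loads, profile, sigma=0.05, seed=0):
    """Hypothetical helper: scale nominal bus loads by an aggregate profile
    and add independent per-bus noise.

    Returns an array of shape (n_scenarios, n_buses).
    """
    rng = np.random.default_rng(seed)
    base = np.asarray(base_loads, dtype=float)
    prof = np.asarray(profile, dtype=float)
    # Global scaling: in each scenario, every bus follows the same profile value.
    scaled = prof[:, None] * base[None, :]
    # Localized noise (assumed uniform here): one factor per bus and scenario,
    # drawn from [1 - sigma, 1 + sigma].
    noise = rng.uniform(1.0 - sigma, 1.0 + sigma, size=scaled.shape)
    return scaled * noise


def sample_n_minus_k(n_branches, k, seed=0):
    """Hypothetical helper: pick k distinct branches to take out of service.

    Returns a boolean in-service mask of length n_branches.
    """
    rng = np.random.default_rng(seed)
    outaged = rng.choice(n_branches, size=k, replace=False)
    in_service = np.ones(n_branches, dtype=bool)
    in_service[outaged] = False
    return in_service


# Three buses, three profile steps (for example off-peak, nominal, peak).
loads = perturb_loads([10.0, 20.0, 30.0], [0.8, 1.0, 1.2], sigma=0.05)
mask = sample_n_minus_k(n_branches=10, k=2)
```

The sketch does not check that the network stays connected after the outages; a real generator would need to reject or repair outage sets that island part of the grid before running a power flow.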
Taxonomy
Topics: Optimal Power Flow Distribution · Power System Optimization and Stability · Energy Load and Power Forecasting
