A Principled Framework to Evaluate Quality of AC-OPF Datasets for Machine Learning: Benchmarking a Novel, Scalable Generation Method

Matteo Baù (1); Luca Perbellini (2); Samuele Grillo (2) ((1) Ricerca sul Sistema Energetico; (2) Politecnico di Milano)

arXiv:2508.19083 · eess.SY · August 27, 2025

TL;DR

This paper introduces a multi-criteria framework for evaluating the quality of AC-OPF datasets used in machine learning, and proposes a scalable generation heuristic that it reports improves on existing approaches in both quality and scalability.

Contribution

The paper presents a new heuristic for dataset generation and a multi-criteria evaluation framework, addressing two gaps in AC-OPF dataset quality assessment: scalability to large power systems and the lack of widely accepted metrics for comparing generation approaches.

Findings

01. The heuristic improves dataset quality over random sampling methods.

02. The evaluation framework effectively compares different AC-OPF dataset generation approaches.

03. The proposed method balances dataset diversity and scalability effectively.

Abstract

Several methods have been proposed in the literature to improve the quality of AC optimal power flow (AC-OPF) datasets used in machine learning (ML) models. Yet, scalability to large power systems remains unaddressed and comparing generation approaches is still hindered by the absence of widely accepted metrics quantifying AC-OPF dataset quality. In this work, we tackle both these limitations. We provide a simple heuristic that samples load setpoints uniformly in total load active power, rather than maximizing volume coverage, and solves an AC-OPF formulation with load slack variables to improve convergence. For quality assessment, we formulate a multi-criteria framework based on three metrics, measuring variability in the marginal distributions of AC-OPF primal variables, diversity in constraint activation patterns among AC-OPF instances and activation frequency of variable bounds. By…
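
The sampling idea and two of the quality metrics described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the `total_range` and `spread` parameters, and the per-load perturbation scheme are all hypothetical choices made for illustration, and the metric helpers are only one plausible reading of "diversity in constraint activation patterns" and "activation frequency of variable bounds". The sketch stops at generating load setpoints; solving the AC-OPF with load slack variables is left to an external solver.

```python
import random


def sample_loads_uniform_total(p_nominal, n_samples, total_range=(0.6, 1.0),
                               spread=0.2, seed=0):
    """Draw load setpoints whose TOTAL active power is uniformly distributed.

    Assumptions (not from the paper): nominal loads are positive, the total is
    sampled as a fraction of the nominal total within `total_range`, and each
    load is perturbed by an independent factor in [1 - spread, 1 + spread].
    """
    rng = random.Random(seed)
    nominal_total = sum(p_nominal)
    samples = []
    for _ in range(n_samples):
        # Target total active power, uniform over the assumed range.
        target = rng.uniform(*total_range) * nominal_total
        # Independent perturbations give each sample its own spatial pattern.
        raw = [p * rng.uniform(1.0 - spread, 1.0 + spread) for p in p_nominal]
        # Rescale so the individual loads sum to the sampled total.
        scale = target / sum(raw)
        samples.append([scale * p for p in raw])
    return samples


def activation_pattern_diversity(patterns):
    """Fraction of instances with a distinct constraint activation pattern.

    `patterns` is a list of rows, one per AC-OPF instance, where each entry is
    True if the corresponding constraint is active (hypothetical encoding).
    """
    return len({tuple(row) for row in patterns}) / len(patterns)


def bound_activation_frequency(patterns):
    """Share of instances in which each bound is active, one value per column."""
    n = len(patterns)
    return [sum(col) / n for col in zip(*patterns)]
```

A usage example under the same assumptions: `sample_loads_uniform_total([50.0, 30.0, 20.0], n_samples=1000)` returns 1000 three-load setpoints whose sums are spread uniformly between 60 and 100 (in the units of the nominal loads). The activation matrices passed to the two metric helpers would come from inspecting which constraints are binding in each solved instance.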
