ProOPF: Benchmarking and Improving LLMs for Professional-Grade Power Systems Optimization Modeling
Chao Shen, Zihan Guo, Xu Wan, Zhenghao Yang, Yifan Zhang, Wengi Huang, Jie Song, Zongyan Zhang, Mingyang Sun

TL;DR
ProOPF introduces a new dataset and benchmark for evaluating large language models' ability to translate natural language into executable power system optimization models, specifically for optimal power flow tasks.
Contribution
ProOPF provides the first comprehensive dataset and benchmark tailored for professional-grade power system optimization modeling using LLMs, addressing a gap in existing evaluation resources.
Findings
ProOPF-D contains 12,000 instances with NL requests and executable models.
ProOPF-B offers 121 expert-annotated test cases with ground-truth code.
Together, ProOPF-D and ProOPF-B enable rigorous end-to-end evaluation of LLMs in power system optimization modeling.
Abstract
Growing renewable penetration introduces substantial uncertainty into power system operations, necessitating frequent adaptation of dispatch objectives and constraints and challenging expertise-intensive, near-real-time modeling workflows. Large Language Models (LLMs) provide a promising avenue for automating this process by translating natural-language (NL) operational requirements into executable optimization models via semantic reasoning and code synthesis. Yet existing LLM datasets and benchmarks for optimization modeling primarily target coarse-grained cross-domain generalization, offering limited, rigorous evaluation in power-system settings, particularly for Optimal Power Flow (OPF). We therefore introduce ProOPF-D and ProOPF-B, a dataset and benchmark for professional-grade OPF modeling: ProOPF-D contains 12K instances pairing NL requests with parameter…
Taxonomy
Topics: Optimal Power Flow Distribution · Energy Load and Power Forecasting · Advanced Graph Neural Networks
