Performance of LLMs on Stochastic Modeling Operations Research Problems: From Theory to Practice
Akshit Kumar, Tianyi Peng, Yuhang Wu, Assaf Zeevi

TL;DR
This paper evaluates how well large language models solve stochastic operations research problems. It finds that they perform comparably to human experts in both academic and practical settings, which suggests they can assist OR research.
Contribution
First comprehensive assessment of LLMs on stochastic OR problems, combining theoretical problem-solving and real-world decision-making tasks using simulation-optimization tools.
Findings
LLMs show proficiency comparable to human experts in stochastic OR tasks.
State-of-the-art LLMs can assist in automating parts of the stochastic modeling pipeline.
Further work is needed to reliably automate stochastic modeling in real-world applications.
Abstract
Large language models (LLMs) have exhibited expert-level capabilities across various domains. However, their abilities to solve problems in Operations Research (OR) -- the analysis and optimization of mathematical models derived from real-world problems or their verbal descriptions -- remain underexplored. In this work, we take a first step toward evaluating LLMs' abilities to solve stochastic modeling problems, a core class of OR problems characterized by uncertainty and typically involving tools from probability, statistics, and stochastic processes. We manually procure a representative set of graduate-level homework and doctoral qualification-exam problems and test LLMs' abilities to solve them. We further leverage SimOpt, an open-source library of simulation-optimization problems and solvers, to investigate LLMs' abilities to make real-world decisions under uncertainty. Our results…
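To make "decisions under uncertainty" concrete, here is a minimal Python sketch of a simulation-optimization task of the kind the abstract describes. It is an illustration only, not the paper's benchmark and not SimOpt's API: the newsvendor setup, the exponential demand with mean 100, the price and cost values, and the function names `simulate_profit` and `best_order_quantity` are all assumptions chosen for the example.

```python
import random
import statistics


def simulate_profit(order_qty, demands, price=5.0, cost=3.0):
    """Average profit of ordering `order_qty` units over sampled demands.

    Hypothetical newsvendor model: each unit costs `cost`, sells for `price`,
    and unsold units are worth nothing.
    """
    profits = [price * min(order_qty, d) - cost * order_qty for d in demands]
    return statistics.fmean(profits)


def best_order_quantity(candidates, n_samples=20000, seed=0, mean_demand=100.0):
    """Pick the candidate order quantity with the highest simulated profit."""
    rng = random.Random(seed)
    # Common random numbers: every candidate is scored on the same demand
    # sample, which reduces noise when comparing candidates.
    demands = [rng.expovariate(1.0 / mean_demand) for _ in range(n_samples)]
    return max(candidates, key=lambda q: simulate_profit(q, demands))


if __name__ == "__main__":
    grid = range(0, 201, 10)  # assumed candidate order quantities
    print("Simulated best order quantity:", best_order_quantity(grid))
```

Under these assumed parameters the problem also has a closed-form answer, the critical-ratio quantile q* = -100 ln(cost/price) ≈ 51.1, so the simulated choice on the grid can be checked against theory. Pairing an analytical solution with a simulated decision in this way mirrors the two kinds of task the abstract mentions.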
Taxonomy
Topics: Constraint Satisfaction and Optimization · Forecasting Techniques and Applications · Spreadsheets and End-User Computing
