DualSchool: How Reliable are LLMs for Optimization Education?
Michael Klamkin, Arnaud Deza, Sikai Cheng, Haoruo Zhao, Pascal Van Hentenryck

TL;DR
This paper evaluates the reliability of large language models in generating correct duals of linear programs, revealing significant limitations even in simple cases, and introduces DualSchool, a framework for generating and verifying such instances.
Contribution
The paper introduces DualSchool, a comprehensive framework for generating and verifying primal-dual conversion instances, and provides an empirical assessment of LLMs' performance on this task.
Findings
LLMs often fail to produce correct duals despite reciting conversion procedures.
State-of-the-art open LLMs struggle with simple two-variable instances.
Verification based on the Canonical Graph Edit Distance avoids many of the false positives and false negatives that existing evaluation methods for optimization models exhibit on P2DC.
Abstract
Consider the following task taught in introductory optimization courses which addresses challenges articulated by the community at the intersection of (generative) AI and OR: generate the dual of a linear program. LLMs, being trained at web-scale, have the conversion process and many instances of Primal to Dual Conversion (P2DC) at their disposal. Students may thus reasonably expect that LLMs would perform well on the P2DC task. To assess this expectation, this paper introduces DualSchool, a comprehensive framework for generating and verifying P2DC instances. The verification procedure of DualSchool uses the Canonical Graph Edit Distance, going well beyond existing evaluation methods for optimization models, which exhibit many false positives and negatives when applied to P2DC. Experiments performed by DualSchool reveal interesting findings. Although LLMs can recite the conversion…
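To make the P2DC task concrete, here is a minimal sketch of the textbook conversion for the symmetric form only: a primal "min c^T x subject to A x >= b, x >= 0" has the dual "max b^T y subject to A^T y <= c, y >= 0". The `LP` class and `dual` function are hypothetical names chosen for illustration and are not part of DualSchool. General LPs with equality constraints, free variables, or mixed inequality directions need additional sign rules that this sketch does not cover.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LP:
    """Symmetric-form LP (an assumed convention for this sketch).

    sense == "min": minimize c^T x subject to A x >= b, x >= 0
    sense == "max": maximize c^T x subject to A x <= b, x >= 0
    """
    sense: str
    c: List[float]        # objective coefficients, one per variable
    A: List[List[float]]  # constraint matrix, one row per constraint
    b: List[float]        # right-hand sides, one per constraint


def dual(p: LP) -> LP:
    """Return the dual of a symmetric-form LP.

    The objective and right-hand side swap roles, the constraint
    matrix is transposed, and the optimization sense flips.
    """
    n_rows, n_cols = len(p.A), len(p.c)
    A_t = [[p.A[i][j] for i in range(n_rows)] for j in range(n_cols)]
    new_sense = "max" if p.sense == "min" else "min"
    return LP(new_sense, list(p.b), A_t, list(p.c))


# A two-variable instance:
#   min 2 x1 + 3 x2
#   s.t.  x1 + 2 x2 >= 4
#        3 x1 +   x2 >= 6
#        x1, x2 >= 0
primal = LP("min", [2.0, 3.0], [[1.0, 2.0], [3.0, 1.0]], [4.0, 6.0])
d = dual(primal)
print(d)
```

For this instance the dual is "max 4 y1 + 6 y2 subject to y1 + 3 y2 <= 2, 2 y1 + y2 <= 3, y1, y2 >= 0". Taking the dual again recovers the primal, which gives a quick sanity check on any candidate conversion in this restricted form.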
Taxonomy
Topics: Higher Education Learning Practices
