OptArgus: A Multi-Agent System to Detect Hallucinations in LLM-based Optimization Modeling
Zhong Li, Zihan Guo, Xiaohan Lu, Juntao Wang, Jie Song, Chao Shen, Jiageng Wu, Mingyang Sun

TL;DR
This paper introduces OptArgus, a multi-agent system that detects hallucinations in LLM-generated optimization models and is reported to outperform single-agent baselines.
Contribution
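It presents what the authors describe as, to their knowledge, the first fine-grained taxonomy of optimization-modeling hallucinations, covering objective, variable, constraint, and implementation failures, and uses it to design OptArgus, a multi-agent detector for structural consistency auditing.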
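The abstract describes the detector as combining conductor routing, specialist auditors, and evidence consolidation. The following is only a minimal sketch of that three-stage shape, not the authors' implementation: the artifact fields (`description_sense`, `model_sense`, `described_constraints`, `model_constraints`), the rule-based auditors, and the routing rule are all hypothetical stand-ins for what would presumably be LLM-backed agents, and only two of the four taxonomy categories are covered to keep the example short.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    category: str  # one of the taxonomy categories, e.g. "objective"
    message: str


def audit_objective(artifact):
    # Hypothetical check: does the model optimize in the described direction?
    if artifact["description_sense"] != artifact["model_sense"]:
        return [Finding("objective", "optimization sense differs from the description")]
    return []


def audit_constraint(artifact):
    # Hypothetical check: is every described constraint present in the model?
    missing = set(artifact["described_constraints"]) - set(artifact["model_constraints"])
    return [Finding("constraint", f"missing constraint: {name}") for name in sorted(missing)]


# Specialist auditors, keyed by taxonomy category (two of four, for brevity).
AUDITORS = {"objective": audit_objective, "constraint": audit_constraint}


def conductor(artifact):
    """Toy routing rule: call only auditors whose inputs are present."""
    routes = []
    if "model_sense" in artifact:
        routes.append("objective")
    if "model_constraints" in artifact:
        routes.append("constraint")
    return routes


def consolidate(findings):
    """Group findings by category and drop duplicate messages."""
    report = {}
    for finding in findings:
        messages = report.setdefault(finding.category, [])
        if finding.message not in messages:
            messages.append(finding.message)
    return report


def detect(artifact):
    findings = []
    for name in conductor(artifact):
        findings.extend(AUDITORS[name](artifact))
    return consolidate(findings)


faulty = {
    "description_sense": "max",
    "model_sense": "min",
    "described_constraints": ["capacity", "demand"],
    "model_constraints": ["capacity"],
}
clean = dict(faulty, model_sense="max", model_constraints=["capacity", "demand"])

print(detect(faulty))
print(detect(clean))
```

Keeping routing, auditing, and consolidation as separate functions means an auditor can be swapped or added per category without touching the others, and an empty report on the clean artifact corresponds to raising no alarm.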
Findings
OptArgus reduces false alarms on clean artifacts.
It achieves more accurate localization of errors in controlled tests.
It outperforms single-agent baselines on natural LLM outputs.
Abstract
Large language models (LLMs) are increasingly used to translate natural-language optimization problems into mathematical formulations and solver code, but matching the reference objective value is not a reliable test of correctness: an artifact may agree numerically while still changing the underlying optimization semantics. We formulate this issue as optimization-modeling hallucination detection, namely structural consistency auditing over the problem description, symbolic model, and solver implementation. We develop, to our knowledge, the first fine-grained hallucination taxonomy specifically for optimization modeling, spanning objective, variable, constraint, and implementation failures. We use this taxonomy to design OptArgus, a multi-agent detector with conductor routing, specialist auditors, and evidence consolidation. To evaluate this setting, we introduce a three-part…
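The claim that objective-value agreement does not establish correctness can be illustrated with a toy example. The sketch below is not taken from the paper: the problem (maximize 3x + 2y subject to x + y <= 4, x <= x_cap, and y >= 1 over small non-negative integers), the `solve` helper, and the brute-force search are all assumptions chosen to keep the example dependency-free. The "hallucinated" model silently drops the y >= 1 requirement.

```python
from itertools import product


def solve(x_cap, require_y):
    """Brute-force maximum of 3x + 2y over integers 0..10.

    require_y=True is the intended model (includes y >= 1);
    require_y=False is the hypothetical faulty model that omits it.
    Returns (objective, x, y).
    """
    best = None
    for x, y in product(range(11), repeat=2):
        if x + y > 4 or x > x_cap:
            continue
        if require_y and y < 1:
            continue
        value = 3 * x + 2 * y
        if best is None or value > best[0]:
            best = (value, x, y)
    return best


# On the reference instance the dropped constraint is not binding,
# so both models report the same objective value.
print(solve(x_cap=3, require_y=True))   # (11, 3, 1)
print(solve(x_cap=3, require_y=False))  # (11, 3, 1)

# On a slightly different instance the faulty model diverges.
print(solve(x_cap=4, require_y=True))   # (11, 3, 1)
print(solve(x_cap=4, require_y=False))  # (12, 4, 0)
```

With x_cap = 3 the two models are numerically indistinguishable, yet they encode different feasible regions. This is why a check that compares the model's structure against the problem description can catch errors that a comparison of objective values would miss.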
