Let's Have a Conversation: Designing and Evaluating LLM Agents for Interactive Optimization

Joshua Drossman; Alexandre Jacquillat; Sébastien Martin

arXiv:2604.02666·cs.AI·April 6, 2026

TL;DR

This paper introduces a scalable methodology for evaluating LLM-based optimization agents through simulated conversations with LLM-powered decision agents that role-play stakeholders, and applies it to a school scheduling case study.

Contribution

It presents a novel conversation-based evaluation framework for optimization agents and shows how tailored, domain-specific prompts enhance their performance.

Findings

01. Conversation-based evaluation reveals higher-quality solutions than one-shot methods.

02. Tailored optimization agents outperform general-purpose chatbots in fewer interactions.

03. Operations research expertise improves the design and reliability of interactive optimization agents.

Abstract

Optimization is as much about modeling the right problem as solving it. Identifying the right objectives, constraints, and trade-offs demands extensive interaction between researchers and stakeholders. Large language models can empower decision-makers with optimization capabilities through interactive optimization agents that can propose, interpret and refine solutions. However, it is fundamentally harder to evaluate a conversation-based interaction than traditional one-shot approaches. This paper proposes a scalable and replicable methodology for evaluating optimization agents through conversations. We build LLM-powered decision agents that role-play diverse stakeholders, each governed by an internal utility function but communicating like a real decision-maker. We generate thousands of conversations in a school scheduling case study. Results show that one-shot evaluation is severely…
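The evaluation idea in the abstract (a simulated stakeholder with a hidden utility function conversing with an optimization agent) can be illustrated with a toy loop. This is only a minimal sketch under strong assumptions: it makes no LLM calls, and the candidate schedules, criterion names, weights, and update rule are all hypothetical and not taken from the paper.

```python
from dataclasses import dataclass

# Hypothetical candidate schedules, scored on two made-up criteria in [0, 1]
# (higher is better). None of these values come from the paper.
CANDIDATES = {
    "A": {"teacher_preference": 0.9, "compact_days": 0.2},
    "B": {"teacher_preference": 0.7, "compact_days": 0.6},
    "C": {"teacher_preference": 0.3, "compact_days": 0.9},
}


@dataclass
class SimulatedStakeholder:
    """Stand-in for a role-playing decision agent: hidden utility, verbal feedback."""
    weights: dict  # never shown to the optimization agent

    def utility(self, schedule):
        return sum(w * schedule[k] for k, w in self.weights.items())

    def feedback(self, schedule):
        # Complain about the criterion with the largest weighted shortfall.
        worst = max(self.weights, key=lambda k: self.weights[k] * (1 - schedule[k]))
        return f"I would like better {worst}"


class OptimizationAgent:
    """Toy agent: keeps a weight estimate and re-solves after each message."""

    def __init__(self, criteria):
        self.weights = {k: 1.0 for k in criteria}

    def propose(self):
        def score(name):
            return sum(w * CANDIDATES[name][k] for k, w in self.weights.items())
        return max(CANDIDATES, key=score)

    def update(self, message):
        # Crude "interpretation" step: boost any criterion named in the message.
        for k in self.weights:
            if k in message:
                self.weights[k] += 1.0


def run_conversation(stakeholder, turns):
    """Return the stakeholder's hidden utility for the proposal made at each turn."""
    agent = OptimizationAgent(["teacher_preference", "compact_days"])
    utilities = []
    for _ in range(turns):
        schedule = CANDIDATES[agent.propose()]
        utilities.append(stakeholder.utility(schedule))
        agent.update(stakeholder.feedback(schedule))
    return utilities


stakeholder = SimulatedStakeholder({"teacher_preference": 0.2, "compact_days": 0.8})
utilities = run_conversation(stakeholder, turns=3)
one_shot, best_in_conversation = utilities[0], max(utilities)
```

In this toy setup the first proposal corresponds to a one-shot evaluation, while the best utility reached over several turns reflects what the conversation can uncover. The naive update rule can oscillate between proposals, which is why the sketch tracks the best utility rather than the last one.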