Tool-Aware Planning in Contact Center AI: Evaluating LLMs through Lineage-Guided Query Decomposition
Varun Nathan, Shreyas Guha, Ayush Kumar

TL;DR
This paper introduces a framework and benchmark for evaluating how well large language models decompose complex contact center queries into executable steps using structured and unstructured tools, highlighting current limitations and avenues for improvement.
Contribution
It presents a comprehensive evaluation framework, a data curation methodology for high-quality plan generation, and a large-scale study of 14 LLMs on query decomposition tasks in contact centers.
Findings
LLMs struggle with compound and multi-step queries.
The best model achieves an overall score of 84.8%, but a match rate of only 49.75% at the top tier.
Shorter, simpler plans are significantly easier for models to generate accurately.
Abstract
We present a domain-grounded framework and benchmark for tool-aware plan generation in contact centers, where answering a query for business insights, our target use case, requires decomposing it into executable steps over structured tools (Text2SQL (T2S)/Snowflake) and unstructured tools (RAG/transcripts) with explicit depends_on for parallelism. Our contributions are threefold: (i) a reference-based plan evaluation framework operating in two modes - a metric-wise evaluator spanning seven dimensions (e.g., tool-prompt alignment, query adherence) and a one-shot evaluator; (ii) a data curation methodology that iteratively refines plans via an evaluator->optimizer loop to produce high-quality plan lineages (ordered plan revisions) while reducing manual effort; and (iii) a large-scale study of 14 LLMs across sizes and families for their ability to decompose queries into step-by-step,…
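To make the role of depends_on concrete, the sketch below shows one way a plan with explicit dependencies could be grouped into batches of steps that can run in parallel. It is a minimal illustration, not the paper's implementation: the field names id, tool, and prompt, the example prompts, and the helper parallel_waves are all hypothetical; only the idea of steps over T2S and RAG tools linked by depends_on comes from the abstract.

```python
from typing import Dict, List, Set

# Hypothetical plan. Only `depends_on` and the tool names (T2S, RAG) come from
# the abstract; the other field names and the prompts are illustrative.
plan: List[dict] = [
    {"id": 1, "tool": "T2S", "prompt": "Count calls per queue last month", "depends_on": []},
    {"id": 2, "tool": "RAG", "prompt": "Summarize complaint themes in transcripts", "depends_on": []},
    {"id": 3, "tool": "T2S", "prompt": "Average handle time for the busiest queue", "depends_on": [1]},
    {"id": 4, "tool": "RAG", "prompt": "Relate complaint themes to the busiest queue", "depends_on": [2, 3]},
]


def parallel_waves(steps: List[dict]) -> List[List[int]]:
    """Group step ids into waves; all steps in one wave can run concurrently."""
    remaining: Dict[int, Set[int]] = {s["id"]: set(s["depends_on"]) for s in steps}
    done: Set[int] = set()
    waves: List[List[int]] = []
    while remaining:
        # A step is ready once every step it depends on has completed.
        ready = sorted(sid for sid, deps in remaining.items() if deps <= done)
        if not ready:
            raise ValueError("plan has a cycle or an unknown dependency")
        waves.append(ready)
        done.update(ready)
        for sid in ready:
            del remaining[sid]
    return waves


waves = parallel_waves(plan)
print(waves)  # [[1, 2], [3], [4]]
```

Here steps 1 and 2 have no dependencies and can run together, step 3 waits for step 1, and step 4 waits for both 2 and 3. A plan whose steps depend on each other in a cycle is rejected rather than executed.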
Taxonomy
Topics: AI-based Problem Solving and Planning · Semantic Web and Ontologies · Data Visualization and Analytics
