AWARE-US: Preference-Aware Infeasibility Resolution in Tool-Calling Agents

Mehmet Kurmaz

arXiv:2601.02643·cs.AI·March 3, 2026

TL;DR

This paper introduces AWARE-US, a benchmark and methods for preference-aware infeasibility resolution in tool-calling agents, enabling more accurate and user-aligned query repairs in dialogue systems.

Contribution

It proposes three LLM-based methods for inferring constraint importance and introduces AWARE-US, a new benchmark for persona-grounded infeasibility resolution in conversational agents.

Findings

01. Local weighting best aligns with user preferences (48%)
02. Global weighting achieves highest correct-relaxation accuracy (56%)
03. All three methods outperform prior baselines in infeasibility resolution

Abstract

Tool-calling conversational agents querying structured databases often face two linked failures: underspecification (missing constraints needed for a precise query) and infeasibility (a fully specified query returns an empty set). Prior systems often respond with "no results" or apply ad hoc relaxations, which can violate user intent by discarding highly valued requirements. We cast infeasibility handling as preference-aware query repair: when a query is unsatisfiable, the agent should relax the least important constraints. We propose three LLM-based methods to infer relative constraint importance from dialogue: (1) local weighting, (2) global one-shot weighting, and (3) pairwise ranking. Across extensive experiments in car recommendation, the local-weighting method trained with supervised fine-tuning and direct preference optimization best aligns with user preferences (48%), while global…
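The idea of preference-aware query repair described in the abstract can be illustrated with a small sketch. This is not the paper's implementation: the function `repair_query`, the toy car records, and the importance weights are all hypothetical, and the weights stand in for whatever an LLM-based method would infer from dialogue. The sketch simply searches for the set of constraints with the lowest total importance whose removal makes the query return at least one result.

```python
from itertools import combinations
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]
Constraint = Callable[[Record], bool]


def repair_query(
    records: List[Record],
    constraints: Dict[str, Constraint],
    importance: Dict[str, float],
) -> Tuple[List[Record], List[str]]:
    """Relax the least important set of constraints that makes the query feasible.

    Hypothetical helper: `importance` is assumed to be given (e.g. inferred
    from dialogue); higher values mean the user cares more.
    """
    names = list(constraints)
    # Enumerate every subset of constraints that could be relaxed, ordered by
    # the total importance that would be given up (cheapest first).
    candidates = sorted(
        (subset for k in range(len(names) + 1) for subset in combinations(names, k)),
        key=lambda subset: sum(importance[n] for n in subset),
    )
    for relaxed in candidates:
        kept = [constraints[n] for n in names if n not in relaxed]
        matches = [r for r in records if all(check(r) for check in kept)]
        if matches:
            return matches, list(relaxed)
    # Only reached when there are no records at all.
    return [], names


# Toy database and constraints (illustrative values, not from the paper).
cars = [
    {"model": "A", "price": 32000, "fuel": "hybrid", "seats": 5},
    {"model": "B", "price": 24000, "fuel": "petrol", "seats": 7},
]
constraints = {
    "price": lambda r: r["price"] <= 25000,
    "fuel": lambda r: r["fuel"] == "hybrid",
    "seats": lambda r: r["seats"] >= 5,
}
importance = {"price": 0.2, "fuel": 0.7, "seats": 0.1}

matches, relaxed = repair_query(cars, constraints, importance)
print(relaxed, [r["model"] for r in matches])
```

In this toy case the full query is infeasible, and dropping the least important constraint alone (`seats`) does not help, so the search relaxes `price` and keeps the highly valued `fuel` requirement. The subset enumeration is exponential in the number of constraints, which is acceptable for a handful of constraints but would need a smarter search at scale.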

Code & Models

Datasets

MehmetKurmaz/AWARE-US
dataset · 28 dl

Taxonomy

Topics: Speech and dialogue systems · Topic Modeling · Personal Information Management and User Behavior