Implicit Intelligence -- Evaluating Agents on What Users Don't Say
Ved Sirdeshmukh, Marc Wetter

TL;DR
This paper introduces Implicit Intelligence, an evaluation framework that tests whether AI agents can infer unstated user requirements in interactive scenarios. The results reveal significant gaps in current models' contextual reasoning abilities.
Contribution
It presents a novel evaluation method and environment for testing AI agents' ability to infer implicit constraints and goals beyond explicit instructions.
Findings
Best model achieves only 48.3% success rate
Current models struggle with implicit reasoning tasks
Significant room for improvement in contextual understanding
Abstract
Real-world requests to AI agents are fundamentally underspecified. Natural human communication relies on shared context and unstated constraints that speakers expect listeners to infer. Current agentic benchmarks test explicit instruction-following but fail to evaluate whether agents can reason about implicit requirements spanning accessibility needs, privacy boundaries, catastrophic risks, and contextual constraints. We present Implicit Intelligence, an evaluation framework testing whether AI agents can move beyond prompt-following to become genuine goal-fulfillers, paired with Agent-as-a-World (AaW), a harness where interactive worlds are defined in human-readable YAML files and simulated by language models. Our scenarios feature apparent simplicity in user requests, hidden complexity in correct solutions, and discoverability of constraints through environmental exploration.…
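To make the three scenario properties in the abstract concrete (a simple-looking request, a hidden constraint, and a way to discover it by exploring the environment), here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's YAML schema or its scoring method: the class names, fields, action strings, and the example scenario are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ImplicitConstraint:
    """Hypothetical record of one unstated requirement (not the paper's schema)."""
    description: str       # what the user expects but never says
    discoverable_via: str  # exploratory action that would reveal the constraint
    forbidden_action: str  # action that violates the constraint


@dataclass
class Scenario:
    """Hypothetical scenario: an explicit request plus hidden constraints."""
    user_request: str
    constraints: List[ImplicitConstraint] = field(default_factory=list)


def violated(scenario: Scenario, actions: List[str]) -> List[str]:
    """Return descriptions of the constraints broken by the agent's actions."""
    taken = set(actions)
    return [c.description for c in scenario.constraints
            if c.forbidden_action in taken]


def succeeded(scenario: Scenario, actions: List[str]) -> bool:
    """Assumed pass criterion: the agent violated no implicit constraint."""
    return not violated(scenario, actions)


# Invented example: the request looks trivial, but one file must survive.
scenario = Scenario(
    user_request="Clear out my downloads folder.",
    constraints=[
        ImplicitConstraint(
            description="Keep the unsubmitted tax return draft",
            discoverable_via="list_files",
            forbidden_action="delete:tax_return_draft.pdf",
        )
    ],
)

careless = ["delete:installer.dmg", "delete:tax_return_draft.pdf"]
careful = ["list_files", "delete:installer.dmg"]
```

In this sketch the careless trajectory follows the literal request and fails, while the careful one explores first and leaves the protected file alone. A real harness would presumably also check that the explicit request was fulfilled; that part is omitted here.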
Taxonomy
Topics: Multimodal Machine Learning Applications · Speech and dialogue systems · AI in Service Interactions
