YRC-Bench: A Benchmark for Learning to Coordinate with Experts
Mohamad H. Danesh, Nguyen X. Khanh, Tu Trinh, Benjamin Plaut

TL;DR
YRC-Bench introduces a benchmark for training AI agents to recognize when to seek expert help in new environments without prior expert interaction, promoting safer and more reliable autonomous decision-making.
Contribution
The paper presents YRC-Bench, an open-source benchmark for the novel YRC-0 problem, enabling research on unsupervised learning to coordinate with experts in diverse environments.
Findings
Proposed a validation strategy for YRC-0
Developed a proposer-validator diagnostic framework
Provided baseline implementations and evaluation pipeline
Abstract
When deployed in the real world, AI agents will inevitably face challenges that exceed their individual capabilities. A critical component of AI safety is an agent's ability to recognize when it is likely to fail in a novel situation and to yield control to a more capable expert system. Leveraging such expert assistance can significantly improve safety and performance in such situations. Since expert assistance is costly, a central challenge is determining when to consult an expert. In this paper, we explore a novel variant of this problem, termed YRC-0, in which an agent must learn to collaborate with an expert in new environments in an unsupervised manner, that is, without interacting with the expert during training. This setting motivates the development of low-cost, robust approaches for training expert-leveraging agents. To support research in this area, we introduce YRC-Bench, an…
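To make the "when to consult an expert" decision concrete, here is a minimal sketch of one simple coordination rule: yield to the expert whenever the novice's top action probability falls below a threshold. This is an illustrative assumption, not the method or API of YRC-Bench; the names `softmax`, `should_yield`, `coordinate`, and the `threshold` value are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw action scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def should_yield(novice_logits: Sequence[float], threshold: float = 0.7) -> bool:
    """Hypothetical rule: yield when the novice's top probability is below threshold."""
    return max(softmax(novice_logits)) < threshold


def coordinate(
    novice_logits: Sequence[float],
    expert_action: int,
    threshold: float = 0.7,
) -> Tuple[str, int]:
    """Return who acts at this step and which action is taken.

    The expert's action is passed in directly here; in a real setting,
    obtaining it would incur the query cost the abstract refers to.
    """
    if should_yield(novice_logits, threshold):
        return "expert", expert_action
    probs = softmax(novice_logits)
    return "novice", probs.index(max(probs))


# A confident novice keeps control; an uncertain one yields.
confident = coordinate([5.0, 0.0, 0.0], expert_action=2)
uncertain = coordinate([0.1, 0.0, 0.05], expert_action=2)
```

Because the YRC-0 setting gives no access to the expert during training, a rule of this kind would have to be tuned without expert feedback; how the threshold is chosen is left open in this sketch.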
Taxonomy
Topics: Complex Systems and Decision Making · Biomedical and Engineering Education · Big Data and Business Intelligence
