The Yes-Man Syndrome: Benchmarking Abstention in Embodied Robotic Agents

Doguhan Yeke; Elif Su Temirel; Ananth Shreekumar; Brandon Lee; Dongyan Xu; Z Berkay Celik

arXiv:2605.20544·cs.RO·May 21, 2026

TL;DR

This paper introduces RoboAbstention, a framework that benchmarks abstention in embodied robotic agents by generating abstention instructions grounded in images from robotics datasets. The benchmark reveals significant weaknesses in current vision-language models.

Contribution

The paper presents a novel taxonomy of abstention in embodied robotics and a dataset for evaluating it, along with methods to improve abstention performance.

Findings

01. Every evaluated model is weak at abstention; the best reaches only 39%.

02. Interventions such as prompting and in-context learning significantly improve abstention rates.

03. No current approach fully solves the abstention challenge in embodied robotic agents.
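
The findings above are stated in terms of abstention rates. As a rough illustration of how such a rate could be tallied per category, here is a minimal sketch. It is not the paper's evaluation code: the function name `abstention_rates`, the category labels (loosely modeled on the situations the abstract lists), and the toy records are all assumptions made up for this example.

```python
from collections import defaultdict


def abstention_rates(records):
    """Return the fraction of abstentions per category.

    `records` is an iterable of (category, abstained) pairs, where
    `abstained` is True if the model declined to produce an action plan.
    Every instruction here is assumed to be one where abstaining is correct.
    """
    totals = defaultdict(int)
    abstained = defaultdict(int)
    for category, did_abstain in records:
        totals[category] += 1
        abstained[category] += int(did_abstain)
    return {category: abstained[category] / totals[category] for category in totals}


# Toy data, invented for illustration only (hypothetical category labels).
records = [
    ("ambiguous", True),
    ("ambiguous", False),
    ("infeasible", False),
    ("infeasible", False),
    ("false_premise", True),
    ("false_premise", True),
]

rates = abstention_rates(records)
overall = sum(did_abstain for _, did_abstain in records) / len(records)
print(rates)
print(overall)
```

Reporting the rate per category alongside the overall figure shows whether a model abstains uniformly or only for certain kinds of unresolvable instructions.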

Abstract

Vision-language models (VLMs) are used as high-level planners for embodied agents, translating natural language instructions and visual observations into action plans. While prior work has studied abstention in LLMs, existing benchmarks are largely text-only and do not capture the perceptual grounding and physical constraints inherent to embodied robotics environments. In such settings, abstention requires recognizing when instructions are ambiguous, physically infeasible, based on false premises, or otherwise unresolvable given the available sensory modalities and context. To address this gap, we introduce a taxonomy to categorize abstention in the context of embodied robotics and present RoboAbstention, a scalable and auditable framework for generating abstention instructions grounded in images gathered from five robotics datasets. RoboAbstention instantiates the taxonomy through a…

Code & Models

Repositories

https://purseclab.github.io/RoboAbstention
github