BeSafe-Bench: Unveiling Behavioral Safety Risks of Situated Agents in Functional Environments
Yuxuan Li, Yi Lin, Peng Wang, Shiming Liu, Xuetao Wei

TL;DR
BeSafe-Bench is a comprehensive safety benchmark for situated agents across multiple domains, revealing significant safety risks and performance-safety trade-offs in current models.
Contribution
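The paper introduces BeSafe-Bench (BSB), a benchmark that evaluates behavioral safety risks of situated agents in functional environments across four domains (Web, Mobile, Embodied VLM, and Embodied VLA), using a hybrid assessment framework that combines rule-based checks with LLM-as-a-judge reasoning.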
Findings
Even the top-performing agents complete fewer than 40% of tasks safely.
High task performance often correlates with safety violations.
The benchmark exposes urgent safety concerns in current agent systems.
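To make the gap between task performance and safe task performance concrete, here is a minimal sketch of how the two rates might be computed from per-episode results. The `EpisodeResult` record, its field names, and the definition of "safe success" (task completed with no safety violation) are assumptions for illustration, not the paper's actual metric definitions.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class EpisodeResult:
    # Hypothetical per-task record; field names are illustrative only.
    task_completed: bool
    safety_violated: bool


def summarize(results: List[EpisodeResult]) -> Dict[str, float]:
    """Return plain success rate and safe success rate over all episodes."""
    n = len(results)
    if n == 0:
        raise ValueError("no results to summarize")
    completed = sum(r.task_completed for r in results)
    # Assumed definition: a task counts as safely completed only if it
    # finished AND triggered no safety violation.
    safe_completed = sum(
        r.task_completed and not r.safety_violated for r in results
    )
    return {
        "success_rate": completed / n,
        "safe_success_rate": safe_completed / n,
    }


results = [
    EpisodeResult(task_completed=True, safety_violated=False),
    EpisodeResult(task_completed=True, safety_violated=True),
    EpisodeResult(task_completed=False, safety_violated=False),
    EpisodeResult(task_completed=True, safety_violated=True),
]
summary = summarize(results)
```

In this toy data the agent completes three of four tasks but only one of them safely, which is the kind of divergence the findings describe.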
Abstract
The rapid evolution of Large Multimodal Models (LMMs) has enabled agents to perform complex digital and physical tasks, yet their deployment as autonomous decision-makers introduces substantial unintentional behavioral safety risks. However, the absence of a comprehensive safety benchmark remains a major bottleneck, as existing evaluations rely on low-fidelity environments, simulated APIs, or narrowly scoped tasks. To address this gap, we present BeSafe-Bench (BSB), a benchmark for exposing behavioral safety risks of situated agents in functional environments, covering four representative domains: Web, Mobile, Embodied VLM, and Embodied VLA. Using functional environments, we construct a diverse instruction space by augmenting tasks with nine categories of safety-critical risks, and adopt a hybrid evaluation framework that combines rule-based checks with LLM-as-a-judge reasoning to…
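A hybrid evaluation of the kind the abstract mentions, rule-based checks combined with an LLM-as-a-judge, could be sketched roughly as follows. This is only an illustration under stated assumptions: the rule, the action-string format, the stub judge, and the "unsafe if either component flags it" combination are all hypothetical, and the abstract does not specify how BeSafe-Bench actually combines the two.

```python
from typing import Callable, List

# Hypothetical types: a trajectory is a list of action strings.
Rule = Callable[[List[str]], bool]        # True if the rule is violated
Judge = Callable[[str, List[str]], bool]  # True if judged unsafe


def deletes_data(actions: List[str]) -> bool:
    """Illustrative rule: flag any action that starts with 'delete'."""
    return any(a.startswith("delete") for a in actions)


def stub_judge(instruction: str, actions: List[str]) -> bool:
    """Stand-in for an LLM judge; a real one would prompt a model.

    Here it only flags trajectories that share something publicly.
    """
    return any("share_public" in a for a in actions)


def hybrid_is_unsafe(
    instruction: str,
    actions: List[str],
    rules: List[Rule],
    judge: Judge,
) -> bool:
    # Assumed combination: deterministic rules run first; the judge is
    # consulted only when no rule has already flagged the trajectory.
    if any(rule(actions) for rule in rules):
        return True
    return judge(instruction, actions)


rules = [deletes_data]
flag_rule = hybrid_is_unsafe("clean inbox", ["delete all_mail"], rules, stub_judge)
flag_judge = hybrid_is_unsafe("post photo", ["share_public photo.jpg"], rules, stub_judge)
flag_safe = hybrid_is_unsafe("read news", ["open news_app"], rules, stub_judge)
```

Running the cheap deterministic rules first and falling back to the judge is one plausible design; other combinations (for example, requiring agreement between the two) are equally conceivable.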
