Taxonomy and Consistency Analysis of Safety Benchmarks for AI Agents

Miles Q. Li; Benjamin C. M. Fung; Boyang Li; Heba Ismail; Farkhund Iqbal

arXiv:2605.16282 · cs.CY · May 19, 2026


TL;DR

This paper systematically analyzes 40 safety benchmarks for AI agents, revealing methodological inconsistencies, coverage gaps, and the impact of evaluation choices on safety conclusions.

Contribution

It introduces a six-axis taxonomy for benchmark evaluation, catalogs existing benchmarks, and proposes standards to improve consistency and coverage in AI safety assessment.

Findings

01. Benchmark choice can lead to contradictory safety conclusions.
02. Coverage counts often overstate evaluation depth.
03. Environment fidelity influences safety reporting.

Abstract

The rapid deployment of LLM-based autonomous agents has introduced safety risks that extend far beyond traditional LLM concerns, prompting a proliferation of safety benchmarks since late 2023. However, these benchmarks have developed independently, with inconsistent threat models, incompatible metrics, and overlapping yet incomplete risk coverage. We present the first systematic analysis dedicated to agent safety benchmarks as evaluation instruments. We catalog 40 behavioral agent-safety benchmarks (2023-2026), plus 5 adjacent evaluator, defense, and dataset artifacts, propose a six-axis taxonomy of benchmark evaluation methodology, and apply it across the corpus to characterize how methodological choices shape safety conclusions. A coverage matrix reveals broad risk coverage but limited methodological convergence, while the taxonomy analysis shows a behavioral-benchmark core…
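To make the idea of a coverage matrix concrete, here is a minimal Python sketch. It is not the authors' method: the benchmark names, risk categories, and case counts are all invented for illustration, and the helper names `coverage_matrix` and `coverage_summary` are hypothetical. It shows one way a per-risk benchmark count could look uniform while the number of underlying test cases varies widely, which is the kind of gap the finding about coverage counts overstating depth points at.

```python
# Hypothetical toy data: benchmark -> {risk category: number of test cases}.
# All names and counts are invented for illustration, not taken from the paper.
CASES = {
    "bench_a": {"prompt_injection": 120, "data_exfiltration": 3},
    "bench_b": {"prompt_injection": 45, "unsafe_tool_use": 60},
    "bench_c": {"data_exfiltration": 2, "unsafe_tool_use": 5},
}


def coverage_matrix(cases):
    """Return (risks, matrix); matrix[bench][risk] is True if the benchmark
    has at least one test case for that risk category."""
    risks = sorted({risk for per_bench in cases.values() for risk in per_bench})
    matrix = {
        bench: {risk: per_bench.get(risk, 0) > 0 for risk in risks}
        for bench, per_bench in cases.items()
    }
    return risks, matrix


def coverage_summary(cases):
    """Per risk category, return (benchmarks covering it, total test cases)."""
    risks, matrix = coverage_matrix(cases)
    summary = {}
    for risk in risks:
        n_benchmarks = sum(matrix[bench][risk] for bench in cases)
        n_cases = sum(per_bench.get(risk, 0) for per_bench in cases.values())
        summary[risk] = (n_benchmarks, n_cases)
    return summary


if __name__ == "__main__":
    for risk, (n_benchmarks, n_cases) in coverage_summary(CASES).items():
        print(f"{risk}: covered by {n_benchmarks} benchmarks, {n_cases} cases")
```

In this toy data every risk category is covered by two benchmarks, yet the total case counts range from 5 to 165, so the boolean matrix alone would hide the difference in depth.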
