Beyond Benchmark Islands: Toward Representative Trustworthiness Evaluation for Agentic AI

Jinhu Qi; Yifan Li; Minghao Zhao; Wentao Zhang; Zijian Zhang; Yaoman Li; Irwin King

arXiv:2603.14987 · cs.CL · May 22, 2026

TL;DR

This paper introduces a framework for evaluating the trustworthiness of agentic AI across socio-technical scenarios. It addresses the fragmentation of current evaluation practice and gives trustworthiness properties an operational definition.

Contribution

It defines a five-property trustworthiness profile and proposes the HAAF framework for scenario-based assessment and intervention, enabling generalizable improvements across diverse AI systems.

Findings

01. All 13 tested systems improved on the trustworthiness profile.
02. Two systems achieved a perfect risk-weighted profile.
03. The framework generalizes interventions without per-model tuning.

Abstract

Agentic AI systems increasingly act through tool-augmented, multi-step workflows whose failures (unsafe tool use, unauthorised actions, social harm) carry deployment-level consequences. Evaluation practice remains fragmented across isolated benchmark slices, and "trustworthiness" is frequently invoked but rarely defined operationally. We argue the central limitation is twofold: (i) the absence of a measurable specification of what agent trustworthiness means, and (ii) the lack of a principled notion of representativeness allowing assessment over a socio-technical scenario distribution rather than disconnected benchmark instances. We address (i) by defining agentic trustworthiness as a five-property profile (Reliability, Robustness, Safety, Social-Ethical Alignment, Operational Integrity) grounded in current AI risk frameworks, and (ii) with the Holographic Agent Assessment Framework…
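
As a rough illustration of what a five-property profile assessed over a scenario distribution could look like, the sketch below averages per-scenario scores with scenario weights. Only the five property names come from the abstract. The `ScenarioResult` type, the `weighted_profile` function, the [0, 1] score scale, and the weights are all hypothetical; the excerpt does not state how the paper scores or weights scenarios.

```python
from dataclasses import dataclass

# Property names taken from the abstract; the identifiers are illustrative.
PROPERTIES = (
    "reliability",
    "robustness",
    "safety",
    "social_ethical_alignment",
    "operational_integrity",
)


@dataclass(frozen=True)
class ScenarioResult:
    """Hypothetical record of one evaluated scenario."""
    weight: float  # assumed share of the scenario distribution (or a risk weight)
    scores: dict   # property name -> score, assumed to lie in [0, 1]


def weighted_profile(results):
    """Return the weight-averaged score for each of the five properties."""
    total = sum(r.weight for r in results)
    if total <= 0:
        raise ValueError("scenario weights must sum to a positive value")
    return {
        p: sum(r.weight * r.scores[p] for r in results) / total
        for p in PROPERTIES
    }


if __name__ == "__main__":
    # Two made-up scenarios: a common benign one and a rarer failing one.
    benign = ScenarioResult(3.0, {p: 1.0 for p in PROPERTIES})
    failing = ScenarioResult(1.0, {p: 0.0 for p in PROPERTIES})
    print(weighted_profile([benign, failing]))
```

Reporting one value per property, rather than a single scalar, keeps a weakness in one property (for example Safety) visible even when the others score well.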

Code & Models

Repositories

TonyQJH/haaf-pilot (GitHub)

Taxonomy

Topics: Ethics and Social Impacts of AI · Adversarial Robustness in Machine Learning · Artificial Intelligence in Healthcare and Education