SafePro: Evaluating the Safety of Professional-Level AI Agents

Kaiwen Zhou; Shreedhar Jangam; Ashwin Nagarajan; Tejas Polu; Suhas Oruganti; Chengzhi Liu; Ching-Chen Kuo; Yuting Zheng; Sravana Narayanaraju; Xin Eric Wang

arXiv:2601.06663·cs.AI·January 14, 2026

TL;DR

SafePro introduces a comprehensive benchmark for evaluating the safety of AI agents performing complex professional tasks, revealing significant safety vulnerabilities and the need for improved safety mechanisms in advanced AI systems.

Contribution

The paper presents SafePro, a novel benchmark dataset for assessing safety in professional AI agents, addressing a gap in existing safety evaluations for complex, real-world tasks.

Findings

01 State-of-the-art models show safety vulnerabilities in professional tasks.

02 Models exhibit insufficient safety judgment and alignment.

03 Safety mitigation strategies improve agent safety.

Abstract

Large language model-based agents are rapidly evolving from simple conversational assistants into autonomous systems capable of performing complex, professional-level tasks in various domains. While these advancements promise significant productivity gains, they also introduce critical safety risks that remain under-explored. Existing safety evaluations primarily focus on simple, daily assistance tasks, failing to capture the intricate decision-making processes and potential consequences of misaligned behaviors in professional settings. To address this gap, we introduce SafePro, a comprehensive benchmark designed to evaluate the safety alignment of AI agents performing professional activities. SafePro features a dataset of high-complexity tasks across diverse professional domains with safety risks, developed through a rigorous iterative creation and review process. Our…

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Ethics and Social Impacts of AI · Artificial Intelligence in Healthcare and Education