How Adversarial Environments Mislead Agentic AI?

Zhonghao Zhan; Huichi Zhou; Zhenhao Li; Peiyuan Jing; Krinos Li; Hamed Haddadi

arXiv:2604.18874·cs.AI·April 22, 2026

How Adversarial Environments Mislead Agentic AI?

Zhonghao Zhan, Huichi Zhou, Zhenhao Li, Peiyuan Jing, Krinos Li, Hamed Haddadi

PDF

TL;DR

This paper exposes vulnerabilities in tool-integrated agents by formalizing adversarial attacks that can deceive agents through environmental manipulation, revealing a significant robustness gap.

Contribution

It introduces the AEI threat model and POTEMKIN testing harness to evaluate agent robustness against environmental deception attacks.

Findings

01

Agents show a robustness gap: resistance to one attack increases vulnerability to another.

02

Epistemic and navigational robustness are distinct capabilities.

03

Across 11,000+ runs, vulnerabilities were systematically demonstrated.

Abstract

Tool-integrated agents are deployed on the premise that external tools ground their outputs in reality. Yet this very reliance creates a critical attack surface. Current evaluations benchmark capability in benign settings, asking "can the agent use tools correctly" but never "what if the tools lie". We identify this Trust Gap: agents are evaluated for performance, not for skepticism. We formalize this vulnerability as Adversarial Environmental Injection (AEI), a threat model where adversaries compromise tool outputs to deceive agents. AEI constitutes environmental deception: constructing a "fake world" of poisoned search results and fabricated reference networks around unsuspecting agents. We operationalize this via POTEMKIN, a Model Context Protocol (MCP)-compatible harness for plug-and-play robustness testing. We identify two orthogonal attack surfaces: The Illusion (breadth attacks)…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.