Red-Teaming Agent Execution Contexts: Open-World Security Evaluation on OpenClaw

Hongwei Yao; Yiming Liu; Yiling He; Bingrun Yang

arXiv:2605.11047 · cs.CR · May 13, 2026

TL;DR

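This paper introduces DeepTrap, an automated framework that discovers contextual vulnerabilities in OpenClaw by adversarially manipulating the agent's execution context (files, memory, tools, skills, and auxiliary artifacts), exposing security risks that lie beyond explicit user prompts.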

Contribution

It formulates adversarial context manipulation as a black-box, trajectory-level optimization problem, and supports the approach with a 42-case benchmark spanning six vulnerability classes and seven operational scenarios, evaluated on nine target models.

Findings

01. Contextual compromise can cause unsafe behaviors without affecting task performance.

02. Final-response evaluation alone is insufficient for security assessment.

03. DeepTrap effectively uncovers high-risk vulnerabilities across multiple scenarios.

Abstract

Agentic language-model systems increasingly rely on mutable execution contexts, including files, memory, tools, skills, and auxiliary artifacts, creating security risks beyond explicit user prompts. This paper presents DeepTrap, an automated framework for discovering contextual vulnerabilities in OpenClaw. DeepTrap formulates adversarial context manipulation as a black-box trajectory-level optimization problem that balances risk realization, benign-task preservation, and stealth. It combines risk-conditioned evaluation, multi-objective trajectory scoring, reward-guided beam search, and reflection-based deep probing to identify high-value compromised contexts. We construct a 42-case benchmark spanning six vulnerability classes and seven operational scenarios, and evaluate nine target models using attack and utility grading scores. Results show that contextual compromise can induce…
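The abstract names multi-objective trajectory scoring and reward-guided beam search without giving their details, so the following is only a minimal sketch of how those two ideas could fit together, not DeepTrap's actual implementation. Everything in it is an assumption made for illustration: the `Scores` fields, the weighted-sum reward and its weights, the `beam_search` signature, and the toy `toy_mutate` and `toy_evaluate` functions, which stand in for real context edits and real agent runs.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Context = Tuple[str, ...]  # hypothetical: a context is a sequence of applied edits


@dataclass(frozen=True)
class Scores:
    risk: float     # how fully the unsafe behavior was realized, in [0, 1]
    utility: float  # how well the benign task was still completed, in [0, 1]
    stealth: float  # how inconspicuous the context edits are, in [0, 1]


def combined_reward(s: Scores, weights=(0.5, 0.3, 0.2)) -> float:
    # Assumed scalarization: a plain weighted sum of the three objectives.
    w_risk, w_utility, w_stealth = weights
    return w_risk * s.risk + w_utility * s.utility + w_stealth * s.stealth


def beam_search(
    seed: Context,
    mutate: Callable[[Context], List[Context]],
    evaluate: Callable[[Context], Scores],
    beam_width: int = 2,
    depth: int = 3,
) -> Tuple[float, Context]:
    # The target is treated as a black box: only `evaluate` is ever called.
    beam = [(combined_reward(evaluate(seed)), seed)]
    for _ in range(depth):
        candidates = list(beam)  # keep current survivors so reward never drops
        for _, ctx in beam:
            for child in mutate(ctx):
                candidates.append((combined_reward(evaluate(child)), child))
        # Sort by reward only; ties keep insertion order (sort is stable).
        candidates.sort(key=lambda pair: pair[0], reverse=True)
        beam = candidates[:beam_width]
    return beam[0]


def toy_mutate(ctx: Context) -> List[Context]:
    # Toy stand-in for context edits such as modifying a file or memory entry.
    return [ctx + (edit,) for edit in ("inject", "noise")]


def toy_evaluate(ctx: Context) -> Scores:
    # Toy stand-in for running the agent and grading its trajectory.
    injected = ctx.count("inject")
    return Scores(
        risk=min(1.0, injected / 2),
        utility=1.0,
        stealth=max(0.0, 1.0 - 0.1 * len(ctx)),
    )


best_reward, best_ctx = beam_search((), toy_mutate, toy_evaluate)
print(best_ctx, round(best_reward, 2))
```

In this toy setup the search stops adding edits once risk saturates, because each further edit only lowers the stealth term. A real evaluator would replace `toy_evaluate` with an actual agent run and a trajectory-level grader.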

Code & Models

Repositories

ZJUICSR/DeepTrap (GitHub)
