A Systematic Security Evaluation of OpenClaw and Its Variants
Yuhang Wang, Haichang Gao, Zhenxing Niu, Zhaoxiang Liu, Wenjing Zhang, Xiang Wang, and Shiguo Lian

TL;DR
This paper systematically evaluates the security of six OpenClaw-series AI agent frameworks under multiple backbone models. It finds substantial vulnerabilities in every evaluated agent and emphasizes the need for security measures that cover the full agent execution lifecycle.
Contribution
It introduces a benchmark of 205 test cases covering representative attack behaviors across the full agent execution lifecycle, and uses it to show how risk exposure varies across current frameworks and backbone models.
Findings
All evaluated agents have substantial security vulnerabilities.
Agentized systems are significantly riskier than their underlying models used in isolation.
Reconnaissance and discovery are the most common weaknesses.
Abstract
Tool-augmented AI agents substantially extend the practical capabilities of large language models, but they also introduce security risks that cannot be identified through model-only evaluation. In this paper, we present a systematic security assessment of six representative OpenClaw-series agent frameworks, namely OpenClaw, AutoClaw, QClaw, KimiClaw, MaxClaw, and ArkClaw, under multiple backbone models. To support this study, we construct a benchmark of 205 test cases covering representative attack behaviors across the full agent execution lifecycle, enabling unified evaluation of risk exposure at both the framework and model levels. Our results show that all evaluated agents exhibit substantial security vulnerabilities, and that agentized systems are significantly riskier than their underlying models used in isolation. In particular, reconnaissance and discovery behaviors emerge as…
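As a rough illustration of what "unified evaluation of risk exposure at both the framework and model levels" could look like in practice, the sketch below tallies per-case outcomes into an unsafe-behavior rate grouped by framework, backbone model, or lifecycle stage. It is not the authors' tooling: the `CaseResult` record, its field names, the `unsafe_rate` helper, the model and stage labels, and the toy records are all assumptions made for this example, and the numbers are not results from the paper.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CaseResult:
    """Outcome of one benchmark test case (hypothetical schema, not from the paper)."""
    framework: str  # agent framework name, e.g. "OpenClaw"
    backbone: str   # placeholder backbone-model label
    stage: str      # placeholder lifecycle-stage label
    unsafe: bool    # True if the agent carried out the risky behavior


def unsafe_rate(results, key):
    """Group results by key(result) and return the fraction flagged unsafe per group."""
    totals = defaultdict(int)
    unsafe = defaultdict(int)
    for r in results:
        k = key(r)
        totals[k] += 1
        unsafe[k] += r.unsafe  # bool counts as 0 or 1
    return {k: unsafe[k] / totals[k] for k in totals}


# Made-up records purely to exercise the function; not data from the paper.
toy = [
    CaseResult("OpenClaw", "model-A", "discovery", True),
    CaseResult("OpenClaw", "model-A", "execution", False),
    CaseResult("QClaw", "model-A", "discovery", True),
    CaseResult("QClaw", "model-B", "discovery", True),
]

by_framework = unsafe_rate(toy, key=lambda r: r.framework)
by_backbone = unsafe_rate(toy, key=lambda r: r.backbone)
by_stage = unsafe_rate(toy, key=lambda r: r.stage)
```

Using one record type with a pluggable grouping key keeps the framework-level and model-level views consistent, since both are computed from the same per-case outcomes.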
