Don't Let the Claw Grip Your Hand: A Security Analysis and Defense Framework for OpenClaw
Zhengyang Shan, Jiayun Xin, Yue Zhang, Minghui Xu

TL;DR
This paper analyzes security vulnerabilities in the OpenClaw AI agent framework and proposes a human-in-the-loop defense mechanism that significantly enhances security against malicious attacks.
Contribution
It provides a systematic security evaluation of OpenClaw and introduces a novel HITL defense layer to improve its resilience against adversarial threats.
Findings
OpenClaw has significant security vulnerabilities, with a defense rate of only 17%.
The HITL layer intercepts up to 8 severe attacks that bypass the native defenses.
The combined approach raises the overall defense rate to between 19% and 92%.
Abstract
Code agents powered by large language models can execute shell commands on behalf of users, introducing severe security vulnerabilities. This paper presents a two-phase security analysis of the OpenClaw platform. As an open-source AI agent framework that operates locally, OpenClaw can be integrated with various commercial large language models. Because its native architecture lacks built-in security constraints, it serves as an ideal subject for evaluating baseline agent vulnerabilities. First, we systematically evaluate OpenClaw's native resilience against malicious instructions. By testing 47 adversarial scenarios across six major attack categories derived from the MITRE ATLAS and ATT&CK frameworks, we demonstrate that OpenClaw exhibits significant inherent security issues. It primarily relies on the security capabilities of the backend LLM and is highly susceptible to sandbox…
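To make the idea of a human-in-the-loop layer concrete, the following is a minimal sketch of an approval gate placed in front of shell execution. It is not the authors' implementation: the names `RISKY_PATTERNS`, `is_risky`, and `hitl_gate`, as well as the specific patterns, are hypothetical and chosen only for illustration. The sketch assumes that a command is first screened by simple pattern rules and that anything flagged is held until a human approves or rejects it.

```python
import re
from typing import Callable

# Hypothetical risk patterns for illustration only; a real deployment
# would need a far more careful policy than a handful of regexes.
RISKY_PATTERNS = [
    r"\brm\s+-\w*[rf]",            # recursive or forced deletion
    r"\bcurl\b.*\|\s*(ba)?sh\b",   # piping a download straight into a shell
    r"\bsudo\b",                   # privilege escalation
    r"/etc/(passwd|shadow)",       # access to credential files
]


def is_risky(command: str) -> bool:
    """Return True if the command matches any illustrative risk pattern."""
    return any(re.search(pattern, command) for pattern in RISKY_PATTERNS)


def hitl_gate(command: str, approve: Callable[[str], bool]) -> bool:
    """Decide whether a command proposed by the agent may run.

    Commands that match no risk pattern pass through unchanged.
    Flagged commands run only if the human reviewer, represented
    here by the `approve` callback, explicitly allows them.
    """
    if not is_risky(command):
        return True
    return approve(command)
```

In an interactive session, `approve` could wrap a terminal prompt, for example `lambda cmd: input(f"Run '{cmd}'? [y/N] ").strip().lower() == "y"`. Passing the reviewer as a callback keeps the gate testable without a human present.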
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Security and Verification in Computing · Advanced Malware Detection Techniques
