Spider-Sense: Intrinsic Risk Sensing for Efficient Agent Defense with Hierarchical Adaptive Screening

Zhenxiong Yu; Zhi Yang; Zhiheng Jin; Shuhe Wang; Heng Zhang; Yanlin Fei; Lingfeng Zeng; Fangqi Lou; Shuo Zhang; Tu Hu; Jingping Liu; Rongze Chen; Xingyu Zhu; Kunyi Wang; Chaofa Yuan; Xin Guo; Zhaowei Liu; Feipeng Zhang; Jie Huang; Huacan Wang; Ronghao Chen; Liwen Zhang

arXiv:2602.05386 · cs.CR · February 9, 2026

TL;DR

Spider-Sense introduces an intrinsic, event-driven security framework for autonomous agents that selectively triggers defenses based on risk perception, improving efficiency and effectiveness over traditional mandatory checks.

Contribution

The paper proposes an Intrinsic Risk Sensing (IRS) framework with a hierarchical defense mechanism, and introduces S²Bench, a benchmark for evaluating agent security against multi-stage attacks.

Findings

01 Achieves the lowest Attack Success Rate (ASR) and False Positive Rate (FPR)

02 Adds only 8.3% latency overhead

03 Outperforms existing defense mechanisms in experiments
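The three quantities named in the findings can be sketched as simple ratios. This page does not give the paper's exact definitions, so the functions below assume the conventional ones: ASR as the share of attack attempts that succeed, FPR as the share of benign requests wrongly flagged, and latency overhead as the relative increase over an undefended baseline. All function names and the toy inputs in the usage note are hypothetical.

```python
from typing import Sequence


def attack_success_rate(attack_succeeded: Sequence[bool]) -> float:
    """Share of attack attempts that succeeded (assumed conventional definition)."""
    if not attack_succeeded:
        raise ValueError("need at least one attack attempt")
    return sum(attack_succeeded) / len(attack_succeeded)


def false_positive_rate(benign_flagged: Sequence[bool]) -> float:
    """Share of benign requests that the defense wrongly flagged."""
    if not benign_flagged:
        raise ValueError("need at least one benign request")
    return sum(benign_flagged) / len(benign_flagged)


def latency_overhead(baseline_seconds: float, defended_seconds: float) -> float:
    """Relative extra latency of the defended agent over the undefended baseline."""
    if baseline_seconds <= 0:
        raise ValueError("baseline latency must be positive")
    return (defended_seconds - baseline_seconds) / baseline_seconds
```

With made-up inputs, `attack_success_rate([True, False, False, False])` gives 0.25 and `latency_overhead(2.0, 2.5)` gives 0.25, that is, a 25% overhead. Lower is better for all three quantities.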

Abstract

As large language models (LLMs) evolve into autonomous agents, their real-world applicability has expanded significantly, accompanied by new security challenges. Most existing agent defense mechanisms adopt a mandatory checking paradigm, in which security validation is forcibly triggered at predefined stages of the agent lifecycle. In this work, we argue that effective agent security should be intrinsic and selective rather than architecturally decoupled and mandatory. We propose Spider-Sense, an event-driven defense framework based on Intrinsic Risk Sensing (IRS), which allows agents to maintain latent vigilance and trigger defenses only upon risk perception. Once triggered, Spider-Sense invokes a hierarchical defense mechanism that trades off efficiency and precision: it resolves known patterns via lightweight similarity matching while escalating ambiguous cases to deep…
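The two-tier screening described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes string similarity from Python's `difflib` as a stand-in for whatever matching the paper uses, a hypothetical `match_threshold`, and a caller-supplied `deep_check` callable in place of the escalation stage, which the truncated abstract does not specify.

```python
from difflib import SequenceMatcher
from typing import Callable, Iterable, Tuple


def similarity(a: str, b: str) -> float:
    """Cheap string similarity in [0, 1]; a stand-in for the paper's matcher."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def screen(
    event: str,
    known_attacks: Iterable[str],
    deep_check: Callable[[str], bool],
    match_threshold: float = 0.85,  # hypothetical value, not from the paper
) -> Tuple[str, str]:
    """Screen an event that the agent has already sensed as risky.

    Returns (verdict, stage), where stage records which tier decided.
    """
    # Tier 1: lightweight similarity matching against known attack patterns.
    best = max((similarity(event, p) for p in known_attacks), default=0.0)
    if best >= match_threshold:
        return ("block", "similarity")
    # Tier 2: anything not confidently matched is escalated to the costlier check.
    verdict = "block" if deep_check(event) else "allow"
    return (verdict, "deep")
```

In this sketch `screen` is called only after the agent itself has perceived risk, which mirrors the event-driven idea: no call means no screening cost. Events close to a known pattern are resolved without ever invoking `deep_check`, so the expensive stage runs only on the remainder.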

Datasets

aifinlab/S2Bench
dataset · 59 dl

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Security and Verification in Computing