AegisAgent: An Autonomous Defense Agent Against Prompt Injection Attacks in LLM-HARs
Yihan Wang, Huanqi Yang, Shantanu Pal, Weitao Xu

TL;DR
This paper presents AegisAgent, an autonomous defense system for LLM-based human activity recognition (HAR) that actively detects and mitigates prompt injection attacks to improve security and trustworthiness.
Contribution
It introduces AegisAgent, a novel autonomous agent that actively perceives, reasons, and acts to defend against prompt injection attacks in LLM-driven HAR systems.
Findings
Reduces attack success rate by 30% on average
Incurs only 78.6 ms latency overhead on GPU
Effective against 15 common prompt injection attacks
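For readers unfamiliar with the metric, attack success rate (ASR) is commonly computed as the fraction of attack attempts that achieve the attacker's goal. The sketch below illustrates that common definition only. The function name and the outcome lists are hypothetical, the numbers are toy values rather than results from the paper, and this summary does not state whether the reported 30% is an absolute (percentage-point) or a relative reduction, so the sketch computes both.

```python
from typing import Sequence


def attack_success_rate(outcomes: Sequence[bool]) -> float:
    """Fraction of attack attempts that succeeded (True = attack succeeded)."""
    if not outcomes:
        raise ValueError("need at least one attack attempt")
    return sum(outcomes) / len(outcomes)


# Toy, made-up outcomes for illustration only; NOT results from the paper.
undefended = [True, True, True, False, True, True, False, True, True, True]
defended = [True, False, True, False, True, False, False, True, True, False]

asr_before = attack_success_rate(undefended)  # 8 of 10 attempts succeed
asr_after = attack_success_rate(defended)     # 5 of 10 attempts succeed

# Two different ways a "reduction" can be reported:
absolute_reduction = asr_before - asr_after           # in ASR points
relative_reduction = absolute_reduction / asr_before  # as a share of the baseline

print(f"ASR before: {asr_before:.1%}, after: {asr_after:.1%}")
print(f"absolute reduction: {absolute_reduction:.1%}")
print(f"relative reduction: {relative_reduction:.1%}")
```

With these toy values the two readings differ noticeably, which is why the basis of a reported reduction is worth checking in the full paper.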
Abstract
The integration of Large Language Models (LLMs) into wearable sensing is creating a new class of mobile applications capable of nuanced human activity understanding. However, the reliability of these systems is critically undermined by their vulnerability to prompt injection attacks, where attackers deliberately input deceptive instructions into LLMs. Traditional defenses, based on static filters and rigid rules, are insufficient to address the semantic complexity of these new attacks. We argue that a paradigm shift is needed -- from passive filtering to active protection and autonomous reasoning. We introduce AegisAgent, an autonomous agent system designed to ensure the security of LLM-driven HAR systems. Instead of merely blocking threats, AegisAgent functions as a cognitive guardian. It autonomously perceives potential semantic inconsistencies, reasons about the user's true intent by…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Malware Detection Techniques · Security and Verification in Computing
