NAAMSE: Framework for Evolutionary Security Evaluation of Agents
Kunal Pai, Parth Shah, Harshil Patel

TL;DR
NAAMSE introduces an evolutionary framework for assessing AI agent security, using adaptive mutation and behavioral scoring to uncover vulnerabilities more effectively than static methods.
Contribution
NAAMSE presents a feedback-driven evolutionary approach to dynamic security evaluation of AI agents, improving vulnerability detection over traditional static benchmarks.
Findings
Evolutionary mutation uncovers vulnerabilities missed by one-shot methods.
The adaptive approach reveals high-severity failure modes.
The framework provides scalable, realistic robustness assessment.
Abstract
AI agents are increasingly deployed in production, yet their security evaluations remain bottlenecked by manual red-teaming or static benchmarks that fail to model adaptive, multi-turn adversaries. We propose NAAMSE, an evolutionary framework that reframes agent security evaluation as a feedback-driven optimization problem. Our system employs a single autonomous agent that orchestrates a lifecycle of genetic prompt mutation, hierarchical corpus exploration, and asymmetric behavioral scoring. By using model responses as a fitness signal, the framework iteratively compounds effective attack strategies while simultaneously ensuring "benign-use correctness", preventing the degenerate security of blanket refusal. Our experiments across a diverse suite of state-of-the-art large language models demonstrate that evolutionary mutation systematically amplifies vulnerabilities missed by one-shot…
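The feedback loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: every name here (`asymmetric_score`, `toy_mutate`, `toy_fitness`, `evolve`), the score weights, and the abstract placeholder labels are assumptions made for illustration. Candidates are tuples of placeholder labels rather than real prompts, and the fitness function is a stand-in for scoring an actual agent response.

```python
import random

LABELS = ["A", "B", "C", "D", "E"]  # abstract placeholder mutation labels (hypothetical)


def asymmetric_score(is_adversarial, complied):
    """Hypothetical asymmetric scoring with illustrative weights.

    Complying with an adversarial input counts as a full failure (1.0);
    refusing a benign input counts as a smaller failure (0.5), so blanket
    refusal is not scored as perfectly secure.
    """
    if is_adversarial:
        return 1.0 if complied else 0.0
    return 0.0 if complied else 0.5


def toy_mutate(candidate, rng):
    """Return a child that extends the parent with one random label."""
    return candidate + (rng.choice(LABELS),)


def toy_fitness(candidate):
    """Stand-in for scoring an agent response: counts which of two
    arbitrarily chosen labels the candidate contains."""
    return len({"C", "E"} & set(candidate))


def evolve(seeds, mutate, fitness, generations=5, population=8, keep=3, seed=0):
    """Elitist evolutionary loop: mutate, score, keep the top candidates."""
    rng = random.Random(seed)
    pool = list(seeds)
    for _ in range(generations):
        children = [mutate(rng.choice(pool), rng) for _ in range(population)]
        # Deduplicate while preserving order so ranking is deterministic.
        unique = list(dict.fromkeys(pool + children))
        # Parents compete with children, so the best score never decreases.
        pool = sorted(unique, key=fitness, reverse=True)[:keep]
    return pool


best = evolve([("A",)], toy_mutate, toy_fitness)
```

Because parents are ranked alongside their children, the best fitness in the pool cannot drop between generations, which is one simple way to "compound" effective candidates. In a real evaluation, the fitness function would be derived from agent responses, for example via a score like `asymmetric_score`, which is shown separately here to keep the loop minimal.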
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Malware Detection Techniques · Artificial Intelligence in Games
