Biosecurity-Aware AI: Agentic Risk Auditing of Soft Prompt Attacks on ESM-Based Variant Predictors
Huixin Zhan

TL;DR
This paper introduces SAGE, an agentic framework for auditing the adversarial vulnerabilities of genomic foundation models like ESM2, revealing their sensitivity to soft prompt attacks and emphasizing the need for security in biomedical applications.
Contribution
The paper presents SAGE, an automated risk auditing framework that evaluates the robustness of genomic foundation models (GFMs) against soft prompt attacks without modifying the underlying models.
Findings
GFMs like ESM2 are vulnerable to targeted soft prompt attacks
Soft prompt attacks cause significant performance degradation in GFMs
Agentic risk auditing can identify hidden vulnerabilities in biomedical models
Abstract
Genomic Foundation Models (GFMs), such as Evolutionary Scale Modeling (ESM), have demonstrated remarkable success in variant effect prediction. However, their security and robustness under adversarial manipulation remain largely unexplored. To address this gap, we introduce the Secure Agentic Genomic Evaluator (SAGE), an agentic framework for auditing the adversarial vulnerabilities of GFMs. SAGE functions through an interpretable and automated risk auditing loop. It injects soft prompt perturbations, monitors model behavior across training checkpoints, computes risk metrics such as AUROC and AUPR, and generates structured reports with large language model-based narrative explanations. This agentic process enables continuous evaluation of embedding-space robustness without modifying the underlying model. Using SAGE, we find that even state-of-the-art GFMs like ESM2 are sensitive to…
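The metric step of the auditing loop described above can be illustrated with a minimal sketch. It is not SAGE's implementation: the function names (`auroc`, `average_precision`, `audit`) and the toy scores are assumptions made for illustration, and the soft prompt injection itself is left out. The sketch only shows how AUROC and AUPR (estimated here as average precision) could be compared between clean and attacked predictions for the same labeled variants.

```python
from typing import Dict, Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Rank-based AUROC: fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUPR estimated as average precision over the score-sorted ranking."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank  # precision at this recall point
    if hits == 0:
        raise ValueError("average precision needs at least one positive label")
    return total / hits


def audit(
    labels: Sequence[int],
    clean_scores: Sequence[float],
    attacked_scores: Sequence[float],
) -> Dict[str, Dict[str, float]]:
    """Hypothetical audit step: report each metric before and after the attack."""
    report: Dict[str, Dict[str, float]] = {}
    for name, metric in (("auroc", auroc), ("aupr", average_precision)):
        clean = metric(labels, clean_scores)
        attacked = metric(labels, attacked_scores)
        report[name] = {"clean": clean, "attacked": attacked, "drop": clean - attacked}
    return report


# Toy data (made up for illustration): 1 = pathogenic variant, 0 = benign.
labels = [1, 1, 0, 0]
clean_scores = [0.9, 0.8, 0.2, 0.1]      # predictions without perturbation
attacked_scores = [0.4, 0.8, 0.6, 0.1]   # predictions under a hypothetical attack

report = audit(labels, clean_scores, attacked_scores)
for name, values in report.items():
    print(name, values)
```

In a real audit, the attacked scores would come from running the frozen model with the perturbed soft prompt at each checkpoint, and the per-checkpoint drops would feed the structured report.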
Taxonomy
Topics: Genomics and Rare Diseases · Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)
