TL;DR
BioVeil MATRIX is a taxonomy for categorizing the biosecurity risks of agentic biological AI scientists. The paper highlights vulnerabilities in current systems and proposes safeguards against dual-use applications in the life sciences.
Contribution
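The paper presents BioVeil MATRIX, a taxonomy of biosecurity risks posed by agentic AI scientists. It also shows that existing agentic systems such as Biomni assist with dual-use tasks that base model safeguards block, and that agentic scaffolding raises performance on WMDP biology and chemistry prompts relative to the standalone model.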
Findings
Agentic AI scientists can assist with dual-use biological tasks.
Base model safeguards do not carry over to agentic systems: Biomni and K-Dense assist with dual-use tasks that the base models block.
BioVeil MATRIX provides a structured framework for risk assessment.
Abstract
Agentic AI scientists equipped with domain-specific tools are rapidly entering scientific workflows across disciplines, with especially strong uptake in the life sciences where they can be used for literature synthesis, sequence analysis, and experimental planning support. While these systems accelerate biological research, they also introduce risks for dual-use applications that are not captured by current model-centric safety evaluations. We present evidence that current agentic AI scientists, including Biomni and K-Dense, are willing to assist with dual-use tasks that are blocked by base model safeguards. We also found that, in a paired evaluation framework for biology and chemistry prompts involving Weapons of Mass Destruction proxies (WMDP), agentic scaffolding of Biomni increased benchmark performance relative to the underlying standalone model, producing measurable capability…
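The paired comparison described in the abstract (the same prompts scored once for the standalone model and once for the agentic scaffold) can be summarized with a simple accuracy difference. The sketch below is illustrative only and is not the paper's evaluation code: the `PairedResult` type, the `paired_uplift` function, and the toy records are all hypothetical, and the toy values are placeholders, not results from the paper.

```python
from dataclasses import dataclass


@dataclass
class PairedResult:
    """One benchmark prompt scored under both conditions (hypothetical schema)."""
    prompt_id: str
    base_correct: bool    # standalone model answered correctly
    agent_correct: bool   # agentic scaffold answered correctly


def paired_uplift(results):
    """Summarize a paired evaluation: per-condition accuracy and their difference."""
    n = len(results)
    if n == 0:
        raise ValueError("no paired results to summarize")
    base_acc = sum(r.base_correct for r in results) / n
    agent_acc = sum(r.agent_correct for r in results) / n
    # Discordant pairs: prompts where exactly one condition was correct.
    gained = sum(r.agent_correct and not r.base_correct for r in results)
    lost = sum(r.base_correct and not r.agent_correct for r in results)
    return {
        "base_accuracy": base_acc,
        "agent_accuracy": agent_acc,
        "uplift": agent_acc - base_acc,
        "gained": gained,
        "lost": lost,
    }


# Toy placeholder records, NOT data from the paper.
toy = [
    PairedResult("q1", True, True),
    PairedResult("q2", False, True),
    PairedResult("q3", False, False),
    PairedResult("q4", True, True),
]
summary = paired_uplift(toy)
```

Pairing matters because both conditions see identical prompts, so the discordant counts (`gained`, `lost`) show where the difference comes from rather than only its net size.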
