Biothreat Benchmark Generation Framework for Evaluating Frontier AI Models I: The Task-Query Architecture
Gary Ackerman, Brandon Behlendorf, Zachary Kallenborn, Sheriff Almakki, Doug Clifford, Jenna LaTourette, Hayley Peterson, Noah Sheinbaum, Olivia Shoemaker, Anna Wetzel

TL;DR
This paper introduces the Biothreat Benchmark Generation (BBG) Framework, a novel approach to evaluating the biosecurity risks of AI models that focuses on bacterial threats and considers both technical and operational factors.
Contribution
It presents the first component of a comprehensive framework for assessing biosecurity risks of AI models, including a hierarchical biothreat schema and task-query architecture.
Findings
Developed the Bacterial Biothreat Schema for structured threat assessment
Created task-aligned queries for evaluating AI model risks
Captured both technical and operational threat factors in the framework
Abstract
Both model developers and policymakers seek to quantify and mitigate the risk that rapidly evolving frontier artificial intelligence (AI) models, especially large language models (LLMs), could facilitate bioterrorism or access to biological weapons. An important element of such efforts is the development of model benchmarks that can assess the biosecurity risk posed by a particular model. This paper describes the first component of a novel Biothreat Benchmark Generation (BBG) Framework. The BBG approach is designed to help model developers and evaluators reliably measure and assess the biosecurity risk uplift and general harm potential of existing and future AI models, while accounting for key aspects of the threat itself that are often overlooked in other benchmarking efforts, including different actor capability levels and operational (in addition to purely technical) risk factors. As a…
Taxonomy
Topics: Bacillus and Francisella bacterial research · Biomedical Text Mining and Ontologies · Artificial Immune Systems Applications
