Biothreat Benchmark Generation Framework for Evaluating Frontier AI Models III: Implementing the Bacterial Biothreat Benchmark (B3) Dataset
Gary Ackerman, Theodore Wilson, Zachary Kallenborn, Olivia Shoemaker, Anna Wetzel, Hayley Peterson, Abigail Danfora, Jenna LaTourette, Brandon Behlendorf, Douglas Clifford

TL;DR
This paper presents a pilot implementation of the B3 dataset, a benchmark for assessing biosecurity risks of frontier AI models, demonstrating its effectiveness in identifying risks and guiding mitigation efforts.
Contribution
It reports the pilot implementation of the B3 dataset, developed in the earlier papers of the BBG framework series, and demonstrates its practical application in evaluating the biosecurity risks of large language models.
Findings
The B3 dataset effectively assesses biosecurity risks.
Human evaluation provides nuanced insights into model responses.
The framework guides targeted mitigation strategies.
Abstract
The potential for rapidly-evolving frontier artificial intelligence (AI) models, especially large language models (LLMs), to facilitate bioterrorism or access to biological weapons has generated significant policy, academic, and public concern. Both model developers and policymakers seek to quantify and mitigate any risk, with an important element of such efforts being the development of model benchmarks that can assess the biosecurity risk posed by a particular model. This paper discusses the pilot implementation of the Bacterial Biothreat Benchmark (B3) dataset. It is the third in a series of three papers describing an overall Biothreat Benchmark Generation (BBG) framework, with previous papers detailing the development of the B3 dataset. The pilot involved running the benchmarks through a sample frontier AI model, followed by human evaluation of model responses, and an applied risk…
Taxonomy
Topics: Bacillus and Francisella bacterial research · Law, AI, and Intellectual Property · Bacteriophages and microbial interactions
