Open-weight genome language model safeguards: Assessing robustness via adversarial fine-tuning
James R. M. Black, Moritz S. Hanke, Aaron Maiwald, Tina Hernandez-Boussard, Oliver M. Crook, Jaspreet Pannu

TL;DR
This study tests whether pretraining-data filtering in genomic language models (gLMs) remains an effective safeguard once a model is fine-tuned on harmful viral sequences. It reveals potential vulnerabilities and emphasizes the need for more comprehensive safety measures.
Contribution
The paper demonstrates that fine-tuning can bypass data-filtering safeguards in genomic language models, highlighting the need for improved safety frameworks.
Findings
Fine-tuning on harmful viruses can rescue misuse capabilities of gLMs.
Filtered models still identify immune escape variants of SARS-CoV-2.
Data exclusion alone may be insufficient for model safety.
Abstract
Novel deep learning architectures are increasingly being applied to biological data, including genetic sequences. These models, referred to as genomic language models (gLMs), have demonstrated impressive predictive and generative capabilities, raising concerns that such models may also enable misuse, for instance via the generation of genomes for human-infecting viruses. These concerns have catalyzed calls for risk mitigation measures. The de facto mitigation of choice is filtering of pretraining data (i.e., removing viral genomic sequences from training datasets) in order to limit gLM performance on virus-related tasks. However, it is not currently known how robust this approach is for securing open-source models that can be fine-tuned using sensitive pathogen data. Here, we evaluate a state-of-the-art gLM, Evo 2, and perform fine-tuning using sequences from 110 harmful human-infecting…
Taxonomy
Topics: Genomics and Rare Diseases · Adversarial Robustness in Machine Learning · RNA and protein synthesis mechanisms
