Open-weight genome language model safeguards: Assessing robustness via adversarial fine-tuning

James R. M. Black; Moritz S. Hanke; Aaron Maiwald; Tina Hernandez-Boussard; Oliver M. Crook; Jaspreet Pannu

arXiv:2511.19299 · cs.LG · March 24, 2026

Open Access

TL;DR

This study tests how well pretraining-data filtering protects genomic language models against misuse. Fine-tuning on harmful viral sequences exposes vulnerabilities in that safeguard and underscores the need for more comprehensive safety measures.

Contribution

The paper demonstrates that fine-tuning can bypass data-filtering safeguards in genomic language models, highlighting the need for improved safety frameworks.

Findings

01

Fine-tuning on sequences from harmful viruses can restore misuse-relevant capabilities in filtered gLMs.

02

Filtered models can still identify immune escape variants of viruses such as SARS-CoV-2.

03

Data exclusion alone may be insufficient for model safety.

Abstract

Novel deep learning architectures are increasingly being applied to biological data, including genetic sequences. These models, referred to as genomic language models (gLMs), have demonstrated impressive predictive and generative capabilities, raising concerns that such models may also enable misuse, for instance via the generation of genomes for human-infecting viruses. These concerns have catalyzed calls for risk mitigation measures. The de facto mitigation of choice is filtering of pretraining data (i.e., removing viral genomic sequences from training datasets) in order to limit gLM performance on virus-related tasks. However, it is not currently known how robust this approach is for securing open-source models that can be fine-tuned using sensitive pathogen data. Here, we evaluate a state-of-the-art gLM, Evo 2, and perform fine-tuning using sequences from 110 harmful human-infecting…

Taxonomy

Topics: Genomics and Rare Diseases · Adversarial Robustness in Machine Learning · RNA and protein synthesis mechanisms