TL;DR
This paper tests whether genomic language models (gLMs) understand DNA function by checking their predictions on synthetic sequences with little evolutionary precedent. The results reveal limits in the models' mechanistic understanding of gene regulation.
Contribution
Introduces Nullsettes, a benchmark that assesses how well gLMs predict in silico loss-of-function (LOF) mutations in synthetic expression cassettes, and shows that current models lean heavily on evolutionary patterns.
Findings
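Most of the 12 gLMs tested fail to consistently detect strong LOF mutations in synthetic expression cassettes.
All models show a sharp drop in predictive accuracy as the likelihood they assign to the original (nonmutant) sequence decreases.
This pattern suggests that gLMs rely heavily on pattern-matching to their evolutionary prior rather than on a mechanistic understanding of gene expression.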
Abstract
Genomic language models (gLMs) hold promise for generating novel, functional DNA sequences for synthetic biology. However, realizing this potential requires models to go beyond evolutionary plausibility and understand how DNA sequence encodes gene expression and regulation. We introduce a benchmark called Nullsettes, which assesses how well models can predict in silico loss-of-function (LOF) mutations in synthetic expression cassettes with little evolutionary precedent. Testing 12 state-of-the-art gLMs, we find that most fail to consistently detect these strong LOF mutations. All models show a sharp drop in predictive accuracy as the likelihood assigned to the original (nonmutant) sequence decreases, suggesting that gLMs rely heavily on pattern-matching to their evolutionary prior rather than on any mechanistic understanding of gene expression. Our findings highlight fundamental…
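The excerpt does not spell out how a model's LOF prediction is scored. One common way to probe a sequence model is a log-likelihood ratio between the original and the mutant sequence, and the sketch below illustrates that idea only. Everything in it is an assumption for illustration: the order-2 Markov model is a toy stand-in for a gLM, and the names `train_markov`, `llr`, and `detection_rate`, the toy corpus, and the zero threshold are hypothetical rather than taken from the Nullsettes benchmark.

```python
import math
from collections import Counter

ALPHABET = "ACGT"

def train_markov(seqs, k=2, pseudo=1.0):
    """Fit an order-k Markov model with additive smoothing.

    Toy stand-in for a gLM (assumption): returns a function that
    maps a DNA string to its log-likelihood under the fitted model.
    """
    counts = {}
    for s in seqs:
        for i in range(k, len(s)):
            counts.setdefault(s[i - k:i], Counter())[s[i]] += 1

    def log_likelihood(seq):
        total = 0.0
        for i in range(k, len(seq)):
            ctx = counts.get(seq[i - k:i], Counter())
            num = ctx[seq[i]] + pseudo
            den = sum(ctx.values()) + pseudo * len(ALPHABET)
            total += math.log(num / den)
        return total

    return log_likelihood

def llr(log_likelihood, original, mutant):
    """Log-likelihood ratio; positive means the model prefers the original."""
    return log_likelihood(original) - log_likelihood(mutant)

def detection_rate(log_likelihood, pairs, threshold=0.0):
    """Fraction of (original, mutant) pairs whose LLR exceeds the threshold."""
    hits = sum(llr(log_likelihood, o, m) > threshold for o, m in pairs)
    return hits / len(pairs)

# Hypothetical toy data: the "evolutionary" corpus is rich in a TATA-like motif.
corpus = ["TATAAA" * 10]
model = train_markov(corpus)

# Same-length pair: the mutant replaces the motif with a run of C.
original = "GCTATAAAGC"
mutant = "GCCCCCCCGC"
pairs = [(original, mutant)]

print("LLR:", round(llr(model, original, mutant), 3))
print("detection rate:", detection_rate(model, pairs))
```

The pair is kept the same length on purpose: a summed log-likelihood favors shorter sequences, so deletions or insertions would need length normalization before comparison. The toy also shows the failure mode the abstract describes, since the model only "detects" the mutation because the original resembles its training corpus.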
Taxonomy
Methods: Sparse Evolutionary Training
