TL;DR
JanusDNA is a novel bidirectional DNA foundation model that combines autoregressive and masked training paradigms, enabling efficient long-range genomic modeling and achieving state-of-the-art results on multiple benchmarks.
Contribution
The paper introduces the first hybrid bidirectional DNA model, built on a novel pretraining paradigm and a scalable architecture that combines Attention and MoE layers.
Findings
Processes up to 1 million base pairs at single nucleotide resolution.
Achieves new state-of-the-art results on three genomic benchmarks.
Outperforms larger models while using 250x fewer parameters.
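To give a sense of scale for the 1 million base pair figure, the arithmetic below is a minimal sketch, not a description of JanusDNA's own architecture. It assumes one token per base pair (single-nucleotide resolution) and a naive dense attention score matrix stored at 2 bytes per entry; the function names and the 2-byte figure are illustrative assumptions.

```python
# Rough illustration only: how sequence length drives the size of a naive,
# dense attention score matrix. Assumptions (not from the paper): one token
# per base pair, one L x L score matrix, 2 bytes per score.

def tokens_at_single_nucleotide_resolution(num_base_pairs: int) -> int:
    """One token per base pair, so the token count equals the sequence length."""
    return num_base_pairs


def dense_attention_scores(seq_len: int) -> int:
    """Number of pairwise scores in a dense L x L attention matrix."""
    return seq_len * seq_len


def dense_attention_bytes(seq_len: int, bytes_per_score: int = 2) -> int:
    """Memory needed to hold one dense score matrix, in bytes."""
    return dense_attention_scores(seq_len) * bytes_per_score


if __name__ == "__main__":
    for base_pairs in (10_000, 1_000_000):
        n = tokens_at_single_nucleotide_resolution(base_pairs)
        print(
            f"{base_pairs:>9} bp -> {n:>9} tokens, "
            f"{dense_attention_scores(n):.1e} scores, "
            f"{dense_attention_bytes(n) / 1e9:.1f} GB per dense score matrix"
        )
```

Under these assumptions, a 100x longer sequence needs a 10,000x larger dense score matrix, which is why long-range genomic modeling is costly under conventional architectures.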
Abstract
Large language models (LLMs) have revolutionized natural language processing and are increasingly applied to other sequential data types, including genetic sequences. However, adapting LLMs to genomics presents significant challenges. Capturing complex genomic interactions requires modeling long-range dependencies within DNA sequences, where interactions often span more than 10,000 base pairs, even within a single gene, posing substantial computational burdens under conventional model architectures and training paradigms. Moreover, standard LLM training approaches are suboptimal for DNA: autoregressive training, while efficient, supports only unidirectional understanding, whereas DNA is inherently bidirectional; for example, bidirectional promoters regulate transcription in both directions and account for nearly 11% of human gene expression. Masked language models (MLMs) allow bidirectional…
Taxonomy
Methods: Softmax · Attention Is All You Need · Mixture of Experts · Balanced Selection · Mamba: Linear-Time Sequence Modeling with Selective State Spaces
