SPACE: Your Genomic Profile Predictor is a Powerful DNA Foundation Model
Zhao Yang, Jiwei Zhu, Bing Su

TL;DR
SPACE is a supervised genomic profile prediction model that uses a Mixture of Experts (MoE) to learn DNA representations, and it outperforms pure sequence pre-training methods across multiple tasks.
Contribution
The paper presents a novel supervised training approach with MoE for genomic profiles, enhancing DNA representation learning beyond traditional unsupervised methods.
Findings
Achieves state-of-the-art performance on various genomic tasks.
Supervised profile prediction outperforms pure sequence pre-training.
Effective multi-species and multi-profile modeling with MoE.
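To make the MoE finding above concrete, here is a minimal, generic sketch of a mixture-of-experts layer whose gate is conditioned on a species index. It is not the SPACE architecture: the class name `ToyMoE`, the additive per-species gate bias, the dense (non-sparse) routing, and all dimensions are assumptions chosen purely for illustration.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    """Multiply a (rows x cols) matrix by a length-cols vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


class ToyMoE:
    """Hypothetical dense MoE layer with a species-conditioned gate."""

    def __init__(self, d_in, d_out, n_experts, n_species, seed=0):
        rng = random.Random(seed)

        def rand_matrix(rows, cols):
            return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

        # One linear map per expert, each of shape (d_out x d_in).
        self.experts = [rand_matrix(d_out, d_in) for _ in range(n_experts)]
        # Gate weights shared across species, shape (n_experts x d_in).
        self.gate_w = rand_matrix(n_experts, d_in)
        # Assumed conditioning: a per-species bias added to the gate logits.
        self.species_bias = rand_matrix(n_species, n_experts)

    def gate(self, x, species_id):
        """Return mixing weights over experts for input x and a species index."""
        logits = matvec(self.gate_w, x)
        bias = self.species_bias[species_id]
        return softmax([l + b for l, b in zip(logits, bias)])

    def forward(self, x, species_id):
        """Weighted sum of all expert outputs (dense routing, no top-k)."""
        weights = self.gate(x, species_id)
        outputs = [matvec(expert, x) for expert in self.experts]
        d_out = len(outputs[0])
        return [sum(w * out[j] for w, out in zip(weights, outputs)) for j in range(d_out)]


layer = ToyMoE(d_in=8, d_out=4, n_experts=3, n_species=2)
x = [0.1 * i for i in range(8)]  # stand-in for a sequence embedding
y_species0 = layer.forward(x, species_id=0)
y_species1 = layer.forward(x, species_id=1)
```

The same input is mixed differently depending on the species index, which is the general mechanism by which an MoE layer can share experts across species while still specializing its routing.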
Abstract
Inspired by the success of unsupervised pre-training paradigms, researchers have applied these approaches to DNA pre-training. However, we argue that these approaches alone yield suboptimal results because pure DNA sequences lack sufficient information, since their functions are regulated by genomic profiles like chromatin accessibility. Here, we demonstrate that supervised training for genomic profile prediction serves as a more effective alternative to pure sequence pre-training. Furthermore, considering the multi-species and multi-profile nature of genomic profile prediction, we introduce our Species-Profile Adaptive Collaborative Experts (SPACE) that leverages Mixture of Experts (MoE) to better capture the relationships between DNA sequences across different species and genomic profiles, thereby learning more effective DNA…
Taxonomy
Topics: Molecular Biology Techniques and Applications · Genomics and Phylogenetic Studies
