Revisiting Convolution Architecture in the Realm of DNA Foundation Models
Yu Bo, Weian Mao, Yanjun Shao, Weiqiang Bai, Peng Ye, Xinzhu Ma, Junbo Zhao, Hao Chen, Chunhua Shen

TL;DR
This paper introduces ConvNova, a CNN-based DNA foundation model, and shows that it outperforms recent Transformer- and SSM-based methods on more than half of the tasks across several benchmarks, challenging the notion that CNNs are outdated.
Contribution
ConvNova is a CNN architecture that combines dilated convolutions, gated convolutions, and a dual-branch framework for the gating mechanism, and it surpasses recent methods on DNA foundation model benchmarks.
Findings
ConvNova outperforms recent Transformer- and SSM-based methods on more than half of the tasks across several foundation model benchmarks.
ConvNova achieves 5.8% higher accuracy in histone-related tasks.
CNNs remain competitive with Transformers and SSMs in DNA modeling.
Abstract
In recent years, a variety of methods based on Transformer and state space model (SSM) architectures have been proposed, advancing foundational DNA language models. However, these recent approaches have rarely been compared with classical convolutional neural networks (CNNs) on foundation model benchmarks. This raises the question: are CNNs truly being surpassed by recent approaches based on Transformer and SSM architectures? In this paper, we develop a simple but well-designed CNN-based method termed ConvNova. ConvNova incorporates three effective designs: 1) dilated convolutions, 2) gated convolutions, and 3) a dual-branch framework for gating mechanisms. Through extensive empirical experiments, we demonstrate that ConvNova significantly outperforms recent methods on more than half of the tasks across several foundation model benchmarks. For…
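The three designs named in the abstract can be illustrated with a small pure-Python sketch. This is not the authors' implementation: the function names, the weight layout (`[out_ch][kernel_size][in_ch]`), the sigmoid gate, and the zero padding are all assumptions chosen for readability. It shows a dilated 1D convolution over one-hot DNA, a block in which one branch produces features and a second branch produces the gate, and the standard receptive-field formula for stacked dilated layers.

```python
import math

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as one-hot vectors; unknown bases (e.g. N) become zeros."""
    return [[1.0 if b == base else 0.0 for base in BASES] for b in seq.upper()]

def dilated_conv1d(x, weights, dilation):
    """'Same'-length 1D dilated convolution with zero padding.

    x: list of L positions, each an in_ch-dim list.
    weights: nested lists shaped [out_ch][kernel_size][in_ch] (hypothetical layout).
    """
    length, k = len(x), len(weights[0])
    center = (k - 1) // 2
    out = []
    for t in range(length):
        row = []
        for w_out in weights:
            acc = 0.0
            for j, w_tap in enumerate(w_out):
                pos = t + (j - center) * dilation  # taps are spaced `dilation` apart
                if 0 <= pos < length:  # positions outside the sequence count as zero
                    acc += sum(w * v for w, v in zip(w_tap, x[pos]))
            row.append(acc)
        out.append(row)
    return out

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def gated_dual_branch(x, w_feat, w_gate, dilation):
    """Assumed gating scheme: feature branch times sigmoid of a separate gate branch."""
    feat = dilated_conv1d(x, w_feat, dilation)
    gate = dilated_conv1d(x, w_gate, dilation)
    return [[f * sigmoid(g) for f, g in zip(f_row, g_row)]
            for f_row, g_row in zip(feat, gate)]

def receptive_field(kernel_size, dilations):
    """Receptive field of stacked stride-1 dilated convolutions."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)

if __name__ == "__main__":
    x = one_hot("ACGTAC")
    # Feature branch: identity on the center tap; gate branch: all zeros (gate = 0.5).
    w_feat = [[[0.0] * 4, [1.0 if i == o else 0.0 for i in range(4)], [0.0] * 4]
              for o in range(4)]
    w_gate = [[[0.0] * 4 for _ in range(3)] for _ in range(4)]
    y = gated_dual_branch(x, w_feat, w_gate, dilation=2)
    print(y[0])
    print(receptive_field(3, [1, 2, 4]))
```

With kernel size 3 and dilations 1, 2, 4, three layers already cover 15 positions, which is why dilation is a common way to widen context on long sequences without adding parameters.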
Taxonomy
Topics: DNA and Biological Computing · Generative Adversarial Networks and Image Synthesis · Machine Learning in Materials Science
