Wisteria: A Unified Multi-Scale Feature Learning Framework for DNA Language Model
Weihua Wang, Haoji Li, Feilong Bao, Lei Yang, and Guanglai Gao

TL;DR
Wisteria is a novel DNA language model that unifies multi-scale feature learning to effectively capture local motifs and global dependencies, improving genomic sequence analysis.
Contribution
The paper introduces a unified framework that combines gated dilated convolutions, Fourier-based attention, and the Mamba architecture for DNA sequence modeling.
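To make the local-feature component concrete, here is a minimal single-channel sketch of a gated dilated 1D convolution in plain Python. It is an illustration under stated assumptions, not Wisteria's implementation: the summary does not give the gating form, kernel sizes, or dilation schedule, so the sigmoid gate, the zero padding, and every function name below are hypothetical.

```python
import math


def dilated_conv1d(x, kernel, dilation):
    """Same-length 1D cross-correlation with zero padding.

    Kernel taps are spaced `dilation` positions apart and centered on
    each output position, so a wider context is covered at no extra cost.
    """
    half = (len(kernel) - 1) // 2
    out = []
    for i in range(len(x)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + (j - half) * dilation
            if 0 <= idx < len(x):  # positions outside the sequence count as zero
                acc += w * x[idx]
        out.append(acc)
    return out


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gated_dilated_conv1d(x, value_kernel, gate_kernel, dilation):
    """Assumed gating form: conv(x) multiplied elementwise by sigmoid(conv_gate(x))."""
    values = dilated_conv1d(x, value_kernel, dilation)
    gates = dilated_conv1d(x, gate_kernel, dilation)
    return [v * sigmoid(g) for v, g in zip(values, gates)]


def receptive_field(kernel_size, dilations):
    """Receptive field of stacked stride-1 dilated convolutions."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)
```

With a kernel of size 3 and dilations 1, 2, 4, `receptive_field(3, [1, 2, 4])` gives 15 positions, which is why stacking dilated layers is a common way to cover motif-scale context with few parameters.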
Findings
Wisteria outperforms existing models on multiple genomic benchmarks.
The model effectively captures both local motifs and long-range dependencies.
Fourier-based attention enhances frequency domain modeling and length generalization.
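As a rough illustration of why a frequency-domain view can help with periodic structure in DNA, the sketch below applies a plain discrete Fourier transform to a per-base indicator signal and reads off the dominant period. This is not the paper's Fourier-based attention, whose formulation the summary does not describe; the helper names and the toy sequence are assumptions made for the example.

```python
import cmath


def dft(signal):
    """Naive O(n^2) discrete Fourier transform of a real-valued sequence."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def base_indicator(seq, base):
    """1.0 where `seq` has `base`, 0.0 elsewhere."""
    return [1.0 if c == base else 0.0 for c in seq]


def dominant_period(seq, base):
    """Period (in bases) of the strongest non-constant frequency component."""
    n = len(seq)
    spectrum = dft(base_indicator(seq, base))
    # Skip k = 0 (the mean) and search up to the Nyquist frequency.
    k_best = max(range(1, n // 2 + 1), key=lambda k: abs(spectrum[k]))
    return n / k_best


# Toy sequence (hypothetical): 'A' recurs every 3 bases.
print(dominant_period("ACG" * 8, "A"))
```

A pattern that repeats every few bases shows up as a single sharp peak in the spectrum regardless of sequence length, which is one intuition for pairing frequency-domain features with length generalization.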
Abstract
DNA language models aim to decipher the regulatory grammar and semantics of genomes by capturing long-range dependencies in DNA sequences. Existing methods emphasize long-range token interactions but often ignore the interplay between local motifs and global dependencies. In this paper, we propose Wisteria, a genomic language model that integrates multi-scale feature learning within a unified framework for DNA sequences. Specifically, Wisteria augments the Mamba-based architecture with gated dilated convolutions to capture local motifs and regulatory patterns, while gated multilayer perceptrons refine global dependencies. We further introduce a Fourier-based attention mechanism to support frequency-domain modeling, periodic extension, and length generalization. Across four experimental settings with both short- and long-range dependencies, Wisteria demonstrates strong performance on…
