Bio-xLSTM: Generative modeling, representation and in-context learning of biological and chemical sequences

Niklas Schmidinger; Lisa Schneckenreiter; Philipp Seidl; Johannes Schimunek; Pieter-Jan Hoedt; Johannes Brandstetter; Andreas Mayr; Sohvi Luukkonen; Sepp Hochreiter; Günter Klambauer

arXiv:2411.04165 · q-bio.BM · November 8, 2024 · 5 cites

Open Access 3 Repos

TL;DR

Bio-xLSTM introduces a recurrent architecture tailored to biological and chemical sequences, enabling efficient long-range dependency modeling, sequence generation, and in-context learning across the genomic, protein, and chemical domains.

Contribution

The paper adapts the xLSTM architecture to biological and chemical data, demonstrating its effectiveness relative to Transformers in sequence modeling, generation, and in-context learning tasks.

Findings

01. Bio-xLSTM models effectively generate DNA, protein, and chemical sequences.

02. Models learn rich, meaningful representations of biological and chemical data.

03. Bio-xLSTM enables in-context learning for proteins and small molecules.

Abstract

Language models for biological and chemical sequences enable crucial applications such as drug discovery, protein engineering, and precision medicine. Currently, these language models are predominantly based on Transformer architectures. While Transformers have yielded impressive results, their quadratic runtime dependency on the sequence length complicates their use for long genomic sequences and in-context learning on proteins and chemical sequences. Recently, the recurrent xLSTM architecture has been shown to perform favorably compared to Transformers and modern state-space model (SSM) architectures in the natural language domain. Similar to SSMs, xLSTMs have a linear runtime dependency on the sequence length and allow for constant-memory decoding at inference time, which makes them prime candidates for modeling long-range dependencies in biological and chemical sequences. In this…
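The scaling argument in the abstract (quadratic cost for Transformers, linear cost and constant-memory decoding for recurrent models) can be illustrated with a small counting sketch. This is a toy under stated assumptions, not the xLSTM cell or the authors' code: the function names, the `state_size` parameter, and the exponential-decay update are all hypothetical stand-ins chosen only to show how the work and memory grow with sequence length.

```python
def attention_decode_cost(seq_len):
    """Count score computations and final cache size for causal attention decoding."""
    cache = []          # one entry per past token: grows with the sequence
    score_ops = 0
    for t in range(seq_len):
        cache.append(t)             # store a placeholder key/value for the new token
        score_ops += len(cache)     # the new token attends to every cached token
    return score_ops, len(cache)


def recurrent_decode_cost(seq_len, state_size=4):
    """Count state updates and final state size for a toy gated recurrence."""
    state = [0.0] * state_size      # fixed-size state, independent of seq_len
    updates = 0
    for t in range(seq_len):
        x = float(t % 3)            # stand-in for a token embedding value
        # Toy exponential-decay update (assumption); NOT the actual xLSTM cell.
        state = [0.9 * s + 0.1 * x for s in state]
        updates += 1
    return updates, len(state)


for n in (1_000, 2_000):
    # Each result is (operation count, memory entries held at the end).
    print(n, attention_decode_cost(n), recurrent_decode_cost(n))
```

Doubling the sequence length roughly quadruples the attention score count and doubles its cache, while the recurrent sketch doubles its update count and keeps the same four-element state. That difference is why the abstract singles out long genomic sequences and long in-context prompts as cases where recurrent models are attractive.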

Taxonomy

Topics: Machine Learning in Bioinformatics · Genomics and Phylogenetic Studies · Gene expression and cancer classification

Methods: Attention Is All You Need · Adam · Linear Layer · Absolute Position Encodings · Multi-Head Attention · Residual Connection · Softmax · Byte Pair Encoding · Dropout · Dense Connections