To Transformers and Beyond: Large Language Models for the Genome
Micaela E. Consens, Cameron Dufault, Michael Wainberg, Duncan Forster, Mehran Karimzadeh, Hani Goodarzi, Fabian J. Theis, Alan Moses, Bo Wang

TL;DR
This review discusses the impact of Large Language Models, especially transformers, on genomics, highlighting their strengths, limitations, and future directions in genomic data analysis.
Contribution
It provides a comprehensive overview of how transformer-based LLMs are transforming genomic data analysis and explores future research directions beyond transformers.
Findings
Transformers have shown significant potential in genomic data modeling.
Limitations of current LLMs in genomics include data complexity and interpretability.
Future models may go beyond transformers to improve genomic analysis.
Abstract
In the rapidly evolving landscape of genomics, deep learning has emerged as a useful tool for tackling complex computational challenges. This review focuses on the transformative role of Large Language Models (LLMs), which are mostly based on the transformer architecture, in genomics. Building on the foundation of traditional convolutional neural networks and recurrent neural networks, we explore both the strengths and limitations of transformers and other LLMs for genomics. Additionally, we contemplate the future of genomic modeling beyond the transformer architecture based on current trends in research. The paper aims to serve as a guide for computational biologists and computer scientists interested in LLMs for genomic data. We hope the paper can also serve as an educational introduction and discussion for biologists to a fundamental shift in how we will be analyzing genomic data in…
Taxonomy
Topics: Genomics and Phylogenetic Studies · Topic Modeling · RNA modifications and cancer
