Large Language Models for Bioinformatics
Wei Ruan, Yanjun Lyu, Jing Zhang, Jiazhang Cai, Peng Shu, Yang Ge, Yao Lu, Shang Gao, Yue Wang, Peilong Wang, Lin Zhao, Tao Wang, Yufang Liu, Luyang Fang, Ziyu Liu, Zhengliang Liu, Yiwei Li, Zihao Wu, Junhao Chen, Hanqi Jiang, Yi Pan, Zhenyuan Yang, Jingyuan Chen

TL;DR
This survey reviews the development, applications, and challenges of bioinformatics-specific large language models (BioLMs), emphasizing their transformative potential in disease diagnosis, drug discovery, and vaccine development.
Contribution
It provides a comprehensive analysis of BioLMs' evolution, classification, training, applications, challenges, and future directions in bioinformatics.
Findings
BioLMs are increasingly used in disease diagnosis, drug discovery, and vaccine development.
Key challenges include data privacy, interpretability, biases, and domain adaptation.
Emerging trends point to more sophisticated biological and clinical applications.
Abstract
With the rapid advancements in large language model (LLM) technology and the emergence of bioinformatics-specific language models (BioLMs), there is a growing need for a comprehensive analysis of the current landscape, computational characteristics, and diverse applications. This survey aims to address this need by providing a thorough review of BioLMs, focusing on their evolution, classification, and distinguishing features, alongside a detailed examination of training methodologies, datasets, and evaluation frameworks. We explore the wide-ranging applications of BioLMs in critical areas such as disease diagnosis, drug discovery, and vaccine development, highlighting their impact and transformative potential in bioinformatics. We identify key challenges and limitations inherent in BioLMs, including data privacy and security concerns, interpretability issues, biases in training data and…
