BioMedGPT: Open Multimodal Generative Pre-trained Transformer for BioMedicine
Yizhen Luo, Jiahuan Zhang, Siqi Fan, Kai Yang, Yushuai Wu, Mu Qiao, Zaiqing Nie

TL;DR
BioMedGPT is an open multimodal generative transformer that bridges biological modalities with natural language, enabling communication and improving biomedical question answering to accelerate drug discovery.
Contribution
Introduction of BioMedGPT, the first open multimodal GPT for biomedicine, aligning biological data with natural language and demonstrating superior performance on biomedical QA tasks.
Findings
BioMedGPT-10B outperforms or matches human performance on biomedical QA.
BioMedGPT-LM-7B is the first Llama2-based biomedical generative model.
Open-sourced models and curated datasets facilitate biomedical multimodal research.
Abstract
Foundation models (FMs) have exhibited remarkable performance across a wide range of downstream tasks in many domains. Nevertheless, general-purpose FMs often face challenges when confronted with domain-specific problems, due to their limited access to the proprietary training data in a particular domain. In biomedicine, there are various biological modalities, such as molecules, proteins, and cells, which are encoded by the language of life and exhibit significant modality gaps with human natural language. In this paper, we introduce BioMedGPT, an open multimodal generative pre-trained transformer (GPT) for biomedicine, to bridge the gap between the language of life and human natural language. BioMedGPT allows users to easily "communicate" with diverse biological modalities through free text, which is the first of its kind. BioMedGPT aligns different biological modalities with…
Taxonomy
Topics: Machine Learning in Bioinformatics · Topic Modeling · Genomics and Phylogenetic Studies
