Pre-trained Language Models in Biomedical Domain: A Systematic Survey

Benyou Wang; Qianqian Xie; Jiahuan Pei; Zhihong Chen; Prayag Tiwari; Zhao Li; Jie Fu

arXiv:2110.05006·cs.CL·July 18, 2023·35 cites

TL;DR

This survey comprehensively reviews recent advances in biomedical pre-trained language models and their applications, and discusses standardization, limitations, and future directions to foster cross-disciplinary collaboration.

Contribution

It provides a systematic overview of biomedical PLMs, proposes a taxonomy, and discusses applications, limitations, and future trends in the field.

Findings

01

Summarizes recent progress of biomedical PLMs

02

Proposes a taxonomy of biomedical PLMs

03

Discusses limitations and future research directions

Abstract

Pre-trained language models (PLMs) have become the de facto paradigm for most natural language processing (NLP) tasks. The biomedical domain benefits as well: researchers from the informatics, medicine, and computer science (CS) communities have proposed various PLMs trained on biomedical data (e.g., biomedical text, electronic health records, and protein and DNA sequences) for various biomedical tasks. However, the cross-disciplinary nature of biomedical PLMs hinders their spread across communities; some existing works are isolated from one another, without comprehensive comparison or discussion. This calls for a survey that not only systematically reviews recent advances in biomedical PLMs and their applications but also standardizes terminology and benchmarks. In this paper, we summarize the recent progress of pre-trained language models in the biomedical domain and their applications in…

Code & Models

Repositories

facebookresearch/esm
PyTorch · Official

Datasets

BAAI/SurveyScope
dataset · 6 dl

Taxonomy

Topics: Topic Modeling · Biomedical Text Mining and Ontologies · Artificial Intelligence in Healthcare and Education