Pre-trained Language Models in Biomedical Domain: A Systematic Survey
Benyou Wang, Qianqian Xie, Jiahuan Pei, Zhihong Chen, Prayag Tiwari,, Zhao Li, and Jie fu

TL;DR
This survey comprehensively reviews recent advances in biomedical pre-trained language models, their applications, and discusses standardization, limitations, and future directions to foster cross-disciplinary collaboration.
Contribution
It provides a systematic overview of biomedical PLMs, proposes a taxonomy, and discusses applications, limitations, and future trends in the field.
Findings
Summarizes recent progress of biomedical PLMs
Proposes a taxonomy of biomedical PLMs
Discusses limitations and future research directions
Abstract
Pre-trained language models (PLMs) have been the de facto paradigm for most natural language processing (NLP) tasks. This also benefits biomedical domain: researchers from informatics, medicine, and computer science (CS) communities propose various PLMs trained on biomedical datasets, e.g., biomedical text, electronic health records, protein, and DNA sequences for various biomedical tasks. However, the cross-discipline characteristics of biomedical PLMs hinder their spreading among communities; some existing works are isolated from each other without comprehensive comparison and discussions. It expects a survey that not only systematically reviews recent advances of biomedical PLMs and their applications but also standardizes terminology and benchmarks. In this paper, we summarize the recent progress of pre-trained language models in the biomedical domain and their applications in…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsTopic Modeling · Biomedical Text Mining and Ontologies · Artificial Intelligence in Healthcare and Education
