TL;DR
BioGPT is a domain-specific generative Transformer language model pre-trained on large-scale biomedical literature. It outperforms previous models on most of the six biomedical NLP tasks evaluated and adds the text generation ability that BERT-style biomedical models such as BioBERT and PubMedBERT lack.
Contribution
We introduce BioGPT, a biomedical generative language model that outperforms previous models on most of the six biomedical NLP tasks evaluated and demonstrates strong text generation capabilities.
Findings
Achieved 44.98% F1 on BC5CDR relation extraction
Achieved 38.42% F1 on KD-DTI relation extraction
Achieved 78.2% accuracy on PubMedQA
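For readers unfamiliar with how an F1 score on relation extraction is typically read, the following is a minimal sketch, assuming exact-match scoring over (head, relation, tail) triples. The function name `triple_f1` and the toy triples are hypothetical illustrations, and the paper's own evaluation script may normalize or match triples differently.

```python
def triple_f1(predicted, gold):
    """Return (precision, recall, F1) over exact-match relation triples.

    Assumption: a predicted triple counts as correct only if it is
    identical to a gold triple. This is an illustrative convention,
    not necessarily the one used in the BioGPT evaluation.
    """
    pred_set, gold_set = set(predicted), set(gold)
    true_pos = len(pred_set & gold_set)

    precision = true_pos / len(pred_set) if pred_set else 0.0
    recall = true_pos / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0

    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical toy data, not taken from BC5CDR or KD-DTI.
gold = [
    ("aspirin", "treats", "headache"),
    ("ibuprofen", "treats", "fever"),
]
pred = [
    ("aspirin", "treats", "headache"),
    ("aspirin", "treats", "fever"),
]

precision, recall, f1 = triple_f1(pred, gold)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Under this convention, a generative model is penalized both for triples it fails to produce (lower recall) and for spurious triples it generates (lower precision).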
Abstract
Pre-trained language models have attracted increasing attention in the biomedical domain, inspired by their great success in the general natural language domain. Among the two main branches of pre-trained language models in the general language domain, i.e., BERT (and its variants) and GPT (and its variants), the first one has been extensively studied in the biomedical domain, such as BioBERT and PubMedBERT. While they have achieved great success on a variety of discriminative downstream biomedical tasks, the lack of generation ability constrains their application scope. In this paper, we propose BioGPT, a domain-specific generative Transformer language model pre-trained on large scale biomedical literature. We evaluate BioGPT on six biomedical NLP tasks and demonstrate that our model outperforms previous models on most tasks. Especially, we get 44.98%, 38.42% and 40.76% F1 score on…
Taxonomy
Methods: Multi-Head Attention · Attention Is All You Need · Cosine Annealing · Linear Warmup With Cosine Annealing · Softmax · Adam · Weight Decay · Attention Dropout · Label Smoothing
