GPT-3 Models are Poor Few-Shot Learners in the Biomedical Domain

Milad Moradi; Kathrin Blagec; Florian Haberl; Matthias Samwald

arXiv:2109.02555 · cs.CL · June 2, 2022 · 1 cites

TL;DR

This paper evaluates the few-shot learning capabilities of GPT-3 and BioBERT on biomedical NLP tasks. Both models largely underperform a language model fine-tuned on the full training data. In-domain pretraining helps but is not sufficient on its own, so new pretraining and few-shot learning strategies are needed.

Contribution

The study compares GPT-3 and BioBERT in biomedical few-shot tasks, highlighting the need for novel pretraining and learning strategies in this domain.

Findings

01 GPT-3 underperforms compared to BioBERT in biomedical tasks.
02 BioBERT benefits from in-domain pretraining.
03 In-domain pretraining alone is insufficient for optimal few-shot learning.

Abstract

Deep neural language models have set new breakthroughs in many tasks of Natural Language Processing (NLP). Recent work has shown that deep transformer language models (pretrained on large amounts of texts) can achieve high levels of task-specific few-shot performance comparable to state-of-the-art models. However, the ability of these large language models in few-shot transfer learning has not yet been explored in the biomedical domain. We investigated the performance of two powerful transformer language models, i.e. GPT-3 and BioBERT, in few-shot settings on various biomedical NLP tasks. The experimental results showed that, to a great extent, both the models underperform a language model fine-tuned on the full training data. Although GPT-3 had already achieved near state-of-the-art results in few-shot knowledge transfer on open-domain NLP tasks, it could not perform as effectively as…

Code & Models

Repositories

mmoradi-iut/BioGPT-3 (Official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Layer Normalization · Residual Connection · Adam · Weight Decay · Dropout · Softmax