Thinking about GPT-3 In-Context Learning for Biomedical IE? Think Again

Bernal Jiménez Gutiérrez; Nikolas McNeal; Clay Washington; You Chen; Lang Li; Huan Sun; Yu Su

arXiv:2203.08410·cs.CL·November 8, 2022

TL;DR

This study systematically compares GPT-3's few-shot in-context learning with fine-tuning smaller (BERT-sized) pre-trained language models on biomedical named entity recognition and relation extraction. It finds that GPT-3 underperforms the fine-tuned models and gains less from additional training data.

Contribution

First comprehensive comparison of GPT-3 in-context learning versus fine-tuning smaller models for biomedical information extraction tasks.

Findings

01

GPT-3 underperforms fine-tuned smaller PLMs.

02

In-context learning yields smaller accuracy gains than fine-tuning as more training data becomes available.

03

In-depth analysis reveals limitations of GPT-3's in-context learning for IE.

Abstract

The strong few-shot in-context learning capability of large pre-trained language models (PLMs) such as GPT-3 is highly appealing for application domains such as biomedicine, which feature high and diverse demands of language technologies but also high data annotation costs. In this paper, we present the first systematic and comprehensive study to compare the few-shot performance of GPT-3 in-context learning with fine-tuning smaller (i.e., BERT-sized) PLMs on two highly representative biomedical information extraction tasks, named entity recognition and relation extraction. We follow the true few-shot setting to avoid overestimating models' few-shot performance by model selection over a large validation set. We also optimize GPT-3's performance with known techniques such as contextual calibration and dynamic in-context example retrieval. However, our results show that GPT-3 still…
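The abstract names contextual calibration as one of the known techniques used to optimize GPT-3. As a minimal sketch of that general idea (not the authors' implementation), the snippet below rescales a model's label probabilities by the probabilities it assigns to a content-free input, then renormalizes. The function name `calibrate` and the example numbers are hypothetical and chosen only for illustration.

```python
from typing import List


def calibrate(probs: List[float], content_free_probs: List[float]) -> List[float]:
    """Divide each label probability by the probability the model gives that
    label for a content-free input (e.g. "N/A"), then renormalize to sum to 1.

    Assumption: all content-free probabilities are strictly positive, as they
    would be if they came from a softmax over the label set.
    """
    if len(probs) != len(content_free_probs):
        raise ValueError("both inputs must cover the same label set")
    if any(cf <= 0 for cf in content_free_probs):
        raise ValueError("content-free probabilities must be positive")

    # A label the model favors even without real input gets scaled down.
    scaled = [p / cf for p, cf in zip(probs, content_free_probs)]
    total = sum(scaled)
    return [s / total for s in scaled]


# Hypothetical numbers: the model prefers label 0 even on a content-free input.
content_free = [0.7, 0.3]
raw = [0.6, 0.4]
calibrated = calibrate(raw, content_free)
print(calibrated)
```

With these illustrative numbers, the raw prediction favors label 0, but after dividing out the content-free bias, label 1 receives the higher probability.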

Code & Models

Repositories

dki-lab/few-shot-bioie
PyTorch · Official

Taxonomy

Methods

Attention Is All You Need · Linear Layer · Cosine Annealing · Adam · Linear Warmup With Cosine Annealing · Byte Pair Encoding · Softmax · Multi-Head Attention