Large Language Models are Few-Shot Clinical Information Extractors

Monica Agrawal; Stefan Hegselmann; Hunter Lang; Yoon Kim; David Sontag

arXiv:2205.12689 · cs.CL · December 1, 2022 · 48 cites

TL;DR

This paper demonstrates that large language models like InstructGPT can effectively perform zero- and few-shot clinical information extraction tasks, even without domain-specific training, and introduces new datasets for benchmarking these tasks.

Contribution

It shows how large language models can be applied to structured clinical NLP tasks and provides new datasets for evaluation in this domain.

Findings

01

GPT-3 outperforms existing zero- and few-shot baselines.

02

Models effectively handle span identification, sequence classification, and relation extraction.

03

New datasets enable benchmarking of clinical information extraction.

Abstract

A long-running goal of the clinical NLP community is the extraction of important variables trapped in clinical notes. However, roadblocks have included dataset shift from the general domain and a lack of public clinical corpora and annotations. In this work, we show that large language models, such as InstructGPT, perform well at zero- and few-shot information extraction from clinical text despite not being trained specifically for the clinical domain. Whereas text classification and generation performance have already been studied extensively in such models, here we additionally demonstrate how to leverage them to tackle a diverse set of NLP tasks which require more structured outputs, including span identification, token-level sequence classification, and relation extraction. Further, due to the dearth of available data to evaluate these systems, we introduce new datasets for…
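
As a rough illustration of how free-text model output can be turned into the structured output that span identification needs, the Python sketch below builds a few-shot prompt and maps a completion back to character offsets in the note. The prompt template, the function names `build_prompt` and `resolve_spans`, the semicolon-separated output format, and the example note are all assumptions made for illustration and are not taken from the paper. No model is called; the completion is a hand-written stand-in.

```python
import re
from typing import List, Tuple


def build_prompt(examples: List[Tuple[str, List[str]]], note: str) -> str:
    """Assemble a few-shot prompt (hypothetical template).

    Each demonstration pairs a note with the items to extract; the final
    block leaves the answer open for the model to complete.
    """
    parts = []
    for text, items in examples:
        answer = "; ".join(items) if items else "none"
        parts.append(f"Note: {text}\nMedications: {answer}")
    parts.append(f"Note: {note}\nMedications:")
    return "\n\n".join(parts)


def resolve_spans(note: str, completion: str) -> List[Tuple[int, int, str]]:
    """Map a semicolon-separated completion back to character spans.

    Items that do not appear in the note (case-insensitive match) are
    dropped, so unsupported output never becomes a span.
    """
    spans = []
    for item in completion.split(";"):
        item = item.strip()
        if not item or item.lower() == "none":
            continue
        match = re.search(re.escape(item), note, flags=re.IGNORECASE)
        if match:
            spans.append((match.start(), match.end(), match.group(0)))
    return spans


if __name__ == "__main__":
    # Made-up demonstration and note, not drawn from any dataset.
    demos = [("Takes aspirin 81 mg daily.", ["aspirin"])]
    note = "Patient started on metformin and continues Lisinopril daily."
    print(build_prompt(demos, note))
    # Stand-in for a model completion; "warfarin" is absent from the note.
    print(resolve_spans(note, " Metformin; lisinopril; warfarin"))
```

Dropping items that cannot be located in the note is one simple design choice; a real system might instead use fuzzy matching or flag such items for review.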

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Biomedical Text Mining and Ontologies

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Cosine Annealing · Adam · Attention Dropout · Linear Warmup With Cosine Annealing · Residual Connection