Document-level Clinical Entity and Relation Extraction via Knowledge Base-Guided Generation
Kriti Bhattarai, Inez Y. Oh, Zachary B. Abrams, Albert M. Lai

TL;DR
This paper combines the UMLS knowledge base with GPT models to improve document-level clinical entity and relation extraction, and reports better results than standard Retrieval Augmented Generation (RAG).
Contribution
The work presents a knowledge-guided generation framework that leverages UMLS concepts to improve clinical information extraction at the document level.
Findings
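Including mapped UMLS concepts in the prompt improves extraction results over few-shot prompting of generic language models that do not use UMLS.
The approach is more effective than the standard RAG technique.
Mapping the text to UMLS concepts first helps GPT models identify medical concepts more accurately.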
Abstract
Generative pre-trained transformer (GPT) models have shown promise in clinical entity and relation extraction tasks because of their precise extraction and contextual understanding capability. In this work, we further leverage the Unified Medical Language System (UMLS) knowledge base to accurately identify medical concepts and improve clinical entity and relation extraction at the document level. Our framework selects UMLS concepts relevant to the text and combines them with prompts to guide language models in extracting entities. Our experiments demonstrate that this initial concept mapping and the inclusion of these mapped concepts in the prompts improve extraction results compared to few-shot extraction tasks on generic language models that do not leverage UMLS. Further, our results show that this approach is more effective than the standard Retrieval Augmented Generation (RAG)…
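The two steps the abstract names, selecting concepts relevant to the text and combining them with a prompt, can be illustrated with a minimal sketch. This is not the authors' implementation: the lookup table `TOY_KB`, the `Concept` fields, the placeholder IDs, the string-matching strategy, and the prompt wording are all assumptions made for illustration, and no real UMLS data or language model API is used.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    concept_id: str      # placeholder ID, not a real UMLS identifier
    name: str            # preferred name of the concept
    semantic_type: str   # coarse category, e.g. "Drug" or "Disease"


# Hypothetical stand-in for a knowledge base lookup (surface form -> concept).
TOY_KB = {
    "metformin": Concept("TOY-0001", "Metformin", "Drug"),
    "type 2 diabetes": Concept("TOY-0002", "Type 2 diabetes mellitus", "Disease"),
    "nausea": Concept("TOY-0003", "Nausea", "Symptom"),
}


def select_concepts(text, kb):
    """Return concepts whose surface form occurs in the text, ordered by first mention."""
    hits = []
    for surface, concept in kb.items():
        # Case-insensitive whole-phrase match; a simplification of concept mapping.
        match = re.search(r"\b" + re.escape(surface) + r"\b", text, flags=re.IGNORECASE)
        if match:
            hits.append((match.start(), concept))
    return [concept for _, concept in sorted(hits, key=lambda h: h[0])]


def build_prompt(text, concepts):
    """Combine the mapped concepts with an extraction instruction and the document."""
    concept_lines = "\n".join(
        f"- {c.name} [{c.semantic_type}] ({c.concept_id})" for c in concepts
    )
    return (
        "Extract clinical entities and the relations between them from the document.\n"
        "Concepts mapped from the knowledge base:\n"
        f"{concept_lines}\n\n"
        f"Document:\n{text}"
    )


note = "Patient with type 2 diabetes was started on metformin and denies nausea."
concepts = select_concepts(note, TOY_KB)
prompt = build_prompt(note, concepts)
```

The resulting `prompt` string would then be sent to a language model of choice. Keeping concept selection separate from prompt construction makes it possible to swap the toy lookup for a real concept mapper without touching the prompt template.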
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Biomedical Text Mining and Ontologies
Methods: Attention Is All You Need · Byte Pair Encoding · Cosine Annealing · Layer Normalization · Linear Layer · Attention Dropout · Linear Warmup With Linear Decay · Adam · Dropout
