INSIGHTBUDDY-AI: Medication Extraction and Entity Linking using Large Language Models and Ensemble Learning
Pablo Romero, Lifeng Han, Goran Nenadic

TL;DR
This paper evaluates large language models for extracting medication information from clinical texts, improves their performance with ensemble learning, and links the extracted entities to standard medical coding systems. The resulting tools are publicly available.
Contribution
It introduces ensemble learning methods to improve medication extraction accuracy using LLMs and develops an entity linking system to standard medical codes, advancing healthcare NLP applications.
Findings
Ensemble models outperform individual LLMs in medication extraction.
The system effectively links entities to SNOMED-CT, BNF, dm+d, and ICD codes.
Public toolkit and applications are provided for healthcare NLP tasks.
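As a rough illustration of how a voting ensemble can combine several NER models, the sketch below takes one label sequence per model and keeps the label that a strict majority agrees on for each token. The paper's exact voting rule is not given on this page, so the strict-majority threshold, the fallback to "O", the function name majority_vote, and the label names are all assumptions made for the example.

```python
from collections import Counter


def majority_vote(predictions, fallback="O"):
    """Combine per-token labels from several models by strict majority.

    predictions: list of label sequences, one per model, all the same length.
    fallback: label used when no label wins more than half the votes
              (an assumption for this sketch, not the paper's stated rule).
    """
    if not predictions:
        return []
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all models must label the same tokens")

    voted = []
    for labels in zip(*predictions):  # labels for one token across models
        top_label, top_count = Counter(labels).most_common(1)[0]
        # Keep the top label only if more than half of the models agree.
        voted.append(top_label if top_count > len(labels) / 2 else fallback)
    return voted


# Illustrative label sequences from three hypothetical models.
model_outputs = [
    ["B-Drug", "O", "B-Dosage"],
    ["B-Drug", "O", "O"],
    ["B-Drug", "B-Route", "B-Dosage"],
]
ensemble_labels = majority_vote(model_outputs)
```

Here two of the three models agree on every token, so the ensemble keeps their label. A stacking ensemble would instead train a second model on the base models' outputs, which this sketch does not cover.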
Abstract
Medication extraction and mining play an important role in healthcare NLP research because of their practical applications in hospital settings, such as mapping extracted medications into standard clinical knowledge bases (SNOMED-CT, BNF, etc.). In this work, we investigate state-of-the-art LLMs in text mining tasks on medications and their related attributes such as dosage, route, strength, and adverse effects. In addition, we explore different ensemble learning methods (Stack-Ensemble and Voting-Ensemble) to augment the performance of individual LLMs. Our ensemble learning results demonstrated better performance than the individually fine-tuned base models BERT, RoBERTa, RoBERTa-L, BioBERT, BioClinicalBERT, BioMedRoBERTa, ClinicalBERT, and PubMedBERT across general and specific domains. Finally, we build up an entity linking function to map extracted medical terminologies into the…
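The entity linking step can be pictured, in its simplest form, as normalizing an extracted mention and looking it up in a terminology table. The sketch below is only that: a dictionary lookup, not the paper's linking system. The table TOY_KB, the helper names normalize and link, and every code value are hypothetical placeholders and do not correspond to real SNOMED-CT or BNF identifiers.

```python
import re

# Hypothetical lookup table. The code values are placeholders for
# illustration only, not real SNOMED-CT or BNF identifiers.
TOY_KB = {
    "paracetamol": {"SNOMED-CT": "SCT-PLACEHOLDER-1", "BNF": "BNF-PLACEHOLDER-1"},
    "aspirin": {"SNOMED-CT": "SCT-PLACEHOLDER-2", "BNF": "BNF-PLACEHOLDER-2"},
}


def normalize(mention):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", mention.lower())
    return " ".join(cleaned.split())


def link(mention, kb=TOY_KB):
    """Return the code entry for a mention, or None if it is not in the table."""
    return kb.get(normalize(mention))


linked = link("Paracetamol.")
missing = link("unknown drug")
```

A real linker would also need to handle synonyms, brand names, and misspellings, for example through fuzzy matching against the official terminology releases.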
Code & Models
- 🤗pabRomero/BERT-full-finetuned-ner-pablomodel · 8 dl
- 🤗pabRomero/BioBERT-full-finetuned-ner-pablomodel · 6 dl
- 🤗pabRomero/ClinicalBERT-full-finetuned-ner-pablomodel · 1 dl
- 🤗pabRomero/RoBERTa-full-finetuned-ner-pablomodel
- 🤗pabRomero/RoBERTa-Large-full-finetuned-ner-pablomodel · 1 dl
- 🤗pabRomero/BioMedRoBERTa-full-finetuned-ner-pablomodel · 10 dl
- 🤗pabRomero/PubMedBERT-full-finetuned-ner-pablomodel · 2 dl
- 🤗pabRomero/BioClinicalBERT-full-finetuned-ner-pablomodel · 10 dl
Taxonomy
Topics: Biomedical Text Mining and Ontologies · Topic Modeling
Methods: Attention Is All You Need · Linear Layer · Softmax · Attention Dropout · Multi-Head Attention · Layer Normalization · Dense Connections · Adam · WordPiece
