UMLS-KGI-BERT: Data-Centric Knowledge Integration in Transformers for Biomedical Entity Recognition
Aidan Mannion, Thierry Chevalier, Didier Schwab, Lorraine Geouriot

TL;DR
This paper introduces UMLS-KGI-BERT, a data-centric approach that enhances biomedical transformer models by integrating structured UMLS knowledge, leading to improved performance on biomedical NER tasks.
Contribution
It proposes a novel data-centric method for enriching biomedical transformers with UMLS-derived sequences, combining graph-based learning with masked-language pre-training.
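As a rough illustration of what "UMLS-derived sequences" combined with masked-language pre-training could look like, the sketch below turns a knowledge-graph triple into a token sequence and masks one of its components. It is not the authors' procedure: the toy triples, the whitespace tokenisation, the special-token strings and the function name `triple_to_masked_example` are all assumptions made for this example.

```python
import random
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (head term, relation, tail term)

# Hypothetical toy triples in the style of a terminology graph (not from the paper).
TOY_TRIPLES: List[Triple] = [
    ("aspirin", "may_treat", "headache"),
    ("insulin", "may_treat", "diabetes mellitus"),
]

# Assumed BERT-style special tokens.
CLS, SEP, MASK = "[CLS]", "[SEP]", "[MASK]"


def triple_to_masked_example(
    triple: Triple,
    rng: random.Random,
    mask_position: Optional[int] = None,
) -> Tuple[List[str], List[str]]:
    """Build a masked token sequence and its prediction target from one triple.

    One of the three components (head, relation, tail) is replaced by
    [MASK] tokens, one per whitespace-separated word, so a masked-language
    objective can be asked to recover it from the other two.
    """
    parts = [component.split() for component in triple]
    pos = rng.randrange(3) if mask_position is None else mask_position
    target = parts[pos]

    tokens = [CLS]
    for i, words in enumerate(parts):
        tokens.extend([MASK] * len(words) if i == pos else words)
        tokens.append(SEP)
    return tokens, target


if __name__ == "__main__":
    rng = random.Random(0)
    for t in TOY_TRIPLES:
        # Mask the tail term (position 2) of each toy triple.
        print(triple_to_masked_example(t, rng, mask_position=2))
```

In a real pipeline the whitespace split would be replaced by the model's own subword tokenizer, and such sequences would be mixed with ordinary masked text from a corpus.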
Findings
Improved NER performance on biomedical datasets
Effective integration of UMLS knowledge into transformer models
Enhanced downstream task results in biomedical NLP
Abstract
Pre-trained transformer language models (LMs) have in recent years become the dominant paradigm in applied NLP. These models have achieved state-of-the-art performance on tasks such as information extraction, question answering, sentiment analysis, document classification and many others. In the biomedical domain, significant progress has been made in adapting this paradigm to NLP tasks that require the integration of domain-specific knowledge as well as statistical modelling of language. In particular, research in this area has focused on the question of how best to construct LMs that take into account not only the patterns of token distribution in medical text, but also the wealth of structured information contained in terminology resources such as the UMLS. This work contributes a data-centric paradigm for enriching the language representations of biomedical transformer-encoder LMs…
Code & Models
- 🤗 a-mannion/drbert-umls-kgi · model · 5 downloads
- 🤗 a-mannion/umls-kgi-bert-trilingual · model · 3 downloads
- 🤗 a-mannion/bioroberta-es-umls-kgi · model · 4 downloads
- 🤗 a-mannion/umls-kgi-bert-fr · model · 7 downloads
- 🤗 a-mannion/umls-kgi-bert-en · model · 3 downloads
- 🤗 a-mannion/umls-kgi-bert-es · model · 6 downloads
- 🤗 a-mannion/pubmedbert-umls-kgi · model
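The checkpoints listed above can presumably be loaded with the Hugging Face `transformers` library. The following is a minimal sketch under that assumption: the identifiers are read off the list, while the helper name `load_encoder` and the choice of the generic `AutoModel` class (rather than a task-specific head) are assumptions of this example, not something the page states.

```python
from typing import List

# Identifiers as they appear in the list above.
MODEL_IDS: List[str] = [
    "a-mannion/drbert-umls-kgi",
    "a-mannion/umls-kgi-bert-trilingual",
    "a-mannion/bioroberta-es-umls-kgi",
    "a-mannion/umls-kgi-bert-fr",
    "a-mannion/umls-kgi-bert-en",
    "a-mannion/umls-kgi-bert-es",
    "a-mannion/pubmedbert-umls-kgi",
]


def load_encoder(model_id: str):
    """Download a tokenizer and a bare encoder for one checkpoint.

    Requires the `transformers` package and network access; the import is
    kept inside the function so the list itself can be used without either.
    """
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    return tokenizer, model


# Example (not executed here, since it needs network access):
# tokenizer, model = load_encoder("a-mannion/umls-kgi-bert-en")
```

For NER fine-tuning, a token-classification head would typically be placed on top of the encoder; that step is outside the scope of this sketch.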
Taxonomy
Topics: Biomedical Text Mining and Ontologies · Topic Modeling · Natural Language Processing Techniques
