OpenMed NER: Open-Source, Domain-Adapted State-of-the-Art Transformers for Biomedical NER Across 12 Public Datasets
Maziyar Panahi

TL;DR
OpenMed NER introduces open-source, domain-adapted transformer models that achieve state-of-the-art biomedical NER performance across 12 datasets with high efficiency and low computational cost.
Contribution
The paper presents a novel approach combining lightweight domain-adaptive pre-training with parameter-efficient fine-tuning, surpassing existing models on multiple biomedical NER benchmarks.
Findings
Achieves new state-of-the-art scores on 10 out of 12 datasets.
Significant improvements on gene and clinical cell line corpora.
Training completed in under 12 hours on a single GPU with low carbon footprint.
Abstract
Named-entity recognition (NER) is fundamental to extracting structured information from the >80% of healthcare data that resides in unstructured clinical notes and biomedical literature. Despite recent advances with large language models, achieving state-of-the-art performance across diverse entity types while maintaining computational efficiency remains a significant challenge. We introduce OpenMed NER, a suite of open-source, domain-adapted transformer models that combine lightweight domain-adaptive pre-training (DAPT) with parameter-efficient Low-Rank Adaptation (LoRA). Our approach performs cost-effective DAPT on a 350k-passage corpus compiled from ethically sourced, publicly available research repositories and de-identified clinical notes (PubMed, arXiv, and MIMIC-III) using DeBERTa-v3, PubMedBERT, and BioELECTRA backbones. This is followed by task-specific fine-tuning with LoRA,…
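To make the LoRA step named in the abstract more concrete, the following is a minimal sketch of the general low-rank update, W + (alpha / r) · B · A, in plain Python. It is not the paper's training code: the matrix sizes, the rank, the scaling value, and the helper names (`matmul`, `lora_effective_weight`, `trainable_fraction`) are all assumptions chosen for illustration.

```python
# Illustrative LoRA sketch in pure Python; sizes and names are hypothetical,
# not taken from the OpenMed NER paper.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha, r):
    """Return W + (alpha / r) * B @ A. W stays frozen; only A and B are trained."""
    delta = matmul(b, a)  # (d_out x r) @ (r x d_in) -> d_out x d_in
    scale = alpha / r
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def trainable_fraction(d_out, d_in, r):
    """Parameters in A and B relative to the full d_out x d_in matrix."""
    return r * (d_out + d_in) / (d_out * d_in)


# Toy example: a frozen 2x3 weight with a rank-1 adapter.
w = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
b = [[1.0], [2.0]]       # d_out x r
a = [[0.5, 0.5, 0.5]]    # r x d_in
w_eff = lora_effective_weight(w, a, b, alpha=2, r=1)
```

For a hypothetical 768 x 768 projection with rank 8, `trainable_fraction(768, 768, 8)` is 1/48, or about 2% of the full matrix. This ratio is the general reason LoRA fine-tuning is considered parameter-efficient.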
Code & Models
- 🤗 OpenMed/OpenMed-NER-OncologyDetect-SuperClinical-434M (model · 129k downloads · ♡ 8)
- 🤗 OpenMed/OpenMed-NER-PharmaDetect-SuperClinical-434M (model · 235k downloads · ♡ 23)
- 🤗 OpenMed/OpenMed-NER-PharmaDetect-SuperMedical-125M (model · 107k downloads · ♡ 4)
- 🤗 OpenMed/OpenMed-NER-OncologyDetect-TinyMed-65M (model · 91k downloads · ♡ 2)
- 🤗 OpenMed/OpenMed-NER-ChemicalDetect-PubMed-335M (model · 114k downloads · ♡ 1)
- 🤗 OpenMed/OpenMed-NER-SpeciesDetect-BioClinical-108M (model · 84k downloads)
- 🤗 OpenMed/OpenMed-NER-ChemicalDetect-BigMed-278M (model · 88k downloads)
- 🤗 OpenMed/OpenMed-NER-ChemicalDetect-EuroMed-212M (model · 98k downloads · ♡ 4)
- 🤗 OpenMed/OpenMed-NER-DNADetect-BioClinical-108M (model · 84k downloads)
- 🤗 OpenMed/OpenMed-NER-OncologyDetect-SuperClinical-141M (model · 86k downloads)
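One plausible way to try a listed checkpoint is through the Hugging Face `transformers` token-classification pipeline. The sketch below assumes the checkpoints load with the standard pipeline, which this page does not confirm. The helper names `load_ner` and `summarize`, the 0.5 score threshold, and the example sentence are hypothetical.

```python
# Hedged sketch: assumes the listed checkpoints work with the standard
# transformers token-classification pipeline (not confirmed by this page).

def load_ner(model_id="OpenMed/OpenMed-NER-PharmaDetect-SuperClinical-434M"):
    """Build an NER pipeline; the checkpoint is downloaded on first use."""
    # Imported lazily so summarize() below can be used without transformers.
    from transformers import pipeline
    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",  # merge word pieces into whole entities
    )


def summarize(entities, min_score=0.5):
    """Keep (text, label) pairs for aggregated entities above a score threshold."""
    return [
        (e["word"], e["entity_group"])
        for e in entities
        if e["score"] >= min_score
    ]


# Example usage (requires network access and the transformers package):
# ner = load_ner()
# print(summarize(ner("The patient was started on metformin.")))
```

The label names in the output depend on each checkpoint's own configuration, so the code reads them from the model's output and hard-codes none.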
Taxonomy
Topics: Machine Learning in Healthcare · Topic Modeling · Biomedical Text Mining and Ontologies
