BioALBERT: A Simple and Effective Pre-trained Language Model for Biomedical Named Entity Recognition
Usman Naseem, Matloob Khushi, Vinay Reddy, Sakthivel Rajendran, Imran Razzak, Jinman Kim

TL;DR
BioALBERT is a domain-specific language model designed for biomedical NER, leveraging inter-sentence coherence and parameter reduction to outperform existing models across multiple datasets.
Contribution
Introduction of bioALBERT, a specialized biomedical language model that improves NER performance by capturing context and reducing memory usage, addressing limitations of general models.
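The reduced memory usage mentioned above is inherited from the ALBERT architecture, which factorizes the embedding matrix and shares parameters across layers. The sketch below illustrates only the embedding factorization. The sizes (vocabulary 30,000, embedding width 128, hidden width 768) are assumed ALBERT-base-style values chosen for illustration, not figures taken from this page, and the function name is hypothetical.

```python
def embedding_params(vocab_size, hidden_size, embedding_size=None):
    """Count embedding parameters, ignoring position and segment embeddings.

    Without factorization, tokens map straight to the hidden width (V * H).
    With factorization, tokens map to a small width E and are then projected
    up to the hidden width (V * E + E * H).
    """
    if embedding_size is None:
        return vocab_size * hidden_size
    return vocab_size * embedding_size + embedding_size * hidden_size


# Assumed, illustrative sizes (not reported on this page).
VOCAB, EMB, HID = 30_000, 128, 768

full = embedding_params(VOCAB, HID)             # unfactorized: V * H
factorized = embedding_params(VOCAB, HID, EMB)  # factorized: V * E + E * H

print(f"unfactorized embedding parameters: {full:,}")
print(f"factorized embedding parameters:   {factorized:,}")
print(f"reduction factor: {full / factorized:.2f}x")
```

Under these assumed sizes the factorized table needs roughly one sixth of the parameters. Cross-layer parameter sharing, which this sketch does not model, reduces the encoder size further.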
Findings
BioALBERT outperforms state-of-the-art BioNER models on eight datasets.
Four variants of bioALBERT are released for research use.
BioALBERT effectively captures biomedical context-dependent entities.
Abstract
In recent years, with the growing amount of biomedical documents, coupled with advances in natural language processing algorithms, research on biomedical named entity recognition (BioNER) has increased exponentially. However, BioNER research is challenging because NER in the biomedical domain is: (i) often restricted by the limited amount of training data, (ii) complicated by entities that can refer to multiple types and concepts depending on their context, and (iii) heavily reliant on acronyms that are sub-domain specific. Existing BioNER approaches often neglect these issues and directly adopt state-of-the-art (SOTA) models trained on general corpora, which often yields unsatisfactory results. We propose biomedical ALBERT (A Lite Bidirectional Encoder Representations from Transformers for Biomedical Text Mining), bioALBERT, an effective domain-specific language model trained on large-scale biomedical…
Taxonomy
Topics: Topic Modeling · Biomedical Text Mining and Ontologies · Natural Language Processing Techniques
Methods: Linear Layer · Adam · Dense Connections · Layer Normalization · WordPiece · Multi-Head Attention · LAMB · Attention Is All You Need · Residual Connection
