Effective Use of Bidirectional Language Modeling for Transfer Learning in Biomedical Named Entity Recognition
Devendra Singh Sachan, Pengtao Xie, Mrinmaya Sachan, and Eric P Xing

TL;DR
This paper shows that initializing a biomedical NER model with the weights of a bidirectional language model trained on unlabeled text improves F1 score, speeds up training convergence, and reduces the amount of labeled data needed.
Contribution
It introduces a transfer learning approach that initializes a biomedical NER model with the weights of a BiLM trained on unlabeled text, improving performance without requiring extensive labeled data.
Findings
Substantial F1 score improvements on four datasets
Faster training convergence with BiLM pretraining
Reduced labeled data requirements for comparable performance
Abstract
Biomedical named entity recognition (NER) is a fundamental task in text mining of medical documents and has many applications. Deep learning based approaches to this task have been gaining increasing attention in recent years as their parameters can be learned end-to-end without the need for hand-engineered features. However, these approaches rely on high-quality labeled data, which is expensive to obtain. To address this issue, we investigate how to use unlabeled text data to improve the performance of NER models. Specifically, we train a bidirectional language model (BiLM) on unlabeled data and transfer its weights to "pretrain" an NER model with the same architecture as the BiLM, which results in a better parameter initialization of the NER model. We evaluate our approach on four benchmark datasets for biomedical NER and show that it leads to a substantial improvement in the F1…
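The weight-transfer step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the layer names, sizes, and the helper functions `init_params` and `transfer_weights` are hypothetical, and the parameters are plain Python lists rather than real network tensors. The sketch also assumes that the two models share every layer except the output layer, so the NER tagging layer keeps its fresh initialization.

```python
import copy
import random


def init_params(shapes, seed):
    """Randomly initialize a dict of named, flat weight vectors."""
    rng = random.Random(seed)
    return {
        name: [rng.uniform(-0.1, 0.1) for _ in range(size)]
        for name, size in shapes.items()
    }


def transfer_weights(pretrained, target):
    """Copy each parameter whose name and size match in both models.

    Parameters that exist only in the target (for example a task-specific
    output layer) keep their original initialization.
    """
    result = copy.deepcopy(target)
    transferred = []
    for name, values in pretrained.items():
        if name in result and len(result[name]) == len(values):
            result[name] = list(values)
            transferred.append(name)
    return result, sorted(transferred)


# Hypothetical layer names and sizes, chosen only for illustration.
bilm_shapes = {
    "char_embedding": 8,
    "word_embedding": 12,
    "forward_lstm": 16,
    "backward_lstm": 16,
    "lm_softmax": 20,   # language-model output layer (assumed not shared)
}
ner_shapes = {
    "char_embedding": 8,
    "word_embedding": 12,
    "forward_lstm": 16,
    "backward_lstm": 16,
    "tag_output": 6,    # NER tagging layer (assumed not shared)
}

# Stands in for a BiLM that has already been trained on unlabeled text.
bilm_params = init_params(bilm_shapes, seed=0)
# Freshly initialized NER model.
ner_params = init_params(ner_shapes, seed=1)

ner_init, transferred = transfer_weights(bilm_params, ner_params)
print("Transferred layers:", transferred)
```

After the transfer, the NER model would be fine-tuned on the labeled data as usual; the only change is that the shared layers start from the BiLM's weights instead of random values.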
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Biomedical Text Mining and Ontologies
