Recognising Biomedical Names: Challenges and Solutions
Xiang Dai

TL;DR
This paper addresses challenges in biomedical named entity recognition by proposing a transition-based model, data augmentation techniques, and a method for selecting suitable pre-training data to improve recognition of complex and discontinuous biomedical names.
Contribution
It introduces a transition-based NER model for discontinuous mentions, a cost-effective approach for selecting pre-training data, and data augmentation methods tailored for biomedical NER.
Findings
Transition-based model improves recognition of discontinuous mentions.
Data augmentation enhances NER performance with limited labeled data.
Selecting in-domain pre-training data boosts model accuracy.
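To make the data augmentation finding concrete, here is a minimal sketch of one common label-preserving augmentation for NER: swapping each mention for another mention of the same entity type. This summary does not say which augmentations the paper uses, so the technique, the function names (`bio_spans`, `replace_mentions`), the `Disorder` type, and the toy lexicon are all illustrative assumptions.

```python
import random
from typing import Dict, List, Tuple


def bio_spans(tags: List[str]) -> List[Tuple[int, int, str]]:
    """Return (start, end_exclusive, entity_type) for each contiguous BIO mention."""
    spans = []
    start = None
    # A sentinel "O" closes a mention that runs to the end of the sentence.
    for i, tag in enumerate(tags + ["O"]):
        if start is not None and not tag.startswith("I-"):
            spans.append((start, i, tags[start][2:]))
            start = None
        if tag.startswith("B-"):
            start = i
    return spans


def replace_mentions(
    tokens: List[str],
    tags: List[str],
    lexicon: Dict[str, List[List[str]]],
    rng: random.Random,
) -> Tuple[List[str], List[str]]:
    """Swap every mention for a random same-type mention from `lexicon`.

    `lexicon` is a hypothetical mapping from entity type to tokenised
    mentions, e.g. collected from the training set.
    """
    new_tokens: List[str] = []
    new_tags: List[str] = []
    cursor = 0
    for start, end, etype in bio_spans(tags):
        # Copy the tokens between the previous mention and this one unchanged.
        new_tokens.extend(tokens[cursor:start])
        new_tags.extend(tags[cursor:start])
        candidates = lexicon.get(etype)
        replacement = rng.choice(candidates) if candidates else tokens[start:end]
        new_tokens.extend(replacement)
        # Re-tag the replacement so tokens and labels stay aligned.
        new_tags.extend([f"B-{etype}"] + [f"I-{etype}"] * (len(replacement) - 1))
        cursor = end
    new_tokens.extend(tokens[cursor:])
    new_tags.extend(tags[cursor:])
    return new_tokens, new_tags
```

Calling `replace_mentions(tokens, tags, lexicon, random.Random(0))` on a labelled sentence returns a new sentence of possibly different length whose tags remain aligned with its tokens, which is the property any augmentation for sequence tagging has to preserve.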
Abstract
The number of biomedical documents is growing at a staggering rate. Unlocking the information trapped in these documents can enable researchers and practitioners to operate confidently in the information world. Biomedical NER, the task of recognising biomedical names, is usually employed as the first step of the NLP pipeline. Standard NER models, based on sequence tagging techniques, are good at recognising short entity mentions in the generic domain. However, there are several open challenges in applying these models to recognise biomedical names: 1) Biomedical names may have complex inner structure (discontinuity and overlapping) that cannot be recognised using standard sequence tagging techniques; 2) The training of NER models usually requires a large amount of labelled data, which is difficult to obtain in the biomedical domain; and, 3) Commonly used language representation models are…
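The first challenge in the abstract can be illustrated with a small sketch. It assumes a plain BIO scheme with one tag per token and a made-up phrase, "muscle pain and fatigue", containing two overlapping mentions: "muscle pain" and the discontinuous "muscle ... fatigue". The helper names are hypothetical and this is not the paper's model; it only shows why one tag per token cannot round-trip such mentions.

```python
from typing import List, Tuple

Mention = Tuple[int, ...]  # token indices that make up one mention


def encode_bio(n_tokens: int, mentions: List[Mention]) -> List[str]:
    """Encode mentions as untyped BIO tags; each token can hold only one tag."""
    tags = ["O"] * n_tokens
    for mention in mentions:
        for k, idx in enumerate(mention):
            # A token starts a new run unless it directly follows the
            # previous token of the same mention.
            starts_run = k == 0 or mention[k - 1] != idx - 1
            tags[idx] = "B" if starts_run else "I"
    return tags


def decode_bio(tags: List[str]) -> List[Mention]:
    """Decode BIO tags; only contiguous runs of tokens can be recovered."""
    mentions: List[Mention] = []
    current: List[int] = []
    for idx, tag in enumerate(tags):
        if tag == "B":
            if current:
                mentions.append(tuple(current))
            current = [idx]
        elif tag == "I" and current:
            current.append(idx)
        else:
            if current:
                mentions.append(tuple(current))
            current = []
    if current:
        mentions.append(tuple(current))
    return mentions


tokens = ["muscle", "pain", "and", "fatigue"]
gold = [(0, 1), (0, 3)]  # "muscle pain" and the discontinuous "muscle ... fatigue"
tags = encode_bio(len(tokens), gold)
decoded = decode_bio(tags)
print(tags)     # ['B', 'I', 'O', 'B']
print(decoded)  # [(0, 1), (3,)]
```

The decoded output recovers "muscle pain" but turns the second mention into the bare token "fatigue", because the shared token "muscle" can belong to only one tagged run. Recognising both mentions requires a richer output structure than a single tag sequence, which is the gap a transition-based model is meant to fill.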
Taxonomy
Topics: Topic Modeling · Biomedical Text Mining and Ontologies · Natural Language Processing Techniques
