TL;DR
This paper introduces a domain-specific BERT model tailored for Named Entity Recognition in lab protocols, demonstrating significant improvements over baseline models in medical-related token recognition.
Contribution
The paper presents a specialized Bio-BERT based system for NER in lab protocols, addressing vocabulary challenges in the medical domain.
Findings
Achieved an F1 score 2.21 points behind the best-performing system.
Gave substantial improvements over the baseline.
Placed as fourth runner-up in F1 score and first runner-up in recall.
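The findings are stated in terms of F1, which the abstract reports alongside recall. As a minimal sketch of how the two metrics relate, the snippet below computes precision, recall, and F1 from entity-level counts. The function name and the counts are hypothetical, chosen only for illustration; they are not results from the paper.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from true-positive,
    false-positive and false-negative entity counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts (not from the paper): a tagger that finds most
# gold entities (high recall) but also predicts many spurious ones.
p, r, f1 = precision_recall_f1(tp=80, fp=40, fn=10)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Because F1 is a harmonic mean, it is pulled toward the lower of the two values, so a system can rank higher on recall than it does on F1.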
Abstract
Supervised models trained to predict properties from representations have achieved high accuracy on a variety of tasks. For instance, the BERT family works exceptionally well on downstream tasks ranging from NER tagging to a range of other linguistic tasks. However, the vocabulary of the medical field contains many tokens used only in that domain, such as the names of diseases, devices, organisms, and medicines, which makes it difficult for a traditional BERT model to create contextualized embeddings. In this paper, we describe a system for named entity tagging based on Bio-BERT. Experimental results show that our model gives substantial improvements over the baseline; it placed as fourth runner-up in terms of F1 score and first runner-up in terms of recall, just 2.21 F1 points behind the best system.
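One commonly cited way the vocabulary problem shows up is subword fragmentation: a term missing from the tokenizer's vocabulary is split into several short pieces. The sketch below is a toy greedy longest-match splitter in the style of WordPiece. Both vocabularies are made up for illustration; neither is the real BERT vocabulary, and the domain-extended one is a hypothetical contrast, not a claim about how Bio-BERT's vocabulary is built.

```python
def wordpiece(word, vocab, unk="[UNK]"):
    """Greedy longest-match-first subword split (WordPiece style).
    Non-initial pieces carry the '##' continuation prefix."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink it.
        while end > start:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece
            if piece in vocab:
                match = piece
                break
            end -= 1
        if match is None:
            # No piece fits: the whole word maps to the unknown token.
            return [unk]
        pieces.append(match)
        start = end
    return pieces


# Toy general-purpose vocabulary (hypothetical).
general_vocab = {"the", "sample", "was", "cent", "##ri", "##fu", "##ge", "##d"}
# Hypothetical domain-extended vocabulary, for contrast only.
domain_vocab = general_vocab | {"centrifuged"}

print(wordpiece("centrifuged", general_vocab))
print(wordpiece("centrifuged", domain_vocab))
```

With the toy general vocabulary the lab term is split into five pieces, while the common word "sample" stays whole. The model then has to compose the term's meaning from fragments that carry little domain signal on their own.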
Taxonomy
Methods: Linear Layer · Dropout · Linear Warmup With Linear Decay · Adam · Attention Dropout · Layer Normalization · Dense Connections · Weight Decay · Softmax
