DeviceBERT: Applied Transfer Learning With Targeted Annotations and Vocabulary Enrichment to Identify Medical Device and Component Terminology in FDA Recall Summaries
Miriam Farrington

TL;DR
DeviceBERT builds on BioBERT with targeted annotations and vocabulary enrichment to improve recognition of medical device terminology in FDA recall summaries, and it is especially effective when training data is limited.
Contribution
The paper introduces DeviceBERT, a novel NLP pipeline that improves device entity recognition in recall summaries through targeted annotation and vocabulary enrichment, addressing limitations of existing models.
Findings
DeviceBERT outperforms existing models at identifying device entities.
It is effective when annotated training data is limited.
It improves the accuracy of medical device terminology extraction.
Abstract
FDA Medical Device recalls are critical and time-sensitive events, requiring swift identification of impacted devices to inform the public of a recall event and ensure patient safety. The OpenFDA device recall dataset contains valuable information about ongoing device recall actions, but manually extracting relevant device information from the recall action summaries is a time-consuming task. Named Entity Recognition (NER) is a task in Natural Language Processing (NLP) that involves identifying and categorizing named entities in unstructured text. Existing NER models, including domain-specific models like BioBERT, struggle to correctly identify medical device trade names, part numbers and component terms within these summaries. To address this, we propose DeviceBERT, a medical device annotation, pre-processing and enrichment pipeline, which builds on BioBERT to identify and label…
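To make the NER task described in the abstract concrete, the sketch below shows what token-level entity labels can look like for a recall-style sentence. It is an illustration only, not the DeviceBERT pipeline: it uses plain dictionary matching instead of a learned model, and the BIO tagging scheme, the `DEVICE` label, the term list, and the example sentence are all assumptions made for this example rather than details taken from the paper.

```python
import re

# Hypothetical term list for illustration; the paper's annotation
# scheme and label set are not specified in this summary.
DEVICE_TERMS = ["infusion pump", "battery pack"]


def tokenize(text):
    """Split text into word-like tokens and single punctuation marks."""
    return re.findall(r"[A-Za-z0-9-]+|[^\sA-Za-z0-9]", text)


def bio_tag(tokens, terms, label="DEVICE"):
    """Assign BIO tags by exact, case-insensitive dictionary matching.

    B-<label> marks the first token of a matched term, I-<label> marks
    the following tokens of that term, and O marks everything else.
    """
    tags = ["O"] * len(tokens)
    lowered = [t.lower() for t in tokens]
    # Match longer terms first so they win over shorter overlapping ones.
    for term in sorted(terms, key=lambda t: -len(t.split())):
        words = term.lower().split()
        n = len(words)
        for i in range(len(tokens) - n + 1):
            span_is_free = all(tag == "O" for tag in tags[i:i + n])
            if lowered[i:i + n] == words and span_is_free:
                tags[i] = f"B-{label}"
                for j in range(i + 1, i + n):
                    tags[j] = f"I-{label}"
    return tags


# Made-up sentence in the style of a recall summary.
summary = "The infusion pump may shut down when the battery pack overheats."
tokens = tokenize(summary)
tags = bio_tag(tokens, DEVICE_TERMS)

for tok, tag in zip(tokens, tags):
    print(f"{tok}\t{tag}")
```

A token classifier such as a fine-tuned BioBERT model would be trained to predict tags of this kind from context, which is what lets it label trade names and part numbers that no fixed dictionary lists.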
Taxonomy
Topics: Advanced Text Analysis Techniques · Biomedical Text Mining and Ontologies
