Extracting clinical concepts from user queries
Yue Zhao, John Handley

TL;DR
This paper adapts a clinical NER model by training it on a mixture of annotated clinical notes and synthesized user queries, improving extraction of clinical concepts from user queries, which are often ungrammatical and string together several clinical entities without separators.
Contribution
We introduce a training approach that mixes annotated clinical notes with synthesized user queries to improve clinical NER performance on user queries, along with an end-to-end framework for clinical concept extraction.
Findings
Improved NER accuracy on user queries and clinical notes.
Training on mixed data enhances model robustness.
Framework is simple and easy to implement.
Abstract
Clinical concept extraction often begins with clinical Named Entity Recognition (NER). Clinical NER models are usually trained on annotated clinical notes, so they tend to struggle with tagging clinical entities in user queries because of the structural differences between clinical notes and user queries. User queries, unlike clinical notes, are often ungrammatical and incoherent. In many cases, user queries are compounds of multiple clinical entities, with no commas or conjunctions separating them. Using a dataset that mixes annotated clinical notes with synthesized user queries, we adapt a clinical NER model based on the BiLSTM-CRF architecture for tagging clinical entities in user queries. Our contributions are the following: 1) We found that when trained on a mixture of synthesized user queries and clinical notes, the NER model performs better on both user queries and clinical notes. 2)…
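The abstract does not spell out how the user queries are synthesized, so the following is only a minimal sketch of the general idea, under stated assumptions: queries are built by concatenating entities from a labeled dictionary with no commas or conjunctions, tagged in BIO format, and then shuffled together with annotated note sentences. The function names `synthesize_query` and `build_mixed_dataset`, the entity labels, and the example data are all hypothetical and not taken from the paper.

```python
import random

def synthesize_query(entities, rng, max_entities=3):
    """Build one compound query by joining entities with no separators.

    `entities` is a list of (text, label) pairs. Returns (tokens, tags),
    with tags in BIO format. This is an assumed synthesis scheme.
    """
    k = rng.randint(1, min(max_entities, len(entities)))
    chosen = rng.sample(entities, k)
    tokens, tags = [], []
    for text, label in chosen:
        words = text.split()
        tokens.extend(words)
        # First word of each entity gets B-, the remaining words get I-.
        tags.extend([f"B-{label}"] + [f"I-{label}"] * (len(words) - 1))
    return tokens, tags

def build_mixed_dataset(note_sentences, entities, n_queries, seed=0):
    """Mix annotated note sentences with synthesized queries and shuffle."""
    rng = random.Random(seed)
    queries = [synthesize_query(entities, rng) for _ in range(n_queries)]
    mixed = list(note_sentences) + queries
    rng.shuffle(mixed)
    return mixed

# Hypothetical example data (labels are illustrative only).
entities = [
    ("chest pain", "PROBLEM"),
    ("metformin", "TREATMENT"),
    ("blood glucose test", "TEST"),
]
notes = [
    (["Patient", "denies", "chest", "pain", "."],
     ["O", "O", "B-PROBLEM", "I-PROBLEM", "O"]),
]
mixed = build_mixed_dataset(notes, entities, n_queries=4)
for tokens, tags in mixed:
    print(list(zip(tokens, tags)))
```

The resulting list of (tokens, tags) pairs could then be fed to any sequence tagger, such as a BiLSTM-CRF; the mixing ratio between notes and queries is a free parameter here and is not specified in the text above.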
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Biomedical Text Mining and Ontologies
