Extracting clinical concepts from user queries

Yue Zhao; John Handley

arXiv:1912.06262 · cs.IR · December 25, 2019

Open Access

TL;DR

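This paper presents a clinical NER model adapted to user queries by training on a mixture of annotated clinical notes and synthesized user queries. The goal is better extraction of clinical concepts from unstructured queries, which are often ungrammatical and string several clinical entities together without separators.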

Contribution

We introduce a novel training approach that mixes annotated clinical notes with synthesized user queries to improve clinical NER performance on user queries, along with an end-to-end framework for clinical concept extraction.

Findings

1) Improved NER accuracy on user queries and clinical notes.
2) Training on mixed data enhances model robustness.
3) Framework is simple and easy to implement.

Abstract

Clinical concept extraction often begins with clinical Named Entity Recognition (NER). Clinical NER models are typically trained on annotated clinical notes, so they tend to struggle with tagging clinical entities in user queries because of the structural differences between clinical notes and user queries. User queries, unlike clinical notes, are often ungrammatical and incoherent. In many cases, a user query is a compound of multiple clinical entities, with no commas or conjunctions separating them. Using a dataset that mixes annotated clinical notes with synthesized user queries, we adapt a clinical NER model based on the BiLSTM-CRF architecture for tagging clinical entities in user queries. Our contributions are the following: 1) We found that when trained on a mixture of synthesized user queries and clinical notes, the NER model performs better on both user queries and clinical notes. 2)…
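The mixed-data idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not describe how the user queries are synthesized, so everything here is an assumption made for illustration: the function names `synthesize_query` and `build_mixed_dataset`, the BIO tagging scheme, the entity labels, and the choice to build a query by concatenating entity mentions with no separators. It is not the authors' procedure.

```python
import random


def synthesize_query(entities):
    """Build one synthetic query by concatenating entity mentions.

    `entities` is a list of (mention_text, label) pairs. The mentions are
    joined with no commas or conjunctions, mimicking the compound queries
    described in the abstract. Returns (tokens, tags) with BIO tags.
    NOTE: this synthesis rule is a hypothetical stand-in, not the paper's.
    """
    tokens, tags = [], []
    for text, label in entities:
        words = text.split()
        tokens.extend(words)
        # First word of a mention gets B-, the remaining words get I-.
        tags.extend([f"B-{label}"] + [f"I-{label}"] * (len(words) - 1))
    return tokens, tags


def build_mixed_dataset(note_sentences, entity_pool, n_queries,
                        max_entities=3, seed=0):
    """Mix annotated note sentences with synthesized queries.

    `note_sentences` is a list of (tokens, tags) pairs from annotated notes.
    `entity_pool` is a non-empty list of (mention_text, label) pairs.
    Returns a shuffled list of (tokens, tags) training examples.
    """
    rng = random.Random(seed)
    upper = min(max_entities, len(entity_pool))
    queries = []
    for _ in range(n_queries):
        k = rng.randint(1, upper)  # number of entities in this query
        queries.append(synthesize_query(rng.sample(entity_pool, k)))
    mixed = list(note_sentences) + queries
    rng.shuffle(mixed)
    return mixed
```

For example, `synthesize_query([("chest pain", "PROBLEM"), ("aspirin", "TREATMENT")])` yields the tokens `chest pain aspirin` with the tags `B-PROBLEM I-PROBLEM B-TREATMENT`. The resulting mixed list could then be passed to any sequence tagger, such as a BiLSTM-CRF, as ordinary token-level training data.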

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Biomedical Text Mining and Ontologies