Clinical Concept Extraction with Contextual Word Embedding
Henghui Zhu, Ioannis Ch. Paschalidis, Amir Tahmasebi

TL;DR
This paper presents a new clinical concept extraction model that leverages domain-specific contextual word embeddings and a bidirectional LSTM-CRF architecture, achieving state-of-the-art results on the I2B2 2010 dataset.
Contribution
The study introduces a novel combination of domain-specific contextual embeddings with a bidirectional LSTM-CRF for improved clinical concept extraction.
Findings
Achieved an F1-score 3.4% higher than previous state-of-the-art models on the I2B2 2010 challenge dataset.
Achieved the best performance among the reported baseline models on the same dataset.
Demonstrated effectiveness of domain-specific contextual embeddings in clinical NLP.
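To make the F1 figures above concrete, here is a minimal sketch of how a span-level F1-score is commonly computed for concept extraction. It assumes BIO-style tags and exact-match scoring (a predicted concept counts only if both its boundaries and its type match the gold annotation). The function names `bio_to_spans` and `span_f1` and the sample tags are hypothetical, and this is not the official I2B2 evaluation script.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, concept_type)


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect labeled spans from one BIO-tagged sentence."""
    spans: Set[Span] = set()
    start, label = None, None
    # A trailing "O" sentinel closes a span that runs to the end of the sentence.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if continues:
            continue
        if label is not None:
            spans.add((start, i, label))
        if tag.startswith("B-") or tag.startswith("I-"):
            # Lenient choice (an assumption): a stray "I-" tag opens a new span.
            start, label = i, tag[2:]
        else:
            start, label = None, None
    return spans


def span_f1(gold: List[List[str]], pred: List[List[str]]) -> Tuple[float, float, float]:
    """Exact-match precision, recall, and F1 over all sentences."""
    gold_spans, pred_spans = set(), set()
    for idx, (g, p) in enumerate(zip(gold, pred)):
        gold_spans |= {(idx, *s) for s in bio_to_spans(g)}
        pred_spans |= {(idx, *s) for s in bio_to_spans(p)}
    tp = len(gold_spans & pred_spans)
    precision = tp / len(pred_spans) if pred_spans else 0.0
    recall = tp / len(gold_spans) if gold_spans else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Illustrative data only: one sentence, one of two concepts predicted correctly.
gold = [["B-problem", "I-problem", "O", "B-test"]]
pred = [["B-problem", "I-problem", "O", "B-treatment"]]
precision, recall, f1 = span_f1(gold, pred)
```

In this toy case the "problem" span matches and the final span has the wrong type, so precision, recall, and F1 each come out to 0.5.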
Abstract
Automatic extraction of clinical concepts is an essential step for turning the unstructured data within a clinical note into structured and actionable information. In this work, we propose a clinical concept extraction model for automatic annotation of clinical problems, treatments, and tests in clinical notes utilizing domain-specific contextual word embedding. A contextual word embedding model is first trained on a corpus with a mixture of clinical reports and relevant Wikipedia pages in the clinical domain. Next, a bidirectional LSTM-CRF model is trained for clinical concept extraction using the contextual word embedding model. We tested our proposed model on the I2B2 2010 challenge dataset. Our proposed model achieved the best performance among reported baseline models and outperformed the state-of-the-art models by 3.4% in terms of F1-score.
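As a rough illustration of what the CRF layer in a bidirectional LSTM-CRF contributes, the sketch below runs Viterbi decoding over per-token tag scores plus tag-to-tag transition scores. It is a simplified stand-in under stated assumptions: in the model described above the emission scores would come from the LSTM over contextual embeddings and the transitions would be learned, whereas here both are hand-written toy numbers. The function name `viterbi_decode` and all values are hypothetical.

```python
from typing import Dict, List, Tuple


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    tags: List[str],
) -> List[str]:
    """Return the highest-scoring tag sequence under a linear-chain model.

    emissions[i][t] is the score of tag t at token i;
    transitions[(a, b)] is the score of moving from tag a to tag b
    (missing pairs default to 0.0).
    """
    if not emissions:
        return []
    score = {t: emissions[0][t] for t in tags}
    backpointers: List[Dict[str, str]] = []
    for emit in emissions[1:]:
        new_score: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in tags:
            # Best previous tag for ending at `cur` on this token.
            best_prev = max(
                tags, key=lambda prev: score[prev] + transitions.get((prev, cur), 0.0)
            )
            new_score[cur] = (
                score[best_prev] + transitions.get((best_prev, cur), 0.0) + emit[cur]
            )
            pointers[cur] = best_prev
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final tag.
    path = [max(tags, key=lambda t: score[t])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy example (all numbers are made up for illustration).
tags = ["O", "B-problem", "I-problem"]
transitions = {("O", "I-problem"): -1e4}  # strongly penalize "I-" right after "O"
emissions = [
    {"O": 1.0, "B-problem": 0.8, "I-problem": 0.0},
    {"O": 0.0, "B-problem": 0.0, "I-problem": 2.0},
]
decoded = viterbi_decode(emissions, transitions, tags)
```

Picking the top-scoring tag for each token independently would give `O` followed by `I-problem`, which is not a valid BIO sequence. Scoring the sequence jointly lets the transition penalty steer the decoder to `B-problem`, `I-problem` instead, which is the usual motivation for placing a CRF on top of the LSTM outputs.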
Taxonomy
Topics: Topic Modeling · Advanced Text Analysis Techniques · Biomedical Text Mining and Ontologies
