Unsupervised Numerical Reasoning to Extract Phenotypes from Clinical Text by Leveraging External Knowledge
Ashwani Tanwar, Jingqing Zhang, Julia Ive, Vibhor Gupta, Yike Guo

TL;DR
This paper introduces an unsupervised approach that uses external knowledge and ClinicalBERT embeddings to improve numerical reasoning in clinical text phenotyping, with absolute gains of up to 79% in generalized Recall and 71% in generalized F1 over unsupervised benchmarks.
Contribution
It presents a novel unsupervised methodology that enhances numerical reasoning in clinical phenotyping by leveraging external knowledge and contextualized embeddings.
Findings
Absolute gains of up to 79% in generalized Recall and 71% in generalized F1 over unsupervised benchmarks.
In the supervised setting, surpasses alternative approaches with gains of up to 70% in Recall and 44% in F1.
Demonstrates effectiveness across various phenotypic contexts involving numerical data.
Abstract
Extracting phenotypes from clinical text has been shown to be useful for a variety of clinical use cases such as identifying patients with rare diseases. However, reasoning with numerical values remains challenging for phenotyping in clinical text, for example, temperature 102F representing Fever. Current state-of-the-art phenotyping models are able to detect general phenotypes, but perform poorly when they detect phenotypes requiring numerical reasoning. We present a novel unsupervised methodology leveraging external knowledge and contextualized word embeddings from ClinicalBERT for numerical reasoning in a variety of phenotypic contexts. Comparing against unsupervised benchmarks, it shows a substantial performance improvement with absolute gains on generalized Recall and F1 scores up to 79% and 71%, respectively. In the supervised setting, it also surpasses the performance of…
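To make the "temperature 102F representing Fever" example concrete, here is a minimal sketch of the general idea of checking an extracted number against external knowledge. It is not the authors' method: it replaces the ClinicalBERT embedding step with a plain regular expression, and the `REFERENCE_RANGES` table, its thresholds, and the `infer_phenotype` function are hypothetical names and illustrative values chosen for this example only.

```python
import re
from typing import Optional

# Hypothetical external knowledge: measurement name -> (unit, low, high,
# phenotype if below low, phenotype if above high).
# Thresholds are illustrative assumptions, not values taken from the paper.
REFERENCE_RANGES = {
    "temperature": ("F", 95.0, 100.4, "Hypothermia", "Fever"),
    "heart rate": ("bpm", 60.0, 100.0, "Bradycardia", "Tachycardia"),
}


def infer_phenotype(text: str) -> Optional[str]:
    """Return a phenotype label if a known measurement falls outside its range."""
    for name, (unit, low, high, low_label, high_label) in REFERENCE_RANGES.items():
        # Find the measurement name, then the first number followed by its unit.
        pattern = rf"{re.escape(name)}\D*?(\d+(?:\.\d+)?)\s*{re.escape(unit)}\b"
        match = re.search(pattern, text, flags=re.IGNORECASE)
        if match is None:
            continue
        value = float(match.group(1))
        if value > high:
            return high_label
        if value < low:
            return low_label
    # No known measurement found, or every value was within its normal range.
    return None


if __name__ == "__main__":
    for note in ("temperature 102F", "heart rate of 45 bpm", "temperature 98.6F"):
        print(note, "->", infer_phenotype(note))
```

A keyword lookup like this only handles exact lexical matches. The abstract describes using contextualized word embeddings instead, which presumably lets the approach link varied phrasings in clinical notes to the right external-knowledge entry.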
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Biomedical Text Mining and Ontologies
