DR.BENCH: Diagnostic Reasoning Benchmark for Clinical Natural Language Processing

Yanjun Gao; Dmitriy Dligach; Timothy Miller; John Caskey; Brihat Sharma; Matthew M Churpek; Majid Afshar

arXiv:2209.14901·cs.CL·January 30, 2023·1 cites

Open Access

TL;DR

DR.BENCH is a new benchmark suite designed to evaluate clinical natural language processing models on diagnostic reasoning tasks, aiming to improve AI's ability to support clinical decision-making and reduce diagnostic errors.

Contribution

The paper introduces the first comprehensive diagnostic reasoning benchmark for clinical NLP (cNLP), comprising six tasks and a framework for evaluating generative models on clinical diagnosis.

Findings

01. State-of-the-art models show room for improvement on DR.BENCH.

02. The benchmark highlights gaps in current clinical NLP models' diagnostic reasoning abilities.

03. DR.BENCH is publicly available for community use and development.

Abstract

The meaningful use of electronic health records (EHR) continues to progress in the digital era with clinical decision support systems augmented by artificial intelligence. A priority in improving provider experience is to overcome information overload and reduce the cognitive burden so fewer medical errors and cognitive biases are introduced during patient care. One major type of medical error is diagnostic error due to systematic or predictable errors in judgment that rely on heuristics. The potential for clinical natural language processing (cNLP) to model diagnostic reasoning in humans with forward reasoning from data to diagnosis and potentially reduce the cognitive burden and medical error has not been investigated. Existing tasks to advance the science in cNLP have largely focused on information extraction and named entity recognition through classification tasks. We introduce a…

Taxonomy

Topics: Topic Modeling · Machine Learning in Healthcare · Biomedical Text Mining and Ontologies