DR.BENCH: Diagnostic Reasoning Benchmark for Clinical Natural Language Processing
Yanjun Gao, Dmitriy Dligach, Timothy Miller, John Caskey, Brihat Sharma, Matthew M Churpek, Majid Afshar

TL;DR
DR.BENCH is a new benchmark suite designed to evaluate clinical natural language processing models on diagnostic reasoning tasks, aiming to improve AI's ability to support clinical decision-making and reduce diagnostic errors.
Contribution
It introduces the first comprehensive diagnostic reasoning benchmark for cNLP, including six tasks and a framework for evaluating generative models in clinical diagnosis.
Findings
State-of-the-art models show room for improvement on DR.BENCH
The benchmark highlights gaps in current clinical NLP models' diagnostic reasoning abilities
DR.BENCH is publicly available for community use and development
Abstract
The meaningful use of electronic health records (EHR) continues to progress in the digital era with clinical decision support systems augmented by artificial intelligence. A priority in improving provider experience is to overcome information overload and reduce the cognitive burden so fewer medical errors and cognitive biases are introduced during patient care. One major type of medical error is diagnostic error due to systematic or predictable errors in judgment that rely on heuristics. The potential for clinical natural language processing (cNLP) to model diagnostic reasoning in humans with forward reasoning from data to diagnosis and potentially reduce the cognitive burden and medical error has not been investigated. Existing tasks to advance the science in cNLP have largely focused on information extraction and named entity recognition through classification tasks. We introduce a…
Taxonomy
Topics: Topic Modeling · Machine Learning in Healthcare · Biomedical Text Mining and Ontologies
