What Do You See in this Patient? Behavioral Testing of Clinical NLP Models
Betty van Aken, Sebastian Herrmann, Alexander Löser

TL;DR
This paper introduces an extendable testing framework to analyze how clinical NLP models' decisions are affected by patient characteristics like gender, age, and ethnicity, revealing biases and inconsistencies in model behavior.
Contribution
It presents a novel testing framework for evaluating how clinical NLP models behave when their input changes, exposing biases and variability in model decisions.
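To make the idea of testing model behavior under input changes concrete, here is a minimal sketch of one such test: it rewrites the gender mentions in a clinical note and reports how a model's predicted probabilities shift. This is not the authors' implementation. The names `swap_gender`, `prediction_shift`, and `toy_model`, the tiny substitution vocabulary, and the label `myocardial_infarction` are all hypothetical, and `toy_model` is a deliberately biased stand-in rather than a real clinical outcome model.

```python
import re
from typing import Callable, Dict

# Hypothetical, deliberately small vocabulary of gender mentions.
# A real test suite would need a far more careful term list.
TO_FEMALE: Dict[str, str] = {"man": "woman", "male": "female", "he": "she"}
TO_MALE: Dict[str, str] = {v: k for k, v in TO_FEMALE.items()}


def swap_gender(note: str, mapping: Dict[str, str]) -> str:
    """Replace whole-word gender mentions according to `mapping`."""
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, mapping)) + r")\b", re.IGNORECASE
    )

    def repl(match: "re.Match[str]") -> str:
        word = match.group(0)
        new = mapping[word.lower()]
        # Preserve sentence-initial capitalization.
        return new.capitalize() if word[0].isupper() else new

    return pattern.sub(repl, note)


def prediction_shift(
    model: Callable[[str], Dict[str, float]],
    note: str,
    mapping: Dict[str, str],
) -> Dict[str, float]:
    """Per-label change in predicted probability after the perturbation."""
    original = model(note)
    perturbed = model(swap_gender(note, mapping))
    return {label: perturbed[label] - original[label] for label in original}


def toy_model(note: str) -> Dict[str, float]:
    """Stand-in for a clinical outcome model (assumption, not a real model).

    It lowers one diagnosis score when the note mentions a female patient,
    which is exactly the kind of pattern such a test is meant to surface.
    """
    text = note.lower()
    score = 0.2
    if "chest pain" in text:
        score += 0.5
    if re.search(r"\b(female|woman)\b", text):
        score -= 0.1
    return {"myocardial_infarction": score}


note = "58 year old male presents with chest pain. He denies fever."
shift = prediction_shift(toy_model, note, TO_FEMALE)
print(swap_gender(note, TO_FEMALE))
print(shift)
```

A nonzero shift means the prediction depends on the mentioned gender; whether that dependence is medically plausible still has to be judged by a domain expert. The same pattern could be extended to age or ethnicity by swapping in a different perturbation function.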
Findings
Model decisions are significantly influenced by patient characteristics.
Even top-performing models exhibit learned patterns that lack medical plausibility.
Behavior varies greatly between models even when they are fine-tuned on the same dataset.
Abstract
Decision support systems based on clinical notes have the potential to improve patient care by pointing doctors towards overlooked risks. Predicting a patient's outcome is an essential part of such systems, for which the use of deep neural networks has shown promising results. However, the patterns learned by these networks are mostly opaque and previous work revealed flaws regarding the reproduction of unintended biases. We thus introduce an extendable testing framework that evaluates the behavior of clinical outcome models regarding changes of the input. The framework helps to understand learned patterns and their influence on model decisions. In this work, we apply it to analyse the change in behavior with regard to the patient characteristics gender, age and ethnicity. Our evaluation of three current clinical NLP models demonstrates the concrete effects of these characteristics on the…
Taxonomy
Topics: Machine Learning in Healthcare · Topic Modeling · Artificial Intelligence in Healthcare and Education
