Can human clinical rationales improve the performance and explainability of clinical text classification models?

Christoph Metzner; Shang Gao; Drahomira Herrmannova; Heidi A. Hanson

arXiv:2507.21302·cs.CL·July 30, 2025

TL;DR

This study evaluates whether incorporating human clinical rationales into training improves the performance and explainability of clinical text classification models, finding limited performance gains but some explainability benefits.

Contribution

The paper provides a large-scale analysis of how effective human-derived rationales are as additional supervision for clinical NLP models, compared with simply adding more reports.

Findings

01

Rationales improve performance in high-resource settings

02

Models trained on rationales outperform those trained on reports in explainability

03

Using rationales yields smaller performance gains than adding more reports

Abstract

AI-driven clinical text classification is vital for explainable automated retrieval of population-level health information. This work investigates whether human-based clinical rationales can serve as additional supervision to improve both performance and explainability of transformer-based models that automatically encode clinical documents. We analyzed 99,125 human-based clinical rationales that provide plausible explanations for primary cancer site diagnoses, using them as additional training samples alongside 128,649 electronic pathology reports to evaluate transformer-based models for extracting primary cancer sites. We also investigated sufficiency as a way to measure rationale quality for pre-selecting rationales. Our results showed that clinical rationales as additional training data can improve model performance in high-resource scenarios but produce inconsistent behavior when…
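The sufficiency measure mentioned in the abstract can be illustrated with a small sketch. It assumes the common definition from the explainability literature (the drop in predicted probability for the labeled class when the model sees only the rationale instead of the full document); the paper may define or threshold it differently. The names `sufficiency`, `select_rationales`, and `toy_predict_proba`, as well as the threshold value, are hypothetical and serve only as illustration.

```python
from typing import Callable, List, Sequence, Tuple

# A sample is (full report text, rationale text, index of the gold label).
Sample = Tuple[str, str, int]
PredictFn = Callable[[str], Sequence[float]]


def sufficiency(predict_proba: PredictFn, document: str,
                rationale: str, label: int) -> float:
    """Probability drop for `label` when only the rationale is shown.

    Assumed definition: p(label | document) - p(label | rationale).
    Lower values mean the rationale alone supports the prediction
    almost as well as the full report.
    """
    p_full = predict_proba(document)[label]
    p_rationale = predict_proba(rationale)[label]
    return p_full - p_rationale


def select_rationales(predict_proba: PredictFn, samples: List[Sample],
                      threshold: float) -> List[Sample]:
    """Keep samples whose rationale has sufficiency at or below `threshold`."""
    return [s for s in samples
            if sufficiency(predict_proba, s[0], s[1], s[2]) <= threshold]


def toy_predict_proba(text: str) -> List[float]:
    """Hypothetical stand-in for a trained classifier (NOT the paper's model).

    Class 1 ("lung") becomes more likely with each mention of "lung".
    """
    hits = text.lower().count("lung")
    p_lung = min(0.5 + 0.2 * hits, 0.9)
    return [1.0 - p_lung, p_lung]


report = "Biopsy of the right lung. Lung adenocarcinoma."
good = (report, "lung adenocarcinoma", 1)
poor = (report, "specimen received in formalin", 1)

kept = select_rationales(toy_predict_proba, [good, poor], threshold=0.25)
```

In this toy setup, the rationale that names the site loses less probability than the uninformative one, so only the first sample survives the filter. In practice, `predict_proba` would wrap a trained transformer classifier and the threshold would be tuned on held-out data.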
