Can human clinical rationales improve the performance and explainability of clinical text classification models?
Christoph Metzner, Shang Gao, Drahomira Herrmannova, Heidi A. Hanson

TL;DR
This study evaluates whether incorporating human clinical rationales into training improves the performance and explainability of clinical text classification models, finding limited performance gains but some explainability benefits.
Contribution
The paper provides a large-scale analysis of how effective human-derived rationales are as additional supervision for clinical NLP models, comparing their effect to that of simply adding more reports.
Findings
Rationales improve performance in high-resource settings
Models trained on rationales outperform those trained on reports in explainability
Using rationales yields smaller performance gains than adding more reports
Abstract
AI-driven clinical text classification is vital for explainable automated retrieval of population-level health information. This work investigates whether human-based clinical rationales can serve as additional supervision to improve both performance and explainability of transformer-based models that automatically encode clinical documents. We analyzed 99,125 human-based clinical rationales that provide plausible explanations for primary cancer site diagnoses, using them as additional training samples alongside 128,649 electronic pathology reports to evaluate transformer-based models for extracting primary cancer sites. We also investigated sufficiency as a way to measure rationale quality for pre-selecting rationales. Our results showed that clinical rationales as additional training data can improve model performance in high-resource scenarios but produce inconsistent behavior when…
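The abstract mentions sufficiency as a way to score rationale quality. As a rough illustration only, the sketch below uses one common definition from the explainability literature: the drop in the model's confidence for the labeled class when it sees only the rationale instead of the full document. Whether the paper uses exactly this definition is an assumption; `toy_predict_proba`, `select_rationales`, the threshold, and the example texts are all hypothetical stand-ins, not the authors' model or data.

```python
from typing import Callable, Dict, List

# Hypothetical interface: maps a text to class probabilities.
ProbaFn = Callable[[str], Dict[str, float]]


def sufficiency(predict_proba: ProbaFn, document: str, rationale: str, label: str) -> float:
    """Confidence drop for `label` when only the rationale is shown.

    Values near zero (or negative) suggest the rationale alone
    supports the prediction about as well as the full document.
    """
    full_conf = predict_proba(document)[label]
    rationale_conf = predict_proba(rationale)[label]
    return full_conf - rationale_conf


def select_rationales(predict_proba: ProbaFn, document: str, rationales: List[str],
                      label: str, max_drop: float) -> List[str]:
    """Keep rationales whose sufficiency score is at most `max_drop` (assumed threshold)."""
    return [r for r in rationales
            if sufficiency(predict_proba, document, r, label) <= max_drop]


def toy_predict_proba(text: str) -> Dict[str, float]:
    # Toy keyword classifier standing in for a transformer model.
    p_lung = 0.9 if "lung" in text.lower() else 0.2
    return {"lung": p_lung, "other": 1.0 - p_lung}


# Invented example text, not taken from the paper's data.
report = "Specimen received in formalin. Right upper lobe lung, adenocarcinoma."
good_rationale = "lung, adenocarcinoma"
poor_rationale = "Specimen received in formalin"

kept = select_rationales(toy_predict_proba, report,
                         [good_rationale, poor_rationale], "lung", max_drop=0.1)
```

With this toy classifier, only the rationale that names the site survives the filter, which is the kind of pre-selection the abstract describes.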
