An Eye on Clinical BERT: Investigating Language Model Generalization for Diabetic Eye Disease Phenotyping
Keith Harrigian, Tina Tang, Anthony Gonzales, Cindy X. Cai, Mark Dredze

TL;DR
This study evaluates clinical BERT models for diabetic eye disease phenotyping and finds that models pretrained on out-of-distribution clinical data offer no significant improvement over models pretrained on non-clinical data, challenging the assumption that clinical text forms a single homogeneous domain.
Contribution
The paper introduces a system for extracting clinical concepts related to diabetic eye disease and provides a comprehensive evaluation of clinical BERT models across different training paradigms.
Findings
Out-of-distribution BERT models perform similarly to clinical domain-specific models.
Clinical language models pretrained on clinical data do not significantly outperform non-clinical models.
Clinical language data should not be treated as a single homogeneous domain.
Abstract
Diabetic eye disease is a major cause of blindness worldwide. The ability to monitor relevant clinical trajectories and detect lapses in care is critical to managing the disease and preventing blindness. Alas, much of the information necessary to support these goals is found only in the free text of the electronic medical record. To fill this information gap, we introduce a system for extracting evidence from clinical text of 19 clinical concepts related to diabetic eye disease and inferring relevant attributes for each. In developing this ophthalmology phenotyping system, we are also afforded a unique opportunity to evaluate the effectiveness of clinical language models at adapting to new clinical domains. Across multiple training paradigms, we find that BERT language models pretrained on out-of-distribution clinical data offer no significant improvement over BERT language models…
Taxonomy
Topics: Machine Learning in Healthcare · Biomedical Text Mining and Ontologies · Artificial Intelligence in Healthcare
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Residual Connection · Dropout · Layer Normalization · Adam · Linear Warmup With Linear Decay · Softmax
