Time Matters: Examine Temporal Effects on Biomedical Language Models
Weisi Liu, Zhe He, Xiaolei Huang

TL;DR
This paper investigates how temporal data shifts impact the performance of biomedical language models, highlighting the importance of considering time in deployment and evaluation.
Contribution
It provides a comprehensive statistical analysis of temporal effects on biomedical models across multiple tasks, establishing a benchmark for future assessments.
Findings
Performance degradation varies across biomedical tasks.
Different metrics and data drift measures yield varying insights.
Time significantly influences model deployment effectiveness.
Abstract
Time is inherent in applying language models to biomedical applications: models are trained on historical data and deployed on new or future data, which may differ from the training data. While a growing number of biomedical tasks employ state-of-the-art language models, very few studies have examined temporal effects on biomedical models, even though data usually shifts between development and deployment. This study fills the gap by statistically probing the relationship between language model performance and data shifts across three biomedical tasks. We deploy diverse metrics to evaluate model performance, distance methods to measure data drift, and statistical methods to quantify temporal effects on biomedical language models. Our study shows that time matters for deploying biomedical language models, while the degree of performance degradation varies by biomedical tasks and statistical…
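The workflow the abstract describes (measure drift between time periods, then relate it to performance) might be sketched roughly as follows. This is an illustration only, not the authors' code: the abstract does not name the distance or statistical methods used, so Jensen-Shannon divergence over unigram distributions and Pearson correlation are assumptions chosen for simplicity, and all function names are hypothetical.

```python
import math
from collections import Counter


def unigram_dist(texts):
    """Whitespace-tokenized, lowercased unigram distribution (assumed featurization)."""
    counts = Counter(tok for text in texts for tok in text.lower().split())
    total = sum(counts.values())
    return {tok: c / total for tok, c in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2) between two sparse distributions.

    Ranges from 0 (identical) to 1 (disjoint support). Used here as one
    possible drift measure; the paper may use different distances.
    """
    vocab = set(p) | set(q)
    m = {t: 0.5 * (p.get(t, 0.0) + q.get(t, 0.0)) for t in vocab}

    def kl(a):
        # KL(a || m); terms with a[t] == 0 contribute nothing.
        return sum(a[t] * math.log2(a[t] / m[t]) for t in a if a[t] > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


def pearson(xs, ys):
    """Pearson correlation, e.g. between per-period drift and per-period score."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Toy placeholder corpora for two time periods (not data from the paper).
train_period = ["patient reports fever and cough", "fever treated with rest"]
later_period = ["patient reports loss of smell", "antiviral therapy started"]
drift = js_divergence(unigram_dist(train_period), unigram_dist(later_period))
```

In practice one would compute `drift` between the training period and each later test period, collect the model's score on each test period, and pass both lists to `pearson` (or a more robust test) to check whether larger drift accompanies lower performance.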
Taxonomy
Topics: Topic Modeling · Artificial Intelligence in Healthcare and Education
