How Long Is Enough? Exploring the Optimal Intervals of Long-Range Clinical Note Language Modeling
Samuel Cahyawijaya, Bryan Wilie, Holy Lovenia, Huan Zhong, MingQian Zhong, Yuk-Yu Nancy Ip, Pascale Fung

TL;DR
This paper investigates the impact of processing longer clinical notes with adapted language models like Longformer, demonstrating that longer context improves model performance in biomedical NLP tasks.
Contribution
It introduces a long-range adaptation of pre-trained language models for clinical notes and evaluates their effectiveness across multiple datasets.
Findings
Longer clinical note intervals improve model performance.
Optimal cut-off intervals vary for different target variables.
The Longformer adaptation achieves a 10% F1-score improvement.
Abstract
Large pre-trained language models (LMs) have been widely adopted in biomedical and clinical domains, introducing many powerful LMs such as bio-lm and BioELECTRA. However, the applicability of these methods to real clinical use cases is hindered, due to the limitation of pre-trained LMs in processing long textual data with thousands of words, which is a common length for a clinical note. In this work, we explore long-range adaptation from such LMs with Longformer, allowing the LMs to capture longer clinical notes context. We conduct experiments on three n2c2 challenges datasets and a longitudinal clinical dataset from Hong Kong Hospital Authority electronic health record (EHR) system to show the effectiveness and generalizability of this concept, achieving 10% F1-score improvement. Based on our experiments, we conclude that capturing a longer clinical note interval is beneficial to the…
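As a rough illustration of why a Longformer-style model can read notes that are thousands of tokens long, the sketch below builds a sliding-window attention mask and compares the number of attended token pairs with full self-attention. It is a simplified toy, not the authors' implementation: the function names `sliding_window_mask` and `attended_pairs` are hypothetical, and Longformer's global attention and dilated windows are left out.

```python
def sliding_window_mask(seq_len, window):
    """Build a seq_len x seq_len mask; True where token i may attend to token j.

    Each token attends only to neighbours within window // 2 positions on
    either side, so the number of allowed pairs grows linearly with seq_len
    instead of quadratically.
    """
    half = window // 2
    return [[abs(i - j) <= half for j in range(seq_len)] for i in range(seq_len)]


def attended_pairs(mask):
    """Count the (query, key) pairs the mask allows."""
    return sum(sum(row) for row in mask)


# Toy sizes for readability; real clinical notes run to thousands of tokens.
seq_len, window = 8, 4
mask = sliding_window_mask(seq_len, window)

full_pairs = seq_len * seq_len       # full self-attention: 64 pairs
local_pairs = attended_pairs(mask)   # sliding window: 34 pairs

print(f"full attention pairs:  {full_pairs}")
print(f"local attention pairs: {local_pairs}")
```

With the window held fixed, doubling the sequence length roughly doubles the local pair count, while the full-attention count quadruples. That difference is what makes longer note intervals affordable to process.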
Taxonomy
Topics: Machine Learning in Healthcare · Topic Modeling
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Softmax · Weight Decay · Dropout · WordPiece · Attention Dropout · AdamW
