Making the Most Out of the Limited Context Length: Predictive Power Varies with Clinical Note Type and Note Section
Hongyi Zheng, Yixin Zhu, Lavender Yao Jiang, Kyunghyun Cho, Eric Karl Oermann

TL;DR
This paper investigates how the predictive power of clinical notes varies by note type and section, proposing a framework to optimize input selection for language models with limited context length in healthcare NLP.
Contribution
It introduces a framework to analyze predictive power distribution across clinical note types and sections, guiding better input selection for language models with limited context.
Findings
Predictive power differs between nursing and discharge notes.
Combining different note types can improve model performance when the context length is large.
Selective sampling improves information extraction efficiency.
Abstract
Recent advances in large language models have led to renewed interest in natural language processing in healthcare using the free text of clinical notes. One distinguishing characteristic of clinical notes is their long time span over multiple long documents. The unique structure of clinical notes creates a new design choice: when the context length for a language model predictor is limited, which part of clinical notes should we choose as the input? Existing studies either choose the inputs with domain knowledge or simply truncate them. We propose a framework to analyze the sections with high predictive power. Using MIMIC-III, we show that: 1) predictive power distribution is different between nursing notes and discharge notes and 2) combining different types of notes could improve performance when the context length is large. Our findings suggest that a carefully selected sampling…
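The input-selection choice raised in the abstract can be illustrated with a minimal sketch. It assumes whitespace tokenization and a single contiguous window; the function name `select_window` and the `position` parameter are hypothetical and are not taken from the paper, whose actual sampling procedure is not described on this page. Setting `position` to 0.0 reproduces plain head truncation, while other values pick a different part of the note under the same token budget.

```python
from typing import List


def select_window(tokens: List[str], budget: int, position: float) -> List[str]:
    """Return a contiguous window of at most `budget` tokens.

    position=0.0 keeps the head of the note, position=1.0 keeps the tail,
    and values in between slide the window proportionally through the note.
    (Hypothetical helper for illustration only.)
    """
    if budget <= 0:
        raise ValueError("budget must be positive")
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must be in [0, 1]")
    if len(tokens) <= budget:
        # The whole note already fits in the context budget.
        return list(tokens)
    # Number of tokens that cannot fit; the window start slides over this range.
    start = round(position * (len(tokens) - budget))
    return tokens[start:start + budget]


# Toy note, split on whitespace (an assumption; real models use subword tokenizers).
note = "admitted with chest pain . labs stable . discharged home on aspirin".split()
head = select_window(note, 4, 0.0)  # simple truncation
tail = select_window(note, 4, 1.0)  # keep the end of the note instead
```

Comparing a downstream model's performance across several `position` values is one simple way to probe which part of a note carries the most predictive signal for a given budget.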
Taxonomy
Topics: Nursing Diagnosis and Documentation · Biomedical Text Mining and Ontologies · Machine Learning in Healthcare
