On Preserving the Knowledge of Long Clinical Texts
Mohammad Junayed Hasan, Suhra Noor, Mohammad Ashrafuzzaman Khan

TL;DR
This paper introduces a novel approach combining ensemble and aggregation techniques to preserve and utilize the full information of long clinical texts in transformer models, improving prediction accuracy.
Contribution
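It proposes a method that fuses ensemble and aggregation strategies, which earlier work applied separately, so that transformer-based models can use the full length of a clinical text rather than a truncated part of it. The fused approach is reported to outperform baseline models on MIMIC-III.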
Findings
Improved prediction accuracy on clinical outcome tasks.
Information from long clinical notes is preserved instead of being discarded at the input length limit.
Better results than baseline models on the MIMIC-III dataset.
Abstract
Clinical texts, such as admission notes, discharge summaries, and progress notes, contain rich and valuable information that can be used for clinical decision making. However, a severe bottleneck in using transformer encoders for processing clinical texts comes from the input length limit of these models: transformer-based encoders use fixed-length inputs. Therefore, these models discard part of the inputs while processing medical text. There is a risk of losing vital knowledge from clinical text if only part of it is processed. This paper proposes a novel method to preserve the knowledge of long clinical texts in the models using aggregated ensembles of transformer encoders. Previous studies used either ensemble or aggregation, but we studied the effects of fusing these methods. We trained several pre-trained BERT-like transformer encoders on two clinical outcome tasks: mortality…
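The abstract does not give the exact formulas, so the following is only a minimal sketch of the general idea of an aggregated ensemble, under stated assumptions: a long note is split into fixed-length chunks, each model scores every chunk, the chunk scores are aggregated per model, and the per-model results are averaged. The function names, the `mean`/`max` aggregation choices, the 512-token default, and the constant toy scorers are all hypothetical and are not taken from the paper; in practice each scorer would wrap a fine-tuned BERT-like encoder.

```python
from statistics import mean
from typing import Callable, List, Sequence

# A scorer maps one chunk of tokens to a probability in [0, 1].
# Hypothetical stand-in for a fine-tuned transformer encoder.
ChunkScorer = Callable[[Sequence[str]], float]


def split_into_chunks(tokens: Sequence[str], max_len: int) -> List[Sequence[str]]:
    """Split a token sequence into consecutive chunks of at most max_len tokens."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    return [tokens[i:i + max_len] for i in range(0, len(tokens), max_len)]


def aggregate_chunk_scores(scores: Sequence[float], how: str = "mean") -> float:
    """Combine per-chunk scores from one model into one document-level score."""
    if not scores:
        raise ValueError("no chunk scores to aggregate")
    if how == "mean":
        return mean(scores)
    if how == "max":
        return max(scores)
    raise ValueError(f"unknown aggregation: {how}")


def aggregated_ensemble_predict(
    tokens: Sequence[str],
    scorers: Sequence[ChunkScorer],
    max_len: int = 512,  # assumed input limit, typical for BERT-like encoders
    how: str = "mean",
) -> float:
    """Aggregate over chunks within each model, then average across models."""
    if not scorers:
        raise ValueError("at least one scorer is required")
    chunks = split_into_chunks(tokens, max_len)
    per_model = [
        aggregate_chunk_scores([scorer(chunk) for chunk in chunks], how)
        for scorer in scorers
    ]
    return mean(per_model)


if __name__ == "__main__":
    # Toy scorers that ignore their input; real ones would run a model.
    note = ["token"] * 1200
    toy_scorers = [lambda chunk: 0.2, lambda chunk: 0.4]
    print(aggregated_ensemble_predict(note, toy_scorers))
```

With this layout no part of the note is dropped: a 1200-token note with a 512-token limit yields three chunks, and every chunk contributes to the final score. Whether mean, max, or another rule is the better aggregation is an empirical question that this sketch does not settle.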
Taxonomy
Topics: Machine Learning in Healthcare · Topic Modeling · Electronic Health Records Systems
