Three-level Hierarchical Transformer Networks for Long-sequence and Multiple Clinical Documents Classification

Yuqi Si; Kirk Roberts

arXiv:2104.08444·cs.CL·December 20, 2021·5 cites

TL;DR

This paper introduces a three-level hierarchical transformer network designed to effectively model long clinical notes and multiple documents for patient prediction tasks, significantly extending input length capabilities.

Contribution

The paper proposes a novel three-level hierarchical transformer architecture that captures dependencies across words, sentences, notes, and patients, improving long-sequence clinical document classification.

Findings

01 Outperforms previous state-of-the-art models, including BigBird, on MIMIC-III.

02 Handles input sequences longer than the 512-token limit of conventional BERT models.

03 Examines different hyper-parameters empirically, with computational cost in mind.

Abstract

We present a Three-level Hierarchical Transformer Network (3-level-HTN) for modeling long-term dependencies across clinical notes for the purpose of patient-level prediction. The network is equipped with three levels of Transformer-based encoders to learn progressively from words to sentences, sentences to notes, and finally notes to patients. The first level, from words to sentences, directly applies a pre-trained BERT model as a fully trainable component. The second and third levels each implement a stack of Transformer-based encoders, and the final patient representation is then fed into a classification layer for clinical predictions. Compared to conventional BERT models, our model increases the maximum input length from 512 tokens to much longer sequences that are appropriate for modeling large numbers of clinical notes. We empirically examine different hyper-parameters to identify…
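
The hierarchy described in the abstract implies that a patient's record is arranged as a nested grid (notes, then sentences, then tokens) instead of one flat token sequence. The sketch below illustrates only that input layout in plain Python. It is not taken from the authors' repository: the names `pack_patient`, `pad_to`, `capacity`, and `PAD`, and all size limits, are hypothetical and chosen for illustration.

```python
PAD = 0  # hypothetical padding token id, not from the paper


def pad_to(items, length, make_fill):
    """Truncate or right-pad `items` to exactly `length` entries."""
    items = list(items[:length])
    items += [make_fill() for _ in range(length - len(items))]
    return items


def pack_patient(notes, max_notes, max_sents, max_tokens):
    """Arrange one patient as a [max_notes][max_sents][max_tokens] grid.

    `notes` is a list of notes; each note is a list of sentences;
    each sentence is a list of integer token ids.
    """
    def empty_sent():
        return [PAD] * max_tokens

    def empty_note():
        return [empty_sent() for _ in range(max_sents)]

    grid = []
    for note in notes[:max_notes]:
        # Level 1 would encode each padded token row into a sentence vector.
        sents = [pad_to(s, max_tokens, lambda: PAD) for s in note[:max_sents]]
        # Level 2 would encode the sentence vectors of a note into a note vector.
        grid.append(pad_to(sents, max_sents, empty_sent))
    # Level 3 would encode the note vectors into one patient vector.
    return pad_to(grid, max_notes, empty_note)


def capacity(max_notes, max_sents, max_tokens):
    """Upper bound on tokens per patient that the grid can hold."""
    return max_notes * max_sents * max_tokens


patient = [[[5, 6, 7], [8]], [[9, 10, 11, 12, 13]]]
grid = pack_patient(patient, max_notes=3, max_sents=2, max_tokens=4)
```

Because each encoder level only attends within one axis of the grid, the per-patient token budget is the product of the three limits. With illustrative limits of 32 notes, 32 sentences per note, and 64 tokens per sentence (assumed values, not the paper's settings), `capacity(32, 32, 64)` gives 65,536 tokens, compared with the 512-token limit of a single BERT input.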

Code & Models

Repositories

yuqi92/3-level-htn-mimic (tf · Official)

Taxonomy

Topics: Machine Learning in Healthcare · AI in cancer detection · Topic Modeling

Methods: Multi-Head Attention · Linear Layer · BigBird · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Label Smoothing · Byte Pair Encoding · Dropout · Adam