PT: A Plain Transformer is Good Hospital Readmission Predictor
Zhenyi Fan, Jiaqi Li, Dongyu Luo, Yuqi Yuan

TL;DR
This paper introduces PT, a Transformer-based model that effectively predicts 30-day hospital readmissions by integrating diverse clinical data, demonstrating superior accuracy, scalability, and robustness over existing methods.
Contribution
The paper presents a simple, scalable, and robust Transformer-based model that outperforms existing approaches in hospital readmission prediction using multimodal clinical data.
Findings
Achieves superior prediction accuracy compared to existing models.
Handles various data modalities with high performance.
Maintains robustness even with missing temporal information.
Abstract
Hospital readmission prediction is critical for clinical decision support, aiming to identify patients at risk of returning within 30 days post-discharge. High readmission rates often indicate inadequate treatment or post-discharge care, making effective prediction models essential for optimizing resources and improving patient outcomes. We propose PT, a Transformer-based model that integrates Electronic Health Records (EHR), medical images, and clinical notes to predict 30-day all-cause hospital readmissions. PT extracts features from raw data and uses specialized Transformer blocks tailored to the data's complexity. Enhanced with Random Forest for EHR feature selection and test-time ensemble techniques, PT achieves superior accuracy, scalability, and robustness. It performs well even when temporal information is missing. Our main contributions are: (1) Simplicity: A powerful and…
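The abstract names test-time ensembling but this summary does not say how PT implements it. As an illustration only, here is a minimal sketch of one common form: averaging the readmission probabilities of several predictors and thresholding the mean. The `Predictor` type, the `ensemble_predict` function, the 0.5 threshold, and the constant stand-in predictors are all assumptions made for this example, not details from the paper.

```python
from statistics import fmean
from typing import Callable, Sequence, Tuple

# Hypothetical type: a predictor maps one patient's feature vector to a
# 30-day readmission probability in [0, 1].
Predictor = Callable[[Sequence[float]], float]


def ensemble_predict(
    predictors: Sequence[Predictor],
    features: Sequence[float],
    threshold: float = 0.5,  # assumed decision threshold, not from the paper
) -> Tuple[float, bool]:
    """Average the predictors' probabilities and threshold the mean."""
    if not predictors:
        raise ValueError("need at least one predictor")
    probabilities = [predict(features) for predict in predictors]
    mean_probability = fmean(probabilities)
    return mean_probability, mean_probability >= threshold


# Stand-in predictors that ignore their input; real ones would be trained
# models (for example, the same architecture trained with different seeds).
stand_ins = [lambda x: 0.25, lambda x: 0.75, lambda x: 0.5]
mean_probability, flagged = ensemble_predict(stand_ins, features=[0.0, 1.0])
print(mean_probability, flagged)
```

Averaging probabilities rather than hard labels keeps each model's confidence in the final score; whether PT uses this or another combination rule is not stated in the text above.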
Taxonomy
Topics: Machine Learning in Healthcare
Methods: Attention Is All You Need · Linear Layer · Dropout · Dense Connections · Byte Pair Encoding · Multi-Head Attention · Adam · Layer Normalization · Position-Wise Feed-Forward Layer · Label Smoothing
