SANSformers: Self-Supervised Forecasting in Electronic Health Records with Attention-Free Models
Yogesh Kumar, Alexander Ilin, Henri Salo, Sangita Kulathinal, Maarit K. Leinonen, Pekka Marttinen

TL;DR
This paper introduces SANSformer, an attention-free model with self-supervised pretraining that predicts healthcare service demand from EHR data and especially improves performance for small patient subgroups.
Contribution
The paper presents SANSformer, a novel attention-free model tailored to EHR data, and introduces Generative Summary Pretraining (GSP) to improve predictions for diverse and small patient subgroups.
Findings
SANSformer outperforms existing EHR baselines.
GSP pretraining significantly improves subgroup prediction.
Models trained on nearly one million patients.
Abstract
Despite the proven effectiveness of Transformer neural networks across multiple domains, their performance with Electronic Health Records (EHR) can be nuanced. The unique, multidimensional sequential nature of EHR data can sometimes make even simple linear models with carefully engineered features more competitive. Thus, the advantages of Transformers, such as efficient transfer learning and improved scalability, are not always fully exploited in EHR applications. Addressing these challenges, we introduce SANSformer, an attention-free sequential model designed with specific inductive biases to cater for the unique characteristics of EHR data. In this work, we aim to forecast the demand for healthcare services by predicting the number of patient visits to healthcare facilities. The challenge amplifies when dealing with divergent patient subgroups, like those with rare diseases, which…
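The forecasting task named in the abstract (predicting the number of patient visits) can be illustrated with a minimal sketch of how a count target might be derived from one patient's visit history. The function names, the one-year horizon, and the example dates are all assumptions made for illustration; the paper's actual data pipeline and prediction window may differ.

```python
from datetime import date, timedelta

def split_history(visit_dates, index_date):
    """Return the visits on or before index_date, sorted (the model's input side)."""
    return sorted(d for d in visit_dates if d <= index_date)

def count_future_visits(visit_dates, index_date, horizon_days=365):
    """Count visits strictly after index_date and within horizon_days of it.

    horizon_days=365 is an illustrative choice, not a value taken from the paper.
    """
    end = index_date + timedelta(days=horizon_days)
    return sum(1 for d in visit_dates if index_date < d <= end)

# Hypothetical visit dates for a single patient.
visits = [
    date(2019, 3, 1),
    date(2019, 11, 20),
    date(2020, 2, 5),
    date(2020, 8, 14),
    date(2021, 6, 1),
]
index = date(2019, 12, 31)

history = split_history(visits, index)       # what a model would observe
target = count_future_visits(visits, index)  # the count it would be asked to predict
print(len(history), target)
```

In this framing, each patient contributes an observed history up to an index date and a non-negative integer target for the window that follows, which is the kind of quantity a count-forecasting model would be trained to predict.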
Taxonomy
Topics: Machine Learning in Healthcare · Medical Coding and Health Information · Chronic Disease Management Strategies
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Adam · Dropout · Softmax · Residual Connection · Layer Normalization
