Sparse Autoencoder Decomposition of Clinical Sequence Model Representations: Feature Complexity, Task Specialisation, and Mortality Prediction
Chris Sainsbury, Feng Dong, Andreas Karwath

TL;DR
This study applies sparse autoencoders (SAEs) to a clinical sequence model, revealing progressive feature abstraction across transformer depth and comparing SAE features with dense representations on mortality and length-of-stay prediction.
Contribution
The paper systematically analyzes SAE decomposition of a clinical sequence model (FlatASCEND), demonstrating feature abstraction across depth and task-dependent probe performance, and introduces a perturbation reduction method.
Findings
Under full-sequence linear probes, SAE features outperform dense representations in mortality prediction.
Dense representations outperform SAE features in length-of-stay prediction.
Feature reproducibility across seeds is 21%, indicating that individual features are illustrative rather than stable.
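One common way to quantify cross-seed reproducibility is to match decoder directions between two independently trained SAEs by cosine similarity. The sketch below illustrates that idea only; the summary does not state how the 21% figure was computed, so the function name `reproducible_fraction`, the 0.9 threshold, and the synthetic decoder matrices are all assumptions made for illustration.

```python
import numpy as np

def reproducible_fraction(dec_a, dec_b, threshold=0.9):
    """Fraction of rows in dec_a whose best cosine match in dec_b is >= threshold.

    dec_a, dec_b: (n_features, d_model) decoder matrices from two seeds.
    The threshold is an illustrative assumption, not a value from the paper.
    """
    a = dec_a / np.linalg.norm(dec_a, axis=1, keepdims=True)
    b = dec_b / np.linalg.norm(dec_b, axis=1, keepdims=True)
    best = (a @ b.T).max(axis=1)  # best match in seed B for each seed-A feature
    return float((best >= threshold).mean())

# Synthetic demo: seed B shares its first 8 of 32 directions with seed A.
rng = np.random.default_rng(0)
dec_a = rng.normal(size=(32, 16))
dec_b = rng.normal(size=(32, 16))
dec_b[:8] = dec_a[:8]

frac_self = reproducible_fraction(dec_a, dec_a)
frac_shared = reproducible_fraction(dec_a, dec_b)
print(frac_self, frac_shared)
```

A one-directional best-match score like this is sensitive to the threshold, so any reported percentage should be read alongside the matching criterion actually used.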
Abstract
Sparse autoencoders (SAEs) have been applied to large language models and protein language models, but not systematically to electronic health record (EHR) foundation models. We train TopK SAEs on FlatASCEND, a 14.5-million-parameter autoregressive clinical sequence model, at all 10 residual stream extraction points on INSPECT (outpatient) and MIMIC-IV (ICU). SAE decomposition reveals progressive abstraction across transformer depth: layer-0 features are near-perfect token detectors (45.7% singleton), while layer-6 features span approximately 30 token types across multiple clinical categories (0.5% singleton). Under full-sequence simple linear probes, SAE features outperform dense representations for discrete event prediction (mortality) while dense representations outperform for continuous magnitude prediction (length of stay) - a probe-level representational phenomenon that does not…
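For readers unfamiliar with TopK SAEs, the following is a minimal NumPy sketch of a forward pass: the encoder produces pre-activations, only the k largest per input are kept, and the decoder reconstructs the residual-stream vector from that sparse code. It is not the authors' implementation; the dimensions (`d_model`, `n_features`), the value of `k`, the random weights, and the pre-encoder bias subtraction are assumptions chosen for illustration.

```python
import numpy as np

def topk_sae_forward(x, W_enc, b_enc, W_dec, b_dec, k):
    """TopK SAE forward pass: encode, keep k largest activations per row, decode."""
    pre = (x - b_dec) @ W_enc + b_enc               # (batch, n_features)
    idx = np.argpartition(pre, -k, axis=1)[:, -k:]  # indices of the k largest per row
    rows = np.arange(pre.shape[0])[:, None]
    z = np.zeros_like(pre)
    z[rows, idx] = np.maximum(pre[rows, idx], 0.0)  # ReLU on the kept activations
    x_hat = z @ W_dec + b_dec                       # reconstruction
    return z, x_hat

# Hypothetical sizes; the summary does not give the model width or dictionary size.
rng = np.random.default_rng(0)
d_model, n_features, k = 64, 256, 8
x = rng.normal(size=(4, d_model))                   # stand-in for residual-stream activations
W_enc = rng.normal(size=(d_model, n_features)) / np.sqrt(d_model)
W_dec = rng.normal(size=(n_features, d_model)) / np.sqrt(n_features)
b_enc = np.zeros(n_features)
b_dec = np.zeros(d_model)

z, x_hat = topk_sae_forward(x, W_enc, b_enc, W_dec, b_dec, k)
mse = float(np.mean((x - x_hat) ** 2))              # reconstruction error to minimise in training
print(z.shape, np.count_nonzero(z, axis=1), mse)
```

In training, the weights would be fitted to minimise the reconstruction error on activations collected at each extraction point; the sparse code `z` is what a downstream linear probe would consume in place of the dense activation.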
