Transformer-Based Sleep Stage Classification Enhanced by Clinical Information
Woosuk Chung, Seokwoo Hong, Wonhyeok Lee, Sangyoon Bae

TL;DR
This paper introduces a Transformer-based sleep staging model that incorporates clinical metadata and expert event annotations, improving both accuracy (macro-F1 from 0.7745 to 0.8031) and interpretability over a PSG-only baseline.
Contribution
The paper presents a two-stage architecture, a Transformer-based per-epoch encoder followed by a 1D CNN aggregator, that fuses subject-level clinical metadata and per-epoch event annotations with PSG signals for sleep stage classification.
Findings
Contextual fusion improves macro-F1 from 0.7745 to 0.8031 and micro-F1 from 0.8774 to 0.9051 over the PSG-only baseline
Event annotations contribute the largest accuracy gains
Feature fusion outperforms multi-task learning approaches
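The staging results are reported as macro-F1 and micro-F1. As a rough illustration of why the two can differ noticeably on imbalanced sleep-stage data, the sketch below computes both on a toy hypnogram. The labels, the toy predictions, and the helper name `f1_scores` are assumptions for illustration only; this is not the authors' evaluation code.

```python
from collections import Counter

def f1_scores(y_true, y_pred, labels):
    """Return (macro_f1, micro_f1) for single-label multi-class predictions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative

    # Macro-F1: unweighted mean of per-class F1, so rare stages count equally.
    per_class = []
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        per_class.append(2 * tp[c] / denom if denom else 0.0)
    macro = sum(per_class) / len(labels)

    # Micro-F1: pool counts over all classes before computing F1.
    total_tp = sum(tp[c] for c in labels)
    total_fp = sum(fp[c] for c in labels)
    total_fn = sum(fn[c] for c in labels)
    micro_denom = 2 * total_tp + total_fp + total_fn
    micro = 2 * total_tp / micro_denom if micro_denom else 0.0
    return macro, micro

# Toy example (hypothetical data): the rare N1 stage is always mistaken for N2.
stages = ["W", "N1", "N2", "N3", "REM"]
y_true = ["W"] * 4 + ["N1"] * 2 + ["N2"] * 8 + ["N3"] * 3 + ["REM"] * 3
y_pred = ["W"] * 4 + ["N2"] * 2 + ["N2"] * 8 + ["N3"] * 3 + ["REM"] * 3

macro_f1, micro_f1 = f1_scores(y_true, y_pred, stages)
print(f"macro-F1={macro_f1:.4f}  micro-F1={micro_f1:.4f}")
```

In this toy case micro-F1 is 0.9 while macro-F1 is about 0.78, because the completely missed minority class pulls down the unweighted average but barely affects the pooled counts.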
Abstract
Manual sleep staging from polysomnography (PSG) is labor-intensive and prone to inter-scorer variability. While recent deep learning models have advanced automated staging, most rely solely on raw PSG signals and neglect contextual cues used by human experts. We propose a two-stage architecture that combines a Transformer-based per-epoch encoder with a 1D CNN aggregator, and systematically investigate the effect of incorporating explicit context: subject-level clinical metadata (age, sex, BMI) and per-epoch expert event annotations (apneas, desaturations, arousals, periodic breathing). Using the Sleep Heart Health Study (SHHS) cohort (n=8,357), we demonstrate that contextual fusion substantially improves staging accuracy. Compared to a PSG-only baseline (macro-F1 0.7745, micro-F1 0.8774), our final model achieves macro-F1 0.8031 and micro-F1 0.9051, with event annotations contributing…
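The abstract does not say how the context is fused with the per-epoch features. One common option, shown here purely as an assumption, is to concatenate each epoch embedding with the subject-level metadata (repeated for every epoch) and that epoch's binary event flags before the sequence-level aggregator. The function name `fuse_context`, the vector sizes, and the normalized values are hypothetical.

```python
def fuse_context(epoch_embeddings, metadata, event_flags):
    """Concatenate per-epoch embeddings with subject metadata and event flags.

    epoch_embeddings: list of T feature vectors, one per epoch
    metadata:         one vector per subject, e.g. [age, sex, BMI] (normalized)
    event_flags:      list of T binary vectors, e.g.
                      [apnea, desaturation, arousal, periodic_breathing]
    Returns a list of T fused vectors.
    """
    if len(epoch_embeddings) != len(event_flags):
        raise ValueError("need one event-flag vector per epoch")
    # Subject-level metadata is constant across the night, so it is
    # repeated for every epoch; event flags vary epoch by epoch.
    return [
        list(emb) + list(metadata) + list(flags)
        for emb, flags in zip(epoch_embeddings, event_flags)
    ]

# Hypothetical inputs: 3 epochs with 4-dimensional encoder outputs.
embeddings = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8], [0.9, 1.0, 1.1, 1.2]]
subject_meta = [0.63, 1.0, 0.41]                   # assumed normalized age, sex, BMI
events = [[0, 0, 0, 0], [1, 1, 0, 0], [0, 0, 1, 0]]

fused = fuse_context(embeddings, subject_meta, events)
print(len(fused), len(fused[0]))                   # 3 epochs, 4 + 3 + 4 = 11 features
```

In a sketch like this, the fused sequence would then be passed to the sequence-level aggregator in place of the raw embeddings; the actual fusion point and mechanism in the paper may differ.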
Taxonomy
Topics: Obstructive Sleep Apnea Research · EEG and Brain-Computer Interfaces · Sleep and related disorders
