TL;DR
This study introduces two multi-scored sleep datasets and a benchmarking framework, demonstrating that advanced automated sleep staging methods can match or surpass human expert performance in both healthy and OSA-affected populations.
Contribution
The paper provides publicly available multi-scored sleep datasets and a novel benchmarking framework, along with a new deep learning approach that achieves state-of-the-art performance.
Findings
Automated methods can reach human-level performance in sleep staging.
SimpleSleepNet outperforms average human scorers on both datasets.
State-of-the-art automated sleep staging surpasses human scorers in accuracy.
Abstract
Sleep stage classification constitutes an important element of sleep disorder diagnosis. It relies on the visual inspection of polysomnography records by trained sleep technologists. Automated approaches have been designed to alleviate this resource-intensive task. However, such approaches are usually compared to a single human scorer's annotation, even though inter-rater agreement is only about 85%. The present study introduces two publicly available datasets: DOD-H, which includes 25 healthy volunteers, and DOD-O, which includes 55 patients suffering from obstructive sleep apnea (OSA). Both datasets have been scored by 5 sleep technologists from different sleep centers. We developed a framework to compare automated approaches to a consensus of multiple human scorers. Using this framework, we benchmarked and compared the main literature approaches. We also developed and benchmarked a new deep learning…
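As a rough illustration of comparing an automated approach to a consensus of multiple scorers, the sketch below builds a per-epoch majority vote from five scorers and measures how often a model agrees with it. The abstract does not say how the consensus is actually computed, so the plain majority vote, the alphabetical tie-break, the function names `majority_consensus` and `agreement`, and the toy labels are all assumptions made for illustration, not the paper's method or data.

```python
from collections import Counter


def majority_consensus(scorings):
    """Return one label per epoch, chosen by majority vote across scorers.

    `scorings` is a list of per-scorer label sequences of equal length.
    Ties are broken alphabetically, which is an arbitrary assumption
    made only to keep this sketch deterministic.
    """
    n_epochs = len(scorings[0])
    consensus = []
    for i in range(n_epochs):
        votes = Counter(scorer[i] for scorer in scorings)
        best_count = max(votes.values())
        tied = sorted(label for label, count in votes.items() if count == best_count)
        consensus.append(tied[0])
    return consensus


def agreement(predicted, reference):
    """Fraction of epochs on which two label sequences match."""
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


# Toy example: 5 hypothetical scorers, 4 epochs each (made-up labels).
scorers = [
    ["W", "N1", "N2", "REM"],
    ["W", "N2", "N2", "REM"],
    ["W", "N1", "N2", "N2"],
    ["W", "N1", "N3", "REM"],
    ["N1", "N2", "N2", "REM"],
]
consensus = majority_consensus(scorers)

# Hypothetical model output for the same 4 epochs.
model = ["W", "N2", "N2", "REM"]
model_score = agreement(model, consensus)

# Each scorer can be scored against the same reference for comparison.
scorer_scores = [agreement(s, consensus) for s in scorers]
```

In this simplified version each scorer is compared to a consensus that includes their own vote, which inflates human scores. A fairer comparison would exclude the scorer being evaluated from the vote.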
Taxonomy
Methods: Gated Recurrent Unit
