Quantifying the Generalization Gap in Seizure Detection: A Large-Scale Empirical Benchmark via the SzCORE Challenge

Jonathan Dan; Amirhossein Shahbazinia; Christodoulos Kechris; David Atienza

arXiv:2505.18191 · eess.SP · May 20, 2026

TL;DR

This large-scale empirical benchmark assesses 28 seizure detection algorithms on a strictly held-out private EEG dataset (65 subjects, 4,360 hours), revealing large performance variability among algorithms and highlighting the need for standardized evaluation to improve model generalization.

Contribution

The study provides a comprehensive evaluation of diverse algorithms on a large, private EEG dataset, establishing a rigorous benchmarking platform for seizure detection.

Findings

01. Top F1 score of 32% (sensitivity 37%, precision 29%)

02. Significant performance variability among algorithms

03. Discrepancy between peak performance and population stability
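
As a quick consistency check on the top-ranked result, the sketch below recomputes F1 from the reported sensitivity and precision. It assumes F1 is the usual harmonic mean of the two and that the reported percentages are pooled values; the paper's actual scoring procedure (for example event-based or per-subject averaging) is not described on this page, and the helper name `f1_score` is purely illustrative.

```python
def f1_score(sensitivity: float, precision: float) -> float:
    """Harmonic mean of sensitivity (recall) and precision.

    Assumption: this is the standard F1 definition; the benchmark's
    exact aggregation is not stated in the summary.
    """
    if sensitivity + precision == 0:
        return 0.0  # avoid division by zero when both metrics are zero
    return 2 * sensitivity * precision / (sensitivity + precision)


# Reported values for the top-ranked algorithm (rounded percentages).
top_f1 = f1_score(sensitivity=0.37, precision=0.29)
print(f"{top_f1:.3f}")  # prints 0.325
```

Under these assumptions the result, about 0.325, agrees with the reported 32% to within the rounding of the inputs.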

Abstract

Reliable automatic seizure detection from long-term electroencephalography (EEG) remains an unsolved challenge, as current models often fail to generalize across patients or clinical settings. Manual EEG review is still the standard of care, highlighting the need for robust models and standardized evaluation. The current literature often reports high efficacy, yet these models frequently fail when deployed to unseen patient populations. To rigorously assess this generalization gap, we conducted a large-scale empirical study evaluating 28 state-of-the-art algorithmic architectures, ranging from classical feature engineering to modern deep learning. These algorithms were collected by organizing a competition. A strictly held-out private dataset of continuous EEG recordings from 65 subjects, totaling 4,360 hours of data, was used to evaluate algorithm performance. Expert…
