DIVER-1: Deep Integration of Vast Electrophysiological Recordings at Scale
Danny Dongyeop Han, Yonghyeon Gwon, Ahhyun Lucy Lee, Taeyang Lee, Seong Jin Lee, Jubin Choi, Sebin Lee, Jihyun Bang, Seungju Lee, David Keetae Park, Shinjae Yoo, Chun Kee Chung, Jiook Cha

TL;DR
This paper introduces DIVER-1, a large electrophysiological foundation model trained on extensive EEG and iEEG data, revealing that data scale and training duration are more critical than model size for performance.
Contribution
It provides the first systematic scaling law analysis for electrophysiology models and presents DIVER-1, a family of models trained on the largest and most diverse corpus to date, which achieves state-of-the-art results on multiple benchmarks.
Findings
Performance is dominated by data scale and training duration, not model size.
DIVER-1 achieves state-of-the-art results on multiple benchmarks.
Scaling data and training time yields better results than increasing parameters alone.
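As a rough illustration of how such a comparison between data scale, training duration, and model size could be quantified, the sketch below fits a simple multiplicative power law in log space. Everything here is an assumption made for illustration: the functional form `loss = A * D^-a * E^-b * N^-c`, the function name `fit_power_law`, and the synthetic exponents are hypothetical and are not taken from the paper, whose actual scaling-law parameterization is not given in this summary.

```python
import numpy as np

def fit_power_law(data_hours, epochs, params, loss):
    """Least-squares fit of log(loss) = log(A) - a*log(D) - b*log(E) - c*log(N).

    Assumed (hypothetical) form; returns the prefactor A and one exponent
    per factor. A larger exponent means loss falls faster along that axis.
    """
    X = np.column_stack([
        np.ones_like(loss),   # intercept column, log(A)
        -np.log(data_hours),  # coefficient a (data scale)
        -np.log(epochs),      # coefficient b (training duration)
        -np.log(params),      # coefficient c (model size)
    ])
    coef, *_ = np.linalg.lstsq(X, np.log(loss), rcond=None)
    log_A, a, b, c = coef
    return {"A": float(np.exp(log_A)), "data": float(a),
            "epochs": float(b), "params": float(c)}

# Synthetic runs only: these exponents are made up for the demo, NOT paper results.
rng = np.random.default_rng(0)
n_runs = 200
D = np.exp(rng.uniform(np.log(10.0), np.log(1e4), n_runs))  # hours of data
E = rng.integers(1, 65, n_runs).astype(float)               # epochs, 1..64
N = np.exp(rng.uniform(np.log(1e6), np.log(1e9), n_runs))   # parameter count
noise = np.exp(rng.normal(0.0, 0.01, n_runs))               # multiplicative noise
loss = 5.0 * D**-0.30 * E**-0.15 * N**-0.03 * noise

fit = fit_power_law(D, E, N, loss)
for name in ("data", "epochs", "params"):
    print(f"{name:>6} exponent: {fit[name]:.3f}")
```

On real pretraining runs, the same regression would be applied to measured validation losses; an ordering of fitted exponents with data first, epochs second, and parameters last would correspond to the pattern described in the findings above. A pure power law with no irreducible-loss term is a deliberate simplification.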
Abstract
Unifying the vast heterogeneity of brain signals into a single foundation model is a longstanding challenge in neuroscience. Yet, even as large-scale pretraining becomes feasible, the field lacks principled guidance on how to scale electrophysiological foundation models under realistic data and compute constraints. We present the first systematic scaling law analysis spanning both EEG and iEEG, and uncover a distinct data-constrained characteristic. Unlike language modeling, performance in electrophysiology is dominated first by data scale, followed by training duration (epochs), with model parameter count playing a subordinate role under fixed compute budgets. This challenges the prevailing "bigger is better" heuristic derived from large language models. Building on these insights, we introduce DIVER-1, a family of models trained on the largest and most diverse corpus to date: 59.3k…
Peer Reviews
Decision · Submitted to ICLR 2026
(S1) I appreciate the attention to detail in this paper to subtle yet important factors such as data quality (e.g., the QA/QC pipeline), hyperparameter tuning (thorough tuning at smaller scales and use of $\mu$-parameterization for transfer to larger models), and choices like patch size and fine-tuning strategy that can significantly affect performance. (S2) To my knowledge, this is the first work to systematically study scaling with population-level electrophysiology data. Although the scope o
(W1) A major concern is the consistency of the Neuroprobe results. In the current version of the benchmark [1], the reported numbers for the Linear (Laplacian + spectrogram) baseline on SS-SM differ from those in the manuscript. For instance, the Global Optical Flow result for the linear baseline is $<0.62$, and DIVER-1 achieves roughly $0.62$, yet the Neuroprobe paper reports $0.625$. Some task results align with the Linear (spectrogram) baseline, others with Linear (Laplacian + spectrogram). I
The paper's core strength is its investigation of scaling laws for electrophysiology. Interesting findings include:
-- Studying cost-effective training recipes in data-constrained settings, such as not spending FLOPs on the largest models.
-- Studying architecture designs, such as a multi-domain reconstruction objective (learning time, FFT, and STFT) and a novel positional encoding (STCPE).
-- The collection of a large-scale Ephys pre-training dataset.
The effort put into this paper was imp
I have a number of concerns about this paper. None of these concerns are deal-breakers, but they add up to a non-trivial amount of total concern.
-- The majority of the experiments focused on iEEG (e.g., most of Figure 2), where including EEG data actually hurt performance (noted in Appendix D.4). Moreover, the limited EEG experiments show that the proposed model does not outperform existing EEG foundation models. Since the large majority of the training set is EEG, this makes the total numb
The study offers the first systematic scaling analysis for electrophysiology and provides actionable guidance on allocating compute between model size and epochs. The corpus scale and subject diversity are substantial, and the any-variate attention plus STCPE address heterogeneous montages. Multi-domain reconstruction is well motivated. NeuroProbe results are strong, and the compute-optimal frontier is practically useful for planning.
The EEG performance on FACED lags prior state of the art despite far larger pretraining, which undermines the universality of the scaling conclusions. Several comparisons are missing or limited, for example to models that use different pretext tasks or longer temporal receptive fields, which complicates claims of state of the art. Important dataset and preprocessing choices that can swing outcomes are scattered in appendices, including QAQC criteria, referencing, resampling, clipping, and filter
Taxonomy
Topics: EEG and Brain-Computer Interfaces · Functional Brain Connectivity Studies · Neural dynamics and brain function
