How Do Electrocardiogram Models Scale?
Jiawei Li, Fabio Bonassi, Ming Jin, Stefan Gustafsson, Johan Sundström, Thomas B. Schön, Antônio H. Ribeiro

TL;DR
This study systematically analyzes how different neural architectures and pre-training paradigms affect the scaling and transfer performance of ECG models across various data and model sizes.
Contribution
It provides the first comprehensive investigation of scaling laws for ECG models, comparing ResNet and Transformer architectures under supervised and self-supervised learning paradigms.
Findings
SSL models scale robustly across data and model sizes.
ResNets are more parameter-efficient for out-of-distribution generalization.
SSL achieves higher transfer efficiency and outperforms SL on unseen clinical tasks.
Abstract
While scaling laws have established a fundamental framework for foundation models in natural language processing, their applicability to electrocardiogram (ECG) models remains poorly characterized. Indeed, recent studies do not always yield consistent downstream gains as one increases the model size or pre-training dataset size of ECG models, leaving the exact roles of architectural inductive biases, pre-training paradigms, and expected improvements with size largely unanswered. In this work, we systematically investigate neural and loss-to-loss scaling laws within the ECG domain. By pre-training over models (ranging from K to M parameters) on the large-scale CODE dataset (M records), we decouple the effects of model architecture (ResNet vs. Transformer) and pre-training paradigm, namely supervised learning (SL) versus self-supervised learning (SSL). We found that…
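As a rough illustration of what fitting a neural scaling law involves, the sketch below fits a pure power law, loss ≈ a · N^(-b), by least squares in log-log space. The functional form, the helper name `fit_power_law`, and the data points are all assumptions made for illustration; the excerpt above does not state the form the authors fit, and the numbers are synthetic, not results from the paper.

```python
import math

def fit_power_law(sizes, losses):
    """Fit loss ~= a * size**(-b) by ordinary least squares on log-log data.

    Hypothetical helper for illustration; returns the pair (a, b).
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(v) for v in losses]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    # Slope and intercept of the best-fit line in log-log space.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope

# Synthetic points generated from a known power law (assumed values).
sizes = [1e5, 1e6, 1e7, 1e8]
losses = [2.0 * n ** -0.1 for n in sizes]

a, b = fit_power_law(sizes, losses)
print(f"a = {a:.3f}, b = {b:.3f}")
```

A larger fitted exponent b would mean the loss falls faster as the model or dataset grows. Real loss curves often flatten toward an irreducible floor, which this two-parameter form does not model.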
