CREMA: A Contrastive Regularized Masked Autoencoder for Robust ECG Diagnostics across Clinical Domains
Junho Song, Jong-Hwan Jang, DongGyun Hong, Joon-myoung Kwon, and Yong-Yeon Jo

TL;DR
CREMA is a self-supervised foundation model for ECG analysis that combines generative and contrastive learning to produce robust, generalizable representations, outperforming existing methods across diverse clinical environments.
Contribution
CREMA introduces a novel contrastive regularized masked autoencoder architecture with a Signal Transformer for robust ECG representation learning in clinical settings.
Findings
Outperforms supervised baselines and existing self-supervised models in both linear probing and fine-tuning evaluations.
Maintains high performance across diverse clinical domains.
Demonstrates robustness under real-world distribution shifts.
Abstract
Electrocardiogram (ECG) diagnosis remains challenging due to limited labeled data and the need to capture subtle yet clinically meaningful variations in rhythm and morphology. We present CREMA (Contrastive Regularized Masked Autoencoder), a foundation model for 12-lead ECGs designed to learn generalizable representations through self-supervised pretraining. CREMA combines generative learning and contrastive regularization via a Contrastive Regularized MAE loss, and employs a Signal Transformer (SiT) architecture to capture both local waveform details and global temporal dependencies. We evaluate CREMA on benchmark datasets and real-world clinical environments, including deployment scenarios with significant distribution shifts. CREMA outperforms supervised baselines and existing self-supervised models in both linear probing and fine-tuning evaluations. Notably, it maintains superior…
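The abstract names a Contrastive Regularized MAE loss but does not give its form here. As a rough illustration only, one common way to combine a generative and a contrastive objective is a masked reconstruction error plus a weighted InfoNCE term. The sketch below assumes that form; the function names, the weight `lam`, the `temperature`, the element-wise mask (real masked autoencoders usually mask whole patches), and the choice of InfoNCE are all assumptions, not details taken from the paper.

```python
import numpy as np


def masked_mse(x, x_hat, mask):
    """Mean squared error over masked positions only (mask is True where input was hidden)."""
    diff = (x - x_hat) ** 2
    return float(diff[mask].mean())


def info_nce(z_a, z_b, temperature=0.1):
    """InfoNCE over two embedding batches; row i of z_a and row i of z_b are the positive pair."""
    # L2-normalize so the dot product is cosine similarity.
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature
    # Subtract the row max for a numerically stable log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))


def combined_loss(x, x_hat, mask, z_a, z_b, lam=0.5, temperature=0.1):
    """Hypothetical combined objective: reconstruction + lam * contrastive term."""
    return masked_mse(x, x_hat, mask) + lam * info_nce(z_a, z_b, temperature)


# Toy data: 4 recordings, 12 leads, 250 samples each (shapes are illustrative).
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 12, 250))
mask = rng.random(x.shape) < 0.75               # hide roughly 75% of the samples
x_hat = x + 0.1 * rng.normal(size=x.shape)      # stand-in for a decoder output
z_a = rng.normal(size=(4, 16))                  # stand-in embeddings, view A
z_b = z_a + 0.05 * rng.normal(size=z_a.shape)   # stand-in embeddings, view B

print(combined_loss(x, x_hat, mask, z_a, z_b))
```

In this form, `lam` controls how strongly the contrastive term regularizes the reconstruction objective; setting `lam=0` recovers a plain masked-autoencoder loss.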
Taxonomy
Topics: ECG Monitoring and Analysis
