From Token to Rhythm: A Multi-Scale Approach for ECG-Language Pretraining
Fuying Wang, Jiacheng Xu, Lequan Yu

TL;DR
This paper introduces MELP, a multi-scale ECG-language pretraining model that captures hierarchical ECG features at token, beat, and rhythm levels to improve ECG analysis and diagnosis.
Contribution
The authors present MELP as the first ECG-language pretraining method to use hierarchical supervision across multiple time scales, with the aim of improving representation learning for clinical tasks.
Findings
Outperforms existing SSL methods on multiple ECG tasks
Effective in zero-shot ECG classification and transfer learning
Captures multi-scale ECG features for better generalization
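Zero-shot classification with a language-aligned encoder is commonly done by comparing a signal embedding against text embeddings of candidate labels. The sketch below illustrates that general idea only; it is not MELP's actual inference code. The function names, label prompts, and 3-dimensional toy embeddings are all hypothetical stand-ins for the outputs of a pretrained ECG encoder and text encoder.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def zero_shot_classify(ecg_embedding, label_embeddings):
    """Return the label whose text embedding is most similar to the ECG embedding.

    Both arguments are assumed to live in a shared embedding space,
    which is what ECG-language pretraining is meant to provide.
    """
    scores = {
        label: cosine(ecg_embedding, text_embedding)
        for label, text_embedding in label_embeddings.items()
    }
    best_label = max(scores, key=scores.get)
    return best_label, scores


# Toy values (hypothetical): in practice these come from the trained encoders.
label_embeddings = {
    "normal sinus rhythm": [1.0, 0.0, 0.0],
    "atrial fibrillation": [0.0, 1.0, 0.0],
}
ecg_embedding = [0.2, 0.9, 0.1]

prediction, scores = zero_shot_classify(ecg_embedding, label_embeddings)
print(prediction)
```

No labeled ECGs are needed at inference time in this setup: adding a new class only requires embedding a new text prompt.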
Abstract
Electrocardiograms (ECGs) play a vital role in monitoring cardiac health and diagnosing heart diseases. However, traditional deep learning approaches for ECG analysis rely heavily on large-scale manual annotations, which are both time-consuming and resource-intensive to obtain. To overcome this limitation, self-supervised learning (SSL) has emerged as a promising alternative, enabling the extraction of robust ECG representations that can be efficiently transferred to various downstream tasks. While previous studies have explored SSL for ECG pretraining and multi-modal ECG-language alignment, they often fail to capture the multi-scale nature of ECG signals. As a result, these methods struggle to learn generalized representations due to their inability to model the hierarchical structure of ECG data. To address this gap, we introduce MELP, a novel Multi-scale ECG-Language Pretraining…
Taxonomy
Topics: ECG Monitoring and Analysis · Machine Learning in Healthcare · Atrial Fibrillation Management and Outcomes
