Boosting Masked ECG-Text Auto-Encoders as Discriminative Learners
Hung Manh Pham, Aaqib Saeed, Dong Ma

TL;DR
D-BETA is a novel contrastive masked auto-encoder framework that enhances cross-modal ECG-text representations, significantly improving diagnostic performance with limited data and zero-shot capabilities.
Contribution
The paper introduces D-BETA, a new pre-training method combining generative and discriminative learning for ECG-text data, addressing modality disparities and data scarcity.
Findings
Achieves 15% AUC improvement with 1% training data
Outperforms existing methods in zero-shot performance by 2%
Demonstrates robustness across five public datasets
Abstract
The accurate interpretation of Electrocardiogram (ECG) signals is pivotal for diagnosing cardiovascular diseases. Integrating ECG signals with accompanying textual reports further holds immense potential to enhance clinical diagnostics by combining physiological data and qualitative insights. However, this integration faces significant challenges due to inherent modality disparities and the scarcity of labeled data for robust cross-modal learning. To address these obstacles, we propose D-BETA, a novel framework that pre-trains ECG and text data using a contrastive masked auto-encoder architecture. D-BETA uniquely combines the strengths of generative with boosted discriminative capabilities to achieve robust cross-modal representations. This is accomplished through masked modality modeling, specialized loss functions, and an improved negative sampling strategy tailored for cross-modal…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsECG Monitoring and Analysis
