Explainable Deep Neural Network for Multimodal ECG Signals: Intermediate vs Late Fusion
Timothy Oladunni, Ehimen Aneni

TL;DR
This study compares intermediate and late fusion strategies in multimodal deep neural networks for ECG-based cardiovascular disease classification, demonstrating that intermediate fusion yields higher accuracy and interpretability.
Contribution
It provides a comprehensive evaluation of fusion strategies in multimodal ECG analysis, highlighting the superiority of intermediate fusion for clinical applications.
Findings
Intermediate fusion achieved 97% accuracy.
Saliency maps aligned with ECG signals, enhancing interpretability.
Statistical analysis confirmed a dependency between ECG features and model explanations.
Abstract
The limitations of unimodal deep learning models, particularly their tendency to overfit and their limited generalizability, have renewed interest in multimodal fusion strategies. Multimodal deep neural networks (MDNNs) can integrate diverse data domains and offer a promising solution for robust and accurate predictions. However, the choice of optimal fusion strategy, intermediate (feature-level) versus late (decision-level) fusion, remains insufficiently examined, especially in high-stakes clinical contexts such as ECG-based cardiovascular disease (CVD) classification. This study investigates the comparative effectiveness of intermediate and late fusion strategies using ECG signals across three domains: time, frequency, and time-frequency. A series of experiments was conducted to identify the highest-performing fusion architecture. Results demonstrate that intermediate…
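The difference between the two fusion strategies named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' architecture: the feature dimension, the random weights, the single linear classifier per head, and the averaging rule used for late fusion are all assumptions chosen for illustration, and the random arrays stand in for the outputs of per-domain encoders.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    # Numerically stable softmax over the class axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical sizes, not taken from the paper.
n_samples, feat_dim, n_classes = 4, 8, 3

# Stand-ins for features extracted by one encoder per ECG domain.
domains = ("time", "frequency", "time_frequency")
feats = {d: rng.normal(size=(n_samples, feat_dim)) for d in domains}

# Intermediate (feature-level) fusion: concatenate the per-domain
# features, then apply ONE classifier to the joint representation.
fused = np.concatenate([feats[d] for d in domains], axis=1)
w_joint = rng.normal(size=(fused.shape[1], n_classes))
p_intermediate = softmax(fused @ w_joint)

# Late (decision-level) fusion: classify each domain separately,
# then combine the predictions (here: a plain average, an assumption).
per_domain_probs = []
for d in domains:
    w_d = rng.normal(size=(feat_dim, n_classes))
    per_domain_probs.append(softmax(feats[d] @ w_d))
p_late = np.mean(per_domain_probs, axis=0)

print(p_intermediate.shape, p_late.shape)
```

In the intermediate variant the classifier sees all three domains at once and can learn cross-domain interactions; in the late variant each domain is classified in isolation and only the decisions are merged.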
Taxonomy
Topics: ECG Monitoring and Analysis
