MDA: An Interpretable and Scalable Multi-Modal Fusion under Missing Modalities and Intrinsic Noise Conditions
Lin Fan, Yafei Ou, Cenyang Zheng, Pengyu Dai, Tamotsu Kamishima, Masayuki Ikebe, Kenji Suzuki, Xun Gong

TL;DR
This paper presents MDA, a multi-modal fusion model that adaptively handles missing data and noise, improving interpretability and scalability in medical diagnostics with state-of-the-art performance.
Contribution
The paper introduces the MDA model, which constructs linear relationships between modalities using continuous attention, effectively addressing heterogeneity, missing data, noise, and interpretability challenges.
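The idea of shifting attention away from missing or noisy modalities can be illustrated with a minimal sketch. This is not the paper's continuous-attention formulation, which this summary does not specify. It swaps in a plain masked softmax over per-modality scores, and the names `modality_attention`, `fuse`, `scores`, and `present` are hypothetical, as are the toy values.

```python
import math

def modality_attention(scores, present):
    """Masked softmax over per-modality relevance scores.

    scores  : one hypothetical relevance score per modality
    present : False where the modality is missing
    Assumption: a plain softmax stands in for the paper's attention.
    """
    if len(scores) != len(present):
        raise ValueError("scores and present must have the same length")
    if not any(present):
        raise ValueError("at least one modality must be present")
    # Subtract the max over present modalities for numerical stability.
    top = max(s for s, p in zip(scores, present) if p)
    # Missing modalities contribute exactly zero weight.
    exps = [math.exp(s - top) if p else 0.0 for s, p in zip(scores, present)]
    total = sum(exps)
    return [e / total for e in exps]

def fuse(features, weights):
    """Weighted sum of equal-length per-modality feature vectors."""
    dim = len(features[0])
    return [sum(w * f[i] for w, f in zip(weights, features)) for i in range(dim)]

# Three toy modalities; the third is marked missing and gets zero weight.
features = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
weights = modality_attention([2.0, 0.5, 0.0], [True, True, False])
fused = fuse(features, weights)
```

In this sketch a low score (standing in for a low-correlation or noisy modality) receives a small weight, and a missing modality receives none, so the fused vector is dominated by the more informative inputs. How MDA actually computes its scores is described in the paper itself.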
Findings
MDA maintains state-of-the-art performance across multiple datasets.
MDA aligns with clinical diagnostic standards.
MDA effectively reduces attention to low-correlation or noisy modalities.
Abstract
Multi-modal learning has shown exceptional performance in various tasks, especially in medical applications, where it integrates diverse medical information for comprehensive diagnostic evidence. However, several challenges remain in multi-modal learning: (1) heterogeneity between modalities, (2) uncertainty in missing modalities, (3) the influence of intrinsic noise, and (4) the interpretability of fusion results. This paper introduces the Modal-Domain Attention (MDA) model to address these challenges. MDA constructs linear relationships between modalities through continuous attention. Because it can adaptively allocate dynamic attention to different modalities, MDA can reduce attention to low-correlation data, missing modalities, or modalities with inherent noise, thereby maintaining SOTA performance across various tasks on multiple public datasets. Furthermore, our…
Taxonomy
Topics: Speech Recognition and Synthesis · Time Series Analysis and Forecasting · Neural Networks and Applications
