TL;DR
This paper introduces GAAL (Gradient-Aligned Alternating Learning), a method for multimodal tabular-image fusion that aligns modality gradients to resolve gradient conflicts and improve fusion performance.
Contribution
The paper proposes a new gradient alignment paradigm with uncertainty-based gradient surgery to enhance multimodal fusion, outperforming existing state-of-the-art methods.
Findings
GAAL outperforms state-of-the-art baselines on benchmark datasets.
Gradient alignment improves multimodal fusion effectiveness.
Uncertainty-based gradient surgery effectively mitigates gradient conflicts.
Abstract
Multimodal tabular-image fusion is an emerging task that has received increasing attention in various domains. However, existing methods may be hindered by gradient conflicts between modalities, which mislead the optimization of the unimodal learners. In this paper, we propose a novel Gradient-Aligned Alternating Learning (GAAL) paradigm to address this issue by aligning modality gradients. Specifically, GAAL adopts alternating unimodal learning and a shared classifier to decouple the multimodal gradient and facilitate interaction. Furthermore, we design uncertainty-based cross-modal gradient surgery to selectively align cross-modal gradients, thereby steering the shared parameters to benefit all modalities. As a result, GAAL can provide effective unimodal assistance and help boost the overall fusion performance. Empirical experiments on widely used datasets reveal the superiority of our…
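The abstract does not state the exact surgery rule, so the following is only a minimal sketch of what "selectively aligning cross-modal gradients" on shared parameters could look like. It substitutes a PCGrad-style projection for the paper's surgery and uses prediction entropy as the uncertainty score. Both choices, and all function names (`entropy`, `project_conflict`, `uncertainty_gated_surgery`), are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def entropy(probs):
    """Shannon entropy of a probability vector (assumed uncertainty score)."""
    p = np.clip(probs, 1e-12, 1.0)
    return float(-(p * np.log(p)).sum())


def project_conflict(g, g_ref):
    """PCGrad-style step: if g conflicts with g_ref (negative dot product),
    remove the component of g that points against g_ref; otherwise keep g."""
    dot = float(g @ g_ref)
    if dot >= 0.0:
        return g.copy()
    return g - dot / (float(g_ref @ g_ref) + 1e-12) * g_ref


def uncertainty_gated_surgery(g_cur, g_other, probs_cur, probs_other):
    """Hypothetical gate: align the current modality's gradient to the other
    modality's gradient only when the other modality is less uncertain."""
    if entropy(probs_other) < entropy(probs_cur):
        return project_conflict(g_cur, g_other)
    return g_cur.copy()


# Toy gradients on the shared classifier parameters (made-up numbers).
g_img = np.array([1.0, -1.0])   # image-branch gradient
g_tab = np.array([-1.0, 0.0])   # tabular-branch gradient; conflicts with g_img
p_img = np.array([0.5, 0.5])    # uncertain image prediction
p_tab = np.array([0.9, 0.1])    # confident tabular prediction

g_aligned = uncertainty_gated_surgery(g_img, g_tab, p_img, p_tab)
g_kept = uncertainty_gated_surgery(g_img, g_tab, p_tab, p_img)
```

In this toy case the image gradient conflicts with the tabular gradient and the tabular prediction is more confident, so the conflicting component is projected out of `g_aligned`. With the confidence roles swapped, `g_kept` is returned unchanged. In an alternating scheme, such a step would be applied to the shared-classifier gradient each time one modality takes its turn.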
