Harmonized Tabular-Image Fusion via Gradient-Aligned Alternating Learning

Longfei Huang; Yang Yang

arXiv:2604.01579·cs.CV·April 3, 2026

TL;DR

This paper introduces GAAL, a gradient-aligned alternating learning method for multimodal tabular-image fusion that targets gradient conflicts between modalities in order to improve fusion performance.

Contribution

The paper proposes a gradient-alignment paradigm that pairs alternating unimodal learning through a shared classifier with uncertainty-based cross-modal gradient surgery; the authors report that it outperforms existing state-of-the-art methods.

Findings

01

GAAL outperforms state-of-the-art baselines on benchmark datasets.

02

Gradient alignment improves multimodal fusion effectiveness.

03

Uncertainty-based gradient surgery effectively mitigates gradient conflicts.

Abstract

Multimodal tabular-image fusion is an emerging task that has received increasing attention in various domains. However, existing methods may be hindered by gradient conflicts between modalities, which mislead the optimization of the unimodal learners. In this paper, we propose a novel Gradient-Aligned Alternating Learning (GAAL) paradigm to address this issue by aligning modality gradients. Specifically, GAAL adopts alternating unimodal learning and a shared classifier to decouple the multimodal gradient and facilitate interaction. Furthermore, we design uncertainty-based cross-modal gradient surgery to selectively align cross-modal gradients, thereby steering the shared parameters to benefit all modalities. As a result, GAAL can provide effective unimodal assistance and help boost the overall fusion performance. Empirical experiments on widely used datasets reveal the superiority of our…
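
The abstract does not spell out how the gradient surgery works, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a PCGrad-style projection (drop the component of one gradient that opposes another) and a hypothetical gating rule in which the current modality's gradient on the shared parameters is projected only when the other modality's prediction has lower entropy. The function names, the entropy-based uncertainty measure, and the gating rule are all illustrative assumptions.

```python
import math


def entropy(probs):
    """Shannon entropy (natural log) of a probability vector; used here as an assumed uncertainty score."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def project_out(g, ref):
    """Return g with its component along ref removed, but only when g and ref conflict (dot < 0)."""
    d = dot(g, ref)
    if d >= 0.0:
        # No conflict (this also covers a zero reference gradient): leave g unchanged.
        return list(g)
    scale = d / dot(ref, ref)
    return [a - scale * b for a, b in zip(g, ref)]


def uncertainty_gated_surgery(g_cur, g_other, probs_cur, probs_other):
    """Hypothetical gating rule (an assumption, not taken from the paper):
    project the current modality's shared-parameter gradient only if the
    other modality is more confident, i.e. has lower predictive entropy."""
    if entropy(probs_other) < entropy(probs_cur):
        return project_out(g_cur, g_other)
    return list(g_cur)


if __name__ == "__main__":
    g_image = [1.0, -1.0]    # toy gradient from the image step
    g_tabular = [0.0, 1.0]   # toy stored gradient from the tabular step
    aligned = uncertainty_gated_surgery(g_image, g_tabular, [0.5, 0.5], [0.9, 0.1])
    print(aligned, dot(aligned, g_tabular))
```

In this toy case the two gradients conflict and the tabular prediction is the more confident one, so the image gradient loses its opposing component and ends up orthogonal to the tabular gradient. In an alternating scheme, such a step would be applied to the shared classifier's gradient each time one modality is updated.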

Code & Models

Repositories

njustkmg/ICME26-GAAL
github
