OptiMAG: Structure-Semantic Alignment via Unbalanced Optimal Transport

Yilong Zuo; Xunkai Li; Zhihan Zhang; Qiangqiang Dai; Ronghua Li; Guoren Wang

arXiv:2601.22856 · cs.LG · February 2, 2026

TL;DR

OptiMAG adds an unbalanced optimal transport regularizer that aligns semantic and structural information in multimodal attributed graphs, improving node classification, link prediction, and multimodal generation (graph2text, graph2image).

Contribution

The paper proposes a novel regularization framework that uses the Fused Gromov-Wasserstein distance for structural-semantic alignment in multimodal attributed graphs and can be added as a plug-in to existing models.
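To make the Fused Gromov-Wasserstein (FGW) term concrete, the sketch below evaluates the standard balanced FGW objective with squared structure loss at a fixed coupling. It is an illustration only: the function name `fgw_cost`, the trade-off weight `alpha`, and the toy matrices are assumptions, and the paper's unbalanced, neighborhood-level formulation is not reproduced here because this page does not give it.

```python
def fgw_cost(M, C1, C2, pi, alpha=0.5):
    """Evaluate the balanced FGW objective (squared loss) at a fixed coupling.

    M[i][j]  : feature cost between node i of graph 1 and node j of graph 2
    C1, C2   : intra-graph structure matrices (e.g. adjacency or distances)
    pi[i][j] : mass transported from node i to node j
    alpha    : hypothetical trade-off weight, 0 = features only, 1 = structure only
    """
    n, m = len(C1), len(C2)
    # Wasserstein-style term: cost of matching node features.
    feature = sum(M[i][j] * pi[i][j] for i in range(n) for j in range(m))
    # Gromov-Wasserstein-style term: mismatch between pairwise structures.
    structure = sum(
        (C1[i][k] - C2[j][l]) ** 2 * pi[i][j] * pi[k][l]
        for i in range(n) for j in range(m)
        for k in range(n) for l in range(m)
    )
    return (1 - alpha) * feature + alpha * structure


# Toy example (assumed data): two identical 2-node graphs.
C = [[0.0, 1.0], [1.0, 0.0]]          # one edge between the two nodes
M = [[0.0, 1.0], [1.0, 0.0]]          # matching node i to itself costs nothing
aligned = [[0.5, 0.0], [0.0, 0.5]]    # each node mapped to its counterpart
swapped = [[0.0, 0.5], [0.5, 0.0]]    # nodes mapped to the other node

cost_aligned = fgw_cost(M, C, C, aligned)
cost_swapped = fgw_cost(M, C, C, swapped)
```

In this toy case the swapped coupling still preserves structure, since the graph is symmetric, so only the feature term penalizes it. A real solver would minimize this objective over couplings rather than evaluate it at a given one.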

Findings

01. Outperforms baselines on node classification and link prediction.
02. Enhances multimodal generation tasks like graph2text and graph2image.
03. Effectively mitigates structural-semantic conflicts in MAGs.

Abstract

Multimodal Attributed Graphs (MAGs) have been widely adopted for modeling complex systems by integrating multimodal information, such as text and images, on nodes. However, we identify a discrepancy between the implicit semantic structure induced by different modality embeddings and the explicit graph structure. For instance, neighbors in the explicit graph structure may be close in one modality but distant in another. Since existing methods typically perform message passing over the fixed explicit graph structure, they inadvertently aggregate dissimilar features, introducing modality-specific noise and impeding effective node representation learning. To address this, we propose OptiMAG, an Unbalanced Optimal Transport-based regularization framework. OptiMAG employs the Fused Gromov-Wasserstein distance to explicitly guide cross-modal structural consistency within local neighborhoods,…
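The discrepancy described in the abstract, where graph neighbors are close in one modality but distant in another, can be made tangible with a small diagnostic. The sketch below is not part of OptiMAG: the function `edge_modality_gap`, the use of cosine similarity, and the toy embeddings are all assumptions chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def edge_modality_gap(edges, text_emb, image_emb):
    """For each edge, return |text similarity - image similarity|.

    A large gap flags neighbors that one modality considers similar
    and the other considers dissimilar.
    """
    gaps = {}
    for u, v in edges:
        sim_text = cosine(text_emb[u], text_emb[v])
        sim_image = cosine(image_emb[u], image_emb[v])
        gaps[(u, v)] = abs(sim_text - sim_image)
    return gaps


# Toy embeddings (assumed data): node 1 matches node 0 in text but not in image.
text_emb = {0: [1.0, 0.0], 1: [1.0, 0.0], 2: [1.0, 0.0]}
image_emb = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 0.0]}
edges = [(0, 1), (0, 2)]

gaps = edge_modality_gap(edges, text_emb, image_emb)
```

Here edge (0, 1) has a gap of 1 while edge (0, 2) has a gap of 0, so message passing over edge (0, 1) would mix image features that the image modality regards as unrelated.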

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Graph Neural Networks · Explainable Artificial Intelligence (XAI)