MyGram: Modality-aware Graph Transformer with Global Distribution for Multi-modal Entity Alignment

Zhifei Li; Ziyue Qin; Xiangyu Luo; Xiaoju Hou; Yue Zhao; Miao Zhang; Zhifang Huang; Kui Xiao; Bing Yang

arXiv:2601.11885 · cs.AI · January 21, 2026

TL;DR

MyGram is a multi-modal entity alignment method that combines a modality-aware graph transformer with a global distribution constraint to improve semantic matching of entities across knowledge graphs.

Contribution

The paper introduces a modality diffusion learning module, which captures deep structural contextual information within each modality, and a Gram Loss, which regularizes the multi-modal features toward global distribution consistency.

Findings

01 Outperforms baseline models on five datasets

02 Achieves up to 9.9% improvement in Hits@1

03 Demonstrates effectiveness of global distribution regularization
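Hits@1, the metric cited in the findings, is commonly defined in entity alignment as the fraction of source entities whose correct counterpart is ranked first among all candidates. The following is a minimal sketch of that standard definition, not the paper's evaluation code. It assumes a square similarity matrix in which the gold match for source entity `i` is target entity `i`; the function name `hits_at_k` and the toy scores are hypothetical.

```python
import numpy as np


def hits_at_k(similarity: np.ndarray, k: int = 1) -> float:
    """Fraction of source entities whose gold target ranks within the top k.

    Assumption: similarity[i, j] scores source entity i against target
    entity j, and the gold counterpart of source i is target i.
    """
    n = similarity.shape[0]
    gold_scores = similarity[np.arange(n), np.arange(n)]
    # Rank of the gold target = 1 + number of candidates scored strictly higher.
    ranks = (similarity > gold_scores[:, None]).sum(axis=1) + 1
    return float((ranks <= k).mean())


# Toy similarity matrix (made-up values, for illustration only).
sim = np.array([
    [0.9, 0.1, 0.0],  # gold target ranked 1st
    [0.2, 0.3, 0.8],  # gold target ranked 2nd
    [0.1, 0.2, 0.7],  # gold target ranked 1st
])

print(hits_at_k(sim, k=1))
print(hits_at_k(sim, k=2))
```

In this toy case two of the three source entities have their gold target ranked first, so Hits@1 is 2/3, and all three fall within the top two.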

Abstract

Multi-modal entity alignment aims to identify equivalent entities between two multi-modal knowledge graphs by integrating multi-modal data, such as images and text, to enrich the semantic representations of entities. However, existing methods may overlook the structural contextual information within each modality, making them vulnerable to interference from shallow features. To address these challenges, we propose MyGram, a modality-aware graph transformer with global distribution for multi-modal entity alignment. Specifically, we develop a modality diffusion learning module to capture deep structural contextual information within modalities and enable fine-grained multi-modal fusion. In addition, we introduce a Gram Loss that acts as a regularization constraint by minimizing the volume of a 4-dimensional parallelotope formed by multi-modal features, thereby achieving global…
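For intuition about the Gram Loss: the volume of a parallelotope spanned by a set of vectors equals the square root of the determinant of their Gram matrix (the matrix of pairwise inner products). The sketch below computes that quantity with NumPy. The four-vector setup, the unit normalization, the embedding size of 64, and the names `gram_volume` and `normalize` are illustrative assumptions; the abstract does not specify how MyGram constructs the parallelotope or how the loss is weighted.

```python
import numpy as np


def normalize(vectors: np.ndarray) -> np.ndarray:
    """Scale each row to unit length (an assumption made for this sketch)."""
    return vectors / np.linalg.norm(vectors, axis=-1, keepdims=True)


def gram_volume(vectors: np.ndarray) -> float:
    """Volume of the parallelotope spanned by the rows of `vectors`.

    Uses the identity volume = sqrt(det(V V^T)).
    """
    gram = vectors @ vectors.T
    det = np.linalg.det(gram)
    # Clamp tiny negative values caused by floating-point round-off.
    return float(np.sqrt(max(det, 0.0)))


rng = np.random.default_rng(0)

# Four hypothetical modality embeddings of one entity, nearly parallel.
base = rng.normal(size=64)
aligned = normalize(base + 0.01 * rng.normal(size=(4, 64)))

# Four unrelated random embeddings, close to mutually orthogonal.
scattered = normalize(rng.normal(size=(4, 64)))

print("aligned volume:  ", gram_volume(aligned))
print("scattered volume:", gram_volume(scattered))
```

Nearly parallel vectors span an almost flat parallelotope, so the volume approaches zero, while mutually orthogonal unit vectors give a volume of 1. Minimizing the volume therefore pulls the vectors toward a common direction.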

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Graph Neural Networks · Domain Adaptation and Few-Shot Learning