Disentangling Homophily and Heterophily in Multimodal Graph Clustering
Zhaochen Guo, Zhixiang Shen, Xuanting Xie, Liangjian Wen, Zhao Kang

TL;DR
This paper introduces DMGC, a novel framework for multimodal graph clustering that disentangles homophilic and heterophilic relationships, improving clustering performance on complex multimodal graphs.
Contribution
The paper proposes a new framework, DMGC, which decomposes multimodal graphs into homophily and heterophily views and employs dual-frequency fusion for enhanced clustering.
Findings
DMGC achieves state-of-the-art results on multiple datasets.
Disentangling graph views improves clustering accuracy.
The approach generalizes across diverse multimodal graph types.
Abstract
Multimodal graphs, which integrate unstructured heterogeneous data with structured interconnections, offer substantial real-world utility but remain insufficiently explored in unsupervised learning. In this work, we initiate the study of multimodal graph clustering, aiming to bridge this critical gap. Through empirical analysis, we observe that real-world multimodal graphs often exhibit hybrid neighborhood patterns, combining both homophilic and heterophilic relationships. To address this challenge, we propose a novel framework, Disentangled Multimodal Graph Clustering (DMGC), which decomposes the original hybrid graph into two complementary views: (1) a homophily-enhanced graph that captures cross-modal class consistency, and (2) heterophily-aware graphs that preserve modality-specific inter-class distinctions. We introduce a Multimodal Dual-frequency Fusion…
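To make the two-view idea concrete, the following is a minimal sketch of generic dual-frequency graph filtering, not the authors' implementation. It assumes the two views are already given as adjacency matrices, applies a low-pass filter over the homophily view and a high-pass filter over a heterophily view, and fuses the results by concatenation. The function names, the specific filters, and the concatenation step are all illustrative assumptions; the abstract as excerpted here does not specify them.

```python
import numpy as np

def normalized_adjacency(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2}; isolated nodes get zero rows."""
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    inv_sqrt[nonzero] = deg[nonzero] ** -0.5
    return adj * inv_sqrt[:, None] * inv_sqrt[None, :]

def dual_frequency_features(adj_homo, adj_hetero, x):
    """Hypothetical helper: concatenate low-pass and high-pass filtered features.

    Assumption (not from the paper): low-pass = (I + A_hat) / 2 on the
    homophily view, high-pass = (I - A_hat) / 2 on the heterophily view.
    """
    x = np.asarray(x, dtype=float)
    eye = np.eye(x.shape[0])
    # Low-pass: average each node with its (mostly same-class) neighbors.
    low = 0.5 * (eye + normalized_adjacency(adj_homo)) @ x
    # High-pass: emphasize the difference from (mostly other-class) neighbors.
    high = 0.5 * (eye - normalized_adjacency(adj_hetero)) @ x
    return np.concatenate([low, high], axis=1)

# Toy graph: nodes 0,1 belong to one class and nodes 2,3 to another.
adj_homo = np.array([[0, 1, 0, 0],
                     [1, 0, 0, 0],
                     [0, 0, 0, 1],
                     [0, 0, 1, 0]])
adj_hetero = np.array([[0, 0, 0, 0],
                       [0, 0, 1, 0],
                       [0, 1, 0, 0],
                       [0, 0, 0, 0]])
x = np.array([[1.0], [1.0], [-1.0], [-1.0]])
fused = dual_frequency_features(adj_homo, adj_hetero, x)
```

In this toy case the low-pass channel leaves the class-consistent signal unchanged, while the high-pass channel amplifies the contrast at the two nodes joined by the cross-class edge. The fused features could then be passed to any standard clustering routine.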
