UniGraph2: Learning a Unified Embedding Space to Bind Multimodal Graphs
Yufei He, Yuan Sui, Xiaoxin He, Yue Liu, Yifei Sun, Bryan Hooi

TL;DR
UniGraph2 introduces a unified embedding model for multimodal graphs that captures complex relationships across different data modalities, improving performance on various graph-based tasks.
Contribution
UniGraph2 is a cross-domain graph foundation model that combines modality-specific encoders, a GNN, and a multi-graph pre-training algorithm for representation learning on multimodal graphs.
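As a rough illustration of how modality-specific encoders and a GNN could fit together, the sketch below projects each modality's raw features into a shared dimension and then runs one round of mean-aggregation message passing. It is a toy under stated assumptions, not the authors' implementation: the random linear maps stand in for real modality encoders, mean aggregation stands in for the GNN, and every name (`make_projection`, `encode_node`, `gnn_layer`) is hypothetical. The pre-training algorithm is omitted entirely.

```python
import random

DIM = 4  # size of the shared embedding space (arbitrary choice for this sketch)

def make_projection(in_dim, out_dim, seed):
    """Random linear map standing in for a modality-specific encoder."""
    rng = random.Random(seed)
    return [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(out_dim)]

def project(matrix, vector):
    """Matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def mean(vectors):
    """Element-wise mean of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

# Hypothetical encoders: text features are 3-d, image features are 5-d.
ENCODERS = {
    "text": make_projection(3, DIM, seed=0),
    "image": make_projection(5, DIM, seed=1),
}

def encode_node(features):
    """Encode each available modality, then average into one DIM-d vector."""
    return mean([project(ENCODERS[m], x) for m, x in features.items()])

def gnn_layer(embeddings, edges):
    """One round of mean aggregation over each node and its neighbours."""
    neighbours = {node: [node] for node in embeddings}  # include a self-loop
    for u, v in edges:  # edges are treated as undirected
        neighbours[u].append(v)
        neighbours[v].append(u)
    return {
        node: mean([embeddings[n] for n in nbrs])
        for node, nbrs in neighbours.items()
    }

# Toy graph: node "a" has text and image, "b" text only, "c" image only.
node_features = {
    "a": {"text": [1.0, 0.0, 2.0], "image": [0.5, 0.1, 0.0, 0.3, 0.9]},
    "b": {"text": [0.0, 1.0, 1.0]},
    "c": {"image": [0.2, 0.2, 0.2, 0.2, 0.2]},
}
edges = [("a", "b"), ("a", "c")]

initial = {node: encode_node(f) for node, f in node_features.items()}
unified = gnn_layer(initial, edges)
```

After the GNN step every node holds a vector of the same length regardless of which modalities it started with, which is the sense in which the embedding space is shared. Averaging is only the simplest way to merge modalities; the source does not say how UniGraph2 combines them.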
Findings
Outperforms state-of-the-art models on multiple tasks
Effectively captures multimodal and graph structure information
Demonstrates scalability and robustness across domains
Abstract
Existing foundation models, such as CLIP, aim to learn a unified embedding space for multimodal data, enabling a wide range of downstream web-based applications like search, recommendation, and content classification. However, these models often overlook the inherent graph structures in multimodal datasets, where entities and their relationships are crucial. Multimodal graphs (MMGs) represent such graphs where each node is associated with features from different modalities, while the edges capture the relationships between these entities. On the other hand, existing graph foundation models primarily focus on text-attributed graphs (TAGs) and are not designed to handle the complexities of MMGs. To address these limitations, we propose UniGraph2, a novel cross-domain graph foundation model that enables general representation learning on MMGs, providing a unified embedding space. UniGraph2…
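To make the definition of a multimodal graph concrete, here is a minimal data-structure sketch: each node carries a dictionary of per-modality features, and edges record relationships between nodes. The class and method names (`Node`, `MultimodalGraph`, `add_node`, `add_edge`, `neighbours`, `modalities`) and the product-catalogue example are assumptions made for illustration; they do not come from the paper or its code.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    node_id: str
    # Maps a modality name (e.g. "text", "image") to that node's raw feature.
    features: dict = field(default_factory=dict)

@dataclass
class MultimodalGraph:
    nodes: dict = field(default_factory=dict)  # node_id -> Node
    edges: set = field(default_factory=set)    # set of frozenset({u, v})

    def add_node(self, node_id, **features):
        """Register a node with whichever modalities it happens to have."""
        self.nodes[node_id] = Node(node_id, dict(features))

    def add_edge(self, u, v):
        """Add an undirected edge between two existing nodes."""
        if u not in self.nodes or v not in self.nodes:
            raise KeyError("both endpoints must be added before the edge")
        self.edges.add(frozenset((u, v)))

    def neighbours(self, node_id):
        """Sorted ids of nodes that share an edge with node_id."""
        return sorted(
            other
            for edge in self.edges if node_id in edge
            for other in edge if other != node_id
        )

    def modalities(self):
        """Sorted names of every modality present anywhere in the graph."""
        return sorted({m for n in self.nodes.values() for m in n.features})

# Hypothetical product graph: items linked when they are bought together.
g = MultimodalGraph()
g.add_node("item1", text="red running shoes", image="shoe.jpg")
g.add_node("item2", text="sports socks")
g.add_node("item3", image="bottle.jpg")
g.add_edge("item1", "item2")
g.add_edge("item1", "item3")

print(g.neighbours("item1"))  # ['item2', 'item3']
print(g.modalities())         # ['image', 'text']
```

Nodes may lack some modalities, as `item2` and `item3` do here, so a model over such a graph has to cope with both mixed feature types and the edge structure.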
Taxonomy
Topics: Text and Document Classification Technologies · Semantic Web and Ontologies · Advanced Graph Neural Networks
Methods: Contrastive Language-Image Pre-training · ADaptive gradient method with the OPTimal convergence rate · Graph Neural Network · ALIGN · Focus
