Multimodal learning with graphs
Yasha Ektefaie, George Dasoulas, Ayush Noori, Maha Farhat, Marinka Zitnik

TL;DR
This paper introduces a blueprint for multimodal graph learning, addressing challenges in combining diverse data modalities and leveraging cross-modal dependencies with graph-based methods.
Contribution
It provides a comprehensive framework to categorize, analyze, and guide the design of multimodal graph AI models, unifying various existing approaches.
Findings
Categorizes multimodal graph learning into image-intensive, knowledge-grounded, and language-intensive models.
Provides guidelines for designing new multimodal graph models.
Analyzes existing methods within the proposed blueprint.
Abstract
Artificial intelligence for graphs has achieved remarkable success in modeling complex systems, ranging from dynamic networks in biology to interacting particle systems in physics. However, the increasingly heterogeneous graph datasets call for multimodal methods that can combine different inductive biases: the set of assumptions that algorithms use to make predictions for inputs they have not encountered during training. Learning on multimodal datasets presents fundamental challenges because the inductive biases can vary by data modality and graphs might not be explicitly given in the input. To address these challenges, multimodal graph AI methods combine different modalities while leveraging cross-modal dependencies using graphs. Diverse datasets are combined using graphs and fed into sophisticated multimodal architectures, specified as image-intensive, knowledge-grounded and…
Taxonomy
Topics: Advanced Graph Neural Networks · Topic Modeling
