Graph World Model
Tao Feng, Yexin Wu, Guanyu Lin, Jiaxuan You

TL;DR
The paper introduces the Graph World Model (GWM), a versatile world model that integrates structured graph data with multi-modal information, enabling diverse tasks and demonstrating strong zero-shot and few-shot capabilities across multiple domains.
Contribution
GWM is the first to unify unstructured and graph-structured data with multi-modal information in a single world model supporting diverse tasks.
Findings
GWM outperforms or matches domain-specific baselines on six tasks.
GWM benefits from multi-hop structures for improved performance.
GWM exhibits strong zero-shot and few-shot learning capabilities.
Abstract
World models (WMs) demonstrate strong capabilities in prediction, generation, and planning tasks. Existing WMs primarily focus on unstructured data and cannot leverage the ubiquitous structured data, often represented as graphs, in the digital world. While multiple graph foundation models have been proposed, they focus on graph learning tasks and cannot extend to diverse multi-modal data and interdisciplinary tasks. To address these challenges, we propose the Graph World Model (GWM), a world model that supports both unstructured and graph-structured states with multi-modal information and represents diverse tasks as actions. The core of a GWM is a generic message-passing algorithm to aggregate structured information, either over a unified multi-modal token space by converting multi-modal data into text (GWM-T) or a unified multi-modal embedding space by modality-specific encoders…
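The two aggregation routes named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: mean aggregation stands in for the unspecified "generic message-passing algorithm", and the graph format and the names `mean_aggregate` and `text_context` are hypothetical. The first function mimics aggregation in an embedding space (the GWM-E idea); the second mimics aggregation in a token space by concatenating neighbour text (the GWM-T idea). The `hops` parameter shows how multi-hop structure widens the context each node sees.

```python
from typing import Dict, List

# Hypothetical graph format: node id -> list of neighbour ids.
Graph = Dict[str, List[str]]


def mean_aggregate(graph: Graph, feats: Dict[str, List[float]], hops: int) -> Dict[str, List[float]]:
    """Run `hops` rounds of mean aggregation over each node and its neighbours.

    Assumption: plain averaging is used here as a stand-in for the paper's
    message-passing rule, which the abstract does not spell out.
    """
    current = {node: list(vec) for node, vec in feats.items()}
    for _ in range(hops):
        updated = {}
        for node, vec in current.items():
            # Each node averages its own vector with its neighbours' vectors.
            group = [vec] + [current[n] for n in graph.get(node, [])]
            updated[node] = [sum(col) / len(group) for col in zip(*group)]
        current = updated
    return current


def text_context(graph: Graph, texts: Dict[str, str], node: str, hops: int) -> str:
    """Join the text of all nodes within `hops` of `node`, in breadth-first order."""
    seen = [node]
    frontier = [node]
    for _ in range(hops):
        next_frontier = []
        for u in frontier:
            for v in graph.get(u, []):
                if v not in seen:
                    seen.append(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return "\n".join(f"[{n}] {texts[n]}" for n in seen)


# Toy path graph a - b - c with made-up features and text.
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
feats = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
texts = {"a": "a photo caption", "b": "a product title", "c": "a user review"}

one_hop = mean_aggregate(graph, feats, hops=1)
two_hop = mean_aggregate(graph, feats, hops=2)
prompt = text_context(graph, texts, "a", hops=2)
```

With one hop, node `a` only mixes in information from `b`; with two hops, information from `c` reaches `a` as well, both in the averaged vector and in the collected text.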
Taxonomy
Topics: Complex Network Analysis Techniques · Graph Theory and Algorithms
Methods: Focus
