GOFA: A Generative One-For-All Model for Joint Graph Language Modeling
Lecheng Kong, Jiarui Feng, Hao Liu, Chengsong Huang, Jiaxin Huang, Yixin Chen, Muhan Zhang

TL;DR
GOFA introduces a novel generative model that seamlessly combines language and graph modeling capabilities, enabling versatile, self-supervised pretraining and effective zero-shot performance on diverse graph tasks.
Contribution
The paper proposes GOFA, a new generative graph language model that integrates GNN layers into a frozen LLM, achieving simultaneous structural understanding and task fluidity.
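To make the idea of interleaving GNN layers with a frozen LLM concrete, here is a minimal, hypothetical sketch in plain Python. It is not the authors' implementation. It assumes each node's text has already been compressed into a few memory-token vectors, uses an identity function as a stand-in for a frozen LLM layer, and swaps in simple mean aggregation for whatever message-passing scheme the paper actually uses. All names (`frozen_llm_layer`, `gnn_layer`, `interleaved_forward`) are invented for illustration.

```python
from typing import Dict, List, Tuple

Vector = List[float]
MemoryTokens = List[Vector]  # a few memory-token vectors summarising one node's text


def frozen_llm_layer(tokens: MemoryTokens) -> MemoryTokens:
    """Stand-in for a frozen LLM layer (assumption: identity, no trainable weights)."""
    return [list(v) for v in tokens]


def gnn_layer(
    memory: Dict[int, MemoryTokens], edges: List[Tuple[int, int]]
) -> Dict[int, MemoryTokens]:
    """Exchange information between neighbouring nodes via their memory tokens.

    Assumption: mean aggregation per token position, blended 50/50 with the
    node's own token. Edges are treated as undirected.
    """
    neighbours: Dict[int, List[int]] = {node: [] for node in memory}
    for src, dst in edges:
        neighbours[src].append(dst)
        neighbours[dst].append(src)

    updated: Dict[int, MemoryTokens] = {}
    for node, tokens in memory.items():
        nbrs = neighbours[node]
        if not nbrs:  # isolated node: nothing to aggregate
            updated[node] = [list(v) for v in tokens]
            continue
        new_tokens: MemoryTokens = []
        for k, own in enumerate(tokens):
            mean = [
                sum(memory[m][k][d] for m in nbrs) / len(nbrs)
                for d in range(len(own))
            ]
            new_tokens.append([0.5 * o + 0.5 * a for o, a in zip(own, mean)])
        updated[node] = new_tokens
    return updated


def interleaved_forward(
    memory: Dict[int, MemoryTokens],
    edges: List[Tuple[int, int]],
    num_blocks: int = 2,
) -> Dict[int, MemoryTokens]:
    """Alternate the frozen per-node layer with the graph-level exchange."""
    for _ in range(num_blocks):
        memory = {node: frozen_llm_layer(t) for node, t in memory.items()}
        memory = gnn_layer(memory, edges)
    return memory


if __name__ == "__main__":
    toy_memory = {0: [[1.0, 0.0]], 1: [[0.0, 1.0]], 2: [[4.0, 4.0]]}
    print(interleaved_forward(toy_memory, edges=[(0, 1)]))
```

In this toy setup only the graph-exchange step would carry trainable parameters in a real model; the per-node layer stays fixed, which mirrors the frozen-LLM design described above.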
Findings
Effective zero-shot performance on downstream graph tasks
Successful pretraining on structural and question-answering tasks
Strong ability to handle diverse graph problems
Abstract
Foundation models, such as Large Language Models (LLMs) or Large Vision Models (LVMs), have emerged as one of the most powerful tools in the respective fields. However, unlike text and image data, graph data do not have a definitive structure, posing great challenges to developing a Graph Foundation Model (GFM). For example, current attempts at designing general graph models either transform graph data into a language format for LLM-based prediction or still train a GNN model with LLM as an assistant. The former can handle unlimited tasks, while the latter captures graph structure much better -- yet, no existing work can achieve both simultaneously. In this paper, we identify three key desirable properties of a GFM: self-supervised pretraining, fluidity in tasks, and graph awareness. To account for these properties, we extend the conventional language modeling to the graph domain and…
Peer Reviews
Decision: ICLR 2025 Poster
1. The paper addresses an important and challenging problem of building generic and effective models for different modalities. 2. The proposed model, while being a combination of existing methods, is logical and original. Especially, applying the GNN layers on top of memory tokens seems a very good idea to exchange information in an efficient and flexible way. 3. The experiments are sufficiently large, diverse and challenging. 4. The results are overall strong and promising.
1. The model is designed more for the tasks where nodes/edges are rich in text. It would be interesting and more impactful to include tasks (in the pretraining and/or eval) that are more numeric, e.g. molecule prediction/generation such as ZINC or from open graph benchmark. While the SOTA specialized GNNs are probably going to work better in those cases, it would be interesting to see the gap. 2. Related to above, one potential issue of this approach is that representing node/edge features can
- The authors offer detailed discussion about the desired attributes of a graph foundation model, including the ability to generalize to new tasks and data domains without additional finetuning of the foundation model, i.e. zero-shot settings. - The authors also give detailed discussion on previous methods for incorporating GNNs into LLM-based architectures (LLMs as predictors and LLMs as an enhancer), and discuss the drawbacks of previous approaches in building general graph foundation models.
1. It is unclear whether the results in Table 2 of the paper represent a true zero-shot setting, since the authors mention in Section 5.2 and Appendix F.4 that the model is instruction-finetuned on node classification and link prediction examples. The datasets in Table 2 are unseen; however, samples of the same task on other datasets have been seen by the model during instruction finetuning, making it seem like this is a generalization task to unseen datasets rather than a zero-shot setting where
GOFA is trained on multiple datasets and it appears to be indeed a general-purpose Large Graph Language Model. The paper is well-structured, with a comprehensive presentation of both quantitative and qualitative results across the main text and appendix.
1. **Evaluation of Graph Structure Utilization.** In Figure 1 and Section 3.4, the authors outline four pre-training tasks: (a) sentence completion, (b) question answering, (c) structural understanding, and (d) information retrieval. However, apart from the structural understanding task, it is somewhat unclear how these tasks benefit from graph information. For instance, tasks like sentence completion and question answering might be achievable by directly providing the raw text input to a generi
Taxonomy
Topics: Semantic Web and Ontologies · Topic Modeling · Advanced Graph Neural Networks
