Toward Graph-Tokenizing Large Language Models with Reconstructive Graph Instruction Tuning
Zhongjian Zhang, Xiao Wang, Mengmei Zhang, Jiarui Tan, Chuan Shi

TL;DR
This paper introduces RGLM, a reconstructive graph instruction tuning method for graph-tokenizing large language models, improving graph-text alignment by explicitly reconstructing graph information to enhance understanding and generalization.
Contribution
It proposes RGLM, a reconstructive graph instruction tuning pipeline with three variants, which adds explicit graph supervision to improve graph-text alignment in graph-tokenizing LLMs (GTokenLLMs).
Findings
RGLM significantly improves graph-text alignment performance.
Theoretical analysis confirms the effectiveness of each RGLM variant.
Extensive experiments demonstrate RGLM's superiority across benchmarks.
Abstract
The remarkable success of large language models (LLMs) has motivated researchers to adapt them as universal predictors for various graph-related tasks, with the ultimate goal of developing a graph foundation model that generalizes across diverse scenarios. The key challenge is to align graph data with the language space so that LLMs can better comprehend graphs. As a popular paradigm, Graph-Tokenizing LLMs (GTokenLLMs) encode complex structures and lengthy texts into a graph token sequence, and then align these graph tokens with text tokens via language instruction tuning. Despite their initial success, our information-theoretic analysis reveals that existing GTokenLLMs rely solely on text supervision from language instructions, which achieves only implicit graph-text alignment and results in a text-dominant bias that underutilizes graph context. To overcome this limitation, we first prove that the alignment…
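The abstract is cut off above, so the following is only an illustrative sketch of the general idea it describes, not the authors' RGLM method. It shows a graph being turned into tokens and a reconstruction term being added to the usual text loss so that the graph tokens receive direct supervision. Every name here (`graph_tokens`, `text_loss`, `reconstruction_loss`, `combined_loss`, the mean-pooling tokenizer, the linear decoder, and the `weight` parameter) is a hypothetical choice made for illustration.

```python
import math

def graph_tokens(features, neighbors, projector):
    """Mean-pool each node with its neighbors, then project to the LLM width.

    Assumption: a one-hop mean-pooling tokenizer stands in for a real graph encoder.
    """
    tokens = []
    for node, feat in enumerate(features):
        group = [feat] + [features[n] for n in neighbors[node]]
        pooled = [sum(col) / len(group) for col in zip(*group)]
        tokens.append([sum(p * w for p, w in zip(pooled, col)) for col in zip(*projector)])
    return tokens

def text_loss(logits, target_ids):
    """Cross-entropy over answer tokens: the text-only supervision signal."""
    total = 0.0
    for row, target in zip(logits, target_ids):
        peak = max(row)
        log_norm = peak + math.log(sum(math.exp(v - peak) for v in row))
        total += log_norm - row[target]
    return total / len(target_ids)

def reconstruction_loss(tokens, decoder, features):
    """Mean squared error between features decoded from graph tokens and the originals."""
    errors = []
    for token, feat in zip(tokens, features):
        decoded = [sum(t * w for t, w in zip(token, col)) for col in zip(*decoder)]
        errors.extend((d - f) ** 2 for d, f in zip(decoded, feat))
    return sum(errors) / len(errors)

def combined_loss(logits, target_ids, tokens, decoder, features, weight=0.5):
    """Text loss plus a weighted graph reconstruction term (hypothetical weighting)."""
    return text_loss(logits, target_ids) + weight * reconstruction_loss(tokens, decoder, features)

# Toy path graph with three nodes and two-dimensional features.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = {0: [1], 1: [0, 2], 2: [1]}
projector = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]   # feature dim 2 -> token dim 3
decoder = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]   # token dim 3 -> feature dim 2
logits = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0]]      # stand-in for LLM output logits
target_ids = [0, 1]

tokens = graph_tokens(features, neighbors, projector)
loss = combined_loss(logits, target_ids, tokens, decoder, features)
```

With `weight=0` the objective reduces to text-only instruction tuning, which is the setting the abstract describes as giving only implicit graph-text alignment.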
Taxonomy
Topics: Advanced Graph Neural Networks · Topic Modeling · Graph Theory and Algorithms
