Revisiting Graph-Tokenizing Large Language Models: A Systematic Evaluation of Graph Token Understanding
Zhongjian Zhang, Yue Yu, Mengmei Zhang, Junping Du, Xiao Wang, Chuan Shi

TL;DR
This paper systematically evaluates whether Graph-Tokenizing Large Language Models truly understand graph tokens, revealing their limitations and the need for further improvements in instruction tuning.
Contribution
It introduces GTEval, a unified evaluation framework, and provides extensive analysis showing current GTokenLLMs' incomplete understanding of graph tokens.
Findings
GTokenLLMs do not fully understand graph tokens.
They show over-sensitivity or over-insensitivity to instruction changes.
Additional instruction tuning improves performance but does not fully solve understanding issues.
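One way to make "sensitivity to instruction changes" concrete is to compare a model's predictions before and after an instruction is transformed and count how many change. The sketch below is a hypothetical illustration only: the function name `flip_rate` and the toy labels are assumptions, and this is not claimed to be the metric GTEval uses.

```python
def flip_rate(original_preds, transformed_preds):
    """Fraction of examples whose prediction changes after the
    instruction is transformed (hypothetical helper, not from the paper)."""
    if len(original_preds) != len(transformed_preds):
        raise ValueError("prediction lists must have the same length")
    if not original_preds:
        return 0.0
    # Count positions where the two runs disagree.
    changed = sum(a != b for a, b in zip(original_preds, transformed_preds))
    return changed / len(original_preds)


# Toy labels, invented for illustration.
before = ["cs.LG", "cs.CV", "cs.CL", "cs.LG"]
after = ["cs.LG", "cs.CV", "cs.AI", "cs.LG"]
print(flip_rate(before, after))
```

Under this reading, a high flip rate after a meaning-preserving change would suggest over-sensitivity, while a near-zero flip rate after a change that should alter the answer would suggest over-insensitivity.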
Abstract
The remarkable success of large language models (LLMs) has motivated researchers to adapt them as universal predictors for various graph tasks. As a widely recognized paradigm, Graph-Tokenizing LLMs (GTokenLLMs) compress complex graph data into graph tokens and treat them as prefix tokens for querying LLMs, leading many to believe that LLMs can understand graphs more effectively and efficiently. In this paper, we challenge this belief: Do GTokenLLMs fully understand graph tokens in the natural-language embedding space? Motivated by this question, we formalize a unified framework for GTokenLLMs and propose an evaluation pipeline, GTEval, to assess graph-token understanding via instruction transformations at the format and content levels. We conduct extensive experiments on 6 representative GTokenLLMs with GTEval. The primary findings are as follows: (1) Existing…
