<SOG_k>: One LLM Token for Explicit Graph Structural Understanding
Jingyao Wu, Bin Lu, Zijun Di, Xiaoying Gan, Meng Jin, Luoyi Fu, Xinbing Wang, Chenghu Zhou

TL;DR
This paper introduces a novel <SOG_k> token that enables large language models to explicitly understand and reason about graph structures, improving performance and interpretability in graph-related tasks.
Contribution
The paper proposes a topology-aware tokenizer and a special structural token <SOG_k> to enhance LLMs' explicit understanding of graph structures, addressing limitations of previous methods.
Findings
Achieves a 9.9% to 41.4% performance improvement on graph benchmarks.
Demonstrates effective global and local structural understanding.
Provides interpretability and consistency in graph reasoning tasks.
Abstract
Large language models show great potential in understanding unstructured data, but still face significant challenges with graphs due to structural hallucination. Existing approaches mainly either verbalize graphs into natural language, which leads to excessive token consumption and scattered attention, or transform graphs into trainable continuous embeddings (i.e., soft prompts), which are severely misaligned with the original text tokens. To solve this problem, we propose to incorporate one special token <SOG_k> to fully represent the Structure Of Graph within a unified token space, facilitating explicit topology input and structural information sharing. Specifically, we propose a topology-aware structural tokenizer that maps each graph topology into a highly selective single token. Afterwards, we construct a set of hybrid structure Question-Answering corpora to align new structural…
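The contrast the abstract draws between verbalizing a graph and representing it with one structural token can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the paper's tokenizer: the degree-histogram feature, the hand-written three-entry CODEBOOK, and the helper names verbalize, degree_histogram, and structural_token are all hypothetical. A nearest-codebook lookup stands in for the learned topology-aware tokenizer, whose actual design is not given in this summary.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int]

# Hypothetical codebook: one prototype degree histogram per structural token.
# Index k in this list corresponds to the token "<SOG_k>".
CODEBOOK: List[List[float]] = [
    [0.0, 0.4, 0.6, 0.0, 0.0],  # path-like: two endpoints, interior degree 2
    [0.0, 0.8, 0.0, 0.0, 0.2],  # star-like: many leaves, one hub
    [0.0, 0.0, 1.0, 0.0, 0.0],  # cycle-like: every node has degree 2
]


def verbalize(edges: List[Edge]) -> str:
    """Describe every edge in plain text (the 'verbalize' baseline)."""
    return " ".join(f"Node {u} is connected to node {v}." for u, v in edges)


def degree_histogram(edges: List[Edge], max_degree: int = 4) -> List[float]:
    """Normalized histogram of node degrees, clipped at max_degree."""
    degree: Dict[int, int] = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    hist = [0.0] * (max_degree + 1)
    for d in degree.values():
        hist[min(d, max_degree)] += 1.0
    total = sum(hist)
    if total == 0:
        return hist  # empty graph: all-zero feature vector
    return [h / total for h in hist]


def structural_token(edges: List[Edge], codebook: List[List[float]]) -> str:
    """Map a graph to the token whose prototype is nearest in squared distance."""
    feat = degree_histogram(edges)

    def sqdist(proto: List[float]) -> float:
        return sum((a - b) ** 2 for a, b in zip(feat, proto))

    k = min(range(len(codebook)), key=lambda i: sqdist(codebook[i]))
    return f"<SOG_{k}>"


star = [(0, 1), (0, 2), (0, 3), (0, 4)]
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]

verbalized_words = len(verbalize(star).split())  # grows with the edge count
star_token = structural_token(star, CODEBOOK)    # always a single token
cycle_token = structural_token(cycle, CODEBOOK)
```

In this sketch the verbalized description grows linearly with the number of edges, while the structural representation stays at one token regardless of graph size. A real tokenizer would need a far more expressive feature than a degree histogram, since many distinct topologies share the same degree distribution.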
Taxonomy
Topics: Advanced Graph Neural Networks · Machine Learning in Healthcare · Topic Modeling
