Improving LLM Reasoning with Homophily-aware Structural and Semantic Text-Attributed Graph Compression
Zijun Di, Bin Lu, Huquan Kang, Luoyi Fu, Jiaxin Ding, Xiaoying Gan, Lei Zhou, Xinbing Wang, Chenghu Zhou

TL;DR
This paper introduces HS2C, a graph compression framework leveraging homophily to enhance LLM reasoning by reducing noise and preserving essential structural and semantic information, leading to improved accuracy.
Contribution
The paper proposes a novel homophily-aware graph compression method that effectively captures essential topology and semantic communities to improve LLM reasoning performance.
Findings
HS2C improves inference accuracy across multiple benchmarks.
The method achieves higher compression rates while maintaining performance.
Scalability is demonstrated on diverse graph types.
Abstract
Large language models (LLMs) have demonstrated promising capabilities in Text-Attributed Graph (TAG) understanding. Recent studies typically focus on verbalizing graph structures via handcrafted prompts, feeding the target node and its neighborhood context into LLMs. However, constrained by the context window, existing methods mainly resort to random sampling, often implemented by randomly dropping nodes or edges, which inevitably introduces noise and causes reasoning instability. We argue that graphs inherently contain rich structural and semantic information, and that exploiting it effectively can unlock potential gains in LLM reasoning performance. To this end, we propose Homophily-aware Structural and Semantic Compression for LLMs (HS2C), a framework centered on exploiting graph homophily. Structurally, guided by the principle of Structural Entropy minimization, we perform a…
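The abstract names three ideas that are easy to make concrete on a toy graph: graph homophily, structural entropy, and the random neighbor sampling that it criticizes. The sketch below is illustrative only and is not the HS2C method. It assumes the common edge-homophily definition (the fraction of edges whose endpoints share a label) and the standard one-dimensional structural entropy of an undirected graph, H1(G) = -sum_v (d_v / 2m) log2(d_v / 2m). The function names, the toy graph, and its labels are all hypothetical.

```python
import math
import random
from collections import defaultdict


def edge_homophily(edges, labels):
    """Fraction of edges whose two endpoints carry the same label."""
    if not edges:
        return 0.0
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


def one_dim_structural_entropy(edges):
    """One-dimensional structural entropy of an undirected graph.

    H1(G) = -sum_v (d_v / 2m) * log2(d_v / 2m), where d_v is the degree
    of node v and m is the number of edges.
    """
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    two_m = 2 * len(edges)
    return -sum((d / two_m) * math.log2(d / two_m) for d in degree.values())


def random_neighbor_sample(edges, target, k, seed=0):
    """Random-sampling baseline: keep at most k random neighbors of target."""
    neighbors = sorted(
        {v if u == target else u for u, v in edges if target in (u, v)}
    )
    rng = random.Random(seed)
    return rng.sample(neighbors, min(k, len(neighbors)))


# Hypothetical toy graph: two triangles (labels "A" and "B") joined by one edge.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
labels = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}

print("edge homophily:", edge_homophily(edges, labels))
print("1-D structural entropy:", one_dim_structural_entropy(edges))
print("random neighbors of node 2:", random_neighbor_sample(edges, 2, k=2))
```

In this toy graph six of the seven edges connect same-label nodes, so most of node 2's neighbors share its label. A random sample of two neighbors can still pick node 3, the only cross-label neighbor, which is a small example of the noise that the abstract attributes to random dropping.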
Taxonomy
Topics: Advanced Graph Neural Networks · Topic Modeling · Multimodal Machine Learning Applications
