TL;DR
This paper explores multi-token pooling strategies and global attention mechanisms to improve graph encoding in LLM-based GraphQA, targeting the single-token information bottleneck and instability during prompt tuning.
Contribution
It evaluates hierarchical pooling operators for passing multiple graph tokens to the LLM and uses LoRA to stabilize training, reaching performance competitive with full-graph baselines.
Findings
Pooling methods can rival full-graph baselines (~73% Hit@1 on WebQSP)
LoRA stabilizes hierarchical projections during prompt tuning
Graph Transformer with VNPool acts as a Perceiver IO encoder
Abstract
The integration of Graph Neural Networks (GNNs) with Large Language Models (LLMs) has emerged as a promising paradigm for Graph Question Answering (GraphQA). However, effective methods for encoding complex structural information into the LLM's latent space remain an open challenge. Current state-of-the-art architectures, such as G-Retriever, typically rely on standard GNNs and aggressive mean pooling to compress entire graph substructures into a single token, creating a severe information bottleneck. This work mitigates this bottleneck by investigating two orthogonal strategies: (1) increasing the bandwidth of the graph-to-LLM interface via multi-token pooling, and (2) enhancing the semantic quality of the graph encoder via global attention mechanisms. We evaluate a suite of hierarchical pruning and clustering-based pooling operators including Top-k, SAGPool, DiffPool, MinCutPool, and…
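To make the bottleneck described in the abstract concrete, here is a minimal sketch in plain Python contrasting single-token mean pooling with a Top-k style multi-token pooling. It is an illustration only, not the paper's implementation: the function names, the fixed `score_vec` (standing in for a learnable projection), and the toy features are all assumptions, and the sketch ignores edges and the GNN layers that would normally produce the node embeddings. The tanh gating follows one common formulation of Top-k pooling.

```python
import math

def mean_pool(node_feats):
    """Average all node embeddings into ONE vector (a single graph token)."""
    n = len(node_feats)
    dim = len(node_feats[0])
    return [sum(row[j] for row in node_feats) / n for j in range(dim)]

def topk_pool(node_feats, score_vec, k):
    """Keep the k highest-scoring nodes as k separate graph tokens.

    score_vec is a hypothetical stand-in for a learnable projection vector.
    Each kept embedding is gated by tanh of its score.
    """
    norm = math.sqrt(sum(w * w for w in score_vec))
    scores = [sum(x * w for x, w in zip(row, score_vec)) / norm
              for row in node_feats]
    # Indices of the k largest scores, highest first.
    keep = sorted(range(len(node_feats)),
                  key=lambda i: scores[i], reverse=True)[:k]
    tokens = [[x * math.tanh(scores[i]) for x in node_feats[i]] for i in keep]
    return tokens, keep

# Toy graph: 4 nodes with 2-dimensional embeddings (made-up values).
node_feats = [[1.0, 0.0], [0.0, 2.0], [3.0, 1.0], [0.5, 0.5]]
score_vec = [1.0, 0.0]

single_token = mean_pool(node_feats)                      # one vector
multi_tokens, kept = topk_pool(node_feats, score_vec, 2)  # two vectors
print(len(single_token), len(multi_tokens), kept)
```

The mean-pooled output has a fixed size regardless of how many nodes the subgraph contains, whereas the Top-k variant hands the LLM `k` distinct vectors, which is the sense in which multi-token pooling widens the graph-to-LLM interface.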
