Less is More: Making Smaller Language Models Competent Subgraph Retrievers for Multi-hop KGQA
Wenyu Huang, Guancheng Zhou, Hongru Wang, Pavlos Vougiouklis, Mirella Lapata, Jeff Z. Pan

TL;DR
This paper demonstrates that small language models can effectively retrieve subgraphs from knowledge graphs for multi-hop question answering, achieving competitive or state-of-the-art results with significantly fewer parameters.
Contribution
The paper models subgraph retrieval as a conditional generation task handled by small language models, reducing reliance on large models and on extensive SPARQL annotations.
Findings
A 220M-parameter model achieves competitive retrieval performance.
A 3B model paired with an LLM reader sets a new state of the art on WebQSP and CWQ.
Small models can effectively perform subgraph retrieval for KGQA.
Abstract
Retrieval-Augmented Generation (RAG) is widely used to inject external non-parametric knowledge into large language models (LLMs). Recent works suggest that Knowledge Graphs (KGs) contain valuable external knowledge for LLMs. Retrieving information from KGs differs from extracting it from document sets. Most existing approaches seek to directly retrieve relevant subgraphs, thereby eliminating the need for extensive SPARQL annotations, traditionally required by semantic parsing methods. In this paper, we model the subgraph retrieval task as a conditional generation task handled by small language models. Specifically, we define a subgraph identifier as a sequence of relations, each represented as a special token stored in the language models. Our base generative subgraph retrieval model, consisting of only 220M parameters, achieves competitive retrieval performance compared to…
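The abstract's notion of a subgraph identifier (a sequence of relations, each mapped to one special token) can be illustrated with a small sketch. This is not the authors' implementation: the toy triples, the `<rel_i>` token format, and the helper names `build_relation_vocab`, `encode_identifier`, and `expand_identifier` are all assumptions made for illustration, and the language-model generation step is left out, so the identifier is hard-coded where a trained model would generate it from the question.

```python
from collections import defaultdict

# Toy knowledge graph as (head, relation, tail) triples; illustrative data only.
TRIPLES = [
    ("Barack Obama", "people.person.spouse", "Michelle Obama"),
    ("Michelle Obama", "people.person.place_of_birth", "Chicago"),
    ("Barack Obama", "people.person.place_of_birth", "Honolulu"),
]


def build_relation_vocab(triples):
    """Assign each distinct relation one special token (hypothetical format <rel_i>)."""
    relations = sorted({rel for _, rel, _ in triples})
    return {rel: f"<rel_{i}>" for i, rel in enumerate(relations)}


def encode_identifier(relation_path, vocab):
    """Turn a relation path into a subgraph identifier: one special token per hop."""
    return [vocab[rel] for rel in relation_path]


def expand_identifier(topic_entity, identifier, vocab, triples):
    """Follow the identifier hop by hop from the topic entity and collect matching triples."""
    token_to_rel = {tok: rel for rel, tok in vocab.items()}
    index = defaultdict(list)
    for head, rel, tail in triples:
        index[(head, rel)].append(tail)

    frontier = {topic_entity}
    subgraph = []
    for token in identifier:
        rel = token_to_rel[token]
        next_frontier = set()
        for head in sorted(frontier):
            for tail in index[(head, rel)]:
                subgraph.append((head, rel, tail))
                next_frontier.add(tail)
        frontier = next_frontier
    return subgraph


VOCAB = build_relation_vocab(TRIPLES)
# Hard-coded here; in the paper's setting a small language model would generate this sequence.
IDENTIFIER = encode_identifier(
    ["people.person.spouse", "people.person.place_of_birth"], VOCAB
)
SUBGRAPH = expand_identifier("Barack Obama", IDENTIFIER, VOCAB, TRIPLES)
```

For a two-hop question such as "Where was Barack Obama's spouse born?", the identifier has two tokens, and expanding it from the topic entity yields only the triples on that path, leaving out the unrelated birthplace triple for Barack Obama.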
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Algorithms and Data Compression
Methods: Balanced Selection
