Have We Designed Generalizable Structural Knowledge Promptings? Systematic Evaluation and Rethinking
Yichi Zhang, Zhuo Chen, Lingbing Guo, Yajing Xu, Shaokai Chen, Mengshu Sun, Binbin Hu, Zhiqiang Zhang, Lei Liang, Wen Zhang, Huajun Chen

TL;DR
This paper systematically evaluates the generalization capabilities of Structural Knowledge Prompting (SKP) in large language models across multiple dimensions using a new comprehensive benchmark.
Contribution
It introduces SUBARU, a novel benchmark for assessing SKP's generalization across various tasks, and provides a critical rethinking of SKP's effectiveness and limitations.
Findings
SKP shows varying generalization performance across tasks.
The benchmark reveals limitations in transferability and scalability.
Insights suggest directions for improving SKP methods.
Abstract
Large language models (LLMs) have demonstrated exceptional performance in text generation within current NLP research. However, the lack of factual accuracy is still a dark cloud hanging over the LLM skyscraper. Structural knowledge prompting (SKP) is a prominent paradigm to integrate external knowledge into LLMs by incorporating structural representations, achieving state-of-the-art results in many knowledge-intensive tasks. However, existing methods often focus on specific problems, lacking a comprehensive exploration of the generalization and capability boundaries of SKP. This paper aims to evaluate and rethink the generalization capability of the SKP paradigm from four perspectives including Granularity, Transferability, Scalability, and Universality. To provide a thorough evaluation, we introduce a novel multi-granular, multi-level benchmark called SUBARU, consisting of 9 different…
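To make the paradigm described in the abstract concrete, the following is a minimal sketch of one common way structural knowledge can be injected into an LLM prompt: pretrained structural embeddings (for example, of a knowledge-graph triple) are mapped into the model's token-embedding space by a small adapter and prepended to the text-token embeddings. This is an illustration of the general idea only, not the method or benchmark code from the paper. All function names, dimensions, and the random linear adapter are hypothetical assumptions.

```python
import random

def make_projection(struct_dim, model_dim, seed=0):
    """Hypothetical adapter: a random linear map from the structural
    embedding space (struct_dim) to the LLM embedding space (model_dim).
    In practice this map would be learned; here it is only a placeholder."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(struct_dim)]
            for _ in range(model_dim)]

def project(weights, struct_vec):
    """Apply the linear adapter to one structural embedding."""
    return [sum(w * x for w, x in zip(row, struct_vec)) for row in weights]

def build_prompt_embeddings(struct_vecs, token_embeddings, weights):
    """Prepend the projected structural embeddings to the text-token
    embeddings, giving the sequence that would be fed to the LLM."""
    prefix = [project(weights, v) for v in struct_vecs]
    return prefix + token_embeddings

# Toy example (assumed sizes): a (head, relation, tail) triple with
# 4-dimensional structural embeddings and three 6-dimensional text tokens.
triple_embeddings = [
    [0.2, -0.1, 0.5, 0.0],   # head entity
    [0.1, 0.3, -0.2, 0.4],   # relation
    [-0.3, 0.2, 0.1, 0.6],   # tail entity
]
token_embeddings = [[0.01 * (i + j) for j in range(6)] for i in range(3)]

adapter = make_projection(struct_dim=4, model_dim=6)
sequence = build_prompt_embeddings(triple_embeddings, token_embeddings, adapter)
print(len(sequence))  # 3 structural prefix vectors + 3 text tokens = 6
```

The design choice sketched here keeps the text tokens untouched and confines the structural signal to a short prefix, which is what makes questions such as how the prefix transfers across tasks or scales with more structural inputs natural ones to ask.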
Taxonomy
Topics: Semantic Web and Ontologies
Methods: Focus
