Have We Designed Generalizable Structural Knowledge Promptings? Systematic Evaluation and Rethinking

Yichi Zhang; Zhuo Chen; Lingbing Guo; Yajing Xu; Shaokai Chen; Mengshu Sun; Binbin Hu; Zhiqiang Zhang; Lei Liang; Wen Zhang; Huajun Chen

arXiv:2501.00244·cs.CL·January 3, 2025

TL;DR

This paper systematically evaluates the generalization capabilities of Structural Knowledge Prompting (SKP) in large language models across multiple dimensions using a new comprehensive benchmark.

Contribution

It introduces SUBARU, a novel benchmark for assessing SKP's generalization across various tasks, and provides a critical rethinking of SKP's effectiveness and limitations.

Findings

01. SKP shows varying generalization performance across tasks.

02. The benchmark reveals limitations in transferability and scalability.

03. Insights suggest directions for improving SKP methods.

Abstract

Large language models (LLMs) have demonstrated exceptional performance in text generation within current NLP research. However, the lack of factual accuracy is still a dark cloud hanging over the LLM skyscraper. Structural knowledge prompting (SKP) is a prominent paradigm to integrate external knowledge into LLMs by incorporating structural representations, achieving state-of-the-art results in many knowledge-intensive tasks. However, existing methods often focus on specific problems, lacking a comprehensive exploration of the generalization and capability boundaries of SKP. This paper aims to evaluate and rethink the generalization capability of the SKP paradigm from four perspectives including Granularity, Transferability, Scalability, and Universality. To provide a thorough evaluation, we introduce a novel multi-granular, multi-level benchmark called SUBARU, consisting of 9 different…

Taxonomy

Topics: Semantic Web and Ontologies

Methods: Focus