SKA-Bench: A Fine-Grained Benchmark for Evaluating Structured Knowledge Understanding of LLMs

Zhiqiang Liu; Enpei Niu; Yin Hua; Mengshu Sun; Lei Liang; Huajun Chen; Wen Zhang

arXiv:2507.17178 · cs.CL · September 1, 2025

TL;DR

SKA-Bench is a comprehensive benchmark designed to evaluate the fine-grained structured knowledge understanding capabilities of large language models across various knowledge forms and challenge factors.

Contribution

The paper introduces SKA-Bench, a novel, rigorous benchmark that assesses multiple structured knowledge forms and fundamental abilities of LLMs, addressing limitations of previous evaluations.

Findings

1. Existing LLMs struggle with structured knowledge understanding.

2. Performance is affected by noise, order, and hallucinations.

3. DeepSeek-R1 shows notable but still limited capabilities.

Abstract

Although large language models (LLMs) have made significant progress in understanding Structured Knowledge (SK) like KG and Table, existing evaluations for SK understanding are non-rigorous (i.e., lacking evaluations of specific capabilities) and focus on a single type of SK. Therefore, we aim to propose a more comprehensive and rigorous structured knowledge understanding benchmark to diagnose the shortcomings of LLMs. In this paper, we introduce SKA-Bench, a Structured Knowledge Augmented QA Benchmark that encompasses four widely used structured knowledge forms: KG, Table, KG+Text, and Table+Text. We utilize a three-stage pipeline to construct SKA-Bench instances, which includes a question, an answer, positive knowledge units, and noisy knowledge units. To evaluate the SK understanding capabilities of LLMs in a fine-grained manner, we expand the instances into four fundamental ability…
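The instance layout described in the abstract (a question, an answer, positive knowledge units, and noisy knowledge units) can be pictured with a small sketch. This is only an illustration under stated assumptions: the class name `SKAInstance`, its field names, the helper `build_context`, and the toy triples are all hypothetical and are not taken from the paper or from the released dataset schema.

```python
import random
from dataclasses import dataclass, field


@dataclass
class SKAInstance:
    """Hypothetical container for one instance; field names are assumptions."""
    question: str
    answer: str
    positive_units: list[str] = field(default_factory=list)  # units that support the answer
    noisy_units: list[str] = field(default_factory=list)     # distractor units


def build_context(inst: SKAInstance, n_noise: int, seed: int = 0) -> list[str]:
    """Combine every positive unit with up to n_noise noisy units, then shuffle."""
    rng = random.Random(seed)  # seeded so the mixed context is reproducible
    noise = rng.sample(inst.noisy_units, min(n_noise, len(inst.noisy_units)))
    units = inst.positive_units + noise
    rng.shuffle(units)
    return units


# Toy data invented for illustration only.
example = SKAInstance(
    question="Which country is Paris the capital of?",
    answer="France",
    positive_units=["(Paris, capital_of, France)"],
    noisy_units=["(Berlin, capital_of, Germany)", "(Paris, located_in, Europe)"],
)
context = build_context(example, n_noise=1)
```

Varying `n_noise` or the shuffle seed is one simple way to probe how sensitive a model is to the amount and position of distracting knowledge, which is the kind of factor the findings above refer to.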

Code & Models

Datasets

zjukg/SKA-Bench

Taxonomy

Topics: Semantic Web and Ontologies