KMIR: A Benchmark for Evaluating Knowledge Memorization, Identification and Reasoning Abilities of Language Models
Daniel Gao, Yantao Jia, Lei Li, Chengzhen Fu, Zhicheng Dou, Hao Jiang, Xinyu Zhang, Lei Chen, Zhao Cao

TL;DR
This paper introduces KMIR, a comprehensive benchmark to evaluate the knowledge memorization, identification, and reasoning abilities of pre-trained language models across various knowledge types, revealing their strengths and limitations.
Contribution
The paper presents KMIR, a new benchmark with 184,348 questions to assess key knowledge-related capabilities of PLMs, addressing gaps in evaluating their reliability as knowledge sources.
Findings
PLMs' memorization depends more on parameter count than on training schemes
Current PLMs struggle with robust fact recall
Model compression retains knowledge but impairs reasoning and identification
Abstract
Previous works show the great potential of pre-trained language models (PLMs) for storing a large amount of factual knowledge. However, to figure out whether PLMs can be reliable knowledge sources and used as alternative knowledge bases (KBs), we need to further explore some critical features of PLMs. Firstly, knowledge memorization and identification abilities: traditional KBs can store various types of entities and relationships; do PLMs have a high knowledge capacity to store different types of knowledge? Secondly, reasoning ability: a qualified knowledge source should not only provide a collection of facts, but support a symbolic reasoner. Can PLMs derive new knowledge based on the correlations between facts? To evaluate these features of PLMs, we propose a benchmark, named Knowledge Memorization, Identification, and Reasoning test (KMIR). KMIR covers 3 types of knowledge, including…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
