Bidirectional LMs are Better Knowledge Memorizers? A Benchmark for Real-world Knowledge Injection
Yuwei Zhang, Wenhao Yu, Shangbin Feng, Yifan Zhu, Letian Peng, Jayanth Srinivasa, Gaowen Liu, Jingbo Shang

TL;DR
This paper introduces WikiDYK, a large-scale, evolving benchmark for real-world knowledge injection into language models, revealing that bidirectional models outperform causal models in memorization and proposing a collaborative framework to enhance knowledge reliability.
Contribution
The paper presents WikiDYK, a novel benchmark for knowledge injection, demonstrates that bidirectional models memorize knowledge better than causal models, and proposes a modular ensemble framework to improve knowledge reliability.
Findings
Bidirectional LMs outperform causal LMs in knowledge memorization.
The ensemble framework improves knowledge reliability by up to 29.1%.
WikiDYK is a scalable, evolving benchmark for real-world knowledge testing.
Abstract
Despite significant advances in large language models (LLMs), their knowledge memorization capabilities remain underexplored, due to the lack of a standardized, high-quality test ground. In this paper, we introduce a novel, real-world, large-scale knowledge injection benchmark that evolves continuously over time without requiring human intervention. Specifically, we propose WikiDYK, which leverages recently added, human-written facts from Wikipedia's "Did You Know..." entries. These entries are carefully selected by expert Wikipedia editors based on criteria such as verifiability and clarity. Each entry is converted into multiple question-answer pairs spanning diverse task formats, from easy cloze prompts to complex multi-hop questions. WikiDYK contains 12,290 facts and 77,180 questions, and it is seamlessly extensible with future updates from Wikipedia editors. Extensive…
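The conversion of a fact into a cloze prompt, the simplest task format mentioned in the abstract, can be sketched as follows. This is only an illustrative sketch and not the authors' pipeline: the `QAPair` class, the `make_cloze` function, the blank token, and the example fact are all assumptions invented here for illustration.

```python
from dataclasses import dataclass


@dataclass
class QAPair:
    task: str      # task-format label; the name "cloze" is illustrative
    question: str
    answer: str


def make_cloze(fact: str, answer: str, blank: str = "____") -> QAPair:
    """Mask the first occurrence of `answer` in `fact` to form a cloze prompt."""
    if answer not in fact:
        raise ValueError("answer span must appear verbatim in the fact")
    question = fact.replace(answer, blank, 1)
    return QAPair(task="cloze", question=question, answer=answer)


# Placeholder fact, invented for illustration only (not taken from WikiDYK).
fact = "The placeholder bridge was completed in 1901."
pair = make_cloze(fact, "1901")
print(pair.question)
```

Harder formats, such as the multi-hop questions the abstract mentions, would need more than span masking and are not covered by this sketch.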
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Advanced Graph Neural Networks
