LM-PUB-QUIZ: A Comprehensive Framework for Zero-Shot Evaluation of Relational Knowledge in Language Models

Max Ploner; Jacek Wiland; Sebastian Pohl; Alan Akbik

arXiv:2408.15729·cs.CL·August 29, 2024

TL;DR

LM-PUB-QUIZ is an open-source Python framework that facilitates zero-shot evaluation of relational knowledge in language models using the BEAR probing method, supporting detailed analysis and integration with popular training tools.

Contribution

The paper introduces LM-PUB-QUIZ, an easy-to-use framework and leaderboard for probing relational knowledge in LMs. It builds on the earlier BEAR probe and adds more detailed analysis and integration with training tools.

Findings

01 Enables unbiased comparison of different LMs' relational knowledge

02 Supports integration with Hugging Face Transformers for streamlined evaluation

03 Provides detailed analysis of different knowledge types in language models

Abstract

Knowledge probing evaluates the extent to which a language model (LM) has acquired relational knowledge during its pre-training phase. It provides a cost-effective means of comparing LMs of different sizes and training setups and is useful for monitoring knowledge gained or lost during continual learning (CL). In prior work, we presented an improved knowledge probe called BEAR (Wiland et al., 2024), which enables the comparison of LMs trained with different pre-training objectives (causal and masked LMs) and addresses issues of skewed distributions in previous probes to deliver a more unbiased reading of LM knowledge. With this paper, we present LM-PUB-QUIZ, a Python framework and leaderboard built around the BEAR probing mechanism that enables researchers and practitioners to apply it in their work. It provides options for standalone evaluation and direct integration into the…
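The kind of probing the abstract refers to can be illustrated with a small sketch. It assumes the probe works by filling a relation template with each of a fixed set of answer options, scoring every resulting statement with the LM, and counting an instance as correct when the gold answer ranks first. This is not the LM-PUB-QUIZ API: the function names, the `[X]`/`[Y]` placeholders, and `toy_score` (a stand-in for a real model's sentence log-likelihood) are hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical instance layout: (template, subject, answer options, gold answer).
Instance = Tuple[str, str, Sequence[str], str]


def rank_answer_options(
    template: str,
    subject: str,
    options: Sequence[str],
    score: Callable[[str], float],
) -> List[Tuple[str, float]]:
    """Fill the template once per option and sort options by score, best first."""
    ranked = []
    for option in options:
        statement = template.replace("[X]", subject).replace("[Y]", option)
        ranked.append((option, score(statement)))
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


def accuracy(instances: Sequence[Instance], score: Callable[[str], float]) -> float:
    """Fraction of instances whose gold answer receives the highest score."""
    hits = sum(
        rank_answer_options(template, subject, options, score)[0][0] == gold
        for template, subject, options, gold in instances
    )
    return hits / len(instances)


def toy_score(statement: str) -> float:
    """Assumed stand-in for an LM's log-likelihood of a full statement."""
    table = {
        "The capital of France is Paris.": -2.0,
        "The capital of Germany is Rome.": -3.0,  # a deliberately wrong belief
    }
    return table.get(statement, -10.0)


INSTANCES: List[Instance] = [
    ("The capital of [X] is [Y].", "France", ["Berlin", "Paris", "Rome"], "Paris"),
    ("The capital of [X] is [Y].", "Germany", ["Berlin", "Paris", "Rome"], "Berlin"),
]

if __name__ == "__main__":
    print(rank_answer_options(*INSTANCES[0][:3], toy_score))
    print(accuracy(INSTANCES, toy_score))
```

Because the scorer only needs to assign a number to a complete statement, the same ranking loop could in principle sit on top of either a causal or a masked LM, which is the property the abstract attributes to BEAR.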

Videos

LM-Pub-Quiz: A Comprehensive Framework for Zero-Shot Evaluation of Relational Knowledge in Language Models

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques