Revealing the Parametric Knowledge of Language Models: A Unified Framework for Attribution Methods

Haeun Yu; Pepa Atanasova; Isabelle Augenstein

arXiv:2404.18655·cs.CL·April 30, 2024

TL;DR

This paper introduces a unified framework for evaluating and comparing attribution methods that reveal the parametric knowledge stored in language models, and highlights the complementary strengths of Instance Attribution (IA) and Neuron Attribution (NA).

Contribution

The study develops a novel evaluation framework and two new attribution methods, NA-Instances and IA-Neurons, which together allow a systematic comparison of the knowledge revealed by IA and NA in language models.

Findings

01. NA reveals more diverse and comprehensive knowledge

02. IA offers unique insights not captured by NA

03. Combining IA and NA can enhance understanding of LM knowledge

Abstract

Language Models (LMs) acquire parametric knowledge from their training process, embedding it within their weights. The increasing scalability of LMs, however, poses significant challenges for understanding a model's inner workings and further for updating or correcting this embedded knowledge without the significant cost of retraining. This underscores the importance of unveiling exactly what knowledge is stored and its association with specific model components. Instance Attribution (IA) and Neuron Attribution (NA) offer insights into this training-acquired knowledge, though they have not been compared systematically. Our study introduces a novel evaluation framework to quantify and compare the knowledge revealed by IA and NA. To align the results of the methods we introduce the attribution method NA-Instances to apply NA for retrieving influential training instances, and IA-Neurons to…
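To make the idea of instance attribution concrete, the sketch below scores training instances by the dot product between their loss gradient and the loss gradient of a test instance. This is a generic gradient-similarity illustration on a toy logistic-regression model, not the paper's IA method or its NA-Instances and IA-Neurons variants; the function names (`loss_gradient`, `influence_scores`, `rank_training_instances`) and the toy data are assumptions made purely for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss_gradient(weights, features, label):
    """Gradient of the logistic loss w.r.t. the weights for one example."""
    prediction = sigmoid(sum(w * x for w, x in zip(weights, features)))
    return [(prediction - label) * x for x in features]


def influence_scores(weights, train_set, test_example):
    """Gradient-similarity score of each training instance for one test instance.

    A larger positive score means the training gradient points in the same
    direction as the test gradient (hypothetical stand-in for an IA method).
    """
    test_grad = loss_gradient(weights, *test_example)
    scores = []
    for features, label in train_set:
        train_grad = loss_gradient(weights, features, label)
        scores.append(sum(a * b for a, b in zip(train_grad, test_grad)))
    return scores


def rank_training_instances(weights, train_set, test_example):
    """Indices of training instances, most influential first."""
    scores = influence_scores(weights, train_set, test_example)
    return sorted(range(len(train_set)), key=lambda i: scores[i], reverse=True)


# Toy data (assumed): three training points and one test point, all labelled 1.
weights = [0.0, 0.0]
train_set = [([1.0, 0.0], 1), ([0.0, 1.0], 1), ([-1.0, 0.0], 1)]
test_example = ([1.0, 0.0], 1)

print(influence_scores(weights, train_set, test_example))
print(rank_training_instances(weights, train_set, test_example))
```

In this toy setup the training point identical to the test point receives the highest score and the point with the opposite feature vector receives the lowest. Neuron attribution, which assigns scores to model components rather than training instances, is not shown here.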

Code & Models

Repositories

copenlu/reveal-param-knowledge (PyTorch, official)

Videos

Revealing the Parametric Knowledge of Language Models: A Unified Framework for Attribution Methods (Underline)

Taxonomy

Topics: Natural Language Processing Techniques