Context versus Prior Knowledge in Language Models

Kevin Du; Vésteinn Snæbjarnarson; Niklas Stoehr; Jennifer C. White; Aaron Schein; Ryan Cotterell

arXiv:2404.04633 · cs.CL · June 18, 2024 · 1 citation

TL;DR

This paper investigates how language models balance prior knowledge against in-context information when answering questions. It introduces two metrics, the persuasion score of a context and the susceptibility score of an entity, and analyzes how these scores relate to the model's familiarity with the entity.

Contribution

It proposes two mutual information-based metrics to measure a model's reliance on context versus prior knowledge, validated through empirical testing.

Findings

01 Models rely more on prior knowledge for entities they are familiar with.

02 How strongly a context influences the answer varies with entity familiarity.

03 The two proposed metrics effectively quantify a model's dependence on context and an entity's susceptibility.

Abstract

To answer a question, language models often need to integrate prior knowledge learned during pretraining and new information presented in context. We hypothesize that models perform this integration in a predictable way across different questions and contexts: models will rely more on prior knowledge for questions about entities (e.g., persons, places, etc.) that they are more familiar with due to higher exposure in the training corpus, and be more easily persuaded by some contexts than others. To formalize this problem, we propose two mutual information-based metrics to measure a model's dependency on a context and on its prior about an entity: first, the persuasion score of a given context represents how much a model depends on the context in its decision, and second, the susceptibility score of a given entity represents how much the model can be swayed away from its original answer…
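
The abstract names two mutual information-based scores but, as truncated here, gives no formulas. The following is a minimal sketch of one plausible formulation, not the authors' implementation. It assumes that the persuasion score of a context is the KL divergence between the model's answer distribution given that context and its answer distribution averaged over all contexts, and that the susceptibility score of an entity is the weighted average of those divergences (which equals the mutual information between context and answer). All function names, the toy answer distributions, and the uniform context weights are hypothetical.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions over the same answers."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def marginal_answer_dist(cond_dists, context_probs):
    """Answer distribution averaged over contexts: sum_c p(c) * p(a | c, q)."""
    n_answers = len(cond_dists[0])
    return [
        sum(w * dist[a] for w, dist in zip(context_probs, cond_dists))
        for a in range(n_answers)
    ]


def persuasion_scores(cond_dists, context_probs):
    """Assumed persuasion score per context: KL(p(A | c, q) || p(A | q))."""
    marginal = marginal_answer_dist(cond_dists, context_probs)
    return [kl_divergence(dist, marginal) for dist in cond_dists]


def susceptibility_score(cond_dists, context_probs):
    """Assumed susceptibility score: expected persuasion score, i.e. I(C; A | q)."""
    scores = persuasion_scores(cond_dists, context_probs)
    return sum(w * s for w, s in zip(context_probs, scores))


# Hypothetical answer distributions over two candidate answers,
# one row per context, for a single query about each entity.
uniform = [0.5, 0.5]
familiar_entity = [[0.90, 0.10], [0.85, 0.15]]    # contexts barely move the answer
unfamiliar_entity = [[0.90, 0.10], [0.10, 0.90]]  # contexts flip the answer

print(susceptibility_score(familiar_entity, uniform))
print(susceptibility_score(unfamiliar_entity, uniform))
```

Under these assumptions, an entity whose answer distribution barely changes across contexts gets a susceptibility near zero, while an entity whose answer flips with the context gets a larger value. In practice the answer distributions would come from the language model's output probabilities rather than hand-written lists.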

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Semantic Web and Ontologies