Language Modeling with Editable External Knowledge

Belinda Z. Li; Emmy Liu; Alexis Ross; Abbas Zeitoun; Graham Neubig; Jacob Andreas

arXiv:2406.11830 · cs.CL · June 18, 2024

TL;DR

ERASE is a method for retrieval-augmented language models that incrementally deletes or rewrites existing knowledge base entries each time a new document is added, improving question-answering accuracy on streams of news articles and conversations.

Contribution

Introduces ERASE, which improves model behavior at the point where new documents are acquired by incrementally editing the knowledge base, whereas most prior work on retrieval-augmented generation has focused on better retrieval or reasoning during prediction.

Findings

01

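Relative to conventional retrieval-augmented generation, ERASE improves accuracy by 7-13% with Mixtral-8x7B on two new benchmarks built from streams of news articles and conversations.

02

For a second configuration (truncated in the abstract excerpt), the reported improvement is 6-10%.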

03

The gains come from editing the knowledge base at insertion time rather than at prediction time: whenever a document is added, other entries are incrementally deleted or rewritten.

Abstract

When the world changes, so does the text that humans write about it. How do we build language models that can be easily updated to reflect these changes? One popular approach is retrieval-augmented generation, in which new documents are inserted into a knowledge base and retrieved during prediction for downstream tasks. Most prior work on these systems has focused on improving behavior during prediction through better retrieval or reasoning. This paper introduces ERASE, which instead improves model behavior when new documents are acquired, by incrementally deleting or rewriting other entries in the knowledge base each time a document is added. In two new benchmark datasets evaluating models' ability to answer questions about a stream of news articles or conversations, ERASE improves accuracy relative to conventional retrieval-augmented generation by 7-13% (Mixtral-8x7B) and 6-10%…
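The update step described in the abstract (on each insertion, revisit related entries and delete or rewrite them) can be illustrated with a minimal sketch. This is not the authors' implementation: the `KnowledgeBase` class, the word-overlap retriever, and the rule-based `toy_judge` are all hypothetical stand-ins chosen so the example runs without a language model or an embedding index. In a real system the judgment would presumably come from a language model and retrieval from a dense retriever.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

# A decision about one existing entry: ("keep", None), ("delete", None),
# or ("rewrite", new_text). These names are illustrative, not from the paper.
Decision = Tuple[str, Optional[str]]


@dataclass
class KnowledgeBase:
    entries: List[str] = field(default_factory=list)

    def retrieve(self, query: str, k: int = 5) -> List[int]:
        """Return indices of up to k entries sharing the most words with query.

        Word overlap is an assumed stand-in for a real retriever.
        """
        q = set(query.lower().split())
        scored = [(len(q & set(e.lower().split())), i)
                  for i, e in enumerate(self.entries)]
        scored = [(s, i) for s, i in scored if s > 0]
        scored.sort(key=lambda t: (-t[0], t[1]))
        return [i for _, i in scored[:k]]

    def add_document(self, doc: str,
                     judge: Callable[[str, str], Decision]) -> None:
        """Insert doc after deleting or rewriting related entries it affects."""
        to_delete = set()
        for i in self.retrieve(doc):
            action, new_text = judge(self.entries[i], doc)
            if action == "delete":
                to_delete.add(i)
            elif action == "rewrite" and new_text is not None:
                self.entries[i] = new_text
        self.entries = [e for i, e in enumerate(self.entries)
                        if i not in to_delete]
        self.entries.append(doc)


def toy_judge(entry: str, doc: str) -> Decision:
    """Hypothetical rule-based stand-in for a language-model judgment."""
    marker = "is the CEO of Acme"
    if marker in entry and marker in doc and entry != doc:
        return ("rewrite", entry.replace("is the CEO", "was formerly the CEO"))
    return ("keep", None)


kb = KnowledgeBase(["Alice is the CEO of Acme", "Acme is based in Boston"])
kb.add_document("Bob is the CEO of Acme", toy_judge)
for entry in kb.entries:
    print(entry)
```

In this toy run the stale entry about Alice is rewritten rather than left to conflict with the new document, which is the behavior a plain append-only knowledge base lacks.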

Code & Models

Repositories

belindal/erase (PyTorch, official)

Taxonomy

Topics: Natural Language Processing Techniques

Methods: Balanced Selection