Memorizing Documents with Guidance in Large Language Models

Bumjin Park; Jaesik Choi

arXiv:2406.15996 · cs.CL · June 25, 2024

TL;DR

This paper introduces a novel document-wise memory architecture and guidance loss for large language models, enabling better tracking and retrieval of document-specific information during text generation.

Contribution

It proposes a new memory architecture and guidance loss that improve document content recall in LLMs, moving beyond post hoc interpretation methods.

Findings

1. High recall of document-related content in generation.

2. Different memory entries effectively represent individual documents.

3. Improved document content tracking on Wikitext-103-v1.

Abstract

Training data plays a pivotal role in AI models. Large language models (LLMs) are trained on massive numbers of documents, and their parameters hold document-related content. Recently, several studies have identified content-specific locations in LLMs by examining the parameters. Instead of post hoc interpretation, we propose a document-wise memory architecture that tracks document memories during training. The proposed architecture maps document representations to memory entries, which softly mask memories in the forward pass of the LLM. Additionally, we propose a document guidance loss, which increases the likelihood of a text given its own document's memories and reduces its likelihood given the memories of other documents. Experimental results on Wikitext-103-v1 with Pythia-1B show that the proposed methods provide different memory entries for documents and high…
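The abstract gives no formulas, so the following is only a minimal sketch of how a soft memory mask and a guidance-style loss could fit together, not the authors' implementation. Everything named here is an assumption made for illustration: the sigmoid gate, the linear projection `proj`, the toy memory `values`, the weight `alpha`, and the specific form of the loss (negative log-likelihood with the document's own mask minus `alpha` times the negative log-likelihood with another document's mask).

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def soft_mask(doc_repr, proj):
    """Map a document representation to one gate in (0, 1) per memory entry.

    Assumption: a linear projection followed by a sigmoid.
    """
    return [sigmoid(sum(w * d for w, d in zip(row, doc_repr))) for row in proj]


def masked_memory_output(hidden, mask, values):
    """Scale each memory entry's activation by its gate, then sum value vectors."""
    out = [0.0] * len(values[0])
    for h, m, v in zip(hidden, mask, values):
        for j in range(len(out)):
            out[j] += h * m * v[j]
    return out


def nll(logits, target):
    """Negative log-likelihood of `target` under softmax(logits)."""
    mx = max(logits)
    lse = mx + math.log(sum(math.exp(l - mx) for l in logits))
    return -(logits[target] - lse)


def guidance_loss(logits_own, logits_other, target, alpha=1.0):
    """Hypothetical guidance loss.

    Minimizing it raises the likelihood of the target under the document's own
    memory mask and lowers it under another document's mask.
    """
    return nll(logits_own, target) - alpha * nll(logits_other, target)


# Toy setup: 4 memory entries, 3-dim document representations, 5 output logits.
random.seed(0)
n_mem, d_doc, d_out = 4, 3, 5
proj = [[random.uniform(-1, 1) for _ in range(d_doc)] for _ in range(n_mem)]
values = [[random.uniform(-1, 1) for _ in range(d_out)] for _ in range(n_mem)]
hidden = [0.5, 1.0, 0.2, 0.8]  # stand-in for memory activations on one token

doc_a = [1.0, 0.0, 0.0]  # representation of the document the text comes from
doc_b = [0.0, 1.0, 0.0]  # representation of some other document

mask_a = soft_mask(doc_a, proj)
mask_b = soft_mask(doc_b, proj)
logits_a = masked_memory_output(hidden, mask_a, values)
logits_b = masked_memory_output(hidden, mask_b, values)
loss = guidance_loss(logits_a, logits_b, target=2)
```

In this toy form the two documents differ only in their masks, so any gap between `logits_a` and `logits_b` comes from which memory entries each document gates on. A real model would apply such a mask inside the network's layers and would likely need to bound or weight the second term, since an unbounded negative term can dominate training.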

Taxonomy

Topics: Topic Modeling

Methods: High-Order Consensuses