FastMem: Fast Memorization of Prompt Improves Context Awareness of Large Language Models

Junyi Zhu; Shuochen Liu; Yu Yu; Bo Tang; Yibo Yan; Zhiyu Li; Feiyu Xiong; Tong Xu; Matthew B. Blaschko

arXiv:2406.16069 · cs.CL · October 8, 2024

TL;DR

FastMem improves large language models' context awareness by briefly memorizing the prompt before inference: it updates only the last FFN module to maximize the prompt's likelihood, which improves performance on tasks such as reading comprehension and summarization.

Contribution

FastMem introduces a novel, efficient approach to enhance LLMs' context understanding by updating only the last FFN module for prompt memorization, avoiding overfitting.

Findings

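01. Significant accuracy improvements on benchmark datasets; for example, Llama 3-8B-Inst on NQ-SWAP improves from 59.1% to 71.6%.

02. Fewer output structure failures.

03. Gains across reading comprehension, text summarization, and adherence to output structures.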

Abstract

Large language models (LLMs) excel in generating coherent text, but they often struggle with context awareness, leading to inaccuracies in tasks requiring faithful adherence to provided information. We introduce FastMem, a novel method designed to enhance instruction fine-tuned LLMs' context awareness through fast memorization of the prompt. FastMem maximizes the likelihood of the prompt before inference by updating only the last Feed-Forward Network (FFN) module. This targeted approach ensures efficient optimization without overfitting, significantly improving the model's ability to comprehend and accurately follow the context. Our experiments demonstrate substantial gains in reading comprehension, text summarization and adherence to output structures. For instance, FastMem improves the accuracy of Llama 3-8B-Inst on the NQ-SWAP dataset from 59.1% to 71.6%, and reduces the output…
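The update described in the abstract (maximize the prompt's likelihood while training only the last module) can be illustrated with a toy sketch. This is not the authors' implementation: it uses plain Python instead of a real LLM, a single trainable linear map (`last_layer`) stands in for the last FFN module, and the names `frozen_embed`, `memorize_prompt`, the step count, and the learning rate are all hypothetical choices made for illustration. The paper's safeguards against overfitting are not modeled here.

```python
import math
import random


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def next_token_probs(token, frozen_embed, last_layer):
    """Frozen features for `token`, then the trainable last layer."""
    h = frozen_embed[token]
    logits = [sum(w * x for w, x in zip(row, h)) for row in last_layer]
    return h, softmax(logits)


def prompt_nll(prompt, frozen_embed, last_layer):
    """Mean negative log-likelihood of each next token in the prompt."""
    total = 0.0
    for cur, nxt in zip(prompt, prompt[1:]):
        _, probs = next_token_probs(cur, frozen_embed, last_layer)
        total -= math.log(probs[nxt])
    return total / (len(prompt) - 1)


def memorize_prompt(prompt, frozen_embed, last_layer, steps=20, lr=0.3):
    """Gradient descent on the prompt NLL; only last_layer is modified."""
    n = len(prompt) - 1
    for _ in range(steps):
        grad = [[0.0] * len(row) for row in last_layer]
        for cur, nxt in zip(prompt, prompt[1:]):
            h, probs = next_token_probs(cur, frozen_embed, last_layer)
            for v, p in enumerate(probs):
                err = p - (1.0 if v == nxt else 0.0)  # softmax + NLL gradient
                for d, x in enumerate(h):
                    grad[v][d] += err * x / n
        for row, grad_row in zip(last_layer, grad):
            for d, g in enumerate(grad_row):
                row[d] -= lr * g


# Toy setup (all values hypothetical): 6-token vocabulary, 4-dim features.
rng = random.Random(0)
VOCAB, DIM = 6, 4
frozen_embed = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(VOCAB)]
last_layer = [[0.0] * DIM for _ in range(VOCAB)]
prompt = [0, 3, 1, 4, 2, 5, 0, 3]  # token ids of the "prompt" to memorize

embed_before = [row[:] for row in frozen_embed]
loss_before = prompt_nll(prompt, frozen_embed, last_layer)
memorize_prompt(prompt, frozen_embed, last_layer)
loss_after = prompt_nll(prompt, frozen_embed, last_layer)
print(f"prompt NLL before: {loss_before:.3f}, after: {loss_after:.3f}")
```

In this sketch the frozen embeddings are never written to, so only the final layer absorbs the prompt. With a real model, the analogous step would presumably freeze every parameter except those of the last FFN module before running a few optimizer steps on the prompt tokens.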

Code & Models

Repositories

iaar-shanghai/fastmem (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems

Methods: LLaMA