Self-Updatable Large Language Models by Integrating Context into Model Parameters

Yu Wang; Xinshuang Liu; Xiusi Chen; Sean O'Brien; Junda Wu; Julian McAuley

arXiv:2410.00487·cs.CL·February 24, 2025

Open Access

TL;DR

This paper introduces SELF-PARAM, a method for updating large language models by integrating contextual knowledge directly into their parameters, achieving efficient, long-term retention without extra storage.

Contribution

The paper presents a novel training objective that lets a model internalize contextual knowledge through parameter updates, outperforming existing methods in efficacy and retention.
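To make the idea of internalizing context through parameter updates more concrete, here is a minimal toy sketch. It is not the paper's objective, which this summary does not spell out. It assumes a knowledge-distillation-style setup: a "teacher" distribution stands in for the model's predictions when the context is in the prompt, and a "student" without the context is nudged toward it by minimizing a KL divergence. All names and numbers (`teacher_logits`, `student_logits`, `distill_step`, the learning rate) are hypothetical, and the student's logits are updated directly in place of real model parameters.

```python
import math


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def distill_step(student_logits, teacher_probs, lr=0.5):
    """One gradient-descent step on KL(teacher || softmax(student_logits)).

    The gradient of this KL with respect to the student logits is
    (student_probs - teacher_probs).
    """
    student_probs = softmax(student_logits)
    return [
        z - lr * (q - p)
        for z, q, p in zip(student_logits, student_probs, teacher_probs)
    ]


# Hypothetical toy values: next-token logits over a 3-word vocabulary.
teacher_logits = [2.0, 0.5, -1.0]  # stands in for the model WITH context
student_logits = [0.0, 0.0, 0.0]   # stands in for the model WITHOUT context

teacher_probs = softmax(teacher_logits)
kl_before = kl_divergence(teacher_probs, softmax(student_logits))

for _ in range(200):
    student_logits = distill_step(student_logits, teacher_probs)

kl_after = kl_divergence(teacher_probs, softmax(student_logits))
print(f"KL before: {kl_before:.4f}, KL after: {kl_after:.6f}")
```

In a real model the update would flow into the network weights rather than into a logit vector, and the loss would be averaged over many token positions. The sketch only illustrates the direction of the update: the context-free predictions move toward the context-conditioned ones.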

Findings

01 Outperforms existing methods in question-answering tasks

02 Achieves near-optimal efficacy and long-term retention

03 Requires no additional parameters for updates

Abstract

Despite significant advancements in large language models (LLMs), the rapid and frequent integration of small-scale experiences, such as interactions with surrounding objects, remains a substantial challenge. Two critical factors in assimilating these experiences are (1) Efficacy: the ability to accurately remember recent events; (2) Retention: the capacity to recall long-past experiences. Current methods either embed experiences within model parameters using continual learning, model editing, or knowledge distillation techniques, which often struggle with rapid updates and complex interactions, or rely on external storage to achieve long-term retention, thereby increasing storage requirements. In this paper, we propose SELF-PARAM (Self-Updatable Large Language Models with Parameter Integration). SELF-PARAM requires no extra parameters while ensuring near-optimal efficacy and long-term…

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques

Methods: Knowledge Distillation