Self-Updatable Large Language Models by Integrating Context into Model Parameters
Yu Wang, Xinshuang Liu, Xiusi Chen, Sean O'Brien, Junda Wu, Julian McAuley

TL;DR
This paper introduces SELF-PARAM, a method for updating large language models by integrating contextual knowledge directly into their parameters, achieving efficient, long-term retention without extra storage.
Contribution
The paper presents a novel training objective that enables models to internalize contextual knowledge through parameter updates, outperforming existing methods in efficacy and retention.
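As a rough illustration of what a distillation-style objective for internalizing context could look like, the sketch below computes a KL divergence between the next-token distribution of a model that sees the context (the "teacher") and the same model without it (the "student"). This is an assumption made for illustration: the summary above does not state the exact loss, and the logits, vocabulary size, and function names here are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(
        pi * (math.log(pi + eps) - math.log(qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


# Hypothetical next-token logits over a 4-token toy vocabulary.
teacher_logits = [2.0, 0.5, -1.0, 0.0]  # model prompted WITH the context
student_logits = [0.1, 0.2, 0.0, 0.1]   # same model WITHOUT the context

# Assumed objective: reduce this value by updating the student's parameters.
loss = kl_divergence(softmax(teacher_logits), softmax(student_logits))
print(f"distillation loss: {loss:.4f}")
```

In a real training loop the student logits would come from the model being updated, and the loss would be averaged over many token positions; driving it toward zero makes the context-free model behave as if it had seen the context.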
Findings
Outperforms existing methods in question-answering tasks
Achieves near-optimal efficacy and long-term retention
Requires no additional parameters for updates
Abstract
Despite significant advancements in large language models (LLMs), the rapid and frequent integration of small-scale experiences, such as interactions with surrounding objects, remains a substantial challenge. Two critical factors in assimilating these experiences are (1) Efficacy: the ability to accurately remember recent events; (2) Retention: the capacity to recall long-past experiences. Current methods either embed experiences within model parameters using continual learning, model editing, or knowledge distillation techniques, which often struggle with rapid updates and complex interactions, or rely on external storage to achieve long-term retention, thereby increasing storage requirements. In this paper, we propose SELF-PARAM (Self-Updatable Large Language Models with Parameter Integration). SELF-PARAM requires no extra parameters while ensuring near-optimal efficacy and long-term…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
Methods: Knowledge Distillation
