Interweaving Memories of a Siamese Large Language Model
Xin Song, Zhikai Xue, Guoxiu He, Jiawei Liu, Wei Lu

TL;DR
This paper introduces IMSM, a model-agnostic PEFT framework that interweaves the original and fine-tuned memories of a siamese LLM to improve task performance and reduce catastrophic forgetting, while keeping efficiency comparable to the backbone PEFT method.
Contribution
IMSM is a novel PEFT framework that interweaves original and fine-tuned memories in a siamese LLM, enhancing knowledge retention and task performance.
Findings
IMSM significantly outperforms classical PEFT methods in benchmark tasks.
IMSM maintains efficiency comparable to backbone PEFT methods.
IMSM effectively mitigates catastrophic forgetting in LLMs.
Abstract
Parameter-efficient fine-tuning (PEFT) methods optimize large language models (LLMs) by modifying or introducing a small number of parameters to enhance alignment with downstream tasks. However, they can result in catastrophic forgetting, where LLMs prioritize new knowledge at the expense of comprehensive world knowledge. A promising approach to mitigate this issue is to recall prior memories based on the original knowledge. To this end, we propose a model-agnostic PEFT framework, IMSM, which Interweaves Memories of a Siamese Large Language Model. Specifically, our siamese LLM is equipped with an existing PEFT method. Given an incoming query, it generates two distinct memories based on the pre-trained and fine-tuned parameters. IMSM then incorporates an interweaving mechanism that regulates the contributions of both original and enhanced memories when generating the next token. This…
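As a rough illustration of the interweaving idea described in the abstract, the sketch below blends one "original" and one "fine-tuned" hidden vector with an element-wise sigmoid gate before the next token would be predicted. The abstract does not specify the mechanism's exact form, so the gate formulation, the function name `interweave`, and the parameters `gate_weights` and `gate_bias` are all assumptions made for illustration, not the authors' implementation.

```python
import math
from typing import List, Sequence


def sigmoid(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def interweave(
    h_orig: Sequence[float],
    h_tuned: Sequence[float],
    gate_weights: Sequence[Sequence[float]],
    gate_bias: Sequence[float],
) -> List[float]:
    """Blend two memories with a learned element-wise gate.

    ASSUMPTION: this gate form is hypothetical. For each dimension i,
    g_i = sigmoid(w_i . [h_orig; h_tuned] + b_i) and the output is
    g_i * h_tuned[i] + (1 - g_i) * h_orig[i].
    """
    if len(h_orig) != len(h_tuned):
        raise ValueError("both memories must have the same dimension")

    # The gate looks at both memories at once.
    concat = list(h_orig) + list(h_tuned)

    blended = []
    for i, (o, t) in enumerate(zip(h_orig, h_tuned)):
        score = sum(w * x for w, x in zip(gate_weights[i], concat)) + gate_bias[i]
        g = sigmoid(score)
        # g near 1 favors the fine-tuned memory, g near 0 the original one.
        blended.append(g * t + (1.0 - g) * o)
    return blended


# Usage: with zero gate parameters the gate is 0.5 everywhere,
# so the result is the plain average of the two memories.
h_original = [1.0, 2.0]
h_finetuned = [3.0, 6.0]
zero_weights = [[0.0] * 4, [0.0] * 4]
print(interweave(h_original, h_finetuned, zero_weights, [0.0, 0.0]))
```

In a setup like this, the gate parameters would be trained alongside the PEFT parameters, so the model can learn per token and per dimension how much to rely on pre-trained knowledge versus task-specific knowledge.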
Taxonomy
Topics: Southeast Asian Sociopolitical Studies · Multilingual Education and Policy
