Unlocking the Power of Function Vectors for Characterizing and Mitigating Catastrophic Forgetting in Continual Instruction Tuning
Gangwei Jiang, Caigao Jiang, Zhaoyi Li, Siqiao Xue, Jun Zhou, Linqi Song, Defu Lian, Ying Wei

TL;DR
This paper studies catastrophic forgetting in large language models during continual instruction tuning. It uses function vectors to characterize when forgetting occurs and proposes a regularization-based training method, validated on four benchmarks, to mitigate it.
Contribution
It introduces a function vector-based interpretation of forgetting and proposes a new training approach that stabilizes these vectors to reduce catastrophic forgetting.
Findings
Function vector analysis indicates that forgetting stems mainly from biases in how functions are activated, not from previously learned functions being overwritten.
The proposed method effectively reduces forgetting across four benchmarks.
Theoretical analysis supports the link between function vector stability and forgetting mitigation.
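The idea of stabilizing function vectors during training can be illustrated with a minimal sketch. This is not the paper's actual objective. It assumes, purely for illustration, that a function vector is summarized by a small set of attention-head activations, and that drift is penalized with a squared distance. The names `fv_drift_penalty`, `regularized_loss`, and `alpha` are hypothetical.

```python
import numpy as np

def fv_drift_penalty(heads_old, heads_new):
    """Hypothetical penalty: mean over heads of the squared L2 distance
    between activations before and after training on a new task."""
    heads_old = np.asarray(heads_old, dtype=float)
    heads_new = np.asarray(heads_new, dtype=float)
    per_head = np.sum((heads_new - heads_old) ** 2, axis=-1)
    return float(np.mean(per_head))

def regularized_loss(task_loss, heads_old, heads_new, alpha=0.1):
    """New-task loss plus a weighted drift penalty (alpha is an assumed weight)."""
    return task_loss + alpha * fv_drift_penalty(heads_old, heads_new)

# Two heads with 2-dimensional activations; only the first head drifts.
old = [[1.0, 0.0], [0.0, 1.0]]
new = [[1.0, 1.0], [0.0, 1.0]]
loss = regularized_loss(2.0, old, new, alpha=0.1)
```

In this toy setup the penalty is zero when the activations are unchanged and grows as they drift, so minimizing the combined loss trades off new-task fit against stability of the tracked activations.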
Abstract
Catastrophic forgetting (CF) poses a significant challenge in machine learning, where a model forgets previously learned information upon learning new tasks. Despite the advanced capabilities of Large Language Models (LLMs), they continue to face challenges with CF during continual learning. The majority of existing research focuses on analyzing forgetting patterns through a singular training sequence, thereby overlooking the intricate effects that diverse tasks have on model behavior. Our study explores CF across various settings, discovering that model forgetting is influenced by both the specific training tasks and the models themselves. To this end, we interpret forgetting by examining the function vector (FV), a compact representation of functions in LLMs, offering a model-dependent indicator for the occurrence of CF. Through theoretical and empirical analyses, we demonstrated that…
