Bayesian Parameter-Efficient Fine-Tuning for Overcoming Catastrophic Forgetting
Haolin Chen, Philip N. Garner

TL;DR
This paper introduces a Bayesian approach to parameter-efficient fine-tuning (PEFT) that prevents catastrophic forgetting when adapting text-to-speech synthesis models and language models, using Laplace approximations to regularize fine-tuning.
Contribution
It shows that existing Bayesian learning techniques can be applied to PEFT whenever the parameter shift of the fine-tuned layers can be calculated differentiably, preserving pre-training knowledge without degrading fine-tuning performance.
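As a rough illustration of what such a regularizer can look like, the block below writes out a generic Laplace-style objective for a single LoRA-adapted layer. It is a sketch of the standard quadratic-penalty form rather than the paper's exact formulation; the symbols \lambda (regularization strength), F (curvature or Fisher estimate at the pre-trained weights), and the LoRA factors B and A are notation assumed here, not taken from this page.

```latex
% Assumed notation: W_0 = pre-trained weight, \Delta W = LoRA weight shift.
\begin{align}
  \Delta W &= B A, \qquad W = W_0 + \Delta W, \\
  \mathcal{L}(A, B) &= \mathcal{L}_{\text{task}}(W_0 + \Delta W)
    + \frac{\lambda}{2}\, \operatorname{vec}(\Delta W)^{\top} F \,
      \operatorname{vec}(\Delta W).
\end{align}
```

Because \Delta W is a product of the trainable factors, the penalty is differentiable with respect to both A and B, which is the condition the contribution statement refers to.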
Findings
Kronecker-factored approximation preserves pre-training knowledge better than the diagonal approximation
Bayesian regularization prevents catastrophic forgetting in PEFT
No degradation in fine-tuning performance observed
Abstract
We are motivated primarily by the adaptation of text-to-speech synthesis models; however we argue that more generic parameter-efficient fine-tuning (PEFT) is an appropriate framework to do such adaptation. Nevertheless, catastrophic forgetting remains an issue with PEFT, damaging the pre-trained model's inherent capabilities. We demonstrate that existing Bayesian learning techniques can be applied to PEFT to prevent catastrophic forgetting as long as the parameter shift of the fine-tuned layers can be calculated differentiably. In a principled series of experiments on language modeling and speech synthesis tasks, we utilize established Laplace approximations, including diagonal and Kronecker-factored approaches, to regularize PEFT with the low-rank adaptation (LoRA) and compare their performance in pre-training knowledge preservation. Our results demonstrate that catastrophic forgetting…
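To make the two Laplace approximations named in the abstract more concrete, here is a minimal NumPy sketch of a diagonal and a Kronecker-factored quadratic penalty on the LoRA weight shift of one linear layer. It is not the authors' implementation: the function names, shapes, and the random stand-ins for activation and gradient statistics are all hypothetical, and the code only evaluates the penalties. In practice the same expressions would sit inside an autodiff framework so gradients reach the LoRA factors.

```python
import numpy as np

def lora_shift(lora_B, lora_A, scale=1.0):
    """Weight shift of a LoRA layer: delta_W = scale * B @ A."""
    return scale * (lora_B @ lora_A)

def diag_penalty(delta_W, fisher_diag):
    """Diagonal Laplace penalty: 0.5 * sum_ij F_ij * delta_W_ij**2."""
    return 0.5 * np.sum(fisher_diag * delta_W ** 2)

def kfac_penalty(delta_W, act_cov, grad_cov):
    """Kronecker-factored penalty with Fisher approximated as kron(act_cov, grad_cov).

    Uses vec(dW)^T kron(A, G) vec(dW) = tr(dW^T G dW A) for symmetric A,
    so the full Kronecker product is never formed.
    """
    return 0.5 * np.trace(delta_W.T @ grad_cov @ delta_W @ act_cov)

# Hypothetical sizes and random placeholders (not real model statistics).
rng = np.random.default_rng(0)
n, d_out, d_in, rank = 16, 4, 3, 2
lora_B = rng.normal(size=(d_out, rank))
lora_A = rng.normal(size=(rank, d_in))
delta_W = lora_shift(lora_B, lora_A)

x = rng.normal(size=(n, d_in))    # stand-in for layer inputs
g = rng.normal(size=(n, d_out))   # stand-in for output gradients
act_cov = x.T @ x / n             # (d_in, d_in) input second moment
grad_cov = g.T @ g / n            # (d_out, d_out) gradient second moment
fisher_diag = (g ** 2).T @ (x ** 2) / n  # (d_out, d_in) diagonal Fisher estimate

print(diag_penalty(delta_W, fisher_diag))
print(kfac_penalty(delta_W, act_cov, grad_cov))
```

The Kronecker-factored form keeps cross-parameter correlations within a layer that the diagonal form discards, at the cost of storing two small covariance matrices per layer instead of one value per weight.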
Taxonomy
Topics: Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning
