Gradient Projection For Continual Parameter-Efficient Tuning
Jingyang Qiao, Zhizhong Zhang, Xin Tan, Yanyun Qu, Wensheng Zhang, Zhi Han, Yuan Xie

TL;DR
This paper introduces a unified gradient projection framework called PEGP that enhances parameter-efficient tuning methods by reducing forgetting in continual learning across various models and modalities.
Contribution
The paper reformulates parameter-efficient tuning methods (PETs) from a gradient projection perspective and proposes orthogonal gradient projection to mitigate forgetting with little extra memory.
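A simplified way to see why an orthogonality condition on the gradient can resist forgetting is sketched below. It assumes a single linear layer with weights W, an old-task input feature x_old, a learning rate η, and a gradient g; these symbols are illustrative and are not taken from the paper, whose own derivation covers the individual PET paradigms.

```latex
% Output on an old-task input after one gradient step W' = W - \eta g:
\begin{align}
  W' x_{\text{old}} &= (W - \eta g)\, x_{\text{old}}
                     = W x_{\text{old}} - \eta\, g\, x_{\text{old}} .
\end{align}
% The old output is preserved exactly when the update is orthogonal
% to the old feature:
\begin{align}
  W' x_{\text{old}} = W x_{\text{old}}
  \quad\Longleftrightarrow\quad
  g\, x_{\text{old}} = 0 .
\end{align}
```

Under this assumption, any update whose rows are orthogonal to the old features leaves the layer's old-task outputs unchanged, which is the intuition behind projecting the gradient.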
Findings
Significantly reduces forgetting in continual learning scenarios.
Effective across diverse models like ViT and CLIP.
Improves generalization in multi-modal and domain adaptation tasks.
Abstract
Parameter-efficient tunings (PETs) have demonstrated impressive performance and promising perspectives in training large models, but they still face a common problem: the trade-off between learning new content and protecting old knowledge, which leads to zero-shot generalization collapse and cross-modal hallucination. In this paper, we reformulate Adapter, LoRA, Prefix-tuning, and Prompt-tuning from the perspective of gradient projection and propose, for the first time, a unified framework called Parameter Efficient Gradient Projection (PEGP). We introduce orthogonal gradient projection into different PET paradigms and theoretically demonstrate that the orthogonal condition for the gradient can effectively resist forgetting even for large-scale models. It therefore modifies the gradient towards the direction that has less impact on the old feature space, with less extra memory space…
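The idea of steering the gradient away from the old feature space can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `old_feature_basis` and `project_gradient`, the SVD-based basis, and the tolerance `tol` are assumptions made for illustration, and the abstract does not specify how PEGP builds its projector.

```python
import numpy as np

def old_feature_basis(old_features, tol=1e-8):
    """Orthonormal basis for the span of old-task features.

    old_features: (d, n) array with one feature vector per column.
    Singular directions below tol * (largest singular value) are dropped.
    """
    U, S, _ = np.linalg.svd(old_features, full_matrices=False)
    k = int(np.sum(S > tol * S[0]))
    return U[:, :k]

def project_gradient(grad, basis):
    """Remove the part of grad that acts on the old feature subspace.

    grad: (out, d) gradient of a weight matrix; basis: (d, k) orthonormal.
    """
    return grad - grad @ basis @ basis.T

# Toy example (hypothetical data): rank-3 old features in an 8-dim space.
rng = np.random.default_rng(0)
old = rng.standard_normal((8, 3)) @ rng.standard_normal((3, 20))
grad = rng.standard_normal((4, 8))

basis = old_feature_basis(old)
grad_proj = project_gradient(grad, basis)

# A step along grad_proj leaves the layer's outputs on old features
# unchanged up to floating-point error.
drift = np.abs(grad_proj @ old).max()
```

In this sketch the only extra state kept between tasks is the `(d, k)` basis, which is one way to read the abstract's point about needing little additional memory.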
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Geophysical Methods and Applications · Machine Learning and ELM
Methods: Contrastive Language-Image Pre-training · Adapter
