PACE: Marrying generalization in PArameter-efficient fine-tuning with Consistency rEgularization
Yao Ni, Shan Zhang, Piotr Koniusz

TL;DR
PACE introduces a novel regularization method combining gradient norm reduction and model alignment with consistency regularization, significantly improving the generalization of parameter-efficient fine-tuning across various tasks.
Contribution
The paper proposes PACE, a new method that enhances PEFT by combining gradient regularization and model alignment with consistency regularization, supported by theoretical analysis and extensive experiments.
Findings
PACE outperforms existing PEFT methods in visual tasks.
PACE improves performance on text classification and reasoning tasks.
Theoretical analysis confirms the benefits of gradient regularization and model alignment.
Abstract
Parameter-Efficient Fine-Tuning (PEFT) effectively adapts pre-trained transformers to downstream tasks. However, the optimization of task performance often comes at the cost of generalizability in fine-tuned models. To address this issue, we theoretically connect smaller weight gradient norms during training and larger datasets to improvements in model generalization. Motivated by this connection, we propose reducing gradient norms for enhanced generalization and aligning the fine-tuned model with its pre-trained counterpart to retain knowledge from large-scale pre-training data. Yet, naive alignment does not guarantee gradient reduction and can potentially cause gradient explosion, complicating efforts to manage gradients. To address this, we propose PACE, marrying generalization of PArameter-efficient fine-tuning with Consistency rEgularization. We perturb features learned…
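The abstract is cut off before it describes the perturbation in detail, so the following is only a generic sketch of what a consistency regularizer can look like, not the authors' formulation. It assumes a single frozen linear layer with an additive adapter branch, multiplicative Gaussian noise with mean 1 on the adapter output, and a squared-difference penalty between two noisy passes. The names `adapted_forward`, `consistency_penalty`, `w0`, `delta_w`, and `sigma` are hypothetical and do not come from the paper.

```python
import numpy as np


def adapted_forward(x, w0, delta_w, rng, sigma):
    """Frozen linear layer plus an adapter branch (hypothetical setup).

    The adapter output is scaled element-wise by Gaussian noise with
    mean 1 and standard deviation sigma; the frozen branch is untouched.
    """
    base = x @ w0            # features from the frozen pre-trained weights
    adapter = x @ delta_w    # features contributed by the adapter
    noise = rng.normal(1.0, sigma, size=adapter.shape)
    return base + noise * adapter


def consistency_penalty(x, w0, delta_w, rng, sigma=0.5):
    """Mean squared difference between two independently perturbed passes."""
    out_a = adapted_forward(x, w0, delta_w, rng, sigma)
    out_b = adapted_forward(x, w0, delta_w, rng, sigma)
    return float(np.mean((out_a - out_b) ** 2))


# Toy data: 8 samples, 16 input features, 4 output features.
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 16))
w0 = rng.normal(size=(16, 4))
delta_w = 0.1 * rng.normal(size=(16, 4))

# Penalty with a non-zero adapter, and with the adapter switched off.
penalty = consistency_penalty(x, w0, delta_w, rng)
penalty_zero = consistency_penalty(x, w0, np.zeros_like(delta_w), rng)
```

In this toy setup the penalty is exactly zero when `delta_w` is zero, because the perturbed model then coincides with the frozen one, and it grows with the size of the adapter's contribution. In training, a term like this would typically be added to the task loss with a weighting coefficient, which is also an assumption here.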
Taxonomy
Topics: Advanced Fiber Optic Sensors · Optical Network Technologies · Advanced Optical Network Technologies
Methods: Adapter
