TL;DR
MP-ISMoE is a mixed-precision, memory-efficient transfer learning framework that improves performance by scaling up side networks through interactive expert selection. It targets two limitations of existing methods: the backpropagation memory overhead of PETL and the limited side-network capacity of METL.
Contribution
The paper proposes MP-ISMoE, a framework that combines the GNP-IQ quantization scheme with an interactive side mixture-of-experts to improve the efficiency and accuracy of transfer learning.
Findings
MP-ISMoE outperforms state-of-the-art METL methods in accuracy.
GNP-IQ quantizes weights to lower bit-widths while reducing quantization error.
The framework maintains parameter and memory efficiency while boosting performance.
Abstract
Parameter-efficient transfer learning (PETL) has emerged as a pivotal paradigm for adapting pre-trained foundation models to downstream tasks. It significantly reduces the number of trainable parameters, yet it suffers from substantial memory overhead caused by gradient backpropagation during fine-tuning. Memory-efficient transfer learning (METL) circumvents this challenge by bypassing backbone gradient computation via lightweight side networks, but its stringent memory constraint severely limits the learning capacity of those side networks and thereby significantly compromises performance. To address these limitations, we propose a novel Mixed-Precision Interactive Side Mixture-of-Experts framework (MP-ISMoE). Specifically, we first propose a Gaussian Noise Perturbed Iterative Quantization (GNP-IQ) scheme to quantize weights into lower bit-widths while effectively decreasing quantization errors. By leveraging…
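The abstract does not describe how GNP-IQ works, so the following is not the paper's method. It is a minimal sketch of a generic symmetric uniform quantizer, included only to illustrate the problem the abstract says GNP-IQ targets: quantization error grows as the bit-width shrinks. The function names, the Gaussian toy weights, and the choice of bit-widths are all assumptions made for illustration.

```python
import random


def quantize_uniform(weights, bits):
    """Generic symmetric uniform quantization (an illustrative stand-in, NOT GNP-IQ)."""
    qmax = 2 ** (bits - 1) - 1                     # largest integer level, e.g. 7 for 4 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0  # float step between adjacent levels
    # Round each weight to the nearest integer level, clamp, then map back to floats.
    return [max(-qmax, min(qmax, round(w / scale))) * scale for w in weights]


def mean_squared_error(a, b):
    """Average squared difference between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Hypothetical toy "weights": 1000 samples from a standard normal distribution.
rng = random.Random(0)
weights = [rng.gauss(0.0, 1.0) for _ in range(1000)]

# Quantization error at three assumed bit-widths.
errors = {
    bits: mean_squared_error(weights, quantize_uniform(weights, bits))
    for bits in (8, 4, 2)
}

for bits, err in errors.items():
    print(f"{bits}-bit MSE: {err:.6f}")
```

Under these assumptions the error increases as the bit-width drops from 8 to 4 to 2. That trade-off between memory footprint and weight fidelity is what a scheme like GNP-IQ would need to manage.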