SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning
Haiwen Diao, Bo Wan, Xu Jia, Yunzhi Zhuge, Ying Zhang, Huchuan Lu, Long Chen

TL;DR
SHERL is a novel memory-efficient transfer learning method that enhances model adaptation by consolidating intermediate features and utilizing minimal pre-trained layers, achieving high accuracy with reduced memory usage.
Contribution
The paper introduces SHERL, a two-stage strategy that pairs anti-redundancy consolidation of intermediate features with minimal use of pre-trained layers to improve transfer learning under tight resource limits.
Findings
Outperforms existing PETL methods on vision-and-language and language tasks.
Reduces memory usage during fine-tuning without sacrificing accuracy.
Effectively balances parameter efficiency and memory constraints.
Abstract
Parameter-efficient transfer learning (PETL) has emerged as a flourishing research field for adapting large pre-trained models to downstream tasks, greatly reducing trainable parameters while grappling with memory challenges during fine-tuning. To address it, memory-efficient series (METL) avoid backpropagating gradients through the large backbone. However, they compromise by exclusively relying on frozen intermediate outputs and limiting the exhaustive exploration of prior knowledge from pre-trained models. Moreover, the dependency and redundancy between cross-layer features are frequently overlooked, thereby submerging more discriminative representations and causing an inherent performance gap (vs. conventional PETL methods). Hence, we propose an innovative METL strategy called SHERL for resource-limited scenarios to decouple the entire adaptation into two successive and complementary…
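To make the memory argument in the abstract concrete, the following is a minimal pure-Python sketch of the generic METL pattern it describes: the frozen backbone is run forward only, its intermediate outputs are collected, and gradients are computed solely for a small side head. It is not SHERL's two-stage design, which the truncated abstract does not spell out. The names `frozen_backbone`, `consolidate`, and `train_side_head` are hypothetical, and plain averaging across layers is an assumed placeholder for whatever consolidation the paper actually uses.

```python
import math
import random


def frozen_backbone(x, weights):
    """Run a toy frozen backbone and return the output of every layer.

    The weights are never updated, so nothing from this forward pass has to
    be kept around for a backward pass through the backbone.
    """
    features = []
    h = x
    for w in weights:
        h = [math.tanh(sum(wi * hi for wi, hi in zip(row, h))) for row in w]
        features.append(h)
    return features


def consolidate(features):
    """Merge cross-layer features by simple averaging (assumed placeholder)."""
    dim = len(features[0])
    return [sum(f[i] for f in features) / len(features) for i in range(dim)]


def train_side_head(samples, targets, backbone_weights, lr=0.1, epochs=200):
    """Fit a linear head on consolidated frozen features with plain SGD.

    Only `head` and `bias` receive gradient updates; the backbone weights
    are read but never modified.
    """
    dim = len(backbone_weights[0])
    head = [0.0] * dim
    bias = 0.0
    losses = []
    for _ in range(epochs):
        total = 0.0
        for x, y in zip(samples, targets):
            pooled = consolidate(frozen_backbone(x, backbone_weights))
            pred = sum(h * p for h, p in zip(head, pooled)) + bias
            err = pred - y
            total += err * err
            # Gradient of the squared error with respect to the head only.
            head = [h - lr * 2 * err * p for h, p in zip(head, pooled)]
            bias -= lr * 2 * err
        losses.append(total / len(samples))
    return head, bias, losses


if __name__ == "__main__":
    rng = random.Random(0)
    dim, layers = 4, 3
    weights = [[[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]
               for _ in range(layers)]
    samples = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(8)]
    targets = [sum(x) for x in samples]
    _, _, losses = train_side_head(samples, targets, weights)
    print("first epoch loss:", losses[0], "last epoch loss:", losses[-1])
```

In a real framework the same idea would typically mean running the backbone with gradient tracking disabled, so that stored activations scale with the side head rather than with the backbone depth.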
Taxonomy
Topics: Machine Learning and ELM · Data Stream Mining Techniques · Neural Networks and Applications
