UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory

Haiwen Diao; Bo Wan; Ying Zhang; Xu Jia; Huchuan Lu; Long Chen

arXiv:2308.14316 · cs.CV · March 12, 2024 · 1 citation

TL;DR

UniPT introduces a universal, memory-efficient transfer learning method that uses a lightweight parallel network to improve adaptability, scalability, and performance across diverse models and tasks.

Contribution

The paper proposes UniPT, a novel parallel tuning strategy that decouples transfer learning from backbone dependencies, reducing memory use and enhancing generalizability.

Findings

01. Reduces memory consumption significantly.

02. Outperforms existing PETL methods on multiple datasets.

03. Achieves competitive or superior task performance.

Abstract

Parameter-efficient transfer learning (PETL), i.e., fine-tuning a small portion of parameters, is an effective strategy for adapting pre-trained models to downstream domains. To further reduce memory demand, recent PETL works focus on the more valuable property of memory efficiency. In this paper, we argue that the scalability, adaptability, and generalizability of state-of-the-art methods are hindered by their structural dependency on, and pertinency to, specific pre-trained backbones. To this end, we propose a new memory-efficient PETL strategy, Universal Parallel Tuning (UniPT), to mitigate these weaknesses. Specifically, we facilitate the transfer process via a lightweight and learnable parallel network, which consists of: 1) A parallel interaction module that decouples the sequential connections and processes the intermediate activations detached from the pre-trained network. 2) A…
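The general idea in the abstract, a frozen backbone whose intermediate activations are read without gradient tracking and fed to a small trainable parallel network, can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation: `FrozenBackbone`, `ParallelSide`, the layer sizes, and the summation used to merge per-layer features are all hypothetical choices made for illustration, and the second component of the parallel network is left out because the abstract is truncated before describing it.

```python
import torch
import torch.nn as nn


class FrozenBackbone(nn.Module):
    """Hypothetical stand-in for a pre-trained network (not the paper's backbone)."""

    def __init__(self, dim=16, depth=3):
        super().__init__()
        self.layers = nn.ModuleList([nn.Linear(dim, dim) for _ in range(depth)])
        # Freeze every backbone parameter.
        for p in self.parameters():
            p.requires_grad_(False)

    @torch.no_grad()  # no autograd graph is built inside the backbone
    def forward(self, x):
        feats = []
        for layer in self.layers:
            x = torch.relu(layer(x))
            feats.append(x)  # collect every intermediate activation
        return feats


class ParallelSide(nn.Module):
    """Hypothetical lightweight parallel network (assumed design, not UniPT's)."""

    def __init__(self, dim=16, side_dim=4, depth=3, num_classes=2):
        super().__init__()
        # One small projection per backbone layer; each activation is handled
        # independently rather than through a sequential chain.
        self.down = nn.ModuleList([nn.Linear(dim, side_dim) for _ in range(depth)])
        self.head = nn.Linear(side_dim, num_classes)

    def forward(self, feats):
        projected = [torch.relu(d(f)) for d, f in zip(self.down, feats)]
        # Assumption: merge by plain summation as a placeholder.
        merged = torch.stack(projected, dim=0).sum(dim=0)
        return self.head(merged)


torch.manual_seed(0)
backbone = FrozenBackbone().eval()
side = ParallelSide()

x = torch.randn(8, 16)
y = torch.randint(0, 2, (8,))

feats = backbone(x)  # detached activations
loss = nn.functional.cross_entropy(side(feats), y)
loss.backward()  # backpropagation only traverses the side network

backbone_grads = [p.grad for p in backbone.parameters()]
side_grads = [p.grad for p in side.parameters()]
```

Because the backbone runs under `torch.no_grad()`, autograd keeps no graph for it, so only the side network's activations need to be retained for the backward pass. That is the source of the memory saving this family of methods targets; how UniPT itself structures the parallel network is described in the paper and the official repository.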

Code & Models

Repositories

Paranioar/UniPT (PyTorch, official)

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Neural Networks and Applications

Methods: Attention Is All You Need · Linear Layer · Dropout · Multi-Head Attention · Byte Pair Encoding · Layer Normalization · Attention Dropout · Softmax · Dense Connections · Focus