Parameter-Efficient Fine-Tuning without Introducing New Latency

Baohao Liao; Yan Meng; Christof Monz

arXiv:2305.16742 · cs.CL · May 29, 2023 · 1 citation

TL;DR

This paper introduces a parameter-efficient fine-tuning method that uses a shared, task-agnostic sparse mask and a novel adapter technique, achieving state-of-the-art results without increasing inference latency or storage requirements.

Contribution

It proposes a new PEFT approach with a shared sparse mask and direct adapter application, improving performance and efficiency without added latency.

Findings

01 Surpasses existing PEFT methods on the GLUE benchmark

02 Stores only 0.03% of the parameters that full fine-tuning stores

03 Achieves state-of-the-art performance in both efficiency and accuracy

Abstract

Parameter-efficient fine-tuning (PEFT) of pre-trained language models has recently demonstrated remarkable achievements, effectively matching the performance of full fine-tuning while utilizing significantly fewer trainable parameters, and consequently addressing the storage and communication constraints. Nonetheless, various PEFT methods are limited by their inherent characteristics. In the case of sparse fine-tuning, which involves modifying only a small subset of the existing parameters, the selection of fine-tuned parameters is task- and domain-specific, making it unsuitable for federated learning. On the other hand, PEFT methods that add new parameters typically introduce additional inference latency. In this paper, we demonstrate the feasibility of generating a sparse mask in a task-agnostic manner, wherein all downstream tasks share a common mask. Our approach, which relies…
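
To make the shared-mask idea concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. The abstract above is truncated before it states how the mask is chosen, so selecting the smallest-magnitude pre-trained weights is an assumption made for illustration, as are the names `shared_mask` and `merge`, the 0.5% fraction, and the constant stand-in updates. The sketch shows only two properties the abstract does claim: the mask is computed without task data, so every task shares it, and the fine-tuned values overwrite existing weights, so the merged model has the same size and inference cost as the original.

```python
import random


def shared_mask(pretrained, fraction):
    """Return sorted indices of the `fraction` of weights with the smallest magnitude.

    Assumption for illustration: selection depends only on the pre-trained
    weights, so no task data is needed and all tasks get the same indices.
    """
    k = max(1, int(len(pretrained) * fraction))
    order = sorted(range(len(pretrained)), key=lambda i: abs(pretrained[i]))
    return sorted(order[:k])


def merge(pretrained, mask, delta):
    """Add one task's sparse update into a copy of the pre-trained weights."""
    merged = list(pretrained)
    for i, d in zip(mask, delta):
        merged[i] += d
    return merged


random.seed(0)
# Toy stand-in for a flattened pre-trained model
pretrained = [random.gauss(0.0, 1.0) for _ in range(10_000)]
mask = shared_mask(pretrained, fraction=0.005)  # hypothetical fraction

# Stand-ins for two tasks' learned updates; real values would come from training
delta_a = [0.01] * len(mask)
delta_b = [-0.02] * len(mask)

# Each task stores only its delta values; the index set is stored once
weights_a = merge(pretrained, mask, delta_a)
weights_b = merge(pretrained, mask, delta_b)
```

Per task, only the `len(mask)` update values need to be stored or communicated, and because `merge` returns a list of the same length as `pretrained`, inference runs on an unchanged architecture.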

Code & Models

Repositories

Aradhye2002/selective-peft-toolkit
pytorch

Taxonomy

Topics: Topic Modeling · Speech Recognition and Synthesis · Domain Adaptation and Few-Shot Learning

Methods: Adapter · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings