Integrated Parameter-Efficient Tuning for General-Purpose Audio Models

Ju-ho Kim; Jungwoo Heo; Hyun-seo Shin; Chan-yeong Lim; Ha-Jin Yu

arXiv:2211.02227 · eess.AS · March 3, 2023 · 1 citation

TL;DR

This paper introduces an integrated parameter-efficient tuning framework for large pre-trained audio models, achieving high performance on various tasks with fewer trainable parameters, reducing training costs and environmental impact.

Contribution

The paper proposes the IPET framework, which combines embedding prompts with adapters, and demonstrates its effectiveness across multiple audio tasks on different pre-trained backbone models.

Findings

01. IPET outperforms traditional fine-tuning in accuracy.

02. Fewer trainable parameters are needed for comparable performance.

03. Analysis reveals limitations and future directions for parameter-efficient tuning.
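
To give a feel for why combining adapters and embedding prompts needs so few trainable parameters, the arithmetic can be sketched as follows. This is only an illustration under stated assumptions, not the paper's configuration: the model width, layer count, bottleneck size, number of prompt tokens, and the choice of one adapter per layer are all hypothetical, and the count ignores input embeddings and any task-specific head.

```python
# Rough trainable-parameter arithmetic for adapter + prompt tuning.
# All sizes below are hypothetical and are NOT taken from the paper.


def encoder_layer_params(d_model: int, ffn_mult: int = 4) -> int:
    """Weights and biases of one standard Transformer encoder layer."""
    # Query, key, value, and output projections, each d_model x d_model + bias.
    attention = 4 * (d_model * d_model + d_model)
    d_ffn = ffn_mult * d_model
    # Two linear layers of the feed-forward block, with biases.
    feed_forward = (d_model * d_ffn + d_ffn) + (d_ffn * d_model + d_model)
    # Two LayerNorms, each with a scale and a shift vector.
    layer_norms = 2 * (2 * d_model)
    return attention + feed_forward + layer_norms


def adapter_params(d_model: int, bottleneck: int) -> int:
    """Bottleneck adapter: down-projection, then up-projection, with biases."""
    down = d_model * bottleneck + bottleneck
    up = bottleneck * d_model + d_model
    return down + up


def prompt_params(num_prompts: int, d_model: int) -> int:
    """Learnable prompt vectors prepended to the input sequence."""
    return num_prompts * d_model


# Hypothetical sizes chosen only for illustration.
D_MODEL, NUM_LAYERS, BOTTLENECK, NUM_PROMPTS = 768, 12, 64, 8

# Frozen backbone (encoder layers only).
backbone_total = NUM_LAYERS * encoder_layer_params(D_MODEL)

# Trainable part: one adapter per layer (an assumption) plus the input prompts.
tuned_total = (
    NUM_LAYERS * adapter_params(D_MODEL, BOTTLENECK)
    + prompt_params(NUM_PROMPTS, D_MODEL)
)

trainable_fraction = tuned_total / backbone_total

print(f"backbone parameters:  {backbone_total:,}")
print(f"trainable parameters: {tuned_total:,}")
print(f"trainable fraction:   {trainable_fraction:.2%}")
```

With these assumed sizes the trainable part is on the order of one percent of the frozen encoder. The actual ratio depends on the backbone and on where adapters and prompts are placed, which this summary does not specify.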

Abstract

The advent of hyper-scale and general-purpose pre-trained models is shifting the paradigm of building task-specific models for target tasks. In the field of audio research, task-agnostic pre-trained models with high transferability and adaptability have achieved state-of-the-art performance through fine-tuning for downstream tasks. Nevertheless, re-training all the parameters of these massive models entails an enormous amount of time and cost, along with a huge carbon footprint. To overcome these limitations, the present study explores and applies efficient transfer learning methods in the audio domain. We also propose an integrated parameter-efficient tuning (IPET) framework by aggregating the embedding prompt (a prompt-based learning approach) and the adapter (an effective transfer learning method). We demonstrate the efficacy of the proposed framework using two backbone pre-trained…

Code & Models

Repositories

wngh1187/ipet (PyTorch, official)

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Speech Recognition and Synthesis

Methods: Adapter