Lightweight Adapter Tuning for Multilingual Speech Translation

Hang Le; Juan Pino; Changhan Wang; Jiatao Gu; Didier Schwab; Laurent Besacier

arXiv:2106.01463 · cs.CL · July 14, 2021

TL;DR

This paper investigates adapter modules as a parameter-efficient method for multilingual speech translation, demonstrating their effectiveness in specializing and transferring models with minimal additional parameters.

Contribution

The paper provides a comprehensive analysis of adapter tuning for multilingual speech translation, including transfer from an ASR task and from an mBART model pre-trained on non-parallel multilingual data.

Findings

01. Adapters achieve performance competitive with full fine-tuning.

02. Adapters enable efficient specialization for specific language pairs.

03. Transfer from ASR and mBART models improves multilingual speech translation.

Abstract

Adapter modules were recently introduced as an efficient alternative to fine-tuning in NLP. Adapter tuning consists in freezing pretrained parameters of a model and injecting lightweight modules between layers, resulting in the addition of only a small number of task-specific trainable parameters. While adapter tuning was investigated for multilingual neural machine translation, this paper proposes a comprehensive analysis of adapters for multilingual speech translation (ST). Starting from different pre-trained models (a multilingual ST trained on parallel data or a multilingual BART (mBART) trained on non-parallel multilingual data), we show that adapters can be used to: (a) efficiently specialize ST to specific language pairs with a low extra cost in terms of parameters, and (b) transfer from an automatic speech recognition (ASR) task and an mBART pre-trained model to a multilingual…
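The mechanism described in the abstract (frozen pretrained parameters, plus small trainable modules injected between layers) can be illustrated with a minimal sketch. It assumes a common bottleneck adapter form: a down-projection, a ReLU, an up-projection, and a residual connection. The class name `BottleneckAdapter`, the stand-in `frozen_layer`, the toy dimensions, and the zero-initialized up-projection are all hypothetical choices made for illustration; the exact architecture used in the paper is not stated in this summary, and any normalization step is omitted here.

```python
import math
import random


class BottleneckAdapter:
    """Residual bottleneck adapter: x + W_up * relu(W_down * x + b_down) + b_up.

    Illustrative only; not the paper's implementation.
    """

    def __init__(self, d_model, bottleneck, seed=0):
        rng = random.Random(seed)
        scale = 1.0 / math.sqrt(d_model)
        # Down-projection: d_model -> bottleneck (small random init).
        self.w_down = [[rng.uniform(-scale, scale) for _ in range(d_model)]
                       for _ in range(bottleneck)]
        self.b_down = [0.0] * bottleneck
        # Up-projection: bottleneck -> d_model.
        # Assumption: zero init, so the adapter starts as the identity map.
        self.w_up = [[0.0] * bottleneck for _ in range(d_model)]
        self.b_up = [0.0] * d_model

    def num_params(self):
        # Two weight matrices plus two bias vectors.
        d, b = len(self.b_up), len(self.b_down)
        return 2 * d * b + d + b

    def forward(self, x):
        hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + bias)
                  for row, bias in zip(self.w_down, self.b_down)]
        update = [sum(w * h for w, h in zip(row, hidden)) + bias
                  for row, bias in zip(self.w_up, self.b_up)]
        # Residual connection: the adapter only adds a correction to x.
        return [xi + ui for xi, ui in zip(x, update)]


def frozen_layer(x):
    # Hypothetical stand-in for a pretrained layer; its weights are never updated.
    return [2.0 * xi for xi in x]


adapter = BottleneckAdapter(d_model=8, bottleneck=2)
x = [0.5] * 8
y = adapter.forward(frozen_layer(x))  # adapter applied to the frozen layer's output
```

With these toy sizes the adapter holds 2·8·2 + 8 + 2 = 42 parameters, and only those would be trained. The count grows linearly with the bottleneck width, which is why such modules stay small relative to the frozen model they are attached to.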

Taxonomy

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · mBART · Dense Connections · Softmax · Dropout · Byte Pair Encoding · Adam · Adapter