Language-Aware Prompt Tuning for Parameter-Efficient Seamless Language Expansion in Multilingual ASR

Hongli Yang, Sheng Li, Hao Huang, Ayiduosi Tuohan, and Yizhou Peng

arXiv:2506.21577·cs.CL·September 29, 2025

TL;DR

This paper introduces language-aware prompt tuning for multilingual ASR, which improves expansion to unseen languages, together with SPT-Whisper, a toolkit that integrates soft prompt tuning into Whisper. The gains come with minimal additional computation.

Contribution

The paper proposes two methods for multilingual ASR, Entire SPT and LAPT, and introduces the SPT-Whisper toolkit, which integrates SPT into Whisper and enables efficient continual learning.

Findings

01

Entire SPT and LAPT outperform Decoder SPT in language expansion tasks.

02

LAPT achieves a 16.0% improvement over Decoder SPT in language expansion tasks; Entire SPT achieves 5.0%.

03

The methods enable efficient multilingual ASR with minimal overhead.

Abstract

Recent advancements in multilingual automatic speech recognition (ASR) have been driven by large-scale end-to-end models like Whisper. However, challenges such as language interference and expanding to unseen languages (language expansion) without degrading performance persist. This paper addresses these with three contributions: 1) Entire Soft Prompt Tuning (Entire SPT), which applies soft prompts to both the encoder and decoder, enhancing feature extraction and decoding; 2) Language-Aware Prompt Tuning (LAPT), which leverages cross-lingual similarities to encode shared and language-specific features using lightweight prompt matrices; 3) SPT-Whisper, a toolkit that integrates SPT into Whisper and enables efficient continual learning. Experiments across three languages from FLEURS demonstrate that Entire SPT and LAPT outperform Decoder SPT by 5.0% and 16.0% in language expansion tasks,…
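The two prompt-based ideas in the abstract can be illustrated with a small, dependency-free sketch. It is not the SPT-Whisper implementation: the class `LanguageAwarePrompts`, the helpers `make_prompt` and `prepend_prompt`, the prompt lengths, and the language labels are all hypothetical, and the choice to concatenate a shared prompt with a language-specific prompt is an assumption, since the abstract does not say how the two are combined. The sketch only shows the general shape: prompt vectors are prepended on both the encoder and decoder side (the Entire SPT idea), and adding a language means adding one small prompt matrix while leaving existing ones untouched.

```python
import copy
import random


def make_prompt(length, dim, seed):
    """Return a length x dim matrix of small random values.

    Stands in for a trainable prompt matrix; no training happens here.
    """
    rng = random.Random(seed)
    return [[rng.uniform(-0.02, 0.02) for _ in range(dim)] for _ in range(length)]


def prepend_prompt(prompt, embeddings):
    """Prepend prompt vectors to a sequence of embedding vectors."""
    if prompt and embeddings and len(prompt[0]) != len(embeddings[0]):
        raise ValueError("prompt and embedding dimensions must match")
    return prompt + embeddings


class LanguageAwarePrompts:
    """Hypothetical container: one shared prompt plus one prompt per language."""

    def __init__(self, dim, shared_len, lang_len):
        self.dim = dim
        self.lang_len = lang_len
        self.shared = make_prompt(shared_len, dim, seed=0)
        self.per_language = {}

    def add_language(self, lang):
        """Language expansion: add a new prompt, leave existing ones untouched."""
        if lang in self.per_language:
            raise KeyError(f"language already registered: {lang}")
        seed = len(self.per_language) + 1
        self.per_language[lang] = make_prompt(self.lang_len, self.dim, seed)

    def prompt_for(self, lang):
        # Assumption: shared and language-specific prompts are concatenated.
        return self.shared + self.per_language[lang]


DIM = 4
# One prompt set per side: prompts on the encoder AND the decoder.
encoder_prompts = LanguageAwarePrompts(DIM, shared_len=2, lang_len=1)
decoder_prompts = LanguageAwarePrompts(DIM, shared_len=2, lang_len=1)
for prompts in (encoder_prompts, decoder_prompts):
    prompts.add_language("lang_a")

# Placeholder inputs standing in for audio features and token embeddings.
audio_features = [[0.0] * DIM for _ in range(5)]
token_embeddings = [[1.0] * DIM for _ in range(3)]

enc_in = prepend_prompt(encoder_prompts.prompt_for("lang_a"), audio_features)
dec_in = prepend_prompt(decoder_prompts.prompt_for("lang_a"), token_embeddings)
print(len(enc_in), len(dec_in))  # 8 6

# Adding a second language does not modify the first language's prompt.
snapshot = copy.deepcopy(encoder_prompts.per_language["lang_a"])
encoder_prompts.add_language("lang_b")
print(encoder_prompts.per_language["lang_a"] == snapshot)  # True
```

In a real setup the prompt matrices would be the trainable parameters, and the backbone model would typically stay frozen, which is what keeps the per-language cost small.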
