Task Arithmetic with Support Languages for Low-Resource ASR
Emma Rafkin, Dan DeGenaro, Xiulin Yang

TL;DR
This paper improves low-resource automatic speech recognition (ASR) by merging task vectors from Whisper models fine-tuned on a low-resource language and a paired higher-resource language, reporting word error rate (WER) reductions of up to 10%.
Contribution
It treats training on a particular language as a task, generates task vectors by fine-tuning variants of Whisper, and merges the vectors of a high- and low-resource language pair through a linear combination optimized on downstream WER.
Findings
Up to 10% WER reduction across 23 low-resource languages.
Consistent improvements over baseline models.
Effective linear combination of task vectors for low-resource ASR.
Abstract
The development of resource-constrained approaches to automatic speech recognition (ASR) is of great interest due to its broad applicability to many low-resource languages for which there is scant usable data. Existing approaches to many low-resource natural language processing tasks leverage additional data from higher-resource languages that are closely related to a target low-resource language. One increasingly popular approach uses task arithmetic to combine models trained on different tasks to create a model for a task where there is little to no training data. In this paper, we consider training on a particular language to be a task, and we generate task vectors by fine-tuning variants of the Whisper ASR system. For pairs of high- and low-resource languages, we merge task vectors via a linear combination which is optimized on the downstream word error rate on the low-resource…
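The merging step described in the abstract can be sketched in a few lines. This is a toy illustration, not the authors' implementation: parameters are plain dictionaries of floats standing in for a model state dict, the combination is assumed to be convex with a single weight `lam`, the grid search is an assumed optimization procedure, and `evaluate_wer` is a hypothetical callback that would score a merged model on low-resource validation data.

```python
from typing import Callable, Dict, List, Tuple

# Toy stand-in for a model state dict: parameter name -> flat list of weights.
Params = Dict[str, List[float]]


def task_vector(finetuned: Params, base: Params) -> Params:
    """Task vector: fine-tuned weights minus the shared pretrained weights."""
    return {k: [f - b for f, b in zip(finetuned[k], base[k])] for k in base}


def merge(base: Params, tau_low: Params, tau_high: Params, lam: float) -> Params:
    """Return base + lam * tau_low + (1 - lam) * tau_high.

    The convex form with one weight is an assumption made for this sketch.
    """
    return {
        k: [
            b + lam * lo + (1.0 - lam) * hi
            for b, lo, hi in zip(base[k], tau_low[k], tau_high[k])
        ]
        for k in base
    }


def search_lambda(
    base: Params,
    tau_low: Params,
    tau_high: Params,
    evaluate_wer: Callable[[Params], float],  # hypothetical WER scorer
    steps: int = 10,
) -> Tuple[float, float]:
    """Grid-search lam over [0, 1] and keep the value with the lowest WER."""
    best_lam, best_wer = 0.0, float("inf")
    for i in range(steps + 1):
        lam = i / steps
        wer = evaluate_wer(merge(base, tau_low, tau_high, lam))
        if wer < best_wer:
            best_lam, best_wer = lam, wer
    return best_lam, best_wer
```

In practice the dictionaries would be replaced by the weights of the pretrained and fine-tuned checkpoints, and `evaluate_wer` by decoding a held-out set in the low-resource language and scoring the transcripts.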
Taxonomy
Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Machine Learning and Algorithms
