Selective Attention Merging for low resource tasks: A case study of Child ASR

Natarajan Balaji Shankar; Zilai Wang; Eray Eren; Abeer Alwan

arXiv:2501.08468 · cs.CL · January 16, 2025

TL;DR

This paper introduces Selective Attention (SA) Merge, a model merging technique that improves low-resource child ASR by leveraging knowledge from models trained on larger, more diverse speech corpora. It achieves relative WER reductions of up to 14% and a state-of-the-art WER of 8.69 on the MyST database with Whisper-small.

Contribution

The paper proposes Selective Attention (SA) Merge, a method that selectively merges task vectors from attention matrices to improve Speech Foundation Model performance on low-resource speech recognition tasks.
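As a rough illustration of what merging task vectors from attention matrices can look like, here is a minimal sketch. It assumes the common definition of a task vector (fine-tuned weights minus pretrained weights) and a simple interpolation with weight `lam`; the parameter-name fragments in `ATTENTION_KEYS`, the function names, and the choice to copy non-attention parameters from the target model are all hypothetical and are not taken from the paper or its repository.

```python
from typing import Dict, List

# Toy parameter store: name -> flat list of weights (stands in for real tensors).
Params = Dict[str, List[float]]

# Hypothetical name fragments used to spot attention projection weights.
ATTENTION_KEYS = ("q_proj", "k_proj", "v_proj")


def task_vector(finetuned: Params, pretrained: Params) -> Params:
    """Task vector: element-wise difference between fine-tuned and pretrained weights."""
    return {
        name: [f - p for f, p in zip(finetuned[name], pretrained[name])]
        for name in pretrained
    }


def sa_merge_sketch(pretrained: Params, target_ft: Params,
                    donor_ft: Params, lam: float = 0.5) -> Params:
    """Interpolate task vectors for attention weights only (assumed scheme).

    Attention parameters get pretrained + lam * tau_target + (1 - lam) * tau_donor.
    All other parameters are copied from the target fine-tuned model.
    """
    tau_target = task_vector(target_ft, pretrained)
    tau_donor = task_vector(donor_ft, pretrained)
    merged: Params = {}
    for name, base in pretrained.items():
        if any(key in name for key in ATTENTION_KEYS):
            merged[name] = [
                b + lam * t + (1.0 - lam) * d
                for b, t, d in zip(base, tau_target[name], tau_donor[name])
            ]
        else:
            merged[name] = list(target_ft[name])
    return merged


# Toy usage with made-up weights.
pretrained = {"layer0.attn.q_proj.weight": [1.0, 2.0], "layer0.mlp.fc1.weight": [0.0]}
target_ft = {"layer0.attn.q_proj.weight": [2.0, 2.0], "layer0.mlp.fc1.weight": [5.0]}
donor_ft = {"layer0.attn.q_proj.weight": [1.0, 4.0], "layer0.mlp.fc1.weight": [9.0]}
merged = sa_merge_sketch(pretrained, target_ft, donor_ft, lam=0.5)
```

In this toy setup the attention weights blend both task vectors while the MLP weights come straight from the target model, which is one plausible reading of "selective"; the official repository should be consulted for the actual procedure.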

Findings

1. Up to 14% relative WER reduction on the MyST database

2. State-of-the-art WER of 8.69 on MyST for Whisper-small, achieved with SA Merge combined with data augmentation

3. Data augmentation and model merging combine effectively
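For readers unfamiliar with the metric, a relative WER reduction compares the improvement to the baseline error rate rather than reporting the absolute difference. The small sketch below shows the arithmetic; the function name and the example numbers are hypothetical and are not results from the paper.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER reduction in percent."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# Hypothetical numbers for illustration only: a drop from 10.0 to 8.6 WER
# is a 1.4-point absolute reduction and roughly a 14% relative reduction.
example = relative_wer_reduction(baseline_wer=10.0, new_wer=8.6)
```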

Abstract

While Speech Foundation Models (SFMs) excel in various speech tasks, their performance for low-resource tasks such as child Automatic Speech Recognition (ASR) is hampered by limited pretraining data. To address this, we explore different model merging techniques to leverage knowledge from models trained on larger, more diverse speech corpora. This paper also introduces Selective Attention (SA) Merge, a novel method that selectively merges task vectors from attention matrices to enhance SFM performance on low-resource tasks. Experiments on the MyST database show significant reductions in relative word error rate of up to 14%, outperforming existing model merging and data augmentation techniques. By combining data augmentation techniques with SA Merge, we achieve a new state-of-the-art WER of 8.69 on the MyST database for the Whisper-small model, highlighting the potential of SA Merge for…

Code & Models

Repositories

balaji1312/sa_merging (PyTorch, official)

Taxonomy

Topics: EEG and Brain-Computer Interfaces

Methods: Softmax · Attention Is All You Need