Full-Rank No More: Low-Rank Weight Training for Modern Speech Recognition Models

Adriana Fernandez-Lopez; Shiwei Liu; Lu Yin; Stavros Petridis; Maja Pantic

arXiv:2410.07771 · cs.SD · October 11, 2024

Open Access

TL;DR

This paper explores low-rank weight training for large-scale speech recognition models, showing that selective low-rank constraints can maintain performance while reducing parameters and training time.

Contribution

The paper introduces a low-rank training approach for speech recognition models, shows that initialization and layer-wise rank assignment are critical to its success, and proposes LR-SMS, which matches full-rank performance with fewer resources.

Findings

01. Low-rank attention modules can improve performance with 12% rank reduction.

02. Feed-forward layers degrade with 50% rank reduction.

03. LR-SMS reduces parameters by at least 2x and speeds up training by 1.3x (ASR).
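To make the parameter figures above concrete, here is a minimal sketch of the arithmetic behind low-rank weights. It assumes the common parameterization in which a dense weight of shape (d_out, d_in) is replaced by two factors of shapes (d_out, r) and (r, d_in). The summary does not state how LR-SMS parameterizes its layers, so the function names and the layer sizes in the usage lines are hypothetical.

```python
def dense_params(d_out, d_in):
    """Parameter count of a full-rank weight matrix (bias ignored)."""
    return d_out * d_in


def factored_params(d_out, d_in, rank):
    """Parameter count of two factors U (d_out x rank) and V (rank x d_in)."""
    return rank * (d_out + d_in)


def compression_ratio(d_out, d_in, rank):
    """How many times smaller the factored layer is than the dense one."""
    return dense_params(d_out, d_in) / factored_params(d_out, d_in, rank)


# Hypothetical square projection: a 2x saving needs rank = d / 4.
ratio_square = compression_ratio(512, 512, 128)

# Hypothetical feed-forward expansion layer with a 4x hidden width.
ratio_ffn = compression_ratio(2048, 512, 128)
```

Under this assumption, the factored form saves parameters only when r is below d_out * d_in / (d_out + d_in), which is why a modest rank reduction alone does not guarantee a smaller model.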

Abstract

This paper investigates the under-explored area of low-rank weight training for large-scale Conformer-based speech recognition models from scratch. Our study demonstrates the viability of this training paradigm for such models, yielding several notable findings. Firstly, we discover that applying a low-rank structure exclusively to the attention modules can unexpectedly enhance performance, even with a significant rank reduction of 12%. In contrast, feed-forward layers present greater challenges, as they begin to exhibit performance degradation with a moderate 50% rank reduction. Furthermore, we find that both initialization and layer-wise rank assignment play critical roles in successful low-rank training. Specifically, employing SVD initialization and linear layer-wise rank mapping significantly boosts the efficacy of low-rank weight training. Building on these insights, we introduce…
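The two ingredients the abstract names, SVD initialization and linear layer-wise rank mapping, can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes a truncated SVD whose singular values are split evenly between the two factors, and a rank that changes linearly with layer depth between two fractions of the full rank. The abstract does not say whether the rank grows or shrinks with depth, so both endpoints are left as parameters. The names `svd_init` and `linear_rank_schedule` are hypothetical.

```python
import numpy as np


def svd_init(weight, rank):
    """Factor a (d_out, d_in) matrix into U (d_out, rank) and V (rank, d_in).

    U @ V is the best rank-`rank` approximation of `weight` in the
    Frobenius norm; the singular values are split evenly via square roots.
    """
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    sqrt_s = np.sqrt(s[:rank])
    U = u[:, :rank] * sqrt_s           # scale each kept column of u
    V = sqrt_s[:, None] * vt[:rank]    # scale each kept row of vt
    return U, V


def linear_rank_schedule(num_layers, full_rank, start_frac, end_frac):
    """Assign one rank per layer, interpolating linearly across depth.

    start_frac and end_frac are fractions of full_rank for the first and
    last layer (assumed values; the source gives no concrete schedule).
    """
    fracs = np.linspace(start_frac, end_frac, num_layers)
    return [max(1, int(round(float(f) * full_rank))) for f in fracs]


# Usage with made-up sizes: factor a random 6x4 weight at rank 2.
rng = np.random.default_rng(0)
W = rng.standard_normal((6, 4))
U, V = svd_init(W, 2)
ranks = linear_rank_schedule(4, 100, 0.5, 0.2)
```

In a from-scratch setting, `weight` would be a freshly initialized dense matrix rather than a trained one; the sketch leaves that choice to the caller.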

Taxonomy

Topics: Speech Recognition and Synthesis

Methods: Softmax · Attention Is All You Need