Efficient Weight factorization for Multilingual Speech Recognition

Ngoc-Quan Pham; Tuan-Nam Nguyen; Sebastian Stueker; Alexander Waibel

arXiv:2105.03010·cs.CL·May 10, 2021

TL;DR

This paper introduces a weight factorization method for multilingual speech recognition that gives each language its own weights at a small per-language parameter cost and lowers word error rates across languages.

Contribution

It proposes an efficient weight factorization approach that decomposes each language's weight matrix into a shared component and a language-dependent component, with the latter factorized into vectors under a rank-1 assumption.

Findings

01 Reduces word error rates by approximately 26-27% relative in multilingual settings.
02 Effective for both LSTM and Transformer architectures.
03 Keeps the number of parameters added per language small while maintaining or improving performance.

Abstract

End-to-end multilingual speech recognition involves training a single model on a compositional speech corpus that includes many languages, resulting in a single neural network that handles transcription of different languages. Because each language in the training data has different characteristics, the shared network may struggle to optimize for all languages simultaneously. In this paper, we propose a novel multilingual architecture that targets the core operation in neural networks: linear transformation functions. The key idea of the method is to assign fast weight matrices to each language by decomposing each weight matrix into a shared component and a language-dependent component. The latter is then factorized into vectors using rank-1 assumptions to reduce the number of parameters per language. This efficient factorization scheme is proved to be effective in two…
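
The decomposition described in the abstract can be illustrated with a small sketch. It assumes the language-dependent component is a rank-1 outer product of two vectors that is added to the shared matrix; the abstract does not say how the two components are combined, so the additive form is an assumption. The names `factorized_linear`, `params_per_language`, and `lang_vectors` are hypothetical and not taken from the paper.

```python
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def factorized_linear(W_shared, r, s, x):
    """Apply (W_shared + outer(r, s)) @ x without forming the outer product.

    Assumption: the language-dependent component is added to the shared
    weights; the abstract does not state how the two are combined.
    """
    shared_out = matvec(W_shared, x)
    scale = sum(si * xi for si, xi in zip(s, x))  # s . x, a scalar
    return [y + ri * scale for y, ri in zip(shared_out, r)]


def params_per_language(d_out, d_in):
    """Extra parameters one language adds under the rank-1 assumption."""
    return d_out + d_in


random.seed(0)
d_out, d_in = 3, 4
W_shared = [[random.uniform(-1, 1) for _ in range(d_in)] for _ in range(d_out)]

# Hypothetical per-language vector pairs (r, s), keyed by language code.
lang_vectors = {
    lang: ([random.uniform(-1, 1) for _ in range(d_out)],
           [random.uniform(-1, 1) for _ in range(d_in)])
    for lang in ("de", "vi")
}
x = [random.uniform(-1, 1) for _ in range(d_in)]

r, s = lang_vectors["de"]
fast = factorized_linear(W_shared, r, s, x)

# Reference: build the full language-specific matrix explicitly and compare.
W_full = [[W_shared[i][j] + r[i] * s[j] for j in range(d_in)]
          for i in range(d_out)]
slow = matvec(W_full, x)
assert all(abs(a - b) < 1e-9 for a, b in zip(fast, slow))
```

Under this assumption, a 512 x 512 layer would need 262,144 parameters for a full per-language matrix, while the rank-1 pair adds only 1,024 per language, and the shared matrix product is computed once regardless of language.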

Taxonomy

Methods: Multi-Head Attention · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Adam · Residual Connection · Sigmoid Activation · Dropout · Softmax · Layer Normalization