Macro-block dropout for improved regularization in training end-to-end speech recognition models

Chanwoo Kim; Sathish Indurti; Jinhwan Park; Wonyong Sung

arXiv:2212.14149·cs.LG·January 2, 2023

TL;DR

This paper introduces macro-block dropout, a regularization method for end-to-end speech recognition models that drops whole macro-blocks of units rather than individual units, yielding relative Word Error Rate improvements of about 4% to 6% over conventional dropout.

Contribution

The paper proposes macro-block dropout, a new regularization technique that applies dropout to large macro-blocks, enhancing model generalization in speech recognition tasks.

Findings

01

Relative WER improvements of 4.30% and 6.13% over conventional dropout with RNN-T

02

Relative WER improvements of 4.36% and 5.85% with an Attention-based Encoder-Decoder (AED) model

03

Effective regularization for large neural networks

Abstract

This paper proposes a new regularization algorithm referred to as macro-block dropout. Overfitting has been a difficult problem in training large neural network models. The dropout technique has proven to be simple yet very effective for regularization by preventing complex co-adaptations during training. In our work, we define a macro-block that contains a large number of units from the input to a Recurrent Neural Network (RNN). Rather than applying dropout to each unit, we apply random dropout to each macro-block. This algorithm has the effect of applying different dropout rates to each layer even if we keep a constant average dropout rate, which gives a better regularization effect. In our experiments using a Recurrent Neural Network-Transducer (RNN-T), this algorithm shows relative Word Error Rate (WER) improvements of 4.30% and 6.13% over the conventional dropout on…
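The block-level masking described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `macro_block_dropout`, the use of contiguous equal-sized blocks along the feature axis, and the `1 / (1 - drop_rate)` rescaling borrowed from conventional inverted dropout are all assumptions made for illustration.

```python
import numpy as np


def macro_block_dropout(x, num_blocks, drop_rate, rng):
    """Drop whole blocks of features instead of individual units.

    Hypothetical helper, not taken from the paper.
    x          : array of shape (batch, features), e.g. the input to an RNN layer
    num_blocks : number of contiguous macro-blocks along the feature axis
    drop_rate  : probability of dropping each macro-block
    rng        : a numpy.random.Generator
    """
    if not 0.0 <= drop_rate < 1.0:
        raise ValueError("drop_rate must be in [0, 1)")
    batch, features = x.shape
    if features % num_blocks != 0:
        raise ValueError("features must be divisible by num_blocks")
    block_size = features // num_blocks

    # One keep/drop decision per (example, macro-block), not per unit.
    keep = rng.random((batch, num_blocks)) >= drop_rate

    # Expand each block decision to cover every unit inside that block.
    mask = np.repeat(keep, block_size, axis=1)

    # Assumption: rescale as in inverted dropout so the expected value is unchanged.
    return x * mask / (1.0 - drop_rate)


# Usage: 3 examples, 8 features split into 4 macro-blocks of 2 units each.
rng = np.random.default_rng(0)
x = np.ones((3, 8))
y = macro_block_dropout(x, num_blocks=4, drop_rate=0.25, rng=rng)
```

Because there are only a few macro-blocks per layer, the fraction of units actually dropped in any one draw varies much more than with unit-level dropout, while its average stays at `drop_rate`. This is one way to read the abstract's remark about different effective dropout rates per layer under a constant average rate.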

Taxonomy

Topics: Speech Recognition and Synthesis · Neural Networks and Applications · Speech and Audio Processing

Methods: Test · Dropout