Efficient Sequence Training of Attention Models using Approximative Recombination
Nils-Philipp Wynands, Wilfried Michel, Jan Rosendahl, Ralf Schlüter, Hermann Ney

TL;DR
This paper introduces approximative recombination of hypotheses during beam search to make sequence discriminative training of attention models efficient. The effective beam size grows by several orders of magnitude without a significant increase in computational cost, and the method is evaluated on LibriSpeech.
Contribution
It proposes a novel hypothesis recombination technique during beam search for sequence training, improving efficiency and scalability of training attention-based models.
Findings
Effective increase in beam size by several orders of magnitude
Maintains computational efficiency during sequence training
Achieves competitive results on LibriSpeech
Abstract
Sequence discriminative training is a great tool to improve the performance of an automatic speech recognition system. It does, however, necessitate a sum over all possible word sequences, which is intractable to compute in practice. Current state-of-the-art systems with unlimited label context circumvent this problem by limiting the summation to an n-best list of relevant competing hypotheses obtained from beam search. This work proposes to perform (approximative) recombinations of hypotheses during beam search, if they share a common local history. The error that is incurred by the approximation is analyzed and it is shown that using this technique the effective beam size can be increased by several orders of magnitude without significantly increasing the computational requirements. Lastly, it is shown that this technique can be used to effectively perform sequence discriminative…
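The recombination idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `recombine`, the parameter `history_len`, and the choice to keep the best-scoring member of each group as its representative are assumptions made for illustration. Hypotheses whose last `history_len` labels agree are merged, and their probabilities are summed in log space, so one surviving hypothesis stands in for several paths.

```python
import math
from collections import defaultdict


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a non-empty list of log scores."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def recombine(hyps, history_len):
    """Merge beam hypotheses that share their last `history_len` labels.

    hyps: list of (labels, log_score), where labels is a tuple of tokens.
    Returns the merged hypotheses, sorted by descending log score.
    Assumption: the best-scoring member represents the merged group,
    and the group's score is the log of the summed probabilities.
    """
    if history_len < 1:
        raise ValueError("history_len must be at least 1")

    # Group hypotheses by their local history (the last history_len labels).
    groups = defaultdict(list)
    for labels, score in hyps:
        groups[tuple(labels[-history_len:])].append((labels, score))

    merged = []
    for members in groups.values():
        best_labels, _ = max(members, key=lambda h: h[1])
        total = logsumexp([score for _, score in members])
        merged.append((best_labels, total))

    merged.sort(key=lambda h: h[1], reverse=True)
    return merged


beam = [
    (("a", "b", "c"), math.log(0.4)),
    (("x", "b", "c"), math.log(0.2)),  # shares local history ("b", "c")
    (("a", "b", "d"), math.log(0.1)),
]
merged = recombine(beam, history_len=2)
```

In this toy beam, the first two hypotheses collapse into one entry that carries their combined probability mass, freeing a beam slot for another competitor. For a model with unlimited label context, continuing from only the representative's state is an approximation, which corresponds to the error the abstract says is analyzed.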
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Advanced Data Compression Techniques
