Factorized Blank Thresholding for Improved Runtime Efficiency of Neural Transducers

Duc Le; Frank Seide; Yuhao Wang; Yang Li; Kjell Schubert; Ozlem Kalinli; Michael L. Seltzer

arXiv:2211.00896 · eess.AS · March 7, 2023

TL;DR

This paper introduces a factorized blank thresholding method for RNN-T models that speeds up decoding by 26-30% and reduces on-device power consumption by 43-53% without sacrificing accuracy.

Contribution

The paper applies HAT-style joiner factorization to RNN-T so that the more expensive non-blank computation is skipped whenever the blank probability exceeds a threshold, improving runtime efficiency.

Findings

01 Achieved 26-30% decoding speed-up

02 Reduced on-device power consumption by 43-53%

03 No accuracy loss observed

Abstract

We show how factoring the RNN-T's output distribution can significantly reduce the computation cost and power consumption for on-device ASR inference with no loss in accuracy. With the rise in popularity of neural-transducer type models like the RNN-T for on-device ASR, optimizing RNN-T's runtime efficiency is of great interest. While previous work has primarily focused on the optimization of RNN-T's acoustic encoder and predictor, this paper focuses the attention on the joiner. We show that despite being only a small part of RNN-T, the joiner has a large impact on the overall model's runtime efficiency. We propose to utilize HAT-style joiner factorization for the purpose of skipping the more expensive non-blank computation when the blank probability exceeds a certain threshold. Since the blank probability can be computed very efficiently and the RNN-T output is dominated by blanks, our…
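The thresholding idea described in the abstract can be illustrated with a minimal sketch. It assumes a HAT-style factorization in which the blank probability comes from a sigmoid over a single blank logit and the non-blank labels share the remaining probability mass through a softmax. The function names, the `label_logits_fn` callback, and the threshold value of 0.9 are hypothetical choices for illustration, not taken from the paper.

```python
import math
from typing import Callable, List, Optional, Tuple


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def softmax(logits: List[float]) -> List[float]:
    """Softmax with max-subtraction for numerical stability."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def factorized_joiner_step(
    blank_logit: float,
    label_logits_fn: Callable[[], List[float]],
    threshold: float = 0.9,  # hypothetical value, not from the paper
) -> Tuple[float, Optional[List[float]]]:
    """Return (blank probability, non-blank label probabilities or None).

    The cheap blank probability is computed first. The expensive
    non-blank projection (label_logits_fn) is only invoked when the
    blank probability does not exceed the threshold.
    """
    p_blank = sigmoid(blank_logit)
    if p_blank > threshold:
        # Treat this step as blank and skip the non-blank computation.
        return p_blank, None
    # Non-blank labels share the remaining (1 - p_blank) probability mass.
    label_probs = [(1.0 - p_blank) * p for p in softmax(label_logits_fn())]
    return p_blank, label_probs
```

Passing the non-blank projection as a callback makes the saving explicit: when the blank probability is above the threshold, the callback is never invoked. In a real decoder the callback would stand in for the joiner's large vocabulary projection.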

Taxonomy

Topics: Speech Recognition and Synthesis · Underwater Acoustics Research · Neural Networks and Applications