TrimTail: Low-Latency Streaming ASR with Simple but Effective Spectrogram-Level Length Penalty
Xingchen Song, Di Wu, Zhiyong Wu, Binbin Zhang, Yuekai Zhang, Zhendong Peng, Wenpeng Li, Fuping Pan, Changbao Zhu

TL;DR
TrimTail is a simple, spectrogram-level length penalty method that reduces streaming ASR latency by 100-200ms without extra training effort, improving user delay metrics with minimal accuracy loss.
Contribution
It introduces a computationally cheap spectrogram-level length penalty that reduces latency and improves user delay metrics in streaming ASR models without requiring alignments or complex model modifications.
Findings
Achieves 100-200ms latency reduction on Aishell-1 and Librispeech.
Improves User Sensitive Delay by 400ms with less than 0.2 accuracy loss.
Compatible with various models trained with CTC or Transducer loss.
Abstract
In this paper, we present TrimTail, a simple but effective emission regularization method to improve the latency of streaming ASR models. The core idea of TrimTail is to apply length penalty (i.e., by trimming trailing frames, see Fig. 1-(b)) directly on the spectrogram of input utterances, which does not require any alignment. We demonstrate that TrimTail is computationally cheap and can be applied online and optimized with any training loss or any model architecture on any dataset without any extra effort by applying it on various end-to-end streaming ASR networks either trained with CTC loss [1] or Transducer loss [2]. We achieve 100-200ms latency reduction with equal or even better accuracy on both Aishell-1 and Librispeech. Moreover, by using TrimTail, we can achieve a 400ms algorithmic improvement of User Sensitive Delay (USD) with an accuracy loss of less than 0.2.
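The trimming step described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `trim_tail`, the parameter `max_trim`, the uniform sampling of the trim length, and the guard that skips very short utterances are all assumptions made for the example. The transcript is assumed to stay unchanged, so the model has to emit every token within the shortened input.

```python
import random


def trim_tail(frames, max_trim=20, rng=None):
    """Drop a random number of trailing frames from a spectrogram.

    `frames` is any sequence indexed by time on its first axis
    (a list of feature vectors here; slicing works the same way on
    a NumPy array or tensor of shape (time, freq)).
    `max_trim` is a hypothetical hyperparameter: the largest number
    of frames that may be removed.
    """
    rng = rng or random
    num_frames = len(frames)
    # Sample the trim length uniformly from 1..max_trim (assumed distribution).
    length = rng.randint(1, max_trim)
    # Assumed safeguard: leave the utterance untouched if trimming
    # would remove half of it or more.
    if length >= num_frames / 2:
        return frames
    # Keep only the leading frames; the label sequence is not modified.
    return frames[: num_frames - length]


# Toy "spectrogram": 100 frames, 4 frequency bins per frame.
spec = [[float(t)] * 4 for t in range(100)]
trimmed = trim_tail(spec, max_trim=20, rng=random.Random(0))
print(len(spec), "->", len(trimmed))
```

Because the operation is a single slice on the time axis, it needs no alignment and adds negligible cost, which is consistent with the abstract's claim that it can be applied online during training with either CTC or Transducer loss.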
Taxonomy
Topics: Advanced Chemical Sensor Technologies · Underwater Vehicles and Communication Systems · Advanced Computing and Algorithms
Methods: Connectionist Temporal Classification Loss
