Inference skipping for more efficient real-time speech enhancement with parallel RNNs

Xiaohuai Le; Tong Lei; Kai Chen; Jing Lu

arXiv:2207.11108 · cs.SD · July 25, 2022

TL;DR

This paper applies the Skip-RNN inference-skipping strategy to real-time speech enhancement models with parallel RNNs. The approach significantly reduces computational load without evident audio artifacts, and voice activity detection (VAD) guidance regularizes the skipping to save further computation.

Contribution

Introduces a Skip-RNN strategy with VAD guidance for efficient real-time speech enhancement, which outperforms traditional pruning and smaller models.

Findings

01 Significant reduction in computational load without evident audio artifacts.

02 Better results than pruning and smaller models in the reported experiments.

03 Effective generalization across multiple speech enhancement models.

Abstract

Deep neural network (DNN) based speech enhancement models have attracted extensive attention due to their promising performance. However, it is difficult to deploy a powerful DNN in real-time applications because of its high computational cost. Typical compression methods such as pruning and quantization do not make good use of the data characteristics. In this paper, we introduce the Skip-RNN strategy into speech enhancement models with parallel RNNs. The states of the RNNs update intermittently without interrupting the update of the output mask, which leads to significant reduction of computational load without evident audio artifacts. To better leverage the difference between the voice and the noise, we further regularize the skipping strategy with voice activity detection (VAD) guidance, saving more computational load. Experiments on a high-performance speech enhancement model,…
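
The intermittent state update described in the abstract can be illustrated with a toy sketch. It follows the general Skip-RNN idea of a binarized update gate, where the state is copied forward whenever the gate is closed. It is not the authors' implementation, and it does not model the parallel RNN structure, the output mask, or the VAD regularizer. The function name `run_skip_rnn`, the scalar state, and all weight values are hypothetical choices made only for illustration.

```python
import math


def sigmoid(z):
    """Logistic function used for the update-gate increment."""
    return 1.0 / (1.0 + math.exp(-z))


def run_skip_rnn(xs, w_in=0.5, w_rec=0.8, w_gate=1.0, b_gate=-1.0):
    """Toy scalar skip-RNN over a sequence of frames `xs`.

    All weights are hypothetical values, not taken from the paper.
    Returns the state after every frame and a 0/1 flag per frame
    telling whether the recurrent update was actually computed.
    """
    state = 0.0
    u_tilde = 1.0  # accumulated update probability; the first frame always updates
    states, updated = [], []
    for x in xs:
        u = 1 if u_tilde >= 0.5 else 0  # binarize the gate
        if u:
            # Full recurrent update: this is the costly step in a real model.
            state = math.tanh(w_in * x + w_rec * state)
        # Otherwise the previous state is copied forward and the update is skipped.
        delta = sigmoid(w_gate * state + b_gate)
        if u:
            u_tilde = delta  # restart accumulation after an update
        else:
            u_tilde = u_tilde + min(delta, 1.0 - u_tilde)  # accumulate until it crosses 0.5
        states.append(state)
        updated.append(u)
    return states, updated


# Usage: the fraction of frames that paid for a recurrent update.
frames = [0.1, 0.9, -0.3, 0.4, 0.0, 0.7, -0.2, 0.5]
states, updated = run_skip_rnn(frames)
print("update rate:", sum(updated) / len(updated))
```

In this sketch the saving comes from the frames where the flag is 0: the state is reused, so only the cheap gate increment is evaluated. A downstream layer could still read the copied state on every frame, which is one way to keep producing an output per frame while the recurrent state updates only intermittently.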
