Learning to Inference with Early Exit in the Progressive Speech Enhancement

Andong Li; Chengshi Zheng; Lu Zhang; Xiaodong Li

arXiv:2106.11730·cs.SD·June 23, 2021·1 cites

Open Access

TL;DR

This paper introduces a stage-wise adaptive inference method with early exit for progressive speech enhancement, which speeds up inference by stopping once the outputs of adjacent stages differ by less than a preset threshold, and proposes an improved model, PL-CRN++, that outperforms state-of-the-art baselines.

Contribution

It presents a novel early exit mechanism for progressive speech enhancement and an improved model, PL-CRN++, which combines a stage recurrent mechanism with complex spectral mapping.

Findings

01. The proposed system outperforms state-of-the-art baselines in PESQ, ESTOI, and DNSMOS.

02. Adjustable thresholds enable control over inference speed and system performance.

03. Extensive experiments on TIMIT demonstrate the effectiveness of the approach.

Abstract

In real scenarios, it is often necessary to control the inference speed of speech enhancement systems under different conditions. To this end, we propose a stage-wise adaptive inference approach with an early exit mechanism for progressive speech enhancement. Specifically, at each stage, once the spectral distance between the outputs of adjacent stages falls below an empirically preset threshold, inference terminates and the current estimate is output, which effectively accelerates inference. To further improve on existing speech enhancement systems, we propose PL-CRN++, an improved version of our preliminary work PL-CRN that combines a stage recurrent mechanism with complex spectral mapping. Extensive experiments on the TIMIT corpus demonstrate the superiority of our system over state-of-the-art baselines in terms of PESQ, ESTOI…
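The early exit rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the mean squared difference used as the spectral distance, the flat list representation of a spectrum, the choice to skip the check at the first stage, and every name below (`spectral_distance`, `enhance_with_early_exit`, `toy_stage`) are assumptions made purely for illustration. The toy stage stands in for a trained network and simply halves the remaining gap to an all-zero "clean" target.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Spectrum = List[float]  # hypothetical: a flattened magnitude spectrum
Stage = Callable[[Spectrum], Spectrum]


def spectral_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared difference between two spectra (an assumed metric)."""
    if len(a) != len(b):
        raise ValueError("spectra must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def enhance_with_early_exit(
    noisy: Spectrum, stages: Sequence[Stage], threshold: float
) -> Tuple[Spectrum, int]:
    """Run stages in order; stop once adjacent stage outputs are close enough.

    Returns the final estimate and the number of stages actually executed.
    """
    estimate = list(noisy)
    previous: Optional[Spectrum] = None  # output of the preceding stage
    for index, stage in enumerate(stages, start=1):
        estimate = stage(estimate)
        # Assumption: no check at stage 1, since there is no earlier stage output.
        if previous is not None and spectral_distance(estimate, previous) < threshold:
            return estimate, index
        previous = estimate
    return estimate, len(stages)


def toy_stage(estimate: Spectrum) -> Spectrum:
    """Stand-in for one enhancement stage: halve the gap to a zero target."""
    return [0.5 * value for value in estimate]


if __name__ == "__main__":
    stages = [toy_stage] * 5
    for threshold in (0.05, 0.01, 0.0):
        _, used = enhance_with_early_exit([1.0, 1.0], stages, threshold)
        print(f"threshold={threshold}: {used} stage(s) executed")
```

In this toy setting a looser threshold stops after fewer stages, while a threshold of zero never triggers the exit and runs every stage, which mirrors the speed versus quality control that the threshold is meant to provide.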

Taxonomy

Topics: Speech and Audio Processing · Advanced Adaptive Filtering Techniques · Indoor and Outdoor Localization Technologies