Reinforcement Learning To Adapt Speech Enhancement to Instantaneous Input Signal Quality

Rasool Fakoor; Xiaodong He; Ivan Tashev; Shuayb Zarar

arXiv:1711.10791 · cs.LG · July 30, 2018 · 2 cites

TL;DR

This paper introduces a reinforcement learning-based method to dynamically adapt speech enhancement algorithms, significantly improving their robustness and performance across varying input signal qualities.

Contribution

The paper proposes a reinforcement learning approach that adapts noise-suppression parameters in real time while treating the suppressor as a black box, so no detailed knowledge of the algorithm's internal mechanics is required.

Findings

01. 42% improvement in output SNR

02. 16% reduction in MSE

03. Enhanced robustness across diverse noise conditions

Abstract

Today, the optimal performance of existing noise-suppression algorithms, both data-driven and those based on classic statistical methods, is range-bound to specific levels of instantaneous input signal-to-noise ratio. In this paper, we present a new approach to improve the adaptivity of such algorithms, enabling them to perform robustly across a wide range of input signal and noise types. Our methodology is based on the dynamic control of algorithmic parameters via reinforcement learning. Specifically, we model the noise-suppression module as a black box, requiring no knowledge of the algorithmic mechanics except simple feedback from the output. We use this feedback as the reward signal for a reinforcement-learning agent that learns a policy to adapt the algorithmic parameters for every incoming audio frame (16 ms of data). Our preliminary results show that such a control…
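
The control loop described in the abstract can be illustrated with a toy sketch. This is not the authors' agent or suppressor: it swaps in a tabular epsilon-greedy contextual bandit for the learned policy, a soft-thresholding function as a hypothetical stand-in for the black-box noise-suppression module, and a negative-MSE reward that assumes a clean reference is available during training. The 16 kHz sampling rate (256 samples per 16 ms frame), the candidate thresholds, the synthetic sparse signal, and all function names are likewise assumptions made for illustration.

```python
import random
import statistics

FRAME_LEN = 256                  # 16 ms per frame, assuming 16 kHz sampling
ACTIONS = [0.0, 0.1, 0.3, 0.6]   # hypothetical candidate parameter values

def black_box_suppress(frame, threshold):
    """Stand-in suppressor (soft thresholding). The agent never inspects it."""
    out = []
    for x in frame:
        mag = max(abs(x) - threshold, 0.0)
        out.append(mag if x >= 0 else -mag)
    return out

def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def make_frame(rng, noise_std):
    """Synthetic frame: sparse clean spikes plus Gaussian noise."""
    clean = [0.0] * FRAME_LEN
    for i in rng.sample(range(FRAME_LEN), FRAME_LEN // 10):
        clean[i] = rng.choice([-2.0, 2.0])
    noisy = [c + rng.gauss(0.0, noise_std) for c in clean]
    return clean, noisy

def estimate_context(noisy):
    """Coarse input-quality bucket from the median absolute sample value."""
    return 0 if statistics.median(abs(x) for x in noisy) < 0.1 else 1

class EpsilonGreedyAgent:
    """Tabular contextual bandit with sample-average value estimates."""

    def __init__(self, n_contexts, n_actions, epsilon=0.1, seed=0):
        self.rng = random.Random(seed)
        self.epsilon = epsilon
        self.values = [[0.0] * n_actions for _ in range(n_contexts)]
        self.counts = [[0] * n_actions for _ in range(n_contexts)]

    def greedy(self, context):
        row = self.values[context]
        return max(range(len(row)), key=row.__getitem__)

    def act(self, context):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.values[context]))
        return self.greedy(context)

    def update(self, context, action, reward):
        self.counts[context][action] += 1
        n = self.counts[context][action]
        self.values[context][action] += (reward - self.values[context][action]) / n

def train(n_frames=2000, seed=0):
    rng = random.Random(seed)
    agent = EpsilonGreedyAgent(n_contexts=2, n_actions=len(ACTIONS), seed=seed)
    for _ in range(n_frames):
        noise_std = rng.choice([0.05, 0.3])        # input quality varies per frame
        clean, noisy = make_frame(rng, noise_std)
        ctx = estimate_context(noisy)
        action = agent.act(ctx)                    # pick a parameter for this frame
        enhanced = black_box_suppress(noisy, ACTIONS[action])
        reward = -mse(enhanced, clean)             # feedback from the output only
        agent.update(ctx, action, reward)
    return agent

agent = train()
learned = [ACTIONS[agent.greedy(c)] for c in (0, 1)]
print("learned thresholds (low-noise, high-noise):", learned)
```

The point of the sketch is structural: the agent sees only a coarse estimate of input quality and a scalar reward computed from the suppressor's output, and it still ends up choosing different parameter values for different noise levels. A sequential policy over richer features would replace the lookup table in a more realistic setting.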

Taxonomy

Topics: Speech and Audio Processing · Music and Audio Processing · Advanced Adaptive Filtering Techniques