Reinforcement Learning To Adapt Speech Enhancement to Instantaneous Input Signal Quality
Rasool Fakoor, Xiaodong He, Ivan Tashev, Shuayb Zarar

TL;DR
This paper introduces a reinforcement-learning method that adapts the parameters of speech-enhancement algorithms on a per-frame basis, improving their robustness and performance across varying input signal quality.
Contribution
The paper proposes a reinforcement-learning approach that adapts noise-suppression parameters in real time, treating the suppressor as a black box so that no detailed knowledge of its internal mechanics is required.
Findings
42% improvement in output SNR
16% reduction in MSE
Enhanced robustness across diverse noise conditions
Abstract
Today, the optimal performance of existing noise-suppression algorithms, both data-driven and those based on classic statistical methods, is range-bound to specific levels of instantaneous input signal-to-noise ratios. In this paper, we present a new approach to improve the adaptivity of such algorithms, enabling them to perform robustly across a wide range of input signal and noise types. Our methodology is based on the dynamic control of algorithmic parameters via reinforcement learning. Specifically, we model the noise-suppression module as a black box, requiring no knowledge of the algorithmic mechanics except a simple feedback from the output. We utilize this feedback as the reward signal for a reinforcement-learning agent that learns a policy to adapt the algorithmic parameters for every incoming audio frame (16 ms of data). Our preliminary results show that such a control…
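The control loop described in the abstract can be illustrated with a toy sketch. This is not the authors' agent or suppressor. It swaps in an epsilon-greedy contextual bandit, a stand-in "black box" that only applies a gain, and synthetic frames. The names `ALPHAS`, `SNR_LEVELS_DB`, `black_box_suppressor`, `make_frame`, `reward`, `train`, and `greedy_policy` are hypothetical, the 16 kHz sampling rate is an assumption (it makes a 16 ms frame 256 samples long), the state is the true input SNR level rather than an estimate, and the reward uses a clean reference that would only be available in simulation.

```python
import math
import random

FRAME_LEN = 256                    # 16 ms at an ASSUMED 16 kHz sampling rate
ALPHAS = [0.0, 0.25, 0.5, 0.75]    # hypothetical suppression strengths (actions)
SNR_LEVELS_DB = [-5.0, 5.0, 20.0]  # hypothetical input-SNR conditions (states)


def black_box_suppressor(frame, alpha):
    """Stand-in for a real noise suppressor: a plain gain of (1 - alpha)."""
    return [(1.0 - alpha) * x for x in frame]


def make_frame(snr_db, rng):
    """Return (clean, noisy): a unit-power sinusoid plus white Gaussian noise."""
    noise_std = 10.0 ** (-snr_db / 20.0)
    clean = [math.sqrt(2.0) * math.sin(2.0 * math.pi * 8 * t / FRAME_LEN)
             for t in range(FRAME_LEN)]
    noisy = [c + rng.gauss(0.0, noise_std) for c in clean]
    return clean, noisy


def reward(clean, enhanced):
    """Feedback from the output: negative MSE against the clean frame."""
    return -sum((c - e) ** 2 for c, e in zip(clean, enhanced)) / len(clean)


def train(num_frames=3000, epsilon=0.1, seed=0):
    """Learn one action-value table entry per (SNR state, alpha) pair."""
    rng = random.Random(seed)
    q = [[0.0] * len(ALPHAS) for _ in SNR_LEVELS_DB]
    counts = [[0] * len(ALPHAS) for _ in SNR_LEVELS_DB]
    for _ in range(num_frames):
        state = rng.randrange(len(SNR_LEVELS_DB))
        if rng.random() < epsilon:
            action = rng.randrange(len(ALPHAS))  # explore
        else:
            action = max(range(len(ALPHAS)), key=lambda a: q[state][a])  # exploit
        clean, noisy = make_frame(SNR_LEVELS_DB[state], rng)
        r = reward(clean, black_box_suppressor(noisy, ALPHAS[action]))
        counts[state][action] += 1
        # Incremental sample mean of the reward for this (state, action) pair.
        q[state][action] += (r - q[state][action]) / counts[state][action]
    return q


def greedy_policy(q):
    """Pick the highest-valued alpha for each SNR state."""
    return [ALPHAS[max(range(len(ALPHAS)), key=lambda a: row[a])] for row in q]


if __name__ == "__main__":
    for snr_db, alpha in zip(SNR_LEVELS_DB, greedy_policy(train())):
        print(f"input SNR {snr_db:+.0f} dB -> alpha {alpha}")
```

In this toy setting the learned policy should favor stronger suppression at low input SNR and little or none at high input SNR, which is the kind of per-frame adaptation the abstract describes. A practical agent would need a state derived from the signal itself and a reward that does not depend on a clean reference.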
Taxonomy
Topics: Speech and Audio Processing · Music and Audio Processing · Advanced Adaptive Filtering Techniques
