Deep Learning-Based Joint Control of Acoustic Echo Cancellation, Beamforming and Postfiltering
Thomas Haubner, Walter Kellermann

TL;DR
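This paper presents a deep learning approach in which a single deep neural network jointly controls acoustic echo cancellation, beamforming, and postfiltering in a hands-free speech communication device, enabling rapid convergence and high steady-state performance under high-level interfering double-talk.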
Contribution
It introduces a single deep neural network that jointly controls the adaptation of the echo canceller, beamformer, and postfilter. Training the network end-to-end removes the need to design a separate control strategy for each component.
Findings
Rapid convergence in the presence of high-level interfering double-talk
High steady-state echo suppression
Effective joint control of multiple algorithms
Abstract
We introduce a novel method for controlling the functionality of a hands-free speech communication device which comprises a model-based acoustic echo canceller (AEC), a minimum variance distortionless response (MVDR) beamformer (BF) and a spectral postfilter (PF). While the AEC removes the early echo component, the MVDR BF and PF suppress the residual echo and background noise. As a key innovation, we propose using a single deep neural network (DNN) to jointly control the adaptation of the various algorithmic components. This allows for rapid convergence and high steady-state performance in the presence of high-level interfering double-talk. End-to-end training of the DNN using a time-domain speech extraction loss function avoids the design of individual control strategies.
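For readers unfamiliar with the MVDR beamformer named in the abstract, the following is a minimal sketch of the textbook MVDR weight formula, w = R⁻¹d / (dᴴR⁻¹d), for a single frequency bin. It is not the authors' implementation: the abstract does not say how the DNN steers the estimation of the underlying statistics, and the names `mvdr_weights`, `noise_cov`, and `steering`, as well as the synthetic covariance matrix and steering vector, are assumptions made purely for illustration.

```python
import numpy as np

def mvdr_weights(noise_cov, steering):
    """Textbook MVDR weights w = R^{-1} d / (d^H R^{-1} d) for one frequency bin."""
    # Solve R x = d rather than forming the explicit inverse of R.
    r_inv_d = np.linalg.solve(noise_cov, steering)
    return r_inv_d / (steering.conj() @ r_inv_d)

rng = np.random.default_rng(0)
num_mics = 4

# Hypothetical interference-plus-noise covariance matrix (assumption):
# built as A A^H plus a small diagonal load so it is Hermitian positive definite.
a = rng.standard_normal((num_mics, 8)) + 1j * rng.standard_normal((num_mics, 8))
noise_cov = a @ a.conj().T / 8 + 1e-3 * np.eye(num_mics)

# Hypothetical steering vector towards the desired speaker (assumption).
steering = np.exp(-1j * np.pi * 0.3 * np.arange(num_mics))

w = mvdr_weights(noise_cov, steering)

# Distortionless constraint: w^H d equals 1 up to rounding error.
distortionless_response = w.conj() @ steering
```

The constraint check at the end reflects the "distortionless response" part of the name: the desired-speaker direction passes with unit gain while the output power of everything described by `noise_cov` is minimized.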
Taxonomy
Topics: Speech and Audio Processing · Advanced Adaptive Filtering Techniques · Acoustic Wave Phenomena Research
