NN3A: Neural Network supported Acoustic Echo Cancellation, Noise Suppression and Automatic Gain Control for Real-Time Communications
Ziteng Wang, Yueyue Na, Biao Tian, Qiang Fu

TL;DR
This paper introduces NN3A, a neural network supported algorithm for real-time communications that pairs an adaptive filter with a multi-task model for residual echo suppression, noise reduction, and near-end speech activity detection. It outperforms both a method using separate models and an end-to-end alternative.
Contribution
The paper presents an algorithm for RTC that combines an adaptive filter with a multi-task neural network for residual echo suppression, noise reduction, and near-end speech activity detection, together with a novel loss weighting function that balances residual suppression against near-end speech distortion.
Findings
Outperforms both a method using separate models and an end-to-end alternative.
Identifies a trade-off in the model between residual suppression and near-end speech distortion.
Introduces a novel loss weighting function to balance that trade-off.
Abstract
Acoustic echo cancellation (AEC), noise suppression (NS) and automatic gain control (AGC) are three often-required modules for real-time communications (RTC). This paper proposes a neural network supported algorithm for RTC, namely NN3A, which incorporates an adaptive filter and a multi-task model for residual echo suppression, noise reduction and near-end speech activity detection. The proposed algorithm is shown to outperform both a method using separate models and an end-to-end alternative. It is further shown that there exists a trade-off in the model between residual suppression and near-end speech distortion, which can be balanced by a novel loss weighting function. Several practical aspects of training the joint model are also investigated to push its performance to the limit.
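The abstract does not say which adaptive filter NN3A uses, so the following is only a minimal sketch of the general idea behind that stage, assuming a textbook time-domain NLMS (normalized least mean squares) filter. The function name `nlms_echo_cancel`, the tap count, the step size, and the toy echo path `h` are all hypothetical choices for illustration, not details from the paper. In a pipeline like the one described, the filter output would still contain residual echo and noise, which is what the multi-task model is then meant to handle.

```python
import numpy as np

def nlms_echo_cancel(far_end, mic, taps=64, mu=0.5, eps=1e-8):
    """Generic NLMS linear echo canceller (illustrative; not the paper's filter)."""
    w = np.zeros(taps)        # running estimate of the echo path
    buf = np.zeros(taps)      # most recent far-end samples, newest first
    out = np.zeros(len(mic))
    for n in range(len(mic)):
        buf = np.roll(buf, 1)
        buf[0] = far_end[n]
        echo_est = w @ buf                 # predicted echo at the microphone
        e = mic[n] - echo_est              # error signal = echo-reduced output
        out[n] = e
        w += mu * e * buf / (buf @ buf + eps)  # normalized gradient step
    return out

# Toy usage: the microphone picks up only the echo of a white-noise far-end signal.
rng = np.random.default_rng(0)
far = rng.standard_normal(8000)
h = np.array([0.6, 0.3, -0.2, 0.1])        # hypothetical echo path
mic = np.convolve(far, h)[:len(far)]
residual = nlms_echo_cancel(far, mic, taps=8)
```

In this noise-free toy setup the filter can model the echo path exactly, so the residual decays toward zero. Real recordings add near-end speech, noise, and nonlinear distortion, which a linear filter alone cannot remove.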
Taxonomy
Topics: Speech and Audio Processing · Advanced Adaptive Filtering Techniques · Speech Recognition and Synthesis
