Align-ULCNet: Towards Low-Complexity and Robust Acoustic Echo and Noise Reduction
Shrishti Saha Shetu, Naveen Kumar Desiraju, Wolfgang Mack, Emanuël A. P. Habets

TL;DR
This paper introduces Align-ULCNet, a low-complexity, robust acoustic echo and noise reduction (AENR) model that extends the state-of-the-art ULCNet model with time alignment, parallel encoder blocks for the model inputs, and a channel-wise sampling-based feature reorientation method.
Contribution
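The paper presents a hybrid approach that enhances ULCNet with time alignment and parallel encoder blocks for the model inputs, together with a channel-wise sampling-based feature reorientation method, to improve robustness while keeping computational and memory requirements low.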
Findings
Better echo reduction than existing SOTA methods, with comparable noise reduction.
Overall computational and memory requirements remain low.
Performance stays robust across many challenging scenarios.
Abstract
The successful deployment of deep learning-based acoustic echo and noise reduction (AENR) methods in consumer devices has spurred interest in developing low-complexity solutions, while emphasizing the need for robust performance in real-life applications. In this work, we propose a hybrid approach to enhance the state-of-the-art (SOTA) ULCNet model by integrating time alignment and parallel encoder blocks for the model inputs, resulting in better echo reduction and comparable noise reduction performance to existing SOTA methods. We also propose a channel-wise sampling-based feature reorientation method, ensuring robust performance across many challenging scenarios, while maintaining overall low computational and memory requirements.
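The abstract does not say how the time alignment is implemented, so the following is only a generic illustration of what aligning the far-end reference with the microphone signal means, not the method used in Align-ULCNet. It is a minimal sketch that assumes a classical cross-correlation delay estimate on synthetic signals; the function names `estimate_delay` and `align_reference`, the simulated delay, and the signal lengths are all hypothetical choices made for this example.

```python
import numpy as np


def estimate_delay(mic, ref, max_delay):
    """Return the lag in samples (0..max_delay) at which ref best matches mic.

    Assumption: mic and ref have the same length and the echo is a delayed,
    scaled copy of ref. This is plain cross-correlation, not a learned block.
    """
    n = len(ref)
    # Correlation score between mic and ref shifted by each candidate lag.
    scores = [np.dot(mic[d:], ref[: n - d]) for d in range(max_delay + 1)]
    return int(np.argmax(scores))


def align_reference(ref, delay):
    """Delay ref by `delay` samples, zero-padding the start and keeping length."""
    if delay == 0:
        return ref.copy()
    return np.concatenate([np.zeros(delay, dtype=ref.dtype), ref[:-delay]])


# Synthetic example: the microphone picks up a delayed, attenuated copy of
# the far-end reference plus a small amount of noise.
rng = np.random.default_rng(0)
ref = rng.standard_normal(4000)
true_delay = 37  # hypothetical echo-path delay in samples
mic = 0.5 * align_reference(ref, true_delay) + 0.05 * rng.standard_normal(4000)

delay = estimate_delay(mic, ref, max_delay=200)
aligned = align_reference(ref, delay)
print("estimated delay:", delay)
```

Once the reference is shifted by the estimated delay, the echo component in the microphone signal and the reference line up in time, which is the property an echo reduction model benefits from regardless of how the alignment is actually obtained.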
Taxonomy
Topics: Speech and Audio Processing · Music and Audio Processing · Speech Recognition and Synthesis
