Universal Speech Enhancement with Regression and Generative Mamba
Rong Chao, Rauf Nasretdinov, Yu-Chiang Frank Wang, Ante Jukić, Szu-Wei Fu, and Yu Tsao

TL;DR
The paper introduces USEMamba, a state-space speech enhancement model built for the Interspeech 2025 URGENT Challenge, which covers seven distortion types and five languages. It relies on regression-based modeling for most distortions and on a generative variant for packet loss and bandwidth extension.
Contribution
The paper presents a state-space model for universal speech enhancement that handles multiple distortion types and languages within a single model, along with a generative variant for cases where missing content must be inferred.
Findings
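Achieved 2nd place in Track 1 of the Interspeech 2025 URGENT Challenge during the blind test phase, despite being trained on only a subset of the full training data.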
Demonstrated strong generalization across diverse distortions and languages.
Regression-based modeling performed well across most distortions, while the generative variant proved more effective for packet loss and bandwidth extension.
Abstract
The Interspeech 2025 URGENT Challenge aimed to advance universal, robust, and generalizable speech enhancement by unifying speech enhancement tasks across a wide variety of conditions, including seven different distortion types and five languages. We present Universal Speech Enhancement Mamba (USEMamba), a state-space speech enhancement model designed to handle long-range sequence modeling, time-frequency structured processing, and sampling frequency-independent feature extraction. Our approach primarily relies on regression-based modeling, which performs well across most distortions. However, for packet loss and bandwidth extension, where missing content must be inferred, a generative variant of the proposed USEMamba proves more effective. Despite being trained on only a subset of the full training data, USEMamba achieved 2nd place in Track 1 during the blind test phase, demonstrating…
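For readers unfamiliar with state-space models, the recurrence behind the "long-range sequence modeling" mentioned in the abstract can be sketched as follows. This is only an illustration of a generic linear state-space scan with a diagonal state matrix, not the USEMamba architecture: the function name `ssm_scan` and the parameters `a`, `b`, `c` are hypothetical, and Mamba-style models additionally make these parameters depend on the input, which this sketch does not do.

```python
def ssm_scan(x, a, b, c):
    """Run a diagonal linear state-space recurrence over a 1-D sequence.

    h[t] = a * h[t-1] + b * x[t]   (elementwise, one entry per hidden state)
    y[t] = sum(c * h[t])

    a, b, c are fixed illustrative parameters (an assumption of this sketch).
    """
    h = [0.0] * len(a)  # hidden state starts at zero
    y = []
    for x_t in x:
        # Each hidden state decays by its own factor a_i and absorbs the input.
        h = [a_i * h_i + b_i * x_t for a_i, h_i, b_i in zip(a, h, b)]
        # The output is a weighted sum of the hidden states.
        y.append(sum(c_i * h_i for c_i, h_i in zip(c, h)))
    return y


# Impulse response with one fast-decaying state (0.5) and one slow one (0.9).
y = ssm_scan([1.0, 0.0, 0.0, 0.0], a=[0.5, 0.9], b=[1.0, 1.0], c=[1.0, 1.0])
```

In this toy setting, the state with decay factor 0.9 retains the input far longer than the one with 0.5, which is the basic mechanism by which such models carry information across long sequences.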
Taxonomy
Topics: Speech and Audio Processing · Indoor and Outdoor Localization Technologies · Advanced Adaptive Filtering Techniques
