Mamba-SEUNet: Mamba UNet for Monaural Speech Enhancement
Junyu Wang, Zizhen Lin, Tianrui Wang, Meng Ge, Longbiao Wang, Jianwu Dang

TL;DR
This paper introduces Mamba-SEUNet, a novel speech enhancement architecture combining Mamba state-space models with U-Net, achieving state-of-the-art results with lower computational complexity.
Contribution
It presents a new Mamba-based U-Net architecture for speech enhancement that models long-range dependencies efficiently and outperforms existing methods.
Findings
Achieves a PESQ score of 3.59 on the VCTK+DEMAND dataset.
Combining it with Perceptual Contrast Stretching (PCS) raises PESQ to 3.73.
Maintains low computational complexity while delivering SOTA performance.
Abstract
In recent speech enhancement (SE) research, transformer and its variants have emerged as the predominant methodologies. However, the quadratic complexity of the self-attention mechanism imposes certain limitations on practical deployment. Mamba, as a novel state-space model (SSM), has gained widespread application in natural language processing and computer vision due to its strong capabilities in modeling long sequences and relatively low computational complexity. In this work, we introduce Mamba-SEUNet, an innovative architecture that integrates Mamba with U-Net for SE tasks. By leveraging bidirectional Mamba to model forward and backward dependencies of speech signals at different resolutions, and incorporating skip connections to capture multi-scale information, our approach achieves state-of-the-art (SOTA) performance. Experimental results on the VCTK+DEMAND dataset indicate that…
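The bidirectional, multi-resolution idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a fixed scalar recurrence (`a`, `b`, `c` are made-up constants) in place of Mamba's input-dependent selective scan, sums the forward and backward passes as one possible fusion rule, and uses average pooling and an additive skip connection purely for illustration. All function names here are hypothetical.

```python
def ssm_scan(x, a=0.9, b=1.0, c=1.0):
    """Scalar state-space recurrence h_t = a*h_{t-1} + b*x_t, y_t = c*h_t.

    One pass over the sequence, so the cost grows linearly with its length.
    Simplification: a, b, c are fixed here, whereas Mamba makes its
    parameters depend on the input.
    """
    h = 0.0
    y = []
    for x_t in x:
        h = a * h + b * x_t
        y.append(c * h)
    return y


def bidirectional_scan(x, a=0.9):
    """Run the scan left-to-right and right-to-left, then sum the outputs.

    Summation is an assumed fusion rule, chosen for simplicity.
    """
    fwd = ssm_scan(x, a)
    bwd = ssm_scan(x[::-1], a)[::-1]
    return [f + g for f, g in zip(fwd, bwd)]


def downsample(x):
    """Halve the resolution by averaging adjacent pairs (assumes even length)."""
    return [(x[i] + x[i + 1]) / 2 for i in range(0, len(x) - 1, 2)]


def upsample(x):
    """Double the resolution by repeating each value."""
    return [v for v in x for _ in range(2)]


def tiny_unet(x, a=0.9):
    """Two-level U-shape: scan at full resolution, scan at half resolution,
    then merge the two scales through an additive skip connection."""
    skip = bidirectional_scan(x, a)
    low = bidirectional_scan(downsample(skip), a)
    return [s + u for s, u in zip(skip, upsample(low))]


if __name__ == "__main__":
    impulse = [0.0, 1.0, 0.0, 0.0]
    print(bidirectional_scan(impulse, a=0.5))
    print(tiny_unet([0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], a=0.5))
```

With an impulse at position 1, the forward pass alone leaves position 0 at zero, while the bidirectional output is nonzero there, which is the sense in which the backward pass supplies future context. Each scan touches every sample once, in contrast to the pairwise comparisons of self-attention.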
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Phonetics and Phonology Research
Methods: Concatenated Skip Connection · Convolution · Max Pooling · U-Net · Mamba: Linear-Time Sequence Modeling with Selective State Spaces
