MambAttention: Mamba with Multi-Head Attention for Generalizable Single-Channel Speech Enhancement

Nikolai Lund Kühne; Jesper Jensen; Jan Østergaard; Zheng-Hua Tan

arXiv:2507.00966·cs.SD·January 22, 2026

TL;DR

This paper introduces MambAttention, a hybrid model that combines Mamba with multi-head attention to improve generalization in single-channel speech enhancement. It outperforms state-of-the-art models on out-of-domain datasets.

Contribution

The paper proposes MambAttention, a novel hybrid architecture that integrates Mamba with shared time- and frequency-multi-head attention modules, and introduces VB-DemandEx, a new, challenging dataset used to train the model.
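To make the idea of shared time- and frequency-attention more concrete, here is a minimal NumPy sketch. It is not the authors' implementation. It assumes that "shared" means one set of attention weights is reused along both axes of a time-frequency feature tensor, that time attention is applied before frequency attention, and that each step has a residual connection. The Mamba blocks are omitted entirely, and the names `init_mha`, `mha`, and `shared_tf_attention` are hypothetical.

```python
import numpy as np


def init_mha(rng, dim):
    # One parameter set: query, key, value, and output projections.
    scale = 1.0 / np.sqrt(dim)
    return {name: rng.standard_normal((dim, dim)) * scale
            for name in ("q", "k", "v", "o")}


def mha(params, x, num_heads):
    """Multi-head self-attention over axis 1 of x with shape (batch, seq, dim)."""
    b, n, d = x.shape
    assert d % num_heads == 0, "dim must be divisible by num_heads"
    hd = d // num_heads

    def split(y):
        # (batch, seq, dim) -> (batch, heads, seq, head_dim)
        return y.reshape(b, n, num_heads, hd).transpose(0, 2, 1, 3)

    q = split(x @ params["q"])
    k = split(x @ params["k"])
    v = split(x @ params["v"])

    scores = q @ k.transpose(0, 1, 3, 2) / np.sqrt(hd)  # (b, heads, seq, seq)
    scores -= scores.max(axis=-1, keepdims=True)        # numerically stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)

    out = (weights @ v).transpose(0, 2, 1, 3).reshape(b, n, d)
    return out @ params["o"]


def shared_tf_attention(params, x, num_heads):
    """Apply the SAME attention weights along time, then along frequency.

    x has shape (T, F, C): time frames, frequency bins, channels.
    The ordering and the residual connections are assumptions of this sketch.
    """
    # Time attention: each frequency bin is a batch item -> (F, T, C).
    xt = x.transpose(1, 0, 2)
    xt = xt + mha(params, xt, num_heads)
    x = xt.transpose(1, 0, 2)  # back to (T, F, C)

    # Frequency attention: each time frame is a batch item -> (T, F, C).
    return x + mha(params, x, num_heads)


rng = np.random.default_rng(0)
params = init_mha(rng, dim=8)
x = rng.standard_normal((5, 6, 8))  # T=5 frames, F=6 bins, C=8 channels
y = shared_tf_attention(params, x, num_heads=2)
print(y.shape)  # (5, 6, 8)
```

Because both calls to `mha` receive the same `params`, the parameter count does not grow when the second attention axis is added. Whether the paper shares weights in exactly this way should be checked against the full text.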

Findings

01. MambAttention outperforms state-of-the-art models on out-of-domain datasets.

02. Shared attention modules improve generalization performance.

03. Integrating attention with LSTM/xLSTM enhances cross-corpus performance.

Abstract

With new sequence models like Mamba and xLSTM, several studies have shown that these models match or outperform the state-of-the-art in single-channel speech enhancement and audio representation learning. However, prior research has demonstrated that sequence models like LSTM and Mamba tend to overfit to the training set. To address this, previous works have shown that adding self-attention to LSTMs substantially improves generalization performance for single-channel speech enhancement. Nevertheless, neither the concept of hybrid Mamba and time-frequency attention models nor their generalization performance have been explored for speech enhancement. In this paper, we propose a novel hybrid architecture, MambAttention, which combines Mamba and shared time- and frequency-multi-head attention modules for generalizable single-channel speech enhancement. To train our model, we introduce…

Taxonomy

Methods: Long Short-Term Memory · Mamba: Linear-Time Sequence Modeling with Selective State Spaces