An Exploration of Mamba for Speech Self-Supervised Models

Tzu-Quan Lin; Heng-Cheng Kuo; Tzu-Chieh Wei; Hsi-Chun Cheng; Chun Wei Chen; Hsien-Fu Hsiao; Yu Tsao; Hung-yi Lee

arXiv:2506.12606·cs.CL·April 21, 2026

TL;DR

This paper investigates Mamba-based HuBERT models for speech self-supervised learning. They allow fine-tuning on long-context ASR with significantly lower compute, perform better than Transformer-based models when fine-tuned for streaming ASR, and are competitive on SUPERB probing benchmarks.

Contribution

It explores Mamba-based HuBERT models as alternatives to Transformer-based SSL architectures, with lower compute for long-context ASR fine-tuning and better performance in streaming ASR.

Findings

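01 When fine-tuned for streaming ASR, Mamba-based models outperform Transformer-based models.

02 Mamba-based models yield higher-quality quantized speech representations than Transformer-based models.

03 Mamba-based models are competitive on SUPERB probing benchmarks, particularly in causal settings.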

Abstract

While Mamba has demonstrated strong performance in language modeling, its potential as a speech self-supervised learning (SSL) model remains underexplored, with prior studies limited to isolated tasks. To address this, we explore Mamba-based HuBERT models as alternatives to Transformer-based SSL architectures. Leveraging the linear-time Selective State Space, these models enable fine-tuning on long-context ASR with significantly lower compute. Moreover, they show superior performance when fine-tuned for streaming ASR. Beyond fine-tuning, these models show competitive performance on SUPERB probing benchmarks, particularly in causal settings. Our analysis shows that they yield higher-quality quantized representations and capture speaker-related features more distinctly than Transformer-based models. These findings highlight Mamba-based SSL as a promising and complementary direction for…
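
The abstract attributes the compute savings to the linear-time selective state space. The sketch below illustrates that idea only; it is not the authors' implementation. It assumes a single input channel, a diagonal state matrix, and a simplified discretization, and the names `selective_scan`, `w_delta`, `w_b`, and `w_c` are hypothetical stand-ins for learned projections. The point is that the sequence is processed in one causal pass costing O(T * N) for T frames and state size N, whereas full self-attention compares all pairs of frames and costs O(T^2).

```python
import math


def softplus(z):
    # Numerically stable softplus; keeps the step size positive.
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def selective_scan(x, a, w_delta, w_b, w_c):
    """Single-channel selective state space recurrence (illustrative only).

    x        list of T input values
    a        list of N negative values (diagonal state matrix)
    w_delta  scalar weight producing the input-dependent step size
    w_b, w_c lists of N weights producing the input-dependent B_t and C_t
    All parameter names are hypothetical, not taken from the paper's code.
    """
    n = len(a)
    h = [0.0] * n          # hidden state carried across time steps
    ys = []
    for x_t in x:          # one pass over the sequence: O(T * N)
        delta = softplus(w_delta * x_t)          # step size depends on input
        for i in range(n):
            decay = math.exp(delta * a[i])       # discretized state transition
            b_t = w_b[i] * x_t                   # input-dependent B_t
            h[i] = decay * h[i] + delta * b_t * x_t
        # Input-dependent C_t reads the output out of the state.
        ys.append(sum(w_c[i] * x_t * h[i] for i in range(n)))
    return ys
```

Because each output depends only on the current input and the carried state, the same recurrence can run frame by frame, which is the property that makes this family of models a natural fit for causal and streaming settings.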

Code & Models

Repositories

hckuo145/Mamba-based-HuBERT (GitHub)
