Non-Intrusive Binaural Speech Intelligibility Prediction Using Mamba for Hearing-Impaired Listeners

Katsuhiko Yamamoto; Koichi Miyazaki

arXiv:2507.05729 · cs.SD · July 9, 2025

Open Access

TL;DR

This paper introduces a Mamba-based binaural speech intelligibility prediction model that offers a computationally efficient alternative to transformer-based models, maintaining high accuracy for hearing-impaired listeners.

Contribution

The paper proposes replacing transformer self-attention with Mamba blocks in the temporal processing of SIP models, reducing computational and memory cost while preserving prediction performance.

Findings

01

Mamba-based SIP models achieve prediction performance competitive with the baseline.

02

The proposed model maintains a relatively small number of parameters.

03

Bidirectional Mamba captures contextual and spatial information effectively.

Abstract

Speech intelligibility prediction (SIP) models have been used as objective metrics to assess intelligibility for hearing-impaired (HI) listeners. In the Clarity Prediction Challenge 2 (CPC2), non-intrusive binaural SIP models based on transformers showed high prediction accuracy. However, the self-attention mechanism theoretically incurs high computational and memory costs, making it a bottleneck for low-latency, power-efficient devices. This may also degrade the temporal processing of binaural SIPs. Therefore, we propose Mamba-based SIP models instead of transformers for the temporal processing blocks. Experimental results show that our proposed SIP model achieves competitive performance compared to the baseline while maintaining a relatively small number of parameters. Our analysis suggests that the SIP model based on bidirectional Mamba effectively captures contextual and spatial…
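The cost argument in the abstract can be illustrated with a toy sketch. The code below is not the authors' model and not Mamba itself: it assumes a fixed scalar linear state-space recurrence (no input-dependent selection), and the names `ssm_scan`, `bidirectional_scan`, `attention_score_count`, and `scan_step_count` are hypothetical. It only shows why a recurrent scan grows linearly with sequence length while pairwise self-attention scores grow quadratically, and how a forward and a backward pass can be combined so each frame sees both past and future context.

```python
from typing import List


def ssm_scan(x: List[float], a: float = 0.9, b: float = 0.1, c: float = 1.0) -> List[float]:
    """One causal pass of a scalar linear recurrence (illustrative, non-selective).

    h_t = a * h_{t-1} + b * x_t
    y_t = c * h_t
    """
    h = 0.0
    y = []
    for x_t in x:
        h = a * h + b * x_t  # state update: one step per frame
        y.append(c * h)      # readout
    return y


def bidirectional_scan(x: List[float], a: float = 0.9, b: float = 0.1, c: float = 1.0) -> List[float]:
    """Sum of a forward scan and a time-reversed scan (assumed combination rule)."""
    fwd = ssm_scan(x, a, b, c)
    bwd = ssm_scan(x[::-1], a, b, c)[::-1]  # scan from the end, then restore order
    return [f + g for f, g in zip(fwd, bwd)]


def attention_score_count(seq_len: int) -> int:
    """Number of pairwise scores in full self-attention over seq_len frames."""
    return seq_len * seq_len


def scan_step_count(seq_len: int) -> int:
    """Number of recurrence steps for a forward plus a backward scan."""
    return 2 * seq_len


if __name__ == "__main__":
    impulse = [0.0, 1.0, 0.0]
    print(bidirectional_scan(impulse, a=0.5, b=1.0, c=1.0))
    print(attention_score_count(1000), scan_step_count(1000))
```

With an impulse in the middle of the sequence, the bidirectional output is symmetric around that frame, which is the sense in which a bidirectional scan uses context on both sides. A real Mamba block additionally makes the recurrence parameters depend on the input and operates on vector-valued states.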

Taxonomy

Topics: Hearing Loss and Rehabilitation · Speech and Audio Processing · Voice and Speech Disorders