Fake-Mamba: Real-Time Speech Deepfake Detection Using Bidirectional Mamba as Self-Attention's Alternative

Xi Xuan; Zimo Zhu; Wenxin Zhang; Yi-Cheng Lin; Tomi Kinnunen

arXiv:2508.09294 · eess.AS · August 14, 2025

TL;DR

Fake-Mamba is a real-time speech deepfake detector that replaces self-attention with bidirectional Mamba, reaching 0.97%, 1.74%, and 5.85% EER on the ASVspoof 21 LA, ASVspoof 21 DF, and In-The-Wild benchmarks, respectively.

Contribution

The paper proposes a framework that pairs an XLSR front-end with bidirectional Mamba encoders (TransBiMamba, ConBiMamba, and PN-BiMamba) for real-time synthetic speech detection, outperforming the XLSR-Conformer and XLSR-Mamba baselines.
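To make "bidirectional" concrete, here is a minimal toy sketch, not the authors' implementation. It assumes a fixed scalar linear recurrence in place of Mamba's input-dependent (selective) state-space layer, and the names `linear_scan` and `bidirectional_scan` as well as the parameters `a` and `b` are hypothetical. It only illustrates how a causal scan can be run in both time directions so that every frame sees past and future context.

```python
import numpy as np

def linear_scan(x, a=0.9, b=1.0):
    """Causal toy recurrence h_t = a * h_{t-1} + b * x_t along axis 0 (time).

    Assumption: a fixed scalar recurrence stands in for a real selective SSM.
    """
    x = np.asarray(x, dtype=float)
    h = np.zeros_like(x[0], dtype=float)
    out = np.empty(x.shape, dtype=float)
    for t in range(x.shape[0]):
        h = a * h + b * x[t]
        out[t] = h
    return out

def bidirectional_scan(x, a=0.9, b=1.0):
    """Sum of a forward scan and a backward scan over the same sequence."""
    x = np.asarray(x, dtype=float)
    fwd = linear_scan(x, a, b)              # each frame summarizes the past
    bwd = linear_scan(x[::-1], a, b)[::-1]  # each frame summarizes the future
    return fwd + bwd

# An impulse at the last frame is invisible to frame 0 in the causal scan,
# but reaches it through the backward pass.
frames = np.array([[0.0], [0.0], [1.0]])
causal = linear_scan(frames, a=0.5)
both = bidirectional_scan(frames, a=0.5)
```

Summing the two passes counts the current frame twice; concatenating the two outputs or learning a projection over them are common alternatives, and the source does not say here which combination Fake-Mamba uses.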

Findings

01. Achieves state-of-the-art EER on multiple benchmarks

02. Maintains real-time inference across utterance lengths

03. Demonstrates strong generalization and practical viability

Abstract

Advances in speech synthesis intensify security threats, motivating real-time deepfake detection research. We investigate whether bidirectional Mamba can serve as a competitive alternative to Self-Attention in detecting synthetic speech. Our solution, Fake-Mamba, integrates an XLSR front-end with bidirectional Mamba to capture both local and global artifacts. Our core innovation introduces three efficient encoders: TransBiMamba, ConBiMamba, and PN-BiMamba. Leveraging XLSR's rich linguistic representations, PN-BiMamba can effectively capture the subtle cues of synthetic speech. Evaluated on ASVspoof 21 LA, 21 DF, and In-The-Wild benchmarks, Fake-Mamba achieves 0.97%, 1.74%, and 5.85% EER, respectively, representing substantial relative gains over SOTA models XLSR-Conformer and XLSR-Mamba. The framework maintains real-time inference across utterance lengths, demonstrating strong…
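The results above are reported as equal error rate (EER), the operating point where the false acceptance rate on spoofed speech equals the false rejection rate on bona fide speech. The following is a minimal sketch of how EER can be estimated from detector scores. It assumes that higher scores mean "more likely bona fide" and uses a simple threshold sweep without interpolation; the function name `compute_eer` and the example scores are hypothetical and are not taken from the paper or from the official ASVspoof evaluation tools.

```python
import numpy as np

def compute_eer(bonafide_scores, spoof_scores):
    """Estimate EER by sweeping thresholds over all observed scores.

    Assumption: higher score = more likely bona fide.
    Returns (eer, threshold) at the point where FAR and FRR are closest.
    """
    bonafide = np.asarray(bonafide_scores, dtype=float)
    spoof = np.asarray(spoof_scores, dtype=float)
    thresholds = np.unique(np.concatenate([bonafide, spoof]))  # sorted, unique
    # False rejection rate: bona fide trials scored below the threshold.
    frr = np.array([(bonafide < t).mean() for t in thresholds])
    # False acceptance rate: spoof trials scored at or above the threshold.
    far = np.array([(spoof >= t).mean() for t in thresholds])
    idx = int(np.argmin(np.abs(far - frr)))
    eer = (far[idx] + frr[idx]) / 2.0
    return float(eer), float(thresholds[idx])

# Made-up scores for illustration only.
eer, threshold = compute_eer([0.9, 0.8, 0.7, 0.3], [0.1, 0.2, 0.4, 0.6])
print(f"EER = {eer:.2%} at threshold {threshold}")
```

On large evaluation sets a sorted cumulative-count implementation is far faster than this per-threshold loop, and interpolating between neighboring thresholds gives a smoother estimate when FAR and FRR never match exactly.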
