SACM: SEEG-Audio Contrastive Matching for Chinese Speech Decoding

Hongbin Wang; Zhihong Jia; Yuanzhong Shen; Ziwei Wang; Siyang Li; Kai Shu; Feng Hu; Dongrui Wu

arXiv:2505.19652 · cs.HC · May 27, 2025

TL;DR

This study introduces SACM, a contrastive learning framework for Mandarin speech decoding from SEEG recordings. It decodes speech at accuracies significantly above chance and points to the sensorimotor cortex as a key region, which supports the development of speech neuroprostheses.

Contribution

The paper presents SEEG-Audio Contrastive Matching (SACM), a novel framework for Chinese speech decoding. It shows that decoding remains effective with very few electrodes and offers insights for improving speech BCIs.

Findings

1. Decoding accuracies significantly exceed chance levels in both speech detection and speech decoding.

2. A single sensorimotor cortex electrode performs comparably to the full electrode array.

3. The results provide insights for developing more accurate speech BCIs.

Abstract

Speech disorders such as dysarthria and anarthria can severely impair the patient's ability to communicate verbally. Speech decoding brain-computer interfaces (BCIs) offer a potential alternative by directly translating speech intentions into spoken words, serving as speech neuroprostheses. This paper reports an experimental protocol for Mandarin Chinese speech decoding BCIs, along with the corresponding decoding algorithms. Stereo-electroencephalography (SEEG) and synchronized audio data were collected from eight drug-resistant epilepsy patients as they conducted a word-level reading task. The proposed SEEG and Audio Contrastive Matching (SACM), a contrastive learning-based framework, achieved decoding accuracies significantly exceeding chance levels in both speech detection and speech decoding tasks. Electrode-wise analysis revealed that a single sensorimotor cortex electrode achieved…
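
The abstract describes SACM as a contrastive learning-based framework that matches SEEG to audio, but this excerpt does not give the encoders, loss, or hyperparameters. As a rough illustration of the general idea only, the sketch below implements a generic symmetric (CLIP-style) InfoNCE loss over paired embeddings, plus nearest-neighbour matching at inference. The function names, the temperature value, and the choice of this particular loss are assumptions and are not taken from the paper or its repository.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (assumes it is not all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def diagonal_cross_entropy(logits):
    """Mean cross-entropy where the correct class of row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        peak = max(row)  # subtract the max for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def symmetric_contrastive_loss(seeg_emb, audio_emb, temperature=0.1):
    """Generic symmetric InfoNCE loss; row i of each input is a matched pair.

    The temperature of 0.1 is an illustrative assumption.
    """
    seeg = [l2_normalize(v) for v in seeg_emb]
    audio = [l2_normalize(v) for v in audio_emb]
    # Cosine similarity between every SEEG segment and every audio segment.
    logits = [
        [sum(s * a for s, a in zip(s_vec, a_vec)) / temperature for a_vec in audio]
        for s_vec in seeg
    ]
    logits_t = [list(col) for col in zip(*logits)]
    # Average the SEEG-to-audio and audio-to-SEEG directions.
    return 0.5 * (diagonal_cross_entropy(logits) + diagonal_cross_entropy(logits_t))


def match_audio(seeg_vec, audio_bank):
    """Return the index of the candidate audio embedding closest to seeg_vec."""
    query = l2_normalize(seeg_vec)
    sims = [
        sum(q * a for q, a in zip(query, l2_normalize(candidate)))
        for candidate in audio_bank
    ]
    return max(range(len(sims)), key=sims.__getitem__)
```

In this setup, decoding a word reduces to calling `match_audio` with the SEEG embedding of a trial and a bank of candidate audio embeddings, one per word. Training pulls matched SEEG and audio pairs together and pushes mismatched pairs apart, so correctly aligned batches yield a lower loss than shuffled ones.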

Code & Models

Repositories

WangHongbinary/SACM
PyTorch · Official

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing