Time-Masked Transformers with Lightweight Test-Time Adaptation for Neural Speech Decoding

Ebrahim Feghhi; Shreyas Kaasyap; Nima Hadidi; Jonathan C. Kao

arXiv:2507.02800 · cs.HC · November 4, 2025

TL;DR

This paper introduces a neural speech decoding approach that combines heavy time-masking during training, a compact Transformer architecture, and lightweight test-time adaptation. Together these improve accuracy, reduce computational cost, and support real-time decoding.

Contribution

The paper combines time-masking, a compact Transformer, and a lightweight adaptation method for neural speech decoding, enabling real-time use with lower resource requirements.

Findings

01. Reduced word error rate by over 20%

02. Lowered computational costs and memory usage

03. Effective performance across held-out days in real-time decoding

Abstract

Speech neuroprostheses aim to restore communication for people with severe paralysis by decoding speech directly from neural activity. To accelerate algorithmic progress, a recent benchmark released intracranial recordings from a paralyzed participant attempting to speak, along with a baseline decoding algorithm. Prior work on the benchmark showed impressive accuracy gains. However, these gains increased computational costs and were not demonstrated in a real-time decoding setting. Here, we make three contributions that pave the way towards accurate, efficient, and real-time neural speech decoding. First, we incorporate large amounts of time-masking during training. On average, over $50\%$ of each trial is masked. Second, we replace the gated recurrent unit (GRU) architecture used in the baseline algorithm with a compact Transformer. The Transformer architecture uses $83\%$ fewer…
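To make the time-masking idea in the abstract concrete, here is a minimal sketch of one common way to do it: zeroing out random contiguous spans of time steps in a trial before it is fed to the decoder. The excerpt above does not specify the paper's exact scheme, so the function name `time_mask`, the parameters `num_masks` and `max_frac`, and the choice of zero as the fill value are all assumptions made for illustration. The fraction of each trial that ends up masked depends on these settings, which are not the paper's.

```python
import random


def time_mask(trial, num_masks=4, max_frac=0.25, rng=None):
    """Zero out random contiguous spans of time steps in one trial.

    trial: list of time steps, each a list of per-channel features.
    num_masks, max_frac: hypothetical hyperparameters (not from the paper).
    Returns (masked_trial, masked_flags); the input is left unmodified.
    """
    if rng is None:
        rng = random.Random()
    n_steps = len(trial)
    masked_flags = [False] * n_steps
    max_len = int(max_frac * n_steps)
    for _ in range(num_masks):
        length = rng.randint(0, max_len)          # inclusive on both ends
        start = rng.randint(0, n_steps - length)  # span always fits in the trial
        for t in range(start, start + length):
            masked_flags[t] = True                # spans may overlap
    # Assumed fill value: zeros. Unmasked steps are copied unchanged.
    masked_trial = [
        [0.0] * len(step) if flag else list(step)
        for step, flag in zip(trial, masked_flags)
    ]
    return masked_trial, masked_flags


# Usage: mask a toy trial of 100 time steps with 2 features each.
toy_trial = [[1.0, 2.0] for _ in range(100)]
masked, flags = time_mask(toy_trial, rng=random.Random(0))
print(f"masked fraction: {sum(flags) / len(flags):.2f}")
```

Masking is applied only during training in this sketch. At evaluation time the trial would be passed through unmasked.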
