Time-Masked Transformers with Lightweight Test-Time Adaptation for Neural Speech Decoding
Ebrahim Feghhi, Shreyas Kaasyap, Nima Hadidi, Jonathan C. Kao

TL;DR
This paper combines time-masked training, a compact Transformer architecture, and lightweight test-time adaptation for neural speech decoding, improving accuracy and lowering computational cost in a real-time decoding setting.
Contribution
It presents a new combination of time-masking, a compact Transformer, and a lightweight adaptation method for neural speech decoding, enabling real-time application with lower resource requirements.
Findings
Reduced word error rate by over 20%
Lowered computational costs and memory usage
Effective performance across held-out days in real-time decoding
Abstract
Speech neuroprostheses aim to restore communication for people with severe paralysis by decoding speech directly from neural activity. To accelerate algorithmic progress, a recent benchmark released intracranial recordings from a paralyzed participant attempting to speak, along with a baseline decoding algorithm. Prior work on the benchmark showed impressive accuracy gains. However, these gains increased computational costs and were not demonstrated in a real-time decoding setting. Here, we make three contributions that pave the way towards accurate, efficient, and real-time neural speech decoding. First, we incorporate large amounts of time-masking during training. On average, over of each trial is masked. Second, we replace the gated recurrent unit (GRU) architecture used in the baseline algorithm with a compact Transformer. The Transformer architecture uses fewer…
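The time-masking idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `time_mask`, the parameters `num_masks` and `max_mask_frac`, and their default values are assumptions chosen for illustration, as is the choice to fill masked spans with zeros. The sketch assumes one trial is stored as an array of shape (time steps, channels).

```python
import numpy as np

def time_mask(features, num_masks=4, max_mask_frac=0.2, rng=None):
    """Zero out random contiguous spans of time steps in a single trial.

    features: array of shape (T, C), time steps by channels.
    num_masks, max_mask_frac: hypothetical settings, not taken from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    masked = features.copy()  # leave the caller's array untouched
    num_steps = features.shape[0]
    max_len = max(1, int(max_mask_frac * num_steps))
    for _ in range(num_masks):
        # Span length is drawn from [0, max_len]; the upper bound of
        # Generator.integers is exclusive, hence the +1.
        length = int(rng.integers(0, max_len + 1))
        # Choose a start index so the span fits inside the trial.
        start = int(rng.integers(0, num_steps - length + 1))
        masked[start:start + length, :] = 0.0
    return masked

# Example: mask a synthetic trial of 100 time steps and 8 channels.
trial = np.ones((100, 8))
augmented = time_mask(trial, rng=np.random.default_rng(0))
```

Masking would be applied only during training, with fresh random spans drawn each time a trial is seen, so the decoder cannot rely on any single stretch of neural activity. Spans may overlap, so the total masked fraction is at most `num_masks * max_mask_frac`.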
