LightBeam: An Accurate and Memory-Efficient CTC Decoder for Speech Neuroprostheses
Ebrahim Feghhi, Junlin Hu, Nima Hadidi, Jonathan C. Kao

TL;DR
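LightBeam is a non-WFST CTC decoder that achieves state-of-the-art speech decoding from neural data on the Brain-to-Text '24 and '25 benchmarks while needing only 10 GB of RAM, down from the 320 GB required by the WFST-based decoder used in prior work. The lower memory footprint makes the approach more accessible to both patients and researchers.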
Contribution
The paper introduces LightBeam, a non-WFST CTC decoder that integrates an LLM into the beam search via delayed fusion, cutting the RAM requirement from 320 GB to 10 GB while achieving state-of-the-art performance.
Findings
Achieves state-of-the-art results on both the Brain-to-Text '24 and '25 benchmarks.
Reduces RAM requirement from 320 GB to 10 GB.
Open-source implementation in Python.
Abstract
A promising pathway for restoring communication in patients with dysarthria and anarthria is speech neuroprostheses, which directly decode speech from cortical neural activity. Two benchmarks, Brain-to-Text '24 and '25, released intracranial recordings from patients with dysarthria along with a baseline algorithm trained with Connectionist Temporal Classification (CTC). Despite significant innovation on these benchmarks, all leading published prior work relies on a WFST-based CTC decoder that requires 320 GB of RAM. These memory requirements limit accessibility for both patients and researchers. Here, we propose LightBeam, a non-WFST based CTC decoder that requires only 10 GB of RAM and achieves state-of-the-art performance on both benchmarks. LightBeam achieves this by integrating an LLM into the beam-search process via delayed fusion, obviating the prior need for using…
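To make the idea of fusing a language model into CTC beam search more concrete, the following is a minimal sketch and not LightBeam's implementation, whose details are not given in the excerpt above. It runs a standard CTC prefix beam search and consults the language model only every few frames, which is one simple reading of "delayed" fusion. The function name `ctc_beam_search_delayed_lm`, the callback `lm_fn` (a stand-in for an LLM log-probability scorer), and the parameters `fuse_every` and `lm_weight` are all hypothetical.

```python
import math
from collections import defaultdict

NEG_INF = float("-inf")


def logsumexp(a, b):
    """Numerically stable log(exp(a) + exp(b))."""
    if a == NEG_INF:
        return b
    if b == NEG_INF:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def ctc_beam_search_delayed_lm(log_probs, vocab, lm_fn, beam_size=8,
                               lm_weight=0.5, fuse_every=4, blank=0):
    """Toy CTC prefix beam search with periodic (delayed) LM scoring.

    log_probs: T x V per-frame log-probabilities (list of lists).
    lm_fn: hypothetical callback mapping a text prefix to a log score.
    """
    # prefix -> (log p of paths ending in blank, log p ending in non-blank)
    beams = {(): (0.0, NEG_INF)}
    # Last known LM score per prefix; may be stale between fusion steps.
    lm_scores = {(): 0.0}
    num_frames = len(log_probs)

    for t, frame in enumerate(log_probs):
        nxt = defaultdict(lambda: (NEG_INF, NEG_INF))
        for prefix, (pb, pnb) in beams.items():
            total = logsumexp(pb, pnb)
            for k, lp in enumerate(frame):
                if k == blank:
                    b, nb = nxt[prefix]
                    nxt[prefix] = (logsumexp(b, total + lp), nb)
                    continue
                ext = prefix + (k,)
                if prefix and prefix[-1] == k:
                    # Repeated symbol without a blank collapses into prefix.
                    b, nb = nxt[prefix]
                    nxt[prefix] = (b, logsumexp(nb, pnb + lp))
                    # Only paths that ended in blank may extend the prefix.
                    b, nb = nxt[ext]
                    nxt[ext] = (b, logsumexp(nb, pb + lp))
                else:
                    b, nb = nxt[ext]
                    nxt[ext] = (b, logsumexp(nb, total + lp))
                # New hypotheses inherit the parent's (stale) LM score.
                lm_scores.setdefault(ext, lm_scores[prefix])

        def score(p):
            return logsumexp(*nxt[p]) + lm_weight * lm_scores[p]

        candidates = list(nxt)
        if (t + 1) % fuse_every == 0 or t == num_frames - 1:
            # Fusion step: shortlist by stale score, then refresh LM scores.
            candidates = sorted(candidates, key=score, reverse=True)
            candidates = candidates[: 2 * beam_size]
            for p in candidates:
                lm_scores[p] = lm_fn("".join(vocab[i] for i in p))

        kept = sorted(candidates, key=score, reverse=True)[:beam_size]
        beams = {p: nxt[p] for p in kept}
        lm_scores = {p: lm_scores[p] for p in kept}

    best = max(beams, key=lambda p: logsumexp(*beams[p])
               + lm_weight * lm_scores[p])
    return "".join(vocab[i] for i in best)
```

In this sketch the language model is called at most `2 * beam_size` times per fusion step rather than once per token expansion, which is the main reason to delay scoring when `lm_fn` is expensive. How LightBeam actually schedules and weights its LLM scores is not specified in the text above.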
Taxonomy
Topics: Voice and Speech Disorders · Speech Recognition and Synthesis · Stuttering Research and Treatment
