iPhoneme: Brain-to-Text Communication for ALS Using ConformerXL Decoding
Yoonmin Cha, Dawit Chun, Sung Park

TL;DR
iPhoneme is a real-time brain-to-text system for people with ALS that pairs a ConformerXL phoneme decoder with a gaze-assisted input interface, reaching 92.14% phoneme accuracy on intracranial EEG at 180 ms CPU latency.
Contribution
The paper introduces iPhoneme, integrating a ConformerXL-based phoneme decoder with a gaze-assisted interface, advancing neural decoding accuracy and input efficiency for speech BCIs.
Findings
Achieved 92.14% phoneme accuracy and 73.39% word accuracy on intracranial EEG data.
System operates with 180 ms latency on CPU, enabling real-time communication.
Outperforms prior state-of-the-art by approximately 3% in accuracy.
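This summary does not say how the phoneme and word accuracy figures are computed. A common convention in speech decoding, assumed here for illustration only, is accuracy = 1 - (edit distance / reference length), applied to phoneme or word sequences. The function names and the example sequences below are hypothetical, not taken from the paper.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def token_accuracy(ref, hyp):
    """Assumed metric: 1 - (edit distance / reference length), floored at 0."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return max(0.0, 1.0 - edit_distance(ref, hyp) / len(ref))


# Hypothetical example: one substituted phoneme out of four.
reference = ["HH", "AH", "L", "OW"]
decoded = ["HH", "EH", "L", "OW"]
acc = token_accuracy(reference, decoded)
```

The same function applies to word-level scoring by passing lists of words instead of phonemes. Under this convention a single phoneme error can change a whole word, which is one possible reason word accuracy trails phoneme accuracy.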
Abstract
Brain-computer interfaces (BCIs) for speech restoration hold transformative potential for the approximately 173,000–232,500 individuals worldwide with ALS-related dysarthria. Despite recent progress, high-performance speech BCIs have been demonstrated in only 22–31 patients globally, largely due to limitations in neural decoding accuracy and practical input interfaces. We present iPhoneme, a brain-to-text communication system that jointly addresses these challenges through integrated modeling and interaction design. The system combines a deep learning phoneme decoder based on a modified Conformer architecture (ConformerXL, 192.9M parameters) with a gaze-assisted phoneme input interface that mitigates the Midas touch problem in eye-tracking systems. The acoustic model incorporates a temporal prenet with multi-scale dilated convolutions and bidirectional GRU for neural jitter…
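The Midas touch problem named in the abstract is the tendency of eye-tracking interfaces to treat every glance as a selection. The abstract does not describe how iPhoneme mitigates it; the sketch below shows one generic and widely used remedy, dwell-time selection, purely as an illustration. The class name, the 600 ms default, and the phoneme labels are assumptions, not details from the paper.

```python
class DwellSelector:
    """Select a target only after gaze rests on it for dwell_ms (hypothetical helper)."""

    def __init__(self, dwell_ms=600):
        self.dwell_ms = dwell_ms
        self.target = None     # target currently under the gaze
        self.since_ms = None   # timestamp when gaze arrived on that target
        self.fired = False     # whether this fixation already produced a selection

    def update(self, target, t_ms):
        """Feed one gaze sample; return the target once when dwell completes, else None."""
        if target != self.target:
            # Gaze moved to a new target (or off all targets): restart the timer.
            self.target = target
            self.since_ms = t_ms
            self.fired = False
            return None
        if (target is not None and not self.fired
                and t_ms - self.since_ms >= self.dwell_ms):
            self.fired = True
            return target
        return None


# A brief glance at "AH" is ignored; a sustained fixation on "L" selects it once.
selector = DwellSelector(dwell_ms=500)
events = [selector.update(tgt, t) for tgt, t in
          [("AH", 0), ("AH", 300), ("L", 400), ("L", 800), ("L", 900), ("L", 1000)]]
```

The threshold trades speed against false selections: a shorter dwell speeds up input but brings back accidental activations.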
