Enhancing Listened Speech Decoding from EEG via Parallel Phoneme Sequence Prediction
Jihwan Lee, Tiantian Feng, Aditya Kommineni, Sudarsana Reddy Kadiri, Shrikanth Narayanan

TL;DR
This paper introduces a novel EEG-based speech decoding method that simultaneously predicts speech waveforms and phoneme sequences, improving accuracy and efficiency for brain-computer interfaces aiding speech-impaired individuals.
Contribution
The paper presents a new multi-task model architecture that jointly decodes speech waveforms and phoneme sequences from EEG signals, outperforming previous methods.
Findings
Outperforms previous EEG speech decoding methods.
Provides simultaneous decoding of speech and phonemes.
Enables real-time, multi-modal speech reconstruction from EEG.
Abstract
Brain-computer interfaces (BCI) offer numerous human-centered application possibilities, particularly affecting people with neurological disorders. Text or speech decoding from brain activities is a relevant domain that could augment the quality of life for people with impaired speech perception. We propose a novel approach to enhance listened speech decoding from electroencephalography (EEG) signals by utilizing an auxiliary phoneme predictor that simultaneously decodes textual phoneme sequences. The proposed model architecture consists of three main parts: EEG module, speech module, and phoneme predictor. The EEG module learns to properly represent EEG signals into EEG embeddings. The speech module generates speech waveforms from the EEG embeddings. The phoneme predictor outputs the decoded phoneme sequences in text modality. Our proposed approach allows users to obtain decoded…
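The three-part layout described in the abstract (an EEG module producing embeddings, a speech module producing a waveform, and a phoneme predictor producing a phoneme sequence) can be sketched as a shared encoder with two output heads. The following is a minimal illustration only, not the authors' implementation: the class name `JointDecoder`, the single linear layers, the tanh nonlinearity, the L1 plus cross-entropy objective, and the `weight` parameter are all assumptions chosen to keep the example small.

```python
import math
import random


def linear(x, w, b):
    """Apply y = W x + b to one input vector (W is a list of rows)."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]


def init(rows, cols, rng):
    """Small random weight matrix and a zero bias vector."""
    w = [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    return w, [0.0] * rows


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


class JointDecoder:
    """Hypothetical shared-encoder model with two heads (illustrative sizes)."""

    def __init__(self, n_channels, d_embed, hop, n_phonemes, seed=0):
        rng = random.Random(seed)
        self.eeg = init(d_embed, n_channels, rng)      # EEG module
        self.speech = init(hop, d_embed, rng)          # speech module
        self.phoneme = init(n_phonemes, d_embed, rng)  # phoneme predictor

    def forward(self, eeg_frames):
        waveform, phoneme_probs = [], []
        for frame in eeg_frames:
            # One shared embedding per EEG frame feeds both heads.
            h = [math.tanh(v) for v in linear(frame, *self.eeg)]
            waveform.extend(linear(h, *self.speech))   # `hop` samples per frame
            phoneme_probs.append(softmax(linear(h, *self.phoneme)))
        return waveform, phoneme_probs


def joint_loss(waveform, target_wave, phoneme_probs, target_ids, weight=0.5):
    """Assumed objective: waveform L1 error plus weighted phoneme cross-entropy."""
    l1 = sum(abs(a - b) for a, b in zip(waveform, target_wave)) / len(target_wave)
    ce = -sum(math.log(p[t]) for p, t in zip(phoneme_probs, target_ids)) / len(target_ids)
    return l1 + weight * ce


model = JointDecoder(n_channels=4, d_embed=8, hop=3, n_phonemes=5)
frames = [[0.1, -0.2, 0.3, 0.0] for _ in range(6)]  # toy EEG input, 6 frames
wave, probs = model.forward(frames)
loss = joint_loss(wave, [0.0] * len(wave), probs, [0, 1, 2, 3, 4, 0])
```

Because both heads read the same embedding, gradients from the phoneme objective would also shape the EEG module during training, which is the usual motivation for adding an auxiliary task of this kind.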
Taxonomy
Topics: Speech and Audio Processing · EEG and Brain-Computer Interfaces · Blind Source Separation Techniques
