# From pronounced to imagined: improving speech decoding with multi-condition EEG data

**Authors:** Denise Alonso-Vázquez, Omar Mendoza-Montoya, Ricardo Caraza, Hector R. Martinez, Javier M. Antelis

PMC · DOI: 10.3389/fninf.2025.1583428 · Frontiers in Neuroinformatics · 2025-06-27

## TL;DR

This study shows that training on EEG data from both overt (pronounced) and imagined speech can improve imagined speech decoding for some word pairs in participant-specific models, which could benefit people with motor neuron diseases.

## Contribution

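The study tests whether adding overt speech EEG data to the training set improves imagined speech decoding. It compares four training scenarios (three intra-subject and one multi-subject), all implemented with EEGNet.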

## Key findings

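- In the intra-subject scenario, training on overt plus imagined speech improved binary classification accuracy by 3%–5.17% in four out of ten word pairs, compared with training on imagined speech only.
- Including overt speech data raised the number of participants surpassing 70% accuracy from 10 to 15.
- In the intra-subject multi-class scenario, adding overt speech did not yield statistically significant improvements.
- Word length, phonological complexity, and frequency of use contributed to higher discriminability between certain imagined word pairs.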

## Abstract

Imagined speech decoding using EEG holds promising applications for individuals with motor neuron diseases, although its performance remains limited due to small dataset sizes and the absence of sensory feedback. Here, we investigated whether incorporating EEG data from overt (pronounced) speech could enhance imagined speech classification.

Twenty-four healthy participants pronounced and imagined five Spanish words. Our approach systematically compares four classification scenarios that differ in the training dataset: three intra-subject scenarios (using only imagined speech, combining overt and imagined speech, and using only overt speech) and one multi-subject scenario (combining overt speech data from different participants with the imagined speech of the target participant). We implemented all scenarios using the convolutional neural network EEGNet.
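The four training-set compositions described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `build_training_set`, the scenario labels, the dictionary layout, and the array shapes are all hypothetical, and the sketch assumes that "different participants" in the multi-subject scenario means every participant other than the target.

```python
import numpy as np

def build_training_set(scenario, target, data):
    """Assemble (X, y) for one training scenario (hypothetical helper).

    data maps participant id -> {"overt": (X, y), "imagined": (X, y)},
    where X has shape (trials, channels, samples) and y holds word labels.
    """
    imagined = data[target]["imagined"]
    overt = data[target]["overt"]
    if scenario == "imagined_only":
        parts = [imagined]
    elif scenario == "overt_plus_imagined":
        parts = [overt, imagined]
    elif scenario == "overt_only":
        parts = [overt]
    elif scenario == "multi_subject":
        # Assumption: overt trials come from all participants except the target.
        others = [data[s]["overt"] for s in sorted(data) if s != target]
        parts = others + [imagined]
    else:
        raise ValueError(f"unknown scenario: {scenario}")
    X = np.concatenate([p[0] for p in parts], axis=0)
    y = np.concatenate([p[1] for p in parts], axis=0)
    return X, y

# Synthetic stand-in data: 3 participants, 4 trials per condition,
# 8 channels, 16 time samples, 5 word classes. Not real EEG.
rng = np.random.default_rng(0)

def fake_trials(n):
    return rng.standard_normal((n, 8, 16)), rng.integers(0, 5, size=n)

toy = {s: {"overt": fake_trials(4), "imagined": fake_trials(4)}
       for s in ("P01", "P02", "P03")}

X, y = build_training_set("multi_subject", "P01", toy)
print(X.shape)  # (12, 8, 16): 2 other participants x 4 overt + 4 imagined
```

Only the training set changes between scenarios in this sketch. How trials are held out for evaluation is not specified in this summary, so the example leaves it out.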

In binary word-pair classifications, combining overt and imagined speech data in the intra-subject scenario led to accuracy improvements of 3%–5.17% in four out of 10 word pairs, compared to training with imagined speech only. Although the highest individual accuracy (95%) was achieved with imagined speech alone, the inclusion of overt speech data allowed more participants to surpass 70% accuracy, increasing from 10 (imagined only) to 15 participants. In the intra-subject multi-class scenario, combining overt and imagined speech did not yield statistically significant improvements over using imagined speech exclusively.
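The "10 word pairs" follow from choosing 2 of the 5 words, and the participant counts come from thresholding per-participant accuracy at 70%. The sketch below illustrates both steps. The word labels are placeholders (the five Spanish words are not listed in this summary), the helper `count_above` is hypothetical, and the accuracies are made-up values, not results from the study.

```python
from itertools import combinations

# Placeholder labels; the actual five Spanish words are not given here.
words = ["w1", "w2", "w3", "w4", "w5"]

# Every unordered pair of distinct words defines one binary classification task.
pairs = list(combinations(words, 2))
print(len(pairs))  # 10

def count_above(acc_by_participant, threshold=0.70):
    """Count participants whose accuracy strictly exceeds the threshold."""
    return sum(1 for acc in acc_by_participant.values() if acc > threshold)

# Illustrative accuracies only (fractions in [0, 1]).
example_acc = {"P01": 0.81, "P02": 0.68, "P03": 0.72, "P04": 0.70}
print(count_above(example_acc))  # 2 (0.70 itself does not surpass the threshold)
```

With balanced classes, chance level is 50% for each binary task and 20% for the five-class task, which is useful context when reading the 70% threshold.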

Finally, we observed that features such as word length, phonological complexity, and frequency of use contributed to higher discriminability between certain imagined word pairs. These findings suggest that incorporating overt speech data can improve imagined speech decoding in individualized models, offering a feasible strategy to support the early adoption of brain-computer interfaces before speech deterioration occurs in individuals with motor neuron diseases.

## Full-text entities

- **Diseases:** motor neuron diseases (MESH:D016472)
- **Species:** Homo sapiens (human) [NCBI taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12245923/full.md

## Figures

13 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12245923/full.md

## References

45 references — full list in the complete paper: https://tomesphere.com/paper/PMC12245923/full.md

---
Source: https://tomesphere.com/paper/PMC12245923