Assembling the Mind's Mosaic: Towards EEG Semantic Intent Decoding

Jiahe Li; Junru Chen; Fanqi Shen; Jialan Yang; Jada Li; Zhizhang Yuan; Baowen Cheng; Meng Li; Yang Yang

arXiv:2601.20447·q-bio.NC·January 29, 2026

TL;DR

This paper introduces Semantic Intent Decoding (SID), a novel framework that translates neural activity into natural language by modeling meaning as compositional semantic units, enhancing interpretability and expressiveness in brain-computer interfaces.

Contribution

The paper presents BrainMosaic, a deep learning architecture implementing SID, which decodes multiple semantic units from EEG/SEEG signals and reconstructs sentences, surpassing traditional classification methods.

Findings

01 SID achieves higher decoding accuracy than existing models.

02 BrainMosaic effectively reconstructs coherent sentences from neural signals.

03 The framework demonstrates robustness across multilingual and clinical datasets.

Abstract

Enabling natural communication through brain-computer interfaces (BCIs) remains one of the most profound challenges in neuroscience and neurotechnology. While existing frameworks offer partial solutions, they are constrained by oversimplified semantic representations and a lack of interpretability. To overcome these limitations, we introduce Semantic Intent Decoding (SID), a novel framework that translates neural activity into natural language by modeling meaning as a flexible set of compositional semantic units. SID is built on three core principles: semantic compositionality, continuity and expandability of semantic space, and fidelity in reconstruction. We present BrainMosaic, a deep learning architecture implementing SID. BrainMosaic decodes multiple semantic units from EEG/SEEG signals using set matching and then reconstructs coherent sentences through semantic-guided…
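The abstract names "set matching" as the step that pairs decoded semantic units with reference units, but this excerpt does not say which algorithm or similarity measure is used. The sketch below is therefore only an illustration under stated assumptions: units are plain vectors, similarity is cosine, and the best one-to-one assignment is found by exhaustive search. The names `cosine` and `match_sets` are hypothetical and do not come from the paper.

```python
import math
from itertools import permutations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def match_sets(predicted, targets):
    """Find the one-to-one assignment of predicted slots to target units
    that maximises total cosine similarity.

    Assumption (not from the paper): there are at least as many predicted
    slots as target units, and unmatched slots are simply left over.
    Returns (pairs, total_similarity) with pairs as (slot_index, target_index).
    """
    if len(predicted) < len(targets):
        raise ValueError("need at least as many predicted slots as targets")
    best_pairs, best_score = [], float("-inf")
    # Exhaustive search over assignments of distinct slots to targets.
    # This is only practical for small sets.
    for slots in permutations(range(len(predicted)), len(targets)):
        score = sum(cosine(predicted[s], targets[t]) for t, s in enumerate(slots))
        if score > best_score:
            best_score = score
            best_pairs = [(s, t) for t, s in enumerate(slots)]
    return best_pairs, best_score


# Toy data: three decoded slots, two reference units (made-up 2-D vectors).
predicted = [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]]
targets = [[0.0, 1.0], [1.0, 0.0]]
pairs, score = match_sets(predicted, targets)
print(sorted(pairs), round(score, 4))
```

Because the assignment is order-free, the result depends only on which units are present, not on their position in the sentence. For larger sets, exhaustive search becomes too slow, and a polynomial-time assignment solver such as the Hungarian algorithm is the usual substitute.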

Peer Reviews

Decision·ICLR 2026 Poster

Reviewer 01 · Rating 6 · Confidence 4

Strengths

1. Interesting formulation: treating an utterance as a set of semantic units is a neat and fairly novel way to handle variable-length EEG semantics.
2. Well-structured method: the three design principles (compositionality, continuity/expandability, fidelity) map cleanly to three modules.
3. Reasonable experimental design: the same idea is run on Chinese imagined speech, Chinese naturalistic reading, English reading, and a clinical SEEG case, which supports the claim that the approach is not tie

Weaknesses

1. **Train/validation split is underspecified:** the paper only mentions a unified 8:2 train–test split, but does not say whether the split is by subject or by trial. Could the model identify the sentence from the sample length alone?
2. **No true random / text-prior baselines:** the main metrics (UMA, MUS, sentence similarity) are not compared against (i) picking the same number of units at random or (ii) a text-only/corpus-frequency prior. This makes it hard to see how much of the score actually comes

Reviewer 02 · Rating 4 · Confidence 4

Strengths

The paper proposes an intriguing motivation and introduces a novel perspective for brain-to-text decoding. By modeling intent as an unordered and variable set of semantic units, it moves beyond traditional fixed-label or sequential decoding paradigms, potentially offering a more brain-plausible representation of semantic processing.

Weaknesses

1. Lack of robustness evaluation under input noise: the paper does not assess model performance under noisy or corrupted EEG inputs, which weakens the significance of the results reported in Tables 3, 4, and 5. Given the inherent noisiness of EEG signals, such evaluations are essential to validate the practical utility of the proposed method.
2. Insufficient justification for LLM-based sentence generation: the use of LLMs for sentence reconstruction raises concerns about data contamination, espec

Reviewer 03 · Rating 8 · Confidence 4

Strengths

1. The three principles (compositionality, continuity, fidelity) are well-motivated by linguistic and neuroscience evidence.
2. Interpretable pipeline: slots -> ranked retrieval -> prompted generation beats black-box end-to-end decoding.
3. Consistent empirical gains across multiple datasets and baselines with both concept-level and sentence-level metrics.
4. Comprehensive comparison with other relevant baselines.
5. Extensive supplementary material includes dataset details, baseline descriptions, and sensitivity analy

Weaknesses

1. Lack of qualitative examples of reconstruction quality. Without concrete examples, readers cannot verify (a) whether the predicted semantic unit sets are genuinely interpretable or noisy/scattered, or (b) how well the quantitative metrics match real semantic correctness. This falls below the standards of the neuro-decoding literature, where it is common to show a few samples of the proposed model's input and output alongside those of a baseline.
2. Lack of comparability to standard text metrics. While semantics-first metrics are

Taxonomy

Topics: EEG and Brain-Computer Interfaces · Functional Brain Connectivity Studies · Advanced Memory and Neural Computing