Visualizing the Invisible: Generative Visual Grounding Empowers Universal EEG Understanding in MLLMs

Junyu Pan; Yansen Wang; Enze Zhang; Baoliang Lu; Weilong Zheng; Dongsheng Li

arXiv:2605.18172·cs.AI·May 19, 2026

TL;DR

The paper introduces Generative Visual Grounding (GVG), a framework that uses an EEG-to-image generative model to produce proxy images for EEG signals, so that multimodal language models can apply their visual priors when interpreting neural data.

Contribution

It proposes a novel EEG-to-image generative approach that enhances neural signal interpretation by providing structured visual contexts, complementing traditional text-based alignment.

Findings

01. GVG-X-Omni matches text-aligned baselines with fewer parameters.

02. Visual proxies improve EEG understanding and visual generation.

03. Trimodal alignment yields consistent performance gains.

Abstract

Leveraging the universal representations of pre-trained LLMs and MLLMs offers a promising path toward brain foundation models. However, visually-evoked EEG datasets remain scarce, leading existing methods to align neural signals mainly with abstract text, a lossy translation that may discard fine-grained perceptual information encoded in brain activity. We propose Generative Visual Grounding (GVG), a framework that visualizes the invisible by using an EEG-to-image generative model as a visual translator. Instead of forcing EEG into text alone, GVG hallucinates instance-specific proxy images for non-visual EEG, providing structured visual contexts that allow MLLMs to exploit their visual priors for clinical-state interpretation. We validate this idea on two MLLM backbones, GVG-X-Omni and GVG-Janus. Image-only alignment is already competitive: the lightweight GVG-X-Omni matches…
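
The data flow the abstract describes (EEG recording, then an instance-specific proxy image, then an MLLM query) can be sketched roughly as below. This is only an illustrative outline, not the authors' implementation: `eeg_to_proxy_image`, `ProxyImage`, `MultimodalPrompt`, and `interpret_eeg` are hypothetical names, the "generator" is a trivial deterministic resampling that stands in for a trained EEG-to-image model, and the MLLM is any callable the caller supplies.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ProxyImage:
    """Toy grayscale image: rows of pixel values in [0, 1]."""
    pixels: List[List[float]]


@dataclass
class MultimodalPrompt:
    """Image-plus-text input handed to the (assumed) MLLM."""
    image: ProxyImage
    text: str


def eeg_to_proxy_image(eeg: List[List[float]], size: int = 8) -> ProxyImage:
    """Hypothetical stand-in for a trained EEG-to-image generator.

    Takes an EEG recording shaped channels x samples and returns a
    size x size image. A real system would use a learned generative
    model; here the signal is simply resampled and rescaled so the
    output depends on the specific recording.
    """
    flat = [value for channel in eeg for value in channel]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero on constant signals
    n = size * size
    # Pick n evenly spaced samples and rescale them to [0, 1].
    values = [(flat[(k * len(flat)) // n] - lo) / span for k in range(n)]
    pixels = [values[r * size:(r + 1) * size] for r in range(size)]
    return ProxyImage(pixels)


def interpret_eeg(
    eeg: List[List[float]],
    question: str,
    mllm: Callable[[MultimodalPrompt], str],
) -> str:
    """Ground the EEG in a proxy image, then query the MLLM with it."""
    image = eeg_to_proxy_image(eeg)
    return mllm(MultimodalPrompt(image=image, text=question))


if __name__ == "__main__":
    # Dummy MLLM (assumption): reports what it received instead of reasoning.
    def dummy_mllm(prompt: MultimodalPrompt) -> str:
        return f"{len(prompt.image.pixels)} rows | {prompt.text}"

    recording = [[0.0, 1.0, 2.0, 3.0], [4.0, 5.0, 6.0, 7.0]]
    print(interpret_eeg(recording, "What clinical state is suggested?", dummy_mllm))
```

The point of the sketch is the ordering of the stages: the EEG is never passed to the language model directly, only through the image it is translated into, which is what lets a vision-capable model reuse its existing visual priors.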
