NeuroCLIP: Brain-Inspired Prompt Tuning for EEG-to-Image Multimodal Contrastive Learning

Jiyuan Wang; Li Zhang; Haipeng Lin; Qile Liu; Gan Huang; Ziyu Li; Zhen Liang; and Xia Wu

arXiv:2511.09250 · cs.IR · November 13, 2025

Open Access

TL;DR

NeuroCLIP introduces a brain-inspired prompt tuning framework for EEG-to-image contrastive learning. It strengthens neural-visual alignment through adaptive prompts and a neuroscience-inspired loss, reaching 63.2% Top-1 accuracy in zero-shot image retrieval on the THINGS-EEG2 dataset.

Contribution

The paper presents what the authors describe as the first integration of visual prompt tokens into EEG-image alignment, together with a dual-stream adaptive prompt design inspired by neuroscience principles.

Findings

01. Achieved 63.2% Top-1 accuracy in zero-shot image retrieval on the THINGS-EEG2 dataset.

02. Surpassed previous methods by +12.3% in accuracy.

03. Demonstrated strong generalization across subjects (+4.6% Top-1).
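
For readers unfamiliar with the metric, Top-k accuracy in zero-shot image retrieval is usually computed by ranking a gallery of candidate image embeddings by cosine similarity to each EEG embedding and checking whether the paired image appears among the first k results. The sketch below illustrates that general procedure only. The function names, the toy vectors, and the assumption that `eeg_embeddings[i]` is paired with `image_embeddings[i]` are illustrative and are not taken from the paper or its evaluation code.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def top_k_accuracy(eeg_embeddings, image_embeddings, k=1):
    """Fraction of EEG queries whose paired image ranks within the top k.

    Assumption (hypothetical layout): eeg_embeddings[i] is paired with
    image_embeddings[i], and the whole image list is the retrieval gallery.
    """
    if len(eeg_embeddings) != len(image_embeddings) or not eeg_embeddings:
        raise ValueError("need equally sized, non-empty embedding lists")

    queries = [l2_normalize(v) for v in eeg_embeddings]
    gallery = [l2_normalize(v) for v in image_embeddings]

    hits = 0
    for i, query in enumerate(queries):
        # Cosine similarity between this EEG query and every candidate image.
        sims = [sum(a * b for a, b in zip(query, img)) for img in gallery]
        # Candidate indices sorted from most to least similar.
        ranked = sorted(range(len(gallery)), key=lambda j: sims[j], reverse=True)
        if i in ranked[:k]:
            hits += 1
    return hits / len(queries)


# Toy embeddings, invented for illustration only.
toy_eeg = [[1.0, 0.1, 0.0], [0.0, 1.0, 0.2], [0.9, 0.0, 0.3]]
toy_images = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

print("Top-1:", top_k_accuracy(toy_eeg, toy_images, k=1))
print("Top-2:", top_k_accuracy(toy_eeg, toy_images, k=2))
```

In the toy data, the third EEG vector lies closer to the first image than to its own pair, so two of the three queries retrieve their paired image first, while all three find it within the top two.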

Abstract

Recent advances in brain-inspired artificial intelligence have sought to align neural signals with visual semantics using multimodal models such as CLIP. However, existing methods often treat CLIP as a static feature extractor, overlooking its adaptability to neural representations and the inherent physiological-symbolic gap in EEG-image alignment. To address these challenges, we present NeuroCLIP, a prompt tuning framework tailored for EEG-to-image contrastive learning. Our approach introduces three core innovations: (1) We design a dual-stream visual embedding pipeline that combines dynamic filtering and token-level fusion to generate instance-level adaptive prompts, which guide the adjustment of patch embedding tokens based on image content, thereby enabling fine-grained modulation of visual representations under neural constraints; (2) We are the first to introduce visual prompt…
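
As background for the contrastive setup the abstract refers to, the following is a minimal sketch of the standard CLIP-style symmetric contrastive (InfoNCE) objective applied to paired EEG and image embeddings. It is not the neuroscience-inspired loss proposed in the paper, whose form is not given in the text above. The function names, the batch layout (row i of each batch forms a matched pair), and the temperature value of 0.07 are assumptions chosen for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def log_softmax(row):
    """Numerically stable log-softmax over one row of logits."""
    peak = max(row)
    log_total = peak + math.log(sum(math.exp(x - peak) for x in row))
    return [x - log_total for x in row]


def symmetric_contrastive_loss(eeg_batch, image_batch, temperature=0.07):
    """Standard symmetric InfoNCE loss over a batch of paired embeddings.

    Assumption (hypothetical layout): eeg_batch[i] and image_batch[i] are a
    matched pair; every other combination in the batch acts as a negative.
    """
    if len(eeg_batch) != len(image_batch) or not eeg_batch:
        raise ValueError("need equally sized, non-empty batches")

    eeg = [l2_normalize(v) for v in eeg_batch]
    images = [l2_normalize(v) for v in image_batch]
    n = len(eeg)

    # logits[i][j] = cosine similarity of EEG i and image j, scaled by temperature.
    logits = [
        [sum(a * b for a, b in zip(e, img)) / temperature for img in images]
        for e in eeg
    ]

    # EEG -> image direction: the correct "class" for row i is column i.
    eeg_to_image = -sum(log_softmax(logits[i])[i] for i in range(n)) / n

    # Image -> EEG direction: same computation on the transposed logits.
    transposed = [list(col) for col in zip(*logits)]
    image_to_eeg = -sum(log_softmax(transposed[i])[i] for i in range(n)) / n

    return 0.5 * (eeg_to_image + image_to_eeg)
```

The loss is small when each EEG embedding is most similar to its own image and grows when the pairing is scrambled, which is the property a retrieval model trained this way relies on.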

Taxonomy

Topics: Face Recognition and Perception · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications