BrainCLIP: Bridging Brain and Visual-Linguistic Representation Via CLIP for Generic Natural Visual Stimulus Decoding

Yulong Liu; Yongqiang Ma; Wei Zhou; Guibo Zhu; Nanning Zheng

arXiv:2302.12971·cs.CV·May 16, 2023·6 cites

TL;DR

BrainCLIP introduces a novel brain decoding model that leverages CLIP's cross-modal capabilities to decode and reconstruct natural images from fMRI data, achieving state-of-the-art results in semantic fidelity.

Contribution

The authors propose what they describe as the first task-agnostic fMRI-based brain decoding model. It uses CLIP as a bridge for generic brain decoding tasks, including zero-shot visual category decoding, fMRI-image/text matching, and fMRI-to-image generation.

Findings

1. Outperforms existing methods in zero-shot visual category decoding

2. Achieves high semantic fidelity in image reconstruction from fMRI

3. Establishes new state-of-the-art in fMRI-based natural image reconstruction

Abstract

Due to the lack of paired samples and the low signal-to-noise ratio of functional MRI (fMRI) signals, reconstructing perceived natural images or decoding their semantic contents from fMRI data are challenging tasks. In this work, we propose, for the first time, a task-agnostic fMRI-based brain decoding model, BrainCLIP, which leverages CLIP's cross-modal generalization ability to bridge the modality gap between brain activity, image, and text. Our experiments demonstrate that CLIP can act as a pivot for generic brain decoding tasks, including zero-shot visual categories decoding, fMRI-image/text matching, and fMRI-to-image generation. Specifically, BrainCLIP aims to train a mapping network that transforms fMRI patterns into a well-aligned CLIP embedding space by combining visual and textual supervision. Our experiments show that this combination can boost the decoding model's…
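The abstract describes training a mapping network that places fMRI patterns in CLIP's embedding space under combined visual and textual supervision. The sketch below illustrates one common way such a combination can be set up: a symmetric contrastive (InfoNCE) loss against image embeddings, a second one against text embeddings, and a weighted sum of the two. The abstract does not state the actual loss, so the loss form, the `alpha` weight, the `temperature` value, and all function names here are assumptions for illustration, not the authors' implementation. The embeddings are plain lists standing in for the mapping network's outputs and for precomputed CLIP features.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (assumes the vector is non-zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))


def info_nce(queries, keys, temperature=0.07):
    """Mean cross-entropy of matching queries[i] to keys[i] among all keys.

    temperature=0.07 is an illustrative default, not a value from the paper.
    """
    total = 0.0
    for i, query in enumerate(queries):
        logits = [cosine(query, key) / temperature for key in keys]
        # Numerically stable log-sum-exp over the similarity logits.
        peak = max(logits)
        log_z = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_z - logits[i]
    return total / len(queries)


def combined_loss(fmri_emb, image_emb, text_emb, alpha=0.5):
    """Weighted sum of symmetric fMRI-image and fMRI-text contrastive losses.

    alpha is a hypothetical mixing weight: 1.0 uses visual supervision only,
    0.0 uses textual supervision only.
    """
    visual = 0.5 * (info_nce(fmri_emb, image_emb) + info_nce(image_emb, fmri_emb))
    textual = 0.5 * (info_nce(fmri_emb, text_emb) + info_nce(text_emb, fmri_emb))
    return alpha * visual + (1.0 - alpha) * textual


def zero_shot_decode(fmri_vec, class_text_emb):
    """Return the category whose text embedding is most similar to fmri_vec."""
    return max(class_text_emb, key=lambda name: cosine(fmri_vec, class_text_emb[name]))


if __name__ == "__main__":
    # Toy 2-D embeddings; real CLIP embeddings have hundreds of dimensions.
    fmri = [[0.9, 0.1], [0.2, 0.8]]
    images = [[1.0, 0.0], [0.0, 1.0]]
    texts = [[0.8, 0.2], [0.1, 0.9]]
    print("combined loss:", combined_loss(fmri, images, texts))
    print("decoded:", zero_shot_decode(fmri[0], {"cat": texts[0], "dog": texts[1]}))
```

Once fMRI embeddings live in the same space as CLIP text embeddings, zero-shot category decoding reduces to a nearest-neighbour lookup over class prompts, which is what `zero_shot_decode` shows on toy data.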

Code & Models

Repositories

YulongBonjour/BrainCLIP
PyTorch · Official

Taxonomy

Topics: Image Processing Techniques and Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques

Methods: Contrastive Language-Image Pre-training