BrainChat: Decoding Semantic Information from fMRI using Vision-language Pretrained Models

Wanaiu Huang

arXiv:2406.07584 · cs.CV · June 13, 2024

TL;DR

BrainChat is a novel framework that decodes semantic information from fMRI data using vision-language pretrained models, enabling tasks like captioning and question answering with high accuracy and flexibility.

Contribution

This paper introduces BrainChat, a new generative approach leveraging vision-language models and contrastive learning to decode semantic brain activity, including first-time fMRI question answering.

Findings

01 Outperforms existing methods in fMRI captioning

02 Achieves first implementation of fMRI question answering

03 Effective even without image data

Abstract

Semantic information is vital for human interaction, and decoding it from brain activity enables non-invasive clinical augmentative and alternative communication. While there has been significant progress in reconstructing visual images, few studies have focused on the language aspect. To address this gap, leveraging the powerful capabilities of the decoder-based vision-language pretrained model CoCa, this paper proposes BrainChat, a simple yet effective generative framework aimed at rapidly accomplishing semantic information decoding tasks from brain activity, including fMRI question answering and fMRI captioning. BrainChat employs the self-supervised approach of Masked Brain Modeling to encode sparse fMRI data, obtaining a more compact embedding representation in the latent space. Subsequently, BrainChat bridges the gap between modalities by applying contrastive loss, resulting in…
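The two training ingredients named in the abstract, masked modeling of the fMRI signal and a contrastive loss across modalities, can be illustrated with a small sketch. This is not the paper's implementation: the helper names (`mask_patches`, `symmetric_contrastive_loss`), the patch size, the mask ratio, the temperature of 0.07, and the symmetric InfoNCE form of the loss are all assumptions chosen for illustration, and plain Python lists stand in for the encoder outputs.

```python
import math
import random


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def mask_patches(signal, patch_size, mask_ratio, rng):
    """Zero out a random subset of fixed-size patches of a 1-D signal.

    Hypothetical stand-in for the masking step of masked modeling:
    an encoder would be trained to reconstruct the zeroed patches.
    Returns the masked signal and the sorted indices of masked patches.
    """
    num_patches = len(signal) // patch_size
    num_masked = int(num_patches * mask_ratio)
    masked_ids = sorted(rng.sample(range(num_patches), num_masked))
    masked = list(signal)
    for p in masked_ids:
        for i in range(p * patch_size, (p + 1) * patch_size):
            masked[i] = 0.0
    return masked, masked_ids


def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def symmetric_contrastive_loss(brain_emb, other_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired embeddings.

    Row i of brain_emb is assumed to be paired with row i of other_emb
    (for example, the embedding of the matching image or caption).
    """
    a = [l2_normalize(v) for v in brain_emb]
    b = [l2_normalize(v) for v in other_emb]
    # Cosine similarity of every brain embedding with every other embedding.
    logits = [
        [sum(x * y for x, y in zip(u, v)) / temperature for v in b]
        for u in a
    ]
    logits_t = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(logits_t))


rng = random.Random(0)
masked_signal, masked_ids = mask_patches([1.0] * 12, patch_size=3, mask_ratio=0.5, rng=rng)

aligned = symmetric_contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
mismatched = symmetric_contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In this toy setup the loss is near zero when each fMRI embedding points in the same direction as its paired embedding and grows large when the pairs are swapped, which is the behaviour a contrastive objective relies on to pull matching modalities together in the latent space.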

Taxonomy

Topics: Multimodal Machine Learning Applications · Explainable Artificial Intelligence (XAI) · Domain Adaptation and Few-Shot Learning