Toward Generalizing Visual Brain Decoding to Unseen Subjects

Xiangtao Kong; Kexin Huang; Ping Li; Lei Zhang

arXiv:2410.14445·cs.CV·October 22, 2024

Open Access · 1 Repo · 3 Reviews

TL;DR

This paper investigates whether visual brain decoding models can generalize to unseen subjects by using a large dataset and a uniform learning paradigm, revealing inherent similarities in brain activities across individuals.

Contribution

The paper consolidates an image-fMRI dataset from the Human Connectome Project (HCP) and presents a uniform learning paradigm, showing that brain decoding has the potential to generalize across subjects.

Findings

01 Network generalizes better with more training subjects

02 Generalization is consistent across MLP, CNN, and Transformer architectures

03 Subject similarity influences decoding performance

Abstract

Visual brain decoding aims to decode visual information from human brain activities. Despite great progress, one critical limitation of current brain decoding research lies in the lack of generalization capability to unseen subjects. Prior works typically focus on decoding brain activity of individuals based on the observation that different subjects exhibit different brain activities, while it remains unclear whether brain decoding can be generalized to unseen subjects. This study aims to answer this question. We first consolidate an image-fMRI dataset consisting of stimulus-image and fMRI-response pairs, involving 177 subjects in the movie-viewing task of the Human Connectome Project (HCP). This dataset allows us to investigate the brain decoding performance as the number of participants increases. We then present a learning paradigm that applies uniform processing across all subjects,…
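The unseen-subject setting described in the abstract can be illustrated with a subject-level split: test subjects are held out entirely, and the number of training subjects is varied while the held-out set stays fixed. The following is a minimal sketch under stated assumptions, not the authors' code. The function name `split_subjects`, the integer subject IDs, the held-out count of 10, and the training-set sizes are all hypothetical; only the total of 177 subjects comes from the abstract.

```python
import random


def split_subjects(subject_ids, n_train, n_test, seed=0):
    """Return disjoint (train, test) lists of subject IDs.

    Hypothetical helper: the shuffle depends only on the seed and the set
    of IDs, and the test subjects are taken first, so the held-out set is
    identical for every value of n_train.
    """
    ids = sorted(set(subject_ids))
    if n_train + n_test > len(ids):
        raise ValueError("not enough subjects for the requested split")
    rng = random.Random(seed)
    rng.shuffle(ids)
    test = ids[:n_test]                    # never used for training
    train = ids[n_test:n_test + n_train]   # grows with n_train
    return train, test


if __name__ == "__main__":
    subjects = range(177)  # 177 subjects per the abstract; integer IDs are assumed
    for n_train in (10, 50, 100, 167):     # illustrative sizes, not from the paper
        train, test = split_subjects(subjects, n_train, n_test=10)
        print(n_train, "training subjects,", len(test), "unseen test subjects")
```

Because the held-out subjects are chosen before the training subjects, any change in test performance across runs can be attributed to the size of the training pool rather than to a different test set.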

Peer Reviews

Decision·ICLR 2025 Poster

Reviewer 01 · Rating 6 · Confidence 4

Strengths

1. This paper is well written and easy for the reader to read.
2. The motivation for the zero-shot setting is interesting.
3. The experimental design is very rich with multiple validations. Readers are inspired by explorations of the impact of gender.
4. In a sense, integrating the new dataset breaks the routine of NSD data use and is contributory.

Weaknesses

1. From the point of view of machine learning, the algorithms for brain decoding in this paper are basically existing methods. The use of downsampling to obtain voxels of uniform size across multiple subjects has been used, e.g. CLIP-MUSED [1] uses PCA downsampling to align different subjects, also without additional training.
2. The methodology mentioned by the authors was not compared to any relevant studies quantitatively.

Reviewer 02 · Rating 6 · Confidence 3

Strengths

The writing is clear, and the rationale is logical. The question of alignment across subjects, which the paper addresses, is timely and aligns with topics discussed at prominent venues.

Weaknesses

The decoding methodology is likely the paper’s most relevant contribution to ICLR, though I have identified two significant limitations in this area. Firstly, the authors show that incorporating data from additional subjects in training improves the model's performance, interpreting this as a solution to the alignment problem in neuroAI. In this field, we frequently encounter issues with requiring identical input-output pairs across subjects. While the authors suggest that their method mitigate

Reviewer 03 · Rating 6 · Confidence 4

Strengths

* Originality: While the proposed methodology is similar to previous work, the paper studies a setting (generalization across multiple subjects) which has not been explored much before. The analysis of the impact of training data composition (gender bias and of train-test set similarity) is also interesting and original.
* Quality: The paper is of good quality and presents relevant experiments to support the claims made by the authors.
* Clarity: The paper is overall well and clearly written, an

Weaknesses

1. The description of the dataset curation and test set sampling procedure lacks details (see Q1-2). As movie frames are likely correlated and do not cover the same semantic space as “standard” brain-image datasets (in which the images are selected to cover varied categories, e.g. NSD), it is unclear whether the models might in fact overfit to the specific scenes seen in the training set. This could have an impact on the absolute performance values reported in the paper.
2. There is also missing

Code & Models

Repositories

xiangtaokong/tgbd
PyTorch · Official

Taxonomy

Topics: EEG and Brain-Computer Interfaces · Cognitive Science and Education Research

Methods: Focus