VIBE: Video-Input Brain Encoder for fMRI Response Modeling

Daniel Carlström Schad; Shrey Dixit; Janis Keck; Viktor Studenyak; Aleksandr Shpilevoi; Andrej Bicanski

arXiv:2507.17958 · cs.LG · July 28, 2025

TL;DR

VIBE is a two-stage Transformer that fuses video, audio, and text features to predict fMRI brain responses to movies. An earlier iteration of the architecture won Phase-1 and placed second overall in the Algonauts 2025 Challenge.

Contribution

The paper introduces VIBE, a Transformer-based architecture with two stages: a modality-fusion transformer that merges features from pretrained open-source models, and a prediction transformer with rotary embeddings that decodes them over time into fMRI responses.

Findings

01

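Achieved a mean parcel-wise Pearson correlation of 0.3225 on in-distribution data (Friends S07).

02

Achieved a mean parcel-wise Pearson correlation of 0.2125 on out-of-distribution data (six films).

03

An earlier iteration of the same architecture (0.3198 in-distribution, 0.2096 out-of-distribution) won Phase-1 and placed second overall in the Algonauts 2025 Challenge.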
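
The findings above are reported as a mean parcel-wise Pearson correlation. As a minimal sketch of how such a score can be computed (not the challenge's official scoring code), the NumPy example below correlates measured and predicted time series per parcel and then averages over parcels. The array shapes, the function names, the synthetic data, and the choice to report r = 0 for constant time series are all assumptions made for illustration.

```python
import numpy as np

def parcelwise_pearson(y_true, y_pred):
    """Pearson r for each parcel; inputs have shape (timepoints, parcels)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Center each parcel's time series around its own mean.
    t = y_true - y_true.mean(axis=0)
    p = y_pred - y_pred.mean(axis=0)
    num = (t * p).sum(axis=0)
    den = np.sqrt((t ** 2).sum(axis=0) * (p ** 2).sum(axis=0))
    # Assumption: a constant time series (zero variance) is scored as r = 0.
    return np.divide(num, den, out=np.zeros_like(num), where=den > 0)

def mean_parcelwise_pearson(y_true, y_pred):
    """Average the per-parcel correlations into a single score."""
    return float(parcelwise_pearson(y_true, y_pred).mean())

# Synthetic stand-in data: 200 timepoints, 5 parcels (hypothetical sizes).
rng = np.random.default_rng(0)
measured = rng.standard_normal((200, 5))
predicted = measured + rng.standard_normal((200, 5))  # noisy "prediction"

per_parcel = parcelwise_pearson(measured, predicted)
score = mean_parcelwise_pearson(measured, predicted)
```

Averaging after computing each parcel's correlation, rather than correlating the flattened arrays, gives every parcel equal weight regardless of its signal variance.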

Abstract

We present VIBE, a two-stage Transformer that fuses multi-modal video, audio, and text features to predict fMRI activity. Representations from open-source models (Qwen2.5, BEATs, Whisper, SlowFast, V-JEPA) are merged by a modality-fusion transformer and temporally decoded by a prediction transformer with rotary embeddings. Trained on 65 hours of movie data from the CNeuroMod dataset and ensembled across 20 seeds, VIBE attains mean parcel-wise Pearson correlations of 0.3225 on in-distribution Friends S07 and 0.2125 on six out-of-distribution films. An earlier iteration of the same architecture obtained 0.3198 and 0.2096, respectively, winning Phase-1 and placing second overall in the Algonauts 2025 Challenge.
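
The abstract mentions that the prediction transformer uses rotary embeddings. The sketch below illustrates the general rotary position embedding technique in NumPy; it is not taken from the VIBE code. The half-split pairing of feature dimensions, the frequency base of 10000, and the function name `rotary_embed` are assumptions, since this summary does not say which variant the authors use.

```python
import numpy as np

def rotary_embed(x, positions, base=10000.0):
    """Rotate feature pairs of x, shape (seq_len, dim), by position-dependent angles."""
    x = np.asarray(x, dtype=float)
    seq_len, dim = x.shape
    assert dim % 2 == 0, "feature dimension must be even"
    half = dim // 2
    # One rotation frequency per feature pair (assumed geometric schedule).
    inv_freq = base ** (-np.arange(half) / half)
    angles = np.outer(positions, inv_freq)  # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    # Assumed pairing: feature i is rotated together with feature i + half.
    x1, x2 = x[:, :half], x[:, half:]
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=1)

# A single query and key vector with a hypothetical dimension of 8.
rng = np.random.default_rng(0)
q = rng.standard_normal((1, 8))
k = rng.standard_normal((1, 8))

# Same relative offset (2 steps) at two different absolute positions.
score_near = (rotary_embed(q, [3]) @ rotary_embed(k, [1]).T).item()
score_far = (rotary_embed(q, [10]) @ rotary_embed(k, [8]).T).item()
```

Because each feature pair is rotated by an angle proportional to its position, the query-key dot product depends only on the relative offset between the two positions, so `score_near` and `score_far` agree up to floating-point error.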

Taxonomy

Topics: Functional Brain Connectivity Studies · EEG and Brain-Computer Interfaces