fMRI-LM: Towards a Universal Foundation Model for Language-Aligned fMRI Understanding

Yuxiang Wei; Yanteng Zhang; Xi Xiao; Chengxuan Qian; Tianyang Wang; Vince D. Calhoun

arXiv:2511.21760 · cs.CL · May 15, 2026

TL;DR

fMRI-LM is a foundational model, trained in three stages, that links fMRI brain activity with language to support semantic understanding and cross-modal reasoning.

Contribution

The paper introduces a three-stage framework for building a language-aligned fMRI model: neural tokenization, joint modeling of fMRI tokens and text with a pretrained LLM, and instruction tuning.

Findings

01. Achieves strong zero-shot and few-shot performance on benchmarks.

02. Effectively models fMRI signals with language representations.

03. Supports diverse downstream applications with high-level semantic understanding.

Abstract

Recent advances in multimodal large language models (LLMs) have enabled unified reasoning across images, audio, and video, but extending such capability to brain imaging remains largely unexplored. Bridging this gap is essential to link neural activity with semantic cognition and to develop cross-modal brain representations. To this end, we present fMRI-LM, a foundational model that bridges functional MRI (fMRI) and language through a three-stage framework. In Stage 1, we learn a neural tokenizer that maps fMRI into discrete tokens embedded in a language-consistent space. In Stage 2, a pretrained LLM is adapted to jointly model fMRI tokens and text, treating brain activity as a sequence that can be temporally predicted and linguistically described. To overcome the lack of natural fMRI-text pairs, we construct a large descriptive corpus that translates diverse imaging-based features into…
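
To make the Stage 1 idea of mapping fMRI into discrete tokens more concrete, the following is a minimal sketch of nearest-neighbour codebook lookup, a generic vector-quantization step. It is an assumption for illustration only: the abstract does not say how the tokenizer is built, and the function names (`nearest_code`, `tokenize`), the toy codebook, and the 2-dimensional feature vectors are all hypothetical rather than taken from the paper.

```python
from typing import List, Sequence


def nearest_code(vec: Sequence[float], codebook: Sequence[Sequence[float]]) -> int:
    """Return the index of the codebook entry closest to vec (squared Euclidean)."""
    best_idx, best_dist = 0, float("inf")
    for idx, code in enumerate(codebook):
        dist = sum((v - c) ** 2 for v, c in zip(vec, code))
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx


def tokenize(features: Sequence[Sequence[float]],
             codebook: Sequence[Sequence[float]]) -> List[int]:
    """Map each feature vector to a discrete token id (its nearest code index)."""
    return [nearest_code(f, codebook) for f in features]


# Hypothetical toy data: three codes and four feature vectors standing in
# for encoded fMRI segments. A real codebook would be learned, not hand-set.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]
features = [[0.1, -0.2], [0.9, 1.2], [-0.8, 1.7], [0.7, 0.6]]

tokens = tokenize(features, codebook)
print(tokens)  # [0, 1, 2, 1]
```

In a sketch like this, the resulting integer sequence is what a language model could consume alongside text tokens; how fMRI-LM actually embeds its tokens in a language-consistent space is not specified in the text shown here.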
