Mind Reader: Reconstructing complex images from brain activities
Sikun Lin, Thomas Sprague, Ambuj K Singh

TL;DR
This paper presents a novel method for reconstructing complex, semantically rich images from fMRI brain signals by leveraging a multi-modal approach with text and a pre-trained vision-language latent space, achieving realistic and accurate image reconstructions.
Contribution
It introduces a multi-modal reconstruction framework that incorporates text and a pre-trained latent space to improve image reconstruction from brain activity data.
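One way to picture why a pre-trained latent space helps with scarce data: only a small mapping from voxels into that space has to be learned, while the image generator stays frozen. The sketch below is a hedged illustration of that idea, not the authors' model. It assumes a plain ridge regression as the mapping and uses synthetic arrays in place of real fMRI recordings and vision-language embeddings; `fit_ridge`, `cosine_similarity`, and all sizes are hypothetical names and values chosen for illustration.

```python
import numpy as np


def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Y."""
    n_features = X.shape[1]
    gram = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(gram, X.T @ Y)


def cosine_similarity(a, b):
    """Row-wise cosine similarity between two (n_samples, dim) arrays."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return np.sum(a * b, axis=1)


rng = np.random.default_rng(0)

# Assumed sizes: far fewer training samples than voxels (the scarce-data regime).
n_train, n_test, n_voxels, latent_dim = 200, 50, 500, 32
n_total = n_train + n_test

# Synthetic stand-ins: "fmri" plays the role of voxel-level signals and
# "latents" the role of embeddings from a frozen pre-trained encoder.
true_map = rng.normal(size=(n_voxels, latent_dim)) / np.sqrt(n_voxels)
fmri = rng.normal(size=(n_total, n_voxels))
latents = fmri @ true_map + 0.1 * rng.normal(size=(n_total, latent_dim))

# Learn the voxel-to-latent mapping on the training split only.
W = fit_ridge(fmri[:n_train], latents[:n_train], alpha=10.0)

# Predict latents for held-out signals and compare with the targets.
predicted = fmri[n_train:] @ W
scores = cosine_similarity(predicted, latents[n_train:])
mean_score = float(scores.mean())
print(f"mean cosine similarity on held-out samples: {mean_score:.3f}")
```

In a real pipeline the predicted latent vectors would be passed to a pre-trained generator to produce images; that step is omitted here because this summary does not specify the generator.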
Findings
Reconstructed images are photo-realistic and semantically accurate.
Including a text modality improves reconstruction quality compared with translating brain signals directly to images.
Leveraging a pre-trained latent space effectively addresses the scarcity of fMRI data.
Abstract
Understanding how the brain encodes external stimuli and how these stimuli can be decoded from the measured brain activities are long-standing and challenging questions in neuroscience. In this paper, we focus on reconstructing the complex image stimuli from fMRI (functional magnetic resonance imaging) signals. Unlike previous works that reconstruct images with single objects or simple shapes, our work aims to reconstruct image stimuli that are rich in semantics, closer to everyday scenes, and can reveal more perspectives. However, data scarcity of fMRI datasets is the main obstacle to applying state-of-the-art deep learning models to this problem. We find that incorporating an additional text modality is beneficial for the reconstruction problem compared to directly translating brain signals to images. Therefore, the modalities involved in our method are: (i) voxel-level fMRI signals,…
Taxonomy
Topics: Cell Image Analysis Techniques · Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning
