Cinematic Mindscapes: High-quality Video Reconstruction from Brain Activity
Zijiao Chen, Jiaxin Qing, Juan Helen Zhou

TL;DR
This paper introduces Mind-Video, a method for reconstructing high-quality, continuous video from fMRI recordings of brain activity, with the aim of advancing the understanding of visual cognition and improving reconstruction accuracy over prior methods.
Contribution
Mind-Video is the first approach to learn spatiotemporal information from continuous fMRI data for video reconstruction, integrating masked brain modeling and multimodal contrastive learning.
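To make the masked-modeling idea concrete, here is a minimal sketch of the masking step only: a signal is split into patches and a random subset is hidden, so that a model can later be trained to reconstruct the hidden part. The function name `mask_patches`, the patch size, and the 0.75 mask ratio are illustrative assumptions, not values taken from the paper, and a random vector stands in for real fMRI data.

```python
import numpy as np

def mask_patches(signal, patch_size, mask_ratio, rng):
    """Split a 1D signal into patches and zero out a random subset.

    Hypothetical helper for illustration; not the authors' implementation.
    Returns the patches (masked ones zeroed) and a boolean mask per patch.
    """
    n_patches = signal.shape[0] // patch_size
    # Drop any trailing samples that do not fill a whole patch.
    patches = signal[: n_patches * patch_size].reshape(n_patches, patch_size).copy()
    n_masked = int(round(mask_ratio * n_patches))
    masked_idx = rng.choice(n_patches, size=n_masked, replace=False)
    mask = np.zeros(n_patches, dtype=bool)
    mask[masked_idx] = True
    patches[mask] = 0.0  # hidden patches become the reconstruction target
    return patches, mask

rng = np.random.default_rng(0)
voxels = rng.standard_normal(64)  # stand-in for one flattened fMRI frame
visible, mask = mask_patches(voxels, patch_size=8, mask_ratio=0.75, rng=rng)
```

In a full pipeline, an encoder would see only the unmasked patches and a decoder would be trained to predict the masked ones; that training loop is omitted here.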
Findings
Achieved an average accuracy of 85% on semantic classification tasks
Attained a structural similarity index (SSIM) of 0.19, surpassing the previous state of the art by 45%
Produced high-quality videos at arbitrary frame rates
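For readers unfamiliar with SSIM, the sketch below computes a simplified, single-window version of the index over a whole frame. It is an assumption-laden illustration: standard SSIM averages the same formula over local sliding windows, and the evaluation code used in the paper is not shown in this summary. The function name `global_ssim` is hypothetical; the constants 0.01 and 0.03 are the conventional stabilizers from the SSIM definition.

```python
import numpy as np

def global_ssim(x, y, data_range=1.0):
    """Single-window SSIM between two equally sized grayscale frames.

    Simplified for illustration: statistics are taken over the whole frame
    rather than averaged over local windows as in standard SSIM.
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    c1 = (0.01 * data_range) ** 2  # stabilizes the luminance term
    c2 = (0.03 * data_range) ** 2  # stabilizes the contrast/structure term
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x**2 + mu_y**2 + c1) * (var_x + var_y + c2)
    return numerator / denominator

rng = np.random.default_rng(0)
frame = rng.random((16, 16))        # stand-in for a ground-truth frame
same = global_ssim(frame, frame)    # identical frames give the maximum score
inverted = global_ssim(frame, 1.0 - frame)
```

A score of 1 means the two frames are identical, so a value such as 0.19 indicates that pixel-level agreement between reconstructed and ground-truth frames remains limited even when semantic content is recovered well.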
Abstract
Reconstructing human vision from brain activities has been an appealing task that helps to understand our cognitive process. Even though recent research has seen great success in reconstructing static images from non-invasive brain recordings, work on recovering continuous visual experiences in the form of videos is limited. In this work, we propose Mind-Video that learns spatiotemporal information from continuous fMRI data of the cerebral cortex progressively through masked brain modeling, multimodal contrastive learning with spatiotemporal attention, and co-training with an augmented Stable Diffusion model that incorporates network temporal inflation. We show that high-quality videos of arbitrary frame rates can be reconstructed with Mind-Video using adversarial guidance. The recovered videos were evaluated with various semantic and pixel-level metrics. We achieved an average accuracy…
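The contrastive-learning stage mentioned in the abstract can be pictured with a symmetric, CLIP-style InfoNCE loss that pulls matching pairs of embeddings together and pushes mismatched pairs apart. This is a sketch under assumptions: the summary does not specify the exact loss, the temperature value is a placeholder, and random vectors stand in for fMRI and image or text embeddings.

```python
import numpy as np

def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def symmetric_contrastive_loss(a, b, temperature=0.07):
    """Symmetric InfoNCE loss where row i of `a` is paired with row i of `b`.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    logits = a @ b.T / temperature        # pairwise cosine similarities
    diag = np.arange(len(a))              # positions of the true pairs
    loss_ab = -log_softmax(logits, axis=1)[diag, diag].mean()  # a -> b
    loss_ba = -log_softmax(logits, axis=0)[diag, diag].mean()  # b -> a
    return 0.5 * (loss_ab + loss_ba)

rng = np.random.default_rng(0)
fmri_emb = rng.standard_normal((4, 16))   # stand-in for fMRI embeddings
aligned = symmetric_contrastive_loss(fmri_emb, fmri_emb)
shuffled = symmetric_contrastive_loss(fmri_emb, np.roll(fmri_emb, 1, axis=0))
```

The loss is lower when each embedding is paired with its true counterpart than when the pairing is shuffled, which is the signal that drives the two embedding spaces into alignment during training.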
Taxonomy
Topics: Functional Brain Connectivity Studies · Neural dynamics and brain function · EEG and Brain-Computer Interfaces
Methods: Diffusion · Contrastive Learning
