MindCine: Multimodal EEG-to-Video Reconstruction with Large-Scale Pretrained Models

Tian-Yi Zhou; Xuan-Hao Liu; Bao-Liang Lu; Wei-Long Zheng

arXiv:2601.18192·cs.CV·January 28, 2026

Open Access

TL;DR

MindCine is a novel multimodal EEG-to-video reconstruction framework that leverages large-scale pretrained models and multimodal learning to improve high-fidelity video reconstruction from EEG signals, especially with limited data.

Contribution

The paper introduces a multimodal joint learning framework combined with large-scale EEG models to enhance EEG-to-video reconstruction and address data scarcity issues.

Findings

01 Outperforms state-of-the-art methods qualitatively and quantitatively.

02 Effectively incorporates multiple modalities to improve reconstruction quality.

03 Leverages large-scale EEG models to mitigate limited data challenges.

Abstract

Reconstructing human dynamic visual perception from electroencephalography (EEG) signals is of great research significance owing to EEG's non-invasiveness and high temporal resolution. However, EEG-to-video reconstruction remains challenging for two reasons: 1) Single Modality: existing studies align EEG signals solely with the text modality, ignoring other modalities and leaving the models prone to overfitting; 2) Data Scarcity: current methods often struggle to converge when trained on limited EEG-video data. To solve these problems, we propose a novel framework, MindCine, to achieve high-fidelity video reconstruction from limited data. We employ a multimodal joint learning strategy to incorporate beyond-text modalities in the training stage and leverage a pre-trained large EEG model to relieve the data scarcity issue for decoding semantic information, while a Seq2Seq model with…
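To make the idea of multimodal joint learning concrete, the following is a minimal sketch of one common way to align an EEG embedding with several target modalities at once: a weighted sum of contrastive (InfoNCE) losses, one per modality. This is an illustration under assumptions, not the authors' implementation. The abstract does not state the loss that MindCine uses, and the function names (`info_nce`, `joint_alignment_loss`), the modality names, the temperature, and the toy embeddings are all hypothetical.

```python
import math

def normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def info_nce(queries, keys, temperature=0.1):
    """Mean cross-entropy over a batch in which queries[i] should match keys[i]."""
    q = [normalize(v) for v in queries]
    k = [normalize(v) for v in keys]
    total = 0.0
    for i, qi in enumerate(q):
        logits = [sum(a * b for a, b in zip(qi, kj)) / temperature for kj in k]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(q)

def joint_alignment_loss(eeg, targets, weights=None):
    """Weighted sum of EEG-to-modality contrastive losses (assumed objective)."""
    weights = weights or {name: 1.0 for name in targets}
    return sum(weights[name] * info_nce(eeg, emb) for name, emb in targets.items())

# Toy batch of two samples with 2-D embeddings (hypothetical values).
eeg = [[1.0, 0.0], [0.0, 1.0]]
targets = {
    "text": [[0.9, 0.1], [0.1, 0.9]],   # roughly aligned with the EEG embeddings
    "image": [[0.0, 1.0], [1.0, 0.0]],  # deliberately misaligned
}
loss = joint_alignment_loss(eeg, targets)
print(loss)
```

In this toy setup the misaligned "image" pairs dominate the total, which is the signal a training loop would use to pull the EEG embeddings toward every modality rather than toward text alone.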

Taxonomy

Topics: EEG and Brain-Computer Interfaces · Emotion and Mood Recognition · Multimodal Machine Learning Applications