MindTuner: Cross-Subject Visual Decoding with Visual Fingerprint and Semantic Correction

Zixuan Gong; Qi Zhang; Guangyin Bao; Lei Zhu; Ke Liu; Liang Hu; Duoqian Miao

arXiv:2404.12630 · cs.CV · December 17, 2024

TL;DR

MindTuner is a novel cross-subject visual decoding framework that leverages visual fingerprints and semantic correction to reconstruct high-quality images from fMRI data with minimal training data.

Contribution

It introduces a multi-subject pre-training and fine-tuning approach using visual fingerprints and a new fMRI-to-text alignment paradigm for improved cross-subject decoding.

Findings

01 Outperforms state-of-the-art models on the NSD dataset

02 Achieves high-quality reconstructions with only 1 hour of training data

03 Effective cross-subject decoding with minimal data

Abstract

Decoding natural visual scenes from brain activity has flourished, with extensive research on single-subject tasks but less on cross-subject tasks. Reconstructing high-quality images in cross-subject tasks is challenging because of profound individual differences between subjects and the scarcity of annotated data. In this work, we propose MindTuner for cross-subject visual decoding, which achieves high-quality, semantically rich reconstructions using only 1 hour of fMRI training data, benefiting from the phenomenon of the visual fingerprint in the human visual system and a novel fMRI-to-text alignment paradigm. First, we pre-train a multi-subject model on 7 subjects and fine-tune it with scarce data on new subjects, where LoRAs with Skip-LoRAs are utilized to learn the visual fingerprint. Then, we take the image modality as the intermediate pivot modality to achieve…
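To illustrate the fine-tuning idea mentioned in the abstract, the following is a minimal sketch of a generic low-rank adaptation (LoRA) layer in NumPy: a frozen pre-trained weight plus a small trainable update of rank r. It is not the authors' implementation. The class name `LoRALinear`, the layer sizes, the rank, and the scaling factor are all assumptions chosen for illustration, and the Skip-LoRA variant is not shown because this excerpt does not describe it.

```python
import numpy as np


class LoRALinear:
    """Frozen linear map W plus a low-rank update (alpha / r) * B @ A.

    Hypothetical illustration of generic LoRA; not taken from the paper's code.
    """

    def __init__(self, weight, rank=4, alpha=8.0, seed=0):
        rng = np.random.default_rng(seed)
        out_dim, in_dim = weight.shape
        self.weight = weight                             # frozen pre-trained weight
        self.A = rng.normal(0.0, 0.01, (rank, in_dim))   # trainable down-projection
        self.B = np.zeros((out_dim, rank))               # trainable up-projection, zero-initialized
        self.scale = alpha / rank

    def __call__(self, x):
        # Base output plus the scaled low-rank correction.
        return x @ self.weight.T + self.scale * (x @ self.A.T) @ self.B.T

    def num_trainable(self):
        # Only A and B would be updated when adapting to a new subject.
        return self.A.size + self.B.size


# Assumed toy sizes: 64 input features, 16 output features.
rng = np.random.default_rng(1)
W = rng.normal(size=(16, 64))
layer = LoRALinear(W, rank=4)
x = rng.normal(size=(2, 64))
y = layer(x)
```

Because `B` starts at zero, the adapted layer initially reproduces the frozen layer exactly, and only the two small matrices need to be learned from the scarce new-subject data.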

Taxonomy

Topics: Digital Media Forensic Detection · Image Processing Techniques and Applications · Neural Networks and Applications