NT-ViT: Neural Transcoding Vision Transformers for EEG-to-fMRI Synthesis
Romeo Lanzino, Federico Fontana, Luigi Cinque, Francesco Scarcello, Atsuto Maki

TL;DR
NT-ViT is a novel generative model that synthesizes high-resolution fMRI images from EEG data, improving accuracy and reproducibility over previous methods, with significant performance gains demonstrated on benchmark datasets.
Contribution
The paper introduces NT-ViT, a new neural transcoding model with a domain matching module that enhances EEG-to-fMRI synthesis accuracy and reliability, outperforming existing approaches.
Findings
Achieves 10x reduction in RMSE on the Oddball dataset.
Increases SSIM by 3.14x compared to the state of the art.
Demonstrates superior performance on two benchmark datasets.
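To make the "10x reduction in RMSE" wording concrete, here is a minimal sketch of how such a fold-change could be computed. The function names (`rmse`, `reduction_factor`) and the toy voxel values are assumptions made for illustration; they are not taken from the paper or its code, and the paper's exact evaluation protocol may differ.

```python
import math

def rmse(pred, target):
    """Root mean squared error between two equal-length sequences of values."""
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("pred and target must be non-empty and of equal length")
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred))

def reduction_factor(baseline_error, new_error):
    """How many times smaller new_error is than baseline_error."""
    return baseline_error / new_error

# Toy, made-up voxel intensities (hypothetical, not data from the paper).
target = [0.0, 1.0, 2.0, 3.0]
baseline_pred = [1.0, 2.0, 3.0, 4.0]   # off by 1.0 at every voxel
improved_pred = [0.1, 1.1, 2.1, 3.1]   # off by 0.1 at every voxel

baseline_rmse = rmse(baseline_pred, target)
improved_rmse = rmse(improved_pred, target)
factor = reduction_factor(baseline_rmse, improved_rmse)

print(f"baseline RMSE: {baseline_rmse:.3f}")
print(f"improved RMSE: {improved_rmse:.3f}")
print(f"reduction factor: {factor:.2f}x")
```

In this toy case the reduction factor comes out at roughly 10, which is what a "10x reduction" means: the new error is about one tenth of the baseline error. An "x-fold increase" in SSIM would be read the other way round, as the new score divided by the baseline score.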
Abstract
This paper introduces the Neural Transcoding Vision Transformer (NT-ViT), a generative model designed to estimate high-resolution functional Magnetic Resonance Imaging (fMRI) samples from simultaneous Electroencephalography (EEG) data. A key feature of NT-ViT is its Domain Matching (DM) sub-module, which aligns the latent EEG representations with those of fMRI volumes, enhancing the model's accuracy and reliability. Unlike previous methods that tend to struggle with fidelity and reproducibility of images, NT-ViT addresses these challenges by ensuring methodological integrity and higher-quality reconstructions, which we showcase through extensive evaluation on two benchmark datasets. NT-ViT outperforms the current state of the art by a significant margin in both cases, e.g. achieving a 10x reduction in RMSE and a 3.14x increase in SSIM on the…
Taxonomy
Topics: CCD and CMOS Imaging Sensors · Advanced Memory and Neural Computing
Methods: Attention Is All You Need · Linear Layer · Multi-Head Attention · Label Smoothing · Byte Pair Encoding · Absolute Position Encodings · Vision Transformer · Softmax · Layer Normalization · Position-Wise Feed-Forward Layer
