SpikeVAEDiff: Neural Spike-based Natural Visual Scene Reconstruction via VD-VAE and Versatile Diffusion
Jialu Li, Taiyan Zhou

TL;DR
SpikeVAEDiff is a two-stage neural network framework that reconstructs high-resolution, semantically meaningful images from neural spike data by combining VDVAE and Versatile Diffusion, advancing neural decoding techniques.
Contribution
This work introduces a novel two-stage approach integrating VDVAE and Versatile Diffusion for neural spike-based image reconstruction, demonstrating improved quality and semantic accuracy.
Findings
VISI region shows strongest activation for reconstruction
Spike data offer higher temporal and spatial resolution than fMRI
Region-specific data significantly improves reconstruction quality
Abstract
Reconstructing natural visual scenes from neural activity is a key challenge in neuroscience and computer vision. We present SpikeVAEDiff, a novel two-stage framework that combines a Very Deep Variational Autoencoder (VDVAE) and the Versatile Diffusion model to generate high-resolution and semantically meaningful image reconstructions from neural spike data. In the first stage, VDVAE produces low-resolution preliminary reconstructions by mapping neural spike signals to latent representations. In the second stage, regression models map neural spike signals to CLIP-Vision and CLIP-Text features, enabling Versatile Diffusion to refine the images via image-to-image generation. We evaluate our approach on the Allen Visual Coding-Neuropixels dataset and analyze different brain regions. Our results show that the VISI region exhibits the most prominent activation and plays a key role in…
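The regression step described in the abstract (mapping spike signals to latent or CLIP feature vectors) can be sketched as follows. This is a minimal illustration only: the abstract does not say which regression model is used, so closed-form ridge regression is an assumption here, and the synthetic Poisson counts and random targets merely stand in for real spike counts and VDVAE/CLIP features. All function names, array shapes, and the `alpha` value are hypothetical.

```python
import numpy as np

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Y."""
    gram = X.T @ X + alpha * np.eye(X.shape[1])
    return np.linalg.solve(gram, X.T @ Y)

def r2_score(y_true, y_pred):
    """Coefficient of determination pooled over all target dimensions."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean(axis=0)) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-ins (assumptions): per-trial spike counts and target features.
rng = np.random.default_rng(0)
n_trials, n_units, n_latent = 200, 50, 16
true_W = rng.normal(size=(n_units, n_latent))
spikes = rng.poisson(lam=3.0, size=(n_trials, n_units)).astype(float)
latents = spikes @ true_W + 0.1 * rng.normal(size=(n_trials, n_latent))

# Hold out the last 50 trials for evaluation.
X_train, X_test = spikes[:150], spikes[150:]
Y_train, Y_test = latents[:150], latents[150:]

# Standardise inputs and centre targets using training statistics only.
x_mean = X_train.mean(axis=0)
x_std = X_train.std(axis=0) + 1e-8
y_mean = Y_train.mean(axis=0)

W = fit_ridge((X_train - x_mean) / x_std, Y_train - y_mean, alpha=1.0)
Y_pred = ((X_test - x_mean) / x_std) @ W + y_mean

r2 = r2_score(Y_test, Y_pred)
print("weights:", W.shape, "held-out R^2:", round(float(r2), 3))
```

In a pipeline of the kind the abstract describes, the predicted feature vectors would then be passed to the generative models (the VDVAE decoder in the first stage, Versatile Diffusion in the second); that part is not shown here.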
Taxonomy
Topics: Face Recognition and Perception · Neural dynamics and brain function · Generative Adversarial Networks and Image Synthesis
