SpikeVAEDiff: Neural Spike-based Natural Visual Scene Reconstruction via VD-VAE and Versatile Diffusion

Jialu Li; Taiyan Zhou

arXiv:2601.09213 · cs.CV · January 15, 2026

TL;DR

SpikeVAEDiff is a two-stage neural network framework that reconstructs high-resolution, semantically meaningful images from neural spike data by combining VDVAE and Versatile Diffusion, advancing neural decoding techniques.

Contribution

This work introduces a novel two-stage approach integrating VDVAE and Versatile Diffusion for neural spike-based image reconstruction, demonstrating improved quality and semantic accuracy.

Findings

1. The VISI region shows the strongest activation and plays a key role in reconstruction.

2. Spike data offer higher temporal and spatial resolution than fMRI.

3. Using region-specific data significantly improves reconstruction quality.

Abstract

Reconstructing natural visual scenes from neural activity is a key challenge in neuroscience and computer vision. We present SpikeVAEDiff, a novel two-stage framework that combines a Very Deep Variational Autoencoder (VDVAE) and the Versatile Diffusion model to generate high-resolution and semantically meaningful image reconstructions from neural spike data. In the first stage, VDVAE produces low-resolution preliminary reconstructions by mapping neural spike signals to latent representations. In the second stage, regression models map neural spike signals to CLIP-Vision and CLIP-Text features, enabling Versatile Diffusion to refine the images via image-to-image generation. We evaluate our approach on the Allen Visual Coding-Neuropixels dataset and analyze different brain regions. Our results show that the VISI region exhibits the most prominent activation and plays a key role in…
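The regression step that both stages rely on (mapping spike signals to a latent or feature vector) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say which regression model is used, so closed-form ridge regression is assumed here, and the spike counts, target latents, array sizes, and the helper names `fit_ridge` and `predict` are all hypothetical stand-ins on synthetic data.

```python
import numpy as np


def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Y."""
    n_features = X.shape[1]
    gram = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(gram, X.T @ Y)


def predict(X, W):
    """Apply the learned linear map to (already standardized) inputs."""
    return X @ W


# Synthetic stand-ins (assumptions, not data from the paper):
# X holds per-trial spike counts, Y holds target latents such as
# VDVAE latents or CLIP features.
rng = np.random.default_rng(0)
n_trials, n_units, n_latent = 200, 50, 16  # hypothetical sizes
X = rng.poisson(lam=3.0, size=(n_trials, n_units)).astype(float)
true_W = rng.normal(size=(n_units, n_latent))
Y = X @ true_W + 0.1 * rng.normal(size=(n_trials, n_latent))

# Train/test split over trials.
X_train, X_test = X[:150], X[150:]
Y_train, Y_test = Y[:150], Y[150:]

# Standardize inputs and center targets using training statistics only.
mu = X_train.mean(axis=0)
sigma = X_train.std(axis=0) + 1e-8
Y_mu = Y_train.mean(axis=0)

W = fit_ridge((X_train - mu) / sigma, Y_train - Y_mu, alpha=1.0)
Y_pred = predict((X_test - mu) / sigma, W) + Y_mu

# Held-out coefficient of determination across all latent dimensions.
ss_res = np.sum((Y_test - Y_pred) ** 2)
ss_tot = np.sum((Y_test - Y_test.mean(axis=0)) ** 2)
r2 = 1.0 - ss_res / ss_tot
print(f"held-out R^2: {r2:.3f}")
```

In a pipeline of the kind the abstract describes, the predicted vectors would be passed to the generative model's decoder (VDVAE in the first stage, Versatile Diffusion conditioning in the second); that decoding step is outside the scope of this sketch.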

Taxonomy

Topics: Face Recognition and Perception · Neural dynamics and brain function · Generative Adversarial Networks and Image Synthesis