Dynadiff: Single-stage Decoding of Images from Continuously Evolving fMRI
Marlène Careil, Yohann Benchetrit, Jean-Rémi King

TL;DR
Dynadiff is a novel single-stage diffusion model that enables time-resolved decoding of images from continuously evolving fMRI data, simplifying training and outperforming existing methods in semantic image reconstruction.
Contribution
The paper introduces Dynadiff, a new model that simplifies training and improves performance in time-resolved brain-to-image decoding from fMRI signals.
Findings
Outperforms state-of-the-art models on time-resolved fMRI data
Provides better semantic image reconstruction metrics
Enables precise analysis of image representation evolution in the brain
Abstract
Brain-to-image decoding has been recently propelled by the progress in generative AI models and the availability of large ultra-high field functional Magnetic Resonance Imaging (fMRI). However, current approaches depend on complicated multi-stage pipelines and preprocessing steps that typically collapse the temporal dimension of brain recordings, thereby limiting time-resolved brain decoders. Here, we introduce Dynadiff (Dynamic Neural Activity Diffusion for Image Reconstruction), a new single-stage diffusion model designed for reconstructing images from dynamically evolving fMRI recordings. Our approach offers three main contributions. First, Dynadiff simplifies training as compared to existing approaches. Second, our model outperforms state-of-the-art models on time-resolved fMRI signals, especially on high-level semantic image reconstruction metrics, while remaining competitive on…
Peer Reviews
Decision · Submitted to ICLR 2026
1. The proposed method is simple and straightforward. Experimental results show that, compared to state-of-the-art models, the proposed method achieves comparable performance using only one training stage.
2. The experimental analysis is complete and interesting, and could provide useful insight to the community.
3. The experimental evaluation is thorough and robust.
1. Architectural novelty. While the pipeline is novel in its simplicity, the components are standard. The brain module is essentially a large MLP, and the finetuning method is LoRA. This is not a major flaw, as the contribution lies in the effective composition and problem formulation, but the architectural novelty itself is moderate.
2. Fairness of Time-Series Baselines. Models like MindEye1 and MindEye2 were explicitly designed for static beta values. The authors state they adapted these mode
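For readers unfamiliar with the fine-tuning method this review names, the following is a minimal NumPy sketch of the generic LoRA idea: a frozen weight matrix plus a trainable low-rank update. All sizes, the `alpha` scaling value, and the function name `lora_forward` are illustrative assumptions; this is not the authors' configuration, and the page does not say which layers of the diffusion model are adapted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes only (assumptions, not taken from the paper).
d_in, d_out, rank, alpha = 16, 8, 2, 4.0

# Frozen pretrained weight: not updated during fine-tuning.
W = rng.normal(size=(d_out, d_in))

# Trainable low-rank factors. B starts at zero, so the adapted layer
# initially reproduces the pretrained layer exactly.
A = rng.normal(scale=0.01, size=(rank, d_in))
B = np.zeros((d_out, rank))


def lora_forward(x, W, A, B, alpha, rank):
    """Compute y = x W^T + (alpha / rank) * (x A^T) B^T for a batch of rows x."""
    base = x @ W.T
    update = (x @ A.T) @ B.T
    return base + (alpha / rank) * update


x = rng.normal(size=(4, d_in))
y_init = lora_forward(x, W, A, B, alpha, rank)

# Stand-in for a trained B (hypothetical values, no real training here).
B_trained = rng.normal(scale=0.1, size=(d_out, rank))
y_tuned = lora_forward(x, W, A, B_trained, alpha, rank)

# Effective weight change; its rank is at most `rank`.
delta_W = (alpha / rank) * B_trained @ A
```

Only `A` and `B` would receive gradients in such a setup, which is why the number of trainable parameters stays small relative to `W`.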
# Strengths
* **Interesting temporal visualization**: The time-shift analyses (e.g., Fig. 4) are engaging and make the dynamics tangible.
* **Simplified training story**: Collapsing multi-stage pipelines into a single training objective is an appealing engineering direction.
* **Reasonably broad metrics**: CLIP/feature metrics, segmentation mIoU, and qualitative examples provide multiple views of performance.
# Weaknesses & Detailed Comments
## A. Methodology/Claims
1. **“Time-series modeling” is shallow relative to the claim.** The core temporal handling appears to be per-timestep linear transforms plus a **single temporal aggregation layer**. This is not a genuine temporal model (no temporal attention, SSM/RNN, or FIR deconvolution) and does not convincingly support the claim that prior work “completely discards the time dimension.” Please either temper the claim or compare against lightweight
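The temporal handling this reviewer describes (per-timestep linear transforms followed by a single temporal aggregation layer) can be sketched as below. This is a reading of the reviewer's wording only, under assumed shapes; the names `encode_window`, `W`, `b`, and `agg`, and all sizes, are hypothetical and do not come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes only (assumptions).
n_timesteps, n_voxels, d_model = 6, 50, 8

# One fMRI window: a row of voxel activations per timestep.
x = rng.normal(size=(n_timesteps, n_voxels))

# Per-timestep linear transforms: a separate voxel-to-feature map for each t.
W = rng.normal(scale=0.1, size=(n_timesteps, n_voxels, d_model))
b = np.zeros((n_timesteps, d_model))

# Single temporal aggregation layer: one weight per timestep.
agg = rng.normal(size=(n_timesteps,))


def encode_window(x, W, b, agg):
    """Project each timestep with its own linear map, then mix across time."""
    h = np.einsum("tv,tvd->td", x, W) + b  # shape (n_timesteps, d_model)
    return agg @ h                          # shape (d_model,)


z = encode_window(x, W, b, agg)
```

Because every step here is linear and timesteps interact only through the final weighted sum, the sketch makes the reviewer's point concrete: nothing in it models dependencies between timesteps the way temporal attention or a recurrent layer would.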
* The single-stage training pipeline is a clear improvement over existing multi-stage frameworks.
* Demonstrates robust time-resolved reconstruction from continuous BOLD signals.
* Includes ablations on time-window duration, brain module design, and diffusion tuning strategies.
* Well-written and clearly motivated, especially regarding the challenges of time-collapsed preprocessing.
* While the single-stage design is elegant, the core idea, jointly fine-tuning an fMRI encoder with a diffusion model, remains conceptually close to prior fMRI-to-image diffusion frameworks.
* Table 1 employs a customized fMRI preprocessing pipeline while comparing against baselines trained on time-collapsed data, making the reported performance gains difficult to interpret. Moreover, the comparison omits recent time-resolved decoders such as NeuroPictor (Huo et al., 2024), which weakens the fai
Taxonomy
Topics: Cell Image Analysis Techniques · Neural Networks and Applications · Image and Signal Denoising Methods
Methods: Diffusion
