FlowLet: Conditional 3D Brain MRI Synthesis using Wavelet Flow Matching
Danilo Danese, Angela Lombardi, Matteo Attimonelli, Giuseppe Fasano, Tommaso Di Noia

TL;DR
FlowLet is a novel conditional 3D MRI synthesis method using wavelet flow matching, improving data augmentation for brain age prediction by generating high-quality, age-conditioned brain images efficiently.
Contribution
This work introduces FlowLet, a flow-based generative model operating in a wavelet domain for efficient, high-fidelity, age-conditioned 3D MRI synthesis, addressing limitations of existing latent diffusion methods.
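As a rough illustration of why a wavelet domain can stand in for a learned latent space, the sketch below applies one level of an orthonormal Haar transform to a short 1D signal and inverts it exactly. It is a toy under stated assumptions, not the paper's implementation: the Haar basis, the 1D setting, and the names `haar_forward`, `haar_inverse`, and `voxels` are all chosen here for illustration (FlowLet works on 3D volumes, and the wavelet it uses is not stated in this summary).

```python
import math

def haar_forward(signal):
    """One level of the orthonormal Haar transform on an even-length list."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    # Low-pass (approximation) and high-pass (detail) coefficients per pair.
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Exact inverse of haar_forward: rebuilds each pair of samples."""
    s = math.sqrt(2.0)
    signal = []
    for a, d in zip(approx, detail):
        signal.append((a + d) / s)
        signal.append((a - d) / s)
    return signal

# Hypothetical intensity values standing in for one row of a volume.
voxels = [4.0, 2.0, 5.0, 5.0, 1.0, 3.0, 0.0, 8.0]
approx, detail = haar_forward(voxels)
restored = haar_inverse(approx, detail)
max_error = max(abs(x - y) for x, y in zip(voxels, restored))
```

Because the transform is invertible, the only reconstruction error is floating-point rounding, whereas a learned autoencoder is generally lossy. Each of the two coefficient bands is half the length of the input, which suggests how a generative model could work on smaller sub-bands.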
Findings
FlowLet produces high-quality, anatomically consistent 3D brain images.
Training BAP models with FlowLet data enhances accuracy for underrepresented ages.
FlowLet reduces inference time and artifacts compared to latent diffusion models.
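The inference-time claim rests on how flow-based samplers generate data: they integrate an ordinary differential equation from noise to data in a small number of steps. The sketch below is a toy under stated assumptions, not FlowLet itself. It uses a 1D straight-line path toward a single hypothetical `target` value with an analytically known velocity, whereas the actual model would use a trained, age-conditioned network as the velocity field. The names `velocity` and `euler_sample` are illustrative.

```python
import random

def velocity(x, t, target):
    """Exact velocity for the straight path x_t = (1 - t) * x0 + t * target."""
    return (target - x) / (1.0 - t)

def euler_sample(x0, target, num_steps):
    """Integrate dx/dt = velocity from t = 0 to t = 1 with fixed Euler steps."""
    x = x0
    h = 1.0 / num_steps
    for k in range(num_steps):
        t = k * h  # t stays strictly below 1, so the division is safe
        x = x + h * velocity(x, t, target)
    return x

random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(5)]  # draws from the source distribution
samples = [euler_sample(x0, target=3.0, num_steps=10) for x0 in noise]
```

On this straight path, Euler integration is exact up to rounding, so ten steps carry every noise draw to the target. Learned velocity fields are not perfectly straight, so a trained model would incur some discretization error at low step counts.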
Abstract
Brain Magnetic Resonance Imaging (MRI) plays a central role in studying neurological development, aging, and diseases. One key application is Brain Age Prediction (BAP), which estimates an individual's biological brain age from MRI data. Effective BAP models require large, diverse, and age-balanced datasets, whereas existing 3D MRI datasets are demographically skewed, limiting fairness and generalizability. Acquiring new data is costly and ethically constrained, motivating generative data augmentation. Current generative methods are often based on latent diffusion models, which operate in learned low-dimensional latent spaces to address the memory demands of volumetric MRI data. However, these methods are typically slow at inference, may introduce artifacts due to latent compression, and are rarely conditioned on age, thereby affecting BAP performance. In this work, we propose…
Peer Reviews
Decision·Submitted to ICLR 2026
The potential impact lies in fast synthesis (e.g., 10 steps vs. 1000 for diffusion baselines) of anatomically consistent, age-specific data, which improves BAP model performance for underrepresented age groups. The writing contains objective errors, including typographical errors (e.g., "v_targot", "governed oise governed"), stray formatting characters, and data misalignment in tables (Table 2). The evaluation is quantitative, comparing four variants of the proposed method (
The paper states that "expert clinical evaluation is essential for diagnostic relevance," but this evaluation is not included. The paper claims the architecture can "extend to multiple conditioning variables," such as disease status or cognitive scores, but this is not demonstrated. The claim of "avoiding reconstruction artifacts" is justified by avoiding latent compression; however, artifacts originating from the generative process itself are not separately quantified. The authors identify ot
- The combination of wavelet transforms and flow matching for efficient image synthesis is somewhat interesting and appears to be novel.
- The authors provide many details about their experimental setup. Beyond the description of the main results, they conduct extensive ablation experiments, which investigate the effect of many hyperparameters, including the used wavelet function, specific flow matching implementation, number of sampling steps, and conditioning mechanism.
- The manuscript is c
- The authors motivate their work by stating that flow matching in the wavelet domain increases computational efficiency. However, I do not believe that this is of high importance if the ultimate goal is to generate training data for brain age prediction. Most likely, the training dataset would have to be generated only once, making slower model inference essentially irrelevant.
- Despite some concerns regarding the fairness of the experimental setup (see questions below), the proposed method only bar
- Combining FM with an invertible 3D wavelet domain for volumetric synthesis is a neat and well-motivated design choice, avoiding learned autoencoders while retaining multi-scale control. The dual conditioning for age is thoughtfully integrated and ablated.
- The empirical study covers global distribution metrics, region-wise anatomical metrics, and a functional BAP readout. The comparison set includes both latent and wavelet diffusion baselines, plus a recent FM variant retrained with the same
- The paper sketches the VP connection and uses an analytically tractable conditional score to define a deterministic target velocity. However, the presentation could benefit from a concise, explicit derivation that maps the reverse-time SDE drift to the Probability Flow ODE velocity used for training, including the exact role of Tweedie-based conditioning and the factor differences in the score terms. This would make the mathematical bridge between diffusion and FM airtight for the reader.
- W
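For orientation, the standard score-based relations behind the reviewer's request about the reverse-time SDE and the Probability Flow ODE can be written as follows. This is the general textbook form for a variance-preserving (VP) process with noise schedule \beta(t), stated here as background. It is not the paper's own derivation and does not cover its Tweedie-based conditioning.

```latex
% Forward SDE in general form, with the VP choice of drift and diffusion
dx = f(x,t)\,dt + g(t)\,dw,
\qquad f(x,t) = -\tfrac{1}{2}\beta(t)\,x, \quad g(t) = \sqrt{\beta(t)}

% Reverse-time SDE: the score enters with a factor g(t)^2
dx = \left[ f(x,t) - g(t)^2\,\nabla_x \log p_t(x) \right] dt + g(t)\,d\bar{w}

% Probability Flow ODE: same marginals p_t, score enters with factor g(t)^2 / 2
\frac{dx}{dt} = f(x,t) - \tfrac{1}{2}\,g(t)^2\,\nabla_x \log p_t(x)
             = -\tfrac{1}{2}\,\beta(t)\left[ x + \nabla_x \log p_t(x) \right]
```

The factor difference the reviewer mentions is visible here: the score is weighted by g(t)^2 in the stochastic reverse process but by g(t)^2/2 in the deterministic ODE, whose right-hand side plays the role of a velocity field.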
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Functional Brain Connectivity Studies · Fetal and Pediatric Neurological Disorders
