Generative Model via Quantile Assignment
Georgi Hrusanov, Oliver Y. Chén, Julien S. Bodelet

TL;DR
NeuroSQL introduces a novel generative approach that learns latent representations through optimal transportation and quantile assignment, eliminating auxiliary networks for more stable, efficient, and high-quality data synthesis.
Contribution
The paper proposes NeuroSQL, a new generative model that avoids auxiliary networks by learning latent variables through an optimal transportation-based method with quantile assignment.
Findings
Achieves lower pixel distance and higher perceptual quality than GANs, VAEs, and diffusion models.
Requires less training time than traditional generative models.
Effectively generates synthetic data with limited training samples.
Abstract
Deep generative models (DGMs) play two key roles in modern machine learning: (i) producing new information (e.g., image synthesis) and (ii) reducing dimensionality. However, traditional architectures often rely on auxiliary networks, such as encoders in Variational Autoencoders (VAEs) or discriminators in Generative Adversarial Networks (GANs), which introduce training instability, computational overhead, and risks like mode collapse. We present NeuroSQL, a new generative paradigm that eliminates the need for auxiliary networks by learning low-dimensional latent representations implicitly. NeuroSQL leverages an asymptotic approximation that expresses the latent variables as the solution to an optimal transportation problem. Specifically, NeuroSQL learns the latent variables by solving a linear assignment problem and then passes the latent information to a standalone generator. We…
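The assignment step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the scalar `scores`, the midpoint quantile grid for a Uniform(0, 1) latent, and the squared-distance cost are all assumptions made for illustration, and the brute-force search stands in for a proper linear assignment solver such as the Hungarian algorithm.

```python
from itertools import permutations

def quantile_grid(n):
    """Midpoint quantile levels (i + 0.5) / n of a Uniform(0, 1) latent, i = 0..n-1."""
    return [(i + 0.5) / n for i in range(n)]

def assignment_cost(scores, grid, perm):
    """Total squared cost when sample j receives the latent grid[perm[j]]."""
    return sum((s - grid[k]) ** 2 for s, k in zip(scores, perm))

def brute_force_assignment(scores, grid):
    """Exact linear assignment by enumeration; only feasible for tiny n."""
    return min(permutations(range(len(grid))),
               key=lambda p: assignment_cost(scores, grid, p))

def rank_assignment(scores):
    """1-D shortcut: the sample with the r-th smallest score gets the r-th quantile."""
    order = sorted(range(len(scores)), key=lambda j: scores[j])
    perm = [0] * len(scores)
    for rank, j in enumerate(order):
        perm[j] = rank
    return tuple(perm)

# Hypothetical scalar summaries of five training samples (illustrative values only).
scores = [0.91, 0.12, 0.55, 0.33, 0.74]
grid = quantile_grid(len(scores))
best = brute_force_assignment(scores, grid)   # one grid index per sample
latents = [grid[k] for k in best]             # latent code assigned to each sample
```

In one dimension with a squared cost, the optimal assignment simply matches ranks, which is why the enumeration and the sort agree here. Under these assumptions, the resulting `latents` would then be paired with the samples as fixed inputs when fitting a generator.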
Peer Reviews
Decision·Submitted to ICLR 2026
Interesting idea / novel paradigm: Replacing the encoder or discriminator with an explicit assignment of latent codes (quantile grid) is novel, and links generative modelling with statistical quantile/transport theory. Pragmatic focus on low-data regimes: The paper addresses an important and under-studied setting, namely generating synthetic data when the training dataset is small relative to the high ambient dimension (e.g., neuroimaging). Theoretical underpinning: The use of quantile assignment
1. Resolution / dataset scale limited: The experiments are constrained to small image resolutions (64×64 to 128×128) and relatively small sample sizes (~2k images) under a limited compute budget. While this is aligned with the authors' low-budget motivation, it raises the question of how the method performs at larger, contemporary scales (e.g., 256×256, ImageNet scale). The authors acknowledge this as future work. 2. Interpretability of latent codes: Since the quantile grid is fixed and codes are
The replacement of encoder–decoder mappings with an assignment-based quantile mechanism is conceptually fresh and theoretically grounded. It bridges optimal transport with generative modeling in a unique and elegant way. The method is particularly well-suited for data domains like neuroimaging, where dimensionality exceeds sample size, and the assignment cost is independent of feature dimensionality. Avoiding adversarial losses makes the model stable and lightweight to train. The simplicity of
While the paper provides an overall complexity estimate, quantitative comparisons to VAEs, GANs, or diffusion models in terms of runtime, memory, and scalability would provide stronger evidence of its efficiency. The Hungarian step’s cubic cost could be limiting for very large batch sizes, although mini-batching is suggested as a practical solution. The performance advantage over GANs and VAEs is not uniform—some settings show weaker results, suggesting NeuroSQL's strengths may be inconsistent
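The trade-off between the cubic assignment cost and mini-batching raised in this review can be made concrete with rough arithmetic. The sketch below is only an order-of-magnitude illustration: the function names, the sample size of 2000, and the batch size of 100 are assumptions, and the cubic bound ignores constants.

```python
def hungarian_ops(n):
    """Rough operation count for one n-by-n assignment, using the cubic bound n**3."""
    return n ** 3

def minibatch_ops(n, batch_size):
    """Rough count when n samples are split into batches that are solved independently."""
    full, rest = divmod(n, batch_size)
    return full * hungarian_ops(batch_size) + hungarian_ops(rest)

n = 2000                                  # assumed sample size, for illustration
full_cost = hungarian_ops(n)              # 2000**3 = 8_000_000_000
batched_cost = minibatch_ops(n, 100)      # 20 * 100**3 = 20_000_000
speedup = full_cost / batched_cost        # 400.0
```

With batch size b the total drops from roughly n^3 to n·b^2, at the price that the batched solution is in general no longer the globally optimal assignment.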
1. This is a new structure. 2. Tested with many different datasets and generative frameworks.
1. The presentation of Figure 1 is unclear. From the image, it appears that the input data are fed into the decoder. The paper should clarify why this component is referred to as the decoder rather than the encoder, and explicitly describe what the input data are. Moreover, the roles of Momentum Update and Embedding in the framework are not clearly explained. What does “Cost” represent in this figure? Is it equivalent to the loss function? Additionally, regarding the left-hand side of the figure,
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Face recognition and analysis · Face Recognition and Perception
