TAVAE: A VAE with Adaptable Priors Explains Contextual Modulation in the Visual Cortex
Balázs Meszéna, Keith T. Murray, Julien Corbo, O. Batuhan Erkat, Márton A. Hajnal, Pierre-Olivier Polack, Gergő Orbán

TL;DR
This paper introduces TAVAE, a flexible VAE model with adaptable priors that explains how the visual cortex learns and applies task-specific contextual information, aligning with neuronal activity patterns during visual discrimination tasks.
Contribution
The paper presents a novel Task-Amortized VAE that efficiently learns task-specific priors, demonstrating their role in shaping visual cortex responses during task performance.
Findings
TAVAE models task-specific priors aligning with neuronal responses.
Model accounts for within-day updates in V1 population activity.
Flexible priors can be learned on demand and influence early visual processing.
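One hedged way to write down the task-specific prior in the findings above, assuming a standard VAE evidence lower bound: the likelihood $p_\theta(x \mid z)$ is held fixed (as one of the reviews below describes) and only the prior $p_T(z)$ is adapted. The parameter symbols $\theta$, $\phi$, and $\psi$ are notation assumed here, not taken from the paper, and this summary does not state whether the recognition model $q_\phi$ is also refit per task.

$$
\mathcal{L}_T(\phi, \psi) = \mathbb{E}_{q_\phi(z \mid x)}\left[\log p_\theta(x \mid z)\right] - \mathrm{KL}\left(q_\phi(z \mid x) \,\|\, p_{T,\psi}(z)\right), \qquad \theta \text{ fixed}
$$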
Abstract
The brain interprets visual information through learned regularities, a computation formalized as probabilistic inference under a prior. The visual cortex establishes priors for this inference, some delivered through established top-down connections that inform low-level cortices about statistics represented at higher levels in the cortical hierarchy. While evidence shows that adaptation leads to priors reflecting the structure of natural images, it remains unclear whether similar priors can be flexibly acquired when learning a specific task. To investigate this, we built a generative model of V1 optimized for a simple discrimination task and analyzed it together with large-scale recordings from mice performing an analogous task. In line with recent approaches, we assumed that neuronal activity in V1 corresponds to latent posteriors in the generative model, enabling investigation of…
Peer Reviews
Decision·ICLR 2026 Poster
The paper presents an interesting and, to my knowledge, novel account of neural tuning properties in the face of changing stimulus statistics using the model they present in Section 2. They present this alongside an approach for learning context-specific priors in the variational framework. There do appear to be some qualitative similarities between neural data and model latents, but validating these results is required before claims can be made about how their model maps mechanistically onto
I think there are three main dimensions on which this paper falls short of acceptance: 1) validation of model structure, 2) statistical rigor, 3) clarity of question. 1) The authors make claims about the qualitative properties of the latents of their model and how they match those of the real data. However, I'm not sure it's possible to attribute these features (even if they are statistically valid) to the prior structure of the model exclusively. Specifically, no ablation analysis of the model was
1. The model is minimal, with biologically inspired constraints (linear decoder, sparse Laplace prior, overcomplete latent space, and GSM-style gain modulation), and it mirrors classic models of V1 (e.g., Olshausen & Field). 2. The model qualitatively reproduces several experimentally observed phenomena using a single mechanism (prior variance reweighting).
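The ingredients listed in this review can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `sample_images`, the dimensions, the log-normal gain distribution, the noise level, and the choice of which latents are reweighted are all hypothetical.

```python
import numpy as np

def sample_images(rng, decoder, prior_scale, n_samples, noise_std=0.1):
    """Sample from a sparse linear generative model with a per-image gain.

    decoder:     (n_pixels, n_latents) linear decoder matrix
    prior_scale: (n_latents,) Laplace scale per latent dimension
    """
    n_pixels, n_latents = decoder.shape
    # Sparse latents: independent Laplace variables, one scale per dimension.
    z = rng.laplace(loc=0.0, scale=prior_scale, size=(n_samples, n_latents))
    # GSM-style gain: one positive scalar per image (log-normal is an assumption).
    gain = rng.lognormal(mean=0.0, sigma=0.5, size=(n_samples, 1))
    # Linear decoder plus isotropic Gaussian pixel noise.
    noise = noise_std * rng.standard_normal((n_samples, n_pixels))
    x = gain * (z @ decoder.T) + noise
    return x, z, gain

rng = np.random.default_rng(0)
n_pixels, n_latents = 16, 64  # overcomplete: more latents than pixels
decoder = rng.standard_normal((n_pixels, n_latents)) / np.sqrt(n_pixels)

# Baseline prior: the same scale for every latent.
base_scale = np.ones(n_latents)
# Hypothetical task prior: same decoder, but a few latents get a wider prior.
task_scale = base_scale.copy()
task_scale[:4] *= 3.0

x_base, z_base, _ = sample_images(rng, decoder, base_scale, 2000)
x_task, z_task, _ = sample_images(rng, decoder, task_scale, 2000)
```

Here the decoder is shared between the two conditions and only the per-latent prior scale changes, which is one simple reading of "prior variance reweighting".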
1. The paper claims that adaptation in the prior alone is sufficient to account for several task-induced changes in neural population statistics, but the lack of comparison to single-neuron activity leaves this claim speculative. 2. Figure 3a: I really cannot see a "drastic" difference between the red and blue curves. There needs to be a metric or something to quantify how they are different. 3. Figure 4a: The curves are visually nearly identical in shape, except for slightly lower side peaks and a
1. **Elegant framework**. Beautiful idea - fixing the likelihood $p(x|z)$ and only learning a new task-specific prior $p_T(z)$ is both elegant and powerful. The paper makes a clear hypothesis: systematic biases in V1 during task performance are the result of probabilistic inference under a learned, task-specific contextual prior. The model provides a concrete implementation of this hypothesis and generates specific, falsifiable predictions that are then confirmed by the experimental data. 2. **
1. **Clarity**. The paper might benefit from a clearer high-level introduction to the framework before getting to the formalism. If I understand correctly, the autoencoder model is trained on images only and neural responses are used for validation only. 2. **Lack of quantification of qualitative results**. While Fig 3 generates nice qualitative insights, some statistical tests might support the claims, e.g., Hartigan's Dip Test to quantify when the red line stops being unimodal (and if it happens
Taxonomy
Topics: Face Recognition and Perception · Neural dynamics and brain function · Multisensory perception and integration
