Probing Human Visual Robustness with Neurally-Guided Deep Neural Networks
Zhenan Shao, Linjian Ma, Yiqing Zhou, Yibo Jacky Zhang, Sanmi Koyejo, Bo Li, Diane M. Beck

TL;DR
This study demonstrates that aligning deep neural networks with human neural responses across the ventral visual stream enhances their robustness to image perturbations, highlighting the importance of hierarchical neural representations and manifold properties.
Contribution
The paper introduces a neurally-guided training approach that aligns DNN representations with human VVS responses, leading to hierarchical robustness improvements and insights into neural manifold geometry.
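As a rough illustration of what a neurally-guided objective of this kind could look like, the sketch below adds a neural-alignment penalty to an ordinary classification loss. This is a minimal sketch under stated assumptions, not the authors' implementation: the linear `readout`, the mean-squared-error alignment term, the `weight` hyperparameter, and all function names are hypothetical choices made for illustration.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Mean cross-entropy of integer labels under softmax(logits)."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def neural_alignment_loss(features, readout, neural_targets):
    """MSE between linearly mapped DNN features and neural responses.

    ASSUMPTION: a linear readout and an MSE penalty are illustrative
    choices; the source does not specify the alignment loss.
    """
    predicted = features @ readout
    return np.mean((predicted - neural_targets) ** 2)

def neurally_guided_loss(logits, labels, features, readout,
                         neural_targets, weight=1.0):
    """Task loss plus a weighted alignment term (hypothetical form)."""
    task = softmax_cross_entropy(logits, labels)
    align = neural_alignment_loss(features, readout, neural_targets)
    return task + weight * align
```

Under this framing, `neural_targets` would hold responses from one VVS region at a time, so the same objective could be reused for each region along the hierarchy.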
Findings
Hierarchical alignment with VVS improves DNN robustness.
Neural category manifolds become smaller and more separable across VVS.
Neural manifold guidance alone can enhance robustness.
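To make "smaller and more separable" concrete, the sketch below computes two simple proxies on point clouds of per-category responses. These are assumptions chosen for illustration (RMS distance to the centroid as "size", centroid gap divided by summed radii as "separability") and are not necessarily the manifold measures used in the paper; the function names are hypothetical.

```python
import numpy as np

def manifold_radius(points):
    """RMS distance of a category's points from their centroid.

    ASSUMPTION: a simple proxy for manifold size.
    """
    centroid = points.mean(axis=0)
    return np.sqrt(np.mean(np.sum((points - centroid) ** 2, axis=1)))

def separability(points_a, points_b):
    """Centroid distance divided by the sum of the two radii.

    ASSUMPTION: a simple proxy; larger values mean the two
    category manifolds overlap less.
    """
    gap = np.linalg.norm(points_a.mean(axis=0) - points_b.mean(axis=0))
    return gap / (manifold_radius(points_a) + manifold_radius(points_b))

# Toy example: two categories with the same shape, centered 4 units apart.
cat_a = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
cat_b = cat_a + np.array([4.0, 0.0])
```

With these proxies, shrinking each category's spread around its centroid while keeping the centroids fixed raises the separability score, which matches the qualitative trend the finding describes.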
Abstract
Humans effortlessly navigate the dynamic visual world, yet deep neural networks (DNNs), despite excelling at many visual tasks, are surprisingly vulnerable to minor image perturbations. Past theories suggest that human visual robustness arises from a representational space that evolves along the ventral visual stream (VVS) of the brain to increasingly tolerate object transformations. To test whether robustness is supported by such progression as opposed to being confined exclusively to specialized higher-order regions, we trained DNNs to align their representations with human neural responses from consecutive VVS regions while performing visual tasks. We demonstrate a hierarchical improvement in DNN robustness: alignment to higher-order VVS regions leads to greater improvement. To investigate the mechanism behind such robustness gains, we test a prominent hypothesis that attributes…
Peer Reviews
Decision · Submitted to ICLR 2026
- **Clear motivation and novel systematic investigation.** The paper studies and improves models' adversarial robustness by aligning DNNs with the human visual system. It systematically examines how robustness evolves across multiple consecutive human VVS regions, unlike prior work that focuses on isolated areas such as V1 or IT alone. This provides valuable insight into the hierarchical nature of visual robustness.
- **Rigorous experimental design and stat
We thank the authors for submitting the paper to ICLR 2026! There are a few weaknesses listed below which I believe can make the paper better.
- **Limited absolute robustness gain.** While the hierarchical pattern is consistent, with results from different NSD subjects, the absolute robustness improvements of TO-guided models are small. They still show substantial vulnerability to adversarial perturbations which are imperceptible to humans. The authors already acknowledge this limitation but mor
Strong experimental validation: multiple controls, cross-subject replication, and attack-type diversity. Novel conceptual framing: robustness as a byproduct of manifold geometry inherited from neural data. Elegant theoretical integration: combines neuroscience principles with modern robustness evaluation. High reproducibility: clear methodological exposition, use of open datasets (NSD), and code availability. Mechanistic insight: links representational geometry to robustness, moving beyo
The reliance on fMRI-based predictors limits representational precision; neural alignment fidelity could improve with higher-resolution data (e.g., intracranial recordings). The scope of tasks (primarily object classification) is limited; extending analyses to temporal or contextual visual understanding would strengthen generalization claims. Manifold guidance, while promising, remains a coarse approximation; incorporating nonlinear manifold constraints could further clarify its efficacy.
* The results in this paper are surprising: simply training a model to mimic human brain activity confers adversarial robustness. Not only could this provide insight into the robustness of the human visual system to image perturbations, it could also inform our knowledge of machine learning robustness more generally.
* The results provide compelling evidence for the manifold disentanglement hypothesis and may be of note to cognitive scientists.
* The novel manifold guidance loss term that is de
* The model presented doesn't engage much with the existing literature on adversarial robustness. I think that additional space should be devoted to the relationship between this paper and the adversarial training literature in the related work. It would be informative to know whether adversarially trained models also conform to the manifold disentanglement hypothesis.
* The robustness results would be more compelling if they weren't restricted to $\ell_p$ bounded adversarial corruptions. There
Taxonomy
Topics: Visual perception and processing mechanisms
