Limited but consistent gains in adversarial robustness by co-training object recognition models with human EEG
Manshan Guo, Bhavin Choksi, Sari Sadiya, Alessandro T. Gifford, Martina G. Vilas, Radoslaw M. Cichy, Gemma Roig

TL;DR
Aligning neural network models with human EEG responses to real-world images can modestly improve their robustness against adversarial attacks, with consistent effects observed across different model initializations and architectures.
Contribution
This study demonstrates that training models to predict human EEG responses to natural images can enhance adversarial robustness. Unlike prior approaches, which relied on invasively recorded brain data from rodents or primates, it uses non-invasive human EEG.
Findings
EEG prediction accuracy correlates with robustness gains.
Effects are consistent across different initializations and architectures.
Strongest EEG contribution from parieto-occipital electrodes.
Abstract
In contrast to human vision, artificial neural networks (ANNs) remain relatively susceptible to adversarial attacks. To address this vulnerability, efforts have been made to transfer inductive bias from human brains to ANNs, often by training the ANN representations to match their biological counterparts. Previous works relied on brain data acquired in rodents or primates using invasive techniques, from specific regions of the brain, under non-natural conditions (anesthetized animals), and with stimulus datasets lacking diversity and naturalness. In this work, we explored whether aligning model representations to human EEG responses to a rich set of real-world images increases the robustness of ANNs. Specifically, we trained ResNet50-backbone models on a dual task of classification and EEG prediction, and evaluated their EEG prediction accuracy and robustness to adversarial attacks. We…
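The dual task of classification and EEG prediction described in the abstract can be illustrated with a minimal sketch of a combined training objective. This is not the authors' implementation: the choice of cross-entropy for classification, mean squared error for EEG prediction, the additive weighting, and the names `dual_task_loss` and `eeg_weight` are all assumptions made for illustration.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy for one example, computed from raw class scores.

    Uses the log-sum-exp trick (subtracting the max score) for numerical stability.
    """
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_norm - logits[label]


def mean_squared_error(predicted, target):
    """Mean squared error between predicted and recorded EEG values."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def dual_task_loss(logits, label, eeg_pred, eeg_true, eeg_weight=0.5):
    """Classification loss plus a weighted EEG-prediction loss.

    eeg_weight is a hypothetical hyperparameter, not a value from the paper.
    """
    classification = softmax_cross_entropy(logits, label)
    eeg = mean_squared_error(eeg_pred, eeg_true)
    return classification + eeg_weight * eeg


# Toy example: two classes, two EEG values (all numbers are made up).
loss = dual_task_loss([0.0, 0.0], 0, [1.0, 2.0], [1.0, 4.0], eeg_weight=0.5)
print(loss)
```

In a real setup, both terms would be computed from the outputs of a shared backbone (ResNet50 in the paper) and minimized jointly by gradient descent; setting `eeg_weight` to zero in this sketch recovers plain classification training.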
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Fault Detection and Control Systems
Methods: Sparse Evolutionary Training
