Limited but consistent gains in adversarial robustness by co-training object recognition models with human EEG

Manshan Guo; Bhavin Choksi; Sari Sadiya; Alessandro T. Gifford; Martina G. Vilas; Radoslaw M. Cichy; Gemma Roig

arXiv:2409.03646·cs.LG·December 16, 2024

Open Access

TL;DR

Aligning neural network models with human EEG responses to real-world images can modestly improve their robustness against adversarial attacks, with consistent effects observed across different model initializations and architectures.

Contribution

This study demonstrates that training models to predict human EEG responses to natural images can enhance adversarial robustness. Unlike prior approaches, which relied on invasively recorded brain data, it uses non-invasive human EEG.

Findings

1. EEG prediction accuracy correlates with robustness gains.

2. Effects are consistent across different initializations and architectures.

3. Strongest EEG contribution from parieto-occipital electrodes.

Abstract

In contrast to human vision, artificial neural networks (ANNs) remain relatively susceptible to adversarial attacks. To address this vulnerability, efforts have been made to transfer inductive bias from human brains to ANNs, often by training the ANN representations to match their biological counterparts. Previous works relied on brain data acquired in rodents or primates using invasive techniques, from specific regions of the brain, under non-natural conditions (anesthetized animals), and with stimulus datasets lacking diversity and naturalness. In this work, we explored whether aligning model representations to human EEG responses to a rich set of real-world images increases the robustness of ANNs. Specifically, we trained ResNet50-backbone models on a dual task of classification and EEG prediction, and evaluated their EEG prediction accuracy and robustness to adversarial attacks. We…
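To make the "dual task of classification and EEG prediction" concrete, here is a minimal sketch of how such a combined objective could be written. It is an illustration under stated assumptions, not the authors' implementation: the excerpt does not say which EEG loss or task weighting was used, so the mean squared error term, the `eeg_weight` parameter, and all function names below are hypothetical. Plain Python lists stand in for the outputs of a shared backbone with two heads.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy for one example, computed from raw class logits."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def mean_squared_error(predicted, target):
    """MSE between predicted and recorded EEG values (flattened vectors).

    Assumption: the abstract does not name the EEG loss; MSE is a stand-in.
    """
    if len(predicted) != len(target):
        raise ValueError("predicted and target EEG vectors differ in length")
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def dual_task_loss(logits, label, eeg_pred, eeg_target, eeg_weight=0.5):
    """Classification loss plus a weighted EEG-prediction loss.

    `eeg_weight` is a hypothetical hyperparameter that trades off the two
    tasks; it is not a value taken from the paper.
    """
    classification = softmax_cross_entropy(logits, label)
    eeg = mean_squared_error(eeg_pred, eeg_target)
    return classification + eeg_weight * eeg


# Usage with made-up numbers: three class logits and a two-value EEG target.
loss = dual_task_loss([2.0, 0.5, -1.0], 0, [0.1, -0.2], [0.0, -0.1])
print(f"combined loss: {loss:.4f}")
```

In a setup like this, both terms would be backpropagated through the same backbone, so the EEG term acts as a regularizer on the representation used for classification. Setting `eeg_weight` to 0 recovers ordinary classification training, which gives a natural baseline for robustness comparisons.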

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Fault Detection and Control Systems

Methods: Sparse Evolutionary Training