Investigating and unmasking feature-level vulnerabilities of CNNs to adversarial perturbations

Davide Coppola; Hwee Kuan Lee

arXiv:2405.20672·cs.CV·June 3, 2024·1 cites

Open Access

TL;DR

This paper introduces the Adversarial Intervention framework to analyze how specific feature maps in CNNs contribute to vulnerability against adversarial attacks, providing new insights into model robustness.

Contribution

The paper proposes a novel framework for studying CNN vulnerabilities at the feature-map level, revealing vulnerable channels shared across different attack types and their impact.

Findings

01. Perturbing channels in shallow layers causes significant disruptions to the model's behavior.

02. Vulnerable channel combinations are common across attack types.

03. A kernel's magnitude correlates positively with its vulnerability.
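
The third finding can be made concrete with a small sketch. It assumes, hypothetically, that kernel magnitude is measured as an L2 norm and that each channel already has a scalar vulnerability score; neither choice is confirmed by this page. The kernels and scores below are illustrative placeholders, not results from the paper.

```python
import math


def l2_norm(kernel):
    """L2 norm of a flattened kernel (assumed magnitude measure)."""
    return math.sqrt(sum(w * w for w in kernel))


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


# Hypothetical first-layer kernels, one per channel (placeholder values).
kernels = [[1.0, 0.0, -1.0], [0.1, 0.1, 0.1], [2.0, -4.0, 2.0]]
# Hypothetical per-channel vulnerability scores (placeholder values).
vulnerability = [0.2, 0.05, 0.9]

magnitudes = [l2_norm(k) for k in kernels]
r = pearson(magnitudes, vulnerability)
print(f"Pearson r between kernel magnitude and vulnerability: {r:.3f}")
```

A value of `r` near 1 would indicate that channels with larger kernels tend to be more vulnerable. With real models, the scores would come from measurements rather than from a hand-written list.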

Abstract

This study explores the impact of adversarial perturbations on Convolutional Neural Networks (CNNs) with the aim of enhancing the understanding of their underlying mechanisms. Despite numerous defense methods proposed in the literature, there is still an incomplete understanding of this phenomenon. Instead of treating the entire model as vulnerable, we propose that specific feature maps learned during training contribute to the overall vulnerability. To investigate how the hidden representations learned by a CNN affect its vulnerability, we introduce the Adversarial Intervention framework. Experiments were conducted on models trained on three well-known computer vision datasets, subjecting them to attacks of different nature. Our focus centers on the effects that adversarial perturbations to a model's initial layer have on the overall behavior of the model. Empirical results revealed…
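
One plausible reading of a feature-map-level intervention is sketched below: compute first-layer feature maps for a clean input and for a perturbed input, swap in the perturbed maps for selected channels only, and observe how the model output changes. This is a toy illustration under stated assumptions, not the authors' implementation. The 1-D "model", the helper names (`first_layer`, `head`, `intervene`, `channel_effects`), and the inputs are all hypothetical.

```python
def conv1d_relu(signal, kernel):
    """Valid 1-D cross-correlation followed by ReLU."""
    k = len(kernel)
    out = []
    for i in range(len(signal) - k + 1):
        s = sum(signal[i + j] * kernel[j] for j in range(k))
        out.append(max(0.0, s))
    return out


def first_layer(signal, kernels):
    """Toy initial layer: one feature map per kernel (channel)."""
    return [conv1d_relu(signal, k) for k in kernels]


def head(feature_maps, weights):
    """Toy rest of the model: per-channel average pooling, then a linear score."""
    pooled = [sum(fm) / len(fm) for fm in feature_maps]
    return sum(w * p for w, p in zip(weights, pooled))


def intervene(clean_maps, adv_maps, channels):
    """Keep clean feature maps, but use the perturbed ones for `channels`."""
    return [adv_maps[c] if c in channels else clean_maps[c]
            for c in range(len(clean_maps))]


def channel_effects(clean, adv, kernels, weights):
    """Absolute change in output when each single channel is swapped."""
    clean_maps = first_layer(clean, kernels)
    adv_maps = first_layer(adv, kernels)
    base = head(clean_maps, weights)
    return [abs(head(intervene(clean_maps, adv_maps, {c}), weights) - base)
            for c in range(len(kernels))]


# Hypothetical kernels, readout weights, and inputs (placeholder values).
kernels = [[1.0, 0.0, -1.0], [0.1, 0.1, 0.1], [2.0, -4.0, 2.0]]
weights = [1.0, 1.0, 1.0]
clean = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
# A small alternating perturbation stands in for an adversarial attack.
adv = [x + d for x, d in zip(clean, [0.1, -0.1, 0.1, -0.1, 0.1, -0.1])]

effects = channel_effects(clean, adv, kernels, weights)
most_affected = effects.index(max(effects))
print("per-channel effect:", effects, "most affected channel:", most_affected)
```

In this toy setup only the third channel reacts to the perturbation, which is the kind of channel-specific sensitivity the abstract describes. In a real CNN the same swap would typically be done on the activations of the first convolutional layer, with the remaining layers playing the role of `head`.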

Taxonomy

Topics: Adversarial Robustness in Machine Learning

Methods: Focus