Adversarial Evasion Attacks on Computer Vision using SHAP Values

Frank Mollard; Marcus Becker; Florian Roehrbein

arXiv:2601.10587·cs.CV·April 13, 2026

TL;DR

This paper presents a white-box adversarial evasion attack on computer vision models that uses SHAP values. It reports evidence that the attack induces misclassifications more robustly than the Fast Gradient Sign Method (FGSM), particularly under gradient hiding.

Contribution

It introduces a SHAP-based attack on computer vision models and compares its effectiveness and robustness with those of the Fast Gradient Sign Method (FGSM).

Findings

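01. SHAP attacks can cause misclassifications with perturbations that are imperceptible to the human eye.

02. The authors find evidence that the SHAP attack is more robust than FGSM at generating misclassifications, particularly in gradient hiding scenarios.

03. The attack exploits the importance of individual input features, as quantified by SHAP values, to deceive models.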
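
As a rough illustration of the third finding, the sketch below computes exact Shapley values for a toy four-input logistic scorer by enumerating coalitions, then nudges each input against the sign of its attribution. Everything in it is an assumption made for illustration and is not taken from the paper: the weights `w`, the bias `b`, the all-zero baseline, the helper names, and the sign-based update rule, which is one plausible heuristic and not necessarily the authors' procedure. The step size `eps` is also far larger than an imperceptible perturbation would be. Exact enumeration is exponential in the number of inputs, so practical SHAP tooling approximates these values.

```python
import itertools
import math
import numpy as np

# Hypothetical toy "model": a logistic scorer over four pixel-like inputs in [0, 1].
w = np.array([2.0, -3.0, 1.0, 0.5])
b = -0.2

def f(x):
    """Score for the true class."""
    return 1.0 / (1.0 + np.exp(-(w @ x + b)))

def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x; absent inputs are replaced by the baseline."""
    n = len(x)
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = math.factorial(k) * math.factorial(n - k - 1) / math.factorial(n)
            for subset in itertools.combinations(others, k):
                idx = np.array(subset, dtype=int)
                z = baseline.copy()
                z[idx] = x[idx]
                without_i = f(z)
                z[i] = x[i]
                phi[i] += weight * (f(z) - without_i)
    return phi

def shap_sign_step(x, phi, eps):
    """Assumed heuristic: shrink inputs that support the class, grow those that oppose it.

    This relies on the all-zero baseline and non-negative inputs used here.
    """
    return np.clip(x - eps * np.sign(phi), 0.0, 1.0)

x = np.array([0.6, 0.1, 0.5, 0.4])
baseline = np.zeros_like(x)
eps = 0.25  # exaggerated for a four-input toy; not an imperceptible budget

phi = shapley_values(f, x, baseline)
x_adv = shap_sign_step(x, phi, eps)
print("attributions:", phi)
print("clean score:", f(x), "adversarial score:", f(x_adv))
```

The attributions sum to `f(x) - f(baseline)`, which is the efficiency property of Shapley values and a useful sanity check on any implementation.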

Abstract

The paper introduces a white-box attack on computer vision models using SHAP values. It demonstrates how adversarial evasion attacks can compromise the performance of deep learning models by reducing output confidence or inducing misclassifications. Such attacks are particularly insidious because they can deceive an algorithm while remaining imperceptible to the human eye. The proposed attack leverages SHAP values to quantify the significance of individual inputs to the output at the inference stage. A comparison is drawn between the SHAP attack and the well-known Fast Gradient Sign Method (FGSM). We find evidence that SHAP attacks are more robust in generating misclassifications, particularly in gradient hiding scenarios.
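
To make the comparison with FGSM concrete, the following toy sketch shows why an attribution-based step can still work when gradients carry no signal. It rests on several assumptions that are not from the paper: gradient hiding is emulated by rounding the model output to one decimal, a central finite difference stands in for the backpropagated gradient, and the names `masked_model`, `numeric_gradient`, and `shapley_values`, as well as the weights and the step size `eps`, are hypothetical. The SHAP-style update is a simple sign heuristic and may differ from the authors' method.

```python
import itertools
import math
import numpy as np

w = np.array([2.0, -3.0, 1.0, 0.5])  # hypothetical weights
b = -0.2                             # hypothetical bias

def masked_model(x):
    """Toy true-class score, rounded to one decimal to emulate gradient hiding."""
    p = 1.0 / (1.0 + np.exp(-(w @ x + b)))
    return float(np.round(p, 1))

def numeric_gradient(f, x, h=1e-4):
    """Central finite difference, standing in for a backpropagated gradient."""
    grad = np.zeros_like(x)
    for i in range(len(x)):
        step = np.zeros_like(x)
        step[i] = h
        grad[i] = (f(x + step) - f(x - step)) / (2 * h)
    return grad

def shapley_values(f, x, baseline):
    """Exact Shapley values by coalition enumeration (baseline replaces absent inputs)."""
    n = len(x)
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            weight = math.factorial(k) * math.factorial(n - k - 1) / math.factorial(n)
            for subset in itertools.combinations(others, k):
                idx = np.array(subset, dtype=int)
                z = baseline.copy()
                z[idx] = x[idx]
                without_i = f(z)
                z[i] = x[i]
                phi[i] += weight * (f(z) - without_i)
    return phi

x = np.array([0.6, 0.1, 0.5, 0.4])
eps = 0.25  # exaggerated for a toy input; not an imperceptible budget

# FGSM-style step: for the true class, raising the loss means lowering the score.
grad = numeric_gradient(masked_model, x)
x_fgsm = np.clip(x - eps * np.sign(grad), 0.0, 1.0)

# SHAP-style step (assumed heuristic): move each input against its attribution sign.
phi = shapley_values(masked_model, x, np.zeros_like(x))
x_shap = np.clip(x - eps * np.sign(phi), 0.0, 1.0)

print("gradient:", grad, "-> FGSM input unchanged:", np.array_equal(x_fgsm, x))
print("attributions:", phi)
print("scores (clean, FGSM, SHAP):",
      masked_model(x), masked_model(x_fgsm), masked_model(x_shap))
```

In this toy setting the rounded output is locally flat, so the gradient is zero and the FGSM step leaves the input untouched. The Shapley values compare outputs across whole coalitions of inputs against a baseline, so they still recover a usable direction.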
