Explainable AI and susceptibility to adversarial attacks: a case study in classification of breast ultrasound images

Hamza Rasaee; Hassan Rivaz

arXiv:2108.04345·eess.IV·July 1, 2025


TL;DR

This paper analyzes how practically undetectable adversarial attacks can be devised to alter the importance maps of explainable CNN models for breast ultrasound classification without changing the classification outcome.

Contribution

The paper shows that interpretability methods such as GRAD-CAM are susceptible to adversarial manipulation, and it proposes a new ResNet-50-based multi-task learning network for improved accuracy.

Findings

01. Adversarial attacks can significantly alter importance maps without affecting the classification result.

02. The proposed network achieves sensitivity and specificity comparable to the state of the art.

03. Explainability methods may be unreliable under adversarial conditions.

Abstract

Ultrasound is a non-invasive imaging modality that can be conveniently used to classify suspicious breast nodules and potentially detect the onset of breast cancer. Recently, Convolutional Neural Network (CNN) techniques have shown promising results in classifying ultrasound images of the breast as benign or malignant. However, CNN inference acts as a black-box model, and as such, its decision-making is not interpretable. Therefore, increasing effort has been dedicated to explaining this process, most notably through GRAD-CAM and other techniques that provide visual explanations of the inner workings of CNNs. In addition to interpretation, these methods provide clinically important information, such as identifying the location for biopsy or treatment. In this work, we analyze how adversarial attacks that are practically undetectable may be devised to alter these importance maps…
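
For readers unfamiliar with how a GRAD-CAM importance map is formed, the following is a minimal sketch of the standard combination step: each feature map is weighted by the spatial average of the class-score gradient, the weighted maps are summed, and a ReLU is applied. It is not the authors' code. The function names `grad_cam` and `map_shift`, the nested-list inputs, and the mean-absolute-difference measure are assumptions made for illustration; in practice the activations and gradients would come from a convolutional layer of the trained network.

```python
from typing import List

Grid = List[List[float]]


def grad_cam(activations: List[Grid], gradients: List[Grid]) -> Grid:
    """Combine K feature maps into one importance map, GRAD-CAM style.

    activations[k][i][j]: value of feature map k at position (i, j).
    gradients[k][i][j]:   gradient of the class score w.r.t. that value.
    """
    height, width = len(activations[0]), len(activations[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, grad in zip(activations, gradients):
        # Channel weight: gradient averaged over all spatial positions.
        weight = sum(sum(row) for row in grad) / (height * width)
        for i in range(height):
            for j in range(width):
                cam[i][j] += weight * fmap[i][j]
    # ReLU keeps only the regions that push the class score up.
    return [[max(0.0, v) for v in row] for row in cam]


def map_shift(cam_a: Grid, cam_b: Grid) -> float:
    """Hypothetical measure: mean absolute difference between two maps."""
    diffs = [abs(a - b) for ra, rb in zip(cam_a, cam_b) for a, b in zip(ra, rb)]
    return sum(diffs) / len(diffs)


# Toy 2x2 example with two feature maps (made-up numbers).
acts = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 2.0]]]
grads_clean = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
grads_other = [[[-1.0, -1.0], [-1.0, -1.0]], [[1.0, 1.0], [1.0, 1.0]]]

cam_clean = grad_cam(acts, grads_clean)
cam_other = grad_cam(acts, grads_other)
shift = map_shift(cam_clean, cam_other)
```

In this toy case the highlighted region moves from the top-left cell to the bottom-right cell when the gradients change. An attack of the kind the abstract describes would aim for a large shift of this sort while the predicted class stays the same.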
