Explainable AI and susceptibility to adversarial attacks: a case study in classification of breast ultrasound images
Hamza Rasaee, Hassan Rivaz

TL;DR
This paper investigates how explainable CNN models for breast ultrasound classification can be targeted by practically undetectable adversarial attacks that manipulate importance maps without changing the classification outcome.
Contribution
It shows that interpretability methods such as GRAD-CAM are susceptible to adversarial manipulation, and it proposes a new ResNet-50-based multi-task learning network for improved accuracy.
Findings
Adversarial attacks can significantly alter importance maps without affecting classification.
The proposed network achieves sensitivity and specificity comparable to the state of the art.
Explainability methods may be unreliable under adversarial conditions.
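One way to make the first and third findings concrete is to measure three things for a clean image and its perturbed counterpart: how large the perturbation is, whether the predicted label is unchanged, and how much the importance map moved. The sketch below is a minimal illustration under stated assumptions, not the authors' evaluation protocol. The function name `attack_report`, the choice of the L-infinity norm, and the use of Pearson correlation as the map-similarity measure are all hypothetical choices made for this example.

```python
import numpy as np

def attack_report(clean_img, adv_img, clean_logits, adv_logits, clean_map, adv_map):
    """Summarise a perturbation: its size, label stability, and map similarity.

    All inputs are NumPy arrays. The metric choices (L-infinity norm,
    Pearson correlation) are assumptions made for illustration only.
    """
    # Largest absolute pixel change introduced by the perturbation.
    linf = float(np.max(np.abs(adv_img - clean_img)))

    # The prediction is "unaffected" if the arg-max class is the same.
    same_label = int(np.argmax(clean_logits)) == int(np.argmax(adv_logits))

    # Pearson correlation between the flattened importance maps.
    a = clean_map.ravel() - clean_map.mean()
    b = adv_map.ravel() - adv_map.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    corr = float(a @ b / denom) if denom > 0 else 0.0

    return {"linf": linf, "same_label": same_label, "map_correlation": corr}

# Toy usage with made-up arrays (not data from the paper).
clean_img = np.zeros((4, 4))
adv_img = clean_img + 0.01
report = attack_report(
    clean_img, adv_img,
    clean_logits=np.array([0.2, 0.8]),
    adv_logits=np.array([0.3, 0.7]),
    clean_map=np.array([[1.0, 0.0], [0.0, 0.0]]),
    adv_map=np.array([[0.0, 0.0], [0.0, 1.0]]),
)
print(report)
```

A small `linf`, `same_label` equal to `True`, and a low or negative `map_correlation` together describe the situation the findings warn about: the image and the diagnosis look unchanged, yet the explanation points somewhere else.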
Abstract
Ultrasound is a non-invasive imaging modality that can be conveniently used to classify suspicious breast nodules and potentially detect the onset of breast cancer. Recently, Convolutional Neural Network (CNN) techniques have shown promising results in classifying breast ultrasound images as benign or malignant. However, CNN inference acts as a black box, so its decision-making is not interpretable. Increasing effort has therefore been dedicated to explaining this process, most notably through GRAD-CAM and other techniques that provide visual explanations of the inner workings of CNNs. In addition to interpretation, these methods provide clinically important information, such as identifying the location for biopsy or treatment. In this work, we analyze how practically undetectable adversarial attacks may be devised to alter these importance maps…
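For readers unfamiliar with how a GRAD-CAM importance map is formed, the following is a minimal NumPy sketch of the standard computation: each feature map of a chosen convolutional layer is weighted by the spatial average of the class-score gradient with respect to that map, the weighted maps are summed, and negative values are clipped. It assumes the activations and gradients have already been extracted from a network as arrays of shape `(K, H, W)`; the function name `grad_cam` and the final rescaling to `[0, 1]` are choices made for this illustration and are not taken from the paper.

```python
import numpy as np

def grad_cam(activations, gradients):
    """Compute a Grad-CAM style map from precomputed arrays.

    activations: (K, H, W) feature maps of one convolutional layer.
    gradients:   (K, H, W) gradient of the target class score with
                 respect to those feature maps.
    Returns an (H, W) map rescaled to [0, 1] (rescaling is an
    illustrative choice, not part of the core definition).
    """
    # Channel weights: global average of the gradients over H and W.
    weights = gradients.mean(axis=(1, 2))              # shape (K,)

    # Weighted sum of the feature maps over the channel axis.
    cam = np.tensordot(weights, activations, axes=1)   # shape (H, W)

    # Keep only regions with a positive influence on the class score.
    cam = np.maximum(cam, 0.0)

    peak = cam.max()
    return cam / peak if peak > 0 else cam

# Toy usage with hand-written arrays (not data from the paper).
acts = np.array([[[1.0, 0.0], [0.0, 0.0]],
                 [[0.0, 0.0], [0.0, 1.0]]])
grads = np.stack([np.full((2, 2), 2.0), np.full((2, 2), -1.0)])
heatmap = grad_cam(acts, grads)
print(heatmap)
```

Because the map depends on gradients through the network, small input changes that barely move the class score can still shift the channel weights, which is one intuition for why such maps can be manipulated.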
