On Visual Hallmarks of Robustness to Adversarial Malware
Alex Huang, Abdullah Al-Dujaili, Erik Hemberg, Una-May O'Reilly

TL;DR
This paper introduces visual methods to interpret and compare the robustness of adversarially hardened models, revealing insights into their loss landscape and decision space interactions.
Contribution
It presents novel visual tools for analyzing the robustness and generalization of adversarially hardened models through loss landscape and decision space visualization.
Findings
Flatness of loss landscape correlates with robustness.
Visual superimposition reveals a model's global robustness.
Interpretation methods extend to adversarially hardened models.
Abstract
A central challenge of adversarial learning is to interpret the resulting hardened model. In this contribution, we ask how robust generalization can be visually discerned and whether a concise view of the interactions between a hardened decision map and input samples is possible. We first provide a means of visually comparing a hardened model's loss behavior with respect to the adversarial variants generated during training versus loss behavior with respect to adversarial variants generated from other sources. This allows us to confirm that the association of observed flatness of a loss landscape with generalization that is seen with naturally trained models extends to adversarially hardened models and robust generalization. To complement these means of interpreting model parameter robustness we also use self-organizing maps to provide a visual means of superimposing adversarial and…
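The flatness comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' method or code: it assumes a toy logistic-regression model, a one-dimensional slice of the loss along a random unit direction in parameter space, and a noisy copy of the data as a stand-in for "variants from other sources". The names `logistic_loss`, `loss_slice`, and `sharpness` are hypothetical and introduced only for illustration.

```python
import numpy as np

def logistic_loss(w, X, y):
    """Mean binary cross-entropy of a linear model with weights w, labels y in {0, 1}."""
    z = X @ w
    # log(1 + exp(z)) - y * z, computed stably via logaddexp
    return float(np.mean(np.logaddexp(0.0, z) - y * z))

def loss_slice(w, direction, X, y, alphas):
    """Loss along the line w + alpha * d, where d is the unit-normalized direction."""
    d = direction / np.linalg.norm(direction)
    return np.array([logistic_loss(w + a * d, X, y) for a in alphas])

def sharpness(losses, center_index):
    """Largest loss increase over the slice relative to its center (smaller means flatter)."""
    return float(np.max(losses) - losses[center_index])

# Toy data and a model fitted by plain gradient descent (all values are assumptions).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] > 0).astype(float)
w = np.zeros(5)
for _ in range(200):
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    w -= 0.5 * X.T @ (p - y) / len(y)

alphas = np.linspace(-1.0, 1.0, 21)   # index 10 corresponds to alpha = 0
direction = rng.normal(size=5)

# Slice evaluated on the samples seen during fitting.
train_slice = loss_slice(w, direction, X, y, alphas)

# Stand-in for variants from another source: the same samples with added noise.
X_other = X + 0.3 * rng.normal(size=X.shape)
other_slice = loss_slice(w, direction, X_other, y, alphas)

print(sharpness(train_slice, 10), sharpness(other_slice, 10))
```

Plotting `train_slice` and `other_slice` against `alphas` on the same axes gives a rough visual comparison of the kind the abstract describes. A single random direction is only a coarse probe, so averaging over several directions would be a reasonable extension.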
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Domain Adaptation and Few-Shot Learning
