Overcoming Adversarial Attacks for Human-in-the-Loop Applications
Ryan McCoppin, Marla Kennedy, Platon Lukyanenko, Sean Kennedy

TL;DR
This paper explores enhancing the robustness of human-in-the-loop systems against adversarial attacks by leveraging models of human visual attention to improve interpretability and resilience of neural network explanations.
Contribution
It proposes using human visual attention models to select more robust visual explanations, addressing vulnerabilities in neural network interpretability tools.
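The selection idea can be illustrated with a minimal sketch. It assumes a predicted human attention map is already available for the image and that agreement is scored with Pearson correlation. Neither the scoring rule nor the names `pearson`, `rank_by_attention_agreement`, `method_a`, and `method_b` come from the paper; they are hypothetical and used only for illustration.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equally sized 2-D maps (lists of rows)."""
    xs = [v for row in a for v in row]
    ys = [v for row in b for v in row]
    if len(xs) != len(ys) or not xs:
        raise ValueError("maps must be non-empty and the same size")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # a constant map carries no spatial information
    return cov / (sx * sy)


def rank_by_attention_agreement(candidates, attention_map):
    """Sort candidate explanation maps by agreement with the attention map.

    Assumption: higher correlation with human attention is treated as a
    proxy for a more trustworthy explanation. The paper does not fix this rule.
    """
    scored = [(name, pearson(m, attention_map)) for name, m in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 3x3 maps (hypothetical values): attention is concentrated at the center.
attention = [[0.0, 0.1, 0.0],
             [0.1, 0.9, 0.1],
             [0.0, 0.1, 0.0]]
candidates = {
    "method_a": [[0.0, 0.2, 0.0],
                 [0.2, 1.0, 0.2],
                 [0.0, 0.2, 0.0]],  # center-focused explanation
    "method_b": [[1.0, 0.0, 0.0],
                 [0.0, 0.0, 0.0],
                 [0.0, 0.0, 0.0]],  # corner-focused explanation
}
ranking = rank_by_attention_agreement(candidates, attention)
print(ranking[0][0])
```

In this toy case the center-focused map ranks first because it agrees with the attention map, while the corner-focused map correlates negatively. A real system would likely need a more careful similarity measure and a validated attention model.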
Findings
Models of human visual attention may improve the robustness of explanations.
Current explanation maps are vulnerable to adversarial attacks.
Integrating human attention models may improve both the interpretability and the robustness of human-machine imagery analysis systems.
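One simple way to make the vulnerability of explanation maps concrete is to compare the most salient pixels of a map before and after the input is perturbed. The sketch below assumes a top-k overlap score. The metric and the names `top_k_pixels` and `top_k_overlap` are illustrative assumptions, not something the paper specifies.

```python
def top_k_pixels(saliency, k):
    """Return the (row, col) positions of the k largest values in a 2-D map."""
    flat = [(v, (r, c))
            for r, row in enumerate(saliency)
            for c, v in enumerate(row)]
    flat.sort(key=lambda pair: pair[0], reverse=True)
    return {pos for _, pos in flat[:k]}


def top_k_overlap(map_clean, map_perturbed, k):
    """Fraction of the k most salient pixels shared by the two maps.

    1.0 means the explanation kept its most salient pixels; values near 0.0
    mean the perturbation moved the explanation elsewhere.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    shared = top_k_pixels(map_clean, k) & top_k_pixels(map_perturbed, k)
    return len(shared) / k


# Hypothetical explanation maps for a clean image and a perturbed copy.
clean = [[0.9, 0.8, 0.1],
         [0.7, 0.2, 0.1],
         [0.0, 0.1, 0.0]]
perturbed = [[0.9, 0.1, 0.1],
             [0.1, 0.2, 0.8],
             [0.0, 0.1, 0.7]]
overlap = top_k_overlap(clean, perturbed, k=3)
print(round(overlap, 2))
```

Here only one of the three most salient pixels survives the perturbation, so the overlap is one third. A low score on an image whose predicted label is unchanged is the kind of instability an analyst would want flagged.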
Abstract
Including human analysis has the potential to positively affect the robustness of Deep Neural Networks and is relatively unexplored in the Adversarial Machine Learning literature. Neural network visual explanation maps have been shown to be prone to adversarial attacks. Further research is needed to select robust visualizations of explanations for the image analyst to evaluate a given model. These factors greatly impact Human-In-The-Loop (HITL) evaluation tools due to their reliance on adversarial images, including explanation maps and measurements of robustness. We believe models of human visual attention may improve interpretability and robustness of human-machine imagery analysis systems. Our challenge remains: how can HITL evaluation be robust in this adversarial landscape?
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Explainable Artificial Intelligence (XAI)
